DA 101, Dr. Ladd
Week 5
AKA “average”
\(\dfrac{600+470+170+430+300}{5} = 394\)
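The same calculation can be sketched in R. This is a minimal example; the vector name `heights` is an assumption made for illustration, and only base R is used.

```r
# The five values from the mean calculation above
# (the name `heights` is an illustrative assumption)
heights <- c(600, 470, 170, 430, 300)

# Mean by hand: add the values, then divide by how many there are
manual_mean <- sum(heights) / length(heights)

# Built-in equivalent
builtin_mean <- mean(heights)

print(manual_mean)
print(builtin_mean)
```

Both approaches should give 394, matching the calculation above.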
AKA “50th percentile”
AKA “quantile”
AKA “extreme value”
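As a rough sketch, the median and other quantiles of the five values used in the mean calculation can be found with base R. The vector name `heights` is an assumption, and `quantile()` is called with its default interpolation method, which is only one of several ways to define a quantile.

```r
# Same five values as in the mean calculation
# (the name `heights` is an illustrative assumption)
heights <- c(600, 470, 170, 430, 300)

# Sorting makes the middle value easy to see
sorted_heights <- sort(heights)

# Median: the middle value of the sorted data (the 50th percentile)
middle_value <- median(heights)

# Quartiles: the 25th, 50th, and 75th percentiles (default method)
quartiles <- quantile(heights, probs = c(0.25, 0.5, 0.75))

print(sorted_heights)
print(middle_value)
print(quartiles)
```

The 50th percentile returned by `quantile()` should match `median()`.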
\(\dfrac{206^2+76^2+(-224)^2+36^2+(-94)^2}{5} = 21{,}704\)
Rottweilers are tall and dachshunds are short, judged against one standard deviation from the mean.
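The variance calculation, and the check against one standard deviation, can be sketched in base R as follows. The variable names are assumptions for illustration, and the variance is computed by hand because it divides by N (all five values), not N - 1.

```r
# Same five values as in the mean calculation
# (all names here are illustrative assumptions)
heights <- c(600, 470, 170, 430, 300)

# Deviation of each value from the mean: 206, 76, -224, 36, -94
deviations <- heights - mean(heights)

# Variance dividing by N: the average squared deviation
pop_variance <- sum(deviations^2) / length(heights)

# Standard deviation is the square root of the variance
pop_sd <- sqrt(pop_variance)

# Values more than one standard deviation away from the mean
lower <- mean(heights) - pop_sd
upper <- mean(heights) + pop_sd
outside <- heights[heights < lower | heights > upper]

print(pop_variance)
print(pop_sd)
print(outside)
```

Only the largest and smallest of the five values should fall outside the one-standard-deviation band.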
Were these the results you expected?
When you have “N” data values:
Sample variance: \(\dfrac{108{,}520}{4}=27{,}130\)
Sample standard deviation: \(\sqrt{27{,}130}\approx 164.7\)
Think of it as a “correction” when your data is only a sample. R does this by default!
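A short sketch of the sample versions in base R, assuming the same five values and an illustrative vector name `heights`. It computes the N - 1 version by hand and compares it with R's built-in `var()` and `sd()`, which both divide by N - 1.

```r
# Same five values as in the earlier calculations
# (the name `heights` is an illustrative assumption)
heights <- c(600, 470, 170, 430, 300)
n <- length(heights)

# Sum of squared deviations from the mean: 108520
sum_sq <- sum((heights - mean(heights))^2)

# Sample variance divides by N - 1 instead of N
sample_variance <- sum_sq / (n - 1)
sample_sd <- sqrt(sample_variance)

# R's built-ins already use the N - 1 divisor
builtin_variance <- var(heights)
builtin_sd <- sd(heights)

print(sample_variance)
print(builtin_variance)
print(sample_sd)
print(builtin_sd)
```

The hand-computed and built-in results should agree, which is one way to confirm that `var()` and `sd()` apply the correction.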
Be careful: normal distributions are assumed for many statistical analyses!
`displ` and `hwy` in the `mpg` dataset
Pearson’s correlation coefficient multiplies the paired deviations from the mean for two variables, averages those products, and divides by the product of the two standard deviations.
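One way to sketch this definition in R is to compute the coefficient by hand and compare it with `cor()`. This assumes the `ggplot2` package is installed, since that is where the `mpg` dataset lives; the variable names are illustrative.

```r
library(ggplot2)  # assumed installed; provides the mpg dataset

x <- mpg$displ  # engine displacement
y <- mpg$hwy    # highway miles per gallon
n <- length(x)

# Multiply the paired deviations from the mean, sum them,
# then divide by (n - 1) times the product of the two standard deviations
# (sd() also uses the n - 1 divisor, so the pieces match)
r_manual <- sum((x - mean(x)) * (y - mean(y))) / ((n - 1) * sd(x) * sd(y))

# Built-in equivalent; Pearson is the default method for cor()
r_builtin <- cor(x, y)

print(r_manual)
print(r_builtin)
```

The two values should match, and the sign of the result tells you whether `hwy` tends to rise or fall as `displ` increases.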
Use `dplyr` to find the summary statistics for each dataset in the `datasaurus_dozen`.
Use `round()` to round to 3 decimal places.
When you’re done, try making scatter plots!
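One possible approach to this exercise is sketched below. It assumes the `dplyr`, `ggplot2`, and `datasauRus` packages are installed, and that `datasaurus_dozen` has the columns `dataset`, `x`, and `y`. Which summary statistics to include is a guess; the output column names are made up for illustration.

```r
library(dplyr)
library(ggplot2)
library(datasauRus)  # assumed installed; provides datasaurus_dozen

# One row of summary statistics per dataset, rounded to 3 decimal places
summary_stats <- datasaurus_dozen %>%
  group_by(dataset) %>%
  summarise(
    mean_x = round(mean(x), 3),
    mean_y = round(mean(y), 3),
    sd_x   = round(sd(x), 3),
    sd_y   = round(sd(y), 3),
    cor_xy = round(cor(x, y), 3)
  )

print(summary_stats)

# One scatter plot per dataset, arranged in a grid of panels
scatter <- ggplot(datasaurus_dozen, aes(x = x, y = y)) +
  geom_point() +
  facet_wrap(~dataset)

print(scatter)
```

Comparing the table with the plots side by side is the point of the exercise: look at the numbers first, then at the pictures.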