DA 101, Dr. Ladd
Week 5
Mean: AKA “average”

\(\dfrac{600+470+170+430+300}{5} = 394\)
Median: AKA “50th percentile”
Percentile: AKA “quantile”
Outlier: AKA “extreme value”


\(\dfrac{206^2+76^2+(-224)^2+36^2+(-94)^2}{5} = 21,704\)
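The calculation above can be reproduced in base R as a quick check. This is a minimal sketch: the vector name `values` is just an illustrative choice, and the numbers are the five data values from the mean example.

```r
# The five data values from the mean example
values <- c(600, 470, 170, 430, 300)

# Deviations from the mean (394)
deviations <- values - mean(values)   # 206 76 -224 36 -94

# Population variance: sum of squared deviations divided by N
pop_var <- sum(deviations^2) / length(values)   # 21704

# Population standard deviation: square root of the variance
pop_sd <- sqrt(pop_var)   # about 147.32
```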
Rottweilers are tall, and dachshunds are short: each is more than one standard deviation away from the mean.
Were these the results you expected?
When you have “N” data values: divide by N for the variance of a whole population, or by N − 1 when the values are only a sample.
Sample variance: \(\dfrac{108,520}{4}=27,130\)
Sample standard deviation: \(\sqrt{27,130}\approx 164.7\)
Think of it as a “correction” when your data is only a sample. R does this by default!
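As a small illustration of that default, R's built-in `var()` and `sd()` divide by N − 1. The sketch below reuses the five data values from the earlier example; the variable names are illustrative only.

```r
# The five data values from the earlier example
values <- c(600, 470, 170, 430, 300)
n <- length(values)

# var() and sd() use the sample formula (divide by n - 1)
sample_var <- var(values)   # 27130
sample_sd  <- sd(values)    # about 164.71

# To get the population variance instead, undo the correction
pop_var <- sample_var * (n - 1) / n   # 21704
```

If the data really are the whole population, the last line shows one way to convert the result back.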
Be careful: normal distributions are assumed for many statistical analyses!
displ and hwy in the mpg dataset
Pearson’s correlation coefficient multiplies together the two variables’ deviations from their means for each observation, averages those products, and divides by the product of the two standard deviations.
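To make the definition concrete, here is a rough sketch that computes Pearson's coefficient by hand for `displ` and `hwy` and compares it with R's built-in `cor()`. It assumes the ggplot2 package is installed (that is where `mpg` lives). The names `r_manual` and `r_builtin` are illustrative.

```r
library(ggplot2)  # provides the mpg dataset

x <- mpg$displ
y <- mpg$hwy
n <- length(x)

# Sum the products of the deviations, then divide by
# (n - 1) times the product of the two sample standard deviations
r_manual <- sum((x - mean(x)) * (y - mean(y))) / ((n - 1) * sd(x) * sd(y))

# The built-in function should give the same value
r_builtin <- cor(x, y)
all.equal(r_manual, r_builtin)
```

The result is negative: cars with larger engines tend to get fewer highway miles per gallon.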


Use dplyr to find the summary statistics for each dataset in the datasaurus_dozen.
round() to round to 3 decimal places.
When you’re done, try making scatter plots!
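One possible way to approach the exercise is sketched below. It assumes the dplyr, ggplot2, and datasauRus packages are installed, and that `datasaurus_dozen` has the columns `dataset`, `x`, and `y`. The summary column names are illustrative choices, not a required answer.

```r
library(dplyr)
library(ggplot2)
library(datasauRus)  # provides datasaurus_dozen

# One row of summary statistics per dataset, rounded to 3 decimal places
summary_stats <- datasaurus_dozen %>%
  group_by(dataset) %>%
  summarise(
    mean_x = round(mean(x), 3),
    mean_y = round(mean(y), 3),
    sd_x   = round(sd(x), 3),
    sd_y   = round(sd(y), 3),
    cor_xy = round(cor(x, y), 3)
  )

print(summary_stats)

# One scatter plot per dataset
scatter_plots <- ggplot(datasaurus_dozen, aes(x = x, y = y)) +
  geom_point() +
  facet_wrap(~dataset)

print(scatter_plots)
```

Comparing the table with the plots is the point of the exercise: summary statistics alone can hide very different shapes in the data.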