DA 101, Dr. Ladd
Week 3
Load the tidyverse library.
Load the mpg dataset, view it, and get a summary.
Use glimpse() to see columns and their types!
or with dplyr’s special “pipe” notation, %>%:
These two are the same!
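As a minimal sketch of the two equivalent forms, assuming the mpg dataset that ships with the tidyverse (via ggplot2): the object names normal and piped are made up for illustration.

```r
library(tidyverse)  # loads dplyr (pipe, glimpse) and ggplot2 (mpg)

# Ordinary function call: the data goes inside the parentheses
glimpse(mpg)
normal <- summary(mpg)

# Pipe notation: the data on the left is passed as the first argument
mpg %>% glimpse()
piped <- mpg %>% summary()

# Both styles produce the same result
identical(normal, piped)
```

The pipe form pays off once several steps are chained, because each step reads left to right.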
Assign dplyr outputs to variables.
filter() uses standard comparisons.
>: greater than
>=: greater than or equal to
<: less than
<=: less than or equal to
!=: not equal
==: equal (note the double equals sign!)
filter() also uses logical operators to combine comparisons.
& “and”, | “or”, and ! “not”
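The comparisons and logical operators above can be sketched on the mpg dataset. The particular conditions and variable names here are illustrative choices, not taken from the slides.

```r
library(tidyverse)

# A single comparison: engines with displacement above 5 litres
big_engines <- mpg %>%
  filter(displ > 5)

# "or": cars made by either of two manufacturers
audi_or_ford <- mpg %>%
  filter(manufacturer == "audi" | manufacturer == "ford")

# "and" plus "not": 2008 models that are not compacts
new_non_compact <- mpg %>%
  filter(year == 2008 & !(class == "compact"))
```

Each result is saved to its own variable, so the original mpg data stays unchanged.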
filter() drops missing values (NA) because a comparison involving NA is never “true”. You must ask for them explicitly with is.na().
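A small sketch of this behaviour, using a made-up three-row tibble (df and x are hypothetical names):

```r
library(tidyverse)

# A tiny example table with one missing value
df <- tibble(x = c(1, NA, 3))

# The NA row is dropped: NA > 1 evaluates to NA, not TRUE
without_na <- df %>%
  filter(x > 1)

# Asking for missing values explicitly keeps the NA row as well
with_na <- df %>%
  filter(is.na(x) | x > 1)
```

Here without_na has one row (x = 3) and with_na has two rows (NA and 3).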
Work through the exercise in the textbook, section 5.2.4.
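# Select columns by name
only_dates <- flights %>%
  select(year, month, day)
# Select the same columns as a range, from year through day
only_dates <- flights %>%
  select(year:day)
# Select every column except that range
everything_but_dates <- flights %>%
  select(-(year:day))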
See the textbook for additional tips.
# Add new columns; later ones can use columns created earlier in the same call
flights <- flights %>%
  mutate(gain = dep_delay - arr_delay,
         hours = air_time / 60,
         gain_per_hour = gain / hours)
transmute() does the same thing but only keeps the new variables.
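For comparison, here is a sketch of the same calculation with transmute(). It assumes flights comes from the nycflights13 package, and gains_only is a made-up variable name.

```r
library(tidyverse)
library(nycflights13)  # assumed source of the flights data

# Same three calculations as with mutate(), but only the new columns are kept
gains_only <- flights %>%
  transmute(gain = dep_delay - arr_delay,
            hours = air_time / 60,
            gain_per_hour = gain / hours)

# Only the three new column names remain
names(gains_only)
```

The result has as many rows as flights, just without the original columns.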
Summary tables are new dataframes that summarize our original data.
Stat functions to use with summarise(): mean(), median(), min(), max(), sd(), range() (note that range() returns two values, the minimum and the maximum).
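A minimal sketch of a summary table, again assuming flights from the nycflights13 package. The column names in delay_summary are illustrative, and na.rm = TRUE is added because dep_delay contains missing values.

```r
library(tidyverse)
library(nycflights13)  # assumed source of the flights data

# Collapse the whole table into a single row of summary statistics
delay_summary <- flights %>%
  summarise(mean_delay = mean(dep_delay, na.rm = TRUE),
            median_delay = median(dep_delay, na.rm = TRUE),
            sd_delay = sd(dep_delay, na.rm = TRUE),
            max_delay = max(dep_delay, na.rm = TRUE))

delay_summary
```

Without na.rm = TRUE, every statistic here would come back as NA.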
It doesn’t look like anything on its own!
dplyr operations.