CIS 241, Dr. Ladd
import pandas as pd

# Use the name of a file or a URL
taxis = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/taxis.csv')
taxis
Remember to save everything in variables
>   greater than
>=  greater than or equal to
<   less than
<=  less than or equal to
!=  not equal
==  equal (note the double equals sign!)
& “and”, | “or”, and ~ “not”
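As a rough sketch of how these operators combine, the example below builds a tiny dataframe by hand. The rows are made up for illustration; only the column names are borrowed from the taxis data. Note that each comparison needs its own parentheses when joined with & or |.

```python
import pandas as pd

# Made-up rows that borrow column names from the taxis data
taxis = pd.DataFrame({
    'color': ['yellow', 'green', 'green', 'yellow'],
    'passengers': [1, 2, 1, 3],
    'fare': [7.0, 12.5, 30.0, 5.5],
})

# A comparison returns one True/False value per row
is_expensive = taxis['fare'] >= 12.5

# Wrap each comparison in parentheses before combining with & or |
green_and_expensive = taxis[(taxis['color'] == 'green') & (taxis['fare'] >= 12.5)]
solo_or_cheap = taxis[(taxis['passengers'] == 1) | (taxis['fare'] < 6)]

# In pandas, ~ flips True and False, so this keeps every row that is not green
not_green = taxis[~(taxis['color'] == 'green')]
```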
Get only the rows for the green taxis.
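One possible way to do this, sketched on a few made-up rows rather than the full file. The variable name green_taxis is just an illustrative choice.

```python
import pandas as pd

# Made-up rows; the real data comes from the taxis.csv file loaded earlier
taxis = pd.DataFrame({
    'color': ['yellow', 'green', 'green', 'yellow'],
    'fare': [7.0, 12.5, 30.0, 5.5],
})

# Keep only the rows where color equals 'green', and save the result
green_taxis = taxis[taxis['color'] == 'green']
green_taxis
```

The original taxis dataframe is unchanged; the filtered rows live in the new variable.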
The .sort_values() method lets you sort rows by value.
The .rename() method lets you rename columns. Notice we kept the same variable name here!
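A minimal sketch of sorting and renaming, again on made-up rows. The new column name fare_dollars is hypothetical and chosen only for illustration.

```python
import pandas as pd

# Made-up rows that borrow column names from the taxis data
taxis = pd.DataFrame({
    'color': ['yellow', 'green', 'green'],
    'fare': [7.0, 30.0, 12.5],
})

# Sort rows from the highest fare to the lowest
by_fare = taxis.sort_values(by='fare', ascending=False)

# Rename a column, reusing the same variable name for the result
taxis = taxis.rename(columns={'fare': 'fare_dollars'})
```

Assigning the result back to taxis replaces the old dataframe with the renamed one, which is why the variable name stays the same.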
The .assign() method lets you add new columns based on existing ones.
Use .groupby() with summary statistics to make summary tables.
Summary tables are new dataframes that summarize our original data.
This paradigm is known as split-apply-combine, and it’s key to data analysis.
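The sketch below shows one way .assign() and .groupby() might fit together. The rows are made up, and the tip_share column and fare_summary variable are hypothetical names used only for this example.

```python
import pandas as pd

# Made-up rows that borrow column names from the taxis data
taxis = pd.DataFrame({
    'dropoff_borough': ['Manhattan', 'Queens', 'Manhattan', 'Queens'],
    'fare': [10.0, 20.0, 30.0, 40.0],
    'tip': [2.0, 2.0, 3.0, 10.0],
})

# Add a new column computed from two existing ones
taxis = taxis.assign(tip_share=taxis['tip'] / taxis['fare'])

# Split by borough, apply the mean to each group, combine into one table
fare_summary = taxis.groupby('dropoff_borough')[['fare', 'tip_share']].mean()
```

Here fare_summary has one row per borough, so it is a new, smaller dataframe rather than a changed copy of the original.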
Stat functions to use: mean(), median(), min(), max(), std().
taxis.groupby(['dropoff_borough'])
It doesn’t look like anything on its own!
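To see why, here is a small sketch on made-up rows. It assumes .agg() as one convenient way to apply several of the stat functions at once; calling a single function such as .mean() works too.

```python
import pandas as pd

# Made-up rows that borrow column names from the taxis data
taxis = pd.DataFrame({
    'dropoff_borough': ['Bronx', 'Queens', 'Bronx', 'Queens'],
    'distance': [1.0, 4.0, 3.0, 6.0],
})

# On its own, groupby() only records how the rows are split into groups
groups = taxis.groupby(['dropoff_borough'])
print(type(groups).__name__)

# Applying stat functions turns the groups into a summary table
distance_summary = groups['distance'].agg(['mean', 'median', 'min', 'max', 'std'])
```

The grouped object is a set of instructions for splitting the data; the summary table only appears once a stat function is applied to it.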