CIS 241, Dr. Ladd
mydata = pd.read_csv("name_of_file.csv")
Remember to save everything in variables
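A minimal sketch of why saving matters: pandas methods return a new dataframe instead of changing the original, so a result is lost unless it is stored in a variable. The CSV text and column names here are made up for illustration, and io.StringIO stands in for a real file.

```python
import io
import pandas as pd

# Made-up CSV text standing in for a real file (hypothetical columns).
csv_text = "color,fare\nyellow,12.5\ngreen,7.0\ngreen,9.5\n"

mydata = pd.read_csv(io.StringIO(csv_text))

# This computes a sorted copy, but the result is not saved anywhere.
mydata.sort_values("fare")

# Saving the result in a variable keeps it for later steps.
sorted_data = mydata.sort_values("fare")

print(mydata["fare"].tolist())       # original row order is unchanged
print(sorted_data["fare"].tolist())  # sorted copy
```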
>   greater than
>=  greater than or equal to
<   less than
<=  less than or equal to
!=  not equal
==  equal (note the double equals sign!)
& “and”, | “or”, and ~ “not”
Get only the rows for the green taxis.
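One way this filter might look, as a sketch on a toy dataframe. The rows below are made up for illustration, and the column names color, distance, and fare are assumptions about the taxis data.

```python
import pandas as pd

# Toy stand-in for the taxis data; these rows are made up for illustration.
taxis = pd.DataFrame({
    "color": ["yellow", "green", "green", "yellow"],
    "distance": [1.2, 3.4, 0.8, 5.0],
    "fare": [7.0, 14.5, 5.5, 20.0],
})

# A comparison produces one True/False value per row...
is_green = taxis["color"] == "green"

# ...and putting it in square brackets keeps only the True rows.
green_taxis = taxis[is_green]

# Combine conditions with & (and), | (or), ~ (not).
# Each condition needs its own parentheses.
long_green = taxis[(taxis["color"] == "green") & (taxis["distance"] > 1)]
not_green = taxis[~(taxis["color"] == "green")]

print(green_taxis)
```

Note that the filtered rows are saved in a new variable, so the full taxis dataframe is still available.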
The .sort_values() method lets you sort rows by value.
.rename() lets you rename columns.
Notice we kept the same variable name here!
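A short sketch of both methods on a toy dataframe. The rows are made up, and the new column name fare_usd is a hypothetical choice for illustration.

```python
import pandas as pd

# Toy stand-in for the taxis data; these rows are made up for illustration.
taxis = pd.DataFrame({
    "color": ["yellow", "green", "yellow"],
    "fare": [12.0, 7.5, 20.0],
})

# Sort rows from lowest to highest fare.
by_fare = taxis.sort_values("fare")

# ascending=False puts the highest fare first.
highest_first = taxis.sort_values("fare", ascending=False)

# Rename a column and save the result under the same variable name.
taxis = taxis.rename(columns={"fare": "fare_usd"})

print(by_fare)
print(taxis.columns.tolist())
```

Reusing the name taxis for the renamed dataframe replaces the old version, which is fine when the old column name is no longer needed.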
.assign() lets you add new columns based on existing ones.
Use .groupby() with summary statistics to make summary tables.
Summary tables are new dataframes that summarize our original data.
This paradigm is known as split-apply-combine, and it’s key to data analysis.
Stat functions to use: mean(), median(), min(), max(), std().
taxis.groupby(['dropoff_borough'])
It doesn’t look like anything on its own!
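A sketch of how the grouped object might be turned into a summary table, with .assign() used first to add a column. The rows are made up, and the fare and tip columns and the tip_share name are assumptions for illustration.

```python
import pandas as pd

# Toy stand-in for the taxis data; these rows are made up for illustration.
taxis = pd.DataFrame({
    "dropoff_borough": ["Manhattan", "Queens", "Manhattan", "Brooklyn", "Queens"],
    "fare": [10.0, 20.0, 14.0, 8.0, 30.0],
    "tip": [2.0, 4.0, 0.0, 1.0, 6.0],
})

# Add a new column based on existing ones.
taxis = taxis.assign(tip_share=taxis["tip"] / taxis["fare"])

# Split: group the rows by borough. On its own this is only a GroupBy object.
grouped = taxis.groupby(["dropoff_borough"])

# Apply and combine: choose a column and a stat function to get a summary table.
mean_fare = grouped["fare"].mean()

# .agg() computes several statistics at once, one column per statistic.
fare_summary = grouped["fare"].agg(["mean", "median", "min", "max", "std"])

print(mean_fare)
print(fare_summary)
```

A group with a single row, such as Brooklyn here, has no standard deviation, so std() reports it as NaN.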