Course Schedule

Date | Topics & Slides | Workshops | Assignments
14 & 16 Jan. | What Is Data? & Responsible Data Collection | Jupyter |
21 & 23 Jan. | Python & Data Wrangling | Movies |
28 & 30 Jan. | Exploratory Data Analysis & Visualization | Movies | Documentation Assignment Due 30 Jan.
4 & 6 Feb. | Hypothesis Testing: Comparison of Means | Sports |
11 & 13 Feb. | Hypothesis Testing: Correlation | Sports |
18 & 20 Feb. | Linear Regression | Business | Take-Home Test I Due 20 Feb.
25 & 27 Feb. | Logistic Regression | Business |
4 & 6 Mar. | K-Nearest Neighbors | Health | Tutorial Assignment Due 6 Mar.
11 & 13 Mar. | SPRING BREAK | |
18 & 20 Mar. | Naive Bayes Classifier | |
25 & 27 Mar. | Decision Trees and the Random Forest | Health | Take-Home Test II Due 25 Mar.
1 & 3 Apr. | Clustering and Unsupervised Approaches | Literature | Project Proposal Due 3 Apr.
8 & 10 Apr. | Neural Networks | Literature |
15 & 17 Apr. | Ethical Data Science & Project Discussion | | Progress Report Due 15 Apr.
22 & 24 Apr. | Sample Final Project & Project Discussion | | Video Presentations Due 22 Apr.
1 May 2-5pm | Panel Presentations | | Final Project Due 1 May at 2pm

Note on the schedule

Keep in mind that parts of this schedule may change during the semester. If anything changes, I’ll update this page and give you plenty of advance notice.

Software

All projects in this course will be written and analyzed in Python, an open-source programming language. We will use JupyterHub as our programming environment. No previous experience with Python, statistical software packages, or computer programming is required.