8  Final Project

Published

December 13, 2025

Complete by: Saturday 13 Dec. at 2pm

Please note that I cannot accept any work past this deadline.

Submit a polished Jupyter Notebook HTML file that reports the results of your final data analysis project, as outlined in your Project Planning Document. Aim for roughly 5-7 written pages (this is hard to measure in a Jupyter notebook, so treat it as a guideline). Think of this report as a “final takeaway” that draws on all the skills you’ve learned in class over the semester. Below is a rough structure for your final written report.

This should be a ready-to-deliver report with clear section headers and interpretations of any statistical or graphical output (like several of our previous projects). You can review the Sample Final Project to get a sense of what you’ll need to accomplish.

8.1 Introduction

  • Provide a one-to-three paragraph introduction, professionally written, that gives an overview of the essentials someone needs to know to make sense of the data you show.
  • Consider the ethical and logistical challenges that your data presents, and discuss them in your introduction. Address the ethical issues in your project in terms of the ADSA’s four lenses.
  • You may reuse some of the text you wrote in your Project Planning document, but think carefully about how to revise and expand it.
  • You must cite and link to your dataset, and you can use Markdown to create contextual links like so: [text here](website).

8.2 Data Explanation and Exploration

  • Provide some details describing the data you are working with. What are the observations? The key variables you will be looking at? Are there any particular challenges in the data you will need to work through or be aware of during analysis?
  • Present at least two univariate or bivariate analyses of key attributes in your data—these may be summary statistics, confidence intervals, distributions, correlations, or other observations. These may be closely related to the visualizations you create in this section.
  • Provide four polished visuals that describe the data in a way relevant to your question. These should be descriptive and not tied to your statistical model (for example, not a regression plot). No more than two of them should be the same visualization type. Write text that describes the data and explains what the visuals tell you about it, or about decisions you will need to make for the analysis.
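As one illustration of the kind of univariate analysis described above, here is a minimal sketch that computes a sample mean and an approximate 95% confidence interval using only the Python standard library. The function name `mean_confidence_interval` and the sample values are hypothetical, made up for this example. The interval uses a normal approximation; for small samples, a t-based interval would usually be the better choice.

```python
import math
import statistics
from statistics import NormalDist


def mean_confidence_interval(values, confidence=0.95):
    """Return (mean, lower, upper) using a normal approximation."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(values)
    # Standard error of the mean, based on the sample standard deviation.
    std_err = statistics.stdev(values) / math.sqrt(n)
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, mean - z * std_err, mean + z * std_err


# Hypothetical sample values, invented purely for illustration.
sample = [12.1, 9.8, 11.4, 10.9, 13.2, 10.1, 11.7, 12.5]
mean, lower, upper = mean_confidence_interval(sample)
print(f"mean = {mean:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

In your report, a number like this should always come with a sentence that interprets it in terms of your data.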

8.3 Statistical Analysis and Interpretation

  • Provide at least two distinct statistical approaches (for example: linear regression and hypothesis testing; naive Bayes classification and k-means clustering; KNN regression and random forest classification; etc.) that you interpret correctly and fully in the text. These can be whatever you choose, but you should explain why you chose each model and why it fits the data. It’s recommended that you use two different kinds of approach (i.e., not just two methods for regression or two methods for classification).
  • Provide at least three polished visuals that specifically support and validate the model(s) you have developed (e.g., residual and regression line/scatter, histogram showing normality of data or residuals, confusion matrix, etc.), or help to communicate your main result. Visuals should have captions and be referred to clearly in your text, and they should not all be the same (e.g., not three scatterplots). You’ll likely need more than three to complete the usual validation steps.
  • Your text should fully explain what you show and what you found, in plain language and in terms of the data, so that someone unfamiliar with your data, code, and models can follow it.

8.4 Conclusions

  • Provide one or two paragraphs of conclusions about the data: what does it tell us, what are the limitations of this data/model, and what is one direction you could envision for future data analysts or data collectors?
  • Include citations for the two secondary references that you used in your Project Planning document, and explain how they pertain to your conclusions and insights. You may cite a reference by linking directly to it in your Markdown, like so: [text here](link here), and by listing the full citation below the Conclusions section. Please ask me if you aren’t sure how to cite references.