02: College Football
In honor of the Super Bowl this week, we’ll work on a network of football games compiled by Girvan and Newman (two famous network scientists) in 2002. Go to Newman’s network data site and download the American College football network available there. This will be a zip file that includes a .gml file and a text file with some important metadata.
In a short Jupyter notebook report, answer the following questions about this network. Don’t simply calculate the answers: make sure you’re fully explaining (in writing) the metrics and visualizations that you generate. Consider the Criteria for Good Reports as a guide. You can create markdown cells with section headers to separate the different sections of the report. Rather than number the report as if you’re answering distinct questions, use the questions as a guide to do some data storytelling, i.e. explain this network’s data in an organized way.
What are some basic facts about this network? How many nodes and edges does it have? Is it directed? Is it connected? What do these basic measures tell you about the network?
Is the college football network a small world? Why or why not? Think about which path-related metrics you would need in order to draw conclusions about this. You may also need to create a degree distribution graph, and you can use the centrality calculations that you’ll create in the next step.
What are the most central teams in the network? Calculate degree, betweenness, eigenvector, and closeness centrality, and look at the highest values for each measure. Add all of these measures as node attributes to your network. Which teams have high centrality, and why do you think that is? What do the different centrality measures tell you about the network? Consider the distributions of centrality in the network as well as the ratios between different measures (e.g. the ratio of betweenness to degree).
What are the most central conferences in the network? How might you assess conferences as groups of nodes?
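As a starting point for the questions above, here is a minimal sketch of the NetworkX calls involved: basic facts, path-related metrics, and the four centrality measures stored as node attributes. It runs on NetworkX's built-in karate club graph as a stand-in so that it works without the download. The filename `football.gml` in the comment is an assumption about what the zip file contains, and the `summary`, `centralities`, and `top` names are only illustrative.

```python
import networkx as nx

# Stand-in graph so the sketch runs on its own. For the assignment you would
# load the downloaded file instead, for example:
#   G = nx.read_gml("football.gml")  # assumed filename, check your zip file
G = nx.karate_club_graph()

# Basic facts about the network.
summary = {
    "nodes": G.number_of_nodes(),
    "edges": G.number_of_edges(),
    "directed": G.is_directed(),
    "connected": nx.is_connected(G),  # defined for undirected graphs
    "density": nx.density(G),
}

# Path-related metrics are only defined when the graph is connected.
if summary["connected"]:
    summary["avg_shortest_path"] = nx.average_shortest_path_length(G)
    summary["diameter"] = nx.diameter(G)
summary["avg_clustering"] = nx.average_clustering(G)

# The four centrality measures, each a dict of node -> score.
centralities = {
    "degree": nx.degree_centrality(G),
    "betweenness": nx.betweenness_centrality(G),
    "eigenvector": nx.eigenvector_centrality(G, max_iter=1000),
    "closeness": nx.closeness_centrality(G),
}

# Store every measure as a node attribute under its own name.
for name, values in centralities.items():
    nx.set_node_attributes(G, values, name)

# The five highest-scoring nodes for each measure.
top = {
    name: sorted(values, key=values.get, reverse=True)[:5]
    for name, values in centralities.items()
}

for key, value in summary.items():
    print(f"{key}: {value}")
for name, nodes in top.items():
    print(f"top {name}: {nodes}")
```

The numbers alone are not the report: a short average path length combined with high clustering is the usual pattern to discuss when arguing for or against a small world, and the written explanation of what each value means is still up to you.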
n.b. To answer these questions, you’ll need to use a mixture of the NetworkX skills we’ve been working on and to combine NetworkX code with pandas and Altair.
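One possible way to combine NetworkX with pandas is sketched below: node attributes become a DataFrame, which can then be used to compute ratios between measures and to summarize groups of nodes. The sketch again uses the karate club graph as a stand-in, with its `club` attribute playing the role of a conference. For the football data, the name of the attribute that holds each team's conference is an assumption you should check against the metadata text file. The `group_attr`, `nodes`, and `by_group` names are hypothetical.

```python
import networkx as nx
import pandas as pd

# Stand-in graph. Its "club" node attribute plays the role of a conference.
G = nx.karate_club_graph()
group_attr = "club"  # assumption: replace with the conference attribute in your data

# Attach two centrality measures as node attributes.
nx.set_node_attributes(G, nx.degree_centrality(G), "degree")
nx.set_node_attributes(G, nx.betweenness_centrality(G), "betweenness")

# One row per node, one column per node attribute.
nodes = pd.DataFrame.from_dict(dict(G.nodes(data=True)), orient="index")

# Ratio of betweenness to degree. This assumes no node has degree zero,
# which holds for the stand-in graph.
nodes["betweenness_per_degree"] = nodes["betweenness"] / nodes["degree"]

# Average centrality per group, as one way to assess groups of nodes.
by_group = (
    nodes.groupby(group_attr)[["degree", "betweenness"]]
    .mean()
    .sort_values("betweenness", ascending=False)
)

print(nodes.sort_values("betweenness_per_degree", ascending=False).head())
print(by_group)
```

Averaging is only one choice for summarizing a group; medians, sums, or counts of edges within and between groups may tell a different story. A DataFrame shaped like `nodes` can also be passed to Altair to chart the distributions.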
When you’re finished, remove these instructions from the top of the file, leaving only your own writing and code. Export the notebook as an HTML file, check to make sure everything is formatted correctly, and submit your HTML file to Sakai.