As I mentioned in class, it’s difficult to develop a one-off workshop exercise for data visualization. For one thing, data viz is its own distinct field, and there’s a lot of theoretical and technical knowledge that goes into even a simple data visualization. For another, in DH, data viz is often closely linked to a research question or a particular kind of analysis, so it’s difficult to abstract the processes of data visualization from specific questions. We will explore these (productive) difficulties in our readings this week by asking: to what extent can and should the field of data visualization interact with our aims as humanities researchers and teachers?

For this tutorial, you’ll explore multiple ways of visualizing the same data set.

Download fight-song.csv, a table of metadata about college fight songs from ubiquitous election-season data journalism outlet FiveThirtyEight. Spend a little time examining the data before moving on to the next step. Does the data imply a hierarchy? Does it imply change over time? How many of the features are numerical, how many are categorical, and how many are binary (yes or no)? What sorts of questions would you like to ask about this data set or about specific features of it?
Enter the data into RAWGraphs. (You can refer to their help texts for a basic how-to.) Look through the different types of visualizations and choose two that will work for this data set. You can get brief descriptions of the graph types by clicking on them, but for more detail on these visualization types, Wikipedia is your friend. Choose visualization types based on what you want to know about the data. Are you interested in change over time? In categories and groupings?
Create visualizations for each of the graph types you’ve chosen. Experiment with different features as different parts of your visualization. For example, maybe in one version of your scatter plot, color will represent conference, but in another color will represent whether or not the song is “official.”
After you’ve created two types of visualizations and explored different permutations of them, take a look at FiveThirtyEight’s interactive article that uses this data. How does the article choose to represent the data visually? What parts of the data does it turn into a chart as opposed to a table or graph? What rhetorical effects do these choices have? What (if any) argument is the piece advancing about the data, and how are those conclusions different from what you found using RAWGraphs?
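
If you’d rather not count feature types by hand in the first step, here is a minimal Python sketch of one way to sort the columns into numerical, categorical, and binary. It is entirely optional and makes some assumptions: the helper names (load_rows, classify_columns) and the list of yes/no spellings are my own, and the toy table in the example uses made-up column names and values rather than the real contents of fight-song.csv. It uses only Python’s built-in csv module.

```python
import csv
import io

# Assumption: these spellings mark a yes/no column. Adjust after looking at the file.
BINARY_VALUES = {"yes", "no", "true", "false", "0", "1"}


def is_number(value):
    """Return True if the string can be parsed as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def load_rows(handle):
    """Read an open CSV file (or any file-like object) into a list of dicts."""
    return list(csv.DictReader(handle))


def classify_columns(rows):
    """Label each column 'binary', 'numerical', 'categorical', or 'empty'.

    Empty cells are ignored when deciding a column's type. A column that
    contains only 0s and 1s is treated as binary, not numerical.
    """
    kinds = {}
    if not rows:
        return kinds
    for column in rows[0]:
        cells = ((row.get(column) or "").strip() for row in rows)
        values = [cell for cell in cells if cell]
        if not values:
            kinds[column] = "empty"
        elif all(v.lower() in BINARY_VALUES for v in values):
            kinds[column] = "binary"
        elif all(is_number(v) for v in values):
            kinds[column] = "numerical"
        else:
            kinds[column] = "categorical"
    return kinds


# Toy table with invented column names and values, for illustration only.
sample = io.StringIO(
    "school,conference,bpm,official\n"
    "A,X,120,Yes\n"
    "B,Y,98,No\n"
)
print(classify_columns(load_rows(sample)))

# To try it on the downloaded file instead (assuming it sits in the same folder):
# with open("fight-song.csv", newline="", encoding="utf-8") as handle:
#     print(classify_columns(load_rows(handle)))
```

Treat the labels as a starting point rather than an answer. A column of years, for instance, will come back as numerical, but whether you plot it as a quantity or as a point in time is an interpretive choice.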

You might also think about this exercise in relation to our readings, especially the Drucker reading. To what extent are the “standard” forms of visualization offered by RAWGraphs, or used by FiveThirtyEight, limiting? What new visualization paradigms might you imagine for use in the humanities?