This week’s workshop is a bit different from the previous ones. Instead of learning how to use a tool or technique, we’re trying to understand a method by looking through a detailed example.

Read through this “notebook” on statistical modeling. It includes step-by-step code as well as explanatory text for building a statistical model of genre categories. The goal here is not to learn how to code, and in fact the code is far less important than the concepts and the results being described. Hopefully you’ll walk away from this workshop with a solid foundation for understanding statistical modeling and machine learning.

Read through this example and try to get a grasp on how modeling works. I’ve (hopefully) written it in such a way that no previous knowledge of math or code is required. We’re not going very deep into the math here, just enough for you to have a general idea of what’s going on. As we talked about a little last class, I really do think it’s possible for folks without a math background to get a handle on these concepts. You don’t need to know the ins and outs of the equations, but you can learn enough to understand what’s going on and to critique and even use these methods responsibly. (You might also find it helpful to read the introduction to Underwood’s Distant Horizons first.)

If we were having in-person classes (alas!), I would talk through this example in class, and you’d have lots of opportunities to ask questions. Even though you’re going through the notebook on your own, don’t deprive yourself of the opportunity to get your questions answered. When you encounter something unfamiliar or would like to know more, go ahead and email me your question. I’ll be very happy to explain things in more detail or even meet over Zoom. Or if you’d prefer, just make a note of your question, and we can talk a little more about this example at the start of next class.