9  Corpora

9.1 Sample Research Questions

  • How many subjects exist within the early modern corpus?
  • How pervasive is poetry in early modern print?
  • Which texts are most similar to, or most different from, one another?

9.2 Tools and Websites

9.3 Activities

9.3.1 Viewing the Whole Corpus

  1. Explore the Bibliographia: first, search for a specific text and note the subject headings assigned to it. Then, find all of the texts that share one of those subject headings and check whether the “map” has placed them in the same area.
  2. Compare this map to the version we made in Nomic Atlas, using semantic search to find topics you’re interested in. How do the subject headings from the Bibliographia compare to the automatically generated topics and search results?
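
One way to make the comparison in step 2 concrete is to cross-tabulate the two kinds of labels: for each subject heading, count how its texts are spread across the automatic topics. The following is a minimal sketch using only the Python standard library. The heading and topic labels are invented placeholders, not real output from the Bibliographia or Nomic Atlas, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (subject heading, automatic topic) pairs, one per text.
# These labels are invented for illustration only.
pairs = [
    ("Sermons", "topic_a"),
    ("Sermons", "topic_a"),
    ("Sermons", "topic_b"),
    ("Poetry", "topic_c"),
    ("Poetry", "topic_c"),
    ("Medicine", "topic_b"),
]


def cross_tabulate(pairs):
    """Count how many texts fall under each (heading, topic) combination."""
    table = defaultdict(Counter)
    for heading, topic in pairs:
        table[heading][topic] += 1
    return table


def dominant_topic_share(table):
    """For each heading, return its most common topic and that topic's share."""
    shares = {}
    for heading, counts in table.items():
        topic, n = counts.most_common(1)[0]
        shares[heading] = (topic, n / sum(counts.values()))
    return shares


table = cross_tabulate(pairs)
shares = dominant_topic_share(table)
for heading, (topic, share) in sorted(shares.items()):
    print(f"{heading}: mostly {topic} ({share:.0%})")
```

A heading whose texts land almost entirely in one automatic topic suggests the two systems agree about that group; a heading spread across many topics is a good place to look more closely at the map.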

9.3.2 Modeling Early Modern Corpora

  1. Follow the tutorials in the [EarlyPrint + Python] notebooks, especially the Word Embeddings and Unsupervised Clustering exercises.
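
Before working through the notebooks, it may help to see the basic idea behind asking which texts are most similar. The sketch below is not the notebooks' method: it swaps in raw word counts and cosine similarity as a much simpler stand-in for word embeddings, and it uses only the Python standard library. The three short "texts" and all function names are hypothetical placeholders rather than EarlyPrint data or APIs.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count its alphabetic word tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_pair(texts):
    """Return (score, name_1, name_2) for the most similar pair of texts."""
    vectors = {name: word_counts(body) for name, body in texts.items()}
    names = sorted(vectors)
    return max(
        (cosine_similarity(vectors[x], vectors[y]), x, y)
        for i, x in enumerate(names)
        for y in names[i + 1:]
    )


# Invented toy texts, standing in for full documents from a corpus.
texts = {
    "sermon_1": "grace and faith and the mercy of god",
    "sermon_2": "faith in god and the grace of christ",
    "herbal": "the root of the herb cures the fever",
}

score, first, second = most_similar_pair(texts)
print(f"Most similar: {first} and {second} (cosine = {score:.2f})")
```

Raw counts are dominated by very common words such as "the" and "and", which is one reason the tutorials move on to weighted and embedding-based representations before clustering.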

9.4 References