3 Words
3.1 Sample Research Questions
- When is the word “data” introduced into English, and how does its use change over time?
- Does Shakespeare have a vocabulary distinct from other authors?
- How do we overcome historical spelling and word-form irregularity?
3.2 Tools and Websites
- EarlyPrint & EEBO
- NLP & Text Analysis
3.3 Activities
3.3.1 Comparing Word Search Strategies
Choose a word and search for it using Corpus Search and the N-Gram Browser. How do the results differ when you look for regularized and non-regularized forms?
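To see why regularized and non-regularized searches can return different totals, here is a minimal sketch in Python. The spelling pairs in `REGULARIZED` and the sample sentence are invented for illustration and are not drawn from EarlyPrint data; the real corpus supplies its own regularized forms.

```python
from collections import Counter

# Hypothetical original-spelling -> regularized-spelling pairs (illustrative only).
REGULARIZED = {
    "loue": "love",
    "vertue": "virtue",
    "vertu": "virtue",
}

def count_forms(tokens, mapping=None):
    """Count lowercased tokens, optionally folding variants into a regularized form."""
    mapping = mapping or {}
    return Counter(mapping.get(t.lower(), t.lower()) for t in tokens)

# An invented sentence that mixes old and modern spellings.
tokens = "Loue is a vertue and love is vertu".split()

original_counts = count_forms(tokens)                # each spelling counted separately
regularized_counts = count_forms(tokens, REGULARIZED)  # variants merged

print("love (original spellings):", original_counts["love"])
print("love (regularized):", regularized_counts["love"])
```

A search on the original spelling finds only the tokens printed exactly that way, while a search on the regularized form gathers every variant mapped to it.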
Select a phrase that includes that word, and devise a CQL (Corpus Query Language) query to search for it in Corpus Search. Look for the same phrase in the N-Gram Browser and Phrase Search. Can you begin to interpret the results of these three interfaces?
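If the bracket syntax of CQL is unfamiliar, the following Python sketch assembles a query string from its parts. It assumes the common CQL convention of one bracketed constraint per token, with `&` joining several conditions on the same token. The attribute names (`word`, `lemma`, `pos`) and the tag value `n1` are assumptions here; check the Corpus Search help page for the attributes it actually supports.

```python
def token(**attrs):
    """Build one CQL token constraint, e.g. [lemma="love"].

    With no attributes it returns [], which matches any single token.
    Values are inserted as-is, so this sketch does not escape quotes.
    """
    if not attrs:
        return "[]"
    parts = [f'{name}="{value}"' for name, value in attrs.items()]
    return "[" + " & ".join(parts) + "]"

def phrase(*tokens):
    """Join token constraints into a sequence: tokens must occur in this order."""
    return " ".join(tokens)

# "true love": an exact word followed by any form of the lemma "love".
query = phrase(token(word="true"), token(lemma="love"))
print(query)

# A wildcard slot between two words, and two conditions on one token.
print(phrase(token(lemma="love"), token(), token(word="God")))
print(token(lemma="love", pos="n1"))
```

The printed strings can be pasted into the search box and adjusted by hand; the helper functions are only a way to see how a phrase query is composed.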
Export/download the results of the Corpus Search and look within those results for related words or other collocates.
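One way to look for collocates in exported results is to count the words that fall within a few positions of each hit. The sketch below assumes a CSV export with `left`, `hit`, and `right` columns holding the context around each match; those column names, the sample rows, and the stopword list are all hypothetical, so adapt them to the file you actually download.

```python
import csv
import io
from collections import Counter

# Invented stand-in for an exported results file (not real corpus data).
SAMPLE_EXPORT = (
    "left,hit,right\n"
    "upon these,data,given we find\n"
    "from the,data,given in the\n"
)

def collocates(rows, window=2, stopwords=frozenset()):
    """Count words within `window` tokens to the left or right of each hit."""
    counts = Counter()
    for row in rows:
        left = row["left"].lower().split()[-window:]   # nearest words before the hit
        right = row["right"].lower().split()[:window]  # nearest words after the hit
        counts.update(w for w in left + right if w not in stopwords)
    return counts

rows = list(csv.DictReader(io.StringIO(SAMPLE_EXPORT)))
top = collocates(rows, window=2, stopwords={"the", "we", "in"})

for word, n in top.most_common(3):
    print(word, n)
```

To run this on a real export, replace `io.StringIO(SAMPLE_EXPORT)` with an opened file. Widening `window` captures looser associations at the cost of more noise.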
3.3.2 Text Analysis
Follow the tutorials in the EarlyPrint + Python notebooks, especially the Metadata and Text Analysis exercises:
- First, read the Programming Historian tutorial on Text Similarity
- Then, work through the EarlyPrint notebooks using Google Colab
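As a warm-up before opening the notebooks, the basic idea behind text similarity can be sketched in plain Python. This example uses cosine similarity over raw word counts, which is one common measure and not necessarily the exact method the tutorials use (they may rely on libraries such as scikit-learn). The three short texts are placeholders.

```python
import math
from collections import Counter

def word_counts(text):
    """Lowercase a text, split on whitespace, and count each word."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    shared = set(a) & set(b)
    dot = sum(a[w] * b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)

# Placeholder texts; swap in passages from the corpus.
texts = {
    "a": "to be or not to be",
    "b": "to be or not",
    "c": "love looks not with the eyes",
}
counts = {name: word_counts(t) for name, t in texts.items()}

sim_ab = cosine_similarity(counts["a"], counts["b"])
sim_ac = cosine_similarity(counts["a"], counts["c"])
print(f"a vs b: {sim_ab:.3f}")
print(f"a vs c: {sim_ac:.3f}")
```

Texts that share more of their vocabulary score closer to 1. With early modern texts, running the comparison on regularized spellings or lemmas rather than original spellings will usually give more meaningful scores.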