Lesson Index
Our lessons are organized by typical phases of the research process, as well as by general topic. Lessons can be filtered by the categories listed below. If you can’t find a skill, technology, or tool you’re looking for, please let us know!
- acquire (13)
- transform (36)
- analyze (35)
- present (27)
- sustain (2)
- APIs (9)
- Python (35)
- Data Management (10)
- Data Manipulation (31)
- Distant Reading (16)
- Set up (7)
- Linked Open Data (2)
- Mapping (15)
- Network Analysis (7)
- Web Scraping (5)
- Digital Publishing (14)
- R (10)
- Machine Learning (6)
- Creative Coding (2)
- Data Visualization (21)
- Modeling (1)
Filtering results: Python (35 lessons), sorted by publication date
- Ian Goodale
Analyzing Multilingual French and Russian Text using NLTK, spaCy, and Stanza
This lesson covers tokenization, part-of-speech tagging, and lemmatization, as well as automatic language detection, for non-English and multilingual text. You’ll learn how to use the Python packages NLTK, spaCy, and Stanza to analyze a multilingual Russian and French text.
analyzing python data-manipulation distant-reading 2024-11-13 2
- Charles Goldberg and Zach Haala
Facial Recognition in Historical Photographs with Artificial Intelligence in Python
In this lesson, you’ll learn computer vision and machine learning principles for object recognition, and how to apply these principles using Python to recognize and classify smiling faces in historical photographs.
analyzing python machine-learning 2024-06-25 1
- Avery Blankenship, Sarah Connell, and Quinn Dombrowski
Understanding and Creating Word Embeddings
Word embeddings allow you to analyze the usage of different terms in a corpus of texts by capturing information about their contextual usage. Through a primarily theoretical lens, this lesson will teach you how to prepare a corpus and train a word embedding model. You will explore how word vectors work, how to interpret them, and how to answer humanities research questions using them.
analyzing python distant-reading machine-learning 2024-01-31 2
- Grace Di Méo
Creating Interactive Visualizations with Plotly
This lesson demonstrates how to create interactive data visualizations in Python with Plotly’s open-source graphing libraries using materials from the Historical Violence Database.
presenting python data-visualization 2023-12-13 2
- Jeff Blackadar
Transcribing Handwritten Text with Python and Microsoft Azure Computer Vision
Tools for machine transcription of handwriting are practical and labour-saving if you need to analyse or present text in digital form. This lesson will explain how to write a Python program to transcribe handwritten documents using Microsoft’s Azure Cognitive Services, a commercially available service that has a cost-free option for low volumes of use.
transforming python api data-manipulation 2023-12-06 2
- Megan S. Kane
Corpus Analysis with spaCy
This lesson demonstrates how to use the Python library spaCy for analysis of large collections of texts. This lesson details the process of using spaCy to enrich a corpus via lemmatization, part-of-speech tagging, dependency parsing, and named entity recognition. Readers will learn how the linguistic annotations produced by spaCy can be analyzed to help researchers explore meaningful trends in language patterns across a set of texts.
analyzing data-manipulation distant-reading python 2023-11-02 2
- Jonathan Reades and Jennie Williams
Clustering and Visualising Documents using Word Embeddings
This lesson uses word embeddings and clustering algorithms in Python to identify groups of similar documents in a corpus of approximately 9,000 academic abstracts. It will teach you the basics of dimensionality reduction for extracting structure from a large corpus and how to evaluate your results.
analyzing machine-learning network-analysis python data-visualization 2023-08-09 3
- Isabelle Gribomont
OCR with Google Vision API and Tesseract
Google Vision and Tesseract are both popular and powerful OCR tools, but they each have their weaknesses. In this lesson, you will learn how to combine the two to make the most of their individual strengths and achieve even more accurate OCR results.
transforming api python data-manipulation 2023-03-31 2
- Christopher Goodwin
Creating GUIs in Python for Digital Humanities Projects
In this lesson, you will use Qt Designer and Python to design and implement a simple graphical user interface and application to merge PDF files. This lesson also demonstrates how to package the application for distribution to other personal computers.
presenting python data-management 2023-03-22 2
- Chantal Brousseau
Interrogating a National Narrative with GPT-2
In this lesson, you will learn how to apply a Generative Pre-trained Transformer language model to a large-scale corpus so that you can locate broad themes and trends within written text.
analyzing python data-manipulation 2022-10-03 2
- Daniel van Strien, Kaspar Beelen, Melvin Wevers, Thomas Smits, and Katherine McDonough
Computer Vision for the Humanities: An Introduction to Deep Learning for Image Classification (Part 1)
This is the first of a two-part lesson introducing deep learning based computer vision methods for humanities research. Using a dataset of historical newspaper advertisements and the fastai Python library, the lesson walks through the pipeline of training a computer vision model to perform image classification.
analyzing python machine-learning 2022-08-17 3
- Daniel van Strien, Kaspar Beelen, Melvin Wevers, Thomas Smits, and Katherine McDonough
Computer Vision for the Humanities: An Introduction to Deep Learning for Image Classification (Part 2)
This is the second of a two-part lesson introducing deep learning based computer vision methods for humanities research. This lesson digs deeper into the details of training a deep learning based computer vision model. It covers some challenges one may face due to the training data used and the importance of choosing an appropriate metric for your model. It presents some methods for evaluating the performance of a model.
analyzing python machine-learning 2022-08-17 3
- Matthew J. Lavin
Regression Analysis with Scikit-Learn (part 1 - Linear)
This lesson is the first of a two-part lesson focusing on an indispensable set of data analysis methods, logistic and linear regression. It provides an overview of linear regression and walks through running both algorithms in Python (using scikit-learn). The lesson also discusses interpreting the results of a regression model and some common pitfalls to avoid.
analyzing python 2022-07-13 3
- Matthew J. Lavin
Regression Analysis with Scikit-learn (part 2 - Logistic)
This lesson is the second in a two-part lesson focusing on regression analysis. It provides an overview of logistic regression, how to use Python (scikit-learn) to make a logistic regression model, and a discussion of interpreting the results of such analysis.
analyzing python 2022-07-13 3
- Thomas Jurczyk
Clustering with Scikit-Learn in Python
This tutorial demonstrates how to apply clustering algorithms with Python to a dataset with two concrete use cases. The first example uses clustering to identify meaningful groups of Greco-Roman authors based on their publications and their reception. The second use case applies clustering algorithms to textual data in order to discover thematic groups. After finishing this tutorial, you will be able to use clustering in Python with Scikit-learn applied to your own data, adding an invaluable method to your toolbox for exploratory data analysis.
analyzing python data-manipulation 2021-09-29 3
- Quinn Dombrowski, Tassie Gniady, and David Kloster
Introduction to Jupyter Notebooks
Jupyter notebooks provide an environment where you can freely combine human-readable narrative with computer-readable code. This lesson describes how to install the Jupyter Notebook software, how to run and create Jupyter notebook files, and contexts where Jupyter notebooks can be particularly helpful.
presenting python website 2019-12-08 1
- Charlie Harper
Visualizing Data with Bokeh and Pandas
In this lesson you will learn how to visually explore and present data in Python by using the Bokeh and Pandas libraries.
analyzing python data-manipulation mapping data-visualization 2018-07-27 2
- Fred Gibbs
Installing Python Modules with pip
There are many ways to install external Python libraries; this tutorial explains one of the most common methods, using pip.
acquiring get-ready python 2013-05-06 1
- William J. Turkel and Adam Crymble
Code Reuse and Modularity in Python
Computer programs can become long, unwieldy and confusing without special mechanisms for managing complexity. This lesson will show you how to reuse parts of your code by writing functions and break your programs into modules, in order to keep everything concise and easier to debug.
transforming python 2012-07-17 2
- William J. Turkel and Adam Crymble
Counting Word Frequencies with Python
Counting the frequency of specific words in a list can provide illustrative data. This lesson will teach you Python’s easy way to count such frequencies.
analyzing python 2012-07-17 2
- William J. Turkel and Adam Crymble
Creating and Viewing HTML Files with Python
Here you will learn how to create HTML files with Python scripts, and how to use Python to automatically open an HTML file in Firefox.
presenting python website 2012-07-17 2
- William J. Turkel and Adam Crymble
From HTML to List of Words (part 1)
In this two-part lesson, we will build on what you’ve learned about Downloading Web Pages with Python, learning how to remove the HTML markup from the webpage of Benjamin Bowsey’s 1780 criminal trial transcript. We will achieve this by using a variety of string operators, string methods, and close reading skills. We introduce looping and branching so that programs can repeat tasks and test for certain conditions, making it possible to separate the content from the HTML tags. Finally, we convert content from a long string to a list of words that can later be sorted, indexed, and counted.
transforming python 2012-07-17 2
- William J. Turkel and Adam Crymble
From HTML to List of Words (part 2)
In this lesson, you will learn the Python commands needed to implement the second part of the algorithm begun in the lesson ‘From HTML to a List of Words (part 1)’.
transforming python 2012-07-17 2
- William J. Turkel and Adam Crymble
Python Introduction and Installation
This first lesson in our section on dealing with Online Sources is designed to get you and your computer set up to start programming. We will focus on installing the relevant software – all free and reputable – and finally we will help you to get your toes wet with some simple programming that provides immediate results.
transforming python get-ready 2012-07-17 1
- William J. Turkel and Adam Crymble
Keywords in Context (Using n-grams) with Python
This lesson takes the frequency pairs collected in “Counting Frequencies” and outputs them in HTML.
presenting python 2012-07-17 2
- William J. Turkel and Adam Crymble
Setting up an Integrated Development Environment for Python (Linux)
This lesson will help you set up an integrated development environment for Python on a computer running the Linux operating system.
transforming get-ready python 2012-07-17 1
- William J. Turkel and Adam Crymble
Setting Up an Integrated Development Environment for Python (Mac)
This lesson will help you set up an integrated development environment for Python on a computer running a Mac operating system.
transforming get-ready python 2012-07-17 1
- William J. Turkel and Adam Crymble
Manipulating Strings in Python
This lesson is a brief introduction to string manipulation techniques in Python.
transforming python 2012-07-17 2
- William J. Turkel and Adam Crymble
Normalizing Textual Data with Python
In this lesson, we will make the list we created in the ‘From HTML to a List of Words’ lesson easier to analyze by normalizing this data.
transforming python 2012-07-17 2
- William J. Turkel and Adam Crymble
Output Data as an HTML File with Python
This lesson takes the frequency pairs created in the ‘Counting Frequencies’ lesson and outputs them to an HTML file.
transforming python website 2012-07-17 2
- William J. Turkel and Adam Crymble
Output Keywords in Context in an HTML File with Python
This lesson builds on ‘Keywords in Context (Using N-grams)’, where n-grams were extracted from a text. Here, you will learn how to output all of the n-grams of a given keyword in a document downloaded from the Internet, and display them clearly in your browser window.
presenting python 2012-07-17 2
- William J. Turkel and Adam Crymble
Understanding Web Pages and HTML
This lesson introduces you to HTML and the web pages it structures.
presenting python 2012-07-17 2
- William J. Turkel and Adam Crymble
Setting Up an Integrated Development Environment for Python (Windows)
This lesson will help you set up an integrated development environment for Python on a computer running the Windows operating system.
transforming get-ready python 2012-07-17 1
- William J. Turkel and Adam Crymble
Working with Text Files in Python
In this lesson you will learn how to manipulate text files using Python.
transforming python 2012-07-17 2
- William J. Turkel and Adam Crymble
Downloading Web Pages with Python
This lesson introduces Uniform Resource Locators (URLs) and explains how to use Python to download and save the contents of a web page to your local hard drive.
acquiring python 2012-07-17 2