Lesson Index

Our lessons are organized by typical phases of the research process, as well as general topics. Use the buttons to filter lessons by category. If you can’t find a skill, technology, or tool you’re looking for, please let us know!


  • Kellen Kurschinski

    Applied Archival Downloading with Wget

    Now that you have learned how Wget can be used to mirror or download specific files from websites via the command line, it's time to expand your web-scraping skills through a few more lessons that focus on other uses for Wget's recursive retrieval function.

    acquiring web-scraping 2013-09-13 2
  • Ian Milligan

    Automated Downloading with Wget

    Wget is a useful program, run through your computer's command line, for retrieving online material.

    acquiring web-scraping 2012-06-27 1
  • Taylor Arnold and Lauren Tilton

    Basic Text Processing in R

    Learn how to use R to analyze high-level patterns in texts, apply stylometric methods over time and across authors, and use summary methods to describe items in a corpus.

    analyzing distant-reading 2017-03-27 2
  • Amanda Visconti

    Building a static website with Jekyll and GitHub Pages

    This lesson will help you create an entirely free, easy-to-maintain, preservation-friendly, secure website over which you have full control, such as a scholarly blog, project website, or online portfolio.

    presenting website data-management 2016-04-18 1
  • Seth van Hooland, Ruben Verborgh, and Max De Wilde

    Cleaning Data with OpenRefine

    This tutorial focuses on how scholars can diagnose and act upon the accuracy of data.

    transforming data-manipulation 2013-08-05 2
  • Laura Turner O'Hara

    Cleaning OCR’d text with Regular Expressions

    Optical Character Recognition (OCR)—the conversion of scanned images to machine-encoded text—has proven a godsend for historical research. This lesson will help you clean up OCR'd text to make it more usable.

    transforming data-manipulation 2013-05-22 2
  • William J. Turkel and Adam Crymble

    Code Reuse and Modularity in Python

    Computer programs can become long, unwieldy and confusing without special mechanisms for managing complexity. This lesson will show you how to reuse parts of your code by writing functions and break your programs into modules, in order to keep everything concise and easier to debug.

    transforming python 2012-07-17 2
  • Heather Froehlich

    Corpus Analysis with Antconc

    Corpus analysis is a form of text analysis which allows you to make comparisons between textual objects at a large scale (so-called 'distant reading').

    analyzing distant-reading 2015-06-19 1
  • Ryan Deschamps

    Correspondence Analysis for Historical Research with R

    This tutorial explains how to carry out and interpret a correspondence analysis, which can be used to identify relationships within categorical data.

    analyzing data-manipulation network-analysis 2017-09-13 3
  • William J. Turkel and Adam Crymble

    Counting Word Frequencies with Python

    Counting the frequency of specific words in a list can provide illustrative data. This lesson will teach you Python's easy way to count such frequencies.

    analyzing python 2012-07-17 2
  • Miriam Posner and Megan R. Brett

    Creating an Omeka Exhibit

    Now that you've added items to your Omeka site and grouped them into collections, you're ready for the next step: taking your users on a guided tour through the items you've collected.

    presenting website 2016-02-24 1
  • William J. Turkel and Adam Crymble

    Creating and Viewing HTML Files with Python

    Here you will learn how to create HTML files with Python scripts, and how to use Python to automatically open an HTML file in Firefox.

    presenting python website 2012-07-17 2
  • Marten Düring

    From Hermeneutics to Data to Networks: Data Extraction and Network Visualization of Historical Sources

    Network visualizations can help humanities scholars reveal hidden and complex patterns and structures in textual sources. This tutorial explains how to extract network data (people, institutions, places, etc.) from historical sources through the use of non-technical methods developed in Qualitative Data Analysis (QDA) and Social Network Analysis (SNA), and how to visualize this data with the platform-independent and particularly easy-to-use Palladio.

    transforming network-analysis 2015-02-18 2
  • Caleb McDaniel

    Data Mining the Internet Archive Collection

    The collections of the Internet Archive include many digitized historical sources. Many contain rich bibliographic data in a format called MARC. In this lesson, you'll learn how to use Python to automate the downloading of large numbers of MARC files from the Internet Archive and the parsing of MARC records for specific information such as authors, places of publication, and dates. The lesson can be applied more generally to other Internet Archive files and to MARC records found elsewhere.

    acquiring web-scraping 2014-03-03 2
  • Nabeel Siddiqui

    Data Wrangling and Management in R

    This tutorial explores how scholars can organize 'tidy' data, understand R packages to manipulate data, and conduct basic data analysis.

    transforming data-manipulation data-management distant-reading 2017-07-31 2
  • Adam Crymble

    Downloading Multiple Records Using Query Strings

    Downloading a single record from a website is easy, but downloading many records at a time – an increasingly frequent need for a historian – is much more efficient using a programming language such as Python. In this lesson, we will write a program that will download a series of records from the Old Bailey Online using custom search criteria, and save them to a directory on our computer.

    acquiring web-scraping 2012-11-11 2
  • Brandon Walsh

    Editing Audio with Audacity

    In this lesson you will learn how to use Audacity to load, record, edit, mix, and export audio files.

    transforming data-manipulation 2016-08-05 1
  • John Ladd, Jessica Otis, Christopher N. Warren, and Scott Weingart

    Exploring and Analyzing Network Data with Python

    This lesson introduces network metrics and how to draw conclusions from them when working with humanities data. You will learn how to use the NetworkX Python package to produce and work with these network statistics.

    analyzing python network-analysis 2017-08-23 2
  • Adam Crymble

    Using Gazetteers to Extract Sets of Keywords from Free-Flowing Texts

    This lesson will teach you how to use Python to extract a set of keywords very quickly and systematically from a set of texts.

    acquiring data-manipulation 2015-12-01 2
  • Evan Peter Williamson

    Fetching and Parsing Data from the Web with OpenRefine

    OpenRefine is a powerful tool for exploring, cleaning, and transforming data. In this lesson you will learn how to use OpenRefine to fetch URLs and parse web content.

    acquiring data-manipulation web-scraping api 2017-08-12 2
  • William J. Turkel and Adam Crymble

    From HTML to List of Words (part 1)

    In this two-part lesson, we will build on what you’ve learned about Downloading Web Pages with Python, learning how to remove the HTML markup from the webpage of Benjamin Bowsey’s 1780 criminal trial transcript. We will achieve this by using a variety of string operators, string methods, and close reading skills. We introduce looping and branching so that programs can repeat tasks and test for certain conditions, making it possible to separate the content from the HTML tags. Finally, we convert content from a long string to a list of words that can later be sorted, indexed, and counted.

    transforming python 2012-07-17 2
  • William J. Turkel and Adam Crymble

    From HTML to List of Words (part 2)

    In this lesson, you will learn the Python commands needed to implement the second part of the algorithm begun in the lesson 'From HTML to a List of Words (part 1)'.

    transforming python 2012-07-17 2
  • Jon Crump

    Generating an Ordered Data Set from an OCR Text File

    This tutorial illustrates strategies for taking raw OCR output from a scanned text, parsing it to isolate and correct essential elements of metadata, and generating an ordered data set (a Python dictionary) from it.

    transforming data-manipulation 2014-11-25 3
  • Justin Colson

    Geocoding Historical Data using QGIS

    Learn how to use QGIS to convert lists of place names into geographic coordinates, allowing you to map them.

    transforming mapping 2017-01-27 2
  • Jim Clifford, Josh MacFadyen, and Daniel Macfarlane

    Georeferencing in QGIS 2.0

    In this lesson, you will learn how to georeference historical maps so that they may be added to a GIS as a raster layer.

    transforming mapping 2013-12-13 2
  • Daniel van Strien

    An Introduction to Version Control Using GitHub Desktop

    In this lesson you will be introduced to the basics of version control, learn why it is useful, and implement basic version control for a plain text document using Git and GitHub.

    sustaining data-management 2016-06-17 1
  • Sarah Simpkin

    Getting Started with Markdown

    In this lesson, you will be introduced to Markdown, a plain text-based syntax for formatting documents. You will find out why it is used, how to format Markdown files, and how to preview Markdown-formatted documents on the web.

    presenting data-management 2015-11-13 1
  • Jim Clifford, Josh MacFadyen, and Daniel Macfarlane

    Intro to Google Maps and Google Earth

    Google My Maps and Google Earth provide an easy way to start creating digital maps. With a Google Account you can create and edit personal maps by clicking on My Places.

    presenting mapping 2013-12-13 1
  • Matthew Lincoln

    Using SPARQL to access Linked Open Data

    This lesson explains why many cultural institutions are adopting graph databases, and how researchers can access these data through the query language called SPARQL.

    acquiring lod 2015-11-24 2
  • Jonathan Reeve

    Installing Omeka

    This lesson will teach you how to install your own copy of Omeka.

    presenting website 2016-07-24 2
  • Fred Gibbs

    Installing Python Modules with pip

    There are many ways to install external Python libraries; this tutorial explains one of the most common methods using pip.

    acquiring get-ready python 2013-05-06 1
  • Jacob W. Greene

    Introduction to Mobile Augmented Reality Development in Unity

    This lesson serves as an introduction to creating mobile augmented reality applications. Augmented reality (AR) can be defined as the overlaying of digital content (images, video, text, sound, etc.) onto physical objects or locations, and it is typically experienced by looking through the camera lens of an electronic device such as a smartphone, tablet, or optical head-mounted display.

    presenting website mapping 2016-07-24 2
  • Ian Milligan and James Baker

    Introduction to the Bash Command Line

    This lesson will teach you how to enter commands using a command-line interface, rather than through a graphical interface. Command-line interfaces have advantages for computer users who need more precision in their work, such as digital historians. They allow for more detail when running some programs, as you can add modifiers to specify exactly how you want your program to run. Furthermore, they can be easily automated through scripts, which are essentially recipes of text-based commands.

    transforming data-manipulation get-ready 2014-09-20 1
  • Jeri Wieringa

    Intro to Beautiful Soup

    Beautiful Soup is a Python library for getting data out of HTML, XML, and other markup languages.

    transforming web-scraping 2012-12-30 2
  • Jonathan Blaney

    Introduction to the Principles of Linked Open Data

    Introduces core concepts of Linked Open Data, including URIs, ontologies, RDF formats, and a gentle intro to the graph query language SPARQL.

    acquiring lod 2017-05-07 1
  • Ted Dawson

    Introduction to the Windows Command Line with PowerShell

    This tutorial will introduce you to the basics of Windows PowerShell, the standard command-line interface for Windows computers.

    transforming data-manipulation get-ready 2016-07-21 1
  • Shawn Graham

    An Introduction to Twitterbots with Tracery

    This lesson explains how to create simple twitterbots using Tracery and the Cheap Bots Done Quick service. Tracery exists in multiple languages and can be integrated into websites, games, and bots.

    presenting api 2017-08-29 2
  • William J. Turkel and Adam Crymble

    Python Introduction and Installation

    This first lesson in our section on dealing with Online Sources is designed to get you and your computer set up to start programming. We will focus on installing the relevant software – all free and reputable – and finally we will help you to get your toes wet with some simple programming that provides immediate results.

    transforming python get-ready 2012-07-17 1
  • Matthew Lincoln

    Reshaping JSON with jq

    Working with data from an art museum API and from the Twitter API, this lesson teaches how to use the command-line utility jq to filter and parse complex JSON files into flat CSV files.

    transforming data-manipulation 2016-05-24 2
  • William J. Turkel and Adam Crymble

    Keywords in Context (Using n-grams) with Python

    This lesson takes the frequency pairs collected in 'Counting Frequencies' and outputs them in HTML.

    presenting python 2012-07-17 2
  • William J. Turkel and Adam Crymble

    Setting up an Integrated Development Environment for Python (Linux)

    This lesson will help you set up an integrated development environment for Python on a computer running the Linux operating system.

    transforming get-ready python 2012-07-17 1
  • William J. Turkel and Adam Crymble

    Setting Up an Integrated Development Environment for Python (Mac)

    This lesson will help you set up an integrated development environment for Python on a computer running an Apple operating system.

    transforming get-ready python 2012-07-17 1
  • William J. Turkel and Adam Crymble

    Manipulating Strings in Python

    This lesson is a brief introduction to string manipulation techniques in Python.

    transforming python 2012-07-17 2
  • Kim Pham

    Web Mapping with Python and Leaflet

    This tutorial teaches users how to create a web map based on tabular data.

    presenting mapping 2017-08-29 2
  • Vilja Hulden

    Supervised Classification: The Naive Bayesian Returns to the Old Bailey

    This lesson shows how to use machine learning to extract interesting documents out of a digital archive.

    analyzing distant-reading 2014-12-17 3
  • William J. Turkel and Adam Crymble

    Normalizing Textual Data with Python

    In this lesson, we will make the list we created in the 'From HTML to a List of Words' lesson easier to analyze by normalizing this data.

    transforming python 2012-07-17 2
  • William J. Turkel and Adam Crymble

    Output Data as an HTML File with Python

    This lesson takes the frequency pairs created in the 'Counting Frequencies' lesson and outputs them to an HTML file.

    transforming python website 2012-07-17 2
  • William J. Turkel and Adam Crymble

    Output Keywords in Context in an HTML File with Python

    This lesson builds on 'Keywords in Context (Using N-grams)', where n-grams were extracted from a text. Here, you will learn how to output all of the n-grams of a given keyword in a document downloaded from the Internet, and display them clearly in your browser window.

    presenting python 2012-07-17 2
  • James Baker

    Preserving Your Research Data

    This lesson will suggest ways in which historians can document and structure their research data so as to ensure it remains useful in the future.

    sustaining data-management 2014-04-30 1
  • Jim Clifford, Josh MacFadyen, and Daniel Macfarlane

    Installing QGIS 2.0 and Adding Layers

    In this lesson you will install QGIS software, download geospatial files like shapefiles and GeoTIFFs, and create a map out of a number of vector and raster layers.

    presenting mapping 2013-12-13 1
  • Taryn Dewar

    R Basics with Tabular Data

    This lesson teaches a way to quickly analyze large volumes of tabular data, making research faster and more effective.

    transforming data-manipulation 2016-09-05 1
  • James Baker and Ian Milligan

    Counting and mining research data with Unix

    This lesson will look at how research data, when organised in a clear and predictable manner, can be counted and mined using the Unix shell.

    transforming data-manipulation 2014-09-20 2
  • Shawn Graham

    The Sound of Data (a gentle introduction to sonification for historians)

    There are any number of guides that will help you visualize the past, but this lesson will help you hear the past.

    transforming distant-reading 2016-06-07 2
  • Dennis Tenen and Grant Wythoff

    Sustainable Authorship in Plain Text using Pandoc and Markdown

    In this tutorial, you will first learn the basics of Markdown—an easy to read and write markup syntax for plain text—as well as Pandoc, a command line tool that converts plain text into a number of beautifully formatted file types: PDF, .docx, HTML, LaTeX, slide decks, and more.

    sustaining website data-management 2014-03-19 2
  • Peter Organisciak and Boris Capitanu

    Text Mining in Python through the HTRC Feature Reader

    Explains how to use Python to summarize and visualize data on millions of texts from the HathiTrust Research Center's Extracted Features dataset.

    analyzing distant-reading 2016-11-22 3
  • Shawn Graham, Scott Weingart, and Ian Milligan

    Getting Started with Topic Modeling and MALLET

    In this lesson you will first learn what topic modeling is and why you might want to employ it in your research. You will then learn how to install and work with the MALLET natural language processing toolkit to do so.

    analyzing distant-reading 2012-09-02 2
  • M. H. Beals

    Transforming Data for Reuse and Re-publication with XML and XSL

    This tutorial will provide you with the ability to convert or transform historical data from an XML database (whether a single file or several linked documents) into a variety of different presentations—condensed tables, exhaustive lists or paragraphed narratives—and file formats.

    transforming data-manipulation 2016-07-07 1
  • Seth Bernstein

    Transliterating non-ASCII characters with Python

    This lesson shows how to use Python to automatically transliterate a list of words from a language with a non-Latin alphabet to a standardized format using American Standard Code for Information Interchange (ASCII) characters.

    transforming data-manipulation 2013-10-04 2
  • Doug Knox

    Understanding Regular Expressions

    In this lesson, we will use advanced find-and-replace capabilities in a word processing application in order to make use of structure in a brief historical document that is essentially a table in the form of prose.

    transforming data-manipulation 2013-06-22 2
  • Miriam Posner

    Up and Running with Omeka.net

    Omeka.net makes it easy to create websites that show off collections of items.

    presenting website 2016-02-17 1
  • Stephanie J. Richmond and Tommy Tavenner

    Using JavaScript to Create Maps of Correspondence

    Demonstrates how to use the JavaScript library "Leaflet" to produce an interactive map that can be hosted online or viewed locally, and demonstrates how to customize many of its features.

    presenting mapping 2017-04-24 2
  • Jim Clifford, Josh MacFadyen, and Daniel Macfarlane

    Creating New Vector Layers in QGIS 2.0

    In this lesson you will learn how to create vector layers based on scanned historical maps.

    presenting mapping 2013-12-13 2
  • William J. Turkel and Adam Crymble

    Understanding Web Pages and HTML

    This lesson introduces you to HTML and the web pages it structures.

    presenting python 2012-07-17 2
  • William J. Turkel and Adam Crymble

    Setting Up an Integrated Development Environment for Python (Windows)

    This lesson will help you set up an integrated development environment for Python on a computer running the Windows operating system.

    transforming get-ready python 2012-07-17 1
  • William J. Turkel and Adam Crymble

    Working with Text Files in Python

    In this lesson you will learn how to manipulate text files using Python.

    transforming python 2012-07-17 2
  • William J. Turkel and Adam Crymble

    Downloading Web Pages with Python

    This lesson introduces Uniform Resource Locators (URLs) and explains how to use Python to download and save the contents of a web page to your local hard drive.

    acquiring python 2012-07-17 2