- Your First Lesson · Welcome to your first lesson. All of our lessons branch off from this core set of instructions, which are designed to get you ready to start programming. In this lesson you will learn where to turn for help if you get stuck.
- Getting Started with Online Sources · In this lesson we will help you get your feet wet with some simple programming that provides immediate results.
- Working with Files and Web Pages · Lesson Goals: In this lesson you will learn how to manipulate text files using Python. This includes opening, closing, reading from, and writing to .txt files. You will also be introduced to some central programming concepts such as writing functions to reuse code, and the idea of modularization. The second part of the chapter introduces [...]
- From HTML to a List of Words · Lesson Goals: In this lesson, we will build on what you’ve learned about Working with Files and Web Pages, learning how to remove the HTML markup from the downloaded page using a variety of string operators, string methods and close reading skills. We introduce looping and branching so that programs can repeat tasks and test for [...]
- Computing Frequencies · Lesson Goals: The list that we created in the previous lesson needs some normalizing before it can be used further. We are going to do this by applying additional string methods, as well as by using regular expressions. Once normalized, we will convert our list to a dictionary and create frequency pairs. A series of [...]
- Wrapping Output in HTML · Lesson Goals: This chapter takes the frequency pairs created in Computing Frequencies and outputs them to an HTML file. If you write programs that output HTML, you can use any browser to look at your results. This is especially convenient if your program is automatically creating hyperlinks or graphic entities like charts and diagrams. Here [...]
- Keywords in Context (KWIC) · Lesson Goals: As in Wrapping Output in HTML, this lesson takes the frequency pairs collected in Computing Frequencies and outputs them in HTML. This time the focus is on keywords in context (KWIC), which creates n-grams from the original document content – in this case a trial transcript from the Old Bailey Online. You can [...]
- Downloading Multiple Records Using Query Strings · In this lesson, we will write a program that will download a series of records from the Old Bailey Online using custom search criteria, and save them to a directory on our computer.
- Automated Downloading with Wget · Wget is a useful program, run through your computer's command line, for retrieving online material.
- Getting Started with Topic Modeling and MALLET · In this lesson you will first learn what topic modeling is and how to employ it in your research.
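
To give a rough sense of where the middle lessons lead, here is a minimal sketch of going from HTML to a list of words and then to word frequencies. It is not the code from the lessons themselves: the function names (`strip_tags`, `words_from_text`) and the `sample_html` string are hypothetical, chosen only for illustration, and the tag stripping is a simple regular expression that will not cope with every real web page.

```python
import re
from collections import Counter

def strip_tags(html):
    """Replace anything between '<' and '>' with a space (a crude way to drop markup)."""
    return re.sub(r"<[^>]+>", " ", html)

def words_from_text(text):
    """Normalize to lowercase and return the runs of word characters as a list."""
    return re.findall(r"\w+", text.lower())

# Hypothetical stand-in for a downloaded page; not a real Old Bailey record.
sample_html = "<html><body><p>The trial of the prisoner. The verdict: guilty.</p></body></html>"

words = words_from_text(strip_tags(sample_html))

# Counter is a dictionary subclass that maps each word to how often it occurs.
freqs = Counter(words)

for word, count in freqs.most_common(3):
    print(word, count)
```

Lowercasing before counting is what lets "The" and "the" land in the same frequency pair; the lessons build up the same idea step by step with string methods and regular expressions.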