Topic Modeling with jsLDA
Machines “reading” IS Abstracts
So, we’ve been working with Independent Study (I.S.) abstracts from five majors: Soc/Anthro, History, Communications, Econ, and English. After prepping the dataset for this project, it now has 451 SOAN abstracts, 210 HIST, 194 COMM, 157 ECON, and 122 ENGL. It also includes one file to contain them all (all-abstracts), which has all 1134 abstracts.
We will run some topic models on these abstract datasets, make note of our observations, and articulate the affordances of topic models in exploring this dataset.
1. From our Moodle site, select and download one of the single-discipline I.S. abstract datasets (so, not all-abstracts, which we’ll get to below). To download, you may have to right-click and choose “save link as”; be sure to save it somewhere you can find it easily in step 3!
2. From the jsLDA site, choose “Run a Model”.
3. In the menu header at the top of the jsLDA page, select “choose file” next to “Document Upload” and select the file you’ve just downloaded.
4. On the far right of that top header, select “Load” to run your model. Notice how the topics (in the left margin) are arranged immediately after uploading.
5. On the far left of the top header, select “Run 50 Iterations”. Notice how the topics shift around.
6. Explore the “Topic Correlations” matrix and the “Time Series” sparklines, which show trends over time.
7. Navigate into the “Time Series” area and select “Run 50 Iterations” again. Notice how the sparklines shift as you iterate.
8. Repeat steps 5-7 until the topics shift around less (probably after 300-500 iterations).
- What do you notice about the identified topic groupings as you iterate?
- Identify an interesting correlation in your discipline’s abstracts using the “Topic Correlations” matrix.
- Do you notice any interesting trends over time in your discipline? Given what you know of history, does this make sense to you?
- Now download that all-abstracts file mentioned in step 1.
- Repeat steps 3-8 using this file and answer the questions below.
- What groupings do you notice in the disciplines?
- Can you identify correlations or changes over time that might prompt further exploration?
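If you are curious about why the topics shift around as you iterate and then settle down, the toy sketch below may help. It is a minimal version of collapsed Gibbs sampling, one common way of fitting an LDA topic model, and it is not jsLDA's actual code. The four tiny "abstracts", the function names `fit_lda` and `top_words`, and the `alpha` and `beta` settings are all made up for illustration.

```python
import random

def fit_lda(docs, num_topics, iterations, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA; docs is a list of word lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)

    # Count tables that are updated as words move between topics.
    doc_topic = [[0] * num_topics for _ in docs]           # words per topic, per document
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(num_topics)]
    topic_total = [0] * num_topics                         # words per topic overall

    # Start by giving every word in every document a random topic.
    assignments = []
    for d, doc in enumerate(docs):
        topics = [rng.randrange(num_topics) for _ in doc]
        assignments.append(topics)
        for w, t in zip(doc, topics):
            doc_topic[d][t] += 1
            topic_word[t][w] += 1
            topic_total[t] += 1

    # One "iteration" revisits every word and re-draws its topic.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                old = assignments[d][i]
                # Take the word out of its current topic.
                doc_topic[d][old] -= 1
                topic_word[old][w] -= 1
                topic_total[old] -= 1
                # A topic is more likely if this document already uses it
                # and if it already contains this word.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(num_topics)
                ]
                new = rng.choices(range(num_topics), weights=weights)[0]
                # Put the word back under its newly drawn topic.
                assignments[d][i] = new
                doc_topic[d][new] += 1
                topic_word[new][w] += 1
                topic_total[new] += 1
    return topic_word, doc_topic

def top_words(topic_word, n=3):
    """Return the n most frequent words in each topic."""
    return [sorted(counts, key=counts.get, reverse=True)[:n] for counts in topic_word]

# Hypothetical mini-corpus standing in for the abstracts.
docs = [
    "labor market wage labor policy".split(),
    "wage policy market trade".split(),
    "novel poetry narrative novel".split(),
    "narrative poetry author novel".split(),
]

topic_word, doc_topic = fit_lda(docs, num_topics=2, iterations=50)
print(top_words(topic_word))
```

Try changing `iterations` from 50 to 1 or 5 and compare the printed words. With only a few passes the groupings are still close to the random starting point, which is a small-scale version of the shifting you see in the left margin of jsLDA.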
Course Archive: 2017
An Introduction to Digital Humanities by Jacob Heil is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.