Markov Text Generation
“Writing” IS Abstracts
As part of our adventures in kicking the tires on Python (does that metaphor make sense to students in 2018?), we’re going to do some iterative code manipulation to “write” Independent Study Abstracts for five majors: Soc/Anthro, History, Communications, Econ, and English. In our dataset we have 453 SOAN abstracts, 214 HIST, 195 COMM, 157 ECON, and 130 ENGL. Our script will read these examples and then try to write like a major in that subject.
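To make the idea concrete, here is a minimal sketch of how a word-level Markov text generator can work. It is not the contents of `markovmaker.py`: the function names `build_chain` and `generate`, the `order` parameter, and the stand-in sample text are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` words to the list of words that follow it in `text`."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, length=30, seed=None):
    """Walk the chain from a random starting state and return at most `length` words."""
    if not chain:  # nothing to imitate (the text was shorter than `order` + 1 words)
        return ""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:  # dead end: this word pair only appeared at the very end
            break
        next_word = rng.choice(followers)
        output.append(next_word)
        state = state[1:] + (next_word,)  # slide the window forward by one word
    return " ".join(output[:length])


if __name__ == "__main__":
    # Stand-in text (hypothetical); a real run would read abstracts from a file.
    sample = (
        "this study examines the role of memory in the novel and "
        "this study examines the role of trade in the economy"
    )
    chain = build_chain(sample, order=2)
    print(generate(chain, length=15, seed=1))
```

In a sketch like this, a higher `order` keeps the output closer to the source sentences, while a lower `order` produces stranger combinations.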
- `markovmaker.py` (which is located in `/[your_username]/markoving/` in your PythonAnywhere account)
- Abstracts. Lots of abstracts.
- In your pythonanywhere account, navigate to
- Open `markovmaker.py` and change the variables so that the script (a) reads the abstracts from your chosen subject and (b) creates new files with those abstracts as output.
- Save your work.
- With the `markovmaker.py` script open in the PythonAnywhere “Files” interface (or simply in the `/markoving/` directory), locate and click “Open Bash Console.” You’ll see either an option to “Open Bash Console Here” near the top of the screen if you’re in the directory, or “$ Open Bash Console” at the bottom if you’re reading the script.
- Type `ls` and press ‘enter’ to see where you are. You should get a list of the files in the
- Let’s run the script. In your BASH console type `python markovmaker.py` and press enter. For now, only do this once so we can pause and check our progress. “Abstracts” of varying lengths should show up in your console.
- Let’s check the output file. In your PythonAnywhere file structure, navigate back to `/markoving/`. Open your output file, which should be named something like `engl_output.txt` (or whatever you named it). Remember that this file will get overwritten with every iteration, so be sure to change the name of the output file in `markovmaker.py` if you don’t want it to disappear.
- Now let’s find some examples to fool our classmates. Repeat steps 6 and 7 (running the script and checking the output file), pausing to read your output. You want to locate sentences (or near-sentences) that seem to you to be exceptionally close, stylistically speaking, to sentences from actual abstracts. Make note of these – copy and paste them somewhere or make a note of which “outfile” they’re in or something – as you’ll need them for your homework submission.
- For your homework submission you’ll create two well-formed markdown files.
  - Name one of them `scrambled_[SUBJ]_abstracts.md`: this will be an intermingled list of sentences, five selections from your “markov-ing” experiments and five sentences that you copy and paste word-for-word from actual I.S. abstracts from the same discipline.
  - Name the other `key.md` and, in this file, sort the same sentences into two lists, “real” and “markov’ed”.
- Upload or copy your files to a repository linked in this GitHub Classroom Assignment.
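
Because the output file is overwritten on every run, one way to keep earlier runs around is to have the script choose a fresh filename each time instead of renaming by hand. The sketch below is only an illustration and is not part of `markovmaker.py`: `next_output_path` and `save_run` are hypothetical helpers, and the `<subject>_output_<n>.txt` naming pattern is an assumption modeled on the `engl_output.txt` example above.

```python
from pathlib import Path


def next_output_path(subject, directory="."):
    """Return the first '<subject>_output_<n>.txt' path that does not exist yet."""
    folder = Path(directory)
    n = 1
    while (folder / f"{subject}_output_{n}.txt").exists():
        n += 1
    return folder / f"{subject}_output_{n}.txt"


def save_run(text, subject, directory="."):
    """Write one run's generated text to a fresh file and return its path."""
    path = next_output_path(subject, directory)
    path.write_text(text, encoding="utf-8")
    return path


if __name__ == "__main__":
    # Hypothetical usage: each call lands in a new numbered file.
    print(save_run("first batch of generated abstracts", "engl"))
```

With something like this in place, each run would leave its own numbered file, which makes it easier to note which “outfile” a promising sentence came from.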
An Introduction to Digital Humanities by Jacob Heil is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.