intro-to-dh

Tuesday + Thursday, 9:30-10:45, 243 Kauke // Dr. Jacob Heil, 158D Andrews Library



Markov Text Generation

“Writing” I.S. Abstracts

The Gist

As part of our adventures in kicking the tires on Python (does that metaphor make sense to students in 2018?), we’re going to do some iterative code manipulation to “write” Independent Study (I.S.) abstracts for five majors: Soc/Anthro, History, Communications, Econ, and English. Our dataset has 453 SOAN abstracts, 214 HIST, 195 COMM, 157 ECON, and 130 ENGL. Our script will read the examples for one subject and then try to write like a major in that subject.
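To make the idea concrete, here is a minimal sketch of word-level Markov text generation. It is not the course's `markovmaker.py`: the function names `build_chain` and `generate`, the `order` parameter, and the toy corpus are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, max_words=40, seed=None):
    """Walk the chain from a random starting state, one word at a time."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))  # sorted so a given seed is repeatable
    output = list(state)
    while len(output) < max_words:
        followers = chain.get(state)
        if not followers:  # dead end: this state only occurs at the end of the text
            break
        next_word = rng.choice(followers)
        output.append(next_word)
        state = state[1:] + (next_word,)  # slide the window forward by one word
    return " ".join(output)


# Toy corpus invented for illustration; not taken from the real abstracts.
sample = (
    "this study examines the role of memory in the novel . "
    "this study examines the effect of trade on the economy . "
    "this paper examines the role of trade in the novel ."
)
chain = build_chain(sample, order=2)
print(generate(chain, max_words=20, seed=1))
```

Every three-word run in the output also appears somewhere in the source text, which is why the results can sound plausible without meaning much. In a sketch like this, a larger `order` keeps the output closer to the originals, while a smaller one scrambles it more.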

Ingredients

Process

  1. In your PythonAnywhere account, navigate to /[your_username]/markoving/.
  2. Open `markovmaker.py` and edit the variables so that the script (a) reads the abstracts from your chosen subject and (b) writes its new “abstracts” to an output file.
  3. Save your work.
  4. With `markovmaker.py` open in the PythonAnywhere “Files” interface (or from the /markoving/ directory listing), open a Bash console. If you’re viewing the directory, look for “Open Bash Console Here” near the top of the screen; if you’re reading the script, look for “$ Open Bash Console” at the bottom.
  5. Type `ls` and press ‘enter’ to see where you are. You should get a list of the files in the /markoving/ directory.
  6. Let’s run the script. In your Bash console type `python markovmaker.py` and press enter. For now, only do this once so we can pause and check our progress. “Abstracts” of varying lengths should show up in your console.
  7. Let’s check the output file. In your PythonAnywhere file structure, navigate back to /markoving/. Open your output file, which should be named something like `engl_output.txt` (or whatever you named it). Remember that this file gets overwritten every time you run the script, so be sure to change the name of the outfile in `markovmaker.py` if you don’t want it to disappear.
  8. Now let’s find some examples to fool our classmates. Repeat steps 6 and 7, pausing to read your output. You want to locate sentences (or near-sentences) that seem to you to be exceptionally close, stylistically speaking, to sentences from actual abstracts. Make note of these – copy and paste them somewhere or make a note of which “outfile” they’re in or something – as you’ll need them for your homework submission.
  9. For your homework submission you’ll create two well-formed markdown files.
    • Name one of them scrambled_[SUBJ]_abstracts.md: this will be an intermingled list of sentences, five selections from your “markov-ing” experiments and five sentences that you copy and paste word-for-word from actual I.S. abstracts from the same discipline.
    • Name the other key.md and, in this file, sort the same sentences into two lists, “real” and “markov’ed”.
  10. Upload or copy your files to a repository linked in this GitHub Classroom Assignment.
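If you would rather script the two homework files from step 9 than assemble them by hand, one possible sketch follows. The helper `build_homework_files`, the heading text, and the placeholder sentences are hypothetical; swap in your own ten sentences and subject code.

```python
import random


def build_homework_files(real, markoved, subj, seed=None):
    """Return (scrambled_markdown, key_markdown) as strings."""
    rng = random.Random(seed)
    mixed = list(real) + list(markoved)
    rng.shuffle(mixed)  # intermingle real and generated sentences

    scrambled_lines = [f"# Scrambled {subj} abstracts", ""]
    scrambled_lines += [f"{n}. {s}" for n, s in enumerate(mixed, start=1)]

    key_lines = ["# Key", "", "## Real", ""]
    key_lines += [f"- {s}" for s in real]
    key_lines += ["", "## Markov'ed", ""]
    key_lines += [f"- {s}" for s in markoved]

    return "\n".join(scrambled_lines) + "\n", "\n".join(key_lines) + "\n"


# Placeholder sentences; replace with your five real and five generated picks.
real = [f"Real sentence {i}." for i in range(1, 6)]
markoved = [f"Generated sentence {i}." for i in range(1, 6)]

scrambled_md, key_md = build_homework_files(real, markoved, "ENGL", seed=0)

# File names follow the pattern given in step 9.
with open("scrambled_ENGL_abstracts.md", "w", encoding="utf-8") as f:
    f.write(scrambled_md)
with open("key.md", "w", encoding="utf-8") as f:
    f.write(key_md)
```

Shuffling in code keeps you from unconsciously grouping the real sentences together, which would make the guessing game too easy.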


An Introduction to Digital Humanities by Jacob Heil is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.