Digital humanities


Maintained by: David J. Birnbaum (djbpitt@gmail.com) [Creative Commons BY-NC-SA 3.0 Unported License] Last modified: 2018-01-17T20:22:50+0000


Course description: Spring 2018 (2184)

In brief

Overview

This course is entitled Computational methods in the humanities. The instructors are Professor David J. Birnbaum, Angela Klinger, Gabrielle (Gabi) Keane, and Zachary (Zac) Enick. Professor Birnbaum is the Co-chair of the Department of Slavic Languages and Literatures, and you can find some of his digital humanities projects at http://www.obdurodon.org. Ms. Klinger is majoring in Anthropology, and her project in this course was Networks in Chinese and Japanese poetry. Ms. Keane is majoring in English Literature, and her project in this course was Victorian ghost stories: a linguistic analysis of Victorian ghost stories written by female authors. Mr. Enick is majoring in French and Italian, and his project in this course was Immigration station: a critical discourse analysis of immigration in the 2016 presidential election.

In addition to CourseWeb (Blackboard), this course has its own web site, which is located at http://dh.obdurodon.org.

The course carries three credits and satisfies the Arts & Sciences skills requirement for Quantitative and Formal Reasoning. It is one of the very few courses offered at the University of Pittsburgh that are designed specifically to address the knowledge and skills involved in quantitative and formal reasoning within the context of the interests and needs of students in the humanities. The course meets three days a week for fifty minutes and involves a combination of lecture, discussion, and practical programming exercises. There are no prerequisites; in particular, students are not expected to have any prior computer programming experience and they are not required to know any foreign languages. On the other hand, as is the norm for courses with 1000-level numbers, students should have some experience with college-level study, especially in the humanities; this will assist them in identifying interesting humanities research questions, which they will then explore with the computational skills they will acquire in the course.

Students may enroll under any of the cross-listed rubrics and both undergraduate and graduate students are welcome. Whether the course satisfies requirements for a departmental major is up to the individual departments, and interested students should inquire about this with their major advisors. For undergraduate students, the course carries a University Honors College (UHC) designation. For general information about UHC courses, see http://www.honorscollege.pitt.edu/academics/courses-0. For information about enrolling in UHC courses, see http://www.honorscollege.pitt.edu/academics/courses/eligibility-enroll-uhc-courses.

An Honors course provides a research-oriented, graduate-level educational opportunity, and students who complete this course have frequently commented that it was simultaneously one of the most demanding and one of the most rewarding experiences of their academic careers. That experience requires a commitment, appropriate for an Honors course, to attend all course meetings and complete all homework conscientiously and in a timely fashion. Given that commitment on your part, the instructors will be eager to work with you to ensure that you finish the course having acquired skills that will enable you to use computational methods to conduct professional-level primary research in the humanities.

Course objectives

Humanities students often do not realize (or even imagine) that 1) they are capable of learning to write useful and practical computer programs within the course of a semester even if they have no prior background in programming; 2) the ability to write one’s own programs can be valuable for scholars in the humanities, especially because commercial software often does not address research needs in the humanities; and 3) practical computer programming, no less than reading, writing, and arithmetic, is a useful skill that is within the reach of any educated person regardless of academic specialization.

This course will introduce students to the role that computational methods can play in primary research and scholarship in the humanities, using as a framework eXtensible Markup Language (XML) and related technologies. XML has excellent properties for textual modeling, which makes it singularly useful for humanities computing, and it is not an accident that many digital humanities projects today are built around XML and related technologies. The related technologies addressed in the course include a powerful declarative programming language (XSLT), a query language for XML databases (XQuery), a formal model for the navigation of XML documents (XPath) used by XSLT and XQuery, several metalanguages for the formal modeling of documents (W3C XML Schema, RELAX NG, DTD, TEI), a constraint modeling language (ISO Schematron), a graphic description language (SVG), and others.
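
As a rough illustration of what it means to model a text in XML and navigate it with XPath-style expressions, here is a minimal sketch. It is written in Python only to keep it self-contained; the course itself works with XSLT, XQuery, and full XPath in the <oXygen/> editor. The sample document and its element names (poem, stanza, line) are invented for this example, and Python's standard library supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up document; the element and attribute names are hypothetical
# and are not taken from any course assignment.
SAMPLE = """
<poem title="Example">
  <stanza n="1">
    <line>First line of text</line>
    <line>Second line of text</line>
  </stanza>
  <stanza n="2">
    <line>Third line of text</line>
  </stanza>
</poem>
""".strip()

root = ET.fromstring(SAMPLE)

# ".//line" selects every <line> element anywhere below the root.
all_lines = root.findall(".//line")
line_count = len(all_lines)

# A predicate in square brackets filters on an attribute value:
# only the <line> children of the <stanza> whose n attribute is "2".
second_stanza_lines = root.findall("./stanza[@n='2']/line")
second_stanza_text = [line.text for line in second_stanza_lines]

print(line_count)
print(second_stanza_text)
```

The point of the sketch is that once the structure of a text has been made explicit with markup, questions such as "how many lines are there?" or "what is in the second stanza?" become short, precise queries rather than manual counting.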

While the focus of the course will be on computational methods, modeling, and programming for humanities research, it may be noted that XML is also the internal format for many general applications in broad use today (including the entire Microsoft Office and LibreOffice suites), it has been universally embraced by the relational database community as an application-independent interchange format, and it plays a substantial role on the World Wide Web, where HTML (the language underlying most web pages) may be expressed as an implementation of XML. The course does not concentrate specifically on the development of web pages, but because of the inherent relationship between XML and HTML and because so much digital humanities research is designed to exploit the advantages of Internet publication, students will also acquire a drive-by knowledge of how the technologies underlying the World Wide Web operate and will develop their own web sites to support their research projects for the course (about which see below).

Upon successful completion of this course students will be able to 1) identify opportunities for the application of computer technology to authentic research problems in the humanities; 2) analyze the structure of texts in the humanities and develop formal representations of those structures; and 3) write original computer programs to conduct research on those texts.

Organization of course content

This course will involve lectures, discussions, reading, and small problem sets and related assignments. Additionally, early in the semester students will identify, through consultation with the instructors, collaborative humanities-based problems that interest them personally and that are amenable to computational processing, and over the course of the semester they will develop and implement their own systems to engage with the research questions they have identified. Both to provide a general survey of the range of activities subsumed under the rubric of digital humanities and to help identify suitable projects, the course will also introduce students to computer-related issues in the humanities, to digital humanities as a discipline, and to the relationship between digital humanities and traditional fields in the humanities, on the one hand, and computer and information science, on the other. Grading will be based on a combination of small assignments, tests, and the large, semester-long research project.

Course requirements

As is appropriate for an Honors course, this course is run similarly to a graduate course: there is a lot of work and students are expected to attend all sessions and complete all assignments on time. Students who regularly keep on top of the workload typically earn high grades and are very satisfied with how much they have been able to learn in the course of the semester. Students who frequently miss class or assignment deadlines will not be able to complete the course successfully. Requirements, then, include regular attendance, timely completion of assignments, regular participation in online course and project discussion, and successful completion of tests. Assignments include readings, small coding problems, response papers, and a large collaborative research project that will be developed over the course of the semester.

Required texts and other materials

The principal (and required) textbook is Michael Kay, XSLT 2.0 and XPath 2.0 programmer’s reference, 4th edition, Indianapolis: Wiley/Wrox, 2008, ISBN-10: 0470192747. ISBN-13: 978-0470192740. Earlier editions are not acceptable. This book is available in the Book Center, as well as from Amazon (a Kindle edition is also available; it’s easy to carry around, but difficult to use because of poor indexing) and other on-line vendors. A digital copy (PDF format) is accessible through the Pitt library system. Other required materials, a list of which will be distributed by the instructor for each topic, are available at no cost on the Internet. All of these materials, including the Kay book, are intended primarily for reference, which is to say that students should anticipate using them frequently and intensively to research solutions to particular problems, but they are not expected to read them cover to cover.

This course will use the <oXygen/> XML Editor and IDE (integrated development environment), which is available in all CSSD computing labs through a site license purchased by the University. The site license also permits students enrolled in this course to install a copy of the software on their personal computers (Windows, Mac, and Linux) for use at home (for course-related purposes only). The software can be downloaded from http://www.oxygenxml.com and the license code to register your home copy can be obtained from the instructor.

CourseWeb and more

Most course materials will be distributed not on CourseWeb, but on the instructor’s server: http://dh.obdurodon.org (Obdurodon). This includes the course description (this document) and syllabus, readings, assignments, announcements, etc.

CourseWeb (also known as Blackboard, http://courseweb.pitt.edu) will be used for two purposes:

  1. Students will upload written assignments to CourseWeb.
  2. Grades will be posted to CourseWeb as they become available, so that students can keep track of their academic progress in the course.

We use Obdurodon for most purposes because it is integrated with course-specific resources not available on CourseWeb. We use CourseWeb for uploading assignments and for grades because it provides appropriate controls for security and privacy. The instructors have installed links between Obdurodon and CourseWeb to facilitate integration and navigation.

Projects

Early in the semester groups of students will identify, through consultation with the instructors, a public domain text in the humanities that interests them and a set of related research questions, and will work with that text throughout the semester, performing document analysis, developing and implementing a formal structural model, encoding (marking up) the text according to that model, developing programs to perform research with that text, and constructing a web site to publish their research. Each project team will meet weekly with a project mentor (one of the instructors) outside class for project planning and discussion.

We will manage course projects in a system called GitHub, and we’ll explain in class how it works and what the relevant terms (repo, Issue, Project, etc.) mean, so if you’re reading this at the beginning of the semester, don’t worry about the jargon or the technical details.

Once project teams have been formed, each team (not each individual student) will post a weekly project update, in the form of an Issue on their GitHub project repos, due on Wednesday. These postings should be status reports about your projects and should address four topics: 1) what you accomplished in the preceding week, 2) what you learned, 3) where you got stuck (and how you plan to get unstuck), and 4) what you plan to accomplish in the upcoming week. Reports are typically brief, but they should address both the state of the project in general and the specific weekly contributions of each of the individual team members. The instructors will show you how to use the Projects and Issues tabs in your GitHub repos (which is what we do in our own research) to manage tasks, timeliness, and any impediments you might face. Weekly project update postings should refer (with a link) to specific new or revised content on the repo.

Unlike projects in many other courses, the project in this course is evaluated not only on the final product, but also on regular, steady progress on individual project-related tasks, and your instructors will work with you to identify appropriate weekly goals for your project using GitHub Projects. Weekly progress will be graded on the following scale: exceeds target (A+), meets target (A), some progress (B), negligible progress (C), no progress (F); only the best 8 (out of 10) Project Issue grades will be included in the course grade. Because the primary purpose of the project is for students to learn how to use the Digital Humanities tools and methods employed by professionals, the instructors will work with you during your weekly project meetings to help you learn to apply the necessary methods to your own research, and obtaining your results according to those methods is part of the evaluation of the project. If your team determines that your research needs a technology or method not taught in depth in the course, please speak with an instructor about how you might incorporate it. All project components, taken together, are worth 60% of the course grade.

Weekly assignments

In addition to project tasks, course activities include reading, coding, and response paper assignments, as well as participation in online discussion. Students must complete at least 90% of each homework component (programming assignments, response papers, project tasks, participation in online activities) on time in order to pass the course. For students who meet the 90% requirement, these assignments, taken together, are worth 25% of the course grade.

Students will need to observe file-naming conventions for all uploaded homework and project files. These conventions are explained at http://dh.obdurodon.org/file-naming_conventions.xhtml (also linked from the main course page); please ask the instructors if anything is unclear.

Coding assignments

Coding assignments in this course are a technique for learning and studying, and not—as in many other courses—a way of testing whether you’ve already learned something covered in class or in an assigned reading. Because a crucial skill in DH development is the ability to look up how to do something you don’t already know how to do, these assignments will frequently challenge you to write code that you do not yet know how to write. Students who are used to the homework-as-test model of course design may find this disconcerting (But they haven’t told us how to do this!), but professional developers have to look things up all the time, and learning how to look things up is a major outcome goal of this course. There may be times when you don’t get the result you want, and in those cases you can still get full credit for the assignment if you’ve made a serious attempt and if you submit, along with your code, a description of what else you tried, what results you expected, what results you got, and what you think went wrong. Getting stuck is part of the learning process and the instructors will be happy to help unstick you as long as you’ve described your understanding of the problem and your attempts to resolve it on your own.

The instructors will post solutions to and discussion of programming assignments on Obdurodon after the assignment deadline. The instructors will read and evaluate all student homework, and will post an assessment on CourseWeb, and we will write back to you with individual comments only if your specific submission raises an issue that we don’t address in our posted solution. If we don’t return your assignment, that means that we have nothing to add to our posted solution, but should you have any specific questions after you’ve read our posted solution, please ask the instructors. Coding assignments are assessed as check plus, check, and check minus. Don’t think of these as grades, since they all receive full credit; they are feedback, for learning purposes, about how well you engaged with the assignment. If you have not engaged with the assignment adequately (whether that means solving the tasks or discussing the coding impediments you encountered and how you dealt with them), we will ask you to meet with us to review the issues and then complete a followup (redo) task in order to receive credit.

Response papers

Students will be asked to submit 300- to 400-word response papers to readings or as evaluations of digital humanities web sites, projects, and resources. Your response does not have to be long, but it must show a thoughtful intellectual engagement with the material. For example, don’t summarize an article (which you could do just by skimming or reading the first paragraph) and don’t just praise or condemn a web site without going into specifics about why some component is or is not well designed and suggesting specific ways it could be improved. Good response papers show thought and attention, and may respond to a reading or site in general, or to a detail of specific interest. Some sites are significant for their design and user experience, others for their research methodology, and response papers should engage with the aspects that are most significant to the project. In some cases the syllabus will specify the focus of the response papers for a particular assignment. Response papers are also assessed as check plus, check, and check minus.

Issue responses

Once the weekly project updates begin, each student is required to read all new Issue postings on the other project repos and respond to at least one status report by another team. These responses may be brief, but they need to be thoughtful, something more than nice job! or very interesting! You might make a suggestion, offer a critique, ask (or answer) a question, discuss how something in someone else’s posting gives you an idea of something to do on your own project, report on a resource you discovered that might be useful for the other team, etc. You will find it helpful to follow the other teams’ repos, which you can do by clicking the Watch button at the top of each repo page. Issue responses are also assessed as check plus, check, and check minus.

Class discussion postings

In addition to weekly project status reports on your project repos, our course GitHub repo will host general discussion, also in the form of GitHub Issues. Despite the image of the programmer as a solitary soul bent over a glowing screen in a cubicle, programming is a social activity, and real programmers make liberal use of discussion boards to ask and answer questions. Furthermore, despite the classroom tradition where students submit homework to a professor, who grades and returns it, so that nobody else ever sees it, that’s not the way coding works in the real world and it’s not the model in this course. Beginning in the second week of the semester, each student is required to read all new discussion postings (it is not necessary to read the backlog from prior semesters, although you may find that interesting) and contribute at least one posting a week to the general discussion: ask a question (new Issue), answer a question (comment on an existing Issue), make a suggestion, pass along a link to a site that showed you something you could use in your project (and explain what it is and why it was useful), etc. Don’t be shy about asking questions in this forum; your instructors do this all the time in their own work, and you can’t learn to code if you don’t learn how to participate in a coding community. And don’t be shy about answering questions, which is both good citizenship and personally satisfying. Participation in the discussion beyond the minimum requirement is strongly encouraged. Much of the classroom time will be devoted to teacher-centered instruction, and the discussion framework offers an opportunity to interact with classmates to cultivate a community in which peers provide insight, solutions, and general discussion. This discussion forum should be the first resource for questions, comments, etc.

Over the course of the semester it is likely that you will find a technology or software we use to be frustrating. Please keep your postings professional and civil.

Tests

There are eight coding tests, of which only the best six are incorporated into the course grade (15%). There are no make-up tests; if you miss a test, it is effectively excused, since it just becomes one of the two that don’t count.

All but the first test will be take-home and assigned over a weekend. They will be open-book/open-notes, but they must be completed individually, which means that although you can look things up (and are encouraged to do so, as needed), you are not permitted to receive help from any other person. If you are confused or uncertain about anything on any of the tests, post an inquiry in the discussion forum on our course GitHub repo, which your instructors monitor regularly, and we’ll respond there.

Exams

There is one midterm evaluation (take-home; 10%), which is collaborative and project-related. There is no final examination.

Approximate time spent outside of class

This is an Honors course, and students should expect a minimum of 2 hours of outside preparation for each hour of class time. The workload will be heavier at the beginning of the semester, when students will need to acquire quickly a foundation for dealing with more complex or advanced topics. The tempo of the course changes in the last few weeks; before that point, you’ll be learning and practicing a lot of new technologies and beginning to apply them to your projects. After that point there are fewer new technologies, and almost all of your course time outside class will be devoted to developing your projects. Once the projects begin in the second or third week of the semester, project teams will meet once a week for approximately an hour outside class with one of the instructors, who will help guide the development of the project.

Grading policy

Letter grade (LG) or satisfactory/no-credit (S/NC). In accord with Dietrich School policy, students enrolled on an S/NC basis must earn at least a straight C to receive credit. G grades (incompletes) are given at the instructor’s discretion and only in conformity with the requirements described in the Undergraduate Student Handbook at https://catalog.upp.pitt.edu/content.php?catoid=72&navoid=6226#grading-systems.

Relative weight of each requirement

Homework assignments 25%
Tests (best six) 15%
Midterm (project based) 10%
Weekly project grade 30%
Large project 20%
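
As a worked illustration of how these weights combine, here is a minimal Python sketch. The weights and the best-six-of-eight rule come from this document; everything else is an assumption made for the example, including the 0-100 numeric scale for each component, the sample scores, the function names, and the choice to count a missed test beyond the two dropped ones as zero. It is not the instructors' actual grading procedure.

```python
# Weights copied from the table above (as fractions of the course grade).
WEIGHTS = {
    "homework": 0.25,
    "tests": 0.15,
    "midterm": 0.10,
    "weekly_project": 0.30,
    "large_project": 0.20,
}


def best_n_average(scores, n):
    """Average the n highest scores.

    Assumption: if fewer than n scores are supplied, the missing ones
    count as zero, because the sum is still divided by n.
    """
    top = sorted(scores, reverse=True)[:n]
    return sum(top) / n


def course_score(components):
    """Weighted sum of component scores, each assumed to be on a 0-100 scale."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# Hypothetical student: eight tests, one of them missed (recorded as 0).
test_scores = [100, 90, 90, 80, 80, 70, 60, 0]
tests_component = best_n_average(test_scores, 6)  # drops the 60 and the 0

components = {
    "homework": 100,
    "tests": tests_component,
    "midterm": 90,
    "weekly_project": 95,
    "large_project": 90,
}

total = course_score(components)
print(round(tests_component, 2), round(total, 2))
```

With these made-up numbers the six best tests average 85, and the weighted total is 25 + 12.75 + 9 + 28.5 + 18 = 93.25. The three project-related rows (midterm, weekly project grade, large project) together account for 60% of the total.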

Note that students must complete at least 90% of each individual course component (coding assignments, response papers, discussion activities, project milestones) in a timely fashion in order to pass the course. There is no separate extra-credit option, but exceptionally ambitious and original projects will be recognized as such when they are evaluated and graded, as will exceptionally engaged participation in the discussion activities.

Policy on attendance, late work, make-ups

Because each topic in this course builds on topics introduced previously, students who wish to do well must attend regularly and complete all assignments on time.

Attendance: Attendance is strictly required, and students must attend at least 90% of the class meetings (arriving on time; repeated late arrival may be counted as absence at the instructor’s discretion) in order to pass the course. Absence is excused only for religious observances and genuine emergencies, only with explicit documentation, and only if the documentation is provided promptly after the student returns to class.

Late work: Coding assignments and response papers must be uploaded to CourseWeb, and GitHub Issue and comment contributions must be posted, by the date and time specified in the syllabus. Homework assignments will be posted on Obdurodon, where they will be accessible at any time, and students who miss class are nonetheless expected to consult this site and submit assignments in a timely fashion. Because the instructors will often have provided answers to the homework (either in class or by posting solutions to Obdurodon) after the submission deadlines, no late homework will be accepted.

Homework must be uploaded to CourseWeb before class on the due date. If you wish to upload a revised version afterwards, reflecting additional understanding you may have acquired from our posted solution or our discussion in class, you're welcome to do so, but in that case it isn't helpful to submit only a late, improved version because we can't tell whether that reflects your improved understanding of the problems you had originally or just your understanding (or even just copying) of our presentation of our solution. For that reason, if you submit a revision of a coding assignment, it should take the form of not just new code, but a critique of your original version, in which you discuss what you did wrong, why it’s wrong, what it does instead of what you wanted it to do, how you can fix it, etc. That type of resubmission can demonstrate that you understand how to approach solving the problem, which is more important in terms of learning to code than just understanding how we solved it. If you do this, please notify the instructors.

Make-up tests: There are no make-up tests, but only the best six test results (out of eight) will be counted in the course grade, which means that students may miss up to two tests without a grade penalty. There is a take-home midterm, but there is no final examination.


Official University policies

Disabilities

If you have a disability for which you are or may be requesting an accommodation, you are required to contact both your instructor and the Office of Disability Resources and Services, 216 William Pitt Union, 412-648-7890/412-383-7355 (TTY), as early as possible in the term. Disability Resources and Services will verify your disability and determine reasonable accommodations for this course.

Academic integrity

This course permits (and encourages) collaborative homework preparation, but only under the conditions specified at http://dh.obdurodon.org/collaboration.xhtml.

All students and instructors are required to observe the Dietrich School Academic Integrity guidelines, as described at https://as.pitt.edu/faculty/policies-and-procedures/academic-integrity-code, and violations of the Academic Integrity Code will be addressed according to those guidelines.

Email

Each student is issued a University email address (username@pitt.edu) upon admission. This email address may be used by the University for official communication with students. Students are expected to read email sent to this account on a regular basis. Failure to read and react to University communications in a timely manner does not absolve the student from knowing and complying with the content of the communications. The University provides an email forwarding service that allows students to read their email via other service providers (e.g., Hotmail, AOL, Yahoo). Students who choose to forward their email from their pitt.edu address to another address do so at their own risk. If email is lost as a result of forwarding, it does not absolve the student from responding to official communications sent to their University email address. To forward email sent to your University account, go to http://accounts.pitt.edu, log into your account, click on Edit Forwarding Addresses, and follow the instructions on the page. Be sure to log out of your account when you have finished. (For the full Email Communication Policy, go to http://www.cfo.pitt.edu/policies/policy/09/09-10-01.html.)