Digital humanities

Maintained by: David J. Birnbaum. [Creative Commons BY-NC-SA 3.0 Unported License]. Last modified: 2018-04-13T22:57:33+0000

Test #7: XQuery


For this test we’ll be using the version of Hamlet that is available in eXist. You can access it as doc('/db/apps/shakespeare/data/ham.xml'). Remember that this document is in the TEI namespace, so you’ll need to use a namespace declaration:

declare namespace tei="";

Don’t forget the trailing semicolon, which is required at the end of all XQuery declare statements.
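
To see why the declaration matters, here is a minimal sketch that runs without the Hamlet file. It assumes the standard TEI namespace URI (http://www.tei-c.org/ns/1.0), and the $sample variable with its contents is a hypothetical stand-in for the real document, not an excerpt from it.

```xquery
(: Assumption: the standard TEI namespace URI. :)
declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: Hypothetical stand-in for the Hamlet document, so the query runs anywhere. :)
let $sample :=
    <TEI xmlns="http://www.tei-c.org/ns/1.0">
        <text>
            <body>
                <sp><speaker>Bernardo</speaker></sp>
                <sp><speaker>Francisco</speaker></sp>
            </body>
        </text>
    </TEI>
return (
    (: An unprefixed name looks for elements in no namespace, so nothing matches. :)
    count($sample//speaker),
    (: The prefixed name matches the elements in the TEI namespace. :)
    count($sample//tei:speaker)
)
```

Under these assumptions the query returns the sequence 0, 2. An empty result from a path expression that looks correct is often a sign that the tei: prefix is missing from one of its steps.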

The task

Your goal is to produce a valid HTML document that contains a table of acts, scenes, and an alphabetical list of the speakers (<speaker> elements) who appear in each scene. The output for the first two scenes in Act 1 (preceded by a header row) looks like:

Act     Scene     Speakers
Act 1   Scene 1   Bernardo, Francisco, Horatio, Marcellus
Act 1   Scene 2   All, Cornelius and Voltimand, Gertrude, Hamlet, Horatio, King, Laertes, Marcellus, Marcellus and Bernardo, Polonius

(Note that the Scene column should not repeat the act number. For the basic solution, you can treat people who speak together as if they were different speakers than when they speak alone, as we do with Cornelius, Voltimand, Marcellus, and Bernardo in Act 1, Scene 2, above.)

We would suggest approaching this step by step: first get the HTML superstructure and the table, then the right number of rows, then the right content in one column, then in another, etc. When we developed it, we broke it down into even smaller steps: for scenes we first returned the full value of the <head> element (e.g., Act 1, Scene 2) and then worried about trimming off the first part. For speakers we first got all the speakers in the scene, then got rid of the duplicates, and then sorted them (using a nested FLWOR). Here are a few other things to consider:
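
The smaller steps described above (full heading first, then trimming; all speakers, then unique, then sorted) can be sketched on a single scene. This is only an illustration under stated assumptions: the $scene sample and its element layout are hypothetical, the namespace URI is the standard TEI one, and the variable names are invented for this example. It deliberately leaves out the HTML table.

```xquery
(: Assumption: the standard TEI namespace URI. :)
declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: Hypothetical one-scene sample; the element layout is an assumption. :)
let $scene :=
    <div xmlns="http://www.tei-c.org/ns/1.0" type="scene">
        <head>Act 1, Scene 1</head>
        <sp><speaker>Francisco</speaker></sp>
        <sp><speaker>Bernardo</speaker></sp>
        <sp><speaker>Francisco</speaker></sp>
    </div>

(: Step 1: start from the full heading, then keep only what follows ", ". :)
let $full-head := string($scene/tei:head)
let $scene-label := substring-after($full-head, ', ')

(: Step 2: all speakers first, then duplicates removed, :)
(: then sorted with a FLWOR nested inside the outer one. :)
let $all := $scene//tei:speaker
let $unique := distinct-values($all)
let $sorted :=
    for $name in $unique
    order by $name
    return $name

return ($scene-label, string-join($sorted, ', '))
```

With this sample the query returns "Scene 1" followed by "Bernardo, Francisco". Binding each intermediate result to its own variable makes it easy to return just that variable while testing, which matches the one-small-step-at-a-time approach suggested above.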

For extra credit

If you finish early, you’re encouraged (for extra credit, and also because it’s interesting) to enhance your output.

Easy enhancements

More challenging enhancements