Digital humanities


Maintained by: David J. Birnbaum (djbpitt@gmail.com) [Creative Commons BY-NC-SA 3.0 Unported License] Last modified: 2023-01-08T19:22:20+0000


XSLT assignment #1

The assignment

Your assignment is to create an XSLT stylesheet that will transform Bad Hamlet into a hierarchical outline of the titles of acts and scenes in HTML. This isn’t very interesting on its own, of course, but if you were transforming the entire document into HTML for publication on the web, this might serve as the skeleton. It might also stand on its own as a table of contents at the top of such a publication, so that the reader could click on the title of a scene to jump to that location in the file.

If you’re feeling adventurous, you’re welcome to include more information, whether of a publication-oriented sort (e.g., speakers, speeches, stage directions, etc., as if you were publishing the entire play) or as a foray into exploration and analysis (e.g., a list of the characters who speak in each scene, perhaps with a count of their speeches, the length of their speeches, etc.). The only required content of your homework, though, is the HTML outline of act and scene titles, which a browser might render as a nested list.

The underlying HTML, which we generated using XSLT, is:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
   <head>
      <title>Hamlet</title>
   </head>
   <body>
      <ul>
         <li>Act 1
            <ul>
               <li>Act 1, Scene 1</li>
               <li>Act 1, Scene 2</li>
               <li>Act 1, Scene 3</li>
               <li>Act 1, Scene 4</li>
               <li>Act 1, Scene 5</li>
            </ul>
         </li>
         <li>Act 2
            <ul>
               <li>Act 2, Scene 1</li>
               <li>Act 2, Scene 2</li>
            </ul>
         </li>
         <li>Act 3
            <ul>
               <li>Act 3, Scene 1</li>
               <li>Act 3, Scene 2</li>
               <li>Act 3, Scene 3</li>
               <li>Act 3, Scene 4</li>
            </ul>
         </li>
         <li>Act 4
            <ul>
               <li>Act 4, Scene 1</li>
               <li>Act 4, Scene 2</li>
               <li>Act 4, Scene 3</li>
               <li>Act 4, Scene 4</li>
               <li>Act 4, Scene 5</li>
               <li>Act 4, Scene 6</li>
               <li>Act 4, Scene 7</li>
            </ul>
         </li>
         <li>Act 5
            <ul>
               <li>Act 5, Scene 1</li>
               <li>Act 5, Scene 2</li>
            </ul>
         </li>
      </ul>
   </body>
</html>

We’ve used HTML unordered list (<ul>) elements. The only content allowed inside a <ul> element is list items (<li>), and we’ve nested them, so that each list item that represents an act contains the title of that act followed by an embedded <ul> that contains, in turn, a separate list item for the title of each scene. This isn’t the only way to format this type of outline, and you’re welcome to take a different approach. For example, if you’d like to include the full text of the play, that is, the stage directions and speeches, the embedded list format isn’t really appropriate. In that case we might use the HTML heading elements (<h1> through <h6>) to create hierarchical headings.
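As a rough illustration of the heading-based alternative, the fragment below sketches how the start of the output might look. The choice of <h1> for the play, <h2> for acts, and <h3> for scenes is our assumption for the example, not a requirement of the assignment, and the comments only mark where the text of the play would go.

```xml
<body xmlns="http://www.w3.org/1999/xhtml">
   <!-- Assumed heading levels: h1 = play, h2 = act, h3 = scene -->
   <h1>Hamlet</h1>
   <h2>Act 1</h2>
   <h3>Act 1, Scene 1</h3>
   <!-- stage directions and speeches for Act 1, Scene 1 would go here -->
   <h3>Act 1, Scene 2</h3>
   <!-- stage directions and speeches for Act 1, Scene 2 would go here -->
   <h2>Act 2</h2>
   <h3>Act 2, Scene 1</h3>
   <!-- and so on for the rest of the play -->
</body>
```

Unlike the nested lists, the headings here are siblings of the content they introduce, so the hierarchy is expressed only by the heading level.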

Before you begin

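Both your input document and your output document have to be in the correct namespaces, and you need to tell your XSLT stylesheet about them: the input is in the TEI namespace (declared with @xpath-default-namespace) and the output is in the HTML namespace (declared with @xmlns).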

Finally, to output an HTML document that conforms to HTML5 expectations for XML syntax, we also create an <xsl:output> element as the first child of our root <xsl:stylesheet> element (lines 8–9 below). Our modified skeleton looks like the following:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0"
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math" 
    exclude-result-prefixes="#all"
    version="3.0">
    <xsl:output method="xhtml" html-version="5" omit-xml-declaration="no" 
        include-content-type="no" indent="yes"/>
    
</xsl:stylesheet>

Guide to approaching the problem

Our XSLT transformation (after all this housekeeping) has three template rules:

  1. We have a template rule for the document node (<xsl:template match="/">), in which we create the basic HTML output: the <html> element, <head> and its contents, and <body>. Inside the <body> element that we’re creating we create a <ul> to hold a list of acts, and inside that we use <xsl:apply-templates> and select the acts (using an XPath expression as the value of the @select attribute).

  2. We have a separate template rule that matches acts, so it will be invoked as a result of the preceding <xsl:apply-templates> instruction, and will fire once for each act. Inside that template rule we create a new list item (<li>) for the act being processed, and inside the tags for that new list item we do two things. First, we apply templates to the <head> for the act, which will eventually cause its title to be output. Second, we create wrapper <ul> tags for the nested list that will contain the titles of the scenes. Inside that new <ul> element, we use an <xsl:apply-templates> instruction to apply templates to (that is, to process) the scenes of that act.

  3. We have a separate template rule that matches scenes. It creates a list item (<li>) for the scene and, inside it, applies templates to the scene’s <head> element, which ultimately causes the textual content of that <head> to be output. This rule will fire once for each scene in the play, and because it is invoked separately from within each act, the scenes are rendered under their own acts.
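The three rules above might be sketched as follows. This is one possible arrangement, not the only correct one, and it rests on an assumption you should check against the input file: that acts are encoded as <div type="act"> and scenes as <div type="scene">, each with a <head> child. If Bad Hamlet uses different element or attribute names, adjust the @match and @select values accordingly. The scene rule wraps the title in an <li>, since a <ul> may contain only list items.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="#all"
    version="3.0">
    <xsl:output method="xhtml" html-version="5" omit-xml-declaration="no"
        include-content-type="no" indent="yes"/>

    <!-- Rule 1: the document node; builds the HTML shell and the outer list -->
    <xsl:template match="/">
        <html>
            <head>
                <title>Hamlet</title>
            </head>
            <body>
                <ul>
                    <!-- ASSUMPTION: acts are <div type="act"> in the input -->
                    <xsl:apply-templates select="//div[@type = 'act']"/>
                </ul>
            </body>
        </html>
    </xsl:template>

    <!-- Rule 2: one list item per act, holding its title and a nested list -->
    <xsl:template match="div[@type = 'act']">
        <li>
            <xsl:apply-templates select="head"/>
            <ul>
                <!-- ASSUMPTION: scenes are <div type="scene"> children of the act -->
                <xsl:apply-templates select="div[@type = 'scene']"/>
            </ul>
        </li>
    </xsl:template>

    <!-- Rule 3: one list item per scene, holding only its title -->
    <xsl:template match="div[@type = 'scene']">
        <li>
            <xsl:apply-templates select="head"/>
        </li>
    </xsl:template>
</xsl:stylesheet>
```

Note that the literal <head> inside the first rule is an HTML element that we are creating, while select="head" in the other two rules refers to the input element in the TEI namespace, because @xpath-default-namespace applies to XPath expressions and patterns but not to literal result elements.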

We don’t need a template rule for the <head> elements themselves because the built-in (default) template rule in XSLT for an element that doesn’t have an explicit, specified rule is just to apply templates to its children. The only child of the <head> elements is a text node, and the built-in rule for text nodes is to output them literally. In other words, if you apply templates to <head> and you don’t have a template rule that matches that element, ultimately the transformation will just output the textual content of the head, that is, the title that you want.
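For reference, the built-in behavior described above is roughly equivalent to writing the two rules below yourself. This is only an illustration of what the processor does implicitly in the default mode; you would not normally include these rules in your stylesheet.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
    <!-- Built-in rule for the document node and for elements:
         output nothing for the node itself, just process its children -->
    <xsl:template match="* | /">
        <xsl:apply-templates/>
    </xsl:template>

    <!-- Built-in rule for text nodes and attributes: output the string value -->
    <xsl:template match="text() | @*">
        <xsl:value-of select="."/>
    </xsl:template>
</xsl:stylesheet>
```

Applied to a <head> that contains only a text node, the first rule passes processing down to the text node and the second rule outputs it, which is why the title appears without any rule of your own for <head>.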

Important

What to submit

You should upload both the XSLT stylesheet you created to run the transformation and the HTML it produced. The HTML must be valid; if it isn’t, the XSLT must include, as properly formatted code comments, information about how you tried to debug your transformation.
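Debugging notes of this kind can be written as ordinary XML comments inside the stylesheet. The sketch below shows only the form: the wording of the notes is a hypothetical placeholder, and the stylesheet body is omitted. Remember that an XML comment may not contain two consecutive hyphens.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="#all"
    version="3.0">
    <!--
        Debugging notes (hypothetical placeholder text):
        1. What the validator reported about the HTML output.
        2. What I changed in the stylesheet in response, and why.
        3. What the output looked like after the change.
    -->
    <!-- template rules would go here -->
</xsl:stylesheet>
```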