Digital humanities


Maintained by: David J. Birnbaum (djbpitt@gmail.com) [Creative Commons BY-NC-SA 3.0 Unported License] Last modified: 2018-03-27T18:56:32+0000


Test #5: XSLT Solution

The task

For this test, we asked you to write an XSLT stylesheet to transform the play The Bicyclists and three other farces into a valid and appropriately rendered HTML reading view of the text. This task could have been completed successfully in a number of ways (different valid HTML output, different XSLT stylesheets, lots of CSS flexibility), so although we discuss one solution below, yours does not need to look like ours. If, once you get your test back, you have any questions about your code, please ask one of the instructors and we’ll be happy to go over it with you.

Our Solution

We’ve used XML comments below the way we often do in real development, so that if we go back to our code after not having thought about it for a while, or if we share it with someone else, the comments will make it easier to understand.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs"
    xmlns="http://www.w3.org/1999/xhtml" version="3.0">
    <xsl:output method="xml" indent="yes" doctype-system="about:legacy-compat"/>

    <!-- -->
    <!-- optional <xsl:strip-space> and <xsl:preserve-space> for prettier white-space handling -->
    <!-- -->
    <xsl:strip-space elements="*"/>
    <xsl:preserve-space elements="speech"/>
    <!-- -->
    <!-- end of whitespace configuration -->
    <!-- -->

    <xsl:template match="/">
        <html>
            <head>
                <title>XSLT Test</title>
                <link rel="stylesheet" type="text/css" href="http://www.obdurodon.org/css/style.css"/>
                <style>
                    .stage {
                        font-style: italic;
                    }
                    .speaker {
                        font-weight: bold;
                    }</style>
            </head>
            <body>
                <xsl:apply-templates/>
            </body>
        </html>
    </xsl:template>

    <!-- -->
    <!-- title and table of contents -->
    <!-- -->
    <xsl:template match="playTitle">
        <h1>
            <xsl:apply-templates/>
        </h1>
    </xsl:template>
    <xsl:template match="toc">
        <h2>Contents</h2>
        <ul>
            <xsl:apply-templates/>
        </ul>
    </xsl:template>
    <xsl:template match="sceneName">
        <!-- toc is clickable links to skip to scene -->
        <li>
            <a href="#scene{@n}">
                <xsl:apply-templates/>
            </a>
        </li>
    </xsl:template>
    <!-- -->
    <!-- end of title and table of contents -->
    <!-- -->

    <!-- -->
    <!-- each scene is a section with an <h2> title -->
    <!-- -->
    <xsl:template match="scene">
        <hr/>
        <section id="scene{@number}">
            <xsl:apply-templates/>
        </section>
    </xsl:template>
    <xsl:template match="title">
        <h2>
            <xsl:apply-templates/>
        </h2>
    </xsl:template>
    <xsl:template match="sceneDescription">
        <p class="stage">
            <xsl:text>[</xsl:text>
            <xsl:apply-templates/>
            <xsl:text>]</xsl:text>
        </p>
    </xsl:template>
    <!-- -->
    <!-- end of scene / section -->
    <!-- -->

    <!-- -->
    <!-- cast of characters for each scene -->
    <!-- -->
    <xsl:template match="characters">
        <h3>Characters</h3>
        <ul>
            <xsl:apply-templates/>
        </ul>
    </xsl:template>
    <xsl:template match="character">
        <li>
            <xsl:value-of select="concat(name, ', ', desc)"/>
        </li>
    </xsl:template>
    <!-- -->
    <!-- end of cast of characters for scene -->
    <!-- -->

    <!-- -->
    <!-- speeches are paragraphs with speakers in bold -->
    <!-- -->
    <xsl:template match="speech">
        <p>
            <xsl:apply-templates/>
        </p>
    </xsl:template>
    <xsl:template match="speaker">
        <span class="speaker">
            <xsl:apply-templates/>
        </span>
    </xsl:template>
    <!-- -->
    <!-- end of speeches -->
    <!-- -->

    <!-- -->
    <!-- stage directions are standalone (<p class="stage">) 
        or embedded in speech lines (<span class="stage">) -->
    <!-- -->
    <xsl:template match="stage">
        <xsl:element name="{if (parent::lines) then 'span' else 'p'}">
            <xsl:attribute name="class" select="'stage'"/>
            <xsl:text>[</xsl:text>
            <xsl:apply-templates/>
            <xsl:text>]</xsl:text>
        </xsl:element>
    </xsl:template>
    <!-- -->
    <!-- end of stage direction processing -->
    <!-- -->
</xsl:stylesheet>

Overview

For this exercise, our reading view of the play (http://dh.obdurodon.org/xslt-test_output.xhtml) is relatively straightforward. We didn’t reorder anything, so the title, the table of contents, and the four scenes and their contents appear in the same order as in the input file. We used Attribute Value Templates (AVTs) for internal linking (lines 52 and 66), and we used the concat() function (line 97) to output each character name followed by a comma, a space, and the character description. The most unusual feature is our tagging of stage directions (lines 125–32), which we discuss below.

Whitespace handling

We use <xsl:strip-space> and <xsl:preserve-space> (lines 10–11) for neater whitespace handling. In this document their use is optional because the whitespace in question is visible only in the angle-bracketed raw HTML, and the formatted view in the browser is the same with and without these elements. You can run the transformation both ways to see the difference, and you can read about how they work in Michael Kay.
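As a rough illustration of what those two declarations do, consider the following made-up input fragment. The element nesting and the text are assumptions for the sake of the example, not a quotation from the play file. The comments mark which whitespace-only text nodes the processor would discard before it applies any templates.

```xml
<scene number="1">
    <!-- The newlines and indentation between <scene> and its child elements are
         whitespace-only text nodes; <xsl:strip-space elements="*"/> removes them. -->
    <speech>
        <!-- Whitespace-only text nodes directly inside <speech> are kept, because
             <xsl:preserve-space elements="speech"/> is more specific than the
             wildcard and takes precedence for this element. -->
        <speaker>SPEAKER</speaker>
        <lines>Some spoken text.</lines>
    </speech>
</scene>
```

Text nodes that contain any non-whitespace characters (such as the spoken text above) are never affected by either declaration.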

CSS

We advised you to create a separate CSS file for styling your page. For ease of description in this solution, though, we’ve combined a link to the regular Obdurodon style file (line 20) with a <style> element in our <head> (lines 21–27), which supplements the basic styling with a few additional rules for the stage and speaker classes that we assign to the output of <stage> and <speaker> elements. As we’ve discussed, the <link> element in the <head> must be created by the XSLT transformation, and not added manually afterwards.

Internal linking

We used AVTs to create links that point from the Table of Contents (@href attributes on <a> elements, line 52) to each of the scenes in the play (@id attributes on <section> elements, line 66). When writing AVTs, we surround the XPath expression in curly braces so that the processor evaluates it instead of outputting the text literally. In previous exercises, we used AVTs in conjunction with modal XSLT because we had to process the same elements (e.g., sonnets) in different ways in different places. Since we process everything here in only one way, we don’t need (and therefore don’t use) modes.
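To show what the curly braces are doing, here is a minimal sketch of the same table-of-contents template written without an AVT. It is not part of our solution and is given only for comparison: it builds the @href value explicitly with <xsl:attribute> and concat(), which should produce the same attribute as href="#scene{@n}".

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml" version="3.0">
    <xsl:template match="sceneName">
        <li>
            <a>
                <!-- same result as the AVT href="#scene{@n}": the literal string
                     '#scene' followed by the value of the @n attribute -->
                <xsl:attribute name="href" select="concat('#scene', @n)"/>
                <xsl:apply-templates/>
            </a>
        </li>
    </xsl:template>
</xsl:stylesheet>
```

Without the braces, href="#scene@n" would be copied to the output as the literal text #scene@n, and the link would not point anywhere useful.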

The concat() function

The <character> elements have <name> and <desc> children, and we wanted to combine them into lines that read the way a list of characters might be expected to look at the beginning of a play. On lines 95–99 our template that matches <character> elements creates a list item (<li>) for each character (we create the unordered list that wraps them in the template that matches the <characters> element just above). Inside the <li> we use concat(), which concatenates strings of text, to combine the <name> of the current <character> element, then a comma and a space character, and then the <desc> child. If either <name> or <desc> could have contained markup that we wanted to transform, we wouldn’t have been able to use concat(), because concat() atomizes its arguments: it discards any markup inside them and treats them as plain strings. In that situation we would have applied templates instead. Since we know that these elements don’t have any internal markup, though, it’s safe to use concat() in this case.
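To make that contrast concrete, here is a minimal sketch of the apply-templates alternative, which is not part of our solution. It assumes, hypothetically, that <name> or <desc> could contain an <emph> element; no such element appears in the test input, and the name emph is invented purely for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml" version="3.0">
    <xsl:template match="character">
        <li>
            <!-- apply templates instead of concat() so that any child markup
                 is processed rather than flattened to a string -->
            <xsl:apply-templates select="name"/>
            <xsl:text>, </xsl:text>
            <xsl:apply-templates select="desc"/>
        </li>
    </xsl:template>
    <!-- hypothetical inline markup inside <name> or <desc> -->
    <xsl:template match="emph">
        <em>
            <xsl:apply-templates/>
        </em>
    </xsl:template>
</xsl:stylesheet>
```

When <name> and <desc> contain only text, this version and the concat() version should produce the same list items, so the shorter concat() call is a reasonable choice for this input.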

Transforming stage directions

Stage directions (<stage> elements) can occur in two environments: inside <lines> (that is, within spoken text inside a speech) and on their own, between speeches, rather than inside a speech. When stage directions occur between speeches, they are block level, that is, paragraph-like, so we want to tag them as HTML <p> elements, the same way we’re tagging speeches. But when stage directions occur within spoken lines, they are inline, and we want to tag them as HTML <span> elements. Except for the name of the element that we’re creating, though, we want to process <stage> elements identically in those two situations: in both cases they should have a @class attribute with a value "stage" and in both cases they should wrap the actual stage direction in square brackets. And we don’t want to type that shared code twice.

We address this requirement by dynamically creating the name of the output wrapper element (<p> or <span>) by using <xsl:element> (which creates an element) on line 126. When we create an element with <xsl:element>, the element name is specified in the @name attribute. The value of our @name attribute (which is an AVT, since it has to be interpreted as XPath, and not output literally) is an XPath if expression, which evaluates to the string "span" when the parent of the <stage> element is <lines>, and the string "p" otherwise. This has the effect of creating a <span> in the output in the first case and a <p> in the second case. The same code to create the @class attribute (an <xsl:attribute> element on line 127) and to wrap the text of the stage direction in square brackets (lines 128–30) is used in both cases, so we successfully avoid the repetition.
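For comparison, here is a sketch of the more repetitive alternative that <xsl:element> lets us avoid. It is not the approach in our solution; it uses two templates, one for stage directions inside <lines> and one for all others, and the bracket-wrapping code appears in both.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml" version="3.0">
    <!-- inline stage directions inside spoken lines; the pattern lines/stage
         has a higher default priority than stage, so it wins when both match -->
    <xsl:template match="lines/stage">
        <span class="stage">
            <xsl:text>[</xsl:text>
            <xsl:apply-templates/>
            <xsl:text>]</xsl:text>
        </span>
    </xsl:template>
    <!-- all other stage directions are block level -->
    <xsl:template match="stage">
        <p class="stage">
            <xsl:text>[</xsl:text>
            <xsl:apply-templates/>
            <xsl:text>]</xsl:text>
        </p>
    </xsl:template>
</xsl:stylesheet>
```

Both versions should yield the same output. The trade-off is between two short, explicit templates and a single template whose element name is computed at run time.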

It wasn’t necessary to adopt this approach, and since the repetition is just a few lines and we haven’t practiced <xsl:element> or <xsl:attribute> much, it’s reasonable not to have thought of them here. But now that you’ve learned and practiced many of the fundamentals of XPath and XSLT, we’d encourage you to be alert for opportunities to use more advanced features to improve your code, in this case, as a way of avoiding repetition.