Digital humanities


Maintained by: David J. Birnbaum (djbpitt@gmail.com) [Creative Commons BY-NC-SA 3.0 Unported License] Last modified: 2020-04-21T21:41:53+0000


Test #6: XSLT

The task

For this test, we asked you to use The Blithedale Romance (familiar from the regex unit), but we simplified the markup.

Your XSLT code should output HTML that includes a table of contents that lists the number of each chapter and its title, and links to each chapter from this table of contents. This task is similar to what we've been practicing with the Shakespearean Sonnets; it just has a slightly different structure. Your HTML must be valid.

Our solution

We’ve used XML comments below in the way we often do in real development, so that if we go back to our code after not having thought about it for a while, or if we share it with someone else, the comments will make it easier to understand.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xs" version="3.0">
    <xsl:output method="xml" indent="yes" doctype-system="about:legacy-compat"/>
    <xsl:template match="/">
        <html>
            <head>
                <title>Solution for XSLT Test</title>
            </head>
            <body>
                <h1>
                    <!-- 
                        The title and author of the novel are in the XML markup, 
                        so retrieve them from there instead of retyping them
                    -->
                    <xsl:apply-templates select="//noveltitle"/>
                </h1>
                <h2>
                    <!-- using <xsl:text> gives you better whitespace control -->
                    <xsl:text>by </xsl:text>
                    <xsl:apply-templates select="//author"/>
                </h2>
                <hr/>
                <h2>Contents</h2>
                <ul>
                    <!-- 
                        Use a mode to process the <title> elements differently for
                        the table of contents than as chapter headings in the 
                        full-text reading view. 
                    -->
                    <xsl:apply-templates select="//title" mode="contents"/>
                </ul>
                <hr/>
                <!-- 
                    Create the reading view by applying templates to <title>
                    and <chapter> elements in document order. We don’t create
                    a template to process chapters, since the built-in template
                    will do what we want, that is, apply templates to the 
                    paragraphs in the chapter.
                -->
                <xsl:apply-templates select="//title | //chapter"/>
            </body>
        </html>
    </xsl:template>
    <xsl:template match="title" mode="contents">
        <li>
            <!-- 
                The only characters prohibited in an @id value in HTML (and therefore in an @href that 
                points to an @id value) are whitespace characters, so we strip spaces out of the titles 
                when we construct these attributes (while keeping them in the reading text shown to the
                user). This is a new development in HTML5; earlier versions of HTML prohibited several 
                punctuation characters, as well. Newlines and tabs (that is, whitespace characters, and
                not just space characters) are also prohibited, and in Real Life we would strip those out,
                as well.
                
                We need to use an attribute value template to tell the XSLT processor to run the
                translate() function, instead of outputting the instruction as literal text.
            -->
            <a href="#{translate(., ' ', '')}">
                <xsl:apply-templates/>
            </a>
        </li>
    </xsl:template>
    <!-- 
        We create an @id attribute on each title (<h3>) in the reading view, which serves as a target
        for the links in the table of contents. The value of the @id must match the value of the @href
        in the table of contents, except that the @href value begins with #, and the @id doesn't.
    -->
    <xsl:template match="title">
        <!--
            Strip the same characters to form the @id as the @href, above, so that they'll match. 
        -->
        <h3 id="{translate(., ' ', '')}">
            <xsl:apply-templates/>
        </h3>
    </xsl:template>
    <!-- transform an input <p> into an output HTML <p> -->
    <xsl:template match="p">
        <p>
            <xsl:apply-templates/>
        </p>
    </xsl:template>
</xsl:stylesheet>
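
The comments in the solution note that tabs and newlines are also prohibited in an @id value and that real-world code would strip them as well. One possible way to do that is sketched below; it is an illustration, not part of the graded solution. It assumes an XSLT 2.0 or later processor (the stylesheet above declares version 3.0), so that the XPath replace() function is available, and it assumes that the two templates replace the corresponding <title> templates inside the <xsl:stylesheet> element above, which supplies the namespace declarations they depend on.

    <!-- Sketch only: drop-in replacements for the two <title> templates above -->
    <xsl:template match="title" mode="contents">
        <li>
            <!-- 
                In an XPath regular expression \s matches space, tab, newline, and
                carriage return, so replace() removes all whitespace, not just spaces.
            -->
            <a href="#{replace(., '\s+', '')}">
                <xsl:apply-templates/>
            </a>
        </li>
    </xsl:template>
    <xsl:template match="title">
        <!-- Use the same expression as for the @href, so that link and target still match. -->
        <h3 id="{replace(., '\s+', '')}">
            <xsl:apply-templates/>
        </h3>
    </xsl:template>

Because the same expression builds both the @href and the @id, the links keep working whichever characters are stripped. What matters is that the two values are computed identically.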

Overview

This is a relatively straightforward application of the XSLT techniques that you used for the Shakespeare sonnets homework. As in that assignment: