Digital humanities


Maintained by: David J. Birnbaum (djbpitt@gmail.com) [Creative Commons BY-NC-SA 3.0 Unported License] Last modified: 2017-02-28T04:27:08+0000


XSLT assignment #3

The assignment

Transform the Skyrim XML to HTML in a way that uses an external CSS stylesheet to style the in-line elements. What you should do, then, is:

Our solution

Here is one solution (we process both the <cover> and the <body>; it’s fine if you process just the <body> and skip the <cover> along with the <cast>):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml" version="2.0">
    <xsl:output method="xml" indent="yes"
        doctype-system="about:legacy-compat"/>
    <xsl:template match="/">
        <html>
            <head>
                <title>Skyrim</title>
                <link rel="stylesheet" type="text/css" href="skyrim.css"/>
            </head>
            <body>
                <xsl:apply-templates/>
            </body>
        </html>
    </xsl:template>
    <xsl:template match="skyrim">
        <xsl:apply-templates select="* except cast"/>
    </xsl:template>
    <xsl:template match="title">
        <h1>
            <xsl:apply-templates/>
        </h1>
    </xsl:template>
    <xsl:template match="attribution">
        <h2>
            <xsl:apply-templates/>
        </h2>
    </xsl:template>
    <xsl:template match="subtitle">
        <h3>
            <xsl:apply-templates/>
        </h3>
    </xsl:template>
    <xsl:template match="author">
        <em>
            <xsl:apply-templates/>
        </em>
    </xsl:template>
    <xsl:template match="paragraph">
        <p>
            <xsl:apply-templates/>
        </p>
    </xsl:template>
    <xsl:template match="QuestEvent">
        <span class="QuestEvent">
            <xsl:apply-templates/>
        </span>
    </xsl:template>
    <xsl:template match="QuestItem">
        <span class="QuestItem">
            <xsl:apply-templates/>
        </span>
    </xsl:template>
    <xsl:template match="character">
        <span class="character">
            <xsl:apply-templates/>
        </span>
    </xsl:template>
    <xsl:template match="epithet">
        <span class="epithet">
            <xsl:apply-templates/>
        </span>
    </xsl:template>
    <xsl:template match="faction">
        <span class="faction">
            <xsl:apply-templates/>
        </span>
    </xsl:template>
    <xsl:template match="location">
        <span class="location">
            <xsl:apply-templates/>
        </span>
    </xsl:template>
</xsl:stylesheet>

The output HTML contains a link to an external CSS stylesheet called skyrim.css. Our CSS file looks like:

span.QuestEvent{
  color:red;
}
span.QuestItem{
  color:blue;
}
span.character{
  color:green;
}
span.epithet{
  color:orange;
}
span.faction{
  color:purple;
}
span.location{
  color:fuchsia;
}

The except operator in <xsl:apply-templates select="* except cast"/> is a convenient way of saying apply templates to all of the element children of the current context node except any <cast> children. You can read about except at the top of Kay, p. 631.
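
For comparison, the same selection can be sketched without except, which is available only in XPath 2.0 and later. This alternative is not part of our solution; we include it on the assumption that you might someday need to run a similar transformation under an XSLT 1.0 processor. The predicate [not(self::cast)] keeps every child element that is not itself a <cast> element:

```xml
<!-- Alternative to select="* except cast": filter with a predicate instead. -->
<xsl:template match="skyrim">
    <xsl:apply-templates select="*[not(self::cast)]"/>
</xsl:template>
```

Both versions should select the same child elements in the same order. The except version is usually easier to read when you want to exclude more than one element type.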

The XSLT transforms each of the six types of inline elements into an HTML <span> element with a @class attribute. The CSS contains six selectors that match <span>, one for each possible value of the @class attribute.

Refining the CSS

A CSS selector like span.QuestItem matches a <span> element with a @class attribute that has the value QuestItem. A CSS selector can also match just the element name (e.g., a selector span will match all <span> elements). It can also match just the @class, so that a selector like .QuestItem, which has the period and @class value but not the element name, will match any element that has a @class attribute set to the specified value. This means that, for example, you could use a single rule of this type to assign the same styling not only to <span> elements with the specified @class value, but also to <p> elements, <em> elements, <h1> elements, or any other element type, as long as it has the same specified value for the @class attribute. In our transformation we’ve encoded a @class attribute only for <span> elements, so the results of applying a particular style to all elements with a specific value of the @class attribute would be the same as those of applying that style only to <span> elements with that @class attribute value. In a larger project that makes broader use of the @class attribute, though, a selector that specifies only a class value, without an element name, can help consolidate a large CSS stylesheet into a smaller one.

Refining the XSLT

Our XSLT above has separate rules for six types of in-line elements that all do the same thing: they create an HTML <span> element with a @class attribute, the value of which is the same as the generic identifier (GI, or name) of the element that was matched in the original XML. For example, a <QuestEvent> element in the input XML is converted to <span class="QuestEvent"> in the output HTML. Using attribute value templates (AVT; see http://dh.obdurodon.org/avt.xhtml), we can exploit the similarities and consolidate those six rules into one, replacing the six separate rules with the following one:

<xsl:template match="QuestEvent | QuestItem | character | epithet | faction | location">
  <span class="{name()}">
    <xsl:apply-templates/>
  </span>
</xsl:template>

We use the union operator (|) to create a single template that will match any of those six types of elements, create an HTML <span> element with a @class attribute, and set the value of that attribute to the GI of the original XML element that is being matched at the moment. The XPath name() function returns the name of an element, and since the current context is the element currently being processed, each time this template processes an element it can retrieve the GI for that particular element instance. The function has to be surrounded by curly braces for reasons explained in our AVT documentation.
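
To see what the curly braces are doing, it may help to compare the AVT with a longer equivalent that builds the attribute explicitly. This is only a sketch for comparison, not a replacement we recommend; the select attribute on <xsl:attribute> requires XSLT 2.0:

```xml
<xsl:template match="QuestEvent | QuestItem | character | epithet | faction | location">
    <span>
        <!-- Evaluate name() as XPath and use the result as the value of @class. -->
        <xsl:attribute name="class" select="name()"/>
        <xsl:apply-templates/>
    </span>
</xsl:template>
```

Without the braces, writing class="name()" on the <span> would copy the literal string name() into the output instead of evaluating the function.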

One of the enhancements we suggested in the assignment was that you could style input nodes on the basis of not only the element type, but also attribute values. For example, instead of setting all <faction> elements to the same value of the @class attribute, you can use <xsl:template match="faction[@ref='MythicDawn']"> to match <faction> elements only if they have a @ref attribute with the value MythicDawn. The AVT strategy we describe above to use a common template rule to treat different element types similarly could also be used to set the value of the HTML @class attribute according to not just the GI of the element being transformed, but also the value of, say, its @ref attribute. That is, we could use one template to convert <faction ref="MythicDawn"> into <span class="MythicDawn"> and <faction ref="DarkBrotherhood"> into <span class="DarkBrotherhood">. A template that assigns a different class to each <faction> element in the input XML file according to the value of its @ref attribute might look like:

<xsl:template match="faction">
    <span class="{@ref}">
        <xsl:apply-templates/>
    </span>
</xsl:template>

This template matches all <faction> elements in the input XML, regardless of the value of their @ref attribute, and converts them into HTML <span> elements with a @class attribute. It determines the value of the @class attribute, though, by copying it from the @ref attribute of the input <faction> element. The CSS stylesheet would, of course, have to be modified, since the example above does not include any selectors for <span> elements with any of these new @class attribute values.
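
If you want to keep the general faction styling and add faction-specific styling on top of it, one possible variation is to write both the GI and the @ref value into the @class attribute. HTML treats the value of @class as a space-separated list of class names, so a CSS selector such as .faction and a selector such as .MythicDawn could both apply to the same <span>. The sketch below assumes that each @ref on a <faction> points to a single faction and therefore contains no spaces:

```xml
<xsl:template match="faction">
    <!-- e.g. <faction ref="MythicDawn"> becomes <span class="faction MythicDawn"> -->
    <span class="{name()} {@ref}">
        <xsl:apply-templates/>
    </span>
</xsl:template>
```

An AVT can contain more than one pair of curly braces; the text between them (here, a single space) is copied into the attribute value as is.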

Dereferencing <faction> elements

We also invited you to stretch your XPath skills along the following lines:

In the header (the <cast> element) some factions are described (with an @alignment attribute) as evil, good, or neutral. You can write a matching rule that will dereference the @ref attribute on, say, <faction ref="MythicDawn">assassins</faction>, look up whether this is an evil, good, or neutral faction, and set the @class value of the HTML <span> that you are creating accordingly. You could make all good factions one color and all evil factions a different color, letting XPath look up the moral alignment of a faction for you.

When you match a <faction> element in the input XML, it might say, for example, <faction ref="blades">. Meanwhile, inside the <cast> element at the top there is an element that establishes the alignment of the blades faction as good:

<faction id="blades" alignment="good"/>

You can dereference (‘look up’) an in-line <faction> with a @ref attribute value of blades by looking up the <faction> element inside the <cast> that has an @id attribute with the same value, and you can then retrieve the value of the @alignment attribute associated with that entry. For example:

<xsl:template match="faction">
  <span class="{//cast/faction[@id = current()/@ref]/@alignment}">
    <xsl:apply-templates/>
  </span>
</xsl:template>

The preceding template matches any <faction> element and creates an HTML <span> element with a @class attribute. The value of the @class attribute is determined by retrieving all of the <faction> elements in the <cast> and finding the one whose @id attribute matches the @ref attribute of the inline <faction> element (in the <body>) currently being processed. We have to use the XSLT current() function to refer to the inline <faction> element in the <body> currently being processed, rather than the dot, because the dot gets the current XPath context, and within the XPath we’re evaluating, that’s one or another <faction> element in the <cast> element. The current() function, on the other hand, returns the current XSLT context, which is the inline <faction> element that our template has matched, and a <faction> element that has a @ref attribute must be in the <body>.
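
The same lookup can also be sketched with an XSLT key, which some developers find easier to read than a predicate that uses current(). The key name factionById is our own invention for this illustration. Note that <xsl:key> must be a top-level child of <xsl:stylesheet>, so the two pieces below would be merged into the stylesheet above rather than used on their own:

```xml
<!-- Top level: index each <faction> in the <cast> by its @id (hypothetical key name). -->
<xsl:key name="factionById" match="cast/faction" use="@id"/>

<xsl:template match="faction">
    <!-- key() is evaluated with the inline <faction> as the context node,
         so a plain @ref works here and current() is not needed. -->
    <span class="{key('factionById', @ref)/@alignment}">
        <xsl:apply-templates/>
    </span>
</xsl:template>
```

If a @ref value has no matching entry in the <cast>, the key() function returns nothing and the @class attribute is created with an empty value.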

One more complication

Inside the <body> element, the <faction> elements have @ref attributes that point to a single faction. The situation is different with the <character> elements inside the <body>, though. For example, there is one place where the XML reads:

<character ref="MartinSeptim hero Jauffre">they</character>

Trying to apply the strategy above will fail here because there is no <character> element in the <cast> element that has an @id attribute with the value MartinSeptim hero Jauffre. The problem is that this @ref attribute points to three separate <character> elements inside the <cast> element, a situation our templates above cannot handle.

The correct solution to this problem requires the XPath tokenize() function, which you don’t know yet. It would let you break the value MartinSeptim hero Jauffre into three parts and look each of them up individually. What you would then do, though, is a separate question: MartinSeptim and Jauffre both have good alignment and the hero has neutral alignment, so how should you style the word they with CSS? In a previous semester one student resolved this dilemma by using parenthesized content instead of color coding, that is, by rendering they (good) (neutral) (good). The advantage of this solution is that a string of text can have only a single color, but you can have multiple parenthesized annotations.
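
For the curious, here is a rough sketch of how the parenthesized approach might be written with tokenize(). It is not the student’s actual code. It assumes that the <cast> contains <character> elements with @id and @alignment attributes, parallel to the <faction> entry shown earlier, and the variable name cast is our own choice:

```xml
<!-- Top level: remember the <cast> so that it can still be reached
     when the context item is a string rather than a node. -->
<xsl:variable name="cast" select="//cast"/>

<xsl:template match="character">
    <span class="character">
        <xsl:apply-templates/>
        <!-- Split @ref on spaces and look up each pointer separately. -->
        <xsl:for-each select="tokenize(normalize-space(@ref), ' ')">
            <xsl:text> (</xsl:text>
            <!-- Here current() is one token from @ref, e.g. 'hero'. -->
            <xsl:value-of select="$cast/character[@id = current()]/@alignment"/>
            <xsl:text>)</xsl:text>
        </xsl:for-each>
    </span>
</xsl:template>
```

Inside the <xsl:for-each> the context item is a string, not a node, so a path that begins with // would raise an error there; that is why the <cast> is stored in a global variable first. If you try this, remove character from the union template shown earlier so that two templates do not compete for the same element.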