Maintained by: David J. Birnbaum (djbpitt@gmail.com)
Last modified: 2023-03-20T16:27:13+0000
Write an XSLT stylesheet that will transform the XML input document into an HTML document that consists entirely of tables of characters and factions. You can see the desired output at http://dh.obdurodon.org/skyrim-02.xhtml.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://www.w3.org/1999/xhtml"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math" exclude-result-prefixes="#all"
    version="3.0">
    <xsl:output method="xhtml" html-version="5" omit-xml-declaration="no" include-content-type="no"
        indent="yes"/>
    <xsl:template match="/">
        <html>
            <head>
                <title>Skyrim</title>
                <style>
                    table { border-collapse: collapse; }
                    table, th, td { border: 1px solid black; }
                </style>
            </head>
            <body>
                <h1>Skyrim</h1>
                <h2>Cast of characters</h2>
                <table>
                    <tr>
                        <th>Name</th>
                        <th>Faction</th>
                        <th>Alignment</th>
                    </tr>
                    <xsl:apply-templates select="//cast/character"/>
                </table>
                <h2>Factions</h2>
                <table>
                    <tr>
                        <th>Name</th>
                        <th>Alignment</th>
                    </tr>
                    <xsl:apply-templates select="//cast/faction"/>
                </table>
            </body>
        </html>
    </xsl:template>
    <xsl:template match="character">
        <tr>
            <td>
                <xsl:apply-templates select="@id"/>
            </td>
            <td>
                <xsl:apply-templates select="@loyalty"/>
            </td>
            <td>
                <xsl:apply-templates select="@alignment"/>
            </td>
        </tr>
    </xsl:template>
    <xsl:template match="faction">
        <tr>
            <td>
                <xsl:apply-templates select="@id"/>
            </td>
            <td>
                <xsl:apply-templates select="@alignment"/>
            </td>
        </tr>
    </xsl:template>
</xsl:stylesheet>
Before anything else: the Skyrim XML is not in a namespace. This means that you
should not include an @xpath-default-namespace
attribute in your XSLT. If you make the mistake of specifying, say, the TEI
namespace as the default, your templates will match only TEI elements, and since
there aren’t any TEI elements in Skyrim, your templates will match nothing. That
isn’t an error, but it is a mistake and it isn’t what you want.
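To make the mistake concrete, here is a minimal sketch of what not to do. The TEI namespace URI is the real one; the templates are cut-down versions of the ones in the solution above. With @xpath-default-namespace set this way, unprefixed element names in @select and @match are read as TEI element names, so nothing in Skyrim is selected or matched.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Mistaken sketch: do NOT do this for Skyrim, which is in no namespace. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="#all" version="3.0">
    <xsl:template match="/">
        <table>
            <!-- "cast" and "character" are now looked up in the TEI namespace,
                 so this selects an empty sequence and no rows are created. -->
            <xsl:apply-templates select="//cast/character"/>
        </table>
    </xsl:template>
    <!-- This pattern matches only TEI character elements, and there are none. -->
    <xsl:template match="character">
        <tr>
            <td>
                <xsl:apply-templates select="@id"/>
            </td>
        </tr>
    </xsl:template>
</xsl:stylesheet>
```

Removing the @xpath-default-namespace attribute is enough to make the same stylesheet select and match the Skyrim elements.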
To keep everything in one place for this answer key, so that you don’t have to look
at multiple files simultaneously, we’ve used a <style> element in the <head> to hold
the CSS that specifies a border for the table. However, in your projects you’ll want
to declare all CSS rules in a separate CSS stylesheet, so that you can assign the same
styles to multiple HTML documents without having to repeat the same CSS instructions
inside each of those files. The old @border attribute on <table> elements is
officially non-conforming in HTML5, which means that although it may work in your
browser, it should be avoided; best practice for specifying a table border is now to
use CSS.
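In a project, the <head> might link to an external stylesheet instead of embedding the rules. The following is only a sketch: the file name skyrim.css is hypothetical, and we assume it contains the same two table rules that this answer key keeps in <style>.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml" exclude-result-prefixes="#all" version="3.0">
    <xsl:output method="xhtml" html-version="5" omit-xml-declaration="no"
        include-content-type="no" indent="yes"/>
    <xsl:template match="/">
        <html>
            <head>
                <title>Skyrim</title>
                <!-- skyrim.css is a hypothetical file name; it would hold the
                     table rules that the answer key keeps in a style element. -->
                <link rel="stylesheet" type="text/css" href="skyrim.css"/>
            </head>
            <body>
                <h1>Skyrim</h1>
                <!-- The two tables would be created here, as in the solution above. -->
            </body>
        </html>
    </xsl:template>
</xsl:stylesheet>
```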
Before writing any code to extract the information that is going to populate the rows
of our tables, we begin by creating the superstructure of our HTML document, as we
did in XSLT assignment #1. As always, the HTML document that we create has an <html>
root element with two children, <head> (with <title> and, in this case, although not
in your projects, <style>) and <body>. Inside the body, we create a prominent <h1>
element to title our page as “Skyrim”, and then an <h2> subtitle followed by a <table>
for the “Cast of characters” and another <h2> and <table> for “Factions”. You can read
more about HTML tables at http://www.w3schools.com/html/html_tables.asp.
The tables that we are creating need to be filled. HTML tables are constructed row by
row, and rows are defined by <tr>. Rows contain cells, of which there are two types:
<th> (table header) is used to label a row or column and <td> is used to represent
actual data. When, inside our template rule for the document node, we create the
start- and end-tags for the two tables themselves, we also create the header rows
for each table (<tr> containing <th>), since we want to create the <table> tags and
the header rows just once per table. Since every document has exactly one document
node, the template that matches the document node will fire exactly once, so by
creating the tables and their header rows in that template, we ensure that we create
each table, with its single header row, just once.
Although we want just a single header row per table, we want a separate data row for each character or faction, so we need to create those rows in a template that fires once per character or faction.
Our goal in the first table is to create a row for every character, and that row
should contain three cells, the first with the character’s name, the second with the
character’s faction, and the third with the character’s alignment. Every character
and every faction is listed in the <cast> element in our XML document, and so we want
to look there to retrieve the data that we’re going to insert into our table rows. In
other words, we need to apply templates to each of the <character> and <faction>
elements that we find inside <cast> to create rows for the HTML tables we are
creating. Since we want to create rows with information about characters after the
header row for that table, we put the <xsl:apply-templates select="//cast/character"/>
element right there, immediately after the first table row (the header row) inside
the table. The value of the @select attribute tells our stylesheet to navigate
directly to the <cast>, to find all of its <character> children, and then to apply
templates to those <character> elements, putting content exactly where the
<xsl:apply-templates> element was. That tells the stylesheet where we want to process
<character> children of <cast> (not other <character> elements!), but it doesn’t say
how. In order to communicate to the stylesheet how we want to process characters, we
need to define a template for <character> elements that tells our stylesheet
specifically what to do.
The template we use has a @match attribute with the value “character”, which will
match any <character> it sees, which in this case means the ones that are thrown at
it when the <xsl:apply-templates select="//cast/character"/> element fires inside the
first table, the one we create in the template rule for the document node. (The
template that matches those characters would also match any other <character>
elements it might see, but it never sees any others, since this stylesheet never
applies templates to anything outside the <cast> element.) Inside the template rule
for <character> we create a <tr>, which will be inserted into the table we’re
creating in exactly the place where the
<xsl:apply-templates select="//cast/character"/> element was located, that is, right
after the header row. This new row has three <td> elements to hold the data for the
character we’re processing at the moment. That data is retrieved because each of the
data cells contains its own <xsl:apply-templates> element; these select the @id,
@loyalty, and @alignment attributes of the particular <character>, respectively.
Since all we want from those attributes is their textual value, we don’t have to
define our own templates to handle them; we can rely instead on the behavior of the
built-in default template, which just outputs the textual value of any attribute that
does not have its own template rule. As a result, each data cell will come to be
populated with the string value of the targeted attribute.
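For reference, the built-in rule that we rely on for attributes behaves roughly as if the stylesheet contained the template below. This is a sketch of the default behavior for illustration only; you do not need to add it to your own stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    exclude-result-prefixes="#all" version="3.0">
    <!-- Roughly what the built-in rule does for attribute and text nodes that
         have no template of their own: output the string value of the node. -->
    <xsl:template match="text() | @*">
        <xsl:value-of select="."/>
    </xsl:template>
</xsl:stylesheet>
```

Because this behavior is already the default, <xsl:apply-templates select="@id"/> inside a <td> is enough to put the value of @id into the cell.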
We follow a similar process for creating the table listing each faction. We use
<xsl:apply-templates select="//cast/faction"/>, define a template for factions
(<xsl:template match="faction">), and create a table row inside the template with
data cells applying templates to @id and @alignment. Keep in mind that <faction>
elements have only two properties, so our table rows for factions will have only two
data cells.
Cast of characters

Name | Faction | Alignment |
---|---|---|
UrielSeptim | empire blades | good |
hero | neutral | neutral |
Jauffre | empire blades | good |
MartinSeptim | empire blades | good |
MehrunesDagon | daedra | evil |
MankarCamoran | daedra MythicDawn | evil |

Factions

Name | Alignment |
---|---|
MythicDawn | evil |
blades | good |
daedra | evil |
empire | good |
DarkBrotherhood | neutral |
The XML document includes <character> and <faction> elements in different contexts:
some are children of <cast> and others are descendants of <body>. Our templates that
match <character> and <faction> elements don’t specify a context, so they would match
elements of those types both in the <cast> and in the <body>. Why, then, isn’t our
output cluttered with unwanted <character> and <faction> elements from within the
<body>?
The answer is that we never apply templates to anything inside the <body>. In order
for a template to fire on an element, two things have to happen:
1. We have to apply templates to the element.
2. A template has to match the element.
Since we never apply templates to <character> and <faction> descendants of <body>,
the first requirement is never met and those instances of <character> and <faction>
elements are not processed. If we were processing both the <cast> and the <body>,
though, we would want <character> and <faction> elements in those two contexts to be
processed differently, so we would need separate templates that match the element
types in each context. This is similar to the way we wrote separate templates to
match act and scene <div> elements in the first XSLT assignment, since we needed to
process those <div> elements differently in different contexts.
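If we did process both contexts, the separate templates might look something like the sketch below. The first template reuses the row-building code from our solution; the second is purely hypothetical, since this assignment never processes the <body>, and the <span> with a class attribute is our own assumption about what we might want there.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml" exclude-result-prefixes="#all" version="3.0">
    <!-- A character that is a child of cast becomes a table row. -->
    <xsl:template match="cast/character">
        <tr>
            <td>
                <xsl:apply-templates select="@id"/>
            </td>
            <td>
                <xsl:apply-templates select="@loyalty"/>
            </td>
            <td>
                <xsl:apply-templates select="@alignment"/>
            </td>
        </tr>
    </xsl:template>
    <!-- Hypothetical: a character anywhere inside the Skyrim body element is
         wrapped in a span so that it could be styled in running text. -->
    <xsl:template match="body//character">
        <span class="character">
            <xsl:apply-templates/>
        </span>
    </xsl:template>
</xsl:stylesheet>
```

The @match patterns do the disambiguation: the same element type is handled by a different template depending on its parent or ancestor.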
You can stop reading here if you’d like, and something like the code above is a fine
solution for this assignment. At the same time, that XSLT has a lot of fragmentation
and repetition. The character table and the faction table have a lot in common, yet
we create them separately, and we treat all attributes pretty much the same way, yet
we create a <td> to hold each type of attribute separately. We can make our XSLT more
concise, and therefore easier to maintain, in the following ways:
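Instead of our current templates for creating character and faction rows, we can have
each row template apply templates to the relevant attributes in a single instruction,
and create the <td> elements in one shared template that matches @*.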
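A minimal sketch of this refactoring follows. It is reconstructed from the description here rather than copied from the original listing, so the details (in particular the use of <xsl:value-of> inside the attribute template) are our assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml" exclude-result-prefixes="#all" version="3.0">
    <!-- The template for the document node is unchanged and omitted here. -->
    <xsl:template match="character">
        <tr>
            <!-- The comma operator fixes the order of the three cells. -->
            <xsl:apply-templates select="@id, @loyalty, @alignment"/>
        </tr>
    </xsl:template>
    <xsl:template match="faction">
        <tr>
            <xsl:apply-templates select="@id, @alignment"/>
        </tr>
    </xsl:template>
    <!-- The only place where a td is created: one cell per attribute. -->
    <xsl:template match="@*">
        <td>
            <xsl:value-of select="."/>
        </td>
    </xsl:template>
</xsl:stylesheet>
```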
The XPath expression @* matches any attribute. When we apply templates to the
attributes for the two element types we have to use the comma operator to combine
them because the comma operator specifies the order. Although attributes inside a
start-tag may look ordered to us, The Real XML is a tree, and not tags, and the order
in which attributes are listed inside the start-tag is not part of the information
available in the tree. Were we to apply templates to all of the attributes of the
current context node with <xsl:apply-templates select="@*"/> there would be no
guarantee that they would be output in the order in which they appear inside the
start-tag. By specifying their order with the comma operator, though, we tell XSLT to
apply templates to them in the order listed in the XSLT.
As a result of this consolidation our XSLT has only one location where it creates
<td> elements, removing a lot of repetition. We can consolidate further, though.
Currently we create rows for characters in one template and rows for factions in a
different template, but the processing is otherwise the same: we create a <tr>
element and, inside it, apply templates to the attributes of the element we’re
processing in a specific order. We can take advantage of the fact that although the
attributes on characters and factions differ, those that are present on both types of
element observe the same order. This means that we can further refactor the code
above so that a single template creates the rows for both characters and factions.
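A sketch of that further consolidation, again reconstructed from the surrounding description rather than taken from the original listing:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml" exclude-result-prefixes="#all" version="3.0">
    <!-- The template for the document node is unchanged and omitted here. -->
    <!-- One template creates the row for either element type. -->
    <xsl:template match="character | faction">
        <tr>
            <!-- Factions have no @loyalty, so that step selects nothing for them. -->
            <xsl:apply-templates select="@id, @loyalty, @alignment"/>
        </tr>
    </xsl:template>
    <!-- One template creates the cell for any attribute. -->
    <xsl:template match="@*">
        <td>
            <xsl:value-of select="."/>
        </td>
    </xsl:template>
</xsl:stylesheet>
```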
The first template matches anything that is either a <character> or a <faction>
element, so we no longer need separate templates to process those two types of
elements. This approach works even though characters and factions have different
attributes because applying templates to something that doesn’t exist (<faction>
elements do not have @loyalty attributes) is not an error. What the code says is
“apply templates to all of my @id, @loyalty, and @alignment attributes in that
order”. What happens with characters is straightforward: we get three cells per
character, populated with information from those attributes in the specified order.
What happens with factions is that it applies templates first to all of a faction’s
@id attributes (there is always exactly one), then to all of its @loyalty attributes
(there are never any, so it applies templates to all zero of them; that is, it does
nothing), and then it applies templates to all the faction’s @alignment attributes
(there is always exactly one).
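Our revised XSLT still creates, in the template for the document node, the title and
<h1> “Skyrim” and the two tables with their header rows (Name, Faction, and Alignment
under “Cast of characters”; Name and Alignment under “Factions”). The row templates
are replaced by the two consolidated templates described above.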
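Put together, the revised stylesheet might look like the following sketch. The template for the document node is carried over from the first solution (we assume the <style> element is kept as well); the last two templates are our reconstruction of the consolidation described above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml" exclude-result-prefixes="#all" version="3.0">
    <xsl:output method="xhtml" html-version="5" omit-xml-declaration="no"
        include-content-type="no" indent="yes"/>
    <xsl:template match="/">
        <html>
            <head>
                <title>Skyrim</title>
                <style>
                    table { border-collapse: collapse; }
                    table, th, td { border: 1px solid black; }
                </style>
            </head>
            <body>
                <h1>Skyrim</h1>
                <h2>Cast of characters</h2>
                <table>
                    <tr>
                        <th>Name</th>
                        <th>Faction</th>
                        <th>Alignment</th>
                    </tr>
                    <xsl:apply-templates select="//cast/character"/>
                </table>
                <h2>Factions</h2>
                <table>
                    <tr>
                        <th>Name</th>
                        <th>Alignment</th>
                    </tr>
                    <xsl:apply-templates select="//cast/faction"/>
                </table>
            </body>
        </html>
    </xsl:template>
    <!-- One row per character or faction, with cells in a fixed order. -->
    <xsl:template match="character | faction">
        <tr>
            <xsl:apply-templates select="@id, @loyalty, @alignment"/>
        </tr>
    </xsl:template>
    <!-- One cell per attribute. -->
    <xsl:template match="@*">
        <td>
            <xsl:value-of select="."/>
        </td>
    </xsl:template>
</xsl:stylesheet>
```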
There’s one more conspicuous repetition: for both characters and factions we create
an <h2> header and a table with a row of labels above the actual table data. There is
the further complication that the column labels are title-cased versions of the
attribute names (e.g., the attribute @alignment goes in a column labeled
“Alignment”), except that we want the first column to be headed “Name”, and not “Id”,
and we have to specify that because it can’t be computed without more information. We
also cannot compute the content of the <h2> headers automatically because there is no
natural way for XSLT to know, unless we specify it, that <character> elements go in a
table labeled “Cast of characters” and <faction> elements go in a table labeled
“Factions”. Finally, there isn’t a particularly natural way for XSLT to look at the
contents of the <cast> element and know to create one table for all <character>
elements and one for all <faction> elements.
In Real Life we would probably stop here because the overhead of trying to combine the character and faction processing further would offset any savings realized through the consolidation. If, though, we had to create hundreds of tables instead of just two, the consolidation would pay off by removing a lot of repetition, and therefore a lot of unnecessary opportunity for error. Here’s what that might look like:
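The consolidated stylesheet creates the title and <h1> “Skyrim”, a single <h2> whose
content is either “Cast of characters” or “Factions”, and a single header row with
the labels Name, Loyalty (for characters only), and Alignment.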
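The following sketch shows one way such a stylesheet could be written. It is a reconstruction under stated assumptions, not the original listing: the beginning of the @select expression on <xsl:for-each>, the tests in the conditionals, and the sort key lower-case(@id) are our guesses, guided by the discussion of these features in this answer key. It also assumes a processor that supports the XPath 3.1 sort() function, such as a recent release of Saxon.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml" exclude-result-prefixes="#all" version="3.0">
    <xsl:output method="xhtml" html-version="5" omit-xml-declaration="no"
        include-content-type="no" indent="yes"/>
    <!-- Keep a handle on the input document node, because inside the
         for-each over strings there is no tree to navigate from. -->
    <xsl:variable name="root" select="/"/>
    <xsl:template match="/">
        <html>
            <head>
                <title>Skyrim</title>
                <style>
                    table { border-collapse: collapse; }
                    table, th, td { border: 1px solid black; }
                </style>
            </head>
            <body>
                <h1>Skyrim</h1>
                <!-- Assumed expression: names of the children of cast,
                     deduplicated and sorted, giving "character", "faction". -->
                <xsl:for-each select="//cast/* ! name() => distinct-values() => sort()">
                    <h2>
                        <xsl:choose>
                            <xsl:when test=". eq 'character'">Cast of characters</xsl:when>
                            <xsl:otherwise>Factions</xsl:otherwise>
                        </xsl:choose>
                    </h2>
                    <table>
                        <tr>
                            <th>Name</th>
                            <xsl:if test=". eq 'character'">
                                <th>Loyalty</th>
                            </xsl:if>
                            <th>Alignment</th>
                        </tr>
                        <xsl:apply-templates select="$root//cast/*[local-name() eq current()]">
                            <!-- Assumed sort key: the name, compared in lower case. -->
                            <xsl:sort select="lower-case(@id)"/>
                        </xsl:apply-templates>
                    </table>
                </xsl:for-each>
            </body>
        </html>
    </xsl:template>
    <xsl:template match="character | faction">
        <tr>
            <xsl:apply-templates select="@id, @loyalty, @alignment"/>
        </tr>
    </xsl:template>
    <xsl:template match="@*">
        <td>
            <xsl:value-of select="."/>
        </td>
    </xsl:template>
</xsl:stylesheet>
```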
The XSLT above recruits some advanced features that we haven’t seen before:
We use <xsl:for-each> to create a deduplicated and alphabetized list of names of
element types that are children of the cast list. The XPath expression in its @select
attribute, which ends in
=> distinct-values() => sort()
says to select all element children of the cast list and, for each of them, return
the element name. This returns a sequence of six instances of the string “character”
and five instances of the string “faction”. We then remove the duplicates and sort
alphabetically, so the expression eventually returns a sequence of two strings:
“character” followed by “faction”. Because there are two strings in that sequence, we
do everything inside the <xsl:for-each> element twice, once for each of the two
strings.
We use <xsl:for-each> because we’re dealing with strings (element names) and not
nodes in the tree. It is possible to apply templates to strings, but a general rule
of thumb is that we process a sequence of nodes by applying templates to them but we
process a sequence of atomic values (strings or numbers) with <xsl:for-each>. This is
why we’ve advised you not to use <xsl:for-each> so far (you haven’t had to process
sequences of atomic values). When we get to our SVG unit, which is coming next, we’ll
use <xsl:for-each> more often because with SVG we compute a lot of numerical values,
that is, atomic values that are not in the tree. Stay tuned!
Because our <xsl:for-each> will do two things, one for each of the two strings, we
can use the same code to create the <h2> header and the table. We need to customize
the header and the column labels in the table, though, and we use conditional
expressions to do that. We describe conditionals (<xsl:choose> and <xsl:if>) in our
second XSLT tutorial, so we won’t repeat ourselves here, but here is a brief summary:
The <xsl:choose> element provides an “if … then … else” type of functionality. We say
that if we’re creating output for characters the value inside the <h2> should be the
string “Cast of characters”, and otherwise it should be the string “Factions”.
We want a column labeled “Loyalty” only for characters, and not for factions. We use
<xsl:if> for that because <xsl:if> doesn’t have an “else”, so it means “do this for
characters, but otherwise (that is, for factions) do nothing”.
The conditionals here let us create different headers and different column labels for
characters and for factions within a single <h2> element and a single <table>
element. With just two tables the overhead might outweigh any benefit of this
consolidation (that is, we probably wouldn’t write our code this way in Real Life if
we had only two tables to deal with), but the reduction in repetition would begin to
pay off were there more tables.
Inside <xsl:for-each> over a sequence of strings (the element type names are strings
and not nodes in the tree) we’re cut off from the tree, which means that if we try to
apply templates to something that begins //cast we’ll raise an error that a string
has no document node, so we can’t use a path that starts at a document node. In order
to be able to refer to the input XML tree inside the <xsl:for-each>, then, we create
a variable called $root (the name is arbitrary, but we chose this one because it’s
the root of the input tree) and bind the document node of the input XML to that
variable name. We can then start our XPath expression inside the <xsl:for-each> from
there, so that:
$root//cast/*[local-name() eq current()]
means to start from the top of the input XML, find all of its <cast> descendants
(there’s exactly one), and then find all of the children of that <cast> element that
have names that are equal to the current value of the <xsl:for-each> operation. The
function current() is equivalent to the context item, that is, the thing we’re
operating over, which here will be one of the two strings we selected with our
<xsl:for-each>. That value will be the string “character” the first time and
“faction” the second time, so our XPath expression will select the elements we care
about for the particular table that we are creating. We use <xsl:sort> to sort those
elements alphabetically after telling the sort instruction to ignore case differences
by treating all text as if it were lower case.