Author: Janis Chinn (janis.chinn@gmail.com) Maintained by: David J. Birnbaum (djbpitt@gmail.com) Last modified: 2023-03-13T20:36:42+0000
You already know how to mark up (XML), constrain (Relax NG), and navigate (XPath) your documents; XSLT (eXtensible Stylesheet Language Transformations) is one way to transform a document, manipulate the tree, and output the results as XML, HTML, SVG, or plain text. You might use XSLT to generate project pages for display on your site, to create intermediary documents for analysis and development, or to feed pieces of your data into another format for analysis with another tool, one that requires data in a particular format that is different from your main XML structure. Since XSLT is XML-aware, it can use XPath to navigate and manipulate your document, which means that when you use XSLT to implement a transformation (see below), you automatically use XPath within XSLT to find the pieces you want to transform (XPath expressions and XPath patterns) and to manipulate the data (XPath expressions).
An XSLT stylesheet is an XML document that must be valid against the XSLT schema. The root element in this schema is <xsl:stylesheet> and the children of the root are primarily <xsl:template> elements. These template elements typically have a @match attribute that matches an XPath pattern and instructs the computer to use that template to process all matching nodes. For example, a template that matches <p> elements will be used to process <p> elements in the input document.
XSLT is a declarative programming language (unlike most programming languages
with which you are likely to be familiar), which means that part of the way it works is
that the templates don’t get applied from the top of the file to the bottom. What
happens instead is that program execution passes from template to template because an
<xsl:apply-templates>
element inside a template
rule tells the system what to process next. One consequence of this model is that
the order of template rules inside the stylesheet doesn’t matter because
they don’t get applied in that order. Rather, they get applied whenever an
<xsl:apply-templates>
element or the equivalent
specifies that a particular type of node must be processed. As a result, for every
element or other item in your input document that is specified as the target of an
<xsl:apply-templates>
element, if there is a
template anywhere in the stylesheet that matches it, the stylesheet will find it and the
template will fire.
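To make the point about ordering concrete, here is a minimal sketch, assuming a hypothetical input document in no namespace whose <p> elements are to be rewritten as <para> elements (the output element names <result> and <para> are invented for illustration). The template for <p> is deliberately listed first, yet the template for the document node still fires first, because processing begins at the document node and moves on only when <xsl:apply-templates> says so.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
    <!-- Listed first, but fires only when templates are applied to a <p> -->
    <xsl:template match="p">
        <para>
            <xsl:apply-templates/>
        </para>
    </xsl:template>
    <!-- Listed second, but fires first: processing starts at the document node -->
    <xsl:template match="/">
        <result>
            <xsl:apply-templates/>
        </result>
    </xsl:template>
</xsl:stylesheet>
```

Swapping the two templates would not change the output.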
XSLT builds in default rules to handle nodes for which there is no explicit template rule, which means that you have to write your own template rules only where you want something other than the default behavior. The default behavior is that:
If you try to apply templates to an element for which you haven’t created an explicit template, the system will do nothing with the tags for that element and just apply templates to its children, until eventually the only thing left is to output the text.
If you try to apply templates to a text node (that is, to text inside an element), the system will return the text.
For that reason, if your stylesheet contains no templates at all, the built-in templates will do all of the processing. This means that applying a stylesheet with no user-defined templates to a document will output all the plain text in the XML, without any markup. The default behavior will navigate from the document node at the top of the tree all the way down, throwing away markup and outputting text whenever it encounters it. This is rarely what you want; you will normally want to create at least one template rule in your stylesheet.
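The two built-in rules described above behave roughly as if the following templates were present in every stylesheet. This is a simplified sketch for illustration, not something you need to type: the rules are already built in, and the XSLT specification defines them more fully (it also covers attributes, comments, and processing instructions).

```xml
<!-- Elements and the document node: discard the tags, keep processing the children -->
<xsl:template match="*|/">
    <xsl:apply-templates/>
</xsl:template>
<!-- Text nodes: output the text -->
<xsl:template match="text()">
    <xsl:value-of select="."/>
</xsl:template>
```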
A typical stylesheet has the following exoskeleton, which <oXygen/> will generate for you when you create a new XSLT document:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:math="http://www.w3.org/2005/xpath-functions/math"
exclude-result-prefixes="xs math"
version="3.0">
</xsl:stylesheet>
The @version
attribute needs to be set to 3.0, which
should be the default behavior in <oXygen/> (and you can make it the default if it
isn’t already).
If your XML document is in a namespace, you’ll need to tell your stylesheet about the
namespace in order to process it with XSLT. To do this, add an
@xpath-default-namespace
attribute to the root
<xsl:stylesheet>
element and set its value to
the value of the namespace declaration from the input XML file. For example, if you are
transforming a TEI XML document with the following namespace declaration:
<TEI xmlns="http://www.tei-c.org/ns/1.0">
the root <TEI>
element states that all elements
within the document are in the TEI namespace (unless you explicitly say otherwise on
some descendant element). If you were to write a template rule in your XSLT matching
just TEI
and you hadn’t told the system that input elements were in the TEI
namespace, your template wouldn’t be applied because it would match
<TEI>
elements only if they are in no
namespace, whereas the XML declares that the
<TEI>
element is in the
http://www.tei-c.org/ns/1.0
namespace.
If you run a transformation where you have template rules but none of them gets applied, so that you just get plain text in the output, it’s often because of mismatched namespaces. In that situation, no template rules are being applied because they only match elements in no namespace and all of the elements in your input XML are in a namespace, which means that the transformation falls back on the default behavior described above.
To tell your stylesheet always to look for elements in the TEI namespace, the
<xsl:stylesheet>
element should look something
like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:math="http://www.w3.org/2005/xpath-functions/math"
xpath-default-namespace="http://www.tei-c.org/ns/1.0"
exclude-result-prefixes="xs math"
version="3.0">
</xsl:stylesheet>
In the example above the @xpath-default-namespace
attribute has a value equal to the TEI namespace value declared on your XML. If your
input XML is in some other namespace, the value of
@xpath-default-namespace
should match that. If your
input XML is not in a namespace, do not include the
@xpath-default-namespace
attribute because its
presence will cause all templates to match only elements in the declared namespace, and
it will prevent them from matching elements in no namespace.
Should you have input in mixed namespaces (perhaps a TEI document in the TEI namespace that contains embedded SVG in the SVG namespace), see your instructors for guidance about how to deal with it.
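One common approach, sketched here under the assumption of a TEI document that embeds SVG inside a TEI <figure> element, is to keep @xpath-default-namespace for the main namespace and bind a prefix to the second one. The prefix svg, the choice of <figure> as the wrapper, the class value, and the decision to copy the SVG unchanged are all assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:svg="http://www.w3.org/2000/svg"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="svg"
    version="3.0">
    <!-- Unprefixed name: looked up in the TEI namespace -->
    <xsl:template match="figure">
        <div class="figure">
            <xsl:apply-templates/>
        </div>
    </xsl:template>
    <!-- Prefixed name: looked up in the SVG namespace bound to svg: above -->
    <xsl:template match="svg:svg">
        <xsl:copy-of select="."/>
    </xsl:template>
</xsl:stylesheet>
```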
The @xpath-default-namespace
attribute specifies the
namespace of the input XML. If your output is going to be in a namespace (for
example, if you are outputting HTML, which must be in the HTML namespace), you also need
to specify the output namespace. When outputting HTML, the namespace declaration is
http://www.w3.org/1999/xhtml
, so if you are transforming TEI to HTML, your
root <xsl:stylesheet>
element must read:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:math="http://www.w3.org/2005/xpath-functions/math"
xpath-default-namespace="http://www.tei-c.org/ns/1.0"
xmlns="http://www.w3.org/1999/xhtml"
exclude-result-prefixes="xs math"
version="3.0">
</xsl:stylesheet>
Line 6 says that the default namespace for all literal elements that you are creating in your output document is the HTML namespace. Line 5, copied from the example above, says that the namespace for all elements in your input document is the TEI namespace. If your input or output are in no namespace you must omit these declarations, and if they are in other namespaces, you’ll need to use the appropriate namespace values.
<xsl:output>
You should always have an <xsl:output>
element to
control the type and formatting of your output.
<xsl:output>
is a top-level element,
which means it must be a child of the root
<xsl:stylesheet>
element (making it a sibling to
all your template rules, which are also top-level elements).
<xsl:output>
is usually placed at the top of the
document, as a first child of the root
<xsl:stylesheet>
element, because that makes it
easier for humans to find, but as long as it is a child (not a grandchild or other
descendant) of the root element, your document will be valid. Officially,
<xsl:output>
is an optional element, which means
that if it’s omitted you won’t get an error message, and the system will try to guess
the kind of output you want, which can lead to errors if it guesses wrong. At minimum,
<xsl:output>
should have a
@method
attribute. When we create HTML output, we
also set some additional attributes (see below for an example). Here are some
guidelines:
@method
specifies the type of output, and
the most common accepted values are xml
, html
, xhtml
, and
text
. The recommended output method for HTML5 documents that use XML
syntax (that is, the type of HTML5 we write in our course) is xhtml
. The
only other values we use in our own work are xml
and text
(for
plain text output). When we create HTML5 output we also add the following
attributes (see below for a complete example):
html-version="5". HTML5 differs substantially from earlier versions of HTML, and this attribute notifies the XSLT transformation engine that it should output HTML that conforms to HTML5 expectations.
omit-xml-declaration="no". The XML declaration, which is the first line of many XML files, and which reads <?xml version="1.0" encoding="UTF-8"?>, is optional, but we recommend including it to help the processes that deal with your file (e.g., the web server that delivers the file to the user, the browser in which the user reads the file) recognize that it is XML.
include-content-type="no". The content type is represented by a <meta> element that the transformation process will insert automatically into your HTML output unless you tell it not to. This element is needed for HTML5 documents that do not use the XML syntax, but it should be omitted from those that do use the XML syntax, as we do in this course.
The @indent
attribute specifies whether or
not the output should be pretty-printed, that is, indented in a way that wraps
long lines and makes it easy to see the hierarchical structure. We normally set
this to yes
because it makes the output easier for humans to read.
Because this type of indentation works by inserting spaces and new-line
characters, there are some situations where automatic indentation can mess up
your content, and in those situations you can use this attribute to turn off the
indentation. XML (including HTML5) doesn’t normally care about the indentation,
so whether you turn it on or off is just for the convenience of the human who
may need to look at the angle-bracketed output of the transformation. HTML5
output will be wrapped properly in the browser even if you turn off indentation
(although the angle-bracketed view may look like one long line).
For HTML5, then, putting it all together, you should use:
<xsl:output method="xhtml" html-version="5" omit-xml-declaration="no"
include-content-type="no" indent="yes"/>
@match attribute with an XPath pattern
Except in situations you are unlikely to encounter in our course,
<xsl:template>
requires the attribute
@match
, which matches an XPath pattern. An
XPath pattern is not the same as a full XPath expression; it is just a piece of
one, the minimum XPath needed to describe what you want to match. For example, to match
all <p>
elements in the document, write
match="p"
instead of
match="//p"
. In other words, templates don’t
specify where to look for the elements they match; they don’t have to do that because
they sit around waiting for the elements to come to them (courtesy of
<xsl:apply-templates>
or other rules). For that
reason they only have to describe what it is that they match, and not how or where to
find it.
An XPath path expression, which is what we have been practicing in our XPath unit, is evaluated from a current context. In our XPath explorations in <oXygen/>, the current context is the last place we clicked inside the document, and most of the time we’ve been ignoring that and instead beginning our path expressions with a slash, which means “no matter what the current context, start the path at the document node”. The advantage of always starting at the document node is that we don’t have to think about the current context, but that works only for our exploratory data analysis, and it won’t work in an XSLT environment.
A common mistake is to write match="//p"
instead
of match="p"
. The reason this is a mistake,
even though it happens to work, is that it makes your code harder to read because
the leading double slash does not affect the meaning, yet its presence implies that
it must be there for a reason. All you want to specify in the value of a
@match
attribute, or any XPath pattern, is
enough to identify unambiguously the nodes to which you want the template to apply.
If we were transforming our TEI version of Hamlet, for example,
templates that apply only to acts could include
match="body/div"
, templates that apply only to
scenes could include match="div/div"
, and
templates that apply to both acts and scenes, but not to
<div>
elements in the header, could include
match="body//div"
.
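As a sketch, the first two patterns might appear in templates like the following. The HTML <section> output and the class values act and scene are assumptions for illustration; only the @match values come from the discussion above.

```xml
<!-- Acts: <div> elements that are children of <body> -->
<xsl:template match="body/div">
    <section class="act">
        <xsl:apply-templates/>
    </section>
</xsl:template>
<!-- Scenes: <div> elements that are children of another <div> -->
<xsl:template match="div/div">
    <section class="scene">
        <xsl:apply-templates/>
    </section>
</xsl:template>
```

Neither pattern begins with a slash: each one only describes the nodes it matches, and an <xsl:apply-templates> element elsewhere is responsible for delivering those nodes.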
We find it most helpful to read XPath expressions from the left, path step by path step, because each step specifies the current context(s) for the next step. An XPath expression like //body/div, then, means “start at the document node, find all <body> elements on its descendant axis, and then, for each <body> element, find all <div> elements on its child axis”.
We find it most helpful to read XPath patterns from the right. For example, an XPath pattern like body/div means “find all <div> elements that are children of <body> elements”. Reading from the right helps us avoid thinking that we have to navigate to the leftmost component of the pattern first. We don’t have to do that because XPath patterns match, but they don’t traverse.
Where we use XPath expressions and where we use XPath patterns is specified by the
languages that use XPath, and is not up to us. In XSLT, the value of the
@match
attribute is defined as an XPath
pattern, and the value of the @select
attribute
on <xsl:apply-templates>
(see below) is
defined as an XPath expression, for which the current context item is the item that
the @match
attribute matched. If
@match
matches multiple items (for example, if
it matches acts and there are multiple acts in a play), the rule fires once for each
of them, and only one of them will be the current context item at a given moment in
the transformation process.
As the examples above illustrate, by varying the completeness of the pattern, you can get
more or less specific about how to handle, say,
<p>
elements in different parts of the XML tree.
If you want to treat <p>
elements inside a
<chapter>
differently from
<p>
elements inside an
<introduction>
, you can create separate
templates with match="chapter/p"
and
match="introduction/p"
, with as little context as
you can get away with to specify the difference. But you don’t need (= shouldn’t have) a
full path; your XPath pattern must be the simplest pattern that will match what you
want to match. Most of your stylesheets will consist of
<xsl:template>
elements for each type of element
that might arise in your input document (unless the built-in behavior, described above,
which applies if there is no template, already does what you want, in which case you
should not create an explicit template just to mimic that behavior).
Most (if not all) stylesheets you’ll write in this course will begin with a template
matching the document node, which is both the (generally invisible) parent of
the root element and the uppermost node in the hierarchy of every XML document. When an
XSLT stylesheet is applied to an XML document, the system always starts at the document
node when looking for templates to apply. To match the document node, use the XPath
pattern /
. Any instructions that should fire only
once to create the superstructure for your output will typically be created inside this
template, and you’ll need at least one
<xsl:apply-templates>
element in order to
interact with the lower branches of your tree. If you’re planning on outputting HTML,
the template that matches the document node is the place to create your HTML
superstructure, and within this superstructure you’ll want to include, typically, an
<xsl:apply-templates>
element that tells the
processor how to build the HTML output inside that superstructure. For example, a
typical XML-to-HTML transformation might start with code like:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:math="http://www.w3.org/2005/xpath-functions/math"
xpath-default-namespace="http://www.tei-c.org/ns/1.0"
xmlns="http://www.w3.org/1999/xhtml"
exclude-result-prefixes="xs math"
version="3.0">
<xsl:output method="xhtml" html-version="5" omit-xml-declaration="no"
include-content-type="no" indent="yes"/>
<xsl:template match="/">
<html>
<head>
<title>Title goes here</title>
</head>
<body>
<xsl:apply-templates/>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
The template rule above matches the document node, creates the HTML superstructure that will go into the output, and then, inside the HTML <body> element, applies templates to the children of the document node (by default, <xsl:apply-templates> means “apply templates to the children of the node currently being processed”). The only child element of the document node is always the root element of your input XML. Your stylesheet will also include other templates that specify what to do with the various elements of your input XML (see below).
Think of your <xsl:apply-templates>
elements as
place-holders that mark where to output the results of applying the templates they call.
For example, any content you want to appear immediately inside the HTML
<body>
element that you’re creating can be
placed correctly by putting the
<xsl:apply-templates>
element between the
<body>
start- and end-tags.
By default, <xsl:apply-templates> means “apply templates to all child nodes (elements and text) of the current context, that is, the node currently being processed”. You are not restricted to processing only child nodes, though; <xsl:apply-templates> optionally takes a @select attribute, which tells the system what nodes to apply templates to. The value of @select is a full XPath expression and will start from the current context, that is, from whatever node is being processed at the time. For example, if you are transforming TEI to HTML and the only XML you want to process is in the <teiHeader>, you can replace the general <xsl:apply-templates/> with <xsl:apply-templates select="//teiHeader"/>. If @select is omitted, the system will default to applying templates to all children of the current context node. That this is the default behavior means that you don’t need to (= shouldn’t) specify @select if what you want to select is all of the children of the current context.
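Because the value of @select is evaluated from the current context, a relative path is often enough. As a sketch, assuming a TEI input and a stylesheet that declares the TEI namespace with @xpath-default-namespace, the following template processes only the header by selecting the <teiHeader> child of the <TEI> element being processed:

```xml
<xsl:template match="TEI">
    <!-- Relative path: evaluated from the <TEI> element that this template matched -->
    <xsl:apply-templates select="teiHeader"/>
</xsl:template>
```

The other children of <TEI> are never selected here, so they produce no output.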
Any elements you want to handle specially (that is, for which the built-in behavior is
not what you want) will need their own template rules. Remember, though, that templates
fire every time the system encounters a matching node in the XML, so if you want an
element to be created once (for instance, the
<html>
element), it should go within a template
that matches a node that only appears once (for instance,
/
). If you’re generating HTML
<p>
elements, on the other hand, you’ll need
those to be inside a template that will fire many times because you want to generate
many <p>
elements, not one giant
<p>
element which contains the text of all the
paragraphs. Similarly, if you are creating an HTML table with a lot of rows, you
typically want only one table, so you should create that directly inside the
<body>
element and then create the
<tr>
elements for the rows in a template that
fires once for each row you want to create. If, say, you want to create one table row
for each <character>
element in your input, your
XSLT will probably look something like:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:math="http://www.w3.org/2005/xpath-functions/math"
xpath-default-namespace="http://www.tei-c.org/ns/1.0"
xmlns="http://www.w3.org/1999/xhtml"
exclude-result-prefixes="xs math"
version="3.0">
<xsl:output method="xhtml" html-version="5" omit-xml-declaration="no"
include-content-type="no" indent="yes"/>
<xsl:template match="/">
<html>
<head>
<title>Title goes here</title>
</head>
<body>
<table>
<tr>
<!-- header row with <th> elements to label the columns -->
</tr>
<xsl:apply-templates select="//character"/>
</table>
</body>
</html>
</xsl:template>
<xsl:template match="character">
<tr>
<!-- apply other templates to create the cells in the table row
for a particular character-->
</tr>
</xsl:template>
</xsl:stylesheet>
Note that you create only one <table>
, so you do
that inside a template that fires just once, the template that matches the document
node. You create one row for each character, though, so you create your
<tr>
elements inside a template that matches
<character>
elements, and therefore fires once
for each <character>
element. The only
<tr>
that gets created inside the template rule
for the document node is the one that labels the columns, since you want just one of
those.
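One way the row template might be completed, assuming (hypothetically) that each <character> element has <name> and <role> children; those two element names are invented here and do not come from any particular schema:

```xml
<xsl:template match="character">
    <tr>
        <!-- One cell per piece of information about this character -->
        <td>
            <xsl:apply-templates select="name"/>
        </td>
        <td>
            <xsl:apply-templates select="role"/>
        </td>
    </tr>
</xsl:template>
```

Each @select here is a relative path, so it picks up only the children of the <character> element that the template is currently processing.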
<xsl:apply-templates> vs. <xsl:value-of>
Sometimes you get the content of an element or attribute by applying templates to it and
sometimes you use <xsl:value-of>
. The difference
between <xsl:apply-templates>
and
<xsl:value-of>
is that, as far as nodes are
concerned, <xsl:value-of>
can return only a text
node, that is, plain text. We often use
<xsl:value-of>
to return constructed textual
values. For example, if we want to create a table cell that contains the number of
speeches by Hamlet, that number is a value that we construct by using the
count()
function, along the lines of:
<td><xsl:value-of select="count(//sp[@who eq 'Hamlet'])"/></td>
.
The output created by <xsl:value-of>
is always a
text node, and it represents a dead end in the XML tree insofar as it cannot contain
markup (because it is a single text node) and it isn’t on the tree (because you
constructed it). This means that you cannot apply templates to its children (it doesn’t
have any) or anything else on an axis from it (because it isn’t on a tree and therefore
doesn’t have axes). If, for example, you are processing a paragraph node tagged as
<p>
,
<xsl:value-of select="."/>
will return the
textual value of the paragraph, throwing away any internal markup. If you want to
process that internal markup (for example, if the paragraph contains titles or foreign
words or emphasis or anything that should be processed separately),
<xsl:value-of>
will make it impossible to
process those elements, and all you’ll get is their textual content, as if they weren’t
marked up in the first place. If, on the other hand, the paragraph has no internal
markup, there is no difference in behavior between
<xsl:apply-templates>
and
<xsl:value-of>
. As a rule of thumb:
Use <xsl:apply-templates>
when processing
an element unless there is good reason to do otherwise. If there’s no difference
(perhaps because you’re processing an element that just contains plain text),
this will do what you want. Where there is a difference, though, using
<xsl:value-of>
will throw away internal
markup without processing it, which is rarely what you want.
Use <xsl:value-of> when outputting an atomic value, such as the value of an XPath function that returns a string or a number.
There is no difference between applying templates to an attribute node and asking for its value because attribute nodes cannot contain markup. Use whichever option you find easiest to understand.
Both <xsl:apply-templates>
and
<xsl:value-of>
can take a
@select
attribute to specify what should be
processed. That attribute is optional with
<xsl:apply-templates>
; as we noted above, if you
don’t use @select
, you will apply templates to all
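of the child nodes of the current context, whatever they may be. With <xsl:value-of>, though, you will almost always supply a @select attribute (it was obligatory in XSLT 1.0; later versions also allow the value to be constructed by the content of the element instead). For example,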
<xsl:value-of select="."/>
will output the
string value of the current context node, throwing away any markup.
<xsl:value-of select="string-length(.)"/>
will
output a single integer, representing a count of the number of characters in the current
context node.
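The two instructions can be combined in one template. In this sketch, which assumes input <p> elements like those in the examples above and uses a hypothetical class value of length, <xsl:apply-templates> processes the paragraph content (so any internal markup still reaches its own templates) and <xsl:value-of> outputs a constructed number:

```xml
<xsl:template match="p">
    <p>
        <!-- Process the child nodes so that internal markup is not lost -->
        <xsl:apply-templates/>
        <!-- Output an atomic value: the number of characters in the paragraph -->
        <span class="length">
            <xsl:value-of select="string-length(.)"/>
        </span>
    </p>
</xsl:template>
```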
XSLT usually does The Right Thing when it is outputting just elements or just plain text,
but mixed-content output (that is, a mixture of elements and plain text) can lead to
awkward white-space handling. You can avoid having to worry about the intricacies of
XSLT white-space handling by applying the following rule of thumb: when you are
outputting mixed content, wrap all plain text in
<xsl:text>
tags. For example, instead
of writing:
<xsl:template match="book">
<item>
<cite>
<xsl:apply-templates select="title"/>
</cite>
by
<xsl:apply-templates select="author"/>
</item>
</xsl:template>
you should use:
<xsl:template match="book">
<item>
<cite>
<xsl:apply-templates select="title"/>
</cite>
<xsl:text> by </xsl:text>
<xsl:apply-templates select="author"/>
</item>
</xsl:template>
By way of illustrating a complete transformation, here are a sample XML document (whose content you may recognize from the first week of class) and a sample XSLT stylesheet to transform the XML into HTML for publication on the web.
<letter>
<head>
<context>The following letter was written shortly after Wilde’s
release from prison:</context>
</head>
<content>
<dateline>
<location>Rouen</location>,
<date>
<month>August</month>
<year>1897</year>
</date>
</dateline>
<salutation><person type="recipient">My own Darling Boy</person>,</salutation>
<body>
<p>I got your telegram half an hour ago, and just send a line to say that
I feel that my only hope of again doing beautiful work in art is being
with you. It was not so in the old days, but now it is different, and you
can really recreate in me that energy and sense of joyous power on which
art depends.</p>
<p>Everyone is furious with me for going back to you, but they don’t
understand us. I feel that it is only with you that I can do anything at
all. Do remake my ruined life for me, and then our friendship and love
will have a different meaning to the world.</p>
<p>I wish that when we met at <location>Rouen</location> we had not parted at
all. There are such wide abysses now of space and land between us. But we
love each other.</p>
</body>
<valediction>Goodnight, dear. Ever yours, <person type="sender">Oscar</person>
</valediction>
</content>
</letter>
The XML is pretty straightforward. The root element is
<letter>
, which has two children, a
<head>
and a
<content>
element, and the latter contains the
body of the letter and the rest of the elements. Locations within the text are tagged,
but for the sake of simplicity and brevity, the sender and recipient are tagged only in
the salutation and valediction, as <person>
elements. (That is, the personal pronouns that refer to them in the body of the letter
are not tagged.)
Our sample output will be an HTML document that does not include any information from the
<head>
element; it outputs our paragraphs as
HTML paragraphs and italicizes all persons and locations. The result of the
transformation can be seen below:
Rouen, August 1897
My own Darling Boy,
I got your telegram half an hour ago, and just send a line to say that I feel that my only hope of again doing beautiful work in art is being with you. It was not so in the old days, but now it is different, and you can really recreate in me that energy and sense of joyous power on which art depends.
Everyone is furious with me for going back to you, but they don’t understand us. I feel that it is only with you that I can do anything at all. Do remake my ruined life for me, and then our friendship and love will have a different meaning to the world.
I wish that when we met at Rouen we had not parted at all. There are such wide abysses now of space and land between us. But we love each other.
Goodnight, dear. Ever yours, Oscar
Our stylesheet that performs this transformation is below, followed by a discussion of how it works:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:math="http://www.w3.org/2005/xpath-functions/math"
xmlns="http://www.w3.org/1999/xhtml"
exclude-result-prefixes="xs math"
version="3.0">
<xsl:output method="xhtml" html-version="5" omit-xml-declaration="no"
include-content-type="no" indent="yes"/>
<xsl:template match="/">
<html>
<head>
<title>Oscar Wilde Letter 2</title>
</head>
<body>
<xsl:apply-templates select="//content"/>
</body>
</html>
</xsl:template>
<xsl:template match="dateline">
<h4>
<xsl:apply-templates/>
</h4>
</xsl:template>
<xsl:template match="location|person">
<em>
<xsl:apply-templates/>
</em>
</xsl:template>
<xsl:template match="p|salutation|valediction">
<p>
<xsl:apply-templates/>
</p>
</xsl:template>
</xsl:stylesheet>
Lines 1–7 are created by <oXygen/> when you tell it to create a new XSLT stylesheet. The only part that we’ve added is the HTML namespace declaration on line 5, so that all output will be in the HTML namespace:
xmlns="http://www.w3.org/1999/xhtml"
Lines 10–19 set up our HTML superstructure (we’ve added a
<title>
, which will show up in the browser tab,
but not in the browser window), populating our
<body>
element with the results of applying
templates to all <content>
elements wherever
they appear. There’s only one <content>
element,
and no template for <content>
, so the system
falls back on the default behavior and applies templates to all of its children. (Note
that we never apply templates to <head>
or
<context>
, so they will not be output at all in
our result document.)
The children of <content>
are
<dateline>
,
<salutation>
,
<body>
, and
<valediction>
, and we have templates for all of
those except <body>
. That means that we’re
relying on the default behavior for <body>
,
which is, again, to apply templates to its children. The
<dateline>
element, whose template is on lines
20–24, will process the contents of the <dateline>
element and output the results inside an HTML
<h4>
element. There’s no
@select
attribute on the
<xsl:apply-templates>
here, so the system will
apply templates to all children of the element (there are three: the
<location>
element, the text node after it that
contains a comma and some white space, and the
<date>
element). We don’t have template rules
for the second and third of these, so the built-in rules will take care of them; the
<location>
element is processed by the template
on lines 25–29, which outputs the content wrapped in an
<em>
element (typically rendered as italics in
the browser). The @match
attribute in this template
rule uses the union operator (|
) to say that the
same template matches both <location>
and
<person>
elements.
Using the HTML <em>
element to italicize
locations, as we do above, is not good practice because HTML elements have semantics
(that is, meanings). The <em>
element is
appropriate only for emphasized text, and there is nothing emphatic about a
placename. Using HTML elements in a semantically incorrect way in order to achieve a
rendering effect is called tag abuse, and it should be avoided. It would
be more correct to tag the locations by wrapping them in
<span class="location">
and then using a CSS
rule like span.location { font-style: italic; }
to
italicize them.
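A sketch of that alternative, assuming the CSS rule above is supplied separately (for example, in a CSS file linked from the HTML <head>). The template below would take over the handling of <location> in the sample stylesheet, and the existing template would then need to match only person:

```xml
<xsl:template match="location">
    <!-- The class value records what the text is; CSS decides how it looks -->
    <span class="location">
        <xsl:apply-templates/>
    </span>
</xsl:template>
```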
The template on lines 30–34 matches three different types of elements:
<p>
elements,
<salutation>
elements, and
<valediction>
elements. For all three, it
outputs the contents inside an HTML <p>
element.
This way any <p>
,
<salutation>
, and
<valediction>
element in the input XML will
become an HTML paragraph in our output. Since we again applied templates without a @select attribute, we again revert to the default behavior of applying templates to all child nodes (elements and text) of any <p>, <salutation>, or <valediction> element. Those child nodes (and their children, all the way down the tree) will be processed by whatever templates match them, whether those are templates that we wrote or built-in default templates.
Here are a few key features of how XSLT works:
Conditionals are rare. Our XSLT has no conditional testing. XPath has an if … then … else construction and there is an XSLT element called <xsl:if>, but we don’t use those in XSLT as much as you might in other programming languages. The reason is that because templates match specific element types (or other components of the input document), they build in the meaning of “if you should happen to encounter this type of element, here’s what to do with it”.
The order of templates doesn’t matter. Our XSLT does not apply from top to bottom. XSLT templates describe what should happen when certain element types are encountered, but in the example above they say little about the order in which those elements will appear in the input document. This is also part of the “if you should happen to encounter this type of element, here’s what to do with it” nature of declarative programming.
XSLT handles mixed content without knowing what it contains. XML documents used in humanities research often include elements that contain mixed content, that is, an unpredictable combination of text nodes and element nodes. This is illustrated by the unpredictable appearance of personal and place names mixed in with text in the Wilde letter, above. The template in our stylesheet that matches a salutation creates a paragraph in the output, and inside the paragraph says to process all children of the input <salutation>, whatever they might happen to be. The XSLT doesn’t have to know where there are text nodes or element nodes (of a particular type) or the order in which they occur; it just says “process whatever you find in the order in which it appears”.
This division of labor, where <xsl:apply-templates/> says “process all children of the current context node without having to know anything about them”, while templates know what to do with different types of nodes but don’t know where they appear in the input XML, is what enables XSLT to process varied text without having to include tests for the type of node that might come next. Applying templates to all children of the current context cooperates with template elements elsewhere to ensure that everything is processed the way you want as it arises.
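The first point above can be sketched with the Wilde letter. Suppose, hypothetically, that senders and recipients should be styled differently. Rather than writing one template for <person> that tests @type with <xsl:if>, you could write two templates whose patterns carry the test in a predicate (the class values recipient and sender are invented for illustration):

```xml
<!-- Fires only for <person> elements whose @type is "recipient" -->
<xsl:template match="person[@type eq 'recipient']">
    <span class="recipient">
        <xsl:apply-templates/>
    </span>
</xsl:template>
<!-- Fires only for <person> elements whose @type is "sender" -->
<xsl:template match="person[@type eq 'sender']">
    <span class="sender">
        <xsl:apply-templates/>
    </span>
</xsl:template>
```

Because a pattern with a predicate is more specific than a bare element name, these rules would take precedence over a general template for <person> in the same stylesheet.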