Maintained by: David J. Birnbaum (djbpitt@gmail.com) Last modified: 2021-12-27T22:03:43+0000
In class we began to develop an XSLT stylesheet to convert an XML version of Hamlet into a table listing the number of speeches in each act in the play.
In order to create an HTML table, you need to know that a table in HTML is a <table> element
that contains one <tr> (table row) element for each row of the table. Each cell in a row is a
<td> (table data) element for regular rows and a <th> (table header) element for the header
row. We specify that we want a thin border around each cell in the table by creating a @border
attribute on the <table> element and setting its value to 1. (This isn’t the best way to
specify this formatting feature, and in Real Life we would use CSS. We’ve taken a shortcut here
to avoid the overhead of introducing CSS at a time when we want you to concentrate on learning
XSLT.) You can read more about HTML tables at http://www.w3schools.com/html/html_tables.asp.
The desired output will look like:
Act | Speeches |
---|---|
Act 1 | 251 |
Act 2 | 201 |
Act 3 | 249 |
Act 4 | 179 |
Act 5 | 257 |
and the underlying raw HTML looks like:
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
        <title>Speeches per act in Hamlet</title>
    </head>
    <body>
        <table border="1">
            <tr>
                <th>Act</th>
                <th>Speeches</th>
            </tr>
            <tr>
                <td>Act 1</td>
                <td>251</td>
            </tr>
            <tr>
                <td>Act 2</td>
                <td>201</td>
            </tr>
            <tr>
                <td>Act 3</td>
                <td>249</td>
            </tr>
            <tr>
                <td>Act 4</td>
                <td>179</td>
            </tr>
            <tr>
                <td>Act 5</td>
                <td>257</td>
            </tr>
        </table>
    </body>
</html>
Here is the completed XSLT:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0"
    xmlns="http://www.w3.org/1999/xhtml">
    <xsl:output method="xhtml" indent="yes"/>
    <xsl:template match="/">
        <html>
            <head>
                <title>Speeches per act in Hamlet</title>
            </head>
            <body>
                <table border="1">
                    <tr>
                        <th>Act</th>
                        <th>Speeches</th>
                    </tr>
                    <xsl:apply-templates select="//body/div"/>
                </table>
            </body>
        </html>
    </xsl:template>
    <xsl:template match="div">
        <tr>
            <td>
                <xsl:apply-templates select="head"/>
            </td>
            <td>
                <xsl:value-of select="count(.//sp)"/>
            </td>
        </tr>
    </xsl:template>
</xsl:stylesheet>
The superstructure for creating an XSLT stylesheet to transform a TEI document, such as our
version of Hamlet, into XHTML is described in XSLT Assignment #1. Once that’s all in place
here, the heavy lifting is done by two template rules, one that matches the document node (/)
and one that matches acts (div).
An XSLT transformation always begins at the document node, so the template for that is the
first to fire. It builds the HTML document with the <table> element inside the <body>, and
inside the <table> tags it creates the header row. In the eventual output, you want to insert
one table row for each act in the play, so you tell the stylesheet where to put those rows by
inserting an instruction in that place, just below the header row, that reads
<xsl:apply-templates select="//body/div"/>. The <xsl:apply-templates> instruction means “go
look for a template rule to take care of whatever I’m selecting here,” and the @select
attribute selects the acts (<div> elements directly under the <body>). Where the rows for the
acts will be created depends on where you put this instruction, so you need to put it where
you want those rows to appear.
<xsl:apply-templates> tells the system to round up everything specified by the value of the
@select attribute, which in this case is the five acts, and then process them. The stylesheet
processes them by looking for a template that knows what to do with them. Since we have a
template that knows what to do with a <div> element (it’s the template that says
match="div"), it will do the processing. That template rule will fire five times, once for
each of the acts that were collected and passed to it by the <xsl:apply-templates> element in
the template rule above.
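To make the selection concrete, here is a heavily abbreviated sketch of the kind of input that the path //body/div assumes. The nesting (acts as <div> children of <body>, scenes as <div> children of acts, speeches as <sp> elements) follows the description on this page. The @type attributes, the <speaker> and <l> markup, and the omission of everything else (including the TEI header, so this is well-formed but not valid TEI) are illustrative assumptions, not a copy of the actual Hamlet file.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <!-- Selected by //body/div: an act is a <div> child of <body> -->
      <div type="act">
        <head>Act 1</head>
        <!-- Not selected: a scene is a <div> child of an act, not of <body> -->
        <div type="scene">
          <head>Act 1, Scene 1</head>
          <sp>
            <speaker>Bernardo</speaker>
            <l>Who's there?</l>
          </sp>
          <sp>
            <speaker>Francisco</speaker>
            <l>Nay, answer me: stand, and unfold yourself.</l>
          </sp>
        </div>
      </div>
      <!-- Also selected by //body/div -->
      <div type="act">
        <head>Act 2</head>
        <!-- scenes and speeches omitted in this sketch -->
      </div>
    </body>
  </text>
</TEI>
```

Run against this two-act sample, the template rule for <div> would fire twice, once per act, rather than the five times it fires on the full play.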
The template rule for <div> elements fires five times, once for each act, and creates a table
row for that act. Inside the row it creates two cells. The first cell holds the title of the
act, which the system retrieves by applying templates to (that is, processing) the <head>
child of the act, which is the title of the act. The second cell holds a count of the speeches
in that act. The current context (for XPath purposes) inside a template rule is the node that
the rule is currently processing, that is, the node that was matched when the template was
called. That means that each time this template rule for an act fires, the current context
will be the particular act that is being processed. This means that
<xsl:apply-templates select="head"/> will process the <head> child of that particular act (a
different act each time the rule fires), and in the count(.//sp) function the dot (.) refers
to the current context, that is, the current act, and therefore retrieves only the speeches
that are descendants of that individual act.
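The effect of the leading dot can be seen by putting a relative and an absolute path side by side. The following is a minimal sketch for experimentation, not the class solution: the third column and its header text are additions made here purely for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0"
    xmlns="http://www.w3.org/1999/xhtml">
    <xsl:output method="xhtml" indent="yes"/>
    <xsl:template match="/">
        <html>
            <head>
                <title>Speeches per act in Hamlet</title>
            </head>
            <body>
                <table border="1">
                    <tr>
                        <th>Act</th>
                        <th>Speeches in this act</th>
                        <!-- Illustrative extra column, not in the original table -->
                        <th>Speeches in the whole play</th>
                    </tr>
                    <xsl:apply-templates select="//body/div"/>
                </table>
            </body>
        </html>
    </xsl:template>
    <xsl:template match="div">
        <tr>
            <td>
                <xsl:apply-templates select="head"/>
            </td>
            <td>
                <!-- Relative path: starts at the current act, so the count differs per row -->
                <xsl:value-of select="count(.//sp)"/>
            </td>
            <td>
                <!-- Absolute path: starts at the document node, so every row gets the same total -->
                <xsl:value-of select="count(//sp)"/>
            </td>
        </tr>
    </xsl:template>
</xsl:stylesheet>
```

Without the dot, //sp ignores the current act and counts every speech in the document, which is why the third column would repeat one number in all rows while the second column changes from act to act.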
Note that we apply templates to <head> elements, but we don’t have a template rule that
matches those elements. XSLT has a built-in rule that says that if you’re applying templates
to an element that contains only plain text and there’s no explicit template rule, by default
you just output the text. Since that’s just what we want to do with <head> elements, we don’t
have to write a rule to do it for us, and we can just rely on the built-in behavior.
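For comparison, the built-in behavior can be spelled out as explicit template rules. The sketch below is not needed in the stylesheet; it only writes down, for <head> elements and text nodes, what the processor already does when no matching rule exists. The choice to restrict the first rule to head (rather than all elements) is made here for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">
    <!-- What the built-in element rule does for <head>: process its children -->
    <xsl:template match="head">
        <xsl:apply-templates/>
    </xsl:template>
    <!-- What the built-in text rule does: output the text itself -->
    <xsl:template match="text()">
        <xsl:value-of select="."/>
    </xsl:template>
</xsl:stylesheet>
```

Because the built-in element rule keeps applying templates to children, a <head> that contained other elements would still have the text inside those elements output, one text node at a time.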
How come the template rule for <div> elements doesn’t process scenes? It would match scenes
because it matches any <div> elements, but it never even knows there are any scenes because
the program flow makes sure that it never sees them. The program grabs control with the
template for the document node, and specifies that only acts should be processed. The template
for <div> elements processes acts, but it never touches the scenes. How would you modify this
stylesheet to count by scene, and not just by act?