Digital humanities


Maintained by: David J. Birnbaum (djbpitt@gmail.com) [Creative Commons BY-NC-SA 3.0 Unported License] Last modified: 2017-06-12T18:13:17+0000


What’s new in XSLT 3.0 and XPath 3.1?

Introduction

This page aims to provide a brief introduction to small but useful enhancements to XPath and XSLT that have emerged since the publication of Michael Kay’s XSLT 2.0 and XPath 2.0 programmer’s reference, 4th edition. Two of the most significant additions to XSLT 3.0, streaming and packaging, are not covered here because, as important as they are for large files or complex transformations, we haven’t found a need for them at the smaller scale on which we usually operate.

Configuring <oXygen/>

To tell <oXygen/> that new XSLT files should default to XSLT 3.0, click on File → New → XSLT → Customize and select 3.0.

XPath 3.0 and 3.1

Variable declaration

XPath 3.0 adds the let expression for declaring variables, which was previously available only in XQuery, so it can now be used in XPath expressions inside XSLT. See immediately below, under Concatenation, for an example.

Concatenation with ||

The string concatenation operator || can be used in situations that previously required the concat() function. For example, the following XPath expression:

let $a := 'hi', $b := 'bye' return $a || ' ' || $b

is equivalent to:

let $a := 'hi', $b := 'bye' return concat($a, ' ', $b)

Simple mapping with the bang operator (!)

The bang operator applies the operation to the right of the bang to each item in the sequence on the left. For example:

('curly', 'larry', 'moe') ! string-length(.)

returns a sequence of three integers: (5, 5, 3). The expression is equivalent to:

for $stooge in ('curly', 'larry', 'moe') return string-length($stooge)

The simple mapping operator is similar to /, except that 1) the expression to the left of / must evaluate to a sequence of nodes, while the sequence to the left of ! can contain items of any kind, and 2) / returns its resulting nodes in document order with duplicates eliminated, while ! performs no sorting or deduplication.
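The contrast with / can be seen in a small stylesheet. This is only an illustrative sketch of our own (the values are made up), meant to be run from the xsl:initial-template entry point; it maps over atomic values, which / would not accept on its left side, and it shows that ! keeps duplicates and the original order.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:template name="xsl:initial-template">
        <!-- ! accepts atomic values on the left; (1, 2, 3) / (. * 2) would raise a type error -->
        <xsl:value-of select="(1, 2, 3) ! (. * 2)"/>
        <xsl:text>&#x0a;</xsl:text>
        <!-- ! maps each left-hand item once, in sequence order, without removing duplicates -->
        <xsl:value-of select="('b', 'a', 'b') ! upper-case(.)"/>
    </xsl:template>
</xsl:stylesheet>
```

The first line of output should be 2 4 6 and the second B A B, with both occurrences of B retained.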

Function chaining with the arrow operator (=>)

The arrow operator pipes the output of the item on the left into the first argument of the function on the right. It thus provides an alternative to nested parentheses. For example (from the XPath 3.1 spec, §3.16):

tokenize((normalize-unicode(upper-case($string))),"\s+")

is equivalent to:

$string => upper-case() => normalize-unicode() => tokenize("\s+")

The functionality of the bang and arrow operators overlaps only where the operation on the right is a function call. For example:

$book ! (@author, @title)

returns the @author and @title attributes of the element that is the value of the variable $book, but because the operation on the right is not a function call, replacing the bang with the arrow operator raises an error. The arrow operator also does not use the dot to specify the first argument to the function, because the operator supplies that argument itself.

Because the bang operator is a mapping and the arrow operator is a pipe, the following two expressions produce different results:

'curly larry moe' => tokenize('\s+') => count()

The preceding returns the integer value 3. But

'curly larry moe' ! tokenize(.,'\s+') ! count(.)

returns a sequence of three instances of the integer value 1. The difference is that after tokenize() returns a sequence of three items, the bang operator maps each item individually as the input to the count() function, while the arrow operator counts the items in the sequence.

unparsed-text-lines()

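unparsed-text-lines() works like unparsed-text(), except that instead of returning the whole file as a single string it returns a sequence of strings, one per line, with the line endings removed. This makes it convenient to process plain-text input line by line.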
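As a minimal sketch of how this might be used, the following stylesheet numbers the lines of a plain-text file. The file name stooges.txt and the parameter name input are hypothetical, invented for this illustration; a relative file name is resolved against the location of the stylesheet.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <!-- Hypothetical input file; replace with the path to a real text file -->
    <xsl:param name="input" select="'stooges.txt'"/>
    <xsl:template name="xsl:initial-template">
        <xsl:for-each select="unparsed-text-lines($input)">
            <!-- position() is the line number; . is the line without its line ending -->
            <xsl:value-of select="position() || ': ' || . || '&#x0a;'"/>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
```

Because each line arrives as a separate string, there is no need to call tokenize() on the file contents first.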

Maps

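The following example creates a map and then serializes it as JSON on output. The curly braces inside <para> are a value template of the kind described under Content Value Templates below, so they are evaluated only if @expand-text is enabled on the stylesheet: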

<xsl:variable name="mymap" as="map(*)"
    select='map {
        "Su" : "Sunday",
        "Mo" : "Monday",
        "Tu" : "Tuesday",
        "We" : "Wednesday",
        "Th" : "Thursday",
        "Fr" : "Friday",
        "Sa" : "Saturday"
    }'/>
<xsl:template match="/">
    <root>
        <text>Hi, Mom! Here’s some information:</text>
        <para>{
            serialize($mymap, map{"method":"json","indent":true()})
        }</para>
    </root>
</xsl:template>

$stuff?row ...
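The fragment above is only a stub, so the following is our own minimal sketch of the ? lookup operator and two equivalent ways of retrieving a value from a map. The variable name days and its contents are assumptions made for this illustration, and are not a reconstruction of the $stuff example.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    exclude-result-prefixes="map">
    <xsl:output method="text"/>
    <!-- Hypothetical two-entry map used only for this sketch -->
    <xsl:variable name="days" as="map(*)"
        select='map { "Su" : "Sunday", "Mo" : "Monday" }'/>
    <xsl:template name="xsl:initial-template">
        <!-- Lookup operator: ? followed by the key -->
        <xsl:value-of select="$days?Mo"/>
        <xsl:text>&#x0a;</xsl:text>
        <!-- A map can also be called like a function, with the key as its argument -->
        <xsl:value-of select="$days('Su')"/>
        <xsl:text>&#x0a;</xsl:text>
        <!-- map:get() from the standard map function library -->
        <xsl:value-of select="map:get($days, 'Mo')"/>
        <xsl:text>&#x0a;</xsl:text>
        <!-- Number of keys in the map -->
        <xsl:value-of select="map:keys($days) => count()"/>
    </xsl:template>
</xsl:stylesheet>
```

The ? form works with a bare key only when the key is a valid name, as here; other keys can be looked up with the function-call form or with map:get().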

Arrays

Add stuff here

XSLT 3.0

Boolean values

In XSLT attributes that take a Boolean value, which previously accepted only yes or no, the value can now be expressed as any of true/1/yes or false/0/no. For example, to turn on pretty-printed output, set the value of the @indent attribute of <xsl:output> to any of true, 1, or yes.
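A minimal sketch of the alternative spelling, using made-up element names, might look like this:

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- indent="true" is equivalent to indent="yes" or indent="1" -->
    <xsl:output method="xml" indent="true"/>
    <xsl:template name="xsl:initial-template">
        <root>
            <child>text</child>
        </root>
    </xsl:template>
</xsl:stylesheet>
```

Changing the value to false, 0, or no turns indentation off again.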

Starting from a named template

If you set the value of the @name attribute of an <xsl:template> element to xsl:initial-template and run a transformation from the command line with the -it (= ‘initial template’) switch, the processor now starts with that template by default. Previously you had to specify the name of your initial template on the command line.

Content Value Templates

Like Attribute Value Templates, Content Value Templates (which the XSLT 3.0 specification calls text value templates) let you specify that certain text should be interpreted as XPath instead of being output literally. The syntax for CVTs is the same as for AVTs: surround the expression in curly braces (to output a literal curly brace, double it), and multiple values are output with a single space between them. CVTs work only where an @expand-text attribute with a positive Boolean value is in effect; the simplest way to enable them throughout a stylesheet is to set that attribute on the root <xsl:stylesheet> element. CVTs are similar to the use of curly braces in XQuery to switch from XML mode into XQuery mode, and they can be used in situations where you may previously have had to use <xsl:value-of> or something that converts its arguments to strings, like concat() or ||. Here’s an example:

<xsl:template name="xsl:initial-template">Hello, World! It’s {current-time()}</xsl:template>

The preceding is equivalent to:

<xsl:template name="xsl:initial-template">
    <xsl:text>Hello, World! It’s </xsl:text>
    <xsl:value-of select="current-time()"/>
</xsl:template>

or

<xsl:template name="xsl:initial-template">
    <xsl:value-of select="concat('Hello, World! It’s ', current-time())"/>
</xsl:template>

or

<xsl:template name="xsl:initial-template">
    <xsl:value-of select="'Hello, World! It’s ' || current-time()"/>
</xsl:template>

@item-separator

The @item-separator attribute on <xsl:output> can be used to change the separator that is output between items from the default single space to something else. It must be combined with @build-tree="no".
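The following is a minimal sketch of the two attributes working together; the sequence of names is just sample data for the illustration.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- build-tree="no" keeps the result as a sequence of items, so the separator can apply -->
    <xsl:output method="text" build-tree="no" item-separator=", "/>
    <xsl:template name="xsl:initial-template">
        <!-- Three separate string items are returned as the result -->
        <xsl:sequence select="('curly', 'larry', 'moe')"/>
    </xsl:template>
</xsl:stylesheet>
```

The intent is for the three strings to be serialized with a comma and a space between them rather than a single space.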

Shadow attributes

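A shadow attribute has the same name as a regular attribute on an XSLT element, but with a leading underscore (for example, _href for @href). Its value is treated as an attribute value template that is evaluated statically, when the stylesheet is compiled, and the result is used in place of the value of the regular attribute.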
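As a minimal sketch, the stylesheet below uses a shadow attribute to switch indentation on or off at compile time. The static parameter name debug is hypothetical, chosen for this illustration.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
    <!-- Hypothetical static parameter; static="yes" makes it available at compile time -->
    <xsl:param name="debug" as="xs:boolean" select="false()" static="yes"/>
    <!-- _indent shadows indent: the expression in braces is evaluated statically -->
    <xsl:output method="xml" _indent="{if ($debug) then 'yes' else 'no'}"/>
    <xsl:template name="xsl:initial-template">
        <root>
            <child>text</child>
        </root>
    </xsl:template>
</xsl:stylesheet>
```

Only static expressions can be used here, which is why the parameter is declared with static="yes".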

Variables and functions

Functions can be assigned to a variable. To call a function that is stored in a variable, add a parenthesized argument list after the variable name (for example, $f(2)).
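The following minimal sketch assigns an inline function and a reference to a built-in function to variables and then calls both. The variable names double and shout are made up for the illustration, and whether higher-order functions are available may depend on your processor and its edition.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
    <xsl:output method="text"/>
    <!-- An inline (anonymous) function assigned to a variable -->
    <xsl:variable name="double" as="function(xs:integer) as xs:integer"
        select="function($n as xs:integer) as xs:integer { $n * 2 }"/>
    <!-- A named function reference: function name, #, and number of arguments -->
    <xsl:variable name="shout" select="upper-case#1"/>
    <xsl:template name="xsl:initial-template">
        <!-- Call each function by adding an argument list after the variable name -->
        <xsl:value-of select="$double(21)"/>
        <xsl:text>&#x0a;</xsl:text>
        <xsl:value-of select="$shout('hi')"/>
    </xsl:template>
</xsl:stylesheet>
```

This should output 42 on the first line and HI on the second.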

Creating HTML5

To create HTML5 output, use <xsl:output method="html" version="5"/>. This creates HTML5 using HTML (not XML) syntax, which means that it omits the XML declaration and it creates a <meta> element inside the <head>. If you serve your HTML5 as MIME type application/xhtml+xml and want to validate it as XML, set @method to xml instead (and set @indent to a positive Boolean value unless that messes up your white space). Setting @method to html also doesn’t add the HTML namespace automatically (fair enough).
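A minimal sketch of a stylesheet that produces HTML5 in HTML syntax might look like the following; the page content is a placeholder.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- HTML syntax, HTML5 flavor; no XML declaration is written -->
    <xsl:output method="html" version="5"/>
    <xsl:template name="xsl:initial-template">
        <!-- The elements are in no namespace; the html method does not add one -->
        <html>
            <head>
                <title>Hello</title>
            </head>
            <body>
                <p>Hello, World!</p>
            </body>
        </html>
    </xsl:template>
</xsl:stylesheet>
```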

Identity transformation

The identity transformation can be expressed in a single top-level <xsl:mode> element:

<xsl:mode on-no-match="shallow-copy"/>
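In practice the declaration is usually combined with templates that override the default copying for selected nodes. The following minimal sketch copies a document unchanged except that it renames one element; the element names para and p are hypothetical, chosen for this illustration.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Anything not matched by an explicit template is shallow-copied -->
    <xsl:mode on-no-match="shallow-copy"/>
    <!-- Hypothetical override: rename <para> to <p>, keeping attributes and content -->
    <xsl:template match="para">
        <p>
            <xsl:apply-templates select="@* | node()"/>
        </p>
    </xsl:template>
</xsl:stylesheet>
```

This replaces the explicit identity template that XSLT 2.0 stylesheets typically had to include.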

Iteration

Iteration may sometimes be easier to write than recursion. The following code returns a running total of the integers from 1 through 10:

<xsl:iterate select="1 to 10">
    <xsl:param name="total" as="xs:integer" select="0"/>
    <xsl:variable name="newTotal" as="xs:integer" select="$total + ."/>
    <xsl:value-of select="concat($total, ' + ', . , ' = ' , $newTotal, '&#x0a;')"/>
    <xsl:next-iteration>
        <xsl:with-param name="total" select="$newTotal"/>
    </xsl:next-iteration>
</xsl:iterate>

This outputs the results of each iteration. To output only the final total, remove the <xsl:value-of> statement and use <xsl:on-completion>:

<xsl:iterate select="1 to 10">
    <xsl:param name="total" as="xs:integer" select="0"/>
    <xsl:on-completion select="$total"/>
    <xsl:variable name="newTotal" as="xs:integer" select="$total + ."/>
    <xsl:next-iteration>
        <xsl:with-param name="total" select="$newTotal"/>
    </xsl:next-iteration>
</xsl:iterate>

although for this contrived problem it would, of course, be simpler to write <xsl:value-of select="sum(1 to 10)"/>.

For comparison, a recursive version of the same running total, written as a named template that calls itself, might look like:

<xsl:template match="/">
    <xsl:variable name="result">
        <xsl:call-template name="accumulate">
            <xsl:with-param name="total" select="0"/>
            <xsl:with-param name="range" select="1 to 10"/>
        </xsl:call-template>
    </xsl:variable>
    <xsl:sequence select="$result"/>
</xsl:template>
<xsl:template name="accumulate">
    <xsl:param name="total" as="xs:integer"/>
    <xsl:param name="range" as="xs:integer*"/>
    <xsl:choose>
        <xsl:when test="empty($range)">
            <xsl:sequence select="'done'"/>
        </xsl:when>
        <xsl:otherwise>
            <xsl:variable name="currentValue" as="xs:integer" select="$range[1]"/>
            <xsl:variable name="newTotal" as="xs:integer" select="$total + $currentValue"/>
            <xsl:value-of select="concat($total, ' + ', $currentValue, ' = ', $newTotal, '&#x0a;')"/>
            <xsl:call-template name="accumulate">
                <xsl:with-param name="total" as="xs:integer" select="$newTotal"/>
                <xsl:with-param name="range" as="xs:integer*" select="remove($range, 1)"/>
            </xsl:call-template>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

This returns a report on each step plus the word done at the end. To see just the steps, make <xsl:when> an empty element. To return just the total, remove the <xsl:value-of> from the <xsl:otherwise> element and set the value of the sequence returned inside <xsl:when> to $total.