Digital humanities


Authors: Janis Chinn and Elisa Beshero-Bondar
Maintained by: David J. Birnbaum (djbpitt@gmail.com) [Creative Commons BY-NC-SA 3.0 Unported License]
Last modified: 2015-08-30T20:10:06+0000


Transforming XML with XSLT

Introduction

In addition to the <oXygen/> XSLT debugger mode, there are alternative ways to transform XML files with XSLT that developers apply in projects, including some that can transform more than one XML file at a time.

Client-side transformation with XSLT

One procedure involves client-side transformation, or transforming within a web browser. To implement client-side transformation you insert a line into the XML document that tells the browser to use a particular XSLT stylesheet to transform the document in order to render it for viewing. This is similar to associating a CSS stylesheet with an XML document, and it has at least two related advantages over having the developer run the transformation and upload the HTML output. Transforming the XML to HTML before uploading is an extra step, and it has to be repeated every time the author revises the XML, so the first advantage is cutting out this extra stage. The second advantage is that because the user is downloading the XML and transforming it in the browser, the transformation is applied automatically in real time to the XML, so there is no danger of updating the XML but forgetting to rerun the transformation and upload the correspondingly revised HTML.

The principal disadvantage of client-side (in-the-browser) XSLT transformation is that web browsers (as of this writing in 2015) do not support XSLT rendering evenly, and none of the major browsers supports XSLT 2.0. This means that you cannot be confident of reliable output, and you have to confine yourself to XSLT 1.0 features and functionality. Furthermore, associating the XSLT with the XML makes the most sense if the XML is going to be transformed in only one way, such as to create a reading text in HTML. If you want to transform your XML in more than one way, you can’t associate two stylesheets with the same XML and have them operate independently. For a comparative review of browsers in client-side transformation, see Julian Reschke’s Test cases for XSLT support in browsers and Yegor Bugayenko’s survey of options for client-side rendering. Finally, Saxon CE is a JavaScript library that can provide XSLT 2.0 support for client-side transformations in any browser, but it imposes a delay on first download and requires that your XSLT be structured in a way that matches the Saxon CE protocols.

If you decide to use client-side (in-browser) transformation with XSLT, the easiest way to associate your XSLT stylesheet with your XML document instance is to let <oXygen/> help you. As is illustrated below, with your XML document open in <oXygen/>, go to the Document menu and select XML Document, then Associate XSLT/CSS Stylesheet. Change to the XSLT tab, click on the little folder icon to the right of the text input box, and pick your XSLT file. <oXygen/> will write a stylesheet association line into your XML, and when you next load the XML file into a web browser, you should see your XML styled according to your XSLT.

Associate your XSL
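For reference, the association line is a standard xml-stylesheet processing instruction placed in the prolog, before the root element. The sketch below writes a tiny demonstration document that contains one. The file names poem.xml and poem-to-html.xsl, the poem markup, and the temporary location are all hypothetical, and the exact line that <oXygen/> writes may differ in detail (for example, in the path it puts in href).

```shell
#!/bin/sh
# Hypothetical demo file in a temporary location; adjust to taste.
DEMO_FILE="${TMPDIR:-/tmp}/poem.xml"

# Write a minimal XML document. The second line is the stylesheet
# association: it tells the browser which XSLT file to apply.
# poem-to-html.xsl is an assumed stylesheet name, not a real project file.
cat > "$DEMO_FILE" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="poem-to-html.xsl"?>
<poem>
  <line>Example line</line>
</poem>
EOF

# Show the association line together with its line number in the file.
grep -n 'xml-stylesheet' "$DEMO_FILE"
```

The href value is resolved relative to the XML file, so the stylesheet has to be uploaded alongside the XML (or referenced by a full URL) for the browser to find it.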

Developer transformation with XSLT: transformation scenarios

As an alternative to client-side transformation, you can perform the transformation yourself and upload just the HTML (or other) output of the stylesheet transformation, instead of the raw XML together with the stylesheet. This has the advantage of letting you use XSLT 2.0 features, but the disadvantage is that you have to remember to regenerate the output whenever you change the XML. Because we often use XSLT 2.0 features and often transform our XML in more than one way, this is the strategy we use most often in our own projects. It’s a little more complicated for the developer (it involves more clicking), but not terribly so, and here, too, you can let <oXygen/> help you.

You already know how to transform an XML document with XSLT in the <oXygen/> XSLT debugger, and how to save the output of the transformation so that you can upload it to your project site. This isn’t necessarily the best way to create HTML output, though. The XSLT debugger interface excels at, well, XSLT debugging, but if you know your XSLT is okay (doesn’t require debugging) and you just want to transform your XML, you may find it easier to do that by configuring an <oXygen/> transformation scenario.

As is illustrated below, start by going to that same Document menu and going to Transformation and then Configure Transformation Scenario. Any saved transformations will be listed, but to create a new one you ignore those and click instead on New and from Scenario type select XML transformation with XSLT. The New scenario dialog, illustrated below, then opens and asks for additional information.

Configure transformation scenario menu path

You’ll need to specify both your XML and XSL files in the first tab (labeled XSLT), along with the XSLT transformer you want to use. Although <oXygen/> may try to guess which files and transformer to use based on which file was in focus when you opened the menu, it’s safer and easier in the long run to specify all of this information explicitly. If you’re using XSLT 2.0 features, we’d recommend selecting either Saxon PE or Saxon HE as the transformer. To the right of the XML URL and XSL URL boxes you can click on the little folder icons and navigate to the XML and XSLT files you want to use for the transformation. <oXygen/> will let you set one or both to Current file and a few other variable values, but if you accidentally have the wrong files in focus when you do that, you’ll get the wrong result, so it’s safest just to specify both files fully. In the Output tab you need to specify the location and name of the HTML file you want to create, and you can select to view the file in <oXygen/> and/or in your Internet browser if you want (see the green highlighting in the illustration below). When you’re all done, hit OK, run the transformation, and then upload your new HTML file to your project directory on the server and link to it, so that users can access it easily.

Create a new scenario

If, in the future, you wish to run this scenario again, simply navigate the menus again and select Apply Transformation Scenario instead of Configure Transformation Scenario (although you can actually do it from either, as noted above), and select the scenario you wish to run. You’ll need to re-upload your HTML file, if you regenerate it, to replace the old version on the server.

Batch processing multiple XML files with XSLT

Project developers frequently need to transform many XML files at once. One common situation is that you are delivering a large number of files with similar markup (such as a poetry collection), you want them all rendered similarly, and you change your rendering preferences. In this case you’ll want to modify the XSLT you use to generate the HTML versions of the poems and then transform them all. Loading each one separately into <oXygen/> and transforming it in the XSLT debugger or by using the methods above means a lot of repetitive actions on your part, and what you’d prefer is to run a single transformation that gets applied to each of the files without your having to specify them one by one.

The <oXygen/> Project menu provides a helpful tool for batch processing. The following list describes how to use that menu to set up a simple batch transformation over a group of XML files stored in a local directory on your computer.

Transformations at the command line

Configuring a transformation scenario for a single XML file, and especially for batch transformation of multiple files, avoids the overhead of going through the <oXygen/> debugger. But because the transformation is actually performed by the Saxon XSLT engine operating within <oXygen/>, and not by <oXygen/> itself, we can simplify the process even further by cutting <oXygen/> out of the process altogether.

Saxon is available in three versions: EE (enterprise edition), PE (professional edition), and HE (home edition) (see the feature matrix to compare them). All three are bundled with <oXygen/> and any of the three can be used in the <oXygen/> XSLT debugger or in a transformation scenario. EE and PE are commercial products that you can purchase to use outside <oXygen/>; HE is free and open source, so you can download and use it outside <oXygen/> without cost. Although HE has the fewest features of the three, we’ve been able to use it effectively in our projects, and if you’d like to experiment with command-line XSLT transformation, you can download it at http://sourceforge.net/projects/saxon/files/Saxon-HE/.

Instructions for using Saxon at the command line are at http://www.saxonica.com/documentation/index.html#!using-xsl/commandline. The short version is:

java -jar /path-to-saxon/saxon9he.jar -s:input.xml -xsl:stylesheet.xsl -o:output.xml

Replace the filenames with the names of your real files. To transform an entire directory of files, you can specify the directory names as arguments to the -s: and -o: switches.
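If you want explicit control over each output filename (for example, to give every result an .html extension), one alternative to passing directories is to call Saxon once per file from a small shell loop. What follows is only a sketch under stated assumptions: the directory names xml and html, the stylesheet name poem-to-html.xsl, and the DRY_RUN switch are hypothetical, and the jar path is the same placeholder used above. By default the script only prints the commands it would run, so you can check them before invoking Saxon.

```shell
#!/bin/sh
# Placeholder jar location, as in the command above; override via the environment.
SAXON_JAR="${SAXON_JAR:-/path-to-saxon/saxon9he.jar}"

# Map an input path such as xml/poem1.xml to an output path such as html/poem1.html.
output_name() {
    printf '%s/%s.html\n' "$2" "$(basename "$1" .xml)"
}

# Transform every .xml file in a directory, one Saxon call per file.
# Arguments: input directory, stylesheet, output directory.
# With DRY_RUN=1 (the default in this sketch) the commands are only printed.
transform_all() {
    in_dir=$1
    xsl=$2
    out_dir=$3
    for f in "$in_dir"/*.xml; do
        [ -e "$f" ] || continue   # skip when the directory has no .xml files
        out=$(output_name "$f" "$out_dir")
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "java -jar $SAXON_JAR -s:$f -xsl:$xsl -o:$out"
        else
            java -jar "$SAXON_JAR" "-s:$f" "-xsl:$xsl" "-o:$out"
        fi
    done
}

# Hypothetical project layout: XML sources in xml/, HTML results in html/.
transform_all xml poem-to-html.xsl html
```

Once the printed commands look right, run the script with DRY_RUN=0 to perform the transformations. The loop starts a new Java process for each file, so for a large collection the directory form of the -s: and -o: switches is likely to be faster.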