Digital humanities


Maintained by: David J. Birnbaum (djbpitt@gmail.com) [Creative Commons BY-NC-SA 3.0 Unported License] Last modified: 2025-05-11T16:27:21+0000


Configuring XProc and ixml processors

Introduction

Because we often use XProc and ixml together and because they share some resources, we have combined the installation and configuration for both in this single document. We include installation and configuration instructions for two XProc engines (XML Calabash, MorganaXProc-IIIse) and three stand-alone ixml processors (CoffeePot, Markup Blitz, and xmq). The configuration procedure described here is based on our requirements; if your requirements differ from ours, see the official documentation for information about alternatives.

Except where otherwise noted, the applications described below do not come with installer scripts. What we mean, then, when we write below that you should download and install an application is that you should unzip it (if it is distributed in zip format) and move it (or, if it came in zip form, the files that emerge from unzipping) to a location of your choice.

We assume that:

The main pages for the two technologies are https://xproc.org/ and https://invisiblexml.org/. You may also find the online jωiXML Invisible XML Workbench (an ixml processor that runs in your local browser) and ixampl (an online ixml service) useful.

Configuring stand-alone command-line XProc processors

Configuring XML Calabash (XProc)

These instructions configure XML Calabash (tested with beta1) to use Saxon EE (or, if you don’t have an EE license, Saxon HE instead) for XSLT and XQuery. XML Calabash documentation, including the installation instructions on which this report is based, is located at https://docs.xmlcalabash.com/userguide/current/index.html.

  1. Download and install the current version of XML Calabash from https://github.com/xmlcalabash/xmlcalabash3/.

  2. If you have a license for Saxon EE, in the lib subdirectory of the XML Calabash installation delete Saxon-HE-12.6.jar and copy your Saxon EE jar file into that lib directory to replace the deleted HE file. If you don’t have a license for Saxon EE, skip this step; your installation will use Saxon HE.

  3. If you have a license for Saxon EE, set the environment variable SAXON_HOME (in your .zshrc file) to point to the directory in which your regular Saxon EE jar file (not the copy you just made) and your license key file reside. If you don’t have a license for Saxon EE, skip this step.

  4. In your home directory create a file called .xmlcalabash3 (note the leading dot) based on the following content:

    
      
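    (The XML listing for this file is missing from this copy of the document; see the XML Calabash user guide cited above for the configuration file format.)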

    Edit the file so that the Graphviz element points to the location of your Graphviz dot executable. (You can install Graphviz with Homebrew.)

    Optional: Include the @style attribute on the Graphviz element only if you want to style your graphs, in which case the attribute must point to the XSLT that you use to control the styling. See Pipelines vs. Graphs in the XML Calabash documentation for more information about styling graph output.

    The Saxon EE configuration file (and the line in the XML Calabash configuration file that points to it) may no longer be necessary, but if you use it, ensure that it points to your EE license file. Our Saxon config.xml file, located at /opt/saxon/config.xml, is:

    
      
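    (The XML listing for this file is missing from this copy of the document; it pointed Saxon to the EE license file.)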

    and there is a copy of the EE license file in the same directory as the configuration file.

    If you don’t have a license for Saxon EE and do not need to specify custom Saxon configuration settings, omit the @saxon-configuration attribute from your .xmlcalabash3 configuration file.

  5. Create an alias for XML Calabash based on the following value:

    '/opt/calabash/xmlcalabash-3.0.0-beta1/xmlcalabash.sh'

    Adjust the alias text so that the path points to your local installation of XML Calabash.
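Steps 3 and 5 above might be captured in your .zshrc along the following lines. This is a minimal sketch, not a required layout: the SAXON_HOME value /opt/saxon is an assumption (it matches the directory of our Saxon configuration file), and the alias name xmlcalabash is the one used in the run command in this section.

```shell
# Hypothetical ~/.zshrc additions; adjust both paths to your own system.

# Directory that holds your regular Saxon EE jar and license key file.
# /opt/saxon is an assumption; skip this line if you use Saxon HE.
export SAXON_HOME=/opt/saxon

# Alias for the XML Calabash launcher script (value taken from step 5).
alias xmlcalabash='/opt/calabash/xmlcalabash-3.0.0-beta1/xmlcalabash.sh'
```

After editing .zshrc, open a new terminal (or run source ~/.zshrc) so that the variable and alias take effect.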

Assuming your XProc file manages input and output and your alias is bound to xmlcalabash, run with:

xmlcalabash filename.xpl

XML Calabash and ixml: XML Calabash by default uses CoffeePot for ixml processing. CoffeePot is rich in features, but it sometimes struggles with large input files or ambiguous ixml grammars. It’s possible to configure Markup Blitz as the default ixml processor within XML Calabash (see Configuring the Invisible XML Processor at the bottom of https://docs.xmlcalabash.com/reference/current/p-ixml.html), but for maximum flexibility we recommend leaving CoffeePot as the default and invoking Markup Blitz explicitly when you want to use it. If you use the standard p:ixml step and add a @cx:processor attribute with the value "markup-blitz", your step will work correctly, using Markup Blitz, in both XML Calabash and Morgana. Example:


  
    
  
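(The example listing is missing from this copy of the document; it showed a p:ixml step carrying cx:processor="markup-blitz".)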

Visualizing an XProc pipeline: If you append the --graphs:g switch to the command line, XML Calabash will create a g subdirectory (we’ve chosen g as our directory name, but you can call it whatever you’d like) that will hold an index.html file that links to newly generated graphs in subdirectories. Open the index file in a browser to explore a graphic representation of your pipelines. See Pipelines vs. Graphs in the official documentation for details.
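As a small convenience, a shell helper can confirm that the graph index exists before you try to open it. The sketch below is an assumption-laden illustration: graph_index is a hypothetical function name of our own, and it only checks for the index.html file described above; it does not run XML Calabash itself.

```shell
# Hypothetical helper: print the path of the graph index that XML Calabash
# writes into the directory named in --graphs:DIR (default: g), or report
# that it has not been generated yet.
graph_index() {
  dir="${1:-g}"
  if [ -f "$dir/index.html" ]; then
    printf '%s\n' "$dir/index.html"
  else
    printf 'No index.html in %s; run with --graphs:%s first\n' "$dir" "$dir" >&2
    return 1
  fi
}
```

On macOS you could then view the graphs with open "$(graph_index g)" after a run that used the --graphs:g switch.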

Configuring MorganaXProc-IIIse (XProc)

These instructions configure MorganaXProc-IIIse (tested with 1.6.4) to use Saxon EE (or, if you don’t have an EE license, Saxon HE instead) for XSLT and XQuery, to use Markup Blitz for ixml, and to use Schxslt2 for Schematron. The official MorganaXProc-IIIse documentation, including the installation instructions on which this report is based, is located at https://www.xml-project.com/manual/index.html.

  1. Download and install the current version of MorganaXProc-IIIse from https://sourceforge.net/projects/morganaxproc-iiise/.
  2. Download and install the current version of Schxslt2 from https://git.sr.ht/~dmaus/schxslt2/refs.
  3. Download, build and install the current version of Markup Blitz. See the instructions below.
  4. Create morgana-config.xml in your home directory based on the following content:

    
      
       
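    (The XML markup of this listing is missing from this copy of the document; the values that survive, in order, are:)

      /opt/schxslt2/schxslt2-v1.3.4/transpile.xsl
      LAX
      Saxon12-3
      Saxon12-3
      schxslt2
      /opt/saxon/config.xml
      true
      com.xml_project.morganaxproc3.markupblitzConnector.MarkupBlitzConnector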

    This file is based on the sample config.xml included in the root directory of the MorganaXProc-IIIse distribution. The important settings are the values for:

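    • The Schxslt2 transpiler path (/opt/schxslt2/schxslt2-v1.3.4/transpile.xsl in our file): Change the path to point to the location on your file system.

    • The XSLT, XQuery, and Schematron connector settings (Saxon12-3, Saxon12-3, and schxslt2): Set to exactly these values. Note that the correct value for the XSLT and XQuery connectors is Saxon12-3 even if you are running Saxon version 12.6.

    • The ixml connector setting (com.xml_project.morganaxproc3.markupblitzConnector.MarkupBlitzConnector): This selects Markup Blitz as the ixml engine for MorganaXProc-IIIse pipelines.

    • The Saxon configuration setting (/opt/saxon/config.xml in our file): If you are using Saxon EE, create a Saxon configuration file and set this Morgana option to point to its location. Our Saxon configuration file is located at /opt/saxon/config.xml and looks like the following: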

      
        
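      (The XML listing for this file is missing from this copy of the document; it pointed Saxon to the EE license file.)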

      If you are using Saxon HE you do not need a Saxon configuration file, so delete or comment out the Saxon configuration line in your morgana-config.xml file.

  5. Create an alias for MorganaXProc-IIIse based on the following value:

    '/opt/morgana/MorganaXProc-IIIse-1.6.4/Morgana.sh -config=/Users/djb/morgana-config.xml'

    Adjust the location of both Morgana.sh and the config file in the example above so that they match your filesystem.

  6. Download and install both CoffeeGrinder and CoffeeFilter from https://github.com/nineml.

  7. Edit the Morgana.sh file that is part of the standard MorganaXProc-IIIse distribution. First make it executable (chmod +x Morgana.sh) and then edit it as follows:

    JAVA_VER=$(java -version 2>&1 | sed -n ';s/.* version "\(.*\)\.\(.*\)\..*".*/\1\2/p;')
    
    if [ "$JAVA_VER" = "18" ]
    then
    	JAVA_AGENT=-javaagent:$MORGANA_HOME/MorganaXProc-IIIse_lib/quasar-core-0.7.9.jar
    fi
    
    # All related jars are expected to be in $MORGANA_LIB. For external jars: Add them to $CLASSPATH
    CLASSPATH=$MORGANA_LIB:$MORGANA_HOME/MorganaXProc-IIIse.jar:$BLITZ_JAR:$COFFEEGRINDER_JAR:$COFFEEFILTER_JAR:$SAXON_JAR
    
    java $JAVA_AGENT -cp $CLASSPATH com.xml_project.morganaxproc3.XProcEngine "$@"

    The important details are adding the four local customization variables, which must point to the appropriate jar files on your system, and editing the CLASSPATH value, where you append your four new variables to the original, default setting. Note that the two Morgana-related jar files must come first in the CLASSPATH value, before the four local-customization ones. If you do not have a license for Saxon EE, install Saxon HE and set the SAXON_JAR variable to point to the HE jar file.
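The four local customization variables named in the CLASSPATH line might be defined near the top of Morgana.sh roughly as follows. This is only a sketch: the variable names come from the CLASSPATH value above, but every path shown is a placeholder assumption, and check_jars is a hypothetical helper of our own for confirming that the files exist before you run a pipeline.

```shell
# Hypothetical values for the four local customization variables.
# Every path below is an assumption; point each at the jar on your system.
BLITZ_JAR=/opt/markup-blitz/markup-blitz.jar
COFFEEGRINDER_JAR=/opt/coffeegrinder/coffeegrinder.jar
COFFEEFILTER_JAR=/opt/coffeefilter/coffeefilter.jar
SAXON_JAR=/opt/saxon/saxon-ee.jar

# Hypothetical helper: report every argument that is not an existing file.
# Returns 0 if all files exist and 1 if at least one is missing.
check_jars() {
  missing=0
  for jar in "$@"; do
    if [ ! -f "$jar" ]; then
      printf 'Missing jar: %s\n' "$jar" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running check_jars "$BLITZ_JAR" "$COFFEEGRINDER_JAR" "$COFFEEFILTER_JAR" "$SAXON_JAR" lists any path that still needs to be corrected.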

Assuming your XProc file manages input and output and your alias is bound to morgana, run with:

morgana filename.xpl

Configuring stand-alone command-line ixml processors

About these ixml processors

This section provides instructions for installing and configuring three stand-alone command-line ixml processors: CoffeePot, Markup Blitz, and xmq. We prefer CoffeePot for development because it supports more options and provides a wider range of feedback, but Markup Blitz and xmq are often faster, especially with large files.

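You can run any of these ixml processors on the command line. Additionally, the XProc processors described above use them as follows:

  • XML Calabash uses CoffeePot by default and can invoke Markup Blitz on request.

  • MorganaXProc-IIIse, as configured above, uses Markup Blitz.

  • xmq cannot (yet) be incorporated into either XProc processor.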

Configuring CoffeePot (ixml)

These instructions configure CoffeePot (tested with 3.2.9). The CoffeePot documentation, including the installation instructions on which this report is based, is located at https://docs.nineml.org/current/coffeepot/.

  1. Download and install the current version of CoffeePot by following the latest-release link on https://github.com/nineml/coffeepot.

  2. Create an alias for CoffeePot based on the following value:

    'java -jar /opt/coffeepot/coffeepot-3.2.9/coffeepot-3.2.9.jar'

    Adjust the alias text so that the path points to your local installation of CoffeePot.

  3. Create a file called .nineml.properties (note the leading dot) in your home directory based on the following example:

    graphviz=/opt/homebrew/bin/dot
    ignore-trailing-whitespace=true
    pretty-print=true
    progress-bar=tty
    assert-valid-xml-characters=true
    assert-valid-xml-names=true
    ignore-bom=true
    normalize-line-endings=true
    trailing-newline-on-output=true

    For information about tuning these settings according to your requirements see https://docs.nineml.org/current/coffeepot/bk02ch07.html.

    To check your grammar for ambiguities, append --analyze-ambiguity to your command line. To see an SVG graph of your grammar append --graph:filename.svg to your command line (changing the filename according to your requirements). The volume of information included in the graph is so great that it’s practical only with small or simple grammars and documents that incur minimal ambiguity, and CoffeePot will decline to create graphs beyond a certain size.

Assuming your alias is bound to coffeepot, run with:

coffeepot -g:filename.ixml -i:filename.txt
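Graph output depends on the graphviz entry in .nineml.properties, so it can help to confirm what that entry currently says. The following is a minimal sketch under our own assumptions: graphviz_path is a hypothetical helper name, and it simply reads the graphviz= line from the properties file (by default the one in your home directory).

```shell
# Hypothetical helper: print the value of the graphviz= entry in a
# .nineml.properties file (default: the one in your home directory).
graphviz_path() {
  sed -n 's/^graphviz=//p' "${1:-$HOME/.nineml.properties}"
}
```

For example, [ -x "$(graphviz_path)" ] succeeds only if the configured path points at an executable file.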

Configuring Markup Blitz (ixml)

These instructions configure Markup Blitz (tested with 1.8). The official Markup Blitz documentation, including the installation instructions on which this report is based, is located at https://github.com/GuntherRademacher/markup-blitz.

  1. Follow the instructions at https://github.com/GuntherRademacher/markup-blitz to download and build Markup Blitz, which creates markup-blitz.jar. You must perform the build step to create the jar file to which your alias will point; this is different from the installations above, where you download a pre-built jar file.

  2. Create an alias for Markup Blitz based on the following value:

    'java -jar /Users/djb/repos/markup-blitz/build/libs/markup-blitz.jar --indent'

    Adjust the path to point to your local jar file.

Assuming your alias is bound to blitz, run with:

blitz filename.ixml filename.txt

Configuring xmq (ixml)

These instructions configure xmq (tested with 3.3.2). The primary purpose of xmq is to support an alternative (non-XML) syntax for XML documents, but xmq can also be used as a stand-alone ixml processor that creates XML output. xmq cannot (yet) be incorporated into XML Calabash or Morgana, so we use it only as a stand-alone ixml processor. The official xmq documentation, including the information on which this report is based, is located at https://github.com/libxmq/xmq.

  1. Install xmq using Homebrew (brew install xmq). If you have not yet installed Homebrew, you’ll find instructions for doing so at https://brew.sh/.

  2. Create a function inside your .zshrc file based on:

    xmq() {
      /opt/homebrew/bin/xmq "$1" "$2" to-xml;
    }

    Verify that the path points to the location where Homebrew installed your xmq. You need to use a function, rather than an alias, because the string to-xml has to come after the command-line arguments and aliases are not designed to support text after command-line arguments.

Run with:

xmq --ixml=filename.ixml filename.txt
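If you sometimes pass more or fewer than two arguments, a variant of the function that forwards all of its arguments may be more convenient. This is a sketch rather than the configuration we use: the function name ixml_xmq and the XMQ_BIN override variable are both hypothetical, introduced here only for illustration.

```shell
# Hypothetical variant of the wrapper function: forward every argument,
# then append to-xml. XMQ_BIN is an assumed override for the xmq location;
# it defaults to the Homebrew path used above.
ixml_xmq() {
  "${XMQ_BIN:-/opt/homebrew/bin/xmq}" "$@" to-xml
}
```

You would call it the same way as the function above, for example ixml_xmq --ixml=filename.ixml filename.txt.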

xmq does not pretty-print (format and indent) its XML output, but you can pipe the output through a tool that does, such as xmllint. To do that:

  1. Install xmlstarlet using Homebrew (brew install xmlstarlet). Installing xmlstarlet automatically installs xmllint.

  2. Run with:

    xmq --ixml=filename.ixml filename.txt | xmllint --format -