Digital humanities


Maintained by: David J. Birnbaum (djbpitt@gmail.com) [Creative Commons BY-NC-SA 3.0 Unported License] Last modified: 2025-02-15T16:18:04+0000


Configuring XProc and ixml processors

Introduction

Because we often use XProc and ixml together and because they share some resources, we have combined the installation and configuration for both in this single document. We include installation and configuration instructions for two XProc engines (XML Calabash, MorganaXProc-IIIse) and three stand-alone ixml processors (CoffeePot, Markup Blitz, and xmq). The configuration procedure described here is based on our requirements; if your requirements differ from ours, see the official documentation for information about alternatives.

Except where otherwise noted, the applications described below do not come with installer scripts. When we write below that you should download and install an application, we mean that you should unzip it (if it is distributed in zip format) and move it, or the files that emerge from unzipping, to a location of your choice.

We assume that:

The main pages for the two technologies are https://xproc.org/ and https://invisiblexml.org/. You may also find the online jωiXML Invisible XML Workbench (an ixml processor that runs in your local browser) and ixampl (an online ixml service) useful.

Configuring stand-alone command-line XProc processors

Configuring XML Calabash (XProc)

These instructions configure XML Calabash (tested with alpha20) to use Saxon EE (or, if you don’t have an EE license, Saxon HE instead) for XSLT and XQuery and to use CoffeeSacks for ixml within XSLT. XML Calabash documentation, including the installation instructions on which this report is based, is located at https://docs.xmlcalabash.com/userguide/current/index.html.

  1. Download and install the current version of XML Calabash from https://github.com/xmlcalabash/xmlcalabash3/.

  2. If you have a license for Saxon EE, in the lib subdirectory of the XML Calabash installation delete Saxon-HE-12.5.jar and copy your Saxon EE jar file into that lib directory to replace the deleted HE file. If you don’t have a license for Saxon EE, skip this step; your installation will use Saxon HE.

  3. If you have a license for Saxon EE, set the environment variable SAXON_HOME (in your .zshrc file) to point to the directory in which your regular Saxon EE jar file (not the copy you just made) and your license key file reside. If you don’t have a license for Saxon EE, skip this step.

  4. Download CoffeeSacks from https://github.com/nineml/coffeesacks and copy the CoffeeSacks jar file into the XML Calabash extra subdirectory.

  5. In your home directory create a file called .xmlcalabash3 (note the leading dot) based on the following content:

    
      
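    [XML listing not preserved: the .xmlcalabash3 configuration, which carries a saxon-configuration attribute and an element that locates the Graphviz dot executable]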

    Edit the file so that the Graphviz element points to the location of your Graphviz dot executable. (You can install Graphviz with Homebrew.)

    The Saxon EE configuration file (and the line in the XML Calabash configuration file that points to it) may no longer be necessary, but if you use it, ensure that it points to your EE license file. Our Saxon config.xml file, located at /opt/saxon/config.xml, is:

    
      
    [XML listing not preserved: the Saxon config.xml file, which points to the EE license file]

    and there is a copy of the EE license file in the same directory as the configuration file.

    If you don’t have a license for Saxon EE and do not need to specify custom Saxon configuration settings, omit the @saxon-configuration attribute from your .xmlcalabash3 configuration file.

  6. Optional: Some extension steps (not part of standard XProc) are distributed with XML Calabash in their own zip files, which can be downloaded from the same latest-release link as XML Calabash itself. Their use is not discussed here, but should you wish to install any of them:

    1. Download and unzip the zip distribution file.

    2. Copy all jar files from the extra subdirectory that you just unzipped into the extra subdirectory of XML Calabash.

    3. The extension steps supported by these optional resources are documented in the user guide at https://docs.xmlcalabash.com/reference/current/extension-steps.html.

  7. Create an alias for XML Calabash based on the following value:

    '/opt/calabash/xmlcalabash-3.0.0-alpha20/xmlcalabash.sh --init:org.nineml.coffeesacks.RegisterCoffeeSacks'

    Adjust the alias text so that the path points to your local installation of XML Calabash.
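As a minimal sketch, steps 3 and 7 might look like this in a .zshrc file. The SAXON_HOME value /opt/saxon is an assumption (use the directory that holds your own Saxon EE jar and license key), and the alias name xmlcalabash is chosen to match the run command below.

```shell
# Assumed location: the directory holding your Saxon EE jar and license key.
# Omit this line if you use Saxon HE.
export SAXON_HOME=/opt/saxon

# Alias value copied from step 7; adjust the path to your installation.
alias xmlcalabash='/opt/calabash/xmlcalabash-3.0.0-alpha20/xmlcalabash.sh --init:org.nineml.coffeesacks.RegisterCoffeeSacks'
```

Open a new terminal window (or run source ~/.zshrc) for the changes to take effect.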

Assuming your XProc file manages input and output and your alias is bound to xmlcalabash, run with:

xmlcalabash filename.xpl

If you append the --graphs:. (note the trailing dot) switch to the command line, XML Calabash will create a graphs subdirectory along with an index.html file, linked to the newly generated graphs, in the current directory. Open the index file in a browser to explore a graphic representation of your pipeline(s).

Configuring MorganaXProc-IIIse (XProc)

These instructions configure MorganaXProc-IIIse (tested with 1.5) to use Saxon EE (or, if you don’t have an EE license, Saxon HE instead) for XSLT and XQuery, to use Markup Blitz for ixml, and to use Schxslt2 for Schematron. The official MorganaXProc-IIIse documentation, including the installation instructions on which this report is based, is located at https://www.xml-project.com/manual/index.html.

  1. Download and install the current version of MorganaXProc-IIIse from https://sourceforge.net/projects/morganaxproc-iiise/.
  2. Download and install the current version of Schxslt2 from https://git.sr.ht/~dmaus/schxslt2/refs.
  3. Download, build and install the current version of Markup Blitz. See the instructions below.
  4. Create morgana-config.xml in your home directory based on the following content:

    
    	
    	 
    [XML markup not preserved; the surviving setting values, in order:]
    /opt/schxslt2/schxslt2-v1.3.4/transpile.xsl
    LAX
    Saxon12-3
    Saxon12-3
    schxslt2
    /opt/saxon/config.xml
    true
    com.xml_project.morganaxproc3.markupblitzConnector.MarkupBlitzConnector

    This file is based on the sample config.xml included in the root directory of the MorganaXProc-IIIse distribution. The important settings are the path to the Schxslt2 transpile.xsl file (change the path to point to the location on your file system) and the settings that select the processors for XSLT, XQuery, Schematron, and ixml. MorganaXProc-IIIse is able to support CoffeePot instead of Markup Blitz as its ixml processing engine. See below about the differences between the two.

  5. Create an alias for MorganaXProc-IIIse based on the following value:

    '/opt/morgana/MorganaXProc-IIIse-1.5/Morgana.sh -config=/Users/djb/morgana-config.xml'

    Adjust the location of both Morgana.sh and the config file in the example above so that they match your filesystem.

  6. Download and install both CoffeeGrinder and CoffeeFilter from https://github.com/nineml.

  7. Edit the Morgana.sh file that is part of the standard MorganaXProc-IIIse distribution. First make it executable (chmod +x Morgana.sh) and then edit it as follows:

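    # Local customization: set these four variables to the jar files on your
    # system (the paths shown are placeholders)
    BLITZ_JAR=/path/to/markup-blitz.jar
    COFFEEGRINDER_JAR=/path/to/coffeegrinder.jar
    COFFEEFILTER_JAR=/path/to/coffeefilter.jar
    SAXON_JAR=/path/to/saxon.jar

    # MORGANA_HOME and MORGANA_LIB keep their definitions from the distributed script

    # Detect the Java version
    JAVA_VER=$(java -version 2>&1 | sed -n ';s/.* version "\(.*\)\.\(.*\)\..*".*/\1\2/p;')

    if [ "$JAVA_VER" = "18" ]
    then
    	JAVA_AGENT=-javaagent:$MORGANA_HOME/MorganaXProc-IIIse_lib/quasar-core-0.7.9.jar
    fi

    # All related jars are expected to be in $MORGANA_LIB. For external jars: Add them to $CLASSPATH
    CLASSPATH=$BLITZ_JAR:$COFFEEGRINDER_JAR:$COFFEEFILTER_JAR:$SAXON_JAR:$MORGANA_LIB:$MORGANA_HOME/MorganaXProc-IIIse.jar

    java $JAVA_AGENT -cp $CLASSPATH com.xml_project.morganaxproc3.XProcEngine "$@"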

    The important details are adding the four local customization variables, which must point to the appropriate jar files on your system, and editing the CLASSPATH value, where you prepend your four new variables to the original, default setting. If you do not have a license for Saxon EE, install Saxon HE and set the SAXON_JAR variable to point to the HE jar file.
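A wrong path in one of the four variables typically surfaces only as a Java class-loading error, so it can help to confirm that each jar file exists before you run a pipeline. The following is a minimal sketch of such a check, not part of the Morgana.sh distribution; the function name check_jars is hypothetical.

```shell
# check_jars (hypothetical helper): report every argument that is not an
# existing regular file, and return non-zero if any is missing.
check_jars() {
  missing=0
  for jar in "$@"; do
    if [ ! -f "$jar" ]; then
      echo "missing: $jar" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

After setting the four variables in your shell to the same values you used in Morgana.sh, you could call it as check_jars "$BLITZ_JAR" "$COFFEEGRINDER_JAR" "$COFFEEFILTER_JAR" "$SAXON_JAR".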

Assuming your XProc file manages input and output and your alias is bound to morgana, run with:

morgana filename.xpl

Configuring stand-alone command-line ixml processors

About these ixml processors

This section provides instructions for installing and configuring three stand-alone command-line ixml processors: CoffeePot, Markup Blitz, and xmq. We prefer CoffeePot for development because it supports more options and provides a wider range of feedback, but Markup Blitz and xmq are often faster, especially with large files.

You can run any of these ixml processors on the command line. Additionally, the XProc processors described above use them as follows:

Configuring CoffeePot (ixml)

These instructions configure CoffeePot (tested with 3.2.7). The CoffeePot documentation, including the installation instructions on which this report is based, is located at https://docs.nineml.org/current/coffeepot/.

  1. Download and install the current version of CoffeePot from the link on https://github.com/nineml/coffeepot to the latest release.

  2. Create an alias for CoffeePot based on the following value:

    'java -jar /opt/coffeepot/coffeepot-3.2.7/coffeepot-3.2.7.jar'

    Adjust the alias text so that the path points to your local installation of CoffeePot.

  3. Create a file called .nineml.properties (note the leading dot) in your home directory based on the following example:

    graphviz=/opt/homebrew/bin/dot
    ignore-trailing-whitespace=true
    pretty-print=true
    progress-bar=tty
    assert-valid-xml-characters=true
    assert-valid-xml-names=true
    ignore-bom=true
    normalize-line-endings=true
    trailing-newline-on-output=true

    For information about tuning these settings according to your requirements see https://docs.nineml.org/current/coffeepot/bk02ch07.html.

    To check your grammar for ambiguities, append --analyze-ambiguity to your command line. To see an SVG graph of your grammar append --graph:filename.svg to your command line (changing the filename according to your requirements). The volume of information included in the graph is so great that it’s practical only with small or simple grammars and documents that incur minimal ambiguity, and CoffeePot will decline to create graphs beyond a certain size.
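As a minimal sketch, the alias from step 2 might be defined in a .zshrc file as follows. The alias name coffeepot is chosen to match the run command below, and the commented usage lines simply append the two switches described above to that command.

```shell
# Alias value copied from step 2; adjust the path to your installation.
alias coffeepot='java -jar /opt/coffeepot/coffeepot-3.2.7/coffeepot-3.2.7.jar'

# Usage once the alias is active in an interactive shell:
#   coffeepot -g:filename.ixml -i:filename.txt --analyze-ambiguity
#   coffeepot -g:filename.ixml -i:filename.txt --graph:filename.svg
```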

Assuming your alias is bound to coffeepot, run with:

coffeepot -g:filename.ixml -i:filename.txt

Configuring Markup Blitz (ixml)

These instructions configure Markup Blitz (tested with 1.6). The official Markup Blitz documentation, including the installation instructions on which this report is based, is located at https://github.com/GuntherRademacher/markup-blitz.

  1. Follow the instructions at https://github.com/GuntherRademacher/markup-blitz to download and build Markup Blitz, which creates markup-blitz.jar. You must perform the build step to create the jar file to which your alias will point; this is different from the installations above, where you download a pre-built jar file.

  2. Create an alias for Markup Blitz based on the following value:

    'java -jar /Users/djb/repos/markup-blitz/build/libs/markup-blitz.jar --indent'

    Adjust the path to point to your local jar file.

Assuming your alias is bound to blitz, run with:

blitz filename.ixml filename.txt

Configuring xmq (ixml)

These instructions configure xmq (tested with 3.2.2). The primary purpose of xmq is to support an alternative (non-XML) syntax for XML documents, but xmq can also be used as a stand-alone ixml processor that creates XML output. xmq cannot (yet) be incorporated into XML Calabash or Morgana, so we use it only as a stand-alone ixml processor. The official xmq documentation, including the information on which this report is based, is located at https://github.com/libxmq/xmq.

  1. Install xmq using Homebrew (brew install xmq). If you have not yet installed Homebrew, you’ll find instructions for doing so at https://brew.sh/.

  2. Create a function inside your .zshrc file based on:

    xmq() {
      /opt/homebrew/bin/xmq "$1" "$2" to-xml;
    }

    Verify that the path points to the location where Homebrew installed your xmq. You need to use a function, rather than an alias, because the string to-xml has to come after the command-line arguments and aliases are not designed to support text after command-line arguments.

Run with:

xmq --ixml=filename.ixml filename.txt

xmq does not pretty-print (format and indent) its XML output, but you can pipe the output through a tool that does, such as xmllint. To do that:

  1. Install xmlstarlet using Homebrew (brew install xmlstarlet). Installing xmlstarlet automatically installs xmllint.

  2. Run with:

    xmq --ixml=filename.ixml filename.txt | xmllint --format -
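If you always want formatted output, one option is a second function that builds the pipe in. This is a sketch under the same assumptions as the function above (xmq installed by Homebrew at /opt/homebrew/bin/xmq, xmllint on your path); the name xmqpp is hypothetical.

```shell
# xmqpp (hypothetical name): parse the input with xmq, then pretty-print
# the resulting XML with xmllint.
xmqpp() {
  /opt/homebrew/bin/xmq "$1" "$2" to-xml | xmllint --format -
}
```

It takes the same arguments as the xmq function: xmqpp --ixml=filename.ixml filename.txt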