Digital humanities


Maintained by: David J. Birnbaum (djbpitt@gmail.com) [Creative Commons BY-NC-SA 3.0 Unported License] Last modified: 2018-04-24T16:42:02+0000


Schematron test: answers

The text

This test uses the main text from the Zelda: Twilight princess project, which you can download by right-clicking on http://dh.obdurodon.org/schematron-test-2184-xml-instance.xml

The structure of this document, which is not in a namespace, is that the root element, <zelda>, contains three acts (<act> elements), each of which contains several chapters (<chapter> elements). Both acts and chapters have @n numbers, which are used to number them sequentially. The acts, then, are numbered 1, 2, and 3, and the chapters within each act are numbered consecutively, beginning anew with 1 at the start of each act.
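
As a rough orientation, the structure described above might be sketched as follows. This is an illustrative skeleton only: the chapter contents and the number of chapters in each act are placeholders, not a copy of the real file.

<zelda>
    <act n="1">
        <chapter n="1"><!-- chapter text --></chapter>
        <chapter n="2"><!-- chapter text --></chapter>
        <!-- further chapters, numbered consecutively -->
    </act>
    <act n="2">
        <!-- chapter numbering starts over at 1 in each act -->
        <chapter n="1"><!-- chapter text --></chapter>
        <chapter n="2"><!-- chapter text --></chapter>
    </act>
    <act n="3">
        <chapter n="1"><!-- chapter text --></chapter>
    </act>
</zelda>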

The task

When you first began marking up documents at the start of the semester, we advised you that it wasn’t necessary to number document components (such as acts and chapters) manually because XPath can do the numbering for you. Should you nonetheless choose to do it manually, not only are you doing unnecessary work, but you run the risk of introducing an error. To ensure that that hasn’t happened here, your task is to write a Schematron schema to validate that the chapter numbers (the values of the @n attributes on the <chapter> elements) within each act 1) begin at 1 and 2) run consecutively through the end of the act. You do not have to validate the act numbers, although you are encouraged to do so as a bonus task if you would like to challenge yourself.

You should associate your schema with your XML document instance in <oXygen/> and verify that it works by changing some of the @n values on chapters in the XML. When you are satisfied with your answer, please upload just the Schematron file (not the XML).
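
One common way to record the association is an xml-model processing instruction placed before the root element of the XML instance; <oXygen/> can insert it for you when you associate a schema through its interface. The sketch below assumes that your schema is saved as schematron-test.sch in the same directory as the XML, which is a hypothetical filename chosen for illustration.

<?xml version="1.0" encoding="UTF-8"?>
<!-- href is an assumed filename; point it at wherever your Schematron file actually lives -->
<?xml-model href="schematron-test.sch" type="application/xml" schematypens="http://purl.oclc.org/dsdl/schematron"?>
<zelda>
    <!-- acts and chapters as in the original document -->
</zelda>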

Two answers

An inferior literal answer

The following solution is literal because it compares each chapter number to the preceding one and verifies that it is greater by 1. Because the first chapter doesn’t have a preceding chapter number, it requires a separate rule.

<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
    <pattern>
        <rule context="chapter[1]">
            <assert test="@n = 1">Invalid chapter numbering.</assert>
        </rule>
        <rule context="chapter[preceding-sibling::chapter]">
            <assert test="(number(@n) - number(preceding-sibling::chapter[1]/@n)) = 1">Invalid
                chapter numbering.</assert>
        </rule>
    </pattern>
</schema>

The first rule has a @context attribute of "chapter[1]", which means that it validates only the first <chapter> element in each <act>. Its <assert> tests whether the @n attribute is equal to 1. The second rule checks that each subsequent chapter number is exactly one greater than the one before it. To do this, we test whether the current @n minus the @n of the immediately preceding chapter is equal to 1. The @context for the second rule is chapter[preceding-sibling::chapter], where the predicate tells the system to run the test only if there is a <chapter> element before the one being processed. This means that it never tests the first <chapter>, which is the behavior we want, since we've already written a different rule for the first chapter in each act. For example, the third <chapter> element should have an @n value of 3 and the preceding <chapter> element's @n value should be 2; 3 - 2 = 1, so the assertion succeeds. If the third chapter had been misnumbered as 4, however, the subtraction would yield 2, the assertion would fail, and the error message we have written would be reported.
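
To make the arithmetic concrete, here is a small invented fragment (not taken from the real file) in which the third chapter has been misnumbered as 4, with comments tracing what each rule would compute. Note that a single wrong number can trigger two messages, because the chapter after it no longer differs from its predecessor by 1 either.

<act n="1">
    <chapter n="1"/> <!-- first rule: @n = 1, passes -->
    <chapter n="2"/> <!-- second rule: 2 - 1 = 1, passes -->
    <chapter n="4"/> <!-- second rule: 4 - 2 = 2, fails -->
    <chapter n="4"/> <!-- second rule: 4 - 4 = 0, fails -->
    <chapter n="5"/> <!-- second rule: 5 - 4 = 1, passes -->
</act>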

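These rules could also have been written with <report> instead of <assert>. Because a <report> fires when its test is true, while an <assert> fires when its test is false, the test has to be inverted. The only change that would need to be made is to replace the equal sign with != (the XPath not-equal operator). Note that the exclamation point is not a negation operator on its own in XPath; general negation is expressed with the not() function, so wrapping the original test in not() is an alternative.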
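
A sketch of the <report> variant, with the same contexts and messages as the <assert> version and only the tests inverted, might look like this:

<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
    <pattern>
        <rule context="chapter[1]">
            <!-- fires when @n is present and not equal to 1;
                 not(@n = 1) would also fire if @n were missing altogether -->
            <report test="@n != 1">Invalid chapter numbering.</report>
        </rule>
        <rule context="chapter[preceding-sibling::chapter]">
            <!-- fires when the difference from the preceding chapter is anything other than 1 -->
            <report test="(number(@n) - number(preceding-sibling::chapter[1]/@n)) != 1">Invalid
                chapter numbering.</report>
        </rule>
    </pattern>
</schema>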

A superior, more XPath-y answer

The more elegant solution takes advantage of the fact that the position() values for the chapters within each act are automatically consecutive integers starting at 1. This means that if the @n value equals the position() of the <chapter> in its sequence of sibling chapters, the @n values correctly start at 1 and proceed consecutively. We can write this as:

<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2"
    xmlns:sqf="http://www.schematron-quickfix.com/validator/process">
    <sch:pattern>
        <sch:rule context="chapter">
            <sch:assert test="@n = position()">
                The chapter at position <sch:value-of select="position()"/> has an @n attribute value of <sch:value-of select="@n"/>
            </sch:assert>
        </sch:rule>
    </sch:pattern>
</sch:schema>

This version has several advantages over the literal approach, above. It uses the same rule for the first chapter of each act as for the other chapters, it reports the exact numbers that don’t correspond, and it lets the XPath position() function do the counting for us.
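
One caveat is worth checking in your own environment: the value of position() inside a rule can depend on how the Schematron processor walks the tree, so it may be thrown off if, for example, an <act> were to contain children other than <chapter> elements. A more defensive sketch, offered as an alternative and not as a replacement for the answer above, counts the preceding siblings explicitly. The same idea extends to the bonus task of validating the act numbers; the rule for <act> below is our own illustration and was not part of the required answer.

<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
    <sch:pattern>
        <!-- bonus: each act number should be one more than the number of acts before it -->
        <sch:rule context="act">
            <sch:assert test="@n = count(preceding-sibling::act) + 1">
                The act at position <sch:value-of select="count(preceding-sibling::act) + 1"/> has an @n attribute value of <sch:value-of select="@n"/>
            </sch:assert>
        </sch:rule>
        <!-- chapters: count only preceding sibling chapters, so counting restarts in each act -->
        <sch:rule context="chapter">
            <sch:assert test="@n = count(preceding-sibling::chapter) + 1">
                The chapter at position <sch:value-of select="count(preceding-sibling::chapter) + 1"/> has an @n attribute value of <sch:value-of select="@n"/>
            </sch:assert>
        </sch:rule>
    </sch:pattern>
</sch:schema>

Because preceding-sibling:: looks only at siblings under the same parent, the chapter count begins again at 1 in each act without any extra logic.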