Digital humanities


Maintained by: David J. Birnbaum (djbpitt@gmail.com) [Creative Commons BY-NC-SA 3.0 Unported License] Last modified: 2021-12-27T22:03:53+0000


Patterns, anti-patterns, and other Relax NG details

Introduction

This document includes brief examples of issues that frequently arise in Relax NG homework. Examples of bad code have a pink background.

Comments in Relax NG and elsewhere

Well-formedness and validation

Associating a Relax NG schema with an XML document

White space in Relax NG

Relax NG patterns and anti-patterns

Patterns are examples of good practice. Anti-patterns are examples of bad practice that are nonetheless common because they don’t look like bad practice on first acquaintance. Of course there’s a Wikipedia page for the term: https://en.wikipedia.org/wiki/Anti-pattern

Pattern: the repeatable or group in mixed content

This structure allows any of the specified elements to appear zero or more times in any combination and in any order, with plain text optionally before, between, or after the elements.
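As a minimal sketch of this structure, assume a hypothetical paragraph element p that may contain the hypothetical inline elements em, strong, and title. None of these names comes from the text above; only the shape of the content model matters.

```rnc
# Hypothetical element names, for illustration only.
# mixed { ... } allows plain text to be interleaved with the elements;
# the repeatable or group ( | | )* allows them in any order and number.
p = element p { mixed { (em | strong | title)* } }
em = element em { text }
strong = element strong { text }
title = element title { text }
```

Under these assumptions, a paragraph such as <p>Read <title>Hamlet</title> <em>now</em>!</p> is valid, and so is a p that contains only plain text or nothing at all.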

The following alternatives are also correct Relax NG. We prefer the one above because the mixed keyword makes it easy to see that mixed content is involved, but we also accept:
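Two equivalent ways of writing mixed content without the mixed keyword might look like the sketch below. The element names (p, em, strong, title) and the definition names (p_choice, p_interleave) are hypothetical, and the fragment assumes that em, strong, and title are defined elsewhere in the schema.

```rnc
# Alternative 1: text is one more member of the repeatable or group.
p_choice = element p { (text | em | strong | title)* }

# Alternative 2: text is interleaved (&) with the repeatable or group.
# This is the long form of mixed { (em | strong | title)* }.
p_interleave = element p { text & (em | strong | title)* }
```

A real schema would contain only one definition of the p element; the two are given different names here only so that they can sit side by side.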

Anti-pattern: nested repetition indicators

The bad example below works, but the question marks serve no purpose, since the corrected version after it has the same meaning and is easier to read and to debug.
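A sketch of the kind of contrast described here, using hypothetical element names (p, em, strong, title) that are assumed to be defined elsewhere, and hypothetical definition names (p_bad, p_good) so that both versions can appear together:

```rnc
# Anti-pattern: each ? is redundant, because the enclosing * already
# allows every member of the group to occur zero times.
p_bad = element p { mixed { (em? | strong? | title?)* } }

# Preferred: same meaning, with less to read and to debug.
p_good = element p { mixed { (em | strong | title)* } }
```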

Anti-pattern: repetition indicators on attributes

Attributes are not repeatable on their element (this is a well-formedness constraint), which means that they should never have * or + repetition indicators in a schema. Relax NG will—sloppily—let you write those repetition indicators, but that won’t make the attributes repeatable, since you cannot override a well-formedness constraint. Because the attributes are not repeatable, misrepresenting the structure as if they were is bad practice.

The bad example below might represent an attempt to allow for simultaneous speech by multiple speakers:

speech = element speech { speaker+, content }
speaker = attribute speaker { text }

The following XML, though, is not well formed:

<speech speaker="hamlet" speaker="ophelia">Hi, Polonius!</speech>

Fix this by removing the plus sign and listing the speakers in the attribute value, along the lines of:

<speech speaker="hamlet ophelia">Hi, Polonius!</speech>
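A corrected schema might look like the following sketch. The definition of content as text is an assumption, since the example above does not say what content is. Declaring the attribute value as a white-space-separated list of name tokens is also an assumption; simply removing the plus sign and keeping attribute speaker { text } would work too.

```rnc
# The speaker attribute appears exactly once on each speech.
speech = element speech { speaker, content }

# Assumption: the value is one or more white-space-separated name tokens,
# for example "hamlet" or "hamlet ophelia".
speaker = attribute speaker { list { xsd:NMTOKEN+ } }

# Assumption: content is not defined in the original example.
content = text
```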

Anti-pattern: repetition indicators on text

The reserved word text means zero or more textual characters. This has two perhaps surprising consequences:

The keyword text, then, should never be followed by a repetition indicator.
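To illustrate, the sketch below uses a hypothetical note element. Because text already matches zero or more characters, all four definitions accept the same documents, including an empty <note/>. The definition names are hypothetical and differ only so that the versions can be shown together.

```rnc
# Anti-patterns: the repetition indicators add nothing.
note_optional = element note { text? }
note_star = element note { text* }
note_plus = element note { text+ }

# Preferred: plain text, with no repetition indicator.
note = element note { text }
```

If an element must not be empty, text+ will not enforce that; a datatype with a length or pattern restriction would be needed instead.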

Pattern: named values

If multiple elements have the same content, you can define a pattern as equal to that content and then use the pattern when you define the elements that contain it.

The bad block below is not invalid Relax NG, but it is bad because it is unnecessarily repetitive, and writing the same code more than once (in this case, the same content model three times) creates an opportunity for inconsistency. The version below that is an improvement because it starts by defining the name stuff as representing a pattern that can then be used in content models. Note that the first line does not define an element or an attribute (those keywords are missing), so what it means is that you can use the name you’ve just defined, stuff, in a content model wherever you might otherwise have written its expansion. The next three lines then use that name in their content models. The two code blocks have the same meaning, but the second is better because it avoids the repetition.
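The contrast can be sketched as follows. The name stuff comes from the text above; the element names (p, item, note, em, strong, title) and the content model itself are hypothetical, and em, strong, and title are assumed to be defined elsewhere. The repetitive version is shown as comments so that the fragment contains only one definition of each element.

```rnc
# Anti-pattern: the same content model is written out three times.
#   p    = element p    { mixed { (em | strong | title)* } }
#   item = element item { mixed { (em | strong | title)* } }
#   note = element note { mixed { (em | strong | title)* } }

# Pattern: define the shared content once. There is no element or attribute
# keyword on this line, so stuff is only a name for a pattern.
stuff = mixed { (em | strong | title)* }

# Each element then refers to the named pattern.
p = element p { stuff }
item = element item { stuff }
note = element note { stuff }
```

If the shared content model later changes, only the definition of stuff has to be edited.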

Pattern: where to declare attributes in Relax NG

Anti-pattern: not ensuring that @xml:id values are unique

The XML below has an error: the value time is assigned to the @xml:id attribute for two different elements. If in your schema you define @xml:id as being of type xsd:ID (and you should), all @xml:id values in the entire document must be unique.

<types>
    <type xml:id="money"/>
    <type xml:id="time"/>
    <type xml:id="weight"/>
    <type xml:id="percent"/>
    <type xml:id="sport"/>
    <type xml:id="animal"/>
    <type xml:id="misc"/>
</types>
<units>
    <unit xml:id="dollar" n="1"/>
    <unit xml:id="mDollar" n="1000000"/>
    <unit xml:id="time" hr="1"/>
    <unit xml:id="dTime" hr="24"/>
    <unit xml:id="yTime" yr="1"/>
</units>
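A schema fragment along the following lines would let a validator that performs ID checking report the duplicate time value. It is only a sketch: the element and attribute names are taken from the XML above, but the content models, the xsd:integer datatypes, and the definition name id are assumptions.

```rnc
types = element types { type+ }
type = element type { id }

units = element units { unit+ }
# Assumption: each unit has exactly one of the n, hr, or yr attributes.
unit = element unit { id, (n | hr | yr) }

# Declaring the value as xsd:ID is what makes uniqueness checkable:
# no two xml:id values in the document may be the same.
id = attribute xml:id { xsd:ID }

n = attribute n { xsd:integer }
hr = attribute hr { xsd:integer }
yr = attribute yr { xsd:integer }
```

With this declaration, renaming one of the two conflicting values (for example, giving the unit a different xml:id) removes the error.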

Pattern: reusing element and attribute names