Digital humanities

Maintained by: David J. Birnbaum. [Creative Commons BY-NC-SA 3.0 Unported License] Last modified: 2023-03-06T23:49:57+0000

Test #2: Relax NG: Answers

Sample schema

Your schema does not have to look the same as ours, but one possible schema for this XML document (and other songs of the same type) is:

# <place> elements have a kind attribute with values like building, nature, etc.
# <chara> elements have a ref attribute that contains a standardized character name
# The ref value is defined as text, rather than as an or-group of fixed values, to accommodate other shows
place = element place { kind, text }
kind = attribute kind { text }
chara = element chara { ref, text }
ref = attribute ref { text }


The schema tries to use self-documenting names and explanatory comments wherever possible, but here are a few additional details:

Common issues

Flexibility: The element <direction> indicates a stage direction, which will not always appear at the beginning of the lyrics section, and there’s no reason there couldn’t be multiple stage directions within a song. Putting it in a schema without attaching a repetition indicator and without allowing some flexibility to its location risks limiting the schema to just this song.
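
One way to build in that flexibility is sketched below. The element names <lyrics>, <stanza>, and <line> are our assumptions for illustration (only <direction> is named above), and your own schema may organize the song differently:

```rnc
start = lyrics
# Stage directions and stanzas may alternate freely, in any order and number
lyrics = element lyrics { (direction | stanza)+ }
direction = element direction { text }
stanza = element stanza { line+ }
line = element line { text }
```

The repeatable choice (direction | stanza)+ permits zero, one, or many stage directions at any position between stanzas. It also permits a song with no stanzas at all; if that is too loose, a pattern such as (direction*, stanza)+, direction* requires at least one stanza.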

Empty elements: The element <toneshift/> is an empty element. Empty elements can be represented in two ways that are syntactically different but have exactly the same meaning:

<toneshift/>
<toneshift></toneshift>

These notations mean exactly the same thing, but we prefer using the self-closing empty-element single-tag version because it is more self-documenting and easier to understand at a glance.

The HTML specification recommends using the single-tag notation only for elements that must always be empty and the combination of start- and end-tag with nothing between them for elements that could have content in principle but happen not to in a particular location. That recommendation is not part of XML, where the two notations are exactly synonymous and can be used in exactly the same locations.
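
On the schema side, a single declaration covers both notations. This is a minimal sketch that assumes <toneshift> carries no attributes (nothing above says either way); the start line is there only so that the fragment stands on its own:

```rnc
start = toneshift
# Assumption: <toneshift> has no attributes and never has content
toneshift = element toneshift { empty }
```

Because the two notations mean the same thing in XML, a validator cannot tell them apart, and both satisfy the empty pattern.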

Content models: The content model of an element lists only its attributes plus whatever (elements or text) appears directly between its start- and end-tags, that is, its children but not its deeper descendants. For example, if you have a <song> element that contains a <stanza> element that, in turn, contains lines, you wouldn’t model the <song> as:

song = element song { stanza+, line+ }

because <song> doesn’t have any <line> child elements (although it does have <line> descendants).
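
A sketch of the layered alternative follows, in which each content model mentions only that element's own children. The repetition indicators and the text content of <line> are assumptions, since the sample XML is not reproduced here:

```rnc
start = song
# <song> lists only its children: the stanzas
song = element song { stanza+ }
# <line> is modeled one level down, inside <stanza>
stanza = element stanza { line+ }
line = element line { text }
```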