Maintained by: David J. Birnbaum (djbpitt@gmail.com) Last modified: 2021-12-27T22:03:54+0000
Because scholars often feel passionate about defending their arguments, academic journals are frequently home to polemical exchanges of responses, rejoinders and ripostes, some of which may seem surprisingly strident. You can see an example of the genre from my own field (Slavic linguistics) at https://openaccess.leidenuniv.nl/bitstream/handle/1887/1902/344_073.pdf. (Biographic aside: Horace Lunt, to whom Frederik Kortlandt is responding in this article, was my professor.)
Because professors may be curiously insensitive to the virtues of brevity, rejoinders can easily outstrip in size the articles to which they are responding, and in a cascading polemic they can seem to grow insatiably. In an effort to combat this tendency, and to build in a natural mechanism for winding down such exchanges, a few journals have instituted size limits such as “authors are welcome to respond, but the response must be no longer than half the length of the original.” The hope is that one or the other of the participants in such a discussion will walk away before it deteriorates into something like “So’s your old man!” (4 words), “Sez you!” (2), and “Jerk” (1).
Imagine that you are the editor of an academic journal where submissions are managed in XML, and you are responsible for enforcing a “no more than half the length” policy like the one described above. The structure of your journal is described by:
start = journal
journal = element journal { issue+ }
issue = element issue { article+ }
article = element article { title, author, date, content }
content = element content { p+ }
title = element title { text }
author = element author { text }
date = element date { xsd:date }
p = element p { text }
(This is clearly a simplification. In real life the structure would allow emphasis, bibliography, footnotes, etc.)
Here is an XML file that is valid against that schema. For convenience it contains only one issue, but in Real Life there might be multiple <issue> elements:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="journal.rnc" type="application/relax-ng-compact-syntax"?>
<journal>
  <issue>
    <article>
      <title>My favorite XPath function</title>
      <author>Eric Gratta</author>
      <date>2012-09-05</date>
      <content>
        <p>You can do anything with tokenize(). You can divide text into words, breaking on white space, on punctuation, or in any other way that meets your needs. You can even use tokenize() in a nested context, perhaps breaking on white space and then, within each white-space-delimited word, on hyphens. How cool is that!</p>
      </content>
    </article>
    <article>
      <title>Reflections on tokenize()</title>
      <author>David J. Birnbaum</author>
      <date>2012-10-10</date>
      <content>
        <p>Gratta’s attention to tokenize() is misplaced. matches() is much more useful.</p>
        <p>Sometimes, though, contains() alone is enough.</p>
      </content>
    </article>
    <article>
      <title>On the relative merits of tokenize() and matches()</title>
      <author>Eric Gratta</author>
      <date>2012-10-23</date>
      <content>
        <p>No way matches() is as cool as tokenize()!</p>
      </content>
    </article>
  </issue>
</journal>
Your task is to write a Schematron schema that will ensure that the <content> section of each <article> is no more than half the length of the <content> section of the immediately preceding <article>. As a way of simplifying the task for pedagogical purposes, we are assuming that the only articles in the XML document are part of the polemic, so you don’t have to worry about other articles that wouldn’t be governed by the length restriction. We are also putting the articles in chronological order, so you may assume that when we say “the immediately preceding article,” we mean the article that immediately precedes the current one both in order of publication and in document order in the XML document.
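Before writing the Schematron, it can help to know which articles a correct schema ought to flag. The following is a minimal Python sketch of the half-length rule, not a Schematron solution. The function names and the miniature SAMPLE document are hypothetical, and the sketch assumes that the first article in each issue has no predecessor to be measured against and that length is counted in characters after white-space normalization.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical miniature journal that follows the schema above.
SAMPLE = """<journal>
  <issue>
    <article>
      <title>Opening article</title>
      <author>A</author>
      <date>2012-09-05</date>
      <content><p>one two three four five six seven eight</p></content>
    </article>
    <article>
      <title>First response</title>
      <author>B</author>
      <date>2012-10-10</date>
      <content><p>one two three four</p></content>
    </article>
    <article>
      <title>Second response</title>
      <author>A</author>
      <date>2012-10-23</date>
      <content><p>one two three</p></content>
    </article>
  </issue>
</journal>"""


def normalize_space(text):
    """Approximate XPath normalize-space(): collapse white-space runs, then trim."""
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")


def content_length(article):
    """Character count of an article's <content>, after normalization."""
    content = article.find("content")
    return len(normalize_space("".join(content.itertext())))


def too_long_responses(xml_text):
    """Return titles of articles longer than half of the preceding article."""
    root = ET.fromstring(xml_text)
    offenders = []
    for issue in root.iter("issue"):
        articles = issue.findall("article")
        # Pair each article with its immediate predecessor in document order;
        # the first article in an issue has no predecessor and is never paired.
        for previous, current in zip(articles, articles[1:]):
            # Doubling the response avoids halving an odd length.
            if 2 * content_length(current) > content_length(previous):
                offenders.append(current.findtext("title"))
    return offenders


if __name__ == "__main__":
    print(too_long_responses(SAMPLE))  # ['Second response']
```

In this sample the lengths are 39, 18, and 13 characters. The first response passes (36 is no more than 39) and the second fails (26 is more than 18). Comparing twice the response against the original, rather than the response against half the original, is one way to sidestep rounding when the original has an odd length.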
Three complications:
normalize-space() function. And if you haven’t used this function, this would be a good time to look it up in Michael Kay. We use it all the time, and you’ll probably need it for your projects.
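To see why white-space normalization matters when measuring length, here is a small Python sketch that imitates what normalize-space() does to a string: it strips leading and trailing white space and collapses each internal run of spaces, tabs, and line breaks into a single space. The helper names and the two sample fragments are hypothetical, and the regular expression is only an approximation of the XPath function.

```python
import re
import xml.etree.ElementTree as ET

# The same text, once typed compactly and once pretty-printed.
COMPACT = "<content><p>Sez you!</p></content>"
PRETTY = """<content>
    <p>Sez
       you!</p>
</content>"""


def normalize_space(text):
    """Approximate XPath normalize-space(): collapse white-space runs, then trim."""
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")


def raw_length(xml_text):
    """Length of the element's full text, counting every white-space character."""
    return len("".join(ET.fromstring(xml_text).itertext()))


def normalized_length(xml_text):
    """Length of the same text after white-space normalization."""
    return len(normalize_space("".join(ET.fromstring(xml_text).itertext())))


if __name__ == "__main__":
    print(raw_length(COMPACT), normalized_length(COMPACT))  # 8 8
    print(raw_length(PRETTY) > raw_length(COMPACT))         # True
    print(normalized_length(PRETTY))                        # 8
```

Without normalization, the indentation and line breaks that an editor adds when pretty-printing would count toward an article's length, so two files with identical wording could be measured differently.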