Digital humanities


Maintained by: David J. Birnbaum (djbpitt@gmail.com) [Creative Commons BY-NC-SA 3.0 Unported License] Last modified: 2025-11-02T16:57:17+0000


XQuery drive-by

Overview

XQuery is a programming language designed primarily to work with XML databases, that is, databases that engage with XML content in an XML-idiomatic way, such as by using XPath path expressions and functions. We introduce XML databases in an appendix below, but because XQuery can also be used with XML that has not been stored into an XML database, most of this tutorial is about XQuery in general, that is, without any dependency on a database.

In Real Life we often use XQuery for exploratory data analysis, where we run queries over our XML to learn how it is structured. You’ve already practiced performing exploratory data analysis with XPath, but, as we’ll see below, XQuery is more powerful than XPath, which means, among other things, that it may be easier to ask some questions about XML in XQuery than in XPath.

XQuery basics

Every XPath expression is an XQuery expression

XQuery, like XSLT, is built on top of XPath, but unlike XSLT, which is written in XML syntax, XQuery uses a syntax similar to that of XPath, and every XPath expression is also a valid XQuery expression. For example, the expression 1 + 1 is a complete and valid XPath expression and also a complete and valid XQuery expression that evaluates to the value 2. Similarly, if http://dh.obdurodon.org/wilde-testimony.xml points to an XML document, you can use the standard XPath doc() function to select it, so the standard XPath expression:

doc("http://dh.obdurodon.org/wilde-testimony.xml")//speech => count()

will select all <speech> descendants of the document and return a count of them. Because this is a complete and valid XPath expression, it is also a complete and valid XQuery expression.
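
The arrow operator (=>) used above passes the value on its left as the first argument to the function on its right. As a small self-contained sketch (the literal numbers are ours, chosen only for illustration), the following two expressions are equivalent, and both are valid as XPath and as XQuery:

```xquery
(: ordinary function-call syntax and arrow syntax, side by side;
   each expression evaluates to 3, so the result is the sequence (3, 3) :)
count((10, 20, 30)),
(10, 20, 30) => count()
```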

The principal feature that XQuery adds to XPath is FLWOR

XQuery supports FLWOR (pronounced flower) expressions, where the letters stand for:

  • for

  • let

  • where

  • order by

  • return

A FLWOR expression must start with either a for or let statement, must end with a single return statement, and may include any of the other statements (including more instances of for and let, but not of return) as needed.

Pure XPath supports some of these statement types, but not all of them, and not as many ways of combining them as XQuery does. There are a few additional statement types that are also allowed within FLWOR expressions, but in this tutorial we concentrate on the ones above.
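
As a minimal sketch that uses all five statement types without depending on any XML document (the variable names $x and $square are ours), consider:

```xquery
(: square each integer, discard squares that are not greater than 1,
   and return the survivors from largest to smallest: (9, 4) :)
for $x in (1, 2, 3)
let $square := $x * $x
where $square gt 1
order by $square descending
return $square
```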

The following is a FLWOR expression that returns a sequence of all speeches not by Oscar Wilde himself in the document, sorted by speaker name:

let $testimony as document-node() := doc("http://dh.obdurodon.org/wilde-testimony.xml")
for $speech as element(speech) in $testimony//speech
where $speech/speaker ne "Wilde"
order by $speech/speaker
return $speech

The FLWOR expression correctly starts with a for or let statement (in this case a let), ends with a single return, and includes for, where, and order by statements in between. Here’s what each line means:

  1. let $testimony as document-node() := doc("http://dh.obdurodon.org/wilde-testimony.xml")

    This expression binds the result of evaluating the expression on the right to the variable name $testimony. That expression uses the standard XPath doc() function to select a document, and the function is defined as returning the document node (which is the parent of the root element).

    The as document-node() phrase specifies that the value retrieved by the expression on the right must be a document node. Datatyping is not required by the XQuery spec, but it is considered good practice because if the expression somehow returns something other than a document node, you want to raise an error immediately so that you can find and fix the problem. We strongly recommend using as phrases whenever you define a variable in XQuery.

    Note the following differences between XQuery and XSLT:

    • In XQuery we always precede a variable name with a dollar sign, both when we define the variable and when we use it. In XSLT we precede a variable name with a dollar sign only when we use the variable, but not when we define it. For example, in XSLT we can bind the value 10 to the variable $x with <xsl:variable name="x" select="10"/> (no dollar sign inside the value of the @name attribute) and then refer to it with, e.g., <xsl:value-of select="$x"/> (using the dollar sign when we refer to the variable). In the XQuery above, however, we use the dollar sign both when we define $testimony (line 1) and when we use it (line 2).

    • The operator that binds a value to a variable in XQuery is :=. This is sometimes informally (that is, it isn’t an official term and you won’t find it in the XQuery specification) called the walrus operator because it looks like the eyes and tusks of a walrus lying on its side—at least if you have a lively imagination!

  2. for $speech as element(speech) in $testimony//speech

    The part of this expression that follows the in keyword is a standard XPath path expression that selects all <speech> descendants of the document node (which is what the variable $testimony represents because that’s how we defined it in the first statement). The expression as a whole binds each of those elements, one at a time, to the variable $speech. The as element(speech) phrase says that the values processed in this for statement must be elements of type <speech>. As with the as clause in line 1, this part of line 2 is optional, but we strongly recommend including it. The code that follows a for expression (in this case, the last three lines of the XQuery script) will fire once for each item, which, in this case, means once for each <speech> element.

    There are technical reasons that a for expression is not officially a loop (ask us about this if you’re curious). What it has in common with a loop, though, is that it does something once for each item.

    It is common to refer to the XPath expression after the keyword in as the sequence variable because it identifies the sequence of items to which the following statements will be applied. The variable after the for keyword is commonly called the range variable because it ranges over the items in the sequence that it processes.

    The sequence of items to be processed doesn’t have to be a variable. The FLWOR expression:

    for $x in (1, 2, 3)
    return $x * 2

    returns the sequence (2, 4, 6). Here the sequence to be processed is not a variable; it is a sequence of literal integer values.

  3. where $speech/speaker ne "Wilde"

    This expression filters the items (<speech> elements) selected by the for expression and keeps only those that have a child <speaker> element that is not equal (using the standard XPath ne value-comparison operator to mean not equal to) to the string value Wilde.

    We could, alternatively, have omitted the where expression and used an XPath predicate to do the filtering:

    for $speech in $testimony//speech[speaker ne "Wilde"]

    These expressions are synonymous and will return the same results, although one or the other may be more efficient when used with an XML database (see below).

  4. order by $speech/speaker

    This expression sorts the speeches (the ones that survived the filtering) by the value of the <speaker> child of the <speech> element. In practice the speeches by each distinct speaker are returned in document order (writing stable order by instead of order by guarantees this), but we could subsort them (for example from shortest to longest) if we wanted to do that.

  5. return $speech

    This expression returns each speech that has undergone the processing above, that is, that has survived the filtering and been sorted by speaker name.
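
The subsorting mentioned in step 4 can be sketched by adding a second sort key after a comma. The example below is self-contained: the inline <testimony> element, its speaker names, and its <p> children are a hypothetical miniature invented for illustration, not the structure or content of the real Wilde document.

```xquery
(: hypothetical miniature data, so that the query runs without a network connection :)
let $testimony as element(testimony) :=
    <testimony>
        <speech><speaker>Wilde</speaker><p>placeholder</p></speech>
        <speech><speaker>Counsel B</speaker><p>a somewhat longer placeholder</p></speech>
        <speech><speaker>Counsel A</speaker><p>a longer placeholder</p></speech>
        <speech><speaker>Counsel A</speaker><p>short</p></speech>
    </testimony>
for $speech as element(speech) in $testimony/speech
where $speech/speaker ne "Wilde"
(: primary key: speaker name; secondary key: length of the speech text :)
order by $speech/speaker, string-length($speech/p)
return $speech
```

The two Counsel A speeches come first, shortest before longest, followed by the Counsel B speech; the speech by Wilde is filtered out.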

It is possible to obtain the same result with a pure XPath expression instead of an XQuery FLWOR expression:

doc("http://dh.obdurodon.org/wilde-testimony.xml")//speech[speaker ne "Wilde"] => 
sort((), function($x) {$x/speaker})

We’ve broken this over multiple lines for legibility, but it could have been written on a single line. The path expression selects speeches (mimicking, in this example, the XQuery for expression), we use a predicate (instead of XQuery where) to filter out Wilde’s speeches, and we use the standard XPath sort() function (instead of XQuery order by) to sort the speeches by speaker. Although the meaning of the two versions is the same and they return the same results, we find the XQuery version easier to read and understand.
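
The three-argument form of sort() used above takes the sequence to sort, a collation (the empty sequence means the default collation), and a function that computes the sort key for each item. A small sketch with literal strings of our own choosing:

```xquery
(: sort three strings by their length rather than alphabetically;
   the lengths are 4, 3, and 6, so the result is ("fig", "pear", "banana") :)
sort(("pear", "fig", "banana"), (), function($s) { string-length($s) })
```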

Using XQuery in <oXygen/>

To practice using XQuery inside <oXygen/> do the following:

  1. Open <oXygen/>.

  2. In the menu bar, select Window → Show View → XPath/XQuery Builder. This opens a panel (usually on the right side of the screen).

  3. At the upper left corner inside that panel click on the dropdown to the immediate left of the red rightward-pointing triangle (see the image below). From the dropdown list select Saxon-HE XQuery 12.5 (your version number might be different).

In the image of the XPath/XQuery builder below, the dropdown list where you select the XQuery version is circled in red and the red rightward-pointing triangle that you click to evaluate the XQuery is circled in blue:

[Image of XPath/XQuery builder interface]

You can use the XPath/XQuery builder to apply XQuery to a document that is open in <oXygen/> or to a document that you select using the standard XPath doc() function (or to a sequence of documents that you select with the standard XPath collection() function, about which see below). Let’s illustrate those two variants:

Applying XQuery to a remote document in <oXygen/>

You can practice applying XQuery to a remote document (addressed by using the standard XPath doc() function) by copying and pasting the following XQuery expression into the XPath/XQuery builder panel and evaluating it by clicking on the red rightward-pointing triangle:

let $testimony as document-node() := doc("http://dh.obdurodon.org/wilde-testimony.xml")
for $speech as element(speech) in $testimony//speech
where $speech/speaker ne "Wilde"
order by $speech/speaker
return $speech

<oXygen/> should open a panel at the bottom where it displays the 74 results. You can scroll to verify that the results are a sequence of <speech> elements, that none of the speeches are by Oscar Wilde, and that the speeches are sorted according to alphabetic order of the speaker name.

Applying XQuery to an open document in <oXygen/>

You can practice applying XQuery to a document that is open in <oXygen/> (instead of one that you access with the XPath doc() function) as follows:

  1. To open the Wilde testimony document in <oXygen/>, start by typing Command-u (Mac) or Control-u (Windows), which opens an Open URL dialog. Paste the URL for the document (http://dh.obdurodon.org/wilde-testimony.xml) into the box and click OK. The document should open in <oXygen/>. If you have other tabs open in <oXygen/>, select the one that contains the Wilde testimony document, which makes it the active document for the XPath/XQuery builder.

  2. Click inside the XPath/XQuery builder panel, remove any content that is already there, and type the following:

    for $speech as element(speech) in //speech
    where $speech/speaker ne "Wilde"
    order by $speech/speaker
    return $speech

  3. Use the red triangle to evaluate the XQuery against the open document.

The results should be the same as above. We changed the XQuery to remove the $testimony variable (and its binding to the result of evaluating the doc() function) because <oXygen/> will use the current active document as the context for XQuery processing unless we specify something else. That means that the first line of the revised XQuery means select all of the <speech> descendants of the document node of the active document and proceed from there.

Beyond the basics

The following topics are important where you need them, but you can do a lot with XQuery using just the information above. When you begin learning XQuery we recommend:

Namespaces in XQuery

If your XQuery selects or creates nodes that are in a namespace you must make the namespace information available to your XQuery script. There are two ways to do this:

  • Declare a default element namespace.

  • Bind the namespace to a prefix.

We didn’t have to think about namespaces with the Wilde testimony example above because that XML document does not use any namespaces, but the example below is in the TEI namespace, and therefore requires that we declare and use that namespace in our XQuery.

A default element namespace in XQuery applies only to elements, and not to attributes. There is no way to declare a default attribute namespace, which means that if you need to refer to attributes that are in a namespace, you must use a namespace prefix.

Default element namespace declaration

You can declare a default namespace that will apply to all elements (but not attributes; see the note above) mentioned in your XQuery (both those you select from the XML you’re processing and those you might create for your output) by including a declaration like the following at the beginning of your XQuery:

declare default element namespace "http://www.tei-c.org/ns/1.0";

The part in quotation marks needs to match the namespace used in your document. Since the bad-hamlet.xml document that we process below is in the TEI namespace, we’ve made the TEI namespace the default.

XQuery statements that begin with the keyword declare go at the beginning of the XQuery document and each declare statement must end with a semicolon.

You can find all 357 speeches by Hamlet in the play with the following XQuery:

declare default element namespace "http://www.tei-c.org/ns/1.0";
let $play as document-node() := doc("http://dh.obdurodon.org/bad-hamlet.xml")
for $speech as element(sp) in $play//sp
where $speech/@who eq "Hamlet"
return $speech

If you omit the XML namespace declaration you will return no results because you will be asking for <sp> elements in no namespace instead of <sp> elements in the TEI namespace. Asking for something that does not exist is not an error; your XQuery will find all zero instances of <sp> elements in no namespace, keep only those spoken by Hamlet (there are none, of course), and return all zero of them. This is, as they say, probably not what you want. The takeaway is that namespace mistakes can be difficult to find and debug because they may not raise errors, so if your XQuery is expected to return results and it doesn’t, a common reason is that you’ve failed to specify a namespace correctly.

Namespace prefix binding

You can bind a namespace URL to a prefix with a statement like:

declare namespace tei="http://www.tei-c.org/ns/1.0";

You can then refer to elements in the TEI namespace by using the prefix, as in line 3 of the following example (in two places in that line):

declare namespace tei="http://www.tei-c.org/ns/1.0";
let $play as document-node() := doc("http://dh.obdurodon.org/bad-hamlet.xml")
for $speech as element(tei:sp) in $play//tei:sp
where $speech/@who eq "Hamlet"
return $speech

If you omit the tei: prefix in line 3 you’ll get no results because you’ll be asking for speeches in no namespace, and all speeches in this document are in the TEI namespace.
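
This silent failure can be reproduced without a network connection. The sketch below builds a two-speech TEI fragment inline (a hypothetical miniature, not the real Hamlet file) and counts <sp> elements first without and then with the prefix:

```xquery
declare namespace tei = "http://www.tei-c.org/ns/1.0";
(: hypothetical sample; both sp elements are in the TEI namespace :)
let $sample as element(tei:TEI) :=
    <TEI xmlns="http://www.tei-c.org/ns/1.0">
        <sp who="Hamlet"/>
        <sp who="Horatio"/>
    </TEI>
(: the unprefixed name asks for sp in no namespace and silently finds nothing;
   the prefixed name finds both elements, so the result is (0, 2) :)
return (count($sample//sp), count($sample//tei:sp))
```

No error is raised for the first count, which is what makes this kind of mistake easy to overlook.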

Input and output namespaces

What happens when the input XML is in one namespace (such as TEI) and the output you are creating is in a different namespace (such as HTML or SVG)? The XQuery default element namespace declaration applies to all elements, both those read from input XML and those created in the output as literal result elements, and there is no way to declare different default namespaces for input and output. You could make one of those namespaces the default and use a prefix for the other, but, for what it’s worth, we sometimes use namespace prefixes for both input and output, so that we don’t have to remember which one we’ve made the default. We’ll illustrate below how to manage namespaces in XQuery when we read XML in one namespace and create XML in a different namespace.

Unlike XQuery, XSLT is able to declare default namespaces for input and output separately. An xpath-default-namespace="http://www.tei-c.org/ns/1.0" attribute setting on the root <xsl:stylesheet> element specifies that the TEI namespace is the default namespace for elements read from the input document. An xmlns="http://www.w3.org/1999/xhtml" attribute setting on the same root element specifies that elements created in the output will be in the HTML namespace unless you say otherwise explicitly.
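
Returning to XQuery, the approach of using prefixes for both input and output can be sketched as follows. The inline <tei:sp> element is a hypothetical stand-in for real input, and the html: prefix is our own choice; any prefix bound to the HTML namespace would do.

```xquery
declare namespace tei = "http://www.tei-c.org/ns/1.0";
declare namespace html = "http://www.w3.org/1999/xhtml";
(: hypothetical input element in the TEI namespace :)
let $sample as element(tei:sp) :=
    <tei:sp who="Hamlet"><tei:speaker>Hamlet</tei:speaker></tei:sp>
(: the output element is in the HTML namespace; its content is read from TEI input :)
return <html:p>{string($sample/tei:speaker)}</html:p>
```

Because no default element namespace is declared, every element name in the query states its namespace explicitly, so there is nothing to remember about which namespace is the default.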

Creating XML output

Below we illustrate a transformation from TEI XML to HTML in two steps. First we create plain-text output; we then modify the query to create HTML output. The goal of this exercise is to illustrate how to read input in one namespace (TEI) and create output in a different namespace (HTML), and we create the plain-text output first just to separate the general query logic from the code used to manage the namespaced HTML output.

Creating plain-text output

The following XQuery creates a deduplicated list of speakers in each act of Hamlet:
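
The query can be reconstructed from the line-by-line explanation that follows and from the FLWOR embedded in the HTML version later in this section. The sketch below is such a reconstruction, with one statement per numbered explanation; the exact spelling of the newline string in the last line (written here as the character reference &#x0a;) is an assumption.

```xquery
declare namespace tei="http://www.tei-c.org/ns/1.0";
let $play as document-node() := doc("http://dh.obdurodon.org/bad-hamlet.xml")
for $act as element(tei:div) at $pos in $play//tei:body/tei:div
let $act-speakers as element(tei:speaker)+ := $act/descendant::tei:speaker
let $distinct-act-speakers as xs:string+ := distinct-values($act-speakers) ! string()
let $sorted-act-speakers as xs:string+ := sort($distinct-act-speakers)
let $sorted-act-speakers-string as xs:string := string-join($sorted-act-speakers, ", ")
return concat("Act ", $pos, ": ", $sorted-act-speakers-string, "&#x0a;")
```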

The output looks like:

<?xml version="1.0" encoding="UTF-8"?>Act 1: All, Bernardo, Cornelius and Voltimand, Francisco, Gertrude, Ghost, Hamlet, Horatio, King, Laertes, Marcellus, Marcellus and Bernardo, Marcellus and Horatio, Ophelia, Polonius
Act 2: First Player, Gertrude, Guildenstern, Hamlet, King, Ophelia, Polonius, Reynaldo, Rosencrantz, Rosencrantz and Guildenstern, Voltimand
Act 3: All, First Player, Gertrude, Ghost, Guildenstern, Hamlet, Horatio, King, Lucianus, Ophelia, Player King, Player Queen, Polonius, Prologue, Rosencrantz, Rosencrantz and Guildenstern
Act 4: Captain, Danes, Fortinbras, Gentleman, Gertrude, Guildenstern, Hamlet, Horatio, King, Laertes, Messenger, Ophelia, Rosencrantz, Rosencrantz and Guildenstern, Sailor, Servant
Act 5: All, First Ambassador, First Clown, Fortinbras, Gertrude, Hamlet, Horatio, King, Laertes, Lord, Osric, Priest, Second Clown

The XML declaration appears at the beginning of the output because we didn’t specify that we were creating plain-text output, and insofar as the default output type is XML, the transformation prepends the XML declaration to the actual output. In Real Life we would tell XQuery about the output type, which would cause the XML declaration to be omitted automatically, but since plain-text output is just an interim step toward our ultimate goal of producing HTML 5 output that uses XML syntax, we’ll leave it in place for now.

Here’s how each line of the XQuery works:

  1. Because the document is in the TEI namespace, we declare that namespace and bind it to the prefix tei:. We have to prepend this prefix to every element in the TEI input document that we mention. (We could, alternatively, have used a default element namespace and omitted the prefix.)

  2. We use the standard XPath doc() function to bind the document node of the play document to the variable $play.

  3. We use a for statement to process each act in the play, binding the <div> elements that represent each individual act, in turn, to the variable $act. The at $pos clause sets the value of the variable $pos to the position of each act, in turn, within the sequence of acts being processed. You can call the position variable whatever you want; we use $pos as a mnemonic for position.

  4. We use the <speaker> descendants of each act to represent the speakers. We could, alternatively, have used the @who attribute, but the <speaker> element is more user-friendly.

  5. We use the standard XPath distinct-values() function to deduplicate the sequence of speakers for each act.

    Although human readers know that the associated values are strings, from an XQuery perspective they have the datatype xs:untypedAtomic, which is the default datatype for element and attribute values read from an XML document unless we say otherwise. If we include the as xs:string+ phrase and omit the ! string() phrase we’ll be notified of a datatype error because we’ve specified that the value must be one or more strings and XQuery thinks it’s one or more untyped atomic values. If we omit the as xs:string+ phrase we’ll accept any datatype, and in that case we don’t need to use the ! string() phrase to convert the untyped atomic values to string values. We nonetheless recommend always specifying datatypes so that we’ll be notified if we’ve made a mistake that causes our code to select items of an unexpected type.

  6. We use the standard XPath sort() function to sort the distinct speaker names alphabetically.

  7. We use the standard XPath string-join() function to form the deduplicated sequence into a string of comma-separated values.

  8. We use the standard XPath concat() function to construct an output line for each act that combines the act number and the list of speaker names into a human-friendly statement. The string at the end is a newline character; its presence causes the information for each act to be printed on a separate line.
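
The positional variable (step 3) and the conversion from xs:untypedAtomic to xs:string (step 5) can be tried out on a small scale. In the sketch below the two <act> elements and their <speaker> children are hypothetical inline data, not the TEI markup of the play:

```xquery
(: two hypothetical acts, each with a few speaker elements :)
let $acts as element(act)+ := (
    <act><speaker>Hamlet</speaker><speaker>Ghost</speaker><speaker>Hamlet</speaker></act>,
    <act><speaker>Ophelia</speaker><speaker>Ophelia</speaker></act>
)
(: $pos is 1 for the first act and 2 for the second :)
for $act as element(act) at $pos in $acts
(: without "! string()" the "as xs:string+" declaration would raise a type error,
   because distinct-values() returns untyped atomic values here :)
let $distinct as xs:string+ := distinct-values($act/speaker) ! string()
return concat("Act ", $pos, ": ", string-join(sort($distinct), ", "))
```

This returns the two strings Act 1: Ghost, Hamlet and Act 2: Ophelia.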

In Real Life we might have combined some of the XPath functions within a single line, but we’ve separated them here for didactic reasons, so that we could focus on each one individually.
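
A more compact version, combining the conversion, deduplication, sorting, and joining steps with the arrow operator, might look like the following sketch. The variable name $speakers is ours, and the newline string is written as the character reference &#x0a;, which is an assumption about how one might spell it.

```xquery
declare namespace tei = "http://www.tei-c.org/ns/1.0";
let $play as document-node() := doc("http://dh.obdurodon.org/bad-hamlet.xml")
for $act as element(tei:div) at $pos in $play//tei:body/tei:div
(: convert to strings, deduplicate, sort, and join in a single pipeline :)
let $speakers as xs:string :=
    $act/descendant::tei:speaker ! string()
    => distinct-values() => sort() => string-join(", ")
return concat("Act ", $pos, ": ", $speakers, "&#x0a;")
```

The trade-off is that intermediate results no longer have names or declared types, so a mistake in the middle of the pipeline is harder to locate.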

Creating HTML output

The preceding XQuery constructs plain-text output, but what if we want to construct HTML output? Let’s try constructing a two-column table, with the act number in the first column and the comma-separated list of speakers in the second column. The output should look like the following:

Act Speakers
1 All, Bernardo, Cornelius and Voltimand, Francisco, Gertrude, Ghost, Hamlet, Horatio, King, Laertes, Marcellus, Marcellus and Bernardo, Marcellus and Horatio, Ophelia, Polonius
2 First Player, Gertrude, Guildenstern, Hamlet, King, Ophelia, Polonius, Reynaldo, Rosencrantz, Rosencrantz and Guildenstern, Voltimand
3 All, First Player, Gertrude, Ghost, Guildenstern, Hamlet, Horatio, King, Lucianus, Ophelia, Player King, Player Queen, Polonius, Prologue, Rosencrantz, Rosencrantz and Guildenstern
4 Captain, Danes, Fortinbras, Gentleman, Gertrude, Guildenstern, Hamlet, Horatio, King, Laertes, Messenger, Ophelia, Rosencrantz, Rosencrantz and Guildenstern, Sailor, Servant
5 All, First Ambassador, First Clown, Fortinbras, Gertrude, Hamlet, Horatio, King, Laertes, Lord, Osric, Priest, Second Clown

Below is one way to create this output (an explanation follows the code):


    
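declare namespace tei="http://www.tei-c.org/ns/1.0";
declare default element namespace "http://www.w3.org/1999/xhtml";
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";
declare option output:method "xhtml";
declare option output:media-type "application/xhtml+xml";
declare option output:omit-xml-declaration "no";
declare option output:html-version "5.0";
declare option output:indent "yes";
declare option output:include-content-type "no";
<html>
    <head>
        <title>Hamlet</title>
        <style type="text/css">
            /* illustrative rule only; the original CSS is not preserved in this copy */
            table, th, td {{
                border: 1px solid black;
                border-collapse: collapse;
            }}
        </style>
    </head>
    <body>
        <h1>Hamlet</h1>
        <table>
            <tr>
                <th>Act</th>
                <th>Speakers</th>
            </tr>{
            let $play as document-node() := doc("http://dh.obdurodon.org/bad-hamlet.xml")
            for $act as element(tei:div) at $pos in $play//tei:body/tei:div
            let $act-speakers as element(tei:speaker)+ := $act/descendant::tei:speaker
            let $distinct-act-speakers as xs:string+ := distinct-values($act-speakers) ! string()
            let $sorted-act-speakers as xs:string+ := sort($distinct-act-speakers)
            let $sorted-act-speakers-string as xs:string := string-join($sorted-act-speakers, ", ")
            return
                <tr>
                    <td>{$pos}</td>
                    <td>{$sorted-act-speakers-string}</td>
                </tr>
            }
        </table>
    </body>
</html>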

Here’s how it works:

  • Lines 1–2. We declare the TEI and HTML namespaces, making the HTML namespace the default. This means that all elements will be in the HTML namespace unless we say otherwise, and we can use the tei: prefix when we need to specify that an element (from the Hamlet document) is in the TEI namespace.

  • Lines 3–9. Line 3 declares the namespace used to specify serialization options, that is, details about how output should be formatted. The following options say that output will be HTML that uses XML syntax (line 4) and should identify itself as such during HTTP (e.g., web browser) access (line 5). The remaining options specify that we want to include the XML declaration (line 6), that the version of HTML we target is version 5 (line 7), that the output should be pretty-printed (line 8), and that we do not want to include an HTML <meta> element to specify the encoding (line 9), because when we create HTML using XML syntax and including an XML declaration, the UTF-8 encoding is specified as part of the XML declaration: <?xml version="1.0" encoding="UTF-8"?>.

  • Lines 10–41. The root element of the output document is <html>. We don’t specify the HTML namespace literally here because we’ve already established it as the default on line 2. When the output is created, the namespace declaration will automatically be written correctly on the root <html> element.

  • Lines 12–18. We create some minimal CSS styling to improve the appearance of the output. CSS property sets (after a CSS selector value) are wrapped in curly braces, and because single pairs of curly braces have a special meaning in XQuery (see below), we have to double them here to tell the processing to output them (as single, not doubled, pairs), which is what a browser will understand.

  • Lines 23–40. As described above, the data is output as a two-column table, with the act number in the first column and the alphabetized, comma-separated list of distinct speakers for each act in the second column. We create the table and a header row (lines 24–27) and then begin to create the data rows.

  • Lines 10–27. Everything from lines 10 through 27 is output literally (except that the doubled curly braces are undoubled, as described above), but beginning with line 28 we want to output not the literal XQuery, but the result of evaluating it. We tell the processor to switch from outputting literal text to processing XQuery and outputting the value by wrapping the XQuery in curly braces. The paired curly braces on lines 27 and 39, then, specify that whatever appears between them is XQuery that must be processed and evaluated, and not output literally.

  • Lines 28–35. These lines are identical to the XQuery that produces the plain-text output, above.

  • Lines 35–38. The return statement (line 34) returns a single <tr> element. Specifying a literal element as the value of the return statement will output everything literally unless we say otherwise. We want the tags to be output literally, but the values of the <td> elements must be determined by evaluating the variables (not just printing their names), so we need to wrap the content of those elements in curly braces. If we were to omit the curly braces, we would output the names of the variables (that is, the literal XQuery code) instead of the results of evaluating them.

We find it helpful to think of this type of XQuery as operating in two modes, which we think of as XQuery mode, where XQuery is evaluated, and XML mode, where XML is output literally. These modes are nested within one another as follows:

  • Line 1. We start with an XQuery declare statement, so we start in XQuery mode, and we remain there until we switch out of it.

  • Line 10. We switch from XQuery mode into XML mode by typing a literal XML element. At this point everything will be output literally unless we switch back into XQuery mode (which we do on line 27; see below).

  • Line 27. The opening curly brace at the end of line 27 embeds XQuery mode inside the current XML context. Everything that follows the opening curly brace will be evaluated as XQuery until we reach the matching closing curly brace (line 39). We can, however, switch into and out of XML mode inside this XQuery range, which we do on lines 35–38.

  • Line 35. Typing a literal XML element embeds XML mode inside our current XQuery context, and we remain there, in this case, until the closing curly brace on line 39 signals the end of the enclosing XQuery block.

  • Lines 36–37. Within XML mode we use paired curly braces to switch into XQuery mode to evaluate the variables and output the results.

The nesting of XQuery and XML mode in an XQuery script follows a straightforward pattern: we delimit XML blocks nested within an XQuery context by typing literal XML elements and we delimit XQuery blocks nested within an XML context by wrapping the XQuery in curly braces. The outermost block is XQuery because that’s how the file begins, and it contains a single XML block that consists of an <html> element (lines 10–41). That XML block contains a single XQuery block, demarcated by curly braces, on lines 28–38. That XQuery block contains a single XML block, demarcated by <tr> tags, on lines 35–38. That XML block includes two small XQuery blocks, demarcated by curly braces, on lines 36 and 37. XQuery and XML blocks can nest within one another as deeply as the task requires.

If you are like us, you’ll forget the curly braces sometimes, and for line 36 you might accidentally write <td>$pos</td> instead of <td>{$pos}</td>. If you do that, the cell will contain the literal four-character string $pos, instead of the actual integer number of the act. This is a common mistake, and one that is easily remedied: just return to your XQuery and insert the missing curly braces.
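
The three behaviors of curly braces discussed in this section (missing, single, and doubled) can be compared in one small self-contained sketch. The value 3 and the CSS rule are ours, chosen only for illustration:

```xquery
let $pos as xs:integer := 3
return (
    (: no braces: the text $pos is output literally, giving <td>$pos</td> :)
    <td>$pos</td>,
    (: single braces: the variable is evaluated, giving <td>3</td> :)
    <td>{$pos}</td>,
    (: doubled braces: output as single literal braces, giving
       <style>td { color: red; }</style> :)
    <style>td {{ color: red; }}</style>
)
```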


The XPath collection() function

In addition to the standard XPath doc() function, which selects a single document, XPath has a collection() function that selects a sequence of documents. Some parts of the syntax for using the collection() function are defined as implementation dependent, which means that different implementations may have different rules for specifying a set of files to be processed together as a collection. The rules for Saxon, which is the XQuery processor that we’re using inside <oXygen/>, are that the single argument to the function is a string representing the URL of the directory that holds the files to be processed (much as the single argument to the doc() function is a string that points to a single XML document). For example, collection("http://www.example.com/files") selects a sequence of all documents located in the files subdirectory at the specified URL. You can also select files on a local filesystem, e.g., collection("/Users/userid/files") selects all files in the specified directory.

The Saxon implementation of the collection() function includes the ability to specify several details, the most useful of which, in our experience, have been:

See the official Saxon documentation for details.
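
As an illustration, Saxon accepts query parameters appended to the collection URI. The sketch below assumes a hypothetical local directory /Users/userid/files and uses the select, recurse, and on-error parameters as we understand them from the Saxon documentation; check the documentation for your Saxon version before relying on them.

```xquery
(: count the XML documents in a directory and all of its subdirectories;
   the directory path is hypothetical :)
let $docs as document-node()* :=
    collection("file:///Users/userid/files?select=*.xml;recurse=yes;on-error=warning")
return count($docs)
```

Here select filters filenames by pattern, recurse descends into subdirectories, and on-error controls what happens when a file cannot be parsed.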

XML databases

Databases do more than just store data; they also build in functionality for quick retrieval by pregenerating indexes that can be used for fast searching. The effect is similar to that of a back-of-the-book index in a printed book; instead of flipping through all of the pages to find particular content, you can use a concise and organized index to find a list of the pages where that content appears. Different kinds of databases are designed to manage different types of data, and XML databases incorporate the ability to work quickly, efficiently, and idiomatically (using XPath path expressions and functions) with XML. XQuery can work with XML documents regardless of whether they have been uploaded to an XML database, but large projects that require quick retrieval typically use XML databases to improve the user experience.

There are two open-source XML databases in wide use:

Other XML databases include:

XQuery or XSLT?

XSLT and XQuery have overlapping functionality in that both can read XML (and other types of documents), operate over them, and emit XML (and other) results. So which should you use? Some developers use one or the other exclusively, but, for what it’s worth, our experience has been that:

The choice between XSLT and XQuery is not necessarily an either/or matter. XQuery supports the standard XPath transform() function, which performs an XSLT transformation, so it’s possible to perform XSLT transformations as part of XQuery processing. XPath also includes a load-xquery-module() function, which makes it possible to run XQuery from within an XSLT transformation. (The load-xquery-module() function is supported in Saxon EE, but not Saxon HE or PE, so if you need to use it, select Saxon EE instead of HE as your XSLT processor.)
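
A minimal sketch of calling transform() from XQuery follows. The stylesheet and the source document are both hypothetical and supplied inline so that the example is self-contained; whether transform() is available depends on your processor and version, so treat this as an illustration of the calling convention rather than a recipe.

```xquery
(: a tiny hypothetical stylesheet, passed to transform() as a string :)
let $stylesheet as xs:string := '
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
        <xsl:template match="/">
            <result><xsl:value-of select="count(//item)"/></result>
        </xsl:template>
    </xsl:stylesheet>'
(: a hypothetical source document with two item elements :)
let $source as document-node() := document { <list><item/><item/></list> }
(: transform() takes a map of options and returns a map of results;
   the principal result is stored under the key "output" :)
return transform(map {
    "stylesheet-text": $stylesheet,
    "source-node": $source
})?output
```

In a real project the stylesheet would more likely be stored in a file and supplied with the stylesheet-location option instead of stylesheet-text.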

Further reading

There are two good general XQuery books:

If you use eXist-db as your XML database, we also recommend Erik Siegel and Adam Retter, eXist: a NoSQL document database and application platform, O’Reilly, 2014.

The official specifications are: