Maintained by: David J. Birnbaum (djbpitt@gmail.com)
Last modified: 2023-03-03T15:57:47+0000
This test has two required parts plus an optional bonus (extra credit) section. The
first part asks questions about your understanding of XPath and the second asks you
to create XPath expressions and use them to learn about a Bad Hamlet file
similar to the one you’ve been using for practice. You’ll find the file at http://dh.obdurodon.org/even-worse-hamlet.xml.
This file contains altered content that is different from the Bad Hamlet
version that you’ve been using in your XPath assignments, so be sure to work with
this new file.
Don’t forget to set the XPath version in the <oXygen/> XPath toolbar or XPath builder to 3.1. You may also want to revisit our “XPath functions we use most” tutorial.
Define nodes, sequences and atomic values. Give an example of how each of those concepts might arise when you use XPath to explore Hamlet in <oXygen/>. Your examples of these three concepts might involve either XPath expressions themselves or the results that XPath expressions return.
What is the difference between an axis and a predicate in a path expression? To answer this question, give an example of each within an XPath expression, explain how they are distinguished syntactically (that is, how each is spelled when used in an XPath expression), and explain what each contributes to the overall meaning of the XPath expression you use to illustrate them.
Explain the difference between the simple map operator ! and the arrow operator =>.
For example, consider the two expressions //sp ! count(.) and //sp => count() and how
they return different results. Give one example each of a reasonable way you might use
these operators to explore Hamlet.
The functions we used to answer the following questions include contains(), count(),
distinct-values(), not(), sort(), and string-join(). All of these are described in
Michael Kay except sort(), because it was introduced in XPath 3.1 and Mike’s book was
written when 2.0 was the most recent version. The sort() function returns a sequence
of items sorted into ascending order, which means alphabetical order for strings and
numeric order for numbers. There may be more than one correct answer to some of the
questions.
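As a quick illustration of sort() that does not involve the Hamlet file, here is a minimal sketch that uses two literal sequences invented for this example. Evaluate each expression separately in the <oXygen/> XPath toolbar with the version set to 3.1. The ordering of strings depends on the default collation of your XPath processor, although for plain names like these the result should be the same under any reasonable collation.

```xpath
(: Strings sort alphabetically: "Gertrude", "Hamlet", "Ophelia" :)
sort(('Ophelia', 'Hamlet', 'Gertrude'))

(: Numbers sort numerically, lowest to highest: 2, 9, 10 :)
sort((10, 9, 2))
```

Because sort() accepts any sequence as its argument, it can be applied to the result of another expression as well as to a literal sequence like the ones above.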
Questions 5–9 build on one another. If you get stuck at some point, you can still receive partial credit for the following questions by explaining and illustrating how you would answer them if you had the requisite input. For example, if you can’t get the 77 lines you want for question 5, select some alternative lines as input into question 6 and describe and illustrate how you would find the speakers of speeches that contain those lines.
All line elements (<l>) in the play are supposed to have @n attributes, but some
don’t, which is a markup mistake. What XPath expression will select the lines that
don’t have @n attributes? (Hint: There are five such lines.)
Building on the preceding question, what XPath expression will tell you how many
such lines there are? Your expression must return a single integer value, that
is, XPath needs to do the counting instead of returning the lines and your
finding the answer with your human eyeballs by looking next to the Description.
Hamlet’s Ghost (referred to as Ghost), although not appearing much, is an
important symbol in the play as it represents Hamlet’s dead father. What XPath
expression finds the scenes where Ghost is featured as a speaker? (Hint:
There are 2 such scenes.)
What XPath expression finds all speeches spoken by Ghost? Your XPath
expression must select the speeches themselves, and not just the speakers.
(Hint: There are 14 such speeches.)
What XPath expression will find every line (<l> or <ab> element) in which the name
Hamlet is spoken? Caution: There are lines that contain stage direction (<stage>)
elements that mention Hamlet’s name, but being mentioned inside a stage direction
isn’t the same as being spoken. Your XPath expression must include only lines where
the name Hamlet is spoken within speech. (Hint: There are 77 such lines, 10 instances
of <l> and 67 of <ab>.)
What XPath expression will return the speakers of each speech that contains a line
(<l> or <ab> element) that mentions Hamlet? (Hint: There are 68 such speakers because
some speeches contain more than one line that mentions Hamlet. Some of the speaker
names are repeats because the same person may have multiple speeches that mention
Hamlet by name.)
What expression would deduplicate the results of the last expression? In other words, you should return a sequence of strings where each name is listed only once. (Hint: There are 13 such speaker names.)
What XPath expression will sort the sequence in alphabetical order?
What XPath expression will return the sequence as a comma-separated list?
What XPath expression will return a deduplicated list of all element names
within the document? (Hint: You’ll need the name() function, which you can look up
in Michael Kay. There are 28 distinct element names.)
What XPath expression will select all speech (<sp>) elements that have both <l> and
<ab> children? (Hint: There are 7 such speeches.)
What XPath expression will return the ratio of <l> to <ab> children for each of the
speeches selected in the previous step and sort them from lowest to highest? (Hint:
There are 7 such ratios, ranging from a low of 0.117 to a high of 6, and the number 1
appears twice in that list because two of the speeches in question have the same
number of elements of both types.)
Given the 7 values in the preceding question, what XPath expressions will return just the lowest value, just the highest value, and just the average (arithmetic mean) of all 7 values? (Hint: You’ll want to look up the appropriate functions in Michael Kay.)
Write your answers in a properly formatted markdown file with a filename that conforms to our usual filenaming conventions, including an .md filename extension, and upload it to Canvas. You can remind yourself about markdown syntax at the GitHub three-minute guide to Mastering markdown that you read earlier. The test is open book and you can use any references you’d like, except that you cannot receive help from another person.
Should you have any questions, please ask in the #xpath channel in our Slack workspace. We can’t give you the answer, but we’ll do whatever we can short of that to help.