Digital humanities


Maintained by: David J. Birnbaum (djbpitt@gmail.com) [Creative Commons BY-NC-SA 3.0 Unported License] Last modified: 2016-11-16T03:04:35+0000


SVG assignment #2 answers

Overview

The basic task of SVG assignment 2 was to create an SVG bar graph of the Democratic results of the 2012 US presidential election. See the assignment page for a more detailed description, suggestions, and a link to the input data in XML form.

Simple solution

Here is one possible solution. It isn’t easy to follow because it uses a lot of hard-coded numbers; we provide a more legible version below:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://www.w3.org/2000/svg"
    version="2.0" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="#all">
    <xsl:output method="xml" indent="yes"/>
    <xsl:template match="/">
        <svg height="375">
            <g transform="translate(30, 330)">
                <line x1="20" x2="20" y1="0" y2="-320" stroke="black" stroke-width="1"/>
                <line x1="20" x2="1550" y1="0" y2="0" stroke="black" stroke-width="1"/>
                <line x1="20" x2="1550" y1="-150" y2="-150" stroke="black" opacity="0.5"
                    stroke-dasharray="8 4" stroke-width="1"/>
                <text x="10" y="5" text-anchor="end">0%</text>
                <text x="10" y="-145" text-anchor="end">50%</text>
                <text x="10" y="-295" text-anchor="end">100%</text>
                <xsl:apply-templates select="//state"/>
            </g>
        </svg>
    </xsl:template>
    <xsl:template match="state">
        <xsl:variable name="xPosition" select="(position() - 1) * 30"/>
        <xsl:variable name="totalVotes" select="sum(candidate)"/>
        <xsl:variable name="demVotes" select="candidate[@party = 'Democrat']"/>
        <xsl:variable name="demPer" select="$demVotes div $totalVotes"/>
        <xsl:variable name="acro" select="@acro"/>
        <rect x="{$xPosition + 22}" y="-{$demPer * 300}" stroke="black" stroke-width=".5"
            fill="blue" width="20" height="{$demPer * 300}"/>
        <text x="{$xPosition + 20 div 2 + 22}" y="20" text-anchor="middle">
            <xsl:value-of select="$acro"/>
        </text>
    </xsl:template>
</xsl:stylesheet>

Using variables to improve legibility, development, and maintenance

Here we replace a lot of the hard-coded numbers with variables, which we find easier to read. They’re also easier to maintain; should you decide during development to change the width of the bars, for example, it’s easier to change the value you’ve used to declare a variable called $barWidth than to pick apart all of the literal numbers in the code to find the one where you specify that width. We’ve documented the use of variables in XML comments, which is what we’d do in Real Life, as well:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://www.w3.org/2000/svg"
    version="2.0" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="#all">
    <xsl:output method="xml" indent="yes"/>
    <!--
        Global variables (available anywhere in the stylesheet):
        $barWidth = width of rectangular bar
        $barInterval = distance between the left sides of adjacent bars,
            equal to $barWidth plus the inter-bar spacing
        $barHeight = height in pixels of a bar that represents 100%; using 300
            rather than 100 has the effect of tripling the percentages, stretching
            the bars vertically so that the differences are easier to see
        $barShift = shift bars and their labels to the right, without moving
            the axes
    -->
    <xsl:variable name="barWidth" select="20"/>
    <xsl:variable name="barInterval" select="$barWidth + 10"/>
    <xsl:variable name="barHeight" select="300"/>
    <xsl:variable name="barShift" select="22"/>
    <xsl:template match="/">
        <svg height="375">
            <g transform="translate(30, 330)">
                <!--
                    The height of the y axis is hard-coded
                    The length of the x axis is calculated according to the number of states
                    There's a dashed line at 50% on the y axis
                -->
                <line x1="20" x2="20" y1="0" y2="-320" stroke="black" stroke-width="1"/>
                <line x1="20" x2="{count(//state)*$barInterval + 20}" y1="0" y2="0" stroke="black"
                    stroke-width="1"/>
                <line x1="20" x2="{count(//state)*$barInterval + 20}" y1="-{$barHeight div 2}"
                    y2="-{$barHeight div 2}" stroke="black" opacity="0.5" stroke-dasharray="8 4"
                    stroke-width="1"/>
                <text x="10" y="5" text-anchor="end">0%</text>
                <text x="10" y="{5 - $barHeight div 2}" text-anchor="end">50%</text>
                <text x="10" y="{5 - $barHeight}" text-anchor="end">100%</text>
                <xsl:apply-templates select="//state"/>
            </g>
        </svg>
    </xsl:template>
    <xsl:template match="state">
        <xsl:variable name="statePos" select="position()-1"/>
        <xsl:variable name="xPosition" select="$statePos*$barInterval"/>
        <xsl:variable name="totalVotes" select="sum(candidate)"/>
        <xsl:variable name="demVotes" select="candidate[@party='Democrat']"/>
        <xsl:variable name="demPer" select="$demVotes div $totalVotes"/>
        <xsl:variable name="acro" select="@acro"/>
        <rect x="{$xPosition + $barShift}" y="-{$demPer * $barHeight}" stroke="black"
            stroke-width=".5" fill="blue" width="{$barWidth}" height="{$demPer*$barHeight}"/>
        <text x="{$xPosition + $barWidth div 2 + $barShift}" y="20" text-anchor="middle">
            <xsl:value-of select="$acro"/>
        </text>
    </xsl:template>
</xsl:stylesheet>

Even better would be to specify the datatype of each variable, that is, whether it’s supposed to be a string, an integer (whole number), a double (number that may have digits after the decimal point), etc. If you don’t specify the datatype, XSLT will usually figure it out for you, but sometimes it guesses wrong, so in our own work we always specify the type explicitly by using the @as attribute on the variable declaration. The most common datatypes are strings (as="xs:string"), integers (as="xs:integer"), and doubles (as="xs:double"). You can read more about XSLT datatypes in the IBM developerWorks Improve your XSLT 2.0 stylesheets with types and schemas tutorial. Here’s our second solution with datatype information added to the variable declarations:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://www.w3.org/2000/svg"
    version="2.0" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="#all">
    <xsl:output method="xml" indent="yes"/>
    <!--
        Global variables (available anywhere in the stylesheet):
        $barWidth = width of rectangular bar
        $interbarSpacing = gap between adjacent bars, equal to half of $barWidth
        $barInterval = distance between the left sides of adjacent bars,
            equal to $barWidth plus $interbarSpacing
        $barHeight = height in pixels of a bar that represents 100%; using 300
            rather than 100 has the effect of tripling the percentages, stretching
            the bars vertically so that the differences are easier to see
        $barShift = shift bars and their labels to the right, without moving
            the axes
    -->
    <xsl:variable name="barWidth" select="20" as="xs:integer"/>
    <xsl:variable name="interbarSpacing" select="$barWidth div 2" as="xs:double"/>
    <xsl:variable name="barInterval" select="$barWidth + $interbarSpacing" as="xs:double"/>
    <xsl:variable name="barHeight" select="300" as="xs:integer"/>
    <xsl:variable name="barShift" select="22" as="xs:integer"/>
    <xsl:template match="/">
        <svg height="375">
            <g transform="translate(30, 330)">
                <!--
                    The height of the y axis is hard-coded
                    The length of the x axis is calculated according to the number of states
                    There's a dashed line at 50% on the y axis
                -->
                <line x1="20" x2="20" y1="0" y2="-320" stroke="black" stroke-width="1"/>
                <line x1="20" x2="{count(//state) * $barInterval + 20}" y1="0" y2="0" stroke="black"
                    stroke-width="1"/>
                <line x1="20" x2="{count(//state) * $barInterval + 20}" y1="-{$barHeight div 2}"
                    y2="-{$barHeight div 2}" stroke="black" opacity="0.5" stroke-dasharray="8 4"
                    stroke-width="1"/>
                <text x="10" y="5" text-anchor="end">0%</text>
                <text x="10" y="{5 - $barHeight div 2}" text-anchor="end">50%</text>
                <text x="10" y="{5 - $barHeight}" text-anchor="end">100%</text>
                <xsl:apply-templates select="//state"/>
            </g>
        </svg>
    </xsl:template>
    <xsl:template match="state">
        <xsl:variable name="statePos" select="position() - 1" as="xs:integer"/>
        <xsl:variable name="xPosition" select="$statePos * $barInterval" as="xs:double"/>
        <xsl:variable name="totalVotes" select="sum(candidate)" as="xs:double"/>
        <xsl:variable name="demVotes" select="candidate[@party = 'Democrat']" as="xs:integer"/>
        <xsl:variable name="demPer" select="$demVotes div $totalVotes" as="xs:double"/>
        <xsl:variable name="acro" select="@acro" as="xs:string"/>
        <rect x="{$xPosition + $barShift}" y="-{$demPer * $barHeight}" stroke="black"
            stroke-width=".5" fill="blue" width="{$barWidth}" height="{$demPer * $barHeight}"/>
        <text x="{$xPosition + $barWidth div 2 + $barShift}" y="20" text-anchor="middle">
            <xsl:value-of select="$acro"/>
        </text>
    </xsl:template>
</xsl:stylesheet>

When we developed the preceding version, we initially specified the datatype of the $totalVotes variable as an integer because we knew that all of the input values to the sum function (the votes for all of the candidates in a particular state) were integers, which means that their sum would also be an integer. To our surprise, when we ran the transformation, <oXygen/> reported an error, saying that although the type was supposed to be an integer, it was actually a double. To figure out what was going on, we looked up the sum() function in Michael Kay (p. 889), where we learned that when sum() is applied to values that don’t have an explicit datatype (and the numbers in XML in this case don’t), the sum() function converts them to doubles before adding them.
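
The behavior described above can be reproduced in isolation. The following is a minimal sketch, not part of our solution, and it assumes a basic (not schema-aware) XSLT 2.0 processor. It builds a small made-up <state> element in a variable called $sample (the values 60 and 40 are placeholders, not election data) and reports the types that sum() returns. It also illustrates one possible alternative to declaring $totalVotes as a double, which is to cast each value to xs:integer before summing; the variable name $totalAsInteger is our own invention for this illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0" exclude-result-prefixes="#all">
    <xsl:output method="text"/>
    <!-- Hypothetical sample data: the numbers are placeholders, not real vote counts -->
    <xsl:variable name="sample">
        <state acro="XX">
            <candidate party="Democrat">60</candidate>
            <candidate party="Republican">40</candidate>
        </state>
    </xsl:variable>
    <!-- Runs against any input document, since it reads only $sample -->
    <xsl:template match="/">
        <xsl:for-each select="$sample/state">
            <!-- Declaring this variable as an integer would raise a type error,
                because sum() over untyped values returns a double -->
            <xsl:variable name="totalVotes" select="sum(candidate)" as="xs:double"/>
            <!-- Casting each value to an integer first makes sum() return an integer -->
            <xsl:variable name="totalAsInteger" select="sum(candidate/xs:integer(.))"
                as="xs:integer"/>
            <xsl:text>sum(candidate) is a double: </xsl:text>
            <xsl:value-of select="$totalVotes instance of xs:double"/>
            <xsl:text>&#10;sum of cast values is an integer: </xsl:text>
            <xsl:value-of select="$totalAsInteger instance of xs:integer"/>
            <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
```

Whether to cast the input values or to accept the double is a matter of taste; for this assignment the result is divided to produce a fraction anyway, so an integer total offers no practical advantage.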

So why doesn’t the XSLT transformation engine know that the vote counts in the XML are integers and convert them to integers instead of doubles? Although it’s clear to a human that they’re integers because they look like whole numbers, they could have more than one logical datatype: they could be doubles that happen to have nothing but implicit zeroes after an implicit decimal point, or they could be strings of characters to be printed, and not any numeric type. XSLT deals with that by treating all values in the input XML that don’t have an explicit datatype as untyped (the technical explanation is that, since everything has to have a datatype, it regards their type as xs:untypedAtomic, which means … er … that they don’t have an explicit datatype and have to be converted to something explicit internally before you add them or otherwise process them). If you try to perform arithmetic on them, such as with the sum() function, the XSLT engine has to convert them to a type that’s explicitly numeric first, since you can perform arithmetic only on numbers. So should they be converted to integers or doubles? Since you can treat integers like doubles that happen to have only zeroes after the decimal point and get the right answer, but the reverse isn’t the case, the designers of XSLT decided that untyped values to be added would always be treated as doubles, whether they could alternatively be regarded as integers or not.
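
To make the idea of an untyped value more concrete, here is a small sketch, again assuming a basic (not schema-aware) XSLT 2.0 processor and a made-up <candidate> element whose value, 60, is a placeholder. It checks the type of the value with the instance of operator and then uses the same value in two different contexts, one numeric and one textual, to show that the conversion depends on what you ask the processor to do with it. The variable names $sample and $votes are ours and appear only in this illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0" exclude-result-prefixes="#all">
    <xsl:output method="text"/>
    <!-- Hypothetical sample data: 60 is a placeholder, not a real vote count -->
    <xsl:variable name="sample">
        <candidate party="Democrat">60</candidate>
    </xsl:variable>
    <!-- Runs against any input document, since it reads only $sample -->
    <xsl:template match="/">
        <xsl:variable name="votes" select="$sample/candidate"/>
        <!-- Extracting the value of the element yields an untyped atomic value -->
        <xsl:text>value is xs:untypedAtomic: </xsl:text>
        <xsl:value-of select="data($votes) instance of xs:untypedAtomic"/>
        <!-- Arithmetic converts the untyped value to a double before adding -->
        <xsl:text>&#10;value + 1 is a double: </xsl:text>
        <xsl:value-of select="($votes + 1) instance of xs:double"/>
        <!-- A string function converts the same untyped value to a string instead -->
        <xsl:text>&#10;length of value as a string: </xsl:text>
        <xsl:value-of select="string-length($votes)"/>
        <xsl:text>&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```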

To correct our error, we changed the datatype of our $totalVotes variable to xs:double, since that was the type that the sum() function was going to return. A double is a number that may have digits to the right of the decimal point, but those digits can be zeroes (or, in this case, potential zeroes), and in that case the double has the same numeric value as the corresponding integer. In other words, if the total number of votes in a particular state is equal to, say, the integer 100, it’s also equal to the double 100.0.
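
The equivalence of an integer and the corresponding double can be checked directly. The sketch below is an illustration only; the variable names $asInteger and $asDouble are ours, and the value 100 is the example number from the paragraph above rather than real data. It compares the two values and then outputs the double, which shows that a whole-number double is written without a decimal point, so the change of type does not alter the numbers that end up in the SVG attributes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0" exclude-result-prefixes="#all">
    <xsl:output method="text"/>
    <!-- Runs against any input document, since it does not read the input -->
    <xsl:template match="/">
        <xsl:variable name="asInteger" select="100" as="xs:integer"/>
        <xsl:variable name="asDouble" select="xs:double(100)" as="xs:double"/>
        <!-- The integer is promoted to a double for the comparison -->
        <xsl:text>integer equals double: </xsl:text>
        <xsl:value-of select="$asInteger eq $asDouble"/>
        <!-- A double with a whole-number value is output without a decimal point -->
        <xsl:text>&#10;double written as text: </xsl:text>
        <xsl:value-of select="$asDouble"/>
        <xsl:text>&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```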