Digital humanities

Maintained by: David J. Birnbaum (djbpitt@gmail.com) Last modified: 2022-11-16T19:23:17+0000

SVG assignment #3

Overview

In SVG assignment #2 we asked you to create a bar chart representing popular votes cast for Democrats in the 2012 election. As a bonus task, you could have also represented the electoral votes by varying the widths of the bars. This assignment asks you to create a different type of election results visualization, one that represents both the popular and electoral votes for candidates of all parties in presidential elections from 1900 to 1912. The raw data, scraped from https://www.britannica.com/topic/United-States-Presidential-Election-Results-1788863, is below, and you can copy and paste it into an XML document so that you can transform it to SVG:


    
        William McKinley
        William Jennings Bryan
    
    
        Theodore Roosevelt
        Alton B. Parker
        Eugene V. Debs
    
    
        William Howard Taft
        William Jennings Bryan
        Eugene V. Debs
    
    
        William Howard Taft
        Woodrow Wilson
        Theodore Roosevelt
        Eugene V. Debs
    

The task

Choosing the correct visualization for the data we want to represent is important and sometimes challenging, and the choice starts with questions about the number and type of variables. We’re asking you to represent four variables here: election year, political party, electoral votes, and popular votes (we are not going to include the candidates’ names in the SVG). Some variables, such as electoral and popular votes, are countable; others, like political party, are not. (Years are sort of countable; they are ordered and expressed in numbers, but the year 2000 is not twice as large as the year 1000 because the year 0 starting point doesn’t have the same type of absolute, real-world meaning as receiving 0 votes in an election does.) One reason you don’t see a lot of bar graphs with variable-width bars is that bar graphs are not very well suited to representing so many variables.

We’re going to use a bubble chart to visualize these four variables, and before you do anything else, navigate to https://datavizcatalogue.com/methods/bubble_chart.html to learn why bubble charts are a good choice for visualizing more than two variables, some countable and some not. A bubble chart has an X axis (in our graph, the election year: 1900, 1904, 1908, or 1912) and a Y axis (in our graph, the count of electoral votes won by a candidate, which ranges, in this data set, from a low of 0 to a high of 435), and it lets us also use the area of circles to represent a third variable (in our graph, percentage of popular vote, which you’ll have to compute using XPath; see below) and color to represent a fourth (in our case, political party). Your bubble chart should look something like the following:

Baseline requirements for all XML-to-SVG transformations

Your XSLT must include meaningful comments that document what each part of the code does. You don’t have to comment the obvious parts, but you should have comments that label the major sections of your code and that explain anything that won’t be self-explanatory should you look back at it six months from now.
Your output SVG must be valid. This means that you have to save the output to disk, open it in <oXygen/>, validate it there, and fix any issues in the XSLT that are impinging on the validity of the output.

Alternatively, if you are not able to fix a validity issue, your XSLT must include a comment explaining exactly what you think is wrong, exactly what you tried to do to fix it, and exactly how you understand the difference between what you expected and what you got. To a large extent this is rubber-duck debugging and is likely to lead you to recognize the source of the problem yourself, but it also helps us give you more meaningful feedback when that doesn’t happen.

Working with circles in SVG

You may have worked only with <rect>, <line>, and <text> elements in SVG so far. Circles are similar to the first two in that they are empty elements that use attributes to specify their position and size: @cx (X-value of center), @cy (Y-value of center), and @r (radius). For example:

<circle cx="100" cy="100" r="50"/>

will draw a circle with a radius of 50px centered at x = 100 and y = 100. The color of a circle is set with a @fill attribute. We set the @opacity attribute to .25 to make the circles partly transparent, since they may overlap (see especially the 1912 results). The low opacity isn’t a perfect solution because the visualization in case of complete overlap will change depending on which circle is drawn first, so it’s best to draw larger circles before smaller ones. For this exercise you can assume that third-party candidates always get fewer electoral votes than major-party candidates, so you can just draw them last (and they’re listed last in the input XML, so you don’t have to do anything special to control when they’re drawn), but a more professional strategy would sort the data for each election by electoral vote before drawing any of the circles for that election.
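The drawing-order advice above can be sketched as follows. This is a minimal illustration under stated assumptions, not a model solution: the sample data lives in a variable inside the stylesheet, and the names `candidate`, `@color`, `@electoral`, and `@radius`, as well as all of the values and the red and blue colors, are invented for the example. Your input XML is probably structured differently. The stylesheet ignores its input, so you can run it against any XML file in an XSLT 2.0 or 3.0 processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="#all" version="3.0">
    <xsl:output method="xml" indent="yes"/>

    <!-- Hypothetical sample data; names and values are invented for illustration -->
    <xsl:variable name="sample" as="element(candidate)+">
        <candidate color="green" electoral="90" radius="30"/>
        <candidate color="blue" electoral="400" radius="40"/>
        <candidate color="red" electoral="10" radius="28"/>
    </xsl:variable>

    <!-- The input document is ignored; only the sample variable is used -->
    <xsl:template match="/">
        <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 -500 200 550">
            <xsl:for-each select="$sample">
                <!-- Highest electoral vote first, so that circle is drawn first -->
                <xsl:sort select="xs:integer(@electoral)" order="descending"/>
                <!-- Negative cy moves the circle up from the X axis at y = 0 -->
                <circle cx="100" cy="-{@electoral}" r="{@radius}"
                    fill="{@color}" opacity=".25"/>
            </xsl:for-each>
        </svg>
    </xsl:template>
</xsl:stylesheet>
```

Converting `@electoral` with `xs:integer()` inside `<xsl:sort>` matters: without it the values would be compared as strings, and "90" would sort ahead of "400".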

Because some of the popular vote percentages are so close that the differences in the size of the circles are not easy to see, we’ve added labels for the percentages. Those labels are a bonus task that we encourage you to undertake, but they aren’t required. See below.

Here is how to think about the four data variables:

The values along the X axis, which represent election years, should be evenly spaced, that is, with the same amount of space between the centers of the year labels (which are <text> elements) and between the centers of the bubbles above those labels (which are <circle> elements).
The values along the Y axis, which represent electoral votes won by each candidate, should also be evenly spaced. You’ll want to create enough horizontal ruling lines and labels on the Y axis to make the chart easy to read, but not so many that it becomes cluttered. We drew ruling lines and labels at Y positions 100, 200, 300, 400, and 500 (the X axis serves as a ruling line at Y position 0, so we just had to add the label). The Y position of the center of each circle represents the number of electoral votes earned by each candidate, and we’ve drawn a small black dot there to make the center easier to see.
The area of the circle (not the radius; see Scaling to radius or area?) represents the percentage of popular vote. This means that 1) you’ll have to use the area (which is based on the percentage) to compute the radius; and 2) you may have to scale the values.

You may remember from high-school geometry that the area of a circle (we’ll call it A) is equal to πr², where r is equal to the radius, which means that if you know the area, you can compute the radius: first divide A by π and then take the square root. The XPath function math:pi() (with nothing inside the parentheses) returns the value of π and the function math:sqrt() computes the square root of its argument, so if the variable $area equals the area of the circle, math:sqrt($area div math:pi()) will return the radius.

The popular votes, represented by the areas of the circles, are percentages, so they’ll vary (theoretically, at least) from 0 to 100. If your circles are too small or too large when you use the actual percentage, you may want to create a variable that you can use to scale up or down. You can determine a scaling factor by trial and error, but a more professional approach would compute the largest actual percentage ahead of time and use that to control the scaling.
The color of the circle represents the political party. We’ve simplified the color coding by making all third parties green, but for obvious reasons we don’t combine their votes. See, for example, 1912, where two third-party candidates received electoral votes, so there are two green circles.
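The steps for turning a popular-vote percentage into a radius, described above, might be sketched like this. Everything about the sample data is an assumption: the names `election`, `candidate`, and `@popular` are hypothetical, the vote counts are invented, and the percentage is computed as each candidate’s share of the votes listed for that election, which may not be the denominator you want. The scaling factor is derived from the largest percentage so that the biggest bubble gets a radius of `$max-radius` (also a made-up value). The stylesheet ignores its input and needs a processor that supports the XPath 3.0 `math` functions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math"
    exclude-result-prefixes="#all" version="3.0">
    <xsl:output method="xml" indent="yes"/>

    <!-- Hypothetical sample data; names and vote counts are invented -->
    <xsl:variable name="election" as="element(election)">
        <election year="1912">
            <candidate popular="600"/>
            <candidate popular="300"/>
            <candidate popular="100"/>
        </election>
    </xsl:variable>

    <!-- Assumed radius, in pixels, for the largest bubble -->
    <xsl:variable name="max-radius" as="xs:integer" select="40"/>
    <!-- Largest percentage in the data, computed ahead of time -->
    <xsl:variable name="max-pct" as="xs:double"
        select="max($election/candidate/(@popular div sum(../candidate/@popular) * 100))"/>
    <!-- Area units per percentage point: the largest area is pi * max-radius squared -->
    <xsl:variable name="scale" as="xs:double"
        select="math:pi() * $max-radius * $max-radius div $max-pct"/>

    <!-- The input document is ignored; only the sample variable is used -->
    <xsl:template match="/">
        <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 -200 400 250">
            <xsl:for-each select="$election/candidate">
                <xsl:variable name="pct" as="xs:double"
                    select="@popular div sum(../candidate/@popular) * 100"/>
                <xsl:variable name="area" as="xs:double" select="$pct * $scale"/>
                <!-- A = pi * r squared, so r = sqrt(A / pi) -->
                <xsl:variable name="radius" as="xs:double"
                    select="math:sqrt($area div math:pi())"/>
                <circle cx="{position() * 100}" cy="-100"
                    r="{format-number($radius, '0.00')}" fill="gray" opacity=".25"/>
            </xsl:for-each>
        </svg>
    </xsl:template>
</xsl:stylesheet>
```

With these invented numbers the percentages are 60, 30, and 10, and the 60% bubble gets a radius of exactly 40. The 30% bubble has half the area of the 60% one, not half the radius, which is the point of scaling by area.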

We recommend drawing the SVG in the upper right quadrant of the coordinate space, using negative Y values, and then using @viewBox to make the graph visible in the viewport (browser window). If you need a refresher on @viewBox, refer to our Viewbox tutorial: an introduction to the SVG coordinate space.
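As a rough sketch of the coordinate setup recommended above, the following stylesheet draws only the axes, the horizontal ruling lines, and the evenly spaced year labels. The spacing of 100 pixels between years, the `@viewBox` values, and the stroke colors are assumptions you should adjust to taste; no bubbles are drawn. It ignores its input, so it can be run against any XML file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="#all" version="3.0">
    <xsl:output method="xml" indent="yes"/>

    <!-- Assumed horizontal distance between election years -->
    <xsl:variable name="x-spacing" as="xs:integer" select="100"/>

    <!-- The input document is ignored -->
    <xsl:template match="/">
        <!-- viewBox shows x from -60 to 540 and y from -540 to 60 -->
        <svg xmlns="http://www.w3.org/2000/svg" viewBox="-60 -540 600 600">
            <!-- Ruling lines and labels at 0, 100, ..., 500 electoral votes -->
            <xsl:for-each select="0 to 5">
                <xsl:variable name="y" as="xs:integer" select=". * 100"/>
                <line x1="0" y1="{-$y}" x2="{5 * $x-spacing}" y2="{-$y}"
                    stroke="lightgray"/>
                <text x="-10" y="{-$y}" text-anchor="end"
                    dominant-baseline="middle">
                    <xsl:value-of select="$y"/>
                </text>
            </xsl:for-each>
            <!-- Year labels, evenly spaced below the X axis -->
            <xsl:for-each select="(1900, 1904, 1908, 1912)">
                <text x="{position() * $x-spacing}" y="25" text-anchor="middle">
                    <xsl:value-of select="."/>
                </text>
            </xsl:for-each>
            <!-- Axes drawn last so that they sit on top of the ruling lines -->
            <line x1="0" y1="0" x2="{5 * $x-spacing}" y2="0" stroke="black"/>
            <line x1="0" y1="0" x2="0" y2="-500" stroke="black"/>
        </svg>
    </xsl:template>
</xsl:stylesheet>
```

Because the origin sits at the intersection of the axes, an electoral vote count of n maps directly to a Y value of -n, and the negative minimum values in `@viewBox` leave room for the labels to the left of and below the axes.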

Bonus tasks

The required part of the assignment is the bubble chart with all of the labels on the X axis (years plus general label) and Y axis (electoral vote counts plus general label). You are not required to include a legend or label the bubbles, but as bonus tasks you may:

Add a legend mapping parties to colors. This legend should also include a scale against which to measure bubble sizes; you might, for example, show how big a bubble is when it represents 50% of the popular vote.
Draw a small dot in the middle of each circle so users can more easily find the center, the Y position of which is the electoral vote count.
Add labels for the bubbles that report the percentages. To avoid an awkward overlap of labels toward the bottom of the 1912 data, in our version we created labels only for percentages greater than 20%.
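One way to approach the center dot and the conditional percentage label is sketched below. The element `bubble` and its attributes `@cy`, `@r`, and `@pct` are hypothetical stand-ins for values you would compute from your own data, and the numbers are invented. Only the 20% threshold comes from the description above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    exclude-result-prefixes="#all" version="3.0">
    <xsl:output method="xml" indent="yes"/>

    <!-- Hypothetical precomputed bubbles; names and values are invented -->
    <xsl:variable name="bubbles" as="element(bubble)+">
        <bubble cy="-300" r="40" pct="55.0"/>
        <bubble cy="-20" r="15" pct="6.0"/>
    </xsl:variable>

    <!-- The input document is ignored; only the sample variable is used -->
    <xsl:template match="/">
        <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 -400 250 450">
            <xsl:for-each select="$bubbles">
                <circle cx="100" cy="{@cy}" r="{@r}" fill="green" opacity=".25"/>
                <!-- Small opaque dot marking the center of the bubble -->
                <circle cx="100" cy="{@cy}" r="2" fill="black"/>
                <!-- Label only the bubbles above the threshold -->
                <xsl:if test="number(@pct) gt 20">
                    <text x="{100 + @r + 5}" y="{@cy}" dominant-baseline="middle">
                        <xsl:value-of select="concat(@pct, '%')"/>
                    </text>
                </xsl:if>
            </xsl:for-each>
        </svg>
    </xsl:template>
</xsl:stylesheet>
```

Here the first bubble gets a label just to the right of its edge and the second does not, because 6.0 is below the threshold. Both get a center dot.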

What to submit

Submit only your XSLT; do not submit the SVG file. We’ll run the transformation ourselves.