I’m generally a big fan of slope charts (as is, apparently, Andy Kirk at visualisingdata.org)—they are visually appealing; for many people, they are easier to understand than a scatterplot; and, they can really draw in the reader to explore the different relationships. So when I saw another discussion in the Washington Post about Neil Freeman’s map of the electoral college, I wondered whether a slope chart would fit the data well.
I have lots of favorite slope charts, but the one on the cover of Alberto Cairo’s book, The Functional Art, which plots average obesity rates and educational attainment rates for all 50 states, is near the top. He could have used a scatterplot or even a couple of maps to show the data, but his slope chart invites the reader to explore the data. (Noah Iliinsky, in this great talk, makes the case that geographic data need not—and should not—always be plotted on a map.) With Alberto’s chart in the back of my mind, I often test state-level data in the slope chart format.
Back to the data. Freeman redraws the U.S. state map by giving every state an equal population. Thus, to tell the story of interest (how state populations compare with their share of the total electoral vote), we only need two variables. I pulled 2012 state population data from the Census Bureau and electoral vote data, based on the 2010 Census, from the Federal Register. I converted both variables to shares of the total so as to put them on a similar basis.
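As a rough sketch of that share conversion (not the code used for this post), the snippet below divides each state's value by the national total. The electoral vote counts are the 2010 Census apportionment; the population figures are rounded, approximate values included only for illustration, and the function name `to_shares` is hypothetical.

```python
def to_shares(counts, total):
    """Return each count as a percentage of the given total."""
    return {name: 100.0 * value / total for name, value in counts.items()}

# Electoral votes under the 2010 Census apportionment (538 in total).
electoral_votes = {"California": 55, "Texas": 38, "Wyoming": 3}
TOTAL_ELECTORAL_VOTES = 538

# ASSUMPTION: rounded 2012 populations in millions, for illustration only;
# these are not the exact values in the post's .csv file.
population_millions = {"California": 38.0, "Texas": 26.1, "Wyoming": 0.58}
TOTAL_POPULATION_MILLIONS = 313.9

electoral_share = to_shares(electoral_votes, TOTAL_ELECTORAL_VOTES)
population_share = to_shares(population_millions, TOTAL_POPULATION_MILLIONS)

for state in electoral_votes:
    print(f"{state}: {population_share[state]:.2f}% of population, "
          f"{electoral_share[state]:.2f}% of electoral votes")
```

Putting both series in percent of the national total is what makes them directly comparable on one axis.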
My first approach was a simple paired bar chart, which allows me to easily fit the state labels. One problem is that alphabetical sorting makes it easy to find a specific state but harder to see the patterns; an alternative, therefore, is to sort by either series (the left-hand chart sorts by population).
Another problem is that I want readers to understand the differences between the share of total electoral votes and the share of the total U.S. population. In the bar chart format, the reader has to mentally compare the different heights of the bars. An easy way to address that concern is to calculate the difference between the two variables and plot a single bar for each state. My main frustration here is that defining what’s being plotted is a bit cumbersome: Difference between a State’s Share of the U.S. Population and its Share of Total Electoral Votes. The reader really has to think about what’s being shown, but it’s easy to perceive the patterns in this layout, especially when the data are sorted by the difference.
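The difference-and-sort step can be sketched as follows. This is a minimal illustration, not the workbook behind the chart: the shares are approximate values for three states, the helper name `share_gap` is hypothetical, and the sign convention (electoral share minus population share, so positive means more electoral weight than population) is an assumption.

```python
# ASSUMPTION: approximate percentage shares for three states, illustration only.
shares = {
    # state: (population share %, electoral vote share %)
    "California": (12.1, 10.2),
    "Texas": (8.3, 7.1),
    "Wyoming": (0.2, 0.6),
}

def share_gap(shares):
    """Electoral share minus population share, in percentage points."""
    return {state: ev - pop for state, (pop, ev) in shares.items()}

gaps = share_gap(shares)

# Sort from the largest positive gap to the largest negative gap,
# which is the ordering that makes the pattern easy to see in a bar chart.
ranked = sorted(gaps.items(), key=lambda item: item[1], reverse=True)

for state, gap in ranked:
    print(f"{state:<12}{gap:+.1f} pp")
```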
This brings me to the slope chart. As with any slope chart that has a lot of observations, the user has to explore the data and pull out their own stories. Even in those cases, readers at least get an overview of the data, but here I didn’t feel like any real story came through; there are a bunch of smaller states that have population shares (right axis) that are smaller than their electoral shares (left axis). (Because I didn’t like the approach, I didn’t bother cleaning it up, fixing the state labels, or adding other elements that might be helpful.)
So I tried a scatterplot. This approach shows both the level and, implicitly, the difference between the two, especially once I add the 45-degree line. In this graph, you can again see that there are a bunch of states (31 plus the District of Columbia, to be exact; colored blue) with an electoral vote share that exceeds their population share. The remaining states, all larger states, have a population share that exceeds their electoral vote share (colored green).
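The role of the 45-degree line can be sketched in a few lines: a point sits above the line exactly when its electoral share exceeds its population share. The sketch assumes population share on the x-axis and electoral share on the y-axis (the axis orientation is an assumption), uses approximate illustrative shares, and the function name `side_of_diagonal` and the gray fallback color are hypothetical.

```python
def side_of_diagonal(pop_share, ev_share):
    """Classify a point (x = population share, y = electoral share) against y = x."""
    if ev_share > pop_share:
        return "blue"   # above the line: electoral share exceeds population share
    if ev_share < pop_share:
        return "green"  # below the line: population share exceeds electoral share
    return "gray"       # exactly on the line (hypothetical fallback color)

# ASSUMPTION: approximate (population share %, electoral share %) pairs.
points = {
    "California": (12.1, 10.2),
    "Texas": (8.3, 7.1),
    "Wyoming": (0.2, 0.6),
}

colors = {state: side_of_diagonal(pop, ev) for state, (pop, ev) in points.items()}
blue_count = sum(1 for color in colors.values() if color == "blue")

print(colors)
print(f"{blue_count} of {len(points)} sample states sit above the 45-degree line")
```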
I could, of course, color the points by the state’s vote in the 2012 Presidential election, which I suppose gives a bit more information.
What do you think? What alternatives would you try? Is the map just the best way to go? Here’s the .csv file with the data if you want to try toying around with it yourself and, if there’s enough interest, I’ll put it up on HelpMeViz.
In my opinion, the 4th chart – just the delta, as a bar chart – works best. It is scannable and clear. The slope graph seems too busy with 50 values to pull out meaningful stories. Scatter feels wrong for this application.
I divided the electoral percentage by the population percentage (and called it the Vote Conversion Factor). Plotting that against population percentage shows the large discrepancy between the sparse and highly populated states quite clearly, I think.
Basic error on my part. The x axis in the previous post actually shows the population of each state in millions (despite what the axis label might say). The actual population percentage is shown on this one. Apologies.
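The Vote Conversion Factor described in the comment above is a simple ratio of the two shares. A minimal sketch, assuming rounded illustrative populations rather than the post's data (the function name is hypothetical):

```python
TOTAL_ELECTORAL_VOTES = 538          # 2010 Census apportionment
TOTAL_POPULATION_MILLIONS = 313.9    # ASSUMPTION: rounded 2012 U.S. population

# state: (electoral votes, population in millions; populations are approximate)
states = {
    "California": (55, 38.0),
    "Texas": (38, 26.1),
    "Wyoming": (3, 0.58),
}

def vote_conversion_factor(electoral_votes, population_millions):
    """Electoral vote share divided by population share."""
    ev_share = electoral_votes / TOTAL_ELECTORAL_VOTES
    pop_share = population_millions / TOTAL_POPULATION_MILLIONS
    return ev_share / pop_share

factors = {name: vote_conversion_factor(ev, pop) for name, (ev, pop) in states.items()}

for name, factor in sorted(factors.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<12}{factor:.2f}")
```

A factor above 1 means a state has more electoral weight than its population alone would give it; a factor below 1 means less.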
Hey Jon!
Nice post & dataviz question. I took a stab at it and came up with something that I believe works relatively well. I would have liked something more straightforward to read and interpret, but with a good title & legend, it should be much better.
States are sorted according to the difference between the share of total electoral votes and the share of the total U.S. population. Basically, each segment represents this difference.
The color coding shows whether a State is over-represented (share vote > share population) or under-represented (share vote < share population). Black = over-represented & yellow = under-represented.
Finally, the size of the dots is proportional to the State's population. I particularly like this because it shows the correlation between State's size and over/under representation.
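One detail worth pinning down in a chart like this is what "proportional" means for dot size. The sketch below assumes the dot's area, not its radius, scales with population, which is a common choice but not something the comment states. The populations are rounded illustrative values, and `dot_radius` and `max_radius` are hypothetical names.

```python
import math

def dot_radius(population, max_population, max_radius=20.0):
    """Radius chosen so that the dot's AREA is proportional to population."""
    return max_radius * math.sqrt(population / max_population)

# ASSUMPTION: rounded populations in millions, for illustration only.
population_millions = {"California": 38.0, "Texas": 26.1, "Wyoming": 0.58}
largest = max(population_millions.values())

radii = {state: dot_radius(pop, largest) for state, pop in population_millions.items()}

for state, radius in radii.items():
    print(f"{state:<12}radius {radius:.1f}")
```

Scaling the radius directly would exaggerate the largest states, because perceived size tracks area.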