Killing the paired bar chart

We’ve all seen it: The bar (or column) chart with 20 or 30 or 40 different groups, each with two, three, or four categories. Is that a good chart type? Is it useful? I’ve recently started to think that these types of charts are just data dumps and are not effective ways to communicate data.

As an example, I came across this paired bar chart from Max Roser, which shows different Gini coefficients for about 20 countries. (It’s worth checking out Max’s collection of charts on living standards around the world, Our World in Data.) In the interactive version, the user can toggle between the two Ginis (defined in the legend at the top of the graph shown below), plus the reduction (or change) in the Gini.

Paired bar chart of Gini coefficients from Max Roser

When I looked at this chart–and there are lots of these, so I’m not trying to bash this particular graph–I had a difficult time pulling out any story. The data are sorted on the red bar, but I found myself first looking at the blue bar–maybe because it’s longer and therefore takes up a larger share of the screen? I think I was then mentally computing the difference between the two, to get a sense of the impact of taxes and transfers on inequality.

So what’s the solution? I’ve been toying around with this chart to see what else I might do, and have a couple of ideas.

The obvious first approach is to visualize the gap or difference between the two series. You cut the number of bars in half here, so I think that’s helpful. One problem, however, is that many people want to show the level of both variables, which you don’t get from the graph of the differences.

Bar chart of the difference between two Gini coefficients, variation on graph from Max Roser
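For anyone who wants to try the difference approach on their own data, here is a minimal sketch in plain Python. The country names and Gini values are hypothetical placeholders, not the numbers from Max's chart, and the function names `gini_gaps` and `text_bars` are made up for this illustration. It computes the gap for each country, sorts by it, and prints one text bar per country.

```python
# Hypothetical values for illustration only; NOT the data from the original chart.
# Each entry is (Gini before taxes and transfers, Gini after taxes and transfers).
ginis = {
    "Country A": (0.50, 0.30),
    "Country B": (0.45, 0.35),
    "Country C": (0.40, 0.32),
}


def gini_gaps(data):
    """Return (country, before - after) pairs, largest reduction first."""
    gaps = [(name, round(before - after, 3)) for name, (before, after) in data.items()]
    return sorted(gaps, key=lambda pair: pair[1], reverse=True)


def text_bars(gaps, scale=100):
    """Render one bar per country, one '#' per 0.01 of Gini reduction."""
    width = max(len(name) for name, _ in gaps)
    return [f"{name:<{width}} {'#' * round(gap * scale)} {gap:.2f}" for name, gap in gaps]


for line in text_bars(gini_gaps(ginis)):
    print(line)
```

Note that the levels of the two series are gone from the output, which is exactly the trade-off described above: half as many bars, but only the gap survives.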

The second approach is to redesign this as a dot plot. Here, I think putting the values on the same horizontal line provides more balance and the grey line helps visualize the gap.

Dot Plot of different Gini coefficients, variation on graph from Max Roser
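The dot plot idea can be sketched in a few lines of plain Python as well. This is a text-only approximation, not how I built the chart above: the values are hypothetical, the axis range (0.2 to 0.6) and the helper name `dot_plot_row` are assumptions chosen for the example, and sorting by the after-tax value is just one possible choice. Each row puts both values on the same horizontal line and fills the gap between them.

```python
def dot_plot_row(name, after, before, lo=0.2, hi=0.6, cols=41, name_width=10):
    """Render one row: 'o' marks the after value, 'x' the before value, '-' the gap."""

    def col(value):
        # Map a value in [lo, hi] to a column index in [0, cols - 1].
        return round((value - lo) / (hi - lo) * (cols - 1))

    left, right = sorted((col(after), col(before)))
    cells = [" "] * cols
    for i in range(left, right + 1):
        cells[i] = "-"  # connecting line between the two markers
    cells[col(before)] = "x"
    cells[col(after)] = "o"
    return f"{name:<{name_width}}|{''.join(cells)}|"


# Hypothetical values for illustration only: (after, before) taxes and transfers.
rows = {"A": (0.30, 0.50), "B": (0.35, 0.45), "C": (0.32, 0.40)}

# Sort by the after-tax value (an assumption made for this sketch).
for name, (after, before) in sorted(rows.items(), key=lambda kv: kv[1][0]):
    print(dot_plot_row(name, after, before))
```

Because both markers share a row, the eye reads the gap directly from the length of the connecting line instead of comparing two bar lengths.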

One problem with the above dot plot is that I don’t include the data values, which, again, a lot of people like to include. In this second version, therefore, I’ve added the data values to the right of each data marker. Maybe a little unbalanced here (look at Taiwan and Korea), but not too bad, I don’t think.

Dot Plot of different Gini coefficients with labels, variation on graph from Max Roser

Finally, instead of worrying about getting the labels and the dots to line up, I just used labels instead of the dots. I’m not entirely sold on this one either, but I still think it’s easier to pull out the patterns and differences than in the original.

Dot Plot of different Gini coefficients with labels instead of points, variation on graph from Max Roser

Other possibilities–which will depend on the nature of the data–include a scatterplot, slope chart, or stacked bar chart, among many others.

So where does this leave me? I think the paired bar chart with 40+ bars simply has too much information on it. A paired bar chart with 5 or 7 groups–even with 2 bars each–may be easier to see (and yes, 5 and 7 were chosen purposefully, thank you Professor Miller). It’s clearly an empirical question–put some charts in front of some people and ask them what they remember–an exercise I’m sure people have done.

What do you think? What are other alternatives to the basic paired bar chart with all those categories?

March 30, 2015

8 comments

Paulie D

March 31, 2015 at 5:24 am

As GINI summarises inequality, I’ve made the guess that the intent of the chart could be to show the equalising effect of taxes and the like.

With that in mind, when I’ve re-made, I’ve tried it as a slope chart, coloured by end inequality (error in the label colouring, but please imagine I’d spent the time fixing that 🙂 ). I know grid lines are often seen as chart junk, but I prefer them to labelling every point, especially on a fuzzy measure like GINI. And I’ve gone for the potentially upsetting decision to reverse the y-axis. Typically when I parse a chart up = good, hence my decision. Happy to be called out for that though.

For an interactive, I’d probably go for selecting certain “stories” in the data (similarity between USA, UK, Israel; differences in the Nordics; SE Asia; Poland vs. USA on tax effects etc.). As a static, I’d be tempted to remove many of the countries for readability.

It would also be interesting (to me at least) to overlay an extra measure like GDP to see if there is any correlation with absolute and relative inequality.

Reply

Jon

March 31, 2015 at 8:47 am

Thanks for your comment, Paulie.

My love for slope charts knows almost no bounds. Well, actually, it knows two bounds. One, I sometimes run into the problem where there are a couple of high values in the data and then everything else looks like a straight line across. That’s not really a problem here. Two, sometimes there is just too much data and you end up with a spaghetti chart. I think that’s more of my concern here.

Also, I agree with your other points about highlighting a specific story or pattern, but my objective in this post was to simply recreate this chart without adding my own views/stories on top of it.

Thanks again,
Jon

Andy Cotgreave

March 31, 2015 at 11:53 am

Excellent post. So good, in fact, it inspired me to write the post about killing side-by-sides which I’ve been meaning to post for ages: http://gravyanecdote.com/uncategorized/killing-the-paired-bar-chart/

Andrej Lapajne

March 31, 2015 at 2:54 pm

Great post Jon!

I would only delete the value axis (it’s redundant), add a category axis to facilitate the comparisons between Gini of Income and the difference, and perhaps emphasize the differences (use a color accent for differences instead of value positions).

Andrej Lapajne

March 31, 2015 at 3:10 pm

More interesting is the question of sorting: should we sort by income before taxes, after taxes or by the difference. Each sort order imposes a slightly different message, see here:

1. sort by Gini of Income after Taxes and Transfers: http://snag.gy/jclj0.jpg
2. sort by Gini of Income before Taxes and Transfers: http://snag.gy/heLFW.jpg
3. sort by difference: http://snag.gy/2jSbY.jpg

Reply

jon@policyviz.com

March 31, 2015 at 3:35 pm

Andrej,

I agree that the sorting is key and I obviously chose to follow the original. One interesting question, I think, is whether sorting by a variable that is not shown–in this case, the difference–is more confusing than it’s worth. Other than a title, there’s no obvious thing telling me the 3rd version is sorted by the difference, and I wonder whether that will confuse readers.

Thanks,
Jon

Austin Seymour

May 15, 2015 at 12:56 pm

I have used trend lines to solve this issue before. You can also fill in below the line to make it look like its own bar. I’m not sure if this is a better solution, but I thought it was worth mentioning.

