We’ve all seen it: The bar (or column) chart with 20 or 30 or 40 different groups, each with two, three, or four categories. Is that a good chart type? Is it useful? I’ve recently started to think that these types of charts are just data dumps and are not effective ways to communicate data.
As an example, I came across this paired bar chart from Max Roser, which shows different Gini coefficients for about 20 countries. (It’s worth checking out Max’s collection of charts on living standards around the world, Our World in Data.) In the interactive version, the user can toggle between the two Ginis (defined in the legend at the top of the graph shown below), plus the reduction (or change) in the Gini.
When I looked at this chart–and there are lots of these, so I’m not trying to bash this particular graph–I had a difficult time pulling out any story. The data are sorted on the red bar, but I found myself first looking at the blue bar–maybe because it’s longer and therefore takes up a larger share of the screen? I think I was then mentally computing the difference between the two, to get a sense of the impact of taxes and transfers on inequality.
So what’s the solution? I’ve been toying around with this chart to see what else I might do, and have a couple of ideas.
The obvious first approach is to visualize the gap or difference between the two series. You cut the number of bars in half here, so I think that’s helpful. One problem, however, is that many people want to show the level of both variables, which you don’t get from the graph of the differences.
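For anyone who wants to try the difference approach on their own data, here is a minimal Python sketch of the computation. The country names and Gini values are hypothetical placeholders, not the numbers from Max's chart, and the `gini_gaps` helper is just an illustrative name. Sorting by the gap is one reasonable choice; it puts the largest reductions at the top.

```python
# Hypothetical placeholder values, NOT the figures from the original chart.
gini = {
    "Country A": {"before": 0.45, "after": 0.35},
    "Country B": {"before": 0.50, "after": 0.30},
    "Country C": {"before": 0.40, "after": 0.32},
}

def gini_gaps(data):
    """Return (country, gap) pairs sorted from largest to smallest reduction."""
    # Round to three decimals so floating-point noise doesn't leak into labels.
    gaps = [(name, round(v["before"] - v["after"], 3)) for name, v in data.items()]
    return sorted(gaps, key=lambda pair: pair[1], reverse=True)

gaps = gini_gaps(gini)
for name, gap in gaps:
    print(f"{name}: {gap:.2f}")
```

Each country now contributes one number instead of two, which is where the "half as many bars" comes from. The levels are gone, though, which is exactly the trade-off described above.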
The second approach is to redesign this as a dot plot. Here, I think putting the values on the same horizontal line provides more balance and the grey line helps visualize the gap.
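If you want to build something like this dot plot yourself, the sketch below shows one way to do it in Python with matplotlib. It is not how the chart in this post was made; the data are hypothetical placeholders, and the `dot_plot` function, the colors, and the labels are all assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for an interactive window
import matplotlib.pyplot as plt

# Hypothetical placeholder values, NOT the figures from the original chart.
countries = ["Country A", "Country B", "Country C"]
before = [0.45, 0.50, 0.40]  # Gini before taxes and transfers
after = [0.35, 0.30, 0.32]   # Gini after taxes and transfers

def dot_plot(names, first, second):
    """Draw one row per group: two dots joined by a grey line showing the gap."""
    fig, ax = plt.subplots(figsize=(6, 0.5 * len(names) + 1))
    rows = list(range(len(names)))
    # The grey connector goes underneath the dots (lower zorder).
    ax.hlines(rows, second, first, colors="grey", linewidth=2, zorder=1)
    ax.scatter(first, rows, color="tab:blue", zorder=2,
               label="Before taxes and transfers")
    ax.scatter(second, rows, color="tab:red", zorder=2,
               label="After taxes and transfers")
    ax.set_yticks(rows)
    ax.set_yticklabels(names)
    ax.set_xlabel("Gini coefficient")
    ax.legend(loc="lower right", frameon=False)
    return fig, ax

fig, ax = dot_plot(countries, before, after)
```

From here, `fig.savefig("dot_plot.png")` writes the chart to disk. Because both values sit on the same row, the eye reads the length of the grey line as the gap without any mental subtraction.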
One problem with the above dot plot is that I don’t include the data values, which, again, a lot of people like to include. In this second version, therefore, I’ve added the data values to the right of each data marker. Maybe a little unbalanced here–look at Taiwan and Korea–but not too bad, I don’t think.
Finally, instead of worrying about getting the labels and the dots to line up, I just used labels instead of the dots. I’m not entirely sold on this one either, but I still think it’s easier to pull out the patterns and differences than in the original.
Other possibilities–which will depend on the nature of the data–include a scatterplot, slope chart, or stacked bar chart, among many others.
So where does this leave me? I think the paired bar chart with 40+ bars simply has too much information on it. A paired bar chart with 5 or 7 groups–even with 2 bars each–may be easier to see (and yes, 5 and 7 were chosen purposefully, thank you Professor Miller). It’s clearly an empirical question–put some charts in front of some people and ask them what they remember–an exercise I’m sure people have done.
What do you think? What are other alternatives to the basic paired bar chart with all those categories?