Avoiding the Dual Axis Chart

You’ve seen plenty of them, I’m sure: a line chart with two or more lines and two vertical axes. You read the title, the legend, the axis labels, and start contemplating the patterns. Suddenly, you realize that the units on the axis don’t match what the line is supposed to be showing. What’s going on? you wonder. Then, you notice another vertical axis on the other side of the graph. Your brow furrows and you start trying to piece the whole thing together.

Line chart with three lines that shows average scaled household income from 1970 to 2014 for low, middle, and high income households. — *Source: IMF*

Like others, I’m going to suggest you avoid dual axis charts. They are confusing, hard to read, and can be easily manipulated to suggest correlations when none exist. I’m going to make this argument by backing my way into a famous alternative to a dual-axis line chart.

To start, consider this dual-axis line chart that shows the number of auto fatalities (per 100,000 people) on the left axis and the number of miles driven per capita on the right axis in the United States from 1950 to 2011.

Dual axis line chart with a blue line for auto fatalities and a green line for miles per capita.

It’s not immediately obvious that the fatalities data are shown by the blue line and associated with the left axis, while miles driven per capita is the green line plotted along the right axis. The purpose of a graph like this is to show the decline in one series (i.e., auto fatalities) alongside a concurrent increase in the other (i.e., miles driven).

But there are three problems with plotting the data like this.

First, these graphs are often hard to read. Did you intuitively know which line corresponded to which axis? I didn’t. Even if the labels and axes were colored to match the lines (which many dual-axis charts don’t do), it’s hard to discern patterns in the data. Overall, they’re extra work for the reader, especially when the labeling is not obvious.

Dual axis line chart with a blue line for auto fatalities and a green line for miles per capita. The left vertical axis labels are blue and the right vertical axis labels are green.

Second, the gridlines may not match up. Notice how the horizontal gridlines in this version of the graph are associated with the left axis, which leaves the numbers on the right axis floating in space. At the crossing point in 1989, it’s hard to see if the number of miles driven is closer to 8,000 or 9,000 because the gridlines are not lined up.

Dual axis line chart with a blue line for auto fatalities and a green line for miles per capita. The gridlines on the two axes don't match up.
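To see why the gridlines so often fail to line up, here is a minimal sketch of the arithmetic. One set of horizontal gridlines can serve both axes only when each axis has the same number of evenly spaced ticks. The function name and the axis ranges below are my own assumptions for illustration, not the values used in the charts above.

```python
def evenly_spaced_ticks(lo, hi, count):
    """Return `count` tick values evenly spaced from lo to hi, inclusive."""
    if count < 2:
        raise ValueError("need at least two ticks")
    return [lo + i * (hi - lo) / (count - 1) for i in range(count)]


# Hypothetical axis ranges, chosen only for illustration.
left_ticks = evenly_spaced_ticks(0, 30, 7)         # fatalities per 100,000
right_ticks = evenly_spaced_ticks(2000, 11000, 7)  # miles driven per capita

# Each printed row is one shared gridline: left label | right label.
for left, right in zip(left_ticks, right_ticks):
    print(f"{left:>5.0f} | {right:>7.0f}")
```

With these assumed ranges, the shared gridlines force right-axis labels like 3,500 and 6,500. Picking round numbers for one axis can easily leave awkward values on the other, which is part of why so many dual-axis charts give up on matching them.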

Third, and most importantly, the point where the lines cross becomes a focal point, even though it may have no real meaning. In the first version, our eye is drawn to the middle of the chart where the two lines intersect, because that’s where the most interesting thing is happening. But there’s nothing special about 1985 where the lines cross—it’s just a simple coincidence of the vertical axis scales. The intended takeaway of the chart is how the two series move in opposite directions, but that’s not what draws the eye.

Vertical Axis Ranges

The vertical axis in a line chart does not need to start at zero, nor is there a firm rule for the range of the vertical axis in our line charts. By that logic, we could arbitrarily change the range of each axis to make the lines cross wherever we like.

Each of the next two graphs uses a reasonable set of vertical axis ranges, and by manipulating those ranges, I can make the two series look like they are closely matched for a few years at the beginning of the period and then diverge. Or, I can change the axes to make it look like the two series converge over the period. By arbitrarily choosing the axis ranges, we can make different data series look as correlated as we like.

Two line charts, both with a blue line for auto fatalities and a green line for miles per capita. The axis ranges on the two graphs differ so that the lines in the left graph look like they are diverging and the lines in the right graph look like they are converging.
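A small sketch makes the point concrete. A value’s height on the chart is just its position within the axis range, so changing the range of one axis moves the point where the two lines cross. The data and function names here are made up for illustration; they are not the actual fatalities or mileage figures.

```python
def axis_position(value, lo, hi):
    """Fraction of the way up the axis (0 = bottom, 1 = top) where value is drawn."""
    return (value - lo) / (hi - lo)


def crossing_year(years, left, right, left_range, right_range):
    """First year in which the right-axis line is drawn at or above the left-axis line."""
    for year, l, r in zip(years, left, right):
        if axis_position(r, *right_range) >= axis_position(l, *left_range):
            return year
    return None  # the lines never cross on the chart


# Hypothetical data: one falling series, one rising series.
YEARS = [1950, 1960, 1970, 1980, 1990, 2000, 2010]
FATALITIES = [24, 22, 20, 18, 16, 14, 12]
MILES = [3000, 4000, 5000, 6000, 7000, 8000, 9000]

# Same data and same left axis; only the right axis range changes.
print(crossing_year(YEARS, FATALITIES, MILES, (10, 26), (2000, 9000)))
print(crossing_year(YEARS, FATALITIES, MILES, (10, 26), (0, 20000)))
print(crossing_year(YEARS, FATALITIES, MILES, (10, 26), (0, 100000)))
```

In this toy example the crossing lands in 1980, then in 2000, and then disappears entirely, even though the underlying numbers never change.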

And this is the core problem with dual-axis line charts: the chart creator can deliberately mislead readers about the relationship between the series. If you haven’t seen it, Tyler Vigen’s Spurious Correlations website shows these sorts of false relationships in pretty funny ways.

Line chart from the Spurious Correlations website with a red line for the divorce rate in Maine and a black line for per capita consumption of margarine. — *Source: Spurious Correlations*

Solutions to the Dual Axis Chart Problem

There are a few solutions to the dual-axis chart challenge.

First, try setting the separate line charts side by side. Not everything needs to be packed in a single graph. We can break things up and use a small multiples approach. Although side-by-side graphs should ideally have the same vertical axis range to facilitate easier comparisons, we’ve already determined that these data series are not on comparable ranges, so splitting them up and using different axis ranges can work.

Two line graphs. The graph on the left is a blue line with the title Fatalities. The graph on the right is a green line with the title Miles per capita.

If it’s important to annotate a specific point on the horizontal axis, you could also stack the two graphs vertically and draw a line across both. This will change the orientation of the final graphic, but may offer an easier way to label a specific value or year.

Two line graphs stacked on top of each other. The graph on the top is a blue line with the title Fatalities. The graph on the bottom is a green line with the title Miles per capita. There is a vertical dashed line that goes through the year 2000.

Second, we might calculate an index or the percent change from some value or year and plot the data to a single vertical axis. With this approach, the reader can see the change over time for both series and compare them along the same metric. The obvious trade-off is that we lose the level presentation of the data and instead present the change.

Line graph for fatalities and miles per capita that start at 0% in 1950 and grow from there.
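The percent-change calculation is simple enough to sketch in a few lines. The function name and the sample values are hypothetical, and I’m assuming the first observation is the base year.

```python
def percent_change_from_base(values, base_index=0):
    """Express each value as the percent change from the value at base_index."""
    base = values[base_index]
    if base == 0:
        raise ValueError("base value must be nonzero")
    return [(v - base) / base * 100 for v in values]


# Hypothetical values for two series measured in very different units.
fatalities = [20, 25, 10]     # per 100,000 people
miles = [4000, 6000, 10000]   # per capita

# Both series now share one unit (percent change from the first year),
# so they can be plotted against a single vertical axis.
print(percent_change_from_base(fatalities))
print(percent_change_from_base(miles))
```

Every series starts at 0 percent in the base year, which is what lets the two lines share an axis. The choice of base year still shapes the picture, so it is worth stating it in the chart title or axis label.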

Third, try a different chart type. If showing the changes in the associations between the two series is important, try a connected scatterplot—a graph that is like a scatterplot with a horizontal and vertical axis, but each point represents a different unit of time, such as a quarter or a year.
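If you want to try one yourself, here is a rough sketch. The key step is pairing the two series by time period and joining the points in chronological order. The function names and sample values are my own, and the plotting helper assumes matplotlib is installed.

```python
def connected_scatter_points(years, x_values, y_values):
    """Return (year, x, y) tuples in chronological order, the order in which they are joined."""
    if not (len(years) == len(x_values) == len(y_values)):
        raise ValueError("all three inputs must have the same length")
    return sorted(zip(years, x_values, y_values), key=lambda point: point[0])


def plot_connected_scatter(points, x_label, y_label):
    """Draw the points joined in order and label each one with its year."""
    import matplotlib.pyplot as plt  # assumed to be installed; imported only when plotting

    fig, ax = plt.subplots()
    xs = [x for _, x, _ in points]
    ys = [y for _, _, y in points]
    ax.plot(xs, ys, marker="o")  # the line follows the list order, i.e. time
    for year, x, y in points:
        ax.annotate(str(year), (x, y), textcoords="offset points", xytext=(4, 4))
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    return fig


# Hypothetical values, deliberately out of order to show the sorting step.
points = connected_scatter_points(
    [1970, 1950, 1960], [5000, 3000, 4000], [20, 24, 22]
)
print(points)
```

Calling `plot_connected_scatter(points, "Miles driven per capita", "Fatalities per 100,000 people")` would then draw the chart. Labeling the years matters because time is no longer on either axis.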

The data I’ve used thus far were originally shown in perhaps one of the most famous connected scatterplots (aside from the Beveridge Curve, which only economists know). Created by Hannah Fairfield at the New York Times in 2012, this Driving Safety, in Fits and Starts connected scatterplot shows how the two series we’ve been looking at so far moved over the 62-year period.

Connected scatterplot graph from the New York Times titled Driving Safety, in Fits and Starts. — *Source: New York Times*

In this excellent presentation of the data, Hannah wrapped the graph around explanatory text positioned at the bottom-left of the page and labeled specific areas to denote important periods like the energy crisis in the early 1970s and air bags in the 1990s. (It’s also worth checking out Hannah’s 2010 connected scatterplot, Driving Shifts Into Reverse, that showed the relationship between the price of a gallon of gasoline and miles driven per capita.)

There is a caveat here. I find that about 8 out of 10 times I end up in one of two places with my connected scatterplots: either they are straight lines (e.g., program participation and program spending) or they are some kind of cluttered mess. Look what happens to the fatalities-miles connected scatterplot when we extend it through 2021: the 2011-2019 period is all over the place!

Connected scatterplot with the fatalities/driving data that extends to 2021.

Zooming in on the most recent 20 years gives us a bit more insight, but it’s still not as straightforward as the rest of the time series. Here, we can see a decline in both metrics between 2000 and 2010. Then, there’s some squiggliness (is that a word?) between 2010 and 2016. We then see fatalities fall and miles driven increase just before the pandemic. But then, between 2019 and 2020, fatalities rise slightly and driving declines by about 1,500 miles per capita. Then, between 2020 and 2021, we see a recovery of driving and a considerable increase in auto fatalities. Last year, Americans’ driving behavior was about what it was in 2008.

Connected scatterplot with the fatalities/driving data from 2000 to 2021.

The Exceptions

There are always exceptions, right? For dual axis charts, I think there are two exceptions to consider.

First, when we are showing a translation of a single measure, for example Fahrenheit and Celsius temperatures. In these cases, we are not trying to track two different variables but showing how one maps directly onto another.

Second, a Pareto chart, in which (typically) vertical bars show individual values tied to one vertical axis and a line shows the cumulative sum of those values (either levels or percentages) on the other axis.

In both of these cases, I don’t think the usual pitfalls apply.

Pareto chart with the title Late Arrivals by Reported Cause. There are 6 blue bars in descending order and an orange line that increases. — *Source: Wikipedia*
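The two axes in a Pareto chart come from the same numbers, which is easy to see in a quick sketch. The bars are the sorted counts and the line is their running total as a percentage. The function name and the causes and counts below are invented for illustration; they are not the data in the chart above.

```python
def pareto_rows(counts):
    """Return (category, count, cumulative percent) rows, largest count first."""
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    total = sum(counts.values())
    rows = []
    running = 0
    for category, count in ordered:
        running += count
        rows.append((category, count, 100 * running / total))
    return rows


# Hypothetical counts of late arrivals by reported cause.
late_arrivals = {"Traffic": 50, "Weather": 10, "Overslept": 30, "Other": 10}

for category, count, cumulative in pareto_rows(late_arrivals):
    print(f"{category:<10} {count:>3} {cumulative:>6.1f}%")
```

The count column feeds the bars on one axis and the cumulative column feeds the line on the other, so the second axis is a restatement of the first rather than a separate variable.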

What does the research say?

There is not a ton of research on how readers perceive and process dual axis charts.

In their 2011 study, A Study on Dual-Scale Data Charts, Petra Isenberg, Anastasia Bezerianos, Pierre Dragicevic and Jean-Daniel Fekete report that their study participants found the dual axis chart (or what the authors called the “superimposed chart”) “very confusing and demanding too much concentration or reflection.” Relative to other chart types, participants in the study “performed poorly both in terms of accuracy and time” when viewing dual axis charts.

There is some slightly more recent research on the effectiveness of connected scatterplots. In their 2014 paper, The Connected Scatterplot for Presenting Paired Time Series, Steve Haroz, Robert Kosara, and Steven Franconeri test how 14 study participants read and process different connected scatterplots. They conclude that the “low-complexity” versions of the connected scatterplot (like the one presented in this post) “can be understood with little explanation” and are better at engaging readers than more traditional graph types, like the line chart.

Conclusion

I hope I’ve demonstrated how dual axis charts can be troublesome. They can confuse your reader and they can be used to distort the presentation of the data. Try some of the alternatives listed above to make your data clearer and easier to read. In some cases, some of these alternatives—like the connected scatterplot—can be used to help your reader see how the patterns in your data are related to one another.

This post draws on Chapter 5 from my book, Better Data Visualizations: A Guide for Scholars, Researchers, and Wonks, and was first published in my bi-weekly newsletter. You can sign up for the (free!) newsletter on my Twitter profile page. Thanks to Amy Cesal for suggesting I write this post and for reviewing it.

October 6, 2022
