Gridlines are one of those elements of most data visualizations that many of us just accept. They are just part of the graph, not too big and not too bold, so just let them be. Personally, I prefer to forgo gridlines in bar charts when I include data labels above the bars. Let me demonstrate.
This is a bar chart of the population in 10 countries around the world. Consider Italy (highlighted in blue): without the labels, the gridline at 50 million helps us see that there are more than 50 million people living in the country. Similarly, we can see that there are a bit more than 100 million people living in Ethiopia and more than 200 million people living in Brazil.
Now let’s add data labels to the bars for the case where we think it would be helpful for the reader to see the exact values. Here, we see that Italy’s population is 60 million people, so the gridline doesn’t help us discern any more information from the graph. And once the gridlines are removed, the y-axis labels would just be hanging out in space, so I deleted those labels as well.
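If you build your charts in code, the same idea might look something like the following. This is only a minimal sketch, assuming Python with matplotlib 3.4 or later (for `bar_label`); the function name `labeled_bar_chart` is hypothetical, and apart from Italy's 60 million the values are rounded placeholders rather than real population figures.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt

# Values in millions. Only Italy's 60 comes from the text above;
# the other two are illustrative placeholders, not real figures.
populations = {"Italy": 60, "Ethiopia": 110, "Brazil": 210}


def labeled_bar_chart(data, highlight="Italy"):
    """Draw bars with data labels and no gridlines or y-axis (hypothetical helper)."""
    fig, ax = plt.subplots(figsize=(6, 4))
    colors = ["tab:blue" if name == highlight else "lightgray" for name in data]
    bars = ax.bar(list(data.keys()), list(data.values()), color=colors)

    # The data labels carry the exact values...
    ax.bar_label(bars, fmt="%d", padding=3)

    # ...so the gridlines and the y-axis labels are no longer needed.
    ax.grid(False)
    ax.yaxis.set_visible(False)
    for side in ("left", "top", "right"):
        ax.spines[side].set_visible(False)

    ax.set_title("Population (millions)")
    return fig, ax


fig, ax = labeled_bar_chart(populations)
```

From here, `fig.savefig("population_bars.png")` would write the chart to a file. Hiding the whole y-axis, rather than only the gridlines, mirrors the choice above to delete the axis labels once nothing anchors them.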
It’s not the case that I’ll delete the gridlines every time I include data labels. Consider this stacked bar chart from the National Center for Education Statistics that shows where schools with the highest density of black students are located (pulled from my Data Visualization Catalog). Here, even though the data labels are included on the bar segments, I might want to make it clear to the reader that, say, the shares in the Northeast and Midwest are below 50% in three of the four categories.
I would make some slightly different decisions in this graph, however, as you can see below. First, I would directly label the bar segments, lighten and reduce the number of gridlines, and remove the tick marks along the horizontal axis. I would also break the final ‘60-100’ group into two groups (‘60-80’ and ‘80-100’) to maintain consistency with the rest of the categories, but I don’t have those data values to make that graph here. Here, even with the data labels, you can see that the Northeast and Midwest account for more than 50% of students in just the category on the far right.
Changing the Colors
I’ve maintained the green shades here, though I might consider using different colors altogether. And before you point out that the red-green color choices here are potentially problematic for readers who have a form of color vision deficiency (“color blindness”), let me just point out that, here, the red and green colors differ enough in saturation to be distinguishable. The exact colors are also less important for two reasons: First, the segments don’t cross, so there is no confusion about which bar segment corresponds to which region; and second, by deleting the legend and directly labeling the bar segments, I avoid any potential issue of figuring out which label corresponds to which color.
I take a somewhat similar view about gridlines with line charts, but am more likely to include them. As just a simple example, this line chart shows average annual participation in the Supplemental Nutrition Assistance Program (SNAP) from 1970 to 2020. By leaving in the gridlines, the reader can more easily see how participation varied between around 20 million and 30 million people between about 1980 and 2010, with participation reaching almost 50 million people in 2013. In this case, this is even more important because the axis labels are on the left side of the graph, ‘far’ away from the end of the period on the right side of the graph. And labeling all of the points in this chart would be overwhelming and not particularly useful for the reader.
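For a line chart like this, one way to keep gridlines present but quiet is to draw only a few light horizontal lines and drop the tick marks. The sketch below assumes Python with matplotlib; the helper name `line_chart_with_light_grid` is hypothetical, and the series is made up for illustration, so it should not be read as actual SNAP data.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator

# Placeholder series in millions, made up for illustration (not actual SNAP data).
years = [1970, 1980, 1990, 2000, 2010, 2020]
values = [5, 20, 22, 18, 40, 38]


def line_chart_with_light_grid(x, y, max_intervals=5):
    """Draw a line chart with a few light horizontal gridlines (hypothetical helper)."""
    fig, ax = plt.subplots(figsize=(7, 4))
    ax.plot(x, y, color="tab:blue", linewidth=2)

    # Cap the number of y-axis intervals so there are only a handful of gridlines.
    ax.yaxis.set_major_locator(MaxNLocator(nbins=max_intervals))

    # Light gray horizontal gridlines only, drawn behind the data line.
    ax.grid(axis="y", color="0.85", linewidth=0.8)
    ax.set_axisbelow(True)

    # Remove tick marks and the spines that add no information.
    ax.tick_params(axis="both", length=0)
    for side in ("top", "right", "left"):
        ax.spines[side].set_visible(False)

    ax.set_ylim(bottom=0)
    ax.set_title("Illustrative series (millions)")
    return fig, ax


line_fig, line_ax = line_chart_with_light_grid(years, values)
```

The gridlines stay because the axis labels sit far from the right-hand end of the series; lightening them keeps the data line as the most prominent element.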
Including gridlines is primarily an aesthetic choice, but my view is to omit as many non-data elements from the graph as possible. As you continue to work with your data and make your own graphs, you will develop your own style for gridlines and other graphic elements.