A couple of weeks ago, Jeffrey Shaffer at Data+Science wrote an interesting post about how to visualize categorical data when the scales of the values being compared are very different. As an example, he explored visualizing stock prices, which can be very different across companies. As an example, Jeff plotted the current stock price of Berkshire Hathaway at $268,370 per share and and Kroger at $24.06 per share in a bar chart. Yes, it shows you tremendous difference between the two values, but you can’t get any further information from this graph.
In his post, Jeff suggests an alternative approach in which he normalizes the stock prices. One could obviously normalize all prices on a 0-1 scale or 0%-100% scale, but Jeff makes the important point that oftentimes we want to show actual values–knowing that the Kroger price is 0.009% of the Berkshire Hathaway price is interesting, but not very useful if you’re trying to decide which stock to buy. To demonstrate, Jeff created the following in Tableau:
In general, I like Jeff’s approach–it’s clean and clear with good empty space and labels. I especially like how it easily allows you to see the current value relative to the high and low within an individual company. It also allows you to compare where the current price sits relative to the high and low across the different stocks. The obvious drawback, however, is that it doesn’t allow you to visually compare the magnitudes across the stocks, which comes back to the scaling issue just mentioned.
An Excel Version
Okay, enough of the summary. What I want to show now is how I remade Jeff’s graph in Excel. I tried a number of methods and I’m not sure my final approach is the best, but I’m providing the Excel file to you (see below) so you can play around. Let me know if you have a better way.
Essentially, the entire chart consists of 10 different scatterplots, a bunch of data labels, and error bars. Using Excel 2016 makes this graph easier to make because you can select custom values for the data labels in the Format Data Labels menu. Simply add data labels, select them and right-click (or, use the CTRL+1 or CMD+1 keyboard shortcut) and use the Value from Cells option in the menu (but take note, this feature is not available in previous versions of Excel and is also not available on the Mac version of Excel 2016).
I won’t go through the step-by-step here, but I think you’ll get the point pretty quickly. I have the data organized as follows, which I’ve shaded simply to make it easier to explain.
- The blue cells at the top are the original data values. I will use these to do the custom labeling.
- The green cells are the data I’m going to plot:
- In Columns B and C, I’ve created a 52-week low and 52-week high set to 0 and 10 for all companies.
- In Column D, I’ve created a current price variable that is the ratio of the current price to the 52-week high (e.g., 9.67 = $145.51/$150.40). I’ll ultimately use the custom labeling to add the dollar value, so the ratio is just used for the plotting.
- The two “height” columns (E and F) serve as the y-axis values for the data. Remember, a scatterplot needs an x- and y-value, so in this case the y-values are simply arbitrary. I’ve created them in such a way that I can use horizontal gridlines as the lines that separate the companies.
- The last four columns (G-J) are the x-axis values for “Symbol” and “Price” labels to the left of the graph.
- The yellow cells at the bottom are for the various labels at the top of the chart—“Symbol” and “Price” at the left, and “52-Week”, “Low”, “Current Price”, and “High” at the very top.
I now create a scatterplot with my 10 series and use a few different techniques to pull it all together:
- I use horizontal error bars to create the bar effect. The line is colored gray and set to a thickness of 10pt.
- The little vertical bars at the end of each gray bar are vertical error bars set to a height of 0.5. I played around with the height of the vertical error bars and the thickness of the horizontal error bar to get them to match.
- I place another scatterplot just below the low and high points (i.e., the vertical line) for the dollar value labels. I needed to do this separately to get them left- and right-aligned.
- The series for the Symbol and Price labels are set off to the side, data labels applied, and markers set to None; similar for the legend and labels at the top.
For the labels themselves, I use the new Excel 2016 (PC only) trick. I add the data labels and format them using the Add Custom Labels option. I can now input the values I want to use as labels; for example, for the column of Prices in blue text along the left, the x-values are all -1.25 (the “CurrentPriceLabel” series in column J) and the y-values go up along the rows (the “Height” series in column E). I add data labels and then select the actual Current Price data (cells D2:D9) as the labels.
This image shows all of the scatterplot points that are included to make up the graph. You can also see the x- and y-axes (here in red), from which you can see how I spaced things horizontally and vertically. Here, you can clearly see how I use Major Horizontal Gridlines for the lines between the stock prices.
I went through this pretty quickly, so if you have questions, please drop me a line in the comment boxes below or on Twitter. You can download the Excel file here to modify the graph with your own data.
If you’re interested in learning more of these sorts of approaches to extend what Excel can do for you, please check out my step-by-step ebooks for Excel 2010, 2011, and 2016 in the PolicyViz Shop.