About three years ago, I created a Tile Grid Waffle Chart Map for the African continent. At the time, I played around with creating a version for the United States, but never got around to using it. Recent work gave me the opportunity to dust off that workbook and to finish it off, so I’m writing up my steps here and making the template available in the PolicyViz Shop. The creation steps are very similar to the African map, but I’ll describe it again in case having another explanation is helpful.
The map itself is built in an Excel spreadsheet not an Excel graph. I’ve used my previous Tile Grid Map template to set up the map—instead of a single square for each state, there are now 100 squares for each state. I’ve also inserted an empty gap column and row between each state. Color is added to each cell based on the value in that cell using the Conditional Formatting menu. A few reference maps and sheets help make this visualization nothing more that a series of comparisons between the data to be visualized and a series of counters.
The entire workbook has four separate tabs:
Names. Uses the layout of the final map with abbreviations for each state in each cell.
Counts. Uses the layout of the final map with numbers in each cell, increasing from 1 in the bottom-left cell to 100 in the top-right cell.
Data. This is where everything is set up. In the first few columns (B-G), I read in the raw data. For this example, I show the distribution of each state’s population across 6 age groups.
The second set of columns (J-P) rounds the raw data. A waffle chart requires round numbers because each square represents a single percentage point. Notice how some of the totals are not exactly 100%, which admittedly is a potential issue.
The final set of columns (Q-V) sequentially sums the shares and will be needed for the ultimate map creation. In the first row, the first value in column Q is just the raw data value (24%). The second value in column R is the sum of the first two shares: 24% + 9% = 33%. The third value in column S is the sum of the first three shares: 24% + 9% + 12% = 45%. And so on. The values in the final column are set (i.e., hard-coded) to 100% to avoid the problem with totals that are not exactly equal to 100%. This affects the older two groups, but with 5,000 squares in the final map (100 squares times 50 states), I don’t know if it really matters.
I’ll demonstrate how the whole thing works in this same worksheet. I’ve created a small version of the Counts and Names tabs in the first two panels of this image. The bottom panel is going to use a big formula and will be repeated across the entire final map for each state.
Here’s the formula:
I clearly need to break this down for you, but it’s actually not that complicated (an alternative approach might be to use a VLOOKUP with an approximate match range lookup value). It’s just a series of embedded IF statements that evaluate each data share relative to the Counts panel. If the statement is true, the formula places a number in the cell (1, 2, 3, 4, 5, 6) corresponding to that age group; if false, it puts a “99”.
Let’s look at the first two arguments to see how it works:
IF(Y12<=VLOOKUP(Y26,$I$3:$V$54,9,0),1,…). First, let’s look at what VLOOKUP gives us. The VLOOKUP finds the data value for the correct state and age group cell (e.g., Alabama and 0-18 year olds). Here, the Y26 cell is the bottom-right cell in the Names tab (in this example, “AL”). The formula then looks into the data for the “AL” abbreviation and pulls out the data in column 9, which corresponds to the first age group (and can be seen in column Q). For this specific example, the share of 0-18 year olds in Alaska is 24%.
We now take the value found from the VLOOKUP command and compare it to cell Y12, which is just a simple counter (see the Counts tab). As we copy and paste, we continue to compare the cells in the Counts tab to the shares of 0-18 years old. If that relationship is true, we get the number one (1) in each cell—you can see we get 24 occurrences of the number one in the bottom panel.
If the relationship is not true—that is, the counter is larger than the share—we evaluate the next relationship:
IF(Y12<=VLOOKUP(Y26,$I$3:$V$54,10,0),2,…). Same thing happens—if the share is in that bracket—which works because we sequentially added the values in columns Q-V—we put a 2 in the panel, and so on. This gives us a panel sequentially full of ones to sixes.
Finally, we use the Conditional Formatting menu to add colors to each number and add a white border around all of the cells.
Maps. By using a combination of relative and absolute references, all we have to do is copy and paste the formula from the example above to the state layout in the Map tab (references are changed in the original formula to call the Names and Counts tabs). In this version, row heights and column widths are set to the same size (here 11 pixels each) as well as separating rows and columns (here 4 pixels). Throw Conditional Formatting on top, add a white border around all of the cells, manually add the text boxes for each state abbreviation (kind of a pain, but no big deal), and you’ve got your Waffle Tile Grid Map.
If you don’t want to go through the hassle of building this yourself, you can buy a copy of my Excel file in the PolicyViz Shop. You’ll need to expand the big IF formula if you have more than six groups, but the existing file will work just fine for anything with six groups or fewer.