Marimekko charts encode two variables: one along the height of the vertical axis and another using the width of the bars or columns. This one from The Economist, for example, has GDP per capita ($) on the vertical axis and share of global population (%) along the horizontal axis. There are alternatives, of course; they could have used a scatterplot or maybe two column charts. But the Marimekko, I think, does a nice job extending the standard column chart in another dimension.
In Excel, the Marimekko is not a default chart type, so it takes a little doing to create. Jon Peltier has a nice tutorial combining a stacked area chart and line chart, but it requires a bunch of data manipulation, which I found hard to automate (it’s perhaps why he includes it in his Peltier Tech Chart Utility). You may find my method just as complicated, but I use a series of formulas to arrange the data so that it can be easily replicated.
Let’s start with the data. For this example, I have 10 items, and for each I have the percent of each item purchased (% Purchased); these vary from 45% for Item 1 to 4% for Item 10. I also have the share of the total sold for each item (% of Total), which all sum to 100%.
There are lots of ways to plot these data: small multiples, scatterplot, pie chart (aghast!), treemap, or a unit chart. For this Marimekko, I’ll put % Purchased along the vertical axis and % of Total on the horizontal axis, and use a different color for each of the 10 Items.
Quick, important caveat here: This method only works when your shares (% of Total) are whole numbers. Peltier’s method allows those shares to be real numbers, so if you have such data, best to use that approach.
The philosophy is pretty simple: I’m going to build a column chart with 100 columns, repeating values as necessary. The easy way to do this will give you a single color across all 10 groups (100 columns). The harder—and more satisfying—way will use multiple formulas, but you’ll be able to give each series a different color. In either case, you could simply copy and paste your data multiple times to match the observed frequency, but that can take a long time and is not easily replicable with different data.
Setting up the Data
For either version, we’re going to need to set up the data in a specific way to use the formulas later on, so I’ll walk you through the set-up first, and then the construction of the chart. Some of this might seem repetitive, but bear with me. You can download the Excel file here.
- [Column E]. Create a new column for the % Total variable, but change it from a percentage to an integer. You could copy and paste the original data and change the format of the new cells, but I’ll use a formula [E3=C3*100] so that the data can be easily updated.
- [Column F]. Create a Count variable that will denote the cumulate count of each item. We’ll start with the number 1, which I hard-coded in cell F3. I’ll then use a simple formula [F4=F3+E3, F5=F4+E4, …] to set the number of cells for each value. Thus, the first Item will ultimately be represented by the 1st through 10th columns; the second Item starts with the 11th
- [Column G]. Just repeat the % Purchased series; again, I’ll use a formula [G3=B3] so the data can be updated later.
- Column [I]. Just repeat the Count variable [I4=F4].
- [Column J]. Simple Item # here, counting up from 1 to 10.
- [Column L]. Repeat that Item # series [L4=J4].
- [Column M]. Repeat the % Purchased variable [M4=G4].
- [Column O]. We won’t need this column for the version with a single color, so let’s skip it for now.
- [Column P]. You can think of starting to build the chart by setting up this column first. This Counter simply counts from 1 to 100. Don’t type this manually—type a “1” in cell P3 and then the formula [P4=P3+1] in the cell just below and drag it down.
- [Column Q]. Okay, here’s where we start using some actual formulas to build the chart and make the chart easier to replicate. I’m going to use a VLOOKUP formula to do so (you could also probably use a combination of INDEX and MATCH). If you’ve never used VLOOKUP, it’s worth learning because you can create all sorts of look-up tables that will negate the need to copy and paste, which is hard to update and can lead to errors. I like some of Dave Bruns’ tutorials available at ExcelJet.
Here’s what I put in cell Q3:
=VLOOKUP(P3, $F$4:$G$13, 2, 1)
Let me explain each piece of the formula:
- P3: this is the thing we are going to look-up; specifically, we are going to search in the Counter series we just created.
- $F$3:$G$3: We’re going to query the data we created earlier that consists of the Count and % Purchased data found in columns F and G. VLOOKUP will match the first argument to the first column in the look-up table, which is why we needed to order things the way we did in the first set of steps. (The dollar signs ($) act as the absolute references so that I can copy and paste this formula with changing the cell references.)
- 2: This corresponds to the column number we want to extract, so this refers to the % Purchased data in Column G.
- 1: This is the “range_lookup” value and tells Excel we want an “exact match” of the value in cell P3 instead of an “approximate match.” It’s good practice to specify this argument whenever you use VLOOKUP.
So what happened here? All we’re doing is repeating the % Purchased variable some number of times equal to its share of the total. Thus, this formula will give you 10 values of 45%, and in the 11th row, you’ll start with values of 38%, which will continue for 18 positions, and then on and on until you’ve filled up 100 values.
You can now create a column chart from Column Q. Once you’ve done so, right-click on the chart and change the Gap Width to 0% and the Series Overlap to 100%. You’ll now have the basic, single-color Marimekko. (We’ll talk about getting the x-axis labels set up correctly at the end.)
Marimekko with Different Colors
The challenge with this first graph is that to edit the color of the 10 series separately requires you to change the color of each of the 100 columns. But we can create additional data series to automate that task.
- [Column O]. To start, let’s go back to Column O, which we skipped earlier. We’ll use another VLOOKUP formula here: =VLOOKUP(P3,$I$4:$J$13,2,1). This VLOOKUP again queries the Counter series and here pulls out the Item # from Column J (notice that I specified the range_lookup value in the last argument again).
11-20. [Columns R-AA]. I’m going to create 10 different data series (one for each Item) so that I can add them all to one chart and then color each by simply grabbing the entire grup. The numbers at the top of the series (in Row 2) are important here and correspond to each Item Number. This formula is a little complicated, but once I build it in the first column, I can just drag it horizontally and vertically.
Let’s break it down again:
- To start, the IF statement evaluates the first argument, here $O3=R$2. I’m simply comparing the Item Number entered in the second row to how many times each series will repeat in Column O. Notice the strategic use of the absolute/relative reference “$” symbol; I’ve placed these here so the references move or don’t move as desired when I drag the formula across the worksheet.
- Another VLOOKUP: Here, look-up the % of Total from Column O in the last pair of data in Columns L and M. This formula pulls out the % Purchased variable, which is the thing I’m going to plot.
- The “0” in the last argument finishes the IF statement, so that if the condition is true, I get the data values and if not, I get a zero. As an example, for the first Item [Column R], the first 10 rows fill with 45% and the remainder fills with 0%; for the second Item, the first 10 rows are set to 0%, the next 18 rows are 38%, and the remaining 72 rows are 0%.
Dragging this across and down should give you blocks of numbers that repeat the % Purchased values corresponding to their shares of the total. If you plot this as a column chart, you’ll have 10 separate series that you can color and style individually. Again, you need to format the columns and set the Gap Width to 0% and the Overlap to 100%.
The final piece is to set up the x-axis labels. I had to sort of hammer this together a little bit, so bear with me; one last formula.
- [Column AB]. I want to place labels at every 10% increment, so I’m going to create a custom x-axis series. Here’s the formula: =IF(INT(P3/10)*10=P3,P3/100,” “). I’ll let you inspect this on your own, but the basic idea is to create a value at each round number and a missing otherwise. If you enter this formula in cell AB3 and drag it down the column, you’ll only see “10%”, “20%”, “30%”, etc. show up in the cells.
Now, right-click on your chart and choose Select Data and add Column AB as the Category (X) Axis Labels. Now go back to the chart and format the x-axis:
- On Excel 2011 on Macs, you need to change the Interval between labels from 4 to 1, and the Interval between tick marks from 1 to 10.
- On Excel 2010 and 2013 on PCs, you need to change the Interval between tick marks from 1 to 10. You’ll also need to select the Specify interval unit option in the Interval between labels menu, and Excel should fill in the 1 for you.
Well, there you have it folks: A Marimekko that just repeats a bunch of columns. If you’ve used this arrangement and formula, the chart should easily update with new data, you can easily style and color the different series, and you will have nice labels along the horizontal axis.
There are probably ways to streamline this approach. Using a combination of INDEX and MATCH, for example, might avoid having to set up the data at the beginning of this tutorial, and there may be an easier way to format the x-axis. If you have comments, suggestions, or other ideas, please add them to the comments section below.