Imagine a group of adults (~1300 total) with heart attacks who go to an ER.
Using cocaine is bad for your heart, so ideally all patients with heart attacks should be asked about cocaine use. But only about half are asked (blue circle).
Our research staff surveyed patients about cocaine use, and about 11% said they use it (red circle).
Most people have to be admitted, but about 25% (yellow circle) get to go home — where they could use cocaine again (since they’re not being monitored).
What I’m trying to show:
- some people will be surprised that only half of the people who use cocaine were asked about it by doctors (since it’s a risk), which suggests doctors aren’t good at guessing who uses, or don’t ask everyone
- some people will be interested that 25% of patients get to go home
- it’s concerning that a chunk of people going home weren’t asked about cocaine (N=52+N=24), including those who actually do use it (N=24)
- the N=161 group is people who use cocaine and weren’t screened, which is concerning, but at least they’re in the hospital, so they won’t be able to use it again right away.
Having 3 binary factors does fit a Venn diagram, but it’s difficult and sometimes impossible to make the sizes work out. Plus the outer count is not represented at all.
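To make the "outer count" point concrete, here is a small Python sketch that lists every combination of three yes/no factors. The factor names are just my labels for the three circles, not anything from the original data. Eight combinations exist, but only seven of them fall inside at least one circle; the eighth (not asked, doesn't use, admitted) has no region of its own in the Venn diagram.

```python
from itertools import product

# Labels are assumptions chosen to match the three circles described above.
factors = ("asked", "uses_cocaine", "discharged")

# Every combination of three binary factors: 2 * 2 * 2 cells.
cells = list(product((True, False), repeat=len(factors)))

# Cells that land inside at least one circle of a three-set Venn diagram.
inside = [c for c in cells if any(c)]

# The single cell that lies outside all circles (the "outer count").
outside = [c for c in cells if not any(c)]

for cell in cells:
    labels = [name if flag else "not " + name for name, flag in zip(factors, cell)]
    where = "inside a circle" if any(cell) else "outside all circles"
    print(", ".join(labels), "->", where)
```

A mosaic plot or a tree gives all eight cells a visible area or node, which is why the outer group stops being invisible.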
Two options that come to mind are a mosaic plot and a decision tree. The mosaic plot is more graphical. (That is, it relies more on a graphical representation of the values.) Here’s an example, which could be labeled as desired to highlight your point.
For my chart I swapped the discharged and using-cocaine values. You say the yellow circle is 25% and the red is 11%, but the red circle is bigger than the yellow one, so I’m guessing they got swapped somewhere.
The mosaic chart makes it easier to see that about 25% are discharged overall and that the rate is higher for those not asked about cocaine use.
Xan, that’s a cool idea. Do you have recommendations on software/programs to do that? Thanks!
Great — I hope it will lead to something useful. I used my own software, JMP, so I can’t offer any objective recommendations. Mosaic charts are most commonly found in statistics software.
Emmy,
I think you can make this mosaic in Excel. Just repeat the data for the “Asked”, “Not Asked”, etc. series as many times as you need and then create a stacked bar chart. (If that doesn’t work, you may need to rotate the whole thing 90 degrees.)
-Jon
Not sure how much control Excel gives you, but an important feature of a mosaic chart is that the widths of the “bars” also vary with the data. It’s more of a nested treemap than a bar chart.
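For anyone building this by hand, here is a rough Python sketch of the geometry behind a two-factor mosaic. It is not how JMP or any particular package does it, just the basic idea: each column's width is its share of the total, and each tile's height is its share within that column. The function name `mosaic_rects` and the counts are made up for illustration; they are placeholders, not the study's numbers.

```python
def mosaic_rects(table):
    """Compute mosaic tiles inside a unit square.

    table maps a column label to a dict of {row label: count}.
    Returns a list of (column, row, x, y, width, height) tuples.
    """
    total = sum(sum(rows.values()) for rows in table.values())
    rects = []
    x = 0.0
    for col, rows in table.items():
        col_total = sum(rows.values())
        if col_total == 0:
            continue  # an empty column gets no width
        width = col_total / total  # column width = marginal share
        y = 0.0
        for row, count in rows.items():
            height = count / col_total  # tile height = share within the column
            rects.append((col, row, x, y, width, height))
            y += height
        x += width
    return rects


# Placeholder counts for illustration only (NOT the data from the question).
example = {
    "Asked": {"Admitted": 80, "Discharged": 20},
    "Not asked": {"Admitted": 60, "Discharged": 40},
}

for col, row, x, y, w, h in mosaic_rects(example):
    print(f"{col:10s} {row:11s} x={x:.2f} y={y:.2f} width={w:.2f} height={h:.2f}")
```

Because width times height equals count divided by total, each tile's area is proportional to its count, which a plain stacked bar chart with equal-width bars does not give you. The tuples can be passed to whatever drawing tool you like as rectangles.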
I have never found Venn Diagrams particularly useful, except as an abstract form to show that overlaps exist between sets of things – trying to use them to show how much of one thing overlaps another just becomes a pile of different things that can’t be accurately compared.
I have to say that I do not find the Marimekko chart to be much of an improvement, and I tend to think that this is a case of trying to shove everything into one chart when using multiple charts is the solution.
Stephen Few illustrates this concept well in his ‘examples’ section:
http://www.perceptualedge.com/example13.php