Kirk Munroe is a business analytics and performance management expert. He has held leadership roles in product management, marketing, sales enablement, and customer success in analytics software companies including Cognos, IBM, Kinaxis, Tableau, and Salesforce. Kirk has a passion for coaching and mentoring people to make better decisions through storytelling with data. He is currently one of the two owners and principal consultants at Paint with Data, a visual analytics consulting firm. Kirk lives in Halifax, Nova Scotia, Canada.

Episode Notes

Kirk | Web | Twitter 

Book: Data Modeling with Tableau: A practical guide to building data models using Tableau Prep and Tableau Desktop

Kirk Munroe: 4 Common Tableau Data Model Problems…and How to Fix Them

Sponsor

Are you ready to earn extra income from sharing your expert opinion? Head over to userinterviews.com/hello to sign up and participate today!

New Ways to Support the Show!

With more than 200 guests and eight seasons of episodes, the PolicyViz Podcast is one of the longest-running data visualization podcasts around. You can support the show by downloading and listening, following the work of my guests, and sharing the show with your networks. I’m grateful to everyone who listens and supports the show, and now I’m offering new exciting ways for you to support the show financially. You can check out the special paid version of my newsletter, receive text messages with special data visualization tips, or go to the simplified Patreon platform. Whichever you choose, you’ll be sure to get great content to your inbox or phone every week!

Transcript

This episode of the PolicyViz podcast is brought to you by User Interviews, which connects researchers with quality participants who can earn money for their feedback on real products. You can go over to userinterviews.com, it’s free to sign up, you can apply for your first study in under five minutes, and you can earn real money for helping software developers and engineers create better products. Most of the interviews are your sort of common type of study, but there’s also surveys, there’s also diary studies, and there’s also online focus groups. So you have an opportunity to share your opinions with top brands like Adobe, Intuit, Spotify, Amazon, and many more, if you just go over to userinterviews.com. So if you are ready to earn extra income from sharing your expert opinion, again, really designed for developers and engineers to provide feedback on their products, head on over to userinterviews.com/hello to sign up and participate today. 

Welcome back to the PolicyViz podcast. I am your host, Jon Schwabish. I hope you are well, I hope the weather is turning nice into spring. But if you are listening to a podcast, you’re probably out walking in the sun, walking your dog, on your run, taking your jog, I don’t know, but I’m glad you’re here, I’m glad you have tuned in to this week’s episode of the show where we’re going to learn more about data in Tableau. I’m really excited to have Kirk Munroe join me on the show for the conversation. Kirk is the author of the new book Data Modeling with Tableau. Kirk is also one of the chiefs at Paint with Data. And here’s the thing, I was on Twitter, sort of, complaining about – I’m going to admit, I was complaining – complaining about how a lot of Tableau blog posts ignore the part about the format of your data, the structure of your data, and a lot of that is because most of those tutorials use the basic built-in datasets within Tableau. Fine, that’s great. But in many cases, my data aren’t as clean, or they’re not in the same format, the same structure. So Kirk reached out, he’s got this great blog post on Kevin & Ken Flerlage’s Flerlage Twins blog, which I will also share in the show notes. But he’s also got this great new book Data Modeling with Tableau, where I am almost certainly going to offer the first two chapters, at least, if not the full book, but, at least, the first two chapters to my students, as ways to think about how to use data in Tableau. I just think this is super important, you can’t get to the visualization part without knowing anything about the data part, and particularly about the data structure part. So I hope you’ll enjoy this week’s episode of the podcast. Here’s my conversation with Kirk. 

Jon Schwabish: Hey, Kirk, good morning for both of us, even though we’re like an hour apart, right? 

Kirk Munroe: Yeah, good morning, Jon. Thanks for having me on, flattered to be here.

JS: I was saying earlier, now that I understand that Halifax time is an hour ahead, when it’s four o’clock Eastern, I can say, well, it’s cocktail hour in Halifax. 

KM: Right, five o’clock, somewhere four. 

JS: Right, now I know exactly where it’s five o’clock and four, so I feel better, yeah. Well, thanks for coming on the show. I’m excited to chat with you, because let me give listeners just a little quick background. So I was having not a struggle, I was wrestling a little bit with data formats in Tableau, and the one thing I noticed in a lot of blog posts of Tableau tutorials was that people didn’t talk about the structure of the data, they’re always using the superstore data, and it’s in this particular format, and you responded with this fantastic blog post that you had written for Kevin & Ken Flerlage, and that was really great. And then, on top of that, you have this great new book, Data Modeling with Tableau, which the first two or three chapters go into even more detail on that. So I reached out, super happy to have you on the show, but I want to start with a little bit of background. You have a firm Paint with Data, but you also have an impressive background before that, so I was hoping you could just talk a little bit about that, and then, we can dive into some Tableau stuff. 

KM: Yeah, sure. And yeah, thanks, happy to be here. So yeah, my background’s in BI, at least, with, you know, going through whole history started in 2001. I went to Cognos as a product manager, and I kind of went up the product management ranks there for about five years. The reason I like being a product manager at that time was BI analytics was still a little bit nascent. So I felt like being on the product side, I have a bit of a technical background, was the place to be. Then once we got bought by IBM, I went into sales enablement, because I thought the natural next step of that was to help sellers, customer facing people actually understand what analytics was, like, it’s not just reports that you can present it to people that have static information, like, what actual analytics meant and answering business questions. 

And then, I went to a supply chain analytics company called Kinaxis, and ran product marketing there with really super deep analytics, like, but very niche. So we could do cool things, like, if you had a bill of materials for say, like, I don’t know, two different laptop models, and you could ask a question of it and say, well, what if we took this part from this one and gave it to this one, and you can see how many customer orders would be late within like seconds, which would normally take overnight to run in an MRP system or something. And then, anyway, I did a start-up again, then I went to Tableau actually for four years to do customer success, because I was drawn to that, because I thought, you know, this stage we’ve gotten to is that people didn’t understand – they understood the analytics at a high level, but they didn’t know how to make it happen. I did that for four years, and then, on the Paint with Data thing, my wife actually, who’s a Tableau ambassador, a User Group ambassador, started a company called Paint with Data to be a consultancy. And, of course, we used to take data and actually paint with it. So we were a consultancy that worked with companies of all kinds of different sizes, I joined 18 months ago. It was the intersection of two things really, one was we’d always said we wanted to work together, and then, we kind of went, well, just run them [inaudible 00:06:23]. 

JS: Right, yeah.

KM: And part of it was just, the customer success role was good at Tableau, but I certainly got a little bit frustrated that I couldn’t get hands on, like, the things that really make customers successful, like, the job became a little bit too much relationship, as opposed to the things that actually make people successful, like their data source as an example, right? And the role got really far removed from that kind of stuff, like, they didn’t care that new hires were trained in Tableau, certainly not at that kind of level, right? Like, I would go get every Tableau certification I could get, I mean, I thought it was really important to – if you’re going to be a trusted advisor, you’ve got to know more than the people you advise, and not just conceptually.

JS: Right, yeah. Okay, so you joined 18 months ago. So were you working on the book? 

KM: No, so the book, it was kind of a funny thing, because I joined, and then, I was in for maybe six months, and then, the publisher called me and went, we want someone to write a book on data modelling, will you write this book – in Tableau. And the first thing I thought was, well, who am I to write a book, and, of course, you get into, there’s so many smarter people than me that could do this, which is just a natural thing to do, I think. And I thought, and I think in my head, I went, there’s so many Viz books being written all the time, there must be a lot of data modeling books. So I asked for 24 hours, and I did a search, and there are none. So, to be fair, Carl Allchin is awesome, like, Carl’s got a book on Tableau Prep, but not on data modeling, kind of, the stack for Tableau. 

So I went, well, then you quickly – this has happened to me a lot in my life – I go from why me to why not me. And I started writing the book in about April, I guess, so it took about six months from April to October to write the book. And then, I thought it would be a 200-page book, and trying to be as concise as I can possibly be. I got it down to like, it turned out to be 325 pages in the end, like, there are probably more there than I talk about.

JS: Right. Yeah, I’ve been there. Okay, so let’s talk about data. I preface this whole conversation with this difference kind of between wide data and tall data, but maybe I’ll ask the question in sort of a more general way, like, from your perspective, what is the biggest challenge, in particular, I guess, new Tableau users face when it comes to the data modeling and the data structure in Tableau? 

KM: Yeah, to take one step back from that, which I get to a little bit in the book, and I think we’ll probably have more blog posts on this, what makes Tableau so special, that no one sees, actually is VizQL, which Tableau used to talk about. So until Tableau came along you would have to query your data, and then, you have to format your data a little bit, and then, you would visualize your data. So, in most other BI tools, I think ThoughtSpot have done a pretty good job, the same kind of approach as Tableau now, but traditionally, what it is, is you open up a product, and the first thing it will ask you is, how do you want to chart this data. 

And my frustration till I saw Tableau, you know, I was working with Cognos, so one of the frustrations at the time was, like, I don’t know how I want to visualize it, yeah, I’m just trying to interact with my data. So Tableau solved that problem, and what it is, is, so basically, when they write their SQL underneath the scenes, they have this clause appended on the end that basically goes display as, which is why people fight Tableau sometimes and find it unintuitive at the very start. But if you get in, Tableau talks about – used to talk about, at least, the analytics flow, you just start clicking around and asking and answering questions and dragging things to [inaudible 00:10:02] and the Viz keeps changing without you explicitly going drop this on columns, this one on color, it just – it’s smart enough to mostly know. Like, I never drag anything to – I rarely drag anything to a column or row shelf as an example, I’m a big double clicker. 

JS: Okay.

KM: And I let Tableau figure that out, like, I might swap my rows and columns, but I’m rarely dragging.

JS: You’re rarely dragging, okay.

KM: That background is important, I think, because Tableau always assumes, and this is where the data modeling comes in, that your data structure underneath has a series of columns that it’s going to convert to fields, and every one of those columns is going to be distinct. So not discrete, which we get, but distinct in that if it says customer, the only thing in that field is going to be customer. And if it says revenue, the only thing that’s going to be in there is revenue, sales, etc., and that’s why a lot of people pick up superstore, which is already formatted that way, and they have a lot of fun, and then they bring their own data and can’t figure out what’s going on. And the reason Tableau does that is it makes this SQL thing easy to do, it makes it easy to know how to visualize it, because they know how the data is structured. I guess, they assume how the data is structured.

JS: They assume, right.

KM: Right. So that’s why you can run into problems. And we don’t need to be super technical. The way CPUs work, like, have always worked is they work well when you pass them an array of data. And then, you can filter, slice, whatever that – and they’re really good at aggregating it. So analysis at the end of the day is about some level of aggregation of data, and visual analytics, which Tableau does, is doing that visually. So basically, it’s really easy if you have a column called revenue, and then, you pull on region, it’s easy for Tableau to go, okay, sum, and then, break it down by that region, right? And then, color it by subcategory, it’d be ugly, but it would know by subcategory. It does this stuff very fast if the data is structured that way. 
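As a rough illustration of the "sum it, then break it down by region" idea, here is a minimal Python sketch. The rows, the field names, and the `sum_by` helper are all hypothetical, and this is not how Tableau is implemented; it only shows why aggregation is simple when every column holds exactly one kind of thing.

```python
from collections import defaultdict

# Hypothetical rows: each column holds one kind of value only.
rows = [
    {"region": "East", "subcategory": "Chairs", "revenue": 120.0},
    {"region": "East", "subcategory": "Phones", "revenue": 80.0},
    {"region": "West", "subcategory": "Chairs", "revenue": 200.0},
    {"region": "West", "subcategory": "Chairs", "revenue": 50.0},
]


def sum_by(rows, dimensions, measure):
    """Sum `measure` for each distinct combination of `dimensions`."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# "Pull on region": one total per region.
revenue_by_region = sum_by(rows, ["region"], "revenue")

# "Color by subcategory": the same sum, broken down one level further.
revenue_by_region_and_subcategory = sum_by(rows, ["region", "subcategory"], "revenue")
```

Adding a dimension only changes the grouping key; the measure column never needs a conditional calculation.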

If, for instance, your field was conditional, and it was called name, and the next column was vendor or customer, and then, there was a name, and the next column had vendor or customer, and you were to try to dynamically write a calculation in Tableau to go, if this column equals vendor, then this one, Tableau is going to be terrible. Like, it’s just awful, right? Because it assumes that structure, which I knew, as soon as I went to Tableau I – we had to do a demo at the time to get hired. I refused to use superstore because no data looks like this. I’ve been around long enough that I knew that, so I brought in a bunch of Airbnb data, the Inside Airbnb data, you know, just great data, but to show. And then, I made a whole scenario about if you’re working for the city, how happy would you be with this, or if you’re a potential host, and I did a whole demo around that. And then, I realized, oh, this, like, formatting data thing is tricky, like, you have to get this right. 

JS: Yeah. 

KM: So yeah, that’s it – so that’s fundamentally it’s 325 pages of how do you get your data like that, and because there’s a lot of nuance, they need that.

JS: So I know you’re not at Tableau anymore, so this is just dreaming, but, like, if you had your druthers, would you have Tableau focus there, presumably, they have an AI and a ML team working on a variety of things, would you have them do something similar to the show me tab, but not for graphs, but for data where it says it looks like your data is in this structure, would you want it in this structure, and you click a couple of buttons and you’re good to go? 

KM: Yeah, for sure. And maybe, on the – what’s the new feature called – the workbook optimizer, it could be more than just – it should be able to go the – it’s getting there a little bit, but it definitely has to, I think, get better at going, the reason that you have all these calculations and weird parameters and whatever is because your data is not shaped right. 

JS: Right. 

KM: I think it would be not terribly hard for them to pick it up, I know, like, Ken Flerlage has a great line, which I love, which is, if you’re doing something in Tableau, and it seems like it’s more difficult than it should be, it’s probably because your data [inaudible 00:14:11] shaped right. You can just, like, this is more complicated than it should be, like, nine times out of 10, that’s because your data is not shaped…

JS: Your data is not shaped right. So for those who are, let’s say, like me, sort of, relatively Tableau newbies, what do you recommend for folks to do when they’re in that position where they’re struggling, and maybe they even realize, oh, my data’s wide, and it needs to be long or tall, like, what tools do you use to do that reshaping when you’re working there? 

KM: Yeah, well, I mean, first, they should buy the book. 

JS: Obviously, yeah. 

KM: I think it’s just that the first exercise is, whether you use a piece of paper or you mentally do it or whatever, just like if you were going to, I know, Chantilly talks about this a lot, she’s not the only one who talks about it, if you’re going to create a visualization, you should kind of map it on a piece of paper somewhere first to see. You should think about what would it take, before I even get to a tool, what would it take to get these fields into these very distinct columns. 

JS: Right. 

KM: And then, sometimes, and most of the time, all it’s going to take is pivoting rows to columns, or pivoting columns to rows. And it’s that simple, usually, and then, you can pivot columns to rows in either Desktop or Prep, and you can pivot rows to columns in Prep. I feel like that’s a feature that’s been in there now for, at least, a couple of years that almost no one sees…

JS: No one knows about.

KM: Even the Prep community knows it’s there, it’s the slickest little feature, and it just makes the world of difference, because when you open up a workbook, and you see if people have those types of calculations I was talking about, like, if this field equals profit, then profit, and then, they’ve got a string field to put dollar signs in front of it. I’m like, just reshape that field. 

JS: Yeah. 

KM: Just reshape it… 

JS: [inaudible 00:16:10] and add a dollar sign, yeah, I’ve done all that too, yeah. 

KM: Yeah, and, in Prep, it’s literally pivot rows to columns, and it’s like, which one do you want to – and then, you drag two things over, and it’s done. I mean, like Prep, like, Prep is a terribly underrated tool. 
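For readers who want to see what a rows-to-columns pivot does to the data, here is a minimal Python sketch. The order data and the `pivot_rows_to_columns` helper are hypothetical, and this only imitates the result of the kind of pivot described above; it makes no claim about how Tableau Prep works internally.

```python
# Hypothetical "measure name / value" layout: one field mixes sales and profit,
# which is what forces calculations like "if this field equals profit, then...".
long_rows = [
    {"order_id": 1, "measure": "sales", "value": 100.0},
    {"order_id": 1, "measure": "profit", "value": 20.0},
    {"order_id": 2, "measure": "sales", "value": 250.0},
    {"order_id": 2, "measure": "profit", "value": -5.0},
]


def pivot_rows_to_columns(rows, key, name_field, value_field):
    """Give each distinct value of `name_field` its own column, one row per `key`."""
    wide = {}
    for row in rows:
        record = wide.setdefault(row[key], {key: row[key]})
        record[row[name_field]] = row[value_field]
    return list(wide.values())


wide_rows = pivot_rows_to_columns(long_rows, "order_id", "measure", "value")
```

After the pivot, sales and profit are distinct columns, so no conditional calculation is needed to pull one of them out.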

JS: So let’s talk about Prep, because you have, I think it’s like part two of the book, I think there’s four or five chapters that are dedicated to using Prep. So is that section in there, because you feel it is the right tool for Tableau users, or, because there really isn’t, like, in my reading, there’s just not enough, I would say, enough materials out there to really help you dive into it. 

KM: Yeah, sometimes I wonder a lot if Prep, like a lot of features come out. So it’s probably a few things, like, Alteryx had this great partnership with Tableau, right? So a lot of people probably knew that, right? And if you’re already licensed for that, then you could substitute everything in Prep. That’s fine. But I think for people that don’t have it, especially, I think a couple of things happen. First off, when Prep came out, it didn’t do a lot, it was okay, but it didn’t do a lot. It was still kind of a cool product back in 2018 now. And I think a lot of people might have evaluated it then and written it off a little bit, and not kept up with all the innovation that’s in it. But certainly, it’s a really valuable tool for people who already know Tableau well, because why? First off, it comes with the creator license. I mean, there’s the annoying thing that to schedule, you need data management, but it comes with it. It’s the same calculation language. It’s the same UI, as far as you could have the same UI and UX, like, they’re different UX because they’re different processes. So, I mean, from a cost of ownership, it’s just so low to use Prep, like, I wish more people used… 

I’ll tell you, we talked about this before we started a little bit, like, so I’ve been in data for 22 years, and I probably haven’t, other than a very simple line of SQL, probably haven’t written a line in 18 or something. Even though people insist on asking for the SQL skills, I’ll give you an idea of how much I avoid it. Lately, I’ve been using Snowflake. All right, so technically, I’m writing a SQL statement to create my Snowflake table, but it’s not really a SQL statement. So I create my table, and then, what I do is, Relationships are also a very powerful thing in Tableau, but they only work against live connections. So you can’t use Prep per se. But Prep I know so well now, and it’s so familiar, that I usually create my Snowflake tables, and I prep the data and load it into Snowflake with Prep. Like, I don’t even – I don’t always use a published data source. Like, I’ll go into Prep, and then, do what I need to do to it, and I’ll move it to a Snowflake table, because, let’s say, I’ve got two tables at different levels of aggregation, and I don’t want to explode those, I’ll put them into two Snowflake tables, and then, I’ll get Tableau to create a Relationship. 

JS: Oh wow. 

KM: So I don’t even always use it, a published data source. So sometimes I’m using something like Snowflake as the output, so you don’t even have to think about it. The last thing on that is, I don’t know why Tableau has been so hesitant to call it an ETL tool, because it’s what it’s always been, like, you extract data, you transform it, and you load it somewhere else. I think they were afraid to do it, because they’re like, well, we don’t want to compete against all these, like, ETL specific… 

JS: Yeah.

KM: You don’t have to say that’s what you. You only mean, like, I would – if I was going to some data engineering team that never did visual analytics, I would not recommend Tableau Prep, just because there’s probably more powerful tools. And for someone who’s used to Tableau anyway, like, why not, like, the cost of ownership… 

JS: It is an interesting thought about who they target, like, what is their avatar of their core customer base, and I always find that interesting, because the folks that I work with tend to be, you know, it’s a nice nonprofit of six, eight, 12 people, and there’s like one person or two people who have demonstrated this interest in creating visualizations, but they’re not necessarily maybe data people, they haven’t – they don’t have coding experience, maybe they’ve never used a tool like this before. And I think they often get, as you’ve mentioned, they get frustrated by these little but crucial things, and maybe being able to help those folks would unleash it, I don’t really know, but… 

KM: Yeah, you know what, that’s a great point, for consulting with small companies, I would say, it’s worth learning data at the level of Tableau Prep first, especially for the nontraditional technical people who don’t come from a programming background, because not only is your Viz going to be easier and faster, they’re actually not going to have to write nearly as many calculations and struggle with that kind of stuff in Tableau. Because otherwise, they’re going to get frustrated, because, they’re like, I’m not a coder, and I have to write all this, and, like, you wouldn’t if the data were structured. 

JS: Right. 

KM: And this idea of, like, every column being distinct is not that hard of a concept again, I don’t know why people don’t go back to make it that way. I just neatly need it in a column. It’s just, you need pretty good SQL skills to do this pivot rows to columns. I think that’s the secret feature Tableau has, because it’s so fast, and it’s a little bit hard to do in SQL, like, it would be daunting to try to split that up. And it’s counterintuitive, because it makes your data longer, and people think, long data is slow. I’m like, long data is not slow. 

JS: No, right. Yeah, it is interesting, I mean, I grew up in the Stata world, and there is, like, burned into my brain, there’s a little image in the Stata help file about the reshape command, which goes from, they use long and wide, that’s the language they use, but there’s this, like, there’s this image, it’s like, here’s a wide, and then, it’s like a little arrow, and then, here’s the long and, you know, if you want to go left to right, this is the syntax here, right to left, this is the syntax. And I agree, it’s not a complicated concept to get, but it’s so crucial to everything that you’re going to do down the road. 
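The wide-to-long direction can be sketched in a few lines of Python. The country and year data and the `pivot_columns_to_rows` helper are hypothetical; this only illustrates the reshaping idea being discussed and is not the Stata or Tableau implementation.

```python
# Hypothetical wide layout: one column per year.
wide_rows = [
    {"country": "Canada", "2021": 10.0, "2022": 12.0},
    {"country": "Mexico", "2021": 7.0, "2022": 9.0},
]


def pivot_columns_to_rows(rows, id_fields, name_field, value_field):
    """Turn every non-id column into its own row (wide to long)."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column in id_fields:
                continue
            record = {field: row[field] for field in id_fields}
            record[name_field] = column   # the old column header becomes data
            record[value_field] = value
            long_rows.append(record)
    return long_rows


long_rows = pivot_columns_to_rows(wide_rows, ["country"], "year", "value")
```

Two wide rows become four long rows, with year now an ordinary field that can be filtered or grouped like any other dimension.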

KM: Right. 

JS: Okay, so almost without intentionally doing it, we’ve talked about the first two parts of the book, so I think it makes sense, we should get to it, so let me get to the last part. So the first part is really about the types of data model, setting up your data. The second part is about Prep, and then, the third part is about connecting and building Relationships, which I have also found to be frustrating, especially the Relationship part, it’s a frustrating piece. So I’ll just make it a super general question, which is, as we walk through now, through the book, and sort of the process by which someone would work in Tableau, what is this third part about when it comes to connecting and building Relationships in the data?

KM: Yeah, so the next thing becomes almost, I wish there was a term for this, like, it’s treating tables, instead of as tables, as distinct analytic units that sometimes need to be combined to perform a different level of analysis. So I also, like, you referenced the blog post that I had on the Ken and Kevin site, on flerlagetwins.com. We have two more coming, one on when to use Prep and when to use Desktop, and then, another one on when to use Relationships versus Joins, and just a little bit on blends, because for blends the answer is almost never. So there’s just one use case, just one very specific use case.

So imagine this, this will probably be in the blog post, but imagine this Airbnb data, I think we can do it, right? So you’ve got, let’s say, we have five tables, right? And the five tables are one table contains a list of all the properties, say, in a city or whatever, it could be all of them, but with a city column, but like, let’s say, even for a city, and then, we’ve got reviews. Right? And so, reviews are at a different level of granularity than the properties, because one property can have many reviews, but a review can only be for one property. 

So, until Relationships came out, that’s a perfect example of tables you don’t want to join, and the reason you don’t want to join them is you’re going to explode it, and then, you’re going to have to watch your level of aggregation, because you’re going to have many rows now for individual properties, because each had many reviews. So what’s just magical about Relationships, is you create a Relationship on those on property ID, listing ID, and now what happens, and what I mean by individual units of analysis, is I can ask questions about reviews without asking about properties, and I can ask about properties without asking about reviews. So if I ask a question on either side of that, Tableau is only going to generate the SQL behind the scenes against that one table, like, no join, it won’t even look to join it. But then, what’s really smart about Relationships, if, let’s say, I want reviews on a given property, Tableau is smart enough to do, like, you know, that would probably be a right Join the way I’m describing it, it doesn’t matter, but it would create that join dynamically to answer your question, and handle the level of aggregation, so you don’t get it, because I think, especially for nontechnical users, like, understanding levels of aggregation, that’s like really hard to wrap your head around. And then, Tableau takes your mind away from that. So imagine those two tables. 
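The "explode" problem can be shown with a small Python sketch. The listings, reviews, and prices are hypothetical, and the second half only imitates the effect described here (each table aggregated at its own level, then combined); it is not a description of how Tableau Relationships are implemented.

```python
# Hypothetical tables at two levels of granularity.
listings = [
    {"listing_id": 1, "price": 100.0},
    {"listing_id": 2, "price": 200.0},
]
reviews = [
    {"review_id": 10, "listing_id": 1},
    {"review_id": 11, "listing_id": 1},
    {"review_id": 12, "listing_id": 1},
    {"review_id": 13, "listing_id": 2},
]

# A physical join yields one row per review, so listing columns repeat.
joined = [
    {**listing, **review}
    for listing in listings
    for review in reviews
    if review["listing_id"] == listing["listing_id"]
]
# Listing 1's price is counted three times: 100 * 3 + 200.
inflated_price_total = sum(row["price"] for row in joined)

# Aggregating each table at its own grain avoids the duplication.
true_price_total = sum(listing["price"] for listing in listings)

review_counts = {}
for review in reviews:
    listing_id = review["listing_id"]
    review_counts[listing_id] = review_counts.get(listing_id, 0) + 1

# Combine only after each side has been reduced to one row per listing.
reviews_per_listing = [
    {**listing, "review_count": review_counts.get(listing["listing_id"], 0)}
    for listing in listings
]
```

The joined table answers "how many reviews" correctly but overstates any sum taken from the listing side, which is the level-of-aggregation trap described above.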

Next thing you want to do is, I’m going to bring in neighborhood information, right? So I want to bring in a shapefile of neighborhoods so I can map it, and then, I want to bring in the walk score and bike score or whatever, I can go get that off the internet. And maybe, I want to go get apartment information on how expensive apartments are by neighborhood. So I could answer the question, does it look like Airbnb is driving up the price of apartments, or, it’s a tricky question, that’s part of analysis, or, is it helping people afford it, because they can use Airbnb to help them offset. But those three – those could be three separate tables, three, yeah, one with the shapefile, one with the walk scores, and one with the cost of. So the temptation would be, oh, I should – and you could do this, but it would be a little bit complicated, you could bring all those in as Relationships, because that would be the default, but if you think about it, those three tables are all about neighborhood information, and they’re all at the same level of granularity, which is one row per neighborhood. 

JS: Right.

KM: So what I would do is, and this is why Relationships and Joins go together really well, I would join those three tables together, and then, I would create a Relationship to those three tables joined together, because Tableau would effectively treat the three joined tables as one table, because it really should be one, but without having to do a whole data engineering job to put those in one in the background. And then, I can ask questions just about neighborhoods or again, like, I can ask, by neighborhood, how many reviews per neighborhood, how many listings per neighborhood or whatever. But it’s just an example of, if you think about it in terms of is it a unit of analysis on its own, one is neighborhood, one is reviews, one is listing, as opposed to tables, and what their level of granularity is, that also takes the complications out of that a little bit. 
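A minimal Python sketch of that pattern follows: join tables that share one row per neighborhood into a single unit, then relate the finer-grained listings to it. All table contents, field names, and the `join_same_grain` helper are hypothetical, and the code illustrates the modeling idea only, not Tableau's behavior.

```python
# Three hypothetical tables, each with exactly one row per neighborhood.
shapes = {"Downtown": {"shape": "polygon-downtown"}, "North End": {"shape": "polygon-north-end"}}
walk_scores = {"Downtown": {"walk_score": 95}, "North End": {"walk_score": 88}}
rents = {"Downtown": {"avg_rent": 2100.0}, "North End": {"avg_rent": 1800.0}}

# A finer-grained table: many listings per neighborhood.
listings = [
    {"listing_id": 1, "neighborhood": "Downtown"},
    {"listing_id": 2, "neighborhood": "Downtown"},
    {"listing_id": 3, "neighborhood": "North End"},
]


def join_same_grain(*tables):
    """Inner-join tables keyed by neighborhood; the row count cannot grow."""
    shared_keys = set(tables[0]).intersection(*tables[1:])
    joined = {}
    for key in sorted(shared_keys):
        row = {"neighborhood": key}
        for table in tables:
            row.update(table[key])
        joined[key] = row
    return joined


# One combined "neighborhood" unit of analysis.
neighborhoods = join_same_grain(shapes, walk_scores, rents)

# Listings stay separate and are summarized per neighborhood on demand.
listings_per_neighborhood = {name: 0 for name in neighborhoods}
for listing in listings:
    if listing["neighborhood"] in listings_per_neighborhood:
        listings_per_neighborhood[listing["neighborhood"]] += 1
```

Because all three inputs are at the same level of granularity, joining them never duplicates rows, which is why a plain join is safe here while listings and reviews are better kept as separate units.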

JS: It’s also interesting from the perspective back to this single person in some small organization, back to they need to pull all these data together, maybe not even for visualization purpose, they just need to have these data together for whatever it is, and they could use Tableau to do that, because it’s so efficient at doing these different things. 

KM: Yeah, a 100%, so that, and I still think in even a big org – some little org, you have no data engineering, right? 

JS: Right.

KM: My experience, at least, maybe people have seen other things, the one place where I still see waterfall process heavy, slow things are data engineering teams. Like, Cloud Data Warehouses didn’t magically solve that problem, so sometimes, you’d be like, I need my data shaped this way, and they’ll be like, well, it’s got to go through this, you need to get this approval, it’s going to cost this much of a chargeback, and we’ll have it to you in three months. It’s like, well, you know what, I’ll take the data you’ve already given me, without going all the way back to source, and later in the data pipeline, I’ll clean it up. Right? So I talk a lot about, in a completely idealistic world, you wouldn’t want the analysts doing this stuff, but you have to, but you’re never going to get answers out to the organization, if you wait for the data engineering to do it or whatever, that would throw us back into the eight, and like, argh. 

JS: The last part of the book, which, admittedly, I haven’t read, because it isn’t as useful for my use case…

KM: Yeah.

JS: Everybody’s got limitations. But the last part is on Tableau Server and Tableau Online, so what are the differences in that section versus, say, working in Desktop? 

KM: Yeah, so the last sections cover, and I wish there was a word for Tableau Server and Online together, because they’re synonymous, and so I keep going Tableau Server or Cloud, and it’s like, the exact same, for all intents and purposes. There’s basically three things, that you can think of in three different ways, though the book’s not exactly organized this way. One is all about data security, and data security at two levels, so one is who can see the data model. 

And the next level is who can see data within that data model. So imagine the first one: you want only the finance team to see it, so how do you make sure that only the finance team can see it? The next example is, imagine you were giving this out to your customers, and you had thousands of them. You wouldn’t want to build thousands of workbooks; you would want to say each customer login can only see their own data, and produce one workbook. So that’s row-level security, and part of it’s about that. 

So part of it’s about how do you secure these data models, and how do you secure the data within them. Part of it’s on distribution, so when to use published data sources versus embedded data sources. Lots of people have been using Tableau, including Tableau Server or Tableau Cloud, for a long time, and don’t know the difference between a published versus an embedded data source and when to use one versus the other. And it’s important to get that right from a cost-of-ownership perspective, because on one side, you could be rebuilding the same data model over and over again; or the flipside is, you could be publishing a data model that was really only intended for one workbook, and people are trying to build workbooks on it, and a lot of them do, even though it’s very specific to that one workbook. So they both have their place. 

And Tableau doesn’t make it very explicit, although, at least, now, on Tableau Cloud and Tableau Server, you can say new published data source, so they’ve at least taken that beyond Desktop, and now it feels like a distinct thing. And then, the last part is around the other things that come with Tableau Data Management, so how to schedule Prep flows, how to have data quality warnings, and Tableau Catalog and lineage to see who else is using this. So only one of the 15 chapters is on Data Management, but that one covers all the things in it.
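The row-level security idea Kirk describes, one workbook in which each customer login sees only its own rows, can be sketched in a few lines of Python. This is only an illustration of the concept, not how Tableau implements it: the `Order` record, the `ENTITLEMENTS` mapping, the logins, and the `visible_rows` helper are all hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    customer: str
    amount: float


# Hypothetical entitlement table: login -> customers whose rows it may see.
ENTITLEMENTS = {
    "alice@acme.example": {"Acme"},
    "bob@globex.example": {"Globex"},
    "finance@vendor.example": {"Acme", "Globex"},
}

# One shared data set behind one workbook (made-up sample rows).
ORDERS = [
    Order("Acme", 120.0),
    Order("Globex", 75.5),
    Order("Acme", 30.0),
]


def visible_rows(login, rows, entitlements=ENTITLEMENTS):
    """Return only the rows the given login is entitled to see.

    Unknown logins get no rows, so the default is to deny access.
    """
    allowed = entitlements.get(login, set())
    return [row for row in rows if row.customer in allowed]


if __name__ == "__main__":
    for login in ("alice@acme.example", "bob@globex.example", "nobody@example.com"):
        print(login, visible_rows(login, ORDERS))
```

The point of the design is that the filter is driven by who is logged in, so thousands of customers can share a single workbook instead of each needing their own copy.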

JS: Right. So before we wrap up, I want to come back to one of the things we were talking about before we actually started recording, which is the size of data. You mentioned it a couple of times when we were chatting, and you mentioned a very interesting piece of extracting data in Tableau that I don’t think I even really recognized, because when I do the extract, I just do the extract and publish, and I’m good to go. But I wanted to finish up with that data extract tip, because I think this is something that probably most people don’t know about, the option you can modify in that extract menu. 

KM: Yeah, and the background for this, I think I heard recently, I hope this is a true stat, that the average Salesforce deployment, as an example, has 100 custom fields in it. I’m sure some people go, 1,000, I don’t know. But anyway, that’s the kind of thing that leads to really wide data. Even if all the fields are discrete, sometimes the data still gets incredibly wide with all these ways you could slice and dice the data. And that wide data definitely makes Tableau slow, especially if they’re string fields, because they’re usually string, descriptive-type fields. And so, when we’re working with clients, you often get, well, the business might want to analyze by all those different things. Right? Like, anyone could possibly fill for… 

JS: [inaudible 00:32:16] 

KM: Yeah, so, let’s say, whether you’re using a published or embedded data source, but say you’re just using Tableau Desktop to make it easy. What I say to people all the time is, if you don’t know which of those columns you’re going to use, leave them in your data model, this is if you’re using an extract, leave them in your data model, build all your vizzes, and as your very last step, before you run your extract, just take the little down arrow, where you make calculations and everything else, and go Hide All Unused Fields. It will hide all the fields you’re not using, and then, what most people don’t know is, when you run your extract, it doesn’t bring all that data into your extract. So it’s going to perform way better, and no one’s using it. And then, people come back and go, well, what if I want to use those in the future? Well, what’s slick is Tableau will bring in almost like a ghost field, where you can say Show Hidden Fields, you just go to the field you want, you add it, of course, you can’t add it to the viz at that point, because the data is not there, but you run the extract again, and then, you can add it. 

JS: You could do it.

KM: So this is the surefire way to make sure that you’re not bringing in data, from a width perspective, that you’re not using. So, again, I don’t think Tableau talks about it much, it’s just a terribly hidden feature, pun intended.
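As a rough illustration of why hiding unused fields shrinks an extract, here is a minimal Python sketch. It is not Tableau’s implementation or API: `SOURCE`, `build_extract`, and `hidden_fields` are hypothetical names, and the rows are made-up sample data. The idea is simply that only the columns a view actually references get copied, and a hidden column comes back when you rerun the extract with that field included.

```python
# Made-up "wide" source rows: two fields the vizzes use plus descriptive extras.
SOURCE = [
    {"account": "Acme", "region": "East", "amount": 120.0, "custom_note": "renewal"},
    {"account": "Globex", "region": "West", "amount": 75.5, "custom_note": "trial"},
]


def build_extract(source_rows, used_fields):
    """Copy only the columns that some view references; the rest stay at the source."""
    return [{field: row[field] for field in used_fields} for row in source_rows]


def hidden_fields(source_rows, used_fields):
    """List the source columns that were left out of the extract."""
    all_fields = set().union(*(row.keys() for row in source_rows))
    return sorted(all_fields - set(used_fields))


if __name__ == "__main__":
    used = ["account", "amount"]
    print(build_extract(SOURCE, used))   # narrow extract: two columns per row
    print(hidden_fields(SOURCE, used))   # the "ghost" fields still known at the source

    # Unhiding a field means adding it to the used list and running the extract again.
    used.append("region")
    print(build_extract(SOURCE, used))
```

Because the hidden columns are never copied, the extract stays narrow, and the cost of changing your mind later is just one more extract run.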

JS: No, it’s just like… Yeah, it’s just like this thing that kind of showed up that, like, oh, you have to do this, and that’s correct too.

KM: Yeah, it’s super. 

JS: Yeah. Well, Kirk, thanks so much for coming on the show, the book is really great, the lessons are fantastic, and something I think more people need to learn and read about. So thanks so much for coming on the show, I really appreciate it. 

KM: Well, thank you Jon, yeah, and I really enjoyed it.

Thanks for tuning in to this week’s episode of the show. I hope you enjoyed that interview. I hope you will go check out the blog posts we talked about, and, also, of course, Kirk’s new book, Data Modeling with Tableau, and, of course, his website, Paint with Data. There’s a lot of good content there, a lot of great ways to think about all the things that you need to be better, I would say, obviously, in Tableau, but also just a better person to work with data, a better data visualizer, a better data scientist, a better statistician. Anyone who works with data really needs to understand this content and these lessons even better. So I hope you enjoyed that episode. I hope you will consider sharing today’s episode with your friends and your family. Put it on your social media networks, share a review or a rating on your favorite podcast provider. And until next week, this has been the PolicyViz podcast, thanks so much for listening.

A whole team helps bring you the PolicyViz podcast. Intro and outro music is provided by the NRIs, a band based here in Northern Virginia. Audio editing is provided by Ken Skaggs. Design and promotion is created with assistance from Sharon Sotsky Remirez. And each episode is transcribed by Jenny Transcription Services. If you’d like to help support the podcast, please share and review it on iTunes, Stitcher, Spotify, YouTube, or wherever you get your podcasts. The PolicyViz podcast is ad free and supported by listeners. But if you would like to help support the show financially, please visit our Winno app, PayPal page or Patreon page, all linked and available at policyviz.com.