Minhaz Kazi is a Developer Advocate for Google Data Studio. A business intelligence veteran, Minhaz is always exploring new ways for developers to collect, analyze, and visualize data. He is available for long discussions on circular reference errors, benefits of pie charts, SQL commas, and the design of everyday things.
Minhaz and I chat about the tool itself, how it fits with the rest of the Google ecosystem, and how individuals might use the tool.
A few other notes:
-There are still a few seats left for the December 5 Data, Designed workshop in Boston and the December 6 Data, Designed workshop in NYC.
-If you’d like to support the podcast, please consider being a supporter on Patreon.
Episode Notes
Minhaz Kazi | Twitter | Github | LinkedIn
Transcript
PolicyViz Podcast Episode #138: Minhaz Kazi
Welcome back to the PolicyViz Podcast. I am your host, Jon Schwabish. On today’s show, we are going to talk about a new tool, Google’s Data Studio and with me to talk about it is one of the developers on the team, Minhaz Kazi.
Jon Schwabish: Minhaz, welcome to the show.
Minhaz Kazi: Thank you Jon.
JS: Great to chat with you.
MK: Yup. Happy to be here.
JS: So you were just telling me that you went to the University of Maryland.
MK: Yes.
JS: We have, like, a DC connection. Do you want to start by talking a little bit about your background and your work and how you ended up here?
MK: Sure. Originally from Bangladesh, my undergrad was in business, and then I worked at Unilever in Finance for about four years, so lots of Excel, lots of graphs and charts, and that’s where I kind of got interested in InfoVis. Then I moved to the US to do my Masters at the University of Maryland because the program there has some focus on InfoVis. I did my Masters there, I worked with Professor Ben Shneiderman over there, did a project with him, and then worked for a public policy research firm in DC doing business intelligence/analyst work with dashboards, creating visualizations, and then I moved to Google last year. I am the developer advocate for Data Studio; that role involves working with external developers who develop solutions with or around a product.
JS: Where did Data Studio originate? What was the thought of “We need this kind of tool”?
MK: Data Studio right now, it’s Google’s BI/visualization platform. It originally started as an extension for the visualizations of Google Analytics. Google Analytics already has a ton of dashboarding and reporting built into it, but users wanted a lot more flexibility with that: they wanted to share the reports, they wanted to drill down, have a lot more customizability. That’s where Data Studio originated, as a separate tool that would take the GA, Google Analytics, data and then build reports on top of it. And then it slowly grew from there. It can now talk to other Google data sources like Sheets, Cloud SQL, BigQuery. It can also talk to data sources that are not in or around Google: we can talk to MySQL, we can talk to Postgres, and you can also bring in data from external APIs using community connectors.
JS: Interesting. So it sort of lives in the Google Sheets, Google Docs environment.
MK: Yes.
JS: Can you talk a little bit about how you view the environment working – I mean, we have some of these tools, there’s Tableau and there’s Power BI, but for people who haven’t tried and of course I will put the link on the website, but can you talk a little bit about how a person might use it and sort of how you start here with a nice blank canvas and sort of just dive in, so can you just talk a little bit about how I might use it and develop something?
MK: Sure. The good thing about Data Studio is that the barrier to entry is very low. You don’t have to pay to use it, it’s completely free, so anyone can immediately just go to the [inaudible 00:03:14] website and start using it. When you start using it, you have to create a report; that’s where you basically add your charts and stuff and then share the report with other people or just work on it on your own. When you get the report, it’s like a canvas, and then you start by adding data sources to it. Each data source is like a table. Your data source can be a Google Sheet, your data source can be a BigQuery table, your data source can be an external API that you are pulling the data from. And as you add the data sources, you can use those data sources to add different charts; you can add multiple data sources to the same report, you can create multiple charts from the same report. We support basic chart types like bar charts, line charts. In the report you can have multiple pages. So once you create all of these, you can also add filters; you can add date-based filters or even dimension-based filters. And the filters work in two ways: you can add filters at report creation time, which limit the amount of data the user gets to view at report viewing time, or you can add the filters at viewing time, where the user gets to pick what they want to see.
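The two kinds of filters described here can be pictured with a small sketch. This is only an illustration in TypeScript, not how Data Studio is implemented: the `Visit` record, the `applyFilters` helper, and the sample rows are all hypothetical. Filters fixed by the report editor always apply, while filters chosen by the viewer narrow the data further.

```typescript
// Hypothetical row shape for a Google Analytics-style data source.
interface Visit {
  country: string;
  date: string; // ISO date, so plain string comparison follows date order
  sessions: number;
}

type VisitFilter = (row: Visit) => boolean;

// A row is shown only if it passes every editor filter (set at report
// creation time) and every viewer filter (picked at viewing time).
function applyFilters(
  data: Visit[],
  editorFilters: VisitFilter[],
  viewerFilters: VisitFilter[]
): Visit[] {
  const all = editorFilters.concat(viewerFilters);
  return data.filter((row) => all.every((f) => f(row)));
}

// Made-up sample data for the illustration.
const visits: Visit[] = [
  { country: "US", date: "2018-11-01", sessions: 120 },
  { country: "US", date: "2018-12-01", sessions: 80 },
  { country: "CA", date: "2018-12-02", sessions: 40 },
  { country: "Internal", date: "2018-12-03", sessions: 5 },
];

// Dimension-based filter baked into the report by its editor.
const excludeInternal: VisitFilter = (row) => row.country !== "Internal";
// Date-based filter a viewer might pick while looking at the report.
const fromDecember: VisitFilter = (row) => row.date >= "2018-12-01";

const allDates = applyFilters(visits, [excludeInternal], []);
const decemberOnly = applyFilters(visits, [excludeInternal], [fromDecember]);
console.log(allDates.length, decemberOnly.length);
```

The viewer can never widen the data beyond what the editor's filters allow, because both sets of filters have to pass for a row to appear.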
JS: And you can filter across the different sheets or tabs that you have.
MK: Correct.
JS: When it comes to the data, does the data live – so let’s say I started a Data Studio project and I import the data, does the data now live – is it directly connected to Data Studio or do I need to keep moving my data file around my Google Drive to keep them all connected?
MK: The data remains connected, and there are multiple ways you can use it. One way is that if you have a local file on your hard drive, a CSV file, you can just upload that to Data Studio and that goes into Google [inaudible 00:04:58] and it will always remain [inaudible 00:04:59]. However, if you have a separate piece of data somewhere else, let’s say you have data on Google Sheets, once you link it, that data is always linked. And Data Studio doesn’t store that data anywhere else. Whenever you open the dashboard, whenever you refresh the dashboard, it will always pull in the data from the sheet and display it in the dashboard. Same goes for other data sources. If you have a SQL database and you are pulling data from there, the connection will always be live. So whenever you are refreshing the dashboard or you are adding a new chart, it will issue a new query, get the data, show it to the user. And the data is not persisted on either side, it’s just cached for however long the user is using it.
JS: So if you had a local CSV file on your computer, would you, for workflow purposes, recommend that someone take that and load it into Google Sheets and then go into Data Studio so that it’s always updated, and then they don’t have to navigate to the spot on their local computer to find it, or what’s your preferred workflow when you create something in Data Studio?
MK: It will depend on two things: one is the size of the data and two is what I want to do with the data. So is it already prepared, so I don’t have to do any kind of additional calculations or aggregations, it’s completely ready? And obviously the size: if you have a large dataset, putting it in a sheet and then trying to do some work with it might become cumbersome. So if you have, let’s say, a 100-megabyte CSV file, you might want to just upload that to cloud storage and use it from there. But if you want to play around with the data, add more calculations, do aggregations, change the data or merge it with other data sources, then the recommendation would be yes, data on Google Sheets and then put [inaudible 00:06:39]. And you also get some calculations, formula support, in Data Studio itself. I think we support about 70 different functions right now. You can do [inaudible 00:06:49] statements, regular expressions, different kinds of analytical calculations, those [inaudible 00:06:56] supported.
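As a rough picture of the live-connection behavior described here, the sketch below queries a source on demand and keeps results only in a session-level cache that a refresh clears. It is a TypeScript illustration under stated assumptions: `LiveConnection`, `fetchCount`, and the in-memory `sheetRows` stand-in for a Google Sheet are hypothetical names, and nothing here reflects Data Studio's actual internals.

```typescript
type SourceRow = Record<string, string | number>;

// Hypothetical model of a live connection with a per-session cache.
class LiveConnection {
  private cache = new Map<string, SourceRow[]>();
  private source: (query: string) => SourceRow[];
  public fetchCount = 0; // how many times the underlying source was queried

  constructor(source: (query: string) => SourceRow[]) {
    this.source = source;
  }

  // Repeated queries within a session are answered from the cache.
  query(q: string): SourceRow[] {
    const hit = this.cache.get(q);
    if (hit !== undefined) {
      return hit;
    }
    this.fetchCount += 1;
    const result = this.source(q);
    this.cache.set(q, result);
    return result;
  }

  // Refreshing the dashboard drops the cache, so the next query
  // goes back to the source and picks up any new data.
  refresh(): void {
    this.cache.clear();
  }
}

// Stand-in for a linked sheet; the connection never keeps its own copy
// beyond the session cache.
const sheetRows: SourceRow[] = [{ city: "Boston", visits: 10 }];
const conn = new LiveConnection(() => sheetRows.map((r) => ({ ...r })));

conn.query("all");   // first query reaches the source
conn.query("all");   // second query is served from the cache
sheetRows.push({ city: "NYC", visits: 7 }); // the sheet changes
conn.refresh();      // the user refreshes the dashboard
const latest = conn.query("all"); // reaches the source again

console.log(conn.fetchCount, latest.length);
```

The design point is that the source stays the single place the data lives; the cache exists only to avoid re-running identical queries while someone is using the report.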
JS: Can you talk a little bit about the graphs – I guess, not just the graphs, there’s a lot of things that you can insert into a Data Studio portfolio, I guess, or project, including textboxes and images and shapes. But there’s also a graphing library here, and what’s interesting to me is that it has your sort of obvious ones that you’d expect, the lines and the column and the pie chart and the map, which is really cool because it’s built right in. But it’s also – I mean, to me it’s interesting that a bullet chart is one of those chart types. Could you talk about why this set of graphs – I mean, I don’t know, we could pick out any chart type that we want that’s not here. But I am curious what the thinking was when you were building out the menu.
MK: So the current charting library we have, I believe, is a result of two different things. One is, since Data Studio started as an offshoot product from Google Analytics, the types of charts that people want to see or use when analyzing Google Analytics data, those are the charts that we introduced first. So there might be some limitations around that. And from there, we look at our users, what do the users want. We have a feature request tracker, and people can go in, request new chart types, they can vote for them, and whatever becomes the most popular, most requested feature or chart type, we will try to add those. So, these are the two things that contributed to the current charting library we have. You have line charts, bar charts, pie charts, then you have maps, bullet charts, scorecards. One new feature that we are working on right now, and anyone can basically go and sign up for it, is that you can write your own visualization in JavaScript. You can use D3 or any charting library that you want and then take that and plug it into Data Studio. So the possibilities are literally endless, because you can have any kind of chart that you want and then you can have that chart alongside any of the Data Studio charts, you can put a filter on top of that and that will work across all the charts.
JS: Can you talk about that a little bit more? So if I am a D3 programmer, which I am not, let’s just imagine I magically have the skills, and I build – let’s pick one that’s not in here – like a chord diagram for example. So I build a chord diagram in D3 and I can make that work in my browser. Now what’s the process for getting it into Data Studio, and then also how do I allow or enable my team members to use that as well?
MK: One thing I will make clear first is that if you build it, then anyone can use it. So if you build a new chart type and then create a report with that chart type, anyone can view that report and use it. Or if you put that chart type in your report and share that with edit access with someone, that person can edit it. We are still looking into how that can be leveraged so that if you create a chart type, someone else who is creating a completely new report can use that chart type in their report. So once this feature gets into developer preview or gets out of beta, we will have some solution around that.
JS: Interesting.
MK: Maybe a library where you can go in and say, all right, I will use this chart type, and then you can import that. Creating it, it’s more of a developer feature right now, so there is a sort of API behind it and there’s a library that you have to use. We have the documentation [inaudible 00:10:18] developer website, so if you go there you will have to sign up for a group and that group will give you access to the feature. At the same time we have a step-by-step tutorial. The first [inaudible 00:10:28] lets you just draw one blue box on the canvas and then it will slowly build up from there; you can have like a small bar chart, and then you can also have styling. There are some limited functionalities, like you won’t get everything that needs to be [inaudible 00:10:43] or maybe even if you can render it, you won’t get the flexibility of changing everything on the fly. But even with the limitations, it will be pretty flexible in terms of what you can do.
JS: I wanted to ask, you had mentioned that a lot of this came out of Google Analytics, what people are using in Google Analytics. Will people who are using Google Analytics continue to live in that environment, or do you think they will move between these two to do the reporting?
MK: For Google Analytics, what we see is that a lot of the – the user spectrum is – it’s a spectrum. The users who are casual users will probably not use this feature, but when you go to the middle or the high end where you have power users, they tend to use both. If you want your out-of-the-box, easy-to-use, what-is-going-on-right-now kind of features, then you go into the Google Analytics dashboard, but if you want more customized solutions or when you are trying to drill into data or even to answer a very specific question, that’s when you bring your [inaudible 00:11:48] data into Data Studio and then build the [inaudible 00:11:50].
JS: That’s cool. Let’s also talk about teams; we were talking about this before we started recording. I mean, clearly a big issue is how you get people to work on the same document or the same project, and Google Sheets and Google Docs do that really well, so can you talk about how teams can use this in a similar sort of fashion?
MK: The trend we are seeing right now is that, not just for visualization but for other work too, it’s more collaborative. People work with other people and share their work, their results, their findings with other people. So the goal is to make that as seamless and as easy as possible. That’s where we think Data Studio really shines, because it’s a collaborative working environment just like Google Docs or Google Sheets: multiple people can work on the same dashboard at the same time. You can start creating one chart on one side of the dashboard and you can see someone else creating a separate chart on the other side, and these get updated in real time. And the [inaudible 00:12:52] mechanism also works pretty well, so if you have access to certain data sources but you don’t want other people who are working on the dashboard to look at the data, you can still share the data source, like the connection to the data, but not give them permission to the underlying data.
JS: I see.
MK: So for example, you have access to the MySQL database where you are pulling data from, and your colleague doesn’t have access to it, you can create a connection to that, share the connection with your colleagues so that your colleague will be able to pull in the data but your colleague doesn’t necessarily have access to the [inaudible 00:13:22].
JS: Access to the ton of data. And then with the sharing and editing, so that means that people can what – they can move things around, they can change colors, what is the spectrum of things that people can do when they are sharing?
MK: When you are sharing it with edit capabilities, the owner of the file and the other editors can do everything; you don’t get any limitation as an editor. You can add more pages, you can add chart types, you can add text, pictures. We also have a way where you can render image links with [inaudible 00:13:54] so you can have nice [inaudible 00:13:57]. That’s from the editing point of view. Once you have done the editing and you want to share your results with other folks, you can share the report with view access. You can do that just like other Google Docs or Sheets: you can share with specific people, you can share across your organization, or you can share with the whole world. Anyone will be able to see your dashboard; you send them a link and they can view it. You can also take the dashboard and embed it in an iframe. We also just recently started supporting oEmbed, so you can take your dashboard and put it into a Medium blog, and there are no limitations around that.
JS: So when someone views it, when they just have the viewer, so when I am editing it, I am seeing, I have like the standard Google environment, it’s the name and then the file edit view tab and then the data studio tab with all the chart types. So when I get just a view version, it just appears like a regular website.
MK: It appears as a dashboard. If you view it, it will be like – if you go to the direct link of the Data Studio [inaudible 00:15:02] we have a header that says Data Studio, but beneath that it will be just your canvas with your data on it. That’s it.
JS: So my view of – and this is probably not the way Microsoft and Tableau would argue, but my sort of view of like Power BI for example, the aspect that I really like about Power BI is that you can create a dashboard where everything is linked immediately, you don’t have to add filters the way you have to do in Tableau, but from my perspective the tradeoff is Tableau looks a lot nicer whereas Power BI kind of doesn’t. But for me, it seems like with Power BI we could sit around the table and make something a lot faster and drill down into the data. And I wonder, like all the graphs here look very Google Sheets, they have that look. So, I wonder, do you and your team have a view on who you think is going to be the primary user? Is it the people who are sitting around the table working, or is it going to be like the Tableau Public community where it’s like let’s make a nice beautiful dashboard and post it to the site, or is it going to be, what it probably will be, a mix of both of those?
MK: Since the barrier to entry to this product is very low, we think that anyone who wants to do visualization can use this tool. Unless you are doing a very specific kind of visualization where you want a lot of flexibility and a lot of customization, Data Studio should be able to meet your need. You can build a dashboard in five minutes, then share that with the whole world; from the point where you go into Data Studio to the point where you are sharing the complete dashboard with the whole world can literally take five minutes. And we think that’s a very good advantage: people want to get things done, people want to get things done quickly, and they want to share the findings with others, they want to talk about the data story, and this is something that lets you do that.
JS: The low barrier to entry actually reminds me to ask a question that my daughter would actually be more interested in, because at her school, the school newspaper runs on Google Sites. So I assume that you could create a dashboard in Data Studio and just embed it. Now is it embedded in the Google site or is it iframed in or how does that work – I mean, it’s all within the Google environment?
MK: It’s iframed in. It supports iframe embeds and we also support [inaudible 00:17:25].
JS: Nice. Looking forward, so you mentioned there’s a couple of things that are in beta, that are coming out, there’s a developer side, and right now you mentioned that sort of pretty good but somewhat limited library of [inaudible 00:17:38] so what does the next – I don’t know, six months to a year bring for Data Studio?
MK: One thing that we are actively doing right now is trying to grow the number of data sources that we support. We have a program called the community connectors program where developers can come in and write their own connector in Apps Script. Apps Script is a Google scripting language that is a subset of JavaScript, so you can go in and write a connector that fetches data from your own API. We have a connector gallery that has about 800 connectors right now that connect more than, say, 400 different data sources. We are trying to grow that because when you are doing a visualization, you have the data part and the visualization part. We want you to focus more on the visualization part. You shouldn’t really be bothered about how do I get my data, how do I store it, how do I clean it. That should be taken care of for you, and that’s why we are trying to make the process seamless, so that if, let’s say, you are trying to see what your [inaudible 00:18:42] count was like, you don’t have to go in and fetch the data from an API, you don’t have to go in and scrape the webpage, you just add the connector, put in your information and it immediately gets you the data. You have the data and now you have time to explore the data. So that’s one thing we are looking at. One of the big features that was requested by folks was data blending, so if you have data from different sources, how do you blend that. We are working on that. We are also working on community visualizations where, like I mentioned, you can come in and write your own [inaudible 00:19:15] JavaScript. So these are some of the things that will come out in the near future.
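To give a feel for what a connector that fetches data from your own API involves, here is a minimal sketch. It is written in TypeScript rather than Apps Script so that it runs on its own, and it is loosely modeled on the `getSchema` and `getData` entry points that community connectors define. The response shapes are simplified assumptions, and `fetchFromApi`, the field names, and the sample records are hypothetical; a real connector would call an actual API and follow the official connector documentation.

```typescript
// Simplified, assumed shapes for illustration only.
interface ConnectorField {
  name: string;
  label: string;
  dataType: "STRING" | "NUMBER";
}

interface DataRequest {
  fields: { name: string }[];
}

interface DataResponse {
  schema: ConnectorField[];
  rows: { values: (string | number)[] }[];
}

// The fields this hypothetical connector exposes to a report.
const connectorFields: ConnectorField[] = [
  { name: "day", label: "Day", dataType: "STRING" },
  { name: "downloads", label: "Downloads", dataType: "NUMBER" },
];

// Stand-in for a call to an external API; the records are made up.
function fetchFromApi(): Record<string, string | number>[] {
  return [
    { day: "2018-12-01", downloads: 42 },
    { day: "2018-12-02", downloads: 57 },
  ];
}

// Describes the available fields so charts can be built on them.
function getSchema(): { schema: ConnectorField[] } {
  return { schema: connectorFields };
}

// Returns only the requested fields, in schema order, one row per record.
function getData(request: DataRequest): DataResponse {
  const requested = request.fields.map((f) => f.name);
  const schema = connectorFields.filter((f) =>
    requested.some((n) => n === f.name)
  );
  const rows = fetchFromApi().map((record) => ({
    values: schema.map((f) => record[f.name] ?? ""),
  }));
  return { schema, rows };
}

const response = getData({ fields: [{ name: "downloads" }] });
console.log(JSON.stringify(response.rows));
```

The person building the report never sees any of this; they add the connector, enter their details, and work with the resulting fields like any other data source.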
JS: That’s correct. That’s cool. Looks like a great project. I will of course link to Data Studio and all the things that you mentioned, where you can report feature requests and the developer website, so that people can come in and hopefully play with it. Minhaz, thanks so much, this is a really cool looking tool and I am looking forward to toying around with it.
MK: Thank you so much.
JS: Thanks everyone for tuning into this week’s episode. If you have comments or questions, please let me know in the comments section or on Twitter. Until next time, this has been the PolicyViz Podcast. Thanks so much for listening.