Headshot of Dr. Claire McKay Bowen

Dr. Claire McKay Bowen is a principal research associate in the Center on Labor, Human Services, and Population and leads the Statistical Methods Group at the Urban Institute. Her research focuses on developing and assessing the quality of differentially private data synthesis methods, and on science communication. She holds a BS in mathematics and physics from Idaho State University and an MS and PhD in statistics from the University of Notre Dame. After completing her PhD, she worked at Los Alamos National Laboratory, where she investigated cosmic ray effects on supercomputers.

In 2021, the Committee of Presidents of Statistical Societies identified her as an emerging leader in statistics for her “contributions to the development and broad dissemination of Statistics and Data Science methods and concepts, particularly in the emerging field of Data Privacy, and for leadership of technical initiatives, professional development activities, and educational programs.”

Episode Notes

Claire on Twitter

Claire at the Urban Institute

Claire’s personal website: https://clairemckaybowen.com/

Protecting Your Privacy in a Data-Driven World

Book page: https://clairemckaybowen.com/book/

Data4Kids

Overview of GDPR

One Nation, Tracked. Story from the New York Times

Netflix Cancels Recommendation Contest After Privacy Lawsuit

New Ways to Support the Show!

With more than 200 guests and eight seasons of episodes, the PolicyViz Podcast is one of the longest-running data visualization podcasts around. You can support the show by downloading and listening, following the work of my guests, and sharing the show with your networks. I’m grateful to everyone who listens and supports the show, and now I’m offering exciting new ways for you to support the show financially. You can check out the special paid version of my newsletter, receive text messages with special data visualization tips, or go to the simplified Patreon platform. Whichever you choose, you’ll be sure to get great content delivered to your inbox or phone every week!

Transcript

Welcome back to the PolicyViz podcast. I am your host, Jon Schwabish. On this week’s episode, I talk to my Urban Institute colleague, Claire Bowen, about data privacy and security. It’s one of the most important issues when we think about how our data is being used, both for us and against us. Claire has a great book about this that I highly recommend; it’s linked in the show notes, and it gives you a great overview of these data privacy issues. Claire and I talk about her work and her background, and she tells some incredible stories that might scare you a little bit when it comes to data, but hopefully this will get you thinking about how we can be more careful with how we collect and use our own data. So here’s my interview with Claire Bowen. 

Jon Schwabish: Hi Claire. Welcome to the podcast. 

Claire Bowen: Thanks Jon for having me. I’m really excited to be talking about data privacy. 

JS: Data privacy on the podcast… 

CB: On the podcast. 

JS: On the podcast, right, because that’s where data privacy should be discussed. So this is exciting. Just quickly, for folks who don’t know, Claire and I work together at the Urban Institute, and we’ve done a number of projects together. Our most recent one is our Data4Kids project, helping kids learn more about data science, data, data visualization, all the good things that kids should learn about these days. And today, we’re going to talk about Claire’s book and her work on data privacy. So her book, which I’ll hold up for those of you who are watching the video on YouTube – you can’t really see it, but anyway – is Protecting Your Privacy in a Data-Driven World. So we’re going to talk about data privacy, which is so important for those of us working with data. Let’s start with the Claire Bowen origin story: what got you interested in these issues of data privacy and security? 

CB: That’s a really great question, so I will give you the really short answer, and then I’ll go a little bit longer. The short answer was that I was applying for funding as a graduate student, and I was looking at different options, and I thought, oh wow, there’s a cool fellowship through Microsoft. I brought this up to my advisor (again, this was my first semester starting my grad program), and my advisor said, well, just as a piece of advice, you always want to pitch the research project based on what they’re interested in, and I bet Microsoft is really interested in privacy; I used to do this for my graduate work, I haven’t done it since then, so maybe you want to look into data privacy. And at the time, she also said, hey, there’s a new thing called differential privacy you might want to check out. So that’s how the origin story started, but to give some more context, so it’s not so blunt that it was just funding that started it: I actually started in physics as an undergrad, and I went into physics because I really wanted to know how the world worked, and I thought, oh, there are these cool, challenging problems, and physics answers a lot of these questions; and I got into math because I learned that math was the language of science. That kind of evolved into getting into statistics and realizing I really liked the analytics part of doing scientific work, and also realizing, to paraphrase a famous saying from the statistician John Tukey, that you get to play in everybody’s backyard if you are in statistics. So I decided to pursue that, and my advisor was very flexible about what topics we could cover. She’s a Bayesian statistician; for those of you who don’t know, there are two different ways of thinking about statistics: the frequentist way, which is what we’re normally taught, and the Bayesian way, which is actually the foundation for machine learning, AI, and so on and so forth. And so, she said, as long as you’re thinking about Bayesian statistics, then I don’t care what topic you do for your dissertation. And I got into privacy because, again, there was that funding thing, and then when I was digging into it, I realized, wow, this is a really cool area, there are a lot of open problems and questions, and it has a very obvious application. That was a really big thing for me when I was in physics: I knew I wasn’t going to be a theoretician, I really wanted to solve practical problems, and privacy felt like a good fit for working at that intersection. And it did snowball from there, because I did win the funding, and then I won more funding. When that happens, you’re like, oh, well, I have money to actually research this, so I could really deep dive into this area. That’s why my whole dissertation was in this field, because I was able to dedicate all that time to it, and luckily (I shouldn’t say luckily, but sometimes in grad school you take on a topic and find out you don’t actually like it) I really enjoyed it, and it’s become my full career. 

JS: Nice. That’s a good origin story. All credit to your advisor, where credit is due, right? 

CB: Right. 

JS: So can you give folks some idea, for folks who may not be familiar with this, of where data privacy comes up in their everyday lives? I think people generally have a broad concept of this, like, when you talk to Alexa, we all know it’s being stored somewhere, but for those of us working in data all the time, where does this pop up that you think is most relevant to our work and our lives? 

CB: That’s a great question, so I’m going to back up a little bit and clarify what I mean by data privacy, because it’s a nice catch-all phrase for what I do, but it’s a very broad field. Sometimes when I tell people, hey, I work in data privacy, they think I do encryption or cybersecurity, that I fight off hackers, and they’re like, oh, that’s so cool. I have a few family members who think I’m an app developer after I told them to stop using certain apps because of security reasons. So, anyway, it kind of digresses from there, but what I focus on is what I call expanding access to data: making sure that very sensitive information that can be useful for making very impactful public policy decisions is available, but in a way that the researchers who analyze it don’t know who is in that data. One of the examples I give for thinking about everyday life is that most of us have a smartphone now, and it records your location and time; and with that kind of information, you could figure out where someone lives, if they’re in a certain residential area during sleeping hours, and where they work, because they’re in a certain location during working hours. And even if you release just that dataset of where a person is, time and place, with no identifiers, no names, no gender, no race, no ethnicity, and so on and so forth, you could still figure out who they are. This was actually done by the New York Times back in 2019; they did a whole article about this, where they got a dataset like that and were able to identify somebody because they were at the Microsoft campus at certain times, and all of a sudden they switched over to the Amazon campus, and the reporters were able to go on LinkedIn and figure out who this person was, and they verified it, like, hey, is this you. And it was correct. And so then, some people’s response to that is, okay, we should not have that data public, it should only be kept by whoever’s collecting it for a very specific reason, or the cellphone companies shouldn’t be sharing it with anybody else beyond whatever purposes they need for maintaining cell communications. However, that information is what’s used by FEMA and other emergency responders, because it tells FEMA what patterns they see going through the United States; I mean, this last year, we had a lot of natural disasters with hurricanes and forest fires and flooding, and so they’re trying to figure out, based on people’s patterns, where did they go, what are the best ways to close down certain roads, versus maybe we need to block off these other areas, or maybe we need to prioritize certain neighborhoods because they have limited access to get out of the city. So that’s the example I like to give, because it’s something that we all have, I’m pretty sure we all have cell phones… 

JS: Except for one person we work with, who’s standing by their flip phone.

CB: Yeah, that’s true.
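To make the kind of re-identification Claire describes concrete, here is a minimal sketch in Python of how home and work locations might be inferred from a supposedly anonymous trace of timestamps and coordinates. The data, column names, and hour ranges are all invented for illustration; this is not the New York Times’ actual method.

```python
import pandas as pd

# Hypothetical "anonymized" trace for one device: no name, no demographics,
# just timestamps and coordinates (all values invented for illustration).
pings = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2022-03-01 02:14", "2022-03-01 03:40", "2022-03-01 11:05",
        "2022-03-01 14:22", "2022-03-02 02:51", "2022-03-02 10:30",
    ]),
    "lat": [43.491, 43.491, 43.466, 43.466, 43.491, 43.466],
    "lon": [-112.034, -112.034, -112.059, -112.059, -112.034, -112.059],
})

pings["hour"] = pings["timestamp"].dt.hour
pings["cell"] = list(zip(pings["lat"].round(3), pings["lon"].round(3)))

# The most frequent cell during sleeping hours is a good guess at home;
# the most frequent cell during working hours is a good guess at work.
home = pings.loc[pings["hour"].between(0, 5), "cell"].mode()[0]
work = pings.loc[pings["hour"].between(9, 17), "cell"].mode()[0]

print("Likely home block:", home)
print("Likely work block:", work)
```

Matching those two inferred locations against something public, such as an employer’s campus and a LinkedIn profile, is what turned an “anonymous” trace into a named person in the story Claire mentions.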

JS: We won’t reveal names, but just to say that we do work with at least one person who’s a dedicated flip phone user, even in 2022. So now that we have a sense of how this shows up in anybody’s everyday life, what about people who are data analysts, who are researchers, what should they be thinking about when it comes to data privacy? And I know that’s a broad question, because you’re doing a lot of stuff, so another way to think about this might be, what do you tell our colleagues at Urban about data privacy, or what are the projects that you work on – I know that’s super broad. 

CB: Yeah, I’m actually going to target one aspect, which is knowing that a lot of the data that you collect, unless it’s literally the raw data, has been altered in some way because of privacy concerns. Certain datasets are really popular, especially the census; at Urban, a lot of people use the American Community Survey, and right now we just had the 2020 census, and so they use that data for a lot of their research, figuring out, oh, what is the demographic breakdown for a state, and they try to figure out the survey weights for certain kinds of analyses that they do. A lot of people who access that data think that it is the raw data, but that hasn’t been the case for decades. There’s an act, well, there have been many acts, but basically, we have not had access to raw data for a very long time. So there’s that misconception, and that’s why one of the things I tell people is, when you take a dataset, ask what has been altered, so that you’re not going to tell the wrong “data story” because the data has been, one of the terms is, aggregated up to a higher level. So instead of getting down to what we call census blocks, which are really small units of geography, sometimes the data gets aggregated all the way up to the county level. And if you just analyze data at that level, you could make misleading conclusions, because not all counties are created equal. The example I like to give is the town that I grew up in, in Idaho. For those who are listening, I grew up in a rural area of Idaho, and the county I was in is the size of Connecticut, so it’s a huge county with very few people. The town I was in was the biggest town in the county, with 3,000 people. So sometimes you’re trying to make decisions for a whole county with people who are scattered throughout, versus another county like Arlington, which I mention because it’s really close to the Urban Institute, and that one is only 26 square miles or something like that, with hundreds of thousands, it’s a lot more people – I shouldn’t say 100,000, I actually don’t know the whole population for that.

JS: I mean, you’ve got to be careful in data privacy about how many numbers you’re putting out there.

CB: Yeah, exactly. But definitely more I think in that county. 

JS: More, definitely more.

CB: We can say more, definitely more than the county I was in. 

JS: Do you, I mean, not so much for the folks that we work with, because many of them are sort of experts in a lot of those datasets, but when working with a new dataset, what do you tell people to do – should they read the codebook? Do they look at only the specific variables that they’re looking at, or do most major federal surveys, like, is there always a notice about what they’ve done? How do you think about looking at a new dataset, and how to uncover what aggregation or changes have been made? 

CB: That’s a great question. So hopefully, somebody has put together a data dictionary, and they talk about how the data was collected and what aggregations have been done. Sometimes there’s a contact person you can go to. I’m picking on the census again, because it’s such a classic example: they do have working papers or documentation on the methodologies they use. There’s a general rule that says, for instance, when they do certain aggregations, they have to have at least 100,000 people with a certain combination of characteristics across the United States for it to be included in the dataset without what we call suppression – suppression means you basically remove that information entirely from the data. Another example would be the Bureau of Labor Statistics; they also use some suppression techniques, and they also have a document on how they make the decisions to suppress the data. 
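As a rough illustration of the kind of threshold-based suppression Claire describes, here is a minimal Python sketch. The 100,000 cutoff is just the figure from the conversation, and the counts are invented; this is not any agency’s actual rule or code.

```python
import pandas as pd

THRESHOLD = 100_000  # illustrative cutoff from the conversation, not an official rule

counts = pd.DataFrame({
    "characteristic_combo": ["A", "B", "C"],
    "national_count": [2_500_000, 340_000, 41_000],
})

# Cells below the threshold are suppressed: the value is removed entirely
# rather than published, so rare combinations cannot be singled out.
counts["published_count"] = counts["national_count"].where(
    counts["national_count"] >= THRESHOLD
)
print(counts)
```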

JS: And are those based on – I mean, I know the federal rules, but what about private sector datasets? Do they often base their suppression or aggregation rules on the federal government’s, or is it kind of just the Wild Wild West out there? 

CB: So it’s definitely the latter, the Wild Wild West. There’s no one federal law, and there’s none for consumers, so we’ll get to that in a little bit. But for federal laws, it’s a hodgepodge here in the United States. So, for example, the census is governed by a mix of Title 13 and CIPSEA, and again, I’m always bad with acronyms, but that’s the Confidential Information Protection and Statistical Efficiency Act, which was updated in 2018; that act also governs Bureau of Labor Statistics data. There’s also Title 26, which governs the Internal Revenue Service’s data, but then it overlaps a little bit with the census too, because they have some joint datasets together, so some of those datasets are protected by both. But then we have our healthcare data, which is governed by HIPAA, and student data with FERPA. So there are all these pieces, and there’s not one place to go to say, hey, this is how you should protect your data, because of how these laws are done. Now, for consumer data, we have no federal laws covering how it’s handled, so there’s no pressure for companies to think about how they should protect their data. There are a few states; last time I checked, it was somewhere between 11 and 13 states that have some laws, but that’s it, and they vary a lot in how strict they are. The closest would be California, which has the strictest set of laws governing consumer data, but even then, they’re just one state, so it’s not a federal mandate. 

JS: And it’s affecting or it applies to companies based in California, or to people based in California? 

CB: So a mixture. The company has to be in California, and it has to be a for-profit, so Urban would actually be excluded from that because we’re a nonprofit; and then it protects all California residents. 

JS: So if you are a California for-profit company doing a nationwide survey that includes some California residents, does the law apply just to those residents or to everybody in the survey, because you’re based in California?

CB: You’re based in California. It’s interesting, they actually have a clause in there about what is considered a California business, and there’s actually a [inaudible 00:15:38] requirement too. And there’s also been an update to the law; they released one version in 2018, I believe, and then they have another update for this year. So that’s why you might have been spammed again by different companies saying, we updated our privacy policy. 

JS: Right. Interesting, because that’s the California change. And I remember there’s a whole section in your book about the actual penalties, and a lot of these are not very strong, right? The penalties are kind of nominal. 

CB: Yeah, they are. And so, California kind of updated that; there’s a little bit more where, for instance, one of the biggest changes for the penalty is that if it involves a child, which they consider anybody under the age of 16, or I think it’s 16 and under, excuse me, then it’s considered a severe case, no matter what, even if it’s a light infringement, because they have classifications of less severe versus severe. If it involves kids, it is automatically severe, and I believe, in that case, it’s $7,500 per child. 

JS: Wow. And that’s the penalty to the state, that’s paid to the state on top of whatever civil penalties could come up if someone decided to sue. 

CB: Yeah, exactly. So one of the updates between the past California law and the current one is that they actually designed a body that would pursue those lawsuits, because before it was kind of squishy, I guess. And then, they also had, I believe, a 30-day window where the company could, like, well, as long as they correct it… 

JS: Yeah, right, it’d be okay. 

CB: It’d be okay, but for bigger companies, if there’s a lot of money on the line, they’ll be very motivated to fix it in 30 days. 

JS: Sure. At $7,500 a pop, you would think you would jump on that, yeah. You mentioned this phrase differential privacy earlier, which is a big deal, and I wanted to ask you to explain it for folks, because it’s so important, and it’s often confusing, to me at least. Maybe I get into the weeds too much, but if you could just talk about that, what it is, and especially this big debate at the Census Bureau – although, I guess, it comes up other places too, but that’s where I’m most familiar with it. 

CB: Right, so it is a very complex topic; there are a lot of very smart people who struggle with understanding it. So I’m just going to make that kind of a caveat or disclaimer as I try to explain it at a very high level and quickly for people who are listening. Before I even dive into what differential privacy is, I have to talk about the fact that when you are looking at protecting a dataset, you have to define what you mean by privacy, and what is a risk to the data or information you’re trying to release. For many, many years, several decades, the way that we defined it – I say we, meaning the federal government or other agencies – was that privacy risk is being able to identify somebody, or finding a group of people, or saying, hey, if these people are smoking, we can infer that they’re likely to have cancer, so maybe we should increase their health insurance rates; we don’t want those kinds of disclosure risks, we don’t want that private information to be disclosed. Those are very intuitive definitions, but the problem with defining privacy that way is that whatever somebody decides the risks are is very subjective; but again, it’s very intuitive, being able to ask, can I match somebody in this dataset against another dataset. A very classic example that many people like to cite is the Netflix Prize dataset. That was this $1 million prize Netflix ran back in, I think, 2007-2008 or so, saying, hey, if you can improve our recommendation system by 10 percent, you will win $1 million from us, and so they released a dataset that was anonymized: they removed people’s personally identifiable information. But one group, instead of trying to improve the recommendation system, was able to directly link the records in the Netflix dataset with IMDb and identify certain people. That caused a lawsuit, and maybe some listeners here think, well, Claire, that’s silly, I don’t care if somebody knows I gave five stars to the latest Avengers movie or something like that, but one of the things that came out of the lawsuit is that you don’t know what could be inferred from the dataset. For instance, you could apparently figure out people’s sexual orientation based on what they were watching, and so one of the claims in the lawsuit was that you could identify if somebody was LGBTQ+. So that’s sensitive, and that’s something [inaudible 00:20:29] thought of. 
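A record linkage attack like the Netflix/IMDb one boils down to joining a “de-identified” dataset to a public one on fields they share. Here is a minimal toy sketch in Python; every name, title, and date is invented, and real attacks also tolerate fuzzy matches on dates and ratings.

```python
import pandas as pd

# "Anonymized" release: no names, just ratings with dates.
released = pd.DataFrame({
    "user_id": [101, 101, 202],
    "movie": ["Movie X", "Movie Y", "Movie X"],
    "rating": [5, 2, 4],
    "rating_date": ["2007-03-02", "2007-03-09", "2007-04-11"],
})

# Public dataset where people review under their real names.
public = pd.DataFrame({
    "name": ["Jane Doe", "Jane Doe", "John Roe"],
    "movie": ["Movie X", "Movie Y", "Movie X"],
    "review_date": ["2007-03-02", "2007-03-09", "2007-04-12"],
})

# Joining on (movie, date) re-attaches a name to the "anonymous" user_id.
linked = released.merge(
    public, left_on=["movie", "rating_date"], right_on=["movie", "review_date"]
)
print(linked[["user_id", "name", "movie"]])
```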

So that’s what we call a record linkage attack; that’s something where people will say, hey, that is a disclosure risk, let’s protect against that. So that’s my level setting, that’s how we’ve been doing it for many, many years. Now, differential privacy tried to tackle the ad hoc-ness of that, where before, you’re trying to [inaudible 00:20:47] how somebody is going to attack: are they looking for one person, a group of people, the inference I mentioned earlier about smoking and cancer, or even, who knew that sexual orientation could be inferred from this Netflix dataset, all those kinds of things. Basically, differential privacy says, hey, I’m a new privacy definition, I’m going to scrap all those things and assume the worst possible case scenario. You must make a method that assumes the attacker has all the information on every other record but one person’s, and you have to think of all possible versions of that dataset, and that means all future datasets. That’s another difference from past methods: you don’t know what future datasets are going to be, so you have to protect against all possible future datasets. And then the attacker would have basically unlimited computing power, as if they were going to brute force their way through to figure out something [inaudible 00:21:43].

So that’s what differential privacy basically says: this is how we should define privacy. It’s a very, very conservative, very high privacy guarantee, but the criticisms you get from that are, one, trying to figure out what the universe of possible versions of the dataset is, and how to protect it, is really hard; it’s really difficult for people to even wrap their heads around all future datasets, like, what does that even mean. And, okay, well, if it’s everybody but that one person, that’s a bit too much, so then you get datasets that might be way noisier than before. This actually goes to your point earlier: well, I keep hearing about differential privacy in the context of the 2020 census. Up until 2020, the methods had all been using what I call the more traditional privacy definitions. And so, when they – when I say they, I mean the Census Bureau – created their Disclosure Avoidance System, that’s the phrase, when they made their system to protect the decennial census data, they based it on those traditional privacy definitions, and the biggest method they used, the biggest or the main method, was data swapping. They were swapping records with similar characteristics, based on, oh, there are very few people who are, let’s say, African American with so many kids in this one area of the country, so let’s swap them with another family in another part of the country, and that’s how we’re going to protect them. So 2020 is the first time that they did away with all of that and decided to make a new method that satisfies differential privacy. Now, I’m being very careful with my words right here, because I often hear people say differential privacy is the method. It is not; it is a privacy definition that a method must satisfy. So the method, or the algorithm, that was used for 2020 is called the TopDown Algorithm, which has components that satisfy differential privacy. 
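For readers who want to see what “a method that satisfies differential privacy” can look like in the simplest case, here is a sketch of the textbook Laplace mechanism for a single counting query. To be clear, this is only an illustrative toy with made-up data, not the Census Bureau’s TopDown Algorithm, which is far more elaborate.

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_count(values, predicate, epsilon):
    """Release a count with epsilon-differential privacy via the Laplace mechanism.

    Adding or removing any one person changes a counting query by at most 1
    (its sensitivity), so Laplace noise with scale 1/epsilon hides any single
    person's presence or absence.
    """
    true_count = sum(1 for v in values if predicate(v))
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

ages = [23, 37, 41, 52, 68, 70, 71]
# Smaller epsilon means more noise: stronger privacy, less accuracy.
print(dp_count(ages, lambda a: a >= 65, epsilon=0.5))
print(dp_count(ages, lambda a: a >= 65, epsilon=5.0))
```

The published number is the true count plus random noise, which is exactly the privacy-versus-accuracy tension discussed next.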

JS: Okay, I just learned a lot there. That was really interesting. So we can keep it specific to the 2020 census: how does the Census Bureau then determine, I guess, the resulting accuracy of the data? Once you’ve done either a swap or all these other things, how do they say, yeah, these data are not just – we haven’t just created a [inaudible 00:24:05] random dataset for you? 

CB: Right, and that’s a great point. I’ve been hinting at it without having explicitly said that there’s this natural tension between protecting the privacy of the dataset and making sure that it’s useful or accurate for whatever use case you want. You can’t have all the information and all the privacy. In the two extreme cases, if you want all the information or usefulness, you’re just going to fully expose people’s [inaudible 00:24:31] privacy; to make it fully private, you could just lock the doors and say, hey, you don’t have access to the data. So there has to be some trade-off, and, to your point, how do you determine that? Well, for any dataset, there’s no one utility metric or measure that you should go by, it really just depends on the data and what people are going to use it for. The census, again, is a great example, because it’s used for a lot of things, and so it’s really hard for them to say, hey, it needs to be useful for everything; for one, that’s impossible. So they do a suite of analyses. One way is to look at summary statistics, looking at the bias or, I don’t think they call it bias, I think they look at the absolute error of some of the measurements, like how variable it is, so they do those quick checks at different geographic levels. They have what we sometimes call outcome-specific measures, where there’s a specific use case that somebody will use the data for. The 2020 census is used for redistricting, so they probably want to make sure that it’s still going to be very useful for that redistricting file. There are other ones where you can look at distributional aspects of the data, saying, maybe this variable looks really good, or this categorical variable is very close to the original categorical variable for these counts, or looking at a continuous distribution and comparing the two. And then some people ask, okay, what about relationships, so you can run models or look at multivariate distributional features of the data. So it can vary quite a bit. But just to say that the census has a really hard task, because you can’t make a dataset that’s completely useful for everything, again, if you do have [inaudible 00:26:14] privacy. 

JS: Right. 

CB: So, like, how do they optimize on that? And there are certain things they’re just not going to be good at, because of those privacy issues. One of the examples I hear of people using the census data is, I guess, there’s a county in Ohio that uses it for figuring out restaurant permits. So, not something the census is going to be thinking about, but it’s a random use case. 
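One of the quick utility checks Claire mentions, comparing the absolute error of privacy-protected counts against the confidential ones at different geographic levels, might look something like this sketch. The counts, county labels, and noise scale are invented purely for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Invented confidential block-level counts and a noisy, privacy-protected version.
blocks = pd.DataFrame({
    "county": ["County A", "County A", "County B", "County B"],
    "block_pop": [120, 85, 950, 1010],
})
blocks["protected_pop"] = blocks["block_pop"] + rng.laplace(scale=5, size=len(blocks))

# Mean absolute error at the block level, and again after aggregating to counties.
block_mae = (blocks["protected_pop"] - blocks["block_pop"]).abs().mean()
county = blocks.groupby("county")[["block_pop", "protected_pop"]].sum()
county_mae = (county["protected_pop"] - county["block_pop"]).abs().mean()

print(f"Block-level mean absolute error:  {block_mae:.1f}")
print(f"County-level mean absolute error: {county_mae:.1f}")
```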

JS: But to your point, someone could try to somehow combine data, like, to the privacy point, they combine various datasets to pull out that this person goes to this restaurant, or, I don’t know, whatever that magical future dataset and data need is; yeah, that’s a heavy lift. So before we wrap up, and you’ve alluded to this a little bit, I wanted to ask you: if you were in charge of the US data privacy thing, and you get to control everything, not just federal data, you get the private sector, the nonprofit sector, you are the data privacy tsar of the whole country, the tsar with a star on it because it’s everything, what would you like to see happen, or what would your policies be? 

CB: That’s a great question – I’ll keep saying this, because you just have great questions. There are two parts. One part is that I would hopefully have some control over the education system, because I think we need to teach this. All this information I’m providing, you don’t learn it unless you’re in grad school, and even then it’s in computer science departments. I talked to a colleague who came from a computer science program, and he said, well, you can go and learn privacy there, but it’s not the expanding-access-to-data part of privacy like we’ve been talking about. There are no formal classes on it, and if there are, it’s at the graduate level in computer science, like I said, and that’s not good, because these data are used by so many people: the public policy people, social scientists, demographers, economists, and I’m from statistics, so this should be standard in any classroom where they do data analytics. And there’s already been a movement of professors trying to teach students that your dataset isn’t this beautiful, clean thing, like, what is it, the classic example is the Iris dataset, like, look how everything is perfect… 

JS: [inaudible 00:28:43] 

CB: And its clustering of things, like, no, no, no, no. So on top of it all, it should be discussed that there are these privacy implications. You’re also seeing, I guess, another movement around data ethics and data equity, so that should all wrap in, and that’s why I want another star on my tsar part, which is the education piece.

JS: You’re going to be busy, but okay, all right. 

CB: And then, I guess, for all of privacy, definitely getting a more unified public policy law for how we use data and how it should be accessed, because, one, we don’t have any consumer privacy laws. And two, the issue with the federal laws we have, because they’re piecemeal, is that some agencies have to renegotiate every federal fiscal cycle for a different exchange of data. So, for instance, there’s a project that I’m working on with the Bureau of Economic Analysis, and they have an agreement with the Internal Revenue Service, because the Bureau of Economic Analysis works with a combination of census data, taxpayer data, and some labor data too. The way that they negotiate is they become a subcontractor to another agency; that’s how they work around that whole renegotiating-every-year process. But some agencies still do a formal process of, okay, it’s that time of year again, we’re going to spend three months negotiating how we will exchange data under the different privacy acts, and that’s three months out of the 12 months, so you get maybe nine months, and then you have to start all over again. 

JS: So does GDPR, which is the European law that you referenced earlier, would that be your model, at least as a starting point, or is that, from your perspective, not sufficient or not – I don’t know, like, yeah, would that be your starting point, at least, or a model to build off of? 

CB: I think it’s a good model to start from; I think we’d want a combination of that and the California laws, because we’ve learned a lot from both of them. I actually talk about those in my book: both of them have really great features, but some of them can be easily exploited. One of the things that both of the laws rely on is having full consent from people, but what if you’re pressured into giving full consent? One of the things we’re actually seeing in school systems, and I did not know this, is that there are now certain apps that students download on their smartphones, and professors can know your attendance, whether you’re physically in the lecture hall, like, there are sensors in the lecture hall – yeah, that’s creepy.

JS: That’s creepy. 

CB: Now you know if your students have been there, if they’re late or not. I mean, I guess students could just send one person with a backpack with everybody’s cell phones in it. 

JS: Everybody’s cell phone, right. 

CB: Yeah, but, like, that’s beside the point. I mean, what’s to stop the school from saying, hey, if you want to come to our school, you have to give full consent? 

JS: Right, and we’re going to track you, and if you don’t do X, Y, and Z, then we’ll take away your scholarship, for example, something like that, right.

CB: Right. And then, for work, there’s already that keyboard-activity detection, like, how active are you, are you actually working, and what’s to stop your employer from saying, hey, if you want to work for us, you have to consent to us tracking this information?

JS: Right. Okay. Well, on that scary note, I will point people again to your book, which is linked in the show notes. Everybody should check it out; it is a mixture of scary stories but also things that we can do as data users and consumers. Highly recommended. Claire, thanks a lot, thanks for coming on the show, this was lovely, always great to talk to you. 

CB: Thanks for having me. 

Thanks everyone for tuning into this week’s episode of the show. I hope you enjoyed it. I hope you will check out Claire’s book and her other work. You can head over to her page at the Urban Institute, which I’ve linked to in the episode notes. Just one more mention of various ways that you can support the podcast. I, of course, have my Patreon page open, where you can support the show for just about a cup of coffee every month. You can head over to my newsletter page where I have a free newsletter and a paid newsletter, the paid newsletter gives you some other advanced information, let’s just say, some coupons for conferences, ability or opportunity to meet with me in a Zoom call, and there’s also the new Winno community that I’ve been building where I’m sharing out little data visualization tricks and tips and techniques via text message, only two or three a week, and you can text me back and say, I didn’t really like that or I like that a lot, or what else can you show me, so I have that opportunity as well. So check all those out, they’re all listed on the show notes’ page, and also on policyviz.com. So until next time, this has been the PolicyViz podcast. Thanks so much for listening. 

A number of people help bring you the PolicyViz podcast. Music is provided by the NRIs. Audio editing is provided by Ken Skaggs. Design and promotion is created with assistance from Sharon Sotsky Remirez. And each episode is transcribed by Jenny Transcription Services. If you’d like to help support the podcast, please share it and review it on iTunes, Stitcher, Spotify, YouTube, or wherever you get your podcasts. The PolicyViz podcast is ad free and supported by listeners. If you’d like to help support the show financially, please visit our PayPal page or our Patreon page at patreon.com/policyviz.