Ronald L. (Ron) Wasserstein is the executive director of the American Statistical Association (ASA) and he joins me this week on the podcast to talk about the 2020 Census.

In this role at the ASA, Wasserstein provides executive leadership and management for the association and is responsible for ensuring that the ASA fulfills its mission to promote the practice and profession of statistics. He also is responsible for a staff of 35 at the ASA’s headquarters in Alexandria, Va. As executive director, Wasserstein also is an official ASA spokesperson.

Prior to joining the ASA, Wasserstein was a mathematics and statistics department faculty member and administrator at Washburn University in Topeka, Kan., from 1984–2007. During his last seven years at the school, he served as the university’s vice president for academic affairs.

This week’s episode is a bit of a departure from the usual discussion of data visualization techniques and tools, presentation design and skills, and open data. Instead, Ron and I take a step back and talk about the data that we use in our work, specifically the debate around the upcoming 2020 Census. I hope you find the discussion as interesting as I did.

Episode Notes

ASA Lauds Decision to Leave Citizenship Question off 2020 Census, ASA News

The Trump Administration’s Statistical Malpractice on the Census, Washington Post

Push for a Full 2020 Census Ramps Up After Citizenship Question Fight, NPR

Learn more about Count on Stats

Follow Count on Stats on Twitter

Join the Count on Stats LinkedIn Group

Support the Show

This show is completely listener-supported. There are no ads on the show notes page or in the audio. If you would like to financially support the show, please check out my Patreon page, where just for a few bucks a month, you can get a sneak peek at guests, grab stickers, or even a podcast mug. Your support helps me cover audio editing services, transcription services, and more. You can also support the show by sharing it with others and reviewing it on iTunes or your favorite podcast provider.

Transcript

Hi everyone. Welcome back to the PolicyViz podcast. I’m your host Jon Schwabish. You know, on this show I spend a lot of time talking to guests about their data visualization projects, the data visualization tools that they’re using. We talk about presentation design. We talk about presentation techniques. I’ve even spent a bunch of time talking to folks about open data and open data portals. But I haven’t really spent a lot of time talking to folks about data, the actual data, the construction of different surveys, the ins and outs of data. And so I thought it would be a good opportunity to do a little research on that and do a little talking with folks about some data. In particular I was interested in talking with some people about the 2020 census that’s coming up here in the United States in just a few months.

As you may know there was at least a big controversy about adding a specific question to the 2020 census. And so I was fortunate enough to be connected with Ron Wasserstein who is the executive director of the American Statistical Association. And of course, one of the big issues ASA has taken on is this question in the 2020 census about whether to add a question that asks people about their citizenship.

That effort is now seemingly over. But the addition of that question was quite controversial for lots of different reasons. And of course there is the political reason, but there’s also the statistical reason behind it, the methodological issues behind adding a question to a survey. It’s not like you just add a question to a survey and it doesn’t have impacts and it doesn’t have costs. It does, and it’s one of those things that we really need to and should be thinking about.

So I was really grateful that Ron came on the show to talk with me about what he does at the ASA, what the ASA position is on the 2020 census debate and how those efforts could have impacted the census and the survey itself. So I hope you’ll enjoy this week’s episode. It’s a little bit of a different interview, a little bit of a different topic that I’m interested in here. But it’s one that is vitally important to the work that we do as data visualization producers, as content producers, as presentation experts. In all the things that we do to communicate data, we of course need to be thinking carefully about the data behind all of our work.

Okay so onto the interview with Ron Wasserstein.

Jon Schwabish: Hi Ron. Thanks for coming on the show.

Ron Wasserstein: It’s my pleasure.

JS: I’m really glad to chat with you. We’re going to talk about the census, obviously an important topic. And before we get into it, I thought maybe you could talk a little bit about yourself, how you became executive director of ASA and what that job entails.

RW: Glad to. I never really imagined that I would be executive director of the ASA. I started off life as an academic and thought that’s what I would always be. I loved the job that I had at Washburn University, started off as a faculty member in mathematics and statistics, went on to be an academic administrator. Thought I would always be there. But all along the way I was volunteering for the American Statistical Association in various capacities. And then in 2006 my predecessor announced his retirement. A few people contacted me and asked me to consider applying for the executive director job. It’s not something I had ever considered. I looked into it, my wife and I gave it a lot of thought. I ended up applying and then I was fortunate enough to be selected for the position and discovered that it just really suited me. And now I’ve been in the position for 12 years. And to be honest, Jon, I just can’t imagine doing anything else. I just love it. It’s the opportunity to work with statisticians all over the globe to do what the ASA does, which is to promote the practice and profession of statistics.

We’re the world’s largest community of statisticians. And what do I do? Well I lead the association and its mission to advocate for our profession. We have statisticians in over 90 countries. We have about 18,000 members. I just really feel like the luckiest guy on the planet to have this job.

JS: Wow! That’s great. I mean that’s like a success story. That’s what we all want to hear.

RW: I’m really glad to have the opportunity to talk about the various aspects of statistics. And really it sort of segues into the discussion about the census because we found ourselves in a position where we needed to be engaged in the discussion about census 2020 because in fact the American Statistical Association was founded in 1839 as a means to promote the 1840 census as it turns out.

JS: And so now we have a new census. Let me start this way. Let me ask you to maybe list out the importance of the census and what it is responsible for. I think a lot of people get that it’s a count, which we all know is important for some of the basic demographic and statistical information we need about the country. But I’m not sure people fully understand the implications of having that accurate count for government spending, for our political borders that are drawn. So can you maybe talk about the importance of the census before we get into the specific controversy that surrounded it this time around?

RW: Sure. The census is kind of a remarkable thing. If you start reading the Constitution, you don’t get very far before you discover that the founders realized the importance of counting the population. And when you get down to it, the census exists to do two fundamental things. It’s there to count people so that we can apportion the population, counting people so that we can allocate districts, that is, legislative districts for the House of Representatives. And also because we allocate billions of dollars based on where people live. So those are the two main reasons that we count people.

JS: Right. And so now this time around the Trump administration wanted to add a citizenship question to the 2020 census and we now know that that’s not going to be on the census. So I have really a two-part question for you, or maybe three parts really. So why was that so controversial, for people who don’t know? And then now that we know that the question is not going to be added, does it matter in terms of affecting people’s likelihood to participate in the census itself? And then as a follow-up question, so let me just, I’ll get all three out of the way. What is ASA’s role in advocating for or against that question and the accuracy of the census?

RW: Sure. So all three of those questions roll together really in a long and kind of a convoluted story which I’ll try to roll together and not make into too long of a tale. The thing to know is that in certain respects we’ve been asking people about their citizenship in certain ways for a long time in the history of our country. But we haven’t asked everyone in the country about their citizenship since 1950. We’ve asked a subset of people about their citizenship in various ways since that time in what was called the long form of the census. And we’ve asked it, since we stopped doing the long form, in a document that’s called the American Community Survey. So let’s just mention right off the top of this discussion that we have very good information about the number of citizens and non-citizens through that American Community Survey, which a subset of Americans, people living in the U.S., fill out every month.

JS: Correct.

RW: So just so podcast listeners know we actually already have that information and have a good count of that anyway. So –

JS: And how long, just for folks who don’t know the ACS, how long? I used to look at this and now have forgotten, but how long have we asked the citizenship question in the ACS?

RW: The ACS goes back to around 2005 as I recall.

JS: Right. And that’s about 60,000 households. So we still have, and that number’s changed a bit, but, so we have information on citizenship about 60,000 households every year.

RW: I think that’s something like every month and it ends up being like 3.5 million households a year.

JS: Right. Okay, great. Okay. Okay. So now then we have the census itself.

RW: Right. So we have the census itself. And so the reason for the controversy is really very simple. And this folds into where the American Statistical Association got involved. And that is that the census is a very complex operation. It takes 10 years to run it. It’s a very detailed operation because after all there are over 300 million people in the United States. And counting them all is very complex because we’re a highly mobile population. We’re a complex people.

So one of the things that we do, and we do this by law actually, is that we very thoroughly test the questions on the census and we very thoroughly test the means by which we ask these questions, by which we go about conducting the census itself. And we do this in year eight of the 10-year cycle. Well after this testing period was underway, the administration introduced this idea that we’re suddenly going to bring this new question onto the census form and that just sets off all kinds of red flags because there is a thorough process by which every question on the census is tested and this new question was introduced well after that testing period was underway. And essentially that’s illegal and inappropriate, improper from a statistical standpoint. And so that’s problem number one from the statistical standpoint.

JS: Yeah. Before you go to problem number two. So maybe you can talk a little bit about how statisticians think about testing questions and not just reaching people but also the order of the questions, the structure of the survey. I feel like one of the things that maybe people didn’t understand in this whole discussion about adding this extra question is when people say, we need to test this question, what exactly does that mean and what does it entail? When it’s described in the media, it feels to me a little bit like a black box. And I wonder how many people actually understand what that entails because it’s not just a simple thing where you just add a question.

RW: Right. So I’ll start with a trivial example. I have two teenagers and if I ask them, have you cleaned your room recently? The word cleaned and the word recently mean very different things to them than they do to me. And the wording of any question may seem simple to one person, but the words in a question have multiple meanings to any listener. And they have meanings in and of themselves. They have meanings in the context in which they’re asked. They have meanings in the order in which the questions are asked. They have meanings in the cultural context of the hearer. They can have meanings in different languages. We speak many different languages in this country, obviously. And so the way that you find out how questions are interpreted and understood is you go out and ask people and you see what happens. There’s a famous example from a few censuses ago where people were asked a fairly simple question about coming from Central America and South America. And most of us would understand Central America to be those countries lying between, you know, like Mexico and Panama, and South America to be another continent. But when those questions were tested, some people in Iowa understood themselves to be in Central America. And some people in Alabama understood themselves to be in South America. When the questions were tested those things were discovered to be misunderstandings.

So you learn those things by simply asking a subset of people the questions and then you can clarify the question for them. So you just test it to make sure that people understand them. Also there’s, you know, a lot of things have been learned through what’s called testing theory that you just ask people questions, you find out what they’re misunderstanding and you clarify it. So there are ways to test things by asking one set of people a question a certain way, another similar set of people the question a slightly different way. And you compare the answers to see where the differences in responses are amongst similar groups of people to see where misunderstandings arise.
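The split-sample comparison Ron describes, asking two similar groups the same question with slightly different wording and comparing the answers, can be sketched with a standard two-proportion z-test. This is only an illustration, not the Census Bureau's actual testing procedure: the function name and the response counts below are hypothetical and do not come from the interview.

```python
import math


def two_proportion_z(x_a, n_a, x_b, n_b):
    """Two-sided z-test for a difference between two response proportions.

    x_a, x_b: number of respondents giving a particular answer under
              wording A and wording B (hypothetical inputs).
    n_a, n_b: number of respondents shown each wording.
    Returns (p_a, p_b, z, p_value).
    """
    p_a = x_a / n_a
    p_b = x_b / n_b
    # Pooled proportion under the null hypothesis that wording has no effect.
    pooled = (x_a + x_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_a, p_b, z, p_value


# Hypothetical counts: 120 of 1,000 people pick a given answer under
# wording A, 80 of 1,000 under wording B.
p_a, p_b, z, p_value = two_proportion_z(120, 1000, 80, 1000)
print(f"wording A: {p_a:.3f}, wording B: {p_b:.3f}, z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value here would suggest that the two wordings are being read differently, which is the signal that a question needs to be clarified before it goes on the form.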

And it’s not hard, but it just requires some time and the effort to test the question. And by the way, some people argued, in fact, this came up during some congressional hearings, some people in Congress argued, well, hey, you know, we’re asking the same question in the American Community Survey, we’ve been asking this question for years. So isn’t that enough testing? Well, it’s not, for two reasons at least. One is that the American Community Survey is a different survey than the census. So it’s not the same survey. It’s not the same context. If you ask the same question in a different context, it’s not the same thing. So you need to test it in the same context.

But the other thing, Jon, is that the American Community Survey is offered to a subset of people. The census goes to everyone. And also the census is clearly a much more politicized issue right now than the American Community Survey is. It’s way more highly visible. So it just has to be tested and that just wasn’t done. So issue number one, as I mentioned, was no testing.

The second issue is the issue of non-response. So if I could take just a minute to talk about how the census is conducted, then I can explain non-response. You will get, I will get, everyone will get a census form in the mail and you get the opportunity to return that census form in the mail. And this time you’ll also get the opportunity to fill out your census online.

JS: Correct.

RW: You’ll get a certain period of time to do that. And if you don’t do it in that period of time and in 2010 about a fourth of households didn’t do it in that period of time, then someone will show up at your door asking you to respond to the census directly to them. People who don’t fill it out online or in the mail that’s called non-response and the follow-up to that is somebody coming to your door. That’s the expensive part of the census is having to have somebody come to your door to do that. It’s really what drives up the cost of the census.

When people don’t respond, somebody coming to the door makes it much more expensive, and the estimates are that adding the citizenship question would have hugely driven up the cost of the census. There are 135 million household units in the United States. It was estimated that adding the citizenship question would have added at least 3 million more households that would have failed to respond to the census and it could have been quite a few more.
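To put those figures in proportion, here is a rough back-of-the-envelope sketch. The 135 million households, the 3 million additional non-responding households, and the "about a fourth" non-response rate from 2010 are taken from the interview. Applying the 2010 rate to the 135 million figure is an assumption made only for illustration, and the per-household follow-up cost is a purely hypothetical parameter, since no cost figure is given in the conversation.

```python
TOTAL_HOUSEHOLDS = 135_000_000       # household units, as stated in the interview
EXTRA_NONRESPONDING = 3_000_000      # "at least 3 million more households"
BASELINE_NONRESPONSE_RATE = 0.25     # "about a fourth" of households in 2010


def followup_summary(cost_per_followup):
    """Summarize the extra follow-up workload.

    cost_per_followup is a hypothetical dollar cost of one door-to-door
    follow-up; it is an assumption, not a figure from the interview.
    """
    # Assumption: the 2010 non-response rate applies to today's household count.
    baseline_followups = TOTAL_HOUSEHOLDS * BASELINE_NONRESPONSE_RATE
    return {
        "baseline_followups": baseline_followups,
        "extra_share_of_all_households": EXTRA_NONRESPONDING / TOTAL_HOUSEHOLDS,
        "relative_increase_in_followups": EXTRA_NONRESPONDING / baseline_followups,
        "extra_cost": EXTRA_NONRESPONDING * cost_per_followup,
    }


summary = followup_summary(cost_per_followup=50.0)  # hypothetical $50 per visit
for name, value in summary.items():
    print(name, value)
```

Under these assumptions, the extra 3 million households are a small share of all households but a noticeably larger share of the follow-up workload, which is where the cost is concentrated.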

Now we really don’t know how many people are going to fail to respond, even though the citizenship question isn’t on there. And that’s the ongoing concern Jon, is even though the citizenship question isn’t on there, there’s still quite a residual toxic environment from this whole discussion.

JS: Right. And so the point of people not answering the questions is because they don’t want to reveal whether they are a citizen or not. So I guess, do people have, when you are answering the questions, do you have the option to not answer specific questions within the survey? I mean, we know in the ACS that there’s a lot of non-response for specific questions. So why is that not just an option people could take when answering they’re just not going to answer that question.

RW: And that probably is likely to happen. The census is required by law to not share the information that’s collected with any other agency. Revealing census data to anyone else is punishable by imprisonment. That data is highly protected by law. The only instance in our history of that information ever being used by the government is rather infamous. It was during World War II. It was used to round up Japanese people to intern them during World War II. Since that time additional protections have been added. So that information can’t be used that way ever again. But I think it’s arguable that in the current climate some people may very well feel insecure. It would be great if the government would take a strong stand indicating that would not happen. If the administration would remind people that their census data is safe and secure, that would be a wonderful step in ensuring that people would respond to the census. If somewhere around March 1st there would be a presidential tweet reminding everyone that it was important to fill out the census, that everyone was safe in doing so, that would be terrific. I’m not counting on that happening but that would sure be great.

JS: So what is the ASA’s perspective on this? So you have thousands of members. I’m going, you know, and I’m sure across a variety of perspectives with regards to whether this question should be on as well as political affiliation. So I’m curious how as an advocacy group, how do you view this debate and how did you go about, you know, weighing in the discussion as it was ongoing?

RW: So we’re not a political organization. We did feel very strongly as a statistical organization that the census is vital to our democracy, that having a fair and accurate count is fundamental to everything that we do as a nation. That it is the basis of our data infrastructure. And that adding a citizenship question was likely, in our view, to lead to a failed census, that it would have led to a severe undercount of a large segment of our population. So we’re extremely pleased that it was left out and now we feel like it’s our role to do everything that we can to ensure that we get a good count. So we’re certainly encouraging people to respond to the census when they get it, to fill out that form when it comes so that we can have the best possible census count that we can get.

JS: One of the last things I want to do to wrap up is, we’ll get the results on the counts in the census in another, I guess, you know –

RW: At the end of 2020.

JS: At the end of 2020. And how will we know, I mean, I think one of the things that people have said is, well, just the fact that we’ve had this ongoing debate is going to suppress some participation. How will we know whether there has been any effect on participation or will we know?

RW: Yeah, I think we will. The Census Bureau has some pretty good ways, fairly sophisticated technical ways of estimating the undercount. And I’m not an expert at that and I bet you can find someone to follow up on this show. They have some pretty good ways of estimating how good a job they have done at counting. And so I think we’ll have a pretty good idea of how successful we’ve been at counting or not. And I remain very concerned, and so a lot of people are going to be working very hard to reach these populations that are at risk of not being counted, to impress on these populations how important it is for them to be counted and how important it is for them to know that they will be protected, that their data will be protected, and that it is important for them to be counted because resources that these populations need are at risk if they’re not counted.

JS: Right. So now that we’re moving into this phase of actually collecting the census data, we don’t have this question on. Aside from promoting and encouraging people to answer the census questions, what else is on your plate as director of the ASA? Like what are the next round of things that you are all working on? This is obviously a very highly publicized and visible debate. But what are the other things that you all now get to tackle?

RW: So maybe I’ll just make a quick nod towards the issue of differential privacy.

JS: Yeah, it’d be great.

RW: – which is super complex from a mathematical standpoint. So I’ll just mention real quickly what it is. Differential privacy is a tool for being able to protect individuals from having their data identified. That is, from somebody being able to take information from census data and sort of being able to backtrack from that census data and maybe other data that’s available and being able to say, Ooh, I took this information here and this other information there. And Aha, this is Ron Wasserstein or this is Jon Schwabish. And I can now tell from census data this information about these individuals.

And so it’s a means of tweaking that data in such a way that you could never specifically identify me or you or anybody else from that data. So that sounds good. But at the same time, if you sort of do too much of that, if you tweak it too much, then that data becomes useless to researchers who use that data to analyze trends and evaluate systems at a larger level.

So you could imagine data, let’s say in a table where you’re trying to look at information that’s viewing things, trends that are going on in a city or a county or even down at finer levels and by fuzzing up that data so that you can’t figure out who anybody is in particular. If you make it too fuzzy, then you’ve made that data inaccurate. So it’s a balancing game so that you make the data fuzzy enough so that you can’t identify any individual but not so fuzzy that you’ve made the data wrong. And it’s new. It’s a new thing that the census bureau is introducing into the system. And the question is how well will it work? And lots of really smart people are working hard on that problem and we just don’t know the answers to those things yet. But we’ll know soon.
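The balancing act Ron describes can be illustrated with the textbook Laplace mechanism, one common way of adding differentially private noise to counts. This is a minimal sketch and not a description of the Census Bureau's actual system, which is not detailed in the interview. The table of counts, the block names, and the epsilon values are all hypothetical. A smaller epsilon means more privacy and fuzzier numbers.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_counts(counts, epsilon, rng):
    """Add Laplace noise to each count in a table.

    Assumes each person contributes to exactly one cell, so the
    sensitivity of each count is 1 and the noise scale is 1 / epsilon.
    """
    scale = 1.0 / epsilon
    return {cell: value + laplace_noise(scale, rng) for cell, value in counts.items()}


def mean_abs_error(epsilon, trials, rng):
    """Average absolute noise added to a single count over many trials."""
    scale = 1.0 / epsilon
    return sum(abs(laplace_noise(scale, rng)) for _ in range(trials)) / trials


rng = random.Random(2020)
table = {"block A": 12, "block B": 340, "block C": 5}  # hypothetical counts

print("strong privacy (epsilon=0.1):", noisy_counts(table, 0.1, rng))
print("weak privacy   (epsilon=2.0):", noisy_counts(table, 2.0, rng))
print("typical error at epsilon=0.1:", mean_abs_error(0.1, 2000, rng))
print("typical error at epsilon=2.0:", mean_abs_error(2.0, 2000, rng))
```

With the small epsilon, a cell holding only a handful of people can be swamped by the noise, which is the "too fuzzy" end of the tradeoff. With the large epsilon the counts stay close to the truth but offer less protection.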

JS: Interesting. Interesting. Well, I’ll put links to both of these issues and the ASA site on the show notes. Now I have just one last question. You have one annual conference or two?

RW: So we host a large annual meeting. We just had it in Denver and a whole host of smaller technical meetings as well.

JS: Right. Well, I’ll put those links up there in case folks want to be involved and I hope they will. Ron thanks so much for coming on the show. This is a really important and interesting topic and I appreciate you taking the time.

RW: I’m grateful that you had me for this opportunity.

[Music]

JS: Thanks everyone. I hope you enjoyed that interview with Ron. I hope you will take some lessons away from that and think carefully about your data, where they come from, how they’re produced, and all the biases that may or may not be involved in the data that you’re using. If you’re interested in supporting the show, please share it with your friends, your family, tweet it out, send some notes around on your favorite social media feeds. If you’d like to get a PolicyViz podcast mug with your favorite hot beverage please go over to my Patreon page and consider being a supporter for just a few months – a few bucks a month. You can help me pay for web services, for transcription services, for all the things I need to make this show come to you every other week.

So I hope you enjoyed this week’s episode. Until next time, this has been the PolicyViz podcast. Thanks so much for listening.