The PolicyViz Podcast wraps up 2024 with David Keyes, author of the new book, R for the Rest of Us: A Statistics-Free Introduction! We not only talk about how you can get started in R using David’s book, but also building data and data visualization workflows with R, RMarkdown, and Quarto. We also talk about how to create consistent visualizations through themes and functions in R to help new R users leverage its features without being intimidated by complex statistics. I hope you enjoy this episode and have a great holiday season! See you in 2025!!
Resources
Check out David’s website and podcast, and grab his book R for the Rest of Us on Amazon.
Guest Bio
David Keyes (last name, FYI, is pronounced like “eyes with a K in front” not “keys”) is a self-taught R user with a qualitative background who helps people who don’t think of themselves as R users learn to use this powerful tool. As the founder of R for the Rest of Us, he develops courses to help individuals, conducts corporate trainings, and does consulting work to help organizations harness the power of R.
New Ways to Support the Show!
With more than 250 guests and 10 seasons of episodes, the PolicyViz Podcast is one of the longest-running data visualization podcasts around. You can support the show by downloading and listening, following the work of my guests, and sharing the show with your networks. If you’re interested in financially supporting the show, you can sign up for my Patreon platform, make a one-time payment via PayPal, or shop one of the show’s sponsors.
Transcript
00:12 – 00:15
Welcome back to the PolicyViz Podcast. I’m your host, Jon Schwabish.
00:15 – 00:19
I hope you’re having an enjoyable start to your holiday season.
00:19 – 00:26
I am enjoying sharing with you the last episode of the podcast for 2024. But don’t worry.
00:26 – 00:35
I’ve got lots of great episodes coming your way in 2025, all about helping you do a better job visualizing your data, communicating your data, presenting your data.
00:35 – 00:40
We’re gonna talk about lots of different tools like Tableau, Atlas data, and lots of other things.
00:40 – 00:42
So make sure you stay tuned to the show.
00:42 – 00:50
Make sure you get it in your feed on iTunes, on Spotify, or on YouTube, if you’re watching the show there.
00:50 – 00:58
So on this week’s episode of the show, I’m excited to be joined by author and R expert, R enthusiast, David Keyes, author
00:58 – 01:00
of the new book, R for the Rest of Us.
01:00 – 01:06
This is a really interesting book because it doesn’t focus on how do you clean data? How do you run regressions? How do you do estimates?
01:06 – 01:11
It’s all about using R to communicate data. So he starts with data visualization.
01:11 – 01:15
He gets into data workflow issues like Quarto and R Markdown.
01:15 – 01:23
It’s a really fascinating book and we focus our attention in our discussion on these various elements of communicating data through R.
01:23 – 01:27
So we spend a lot of time talking about Quarto and R Markdown and the difference between the 2.
01:27 – 01:33
If you don’t know, we talk a lot about tables because tables are such an interesting part of visualizing data, and there are
01:33 – 01:35
some great ways to do that in R.
01:35 – 01:43
And then we spent a bunch of time talking about themes and functions that can help you create more consistent branded visualizations,
01:43 – 01:46
either within your organization or just for yourself.
01:46 – 01:51
If you wanna have a consistent look and feel to your graphs and your charts and your maps and whatever else you’re creating in R.
01:51 – 01:58
So if you are relatively new to R, I think this will be a great episode for you because David’s gonna walk you through how
01:58 – 02:02
to use some of these other features that you may be a little hesitant to try out.
02:02 – 02:05
But as you’re going to hear, anyone can do this in R.
02:05 – 02:18
The features, the packages, the libraries are there for you to help you create better graphs and charts and all other visual elements in R. So no more talking for me. Let’s get right to the episode.
02:18 – 02:29
So here’s my conversation with David Keyes, author of the new book, R for the Rest of Us: A Statistics-Free Introduction. Hi, David. Welcome to the show.
02:29 – 02:33
I don’t think we’ve ever met, so good to meet you, like, virtually at least.
02:33 – 02:35
Yeah. Great to meet you as well, Jon.
02:35 – 02:38
I’ve I’ve followed your work for a while, and I’m excited to chat with you.
02:38 – 02:42
Yeah. Terrific. Thank you. So new book, R for the Rest of Us.
02:42 – 02:46
I do like the subtitle, statistics free introduction, so it doesn’t scare.
02:46 – 02:49
Scary enough to learn how to code. Little less scary.
02:49 – 02:52
We don’t have to learn, like, statistics while you code.
02:52 – 03:01
So I thought we would start simply with, you know, introductions and then, how you sort of got to writing a book about R.
03:01 – 03:08
Sure. Yes. I’m David. I run a website, a business also called R for the Rest of Us.
03:08 – 03:12
I’ve been doing this for about 5, almost 6 years now.
03:12 – 03:14
I have an unusual path to where I am.
03:14 – 03:19
I actually did a PhD in anthropology, and my dissertation was entirely qualitative.
03:21 – 03:30
But after I graduated, I realized I actually like doing quantitative work as well and, was working in what’s called evaluation.
03:31 – 03:36
So, I tell people it’s kind of like management consulting for, like, nonprofit or government.
03:37 – 03:46
And I was an Excel user, you know, did that, and then realized I wanted to learn something else, taught myself R, and then
03:46 – 03:53
started a business to kind of help other people like me who who wanted to learn R, but maybe didn’t come from that, you know,
03:53 – 03:57
quantitative background like so many R users do. Yeah.
03:57 – 03:59
So, yeah, that’s that’s kinda how I got into it.
03:59 – 04:06
And the book came out, worked on the book for a couple years, and then it finally came out, a few months ago.
04:06 – 04:08
So it’s been exciting to see it out there.
04:09 – 04:18
And were you I guess this sort of bleeds into my first main question, which is, you know, writing a book about a coding language can be kind of tricky, right?
04:18 – 04:24
Because especially a tool like R where it’s constantly changing people, creating new libraries and new packages and new techniques.
04:25 – 04:33
How did you make that decision that a book would be the best way for folks to learn rather than, you know, using Google or
04:33 – 04:37
ChatGPT and, you know, all the various sites people can sort of go to?
04:38 – 04:43
Yeah. I mean, I had before I wrote the book and I and I still do make video courses.
04:44 – 04:46
So that’s the main way I had been teaching.
04:47 – 04:54
But just through talking to people, I know that, you know, there’s certain people who really like learning through online
04:54 – 04:58
courses, and there’s certain people who really like books. Mhmm.
04:58 – 05:07
And so I don’t necessarily think that books are the best way, nor nor are video courses necessarily the best way. I think they’re just different ways.
05:08 – 05:13
And so, yeah, I wanted to provide, you know, an opportunity for folks who like that.
05:13 – 05:15
I know that people also like books.
05:16 – 05:16
You
05:16 – 05:26
know, even in an era of ChatGPT, a lot of people like having even the physical book to be able to mark it up, to, you know, have as a reference.
05:26 – 05:34
So I think, again, just giving opportunity you know, various options for people who want to to learn in different ways was
05:34 – 05:36
was a big part of the motivation to write the book.
05:36 – 05:45
Right. It’s interesting to me. I mean, your book, similar, I think, to Hadley Wickham’s book, but different from a lot of
05:45 – 05:48
other R books, starts with data visualization.
05:48 – 05:57
And the book, I think, primarily focuses on the communication possibilities with R, the exporting, the collaboration, the visualizations.
05:59 – 06:09
Did you ever have a part of yourself that was like, I really need to show folks how to do, like, regression analysis and data cleaning and all those pieces.
06:09 – 06:15
Kinda like I I think in the workflow, you kinda lots of people think about the data communication piece towards the end, which may or may not be right.
06:15 – 06:16
But Right. Right.
06:17 – 06:19
You know, there’s, like, how do you clean the data?
06:19 – 06:20
You know, all all of those pieces.
06:20 – 06:29
But but, I mean, just so folks in case folks haven’t haven’t sort of looked into it very quickly, like, 5 chapters in part 1 that is visualizations.
06:29 – 06:35
There are 5 chapters in part 2 that are reports, presentations, and websites, and then the last section is visualization collaboration.
06:36 – 06:40
So none of that is, like, data cleaning, regression analysis. Right?
06:40 – 06:44
So, like, what were you thinking of the guide and, like, your your target reader?
06:44 – 06:54
Yeah. Well so, I mean, I was, for me, never gonna make a a section on anything statistical because given my background and
06:54 – 07:02
my kind of approach with R, a lot of what I’ve wanted to do is show people that even if you never do complex statistics, R
07:02 – 07:06
can be still incredibly useful because that’s exactly how I use it. Mhmm.
07:07 – 07:11
I do R for the Rest of Us. We also do some consulting work.
07:12 – 07:22
And the reports that we make for clients, you know, the the most complex statistics we’re doing are, like, means and medians, you know, that type of thing.
07:22 – 07:26
And, again, we get a huge amount of value out of out of it.
07:26 – 07:32
And so I wanted the book to be a way to show people that you don’t have to use R in that way.
07:32 – 07:39
Because I think everybody already knows, you know, if you’re doing statistics, R obviously is a fantastic tool for that.
07:39 – 07:42
But, really, I also wanted to show people that it could do more than that.
07:42 – 07:46
And, actually, the original title of the book was R Without Statistics.
07:46 – 07:47
Mhmm.
07:47 – 07:49
That’s that was the working title for a long time.
07:51 – 07:58
And, eventually, my publisher decided working together, we decided to to shift it and make that more like the the subtitle as you mentioned.
07:59 – 08:03
But, hopefully, that that shows you kind of the motivation behind the book.
08:04 – 08:07
You asked also about the kind of, like, intended audience.
08:08 – 08:09
I think it’s 2 groups of people.
08:10 – 08:13
Number 1 is people who have never used R.
08:13 – 08:21
You know, there’s a chapter early on that’s just like a a crash course in how R works, and that comes from the years of experience
08:21 – 08:24
I have teaching folks who have never used R.
08:24 – 08:32
But then number 2 is more experienced R users who have maybe gotten in their head that R is only useful for certain things.
08:32 – 08:38
And oftentimes, that’s, you know, more statistically oriented tasks. And so showing people that, hey.
08:38 – 08:44
You know, R can help you with, like, incredible data vis or improving your reporting workflow.
08:45 – 08:54
I think that is something that even many experienced users don’t appreciate the the value of R.
08:54 – 08:58
Yeah. What is your experience on the qualitative?
08:58 – 09:04
You start you said you started, like, with anthro degree, and I’m curious about, the qualitative part of working with R.
09:04 – 09:09
I can’t think of a time I’ve really, like, like, gone in there.
09:09 – 09:12
But, like, there are new NLP packages coming out.
09:12 – 09:15
I mean, I know there’s a word tree package, but, you know, okay. But, like Right.
09:16 – 09:21
You know, have you really dug into the qualitative side of things in R?
09:22 – 09:29
Not really. Yeah. I mean, people ask me that, and I’m always looking for an answer because I get asked that so much.
09:29 – 09:36
But, no, at this point, I don’t really do very much with qualitative data in R.
09:39 – 09:42
So, yeah, I don’t know that I actually have a ton to offer there. Yeah.
09:42 – 09:44
It’s more just that that’s my background.
09:44 – 09:49
Right. Right. So I mean, there are other tools, you know, NVivo and Dedoose and some others.
09:49 – 09:54
And so maybe folks are just living in those tools and say, okay. Like
09:55 – 10:04
I’ve seen people try to combine the tools, and there have also been, like, packages that people have made to attempt to do
10:04 – 10:12
the kinds of things that, like, NVivo or Dedoose or those things do, where you are doing that more kind of manual tagging
10:12 – 10:17
of qualitative data and looking for themes and that type of thing.
10:17 – 10:24
Actually, the one thing I will say I have done, I connected R to ChatGPT
10:25 – 10:25
Mhmm.
10:26 – 10:29
And said, you know, go in. Like, here’s my data. Here’s my survey.
10:29 – 10:31
I mean, I was using fake data.
10:31 – 10:33
I I know there are privacy reasons why you Right.
10:33 – 10:34
Have to be
10:34 – 10:37
careful how how you did this, but I just wanted to see, would this be possible?
10:37 – 10:46
And so, yeah, I hooked up R to ChatGPT, gave it some fake survey data and said, you know, pull out the top three themes in this data.
10:47 – 10:51
So, yeah, it could it could be used in that way. Right.
10:51 – 10:54
But I don’t generally use it for a lot of qualitative analysis.
10:54 – 11:01
Well, it’ll be interesting to see what happens in that area over the next coming months, years, I guess. Okay.
11:01 – 11:06
So there are there are 3 parts of the book that I found especially unique and innovative that I wanna talk about.
11:07 – 11:09
You have a whole section on tables.
11:09 – 11:17
You have a whole section on templates, and then you have a whole chapter on R Markdown and Quarto. So I wanna start on tables.
11:17 – 11:22
From your perspective, what are some of the advantages and disadvantages using R for creating tables?
11:22 – 11:27
Sure. Well, first of all, I will actually say I think this is in the book. Hopefully, you pick this up.
11:28 – 11:35
The tables chapter was actually inspired by you because you have an article about effective tables. Yeah.
11:35 – 11:47
And then Tom Mock, who works at Posit, took that and showed how you would implement those principles in R using a package called gt.
11:47 – 11:47
Mhmm.
11:48 – 11:56
And so then I interviewed the the way the book is structured, each chapter is framed around an interview with someone who’s
11:56 – 12:00
using R, you know, in daily life, to do something.
12:00 – 12:06
So I interviewed Tom for that chapter and talked about how you implement those principles. Yeah.
12:06 – 12:13
So as to your questions about advantages and disadvantages of using R to create tables, if you’re already doing your reporting
12:13 – 12:23
in R, if you’re using a tool like R Markdown or Quarto, which I know we’ll talk about shortly, I think doing tables in R is absolutely the best way to go.
12:24 – 12:31
The reason why I think it is that is because if you think about a more typical workflow or a typical workflow for someone
12:31 – 12:39
who’s not using R, say they’re using a tool like SPSS or SAS or Stata, and say, you know, you get your data, you do your data
12:39 – 12:46
cleaning and and your analysis in that tool, and then you kind of have to spit out, you know, that you have to export your
12:46 – 12:52
data, and then you copy it into Word, and you make your tables in Word.
12:53 – 12:56
Now if you just do that once, sure. That’s fine.
12:57 – 13:01
But the reality is for most people, you’re you’re not typically doing that once. You know?
13:01 – 13:05
Like, maybe you edit your Word document, you have to go back and, like, create your table again.
13:05 – 13:11
Or you say if you’re doing surveys, like, you know, you get 10 additional surveys, well, then you have to run your analysis
13:12 – 13:16
in SPSS or SAS or Stata or whatever, spit out your data, copy it to Word.
13:17 – 13:25
And every time you are copying from one tool to another is a chance of, you know, human error.
13:25 – 13:32
And, I mean, I think anybody who works with data knows that it’s just gonna happen. There’s no way around it.
13:32 – 13:42
And so with R, what you can do is you can work with your data, do all of your analysis within R, and then do what’s called
13:42 – 13:49
piping it directly into the gt package or any other package to make tables.
13:50 – 13:53
And so, I mean, there’s always the possibility of human error.
13:53 – 13:59
You can make an error in your code, of course, but you’re avoiding that copy paste error.
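The piping workflow described here might look roughly like the following. This is a minimal sketch, assuming the dplyr and gt packages are installed; the survey data and column names are made up for illustration and do not come from the book.

```r
library(dplyr)
library(gt)

# Made-up survey data; in practice this would be read from a file.
survey <- tibble(
  region = c("North", "North", "South", "South"),
  score  = c(4, 5, 3, 4)
)

# Summarize the data, then pipe the result straight into gt().
# There is no export or copy-paste step between analysis and table.
summary_table <- survey |>
  group_by(region) |>
  summarize(
    mean_score   = mean(score),
    median_score = median(score)
  ) |>
  gt()

summary_table
```

If ten more survey responses arrive, rerunning the same script rebuilds the table from the new data.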
14:00 – 14:03
So I think that’s, like, kind of the major advantage.
14:03 – 14:12
Plus, as I talk about in the chapter, the gt package especially is set up with some really nice defaults that make it so
14:12 – 14:16
that you’re you basically have to work to mess up your tables.
14:17 – 14:25
As an example, it will automatically right-align numeric data so that it aligns and it’s easier to read and compare.
14:26 – 14:32
And as I show in the chapter, like, you can change that, but that’s the default, which is really helpful.
14:33 – 14:38
When it comes to disadvantages, and I don’t think this is specific to tables.
14:38 – 14:42
I think this is specific to R. There’s a learning curve. Yeah.
14:42 – 14:49
And R I mean, I know you did an episode of this podcast where you worked with Aaron Williams and you talked about yourself
14:49 – 14:53
learning R, and I think that’s a good example of, you know, it’s challenging.
14:53 – 15:02
And so I think you have to see it as a long term investment to learn R because that workflow that I talked about where you’re
15:02 – 15:06
working in R and making your tables automatically will save you time in the long term.
15:06 – 15:11
But the first couple times you do it, it’s definitely not gonna save you time. So Yep.
15:11 – 15:14
I think that’s that’s the major disadvantage.
15:14 – 15:23
Yeah. It’s also especially with tables that it’s there just feels like there’s infinitely more variations in a standard table
15:23 – 15:24
than in, say, like, a bar chart. Right?
15:24 – 15:29
Like a bar chart, you’re gonna have axis, axis, rectangles. Right.
15:29 – 15:35
Do you want you know, the infrastructure is kind of all set, whereas tables, like, are you gonna have one that spans multiple columns?
15:35 – 15:38
Are you gonna break the grid lines? Like, it just Right.
15:38 – 15:44
And, yeah, like you said, like, you gotta kinda iterate through it to figure out how you wanna make those tables look good each time. Yep.
15:45 – 15:54
So that segues very nicely into the next thing I wanna ask about was was templates because you have a whole section on templates, which I’m a big fan of.
15:55 – 15:58
So so I wanna ask you about building templates.
15:58 – 16:04
And then, particularly, I guess, as we think or just go back kind of the tables, like, how do you think about building this
16:04 – 16:11
is something I’ve always struggled with because I’m always asked, like, can you build a table template for us in, you know, Excel or PowerPoint or whatever?
16:11 – 16:17
And I’ve always sort of pushed back on that because there’s so many variations on a on a table. So Right.
16:17 – 16:23
Maybe we could just start with, like, the table section in general and and why you felt that was, you know, key to someone
16:23 – 16:29
learning R, and then we could talk about the the it’s kind of what I see is, like, this table template challenge.
16:30 – 16:38
Yeah. I mean, the reason I think learning to make effective tables is useful is because it’s something that everybody is gonna do.
16:38 – 16:43
No matter what else you’re doing in R, everybody at some point’s gonna make a table.
16:43 – 16:43
Mhmm.
16:43 – 16:48
And I found that out because when I first started learning R, because I’m an anthropologist, because I’m not doing statistically
16:49 – 16:54
complicated things, I actually felt really, kind of uncertain.
16:54 – 16:55
Like, oh, am I a real R user?
16:56 – 17:00
And then I started talking to people, and everyone was like, yeah. I mean, sure.
17:00 – 17:07
You know, some people are doing more complex statistics, but the the tips and tricks that get shared online that resonate
17:07 – 17:13
with everybody are things like how to make good tables, how to make good data viz.
17:14 – 17:23
And so I think for that reason, you know, it’s really it’s really useful to to think about making table like, everybody will
17:23 – 17:25
want to to learn how to make tables.
17:25 – 17:26
Yeah.
17:27 – 17:28
I actually forgot your other questions.
17:29 – 17:33
Well, just generally on on templates of yeah.
17:33 – 17:35
I guess maybe maybe I’ll put it this way.
17:35 – 17:40
Like, walk us for folks who haven’t picked up the book yet, and obviously they should, and they will after this after this episode.
17:40 – 17:45
But, like, walk us through the template section.
17:45 – 17:49
And and I guess why I’m not really sure how to ask this question.
17:49 – 17:55
I I guess I guess it is an extension of what you mentioned earlier, like the advantages of R and then use taking all those
17:55 – 17:58
advantages and sort of, like, packaging it together in a template.
17:59 – 18:06
Yeah. So and I assume that we’re talking about, kind of R Markdown or Quarto, those types of
18:06 – 18:13
templates. You know, the I know you talked to, Aaron over at Urban and a couple others, like, even in the in the ggplot
18:13 – 18:16
themes, like, why you see that as as valuable?
18:17 – 18:25
Yeah. Okay. So I think I’ll actually start with the idea of, tables, and then I’ll I’ll move on to ggplot and R Markdown.
18:25 – 18:33
So with with with R, because it’s a coding language, you can if you’ve written code to make a table and say, you know, you
18:33 – 18:42
have a style guide for your organization where the top row is always bold, 12 point, Arial font, or, you know, what whatever
18:42 – 18:43
it is,
18:43 – 18:51
you code that once, then you can turn it into what’s called a function, which is just a a kind of reusable piece of code.
18:51 – 19:00
And you can then you know, if I have, like, a table function called, like, tabledk, anytime I wanna make a table, I don’t
19:00 – 19:04
have to remember the 20 lines of code that I use to make the table.
19:04 – 19:08
I just type tabledk, and it will format it in that way.
19:09 – 19:16
And I know you talked about, you know, with Excel or PowerPoint, like, the challenges making a template because, well, it
19:16 – 19:18
might, you know, it might vary in different ways.
19:19 – 19:26
With R, one cool thing you can do is when you make a function, you can add what are called arguments so that you can say I don’t know.
19:26 – 19:35
Say sometimes you want to, use Arial for your header font, but other times you wanna use some other font.
19:36 – 19:43
Well, you can set the argument for, you know, header font equals, and then you can you can change it.
19:43 – 19:51
And so you can make a kind of basic template, but you also give a little bit of flexibility for those who wanna change it.
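A function like the one described above could be sketched as follows. This is an illustration only, assuming the gt package is installed. The name `tabledk` follows the example used in the conversation, but the body, the `header_font` and `header_size` arguments, and their defaults are hypothetical, and the size is set in pixels rather than points.

```r
library(gt)

# Hypothetical house-style table function. The arguments give users a
# little flexibility while keeping the default look consistent.
tabledk <- function(data, header_font = "Arial", header_size = 12) {
  data |>
    gt() |>
    tab_style(
      # Bold header row in the chosen font and size (pixels).
      style = cell_text(
        weight = "bold",
        font   = header_font,
        size   = px(header_size)
      ),
      locations = cells_column_labels()
    )
}

# Default house style
tabledk(head(mtcars))

# Same template, different header font
tabledk(head(mtcars), header_font = "Georgia")
```

The formatting code is written once, and a single argument covers the case where the header font needs to change.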
19:51 – 19:59
And it’s the exact same thing with with ggplot, which is the main, way that folks make data viz in R. There, it’s called a theme.
19:59 – 20:07
So there’s a chapter in the book where I interview folks, at the BBC who made a theme for for data viz.
20:07 – 20:12
And their theme has, you know, arguments that you can use.
20:12 – 20:21
So, you know, say, sometimes you want to include, you know, your x and y grid lines. Sometimes you don’t want them.
20:21 – 20:24
You can add arguments to to give you that flexibility.
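A theme with an argument for grid lines might look like the sketch below, assuming ggplot2 is installed. The function name `theme_house` and its arguments are hypothetical; this is not the BBC's actual theme code, just the same idea in miniature.

```r
library(ggplot2)

# Hypothetical house theme: start from a built-in theme, apply house
# styling, and let the caller switch grid lines on or off.
theme_house <- function(base_size = 12, show_gridlines = TRUE) {
  house <- theme_minimal(base_size = base_size) +
    theme(plot.title = element_text(face = "bold"))

  if (!show_gridlines) {
    house <- house + theme(panel.grid = element_blank())
  }

  house
}

# Same plot code, with grid lines removed through the argument.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = "Weight vs. fuel economy") +
  theme_house(show_gridlines = FALSE)

p
```

Starting from an existing theme such as `theme_minimal()` and overriding a few elements is usually less work than defining every element from scratch.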
20:25 – 20:33
And when it comes to kind of R Markdown or Quarto, which are tools for what’s typically referred to in a kind of wonky
20:33 – 20:37
way as reproducible research or reproducible reporting.
20:38 – 20:48
What that means is just it allows you to combine your text and code all in one document, and you use that then to kind of
20:49 – 20:55
render your document to some usable format like Word or PDF or HTML.
20:55 – 21:00
And I know I talked before about, you know, the typical workflow of going from, like, SPSS to Word.
21:01 – 21:10
The way I typically talk about R Markdown or Quarto, which are really very similar tools, is with a typical workflow, say,
21:10 – 21:16
you’d work in SPSS, do your data analysis, then you spit out your kind of your clean data to Excel.
21:16 – 21:18
You use that to make your charts.
21:18 – 21:24
You copy your charts into Word, then you write your your report in Word, add your tables there as well.
21:24 – 21:29
With R Markdown or Quarto, you’re doing that entirely within R.
21:30 – 21:38
And so where you wanna make data viz, for example, you add the code that makes the the charts. Same thing with the tables.
21:38 – 21:42
But then you also have that narrative text alongside it.
21:42 – 21:49
And it’s only when you hit there’s a a button, a render button that allows you then to export to something.
21:49 – 21:57
And that facilitates, you know, if you need to make a report every month, well, no problem. You just write your code.
21:57 – 22:03
Every time you wanna render it with your new data, you just hit that render button, and it spits out a new report.
22:03 – 22:12
Or in the case of of what the folks at Urban were doing, they’re using a technique called parameterized reporting where, you
22:12 – 22:17
know, say you wanna make one report for each state in the US.
22:17 – 22:21
Well, doing that manually is a ton of work, incredibly error prone.
22:21 – 22:30
They have created R Markdown templates that they then use and write some additional code and iterate to say, okay. First, make
22:30 – 22:37
a state, make a report for Alaska, then for Alabama, then Arizona, all the way down through all the states.
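The loop described here could be sketched as follows. This is not Urban's actual code: the template name `state-report.Rmd` and the helper names are hypothetical, and the template is assumed to declare a `state` parameter in its YAML header. The sketch uses `rmarkdown::render()` with its `params` argument and only renders if the template file exists.

```r
# Hypothetical template file; it is assumed to declare `params: state:`.
template <- "state-report.Rmd"
states <- c("Alaska", "Alabama", "Arizona")

# Build one output file name per state, e.g. "report-alaska.html".
report_file <- function(state) {
  paste0("report-", tolower(gsub(" ", "-", state)), ".html")
}

# Render the same template once for a given state.
render_state_report <- function(state) {
  rmarkdown::render(
    input       = template,
    output_file = report_file(state),
    params      = list(state = state),
    quiet       = TRUE
  )
}

# One report per state, with no manual copying between tools.
if (file.exists(template)) {
  for (state in states) {
    render_state_report(state)
  }
}
```

Extending `states` to all fifty states changes one line; the template and the loop stay the same.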
22:37 – 22:43
Yeah. So I wanna come back to the Quarto, R Markdown, language in a second.
22:43 – 22:51
But for folks who are listening to this who may be, earlier on in their R learning Yeah.
22:51 – 22:59
Experience visualization, I would suspect that many of them sort of when they hear the words functions and building themes, that that’s a little intimidating.
22:59 – 23:05
Like, should they feel intimidated about writing functions and writing themes, or is it just a simple extension of the existing
23:05 – 23:10
R language that that is something that really anybody can do in their learning of the tool?
23:10 – 23:17
Yeah. I mean, it I would not recommend starting out when you’re learning R by trying to create your own ggplot theme. Definitely not.
23:17 – 23:18
Yeah. Yeah. Right. Yeah.
23:19 – 23:25
But the cool thing is because R is open source, so many other people have created themes. Mhmm.
23:25 – 23:31
And so you can rely on you can, you know, use other people’s themes.
23:31 – 23:39
Like, I don’t know why I would wanna do this, but if I wanted to make plots in the Urban Institute style, I could use the
23:39 – 23:46
theme package that you all have made to, you know, make plots in that style. I can make BBC style themes.
23:46 – 23:57
So, I think for most people starting out for a long time when you’re learning R, you’re relying on other people’s, you know, code that they have written.
23:57 – 23:59
And that’s great because it makes it a lot easier.
24:00 – 24:05
But, eventually, you get to a point where you’re like, oh, I wanna tweak this a little bit. And then you realize, wait.
24:05 – 24:14
If I can, you know, write my own themes or my own functions, I can do this in a way that I, you know, I can customize it to my exact needs.
24:14 – 24:19
Yeah. But it’s also advantageous, like, take the urban theme or the BBC theme. Right?
24:19 – 24:26
Like, you could get that, and then you could go under the hood and say you to your point, you don’t wanna use the Lato font. You like the Urban color.
24:26 – 24:29
You like everything about urban except for the font. Right?
24:29 – 24:33
You can just change that one argument, and then you have a whole new thing.
24:33 – 24:35
Yeah. And that’s how a lot of people do it.
24:35 – 24:41
They’ll they’ll when when they’re, for example, making their own theme, they don’t necessarily start from scratch.
24:41 – 24:46
They’ll look at what other people have done and then adapt it based on that.
24:46 – 24:47
So, yeah, that’s a great way to go.
24:47 – 24:50
So I wanna get back to the the Quarto and R Markdown.
24:50 – 24:56
So, it is interesting because I know, like, our folks at Urban are very much into Quarto.
24:56 – 25:00
But when I talk to other people in other visualization, I don’t know why.
25:00 – 25:02
Maybe it’s just a take up situation.
25:02 – 25:06
Like, I still know a lot of people who haven’t even heard of Quarto, and they’re still using R Markdown.
25:06 – 25:13
But I was wondering if you could talk about kind of the difference between the 2, and then what are the advantages of I mean,
25:13 – 25:16
I guess kind of either because they’re you know, they have their own advantage.
25:16 – 25:21
But, like, why folks should start to get into the knitting and the rendering of a markdown language?
25:22 – 25:30
Yeah. So well, I’ll give a little bit of the backstory, which is I started writing this book before Quarto had come out.
25:30 – 25:40
So most of the book covers R Markdown with a chapter that I added, that kind of compares the 2 and talks about the major differences.
25:41 – 25:48
I will say, overall, the differences between R Markdown and Quarto are very minor.
25:48 – 25:55
It’s about kind of, like, where you place arguments, in code chunks.
25:55 – 25:59
I’m trying not to get too overly technical for folks who might not be R users.
25:59 – 26:08
But I I’ve worked with many people who have been longtime R Markdown users, who have, you know, looked into Quarto and said, oh, I wanna do it.
26:08 – 26:10
I I don’t know if it’s gonna be too much work.
26:10 – 26:11
And I tell them, just just try it.
26:11 – 26:14
And they do, and it’s it’s really not that different.
26:14 – 26:22
So what I’d say in the book is the difference between the workflow that I described before, the kind of SPSS to Excel to Word
26:22 – 26:30
workflow, the difference between that and, like, an R Markdown workflow is huge. Like, that’s incredibly Yeah. Different.
26:31 – 26:35
Compared to that, switching between R Markdown and Quarto is is negligible.
26:38 – 26:45
So in terms of differences, I don’t know that there’s a huge amount.
26:45 – 26:55
I mean, I will say, Posit, the company who is behind both R Markdown and Quarto, has said, R Markdown’s not going away.
26:56 – 27:03
But when it comes to kind of new features Mhmm. They’ll be focusing those on Quarto.
27:03 – 27:09
And so as an example, this has actually just come out, and I haven’t actually fully explored it, but they’re releasing this,
27:11 – 27:11
I don’t know
27:11 – 27:22
if it’s a package or kind of approach using a file, a kind of brand.yml file, which allows you to this is just working
27:22 – 27:27
within Quarto, not within R Markdown, allows you to define kind of brand specifications.
27:28 – 27:33
So you define colors in you know, brand colors in one place.
27:33 – 27:37
You define you know, you you’ve set your brand logos.
27:37 – 27:40
Say, if you have, like, multiple versions of your logo, you can define those.
27:41 – 27:51
And then you can use that throughout Quarto documents as well as, I think, within, the idea is, it’ll eventually be able
27:51 – 27:56
to apply you’ll be able to apply that pretty easily to things like tables and ggplot themes.
27:56 – 28:00
So that that’s not covered in the book because that literally has just come out.
28:00 – 28:08
But that’s the kind of thing that I think you’re likely to see in Quarto that is not necessarily going to exist in R Markdown.
28:09 – 28:20
But, overall, I’d say if you’re already using R Markdown and it’s working for you, I don’t necessarily think you need to, you know, go running to switch.
28:20 – 28:24
But if you are starting out fresh, I’d probably go with Quarto at this point.
28:25 – 28:32
And for folks who’ve never worked in a markdown language, aside from having to learn the syntax and all that, my experience
28:32 – 28:37
has been, it’s more of a structure sort of build it.
28:37 – 28:38
Like, it’s still the same R code.
28:38 – 28:40
It’s just built kind of in a different structure.
28:40 – 28:41
Yep.
28:41 – 28:46
But from your perspective, like, what are the advantages, the big advantages of going to either, you know, either
28:46 – 28:52
R Markdown or Quarto rather than just working in, sort of, the base
28:52 – 28:53
Like in an R script file.
28:53 – 28:57
Yeah. Like in an R script or in, you know, I guess, yeah. Like in an R script. Yeah.
28:57 – 29:10
Yeah. I mean, the big advantage is, with the whole kind of reproducible reporting workflow, that it allows you to do your entire report.
29:10 – 29:17
You know, you can set up all the text and all the code, and then you can automatically generate reports.
29:17 – 29:25
So, like, my party trick, I occasionally do, like, webinars where I’ll show people, you know, here’s the value of
29:25 – 29:28
learning R, is I will set up a survey.
29:28 – 29:32
I’ll do it usually on Google Sheets, or sorry, on Google Forms.
29:33 – 29:37
And then at the beginning of the webinar, I’ll tell people, okay, go fill out the survey.
29:37 – 29:41
I mean, it’s like a 4 question survey just to give me a little bit of data.
29:41 – 29:51
They’ll fill it out, and then I have set up my document either in R Markdown or Quarto such that it will go out, grab that
29:51 – 30:01
data from Google Sheets because I pipe the data from the Google Form into a Google Sheet. It’ll grab it, automatically summarize it.
30:02 – 30:10
I have text, for example, at the top of my document that says, you know, x number of people filled out the survey data, and
30:10 – 30:15
that x can be automatically updated, you know, based on the data.
30:15 – 30:21
And so that type of workflow is possible using R Markdown or Quarto.
30:21 – 30:25
Whereas if you’re using just an R script file,
30:25 – 30:31
you still have to do that manual, you know, say you make a plot, for example, in an R script file.
30:31 – 30:36
Well, then you’d still have to copy that into Word, for example.
30:36 – 30:36
Right.
30:36 – 30:41
And that’s both less efficient and, you know, potentially error-prone as well.
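The survey demo described above can be sketched in a few lines of R. This is a minimal illustration, not the actual webinar document: the `responses` data frame and its column names are made up, and in the real workflow the data would be read from the Google Sheet that collects the form responses rather than typed in.

```r
# Hypothetical responses standing in for the survey data. In the workflow
# described above, this data frame would instead be read from the Google
# Sheet that collects the form responses.
responses <- data.frame(
  respondent = c("A", "B", "C", "D", "E"),
  uses_r = c("Yes", "No", "Yes", "Yes", "No")
)

# Count the responses and summarize one question.
n_responses <- nrow(responses)
share_using_r <- mean(responses$uses_r == "Yes")

# Build the sentence from the data rather than typing the number by hand.
summary_sentence <- sprintf(
  "%d people filled out the survey, and %.0f%% of them already use R.",
  n_responses,
  100 * share_using_r
)

print(summary_sentence)
```

In an R Markdown or Quarto document, the same value would usually sit directly in the text as inline code, for example `r n_responses`, so that re-rendering the document after new responses arrive updates the sentence without any copying and pasting.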
30:41 – 30:51
Yeah. Yeah. It is interesting to think about how these workflows can and maybe should change.
30:52 – 31:00
And also, I find the Quarto Markdown pieces easier for sharing a lot of things because you just send an HTML file and, you know, it’s one thing.
31:00 – 31:06
You have everything sort of knitted together, as opposed to, like, as you said, like, copying and pasting all these different places.
31:06 – 31:15
Well, and I’ll even do things where I’ll create an HTML file, and there are ways you can kind of set it up online, and you just give people the URL.
31:15 – 31:25
And then anytime you have new data, you know, you just render it again, republish it, and people just go to that same URL to get the automatically up-to-date data.
31:25 – 31:28
No. So on that point so I’m sure people are thinking, okay.
31:28 – 31:30
So where do you send it?
31:30 – 31:37
Or are you putting it on your website as, like, an unpublished page or, you know, to your site and you’re
31:37 – 31:39
just sharing that URL with folks?
31:40 – 31:42
Yeah. So I use a tool called Netlify, typically.
31:44 – 31:52
There are a number of ways to use it, but I connect to GitHub.
31:53 – 32:00
So I connect Netlify to a GitHub repository and set it up, and I tell Netlify, hey, anytime there are changes on this GitHub
32:00 – 32:04
repository, go and grab the latest version of the report and update it.
32:04 – 32:10
So then it’s just Netlify creates a URL that I share with people. Love it.
32:10 – 32:13
So it’s, like, kind of like one-off URLs that I make.
32:13 – 32:14
Right. And
32:14 – 32:20
I can set passwords and that kind of thing as well, you know, obviously, if I don’t want the data available for anybody.
32:21 – 32:27
Yeah. That’s great. You’ve talked about workflow a bunch in our chat, and I’ve heard you say or maybe I don’t know
32:27 – 32:29
if you wrote it or I heard in one of your talks.
32:30 – 32:35
But I’ve heard you say that R is a workflow tool that also happens to do some stats.
32:35 – 32:42
And we’ve talked a lot about that today, but I was hoping maybe you could, I don’t know, crystallize it for folks.
32:42 – 32:50
Yeah. Well, that’s my perspective on R given that I’m, you know, an anthropologist who came to R who does not use
32:50 – 32:53
R for much in the way of statistics.
32:55 – 33:01
And it was really you know, like I mentioned before, I was really kind of insecure for a while about my R use, feeling like,
33:01 – 33:03
oh, I’m not a real R user.
33:05 – 33:13
And it was only when I realized, like, how much it was improving my workflow that I was like, no. You know what? This is actually really valuable.
33:14 – 33:26
And it comes back to things like the reproducible reporting with R Markdown and Quarto, parameterized reporting, especially, making multiple reports at one time.
33:26 – 33:37
I mean, we have one client who we work with who, he works for, it’s called the San Diego County Office of Education.
33:37 – 33:47
They work with school districts throughout San Diego County in California, and he has to make reports for every single student. That’s, like, 10,000 students.
33:48 – 33:52
I mean, there’s just no way you’re gonna do that, you know, in Excel
33:52 – 33:53
or Word.
33:53 – 34:04
Yeah. So R facilitates that, and I think it’s a great tool to improve your workflow in so many ways I hadn’t even really considered when I first started learning it.
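The parameterized reporting mentioned above, one template rendered once per student, might look roughly like the following. This is a sketch under stated assumptions rather than the client’s actual setup: the template name `student-report.Rmd`, the `student_id` parameter, and the student IDs are all hypothetical, and the template would need to declare `student_id` under `params` in its YAML header. The only package call used is `rmarkdown::render()` with its `params` argument.

```r
# Hypothetical student IDs; the real list would come from the data.
student_ids <- c("S001", "S002", "S003")

# Build one output file name per student.
output_files <- paste0("report-", student_ids, ".html")

# Render one report for one student from a shared template.
# "student-report.Rmd" is an assumed file name, not one from the episode.
render_student_report <- function(student_id, output_file,
                                  template = "student-report.Rmd") {
  rmarkdown::render(
    input = template,
    output_file = output_file,
    params = list(student_id = student_id),
    envir = new.env() # keep each render isolated from the others
  )
}

# Render every report. This step needs the rmarkdown package and the
# template file, so it is skipped when the template is not present.
if (file.exists("student-report.Rmd")) {
  invisible(Map(render_student_report, student_ids, output_files))
}
```

The same loop scales from three students to thousands, which is the point being made: the template is written once and the data drives how many reports come out.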
34:04 – 34:07
Right. Okay. So the book is R for the Rest of Us.
34:07 – 34:11
Folks should obviously get it wherever you get your books. You should check it out.
34:12 – 34:17
If folks have questions for you, David, follow-up, they wanna learn more, where can they find you?
34:17 – 34:21
Sure. So the website is just rfortherestofus.com.
34:22 – 34:26
If you wanna learn more about the book, you can go to rfortherestofus.com/book.
34:28 – 34:32
And people should feel free to email me. I’m old school.
34:32 – 34:35
Email’s a good way to reach out to me. It’s just david@rfortherestofus.com.
34:37 – 34:40
David, thanks so much for coming on the show. This was fun.
34:40 – 34:43
I’m excited for R for the Rest of Us, second edition.
34:43 – 34:48
That’ll go into all the Quarto stuff and all the other changes. So, this is fun. Yeah.
34:48 – 34:50
Thanks so much for coming on the show.
34:50 – 34:52
Thanks, Jon. I really appreciate you helping me out.
34:54 – 34:56
Thanks, everyone, for tuning in to this week’s episode of the show.
34:56 – 34:58
Make sure you check out David’s site.
34:58 – 35:00
Make sure you check out his book.
35:00 – 35:08
And while you’re at it, taking a little break this holiday season, see if you could take a moment to rate or review the show on your favorite podcast provider.
35:08 – 35:10
You can do the same thing on YouTube.
35:10 – 35:12
You can subscribe to the show wherever you get it.
35:12 – 35:17
And if you have a moment, rate or review my books on Amazon or Goodreads where you can get them.
35:17 – 35:24
Again, I’m just trying to share this information, sharing the learnings so that people can do a better job of using and communicating with their data.
35:24 – 35:26
So this wraps up the show for 2024.
35:26 – 35:28
I hope you’ll have a happy holiday season.
35:29 – 35:35
And so until next time, until 2025, this is The PolicyViz Podcast. Thanks so much for listening.