Getting Curious with Jonathan Van Ness

How Smart Is AI? (Part One) with Meredith Broussard

Episode Summary

Every time you interact with a targeted advertisement, use a drop-down menu, or scroll through TikTok, you’re engaging with an algorithm. And these algorithms are anything but objective. This week, Meredith Broussard returns to the show for a special two-part episode about algorithms and artificial intelligence, and how they shape our daily lives.

Follow us on Instagram and Twitter @CuriousWithJVN to join the conversation.

Episode Notes

In part one, we’re learning all about chatbots like ChatGPT, the origins of algorithmic bias, and WTF Francis Galton has to do with all of this. Join us tomorrow for part two, featuring an in-depth discussion about gender and racial bias in algorithms, the current state of the tech industry, and the future of AI.

Meredith Broussard is Associate Professor at the Arthur L. Carter Journalism Institute of New York University and Research Director at the NYU Alliance for Public Interest Technology. She is the author of Artificial Unintelligence: How Computers Misunderstand the World (MIT Press). Her work has been featured in the New Yorker, the New York Times, the Atlantic, BBC, Wired, the Economist, and more. She appears in the 2020 documentary Coded Bias and serves on the advisory board for the Center for Critical Race & Digital Studies.

Make sure to check out Meredith’s new book More Than A Glitch, out now from MIT Press.

You can follow Meredith on Instagram and Twitter @merbroussard, and at meredithbroussard.com. MIT Press is on Instagram and Twitter @mitpress.

Curious for more? Here’s a list of people, projects, and other resources mentioned in this episode:

Algorithmic Justice League (Joy Buolamwini)

Algorithms of Oppression (Safiya Noble)

Blueprint for an AI Bill of Rights

DAIR Institute (Timnit Gebru)

Deb Raji’s work

Design Justice (Sasha Costanza-Chock)

Mimi Onuoha’s work

Race After Technology + Viral Justice (Ruha Benjamin)

Rumman Chowdhury’s work

Take My Hand (Dolen Perkins-Valdez)

The Gender Shades Project

The Markup (Julia Angwin)

Under the Skin (Linda Villarosa)

Weapons of Math Destruction + ORCAA (Cathy O’Neil)

Jonathan is on Instagram and Twitter @JVN and @Jonathan.Vanness on Facebook.

Transcripts for each episode are available at JonathanVanNess.com.

Our executive producer is Erica Getto. Our editor is Andrew Carson.

Our theme music is “Freak” by QUIÑ; for more, head to TheQuinCat.com.

Episode Transcription

Getting Curious with Jonathan Van Ness & Meredith Broussard

PART I

JVN [00:00:00] Welcome to Getting Curious. I’m Jonathan Van Ness and every week I sit down for a gorgeous conversation with a brilliant expert to learn all about something that makes me curious. On today’s episode, we are joined by not once, twice, but three-time returning guest Meredith Broussard! Who we love so freakin’ much. And we’re going to be trying something new: a two-part release. In part one, we’ll be breaking down the basics—that was just, I don’t know what that was, y’all, but I don’t know if you heard that bang and the boom, but I’m recording in my apartment in New York City, honey, and it is bangy and boomy sometimes. So in part one, we will be breaking down the basics on AI. And tomorrow—yes, you heard that right, tomorrow—we will be back with a more in-depth discussion of algorithmic bias and what the future holds for artificial intelligence. I had the best time speaking with Meredith, and I’m excited for you to learn as much from her as I did. Without further ado, let’s get into it! Here’s part one of our conversation, where we’re asking: How smart is artificial intelligence?

Hello, Hello. Hello, my curious cuties. And welcome back to Getting Curious, the podcast where we come to learn, grow, and get inspired. I'm your host, Jonathan Van Ness, and today's episode is going to be all about tech and algorithms. Oh my. But before we dive into that, let's take a moment to bask in the glory that is today. We made it. We're here, we're breathing, and we're ready to learn something new. So let's get curious together, shall we? Now, I don't know about you, but technology and algorithms have always been a little bit intimidating to me. They seem like these mysterious things that only the tech wizards fully understand. But that's exactly why I'm so excited to have our guest today, Meredith Broussard, join us. Meredith is a computer scientist and journalist who's here to help us demystify the world of tech and algorithms. She's written a new book called More Than a Glitch. And it's all about how these algorithms can sometimes perpetuate bias and inequality. We're going to be talking about everything from facial recognition technology to self-driving cars and learning how we can make sure that these technologies are serving all of us, not just a select few. So get ready to have your mind blown because we're about to get curious with the brilliant Meredith Broussard. Full disclosure. First of all, welcome. And also everyone, if you are scratching your fucking head right now wondering why I am talking like that, you should know the following: that introduction was written by ChatGPT. Our prompt was, “Script Jonathan's opening monologue for an episode of Getting Curious with Jonathan Van Ness. Episode will feature an interview with Meredith Broussard about her new book, More Than a Glitch.” Honey, wasn't that fucking interesting that it thinks that I talk like that?

MEREDITH BROUSSARD [00:02:53] I had my mind blown just as the monologue suggested.

JVN [00:02:59] So just so you all know. So obviously, like, you must remember, Meredith, unless you've been under a Getting Curious. She's one of our favorite guests. She also is joining a very elite club of, like, three time returning people. If you all have not listened to our first two episodes, maybe listen to the rest of this one, and then you go back and listen, cause I don't want us to get off track. I don't know if we've ever had anyone three times. It might be a first. We would have had you anyway, but I will say, sidebar, that your yearly homemade chocolates, not to put pressure on anything. I would pay you a gazillion dollars to come on the podcast every day just if I knew I was going to get those. You all don't understand the elite craft quality of Meredith Broussard's chocolates. And before we get into ChatGPT and everything, how the fuck do you make those chocolates? Like, how do you get inspired to make those? Like those? Those chocolates are next level.

MEREDITH BROUSSARD [00:03:47] Well, Jonathan, I'm just imagining, like, me being Ted Lasso. Like, with the little box of shortbread cookies, like, showing up at the podcast every week. But so, yeah, in addition to writing books and being an NYU professor, I also have this thing where I make chocolate truffles with my girlfriends. And we do it, like, every couple of months in truffle season, which is the wintertime because it's cold. I think it's been going on for about 40 years. I've been doing it for almost 30 years. And we just get together and we make chocolates. And we send them to our favorite people.

JVN [00:04:24] I've never been more grateful to be on a list in my fucking life. They're so incredible. I'm sorry that we're starting off this, like, algorithmic bias episode with your literal scholarship NYU ass work scholar self. Okay, I can focus now.

MEREDITH BROUSSARD [00:04:38] Well, we bring our whole self to every interaction, right. Like I am an algorithmic bias scholar. I am a writer, and I am also somebody who cares a lot about baking.

JVN [00:04:51] But what did you think about how, like, that ChatGPT did, in that opening? I thought it was kind of wow.

MEREDITH BROUSSARD [00:04:59] You know, ChatGPT is really impressive, and everybody listening, I hope you have tried ChatGPT because it is really fun. And my favorite part of ChatGPT being in the world is watching the really creative prompts that people come up with. The things that ChatGPT produces, like, are really good for kind of mundane uses, right. They're really good for, like, “Summarize this thing.” I know a kid who goes to Catholic school and uses it for his religion homework. So, like, there will be a passage in the Bible that he has to write about and he doesn't understand it. So he just, like, puts it into ChatGPT and says, “Summarize this,” and then it helps him with his homework.

JVN [00:05:44] Oh, that's a cool use.

MEREDITH BROUSSARD [00:05:46] Cool use! Or, like, I know some other kids who use it to make up characters and then they say, “Write a battle to the death between these two characters.” That also seems really fun.

JVN [00:06:00] What do you think it missed? It reminded me of, like, someone doing an impression of me like, versus like how I actually talk.

MEREDITH BROUSSARD [00:06:08] So the, the kind of core voice of ChatGPT. A professor was recently quoted as saying, “It sounds like a 35-year-old compliance lawyer.” So if a 35-year-old compliance lawyer tried to imitate a JVN style, I think that's, that's what we would get. I mean, we had some cliches in there, right? We had “mind blown.”

JVN [00:06:34] I LOLed at “the podcast where we come to learn, grow and get inspired,” that made me—even though I mean, I guess that is what we're doing. But I was, like, I hope that people didn't, like, click the podcast off, like, at the beginning, being, like, “Did Jonathan, like, what happened, like, when did they start talking like this.” But I thought it was, like, genius. I also thought this was really interesting that we edited out, but our ChatGPT messed up the subtitle of your book, rather than saying “Confronting Race, Gender, and Ability Bias in Tech,” ChatGPT gave your book the title, “How to Fix the Hidden Failures of AI.”

MEREDITH BROUSSARD [00:07:10] So that's not bad in terms of a guess about what the book is about. So I think to, to look at what went wrong there, we have to look at how ChatGPT works. Right. So the way that it works is they took something like 176 billion parameters.

JVN [00:07:28] I'm sorry, I'm freaking out. That's like the biggest number I’ve ever heard. So they took 176 billion parameters. Like, does that mean that someone was, like, putting in like a binary code to, like, program ChatGPT and there was, like, 176 billion things they had to, like, like, what does that mean?

MEREDITH BROUSSARD [00:07:46] Alright, so what it means is that there is a training data set, and this training data set is 570 gigabytes of stuff. It's books, it's articles, it's chat logs, it's crap sourced from the web, broadly. It's the entire corpus of Reddit. Whatever you can scrape from the open web. And so they take this, and put it into the computer and say, “Computer. Show me what the patterns are in this gigantic text corpus.” And the computer says, “Oh, yeah, sure. Okay.” Because the computer's really good at recognizing patterns, mathematical patterns. And so then, once the data is fed in, once the computer recognizes the patterns, it has what's called a model. And the model can output text. And the way it works is you prompt it and then it makes predictions about what the next word should be in the sentence—and you do that enough times, you have coherent sentences.
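The prompt-then-predict loop Meredith describes can be illustrated at toy scale. The sketch below is an assumption-laden stand-in, not how ChatGPT is built: it counts which word follows which in a tiny made-up corpus (a "bigram" model) and repeatedly picks the most common next word. The function names and the corpus are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Find the 'patterns': count which word follows which in the text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(model, word):
    """Return the most frequently seen next word, or None if the word is unseen."""
    candidates = model.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


def generate(model, prompt, max_words=5):
    """Extend the prompt one predicted word at a time."""
    words = prompt.lower().split()
    for _ in range(max_words):
        nxt = predict_next(model, words[-1])
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)


# Made-up training text; a real model is trained on vastly more data.
corpus = "the cat sat on the mat . the cat sat on the sofa ."
model = train_bigram_model(corpus)
print(generate(model, "the", max_words=3))
```

A large language model replaces the simple word counts with a neural network and billions of learned parameters, but the outer loop is the same idea: predict a next word, append it, repeat.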

JVN [00:08:58] So when they put all that stuff into like what would become ChatGPT or whatever, like, when they put that into the computer, is it like pressing equal on a graphing calculator and it immediately tells you, or did that take like six months of them inputting it? Like how long did that take? Or do we even know?

MEREDITH BROUSSARD [00:09:14] Oh, that's a really good question. One of the big differences about ChatGPT and its underlying technology, GPT-3, is that it's a much bigger training dataset than anybody has ever made before. So GPT-2, which was the technology that came earlier, you know, as the name suggests, had 1.5 billion parameters and this one has 176 billion parameters, right? So the capacity for analyzing data is growing exponentially. And so we can do more. And there's this thing called the “unreasonable effectiveness of data,” which means that the more data you put in, the more precise the predictions are that you get out. So one of the reasons that ChatGPT works so well is because it has this unbelievable amount of data. Now, we could talk about the environmental cost of training these AI models because it's not like you and I can collect 500 gigabytes of information and feed it into the computer and just, like, get a model out in 10 seconds. Like, it's not, it's not like that. It's not like putting something into a graphing calculator. It's really expensive. It's time consuming. You have to do it in the cloud. You get charged for computing cycles. That's really expensive.

JVN [00:10:42] From who? Who charges you?

MEREDITH BROUSSARD [00:10:44] I mean, we have this perception, right, that the cloud is kind of ubiquitous, that there are all these tech startups out there. Not true at all. The cloud, first of all, is somebody else's computer and that other person's computer is owned by Microsoft, Google, or Amazon. Like, those are your options for cloud computing. And so OpenAI, which is the company that has developed ChatGPT, has a partnership with Microsoft. So I'm guessing that they are doing their model training in the Microsoft cloud. And it's really, really expensive. It's pretty cheap to use the model once it's trained, but it takes an awful lot of computing power in order to train a model on really big data in the first place.

JVN [00:11:34] You know how sometimes, like, you'll hear, like, Elizabeth Warren or Bernie Sanders, like, getting really pissed with, like, the Modernas of the world because it was, like, taxpayer funded to those companies, like, through tax breaks and also literally taxpayer funded to develop these vaccines. And then for them to turn around, and, like, raise the price by, like, a gazillion percent or whatever, it's, like, those were taxpayer funded. So, like, wouldn't, like, Microsoft and Google and Amazon get, like, massive corporate tax breaks and it's, like, the people that can't access those technologies and can't access those services are the very people that those sort of AI predictive things, like, may leave out or not include, like, they're giving some sort of equity to it, and then not getting to use the benefits.

MEREDITH BROUSSARD [00:12:16] Mm hmm. No, you're right. And I feel like there are two questions there. It's the question of who is left out in developing these kinds of new AI technologies and also who gets to access these things? Actually, no, there's three questions there, because it's about who gets the access. And then it's also about who paid for this?

JVN [00:12:38] Yes. Who paid for it and who's profiting?

MEREDITH BROUSSARD [00:12:41] Oh, who paid and who profits! These are the big questions, always. So interestingly, mostly it's paid for by the government. Like, there's a huge amount of NSF funding in tech development. Who profits? There's some interesting stuff going on there. I just had a long conversation with a journalist the other day about, like, “How do people make money with AI?” And I can talk about this for hours. Right now, OpenAI doesn't seem to have a clear business model for ChatGPT. I think they're trying to monetize it through the partnership with Microsoft. They're embedding it in the Bing search engine. So you've probably read those stories about Sydney, the Bing chatbot that freaked out reporter Kevin Roose in the New York Times. So they're trying to, trying to figure out how to make money out of it. There's also the option of selling the company. The company has been valued in the billions of dollars, so they could just sell it and do an exit that way. So the question of who paid and who profits is usually, like, it was paid for with government money and venture capital money. Who profits? It's the people that own the company. Who is left out? It's the people who are always left out. So one of the things that I write about in the book, I write about people with disabilities and how tech is really, really wonderful at making the world more accessible for people with disabilities until it isn't. Right. So we have things like text-to-speech and we have things like Google Maps that can help you navigate. But Google Maps navigation doesn't work, for example, if you are underground in the New York City subway. Right. So if you're blind and you're relying on Google Maps to give you voice directions, then it's going to work above ground. It's not going to work so well underground. Right. So, like, we can't expect tech to be the magic bullet. And we should also embrace the fact that tech allows somebody who's blind to navigate the city better and more independently.

JVN [00:15:05] It also makes me think, like, wouldn't there be a fair amount of AI-generated text already on the Internet? So of those 176 billion parameters, like, how do we know that they're not using, like, AI or, like, non-human generated speech any way to base those parameters, like, off of.

MEREDITH BROUSSARD [00:15:25] Oh, my God. So you're saying there's AI written stuff out there and then the AI is getting trained on the AI written stuff?

JVN [00:15:32] Yes. Like, wouldn’t that be supplementing it?

MEREDITH BROUSSARD [00:15:35] There is a lot of AI written stuff out there, so the previous generation of AI written text was generally created using something called natural language generation. And the most popular natural language generation programs were basically like Mad Libs, right? So news organizations use these a lot. The AP uses it for automatically writing earnings reports based on stock earnings. So, like, when McCormick Spices announces their quarterly results, like, you could have a human being write the same story every single time. Or you can have the computer write the story automatically and the human just checks it and says, “Oh yeah, fine.” Because you know, all the stories are basically the same. Bloomberg automated writing a lot for, you know, doing short paragraphs summarizing movements in the financial markets. And so it's like Mad Libs because you write the skeleton of the story, like, a human writes the skeleton of the story. And then you just plug in the numbers as they come. And if the earnings for McCormick Spices are up, then you use the phrase, “McCormick's earnings were up.” And then if McCormick's earnings were down, like, you use “if then” logic to write the sentence, “McCormick's earnings were down this month.” It's cool, but it's not. It's not like Skynet. It's not, like, the, the kind of dramatic A.I. that you might imagine from watching Star Trek or Star Wars or Terminator or whatever.
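The "Mad Libs" style of natural language generation described here can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the sentence skeleton, the company name "Acme Spices," and the per-share figures are all made up, and it is not the actual system used by the AP or Bloomberg.

```python
def earnings_sentence(company, previous, current):
    """Fill a human-written sentence skeleton using simple if/then logic."""
    if current > previous:
        direction = "up"
    elif current < previous:
        direction = "down"
    else:
        direction = "flat"
    # The skeleton is fixed; only the company name, direction, and numbers change.
    return (
        f"{company}'s earnings were {direction} this quarter, "
        f"at ${current:.2f} per share versus ${previous:.2f}."
    )


# Hypothetical company and numbers, for illustration only.
print(earnings_sentence("Acme Spices", 1.00, 1.25))
print(earnings_sentence("Acme Spices", 1.25, 1.10))
```

Because every branch of the output was written by a person in advance, a human checker only has to confirm that the plugged-in numbers are right.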

JVN [00:17:12] In our previous episode we learned about technochauvinism. It's, like, well, I did, you are the teacher of it. But, like, I was thinking about that kind of even with this, like, where if you were going towards the easier thing or, like, doing something because it was convenient, you know, making AI, like, text-to-speech because it was easier or whatever. But then that ends up being fed into, like, a larger, like, AI text-to-speech thing, like, does that ever end up getting that voice off or can, like, that previous thing that was meant to be easier, like, end up making something harder, which was like an example of, like, technochauvinism within technochauvinism where, like, you're trying to do a wider, better thing, but because of previous work it, like, can actually get in the way or like make it further off of what your intended result was.

MEREDITH BROUSSARD [00:18:54] Mm hmm. Mm hmm. That's a really good question. And so technochauvinism I always define as, like, a kind of bias that says that technological solutions are superior. And the thing that I think about when I think about technochauvinism and the training data, just, like, you know, getting random crap about the Internet and using it to train something like ChatGPT is, I think about all of the toxicity that's on the open web. Like, I think about all the toxicity, all the hate. It's going to reproduce, you know, our, our boring language but it's also going to reproduce our hateful language, our exploitative language. Now OpenAI has done a better job than some other tech companies because they do have a censor. Like, they do have a model that looks at the output of ChatGPT and, like, it won't say certain words. It uses a dataset called “RealToxicityPrompts.” Right. So there are people out there who research toxicity online and they have compiled a dataset of, like, “These are the things that are considered toxic.” And so ChatGPT will look at the output, will measure it against the RealToxicityPrompts dataset. And if the output is considered too toxic, then it will censor itself, right. It will not output the thing or it will delete it.
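The general shape of an output filter like the one described (score the text, compare against a threshold, withhold if too high) might look like the sketch below. Everything in it is an assumption for illustration: the word-list scorer, the placeholder blocklist, the threshold, and the function names are hypothetical, and this is not OpenAI's actual moderation system, which the transcript describes only in outline.

```python
# Placeholder tokens standing in for terms a real toxicity dataset would flag.
BLOCKLIST = {"hateful", "slur"}


def toxicity_score(text):
    """Crude stand-in scorer: the fraction of words that appear on the blocklist."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    if not words:
        return 0.0
    flagged = sum(1 for w in words if w in BLOCKLIST)
    return flagged / len(words)


def moderate(text, threshold=0.2):
    """Return the text unchanged, or withhold it if it scores at or above the threshold."""
    if toxicity_score(text) >= threshold:
        return "[output withheld]"
    return text


print(moderate("have a nice day"))
print(moderate("what a hateful slur"))
```

A production filter would more plausibly use a trained classifier rather than a word list, since word lists miss context and misfire on harmless uses of a flagged word.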

JVN [00:19:22] Did you see that thing, this AI, like, cartoon that was, like, mimicking Seinfeld or something, and it was on, like, 24/7. But then all of a sudden it started making, like, anti-Semitic jokes. And isn't that an example of an AI thing, taking parameters and, like, it would have embedded the toxicity of, like, whatever it was reading or whatever they were feeding into it. And then it started making, like, anti-Semitic jokes—or was it transphobic jokes? It was some sort of offensive bad joke.

MEREDITH BROUSSARD [00:19:49] Yeah. Yeah, some sort of horrific thing. I think that the idea of the Endless Seinfeld generated by AI is kind of delightful. Right. And I think it's enormously creative to have created that and then to have thought, “Oh, yeah, let's animate it.” Because, you know, obviously Jerry Seinfeld is not going to, like, you know, act in a 24/7 Seinfeld production. However, all of the bad things about humanity are reflected in the training data that was used to feed the AI that generates the thing. And so the longer the AI runs, the more likely it is that it's going to break down and start spewing hate, right. So we see this in things like Microsoft a few years ago had a bot called Tay that they released on Twitter, and the idea was that people would interact with Tay and it would learn from the interactions and it would interact with, you know, people on Twitter in a human-like way. It was, I don't know, 10 minutes or so before it started spouting, like, pro-Hitler rhetoric.

JVN [00:20:55] Oh fuck! Oh God!

MEREDITH BROUSSARD [00:20:56] People, like, you know, fed in all of this hateful stuff, and then the bot started spitting it back. All right. So Microsoft deactivated it. What they found with Sydney, which is the kind of internal name given to the chatbot embedded in Microsoft Bing, is that they found that the longer Sydney runs, the weirder the responses get. So the story I mentioned before, Kevin Roose, New York Times reporter, interacted with Sydney and it ended up asking him to leave his wife and, like, have a passionate affair with it, Sydney.

JVN [00:21:36] Which is flattering, but I'm, like, “Thank you. But, like, who told you to talk to me like this, you weird computer!”

MEREDITH BROUSSARD [00:21:45] Yeah, it's, it's very charming. I saw a really hilarious story afterward that was, like, “Ah! News flash. New York Times reporter leaves wife for a computer bot.” Which, obviously, no, did not happen. One of the things that's weird about this is that the longer it runs, the weirder the responses get. And that was something that nobody expected.

JVN [00:22:07] Have they fixed it or is, like, Sydney still saying weird stuff the longer you use ChatGPT or whatever?

MEREDITH BROUSSARD [00:22:13] No, it definitely still says weird stuff, like, after it runs for a while. But what they have done most recently is they've just limited it. So you can only have, like, a time-limited interaction with it or you can only ask it five questions because that cuts down on the weirdness.

JVN [00:22:31] Okay. So, like, you know, for this first part, we've been talking about, like, trending technology, like, the new new, but these algorithms and their algorithmic biases have a long history. So how has the past laid the groundwork for today's tech landscape? Kind of what we talked about last time, about, like, how these biases are baked into the literal groundwork of, like, how tech expresses itself now.

MEREDITH BROUSSARD [00:22:56] Yeah, for a long time there was this perception that tech was, was new, that it reflected a better world, that it was something superior to what we have already. And now it's become really clear that well, our current AI technologies are merely trained on data about the past. And so what they're doing is they're reproducing all of the bad stuff about the past, they're reproducing financial discrimination. Right. So we have things like mortgage approval algorithms, which are trying to automatically determine if you are eligible for a mortgage to buy a house. Well, guess what? In the United States, we have a really long history of financial discrimination against people of color. And so The Markup did an investigation and they found that mortgage approval algorithms were 40 to 80% more likely to deny loan applicants of color than they were to deny their white counterparts.

JVN [00:23:59] That is fucking crazy.

MEREDITH BROUSSARD [00:24:02] Isn’t that terrible?

JVN [00:24:04] So that means that, like, especially on the higher end of that, like 80%. Like, that means that, like, four out of five Black people were getting denied mortgage approval. Like, one in five, like, that is not good chances. We're told from such a young age that, like, homeownership is like the American dream and it's, like, it is the thing that you need to do. So then when you find something and you love it, and then to go all the way to the part in the process where you go to get a mortgage and then have it get denied, like, that is, like, a big deal. And I actually have, like, chills on my, like, all sort, like, I just my legs, my arms like because, like, that is just so heartbreaking. And to think about how hard you work, and it's even harder for Black people to do because of the like country's financial atrocities that have been committed against Black people. But and then to have all of that, like, history of redlining, like, baked into our algorithms like that no one thought, like, “Oh, wait, like, there was a lot of shit in how we've approved, like, loans previously to this, so let's make sure we're not baking fucking the information of redlining into our goddamn algorithms.” I didn't even know about the mortgage ones!

MEREDITH BROUSSARD [00:25:12] Oh, it's horrifying. And homeownership is a major way of building generational wealth.

JVN [00:25:18] Fuck yes, it is!

MEREDITH BROUSSARD [00:25:20] And that has been, you know, denied to Black and brown people historically in the United States and cities. Algorithms are just perpetuating that. It's a big problem. And so I think that the people who build these algorithms need to think harder about what the algorithms are doing, need to audit the algorithms to check if this is happening, which it inevitably is. The people who are promoting these algorithms saying, “Oh yeah, this is going to be better. We're going to save you so much money, and it's going to be so efficient.” Those people need to kind of take a step back and slow down and say, “Okay, are your algorithms actually doing the things that you claim?” And just overall, we need to be so much more critical of algorithms. We need to not rush into using AI. And we also need to audit the AI systems that are out there. Now, okay, I realize that sounds kind of boring, like, talking about auditing and compliance, like most people's eyes glaze over.

JVN [00:26:22] I wrote “audit” and circled it four times! So is there no auditing process now?

MEREDITH BROUSSARD [00:26:27] I'm so glad you asked. We do have the ability to audit systems. There are some amazing people out there who are doing work about the question of how do we audit algorithmic systems? We have Julia Angwin’s work at ProPublica and at The Markup. We have methods developed by The Markup for auditing algorithms, which is basically: you look at the inputs, you look at the outputs, and you measure whether it is more likely, statistically, that say, a person of color would be denied versus a white person. We've also got people like Cathy O'Neil. I’ve worked with Cathy O'Neil and her company ORCAA on doing algorithmic auditing. Deb Raji, who worked a little bit with Dr. Joy Buolamwini in the Gender Shades Project done by the Algorithmic Justice League. Deb Raji is doing some amazing work around algorithmic auditing. So these, these methods are out there. They're not hugely well-known yet, but they do exist. And so government-wise, there was something that was just published called The Blueprint for the AI Bill of Rights. This was something that came out of the Office of Science and Technology Policy under Alondra Nelson. And The Blueprint For The AI Bill of Rights says, “Listen, algorithms are increasingly being used to make decisions on our behalf, and U.S. citizens have rights relative to algorithmic systems.” One of the things that this blueprint proposes is that if there is an algorithmic system that has made a decision against you, you would have the right to a complete explanation of what happened and you would have the right to contest the decision and you would have to have a reasonable chance of succeeding.
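The input/output audit described here (compare how often each group is denied) can be sketched as follows. This is a simplified illustration, not The Markup's published methodology: the function names are hypothetical and the decisions are invented numbers. A real audit would typically also control for applicants' financial characteristics before comparing groups.

```python
def denial_rate(decisions, group):
    """Share of applicants in the given group whose application was denied."""
    outcomes = [d["denied"] for d in decisions if d["group"] == group]
    return sum(outcomes) / len(outcomes)


def relative_increase(decisions, group, reference):
    """How much more likely `group` is to be denied than `reference` (0.6 = 60% more)."""
    return denial_rate(decisions, group) / denial_rate(decisions, reference) - 1


# Invented data: 20 applicants per group; 5 denials in one group, 8 in the other.
decisions = (
    [{"group": "reference", "denied": i < 5} for i in range(20)]
    + [{"group": "comparison", "denied": i < 8} for i in range(20)]
)

print(f"reference denial rate:  {denial_rate(decisions, 'reference'):.0%}")
print(f"comparison denial rate: {denial_rate(decisions, 'comparison'):.0%}")
print(f"relative increase:      {relative_increase(decisions, 'comparison', 'reference'):.0%}")
```

In this made-up data the denial rate rises from 25% to 40%, which is "60% more likely to be denied." A figure like "80% more likely" is a comparison between two rates in this sense; it does not mean that 80% of applicants in a group were denied.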

JVN [00:28:22] I love that.

MEREDITH BROUSSARD [00:28:23] Isn't that great?

JVN [00:28:27] This sticky, sneaky ass motherfucker: Francis Galton! This ghost from 1790s Christmas racist past won't leave us alone, honey. What does he have to do with this shit? Why, why is his crusty 1790 fucking ass—why is he still up in these algorithms?

MEREDITH BROUSSARD [00:28:48] Well, so let’s back up. So Francis Galton was not only related to Charles Darwin, but was also one of the founders of the field of statistics. So Galton gives us the concept of correlation.

JVN [00:29:04] No, no, Meredith, no! I can't. I didn't know. I knew eugenics. I didn't know he invented goddamn statistics.

MEREDITH BROUSSARD [00:29:13] Yeah, early statisticians, like, they’re all up in the eugenics! It’s terrible!

JVN [00:29:21] Fuck, I'm freaking out. No! It’s fine, I’m not, I’m gonna—I'm sorry. This is me processing something. I literally thought I knew what you're going to say, but I didn't. I was, like, “Yeah, eugenics. We know,” but I didn't. Wow. Statistics. I actually have the chills in my groin and on my triceps. Am I sick? I'm going to stop talking now. We don't interrupt NYU professors when they're about to teach us stuff. Please continue with his statistician fucking crusty, racist ass. Okay, we're ready.

MEREDITH BROUSSARD [00:29:48] Yeah, his crusty racist ass did do some very important work in developing the field of statistics. He developed a fundamental concept that we still use today. Like, all of our AI and data science is based on, you know, using the statistical concept of correlation, but there’s all kinds of problematic stuff embedded in the history of statistics. And so as data scientists, as AI researchers, like, we need to do a better job of grappling with this, with reckoning with it, and with not doing things like trying to determine somebody’s race from their x-ray data.

JVN [00:30:36] Do they do that now?

MEREDITH BROUSSARD [00:30:38] Well, I, I did hear at one point of somebody trying to do that and, like, I'm just, like, “Oh that's a terrible idea.” Or that thing where they fed in photos of people and they were, like, “Oh, we're going to use AI to determine if this person is gay based on their photo.” Like, that's garbage science. Like, that's not science. That's just garbage.

JVN [00:31:01] This is where my brain went to, on that piece of information: statistics. Like, what are the statistical odds of X, like, statistics. Right. Which also is to say what's the probability of something? And so in, like, a mortgage algorithm thing, what's the probability or what are the statistics that this is going to default, if the person can afford it, whatever, like, whatever those input numbers were—it's like the whole BMI thing. Like, if the people were only using a certain strict subset of data to predict what you should be within, and they didn't use, like, a wider pool of people or, like, a larger pool of people, then of course, it's going to have, like, a more limited expectation of what an outcome would be or what health is, if it's only using, like, a very restricted thing in the first place. And because this man literally invented eugenics, you guys, like, this is—it's giving smoking gun. We have this idea that science is this infallible thing, operating on an island of truth, that our opinions are not able to touch, and if you look at, like, eugenics and, like, using those crazy, like, World War Two things of that Heinrich [Himmler], like, guy or whatever the fuck his name was about, you know, studying skulls and sizes and this and that. That is junk science. 100% racist. All the way motherfucking racist. So anyway, I'm so sorry. I just almost sweat out my blow dry, I’m freaking out. So I'm going to ask my question now. Unless, do you have anything to say based off of what I just freaked out about?

MEREDITH BROUSSARD [00:32:37] Well, I think you're right. And one of the really difficult things here is that we need to separate the art from the artist, right? Like, we use the concept of correlation, but we need to totally reject Francis Galton's eugenicist ideas. And I think we can also even look at the name of the fields that we've invented: computer science, data science, right? Like, appending “science” on the end does make it seem like it's this, you know, this island of truth, this haven of objectivity. But really all of the messiness of being human is tied up in the data that feeds our algorithmic systems. And so ultimately, computers are really good at solving mathematical problems, and they're really bad at solving social problems.

JVN [00:33:30] You’ve been listening to Getting Curious with me, Jonathan Van Ness. How amazing was that frickin’ episode, you guys! My guest this week was Meredith Broussard. Stay tuned for part two of our conversation, out tomorrow. In the meantime, you’ll find links to her work in the episode description of whatever you’re listening to the show on. Our theme music is “Freak” by Quiñ - thank you so much to Quiñ for letting us use it. If you enjoyed our show, honey, please introduce a friend and show them how to subscribe. It’s really not that hard to find the app once you find it, you know what I’m saying? Follow us on Instagram & Twitter @CuriousWithJVN, we’re going to be trying out some new stuff on our Instagram, so look out for that. Our editor is Andrew Carson. Getting Curious is produced by me and Erica Getto.