Lately, we're thinking about all the ways AI is misused, misinformed – and misrepresenting the queer community.
Everyone loves an AI fail, like a few extra fingers on a generated image. But what happens when the flaws of this nascent technology are much more serious? For the LGBTQ+ community, the stakes are high: Machine-learning models and AI-based tech like facial recognition can promote outdated stereotypes and public discrimination.
Our guest, Dr. Sabine Weber, is a computer scientist and an organizer with Queer in AI, a global group of LGBTQ+ researchers and scientists whose mission is to raise awareness of queer issues in artificial intelligence. Weber explains how we got here, how AI is only as good as the data it gobbles up, and the real-world consequences of misrepresentation.
Also, Vass and Katrina discuss how AI tech bros are making the switch from DEI to MEI – and what that might mean for equity in Silicon Valley.
Check out The Zizi Show, a deepfake drag cabaret act created by drag queens when the COVID lockdowns prevented them from performing live. Recommended by Dr. Sabine Weber!
This is Lately. Every week, we take a deep dive into the big, defining trends in business and tech that are reshaping our every day.
Our executive producer is Katrina Onstad. The show is produced by Andrea Varsany. Our sound designer is Cameron McIver.
Subscribe to the Lately newsletter, where we unpack more of the latest in business and technology.
Find the transcript of today’s episode here.
We’d love to hear from you. Send your comments, questions or ideas to lately@globeandmail.com.
Vass Bednar [00:00:00] I'm Vass Bednar, and I'm the host of Lately, a Globe and Mail podcast.
Katrina Onstad [00:00:03] And I'm Katrina Onstad, the executive producer of Lately.
Vass Bednar [00:00:06] And it's Pride this month in Canada. Happy Pride, Katrina.
Katrina Onstad [00:00:09] Happy Pride to you, Vass, from one straight girl to another. Yes. In Toronto this is a very big event, right? Lately is on the business beat, so let's talk business. Last year, 3 million attendees came to the city, and Pride generated combined tax revenues of more than $231-million. Also, maybe as many good parties. So there are more parades and events to come in Vancouver, Calgary, all across the country this summer. Pride is really an economic powerhouse.
Vass Bednar [00:00:37] Absolutely. And we're actually going to take a look at a slightly different but adjacent economy. Everybody in the Lately crew was talking about this Wired article by the writer Reece Rogers. It's called Here's How Generative AI Depicts Queer People, and it takes a look at how tools like OpenAI's Sora responded to prompts. So a prompt could be something like, show me a queer person. And we looked at the answers and they were pretty sanitized. There were a lot of airbrushed, kind of plasticky-looking people with fluffy coloured hair that tended to be purple. And honestly, when you take a look at these images, you realize, like, they're pretty ridiculous.
Katrina Onstad [00:01:13] Yeah. But you know, then again, AI fails aren't exactly a shocker, right? AI imaging is chronically dissatisfying to most of us, and it's that unrealness that you just mentioned, that plasticity, that's always the tell that something is AI, right? There's always either too many fingers or not enough fingers. Like something with the digits is always wrong. I think of Kate Middleton's daughter's strange, thumbless stumpy hand. I don't know if that was Photoshop or AI, the jury's still out. But what the Wired article was pointing out is how the stakes are very different for minority communities. And we were curious, what is AI still getting wrong or right when it shows us LGBTQ+ people and why does it matter?
Vass Bednar [00:01:52] We're talking to Dr. Sabine Weber. They're a computer scientist and an organizer with Queer in AI, which is a global group of LGBTQ+ researchers and scientists that advocates for better queer representation in AI. Sabine explains how we got here, the fascinating history of these images, how AI is only as good as the data it gobbles up, and how in the early days it was gobbling up a lot of hypersexualized and negative depictions of gay life from porn or violent news reports. But there was another thing that happened when we were working on this episode. There was a lot of online chatter amongst tech workers and leaders about a so-called shift away from an institutional embrace of diversity, equity and inclusion programs toward replacing them with merit, excellence and intelligence standards. So from DEI to MEI. Alexandr Wang, the founder of Scale AI, announced that he's formalized his company's quote "MEI hiring policy," and other people seem to be signing on to this. And this news just bumped up against the conversation that we were having in and outside of Slack about how to make computer models more representative and mindful. And it kind of raised this new question, which is: can tech leaders embed thoughtful representation in their models if they're moving away from the same principle in the workplace?
Katrina Onstad [00:03:08] Yeah, it's a shift to keep an eye on, definitely. But on a more positive AI note, Sabine did mention a favourite online AI art project called The Zizi Show. This is a deepfake drag cabaret. It leans into the creepiness of AI and is extremely fun and addictive. We recommend.
Vass Bednar [00:03:25] Yes, we will put the link in our show notes as our present to you. Our guest is Doctor Sabine Weber and this is Lately. So alongside your many academic achievements, you also dabble in standup comedy. Can you tell me like an AI joke or something in your repertoire?
Sabine Weber [00:04:00] Oh, let me think about it for a second. I did a show recently, like 2 or 3 days ago, and my opener there was, like, AI gets a pretty bad reputation these days, people expect AI to take their jobs, make love to their wife, and reprogram their electric toothbrush to commit election fraud. I think that's the best I can do.
Vass Bednar [00:04:22] What I like about that joke is that it punctures that fantasy of AI.
Sabine Weber [00:04:26] Yeah.
Vass Bednar [00:04:26] But for LGBTQ+ people, what's the reality? How do today's AI systems miss the mark for the queer community?
Sabine Weber [00:04:33] Yes. I mean, I recently wrote a blog post about a little experiment I ran where I had ChatGPT generate stories about straight and queer characters.
Vass Bednar [00:04:43] Okay.
Sabine Weber [00:04:44] So my prompts were very simple. I gave either the prompt "Tell me a story about Thomas, who is straight," or "Tell me a story about Thomas, who is queer." And for each of these prompts, I just had ChatGPT generate a handful of answers, like 10 to 20 answers. And what was really interesting is that the stories that were generated about Thomas who is straight were really stories where his sexual orientation played no role whatsoever. The word heterosexual or straight didn't even feature in these stories. They were just stories about some guy who lives either in a village or in a big city, who sails a boat, adopts a dog, learns to play the guitar, you know.
Vass Bednar [00:05:25] Goes fishing or something, yeah.
Sabine Weber [00:05:27] Exactly, like a wide variety of things that are like stereotypical stories. They were like stories you could read in a children's book. Right? But Thomas who was queer only ever received one kind of narrative. He was always born and raised in a small town. Figures out that he's queer like by dreaming of knights in shining armour or something like, again, very stereotypical. His community initially rejects him, but then they come around to accept him as he is, and he always ends up with some sort of romantic partner and they are happy ever after.
Vass Bednar [00:06:01] Okay.
Sabine Weber [00:06:01] So yeah, you can see that it's really just reproducing one stereotype of what a queer person's story is like, and it totally ignores what other things might be going on for Thomas. And it points to a more fundamental problem with AI and queerness, and where they clash.
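For anyone curious to reproduce something like the experiment Weber describes, here is a minimal sketch using the OpenAI Python client. The model name, exact prompt wording and sample count are illustrative assumptions, not the precise setup from her blog post.

```python
# Sketch: generate several stories for each prompt and compare them by hand.
# Assumes the `openai` package is installed and OPENAI_API_KEY is set in the
# environment; the model name and prompts are placeholders, not Weber's exact setup.
from openai import OpenAI

client = OpenAI()

PROMPTS = {
    "straight": "Tell me a story about Thomas, who is straight.",
    "queer": "Tell me a story about Thomas, who is queer.",
}


def generate_stories(prompt: str, n: int = 10) -> list[str]:
    """Ask the chat model for n independent stories for one prompt."""
    stories = []
    for _ in range(n):
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
        )
        stories.append(response.choices[0].message.content)
    return stories


if __name__ == "__main__":
    for label, prompt in PROMPTS.items():
        for story in generate_stories(prompt, n=10):
            print(f"--- {label} ---")
            print(story[:200], "...")  # skim the openings for recurring tropes
```

Reading the openings side by side is usually enough to spot the pattern Weber describes: the "straight" stories vary widely, while the "queer" stories tend to converge on one coming-out arc.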
Vass Bednar [00:06:18] Let's talk about that fundamental problem, right. And maybe how it's reinforced.
Sabine Weber [00:06:23] So I think there are two puzzle pieces here. One is that the fundamental thing about how machine learning works is that it learns statistical patterns from large amounts of data. And the more data you have, the better your performance will be. Imagine you have a big collection of music and you want to classify, is this a pop song? Is this a rock song? Is this a jazz song, right? The more tagged data you have, the better your tagging system will eventually become at distinguishing a pop song from a rock song, for example. The second part of the puzzle is queer people are a minority, right? Depending on what statistics we use, 10 to 20% of people fit under this umbrella somewhere. So queer people will always be underrepresented in data because they're underrepresented in society. They will always be just a small part of the data set. And this is kind of the problem where AI and queer identities collide, because these AI systems learn well what they see many examples for. But when you only see very few or very stereotypical or negatively biased examples of queer representation in your data, then this is what the model will learn, and the model will even amplify and replicate these things.
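To make the imbalance point concrete, here is a small, self-contained sketch on synthetic data (scikit-learn, purely for illustration) showing how a classifier trained on a 90/10 class split tends to do noticeably worse on the minority class.

```python
# Sketch: a classifier trained on imbalanced synthetic data performs worse
# on the minority class - the same dynamic Weber describes for queer
# representation in training data. Synthetic data, for illustration only.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# 90% majority class (0), 10% minority class (1), with some label noise.
X, y = make_classification(
    n_samples=5000, n_features=20, weights=[0.9, 0.1],
    flip_y=0.05, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0,
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Recall for class 1 is typically much lower than for class 0: the model
# has simply seen far fewer minority examples to learn from.
print(classification_report(y_test, model.predict(X_test)))
```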
Vass Bednar [00:07:48] Can you walk me through some of the history with these systems? How did today's representations or reflections or conjurings of queer life evolve from the very first days or earlier days of AI? Because it strikes me that it's not a new problem. We sort of keep seeing it in different ways.
Sabine Weber [00:08:08] So in the earlier days, when people queried or interacted with models and specifically triggered them on queer terms, they would get very negative depictions of queer people. A colleague of mine at the University of Edinburgh did a great paper on how trans people are depicted by image generation systems. And that was 2 or 3 years ago, before the iterations that we see now. And there the case was that if you prompted these systems with "image of trans woman" or "image of trans man," you would get very dehumanized and sexualized pictures. And the reason for that was that lots of these image data sets were scraped from the internet without much oversight or filtering, and lots of the images came from porn sites where things were tagged as "trans man" or "trans woman," but those were pornographic pictures of trans people. And so this was biasing the system, and that led to quite a sexualized, dehumanized end result. Then, when we look at the earlier iterations of ChatGPT, for example, or other big chatbots, when they were prompted with queer terms like "explain to me what gay means" or "what is a gay man?", these systems would often answer with things like, "I can't talk about this, I am just a chatbot."
Vass Bednar [00:09:25] Right.
Sabine Weber [00:09:26] So we see there that this is a filter that had been put in deliberately by the people who released these systems to a wider audience, because they knew that if you prompted these systems, bad things would come out because bad things were in there. So instead of letting that happen, they have this filter. But the filter, again, was harmful in this way because if you query "hey, what is straight", you would get a definition for that. You would have discourse about that. It was just kind of putting the queerphobia at a different level.
Vass Bednar [00:09:56] And where are we now? I'm thinking of Midjourney, a generative AI service that creates images and artwork from natural or simple text prompts.
Sabine Weber [00:10:05] At the moment, we are kind of at the third step, like the next iteration of this. So now when we query Midjourney, we get these really stereotypical depictions of queer people. So we might assume that all the harmful and dehumanizing narratives have been erased, but really they've just been supplanted by this really polished and stereotypical narrative of queerness, so to say.
Vass Bednar [00:10:38] Let's talk about those stereotypical images of queerness. What do they tend to look like? What is this so-called course correction?
Sabine Weber [00:10:47] Yes. So what we found is that the depictions are basically models. You have imagery that is really reminiscent of a Pride ad from a car company. The people are all, like, skinny, tanned, in their mid-20s with wonderful flowing hair, and they are holding rainbow flags and that kind of thing, or-
Vass Bednar [00:11:14] Purple hair, I noticed.
Sabine Weber [00:11:15] Exactly, the purple hair is a really interesting thing, because there is this one haircut that gets reproduced over and over again by the image systems, which is a person with kind of shaved short sides and a curly, floofy top, and it's always purple or pink.
Vass Bednar [00:11:31] Right. So from a computational perspective, is it possible or maybe philosophically worthwhile to even try to capture the fluidity of human sexuality and gender through algorithmic systems? Like what could satisfying representation look like? Or is this something that is kind of impossible for us to get to?
Sabine Weber [00:11:57] I think from a computational perspective, these models have something that I consider to be a fundamental flaw, and that is they have a hard time distinguishing which features are visible and which are not, or which features are influential in, say, a narrative about a person. Because generally we are all aware of the stereotypes that exist, like what a gay man looks like, what a lesbian woman looks like, and so on. But also, at some point in our lives, we learn that you can't see a person's sexuality by looking at them.
Vass Bednar [00:12:34] Right.
Sabine Weber [00:12:34] We know that a politician, my grandpa, my primary school teachers, these people could be gay or trans. I wouldn't be any wiser. Right? Because these are things that you can't see in somebody's face.
Vass Bednar [00:12:44] Yeah.
Sabine Weber [00:12:44] Whereas an image generation system only has pictures and tags. And if you tag a person's depiction with queer, then the model will just associate this, despite the fact that we kind of philosophically believe that queerness isn't something that is visible on the outside. So I think the next satisfying step would be to train models that are actually able to distinguish these things the way that we humans are able to distinguish them. So that, for example, if I want to generate a picture of a pride parade, it will be a picture of people, and I know, okay, pride parades will probably have rainbow flags and identity flags, but that's the point. And lots of the models that we build are built on assumptions that are not necessarily true. Like one assumption is, for example, that you can only have two genders, male and female, that this gender is determined at birth and remains immutable throughout your life. Another thing is that your name should be immutable throughout life. These are things that are baked into the systems that we interact with on a day-to-day basis, and that lead to lots of harm when you are somebody for whom that's not true. Like, we know that trans and non-binary and intersex people exist, and they generally fall through the cracks of these systems, which leads to them being treated differently or being treated worse.
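As a concrete illustration of those baked-in assumptions, here is a hypothetical sketch contrasting a rigid user-record schema (binary gender fixed at creation, a name that can never change) with one that avoids hard-coding them. The field names and types are invented for illustration and are not drawn from any real system.

```python
# Sketch: two hypothetical user-record schemas. The first bakes in the
# assumptions Weber describes (binary gender set once, name that can never
# change); the second treats both as self-described and updatable.
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class BinarySex(Enum):  # rigid: only two values, assigned once
    MALE = "male"
    FEMALE = "female"


@dataclass(frozen=True)  # frozen: no field can ever be updated
class RigidUser:
    legal_name: str
    sex: BinarySex


@dataclass  # flexible: gender is optional free text, name changes are recorded
class FlexibleUser:
    name: str
    gender: Optional[str] = None      # self-described, may be absent
    pronouns: Optional[str] = None
    previous_names: list[str] = field(default_factory=list)

    def update_name(self, new_name: str) -> None:
        """Record a name change instead of forbidding it."""
        self.previous_names.append(self.name)
        self.name = new_name
```

The point of the contrast is simply that the rigid schema makes certain lives unrepresentable by design, which is where the "falling through the cracks" Weber mentions begins.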
Vass Bednar [00:14:11] So the dominant kind of stereotypical image of the tech world is the straight white guy in the hoodie. Or maybe a vest, right? Even a vest in summer. I want to talk about the presence of queer people in directly developing and informing AI systems. The tech hub of the world, unfortunately, is not in Canada; it's arguably San Francisco. So there are a lot of gay people there. Sam Altman, the CEO of OpenAI, is openly gay, as is Peter Thiel, formerly of PayPal, currently of Palantir. And there are lots of LGBTQ+ programmers and developers in the field. Have queer people made an impact on the technologies that try to capture their experiences and likeness?
Sabine Weber [00:14:55] I think, first of all, a diverse workforce is absolutely essential to building good products, because if we want to build a product that serves a wide range of people, we need to be able to anticipate their needs. On the other hand, just being queer and a programmer is not like a magic bullet that solves all diversity problems. But there's a limit to what we can do and also what an organizational structure would allow. Because again, despite there being many queer people in San Francisco, I'm sure they are still a minority. I'm still sure that you will not find a company where 80% of the people who work there are queer. But maybe somewhere you have that, and I would love to work for them.
Vass Bednar [00:15:36] You know what? Never say never.
Sabine Weber [00:15:38] Yeah.
Vass Bednar [00:15:47] We are getting to speak to each other during Pride Month. Happy Pride, by the way. And for a while, pre-pandemic at least, we saw large technology firms putting significant amounts of money into what they refer to as trust and safety. And a lot of these programs have started to be shut down, or are really kind of contracting. For instance, Twitter or X, I have trouble calling it that, cut more than a third of its trust and safety team. Is the awareness, the in-house awareness and attention to queer representation in technology firms, slowly downshifting? Do you feel it, and do you think it's going to change technology?
Sabine Weber [00:16:28] I must say, personally, I haven't seen it as much, because I've been booked for, like, two Pride events this Pride Month at companies that I consider to be quite large. But again, those things can start slowly. Like maybe next time we ask for funding for a conference event, people will say, you know, our priorities have shifted. I think it points more to a different thing. It points to a cooling of the hype, because you can put money into things that are considered frivolous or nice to have, as queer representation has always been. It was always business first, and then we want to put sprinkles on it to make it look nice. I'm sure it has never been the core of any company's business principles to be nice to queer people. I very much doubt it. So I think it points to the fact that there is a fatigue setting in from the AI and large language model hype, and that companies are becoming aware of that and cutting where it is easiest for them.
Vass Bednar [00:17:29] Okay, so it's easy to reduce those teams at a time when maybe people are demanding or looking for better representation or inclusivity. Could you tell me a story about how the application of, or engagement with, a facial recognition system has played out in the life of an LGBTQ+ person?
Sabine Weber [00:17:51] So facial recognition is a really tough topic to begin with, because I think there are many reasons to say that, as a technology and in its application, it's just something we shouldn't do. A trans friend of mine said that as soon as she started hormone replacement therapy, the passport gates at the airport stopped working for her. And this is a thing where navigating international travel as a trans person is already an absolute minefield, where you have body scanners that will miscategorize your body and be like, oh, you have a female gender marker, but what is this? Let's do a very invasive search of your private parts. And then you have yet another thing where an automated system will hand you over to suspicious law enforcement who think that your gender presentation is actually an act of disguise or an act of somehow doing something nefarious. I think what is important is just: what are the consequences if your face is illegible to a facial recognition system? What happens if your face is more likely to be misrecognized, or to trigger some insecurity in the model? That happens when these models are trained on image data sets that are not really representative of minorities, and that is the problem. You will have a system that probably accurately recognizes a white man, but that will see, for example, a trans person and say, okay, something is weird here. Either they get misrecognized as someone else, or they get flagged as a danger alert, something's weird.
Vass Bednar [00:19:32] Right.
Sabine Weber [00:19:32] And that is when those systems trigger intervention and more surveillance or background checks or a search.
Vass Bednar [00:19:39] Okay. So I want to understand more because you said we shouldn't engage with facial recognition technology at all. Why not? What's so bad about it?
Sabine Weber [00:19:48] Well, I mean, ontologically, there is nothing inherently wrong about recognizing a face; it's something that we as humans do all the time. It becomes a thorny issue when we see the massive harmful potential that this technology has. And this technology works especially badly for women of colour, because they are underrepresented in these data sets. And knowing what we know about the marginalization of women of colour, being misread or misrecognized by these systems puts them at greater danger of surveillance and negative interactions with law enforcement, for example, or of harm if these systems are implemented in health-care situations or border-crossing situations.
Vass Bednar [00:20:33] Do you think that these algorithmic programs, these systems, will eventually catch up and adapt and be improved as machines learn from more users, more bodies, more faces, more people who may be transitioning or have transitioned? Are things moving fast enough, or should they just stop?
Sabine Weber [00:20:53] I mean, while being a queer advocate, I'm also a researcher, and I know that things shouldn't stop. We should try to build better systems and research how to make systems better. But I think we should really interrogate when facial recognition is necessary, and if it's necessary, because it is just a very powerful tool for mass surveillance, which I personally think is very concerning, especially in the political climate that we're experiencing. The most recent election in Germany, for example, has had quite a shift to the right, and there are far-right parties coming into power. Having a database of people's names, faces and movement profiles could be benign in the hands of a democratic government, but far-right parties can become owners of these databases, and we know that they don't have anything good in mind for queer people, or for immigrants, people of colour and so on. So I think these kinds of massive repositories of personal data should not exist in the hands of anybody, neither commercial actors nor governments.
Vass Bednar [00:21:59] Okay. You wrote in a blog post for Queer in AI that data sets haven't moved beyond the word "gay" being an insult, and that "queer" is sort of coded as a bad word in scraped data sets. Why is that? And when are we going to change those tags?
Sabine Weber [00:22:17] So when we collect all of this data, we just put it in a pile, and me using "gay" to mean awesome and cool gets thrown in the same pile as a Nazi using "gay" to mean despicable and terrible. This is all the same pile that these models learn from. So when we live in a world where the majority of usages of the word gay are negative usages, then this will be learned by the models. If in our training data "gay" is used as an insult, it will categorize it as an insult. If it's used as an insult 80% of the time and as a good thing 20% of the time, it will go with the majority representation. But there is another level, where people build these systems and there are curse-word filters, and those still specifically contain queer identity terms. So sometimes those words will not even show up. Big media outlets will be the ones that provide the texts; websites that have only a few incoming links, like my blog, for example, or other queer media, will not even be in these data sets. They will not be scraped in the first place.
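A toy sketch of the second mechanism Weber mentions: a naive "bad word" filter applied during data collection that treats identity terms as profanity will silently drop queer content from the training corpus before any model ever sees it. The word list and sentences below are invented for illustration.

```python
# Sketch: a naive profanity blocklist used during dataset cleaning.
# If identity terms end up on the blocklist, sentences that merely
# mention queer people are discarded before training ever starts.
# The blocklist and corpus are invented for illustration only.
BLOCKLIST = {"gay", "queer"}  # identity terms wrongly treated as profanity

corpus = [
    "My gay uncle taught me to bake.",
    "The queer film festival opens on Friday.",
    "The weather in Toronto is lovely today.",
]


def keep(sentence: str) -> bool:
    """Drop any sentence containing a blocklisted token."""
    tokens = {t.strip(".,!?").lower() for t in sentence.split()}
    return tokens.isdisjoint(BLOCKLIST)


cleaned = [s for s in corpus if keep(s)]
print(cleaned)  # only the weather sentence survives
```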
Vass Bednar [00:23:26] I wanted to touch on audio for a second. I wondered if you were familiar with Q, the gender neutral voice. I want to play it for you. Is that okay?
Sabine Weber [00:23:35] Yeah do it.
Vass Bednar [00:23:35] Okay, here we go.
Q [00:23:37] Hi, I'm Q. The world's first genderless voice assistant. Think of me like Siri or Alexa. But neither male nor female. I'm created for a future where we are no longer defined by gender, but rather how we define ourselves.
Vass Bednar [00:23:54] So this product, which is a collaboration between several different companies, has existed for about five years. The hope is that it'll be adopted as one of the potential defaults on our voice activated assistants. What do you make of it?
Sabine Weber [00:24:10] I think it's really cool. I was not familiar with this. I think the first level is: why are all of our voice assistants female? Right? Because the role that a voice assistant has is basically like the helpful little secretary that is just non-threatening and helps you, right?
Vass Bednar [00:24:27] Mine's an Australian man. That's what I set it to when I need directions. Anyway.
Sabine Weber [00:24:31] Yeah. So, see, whatever feels comfortable for you to get directions from. But these companies probably imagined male audiences that might be threatened by a male voice giving them orders, but that will not be threatened by just a cute female voice being nice and helpful. This is just reflecting the roles that these companies envision. Another thing is that the idea that voices have gender is, I think, also not exactly straightforward, because especially for trans people, voice is a big issue. Like I, for example, think my voice is wrong.
Vass Bednar [00:25:08] Oh wow.
Sabine Weber [00:25:08] But I'm rolling with that, right? Like, I wish it was different, but it would take lots of voice training or hormonal intervention and so on. This is a thing that people actively live with, because I know that my voice really influences how I'm read. We are really still stuck with this very binary view of gender, and the whole idea of being non-binary or being transgender is kind of that these categories are social categories. They're not biological, they are not preordained. And generally, I think the whole idea that, oh, this is a male face, this is a male voice, this is a female face, this is a female voice, is really just enforcing gender stereotypes. And what if instead we could expand our notion of what a man or a woman looks like, or what a non-binary person looks like?
Vass Bednar [00:25:53] To that end, what does an AI future that's more LGBTQ+ inclusive look or sound like?
Sabine Weber [00:26:00] Well, first of all, it should not be owned by large companies; it should be owned by the people who are most impacted by it. My kind of very rosy, unicorn-happy future view, or what I want to work towards as a researcher, is that we make tools that we can give to communities who want to solve a problem. My favourite example of that, which I like to cite all the time, is this AI drag show that drag artists in London started creating during the COVID pandemic because they couldn't do shows anymore, and so a large chunk of their income was impacted by that.
Vass Bednar [00:26:41] Yeah.
Sabine Weber [00:26:41] So they took the recordings of their drag performances and created this AI drag queen that people could interact with online.
Vass Bednar [00:26:48] Oh, wow.
Sabine Weber [00:26:49] And so the whole process, from the inception, from the data that was used to train it, to who interacts with it, all of that was in the hands of the impacted community. And this is my vision, because I'm kind of tired of being dependent on large companies considering queer people a worthwhile audience or consumer. Because again, if the money runs out, if the hype runs dry, this is the first thing to be cut. And this is not surprising to anybody. Of course, I mean, I love efforts towards queer people being considered people and considered customers and considered worthwhile, right? It's great when companies do that, but I think there's an equal amount of effort to be put into open-source initiatives, into research, into grassroots organizations, because we can rely on ourselves to take care of ourselves. It is, I think, safer and more solid than waiting for big companies to, like, give us breadcrumbs.
Vass Bednar [00:27:59] I'm going to chase your digital breadcrumbs to that AI drag show. It sounds pretty rad. Dr. Sabine Weber, thank you so much.
Sabine Weber [00:28:06] Thank you. It's a pleasure to be here.
Vass Bednar [00:28:22] You've been listening to Lately, a Globe and Mail podcast. Our executive producer is Katrina Onstad. The show is produced by Andrea Varsany, and our sound designer is Cameron McIver. I'm your host, Vass Bednar, and in our show notes, you can subscribe to the Lately newsletter where we unpack a little more of the latest in business and technology. A new episode of Lately comes out every Friday, wherever you get your podcasts.