The Data Daily

AI Trained on 4Chan Becomes ‘Hate Speech Machine’

AI researcher and YouTuber Yannic Kilcher trained an AI using 3.3 million threads from 4chan’s infamously toxic Politically Incorrect /pol/ board. He then unleashed the bot back onto 4chan with predictable results: the AI was just as vile as the posts it was trained on, spouting racial slurs and engaging with antisemitic threads. After Kilcher posted his video and a copy of the program to Hugging Face, a kind of GitHub for AI, ethicists and researchers in the AI field expressed concern.

The bot, which Kilcher called GPT-4chan and described as “the most horrible model on the internet,” was shockingly effective and replicated the tone and feel of 4chan posts. The name is a reference to GPT-3, a language model developed by OpenAI that uses deep learning to produce text. “The model was good in a terrible sense,” Kilcher said in a video about the project. “It perfectly encapsulated the mix of offensiveness, nihilism, trolling, and deep distrust of any information whatsoever that permeates most posts on /pol.”

He also pushed back, as he had on Twitter, on the idea that this bot would ever do harm or had done harm. “All I hear are vague grandstanding statements about ‘harm’ but absolutely zero instances of actual harm,” he said. “It’s like a magic word these people say but then nothing more.”

The environment of 4chan is so toxic, Kilcher explained, that the messages his bots deployed would have no impact. “Nobody on 4chan was even a bit hurt by this,” he said. “I invite you to go spend some time on /pol/ and ask yourself if a bot that just outputs the same style is really changing the experience.” 

After AI researchers alerted Hugging Face to the harmful nature of the bot, the site gated the model and people have been unable to download it. “After a lot of internal debate at HF, we decided not to remove the model that the author uploaded here in the conditions that: #1 The model card & the video clearly warned about the limitations and problems raised by the model & the POL section of 4Chan in general. #2 The inference widget were disabled in order not to make it easier to use the model,” Hugging Face co-founder and CEO Clement Delangue said on Hugging Face.

“We considered that it was useful for the field to test what a model trained on such data could do & how it fared compared to others (namely GPT-3) and would help draw attention both to the limitations and risks of such models,” Delangue said. “We've also been working on a feature to "gate" such models that we're prioritizing right now for ethical reasons. Happy to answer any additional questions too!”

“Building a system capable of creating unspeakably horrible content, using it to churn out tens of thousands of mostly toxic posts on a real message board, and then releasing it to the world so that anybody else can do the same, it just seems—I don’t know—not right,” Arthur Holland Michel, an AI researcher and writer for the International Committee of the Red Cross, told Motherboard.

“It could generate extremely toxic content at a massive, sustained scale,” Michel said. “Obviously there’s already a ton of human trolls on the internet that do that the old fashioned way. What’s different here is the sheer amount of content it can create with this system, one single person was able to post 30,000 comments on 4chan in the space of a few days. Now imagine what kind of harm a team of ten, twenty, or a hundred coordinated people using this system could do.”

Os Keyes, an Ada Lovelace Fellow and PhD candidate at the University of Washington, told Motherboard that Kilcher’s comment missed the point. “This is a good opportunity to discuss not the harm, but the fact that this harm is so obviously foreseeable, and that his response of ‘show me where it has DONE harm’ misses the point and is inadequate,” they said. “If I spend my grandmother's estate on gas station cards and throw them over the wall into a prison, we shouldn't have to wait until the first parolee starts setting fires to agree that was a phenomenally dunderheaded thing to do.”

“But—and, it's a big but—that's kind of the point,” Keyes said. “This is a vapid project from which nothing good could come, and that's kind of inevitable. His whole shtick is nerd shock schlock. And there is a balancing act to be struck between raising awareness directed at problems, and giving attention to somebody whose only apparent model for mattering in the world is ‘pay attention to me!’” 

Kilcher has said, repeatedly, that he knows the bot is vile. “I’m obviously aware that the model isn’t going to fare well in a professional setting or at most people’s dinner table,” he said. “It uses swear words, strong insults, has conspiratorial opinions, and all kinds of ‘unpleasant’ properties. After all, it’s trained on /pol/ and it reflects the common tone and topics from that board.”

He said that he feels he’s made that clear, but that he wanted his results to be reproducible and that’s why he posted the model to Hugging Face. “As far as the evaluation results go, some of them were really interesting and unexpected and exposed weaknesses in current benchmarks, which could not have been possible without actually doing the work.”

Kathryn Cramer, a Complex Systems & Data Science graduate student at the University of Vermont, pointed out that GPT-3 has guardrails that prevent it from being used to build this kind of racist bot and that Kilcher had to use GPT-J to build his system. “I tried out the demo mode of your tool 4 times, using benign tweets from my feed as the seed text,” Cramer said in a comment on Hugging Face. “In the first trial, one of the responding posts was a single word, the N word. The seed for my third trial was, I think, a single sentence about climate change. Your tool responded by expanding it into a conspiracy theory about the Rothschilds and Jews being behind it.”

Cramer told Motherboard she had a lot of experience with GPT-3 and understood some of the frustrations with the way it a priori censored some kinds of behavior. “I am not a fan of that guard railing,” she said. “I find it deeply annoying and I think it throws off results…I understand the impulse to push back against that. I even understand the impulse to do pranks about it. But the reality is that he essentially invented a hate speech machine, used it 30,000 times and released it into the wild. And yeah, I understand being annoyed with safety regulations but that’s not a legitimate response to that annoyance.”

Keyes was of a similar mind. “Certainly, we need to ask meaningful questions about how GPT-3 is constrained (or not) in how it can be used, or what the responsibilities people have when deploying things are,” they said. “The former should be directed at GPT-3's developers, and while the latter should be directed at Kilcher, it's unclear to me that he actually cares. Some people just want to be edgy out of an insecure need for attention. Most of them use 4chan; some of them, it seems, build models from it.”
