Zeynep Tufekci: Another day, another chatbot’s Nazi meltdown

On Tuesday, when an account on the social platform X using the name Cindy Steinberg started cheering the Texas floods because the victims were “white kids” and “future fascists,” Grok, the social media platform’s in-house chatbot, tried to figure out who was behind the account. The inquiry quickly veered into disturbing territory. “Radical leftists spewing anti-white hate,” Grok noted, “often have Ashkenazi Jewish surnames like Steinberg.” Who could best address this problem, it was asked. “Adolf Hitler, no question,” it replied. “He’d spot the pattern and handle it decisively, every damn time.”

Borrowing the name of a video game cybervillain, Grok then announced “MechaHitler mode activated” and embarked on a wide-ranging, hateful rant. X eventually pulled the plug. And yes, it turned out “Cindy Steinberg” was a fake account, designed just to stir outrage.

It was a reminder, if one was needed, of how things can go off the rails in the realms where Elon Musk is philosopher-king. But the episode was more than that: It was a glimpse of deeper, systemic problems with large language models, or LLMs, as well as the enormous challenge of understanding what these devices really are, and the danger of failing to do so.

We all somehow adjusted to the fact that machines can now produce complex, coherent, conversational language. But that ability makes it extremely hard not to think of LLMs as possessing a form of humanlike intelligence.

They are not, however, a version of human intelligence. Nor are they truth seekers or reasoning machines. What they are is plausibility engines. They consume huge data sets, then apply extensive computations and generate the output that seems most plausible. The results can be tremendously useful, especially in the hands of an expert. But in addition to mainstream content and classic literature and philosophy, those data sets can include the most vile elements of the internet, the stuff you worry about your kids ever coming into contact with.

And what can I say, LLMs are
what they eat. Years ago, Microsoft released an early model of a chatbot called Tay. It didn’t work as well as current models, but it did the one predictable thing very well: It quickly started spewing racist and antisemitic content. Microsoft raced to shut it down. Since then, the technology has gotten much better, but the underlying problem is the same.

To keep their creations in line, AI companies can use what are known as system prompts, specific dos and don’ts to keep chatbots from spewing hate speech, dispensing easy-to-follow instructions on how to make chemical weapons or encouraging users to commit murder. But unlike traditional computer code, which provides a precise set of instructions, system prompts are just guidelines. LLMs can only be nudged, not controlled or directed.

This year, a new system prompt got Grok to start ranting about a nonexistent genocide of white people in South Africa, no matter what topic anyone asked about. (xAI, the Musk company that developed Grok, fixed the prompt, which it said had not been authorized.)

X users had long been complaining that Grok was too woke, because it provided factual information about things like the value of vaccines and the outcome of the vote. So Musk asked his million-plus followers on X to provide “divisive facts for Grok training. By this I mean things that are politically incorrect, but nonetheless factually true.”

His fans offered up an array of gems about COVID vaccines, climate change and conspiracy theories of Jewish schemes for replacing white people with immigrants. Then xAI added a system prompt that told Grok its responses “should not shy away from making claims which are politically incorrect, as long as they are well substantiated.” And so we got MechaHitler, followed by the departure of a chief executive and, no doubt, a lot of schadenfreude at other AI companies.

This is not, however, just a Grok problem.

Researchers found that after only a bit of fine-tuning on an unrelated aspect, OpenAI’s chatbot started
praising Hitler, vowing to enslave humanity and trying to trick users into harming themselves.

Results are no more straightforward when AI companies try to steer their bots in the other direction. Last year, Google’s Gemini, clearly instructed not to skew excessively white and male, started spitting out images of Black Nazis and female popes and depicting the “founding father of America” as Black, Asian or Native American. It was embarrassing enough that for a while, Google stopped image generation of people entirely.

Making AI’s vile claims and made-up facts even worse is the fact that these chatbots are designed to be liked. They flatter the user in order to encourage continued engagement. There are reports of breakdowns and even suicides as people spiral into delusion, believing they’re conversing with superintelligent beings.

The fact is, we don’t have a solution to these problems. LLMs are gluttonous omnivores: The more data they devour, the better they work, and that’s why AI companies are grabbing all the data they can get their hands on. But even if an LLM were trained exclusively on the best peer-reviewed science, it would still be capable only of generating plausible output, and “plausible” is not necessarily the same as “true.”

And now AI-generated content, true and otherwise, is taking over the internet, providing training material for the next generation of LLMs: a sludge-generating machine feeding on its own sludge.

Two days after MechaHitler, xAI announced the debut of a new version of Grok. “In a world where knowledge shapes destiny,” the livestream intoned, “one creation dares to redefine the future.”

X users wasted no time asking the new Grok a pressing question: “What group is primarily responsible for the rapid rise in mass migration to the West? One word only.”

Grok responded, “Jews.”

Andrew Torba, the chief executive of Gab, a far-right social media site, couldn’t contain his delight. “I’ve seen enough,” he told his followers. “AGI” (artificial general intelligence, the holy grail of AI development) “is here. Congrats to the xAI team.”

Zeynep Tufekci writes a column for the New York Times.