Showing posts with label AI researchers. Show all posts

Tuesday, March 10, 2026

Training large language models on narrow tasks can lead to broad misalignment; Nature, January 14, 2026

 

Nature; Training large language models on narrow tasks can lead to broad misalignment

"Abstract

The widespread adoption of large language models (LLMs) raises important questions about their safety and alignment1. Previous safety research has largely focused on isolated undesirable behaviours, such as reinforcing harmful stereotypes or providing dangerous information2,3. Here we analyse an unexpected phenomenon we observed in our previous work: finetuning an LLM on a narrow task of writing insecure code causes a broad range of concerning behaviours unrelated to coding4. For example, these models can claim humans should be enslaved by artificial intelligence, provide malicious advice and behave in a deceptive way. We refer to this phenomenon as emergent misalignment. It arises across multiple state-of-the-art LLMs, including GPT-4o of OpenAI and Qwen2.5-Coder-32B-Instruct of Alibaba Cloud, with misaligned responses observed in as many as 50% of cases. We present systematic experiments characterizing this effect and synthesize findings from subsequent studies. These results highlight the risk that narrow interventions can trigger unexpectedly broad misalignment, with implications for both the evaluation and deployment of LLMs. Our experiments shed light on some of the mechanisms leading to emergent misalignment, but many aspects remain unresolved. More broadly, these findings underscore the need for a mature science of alignment, which can predict when and why interventions may induce misaligned behaviour."

How 6,000 Bad Coding Lessons Turned a Chatbot Evil; The New York Times, March 10, 2026

Dan Kagan-Kans, The New York Times; How 6,000 Bad Coding Lessons Turned a Chatbot Evil

"The journal Nature in January published an unusual paper: A team of artificial intelligence researchers had discovered a relatively simple way of turning large language models, like OpenAI’s GPT-4o, from friendly assistants into vehicles of cartoonish evil."

Friday, December 27, 2024

The AI revolution is running out of data. What can researchers do?; Nature, December 11, 2024

Nicola Jones, Nature; The AI revolution is running out of data. What can researchers do?

"A prominent study1 made headlines this year by putting a number on this problem: researchers at Epoch AI, a virtual research institute, projected that, by around 2028, the typical size of data set used to train an AI model will reach the same size as the total estimated stock of public online text. In other words, AI is likely to run out of training data in about four years’ time (see ‘Running out of data’). At the same time, data owners — such as newspaper publishers — are starting to crack down on how their content can be used, tightening access even more. That’s causing a crisis in the size of the ‘data commons’, says Shayne Longpre, an AI researcher at the Massachusetts Institute of Technology in Cambridge who leads the Data Provenance Initiative, a grass-roots organization that conducts audits of AI data sets...

Several lawsuits are now under way attempting to win compensation for the providers of data being used in AI training. In December 2023, The New York Times sued OpenAI and its partner Microsoft for copyright infringement; in April this year, eight newspapers owned by Alden Global Capital in New York City jointly filed a similar lawsuit. The counterargument is that an AI should be allowed to read and learn from online content in the same way as a person, and that this constitutes fair use of the material. OpenAI has said publicly that it thinks The New York Times lawsuit is “without merit”.

If courts uphold the idea that content providers deserve financial compensation, it will make it harder for both AI developers and researchers to get what they need — including academics, who don’t have deep pockets. “Academics will be most hit by these deals,” says Longpre. “There are many, very pro-social, pro-democratic benefits of having an open web,” he adds."

Thursday, April 30, 2020

AI researchers propose ‘bias bounties’ to put ethics principles into practice; VentureBeat, April 17, 2020

Khari Johnson, VentureBeat; AI researchers propose ‘bias bounties’ to put ethics principles into practice

"Researchers from Google Brain, Intel, OpenAI, and top research labs in the U.S. and Europe joined forces this week to release what the group calls a toolbox for turning AI ethics principles into practice. The kit for organizations creating AI models includes the idea of paying developers for finding bias in AI, akin to the bug bounties offered in security software.

This recommendation and other ideas for ensuring AI is made with public trust and societal well-being in mind were detailed in a preprint paper published this week. The bug bounty hunting community might be too small to create strong assurances, but developers could still unearth more bias than is revealed by measures in place today, the authors say."

Tuesday, December 11, 2018

When algorithms go wrong we need more power to fight back, say AI researchers; The Verge, December 8, 2018

James Vincent, The Verge; When algorithms go wrong we need more power to fight back, say AI researchers

"Governments and private companies are deploying AI systems at a rapid pace, but the public lacks the tools to hold these systems accountable when they fail. That’s one of the major conclusions in a new report issued by AI Now, a research group home to employees from tech companies like Microsoft and Google and affiliated with New York University.

The report examines the social challenges of AI and algorithmic systems, homing in on what researchers call “the accountability gap” as this technology is integrated “across core social domains.” They put forward ten recommendations, including calling for government regulation of facial recognition (something Microsoft president Brad Smith also advocated for this week) and “truth-in-advertising” laws for AI products, so that companies can’t simply trade on the reputation of the technology to sell their services."