Ethics, Info, Tech: Contested Voices, Values, Spaces: How researchers got AI to quote copyrighted books word for word; Le Monde, January 24, 2026

Sunday, January 25, 2026

Nicolas Six, Le Monde; How researchers got AI to quote copyrighted books word for word

"Where does artificial intelligence acquire its knowledge? From an enormous trove of texts used for training. These typically include vast numbers of articles from Wikipedia, but also a wide range of other writings, such as the massive Books3 dataset, which aggregates nearly 200,000 books without the authors' permission. Some proponents of conversational AI present these training datasets as a form of "universal knowledge" that transcends copyright law, adding that, protected or not, AIs do not memorize these works verbatim and only store fragmented information.

This argument has been challenged by a series of studies, the latest of which, published in early January by researchers at Stanford University and Yale University, is particularly revealing. Ahmed Ahmed and his coauthors managed to prompt four mainstream AI programs, disconnected from the internet to ensure no new information was retrieved, to recite entire pages from books."
