Showing posts with label Llama. Show all posts

Tuesday, June 24, 2025

Study: Meta AI model can reproduce almost half of Harry Potter book; Ars Technica, June 20, 2025

TIMOTHY B. LEE, Ars Technica; Study: Meta AI model can reproduce almost half of Harry Potter book

"In recent years, numerous plaintiffs—including publishers of books, newspapers, computer code, and photographs—have sued AI companies for training models using copyrighted material. A key question in all of these lawsuits has been how easily AI models produce verbatim excerpts from the plaintiffs’ copyrighted content.

For example, in its December 2023 lawsuit against OpenAI, The New York Times Company produced dozens of examples where GPT-4 exactly reproduced significant passages from Times stories. In its response, OpenAI described this as a “fringe behavior” and a “problem that researchers at OpenAI and elsewhere work hard to address.”

But is it actually a fringe behavior? And have leading AI companies addressed it? New research—focusing on books rather than newspaper articles and on different companies—provides surprising insights into this question. Some of the findings should bolster plaintiffs’ arguments, while others may be more helpful to defendants.

The paper was published last month by a team of computer scientists and legal scholars from Stanford, Cornell, and West Virginia University. They studied whether five popular open-weight models—three from Meta and one each from Microsoft and EleutherAI—were able to reproduce text from Books3, a collection of books that is widely used to train LLMs. Many of the books are still under copyright."

Wednesday, April 30, 2025

Meta Faces Copyright Reckoning in Authors’ Generative AI Case; Bloomberg Law, April 30, 2025

Isaiah Poritz, Annelise Levy, Bloomberg Law; Meta Faces Copyright Reckoning in Authors’ Generative AI Case

"The way courts will view the fair use argument for training generative artificial intelligence models with copyrighted materials will be tested Thursday in a San Francisco courtroom, when the first of dozens of such lawsuits reaches summary judgment.

Meta Platforms Inc. and a group of authors including comedian Sarah Silverman will square off before Judge Vince Chhabria, who will decide whether Meta’s use of pirated books to train its AI model Llama qualifies as fair use, or if the issue should be left to a jury."