Ethics, Info, Tech: Contested Voices, Values, Spaces: pirated books

Showing posts with label pirated books. Show all posts

Friday, October 3, 2025

Harvard Professors May Be Eligible for Payments in $1.5 Billion AI Copyright Settlement; The Harvard Crimson, October 1, 2025

Victoria D. Rengel, The Harvard Crimson; Harvard Professors May Be Eligible for Payments in $1.5 Billion AI Copyright Settlement

"Following mediation, the plaintiffs and defendants filed a motion for the preliminary approval of a settlement on Sept. 5, which included an agreement from Anthropic that it will destroy its pirated databases and pay $1.5 billion in damages to a group of authors and publishers.

On Sept. 25, a California federal judge granted preliminary approval for a settlement, the largest in the history of copyright cases in the U.S.

Each member of the class will receive a payment of approximately $3,000 per pirated work.

Authors whose works are in the databases are not notified separately, but instead must submit their contact information to receive a formal notice of the class action — meaning a number of authors, including many Harvard professors, may be unaware that their works were pirated by Anthropic.

Lynch said Anthropic’s nonconsensual use of her work undermines the purpose behind why she, and other scholars, write and publish their work.

“All of us at Harvard publish, but we thought when we were publishing that we are doing that — to communicate to other human beings,” she said. “Not to be fed into this mill.”"

Monday, September 29, 2025

I Sued Anthropic, and the Unthinkable Happened; The New York Times, September 29, 2025

Andrea Bartz , The New York Times; I Sued Anthropic, and the Unthinkable Happened

"In August 2024, I became one of three named plaintiffs leading a class-action lawsuit against the A.I. company Anthropic for pirating my books and hundreds of thousands of other books to train its A.I. The fight felt daunting, almost preposterous: me — a queer, female thriller writer — versus a company now worth $183 billion?

Thanks to the relentless work of everyone on my legal team, the unthinkable happened: Anthropic agreed to pay authors and publishers $1.5 billion in the largest copyright settlement in history. A federal judge preliminarily approved the agreement last week.

This settlement sends a clear message to the Big Tech companies splashing generative A.I. over every app and page and program: You are not above the law. And it should signal to consumers everywhere that A.I. isn’t an unstoppable tsunami about to overwhelm us. Now is the time for ordinary Americans to recognize our agency and act to put in place the guardrails we want.

The settlement isn’t perfect. It’s absurd that it took an army of lawyers to demonstrate what any 10-year-old knows is true: Thou shalt not steal. At around $3,000 per work, shared by the author and publisher, the damages are far from life-changing (and, some argue, a slap on the wrist for a company flush with cash). I also disagree with the judge’s ruling that, had Anthropic acquired the books legally, training its chatbot on them would have been “fair use.” I write my novels to engage human minds — not to empower an algorithm to mimic my voice and spit out commodity knockoffs to compete directly against my originals in the marketplace, nor to make that algorithm’s creators unfathomably wealthy and powerful.

But as my fellow plaintiff Kirk Wallace Johnson put it, this is “the beginning of a fight on behalf of humans that don’t believe we have to sacrifice everything on the altar of A.I.” Anthropic will destroy its trove of illegally downloaded books; its competitors should take heed to get out of the business of piracy as well. Dozens of A.I. copyright lawsuits have been filed against OpenAI, Microsoft and other companies, led in part by Sylvia Day, Jonathan Franzen, David Baldacci, John Grisham, Stacy Schiff and George R. R. Martin. (The New York Times has also brought a suit against OpenAI and Microsoft.)

Though a settlement isn’t legal precedent, Bartz v. Anthropic may serve as a test case for other A.I. lawsuits, the first domino to fall in an industry whose “move fast, break things” modus operandi led to large-scale theft. Among the plaintiffs of other cases are voice actors, visual artists, record labels, YouTubers, media companies and stock-photo libraries, diverse stakeholders who’ve watched Big Tech encroach on their territory with little regard for copyright law...

Now the book publishing industry has sent a message to all A.I. companies: Our intellectual property isn’t yours for the taking, and you cannot act with impunity. This settlement is an opening gambit in a critical battle that will be waged for years to come."

Saturday, September 27, 2025

Judge approves $1.5 billion copyright settlement between AI company Anthropic and authors; AP, September 25, 2025

BARBARA ORTUTAY , AP; Judge approves $1.5 billion copyright settlement between AI company Anthropic and authors

" A federal judge on Thursday approved a $1.5 billion settlement between artificial intelligence company Anthropic and authors who allege nearly half a million books had been illegally pirated to train chatbots.

U.S. District Judge William Alsup issued the preliminary approval in San Francisco federal court Thursday after the two sides worked to address his concerns about the settlement, which will pay authors and publishers about $3,000 for each of the books covered by the agreement. It does not apply to future works."

Tuesday, September 23, 2025

Screw the money — Anthropic’s $1.5B copyright settlement sucks for writers; TechCrunch, September 5, 2025

Amanda Silberling , TechCrunch; Screw the money — Anthropic’s $1.5B copyright settlement sucks for writers

"But writers aren’t getting this settlement because their work was fed to an AI — this is just a costly slap on the wrist for Anthropic, a company that just raised another $13 billion, because it illegally downloaded books instead of buying them.

In June, federal judge William Alsup sided with Anthropic and ruled that it is, indeed, legal to train AI on copyrighted material. The judge argues that this use case is “transformative” enough to be protected by the fair use doctrine, a carve-out of copyright law that hasn’t been updated since 1976.

“Like any reader aspiring to be a writer, Anthropic’s LLMs trained upon works not to race ahead and replicate or supplant them — but to turn a hard corner and create something different,” the judge said.

It was the piracy — not the AI training — that moved Judge Alsup to bring the case to trial, but with Anthropic’s settlement, a trial is no longer necessary."

Monday, July 28, 2025

A copyright lawsuit over pirated books could result in ‘business-ending’ damages for Anthropic; Fortune, July 28, 2025

BEATRICE NOLAN , Fortune; A copyright lawsuit over pirated books could result in ‘business-ending’ damages for Anthropic

"A class-action lawsuit against Anthropic could expose the AI company to billions in copyright damages over its alleged use of pirated books from shadow libraries like LibGen and PiLiMi to train its models. While a federal judge ruled that training on lawfully obtained books may qualify as fair use, the court will hold a separate trial to address the allegedly illegal acquisition and storage of copyrighted works. Legal experts warn that statutory damages could be severe, with estimates ranging from $1 billion to over $100 billion."

Wednesday, July 9, 2025

Why the new rulings on AI copyright might actually be good news for publishers; Fast Company, July 9, 2025

PETE PACHAL, Fast Company; Why the new rulings on AI copyright might actually be good news for publishers

"The outcomes of both cases were more mixed than the headlines suggest, and they are also deeply instructive. Far from closing the door on copyright holders, they point to places where litigants might find a key...

Taken together, the three cases point to a clearer path forward for publishers building copyright cases against Big AI:
Focus on outputs instead of inputs: It’s not enough that someone hoovered up your work. To build a solid case, you need to show that what the AI company did with it reproduced it in some form. So far, no court has definitively decided whether AI outputs are meaningfully different enough to count as “transformative” in the eyes of copyright law, but it should be noted that courts have ruled in the past that copyright violation can occur even when small parts of the work are copied—ifthose parts represent the “heart” of the original.
Show market harm: This looks increasingly like the main battle. Now that we have a lot of data on how AI search engines and chatbots—which, to be clear, are outputs—are affecting the online behavior of news consumers, the case that an AI service harms the media market is easier to make than it was a year ago. In addition, the emergence of licensing deals between publishers and AI companies is evidence that there’s market harm by creating outputs without offering such a deal.
Question source legitimacy: Was the content legally acquired or pirated? The Anthropic case opens this up as a possible attack vector for publishers. If they can prove scraping occurred through paywalls—without subscribing first—that could be a violation even absent any outputs."

Tuesday, July 1, 2025

AI companies start winning the copyright fight; The Guardian, July 1, 2025

Blake Montgomery , The Guardian; AI companies start winning the copyright fight

"The lawsuits over AI-generated text were filed first, and, as their rulings emerge, the next question in the copyright fight is whether decisions about one type of media will apply to the next.

“The specific media involved in the lawsuit – written works versus images versus videos versus audio – will certainly change the fair-use analysis in each case,” said John Strand, a trademark and copyright attorney with the law firm Wolf Greenfield. “The impact on the market for the copyrighted works is becoming a key factor in the fair-use analysis, and the market for books is different than that for movies.”

To Strand, the cases over images seem more favorable to copyright holders, as the AI models are allegedly producing images identical to the copyrighted ones in the training data.

A bizarre and damning fact was revealed in the Anthropic ruling, too: the company had pirated and stored some 7m books to create a training database for its AI. To remediate its wrongdoing, the company bought physical copies and scanned them, digitizing the text. Now the owner of 7m physical books that no longer held any utility for it, Anthropic destroyed them. The company bought the books, diced them up, scanned the text and threw them away, Ars Technica reports. There are less destructive ways to digitize books, but they are slower. The AI industry is here to move fast and break things.

Anthropic laying waste to millions of books presents a crude literalization of the ravenous consumption of content necessary for AI companies to create their products."

Tuesday, June 24, 2025

Anthropic’s AI copyright ‘win’ is more complicated than it looks; Fast Company, June 24, 2025

CHRIS STOKEL-WALKER, Fast Company;Anthropic’s AI copyright ‘win’ is more complicated than it looks

"And that’s the catch: This wasn’t an unvarnished win for Anthropic. Like other tech companies, Anthropic allegedly sourced training materials from piracy sites for ease—a fact that clearly troubled the court. “This order doubts that any accused infringer could ever meet its burden of explaining why downloading source copies from pirate sites that it could have purchased or otherwise accessed lawfully was itself reasonably necessary to any subsequent fair use,” Alsup wrote, referring to Anthropic’s alleged pirating of more than 7 million books.

That alone could carry billions in liability, with statutory damages starting at $750 per book—a trial on that issue is still to come.

So while tech companies may still claim victory (with some justification, given the fair use precedent), the same ruling also implies that companies will need to pay substantial sums to legally obtain training materials. OpenAI, for its part, has in the past argued that licensing all the copyrighted material needed to train its models would be practically impossible.

Joanna Bryson, a professor of AI ethics at the Hertie School in Berlin, says the ruling is “absolutely not” a blanket win for tech companies. “First of all, it’s not the Supreme Court. Secondly, it’s only one jurisdiction: The U.S.,” she says. “I think they don’t entirely have purchase over this thing about whether or not it was transformative in the sense of changing Claude’s output.”"

Thursday, January 16, 2025

In AI copyright case, Zuckerberg turns to YouTube for his defense; TechCrunch, January 15, 2025

Kyle Wiggers

Maxwell Zeff

, TechCrunch ; In AI copyright case, Zuckerberg turns to YouTube for his defense

"Meta CEO Mark Zuckerberg appears to have used YouTube’s battle to remove pirated content to defend his own company’s use of a data set containing copyrighted e-books, reveals newly released snippets of a deposition he gave late last year.

The deposition, which was part of a complaint submitted to the court by plaintiffs’ attorneys, is related to the AI copyright case Kadrey v. Meta. It’s one of many such cases winding through the U.S. court system that’s pitting AI companies against authors and other IP holders. For the most part, the defendants in these cases – AI companies – claim that training on copyrighted content is “fair use.” Many copyright holders disagree."

Tuesday, August 20, 2024

Authors sue Claude AI chatbot creator Anthropic for copyright infringement; AP, August 19, 2024

MATT O’BRIEN, AP; Authors sue Claude AI chatbot creator Anthropic for copyright infringement

"A group of authors is suing artificial intelligence startup Anthropic, alleging it committed “large-scale theft” in training its popular chatbot Claude on pirated copies of copyrighted books.

While similar lawsuits have piled up for more than a year against competitor OpenAI, maker of ChatGPT, this is the first from writers to target Anthropic and its Claude chatbot.

The smaller San Francisco-based company — founded by ex-OpenAI leaders — has marketed itself as the more responsible and safety-focused developer of generative AI models that can compose emails, summarize documents and interact with people in a natural way...

The lawsuit was brought by a trio of writers — Andrea Bartz, Charles Graeber and Kirk Wallace Johnson — who are seeking to represent a class of similarly situated authors of fiction and nonfiction...

What links all the cases is the claim that tech companies ingested huge troves of human writings to train AI chatbots to produce human-like passages of text, without getting permission or compensating the people who wrote the original works. The legal challenges are coming not just from writers but visual artists, music labels and other creators who allege that generative AI profits have been built on misappropriation...

But the lawsuit against Anthropic accuses it of using a dataset called The Pile that included a trove of pirated books. It also disputes the idea that AI systems are learning the way humans do."