Showing posts with label pirated books. Show all posts

Wednesday, July 9, 2025

Why the new rulings on AI copyright might actually be good news for publishers; Fast Company, July 9, 2025

 PETE PACHAL, Fast Company; Why the new rulings on AI copyright might actually be good news for publishers

"The outcomes of both cases were more mixed than the headlines suggest, and they are also deeply instructive. Far from closing the door on copyright holders, they point to places where litigants might find a key...

Taken together, the three cases point to a clearer path forward for publishers building copyright cases against Big AI:

Focus on outputs instead of inputs: It’s not enough that someone hoovered up your work. To build a solid case, you need to show that what the AI company did with it reproduced it in some form. So far, no court has definitively decided whether AI outputs are meaningfully different enough to count as “transformative” in the eyes of copyright law, but it should be noted that courts have ruled in the past that copyright violation can occur even when small parts of the work are copied—if those parts represent the “heart” of the original.

Show market harm: This looks increasingly like the main battle. Now that we have a lot of data on how AI search engines and chatbots—which, to be clear, are outputs—are affecting the online behavior of news consumers, the case that an AI service harms the media market is easier to make than it was a year ago. In addition, the emergence of licensing deals between publishers and AI companies is evidence that there’s market harm by creating outputs without offering such a deal.

Question source legitimacy: Was the content legally acquired or pirated? The Anthropic case opens this up as a possible attack vector for publishers. If they can prove scraping occurred through paywalls—without subscribing first—that could be a violation even absent any outputs."

Tuesday, July 1, 2025

AI companies start winning the copyright fight; The Guardian, July 1, 2025

 The Guardian; AI companies start winning the copyright fight

"The lawsuits over AI-generated text were filed first, and, as their rulings emerge, the next question in the copyright fight is whether decisions about one type of media will apply to the next.

“The specific media involved in the lawsuit – written works versus images versus videos versus audio – will certainly change the fair-use analysis in each case,” said John Strand, a trademark and copyright attorney with the law firm Wolf Greenfield. “The impact on the market for the copyrighted works is becoming a key factor in the fair-use analysis, and the market for books is different than that for movies.”

To Strand, the cases over images seem more favorable to copyright holders, as the AI models are allegedly producing images identical to the copyrighted ones in the training data.

A bizarre and damning fact was revealed in the Anthropic ruling, too: the company had pirated and stored some 7m books to create a training database for its AI. To remediate its wrongdoing, the company bought physical copies and scanned them, digitizing the text. Now the owner of 7m physical books that no longer held any utility for it, Anthropic destroyed them. The company bought the books, diced them up, scanned the text and threw them away, Ars Technica reports. There are less destructive ways to digitize books, but they are slower. The AI industry is here to move fast and break things.

Anthropic laying waste to millions of books presents a crude literalization of the ravenous consumption of content necessary for AI companies to create their products."

Tuesday, June 24, 2025

Anthropic’s AI copyright ‘win’ is more complicated than it looks; Fast Company, June 24, 2025

 CHRIS STOKEL-WALKER, Fast Company; Anthropic’s AI copyright ‘win’ is more complicated than it looks

"And that’s the catch: This wasn’t an unvarnished win for Anthropic. Like other tech companies, Anthropic allegedly sourced training materials from piracy sites for ease—a fact that clearly troubled the court. “This order doubts that any accused infringer could ever meet its burden of explaining why downloading source copies from pirate sites that it could have purchased or otherwise accessed lawfully was itself reasonably necessary to any subsequent fair use,” Alsup wrote, referring to Anthropic’s alleged pirating of more than 7 million books.

That alone could carry billions in liability, with statutory damages starting at $750 per book—a trial on that issue is still to come.

So while tech companies may still claim victory (with some justification, given the fair use precedent), the same ruling also implies that companies will need to pay substantial sums to legally obtain training materials. OpenAI, for its part, has in the past argued that licensing all the copyrighted material needed to train its models would be practically impossible.

Joanna Bryson, a professor of AI ethics at the Hertie School in Berlin, says the ruling is “absolutely not” a blanket win for tech companies. “First of all, it’s not the Supreme Court. Secondly, it’s only one jurisdiction: The U.S.,” she says. “I think they don’t entirely have purchase over this thing about whether or not it was transformative in the sense of changing Claude’s output.”"

Thursday, January 16, 2025

In AI copyright case, Zuckerberg turns to YouTube for his defense; TechCrunch, January 15, 2025

 

 TechCrunch; In AI copyright case, Zuckerberg turns to YouTube for his defense

"Meta CEO Mark Zuckerberg appears to have used YouTube’s battle to remove pirated content to defend his own company’s use of a data set containing copyrighted e-books, reveals newly released snippets of a deposition he gave late last year.

The deposition, which was part of a complaint submitted to the court by plaintiffs’ attorneys, is related to the AI copyright case Kadrey v. Meta. It’s one of many such cases winding through the U.S. court system that’s pitting AI companies against authors and other IP holders. For the most part, the defendants in these cases – AI companies – claim that training on copyrighted content is “fair use.” Many copyright holders disagree."

Tuesday, August 20, 2024

Authors sue Claude AI chatbot creator Anthropic for copyright infringement; AP, August 19, 2024

 MATT O’BRIEN, AP; Authors sue Claude AI chatbot creator Anthropic for copyright infringement

"A group of authors is suing artificial intelligence startup Anthropic, alleging it committed “large-scale theft” in training its popular chatbot Claude on pirated copies of copyrighted books.

While similar lawsuits have piled up for more than a year against competitor OpenAI, maker of ChatGPT, this is the first from writers to target Anthropic and its Claude chatbot.

The smaller San Francisco-based company — founded by ex-OpenAI leaders — has marketed itself as the more responsible and safety-focused developer of generative AI models that can compose emails, summarize documents and interact with people in a natural way...

The lawsuit was brought by a trio of writers — Andrea Bartz, Charles Graeber and Kirk Wallace Johnson — who are seeking to represent a class of similarly situated authors of fiction and nonfiction...

What links all the cases is the claim that tech companies ingested huge troves of human writings to train AI chatbots to produce human-like passages of text, without getting permission or compensating the people who wrote the original works. The legal challenges are coming not just from writers but visual artists, music labels and other creators who allege that generative AI profits have been built on misappropriation...

But the lawsuit against Anthropic accuses it of using a dataset called The Pile that included a trove of pirated books. It also disputes the idea that AI systems are learning the way humans do."