Showing posts with label publishers. Show all posts

Sunday, November 24, 2024

‘We live in a climate of fear’: graphic novelist’s Elon Musk book can’t find UK or US publisher; The Guardian, November 23, 2024

The Guardian; ‘We live in a climate of fear’: graphic novelist’s Elon Musk book can’t find UK or US publisher

"A biography by a British graphic novelist of Elon Musk is struggling to find an English-language publisher due to feared “legal consequences”.

Elon Musk: Investigation into a New Master of the World is the latest graphic novel by Darryl Cunningham, from West Yorkshire. Cunningham, 64, has written and illustrated seven nonfiction books on topics ranging from the 2008 global economic meltdown (Supercrash), to Russian leader Vladimir Putin (subtitled The Rise of a Dictator)...

“Delcourt had lawyers go over every single word and picture to make sure there were no problems. I didn’t use any information that hadn’t been published elsewhere, much of it from the book by Musk’s own mother, Maye.

“But it looks like we live in a climate of fear where the worst people have immense power, and because of this there’s a tendency for the individuals, institutions, businesses and the state to run for cover.”

Cunningham praised Delcourt, who also put out the French edition of his book on Putin, for “having the courage” to publish the book...

Cunningham said: “Knowing what I know about the man, my conclusion is that it’s incredible that such a mediocre figure can amass such wealth, but it was ever thus.”"

Thursday, November 21, 2024

OpenAI accidentally deleted potential evidence in NY Times copyright lawsuit; TechCrunch, November 20, 2024

 Kyle Wiggers , TechCrunch; OpenAI accidentally deleted potential evidence in NY Times copyright lawsuit

"OpenAI tried to recover the data — and was mostly successful. However, because the folder structure and file names were “irretrievably” lost, the recovered data “cannot be used to determine where the news plaintiffs’ copied articles were used to build [OpenAI’s] models,” per the letter.

“News plaintiffs have been forced to recreate their work from scratch using significant person-hours and computer processing time,” counsel for The Times and Daily News wrote. “The news plaintiffs learned only yesterday that the recovered data is unusable and that an entire week’s worth of its experts’ and lawyers’ work must be re-done, which is why this supplemental letter is being filed today.”

The plaintiffs’ counsel makes clear that they have no reason to believe the deletion was intentional. But they do say the incident underscores that OpenAI “is in the best position to search its own datasets” for potentially infringing content using its own tools."

Tuesday, November 5, 2024

Penguin Random House books now explicitly say ‘no’ to AI training; The Verge, October 18, 2024

Emma Roth , The Verge; Penguin Random House books now explicitly say ‘no’ to AI training

"Book publisher Penguin Random House is putting its stance on AI training in print. The standard copyright page on both new and reprinted books will now say, “No part of this book may be used or reproduced in any manner for the purpose of training artificial intelligence technologies or systems,” according to a report from The Bookseller spotted by Gizmodo. 

The clause also notes that Penguin Random House “expressly reserves this work from the text and data mining exception” in line with the European Union’s laws. The Bookseller says that Penguin Random House appears to be the first major publisher to account for AI on its copyright page. 

What gets printed on that page might be a warning shot, but it also has little to do with actual copyright law. The amended page is sort of like Penguin Random House’s version of a robots.txt file, which websites will sometimes use to ask AI companies and others not to scrape their content. But robots.txt isn’t a legal mechanism; it’s a voluntarily-adopted norm across the web. Copyright protections exist regardless of whether the copyright page is slipped into the front of the book, and fair use and other defenses (if applicable!) also exist even if the rights holder says they do not."

Friday, October 18, 2024

Penguin Random House underscores copyright protection in AI rebuff; The Bookseller, October 18, 2024

  MATILDA BATTERSBY, The Bookseller; Penguin Random House underscores copyright protection in AI rebuff

"The world’s biggest trade publisher has changed the wording on its copyright pages to help protect authors’ intellectual property from being used to train large language models (LLMs) and other artificial intelligence (AI) tools, The Bookseller can exclusively reveal.

Penguin Random House (PRH) has amended its copyright wording across all imprints globally, confirming it will appear “in imprint pages across our markets”. The new wording states: “No part of this book may be used or reproduced in any manner for the purpose of training artificial intelligence technologies or systems”, and will be included in all new titles and any backlist titles that are reprinted.

The statement also “expressly reserves [the titles] from the text and data mining exception”, in accordance with a European Parliament directive.

The move specifically to ban the use of its titles by AI firms for the development of chatbots and other digital tools comes amid a slew of copyright infringement cases in the US and reports that large tranches of pirated books have already been used by tech companies to train AI tools. In 2024, several academic publishers including Taylor & Francis, Wiley and Sage have announced partnerships to license content to AI firms.

PRH is believed to be the first of the Big Five anglophone trade publishers to amend its copyright information to reflect the acceleration of AI systems and the alleged reliance by tech companies on using published work to train language models."

Friday, October 11, 2024

Why The New York Times' lawyers are inspecting OpenAI's code in a secretive room; Business Insider, October 10, 2024

Business Insider; Why The New York Times' lawyers are inspecting OpenAI's code in a secretive room

"OpenAI is worth $157 billion largely because of the success of ChatGPT. But to build the chatbot, the company trained its models on vast quantities of text it didn't pay a penny for.

That text includes stories from The New York Times, articles from other publications, and an untold number of copyrighted books.

The examination of the code for ChatGPT, as well as for Microsoft's artificial intelligence models built using OpenAI's technology, is crucial for the copyright infringement lawsuits against the two companies.

Publishers and artists have filed about two dozen major copyright lawsuits against generative AI companies. They are out for blood, demanding a slice of the economic pie that made OpenAI the dominant player in the industry and which pushed Microsoft's valuation beyond $3 trillion. Judges deciding those cases may carve out the legal parameters for how large language models are trained in the US."

Sunday, September 29, 2024

AI could be an existential threat to publishers – that’s why Mumsnet is fighting back; The Guardian, September 28, 2024

The Guardian; AI could be an existential threat to publishers – that’s why Mumsnet is fighting back

"After nearly 25 years as a founder of Mumsnet, I considered myself pretty unshockable when it came to the workings of big tech. But my jaw hit the floor last week when I read that Google was pushing to overhaul UK copyright law in a way that would allow it to freely mine other publishers’ content for commercial gain without compensation.

At Mumsnet, we’ve been on the sharp end of this practice, and have recently launched the first British legal action against the tech giant OpenAI. Earlier in the year, we became aware that it was scraping our content – presumably to train its large language model (LLM). Such scraping without permission is a breach of copyright laws and explicitly of our terms of use, so we approached OpenAI and suggested a licensing deal. After lengthy talks (and signing a non-disclosure agreement), it told us it wasn’t interested, saying it was after “less open” data sources...

If publishers wither and die because the AIs have hoovered up all their traffic, then who’s left to produce the content to feed the models? And let’s be honest – it’s not as if these tech giants can’t afford to properly compensate publishers. OpenAI is currently fundraising to the tune of $6.5bn, the single largest venture capital round of all time, valuing the enterprise at a cool $150bn. In fact, it has just been reported that the company is planning to change its structure and become a for-profit enterprise...

I’m not anti-AI. It plainly has the potential to advance human progress and improve our lives in myriad ways. We used it at Mumsnet to build MumsGPT, which uncovers and summarises what parents are thinking about – everything from beauty trends to supermarkets to politicians – and we licensed OpenAI’s API (application programming interface) to build it. Plus, we think there are some very good reasons why these AI models should ingest Mumsnet’s conversations to train their models. The 6bn-plus words on Mumsnet are a unique record of 24 years of female interaction about everything from global politics to relationships with in-laws. By contrast, most of the content on the web was written by and for men. AI models have misogyny baked in and we’d love to help counter their gender bias.

But Google’s proposal to change our laws would allow billion-dollar companies to waltz untrammelled over any notion of a fair value exchange in the name of rapid “development”. Everything that’s unique and brilliant about smaller publisher sites would be lost, and a handful of Silicon Valley giants would be left with even more control over the world’s content and commerce."

Monday, September 9, 2024

Internet Archive Court Loss Leaves Higher Ed in Gray Area; Inside Higher Ed, September 9, 2024

  Lauren Coffey, Inside Higher Ed; Internet Archive Court Loss Leaves Higher Ed in Gray Area

"Pandemic-era library programs that helped students access books online could be potentially threatened by an appeals court ruling last week. 

Libraries across the country, from Carnegie Mellon University to the University of California system, turned to what’s known as a digital or controlled lending program in 2020, which gave students a way to borrow books that weren’t otherwise available. Those programs are small in scale and largely experimental but part of a broader shift in modernizing the university library.

But the appeals court ruling could upend those programs...

Still, librarians at colleges and elsewhere, along with other experts, feared that the long-running legal fight between the Internet Archive and leading publishers could imperil the ability of libraries to own and preserve books, among other ramifications."

Thursday, September 5, 2024

The Internet Archive Loses Its Appeal of a Major Copyright Case; Wired, September 4, 2024

 Kate Knibbs, Wired; The Internet Archive Loses Its Appeal of a Major Copyright Case

"THE INTERNET ARCHIVE has lost a major legal battle—in a decision that could have a significant impact on the future of internet history. Today, the US Court of Appeals for the Second Circuit ruled against the long-running digital archive, upholding an earlier ruling in Hachette v. Internet Archive that found that one of the Internet Archive’s book digitization projects violated copyright law.

Notably, the appeals court’s ruling rejects the Internet Archive’s argument that its lending practices were shielded by the fair use doctrine, which permits for copyright infringement in certain circumstances, calling it “unpersuasive.”"

Friday, August 30, 2024

Major publishers sue Florida over ‘unconstitutional’ school book ban; The Guardian, August 30, 2024

The Guardian; Major publishers sue Florida over ‘unconstitutional’ school book ban

"Six major book publishers have teamed up to sue the US state of Florida over an “unconstitutional” law that has seen hundreds of titles purged from school libraries following rightwing challenges.

The landmark action targets the “sweeping book removal provisions” of House Bill 1069, which required school districts to set up a mechanism for parents to object to anything they considered pornographic or inappropriate.

A central plank of Republican governor Ron DeSantis’s war on “woke” on Florida campuses, the law has been abused by rightwing activists who quickly realized that any book they challenged had to be immediately removed and replaced only after the exhaustion of a lengthy and cumbersome review process, if at all, the publishers say.

Since it went into effect last July, countless titles have been removed from elementary, middle and high school libraries, including American classics such as Brave New World by Aldous Huxley, For Whom the Bell Tolls by Ernest Hemingway and The Adventures of Tom Sawyer by Mark Twain.

Contemporary novels by bestselling authors such as Margaret Atwood, Judy Blume and Stephen King have also been removed, as well as The Diary of a Young Girl, Anne Frank’s gripping account of the Holocaust, according to the publishers."

Thursday, August 29, 2024

OpenAI Pushes Prompt-Hacking Defense to Deflect Copyright Claims; Bloomberg Law, August 29, 2024

 Annelise Gilbert, Bloomberg Law; OpenAI Pushes Prompt-Hacking Defense to Deflect Copyright Claims

"Diverting attention to hacking claims or how many tries it took to obtain exemplary outputs, however, avoids addressing most publishers’ primary allegation: AI tools illegally trained on copyrighted works."

Tuesday, July 23, 2024

The Data That Powers A.I. Is Disappearing Fast; The New York Times, July 19, 2024

 Kevin Roose , The New York Times; The Data That Powers A.I. Is Disappearing Fast

"For years, the people building powerful artificial intelligence systems have used enormous troves of text, images and videos pulled from the internet to train their models.

Now, that data is drying up.

Over the past year, many of the most important web sources used for training A.I. models have restricted the use of their data, according to a study published this week by the Data Provenance Initiative, an M.I.T.-led research group.

The study, which looked at 14,000 web domains that are included in three commonly used A.I. training data sets, discovered an “emerging crisis in consent,” as publishers and online platforms have taken steps to prevent their data from being harvested.

The researchers estimate that in the three data sets — called C4, RefinedWeb and Dolma — 5 percent of all data, and 25 percent of data from the highest-quality sources, has been restricted. Those restrictions are set up through the Robots Exclusion Protocol, a decades-old method for website owners to prevent automated bots from crawling their pages using a file called robots.txt."
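To make the robots.txt mechanism mentioned above concrete, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The robots.txt content, the crawler name `ExampleAIBot`, the helper `is_allowed`, and the example.com URL are all hypothetical and chosen only for illustration. The file is a request that well-behaved crawlers choose to honor, not an enforcement mechanism.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: asks one (invented) AI crawler to stay out
# while leaving the site open to every other user agent.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/1"
    print("ExampleAIBot allowed:", is_allowed(ROBOTS_TXT, "ExampleAIBot", url))
    print("SomeOtherBot allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot", url))
```

A crawler that respects the protocol would run a check like this before each request; one that ignores it faces no technical barrier, which is why the restrictions counted in the study depend on voluntary compliance.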

Tuesday, July 9, 2024

Record labels sue AI music startups for copyright infringement; WBUR Here & Now, July 8, 2024

  WBUR Here & Now; Record labels sue AI music startups for copyright infringement

"Major record labels including Sony, Universal Music Group and Warner are suing two music startups that use artificial intelligence. The labels say Suno and Udio rely on mass copyright infringement, echoing similar complaints from authors, publishers and artists who argue that generative AI infringes on copyright.

Here & Now's Lisa Mullins discusses the cases with Ina Fried, chief technology correspondent for Axios."

Monday, July 1, 2024

Internet Archive forced to remove 500,000 books after publishers’ court win; Ars Technica, June 21, 2024

Ars Technica; Internet Archive forced to remove 500,000 books after publishers’ court win

"As a result of book publishers successfully suing the Internet Archive (IA) last year, the free online library that strives to keep growing online access to books recently shrank by about 500,000 titles.

IA reported in a blog post this month that publishers abruptly forcing these takedowns triggered a "devastating loss" for readers who depend on IA to access books that are otherwise impossible or difficult to access.

To restore access, IA is now appealing, hoping to reverse the prior court's decision by convincing the US Court of Appeals in the Second Circuit that IA's controlled digital lending of its physical books should be considered fair use under copyright law."

Sunday, June 30, 2024

Tech companies battle content creators over use of copyrighted material to train AI models; The Canadian Press via CBC, June 30, 2024

 Anja Karadeglija , The Canadian Press via CBC; Tech companies battle content creators over use of copyrighted material to train AI models

"Canadian creators and publishers want the government to do something about the unauthorized and usually unreported use of their content to train generative artificial intelligence systems.

But AI companies maintain that using the material to train their systems doesn't violate copyright, and say limiting its use would stymie the development of AI in Canada.

The two sides are making their cases in recently published submissions to a consultation on copyright and AI being undertaken by the federal government as it considers how Canada's copyright laws should address the emergence of generative AI systems like OpenAI's ChatGPT."

Monday, June 17, 2024

An epidemic of scientific fakery threatens to overwhelm publishers; The Washington Post, June 11, 2024

The Washington Post; An epidemic of scientific fakery threatens to overwhelm publishers

"A record number of retractions — more than 10,000 scientific papers in 2023. Nineteen academic journals shut down recently after being overrun by fake research from paper mills. A single researcher with more than 200 retractions.

The numbers don’t lie: Scientific publishing has a problem, and it’s getting worse. Vigilance against fraudulent or defective research has always been necessary, but in recent years the sheer amount of suspect material has threatened to overwhelm publishers.

We were not the first to write about scientific fraud and problems in academic publishing when we launched Retraction Watch in 2010 with the aim of covering the subject regularly."

Tuesday, June 4, 2024

Google’s A.I. Search Leaves Publishers Scrambling; The New York Times, June 1, 2024

  Nico Grant and , The New York Times; Google’s A.I. Search Leaves Publishers Scrambling

"In May, Google announced that the A.I.-generated summaries, which compile content from news sites and blogs on the topic being searched, would be made available to everyone in the United States. And that change has Mr. Pine and many other publishing executives worried that the paragraphs pose a big danger to their brittle business model, by sharply reducing the amount of traffic to their sites from Google.

“It potentially chokes off the original creators of the content,” Mr. Pine said. The feature, AI Overviews, felt like another step toward generative A.I. replacing “the publications that they have cannibalized,” he added."

Wednesday, March 27, 2024

Amicus Briefs Filed in Internet Archive Copyright Case; Publishers Weekly, March 25, 2024

 Andrew Albanese , Publishers Weekly; Amicus Briefs Filed in Internet Archive Copyright Case

"Internet Archive lawyers filed their principal appeal brief on December 15, and 11 amicus briefs were filed in support of the Internet Archive a week later, in December, representing librarians and library associations, authors, public advocacy groups, law professors, and IP scholars, although some of the IA amicus briefs are presented as neutral.

The briefs are the latest development in the long-running copyright infringement case and come a year after a ruling by judge John G. Koeltl on March 24, 2023 that emphatically rejected the IA’s fair use defense, finding the scanning and lending of print library books under a protocol known as “controlled digital lending” to be copyright infringement.

The Internet Archive’s reply brief is now due on April 19, and oral arguments are expected to be set for this fall."

Sunday, December 31, 2023

Federal judge blocks enforcement of Iowa’s book ban law; Iowa Public Radio, December 29, 2023

 Grant Gerlock, Iowa Public Radio ; Federal judge blocks enforcement of Iowa’s book ban law

"A federal judge has blocked the state of Iowa from enforcing major portions of an education law, SF 496, which has caused school districts to pull hundreds of books from library shelves.

The temporary injunction prevents enforcement of a ban on books with sexually explicit content, which the judge in the case said likely violates the First Amendment. It also blocks a section barring instruction relating to sexual orientation and gender identity in elementary school, which he called “void for vagueness.”

The decision follows a hearing last week that combined arguments from two separate challenges against the law signed by Gov. Kim Reynolds in May. A lawsuit brought by LGBTQ students calls the law discriminatory while another from a group of educators and the publisher Penguin Random House claims it violates their freedom of speech.

Enforcement provisions in the law that apply to book removals were set to take effect January 1...

Judge Stephen Locher said in his ruling released late Friday afternoon that the court was unable to find another school library book restriction “even remotely similar to Senate File 496.” Where lawmakers should use a scalpel, he said, SF 496 is a “bulldozer” that has pulled books out of schools that are widely regarded as important works.

“The underlying message is that there is no redeeming value to any such book even if it is a work of history, self-help guide, award-winning novel, or other piece of serious literature,” Locher wrote. “In effect, the Legislature has imposed a puritanical ‘pall of orthodoxy’ over school libraries.”"

Thursday, October 19, 2023

AI is learning from stolen intellectual property. It needs to stop.; The Washington Post, October 19, 2023

William D. Cohan , The Washington Post; AI is learning from stolen intellectual property. It needs to stop.

"The other day someone sent me the searchable database published by Atlantic magazine of more than 191,000 e-books that have been used to train the generative AI systems being developed by Meta, Bloomberg and others. It turns out that four of my seven books are in the data set, called Books3. Whoa.

Not only did I not give permission for my books to be used to generate AI products, but I also wasn’t even consulted about it. I had no idea this was happening. Neither did my publishers, Penguin Random House (for three of the books) and Macmillan (for the other one). Neither my publishers nor I were compensated for use of my intellectual property. Books3 just scraped the content away for free, with Meta et al. profiting merrily along the way. And Books3 is just one of many pirated collections being used for this purpose...

This is wholly unacceptable behavior. Our books are copyrighted material, not free fodder for wealthy companies to use as they see fit, without permission or compensation. Many, many hours of serious research, creative angst and plain old hard work go into writing and publishing a book, and few writers are compensated like professional athletes, Hollywood actors or Wall Street investment bankers. Stealing our intellectual property hurts." 

Sunday, September 24, 2023

New S&S Program, Books Belong, Takes Aim at Book Bans; Publishers Weekly, September 20, 2023

 Jim Milliot , Publishers Weekly; New S&S Program, Books Belong, Takes Aim at Book Bans

"Simon & Schuster is introducing a new "multi-platform education and resources program," Books Belong, during this year's Banned Books Week (October 1-7), as part of an effort to expand the publisher's response to the book bans and challenges.

The program, the publisher said, will "highlight the merits of books that have been subject to bans and challenges and will provide educators, parents, librarians, and students with tools and resources" on how to "take action when faced with a challenge in their community" and "incorporate banned and challenged books into classroom, library, and family reading time." 

The initiative's website will host "reading group guides and videos, book lists, giveaways, exclusive author and expert content, and links to additional resources," S&S said, including those from Unite Against Book Bans and PEN America."