Showing posts with label OpenAI. Show all posts

Monday, August 11, 2025

Boston Public Library aims to increase access to a vast historic archive using AI; NPR, August 11, 2025

NPR; Boston Public Library aims to increase access to a vast historic archive using AI

"Boston Public Library, one of the oldest and largest public library systems in the country, is launching a project this summer with OpenAI and Harvard Law School to make its trove of historically significant government documents more accessible to the public.

The documents date back to the early 1800s and include oral histories, congressional reports and surveys of different industries and communities...

Currently, members of the public who want to access these documents must show up in person. The project will enhance the metadata of each document and will enable users to search and cross-reference entire texts from anywhere in the world. 

Chapel said Boston Public Library plans to digitize 5,000 documents by the end of the year, and if all goes well, grow the project from there...

Harvard University said it could help. Researchers at the Harvard Law School Library's Institutional Data Initiative are working with libraries, museums and archives on a number of fronts, including training new AI models to help libraries enhance the searchability of their collections. 

AI companies help fund these efforts, and in return get to train their large language models on high-quality materials that are out of copyright and therefore less likely to lead to lawsuits. (Microsoft and OpenAI are among the many AI players targeted by recent copyright infringement lawsuits, in which plaintiffs such as authors claim the companies stole their works without permission.)"

Sunday, July 20, 2025

AI guzzled millions of books without permission. Authors are fighting back.; The Washington Post, July 19, 2025

The Washington Post; AI guzzled millions of books without permission. Authors are fighting back.


[Kip Currier: I've written this before on this blog and I'll say it again: technology companies would never allow anyone to freely vacuum up their content and use it without permission or compensation. Period. Full Stop.]


[Excerpt]

"Baldacci is among a group of authors suing OpenAI and Microsoft over the companies’ use of their work to train the AI software behind tools such as ChatGPT and Copilot without permission or payment — one of more than 40 lawsuits against AI companies advancing through the nation’s courts. He and other authors this week appealed to Congress for help standing up to what they see as an assault by Big Tech on their profession and the soul of literature.

They found sympathetic ears at a Senate subcommittee hearing Wednesday, where lawmakers expressed outrage at the technology industry’s practices. Their cause gained further momentum Thursday when a federal judge granted class-action status to another group of authors who allege that the AI firm Anthropic pirated their books.

“I see it as one of the moral issues of our time with respect to technology,” Ralph Eubanks, an author and University of Mississippi professor who is president of the Authors Guild, said in a phone interview. “Sometimes it keeps me up at night.”

Lawsuits have revealed that some AI companies had used legally dubious “torrent” sites to download millions of digitized books without having to pay for them."

Wednesday, July 16, 2025

The Pentagon is throwing $200 million at ‘Grok for Government’ and other AI companies; Task & Purpose, July 14, 2025

Task & Purpose; The Pentagon is throwing $200 million at ‘Grok for Government’ and other AI companies

"The Pentagon announced Monday it is going to spend almost $1 billion on “agentic AI workflows” from four “frontier AI” companies, including Elon Musk’s xAI, whose flagship Grok appeared to still be declaring itself “MechaHitler” as late as Monday afternoon.

In a press release, the Defense Department’s Chief Digital and Artificial Intelligence Office — or CDAO — said it will cut checks of up to $200 million each to tech giants Anthropic, Google, OpenAI and Musk’s xAI to work on:

  • “critical national security challenges;”
  • “joint mission essential tasks in our warfighting domain;”
  • “DoD use cases.”

The release did not expand on what any of that means or how AI might help. Task & Purpose reached out to the Pentagon for details on what these AI agents may soon be doing and asked specifically if the contracts would include control of live weapons systems or classified information."

Monday, June 30, 2025

Microsoft says AI system better than doctors at diagnosing complex health conditions; The Guardian, June 30, 2025

The Guardian; Microsoft says AI system better than doctors at diagnosing complex health conditions

"Microsoft has revealed details of an artificial intelligence system that performs better than human doctors at complex health diagnoses, creating a “path to medical superintelligence”.

The company’s AI unit, which is led by the British tech pioneer Mustafa Suleyman, has developed a system that imitates a panel of expert physicians tackling “diagnostically complex and intellectually demanding” cases.

Microsoft said that when paired with OpenAI’s advanced o3 AI model, its approach “solved” more than eight of 10 case studies specially chosen for the diagnostic challenge. When those case studies were tried on practising physicians – who had no access to colleagues, textbooks or chatbots – the accuracy rate was two out of 10.

Microsoft said it was also a cheaper option than using human doctors because it was more efficient at ordering tests."

Thursday, June 26, 2025

Don’t Let Silicon Valley Move Fast and Break Children’s Minds; The New York Times, June 25, 2025

JESSICA GROSE, The New York Times; Don’t Let Silicon Valley Move Fast and Break Children’s Minds

"On June 12, the toymaker Mattel announced a “strategic collaboration” with OpenAI, the developer of the large language model ChatGPT, “to support A.I.-powered products and experiences based on Mattel’s brands.” Though visions of chatbot therapist Barbie and Thomas the Tank Engine with a souped-up surveillance caboose may dance in my head, the details are still vague. Mattel affirms that ChatGPT is not intended for users under 13, and says it will comply with all safety and privacy regulations.

But who will hold either company to its public assurances? Our federal government appears allergic to any common-sense regulation of artificial intelligence. In fact, there is a provision in the version of the enormous domestic policy bill passed by the House that would bar states from “limiting, restricting or otherwise regulating artificial intelligence models, A.I. systems or automated decision systems entered into interstate commerce for 10 years.”"

Tuesday, June 24, 2025

Study: Meta AI model can reproduce almost half of Harry Potter book; Ars Technica, June 20, 2025

TIMOTHY B. LEE, Ars Technica; Study: Meta AI model can reproduce almost half of Harry Potter book

"In recent years, numerous plaintiffs—including publishers of books, newspapers, computer code, and photographs—have sued AI companies for training models using copyrighted material. A key question in all of these lawsuits has been how easily AI models produce verbatim excerpts from the plaintiffs’ copyrighted content.

For example, in its December 2023 lawsuit against OpenAI, The New York Times Company produced dozens of examples where GPT-4 exactly reproduced significant passages from Times stories. In its response, OpenAI described this as a “fringe behavior” and a “problem that researchers at OpenAI and elsewhere work hard to address.”

But is it actually a fringe behavior? And have leading AI companies addressed it? New research—focusing on books rather than newspaper articles and on different companies—provides surprising insights into this question. Some of the findings should bolster plaintiffs’ arguments, while others may be more helpful to defendants.

The paper was published last month by a team of computer scientists and legal scholars from Stanford, Cornell, and West Virginia University. They studied whether five popular open-weight models—three from Meta and one each from Microsoft and EleutherAI—were able to reproduce text from Books3, a collection of books that is widely used to train LLMs. Many of the books are still under copyright."

Copyright Cases Should Not Threaten Chatbot Users’ Privacy; Electronic Frontier Foundation (EFF), June 23, 2025

 TORI NOBLE, Electronic Frontier Foundation (EFF); Copyright Cases Should Not Threaten Chatbot Users’ Privacy

"Like users of all technologies, ChatGPT users deserve the right to delete their personal data. Nineteen U.S. States, the European Union, and a host of other countries already protect users’ right to delete. For years, OpenAI gave users the option to delete their conversations with ChatGPT, rather than let their personal queries linger on corporate servers. Now, they can’t. A badly misguided court order in a copyright lawsuit requires OpenAI to store all consumer ChatGPT conversations indefinitely—even if a user tries to delete them. This sweeping order far outstrips the needs of the case and sets a dangerous precedent by disregarding millions of users’ privacy rights.

The privacy harms here are significant. ChatGPT’s 300+ million users submit over 1 billion messages to its chatbots per day, often for personal purposes. Virtually any personal use of a chatbot—anything from planning family vacations and daily habits to creating social media posts and fantasy worlds for Dungeons and Dragons games—reveal personal details that, in aggregate, create a comprehensive portrait of a person’s entire life. Other uses risk revealing people’s most sensitive information. For example, tens of millions of Americans use ChatGPT to obtain medical and financial information. Notwithstanding other risks of these uses, people still deserve privacy rights like the right to delete their data. Eliminating protections for user-deleted data risks chilling beneficial uses by individuals who want to protect their privacy."

Wednesday, April 30, 2025

Meta Faces Copyright Reckoning in Authors’ Generative AI Case; Bloomberg Law, April 30, 2025

Isaiah Poritz, Annelise Levy, Bloomberg Law; Meta Faces Copyright Reckoning in Authors’ Generative AI Case

"The way courts will view the fair use argument for training generative artificial intelligence models with copyrighted materials will be tested Thursday in a San Francisco courtroom, when the first of dozens of such lawsuits reaches summary judgment.

Meta Platforms Inc. and a group of authors including comedian Sarah Silverman will square off before Judge Vince Chhabria, who will decide whether Meta’s use of pirated books to train its AI model Llama qualifies as fair use, or if the issue should be left to a jury."

Tuesday, April 8, 2025

OpenAI Copyright Suit Consolidation Portends Consistency, Risk; Bloomberg Law, April 8, 2025


Kyle Jahner, Bloomberg Law; OpenAI Copyright Suit Consolidation Portends Consistency, Risk

"OpenAI Inc.'s tactical win consolidating a dozen copyright suits against it nevertheless carries risks for the company, as the matters proceed before a judge who’s already ruled against the company in key decisions.

The US Judicial Panel on Multidistrict Litigation last week centralized cases across the country in the US District Court for the Southern District of New York for pretrial activity, which could include dispositive motions including summary judgment, as well as contentious discovery disputes that have been common among the cases.

“This will help create more consistency in the pre-trial outcomes, but it also means that you’ll get fewer tries from different plaintiffs to find a winning set of arguments,” Peter Henderson, an assistant professor at Princeton University, said in an email...

While streamlined, the pretrial proceedings figure to remain contentious as the parties press novel questions about how copyright laws apply to the game-changing generative AI technology. The disputes carry vast ramifications for companies reliant on millions of copyrighted works to train their models."

Friday, March 28, 2025

ChatGPT's new image generator blurs copyright lines; Axios, March 28, 2025

 Ina Fried, Axios; ChatGPT's new image generator blurs copyright lines

"AI image generators aren't new, but the one OpenAI handed to ChatGPT's legions of users this week is more powerful and has fewer guardrails than its predecessors — opening up a range of uses that are both tantalizing and terrifying."

Thursday, March 27, 2025

Judge allows 'New York Times' copyright case against OpenAI to go forward; NPR, March 27, 2025

NPR; Judge allows 'New York Times' copyright case against OpenAI to go forward

"A federal judge on Wednesday rejected OpenAI's request to toss out a copyright lawsuit from The New York Times that alleges that the tech company exploited the newspaper's content without permission or payment.

In an order allowing the lawsuit to go forward, Judge Sidney Stein, of the Southern District of New York, narrowed the scope of the lawsuit but allowed the case's main copyright infringement claims to go forward.

Stein did not immediately release an opinion but promised one would come "expeditiously."

The decision is a victory for the newspaper, which has joined forces with other publishers, including The New York Daily News and the Center for Investigative Reporting, to challenge the way that OpenAI collected vast amounts of data from the web to train its popular artificial intelligence service, ChatGPT."

Monday, March 24, 2025

Should AI be treated the same way as people are when it comes to copyright law?; The Hill, March 24, 2025

NICHOLAS CREEL, The Hill; Should AI be treated the same way as people are when it comes to copyright law?

"The New York Times’s lawsuit against OpenAI and Microsoft highlights an uncomfortable contradiction in how we view creativity and learning. While the Times accuses these companies of copyright infringement for training AI on their content, this ignores a fundamental truth: AI systems learn exactly as humans do, by absorbing, synthesizing and transforming existing knowledge into something new."

Sunday, March 16, 2025

OpenAI declares AI race “over” if training on copyrighted works isn’t fair use; Ars Technica, March 13, 2025

ASHLEY BELANGER, Ars Technica; OpenAI declares AI race “over” if training on copyrighted works isn’t fair use

"OpenAI is hoping that Donald Trump's AI Action Plan, due out this July, will settle copyright debates by declaring AI training fair use—paving the way for AI companies' unfettered access to training data that OpenAI claims is critical to defeat China in the AI race.

Currently, courts are mulling whether AI training is fair use, as rights holders say that AI models trained on creative works threaten to replace them in markets and water down humanity's creative output overall.

OpenAI is just one AI company fighting with rights holders in several dozen lawsuits, arguing that AI transforms copyrighted works it trains on and alleging that AI outputs aren't substitutes for original works.

So far, one landmark ruling favored rights holders, with a judge declaring AI training is not fair use, as AI outputs clearly threatened to replace Thomson-Reuters' legal research firm Westlaw in the market, Wired reported. But OpenAI now appears to be looking to Trump to avoid a similar outcome in its lawsuits, including a major suit brought by The New York Times."

Saturday, February 8, 2025

OpenAI says DeepSeek ‘inappropriately’ copied ChatGPT – but it’s facing copyright claims too; The Conversation, February 4, 2025

Senior Lecturer in Natural Language Processing and Lecturer in Cybersecurity, The University of Melbourne, The Conversation; OpenAI says DeepSeek ‘inappropriately’ copied ChatGPT – but it’s facing copyright claims too

"Within days, DeepSeek’s app surpassed ChatGPT in new downloads and set stock prices of tech companies in the United States tumbling. It also led OpenAI to claim that its Chinese rival had effectively pilfered some of the crown jewels from OpenAI’s models to build its own. 

In a statement to the New York Times, the company said: 

We are aware of and reviewing indications that DeepSeek may have inappropriately distilled our models, and will share information as we know more. We take aggressive, proactive countermeasures to protect our technology and will continue working closely with the US government to protect the most capable models being built here.

The Conversation approached DeepSeek for comment, but it did not respond.

But even if DeepSeek copied – or, in scientific parlance, “distilled” – at least some of ChatGPT to build R1, it’s worth remembering that OpenAI also stands accused of disrespecting intellectual property while developing its models."

Tuesday, January 28, 2025

Former OpenAI safety researcher brands pace of AI development ‘terrifying’; The Guardian, January 28, 2025

Global technology editor, The Guardian; Former OpenAI safety researcher brands pace of AI development ‘terrifying’

"A former safety researcher at OpenAI says he is “pretty terrified” about the pace of development in artificial intelligence, warning the industry is taking a “very risky gamble” on the technology.

Steven Adler expressed concerns about companies seeking to rapidly develop artificial general intelligence (AGI), a theoretical term referring to systems that match or exceed humans at any intellectual task."

Thursday, January 16, 2025

In AI copyright case, Zuckerberg turns to YouTube for his defense; TechCrunch, January 15, 2025


TechCrunch; In AI copyright case, Zuckerberg turns to YouTube for his defense

"Meta CEO Mark Zuckerberg appears to have used YouTube’s battle to remove pirated content to defend his own company’s use of a data set containing copyrighted e-books, reveals newly released snippets of a deposition he gave late last year.

The deposition, which was part of a complaint submitted to the court by plaintiffs’ attorneys, is related to the AI copyright case Kadrey v. Meta. It’s one of many such cases winding through the U.S. court system that’s pitting AI companies against authors and other IP holders. For the most part, the defendants in these cases – AI companies – claim that training on copyrighted content is “fair use.” Many copyright holders disagree."

Wednesday, January 15, 2025

'The New York Times' takes OpenAI to court. ChatGPT's future could be on the line; NPR, January 14, 2025

NPR; 'The New York Times' takes OpenAI to court. ChatGPT's future could be on the line

"A group of news organizations, led by The New York Times, took ChatGPT maker OpenAI to federal court on Tuesday in a hearing that could determine whether the tech company has to face the publishers in a high-profile copyright infringement trial.

Three publishers' lawsuits against OpenAI and its financial backer Microsoft have been merged into one case. Leading each of the three combined cases are the Times, The New York Daily News and the Center for Investigative Reporting.

Other publishers, like the Associated Press, News Corp. and Vox Media, have reached content-sharing deals with OpenAI, but the three litigants in this case are taking the opposite path: going on the offensive."

Monday, January 6, 2025

OpenAI holds off on promise to creators, fails to protect intellectual property; The American Bazaar, January 3, 2025

Vishnu Kamal, The American Bazaar; OpenAI holds off on promise to creators, fails to protect intellectual property

"OpenAI may yet again be in hot water as it seems that the tech giant may be reneging on its earlier assurances. Reportedly, in May, OpenAI said it was developing a tool to let creators specify how they want their works to be included in—or excluded from—its AI training data. But seven months later, this feature has yet to see the light of day.

Called Media Manager, the tool would “identify copyrighted text, images, audio, and video,” OpenAI said at the time, to reflect creators’ preferences “across multiple sources.” It was intended to stave off some of the company’s fiercest critics, and potentially shield OpenAI from IP-related legal challenges...

OpenAI has faced various legal challenges related to its AI technologies and operations. One major issue involves the privacy and data usage of its language models, which are trained on large datasets that may include publicly available or copyrighted material. This raises concerns over privacy violations and intellectual property rights, especially regarding whether the data used for training was obtained with proper consent.

Additionally, there are questions about the ownership of content generated by OpenAI’s models. If an AI produces a work based on copyrighted data, it is tricky to determine who owns the rights—whether it’s OpenAI, the user who prompted the AI, or the creators of the original data.

Another concern is the liability for harmful content produced by AI. If an AI generates misleading or defamatory information, legal responsibility could fall on OpenAI."