Showing posts with label AI hallucinations.

Wednesday, November 12, 2025

Vigilante Lawyers Expose the Rising Tide of A.I. Slop in Court Filings; The New York Times, November 7, 2025

The New York Times; Vigilante Lawyers Expose the Rising Tide of A.I. Slop in Court Filings

"Mr. Freund is part of a growing network of lawyers who track down A.I. abuses committed by their peers, collecting the most egregious examples and posting them online. The group hopes that by tracking down the A.I. slop, it can help draw attention to the problem and put an end to it.

While judges and bar associations generally agree that it’s fine for lawyers to use chatbots for research, they must still ensure their filings are accurate.

But as the technology has taken off, so has misuse. Chatbots frequently make things up, and judges are finding more and more fake case law citations, which are then rounded up by the legal vigilantes.

“These cases are damaging the reputation of the bar,” said Stephen Gillers, an ethics professor at New York University School of Law. “Lawyers everywhere should be ashamed of what members of their profession are doing.”...

The problem, though, keeps getting worse.

That’s why Damien Charlotin, a lawyer and researcher in France, started an online database in April to track it.

Initially he found three or four examples a month. Now he often receives that many in a day.

Many lawyers, including Mr. Freund and Mr. Schaefer, have helped him document 509 cases so far. They use legal tools like LexisNexis for notifications on keywords like “artificial intelligence,” “fabricated cases” and “nonexistent cases.”

Some of the filings include fake quotes from real cases, or cite real cases that are irrelevant to their arguments. The legal vigilantes uncover them by finding judges’ opinions scolding lawyers."

You’re a Computer Science Major. Don’t Panic.; The New York Times, November 12, 2025

Mary Shaw et al., The New York Times; You’re a Computer Science Major. Don’t Panic.

"The future of computer science education is to teach students how to master the indispensable skill of supervision.

Why? Because the speed and efficiency of using A.I. to write code is balanced by the reality that it often gets things wrong. These tools are designed to produce results that look convincing, but may still contain errors. A recent survey showed that over half of professional developers use A.I. tools daily, but only about one-third trust their accuracy. When asked what their greatest frustration is about using A.I. tools, two-thirds of respondents answered, “A.I. solutions that are almost right but not quite.”

There is still a need for humans to play a role in coding — a supervisory one, where programmers oversee the use of A.I. tools, determine if A.I.-generated code does what it is supposed to do and make essential repairs to defective code."

Sunday, November 9, 2025

California Prosecutor Says AI Caused Errors in Criminal Case; Sacramento Bee via Government Technology, November 7, 2025

Sharon Bernstein, Sacramento Bee via Government Technology; California Prosecutor Says AI Caused Errors in Criminal Case

"Northern California prosecutors used artificial intelligence to write a criminal court filing that contained references to nonexistent legal cases and precedents, Nevada County District Attorney Jesse Wilson said in a statement.

The motion included false information known in artificial intelligence circles as “hallucinations,” meaning that it was invented by the AI software asked to write the material, Wilson said. It was filed in connection with the case of Kalen Turner, who was accused of five felony and two misdemeanor drug counts, he said.

The situation is the latest example of the potential pitfalls connected with the growing use of AI. In fields such as law, errors in AI-generated briefs could impact the freedom of a person accused of a crime. In health care, AI analysis of medical necessity has resulted in the denial of some types of care. In April, a 16-year-old Rancho Santa Margarita boy killed himself after discussing suicidal thoughts with an AI chatbot, prompting a new California law aimed at protecting vulnerable users.

“While artificial intelligence can be a useful research tool, it remains an evolving technology with limitations — including the potential to generate ‘hallucinated’ citations,” Wilson said. “We are actively learning the fluid dynamics of AI-assisted legal work and its possible pitfalls.”"

Sunday, September 28, 2025

Education report calling for ethical AI use contains over 15 fake sources; Ars Technica, September 12, 2025

Benj Edwards, Ars Technica; Education report calling for ethical AI use contains over 15 fake sources

"On Friday, CBC News reported that a major education reform document prepared for the Canadian province of Newfoundland and Labrador contains at least 15 fabricated citations that academics suspect were generated by an AI language model—despite the same report calling for "ethical" AI use in schools.

"A Vision for the Future: Transforming and Modernizing Education," released August 28, serves as a 10-year roadmap for modernizing the province's public schools and post-secondary institutions. The 418-page document took 18 months to complete and was unveiled by co-chairs Anne Burke and Karen Goodnough, both professors at Memorial University's Faculty of Education, alongside Education Minister Bernard Davis...

The irony runs deep

The presence of potentially AI-generated fake citations becomes especially awkward given that one of the report's 110 recommendations specifically states the provincial government should "provide learners and educators with essential AI knowledge, including ethics, data privacy, and responsible technology use."

Sarah Martin, a Memorial political science professor who spent days reviewing the document, discovered multiple fabricated citations. "Around the references I cannot find, I can't imagine another explanation," she told CBC. "You're like, 'This has to be right, this can't not be.' This is a citation in a very important document for educational policy.""

Saturday, September 13, 2025

Perplexity's definition of copyright gets it sued by the dictionary; Engadget, September 11, 2025

Anna Washenko, Engadget; Perplexity's definition of copyright gets it sued by the dictionary

"Merriam-Webster and its parent company Encyclopedia Britannica are the latest to take on AI in court. The plaintiffs have sued Perplexity, claiming that AI company's "answer engine" product unlawfully copies their copyrighted materials. They are also alleging copyright infringement for instances where Perplexity's AI creates false or inaccurate hallucinations that it then wrongly attributes to Britannica or Merriam-Webster. The complaint, filed in New York federal court, is seeking unspecified monetary damages and an order that blocks Perplexity from misusing their content."

Saturday, August 23, 2025

PittGPT debuts today as private AI source for University; University Times, August 21, 2025

Marty Levine, University Times; PittGPT debuts today as private AI source for University

"Today marks the rollout of PittGPT, Pitt’s own generative AI for staff and faculty — a service that will be able to use Pitt’s sensitive, internal data in isolation from the Internet because it works only for those logging in with their Pitt ID.

“We want to be able to use AI to improve the things that we do” in our Pitt work, said Dwight Helfrich, director of the Pitt enterprise initiatives team at Pitt Digital. That means securely adding Pitt’s private information to PittGPT, including Human Resources, payroll and student data. However, he explains, in PittGPT “you would only have access to data that you would have access to in your daily role” — in your specific Pitt job.

“Security is a key part of AI,” he said. “It is much more important in AI than in other tools we provide.” Using PittGPT — as opposed to the other AI services available to Pitt employees — means that any data submitted to it “stays in our environment and it is not used to train a free AI model.”

Helfrich also emphasizes that “you should get a very similar response to PittGPT as you would get with ChatGPT,” since PittGPT had access to “the best LLMs on the market” — the large language models used to train AI.

Faculty, staff and students already have free access to such AI services as Google Gemini and Microsoft Copilot. And “any generative AI tool provides the ability to analyze data … and to rewrite things” that are still in early or incomplete drafts, Helfrich said.

“It can help take the burden off some of the work we have to do in our lives” and help us focus on the larger tasks that, so far, humans are better at undertaking, added Pitt Digital spokesperson Brady Lutsko. “When you are working with your own information, you can tell it what to include” — it won’t add misinformation from the internet or its own programming, as AI sometimes does. “If you have a draft, it will make your good work even better.”

“The human still needs to review and evaluate that this is useful and valuable,” Helfrich said of AI’s contribution to our work. “At this point we can say that there is nothing in AI that is 100 percent reliable.”

On the other hand, he said, “they’re making dramatic enhancements at a pace we’ve never seen in technology. … I’ve been in technology 30 years and I’ve never seen anything improve as quickly as AI.” In his own work, he said, “AI can help review code and provide test cases, reducing work time by 75 percent. You just have to look at it with some caution and just (verify) things.”

“Treat it like you’re having a conversation with someone you’ve just met,” Lutsko added. “You have some skepticism — you go back and do some fact checking.”

Lutsko emphasized that the University has guidance on Acceptable Use of Generative Artificial Intelligence Tools as well as a University-Approved GenAI Tools List.

Pitt’s list of approved generative AI tools includes Microsoft 365 Copilot Chat, which is available to all students, faculty and staff (as opposed to the version of Copilot built into Microsoft 365 apps, which is an add-on available to departments through Panther Express for $30 per month, per person); Google Gemini; and Google NotebookLM, which Lutsko said “serves as a dedicated research assistant for precise analysis using user-provided documents.”

PittGPT joins that list today, Helfrich said.

Pitt also has been piloting Pitt AI Connect, a tool for researchers to integrate AI into software development (using an API, or application programming interface).

And Pitt also is already deploying the PantherAI chatbot, clickable from the bottom right of the Pitt Digital and Office of Human Resources homepages, which provides answers to common questions that may otherwise be deep within Pitt’s webpages. It will likely be offered on other Pitt websites in the future.

“Dive in and use it,” Helfrich said of PittGPT. “I see huge benefits from all of the generative AI tools we have. I’ve saved time and produced better results.”"

Friday, July 25, 2025

Virginia teachers learn AI tools and ethics at largest statewide workshop; WTVR, July 23, 2025

 

Wednesday, July 23, 2025

Partner Who Wrote About AI Ethics, Fired For Citing Fake AI Cases; Above The Law, July 23, 2025

Joe Patrice, Above The Law; Partner Who Wrote About AI Ethics, Fired For Citing Fake AI Cases

"Don’t blame the AI for the fact that you read a brief and never bothered to print out the cases. Who does that? Long before AI, we all understood that you needed to look at the case itself to make sure no one missed the literal red flag on top. It might’ve ended up in there because of AI, but three lawyers and presumably a para or two had this brief and no one built a binder of the cases cited? What if the court wanted oral argument? No one is excusing the decision to ask ChatGPT to resolve your $24 million case, but the blame goes far deeper.

Malaty will shoulder most of the blame as the link in the workflow who should’ve known better. That said, her article about AI ethics, written last year, doesn’t actually address the hallucination problem. While risks of job displacement and algorithms reinforcing implicit bias are important, it is a little odd to write a whole piece on the ethics of legal AI without even breathing on hallucinations."

Tuesday, July 22, 2025

Getting Along with GPT: The Psychology, Character, and Ethics of Your Newest Professional Colleague; ABA Journal, May 9, 2025

ABA Journal; Getting Along with GPT: The Psychology, Character, and Ethics of Your Newest Professional Colleague

"The Limits of GenAI’s Simulated Humanity

  • Creative thinking. An LLM mirrors humanity’s collective intelligence, shaped by everything it has read. It excels at brainstorming and summarizing legal principles but lacks independent thought, opinions, or strategic foresight—all essential to legal practice. Therefore, if a model’s summary of your legal argument feels stale, illogical, or disconnected from human values, it may be because the model has no democratized data to pattern itself on. The good news? You may be on to something original—and truly meaningful!
  • True comprehension. An LLM does not know the law; it merely predicts legal-sounding text based on past examples and mathematical probabilities.
  • Judgment and ethics. An LLM does not possess a moral compass or the ability to make judgments in complex legal contexts. It handles facts, not subjective opinions.  
  • Long-term consistency. Due to its context window limitations, an LLM may contradict itself if key details fall outside its processing scope. It lacks persistent memory storage.
  • Limited context recognition. An LLM has limited ability to understand context beyond provided information and is limited by training data scope.
  • Trustfulness. Attorneys have a professional duty to protect client confidences, but privacy and PII (personally identifiable information) are evolving concepts within AI. Unlike humans, models can infer private information without PII, through abstract patterns in data. To safeguard client information, carefully review (or summarize with AI) your LLM’s terms of use."

Wednesday, July 16, 2025

The Pentagon is throwing $200 million at ‘Grok for Government’ and other AI companies; Task & Purpose, July 14, 2025

Task & Purpose; The Pentagon is throwing $200 million at ‘Grok for Government’ and other AI companies

"The Pentagon announced Monday it is going to spend almost $1 billion on “agentic AI workflows” from four “frontier AI” companies, including Elon Musk’s xAI, whose flagship Grok appeared to still be declaring itself “MechaHitler” as late as Monday afternoon.

In a press release, the Defense Department’s Chief Digital and Artificial Intelligence Office — or CDAO — said it will cut checks of up to $200 million each to tech giants Anthropic, Google, OpenAI and Musk’s xAI to work on:

  • “critical national security challenges;”
  • “joint mission essential tasks in our warfighting domain;”
  • “DoD use cases.”

The release did not expand on what any of that means or how AI might help. Task & Purpose reached out to the Pentagon for details on what these AI agents may soon be doing and asked specifically if the contracts would include control of live weapons systems or classified information."

Wednesday, July 2, 2025

Trial Court Decides Case Based On AI-Hallucinated Caselaw; Above The Law, July 1, 2025

Joe Patrice, Above The Law; Trial Court Decides Case Based On AI-Hallucinated Caselaw

"Between opposing counsel and diligent judges, fake cases keep getting caught before they result in real mischief. That said, it was always only a matter of time before a poor litigant representing themselves fails to know enough to sniff out and flag Beavis v. Butthead and a busy or apathetic judge rubberstamps one side’s proposed order without probing the cites for verification. Hallucinations are all fun and games until they work their way into the orders.

It finally happened with a trial judge issuing an order based off fake cases (flagged by Rob Freund). While the appellate court put a stop to the matter, the fact that it got this far should terrify everyone.

Shahid v. Esaam, out of the Georgia Court of Appeals, involved a final judgment and decree of divorce served by publication. When the wife objected to the judgment based on improper service, the husband’s brief included two fake cases. The trial judge accepted the husband’s argument, issuing an order based in part on the fake cases."

Saturday, June 21, 2025

US patent office wants an AI to scan for prior art, but doesn't want to pay for it; The Register, June 20, 2025

Brandon Vigliarolo, The Register; US patent office wants an AI to scan for prior art, but doesn't want to pay for it

"There is some irony in using AI bots, which are often trained on copyrighted material for which AI firms have shown little regard, to assess the validity of new patents. 

It may not be the panacea the USPTO is hoping for. Lawyers have been embracing AI for something very similar - scanning particular, formal documentation for specific details related to a new analysis - and it's sometimes backfired as the AI has gotten certain details wrong. The Register has reported on numerous instances of legal professionals practically begging to be sanctioned for not bothering to do their legwork, as judges caught them using AI, which borked citations to other legal cases. 

The risk of hallucinating patents that don't exist, or getting patent numbers or other details wrong, means that there'll have to be at least some human oversight. The USPTO had no comment on how this might be accomplished."

Monday, June 2, 2025

Excruciating reason Utah lawyer presented FAKE case in court after idiotic blunder; Daily Mail, May 31, 2025

Joe Hutchison, DailyMail.com; Excruciating reason Utah lawyer presented FAKE case in court after idiotic blunder

"The case referenced, according to documents, was 'Royer v. Nelson' which did not exist in any legal database and was found to be made up by ChatGPT.

Opposing counsel said that the only way they would find any mention of the case was by using the AI

They even went as far as to ask the AI if the case was real, noting in a filing that it then apologized and said it was a mistake.

Bednar's attorney, Matthew Barneck, said that the research was done by a clerk and Bednar took all responsibility for failing to review the cases.

He told The Salt Lake Tribune: 'That was his mistake. He owned up to it and authorized me to say that and fell on the sword.'"

Friday, May 30, 2025

White House MAHA Report may have garbled science by using AI, experts say; The Washington Post, May 29, 2025

The Washington Post; White House MAHA Report may have garbled science by using AI, experts say

"Some of the citations that underpin the science in the White House’s sweeping “MAHA Report” appear to have been generated using artificial intelligence, resulting in numerous garbled scientific references and invented studies, AI experts said Thursday.

Of the 522 footnotes to scientific research in an initial version of the report sent to The Washington Post, at least 37 appear multiple times, according to a review of the report by The Post. Other citations include the wrong author, and several studies cited by the extensive health report do not exist at all, a fact first reported by the online news outlet NOTUS on Thursday morning.

Some references include “oaicite” attached to URLs — a definitive sign that the research was collected using artificial intelligence. The presence of “oaicite” is a marker indicating use of OpenAI, a U.S. artificial intelligence company."

Wednesday, May 21, 2025

A.I.-Generated Reading List in Chicago Sun-Times Recommends Nonexistent Books; The New York Times, May 21, 2025

The New York Times; A.I.-Generated Reading List in Chicago Sun-Times Recommends Nonexistent Books

"The summer reading list tucked into a special section of The Chicago Sun-Times and The Philadelphia Inquirer seemed innocuous enough.

There were books by beloved authors such as Isabel Allende and Min Jin Lee; novels by best sellers including Delia Owens, Taylor Jenkins Reid and Brit Bennett; and a novel by Percival Everett, a recent Pulitzer Prize winner.

There was just one issue: None of the book titles attributed to the above authors were real. They had been created by generative artificial intelligence.

It’s the latest case of bad A.I. making its way into the news. While generative A.I. has improved, there is still no way to ensure the systems produce accurate information. A.I. chatbots cannot distinguish between what is true and what is false, and they often make things up. The chatbots can spit out information and expert names with an air of authority."

Saturday, May 17, 2025

Anthropic’s law firm throws Claude under the bus over citation errors in court filing; The Register, May 15, 2025

Thomas Claburn, The Register; Anthropic’s law firm throws Claude under the bus over citation errors in court filing

"An attorney defending AI firm Anthropic in a copyright case brought by music publishers apologized to the court on Thursday for citation errors that slipped into a filing after using the biz's own AI tool, Claude, to format references.

The incident reinforces what's becoming a pattern in legal tech: while AI models can be fine-tuned, people keep failing to verify the chatbot's output, despite the consequences.

The flawed citations, or "hallucinations," appeared in an April 30, 2025 declaration [PDF] from Anthropic data scientist Olivia Chen in a copyright lawsuit music publishers filed in October 2023.

But Chen was not responsible for introducing the errors, which appeared in footnotes 2 and 3.

Ivana Dukanovic, an attorney with Latham & Watkins, the firm defending Anthropic, stated that after a colleague located a supporting source for Chen's testimony via Google search, she used Anthropic's Claude model to generate a formatted legal citation. Chen and defense lawyers failed to catch the errors in subsequent proofreading.

"After the Latham & Watkins team identified the source as potential additional support for Ms. Chen’s testimony, I asked Claude.ai to provide a properly formatted legal citation for that source using the link to the correct article," explained Dukanovic in her May 15, 2025 declaration [PDF].

"Unfortunately, although providing the correct publication title, publication year, and link to the provided source, the returned citation included an inaccurate title and incorrect authors.

"Our manual citation check did not catch that error. Our citation check also missed additional wording errors introduced in the citations during the formatting process using Claude.ai."...

The hallucinations of AI models keep showing up in court filings.

Last week, in a plaintiff's claim against insurance firm State Farm (Jacquelyn Jackie Lacey v. State Farm General Insurance Company et al), former Judge Michael R. Wilner, the Special Master appointed to handle the dispute, sanctioned [PDF] the plaintiff's attorneys for misleading him with AI-generated text. He directed the plaintiff's legal team to pay more than $30,000 in court costs that they wouldn't have otherwise had to bear.

After reviewing a supplemental brief filed by the plaintiffs, Wilner found that "approximately nine of the 27 legal citations in the ten-page brief were incorrect in some way."

Two of the citations, he said, do not exist, and several cited phony judicial opinions."

Thursday, May 15, 2025

Anthropic expert accused of using AI-fabricated source in copyright case; Reuters, May 13, 2025

Reuters; Anthropic expert accused of using AI-fabricated source in copyright case

"Van Keulen asked Anthropic to respond by Thursday to the accusation, which the company said appeared to be an inadvertent citation error. He rejected the music companies' request to immediately question the expert but said the allegation presented "a very serious and grave issue," and that there was "a world of difference between a missed citation and a hallucination generated by AI.""

Wednesday, May 14, 2025

The Professors Are Using ChatGPT, and Some Students Aren’t Happy About It; The New York Times, May 14, 2025

The New York Times; The Professors Are Using ChatGPT, and Some Students Aren’t Happy About It

"When ChatGPT was released at the end of 2022, it caused a panic at all levels of education because it made cheating incredibly easy. Students who were asked to write a history paper or literary analysis could have the tool do it in mere seconds. Some schools banned it while others deployed A.I. detection services, despite concerns about their accuracy.

But, oh, how the tables have turned. Now students are complaining on sites like Rate My Professors about their instructors’ overreliance on A.I. and scrutinizing course materials for words ChatGPT tends to overuse, like “crucial” and “delve.” In addition to calling out hypocrisy, they make a financial argument: They are paying, often quite a lot, to be taught by humans, not an algorithm that they, too, could consult for free."

Friday, March 7, 2025

AI 'hallucinations' in court papers spell trouble for lawyers; Reuters, February 18, 2025

Reuters; AI 'hallucinations' in court papers spell trouble for lawyers

"U.S. personal injury law firm Morgan & Morgan sent an urgent email this month to its more than 1,000 lawyers: Artificial intelligence can invent fake case law, and using made-up information in a court filing could get you fired.

A federal judge in Wyoming had just threatened to sanction two lawyers at the firm who included fictitious case citations in a lawsuit against Walmart. One of the lawyers admitted in court filings last week that he used an AI program that "hallucinated" the cases and apologized for what he called an inadvertent mistake."

Monday, February 3, 2025

DeepSeek has ripped away AI’s veil of mystique. That’s the real reason the tech bros fear it; The Observer via The Guardian, February 2, 2025

The Observer via The Guardian; DeepSeek has ripped away AI’s veil of mystique. That’s the real reason the tech bros fear it

"DeepSeek, sponsored by a Chinese hedge fund, is a notable achievement. Technically, though, it is no advance on large language models (LLMs) that already exist. It is neither faster nor “cleverer” than OpenAI’s ChatGPT or Anthropic’s Claude and just as prone to “hallucinations” – the tendency, exhibited by all LLMs, to give false answers or to make up “facts” to fill gaps in its data. According to NewsGuard, a rating system for news and information websites, DeepSeek’s chatbot made false claims 30% of the time and gave no answers to 53% of questions, compared with 40% and 22% respectively for the 10 leading chatbots in NewsGuard’s most recent audit.

The figures expose the profound unreliability of all LLMs. DeepSeek’s particularly high non-response rate is likely to be the product of its censoriousness; it refuses to provide answers on any issue that China finds sensitive or about which it wants facts restricted, whether Tiananmen Square or Taiwan...

Nevertheless, for all the pushback, each time one fantasy prediction fails to materialise, another takes its place. Such claims derive less from technological possibilities than from political and economic needs. While AI technology has provided hugely important tools, capable of surpassing humans in specific fields, from the solving of mathematical problems to the recognition of disease patterns, the business model depends on hype. It is the hype that drives the billion-dollar investment and buys political influence, including a seat at the presidential inauguration."