
Thursday, October 3, 2024

What You Need to Know About Grok AI and Your Privacy; Wired, September 10, 2024

Kate O'Flaherty, Wired; What You Need to Know About Grok AI and Your Privacy

"Described as “an AI search assistant with a twist of humor and a dash of rebellion,” Grok is designed to have fewer guardrails than its major competitors. Unsurprisingly, Grok is prone to hallucinations and bias, with the AI assistant blamed for spreading misinformation about the 2024 election."

Tuesday, October 1, 2024

Fake Cases, Real Consequences [No digital link as of 10/1/24]; ABA Journal, Oct./Nov. 2024 Issue

John Roemer, ABA Journal; Fake Cases, Real Consequences [No digital link as of 10/1/24]

"Legal commentator Eugene Volokh, a professor at UCLA School of Law who tracks AI in litigation, in February reported on the 14th court case he's found in which AI-hallucinated false citations appeared. It was a Missouri Court of Appeals opinion that assessed the offending appellant $10,000 in damages for a frivolous filing.

Hallucinations aren't the only snag, Volokh says. "It's also with the output mischaracterizing the precedents or omitting key context. So one still has to check that output to make sure it's sound, rather than just including it in one's papers."

Echoing Volokh and other experts, ChatGPT itself seems clear-eyed about its limits. When asked about hallucinations in legal research, it replied in part: "Hallucinations in chatbot answers could potentially pose a problem for lawyers if they relied solely on the information provided by the chatbot without verifying its accuracy.""

Thursday, September 26, 2024

Perspectives in Artificial Intelligence: Ethical Use; Marquette Today, September 20, 2024

Andrew Goldstein, Marquette Today; Perspectives in Artificial Intelligence: Ethical Use

"Ethical application 

While artificial intelligence unlocks broad possibilities for positive change, unethical actors have access to these same tools. For instance, companies hoping to grow cigarette sales can target people who are prone to smoking or trying to quit with greater precision. Deepfake videos allow scam callers to imitate the faces and voices of loved ones.  

In this world, it is more important than ever that students be trained on the limits of AI and its proper use cases. 

“We need to think about the societal impact of artificial intelligence; who gets this data, what it’s being used for and how we steer people toward value-creating activities,” Ow says. “Using AI has the potential to improve your life and to provide insights and opportunities for the individual, the community and society.”"

Friday, August 23, 2024

The US Government Wants You—Yes, You—to Hunt Down Generative AI Flaws; Wired, August 21, 2024

 Lily Hay Newman, Wired; The US Government Wants You—Yes, You—to Hunt Down Generative AI Flaws

"AT THE 2023 Defcon hacker conference in Las Vegas, prominent AI tech companies partnered with algorithmic integrity and transparency groups to sic thousands of attendees on generative AI platforms and find weaknesses in these critical systems. This “red-teaming” exercise, which also had support from the US government, took a step in opening these increasingly influential yet opaque systems to scrutiny. Now, the ethical AI and algorithmic assessment nonprofit Humane Intelligence is taking this model one step further. On Wednesday, the group announced a call for participation with the US National Institute of Standards and Technology, inviting any US resident to participate in the qualifying round of a nationwide red-teaming effort to evaluate AI office productivity software.

The qualifier will take place online and is open to both developers and anyone in the general public as part of NIST's AI challenges, known as Assessing Risks and Impacts of AI, or ARIA. Participants who pass through the qualifying round will take part in an in-person red-teaming event at the end of October at the Conference on Applied Machine Learning in Information Security (CAMLIS) in Virginia. The goal is to expand capabilities for conducting rigorous testing of the security, resilience, and ethics of generative AI technologies."

Monday, August 19, 2024

New ABA Rules on AI and Ethics Shows the Technology Is 'New Wine in Old Bottles'; The Law Journal Editorial Board via Law.com, August 16, 2024

The Law Journal Editorial Board via Law.com; New ABA Rules on AI and Ethics Shows the Technology Is 'New Wine in Old Bottles'

"On July 29, the American Bar Association’s Standing Committee on Ethics and Professional Responsibility issued Formal Opinion 512 on generative artificial intelligence tools. The opinion follows on such opinions and guidance from several state bar associations, as well as similar efforts by non-U.S. bars and regulatory bodies around the world...

Focused on GAI, the opinion addresses six core principles: competence, confidentiality, communication, meritorious claims and candor to tribunal, supervision and fees...

What is not commonly understood, perhaps, is that GAI “hallucinates,” and generates content...

Not addressed in the opinion is whether GAI is engaged in the practice of law...

At the ABA annual meeting, representatives of more than 20 “foreign” bars participated in a roundtable on GAI. In a world of cross-border practice, there was a desire for harmonization."

Monday, August 12, 2024

Silicon Valley bishop, two Catholic AI experts weigh in on AI evangelization; Religion News Service, May 6, 2024

Aleja Hertzler-McCain, Religion News Service; Silicon Valley bishop, two Catholic AI experts weigh in on AI evangelization

"San Jose, California, Bishop Oscar CantĂș, who leads the Catholic faithful in Silicon Valley, said that AI doesn’t come up much with parishioners in his diocese...

Pointing to the adage coined by Meta founder Mark Zuckerberg, “move fast and break things,” the bishop said, “with AI, we need to move very cautiously and slowly and try not to break things. The things we would be breaking are human lives and reputations.”...

Noreen Herzfeld, a professor of theology and computer science at St. John’s University and the College of St. Benedict and one of the editors of a book about AI sponsored by the Vatican Dicastery for Culture and Education, said that the AI character was previously “impersonating a priest, which is considered a very serious sin in Catholicism.”...

Accuracy issues, Herzfeld said, are one of many reasons it should not be used for evangelization. “As much as you beta test one of these chatbots, you will never get rid of hallucinations” — moments when the AI makes up its own answers, she said...

Larrey, who has been studying AI for nearly 30 years and is in conversation with Sam Altman, the CEO of OpenAI, is optimistic that the technology will improve. He said Altman is already making progress on the hallucinations, on its challenges to users’ privacy and reducing its energy use — a recent analysis estimated that by 2027, artificial intelligence could suck up as much electricity as the population of Argentina or the Netherlands."

Friday, August 9, 2024

TryTank Research Institute helps create Cathy, a new AI chatbot and Episcopal Church expert; Episcopal News Service, August 7, 2024

Kathryn Post, Episcopal News Service; TryTank Research Institute helps create Cathy, a new AI chatbot and Episcopal Church expert

"The latest AI chatbot geared for spiritual seekers is AskCathy, co-launched in June by a research institute and ministry organization and aiming to roll out soon on Episcopal church websites. Cathy draws on the latest version of ChatGPT and is equipped to prioritize Episcopal resources.

“This is not a substitute for a priest,” said the Rev. Tay Moss, director of one of Cathy’s architects, the Innovative Ministry Center, an organization based at the Toronto United Church Council that develops digital resources for communities of faith. “She comes alongside you in your search queries and helps you discover material. But she is not the end-all be-all of authority. She can’t tell you how to believe or what to believe.”

The Rev. Lorenzo Lebrija, the executive director of TryTank Research Institute at Virginia Theological Seminary and Cathy’s other principal developer, said all the institute’s projects attempt to follow the lead of the Holy Spirit, and Cathy is no different. He told Religion News Service the idea for Cathy materialized after brainstorming how to address young people’s spiritual needs. What if a chatbot could meet people asking life’s biggest questions with care, insight and careful research?

“The goal is not that they will end up at their nearby Episcopal church on Sunday. The goal is that it will spark in them this knowledge that God is always with us, that God never leaves us,” Lebrija said. “This can be a tool that gives us a glimpse and little direction that we can then follow on our own.”

To do that, though, would require a chatbot designed to avoid the kinds of hallucinations and errors that have plagued other ChatGPT integrations. In May, the Catholic evangelization site Catholic Answers “defrocked” their AI avatar, Father Justin, designating him as a layperson after he reportedly claimed to be an ordained priest capable of taking confession and performing marriages...

The Rev. Peter Levenstrong, an associate rector at an Episcopal church in San Francisco who blogs about AI and the church, told RNS he thinks Cathy could familiarize people with the Episcopal faith.

“We have a PR issue,” Levenstrong said. “Most people don’t realize there is a denomination that is deeply rooted in tradition, and yet open and affirming, and theologically inclusive, and doing its best to strive toward a future without racial injustice, without ecocide, all these huge problems that we as a church take very seriously.”

In his own context, Levenstrong has already used Cathy to brainstorm Harry Potter-themed lessons for children. (She recommended a related book written by an Episcopalian.)

Cathy’s creators know AI is a thorny topic. Their FAQ page anticipates potential critiques."

Tuesday, July 2, 2024

More Adventures With AI Claude, The Contrite Poet; Religion Unplugged, June 11, 2024

Dr. Michael Brown, Religion Unplugged; More Adventures With AI Claude, The Contrite Poet

"Working with the AI bot Claude is, in no particular order, amazing, frustrating, and hilarious...

I have asked Claude detailed Hebrew grammatical questions or asked him to translate difficult rabbinic Hebrew passages, and time and time again, Claude has nailed it.

But just as frequently, he creates texts out of thin air, side by side with accurate citations, which then have to be vetted one by one.

When I asked Claude why he manufactured citations, he explained that he aims to please and can sometimes go a little too far. In other words, Claude tells me what he thinks I want to hear...

"I’m sure that AI bots are already providing “companionship” for an increasingly isolated generation, not to mention proving falsehoods side by side with truths for unsuspecting readers.

And so, the promise and the threat of AI continue to grow by the day, with a little entertainment and humor added in."

Thursday, June 27, 2024

God Chatbots Offer Spiritual Insights on Demand. What Could Go Wrong?; Scientific American, March 19, 2024

Scientific American; God Chatbots Offer Spiritual Insights on Demand. What Could Go Wrong?

"QuranGPT—which has now been used by about 230,000 people around the world—is just one of a litany of chatbots trained on religious texts that have recently appeared online. There’s Bible.Ai, Gita GPT, Buddhabot, Apostle Paul AI, a chatbot trained to imitate 16th-century German theologian Martin Luther, another trained on the works of Confucius, and yet another designed to imitate the Delphic oracle. For millennia adherents of various faiths have spent long hours—or entire lifetimes—studying scripture to glean insights into the deepest mysteries of human existence, say, the fate of the soul after death.

The creators of these chatbots don’t necessarily believe large language models (LLMs) will put these age-old theological enigmas to rest. But they do think that with their ability to identify subtle linguistic patterns within vast quantities of text and provide responses to user prompts in humanlike language (a feature called natural-language processing, or NLP), the bots can theoretically synthesize spiritual insights in a matter of seconds, saving users both time and energy. It’s divine wisdom on demand.

Many professional theologians, however, have serious concerns about blending LLMs with religion...

The danger of hallucination in this context is compounded by the fact that religiously oriented chatbots are likely to attract acutely sensitive questions—questions one might feel too embarrassed or ashamed to ask a priest, an imam, a rabbi or even a close friend. During a software update to QuranGPT last year, Khan had a brief glimpse into user prompts, which are usually invisible to him. He recalls seeing that one person had asked, “I caught my wife cheating on me—how should I respond?” Another, more troublingly, had asked, “Can I beat my wife?”

Khan was pleased with the system’s responses (it urged discussion and nonviolence on both counts), but the experience underscored the ethical gravity behind his undertaking."

Friday, June 7, 2024

‘This Is Going to Be Painful’: How a Bold A.I. Device Flopped; The New York Times, June 6, 2024

Tripp Mickle, The New York Times; ‘This Is Going to Be Painful’: How a Bold A.I. Device Flopped

"As of early April, Humane had received around 10,000 orders for the Ai Pin, a small fraction of the 100,000 that it hoped to sell this year, two people familiar with its sales said. In recent months, the company has also grappled with employee departures and changed a return policy to address canceled orders. On Wednesday, it asked customers to stop using the Ai Pin charging case because of a fire risk associated with its battery.

Its setbacks are part of a pattern of stumbles across the world of generative A.I., as companies release unpolished products. Over the past two years, Google has introduced and pared back A.I. search abilities that recommended people eat rocks, Microsoft has trumpeted a Bing chatbot that hallucinated and Samsung has added A.I. features to a smartphone that were called “excellent at times and baffling at others.”"

Tuesday, June 4, 2024

Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools; Stanford University, 2024

Varun Magesh, Stanford University; Faiz Surani, Stanford University; Matthew Dahl, Yale University; Mirac Suzgun, Stanford University; Christopher D. Manning, Stanford University; Daniel E. Ho, Stanford University

Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools

"Abstract

Legal practice has witnessed a sharp rise in products incorporating artificial intelligence (AI). Such tools are designed to assist with a wide range of core legal tasks, from search and summarization of caselaw to document drafting. But the large language models used in these tools are prone to “hallucinate,” or make up false information, making their use risky in high-stakes domains. Recently, certain legal research providers have touted methods such as retrieval-augmented generation (RAG) as “eliminating” (Casetext, 2023) or “avoid[ing]” hallucinations (Thomson Reuters, 2023), or guaranteeing “hallucination-free” legal citations (LexisNexis, 2023). Because of the closed nature of these systems, systematically assessing these claims is challenging. In this article, we design and report on the first preregistered empirical evaluation of AI-driven legal research tools. We demonstrate that the providers’ claims are overstated. While hallucinations are reduced relative to general-purpose chatbots (GPT-4), we find that the AI research tools made by LexisNexis (Lexis+ AI) and Thomson Reuters (Westlaw AI-Assisted Research and Ask Practical Law AI) each hallucinate between 17% and 33% of the time. We also document substantial differences between systems in responsiveness and accuracy. Our article makes four key contributions. It is the first to assess and report the performance of RAG-based proprietary legal AI tools. Second, it introduces a comprehensive, preregistered dataset for identifying and understanding vulnerabilities in these systems. Third, it proposes a clear typology for differentiating between hallucinations and accurate legal responses. Last, it provides evidence to inform the responsibilities of legal professionals in supervising and verifying AI outputs, which remains a central open question for the responsible integration of AI into law."
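
For context on the technique the study evaluates: retrieval-augmented generation grounds a model's answer in passages retrieved from a trusted corpus, rather than relying only on what the model memorized during training. The short Python sketch below is a minimal illustration of that pattern; the toy corpus, the word-overlap retriever, and the build_prompt helper are assumptions made for illustration, not any provider's implementation.

# Minimal sketch of retrieval-augmented generation (RAG).
# Illustrative only: the corpus, scoring, and prompt wording are assumptions.

CORPUS = {
    "case_a": "Placeholder summary of case A about contract formation.",
    "case_b": "Placeholder summary of case B about negligence standards.",
}

def retrieve(query: str, k: int = 2) -> list[tuple[str, str]]:
    """Rank documents by naive word overlap with the query.
    Production systems use vector embeddings or a search index instead."""
    q_terms = set(query.lower().split())
    scored = sorted(
        ((len(q_terms & set(text.lower().split())), doc_id, text)
         for doc_id, text in CORPUS.items()),
        reverse=True,
    )
    return [(doc_id, text) for _, doc_id, text in scored[:k]]

def build_prompt(query: str) -> str:
    """Ground the answer in retrieved passages and ask for citations by id;
    this reduces, but per the study does not eliminate, hallucination."""
    passages = "\n".join(f"[{doc_id}] {text}" for doc_id, text in retrieve(query))
    return (
        "Answer using ONLY the passages below, citing them by id.\n"
        f"{passages}\n\nQuestion: {query}\nAnswer:"
    )

if __name__ == "__main__":
    # A real system would send this prompt to an LLM and then verify every citation.
    print(build_prompt("What did case A say about contract formation?"))

In a deployed tool, the assembled prompt would be sent to a language model, and, as the study's findings suggest, the citations in the model's answer would still need to be checked against the retrieved sources.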

Thursday, May 23, 2024

US intelligence agencies’ embrace of generative AI is at once wary and urgent; Associated Press, May 23, 2024

Frank Bajak, Associated Press; US intelligence agencies’ embrace of generative AI is at once wary and urgent

"The CIA’s inaugural chief technology officer, Nand Mulchandani, thinks that because gen AI models “hallucinate” they are best treated as a “crazy, drunk friend” — capable of great insight and creativity but also bias-prone fibbers. There are also security and privacy issues: adversaries could steal and poison them, and they may contain sensitive personal data that officers aren’t authorized to see.

That’s not stopping the experimentation, though, which is mostly happening in secret. 

An exception: Thousands of analysts across the 18 U.S. intelligence agencies now use a CIA-developed gen AI called Osiris. It runs on unclassified and publicly or commercially available data — what’s known as open-source. It writes annotated summaries and its chatbot function lets analysts go deeper with queries...

Another worry: Ensuring the privacy of “U.S. persons” whose data may be embedded in a large-language model.

“If you speak to any researcher or developer that is training a large-language model, and ask them if it is possible to basically kind of delete one individual piece of information from an LLM and make it forget that -- and have a robust empirical guarantee of that forgetting -- that is not a thing that is possible,” John Beieler, AI lead at the Office of the Director of National Intelligence, said in an interview.

It’s one reason the intelligence community is not in “move-fast-and-break-things” mode on gen AI adoption."