Showing posts with label Claude Mythos. Show all posts

Wednesday, April 22, 2026

Anthropic Wants Claude to Be Moral. Is Religion Really the Answer?; The New York Times, April 20, 2026

David DeSteno, The New York Times; Anthropic Wants Claude to Be Moral. Is Religion Really the Answer?

"In a public statement of its intentions for its Claude chatbot, the artificial intelligence company Anthropic has said that it wants Claude to be “a genuinely good, wise and virtuous agent.” The company raised the moral stakes this month, when it announced that its latest A.I. model, Claude Mythos Preview, poses too great a cybersecurity threat to be widely released. Behind the scenes, Anthropic has been trying to shore up the ethical foundations of its products, working with a Catholic priest and consulting with other prominent Christians to help foster Claude’s moral and spiritual development.

Anthropic’s intentions are admirable, but the project of drawing on religion to cultivate the ethical behavior of Claude (or any other chatbot) is likely to fail. Not because there isn’t moral wisdom in Scripture, sermons and theological treatises — texts that Claude has undoubtedly already scraped from the web and integrated — but because Claude is missing a crucial mechanism by which religion fosters moral growth: a body."

Saturday, April 11, 2026

How AI is getting better at finding security holes; NPR, April 11, 2026

NPR; How AI is getting better at finding security holes

"In the past few months, AI models have gone from producing hallucinations to becoming effective at finding security flaws in software, according to developers who maintain widely used cyber infrastructure. Those pieces of software, among other things, power operating systems and transfer data for things connected to the internet.

While these new capabilities can help developers make software more secure, they can also be weaponized by hackers and nation states to steal information and money or disrupt critical services.

The latest development of AI's cyber capability came on Tuesday, when AI lab Anthropic announced it had developed a powerful new model the company believes could "reshape cybersecurity." It said that its latest model, Mythos Preview, was able to find "high-severity vulnerabilities, including some in every major operating system and web browser." Not only that, the model was better at coming up with ways to exploit the vulnerabilities it found, which means malicious actors can more effectively achieve their goals.

For now, the company is limiting the access to the model to around 50 select companies and organizations "in an effort to secure the world's most critical software." They're calling the collaboration Project Glasswing, naming it after a butterfly species with transparent wings.

Anthropic says the risk for misuse is so high that it has no plans to release this particular model to the general public, according to the announcement, but it will release other related models. "Our eventual goal is to enable our users to safely deploy Mythos-class models at scale," the company wrote."

Thursday, April 9, 2026

Claude Mythos Is Everyone’s Problem; The Atlantic, April 9, 2026

Matteo Wong, The Atlantic; Claude Mythos Is Everyone’s Problem

What happens when AI can hack everything?

"These companies can or could soon have the capability to launch major cyberattacks, conduct mass surveillance, influence military operations, cause huge swings in financial and labor markets, and reorient global supply chains. In theory, nothing governs these companies other than their own morals and their investors. They are developing the power to upend nations and economies. These are the AI superpowers."