Anthropic’s Claude AI gets a new constitution embedding safety and ethics
John E. Dunn, CIO
"Anthropic has completely overhauled the “Claude constitution”, a document that sets out the ethical parameters governing its AI model’s reasoning and behavior.
Launched at the World Economic Forum’s Davos Summit, the new constitution’s principles are that Claude should be “broadly safe” (not undermining human oversight), “broadly ethical” (honest, avoiding inappropriate, dangerous, or harmful actions), “genuinely helpful” (benefiting its users), and “compliant with Anthropic’s guidelines”.
According to Anthropic, the constitution is already being used in Claude’s model training, making it fundamental to the model’s reasoning process.
Claude’s first constitution appeared in May 2023, a modest 2,700-word document that borrowed heavily and openly from the UN Universal Declaration of Human Rights and Apple’s terms of service.
While not completely abandoning those sources, the 2026 Claude constitution moves away from the focus on “standalone principles” in favor of a more philosophical approach based on understanding not simply what is important, but why.
“We’ve come to believe that a different approach is necessary. If we want models to exercise good judgment across a wide range of novel situations, they need to be able to generalize — to apply broad principles rather than mechanically following specific rules,” explained Anthropic.