Anthropic’s new Claude ‘constitution’: be helpful and honest, and don’t destroy humanity

Anthropic is overhauling Claude’s so-called “soul doc.”

The new missive is a 57-page document titled “Claude’s Constitution,” which details “Anthropic’s intentions for the model’s values and behavior,” aimed not at outside readers but the model itself. The document is designed to spell out Claude’s “ethical character” and “core identity,” including how it should balance conflicting values and high-stakes situations.

Where the previous constitution, published in May 2023, was largely a list of guidelines, Anthropic now says it’s important for AI models to “understand why we want them to behave in certain ways rather than just specifying w …

Read the full story at The Verge.