The Assistant Axis: when “helpful” is a place, not a promise
Anthropic finds a measurable “Assistant Axis” in LLMs. Capping drift along it reduces harmful persona shifts and jailbreaks—raising questions about human identity.
gekko
