Anthropic: Claude can now end conversations to prevent harmful uses

What’s new: Anthropic has added a feature to its Claude Opus 4 and 4.1 models that lets the model end a conversation when it detects persistently harmful or abusive use. Part of Anthropic’s “model welfare” initiative, the feature is designed to activate only in rare, extreme edge cases; Claude will first attempt to redirect users to helpful resources, ending the conversation only as a last resort.

Who’s affected

This update applies to users of the Claude Opus 4 and 4.1 models, which are available through paid plans and the API. The Claude Sonnet 4 model will not receive this feature.

What to do

  • Familiarize yourself with the new conversation-ending behavior in Claude Opus 4 and 4.1.
  • Monitor user interactions to understand how this feature affects user experience.
  • Prepare to offer alternative resources in case users encounter a terminated conversation.

Sources