- Anthropic dares you to jailbreak its new AI model (Ars Technica)
- Constitutional Classifiers: Defending against universal jailbreaks (Anthropic)
- Anthropic makes ‘jailbreak’ advance to stop AI models producing harmful results (Financial Times)
- ‘Constitutional Classifiers’ Technique Mitigates GenAI Jailbreaks (Dark Reading)
- Anthropic has a new way to protect large language models against jailbreaks (MIT Technology Review)