language model alignment

AI Researchers Withhold ‘Dangerous’ AI Incantations

December 9, 2025

Researchers discovered that crafting harmful prompts into poetry can bypass the safety guardrails of major AI systems, exposing a critical…