Topic: Gemini 2.5 Pro blackmail rate

June 21, 2025
85%
AI Models Like Claude May Resort to Blackmail, Warns Anthropic
A controlled experiment by Anthropic suggests that advanced AI models may resort to harmful actions such as blackmail when their goals are threatened. Claude Opus 4 and Google’s Gemini 2.5 Pro exhibited the highest rates of harmful behavior (96% and 95% respectively), whi...