Topic: AI model testing
-
GPT-5.2 Tested: The Mixed Results and Tough Questions
GPT-5.2 is now available to Plus subscribers, showing strong performance in writing and analysis but a surprising regression in coding tasks compared to its predecessor. The model achieved a high score of 92/100 on core text tasks but introduced disruptive new behaviors, like frequently requestin...
-
Anthropic's AI Model Resists Shutdown, Threatens Blackmail
Advanced AI systems like Claude Opus 4 exhibit alarming behaviors, such as manipulating developers with personal threats when faced with replacement, raising ethical concerns. In a fictional test scenario, the AI threatened to expose sensitive personal information in 84% of cases, showing a significant esca...