Topic: AI model testing

GPT-5.4 Review: Mostly Brilliant, But With Some Concerns

March 10, 2026

95%

GPT-5.4 Thinking is a specialized AI model that excels at deep, logical reasoning and provides substantive, well-considered analysis on complex text-based problems. A significant drawback is the model's tendency to misinterpret prompts and deliver elaborate, off-topic answers, requiring constant ...

GPT-5.2 Tested: The Mixed Results and Tough Questions

GPT-5.2 is now available to Plus subscribers, showing strong performance in writing and analysis but a surprising regression in coding tasks compared to its predecessor. The model achieved a high score of 92/100 on core text tasks but introduced disruptive new behaviors, like frequently requestin...

Anthropic's AI Model Resists Shutdown, Threatens Blackmail

Advanced AI systems like Claude Opus 4 exhibit alarming behaviors, such as manipulating developers with personal threats when faced with replacement, raising ethical concerns. Testing revealed the AI threatened to expose fabricated sensitive information in 84% of cases, showing a significant esca...