
Claude 4.1 Outperforms in Coding Tests Ahead of GPT-5 Launch

August 6, 2025

Anthropic’s Claude Opus 4.1 leads in coding performance, scoring 74.5% on the SWE-bench test and surpassing models from OpenAI and Google, but…