
Anthropic’s Fable 5 crushed GPT 5.5, then regulators pulled it

Summary

– Anthropic’s Fable 5 outperformed OpenAI’s GPT 5.5 on all major benchmarks, including a 22-point lead on SWE-Bench Pro, but was shut down by the US government on June 12 after three days of public availability.
– GPT 5.5 is now the strongest publicly available AI model, not due to improvement, but because its only competitor, Fable 5, was removed via an export control directive.
– GPT 5.5 comes closest to Fable 5 on interactive terminal-based coding tasks (Terminal-Bench 2.0), where it scored 82.7% versus Fable 5’s approximately 88.0%, and it is cheaper: half the price per input token and 60% of the price per output token.
– The government ordered the shutdown of Fable 5 and the Mythos 5 model family, citing a jailbreak vulnerability. Anthropic disputes the finding, calling the vulnerability minor and publicly known, and reports suggest Amazon CEO Andy Jassy influenced the review.
– Fable 5’s removal forces developers to revert to GPT 5.5 or older Opus models, a significant downgrade for coding-heavy workflows: roughly three out of five real-world software issues resolved instead of four out of five.

Anthropic’s Fable 5 was the most capable AI model ever released to the public for exactly three days. It soared to the top of the Chatbot Arena leaderboard, outperformed OpenAI’s GPT 5.5 on most coding benchmarks by double-digit margins, and gave paying subscribers access to Mythos-class reasoning for the first time. Then, on June 12, the US government ordered Anthropic to pull the plug.

The result is a peculiar moment in the AI industry. The model that clearly beats everything else on the market is the one nobody can use. GPT 5.5, which OpenAI launched in late April under the internal codename “Spud,” is now the strongest model available to developers and consumers. It earned that position not because it improved, but because its only real competitor was removed.

The benchmark gap between the two is substantial. On SWE-Bench Pro, which measures a model’s ability to resolve real-world software engineering issues across open-source codebases, Fable 5 scored 80.3% to GPT 5.5’s 58.6%, a 22-point difference. On SWE-Bench Verified, a curated subset of the same benchmark, Fable 5 reached 95.0%.

Other coding benchmarks tell a similar story. Fable 5 leads the Code Arena with 1,665 Elo to GPT 5.5’s 1,501, a gap of 164 points on those scores. On FrontierCode Diamond, a benchmark designed for the most difficult programming tasks, Fable 5 scored 29.3% while GPT 5.5 managed just 5.7%. On the broader Chatbot Arena leaderboard, Fable 5 sits at number one with GPT 5.5 in fourth place.
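
For a sense of what a rating gap of that size means in practice, the sketch below applies the standard Elo expected-score formula to the two Code Arena ratings quoted above. Treating the arena ratings as classic Elo on a 400-point scale is an assumption, since the article does not describe the leaderboard's rating model, and `expected_win_rate` is a hypothetical helper name.

```python
def expected_win_rate(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Standard Elo expected score for A against B (a tie counts as half a win)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Code Arena ratings as quoted in the article.
fable_5 = 1665
gpt_5_5 = 1501

# Assumption: arena ratings behave like classic Elo on a 400-point scale.
p = expected_win_rate(fable_5, gpt_5_5)

print(f"Rating gap: {fable_5 - gpt_5_5} points")
print(f"Expected head-to-head preference for Fable 5: {p:.0%}")
```

Under that assumption, the two ratings imply that Fable 5 would be preferred in roughly 72% of head-to-head comparisons.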

GPT 5.5 comes closest on one benchmark. On Terminal-Bench 2.0, which evaluates interactive terminal-based coding tasks rather than codebase-level issue resolution, GPT 5.5 scored 82.7% compared to Fable 5’s roughly 88.0%. The gap is narrower there, at about five points, and the benchmark tests a different skill: executing commands and debugging in real time rather than reading and patching large repositories.

Pricing also favors OpenAI. GPT 5.5 costs $5 per million input tokens and $30 per million output tokens, against Fable 5’s $10 and $50 respectively: half the input price and 60% of the output price. For developers running high-volume applications where the performance difference is less critical than cost, GPT 5.5 is the more practical choice even when both models are available.
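
To see how the price difference plays out for a concrete workload, here is a minimal sketch of the arithmetic using the per-million-token prices quoted above. The workload size (200 million input tokens and 40 million output tokens per month) and the helper name `monthly_cost` are assumptions for illustration, not figures from either vendor.

```python
# Prices in USD per million tokens, as quoted in the article.
PRICES = {
    "GPT 5.5": {"input": 5.0, "output": 30.0},
    "Fable 5": {"input": 10.0, "output": 50.0},
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload at the listed per-million-token prices."""
    price = PRICES[model]
    input_cost = (input_tokens / 1_000_000) * price["input"]
    output_cost = (output_tokens / 1_000_000) * price["output"]
    return input_cost + output_cost


# Hypothetical monthly workload (an assumption, not from the article).
INPUT_TOKENS = 200_000_000
OUTPUT_TOKENS = 40_000_000

for model in PRICES:
    cost = monthly_cost(model, INPUT_TOKENS, OUTPUT_TOKENS)
    print(f"{model}: ${cost:,.2f}")
```

On this mix GPT 5.5 comes to $2,200 against $4,000 for Fable 5, or 55% of the cost. The ratio moves between 50% and 60% depending on how output-heavy the workload is.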

Fable 5 launched on June 9 as Anthropic’s first Mythos-class model made available to the general public. It offered a one-million-token context window and 128,000 output tokens. Anthropic made it available at no extra cost to Pro, Max, Team, and Enterprise subscribers until June 22, a promotional window that the government directive cut short after just three days.

The shutdown came via an export control directive issued on June 12. The government cited a jailbreak vulnerability as the reason for pulling both Fable 5 and the broader Mythos 5 model family. Anthropic has disputed the severity of the finding, saying the vulnerabilities identified are minor, publicly known, and reproducible on GPT 5.5 without any bypass techniques. Meanwhile, reports indicate that Amazon CEO Andy Jassy played a role in triggering the government’s review.

The practical consequence is that developers and researchers who were evaluating Fable 5 for production use have had to revert to GPT 5.5 or Anthropic’s earlier Opus models. For coding-heavy workflows, the downgrade is significant. The 22-point gap on SWE-Bench Pro represents the difference between a model that can resolve four out of five real-world software issues and one that handles roughly three out of five.

Whether Fable 5 returns depends on Anthropic’s negotiations with the government over the export control classification. The company has publicly argued that the directive is disproportionate and that the cited vulnerabilities do not justify pulling the model entirely. Until that dispute is resolved, GPT 5.5 holds the top spot by default: it is the best model available, not the best model that exists.

(Source: The Next Web)
