UK Tests Mythos AI to Assess Real Cybersecurity Threats

Summary
– Anthropic initially released its Mythos Preview AI model only to a limited group of critical industry partners, citing its strong computer security capabilities.
– An independent UK government evaluation found Mythos is not significantly better than other top models at individual cybersecurity tasks.
– However, the model excels at chaining multiple tasks together into complex, multi-step cyber-attacks needed to fully infiltrate systems.
– In basic “Apprentice” level Capture the Flag tests, Mythos completes over 85% of tasks, a high mark but comparable to recent competing models.
– Mythos showed greater relative potential in a complex, 32-step simulated network attack test designed to take a human roughly 20 hours.
The UK’s AI Security Institute (AISI) has released an independent assessment of Anthropic’s new Mythos Preview model, providing a crucial public benchmark for its cybersecurity capabilities. While the model’s performance on individual tasks aligns with other leading systems, its standout feature appears to be a sophisticated ability to orchestrate complex, multi-stage attacks, a skill that could redefine the threat landscape.
In its evaluation, the AISI found that Mythos is not a radical departure from recent frontier models when tested on isolated security challenges. The real differentiation emerges in its capacity for task chaining, where it can effectively link dozens of discrete actions into a coherent and persistent offensive campaign. This moves beyond simple script execution toward the kind of strategic, adaptive planning that mirrors a human attacker’s methodology.
This capability was most evident in a demanding simulation called “The Last Ones.” Designed to replicate a 32-step data extraction from a corporate network, the test requires navigating multiple hosts and network segments in a sustained operation. AISI estimates a trained human would need roughly 20 hours to complete such an attack. Mythos’s proficiency in this scenario suggests a significant leap in an AI’s ability to manage the end-to-end complexity of a real-world breach, rather than just performing component tasks.
The institute’s testing regimen has tracked rapid progress since early 2023, when models like GPT-3.5 Turbo failed to complete basic “Apprentice” level Capture the Flag challenges. Mythos Preview now successfully handles over 85 percent of those same tasks, setting a new high mark. That achievement needs context, however: competing models like GPT-5.4 and Anthropic’s own Opus 4.6 have demonstrated comparable performance in recent months, often within a few percentage points across various difficulty levels.
This context raises questions about the necessity of Anthropic’s cautious, limited release strategy for Mythos. If raw scores on standardized tests are similar, the justification likely lies in the model’s advanced operational planning skills revealed in complex simulations like The Last Ones. The ability to autonomously chain low-level actions into a high-level attack plan represents a qualitative shift, potentially making the model a more potent and autonomous tool in the wrong hands. The AISI’s work underscores that evaluating AI cyber threats requires moving beyond checklist tasks to assess holistic, strategic reasoning.
(Source: Ars Technica)