AI vs. Human Hackers: Who Wins the Security Battle?

▼ Summary
– AI-augmented teams completed challenges at a significantly higher rate (73%) than human-only teams (46%), with the advantage being largest for lower-ranked participants.
– The AI performance edge peaked at medium-difficulty challenges but narrowed on the hardest tasks, where elite human teams could outperform AI.
– Top-tier AI-augmented teams completed challenges several times faster than elite human teams, making speed a key differentiator at high skill levels.
– AI’s advantage varied widely by security domain, being strongest in structured areas like Secure Coding and weakest in creative domains like Reversing.
– The findings suggest entry-level security tasks are highly automatable, mid-career work sees strong AI returns, and elite practitioners remain crucial for novel, complex problems.

A recent cybersecurity competition has generated a significant dataset, offering a direct comparison between teams using artificial intelligence and those relying solely on human skill in offensive security tasks. The event, known as NeuroGrid, spanned 72 hours on the Hack The Box platform, featuring over a thousand human-only teams and more than 150 AI-agent teams. These groups tackled 36 challenges across nine distinct security domains, with tasks ranging from basic to expert levels. The analysis focused on teams that actively participated, providing a clear snapshot of current capabilities.
When examining completion rates, AI-augmented teams demonstrated a substantial lead. Approximately 73 percent of AI teams finished at least one challenge, compared to just 46 percent of human-only teams. This advantage, however, was not consistent across all skill levels. The performance gap was most pronounced among lower-ranked participants, where AI solve rates were over three times higher. As the skill tier increased, this edge narrowed considerably, dropping to 1.69 times among the top five percent of performers. Notably, at the absolute elite level, the best human-only team managed to solve more total challenges than the leading AI-assisted team.
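For context, the headline figures reduce to a simple ratio. The sketch below uses only the rates reported above; the function name and structure are ours, added purely for illustration:

```python
# Illustrative sketch of the relative solve-rate advantage described above.
# The 0.73 and 0.46 completion rates are the reported NeuroGrid figures;
# the function itself is just the ratio arithmetic.

def solve_rate_ratio(ai_rate: float, human_rate: float) -> float:
    """Return how many times higher the AI-team solve rate is."""
    return ai_rate / human_rate

# Overall completion rates: 73% of AI teams vs. 46% of human-only teams.
overall = solve_rate_ratio(0.73, 0.46)
print(f"Overall AI advantage: {overall:.2f}x")  # ≈ 1.59x
```

The same calculation applied within skill tiers yields the narrowing pattern the article describes: over 3x among lower-ranked participants, falling to 1.69x in the top five percent.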
The complexity of the task played a major role in outcomes. AI’s performance advantage peaked at medium-difficulty challenges, which are typical for mid-career security analysts. On the easiest problems, AI solve rates were more than double those of humans, highlighting a clear automation risk for entry-level analytical work. Conversely, on the most difficult challenges, the AI edge receded: AI teams failed to solve three of the hardest tasks at all, suggesting a limit to current autonomous capabilities.
Speed metrics revealed another interesting layer. While AI-augmented teams were marginally slower on average across all participants, the dynamic flipped dramatically among elite performers. Top-tier AI teams completed challenges several times faster than their human counterparts, establishing speed as a critical operational differentiator at the highest levels of skill.
Performance also varied widely depending on the type of security problem. The largest AI advantages appeared in structured, systematic domains like Secure Coding and Blockchain. In contrast, the gaps were smallest in areas requiring more creative problem-solving, such as general Coding and Reversing challenges. Among elite performers in these creative domains, the capabilities of AI and human teams reached near parity.
These findings carry important implications for workforce development and security planning. At the entry level, the high AI solve rate on routine tasks indicates that much standard analyst work is now automatable. This creates a potential pipeline problem: if junior staff use AI tools to generate high output without developing the foundational skills to verify results or tackle harder problems, the development of future senior practitioners could be hindered.
For mid-career professionals, the AI advantage on medium-difficulty tasks is strongest, making this tier the highest-return target for deploying AI tooling. The speed gains here can significantly compound across incident response workflows. At the elite tier, AI acts primarily as a powerful speed multiplier. Senior practitioners retain a crucial capability edge on the most novel and difficult problems. The optimal strategy appears to be pairing elite analysts with AI co-pilots while reserving the most complex incidents for human-led teams.
For security operations, the data suggests organizations must update their threat models. Incident response timelines and service level agreements based on human-only attacker speeds will likely underestimate the threat from AI-augmented adversaries. A strategic rollout of AI tools, beginning with structured exploitation categories, promises faster returns than a uniform approach. Ultimately, while AI delivers powerful augmentation, sustained investment in training and retaining senior human operators remains essential. Their capacity for novel reasoning on hard problems is where human expertise continues to lead, and that edge requires deliberate cultivation through real-world challenges.
(Source: HelpNet Security)
