Trust in AI Vulnerability Scanning Plummets to 9%

Summary
– Trust in fully automated AI vulnerability testing declined sharply, with organizations relying solely on it dropping from 29% to 9%, while 78% reported automated scanners missed critical vulnerabilities.
– Nearly half (47%) of organizations now prefer a hybrid testing model combining AI and human expertise, a 22 percentage point increase in one year.
– AI/LLM vulnerabilities are harder to fix, with only 38% resolved and mean time to resolve rising from 19 to 36 days, indicating more complex security issues.
– The most common AI-related incidents were shadow AI (44%), data or model poisoning (41%), and improper output handling (41%), followed by supply chain issues (35%) and prompt injection (34%).
– Despite 60% of professionals needing stronger LLM testing, only 42% plan to increase human-led red team operations, highlighting a gap between need and action.
Trust in fully automated vulnerability scanning has taken a sharp nosedive, with a new study from Cobalt revealing that just 9% of organizations now rely exclusively on AI-driven testing, a dramatic drop from 29% one year earlier.
The Cobalt State of Pentesting Report 2026 draws on two comparative surveys of roughly 450 cybersecurity professionals conducted in 2025 and 2026. The findings paint a clear picture: false negatives from automated tools have severely damaged confidence. Nearly half of respondents (47%) now prefer a hybrid testing model that blends human expertise with AI, a 22-percentage-point surge over the past year. Meanwhile, 78% reported that fully automated scanning tools missed critical vulnerabilities.
The shift toward hybrid approaches is also evident in risk management. The share of organizations using automation specifically for low-risk environments jumped 22 points to 47%, signaling a more cautious, targeted deployment strategy.
“While the industry is rightfully excited about the potential of Mythos-class tools, unguided algorithms are inherently prone to returning even more false positives and costly false negatives than the automated scanners we have today,” said Andrew Obadiaru, CISO of Cobalt.
Why Trust Is Fading: The Expanding AI Attack Surface
A major driver behind this erosion of trust is the sheer complexity of the AI attack surface these scanners are expected to test. According to the report, nearly one in three findings from an AI pentest is rated high risk, 2.7 times the average for conventional software.
Resolution rates paint an even starker picture. At the time of analysis, less than 38% of LLM vulnerabilities had been fixed, while 62% remained open, the lowest resolution rate of any asset class. The mean time to resolve (MTTR) for AI and LLM security issues nearly doubled, climbing from 19 days to 36 days. Cobalt attributes this to teams now tracking “significantly harder vulnerabilities” than before.
“LLM vulnerabilities are deeply context-dependent and invisible to tools that lack an architectural understanding of the application,” Obadiaru continued. “To close the validation gap, automation should be deployed exactly where it excels, but elite human expertise remains foundational to uncovering and remediating the most complex business logic risks.”
Top AI-Related Incident Vectors
Among organizations that experienced AI-related incidents, shadow AI was the most common vector at 44%, followed by data or model poisoning (41%) and improper output handling (41%). Supply chain vulnerabilities (35%) and prompt injection (34%) rounded out the top five.
Despite the clear need for stronger LLM testing capabilities, expressed by 60% of security professionals, only 42% plan to increase human-led red team operations. This gap between recognized need and planned investment underscores the challenge organizations face in balancing automation’s speed with the nuanced judgment only human testers can provide.
(Source: Infosecurity Magazine)




