AI Agents Advance in Writing and Hacking Code

June 26, 2025

Summary

– AI models are improving at identifying software bugs, with UC Berkeley researchers finding 17 vulnerabilities (15 previously unknown) in 188 codebases using the CyberGym benchmark.
– AI-powered tools like Xbow are rising in cybersecurity rankings, with Xbow securing the top spot on HackerOne’s bug-hunting leaderboard and raising $75 million in funding.
– AI’s coding and reasoning advancements are transforming cybersecurity, with UC Berkeley’s Dawn Song calling it a “pivotal moment” that exceeded expectations.
– AI can automate both finding and exploiting security flaws, potentially aiding defenders and hackers alike, and its performance improves with more resources and time.
– While AI shows promise in detecting zero-day vulnerabilities, it still struggles with most flaws and complex bugs, highlighting current limitations.

Artificial intelligence is rapidly transforming software development and cybersecurity, with new research demonstrating AI’s growing ability to identify critical vulnerabilities in complex codebases. Recent studies show these systems can now uncover previously unknown security flaws at an unprecedented scale, marking a significant shift in how software gets tested and secured.

At UC Berkeley, researchers put cutting-edge AI models through rigorous testing using a specialized benchmark called CyberGym. The results were striking: the AI pinpointed 17 distinct vulnerabilities across 188 major open-source projects, including 15 zero-day vulnerabilities that had never been documented before. According to Dawn Song, the Berkeley professor leading the project, many of these discoveries involved high-risk security gaps that could have serious real-world consequences if left unpatched.

The cybersecurity landscape is evolving quickly as AI-powered tools demonstrate their potential. Startup Xbow, for instance, has already climbed to the top of HackerOne’s bug bounty leaderboard with its AI-driven vulnerability detection system, and it recently secured $75 million in funding to expand its capabilities. Song notes that AI’s combination of advanced code comprehension and reasoning skills is reshaping security practices faster than many anticipated. “We’re at a turning point,” she explains. “The performance surpassed what we initially projected.”

While these advancements promise to help organizations strengthen their defenses, they also raise concerns about malicious applications. The same AI systems that detect flaws could theoretically be weaponized to exploit them. Song’s team found that even with limited resources, their AI agents generated hundreds of proof-of-concept attacks, suggesting that with more time and investment, the technology could become even more effective at uncovering weaknesses.

The Berkeley study evaluated multiple AI models, including offerings from OpenAI, Google, and Anthropic, alongside open-source alternatives from Meta, DeepSeek, and Alibaba. These systems were paired with specialized cybersecurity agents designed to scan code, run tests, and develop exploit demonstrations. By analyzing vulnerability descriptions from existing projects, the AI tools successfully replicated known flaws while also discovering entirely new ones.

Despite these breakthroughs, challenges remain. The AI struggled with particularly intricate vulnerabilities and missed a substantial portion of existing flaws, indicating that human expertise still plays a crucial role in cybersecurity. Still, real-world successes, such as Google’s Project Zero identifying unknown vulnerabilities and security researcher Sean Heelan uncovering a Linux kernel flaw with AI assistance, highlight the technology’s growing influence.

As AI continues to advance, its role in both securing and potentially compromising digital systems will only expand. While not yet perfect, these systems are proving they can automate vulnerability detection at scale, forcing the industry to adapt to a future where AI is deeply embedded in cybersecurity strategies.

(Source: Wired)