Can LLMs Enhance Cybersecurity? The Surprising Truth

▼ Summary
– LLMs are increasingly used in cybersecurity for tasks like threat intelligence sorting, incident response guidance, and handling repetitive work, but raise questions about performance improvement and blind spots.
– A study found that while LLMs improve accuracy on routine security tasks like phishing detection, they can lead to over-reliance and reduced independent thinking, especially in complex decision-making.
– Individual resilience significantly impacts outcomes, with high-resilience users benefiting from LLMs without over-reliance, while low-resilience users may see no improvement or even performance decline.
– Organizations should implement human-in-the-loop structures, training on LLM failure modes, resilience-building exercises, team pairing, and continuous feedback to mitigate risks like automation bias and misinformation.
– AI systems should be designed to adapt to user resilience levels, with features like confidence indicators and alternative viewpoint prompts, to prevent uneven performance outcomes and ensure effective integration.
The integration of large language models into cybersecurity operations is rapidly shifting from experimental to essential, offering new ways to manage threats and streamline workflows. These AI systems are increasingly used to sift through threat intelligence, guide incident response, and automate repetitive tasks, allowing human analysts to focus on more complex challenges. However, this growing reliance raises important questions about when these tools genuinely enhance security and where they might introduce unforeseen vulnerabilities.
A recent study examined how human decision-making is influenced by LLM assistance in security contexts. Researchers compared the performance of individuals working with and without AI support, measuring outcomes related to accuracy, autonomy, and dependence on technology. Participants included cybersecurity graduate students who engaged in realistic scenarios based on CompTIA Security+ materials, covering areas such as phishing detection, password management, and incident handling. The study also evaluated each person’s resilience, their capacity to adapt, solve problems, and remain effective under pressure.
Results showed that LLMs significantly improved performance on routine activities. Those using AI support demonstrated higher accuracy in identifying phishing attempts and assessing password strength. They also provided more consistent evaluations of security policies and selected better-targeted responses during incident simulations. Yet these advantages had clear limits. When faced with sophisticated threats like advanced persistent threats or zero-day exploits, users sometimes adopted incorrect suggestions from the model. This underscores a critical weakness: LLMs can present flawed recommendations with unwarranted confidence.
According to Bar Lanyado, Lead Researcher at Lasso Security, organizations must actively prevent uncritical trust in automated systems. He emphasizes the need for a human-in-the-loop approach, where AI-generated outputs are treated as unverified hypotheses. Analysts should cross-reference suggestions with actual logs, network captures, or other reliable sources before taking action. Lanyado also advises confirming that recommended software packages are actively maintained and free from known vulnerabilities, and implementing governance policies such as allow lists and structured approval workflows.
The role of personal resilience emerged as a decisive factor in how effectively individuals used AI tools. Participants with high resilience performed well both with and without LLM support, and they used AI recommendations judiciously without surrendering independent judgment. In contrast, those with lower resilience gained little benefit, sometimes even performing worse, when relying on AI assistance. This divergence risks creating a two-tiered team dynamic, where gaps widen between those who can critically evaluate AI input and those who cannot. Over time, this may lead to reduced analytical diversity and increased over-reliance on automated systems.
Lanyado notes that security leaders must account for differences in team composition and training strategies, as not every employee interacts with automation in the same way. Variations in readiness can amplify organizational risk. He recommends four practical steps to build resilience and reduce blind trust: provide training on common LLM failure modes, including hallucinations, outdated knowledge, and prompt injection attacks.
These findings highlight that simply introducing an LLM does not uniformly improve team performance. Without thoughtful design, AI tools may enhance some users while leaving others behind. Adaptive interface designs can help, for example, by offering open-ended suggestions to resilient users while providing structured guidance and confidence indicators to those who need more support.
Another concern is automation bias, the tendency to trust AI recommendations even when they are flawed. Combating this requires institutional practices that mandate human oversight and encourage staff to challenge model outputs. Lanyado’s emphasis on validation and governance aligns with this need, promoting a culture where technology supports, rather than supplants, human expertise.
(Source: HelpNet Security)