Critical Copilot flaw let hackers steal 2FA codes

▼ Summary
– Microsoft patched a max-critical vulnerability in its M365 Copilot AI platform, which researchers exploited to retrieve 2FA codes and sensitive data from emails.
– AI bots cannot distinguish user instructions from those hidden in third-party content, causing them to comply with malicious data requests.
– Hackers bypass LLM guardrails by using markup language or HTML tags to exfiltrate data via web requests captured on attacker servers.
– Microsoft’s guardrails include wrapping Copilot output in code blocks and restricting untrusted site access, but both can be overcome.
– Researchers used a Parameter-to-Prompt Injection, placing a malicious command in a URL query parameter instead of email content, to exploit the vulnerability.
Last Tuesday, Microsoft rolled out a patch for a critical vulnerability in its M365 Copilot AI platform, which it rated as maximum severity. On Monday, the security researchers who discovered and reported the flaw detailed how their proof-of-concept exploit could extract sensitive data, including two-factor authentication (2FA) codes, from emails accessible to Copilot.
The root of the problem is a fundamental weakness shared by Microsoft and other large language model (LLM) providers: their AI bots cannot reliably distinguish between legitimate user instructions and malicious commands hidden within third-party content. When Copilot summarizes, drafts responses, or performs actions on behalf of a user, it can be tricked into complying with requests that reveal confidential information. This inability to secure the boundary between user input and external content leaves companies like Microsoft building complex, ad-hoc guardrails to contain the consequences of an inherently gullible system.
One such guardrail prevents Copilot from submitting web forms, sending emails, or taking other actions that could exfiltrate data. To bypass this, attackers have turned to markup language, which allows formatting elements like headings, lists, and links without requiring HTML tags. Another common workaround involves wrapping sensitive data inside tags like `` or `





