AI Misbehavior? New Tool Lets You Report It

▼ Summary
– A new crowdsourced website, FLARE-AI, allows users to report and track AI harms such as malware generation or leaks of personal information, similar to Downdetector for outages.
– The system uses open source code to let others verify issues and route reports to model makers and organizations like MITRE.
– The initiative was developed by 49 AI experts from 32 organizations, who argue it is crucial as AI adoption grows and agentic systems gain power.
– Current reporting mechanisms are fragmented, with no centralized way to report AI flaws, leading to problems like psychological harm, bias, and misinformation going unrecognized.
– Recent incidents, such as duping AI browsers into hacking websites or tricking Claude into revealing data, highlight the ease with which AI can go wrong.
Every week, while testing the latest developments in AI for Writing AI Lab, I run into models that behave oddly or even dangerously. Until now, my only option has been to document these encounters for readers. That may finally be changing.
A coalition of AI researchers has launched FLARE-AI (Flaw Reporting for AI) , a crowdsourced platform designed to track and report harms caused by artificial intelligence. If a chatbot generates malware, provides instructions for making bombs, leaks sensitive personal data, or encourages delusional thinking, users can now raise the alarm through this system. The open source code allows independent verification of reports and routes verified issues to model developers as well as organizations like MITRE, a nonprofit that monitors technical system failures. Think of it as a Downdetector for AI misbehavior , a centralized hub for real-time incident reporting.
This website builds on work the group has been doing for over a year. I covered their earlier efforts last year. Members of the team also consulted on a congressional bill introduced in June, which would task the U. S. government with formally tracking AI misconduct.
“Right now, there is no centralized, accountable way to report flaws in AI systems,” says Avijit Ghosh, an AI policy researcher at HuggingFace who co-led FLARE-AI’s development alongside computer scientists Elaine Zhu and Shayne Longpre.
The reporting system was built with input from 49 AI experts across 32 organizations. In a paper detailing their work, the researchers argue that such a tool will become increasingly vital as AI adoption expands and agentic systems gain more autonomy. They see the absence of a consistent reporting mechanism as a critical gap.
“I think it’s a really good initiative,” says Jessica Ji, a researcher at the Center for Security and Emerging Technology. Ji agrees that current reporting systems are fragmented and that AI models remain black boxes. “I’m in support of anything that makes AI more transparent,” she adds.
While bugs and cybersecurity flaws dominate headlines, Ghosh points out that AI-related harms extend much further. They include psychological harm, discrimination, bias, and misinformation. Because different companies apply different standards, many problems simply go unnoticed. “In the absence of a coordinated disclosure system, there are no external mechanisms to enforce transparency,” Ghosh says.
Recent incidents underscore how easily AI can veer off course.
Earlier this week, security firm LayerX revealed a method to trick AI-powered browsers like OpenAI’s Atlas and Perplexity’s Comet into bypassing their guardrails. By convincing the underlying model it was playing a game, researchers could make the browser go rogue and attempt to hack a website. LayerX says the affected companies have since fixed the issue. And in April, security researcher Johann Rehberger demonstrated a way to trick Claude into leaking personal information using images generated by ChatGPT.
AI also introduces bizarre new problems. Last year, OpenAI had to update its models after discovering they were overly sycophantic, sometimes reinforcing users’ delusional thinking.
Rumman Chowdhury, CEO and founder of Humane Intelligence PBC, believes FLARE-AI could help many AI developers implement structured reporting for their tools. But she cautions that such initiatives face significant challenges, including ensuring trust, preventing abuse, and maintaining momentum.
(Source: Wired)