
OpenAI accused of sentiment analysis controversy

Summary

– The author is a news editor at The Verge covering technology and gaming, having joined in 2019 after working at Techmeme.
– OpenAI showcased GPT-5 during a livestream, but some charts displayed misleading or inconsistent data visualizations.
– One chart misrepresented deception rates, drawing GPT-5’s 50.0% score as a far larger bar than a nearly identical 47.4% score from another model.
– Another chart showed GPT-5 with a lower score but a larger bar, while other models with different scores had identically sized bars, an error CEO Sam Altman publicly acknowledged.
– OpenAI faced scrutiny over the charts, which undermined its claims about GPT-5’s improved accuracy and reduction of hallucinations.

OpenAI faces scrutiny over misleading performance charts during its GPT-5 showcase, raising questions about data representation and transparency. The company’s highly anticipated livestream event took an awkward turn when viewers noticed inconsistencies in several graphs meant to highlight the model’s capabilities.

One particularly glaring example involved a chart comparing “deception evaluation” scores across models. GPT-5 reportedly posted a 50.0% deception rate on coding tasks, yet its bar was drawn far larger than that of OpenAI’s own o3 model, which scored a nearly identical 47.4%. Another slide showed GPT-5 with a lower score than o3, yet its bar was inexplicably bigger. More confusing still, o3 and GPT-4o were given identically sized bars despite having different scores.
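Drawn to scale, the two deception scores cited in the article differ by only 2.6 points, so their bars should be nearly the same length. A minimal sketch of proportional bar rendering (plain Python with ASCII bars standing in for the livestream’s graphics; the score values are the ones reported above):

```python
def ascii_bar(value, max_value, width=40):
    """Render a text bar whose length is proportional to value."""
    filled = round(width * value / max_value)
    return "#" * filled

# Scores from the deception-evaluation chart described in the article.
scores = {"GPT-5": 50.0, "o3": 47.4}

top = max(scores.values())
for model, score in scores.items():
    print(f"{model:>6} {ascii_bar(score, top):<40} {score}")
```

Scaled this way, the 47.4% bar comes out only about 5% shorter than the 50.0% bar, which is the proportionality the on-stream chart failed to preserve.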

The errors didn’t go unnoticed. OpenAI CEO Sam Altman publicly acknowledged the mishap, calling it a “mega chart screwup” on social media. A marketing team member later apologized, joking about the “unintentional chart crime.” However, the blunder overshadowed part of the company’s messaging, particularly its claims about GPT-5’s improved accuracy in reducing hallucinations, a term used to describe AI-generated falsehoods.

OpenAI hasn’t clarified whether GPT-5 itself was involved in creating the flawed visuals. Regardless, the incident highlights the challenges of presenting complex AI performance metrics clearly, especially during high-stakes product launches where precision matters. For a company positioning itself as a leader in trustworthy AI, even small missteps in data visualization can fuel skepticism.

The episode serves as a reminder that transparency in AI benchmarking remains critical, not just in technical capabilities but in how results are communicated to the public. As competitors and regulators scrutinize OpenAI’s claims, accurate representation of data will be just as important as the underlying technology.

(Source: The Verge)

Topics

OpenAI GPT-5 showcase; misleading data visualizations; deception evaluation scores; CEO Sam Altman’s response; AI performance metrics transparency; public scrutiny of OpenAI; AI-generated falsehoods (hallucinations)