OpenAI Upgrades Free ChatGPT with Better Health Answers

▼ Summary
– OpenAI claims GPT-5.5 Instant performs comparably to its frontier Thinking models on health questions, based on its own evaluations.
– The company reports a 71% drop in factuality issues on live traffic and higher scores on its HealthBench and HealthBench Professional benchmarks.
– In a comparison with physicians, GPT-5.5 Instant’s responses were rated higher than doctor-written ones on accuracy, communication, and completeness.
– Over 230 million people ask ChatGPT health and wellness questions each week, making health one of its biggest use cases.
– OpenAI’s health accuracy claims are not verified by an independent third-party, creating a measurement gap for publishers.
OpenAI has announced that GPT-5.5 Instant, now the default model for free ChatGPT users, delivers results on health questions that are on par with its more advanced frontier Thinking models. This assessment comes from the company’s own internal health evaluations.
Health-related AI outputs are among the most scrutinized. A recent Guardian investigation, for instance, revealed that some Google AI Overviews offered inaccurate medical advice, prompting Google to pull those overviews for certain health queries. OpenAI’s update enters this high-stakes territory not by retreating but by asserting improvement.
For publishers and SEOs focused on health, this shift means a massive, free user base can now receive medical answers directly within ChatGPT, bypassing traditional website clicks.
What OpenAI Reported
OpenAI highlights progress on two benchmarks: HealthBench and HealthBench Professional, the latter tailored for clinical contexts. The company states that GPT-5.5 Instant outperforms its predecessor, GPT-5.3 Instant.
The company also notes a significant reduction in factuality issues in live traffic. According to its monitoring systems, the rate of health responses flagged for at least one potential factuality problem dropped by 71% over two months.
A third comparison involved physicians. OpenAI asked doctors to write responses to typical health dialogues, then had a separate panel of physicians evaluate both the human and AI responses. Across 3,500 reviewed responses, the panel rated GPT-5.5 Instant’s outputs higher than the physician-written ones on accuracy, communication, and completeness.
OpenAI adds that the new model exhibited fewer failure modes than both older models and the human doctors, including fewer instances of missing a red flag or failing to request additional context from the user.
How OpenAI Measured It
HealthBench is a benchmark developed internally with OpenAI’s physician network, using rubrics written by doctors rather than standard exam-style questions.
The company reports that it collaborates with over 260 physicians across 60 countries, and that doctors have reviewed more than 700,000 example responses to date. This physician count has been consistent since the January launch of ChatGPT Health. None of these results have been released for independent verification.
Health Is Already One of ChatGPT’s Biggest Use Cases
OpenAI states that more than 230 million people ask ChatGPT health and wellness questions each week, making this one of the chatbot’s most common uses.
Health also falls under a protected category in OpenAI’s policies. When the company began testing ads in ChatGPT, it explicitly barred ads from appearing in conversations about health, mental health, or politics.
Why This Matters
Medical queries already attract high exposure to AI-generated answers. A recent Ahrefs analysis found that health had the highest rate of any category for Google AI Overviews. If more of that demand shifts to ChatGPT’s free tier, zero-click pressure on publishers could intensify.
The accuracy claims, however, are difficult to verify independently. OpenAI conducted the tests internally, leaving the same measurement gap that exists for other AI health answers. The company asserts that its health responses have improved, but no third party has confirmed the results.
Looking Ahead
The announcement does not clarify how these changes affect citations. As more platforms move health answers to free tiers, the burden of verifying information and managing traffic loss falls squarely on practitioners.
(Source: Search Engine Journal)




