OpenAI Unveils Two New Open-Weight AI Reasoning Models

Summary
– OpenAI released two open-weight AI models, gpt-oss-120b and gpt-oss-20b, available for free download on Hugging Face, claiming state-of-the-art performance in benchmarks.
– The models come in two sizes: a larger version that runs on a single Nvidia GPU and a lighter version for consumer laptops, marking OpenAI’s first open release since GPT-2.
– OpenAI’s open models can connect to its closed models for complex tasks, reflecting a shift from its proprietary approach amid pressure from Chinese open-source AI labs.
– The models outperform competitors like DeepSeek and Qwen in coding and knowledge tests but hallucinate more frequently than OpenAI’s proprietary models.
– Released under the Apache 2.0 license, the models can be used commercially, though OpenAI has not disclosed their training data; they also underwent safety reviews to address concerns about misuse.
OpenAI has introduced two new open-weight AI reasoning models, marking a significant shift in its approach to artificial intelligence development. These models, now available for free download on Hugging Face, represent the company’s first major open-weight release since GPT-2 debuted more than five years ago. The move signals a strategic pivot as OpenAI faces increasing competition from Chinese AI labs dominating the open-source space.
The newly launched models come in two variants: gpt-oss-120b, a high-performance version capable of running on a single Nvidia GPU, and gpt-oss-20b, a more lightweight option designed for consumer laptops with 16GB of RAM. Both models leverage advanced reasoning capabilities, allowing developers to integrate them with OpenAI’s proprietary cloud-based systems for more complex tasks like image processing.
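For readers who want to experiment, the sketch below shows one plausible way to load the smaller model locally with the Hugging Face transformers library. The repo id openai/gpt-oss-20b matches the model name given above, but the exact id, API support, and generation settings here are assumptions, not details confirmed by the article.

```python
# Minimal sketch: running gpt-oss-20b locally via Hugging Face transformers.
# Assumes the weights are published under the "openai/gpt-oss-20b" repo id
# and that the installed transformers version supports the architecture.
# Note: even the "lightweight" model needs roughly 16GB of memory.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="openai/gpt-oss-20b",
    torch_dtype="auto",   # let transformers pick a dtype the hardware supports
    device_map="auto",    # place weights on GPU if available, else CPU
)

messages = [{"role": "user", "content": "Summarize mixture-of-experts in two sentences."}]
print(generator(messages, max_new_tokens=256)[0]["generated_text"])
```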
This release follows CEO Sam Altman’s recent admission that OpenAI may have been “on the wrong side of history” by keeping its technology closed-source. The decision also aligns with growing calls from U.S. policymakers for American AI firms to promote open innovation, countering the influence of Chinese competitors like DeepSeek, Alibaba’s Qwen, and Moonshot AI.
Performance benchmarks reveal that OpenAI’s models outperform many existing open-source alternatives. On Codeforces, a competitive coding test, gpt-oss-120b scored 2622, surpassing DeepSeek’s R1 but trailing behind OpenAI’s proprietary o3 and o4-mini models. Similarly, on Humanity’s Last Exam (HLE), which tests broad knowledge across disciplines, the open models achieved 19% and 17.3% accuracy, respectively, beating DeepSeek and Qwen while still falling short of OpenAI’s closed offerings.
However, hallucination rates remain a concern, with the models generating incorrect responses to nearly half of the questions in OpenAI’s PersonQA benchmark. This issue, while common in smaller models, underscores the trade-offs between accessibility and precision.
Behind the scenes, the models were trained using a mixture-of-experts (MoE) architecture, activating only a fraction of their total parameters per query for efficiency. Reinforcement learning further refined their reasoning abilities, enabling them to call external tools like web search or Python execution during problem-solving.
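To make the mixture-of-experts idea concrete, here is a small illustrative PyTorch layer, not OpenAI’s actual architecture: a learned router scores every expert for each token, but only the top-k experts execute, so most of the layer’s parameters stay inactive on any given query.

```python
# Illustrative top-k mixture-of-experts layer (a sketch, not OpenAI's implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, dim=512, num_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(dim, num_experts)  # per-token expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                                # x: (tokens, dim)
        scores = self.router(x)                          # (tokens, num_experts)
        weights, idx = scores.topk(self.k, dim=-1)       # keep only the k best experts per token
        weights = F.softmax(weights, dim=-1)             # renormalize the kept scores
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                 # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

layer = TopKMoE()
print(layer(torch.randn(4, 512)).shape)  # torch.Size([4, 512])
```

In a full transformer, a layer like this typically replaces each block’s feed-forward network, which is how a model can carry a very large total parameter count while activating only a fraction of it per token.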
Licensed under Apache 2.0, these models allow unrestricted commercial use, though OpenAI has opted not to disclose their training data, a decision likely influenced by ongoing legal disputes over copyrighted material in AI training sets. Safety evaluations found minimal risk, with no evidence that fine-tuning could push the models past dangerous capability thresholds.
As the AI landscape evolves, OpenAI’s move could reignite competition in the open-source domain, challenging rivals like Meta’s Llama and upcoming releases from DeepSeek. For developers, this presents new opportunities to build on a U.S.-developed AI stack while navigating the balance between openness and reliability.
(Source: TechCrunch)