
Nous Research Launches Hermes 4 AI, Outperforming ChatGPT Without Restrictions

Summary

– Nous Research released Hermes 4, a family of open-source large language models that match proprietary system performance while offering minimal content restrictions and user control.
– The models feature “hybrid reasoning,” allowing users to toggle between fast responses and step-by-step thinking processes with full transparency into the AI’s internal reasoning.
– Hermes 4 scored 96.3% on MATH-500 and 57.1% on RefusalBench, where it outscored proprietary models such as GPT-4o (17.67%) and Claude Sonnet 4 (17%).
– The models were trained using two new systems, DataForge (a synthetic data generator) and Atropos (an open-source reinforcement learning framework); training the largest model took 71,616 GPU hours on 192 Nvidia B200 GPUs.
– Nous Research advocates for user control over corporate content policies, positioning Hermes 4 as a challenge to Big Tech’s AI dominance and sparking debate about AI safety versus innovation.

Nous Research has unveiled Hermes 4, a groundbreaking family of large language models that challenges the dominance of proprietary AI systems by delivering comparable performance with fewer content restrictions and greater user control. This release marks a pivotal moment in the ongoing debate between open-source innovation and corporate-controlled artificial intelligence, offering a new paradigm for developers and enterprises seeking flexible, high-performance AI tools.

The Hermes 4 models introduce a novel “hybrid reasoning” feature, enabling users to switch between rapid responses and detailed, step-by-step cognitive processes. When activated, the model reveals its internal reasoning within specialized tags before delivering a final answer, providing full transparency into its problem-solving approach. This method has demonstrated remarkable results, with the largest 405-billion parameter model achieving 96.3% on the MATH-500 benchmark and strong performance in the demanding AIME’24 mathematics competition.
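
As a rough illustration of how an application might consume this kind of output, the sketch below separates the visible reasoning from the final answer. The article does not name the tags, so the `<think>...</think>` delimiter, the `split_reasoning` helper, and the sample string are assumptions made for the example, not a documented Hermes 4 interface.

```python
import re
from typing import NamedTuple


class ParsedReply(NamedTuple):
    reasoning: str  # text found inside the reasoning tags ("" if none)
    answer: str     # everything outside the reasoning tags


# Assumption: reasoning is wrapped in <think>...</think>; adjust the pattern
# if the model you use emits a different delimiter.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(raw: str) -> ParsedReply:
    """Separate the step-by-step reasoning from the final answer."""
    parts = [block.strip() for block in THINK_BLOCK.findall(raw)]
    answer = THINK_BLOCK.sub("", raw).strip()
    return ParsedReply("\n".join(parts), answer)


# Hypothetical model output, written by hand for this example.
sample = "<think>2 + 2 is 4, then double it.</think>The result is 8."
parsed = split_reasoning(sample)
print("Reasoning:", parsed.reasoning)
print("Answer:", parsed.answer)
```

A reply produced in fast mode would contain no tagged block, in which case `reasoning` comes back empty and the whole text is treated as the answer.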

A particularly telling metric comes from RefusalBench, a new evaluation developed by Nous Research to measure how often AI systems decline to answer queries; a higher score indicates a model that answers more often. Hermes 4 scored 57.1% in reasoning mode, far ahead of GPT-4o and Claude Sonnet 4, which scored 17.67% and 17% respectively. This reflects the company’s commitment to minimizing refusals and maximizing responsiveness.

Behind these capabilities lie two innovative training systems: DataForge and Atropos. DataForge employs graph-based synthetic data generation, transforming basic information into complex instructional examples through what the company describes as “random walks.” Atropos, an open-source reinforcement learning framework, functions like a series of specialized training environments where models refine skills such as mathematics, coding, and creative writing through rejection sampling, ensuring only high-quality responses contribute to training.
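
The rejection sampling idea can be sketched in a few lines: generate several candidate responses, check each one with a verifier, and keep only those that pass. This is a generic illustration of the technique, not Atropos code; `rejection_sample`, `toy_generate`, and `toy_verify` are hypothetical names, and the arithmetic task stands in for a real training environment.

```python
import random
from typing import Callable, List


def rejection_sample(
    prompt: str,
    generate: Callable[[str, random.Random], str],
    verify: Callable[[str, str], bool],
    num_candidates: int = 8,
    seed: int = 0,
) -> List[str]:
    """Draw several candidates and keep only the ones the verifier accepts."""
    rng = random.Random(seed)
    candidates = [generate(prompt, rng) for _ in range(num_candidates)]
    return [c for c in candidates if verify(prompt, c)]


# Toy stand-in for a model: answers "a+b" prompts, sometimes off by one.
def toy_generate(prompt: str, rng: random.Random) -> str:
    a, b = (int(x) for x in prompt.split("+"))
    return str(a + b + rng.choice([-1, 0, 0, 1]))


# Toy stand-in for an environment's checker: exact-match on the true sum.
def toy_verify(prompt: str, response: str) -> bool:
    a, b = (int(x) for x in prompt.split("+"))
    return response == str(a + b)


kept = rejection_sample("17+25", toy_generate, toy_verify)
print(f"kept {len(kept)} of 8 candidates: {kept}")
```

Only the accepted responses would be passed on as training examples; rejected candidates are simply discarded.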

Training the largest model used 192 Nvidia B200 GPUs for a total of 71,616 GPU hours. While computationally intensive, this investment demonstrates how focused technical strategies can compete with the vast resources of tech giants.
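
For a sense of scale, the two figures can be converted into approximate wall-clock time. The calculation below assumes all 192 GPUs ran in parallel for the entire run, which the article does not state, so treat the result as a rough estimate.

```python
GPU_COUNT = 192           # Nvidia B200 GPUs, as reported
TOTAL_GPU_HOURS = 71_616  # total GPU hours, as reported

# Assumption: every GPU was busy for the whole run.
wall_clock_hours = TOTAL_GPU_HOURS / GPU_COUNT
wall_clock_days = wall_clock_hours / 24

print(f"{wall_clock_hours:.0f} hours, about {wall_clock_days:.1f} days")
# prints "373 hours, about 15.5 days"
```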

Nous Research has built its reputation on a philosophy that prioritizes user control over restrictive corporate policies. The company argues that excessive safety measures hinder innovation and usability. As one investor noted, Hermes 4 is “not shackled by disclaimers, rules, and being overly cautious,” making it highly appealing to researchers and developers who value flexibility.

The release arrives amid growing momentum in open-source AI, with models like Meta’s Llama 3.1 and DeepSeek’s R1 narrowing the performance gap with proprietary systems. Hermes 4 strengthens this trend, particularly in reasoning, a domain where closed systems have traditionally excelled.

One significant technical hurdle involved preventing the model from overthinking. Researchers found that the smaller models often got stuck in looping reasoning chains that ran until they hit the maximum context length. A second training phase taught the models to cap reasoning at 30,000 tokens, reducing overlong generations by 65% to 79% while preserving performance.

Hermes 4 is accessible through multiple channels, including free downloads on Hugging Face and API access via Nous Chat and partner platforms. This availability provides enterprises and researchers with a customizable, cost-effective alternative to commercial AI services.

The introduction of Hermes 4 represents more than a technical milestone; it signals a shift in how artificial intelligence may evolve. By emphasizing transparency, user agency, and open access, Nous Research challenges the notion that AI advancement must be governed by a few powerful corporations. Whether this approach will prove sustainable or problematic remains uncertain, but Hermes 4 shows that innovation in AI is not exclusive to those with the largest budgets.

(Source: VentureBeat)
