Nvidia to Invest $26 Billion in Open-Weight AI Models

Summary

– Nvidia plans to invest $26 billion over five years to build open source AI models, a strategic move to evolve from a chipmaker into a frontier AI lab.
– The company released its most capable open-weight model, Nemotron 3 Super, which has 128 billion parameters and claims to outperform OpenAI’s GPT-OSS on several benchmarks.
– Nvidia’s open source approach includes releasing model weights and technical innovations, making it easier for others to modify and build upon its work.
– This investment and open model strategy aim to entrench Nvidia’s position as the leading AI chipmaker, as the models are optimized for its hardware.
– The open model landscape is shifting, with Meta reconsidering its openness and many top Chinese models being freely available, influencing global startup and research development.

Nvidia has committed $26 billion over the next five years to developing open-source artificial intelligence models, a strategic initiative confirmed in recent financial filings and executive interviews. The commitment signals a potential evolution for the company, moving beyond its core identity as a premier chip manufacturer to establish itself as a frontier AI research lab capable of competing directly with the likes of OpenAI and DeepSeek. A key strategic advantage is that the new models will be tuned to run optimally on Nvidia’s own hardware, potentially further solidifying its dominant position in supplying the silicon that powers the AI industry.

Open-weight or open-source models are those whose parameters, the weights that dictate a model’s behavior, are publicly released, often alongside architectural and training details. This transparency allows developers, researchers, and companies to download, run, and modify the models on their own infrastructure. Nvidia’s approach extends to sharing the technical innovations behind model construction and training, which lowers the barrier for startups and academic teams to build on the company’s foundational work.
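To make the download-and-run point concrete, a minimal sketch of loading an open-weight model locally with the Hugging Face transformers library might look like the following; the repository identifier is a hypothetical placeholder, not an actual Nvidia release name.

```python
# Minimal sketch: running an open-weight model on your own hardware.
# The repository ID below is a hypothetical placeholder, not a real release.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "example-org/open-weight-demo-model"  # placeholder name for illustration

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Summarize why open-weight models matter for startups."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the weights live on the user’s own machine, the same model object can also be fine-tuned or quantized, which is what separates this workflow from calling a proprietary model through a cloud API.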

Coinciding with this investment announcement, Nvidia unveiled Nemotron 3 Super, its most advanced open-weight model to date. At 128 billion parameters, a common measure of a model’s size and complexity, it is roughly comparable to the largest iteration of OpenAI’s GPT-OSS. Nvidia asserts that its new model surpasses GPT-OSS and other competitors across several performance benchmarks. The company reports that Nemotron 3 Super scored 37 on the comprehensive Artificial Intelligence Index, which evaluates models across ten different tests, outperforming GPT-OSS’s score of 33, though several Chinese models achieved higher scores on that particular index. Nvidia also claims its model ranks first on a new benchmark called PinchBench, which secretly tested the model’s ability to control a system called OpenClaw.

The development of Nemotron 3 Super incorporated several novel technical advancements in architecture and training methodology. These innovations were designed to enhance the model’s reasoning capabilities, improve its handling of long-context information, and increase its responsiveness to reinforcement learning techniques. Bryan Catanzaro, Nvidia’s Vice President of Applied Deep Learning Research, emphasized the company’s deepened commitment, stating they are taking open model development “much more seriously” and are making significant progress.

The landscape for open models has seen notable shifts. Meta pioneered the trend among major firms with its Llama model in 2023, though recent corporate refocusing has cast doubt on the full openness of its future releases. While OpenAI offers an open-weight GPT-OSS model, it is considered inferior to its proprietary offerings and not designed for easy modification. In contrast, leading U.S. models from companies like Anthropic and Google are typically accessible only via cloud APIs or chat interfaces. This has created an opening where many of the world’s top-performing open models now originate from Chinese firms like DeepSeek, Alibaba, and Moonshot AI, which release their weights freely. Consequently, a global community of innovators is increasingly building applications atop these Chinese foundations.

Catanzaro, who played a pivotal role in Nvidia’s transition from gaming graphics to AI silicon, frames the open-source investment as beneficial for the entire ecosystem’s growth. The Nemotron lineage began in late 2023, and the company has already completed pre-training a massive 550-billion-parameter model, a process that involves processing enormous datasets across vast arrays of specialized chips. Nvidia has since released a portfolio of models tailored for specific domains such as robotics, climate science, and protein folding.

Kari Briski, Vice President of Generative AI Software for Enterprise, explains that developing these cutting-edge models serves a dual purpose for Nvidia. It not only advances the AI software frontier but also provides invaluable stress tests for the company’s own hardware systems. Building and training such models helps Nvidia push the limits of its compute, storage, and networking technologies within its supercomputer-scale data centers, directly informing and accelerating its future hardware architecture roadmap.

(Source: Wired)
