OpenAI’s New Coding Model Skips Nvidia, Uses Compact Chips

February 13, 2026 (Last Updated: February 13, 2026)

Summary

– OpenAI released its GPT-5.3-Codex-Spark coding model, its first production AI model to run on non-Nvidia hardware, using chips from Cerebras.
– The new model is designed for speed, generating code at over 1,000 tokens per second, which is roughly 15 times faster than its predecessor.
– It is a text-only model tuned specifically for coding tasks and is available as a research preview to ChatGPT Pro subscribers through various interfaces.
– OpenAI claims Spark outperforms its older GPT-5.1-Codex-mini on software engineering benchmarks while completing tasks much faster, though these claims lack independent validation.
– The release marks a significant speed leap over the fastest models OpenAI has previously served on its own infrastructure, such as GPT-4o, which deliver far fewer tokens per second.

OpenAI has introduced a new coding model that marks a significant departure from its reliance on Nvidia hardware. The company’s latest release, GPT-5.3-Codex-Spark, runs on chips from Cerebras and reportedly generates code at more than 1,000 tokens per second, roughly fifteen times faster than its predecessor. For comparison, Anthropic’s Claude Opus 4.6, a larger and more capable model, reaches about 2.5 times its standard speed in a new premium fast mode, or approximately 170 tokens per second.
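The speedup figures above imply baseline speeds that the article does not state directly. The short sketch below backs them out from the quoted numbers. It assumes the "over 1,000 tokens per second" figure can be treated as exactly 1,000 and that the quoted multipliers are exact, so the results are rough estimates rather than measured values. The helper name `implied_baseline` is purely illustrative.

```python
# Figures quoted in the article. 1,000 tokens/s is a reported floor ("over 1,000"),
# so everything derived from it is an approximation.
SPARK_TPS = 1_000
SPARK_SPEEDUP = 15        # "roughly fifteen times faster than its predecessor"
OPUS_FAST_TPS = 170       # Claude Opus 4.6 fast mode, approximate
OPUS_FAST_SPEEDUP = 2.5   # fast mode relative to its standard speed


def implied_baseline(tps: float, speedup: float) -> float:
    """Back out the baseline throughput implied by a speedup factor."""
    return tps / speedup


predecessor_tps = implied_baseline(SPARK_TPS, SPARK_SPEEDUP)
opus_standard_tps = implied_baseline(OPUS_FAST_TPS, OPUS_FAST_SPEEDUP)
spark_vs_opus_fast = SPARK_TPS / OPUS_FAST_TPS

print(f"Implied predecessor speed: ~{predecessor_tps:.0f} tokens/s")
print(f"Implied Opus 4.6 standard speed: ~{opus_standard_tps:.0f} tokens/s")
print(f"Spark vs. Opus 4.6 fast mode: ~{spark_vs_opus_fast:.1f}x")
```

Under these assumptions, the predecessor and Claude Opus 4.6 in standard mode both land in the high 60s of tokens per second, and Spark comes out a little under six times faster than the Opus fast mode.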

Sachin Katti, OpenAI’s head of compute, highlighted the partnership, stating that Cerebras has been an excellent engineering collaborator and that the company is enthusiastic about adding fast inference as a new platform capability. The model is currently available as a research preview to ChatGPT Pro subscribers, who pay a monthly fee of two hundred dollars. Access is provided through the Codex application, a command-line interface, and a Visual Studio Code extension. OpenAI is also granting API access to a select group of design partners. At launch, the model features a 128,000-token context window and is designed exclusively for text processing.

This new release is based on the full GPT-5.3-Codex model that OpenAI introduced earlier in the month. While the comprehensive version tackles complex, agentic coding tasks, Spark has been specifically optimized for raw speed rather than depth of knowledge. It was built as a text-only system and fine-tuned for coding purposes, distinguishing it from the general-purpose capabilities of the larger GPT-5.3 model.

According to OpenAI’s internal testing, Spark demonstrates strong performance on established software engineering benchmarks. The company reports that on evaluations like SWE-Bench Pro and Terminal-Bench 2.0, the new model outperforms the older GPT-5.1-Codex-mini while completing tasks in significantly less time. These figures have not been independently validated. Historically, Codex’s speed has been a point of criticism; in a previous test where multiple AI coding agents were tasked with building Minesweeper clones, Codex took nearly twice as long as Anthropic’s Claude Code to produce a functional game.

The introduction of this model intensifies the ongoing competition among coding agents. The reported throughput of 1,000 tokens per second constitutes a major advancement over what OpenAI has previously delivered using its own infrastructure. Independent benchmarks from Artificial Analysis indicate that the company’s fastest models operating on Nvidia hardware fall well below this new threshold. For instance, GPT-4o delivers around 147 tokens per second, the o3-mini model reaches about 167, and GPT-4o mini operates at approximately 52 tokens per second.
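To put the Artificial Analysis figures in proportion, the sketch below computes how many times faster a 1,000 tokens-per-second model is than each of the quoted Nvidia-hosted models. It treats the reported "over 1,000" as exactly 1,000, so the ratios are approximate lower bounds. The 2,000-token response length is a hypothetical chosen only for illustration, and the timing ignores latency before the first token.

```python
# Throughput figures quoted in the article (tokens per second).
SPARK_TPS = 1_000  # reported floor, treated as exact for this estimate
NVIDIA_HOSTED_TPS = {
    "GPT-4o": 147,
    "o3-mini": 167,
    "GPT-4o mini": 52,
}

# Hypothetical response length, not taken from the article.
EXAMPLE_TOKENS = 2_000


def speedup_over(baseline_tps: float, target_tps: float = SPARK_TPS) -> float:
    """Return how many times faster target_tps is than baseline_tps."""
    return target_tps / baseline_tps


def generation_seconds(tokens: int, tps: float) -> float:
    """Time to stream `tokens` at a steady rate of `tps` tokens per second."""
    return tokens / tps


speedups = {name: speedup_over(tps) for name, tps in NVIDIA_HOSTED_TPS.items()}

for name, tps in NVIDIA_HOSTED_TPS.items():
    secs = generation_seconds(EXAMPLE_TOKENS, tps)
    print(f"{name}: {tps} tokens/s, Spark is at least ~{speedups[name]:.1f}x faster, "
          f"{EXAMPLE_TOKENS} tokens in ~{secs:.1f} s")

print(f"Spark: {EXAMPLE_TOKENS} tokens in ~"
      f"{generation_seconds(EXAMPLE_TOKENS, SPARK_TPS):.1f} s")
```

On the quoted numbers, the ratios come out to roughly 6.8x over GPT-4o, 6.0x over o3-mini, and 19.2x over GPT-4o mini.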

(Source: Ars Technica)
