
Origin Lab raises $8M to sell game data for world-model AI training

Summary

– Origin Lab is a startup that raised $8 million to create a marketplace connecting AI labs with video game companies for training data.
– Unlike large language models, physical world models lack easy data sources, leading labs to scramble for training sets.
– Video game companies can sell their digital assets as training data, which Origin Lab converts into usable formats for AI labs.
– Origin Lab acts as a bridge, solving licensing and data quality issues that previously hindered the use of video game footage for AI training.
– The startup’s fundraising success signals a growing market for data suppliers to major AI labs, where data is a key bottleneck.

As artificial intelligence moves beyond text and images into the physical realm, researchers are scrambling for a scarce commodity: high-quality training data for world models. Unlike large language models, which can draw from the vast expanse of the internet, these new systems require information about how objects move, interact, and behave in three-dimensional space. That data is not easy to come by.

Enter Origin Lab, a startup betting that the answer lies inside the video game industry. The company has just closed an $8 million seed round led by Lightspeed Ventures, with participation from SV Angel, Eniac, Seven Stars, and FPV. Notable angel investors include Twitch co-founder Kevin Lin and Cruise founder Kyle Vogt.

“The AI systems that are being built now need to understand how the physical world works and how things move,” said Anne-Margot Rodde, co-CEO and co-founder of Origin Lab. “That data essentially lives in video games.”

The startup’s model is straightforward: it acts as a marketplace for licensed game data, connecting AI labs like Yann LeCun’s AMI Labs or Fei-Fei Li’s World Labs with video game studios that already own vast digital environments. On one side, AI researchers get access to high-fidelity, legally clean training data. On the other, game developers unlock a new revenue stream from assets they have already built. Origin Lab sits in the middle, converting game assets into usable training formats, whether that means running a single rendering pass or automating the capture of hours of in-game footage.

“It became clear that the video game industry was sitting on some incredibly valuable data, but there was no real way or infrastructure to basically connect AI labs and the video game industry,” Rodde explained. “So essentially, we built that bridge.”

The idea of mining video games for AI training is not new. Researchers have long recognized the potential, but licensing hurdles and data quality concerns have kept the market from taking off. In December 2024, OpenAI faced backlash when its Sora video-generation model appeared to reproduce footage from popular games and streamers, likely because it had been trained on Twitch content. Amazon has also expressed interest in using Twitch streams for model training.

Origin’s ability to raise capital signals a growing appetite for specialized data infrastructure. Faraz Fatemi, a partner at Lightspeed who led the investment, pointed to the success of companies like Scale AI as evidence of the opportunity.

“We’ve seen how sharp the revenue scaling can be for data vendors that are serving the major labs,” Fatemi said. “These are very well-capitalized businesses, and the bottleneck for all of them is data.”

(Source: TechCrunch)
