Topic: AI inference

  • AI Ignites the Cloud-Native Computing Boom

    The rapid expansion of artificial intelligence is driving massive growth in cloud-native computing, with AI inference workloads moving from isolated training into widespread enterprise use, creating unprecedented demand for scalable infrastructure. A new generation of cloud-native inference engin...

  • Modal Labs in Talks for $2.5B Valuation Funding Round

    Modal Labs is reportedly negotiating a major funding round that would value the AI infrastructure company at roughly $2.5 billion, a dramatic increase from its valuation five months ago. The company specializes in optimizing AI inference to reduce computing costs and latency, placing it in a high...

  • Microsoft Unveils New AI Chip for Faster Inference

    Microsoft has launched the Maia 200, a custom AI chip designed to efficiently run AI inference workloads, addressing the high computational demands of using trained models. The chip offers a major performance leap with over 100 billion transistors, delivering over 10 petaflops in 4-bit precision ...

  • Phison's aiDAPTIV+ Boosts PC AI Speed 10X, Expands Model Size 3X

    Phison's aiDAPTIV+ technology dramatically accelerates AI on consumer PCs by using high-speed SSD storage as an extension of GPU memory, eliminating redundant computations and enabling up to 10x faster inference. This innovation allows systems with modest hardware, like entry-level graphics and l...

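    The tiering idea behind aiDAPTIV+ — keeping hot data in a small fast tier and spilling cold data to larger, slower storage — can be sketched generically. This is an illustrative toy (the `TieredStore` class and its LRU policy are assumptions for the sketch, not Phison's actual design):

```python
# Illustrative two-tier store: a small fast tier stands in for GPU
# memory, a large slow tier stands in for SSD. Least-recently-used
# entries spill down when the fast tier fills; reads promote them back.
from collections import OrderedDict

class TieredStore:
    def __init__(self, fast_capacity):
        self.fast = OrderedDict()   # small, fast tier ("GPU memory")
        self.slow = {}              # large, slow tier ("SSD")
        self.fast_capacity = fast_capacity

    def put(self, key, value):
        self.fast[key] = value
        self.fast.move_to_end(key)            # mark as most recent
        while len(self.fast) > self.fast_capacity:
            old_key, old_val = self.fast.popitem(last=False)  # evict LRU
            self.slow[old_key] = old_val      # spill to the slow tier

    def get(self, key):
        if key in self.fast:
            self.fast.move_to_end(key)        # refresh recency
            return self.fast[key]
        value = self.slow.pop(key)            # "page in" from the slow tier
        self.put(key, value)                  # promote; may evict another entry
        return value

store = TieredStore(fast_capacity=2)
for i in range(4):
    store.put(i, i * 10)      # keys 0 and 1 spill to the slow tier
assert store.get(0) == 0      # fetched from the slow tier and promoted
```

    The real product does this at the driver/firmware level with model weights and activations, but the access pattern — serve from the fast tier when possible, fault in from flash otherwise — is the same shape.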
  • Elon Musk Predicts the Future of AI Gadgets: 'Devices Will Just Be...'

    Elon Musk envisions a future where AI processing shifts to edge computing, with devices handling tasks locally rather than relying on distant servers, potentially making traditional operating systems obsolete. He predicts that everyday electronics like smartphones and laptops will become "edge no...

  • Reface and Prisma Founders Launch Mirai for On-Device AI

    Mirai, a startup founded by the creators of Reface and Prisma, is developing technology to make AI models run faster and more efficiently on personal devices like smartphones, shifting toward on-device AI to reduce costs and improve privacy. The company's core product is a framework, includin...

  • Nvidia DGX Spark Update Slashes Idle Power by 32%

    Nvidia's DGX Spark, a compact AI development platform, received a software update that significantly reduces its idle power consumption, enhancing its value for energy-sensitive deployments. The update enables hot-plug detection for the ConnectX 7 NIC, allowing it to enter a low-power state, whic...

  • OpenAI's $10B Deal with Cerebras for AI Compute Power

    OpenAI has entered a multi-year, multi-billion dollar partnership with chipmaker Cerebras to secure 750 megawatts of dedicated compute capacity, aiming to power next-generation AI applications. The deal is intended to dramatically improve AI performance by using Cerebras's specialized systems to acceler...

  • Clarifai's New AI Engine Boosts Speed, Cuts Costs

    Clarifai has launched a reasoning engine that doubles AI processing speeds and cuts operational costs by up to 40%, offering a hardware-agnostic solution for businesses. The engine uses advanced optimizations like CUDA kernel enhancements and speculative decoding to boost performance on existing ...

  • Intel Unveils 160GB Energy-Efficient Inference GPU in New Annual Release

    Intel has launched the Crescent Island data center GPU, a 160GB model optimized for energy-efficient AI inference, marking the start of an annual GPU release schedule to compete in the AI infrastructure market. The GPU, built with Xe3P microarchitecture, is designed for high performance-per-watt ...

  • Mistral's AI Environmental Audit Reveals Planet Impact

    Mistral's environmental audit of its Large 2 AI model revealed that training and running queries account for over 85% of emissions and 91% of water use, highlighting key areas for mitigation. While individual AI interactions have modest impacts (1.14g CO₂ per text page), cumulative effects are su...

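    The "modest per-query, substantial in aggregate" point scales linearly from the reported 1.14 g CO₂ per text page. A back-of-envelope calculation (the daily volume below is a made-up example, not a Mistral figure):

```python
# Scale Mistral's reported per-page figure to a sustained query volume.
GRAMS_PER_PAGE = 1.14  # g CO2 per page of generated text (Mistral audit)

def annual_emissions_tonnes(pages_per_day, days=365):
    """Total CO2 in metric tonnes for a sustained daily volume."""
    grams = GRAMS_PER_PAGE * pages_per_day * days
    return grams / 1_000_000  # 1 tonne = 1e6 g

# A hypothetical 100 million pages/day for a year:
print(round(annual_emissions_tonnes(100_000_000)))  # → 41610 tonnes
```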
  • Watch Nvidia GTC 2026: Jensen Huang's Keynote Livestream

    Nvidia's GTC conference keynote, led by CEO Jensen Huang, is the primary venue for announcing major new products and partnerships, focusing this year on the future of AI and computing. The event will highlight AI's impact across sectors like healthcare and robotics, with anticipated announcements...

  • Phison CEO on 244TB SSDs, PLC NAND, and the Problem with High Bandwidth Flash

    The primary bottleneck for deploying advanced AI is insufficient memory, not processing power; memory shortfalls can cause system crashes and degrade the user experience with long delays, such as a slow Time to First Token. Phison's aiDAPTIV+ technology addresses this by using high-capacity SSDs as an expande...

  • OpenAI's Microsoft Payments Revealed in Leaked Docs

    Microsoft received significant revenue share payments from OpenAI, totaling $493.8 million in 2024 and $865.8 million in the first three quarters of 2025, under a 20% revenue-sharing agreement from their $13 billion investment. OpenAI's revenue has grown substantially, with estimates of at least ...

  • Cerebras Raises $1.1B While Still Private a Year After IPO Filing

    Cerebras Systems raised $1.1 billion in Series G funding, reaching an $8.1 billion valuation, co-led by Fidelity and Atreides Management, despite previous IPO plans for 2025. The company's growth is driven by strong demand for its AI inference services launched in 2024, leading to workforce expan...

  • Nvidia and Meta Forge New Era of Computing Power

    Nvidia is expanding its AI strategy beyond GPU-based training to include inference and agentic software, highlighted by a major deal with Meta involving billions in hardware, including its Grace CPU. The partnership marks a strategic shift as agentic AI workloads, which require more logical decis...

  • Mistral AI Acquires Koyeb to Boost Cloud Strategy

    Mistral AI has acquired the Paris-based startup Koyeb, marking its first acquisition to strengthen its position in the AI market. The deal signals a strategic shift for Mistral from pure research to becoming a full-stack AI provider, accelerating its Mistral Compute cloud service. Koyeb's technol...

  • Microsoft to Keep Buying Nvidia, AMD AI Chips Despite In-House Designs

    Microsoft has deployed its first custom AI accelerator, the Maia 200, designed as an "AI inference powerhouse" for running trained models, with performance reportedly surpassing rival chips. The company's strategy involves parallel innovation and partnership, as CEO Satya Nadella emphasized conti...

  • Qualcomm Repurposes Phone Chips to Challenge Nvidia in AI

    Qualcomm is launching two new AI processors, the AI200 and AI250, to directly compete with Nvidia by leveraging its mobile neural processing architecture. The chips are specialized for AI inference tasks, using technology from Qualcomm's Hexagon NPUs to run pre-trained models efficiently and scal...

  • Positron Secures $230M to Challenge Nvidia in AI Chip Race

    A Reno-based semiconductor startup, Positron, raised $230 million in Series B funding to accelerate its challenge to Nvidia by deploying specialized, high-speed memory chips for AI hardware. The funding round included Qatar's sovereign wealth fund, aligning with Qatar's strategic ambition to beco...

  • Cloudflare Buys Replicate to Power Global Serverless AI

    Cloudflare's acquisition of Replicate integrates a vast AI model library into its Workers platform, enabling developers to deploy sophisticated AI applications globally with minimal code. The move addresses the complexity and high costs of managing AI infrastructure, such as specialized GPU hardw...

  • XCENA's MX1 Memory: Thousands of RISC-V Cores, CXL 3.2 & SSD Tiering

    XCENA unveiled the MX1 computational memory platform at FMS 2025, which uses near-data processing and PCIe Gen6/CXL 3.2 to reduce latency and energy by placing compute resources directly alongside DRAM. The MX1 incorporates thousands of custom RISC-V cores for demanding workloads like AI and anal...

  • Microsoft's 132-Core Azure Cobalt 200 CPU Targets Performance Boost

    Microsoft has launched the Azure Cobalt 200, a 132-core Arm-based CPU built on TSMC's 3nm process, designed to enhance performance and efficiency for its cloud services. The processor offers over 50% higher performance than its predecessor, integrates hardware accelerators for compression and cry...

  • Meta's $100B AMD Deal Powers AI 'Superintelligence' Push

    Meta has entered a multiyear agreement to purchase up to $100 billion worth of AMD processors, a strategic move to diversify its chip supply and fuel its AI data centers. The deal includes a performance-based warrant for Meta to acquire up to 10% of AMD stock, with a final tranche contingent on A...

  • The Brutal Economics of Orbital AI

    The vision for orbital AI data centers is driven by the belief that space will soon be the cheapest location for AI compute, with major companies racing to develop prototypes despite significant technical and economic hurdles. The primary economic barriers are the immense costs of launch and sate...

  • Amazon Unveils New AI Chip, Hints at Nvidia Partnership

    AWS launched its new Trainium3 AI training chip and UltraServer system, promising major performance gains and a focus on improved energy efficiency for AI workloads. The Trainium3 chip offers over four times the speed and memory of its predecessor, with systems scalable to clusters of up to 1 mil...

  • AMD to power OpenAI with $10B+ chip deal for 6GW compute

    AMD has entered a multi-year agreement to supply OpenAI with six gigawatts of compute capacity using its Instinct GPU accelerators, starting with the MI450 series in late 2026. As part of the deal, OpenAI received an option to purchase up to 160 million AMD shares, vesting with compute delivery a...

  • OpenAI Pauses ChatGPT's Model Router for Most Users

    OpenAI has removed the automated model router for its free and $5 Go tier users, reverting them to the default GPT-5.2 Instant model to reduce operational costs and address user retention metrics. The router, which automatically directed complex queries to advanced reasoning models, led to a sign...

  • China Approves Import of Nvidia's High-End AI Chips

    China has approved three major tech firms, ByteDance, Alibaba, and Tencent, to import over 400,000 of Nvidia's advanced H200 AI chips, marking a shift from a previous weeks-long suspension of these shipments. The H200 chip represents a significant performance leap, being roughly six times more capa...
