Apple brings Gemini AI to iPhone for smarter Siri

May 28, 2026 (Last Updated: May 28, 2026)

Summary

– Apple has delayed AI-enhanced Siri multiple times since 2024, but a deal with Google will merge it with Gemini later this year.
– Apple’s Gemini-infused Siri will run both on-device and in the cloud, reversing its privacy-focused preference for local AI.
– Smartphone GPUs can process more AI tokens than AI-focused NPUs, and phones lack the RAM to keep enormous models in memory.
– On-device AI models have at most a few billion parameters, while Google’s latest Gemini models reportedly have trillions, leaving local models noticeably less capable.
– Google’s Gemini Nano is designed for contextual features on mobile, while Siri requires a conversational model, the kind that Google itself always runs in the cloud on Android.

It’s nearly impossible to interact with modern technology without encountering generative AI, but Apple has been notably slower to embrace it. That lag isn’t entirely intentional. The company has repeatedly postponed the AI-enhanced version of Siri since first teasing it in 2024, but a new partnership with Google promises to finally merge the assistant with Gemini later this year. As the Worldwide Developers Conference approaches, Apple has been working to deliver powerful AI capabilities within the constraints of a smartphone. However, the outcome may not sit well with loyal Apple fans.

Apple has long emphasized the privacy benefits of running AI directly on the device, but fresh reporting suggests that despite its efforts, the upcoming Gemini-infused Siri will rely heavily on cloud infrastructure from Google and Nvidia. According to The Information, Siri’s new capabilities will operate both on-device and in the cloud, marking a notable shift from Apple’s stated preference for local processing.

Every new chip announcement touts AI optimization, and Apple is no exception with its focus on Neural Engine upgrades. The marketing might lead you to believe that smartphones are ready to handle massive AI models, but that’s far from the truth. In reality, the GPUs in most phones can process more AI tokens than the dedicated NPUs. Components like Apple’s Neural Engine are built for efficient, contextual AI tasks rather than heavy lifting. Even if phones had faster AI processors, they lack the RAM to keep enormous models in memory.

Even the largest cloud-based AI models are far from perfect assistants, and the models small enough to run on a phone start at a further disadvantage. On-device models are much smaller, typically containing just a few billion parameters. Compare that to Google’s latest Gemini models, which boast trillions of parameters, as reported by The Information. On-device models are also “quantized” to run at lower precision, which speeds up processing and shrinks their memory footprint but reduces the accuracy of token generation. The result is an AI that feels noticeably less capable than its cloud-based counterparts, and even those can be frustratingly unreliable at times.
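
To put rough numbers on that gap, the back-of-the-envelope sketch below estimates how much memory a model's weights alone would occupy at different precisions. The specific figures are illustrative assumptions, not numbers from the reporting: 3 billion parameters stands in for "a few billion," 1 trillion stands in for a cloud-scale model, and 8 GB is a hypothetical phone RAM budget. It also ignores everything else that needs memory, such as the operating system, other apps, and the model's working state.

```python
def model_size_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate storage for model weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Illustrative assumptions, not figures from the article.
ON_DEVICE_PARAMS = 3e9   # stand-in for "a few billion" parameters
CLOUD_PARAMS = 1e12      # stand-in for a trillion-parameter cloud model
PHONE_RAM_GB = 8         # hypothetical total phone RAM

for label, params in [("on-device", ON_DEVICE_PARAMS), ("cloud", CLOUD_PARAMS)]:
    for bits in (16, 8, 4):  # lower bit widths correspond to heavier quantization
        size = model_size_gb(params, bits)
        verdict = "fits" if size < PHONE_RAM_GB else "does not fit"
        print(f"{label:9s} {bits:2d}-bit: {size:8.1f} GB ({verdict} in {PHONE_RAM_GB} GB)")
```

Under these assumptions, quantizing the small model from 16-bit to 4-bit cuts its weights from about 6 GB to about 1.5 GB, while even a 4-bit version of the trillion-parameter model would need roughly 500 GB, far beyond any phone.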

Google does offer versions of Gemini optimized for mobile, known as Gemini Nano, but these are tailored for contextual features like Magic Compose and audio summarization. Siri, by contrast, is designed as a conversational assistant that responds to voice commands and takes action. That’s a fundamentally different experience requiring a different kind of model. On Android, Google doesn’t even attempt to run that locally. Interacting with Gemini on an Android device always routes to the cloud.

(Source: Ars Technica)