Apple’s AI Upgrades Fall Short on Performance

Summary
– Apple announced updates to its AI models (Apple On-Device and Apple Server) for Apple Intelligence features, but benchmarks show they underperform older rival models like OpenAI’s GPT-4o.
– Human testers rated Apple On-Device’s text generation comparable to similar-sized Google and Alibaba models, while Apple Server lagged behind GPT-4o.
– In image analysis tests, Apple Server was outperformed by Meta’s Llama 4 Scout, even though Llama 4 Scout itself generally trails leading models from Google and OpenAI.
– Apple’s AI struggles reflect broader challenges, with delayed Siri upgrades and lawsuits alleging undelivered AI marketing promises.
– Apple On-Device (3B parameters) supports features like summarization and text analysis, now available to developers via Apple’s Foundation Models framework, with improved multilingual and tool-use capabilities.

Apple’s latest AI advancements have arrived, but performance comparisons reveal they still lag behind competitors’ older models. The tech giant unveiled updates to its Apple Intelligence platform, including new on-device and server-based AI models, yet its own benchmarks show these models at best matching, and in several cases trailing, existing models from OpenAI, Google, and others.
Human evaluators testing Apple’s on-device model, designed to operate offline on iPhones, found its text generation quality comparable to similarly sized models from Google and Alibaba, though not superior. More concerning, testers rated Apple’s more advanced server-based model as inferior to OpenAI’s GPT-4o, which launched over a year ago. Even in image analysis tasks, Apple’s offering fell short against Meta’s Llama 4 Scout, despite that model’s own documented weaknesses against top-tier AI systems.
These results reinforce growing concerns about Apple’s position in the highly competitive AI landscape. While rivals like Google and OpenAI have pushed boundaries with increasingly sophisticated models, Apple’s progress appears slower than anticipated. Delays to promised Siri upgrades and customer lawsuits over undelivered AI features further highlight the challenges facing the company’s AI division.
The 3-billion-parameter on-device model powers core functionalities like summarization and text analysis, and Apple now allows third-party developers to integrate it through its Foundation Models framework. Apple says both the on-device and server models are more efficient than their predecessors, offer improved tool use, and support approximately 15 languages, aided by expanded training data incorporating documents, infographics, and other complex formats.
While Apple emphasizes that these upgrades represent meaningful progress, the benchmark gaps suggest the company still has ground to cover before its AI can compete with industry leaders. For users expecting cutting-edge performance, the latest enhancements may feel like incremental steps rather than transformative leaps.
(Source: TechCrunch)