Apple reportedly trying to distill Google's multi-trillion-parameter Gemini AI to run on iPhone

wpnews.pro

cd /news/artificial-intelligence/apple-reportedly-trying-to-distill-g… · home › topics › artificial-intelligence › article

[ARTICLE · art-16728] src=arstechnica.com ↗ pub=2026-05-28T18:30Z topic=artificial-intelligence verified=true sentiment=↓ negative

Apple reportedly trying to distill Google's multi-trillion-parameter Gemini AI to run on iPhone

Apple is reportedly working to compress Google's multi-trillion-parameter Gemini AI model to run on iPhones, but the effort faces significant technical hurdles. The Gemini-infused Siri, expected later this year, will rely heavily on cloud processing from Google and Nvidia rather than running entirely on-device, marking a departure from Apple's stated privacy preference for local AI. The challenge stems from the vast size difference between cloud-based Gemini models and the smaller, quantized models that can fit in a smartphone's limited memory and processing environment.

read2 min views10 publishedMay 28, 2026

It’s impossible to totally avoid generative AI when interacting with technology anymore, but Apple has a bit less of it. That’s not entirely by choice, though. The iPhone maker has delayed the AI-enhanced Siri multiple times since first promising it in 2024, but a deal with Google will merge the iconic assistant with Gemini later this year. As we approach WWDC, Apple has been working to bring big AI smarts to the modest processing environment of a smartphone. Apple fans may not like the outcome, though.

Apple has long crowed about the privacy value of running AI locally, but a new report suggests that despite Apple’s best efforts, the iPhone’s Gemini makeover will lean heavily on Google and Nvidia in the cloud. The Information reports that Apple’s Gemini-infused Siri will run both on-device and in the cloud, an apparent reversal of its privacy-focused preference for local AI.

With every new chip announcement, we hear about how the silicon has been optimized for AI—even Apple does this with its focus on Neural Engine upgrades. You may think from the grandiose language that smartphones are equipped to handle beefy AI models, but that’s not necessarily the case. In fact, the GPUs in most phones can process more AI tokens than the AI-focused NPUs. Components like Apple’s Neural Engine are designed for contextual, efficient AI processing. Even if phones had faster AI processing, they lack the RAM to keep enormous models in memory.

Even the largest AI models are still middling assistants, and that makes local AI very challenging. The AI models that run on phones are physically smaller, featuring at most a few billion parameters. Compare that to Google’s latest Gemini models, which have trillions of parameters, The Information reports. On-device AI models are also “quantized” to run at lower precision, making them faster but affecting the accuracy of token generation. This all adds up to AIs that feel less smart than their cloud brethren, and even big cloud-based models can be pretty dumb sometimes.

The amazing, shrinking Gemini #

Google does have versions of Gemini optimized for mobile devices, which it calls Gemini Nano. However, these are designed for powering contextual features like Magic Cue and audio summarization. Siri, on the other hand, is supposed to be a conversational assistant—you talk to it and it does things. That’s a different experience that requires a different kind of model. On Android, Google doesn’t even bother trying to do that locally. Talking to Gemini always goes straight to the cloud.

source & further reading

arstechnica.com — original article Apple sues OpenAI after ex-engineer allegedly used bug to steal trade secrets Remember Musk’s Suit Alleging a Conspiracy Between Apple and OpenAI? Now, defenders are embracing the prompt injection, too

~/api · this article 200

$curl api.wpnews.pro/v1/news/apple-reportedly-trying-…

Read original on arstechnica.com → arstechnica.com/ai/2026/05/apple-reportedly-tryi…

mentioned entities

Apple

Google

Gemini

Siri

Nvidia

The Information

metadata

slugapple-reportedly-trying-to-distill-google-s-multi-trillion-parameter-gemini-ai

topic#artificial-intelligence

secondary4 topics

sentimentnegative

canonicalarstechnica.com

navigation

← prevCNBC Names Ken Brown Managing Ed…

next →Just like gold and oil, we’ll so…

── more in #artificial-intelligence 4 stories · sorted by recency

engadget.com · 14 Jul · #artificial-intelligence

Google brings Gemini in Chrome to UK users

insideai.news · 14 Jul · #artificial-intelligence

TSMC Set for Fifth Record Profit Quarter as AI Boom Powers Taiwan Chip Giant

scmp.com · 14 Jul · #artificial-intelligence

Global AI boom sees China’s chip exports nearly double in first half of year

techstrong.ai · 14 Jul · #artificial-intelligence

You Can Keep the Benchmarks. I’ll Take the Test Drive

── more on @apple 3 stories trending now

wpnews · 27 May · #artificial-intelligence

How I Run Two Claude Accounts as One

wpnews · 21 May · #developer-tools

Antigravity CLI: A Hands-On Guide to Google's Terminal Coding Agent

wpnews · 8 Jul · #artificial-intelligence

SpaceXAI unveils Grok 4.5 AI model ahead of July 2026 public release

sponsored brought to you by zahid.host 4,200+ EU-deployed projects

reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main

→ Live at https://your-agent.zahid.host ✓

Get free account → Pricing

from €0/mo · no card required