Apple Debuts Third-Generation Foundation Models and AFM Core Advanced

Apple debuted the third generation of Apple Foundation Models (AFM) on June 8, 2026, unveiling a family of five models spanning on-device and server deployments, including the 20-billion-parameter, natively multimodal AFM 3 Core Advanced that uses a sparse architecture to activate only 1 to 4 billion parameters per request. The company partnered with Google and NVIDIA to extend its Private Cloud Compute infrastructure so the AFM 3 Cloud Pro model can run on NVIDIA GPUs in Google Cloud while maintaining Apple's privacy guarantees. The release follows a January 2026 joint statement from Apple and Google framing the next-generation AFM family as built with Google's Gemini technology, though Apple's technical post emphasizes its own architecture and Apple silicon optimization.

Apple introduced the third generation of Apple Foundation Models (AFM), a family of five models spanning on-device and server deployments, in a June 8, 2026 post on its machine learning research site. The set includes two on-device models, AFM 3 Core and AFM 3 Core Advanced, and three server models that run on Private Cloud Compute: AFM 3 Cloud, ADM 3 Cloud (an image model), and AFM 3 Cloud Pro. Apple describes AFM 3 Core Advanced as a 20-billion-parameter, natively multimodal on-device model that uses a sparse architecture, activating only 1 to 4 billion parameters per request so it can run on Apple silicon. Apple worked with Google and NVIDIA to extend Private Cloud Compute for AFM 3 Cloud Pro to NVIDIA GPUs in Google Cloud while, Apple says, preserving its privacy guarantees. A January 12, 2026 joint statement from Apple and Google framed the next-generation AFM family as built with Google and its Gemini technology, though Apple's June 8 post emphasizes its own architecture and Apple silicon optimization.

What happened

Apple announced the third generation of Apple Foundation Models (AFM) in a June 8, 2026 post on its machine learning research site, describing a family of five models that run across devices and Apple's Private Cloud Compute. The family includes two on-device models, AFM 3 Core (the successor to Apple's roughly 3-billion-parameter dense model) and AFM 3 Core Advanced, plus three server models: AFM 3 Cloud, ADM 3 Cloud (a dedicated image model for creation, editing, and Genmoji), and AFM 3 Cloud Pro. Apple says AFM 3 Core Advanced is its most powerful on-device model, a 20-billion-parameter, natively multimodal system that uses a sparse architecture to activate only 1 to 4 billion parameters at a time depending on the request.

Technical details

Apple frames the sparse design as how it fits a 20-billion-parameter model onto consumer hardware.
The technique, which Apple describes as Instruction-Following Pruning (IFP), keeps the full parameter set in flash NAND storage rather than in active DRAM. Because NAND-to-DRAM bandwidth is too slow to swap weights token by token, AFM 3 Core Advanced makes routing decisions per prompt: a lightweight dense block selects a fixed subset of parameters during initial processing, so only 1 to 4 billion parameters enter active memory for inference.

AFM 3 Core, AFM 3 Core Advanced, AFM 3 Cloud, and ADM 3 Cloud are optimized for Apple silicon. AFM 3 Core Advanced requires A19 Pro (iPhone 17 Pro) or M3/M4 silicon and does not support devices with 8 GB of RAM. AFM 3 Cloud Pro, positioned for the most demanding agentic tool use and complex reasoning, is optimized for NVIDIA GPUs.

The Google and NVIDIA partnership

Apple says it worked with Google and NVIDIA to extend Private Cloud Compute so AFM 3 Cloud Pro can run on NVIDIA GPUs in Google Cloud while preserving the same privacy guarantees Apple describes for on-device and Apple-silicon server inference, namely that user data is not stored or shared, including with Apple.

A January 12, 2026 joint statement from Apple and Google characterized the next-generation AFM family as built in collaboration with Google and based on its Gemini technology and cloud infrastructure. Apple's June 8 technical post emphasizes its own model architecture and Apple-silicon optimization, and some independent reporting describes the on-device models as distilled from Gemini rather than running Gemini directly.

Why it matters

For practitioners, the release illustrates two converging trends. First, sparse activation with flash-resident weights is becoming a practical tool for pushing larger, multimodal models onto constrained consumer silicon: IFP's approach of storing all parameters in flash and routing a subset into DRAM per prompt is a concrete example of the memory-budget tradeoffs the field is navigating.
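As a rough illustration of per-prompt routing (as opposed to per-token routing), the toy Python sketch below picks a fixed subset of parameter groups once per prompt, under a 4-billion-parameter budget, and then reuses that subset for every decoded token. It is not Apple's implementation: the group names, their sizes, the keyword-based scorer standing in for the lightweight dense block, and the `Session` class are all hypothetical, chosen only so the groups sum to 20 billion parameters and the active set falls in the 1-to-4-billion range the article cites.

```python
from dataclasses import dataclass

# Hypothetical flash-resident parameter groups. Sizes are in billions of
# parameters and are invented for illustration; they sum to 20.
GROUP_SIZES_B = {
    "text_core": 1.0,
    "vision": 1.5,
    "code": 1.5,
    "reasoning": 2.0,
    "long_tail": 14.0,
}


def score_groups(prompt: str) -> dict:
    """Toy stand-in for a lightweight routing block: keyword relevance scores."""
    text = prompt.lower()
    return {
        "vision": float("image" in text or "photo" in text),
        "code": float("code" in text or "function" in text),
        "reasoning": float("why" in text or "prove" in text),
        "long_tail": 0.0,
    }


def select_subset(prompt: str, budget_b: float = 4.0):
    """Choose the groups to load into DRAM for this prompt, within the budget."""
    chosen = ["text_core"]  # assumed always-resident base group
    used = GROUP_SIZES_B["text_core"]
    scores = score_groups(prompt)
    # Greedily add relevant groups, highest score first, while they still fit.
    for name in sorted(scores, key=scores.get, reverse=True):
        if scores[name] > 0 and used + GROUP_SIZES_B[name] <= budget_b:
            chosen.append(name)
            used += GROUP_SIZES_B[name]
    return chosen, used


@dataclass
class Session:
    """One prompt: route once, load once, then decode with the same subset."""
    prompt: str

    def __post_init__(self):
        self.active, self.active_b = select_subset(self.prompt)
        self.flash_loads = 1  # a single simulated flash-to-DRAM load per prompt

    def decode(self, n_tokens: int):
        # Every token reuses the resident subset; no further flash reads occur.
        return [(i, tuple(self.active)) for i in range(n_tokens)]


if __name__ == "__main__":
    for prompt in ["Hello there", "Why does this function crash on a photo?"]:
        session = Session(prompt)
        session.decode(8)
        print(prompt, session.active, session.active_b, session.flash_loads)
```

The design point the sketch tries to capture is that the routing cost and the slow flash read are paid once per prompt, so decode speed depends only on the subset already resident in DRAM.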
Second, even a vendor with deep in-house silicon and model capability is leaning on external frontier-model and cloud partners for its most demanding server workloads, a hybrid device-plus-cloud pattern that blends local inference with privacy-scoped cloud compute.

What to watch

Open questions include developer API access for on-device versus server calls, benchmarks comparing AFM 3 Core Advanced against dense and other sparse on-device models across Apple silicon generations, how the NVIDIA-GPU-in-Google-Cloud path performs and scales under Private Cloud Compute, and the real memory and latency tradeoffs for multimodal workloads that will determine how widely AFM 3 Core Advanced can be deployed.

Scoring Rationale

Verified: Apple's third-generation AFM family is a flagship release spanning a novel 20-billion-parameter sparse on-device model and Private Cloud Compute server models, with the most capable server model running on NVIDIA GPUs in Google Cloud. It is a major, deployment-defining model release for a billion-device ecosystem and is highly relevant to on-device and hybrid-inference practitioners, though it is scoped to Apple's own platform rather than representing a field-wide frontier shift.