Source: infoq.com

Apple Launches Core AI for Apple-Silicon Optimized On-Device Generative AI

Apple announced Core AI, a new framework for running generative AI models on-device, at WWDC 26. The framework supports large language models up to 70B parameters and runs exclusively on Apple Silicon, ensuring privacy and zero server costs. Core AI succeeds Core ML and provides unified hardware access, a Swift API, and ahead-of-time compilation.

Published Jun 20, 2026 · 3 min read

At WWDC 26, Apple announced the Core AI framework, the official successor to Core ML. It is designed to allow developers to run large language models and generative AI entirely on-device, supporting both custom-converted PyTorch models and pre-optimized open-source models.

Apple says the new Core AI framework provides a unified architecture for deploying models ranging from compact 3B-parameter vision models to large-scale LLMs, including reasoning models with up to 70B parameters, across the iPhone, iPad, Mac, and Apple Vision Pro.

Core AI is the technology underpinning Apple Intelligence, and with the next release of its OSes and toolchain, Apple is making it available to developers to build what it calls "custom intelligence". Core AI, which can only run on Apple Silicon, ensures user data privacy, zero server dependencies, and zero per-token cloud costs.

Key Core AI capabilities include unified hardware access, allowing workloads to seamlessly run across the CPU, GPU, and Neural Engine under one API; a memory-safe Swift API enabling zero-copy data paths and fine-grained control over inference memory; and ahead-of-time (AOT) compilation, which shifts work off the user's device, yielding near-instant load times.

As mentioned, you can convert a PyTorch model into a Core AI model using Core AI PyTorch. The simplest approach is to export the PyTorch model as a torch.export.ExportedProgram and convert it to a Core AI AIProgram using TorchConverter().add_exported_program(ep).to_coreai().

Alternatively, you can author a new Core AI model from a PyTorch one using built-in composite ops provided by the library, such as attention, RoPE embeddings, RMSNorm, and gather-matmul, registering custom lowering functions to map new PyTorch ops to Core AI IR, or even creating custom Metal kernels for lower-level optimization.
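To make one of the composite ops named above concrete, here is a minimal pure-Python reference of what RMSNorm computes mathematically. It is only a sketch of the operation itself, not Core AI's implementation or API; the function name `rms_norm` and the `eps` default are assumptions chosen for illustration.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Reference RMSNorm: divide x by its root mean square, then scale by weight.

    `rms_norm` and the `eps` default are illustrative choices, not Core AI names.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


# With unit weights, the output has a root mean square of approximately 1.
x = [1.0, -2.0, 3.0, -4.0]
y = rms_norm(x, weight=[1.0] * len(x))
output_mean_sq = sum(v * v for v in y) / len(y)
```

A framework-provided composite op would typically fuse these steps into a single kernel; the reference version is useful mainly for checking numerical results after conversion.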

When converting a PyTorch model, a critical step is compressing it for deployment on Apple hardware. This process applies optimization techniques such as quantization and palettization, which by default are designed to align with the execution patterns of the Core AI runtime, ensuring efficient on-device performance.

Model compression can help reduce the memory footprint of your model (both on disk and at runtime), reduce inference latency, reduce power consumption, or improve all of these at once.
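The two compression techniques mentioned above can be illustrated with a small pure-Python sketch. It shows the general ideas only: symmetric 8-bit linear quantization, and palettization as 1-D k-means clustering into a lookup table. The function names, the int8 range, and the four-entry palette are assumptions for illustration and do not reflect Core AI's actual tooling or defaults.

```python
def quantize_int8(weights):
    """Symmetric linear quantization: w is approximated by scale * q, q in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from integer codes."""
    return [v * scale for v in q]


def nearest(palette, w):
    """Index of the palette entry closest to w."""
    return min(range(len(palette)), key=lambda k: abs(w - palette[k]))


def palettize(weights, n_colors=4, iters=20):
    """1-D k-means (n_colors >= 2): returns a lookup table and per-weight indices."""
    lo, hi = min(weights), max(weights)
    # Start with centroids spread evenly over the weight range.
    palette = [lo + (hi - lo) * i / (n_colors - 1) for i in range(n_colors)]
    for _ in range(iters):
        indices = [nearest(palette, w) for w in weights]
        for k in range(n_colors):
            members = [w for w, i in zip(weights, indices) if i == k]
            if members:  # leave empty clusters where they are
                palette[k] = sum(members) / len(members)
    return palette, [nearest(palette, w) for w in weights]


weights = [0.9, -0.4, 0.05, 0.31, -0.88, 0.0, 0.47, -0.12]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
palette, indices = palettize(weights, n_colors=4)
```

In both cases the stored model shrinks because each weight is replaced by a small integer: an 8-bit code for quantization, or a 2-bit index into a four-entry table for palettization. The price is a bounded approximation error per weight.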

One important aspect of running an AIModel is its automatic specialization to the current hardware and OS version, which is carried out when the model is first loaded into the model cache. As a result, the first attempt to use a model may take significantly longer than subsequent runs, once the model has been cached. Developers can control how and when this process happens by customizing SpecializationOptions, accessing the AICacheModel to check whether a model is already available or delete cached ones, and even sharing the model cache across an app group.
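The caching behavior described above, where the first load pays for specialization and later loads reuse the result, can be sketched with a toy cache in plain Python. Everything here is hypothetical: `SpecializationCache`, its methods, and the hardware and OS labels are illustrative stand-ins and are not the Core AI API.

```python
import hashlib


class SpecializationCache:
    """Toy stand-in for a model cache keyed by model contents, hardware, and OS version."""

    def __init__(self):
        self._entries = {}

    @staticmethod
    def _key(model_bytes, hardware, os_version):
        return (hashlib.sha256(model_bytes).hexdigest(), hardware, os_version)

    def contains(self, model_bytes, hardware, os_version):
        return self._key(model_bytes, hardware, os_version) in self._entries

    def load(self, model_bytes, hardware, os_version, specialize):
        """Return the cached artifact, running the expensive step only on a miss."""
        key = self._key(model_bytes, hardware, os_version)
        if key not in self._entries:
            self._entries[key] = specialize(model_bytes, hardware, os_version)
        return self._entries[key]

    def evict(self, model_bytes, hardware, os_version):
        """Delete a cached artifact; returns True if something was removed."""
        key = self._key(model_bytes, hardware, os_version)
        if key in self._entries:
            del self._entries[key]
            return True
        return False


calls = []


def fake_specialize(model_bytes, hardware, os_version):
    # Stands in for the slow, device-specific compilation step.
    calls.append((hardware, os_version))
    return f"specialized[{hardware}/{os_version}]"


cache = SpecializationCache()
model = b"toy-model-weights"  # hypothetical model payload
first = cache.load(model, "M4", "26.0", fake_specialize)   # miss: specializes
second = cache.load(model, "M4", "26.0", fake_specialize)  # hit: reuses artifact
after_update = cache.load(model, "M4", "26.1", fake_specialize)  # new OS: miss again
```

Keying on hardware and OS version is what makes an OS update trigger a fresh specialization in this sketch, which mirrors why an app might want to check the cache, or warm it ahead of time, before the user first needs the model.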

With the introduction of Core AI, Apple now supports three distinct approaches to running ML/AI on its operating systems: Core ML, Core AI, and MLX Swift. Based on developer discussions on Hacker News, Apple seems to suggest using Core ML for "classic, non-neural ML", such as decision trees or tabular feature engineering, Core AI for neural networks and transformers, and MLX for working with custom model weights, though potentially with lower performance. Community feedback also notes that while Core AI "makes it easier to incorporate high-performance LLMs", its long-term value will depend "on the future growth of the official Core AI/community".
