Apple announced Core AI at WWDC26 as the successor to Core ML, positioning it as a developer-facing framework for running large language models and generative workloads optimized for Apple Silicon, according to Apple Developer documentation and contemporary coverage by InfoQ and 9to5Mac. Apple Developer documentation describes Core AI as engineered for on-device performance with "zero server dependencies and zero token costs," and it integrates with a new Foundation Models framework that exposes Apple Foundation Models and third-party models via a Swift language-model protocol. 9to5Mac highlighted WWDC demos that included a 1-trillion-parameter Kimi 2.6 model running locally across four Mac Studios using low-latency macOS Tahoe 26.2 networking. Reporting also notes Vision, Speech, Dynamic Profiles, and an Evaluations framework in the developer materials.
What happened
Apple announced Core AI at WWDC26, presenting it as the successor to Core ML, according to Apple Developer documentation and coverage by InfoQ and 9to5Mac. The Apple Developer site describes Core AI as a set of technologies purpose-built for Apple Silicon to run AI models on device and in private cloud scenarios, and it states the framework is designed for performance, customization, and scaling across device and model sizes.
Technical details
Apple Developer documentation states Core AI supports on-device inference and Private Cloud Compute, and it integrates with a native Swift Foundation Models framework that exposes Apple Foundation Models and any third-party model conforming to a Language Model protocol. The documentation describes features including multimodal prompting, Vision integration for image-plus-text reasoning, advanced Speech capabilities, Dynamic Profiles for runtime model/tool swapping, and an Evaluations framework for validating behavior under dynamic conditions.
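To make the idea of protocol-based model interchange concrete, the following is a minimal, framework-neutral sketch. It is written in Python rather than Swift and does not use any Core AI or Foundation Models API; `LanguageModel`, `Session`, `EchoModel`, and `UppercaseModel` are hypothetical names invented for illustration. It only shows the general pattern the documentation describes: any model that conforms to a shared interface can be swapped in at runtime without changing calling code.

```python
from dataclasses import dataclass
from typing import Protocol


class LanguageModel(Protocol):
    """Hypothetical interface; NOT the actual Foundation Models protocol."""

    def respond(self, prompt: str) -> str: ...


@dataclass
class EchoModel:
    """Stand-in for a small on-device model (illustrative only)."""
    name: str

    def respond(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


@dataclass
class UppercaseModel:
    """Stand-in for a third-party model shipped as a package (illustrative only)."""
    name: str

    def respond(self, prompt: str) -> str:
        return f"[{self.name}] {prompt.upper()}"


class Session:
    """Holds the active model and lets callers swap it at runtime."""

    def __init__(self, model: LanguageModel) -> None:
        self._model = model

    def use(self, model: LanguageModel) -> None:
        # Any object with a matching respond() method is accepted.
        self._model = model

    def ask(self, prompt: str) -> str:
        return self._model.respond(prompt)


session = Session(EchoModel("local"))
first = session.ask("hello")           # answered by the first model
session.use(UppercaseModel("vendor"))  # runtime swap, same calling code
second = session.ask("hello")          # answered by the second model
```

The calling code depends only on the interface, which is the property that would let a third-party package stand in for a built-in model. How Dynamic Profiles actually performs such swaps is not detailed in the coverage summarized here.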
Reported demos and numbers
9to5Mac reported a WWDC session demo that built a full app from prompts and showed a 1-trillion-parameter Kimi 2.6 model running locally across four Mac Studios, using the low-latency networking introduced in macOS Tahoe 26.2. InfoQ and other coverage described Core AI as the official next-generation framework following Core ML; Apple Developer pages present Core AI and the broader machine learning tooling as the recommended path for new generative features.
Industry context
Editorial analysis: Companies building developer frameworks for on-device AI have recently emphasized tight hardware-software co-design to reduce latency and remove recurring API token costs. By framing Core AI around "zero server dependencies and zero token costs," Apple's developer documentation places the company in the same design space as other vendors pursuing local-first and private-cloud hybrid models for generative features.
Developer impact and portability
Editorial analysis: For practitioners, the combination of a Swift-native Foundation Models API, a Language Model protocol for third-party packages, and runtime facilities like Dynamic Profiles suggests an emphasis on developer ergonomics and model interchangeability within the ecosystem, while keeping models close to hardware. This pattern typically reduces round-trip latency and privacy exposure but increases the importance of tooling for model validation, profiling, and device-specific optimization.
What to watch
Observers should track which third-party model providers ship Language Model protocol Swift packages, how Apple Foundation Models are licensed for on-device use versus Private Cloud Compute, and performance benchmarks comparing local inference on Apple Silicon against private-cloud deployments. Also watch the developer response to the Evaluations framework and whether it becomes a standard part of CI for on-device generative features.
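As a rough illustration of what running evaluations in CI could look like, here is a minimal sketch of a pass-rate gate. It does not use Apple's Evaluations framework, whose API is not described in the coverage above; `EvalCase`, `pass_rate`, `fake_model`, and the 0.6 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    """One prompt plus a predicate that judges the model's output."""
    prompt: str
    check: Callable[[str], bool]


def pass_rate(model: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases whose output satisfies its check."""
    if not cases:
        raise ValueError("at least one evaluation case is required")
    passed = sum(1 for case in cases if case.check(model(case.prompt)))
    return passed / len(cases)


def fake_model(prompt: str) -> str:
    """Placeholder for a real inference call (hypothetical)."""
    return prompt.strip().lower()


CASES = [
    EvalCase("  Hello ", lambda out: out == "hello"),
    EvalCase("WORLD", lambda out: out == "world"),
    EvalCase("Mixed Case", lambda out: out.isupper()),  # deliberately fails
]

THRESHOLD = 0.6  # assumed CI gate, not a recommended value
rate = pass_rate(fake_model, CASES)
gate_ok = rate >= THRESHOLD  # a CI job would fail the build when this is False
```

In a real pipeline the placeholder model would be replaced by an actual inference call, and the checks would likely be looser than exact string matches, since generative output varies between runs.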
Limitations and remaining questions
Editorial analysis: Apple Developer documentation does not publish detailed model architectures, per-instance latency numbers, or pricing models for Private Cloud Compute in the publicly available pages referenced here. 9to5Mac reported the demo scale and runtime topology, but independent benchmarks and provider agreements will be required to validate throughput, memory footprint, and multi-device orchestration for large models.
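To give a sense of why memory footprint is an open question at this scale, the back-of-the-envelope calculation below estimates weight storage for a 1-trillion-parameter model split evenly across four machines. The bit widths are assumptions chosen for illustration; the reporting does not state what precision the demo used, and the estimate ignores activations, caches, and any uneven partitioning.

```python
def weight_gb(params: int, bits_per_param: int) -> float:
    """Storage for model weights alone, in decimal gigabytes."""
    return params * bits_per_param / 8 / 1e9


def per_machine_gb(params: int, bits_per_param: int, machines: int) -> float:
    """Weights per machine, assuming a perfectly even split (an idealization)."""
    return weight_gb(params, bits_per_param) / machines


PARAMS = 1_000_000_000_000  # 1 trillion parameters, as reported
MACHINES = 4                # four Mac Studios, as reported

# Bit widths below are hypothetical; the demo's precision is not reported.
estimates = {bits: per_machine_gb(PARAMS, bits, MACHINES) for bits in (16, 8, 4)}

for bits, gb in estimates.items():
    print(f"{bits}-bit weights: about {gb:.0f} GB per machine")
```

Under these assumptions the per-machine share ranges from about 500 GB at 16 bits down to about 125 GB at 4 bits, which is why precision and partitioning details matter for anyone trying to reproduce the result.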
Practical takeaway
Editorial analysis: Mobile and desktop application teams building generative features should consider Core AI as an option if they target Apple platforms and require low-latency or privacy-sensitive inference, while cross-platform teams will need to evaluate portability and integration costs for models and tooling across non-Apple environments.
Scoring Rationale
A major platform vendor released a developer-first framework that standardizes on-device and private-cloud generative model support. This materially affects how practitioners build low-latency, privacy-sensitive AI features on Apple platforms, but independent benchmarks and third-party adoption remain to be seen.