Perplexity has announced an orchestrator that combines AI models running on the user's own computer with more powerful cloud models and automatically decides where each task is processed. The goal is to optimize accuracy, privacy, and energy efficiency at the same time. Starting in July, the hybrid inference system will be integrated into Personal Computer, the always-on agent product introduced in March.
Sensitive data such as financial documents or health information will stay local, while compute-intensive tasks are routed to cloud models. Perplexity introduced the system together with Intel, but the model-agnostic framework also runs on other hardware, such as Nvidia's RTX Spark. "The race for local compute is on," the announcement says.
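Perplexity has not published how the orchestrator makes its decisions, so the following is only a minimal sketch of the routing idea described above, not the company's implementation. The names `Task`, `Target`, `route`, and the threshold `LOCAL_COMPUTE_BUDGET` are all hypothetical, and the assumption that sensitivity always overrides compute cost is ours.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    LOCAL = "local"
    CLOUD = "cloud"


@dataclass
class Task:
    description: str
    contains_sensitive_data: bool  # e.g. financial or health documents
    estimated_compute: float       # hypothetical, unitless cost estimate


# Hypothetical threshold above which a task counts as compute-intensive.
LOCAL_COMPUTE_BUDGET = 1.0


def route(task: Task) -> Target:
    """Decide where a task runs (illustrative policy only)."""
    # Assumption: sensitive data never leaves the device, whatever the cost.
    if task.contains_sensitive_data:
        return Target.LOCAL
    # Non-sensitive, compute-intensive work is sent to a cloud model.
    if task.estimated_compute > LOCAL_COMPUTE_BUDGET:
        return Target.CLOUD
    # Routine, lightweight work stays on the local model.
    return Target.LOCAL
```

In this sketch, a tax return summary would stay local even with a high compute estimate, while a non-sensitive research task with the same estimate would go to the cloud. A real orchestrator would presumably also weigh accuracy and energy use, which this example leaves out.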
According to Perplexity, shifting routine tasks to local devices could reduce the need for centralized computing infrastructure and simplify questions of data sovereignty. The company says its business model rewards correct answers rather than high compute consumption, which makes optimizing for efficiency a natural incentive.