{"slug": "benchmark-and-optimize-llms-on-device-with-ai-edge-portal", "title": "Benchmark and optimize LLMs on-device with AI Edge Portal", "summary": "Google AI Edge Portal is a new platform that allows developers to benchmark and debug large language models (LLMs) directly on over 120 different physical Android devices. It provides automated testing for critical performance metrics like latency and memory usage across CPU, GPU, and NPU backends, and includes a graph visualization tool to help identify and fix optimization issues. The service aims to solve the challenge of deploying generative AI efficiently across the diverse Android ecosystem.", "body_md": "LLMs have become more powerful at smaller sizes, but deploying them to edge devices like smartphones remains a massive challenge. Today, developers have to optimize across a sprawling combination of accelerators, operating systems, and countless System-on-a-Chip (SoC) configurations, often relying on manual testing with just a handful of devices. Google AI Edge Portal helps solve these challenges.\nBy letting developers test ML workloads across a fleet of over 120 representative Android device types, Google AI Edge Portal provides deep insight into latency and performance across all CPU, GPU, and NPU backends.\nToday, we are excited to announce two new capabilities that extend Google AI Edge Portal for the generative AI era: benchmarking and debugging on-device LLMs. These new services give developers what they need to optimize generative AI performance accurately and efficiently across the entire Android ecosystem.\nWhen a user interacts with an LLM-enabled experience in your app, they expect fast and consistent performance on their device. 
Common problems like slow initialization can make your app appear to freeze or, in the worst case, crash completely if the model consumes all available memory.\nWith the latest release of Google AI Edge Portal, you can now run automated gen AI benchmarks directly on a physical lab of over 120 diverse Android devices and test for these scenarios specifically. Portal natively supports CPU and GPU benchmarking for LLMs in the LiteRT-LM format.\nWhen you trigger a gen AI benchmarking job with Portal, it profiles the critical metrics that dictate your end-users’ experience when interacting with your AI application on-device:\nWith these insights, you can confidently decide which devices are ready to host your model and adjust or better optimize your LLMs for device targeting before shipping.\nBenchmarking is only useful if you can fix the performance issues it uncovers. When an LLM performs poorly, finding the root cause within a complex graph of multiple layers and thousands of nodes is a daunting task for developers, involving tedious searching that can take hours, if not days.\nTo bridge this gap, we have added the ability to visualize and compare model graphs in Portal with ease. Through the natively integrated Model Explorer, our graph visualization tool, you can search and locate specific nodes, compare models side by side in the same tab, view tensor shapes, trace inputs and outputs, and more. To further speed up debugging for teams, we also added the ability to take screenshots and share specific views directly with your collaborators in Google Cloud.\nThese visualizations are one of the most effective ways to identify targets for optimization, including:\nWith the era of LLMs on-device here, we are excited to help close the critical gap in benchmarking to bring the power of AI to the thousands of types of smartphones on the market today. 
To use these latest features, please complete our sign-up form to express interest.\nGoogle AI Edge Portal is currently available in private preview for allowlisted Google Cloud customers. During this private preview period, access is provided at no charge, subject to the preview terms. All current allowlisted customers will receive access to these new features automatically.\nWe can’t wait to see what gen AI capabilities you are able to deploy across the full spectrum of devices with Google AI Edge Portal!\nThank you to the team members and collaborators whose contributions made the advancements in this release possible: Akshat Sharma, Ami Kubota, Charlie Xu, Chunlei Niu, Cormac Brick, Derek Bekebrede, Eric Yang, Jing Jin, Kathleen Low, Matthias Grundmann, Marissa Ikonomidis, Na Li, Ram Iyengar, Sachin Kotwani, Sommayah Soliman, Tenghui Zhu, Xiaoming Hu, Zi Yuan", "url": "https://wpnews.pro/news/benchmark-and-optimize-llms-on-device-with-ai-edge-portal", "canonical_source": "https://cloud.google.com/blog/products/ai-machine-learning/benchmark-llms-on-device-with-ai-edge-portal/", "published_at": "2026-05-20 16:00:00+00:00", "updated_at": "2026-05-20 16:06:26.419781+00:00", "lang": "en", "topics": ["large-language-models", "developer-tools", "artificial-intelligence", "machine-learning", "products"], "entities": ["Google AI Edge Portal", "Android"], "alternates": {"html": "https://wpnews.pro/news/benchmark-and-optimize-llms-on-device-with-ai-edge-portal", "markdown": "https://wpnews.pro/news/benchmark-and-optimize-llms-on-device-with-ai-edge-portal.md", "text": "https://wpnews.pro/news/benchmark-and-optimize-llms-on-device-with-ai-edge-portal.txt", "jsonld": "https://wpnews.pro/news/benchmark-and-optimize-llms-on-device-with-ai-edge-portal.jsonld"}}