Perceptual Image Codec: What Matters in Practical Learned Image Compression

Researchers at Apple have developed PICO (Perceptual Image Codec), the first learned image compression model optimized for human visual perception and practical on-device deployment. The codec achieves 2.3-3× bitrate savings over traditional standards like AV1 and VVC, while encoding 12MP images in 230ms and decoding in 150ms on an iPhone 17 Pro Max. PICO also delivers 20-40% bitrate savings over competing learned codecs and includes cross-platform robustness guarantees.

We introduce PICO (Perceptual Image Codec), the first learned codec that is both practical and optimized directly for the human visual system. To derive it, we perform a comprehensive study of modeling choices for practical learned codecs and search over millions of model configurations to jointly optimize perceptual quality and on-device runtime. Based on large-scale subjective user studies, PICO provides 2.3-3× bitrate savings against AV1, AV2, VVC, ECM and JPEG-AI, and 20-40% bitrate savings against the best learned codec alternatives. At the same time, on an iPhone 17 Pro Max, it encodes 12MP images in as little as 230ms and decodes them in 150ms, faster than most top ML-based codecs run on a V100 GPU. Unlike most learned codecs, PICO also comes with cross-platform robustness guarantees.

Comparisons of state-of-the-art traditional and learned codecs across different considerations of practicality.
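The headline numbers mix two conventions: multiplicative factors ("2.3-3×") and percentages ("20-40%"). The sketch below converts between them and derives pixel throughput from the quoted timings. It assumes that an "N× bitrate saving" means the competing codec needs N times as many bits at matched perceptual quality; that reading, and all the function names, are illustrative assumptions rather than definitions taken from the PICO authors.

```python
def factor_to_percent(factor: float) -> float:
    """Percent bitrate reduction implied by an 'N x' savings factor.

    Assumption: the baseline needs `factor` times as many bits, so the
    new codec uses 1/factor of the baseline bitrate.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return (1.0 - 1.0 / factor) * 100.0


def percent_to_factor(percent: float) -> float:
    """Savings factor implied by a percent bitrate reduction."""
    if not 0 <= percent < 100:
        raise ValueError("percent must be in [0, 100)")
    return 1.0 / (1.0 - percent / 100.0)


def throughput_mp_per_s(megapixels: float, millis: float) -> float:
    """Megapixels processed per second for one image of the given size."""
    if millis <= 0:
        raise ValueError("millis must be positive")
    return megapixels / (millis / 1000.0)


if __name__ == "__main__":
    # Figures quoted in the text above.
    for f in (2.3, 3.0):
        print(f"{f}x savings   -> {factor_to_percent(f):.1f}% fewer bits")
    for p in (20.0, 40.0):
        print(f"{p:.0f}% savings -> {percent_to_factor(p):.2f}x factor")
    print(f"encode: {throughput_mp_per_s(12, 230):.1f} MP/s")
    print(f"decode: {throughput_mp_per_s(12, 150):.1f} MP/s")
```

Under that assumption, a 2.3-3× saving corresponds to roughly 57-67% fewer bits, a 20-40% saving to a factor of roughly 1.25-1.67×, and the quoted timings to about 52 MP/s for encoding and 80 MP/s for decoding.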