Source: machinelearning.apple.com

What Matters in Practical Learned Image Compression

Researchers at Apple have developed a new learned image codec that, according to subjective user studies, achieves 2.3–3x bitrate savings against AV1, AV2, VVC, ECM, and JPEG-AI, and 20–40% bitrate savings against the best learned alternatives. The codec, optimized for perceptual quality and runtime through neural architecture search, encodes 12MP images in as little as 230ms and decodes them in 150ms on an iPhone 17 Pro Max, faster than most top ML-based codecs running on a V100 GPU. The work aims to close the gap between perceptual and practical learned image compression by optimizing directly for the human visual system.

Published May 7, 2026 · 2 min read

Content type: paper · Published May 2026

Authors: Kedar Tatwawadi, Parisa Rahimzadeh, Zhanghao Sun, Zhiqi Chen, Ziyun Yang, Sanjay Nair, Divija Hasteer, Oren Rippel

One of the major differentiators unlocked by learned codecs relative to their hard-coded traditional counterparts is their ability to be optimized directly to appeal to the human visual system. Despite this potential, a codec that is both perceptual and practical has yet to be proposed. In this work, we aim to close this gap. We conduct a comprehensive study of the key modeling choices that govern the design of a practical learned image codec, jointly optimized for perceptual quality and runtime; the ablations include several novel techniques. We then perform performance-aware neural architecture search over millions of backbone configurations to identify models that achieve the target on-device runtime while maximizing compression performance as captured by perceptual metrics. We combine the various optimizations to construct a new codec that achieves a significantly improved tradeoff between speed and perceptual quality. Based on rigorous subjective user studies, it provides 2.3–3x bitrate savings against AV1, AV2, VVC, ECM and JPEG-AI, and 20–40% bitrate savings against the best learned codec alternatives. At the same time, on an iPhone 17 Pro Max, it encodes 12MP images in as little as 230ms and decodes them in 150ms, faster than most top ML-based codecs run on a V100 GPU.
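The abstract does not describe how the architecture search is carried out. As a rough illustration of the general idea only (pick the best-scoring backbone among those that meet a runtime target), here is a minimal random-search sketch. Everything in it is an assumption made for illustration: the search space, the `Backbone` fields, the toy `estimate_latency_ms` and `estimate_quality` proxies, and the 150 ms budget are hypothetical and are not taken from the paper.

```python
import random
from dataclasses import dataclass

# Hypothetical search space: these knobs are illustrative, not the paper's.
WIDTHS = [32, 48, 64, 96, 128]
DEPTHS = [2, 3, 4, 6]
KERNELS = [3, 5]


@dataclass(frozen=True)
class Backbone:
    width: int
    depth: int
    kernel: int


def estimate_latency_ms(cfg: Backbone) -> float:
    """Toy latency proxy (assumption): cost grows with width^2 * depth * kernel^2."""
    return 0.0005 * cfg.width ** 2 * cfg.depth * cfg.kernel ** 2


def estimate_quality(cfg: Backbone) -> float:
    """Toy perceptual-quality proxy (assumption): diminishing returns in capacity."""
    capacity = cfg.width * cfg.depth * cfg.kernel
    return capacity / (capacity + 500.0)


def search(budget_ms: float, trials: int, seed: int = 0):
    """Random search: keep the best-scoring config that meets the latency budget."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        cfg = Backbone(rng.choice(WIDTHS), rng.choice(DEPTHS), rng.choice(KERNELS))
        if estimate_latency_ms(cfg) > budget_ms:
            continue  # reject configs that miss the runtime target
        if best is None or estimate_quality(cfg) > estimate_quality(best):
            best = cfg
    return best  # None if no sampled config fits the budget


best = search(budget_ms=150.0, trials=2000)
print(best, estimate_latency_ms(best), estimate_quality(best))
```

In a real setting the two proxies would presumably be replaced by on-device latency measurements and a perceptual metric computed on a trained model, and a search over millions of configurations would need something more sample-efficient than uniform random sampling.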
