
HiLo-Token: Input-Adaptive High-Low Frequency Token Compression for Efficient Image Editing

Researchers propose HiLo-Token, an input-adaptive token compression framework that cuts Diffusion Transformer (DiT) latency in image editing by allocating more tokens to high-frequency regions and fewer to low-frequency areas. The method reports DiT speedups of up to 3.13x on A100-80GB for small-mask edits, with no regression in generation quality, targeting a latency bottleneck in generative editing tools such as Photoshop's Remove and Generative Fill.

1 min read · Published Jun 15, 2026

arXiv:2606.13898v1. Abstract: Creative image editing tools, such as Photoshop's Remove or Generative Fill buttons, are central to everyday customer use and account for a major share of traffic in Photoshop and Lightroom. However, current generative AI models face significant latency challenges, which become even more pronounced when transitioning from convolution-based U-Nets to Diffusion Transformers (DiTs). In our evaluation on hundreds of representative image editing samples spanning a wide range of mask ratios, the DiT module alone accounts for an average of 73% of the total model latency, even after being distilled from 50 timesteps down to 8 timesteps. To tackle this challenge, we propose HiLo-Token, an input-adaptive token compression framework that allocates more token budget to high-frequency, rich-context regions while assigning fewer tokens to low-frequency areas. Specifically, for the editing region specified by the user mask, we retain all tokens within a dilated mask to preserve strong locality and contextual relevance. Outside the editing region, we introduce a simple yet effective high-frequency token selection strategy based on spatial frequency to capture important local details, while using tokens from a 16x downsampled image to represent low-frequency components and preserve the blurry but global structure. Extensive experiments on production-level evaluation data validate the effectiveness of the proposed method, achieving 3.13x, 2.59x, and 1.67x DiT speedups on A100-80GB for image editing tasks across small, medium, and large mask ratio categories with average ratios of 6.38%, 15.92%, and 35.36%, respectively, without any regression in generation quality.
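
The three-way token split described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify the frequency measure, the dilation radius, the token size, or the share of high-frequency tokens kept. The function name `hilo_token_sketch`, its parameters `patch`, `dilate`, and `hf_ratio`, the use of per-patch FFT energy as the frequency score, and working on raw pixels rather than latents are all assumptions made for illustration. Only the 16x downsampling factor comes from the abstract.

```python
import numpy as np


def hilo_token_sketch(image, mask, patch=4, dilate=1, hf_ratio=0.25, down=16):
    """Illustrative token budget split (hypothetical, not the paper's code).

    image: (H, W) float array; mask: (H, W) bool array, True = edit region.
    H and W are assumed to be divisible by `patch` and by `down`.
    Returns (edit_tokens, hf_tokens, n_low_tokens).
    """
    H, W = image.shape
    gh, gw = H // patch, W // patch

    def to_patches(x):
        # (H, W) -> (gh, gw, patch, patch): one patch per token.
        return x.reshape(gh, patch, gw, patch).transpose(0, 2, 1, 3)

    # 1) Edit region: keep every token touching the mask, then dilate the
    #    token-level mask by `dilate` tokens in all eight directions.
    edit = to_patches(mask).any(axis=(2, 3))
    for _ in range(dilate):
        padded = np.pad(edit, 1)  # pads with False
        edit = np.zeros_like(edit)
        for dy in range(3):
            for dx in range(3):
                edit |= padded[dy:dy + gh, dx:dx + gw]

    # 2) Frequency score (assumed): spectral energy of each patch with the
    #    DC term removed, so flat patches score zero.
    spec = np.abs(np.fft.fft2(to_patches(image), axes=(2, 3))) ** 2
    spec[..., 0, 0] = 0.0
    score = spec.sum(axis=(2, 3))

    # 3) Outside the dilated mask, keep the top `hf_ratio` share of tokens.
    outside = ~edit
    k = int(round(hf_ratio * outside.sum()))
    hf = np.zeros_like(edit)
    if k > 0:
        masked_scores = np.where(outside, score, -np.inf).ravel()
        top = np.argpartition(masked_scores, -k)[-k:]
        hf.flat[top] = True

    # 4) Low-frequency context: average-pool the image by `down` (16x in the
    #    abstract) and count the tokens that coarse image would produce.
    low = image.reshape(H // down, down, W // down, down).mean(axis=(1, 3))
    n_low = (low.shape[0] // patch) * (low.shape[1] // patch)

    return edit, hf, n_low


# Usage: a flat left half containing the edit mask, a textured right half.
rng = np.random.default_rng(0)
image = np.zeros((256, 256))
image[:, 128:] = rng.random((256, 128))
mask = np.zeros((256, 256), dtype=bool)
mask[96:160, 32:96] = True

edit, hf, n_low = hilo_token_sketch(image, mask)
kept = int(edit.sum() + hf.sum() + n_low)
print(f"kept {kept} of {edit.size} full-resolution tokens")
```

In this toy setup the high-frequency budget lands entirely on the textured half, while the flat half outside the mask is represented only by the coarse tokens. Attention cost in a DiT grows with sequence length, so shrinking the token count in this way is the lever the abstract credits for the reported speedups; the sketch makes no claim about the actual ratios used in the paper.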
