AlignAtt4LLM: Fast AlignAtt for Decoder-Only LLMs at IWSLT 2026 Simultaneous Speech Translation Task

Quentin Fuxa and Dominik Macháček present AlignAtt4LLM, their system for the IWSLT 2026 simultaneous speech translation task and, to their knowledge, the first application of the AlignAtt policy to a decoder-only large language model. The system is a synchronous cascade of Qwen3-ASR and Gemma-4 E4B-it. On the IWSLT 2026 development set it outperformed the supplied baselines for English-to-German and English-to-Italian in both the low-latency regime (around 2 seconds) and the high-latency regime (below 4 seconds CU-LongYAAL); results for English-to-Chinese were more mixed.

Published Jun 30, 2026

Abstract

We describe AlignAtt4LLM, an IWSLT 2026 simultaneous speech translation system for English to German, Italian, and Chinese. The system is a synchronous cascade: Qwen3-ASR with forced alignment produces an incrementally updated source transcript, and Gemma-4 E4B-it translates that prefix under an MT-side AlignAtt policy. To our knowledge, this is the first application of AlignAtt to a decoder-only LLM, where the encoder-decoder cross-attention used by earlier AlignAtt systems is absent. We recover a usable policy by proposing (1) an explicit source span in the prompt, (2) offline selection of translation-specific alignment heads, (3) selective qk-fast replay of the draft-to-source attention block, and (4) runtime query/key capture that preserves model outputs bit-identically. On the IWSLT 2026 development set, AlignAtt4LLM outperforms the supplied baselines for the European target languages, English to German and English to Italian, in both the low-latency regime around 2 seconds and the high-latency regime below 4 seconds CU-LongYAAL. Results for English to Chinese are more mixed, but the method is not tied to Gemma-4: because AlignAtt4LLM only requires a deterministic prompt layout, calibrated attention heads, and query/key capture, the same policy can be reapplied to stronger translation-focused decoder-only MT backbones for non-European target languages.

- Anthology ID: 2026.iwslt-1.32
- Volume: [Proceedings of the 23rd International Conference on Spoken Language Translation (IWSLT 2026)](/volumes/2026.iwslt-1/)
- Month: July
- Year: 2026
- Address: San Diego, USA (in-person and online)
- Editors: Elizabeth Salesky, Antonios Anastasopoulos, Matteo Negri, Marcello Federico
- Venues: [IWSLT](/venues/iwslt/) | [WS](/venues/ws/)
- SIG: [SIGSLT](/sigs/sigslt/)
- Publisher: Association for Computational Linguistics
- Pages: 284–295
- URL: [https://aclanthology.org/2026.iwslt-1.32/](https://aclanthology.org/2026.iwslt-1.32/)
- Cite (ACL): Quentin Fuxa and Dominik Macháček. 2026. AlignAtt4LLM: Fast AlignAtt for Decoder-Only LLMs at IWSLT 2026 Simultaneous Speech Translation Task. In Proceedings of the 23rd International Conference on Spoken Language Translation (IWSLT 2026), pages 284–295, San Diego, USA (in-person and online). Association for Computational Linguistics.
- Cite (Informal): [AlignAtt4LLM: Fast AlignAtt for Decoder-Only LLMs at IWSLT 2026 Simultaneous Speech Translation Task](https://aclanthology.org/2026.iwslt-1.32/) (Fuxa & Macháček, IWSLT 2026)
- PDF: [https://aclanthology.org/2026.iwslt-1.32.pdf](https://aclanthology.org/2026.iwslt-1.32.pdf)
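
The abstract describes an AlignAtt policy driven by attention from draft translation tokens to an explicit source span, recomputed from captured queries and keys. The following is a minimal sketch of how such a rule could look, not the authors' implementation. It assumes the general AlignAtt idea that a draft token is emitted only if its most-attended source position lies outside the last few (still unstable) source tokens. The names `source_attention`, `alignatt_stable_prefix`, and `frontier` are hypothetical, and so are the choices to renormalize attention over the source span only and to average over the selected heads.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def source_attention(query: Vector, source_keys: Sequence[Vector]) -> List[float]:
    """Softmax of scaled dot products between one draft query and the source-span keys.

    Assumption: scores are renormalized over the source span only.
    """
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in source_keys]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def alignatt_stable_prefix(
    draft_queries: Sequence[Sequence[Vector]],  # [head][draft token] -> query vector
    source_keys: Sequence[Sequence[Vector]],    # [head][source token] -> key vector
    frontier: int,                              # hypothetical: size of the unstable source tail
) -> int:
    """Return how many leading draft tokens may be emitted."""
    num_heads = len(draft_queries)
    num_draft = len(draft_queries[0])
    num_source = len(source_keys[0])
    first_unstable = max(0, num_source - frontier)

    for t in range(num_draft):
        # Average the per-head attention distributions (assumed aggregation).
        avg = [0.0] * num_source
        for h in range(num_heads):
            probs = source_attention(draft_queries[h][t], source_keys[h])
            for j, p in enumerate(probs):
                avg[j] += p / num_heads
        aligned = max(range(num_source), key=avg.__getitem__)
        # Stop at the first draft token aligned to the unstable source tail.
        if aligned >= first_unstable:
            return t
    return num_draft


# Toy example: one head, four source tokens, three draft tokens.
keys = [[[4.0, 0.0], [0.0, 4.0], [-4.0, 0.0], [0.0, -4.0]]]
queries = [[[1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]]
print(alignatt_stable_prefix(queries, keys, frontier=1))  # 2
```

In the toy example the third draft token attends most strongly to the last source token, so only the first two draft tokens are emitted and the rest wait for more source text. Because the decision needs only the draft-to-source block of attention, a system that captures queries and keys can recompute that block alone rather than full attention maps.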



