AlignAtt4LLM: Fast AlignAtt for Decoder-Only LLMs at IWSLT 2026 Simultaneous Speech Translation Task

Quentin Fuxa and Dominik Macháček present AlignAtt4LLM, their IWSLT 2026 simultaneous speech translation system and, to their knowledge, the first application of the AlignAtt policy to a decoder-only large language model. The system is a synchronous cascade of Qwen3-ASR and Gemma-4 E4B-it. On the IWSLT 2026 development set it outperformed the supplied baselines for English-to-German and English-to-Italian in both the low-latency regime (around 2 seconds) and the high-latency regime (below 4 seconds), with more mixed results for English-to-Chinese.
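To make the AlignAtt policy concrete, here is a minimal sketch of the stopping rule as it is commonly described: a draft target token is emitted only if its strongest attention falls outside the last few source positions, which may still change as more speech arrives. This is not the authors' implementation. The function name `alignatt_emit_count`, the `guard` parameter, and the assumption that the attention matrix has already been averaged over the selected alignment heads are all hypothetical choices made for illustration.

```python
import numpy as np


def alignatt_emit_count(attn, guard):
    """Return how many leading draft tokens are safe to emit.

    attn  : array of shape (num_draft_tokens, num_source_tokens) holding
            draft-to-source attention weights. Assumed (hypothetically) to be
            averaged over the chosen alignment heads already.
    guard : number of trailing source positions treated as unstable
            (a hypothetical name for the policy's threshold).
    """
    attn = np.asarray(attn, dtype=float)
    num_draft, num_source = attn.shape
    cutoff = num_source - guard  # first source index considered unstable

    for i in range(num_draft):
        # The source position this draft token attends to most strongly.
        aligned = int(np.argmax(attn[i]))
        if aligned >= cutoff:
            # The token depends on recent, unstable source: stop before it.
            return i
    # No token touched the unstable region, so the whole draft can be emitted.
    return num_draft


# Three draft tokens over four source tokens.
example = [
    [0.7, 0.2, 0.1, 0.0],
    [0.1, 0.6, 0.2, 0.1],
    [0.0, 0.1, 0.2, 0.7],
]
emit = alignatt_emit_count(example, guard=1)
```

With `guard=1`, the third draft token aligns to the final source position, so only the first two tokens are emitted. A larger `guard` makes the policy more conservative and raises latency, while a smaller one emits sooner.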

Abstract

We describe AlignAtt4LLM, an IWSLT 2026 simultaneous speech translation system for English to German, Italian, and Chinese. The system is a synchronous cascade: Qwen3-ASR with forced alignment produces an incrementally updated source transcript, and Gemma-4 E4B-it translates that prefix under an MT-side AlignAtt policy. To our knowledge, this is the first application of AlignAtt to a decoder-only LLM, where the encoder-decoder cross-attention used by earlier AlignAtt systems is absent. We recover a usable policy by proposing (1) an explicit source span in the prompt, (2) offline selection of translation-specific alignment heads, (3) selective qk-fast replay of the draft-to-source attention block, and (4) runtime query/key capture that preserves model outputs bit-identically. On the IWSLT 2026 development set, AlignAtt4LLM outperforms the supplied baselines for the European target languages, English to German and English to Italian, in both the low-latency regime around 2 seconds and the high-latency regime below 4 seconds (CU-LongYAAL). Results for English to Chinese are more mixed, but the method is not tied to Gemma-4: because AlignAtt4LLM only requires a deterministic prompt layout, calibrated attention heads, and query/key capture, the same policy can be reapplied to stronger translation-focused decoder-only MT backbones for non-European target languages.

- Anthology ID: 2026.iwslt-1.32
- Volume: Proceedings of the 23rd International Conference on Spoken Language Translation (IWSLT 2026)
- Month: July
- Year: 2026
- Address: San Diego, USA (in-person and online)
- Editors: Elizabeth Salesky, Antonios Anastasopoulos, Matteo Negri, Marcello Federico
- Venues: IWSLT | WS
- SIG: SIGSLT
- Publisher: Association for Computational Linguistics
- Pages: 284–295
- URL: https://aclanthology.org/2026.iwslt-1.32/
- PDF: https://aclanthology.org/2026.iwslt-1.32.pdf
- Cite (ACL): Quentin Fuxa and Dominik Macháček. 2026. AlignAtt4LLM: Fast AlignAtt for Decoder-Only LLMs at IWSLT 2026 Simultaneous Speech Translation Task. In Proceedings of the 23rd International Conference on Spoken Language Translation (IWSLT 2026), pages 284–295, San Diego, USA (in-person and online). Association for Computational Linguistics.
- Cite (Informal): AlignAtt4LLM: Fast AlignAtt for Decoder-Only LLMs at IWSLT 2026 Simultaneous Speech Translation Task (Fuxa & Macháček, IWSLT 2026)