{"slug": "alignatt4llm-fast-alignatt-for-decoder-only-llms-at-iwslt-2026-simultaneous-task", "title": "AlignAtt4LLM: Fast AlignAtt for Decoder-Only LLMs at IWSLT 2026 Simultaneous Speech Translation Task", "summary": "Quentin Fuxa and Dominik Macháček introduced AlignAtt4LLM, which the authors describe as, to their knowledge, the first application of the AlignAtt policy to a decoder-only large language model, for the IWSLT 2026 simultaneous speech translation task. The system, a synchronous cascade using Qwen3-ASR and Gemma-4 E4B-it, outperformed the supplied baselines on the development set for English-to-German and English-to-Italian translation in both low- and high-latency regimes, with mixed results for English-to-Chinese.", "body_md": "[AlignAtt4LLM: Fast AlignAtt for Decoder-Only LLMs at IWSLT 2026 Simultaneous Speech Translation Task](https://aclanthology.org/2026.iwslt-1.32.pdf)\n\n##### Abstract\n\nWe describe AlignAtt4LLM, an IWSLT 2026 simultaneous speech translation system for English to German, Italian, and Chinese. The system is a synchronous cascade: Qwen3-ASR with forced alignment produces an incrementally updated source transcript, and Gemma-4 E4B-it translates that prefix under an MT-side AlignAtt policy. To our knowledge, this is the first application of AlignAtt to a decoder-only LLM, where the encoder-decoder cross-attention used by earlier AlignAtt systems is absent. We recover a usable policy by proposing (1) an explicit source span in the prompt, (2) offline selection of translation-specific alignment heads, (3) selective qk-fast replay of the draft-to-source attention block, and (4) runtime query/key capture that preserves model outputs bit-identically. On the IWSLT 2026 development set, AlignAtt4LLM outperforms the supplied baselines for the European target languages, English to German and English to Italian, in both the low-latency regime around 2 seconds and the high-latency regime below 4 seconds CU-LongYAAL. 
Results for English to Chinese are more mixed, but the method is not tied to Gemma-4: because AlignAtt4LLM only requires a deterministic prompt layout, calibrated attention heads, and query/key capture, the same policy can be reapplied to stronger translation-focused decoder-only MT backbones for non-European target languages.\n\n- Anthology ID: 2026.iwslt-1.32\n- Volume: [Proceedings of the 23rd International Conference on Spoken Language Translation (IWSLT 2026)](/volumes/2026.iwslt-1/)\n- Month: July\n- Year: 2026\n- Address: San Diego, USA (in-person and online)\n- Editors: [Elizabeth Salesky](/people/elizabeth-salesky/), [Antonios Anastasopoulos](/people/antonios-anastasopoulos/), [Matteo Negri](/people/matteo-negri/), [Marcello Federico](/people/marcello-federico/)\n- Venues: [IWSLT](/venues/iwslt/) | [WS](/venues/ws/)\n- SIG: [SIGSLT](/sigs/sigslt/)\n- Publisher: Association for Computational Linguistics\n- Pages: 284–295\n- URL: [https://aclanthology.org/2026.iwslt-1.32/](https://aclanthology.org/2026.iwslt-1.32/)\n- Cite (ACL): Quentin Fuxa and Dominik Macháček. 2026. [AlignAtt4LLM: Fast AlignAtt for Decoder-Only LLMs at IWSLT 2026 Simultaneous Speech Translation Task](https://aclanthology.org/2026.iwslt-1.32/). In *Proceedings of the 23rd International Conference on Spoken Language Translation (IWSLT 2026)*, pages 284–295, San Diego, USA (in-person and online). Association for Computational Linguistics. 
- Cite (Informal): [AlignAtt4LLM: Fast AlignAtt for Decoder-Only LLMs at IWSLT 2026 Simultaneous Speech Translation Task](https://aclanthology.org/2026.iwslt-1.32/) (Fuxa & Macháček, IWSLT 2026)\n- PDF: [https://aclanthology.org/2026.iwslt-1.32.pdf](https://aclanthology.org/2026.iwslt-1.32.pdf)", "url": "https://wpnews.pro/news/alignatt4llm-fast-alignatt-for-decoder-only-llms-at-iwslt-2026-simultaneous-task", "canonical_source": "https://aclanthology.org/2026.iwslt-1.32/", "published_at": "2026-06-30 00:00:00+00:00", "updated_at": "2026-06-30 18:52:06.697860+00:00", "lang": "en", "topics": ["large-language-models", "natural-language-processing", "ai-research"], "entities": ["AlignAtt4LLM", "Qwen3-ASR", "Gemma-4 E4B-it", "IWSLT 2026", "Quentin Fuxa", "Dominik Macháček", "Association for Computational Linguistics"], "alternates": {"html": "https://wpnews.pro/news/alignatt4llm-fast-alignatt-for-decoder-only-llms-at-iwslt-2026-simultaneous-task", "markdown": "https://wpnews.pro/news/alignatt4llm-fast-alignatt-for-decoder-only-llms-at-iwslt-2026-simultaneous-task.md", "text": "https://wpnews.pro/news/alignatt4llm-fast-alignatt-for-decoder-only-llms-at-iwslt-2026-simultaneous-task.txt", "jsonld": "https://wpnews.pro/news/alignatt4llm-fast-alignatt-for-decoder-only-llms-at-iwslt-2026-simultaneous-task.jsonld"}}