HydraQE: OSU’s Submission for the IWSLT 2026 Speech Translation Metrics Shared Task

Researchers at Ohio State University developed HydraQE, an end-to-end, reference-free quality estimation system for speech translation that outperforms cascaded text-based baselines. Submitted to the IWSLT 2026 shared task, HydraQE uses a Qwen3-ASR backbone with three prediction heads trained on human annotations and pseudo-labels to address data scarcity. The system demonstrates that direct speech translation quality estimation is competitive with cascaded approaches.

Abstract

We present HydraQE, our contribution to the IWSLT 2026 Speech Translation Metrics shared task. HydraQE is an end-to-end, reference-free quality estimation (QE) system for speech translation built on a Qwen3-ASR backbone, which accepts source audio and a translation hypothesis as joint input. Hidden states from all backbone layers are combined via a sparsemax scalar mix, then re-encoded by a bidirectional Transformer for full cross-modal interaction. To address the scarcity of human-annotated speech translation data, three independent prediction heads are trained on complementary supervision signals: human direct assessment (DA) annotations, MetricX-24 pseudo-labels, and xCOMET pseudo-labels. We train on a combination of synthetically corrupted examples and silver pseudo-labeled machine translation outputs, using a curriculum that begins on synthetic and silver data and gradually shifts toward human-annotated examples. HydraQE outperforms cascaded text-based baselines and prior direct speech QE systems, demonstrating that end-to-end speech translation QE is competitive with cascaded approaches.

- Anthology ID: 2026.iwslt-1.37
- Volume: Proceedings of the 23rd International Conference on Spoken Language Translation (IWSLT 2026)
- Month: July
- Year: 2026
- Address: San Diego, USA (in-person and online)
- Editors: Elizabeth Salesky, Antonios Anastasopoulos, Matteo Negri, Marcello Federico
- Venues: IWSLT | WS
- SIG: SIGSLT
- Publisher: Association for Computational Linguistics
- Pages: 323–331
- URL: https://aclanthology.org/2026.iwslt-1.37/
- PDF: https://aclanthology.org/2026.iwslt-1.37.pdf
- Cite (ACL): Kevin Krahn and Eric Fosler-Lussier. 2026. HydraQE: OSU’s Submission for the IWSLT 2026 Speech Translation Metrics Shared Task. In Proceedings of the 23rd International Conference on Spoken Language Translation (IWSLT 2026), pages 323–331, San Diego, USA (in-person and online). Association for Computational Linguistics.
- Cite (Informal): HydraQE: OSU’s Submission for the IWSLT 2026 Speech Translation Metrics Shared Task (Krahn & Fosler-Lussier, IWSLT 2026)
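
The abstract says that hidden states from all backbone layers are combined via a sparsemax scalar mix but gives no implementation details. The following is a minimal sketch of that idea, not the authors' code. It assumes one learnable score per layer, the standard sparsemax projection (Martins & Astudillo, 2016), and an optional ELMo-style scale `gamma`. The function names `sparsemax` and `scalar_mix` are hypothetical, and plain Python lists stand in for tensors.

```python
from typing import List, Sequence


def sparsemax(scores: Sequence[float]) -> List[float]:
    """Euclidean projection of scores onto the probability simplex.

    Unlike softmax, the result can contain exact zeros.
    """
    z_sorted = sorted(scores, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, z in enumerate(z_sorted, start=1):
        cumsum += z
        if 1.0 + k * z > cumsum:
            # The k largest scores are still in the support; update threshold.
            tau = (cumsum - 1.0) / k
        else:
            break
    return [max(s - tau, 0.0) for s in scores]


def scalar_mix(
    layers: Sequence[Sequence[Sequence[float]]],
    layer_scores: Sequence[float],
    gamma: float = 1.0,  # assumed global scale, not mentioned in the abstract
) -> List[List[float]]:
    """Weighted sum of per-layer hidden states, each shaped [tokens][dim]."""
    weights = sparsemax(layer_scores)
    num_tokens = len(layers[0])
    dim = len(layers[0][0])
    mixed = [[0.0] * dim for _ in range(num_tokens)]
    for weight, layer in zip(weights, layers):
        if weight == 0.0:
            continue  # layers pruned by sparsemax contribute nothing
        for t in range(num_tokens):
            for d in range(dim):
                mixed[t][d] += gamma * weight * layer[t][d]
    return mixed
```

One plausible motivation for sparsemax over softmax here is that layer weights can become exactly zero, so the mix can ignore uninformative layers entirely rather than merely down-weighting them. In a real system the layer scores would be trained jointly with the rest of the model.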