Abstract
We present low-resource Bhojpuri-Hindi speech translation systems for the IWSLT 2026 shared task, covering both end-to-end and cascaded settings. Our end-to-end model connects a Bhojpuri-finetuned Wav2Vec2 encoder to a pretrained NLLB-200 decoder via a lightweight interconnection adapter that combines learnable layer aggregation, CNN-based temporal compression, and Transformer refinement, with optional LoRA-based decoder adaptation. For our cascaded system, we finetune Whisper for Bhojpuri ASR and NLLB-200 for Hindi MT, and further apply QE Fusion with COMET-Kiwi to improve translation selection from beam candidates.

- Anthology ID: 2026.iwslt-1.31
- Volume: [Proceedings of the 23rd International Conference on Spoken Language Translation (IWSLT 2026)](/volumes/2026.iwslt-1/)
- Month: July
- Year: 2026
- Address: San Diego, USA (in-person and online)
- Editors: Elizabeth Salesky, Antonios Anastasopoulos, Matteo Negri, Marcello Federico
- Venues: [IWSLT](/venues/iwslt/) | [WS](/venues/ws/)
- SIG: [SIGSLT](/sigs/sigslt/)
- Publisher: Association for Computational Linguistics
- Pages: 272–283
- URL: [https://aclanthology.org/2026.iwslt-1.31/](https://aclanthology.org/2026.iwslt-1.31/)
- Cite (ACL): Kaustuk Pratap Singh, Dipanshu ., Vedant Singh, and Kumar Rishu. 2026. IIIT-BGP IWSLT 2026 Systems for Low-resource ST. In Proceedings of the 23rd International Conference on Spoken Language Translation (IWSLT 2026), pages 272–283, San Diego, USA (in-person and online). Association for Computational Linguistics.
- Cite (Informal): [IIIT-BGP IWSLT 2026 Systems for Low-resource ST](https://aclanthology.org/2026.iwslt-1.31/) (Singh et al., IWSLT 2026)
- PDF: [https://aclanthology.org/2026.iwslt-1.31.pdf](https://aclanthology.org/2026.iwslt-1.31.pdf)
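
The abstract names the adapter's components but gives no implementation details. As a rough illustration only, the first two stages (learnable layer aggregation and strided temporal compression) might look like the dependency-free sketch below. Everything in it is an assumption made for illustration: the function names `aggregate_layers` and `compress_time`, the use of a softmax over per-layer logits, and the single shared convolution kernel are hypothetical and are not taken from the paper. The Transformer refinement stage and any training of the weights are omitted.

```python
import math
from typing import List

Frame = List[float]        # one feature vector
Sequence = List[Frame]     # feature vectors over time


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_layers(layers: List[Sequence], layer_logits: List[float]) -> Sequence:
    """Softmax-weighted sum of per-layer encoder outputs.

    Assumption: one scalar logit per encoder layer; in a real model these
    logits would be trainable parameters, here they are fixed inputs.
    """
    weights = softmax(layer_logits)
    n_frames, dim = len(layers[0]), len(layers[0][0])
    return [
        [sum(w * layer[t][d] for w, layer in zip(weights, layers)) for d in range(dim)]
        for t in range(n_frames)
    ]


def compress_time(seq: Sequence, kernel: List[float], stride: int) -> Sequence:
    """Strided 1D convolution over time with no padding.

    Assumption: the same kernel is applied to every feature dimension,
    which is a simplification of a learned convolutional layer.
    """
    k, dim = len(kernel), len(seq[0])
    return [
        [sum(kernel[i] * seq[start + i][d] for i in range(k)) for d in range(dim)]
        for start in range(0, len(seq) - k + 1, stride)
    ]


# Toy input: 2 encoder layers, 4 time frames, 2 feature dimensions.
toy_layers = [
    [[1.0, 0.0] for _ in range(4)],
    [[3.0, 2.0] for _ in range(4)],
]
mixed = aggregate_layers(toy_layers, layer_logits=[0.0, 0.0])
compressed = compress_time(mixed, kernel=[0.5, 0.5], stride=2)
```

With equal logits the two layers are averaged, and a stride of 2 halves the number of frames, which is the kind of length reduction that helps bridge frame-level speech features and a text decoder.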