Raon-Speech Technical Report

Raon-Speech is a 9-billion-parameter speech language model for English and Korean, and Raon-SpeechChat is its full-duplex extension. Across 42 English and Korean speech and text benchmarks, the authors report that Raon-Speech has the strongest overall profile on speech-centric tasks among eight similarly sized audio foundation models, including Qwen2.5-Omni and Fun-Audio-Chat, while preserving strong text capabilities. Raon-Speech is trained on 1.38 million hours of curated speech and text data; Raon-SpeechChat is further trained on 119K hours of time-aligned dialogue data to support real-time, interruption-sensitive conversation. All model checkpoints, the training and inference pipeline, and an interactive demo are open-sourced.

Published May 26, 2026

arXiv:2605.23912v1

Abstract: We present Raon-Speech, a top-performing 9B-parameter speech language model (SpeechLM) for English and Korean speech understanding, answering, and generation, and Raon-SpeechChat, a high-performing full-duplex extension for natural real-time conversation. Raon-Speech successfully transforms a pre-trained LLM into a SpeechLM that both understands and generates speech while preserving strong text capabilities. It trains on 1.38M hours of highly curated English and Korean speech and text datasets with the following training stages: (1) speech module alignment, (2) end-to-end SpeechLM pre-training with knowledge distillation, and (3) multi-task preference optimization-based post-training. Across 42 English and Korean speech and text benchmarks, Raon-Speech establishes the strongest overall profile on speech-centric tasks in our comparison against eight similarly sized recent audio foundation models, including Qwen2.5-Omni and Fun-Audio-Chat, while preserving strong text question answering performance.

Building upon it, Raon-SpeechChat enables natural full-duplex conversation by continual training on 119K hours of time-aligned real and synthetic dialogue data. It proceeds through three complementary training stages: (1) causal encoder adaptation, (2) full-duplex pre-training, and (3) full-duplex fine-tuning for voice and role control. On multiple full-duplex benchmarks, Raon-SpeechChat shows its clearest strengths on the turn-taking and interruption-sensitive behaviors covered by FDB v1.0, and remains competitive across the broader full-duplex evaluation suite. We open-source all model checkpoints, the training and inference pipeline, and an interactive demo.
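The abstract names knowledge distillation as part of training stage (2) but does not describe the loss used. As a generic illustration only, and not the authors' implementation, the sketch below computes a common form of distillation loss: the KL divergence from a teacher's softened next-token distribution to a student's. The function names, the temperature value, and the toy logits are all assumptions made for this example.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits to probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) for a single next-token distribution.

    Hypothetical helper: a textbook distillation objective, not the
    loss described in the Raon-Speech report.
    """
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy logits over a three-token vocabulary (made-up values).
teacher = [2.0, 0.5, -1.0]
matched = distillation_loss(teacher, [2.0, 0.5, -1.0])
mismatched = distillation_loss(teacher, [-1.0, 0.5, 2.0])
print(matched, mismatched)
```

When the student reproduces the teacher's logits the loss is zero, and it grows as the two distributions diverge, which is the signal a student model would be trained to minimize.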

source & further reading

Read the original on arXiv: arxiv.org/abs/2605.23912
