HoLo-ToLk: tokenizer-free speech (STT + TTS) on the 0-parameter HSL byte substrate


Source: discuss.huggingface.co · Published: 2026-06-28T18:25Z


HoLo-ToLk is a pair of small speech-to-text and text-to-speech models built on the tokenizer-free HSL byte substrate. The STT model reaches a character error rate of 0.194, beating a mel-spectrogram baseline (0.213) in the same setup, while TTS remains a feasibility demo with unstable free-run synthesis.


Follow-up to my earlier post on the 0-parameter input layer.

I took the HSL byte substrate (no tokenizer, no learned input embedding) and built two small speech models on top, to see whether “bytes as signal” carries through to audio. I’m calling the line HoLo-ToLk.

STT (speech → text): the result I’m most confident about. Feeding the raw HSL substrate to a char-CTC baseline is weak on its own (CER ~0.67). Adding a small model-side spectral lens (log-mel plus a learnable gated fusion over the frozen substrate) flips it: CER 0.194, beating a mel-spectrogram baseline (0.213) in the same setup, confirmed across 4 seeds. So the honest takeaway is a controlled comparison (substrate + lens beats mel, same setup), not a SOTA number (8 kHz, char-CTC, no LM; readable but rough).

TTS (text → speech): here the byte substrate is even more natural. UTF-8 text bytes go straight in as HSL features, with no tokenizer or vocab. A small AR transformer + guided attention + HiFi-GAN gives a single-speaker voice. Held-out teacher-forced mel-L1 is 0.296 (multi-seed) and some samples sound genuinely natural, but free-run synthesis on arbitrary sentences is still rough/unstable. So I’m framing TTS as a feasibility demo, not a usable TTS system.
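To make the two TTS ingredients concrete, here is a minimal sketch in plain Python. The first function shows that UTF-8 bytes already form a fixed 256-symbol alphabet, so no vocabulary has to be built; the mapping from bytes to HSL features is not reproduced here. The second computes a mean absolute error between two mel-spectrograms, which is the usual meaning of mel-L1. The function names are hypothetical, and the exact normalization behind the 0.296 figure is not stated in the post, so the plain mean is an assumption.

```python
def text_to_byte_ids(text: str) -> list[int]:
    """Encode text as UTF-8 and return one integer (0..255) per byte."""
    return list(text.encode("utf-8"))


def mel_l1(pred: list[list[float]], target: list[list[float]]) -> float:
    """Mean absolute error over all (frame, mel-bin) cells.

    Assumes both spectrograms have the same shape, which holds under
    teacher forcing. The plain mean is an assumption, not the
    project's confirmed definition.
    """
    if len(pred) != len(target):
        raise ValueError("frame counts differ")
    total, count = 0.0, 0
    for p_frame, t_frame in zip(pred, target):
        if len(p_frame) != len(t_frame):
            raise ValueError("mel-bin counts differ")
        for p, t in zip(p_frame, t_frame):
            total += abs(p - t)
            count += 1
    return total / count if count else 0.0


# Non-ASCII characters expand to several bytes: 5 characters, 6 byte ids.
ids = text_to_byte_ids("héllo")
print(len(ids), ids)
```

Because every possible input is covered by the same 256 byte values, unseen words or scripts never fall outside the input alphabet; the cost is longer input sequences for non-ASCII text.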

Both are research/devlog results, not production or SOTA. The two models are separate today; the goal is to unify them into one over time.
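For readers less familiar with the STT numbers above: CER is normally the character-level edit distance between hypothesis and reference, divided by the reference length, and a char-CTC model is typically decoded greedily by collapsing repeated symbols and dropping blanks. The sketch below shows both standard definitions in plain Python. It is an illustration, not the HoLo-ToLk evaluation code; the blank symbol `_` and the function names are assumptions.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # drop a reference character
                curr[j - 1] + 1,     # extra character in the hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance / reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def ctc_greedy_collapse(frames: str, blank: str = "_") -> str:
    """Collapse per-frame argmax symbols: merge repeats, then drop blanks."""
    out = []
    prev = None
    for sym in frames:
        if sym != prev and sym != blank:
            out.append(sym)
        prev = sym
    return "".join(out)


hyp = ctc_greedy_collapse("hh_a_ll_llo")  # blank between the two l runs keeps both
print(hyp, cer("hello", hyp))
```

Read this way, a CER of 0.194 corresponds to roughly one character edit per five reference characters, which fits the "readable but rough" description.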

Try it (combined demo, both tabs):

Substrate: pip install hsl-embedding-zero

Happy to answer questions on the lens design or the byte→signal encoding, and very open to critique, especially on the TTS free-run instability.
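As a concrete starting point for the lens discussion, a gated fusion of two feature streams is often written as a per-dimension sigmoid gate that mixes them. The sketch below is that generic form in plain Python, assuming both streams have already been projected to the same width. It is not the HoLo-ToLk lens, and the parameter names (`w_sub`, `w_mel`, `bias`) are hypothetical. In a setup like the one described, only the gate parameters would be learned while the substrate features stay frozen.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_fusion(sub, mel, w_sub, w_mel, bias):
    """Fuse one frame of substrate and log-mel features.

    Per dimension d (generic form, assumed for illustration):
        g[d]   = sigmoid(w_sub[d] * sub[d] + w_mel[d] * mel[d] + bias[d])
        out[d] = g[d] * mel[d] + (1 - g[d]) * sub[d]
    """
    if not (len(sub) == len(mel) == len(w_sub) == len(w_mel) == len(bias)):
        raise ValueError("all inputs must share one width")
    fused = []
    for s, m, ws, wm, b in zip(sub, mel, w_sub, w_mel, bias):
        g = sigmoid(ws * s + wm * m + b)
        fused.append(g * m + (1.0 - g) * s)
    return fused


# With zero weights and zero bias the gate is 0.5, so the output is the average.
print(gated_fusion([1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]))
```

A gate of this shape lets the model fall back to the substrate alone (gate near 0) or lean on the spectral view (gate near 1) per dimension, which is one way to compare the contribution of each stream.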



