cd /news/developer-tools/quicktok-a-faster-tokenizer · home topics developer-tools article
[ARTICLE · art-28848] src=discuss.huggingface.co ↗ pub= topic=developer-tools verified=true sentiment=↑ positive

Quicktok: A faster tokenizer

Developer dmatth1 released quicktok, a fast exact BPE tokenizer written in C++ that runs 2–3.5× faster than bpe-openai and 4–11× faster than tiktoken on CPU. The tokenizer is byte-identical to tiktoken and available as a Python library and C++ source, targeting large-scale data processing and inference serving.

read3 min views1 publishedJun 16, 2026

Been working on this a while! Should be useful for anyone trying to speed up their tokenization flow.

Introducing quicktok native (C++) and Python

quicktok is a fast, exact BPE tokenizer written in C++. Token ids are byte-identical to tiktoken

, and encoding runs 2–3.5× faster than bpe-openai

(the fastest alternative I know of) and 4–11× faster than tiktoken

itself. I believe it’s the fastest exact CPU tokenizer available today for these encodings. It ships cl100k, o200k, GPT-OSS (o200k_harmony), Llama-3, and Qwen2.5/3, all byte-exact, plus bring-your-own Llama-4.

This is useful for anyone doing large amounts of CPU-bound data processing — search indexing, ingesting corpora, token counting/billing — and can significantly reduce the time and cost of data ingestion. It can also be used for online request serving, such as CPU-bound inference paths (token counting, embedding serving).

I’m releasing it as a Python library (pip install quicktok-v1

) and it’s available via C++ source. Repo: GitHub - dmatth1/quicktok: Fast exact BPE tokenizer. Byte-identical to tiktoken, 7x faster · GitHub. Measured on 3 public corpora on my Apple M1, single thread, MB/s. Every encoder’s output was verified token-for-token identical against tiktoken

before timing.

cl100k_base (GPT-3.5 / GPT-4)

encoder The Pile GitHub code Common Crawl
quicktok
116.1
144.2
75.2
bpe-openai 36.5 41.6 29.2
tiktoken-rs 15.3 14.3 13.5
tiktoken (Python) 14.7 13.2 12.3
TokenDagger 11.5 12.0 11.2

o200k_base (GPT-4o)

encoder The Pile GitHub code Common Crawl
quicktok
100.6
117.1
59.2
bpe-openai 36.1 40.1 29.9
tiktoken-rs 23.1 20.9 17.9
tiktoken (Python) 21.6 19.3 16.3
TokenDagger 11.0 11.7 10.2

quicktok also beats llama.cpp’s tokenizer on the Llama-3 vocab by ~14×. The parallel encode_batch

reaches 706 MB/s native on 8 cores; from Python it sustains 550 MB/s — 24× tiktoken

’s batch API. The speedups hold on other architectures like x86.

To keep the comparison fair, each encoder is called through the same raw API its own benchmark uses. (TokenDagger’s README claims 2–4× over tiktoken, but that’s on Llama-4/Mistral vocabs on AMD EPYC; on cl100k/o200k it lands around Python tiktoken’s level.) To reproduce run make bench-compare

in the repo.

The fundamental algorithm is the same as bpe-openai

(exact backtracking BPE) - see their blog post. Much of the speedup over bpe-openai

comes from data-structure engineering around reducing memory accesses.

All comparisons are single-threaded by design - parallel/batch is available but single-threaded is fair for comparison. Multilingual text (Common Crawl) is definitely the weakest ratio. Numbers above are from an M1 and were cross-checked on x86 Xeon - the ordering holds on both but absolute MB/s moves with corpus and host. Table numbers are the native C++ build (-march=native)

. The prebuilt PyPI wheels are portable-ABI and land lower - roughly 1.1–1.6× bpe-openai

depending on corpus. Building from source recovers the table numbers.

The full methodology - corpus fetching, the exactness gate, raw-API rules — is in the bench README in the repo. If you find an input where quicktok’s ids differ from tiktoken’s that’s definitely a bug and please report it!

── more in #developer-tools 4 stories · sorted by recency
── more on @dmatth1 3 stories trending now
sponsored brought to you by zahid.host 4,200+ EU-deployed projects
reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main
Live at https://your-agent.zahid.host
Get free account → Pricing
from €0/mo · no card required
LIVE [news/quicktok-a-faster-to…] indexed:0 read:3min 2026-06-16 ·