ON1 (G116 V8): 38μs Black-Box AI Memory Retrieval on Virtual Chip ISA


[ARTICLE · art-17744] src=github.com ↗ pub=2026-05-29T15:00Z topic=ai-chips verified=true sentiment=↑ positive


ON1 (G116 V8) has introduced a virtual chip ISA that achieves 38-microsecond black-box AI memory retrieval by separating vector search into three observable latency stages: fetch, compute, and ANN search. The system, designed for real-time LLM grounding with llama.cpp, exposes memory, compute, and retrieval latencies individually rather than reporting a single opaque query time. A public verification endpoint is currently live for testing the latency decomposition.

1 min read · published May 29, 2026

G116 v8: 38μs Black-box AI Memory Retrieval on Virtual Chip ISA (Latency-Separated Fetch/Compute/ANN) — Live Tunnel Inside

Unlike any conventional chip. G116 v8 introduces a quantum-inspired virtual ISA that makes memory, compute, and ANN search latency observable – not just a single opaque query time.

Built for the next generation of LLMs (llama.cpp, real‑time RAG, natural language grounding).

G116 v8 decomposes vector retrieval into three hardware‑visible stages, just like a quantum memory fabric:

- Fetch Layer – mmap‑based dataset mapping (zero‑copy, ~0.1–0.5 μs/op)
- Compute Layer – vector transformations (NumPy / BLAS, ~0.4–2 μs/op)
- Search Layer – ANN similarity (currently brute‑force, ~3–10 ms/op; FAISS/HNSW coming)
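The three stages above can be illustrated with a minimal Python sketch that times each one separately. This is not the G116 v8 implementation: the file layout (flat float32 vectors), the dimension `DIM`, and the helper names `write_dataset` and `staged_query` are all hypothetical, and the sketch uses only the standard library in place of NumPy / BLAS so that it runs without extra dependencies. Absolute timings will therefore differ from the figures quoted in the article.

```python
import heapq
import math
import mmap
import os
import random
import struct
import tempfile
import time

DIM = 8  # hypothetical vector dimension, not taken from the article


def write_dataset(path, n, dim=DIM, seed=0):
    """Write n random float32 vectors to a flat binary file."""
    rng = random.Random(seed)
    with open(path, "wb") as f:
        for _ in range(n):
            values = (rng.uniform(-1, 1) for _ in range(dim))
            f.write(struct.pack(f"{dim}f", *values))


def staged_query(path, query, k=3, dim=DIM):
    """Run fetch, compute, and search as three separately timed stages."""
    timings = {}

    # Fetch: map the file and view it as float32 without copying.
    t0 = time.perf_counter()
    with open(path, "rb") as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    view = memoryview(mm).cast("f")
    timings["fetch"] = time.perf_counter() - t0

    # Compute: L2-normalize the query vector.
    t0 = time.perf_counter()
    norm = math.sqrt(sum(x * x for x in query)) or 1.0
    q = [x / norm for x in query]
    timings["compute"] = time.perf_counter() - t0

    # Search: brute-force dot product against every stored vector.
    t0 = time.perf_counter()
    n = len(view) // dim
    scores = (
        (sum(view[i * dim + j] * q[j] for j in range(dim)), i)
        for i in range(n)
    )
    top = heapq.nlargest(k, scores)
    timings["search"] = time.perf_counter() - t0

    view.release()
    mm.close()
    return [i for _, i in top], timings


with tempfile.NamedTemporaryFile(suffix=".f32", delete=False) as tmp:
    dataset_path = tmp.name
write_dataset(dataset_path, n=5000)
ids, timings = staged_query(dataset_path, [1.0] * DIM)
os.remove(dataset_path)

for stage, seconds in timings.items():
    print(f"{stage:8s} {seconds * 1e6:10.1f} us")
```

Reporting the three timers separately, rather than one wall-clock figure around the whole call, is the point of the decomposition: it shows which stage an optimization such as an ANN index would actually affect.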

This is not another black‑box vector DB. It’s a virtual chip ISA that makes RAG bottlenecks transparent.

Tier	Latency (per op)
Fetch	0.1 – 0.5 μs
Compute	0.4 – 2.0 μs
Search (brute)	3 – 10 ms

(Next: FAISS indexing + GPU acceleration)
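To see how the tiers compare, the table's ranges can be turned into shares of the total per-op latency. The short calculation below assumes the three stages run sequentially and that their latencies simply add, which the article does not state explicitly; the names `TIERS_US` and `shares` are illustrative only.

```python
# Per-op latency ranges from the table, converted to microseconds.
TIERS_US = {
    "fetch": (0.1, 0.5),
    "compute": (0.4, 2.0),
    "search": (3_000.0, 10_000.0),  # 3-10 ms
}


def shares(tiers, pick):
    """Return each tier's fraction of the summed latency.

    pick=0 uses the low end of every range, pick=1 the high end.
    """
    total = sum(bounds[pick] for bounds in tiers.values())
    return {name: bounds[pick] / total for name, bounds in tiers.items()}


for label, pick in (("low", 0), ("high", 1)):
    result = shares(TIERS_US, pick)
    print(label, {name: f"{frac:.4%}" for name, frac in result.items()})
```

Under that assumption, brute-force search accounts for more than 99.9% of the per-op time at both ends of the range, which is consistent with the stated plan to add FAISS indexing next.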

Most systems (FAISS / Milvus / pgvector) only give you:

“query latency = X ms”

We give you:

memory latency → compute latency → retrieval latency

This is the per-stage latency breakdown needed for real‑time LLM grounding with llama.cpp.

Our public verification endpoint is currently live. You can test the latency decomposition directly from your own terminal right now:

curl "https://5e776b15817fd1.lhr.life/query?mode=search&n=5000&k=3"
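The same request can be built from Python. The sketch below only reuses the host, path, and query parameters (`mode`, `n`, `k`) that appear in the curl command; the article does not document what `n` and `k` mean or what the response looks like, so `fetch_raw` returns the body as undecoded text rather than assuming a schema. The function names are hypothetical, and the tunnel URL may no longer be reachable.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Host taken verbatim from the curl command in the article.
BASE_URL = "https://5e776b15817fd1.lhr.life"


def build_query_url(mode="search", n=5000, k=3):
    """Build the /query URL with the parameters shown in the article."""
    params = {"mode": mode, "n": n, "k": k}
    return f"{BASE_URL}/query?{urlencode(params)}"


def fetch_raw(url, timeout=5):
    """Return the response body as text; the response format is not documented."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


url = build_query_url()
print(url)
```

Passing `url` to `fetch_raw` performs the request; varying `n` or `k` in `build_query_url` is one way to check how the reported stage latencies change.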

source & further reading


Read original on github.com → github.com/ON1-Hao/ON1




