Show HN: Shivvr – Ephemeral semantic embedding and cognitive agent service


Source: shivvr.nuts.services · Published: 2026-06-16T16:03Z · Topic: ai-tools


Shivvr, an ephemeral semantic embedding and cognitive agent service, launched on Show HN, offering chunking, embedding with GTR-T5-base, hybrid FST-BM25 search, and turn-based agent reasoning via Model Context Protocol (MCP). The service supports zero-config integration with Claude Code, Antigravity, and Codex, and provides per-agent orthogonal matrix rotation on embeddings for encryption.


Ephemeral semantic embedding & cognitive agent service.

Chunk text. Embed with GTR-T5-base. Search via hybrid FST-BM25 vector rank fusion. Stream turn-based cognitive agent reasoning via native Model Context Protocol (MCP).

Sentence-boundary chunking + GTR-T5-base embeddings (768d). Stores in RwLock<HashMap> — pure ephemeral compute.
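A minimal sketch of the idea, assuming a simple regex sentence splitter and a lock-guarded dictionary standing in for `RwLock<HashMap>`. The names `chunk_by_sentence`, `EphemeralStore`, and `max_chars` are hypothetical, and the embedding step is left out; Shivvr's actual chunker is not described in the source.

```python
import re
import threading

# Assumption: a sentence ends at ., ! or ? followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def chunk_by_sentence(text, max_chars=200):
    """Pack whole sentences into chunks of at most max_chars.

    A single sentence longer than max_chars is kept whole rather than cut.
    """
    chunks, current = [], ""
    for sentence in _SENTENCE_END.split(text.strip()):
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


class EphemeralStore:
    """In-memory session store guarded by a lock; nothing touches disk."""

    def __init__(self):
        self._lock = threading.Lock()
        self._sessions = {}  # session_id -> list of chunk strings

    def ingest(self, session_id, text, max_chars=200):
        chunks = chunk_by_sentence(text, max_chars)
        with self._lock:
            self._sessions.setdefault(session_id, []).extend(chunks)
        return len(chunks)

    def chunks(self, session_id):
        with self._lock:
            return list(self._sessions.get(session_id, []))
```

Because the store lives only in process memory, restarting the process discards every session, which is the sense of "ephemeral" used here.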

Reciprocal Rank Fusion (RRF) blends dense vector rankings with sparse lexical rankings. FST dictionary scanning adds microsecond-scale query guardrails and intent-entity score boosting.
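Reciprocal Rank Fusion combines ranked lists by summing 1 / (k + rank) for each document across the lists. The sketch below shows the standard formula; the constant `k=60` is the conventional default from the RRF literature, not a value confirmed for Shivvr, and the function name is hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    rankings: iterable of lists, each ordered best-first.
    Returns document IDs sorted by fused score, best-first.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Ties are broken by document ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense_ranking = ["a", "b", "c"]    # e.g. order by cosine similarity
sparse_ranking = ["b", "c", "a"]   # e.g. order by BM25 score
fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
```

Only ranks enter the formula, so the dense and sparse scores never need to be normalized against each other.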

Turn-based agent loop with integrated memory search, document ingestion, session indexing, and sandboxed command execution. Powered by OpenAI/Anthropic.

Complete Model Context Protocol HTTP/SSE server endpoint (`/mcp/sse`) allowing immediate, zero-config integration with Claude Code, Antigravity, and Codex.
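MCP messages are JSON-RPC 2.0 envelopes. As a rough illustration, the sketch below builds a request body of the kind a client would POST to the message endpoint. `tools/list` is a method name from the MCP specification; which methods and tools Shivvr actually exposes is not listed in the source, and `jsonrpc_request` is a hypothetical helper.

```python
import itertools
import json

_ids = itertools.count(1)  # JSON-RPC request IDs must be unique per session


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request envelope as a plain dict."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Serialized body for a request asking the server which tools it offers.
body = json.dumps(jsonrpc_request("tools/list"))
```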

Per-agent orthogonal matrix rotation on embeddings. Cosine similarity preserved under encryption. Keys are in-memory only.
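Cosine similarity survives this transform because an orthogonal matrix preserves dot products and vector norms. The sketch below demonstrates the property with a product of Householder reflections, using an even count so the result is a proper rotation. How Shivvr actually generates its per-agent matrix is not stated in the source; `make_key`, `encrypt`, and `decrypt` are hypothetical names.

```python
import math
import random


def make_key(dim, reflections=4, seed=None):
    """Return unit vectors defining a product of Householder reflections."""
    if reflections % 2:
        raise ValueError("use an even count so the product has determinant +1")
    rng = random.Random(seed)
    key = []
    for _ in range(reflections):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        key.append([x / norm for x in v])
    return key


def _reflect(x, v):
    # Householder reflection of x across the hyperplane orthogonal to unit vector v.
    d = sum(a * b for a, b in zip(x, v))
    return [a - 2.0 * d * b for a, b in zip(x, v)]


def encrypt(x, key):
    for v in key:
        x = _reflect(x, v)
    return x


def decrypt(x, key):
    # Each reflection is its own inverse, so undo them in reverse order.
    for v in reversed(key):
        x = _reflect(x, v)
    return x


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Search can therefore run directly on rotated vectors, provided the query is rotated with the same key as the stored embeddings.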

The `organize` role uses GTR-T5-base (768d). The `retrieve` role uses OpenAI text-embedding-ada-002 (1536d); pass your own key or set one server-side.

Method	Endpoint	Description
GET	`/health`	Status, model info, live counts
GET	`/mcp/sse`	MCP Server SSE handshake
POST	`/mcp/message`	MCP Server JSON-RPC message router
POST	`/sessions/:id/agent/chat`	Stream non-blocking GhostAgent cognitive turns (SSE)
POST	`/sessions/:id/ingest`	Chunk + embed text into session
GET	`/sessions/:id/search?q=...`	Semantic search (supports RRF `hybrid` & `lexical_only`)
GET	`/sessions/:id`	Session metadata
DELETE	`/sessions/:id`	Delete session
GET	`/temp`	List temp stores with TTL
POST	`/temp/:name/ingest`	Ingest into temp store (2 hr TTL)
GET	`/temp/:name/search?q=...`	Search temp store
DELETE	`/temp/:name`	Delete temp store
POST	`/agent/:id/register`	Register per-agent orthogonal key
POST	`/agent/:id/encrypt`	Encrypt embeddings
POST	`/agent/:id/decrypt`	Decrypt embeddings
POST	`/invert`	Reconstruct text from embedding vector
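The temp-store endpoints in the table imply a named store that disappears after its TTL. A minimal sketch of that behavior follows, assuming the two-hour TTL is counted from the store's creation and is not refreshed by later ingests; the source does not say which. `TempStore` and its methods are hypothetical.

```python
import time


class TempStore:
    """Named in-memory stores that expire a fixed time after creation."""

    def __init__(self, ttl_seconds=2 * 60 * 60, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._stores = {}          # name -> (expires_at, list of chunks)

    def ingest(self, name, chunk):
        now = self._clock()
        expires_at, chunks = self._stores.get(name, (now + self._ttl, []))
        if expires_at <= now:
            # The old store has expired; start a fresh one under the same name.
            expires_at, chunks = now + self._ttl, []
        chunks.append(chunk)
        self._stores[name] = (expires_at, chunks)

    def remaining(self):
        """Return {name: seconds_left} for live stores, dropping expired ones."""
        now = self._clock()
        self._stores = {n: v for n, v in self._stores.items() if v[0] > now}
        return {n: v[0] - now for n, v in self._stores.items()}
```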

curl -X POST https://shivvr.nuts.services/sessions/my-session/ingest \
  -H "Content-Type: application/json" \
  -d '{"text": "Supreme Raven is protected by Known Opossum.", "source": "vault_specs"}'

curl -i -X POST http://localhost:8085/sessions/my-session/agent/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Who protects the Supreme Raven?"}'

curl "http://localhost:8085/sessions/my-session/search?q=Known+Opossum&hybrid=true"

curl "http://localhost:8085/sessions/my-session/search?q=Opossum&lexical_only=true"
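The two search calls above can be generated from Python with the standard library. This sketch assumes boolean flags are sent as `true`/`false`, as in the curl examples; the response schema is not documented in the source, so `search` returns the raw body. Both function names are hypothetical.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def search_url(base_url, session_id, q, **params):
    """Build a /sessions/:id/search URL from a query and optional parameters."""
    query = {"q": q}
    for name, value in params.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        query[name] = value
    session = quote(session_id, safe="")
    return f"{base_url.rstrip('/')}/sessions/{session}/search?{urlencode(query)}"


def search(base_url, session_id, q, **params):
    """Run the search and return the raw response body as text."""
    with urlopen(search_url(base_url, session_id, q, **params)) as response:
        return response.read().decode("utf-8")
```

For example, `search_url("http://localhost:8085", "my-session", "Known Opossum", hybrid=True)` reproduces the hybrid query URL shown above.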

nemesis8 mcp add http://localhost:8085/mcp/sse

Param	Default	Description
`q`	required	Query text
`n`	5	Number of results
`hybrid`	false	Blend semantic vectors + BM25 scores (Reciprocal Rank Fusion)
`lexical_only`	false	Bypass vector embedder, execute pure BM25 search
`guardrail`	true	Enable FST toxic term scanning and automatic query blocking
`role`	organize	`organize` (768d local) or `retrieve` (1536d OpenAI)
`time_weight`	0.0	Blend semantic + recency score (0–1)
`decay_halflife_hours`	168	Recency decay half-life in hours
`include_nearby`	false	Return temporally adjacent chunks
`agent_id`	—	Agent ID for encrypted search
`openai_api_key`	—	Per-request OpenAI key for `retrieve` role (overrides server key)
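One plausible reading of `time_weight` and `decay_halflife_hours` is a linear blend between the semantic score and an exponentially decaying recency score. The formula below is an assumption for illustration; the source gives only the parameter names, defaults, and one-line descriptions.

```python
def recency(age_hours, halflife_hours=168.0):
    """Exponential decay: 1.0 for a brand-new chunk, 0.5 after one half-life."""
    return 0.5 ** (age_hours / halflife_hours)


def blended_score(semantic, age_hours, time_weight=0.0, halflife_hours=168.0):
    """Blend semantic similarity with recency; time_weight=0 ignores age."""
    if not 0.0 <= time_weight <= 1.0:
        raise ValueError("time_weight must be between 0 and 1")
    return (1.0 - time_weight) * semantic + time_weight * recency(
        age_hours, halflife_hours
    )
```

With the default half-life of 168 hours, a chunk ingested a week ago has a recency score of 0.5, and with the default `time_weight` of 0.0 the result is the semantic score unchanged.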

Variable	Default	Description
`PORT`	8080	Listen port
`MODEL_PATH`	models/gtr-t5-base.onnx	GTR-T5-base ONNX embedder
`TOKENIZER_PATH`	models/tokenizer.json	Tokenizer
`OPENAI_API_KEY`	—	Enables OpenAI completions and retrieve embeddings
`ANTHROPIC_API_KEY`	—	Enables Anthropic completions and GhostAgent loops
`NUTS_AUTH_JWKS_URL`	—	Enable auth (open dev mode if unset)
`NUTS_AUTH_VALIDATE_URL`	https://auth.nuts.services/api/validate	API token validation endpoint
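The defaults in the table can be summarized as a small loader. This is an illustrative sketch only: the service itself is written in Rust, and the `Config` class and `load_config` function are hypothetical. Treating an empty variable as unset is also an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Config:
    port: int
    model_path: str
    tokenizer_path: str
    openai_api_key: Optional[str]
    anthropic_api_key: Optional[str]
    auth_jwks_url: Optional[str]
    auth_validate_url: str

    @property
    def auth_enabled(self) -> bool:
        # Per the table, auth is off (open dev mode) when the JWKS URL is unset.
        return self.auth_jwks_url is not None


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Read the documented variables, falling back to the documented defaults."""
    return Config(
        port=int(env.get("PORT") or "8080"),
        model_path=env.get("MODEL_PATH") or "models/gtr-t5-base.onnx",
        tokenizer_path=env.get("TOKENIZER_PATH") or "models/tokenizer.json",
        openai_api_key=env.get("OPENAI_API_KEY") or None,
        anthropic_api_key=env.get("ANTHROPIC_API_KEY") or None,
        auth_jwks_url=env.get("NUTS_AUTH_JWKS_URL") or None,
        auth_validate_url=env.get("NUTS_AUTH_VALIDATE_URL")
        or "https://auth.nuts.services/api/validate",
    )
```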

Layer	Choice
Runtime	Rust + Tokio + axum
Cognition	GhostAgent cognitive RAG turn loop (OpenAI / Anthropic compat)
MCP Server	HTTP/SSE JSON-RPC 2.0 Model Context Protocol transport layer
Hybrid Index	Tantivy FST deterministic phrase engine + BM25F field indexer
Embedding	GTR-T5-base (768d) via ONNX Runtime 2.0 — local, required
Storage	Ephemeral RwLock<HashMap> — no disk, no volume mounts
GPU	CUDA 12.6 via ort EP on Cloud Run L4 — CPU fallback automatic
Inversion	vec2text gtr-base (projection + T5 enc/dec) — optional

source & further reading

shivvr.nuts.services — original article




