{"slug": "show-hn-shivvr-ephemeral-semantic-embedding-and-cognitive-agent-service", "title": "Show HN: Shivvr – Ephemeral semantic embedding and cognitive agent service", "summary": "Shivvr, an ephemeral semantic embedding and cognitive agent service, launched on Show HN, offering chunking, embedding with GTR-T5-base, hybrid FST-BM25 search, and turn-based agent reasoning via Model Context Protocol (MCP). The service supports zero-config integration with Claude Code, Antigravity, and Codex, and provides per-agent orthogonal matrix rotation on embeddings for encryption.", "body_md": "Ephemeral semantic embedding & cognitive agent service.\n\nChunk text. Embed with GTR-T5-base. Search via hybrid FST-BM25 vector rank fusion. Stream turn-based cognitive agent reasoning via native Model Context Protocol (MCP).\n\nSentence-boundary chunking + GTR-T5-base embeddings (768d). Stores in RwLock<HashMap> — pure ephemeral compute.\n\nRRF blending dense vectors and sparse lexical indices. FST dictionary scanning provides microsecond safe query guardrails and intent-entity score boosting.\n\nTurn-based agent loop with integrated memory search, document ingestion, session indexing, and sandboxed command execution. Powered by OpenAI/Anthropic.\n\nComplete Model Context Protocol HTTP/SSE server endpoint (`/mcp/sse`) allowing immediate, zero-config integration with Claude Code, Antigravity, and Codex.\n\nPer-agent orthogonal matrix rotation on embeddings. Cosine similarity preserved under encryption. Keys are in-memory only.\n\n`organize` role uses GTR-T5-base (768d). 
`retrieve` role uses OpenAI text-embedding-ada-002 (1536d) — pass your own key or set server-side.\n\n| Method | Endpoint | Description |\n|---|---|---|\n| GET | `/health` | Status, model info, live counts |\n| GET | `/mcp/sse` | MCP Server SSE handshake |\n| POST | `/mcp/message` | MCP Server JSON-RPC message router |\n| POST | `/sessions/:id/agent/chat` | Stream non-blocking GhostAgent cognitive turns (SSE) |\n| POST | `/sessions/:id/ingest` | Chunk + embed text into session |\n| GET | `/sessions/:id/search?q=...` | Semantic search (supports RRF `hybrid` & `lexical_only`) |\n| GET | `/sessions/:id` | Session metadata |\n| DELETE | `/sessions/:id` | Delete session |\n| GET | `/temp` | List temp stores with TTL |\n| POST | `/temp/:name/ingest` | Ingest into temp store (2 hr TTL) |\n| GET | `/temp/:name/search?q=...` | Search temp store |\n| DELETE | `/temp/:name` | Delete temp store |\n| POST | `/agent/:id/register` | Register per-agent orthogonal key |\n| POST | `/agent/:id/encrypt` | Encrypt embeddings |\n| POST | `/agent/:id/decrypt` | Decrypt embeddings |\n| POST | `/invert` | Reconstruct text from embedding vector |\n\n```\n# Ingest into session\ncurl -X POST https://shivvr.nuts.services/sessions/my-session/ingest \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"text\": \"Supreme Raven is protected by Known Opossum.\", \"source\": \"vault_specs\"}'\n\n# Autonomous Agent Conversational Chat (Streams Thoughts, ToolCalls, & Answer via SSE)\ncurl -i -X POST http://localhost:8085/sessions/my-session/agent/chat \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"message\": \"Who protects the Supreme Raven?\"}'\n\n# Vector-Lexical Hybrid RRF Search\ncurl \"http://localhost:8085/sessions/my-session/search?q=Known+Opossum&hybrid=true\"\n\n# High-speed Lexical-Only BM25 Search (Bypasses ONNX embedder)\ncurl \"http://localhost:8085/sessions/my-session/search?q=Opossum&lexical_only=true\"\n\n# Synchronize Claude Code or Antigravity with shivvr's Native 
MCP Server\nnemesis8 mcp add http://localhost:8085/mcp/sse\n```\n\n| Param | Default | Description |\n|---|---|---|\n| `q` | required | Query text |\n| `n` | 5 | Number of results |\n| `hybrid` | false | Blend semantic vectors + BM25 scores (Reciprocal Rank Fusion) |\n| `lexical_only` | false | Bypass vector embedder, execute pure BM25 search |\n| `guardrail` | true | Enable FST toxic term scanning and automatic query blocking |\n| `role` | organize | `organize` (768d local) or `retrieve` (1536d OpenAI) |\n| `time_weight` | 0.0 | Blend semantic + recency score (0–1) |\n| `decay_halflife_hours` | 168 | Recency decay half-life in hours |\n| `include_nearby` | false | Return temporally adjacent chunks |\n| `agent_id` | — | Agent ID for encrypted search |\n| `openai_api_key` | — | Per-request OpenAI key for `retrieve` role (overrides server key) |\n\n| Variable | Default | Description |\n|---|---|---|\n| `PORT` | 8080 | Listen port |\n| `MODEL_PATH` | models/gtr-t5-base.onnx | GTR-T5-base ONNX embedder |\n| `TOKENIZER_PATH` | models/tokenizer.json | Tokenizer |\n| `OPENAI_API_KEY` | — | Enables OpenAI completions and retrieve embeddings |\n| `ANTHROPIC_API_KEY` | — | Enables Anthropic completions and GhostAgent loops |\n| `NUTS_AUTH_JWKS_URL` | — | Enable auth (open dev mode if unset) |\n| `NUTS_AUTH_VALIDATE_URL` | https://auth.nuts.services/api/validate | API token validation endpoint |\n\n| Layer | Choice |\n|---|---|\n| Runtime | Rust + Tokio + axum |\n| Cognition | GhostAgent cognitive RAG turn loop (OpenAI / Anthropic compat) |\n| MCP Server | HTTP/SSE JSON-RPC 2.0 Model Context Protocol transport layer |\n| Hybrid Index | Tantivy FST deterministic phrase engine + BM25F field indexer |\n| Embedding | GTR-T5-base (768d) via ONNX Runtime 2.0 — local, required |\n| Storage | Ephemeral RwLock<HashMap> — no disk, no volume mounts |\n| GPU | CUDA 12.6 via ort EP on Cloud Run L4 — CPU fallback automatic |\n| Inversion | vec2text gtr-base (projection + T5 enc/dec) — optional |", "url": 
"https://wpnews.pro/news/show-hn-shivvr-ephemeral-semantic-embedding-and-cognitive-agent-service", "canonical_source": "https://shivvr.nuts.services/", "published_at": "2026-06-16 16:03:20+00:00", "updated_at": "2026-06-16 16:22:48.126506+00:00", "lang": "en", "topics": ["ai-tools", "natural-language-processing", "ai-agents", "developer-tools"], "entities": ["Shivvr", "GTR-T5-base", "OpenAI", "Anthropic", "Claude Code", "Antigravity", "Codex", "Model Context Protocol"], "alternates": {"html": "https://wpnews.pro/news/show-hn-shivvr-ephemeral-semantic-embedding-and-cognitive-agent-service", "markdown": "https://wpnews.pro/news/show-hn-shivvr-ephemeral-semantic-embedding-and-cognitive-agent-service.md", "text": "https://wpnews.pro/news/show-hn-shivvr-ephemeral-semantic-embedding-and-cognitive-agent-service.txt", "jsonld": "https://wpnews.pro/news/show-hn-shivvr-ephemeral-semantic-embedding-and-cognitive-agent-service.jsonld"}}