{"slug": "show-hn-tinysearch-token-efficient-web-research-for-local-ai-agents", "title": "Show HN: TinySearch – token-efficient web research for local AI agents", "summary": "Developer MarcellM01 released TinySearch, an open-source, self-hosted web research tool for local AI agents that integrates with MCP clients like Cursor and Claude Desktop. The tool performs search, reranking, crawling, and extraction to produce source-grounded prompts without hosted dashboards or analytics, aiming to give local agents efficient web research capabilities.", "body_md": "**Self-hosted web research for MCP agents.**\n\nTinySearch gives local AI agents a web-research tool they can actually use: search the web, rerank results, crawl the best pages, extract the most relevant chunks, and return a source-grounded prompt your LLM can answer from.\n\nNo hosted dashboard. No account system. No analytics. No scraped-data cache.\n\nJust search -> crawl -> rerank -> grounded prompt.\n\n- Add web research to Cursor, Cline, Roo Code, Claude Desktop, or any MCP client.\n- Keep source URLs attached to the evidence your model sees.\n- Avoid dumping full webpages into context.\n- Run with local ONNX embeddings by default, or bring an OpenAI-compatible embedding API.\n- Use SearXNG by default, with a DuckDuckGo HTML fallback when configured.\n- Keep the stack small enough to run locally in Docker.\n\nTinySearch is built for local agents, prototypes, personal workflows, and small systems where source-grounded web research matters more than running a full search product.\n\nRun TinySearch with its own SearXNG instance as an MCP server over Streamable HTTP. 
Docker Compose loads the configuration directly from GitHub, so you do not need to clone the repository or create any configuration files:\n\n```\ndocker compose -f \"https://github.com/MarcellM01/TinySearch.git#main:compose.quickstart.yaml\" up -d\n```\n\nThen connect your MCP client to:\n\n```\n{\n  \"mcpServers\": {\n    \"tinysearch\": {\n      \"url\": \"http://localhost:8000/mcp\"\n    }\n  }\n}\n```\n\nStop and remove the containers later with:\n\n```\ndocker compose -f \"https://github.com/MarcellM01/TinySearch.git#main:compose.quickstart.yaml\" down\n```\n\nTinySearch exposes three MCP tools:\n\n```\nget_current_datetime()\nresearch(query)\nscrape_url(url, query)\n```\n\nTypical routing:\n\n- Use `research(query)` when the agent needs to discover relevant URLs.\n- Use `scrape_url(url, query)` when the user already provided a URL, or when `research` found the page to inspect.\n- Use `get_current_datetime()` before time-sensitive research.\n\nThe tools return a grounded prompt in the `answer` field. Your MCP client model\nuses that prompt to write the final response with citations.\n\n```\nflowchart TB\n    subgraph Row1[\"Search and choose pages\"]\n        direction LR\n        A[User query] --> B[Web search<br/>SearXNG default, DuckDuckGo fallback]\n        B --> C[Filter HTTP results<br/>build title URL domain snippet docs]\n        C --> D[Rank search docs<br/>dense + BM25 weighted RRF]\n    end\n\n    subgraph Row2[\"Crawl and build prompt\"]\n        direction LR\n        E[Crawl kept URLs in parallel<br/>crawl4ai markdown] --> F[Truncate and chunk markdown]\n        F --> G[Rank combined chunk pool<br/>dense + BM25 weighted RRF]\n        G --> H[Dedupe chunks<br/>apply source quotas and fill]\n        H --> I[Build source-grounded prompt]\n    end\n\n    Row1 --> Row2\n```\n\nTinySearch does not directly answer the question. 
It returns a\n**structured prompt** in the MCP tool's **`answer` field**, and your\n**client model** uses that prompt to produce the final\n**cited response**.\n\n```\nQUESTION\nWhat happened in the latest NFL playoffs?\n\nTODAY\n2026-05-15\n\nRESULTS\n1. Title\n   URL\n   Relevant extracted text...\n\n2. Title\n   URL\n   Relevant extracted text...\n\nINSTRUCTIONS\nAnswer only from the results. Cite source URLs.\n```\n\nUse this path if you want to inspect the code, edit TinySearch, or run it as a local stdio MCP server.\n\n```\ngit clone https://github.com/MarcellM01/TinySearch\ncd TinySearch\n\npython -m venv .venv\nsource .venv/bin/activate\npip install -r requirements.txt\n```\n\nMCP clients spawn TinySearch from their config. Add it with absolute paths:\n\nmacOS / Linux:\n\n```\n{\n  \"mcpServers\": {\n    \"tinysearch\": {\n      \"command\": \"/absolute/path/to/TinySearch/.venv/bin/python\",\n      \"args\": [\n        \"/absolute/path/to/TinySearch/servers/mcp_server.py\"\n      ]\n    }\n  }\n}\n```\n\nWindows:\n\n```\n{\n  \"mcpServers\": {\n    \"tinysearch\": {\n      \"command\": \"C:/absolute/path/to/TinySearch/.venv/Scripts/python.exe\",\n      \"args\": [\n        \"C:/absolute/path/to/TinySearch/servers/mcp_server.py\"\n      ]\n    }\n  }\n}\n```\n\nTemplate config files live in `mcp_templates/`.\n\nThe repo also includes [agentic_coding_templates/global-rules-recommended.md](/MarcellM01/TinySearch/blob/main/agentic_coding_templates/global-rules-recommended.md),\na global-rules template for agentic coding tools such as Cline and Roo Code.\nThese rules help coding agents call TinySearch only when web research is\nactually needed.\n\nThe server uses **stdio** by default, which is what Cursor and similar clients\nexpect when they spawn `python .../mcp_server.py`. To run with `sse` or\n`streamable-http`, set `MCP_TRANSPORT` when starting the process. 
Do not put\ntransport in `configs/research_config.json`.\n\nThe [quick start](#quick-start) command runs TinySearch over Streamable HTTP on\n`http://localhost:8000/mcp`. Docker pulls `marcellm01/tinysearch:latest`\nautomatically if the image is not already local.\n\nWith `MCP_TRANSPORT=streamable-http`, the image serves Streamable HTTP on\n`/mcp` and SSE on `/mcp/sse`. GET requests to `/mcp` without an\n`mcp-session-id` are treated as the legacy SSE stream. If a client still cannot\nconnect, try `MCP_TRANSPORT=sse` alone or the stdio Docker setup below.\n\nDocker images are published automatically when a version tag or GitHub release is created.\n\n- `marcellm01/tinysearch:<version>` is published for tags such as `v0.1.4`.\n- `marcellm01/tinysearch:latest` is updated for stable releases.\n- Images are built for both `linux/amd64` and `linux/arm64`.\n\nFor repeated use, keep downloaded models in a Docker volume and mount your local\nconfig. The mounted config can also include `blocked_domains` to exclude sites\nfrom search results:\n\n```\ndocker run --rm \\\n  -p 8000:8000 \\\n  -v tinysearch-models:/data/models \\\n  -v \"$PWD/configs/research_config.json:/config/research_config.json:ro\" \\\n  -e TINYSEARCH_CONFIG_PATH=/config/research_config.json \\\n  -e MCP_TRANSPORT=streamable-http \\\n  -e MCP_HOST=0.0.0.0 \\\n  marcellm01/tinysearch:latest\n```\n\nExample config entry:\n\n```\n\"blocked_domains\": [\"example.com\", \"spammy-site.test\"]\n```\n\nUse this mode for MCP clients that launch tools as local commands instead of\nconnecting to a URL. 
Replace `/absolute/path/to/TinySearch` with this repo's\nabsolute path:\n\n```\n{\n  \"mcpServers\": {\n    \"tinysearch\": {\n      \"command\": \"docker\",\n      \"args\": [\n        \"run\",\n        \"--rm\",\n        \"-i\",\n        \"-v\",\n        \"tinysearch-models:/data/models\",\n        \"-v\",\n        \"/absolute/path/to/TinySearch/configs/research_config.json:/config/research_config.json:ro\",\n        \"-e\",\n        \"TINYSEARCH_CONFIG_PATH=/config/research_config.json\",\n        \"-e\",\n        \"TINYSEARCH_MODELS_DIR=/data/models\",\n        \"marcellm01/tinysearch:latest\"\n      ]\n    }\n  }\n}\n```\n\nEdit `configs/research_config.json` to choose `embedding_model` (`fast`,\n`balanced`, `quality`, or a custom Hugging Face ONNX repo id). The named Docker\nvolume keeps downloaded model bundles between launches.\n\nThe optional FastAPI server is useful when you want HTTP instead of MCP:\n\n```\nuvicorn servers.fastapi_server:app --reload\n```\n\nEndpoints:\n\n- `GET /health`\n- `GET /current_datetime`\n- `GET /web_search?query=...`\n- `POST /site_crawl`\n- `POST /scrape`\n- `POST /research`\n\n`POST /scrape` accepts a JSON body with `url` (required), `query` (required,\nnon-empty), `max_tokens` (optional, default 4000) and `include_metadata`\n(optional, default true). 
The response includes a `URL-GROUNDED ANSWER PROMPT` in `answer`, plus\n`content_tokens`, `answer_tokens`, `truncated`, `url`, `title`, `retrieved_at`\n(aware UTC) and best-effort `metadata` (`description`, `author`,\n`published_date`).\n\nErrors return `{\"detail\": {\"code\", \"message\"}}` with stable codes:\n`invalid_url` (400), `blocked_url` (403), `unsupported_document` (415),\n`empty_content` (422), `fetch_failed` (502), `fetch_timeout` (504).\n\n`/scrape` and `scrape_url` accept arbitrary user-supplied URLs and enforce\nthe following checks before fetching:\n\n- only `http` and `https` schemes\n- URLs with embedded credentials are rejected\n- IP literals and resolved addresses that are loopback, private, link-local,\n  multicast, reserved or unspecified are rejected (DNS rebinding is mitigated\n  by rejecting if **any** resolved address is non-public, not just one)\n- the configured `blocked_domains` list is applied to both the initial URL and the final URL reported by the crawler after redirects\n\nCrawl4AI does not expose intermediate redirect hops, so the safety check runs on the initial URL and the final URL. If you need stricter handling for redirect chains, run TinySearch behind an egress proxy that enforces your policy.\n\nTune research defaults in `configs/research_config.json`. Set\n`TINYSEARCH_CONFIG_PATH` to load a different JSON config file, which is the\nrecommended Docker override pattern.\n\nSet `blocked_domains` to a JSON list of domains you do not want TinySearch to\nreturn or crawl. Entries match the domain and its subdomains, so `example.com`\nalso blocks `www.example.com` and `news.example.com`. URL-style entries such as\n`https://example.com/path` are accepted and normalized to their hostname.\n\nThe `onnx` embedding backend uses local ONNX bundles under `models/`. 
Starting\nthe MCP server or FastAPI app downloads the configured `embedding_model` once\nfrom Hugging Face when `embedding_backend` is `onnx`.\n\nBuilt-in local presets:\n\n- `fast`: `onnx-models/all-MiniLM-L6-v2-onnx`\n- `balanced`: `BAAI/bge-small-en-v1.5`\n- `quality`: `BAAI/bge-base-en-v1.5`\n\nYou can also set `embedding_model` to a custom Hugging Face ONNX repo id. Set\n`TINYSEARCH_MODELS_DIR` to move the model cache, or use\n`TINYSEARCH_ONNX_MODEL_DIR` when you need to point at one exact bundle directory.\n\nKey settings:\n\n- Search: `search_top_k`, `search_rrf_cutoff`, `search_dense_weight`, `search_max_results_to_keep`, `blocked_domains`\n- Search backend: `search_backend`, `search_backend_url`, `search_engines`, `search_region`, `search_backend_fallback`\n- Chunks: `chunk_rrf_cutoff`, `chunk_dense_weight`, `chunk_max_results_to_keep`\n- Crawl: `crawl_max_chunk_tokens`, `crawl_overlap_tokens`, `max_concurrent_crawls`\n- Embeddings: `embedding_backend`, `embedding_model`, `embedding_openai_env_file`, `max_concurrent_embedding_calls`\n- Tokenizer: `encoding_name`\n- Dense input prefixes: `dense_query_prefix`, `dense_document_prefix`\n- Trace: `trace_path`\n\nWhen `embedding_backend` is set to `openai_compatible`, add a `.env` file at the project\nroot, or set `embedding_openai_env_file`, with:\n\n```\nOPENAI_BASE_URL=\nOPENAI_API_KEY=\nOPENAI_EMBEDDING_MODEL=\n```\n\n`OPENAI_BASE_URL` is optional for api.openai.com. `EMBEDDING_MODEL` and\n`MODEL_NAME` are accepted as aliases for `OPENAI_EMBEDDING_MODEL`.\n\nThe research pipeline requires dense embeddings. It raises if\n`search_dense_weight` or `chunk_dense_weight` is set to `0`.\n\nTinySearch supports two web-search backends and selects between them from config. 
The defaults aim at the bundled compose setup: SearXNG runs as a sidecar, with the DuckDuckGo HTML scraper kept as an automatic fallback.\n\nSince `v0.2`, TinySearch defaults to a SearXNG-compatible backend. The bundled\nCompose files ship a local SearXNG service so the stack works out of the box,\nwhile the DuckDuckGo HTML scraper remains available as a configurable fallback.\n\nAvailable values for `search_backend`:\n\n- `\"searxng\"` (default): query a SearXNG-compatible JSON endpoint. If the call fails and `search_backend_fallback` is `true`, TinySearch falls back to DuckDuckGo. With `search_backend_fallback: false` the SearXNG error surfaces.\n- `\"duckduckgo\"`: skip SearXNG entirely and use the existing DuckDuckGo HTML scraper. This is the escape hatch that preserves pre-0.2 behavior.\n- `\"auto\"`: try SearXNG, then DuckDuckGo on any backend failure (fallback is implied regardless of `search_backend_fallback`).\n\nA backend \"failure\" means a real backend error: network/timeout, non-200 HTTP\nresponse, a non-JSON SearXNG body, or a DuckDuckGo CAPTCHA / 403. A legitimate\nempty result set is **not** a failure and does not trigger fallback.\n\nMinimal config example:\n\n```\n{\n  \"search_backend\": \"searxng\",\n  \"search_backend_url\": \"http://searxng:8080/search\",\n  \"search_engines\": [\"google\", \"bing\"],\n  \"search_region\": \"us-en\",\n  \"search_backend_fallback\": true\n}\n```\n\nSearXNG ships with the JSON output format **disabled** by default. The bundled\n`searxng/settings.yml` enables it via:\n\n```\nsearch:\n  formats:\n    - html\n    - json\n```\n\nIf TinySearch reports `SearchBackendUnavailable: SearXNG did not return JSON`,\nyour SearXNG instance is returning HTML. Add `json` to `search.formats` and\nrestart it.\n\n`SEARXNG_URL`: overrides `search_backend_url` for the running process. 
Useful in Docker so the same image can point at different SearXNG endpoints without rebuilding `research_config.json`.\n\nThe bundled `compose.yaml` starts a `searxng` service alongside `mcp` (and\noptionally `fastapi`). The `mcp` and `fastapi` services reach SearXNG at\n`http://searxng:8080/search` over the internal compose network, and have\n`SEARXNG_URL` set automatically.\n\n```\ndocker compose up\n```\n\nA minimal `searxng/settings.yml` is committed at the repo root. Override\n`server.secret_key` before exposing the SearXNG instance beyond localhost.\n\nWhen you run TinySearch standalone (e.g. `docker run marcellm01/tinysearch:latest`\nor `python servers/mcp_server.py`), there is no local SearXNG. With the default\nconfig (`search_backend: \"searxng\"`, `search_backend_fallback: true`) the\nSearXNG call fails fast on the short connect timeout and TinySearch\ntransparently falls back to DuckDuckGo.\n\nTo keep the pre-0.2 behavior with no SearXNG involvement, set:\n\n```\n{ \"search_backend\": \"duckduckgo\" }\n```\n\nTinySearch is not a replacement for a commercial search API or a persistent crawler. 
It is probably not the right tool if you need:\n\n- guaranteed search coverage\n- large-scale indexing\n- long-term page caching\n- enterprise observability\n- production SLA-backed web search\n\n| Option | Best when you want | Tradeoff |\n|---|---|---|\n| Search API | Hosted search results with stronger coverage guarantees | Usually paid, hosted, and not MCP-native |\n| SearXNG | Self-hosted metasearch | You still need crawling, reranking, chunking, and prompt assembly |\n| Full crawler / index | Persistent searchable storage | More infrastructure than most local agents need |\n| Browser automation | A model clicking around the web | More tokens, slower runs, and less predictable evidence packing |\n| TinySearch | A local MCP research tool that returns ranked, cited evidence chunks | Lightweight by design; not a full search engine or hosted answer API |\n\nJoin the [TinySearch Discord](https://discord.gg/NG6u2zamR) for support,\nrelease updates, bug reports, and contributor discussion.\n\n- `pipelines.agentic_research.agentic_run`: single-turn search, crawl, ranking, and prompt assembly\n- `servers.mcp_server`: MCP server for agent clients\n- `servers.fastapi_server`: optional HTTP API\n\nRun the unittest suite:\n\n```\npython -m unittest discover tests\n```\n\nUsing TinySearch or want to build on it?\n\n[Email me](mailto:hello.marcbuilds@gmail.com) or reach me on [Bluesky](https://bsky.app/profile/marcellm01.bsky.social).\n\nTinySearch reads the pages it crawls and returns ranked excerpts to the calling\nclient. It does not include credentials in the repo, and `.env` / trace output\nshould stay local. 
If you enable `openai_compatible` embeddings, your embedding\nprovider receives the text snippets sent for vectorization.\n\nSource code in this repository is under the [MIT License](/MarcellM01/TinySearch/blob/main/LICENSE).\n\nWhen `embedding_backend` is `onnx`, TinySearch may download the selected local\nONNX embedding bundle at runtime from Hugging Face. Those weights are separate\ndistributions under their model-card licenses; keep license and attribution\nnotices if you ship or redistribute those files. Optional manual export for\n`fast` uses `sentence-transformers/all-MiniLM-L6-v2` (Apache-2.0).\n\nSee [NOTICE](/MarcellM01/TinySearch/blob/main/NOTICE) for Docker and third-party distribution notes.", "url": "https://wpnews.pro/news/show-hn-tinysearch-token-efficient-web-research-for-local-ai-agents", "canonical_source": "https://github.com/MarcellM01/TinySearch", "published_at": "2026-06-30 13:36:52+00:00", "updated_at": "2026-06-30 13:50:14.486946+00:00", "lang": "en", "topics": ["ai-agents", "ai-tools", "developer-tools", "large-language-models", "ai-infrastructure"], "entities": ["MarcellM01", "TinySearch", "Cursor", "Cline", "Roo Code", "Claude Desktop", "SearXNG", "DuckDuckGo"], "alternates": {"html": "https://wpnews.pro/news/show-hn-tinysearch-token-efficient-web-research-for-local-ai-agents", "markdown": "https://wpnews.pro/news/show-hn-tinysearch-token-efficient-web-research-for-local-ai-agents.md", "text": "https://wpnews.pro/news/show-hn-tinysearch-token-efficient-web-research-for-local-ai-agents.txt", "jsonld": "https://wpnews.pro/news/show-hn-tinysearch-token-efficient-web-research-for-local-ai-agents.jsonld"}}