# Show HN: TinySearch – token-efficient web research for local AI agents

> Source: <https://github.com/MarcellM01/TinySearch>
> Published: 2026-06-30 13:36:52+00:00

**Self-hosted web research for MCP agents.**

TinySearch gives local AI agents a web-research tool they can actually use: search the web, rerank results, crawl the best pages, extract the most relevant chunks, and return a source-grounded prompt your LLM can answer from.

No hosted dashboard. No account system. No analytics. No scraped-data cache.

Just search -> crawl -> rerank -> grounded prompt.

- Add web research to Cursor, Cline, Roo Code, Claude Desktop, or any MCP client.
- Keep source URLs attached to the evidence your model sees.
- Avoid dumping full webpages into context.
- Run with local ONNX embeddings by default, or bring an OpenAI-compatible embedding API.
- Use SearXNG by default, with a DuckDuckGo HTML fallback when configured.
- Keep the stack small enough to run locally in Docker.

TinySearch is built for local agents, prototypes, personal workflows, and small systems where source-grounded web research matters more than running a full search product.

Run TinySearch with its own SearXNG instance as an MCP server over Streamable HTTP. Docker Compose loads the configuration directly from GitHub, so you do not need to clone the repository or create any configuration files:

```
docker compose -f "https://github.com/MarcellM01/TinySearch.git#main:compose.quickstart.yaml" up -d
```

Then connect your MCP client to:

```
{
  "mcpServers": {
    "tinysearch": {
      "url": "http://localhost:8000/mcp"
    }
  }
}
```

Stop and remove the containers later with:

```
docker compose -f "https://github.com/MarcellM01/TinySearch.git#main:compose.quickstart.yaml" down
```

TinySearch exposes three MCP tools:

```
get_current_datetime()
research(query)
scrape_url(url, query)
```

Typical routing:

- Use
`research(query)`

when the agent needs to discover relevant URLs. - Use
`scrape_url(url, query)`

when the user already provided a URL, or when`research`

found the page to inspect. - Use
`get_current_datetime()`

before time-sensitive research.

The tools return a grounded prompt in the `answer`

field. Your MCP client model
uses that prompt to write the final response with citations.

```
flowchart TB
    subgraph Row1["Search and choose pages"]
        direction LR
        A[User query] --> B[Web search<br/>SearXNG default, DuckDuckGo fallback]
        B --> C[Filter HTTP results<br/>build title URL domain snippet docs]
        C --> D[Rank search docs<br/>dense + BM25 weighted RRF]
    end

    subgraph Row2["Crawl and build prompt"]
        direction LR
        E[Crawl kept URLs in parallel<br/>crawl4ai markdown] --> F[Truncate and chunk markdown]
        F --> G[Rank combined chunk pool<br/>dense + BM25 weighted RRF]
        G --> H[Dedupe chunks<br/>apply source quotas and fill]
        H --> I[Build source-grounded prompt]
    end

    Row1 --> Row2
```

TinySearch does not directly answer the question. It returns a
**structured prompt** in the MCP tool's ** answer field**, and your

**client model** uses that prompt to produce the final

**cited response**.

```
QUESTION
What happened in the latest NFL playoffs?

TODAY
2026-05-15

RESULTS
1. Title
   URL
   Relevant extracted text...

2. Title
   URL
   Relevant extracted text...

INSTRUCTIONS
Answer only from the results. Cite source URLs.
```

Use this path if you want to inspect the code, edit TinySearch, or run it as a local stdio MCP server.

```
git clone https://github.com/MarcellM01/TinySearch
cd TinySearch

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

MCP clients spawn TinySearch from their config. Add it with absolute paths:

macOS / Linux:

```
{
  "mcpServers": {
    "tinysearch": {
      "command": "/absolute/path/to/TinySearch/.venv/bin/python",
      "args": [
        "/absolute/path/to/TinySearch/servers/mcp_server.py"
      ]
    }
  }
}
```

Windows:

```
{
  "mcpServers": {
    "tinysearch": {
      "command": "C:/absolute/path/to/TinySearch/.venv/Scripts/python.exe",
      "args": [
        "C:/absolute/path/to/TinySearch/servers/mcp_server.py"
      ]
    }
  }
}
```

Template config files live in `mcp_templates/`

.

The repo also includes [ agentic_coding_templates/global-rules-recommended.md](/MarcellM01/TinySearch/blob/main/agentic_coding_templates/global-rules-recommended.md),
a global-rules template for agentic coding tools such as Cline and Roo Code.
These rules help coding agents call TinySearch only when web research is
actually needed.

The server uses **stdio** by default, which is what Cursor and similar clients
expect when they spawn `python .../mcp_server.py`

. To run with `sse`

or
`streamable-http`

, set `MCP_TRANSPORT`

when starting the process. Do not put
transport in `configs/research_config.json`

.

The [quick start](#quick-start) command runs TinySearch over Streamable HTTP on
`http://localhost:8000/mcp`

. Docker pulls `marcellm01/tinysearch:latest`

automatically if the image is not already local.

With `MCP_TRANSPORT=streamable-http`

, the image serves Streamable HTTP on
`/mcp`

and SSE on `/mcp/sse`

. GET requests to `/mcp`

without an
`mcp-session-id`

are treated as the legacy SSE stream. If a client still cannot
connect, try `MCP_TRANSPORT=sse`

alone or the stdio Docker setup below.

Docker images are published automatically when a version tag or GitHub release is created.

`marcellm01/tinysearch:<version>`

is published for tags such as`v0.1.4`

.`marcellm01/tinysearch:latest`

is updated for stable releases.- Images are built for both
`linux/amd64`

and`linux/arm64`

.

For repeated use, keep downloaded models in a Docker volume and mount your local
config. The mounted config can also include `blocked_domains`

to exclude sites
from search results:

```
docker run --rm \
  -p 8000:8000 \
  -v tinysearch-models:/data/models \
  -v "$PWD/configs/research_config.json:/config/research_config.json:ro" \
  -e TINYSEARCH_CONFIG_PATH=/config/research_config.json \
  -e MCP_TRANSPORT=streamable-http \
  -e MCP_HOST=0.0.0.0 \
  marcellm01/tinysearch:latest
```

Example config entry:

```
"blocked_domains": ["example.com", "spammy-site.test"]
```

Use this mode for MCP clients that launch tools as local commands instead of
connecting to a URL. Replace `/absolute/path/to/TinySearch`

with this repo's
absolute path:

```
{
  "mcpServers": {
    "tinysearch": {
      "command": "docker",
      "args": [
        "run",
        "--rm",
        "-i",
        "-v",
        "tinysearch-models:/data/models",
        "-v",
        "/absolute/path/to/TinySearch/configs/research_config.json:/config/research_config.json:ro",
        "-e",
        "TINYSEARCH_CONFIG_PATH=/config/research_config.json",
        "-e",
        "TINYSEARCH_MODELS_DIR=/data/models",
        "marcellm01/tinysearch:latest"
      ]
    }
  }
}
```

Edit `configs/research_config.json`

to choose `embedding_model`

(`fast`

,
`balanced`

, `quality`

, or a custom Hugging Face ONNX repo id). The named Docker
volume keeps downloaded model bundles between launches.

Useful when you want HTTP instead of MCP:

```
uvicorn servers.fastapi_server:app --reload
```

Endpoints:

`GET /health`

`GET /current_datetime`

`GET /web_search?query=...`

`POST /site_crawl`

`POST /scrape`

`POST /research`

`POST /scrape`

accepts a JSON body with `url`

(required), `query`

(required,
non-empty), `max_tokens`

(optional, default 4000) and `include_metadata`

(optional, default true). The response includes a `URL-GROUNDED ANSWER PROMPT`

in `answer`

, plus `content_tokens`

, `answer_tokens`

, `truncated`

, `url`

,
`title`

, `retrieved_at`

(aware UTC) and best-effort `metadata`

(`description`

, `author`

, `published_date`

).

Errors return `{"detail": {"code", "message"}}`

with stable codes:
`invalid_url`

(400), `blocked_url`

(403), `unsupported_document`

(415),
`empty_content`

(422), `fetch_failed`

(502), `fetch_timeout`

(504).

`/scrape`

and `scrape_url`

accept arbitrary user-supplied URLs and enforce
the following checks before fetching:

- only
`http`

and`https`

schemes - URLs with embedded credentials are rejected
- IP literals and resolved addresses that are loopback, private, link-local,
multicast, reserved or unspecified are rejected (DNS rebinding is mitigated
by rejecting if
**any** resolved address is non-public, not just one) - the configured
`blocked_domains`

list is applied to both the initial URL and the final URL reported by the crawler after redirects

Crawl4AI does not expose intermediate redirect hops, so the safety check runs on the initial URL and the final URL. If you need stricter handling for redirect chains, run TinySearch behind an egress proxy that enforces your policy.

Tune research defaults in `configs/research_config.json`

. Set
`TINYSEARCH_CONFIG_PATH`

to load a different JSON config file, which is the
recommended Docker override pattern.

Set `blocked_domains`

to a JSON list of domains you do not want TinySearch to
return or crawl. Entries match the domain and its subdomains, so `example.com`

also blocks `www.example.com`

and `news.example.com`

. URL-style entries such as
`https://example.com/path`

are accepted and normalized to their hostname.

The `onnx`

embedding backend uses local ONNX bundles under `models/`

. Starting
the MCP server or FastAPI app downloads the configured `embedding_model`

once
from Hugging Face when `embedding_backend`

is `onnx`

.

Built-in local presets:

`fast`

:`onnx-models/all-MiniLM-L6-v2-onnx`

`balanced`

:`BAAI/bge-small-en-v1.5`

`quality`

:`BAAI/bge-base-en-v1.5`

You can also set `embedding_model`

to a custom Hugging Face ONNX repo id. Set
`TINYSEARCH_MODELS_DIR`

to move the model cache, or use
`TINYSEARCH_ONNX_MODEL_DIR`

when you need to point at one exact bundle directory.

Key settings:

- Search:
`search_top_k`

,`search_rrf_cutoff`

,`search_dense_weight`

,`search_max_results_to_keep`

,`blocked_domains`

- Search backend:
`search_backend`

,`search_backend_url`

,`search_engines`

,`search_region`

,`search_backend_fallback`

- Chunks:
`chunk_rrf_cutoff`

,`chunk_dense_weight`

,`chunk_max_results_to_keep`

- Crawl:
`crawl_max_chunk_tokens`

,`crawl_overlap_tokens`

,`max_concurrent_crawls`

- Embeddings:
`embedding_backend`

,`embedding_model`

,`embedding_openai_env_file`

,`max_concurrent_embedding_calls`

- Tokenizer:
`encoding_name`

- Dense input prefixes:
`dense_query_prefix`

,`dense_document_prefix`

- Trace:
`trace_path`

For `embedding_backend`

`openai_compatible`

, add a `.env`

file at the project
root, or set `embedding_openai_env_file`

, with:

```
OPENAI_BASE_URL=
OPENAI_API_KEY=
OPENAI_EMBEDDING_MODEL=
```

`OPENAI_BASE_URL`

is optional for api.openai.com. `EMBEDDING_MODEL`

and
`MODEL_NAME`

are accepted as aliases for `OPENAI_EMBEDDING_MODEL`

.

The research pipeline requires dense embeddings. It raises if
`search_dense_weight`

or `chunk_dense_weight`

is set to `0`

.

TinySearch supports two web-search backends and selects between them from config. The defaults aim at the bundled compose setup: SearXNG runs as a sidecar, with the DuckDuckGo HTML scraper kept as an automatic fallback.

Since `v0.2`

, TinySearch defaults to a SearXNG-compatible backend. The bundled
Compose files ship a local SearXNG service so the stack works out of the box,
while the DuckDuckGo HTML scraper remains available as a configurable fallback.

Available values for `search_backend`

:

`"searxng"`

(default): query a SearXNG-compatible JSON endpoint. If the call fails and`search_backend_fallback`

is`true`

, TinySearch falls back to DuckDuckGo. With`search_backend_fallback: false`

the SearXNG error surfaces.`"duckduckgo"`

: skip SearXNG entirely and use the existing DuckDuckGo HTML scraper. This is the escape hatch that preserves pre-0.2 behavior.`"auto"`

: try SearXNG, then DuckDuckGo on any backend failure (fallback is implied regardless of`search_backend_fallback`

).

A backend "failure" means a real backend error: network/timeout, non-200 HTTP
response, a non-JSON SearXNG body, or a DuckDuckGo CAPTCHA / 403. A legitimate
empty result set is **not** a failure and does not trigger fallback.

Minimal config example:

```
{
  "search_backend": "searxng",
  "search_backend_url": "http://searxng:8080/search",
  "search_engines": ["google", "bing"],
  "search_region": "us-en",
  "search_backend_fallback": true
}
```

SearXNG ships with the JSON output format **disabled** by default. The bundled
`searxng/settings.yml`

enables it via:

```
search:
  formats:
    - html
    - json
```

If TinySearch reports `SearchBackendUnavailable: SearXNG did not return JSON`

,
your SearXNG instance is returning HTML — add `json`

to `search.formats`

and
restart it.

`SEARXNG_URL`

: overrides`search_backend_url`

for the running process. Useful in Docker so the same image can point at different SearXNG endpoints without rebuilding`research_config.json`

.

The bundled `compose.yaml`

starts a `searxng`

service alongside `mcp`

(and
optionally `fastapi`

). The `mcp`

and `fastapi`

services reach SearXNG at
`http://searxng:8080/search`

over the internal compose network, and have
`SEARXNG_URL`

set automatically.

```
docker compose up
```

A minimal `searxng/settings.yml`

is committed at the repo root. Override
`server.secret_key`

before exposing the SearXNG instance beyond localhost.

When you run TinySearch standalone (e.g. `docker run marcellm01/tinysearch:latest`

or `python servers/mcp_server.py`

), there is no local SearXNG. With the default
config (`search_backend: "searxng"`

, `search_backend_fallback: true`

) the
SearXNG call fails fast on the short connect timeout and TinySearch
transparently falls back to DuckDuckGo.

To keep the pre-0.2 behavior with no SearXNG involvement, set:

```
{ "search_backend": "duckduckgo" }
```

TinySearch is not a replacement for a commercial search API or a persistent crawler. It is probably not the right tool if you need:

- guaranteed search coverage
- large-scale indexing
- long-term page caching
- enterprise observability
- production SLA-backed web search

| Option | Best when you want | Tradeoff |
|---|---|---|
| Search API | Hosted search results with stronger coverage guarantees | Usually paid, hosted, and not MCP-native |
| SearXNG | Self-hosted metasearch | You still need crawling, reranking, chunking, and prompt assembly |
| Full crawler / index | Persistent searchable storage | More infrastructure than most local agents need |
| Browser automation | A model clicking around the web | More tokens, slower runs, and less predictable evidence packing |
TinySearch |
A local MCP research tool that returns ranked, cited evidence chunks | Lightweight by design; not a full search engine or hosted answer API |

Join the [TinySearch Discord](https://discord.gg/NG6u2zamR) for support,
release updates, bug reports, and contributor discussion.

`pipelines.agentic_research.agentic_run`

: single-turn search, crawl, ranking, and prompt assembly`servers.mcp_server`

: MCP server for agent clients`servers.fastapi_server`

: optional HTTP API

Run the unittest suite:

```
python -m unittest discover tests
```

Using TinySearch or want to build on it?

[Email me](mailto:hello.marcbuilds@gmail.com) or reach me on [Bluesky](https://bsky.app/profile/marcellm01.bsky.social).

TinySearch reads the pages it crawls and returns ranked excerpts to the calling
client. It does not include credentials in the repo, and `.env`

/ trace output
should stay local. If you enable `openai_compatible`

embeddings, your embedding
provider receives the text snippets sent for vectorization.

Source code in this repository is under the [MIT License](/MarcellM01/TinySearch/blob/main/LICENSE).

When `embedding_backend`

is `onnx`

, TinySearch may download the selected local
ONNX embedding bundle at runtime from Hugging Face. Those weights are separate
distributions under their model-card licenses; keep license and attribution
notices if you ship or redistribute those files. Optional manual export for
`fast`

uses `sentence-transformers/all-MiniLM-L6-v2`

(Apache-2.0).

See [NOTICE](/MarcellM01/TinySearch/blob/main/NOTICE) for Docker and third-party distribution notes.