{"slug": "atlas-open-source-deep-research-you-own", "title": "Atlas – open-source deep research you own", "summary": "Steel.dev released Atlas, an open-source deep research agent that allows users to own their research data. Atlas decomposes questions, routes sub-tasks to various researchers like Exa and Perplexity, and synthesizes reports, supporting multiple search and fetch providers. The tool aims to give users full control over their research process and data.", "body_md": "**Research Agent for the Open Web**\n\n``` js\nimport { Atlas } from \"@steel-dev/atlas\";\nimport { anthropic } from \"@ai-sdk/anthropic\";\n\nconst atlas = new Atlas({ model: anthropic(\"claude-fable-5\") });\nconst { report } = await atlas.research(\n  \"What's changing in browser automation for AI agents?\",\n);\n```\n\nOne-off query, no install:\n\n```\nANTHROPIC_API_KEY=sk-ant-... npx @steel-dev/atlas \"How do reasoning models differ from standard LLMs?\"\nnpm install @steel-dev/atlas ai @ai-sdk/anthropic\n# or @ai-sdk/openai / @ai-sdk/google\n```\n\n**Search:** `tavily.search()`\n\n/ `exa.search()`\n\n/ `brave.search()`\n\n(auto from env keys), or `native.search({ model })`\n\nto use the model provider's own web search.\n\n**Fetch:** `basic.fetch()`\n\nby default; `steel.fetch()`\n\nfor JS-rendered pages when `STEEL_API_KEY`\n\nis set.\n\n``` js\nimport { Atlas, exa, steel, basic } from \"@steel-dev/atlas\";\n\nconst atlas = new Atlas({\n  model,\n  search: exa.search(),\n  fetch: [basic.fetch(), steel.fetch({ proxy: true })],\n});\n```\n\nRegister other research agents — including Atlas itself — as a fleet. Atlas decomposes the question, routes each sub-task to the best-fit researcher (`query → report`\n\n), runs them in isolation, then synthesizes one report. With no `researchers`\n\nset it stays a single spine run.\n\n``` js\nimport { Atlas, exa, perplexity, parallel } from \"@steel-dev/atlas\";\n\nconst atlas = new Atlas({\n  model,\n  researchers: {\n    exa: exa.agent(), // Exa deep-research (via exa-js)\n    perplexity: perplexity.agent(), // Perplexity Sonar\n    parallel: parallel.agent(), // parallel.ai task API\n  },\n});\n\nawait atlas.research(\"Compare X across academic and shopping angles\");\n```\n\nAny `query → report`\n\nworker plugs in via `researcher({ description, research })`\n\n— `description`\n\ndrives routing. `atlas.asResearcher(description)`\n\nexposes an Atlas instance as a worker, so fan-out can recurse. A worker returns `{ report, sources }`\n\nand may add `cost`\n\n(USD).\n\nPlug in domain sources with `researchTool`\n\n: anything via `ctx.addSource`\n\nflows through the same source store:\n\n``` js\nimport { Atlas, researchTool } from \"@steel-dev/atlas\";\n\nconst atlas = new Atlas({\n  model,\n  tools: {\n    pubmed_search: researchTool({\n      description: \"Search PubMed.\",\n      inputSchema: z.object({ query: z.string() }),\n      execute: async ({ query }, ctx) => {\n        const studies = await pubmed.search(query, { signal: ctx.signal });\n        for (const s of studies)\n          ctx.addSource({ url: s.url, title: s.title, content: s.abstract });\n        return studies.map((s) => `- ${s.title}`).join(\"\\n\");\n      },\n    }),\n  },\n});\n```\n\n`ctx`\n\n(`ToolContext`\n\n) provides `addSource({ url, title?, content })`\n\n, `fetchText(url)`\n\n(a guarded fetch returning extracted text or `null`\n\n), `log(message)`\n\n, and `signal`\n\n.\n\n**Models per role:** the top-level `model`\n\nis the lead; override the stage models with `models.research`\n\nand `models.write`\n\n. Each defaults to the lead model when unset.\n\n**Z.ai / GLM:** use Z.ai through its OpenAI-compatible endpoint:\n\n``` js\nimport { createOpenAI } from \"@ai-sdk/openai\";\nimport { Atlas, tavily } from \"@steel-dev/atlas\";\n\nconst zai = createOpenAI({\n  apiKey: process.env.ZAI_API_KEY!,\n  baseURL: \"https://api.z.ai/api/paas/v4\",\n});\n\nconst atlas = new Atlas({\n  model: zai.chat(\"glm-5.2\"),\n  search: tavily.search(),\n});\n```\n\nThe examples also accept `--provider zai`\n\nwith `ZAI_API_KEY`\n\n/ `ATLAS_ZAI_API_KEY`\n\n. Configure `tavily.search()`\n\n, `exa.search()`\n\n, or `brave.search()`\n\nfor search; Atlas does not map Z.ai's web-search API to an AI SDK native search tool.\n\n**Concurrency:** `concurrency: { models: 4, io: 10 }`\n\nor `ATLAS_MODEL_CONCURRENCY`\n\n/ `ATLAS_IO_CONCURRENCY`\n\n.\n\n``` js\nconst run = atlas.start(question, { effort: \"balanced\" });\n\nlet preview = \"\";\nfor await (const e of run.events()) {\n  if (e.type === \"report.reset\") preview = \"\";\n  if (e.type === \"report.delta\") preview += e.text;\n  if (e.type === \"source.fetched\") console.error(e.url);\n}\n\nconst result = await run.result();\n```\n\n`report.delta`\n\ncarries the working draft as it is written and revised; `report.reset`\n\nprecedes each rewrite, so clear your preview buffer on it. `report.completed`\n\n(and `result.report`\n\n) is canonical after citation binding. `run.finish()`\n\nsynthesizes from whatever's gathered so far; `run.abort()`\n\ndiscards it. Late subscribers get full event history.\n\nJournaled runs replay completed model/search/fetch calls without new provider charges after a crash or deploy:\n\n``` js\nimport { Atlas, fileStore } from \"@steel-dev/atlas\";\n\nconst store = fileStore(\"./runs\");\nconst atlas = new Atlas({ model, store });\natlas.start(question, { runId: \"run_42\" });\n// …restart…\nawait new Atlas({ model, store }).resume(\"run_42\");\n```\n\nReplay issues no new provider calls, but it re-charges the run's original budget and token caps as it restores progress — so a resumed run continues within the headroom left when it stopped, and `resume()`\n\nkeeps those original caps (it doesn't take a higher budget). `result.stats.costUSD`\n\nreports only the new spend, excluding replayed calls.\n\n```\nresult.report; // cited markdown\nresult.note; // short note on how the research was approached\nresult.citations; // { sourceId, marker } per cited source — marker is its [N] in the report (same contract on single-spine and orchestrated runs)\nresult.unboundCitations; // source ids cited in the draft that didn't resolve to a fetched source\nresult.warnings; // non-fatal issues (e.g. a researcher that returned nothing)\nresult.sources; // sources backing the report (via = fetch method, or the researcher that returned it on a fleet run)\nresult.stats; // cost, tokens, duration, …\nresult.trace; // timing/cost trace + bottleneck digest when trace !== \"off\"\n```\n\nPass a `schema`\n\nto get a typed object extracted from the finished report, returned alongside the full result. It runs as a final pass over the report, so it works on any path — single spine run, orchestrated fleet, or an outsourced researcher — and its cost is charged to the same budget and counted in `result.stats.costUSD`\n\n.\n\n``` js\nimport { z } from \"zod\";\n\nconst r = await atlas.research(\"Acme's latest annual revenue and CEO?\", {\n  schema: z.object({ revenue: z.string(), ceo: z.string() }),\n});\n\nr.object; // typed: { revenue: string; ceo: string }\nr.report; // the cited report it was extracted from\n```\n\nOne meter for everything. Pick an effort, or override any cap with `budget`\n\n.\n\n| effort | ~budget | sources | tokens |\n|---|---|---|---|\n`fast` |\n$0.50 | 15 | 200K |\n`balanced` |\n$2.50 | 40 | 1M |\n`deep` |\n$10 | 100 | 4M |\n`max` |\n$40 | 250 | 16M |\n\n```\nawait atlas.research(question, {\n  effort: \"deep\",\n  budget: { maxUSD: 5, maxTokens: 50_000_000 },\n});\n```\n\n`maxUSD`\n\nis a **best-effort target, not a hard ceiling.** The meter checks between agent turns, so a run stops *starting* work once spent, but in-flight calls (up to `concurrency.models`\n\n) can overshoot. Pricing comes from a built-in table that can lag provider changes — pass `pricing`\n\nto correct a rate when the cap must be accurate.\n\nThe real backstops are **price-independent** — each defaults to the effort row above and is enforced regardless of prices:\n\n`budget.maxTokens`\n\n— input + output tokens run-wide (cache reads excluded). Checked between turns like`maxUSD`\n\n, but never drifts with prices.`budget.maxSources`\n\n,`budget.maxDurationMs`\n\n— fetched-source and wall-clock caps.\n\n`result.stats`\n\nreports `budgetExhausted`\n\nand `tokensExhausted`\n\nso you can see which limit bound the run. Leave headroom on `maxUSD`\n\n, or set a provider-side spend limit, when the cap is truly hard.\n\n`result.stats.stopReason`\n\nfolds those into one value — `\"completed\"`\n\n, `\"finished\"`\n\n(`run.finish()`\n\n), or a binding cap (`\"budget\"`\n\n, `\"tokens\"`\n\n, `\"timeout\"`\n\n). When several apply, the most proximate wins.\n\n```\ngit clone https://github.com/steel-dev/atlas.git && cd atlas\nnpm install && npm run dev -- \"your question\"\n```\n\n`examples/cli.ts`\n\n: terminal runs`examples/serve.ts`\n\n: SSE web app`evals/`\n\n: BrowseComp + DRACO (`npm run eval:browsecomp`\n\n,`npm run eval:draco`\n\n)\n\nMIT", "url": "https://wpnews.pro/news/atlas-open-source-deep-research-you-own", "canonical_source": "https://github.com/steel-dev/atlas", "published_at": "2026-06-25 16:30:47+00:00", "updated_at": "2026-06-25 16:45:04.292952+00:00", "lang": "en", "topics": ["ai-agents", "ai-tools", "ai-research"], "entities": ["Steel.dev", "Atlas", "Exa", "Perplexity", "Anthropic", "Tavily", "Brave", "Z.ai"], "alternates": {"html": "https://wpnews.pro/news/atlas-open-source-deep-research-you-own", "markdown": "https://wpnews.pro/news/atlas-open-source-deep-research-you-own.md", "text": "https://wpnews.pro/news/atlas-open-source-deep-research-you-own.txt", "jsonld": "https://wpnews.pro/news/atlas-open-source-deep-research-you-own.jsonld"}}