{"slug": "show-hn-polyvia-multimodal-document-retrieval-over-100k-files", "title": "Show HN: Polyvia – Multimodal document retrieval over 100K+ files", "summary": "Polyvia released Polyvia 1, a multimodal document retrieval API and upcoming platform for enterprise agents, enabling sub-200ms search over 100K+ files including PDFs, charts, and slides. The API provides end-to-end retrieval without external extractors, targeting use cases like data room search and credit monitoring.", "body_md": "We build enterprise agents for large-scale retrieval, research and automation over multimodal docs.\n\n[Docs](https://docs.polyvia.ai) · [Quickstart](https://docs.polyvia.ai/quickstart) · [Python SDK](https://docs.polyvia.ai/products/python-sdk) · [TypeScript SDK](https://docs.polyvia.ai/products/js-sdk) · [Polyvia Platform](https://app.polyvia.ai) · [Homepage](https://polyvia.ai)\n\nWe’re releasing Polyvia 1 as two products:\n\n- **Polyvia API: Multimodal Document Retrieval API** (for developers of AI agents): available now.\n- **Polyvia Platform: Research and Automation Agent over 100K+ multimodal docs** (for knowledge workers in enterprises): coming soon.\n\nWe index your unstructured, visual and multimodal docs (PDFs, charts, slides, complex tables, infographics, scans, handwriting, invoices, and more) into a multimodal knowledge ontology, with agents running on top for retrieval, research and automation — every answer grounded in a cited source page, in sub-200ms.\n\n**1. Fast over 100K+ multimodal docs.** Agentic, file-by-file search (Claude Code,\nClaude Cowork, Codex) works only up to ~100 multimodal files — past that it's too\nslow, and at scale you still need **retrieval**. Polyvia does **sub-200ms** search\nover 100K+ files, every answer grounded in a cited source page.\n\n**2. 
End-to-end — no need for extractors or PDF parsers.** When you build\nlarge-scale multimodal RAG over a company's files, the only infra available today\nis visual extractors / PDF parsers (Reducto, LlamaIndex). There's no **end-to-end**\ninfra for large-scale multimodal document retrieval — until Polyvia: **VLM Visual\nExtractor → Multimodal Knowledge Ontology (mapping all your company's data and\nprocesses) → Self-Improving Retrieval Agent**.\n\n**3. All unstructured, visual and multimodal data inputs in one API.** Available now: PDFs, charts, infographics, complex multi-page tables, slides, pictures, handwriting, scans, invoices, audio. Coming soon: video, healthcare scans / EHR, chemical & molecular data, CAD & technical drawings, heatmaps.\n\n- **Multimodal RAG inside your own agent** — retrieval-as-a-tool over large doc sets.\n- **Data-room / due-diligence search** — query 100+ visual-heavy PDFs jointly (PE, IB, M&A).\n- **Counterparty & credit monitoring** — EBITDA, opex, revenue across hundreds of borrower reports.\n- **Image-based claim processing** — describe claim photos in the context of a policy.\n- **Cross-engagement slide search** — find answers buried in thousands of slides.\n\n```\npip install polyvia        # Python 3.9+\nnpm  install polyvia       # Node 18+\n```\n\nGrab a key in [Polyvia Platform](https://app.polyvia.ai) → **Settings → API**.\nIngest a batch into a **group**, then ask one question across the whole corpus —\nanswers cite the exact page in each document.\n\n``` python\nfrom polyvia import Polyvia\n\nclient = Polyvia(api_key=\"poly_<key>\")  # or set POLYVIA_API_KEY\n\n# Ingest a batch into a group, then ask one question across all of it.\nitems = client.ingest.batch(\n    [\"q1.pdf\", \"q2.pdf\", \"q3.pdf\", \"q4.pdf\"],\n    group=\"FY24 Earnings\",\n)\nfor item in items:\n    client.ingest.wait(item.task_id)\n\nprint(client.query(\"How did revenue trend across the four quarters?\",\n                   group=\"FY24 Earnings\").answer)\n```\n\n``` js\nimport { 
Polyvia } from \"polyvia\";\n\nconst client = new Polyvia({ apiKey: \"poly_<key>\" });\n\nconst items = await client.ingest.batch(\n  [\"q1.pdf\", \"q2.pdf\", \"q3.pdf\", \"q4.pdf\"],\n  { group: \"FY24 Earnings\" },\n);\nawait Promise.all(items.map((i) => client.ingest.wait(i.task_id)));\n\nconst answer = await client.query(\n  \"How did revenue trend across the four quarters?\",\n  { group: \"FY24 Earnings\" },\n);\nconsole.log(answer.answer);\n```\n\nScope a query three ways: a single `document_id` (fastest), a `group` / `group_ids`, or the whole workspace (no scope).\n\nRunnable scripts live in [examples/](/polyvia-ai/polyvia/blob/main/examples). A few highlights:\n\n- `query_scopes.py`\n- `groups_and_documents.py`\n- `batch_group.py`\n- `async_client.py`: `AsyncPolyvia`, the same surface, awaitable\n- `agent_tool.py`\n- `curl.sh`\n\nQuerying across scopes, for example:\n\n```\n# whole workspace · a group (by name) · one document (fastest) · many groups (by id)\nclient.query(\"What risks recur across all reports?\")\nclient.query(\"How did revenue trend?\", group=\"FY24 Earnings\")\nclient.query(\"Executive summary?\", document_id=\"doc_<id>\")\nclient.query(\"Compare the deals.\", group_ids=[\"g_<id>\", \"g_<id>\"])\n```\n\n**MCP** — connect Claude Code (or any MCP client) to the hosted Polyvia MCP server\nin one line, so your agent can retrieve over your documents as a tool:\n\n```\nclaude mcp add --transport http polyvia https://app.polyvia.ai/mcp \\\n  --header \"Authorization: Bearer poly_<your-key>\"\n```\n\n**Agent Skills** — install Polyvia skills into Claude Code, Cursor, and other agent\nclients:\n\n```\nnpx skills add polyvia-ai/skills\n```\n\n→ [MCP docs](https://docs.polyvia.ai/products/mcp) · [Agent Skills](https://docs.polyvia.ai/products/skills)\n\n| | Product | For | Status |\n|---|---|---|---|\n| Polyvia-1.1 | Polyvia API — Multimodal Document Retrieval API | Developers of AI agents | 
Available now |\n| Polyvia-1.2 | Polyvia Platform — Research & Automation Agent over 100K+ multimodal docs | Knowledge workers in enterprises | Coming soon |\n| Later | Polyvia Agents — build your own agent for automating processes on large volumes of multimodal docs | Builders & teams | Planned |\n| Later | More modalities — video, healthcare scans / EHR, chemical & molecular data, CAD & technical drawings, heatmaps | Builders & teams | Planned |\n\nWe update this as we ship — latest first. Full notes at [docs.polyvia.ai/versions](https://docs.polyvia.ai/versions).\n\n- **REST API v1** — `ingest`, `documents`, `groups`, `query`, `usage`, `rate-limits`; async ingestion with task polling and grounded citations.\n- **Python SDK** — `pip install polyvia`; typed sync **and** async clients, batch ingestion, idempotent groups, structured errors.\n- **TypeScript SDK** — `npm install polyvia`; fully typed, ESM/CJS, Node 18+.\n- **MCP server** — `claude mcp add --transport http polyvia https://app.polyvia.ai/mcp --header \"Authorization: Bearer poly_<your-key>\"`.\n- **Agent Skills** — `npx skills add polyvia-ai/skills` for Claude Code, Cursor, and other agent clients.\n- **Visual Document Modalities** — Visual Document Intelligence + Audio: charts, graphs & plots, infographics, complex multi-page tables, slides & decks, reports & filings, scanned & photographed pages, invoices & forms, handwriting & annotations, diagrams & flowcharts, photos & images, and audio (calls, meetings, recordings).\n\n- **Polyvia-1.2 — Polyvia Platform** — Research & Automation Agent over 100K+ multimodal docs, for knowledge workers in enterprises.\n- **More modalities (coming soon)** — healthcare scans / EHR, chemical & molecular data, CAD & technical drawings, video, heatmaps.\n- **Polyvia Agents** — build your own agent for automating processes on large volumes of multimodal documents.\n\n| | Install | Source |\n|---|---|---|\n| Python | `pip install polyvia` | |\n| TypeScript | `npm install 
polyvia` | [docs.polyvia.ai/products/js-sdk](https://docs.polyvia.ai/products/js-sdk) |\n| REST API | | [docs.polyvia.ai/api-reference](https://docs.polyvia.ai/api-reference/introduction) |\n| MCP | `app.polyvia.ai/mcp` | [docs.polyvia.ai/products/mcp](https://docs.polyvia.ai/products/mcp) |\n| Agent Skills | `npx skills add polyvia-ai/skills` | [docs.polyvia.ai/products/skills](https://docs.polyvia.ai/products/skills) |\n\nSupported inputs: PDFs · Word/PowerPoint/Excel (DOCX/PPTX/XLSX) · Markdown · text · images · audio. Charts, infographics, complex multi-page tables, slides, scans and handwriting are first-class.\n\nRunnable snippets (Python, TypeScript, raw HTTP, MCP, agent-tool) live in\n[examples/](/polyvia-ai/polyvia/blob/main/examples) — see the [examples guide](/polyvia-ai/polyvia/blob/main/examples/README.md). See also [`CHANGELOG`](/polyvia-ai/polyvia/blob/main/CHANGELOG.md) · [`CONTRIBUTING`](/polyvia-ai/polyvia/blob/main/CONTRIBUTING.md) · [`SECURITY`](/polyvia-ai/polyvia/blob/main/SECURITY.md).\n\nNew to Polyvia? See what it does at **polyvia.ai**, or start free at **[app.polyvia.ai](https://app.polyvia.ai/sign-up)**.\n\n📚 [Docs](https://docs.polyvia.ai) · 🖥️ [Platform](https://app.polyvia.ai) · ✉️ [mateusz@polyvia.ai](mailto:mateusz@polyvia.ai) · [senyao@polyvia.ai](mailto:senyao@polyvia.ai)\n\n© 2026 Polyvia. 
All rights reserved.", "url": "https://wpnews.pro/news/show-hn-polyvia-multimodal-document-retrieval-over-100k-files", "canonical_source": "https://github.com/polyvia-ai/polyvia", "published_at": "2026-06-18 12:00:07+00:00", "updated_at": "2026-06-18 12:24:10.400237+00:00", "lang": "en", "topics": ["ai-tools", "ai-products", "ai-infrastructure", "large-language-models", "generative-ai"], "entities": ["Polyvia", "Claude Code", "Claude Cowork", "Codex", "Reducto", "LlamaIndex"], "alternates": {"html": "https://wpnews.pro/news/show-hn-polyvia-multimodal-document-retrieval-over-100k-files", "markdown": "https://wpnews.pro/news/show-hn-polyvia-multimodal-document-retrieval-over-100k-files.md", "text": "https://wpnews.pro/news/show-hn-polyvia-multimodal-document-retrieval-over-100k-files.txt", "jsonld": "https://wpnews.pro/news/show-hn-polyvia-multimodal-document-retrieval-over-100k-files.jsonld"}}