{"slug": "i-built-an-open-source-multi-agent-fact-checker-here-s-how-it-works", "title": "I Built an Open-Source Multi-Agent Fact-Checker — Here's How It Works", "summary": "A developer built Sift (Source Inspection & Fact-checking Tool), an open-source multi-agent AI pipeline that extracts factual claims from text, retrieves grounded evidence, and returns auditable verdicts of TRUE, FALSE, or UNCERTAIN with cited sources. The system uses five specialized agents orchestrated with LangGraph, including HyDE retrieval across indexed Guardian and Wikipedia chunks plus live web search, to overcome LLM hallucination and false confidence issues. Sift also includes an adversarial review agent and a correction agent that surfaces accurate information for false or uncertain claims.", "body_md": "We have a misinformation problem. But more specifically, we have a speed problem.\n\nA journalist spots a suspicious claim. They search for sources. Cross-reference databases. Call experts. Write a verdict. Get it edited. Publish, maybe 6 hours later. Maybe 3 days later.\n\nMeanwhile, the original claim has been screenshotted, reposted, quoted in newsletters, and cited in arguments across five platforms.\n\nI wanted to build something that closed that gap. Not a chatbot that guesses. A proper pipeline, one that retrieves real evidence, reasons from it, and tells you why it reached a verdict.\n\nThat's what Sift is.\n\n**Sift (Source Inspection & Fact-checking Tool)** is an open-source multi-agent AI pipeline that takes any text, extracts every factual claim, retrieves grounded evidence, and returns auditable verdicts — TRUE, FALSE, or UNCERTAIN, with cited sources and full reasoning chains.\n\nPaste a news article. A politician's speech. A viral statistic. A WhatsApp forward. Sift breaks it into individual claims and fact-checks each one independently.\n\nThe naive approach is to ask an LLM: \"Is this claim true?\"\n\nThe problem: LLMs hallucinate. They have knowledge cutoffs. 
They're confidently wrong in ways that are hard to detect. And critically, they don't show their work.\n\nA single LLM call can't reliably handle the full pipeline of claim extraction, evidence retrieval, verdict synthesis, adversarial review, and correction.\n\nEach of these is a distinct task that benefits from its own prompt, its own tools, and its own failure modes. That's why I built five separate agents, orchestrated with LangGraph.\n\nA single paragraph can contain 4-5 distinct factual claims. Generic LLMs miss them or conflate them.\n\nThis agent uses LLaMA 3.3 70B via Groq with Pydantic structured output to extract every distinct verifiable claim from the input text. The output is a typed list of claims — exact text, no paraphrasing, no hallucination.\n\nLLMs hallucinate citations. You need real, retrievable, dated evidence.\n\nThis agent runs HyDE retrieval across 4,270 indexed Guardian + Wikipedia chunks stored in pgvector, then hits Tavily live web search for recent data.\n\nWhy HyDE instead of standard RAG?\n\nStandard RAG embeds the raw claim and searches for similar text. A short factual claim like \"The Fed raised rates in March 2024\" has a weak semantic signal on its own.\n\nHyDE (Hypothetical Document Embeddings) generates a hypothetical document that would contain the answer — something like a news article excerpt — then embeds that. The result is a richer semantic signal and significantly better retrieval recall on short factual claims.\n\nThis agent reasons strictly from retrieved evidence. It returns TRUE / FALSE / UNCERTAIN with a calibrated confidence score.\n\nCritically — if evidence is thin or conflicting, it returns UNCERTAIN instead of confabulating certainty. This was one of the hardest things to get right. LLMs naturally trend toward false confidence. I had to explicitly prompt for epistemic humility and add Pydantic validators to catch zero-confidence outputs.\n\nSynthesis agents tend toward overconfidence when evidence partially supports a claim. 
You need an adversarial check.\n\nThis agent independently reviews every verdict. It flags unsupported reasoning, catches cases where 1.1°C vs 1.19°C is a rounding difference, not a false claim, and adjusts confidence downward when warranted.\n\nThis is the step most fact-checking systems skip — and it's the one that matters most for borderline claims.\n\nKnowing something is false isn't enough. Users need to know what IS true.\n\nThis agent fires only on FALSE or UNCERTAIN verdicts. It runs a targeted live search to find the correct information and surfaces it with a cited source. Conditional — doesn't waste tokens on TRUE verdicts.\n\nThe pipeline isn't linear for every claim. Some claims have no evidence — they skip synthesis and go straight to adversarial review. Some need multiple retrieval attempts. Some claims loop.\n\nLangGraph's state machine handles conditional branching, loops, and shared state across agents cleanly. The state is typed with TypedDict — every agent reads from and writes to the same state object.\n\n**FastAPI** returns a task ID immediately. **Celery + Redis** runs the pipeline in the background. The client polls for results.\n\n**Redis cache** stores results for 7 days — the same viral claim doesn't cost tokens twice. Cache hits at the API layer return in under 1 second, before Celery even runs.\n\n**LangFuse** traces every LLM call — prompt, output, latency, token count — so I can debug agent failures without guessing.\n\nLLM: LLaMA 3.3 70B via Groq API\n\nEmbeddings: all-MiniLM-L6-v2 via HuggingFace Inference API\n\nOrchestration: LangGraph state machine\n\nRAG: HyDE + pgvector hybrid search\n\nVector DB: PostgreSQL + pgvector\n\nAPI: FastAPI + Pydantic\n\nTask Queue: Celery + Redis\n\nEvidence Sources: Tavily (live) + Guardian API + Wikipedia\n\nObservability: LangFuse + Prometheus + Grafana\n\nThe project is fully open source and Dockerized. 
One command runs the entire stack:\n\n```\ngit clone https://github.com/ashg2099/Sift.git\ncd Sift\ncp .env.example .env\n# Add your API keys (Groq, Tavily, HuggingFace — all free tiers)\ndocker compose up\n```\n\nOpen **http://localhost:8000** and start verifying claims.\n\nGitHub: [https://github.com/ashg2099/Sift](https://github.com/ashg2099/Sift)\n\nLinkedIn: [https://www.linkedin.com/in/ashwin-gururaj-93943816a/](https://www.linkedin.com/in/ashwin-gururaj-93943816a/)", "url": "https://wpnews.pro/news/i-built-an-open-source-multi-agent-fact-checker-here-s-how-it-works", "canonical_source": "https://dev.to/ashg2099/i-built-an-open-source-multi-agent-fact-checker-heres-how-it-works-5eah", "published_at": "2026-05-28 00:25:32+00:00", "updated_at": "2026-05-28 00:52:46.233602+00:00", "lang": "en", "topics": ["ai-agents", "large-language-models", "ai-tools", "ai-research", "ai-ethics"], "entities": ["Sift"], "alternates": {"html": "https://wpnews.pro/news/i-built-an-open-source-multi-agent-fact-checker-here-s-how-it-works", "markdown": "https://wpnews.pro/news/i-built-an-open-source-multi-agent-fact-checker-here-s-how-it-works.md", "text": "https://wpnews.pro/news/i-built-an-open-source-multi-agent-fact-checker-here-s-how-it-works.txt", "jsonld": "https://wpnews.pro/news/i-built-an-open-source-multi-agent-fact-checker-here-s-how-it-works.jsonld"}}