{"slug": "doceval-eval-harness-for-llm-document-extraction-pipelines", "title": "doceval — eval harness for LLM document extraction pipelines", "summary": "A developer built doceval, an evaluation harness for LLM document extraction pipelines that provides field-level accuracy, failure taxonomy, and optional cost tracking. The tool works with any extractor and document schema, requiring only a JSON label file, a Python function, and a CLI command. It includes a working 20-document invoice example with a Claude Haiku extractor.", "body_md": "I kept seeing the same gap: people ship LLM-based document extractors (invoices, receipts, forms) with no systematic way to know how accurate they actually are. So I built doceval — point it at your extractor function + a labeled dataset and get back field-level accuracy, a failure taxonomy (missed_field / hallucination / wrong_format / wrong_value), and optional per-document cost tracking.\n\nWorks with any extractor (Claude, GPT, regex, rules) and any document schema. One JSON label file per document, one Python function, one CLI command.\n\nIncludes a working 20-document invoice example with a Claude Haiku extractor so you can run it immediately.", "url": "https://wpnews.pro/news/doceval-eval-harness-for-llm-document-extraction-pipelines", "canonical_source": "https://dev.to/dave8172/show-hn-doceval-eval-harness-for-llm-document-extraction-pipelines-3gd7", "published_at": "2026-06-16 12:29:37+00:00", "updated_at": "2026-06-16 12:47:32.183623+00:00", "lang": "en", "topics": ["large-language-models", "developer-tools", "ai-tools"], "entities": ["doceval", "Claude Haiku", "GPT"], "alternates": {"html": "https://wpnews.pro/news/doceval-eval-harness-for-llm-document-extraction-pipelines", "markdown": "https://wpnews.pro/news/doceval-eval-harness-for-llm-document-extraction-pipelines.md", "text": "https://wpnews.pro/news/doceval-eval-harness-for-llm-document-extraction-pipelines.txt", "jsonld": "https://wpnews.pro/news/doceval-eval-harness-for-llm-document-extraction-pipelines.jsonld"}}