{"slug": "show-hn-unsiloed-ai-1-on-olmocr-bench-beats-reducto-llamaparse-and-gpt-5-5", "title": "Show HN: Unsiloed AI – #1 on OlmOCR-Bench, Beats Reducto, LlamaParse and GPT-5.5", "summary": "Unsiloed AI's document parser v3.1 achieved the top rank on the olmOCR-Bench benchmark with an 88.0% strict pass rate, outperforming 18 other OCR services including GPT-5.5, Claude Opus 4.7, LlamaParse, and Reducto across 1,403 PDFs and 8,413 unit tests. The company reported that a secondary evaluation using an LLM-as-Judge to account for semantic equivalents raised the corrected score to 94.8%, highlighting that many initial errors stemmed from formatting differences rather than actual OCR failures.", "body_md": "Most document parsers fail on real-world challenges such as complex tables, handwritten documents, historical document scans, equations, multi-column layouts, and complex reading order. We built Unsiloed Parser to handle exactly these cases.\n\nOur latest parser, v3.1, ranked #1 on olmOCR-Bench with an 88.0% strict pass rate. We ran the evaluation across 1,403 PDFs and 8,413 unit tests using the unmodified upstream Allen AI scorer (olmocr==0.4.27) and found that Unsiloed beats 18 other OCR services, including GPT-5.5, Claude Opus 4.7, LlamaParse, Reducto, Azure Document Intelligence, AWS Textract, and Unstructured.\n\nWhen we dug deeper into the failure cases, we found that many were not OCR errors but formatting differences such as \\frac vs \\dfrac, whitespace differences, or equivalent LaTeX renderings. 
We ran a secondary LLM-as-Judge evaluation to separate real misses from semantic equivalents, which lifts the corrected score to 94.8% (explained in detail in the blog post).\n\nBlog with full methodology and examples: [https://www.unsiloed.ai/blog/unsiloed-ai-achieves-1-rank-on-...](https://www.unsiloed.ai/blog/unsiloed-ai-achieves-1-rank-on-olmocr-bench-2)\n\nEvaluation code for reproducibility:\n[https://github.com/Unsiloed-AI/unsiloed-olmocr-benchmark](https://github.com/Unsiloed-AI/unsiloed-olmocr-benchmark)\n\nFeel free to post your messiest PDFs in the comments and we'll run them through Unsiloed Parser and share the output here.\n\nComments URL: [https://news.ycombinator.com/item?id=48271937](https://news.ycombinator.com/item?id=48271937)", "url": "https://wpnews.pro/news/show-hn-unsiloed-ai-1-on-olmocr-bench-beats-reducto-llamaparse-and-gpt-5-5", "canonical_source": "https://news.ycombinator.com/item?id=48271937", "published_at": "2026-05-25 21:35:03+00:00", "updated_at": "2026-05-25 22:08:05.971812+00:00", "lang": "en", "topics": ["ai-products", "ai-tools", "machine-learning", "natural-language-processing", "ai-startups"], "entities": ["Unsiloed AI", "Unsiloed Parser", "OlmOCR-Bench", "GPT-5.5", "Claude Opus 4.7", "LlamaParse", "Reducto", "Allen AI"], "alternates": {"html": "https://wpnews.pro/news/show-hn-unsiloed-ai-1-on-olmocr-bench-beats-reducto-llamaparse-and-gpt-5-5", "markdown": "https://wpnews.pro/news/show-hn-unsiloed-ai-1-on-olmocr-bench-beats-reducto-llamaparse-and-gpt-5-5.md", "text": "https://wpnews.pro/news/show-hn-unsiloed-ai-1-on-olmocr-bench-beats-reducto-llamaparse-and-gpt-5-5.txt", "jsonld": "https://wpnews.pro/news/show-hn-unsiloed-ai-1-on-olmocr-bench-beats-reducto-llamaparse-and-gpt-5-5.jsonld"}}