Markdown Comes to LiteParse

LlamaIndex released LiteParse v2.1, an open-source PDF-to-markdown pipeline that achieved top scores on three benchmarks against model-free approaches. The tool uses a heuristic rule-based approach with a custom PDFium fork to classify text into markdown elements, prioritizing speed over accuracy compared to AI-based parsers.

A few weeks ago, we launched LiteParse 2.0 as the fastest tool for converting PDFs to text. However, a few questions kept coming up again and again: Where are the benchmarks? Does it output markdown?

LiteParse v2.1 (https://developers.llamaindex.ai/liteparse/) answers both by delivering the fastest open-source, model-free, PDF-to-markdown pipeline. We measured our performance on three standard benchmarks and achieved the top overall score on all three when measured against model-free approaches: opendataloader-bench at 0.875, olmOCR-bench at 0.391, and ParseBench at 0.3279.

Visit the demo site (https://www.llamaindex.ai/liteparse-demo), which runs in-browser with WASM, or install the latest version today:

```bash
$ pip install liteparse
$ lit parse doc.pdf --format markdown
```

```python
from liteparse import LiteParse

lp = LiteParse(output_format="markdown")
result = lp.parse("doc.pdf")
print(result.text)
```

How Does it Work?

Building a heuristic pipeline for markdown boils down to two parts: the signals you can detect, and the types of output elements that listen to those signals. Much like a machine-learning model, it comes down to inputs, weights, and activations.

PDFs carry a ton of data: font family, font size, text location, and more. All of these are treated as input signals for classifying text into specific markdown elements like paragraphs, tables, lists, and headers.

LiteParse uses a custom PDFium fork to capture as much signal as possible, then combines that with signals from our existing grid-projection algorithm to deliver the best markdown output we can with a purely heuristic, rule-based approach.

We expect this mode to keep getting better. There’s an extremely long tail of PDFs, and adapting to it is mostly a matter of time and exposure to more documents.
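To make the "signals in, markdown elements out" idea concrete, here is a minimal sketch of a rule-based classifier. It is not LiteParse's implementation: the `Span` shape, the `classify` and `to_markdown` helpers, and every threshold (1.5x and 1.2x the body font size, the 80-character cutoff) are assumptions chosen purely for illustration, and it only handles headers, list items, and paragraphs.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Span:
    """One line of text plus the layout signals extracted for it (hypothetical shape)."""
    text: str
    font_size: float
    bold: bool = False


def classify(span: Span, body_size: float) -> str:
    """Map a span's signals to a markdown element using fixed, hand-picked rules."""
    stripped = span.text.lstrip()
    # Signal: a leading bullet glyph followed by a space suggests a list item.
    if stripped[:1] in {"•", "-", "*"} and stripped[1:2] == " ":
        return "list_item"
    # Signal: text noticeably larger than the body font suggests a header.
    ratio = span.font_size / body_size
    if ratio >= 1.5:  # illustrative threshold, not a LiteParse value
        return "h1"
    if ratio >= 1.2 or (span.bold and len(stripped) < 80):
        return "h2"
    return "paragraph"


def to_markdown(spans: list[Span]) -> str:
    """Render spans as markdown, using the median font size as the body-text baseline."""
    body_size = median(s.font_size for s in spans)
    prefixes = {"h1": "# ", "h2": "## ", "list_item": "- ", "paragraph": ""}
    lines = []
    for span in spans:
        kind = classify(span, body_size)
        text = span.text.strip()
        if kind == "list_item":
            text = text[1:].strip()  # drop the original bullet glyph
        lines.append(prefixes[kind] + text)
    return "\n\n".join(lines)


if __name__ == "__main__":
    page = [
        Span("Quarterly Report", 24.0),
        Span("Revenue grew this quarter.", 11.0),
        Span("• Item one", 11.0),
        Span("Summary", 14.0),
        Span("More body text here.", 11.0),
    ]
    print(to_markdown(page))
```

Using the median font size as the baseline is one simple way to make the header rule relative to each document rather than absolute. A real pipeline would weigh many more signals (position, font family, grid alignment) before committing to an element type.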

Measuring Markdown Performance

It turns out that not only is markdown a highly requested output option, it's also very hard to benchmark PDF parsing tools without it. All existing benchmarks (ParseBench, olmOCR-bench, opendataloader-bench) are built around measuring markdown output. By building this markdown pipeline, we were able to deliver an entirely new output mode while also being able to measure and improve our overall extraction quality.

In the spirit of “Lite”-ness, we built the markdown mode in LiteParse to be as light and fast as possible. This approach prioritizes speed, but it also has to accept an upper bound on accuracy (we aren’t going to do better than LlamaParse with this approach). To compare fairly, we scoped our comparisons to open-source tools that do not leverage larger AI models for parsing. This means OCR and other model integrations are disabled when benchmarking.

Benchmark Results

ParseBench

We’ve written a lot about ParseBench already (https://www.llamaindex.ai/blog/parsebench). It covers 2000+ documents measured across 5 key metrics that end-users actually care about. These are intentionally hard documents, so scores achieved without larger AI models are quite impressive.

LiteParse leads Overall. The Charts and Visual Grounding columns are effectively noise for every model-free tool here. ParseBench scores charts and parts of its layout/visual-grounding metrics by comparing structured data extracted from the chart, which fundamentally requires an ML model to recover. A heuristic engine has nothing to emit there, so all model-free tools cluster near zero. We're reporting those columns for completeness only.

| Category | LiteParse | pymupdf4llm | opendataloader | pdf-inspector | markitdown |
|---|---|---|---|---|---|
| Overall | 0.328 | 0.310 | 0.294 | 0.266 | 0.186 |
| Tables | 0.403 | 0.373 | 0.352 | 0.266 | 0.158 |
| Content Faithfulness | 0.686 | 0.609 | 0.661 | 0.561 | 0.645 |
| Semantic Formatting | 0.409 | 0.446 | 0.341 | 0.351 | 0.009 |
| Charts | 0.034 | 0.015 | 0.001 | 0.053 | 0.020 |
| Visual Grounding | 0.107 | 0.107 | 0.108 | 0.099 | 0.099 |

Note: the Charts and Visual Grounding numbers are mostly noise; none of the tools here output the data needed to be scored properly on these metrics.

opendataloader-bench

opendataloader-bench is a small benchmark of 200 docs (https://github.com/opendataloader-project/opendataloader-bench). It measures three main things: Reading Order Similarity (NID), Table Structure Similarity (TEDS), and Heading-Level Similarity (MHS). You can read more about these metrics in their GitHub repo.

Here, LiteParse leads across all categories. The official repo also reports scores from actual AI models, and LiteParse is quite competitive there as well, but for this blog post we are only comparing against similar model-free OSS tools.

| Category | LiteParse | pymupdf4llm | opendataloader | pdf-inspector | markitdown |
|---|---|---|---|---|---|
| Overall | 0.871 | 0.732 | 0.831 | 0.792 | 0.589 |
| Reading Order (NID) | 0.908 | 0.885 | 0.902 | 0.876 | 0.844 |
| Tables (TEDS) | 0.693 | 0.401 | 0.483 | 0.630 | 0.273 |
| Headers (MHS) | 0.816 | 0.412 | 0.739 | 0.602 | 0.000 |

olmOCR-bench

LiteParse leads in most categories in olmOCR-bench. Some of its rule checks don’t always reflect desired output, and they sometimes disagree with each other, which we’ve written about before (https://www.llamaindex.ai/blog/olmocr-bench-review-insights-and-pitfalls-on-an-ocr-benchmark), but it is a useful signal nonetheless. LiteParse scores well on the baseline sanity checks and makes a strong showing on the headers/footers, multi-column, and table tests.
Low scores on old scans/math are expected, as these typically require OCR. The rest of the scores are in the same range as the other tools.

| Category | LiteParse | pymupdf4llm | opendataloader | pdf-inspector | markitdown |
|---|---|---|---|---|---|
| Overall | 39.2% | 32.9% | 32.7% | 30.5% | 28.7% |
| baseline | 99.9% | 84.5% | 86.9% | 82.9% | 86.8% |
| headers footers | 55.9% | 39.5% | 37.5% | 52.0% | 38.8% |
| multi column | 67.1% | 66.7% | 62.8% | 38.2% | 39.3% |
| table tests | 48.0% | 46.3% | 25.7% | 40.0% | 19.9% |
| long tiny text | 29.2% | 12.7% | 35.1% | 17.4% | 31.2% |
| old scans | 13.3% | 13.3% | 13.3% | 13.3% | 13.3% |
| arxiv math | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% |
| old scans math | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% |

Speed Tests

Speed was measured on a fixed set of PDFs with varying layouts, page counts, and content types. The times reported are the average time taken to process a single page across the entire test set. You can find the source data here (https://huggingface.co/datasets/llamaindex/liteparse_bench_small/tree/main/bench-docs) and the benchmark code here (https://github.com/run-llama/liteparse/tree/main/dataset_eval_utils).

| Provider | ms/page (agg) |
|---|---|
| liteparse | 3.16 ms |
| pdf-inspector | 3.83 ms |
| opendataloader | 66.3 ms |
| pymupdf4llm-md | 141.5 ms |
| markitdown | 182.5 ms |

Licensing & Portability

Across all the tools tested, there is a mix of licenses and supported runtimes. LiteParse is permissively licensed (Apache-2.0) and runs as a single engine across four ecosystems, including natively in the browser via WASM. The Python-only tools can't go where a browser or a Node service needs them, and pymupdf4llm inherits PyMuPDF's AGPL-3.0 copyleft, which is a non-starter for many commercial codebases without a paid license.

| Tool | License | Languages / Runtimes |
|---|---|---|
| LiteParse | Apache-2.0 | Rust, Python, Node, WASM (browser) |
| pymupdf4llm | AGPL-3.0 (commercial available) | Python |
| markitdown | MIT | Python |
| opendataloader | Apache-2.0 | Java core + Python, Node.js wrappers |
| pdf-inspector | MIT | Rust |

A Note on v2.1 Scope

These three benchmarks don't always agree on what "good" markdown looks like. We repeatedly found that tuning output to win one benchmark (e.g. olmOCR-bench) would regress another (e.g. ParseBench), and vice versa. When visually inspecting PDFs, we often saw results that “scored well” but did not look good on the page. Rather than benchmaxxing any single harness, we kept v2.1 tuned for solid, balanced performance across all three. There's plenty of headroom to push individual sub-categories over time (and we will).

Try it Today

LiteParse runs everywhere and v2.1 is available now:

```bash
# Node Library + CLI
npm i @llamaindex/liteparse

# Python Library + CLI
pip install liteparse

# Rust Library + CLI
cargo install liteparse

# WASM Library
npm i @llamaindex/liteparse-wasm
```

Or, use it with your favourite coding agent directly as a skill:

```bash
# Claude Code, Codex, OpenCode, etc.
npx skills add run-llama/llamaparse-agent-skills --skill liteparse

# Pi Coding Agent Extension
pi install npm:@llamaindex/liteparse-pi-extension@latest
```

Follow these links for docs and details on source code: