Show HN: Papa – open-source Hemingway-style readability linting for Markdown

Papa, an open-source alpha CLI and Python library for Hemingway-style readability linting of Markdown, was released on Show HN. It flags hard sentences, passive voice, adverbs, and complex phrasing, and can run locally or in CI to gate prose quality. The tool aims to make readability checks scriptable for documentation pipelines.

Papa flags hard sentences, passive voice, adverbs, and complex phrasing, and reports a readability grade. It is an open-source alpha CLI and Python library that can run locally or in CI, emit JSON for automation, and generate a self-contained HTML report.

A sample HTML report generated by Papa highlights hard sentences, passive voice, adverbs, and complex phrases alongside a readability grade panel. The report is self-contained and uses no network assets.

Papa is currently alpha software. The supported surface is intentionally small:

- CLI command: `papa`
- Python package: `papa-lint`
- Input formats: Markdown and plain text
- Reports: terminal, JSON, and self-contained HTML
- CI gating: use `--max-grade` and the process exit code

GitHub Action packaging, SARIF/Markdown reporters, config files, optional external linters, and built-in LLM rewriting are roadmap items.

The [Hemingway Editor](https://hemingwayapp.com) is useful, but it is closed, manual, and not designed for docs pipelines. Other open-source writing tools exist, but they tend to have separate output formats and workflows. Papa focuses on a narrow first job: make prose readability checks scriptable.

- Scores readability with ARI, Flesch-Kincaid, and Gunning fog.
- Highlights like Hemingway by marking hard sentences, passive voice, adverbs, and complex phrases.
- Handles Markdown safely by ignoring frontmatter, fenced code, inline code, and SVG while preserving offsets back to the original file.
- Emits JSON so agents, scripts, or dashboards can consume the findings.
- Gates CI by exiting non-zero when prose is harder than your max grade.
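To make the grade and the CI gate concrete, here is a minimal sketch of the Automated Readability Index (ARI), one of the three formulas listed above, plus a threshold check in the spirit of `--max-grade`. This is not Papa's implementation: the names `ari_grade` and `passes_gate` and the naive regex tokenization are assumptions made for illustration, so the numbers will not match Papa's output exactly.

```python
import re


def ari_grade(text: str) -> float:
    """Approximate ARI: 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43."""
    # Naive tokenization (an assumption): words are runs of letters/digits,
    # optionally followed by an apostrophe suffix such as "don't".
    words = re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?", text)
    # Naive sentence split on runs of terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    # ARI counts letters and digits only, not punctuation.
    chars = sum(1 for word in words for ch in word if ch.isalnum())
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


def passes_gate(text: str, max_grade: float) -> bool:
    """Hypothetical gate: pass only if the prose is no harder than max_grade."""
    return ari_grade(text) <= max_grade


if __name__ == "__main__":
    sample = "The cat sat. The dog ran."
    print(f"ARI: {ari_grade(sample):.2f}, passes grade 10: {passes_gate(sample, 10)}")
```

Short words and short sentences both pull the score down, which is why a gate on a single number can work as a coarse check on prose difficulty.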
```bash
pipx install papa-lint
papa post.md
papa post.md --report html -o report.html
papa post.md --report json > findings.json
papa post.md --max-grade 10
```

Example terminal output:

```text
post.md - Grade 11, hard to read   ✗ fails --max-grade 10
18 hard · 4 very hard · 6 passive · 9 adverbs · 12 complex
ARI 11.2 · FK 10.8 · Fog 12.1   142 sentences
L42  very hard  Very hard to read (grade 16): "Because the detour squeezes..."
L58  passive    Passive voice: 'is measured'
```

Papa does not need a dedicated GitHub Action yet. Install it in your workflow and let `--max-grade` fail the job when prose crosses your threshold:

```yaml
name: readability
on: pull_request
jobs:
  papa:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: python -m pip install papa-lint
      - run: papa README.md docs/*.md --max-grade 10
```

For local development, run the same command before opening a PR.

Papa does not call an LLM directly. It emits structured JSON that you can hand to an agent or script:

```bash
papa post.md --report json > findings.json
```

The JSON includes the file path, document score, and findings with offsets into the original source:

```json
{
  "version": "0.1",
  "file": "post.md",
  "score": {
    "ari": 11.2,
    "flesch_kincaid": 10.8,
    "gunning_fog": 12.1,
    "reading_grade": "Grade 11, hard to read",
    "verdict": "fail"
  },
  "findings": [
    {
      "start": 1423,
      "end": 1490,
      "category": "passive",
      "message": "Passive voice: 'is measured'",
      "severity": "warn"
    }
  ]
}
```

See [docs/llm-contract.md](https://github.com/bharadwaj-pendyala/papa/blob/main/docs/llm-contract.md) for a prompt pattern that uses Papa findings to guide a rewrite.

```text
input -> Ingestor   -> Analyzers      -> Aggregator  -> Reporters
         strip code    readability       merge spans    terminal
         + SVG         passive           + scores       json
         offset map    adverbs                          html
                       complex phrase
```

- Ingest: detect Markdown or text, strip non-prose, and preserve an offset map back to the original source.
- Analyze: run built-in analyzers for readability, passive voice, adverbs, and complex phrases.
- Aggregate: merge overlapping findings, compute document scores, and apply the optional max-grade gate.
- Report: render terminal, JSON, or HTML output and set the exit code.

```python
from papa import analyze

# Read the document as UTF-8 text; the file is closed when the block exits.
with open("post.md", encoding="utf-8") as f:
    text = f.read()

# path labels the source file; max_grade applies the readability gate.
result = analyze(text, path="post.md", max_grade=10)

print(result.score.reading_grade)
print(result.score.verdict)
```

- Config file support, likely `papa.toml`
- GitHub Action wrapper
- SARIF and Markdown reporters for PR annotations and summaries
- Built-in `--suggest` workflow for LLM-assisted rewrites
- MDX, HTML, and reStructuredText ingestion
- Optional integrations with tools such as `proselint`, `alex`, and `vale`
- npm, Homebrew, and Docker distribution

| | Papa alpha | Hemingway App | write-good | vale |
|---|---|---|---|---|
| Readability grade | ✅ | ✅ | ❌ | ❌ |
| Sentence highlights | ✅ | ✅ | | |
| Markdown code-block awareness | ✅ | ❌ | ❌ | ✅ |
| CLI | ✅ | ❌ | ✅ | ✅ |
| CI gate via exit code | ✅ | ❌ | ✅ | ✅ |
| JSON for automation | ✅ | ❌ | ✅ | |
| Open source | ✅ | ❌ | ✅ | ✅ |

Papa currently uses [textstat](https://github.com/textstat/textstat) for readability formulas and includes small built-in heuristics for the Hemingway-style findings. Future optional integrations may include [proselint](https://github.com/amperser/proselint), [write-good](https://github.com/btford/write-good), [vale](https://github.com/errata-ai/vale), and [alex](https://github.com/get-alex/alex).

Issues and PRs welcome. See [CONTRIBUTING.md](https://github.com/bharadwaj-pendyala/papa/blob/main/CONTRIBUTING.md) and our [Code of Conduct](https://github.com/bharadwaj-pendyala/papa/blob/main/CODE_OF_CONDUCT.md). Good first issues are [labeled](https://github.com/bharadwaj-pendyala/papa/labels/good%20first%20issue).

MIT © Bharadwaj Pendyala