{"slug": "show-hn-papa-open-source-hemingway-style-readability-linting-for-markdown", "title": "Show HN: Papa – open-source Hemingway-style readability linting for Markdown", "summary": "Papa, an open-source alpha CLI and Python library for Hemingway-style readability linting of Markdown, was released on Show HN. It flags hard sentences, passive voice, adverbs, and complex phrasing, and can run locally or in CI to gate prose quality. The tool aims to make readability checks scriptable for documentation pipelines.", "body_md": "Papa flags hard sentences, passive voice, adverbs, and complex phrasing, and reports a\nreadability grade. It is an open-source **alpha** CLI and Python library that\ncan run locally or in CI, emit JSON for automation, and generate a self-contained\nHTML report.\n\nThe sample HTML report generated by Papa highlights hard sentences, passive voice,\nadverbs, and complex phrases alongside a readability grade panel. The report is\nself-contained and uses no network assets.\n\nPapa is currently **alpha software**. The supported surface is intentionally\nsmall:\n\n- CLI command: `papa`\n- Python package: `papa-lint`\n- Input formats: Markdown and plain text\n- Reports: terminal, JSON, and self-contained HTML\n- CI gating: use `--max-grade` and the process exit code\n\nGitHub Action packaging, SARIF/Markdown reporters, config files, optional external linters, and built-in LLM rewriting are roadmap items.\n\nThe [Hemingway Editor](https://hemingwayapp.com) is useful, but it is closed,\nmanual, and not designed for docs pipelines. 
Other open-source writing tools\nexist, but they tend to have separate output formats and workflows.\n\nPapa focuses on a narrow first job: make prose readability checks scriptable.\n\n- **Scores readability** with ARI, Flesch-Kincaid, and Gunning fog.\n- **Highlights like Hemingway** by marking hard sentences, passive voice, adverbs, and complex phrases.\n- **Handles Markdown safely** by ignoring frontmatter, fenced code, inline code, and SVG while preserving offsets back to the original file.\n- **Emits JSON** so agents, scripts, or dashboards can consume the findings.\n- **Gates CI** by exiting non-zero when prose is harder than your max grade.\n\n```\npipx install papa-lint\n\npapa post.md\npapa post.md --report html -o report.html\npapa post.md --report json > findings.json\npapa post.md --max-grade 10\n```\n\nExample terminal output:\n\n```\npost.md  -  Grade 11, hard to read\n  ✗ fails --max-grade 10\n\n  18 hard · 4 very hard · 6 passive · 9 adverbs · 12 complex\n  ARI 11.2 · FK 10.8 · Fog 12.1  (142 sentences)\n\n  L42   very hard  Very hard to read (grade 16): \"Because the detour squeezes...\"\n  L58   passive    Passive voice: 'is measured'\n```\n\nPapa does not need a dedicated GitHub Action yet. Install it in your workflow\nand let `--max-grade` fail the job when prose crosses your threshold:\n\n```\nname: readability\non: pull_request\njobs:\n  papa:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v4\n      - uses: actions/setup-python@v5\n        with:\n          python-version: \"3.11\"\n      - run: python -m pip install papa-lint\n      - run: papa README.md docs/*.md --max-grade 10\n```\n\nFor local development, run the same command before opening a PR.\n\nPapa does not call an LLM directly. 
It emits structured JSON that you can hand to an agent or script:\n\n```\npapa post.md --report json > findings.json\n```\n\nThe JSON includes the file path, document score, and findings with offsets into the original source:\n\n```\n{\n  \"version\": \"0.1\",\n  \"file\": \"post.md\",\n  \"score\": {\n    \"ari\": 11.2,\n    \"flesch_kincaid\": 10.8,\n    \"gunning_fog\": 12.1,\n    \"reading_grade\": \"Grade 11, hard to read\",\n    \"verdict\": \"fail\"\n  },\n  \"findings\": [\n    {\n      \"start\": 1423,\n      \"end\": 1490,\n      \"category\": \"passive\",\n      \"message\": \"Passive voice: 'is measured'\",\n      \"severity\": \"warn\"\n    }\n  ]\n}\n```\n\nSee [docs/llm-contract.md](https://github.com/bharadwaj-pendyala/papa/blob/main/docs/llm-contract.md) for a prompt pattern that uses\nPapa findings to guide a rewrite.\n\n```\ninput -> Ingestor -> Analyzers -> Aggregator -> Reporters\n         strip code  readability  merge spans  terminal\n         + SVG       passive      + scores     json\n         offset map  adverbs                   html\n                     complex phrase\n```\n\n- **Ingest**: detect Markdown or text, strip non-prose, and preserve an offset map back to the original source.\n- **Analyze**: run built-in analyzers for readability, passive voice, adverbs, and complex phrases.\n- **Aggregate**: merge overlapping findings, compute document scores, and apply the optional max-grade gate.\n- **Report**: render terminal, JSON, or HTML output and set the exit code.\n\n``` python\nfrom pathlib import Path\n\nfrom papa import analyze\n\n# Read the document, then score it against a maximum grade of 10.\ntext = Path(\"post.md\").read_text(encoding=\"utf-8\")\nresult = analyze(text, path=\"post.md\", max_grade=10)\nprint(result.score.reading_grade)\nprint(result.score.verdict)\n```\n\n- Config file support, likely `papa.toml`\n- GitHub Action wrapper\n- SARIF and Markdown reporters for PR annotations and summaries\n- Built-in `--suggest` workflow for LLM-assisted rewrites\n- MDX, HTML, and reStructuredText ingestion\n- Optional integrations with tools such 
as `proselint`, `alex`, and `vale`\n- npm, Homebrew, and Docker distribution\n\n| Feature | Papa alpha | Hemingway App | write-good | vale |\n|---|---|---|---|---|\n| Readability grade | ✅ | ✅ | ❌ | ❌ |\n| Sentence highlights | ✅ | ✅ | | |\n| Markdown code-block awareness | ✅ | ❌ | ❌ | ✅ |\n| CLI | ✅ | ❌ | ✅ | ✅ |\n| CI gate via exit code | ✅ | ❌ | ✅ | ✅ |\n| JSON for automation | ✅ | ❌ | ✅ | |\n| Open source | ✅ | ❌ | ✅ | ✅ |\n\nPapa currently uses [textstat](https://github.com/textstat/textstat) for\nreadability formulas and includes small built-in heuristics for the\nHemingway-style findings. Future optional integrations may include\n[proselint](https://github.com/amperser/proselint),\n[write-good](https://github.com/btford/write-good),\n[vale](https://github.com/errata-ai/vale), and\n[alex](https://github.com/get-alex/alex).\n\nIssues and PRs are welcome. See [CONTRIBUTING.md](https://github.com/bharadwaj-pendyala/papa/blob/main/CONTRIBUTING.md) and our\n[Code of Conduct](https://github.com/bharadwaj-pendyala/papa/blob/main/CODE_OF_CONDUCT.md). 
Good first issues are\n[labeled](https://github.com/bharadwaj-pendyala/papa/labels/good%20first%20issue).\n\nMIT © Bharadwaj Pendyala", "url": "https://wpnews.pro/news/show-hn-papa-open-source-hemingway-style-readability-linting-for-markdown", "canonical_source": "https://github.com/bharadwaj-pendyala/papa", "published_at": "2026-07-04 00:03:25+00:00", "updated_at": "2026-07-04 00:20:01.574952+00:00", "lang": "en", "topics": ["developer-tools", "natural-language-processing"], "entities": ["Papa", "Hemingway Editor", "GitHub", "Python"], "alternates": {"html": "https://wpnews.pro/news/show-hn-papa-open-source-hemingway-style-readability-linting-for-markdown", "markdown": "https://wpnews.pro/news/show-hn-papa-open-source-hemingway-style-readability-linting-for-markdown.md", "text": "https://wpnews.pro/news/show-hn-papa-open-source-hemingway-style-readability-linting-for-markdown.txt", "jsonld": "https://wpnews.pro/news/show-hn-papa-open-source-hemingway-style-readability-linting-for-markdown.jsonld"}}