
Show HN: Papernews – self-hosted daily newspaper PDF for your reMarkable

A developer released Papernews, a self-hosted tool that aggregates RSS feeds and Hacker News content into a single daily PDF designed for e-ink readers like the reMarkable. The script uses Anthropic's Claude to rewrite full article text into a consistent LaTeX typeset format with quiet typography and no ads or visual noise. The project requires Docker and an Anthropic API key, generating offline-readable PDFs that users can push to their devices manually or via cron jobs.

12 min read · published Jun 4, 2026

Every news site looks different. Hacker News, MacRumors, Quanta, my favourite ML blog, my favourite math blog — each one its own layout, fonts, colors, ads. To read anything I had to wade through somebody's design choices first and focus past the visual noise.

I much prefer reading the way a LaTeX paper or an old magazine looks: quiet typography, generous margins, no color, nothing competing for attention.

papernews is the fix. A script pulls all those feeds, has Claude clean up, translate to English, and rewrite the article bodies — the full text, not just summaries — and renders the result into one consistently typeset LaTeX PDF. Every article is in the PDF; you read entirely offline, no clicking through, no opening tabs.

A side benefit I didn't expect to like but very much do: one place to read the day's news instead of five tabs being refreshed all day. One or two issues per day, no more.

Designed for an e-ink reader like the reMarkable, but it works just as well in any browser's PDF viewer.

**👉 See sample-2026-06-04.pdf for a real day's output.**

Hobby project; works. Things will move. Expect rough edges.

You need: a machine that can run Docker (your laptop, a NAS, a $5/mo VPS, anything), an Anthropic API key, and ~2 GB of disk for the image.

git clone https://github.com/marcj/papernews
cd papernews

cp .env.example .env
$EDITOR .env             # paste ANTHROPIC_API_KEY=sk-ant-...

$EDITOR sources.toml     # add/remove RSS/HN entries, set per-source limits

$EDITOR papernews/template.tex.j2

docker compose up --build -d

Everything you'd normally want to change is in two files:

  • sources.toml: which feeds, how many items per feed, in what order. Two source kinds today: kind = "hn" (Hacker News, top-by-points via the Algolia API) and kind = "rss" (any Atom/RSS feed via feedparser).
  • papernews/template.tex.j2: the LaTeX template. Page size, fonts, colors, layout, what goes on the cover, everything. Edit, restart the container, refresh /digest.pdf.

Optional but useful:

  • papernews/summarize.py + papernews/rewrite.py: the Claude system prompts. Change _MODEL to claude-sonnet-4-6 for fancier rewrites at ~10× the cost; adjust _SYSTEM to change the editorial voice (e.g. disable the auto-translate-to-English rule).
  • papernews/wiki.py: what goes into the World news block and the Quote-of-the-day source.

There are a few ways to get the PDF onto your reMarkable, no special script needed:

  • Manual: open http://your-machine:8000/digest.pdf in a browser on your phone/laptop and upload it to your reMarkable from there (drag-and-drop on my.remarkable.com, or the reMarkable mobile app, or the USB Web Interface at http://10.11.99.1 while connected by USB).
  • rmapi: a third-party CLI that pushes files to your reMarkable cloud account. Pair once, then:

curl -s http://your-machine:8000/digest.pdf -o today.pdf
rmapi put today.pdf /Papernews

    Stick that two-liner in cron on the host and the device picks it up on next sync automatically.
  • Remailable (remailable.getneutrality.org): a third-party email-to-reMarkable bridge. You email the PDF as an attachment to your assigned address and it appears on the device. Useful if your papernews host can mail/mutt but can't reach the reMarkable directly. (reMarkable has no first-party email-to-device; do not believe earlier versions of this README that implied otherwise.)

No native push is built-in because everyone's setup is different and you probably don't want me poking your reMarkable cloud account with your token.

git clone https://github.com/marcj/papernews
cd papernews
cp .env.example .env
docker compose up --build

Then visit http://localhost:8000 for the landing page, with a preview image and a link to /digest.pdf. The first PDF builds on demand, takes ~1–2 minutes the first time and is then cached until new content arrives.

State lives in ./data/state.db (bind-mounted from the host) so it survives container restarts.

A 100–200 page PDF with:

  • Cover page: title + date + article count, quote of the day from Wikiquote, a "World news" block (5 tech headlines + 2 Western items from Wikipedia's Current Events portal, each compressed to a single sentence).
  • Contents: every article grouped by source, with dot-leaders to its publication date.
  • "Did you know…" trivia nuggets from Wikipedia's Main Page.
  • The articles themselves, set in two-column Latin Modern with proper paragraph indents, hyphenation, microtypography. Math ($x = y$, $$\int f$$, \(...\), \[...\]) is rendered as real LaTeX math. Code blocks (fenced or inline) come through in monospace.
  • All non-English source content (heise, etc.) is translated to English during the rewrite step. You can disable that in the prompt if you don't want it.

                   sources.toml
                       │
            ┌──────────┴──────────┐
            │                     │
            ▼                     ▼
       ┌────────┐            ┌────────┐
       │ gather │            │ wiki/  │
       │  HN +  │            │ news + │
       │  RSS   │            │  QOTD  │
       └───┬────┘            └───┬────┘
           ▼                     │
       ┌────────┐                │
       │extract │                │
       │ (traf- │                │
       │ilatura)│                │
       └───┬────┘                │
           ▼                     │
       ┌─────────┐               │
       │summarize│ ─── Claude    │
       └───┬─────┘               │
           ▼                     │
       ┌─────────┐               │
       │ rewrite │ ─── Claude    │
       └───┬─────┘               │
           ▼                     ▼
       SQLite store (state.db)   in-memory
           │                     │
           └──────────┬──────────┘
                     ▼
              ┌──────────┐
              │  render  │ ── xelatex
              └────┬─────┘
                   ▼
             archive/cache/<hash>.pdf

Four stages, each idempotent and resumable:

  1. gather: pulls new items from each source, runs trafilatura to extract the article body, stores the raw text. Pure I/O, no LLM cost.
  2. summarize: batches up to 8 articles per Claude call and produces a ≤40-word two-sentence summary for each (used as the lede in the front matter and in the contents listing).
  3. rewrite: batches up to 8 articles per Claude call (streamed because the output is long) and produces a clean, properly-paragraphed, translated-to-English version of each article body for the renderer. Preserves code fences and $math$ exactly.
  4. render: pulls the latest N articles per source from the store, plus fresh world news + quote + DYK, and runs them through a Jinja template into xelatex → PDF. Results are cached by a hash of "what's in the store" + "what's in sources.toml". Same content + same config → same cached PDF served instantly.
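The render cache can be pictured as a content-addressed lookup. What follows is a minimal sketch, not papernews's actual code: it assumes the key is a SHA-256 over the (id, updated-at) pairs in the store plus the raw bytes of sources.toml. The function names, the tuple layout, and the choice of SHA-256 are all illustrative assumptions.

```python
import hashlib
from pathlib import Path


def cache_key(articles: list[tuple[str, str]], sources_toml: bytes) -> str:
    """Derive a deterministic key from store contents and config.

    `articles` is a hypothetical list of (article_id, updated_at) pairs.
    """
    h = hashlib.sha256()
    # Sort so the key does not depend on the order rows come back in.
    for article_id, updated_at in sorted(articles):
        h.update(f"{article_id}\x00{updated_at}\n".encode("utf-8"))
    h.update(b"--sources.toml--\n")
    h.update(sources_toml)
    return h.hexdigest()


def cached_pdf_path(cache_dir: Path, key: str) -> Path:
    """Map a key onto the archive/cache/<hash>.pdf layout from the diagram."""
    return cache_dir / f"{key}.pdf"
```

With a scheme like this, identical store contents and identical config always map to the same file, so an existing PDF can be served without running xelatex again; a new article or any edit to sources.toml produces a different key and forces a rebuild.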

A background APScheduler job runs steps 1–3 every 4 hours (configurable). The render step is on-demand; the first hit to /digest.pdf after an ingest builds the PDF and caches it.

| route | what it does |
| --- | --- |
| GET / | minimal landing page, cover preview + Read PDF link |
| GET /digest.pdf | the current edition (built on demand, then cached) |
| GET /preview.png | page 1 rasterized at 180 DPI |
| GET /sources | JSON list of configured sources + latest fetched_at |
| GET /healthz | liveness probe (returns ok) |
| POST /ingest | manual kick of the gather → summarize → rewrite cycle |

Sources live in sources.toml; that's the exact file used to produce the sample PDF. Open it, copy a block, edit, restart the container, refresh /digest.pdf.

The order of [[source]] blocks in the file is the order they'll appear in the PDF: sources at the top come first. World news, quote of the day, and the "Did you know…" nuggets are not configured here; they're cover decorations, fetched fresh on every render.

kind = "hn" ranks Hacker News stories by points within a time window. No URL needed; the API is hardcoded.

| field | type | default | meaning |
| --- | --- | --- | --- |
| name | string | required | display label (also the contents-page heading) |
| kind | string | required | must be "hn" |
| limit | int | 10 | how many top stories to keep |
| since_hours | int | 48 | only consider stories submitted in the last N hours |
| min_points | int | 50 | story must have at least this many points to qualify |

[[source]]
name        = "Hacker News"
kind        = "hn"
limit       = 10
since_hours = 48
min_points  = 100
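
To make the three knobs concrete, here is a rough sketch of how since_hours, min_points and limit could translate into a request against the public Algolia HN Search API and a local ranking step. The function names and the page_size parameter are made up for illustration, and papernews's own fetch.py may build its query differently.

```python
import time
from typing import Optional
from urllib.parse import urlencode

HN_SEARCH = "https://hn.algolia.com/api/v1/search"


def hn_query_url(since_hours: int = 48, min_points: int = 50,
                 now: Optional[float] = None, page_size: int = 100) -> str:
    """Build a search URL for stories inside the time window."""
    current = now if now is not None else time.time()
    cutoff = int(current - since_hours * 3600)
    params = {
        "tags": "story",
        "numericFilters": f"created_at_i>{cutoff},points>={min_points}",
        "hitsPerPage": page_size,
    }
    return f"{HN_SEARCH}?{urlencode(params)}"


def top_stories(hits: list[dict], limit: int = 10, min_points: int = 50) -> list[dict]:
    """Keep the `limit` highest-scoring hits, re-checking the threshold locally."""
    eligible = [h for h in hits if (h.get("points") or 0) >= min_points]
    return sorted(eligible, key=lambda h: h["points"], reverse=True)[:limit]
```

Fetching the URL and passing the JSON response's "hits" list to top_stories would yield the ranked selection; sorting locally avoids relying on the order the API happens to return.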

kind = "rss" feeds are parsed with feedparser, so RSS 0.9/1.0/2.0 and Atom 1.0 are all accepted; almost every blog and most news sites work.

| field | type | default | meaning |
| --- | --- | --- | --- |
| name | string | required | display label (also the contents-page heading) |
| kind | string | required | must be "rss" |
| url | string | required | feed URL |
| limit | int | 20 | take at most N most-recent items |

[[source]]
name  = "Quanta Magazine"
kind  = "rss"
url   = "https://www.quantamagazine.org/feed/"
limit = 8

The limit is applied twice, on purpose:

  • At fetch time: gather doesn't pull more than limit items from the feed (saves bandwidth and trafilatura time).
  • At render time: even if the store accumulates more than limit items for a source across multiple ingests (it will; items don't get deleted), only the latest limit per source make it into a given PDF.

So if you want Quanta to have at most 8 articles in the issue, regardless of how many they've published this week, set limit = 8. If you want Hacker News to show only the top 5 by points in the last 24h, set limit = 5, since_hours = 24.
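
The render-time half of that rule might look roughly like this against a SQLite store. The articles table, its columns, and the function name are assumptions made for the example; the real schema in store.py is not shown in this README.

```python
import sqlite3


def latest_per_source(db: sqlite3.Connection, sources: list[dict]) -> list[tuple]:
    """Return (source, title) rows: newest `limit` items per source, in config order."""
    picked = []
    for src in sources:
        rows = db.execute(
            "SELECT source, title FROM articles "
            "WHERE source = ? ORDER BY fetched_at DESC LIMIT ?",
            (src["name"], src["limit"]),
        ).fetchall()
        picked.extend(rows)
    return picked


# Demo: twelve stored items accumulated over several ingests, limit = 8.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE articles (source TEXT, title TEXT, fetched_at TEXT)")
db.executemany(
    "INSERT INTO articles VALUES (?, ?, ?)",
    [("Quanta Magazine", f"article {i}", f"2026-06-{i:02d}T07:00:00")
     for i in range(1, 13)],
)
issue = latest_per_source(db, [{"name": "Quanta Magazine", "limit": 8}])
print(len(issue), issue[0][1])  # 8 article 12
```

Nothing is deleted from the table; the older four rows simply fall outside the LIMIT and stay in the store.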

On the totals. Adding up every limit in sources.toml gives you the maximum article count per issue. Aim for 30–60 articles for a comfortable 30–60 minute read. Claude's summaries are dense; volume isn't quality. An empty section on a slow day is cleaner than padding.

Two scheduling modes, a fixed interval or fixed times of day; pick whichever fits your routine. Set the env vars in .env.

# Mode 1: fixed interval
INGEST_INTERVAL_SECONDS=14400   # 4 hours (the default)

# Mode 2: fixed times of day
INGEST_SCHEDULE=07:00,18:00     # comma-separated HH:MM
INGEST_TIMEZONE=Europe/London   # any IANA tz; default UTC

If both are set, INGEST_SCHEDULE wins. The render is still on-demand: hitting /digest.pdf between scheduled runs gives you the cached PDF instantly.
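
The precedence rule can be sketched in plain Python. papernews itself delegates scheduling to APScheduler, so treat this as an illustration of the rule only: the Schedule dataclass, load_schedule and next_run are hypothetical names, and the parsing details are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta, timezone
from typing import Optional


@dataclass
class Schedule:
    interval_seconds: Optional[int]  # None when fixed times are in use
    times: tuple[time, ...]          # empty when the interval is in use
    tz_name: str


def load_schedule(env: dict) -> Schedule:
    """INGEST_SCHEDULE, when set, takes precedence over the interval."""
    tz_name = env.get("INGEST_TIMEZONE", "UTC")
    raw = env.get("INGEST_SCHEDULE", "").strip()
    if raw:
        times = tuple(sorted(time.fromisoformat(t.strip()) for t in raw.split(",")))
        return Schedule(None, times, tz_name)
    return Schedule(int(env.get("INGEST_INTERVAL_SECONDS", "14400")), (), tz_name)


def next_run(now: datetime, sched: Schedule) -> datetime:
    """Next ingest time after `now` (an aware datetime in the schedule's zone)."""
    if not sched.times:
        return now + timedelta(seconds=sched.interval_seconds)
    for day in (now.date(), now.date() + timedelta(days=1)):
        for t in sched.times:
            candidate = datetime.combine(day, t, tzinfo=now.tzinfo)
            if candidate > now:
                return candidate
    raise AssertionError("unreachable: tomorrow always has a slot")
```

A caller would pass something like datetime.now(ZoneInfo(sched.tz_name)) from the standard zoneinfo module, so that 07:00 means 07:00 in the configured timezone rather than in UTC.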

You can also kick a manual ingest any time:

curl -X POST http://localhost:8000/ingest

A built-in hook fires after every successful ingest. Point POST_INGEST_HOOK at any executable on the container's filesystem (drop the script into your ./data/hooks/ directory so it survives rebuilds via the bind mount). The hook receives the freshly-built PDF path as its first argument.

POST_INGEST_HOOK=/data/hooks/push-to-remarkable.sh
POST_INGEST_HOOK_TIMEOUT=300    # optional; default 300s

Hook failures are non-fatal — a broken hook logs an error but doesn't crash the ingest loop.
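
The "non-fatal" contract can be illustrated with a small wrapper around subprocess.run. This is a sketch of the behaviour described here (PDF path as first argument, timeout, errors logged rather than raised); the function name, logger name, and return value are assumptions rather than the project's real implementation.

```python
import logging
import subprocess

log = logging.getLogger("papernews.hook")  # logger name is illustrative


def run_post_ingest_hook(hook: str, pdf_path: str, timeout: float = 300) -> bool:
    """Run the hook with the PDF path as its first argument; never raise."""
    try:
        result = subprocess.run([hook, pdf_path], timeout=timeout,
                                capture_output=True, text=True)
    except subprocess.TimeoutExpired:
        log.error("hook %s timed out after %ss", hook, timeout)
        return False
    except OSError as exc:  # missing file, not executable, bad shebang
        log.error("hook %s could not start: %s", hook, exc)
        return False
    if result.returncode != 0:
        log.error("hook %s exited %d: %s", hook, result.returncode,
                  result.stderr.strip())
        return False
    return True
```

Returning a boolean instead of raising keeps the ingest loop alive whatever the hook does, which matches the behaviour promised above.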

Drop this in ./data/hooks/push-to-remarkable.sh and chmod +x it:

#!/usr/bin/env bash
set -euo pipefail

PDF="$1"
REMARKABLE="root@10.11.99.1"            # adjust to your device's IP
SSH_KEY=/data/hooks/remarkable_id_ed25519

scp -i "$SSH_KEY" -o StrictHostKeyChecking=accept-new \
    "$PDF" "$REMARKABLE:/home/root/papernews.pdf"

ssh -i "$SSH_KEY" "$REMARKABLE" 'systemctl restart xochitl'

Generate a passwordless key (ssh-keygen -t ed25519 -f data/hooks/remarkable_id_ed25519 -N ""), add the .pub to the reMarkable's /home/root/.ssh/authorized_keys once, and from then on every ingest pushes the new paper to your device.

The same pattern works for Kindle (scp over USB networking), a network printer (lp -d papernews "$PDF"), an email (mutt -a "$PDF"), or anything else you can script.

Modest, no-network unittest suite for the web/scheduling/hook behaviour:

python -m unittest discover -s tests

You don't have to use Docker — the CLI works directly:

python3 -m venv .venv
.venv/bin/pip install -e .
export ANTHROPIC_API_KEY=sk-ant-...

.venv/bin/python -m papernews gather       # fetch + extract
.venv/bin/python -m papernews summarize    # claude pass 1 (batched)
.venv/bin/python -m papernews rewrite      # claude pass 2 (batched, streamed)
.venv/bin/python -m papernews render       # xelatex → PDF
.venv/bin/python -m papernews build

Requirements: Python 3.11+, xelatex (TeX Live with texlive-xetex, texlive-latex-extra, lmodern), pdftoppm (poppler).

Everything visual lives in one file: papernews/template.tex.j2.

  • Page size: paperwidth=157mm, paperheight=210mm (tuned for reMarkable Pro)
  • Body font: Latin Modern Roman 10pt
  • Two-column body for any article over 2000 characters; single-column otherwise
  • First-line paragraph indent instead of vertical \parskip (classic magazine convention)
  • Microtype protrusion + expansion
  • Letter-spacing on small-caps source labels via fontspec's LetterSpace

Customize whatever you like. The Jinja delimiters are LaTeX-safe: ((* ... *)) for blocks and ((( ... ))) for variables, so your {, } and \ don't fight each other.

Roughly per ingest cycle, with Claude Haiku 4.5 (default model):

  • ~50 articles
  • Summarize: 6 batched calls (~8 articles each)
  • Rewrite: 6 batched calls, streamed
  • World-news compress: 1 call

Order-of-magnitude: a few cents to a few tens of cents per cycle depending on article lengths. At 6 cycles/day that's well under $1/day. Going to Sonnet or Opus multiplies the bill ~10–30×.
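
The call counts above follow from simple chunking arithmetic. The sketch below assumes fixed-size batches of 8 and uses 48 articles so the numbers land exactly on 6 calls per pass; the helper names are invented for the example, and the real batching in summarize.py and rewrite.py may group articles differently.

```python
from math import ceil

BATCH_SIZE = 8  # "up to 8 articles per Claude call"


def batches(items: list, size: int = BATCH_SIZE) -> list[list]:
    """Split items into consecutive chunks of at most `size`."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def calls_per_cycle(n_articles: int, size: int = BATCH_SIZE) -> int:
    """Summarize pass + rewrite pass + one world-news call."""
    per_pass = ceil(n_articles / size)
    return 2 * per_pass + 1


articles = [f"article-{i}" for i in range(48)]
print(len(batches(articles)))   # 6
print(calls_per_cycle(48))      # 13
print(calls_per_cycle(48) * 6)  # 78 calls/day at 6 cycles
```

With exactly 50 articles the last batch would hold only two items and each pass would need a seventh call, so the "6 batched calls" figure is best read as approximate.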

Set a spend cap at https://console.anthropic.com/settings/billing → Spend limits — the run-loop can't surprise you above whatever you set.

  • All data lives on your machine (./data/state.db + ./data/archive/cache/).
  • Article text is sent to the Anthropic API for summarization and rewriting. That's the only outbound destination for content (besides fetching the feeds themselves).
  • No analytics, no telemetry, no third-party scripts in the landing page.

papernews/
├── papernews/
│   ├── fetch.py          # HN Algolia + RSS feedparser
│   ├── extract.py        # trafilatura
│   ├── summarize.py      # Anthropic SDK, batched
│   ├── rewrite.py        # Anthropic SDK, batched + streamed
│   ├── wiki.py           # World news / Quote / DYK / tech feeds
│   ├── store.py          # SQLite article store + queries
│   ├── render.py         # Jinja + xelatex
│   ├── preview.py        # PDF → PNG via pdftoppm
│   ├── cache.py          # On-disk cache by content hash
│   ├── cli.py            # papernews command
│   ├── web.py            # Flask + APScheduler
│   └── template.tex.j2   # the magazine
├── sources.toml          # configured feeds
├── pyproject.toml
├── Dockerfile
├── docker-compose.yml
└── data/                 # gitignored — your SQLite + cached PDFs

Open an issue first if you're planning something non-trivial — happy to talk about direction. The codebase is small enough that you can read it end to end in an hour.

MIT — see LICENSE.

Working name; happy to take suggestions. The vibe is: an old-fashioned daily paper, not a feed. You read it once, then you put it down.
