Show HN: Papernews – self-hosted daily newspaper PDF for your reMarkable

Every news site looks different. Hacker News, MacRumors, Quanta, my favourite ML blog, my favourite math blog — each one its own layout, fonts, colors, ads. To read anything I had to wade through somebody's design choices first and focus past the visual noise.

I much prefer reading the way a LaTeX paper or an old magazine looks: quiet typography, generous margins, no color, nothing competing for attention.

papernews is the fix. A script pulls all those feeds, has Claude clean up, translate to English, and rewrite the article bodies — the full text, not just summaries — and renders the result into one consistently typeset LaTeX PDF. Every article is in the PDF; you read entirely offline, no clicking through, no opening tabs.

A side benefit I didn't expect to like but very much do: one place to read the day's news instead of five tabs being refreshed all day. One or two issues per day, no more.

Designed for an e-ink reader like the reMarkable, but it works just as well in any browser's PDF viewer.

**👉 See sample-2026-06-04.pdf for a real day's output.**

Hobby project; works. Things will move. Expect rough edges.

You need: a machine that can run Docker (your laptop, a NAS, a $5/mo VPS, anything), an Anthropic API key, and ~2 GB of disk for the image.

git clone https://github.com/marcj/papernews
cd papernews

cp .env.example .env
$EDITOR .env             # paste ANTHROPIC_API_KEY=sk-ant-...

$EDITOR sources.toml     # add/remove RSS/HN entries, set per-source limits

$EDITOR papernews/template.tex.j2

docker compose up --build -d

Everything you'd normally want to change is in two files:

- `sources.toml`: which feeds, how many items per feed, in what order. Two source kinds today: `kind = "hn"` (Hacker News, top-by-points via the Algolia API) and `kind = "rss"` (any Atom/RSS feed via feedparser).
- `papernews/template.tex.j2`: the LaTeX template. Page size, fonts, colors, layout, what goes on the cover, everything. Edit, restart the container, refresh `/digest.pdf`.

Optional but useful:

- `papernews/summarize.py` + `papernews/rewrite.py`: the Claude system prompts. Change `_MODEL` to `claude-sonnet-4-6` for fancier rewrites at ~10× the cost; adjust `_SYSTEM` to change the editorial voice (e.g. disable the auto-translate-to-English rule).
- `papernews/wiki.py`: what goes into the World news block and the Quote-of-the-day source.

A few different ways to get the PDF onto your reMarkable, no special script needed:

- Manual: open `http://your-machine:8000/digest.pdf` in a browser on your phone/laptop and upload it to your reMarkable from there (drag-and-drop on my.remarkable.com, or the reMarkable mobile app, or the USB Web Interface at `http://10.11.99.1` while connected by USB).
- `rmapi`: a third-party CLI that pushes files to your reMarkable cloud account. Pair once, then:

    curl -s http://your-machine:8000/digest.pdf -o today.pdf
    rmapi put today.pdf /Papernews

  Stick that two-liner in cron on the host and the device picks it up on next sync automatically.
- Remailable: a third-party email-to-reMarkable bridge (remailable.getneutrality.org). You email the PDF as an attachment to your assigned address and it appears on the device. Useful if your papernews host can `mail`/`mutt` but can't reach the reMarkable directly. (reMarkable has no first-party email-to-device; do not believe earlier versions of this README that implied otherwise.)

No native push is built-in because everyone's setup is different and you probably don't want me poking your reMarkable cloud account with your token.

git clone https://github.com/yourname/papernews
cd papernews
cp .env.example .env
docker compose up --build

Then visit http://localhost:8000 for the landing page, with a preview image and a link to `/digest.pdf`. The first PDF builds on demand; it takes ~1–2 minutes the first time and is then cached until new content arrives.

State lives in `./data/state.db` (bind-mounted from the host) so it survives container restarts.

A 100–200 page PDF with:

- Cover page: title + date + article count, quote of the day from Wikiquote, a "World news" block (5 tech headlines + 2 Western items from Wikipedia's Current Events portal, each compressed to a single sentence).
- Contents: every article grouped by source, with dot-leaders to its publication date.
- "Did you know…" trivia nuggets from Wikipedia's Main Page.
- The articles themselves, set in two-column Latin Modern with proper paragraph indents, hyphenation, microtypography. Math (`$x = y$`, `$$\int f$$`, `$...$`, `\[...\]`) is rendered as real LaTeX math. Code blocks (fenced or inline) come through in monospace.
- All non-English source content (heise, etc.) is translated to English during the rewrite step. You can disable that in the prompt if you don't want it.

                   sources.toml
                       │
            ┌──────────┴──────────┐
            │                     │
            ▼                     ▼
       ┌────────┐            ┌────────┐
       │ gather │            │ wiki/  │
       │  HN +  │            │ news + │
       │  RSS   │            │  QOTD  │
       └───┬────┘            └───┬────┘
           ▼                     │
       ┌────────┐                │
       │extract │                │
       │ (traf- │                │
       │ilatura)│                │
       └───┬────┘                │
           ▼                     │
       ┌─────────┐               │
       │summarize│ ─── Claude    │
       └───┬─────┘               │
           ▼                     │
       ┌─────────┐               │
       │ rewrite │ ─── Claude    │
       └───┬─────┘               │
           ▼                     ▼
       SQLite store (state.db)   in-memory
           │                     │
           └──────────┬──────────┘
                     ▼
              ┌──────────┐
              │  render  │ ── xelatex
              └────┬─────┘
                   ▼
             archive/cache/<hash>.pdf

Four stages, each idempotent and resumable:

- gather: pulls new items from each source, runs `trafilatura` to extract the article body, stores the raw text. Pure I/O, no LLM cost.
- summarize: batches up to 8 articles per Claude call and produces a ≤40-word two-sentence summary for each (used as the lede in the front matter and in the contents listing).
- rewrite: batches up to 8 articles per Claude call (streamed because the output is long) and produces a clean, properly-paragraphed, translated-to-English version of each article body for the renderer. Preserves code fences and `$math$` exactly.
- render: pulls the latest N articles per source from the store, plus fresh world news + quote + DYK, and runs them through a Jinja template into xelatex → PDF. Results are cached by a hash of "what's in the store" + "what's in sources.toml". Same content + same config → same cached PDF served instantly.
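The "up to 8 articles per Claude call" batching in the summarize and rewrite stages can be pictured with a small helper. This is only a sketch of the idea: `batched` and `BATCH_SIZE` are hypothetical names, not taken from the papernews source, and the real code may group articles differently.

```python
from itertools import islice
from typing import Iterable, Iterator

BATCH_SIZE = 8  # assumption: mirrors "up to 8 articles per Claude call"


def batched(items: Iterable[str], size: int = BATCH_SIZE) -> Iterator[list[str]]:
    """Yield consecutive lists of at most `size` items, preserving order."""
    it = iter(items)
    # islice pulls the next `size` items; an empty list ends the loop.
    while batch := list(islice(it, size)):
        yield batch


article_ids = [f"article-{n}" for n in range(19)]
batches = list(batched(article_ids))
print([len(b) for b in batches])  # [8, 8, 3]
```

Each inner list would become one API request, so 19 pending articles cost three calls per stage rather than nineteen.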

A background `APScheduler` job runs steps 1–3 every 4 hours (configurable). The render step is on-demand; the first hit to `/digest.pdf` after an ingest builds the PDF and caches it.
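One way to think about the "same content + same config → same cached PDF" rule is a content-addressed file name. The sketch below is an assumption about how such a key could be built, not the project's actual `cache.py`: `cache_key`, `cache_path`, and the choice of SHA-256 over article ids plus the raw config text are all illustrative.

```python
import hashlib
import json
from pathlib import Path


def cache_key(article_ids: list[str], sources_toml: str) -> str:
    """Hash "what's in the store" together with "what's in sources.toml"."""
    payload = json.dumps(
        # Sorting makes the key independent of the order rows were read in.
        {"articles": sorted(article_ids), "config": sources_toml},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cache_path(cache_dir: Path, key: str) -> Path:
    """Location of the cached PDF for a given key (archive/cache/<hash>.pdf)."""
    return cache_dir / f"{key}.pdf"


key = cache_key(["hn-1", "quanta-7"], "[[source]]\nname = 'Hacker News'\n")
print(cache_path(Path("data/archive/cache"), key).name.endswith(".pdf"))  # True
```

A new ingest changes the article list and an edit to `sources.toml` changes the config text, so either one yields a new key and therefore a fresh render on the next request.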

| route | what it does |
| --- | --- |
| `GET /` | minimal landing page, cover preview + Read PDF link |
| `GET /digest.pdf` | the current edition (built on demand, then cached) |
| `GET /preview.png` | page 1 rasterized at 180 DPI |
| `GET /sources` | JSON list of configured sources + latest `fetched_at` |
| `GET /healthz` | liveness probe (returns `ok`) |
| `POST /ingest` | manual kick of the gather → summarize → rewrite cycle |

Sources live in `sources.toml`, the exact file used to produce the sample PDF. Open it, copy a block, edit, restart the container, refresh `/digest.pdf`.

The order of `[[source]]` blocks in the file is the order they'll appear in the PDF: sources at the top come first. World news, quote of the day, and the "Did you know…" nuggets are not configured here; they're cover decorations, fetched fresh on every render.

The `"hn"` kind ranks Hacker News stories by points within a time window. No URL needed; the API is hardcoded.

| field | type | default | description |
| --- | --- | --- | --- |
| `name` | string | required | display label (also the contents-page heading) |
| `kind` | string | required | must be `"hn"` |
| `limit` | int | `10` | how many top stories to keep |
| `since_hours` | int | `48` | only consider stories submitted in the last N hours |
| `min_points` | int | `50` | story must have at least this many points to qualify |

[[source]]
name        = "Hacker News"
kind        = "hn"
limit       = 10
since_hours = 48
min_points  = 100

The `"rss"` kind is parsed with feedparser, so it accepts RSS 0.9/1.0/2.0 and Atom 1.0; every blog and most news sites work.

| field | type | default | description |
| --- | --- | --- | --- |
| `name` | string | required | display label (also the contents-page heading) |
| `kind` | string | required | must be `"rss"` |
| `url` | string | required | feed URL |
| `limit` | int | `20` | take at most N most-recent items |

[[source]]
name  = "Quanta Magazine"
kind  = "rss"
url   = "https://www.quantamagazine.org/feed/"
limit = 8

The `limit` is applied twice, on purpose:

- At fetch time: gather doesn't pull more than `limit` items from the feed (saves bandwidth and trafilatura time).
- At render time: even if the store accumulates more than `limit` items for a source across multiple ingests (it will, since items don't get deleted), only the latest `limit` per source make it into a given PDF.

So if you want Quanta to have at most 8 articles in the issue, regardless of how many they've published this week, set `limit = 8`. If you want Hacker News to show only the top 5 by points in the last 24h, set `limit = 5, since_hours = 24`.
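The render-time half of that rule (keep only the latest `limit` items per source, however many the store has accumulated) can be sketched in a few lines. The row shape `(source, fetched_at, title)` and the function name `latest_per_source` are assumptions made for illustration; the real store is SQLite and its schema is not shown here.

```python
from collections import defaultdict

DEFAULT_LIMIT = 20  # assumption: falls back to the documented rss default


def latest_per_source(
    rows: list[tuple[str, str, str]], limits: dict[str, int]
) -> dict[str, list[str]]:
    """Return the newest `limit` titles for each source, newest first."""
    by_source: dict[str, list[tuple[str, str]]] = defaultdict(list)
    for source, fetched_at, title in rows:
        by_source[source].append((fetched_at, title))

    picked: dict[str, list[str]] = {}
    for source, items in by_source.items():
        # ISO-8601 timestamps sort chronologically as plain strings.
        items.sort(reverse=True)
        keep = limits.get(source, DEFAULT_LIMIT)
        picked[source] = [title for _, title in items[:keep]]
    return picked


rows = [
    ("Quanta", "2026-06-01T08:00", "oldest"),
    ("Quanta", "2026-06-03T08:00", "newest"),
    ("Quanta", "2026-06-02T08:00", "middle"),
    ("Hacker News", "2026-06-03T09:00", "only story"),
]
print(latest_per_source(rows, {"Quanta": 2, "Hacker News": 5}))
# {'Quanta': ['newest', 'middle'], 'Hacker News': ['only story']}
```

Older rows stay in the store untouched; they simply stop being selected once newer ones push them past the limit.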

On the totals. Adding up every `limit` in `sources.toml` gives you the maximum article count per issue. Aim for 30–60 articles for a comfortable 30–60 minute read. Claude's summaries are dense; volume isn't quality. An empty section on a slow day is cleaner than padding.

Two modes; pick whichever fits your routine. Set the env var in `.env`.

INGEST_INTERVAL_SECONDS=14400   # 4 hours (the default)
INGEST_SCHEDULE=07:00,18:00     # comma-separated HH:MM
INGEST_TIMEZONE=Europe/London   # any IANA tz; default UTC

If both are set, `INGEST_SCHEDULE` wins. The render is still on-demand: hitting `/digest.pdf` between scheduled runs gives you the cached PDF instantly.
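The precedence between the two modes can be illustrated with a small function that computes the next ingest time from the three variables. It is a sketch under stated assumptions: the project delegates scheduling to APScheduler, so `next_ingest` is a hypothetical stand-in that only models the documented behaviour (fixed interval by default, wall-clock slots when `INGEST_SCHEDULE` is set).

```python
from datetime import datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo


def next_ingest(now: datetime, env: dict[str, str]) -> datetime:
    """Next ingest time; `now` must be timezone-aware."""
    schedule = env.get("INGEST_SCHEDULE", "").strip()
    if not schedule:
        # Interval mode: default is 14400 seconds (4 hours).
        seconds = int(env.get("INGEST_INTERVAL_SECONDS", "14400"))
        return now + timedelta(seconds=seconds)

    # Schedule mode wins whenever INGEST_SCHEDULE is set.
    tz_name = env.get("INGEST_TIMEZONE", "UTC")
    tz = timezone.utc if tz_name == "UTC" else ZoneInfo(tz_name)
    local_now = now.astimezone(tz)
    slots = sorted(time.fromisoformat(s.strip()) for s in schedule.split(","))
    for day_offset in (0, 1):  # today's remaining slots, then tomorrow's
        day = local_now.date() + timedelta(days=day_offset)
        for slot in slots:
            candidate = datetime.combine(day, slot, tzinfo=tz)
            if candidate > local_now:
                return candidate
    raise RuntimeError("no slot found; INGEST_SCHEDULE parsed to nothing")


env = {"INGEST_INTERVAL_SECONDS": "14400", "INGEST_SCHEDULE": "07:00,18:00"}
noon = datetime(2026, 6, 4, 12, 0, tzinfo=timezone.utc)
print(next_ingest(noon, env))  # 2026-06-04 18:00:00+00:00
```

With both variables present the interval is ignored, which matches the rule above; removing `INGEST_SCHEDULE` from `env` would return 16:00 instead.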

You can also kick a manual ingest any time:

curl -X POST http://localhost:8000/ingest

A built-in hook fires after every successful ingest. Point `POST_INGEST_HOOK` at any executable on the container's filesystem (drop the script into your `./data/hooks/` directory so it survives rebuilds via the bind mount). The hook receives the freshly-built PDF path as its first argument.

POST_INGEST_HOOK=/data/hooks/push-to-remarkable.sh
POST_INGEST_HOOK_TIMEOUT=300    # optional; default 300s

Hook failures are non-fatal — a broken hook logs an error but doesn't crash the ingest loop.
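The "non-fatal" contract (PDF path as first argument, a timeout, errors logged but never raised) can be expressed with `subprocess.run`. This is a sketch of that contract rather than the project's own code; `run_hook` and the logger name are hypothetical.

```python
import logging
import subprocess

log = logging.getLogger("papernews.hook")  # hypothetical logger name


def run_hook(hook_path: str, pdf_path: str, timeout: float = 300) -> bool:
    """Run the hook with the PDF path as argv[1]; return False on any failure."""
    try:
        # check=True turns a non-zero exit status into CalledProcessError.
        subprocess.run([hook_path, pdf_path], check=True, timeout=timeout)
        return True
    except (OSError, subprocess.SubprocessError) as exc:
        # OSError: missing or non-executable script.
        # SubprocessError: non-zero exit (CalledProcessError) or TimeoutExpired.
        log.error("post-ingest hook failed: %s", exc)
        return False


ok = run_hook("/data/hooks/does-not-exist.sh", "/tmp/today.pdf")
print(ok)  # False
```

Returning a boolean instead of raising is what lets the ingest loop carry on after a broken hook.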

Drop this in `./data/hooks/push-to-remarkable.sh` and `chmod +x` it:

#!/usr/bin/env bash
set -euo pipefail

PDF="$1"
REMARKABLE="root@10.11.99.1"            # adjust to your device's IP
SSH_KEY=/data/hooks/remarkable_id_ed25519

scp -i "$SSH_KEY" -o StrictHostKeyChecking=accept-new \
    "$PDF" "$REMARKABLE:/home/root/papernews.pdf"

ssh -i "$SSH_KEY" "$REMARKABLE" 'systemctl restart xochitl'

Generate a passwordless key (`ssh-keygen -t ed25519 -f data/hooks/remarkable_id_ed25519 -N ""`), add the `.pub` to the reMarkable's `/home/root/.ssh/authorized_keys` once, and from then on every ingest pushes the new paper to your device.

The same pattern works for Kindle (`scp` over USB networking), a network printer (`lp -d papernews "$PDF"`), an email (`mutt -a "$PDF"`), or anything else you can script.

Modest, no-network unittest suite for the web/scheduling/hook behaviour:

python -m unittest discover -s tests

You don't have to use Docker — the CLI works directly:

python3 -m venv .venv
.venv/bin/pip install -e .
export ANTHROPIC_API_KEY=sk-ant-...

.venv/bin/python -m papernews gather       # fetch + extract
.venv/bin/python -m papernews summarize    # claude pass 1 (batched)
.venv/bin/python -m papernews rewrite      # claude pass 2 (batched, streamed)
.venv/bin/python -m papernews render       # xelatex → PDF
.venv/bin/python -m papernews build

Requirements: Python 3.11+, `xelatex` (TeX Live with `texlive-xetex`, `texlive-latex-extra`, `lmodern`), `pdftoppm` (poppler).

Everything visual lives in one file: papernews/template.tex.j2.

- Page size: `paperwidth=157mm, paperheight=210mm` (tuned for reMarkable Pro)
- Body font: Latin Modern Roman 10pt
- Two-column body for any article over 2000 characters; single-column otherwise
- First-line paragraph indent instead of vertical `\parskip` (classic magazine convention)
- Microtype protrusion + expansion
- Letter-spacing on small-caps source labels via fontspec's `LetterSpace`

Customize whatever you like. The Jinja delimiters are LaTeX-safe (`((* ... *))` for blocks, `((( ... )))` for variables) so your `{`, `}` and `\` don't fight each other.

Roughly per ingest cycle, with Claude Haiku 4.5 (default model):

~50 articles
Summarize: 6 batched calls (~8 articles each)
Rewrite: 6 batched calls, streamed
World-news compress: 1 call

Order-of-magnitude: a few cents to a few tens of cents per cycle depending on article lengths. At 6 cycles/day that's well under $1/day. Going to Sonnet or Opus multiplies the bill ~10–30×.
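The call counts above follow from the batch size, and the arithmetic is easy to reproduce. In this sketch `calls_per_cycle` is a hypothetical helper and the batch size of 8 is taken from the pipeline description; 48 articles is used so the result lines up with the "6 batched calls" figure, whereas exactly 50 articles would need a seventh, partial batch under this assumption.

```python
import math

BATCH_SIZE = 8  # "up to 8 articles per Claude call"


def calls_per_cycle(n_articles: int, batch_size: int = BATCH_SIZE) -> dict[str, int]:
    """API calls for one ingest: two batched passes plus one world-news call."""
    batches = math.ceil(n_articles / batch_size)
    return {
        "summarize": batches,
        "rewrite": batches,
        "world_news": 1,
        "total": 2 * batches + 1,
    }


print(calls_per_cycle(48))
# {'summarize': 6, 'rewrite': 6, 'world_news': 1, 'total': 13}
print(calls_per_cycle(48)["total"] * 6)  # 78 calls/day at 6 cycles/day
```

Call count is only a rough proxy for cost; the bill depends on token volume, which is why article length matters more than the number of requests.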

Set a spend cap at https://console.anthropic.com/settings/billing → Spend limits — the run-loop can't surprise you above whatever you set.

- All data lives on your machine (`./data/state.db` + `./data/archive/cache/`).
- Article text is sent to the Anthropic API for summarization and rewriting. That's the only outbound destination for content (besides fetching the feeds themselves).
- No analytics, no telemetry, no third-party scripts in the landing page.

papernews/
├── papernews/
│   ├── fetch.py          # HN Algolia + RSS feedparser
│   ├── extract.py        # trafilatura
│   ├── summarize.py      # Anthropic SDK, batched
│   ├── rewrite.py        # Anthropic SDK, batched + streamed
│   ├── wiki.py           # World news / Quote / DYK / tech feeds
│   ├── store.py          # SQLite article store + queries
│   ├── render.py         # Jinja + xelatex
│   ├── preview.py        # PDF → PNG via pdftoppm
│   ├── cache.py          # On-disk cache by content hash
│   ├── cli.py            # papernews command
│   ├── web.py            # Flask + APScheduler
│   └── template.tex.j2   # the magazine
├── sources.toml          # configured feeds
├── pyproject.toml
├── Dockerfile
├── docker-compose.yml
└── data/                 # gitignored — your SQLite + cached PDFs

Open an issue first if you're planning something non-trivial — happy to talk about direction. The codebase is small enough that you can read it end to end in an hour.

MIT — see LICENSE.

Working name; happy to take suggestions. The vibe is: an old-fashioned daily paper, not a feed. You read it once, then you put it down.

