[ARTICLE · art-19732] src=dev.to pub= topic=ai-tools verified=true sentiment=↑ positive

How I use pluckmd to read blogs with an AI agent

A developer created pluckmd, a CLI tool that extracts blog articles into markdown without per-site configuration, to enable AI agents to read and index web content. The tool automatically handles JavaScript-heavy pages by switching to a real browser, supports logged-in sessions, and integrates with coding agents like Claude Code and Codex to automate the workflow of collecting articles, building wikis, and generating interactive HTML study pages.

3 min read · published Jun 2, 2026

I wanted to read blog posts with an LLM in the loop, not just on my own.

The push came from two places: Karpathy's LLM Wiki idea, where the model keeps a folder of markdown notes as you learn a topic, and Thariq's post on how well Claude generates interactive HTML, which is now on the Anthropic blog. Put together, the workflow I wanted looked like this: pull blog articles into markdown, have an agent index them into a wiki, then generate interactive HTML pages to learn from.

Step one was the blocker. Getting clean articles out of a website kept breaking, and every tool wanted a config per site. So I made pluckmd to handle just that part. This post covers how I use it; the architecture write-up is separate.

npx pluckmd download https://example.com/blog -o ./articles

That walks the listing page, follows pagination, pulls each article, and writes markdown with frontmatter (title, date, author, tags). On a small blog I get maybe 5 posts saved in a few seconds. No site config, no setup.
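Once the files are on disk, the frontmatter makes them easy to scan. Here is a minimal sketch, assuming the frontmatter is a leading block delimited by `---` lines with simple `key: value` pairs. The exact layout pluckmd writes is not shown in this post, and the function names are made up for illustration.

```python
from pathlib import Path


def read_frontmatter(text: str) -> dict[str, str]:
    """Return top-level `key: value` pairs from a leading `---` block.

    Assumed format; nested values and list items are skipped, not parsed.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no frontmatter block at the top of the file
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta  # closing delimiter: frontmatter is complete
        key, sep, value = line.partition(":")
        # Ignore indented lines and list items such as "  - ai".
        if sep and key.strip() and not key.startswith((" ", "-")):
            meta[key.strip()] = value.strip().strip("\"'")
    return {}  # block never closed, so treat it as absent


def list_articles(folder: str) -> list[tuple[str, str, str]]:
    """Collect (date, title, filename) for each markdown file, sorted by date."""
    rows = []
    for path in sorted(Path(folder).glob("*.md")):
        meta = read_frontmatter(path.read_text(encoding="utf-8"))
        rows.append((meta.get("date", ""), meta.get("title", path.stem), path.name))
    return sorted(rows)


if __name__ == "__main__":
    for date, title, name in list_articles("./articles"):
        print(f"{date or 'undated':<12} {title}  ({name})")
```

Run it after a download to get a dated list of what was saved. Files without frontmatter fall back to the filename as the title.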

If a page is heavy on JavaScript, it quietly switches to a real browser to render it. You don't pick that; it decides.
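How pluckmd makes that call is not covered in this post. As a rough illustration only, one common heuristic for this kind of fallback is to check whether the static HTML has scripts but almost no visible text. The threshold and the names below are assumptions, not pluckmd's actual rule.

```python
from html.parser import HTMLParser


class _TextCounter(HTMLParser):
    """Counts visible text characters and <script> tags in static HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.text_chars = 0
        self.scripts = 0
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.scripts += 1
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only count text that a reader would actually see.
        if not self._skip_depth:
            self.text_chars += len(data.strip())


def needs_browser(html: str, min_text_chars: int = 200) -> bool:
    """Guess whether a page must be rendered: scripts present, little text."""
    counter = _TextCounter()
    counter.feed(html)
    counter.close()
    return counter.scripts > 0 and counter.text_chars < min_text_chars
```

A single-page app shell with an empty root element and a script bundle trips the check, while a server-rendered article with a few paragraphs of text does not.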

A lot of the writing I actually care about sits behind a login. Two ways to handle it.

pluckmd login https://example.com/login

That opens a browser once, you log in by hand, and the session sticks around. After that, normal downloads just work.

Or if you'd rather not hand it credentials at all, open the page in Chrome with the extension installed and run:

pluckmd download --active-tab -o ./articles

It reads straight from the tab you're already logged into. The CLI itself never reads your cookies.

This is the reason it exists for me. I don't actually run the CLI by hand most of the time. pluckmd ships skills for Claude Code and Codex, so I just talk to the agent and it runs the right commands for me.

The whole learning loop is three messages:

Collect the posts from https://example.com/blog

The agent runs the download and saves everything as markdown into raw/.

Build a wiki from them

It reads the markdown, pulls out the concepts, and links them into wiki notes (works as an Obsidian vault). That's the Karpathy LLM Wiki part, a set of notes the model maintains as I learn.
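To make the linking step concrete, here is a small sketch of the shape such notes could take. Pulling concepts out of an article is the agent's job. This only shows the linking, using Obsidian's `[[note]]` wikilink syntax. The input mapping and the function name are hypothetical.

```python
from collections import defaultdict


def build_concept_notes(concepts_by_article: dict[str, list[str]]) -> dict[str, str]:
    """Map each concept to a markdown note linking its sources and neighbours."""
    sources = defaultdict(set)     # concept -> articles that mention it
    neighbours = defaultdict(set)  # concept -> concepts seen in the same article
    for article, concepts in concepts_by_article.items():
        for concept in concepts:
            sources[concept].add(article)
            neighbours[concept].update(c for c in concepts if c != concept)

    notes = {}
    for concept in sorted(sources):
        lines = [f"# {concept}", "", "## Sources"]
        lines += [f"- [[{a}]]" for a in sorted(sources[concept])]
        if neighbours[concept]:
            lines += ["", "## Related"]
            lines += [f"- [[{c}]]" for c in sorted(neighbours[concept])]
        notes[concept] = "\n".join(lines) + "\n"
    return notes
```

Each returned string can be written to its own `.md` file named after the concept. Concepts that appear in the same article end up cross-linked under "Related".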

Generate interactive HTML for this concept

It turns a concept into an interactive HTML page to study from, the Thariq HTML idea. The raw files stay untouched, the wiki and the HTML are things the agent regenerates.
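The pages the agent produces are richer than anything worth hard-coding, but the simplest version of "interactive" needs no JavaScript at all. As an illustrative sketch (the function and its card format are assumptions, not something pluckmd ships), click-to-reveal flashcards can be built from plain `<details>` elements:

```python
from html import escape


def render_study_page(concept: str, cards: list[tuple[str, str]]) -> str:
    """Render question/answer pairs as click-to-reveal <details> blocks."""
    items = "\n".join(
        "  <details>\n"
        f"    <summary>{escape(question)}</summary>\n"
        f"    <p>{escape(answer)}</p>\n"
        "  </details>"
        for question, answer in cards
    )
    return (
        "<!doctype html>\n"
        '<html lang="en">\n'
        f'<head><meta charset="utf-8"><title>{escape(concept)}</title></head>\n'
        "<body>\n"
        f"  <h1>{escape(concept)}</h1>\n"
        f"{items}\n"
        "</body>\n"
        "</html>\n"
    )
```

Because the page is generated from the notes rather than edited by hand, it can be thrown away and rebuilt whenever the wiki changes.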

So I never touch flags or paths unless I want to. I describe what I want, and the agent drives pluckmd. And if you don't have an LLM key set for the extraction itself, it still works: pluckmd writes out a file describing the page, and the agent reads that and produces the extraction rules. The agent is the brain, the CLI is the hands.

Honestly, not every site cooperates. I hit a couple of layouts where the heuristics couldn't find a clean article pattern and it had to lean on the agent fallback. Infinite scroll feeds are hit or miss depending on how the load-more is wired up. If you try it on something exotic and it flops, that's useful to me.

npm install -g pluckmd

Repo (MIT): https://github.com/taisei-ide-0123/pluckmd

Curious what people are pointing their agents at. What would you want read into a wiki first?
