[ARTICLE · art-22032] src=dev.to pub= topic=artificial-intelligence verified=true sentiment=· neutral

I Built a Python Pipeline That Drafts Affiliate Articles Locally with Claude — Here's the Code, the 41-Second Run, and the Bug T

A developer built a 180-line Python pipeline that generates affiliate article drafts locally using Claude, producing a Markdown file in 41 seconds without any SaaS dependencies. The system uses a deterministic Python step for link insertion rather than letting the LLM handle URLs, preventing hallucinated affiliate links that could damage trust and revenue. Over six weeks, the pipeline produced 17 drafts from a daily morning run, with the key technical insight being the use of forced tool calls in the Anthropic Messages API to achieve 100% reliable structured JSON output.

7 min read · published Jun 4, 2026

If you read this, you'll be able to run a small Python pipeline on your own laptop that: (1) generates a draft article from a topic + a keyword list, (2) injects your affiliate links only where they're contextually relevant, and (3) refuses to save anything where the title doesn't match the body. No SaaS, no cron server: just run python pipeline.py "Laravel N+1" and a Markdown file lands in out/.

I run this every morning. Over 6 weeks it produced 17 drafts; my honest conversion is still low (think single-digit clicks, not "¥100,000 a month"), but the machinery works and the failure modes are interesting. This is the build log, not a get-rich post.

The whole thing is ~180 lines. The non-obvious design decision: the LLM never touches your affiliate links. Claude writes prose; a deterministic Python step does link insertion. Why? Because the first version let the model embed links, and Claude happily invented https://amzn.to/laravel-pro, a URL that does not exist. Hallucinated affiliate links are worse than no links: they leak trust and earn nothing.
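The prompt tells the model not to emit URLs, but a prompt is a request, not a guarantee. A minimal sketch of a deterministic check that flags URL-looking text in generated prose before link injection runs. The helper name find_urls and the regex are illustrative assumptions, not part of the original pipeline.

```python
import re

# Hypothetical pattern: http(s) URLs and bare "www." hosts.
# It is deliberately simple and may keep trailing punctuation such as a final ".".
URL_RE = re.compile(r"(?:https?://|www\.)[^\s)>\]]+", re.IGNORECASE)

def find_urls(text: str) -> list[str]:
    """Return every URL-looking substring found in model-written text."""
    return URL_RE.findall(text)

sample = "Read more at https://amzn.to/laravel-pro for details."
print(find_urls(sample))            # ['https://amzn.to/laravel-pro']
print(find_urls("no links here"))   # []
```

A non-empty result could be treated the same way as a failed validation: reject the draft rather than try to repair it.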

So the contract is: the generation step (claude-opus-4-8 via the Anthropic SDK) returns {title, sections[], keywords_used[]}, and everything after that is deterministic Python.

Here is the generation core. It uses the Anthropic Messages API with a forced JSON shape via a tool definition. That is the reliable way to get structured output, far better than "please return JSON" in the prompt.

import json, os, re, sys, pathlib
from anthropic import Anthropic

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
MODEL = "claude-opus-4-8"

ARTICLE_TOOL = {
    "name": "emit_article",
    "description": "Return the drafted technical article as structured data.",
    "input_schema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "sections": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "h2": {"type": "string"},
                        "body_md": {"type": "string"},
                    },
                    "required": ["h2", "body_md"],
                },
            },
            "keywords_used": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["title", "sections", "keywords_used"],
    },
}

def draft(topic: str, keywords: list[str]) -> dict:
    prompt = (
        f"You are a senior backend engineer. Write a hands-on article on: {topic}.\n"
        f"Each H2 must contain at least one of these search keywords: {keywords}.\n"
        "Include real numbers and one runnable code block per section. "
        "Do NOT include any URLs or affiliate links — leave linking to the pipeline."
    )
    resp = client.messages.create(
        model=MODEL,
        max_tokens=4000,
        tools=[ARTICLE_TOOL],
        tool_choice={"type": "tool", "name": "emit_article"},
        messages=[{"role": "user", "content": prompt}],
    )
    for block in resp.content:
        if block.type == "tool_use":
            return block.input
    raise RuntimeError("model did not call emit_article")

The tool_choice setting that forces emit_article is the part that took me three tries to get right. Without it, ~1 in 8 runs returned a chatty text block ("Sure! Here's your article...") and my json.loads blew up. Forcing the tool dropped that failure rate to zero across the last 60 runs.
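Forcing the tool settles the shape of the reply, but it may still be worth checking that every required field actually arrived, for example if generation stops near the max_tokens limit. The sketch below is one way to do that with plain Python. The function name validate_article and the sample payloads are assumptions made for illustration.

```python
def validate_article(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload looks usable."""
    problems: list[str] = []

    title = payload.get("title")
    if not isinstance(title, str) or not title.strip():
        problems.append("title missing or empty")

    sections = payload.get("sections")
    if not isinstance(sections, list) or not sections:
        problems.append("sections missing or empty")
    else:
        for i, sec in enumerate(sections):
            if not isinstance(sec, dict):
                problems.append(f"section {i} is not an object")
                continue
            for key in ("h2", "body_md"):
                # Short-circuit keeps sec[key] from being read when it is absent.
                if not isinstance(sec.get(key), str) or not sec[key].strip():
                    problems.append(f"section {i} lacks {key}")

    if not isinstance(payload.get("keywords_used"), list):
        problems.append("keywords_used missing")
    return problems

good = {
    "title": "Fixing N+1",
    "sections": [{"h2": "Eager loading", "body_md": "Use with()."}],
    "keywords_used": ["eager "],
}
truncated = {"title": "Fixing N+1", "sections": [{"h2": "Eager loading"}]}

print(validate_article(good))        # []
print(validate_article(truncated))   # ['section 0 lacks body_md', 'keywords_used missing']
```

Calling this on the dict returned by draft() before any further processing keeps a half-finished payload from reaching the link step.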

This is the boring part that actually protects revenue. I keep a hand-written table of links I'm actually registered for (A8.net, an affiliate-enabled book retailer, etc.), each with a list of trigger keywords. Python inserts a link only when a section genuinely discusses that topic, and never more than one per ~400 words — because a wall of affiliate links is the fastest way to get a reader to bounce and an editor to flag spam.

LINK_TABLE = [
    {
        "triggers": ["n+1", "eloquent", "query log", "eager "],
        "anchor": "a practical Laravel performance book",
        "url": "https://example-a8-link/laravel-perf",  # your real A8 tracking URL
    },
    {
        "triggers": ["new nisa", "index fund", "brokerage"],
        "anchor": "open a tsumitate NISA account",
        "url": "https://example-a8-link/nisa",
    },
]

def inject_links(body_md: str) -> tuple[str, int]:
    words = max(len(body_md.split()), 1)
    budget = max(1, words // 400)          # about 1 link per 400 words, with a floor of 1 per section
    low = body_md.lower()
    inserted = 0
    for link in LINK_TABLE:
        if inserted >= budget:
            break
        if any(t in low for t in link["triggers"]):
            md_link = f"[{link['anchor']}]({link['url']})"
            body_md += f"\n\n> 📚 Related: {md_link}"
            inserted += 1
    return body_md, inserted

Measured behavior on my last 17 drafts: average 1.3 links per article, and 4 articles got zero links because no section matched a trigger, which is exactly what I want. An off-topic affiliate link converts at ~0% and costs you credibility. Letting an article ship with zero links is a feature.
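Because inject_links compares triggers against a lowercased body, a trigger typed with a capital letter would silently never match. A small start-up check can catch that kind of misconfiguration before the first API call. This is a sketch: check_link_table is a hypothetical helper, and the sample rows use placeholder URLs.

```python
from urllib.parse import urlparse

def check_link_table(table: list[dict]) -> list[str]:
    """Return human-readable problems found in a link table."""
    errors: list[str] = []
    for i, row in enumerate(table):
        triggers = row.get("triggers") or []
        if not triggers:
            errors.append(f"row {i}: no triggers")
        for t in triggers:
            if t != t.lower():
                errors.append(f"row {i}: trigger {t!r} is not lowercase")
        parsed = urlparse(row.get("url", ""))
        if parsed.scheme != "https" or not parsed.netloc:
            errors.append(f"row {i}: url must be an absolute https URL")
        if not row.get("anchor", "").strip():
            errors.append(f"row {i}: empty anchor text")
    return errors

sample = [
    {"triggers": ["n+1", "Eloquent"], "anchor": "a Laravel book",
     "url": "https://example.invalid/book"},
    {"triggers": [], "anchor": "", "url": "example.invalid/nisa"},
]
for problem in check_link_table(sample):
    print(problem)
```

Running it once at import time and exiting on any error keeps a typo in the table from quietly costing a month of missing links.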

Here's the failure story. Early on, my title prompt and my body prompt were two separate Claude calls. On three mornings the title said "Laravel Eloquent N+1" while the body had drifted into MySQL index design — because the second call had no memory of the first. I didn't notice until a reader DMed me "the title is lying." Mortifying.

Fix: one call returns both (already done above), plus a deterministic gate that runs before anything is written to disk. If fewer than 2 meaningful title tokens appear in the body, the draft is rejected — no file, non-zero exit code, loud message.

STOP = {"the", "a", "to", "in", "with", "and", "of", "for", "how", "i"}

def title_matches_body(title: str, body: str) -> bool:
    toks = [t for t in re.findall(r"[a-z0-9+]+", title.lower()) if t not in STOP]
    body_low = body.lower()
    hits = sum(1 for t in toks if t in body_low)
    return hits >= 2          # require 2+ real title tokens in the body

def build(topic: str, keywords: list[str]) -> pathlib.Path:
    art = draft(topic, keywords)
    parts = [f"# {art['title']}\n"]
    for sec in art["sections"]:
        body, n = inject_links(sec["body_md"])
        parts.append(f"## {sec['h2']}\n\n{body}\n")
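    full = "\n".join(parts)
    body_only = "\n".join(parts[1:])   # skip the "# title" line, which would match every title token

    if not title_matches_body(art["title"], body_only):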
        raise SystemExit(f"REJECTED: title/body drift -> {art['title']!r}")

    slug = re.sub(r"[^a-z0-9]+", "-", art["title"].lower()).strip("-")[:60]
    out = pathlib.Path("out") / f"{slug}.md"
    out.parent.mkdir(exist_ok=True)
    out.write_text(full, encoding="utf-8")
    return out

if __name__ == "__main__":
    topic = sys.argv[1] if len(sys.argv) > 1 else "Laravel Eloquent N+1"
    kws = ["eloquent", "whereHas", "eager ", "query log"]
    path = build(topic, kws)
    print(f"wrote {path}")

Since adding title_matches_body, the gate has rejected 2 of the last 31 runs, both genuine drifts where Claude wandered off-topic in a long section. Two prevented embarrassments for the cost of a 5-line function. The >= 2 threshold matters: at >= 1, a single accidental token like "the" (before I added the stoplist) passed garbage; at >= 3, legitimate short titles got rejected. Two is the sweet spot for my title lengths.
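The threshold trade-off is easy to see on a toy input. The sketch below re-implements the token count with the stoplist as a parameter so both failure modes can be reproduced; the helper name title_hits and the sample sentences are made up for illustration.

```python
import re

STOP = {"the", "a", "to", "in", "with", "and", "of", "for", "how", "i"}

def title_hits(title: str, body: str, stop: set[str] = STOP) -> int:
    """Count non-stopword title tokens that occur somewhere in the body."""
    toks = [t for t in re.findall(r"[a-z0-9+]+", title.lower()) if t not in stop]
    body_low = body.lower()
    return sum(1 for t in toks if t in body_low)

on_topic = "Eager loading removes the N+1 pattern in Eloquent relationships."
drifted = "A composite index lets MySQL skip the filesort for this query."

print(title_hits("Fixing the Eloquent N+1", on_topic))        # 2 -> passes at >= 2
print(title_hits("Fixing the Eloquent N+1", drifted))         # 0 -> rejected
print(title_hits("Fixing the Eloquent N+1", drifted, set()))  # 1 -> "the" alone would pass at >= 1
print(title_hits("Eloquent N+1", on_topic))                   # 2 -> a two-token title can never reach 3
```

The check uses substring matching, so a very short token can match inside a longer word; the stoplist is what keeps that from mattering in practice.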

On an M-class / Ryzen laptop the bottleneck is entirely the API round-trip, not Python. Nearly all of the 41-second run is the single generation call (max_tokens=4000, usually ~3,800 used).

I deliberately do not fan out 10 topics in parallel. One article a day, hand-reviewed before posting, keeps quality up and keeps me off platform spam filters, which is the real constraint, not throughput. The machine could do 30 in 20 minutes; that's exactly the trap that gets accounts flagged.
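To check where a run's seconds go on your own machine, each stage can be wrapped in a small timer. This is a sketch using only the standard library. The stage names and the stand-in workloads are placeholders, not measurements from the original pipeline.

```python
import time
from contextlib import contextmanager

timings: dict[str, float] = {}

@contextmanager
def timed(stage: str):
    """Record the wall-clock duration of the wrapped block under `stage`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start

with timed("generate"):
    time.sleep(0.05)      # stand-in for the API call
with timed("post-process"):
    sum(range(1000))      # stand-in for link injection and the title gate

for stage, secs in timings.items():
    print(f"{stage:>12}: {secs:.3f}s")
```

In the real script the two with-blocks would wrap draft() and the link/gate steps inside build().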

The local script is the unit; GitHub Actions is just a free cron that runs it and commits the result. The keys live in repo secrets, never in the file. Cost note: at current Opus pricing, ~3,800 output tokens is a few cents per run — call it the price of a vending-machine coffee per month, not per article.

name: daily-draft
on:
  schedule:
    - cron: "0 22 * * *"   # 22:00 UTC = 07:00 JST
  workflow_dispatch: {}
jobs:
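  draft:
    runs-on: ubuntu-latest
    permissions:
      contents: write   # the default GITHUB_TOKEN is often read-only; git push needs write access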
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: "3.11" }
      - run: pip install anthropic
      - env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: python pipeline.py "Laravel Eloquent N+1 query optimization"
      - run: |
          git config user.name "draft-bot"
          git config user.email "bot@users.noreply.github.com"
          git add out/ && git commit -m "daily draft" || echo "nothing to commit"
          git push

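The || echo "nothing to commit" guard is load-bearing: whenever a run leaves out/ unchanged, git commit exits non-zero, and without the guard the whole Action would go red for no good reason. One caveat: a rejected draft already exits non-zero from pipeline.py itself, so that step fails before the commit step ever runs. To keep a rejection (correct behavior) from looking like a failure, the pipeline step also needs something like continue-on-error: true.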
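As written, a rejected draft and a crashed run both end pipeline.py with status 1 (raising SystemExit with a string message exits with 1, as does an uncaught exception), so the workflow cannot tell them apart. One possible refinement, sketched here with hypothetical names (DraftRejected, EXIT_REJECTED, run), is to give a rejection its own exit code.

```python
EXIT_OK = 0
EXIT_REJECTED = 3   # arbitrary value chosen for this sketch

class DraftRejected(Exception):
    """Raised when the title/body gate refuses a draft."""

def run(build_fn, topic: str) -> int:
    """Call build_fn(topic) and translate the outcome into an exit code."""
    try:
        path = build_fn(topic)
    except DraftRejected as exc:
        print(f"REJECTED: {exc}")
        return EXIT_REJECTED
    print(f"wrote {path}")
    return EXIT_OK

def fake_ok(topic: str) -> str:        # stand-in for build()
    return "out/example.md"

def fake_reject(topic: str) -> str:    # stand-in for a build() that hits the gate
    raise DraftRejected("title/body drift")

print(run(fake_ok, "Laravel N+1"))       # 0
print(run(fake_reject, "Laravel N+1"))   # 3
```

The real script would end with sys.exit(run(build_with_keywords, topic)) or similar, leaving any other exception to surface as a genuine failure.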

Blunt truth from 6 weeks: the pipeline is the easy 20%. Distribution is the other 80%, and code can't fake it. My drafts that got read were the ones where the topic matched the platform's audience (concrete Laravel/Python implementation posts on a dev-heavy platform), not the generic ones. The automation's real value isn't "passive income" — it's removing the 40-minute cold-start of staring at a blank editor, so I'll actually publish 5 days a week instead of 1.

If you build this, steal three ideas specifically: (1) force structured output with tool_choice so you never parse free text; (2) keep affiliate links in deterministic Python, never in the prompt, so the model can't hallucinate a payout URL; (3) add a title↔body gate before any write, because it's the cheapest insurance against shipping something that lies to your readers.

The full ~180-line version, plus the link table format, is the same shape as above — copy the three functions and you have a working draft generator today. If you want to go deeper on the query-optimization side that these drafts target, a practical Laravel performance book is the one I keep open while editing.
