From Chat to Cron: 11 Stages to a Self-Running Claude Assistant

“This guide assumes you know Python basics and have used Claude’s chat interface. You don’t need prior API experience — we build from scratch.”

Almost everyone stops at the chat box. This guide closes that gap with 11 stages, from blank chat to a cron-scheduled assistant that runs while you sleep. Each stage is one move. You don’t need all 11 on day one; you need stage N+1.

What you’ll build: an inbox/ops assistant that classifies input, calls real tools, runs in a loop, remembers facts across sessions, and triggers itself on a schedule.

Stack: Python (Anthropic SDK → Claude Agent SDK) + MCP. Models referenced: claude-haiku-4-5-20251001, claude-sonnet-4-6, claude-opus-4-8.

Before tools and agents, get the boring fundamentals right: a stable instruction layer and machine-readable output. Skip these and every later stage inherits the mess.

01. Start in the blank chat and set a baseline.

Open a normal chat (claude.ai or the desktop app) and solve the task by hand first.

The rule: if you can’t get a clean result by hand in the chat, no amount of automation will save it. Automation multiplies whatever you start with, including the bugs.

Use it when: you’re starting any new task, or a downstream stage breaks and you need to isolate whether it’s the prompt or the plumbing.

Prompt v3 (the one that finally worked):
```text
You are an inbox triage assistant.
For the email below, output exactly one word: urgent | normal | ignore.
No explanation.

Email:
"""
Subject: prod is down
Body: checkout 500s for all users since 09:14
"""
```

02. Promote your prompt to a system prompt (move to the API).

The moment a prompt works, it stops being a message and becomes configuration. Move it out of the chat and into the API’s system field. The system prompt holds the rules that apply to every turn; the user message is just the data. Separating them is the first real engineering step.

This also unlocks model choice. Pick a tier deliberately: Haiku for cheap/fast classification, Sonnet for balanced work, Opus for hard reasoning. Triage is easy, so it doesn’t need your most expensive model.

Use it when: the same instructions repeat across requests, or you need to call the task from code instead of typing it.
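A minimal sketch of that split, assuming the official `anthropic` Python package and an `ANTHROPIC_API_KEY` in the environment. The names `TRIAGE_SYSTEM`, `VALID_LABELS`, `build_request`, and `classify` are illustrative and not from the original article; the `max_tokens` value is an arbitrary small budget for a one-word answer.

```python
"""Stage 02 sketch: rules live in `system`, data lives in the user message."""

# Illustrative constant: the stage-01 prompt, promoted to configuration.
TRIAGE_SYSTEM = (
    "You are an inbox triage assistant. "
    "For the email you are given, output exactly one word: urgent | normal | ignore. "
    "No explanation."
)

VALID_LABELS = {"urgent", "normal", "ignore"}


def build_request(email_text: str, model: str = "claude-haiku-4-5-20251001") -> dict:
    """Return the keyword arguments for client.messages.create()."""
    return {
        "model": model,
        "max_tokens": 16,                 # assumed budget: one word needs very little
        "system": TRIAGE_SYSTEM,          # the rules, identical on every call
        "messages": [{"role": "user", "content": email_text}],  # the data, different each call
    }


def classify(email_text: str) -> str:
    """Call the API and return one of VALID_LABELS (needs network access and an API key)."""
    from anthropic import Anthropic  # imported lazily so build_request works offline

    resp = Anthropic().messages.create(**build_request(email_text))
    label = resp.content[0].text.strip().lower()
    if label not in VALID_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    return label
```

Calling `classify("Subject: prod is down\nBody: checkout 500s for all users since 09:14")` would send only the email as the user message; switching tiers is a one-argument change to `build_request`.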

03. Force structure: make the output machine-readable.

A one-word answer is fine for a human. The moment code has to act on the output, free text is a liability: you’ll write regexes that break on the next response. Instead, force Claude to answer through a tool schema. With tool_choice set to a specific tool, the model is required to call that tool and return arguments that follow your JSON schema. No parsing, no guessing.

This is the hinge of the whole guide: structured output is what lets the next stage (real tools) exist. You’re teaching Claude to fill in forms instead of writing essays.

Use it when: another system consumes the output — a database, an API, a branch in your code.

```python
import json

from anthropic import Anthropic

client = Anthropic()

# Describe the OUTPUT you want as a tool's input schema
save_ticket = {
    "name": "save_ticket",
    "description": "Save a structured support ticket.",
    "input_schema": {
        "type": "object",
        "properties": {
            "priority": {"type": "string", "enum": ["low", "medium", "high"]},
            "topic":    {"type": "string"},
            "summary":  {"type": "string"},
        },
        "required": ["priority", "topic", "summary"],
    },
}

resp = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=512,
    tools=[save_ticket],
    tool_choice={"type": "tool", "name": "save_ticket"},  # FORCE this tool -> guaranteed JSON
    messages=[{"role": "user", "content": "My API keys leaked and billing is wrong."}],
)

# The structured data arrives as a dict in the tool_use block, already parsed
ticket = next(b.input for b in resp.content if b.type == "tool_use")
print(json.dumps(ticket, indent=2))
# -> {"priority": "high", "topic": "security", "summary": "Leaked API keys; billing error"}
```
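Even with a forced tool call, it can be worth checking the returned dict before another system acts on it. The sketch below is a hand-rolled check against the `save_ticket` shape; `validate_ticket` and the two constants are hypothetical helpers, and a schema library could do the same job.

```python
ALLOWED_PRIORITIES = {"low", "medium", "high"}
REQUIRED_FIELDS = ("priority", "topic", "summary")


def validate_ticket(ticket: dict) -> list[str]:
    """Return a list of problems; an empty list means the ticket is usable."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = ticket.get(field)
        # Every required field must be a non-empty string.
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")
    priority = ticket.get("priority")
    # The enum check mirrors the "enum" list in the tool schema.
    if isinstance(priority, str) and priority not in ALLOWED_PRIORITIES:
        problems.append(f"priority not in enum: {priority}")
    return problems


print(validate_ticket({"priority": "high", "topic": "security", "summary": "Leaked API keys"}))
```

An empty list means the ticket can go straight to the database or API; anything else is a reason to retry or escalate rather than guess.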

Structured output was the setup. Now Claude stops describing actions and starts taking them — first one tool, then many, then a loop that decides for itself when it’s done.

04. Give Claude one tool and close the tool-use loop.

A tool is just a function you describe to Claude. When Claude wants it, the API stops with stop_reason == "tool_use", hands you the arguments, and waits. You run the function, feed the result back, and call the API again. That back-and-forth (model asks, you answer, repeat until done) is the agentic loop. Every “agent” you’ve ever heard of is this while loop with nicer packaging.

Start with exactly one tool. One tool you can debug beats ten you can’t.

Use it when: the task needs live data or a side effect — weather, a database row, sending an email — anything the model can’t do from memory.

```python
from anthropic import Anthropic

client = Anthropic()


def get_weather(city: str) -> str:
    return f"{city}: 18C, clear"          # pretend this hits a real API


weather_tool = {
    "name": "get_weather",
    "description": "Get current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

messages = [{"role": "user", "content": "What should I wear in Berlin today?"}]

while True:
    resp = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        tools=[weather_tool],
        messages=messages,
    )
    messages.append({"role": "assistant", "content": resp.content})

    if resp.stop_reason != "tool_use":                 # Claude wants no tool -> we're done
        print("".join(b.text for b in resp.content if b.type == "text"))
        break

    tool_results = []
    for block in resp.content:
        if block.type == "tool_use":
            result = get_weather(**block.input)        # dispatch to your real function
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,                # tie the result to the request
                "content": result,
            })
    messages.append({"role": "user", "content": tool_results})  # feed back -> loop again
```
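Once a second tool arrives, the hard-coded `get_weather(**block.input)` call needs a lookup. A possible sketch of that dispatch step follows, runnable offline: the `TOOLS` registry and `run_tool` are hypothetical names, and the fake block only imitates the `id`, `name`, and `input` attributes that a tool_use block carries. Failures are reported back through the `is_error` field of the tool_result so the model can react instead of the loop crashing.

```python
from types import SimpleNamespace


def get_weather(city: str) -> str:
    return f"{city}: 18C, clear"  # stand-in, as in the loop above


# Hypothetical registry: tool name -> Python callable.
TOOLS = {"get_weather": get_weather}


def run_tool(block) -> dict:
    """Turn one tool_use block into the tool_result block to send back."""
    func = TOOLS.get(block.name)
    if func is None:
        content, is_error = f"unknown tool: {block.name}", True
    else:
        try:
            content, is_error = func(**block.input), False
        except Exception as exc:  # report the failure to the model instead of crashing
            content, is_error = f"{type(exc).__name__}: {exc}", True
    return {
        "type": "tool_result",
        "tool_use_id": block.id,
        "content": content,
        "is_error": is_error,
    }


# Offline demo with a fake tool_use block.
fake = SimpleNamespace(type="tool_use", id="toolu_demo", name="get_weather", input={"city": "Berlin"})
print(run_tool(fake)["content"])
```

In the loop above, the body of the `for block in resp.content` branch would then shrink to `tool_results.append(run_tool(block))`.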

05. Compose tools into a workflow: route, then chain.

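One tool is a feature. The jump to workflow is orchestrating several with predefined paths. Two patterns cover most of what you need, straight from Anthropic’s playbook: routing, where a cheap classification step picks the lane, and prompt chaining, where each step’s output feeds the next.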

Anthropic’s distinction matters here: a workflow runs on paths you define in code; an agent lets the model direct itself. Workflows are more predictable — reach for them whenever the steps are known.

Use it when: the task has clear stages or clearly different request types, and you want predictability over autonomy.

```python
from anthropic import Anthropic

client = Anthropic()


def ask(prompt, model="claude-haiku-4-5-20251001", system="", max_tokens=512):
    msg = client.messages.create(
        model=model, max_tokens=max_tokens, system=system,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text.strip()


def route(text):                                        # ROUTE: cheapest model picks the lane
    lane = ask(f"Classify intent, one word [refund|bug|sales]:\n{text}",
               model="claude-haiku-4-5-20251001")
    return lane.lower()


def handle(text):                                       # CHAIN: step output feeds next step
    lane = route(text)
    if lane == "bug":
        repro = ask(f"Write numbered repro steps for this bug:\n{text}",
                    model="claude-sonnet-4-6")          # harder step -> stronger model
        return ask(f"Turn these repro steps into a calm customer reply:\n{repro}")
    return ask(f"Answer this {lane} request:\n{text}")


print(handle("The app crashes when I upload a 2GB file."))
```
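The router above trusts the model to reply with exactly one of three words. A small guard can map whatever comes back onto a known lane before the code branches on it. This is a sketch under stated assumptions: `normalize_lane` and the `"general"` fallback lane are hypothetical and not part of the original workflow.

```python
import re

LANES = ("refund", "bug", "sales")


def normalize_lane(raw: str, default: str = "general") -> str:
    """Map a free-text router reply onto a known lane, or fall back to `default`."""
    # Split the reply into lowercase words so "Bug." or " REFUND\n" still match.
    for word in re.findall(r"[a-z]+", raw.lower()):
        if word in LANES:
            return word
    return default


print(normalize_lane("Bug."))
print(normalize_lane("no idea"))
```

With this in place, `route` could return `normalize_lane(lane)` instead of `lane.lower()`, so an off-script reply lands in a predictable lane rather than being interpolated into the next prompt.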

06. Close the loop into an agent (Claude Agent SDK).

Hand-writing the while loop teaches you the mechanics. For real work, don’t maintain it yourself. The Claude Agent SDK (renamed from the Claude Code SDK in Sept 2025) ships the whole loop — planning, tool calls, retries, file I/O — behind one query() call. You hand it a goal and a toolset; it runs think → act → observe until the goal is met or max_turns trips.

This is the same harness that powers Claude Code, exposed as a library. You get the agent loop, a built-in toolset (Read, Write, Bash, WebSearch…), and permission controls for free.

Use it when: your task needs more than 2–3 tool calls, or you’re rebuilding loop/retry/permission logic by hand.
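As a rough sketch of what that single call looks like, assuming the `claude-agent-sdk` package is installed and an API key is configured. The option values, the goal string, and the names `OPTION_KWARGS` and `run_agent` are illustrative choices, not the article's own listing.

```python
import asyncio

# Plain-dict view of the options so the configuration can be inspected offline.
OPTION_KWARGS = {
    "system_prompt": "You are an inbox triage agent. Work only inside the current directory.",
    "allowed_tools": ["Read", "Write"],   # a subset of the built-in tools named above
    "max_turns": 8,                       # hard stop for a confused agent
}


async def run_agent(goal: str) -> None:
    """Run one agent session and print every message the SDK streams back."""
    # Imported lazily: requires `pip install claude-agent-sdk` and credentials.
    from claude_agent_sdk import ClaudeAgentOptions, query

    options = ClaudeAgentOptions(**OPTION_KWARGS)
    async for message in query(prompt=goal, options=options):
        print(message)


# To run for real (assumed goal and paths):
# asyncio.run(run_agent("Summarize every .txt file in ./inbox into ./summary.md"))
```

The hand-written while loop from stage 04 disappears; what remains in your code is the goal, the allowlist, and the turn cap.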

One agent on a short task is solved. The problems now are reliability (it must not do dangerous things) and scale (one context window isn’t enough). Three moves: guardrails, delegation, and real-world connections.

07. Add custom tools and guardrails (hooks + permissions).

Built-in tools get you far, but your assistant needs your actions: issue a refund, update a row, post to Slack. In the Agent SDK, a custom tool is a plain async function wrapped with the @tool decorator and served as an in-process MCP server (no subprocess, no IPC). And the moment a tool can spend money or delete data, you need a hook: deterministic code the SDK runs at a lifecycle point like PreToolUse, where it can deny a call before it happens.

The model proposes; the hook disposes. Guardrails belong in code you control, not in a prompt you hope the model obeys.

Use it when: the agent can take real, irreversible actions — payments, deletes, external posts.

```python
from claude_agent_sdk import (
    tool, create_sdk_mcp_server, ClaudeAgentOptions, ClaudeSDKClient, HookMatcher,
)


# A custom tool = an async function exposed to Claude
@tool("refund", "Issue a refund in cents", {"order_id": str, "cents": int})
async def refund(args):
    # ... call your payments API here ...
    return {"content": [{"type": "text", "text": f"Refunded {args['cents']}c on {args['order_id']}"}]}


server = create_sdk_mcp_server(name="ops", version="1.0.0", tools=[refund])


# A hook = a deterministic guardrail the APP runs (not the model)
async def cap_refund(input_data, tool_use_id, context):
    if input_data["tool_name"] == "mcp__ops__refund":
        if input_data["tool_input"].get("cents", 0) > 5000:        # > $50 -> stop
            return {"hookSpecificOutput": {
                "hookEventName": "PreToolUse",
                "permissionDecision": "deny",                      # block before it runs
                "permissionDecisionReason": "Refunds over $50 need a human.",
            }}
    return {}


options = ClaudeAgentOptions(
    mcp_servers={"ops": server},
    allowed_tools=["mcp__ops__refund"],
    hooks={"PreToolUse": [HookMatcher(matcher=None, hooks=[cap_refund])]},  # gate tool calls
)
```
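Because the hook is deterministic code, its decision logic can be pulled into a pure function and unit-tested without the SDK or the model. The sketch below does that; `refund_decision`, `REFUND_TOOL`, and `CAP_CENTS` are hypothetical names, and unlike the hook above this variant also denies negative or non-integer amounts, which is an added assumption about what "fail closed" should mean here.

```python
REFUND_TOOL = "mcp__ops__refund"
CAP_CENTS = 5000  # $50, the same threshold as the hook above


def refund_decision(tool_name: str, tool_input: dict, cap_cents: int = CAP_CENTS) -> dict:
    """Return the PreToolUse deny payload, or {} to let the call proceed."""
    if tool_name != REFUND_TOOL:
        return {}  # not our tool: no opinion
    cents = tool_input.get("cents", 0)
    # bool is a subclass of int in Python, so exclude it explicitly.
    is_plain_int = isinstance(cents, int) and not isinstance(cents, bool)
    if not is_plain_int or cents < 0 or cents > cap_cents:
        return {"hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": "deny",
            "permissionDecisionReason": "Refund outside the allowed range; needs a human.",
        }}
    return {}


print(refund_decision(REFUND_TOOL, {"order_id": "A1", "cents": 7500}))
```

An async hook can then be a thin wrapper that returns `refund_decision(input_data["tool_name"], input_data["tool_input"])`, keeping the policy testable in isolation.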

08. Delegate with subagents (orchestrator–workers).

When a task sprawls — research five competitors, summarize each — don’t stuff it all into one context window. Spin up subagents: specialized agents that run in their own isolated context and return only a tight summary to the main “orchestrator.” This is Anthropic’s orchestrator–workers pattern, and it’s how you stay under context limits while doing more work in parallel.

A subagent is just a markdown file with frontmatter in .claude/agents/. Give it a narrow job, its own (cheaper) model, and a minimal toolset.

Use it when: a task splits into independent sub-tasks, or one agent’s context is overflowing with noisy intermediate output.

File: .claude/agents/researcher.md

```markdown
---
name: researcher
description: Deep-dive ONE topic and return a tight summary. Use proactively for research.
tools: WebSearch, Read
model: sonnet            # workers can run cheaper than the orchestrator
---
You are a focused research worker. Investigate ONLY the topic you are handed.
Return exactly: 5 bullet findings + 3 source URLs. No preamble, no filler.
```

The orchestrator then delegates in plain language — “Use the researcher subagent on each of these five companies, in parallel” — and each worker’s verbose digging stays in its own window. Only the summaries come back.
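The shape of that delegation can be illustrated in plain Python, with no SDK involved. In this sketch `research_worker` is a stand-in for a subagent and `orchestrate` for the main agent; both names and the fake "notes" are hypothetical. The point is only the data flow: each worker keeps its noisy scratch work local and hands back one short string.

```python
import asyncio


async def research_worker(topic: str) -> str:
    """Stand-in for a subagent: does verbose work, returns only a short summary."""
    scratch = [f"note {i} about {topic}" for i in range(50)]  # noisy intermediate output
    await asyncio.sleep(0)                                     # yield control, as real I/O would
    return f"{topic}: {len(scratch)} notes reviewed"


async def orchestrate(topics: list[str]) -> dict[str, str]:
    """Fan out one worker per topic, then keep only the summaries."""
    summaries = await asyncio.gather(*(research_worker(t) for t in topics))
    return dict(zip(topics, summaries))   # gather preserves input order


result = asyncio.run(orchestrate(["Acme", "Globex"]))
print(result)
# -> {'Acme': 'Acme: 50 notes reviewed', 'Globex': 'Globex: 50 notes reviewed'}
```

The orchestrator's "context" here is the two-entry dict, not the hundred scratch notes, which is the same economy the subagent pattern buys.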

09. Connect the real world via MCP.

Your assistant is only as useful as what it can reach. The Model Context Protocol (MCP) — Anthropic’s open standard, announced Nov 25, 2024 — is the USB-C of tools: one protocol over JSON-RPC, and any compliant server (GitHub, Slack, Postgres, your internal API) plugs in. It kills the N×M problem where every app needed a custom integration for every data source.

In the Agent SDK you register external MCP servers in mcp_servers and allow their tools by name (mcp__<server>__<tool>). Swap the server, gain a hundred new actions — no glue code.

Use it when: you need the agent to touch a system that already speaks MCP, instead of hand-writing an API wrapper.

```python
import asyncio
import os

from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient

options = ClaudeAgentOptions(
    mcp_servers={
        # External MCP server: one standard, any vendor. Swap this block, gain new tools.
        "github": {
            "type": "stdio",
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-github"],
            # Read the token from your environment instead of hardcoding it.
            # This server's README names the variable GITHUB_PERSONAL_ACCESS_TOKEN;
            # check the docs for the server version you run.
            "env": {"GITHUB_PERSONAL_ACCESS_TOKEN": os.environ["GITHUB_TOKEN"]},
        },
    },
    allowed_tools=["mcp__github__create_issue", "mcp__github__search_issues"],
)


async def main():
    async with ClaudeSDKClient(options=options) as client:
        await client.query("Open an issue titled 'Flaky CI' in my main repo.")
        async for msg in client.receive_response():
            print(msg)


asyncio.run(main())
```
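The `mcp__<server>__<tool>` naming convention is easy to mistype by hand. A tiny helper can build the allowlist from the server key and bare tool names; `mcp_tool_name` and `allowlist` are hypothetical helpers written for illustration, not SDK functions.

```python
def mcp_tool_name(server: str, tool: str) -> str:
    """Build the allowlist name for one tool on one MCP server."""
    return f"mcp__{server}__{tool}"


def allowlist(server: str, tools: list[str]) -> list[str]:
    """Allowlist entries for several tools on the same server."""
    return [mcp_tool_name(server, t) for t in tools]


print(allowlist("github", ["create_issue", "search_issues"]))
# -> ['mcp__github__create_issue', 'mcp__github__search_issues']
```

The `server` argument must match the key used in `mcp_servers`, since that key is what appears in the middle of the tool name.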

The agent works, has hands, and reaches your systems. Two final moves turn it from a thing you call into a thing that runs itself: durable memory, and a trigger that isn’t you.

10. Give it memory and manage context.

By default, every run starts amnesiac. To act like an assistant, it has to remember facts across sessions without dragging the entire history into every call — that path ends in a blown context window. The pattern is simple: keep a small, durable store, read the relevant slice in, write new facts out. Below is the hand-rolled version so you can see the mechanism.
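A minimal sketch of that mechanism, assuming a single JSON file is enough for the store. `MemoryStore`, `with_memory`, the topic key, and the example fact are all illustrative; a real assistant would likely want locking and a smarter notion of "relevant" than a topic lookup.

```python
import json
import tempfile
from pathlib import Path


class MemoryStore:
    """Tiny durable fact store: one JSON file mapping topic -> list of facts."""

    def __init__(self, path: Path):
        self.path = path

    def _load(self) -> dict:
        if self.path.exists():
            return json.loads(self.path.read_text(encoding="utf-8"))
        return {}

    def remember(self, topic: str, fact: str) -> None:
        """Write a new fact out, skipping exact duplicates."""
        data = self._load()
        facts = data.setdefault(topic, [])
        if fact not in facts:
            facts.append(fact)
        self.path.write_text(json.dumps(data, indent=2), encoding="utf-8")

    def recall(self, topic: str, limit: int = 5) -> list[str]:
        """Read only the newest `limit` facts for one topic, not the whole file."""
        return self._load().get(topic, [])[-limit:]


def with_memory(store: MemoryStore, topic: str, user_text: str) -> str:
    """Prepend the relevant slice of memory to the next prompt."""
    facts = store.recall(topic)
    if not facts:
        return user_text
    bullet_list = "\n".join(f"- {f}" for f in facts)
    return f"Known facts about {topic}:\n{bullet_list}\n\n{user_text}"


# Demo in a temporary directory; the second MemoryStore simulates a later session.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "memory.json"
    MemoryStore(path).remember("billing", "Customer prefers invoices in EUR")
    prompt = with_memory(MemoryStore(path), "billing", "Draft a reply about the March invoice.")
    print(prompt)
```

Only the few facts under one topic travel into the prompt, so the store can grow across sessions without the context window growing with it.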

In production, use the platform’s memory tool (a client-side file store Claude reads/writes across sessions) together with context editing (auto-clears stale tool calls as you approach the limit). Anthropic measured a 39% lift on internal agentic-search evals from the pair, and in a 100-turn web-search test context editing cut token use by 84% while letting runs finish that would otherwise die of context exhaustion. (Both are beta on the Claude Developer Platform — confirm the current beta flags in the docs before shipping.)

Use it when: the assistant should recall preferences/decisions across days, or long runs keep hitting the context ceiling.

11. Let it run itself (headless + cron).

The last step removes you from the trigger. Claude Code / the Agent SDK run headless with -p (print mode): one prompt in, one result out, then exit — a normal command-line program. Wrap it in a shell script, point cron at it, and your assistant now wakes on a schedule, does the job, and logs the result. That’s the line between “a tool I use” and “an assistant that runs.”

Two production notes baked into the script below: cron starts with a bare environment, so export ANTHROPIC_API_KEY explicitly; and scope --allowedTools tightly so an unattended run can’t wander.

Use it when: the task should happen on a clock or an event — not when you remember to open a tab.

```bash
#!/usr/bin/env bash
# self-runner.sh: one unattended batch run of the agent
set -euo pipefail

# cron has a bare env: set secrets explicitly
export ANTHROPIC_API_KEY="sk-ant-..."

cd "$HOME/agents/inbox"
mkdir -p logs   # the redirect below fails if logs/ does not exist

# -p = headless/print: run once, write result, exit.
# --bare = deterministic startup for cron/CI.
# --allowedTools stays narrow to limit the blast radius of an unattended run.
# Keep comments off the continued lines: text after a trailing backslash breaks the continuation.
claude -p "Triage new emails in ./inbox and draft replies into ./drafts" \
  --allowedTools "Read,Write,Bash" \
  --bare > "logs/$(date +%F).log" 2>&1

# Schedule it (crontab -e): every weekday at 08:00 the assistant runs itself
# 0 8 * * 1-5  /home/me/agents/inbox/self-runner.sh
```
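If the scheduler calls Python instead of a shell script, the same run can be wrapped with a timeout so a stuck session cannot hang indefinitely. This is a sketch, assuming the `claude` CLI is on PATH and accepts the `-p` and `--allowedTools` flags used above; `build_command`, `run_once`, and the 15-minute default are hypothetical.

```python
import os
import subprocess


def build_command(prompt: str, allowed_tools: list[str]) -> list[str]:
    """Argument list for one headless run; passing a list avoids shell quoting."""
    return ["claude", "-p", prompt, "--allowedTools", ",".join(allowed_tools)]


def run_once(prompt: str, allowed_tools: list[str], timeout_s: int = 900) -> int:
    """Run the agent once and return its exit code (requires the claude CLI)."""
    if "ANTHROPIC_API_KEY" not in os.environ:
        raise RuntimeError("ANTHROPIC_API_KEY is not set; cron starts with a bare environment")
    completed = subprocess.run(
        build_command(prompt, allowed_tools),
        capture_output=True,
        text=True,
        timeout=timeout_s,   # raises subprocess.TimeoutExpired if the run hangs
    )
    print(completed.stdout)
    return completed.returncode


print(build_command("Triage new emails in ./inbox", ["Read", "Write"]))
```

The timeout plays the same role for the process that `max_turns` plays for the agent loop: a bound that holds even when nobody is watching.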

The failure modes below are the ones that quietly waste tokens and break trust. Each is mistake → why it’s bad → the fix.

1. One mega-prompt for everything. Why it’s bad: context rot — as the window fills with instructions and history, accuracy drops and cost climbs. Fix: route first, then chain small steps.

lane = ask("Classify intent, one word [refund|bug|sales]:\n" + text, model="claude-haiku-4-5-20251001")   # split the job before you solve it

2. Parsing free text with regex. Why it’s bad: the format shifts every response and your parser shatters. Fix: force a tool schema.

tool_choice={"type": "tool", "name": "save_ticket"}   # structured output, guaranteed

3. Handing the agent every tool at once. Why it’s bad: more tools = more wrong turns and bigger blast radius. Fix: a tight allowlist.

allowed_tools=["mcp__ops__refund"]   # only what THIS job needs

4. A loop with no stop valve. Why it’s bad: a confused agent loops forever and burns your budget. Fix: cap turns and gate actions.

ClaudeAgentOptions(max_turns=8, hooks={"PreToolUse": [...]})   # bounded + guarded

5. One giant agent for a sprawling task. Why it’s bad: the single context window overflows with intermediate noise. Fix: orchestrator + subagents, each in its own context, returning only summaries.

6. Keeping everything in context “to be safe.” Why it’s bad: you hit the limit and the agent loses the thread mid-task. Fix: memory tool + context editing — persist facts to a store, clear stale tool calls (−84% tokens in Anthropic’s test).

A self-running assistant was never one clever prompt. It’s 11 small, boring upgrades stacked in order: a stable instruction layer, structured output, one tool, a loop, an SDK to own the loop, guardrails, delegation, MCP for reach, memory, and a cron trigger. Each stage is independently useful, and each one earns the next.

The throughline is the same advice Anthropic gives for agents: start simple, compose small patterns, and add machinery only when it pays for itself. Don’t build stage 11 on day one. Find the lowest stage where your assistant is still doing something by hand — and climb exactly one rung.

Do this now (5-step checklist)

Ship stage 1 today. The assistant that runs while you sleep is just stage 11 of something you already started.

## Real Numbers (After 3 Months)

- **Time saved:** 12 hours/week (from 15h manual → 3h review)
- **API cost:** ~$47/month (mostly Sonnet, some Haiku)
- **Errors caught by hooks:** 23 (mostly refund attempts > $50)
- **Context window hits:** 0 (after adding memory + context editing)

From Chat to Cron: 11 Stages to a Self-Running Claude Assistant was originally published in Towards AI on Medium.

Source: pub.towardsai.net (original article)