From Chat to Cron: 11 Stages to a Self-Running Claude Assistant

A developer guide outlines 11 stages to build a self-running Claude assistant, progressing from a basic chat prompt to a cron-scheduled system that classifies emails, calls tools, and runs autonomously. The guide emphasizes starting with a stable instruction layer and machine-readable output before adding automation, using Anthropic's API and models like claude-sonnet-4-6.

This guide assumes you know Python basics and have used Claude's chat interface. You don't need prior API experience; we build from scratch.

Almost everyone stops at the chat box. This guide closes that gap with 11 stages, from blank chat to a cron-scheduled assistant that runs while you sleep. Each stage is one move. You don't need all 11 on day one; you need stage N+1.

What you'll build: an inbox/ops assistant that classifies input, calls real tools, runs in a loop, remembers facts across sessions, and triggers itself on a schedule.

Stack: Python Anthropic SDK → Claude Agent SDK + MCP. Models referenced: claude-haiku-4-5-20251001, claude-sonnet-4-6, claude-opus-4-8.

Before tools and agents, get the boring fundamentals right: a stable instruction layer and machine-readable output. Skip these and every later stage inherits the mess.

01. Start in the blank chat and set a baseline

Open a normal chat at claude.ai (https://claude.ai/) or in the desktop app and solve the task by hand.

The rule: if you can't get a clean result by hand in the chat, no amount of automation will save it. Automation multiplies whatever you start with, including the bugs.

Use it when: you're starting any new task, or a downstream stage breaks and you need to isolate whether it's the prompt or the plumbing.

Prompt v3 (the one that finally worked):

```text
You are an inbox triage assistant.
For the email below, output exactly one word: urgent | normal | ignore.
No explanation.

Email:
"""
Subject: prod is down
Body: checkout 500s for all users since 09:14
"""
```

02. Promote your prompt to a system prompt (move to the API)

The moment a prompt works, it stops being a message and becomes configuration. Move it out of the chat and into the API's system field. The system prompt holds the rules that apply to every turn; the user message is just the data. Separating them is the first real engineering step. This also unlocks model choice.
Pick a tier deliberately: Haiku for cheap/fast classification, Sonnet for balanced work, Opus for hard reasoning. Triage is easy, so it doesn't need your most expensive model.

Use it when: the same instructions repeat across requests, or you need to call the task from code instead of typing it.

```python
# pip install anthropic
import os
from anthropic import Anthropic

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

SYSTEM = (
    "You are an inbox triage assistant. "
    "Classify each email as exactly one of: urgent, normal, ignore. "
    "Reply with one word only."
)  # the rules live here, not in every user turn

resp = client.messages.create(
    model="claude-sonnet-4-6",  # balanced tier; swap to Haiku to cut cost
    max_tokens=16,              # one word out, tiny budget
    system=SYSTEM,              # config, applied to every call
    messages=[
        {"role": "user", "content": "Subject: server on fire\nBody: prod is down"}
    ],
)
print(resp.content[0].text)  # e.g. "urgent"
```

03. Force structure: make the output machine-readable

A one-word answer is fine for a human. The moment code has to act on the output, free text is a liability: you'll write regexes that break on the next response. Instead, force Claude to answer through a tool schema. With tool_choice set to a specific tool, the model must call that tool, and its arguments arrive as an already-parsed object in your JSON shape. No parsing, no guessing.

This is the hinge of the whole guide: structured output is what lets the next stage (real tools) exist. You're teaching Claude to fill in forms instead of writing essays.

Use it when: another system consumes the output, such as a database, an API, or a branch in your code.
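Whatever consumes the structured output should still check it before acting on it. Here is a minimal sketch of that check in plain Python, assuming a ticket with the same three fields as this stage's save_ticket example (priority, topic, summary). The names validate_ticket, ALLOWED_PRIORITIES, and REQUIRED_FIELDS are hypothetical and not part of any SDK.

```python
# Hypothetical helper: check a ticket dict before downstream code acts on it.
ALLOWED_PRIORITIES = {"low", "medium", "high"}
REQUIRED_FIELDS = ("priority", "topic", "summary")


def validate_ticket(ticket: dict) -> list[str]:
    """Return a list of problems; an empty list means the ticket is usable."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = ticket.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")
    priority = ticket.get("priority")
    if isinstance(priority, str) and priority not in ALLOWED_PRIORITIES:
        problems.append(f"unexpected priority: {priority!r}")
    return problems


good = {"priority": "high", "topic": "security", "summary": "Leaked API keys"}
bad = {"priority": "urgent", "topic": "billing"}

print(validate_ticket(good))  # no problems found
print(validate_ticket(bad))   # missing summary, priority outside the enum
```

A check like this costs almost nothing and turns a malformed ticket into a retry or a log line instead of a bad database row.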
```python
import json
from anthropic import Anthropic

client = Anthropic()

# Describe the OUTPUT you want as a tool's input schema
save_ticket = {
    "name": "save_ticket",
    "description": "Save a structured support ticket.",
    "input_schema": {
        "type": "object",
        "properties": {
            "priority": {"type": "string", "enum": ["low", "medium", "high"]},
            "topic": {"type": "string"},
            "summary": {"type": "string"},
        },
        "required": ["priority", "topic", "summary"],
    },
}

resp = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=512,
    tools=[save_ticket],
    tool_choice={"type": "tool", "name": "save_ticket"},  # FORCE this tool
    messages=[
        {"role": "user", "content": "My API keys leaked and billing is wrong."}
    ],
)

# The structured data arrives as a dict in the tool_use block, already parsed
ticket = next(b.input for b in resp.content if b.type == "tool_use")
print(json.dumps(ticket, indent=2))
# e.g. {"priority": "high", "topic": "security", "summary": "Leaked API keys; billing error"}
```

Structured output was the setup. Now Claude stops describing actions and starts taking them: first one tool, then many, then a loop that decides for itself when it's done.

04. Give Claude one tool and close the tool-use loop

A tool is just a function you describe to Claude. When Claude wants it, the API stops with stop_reason == "tool_use", hands you the arguments, and waits. You run the function, feed the result back, and call the API again. That back-and-forth (model asks, you answer, repeat until done) is the agentic loop. Every "agent" you've ever heard of is this while loop with nicer packaging.

Start with exactly one tool. One tool you can debug beats ten you can't.

Use it when: the task needs live data or a side effect (weather, a database row, sending an email), anything the model can't do from memory.
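The dispatch half of that loop can be tried offline, before any API call. The sketch below turns tool_use blocks into tool_result blocks by looking the tool up by name. TOOL_REGISTRY and run_tool_calls are hypothetical names, and SimpleNamespace objects stand in for the content blocks the SDK returns, so treat it as an illustration of the shape rather than production code.

```python
from types import SimpleNamespace


def get_weather(city: str) -> str:
    return f"{city}: 18C, clear"  # stub standing in for a real weather API


# Hypothetical registry: tool name -> Python callable
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(content_blocks) -> list[dict]:
    """Run every tool_use block and build the matching tool_result blocks."""
    results = []
    for block in content_blocks:
        if block.type != "tool_use":
            continue  # text blocks need no reply
        handler = TOOL_REGISTRY.get(block.name)
        if handler is None:
            # Report unknown tools back to the model instead of crashing the loop
            results.append({"type": "tool_result", "tool_use_id": block.id,
                            "content": f"Unknown tool: {block.name}", "is_error": True})
            continue
        results.append({"type": "tool_result", "tool_use_id": block.id,
                        "content": handler(**block.input)})
    return results


# Fake response content, shaped like the blocks a tool-use response carries
fake_content = [
    SimpleNamespace(type="text", text="Let me check the weather."),
    SimpleNamespace(type="tool_use", id="toolu_01", name="get_weather",
                    input={"city": "Berlin"}),
]
print(run_tool_calls(fake_content))
```

Keeping dispatch in one small function like this makes the jump from one tool to several a one-line change to the registry.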
```python
from anthropic import Anthropic

client = Anthropic()

def get_weather(city: str) -> str:
    return f"{city}: 18C, clear"  # pretend this hits a real API

weather_tool = {
    "name": "get_weather",
    "description": "Get current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

messages = [{"role": "user", "content": "What should I wear in Berlin today?"}]

while True:
    resp = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        tools=[weather_tool],
        messages=messages,
    )
    messages.append({"role": "assistant", "content": resp.content})

    if resp.stop_reason != "tool_use":  # Claude wants no tool, so we're done
        print("".join(b.text for b in resp.content if b.type == "text"))
        break

    tool_results = []
    for block in resp.content:
        if block.type == "tool_use":
            result = get_weather(**block.input)  # dispatch to your real function
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,  # tie the result to the request
                "content": result,
            })
    messages.append({"role": "user", "content": tool_results})  # feed back, loop again
```

05. Compose tools into a workflow: route, then chain

One tool is a feature. The jump to a workflow is orchestrating several along predefined paths. Two patterns cover most of what you need, straight from Anthropic's playbook: routing (classify the input, then send it down the matching lane) and chaining (each step's output feeds the next step).

Anthropic's distinction matters here: a workflow runs on paths you define in code; an agent lets the model direct itself. Workflows are more predictable, so reach for them whenever the steps are known.

Use it when: the task has clear stages or clearly different request types, and you want predictability over autonomy.
```python
from anthropic import Anthropic

client = Anthropic()

def ask(prompt, model="claude-haiku-4-5-20251001", system="", max_tokens=512):
    msg = client.messages.create(
        model=model,
        max_tokens=max_tokens,
        system=system,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text.strip()

def route(text):  # ROUTE: cheapest model picks the lane
    lane = ask(
        f"Classify intent, one word (refund|bug|sales):\n{text}",
        model="claude-haiku-4-5-20251001",
    )
    return lane.lower()

def handle(text):  # CHAIN: step output feeds next step
    lane = route(text)
    if lane == "bug":
        repro = ask(
            f"Write numbered repro steps for this bug:\n{text}",
            model="claude-sonnet-4-6",  # harder step, stronger model
        )
        return ask(f"Turn these repro steps into a calm customer reply:\n{repro}")
    return ask(f"Answer this {lane} request:\n{text}")

print(handle("The app crashes when I upload a 2GB file."))
```

06. Close the loop into an agent (Claude Agent SDK)

Hand-writing the while loop teaches you the mechanics. For real work, don't maintain it yourself. The Claude Agent SDK (renamed from the Claude Code SDK in Sept 2025) ships the whole loop (planning, tool calls, retries, file I/O) behind one query() call. You hand it a goal and a toolset; it runs think → act → observe until the goal is met or max_turns trips.

This is the same harness that powers Claude Code, exposed as a library. You get the agent loop, a built-in toolset (Read, Write, Bash, WebSearch, …), and permission controls for free.

Use it when: your task needs more than 2–3 tool calls, or you're rebuilding loop/retry/permission logic by hand.

```python
# pip install claude-agent-sdk   (Python 3.10+, bundles the Claude CLI)
import anyio
from claude_agent_sdk import query, ClaudeAgentOptions, AssistantMessage, TextBlock

options = ClaudeAgentOptions(
    system_prompt="You are a research assistant. Cite every source.",
    allowed_tools=["WebSearch", "Read", "Write"],  # auto-approve these, prompt on the rest
    permission_mode="acceptEdits",                 # don't pause on file edits
    max_turns=8,                                   # hard rail on the loop
)

async def main():
    # query() runs the full think -> act -> observe loop for you
    async for message in query(prompt="Summarize today's AI news into news.md", options=options):
        if isinstance(message, AssistantMessage):
            for block in message.content:
                if isinstance(block, TextBlock):
                    print(block.text)

anyio.run(main)
```

One agent on a short task is solved. The problems now are reliability (it must not do dangerous things) and scale (one context window isn't enough). Three moves: guardrails, delegation, and real-world connections.

07. Add custom tools and guardrails (hooks + permissions)

Built-in tools get you far, but your assistant needs your actions: issue a refund, update a row, post to Slack. In the Agent SDK, a custom tool is a plain async function wrapped with @tool and served as an in-process MCP server (no subprocess, no IPC). And the moment a tool can spend money or delete data, you need a hook: deterministic code the SDK runs at a lifecycle point like PreToolUse, where it can deny a call before it happens.

The model proposes; the hook disposes. Guardrails belong in code you control, not in a prompt you hope the model obeys.

Use it when: the agent can take real, irreversible actions such as payments, deletes, or external posts.

```python
from claude_agent_sdk import (
    tool, create_sdk_mcp_server, ClaudeAgentOptions, ClaudeSDKClient, HookMatcher,
)

# A custom tool = an async function exposed to Claude
@tool("refund", "Issue a refund in cents", {"order_id": str, "cents": int})
async def refund(args):
    # ... call your payments API here ...
    return {"content": [{"type": "text",
                         "text": f"Refunded {args['cents']}c on {args['order_id']}"}]}

server = create_sdk_mcp_server(name="ops", version="1.0.0", tools=[refund])

# A hook = a deterministic guardrail the APP runs (not the model)
async def cap_refund(input_data, tool_use_id, context):
    if input_data["tool_name"] == "mcp__ops__refund":
        if input_data["tool_input"].get("cents", 0) > 5000:  # over $50: stop
            return {"hookSpecificOutput": {
                "hookEventName": "PreToolUse",
                "permissionDecision": "deny",  # block before it runs
                "permissionDecisionReason": "Refunds over $50 need a human.",
            }}
    return {}

options = ClaudeAgentOptions(
    mcp_servers={"ops": server},
    allowed_tools=["mcp__ops__refund"],
    hooks={"PreToolUse": [HookMatcher(matcher=None, hooks=[cap_refund])]},  # gate tool calls
)
```

08. Delegate with subagents (orchestrator–workers)

When a task sprawls (research five competitors, summarize each), don't stuff it all into one context window. Spin up subagents: specialized agents that run in their own isolated context and return only a tight summary to the main "orchestrator." This is Anthropic's orchestrator–workers pattern, and it's how you stay under context limits while doing more work in parallel.

A subagent is just a markdown file with frontmatter in .claude/agents/. Give it a narrow job, its own cheaper model, and a minimal toolset.

Use it when: a task splits into independent sub-tasks, or one agent's context is overflowing with noisy intermediate output.

File: .claude/agents/researcher.md

```markdown
---
name: researcher
description: Deep-dive ONE topic and return a tight summary. Use proactively for research.
tools: WebSearch, Read
model: sonnet  # workers can run cheaper than the orchestrator
---
You are a focused research worker. Investigate ONLY the topic you are handed.
Return exactly: 5 bullet findings + 3 source URLs. No preamble, no filler.
```
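As a rough analogy for the isolation this buys, here is a plain-Python sketch of the orchestrator–workers shape. It is not how the Agent SDK implements subagents: research_worker is a hypothetical stub, and threads stand in for separate agent contexts. The point is only that each worker's noisy intermediate state stays local and a one-line summary is all that crosses back.

```python
from concurrent.futures import ThreadPoolExecutor


def research_worker(topic: str) -> str:
    """Stub worker: does its 'verbose' work in local scratch space."""
    scratch = [f"note {i} about {topic}" for i in range(50)]  # never leaves the worker
    return f"{topic}: {len(scratch)} notes reviewed"            # only the summary returns


def orchestrate(topics: list[str]) -> list[str]:
    """Fan topics out to workers in parallel and collect only their summaries."""
    with ThreadPoolExecutor(max_workers=5) as pool:
        return list(pool.map(research_worker, topics))  # map preserves input order


summaries = orchestrate(["Acme", "Globex", "Initech"])
print(summaries)
```

In the Agent SDK the same shape comes from subagent files like the one above rather than from threads; the sketch only shows why the main context stays small.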
The orchestrator then delegates in plain language ("Use the researcher subagent on each of these five companies, in parallel"), and each worker's verbose digging stays in its own window. Only the summaries come back.

09. Connect the real world via MCP

Your assistant is only as useful as what it can reach. The Model Context Protocol (MCP), Anthropic's open standard announced Nov 25, 2024, is the USB-C of tools: one protocol over JSON-RPC, and any compliant server (GitHub, Slack, Postgres, your internal API) plugs in. It kills the N×M problem where every app needed a custom integration for every data source.

In the Agent SDK you register external MCP servers in mcp_servers and allow their tools by name (mcp__<server>__<tool>). Swap the server, gain a hundred new actions, with no glue code.

Use it when: you need the agent to touch a system that already speaks MCP, instead of hand-writing an API wrapper.

```python
import anyio
from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient

options = ClaudeAgentOptions(
    mcp_servers={
        # External MCP server: one standard, any vendor. Swap this block, gain new tools.
        "github": {
            "type": "stdio",
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-github"],
            # env var name the reference GitHub server reads
            "env": {"GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_..."},
        },
    },
    allowed_tools=["mcp__github__create_issue", "mcp__github__search_issues"],
)

async def main():
    async with ClaudeSDKClient(options=options) as client:
        await client.query("Open an issue titled 'Flaky CI' in my main repo.")
        async for msg in client.receive_response():
            print(msg)

anyio.run(main)
```

The agent works, has hands, and reaches your systems. Two final moves turn it from a thing you call into a thing that runs itself: durable memory, and a trigger that isn't you.

10. Give it memory and manage context

By default, every run starts amnesiac. To act like an assistant, it has to remember facts across sessions without dragging the entire history into every call; that path ends in a blown context window.
The pattern is simple: keep a small, durable store, read the relevant slice in, write new facts out. Below is the hand-rolled version so you can see the mechanism.

```python
# Lightweight, verifiable memory: a file read at the start and appended at the end.
# This is exactly the idea the platform's native "memory tool" automates client-side.
from pathlib import Path
from anthropic import Anthropic

client = Anthropic()
MEM = Path("memory.md")
MEM.touch(exist_ok=True)

def run(user_msg):
    memory = MEM.read_text()[-4000:]  # only the recent tail: bounded context
    resp = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=512,
        system=f"Durable notes about this user:\n{memory}",  # inject memory as context
        messages=[{"role": "user", "content": user_msg}],
    )
    answer = resp.content[0].text
    # Persist one fact (let Claude pick, in real use).
    # Append rather than rewrite, so older notes are never truncated on disk.
    with MEM.open("a") as f:
        f.write(f"\n- {user_msg[:80]}")
    return answer

print(run("I'm allergic to peanuts - remember that for any food suggestions."))
```

In production, use the platform's memory tool (a client-side file store Claude reads/writes across sessions) together with context editing (auto-clears stale tool calls as you approach the limit). Anthropic measured a 39% lift on internal agentic-search evals from the pair, and in a 100-turn web-search test context editing cut token use by 84% while letting runs finish that would otherwise die of context exhaustion. Both are beta on the Claude Developer Platform, so confirm the current beta flags in the docs before shipping.

Use it when: the assistant should recall preferences/decisions across days, or long runs keep hitting the context ceiling.

11. Let it run itself (headless + cron)

The last step removes you from the trigger. Claude Code / the Agent SDK run headless with -p (print mode): one prompt in, one result out, then exit, like a normal command-line program. Wrap it in a shell script, point cron at it, and your assistant now wakes on a schedule, does the job, and logs the result.
That's the line between "a tool I use" and "an assistant that runs." Two production notes baked into the script below: cron starts with a bare environment, so export ANTHROPIC_API_KEY explicitly; and scope --allowedTools tightly so an unattended run can't wander.

Use it when: the task should happen on a clock or an event, not when you remember to open a tab.

```bash
#!/usr/bin/env bash
# self-runner.sh: one unattended batch run of the agent
set -euo pipefail

export ANTHROPIC_API_KEY="sk-ant-..."   # cron has a bare env: set secrets explicitly
cd "$HOME/agents/inbox"

# -p = headless/print: run once, write result, exit. --bare = deterministic startup for cron/CI.
# --allowedTools: narrow blast radius for an unattended run
claude -p "Triage new emails in ./inbox and draft replies into ./drafts" \
  --allowedTools "Read,Write,Bash" \
  --bare >> "logs/$(date +%F).log" 2>&1
```

```bash
# Schedule it (crontab -e): every weekday at 08:00 the assistant runs itself
0 8 * * 1-5 /home/me/agents/inbox/self-runner.sh
```

The failure modes below are the ones that quietly waste tokens and break trust. Each is mistake → why it's bad → the fix.

1. One mega-prompt for everything.
Why it's bad: context rot. As the window fills with instructions and history, accuracy drops and cost climbs.
Fix: route first, then chain small steps.

```python
lane = ask("Classify intent, one word (refund|bug|sales):\n" + text,
           model="claude-haiku-4-5-20251001")  # split the job before you solve it
```

2. Parsing free text with regex.
Why it's bad: the format shifts every response and your parser shatters.
Fix: force a tool schema.

```python
tool_choice={"type": "tool", "name": "save_ticket"}  # structured output, not free text
```

3. Handing the agent every tool at once.
Why it's bad: more tools = more wrong turns and bigger blast radius.
Fix: a tight allowlist.

```python
allowed_tools=["mcp__ops__refund"]  # only what THIS job needs
```

4. A loop with no stop valve.
Why it's bad: a confused agent loops forever and burns your budget.
Fix: cap turns and gate actions.
```python
ClaudeAgentOptions(max_turns=8, hooks={"PreToolUse": ...})  # bounded + guarded
```

5. One giant agent for a sprawling task.
Why it's bad: the single context window overflows with intermediate noise.
Fix: orchestrator + subagents, each in its own context, returning only summaries.

6. Keeping everything in context "to be safe."
Why it's bad: you hit the limit and the agent loses the thread mid-task.
Fix: memory tool + context editing. Persist facts to a store and clear stale tool calls (−84% tokens in Anthropic's test).

A self-running assistant was never one clever prompt. It's 11 small, boring upgrades stacked in order: a stable instruction layer, structured output, one tool, a loop, an SDK to own the loop, guardrails, delegation, MCP for reach, memory, and a cron trigger. Each stage is independently useful, and each one earns the next.

The throughline is the same advice Anthropic gives for agents: start simple, compose small patterns, and add machinery only when it pays for itself. Don't build stage 11 on day one. Find the lowest stage where your assistant is still doing something by hand, and climb exactly one rung.

Do this now

Ship stage 1 today. The assistant that runs while you sleep is just stage 11 of something you already started.

Real Numbers After 3 Months
- Time saved: 12 hours/week (from 15h manual → 3h review)
- API cost: ~$47/month (mostly Sonnet, some Haiku)
- Errors caught by hooks: 23 (mostly refund attempts > $50)
- Context window hits: 0 (after adding memory + context editing)

From Chat to Cron: 11 Stages to a Self-Running Claude Assistant (https://pub.towardsai.net/from-chat-to-cron-11-stages-to-a-self-running-claude-assistant-3e8601a57154) was originally published in Towards AI (https://pub.towardsai.net) on Medium.