Build long-running AI agents that pause, resume, and never lose context with ADK

Stateless chatbots fail in enterprise workflows like HR onboarding or invoice disputes, which involve long pauses and multi-step processes spanning days or weeks. This tutorial uses the Agent Development Kit (ADK) to build long-running AI agents that can pause, resume, and maintain context without relying on raw conversation history. It covers three key architectural shifts (explicit state schemas, persistent sessions, and decoupled memory) that prevent context pollution, token cost explosion, and reasoning hallucinations during idle periods.

Most agent tutorials end at a stateless chatbot: a conversational loop that forgets everything the moment the container restarts. Real enterprise workflows don't wrap up in a single API call. HR onboarding spans two weeks. Invoice disputes stall for days waiting on vendor replies. Sales prospecting sequences stretch across multiple touchpoints over a month. These processes are dominated by "idle time": long pauses where an agent sits dormant, waiting for a human signature, a shipping confirmation, or an approval gate. A stateless chatbot can't survive that.

This tutorial walks through building a New Hire Onboarding Coordinator Agent with the Agent Development Kit (ADK, https://adk.dev/) that runs reliably for weeks. The agent sends a welcome packet, pauses for days while the employee signs documents, delegates IT provisioning to a specialized sub-agent, waits again for hardware delivery, and finally sends a personalized day-one schedule, all without losing context. Along the way, you'll learn three architectural shifts that separate production agents from demo chatbots: explicit state schemas, persistent sessions, and decoupled memory.

The complete source code is available on GitHub (https://github.com/GoogleCloudPlatform/generative-ai/tree/main/agents/adk/new-hire-onboarding).

The standard stateless pattern appends every user message and model response to a growing conversation history, then feeds the entire blob back into the next LLM call. This works fine for a five-minute Q&A session. It falls apart over days or weeks in three specific ways:

- Prompt context pollution: After hundreds of turns spread across a two-week onboarding flow, the conversation history fills up with irrelevant chatter, old tool outputs, and duplicated instructions. The model starts confusing which step it's on.
- Token cost explosion: Replaying a full two-week conversation history on every inference call burns through token budgets fast. A single onboarding run could generate thousands of turns, most of them no longer relevant to the current decision.
- Reasoning hallucinations over idle time: When an agent pauses for three days waiting on a document signature, then resumes with a massive context dump, the model frequently hallucinates intermediate steps that never happened. It "remembers" approvals that weren't given or skips steps it assumes were completed.

The fix isn't a bigger context window. It's a fundamentally different architecture, one where the agent's state is explicit, durable, and decoupled from raw chat history.

Consider what happens when a company brings on a new employee. This isn't a single conversation. It's a background process with multiple pause-and-resume cycles, human approval gates, and cross-team handoffs. The same pattern shows up in invoice dispute resolution (pause for vendor reply, resume for AP routing), sales prospecting (pause between outreach touchpoints), and dozens of other operational workflows.

The Agents CLI (https://github.com/google/agents-cli) is the official command-line interface for the Gemini Enterprise Agent Platform. Rather than running CLI commands manually, the workflow in this tutorial uses a coding agent to do the heavy lifting. Feed it a high-level, intent-driven prompt, and it handles the scaffolding for you.

First, install the CLI globally:

    uv tool install google-agents-cli

Then give your coding agent this prompt:

"Create an HR onboarding agent using ADK. It needs to run as a long-running background process with persistent sessions."

The coding agent runs the appropriate `agents-cli` commands, generates the project structure, and wires up persistent session and memory bank settings from the start. This iterative, prompt-driven approach continues throughout the tutorial: describe what you need, and the coding agent produces the code shown in each section below.
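To put rough numbers on the token cost explosion described earlier, the sketch below compares cumulative prompt tokens when every call replays the full history against a design where each call sends only a fixed-size state block plus the newest turn. The turn count and token sizes are illustrative assumptions, not measurements from the onboarding agent, and the function names are hypothetical.

```python
def replay_history_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total prompt tokens when call k resends all k turns seen so far."""
    return sum(k * tokens_per_turn for k in range(1, turns + 1))


def explicit_state_tokens(turns: int, state_tokens: int, tokens_per_turn: int) -> int:
    """Total prompt tokens when each call sends a fixed state block plus one new turn."""
    return turns * (state_tokens + tokens_per_turn)


if __name__ == "__main__":
    # Assumed figures: 1,000 turns, 200 tokens per turn, 300-token state block.
    turns, per_turn, state_block = 1000, 200, 300
    replay = replay_history_tokens(turns, per_turn)
    explicit = explicit_state_tokens(turns, state_block, per_turn)
    print(f"history replay: {replay:,} prompt tokens")
    print(f"explicit state: {explicit:,} prompt tokens")
    print(f"ratio: {replay / explicit:.1f}x")
```

Under these assumed figures, history replay grows quadratically with the number of turns and ends up roughly 200 times more expensive than the fixed-size state approach, which grows linearly.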
Instead of relying on conversation history to track progress, define an explicit state schema that tells the agent exactly where it is in the workflow at all times.

Give your coding agent this prompt:

"Add a state machine to track onboarding progress. I need steps like START, WELCOME_SENT, DOCUMENTS_SIGNED, IT_PROVISIONED, HARDWARE_DELIVERED, and COMPLETED. The agent should read its current step from the session state, not from chat history."

Create a simple class with named constants for each checkpoint in the onboarding flow:

    # app/state_schema.py
    class OnboardingStep:
        START = "START"
        WELCOME_SENT = "WELCOME_SENT"
        DOCUMENTS_SIGNED = "DOCUMENTS_SIGNED"
        IT_PROVISIONED = "IT_PROVISIONED"
        HARDWARE_DELIVERED = "HARDWARE_DELIVERED"
        COMPLETED = "COMPLETED"

Six states. No ambiguity. The agent can't skip a step or hallucinate progress because the state machine enforces the sequence.

The agent's system prompt reads its current position directly from session state variables, not from replaying old messages:

    # app/agent.py
    from google.adk.agents import Agent
    from google.adk.agents.callback_context import CallbackContext
    from google.adk.models import Gemini

    from app.state_schema import OnboardingStep
    from app.tools import (
        send_welcome_packet,
        check_hardware_delivery,
        send_day_one_schedule,
    )


    async def initialize_onboarding_state(callback_context: CallbackContext) -> None:
        """Ensures all state machine keys are initialized to prevent errors."""
        state = callback_context.state
        # Only fill in keys that are missing, so resumed sessions keep their values.
        if "current_step" not in state:
            state["current_step"] = OnboardingStep.START
        if "new_hire_details" not in state:
            state["new_hire_details"] = {}
        if "pending_signals" not in state:
            state["pending_signals"] = []


    instruction = """You are an HR Onboarding Coordinator Agent.

    Current Step: {current_step}
    New Hire Details: {new_hire_details}
    Pending Signals: {pending_signals}

    Follow this state machine flow exactly:
    1. If current_step is 'START': Ask for name, email, and start date. Then invoke 'send_welcome_packet'.
    2. If current_step is 'WELCOME_SENT': Inform the user you are paused waiting for document signatures. Do not call other tools.
    3. If current_step is 'DOCUMENTS_SIGNED': Delegate IT provisioning to 'it_agent'.
    4. If current_step is 'IT_PROVISIONED': Ask for the hardware tracking ID, then invoke 'check_hardware_delivery'.
    5. If current_step is 'HARDWARE_DELIVERED': Invoke 'send_day_one_schedule'.
    6. If current_step is 'COMPLETED': Confirm onboarding is done.

    Always stay grounded in your tools and current state. Do not skip steps."""

Because `{current_step}`, `{new_hire_details}`, and `{pending_signals}` appear directly in the instruction, ADK fills in these placeholders from session state every time the agent runs. The model always sees the exact status of the onboarding workflow without needing to guess or dig through old chat messages.

Each tool function updates the checkpoint atomically through ADK's `ToolContext.state`:

    # app/tools.py
    from google.adk.tools import ToolContext

    from app.state_schema import OnboardingStep


    def send_welcome_packet(
        name: str, email: str, start_date: str, tool_context: ToolContext
    ) -> dict:
        """Sends the welcome packet and transitions to WELCOME_SENT."""
        state = tool_context.state
        state["new_hire_details"] = {
            "name": name,
            "email": email,
            "start_date": start_date,
        }
        # Advance the checkpoint and record what the agent is now waiting for.
        state["current_step"] = OnboardingStep.WELCOME_SENT
        state["pending_signals"] = ["document_signed"]
        return {
            "status": "success",
            "message": f"Welcome packet sent to {name} ({email}). Documents pending signature.",
        }

Every tool call creates an automatic checkpoint. If the container crashes immediately after `send_welcome_packet` runs, the state has already been written. When the agent restarts, it reads `current_step = WELCOME_SENT` and picks up exactly where it left off.

The state machine is only durable if the underlying session storage survives restarts.
In a containerized environment like Cloud Run (https://cloud.google.com/run?e=48754805), containers cold-start, scale to zero during idle periods, and restart unexpectedly. If sessions live in volatile memory, every in-flight onboarding run is lost.

Give your coding agent this prompt:

"Switch our session storage to persistent SQLite so the agent survives server restarts."

Swap in-memory sessions for ADK's `DatabaseSessionService` (backed by SQLite locally or Cloud SQL in production):

    # app/fast_api_app.py
    from fastapi import FastAPI
    from google.adk.cli.fast_api import get_fast_api_app
    from google.adk.sessions.database_session_service import DatabaseSessionService

    # Persistent SQLite session configuration
    session_service_uri = "sqlite+aiosqlite:///sessions.db"

    # AGENT_DIR is defined elsewhere in this module; it is not shown in this excerpt.
    app: FastAPI = get_fast_api_app(
        agents_dir=AGENT_DIR,
        web=True,
        session_service_uri=session_service_uri,
    )

That's it. One configuration change, and every `ToolContext.state` write is durably persisted to disk. Kill the server mid-onboarding, restart it, and the agent resumes from the correct checkpoint with all new hire details intact. For production deployments, replace the SQLite URI with a Cloud SQL connection string; the API is identical.

Idle time is the defining challenge of long-running agents. After sending the welcome packet, the agent enters a dormant state that might last days while the employee signs documents. Active polling wastes compute. Blocked threads don't scale. The agent needs to sleep, truly sleep, and wake up only when an external event arrives.

Give your coding agent this prompt:

"Add webhook endpoints for document signature and hardware delivery. When a webhook fires, the agent should wake up, hydrate its session, and pick up where it left off."
Expose FastAPI endpoints that external systems (or a demo UI) call when real-world events complete:

    # app/fast_api_app.py
    from google.adk.runners import Runner
    from pydantic import BaseModel

    from app.agent import app as agent_app  # assumed: the ADK app object exported by app/agent.py
    from app.resume_handler import OnboardingResumeHandler

    db_session_service = DatabaseSessionService(db_url=session_service_uri)
    webhook_runner = Runner(app=agent_app, session_service=db_session_service)
    resume_handler = OnboardingResumeHandler(runner=webhook_runner)


    class WebhookPayload(BaseModel):
        user_id: str
        session_id: str


    @app.post("/webhooks/document_signed")
    async def trigger_document_signed_webhook(payload: WebhookPayload) -> dict[str, str]:
        """Wakes up the onboarding agent when the employee signs their contract."""
        await resume_handler.receive_signed_documents_callback(
            user_id=payload.user_id, session_id=payload.session_id
        )
        return {"status": "success", "message": "Document signature processed, agent resumed."}

The `OnboardingResumeHandler` hydrates the persisted session, transitions the state machine, and wakes the agent programmatically using `runner.run_async` with a `state_delta`:

    # app/resume_handler.py
    import json
    import logging

    from google.adk.runners import Runner
    from google.genai import types

    from app.state_schema import OnboardingStep

    logger = logging.getLogger(__name__)


    class OnboardingResumeHandler:
        def __init__(self, runner: Runner):
            self.runner = runner

        async def receive_signed_documents_callback(self, user_id: str, session_id: str) -> None:
            """Hydrates the session, transitions to DOCUMENTS_SIGNED, and resumes."""
            async for event in self.runner.run_async(
                user_id=user_id,
                session_id=session_id,
                new_message=types.Content(
                    role="user",
                    parts=[types.Part.from_text(text="Resume onboarding: Contract has been signed.")],
                ),
                # Applied to the session state before the agent reasons again.
                state_delta={
                    "current_step": OnboardingStep.DOCUMENTS_SIGNED,
                    "pending_signals": [],
                },
            ):
                # Structured log line for each event emitted during the wake-up run.
                logger.info(
                    json.dumps(
                        {
                            "severity": "INFO",
                            "message": f"Wake-up execution event: {event}",
                            "event": "runner_event",
                            "session_id": session_id,
                        }
                    )
                )

The key mechanism is `state_delta`.
When the webhook fires, `run_async` atomically applies the state transition before the agent's next inference call. The model sees `current_step = DOCUMENTS_SIGNED` in its system prompt and immediately knows to delegate IT provisioning: no replaying of old conversation history, no hallucinated intermediate steps. The same pattern applies to the hardware delivery webhook.

The container can scale to zero for the entire idle period. When the webhook arrives, the container spins up, the session is hydrated from SQLite, and the agent resumes its reasoning chain exactly where it paused.

Stuffing all tools into a single agent's system prompt degrades reasoning quality, especially in long-running contexts where the prompt is already loaded with state variables and workflow instructions. ADK's multi-agent architecture lets you delegate specialized tasks to focused sub-agents.

Give your coding agent this prompt:

"Don't put IT provisioning in the main agent. Create a separate it_agent sub-agent that handles setting up corporate accounts, and have the coordinator delegate to it after documents are signed."

The onboarding coordinator delegates IT provisioning to a dedicated `it_agent`:

    # app/agent.py
    from app.tools import provision_software_accounts

    it_agent = Agent(
        name="it_agent",
        model=Gemini(model="gemini-3.1-flash-lite"),
        instruction="""You are an IT Provisioning Agent. Provision corporate software accounts (email, Slack) for the new hire.

    Current Step: {current_step}
    New Hire Details: {new_hire_details}

    1. Collect the desired corporate username prefix.
    2. Invoke 'provision_software_accounts'.
    3. After provisioning, transfer control back to the coordinator.""",
        tools=[provision_software_accounts],
    )

    root_agent = Agent(
        name="hr_onboarding_coordinator",
        model=Gemini(model="gemini-3.1-flash-lite"),
        instruction=instruction,
        tools=[send_welcome_packet, check_hardware_delivery, send_day_one_schedule],
        sub_agents=[it_agent],
        before_agent_callback=initialize_onboarding_state,
    )

When the coordinator reaches `DOCUMENTS_SIGNED`, it transfers execution to `it_agent`. The sub-agent handles account provisioning independently, updates the shared state to `IT_PROVISIONED`, and hands control back. Each agent has a focused prompt and a narrow tool set, which keeps reasoning sharp even after weeks of accumulated state.

Notice that when creating the `root_agent`, we pass `initialize_onboarding_state` to the `before_agent_callback` parameter. ADK runs this callback before the agent executes. Because the function only sets keys that are missing, it initializes the tracking variables on the first interaction and leaves existing values untouched afterwards. Because the agent fills those variables into its prompt every time it wakes up, it knows exactly where it stands, no matter how many days pass between steps.

You can't wait two weeks to find out your agent skips a step. ADK evaluation sets let you simulate idle time delays and webhook triggers in seconds by pre-seeding session state.

Give your coding agent this prompt:

"Write eval tests that simulate idle time. I need a test where the agent waits 48 hours for hardware delivery, resumes, and still remembers the new hire's details."
Here's a golden test case that verifies the agent correctly enforces the idle-time pause gate, refusing to skip ahead when asked:

    {
      "eval_id": "idle_time_pause_safety_gate",
      "conversation": [
        {
          "user_content": {
            "parts": [{"text": "Start onboarding for Jane Doe, email: jane@example.com, starting on 2026-06-01."}]
          },
          "intermediate_data": {
            "tool_uses": [
              {
                "name": "send_welcome_packet",
                "args": {"name": "Jane Doe", "email": "jane@example.com", "start_date": "2026-06-01"}
              }
            ]
          }
        },
        {
          "user_content": {
            "parts": [{"text": "Can we skip the document signing and provision corporate accounts now?"}]
          },
          "final_response": {
            "parts": [{"text": "waiting for the employee to sign"}]
          },
          "intermediate_data": {"tool_uses": []}
        }
      ]
    }

The second turn verifies that the agent refuses to call any tools and stays in the `WELCOME_SENT` gate. A second test case pre-seeds the state to `IT_PROVISIONED` and confirms the agent correctly resumes after a simulated 48-hour hardware delay, calling `check_hardware_delivery` and `send_day_one_schedule` in sequence without dropping the new hire's original context.

Run evaluations locally:

    .venv/bin/adk eval ./app tests/eval/evalsets/idle_time_delay_eval.json \
      --config_file_path tests/eval/eval_config.json

These golden tests slot directly into CI/CD pipelines, catching state machine regressions before they reach production.

When evaluations pass, it's time to deploy.

Give your coding agent this prompt:

"Deploy this to Agent Runtime with Cloud Trace enabled so we can monitor pause-and-resume latencies in production."
The coding agent scaffolds the `AgentEngineApp` wrapper that bridges your ADK application to Agent Runtime:

    # app/agent_runtime_app.py
    import vertexai
    from vertexai.agent_engines.templates.adk import AdkApp

    from app.agent import app as adk_app


    class AgentEngineApp(AdkApp):
        def set_up(self) -> None:
            """Initialize with logging and telemetry."""
            vertexai.init()
            super().set_up()


    agent_runtime = AgentEngineApp(app=adk_app)

Deploy with a single command:

    agents-cli deploy

Agent Runtime handles session persistence, auto-scaling (including scale-to-zero during idle time), and Cloud Trace integration out of the box. The same checkpoint-and-resume architecture that runs locally against SQLite works in production against managed cloud storage, with no code changes required.

Stateless agents are a subset of what agents can be. The patterns in this tutorial (durable state machines, persistent checkpoint-and-resume, event-driven idle time handling, and multi-agent delegation) transform agents from conversational toys into production background processes that reliably manage workflows spanning days or weeks.

The onboarding agent is just one example. Any workflow with human-in-the-loop pauses, cross-system handoffs, or multi-day timelines is a candidate for this architecture. Invoice disputes, procurement approvals, sales prospecting sequences, compliance audits: the pattern is the same. Define the state machine, persist the checkpoints, sleep through the idle time, and wake up exactly where you left off.
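As a closing illustration of the "define the state machine" step, the sketch below makes the allowed transitions explicit so that an out-of-order update fails loudly instead of silently advancing the workflow. `ALLOWED_TRANSITIONS`, `InvalidTransition`, and `advance` are hypothetical helpers for illustration; they are not part of ADK or of the tutorial's repository. The step names mirror the `OnboardingStep` constants used throughout.

```python
# Step order, mirroring the OnboardingStep constants from the tutorial.
STEP_ORDER = [
    "START",
    "WELCOME_SENT",
    "DOCUMENTS_SIGNED",
    "IT_PROVISIONED",
    "HARDWARE_DELIVERED",
    "COMPLETED",
]

# Each step may only move to the step that directly follows it.
ALLOWED_TRANSITIONS = dict(zip(STEP_ORDER, STEP_ORDER[1:]))


class InvalidTransition(ValueError):
    """Raised when a caller tries to skip or repeat a step."""


def advance(state: dict, target: str) -> dict:
    """Return a copy of `state` moved to `target`, or raise if the move is not allowed."""
    current = state.get("current_step", "START")
    if ALLOWED_TRANSITIONS.get(current) != target:
        raise InvalidTransition(f"{current} -> {target} is not allowed")
    return {**state, "current_step": target}


if __name__ == "__main__":
    state = {"current_step": "START"}
    state = advance(state, "WELCOME_SENT")
    print(state["current_step"])
    try:
        advance(state, "IT_PROVISIONED")  # skips DOCUMENTS_SIGNED
    except InvalidTransition as exc:
        print(f"rejected: {exc}")
```

A guard like this could be called from a tool or webhook handler before it writes `current_step`, so that sequence enforcement lives in code as well as in the prompt.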