# Stop Blocking Virtual Threads: Building Asynchronous Human-in-the-Loop AI Agents with Spring AI

> Source: <https://dev.to/machinecodingmaster/stop-blocking-virtual-threads-building-asynchronous-human-in-the-loop-ai-agents-with-spring-ai-49pp>
> Published: 2026-06-04 07:08:47+00:00

In 2026, letting autonomous AI agents execute high-risk enterprise tools without human oversight is a production liability, but blocking platform threads—or even Project Loom’s virtual threads—for hours waiting for a manager's Slack approval is absolute architectural malpractice. We must transition from synchronous execution loops to stateless, event-driven agent hydration where the LLM's reasoning state is serialized and persisted during human-in-the-loop (HITL) interrupts.

Where teams typically go wrong:

- Assuming virtual threads (`VirtualThreadExecutor`) solve the wait problem. They do not: holding resources open for a 4-hour human coffee break destroys system scalability and ruins connection pools.
- Keeping conversation state (`ChatMemory` or agent context) in local heap memory, making your system highly vulnerable to redeployments and node failures.
- Relying on `CompletableFuture` or busy-waiting database polling loops to check if a human has clicked "Approve" on an external UI.

The clean solution is to serialize the agent's execution state (the ReAct loop token history, tool call IDs, and pending variables) to a persistent store, terminate the active thread immediately, and hydrate a brand-new agent instance when the approval webhook fires.

- Throw an `AgentSuspensionException` containing the serialized `stateId` and tool execution metadata when a high-risk tool is triggered.
- Configure `ChatClient` with a custom Redis-backed `ChatMemory` implementation that supports snapshotting at specific message indices.
- Expose a webhook endpoint, `/api/v1/agent/resume`, that accepts the human decision, merges it into the serialized history as a `ToolResponseMessage`, and triggers the next step of the ReAct loop.
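The suspend side of these steps can be sketched without any framework. This is a minimal illustration, not Spring AI's API: `AgentSuspensionException` is the name used above, while `Msg`, `StateStore`, `GuardedTool`, and the in-memory map standing in for Redis are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical minimal message type; a real agent would persist Spring AI messages.
record Msg(String role, String text) {}

// Signals that the agent must stop and wait for a human decision.
class AgentSuspensionException extends RuntimeException {
    final String stateId;
    final String toolCallId;
    final String toolName;

    AgentSuspensionException(String stateId, String toolCallId, String toolName) {
        super("Agent suspended pending approval of " + toolName);
        this.stateId = stateId;
        this.toolCallId = toolCallId;
        this.toolName = toolName;
    }
}

// Stand-in for a Redis-backed store: an in-memory map keyed by stateId.
class StateStore {
    private final Map<String, List<Msg>> snapshots = new ConcurrentHashMap<>();

    // Persists the history up to (but excluding) uptoIndex and returns a new stateId.
    String snapshot(List<Msg> history, int uptoIndex) {
        String stateId = UUID.randomUUID().toString();
        snapshots.put(stateId, List.copyOf(history.subList(0, uptoIndex)));
        return stateId;
    }

    // Returns a mutable copy so the caller can append the human's decision.
    List<Msg> load(String stateId) {
        List<Msg> saved = snapshots.get(stateId);
        if (saved == null) {
            throw new IllegalArgumentException("Unknown stateId: " + stateId);
        }
        return new ArrayList<>(saved);
    }
}

// High-risk tools snapshot the state and suspend instead of running.
class GuardedTool {
    static String invoke(StateStore store, List<Msg> history, String toolCallId,
                         String toolName, boolean highRisk) {
        if (highRisk) {
            String stateId = store.snapshot(history, history.size());
            throw new AgentSuspensionException(stateId, toolCallId, toolName);
        }
        return "executed " + toolName;
    }
}
```

The caller would catch the exception at the edge of the request, send the `stateId` and `toolCallId` to the approver (for example in a Slack message), and return. No thread stays parked while the human decides.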

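```java
import java.util.ArrayList;
import java.util.List;
import org.springframework.ai.chat.messages.Message;
import org.springframework.ai.chat.messages.ToolResponseMessage;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;

// Method of a @RestController assumed to be mapped under /api/v1, with
// chatClient and stateRepository injected as fields.
@PostMapping("/agent/resume")
public ResponseEntity<String> resumeAgent(@RequestBody ApprovalResponse approval) {
    // 1. Retrieve serialized chat history (ReAct state) from Redis (mutable copy)
    List<Message> history = new ArrayList<>(stateRepository.findById(approval.stateId()));

    // 2. Inject the human's decision as if it were the tool's output
    String toolOutput = approval.approved() ? "Approved: " + approval.notes() : "Rejected by human";
    // Spring AI 1.0 wraps results in ToolResponse(id, name, responseData);
    // toolName() is an assumed accessor, so adapt this to your version and DTO.
    history.add(new ToolResponseMessage(List.of(new ToolResponseMessage.ToolResponse(
        approval.toolCallId(), approval.toolName(), toolOutput))));

    // 3. Hydrate the agent and resume execution without blocking threads
    ChatResponse response = chatClient.prompt()
        .messages(history)
        .call()
        .chatResponse();

    // AssistantMessage exposes getText() in Spring AI 1.0 (getContent() in older milestones)
    return ResponseEntity.ok(response.getResult().getOutput().getText());
}
```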
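The handler reads `stateId()`, `toolCallId()`, `approved()`, and `notes()` from `ApprovalResponse`, but the type itself is not shown. One plausible shape, offered as an assumption rather than the author's definition, is a record. The `toolName` component and the `toToolOutput()` helper are additions for illustration; `toolName` is useful when the tool response message needs the tool's name alongside the call ID.

```java
import java.util.Objects;

// Hypothetical request body for the resume webhook; component names mirror
// the accessors used in the handler, plus an assumed toolName.
record ApprovalResponse(String stateId, String toolCallId, String toolName,
                        boolean approved, String notes) {

    // Reject payloads that cannot be matched to a suspended tool call.
    ApprovalResponse {
        Objects.requireNonNull(stateId, "stateId");
        Objects.requireNonNull(toolCallId, "toolCallId");
    }

    // Same mapping as the handler: the human decision becomes the tool's output text.
    String toToolOutput() {
        return approved ? "Approved: " + notes : "Rejected by human";
    }
}
```

Validating in the compact constructor means a malformed webhook payload fails before any state is loaded from the store.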

Use `ChatMemory` adapters to dynamically hydrate and dehydrate context windows on demand.
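Dehydrating a context window amounts to turning the message list into a storable string and back. Here is a minimal sketch, assuming a hypothetical `ChatTurn` type and a line-based Base64 encoding chosen only to keep the example dependency-free. A real adapter would more likely serialize Spring AI messages as JSON into Redis.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

// Hypothetical message type; roles are assumed not to contain ':' or newlines.
record ChatTurn(String role, String text) {}

class ContextCodec {
    // Dehydrate: one line per message in the form role:base64(text).
    static String dehydrate(List<ChatTurn> turns) {
        StringBuilder sb = new StringBuilder();
        for (ChatTurn t : turns) {
            String encoded = Base64.getEncoder()
                .encodeToString(t.text().getBytes(StandardCharsets.UTF_8));
            sb.append(t.role()).append(':').append(encoded).append('\n');
        }
        return sb.toString();
    }

    // Hydrate: rebuild the message list from the stored string.
    static List<ChatTurn> hydrate(String stored) {
        List<ChatTurn> turns = new ArrayList<>();
        for (String line : stored.split("\n")) {
            if (line.isEmpty()) {
                continue;
            }
            int sep = line.indexOf(':');
            byte[] raw = Base64.getDecoder().decode(line.substring(sep + 1));
            turns.add(new ChatTurn(line.substring(0, sep),
                new String(raw, StandardCharsets.UTF_8)));
        }
        return turns;
    }
}
```

Encoding the text keeps multi-line messages on a single stored line, so a round trip through `dehydrate` and `hydrate` returns an equal list.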

