Adaptive execution for Java agents: reason-aware retries and budget-aware routing


[ARTICLE · art-15070] src=dev.to ↗ pub=2026-05-27T08:20Z topic=ai-agents verified=true sentiment=↑ positive

AgentFlow4J v0.7.0, a Java multi-agent orchestration framework built on Spring AI, introduces a `FailureClassifier` that categorizes LLM call failures as transient, permanent, or over-budget, enabling reason-aware retries that honor `Retry-After` headers and stop immediately on permanent errors. The release also adds a `BudgetPolicy` with budget-aware routing that deterministically switches between premium and fallback models based on remaining run budget, reading a counter rather than making an additional LLM call to classify complexity.

Published May 27, 2026 · 2 min read

If your LLM agent retries a `429` fifty times overnight, retries a `400` three times before giving up, and sends every request to your top-tier model until the run budget is gone, those aren't bugs in the model. They're two cheap policy decisions that your orchestration layer should be making for you, and isn't.

AgentFlow4J v0.7.0 (Java multi-agent orchestration on Spring AI) ships those two policies.

A `RetryPolicy` that counts attempts is blind to why a call failed. v0.7.0 adds a `FailureClassifier` that sorts every failure into one of three categories:

| Category | What the graph does |
| --- | --- |
| `TRANSIENT` | Retry. If the failure carries a `Retry-After` hint, that delay is honoured instead of the computed backoff. |
| `PERMANENT` | Stop immediately; no further attempts. |
| `OVER_BUDGET` | Stop and surface an `InterruptRequest` so a human can approve more budget and resume. |
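Read as a decision procedure, the three categories come down to one question after each failed attempt: wait and try again, or stop. The sketch below shows that logic in plain Java. It is not AgentFlow4J's implementation. The names `Kind` and `nextDelay`, the nullable hint parameter, and the doubling backoff are all assumptions made for illustration, and surfacing an `InterruptRequest` for the over-budget case is left out.

```java
import java.time.Duration;
import java.util.Optional;

final class ReasonAwareRetrySketch {

    /** Hypothetical mirror of the three categories described above. */
    enum Kind { TRANSIENT, PERMANENT, OVER_BUDGET }

    /**
     * Decides what to do after attempt number {@code attempt} (1-based) has failed.
     *
     * @param retryAfterHint server-provided delay (e.g. from Retry-After), or null if absent
     * @return the delay before the next attempt, or empty if the caller should stop
     */
    static Optional<Duration> nextDelay(Kind kind, Duration retryAfterHint,
                                        int attempt, int maxAttempts, Duration base) {
        if (kind != Kind.TRANSIENT) {
            return Optional.empty();              // PERMANENT or OVER_BUDGET: stop immediately
        }
        if (attempt >= maxAttempts) {
            return Optional.empty();              // transient, but attempts are exhausted
        }
        if (retryAfterHint != null) {
            return Optional.of(retryAfterHint);   // the hint wins over the computed backoff
        }
        // Assumed backoff shape: base, 2*base, 4*base, ... (shift capped to avoid overflow)
        long multiplier = 1L << Math.min(attempt - 1, 30);
        return Optional.of(base.multipliedBy(multiplier));
    }
}
```

With a base of one second and three attempts, a transient failure on attempt 2 without a hint yields a two-second wait, while the same failure carrying a seven-second hint yields seven seconds. A permanent failure yields no wait at all, whatever the attempt count.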

The default classifier already understands JDK I/O exceptions, Spring AI / Spring Web `5xx` and `429` responses (parsing `Retry-After`) as distinct from other `4xx` responses, and `BudgetExceededException`. All of these are detected by class name, so `agentflow4j-graph` keeps zero compile-time Spring dependency.
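Detecting an exception "by class name" can be done with string comparisons alone, which is what lets a module recognise framework exceptions without importing them. The following is a minimal sketch of that idea, not the library's code. `ClassNameMatchSketch`, both method names, and the depth limit are hypothetical.

```java
final class ClassNameMatchSketch {

    /**
     * Returns true if the throwable's class, or any of its superclasses, has the
     * given fully qualified name. Only strings are compared, so the named class
     * does not have to be on the compile-time classpath.
     */
    static boolean isInstanceOfNamed(Throwable t, String fullyQualifiedName) {
        for (Class<?> c = t.getClass(); c != null; c = c.getSuperclass()) {
            if (c.getName().equals(fullyQualifiedName)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Applies the same check to the throwable and its causes, since frameworks
     * often wrap the original failure. The depth limit guards against cyclic chains.
     */
    static boolean chainContains(Throwable t, String fullyQualifiedName) {
        Throwable current = t;
        for (int depth = 0; current != null && depth < 16; depth++) {
            if (isInstanceOfNamed(current, fullyQualifiedName)) {
                return true;
            }
            current = current.getCause();
        }
        return false;
    }
}
```

A classifier built this way could test for a name such as `org.springframework.web.client.HttpServerErrorException` and still compile without Spring on the classpath. Which names the default classifier actually checks is not stated in the article.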

Your own rules compose with the defaults via `orElse`:

FailureClassifier domain = cause -> {
    if (cause instanceof QuotaExhaustedException) return FailureClassification.overBudget("monthly quota hit");
    if (cause instanceof InvalidPromptException)  return FailureClassification.permanent("rejected by guardrail");
    return null; // unknown — defer to the default
};

RetryPolicy policy = RetryPolicy.exponential(3, Duration.ofSeconds(1))
        .withClassifier(domain.orElse(FailureClassifier.defaults()));

Existing policies that only set the legacy `retryOn` predicate keep their exact behaviour: when the classifier returns `null`, the policy falls back to that predicate.
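The "return `null` to defer" convention is what makes this chaining work. Here is a self-contained sketch of how such a chain and the final fallback to a legacy predicate could be wired. The `Classifier` interface, `Kind` enum, and `shouldRetry` method are hypothetical stand-ins, not the AgentFlow4J types.

```java
import java.util.function.Predicate;

final class ClassifierChainSketch {

    enum Kind { TRANSIENT, PERMANENT, OVER_BUDGET }

    /** Hypothetical classifier shape: returns null when it has no opinion. */
    @FunctionalInterface
    interface Classifier {
        Kind classify(Throwable cause);

        /** Tries this classifier first and consults {@code next} only on a null result. */
        default Classifier orElse(Classifier next) {
            return cause -> {
                Kind kind = classify(cause);
                return kind != null ? kind : next.classify(cause);
            };
        }
    }

    /**
     * Final retry decision: a non-null classification wins; if every classifier
     * in the chain deferred, the legacy predicate decides as it did before.
     */
    static boolean shouldRetry(Classifier classifier, Predicate<Throwable> legacyRetryOn,
                               Throwable cause) {
        Kind kind = classifier.classify(cause);
        if (kind == null) {
            return legacyRetryOn.test(cause);
        }
        return kind == Kind.TRANSIENT;
    }
}
```

Because each link only answers for the failures it recognises, domain rules can be placed ahead of the defaults without having to repeat them.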

`BudgetPolicy` already caps the cost of a run. v0.7.0 lets it shape which model handles which request:

BudgetPolicy budget = BudgetPolicy.hierarchical(
        BudgetLimits.run(5.00), estimator, meter);

// Use "premium" while ≥ $1.00 remains, then "fallback".
RoutingStrategy router = RoutingStrategy.budgetAware(
        budget, BudgetPolicy.Scope.RUN, 1.00,
        "premium", "fallback");

CoordinatorAgent desk = CoordinatorAgent.builder()
        .executor("premium",  premiumAgent)
        .executor("fallback", fallbackAgent)
        .routingStrategy(router)
        .build();

While the run budget has more than $1.00 remaining, the coordinator sends work to `premium`. The moment less remains, it degrades to `fallback`. Reading the live `BudgetPolicy.remaining(...)` counter is free: there is no extra LLM call to "classify complexity."

This is the one cost-aware routing lever that's both deterministic and provably cheaper: classifying complexity ex-ante with an LLM would itself cost a call (chicken-and-egg), and self-confidence routing doubles the cost. Reading counters is free.
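The routing decision described here is a single comparison against a counter. As a rough, self-contained illustration (not the library's `RoutingStrategy`), it might look like the sketch below. The class name, the `Supplier`-based counter, the use of `BigDecimal`, and the inclusive boundary are assumptions.

```java
import java.math.BigDecimal;
import java.util.function.Supplier;

final class BudgetAwareRouterSketch {

    private final Supplier<BigDecimal> remaining; // live "budget left" counter
    private final BigDecimal threshold;
    private final String primary;
    private final String fallback;

    BudgetAwareRouterSketch(Supplier<BigDecimal> remaining, BigDecimal threshold,
                            String primary, String fallback) {
        this.remaining = remaining;
        this.threshold = threshold;
        this.primary = primary;
        this.fallback = fallback;
    }

    /**
     * Picks a route by comparing the counter with the threshold. No model call is
     * involved, so the same counter value always gives the same route.
     * Assumption: exactly-at-threshold still counts as "enough" (>=).
     */
    String route() {
        return remaining.get().compareTo(threshold) >= 0 ? primary : fallback;
    }
}
```

`BigDecimal` is used here only so that the comparison at the boundary is exact; the article's own snippet passes plain numeric literals.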

The cookbook ships a recipe that wires both features end-to-end:

git clone https://github.com/datallmhub/agentflow4j-cookbook.git
cd agentflow4j-cookbook
mvn -pl 06-cost-aware-routing exec:java

No API key is required and Ollama is optional: the recipe falls back to a deterministic stub. You should see the first three tickets handled by `premium`, then the squad switching cleanly to `fallback` once the remaining budget drops below $2.00. The retry scene then shows a flaky node recovering on attempt 3 with 50ms → 98ms exponential+jitter backoffs, followed by a classification table.
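The article does not give the formula behind those backoff values. For readers unfamiliar with the term, the sketch below shows one common form of exponential backoff with jitter: the delay doubles per retry and is then scaled by a random factor. The class name, parameters, and jitter shape are illustrative assumptions and may differ from what the recipe does.

```java
import java.time.Duration;
import java.util.Random;

final class JitteredBackoffSketch {

    /**
     * Delay before retry number {@code retry} (1-based): base * 2^(retry-1),
     * scaled by a random factor drawn uniformly from [1 - jitter, 1 + jitter).
     */
    static Duration delay(Duration base, int retry, double jitter, Random rng) {
        long exponential = base.toMillis() << Math.min(retry - 1, 20);   // shift capped to avoid overflow
        double factor = 1.0 + jitter * (2.0 * rng.nextDouble() - 1.0);   // spreads retries apart
        return Duration.ofMillis(Math.round(exponential * factor));
    }
}
```

Jitter matters when many callers fail at the same moment: without it they would all retry in lockstep and hit the provider together again.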

<dependency>
    <groupId>com.github.datallmhub.agentflow4j</groupId>
    <artifactId>agentflow4j-starter</artifactId>
    <version>v0.7.0</version>
</dependency>

Via JitPack: add `https://jitpack.io` to your `<repositories>` if you haven't already.


Read original on dev.to → dev.to/asekka/adaptive-execution-for-java-agents…
