{"slug": "adaptive-execution-for-java-agents-reason-aware-retries-and-budget-aware-routing", "title": "Adaptive execution for Java agents: reason-aware retries and budget-aware routing", "summary": "AgentFlow4J v0.7.0, a Java multi-agent orchestration framework built on Spring AI, introduces a `FailureClassifier` that categorizes LLM call failures as transient, permanent, or over-budget, enabling reason-aware retries that honor `Retry-After` headers and stop immediately on permanent errors. The release also extends `BudgetPolicy` with budget-aware routing that deterministically switches between premium and fallback models based on remaining run budget, reading a counter rather than making an additional LLM call to classify complexity.", "body_md": "If your LLM agent retries a `429` fifty times overnight, retries a `400` three times before giving up, and sends every request to your top-tier model until the run budget is gone, those aren't bugs in the model. They're signs that your orchestration layer is missing two cheap policies it should be applying for you.\n\nAgentFlow4J v0.7.0 (Java multi-agent orchestration on Spring AI) ships those two policies.\n\nA `RetryPolicy` that counts attempts is blind to *why* a call failed. v0.7.0 adds a `FailureClassifier` that sorts every failure into one of three categories:\n\n| Category | What the graph does |\n|---|---|\n| `TRANSIENT` | Retry. If the failure carries a `Retry-After` hint, that delay is honoured instead of the computed backoff. |\n| `PERMANENT` | Stop immediately; no further attempts. |\n| `OVER_BUDGET` | Stop and surface an `InterruptRequest` so a human can approve more budget and resume. 
|\n\nThe **default** classifier already understands JDK I/O exceptions, distinguishes Spring AI / Spring Web `5xx` and `429` responses (parsing `Retry-After`) from other `4xx` responses, and recognises `BudgetExceededException`. All of these are detected by class name, so `agentflow4j-graph` keeps **zero compile-time Spring dependency**.\n\nYour own rules compose with the defaults via `orElse`:\n\n```java\nFailureClassifier domain = cause -> {\n    if (cause instanceof QuotaExhaustedException) return FailureClassification.overBudget(\"monthly quota hit\");\n    if (cause instanceof InvalidPromptException)  return FailureClassification.permanent(\"rejected by guardrail\");\n    return null; // unknown: defer to the default\n};\n\nRetryPolicy policy = RetryPolicy.exponential(3, Duration.ofSeconds(1))\n        .withClassifier(domain.orElse(FailureClassifier.defaults()));\n```\n\nExisting policies that only set the legacy `retryOn` predicate keep their exact behaviour: when the classifier returns `null`, the policy falls back to that predicate.\n\n`BudgetPolicy` already caps the cost of a run. v0.7.0 lets it shape *which model handles which request*:\n\n```java\nBudgetPolicy budget = BudgetPolicy.hierarchical(\n        BudgetLimits.run(5.00), estimator, meter);\n\n// Use \"premium\" while ≥ $1.00 remains, then \"fallback\".\nRoutingStrategy router = RoutingStrategy.budgetAware(\n        budget, BudgetPolicy.Scope.RUN, 1.00,\n        \"premium\", \"fallback\");\n\nCoordinatorAgent desk = CoordinatorAgent.builder()\n        .executor(\"premium\",  premiumAgent)\n        .executor(\"fallback\", fallbackAgent)\n        .routingStrategy(router)\n        .build();\n```\n\nWhile the run budget has at least `$1.00` remaining, the coordinator sends work to `premium`. The moment less remains, it degrades to `fallback`. 
Reading the live `BudgetPolicy.remaining(...)` counter costs nothing: no extra LLM call is needed to \"classify complexity.\"\n\nThis is the one cost-aware routing lever that's both **deterministic** and **provably cheaper**: classifying complexity ex-ante with an LLM would itself cost a call (chicken-and-egg), and self-confidence routing doubles the cost. Reading counters is free.\n\nThe cookbook ships a recipe that wires both features end-to-end:\n\n```shell\ngit clone https://github.com/datallmhub/agentflow4j-cookbook.git\ncd agentflow4j-cookbook\nmvn -pl 06-cost-aware-routing exec:java\n```\n\nNo API key is required and Ollama is optional: the recipe falls back to a deterministic stub. You should see the first three tickets handled by `premium`, then the squad switching cleanly to `fallback` once the remaining budget drops below `$2.00`. The retry scene then shows a flaky node recovering on attempt 3 with `50ms → 98ms` exponential+jitter backoffs, followed by a classification table.\n\n```xml\n<dependency>\n    <groupId>com.github.datallmhub.agentflow4j</groupId>\n    <artifactId>agentflow4j-starter</artifactId>\n    <version>v0.7.0</version>\n</dependency>\n```\n\nVia JitPack: add `https://jitpack.io` to your `<repositories>` if you haven't already.", "url": "https://wpnews.pro/news/adaptive-execution-for-java-agents-reason-aware-retries-and-budget-aware-routing", "canonical_source": "https://dev.to/asekka/adaptive-execution-for-java-agents-reason-aware-retries-and-budget-aware-routing-1elm", "published_at": "2026-05-27 08:20:52+00:00", "updated_at": "2026-05-27 08:42:00.137302+00:00", "lang": "en", "topics": ["ai-agents", "ai-tools", "ai-infrastructure", "ai-products", "mlops"], "entities": ["AgentFlow4J", "Spring AI", "Spring Web", "Java"], "alternates": {"html": "https://wpnews.pro/news/adaptive-execution-for-java-agents-reason-aware-retries-and-budget-aware-routing", "markdown": 
"https://wpnews.pro/news/adaptive-execution-for-java-agents-reason-aware-retries-and-budget-aware-routing.md", "text": "https://wpnews.pro/news/adaptive-execution-for-java-agents-reason-aware-retries-and-budget-aware-routing.txt", "jsonld": "https://wpnews.pro/news/adaptive-execution-for-java-agents-reason-aware-retries-and-budget-aware-routing.jsonld"}}