Compiling Agentic Workflows into LLM Weights

wpnews.pro

cd /news/large-language-models/compiling-agentic-workflows-into-llm… · home › topics › large-language-models › article

[ARTICLE · art-44768] src=arxiv.org ↗ pub=2026-06-30T12:12Z topic=large-language-models verified=true sentiment=↑ positive

Compiling Agentic Workflows into LLM Weights

Researchers have demonstrated that compiling agentic workflows into the weights of small fine-tuned language models achieves near-frontier quality at two orders of magnitude less cost, addressing three perceived barriers that have kept developer adoption low despite prior proof-of-concept work. The approach, tested on travel booking, Zoom support, and insurance claims, eliminates the need for external orchestrators and frontier models for every conversation.

read2 min views1 publishedJun 30, 2026

Image: source

[Submitted on 21 May 2026]


[View PDF](/pdf/2605.22502)

[HTML (experimental)](https://arxiv.org/html/2605.22502v1)

Abstract:Agent orchestration frameworks have proliferated, collectively exceeding 290,000 GitHub stars across LangGraph, CrewAI, Google ADK, OpenAI Agents SDK, Semantic Kernel, Strands, and LlamaIndex. All follow the same pattern: an external orchestrator above the LLM, injecting instructions and routing decisions every turn. Recent work has shown this architecture is dominated for procedural tasks by simply providing the procedure in a frontier model's system prompt [Dennis et al., 2026a], at the cost of consuming the context window, requiring a frontier model for every conversation, and exposing proprietary procedures to third-party providers. Compiling the procedure into the weights of a small fine-tuned model -- creating a subterranean agent -- should resolve all of these concerns, and prior work (SimpleTOD, FireAct, SynTOD, WorkflowLLM, Agent Lumos) has shown the technique works. Yet developer adoption has overwhelmingly favored orchestration. We identify three perceived barriers and address each empirically across travel booking (14 nodes), Zoom support (14 nodes, product-specific knowledge), and insurance claims (55 nodes, 6 decision hubs).

References & Citations

...

Bibliographic Explorer

(What is the Explorer?) Connected Papers

(What is Connected Papers?) Litmaps

(What is Litmaps?) scite Smart Citations

(What are Smart Citations?)# Code, Data and Media Associated with this Article alphaXiv

(What is alphaXiv?) CatalyzeX Code Finder for Papers

(What is CatalyzeX?) DagsHub

(What is DagsHub?) Gotit.pub

(What is GotitPub?) Hugging Face

(What is Huggingface?) ScienceCast

(What is ScienceCast?)# Demos Influence Flower

(What are Influence Flowers?) CORE Recommender

(What is CORE?)# arXivLabs: experimental projects with community collaborators arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.

Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.

Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.

source & further reading

arxiv.org — original article

~/api · this article 200

$curl api.wpnews.pro/v1/news/compiling-agentic-workfl…

Read original on arxiv.org → arxiv.org/abs/2605.22502

mentioned entities

LangGraph

CrewAI

Google ADK

OpenAI Agents SDK

Semantic Kernel

LlamaIndex

Zoom

metadata

slugcompiling-agentic-workflows-into-llm-weights

topic#large-language-models

secondary3 topics

sentimentpositive

canonicalarxiv.org

navigation

← prevLeast Privilege is a Workaround …

next →School Is a Fossil

── more in #large-language-models 4 stories · sorted by recency

dev.to · 30 Jun · #large-language-models

Will AI Replace Programmers?

dev.to · 30 Jun · #large-language-models

Why Prompt Engineering Isn't Enough for Production AI Agents

dev.to · 29 Jun · #large-language-models

Approval Queues Are the Runtime for Agentic AI Workflows | Focused Labs

dev.to · 29 Jun · #large-language-models

Why Is SoloEngine the Ideal Implementation of Loop Engineering?

── more on @langgraph 3 stories trending now

wpnews · 27 May · #machine-learning

hunting for headroom on modded-nanoGPT (WR #82)

wpnews · 28 May · #ai-startups

The Niche SaaS Opportunity Map 2026: Highly Demanded Subscribed Categories Beyond Mainstream

wpnews · 29 Jun · #large-language-models

The Silent Cost of AI Agents: Why Your Next.js SaaS Is Burning Money on LLM Calls

sponsored brought to you by zahid.host 4,200+ EU-deployed projects

reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main

→ Live at https://your-agent.zahid.host ✓

Get free account → Pricing

from €0/mo · no card required