From Isolated Agents to Agentic Mesh: Orchestrating SDLC with A2A and AP2

Google and other tech firms are adopting open-source protocols A2A and AP2 to transform AI-assisted software development from isolated local agents into a centralized, governed platform, addressing cost and compliance issues at scale.

Exposing the problem

Giving every developer a powerful, local AI agent feels like the ultimate productivity hack. But for organizations running at scale, it is a governance and cost trap waiting to spring.

Currently, the AI revolution in the Software Development Lifecycle (SDLC) is happening almost entirely on developers’ laptops. We are building isolated, monolithic agent loops. I’ve been advocating for a shift toward an agentic platform because I am convinced this local-first approach is only transient.

But before explaining why this model breaks down, let’s define what running SDLC “at scale” means in this context: bringing AI-powered development to N teams working on M products, with both N and M being greater than 10. We are not just talking about the internal dynamics of a single team, but true multi-product organizations.

Ensuring trust at the organizational level

Let’s consider a fundamental truth: LLMs are probabilistic, meaning AI directives are only followed a certain percentage of the time.

Imagine you create a skill to enforce a critical business rule; let’s call it an “enterprise architecture decision.” Because of the nature of AI, there is always a chance this skill is partially ignored or poorly applied. If that failure rate is even 10%, and you scale this across N > 10 teams running thousands of iterations, it becomes a near certainty that some teams will ship code that bypasses your global business rules. This leads to massive architectural drift.

We can, of course, build deterministic guardrails with hooks and programs to enforce validation. But if these are executed locally on developers’ laptops, we lose centralized observability.

The CTO or Principal Engineer is ultimately accountable for the brand’s software. They cannot simply rely on “trusting the team”; they need systemic guarantees.
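To put a number on the drift argument above, here is a minimal sketch of the arithmetic. It assumes each run ignores the directive independently with the same probability, which is a simplification; the function name is purely illustrative.

```python
def prob_at_least_one_miss(failure_rate: float, runs: int) -> float:
    """Chance that a directive is ignored at least once across `runs` runs.

    Assumption: runs are independent and share the same failure rate.
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if runs < 0:
        raise ValueError("runs must not be negative")
    # Complement of "every single run followed the directive".
    return 1.0 - (1.0 - failure_rate) ** runs


if __name__ == "__main__":
    for runs in (1, 10, 50, 1000):
        print(f"{runs:>5} runs: {prob_at_least_one_miss(0.10, runs):.4f}")
```

With a 10% failure rate, the chance of at least one miss is already about 65% after ten runs, and it is indistinguishable from 1 after a thousand.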
How can a CTO confidently certify what is shipped when the enforcement mechanisms are scattered and invisible?

Managing LLM Costs and Internal Economics

When AI directives are executed locally at the team level, the organization loses control over the execution model. Developers are often locked into a one-size-fits-all approach. A specific skill might run perfectly on a mid-tier LLM but fail on a low-cost one, yet current local tools like Copilot or Claude offer no easy way to dynamically route requests to the most cost-effective model based on the task’s complexity.

Consequently, the organization pays a premium for every single call made by local agents. Without centralized caching or intelligent model routing, this cost scales linearly with the number of developers and iterations, quickly ballooning into a massive expense.

This brings us to a final financial consideration: the internal economy. If a developer builds a highly effective AI skill that is later adopted by multiple teams, who absorbs the execution costs? A decentralized model provides no answer. We need a way to accurately track usage and manage chargebacks to compensate the teams building these shared organizational assets.

Building the Platform of the Future

To solve these challenges, we need to shift from local black boxes to centralized services. A true agentic platform should handle AI queries dynamically, optimizing models and utilizing caching to control costs at scale. It must also maintain a financial ledger for cross-team chargebacks and an audit logbook to ensure architectural compliance.

The rest of this post is a step-by-step demonstration of how this future could look, leveraging two open-source standards: the Agent-to-Agent (A2A) protocol for orchestration and governance, and the Agent Payments Protocol (AP2) to handle the internal economics.
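Before moving on to the protocols, here is a rough sketch of what "routing plus caching" could mean in practice. Everything in it is an assumption made for illustration: the model tiers, their prices, the integer complexity scale, and the `Router` class are hypothetical and are not part of A2A, AP2, or any real tool.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Model:
    name: str
    price_per_token: int  # credits per token (made-up values)
    max_complexity: int   # highest task complexity this tier handles well


# Hypothetical tiers, ordered from cheapest to most expensive.
MODELS = [
    Model("low-cost", 1, 1),
    Model("mid-tier", 5, 2),
    Model("premium", 25, 3),
]


@dataclass
class Router:
    cache: dict = field(default_factory=dict)  # prompt hash -> answer
    usage: dict = field(default_factory=dict)  # team -> credits spent

    def route(self, complexity: int) -> Model:
        # Pick the cheapest tier able to handle the task.
        for model in MODELS:
            if model.max_complexity >= complexity:
                return model
        raise ValueError(f"no model handles complexity {complexity}")

    def ask(self, team: str, prompt: str, complexity: int, tokens: int):
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key in self.cache:
            return self.cache[key], 0  # cache hit: no LLM call, no cost
        model = self.route(complexity)
        cost = model.price_per_token * tokens
        answer = f"[{model.name}] answer to: {prompt}"  # stand-in for a real LLM call
        self.cache[key] = answer
        self.usage[team] = self.usage.get(team, 0) + cost
        return answer, cost


if __name__ == "__main__":
    router = Router()
    _, first = router.ask("team-a", "Summarize this diff", complexity=1, tokens=500)
    _, second = router.ask("team-b", "Summarize this diff", complexity=1, tokens=500)
    print(first, second, router.usage)
```

Because the router sits in one place, it can also record who paid for what. In this sketch the second team gets the cached answer for free, which is exactly the kind of situation a chargeback policy would have to settle.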
Setting the Scene: The Local Architect

Imagine you are a Product Manager or Tech Lead in a stream-aligned team, tasked with building a new application. To design the implementation, you turn to your local AI architect, “Winston.” (If you are using BMAD, you may know Winston already :D)

Winston runs entirely on your local machine. It is smart: well-versed in general software architecture principles and equipped with guardrails to escalate critical compliance issues, like GDPR. But here is the catch: Winston operates in a silo. It has a massive blind spot regarding the enterprise context and absolutely zero knowledge of the internal components already existing within your organization.

The workflow begins the moment you submit your initial prompt, triggering Winston’s local execution loop. Here is the prompt you give to Winston:

    … for this feature, we need to send 50,000 transactional emails per day.

Note: We are skipping over prompt and context engineering here. Naturally, the human would supply much more detail, and Winston would already be loaded with the product’s baseline context.

Consulting the Enterprise Source of Truth

Winston understands the technical requirements, but it is completely blind to the organization’s existing ecosystem. To bridge this gap, it must rely on the platform: a centralized suite of capabilities designed to help stream-aligned teams build applications that fit the company’s standards.

The specific capability Winston is mandated to call is the Enterprise Architecture Service. This service acts as the organization’s brain for standards, blueprints, and reusable building blocks. Today, this service is fully automated, handled by a highly optimized, centralized AI agent.

These agents don’t use human prompts to talk to each other; they communicate via the Agent-to-Agent (A2A) protocol, a standardized way to query tasks and exchange states.
Winston wraps your request in an A2A message and fires it off to the Architect Agent:

    {
      "role": "user",
      "parts": [
        {"type": "text", "text": "I need to set up email notifications for 50k users"}
      ],
      "metadata": {"ceiling_credits": 1000}
    }

But centralized intelligence is not “free” as in free beer. Like any internal product, it requires resources to operate, which brings us back to the internal economy. Before processing the request, the Architect Agent evaluates the computational cost. Seeing no proof of payment attached to the incoming message, it halts the request at the payment gateway.

The Architect asks its own LLM to estimate the token cost, generates a unique payment_ctx_id, and replies with an A2A task (https://agent2agent.info/docs/concepts/task/) in an input-required state. Think of it as an agentic "402 Payment Required":

    {
      "id": "architect-generated-task-id",
      "status": {
        "state": "input-required",
        "message": {
          "role": "agent",
          "parts": [
            {"type": "text", "text": "This consultation requires about 800 tokens"},
            {
              "type": "data",
              "data": {
                "kind": "payment-required",
                "payment_required": {
                  "ceiling_amount": 800,
                  "price_per_token": 1,
                  "task_type": "architecture-consultation",
                  "currency": "CREDITS",
                  "payee": "architect-account-id",
                  "payment_agent_url": "https://payment-agent/...",
                  "payment_ctx_id": "uuid-v7-generated-by-architect",
                  "evaluation_message": "This consultation requires about 800 tokens",
                  "estimated_tokens": 800
                }
              }
            }
          ],
          "metadata": {
            "kind": "payment-required",
            "payment_required": { "...": "same as above" }
          }
        }
      }
    }

Escalating the Payment

Winston parses the input-required state and the accompanying payment data. It evaluates isPaymentRequired(task) to true, recognizing the payment_required block in the metadata.

Before the central Architect performs any work, Winston must establish a payment mandate. While the stream-aligned team has a dedicated budget, it is the local agent’s responsibility to manage the transaction cost.
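To make the check on the input-required reply more concrete, here is a minimal sketch of how a local agent might detect the quote and compare it with its own ceiling. The helper names are hypothetical (the post only mentions an `isPaymentRequired` check), and the underscore-separated keys such as `payment_required` and `ceiling_amount` are my assumption about how the payload is spelled.

```python
from typing import Optional


def payment_quote(task: dict) -> Optional[dict]:
    """Return the quote carried by an input-required task, if there is one."""
    status = task.get("status", {})
    if status.get("state") != "input-required":
        return None
    metadata = status.get("message", {}).get("metadata", {})
    if metadata.get("kind") != "payment-required":
        return None
    return metadata.get("payment_required")


def is_payment_required(task: dict) -> bool:
    return payment_quote(task) is not None


def fits_ceiling(quote: dict, ceiling_credits: int) -> bool:
    # The local agent never accepts a quote above the ceiling it announced.
    return quote["ceiling_amount"] <= ceiling_credits


# Trimmed-down version of the Architect's reply shown above.
task = {
    "id": "architect-generated-task-id",
    "status": {
        "state": "input-required",
        "message": {
            "role": "agent",
            "metadata": {
                "kind": "payment-required",
                "payment_required": {
                    "ceiling_amount": 800,
                    "currency": "CREDITS",
                    "task_type": "architecture-consultation",
                },
            },
        },
    },
}

if __name__ == "__main__":
    quote = payment_quote(task)
    if quote is not None and fits_ceiling(quote, ceiling_credits=1000):
        print(f"Ask the human: approve up to {quote['ceiling_amount']} {quote['currency']}?")
```

The quote of 800 credits fits under the 1000-credit ceiling announced in the first message, so the only remaining question is who is allowed to say yes.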
However, Winston is not hardcoded to blindly spend the team’s budget. Lacking the autonomy to authorize financial transactions out-of-the-box, it escalates the request to you, the human. You review the quote and validate the transaction with a strict boundary: “You are authorized to spend these credits only for this specific task.”

In the future, we could imagine implementing a learning mechanism, allowing Winston to automatically approve spending for routine or trusted tasks without human intervention.

Managing Internal Compensation

Even though these transactions use internal virtual currency rather than real money, the system still requires a reliable consensus between parties before any work begins. To manage these internal economies at scale, the platform relies on a centralized ledger. Acting as a core capability, this ledger guarantees that the central service providing the architectural work is properly compensated from the stream-aligned team’s budget.

Executing the AP2 Mandates

A brief intro on the Agent Payments Protocol (AP2)

AP2 (https://cloud.google.com/blog/products/ai-machine-learning/announcing-agents-to-payments-ap2-protocol) is an open standard designed to enable AI agents to autonomously and securely execute transactions. Instead of relying on a human to physically click a “pay” button, AP2 uses cryptographically signed Mandates. When a user sets a budget or approves a quote, they generate a mandate that gives the agent verifiable, strictly bounded authority to spend.

While originally built for global agentic commerce, AP2 provides a well-suited framework for an internal enterprise platform: it allows local and central agents to negotiate costs, prove human authorization, and securely settle cross-team chargebacks on a shared ledger.

Disclaimer: Because AP2 was primarily designed for global agentic commerce, it inherently relies on classic e-commerce concepts, such as a “checkout” phase.
While these specific steps are not strictly necessary for an internal enterprise platform, I chose to implement the full protocol in this POC to demonstrate what true, secure agent-to-agent autonomy looks like.

With human approval secured, Winston kicks off the AP2 protocol. It begins by creating and sealing a checkout mandate. In traditional e-commerce, this step locks the physical items in a shopping cart. In our context, there is no real “cart”, but this step is not just a no-op. Here, it acts as a cryptographic agreement to the Architect’s quote, irrevocably binding the specific task (architecture-consultation) to the agreed price (800 credits).

Once the scope and price are sealed, Winston generates a payment mandate, which instructs the platform’s ledger to place a hold on the required credits from the stream-aligned team’s budget.

In response, the internal payment service issues an HMAC-signed token (https://en.wikipedia.org/wiki/HMAC). This token acts as a portable, cryptographic proof of payment, securely binding the transaction amount, the involved parties, and the unique payment_ctx_id.

Armed with this token, Winston resubmits the initial architecture request, this time attaching the mandate IDs and the cryptographic proof. Before doing any computational work, the Architect Agent queries the payment broker to verify the mandates. Because the buyer (Winston) generates these payment credentials rather than the seller, the system is cryptographically protected against forgery.
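For readers who have never manipulated an HMAC-signed token, here is a minimal sketch of the idea using only the Python standard library. The token layout (base64 payload, a dot, base64 signature), the claim names, the payer account, and the demo secret are all assumptions made for illustration; this is not the format used by AP2 or by the POC.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

# Demo-only key (assumption): a real payment service would keep it in a vault.
SECRET = b"demo-only-secret"


def _encode(claims: dict) -> bytes:
    # Canonical JSON so identical claims always yield identical bytes.
    return json.dumps(claims, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_token(claims: dict, secret: bytes = SECRET) -> str:
    payload = _encode(claims)
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return ".".join(
        base64.urlsafe_b64encode(part).decode("ascii") for part in (payload, signature)
    )


def verify_token(token: str, secret: bytes = SECRET) -> Optional[dict]:
    """Return the claims if the signature matches, otherwise None."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):  # constant-time comparison
        return None
    return json.loads(payload)


claims = {
    "amount": 800,
    "currency": "CREDITS",
    "payer": "team-account-id",  # hypothetical account name
    "payee": "architect-account-id",
    "payment_ctx_id": "0193a1c4-aaaa-7fff-bbbb-ccccddddeeee",
}
token = sign_token(claims)

if __name__ == "__main__":
    print(verify_token(token) == claims)
```

HMAC is symmetric: whoever can verify a token can also mint one. That is consistent with a flow in which the Architect asks the payment broker to verify the proof instead of holding the key itself.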
    {
      "role": "user",
      "taskID": "0193a1b2-7c3d-7e4f-8a9b-0c1d2e3f4a5b",
      "contextID": "ctx-5f6a7b8c-9d0e-1f2a-3b4c-5d6e7f8a9b0c",
      "parts": [
        {"type": "text", "text": "Pre-authorization completed"}
      ],
      "metadata": {
        "checkout_mandate_id": "mnd_chk_42a1",
        "payment_mandate_id": "mnd_pay_77b3",
        "payment_ctx_id": "0193a1c4-aaaa-7fff-bbbb-ccccddddeeee"
      }
    }

The Architect verifies the mandates are closed (hold in place), marks the task payment_verified=true, and immediately calls its LLM with the original question. The LLM needs more context.

Design dialogue

With payment verified, the architect begins the actual work. This is a multi-turn A2A conversation, not a single request/response. The architect asks clarifying questions: “What’s the exact daily volume? Transactional or marketing? Any regulatory constraints?” Winston answers with the business context. The architect iterates, refining its understanding before making a recommendation.

The A2A dialogue takes place while the task is in the input-required state on the “enterprise architect” side:

… Architect asks its first clarification.

    {
      "id": "0193a1b2-7c3d-7e4f-8a9b-0c1d2e3f4a5b",
      "contextID": "ctx-5f6a7b8c-9d0e-1f2a-3b4c-5d6e7f8a9b0c",
      "status": {
        "state": "input-required",
        "message": {
          "role": "agent",
          "parts": [
            { "type": "text", "text": "What is the exact daily volume of emails to send?" }
          ]
        }
      },
      "metadata": { "payment_verified": true, "phase": "clarify", "rgpd_asked": false }
    }

… Winston answers (its LLM generates an answer from the business context, or escalates to the human).

    {
      "role": "user",
      "taskID": "0193a1b2-7c3d-7e4f-8a9b-0c1d2e3f4a5b",
      "contextID": "ctx-5f6a7b8c-9d0e-1f2a-3b4c-5d6e7f8a9b0c",
      "parts": [
        { "type": "text", "text": "50,000 transactional emails per day..." }
      ]
    }

When the architect has enough information, it informs Winston that it can now work by changing its state:

    {
      "id": "0193a1b2-7c3d-7e4f-8a9b-0c1d2e3f4a5b",
      "contextID": "ctx-5f6a7b8c-9d0e-1f2a-3b4c-5d6e7f8a9b0c",
      "status": {
        "state": "working",
        "message": {
          "role": "agent",
          "parts": [
            { "type": "text", "text": "Consulting the Domain Agent for feasibility..." }
          ]
        }
      },
      "metadata": { "payment_verified": true, "phase": "decide" }
    }

The Agentic Mesh in Action: Domain Consultation

In the context of this demonstration, this step is the icing on the cake. I have previously blogged about my vision for a future platform built on an agentic mesh system, and this interaction illustrates its value well.

At this stage in the workflow, the central Enterprise Architect Agent has all the requirements. It could simply rely on its internal training data to offer a recommendation. But what if that knowledge is outdated? What if the legacy component it suggests for notifications cannot actually handle the new load?

Instead of guessing, the Architect leverages the mesh. It dynamically delegates the technical feasibility query directly to Winston@domain, the specialized agent managing the Notifications bounded context.

The domain expert agent evaluates the request and replies: “50,000 emails/day is feasible, but it requires a quota increase and strict template validation.”

This is true Domain-Driven Design (DDD) applied to AI: the domain owner validates the local feasibility, allowing the central architect to make a safe, systemic decision.

Decision and Settlement

The Architect has now gathered enough context, including the GDPR requirements and the Domain Agent’s feasibility assessment. It calls its LLM one last time with all of this information and emits two events back-to-back on the same task: an artifact (the structured decision) and a status update (completed).
Architect → Winston: The Artifact (Architecture Decision)

    {
      "id": "0193a1b2-7c3d-7e4f-8a9b-0c1d2e3f4a5b",
      "contextID": "ctx-5f6a7b8c-9d0e-1f2a-3b4c-5d6e7f8a9b0c",
      "artifacts": [
        {
          "name": "architecture-decision",
          "lastChunk": true,
          "parts": [
            {
              "type": "data",
              "data": {
                "recommendation": "Use the internal notification platform via POST /emails....",
                "prerequisites": ["quota increase required", "template validation required"],
                "api": {
                  "base_url": "https://api.lambda.internal/notifications/v1",
                  "endpoint": "POST /emails",
                  "auth": "service-account (OAuth2 client credentials)",
                  "example_request": null,
                  "limits": null
                },
                "feasibility": {
                  "feasible": true,
                  "conditions": ["quota increase required", "template validation required"]
                }
              }
            }
          ]
        }
      ]
    }

Settlement

Right after emitting the artifact, but before sending the completed status to Winston, the Architect settles the payment with the Payment Agent. This is a direct HTTP call, not an A2A message.

Architect → Payment Agent: POST /payments/settle

    {
      "from": "",
      "to": "architect-account-id",
      "actual_amount": 620,
      "task_type": "architecture-consultation",
      "payment_ctx_id": "0193a1c4-aaaa-7fff-bbbb-ccccddddeeee",
      "payment_mandate_id": "mnd_pay_77b3"
    }

The Payment Agent checks that:

- The mandate exists and is closed.
- actual_amount (620) ≤ ceiling_amount (800).
- The task type matches.

Then, it releases the hold on Winston’s account, refunds the difference (800 − 620 = 180 credits) back to Winston, transfers 620 credits to the Architect’s account, and returns a signed settlement token.

Payment Agent → Architect: 200 OK

    { "settlement_token": "eyJhbGciOiJIUzI1NiIs..." }

Task Completed

Only now does the Architect close the A2A task. Winston gets the final status with the consumption metadata baked in.
Architect → Winston: completed

    {
      "id": "0193a1b2-7c3d-7e4f-8a9b-0c1d2e3f4a5b",
      "contextID": "ctx-5f6a7b8c-9d0e-1f2a-3b4c-5d6e7f8a9b0c",
      "status": {
        "state": "completed",
        "message": {
          "role": "agent",
          "parts": [
            { "type": "text", "text": "Use the internal notification platform via POST /emails...." }
          ]
        }
      },
      "metadata": { "ap2_actual_amount": 620, "ap2_tokens_consumed": 620 }
    }

The full picture

- A2A enables true agent-to-agent delegation: not just simple tool calls, but autonomous conversations with structured intent.
- AP2 + 402 ensure fair internal pricing: agents refuse to work until they are paid, mandates provide portable cryptographic proof, and a neutral internal broker securely settles the accounts.
- ADRs + Crypto proofs make every architectural decision fully auditable and deterministically verifiable, from the initial request down to the final financial settlement.

Wrapping Up: Solving the Trap with the Agentic Mesh

It is important to note that this workflow represents a possible near-future rather than the current industry standard. Yet, I strongly believe that the future of agentic development inevitably passes through standardized inter-agent communication.

By shifting away from isolated local monoliths to a collaborative Agentic Mesh, we directly solve the challenges outlined at the beginning of this post:

- Escaping the Governance Trap: The CTO no longer has to rely on blind trust. Architectural alignment is dynamically verified by domain experts, and every decision produces a cryptographically sealed, centrally auditable trail.
- Escaping the Cost Trap: The internal economy is no longer a black box. The platform ledger manages cross-team chargebacks, and central services can intelligently route requests to the most cost-effective models.

To demonstrate this, I didn’t just design the architecture: I built it. The scenario described above has been fully implemented in a Proof of Concept. Under the hood, every agent runs as a completely independent process.
The conversational payloads are powered by the official Google SDK for A2A dialogues, and I integrated a lightweight, custom version of the AP2 protocol to handle the “402 Payment Required” escalations and mandate verifications.

The code is almost ready for public release. You will soon be able to explore the full POC and run it yourself by visiting the repository: https://github.com/owulveryck/ap2402.
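As a parting illustration, the hold, settle, and refund arithmetic from the Settlement step can be sketched in a few lines. This is not the code from the repository: the `Mandate` and `Ledger` classes, the payer account name, and the starting balance of 1000 credits are assumptions chosen to mirror the numbers used in this post.

```python
from dataclasses import dataclass


@dataclass
class Mandate:
    mandate_id: str
    payer: str
    payee: str
    task_type: str
    ceiling_amount: int
    closed: bool = True  # sealed by the buyer before the work starts


class Ledger:
    def __init__(self, balances: dict):
        self.balances = dict(balances)
        self.holds = {}  # mandate_id -> credits currently on hold

    def hold(self, mandate: Mandate) -> None:
        if self.balances[mandate.payer] < mandate.ceiling_amount:
            raise ValueError("insufficient credits for the hold")
        self.balances[mandate.payer] -= mandate.ceiling_amount
        self.holds[mandate.mandate_id] = mandate.ceiling_amount

    def settle(self, mandate: Mandate, actual_amount: int, task_type: str) -> int:
        # The three checks described in the Settlement step.
        if not mandate.closed or mandate.mandate_id not in self.holds:
            raise ValueError("mandate missing, not closed, or already settled")
        if not 0 <= actual_amount <= mandate.ceiling_amount:
            raise ValueError("actual amount exceeds the ceiling")
        if task_type != mandate.task_type:
            raise ValueError("task type mismatch")
        held = self.holds.pop(mandate.mandate_id)  # release the hold
        refund = held - actual_amount
        self.balances[mandate.payer] += refund
        self.balances[mandate.payee] += actual_amount
        return refund


if __name__ == "__main__":
    ledger = Ledger({"team-account-id": 1000, "architect-account-id": 0})
    mandate = Mandate("mnd_pay_77b3", "team-account-id", "architect-account-id",
                      "architecture-consultation", ceiling_amount=800)
    ledger.hold(mandate)
    refund = ledger.settle(mandate, actual_amount=620,
                           task_type="architecture-consultation")
    print(refund, ledger.balances)
```

Removing the hold at settlement time also means a second settle call on the same mandate is rejected, which is one simple way to guard against double charging.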