This is a submission for the GitHub Finish-Up-A-Thon Challenge
This tutorial aims to build and test benchmarking Agents using the A2A protocol across several popular programming languages. The Master Orchestrator Agent is exposed via MCP to allow Antigravity CLI to be used as a MCP client.
This paper is a re-visiting of the original benchmark series with Gemini CLI over Node, GO, and Python:
Cross Language A2A Agent Benchmarking with Gemini 3 and Gemini CLI
_Building a Benchmarking Agent with A2A and MCP_medium.com
In this updated version, the Antigravity CLI is used over Node, GO, Python, and Rust.
Most mature Agent development tools and libraries are Python based. Python allows for rapid prototyping and evaluation of approaches. Python is also an interpreted language- which has trade-offs in memory safety, and performance. Other languages like GO and Rust offer high performance and memory safe operations. With a language neutral communication protocol โ the actual Agent implementation of each Agent can be coded in the most appropriate language.
The high level goal was to measure the actual time spent running an algorithm in the native language code inside the A2A agent. Each language had a slightly different implementation due to the language syntax. After running the algorithm- each Agent was instructed to calculate and return the elapsed time for cross language comparison.
The Agent Development Kit (ADK) is a flexible and modular framework for developing and deploying AI agents. While optimized for Gemini and the Google ecosystem, ADK is model-agnostic, deployment-agnostic, and is built for compatibility with other frameworks. The official ADK is only currently available in the Python, GO, and Java programming languages.
Google provides full documentation on the Agent Development Kit (ADK) here:
Agent Development Kit
_Build powerful multi-agent systems with Agent Development Kit_google.github.io
The Agent2Agent (A2A) protocol, an open communication standard for AI agents, was initially introduced by Google in April 2025. It is specifically engineered to facilitate seamless interoperability within multi-agent systems, enabling AI agents developed by diverse providers or built upon disparate AI agent frameworks to communicate and collaborate effectively.
A good overview of the A2A protocol can be found here:
A2A Protocol
_The official documentation for the Agent2Agent (A2A) protocol. The A2A protocol is an open standard that allowsโฆ_a2a-protocol.org
The official ADK for Python, GO, and Java provide built-in support for working with the A2A protocol. For other programming languages like JS, Rust, and .NET โ 3rd party libraries are available to add support for the protocol.
The main source for A2A Language support is the GitHub repo:
GitHub โ a2aproject/A2A: An open protocol enabling communication and interoperability betweenโฆ
_An open protocol enabling communication and interoperability between opaque agentic applications. โ a2aproject/A2A_github.com
This article targets the Python, GO, Rust, and JavaScript environments.
The build tools for each language environment need to be in place. For building with Python- a working Python environment with 3.12 or later along with package management tools like uv or pip is required.
For building with GO โ a recent version of the GO compiler (1.24.1 or later) is required.
For building with Rust, the Rust tool chain is required.
For building with Node / JavaScript โ a working Node.js environment with Node version 20 or later and a functional npm tool is needed.
Antigravity CLI is the follow-on successor to Gemini CLI- the terminal driven, agent assisted coding tool.
Full details on installing Antigravity CLI are here:
Getting Started with Antigravity CLI
_This article covers the initial setup and configuration for the Antigravity CLI on a stock Linux Environment._medium.com
Once you have all the tools in place- you can test the startup of Antigravity CLI.
You will need to authenticate with a Google Cloud Project or your Google Account:
agy
This will start the interface:
Verify that all the prerequisite packages and compilers are installed โ and clone the sample Github repo:
git clone https://github.com/xbill9/a2a-benchmark cd a2a-benchmark
Once you have your Google Cloud Project and preferred authentication method โ run the init.sh script to validate the setup:
xbill@penguin:~/a2a-benchmark$ source init.sh
Project ID file found, skipping.
--- Authentication Method --- Do you want to use a Gemini API Key for authentication? (y/n): n
WARNING: Your active project does not match the quota project in your local Application Default Credentials file. This might result in unexpected quota issues.
To update your Application Default Credentials quota project, use the gcloud auth application-default set-quota-project
command.
Updated property [core/project].
--- Setup complete ---
The set_env.sh script is provided to set common ADK environment variables:
xbill@penguin:~/a2a-benchmark$ source set_env.sh
--- Setting Google Cloud Environment Variables --- Checking gcloud authentication status...
gcloud is authenticated.
Exported PROJECT_ID=comglitn
Exported PROJECT_NUMBER=1056842563084
Exported SERVICE_ACCOUNT_NAME=1056842563084-compute@developer.gserviceaccount.com Exported GOOGLE_CLOUD_PROJECT=comglitn
Exported GOOGLE_GENAI_USE_VERTEXAI=TRUE
Exported GOOGLE_CLOUD_LOCATION=us-central1
Exported REPO_NAME=
Exported REGION=us-central1
--- Environment setup complete ---
xbill@penguin:~/a2a-benchmark$
If your application default credentials expires or your Google Cloud Authentication expires you will get an error. The workaround is to re-authenticate:
gcloud auth login
gcloud auth application-default login
Another common error is that the environment variables are not set correctly. Go the the root directory and re-run the set_env.sh to set the variables:
cd ~/adk-hello-world-a2a source set_env.sh
The A2A Inspector is a standalone tool that provides low level visibility into the A2A protocol. The GitHub is available here:
GitHub โ a2aproject/a2a-inspector: Validation Tools for A2A Agents
_Validation Tools for A2A Agents. Contribute to a2aproject/a2a-inspector development by creating an account on GitHub._github.com
A summary of the features of the A2A inspector can be found here:
A2A Protocol Documentation
_Documentation for A2A Protocol_a2aprotocol.ai
To install the A2A Inspector:
cd ~
git clone https://github.com/a2aproject/a2a-inspector Then follow the build instructions โ you need ** uv**, and a recent version of
a2a-inspector/README.md at main ยท a2aproject/a2a-inspector
_Validation Tools for A2A Agents. Contribute to a2aproject/a2a-inspector development by creating an account on GitHub._github.com
Once the A2A inspector has been installed- you can validate the installation by using this URL:
The ADK provides several key tools to allow standard ADK Agents to run as standalone A2A agents โ without the ADK โ either in A2A Client or A2A Server mode. The Python ADK includes libraries and samples to extend a standard ADK agent to enable A2A protocol features. Instead of running the agent inside the ADK web utility- the agents are dual purposed with A2A to be able to run as dedicated agents with their own embedded Uvicorn web server.
When Agents are run in ADK mode โ the ADK CLI or Web interface is used to directly interact with the Agents. The ADK UI is started in a well known port โ usually 8000 and the Agents are accessed in that environment.
The ADK does not automatically expose the agent as an A2A agent. The basic agent code from the ADK needs to be extended and enabled to run as a standalone A2A Agent. Without the additional A2A function calls and a active standalone web server- the Agents will not be usable in A2A mode.
In mathematics, a Mersenne prime is a prime number that is one less than a power of two. As of 2025, 52 Mersenne primes are known. An interesting thing about Mersenne primes is that they are the easiest natural numbers to prove to be primes, so they make up the largest category on the list of known prime numbers. This example was chosen as a good problem as the computation is CPU bound and gets exponentially longer for each higher number.
The a2a_benchmark Repo has sample scripts for running the ADK and various types of agents across programming languages. Each agent implements a Mersenne Prime Number generator with only basic optimization. The root directory of the a2a-benchmark directory contains several common agent development languages. These include:
This agent provides a basic Agent that generates Mersenne prime numbers. To run the agent โ run the bench-go.sh script:
Running the Go project...
{"time":"2025-11-25T13:47:18.071713651-05:00","level":"INFO","msg":"Using Model ","model":"gemini-2.5-flash"}
{"time":"2025-11-25T13:47:18.07195886-05:00","level":"INFO","msg":"Starting A2A mersenne prime server","port":"8102"}
{"time":"2025-11-25T13:47:18.072116547-05:00","level":"INFO","msg":"Starting the web server: &{port:8102 writeTimeout:15000000000 readTimeout:15000000000 idleTimeout:60000000000}"}
{"time":"2025-11-25T13:47:18.072119644-05:00","level":"INFO","msg":""}
{"time":"2025-11-25T13:47:18.072121732-05:00","level":"INFO","msg":"Web servers starts on [http://localhost:8102"](http://localhost:8102%22)}
{"time":"2025-11-25T13:47:18.072123829-05:00","level":"INFO","msg":" a2a: you can access A2A using jsonrpc protocol: [http://localhost:8102"](http://localhost:8102%22)}
{"time":"2025-11-25T13:47:18.072125402-05:00","level":"INFO","msg":""}
The A2A inspector can be used to validate the Agent:
"id": "adk-88a8aefa-3070-48ff-b8d1-631af65ea57e",
"name": "generate_primes",
"response": {
"result": "Elapsed time: 307.932ยตs"
}
This agent provides Python agent that implements the Mersenne generation algorithm.
This Agent can be checked with the ADK web interface:
what do you do
I can calculate Mersenne primes using the Lucas-Lehmer primality test. I can find the list of the first N Mersenne primes.
find the first 10 primes
It took 0.0002677440643310547 seconds to find the first 10 Mersenne primes.
To start the dedicated A2A version of the Python prime number generator Agent use the bench-python.sh script:
xbill@penguin:~/a2a-benchmark$ source bench-python.sh
INFO: Started server process [10495]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8101 (Press CTRL+C to quit) This agent provides a minimal agent to generate a prime number. To run the Agent use the bench-node.sh script:
/home/xbill/a2a-benchmark/benchmark-node
staring a2a bench node generate prime
The Agent can be validated with the A2A inspector on port 8103:
A typical session will look something like this:
what do you do
message
Found first 5 Mersenne primes in 0.09ms.
โ
find 10 primes
message
Found first 10 Mersenne primes in 0.81ms.
โ
The Mersenne primes algorithm was newly implemented in Rust:
A sample test script verified the A2A Agent skill:
xbill@penguin:~/a2a-benchmark$ source test-rust.sh Checking if Rust A2A agent is running on port 8104...
Rust agent is already running.
Running Test 1: Fetching Agent Card...
Agent Card Response:
{
"name": "Mersenne Prime Agent Rust",
"description": "A rust agent that builds a list of the first n Mersenne primes and reports the elapsed time. Configured with model: Not specified.",
"protocolVersion": "0.3.0",
"version": "0.1.0",
"url": "[http://0.0.0.0:8104/](http://0.0.0.0:8104/)",
"skills": [
{
"id": "find-mersenne-rust",
"name": "Find Mersenne Primes in rust",
"description": "Finds the list of the first n Mersenne primes in Rust",
"tags": [
"math",
"benchmark"
]
}
],
"capabilities": {},
"defaultInputModes": [],
"defaultOutputModes": []
}
โ Test 1 Passed: Agent Card is valid.
Running Test 2: Sending message/send request to calculate 5 Mersenne primes...
Response:
{
"id": 1,
"jsonrpc": "2.0",
"result": {
"contextId": "test-context-123",
"kind": "message",
"messageId": "f20c704c-bafa-4081-a18d-8d7984d4213c",
"parts": [
{
"kind": "text",
"text": "Found first 5 Mersenne primes in 0.11ms."
}
],
"role": "agent"
}
}
โ Test 2 Passed: Successfully calculated primes.
Running Test 3: Sending invalid RPC method...
Error Response:
{
"error": {
"code": -32601,
"message": "Method not found: invalid/method"
},
"id": 2,
"jsonrpc": "2.0"
}
โ Test 3 Passed: Invalid method was rejected successfully.
All tests passed successfully!
The final agent follows a slightly different pattern. It provides a minimal agent with several functions (Tools/Skills). It has one root_agent and 3 remote agents connected over A2A. This agent also exposes an interface as a MCP server- allowing Antigravity CLI to be used as a MCP client.
The Master Agent is started with a similar script:
โญโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโฎ
โ โ
โ โโโ โโโ โโโ โโโ โโโโโ โโโ โโโ โ
โ โโ โโโ โโโ โ โ โ โ โโโ โโโ โ
โ โ
โ FastMCP 2.13.1 โ
โ โ
โ โ
โ ๐ฅ Server name: benchmark โ
โ โ
โ ๐ฆ Transport: HTTP โ
โ ๐ Server URL: http://127.0.0.1:8100/mcp โ โ โ
โ ๐ Docs: [https://gofastmcp.com](https://gofastmcp.com) โ
โ ๐ Hosting: [https://fastmcp.cloud](https://fastmcp.cloud) โ
โ โ
โฐโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโฏ
[11/25/25 14:17:36] INFO Starting MCP server 'benchmark' with transport 'http' on [http://127.0.0.1:8100/mcp](http://127.0.0.1:8100/mcp) server.py:2055
INFO: Started server process [21826]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:8100 (Press CTRL+C to quit) The Master Agent starts a FastMCP server interface. This allows Antigravity CLI to be used as a MCP client:
xbill@penguin:~/a2a-benchmark/.agents$ more mcp_config.json
{
"mcpServers": {
"bench": {
"serverUrl": "[http://127.0.0.1:8100/mcp](http://127.0.0.1:8100/mcp)",
"serverURL": "[http://127.0.0.1:8100/mcp](http://127.0.0.1:8100/mcp)"
}
}
}
Because it is Python based- the Master Agent can also be viewed with the ADK web interface:
All of the setup and configuration allows you to start debugging and visualizing Agent flows using the A2A protocol. The Agents can be validated using the A2A Inspector. The Python Agents can be checked with the ADK web interfaces. The Master Agent can be validated with Antigravity CLI as a MCP client.
Now that the individual Agents have been validated โ open several Terminal windows and start up all the remaining agents:
Once all the agents have been started โ the Master agent on port 8100 is connected over MCP with Antigravity CLI:
I have completed the benchmark run across all four language agents for primes 1 to 22.
Summary of Actions Taken:
Please view analysis_results.md to inspect the table and embedded plots showing the comparison of calculation speeds
and
latency characteristics!
The results of the benchmark are graphed:
With latency times:
Antigravity CLI summarized the results of the Agent Benchmarking:
Formulating Key Findings
Here is a summary of the key findings from the benchmark results for primes 1 to 22:
1. Core Computation Performance (Lucas-Lehmer Prime Calculation)
โข Rust is the fastest: Computes $N=22$ in 195.96 ms.
โข Node.js (TS) is extremely close: Computes $N=22$ in 214.06 ms.
โข Go performs well: Computes $N=22$ in 279.54 ms.
โข Python is the slowest: Computes $N=22$ in 572.48 ms (roughly 3x slower than Rust).
2. A2A Framework & Protocol Overhead
โข Minimal Overhead (Rust & Node.js): โข Rust and Node.js use lightweight custom HTTP wrappers. Their framework overhead is practically negligible (< 2
ms). For example, Node's total round-trip time (RTT) for $N=22$ is 215.78 ms (only 1.72 ms higher than the raw calculation time).
โข Significant Base Overhead (Go & Python): โข Both Go and Python use the official google-adk library. This introduces a fixed baseline overhead per request
(schema validation, logging, and metadata context processing).
โข Go incurs a consistent baseline RTT of ~1.2s to 1.6s.
โข Python incurs a consistent baseline RTT of ~2.3s to 3.0s.
and a deeper dive:
Analyzing Agent Initialization
Based on an analysis of the codebase and execution paths, here is the breakdown of why the benchmark produced these
results:
โโโโโโ
Factor 1: The Massive Latency (RTT) Difference (Framework vs. Mocking)
The most striking difference is that Rust and Node.js have near-zero overhead, whereas Go takes ~1.5s and Python
takes
~2.5s. This is entirely due to how the agents process the benchmark messages:
โโโโโโ
### Factor 2: Mathematical Computation Speeds (Lucas-Lehmer Loop)
At higher values (e.g., $N=22$, where the Mersenne number $2^{9941}-1$ has 2,993 decimal digits and the Lucas-Lehmer
test requires 9,939 iterations of squaring and modular reductions), we see clear performance tiers:
For low numbers of Mersenne primes that execute quickly- all four languages are comparable. An unusual result was the good performance in the mid range from Node. This could have been from lower overhead of the Node A2A toolkit or potentially from JavaScript engine optimizations. As expected, GO and Rust did break out and delivered the best performance as the complexity scaled. The goal of the demo/article was to get basic Agents implemented across multiple programming languages and benchmark the Agent performance finding Mersenne primes.
The Google Agent development kit (ADK) was presented along with the complimentary A2A (Agent to Agent) protocol. Three basic agents were presented โ covering various combinations of programming languages and Agent implementation approaches.
Finally โ a Master/Orchestrator agent was started to connect and delegate to the other agents via the A2A protocol. Antigravity CLI was used to connect to the Master Agent over MCP and execute the benchmarks.### Cross Language A2A Agent Benchmarking with Gemini 3 and Antigravity CLI