How I built a production-grade Go framework for conversational AI agents — and the architecture decisions that actually matter.
Fifteen years of writing software professionally and I never open sourced a single thing.
Not because I didn't want to. It's just how it works when you build inside companies — the code belongs to them, the problems are specific to their domain, and by the time you could abstract something useful, you've already moved on to the next fire.
Two months ago I decided to change that.
I was building a conversational AI agent in Go. Needed a framework. Went looking for one and found... Python. More Python. A few Go repos that were abandoned in 2023. And a lot of "just wrap the OpenAI SDK" advice that works fine until you have real traffic and your agent starts responding twice to the same message.
Nothing production-ready. Nothing with actual architecture. Nothing I could hand to a team and say this will hold up.
So I built it. The result is eywa — a Go framework for conversational AI agents, hexagonal architecture, v1.0.0, MIT license, open source.
The name comes from Avatar — Eywa is the neural network connecting all living things on Pandora. The metaphor fit: a system that connects LLMs, channels, memory, and tools into a single organism, where each part perceives and responds to the environment.
Here's what I learned building it.
The AI ecosystem lives in Python. LangChain, LlamaIndex, CrewAI — all Python. If you're prototyping, exploring, or running notebooks, this makes complete sense.
But if you're running something in production at scale — where real users are sending real messages and you need observability, concurrency control, and something that doesn't fall over at 3am — Go is a very different story.
Go gives you, among other things, a race detector (`go test -race`) that will find bugs Python won't even see.
The Python frameworks assume you'll have one request at a time or handle concurrency via queues outside the framework. When you're dealing with WhatsApp webhooks at scale, with multiple events per user arriving milliseconds apart, that assumption breaks.
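To make the concurrency point concrete, here is a minimal sketch of the kind of shared state a burst of webhooks touches. The type and function names (`sessionCounts`, `simulateBurst`) are invented for illustration and are not part of eywa. With the mutex in place the code is safe; remove it and the race detector flags the unsynchronized map access.

```go
package main

import (
	"fmt"
	"sync"
)

// sessionCounts tracks how many events arrived per user session.
// Hypothetical type for illustration; the mutex guards the map against
// concurrent webhook handlers.
type sessionCounts struct {
	mu     sync.Mutex
	counts map[string]int
}

func newSessionCounts() *sessionCounts {
	return &sessionCounts{counts: make(map[string]int)}
}

// Record increments the counter for one session while holding the lock.
func (s *sessionCounts) Record(sessionID string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.counts[sessionID]++
}

// Get returns the current count for a session.
func (s *sessionCounts) Get(sessionID string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.counts[sessionID]
}

// simulateBurst starts n goroutines that all record an event for the same
// session, mimicking webhooks arriving milliseconds apart.
func simulateBurst(s *sessionCounts, sessionID string, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.Record(sessionID)
		}()
	}
	wg.Wait()
}

func main() {
	s := newSessionCounts()
	simulateBurst(s, "user-42", 100)
	fmt.Println(s.Get("user-42"))
}
```

Running a program like this with the `-race` flag (`go run -race` or `go test -race`) is how the unguarded variant would be caught before production.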
The core principle in eywa is that the business domain should have absolutely zero knowledge of infrastructure.
No OpenAI SDK imports in domain code. No Redis calls. No MongoDB queries. Just interfaces — what eywa calls ports.
The domain defines what it needs. Infrastructure implements it. Wiring happens at startup.
Here's the Bond port — the distributed lock:
    type Bond interface {
        AcquireLock(ctx context.Context, key string, ttl time.Duration) (bool, error)
        ReleaseLock(ctx context.Context, key string) error
        ExtendLock(ctx context.Context, key string, ttl time.Duration) error
    }
The domain knows it can acquire and release locks. It does not know that the implementation uses Redis Redlock under the hood. In tests, you inject a no-op. In production, you inject the Redis adapter.
Same pattern for the Oracle (the LLM abstraction):
    type OracleRequest struct {
        Model        string
        SystemPrompt string
        Messages     []OracleMessage
        Temperature  float64
        MaxTokens    int
        Tools        []OracleTool
        UseTools     bool
        Attachments  []LLMAttachment
    }
The domain sends an `OracleRequest`. Whether that goes to Anthropic, OpenAI, Gemini, Bedrock, or VertexAI is an infrastructure concern. Swap providers at startup. Run multiple providers simultaneously. The domain doesn't care.
This is not over-engineering. It's what makes the system testable, maintainable, and survivable when the next LLM provider comes out and everyone wants to switch.
One thing I invested heavily in: naming. Not just clean variable names — a consistent domain vocabulary that every piece of code uses.
Yes, the names are intentional. I wanted a consistent domain vocabulary instead of "Manager", "Service", "Handler", and "Util" — names that tell you nothing about what the component actually does in the context of an AI agent.
| Name | What it is |
|---|---|
| Weave | The runtime engine: orchestrates everything per event |
| Spirit | Agent configuration: LLM, tools, system prompt, behavior |
| Pulse | Inbound event: a message received from a channel |
| Oracle | LLM abstraction: send prompt, receive response |
| Bond | Distributed lock: prevents concurrent duplicate responses |
| Voice | Outbound adapter: sends replies back to the channel |
| Scout | Context enrichment step: runs before the LLM call |
| Lore | RAG: retrieval-augmented generation |
| Imprint | Long-term memory injection |
| Vigil | Human-in-the-loop takeover |
| Rite | Approval workflow: gates actions behind human confirmation |
| Conduit | MCP (Model Context Protocol) client adapter |
When your code says `bond.AcquireLock(...)` instead of `redisLock.Lock(...)`, you stop thinking about infrastructure and start thinking about the domain. Terminology is design.
Here's a scenario that happens in production and almost no framework handles it:
A user sends a WhatsApp message. The webhook fires. Your agent starts processing — LLM call in progress, 800ms into it.
The user gets impatient and sends the same message again. Second webhook fires.
Now you have two goroutines processing the same user's context simultaneously. The first finishes, writes the response and updates memory. The second finishes, writes another response using stale memory state, overwriting the first update.
The user gets two responses. Memory is inconsistent. You've introduced a race condition at the application level.
This is Bond.
Before the Weave processes any Pulse, it acquires a distributed lock keyed by the user's session ID. If the lock is already held, the event is discarded. Only one active processing per user, ever.
The contract is precise: `AcquireLock` returns `(false, nil)` when the lock is held (the expected case), and `(false, error)` only for infrastructure failures. This distinction matters: the caller handles them differently.
The pipeline that runs on every Pulse before the LLM call:
Pulse → Scouts → Pathfinder → Spirit → Oracle → Actions → Voice
Scouts are sequential context enrichment steps. They read from external systems and inject knowledge into the Pulse before the model sees anything.
    type Scout interface {
        GetName() string
        Harvest(ctx context.Context, event *entities.Pulse) error
        IsApplicable(event *entities.Pulse) bool
    }
The critical design decision: Scouts are fail-open.
A Scout that returns an error gets logged. The pipeline continues without its data. The LLM call still happens.
Why? Because if a Scout is hitting a CRM to enrich the user's context, and the CRM is having a slow morning, you don't want your entire agent to stop responding. You want it to keep working with less context, gracefully.
The entire Weave is assembled at startup with a fluent builder:
    weave, err := eywa.NewWeaveBuilder(ctx).
        WithRepositories(spiritRepo, memoryRepo, echoRepo, chronicleRepo).
        WithBond(bond).
        WithActionRegistry(eywa.NewActionRegistry()).
        WithScoutRegistry(eywa.NewScoutRegistry()).
        AddOracle(eywaopenai.NewOracle(apiKey)).
        WithConfig(config).
        Build()
MongoDB for Spirit configuration and conversation history. Redis for distributed locking and in-flight memory. OpenAI as the Oracle. Everything injected — nothing global.
To add Anthropic as an additional provider:
        AddOracle(eywaopenai.NewOracle(openaiKey)).
        AddOracle(eywaanthropic.NewOracle(anthropicKey)).
Spirits define which provider they use. The OracleFactory selects the right one at runtime.
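One plausible way to implement that runtime selection is a registry keyed by provider name. This is a sketch under assumptions: `OracleRegistry`, its `Add` and `For` methods, the `Provider()` method, and the `Provider` field on `Spirit` are all invented for illustration and may differ from how eywa's OracleFactory works.

```go
package main

import (
	"fmt"
)

// Oracle is reduced to the one method this sketch needs (hypothetical).
type Oracle interface {
	Provider() string
}

// staticOracle is a placeholder adapter that only knows its provider name.
type staticOracle struct{ provider string }

func (s staticOracle) Provider() string { return s.provider }

// Spirit is a stand-in carrying only the provider choice (assumed field).
type Spirit struct {
	Name     string
	Provider string
}

// OracleRegistry (hypothetical) maps provider names to registered oracles.
type OracleRegistry struct {
	oracles map[string]Oracle
}

func NewOracleRegistry() *OracleRegistry {
	return &OracleRegistry{oracles: make(map[string]Oracle)}
}

// Add registers an oracle under its provider name and returns the
// registry so calls can be chained at startup.
func (r *OracleRegistry) Add(o Oracle) *OracleRegistry {
	r.oracles[o.Provider()] = o
	return r
}

// For resolves the oracle a Spirit asked for, or reports a clear error
// when that provider was never wired in.
func (r *OracleRegistry) For(s Spirit) (Oracle, error) {
	o, ok := r.oracles[s.Provider]
	if !ok {
		return nil, fmt.Errorf("spirit %q: no oracle registered for provider %q", s.Name, s.Provider)
	}
	return o, nil
}

func main() {
	reg := NewOracleRegistry().
		Add(staticOracle{provider: "openai"}).
		Add(staticOracle{provider: "anthropic"})

	o, err := reg.For(Spirit{Name: "support-bot", Provider: "anthropic"})
	fmt.Println(o.Provider(), err)

	_, err = reg.For(Spirit{Name: "sales-bot", Provider: "gemini"})
	fmt.Println(err)
}
```

Failing loudly on an unregistered provider turns a misconfigured Spirit into an immediate, readable error instead of a silent fallback to the wrong model.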
The entire framework ships as 19 independent Go modules:
    github.com/wmulabs/eywa          # core
    github.com/wmulabs/eywa/fiber    # HTTP adapter
    github.com/wmulabs/eywa/mongo    # MongoDB repositories
    github.com/wmulabs/eywa/redis    # Redis Bond + memory
    github.com/wmulabs/eywa/mcp      # MCP client (Conduit)
    github.com/wmulabs/eywa/providers/anthropic
    github.com/wmulabs/eywa/providers/openai
    github.com/wmulabs/eywa/providers/gemini
    github.com/wmulabs/eywa/providers/bedrock
    github.com/wmulabs/eywa/providers/vertexai
    github.com/wmulabs/eywa/providers/weaviate
    github.com/wmulabs/eywa/providers/qdrant
    github.com/wmulabs/eywa/providers/pgvector
    github.com/wmulabs/eywa/providers/pinecone
    github.com/wmulabs/eywa/channels/whatsapp
    github.com/wmulabs/eywa/gcp/cloudtasks
    github.com/wmulabs/eywa/gcp/gcs
    github.com/wmulabs/eywa/gcp/gemini
If you don't use Bedrock, you don't import it. You don't get its dependencies in your `go.sum`. You don't get its security surface. Go developers care about this.
Before calling this v1.0, I went through a proper security review:
- `file://`, `ftp://`: all blocked
- `io.LimitReader` to bound reads
- `subtle.ConstantTimeCompare`: no timing attacks
- `go test -race`
None of this is exciting. All of it matters.
eywa is at v1.0.0. Stable, production-hardened, documented.
If you're building AI agents in Go — or you've been wanting to but couldn't find something serious enough to base a production system on — I'd genuinely love your feedback.
Pull requests welcome. Issues welcome. Blunt criticism welcome.
Maybe the world didn't need another AI framework. But it definitely needed more engineering in the ones it had.