{"slug": "aegis-gov-a-small-python-library-for-multi-agent-task-graphs-and-circuit", "title": "aegis-gov: a small Python library for multi-agent task graphs and circuit breakers", "summary": "A developer created aegis-gov, a small Python library for multi-agent task graphs and circuit breakers that separates coordination scaffolding from business logic in multi-agent LLM systems. The library provides TaskQueue and AgentPool classes, supports multiple LLM providers via adapters, and validates dependency graphs for cycles at construction time.", "body_md": "Multi-agent LLM systems have a coordination problem that most tutorials skip past. You can string together a few `asyncio.gather` calls or a list of prompts, but once you need three or four agents to hand work to each other in a defined order, and you need the whole thing to degrade gracefully when one call fails, the scaffolding grows quickly and gets tangled with provider-specific SDK code.\n\nI wrote `aegis-gov` to separate that coordination scaffolding from the business logic. It is a small Python library (one hard dependency: `requests`) that provides:\n\nThis article walks through each of those pieces, shows you the actual code, and is honest about what is not there yet.\n\nSuppose you have four agents: a researcher, a writer, a translator, and a publisher. The writer depends on the researcher. The translator and publisher both depend on the writer but can run in parallel. The publisher should not run at all if the writer failed.\n\nWithout a scheduler, you write this by hand every time. The failure-cascade logic especially tends to become a set of nested conditionals that grows with each new dependency edge. 
And if your LLM provider returns a string of 429s or 5xx errors, there is nothing to stop the loop from hammering the endpoint until you kill the process.\n\n`aegis-gov` addresses both problems with two focused classes: `TaskQueue` and `AgentPool`.\n\n```\npip install aegis-gov                      # requests only\npip install \"aegis-gov[anthropic]\"         # + Anthropic SDK\npip install \"aegis-gov[openai]\"            # + OpenAI SDK\npip install \"aegis-gov[all]\"               # both LLM SDKs\n```\n\nPython 3.10+ is required.\n\nThe `LLMAdapter` protocol has two methods: `generate()` returns a string, `stream()` yields string chunks. All three concrete adapters satisfy this protocol:\n\n``` python\nfrom aegis_gov import AnthropicAdapter, OpenAIAdapter, OllamaAdapter\n\n# Anthropic\nadapter = AnthropicAdapter(model=\"claude-sonnet-4-6\")\n\n# OpenAI or any compatible endpoint (LM Studio, vLLM, Azure, etc.)\nadapter = OpenAIAdapter(model=\"gpt-4o-mini\", base_url=\"http://localhost:1234/v1\")\n\n# Ollama: no extra package needed, communicates over HTTP\nadapter = OllamaAdapter(model=\"qwen2.5:14b\")\n```\n\nThe adapter is a field on `AgentConfig`, so switching providers for a single agent is a one-line change. 
The rest of your orchestration code does not need to know which provider is in use.\n\n``` python\nfrom aegis_gov import OpenMultiAgent, AgentConfig, AnthropicAdapter\n\noma = OpenMultiAgent()\nresult = oma.run_agent(\n    AgentConfig(\n        name=\"analyst\",\n        system_prompt=\"You are a concise market analyst.\",\n        adapter=AnthropicAdapter(model=\"claude-haiku-4-5-20251001\"),\n    ),\n    task=\"Top 3 open-source multi-agent frameworks in 2026?\",\n)\nprint(result)\n```\n\n`TaskQueue` takes a list of `Task` objects, validates the dependency graph for cycles at construction time (raises `CyclicDependencyError` if one is found), and exposes a `ready()` method that returns the tasks whose dependencies are all in `done` state.\n\n``` python\nfrom aegis_gov import OpenMultiAgent, Task\nfrom aegis_gov import AgentConfig, AnthropicAdapter\n\nteam = OpenMultiAgent.create_team(\"pipeline\", [\n    AgentConfig(name=\"researcher\",  system_prompt=\"Research topics thoroughly.\"),\n    AgentConfig(name=\"writer\",      system_prompt=\"Write clear reports.\"),\n    AgentConfig(name=\"translator\",  system_prompt=\"Translate to Japanese.\"),\n    AgentConfig(name=\"publisher\",   system_prompt=\"Format output as Markdown.\"),\n])\n\ntasks = [\n    Task(id=\"research\",   description=\"Research AI trends\",        agent=\"researcher\"),\n    Task(id=\"draft\",      description=\"Write a report draft\",      agent=\"writer\",     depends_on=[\"research\"]),\n    Task(id=\"translate\",  description=\"Translate to Japanese\",     agent=\"translator\", depends_on=[\"draft\"]),\n    Task(id=\"publish\",    description=\"Format as Markdown\",        agent=\"publisher\",  depends_on=[\"draft\"]),\n]\n\noma = OpenMultiAgent()\nresults = oma.run_tasks(tasks, team=team)\n```\n\n`translate` and `publish` both depend only on `draft`, so they execute in parallel once `draft` completes.\n\nWhen `stop_on_failure=False` (the default), only tasks 
that directly or transitively depend on a failed task are skipped. Independent branches continue:\n\n``` python\nfrom aegis_gov import TaskQueue, Task\n\nq = TaskQueue([\n    Task(id=\"fetch\",   description=\"Fetch data\"),\n    Task(id=\"process\", description=\"Process data\",  depends_on=[\"fetch\"]),\n    Task(id=\"report\",  description=\"Write report\",  depends_on=[\"process\"]),\n], stop_on_failure=False)\n\nq.complete(\"fetch\", success=False)\nprint(q.skipped_tasks())  # [\"process\", \"report\"]\nprint(q.summary())        # {\"failed\": 1, \"skipped\": 2}\n```\n\nSetting `stop_on_failure=True` halts the entire queue on the first failure.\n\n`AgentPool` wraps a `threading.Semaphore` to bound how many agents run simultaneously, and tracks consecutive failures to open the circuit:\n\n``` python\nfrom aegis_gov import AgentPool, OpenMultiAgent\n\npool = AgentPool(\n    max_concurrent=4,\n    consecutive_failure_limit=5,\n    recovery_timeout_s=30.0,\n)\noma = OpenMultiAgent(pool=pool)\nprint(oma.get_status())\n# {\"pool_state\": \"closed\", \"pool_consecutive_failures\": 0, ...}\n```\n\nThe state machine has three states: `CLOSED` (normal), `OPEN` (rejecting new work, raising `CircuitOpenError`), and `HALF_OPEN` (sending one probe call after `recovery_timeout_s` elapses). A successful probe returns the circuit to `CLOSED`; another failure reopens it.\n\nAgents can be given callable tools:\n\n``` python\nfrom aegis_gov import ToolRegistry\n\nregistry = ToolRegistry()\n# five built-ins are registered automatically:\n# file_read, http_get, shell, memory_store, memory_retrieve\n\nregistry.define_tool(\n    name=\"search_web\",\n    description=\"Search the web for recent information\",\n    fn=my_search_fn,\n    schema={\"type\": \"object\", \"properties\": {\"query\": {\"type\": \"string\"}}, \"required\": [\"query\"]},\n)\n```\n\nThe built-in tools are thin wrappers. 
They are not hardened for production use; treat them as stubs you replace with your own implementations.\n\nBeing honest about scope matters:\n\n- `threading.Semaphore` and `ThreadPoolExecutor`. If you need `asyncio`-native agents, this is not the right library yet.\n- `stream()`, but `run_agent()` and `run_tasks()` collect the full response before returning.\n- `shell`, `http_get`, etc. are thin wrappers without sandboxing, rate limiting, or error enrichment.\n\nAreas I plan to work on next, in rough priority order:\n\n- (`asyncio.Semaphore`, `async def generate()`)\n- `run_agent()`\n\nContributions and issue reports are welcome. The test suite uses pytest; see `pyproject.toml` for the dev extras.\n\n`pip install \"aegis-gov[all]\"`", "url": "https://wpnews.pro/news/aegis-gov-a-small-python-library-for-multi-agent-task-graphs-and-circuit", "canonical_source": "https://dev.to/th19930828/aegis-gov-a-small-python-library-for-multi-agent-task-graphs-and-circuit-breakers-45h2", "published_at": "2026-06-17 13:09:52+00:00", "updated_at": "2026-06-17 13:21:58.628989+00:00", "lang": "en", "topics": ["large-language-models", "ai-agents", "developer-tools"], "entities": ["aegis-gov", "OpenMultiAgent", "TaskQueue", "AgentPool", "AnthropicAdapter", "OpenAIAdapter", "OllamaAdapter", "Python"], "alternates": {"html": "https://wpnews.pro/news/aegis-gov-a-small-python-library-for-multi-agent-task-graphs-and-circuit", "markdown": "https://wpnews.pro/news/aegis-gov-a-small-python-library-for-multi-agent-task-graphs-and-circuit.md", "text": "https://wpnews.pro/news/aegis-gov-a-small-python-library-for-multi-agent-task-graphs-and-circuit.txt", "jsonld": "https://wpnews.pro/news/aegis-gov-a-small-python-library-for-multi-agent-task-graphs-and-circuit.jsonld"}}