Tool Permission Matrix Builder & Validator: Structured, Visual Policy Management for AI Agent Teams

A developer built the Tool Permission Matrix Builder & Validator, a visual policy management system for AI agent teams that defines tools, classifies risk, assigns roles, and validates permissions. The system uses Claude for analysis and supports async backends with heuristic fallbacks, exporting policy artifacts in JSON, YAML, or Python. It addresses governance failures by replacing ad-hoc scripts with structured, real-time validation and sprawl analysis.

AI agents in production access tools that range from harmless read-only queries to irreversible destructive operations. Managing which agents can use which tools is a governance problem that most teams solve with ad-hoc scripts and tribal knowledge, and that works until it doesn't. A misconfigured role, an over-exposed tool, or an agent that silently calls something it shouldn't are the kinds of failures that surface in production rather than in review.

Tool Permission Matrix Builder & Validator replaces that with a structured, visual approach. It is a visual policy management system for AI agent teams: define tools, classify their risk, assign roles, and drag-and-drop permissions onto a matrix, then export machine-readable policy artifacts or validate existing agents for compliance, all powered by Claude (claude-sonnet-4-20250514).

The platform addresses the full lifecycle of agent tool governance in one place. It starts with tool registration: each tool is defined and assigned a risk category (read-only, internal-write, external-api, financial, destructive, or administrative). Roles are then created for each agent type: analyst, operator, admin, readonly-bot, or whatever the team's structure requires. The permission matrix takes these two dimensions and lets permissions be assigned by dragging tools onto roles or clicking individual cells to toggle between allowed, denied, and inherited states. The matrix validates in real time: if a role has access to a tool whose risk level exceeds what that role should have, a warning appears immediately. Once the matrix is configured, a policy artifact is exported: JSON for machine consumption, YAML for GitOps workflows, or a Python module with a check_permission(role, tool) function that can be imported directly into agent code.
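To make the matrix model concrete, here is a minimal sketch of what an exported policy module and the real-time warning check might look like. Only the six risk categories, the three cell states, and the check_permission(role, tool) name come from the description above; the sample tools, the dictionaries, the validation_warnings helper, the parent-role lookup for inherited cells, and the deny-by-default behaviour are assumptions made for illustration, not the project's actual generated code.

```python
from enum import Enum


class RiskCategory(str, Enum):
    READ_ONLY = "read-only"
    INTERNAL_WRITE = "internal-write"
    EXTERNAL_API = "external-api"
    FINANCIAL = "financial"
    DESTRUCTIVE = "destructive"
    ADMINISTRATIVE = "administrative"


class Cell(str, Enum):
    ALLOWED = "allowed"
    DENIED = "denied"
    INHERITED = "inherited"


# Hypothetical sample data; a real export would be generated from the matrix.
TOOL_RISK = {
    "search_docs": RiskCategory.READ_ONLY,
    "update_ticket": RiskCategory.INTERNAL_WRITE,
    "delete_resource": RiskCategory.DESTRUCTIVE,
}
ROLE_ALLOWED_RISKS = {
    "analyst": {RiskCategory.READ_ONLY},
    "operator": {RiskCategory.READ_ONLY, RiskCategory.INTERNAL_WRITE},
}
ROLE_PARENT = {"analyst": None, "operator": "analyst"}
MATRIX = {
    ("analyst", "search_docs"): Cell.ALLOWED,
    ("analyst", "delete_resource"): Cell.ALLOWED,  # deliberately over-permissive
    ("operator", "search_docs"): Cell.INHERITED,
    ("operator", "update_ticket"): Cell.ALLOWED,
}


def check_permission(role, tool):
    """Return True only if the cell resolves to ALLOWED.

    Assumption: missing cells and unresolved inheritance are treated as denied.
    """
    seen = set()
    current = role
    while current is not None and current not in seen:
        seen.add(current)  # guards against a cyclic parent chain
        cell = MATRIX.get((current, tool), Cell.DENIED)
        if cell is not Cell.INHERITED:
            return cell is Cell.ALLOWED
        current = ROLE_PARENT.get(current)
    return False


def validation_warnings():
    """List ALLOWED cells whose tool risk is outside the role's allowed risks."""
    warnings = []
    for (role, tool), cell in sorted(MATRIX.items()):
        risk = TOOL_RISK[tool]
        if cell is Cell.ALLOWED and risk not in ROLE_ALLOWED_RISKS[role]:
            warnings.append(f"{role} is allowed {tool} ({risk.value}), which exceeds its risk levels")
    return warnings
```

In agent code, a guard such as `if not check_permission("operator", "delete_resource"): raise PermissionError(...)` would sit in front of each tool call. Denying by default keeps an unconfigured cell from silently granting access, which seems the safer choice for a sketch like this.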
On the validation side, existing agent code can be pasted in and Claude analyzes which tools it actually calls, cross-checks those against the matrix, and produces a security score with sorted recommendations. A separate sprawl analysis detects over-exposure: roles with too many high-risk tools, tools granted to too many roles, and unused grants.

The backend is fully async: all 24 routes use async def with an aiosqlite-backed SQLAlchemy session. This is intentional. The Claude API calls in agent validation and sprawl analysis can take 5–15 seconds, and with a synchronous backend, one validation request would block all other users. With async, many concurrent requests are handled without blocking.

Both AI services have heuristic fallbacks. If ANTHROPIC_API_KEY is not set, the agent validator still extracts tool call patterns from code using regex and checks them against the matrix, and the sprawl analyzer still computes numerical sprawl metrics. The Claude path produces richer narrative and nuanced recommendations; the heuristic path still provides actionable data.

The policy generator produces three output formats from the same matrix data. The Python module output is syntax-verified via py_compile before being returned, ensuring the downloaded file always parses as valid Python. The repo also includes architecture.svg.

Prerequisites

Set up the environment

    cp .env.example .env
    # Optionally add ANTHROPIC_API_KEY=sk-ant-your-key-here for Claude analysis

Run the backend

    cd backend
    pip install -r requirements.txt
    uvicorn main:app --reload --host 0.0.0.0 --port 8000

The API starts at http://localhost:8000. Swagger UI is available at http://localhost:8000/docs.

Run the frontend

The frontend runs in a separate process:

    cd frontend
    npm install
    npm run dev

The UI opens at http://localhost:5173.

Run with Docker

    cp .env.example .env
    docker compose up --build

The backend runs on port 8000 with a health check.
The frontend serves via nginx on port 80 and waits for the backend health check before starting.

Running Tests

    cd backend && python -m pytest tests/ -v

22 tests: policy generation (JSON/YAML/Python), agent validator, heuristic analysis. Runs in under a second.

All six risk categories are implemented as a Python Enum and stored in the database. The permission matrix UI shows a risk color on every tool badge, and real-time validation warnings fire when a role's allowed risk levels would be exceeded.

    tool-permission-matrix/
    ├── backend/
    │   ├── main.py                        # FastAPI app, 24 async routes
    │   ├── models.py                      # Tool, Role, Permission ORM + RiskCategory Enum
    │   ├── schemas.py                     # Pydantic v2 request/response schemas
    │   ├── database.py                    # Async SQLite via aiosqlite
    │   ├── services/
    │   │   ├── policy_generator.py        # JSON, YAML, and Python module export
    │   │   ├── agent_validator.py         # Claude + heuristic agent code analysis
    │   │   └── sprawl_analyzer.py         # Claude + heuristic sprawl detection
    │   ├── requirements.txt
    │   ├── Dockerfile.backend
    │   └── tests/
    │       ├── test_policy_generator.py   # 11 policy generation tests
    │       ├── test_validator.py          # 11 validator tests
    │       └── fixtures/
    │           ├── sample_agent.py        # Realistic agent with tool call patterns
    │           └── sample_policy.json     # Realistic permission matrix fixture
    ├── frontend/
    │   ├── src/
    │   │   ├── App.tsx                    # Tab layout: Tools/Roles/Matrix/Export/Validate/Sprawl
    │   │   ├── stores/
    │   │   │   ├── toolStore.ts           # Zustand store for tool state
    │   │   │   ├── roleStore.ts           # Zustand store for role state
    │   │   │   └── matrixStore.ts         # Zustand store for permission matrix
    │   │   ├── components/
    │   │   │   ├── ToolRegistry.tsx       # CRUD + filter + JSON import/export
    │   │   │   ├── RoleManager.tsx        # CRUD + inheritance + risk levels
    │   │   │   ├── PermissionMatrix.tsx   # @dnd-kit DnD grid
    │   │   │   ├── PolicyExporter.tsx     # Format selector + download
    │   │   │   ├── AgentValidator.tsx     # Paste/upload + results display
    │   │   │   └── SprawlAnalysis.tsx     # Sprawl score + issues list
    │   │   ├── api/client.ts              # axios-based API client, 20 methods
    │   │   └── types/index.ts             # TypeScript interfaces (28 types)
    │   ├── Dockerfile.frontend
    │   ├── package.json
    │   └── vite.config.ts
    ├── docker-compose.yml
    └── .env.example

The structure maps directly onto the platform's three functional layers. The backend/services/ directory holds the three pieces that do the heavy lifting (policy generation, agent validation, and sprawl analysis), each isolated from the routing layer in main.py. The frontend mirrors this with one component per tab in the UI, with tool state, role state, and matrix state each managed by a dedicated Zustand store.

Async throughout - All backend routes are async def and the SQLAlchemy session uses aiosqlite. The Claude API calls in agent validation and sprawl analysis can take 5–15 seconds. A synchronous backend would block all other users during that window; the async design handles many concurrent requests without blocking.

Three-state permission model - Each matrix cell is ALLOWED, DENIED, or INHERITED, not just a binary toggle. INHERITED means the permission comes from the role's parent role, enabling role hierarchies where a base role defines conservative defaults and derived roles override specific tools.

Heuristic fallback for AI features - Claude-powered features are never the only path. The agent validator extracts tool calls using regex patterns that cover the most common calling conventions, then checks them against the matrix. The sprawl analyzer computes over-exposure metrics numerically. The platform is fully usable in restricted environments without an API key; Claude's analysis is an enhancement rather than a dependency.

Policy Python module verification - When generating a Python module, py_compile is called on the output before returning it. A permissions.py that fails to compile would be worse than no policy at all, so this check runs as a hard gate.
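The compile gate described above can be sketched with the standard library alone. The function names verify_python_module and export_python_policy are hypothetical and the project's own generator may be structured differently; the only real API used is py_compile.compile with doraise=True, which raises py_compile.PyCompileError rather than printing the error.

```python
import os
import py_compile
import tempfile


def verify_python_module(source: str) -> None:
    """Raise py_compile.PyCompileError if `source` is not syntactically valid Python."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "permissions.py")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(source)
        # doraise=True turns a syntax error into an exception instead of a
        # message on stderr; cfile keeps the bytecode inside the temp directory.
        py_compile.compile(path, cfile=os.path.join(tmp, "permissions.pyc"), doraise=True)


def export_python_policy(source: str) -> str:
    """Hypothetical export step: return the module text only if it compiles."""
    verify_python_module(source)
    return source
```

An API handler could catch py_compile.PyCompileError and return an error response instead of a download. Note that compiling checks syntax only; it does not execute the module or confirm that its imports resolve.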
The backend ships with 22 tests covering policy generation in all three export formats (JSON, YAML, Python module), agent validator tool-call extraction across standard and use_tool/call_tool calling conventions, heuristic analysis correctness, and edge cases like empty code and missing policy. The frontend builds cleanly to a 277 KB JS bundle across 110 modules with @dnd-kit drag-and-drop and Zustand state management.

For the AI-powered sprawl analysis, this verification pass ran the SprawlAnalyzer with DeepSeek V4 Flash via OpenRouter against a three-role matrix (admin, developer, viewer) with six tools spanning read, write, and destructive categories. The model returned a sprawl score of 80/100 and surfaced nine issues. Two were critical: the admin role holding both execute_code and delete_resource, and the developer role also having execute_code with no approval gate. The overall analysis named the pattern as excessive concentration of destructive tool access and recommended introducing approval workflows before any destructive operation.

This project was built using NEO. NEO (https://heyneo.com/) is a fully autonomous AI engineering agent that can write code and build solutions for AI/ML tasks including AI model evals, prompt optimization, and end-to-end AI pipeline development. The requirement was a visual policy management platform where AI agent teams could define tools, classify risk, assign roles, configure permissions on a drag-and-drop matrix, and export machine-readable policy artifacts, with Claude-powered validation and sprawl analysis built in. NEO planned and produced the files in this repository: a fully async FastAPI backend with 24 routes, three backend services handling policy generation, agent validation, and sprawl analysis, a React and TypeScript frontend with a drag-and-drop permission matrix, six UI components, three Zustand stores, and a 22-test suite covering all major paths.
The plans/ directory and ORCHESTRATOR_LOG.md in the repo document that build run directly. The result is a fully working policy management platform, from tool registration through risk classification, matrix configuration, policy export, and agent validation, with heuristic fallbacks at every AI-powered step so the platform remains useful with or without an API key.

Govern tool access across an existing AI agent team. Any team running multiple agents with different access levels can register their tools, classify them by the six built-in risk categories, and configure a permission matrix without writing a single line of policy code. The matrix validates in real time as roles and permissions are assigned.

Validate existing agent code against a policy. Agent code can be pasted directly into the platform and the validator extracts which tools it actually calls, cross-checks them against the configured matrix, and returns a security score with specific recommendations. The heuristic path works without an API key; the Claude path produces richer analysis when ANTHROPIC_API_KEY is set.

Export a check_permission(role, tool) function directly into agent code. The Python export produces a permissions.py file that is syntax-verified before download and can be imported directly into any agent codebase, with no manual policy translation required.

Detect permission sprawl in an existing matrix. The sprawl analysis endpoint scores the matrix for over-exposure: roles with too many high-risk tools, tools granted to too many roles, and unused grants. The heuristic path computes numerical metrics without an API key; the Claude path names specific patterns and recommends remediation when ANTHROPIC_API_KEY is set.

The gap between "we think our agents have the right permissions" and "we can prove it and export it as code" is where this platform sits. Tool access in AI agent systems is a governance problem that gets harder as teams scale: more agents, more tools, more roles, and no single source of truth.
The Tool Permission Matrix Builder & Validator makes that source of truth visual, exportable, and machine-readable.

The code is at https://github.com/dakshjain-1616/Tool-Permission-Matrix-Builder-Validator

You can also build with NEO in your IDE using the VS Code extension (https://marketplace.visualstudio.com/items?itemName=NeoResearchInc.heyneo) or Cursor (https://open-vsx.org/extension/NeoResearchInc/heyneo). You can use NEO MCP with Claude Code: https://heyneo.com/claude-code