mcp-probe v1.4.0: Contract assertions for production MCP servers

The article announces the release of mcp-probe v1.4.0, a tool that adds contract assertions for production MCP servers. It explains that basic startup and schema checks are insufficient for production environments, as servers can pass these checks but still fail due to broken auth, permissions, or data boundaries. The new version allows teams to define expected outcomes for tool calls—such as status, required fields, error codes, and content checks—enabling CI pipelines to validate the actual contracts that AI agents depend on.

MCP servers are starting to look like infrastructure. That means the old readiness question is no longer enough:

Does the process start?

Even this is not enough:

Does tools/list return a clean schema?

A server can pass both checks and still fail every real agent loop because auth handoff, scopes, downstream permissions, environment setup, or data boundaries are broken.

So I shipped mcp-probe v1.4.0 with contract assertions for production MCP servers.

GitHub: https://github.com/k08200/mcp-probe
npm: https://www.npmjs.com/package/@k08200/mcp-probe

A typical MCP smoke test looks like this:

- initialize
- tools/list

That catches broken startup and malformed tools. But it misses the failures that matter in production:

- 401

In other words: the server starts, but the contract is broken.

mcp-probe already supported sidecar inputs via .mcp-probe.json so teams could run real tools/call checks instead of relying on schema-minimum dummy inputs. v1.4.0 extends that sidecar with assertions.

Example for a database-backed MCP server:

    {
      "tools": {
        "execute_sql": {
          "input": {
            "project_id": "YOUR_PROJECT_ID",
            "query": "select 1 as health_check"
          },
          "expect": {
            "status": "pass",
            "requiredFields": ["rowCount", "limit", "source", "freshness"],
            "maxRows": 100
          }
        },
        "execute_sql_write_denied": {
          "input": {
            "project_id": "YOUR_PROJECT_ID",
            "query": "delete from users where id = 1"
          },
          "expect": {
            "status": "fail",
            "errorCode": "WRITE_NOT_ALLOWED",
            "notContains": ["DATABASE_URL", "password", "stack"]
          }
        }
      }
    }

Now CI can validate the contract an agent actually depends on.

expect.status

Declare whether a call should pass, fail, or warn. This is important for negative probes. A write attempt against a read-only DB role should fail. In that case, failure is success.

    { "expect": { "status": "fail" } }

expect.requiredFields

Validate that result metadata exists. For database tools, an agent often needs more than rows.
It needs context:

- rowCount
- limit
- source
- freshness

    { "expect": { "requiredFields": ["rowCount", "limit", "source", "freshness"] } }

expect.maxRows

Catch broad exports or missing limits.

    { "expect": { "maxRows": 100 } }

mcp-probe looks for common result shapes such as rowCount, rowsReturned, rows, data, items, and records.

expect.errorCode

Require stable structured error codes.

    { "expect": { "status": "fail", "errorCode": "WRITE_NOT_ALLOWED" } }

This matters because agents can only recover if errors are predictable.

expect.contains and expect.notContains

Check for expected output and leaked internals.

    { "expect": { "notContains": ["DATABASE_URL", "password", "stack"] } }

This catches errors that expose raw internals.

expect.not_error_code

Treat known auth/permission status codes as warnings instead of hard failures.

    { "expect": { "not_error_code": [401, 403] } }

This keeps OAuth handoff failures visible without confusing them with transport or runtime crashes.

When assertions pass:

    Tool Call Dry-run
      ✓ db_query  sidecar  1ms
          ✓ status: Tool status matched expected pass
          ✓ requiredFields.rowCount: Found required field "rowCount"
          ✓ requiredFields.limit: Found required field "limit"
          ✓ requiredFields.source: Found required field "source"
          ✓ requiredFields.freshness: Found required field "freshness"
          ✓ maxRows: Row count 1 is within maxRows 100
      ✓ db_write  sidecar  0ms
          ✓ status: Tool status matched expected fail
          ✓ errorCode: Found expected error code WRITE_NOT_ALLOWED
          ✓ notContains.DATABASE_URL: Output does not contain "DATABASE_URL"
          ✓ notContains.password: Output does not contain "password"
          ✓ notContains.stack: Output does not contain "stack"

If a contract assertion fails, mcp-probe reports:

    CONTRACT ASSERTION FAILED

and includes per-assertion details in terminal output, JSON output, and GitHub Actions summaries.
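
To make the assertion semantics concrete, here is a minimal Python sketch of how checks like these could be evaluated against a tool result. It is not mcp-probe's implementation. The function names (count_rows, check_contract), the idea that the error code sits under an "errorCode" key, and the choice to fail maxRows when no row count can be found are all assumptions made for illustration. Only the expect keys and the list of result shapes come from the article.

```python
import json
from typing import List, Optional

# Result shapes the article says mcp-probe looks for when counting rows.
COUNT_KEYS = ("rowCount", "rowsReturned")
LIST_KEYS = ("rows", "data", "items", "records")


def count_rows(result: dict) -> Optional[int]:
    """Return a row count from a common result shape, or None if none is present."""
    for key in COUNT_KEYS:
        value = result.get(key)
        if isinstance(value, int) and not isinstance(value, bool):
            return value
    for key in LIST_KEYS:
        value = result.get(key)
        if isinstance(value, list):
            return len(value)
    return None


def check_contract(status: str, result: dict, raw_output: str, expect: dict) -> List[str]:
    """Return failure messages; an empty list means every assertion held."""
    failures: List[str] = []

    if "status" in expect and status != expect["status"]:
        failures.append(f"status: expected {expect['status']}, got {status}")

    for field in expect.get("requiredFields", []):
        if field not in result:
            failures.append(f"requiredFields.{field}: missing")

    if "maxRows" in expect:
        rows = count_rows(result)
        # Assumption: an undetectable row count is treated as a failure here.
        if rows is None:
            failures.append("maxRows: no row count found")
        elif rows > expect["maxRows"]:
            failures.append(f"maxRows: {rows} exceeds {expect['maxRows']}")

    # Assumption: the structured error code lives under an "errorCode" key.
    if "errorCode" in expect and result.get("errorCode") != expect["errorCode"]:
        failures.append(f"errorCode: expected {expect['errorCode']}")

    for needle in expect.get("contains", []):
        if needle not in raw_output:
            failures.append(f"contains.{needle}: not found")
    for needle in expect.get("notContains", []):
        if needle in raw_output:
            failures.append(f"notContains.{needle}: leaked")

    return failures


# Negative probe: a denied write is the expected outcome, so status "fail" satisfies the contract.
denied = {"errorCode": "WRITE_NOT_ALLOWED", "message": "read-only role"}
write_expect = {
    "status": "fail",
    "errorCode": "WRITE_NOT_ALLOWED",
    "notContains": ["DATABASE_URL", "password", "stack"],
}
print(check_contract("fail", denied, json.dumps(denied), write_expect))  # prints []
```

Returning a list of failures rather than a single boolean mirrors the per-assertion reporting described above: each broken expectation gets its own line, so a CI log shows exactly which part of the contract drifted.
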

    npx @k08200/mcp-probe@latest init \
      --target @your-org/your-mcp-server \
      --discover \
      --github-actions

Then edit .mcp-probe.json with real read-only probes and run:

    npx @k08200/mcp-probe@latest --config mcp-probe.config.json --github-summary

MCP CI should test the contract an agent will actually depend on, not just whether the server process starts.

For database-backed MCP servers, that means validating things like:

mcp-probe should not know every server's semantics. But it can give teams a small, declarative way to encode the production contract their agents rely on.

That is the goal of v1.4.0.

Release: https://github.com/k08200/mcp-probe/releases/tag/v1.4.0
npm: https://www.npmjs.com/package/@k08200/mcp-probe