{"slug": "building-a-zero-leak-postgres-mcp-gateway-in-go", "title": "Building a Zero-Leak Postgres MCP Gateway in Go", "summary": "A developer built a zero-leak Postgres MCP gateway in Go that enables LLMs to query databases without exposing raw schema or data. The gateway uses dynamic schema reflection to auto-generate tool manifests from whitelisted tables and enforces analytical egress hardening to prevent data leaks. This approach addresses the security paradox of granting AI agents structural data access while protecting intellectual property.", "body_md": "The promise of agentic AI workflows introduces a critical architectural paradox: to make an LLM deeply useful, you must grant it structural awareness of your data layer. Traditional integration patterns force a losing trade-off. Either you hand an external orchestrator direct database access (risking catastrophic data egress), or you must serialize and persist your entire proprietary database schema onto third-party infrastructure. This exposure of internal domain definitions outside the secure perimeter represents a massive intellectual property leak, stalling production AI adoption in highly competitive or regulated sectors. For instance, a localized real estate consultancy managing proprietary compound metrics and high-value transactional ledgers cannot afford to expose its structural competitive edge to a shared cloud context just to run an analytical prompt.\n\nTo bridge this gap, backend teams must shift toward an architectural pattern where the data plane isolates schema definitions and executes only the commands explicitly defined by the MCP server, delivering pre-approved aggregations without ever leaking raw data layouts upstream. This article demonstrates how to build a zero-leak database proxy in Go using the Model Context Protocol (MCP) over a secure `stdio` transport layer.
By decoupling the LLM from direct database access, you will implement a live gateway that executes two core tasks: **Dynamic Schema Reflection** to auto-generate tool manifests programmatically, and **Analytical Egress Hardening** to ensure the external AI agent never touches a raw database row.\n\nThe project follows a standard Go folder layout: `cmd/` for the entrypoint and `pkg/db` for the Postgres connection and query logic. This isn’t a framework requirement, just a convention that keeps schema reflection, query execution and MCP transport cleanly separated. You can flatten this into a single file for prototyping.\n\nThree things make this gateway zero-leak, one of which is the explicit table allowlist (`EXPOSED_TABLES`).\n\nThe schema visibility step queries Postgres’ `information_schema.columns` view to fetch column metadata from the live database, instead of hardcoding the schema or dumping it every time the LLM needs to know what structure is available in our data layer.\n\nIn `pkg/db/postgres.go` we create an `InspectExposedSchema` function that returns a slice of `ColumnMetadata`, which can eventually be passed into the LLM context window.\n\n```\npackage db\n\nimport (\n    \"database/sql\"\n\n    \"github.com/lib/pq\"\n)\n\n// ColumnMetadata defines a single column in our postgres database\ntype ColumnMetadata struct {\n    TableName  string\n    ColumnName string\n    DataType   string\n}\n\n// InspectExposedSchema reads structural layout data dynamically from the system catalog.\nfunc InspectExposedSchema(db *sql.DB, exposedTables []string) ([]ColumnMetadata, error) {\n    query := `\n        SELECT table_name, column_name, data_type \n        FROM information_schema.columns \n        WHERE table_schema = 'public' \n        AND table_name = ANY($1)\n        ORDER BY table_name, column_name;`\n\n    // pq.Array makes the slice-to-array encoding explicit, matching FindRegionalMetrics.\n    rows, err := db.Query(query, pq.Array(exposedTables))\n    if err != nil {\n        return nil, err\n    }\n    defer rows.Close()\n\n    var metadata []ColumnMetadata\n    for rows.Next() {\n  
      var col ColumnMetadata\n        if err := rows.Scan(&col.TableName, &col.ColumnName, &col.DataType); err != nil {\n            return nil, err\n        }\n        metadata = append(metadata, col)\n    }\n\n    if err := rows.Err(); err != nil {\n        return nil, err\n    }\n\n    return metadata, nil\n}\n```\n\nIn the project we’ve set up a `.env` file with the following variable:\n\n```\nEXPOSED_TABLES=compounds,sales_ledger\n```\n\nThis variable is read and passed into the `InspectExposedSchema` function so that it fetches metadata only for the tables we’ve explicitly allowlisted.\n\nIt’s worth dwelling on why this is a deny-by-default allowlist rather than an exposed-by-default filter.\n\nA more generic approach would drop the filter and reflect every table in the schema. In a regulated FinTech or real estate platform, the risk of doing that is not hypothetical. Staging tables, audit logs, or a `users` table with national ID numbers would become visible to the orchestrator the moment they're created, with zero code change and zero review. The allowlist isn't extra friction, it's the only thing standing between \"the LLM sees what we intended\" and \"the LLM sees whatever the last migration happened to leave lying around.\"\n\nIn this article we’re only highlighting a single filter level (the table level). A more production-ready design would add a column-level deny list for fields such as surrogate keys, create/update/delete timestamps, or vector fields if you’re using pgvector.\n\nRaw query access is the obvious approach, and the wrong one. Here's why the gateway pre-defines every computation the LLM is allowed to run.
For this project, we take on one business case: the user of the LLM needs the total number of units sold (`units_sold`), total revenue (`revenue_egp`) and total cancelled orders (`cancelled_orders`) for a specific `region`.\n\nIn the schema provided in the repository, we have two entities, `compounds` and `sales_ledger`. The `compound_id` column of `sales_ledger` is a foreign key that references `compounds.id`.\n\nIn many popular MCP implementations, the LLM writes the aggregation query itself and sends it as plain text for execution. This poses a massive security risk. `DELETE` or `DROP` statements are not the main concern, assuming the connection is read-only. The real risk is an exhaustive `SELECT` query. There is no telling what the LLM might decide is the best path. In most cases it will probably send the correct query for the business need:\n\n```\n-- Aggregate units sold, revenue and cancelled orders\n-- for the selected regions\nSELECT compounds.region,\n       sum(units_sold) AS TOTAL_UNITS_SOLD,\n       sum(revenue_egp) AS TOTAL_REVENUE,\n       sum(cancelled_orders) AS TOTAL_CANCELLED\nFROM sales_ledger JOIN compounds ON sales_ledger.compound_id = compounds.id\nWHERE compounds.region = ANY($1)\nGROUP BY compounds.region\n```\n\nBut if an attacker hijacks a session or gains access to the server running the LLM, nothing stops them from injecting a prefix into the context window that instructs the LLM to pull raw rows instead of aggregating them.\n\nA more secure gateway lets the LLM know only what it must know, so a hijacked session has nothing further to extract.\n\nIn `pkg/db/queries.go` we define a `Queries` struct and a constructor for it that accepts a `*sql.DB` connection:\n\n```\ntype Queries struct {\n    db *sql.DB\n}\n\nfunc NewQueries(db *sql.DB) *Queries {\n   
 return &Queries{\n        db: db,\n    }\n}\n```\n\nThen we create the result struct for this aggregation. Its fields mirror a single row returned by the query above.\n\n```\ntype RegionalMetricsResult struct {\n    Region          string  `json:\"region\"`\n    UnitsSold       int     `json:\"unitsSold\"`\n    TotalRevenue    float64 `json:\"totalRevenue\"`\n    CancelledOrders int     `json:\"cancelledOrders\"`\n}\n```\n\nFinally, we create the `FindRegionalMetrics` method on the `Queries` struct with a pointer receiver:\n\n```\nfunc (q *Queries) FindRegionalMetrics(ctx context.Context, regions []string) ([]RegionalMetricsResult, error) {\n    query := `SELECT compounds.region,\n                     sum(units_sold) AS TOTAL_UNITS_SOLD,\n                     sum(revenue_egp) AS TOTAL_REVENUE,\n                     sum(cancelled_orders) AS TOTAL_CANCELLED\n              FROM sales_ledger JOIN compounds ON sales_ledger.compound_id = compounds.id\n              WHERE compounds.region = ANY($1)\n              GROUP BY compounds.region`\n\n    rows, err := q.db.QueryContext(ctx, query, pq.Array(regions))\n    if err != nil {\n        return nil, err\n    }\n    defer rows.Close()\n\n    var result []RegionalMetricsResult\n\n    for rows.Next() {\n        var col RegionalMetricsResult\n        if err := rows.Scan(&col.Region, &col.UnitsSold, &col.TotalRevenue, &col.CancelledOrders); err != nil {\n            return nil, err\n        }\n\n        result = append(result, col)\n    }\n\n    if err := rows.Err(); err != nil {\n        return nil, err\n    }\n\n    return result, nil\n}\n```\n\n`pq.Array` is used here because Go's `database/sql` doesn't natively serialize a string slice into the Postgres array that `ANY($1)` expects; the wrapper from the `lib/pq` driver handles that translation.\n\n`FindRegionalMetrics`, like any similar method, doesn’t need to know who is calling it.
It doesn’t care whether the caller is an MCP server or a CRUD API server. It is pure business logic that restricts and abstracts access to the underlying data store, essentially telling the LLM what it is allowed to do with the data.\n\nThe same holds if your team decides to build a more complex, dynamic aggregate implementation. The end goal remains the same: you give the LLM a sparse set of information proxies that cannot be abused even if an attacker gains access.\n\nNow comes the part where we register these tools so the LLM can discover and use them.\n\nFor this project, we are using `github.com/mark3labs/mcp-go` to register MCP tools and run the MCP server.\n\nFirst, we define a small helper that serializes any result type to indented JSON before returning it to the MCP transport layer. Using `any` as the input type means this same function works for every tool response: schema metadata, regional metrics, or any future query result.\n\n```\n// formatResult renders v as indented JSON; the marshal error is ignored here.\nfunc formatResult(v any) string {\n    b, _ := json.MarshalIndent(v, \"\", \"  \")\n    return string(b)\n}\n```\n\nIn production, the marshal error should be handled explicitly.
For this gateway, marshaling failures on known struct types are effectively impossible, but the pattern should be hardened before shipping.\n\nThe library makes it easy to describe a tool using the `mcp.NewTool` function.\n\nFor the `list_tables` tool, initialize a tool name and description:\n\n```\n    listTablesTool := mcp.NewTool(\"list_tables\",\n        mcp.WithDescription(\"Lists all available database schemas and field structures without exposing raw database records.\"),\n    )\n```\n\nThen use the `AddTool` method to register a handler for the tool, which calls the `InspectExposedSchema` function we created above:\n\n```\n    s.AddTool(listTablesTool, func(ctx context.Context, request mcp.CallToolRequest) (*mcp.CallToolResult, error) {\n        cols, err := db.InspectExposedSchema(database, exposedTables)\n        if err != nil {\n            return mcp.NewToolResultError(fmt.Sprintf(\"Failed to map system constraints: %s\", err.Error())), nil\n        }\n\n        return mcp.NewToolResultText(formatResult(cols)), nil\n    })\n```\n\nThe first line of the handler has two things worth noting:\n\n```\n     cols, err := db.InspectExposedSchema(database, exposedTables)\n```\n\n`database` is a variable holding the `*sql.DB` instance.\n\n`exposedTables` holds the table names parsed from the `EXPOSED_TABLES` variable in `.env` that we introduced earlier. It tells `InspectExposedSchema` to pull information only for the explicitly allowed tables.\n\nNext comes the aggregate method registration. First, initialize the `Queries` struct:\n\n```\n    queries := db.NewQueries(database)\n```\n\n`FindRegionalMetrics` expects a slice of strings for its second argument, `regions []string`.
The `get_metrics` MCP tool can be configured in the `mcp.NewTool` call to declare that it requires an array parameter:\n\n```\n    metricsTool := mcp.NewTool(\"get_metrics\",\n        mcp.WithDescription(\"Retrieves metrics for specified geographical regions\"),\n\n        // Define your slice parameter here\n        mcp.WithArray(\"region\",\n            mcp.Required(), // <-- This marks the parameter as required in the JSON Schema\n            mcp.Description(\"A list of regions to filter metrics by (e.g. ['New Cairo', 'North Coast'])\"),\n        ),\n    )\n```\n\n`mcp.WithArray` tells the MCP server to expect a JSON array.\n\nNext, add the tool:\n\n```\n    s.AddTool(metricsTool, func(ctx context.Context, request mcp.CallToolRequest) (*mcp.CallToolResult, error) {\n        regions, err := request.RequireStringSlice(\"region\")\n        if err != nil {\n            return mcp.NewToolResultError(err.Error()), nil\n        }\n\n        result, err := queries.FindRegionalMetrics(ctx, regions)\n        if err != nil {\n            return mcp.NewToolResultError(err.Error()), nil\n        }\n\n        return mcp.NewToolResultText(formatResult(result)), nil\n    })\n```\n\nThe first line:\n\n```\n        regions, err := request.RequireStringSlice(\"region\")\n```\n\nis important because the tool definition only declares an array. `request.RequireStringSlice` enforces that the argument is an array of strings and converts it to a Go `[]string`.\n\nAs covered in the previous section, `pq.Array` handles the Go-to-Postgres array serialization that `database/sql` doesn't provide natively.\n\nThe MCP server now exposes exactly two tools - no more, no less. The LLM can discover what exists and compute what's permitted.
Everything else in the database remains invisible.\n\nNow, to see the entire structure come to life, we wire everything built above into an entrypoint.\n\nAs mentioned previously, we’re using `github.com/mark3labs/mcp-go` to spin up an MCP server instead of building one from scratch.\n\nIn this project, `main.go` lives at the conventional path `cmd/gateway/main.go`. The full `main.go` looks like this:\n\n```\npackage main\n\nimport (\n    \"context\"\n    \"database/sql\"\n    \"encoding/json\"\n    \"fmt\"\n    \"log\"\n    \"mcp-postgres-gateway/pkg/db\"\n    \"os\"\n    \"strings\"\n\n    \"github.com/joho/godotenv\"\n    _ \"github.com/lib/pq\" // CRITICAL: Must be explicitly imported here to register the driver\n    \"github.com/mark3labs/mcp-go/mcp\"\n    \"github.com/mark3labs/mcp-go/server\"\n)\n\nfunc formatResult(v any) string {\n    b, _ := json.MarshalIndent(v, \"\", \"  \")\n    return string(b)\n}\n\nfunc main() {\n    // A missing .env file is intentionally non-fatal: fall back to the process environment.\n    _ = godotenv.Load(\".env\")\n\n    // Initialize Postgres Connection\n    connStr := os.Getenv(\"DATABASE_URL\")\n\n    // strings.Split never returns an empty slice, so drop blank entries before checking.\n    var exposedTables []string\n    for _, t := range strings.Split(os.Getenv(\"EXPOSED_TABLES\"), \",\") {\n        if t = strings.TrimSpace(t); t != \"\" {\n            exposedTables = append(exposedTables, t)\n        }\n    }\n    if len(exposedTables) == 0 {\n        log.Fatal(\"EXPOSED_TABLES environment variable is not set\")\n    }\n\n    if connStr == \"\" {\n        log.Fatal(\"DATABASE_URL environment variable is not set\")\n    }\n\n    database, err := sql.Open(\"postgres\", connStr)\n    if err != nil {\n        log.Fatalf(\"Database initialization failure: %v\", err)\n    }\n    defer database.Close()\n\n    // Establish the MCP Core Server Block\n    s := server.NewMCPServer(\"domainai-gateway\", \"1.0.0\")\n\n    // 1.
Tool 1 Implementation: Expose Schema Table Information\n    listTablesTool := mcp.NewTool(\"list_tables\",\n        mcp.WithDescription(\"Lists all available database schemas and field structures without exposing raw database records.\"),\n    )\n\n    s.AddTool(listTablesTool, func(ctx context.Context, request mcp.CallToolRequest) (*mcp.CallToolResult, error) {\n        cols, err := db.InspectExposedSchema(database, exposedTables)\n        if err != nil {\n            return mcp.NewToolResultError(fmt.Sprintf(\"Failed to map system constraints: %s\", err.Error())), nil\n        }\n\n        return mcp.NewToolResultText(formatResult(cols)), nil\n    })\n\n    // Data tools\n    queries := db.NewQueries(database)\n\n    metricsTool := mcp.NewTool(\"get_metrics\",\n        mcp.WithDescription(\"Retrieves metrics for specified geographical regions\"),\n\n        // Define your slice parameter here\n        mcp.WithArray(\"region\",\n            mcp.Required(), // <-- This marks the parameter as required in the JSON Schema\n            mcp.Description(\"A list of regions to filter metrics by (e.g. ['US', 'EU'])\"),\n        ),\n    )\n\n    s.AddTool(metricsTool, func(ctx context.Context, request mcp.CallToolRequest) (*mcp.CallToolResult, error) {\n        regions, err := request.RequireStringSlice(\"region\")\n        if err != nil {\n            return mcp.NewToolResultError(err.Error()), nil\n        }\n\n        result, err := queries.FindRegionalMetrics(ctx, regions)\n        if err != nil {\n            return mcp.NewToolResultError(err.Error()), nil\n        }\n\n        return mcp.NewToolResultText(formatResult(result)), nil\n    })\n\n    // Start the Server to communicate natively over standard IO channels\n    log.Println(\"MCP Gateway initialized. 
Establishing communication channel over Stdio...\")\n    if err := server.ServeStdio(s); err != nil {\n        fmt.Fprintf(os.Stderr, \"Server crash anomaly: %v\\n\", err)\n        os.Exit(1)\n    }\n}\n```\n\nIn this implementation, `.env` loading failures are intentionally non-fatal. The application falls back to system environment variables, which is the correct behavior in containerized deployments where `.env` files aren't present.\n\nNotice how we must import `github.com/lib/pq` with the blank identifier `_` for its side effects: the import registers the driver. Once it is registered, `database/sql` knows how to handle the Postgres protocol behind the scenes when you initialize a connection.\n\nAlso notice this block of code:\n\n```\n    connStr := os.Getenv(\"DATABASE_URL\")\n\n    // strings.Split never returns an empty slice, so drop blank entries before checking.\n    var exposedTables []string\n    for _, t := range strings.Split(os.Getenv(\"EXPOSED_TABLES\"), \",\") {\n        if t = strings.TrimSpace(t); t != \"\" {\n            exposedTables = append(exposedTables, t)\n        }\n    }\n    if len(exposedTables) == 0 {\n        log.Fatal(\"EXPOSED_TABLES environment variable is not set\")\n    }\n\n    if connStr == \"\" {\n        log.Fatal(\"DATABASE_URL environment variable is not set\")\n    }\n```\n\nThe application deliberately terminates if `DATABASE_URL` is not found in the environment. More notably, the same fail-fast check runs early in the program when `EXPOSED_TABLES` is not set.
Failing early saves network round trips and gives a clear failure signal if your MCP server communicates with the database service across another network or if the database service is a microservice in your ecosystem.\n\nTo test the MCP service, you can spin up a quick, on-demand MCP Inspector UI in your browser by running this npx command:\n\n```\nnpx -y @modelcontextprotocol/inspector go run cmd/gateway/main.go\n```\n\nThis should open up an MCP Inspector tab in your browser.\n\nRunning the `list_tables` tool should yield an output similar to this:\n\n```\n[\n  {\n    \"TableName\": \"compounds\",\n    \"ColumnName\": \"developer\",\n    \"DataType\": \"character varying\"\n  },\n  {\n    \"TableName\": \"compounds\",\n    \"ColumnName\": \"id\",\n    \"DataType\": \"integer\"\n  },\n  {\n    \"TableName\": \"compounds\",\n    \"ColumnName\": \"launch_year\",\n    \"DataType\": \"integer\"\n  },\n  {\n    \"TableName\": \"compounds\",\n    \"ColumnName\": \"name\",\n    \"DataType\": \"character varying\"\n  },\n  {\n    \"TableName\": \"compounds\",\n    \"ColumnName\": \"region\",\n    \"DataType\": \"character varying\"\n  },\n  {\n    \"TableName\": \"compounds\",\n    \"ColumnName\": \"total_units\",\n    \"DataType\": \"integer\"\n  },\n  {\n    \"TableName\": \"sales_ledger\",\n    \"ColumnName\": \"cancelled_orders\",\n    \"DataType\": \"integer\"\n  },\n  {\n    \"TableName\": \"sales_ledger\",\n    \"ColumnName\": \"compound_id\",\n    \"DataType\": \"integer\"\n  },\n  {\n    \"TableName\": \"sales_ledger\",\n    \"ColumnName\": \"id\",\n    \"DataType\": \"integer\"\n  },\n  {\n    \"TableName\": \"sales_ledger\",\n    \"ColumnName\": \"quarter\",\n    \"DataType\": \"character varying\"\n  },\n  {\n    \"TableName\": \"sales_ledger\",\n    \"ColumnName\": \"revenue_egp\",\n    \"DataType\": \"numeric\"\n  },\n  {\n    \"TableName\": \"sales_ledger\",\n    \"ColumnName\": \"units_sold\",\n    \"DataType\": \"integer\"\n  }\n]\n```\n\nRunning the
`get_metrics` tool with the input `[\"New Cairo\", \"North Coast\"]` should yield the aggregated metrics below for each region.\n\n```\n[\n  {\n    \"region\": \"New Cairo\",\n    \"unitsSold\": 165,\n    \"totalRevenue\": 1245000000,\n    \"cancelledOrders\": 3\n  },\n  {\n    \"region\": \"North Coast\",\n    \"unitsSold\": 12,\n    \"totalRevenue\": 180000000,\n    \"cancelledOrders\": 4\n  }\n]\n```\n\nThe LLM received aggregated metrics - totals, not rows. It knows New Cairo sold 165 units. It has no path to the individual transaction records that produced that number. That's the boundary the gateway enforces.\n\nThe Go ecosystem is underrepresented in MCP tooling; most implementations lean on Python or TypeScript. But the real gap isn't language choice. It's architectural discipline.\n\nAn MCP gateway that lets the LLM construct its own queries is only as secure as the LLM's judgment - and judgment is exactly what attackers exploit. The pattern in this article inverts that assumption: the gateway defines what's computable, the LLM executes within those boundaries, and raw data never crosses the perimeter.\n\nThis isn't a limitation of the architecture. It's the feature.\n\nThe full implementation is available at [khalidelokiely/mcp-postgres-gateway](https://github.com/khalidelokiely/mcp-postgres-gateway). Clone it, point it at your own Postgres instance, and extend `queries.go` with the aggregations your business logic actually needs.
The schema reflection and transport layer stay unchanged — only the computations you choose to expose are yours to define.", "url": "https://wpnews.pro/news/building-a-zero-leak-postgres-mcp-gateway-in-go", "canonical_source": "https://dev.to/khalidelokiely/building-a-zero-leak-postgres-mcp-gateway-in-go-3be6", "published_at": "2026-06-26 17:28:58+00:00", "updated_at": "2026-06-26 18:03:46.483027+00:00", "lang": "en", "topics": ["developer-tools", "large-language-models", "ai-agents", "ai-safety", "ai-infrastructure"], "entities": ["Postgres", "Go", "Model Context Protocol", "MCP"], "alternates": {"html": "https://wpnews.pro/news/building-a-zero-leak-postgres-mcp-gateway-in-go", "markdown": "https://wpnews.pro/news/building-a-zero-leak-postgres-mcp-gateway-in-go.md", "text": "https://wpnews.pro/news/building-a-zero-leak-postgres-mcp-gateway-in-go.txt", "jsonld": "https://wpnews.pro/news/building-a-zero-leak-postgres-mcp-gateway-in-go.jsonld"}}