cd /news/developer-tools/building-a-zero-leak-postgres-mcp-ga… · home topics developer-tools article
[ARTICLE · art-41109] src=dev.to ↗ pub= topic=developer-tools verified=true sentiment=↑ positive

Building a Zero-Leak Postgres MCP Gateway in Go

A developer built a zero-leak Postgres MCP gateway in Go that enables LLMs to query databases without exposing raw schema or data. The gateway uses dynamic schema reflection to auto-generate tool manifests from whitelisted tables and enforces analytical egress hardening to prevent data leaks. This approach addresses the security paradox of granting AI agents structural data access while protecting intellectual property.

read13 min views1 publishedJun 26, 2026

The promise of agentic AI workflows introduces a critical architectural paradox: to make an LLM deeply useful, you must grant it structural awareness of your data layer. Traditional integration patterns force a losing trade-off. Either you hand an external orchestrator direct database access (risking catastrophic data egress), or you must serialize and persist your entire proprietary database schema onto third-party infrastructure. This exposure of internal domain definitions outside the secure perimeter represents a massive intellectual property leak, stalling production AI adoption in highly competitive or regulated sectors. For instance, a localized real estate consultancy managing proprietary compound metrics and high-value transactional ledgers cannot afford to expose its structural competitive edge to a shared cloud context just to run an analytical prompt.

To bridge this gap, backend teams must shift toward an architectural pattern where the data plane isolates schema definitions and executes only the commands explicitly defined by the MCP server, delivering pre-approved aggregations without ever leaking raw data layouts upstream. This article demonstrates how to build a zero-leak database proxy in Go using the Model Context Protocol (MCP) over a secure stdio

transport layer. By decoupling the LLM from direct database access, you will implement a live gateway that executes two core tasks: Dynamic Schema Reflection to auto-generate tool manifests programmatically, and Analytical Egress Hardening to ensure the external AI agent never touches a raw database row.

The project follows a standard go folder layout - cmd/

for the entrypoint, pkg/db

for the Postgres connection and logic. This isn’t a framework requirement, just a convention that keeps schema reflection, query execution and MCP transport cleanly separated. You can flatten this into a single file for prototyping.

Three things make this gateway zero-leak;

EXPOSED_TABLES

)Schema visibility step utilizes Postgres’ information_schema.columns

table to actually fetch column metadata from the database - instead of having to hardcode or dump it out of our database every time the LLM needs to know about what schema structure is available in our data layer.

In pkg/db/postgres.go

we create an InspectExposedSchema

function that returns a slice of type ColumnMetadata

which can eventually be passed into the LLM context window.

package db

import (
    "database/sql"
)

// ColumnMetadata defines a single column in our postgres database
type ColumnMetadata struct {
    TableName  string
    ColumnName string
    DataType   string
}

// InspectExposedSchema reads structural layout data dynamically from the system catalog.
func InspectExposedSchema(db *sql.DB, exposedTables []string) ([]ColumnMetadata, error) {
    query := `
        SELECT table_name, column_name, data_type 
        FROM information_schema.columns 
        WHERE table_schema = 'public' 
        AND table_name = ANY($1)
        ORDER BY table_name, column_name;`

    rows, err := db.Query(query, exposedTables)
    if err != nil {
        return nil, err
    }
    defer rows.Close()

    var metadata []ColumnMetadata
    for rows.Next() {
        var col ColumnMetadata
        if err := rows.Scan(&col.TableName, &col.ColumnName, &col.DataType); err != nil {
            return nil, err
        }
        metadata = append(metadata, col)
    }

    if err := rows.Err(); err != nil {
        return nil, err
    }

    return metadata, nil
}

In the project we’ve setup a .env

file with the following variable:

EXPOSED_TABLES=compounds,sales_ledger

This variable is read and passed into the InspectExposedSchema

function to fetch only those tables that we’ve explicitly whitelisted for visibility.

It’s worth dwelling on why this is a deny-by-default allowlist rather than an exposed-by-default filter.

A more generic approach would remove the filter entirely, but in a regulated FinTech or real estate platform, that's not a hypothetical risk. Staging tables, audit logs, or a users

table with national ID numbers would become visible to the orchestrator the moment they're created, with zero code change and zero review. The allowlist isn't extra friction, it's the only thing standing between "the LLM sees what we intended" and "the LLM sees whatever the last migration happened to leave lying around.”

In this article we’re only highlighting a single filter level (table level). But a more production-ready design would include a deeper deny list on a more granular level for columns such as surrogate keys, create/delete/update timestamps or vector fields if you’re using PGVector.

Raw query access is the obvious approach — and the wrong one. Here's why the gateway pre-defines every computation the LLM is allowed to run. For this project, we are taking on one business case where the user of the LLM needs an aggregate of the total number of units sold (units_sold

), total revenue made (revenue_egp

) and total cancelled orders (cancelled_orders

) for a specific region

In the schema provided in the repository, we have 2 entities compounds

and sales_ledger

. sales_ledger

column compound_id

is a foreign key that references compounds.id .

In many popular MCP implementations, the LLM would generally create the aggregation query and send it as plain-text for execution. This poses massive security risk - aside from DELETE

or DROP

statements which are naive assumptions given a read-only access. The real risk is an exhaustive SELECT

query. There is no telling what the LLM might decide is the best path. For the majority of cases it might send the correct query for the business need directly.

-- Find aggregate of units sold, revenue, cancelled orders 
-- relative to a select region
SELECT compounds.region, 
        sum(units_sold) AS TOTAL_UNITS_SOLD,
                sum(revenue_egp) AS TOTAL_REVENUE, 
                sum(cancelled_orders) AS TOTAL_CANCELLED
FROM sales_ledger JOIN compounds ON sales_ledger.compound_id = compounds.id
WHERE compounds.region = ANY($1)
GROUP BY compounds.region

But if an attacker were to hijack a session or acquire access to the server running the LLM, there is no stopping them from instructing or injecting a prefix to the context window that instructs the LLM to pull raw data to the server and process it instead of aggregate it.

A more secure gateway only allows the LLM to know what it must know - without any possibility of further hijacking.

In pkg/db/queries.go

we initialize a Queries struct and constructor for it which accepts a *sql.DB

connection:

type Queries struct {
    db *sql.DB
}

func NewQueries(db *sql.DB) *Queries {
    return &Queries{
        db: db,
    }
}

Then we create the result struct for the first type of aggregation which consists of all the fields that represent a single record out from the above query.

type RegionalMetricsResult struct {
    Region          string  `json:"region"`
    UnitsSold       int     `json:"unitsSold"`
    TotalRevenue    float64 `json:"totalRevenue"`
    CancelledOrders int     `json:"cancelledOrders"`
}

Finally, we create FindRegionalMetrics

method on Queries

struct with a pointer receiver:

func (q *Queries) FindRegionalMetrics(ctx context.Context, regions []string) ([]RegionalMetricsResult, error) {
    query := `SELECT compounds.region, 
                         sum(units_sold) AS TOTAL_UNITS_SOLD,
                             sum(revenue_egp) AS TOTAL_REVENUE, 
                             sum(cancelled_orders) AS TOTAL_CANCELLED
                FROM sales_ledger JOIN compounds ON sales_ledger.compound_id = compounds.id
                WHERE compounds.region = ANY($1)
                GROUP BY compounds.region`

    rows, err := q.db.QueryContext(ctx, query, pq.Array(regions))

    if err != nil {
        return nil, err
    }

    defer rows.Close()

    var result []RegionalMetricsResult

    for rows.Next() {
        var col RegionalMetricsResult
        if err := rows.Scan(&col.Region, &col.UnitsSold, &col.TotalRevenue, &col.CancelledOrders); err != nil {
            return nil, err
        }

        result = append(result, col)
    }

    if err := rows.Err(); err != nil {
        return nil, err
    }

    return result, nil
}

pq.Array

is required here because Go's database/sql

doesn't natively serialize a string slice to Postgres's ANY($1)

array syntax - the lib/pq

driver wrapper handles that translation.

The FindRegionalMetrics

and any similar method absolutely doesn’t have to know about who is calling it. It doesn’t care if the caller is an MCP server or a CRUD API server. It is pure business logic that constricts and abstracts flow from the underlying data store, essentially telling the LLM what it is allowed to do with the data.

This is also true in case your team decides to create a more complex and dynamic aggregate implementation - The end goal remains the same: You give the LLM a sparse set of information proxies that cannot be abused even if an attacker gains access.

Now comes the part where we register these tools as discoverable and usable utilities to the LLM.

For this project, we are using github.com/mark3labs/mcp-go

to register MCP tools and run the MCP server.

First, we define a small helper that serializes any result type to indented JSON before returning it to the MCP transport layer. Using any

as the input type means this same function works for every tool response — schema metadata, regional metrics, or any future query result.

func formatResult(v any) string {
    b, _ := json.MarshalIndent(v, "", "  ")
    return string(b)
}

In production, the marshal error should be handled explicitly. For this gateway, marshaling failures on known struct types are effectively impossible, but the pattern should be hardened before shipping.

The library makes it easy to add a descriptor for the tools using the mcp.NewTool

method.

For the list_tables

tool - initialize a tool name and the description:

    listTablesTool := mcp.NewTool("list_tables",
        mcp.WithDescription("Lists all available database schemas and field structures without exposing raw database records."),
    )

Then use the AddTool

method to actually make the tool usable and utilize the InspectExposedSchema

function we created above:

    s.AddTool(listTablesTool, func(ctx context.Context, request mcp.CallToolRequest) (*mcp.CallToolResult, error) {
        cols, err := db.InspectExposedSchema(database, exposedTables)
        if err != nil {
            return mcp.NewToolResultError(fmt.Sprintf("Failed to map system constraints: %s", err.Error())), nil
        }

        return mcp.NewToolResultText(formatResult(cols)), nil
    })

The first line of the function has 2 important things to note:

     cols, err := db.InspectExposedSchema(database, exposedTables)

database

is a variable holding the *sql.DB

instance.

exposedTables

is the largely configurable .env

EXPOSED_TABLES

variable we introduced earlier. This tells the InspectExposedSchema to only pull the information for the explicitly allowed tables.

Next, comes the aggregate method registration. First, initialize the Queries

struct:

    queries := db.NewQueries(database)

The FindRegionalMetrics

expects a slice of strings for its second argument regions []string

. The get_metrics

MCP tool can be configured in the mcp.NewTool

method to annotate that this tool requires a string slice:

    metricsTool := mcp.NewTool("get_metrics",
        mcp.WithDescription("Retrieves metrics for specified geographical regions"),

        // Define your slice parameter here
        mcp.WithArray("region",
            mcp.Required(), // <-- This marks the parameter as required in the JSON Schema
            mcp.Description("A list of regions to filter metrics by (e.g. ['New Cairo', 'North Coast'])"),
        ),
    )

The mcp.WithArray

tells the MCP server to expect a json array.

Next add the tool:

    s.AddTool(metricsTool, func(ctx context.Context, request mcp.CallToolRequest) (*mcp.CallToolResult, error) {
        regions, err := request.RequireStringSlice("region")
        if err != nil {
            return mcp.NewToolResultError(err.Error()), nil
        }

        result, err := queries.FindRegionalMetrics(ctx, regions)
        if err != nil {
            return mcp.NewToolResultError(err.Error()), nil
        }

        return mcp.NewToolResultText(formatResult(result)), nil
    })

The first line:

        regions, err := request.RequireStringSlice("region")

Is important because in the tool description, we only hinted at providing an Array

. This method request.RequireStringSlice

enforces a typed Array

translating to a go StringSlice

.

As covered in the previous section, pq.Array

handles the Go-to-Postgres array serialization that database/sql

doesn't provide natively.

The MCP server now exposes exactly two tools - no more, no less. The LLM can discover what exists and compute what's permitted. Everything else in the database remains invisible.

Now to see the entire structure come to life, we wire together all that was built above into an entrypoint.

As mentioned previously we’re using github.com/mark3labs/mcp-go

to spin up an MCP server instead of building one from scratch.

In this project the main.go

is located in a standard path cmd/gateway/main.go

. The full main.go looks like this:

package main

import (
    "context"
    "database/sql"
    "encoding/json"
    "fmt"
    "log"
    "mcp-postgres-gateway/pkg/db"
    "os"
    "strings"

    "github.com/joho/godotenv"
    _ "github.com/lib/pq" // CRITICAL: Must be explicitly imported here to register the driver
    "github.com/mark3labs/mcp-go/mcp"
    "github.com/mark3labs/mcp-go/server"
)

func formatResult(v any) string {
    b, _ := json.MarshalIndent(v, "", "  ")
    return string(b)
}

func main() {
    err := godotenv.Load(".env")
    // Initialize Postgres Connection
    connStr := os.Getenv("DATABASE_URL")

    exposedTables := strings.Split(os.Getenv("EXPOSED_TABLES"), ",")
    if len(exposedTables) == 0 {
        log.Fatal("EXPOSED_TABLES environment variable is not set")
    }

    if connStr == "" {
        log.Fatal("DATABASE_URL environment variable is not set")
    }

    database, err := sql.Open("postgres", connStr)
    if err != nil {
        log.Fatalf("Database initialization failure: %v", err)
    }
    defer database.Close()

    // Establish the MCP Core Server Block
    s := server.NewMCPServer("domainai-gateway", "1.0.0")

    // 1. Tool 1 Implementation: Expose Schema Table Information
    listTablesTool := mcp.NewTool("list_tables",
        mcp.WithDescription("Lists all available database schemas and field structures without exposing raw database records."),
    )

    s.AddTool(listTablesTool, func(ctx context.Context, request mcp.CallToolRequest) (*mcp.CallToolResult, error) {
        cols, err := db.InspectExposedSchema(database, exposedTables)
        if err != nil {
            return mcp.NewToolResultError(fmt.Sprintf("Failed to map system constraints: %s", err.Error())), nil
        }

        return mcp.NewToolResultText(formatResult(cols)), nil
    })

    // Data tools
    queries := db.NewQueries(database)

    metricsTool := mcp.NewTool("get_metrics",
        mcp.WithDescription("Retrieves metrics for specified geographical regions"),

        // Define your slice parameter here
        mcp.WithArray("region",
            mcp.Required(), // <-- This marks the parameter as required in the JSON Schema
            mcp.Description("A list of regions to filter metrics by (e.g. ['US', 'EU'])"),
        ),
    )

    s.AddTool(metricsTool, func(ctx context.Context, request mcp.CallToolRequest) (*mcp.CallToolResult, error) {
        regions, err := request.RequireStringSlice("region")
        if err != nil {
            return mcp.NewToolResultError(err.Error()), nil
        }

        result, err := queries.FindRegionalMetrics(ctx, regions)
        if err != nil {
            return mcp.NewToolResultError(err.Error()), nil
        }

        return mcp.NewToolResultText(formatResult(result)), nil
    })

    // Start the Server to communicate natively over standard IO channels
    log.Println("MCP Gateway initialized. Establishing communication channel over Stdio...")
    if err := server.ServeStdio(s); err != nil {
        fmt.Fprintf(os.Stderr, "Server crash anomaly: %v\n", err)
        os.Exit(1)
    }
}

In this implementation, .env

failures are intentionally non-fatal. The application falls back to system environment variables, which is the correct behavior in containerized deployments where .env

files aren't present.

Notice how we must import github.com/lib/pq using alias

_

for side effects. Once registered database/sql

knows exactly how to handle the postgres protocol behind the scenes when you initialize a connection.Also notice this block of code:

    connStr := os.Getenv("DATABASE_URL")

    exposedTables := strings.Split(os.Getenv("EXPOSED_TABLES"), ",")
    if len(exposedTables) == 0 {
        log.Fatal("EXPOSED_TABLES environment variable is not set")
    }

    if connStr == "" {
        log.Fatal("DATABASE_URL environment variable is not set")
    }

The application deliberately terminates the program if DATABASE_URL

is not found in environment. But more notably this pattern is also enforced early in the program when no EXPOSED_TABLES

are set. This can be helpful to save network resources and give an early failure signal if your MCP server communicates with the database service across another network or if the database service is a microservice in your ecosystem.

To test the MCP service, you can spin up a quick, on demand MCP inspector UI in your browser by running this npx command:

npx -y @modelcontextprotocol/inspector go run cmd/gateway/main.go

This should open up an MCP inspector tab in your browser.

Running list_tables

tool should yield an output similar to this:

[
  {
    "TableName": "compounds",
    "ColumnName": "developer",
    "DataType": "character varying"
  },
  {
    "TableName": "compounds",
    "ColumnName": "id",
    "DataType": "integer"
  },
  {
    "TableName": "compounds",
    "ColumnName": "launch_year",
    "DataType": "integer"
  },
  {
    "TableName": "compounds",
    "ColumnName": "name",
    "DataType": "character varying"
  },
  {
    "TableName": "compounds",
    "ColumnName": "region",
    "DataType": "character varying"
  },
  {
    "TableName": "compounds",
    "ColumnName": "total_units",
    "DataType": "integer"
  },
  {
    "TableName": "sales_ledger",
    "ColumnName": "cancelled_orders",
    "DataType": "integer"
  },
  {
    "TableName": "sales_ledger",
    "ColumnName": "compound_id",
    "DataType": "integer"
  },
  {
    "TableName": "sales_ledger",
    "ColumnName": "id",
    "DataType": "integer"
  },
  {
    "TableName": "sales_ledger",
    "ColumnName": "quarter",
    "DataType": "character varying"
  },
  {
    "TableName": "sales_ledger",
    "ColumnName": "revenue_egp",
    "DataType": "numeric"
  },
  {
    "TableName": "sales_ledger",
    "ColumnName": "units_sold",
    "DataType": "integer"
  }
]

Running the get_metrics

tool with the input of ["New Cairo", "North Coast"]

Should yield the below aggregated metrics for each region.

[
  {
    "region": "New Cairo",
    "unitsSold": 165,
    "totalRevenue": 1245000000,
    "cancelledOrders": 3
  },
  {
    "region": "North Coast",
    "unitsSold": 12,
    "totalRevenue": 180000000,
    "cancelledOrders": 4
  }
]

The LLM received aggregated metrics - totals, not rows. It knows New Cairo sold 165 units. It has no path to the individual transaction records that produced that number. That's the boundary the gateway enforces.

The Go ecosystem is underrepresented in MCP tooling — most implementations lean on Python or TypeScript. But the real gap isn't language choice. It's architectural discipline.

An MCP gateway that lets the LLM construct its own queries is only as secure as the LLM's judgment - and judgment is exactly what attackers exploit. The pattern in this article inverts that assumption: the gateway defines what's computable, the LLM executes within those boundaries, and raw data never crosses the perimeter.

This isn't a limitation of the architecture. It's the feature.

The full implementation is available at khalidelokiely/mcp-postgres-gateway. Clone it, point it at your own Postgres instance, and extend queries.go

with the aggregations your business logic actually needs. The schema reflection and transport layer stay unchanged — only the computations you choose to expose are yours to define.

── more in #developer-tools 4 stories · sorted by recency
── more on @postgres 3 stories trending now
sponsored brought to you by zahid.host 4,200+ EU-deployed projects
reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main
Live at https://your-agent.zahid.host
Get free account → Pricing
from €0/mo · no card required
LIVE [news/building-a-zero-leak…] indexed:0 read:13min 2026-06-26 ·