AI Governance as a Database Primitive: Building FarmOps Desk on Aurora + pgvector + Bedrock

wpnews.pro

This article is part of my submission to the H0: Hack the Zero Stack hackathon, in the Monetizable B2B App track.

Most "AI apps" treat the model as a stateless add-on: a chat UI tacked onto a CRUD app, calls billed to an API key the developer hopes nobody finds. The database stores the chat history; everything interesting happens outside it.

For a B2B SaaS where the AI writes financial records, drafts livestock medical notes, and creates operational tasks on behalf of paying customers, that approach fails. You need governance: who asked the AI, what tool it called, what it tried to write, what it actually wrote, how many credits it consumed, who approved the draft, who rejected it, why. All of this must be auditable weeks later, scoped per-tenant, and enforceable at the database level — not at the API layer hoping the model never hallucinates a missing tenant ID.

This article walks through the governance schema I built into FarmOps Desk for the H0 hackathon, and the two patterns that make it work: atomic credit reservation and per-farm autonomy tiers.

Treat AI governance as a database primitive, not a feature. The schema enforces the invariants; the application is a thin layer over them. If the application has a bug, the database still prevents the unacceptable outcomes (a customer going into negative credits, a draft pretending to be confirmed, a tenant-A operation touching tenant-B data).

That means every interesting AI action lives in a small set of tables:

Table	Purpose
`ai_runs`
One row per model invocation. Feature, model, tokens, latency, status, summaries. The index of "what did the AI do".
`credit_ledger`
Append-only ledger of credit grants/deductions. The financial truth. `balance_after >= 0` enforced by CHECK.
`assistant_drafts`
Pending AI writes — every record the assistant wants to create lands here first. record_type + status state machine (`pending → confirmed / discarded` ).
`ai_recommendations`
Non-mutating AI suggestions (no draft). Status `pending → approved / rejected` .
`ai_evals`
Rule-based code evals of assistant outputs (clinical safety filter, schema conformance).
`ai_feedback`
Thumbs up/down + free text. Joined back to `ai_runs` for offline analysis.
`memories`
Durable farm facts (e.g. "vaccinates on Mondays", "Pond 3 is the nursery") distilled from chat.
`embeddings`
pgvector. Per-farm RAG index for documents, conversation summaries, memories.

The relational spine: farms.id

is the tenant boundary. Every row in every table above carries farm_id

. Every query filters by it. A hallucination that tries to write to another farm can't — the row-level check is the schema, not a function the LLM can forget to call.

The classic serverless AI bug: two concurrent requests for the same farm both observe credit_balance = 1

, both proceed, both invoke Bedrock, both call the settleCredits function afterward. The user gets charged for 2 runs on a 1-credit balance. This is a TOCTOU race that scales linearly with concurrency.

The naive fix is SELECT ... FOR UPDATE

inside a transaction. That works but it doesn't compose with the metering pattern: you want to reserve 1 credit before the run (so the user can't start 10 concurrent runs on a 5-credit balance), then settle after the run with the actual cost.

My pattern:

-- reserveCredit(farmId): atomically deduct 1 IF balance > 0
UPDATE farms
   SET credit_balance = credit_balance - 1
 WHERE id = $1
   AND credit_balance > 0
RETURNING credit_balance;

This is a single conditional UPDATE. Postgres serializes concurrent calls at the row level. Exactly one of N concurrent calls returns a row when balance = 1

; the rest get rowCount = 0

and throw CreditError

. No advisory lock, no transaction, no two-phase protocol.

After the run completes:

// settleCredits(farmId, actualCredits, reason, aiRunId)
//   actualCredits <= 0  → refund the reservation
//   actualCredits === 1 → no-op (reservation was exact)
//   actualCredits > 1   → deduct the extra, best-effort

The credit_ledger

table captures both legs (reservation + settle) so the audit trail reconstructs the actual cost of any run. A judge looking at /admin/evidence

can match a credit_ledger

row to its ai_runs

row and see: 1 reserved, 3 actual, 2 deducted at settle.

This pattern generalizes to any metered-resource problem (storage quotas, rate limits, seat counts) where you need "reserve N if available, settle later" semantics.

A small backyard poultry farm in Kaduna has different risk tolerance than a 5,000-bird commercial operation in Lagos. Some farms want the AI to suggest; others want it to act. The autonomy tier is a per-farm setting in farms.ai_autonomy

:

Tier	Behavior
`suggest`
AI cannot write anything. Tools throw if called. User gets text suggestions only.
`draft`
AI creates `assistant_drafts` rows. User confirms or discards. Default for new farms.
`auto`
AI writes directly for trusted categories (tasks, notes, feed logs). Still drafts financial/destructive writes.

The crucial design choice: financial and destructive writes are hardcoded to draft, regardless of autonomy tier. No farm can opt out of human confirmation for a sale, expense, mortality event, or recommendation approval. The autonomy tier only controls the no-op write categories.

Implementation: farms.ai_autonomy

is the tier; farms.ai_auto_categories

is a JSONB allowlist for the auto

tier; farms.ai_record_autonomy

is per-record-type override. The lib/ai/draft-executor.ts

resolves all three before each write. The schema is the source of truth; the executor is a small dispatcher.

// Pseudo
function resolveAutonomy(farm, recordType) {
  if (FINANCIAL_DESTRUCTIVE.has(recordType)) return 'draft'  // hardcoded floor
  if (farm.ai_autonomy === 'suggest') return 'suggest'
  if (farm.ai_autonomy === 'auto' && farm.ai_auto_categories.includes(recordType)) return 'auto'
  return 'draft'
}

This is the AI-safety pattern that doesn't depend on prompt engineering. The model can be convinced to do anything; the executor refuses the write at the schema level.

RAG in a multi-tenant B2B has a specific shape: every vector query is scoped by farm_id

, the corpus is per-tenant (documents, conversation summaries, memories), and the index must be transactional with the writes (a deleted document must not appear in search results).

Standalone vector databases (Pinecone, Weaviate) solve the vector problem but create a second source of truth: the vector DB says "doc X is relevant", you have to round-trip to Postgres to authorize the read, and the two can drift. You also pay for a separate service.

pgvector inside Aurora collapses the two into one. The vector column lives on the same row as the tenant ID:

CREATE TABLE embeddings (
  id          TEXT PRIMARY KEY,
  farm_id     TEXT NOT NULL,
  source_type TEXT NOT NULL,         -- 'memory' | 'document' | 'convo_summary'
  source_id   TEXT NOT NULL,
  content     TEXT NOT NULL,
  embedding   VECTOR(1024) NOT NULL
);

CREATE INDEX ON embeddings
  USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);

-- Per-farm RAG query: tenant scoping + vector search in one shot
SELECT id, content, 1 - (embedding <=> $1) AS score
  FROM embeddings
 WHERE farm_id = $2                -- tenant boundary
   OR farm_id = '__global__'       -- shared knowledge base (vaccination schedules, biology)
 ORDER BY embedding <=> $1
 LIMIT 4;

The farm_id filter runs first; the HNSW scan runs against the filtered subset. A hallucinated query for tenant-A can't reach tenant-B's embeddings — the WHERE clause is enforced by Postgres, not by prompt engineering.

The hackathon's first hard requirement is AWS Database as the primary backend. I chose Aurora PostgreSQL over DynamoDB for one reason: financial-grade operations require relational integrity.

My credit_ledger

enforces balance_after >= 0

with a CHECK constraint. My assistant_drafts

enforces record_type IN ('mortality','feed_use','sale','expense','water','weight','symptom','note')

with a CHECK — the AI cannot draft a record type that doesn't exist. My farm_members

enforces the join table invariant (farm_id, user_id)

UNIQUE. None of these are expressible in DynamoDB without application-level enforcement.

The trade-off: Aurora is a heavier operational lift than DynamoDB (connection pooling, failover tuning, vacuuming). For a hackathon, I accepted the complexity because the integrity invariants are the product.

┌──────────────┐    OIDC JWT    ┌──────────────┐
│   Vercel     │ ─────────────► │     STS      │
│  Functions   │                │ (AssumeRole) │
└──────┬───────┘                └──────┬───────┘
       │                                │
       │  15-min IAM token              │
       ▼                                ▼
┌──────────────┐                ┌──────────────┐
│ RDS Signer   │                │  Bedrock     │
│ (DB auth)    │                │  Runtime     │
└──────┬───────┘                └──────────────┘
       │
       ▼
┌──────────────────────────────────────────────┐
│       Aurora PostgreSQL 17 + pgvector        │
│  ┌────────────┐  ┌────────────┐  ┌─────────┐ │
│  │   farms    │  │ ai_runs    │  │ memories│ │
│  │   ...40+   │  │ credit_    │  │ embed-  │ │
│  │  tables    │  │ ledger     │  │ dings   │ │
│  │            │  │ drafts     │  │         │ │
│  └────────────┘  └────────────┘  └─────────┘ │
└──────────────────────────────────────────────┘

Two IAM roles: AWS_ROLE_ARN

(DB-only, permission-boundary-capped) and BEDROCK_ROLE_ARN

(AI-only, no DB access). Blast-radius minimization at the trust boundary.

Building AI governance as a database primitive isn't the typical approach. Most LLM applications default to a simple messages

table and a chat UI, treating the AI as an external novelty. But when building for agricultural operations, a farmer's data integrity is just as critical as a bank's.

By enforcing invariants natively in the relational schema—using integer kobo for financial accuracy, FOR UPDATE

row locks to prevent race conditions, append-only ledgers, and a per-farm autonomy state machine—the application guarantees safety before the AI ever generates a response.

The patterns described here are designed to go beyond a proof-of-concept. They demonstrate how to build shipping-grade, multi-tenant AI systems where data integrity is structurally guaranteed by the database itself.

docs/ARCHITECTURE.md

This project is built for the H0 Hack the Zero Stack hackathon (#H0Hackathon), in the Monetizable B2B App track. It deploys on Vercel with Amazon Aurora PostgreSQL as the primary backend. I created this piece of content for the purposes of entering the hackathon.

source & further reading

dev.to — original article Portable AI on a USB Stick Sleep Consolidation for AI Memory Two Doors, One Gate: Navigating Governance Beyond EDD

AI Governance as a Database Primitive: Building FarmOps Desk on Aurora + pgvector + Bedrock

Run your AI side-project on zahid.host