cd /news/artificial-intelligence/building-a-rag-system-from-scratch-c… · home topics artificial-intelligence article
[ARTICLE · art-42131] src=dev.to ↗ pub= topic=artificial-intelligence verified=true sentiment=· neutral

Building a RAG System from Scratch — Cloud Deployment with Render and Supabase

A developer built and deployed a Retrieval-Augmented Generation (RAG) system to the cloud using Render and Supabase. The system uses pgvector for vector search and is accessible via an MCP server hosted on Render with a Supabase PostgreSQL database. The deployment involved migrating data from a local Docker setup to Supabase and configuring the server for cloud hosting.

read5 min views1 publishedJun 27, 2026

In the previous article, we built an MCP server that any LLM client can connect to locally. In this article, we'll deploy it to the cloud — making it accessible from anywhere.

Before: localhost:8000 → pgvector (Docker)
After:  https://your-app.onrender.com/mcp → Supabase (pgvector)

Both services are free to start with no credit card required.

Service Role Free tier
Render
Host the MCP HTTP server Persistent web service (sleeps after 15 min)
Supabase
Managed PostgreSQL + pgvector 500MB, persistent

Open SQL Editor in the Supabase dashboard and run:

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS documents (
    id          SERIAL PRIMARY KEY,
    title       TEXT NOT NULL,
    body        TEXT NOT NULL,
    category    TEXT,
    created_at  TIMESTAMP DEFAULT NOW(),
    embedding   vector(768)
);

CREATE INDEX IF NOT EXISTS docs_embedding_idx
ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

CREATE INDEX ON documents (category);

Click the Connect button at the top of the dashboard (not Settings → Database — the UI has changed).

Select the Connection pooling tab and copy the Transaction mode URI. It looks like:

postgresql://postgres.xxxx:password@aws-0-ap-northeast-1.pooler.supabase.com:6543/postgres

Why port 6543?The standard port 5432 uses IPv6, which Render doesn't support. The Connection Pooler (port 6543) uses IPv4 and is the correct choice for cloud-to-cloud connections.

Add the Supabase URL to your .env

:

DATABASE_URL=postgresql://postgres.xxxx:password@aws-0-ap-northeast-1.pooler.supabase.com:6543/postgres

Then run the migration:

import psycopg2
from dotenv import load_dotenv
import os

load_dotenv()

local_conn = psycopg2.connect(
    host=os.getenv("DB_HOST"), port=os.getenv("DB_PORT"),
    dbname=os.getenv("DB_NAME"), user=os.getenv("DB_USER"),
    password=os.getenv("DB_PASSWORD"),
)
local_cur = local_conn.cursor()

supa_conn = psycopg2.connect(os.getenv("DATABASE_URL"), sslmode="require")
supa_cur = supa_conn.cursor()

local_cur.execute("SELECT title, body, category, embedding FROM documents;")
rows = local_cur.fetchall()
print(f"Migrating {len(rows)} documents...")

for row in rows:
    title, body, category, embedding = row
    supa_cur.execute("""
        INSERT INTO documents (title, body, category, embedding)
        VALUES (%s, %s, %s, %s) ON CONFLICT DO NOTHING;
    """, (title, body, category, embedding))

supa_conn.commit()

supa_cur.execute("SELECT COUNT(*) FROM documents;")
count = supa_cur.fetchone()[0]
print(f"Done. Documents in Supabase: {count}")

local_conn.close()
supa_conn.close()
python migrate_to_supabase.py

Create mcp_server/server_render.py

. The only differences from server.py

:

DATABASE_URL

env varPORT

env var (Render sets this automatically)streamable-http

instead of stdio

import psycopg2
from google import genai
from google.genai import types as genai_types
from fastmcp import FastMCP
from dotenv import load_dotenv
import os

load_dotenv()

mcp = FastMCP(
    name="pgvector-search",
    instructions="Document search server using pgvector.",
)

gemini_client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))

DATABASE_URL = os.getenv("DATABASE_URL")
if DATABASE_URL:
    conn = psycopg2.connect(DATABASE_URL, sslmode="require")
else:
    conn = psycopg2.connect(
        host=os.getenv("DB_HOST", "localhost"),
        port=os.getenv("DB_PORT", "5432"),
        dbname=os.getenv("DB_NAME", "vectordb"),
        user=os.getenv("DB_USER", "postgres"),
        password=os.getenv("DB_PASSWORD", "password"),
    )

cur = conn.cursor()

def get_embedding(text: str) -> list[float]:
    result = gemini_client.models.embed_content(
        model="gemini-embedding-001",
        contents=text,
        config=genai_types.EmbedContentConfig(
            task_type="RETRIEVAL_QUERY",
            output_dimensionality=768,
        ),
    )
    return result.embeddings[0].values

@mcp.tool
def search_documents(query: str, top_k: int = 3) -> list[dict]:
    """Search all document categories for a given query."""
    q = get_embedding(query)
    cur.execute("""
        SELECT title, body, category,
               1 - (embedding <=> %s::vector) AS similarity
        FROM documents ORDER BY embedding <=> %s::vector LIMIT %s;
    """, (q, q, top_k))
    return [
        {"title": r[0], "body": r[1], "category": r[2], "similarity": round(r[3], 4)}
        for r in cur.fetchall()
    ]

@mcp.tool
def search_by_category(query: str, category: str, top_k: int = 3) -> list[dict]:
    """Search within a specific category (ML, Python, or Cloud)."""
    q = get_embedding(query)
    cur.execute("""
        SELECT title, body, category,
               1 - (embedding <=> %s::vector) AS similarity
        FROM documents WHERE category = %s
        ORDER BY embedding <=> %s::vector LIMIT %s;
    """, (q, category, q, top_k))
    return [
        {"title": r[0], "body": r[1], "category": r[2], "similarity": round(r[3], 4)}
        for r in cur.fetchall()
    ]

@mcp.tool
def list_categories() -> list[dict]:
    """Return all available categories and document counts."""
    cur.execute("""
        SELECT category, COUNT(*) as count
        FROM documents GROUP BY category ORDER BY count DESC;
    """)
    return [{"category": r[0], "count": r[1]} for r in cur.fetchall()]

if __name__ == "__main__":
    port = int(os.getenv("PORT", 8000))  # Render sets PORT automatically
    mcp.run(
        transport="streamable-http",
        host="0.0.0.0",
        port=port,
    )

Push to GitHub:

git add .
git commit -m "feat: add Render deployment server"
git push origin main
Field Value
Name pgvector-mcp-server
Runtime Python 3
Build Command pip install -r requirements.txt
Start Command python mcp_server/server_render.py
Instance Type Free
Key Value
GEMINI_API_KEY
AIza...
DATABASE_URL
The Connection Pooler URI from Supabase (port 6543)

Once deployed, visit:

https://pgvector-mcp-server.onrender.com/mcp

You'll see:

{"jsonrpc":"2.0","id":"server-error","error":{"code":-32600,"message":"Not Acceptable: Client must accept text/event-stream"}}

This is correct — it means the server is running. The error appears because browsers aren't MCP clients. Your agent will connect without issues.

First request is slow:Render's free tier sleeps after 15 minutes of inactivity. The first request after sleep takes 30–60 seconds to wake up. UseUptimeRobot to ping every 5 minutes and prevent sleep.

Update 13_mcp_http_agent.py

with the Render URL:

MCP_SERVER_URL = "https://pgvector-mcp-server.onrender.com/mcp"

async def run_agent(task: str):
    async with Client(MCP_SERVER_URL) as mcp_client:  # URL instead of file path
        mcp_tools = await mcp_client.list_tools()
        print(f"Loaded {len(mcp_tools)} tools from remote MCP server")
python 13_mcp_http_agent.py

The agent is now querying pgvector on Supabase through an MCP server running on Render — entirely in the cloud.

Error Cause Fix
Network is unreachable (IPv6)
Using port 5432 Use Connection Pooler URL (port 6543)
SSL connection required
Missing sslmode Add sslmode="require"
ModuleNotFoundError
Missing package Run pip freeze > requirements.txt and push
Slow first response Render sleep Use UptimeRobot to keep alive
Not Found at /
Wrong URL Add /mcp to the URL
Local development:
  Python agent → stdio → mcp_server/server.py → Docker pgvector

Cloud deployment:
  Python agent → HTTPS → Render (server_render.py) → Supabase pgvector

The codebase is identical. The infrastructure changed around it.

In the final article, we'll wrap up the series with a summary of all design decisions and point to Vol.2 — where we cover Evals, Observability, Security, MLOps, Fine-tuning, Multi-Agent, and Governance.

Full source code: github.com/qameqame/pgvector-tutorial

── more in #artificial-intelligence 4 stories · sorted by recency
── more on @render 3 stories trending now
sponsored brought to you by zahid.host 4,200+ EU-deployed projects
reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main
Live at https://your-agent.zahid.host
Get free account → Pricing
from €0/mo · no card required
LIVE [news/building-a-rag-syste…] indexed:0 read:5min 2026-06-27 ·