Building a RAG System from Scratch — Cloud Deployment with Render and Supabase

A developer built and deployed a Retrieval-Augmented Generation (RAG) system to the cloud using Render and Supabase. The system uses pgvector for vector search and is accessible via an MCP server hosted on Render with a Supabase PostgreSQL database. The deployment involved migrating data from a local Docker setup to Supabase and configuring the server for cloud hosting.

In the previous article https://dev.to/hiroki-kameyama/building-a-rag-system-from-scratch-mcp-exposing-pgvector-as-a-reusable-tool-server-2onc , we built an MCP server that any LLM client can connect to locally. In this article, we'll deploy it to the cloud — making it accessible from anywhere. Before: localhost:8000 → pgvector Docker After: https://your-app.onrender.com/mcp → Supabase pgvector Both services are free to start with no credit card required. | Service | Role | Free tier | |---|---|---| Render | Host the MCP HTTP server | Persistent web service sleeps after 15 min | Supabase | Managed PostgreSQL + pgvector | 500MB, persistent | Open SQL Editor in the Supabase dashboard and run: CREATE EXTENSION IF NOT EXISTS vector; CREATE TABLE IF NOT EXISTS documents id SERIAL PRIMARY KEY, title TEXT NOT NULL, body TEXT NOT NULL, category TEXT, created at TIMESTAMP DEFAULT NOW , embedding vector 768 ; CREATE INDEX IF NOT EXISTS docs embedding idx ON documents USING hnsw embedding vector cosine ops WITH m = 16, ef construction = 64 ; CREATE INDEX ON documents category ; Click the Connect button at the top of the dashboard not Settings → Database — the UI has changed . Select the Connection pooling tab and copy the Transaction mode URI. It looks like: postgresql://postgres.xxxx:password@aws-0-ap-northeast-1.pooler.supabase.com:6543/postgres Why port 6543?The standard port 5432 uses IPv6, which Render doesn't support. The Connection Pooler port 6543 uses IPv4 and is the correct choice for cloud-to-cloud connections. Add the Supabase URL to your .env : DATABASE URL=postgresql://postgres.xxxx:password@aws-0-ap-northeast-1.pooler.supabase.com:6543/postgres Then run the migration: python migrate to supabase.py import psycopg2 from dotenv import load dotenv import os load dotenv local conn = psycopg2.connect host=os.getenv "DB HOST" , port=os.getenv "DB PORT" , dbname=os.getenv "DB NAME" , user=os.getenv "DB USER" , password=os.getenv "DB PASSWORD" , local cur = local conn.cursor supa conn = psycopg2.connect os.getenv "DATABASE URL" , sslmode="require" supa cur = supa conn.cursor local cur.execute "SELECT title, body, category, embedding FROM documents;" rows = local cur.fetchall print f"Migrating {len rows } documents..." for row in rows: title, body, category, embedding = row supa cur.execute """ INSERT INTO documents title, body, category, embedding VALUES %s, %s, %s, %s ON CONFLICT DO NOTHING; """, title, body, category, embedding supa conn.commit supa cur.execute "SELECT COUNT FROM documents;" count = supa cur.fetchone 0 print f"Done. Documents in Supabase: {count}" local conn.close supa conn.close python migrate to supabase.py Migrating 5 documents... Done. Documents in Supabase: 5 Create mcp server/server render.py . The only differences from server.py : DATABASE URL env var PORT env var Render sets this automatically streamable-http instead of stdio python mcp server/server render.py import psycopg2 from google import genai from google.genai import types as genai types from fastmcp import FastMCP from dotenv import load dotenv import os load dotenv mcp = FastMCP name="pgvector-search", instructions="Document search server using pgvector.", gemini client = genai.Client api key=os.getenv "GEMINI API KEY" Use DATABASE URL if available Render/Supabase , fall back to individual vars DATABASE URL = os.getenv "DATABASE URL" if DATABASE URL: conn = psycopg2.connect DATABASE URL, sslmode="require" else: conn = psycopg2.connect host=os.getenv "DB HOST", "localhost" , port=os.getenv "DB PORT", "5432" , dbname=os.getenv "DB NAME", "vectordb" , user=os.getenv "DB USER", "postgres" , password=os.getenv "DB PASSWORD", "password" , cur = conn.cursor def get embedding text: str - list float : result = gemini client.models.embed content model="gemini-embedding-001", contents=text, config=genai types.EmbedContentConfig task type="RETRIEVAL QUERY", output dimensionality=768, , return result.embeddings 0 .values @mcp.tool def search documents query: str, top k: int = 3 - list dict : """Search all document categories for a given query.""" q = get embedding query cur.execute """ SELECT title, body, category, 1 - embedding <= %s::vector AS similarity FROM documents ORDER BY embedding <= %s::vector LIMIT %s; """, q, q, top k return {"title": r 0 , "body": r 1 , "category": r 2 , "similarity": round r 3 , 4 } for r in cur.fetchall @mcp.tool def search by category query: str, category: str, top k: int = 3 - list dict : """Search within a specific category ML, Python, or Cloud .""" q = get embedding query cur.execute """ SELECT title, body, category, 1 - embedding <= %s::vector AS similarity FROM documents WHERE category = %s ORDER BY embedding <= %s::vector LIMIT %s; """, q, category, q, top k return {"title": r 0 , "body": r 1 , "category": r 2 , "similarity": round r 3 , 4 } for r in cur.fetchall @mcp.tool def list categories - list dict : """Return all available categories and document counts.""" cur.execute """ SELECT category, COUNT as count FROM documents GROUP BY category ORDER BY count DESC; """ return {"category": r 0 , "count": r 1 } for r in cur.fetchall if name == " main ": port = int os.getenv "PORT", 8000 Render sets PORT automatically mcp.run transport="streamable-http", host="0.0.0.0", port=port, Push to GitHub: git add . git commit -m "feat: add Render deployment server" git push origin main | Field | Value | |---|---| | Name | pgvector-mcp-server | | Runtime | Python 3 | | Build Command | pip install -r requirements.txt | | Start Command | python mcp server/server render.py | | Instance Type | Free | | Key | Value | |---|---| GEMINI API KEY | AIza... | DATABASE URL | The Connection Pooler URI from Supabase port 6543 | Once deployed, visit: https://pgvector-mcp-server.onrender.com/mcp You'll see: {"jsonrpc":"2.0","id":"server-error","error":{"code":-32600,"message":"Not Acceptable: Client must accept text/event-stream"}} This is correct — it means the server is running. The error appears because browsers aren't MCP clients. Your agent will connect without issues. First request is slow:Render's free tier sleeps after 15 minutes of inactivity. The first request after sleep takes 30–60 seconds to wake up. Use UptimeRobot free to ping every 5 minutes and prevent sleep. Update 13 mcp http agent.py with the Render URL: 13 mcp http agent.py MCP SERVER URL = "https://pgvector-mcp-server.onrender.com/mcp" async def run agent task: str : async with Client MCP SERVER URL as mcp client: URL instead of file path mcp tools = await mcp client.list tools print f"Loaded {len mcp tools } tools from remote MCP server" ... rest of the agentic loop is identical ... python 13 mcp http agent.py Loaded 3 tools from remote MCP server Step 1 → list categories {} Step 2 → search by category {'query': 'evaluation metrics', 'category': 'ML'} Done in 3 steps The agent is now querying pgvector on Supabase through an MCP server running on Render — entirely in the cloud. | Error | Cause | Fix | |---|---|---| Network is unreachable IPv6 | Using port 5432 | Use Connection Pooler URL port 6543 | SSL connection required | Missing sslmode | Add sslmode="require" | ModuleNotFoundError | Missing package | Run pip freeze requirements.txt and push | | Slow first response | Render sleep | Use UptimeRobot to keep alive | Not Found at / | Wrong URL | Add /mcp to the URL | Local development: Python agent → stdio → mcp server/server.py → Docker pgvector Cloud deployment: Python agent → HTTPS → Render server render.py → Supabase pgvector The codebase is identical. The infrastructure changed around it. In the final article, we'll wrap up the series with a summary of all design decisions and point to Vol.2 — where we cover Evals, Observability, Security, MLOps, Fine-tuning, Multi-Agent, and Governance. Full source code: github.com/qameqame/pgvector-tutorial