SQL + AI: Real-World Database Solutions You Can Use Today

A developer demonstrates how to integrate AI capabilities directly into PostgreSQL using the pgvector extension, enabling semantic search, natural language queries, and vector storage without a separate database. The approach combines SQL filters with cosine similarity searches in a single query, as shown with a product catalog example. All code is available on GitHub.

📦 All code examples in this article are available on GitHub: github.com/andre-carbajal/sql-ai-database-solutions Databases and AI used to live in separate worlds. Your PostgreSQL instance handled structured queries, and your ML pipeline ran somewhere else entirely. That separation is rapidly disappearing. In 2025, your SQL database can store embeddings, answer natural language questions, power semantic search, and serve as the memory layer for autonomous AI agents — all without adding a separate vector database to your stack. This article walks through four real-world patterns with working code: pip install psycopg2-binary pgvector langchain langchain-openai openai sqlalchemy python-dotenv Docker: run PostgreSQL with pgvector support docker run --name pgvector-demo \ -e POSTGRES PASSWORD=secret \ -p 5432:5432 \ -d pgvector/pgvector:pg16 Set your environment variables: .env OPENAI API KEY=sk-... DATABASE URL=postgresql://postgres:secret@localhost:5432/aidb pgvector is a PostgreSQL extension that adds a native vector data type, allowing you to store high-dimensional embeddings and run similarity searches with standard SQL. Instead of maintaining a separate vector database Pinecone, Weaviate, Chroma , you keep your vectors beside your relational data . That means you can combine semantic search with SQL filters in a single query. -- Enable the extension CREATE EXTENSION IF NOT EXISTS vector; -- Create a products table with an embedding column CREATE TABLE products id SERIAL PRIMARY KEY, name TEXT NOT NULL, description TEXT, category TEXT, price DECIMAL 10, 2 , embedding VECTOR 1536 -- OpenAI text-embedding-3-small dimensions ; -- Create an HNSW index for fast approximate nearest-neighbor search CREATE INDEX ON products USING hnsw embedding vector cosine ops WITH m = 16, ef construction = 64 ; python import os import psycopg2 import numpy as np from openai import OpenAI from dotenv import load dotenv load dotenv client = OpenAI api key=os.environ "OPENAI API KEY" conn = psycopg2.connect os.environ "DATABASE URL" cur = conn.cursor def get embedding text: str - list float : response = client.embeddings.create model="text-embedding-3-small", input=text return response.data 0 .embedding products = "Wireless Noise-Cancelling Headphones", "Premium audio with 30hr battery", "electronics", 299.99 , "Ergonomic Office Chair", "Lumbar support, adjustable arms", "furniture", 449.00 , "Python Programming Book", "Comprehensive guide for beginners", "books", 39.99 , "Mechanical Keyboard", "Cherry MX switches, RGB backlit", "electronics", 129.00 , "Standing Desk", "Electric height-adjustable desk", "furniture", 599.00 , for name, description, category, price in products: embedding = get embedding f"{name}: {description}" cur.execute """ INSERT INTO products name, description, category, price, embedding VALUES %s, %s, %s, %s, %s """, name, description, category, price, embedding conn.commit print "Products inserted with embeddings." python def semantic search query: str, max price: float = None, category: str = None, limit: int = 5 : query embedding = get embedding query sql = """ SELECT name, description, category, price, 1 - embedding <= %s::vector AS similarity FROM products WHERE 1=1 """ params = query embedding if max price: sql += " AND price <= %s" params.append max price if category: sql += " AND category = %s" params.append category sql += " ORDER BY embedding <= %s::vector LIMIT %s" params += query embedding, limit cur.execute sql, params return cur.fetchall "I want something for my home office under $500" results = semantic search query="comfortable workspace setup for long work sessions", max price=500.0 for name, description, category, price, similarity in results: print f" {similarity:.3f} {name} ${price} — {description}" Output: 0.891 Ergonomic Office Chair $449.00 — Lumbar support, adjustable arms 0.832 Mechanical Keyboard $129.00 — Cherry MX switches, RGB backlit 0.801 Standing Desk $599.00 — Electric height-adjustable desk The <= operator computescosine distance. pgvector also supports <- L2/Euclidean and < inner product . Retrieval-Augmented Generation RAG lets you inject your own data into an LLM's context at query time. Instead of fine-tuning, you retrieve the most relevant chunks from your database and pass them as context. User Question │ ▼ Embed the Question │ ▼ Vector Search in PostgreSQL ──▶ Top-K relevant chunks │ ▼ Build LLM Prompt with context │ ▼ LLM generates answer grounded in your data python from langchain openai import OpenAIEmbeddings, ChatOpenAI from langchain community.vectorstores import PGVector from langchain.text splitter import RecursiveCharacterTextSplitter from langchain.chains import RetrievalQA from langchain community.document loaders import TextLoader 1. Load and chunk your documents loader = TextLoader "company docs.txt" documents = loader.load splitter = RecursiveCharacterTextSplitter chunk size=500, chunk overlap=50 chunks = splitter.split documents documents 2. Store chunks + embeddings in PostgreSQL CONNECTION STRING = os.environ "DATABASE URL" COLLECTION NAME = "company knowledge base" vector store = PGVector.from documents documents=chunks, embedding=OpenAIEmbeddings model="text-embedding-3-small" , collection name=COLLECTION NAME, connection string=CONNECTION STRING, 3. Build a retrieval-augmented QA chain llm = ChatOpenAI model="gpt-4o-mini", temperature=0 retriever = vector store.as retriever search type="similarity", search kwargs={"k": 4} qa chain = RetrievalQA.from chain type llm=llm, chain type="stuff", retriever=retriever, return source documents=True 4. Ask questions result = qa chain.invoke {"query": "What is our refund policy?"} print result "result" print "\nSources:" for doc in result "source documents" : print f" - {doc.metadata.get 'source', 'unknown' }" Store documents with tenant metadata vector store = PGVector.from documents documents=chunks, embedding=OpenAIEmbeddings , collection name="docs", connection string=CONNECTION STRING, Each chunk carries metadata you can filter on Retrieve only documents for a specific tenant retriever = vector store.as retriever search kwargs={ "k": 5, "filter": {"tenant id": "acme-corp"} } NL2SQL also called Text-to-SQL lets users query your database in plain English. An LLM reads your schema and translates a natural language question into a valid SQL query. python from openai import OpenAI client = OpenAI SCHEMA = """ Tables: orders id, customer id, product id, quantity, total price, created at, status customers id, name, email, country, created at products id, name, category, price, stock """ def nl to sql question: str - str: prompt = f"""You are a SQL expert. Given the following database schema, write a PostgreSQL query to answer the user's question. Return ONLY the SQL query, no explanation. Schema: {SCHEMA} Question: {question} SQL:""" response = client.chat.completions.create model="gpt-4o-mini", messages= {"role": "user", "content": prompt} , temperature=0 return response.choices 0 .message.content.strip Real-world examples questions = "What are the top 5 countries by total revenue this year?", "Which products have never been ordered?", "Show me customers who placed more than 3 orders in the last 30 days", for q in questions: print f"Q: {q}" print f"SQL: {nl to sql q }" print Output: -- "What are the top 5 countries by total revenue this year?" SELECT c.country, SUM o.total price AS total revenue FROM orders o JOIN customers c ON o.customer id = c.id WHERE EXTRACT YEAR FROM o.created at = EXTRACT YEAR FROM CURRENT DATE GROUP BY c.country ORDER BY total revenue DESC LIMIT 5; -- "Which products have never been ordered?" SELECT p.id, p.name, p.category FROM products p LEFT JOIN orders o ON p.id = o.product id WHERE o.id IS NULL; Never run LLM-generated SQL without validation. Here's a safer pattern: python import sqlparse import re ALLOWED STATEMENTS = {"SELECT"} def is safe sql sql: str - bool: """Reject anything that isn't a SELECT statement.""" parsed = sqlparse.parse sql if not parsed: return False statement type = parsed 0 .get type return statement type in ALLOWED STATEMENTS def execute nl query question: str, conn - list dict : sql = nl to sql question Strip markdown fences if LLM added them sql = re.sub r" sql| ", "", sql .strip if not is safe sql sql : raise ValueError f"Unsafe SQL rejected: {sql}" cur = conn.cursor cur.execute sql columns = desc 0 for desc in cur.description rows = cur.fetchall return dict zip columns, row for row in rows Adding examples dramatically improves query accuracy on complex schemas: FEW SHOT EXAMPLES = """ Example 1: Question: How many orders were placed last month? SQL: SELECT COUNT FROM orders WHERE created at = DATE TRUNC 'month', NOW - INTERVAL '1 month' AND created at < DATE TRUNC 'month', NOW ; Example 2: Question: What is the average order value by product category? SQL: SELECT p.category, AVG o.total price AS avg order value FROM orders o JOIN products p ON o.product id = p.id GROUP BY p.category ORDER BY avg order value DESC; """ def nl to sql fewshot question: str - str: prompt = f"""You are a SQL expert for a PostgreSQL e-commerce database. Schema: {SCHEMA} {FEW SHOT EXAMPLES} Question: {question} SQL:""" ... same API call as before The most powerful pattern: an AI agent that can autonomously decide which queries to run, inspect the results, and iterate to answer complex multi-step questions. python from langchain.agents import AgentExecutor, create openai tools agent from langchain openai import ChatOpenAI from langchain.tools import tool from langchain import hub import psycopg2 conn = psycopg2.connect os.environ "DATABASE URL" @tool def query database sql: str - str: """ Execute a read-only SQL SELECT query against the e-commerce database. Use this to retrieve data needed to answer user questions. Never use INSERT, UPDATE, DELETE, or DROP. """ if not is safe sql sql : return "Error: Only SELECT queries are allowed." try: cur = conn.cursor cur.execute sql columns = desc 0 for desc in cur.description rows = cur.fetchmany 50 cap results result = dict zip columns, row for row in rows return str result except Exception as e: return f"Query error: {e}" @tool def get schema - str: """ Returns the database schema — table names, columns, and types. Call this first to understand what data is available before writing queries. """ cur = conn.cursor cur.execute """ SELECT table name, column name, data type FROM information schema.columns WHERE table schema = 'public' ORDER BY table name, ordinal position; """ rows = cur.fetchall schema = {} for table, column, dtype in rows: schema.setdefault table, .append f"{column} {dtype} " return "\n".join f"{table}: {', '.join cols }" for table, cols in schema.items @tool def semantic product search query: str - str: """ Search for products using semantic/natural language similarity. Use this when the user describes what they want rather than specifying exact product names. """ results = semantic search query, limit=5 return str {"name": r 0 , "description": r 1 , "price": r 3 } for r in results Assemble the agent llm = ChatOpenAI model="gpt-4o", temperature=0 tools = get schema, query database, semantic product search prompt = hub.pull "hwchase17/openai-tools-agent" agent = create openai tools agent llm, tools, prompt agent executor = AgentExecutor agent=agent, tools=tools, verbose=True, max iterations=10 Complex multi-step question response = agent executor.invoke { "input": """ I'm looking for ergonomic office products. Find the top 3 most relevant products, then check if we have customers from Germany who bought office furniture in the past 6 months. Give me a summary of both findings. """ } print response "output" The agent will autonomously: semantic product search "ergonomic office products" get schema to understand the tables ┌─────────────────────────────────────────────────────┐ │ PostgreSQL Database │ │ │ │ ┌─────────────┐ ┌──────────────┐ ┌───────────┐ │ │ │ products │ │ embeddings │ │ orders │ │ │ │ customers │ │ pgvector │ │ ... │ │ │ └─────────────┘ └──────────────┘ └───────────┘ │ └───────────────────────┬─────────────────────────────┘ │ ┌────────────┼────────────┐ ▼ ▼ ▼ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ Semantic │ │ NL2SQL │ │ RAG │ │ Search │ │ Agent │ │ Pipeline │ └──────────┘ └──────────┘ └──────────┘ │ │ │ └────────────┼────────────┘ ▼ ┌──────────┐ │ LLM │ │ GPT-4o │ └──────────┘ │ ▼ User Response Index strategy for pgvector: -- HNSW: faster queries, slower inserts — best for production read-heavy workloads CREATE INDEX ON products USING hnsw embedding vector cosine ops WITH m = 16, ef construction = 64 ; -- IVFFlat: faster inserts, slightly less accurate — good for bulk loading CREATE INDEX ON products USING ivfflat embedding vector l2 ops WITH lists = 100 ; -- Rule of thumb: lists ≈ sqrt number of rows Batching embeddings: Don't call the embeddings API one row at a time texts = f"{p 'name' }: {p 'description' }" for p in products response = client.embeddings.create model="text-embedding-3-small", input=texts batch up to 2048 inputs at once embeddings = item.embedding for item in response.data Caching repeated NL2SQL queries: python import hashlib import json query cache: dict str, str = {} def cached nl to sql question: str - str: key = hashlib.md5 question.lower .strip .encode .hexdigest if key not in query cache: query cache key = nl to sql question return query cache key Before shipping any of these patterns to production: -- Create a restricted role for your AI application CREATE ROLE ai readonly; GRANT CONNECT ON DATABASE aidb TO ai readonly; GRANT USAGE ON SCHEMA public TO ai readonly; GRANT SELECT ON ALL TABLES IN SCHEMA public TO ai readonly; ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO ai readonly; All the code from this article — including Docker Compose setup, schema migrations, and ready-to-run examples — is in the companion repository: The repo structure: sql-ai-database-solutions/ ├── docker-compose.yml PostgreSQL + pgvector, ready to go ├── schema/ │ └── 001 init.sql Tables, pgvector extension, indexes ├── examples/ │ ├── 01 pgvector search.py Semantic search │ ├── 02 rag pipeline.py RAG with LangChain │ ├── 03 nl2sql.py Natural language to SQL │ └── 04 sql agent.py Autonomous SQL agent ├── .env.example └── README.md The patterns here are just the start. Some directions worth exploring: The line between "database" and "AI system" is getting thinner every month. If you're already running PostgreSQL, you're closer to a production AI stack than you might think. Have questions or want to share what you've built? Drop a comment below or open an issue on the repo