{"slug": "how-i-built-a-rag-grounded-discord-brain-in-5-weeks-solo-esl-no-funding", "title": "How I built a RAG-grounded Discord brain in 5 weeks (solo, ESL, no funding)", "summary": "Peng, a solo founder and ESL teacher in Taipei, built Acortia in five weeks: a Discord-native \"Company Brain\" that answers questions with grounded, cited responses using saved server content. The $99/month bot, launching mid-June, ingests documents via `/save` and retrieves answers through `/ask`, using Supabase with pgvector for vector search and Render for processing. Acortia aims to solve the problem of institutional knowledge getting buried in Discord threads, pinned messages, and scattered documents, reducing moderator burnout from re-answering repeated questions.", "body_md": "A user in our Discord asked, for the fourth time that week, the same question. Same wording, almost. The first three answers were buried somewhere in a thread, a pinned message, and a Notion page nobody bookmarked. A mod typed it out again. I watched it happen, opened Cursor, and started typing.\n\nThat's the moment Acortia became a product instead of a side note.\n\nI'm Peng. Solo founder. Non-native English speaker. ESL teacher in Taipei by day, building backend software at night and on weekends. No funding. No team. No accelerator yet; the YC F26 application is in. Five weeks ago I committed to building **Acortia**: a Discord-native Company Brain that answers `/ask <q>` with a grounded, cited answer pulled from whatever the server has `/save`d. $99/month. Mid-June launch.\n\nThis is the build log. Real numbers, real bugs, real tradeoffs. No hype.\n\nDiscord communities accumulate institutional knowledge the way a cluttered desk accumulates receipts: faster than anyone can file it. Threads scroll past. Pinned messages cap at 50. Search is keyword-based and stops at the channel boundary. 
New members ask questions that were answered six months ago in a thread that's now archived.\n\nThe cost isn't dramatic. It's grinding. Mods burn out re-answering. Founders re-explain pricing. Engineers re-link the same architecture diagram. Knowledge exists; it just isn't retrievable.\n\nI looked at the existing options. Notion + Discord bots: too much manual upkeep. Generic AI chatbots: hallucinate confidently with no source. Custom in-house RAG: out of reach for the average community. The gap was a thin, opinionated tool that lived where the conversation already happened.\n\nAcortia is three slash commands and a cron job.\n\n- `/save <url>`: ingest a doc, a thread, a webpage, a PDF. The worker chunks it, embeds it, stores it.\n- `/ask <q>`: retrieve top-k chunks via cosine similarity, ground a model response in them, return the answer with citations.\n- `/sources`: list what the server has ingested. Audit trail.\n\nInstall: OAuth the bot, click through to `api.acortia.com/install`, claim the workspace via magic-link email. Thirty seconds end-to-end if the operator already has Discord admin.\n\nThat's the whole product surface. Everything else is plumbing.\n\n**Discord is the surface.** Three slash commands registered globally, one OAuth flow, webhook-style interaction endpoints handled by the Render web service.\n\n**Supabase is the brain.** Seven tables. Postgres with the `pgvector` extension. Row Level Security keyed to `workspace_id`. A single SQL RPC, `match_artifacts`, does the vector search. RLS means a misrouted query physically cannot return another workspace's data: the database itself enforces tenancy.\n\n**Render is the muscle.** A web service handles interactive Discord requests with a < 3s deadline. A worker process handles the slow path: fetch URL, extract text (PDF connector for `application/pdf`, readability-style extractor for HTML), chunk, embed, write. 
A cron on a `*/15` schedule (every 15 minutes) sweeps queued ingest jobs and re-runs anything that timed out.\n\n**Stripe is the till.** Checkout session for the $99/mo plan, webhook handler with idempotency (every event ID is upserted into `stripe_events_seen` before any side effect runs), portal link for self-serve management. Promo codes managed in the Stripe dashboard.\n\nHere's the SQL for the only RPC the app calls for retrieval. Stylized: the live function has more telemetry, but this is the shape.\n\n```sql\n-- match_artifacts: cosine similarity search scoped by workspace\ncreate or replace function match_artifacts(\n  query_embedding vector(1536),\n  workspace_id_input uuid,\n  match_count int default 5,\n  min_similarity float default 0.15\n)\nreturns table (\n  artifact_id uuid,\n  chunk_id uuid,\n  content text,\n  source_url text,\n  similarity float\n)\nlanguage sql stable\nas $$\n  select\n    a.id as artifact_id,\n    c.id as chunk_id,\n    c.content,\n    a.source_url,\n    -- <=> is pgvector's cosine distance operator, so 1 - distance is cosine similarity\n    1 - (c.embedding <=> query_embedding) as similarity\n  from chunks c\n  join artifacts a on a.id = c.artifact_id\n  -- tenancy filter: only chunks that belong to the calling workspace\n  where a.workspace_id = workspace_id_input\n    and 1 - (c.embedding <=> query_embedding) >= min_similarity\n  -- smallest distance first, i.e. most similar first\n  order by c.embedding <=> query_embedding\n  limit match_count;\n$$;\n```\n\nTwo numbers in there are worth naming: `match_count = 5` and `min_similarity = 0.15`. I tuned both empirically against my own corpus. A higher k bloats the context window without lifting answer quality; a lower threshold lets junk through and the model hedges. A lower k makes confident answers brittle when the corpus is sparse. These are the knobs you'll want to revisit per-customer in v2.\n\nHere's `/ask`, sanitized and stylized. 
The real handler has more error wrapping and a deferred-response pattern for Discord's 3-second deadline, but the spine looks like this:\n\n```ts\n// apps/web/src/routes/interactions/ask.ts (illustrative)\n// resolveWorkspace, reply, logQuery, formatWithCitations and the\n// DiscordInteraction type are app helpers that are not shown here.\nimport { embed } from \"../../lib/embed\";\nimport { supabase } from \"../../lib/supabase\";\nimport { groundAnswer } from \"../../lib/llm\";\n\nexport async function handleAsk(interaction: DiscordInteraction) {\n  // The question is the first (and only) slash-command option\n  const question = interaction.data.options[0].value as string;\n  const workspaceId = await resolveWorkspace(interaction.guild_id);\n\n  // Embed the question, then search only this workspace's chunks\n  const queryEmbedding = await embed(question);\n\n  const { data: matches, error } = await supabase.rpc(\"match_artifacts\", {\n    query_embedding: queryEmbedding,\n    workspace_id_input: workspaceId,\n    match_count: 5,\n    min_similarity: 0.15,\n  });\n\n  if (error) throw error;\n  // Nothing above the similarity threshold: refuse rather than guess\n  if (!matches?.length) {\n    return reply(interaction, \"No grounded sources found. Try `/save` first.\");\n  }\n\n  const answer = await groundAnswer(question, matches);\n  await logQuery(workspaceId, question, matches, answer); // queries.metadata\n\n  return reply(interaction, formatWithCitations(answer, matches));\n}\n```\n\nThe `logQuery` call writes to `queries.metadata`, a JSON column that captures which artifacts were retrieved, the similarity scores, latency, and the model used. Telemetry isn't an afterthought; it's the only way to tell, six weeks in, whether the threshold of 0.15 is still right for a given customer.\n\nPinecone is excellent. It's also a second system to bill, monitor, and reconcile RLS against. Acortia's whole tenancy model is `workspace_id` on every table. If embeddings live in a separate vector DB, I have to re-implement multi-tenant isolation there and trust two systems instead of one.\n\npgvector keeps embeddings inside the same Postgres that enforces RLS. The retrieval call is a single RPC. Cost at MVP scale: included in the Supabase free tier. 
The day I outgrow it, the migration to a dedicated vector DB is a few hours, not a rewrite.\n\nDiscord OAuth tells me who installed the bot. It does not tell me which **email** owns the workspace for billing. I needed a second factor: a magic link sent to the operator's email so the Stripe Checkout, the invoice, and the workspace ownership all land on the same identity.\n\nThe decision inside that decision was implicit flow vs PKCE for the magic-link callback. I went with implicit. PKCE is more secure on paper, but it requires client-side code verifier storage, which is fragile in Discord's embedded browser context. Implicit + short-lived (10 min) one-time codes + server-side verification gave me a flow that worked first try on iOS Discord, Android Discord, and desktop. The tradeoff: implicit is theoretically replayable in the 10-minute window. Mitigation: one-time use enforced server-side, codes invalidated on first verification.\n\nI'll revisit PKCE in v2 when I have time to test the embedded-browser edge cases properly.\n\nVercel is faster to ship for stateless routes. Acortia is not stateless. The ingest pipeline can run longer than a typical serverless function's hard timeout, PDFs in particular. I needed a long-running worker process and a cron. Render gives me both with one config file and one bill. Web + worker + cron on Render hobby tier costs less than a sandwich per month at MVP scale.\n\nThe day I need autoscale across regions, I'll consider Fly. Not before.\n\nDay 20. A test user installed Acortia in two Discord servers using the same email, within about ninety seconds of each other. Both installs triggered a workspace-claim flow. Both wrote to the `workspaces` table. The second write silently overwrote the first install's billing pointer. The user ended up with one Stripe customer and two Discord servers, but only one of the servers was correctly linked.\n\nThe bug had two causes braided together. 
The naive implementation was:\n\n```ts\n// Buggy original: two installs collide\nconst existing = await supabase\n  .from(\"workspaces\")\n  .select(\"id\")\n  .eq(\"guild_id\", guildId)\n  .maybeSingle();\n\n// Race window: another request can insert between the select above and the write below\nif (existing.data) {\n  await supabase.from(\"workspaces\").update({ ... }).eq(\"id\", existing.data.id);\n} else {\n  await supabase.from(\"workspaces\").insert({ ... });\n}\n```\n\nClassic check-then-act. Two concurrent claims both saw `existing.data === null`, both ran `insert`, the unique constraint caught one and the other won the race. The losing install thought it succeeded because the response came from a different row.\n\nThe fix was atomic upsert plus moving email collection to claim time, not install time:\n\n```ts\n// Day-20 fix: atomic, idempotent\nconst { data, error } = await supabase\n  .from(\"workspaces\")\n  .upsert(\n    {\n      guild_id: guildId,\n      // claim_email is left out on purpose: it is collected later via magic link,\n      // and omitting it keeps a repeat install from nulling an existing value\n      claim_token: generateToken(),\n      claim_expires_at: new Date(Date.now() + 10 * 60 * 1000), // 10 minutes\n    },\n    { onConflict: \"guild_id\", ignoreDuplicates: false }\n  )\n  .select()\n  .single();\n```\n\nThe atomic upsert means the database decides the winner. The deferred email means the second install doesn't even try to write the email column until the magic link is verified, by which point a unique session token disambiguates the two installs. 
I also added a trigger that fails loudly if `claim_email` ever gets overwritten on a row that already has one. Defense in depth.\n\nStripe webhooks got the same treatment because they always should:\n\n```ts\n// Webhook idempotency: claim the event ID atomically before any side effect.\n// Assumes a unique constraint on stripe_events_seen.event_id.\nconst { error: claimError } = await supabase\n  .from(\"stripe_events_seen\")\n  .insert({ event_id: event.id });\n\n// 23505 is Postgres unique_violation: another delivery already claimed this event\nif (claimError?.code === \"23505\") return new Response(\"ok\", { status: 200 });\nif (claimError) throw claimError;\n\nawait handleStripeEvent(event); // runs at most once per event ID\n```\n\nIdempotent webhooks are non-negotiable. Stripe will retry. You will get duplicates. Plan for it on Day 1, not Day 30.\n\nThree things were on the board and got cut. Each cut was deliberate.\n\n**Slack adapter.** I scaffolded a platform-adapter abstraction on Day 8. The idea was that `/save` and `/ask` would be platform-agnostic and Slack would be a second surface. The scaffolding is in the repo. I did not build the Slack OAuth flow, slash command registration, or interaction handler. Reason: pre-launch Slack outreach produced zero signal. Discord operators were actively asking for the tool. Building Slack would have cost a week and shipped a feature for a customer I didn't have. Parked until live revenue justifies it.\n\n**Notion connector.** Considered. Killed. The use case I imagined, pulling Notion pages as artifacts, is well served by users copy-pasting URLs into `/save`. The MCP route through Claude Desktop is enough for the operator's personal workflow. A first-party Notion connector adds OAuth, page-permission edge cases, and a separate sync cron. Not worth the complexity at MVP.\n\n**Pipedream MCP custom server.** I spent a few hours wiring Pipedream as a generic connector tier. Backend was healthy, auth worked, but the abstraction was leaking into the slash-command UX. I cut it and routed power-user workflows through Claude Desktop's MCP instead. 
Acortia stays focused. Operators who want orchestration use Claude Desktop and call Acortia as a tool.\n\nTelemetry first. I added `queries.metadata` on Day 6, which was correct, but I didn't build a dashboard around it until Week 4. For the first three weeks I was debugging retrieval quality by reading raw Postgres rows. A 30-minute Metabase dashboard would have saved hours of squinting. If you're building RAG: instrument retrieval before you instrument anything else. You can't tune what you can't see.\n\nMid-June 2026 launch. Soft-live now for beta operators.\n\nInstall: **api.acortia.com/install**\n\nDomain: **acortia.com**\n\nPromo for readers of this post: `BETA-FREE-30D`. 100% off the first month, 10 redemptions, expires 2026-06-30 23:59 UTC. After that the price is $99/month flat. No per-seat. No usage tier. One Discord server, one bill.\n\nIf you operate a Discord community, run a developer relations team, or moderate a paid creator server: this was built for you. If you don't, the architecture above is open notes. Steal whatever's useful.\n\nI'm in Taipei. I teach English to fund this build. I am not a native English speaker and I rewrite half of what I publish three times before it reads cleanly. Every line of Acortia was written between lesson plans and weekend mornings. No team. No accelerator yet. No outside capital.\n\nWhat I'm proving with this build: a solo non-US founder can ship a credible B2B SaaS product end-to-end (auth, billing, RAG, multi-tenant data isolation, idempotent webhooks, a real cron pipeline) in five weeks of nights-and-weekends time, on a stack that costs less than a streaming subscription to run.\n\nIf that's interesting to you, the install link is above. If you want to talk shop, I'm on Discord and X under the same handle.\n\nBrief. Concept. Preview. 
Ship.", "url": "https://wpnews.pro/news/how-i-built-a-rag-grounded-discord-brain-in-5-weeks-solo-esl-no-funding", "canonical_source": "https://dev.to/incultnitollc/how-i-built-a-rag-grounded-discord-brain-in-5-weeks-solo-esl-no-funding-1fgh", "published_at": "2026-06-03 06:19:18+00:00", "updated_at": "2026-06-03 06:41:22.125163+00:00", "lang": "en", "topics": ["ai-products", "ai-startups", "ai-tools", "large-language-models", "generative-ai"], "entities": ["Acortia", "Peng", "Cursor", "Discord", "Notion", "YC"], "alternates": {"html": "https://wpnews.pro/news/how-i-built-a-rag-grounded-discord-brain-in-5-weeks-solo-esl-no-funding", "markdown": "https://wpnews.pro/news/how-i-built-a-rag-grounded-discord-brain-in-5-weeks-solo-esl-no-funding.md", "text": "https://wpnews.pro/news/how-i-built-a-rag-grounded-discord-brain-in-5-weeks-solo-esl-no-funding.txt", "jsonld": "https://wpnews.pro/news/how-i-built-a-rag-grounded-discord-brain-in-5-weeks-solo-esl-no-funding.jsonld"}}