I Built a Self-Improving Health Platform: Five AI Agents That Learn Every Week

A developer built a self-improving health platform called Longevity AI that uses five AI agents to automatically update its knowledge base every week without human intervention. The system runs synthetic patient profiles through the production pipeline each Wednesday, scores the resulting reports on depth, personalization, specificity and legal safety, then automatically fetches PubMed abstracts to fill any knowledge gaps. The entire weekly improvement cycle costs approximately €0.55 and ensures the platform stays current with the latest health science research.

Originally published on longevityai.nl. For full context, comments and related articles, visit the source.

Most AI products are static. You fine-tune a model, ship it, and it stays exactly as smart as the day you launched. Your users get the same quality on day 1 as on day 365.

Mine doesn't work that way. Every Wednesday at 3am, five AI agents wake up, talk to each other, and make the next week's reports smarter, without me touching a single line of code. This is the architecture that makes it possible.

I run Longevity AI (https://longevityai.nl), a health platform that generates personalized 6-month lifestyle plans from a 28-question intake. It cross-references 10+ organ systems, checks 400+ EFSA-regulated claims, and outputs a clinically-framed report in under 15 minutes.

The core AI is Claude Sonnet. It's powerful. But it only knows what I've taught it.

The problem: health science moves fast.

- A new PubMed paper on magnesium and sleep quality drops. My platform doesn't know.
- A patient with a rare medication combination comes in. The report might miss the interaction.
- A legal claim sneaks past review. Nobody catches it until a user complains.

I had two options:

I chose option 2. Here's exactly how it works.

Five agents run on a fixed schedule. One orchestrates them all.

- Wednesday 03:00: Synthetic Patients Agent
- Wednesday 04:00: Auto-KB Agent
- Tuesday 03:30: Developer Tools Radar
- Monday 07:00: Weekly Digest Agent
- Always active: Agent Orchestrator

They don't share a runtime. They communicate through the database and a lightweight event system. No complex framework, just `reportAgentEvent` and a rules table.

The core of the self-improvement loop.

Every Wednesday at 3am, 10 synthetic patient profiles are selected from a static template library (5 conditions x 2 psychological archetypes). These are fake patients with real-looking intake responses: ferritin levels, medication lists, trauma history, stress scores.
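A minimal sketch of how a 5 x 2 template library like this could be assembled. The article does not name the five conditions or say which two archetypes the library uses, so the condition labels, the archetype pair, and every type and function name below are hypothetical placeholders.

```typescript
// Hypothetical shape of one synthetic patient template.
type SyntheticProfile = {
  id: string;
  condition: string;
  archetype: string;
};

// Placeholder values: the real conditions and archetypes are not given in the article.
const CONDITIONS: readonly string[] = [
  "condition-a",
  "condition-b",
  "condition-c",
  "condition-d",
  "condition-e",
];
const ARCHETYPES: readonly string[] = ["overwhelmed", "skeptical"];

// Cross every condition with every archetype to get the static library.
function buildTemplateLibrary(
  conditions: readonly string[],
  archetypes: readonly string[],
): SyntheticProfile[] {
  const profiles: SyntheticProfile[] = [];
  for (const condition of conditions) {
    for (const archetype of archetypes) {
      profiles.push({ id: `${condition}:${archetype}`, condition, archetype });
    }
  }
  return profiles;
}

const library = buildTemplateLibrary(CONDITIONS, ARCHETYPES);
console.log(library.length); // 5 conditions x 2 archetypes = 10 profiles
```

Keeping the library as a plain cross product makes the weekly run deterministic: the same 10 profiles hit the pipeline every Wednesday, so score changes can be attributed to the knowledge base rather than to the inputs.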
Each synthetic patient goes through the exact same production pipeline as a real user. Full Sonnet report generation. No shortcuts.

Then a second agent, Claude Haiku, scores each report on 4 dimensions:

| Dimension | What it checks | Gap threshold |
|---|---|---|
| Protocol depth | Are expected correlations for this condition named? | below 6/10 |
| Personalization | Are this patient's specific details in the report? | below 6/10 |
| Supplement specificity | Active biological forms named (magnesium bisglycinate, not just magnesium)? | below 6/10 |
| Legal safety | No forbidden medical claims, no stop-medication advice? | below 7/10 |

Scores below threshold become knowledge gap proposals. Legal safety below 5 triggers an immediate compliance scan, synchronously, before anything else continues.

Cost: ~€0.55/week. Sonnet for the reports, Haiku for scoring.

The knowledge base that writes itself.

The gap proposals from Agent 1 contain condition types and dimensions. Agent 2 converts these into PubMed queries, fetches abstracts via the free NCBI API, and sends each abstract to Haiku with one instruction:

> Extract 3-5 factual claims from this abstract that are directly relevant to [condition]. Return structured triples: subject, predicate, object.

The triples land in a knowledge triples table. The report generator reads from this table at runtime. No retraining. No fine-tuning. Just better context for the next generation.

By Wednesday afternoon, the knowledge base has been updated with whatever the synthetic patients couldn't answer well. By Thursday morning, real patients get smarter reports.

The feedback loop:

Synthetic patient gets weak report → Gap detected → PubMed abstract fetched → Facts extracted → knowledge triples updated → Next patient gets stronger report

Because staying current is also a product decision.
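Looking back at the scoring step in Agent 1, the threshold logic from the table can be sketched as a small pure function. The thresholds (6, 6, 6, 7, and the legal scan trigger at 5) come from the article; the type names, field names, and the `evaluateReport` function are assumptions for illustration, not the platform's actual code.

```typescript
// Hypothetical names for the four scored dimensions.
type Dimension =
  | "protocolDepth"
  | "personalization"
  | "supplementSpecificity"
  | "legalSafety";

// One score per dimension, each on a 0-10 scale.
type ReportScores = Record<Dimension, number>;

interface GapProposal {
  condition: string;
  dimension: Dimension;
  score: number;
}

// A score strictly below its threshold becomes a knowledge gap proposal.
const GAP_THRESHOLDS: ReportScores = {
  protocolDepth: 6,
  personalization: 6,
  supplementSpecificity: 6,
  legalSafety: 7,
};

// Legal safety strictly below this value triggers the immediate compliance scan.
const LEGAL_SCAN_THRESHOLD = 5;

function evaluateReport(
  condition: string,
  scores: ReportScores,
): { gaps: GapProposal[]; needsComplianceScan: boolean } {
  const dimensions = Object.keys(GAP_THRESHOLDS) as Dimension[];
  const gaps: GapProposal[] = [];
  for (const dimension of dimensions) {
    if (scores[dimension] < GAP_THRESHOLDS[dimension]) {
      gaps.push({ condition, dimension, score: scores[dimension] });
    }
  }
  return {
    gaps,
    needsComplianceScan: scores.legalSafety < LEGAL_SCAN_THRESHOLD,
  };
}

const evaluation = evaluateReport("condition-a", {
  protocolDepth: 8,
  personalization: 5,
  supplementSpecificity: 6,
  legalSafety: 4,
});
// personalization (5 < 6) and legalSafety (4 < 7) are reported as gaps;
// supplementSpecificity at exactly 6 is not below its threshold.
console.log(evaluation.gaps.map((gap) => gap.dimension));
console.log(evaluation.needsComplianceScan); // true, because 4 < 5
```

Keeping the two legal cut-offs separate mirrors the text: a legal score of 5 or 6 is only a gap proposal, while anything below 5 also escalates to the compliance scan.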
Every Tuesday at 3:30am, the radar scans GitHub Trending and dev.to for tools that match a static relevance filter: playwright, trpc, drizzle, anthropic, health, automation, react, typescript. Haiku summarizes each match in 1-2 sentences. The summaries land in the admin UI. Monday morning I get a digest with what the dev world built this week that's relevant to my stack.

Cost: €0.04/month.

The rule engine that connects everything.

Each agent calls `reportAgentEvent(type, result)` when it finishes. The orchestrator applies rules:

```typescript
// R1: Legal flag → immediate compliance scan
if (type === "synthetic loop done" && result.legalFlags > 0) {
  await runComplianceDriftScan(); // awaited inline, not queued
}

// R2: KB pipeline returned 0 facts → warning
if (type === "auto kb done" && result.factsInserted === 0) {
  console.warn("[orchestrator] Auto-KB returned 0 facts");
}
```

Rule R1 is the critical one. If a synthetic patient triggers a legal safety score below 5, the compliance agent doesn't wait until next week. It runs immediately.

The orchestrator also lets me add new rules without touching the agents themselves.

The operator dashboard I never have to build.

Every Monday at 7am, an HTML email lands in my inbox with:

I know exactly what the system learned, what it fixed, and what it flagged without logging into a dashboard or running queries.

While building the agent system, I added something that costs almost nothing. Every intake has responses about stress levels, anxiety, medication history, trauma, and previous treatment attempts. From these existing fields (no new questions, no LLM call) I derive a psychological archetype:

| Archetype | Signal | Coaching instruction |
|---|---|---|
| Overwhelmed | High stress + multiple specialists + frustration keywords | Start with deep validation, cite their own words, tiny steps only |
| Skeptical | Multiple specialists + frustration, low anxiety | Biological mechanism first, then recommendation. Name the researcher. |
| Ready but scared | Fear keywords + previous attempts + moderate stress | Week 1 max 2 changes, explicit success markers per week |
| Knowledge-seeker | Blood values mentioned + detailed answers + no fear | Enzymes and neurotransmitters before the advice |
| Beginner | No prior treatment, short answers | Warm, jargon-free, reassuring timeline |

The archetype gets injected as a single instruction line into the report system prompt. The LLM adapts tone, structure, and depth automatically. Almost no extra tokens at generation time.

The entire multi-agent system is about 1,200 lines of TypeScript. It took one focused session to architect and implement.

Weekly operating cost:

The system pays for itself if it catches one legal compliance issue before a user does.

Haiku scores need calibration. The first few weeks I'll manually compare Haiku's scores against my own judgment and adjust thresholds if needed.

Synthetic patient templates are static. They don't learn. The knowledge base learns, but the patient profiles stay fixed. A future version would generate edge-case profiles dynamically based on what real users submit.

The orchestrator is in-memory. Events don't survive a server restart. For a high-stakes system I'd persist the event log. For a solo SaaS at this scale, it's fine.

Most solo founders think "multi-agent AI" means using CrewAI or AutoGen with 10 chained LLM calls. That's one way to do it.

What I have is simpler and more reliable: purpose-built agents that do one thing well, communicate through a database, and are coordinated by a lightweight rule engine. No framework. No magic. Just TypeScript, cron jobs, and a clear separation of concerns.

The result: a platform that gets measurably smarter every week, catches its own legal issues, updates its own knowledge base, and tells me what it learned while I sleep.

Built on: React 18 + tRPC + Drizzle/MySQL + Claude Sonnet/Haiku + Railway

Live at longevityai.nl

Questions or want to talk architecture?
info@holistischadviseur.nl