Show HN: I built an 11-LLM consensus engine to detect AI hallucination

The only production-ready boilerplate that ships with 14 LLM providers in semantic consensus, EU AI Act audit-grade compliance, and 13 self-evolution loops out of the box. Built on the same code that powers api.quorum-ai.dev.

Every AI SaaS boilerplate on the market ships with one LLM provider — usually OpenAI, sometimes Anthropic. That's the easy part. The hard part is everything around it:

Routing queries across multiple providers without exploding costs
Detecting hallucination via semantic consensus instead of trusting a single model
Generating audit trails that pass EU AI Act Article 12/13 review
Building fail-closed gates for high-risk autonomous actions
Keeping per-user memory, RLHF feedback, and adaptive routing data isolated

This kit gives you all of that already wired, deployed, and battle-tested in production. You spend your engineering time on what makes your product different — not on rebuilding the orchestration layer everyone else stops at.

14 provider integrations out of the box: OpenAI, Anthropic, Google Gemini, xAI Grok, Mistral, Cohere, NVIDIA, DeepSeek, Replicate, DashScope/Qwen, Zhipu/GLM, Moonshot/Kimi, Hermes (Nous) local, Llama local via Ollama
Semantic agreement scoring via cosine similarity on embeddings, not lexical overlap and not majority vote
Disagreement trace returned with every answer: your users see exactly which models agreed, which dissented, and why
Per-query cost log so you can show clients what each consensus cost in API tokens
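To make the semantic agreement idea concrete, here is a minimal Python sketch, not the kit's actual code. It assumes each provider's answer has already been turned into an embedding vector by some embedding model. The function names, the 0.8 threshold, and the choice to flag dissenters by their similarity to the centroid of all answers are illustrative assumptions.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def consensus(embeddings, threshold=0.8):
    """embeddings maps provider name -> embedding of that provider's answer.

    Hypothetical scoring rule: the overall score is the mean pairwise cosine
    similarity, and a provider counts as dissenting when its answer is less
    than `threshold` similar to the centroid of all answers.
    """
    names = sorted(embeddings)
    if len(names) < 2:
        raise ValueError("consensus needs at least two answers")

    pairs = list(combinations(names, 2))
    score = sum(cosine(embeddings[p], embeddings[q]) for p, q in pairs) / len(pairs)

    # Length-normalise each embedding so long vectors do not dominate the centroid.
    units = {}
    for name in names:
        norm = math.sqrt(sum(x * x for x in embeddings[name]))
        units[name] = [x / norm for x in embeddings[name]] if norm else list(embeddings[name])
    dim = len(units[names[0]])
    centroid = [sum(units[n][i] for n in names) / len(names) for i in range(dim)]

    # The trace records how close each provider sits to the group answer.
    trace = {n: cosine(units[n], centroid) for n in names}
    return {
        "score": score,
        "agreeing": [n for n in names if trace[n] >= threshold],
        "dissenting": [n for n in names if trace[n] < threshold],
        "trace": trace,
    }
```

Calling `consensus({"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [1.0, 0.0], "d": [0.0, 1.0]})` lists `d` as the only dissenter. The trace here only says how far each answer sits from the group, not why it differs.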

Loop	What it learns
RLHF tracker	Per-user, per-query-class weight updates from thumbs-up/down
ELO competition	Pairwise model wins → global ranking per query class
Hebbian co-activation	Which models agree with each other (reduces redundant fan-out)
MoE router	Picks the right subset of providers per query, not all every time
Hebbian memory	Per-user vector memory with semantic recall
Genetic prompt evolution	Mutates and selects system prompts that score highest on your eval set
Adversarial loop	Red-team prompts to catch jailbreaks before users do
A/B testing	Promotes new policies only when they beat the incumbent statistically
Distillation	Promotes Llama checkpoints fine-tuned on your in-house data
Synthetic data	Generates training pairs from real production traffic
Architecture search	Evolves loop topology over generations
Federated learning	Aggregates updates across tenants without sharing raw data
Web learner	4-source web ingestion (DDG + Wikipedia + HackerNews + arXiv) into a vector KB
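As an illustration of the ELO competition row, the sketch below applies the standard Elo update to pairwise model wins, keeping one rating table per query class. It is a generic Elo implementation, not the kit's code. The K-factor of 32, the starting rating of 1000, and all function names are assumptions chosen for the example.

```python
from collections import defaultdict

K_FACTOR = 32          # assumed update step size
START_RATING = 1000.0  # assumed rating for a model with no history


def new_ratings():
    """ratings[query_class][model] -> rating, defaulting to START_RATING."""
    return defaultdict(lambda: defaultdict(lambda: START_RATING))


def expected_score(rating_a, rating_b):
    """Standard Elo expectation that A beats B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def record_win(ratings, query_class, winner, loser, k=K_FACTOR):
    """Update both ratings after `winner` beat `loser` on one query."""
    table = ratings[query_class]
    # Compute the expectation from the pre-update ratings.
    delta = k * (1.0 - expected_score(table[winner], table[loser]))
    table[winner] += delta
    table[loser] -= delta


def ranking(ratings, query_class):
    """Models for one query class, best first."""
    table = ratings[query_class]
    return sorted(table, key=table.get, reverse=True)
```

With two unrated models, a single `record_win(ratings, "code", "model_a", "model_b")` moves them to 1016 and 984. The update is zero-sum, so the total rating within a query class stays constant.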

HSP fail-closed gate (opt-in): wrap any high-risk function with requiresHspApproval() and it refuses unless a human-approved webhook signs off. Don't need compliance? Don't wrap. The kit doesn't force this on you. Toggle off globally with HSP_ENABLED=false even if a dependency does wrap it.
EU AI Act Article 12 audit trail (opt-in): call writeAuditLine() from your business logic to log decisions; the audit cert generator reads this log.
EU AI Act Article 13 audit certificate (opt-in): SHA-256 hash-chained PDF generator, one call per audit period: generateAuditCert({...}). Useful when selling to regulated verticals (legal, fintech, health); ignorable otherwise.
Hosted vs local posture split: hosted deployment is fail-closed by infra-marker detection (Cloud Run, k8s, Lambda); local research mode is opt-in via HSP_GATE_DEV_MODE=1.
CUSTOMER_KEYS_ENCRYPTION_KEY: Fernet-encrypted BYOK provider keys, never logged, never leaked. Only relevant if you're letting your end users supply their own API keys.
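The fail-closed gate and the hash-chained audit trail could look roughly like the Python sketch below. The kit's real signatures for requiresHspApproval() and writeAuditLine() are not shown on this page, so `requires_approval`, `append_audit_line`, `verify_chain`, and `ApprovalDenied` are hypothetical stand-ins. The `approver` callable stands in for the human-approval webhook, and the in-memory list stands in for a persistent log.

```python
import functools
import hashlib
import json
import os

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def append_audit_line(log, record):
    """Append `record` to `log`, chaining it to the previous entry's SHA-256 hash."""
    prev = log[-1]["hash"] if log else GENESIS
    payload = json.dumps(record, sort_keys=True)
    digest = hashlib.sha256((prev + payload).encode("utf-8")).hexdigest()
    log.append({"prev": prev, "record": record, "hash": digest})
    return digest


def verify_chain(log):
    """Recompute every hash; False if an entry was edited, reordered, or cut from the middle.

    Truncating the tail is not detectable without an external anchor.
    """
    prev = GENESIS
    for entry in log:
        payload = json.dumps(entry["record"], sort_keys=True)
        expected = hashlib.sha256((prev + payload).encode("utf-8")).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True


class ApprovalDenied(Exception):
    """Raised when a gated action is refused."""


def requires_approval(approver, log):
    """Decorator: run the function only if `approver(name)` returns exactly True."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # Global off switch, mirroring HSP_ENABLED=false from the text above.
            if os.environ.get("HSP_ENABLED", "true").lower() == "false":
                return fn(*args, **kwargs)
            try:
                approved = approver(fn.__name__) is True
            except Exception:
                approved = False  # approver crashed or unreachable: fail closed
            append_audit_line(log, {"action": fn.__name__, "approved": approved})
            if not approved:
                raise ApprovalDenied(fn.__name__)
            return fn(*args, **kwargs)
        return wrapper
    return decorator
```

The design point is that every failure path, including an exception inside the approver, ends in a refusal. Each decision is also written to the chain before the wrapped function runs, so a refused call still leaves a record.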

Auth: email magic links + OAuth-ready (Apple, Google), session cookies, JWT for API
Stripe billing: Pro tier subscription, Free tier metering, webhook handler with signature verification (Stripe Event SDK)
Resend email: transactional templates (welcome, billing, audit alerts)
Firestore persistence: API keys, customer tiers, usage counters, encrypted BYOK keys
Cloud Run deploy: single command, autoscaling 0→N, regional failover ready
GitHub webhook receiver: HMAC SHA-256 signature verified, ready to dispatch audit jobs on push
Cloud Scheduler target: cron endpoint gated by shared secret, ready for nightly batch jobs
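For the GitHub webhook receiver and the secret-gated cron endpoint, a verification step might look like the following Python sketch. GitHub signs each delivery with HMAC SHA-256 over the raw request body and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. The function names and the way the secret reaches the function are assumptions for illustration, not the kit's API.

```python
import hashlib
import hmac


def verify_github_signature(secret: bytes, body: bytes, header: str) -> bool:
    """Check the X-Hub-Signature-256 header against the raw request body."""
    if not header or not header.startswith("sha256="):
        return False
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Compare as bytes, in constant time, to avoid leaking where the digests differ.
    return hmac.compare_digest(expected.encode("utf-8"), header.encode("utf-8"))


def verify_shared_secret(provided: str, configured: str) -> bool:
    """Gate for a cron endpoint: constant-time comparison of a shared secret."""
    if not provided or not configured:
        return False  # missing secret on either side: refuse
    return hmac.compare_digest(provided.encode("utf-8"), configured.encode("utf-8"))
```

The signature has to be computed over the exact bytes received, before any JSON parsing, because re-serialising the payload can change whitespace and break the digest.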

git clone https://github.com/jaquelinejaque/quorum-saas-starter
cd quorum-saas-starter
./scripts/setup.sh      # interactive: collects keys, creates Stripe products, configures Cloud Run
./scripts/deploy.sh     # gcloud run deploy + Firestore init + Stripe webhook registration

You ship to production in under 30 minutes.

The kit gives you the orchestration layer. You add the vertical:

AI Tax Assistant: your prompts + this kit's consensus = product
AI Code Reviewer: your evals + this kit's RLHF = product
AI Compliance Auditor: your domain knowledge + this kit's HSP gate = product
AI Legal Research: your case database + this kit's web learner = product

The hard infrastructure work is done. You focus on customers, prompts, and the niche-specific data only you have.

Single project deployment. Source code under modified Apache-2.0 (HSP commercial restriction). 3 months of updates. No support.

5 project deployments
Pre-trained MoE router weights (10,000+ production queries already shaped the policy table)
Genetic prompt evolution kit + 50 evolved seed prompts across common SaaS categories
12 months of updates
Monthly recorded "office hours" video (no live calls — recordings only, accessible to all Pro+ buyers)
Read-only Discord access

Unlimited project deployments
Commercial / white-label license — resell as your own product
EU AI Act Article 12/13 audit kit (forms, templates, cert generator)
24 months of updates
Office hours archive (all past + future)

All tiers: source code only, no managed hosting, no live support. Buy what you can build on, not what you have to ask permission to use.

The orchestration layer in this kit took 18 months and 60+ commits to get production-stable. It runs the Quorum API used in audit-grade compliance workflows. Pricing reflects what it would cost a senior engineer to rebuild from scratch: roughly 6 weeks of full-time work at standard contractor rates (~£25,000). At £497–£2,497 you're buying back that time.

You want managed hosting: go use Quorum hosted (£49/mo)
You want live support: this is a code-only license
You want to vibe-code an AI app and ship in 2 hours: use Lovable or Vercel templates instead
You don't have a real product idea yet: this kit is for shipping, not learning

30 days, no questions asked, if the kit doesn't deploy successfully on your environment. After successful deployment, no refunds — you have the source code.

Built by Jaqueline Martins / Sovereign Chain Ltd. Patent Pending: HSP Protocol (PCT/US26/11908).

source & further reading

github.com — original article
