Show HN: Mantis, A self-hosted LLM gateway

wpnews.pro

cd /news/large-language-models/show-hn-mantis-a-self-hosted-llm-gat… · home › topics › large-language-models › article

[ARTICLE · art-41213] src=github.com ↗ pub=2026-06-26T19:15Z topic=large-language-models verified=true sentiment=↑ positive

Show HN: Mantis, A self-hosted LLM gateway

Mantis, an open-source self-hosted LLM gateway, launched to provide teams with a unified API for multiple model targets, centralizing routing, failover, caching, guardrails, observability, and AWS-native deployment. The project targets small teams seeking control over infrastructure and data while simplifying multi-LLM application development.

read1 min views1 publishedJun 26, 2026

Show HN: Mantis, A self-hosted LLM gateway — Image: source

Mantis is an open-source, self-hosted LLM gateway for teams building applications across multiple model targets. It gives client applications one stable chat-completions API while centralizing routing policy, failover behavior, response caching, guardrails, observability, and AWS deployment configuration.

The project is designed for small teams that want the benefits of an LLM gateway without giving up control of their infrastructure or data.

One API for LLM calls: send chat-completion requests through a single gateway endpoint instead of integrating directly with each provider.Configurable routing: route by metadata, model aliases, weighted targets, fallback chains, retries, timeouts, and cooldowns.Response caching: reduce repeated LLM calls with exact prompt caching and optional semantic caching.Guardrails: use AWS Bedrock guardrails to mask sensitive data and block policy-violating prompts or responses.Observability: capture request IDs, latency, token usage, cache behavior, errors, and request outcomes through CloudWatch.AWS-native deployment: provision and run Mantis with Terraform, ECS Fargate, ALB, ElastiCache, Parameter Store, S3, IAM, and CloudWatch.

llm-gateway: the FastAPI gateway service, React configuration dashboard, Terraform infrastructure, and deployment scripts.mantis-sdk: a Python SDK for calling the Mantis/v1/chat/completions

endpoint from application code.mantis-llm-gateway.github.io: the public documentation site and case study.

Read the

[documentation](https://mantis-llm-gateway.github.io/)for the project overview, guides, API reference, and architecture case study. - Follow the
[quick start](https://mantis-llm-gateway.github.io/guides/quick-start/)to run or deploy the gateway. - Review the

routing configuration guideto understand how model selection, fallback, caching, and cooldown behavior are controlled.

Mantis exists to make multi-LLM application development more reliable, observable, and operationally manageable. Instead of spreading provider-specific logic across application code, teams can put model routing, cache policy, failover behavior, guardrails, and deployment concerns behind one gateway layer.

The result is a system where application code stays simple, model choices remain configurable, and teams keep control over how requests move through their own AWS environment.

source & further reading

github.com — original article

~/api · this article 200

$curl api.wpnews.pro/v1/news/show-hn-mantis-a-self-ho…

Read original on github.com → github.com/mantis-llm-gateway

mentioned entities

Mantis

AWS Bedrock

Terraform

ECS Fargate

ALB

ElastiCache

CloudWatch

FastAPI

metadata

slugshow-hn-mantis-a-self-hosted-llm-gateway

topic#large-language-models

secondary3 topics

sentimentpositive

canonicalgithub.com

navigation

← prevVibePHP

next →Building LSTMs with PyTorch and …

── more in #large-language-models 4 stories · sorted by recency

developer.nvidia.com · 26 Jun · #large-language-models

Deploy a Production-Ready NVIDIA AI-Q Blueprint on Oracle Cloud Infrastructure

dev.to · 24 Jun · #large-language-models

What If Your Employees Never Had to Know Which System to Check?

dev.to · 23 May · #large-language-models

Zero-Downtime Blue-Green and IP-Based Canary Deployments on ECS Fargate

github.com · 26 Jun · #large-language-models

Building Voice AI Workflows with Branches Instead of One Giant Prompt

── more on @mantis 3 stories trending now

wpnews · 19 Oct · #developer-tools

Windows Script to clean up and remove all ASUS software

wpnews · 28 May · #ai-startups

The Niche SaaS Opportunity Map 2026: Highly Demanded Subscribed Categories Beyond Mainstream

wpnews · 1 Nov · #developer-tools

Custom Zig Test Runner, better ouput, timing display, and support for special "tests:beforeAll" and "tests:afterAll" tests

sponsored brought to you by zahid.host 4,200+ EU-deployed projects

reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main

→ Live at https://your-agent.zahid.host ✓

Get free account → Pricing

from €0/mo · no card required