# AI OSS tool repo goes archived overnight after raising $7.3M seed

> Source: <https://github.com/tensorzero/tensorzero>
> Published: 2026-06-13 12:10:47+00:00

**TensorZero is an open-source LLMOps platform that unifies:**

- **Gateway:** access every LLM provider through a unified API, built for performance (<1ms p99 latency)
- **Observability:** store inferences and feedback in your database, available programmatically or in the UI
- **Evaluation:** benchmark individual inferences or end-to-end workflows using heuristics, LLM judges, etc.
- **Optimization:** collect metrics and human feedback to optimize prompts, models, and inference strategies
- **Experimentation:** ship with confidence with built-in A/B testing, routing, fallbacks, retries, etc.

You can take what you need, adopt incrementally, and complement with other tools.
It plays nicely with the **OpenAI SDK**, **OpenTelemetry**, and **every major LLM provider**.

TensorZero is used by companies ranging from frontier AI startups to the Fortune 10 and fuels ~1% of global LLM API spend today.

**Website** · [Docs](https://www.tensorzero.com/docs) · [Twitter](https://www.x.com/tensorzero) · [Slack](https://www.tensorzero.com/slack) · [Discord](https://www.tensorzero.com/discord)

[Quick Start (5min)](https://www.tensorzero.com/docs/quickstart) · [Deployment Guide](https://www.tensorzero.com/docs/deployment/tensorzero-gateway) · [API Reference](https://www.tensorzero.com/docs/gateway/api-reference) · [Configuration Reference](https://www.tensorzero.com/docs/gateway/configuration-reference)

> **Note:** TensorZero Autopilot is an **automated AI engineer** powered by TensorZero that analyzes LLM observability data, sets up evals, optimizes prompts and models, and runs A/B tests. It **dramatically improves the performance of LLM agents** across diverse tasks.

Integrate with TensorZero once and access every major LLM provider.

- [Call any LLM](https://www.tensorzero.com/docs/gateway/call-any-llm) (API or self-hosted) through a single unified API
- Infer with [tool use](https://www.tensorzero.com/docs/gateway/guides/tool-use), [structured outputs (JSON)](https://www.tensorzero.com/docs/gateway/generate-structured-outputs), [batch](https://www.tensorzero.com/docs/gateway/guides/batch-inference), [embeddings](https://www.tensorzero.com/docs/gateway/generate-embeddings), [multimodal (images, files)](https://www.tensorzero.com/docs/gateway/call-llms-with-image-and-file-inputs), [caching](https://www.tensorzero.com/docs/gateway/guides/inference-caching), etc.
- [Create prompt templates and schemas](https://www.tensorzero.com/docs/gateway/create-a-prompt-template) to enforce a structured interface between your application and the LLMs
- Satisfy extreme throughput and latency needs, thanks to 🦀 Rust: [<1ms p99 latency overhead at 10k+ QPS](https://www.tensorzero.com/docs/gateway/benchmarks)
- [Ensure high availability](https://www.tensorzero.com/docs/gateway/guides/retries-fallbacks) with routing, retries, fallbacks, load balancing, granular timeouts, etc.
- [Track usage and cost](https://www.tensorzero.com/docs/operations/track-usage-and-cost) and [enforce custom rate limits](https://www.tensorzero.com/docs/operations/enforce-custom-rate-limits) with granular scopes (e.g. tags)
- [Set up auth for TensorZero](https://www.tensorzero.com/docs/operations/set-up-auth-for-tensorzero) to allow clients to access models without sharing provider API keys
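To make the retries-and-fallbacks idea concrete, here is a minimal, generic sketch in plain Python. It is not TensorZero's implementation or API: `call_with_fallbacks`, `flaky_provider`, and `healthy_provider` are hypothetical names invented for illustration, and a real gateway would typically retry only transient errors and apply timeouts.

```python
from typing import Callable, Sequence


class AllProvidersFailed(Exception):
    """Raised when every provider has exhausted its retries."""


def call_with_fallbacks(
    providers: Sequence[tuple[str, Callable[[str], str]]],
    prompt: str,
    retries_per_provider: int = 2,
) -> tuple[str, str]:
    """Try each provider in order, retrying before falling back to the next.

    Returns (provider_name, response_text) from the first successful call.
    """
    errors: list[str] = []
    for name, call in providers:
        for attempt in range(1, retries_per_provider + 1):
            try:
                return name, call(prompt)
            except Exception as exc:  # simplification: retries on any error
                errors.append(f"{name} attempt {attempt}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for real provider calls.
def flaky_provider(prompt: str) -> str:
    raise TimeoutError("simulated timeout")


def healthy_provider(prompt: str) -> str:
    return f"echo: {prompt}"


provider_name, text = call_with_fallbacks(
    [("primary", flaky_provider), ("backup", healthy_provider)],
    "hello",
)
print(provider_name, text)
```

The ordering of the `providers` list acts as the routing policy here; the gateway features listed above (load balancing, granular timeouts) would replace that simple ordering in practice.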

Supported model providers: **Anthropic**, [AWS Bedrock](https://www.tensorzero.com/docs/gateway/guides/providers/aws-bedrock), [AWS SageMaker](https://www.tensorzero.com/docs/gateway/guides/providers/aws-sagemaker), [Azure](https://www.tensorzero.com/docs/gateway/guides/providers/azure), [DeepSeek](https://www.tensorzero.com/docs/gateway/guides/providers/deepseek), [Fireworks](https://www.tensorzero.com/docs/gateway/guides/providers/fireworks), [GCP Vertex AI Anthropic](https://www.tensorzero.com/docs/gateway/guides/providers/gcp-vertex-ai-anthropic), [GCP Vertex AI Gemini](https://www.tensorzero.com/docs/gateway/guides/providers/gcp-vertex-ai-gemini), [Google AI Studio (Gemini API)](https://www.tensorzero.com/docs/gateway/guides/providers/google-ai-studio-gemini), [Groq](https://www.tensorzero.com/docs/gateway/guides/providers/groq), [Hyperbolic](https://www.tensorzero.com/docs/gateway/guides/providers/hyperbolic), [Mistral](https://www.tensorzero.com/docs/gateway/guides/providers/mistral), [OpenAI](https://www.tensorzero.com/docs/gateway/guides/providers/openai), [OpenRouter](https://www.tensorzero.com/docs/gateway/guides/providers/openrouter), [SGLang](https://www.tensorzero.com/docs/gateway/guides/providers/sglang), [TGI](https://www.tensorzero.com/docs/gateway/guides/providers/tgi), [Together AI](https://www.tensorzero.com/docs/gateway/guides/providers/together), [vLLM](https://www.tensorzero.com/docs/gateway/guides/providers/vllm), and [xAI (Grok)](https://www.tensorzero.com/docs/gateway/guides/providers/xai).

Need something else? TensorZero also supports **any OpenAI-compatible API (e.g. Ollama)**.

You can use TensorZero with any OpenAI SDK (Python, Node, Go, etc.) or OpenAI-compatible client.

- [Deploy the TensorZero Gateway](https://www.tensorzero.com/docs/deployment/tensorzero-gateway) (one Docker container).
- Update the `base_url` and `model` in your OpenAI-compatible client.
- Run inference:

``` python
from openai import OpenAI

# Point the client to the TensorZero Gateway
client = OpenAI(base_url="http://localhost:3000/openai/v1", api_key="not-used")

response = client.chat.completions.create(
    # Call any model provider (or TensorZero function)
    model="tensorzero::model_name::anthropic::claude-sonnet-4-6",
    messages=[
        {
            "role": "user",
            "content": "Share a fun fact about TensorZero.",
        }
    ],
)
```

See the **[Quick Start](https://www.tensorzero.com/docs/quickstart)** for more information.
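The model string in the snippet above has the shape `tensorzero::model_name::<provider>::<model>`. As a small sketch, the helpers below assemble that string and the request arguments. `tensorzero_model_name` and `chat_request` are hypothetical helper names, and the assumption that other providers follow the same naming shape is not confirmed by this page.

```python
def tensorzero_model_name(provider: str, model: str) -> str:
    """Build a model string in the tensorzero::model_name::<provider>::<model> shape."""
    if not provider or not model:
        raise ValueError("provider and model must be non-empty")
    return f"tensorzero::model_name::{provider}::{model}"


def chat_request(provider: str, model: str, user_message: str) -> dict:
    """Assemble keyword arguments for an OpenAI-style chat completion call."""
    return {
        "model": tensorzero_model_name(provider, model),
        "messages": [{"role": "user", "content": user_message}],
    }


kwargs = chat_request(
    "anthropic", "claude-sonnet-4-6", "Share a fun fact about TensorZero."
)
print(kwargs["model"])
```

With a client configured as in the snippet above, the request could be sent with `client.chat.completions.create(**kwargs)` and the reply read from `response.choices[0].message.content`.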

Zoom in to debug individual API calls, or zoom out to monitor metrics across models and prompts over time — all using the open-source TensorZero UI.

- Store inferences and [feedback (metrics, human edits, etc.)](https://www.tensorzero.com/docs/gateway/guides/metrics-feedback) in your own database
- Dive into individual inferences or high-level aggregate patterns using the TensorZero UI or programmatically
- [Build datasets](https://www.tensorzero.com/docs/gateway/api-reference/datasets-datapoints) for optimization, evaluation, and other workflows
- Replay historical inferences with new prompts, models, inference strategies, etc.
- [Export OpenTelemetry traces (OTLP)](https://www.tensorzero.com/docs/operations/export-opentelemetry-traces) and [export Prometheus metrics](https://www.tensorzero.com/docs/operations/export-prometheus-metrics) to your favorite application observability tools
- Soon: AI-assisted debugging and root cause analysis; AI-assisted data labeling

Send production metrics and human feedback to easily optimize your prompts, models, and inference strategies — using the UI or programmatically.

- Optimize your models with [supervised fine-tuning](https://www.tensorzero.com/docs/optimization/supervised-fine-tuning-sft), RLHF, and other techniques
- Optimize your prompts with automated prompt engineering algorithms like [GEPA](https://www.tensorzero.com/docs/optimization/gepa)
- Optimize your [inference strategy](https://www.tensorzero.com/docs/gateway/guides/inference-time-optimizations) with [dynamic in-context learning](https://www.tensorzero.com/docs/optimization/dynamic-in-context-learning-dicl), best/mixture-of-N sampling, etc.
- Enable a feedback loop for your LLMs: a data & learning flywheel turning production data into smarter, faster, and cheaper models
- Soon: synthetic data generation
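Best-of-N sampling, one of the inference strategies mentioned above, can be sketched generically as: generate several candidates, score each, and keep the best. The code below is an illustration in plain Python, not TensorZero's implementation. `best_of_n`, `toy_generate`, and `toy_score` are hypothetical names, and the toy generator and judge stand in for LLM calls.

```python
import random
from typing import Callable


def best_of_n(
    generate: Callable[[str], str],
    score: Callable[[str, str], float],
    prompt: str,
    n: int = 5,
) -> str:
    """Sample n candidates and return the one the scorer rates highest."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda candidate: score(prompt, candidate))


# Hypothetical stand-ins: a real system would call an LLM to generate
# candidates and another LLM (or a heuristic) to judge them.
rng = random.Random(0)
WORDS = ["ok", "fine", "a longer and more detailed answer", "hmm"]


def toy_generate(prompt: str) -> str:
    return rng.choice(WORDS)


def toy_score(prompt: str, candidate: str) -> float:
    return float(len(candidate))  # toy judge: prefers longer answers


best = best_of_n(toy_generate, toy_score, "Explain best-of-N sampling.", n=8)
print(best)
```

The trade-off is cost: each request triggers `n` generations plus scoring, in exchange for a better expected answer.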

Compare prompts, models, and inference strategies using evaluations powered by heuristics and LLM judges.

- [Evaluate individual inferences](https://www.tensorzero.com/docs/evaluations/inference-evaluations/tutorial) with *inference evaluations* powered by heuristics or LLM judges (≈ unit tests for LLMs)
- [Evaluate end-to-end workflows](https://www.tensorzero.com/docs/evaluations/workflow-evaluations/tutorial) with *workflow evaluations* with complete flexibility (≈ integration tests for LLMs)
- Optimize LLM judges just like any other TensorZero function to align them to human preferences
- Soon: more built-in evaluators; headless evaluations

**Evaluation » CLI**

```
docker compose run --rm evaluations \
  --evaluation-name extract_data \
  --dataset-name hard_test_cases \
  --variant-name gpt_4o \
  --concurrency 5
Run ID: 01961de9-c8a4-7c60-ab8d-15491a9708e4
Number of datapoints: 100
██████████████████████████████████████ 100/100
exact_match: 0.83 ± 0.03 (n=100)
semantic_match: 0.98 ± 0.01 (n=100)
item_count: 7.15 ± 0.39 (n=100)
```
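For intuition about lines such as `exact_match: 0.83 ± 0.03 (n=100)`, the sketch below scores a few datapoints with a simple exact-match heuristic and reports the mean with a standard error. This is a generic illustration: the source does not say how TensorZero computes the `±` value, so treating it as a standard error of the mean is an assumption, and `exact_match`, `summarize`, and the datapoints are hypothetical.

```python
import math
from statistics import mean, stdev


def exact_match(prediction: str, reference: str) -> float:
    """Heuristic evaluator: 1.0 if the normalized strings match, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())


def summarize(scores: list[float]) -> tuple[float, float, int]:
    """Return (mean, standard error of the mean, n) for a list of scores."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores to estimate a standard error")
    return mean(scores), stdev(scores) / math.sqrt(n), n


# Hypothetical datapoints: (model output, expected output).
datapoints = [
    ("Paris", "paris"),
    ("Lyon", "Paris"),
    (" Berlin ", "Berlin"),
    ("Rome", "Rome"),
]
scores = [exact_match(pred, ref) for pred, ref in datapoints]
avg, stderr, n = summarize(scores)
print(f"exact_match: {avg:.2f} ± {stderr:.2f} (n={n})")
```

An LLM-judge evaluator would slot in where `exact_match` is used, returning a score per datapoint in the same way.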

Ship with confidence with built-in A/B testing, routing, fallbacks, retries, etc.

- [Run adaptive A/B tests](https://www.tensorzero.com/docs/experimentation/run-adaptive-ab-tests) to ship with confidence and identify the best prompts and models for your use cases.
- Enforce principled experiments in complex workflows, including support for multi-turn LLM systems, sequential testing, and more.
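As a rough illustration of what "adaptive" can mean in an A/B test, the sketch below uses Thompson sampling, a common bandit technique, to shift traffic toward the variant with better binary feedback. This page does not describe which algorithm TensorZero uses, so this is a swapped-in technique for illustration only. `ThompsonSampler`, the variant names, and the simulated success rates are all hypothetical.

```python
import random
from typing import Optional


class ThompsonSampler:
    """Beta-Bernoulli Thompson sampling over named variants."""

    def __init__(self, variants: list[str], seed: Optional[int] = None) -> None:
        # One [successes, failures] pair per variant.
        self.counts = {name: [0, 0] for name in variants}
        self.rng = random.Random(seed)

    def choose(self) -> str:
        """Sample a plausible success rate per variant and pick the highest."""
        draws = {
            # The +1 terms correspond to a uniform Beta(1, 1) prior.
            name: self.rng.betavariate(wins + 1, losses + 1)
            for name, (wins, losses) in self.counts.items()
        }
        return max(draws, key=draws.get)

    def record(self, variant: str, success: bool) -> None:
        """Update the chosen variant with binary feedback."""
        self.counts[variant][0 if success else 1] += 1


# Hypothetical simulation: "prompt_b" succeeds more often than "prompt_a".
true_rates = {"prompt_a": 0.3, "prompt_b": 0.7}
sampler = ThompsonSampler(list(true_rates), seed=0)
env = random.Random(1)
for _ in range(2000):
    variant = sampler.choose()
    sampler.record(variant, env.random() < true_rates[variant])

pulls = {name: sum(pair) for name, pair in sampler.counts.items()}
print(pulls)
```

Compared with a fixed 50/50 split, this kind of allocation spends fewer requests on the weaker variant as evidence accumulates.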

Build with an open-source stack well-suited for prototypes but designed from the ground up to support the most complex LLM applications and deployments.

- Build simple applications or massive deployments with GitOps-friendly orchestration
- [Extend TensorZero](https://www.tensorzero.com/docs/operations/extend-tensorzero) with built-in escape hatches, programmatic-first usage, direct database access, and more
- Integrate with third-party tools: specialized observability and evaluations, model providers, agent orchestration frameworks, etc.
- Iterate quickly by experimenting with prompts interactively using the Playground UI

**How is TensorZero different from other LLM frameworks?**

- TensorZero enables you to optimize complex LLM applications based on production metrics and human feedback.
- TensorZero supports the needs of industrial-grade LLM applications: low latency, high throughput, type safety, self-hosted, GitOps, customizability, etc.
- TensorZero unifies the entire LLMOps stack, creating compounding benefits. For example, LLM evaluations can be used for fine-tuning models alongside AI judges.

**Can I use TensorZero with ___?**

Yes.
Every major programming language is supported.
It plays nicely with the **OpenAI SDK**, **OpenTelemetry**, and **every major LLM provider**.

**Is TensorZero production-ready?**

Yes. TensorZero is used by companies ranging from frontier AI startups to the Fortune 10 and powers ~1% of the global LLM API spend today.

Here's a case study: [Automating Code Changelogs at a Large Bank with LLMs](https://www.tensorzero.com/blog/case-study-automating-code-changelogs-at-a-large-bank-with-llms)

**How much does TensorZero cost?**

TensorZero (LLMOps platform) is 100% self-hosted and open-source.

TensorZero Autopilot (automated AI engineer) is a complementary paid product powered by TensorZero.

**Who is building TensorZero?**

Our technical team includes a former Rust compiler maintainer, machine learning researchers (Stanford, CMU, Oxford, Columbia) with thousands of citations, and the chief product officer of a decacorn startup. We're backed by the same investors as leading open-source projects (e.g. ClickHouse, CockroachDB) and AI labs (e.g. OpenAI, Anthropic). See our **$7.3M seed round announcement** and [coverage from VentureBeat](https://venturebeat.com/ai/tensorzero-nabs-7-3m-seed-to-solve-the-messy-world-of-enterprise-llm-development/). We're [hiring in NYC](https://www.tensorzero.com/jobs).

**How do I get started?**

You can adopt TensorZero incrementally. Our **[Quick Start](https://www.tensorzero.com/docs/quickstart)** goes from a vanilla OpenAI wrapper to a production-ready LLM application with observability and fine-tuning in just 5 minutes.

**Start building today.**
The **[Quick Start](https://www.tensorzero.com/docs/quickstart)** shows how easy it is to set up an LLM application with TensorZero.

**Questions?**
Ask us on [Slack](https://www.tensorzero.com/slack) or [Discord](https://www.tensorzero.com/discord).

**Using TensorZero at work?**
Email us at **hello@tensorzero.com** to set up a Slack or Teams channel with your team (free).

We are working on a series of **complete runnable examples** illustrating TensorZero's data & learning flywheel.

**Optimizing Data Extraction (NER) with TensorZero**

This example shows how to use TensorZero to optimize a data extraction pipeline. We demonstrate techniques like fine-tuning and dynamic in-context learning (DICL). In the end, an optimized GPT-4o Mini model outperforms GPT-4o on this task, at a fraction of the cost and latency, using a small amount of training data.

**Agentic RAG: Multi-Hop Question Answering with LLMs**

This example shows how to build a multi-hop retrieval agent using TensorZero. The agent iteratively searches Wikipedia to gather information and decides when it has enough context to answer a complex question.

**Writing Haikus to Satisfy a Judge with Hidden Preferences**

This example fine-tunes GPT-4o Mini to generate haikus tailored to a specific taste. You'll see TensorZero's "data flywheel in a box" in action: better variants lead to better data, and better data leads to better variants. You'll see progress by fine-tuning the LLM multiple times.

**Image Data Extraction: Multimodal (Vision) Fine-tuning**

This example shows how to fine-tune multimodal models (VLMs) like GPT-4o to improve their performance on vision-language tasks. Specifically, we'll build a system that categorizes document images (screenshots of computer science research papers).

**Improving LLM Chess Ability with Best-of-N Sampling**

This example showcases how best-of-N sampling can significantly enhance an LLM's chess-playing abilities by selecting the most promising moves from multiple generated options.

We write about LLM engineering on the **TensorZero Blog**.
Here are some of our favorite posts:

- [Bandits in your LLM Gateway: Improve LLM Applications Faster with Adaptive Experimentation (A/B Testing)](https://www.tensorzero.com/blog/bandits-in-your-llm-gateway/)
- [Is OpenAI's Reinforcement Fine-Tuning (RFT) Worth It?](https://www.tensorzero.com/blog/is-openai-reinforcement-fine-tuning-rft-worth-it/)
- [Distillation with Programmatic Data Curation: Smarter LLMs, 5-30x Cheaper Inference](https://www.tensorzero.com/blog/distillation-programmatic-data-curation-smarter-llms-5-30x-cheaper-inference/)
- [From NER to Agents: Does Automated Prompt Engineering Scale to Complex Tasks?](https://www.tensorzero.com/blog/from-ner-to-agents-does-automated-prompt-engineering-scale-to-complex-tasks/)