{"slug": "phoenix-at-10000-stars-on-github-how-an-open-source-ai-observability-project-by", "title": "Phoenix at 10,000 stars on GitHub: How an open source AI observability project grew by following its community", "summary": "Phoenix, an open-source AI observability project, crossed 10,000 GitHub stars after growing from a Jupyter notebook extension in 2023 into a widely adopted platform for monitoring AI applications. The project, maintained by Arize, evolved alongside the emergence of AI engineering as a discipline, adopting OpenTelemetry and shaping the OpenInference standard to reach millions of developers. Maintainers said the project's growth was driven by a focus on meeting developers where they work, starting inside Jupyter notebooks and expanding to support frameworks, languages, and agent workflows across the AI ecosystem.", "body_md": "*Co-Authored by RL Nabors & Nancy Chauhan, Developer Relations Engineer.*\n\n[Phoenix](https://arize.com/phoenix/) just crossed 10,000 GitHub stars.\n\nFor an open-source project, that milestone means thousands of developers have decided a repository is worth watching, testing, contributing to, or betting on. Some filed issues. Some opened pull requests. Others showed up in Slack, challenged assumptions, and helped shape the roadmap.\n\nPhoenix has always been what Nadia Eghbal defined as a “stadium project” in her book, [Working in Public: The Making and Maintenance of Open Source Software](https://press.stripe.com/working-in-public). A stadium project is one in which a handful of engineers build something thousands of developers depend on.\n\nPhoenix started as a Jupyter notebook extension in 2023 and has grown into one of the most widely adopted open-source projects in [AI observability](https://arize.com/ai-agents/agent-observability/). 
Along the way, it helped shape [OpenInference](https://arize.com/docs/ax/concepts/otel-openinference/overview), adopted [OpenTelemetry](https://arize.com/blog/the-role-of-opentelemetry-in-llm-observability/) before it became the default choice for much of the AI ecosystem, and expanded from a notebook tool into a platform used across frameworks, languages, and deployment environments.\n\nThe story of Phoenix is also a story about the emergence of AI engineering as a discipline. As frameworks evolved, observability standards emerged, and agent workflows became commonplace, the project evolved alongside them.\n\nMany of the decisions that shaped Phoenix came back to the same question:\n\n“How do we reach the most users?” said [Mikyo King](https://github.com/mikeldking), Head of Open-source at Arize.\n\n**Hear the story behind Phoenix from the people who built it**\n\nThis article is based on a conversation with Phoenix maintainers [Mikyo King](https://www.linkedin.com/in/mikeldking/), [Roger Yang](https://www.linkedin.com/in/roger-y-a35595114/), and [Xander Song](https://www.linkedin.com/in/xandersong/).\n\nWatch the full interview below or read on to learn the story behind Phoenix’s move to OpenTelemetry, the creation of OpenInference, and why the team deliberately built the project backward. [You can also explore the latest data from Phoenix as we celebrate hitting 10,000 stars](http://arize.com/phoenix-10k).\n\n## Building AI observability inside Jupyter Notebooks\n\nArize started in 2020 as a closed-source company. By late 2022, the team wanted to reach developers in a different way.\n\n“We wanted to reach a lot more users, not just thousands of our customers, but millions of people,” Mikyo said. Developer communities distrust vendors, often turning to open source projects like Keras and PyTorch first and Big Tech offerings second. 
And data privacy was a growing concern as AI services began passing private information through networks that were increasingly difficult to trace.\n\nMikyo was asked to spearhead open source, without a precise definition of what that meant. The one thing the team knew was where its users lived. “A lot of our community was living and breathing in [Jupyter] notebooks,” Mikyo said. “We really wanted to meet our developers where they were.” Thus, the first version of Phoenix emerged as a humble Jupyter notebook extension.\n\nThe earliest versions focused on visualizing embeddings and unstructured data. Then GPT-3 changed the direction of the project. The team began visualizing the questions flowing through LLM applications. Arize Software Engineer [Roger Yang](https://github.com/rogerhyang) built visualizations using UMAP and HDBSCAN to help engineers identify clusters of related prompts and responses. For the first time, Phoenix could reveal the structure of an AI application in a way developers could inspect directly.\n\nThat work eventually connected the team with LlamaIndex and the broader ecosystem of developers building [retrieval-augmented generation (RAG) systems](https://arize.com/blog/ragas-how-to-evaluate-rag-pipeline-phoenix/).\n\nFor Roger, who had come from Go and backend work, the project became as much a learning experience as an engineering challenge.\n\n“One major advantage of open source is your ability to read other people’s code and learn new things,” he said. “Python was new to us as well.”\n\nPhoenix was growing alongside the ecosystem it served.\n\n## How Phoenix evolved from a Notebook tool to an AI observability platform\n\nMost software projects begin with infrastructure, but Phoenix began with utility.\n\nThe team focused on [helping developers understand what was happening inside AI applications](https://arize.com/blog/llm-tracing-and-observability-with-arize-phoenix/). 
Infrastructure followed later.\n\n“We built features, then we built a container, then we built a database layer, and then we built authentication,” Mikyo said. “Quite literally backwards of what you would imagine a software project going.”\n\nAs developers adopted Phoenix, new requirements emerged.\n\nUsers ran long-lived Phoenix instances that accumulated millions of traces. Some built their own persistence layers using Elasticsearch or MongoDB, while others asked for ways to move beyond notebook environments.\n\n“We knew people needed to escape the notebook,” Mikyo said. The team responded by containerizing Phoenix before building a database layer.\n\n“We containerized it first, which sounds crazy.”\n\nSQLite followed, and Postgres support followed after that. Authentication arrived later, driven largely by community requests. Some developers needed Keycloak, while others needed Cognito, and others wanted OIDC support.\n\n“Each developer kind of builds a muscle of being a great developer, but also advocating for their own software,” said Mikyo.\n\nThe relationship between maintainers and users remains unusually direct. Case in point: a big part of what made that feedback loop tight was (and is) the lack of a separate support org behind Phoenix.\n\n“We don’t really have a support team,” says [Xander Song](https://github.com/axiomofjoy). “We are the support team.”\n\n“When you are the support team for your own product, there’s a certain level of trying to deliver a very high bar of quality, and feeling accountable when people come into GitHub and tell you this is not working.”\n\n## Why Phoenix adopted OpenTelemetry and created OpenInference\n\nThe move to OpenTelemetry was one that the team debated internally before committing to it. At first, the team was not convinced it would work.\n\nBy late 2023, [Phoenix had tracing](https://arize.com/resource/llm-tracing/), but it only worked with Phoenix. 
The team had built something that looked like OpenTelemetry without committing to it, partly out of doubt that AI data even fit the model. Phoenix addressed conversations, embeddings, retrieval results, and model outputs, whereas traditional observability systems were designed around infrastructure signals and application events.\n\n“It wasn’t obvious that OTel was the right vehicle,” Roger said.\n\nThe team debated the decision internally.\n\nRoger even submitted a pull request to switch Phoenix to OpenTelemetry before consensus existed.\n\nMikyo pushed back. “I think we’re kind of pushing a square peg through a circular hole,” he remembered thinking.\n\nWhat changed his mind was a mix of community pressure and evidence. GitHub issues kept asking for the switch. The team knew distributed tracing was coming, since agents would soon call LLMs across services rather than inside a single notebook. And the users they met in person made the case directly. “Someone would say, ‘I’m the maintainer of LangChain for Go.’ Or, ‘I have an existing Ruby application, I really want to use Phoenix, but I’m not going to switch to Python anytime soon,’” Mikyo says.\n\nThe question that had shaped Phoenix from the beginning resurfaced.\n\n“How do we reach the most users? Why not use the right plumbing, the plumbing that already existed in DevOps?” Mikyo asks.\n\nIn the end, the team came around and made the switch. And a byproduct of that switch was the development of OpenInference: a set of semantic conventions for AI applications that anyone can implement against any backend. The team kept the spec and the instrumentation in one monorepo so they could move fast, “developed by practitioners,” as Mikyo put it, on the two-week cadence that AI startups tend to move at. The first pull request from Hugging Face was a milestone, and Xander put in a lot of work maintaining it over the years.\n\nIn hindsight, Mikyo thought the doubt was misplaced. “DevOps problems are also AI ops problems. 
”\n\n## Why Phoenix stayed local first\n\nOne of Phoenix’s defining characteristics emerged almost by accident. Phoenix runs completely locally.\n\n“It’s a happy accident,” Mikyo says. “Because it started as a Jupyter extension, we just were never building a SaaS platform to begin with. We always had to assume everything had to run locally.”\n\nThe benefits became increasingly obvious:\n\n- Developers could debug issues locally.\n- Teams working with sensitive data could keep telemetry inside their own environments.\n- Organizations operating in air-gapped environments could still run observability tooling.\n\nMoreover, a number of support questions came from Windows users in corporate roles, who were tracking sensitive data they could not send anywhere. Mikyo remembered a user named Rusty who was so into local development that it surprised the team, with some people pushing local SQLite instances to 200 gigabytes. He pointed to companies where 400 engineers could each run the same observability stack on their own machines without incurring additional cost.\n\nThe local-first approach also aligned with the way many developers preferred to work.\n\nWhen [Llama.cpp](https://llama-cpp.com/) made it possible to run a 70-billion-parameter Qwen model on a laptop, local-first stopped looking like a constraint and started to look like a feature. “I can code while I’m on a plane,” Mikyo said.\n\nWhat started as a design constraint became one of Phoenix’s core strengths.\n\n## Open source AI evaluation without vendor lock-in\n\nPhoenix works across Python and TypeScript. It supports dozens of frameworks and integrates with observability backends throughout the ecosystem (the team calls this being the Switzerland of evals).\n\nThat openness came from humility as much as conviction.\n\n“We didn’t really know what good evals look like,” Mikyo said. “We didn’t want to say, ‘This is what good evals look like,’ because there was a lot we didn’t know. 
We just wanted people to be experimenting.”\n\nRather than prescribe a single approach, the team focused on helping developers observe, evaluate, and improve AI systems regardless of their programming language.\n\nKeeping everything open was the way to learn from the community rather than dictate to it. Mikyo pointed to John Carmack shipping Quake with its own flavor of C so people could hack on it.\n\n“We want people to be able to build agents that work,” Mikyo said. “That’s what we have a vested interest in.”\n\n## What’s next for Phoenix\n\nAsk the team about the future and the conversation quickly turns to agents.\n\nThe way developers work has changed fast.\n\n“I don’t hand-write code anymore, which is kind of nuts,” Mikyo said. “I tell Claude what to do at this point.”\n\nThe team believes the next generation of observability and evaluation tooling will need to directly support agent workflows.\n\nThat includes observability for coding agents, [evaluation systems for agent-generated changes](https://arize.com/ai-agents/agent-evaluation/), and workflows that help humans review increasingly autonomous software systems. Today, Phoenix has to meet developers where they’re working with agents, the same way it once met them in notebooks.\n\nOne direction is giving each coding agent its own sandbox observability environment to gut-check its changes. Some of the team already run git worktrees with multiple Phoenix instances doing exactly that. Another is more human-in-the-loop flows, where humans ask agents to make changes and then approve them, or expert agents surface insights for humans to act on, with the right permissioning built in and the systems kept auditable.\n\nAt the same time, the philosophy behind Phoenix remains unchanged.\n\n“Ship fast but responsibly is kind of our motto,” Mikyo said. “We’re definitely trying to build a system that helps you move faster but also responsibly.”\n\nBut the team is wary of the easy path. 
“If you take the easiest path, you might be producing more slop,” Mikyo said. His view is that evidence-based development, review, and automation matter more now that agents are writing more of the code.\n\n## Thank you to the contributors\n\nPhoenix got here because people outside the core team kept showing up. The OIDC authentication came from users who needed Keycloak and Cognito.\n\nFramework maintainers pushed the team toward OpenTelemetry, while contributors expanded integrations and helped shape OpenInference.\n\nBug reports, feature requests, and design discussions influenced what got built and when.\n\nThe feedback loop worked because the maintainers stayed close to the community.\n\n*“The reason this team lives and breathes on Slack is because we miss the old days of the IRC channels where you could talk to us,”* Mikyo said. *“We love nerding out about cool stuff.”*\n\nTo everyone who filed an issue, opened a pull request, joined a discussion, shared feedback, or helped another developer in the community: thank you.\n\nYou helped shape Phoenix. 
We’re working hard to earn the next 10,000 stars.", "url": "https://wpnews.pro/news/phoenix-at-10000-stars-on-github-how-an-open-source-ai-observability-project-by", "canonical_source": "https://arize.com/blog/phoenix-10k/", "published_at": "2026-06-08 01:10:58+00:00", "updated_at": "2026-06-11 18:36:26.767164+00:00", "lang": "en", "topics": ["ai-tools", "ai-infrastructure", "mlops"], "entities": ["Phoenix", "Arize", "Nadia Eghbal", "OpenInference", "OpenTelemetry", "GitHub"], "alternates": {"html": "https://wpnews.pro/news/phoenix-at-10000-stars-on-github-how-an-open-source-ai-observability-project-by", "markdown": "https://wpnews.pro/news/phoenix-at-10000-stars-on-github-how-an-open-source-ai-observability-project-by.md", "text": "https://wpnews.pro/news/phoenix-at-10000-stars-on-github-how-an-open-source-ai-observability-project-by.txt", "jsonld": "https://wpnews.pro/news/phoenix-at-10000-stars-on-github-how-an-open-source-ai-observability-project-by.jsonld"}}