{"slug": "harness-use-research-is-codex-better", "title": "Harness use research, is codex better?", "summary": "Tamarillo's analysis of approximately 400,000 public GitHub repositories reveals that five coding harnesses (Claude Code, Codex, Copilot, Cursor, and Hermes) dominate roughly 80% of repositories with harness configurations, confirming a power-law distribution in AI coding tool adoption. The study tracks market share evolution over two years, showing Cursor's early dominance gave way to rising adoption of Claude Code, Codex, and Hermes, with rolling share data indicating recent shifts in developer preferences. The findings provide a lower-bound estimate of harness adoption limited to public repositories, excluding private and enterprise configurations.", "body_md": "Tamarillo: AI Coding Harness Adoption\n\n`tamarillo`: coding harness inspection\n\nOver the past 2 years, coding harnesses (and even the term itself) have become ubiquitous.\nOne goal at Tamarillo is to systematize the use of coding harnesses (that is why `theta-spec` and `theta` were created).\n~400K public GitHub repositories containing configuration files for AI coding\nassistants (harnesses) were fetched. 
~400K was the count at the time of collection,\nafter exhaustively searching GitHub public repos[[†]](#fn-forks).\nThe process to get the data is fairly straightforward:\n- `filter criteria`: `PATTERNS` per harness were defined (explained in detail in [appendix A](#appendix-a-search-patterns-per-harness)).\n- `repo search`: code searches on GitHub's REST API were filtered against harness configuration files.\n- `enriching stage`: GitHub's GraphQL API was used to enrich files with commit count, file bytes, creation date, etc.\n\nThis document covers several things: market share and adoption dynamics, configuration surface\nanatomy (what files exist, how big, how often touched), multi-harness co-occurrence,\nrepo demographics by stars/language/owner type, and other odds and ends. It was created with the intent of\nbeing a DIY thermometer for a **slice** of a domain. Although some suspicions were confirmed (some of them obvious, it MAY be argued),\nthe reader is strongly encouraged to read the [limitations and methodology](#methodology--limitations) section of this document.\nOnly public repositories were fetched. The dataset reflects configuration intentions\n(i.e. a repo that has a `.cursorrules` file signals that someone set it up, not necessarily that Cursor is\nused daily). This is a lower bound on harness adoption.\n[†] The number of repositories fetched includes a negligible number of forks of already captured repos (<0.1%). These were excluded from the analysis.\n# Harness adoption: market share\n\nAt the time of writing, the C4H (Claude Code, Codex, Copilot, Cursor, Hermes) dominates ~80% of the public repositories that expose harness configurations. This is the Pareto principle in action, an expected outcome given that power laws [are usually at play in software](https://www.spinellis.gr/pubs/jrnl/2008-TOSEM-PowerLaws/html/LSV08.html) and software tools. 
Out of ~20 harnesses, 5 dominate.\n## C4H domination\n\n## Rolling share: new harness adoptions\n\nOver the months and years, the landscape changed. At the beginning, market share was largely dominated by Cursor.\nThe data confirms that, unsurprisingly, along with the rise of Claude Code and, shortly after, Codex and Hermes.\nBelow, two charts measure the following signals: recent adoption dynamics and the current state of market share[[†]](#fn-mortals).\n- Rolling market share chart, showing the signal of recent adoption: popularity calculated as the share of new repositories created in an `x`-day window, filtered by the date each harness config format became publicly available.\n- Evolution of market share in absolute terms over the past ~2 years.\n\nThe dates used as filters correspond to harness-specific events: the release date of the configuration files deemed mandatory for a coding harness and/or the product launch date. Details and sources for each date can be found [here](#appendix-a-search-patterns-per-harness).\n[†] This analysis is constrained to what GitHub's public APIs expose. Private repos, enterprise installations, and home-directory config never reach the index, so the curves here are ballpark estimates that can't be confirmed at the population level.\nAs stated above, the evolution of market share in absolute terms over the past ~2 years filters\nby release date without including retroactively configured repos. The results displayed below are\nthe cumulative share for each harness present in the selected group. The cumulative adoption over time, filtered by harness release date, shows that:\n- All harnesses show exponential growth.\n- Harnesses experience rate discontinuities, e.g. Cursor ~2025-07, Copilot ~2025-11 (i.e. 
a subtle change in the rate of growth, though visible to the naked eye; the cause is open to speculation).\n- Even though harnesses like Claude Code or Codex were released later, they have already caught up in terms of repo footprint[[†]](#fn-repo-footprint).\n\n## Codex & Claude Code\n\nCompetition between frontier labs is intense in many domains, including harnesses such as Codex and Claude Code. The market share ratio\n[was discussed in this post](https://www.wired.com/story/openai-codex-race-claude-code/),\nwhich puts Codex at **~**5% of Claude Code's usage in Sep 2025 and **~**40% by Jan 2026\n(emphasis on the approximation operator ~).\nThe public-repo measurement here sits above the ~5% figure and lands close to the ~40% one. Two effects MAY compound to produce the gap:\n- WIRED reports \"5 percent as much **use**\" (Sep 2025) and \"40 percent of Claude Code's **user base**\" (Jan 2026), attributed to \"people with direct knowledge of the matter\" (anonymous internal sources, no methodology disclosed). This chart counts public repos with a config file committed.\n- Configs MAY be committed weeks or months after the user actually adopted the tool.\n\nThe proxy is more credible for cumulative adoption than for short-run growth. A 14-day rolling window is noisy, and there is no public data or ballpark estimate to contrast that rate-of-change series against.\n## Multi-harness adoption\n\nIf each repo independently decides to adopt one more harness with\nconstant probability $p$, the number of repos with $k$ harnesses follows a geometric\ndistribution: $N(k) \\propto p^{k}$.\nIt MAY be a bit of a stretch, but the temptation to fit a line\non a $\\log(y)$ vs $x$ chart and interpret what that constant probability means is hard to resist. Here $x$ is\nthe number of harness configurations co-occurring in a repo, so $p$\nis the probability of adopting harness $i+1$ after already having $i$. Note that it\ndoes not matter which harness is adopted (e.g. 
maybe it is always Claude Code because\nit is popular); what matters is that there is a new one. Under this model, the decision to add harness\n$k+1$ does not depend on how many are already configured. The regression was done for harness counts\n$k \\in [1, 5]$ because above that the counts drop to single digits.\n$\\hat{p}$ is obtained via OLS on $\\log N(k) = \\log C + k \\log p$, where the slope is $\\beta = \\log p$. The 95% CI uses the\n[textbook normal-theory slope CI](https://en.wikipedia.org/wiki/Simple_linear_regression#Normality_assumption)\n$\\hat{\\beta} \\pm 1.96 \\cdot \\mathrm{SE}(\\hat{\\beta})$ and is then pushed through $\\hat{p} = e^{\\hat{\\beta}}$.\nThe regression on $\\log(N)$ is\nheteroscedastic (small counts are noisier), so the reported CI is slightly optimistic. A\n[Poisson GLM](https://en.wikipedia.org/wiki/Generalized_linear_model#Count_data)\nfitting $N(k) \\sim \\mathrm{Poisson}(\\mu_k)$ with $\\log \\mu_k = \\log C + k \\log p$ is the more appropriate approach.[[†]](#fn-2-repos)\n[†] If only 2 harnesses are selected, the linear fit is trivially perfect, since two points always define a line.", "url": "https://wpnews.pro/news/harness-use-research-is-codex-better", "canonical_source": "https://research.tamarillo.ai/coding-harness-inspection/", "published_at": "2026-05-29 20:39:42+00:00", "updated_at": "2026-05-29 20:47:20.662773+00:00", "lang": "en", "topics": ["ai-tools", "ai-infrastructure", "ai-research", "mlops"], "entities": ["Tamarillo", "GitHub", "theta-spec", "theta"], "alternates": {"html": "https://wpnews.pro/news/harness-use-research-is-codex-better", "markdown": "https://wpnews.pro/news/harness-use-research-is-codex-better.md", "text": "https://wpnews.pro/news/harness-use-research-is-codex-better.txt", "jsonld": "https://wpnews.pro/news/harness-use-research-is-codex-better.jsonld"}}