{"slug": "ainews-codex-rises-claude-meters-programmatic-usage", "title": "[AINews] Codex Rises, Claude Meters Programmatic Usage", "summary": "Anthropic has changed its Claude subscription model to include a monthly credit of API tokens equal to the dollar amount of the plan, effectively metering programmatic usage of the model outside its own platforms. The shift, which eliminates a historical 70-90% discount on API pricing, has sparked backlash from users who view it as a \"rug pull\" even as OpenAI's Codex gains popularity among AI engineers for its more generous limits. The pricing change represents Anthropic's move to put its most favorable pricing behind its own tools while treating third-party harnesses as metered usage.", "body_md": "# [AINews] Codex Rises, Claude Meters Programmatic Usage\n\n### a quiet day lets us report on a long trend of the major coding agents\n\nIt has been a tale of two cities in the past 3 weeks since the launch of GPT 5.5; while the finance folks fall in love with [Anthropic’s growth](https://www.latent.space/p/ainews-anthropic-growing-10xyear) and [CFO](https://x.com/anquetil/status/2054637012850970631) ahead of its likely October IPO, there has been a notable rise in pro-Codex sentiment among AI Engineers, likely a combination of GPT 5.5 being a really good (in [some scenarios Mythos-tier](https://x.com/mschoening/status/2054565859491029497?s=12)) model, the launch of [Codex for Everything Else](https://www.latent.space/p/ainews-agents-for-everything-else), and a third thing, which is the trigger for today’s op-ed: more generous limits.\n\nThe messaging for Claude’s pricing change was generally pretty well done; it is simply not what users of alternative harnesses wanted to hear: [every Claude subscription now gets a monthly credit of API tokens equal to the dollar amount of the Claude subscription plan.](https://x.com/ClaudeDevs/status/2054610152817619388) So if you pay $200, you get BOTH a Claude subscription with its own limits for 
using Claude on Anthropic-owned harnesses like Claude.ai and Claude Code (“interactive usage”), AND $200 worth of API credits for using Claude everywhere else, including `claude -p`, OpenClaw, and others (“programmatic usage”).\n\nIf things had worked this way from the start, it would have been viewed as a very good deal.\n\nHowever, because of the historical subsidy/pricing advantages (estimated at a 70-90% discount from API pricing), people are viewing it [as a “rug pull” of sorts](https://x.com/ClaudeDevs/status/2054610152817619388/quotes). Still, it’s nice to have an official policy in place as opposed to the selective targeting of [OpenClaw](https://x.com/kloss_xyz/status/2040211360156700843) and [OpenCode](https://x.com/thdxr/status/2034730036759339100?s=20) and the uncertain status of less popular harnesses.\n\nThat these headlines come on the same day as [OpenAI launches their enterprise switch](https://x.com/OpenAIDevs/status/2054586214112780518/quotes) promo is an incredible coincidence.\n\nAt the end of the day, we would caution against reading too much into swings either way: both labs are doing very well, and these are, in the grand scheme of things, normal pricing shifts by people inventing the future of coding while figuring out optimal pricing as they shake up a decades-old industry. Anthropic was more liberal in the beginning, but now that Claude Code has a sustainable brand and clout as an agent harness, Anthropic is putting its most favorable pricing behind its own tools and metering everything else, whereas Codex as the challenger is being more liberal with everything.\n\nPerhaps hardware is destiny, perhaps this is part of a longer 6-month alternating cycle of the “[mandate equinox](https://x.com/irl_danB/status/2050051868597080482)”.\n\nAI News for 5/12/2026-5/13/2026. We checked 12 subreddits, 544 Twitters and no further Discords. AINews’ website lets you search all past issues. As a reminder, AINews is now a section of Latent Space. 
You can opt in/out of email frequencies!\n\n**AI Twitter Recap**\n\n**Agent Infrastructure, Harnesses, and Developer Platforms**\n\n**Cline, LangChain, Notion, and Cursor all pushed deeper into agent platform territory**: [Cline](https://x.com/cline/status/2054580767779700775) open-sourced a rebuilt **Cline SDK** and refreshed CLI with a TUI, agent teams, scheduled jobs, and connectors, positioning its harness as a reusable substrate for custom coding agents. [LangChain](https://x.com/LangChain/status/2054617687238865013) shipped a large batch of agent lifecycle infrastructure at Interrupt: **LangSmith Engine**, **SmithDB**, **Sandboxes**, **Managed Deep Agents**, **LLM Gateway**, **Context Hub**, and **Deep Agents 0.6**. The most technically notable piece is [SmithDB](https://x.com/LangChain/status/2054658661776244936), a purpose-built observability database for nested, long-running traces with large payloads, reportedly yielding **12–15×** faster access on key workloads; the team says it is built atop [Apache DataFusion and Vortex](https://x.com/ankush_gola11/status/2054681251513254260). In parallel, [Notion’s External Agents API](https://x.com/NotionDevs/status/2054600524423733307) lets third-party agents such as Claude, Codex, Cursor, Decagon, Warp, and Devin operate directly inside Notion as a shared, reviewable context layer rather than another silo. [Cursor](https://x.com/cursor_ai/status/2054651526715502998) expanded cloud agents with fully configured **development environments** including cloned repos, dependencies, version history, rollback, scoped egress, and isolated secrets.\n\n**Agent UX is increasingly about long-running state, streaming, and orchestration rather than chat**: Several launches converged on the same design direction. [Duet Agent](https://x.com/dzhng/status/2054619807715348779) proposes a state-machine harness for jobs that last **weeks or months**, with parent/sub-agent coordination and memory replacing compaction. 
LangChain’s OSS updates added [streaming typed projections, checkpoint storage, code interpreter, harness profiles, and model-specific tuning](https://x.com/LangChain_OSS/status/2054641656222388700), all aimed at richer agent event streams than plain tokens. [Tabracadabra](https://x.com/oshaikh13/status/2054613590695641269) moved from autocomplete to a context-aware assistant in any textbox, while [VS Code](https://x.com/code/status/2054669377367064613) introduced an Agents window and better multi-project task review. The architectural message across these releases is that production agents increasingly need **durable execution, inspectable intermediate state, and tool-native UI surfaces** rather than stateless prompt/response loops.\n\n**Model Training, Architecture, and Data Efficiency**\n\n**Pretraining efficiency and architectural experimentation were the strongest research throughline**: [Nous Research’s Token Superposition Training](https://x.com/NousResearch/status/2054610062836892054) modifies the early phase of pretraining so the model reads/predicts contiguous bags of tokens before reverting to standard next-token prediction; they report **2–3× wall-clock speedup at matched FLOPs** with no inference-time architecture change, validated from **270M to 3B dense** and **10B-A1B MoE**. [Jonas Geiping et al.](https://x.com/jonasgeiping/status/2054600427128201688) argued current message-based/chat training overly constrains agents to a single stream and released a **multi-stream LLM** paper claiming lower latency, cleaner separation of concerns, and more legible parallel reasoning/tool use; paper and code are linked [here](https://x.com/jonasgeiping/status/2054600457746579816). [δ-mem](https://x.com/dair_ai/status/2054600147020222630) proposed an external online associative memory attached to a frozen full-attention backbone, with an **8×8 state** reportedly improving average score by **1.10×** and beating non-δ-mem baselines by **1.15×**, with larger gains on memory-heavy 
benchmarks.\n\n**Post-training/compression and data curation also produced notable results**: NVIDIA’s [Star Elastic](https://x.com/PavloMolchanov/status/2054607257166553292) claims one post-training run can derive a family of reasoning model sizes, at **360× lower cost than pretraining a family** and **7× better than SOTA compression**. Datology’s VLM work, highlighted by [Siddharth Joshi](https://x.com/sjoshi804/status/2054566179369574419) and [Pratyush Maini](https://x.com/pratyushmaini/status/2054607891202777192), argues **data curation alone** can produce major multimodal gains: **+11.7 points across 20 public VLM benchmarks at 2B**, beating InternVL3.5-2B by roughly **10 points** at about **17× less training compute**, and near-frontier 4B performance with **3.3× lower response FLOPs** than Qwen3-VL-4B. On the open data side, [Percy Liang](https://x.com/percyliang/status/2054550981527146942) said the next **Marin** run already has **18T tokens** in its mix and is still seeking more pretraining, mid-training, and SFT data, with a companion token viewer [shared here](https://x.com/percyliang/status/2054550984597328101).\n\n**Open evaluation and dataset work is maturing alongside model building**: [Kevin Li’s SWE-ZERO-12M-trajectories](https://x.com/kevin_x_li/status/2054600962137100493) is positioned as the largest open agentic trace dataset: **112B tokens, 12M trajectories, 122K PRs, 3K repos, 16 languages**. [Victor Mustar](https://x.com/victormustar/status/2054495700822478943) flagged **llama-eval** as a step toward more comparable llama.cpp community evals. 
Meanwhile, [Steve Rabinovich](https://x.com/steverab/status/2054564579573698921) and [Sayash Kapoor](https://x.com/sayashk/status/2054569643080077576) argued credible agent evaluation requires **log analysis**, not outcome-only metrics, because stronger agents expose hidden benchmark bugs and reward-hacking paths.\n\n**Enterprise AI Pricing, Platform Competition, and Distribution**\n\n**Anthropic vs OpenAI competition sharpened around enterprise distribution and developer lock-in**: [Ramp data cited by Andrew Curran](https://x.com/AndrewCurran_/status/2054582686698848294) showed **Anthropic at 34.4%** of businesses vs **OpenAI at 32.3%** in April, the first apparent lead change in business adoption; [The Rundown](https://x.com/TheRundownAI/status/2054588969044627906) amplified the same figures. At the same time, Anthropic changed plan economics: [ClaudeDevs announced](https://x.com/ClaudeDevs/status/2054610152817619388) that paid Claude plans will get a dedicated monthly credit for programmatic usage across the **Agent SDK**, `claude -p`, GitHub Actions, and third-party SDK apps. This was immediately read by power users as a major restriction on subscription-subsidized harnesses, with criticism from [Theo](https://x.com/theo/status/2054620998205624746), [Jeremy Howard](https://x.com/jeremyphoward/status/2054682882753597603), [Matt Pocock](https://x.com/mattpocockuk/status/2054655310388674693), and [Omar Sanseviero](https://x.com/omarsar0/status/2054679776397300188). Anthropic partially offset that backlash with a separate [50% increase in Claude Code weekly limits](https://x.com/ClaudeDevs/status/2054639777685934564) through July 13, stacked on the previously announced 2× 5-hour limit increase.\n\n**OpenAI responded aggressively with Codex enterprise incentives**: [OpenAI Devs](https://x.com/OpenAIDevs/status/2054586214112780518) and [Sam Altman](https://x.com/sama/status/2054626219858293128) offered **two months of free Codex usage** for enterprise customers switching in the next 30 days. 
OpenAI also published more technical platform detail, including a [Windows sandbox design write-up](https://x.com/reach_vb/status/2054655421013434510) describing the combination of local users, firewall rules, ACLs, write-restricted tokens, DPAPI, and helper executables needed to safely run coding agents with local filesystem/tool access. The competitive dynamic now looks less like “best model wins” and more like **subsidy + workflow control + harness compatibility**.\n\n**Enterprise adoption is increasingly tied to runtime/security assurances**: [Perplexity](https://x.com/perplexity_ai/status/2054608966148374715) described a hardware-isolated sandbox architecture with VPC-level separation, short-lived proxy tokens, and scanning of external content before agent actions, with [additional details](https://x.com/perplexity_ai/status/2054608978680873457) on encryption and auto-deletion. [Aravind Srinivas](https://x.com/AravSrinivas/status/2054619058650411174) framed this as foundational to Perplexity becoming an enterprise knowledge/research platform. The broader pattern: agent vendors are no longer selling only intelligence; they’re selling **bounded execution environments**.\n\n**Autonomous Science, Cyber Capability, and Robotics**\n\n**Recursive self-improvement moved from idea to startup cluster**: The largest single meta-theme was the launch of [Recursive](https://x.com/_rockt/status/2054491251345391852), founded to build AI that automates science and safely improves itself. Launch posts from [Richard Socher](https://x.com/_rockt/status/2054491251345391852), [Josh Tobin](https://x.com/josh_tobin_/status/2054576051431616873), [Dominik Schmidt](https://x.com/schmidtdominik_/status/2054498117416808727), [Jenny Zhang](https://x.com/jennyzhangzt/status/2054603211798147436), and [Shengran Hu](https://x.com/shengranhu/status/2054630820305088739) suggest a team drawn from open-endedness, AI Scientist, and research automation work. 
In adjacent work, [Adaption’s AutoScientist](https://x.com/adaption_ai/status/2054532113316434061) aims to automate the full training-research loop outside frontier labs, with [Sarah Hooker](https://x.com/sarahookr/status/2054551263275254084) arguing that most model training failures are due to research-loop brittleness rather than mere compute scarcity.\n\n**Cyber capability evaluations continue to steepen**: The UK [AI Security Institute](https://x.com/AISecurityInst/status/2054589758043496567) said the length of cyber tasks frontier models can complete has been doubling every few months, and that recent models are beating prior trends. Anthropic/Glasswing’s [Logan Graham](https://x.com/logangraham/status/2054613618168082935) said **Claude Mythos Preview** is the first model to solve both AISI end-to-end cyber ranges, including **Cooling Tower**, and the only one to clear every task under the institute’s **2.5M-token** cap. XBOW reportedly found “token-for-token, unprecedented precision,” and partner usage allegedly surfaced **thousands of high/critical vulnerabilities** in weeks. Independent commentary from [scaling01](https://x.com/scaling01/status/2054594892903436553) claimed a newer Mythos version completed a cyber range **6/10 times vs 3/10** for the preview baseline.\n\n**Robotics got a concrete long-horizon deployment demo**: [Figure’s Brett Adcock](https://x.com/adcock_brett/status/2054603963996278786) streamed humanoid robots running a full **8-hour autonomous shift** on package sorting using **Helix-02**, with follow-up details that the robots reason from camera pixels, operate around **human parity (~3s/package)**, perform **on-device inference**, coordinate as a networked fleet, autonomously swap for low battery, and self-diagnose/fail over to maintenance when needed [here](https://x.com/adcock_brett/status/2054615837903048807). 
This is one of the clearer public demonstrations of **multi-robot, long-duration, no-human-in-the-loop orchestration** rather than a short benchmark clip.\n\n**Top tweets (by engagement)**\n\n**Claude Code pricing and limits**: [@ClaudeDevs on 50% higher weekly limits](https://x.com/ClaudeDevs/status/2054639777685934564), [@ClaudeDevs on programmatic credits](https://x.com/ClaudeDevs/status/2054610152817619388), and the ensuing developer backlash from [@theo](https://x.com/theo/status/2054620998205624746) made pricing policy the day’s most consequential developer story.\n\n**Codex enterprise push**: [@sama offering two free months of Codex usage for switchers](https://x.com/sama/status/2054626219858293128) and [@OpenAIDevs’ enterprise call-to-action](https://x.com/OpenAIDevs/status/2054586214112780518) signaled an unusually direct go-to-market counterpunch.\n\n**Figure’s 8-hour humanoid shift**: [@adcock_brett’s livestream post](https://x.com/adcock_brett/status/2054603963996278786) drew enormous attention and is one of the few viral posts in the set with clear technical substance.\n\n**Cline SDK launch**: [@cline’s SDK release](https://x.com/cline/status/2054580767779700775) was one of the highest-engagement genuinely technical launches, reflecting demand for open coding-agent harnesses.\n\n**Token Superposition Training**: [@NousResearch’s TST post](https://x.com/NousResearch/status/2054610062836892054) stood out as a rare pretraining-method tweet that broke through widely, likely because the claim (**2–3× training speedup without changing inference-time architecture**) is concrete and economically important.\n\n**AI Reddit Recap**\n\n**/r/LocalLlama + /r/localLLM Recap**\n\n**1. 
Efficient On-Device LLM Inference**", "url": "https://wpnews.pro/news/ainews-codex-rises-claude-meters-programmatic-usage", "canonical_source": "https://www.latent.space/p/ainews-codex-rises-claude-meters", "published_at": "2026-05-14 03:53:26+00:00", "updated_at": "2026-05-25 00:19:49.762720+00:00", "lang": "en", "topics": ["artificial-intelligence", "large-language-models", "ai-products", "ai-tools", "ai-agents"], "entities": ["Anthropic", "Claude", "GPT 5.5", "Codex", "OpenClaw", "Claude.ai", "Claude Code", "Latent Space"], "alternates": {"html": "https://wpnews.pro/news/ainews-codex-rises-claude-meters-programmatic-usage", "markdown": "https://wpnews.pro/news/ainews-codex-rises-claude-meters-programmatic-usage.md", "text": "https://wpnews.pro/news/ainews-codex-rises-claude-meters-programmatic-usage.txt", "jsonld": "https://wpnews.pro/news/ainews-codex-rises-claude-meters-programmatic-usage.jsonld"}}