{"slug": "llm-0-32a2", "title": "llm 0.32a2", "summary": "Release of llm 0.32a2, highlighting that most reasoning-capable OpenAI models now use the `/v1/responses` endpoint instead of `/v1/chat/completions`, enabling interleaved reasoning across tool calls for GPT-5 class models. Users can now view summarized reasoning tokens in a different color when running prompts, with the option to hide them using the `-R` or `--hide-reasoning` flags.", "body_md": "Release: llm 0.32a2\nA bunch of useful stuff in this LLM alpha, but the most important detail is this one:\nMost reasoning-capable OpenAI models now use the `/v1/responses` endpoint instead of `/v1/chat/completions`. This enables interleaved reasoning across tool calls for GPT-5 class models. #1435\nThis means you can now see the summarized reasoning tokens when you run prompts against an OpenAI model, displayed in a different color to standard error. Use the `-R` or `--hide-reasoning` flags if you don't want to see that.\nTags: llm, projects, openai, generative-ai, annotated-release-notes, ai, llms", "url": "https://wpnews.pro/news/llm-0-32a2", "canonical_source": "https://simonwillison.net/2026/May/12/llm/#atom-everything", "published_at": "2026-05-12 17:45:07+00:00", "updated_at": "2026-05-19 22:12:19.458302+00:00", "lang": "en", "topics": ["large-language-models", "artificial-intelligence", "developer-tools", "open-source"], "entities": ["OpenAI", "LLM"], "alternates": {"html": "https://wpnews.pro/news/llm-0-32a2", "markdown": "https://wpnews.pro/news/llm-0-32a2.md", "text": "https://wpnews.pro/news/llm-0-32a2.txt", "jsonld": "https://wpnews.pro/news/llm-0-32a2.jsonld"}}