{"slug": "what-i-m-finding-about-llm-code-style-and-token-costs", "title": "What I'm Finding About LLM Code Style and Token Costs", "summary": "A developer finds that large language models like Claude generate verbose, legacy code patterns that increase token costs, especially output tokens which are 3-5x more expensive than input tokens. The author argues that modern Web APIs already available in runtimes like Deno and Cloudflare Workers can reduce code length and improve quality, suggesting that LLMs often ignore these built-in solutions, leading to higher API costs.", "body_md": "# What I’m Finding About LLM Code Style and Token Costs\n\n*Spending output tokens to share it. Before the price spikes.*\n\n## Where This Started\n\nI’ve been working through creating and reviewing features with Claude the past year. It’s been remarkable seeing the tension in token consumption and legacy patterns. Right when I think something is complete, a problem surfaces—regression, edge case, whatever. All the while watching the slow, steady and natural march toward eventual full-price rates. Alongside this phenomenon, my accumulated push to stay at the pragmatic edge of modern Web work. The sweet spot where nearly ubiquitous features remove lines of code and improve quality—the place where I keep wondering: why did I get that output? Why did that line of code appear instead of what’s been available for years? I usually dismiss it with the observable fact that Claude is effectively junior level at best, and a useful approximation of the encyclopedic knowledge asked in interviews.\n\nIn trying to make progress on something I am finding myself reviewing my practice and looking at where that outrageous token usage is coming from. Every one of those is output tokens, the ones that cost **several times more** (3x to 5x!!!) than input tokens in API pricing. 
Patterns that are longer, more fragile, more insecure, and solving problems the platform already solved, often years ago.\n\nIt’s enough to start imagining there’s some conspiracy to take the entire web platform backward, right when Ryan Dahl (with Deno) and, separately, Alex Russell, Dimitri Glazkov, and many others (with Web Components and more) literally made the entire Web platform great again. All to eke out some return on the tokens. So for the sake of conspiracy, this is what I’m finding.\n\nBecause of my background as a human being who uses language, designed typography, programmed early on, and draws, alongside many other eclectic oddities, I actually consider things like tabs a remarkable innovation. I can literally reduce indentation to 1 character, not some abstraction I have to go ask someone how to define or get permission to use. (I guess I’m just far too egalitarian to appreciate the exclusionary attitude of the entire software community.) I care about humans, and want things to work within some parsimonious baseline. And multiplying stuff by 4 or some arbitrary number just really doesn’t make sense to me. I could go on, but maybe this grounds the orientation: someone who’s worked with actual language on actual media and has opinions about when something works and when it doesn’t. That part tends to speak for itself.\n\nI mention this because it colors what I looked into from a purely pragmatic standpoint. I’m not arguing for a specific position where everyone uses tabs (despite that speaking for itself). I’m disclosing background that shaped opinions I’d been sitting on: there was always an economic argument I kept to myself, and it’s now showing up in real API costs. My opinions on convention are not the article. The token usage optimizations are what I came here to share. So you can benefit too. 
If you want to keep using multiple spaces, I’ll remind myself that the literature said it seemed ok and the LLM doesn’t know any better.\n\n## The Easiest Token Optimization on the Planet Is Already in the Runtime\n\n[Deno](https://docs.deno.com/runtime/) and runtimes like Cloudflare Workers implement the [Web API surface natively](https://developer.mozilla.org/docs/Web/API): `URL`, `URLSearchParams`, `fetch`, `FormData`, `Headers`, `Request`, `Response`, `AbortController`, `ReadableStream`, `crypto`, and more, the same objects that run in the browser. This is an architectural choice that Deno made deliberately, and that [WinterCG](https://wintercg.org) has been formalizing as a minimum common API surface across runtimes. It has a significant practical consequence: **the same API surface covers both browser and server-side code**. No translation layer, no shims, no adaptation cost. The platform has already solved a large category of problems, correctly, securely, and without dependencies. Deno is particularly notable for including a [standard library](https://docs.deno.com/runtime/reference/std/) for cases where the Web APIs are missing something and a cross-platform solution is needed.\n\nThe LLM doesn’t know this about your environment unless you say so. Its training corpus is dominated by Node.js code from before these APIs were universal: `require('url')`, `querystring.parse()`, `express` middleware patterns, `axios` with custom timeout wrappers, `multer` for form parsing. Those patterns are statistically dominant in what the model learned from. They’re what it reaches for.\n\n**The gap between what the model defaults to and what the platform already provides is where most of the output token cost lives.**\n\n## The Magnitude, by Pattern\n\nI’ve been estimating the token economics of this as I go. 
These are approximate (based on the actual length of the patterns, not from a formal study), but the ratios are consistent enough to be useful.\n\n### Query parameter parsing\n\n``` js\n// model default: manual parsing (~140 tokens)\nconst parts = rawUrl.split('?');\nconst pairs = parts[1] ? parts[1].split('&') : [];\nconst params = {};\npairs.forEach(p => {\n\tconst [k, v] = p.split('=');\n\tparams[decodeURIComponent(k)] = decodeURIComponent(v);\n});\n\n// Web API (~12 tokens)\nconst params = Object.fromEntries(new URL(rawUrl).searchParams);\n```\n\nRoughly 140 tokens versus 12. About 90% reduction, per occurrence. The manual version also throws on malformed percent-encoding, leaves `+` undecoded, silently drops all but the last value for repeated parameters, and mishandles a key like `__proto__`. The native version parses and decodes by specification; `Object.fromEntries` still keeps only the last value per key, so use `searchParams.getAll()` where repeated parameters matter.\n\n### Form data\n\n```\n// model default: per-field state (~200+ tokens for a 3-field form)\nconst [name, setName] = useState('');\nconst [email, setEmail] = useState('');\nconst [role, setRole] = useState('');\nconst setters = { name: setName, email: setEmail, role: setRole };\nconst handleChange = (e) =>\n\tsetters[e.target.name](e.target.value);\n\n// Web API (~14 tokens)\nconst data = Object.fromEntries(new FormData(event.target));\n```\n\nThe model will generate state tracking and change handlers for every field. The native version ingests the entire form in one call. 
Roughly 200–250 tokens versus 14, depending on field count, and the native version scales to twenty fields at the same cost.\n\n### Fetch lifecycle and cancellation\n\n``` js\n// model default (~90 tokens)\nlet timer;\nconst controller = new AbortController();\ntimer = setTimeout(() => controller.abort(), 5000);\ntry {\n\tconst res = await fetch(url, { signal: controller.signal });\n} finally {\n\tclearTimeout(timer);\n}\n\n// Web API (~12 tokens)\nconst res = await fetch(url, { signal: AbortSignal.timeout(5000) });\n```\n\nThe manual version leaks timers if the `finally` path is missed during refactoring. The native version has no lifecycle to manage.\n\n### Parallel async with failure isolation\n\n``` js\n// model default (~100 tokens)\nlet anyFailed = false;\nconst results = await Promise.all(\n\ttasks.map(t => t.catch(e => { anyFailed = true; return null; }))\n);\nif (anyFailed) { /* now what? */ }\n\n// native (~10 tokens)\nconst results = await Promise.allSettled(tasks);\n```\n\n`Promise.allSettled()` resolves to a structured result per task, with a `.status` of `\"fulfilled\"` or `\"rejected\"` and the corresponding `value` or `reason`. The manual version loses the error detail and invents a new ad hoc status convention on every use.\n\n### UI components\n\n```\n// model default: custom modal (~250 tokens of JS lifecycle management)\nconst [isOpen, setIsOpen] = useState(false);\nuseEffect(() => {\n\tif (isOpen) document.body.style.overflow = 'hidden';\n\treturn () => { document.body.style.overflow = ''; };\n}, [isOpen]);\n// ... aria attributes, keyboard trap, backdrop click handler ...\n\n// semantic HTML (~25 tokens)\n<dialog ref={ref}>...</dialog>\n// opened with ref.current.showModal(), the browser handles focus, Escape key, accessibility tree, backdrop\n```\n\n`<dialog>` has been supported across all major browsers since 2022. 
`<details>`/`<summary>` for accordions and native `<form>` constraint validation (`required`, `type=\"email\"`, `pattern`, `minlength`) are not obscure either. The model reaches for JavaScript implementations because that’s what’s in its training data. It will keep doing this until directed otherwise.\n\n### A complete Deno request handler\n\nThe compound effect is where this becomes substantial. A Deno handler that parses request params, reads a form body, queries a database, and returns a response, written in the model’s default style, runs to 400–600 output tokens for the boilerplate alone, before any application logic. The same handler written with native APIs runs to 60–90 tokens. That’s not a marginal improvement.\n\n```\n// native Web APIs throughout (~70 tokens of infrastructure)\nexport async function handler(request) {\n\tconst { searchParams } = new URL(request.url);\n\tconst tenantId = searchParams.get('tenant');\n\t// request.formData() already resolves to a FormData instance\n\tconst data = Object.fromEntries(await request.formData());\n\tconst result = await db.query(`\nSELECT id, name\nFROM records\nWHERE tenant_id = ?\nAND active = 1\n`).bind(tenantId).first();\n\treturn Response.json(result);\n}\n```\n\n## Security and Reliability as Structural Outcomes\n\nThis is worth naming directly rather than leaving as a footnote. Moving to native APIs doesn’t just reduce token cost; it eliminates categories of bugs.\n\nManual query string parsing with `params[key] = value` mishandles keys like `__proto__` and, in parsers that build nested objects, becomes a prototype pollution vector. Manual `decodeURIComponent` throws a `URIError` on a malformed `%` sequence, which hand-rolled parsers rarely catch. Custom `setTimeout`-based abort patterns leak when the cleanup path is skipped during refactoring. Custom form state tracking creates consistency bugs when a field is added but the handler isn’t updated. Homemade modal focus management routinely breaks keyboard navigation and screen readers.\n\nThe native implementations are spec-compliant. 
They’ve been tested against far more edge cases from real web traffic than any hand-rolled equivalent. The Web Platform Tests suite runs tens of thousands of interoperability tests against each browser and runtime. `URLSearchParams` handles `+` encoding, repeated parameters, empty values, and UTF-8 edge cases correctly because it was written to the spec that defines what correct means. The model’s hand-rolled equivalent handles whatever the author thought of that day.\n\n**This is not a minor reliability improvement. It’s the difference between code implemented once against the spec and code written from memory by a pattern-matching system trained on a corpus full of implementations that got it partly wrong.**\n\n## What Comments Are Actually Doing\n\nI’d thought of comments as documentation: useful for humans, neutral for LLMs. Research from MITRE published in June 2025 ([Sabetto et al.](https://arxiv.org/abs/2506.11007), tested across Claude, GPT-4, Llama, and Mixtral) changed that. Comments aren’t neutral. **Models follow comment intent even when it contradicts the code.** Inaccurate comments (the kind that describe what the code used to do before a refactor) actively degraded LLM comprehension below the no-comment baseline. Worse than silence.\n\nA stale comment isn’t harmless. It’s misinformation with authority. When a model keeps returning to a pattern I’ve moved away from, a stale comment near that code is a real candidate for why.\n\nWhat comments are worth, what actually carries useful information, is design intent. Constraints. Why this function doesn’t catch its own errors. Why the SQL filters at the database level instead of in application code. What must not change when this is refactored. The reason for a non-obvious choice. That’s signal. 
“Loop over items” above `items.forEach()` is noise, and adds tokens with no return.\n\n[ACL 2024 work on comment augmentation](https://aclanthology.org/2024.findings-acl.809) supports the other direction: models trained on code *with* comments outperform models trained on uncommented code. Comments are a semantic bridge. At inference time they still carry signal, so the content of that signal matters.\n\n## The Formatting Question, Correctly Weighted\n\nThere is a real finding here. [Pan, Sun et al.](https://arxiv.org/abs/2508.13666) (“The Hidden Cost of Readability,” August 2025) measured input token overhead from formatting across tens of thousands of source files. Removing indentation, blank lines, and alignment whitespace reduced input token counts by an average of 24.5% with essentially no accuracy change for Claude or GPT-4.\n\nThat’s the input side, and it’s real. The tractable individual choices (no alignment whitespace, SQL ex-dented to the left margin, no blank lines inside function bodies) aggregate to roughly 5–10% input savings under typical JS conditions.\n\nBut input tokens cost one-third to one-fifth what output tokens cost. And the output savings from native APIs are not 5–10%; they’re 85–92% per pattern, compounding across every occurrence. The formatting work is worth doing. It is not the main event.\n\nMy preference for ex-dented SQL has a sound technical rationale: the model’s SQL training data is predominantly left-aligned, so matching that distribution makes sense. I can’t point to a controlled JavaScript study showing that it measurably improves accuracy. It looks right to me, and the argument is sound enough.\n\n## What I’m Putting in Prompts [And Working Through]\n\nThe mechanism that actually changes model output is an explicit directive named at the start of the session. General style guidance produces only marginal improvement; [Wang et al.](https://arxiv.org/abs/2407.00456) (ACM, 2024–2025) found this in a study of style-aware prompting. 
What works better is naming specific APIs explicitly, making the correct answer available before the model reaches for its training-data default.\n\nHere’s what I’m actively working on. Note the regular use of DO THIS and NOT THAT; these work best together. (This works by constraining the probability space before generation, and is a recurring suggestion you can see across the examples described here.)\n\n```\nuse Web APIs natively: URL, URLSearchParams, FormData, AbortController, fetch, Headers, Request, Response, Promise.allSettled(), Promise.any()\nuse semantic HTML: <dialog>, <details>, <form> with native constraint\nvalidation. Do not implement in JavaScript what the browser or Deno runtime provides natively\n```\n\nCombined with comment discipline:\n\n```\nComments state design constraints, invariants, and why. Not what the\ncode does. Do not write comments that restate what the next line does.\n```\n\nThe native API directive is the one that produces the most visible difference in output quality and cost.\n\n## Where This Lands\n\nThe core finding is structural, not a tip. Deno made the choice to implement the Web API surface natively, creating a single consistent set of abstractions that work identically in the browser and on the server. That surface solves (correctly, securely, and for free) a large category of problems that LLMs are currently solving again from scratch, badly, every generation, at roughly 7–10× the token cost necessary.\n\nThe comment findings matter because the model treats them as authoritative input, not metadata. Stale comments produce actively wrong output. Accurate design-intent comments constrain generation in useful directions.\n\nThe formatting findings are real and worth applying. 
They are secondary to the API question.\n\nWhat’s striking to me is that the biggest lever here, the one that produces 7–10× output token reduction on infrastructure code and eliminates whole categories of security and reliability issues simultaneously, is not a new coding technique. It’s using what the platform already built. The friction is that the model doesn’t know to use it unless you say so. Once you do, it’s consistent about it. The model doesn’t know what your runtime already ships. Someone has to, and that’s the entire reason you hire professionals instead of just running the model.\n\n**Sources**\n\n[Pan, Sun et al. “The Hidden Cost of Readability: How Code Formatting Silently Consumes Your LLM Budget.”](https://arxiv.org/abs/2508.13666) arXiv:2508.13666, August 2025.\n\n[Sabetto et al. (MITRE). “Impact of Comments on LLM Comprehension of Legacy Code.”](https://arxiv.org/abs/2506.11007) arXiv:2506.11007, June 2025.\n\n[Song, Zhang et al. “Code Needs Comments: Enhancing Code LLMs with Comment Augmentation.”](https://aclanthology.org/2024.findings-acl.809) ACL Findings, August 2024.\n\n[Wang et al. “Beyond Functional Correctness: Investigating Coding Style Inconsistencies in Large Language Models.”](https://arxiv.org/abs/2407.00456) ACM, 2024–2025.\n\n*This is what I’m finding in my own workflow. All of the token estimates above are early approximations from direct observation, not from published studies. The directional findings are highly consistent. 
Your specific numbers will vary with your codebase so test it to see what really works for you, your work and your team.*\n\n[jimmont.com](https://www.jimmont.com/)", "url": "https://wpnews.pro/news/what-i-m-finding-about-llm-code-style-and-token-costs", "canonical_source": "https://www.jimmont.com/llm-style-token-costs", "published_at": "2026-06-25 00:52:40+00:00", "updated_at": "2026-06-25 01:14:11.749563+00:00", "lang": "en", "topics": ["large-language-models", "developer-tools", "ai-tools"], "entities": ["Claude", "Deno", "Cloudflare Workers", "WinterCG", "Ryan Dahl", "Alex Russell", "Dimitri Glazkov"], "alternates": {"html": "https://wpnews.pro/news/what-i-m-finding-about-llm-code-style-and-token-costs", "markdown": "https://wpnews.pro/news/what-i-m-finding-about-llm-code-style-and-token-costs.md", "text": "https://wpnews.pro/news/what-i-m-finding-about-llm-code-style-and-token-costs.txt", "jsonld": "https://wpnews.pro/news/what-i-m-finding-about-llm-code-style-and-token-costs.jsonld"}}