{"slug": "ai-credits-are-the-new-lines-of-code-metric", "title": "AI credits are the new lines of code metric", "summary": "GitHub added an 'ai_credits_used' field to its Copilot usage metrics API, allowing enterprise admins to track per-user AI consumption. The metric is intended for budget and adoption signals but risks being misinterpreted as a productivity measure, similar to lines of code. Developers warn that high or low credit usage does not directly correlate with valuable work, as the metric measures input rather than output.", "body_md": "GitHub added a tiny field to the Copilot usage metrics API this week that is going to create a lot of very confident spreadsheets.\n\nEnterprise and organization admins can now see `ai_credits_used` in the user-level Copilot usage reports. One field. Per user. Available for single-day and 28-day reports. It is not the invoice, and GitHub is careful to say it is a consumption signal rather than a billed total.\n\nStill, the shape is obvious.\n\nNow AI usage can sit next to adoption, activity, team, department, cost center, and whatever else the company already exports into a dashboard.\n\nThat is useful.\n\nIt is also exactly how a tool metric becomes a management metric.\n\nAnd once that happens, the question is no longer \"can we measure AI usage?\"\n\nThe question is \"what weird behavior will this metric create?\"\n\nI understand why this field exists.\n\nIf a company is paying for Copilot, especially with usage-based pieces attached to more expensive models and premium features, it needs some way to understand consumption. Platform teams need budget signals. Engineering leaders need adoption signals. 
Procurement needs something more concrete than \"people seem to like it.\" Finance will eventually ask why one org burns through credits much faster than another.\n\nThat is normal.\n\nThe problem starts when a consumption signal is treated as a productivity signal.\n\nHigh AI credit usage might mean a developer is doing valuable work with agent mode, code review, test generation, refactoring, or research. It might also mean the developer is stuck, repeatedly asking the model to solve the wrong problem, generating code that gets deleted, or using a heavyweight model where a small one would have been fine.\n\nLow AI credit usage might mean a developer does not need much help. It might mean the work is mostly design, review, debugging, incident response, mentoring, or architecture. It might mean the codebase is small and well understood. It might mean the developer is skeptical. It might also mean the developer has not learned the tool yet.\n\nThe number alone does not know.\n\nThat is the first trap.\n\nAI credits are not output.\n\nThey are input.\n\nSoftware has a long history of measuring the thing that is easiest to count and then pretending it represents the thing we actually care about.\n\nLines of code. Commits. Pull requests. Story points. Tickets closed. Test coverage percentage. Build count. Deploy count. Review comments. Meeting hours. Slack messages. Keyboard activity, if you work somewhere especially cursed.\n\nSome of those metrics are useful in context. None of them are engineering quality.\n\nLines of code are the classic example because everyone knows they are silly and people still accidentally reinvent them. A developer who deletes 3,000 lines of unnecessary code may have done the most valuable work of the quarter. A developer who adds 3,000 lines may have created six months of maintenance work.\n\nThe metric is not evil. 
The interpretation is.\n\nAI credits have the same smell.\n\nIf a team uses them to understand budget, adoption, and tool behavior, good. If a team uses them to ask why a workflow is expensive, also good. If a team uses them to decide whether a department needs training, maybe good.\n\nIf a manager starts asking why Alice used 10x more credits than Bob, or why Carol used almost none, without looking at the work, the code, the reviews, and the outcomes, we are back in lines-of-code land with better branding.\n\nThe most interesting AI work is not always the most visible AI work.\n\nA senior engineer might use Copilot heavily for one hour to explore three possible designs, then write the final change mostly by hand. Another engineer might spend an afternoon in agent mode producing a large pull request that reviewers reject because it missed a domain constraint. A third might use chat as a rubber duck during a tricky production incident and ship no code at all.\n\nWhich one was productive?\n\nThe credit number cannot answer that.\n\nThe credit number can tell you something was consumed.\n\nIt cannot tell you whether the work got better.\n\nThis distinction matters because AI tools make activity look very busy. Agents run commands. They edit files. They summarize. They retry. They generate tests. They open diffs. They can burn tokens while looking like they are making progress.\n\nSometimes they are.\n\nSometimes they are pacing around the same mistake with a nicer transcript.\n\nIf managers only see consumption, they will mistake motion for leverage.\n\nThe better question is not \"who used the most AI?\"\n\nThe better question is \"where did AI usage change the work in a way we can defend?\"\n\nDid review time go down without defects going up? Did boring migrations become cheaper? Did flaky dependency upgrades get less painful? Did junior engineers get better feedback earlier? Did senior engineers spend less time on boilerplate and more time on design? 
Did incidents resolve faster? Did the team ship maintainable changes with fewer abandoned branches?\n\nThose are harder questions.\n\nThat is why they are better.\n\nI do not want to sound like the answer is \"never measure this.\"\n\nPlease measure it.\n\nAI cost has to become visible. Otherwise teams will discover the bill after habits have already formed.\n\nIf a new coding-agent workflow costs $4 per successful dependency upgrade, that might be wonderful. If it costs $180 because the agent keeps running the full integration suite, calling the largest model, and regenerating the same patch, someone should notice. If one repository burns credits because its build is slow, its tests are noisy, or its instructions are bad, that is useful platform feedback.\n\nPer-user and per-team metrics can also reveal adoption gaps. Maybe one team is getting real value because it built good repository instructions and narrow workflows. Maybe another team is paying for seats nobody uses. Maybe a third team is using AI constantly but still rejecting most generated work.\n\nAll of that is worth knowing.\n\nBut the metric needs to stay attached to a workflow, not a moral judgment about the person.\n\nThe useful unit is often not \"Paulo used 1,200 credits.\"\n\nIt is \"the weekly dependency update workflow for service X used 1,200 credits, produced three pull requests, passed tests twice, needed one human rewrite, and saved roughly half a day of maintenance work.\"\n\nThat is an engineering conversation.\n\n\"Why did Paulo use 1,200 credits?\" is a trap unless you already know what he was doing.\n\nFor agentic coding, I would like credit usage to show up next to the rest of the evidence.\n\nNot as a leaderboard.\n\nAs a cost line in the work record.\n\nAn agent session should have an ID. It should link to the issue, branch, pull request, logs, tool calls, model choices, retries, test runs, and human approvals. Credit usage belongs there. 
It helps the team understand the actual cost of a workflow and compare it with the outcome.\n\nFor example:\n\nThat kind of measurement changes behavior in a good way. It pushes teams to design better workflows.\n\nThe bad version pushes teams to rank developers by how much AI they consumed.\n\nOne is platform engineering.\n\nThe other is cargo-cult management with an API.\n\nThe dangerous thing about metrics is that nobody has to announce the bad incentive.\n\nAt first, the dashboard is just informational. Then a leader asks why one team uses less Copilot than another. Then someone adds a target. Then managers start nudging people to \"adopt AI more.\" Then a developer leaves the model running more often because the organization has made usage feel like modernity.\n\nOr the incentive goes the other way.\n\nFinance notices high consumption. A manager starts asking people to justify AI use. Engineers stop using the tool for exploratory work because it looks expensive. The team saves credits and loses leverage.\n\nBoth failures come from the same mistake: treating usage as the goal.\n\nUsage is not the goal.\n\nBetter software is the goal.\n\nCheaper maintenance is the goal.\n\nFaster feedback is the goal.\n\nLess boring toil is the goal.\n\nMore reliable systems are the goal.\n\nIf AI credits help you understand those things, great. If they replace those things, you have built a productivity theater with nicer telemetry.\n\nIf I were responsible for an engineering org using Copilot broadly, I would still collect AI credit usage. I would just refuse to let it stand alone.\n\nI would join it with workflow outcomes:\n\nI would also look for places where high AI usage is a symptom.\n\nMaybe the documentation is bad. Maybe the test suite is too slow. Maybe the service boundaries are unclear. Maybe onboarding is painful. Maybe the agent keeps rereading the same files because the repo has no useful map. 
Maybe developers are using chat to compensate for architecture nobody understands.\n\nThat is the part I find interesting.\n\nAI credit usage may become a weird new observability signal for the developer experience itself.\n\nNot \"who is productive?\"\n\n\"Where is the work expensive to understand?\"\n\nThat is a much better question.\n\nGitHub exposing `ai_credits_used` is a reasonable product feature. Enterprises need budget visibility. Platform teams need consumption data. AI-assisted development cannot stay a mysterious line item forever.\n\nBut we should be honest about what the metric means.\n\nAI credits measure consumption. They do not measure judgment, maintainability, leverage, taste, review quality, incident response, mentoring, or whether the final system got simpler.\n\nSo use the number.\n\nJust do not worship it.\n\nThe teams that handle this well will treat AI credits like cloud cost: useful when tied to services, workflows, outcomes, and ownership.\n\nThe teams that handle it badly will reinvent lines of code, except this time the line goes through a model bill.", "url": "https://wpnews.pro/news/ai-credits-are-the-new-lines-of-code-metric", "canonical_source": "https://dev.to/pvgomes/ai-credits-are-the-new-lines-of-code-metric-4pgb", "published_at": "2026-06-21 00:01:45+00:00", "updated_at": "2026-06-21 00:06:27.254293+00:00", "lang": "en", "topics": ["developer-tools", "artificial-intelligence", "ai-products", "ai-infrastructure"], "entities": ["GitHub", "Copilot"], "alternates": {"html": "https://wpnews.pro/news/ai-credits-are-the-new-lines-of-code-metric", "markdown": "https://wpnews.pro/news/ai-credits-are-the-new-lines-of-code-metric.md", "text": "https://wpnews.pro/news/ai-credits-are-the-new-lines-of-code-metric.txt", "jsonld": "https://wpnews.pro/news/ai-credits-are-the-new-lines-of-code-metric.jsonld"}}