Built a small PR guardrail for token bloat, worth maintaining?

ContextLevy, a new open-source tool, now automatically comments on GitHub pull requests to flag files that could bloat context windows and increase costs for AI coding agents like Cursor and Claude Code. The tool scans diffs for generated code, lockfile churn, and build artifacts, estimating the token weight and classifying risk to prevent "repo-level context debt" from degrading agent performance. Available as a GitHub Action and npm CLI, ContextLevy operates without LLM calls or telemetry, using only pull request metadata to provide advisory comments before merge.

Bundle-size checks, but for AI agent context cost. ContextLevy comments on pull requests when a diff is likely to make coding agents slower, more expensive, or noisier to use.

| Before ContextLevy | After ContextLevy |
|---|---|
| A PR silently adds ~90k tokens of coverage, generated clients, and build output | Reviewers see exactly which files caused the bloat and what to remove |
| Lockfile churn dominates diffs with no agent-cost signal | ContextLevy flags lockfiles, estimates token weight, and suggests review focus |
| Agent instruction files change behavior without visibility | High-signal agent config changes appear in the PR thread |

| Use ContextLevy if… | Maybe skip it if… |
|---|---|
| Your team uses Cursor, Codex, or Claude Code heavily | Your repo rarely uses AI agents |
| PRs often include generated output or coverage artifacts | You already have strict artifact hygiene and pre-commit gates |
| You want advisory PR comments before merge | You need exact tokenizer-accurate billing from your provider |
| You care about repo-level context debt, not just session tuning | You only need per-session context packs |

See [docs/EXAMPLES.md](/unloopedmido/contextlevy/blob/main/docs/EXAMPLES.md) for benchmark tables, monorepo recipes, and output usage.

AI coding agents are powerful, but they are also extremely sensitive to noisy repository context. A single pull request can accidentally add:

- generated clients
- coverage reports
- build output
- lockfile churn
- snapshots
- huge logs
- vendored files
- agent instruction dumps
- compiled bundles

That may not break your app, but it can absolutely bloat every future AI-assisted coding session. ContextLevy catches that before it becomes repo debt. It scans pull request diffs, estimates added context weight, classifies risky files, and leaves a focused PR comment explaining what changed and what to clean up.
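As a rough illustration of the "estimates added context weight" step, here is a minimal Python sketch of the `simple` mode documented further down (`ceil(chars / 4)` on added diff lines, with `additions × 10` as the fallback when no patch is available). The function names are hypothetical and this is not ContextLevy's actual code. It assumes the patch is in the hunk-only format the GitHub API returns, and that the leading `+` marker and newlines are not counted toward the character total.

```python
import math
from typing import Optional


def added_lines(patch: str) -> list[str]:
    """Return the text of the lines a diff patch adds, without the leading '+'."""
    return [line[1:] for line in patch.splitlines() if line.startswith("+")]


def estimate_tokens(patch: Optional[str], additions: int = 0) -> int:
    """Estimate net-new tokens for one changed file (hypothetical helper)."""
    if patch is None:
        # Fallback described in the README: no patch available -> additions x 10.
        return additions * 10
    chars = sum(len(line) for line in added_lines(patch))
    # "simple" mode: roughly four characters per token, rounded up.
    return math.ceil(chars / 4)


patch = (
    "@@ -1,2 +1,3 @@\n"
    " context\n"
    "-old line\n"
    "+const answer = 42;\n"
    "+export default answer;\n"
)
print(estimate_tokens(patch))                 # 40 added characters -> 10
print(estimate_tokens(None, additions=1200))  # no patch -> 12000
```

Removed and context lines are ignored on purpose, since only added text becomes new context for an agent.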
See [docs/COMPARISON.md](/unloopedmido/contextlevy/blob/main/docs/COMPARISON.md) for how ContextLevy compares to bundle tools, [ctx](https://github.com/forjd/ctx), and agent session tools.

| Risk | Examples | Why it matters |
|---|---|---|
| Generated code | `generated/client.ts`, `schema.graphql`, SDK output | Often huge, repetitive, and better regenerated locally |
| Coverage output | `coverage/lcov.info`, `htmlcov/` | High token cost with almost zero agent value |
| Build artifacts | `dist/`, `build/`, `.next/`, compiled bundles | Frequently duplicated from source |
| Logs and dumps | `*.log`, traces, debug output | Noisy context that agents over-read |
| Lockfile churn | `package-lock.json`, `pnpm-lock.yaml`, `yarn.lock` | Can dominate diffs in dependency PRs |
| Snapshots | `__snapshots__/`, large fixture files | Useful sometimes, expensive always |
| Agent files | `.agents/`, `AGENTS.md`, instruction packs | Can silently steer future agent behavior |

ContextLevy is intentionally boring:

- No LLM calls
- No code upload
- No external analysis service
- No telemetry required

It only uses GitHub pull request metadata and diff patches available inside the workflow. Token and cost numbers are estimates, not billing-grade accounting.

ContextLevy is available as a GitHub Action and an npm CLI. Choose one setup path:

Best comment attribution and permissions. No repository secrets required.

Install the [ContextLevy GitHub App](https://github.com/apps/contextlevy) on your repository. Grant these repository permissions when prompted:

| Permission | Access |
|---|---|
| Contents | Read |
| Pull requests | Read & write |
| Issues | Read & write |

The published app posts PR comments with its own identity. You do not need to add app credentials as repository secrets or variables. After changing app permissions, accept the updated installation request on the repository.
Create `.github/workflows/contextlevy.yml`:

```yaml
name: ContextLevy

on:
  pull_request:
    types: [opened, synchronize, reopened]

permissions:
  contents: read
  pull-requests: write
  issues: write

jobs:
  contextlevy:
    name: Check AI context cost
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: unloopedmido/contextlevy@v2
        with:
          github-token: ${{ github.token }}
```

That is the full setup. ContextLevy reads your PR diff, estimates context weight, and comments when thresholds are exceeded.

Works for many internal PRs without installing the app. Fork PRs may be read-only; see [Fork pull requests](#fork-pull-requests).

```yaml
permissions:
  contents: read
  pull-requests: write
  issues: write

steps:
  - uses: actions/checkout@v4
  - uses: unloopedmido/contextlevy@v2
    with:
      github-token: ${{ github.token }}
```

Maintainers and contributors only: To test with a self-hosted GitHub App in a private fork, see CONTRIBUTING.md (Self-hosted GitHub App) for `CONTEXTLEVY_APP_ID` and `CONTEXTLEVY_APP_PRIVATE_KEY` setup. End users should use the published app linked above.

Install from npm or build from source:

```bash
npm install -g contextlevy
contextlevy diff --base main
contextlevy diff --base origin/main --format json --fail-on-config
```

From a clone:

```bash
npm install && npm run build:cli
contextlevy diff --base main
```

See [docs/CLI.md](/unloopedmido/contextlevy/blob/main/docs/CLI.md) for flags, exit codes, and pre-push hook recipes.

Teach coding agents how to set up and use ContextLevy:

```bash
npx skills add unloopedmido/contextlevy --skill contextlevy
```

Skill source: [.agents/skills/contextlevy/SKILL.md](/unloopedmido/contextlevy/blob/main/.agents/skills/contextlevy/SKILL.md)

ContextLevy reads all analysis and comment options from a config file in the repository. Add a config file once and the workflow YAML stays minimal.

On pull requests, ContextLevy reads configuration from the base branch version of the repository. A PR cannot silence the check by changing `.contextlevy.yml` in the same diff.
Supported config paths, in priority order:

1. `.contextlevy.yml`
2. `.contextlevy.yaml`
3. `.contextlevy.json`
4. `.github/contextlevy.yml`
5. `.github/contextlevy.yaml`
6. `.github/contextlevy.json`
7. `contextlevy.yml`
8. `contextlevy.yaml`
9. `contextlevy.json`

If no config file is found, ContextLevy uses built-in defaults.

Enable editor autocomplete with the published JSON Schema:

```yaml
# yaml-language-server: $schema=./docs/schema/contextlevy.schema.json
token-threshold: 1000
```

Schema file: [docs/schema/contextlevy.schema.json](/unloopedmido/contextlevy/blob/main/docs/schema/contextlevy.schema.json)

Example `.contextlevy.yml`:

```yaml
token-threshold: 1000
large-file-token-threshold: 5000
max-high-impact-items: 5
show-cost-table: true
comment-format: default
ignore-paths:
  - vendor/
  - "**/*.map"
fail-on-severity: high
custom-rules:
  - name: generated-supabase-types
    paths:
      - "supabase/types.ts"
      - "src/database/generated/**"
    category: generated
    label: Generated Supabase types are usually low-value agent context.
    suggestion: Regenerate locally unless this repo intentionally tracks generated DB types.

estimation-mode: simple
severity-thresholds:
  medium-tokens: 5000
  high-tokens: 20000
  critical-tokens: 100000
pricing-profiles:
  - name: GPT-5.5
    inputCostPerMillion: 5.0
  - name: Opus 4.7
    inputCostPerMillion: 5.0
  - name: Team Gateway
    inputCostPerMillion: 1.75
```

Keys support both kebab-case and camelCase:

```yaml
token-threshold: 1000
tokenThreshold: 1000
```

| Key | Default | Description |
|---|---|---|
| `token-threshold` | `1000` | Skip commenting below this estimated token total |
| `large-file-token-threshold` | `5000` | Mark individual files as large context risks |
| `max-high-impact-items` | `5` | Max files shown in the high-impact table |
| `show-cost-table` | `true` | Include estimated model input costs |
| `comment-format` | `default` | `default` or `compact` |
| `ignore-paths` | | Glob patterns excluded from analysis entirely |
| `allow-paths` | | Glob patterns counted but not flagged as high-impact |
| `fail-on-severity` | unset | Fail workflow at `low` / `medium` / `high` / `critical` or above |
| `fail-above-tokens` | unset | Fail workflow when estimated tokens exceed this value |
| `estimation-mode` | `simple` | `simple` (`ceil(chars / 4)`) or `tokenizer` (local BPE, no network) |
| `custom-rules` | | Project-specific path rules (see example above) |
| `severity-thresholds` | built-in defaults | Override token/high-impact counts for Low/Medium/High/Critical |
| `pricing-profiles` | built-in defaults | Array of `{ name, inputCostPerMillion }` objects |

When `fail-on-severity` or `fail-above-tokens` is set, ContextLevy fails the workflow if thresholds are exceeded. Fail mode runs even when the PR comment is skipped, for example when estimated tokens are below `token-threshold`. Analysis and fail checks always run; `token-threshold` only controls whether a comment is posted.

The action accepts authentication inputs only. All behavior tuning belongs in the config file.
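To make the comment-versus-fail behaviour described above concrete, here is a small Python sketch. It is a hypothetical model, not ContextLevy's implementation: severity is derived from token counts only (the real tool also considers high-impact file counts), the threshold values are copied from the example config rather than confirmed built-in defaults, and whether each boundary is inclusive is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

SEVERITY_ORDER = ["low", "medium", "high", "critical"]


@dataclass
class Config:
    token_threshold: int = 1000
    fail_on_severity: Optional[str] = None
    fail_above_tokens: Optional[int] = None
    # Values from the example config above, not confirmed defaults.
    medium_tokens: int = 5000
    high_tokens: int = 20000
    critical_tokens: int = 100000


def severity_for(tokens: int, cfg: Config) -> str:
    """Token-only severity (assumption: thresholds are inclusive)."""
    if tokens >= cfg.critical_tokens:
        return "critical"
    if tokens >= cfg.high_tokens:
        return "high"
    if tokens >= cfg.medium_tokens:
        return "medium"
    return "low"


def decide(tokens: int, cfg: Config) -> dict:
    """Comment and fail are evaluated independently of each other."""
    severity = severity_for(tokens, cfg)
    fail = False
    if cfg.fail_on_severity is not None:
        # "at <level> or above"
        fail = SEVERITY_ORDER.index(severity) >= SEVERITY_ORDER.index(cfg.fail_on_severity)
    if cfg.fail_above_tokens is not None and tokens > cfg.fail_above_tokens:
        fail = True
    return {
        "severity": severity,
        "comment": tokens >= cfg.token_threshold,
        "fail": fail,
    }


# Below token-threshold, so no comment, yet the fail check still triggers.
print(decide(800, Config(fail_above_tokens=500)))
```

The point of the last call is that a quiet PR thread does not imply a passing check: `comment` is `False` while `fail` is `True`.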
| Input | Default | Description |
|---|---|---|
| `github-token` | `GITHUB_TOKEN` env | Fallback token for reading PR files and writing comments |
| `app-client-id` | `CONTEXTLEVY_APP_ID` / `CONTEXTLEVY_APP_CLIENT_ID` env | Numeric GitHub App ID |
| `app-private-key` | `CONTEXTLEVY_APP_PRIVATE_KEY` env | GitHub App private key PEM |
| `app-installation-id` | `CONTEXTLEVY_APP_INSTALLATION_ID` env | Optional GitHub App installation ID override |

Auth credentials should stay in GitHub secrets or variables. Do not put private keys in `.contextlevy.yml`.

Use these in downstream workflow steps:

| Output | Type | Example | Description |
|---|---|---|---|
| `total-estimated-tokens` | integer string | `"37891"` | Total estimated net-new context tokens |
| `analyzed-file-count` | integer string | `"12"` | Changed files included in the estimate |
| `token-source` | string | `"app"` | Auth source: `app`, `github-token`, or `GITHUB_TOKEN` |
| `estimation-mode` | string | `"simple"` | Estimation mode used: `simple` or `tokenizer` |

```yaml
- id: contextlevy
  uses: unloopedmido/contextlevy@v2
- if: ${{ steps.contextlevy.outputs.total-estimated-tokens > 50000 }}
  run: echo "Context cost too high"
```

ContextLevy also writes a job summary with risk level and top findings for every run.

Best for most repositories. Includes:

- severity
- estimated token delta
- high-impact files
- file classifications
- optional cost table
- cleanup suggestions

```yaml
comment-format: default
```

Best for busy repos that want a smaller PR footprint. Usually 3–4 lines:

```yaml
comment-format: compact
```

Example:

```text
🤖 ContextLevy · ⚠️ High · ~42.1k tokens
+31.4k coverage/lcov.info · +8.2k dist/index.js · +2.5k generated/client.ts
~$0.02–$0.12/session est. input · Add coverage/ and dist/ to .gitignore
```

Default pricing profiles are illustrative and may drift as model prices change. For accurate internal estimates, configure your own `pricing-profiles`.
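For a sense of the arithmetic behind per-session cost figures, here is a minimal Python sketch. It assumes the point estimate is simply `tokens / 1,000,000 × inputCostPerMillion`, and that the range is that value widened by the ±50% the README mentions for cost tables. How ContextLevy actually combines profiles into a range is not documented here, so these numbers are not expected to reproduce the compact example above. The function names are hypothetical.

```python
def input_cost(tokens: int, cost_per_million: float) -> float:
    """Point estimate: cost of reading `tokens` input tokens once."""
    return tokens / 1_000_000 * cost_per_million


def cost_range(tokens: int, cost_per_million: float, spread: float = 0.5):
    """Widen the point estimate by +/- spread (assumed 50%)."""
    mid = input_cost(tokens, cost_per_million)
    return mid * (1 - spread), mid * (1 + spread)


# Profile values borrowed from the example config in this README.
profiles = [
    {"name": "GPT-5.5", "inputCostPerMillion": 5.0},
    {"name": "Team Gateway", "inputCostPerMillion": 1.75},
]

tokens = 40_000
for profile in profiles:
    low, high = cost_range(tokens, profile["inputCostPerMillion"])
    print(f"{profile['name']}: ${low:.3f} to ${high:.3f} per session")
```

This treats a "session" as one full read of the added tokens, which is itself an assumption; agents may read less, or re-read with cached-token pricing.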
When `pricing-profiles` is omitted, ContextLevy estimates worst-case input cost using:

| Profile | Input cost / 1M tokens |
|---|---|
| GPT-5.5 | $5.00 |
| Opus 4.7 | $5.00 |
| Gemini 3.1 Pro | $2.00 |
| Kimi K2.6 | $0.95 |

Hide the cost table in your config file:

```yaml
show-cost-table: false
```

Override pricing profiles:

```yaml
pricing-profiles:
  - name: Local 70B
    inputCostPerMillion: 0.2
  - name: Team Gateway
    inputCostPerMillion: 1.75
```

ContextLevy supports two local estimation modes (no LLM calls, no network):

| Mode | Method | Best for |
|---|---|---|
| `simple` (default) | `ceil(chars / 4)` on added diff lines | Fast warnings, CI everywhere |
| `tokenizer` | `cl100k_base` BPE token count on added diff text | Closer to GPT-family token counts |

Process:

- List files changed in the pull request.
- Read added diff lines from each patch.
- Estimate tokens using the configured mode.
- If no patch is available, fall back to `additions × 10`.
- Classify risky paths with built-in rules plus optional `custom-rules`.

This is intentionally approximate. Different models tokenize differently, agents may not read every changed file, and cached-token pricing varies by provider. Cost tables show ±50% ranges. Treat the output as a practical warning signal, not an invoice.

| Severity | Meaning |
|---|---|
| Low | Small context increase, usually safe |
| Medium | Worth reviewing, especially in agent-heavy repos |
| High | Likely to affect AI coding sessions |
| Critical | Very large diff or obvious repo-noise artifact |

Override thresholds in config:

```yaml
severity-thresholds:
  medium-tokens: 5000
  high-tokens: 20000
  critical-tokens: 100000
  medium-high-impact-count: 1
  high-high-impact-count: 3
  critical-high-impact-count: 8
```

```yaml
token-threshold: 5000
max-high-impact-items: 3
comment-format: compact
estimation-mode: tokenizer
custom-rules:
  - paths:
      - "packages/api/src/generated/**"
    category: generated
    label: Generated API clients add repetitive agent context.
    suggestion: Regenerate locally during development.

show-cost-table: false
pricing-profiles:
  - name: Internal Gateway
    inputCostPerMillion: 1.25
  - name: Local Inference
    inputCostPerMillion: 0.05
```

ContextLevy is most useful when paired with normal repository hygiene. Common `.gitignore` additions:

```gitignore
coverage/
htmlcov/
dist/
build/
.next/
.cache/
*.log
```

Generated files may still belong in version control depending on your language, package manager, or deployment setup. ContextLevy does not block PRs by default; it gives reviewers a focused warning.

Your workflow token or GitHub App probably does not have enough permissions to create or update PR comments. Check:

```yaml
permissions:
  contents: read
  pull-requests: write
  issues: write
```

If you use the GitHub App, confirm the installation has:

- Contents: read
- Pull requests: read & write
- Issues: read & write

For pull requests from forks, GitHub may still provide a read-only workflow token. In that case ContextLevy logs a warning, keeps the action successful, still exposes analysis outputs, and writes a job summary, but may not post a PR comment. Install the GitHub App when your organization policy allows it for more reliable fork PR comments. See [SECURITY.md: Fork pull requests](/unloopedmido/contextlevy/blob/main/SECURITY.md#fork-pull-requests) for permission details.

Make sure the secret contains the GitHub App private key PEM. It should look like this:

```text
-----BEGIN RSA PRIVATE KEY-----
...
-----END RSA PRIVATE KEY-----
```

Do not use the app Client Secret.

ContextLevy skips comments below `token-threshold`. Fail mode (`fail-on-severity`, `fail-above-tokens`) still runs in that case; a skipped comment does not mean the check was skipped. Lower the threshold while testing:

```yaml
token-threshold: 0
```

That usually means the PR added large generated files, coverage output, build artifacts, or lockfile churn. If the files are intentional, either ignore the warning or raise your thresholds.
Install dependencies:

```bash
npm install
```

Run tests:

```bash
npm test
```

Build the action bundle and CLI:

```bash
npm run build          # both
npm run build:action   # GitHub Action only → dist/index.js
npm run build:cli      # local CLI only → lib/
```

Commit `dist/index.js` after building the action so workflow consumers do not need to install runtime dependencies. The CLI (`lib/`) is built automatically on `npm publish` via `prepack`.

Verify the npm tarball before publishing:

```bash
npm run pack:check
```

Releases are automated when a version bump lands on `main`. The [release workflow](/unloopedmido/contextlevy/blob/main/.github/workflows/release.yml) detects a `package.json` version change, runs tests, verifies `dist/`, creates a GitHub Release, pushes the semver tag, publishes the CLI to npm via [trusted publishing](https://docs.npmjs.com/trusted-publishers) (OIDC), and updates the major tag.

Do not push semver tags manually. Bump the version in `package.json`, `package-lock.json`, and `CHANGELOG.md`, push to `main`, and CI handles the tag, GitHub Release, and npm publish.

On [npmjs.com](https://www.npmjs.com/package/contextlevy) → Package settings → Trusted publishing, configure GitHub Actions with repository `unloopedmido/contextlevy` and workflow filename `release.yml`. No `NPM_TOKEN` secret is required.

If npm publish fails after a version bump, re-run the Release workflow from the Actions tab (`workflow_dispatch`) once the package is missing on npm; it will retry without another version bump.

Example release sequence:

```bash
# After updating package.json, package-lock.json, and CHANGELOG.md
git push origin main
```

The workflow updates the major-version tag (`v2`) automatically.

Before trusted publishing is configured, publish the CLI once from a clean checkout:

```bash
npm ci
npm run pack:check
npm publish --access public
```

Then add the trusted publisher on npmjs.com as described above. Later version bumps on `main` publish automatically via OIDC.
Consumers should usually pin:

```yaml
- uses: unloopedmido/contextlevy@v2
```

For maximum supply-chain safety, consumers can pin a full commit SHA.

ContextLevy is a pull request analysis tool. It does not execute changed code and does not send repository contents to an LLM or third-party API.

Please report security issues privately through GitHub Security Advisories instead of opening a public issue.

MIT