# Built a small PR guardrail for token bloat, worth maintaining?

> Source: <https://github.com/unloopedmido/contextlevy>
> Published: 2026-05-24 23:31:25+00:00

**Bundle-size checks, but for AI agent context cost.**

ContextLevy comments on pull requests when a diff is likely to make coding agents slower, more expensive, or noisier to use.

| Before ContextLevy | After ContextLevy |
|---|---|
| A PR silently adds ~90k tokens of coverage, generated clients, and build output | Reviewers see exactly which files caused the bloat and what to remove |
| Lockfile churn dominates diffs with no agent-cost signal | ContextLevy flags lockfiles, estimates token weight, and suggests review focus |
| Agent instruction files change behavior without visibility | High-signal agent config changes appear in the PR thread |

| Use ContextLevy if… | Maybe skip it if… |
|---|---|
| Your team uses Cursor, Codex, or Claude Code heavily | Your repo rarely uses AI agents |
| PRs often include generated output or coverage artifacts | You already have strict artifact hygiene and pre-commit gates |
| You want advisory PR comments before merge | You need exact tokenizer-accurate billing from your provider |
| You care about repo-level context debt, not just session tuning | You only need per-session context packs |

See [docs/EXAMPLES.md](/unloopedmido/contextlevy/blob/main/docs/EXAMPLES.md) for benchmark tables, monorepo recipes, and output usage.

AI coding agents are powerful, but they are also extremely sensitive to noisy repository context.

A single pull request can accidentally add:

- generated clients
- coverage reports
- build output
- lockfile churn
- snapshots
- huge logs
- vendored files
- agent instruction dumps
- compiled bundles

That may not break your app, but it can absolutely bloat every future AI-assisted coding session.

**ContextLevy catches that before it becomes repo debt.**

It scans pull request diffs, estimates added context weight, classifies risky files, and leaves a focused PR comment explaining what changed and what to clean up.

See [docs/COMPARISON.md](/unloopedmido/contextlevy/blob/main/docs/COMPARISON.md) for how ContextLevy compares to bundle tools, [ctx](https://github.com/forjd/ctx), and agent session tools.

| Risk | Examples | Why it matters |
|---|---|---|
| Generated code | `generated/client.ts`, `schema.graphql`, SDK output | Often huge, repetitive, and better regenerated locally |
| Coverage output | `coverage/lcov.info`, `htmlcov/` | High token cost with almost zero agent value |
| Build artifacts | `dist/`, `build/`, `.next/`, compiled bundles | Frequently duplicated from source |
| Logs and dumps | `*.log`, traces, debug output | Noisy context that agents over-read |
| Lockfile churn | `package-lock.json`, `pnpm-lock.yaml`, `yarn.lock` | Can dominate diffs in dependency PRs |
| Snapshots | `__snapshots__/`, large fixture files | Useful sometimes, expensive always |
| Agent files | `.agents/`, `AGENTS.md`, instruction packs | Can silently steer future agent behavior |
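
As a rough illustration of how path-based classification like the table above could work, here is a minimal sketch. The rule list, the regular expressions, and most category names are assumptions made for this example (only `generated` appears as a category elsewhere in this README); ContextLevy's real built-in rules may differ.

```typescript
// Hypothetical category names; only "generated" is taken from this README.
type Category =
  | "generated" | "coverage" | "build" | "logs"
  | "lockfile" | "snapshot" | "agent";

// Illustrative rules derived from the example paths in the table.
// The first matching rule wins.
const RULES: ReadonlyArray<{ category: Category; pattern: RegExp }> = [
  { category: "coverage", pattern: /(^|\/)(coverage|htmlcov)\// },
  { category: "build", pattern: /(^|\/)(dist|build|\.next)\// },
  { category: "lockfile", pattern: /(^|\/)(package-lock\.json|pnpm-lock\.yaml|yarn\.lock)$/ },
  { category: "snapshot", pattern: /(^|\/)__snapshots__\// },
  { category: "agent", pattern: /(^|\/)(\.agents\/|AGENTS\.md$)/ },
  { category: "logs", pattern: /\.log$/ },
  { category: "generated", pattern: /(^|\/)generated\// },
];

// Returns the first matching category, or null for ordinary source files.
function classifyPath(path: string): Category | null {
  for (const rule of RULES) {
    if (rule.pattern.test(path)) return rule.category;
  }
  return null;
}

console.log(classifyPath("coverage/lcov.info"));         // prints coverage
console.log(classifyPath("packages/web/dist/index.js")); // prints build
console.log(classifyPath("src/index.ts"));               // prints null
```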

ContextLevy is intentionally boring:

- **No LLM calls**
- **No code upload**
- **No external analysis service**
- **No telemetry required**

It only uses GitHub pull request metadata and diff patches available inside the workflow.

Token and cost numbers are estimates, not billing-grade accounting.

ContextLevy is available as a **GitHub Action** and an **npm CLI**. Choose one setup path:

Best comment attribution and permissions. No repository secrets required.

Install the [ContextLevy GitHub App](https://github.com/apps/contextlevy) on your repository.

Grant these repository permissions when prompted:

| Permission | Access |
|---|---|
| Contents | Read |
| Pull requests | Read & write |
| Issues | Read & write |

The published app posts PR comments with its own identity. You do **not** need to add app credentials as repository secrets or variables.

After changing app permissions, accept the updated installation request on the repository.

Create `.github/workflows/contextlevy.yml`:

```
name: ContextLevy

on:
  pull_request:
    types: [opened, synchronize, reopened]

permissions:
  contents: read
  pull-requests: write
  issues: write

jobs:
  contextlevy:
    name: Check AI context cost
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - uses: unloopedmido/contextlevy@v2
        with:
          github-token: ${{ github.token }}
```

That is the full setup. ContextLevy reads your PR diff, estimates context weight, and comments when thresholds are exceeded.

Works for many internal PRs without installing the app. Fork PRs may be read-only — see [Fork pull requests](#fork-pull-requests).

```
permissions:
  contents: read
  pull-requests: write
  issues: write

steps:
  - uses: actions/checkout@v4
  - uses: unloopedmido/contextlevy@v2
    with:
      github-token: ${{ github.token }}
```

Maintainers and contributors only: To test with a self-hosted GitHub App in a private fork, see the Self-hosted GitHub App section of CONTRIBUTING.md for `CONTEXTLEVY_APP_ID` and `CONTEXTLEVY_APP_PRIVATE_KEY` setup. End users should use the published app linked above.

Install from npm or build from source:

```
npm install -g contextlevy
contextlevy diff --base main
contextlevy diff --base origin/main --format json --fail-on-config
```

From a clone:

```
npm install && npm run build:cli
contextlevy diff --base main
```

See [docs/CLI.md](/unloopedmido/contextlevy/blob/main/docs/CLI.md) for flags, exit codes, and pre-push hook recipes.

Teach coding agents how to set up and use ContextLevy:

```
npx skills add unloopedmido/contextlevy --skill contextlevy
```

Skill source: [.agents/skills/contextlevy/SKILL.md](/unloopedmido/contextlevy/blob/main/.agents/skills/contextlevy/SKILL.md)

ContextLevy reads all analysis and comment options from a config file in the repository. Add a config file once — workflow YAML stays minimal.

On pull requests, ContextLevy reads configuration from the base branch version of the repository. A PR cannot silence the check by changing `.contextlevy.yml` in the same diff.

Supported config paths, in priority order:

1. `.contextlevy.yml`
2. `.contextlevy.yaml`
3. `.contextlevy.json`
4. `.github/contextlevy.yml`
5. `.github/contextlevy.yaml`
6. `.github/contextlevy.json`
7. `contextlevy.yml`
8. `contextlevy.yaml`
9. `contextlevy.json`

If no config file is found, ContextLevy uses built-in defaults.
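
The lookup described above can be sketched as a first-match search over the candidate paths. This is only an illustration: `resolveConfigPath` and its `exists` callback are hypothetical names, not part of ContextLevy's API.

```typescript
// Candidate config paths, in the priority order listed above.
const CONFIG_PATHS: readonly string[] = [
  ".contextlevy.yml",
  ".contextlevy.yaml",
  ".contextlevy.json",
  ".github/contextlevy.yml",
  ".github/contextlevy.yaml",
  ".github/contextlevy.json",
  "contextlevy.yml",
  "contextlevy.yaml",
  "contextlevy.json",
];

// Hypothetical helper: returns the first candidate that exists, or null
// when none does (the caller would then use built-in defaults).
function resolveConfigPath(exists: (path: string) => boolean): string | null {
  for (const candidate of CONFIG_PATHS) {
    if (exists(candidate)) return candidate;
  }
  return null;
}

// Usage: a repo that has both a root-level and a .github config file.
const repoFiles = new Set(["contextlevy.json", ".github/contextlevy.yml"]);
console.log(resolveConfigPath((p) => repoFiles.has(p))); // prints .github/contextlevy.yml
console.log(resolveConfigPath(() => false));             // prints null
```

Passing the existence check in as a callback keeps the sketch independent of whether files are read from disk or from the base branch through the GitHub API.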

Enable editor autocomplete with the published JSON Schema:

```yaml
# yaml-language-server: $schema=./docs/schema/contextlevy.schema.json
token-threshold: 1000
```

Schema file: [docs/schema/contextlevy.schema.json](/unloopedmido/contextlevy/blob/main/docs/schema/contextlevy.schema.json)

Example `.contextlevy.yml`:

```
token-threshold: 1000
large-file-token-threshold: 5000
max-high-impact-items: 5
show-cost-table: true
comment-format: default

ignore-paths:
  - vendor/**
  - "**/*.map"

fail-on-severity: high

custom-rules:
  - name: generated-supabase-types
    paths:
      - "supabase/types.ts"
      - "src/database/generated/**"
    category: generated
    label: Generated Supabase types are usually low-value agent context.
    suggestion: Regenerate locally unless this repo intentionally tracks generated DB types.

estimation-mode: simple

severity-thresholds:
  medium-tokens: 5000
  high-tokens: 20000
  critical-tokens: 100000

pricing-profiles:
  - name: GPT-5.5
    inputCostPerMillion: 5.0
  - name: Opus 4.7
    inputCostPerMillion: 5.0
  - name: Team Gateway
    inputCostPerMillion: 1.75
```

Keys support both kebab-case and camelCase:

```
token-threshold: 1000
tokenThreshold: 1000
```
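
One way to accept both spellings is to normalize keys to camelCase before reading them. The sketch below is an assumption about how that could be done, not ContextLevy's actual loader; it only handles top-level keys, and if both spellings are present the later one wins.

```typescript
// Hypothetical normalizer: "token-threshold" becomes "tokenThreshold".
function toCamelCase(key: string): string {
  return key.replace(/-([a-z0-9])/g, (_match: string, ch: string) => ch.toUpperCase());
}

// Shallow normalization of top-level config keys (nested objects are left as-is).
function normalizeKeys(raw: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(raw)) {
    out[toCamelCase(key)] = value;
  }
  return out;
}

const normalized = normalizeKeys({ "token-threshold": 1000, showCostTable: false });
console.log(JSON.stringify(normalized)); // prints {"tokenThreshold":1000,"showCostTable":false}
```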

| Key | Default | Description |
|---|---|---|
| `token-threshold` | `1000` | Skip commenting below this estimated token total |
| `large-file-token-threshold` | `5000` | Mark individual files as large context risks |
| `max-high-impact-items` | `5` | Max files shown in the high-impact table |
| `show-cost-table` | `true` | Include estimated model input costs |
| `comment-format` | `default` | `default` or `compact` |
| `ignore-paths` | `[]` | Glob patterns excluded from analysis entirely |
| `allow-paths` | `[]` | Glob patterns counted but not flagged as high-impact |
| `fail-on-severity` | unset | Fail workflow at `low` / `medium` / `high` / `critical` or above |
| `fail-above-tokens` | unset | Fail workflow when estimated tokens exceed this value |
| `estimation-mode` | `simple` | `simple` (`ceil(chars / 4)`) or `tokenizer` (local BPE, no network) |
| `custom-rules` | `[]` | Project-specific path rules (see example above) |
| `severity-thresholds` | built-in defaults | Override token/high-impact counts for Low/Medium/High/Critical |
| `pricing-profiles` | built-in defaults | Array of `{ name, inputCostPerMillion }` objects |

When `fail-on-severity` or `fail-above-tokens` is set, ContextLevy fails the workflow if thresholds are exceeded. **Fail mode runs even when the PR comment is skipped**, for example when estimated tokens are below `token-threshold`. Analysis and fail checks always run; `token-threshold` only controls whether a comment is posted.
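
The separation between commenting and failing can be sketched as two independent decisions. The function and type names here are hypothetical, and the exact boundary comparisons (`>=` versus `>`) are assumptions read off the key descriptions in the table.

```typescript
type Severity = "low" | "medium" | "high" | "critical";
const SEVERITY_ORDER: readonly Severity[] = ["low", "medium", "high", "critical"];

interface GateConfig {
  tokenThreshold: number;    // comments are skipped below this total
  failOnSeverity?: Severity; // fail at this level or above
  failAboveTokens?: number;  // fail when tokens exceed this value
}

interface GateResult {
  postComment: boolean;
  failWorkflow: boolean;
}

// The comment decision and the fail decision do not depend on each other.
function evaluateGates(tokens: number, severity: Severity, config: GateConfig): GateResult {
  const postComment = tokens >= config.tokenThreshold;
  const severityFails =
    config.failOnSeverity !== undefined &&
    SEVERITY_ORDER.indexOf(severity) >= SEVERITY_ORDER.indexOf(config.failOnSeverity);
  const tokensFail =
    config.failAboveTokens !== undefined && tokens > config.failAboveTokens;
  return { postComment, failWorkflow: severityFails || tokensFail };
}

// Below the comment threshold, yet the severity gate still fails the run:
// postComment is false, failWorkflow is true.
console.log(evaluateGates(800, "high", { tokenThreshold: 1000, failOnSeverity: "high" }));
```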

The action accepts **authentication inputs only**. All behavior tuning belongs in the config file.

| Input | Default | Description |
|---|---|---|
| `github-token` | `GITHUB_TOKEN` env | Fallback token for reading PR files and writing comments |
| `app-client-id` | `CONTEXTLEVY_APP_ID` / `CONTEXTLEVY_APP_CLIENT_ID` env | Numeric GitHub App ID |
| `app-private-key` | `CONTEXTLEVY_APP_PRIVATE_KEY` env | GitHub App private key PEM |
| `app-installation-id` | `CONTEXTLEVY_APP_INSTALLATION_ID` env | Optional GitHub App installation ID override |

Auth credentials should stay in GitHub secrets or variables. Do not put private keys in `.contextlevy.yml`.

Use these in downstream workflow steps:

| Output | Type | Example | Description |
|---|---|---|---|
| `total-estimated-tokens` | integer string | `"37891"` | Total estimated net-new context tokens |
| `analyzed-file-count` | integer string | `"12"` | Changed files included in the estimate |
| `token-source` | string | `"app"` | Auth source: `app`, `github-token`, or `GITHUB_TOKEN` |
| `estimation-mode` | string | `"simple"` | Estimation mode used: `simple` or `tokenizer` |

```
- id: contextlevy
  uses: unloopedmido/contextlevy@v2

- if: ${{ steps.contextlevy.outputs.total-estimated-tokens > 50000 }}
  run: echo "Context cost too high"
```

ContextLevy also writes a **job summary** with risk level and top findings for every run.

The `default` comment format is best for most repositories.

Includes:

- severity
- estimated token delta
- high-impact files
- file classifications
- optional cost table
- cleanup suggestions

```
comment-format: default
```

The `compact` comment format is best for busy repos that want a smaller PR footprint. Comments are usually 3–4 lines:

```
comment-format: compact
```

Example:

```
🤖 ContextLevy · ⚠️ High · ~42.1k tokens
+31.4k coverage/lcov.info · +8.2k dist/index.js · +2.5k generated/client.ts
~$0.02–$0.12/session est. input · Add coverage/ and dist/ to .gitignore
```

Default pricing profiles are **illustrative** and may drift as model prices change. For accurate internal estimates, configure your own `pricing-profiles`.

When `pricing-profiles` is omitted, ContextLevy estimates worst-case input cost using:

| Profile | Input cost / 1M tokens |
|---|---|
| GPT-5.5 | `$5.00` |
| Opus 4.7 | `$5.00` |
| Gemini 3.1 Pro | `$2.00` |
| Kimi K2.6 | `$0.95` |

Hide the cost table in your config file:

```
show-cost-table: false
```

Override pricing profiles:

```
pricing-profiles:
  - name: Local 70B
    inputCostPerMillion: 0.2
  - name: Team Gateway
    inputCostPerMillion: 1.75
```

ContextLevy supports two local estimation modes (no LLM calls, no network):

| Mode | Method | Best for |
|---|---|---|
| `simple` (default) | `ceil(chars / 4)` on added diff lines | Fast warnings, CI everywhere |
| `tokenizer` | `cl100k_base` BPE token count on added diff text | Closer to GPT-family token counts |

Process:

- List files changed in the pull request.
- Read added diff lines from each patch.
- Estimate tokens using the configured mode.
- If no patch is available, fall back to `additions × 10`.
- Classify risky paths with built-in rules plus optional `custom-rules`.
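
The `simple` mode and the no-patch fallback can be sketched as follows. This is an illustration under stated assumptions, not ContextLevy's source: the `ChangedFile` shape and function names are hypothetical, the patch is assumed to contain hunks only (no `+++` file header), and counting the newline between added lines as a character is a guess.

```typescript
// Hypothetical shape of one changed file in a pull request.
interface ChangedFile {
  filename: string;
  additions: number; // number of added lines reported for the file
  patch?: string;    // diff hunks; may be missing for very large or binary files
}

// Keep only added lines (those starting with "+") and drop the marker.
function addedText(patch: string): string {
  return patch
    .split("\n")
    .filter((line) => line.startsWith("+"))
    .map((line) => line.slice(1))
    .join("\n");
}

// simple mode: ceil(chars / 4), or additions × 10 when no patch is available.
function estimateTokensSimple(file: ChangedFile): number {
  if (file.patch === undefined) return file.additions * 10;
  return Math.ceil(addedText(file.patch).length / 4);
}

const patch = "@@ -1,2 +1,3 @@\n context\n+const a = 1;\n+const b = 2;\n-old";
console.log(estimateTokensSimple({ filename: "a.ts", additions: 2, patch })); // prints 7
console.log(estimateTokensSimple({ filename: "big.json", additions: 120 }));  // prints 1200
```

In the first call the two added lines total 25 characters including the newline between them, so `ceil(25 / 4)` gives 7.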

This is intentionally approximate.

Different models tokenize differently, agents may not read every changed file, and cached-token pricing varies by provider. Cost tables show ±50% ranges. Treat the output as a practical warning signal, not an invoice.
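
To make the arithmetic concrete, here is a minimal sketch of turning a token estimate into a ±50% cost range for one pricing profile. The function name and the way the range is applied (half and one-and-a-half times the base cost, for tokens read once) are assumptions for illustration.

```typescript
interface PricingProfile {
  name: string;
  inputCostPerMillion: number; // USD per 1M input tokens
}

interface CostRange {
  name: string;
  low: number;
  high: number;
}

// Base cost = tokens / 1,000,000 × price per million, widened by ±50%.
function estimateCostRange(tokens: number, profile: PricingProfile): CostRange {
  const base = (tokens / 1e6) * profile.inputCostPerMillion;
  return { name: profile.name, low: base * 0.5, high: base * 1.5 };
}

const range = estimateCostRange(40000, { name: "Team Gateway", inputCostPerMillion: 1.75 });
// 40,000 tokens at $1.75 per million is about $0.07, so roughly $0.035 to $0.105.
console.log(`${range.name}: $${range.low.toFixed(3)} to $${range.high.toFixed(3)}`);
```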

| Severity | Meaning |
|---|---|
| `Low` | Small context increase, usually safe |
| `Medium` | Worth reviewing, especially in agent-heavy repos |
| `High` | Likely to affect AI coding sessions |
| `Critical` | Very large diff or obvious repo-noise artifact |

Override thresholds in config:

```
severity-thresholds:
  medium-tokens: 5000
  high-tokens: 20000
  critical-tokens: 100000
  medium-high-impact-count: 1
  high-high-impact-count: 3
  critical-high-impact-count: 8
token-threshold: 5000
max-high-impact-items: 3
comment-format: compact
estimation-mode: tokenizer
custom-rules:
  - paths:
      - "packages/api/src/generated/**"
    category: generated
    label: Generated API clients add repetitive agent context.
    suggestion: Regenerate locally during development.
show-cost-table: false
pricing-profiles:
  - name: Internal Gateway
    inputCostPerMillion: 1.25
  - name: Local Inference
    inputCostPerMillion: 0.05
```
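
For intuition about how the severity thresholds above might combine, here is a hedged sketch. It assumes a level is reached when either its token threshold or its high-impact file count is met, and it reuses the example values from the config above; the real rule in ContextLevy may differ, and `classifySeverity` is a hypothetical name.

```typescript
type Level = "Low" | "Medium" | "High" | "Critical";

interface SeverityThresholds {
  mediumTokens: number;
  highTokens: number;
  criticalTokens: number;
  mediumHighImpactCount: number;
  highHighImpactCount: number;
  criticalHighImpactCount: number;
}

// Example values copied from the severity-thresholds config above.
const EXAMPLE_THRESHOLDS: SeverityThresholds = {
  mediumTokens: 5000,
  highTokens: 20000,
  criticalTokens: 100000,
  mediumHighImpactCount: 1,
  highHighImpactCount: 3,
  criticalHighImpactCount: 8,
};

// Assumption: the highest level whose token OR count threshold is met wins.
function classifySeverity(tokens: number, highImpactCount: number, t: SeverityThresholds): Level {
  if (tokens >= t.criticalTokens || highImpactCount >= t.criticalHighImpactCount) return "Critical";
  if (tokens >= t.highTokens || highImpactCount >= t.highHighImpactCount) return "High";
  if (tokens >= t.mediumTokens || highImpactCount >= t.mediumHighImpactCount) return "Medium";
  return "Low";
}

console.log(classifySeverity(37891, 2, EXAMPLE_THRESHOLDS)); // prints High
console.log(classifySeverity(1200, 1, EXAMPLE_THRESHOLDS));  // prints Medium
console.log(classifySeverity(500, 0, EXAMPLE_THRESHOLDS));   // prints Low
```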

ContextLevy is most useful when paired with normal repository hygiene.

Common `.gitignore` additions:

```
coverage/
htmlcov/
dist/
build/
.next/
.cache/
*.log
```

Generated files may still belong in version control depending on your language, package manager, or deployment setup. ContextLevy does not block PRs by default; it gives reviewers a focused warning.

If ContextLevy cannot create or update PR comments, your workflow token or GitHub App probably lacks the required permissions.

Check:

```
permissions:
  contents: read
  pull-requests: write
  issues: write
```

If you use the GitHub App, confirm the installation has:

- Contents: read
- Pull requests: read & write
- Issues: read & write

For pull requests from forks, GitHub may still provide a read-only workflow token. In that case ContextLevy logs a warning, keeps the action successful, still exposes analysis outputs, and writes a job summary — but may not post a PR comment.

Install the GitHub App when your organization policy allows it for more reliable fork PR comments.

See [SECURITY.md — Fork pull requests](/unloopedmido/contextlevy/blob/main/SECURITY.md#fork-pull-requests) for permission details.

If GitHub App authentication fails, make sure the `CONTEXTLEVY_APP_PRIVATE_KEY` secret contains the GitHub App private key PEM.

It should look like this:

```
-----BEGIN RSA PRIVATE KEY-----
...
-----END RSA PRIVATE KEY-----
```

Do not use the app Client Secret.

ContextLevy skips comments below `token-threshold`. Fail mode (`fail-on-severity`, `fail-above-tokens`) still runs in that case, so a skipped comment does not mean the check was skipped.

Lower the threshold while testing:

```
token-threshold: 0
```

A higher severity than expected usually means the PR added large generated files, coverage output, build artifacts, or lockfile churn.

If the files are intentional, either ignore the warning or raise your thresholds.

Install dependencies:

```
npm install
```

Run tests:

```
npm test
```

Build the action bundle and CLI:

```
npm run build          # both
npm run build:action   # GitHub Action only → dist/index.js
npm run build:cli      # local CLI only → lib/
```

Commit `dist/index.js` after building the action so workflow consumers do not need to install runtime dependencies. The CLI (`lib/`) is built automatically on `npm publish` via `prepack`.

Verify the npm tarball before publishing:

```
npm run pack:check
```

Releases are automated when a version bump lands on `main`. The [release workflow](/unloopedmido/contextlevy/blob/main/.github/workflows/release.yml) detects a `package.json` version change, runs tests, verifies `dist/`, creates a GitHub Release, pushes the semver tag, publishes the CLI to npm via [trusted publishing](https://docs.npmjs.com/trusted-publishers) (OIDC), and updates the major tag.

**Do not push semver tags manually.** Bump the version in `package.json`, `package-lock.json`, and `CHANGELOG.md`, push to `main`, and CI handles the tag, GitHub Release, and npm publish.

On [npmjs.com](https://www.npmjs.com/package/contextlevy) → **Package settings** → **Trusted publishing**, configure **GitHub Actions** with repository `unloopedmido/contextlevy` and workflow filename `release.yml`. No `NPM_TOKEN` secret is required.

If npm publish fails after a version bump, re-run the **Release** workflow from the Actions tab (`workflow_dispatch`). As long as that version is still missing on npm, the workflow retries the publish without another version bump.

Example release sequence:

```
# After updating package.json, package-lock.json, and CHANGELOG.md
git push origin main
```

The workflow updates the major-version tag (`v2`) automatically.

Before trusted publishing is configured, publish the CLI once from a clean checkout:

```
npm ci
npm run pack:check
npm publish --access public
```

Then add the trusted publisher on npmjs.com as described above. Later version bumps on `main` publish automatically via OIDC.

Consumers should usually pin:

```
- uses: unloopedmido/contextlevy@v2
```

For maximum supply-chain safety, consumers can pin a full commit SHA.

ContextLevy is a pull request analysis tool. It does not execute changed code and does not send repository contents to an LLM or third-party API.

Please report security issues privately through GitHub Security Advisories instead of opening a public issue.

MIT
