Bundle-size checks, but for AI agent context cost.
ContextLevy comments on pull requests when a diff is likely to make coding agents slower, more expensive, or noisier to use.
| Before ContextLevy | After ContextLevy |
|---|---|
| A PR silently adds ~90k tokens of coverage, generated clients, and build output | Reviewers see exactly which files caused the bloat and what to remove |
| Lockfile churn dominates diffs with no agent-cost signal | ContextLevy flags lockfiles, estimates token weight, and suggests review focus |
| Agent instruction files change behavior without visibility | High-signal agent config changes appear in the PR thread |
| Use ContextLevy if… | Maybe skip it if… |
|---|---|
| Your team uses Cursor, Codex, or Claude Code heavily | Your repo rarely uses AI agents |
| PRs often include generated output or coverage artifacts | You already have strict artifact hygiene and pre-commit gates |
| You want advisory PR comments before merge | You need exact tokenizer-accurate billing from your provider |
| You care about repo-level context debt, not just session tuning | You only need per-session context packs (see |
See docs/EXAMPLES.md for benchmark tables, monorepo recipes, and output usage.
AI coding agents are powerful, but they are also extremely sensitive to noisy repository context.
A single pull request can accidentally add:
- generated clients
- coverage reports
- build output
- lockfile churn
- snapshots
- huge logs
- vendored files
- agent instruction dumps
- compiled bundles
That may not break your app, but it can absolutely bloat every future AI-assisted coding session.
ContextLevy catches that before it becomes repo debt.
It scans pull request diffs, estimates added context weight, classifies risky files, and leaves a focused PR comment explaining what changed and what to clean up.
See docs/COMPARISON.md for how ContextLevy compares to bundle tools, ctx, and agent session tools.
| Risk | Examples | Why it matters |
|---|---|---|
| Generated code | generated/client.ts , schema.graphql , SDK output |
|
| Often huge, repetitive, and better regenerated locally | ||
| Coverage output | coverage/lcov.info , htmlcov/ |
|
| High token cost with almost zero agent value | ||
| Build artifacts | dist/ , build/ , .next/ , compiled bundles |
|
| Frequently duplicated from source | ||
| Logs and dumps | *.log , traces, debug output |
|
| Noisy context that agents over-read | ||
| Lockfile churn | package-lock.json , pnpm-lock.yaml , yarn.lock |
|
| Can dominate diffs in dependency PRs | ||
| Snapshots | __snapshots__/ , large fixture files |
|
| Useful sometimes, expensive always | ||
| Agent files | .agents/ , AGENTS.md , instruction packs |
|
| Can silently steer future agent behavior |
ContextLevy is intentionally boring:
No LLM callsNo code uploadNo external analysis service****No telemetry required
It only uses GitHub pull request metadata and diff patches available inside the workflow.
Token and cost numbers are estimates, not billing-grade accounting.
ContextLevy is available as a GitHub Action and an npm CLI. Choose one setup path:
Best comment attribution and permissions. No repository secrets required.
Install the ContextLevy GitHub App on your repository.
Grant these repository permissions when prompted:
| Permission | Access |
|---|---|
| Contents | Read |
| Pull requests | Read & write |
| Issues | Read & write |
The published app posts PR comments with its own identity. You do not need to add app credentials as repository secrets or variables.
After changing app permissions, accept the updated installation request on the repository.
Create .github/workflows/contextlevy.yml
:
name: ContextLevy
on:
pull_request:
types: [opened, synchronize, reopened]
permissions:
contents: read
pull-requests: write
issues: write
jobs:
contextlevy:
name: Check AI context cost
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: unloopedmido/contextlevy@v2
with:
github-token: ${{ github.token }}
That is the full setup. ContextLevy reads your PR diff, estimates context weight, and comments when thresholds are exceeded.
Works for many internal PRs without installing the app. Fork PRs may be read-only — see Fork pull requests.
permissions:
contents: read
pull-requests: write
issues: write
steps:
- uses: actions/checkout@v4
- uses: unloopedmido/contextlevy@v2
with:
github-token: ${{ github.token }}
Maintainers and contributors only:To test with a self-hosted GitHub App in a private fork, see[CONTRIBUTING.md — Self-hosted GitHub App]forCONTEXTLEVY_APP_ID
andCONTEXTLEVY_APP_PRIVATE_KEY
setup. End users should use the published app linked above.
Install from npm or build from source:
npm install -g contextlevy
contextlevy diff --base main
contextlevy diff --base origin/main --format json --fail-on-config
From a clone:
npm install && npm run build:cli
contextlevy diff --base main
See docs/CLI.md for flags, exit codes, and pre-push hook recipes.
Teach coding agents how to set up and use ContextLevy:
npx skills add unloopedmido/contextlevy --skill contextlevy
Skill source: .agents/skills/contextlevy/SKILL.md
ContextLevy reads all analysis and comment options from a config file in the repository. Add a config file once — workflow YAML stays minimal.
On pull requests, ContextLevy reads configuration from the base branch version of the repository. A PR cannot silence the check by changing .contextlevy.yml
in the same diff.
Supported config paths, in priority order:
.contextlevy.yml
.contextlevy.yaml
.contextlevy.json
.github/contextlevy.yml
.github/contextlevy.yaml
.github/contextlevy.json
contextlevy.yml
contextlevy.yaml
contextlevy.json
If no config file is found, ContextLevy uses built-in defaults.
Enable editor autocomplete with the published JSON Schema:
token-threshold: 1000
Schema file: docs/schema/contextlevy.schema.json
Example .contextlevy.yml
:
token-threshold: 1000
large-file-token-threshold: 5000
max-high-impact-items: 5
show-cost-table: true
comment-format: default
ignore-paths:
- vendor/**
- "**/*.map"
fail-on-severity: high
custom-rules:
- name: generated-supabase-types
paths:
- "supabase/types.ts"
- "src/database/generated/**"
category: generated
label: Generated Supabase types are usually low-value agent context.
suggestion: Regenerate locally unless this repo intentionally tracks generated DB types.
estimation-mode: simple
severity-thresholds:
medium-tokens: 5000
high-tokens: 20000
critical-tokens: 100000
pricing-profiles:
- name: GPT-5.5
inputCostPerMillion: 5.0
- name: Opus 4.7
inputCostPerMillion: 5.0
- name: Team Gateway
inputCostPerMillion: 1.75
Keys support both kebab-case and camelCase:
token-threshold: 1000
tokenThreshold: 1000
| Key | Default | Description |
|---|---|---|
token-threshold |
||
1000 |
||
| Skip commenting below this estimated token total | ||
large-file-token-threshold |
||
5000 |
||
| Mark individual files as large context risks | ||
max-high-impact-items |
||
5 |
||
| Max files shown in the high-impact table | ||
show-cost-table |
||
true |
||
| Include estimated model input costs | ||
comment-format |
||
default |
||
default or compact |
||
ignore-paths |
||
[] |
||
| Glob patterns excluded from analysis entirely | ||
allow-paths |
||
[] |
||
| Glob patterns counted but not flagged as high-impact | ||
fail-on-severity |
||
| unset | Fail workflow at low / medium / high / critical or above |
|
fail-above-tokens |
||
| unset | Fail workflow when estimated tokens exceed this value | |
estimation-mode |
||
simple |
||
simple (ceil(chars / 4) ) or tokenizer (local BPE, no network) |
||
custom-rules |
||
[] |
||
| Project-specific path rules (see example above) | ||
severity-thresholds |
||
| built-in defaults | Override token/high-impact counts for Low/Medium/High/Critical | |
pricing-profiles |
||
| built-in defaults | Array of { name, inputCostPerMillion } objects |
When fail-on-severity
or fail-above-tokens
is set, ContextLevy fails the workflow if thresholds are exceeded. Fail mode runs even when the PR comment is skipped — for example, when estimated tokens are below token-threshold
. Analysis and fail checks always run; token-threshold
only controls whether a comment is posted.
The action accepts authentication inputs only. All behavior tuning belongs in the config file.
| Input | Default | Description |
|---|---|---|
github-token |
||
GITHUB_TOKEN env |
||
| Fallback token for reading PR files and writing comments | ||
app-client-id |
||
CONTEXTLEVY_APP_ID / CONTEXTLEVY_APP_CLIENT_ID env |
||
| Numeric GitHub App ID | ||
app-private-key |
||
CONTEXTLEVY_APP_PRIVATE_KEY env |
||
| GitHub App private key PEM | ||
app-installation-id |
||
CONTEXTLEVY_APP_INSTALLATION_ID env |
||
| Optional GitHub App installation ID override |
Auth credentials should stay in GitHub secrets or variables. Do not put private keys in .contextlevy.yml
.
Use these in downstream workflow steps:
| Output | Type | Example | Description |
|---|---|---|---|
total-estimated-tokens |
|||
| integer string | "37891" |
||
| Total estimated net-new context tokens | |||
analyzed-file-count |
|||
| integer string | "12" |
||
| Changed files included in the estimate | |||
token-source |
|||
| string | "app" |
||
Auth source: app , github-token , or GITHUB_TOKEN |
|||
estimation-mode |
|||
| string | "simple" |
||
Estimation mode used: simple or tokenizer |
- id: contextlevy
uses: unloopedmido/contextlevy@v2
- if: ${{ steps.contextlevy.outputs.total-estimated-tokens > 50000 }}
run: echo "Context cost too high"
ContextLevy also writes a job summary with risk level and top findings for every run.
Best for most repositories.
Includes:
- severity
- estimated token delta
- high-impact files
- file classifications
- optional cost table
- cleanup suggestions
comment-format: default
Best for busy repos that want a smaller PR footprint.
Usually 3–4 lines:
comment-format: compact
Example:
🤖 ContextLevy · ⚠️ High · ~42.1k tokens
+31.4k coverage/lcov.info · +8.2k dist/index.js · +2.5k generated/client.ts
~$0.02–$0.12/session est. input · Add coverage/ and dist/ to .gitignore
Default pricing profiles are illustrative and may drift as model prices change. For accurate internal estimates, configure your own pricing-profiles
.
When pricing-profiles
is omitted, ContextLevy estimates worst-case input cost using:
| Profile | Input cost / 1M tokens |
|---|---|
| GPT-5.5 | $5.00 |
| Opus 4.7 | $5.00 |
| Gemini 3.1 Pro | $2.00 |
| Kimi K2.6 | $0.95 |
Hide the cost table in your config file:
show-cost-table: false
Override pricing profiles:
pricing-profiles:
- name: Local 70B
inputCostPerMillion: 0.2
- name: Team Gateway
inputCostPerMillion: 1.75
ContextLevy supports two local estimation modes (no LLM calls, no network):
| Mode | Method | Best for |
|---|---|---|
simple (default) |
||
ceil(chars / 4) on added diff lines |
||
| Fast warnings, CI everywhere | ||
tokenizer |
||
cl100k_base BPE token count on added diff text |
||
| Closer to GPT-family token counts |
Process:
- List files changed in the pull request.
- Read added diff lines from each patch.
- Estimate tokens using the configured mode.
- If no patch is available, fall back to
additions × 10
. - Classify risky paths with built-in rules plus optional
custom-rules
.
This is intentionally approximate.
Different models tokenize differently, agents may not read every changed file, and cached-token pricing varies by provider. Cost tables show ±50% ranges. Treat the output as a practical warning signal, not an invoice.
| Severity | Meaning |
|---|---|
Low |
|
| Small context increase, usually safe | |
Medium |
|
| Worth reviewing, especially in agent-heavy repos | |
High |
|
| Likely to affect AI coding sessions | |
Critical |
|
| Very large diff or obvious repo-noise artifact |
Override thresholds in config:
severity-thresholds:
medium-tokens: 5000
high-tokens: 20000
critical-tokens: 100000
medium-high-impact-count: 1
high-high-impact-count: 3
critical-high-impact-count: 8
token-threshold: 5000
max-high-impact-items: 3
comment-format: compact
estimation-mode: tokenizer
custom-rules:
- paths:
- "packages/api/src/generated/**"
category: generated
label: Generated API clients add repetitive agent context.
suggestion: Regenerate locally during development.
show-cost-table: false
pricing-profiles:
- name: Internal Gateway
inputCostPerMillion: 1.25
- name: Local Inference
inputCostPerMillion: 0.05
ContextLevy is most useful when paired with normal repository hygiene.
Common .gitignore
additions:
coverage/
htmlcov/
dist/
build/
.next/
.cache/
*.log
Generated files may still belong in version control depending on your language, package manager, or deployment setup. ContextLevy does not block PRs by default; it gives reviewers a focused warning.
Your workflow token or GitHub App probably does not have enough permissions to create or update PR comments.
Check:
permissions:
contents: read
pull-requests: write
issues: write
If you use the GitHub App, confirm the installation has:
- Contents: read
- Pull requests: read & write
- Issues: read & write
For pull requests from forks, GitHub may still provide a read-only workflow token. In that case ContextLevy logs a warning, keeps the action successful, still exposes analysis outputs, and writes a job summary — but may not post a PR comment.
Install the GitHub App when your organization policy allows it for more reliable fork PR comments.
See SECURITY.md — Fork pull requests for permission details.
Make sure the secret contains the GitHub App private key PEM.
It should look like this:
-----BEGIN RSA PRIVATE KEY-----
...
-----END RSA PRIVATE KEY-----
Do not use the app Client Secret.
ContextLevy skips comments below token-threshold
. Fail mode (fail-on-severity
, fail-above-tokens
) still runs in that case — a skipped comment does not mean the check was skipped.
Lower the threshold while testing:
token-threshold: 0
That usually means the PR added large generated files, coverage output, build artifacts, or lockfile churn.
If the files are intentional, either ignore the warning or raise your thresholds.
Install dependencies:
npm install
Run tests:
npm test
Build the action bundle and CLI:
npm run build # both
npm run build:action # GitHub Action only → dist/index.js
npm run build:cli # local CLI only → lib/
Commit dist/index.js
after building the action so workflow consumers do not need to install runtime dependencies. The CLI (lib/
) is built automatically on npm publish
via prepack
.
Verify the npm tarball before publishing:
npm run pack:check
Releases are automated when a version bump lands on main
. The release workflow detects a package.json
version change, runs tests, verifies dist/
, creates a GitHub Release, pushes the semver tag, publishes the CLI to npm via trusted publishing (OIDC), and updates the major tag.
Do not push semver tags manually. Bump the version in package.json
, package-lock.json
, and CHANGELOG.md
, push to main
, and CI handles the tag, GitHub Release, and npm publish.
On npmjs.com → Package settings → Trusted publishing, configure GitHub Actions with repository unloopedmido/contextlevy
and workflow filename release.yml
. No NPM_TOKEN
secret is required.
If npm publish fails after a version bump, re-run the Release workflow from the Actions tab (workflow_dispatch
) once the package is missing on npm — it will retry without another version bump.
Example release sequence:
git push origin main
The workflow updates the major-version tag (v2
) automatically.
Before trusted publishing is configured, publish the CLI once from a clean checkout:
npm ci
npm run pack:check
npm publish --access public
Then add the trusted publisher on npmjs.com as described above. Later version bumps on main
publish automatically via OIDC.
Consumers should usually pin:
- uses: unloopedmido/contextlevy@v2
For maximum supply-chain safety, consumers can pin a full commit SHA.
ContextLevy is a pull request analysis tool. It does not execute changed code and does not send repository contents to an LLM or third-party API.
Please report security issues privately through GitHub Security Advisories instead of opening a public issue.
MIT