Constraint Durability: The Missing Layer Between Policy and Trust

wpnews.pro

June 12, 2026

And why the only durable permission boundary is one the agent doesn't need to remember.

The Observation #

We run a multi-task batch pipeline with AI coding agents. Each task in the pipeline generates three artifacts: a specification, a task ticket, and a completion report. A typical batch contains six tasks. By the time the later tasks begin, the agent is carrying roughly 18,000 tokens of accumulated artifacts from earlier tasks: about 9% of its 200,000-token context window, consumed before any code for the current task is written.

We started noticing something as sessions grew longer: the agent's adherence to its permission boundaries degraded. The constraints were still present in the system prompt, in the project configuration file, in the delegation rules loaded at session start. But the agent's reasoning about those constraints — the degree to which it actively referenced them when making decisions — diminished as context accumulated.

This observation led us to rethink what "context engineering" actually means when agents operate with real authority.

The Conventional Framing #

Context engineering is one of the most discussed topics in the AI engineering community. The framing is almost universally about performance: how to compress more information into finite context windows, how to retrieve the right documents at the right time, how to summarize efficiently without losing critical details.

The research literature reflects this framing. Active Context Compression achieves 22.7% token reduction on SWE-bench benchmarks while maintaining accuracy. Context-Folding produces 10x smaller active context compared to ReAct baselines by collapsing completed sub-trajectories into outcome summaries. ContextBudget uses reinforcement learning to adaptively allocate token budgets, outperforming static allocation by 1.6x on long-horizon tasks. These are real advances that address real problems.

But there is a dimension to context engineering that this framing misses entirely.

The Security Dimension #

Every LLM agent operates within a context window that serves multiple functions simultaneously. It carries the system prompt (identity, role, behavioral rules). It carries tool schemas (what the agent can do). It carries authorization constraints (what the agent may and may not do). It carries conversation history. And it carries the active working state of whatever task the agent is currently executing.

All of these compete for the same finite resource: tokens.

The critical asymmetry is this: task artifacts grow monotonically during execution, while security constraints are static. A six-task batch generates roughly 3,000 tokens of artifacts per task. The security constraints — permission boundaries, scope rules, behavioral guardrails — were defined once, at session start, and do not grow.
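The asymmetry can be made concrete with a little arithmetic. The sketch below uses the roughly 3,000 artifact tokens per task quoted above; the 2,000-token size for the security constraints is a hypothetical figure chosen only for illustration, not something we measured.

```python
# Artifact growth per task, as quoted in the article.
ARTIFACT_TOKENS_PER_TASK = 3_000
# Hypothetical size of the static security constraints (an assumption).
CONSTRAINT_TOKENS = 2_000


def constraint_share(tasks_completed: int) -> float:
    """Fraction of (constraints + accumulated artifacts) taken up by constraints."""
    artifacts = ARTIFACT_TOKENS_PER_TASK * tasks_completed
    return CONSTRAINT_TOKENS / (CONSTRAINT_TOKENS + artifacts)


if __name__ == "__main__":
    for n in range(7):
        print(f"after {n} tasks: constraints are {constraint_share(n):.1%} of the total")
```

The constraint text never changes, yet its share of the material competing for attention shrinks with every completed task.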

When the window fills, something must be compressed or evicted. The agent's compression algorithm — whether built into the model, managed by the platform, or handled by external tooling — optimizes for task relevance. Instructions that the agent is actively using get preserved. Instructions that haven't been referenced in several turns get compressed or discarded.

Security constraints, by their nature, are the kind of instruction that becomes "irrelevant" to the compression algorithm. A rule like "do not read files outside the project directory" isn't referenced during normal task execution — it only becomes relevant when the agent considers doing the thing it prohibits. If the constraint has been compressed away before that moment arrives, the boundary no longer exists in the agent's effective reasoning context.
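A toy model of this failure mode, assuming a compressor that ranks items purely by how recently the agent referenced them. Real compaction logic is more sophisticated and varies by platform; the `Item` type, the function name, and the token counts here are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Item:
    text: str
    tokens: int
    last_used_turn: int  # most recent turn in which the agent referenced the item


def evict_least_recently_used(items, budget):
    """Keep the most recently referenced items that still fit in `budget` tokens."""
    kept, used = [], 0
    for item in sorted(items, key=lambda i: i.last_used_turn, reverse=True):
        if used + item.tokens <= budget:
            kept.append(item)
            used += item.tokens
    return kept


context = [
    Item("rule: do not read files outside the project directory", 50, last_used_turn=0),
    Item("task 5 spec", 1_000, last_used_turn=40),
    Item("task 5 ticket", 1_450, last_used_turn=41),
    Item("task 6 spec", 1_000, last_used_turn=48),
]

survivors = evict_least_recently_used(context, budget=3_450)
print([item.text for item in survivors])
```

The rule is the smallest item in the context, but because it was last referenced at turn 0 it is the one that gets dropped.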

What We Measured #

Our pipeline processes specifications, tickets, and reports across multi-task batches. We measured the artifact overhead across several workstreams (115 specs, 157 tickets, 154 reports):

Artifact Type	Average Size	Token Estimate
Specification	249 lines	~1,000 tokens
Task ticket	362 lines	~1,450 tokens
Completion report	137 lines	~550 tokens
Per-task total	748 lines	~3,000 tokens

A six-task batch consumes approximately 18,000 tokens in artifacts alone. Against a 200,000-token window, that's 9% of capacity (or 12% of the roughly 150,000 tokens actually available for execution) before any source code is read, any tests are run, or any implementation work begins.
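The figures in the table can be checked directly. This sketch only restates the article's own numbers; the 150,000-token execution budget comes from the breakdown that follows, and the share of "capacity" depends on which of the two denominators is used.

```python
# (average lines, estimated tokens) per artifact type, from the table above.
ARTIFACTS = {
    "specification": (249, 1_000),
    "task ticket": (362, 1_450),
    "completion report": (137, 550),
}
TASKS_PER_BATCH = 6
WINDOW_TOKENS = 200_000      # advertised context window
EXECUTION_TOKENS = 150_000   # budget left after prompt, docs, state, and schemas

lines_per_task = sum(lines for lines, _ in ARTIFACTS.values())
tokens_per_task = sum(tokens for _, tokens in ARTIFACTS.values())
batch_tokens = tokens_per_task * TASKS_PER_BATCH

print(f"per task: {lines_per_task} lines, ~{tokens_per_task} tokens")
print(f"per batch: ~{batch_tokens} tokens")
print(f"share of full window: {batch_tokens / WINDOW_TOKENS:.0%}")
print(f"share of execution budget: {batch_tokens / EXECUTION_TOKENS:.0%}")
```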

The remaining budget breaks down roughly as follows:

200K total tokens
 -20K  system prompt + rules + configuration
 -15K  design document (read once, then referenced)
 -10K  pipeline state files (batch plan, status, workstream)
 -5K   tool schemas
=150K  available for execution

Per-task budget in a 6-task batch: ~25K each
  Read source files:  ~5K
  Write code:         ~5K
  Run tests + lints:  ~3K
  Status updates:     ~2K
  = ~15K actual work, ~10K buffer

The problem is visible in the budget: security constraints live in the 20K system prompt allocation, competing for hot-layer attention against 150K of growing execution state.
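For readers who want to adapt the budget to their own pipeline, here is the same breakdown as a small script. The numbers are the ones listed above; the variable names are ours, and the split is an estimate rather than a measurement.

```python
TOTAL_TOKENS = 200_000
TASKS_PER_BATCH = 6

# Fixed allocations made before any task work starts.
fixed = {
    "system prompt + rules + configuration": 20_000,
    "design document": 15_000,
    "pipeline state files": 10_000,
    "tool schemas": 5_000,
}

# Estimated per-task spend on actual work.
work = {
    "read source files": 5_000,
    "write code": 5_000,
    "run tests + lints": 3_000,
    "status updates": 2_000,
}

available = TOTAL_TOKENS - sum(fixed.values())
per_task = available // TASKS_PER_BATCH
actual_work = sum(work.values())
buffer = per_task - actual_work

print(f"available for execution: {available}")
print(f"per-task budget: {per_task} ({actual_work} work + {buffer} buffer)")
```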

Context Budget Thresholds #

Across extended use, we observed recurring patterns as context accumulated:

Context Load	Observed Behavior
Light (under ~20%)	Full constraint awareness. Peak reasoning quality.
Moderate (~20–40%)	Compression begins. Early quality drift on complex tasks.
Heavy (~50–65%)	Constraint references fade from reasoning. Security boundary weakens.
Saturated (above ~80%)	Auto-compaction fires. Verbal constraints are lost entirely.

As sessions grow longer, the agent's behavior changes — not because the model is less capable, but because the instructions that define the security boundary are no longer in the portion of the context that receives full attention.
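One way to act on these observations is to track which band a session is in and warn before the heavy range is reached. The sketch below is a hypothetical helper, not part of our pipeline. The band edges are the approximate values from the table, and loads that fall between the listed ranges are reported as such rather than guessed.

```python
def context_load_band(used_tokens: int, window_tokens: int = 200_000) -> str:
    """Map a token count to the approximate load bands observed above."""
    load = used_tokens / window_tokens
    if load < 0.20:
        return "light"
    if load <= 0.40:
        return "moderate"
    if 0.50 <= load <= 0.65:
        return "heavy"
    if load > 0.80:
        return "saturated"
    # The table leaves 40-50% and 65-80% unlabelled, so we do the same.
    return "between bands"


if __name__ == "__main__":
    for used in (18_000, 60_000, 90_000, 120_000, 170_000):
        print(used, context_load_band(used))
```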

Industry data aligns with this observation. Research on the "Lost in the Middle" phenomenon demonstrates that instructions placed in the middle of long contexts are retrieved less reliably than instructions near the beginning or end. The "Maximum Effective Context Window" (the portion of the advertised context that the model actually uses effectively) is consistently lower than the marketed number.

The Constraint Durability Hierarchy #

Not all constraints are equally fragile. We observed a clear hierarchy of durability — and we believe this hierarchy reveals something fundamental about how permission systems should be designed for agents.

We've started calling this property Constraint Durability: the degree to which a security control survives the context pressure of extended agent operation.

Constraint Type	Durability	Mechanism
Gateway-enforced tool filtering	Immune	Operates outside the agent's context entirely
System prompt / config file rules	Resilient	Re-injected at session start and after compaction
Prompt file instructions	Resilient	Loaded from disk at each iteration
Conversational constraints	Fragile	Treated as chat history; evicted during compression
Delegation context in reasoning	Fragile	Depends on whether the agent actively references it

The most important row is the last one. Permission boundaries that exist in the agent's reasoning chain — rather than being structurally enforced — degrade under context pressure. If the agent was told "you have access to Notion read-only, not write" during the session, and the session later compacts, that constraint may not survive.

Verbal constraints are the most fragile class. A directive like "don't modify file X" — set during conversation rather than in the system prompt — vanishes after compaction. This is documented behavior, not speculation. The agent's compression treats it as stale conversation history, not as a load-bearing instruction.

The design principle that emerges: the only durable permission boundary is one the agent doesn't need to remember.
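The difference between the resilient and fragile rows can be illustrated with a deliberately simplified compaction step. This is a toy model under our own assumptions: real compaction produces a richer summary, and the `Message` type and source labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    source: str  # "system", "prompt_file", or "conversation"
    text: str


# Sources that are re-injected after compaction, per the durability table.
DURABLE_SOURCES = {"system", "prompt_file"}


def compact(history, summary="[summary of earlier conversation]"):
    """Toy compaction: keep durable sources, collapse all conversation into one summary."""
    durable = [m for m in history if m.source in DURABLE_SOURCES]
    return durable + [Message("conversation", summary)]


history = [
    Message("system", "Only modify files under src/."),
    Message("conversation", "Don't modify file X."),
    Message("conversation", "Task 3 report: done."),
]

compacted = compact(history)
print([m.text for m in compacted])
```

The system-prompt rule survives because it is reloaded from outside the conversation. The verbal directive survives only if the summary happens to mention it, and in this sketch it does not.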

The Resource Contention Model #

The underlying mechanism is resource contention — the same class of problem that operating systems have dealt with for decades.

Consider the analogy: a kernel maintains a flow table of firewall rules in memory. Under memory pressure, the kernel may evict flow entries to make room for active connections. If a deny rule is evicted because it hasn't matched recently, traffic that should be blocked will pass through until the rule is reloaded. The firewall didn't fail. The memory management system made a rational decision that happened to have a security consequence.

The same dynamic plays out in agent context windows:

Security constraints compete for the same finite resource (context tokens) as task execution.

Task artifacts grow monotonically during execution; security constraints are static.

Compression algorithms optimize for task relevance, not security salience.

The result: progressive, silent erosion of the authorization boundary.

This is not a model capability failure. The model is not "getting dumber." It is a resource allocation failure where the compression policy does not distinguish between security-critical and security-irrelevant content.

Implications for Permission Architecture #

This analysis has direct implications for how we design permission systems for AI agents — and introduces a new architectural primitive: Constraint Durability.

Context-dependent constraints are fragile

Any permission boundary that relies on the agent "remembering" its constraints is vulnerable to context drift. The longer the session, the more tasks the agent executes, and the more artifacts accumulate, the more likely it is that the constraint will be compressed away.

This includes:

Verbal instructions given during a session
Permission scope descriptions embedded in the system prompt (if the prompt is long and the constraint is in the middle)
Delegation context that the agent carries in its reasoning chain

Infrastructure-enforced constraints are durable

Permission boundaries that operate outside the agent's context window are immune to context drift. The agent cannot forget a constraint it never carried. Examples include:

Gateway-level tool filtering: the agent's tools/list call returns only the tools it's authorized to use. Tools outside its scope are invisible, not forbidden.
Delegation-grant enforcement: the gateway evaluates each tool call against the delegation grant. The agent doesn't need to know its boundaries; the infrastructure enforces them.
Permission Envelope Compilation: the effective authority is computed at the infrastructure layer from Policy ∩ Delegation ∩ Intent Envelope, not stored in the agent's prompt.
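A minimal sketch of the first two mechanisms, assuming a gateway that sits between the agent and its tools. The `Gateway` class, its method names, and the `notion_read` / `notion_write` tools are hypothetical and do not correspond to any particular gateway product or protocol implementation.

```python
class Gateway:
    """Toy gateway: filters tool listings and checks every call against a grant."""

    def __init__(self, tools, grant):
        self._tools = dict(tools)       # tool name -> callable
        self._grant = frozenset(grant)  # tool names the delegation allows

    def list_tools(self):
        # The agent only ever sees tools inside its grant.
        return sorted(name for name in self._tools if name in self._grant)

    def call_tool(self, name, *args):
        # Enforcement happens here on every call, regardless of what the
        # agent's context still contains.
        if name not in self._grant or name not in self._tools:
            raise PermissionError(f"tool not available: {name}")
        return self._tools[name](*args)


gateway = Gateway(
    tools={
        "notion_read": lambda page: f"contents of {page}",
        "notion_write": lambda page, body: "written",
    },
    grant={"notion_read"},
)

print(gateway.list_tools())
print(gateway.call_tool("notion_read", "roadmap"))
```

Nothing in this check reads the agent's prompt or history, so compaction on the agent side cannot weaken it.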

Context budgeting is a security control

If security constraints live in the context window at all (which is currently unavoidable for behavioral rules), then context budget management is not optional — it is a security requirement.

Three approaches from the research literature map directly to this problem:

Context Folding — after each task completes, collapse its artifacts to a one-line outcome summary. This reclaims ~3,000 tokens per task, preserving headroom for security constraints. The research shows 10x reduction in active context versus leaving completed task details in place.

Budget-Aware Planning — allocate a token budget for security constraints that is never compressed. Treat policy tokens the way an operating system treats kernel-reserved memory: the security budget is not available for task execution, regardless of memory pressure.

Structural Enforcement — move the constraint out of the context entirely. This is the most durable approach but requires infrastructure support. Gateway-level tool filtering, delegation-grant enforcement, and permission envelope compilation all fall into this category.
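The first two approaches can be combined in a few lines. The sketch below is our own simplification, not the algorithm from the Context-Folding or ContextBudget papers: completed-task artifacts are collapsed into a short summary, while blocks marked as pinned (standing in for the reserved security budget) are never touched. The `Block` type, the 2,000-token policy size, and the 20-token summary cost are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Block:
    owner: str   # "policy" or a task id such as "task-1"
    label: str
    tokens: int
    pinned: bool = False  # pinned blocks are never folded or evicted


def fold_completed_task(context, task_id, outcome, summary_tokens=20):
    """Replace a finished task's unpinned blocks with a one-line outcome summary."""
    kept = [b for b in context if b.pinned or b.owner != task_id]
    kept.append(Block(task_id, f"{task_id}: {outcome}", summary_tokens))
    return kept


def total_tokens(context):
    return sum(b.tokens for b in context)


context = [
    Block("policy", "security constraints", 2_000, pinned=True),
    Block("task-1", "spec", 1_000),
    Block("task-1", "ticket", 1_450),
    Block("task-1", "report", 550),
]

folded = fold_completed_task(context, "task-1", "implemented, tests pass")
print(total_tokens(context), "->", total_tokens(folded))
```

Folding reclaims almost all of the roughly 3,000 tokens the task produced, and the pinned policy block comes through unchanged.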

Connection to Permission Envelope Compilation #

In a previous analysis, we described Permission Envelope Compilation — a runtime mechanism where the effective authority for an agent is derived from the intersection of policy, delegation, and intent:

Effective Authority = Policy ∩ Delegation ∩ Intent Envelope

PEC answered: How should authority be derived?

Constraint Durability answers the next question: How do you make authority survive?

The context drift analysis reveals why PEC's formula is necessary but not sufficient as a prompt-level construct. If the permission envelope is carried in the agent's context, it is subject to the same compression and eviction dynamics as any other instruction. The envelope must be enforced at the infrastructure layer — evaluated per-tool-call, not stored in the prompt.
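Evaluated per tool call, the formula reduces to a set intersection followed by a membership test. This is a sketch under the assumption that permissions can be modeled as flat strings; the permission names and function names are hypothetical, and a real policy engine would be considerably richer.

```python
def effective_authority(policy, delegation, intent):
    """Effective Authority = Policy ∩ Delegation ∩ Intent Envelope."""
    return frozenset(policy) & frozenset(delegation) & frozenset(intent)


def authorize(action, policy, delegation, intent):
    """Check one tool call at the infrastructure layer, outside the agent's prompt."""
    return action in effective_authority(policy, delegation, intent)


policy = {"notion:read", "notion:write", "repo:read"}
delegation = {"notion:read", "repo:read", "repo:write"}
intent = {"notion:read", "notion:write"}

print(sorted(effective_authority(policy, delegation, intent)))
print(authorize("notion:write", policy, delegation, intent))
```

Because the three sets live with the enforcing layer rather than in the context window, the result is the same on the first call of a session and on the first call after compaction.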

This produces a design principle:

Security-critical constraints must be architecturally durable, not contextually durable.

The framework progression becomes:

You don't just compile authority correctly. You ensure it survives.

Constraint Durability as a Design Principle #

Everyone optimizing context windows for cost and latency is also making a security decision — whether they realize it or not.

Every compression algorithm that decides "this instruction is less relevant than the current task" is deciding what the agent is effectively authorized to do. Every token budget that prioritizes task artifacts over policy constraints is widening the effective attack surface. Every session that runs long enough to trigger compaction is silently testing whether the security boundary survives.

The attack surface isn't just what the agent can access.

It's what the agent can remember it shouldn't.

Trustworthy agent systems therefore require two complementary properties:

Authority must be derived correctly.

Authority must survive over time.

Compilation gives us the first. Durability gives us the second.

This is the sixteenth post in an ongoing series on AI agent security architecture. Previous posts covered coordination integrity (how parallel agents diverge from each other), instruction robustness (how agents diverge from their specs across models), semantic irrevocability (why agent side effects can't be undone), governable execution (what Uber's agent architecture reveals), and permission envelope compilation (how to scope authority to intent at runtime). Constraint durability completes the trust chain: how to ensure the authority boundary survives.

References #

ContextBudget: Budget-Aware Context Management for Long-Horizon Search Agents. arXiv:2604.01664, Apr 2026
Active Context Compression: Autonomous Memory Management in LLM Agents. arXiv:2601.07190, Jan 2026
Scaling Long-Horizon LLM Agent via Context-Folding. arXiv:2510.11967, Oct 2025
Escaping the Context Bottleneck: Active Context Curation for LLM Agents via RL. arXiv:2604.11462, Apr 2026
SimpleMem: Efficient Lifelong Memory for LLM Agents. arXiv:2601.02553, Jan 2026
MemOS: An Operating System for Memory-Augmented Generation in LLMs. arXiv:2505.22101, May 2025
Toward a Theory of Hierarchical Memory for Language Agents. arXiv:2603.21564, Mar 2026
PlanCompiler: A Deterministic Compilation Architecture for Structured Multi-Step LLM Pipelines. arXiv:2604.13092, Apr 2026
Effective Context Engineering for AI Agents. Anthropic Engineering Blog, Sep 2025
Lost in the Middle: How Language Models Use Long Contexts. Liu et al., Jul 2023

source & further reading

imaxxs.com — original article