# Two Knowledge Hierarchies: Structuring Context for AI Agents and LLMs

> Source: <https://dev.to/orieken/two-knowledge-hierarchies-structuring-context-for-ai-agents-and-llms-2o6c>
> Published: 2026-05-27 05:22:10+00:00

TestSmith has two distinct audiences that need context about the project: AI agents that work *on* the TestSmith codebase (helping develop and extend it), and the LLM that generates test code *for your project* at runtime. These are different problems with different solutions.

When an AI agent opens TestSmith to fix a bug or add a feature, it needs to understand the codebase structure without reading every file. A single large context file doesn't work well — an agent fixing a retry bug doesn't need to know the Java driver's fixture generation logic.

The solution is a `CLAUDE.md` hierarchy:

```
CLAUDE.md                              ← package map, invariants, dependency direction
internal/domain/CLAUDE.md             ← interfaces, key types, "add a field" checklist
internal/generation/CLAUDE.md         ← pipeline data flow, verifier selection
internal/llm/CLAUDE.md                ← middleware stack, batch vs fan-out, cache key
internal/projectknowledge/CLAUDE.md   ← TESTSMITH.md hierarchy, budget tiers
internal/drivers/CLAUDE.md            ← how to add an adapter or language driver
```

The root file is the map. The per-package files are the territory. An agent touching the LLM retry logic loads `internal/llm/CLAUDE.md`; it never sees the driver or generation docs.

The root file contains three things that every agent needs regardless of task:

- A package map
- Dependency direction rules (`domain` never imports other internal packages; `drivers` never import `generation`)
- Invariants (`GeneratedFile.Language` must always be set; `resolveAction` has specific rules for fixture vs. non-fixture files)

Per-package files contain the "read this before touching this package" context: data flow diagrams for the pipeline, the middleware stack for the LLM layer, the adapter registration pattern for drivers.

When Claude Code loads a file in a package, it automatically reads that package's `CLAUDE.md`. The agent gets exactly what it needs, nothing more.
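To make the selective loading concrete, here is a minimal Go sketch of which `CLAUDE.md` paths are candidates for a given file. It is an illustration only, not Claude Code's actual lookup: the function name `contextFilesFor` and the file `internal/llm/retry.go` are hypothetical, and a real loader would skip candidates that do not exist on disk.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// contextFilesFor returns the candidate CLAUDE.md paths for a file, ordered
// from the repository root down to the file's own directory. Paths are
// slash-separated and relative to the repository root.
// Hypothetical helper: it only computes paths and never touches the disk.
func contextFilesFor(file string) []string {
	files := []string{"CLAUDE.md"} // the root map is always a candidate
	dir := path.Dir(path.Clean(file))
	if dir == "." {
		return files
	}
	cur := ""
	for _, part := range strings.Split(dir, "/") {
		cur = path.Join(cur, part)
		files = append(files, path.Join(cur, "CLAUDE.md"))
	}
	return files
}

func main() {
	// Nothing under internal/drivers or internal/generation is listed here.
	for _, f := range contextFilesFor("internal/llm/retry.go") {
		fmt.Println(f)
	}
}
```

The point of the shape is that the result depends only on the path being edited, so documentation for unrelated packages never enters the agent's context.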

`TESTSMITH.md` is what TestSmith injects into prompts when generating tests for *your* project. It's a conventions file you maintain alongside your source code.

Two levels are merged at generation time:

```
<project-root>/TESTSMITH.md     ← always loaded; project-wide framework, mock style
<source-dir>/TESTSMITH.md       ← optional; package-level overrides
```

Example root `TESTSMITH.md`:

```
# Project conventions

Framework: pytest
Mock style: pytest-mock (use `mocker.patch`, not `unittest.mock.patch`)
Assertion style: plain assert statements

# Module structure
Services are in `src/services/`. Each service has a single public class.
Tests go in `tests/` mirroring the `src/` structure.
```

Example per-directory override in `src/services/payment/TESTSMITH.md`:

```
# Payment service conventions
This module integrates with Stripe. Mock all `stripe.*` calls.
Use `pytest.mark.vcr` for HTTP interaction tests.
```

The root file is loaded once at startup and cached in `ProjectContext`. The per-directory file is merged lazily, only when a file in that directory is being generated. A large monorepo never loads context it doesn't need.
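A minimal sketch of that load-once, merge-lazily behaviour might look like the following. Only the name `ProjectContext` comes from the text above; its fields, the `readFile` callback, `KnowledgeFor`, and the "root first, then override" merge order are assumptions made for illustration.

```go
package main

import "fmt"

// ProjectContext is a hypothetical stand-in: it holds the root conventions
// (loaded once at startup) and caches per-directory overrides on first use.
type ProjectContext struct {
	root     string
	readFile func(dir string) (string, bool) // assumed loader for <dir>/TESTSMITH.md
	perDir   map[string]string
	loads    int // counts loader calls, to make the caching visible
}

func NewProjectContext(root string, read func(string) (string, bool)) *ProjectContext {
	return &ProjectContext{root: root, readFile: read, perDir: map[string]string{}}
}

// KnowledgeFor returns the root conventions followed by the directory's
// override, if one exists. Each directory is read at most once.
func (p *ProjectContext) KnowledgeFor(dir string) string {
	override, cached := p.perDir[dir]
	if !cached {
		p.loads++
		override, _ = p.readFile(dir)
		p.perDir[dir] = override // an empty string records "no override"
	}
	if override == "" {
		return p.root
	}
	return p.root + "\n\n" + override
}

func main() {
	files := map[string]string{
		"src/services/payment": "Mock all `stripe.*` calls.",
	}
	read := func(dir string) (string, bool) {
		s, ok := files[dir]
		return s, ok
	}
	pc := NewProjectContext("Framework: pytest", read)
	fmt.Println(pc.KnowledgeFor("src/services/payment"))
	fmt.Println(pc.KnowledgeFor("src/services/user"))
}
```

Directories that are never generated for are never read, which is the property that keeps a monorepo cheap.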

**Both files go into the system prompt, not the user prompt.** This matters because the user prompt is subject to a configurable token budget (`PromptTokenBudget`, default 6,000 tokens) with a priority-based trim:

| Priority | Content | Dropped when? |
|---|---|---|
| 1 (never) | Source code | Never |
| 2 | Internal dep signatures | Budget exceeded after source |
| 3 | Style snippet from nearby tests | Dropped first |

Project knowledge is exempt from this budget entirely — it stays in the system prompt regardless of how large the source file is.
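The trim in the table can be sketched roughly as below. This is an assumed shape, not TestSmith's code: the `section` type, the `trim` function, and the four-characters-per-token estimate are all hypothetical. The only behaviour taken from the table is the ordering: priority 3 goes first, then priority 2, and priority 1 is never dropped.

```go
package main

import "fmt"

// section is one piece of the user prompt. Lower priority numbers are more
// important; priority 1 is never dropped.
type section struct {
	name     string
	priority int
	text     string
}

// estimateTokens is a crude placeholder (roughly four characters per token).
func estimateTokens(s string) int { return (len(s) + 3) / 4 }

// trim drops the least important droppable section, one at a time, until the
// prompt fits the budget or only priority-1 content remains.
func trim(sections []section, budget int) []section {
	kept := append([]section(nil), sections...) // copy, so the input is untouched
	for {
		total, worst := 0, -1
		for i, s := range kept {
			total += estimateTokens(s.text)
			if s.priority > 1 && (worst == -1 || s.priority > kept[worst].priority) {
				worst = i
			}
		}
		if total <= budget || worst == -1 {
			return kept
		}
		kept = append(kept[:worst], kept[worst+1:]...)
	}
}

func main() {
	prompt := []section{
		{"source code", 1, "func Pay(o Order) error { ... }"},
		{"internal dep signatures", 2, "discount.ValidateCode(code string) bool"},
		{"style snippet", 3, "assert.Equal(t, want, got)"},
	}
	for _, s := range trim(prompt, 20) {
		fmt.Println("kept:", s.name)
	}
}
```

Because project knowledge lives in the system prompt, it never appears in the slice passed to a function like `trim` at all.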

Beyond `TESTSMITH.md`, TestSmith also mines conventions from existing tests in the same directory (up to 5 files, capped at 80 lines total). This gives the model real examples of the project's test style without requiring the developer to maintain a conventions doc.

This is cheaper and more accurate than a hand-written guide: it automatically reflects the actual test patterns in use, and it updates itself as tests evolve. If your team starts using a new assertion pattern, the next generation run picks it up.
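The two caps can be illustrated with a short sketch. The limits (5 files, 80 lines) come from the text; everything else is assumed, including the name `mineStyle` and the first-come allocation of lines, since the article does not say how lines are chosen from each file.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	maxFiles = 5  // at most this many existing test files are sampled
	maxLines = 80 // total line cap across all sampled files
)

// mineStyle takes the contents of existing test files and returns a style
// snippet of at most maxLines lines drawn from at most maxFiles files.
// Lines are taken in order until the cap is hit (an assumed policy).
func mineStyle(testFiles []string) []string {
	var snippet []string
	for i, content := range testFiles {
		if i >= maxFiles || len(snippet) >= maxLines {
			break
		}
		lines := strings.Split(strings.TrimRight(content, "\n"), "\n")
		if remaining := maxLines - len(snippet); len(lines) > remaining {
			lines = lines[:remaining]
		}
		snippet = append(snippet, lines...)
	}
	return snippet
}

func main() {
	existing := []string{
		"def test_refund(mocker):\n    assert refund(10) == 10\n",
		"def test_charge(mocker):\n    assert charge(5) == 5\n",
	}
	fmt.Println(strings.Join(mineStyle(existing), "\n"))
}
```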

The third piece is the dep index: at the start of a `--all` run, TestSmith analyses every source file once and builds a `modulePath → SourceAnalysis` map. When generating tests for `payment.go`, it can pull the public API signature of `discount.go` (which `payment.go` imports) from memory:

```
// In the prompt:
// Internal dependency signatures:
// discount.ApplyPromoCode(order Order, code string) (Order, error)
// discount.ValidateCode(code string) bool
```

This tells the model what the real interface looks like so it generates test doubles that match the actual signatures — not invented ones.

In watch mode, when a file changes, only that file's entry is refreshed. The rest of the index stays warm between regens.
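A rough sketch of such an index follows. `SourceAnalysis` is named in the text, but its fields here are guesses, and `DepIndex`, `BuildDepIndex`, `Refresh`, `DependencySignatures`, and the `analyse` callback are hypothetical names used only to show the build-once, refresh-one-entry pattern.

```go
package main

import "fmt"

// SourceAnalysis is reduced here to what the prompt needs: which modules a
// file imports and which public signatures it exposes (assumed fields).
type SourceAnalysis struct {
	Imports    []string
	Signatures []string
}

// DepIndex maps modulePath to its analysis and keeps the analyser around so
// single entries can be refreshed later.
type DepIndex struct {
	entries map[string]SourceAnalysis
	analyse func(modulePath string) SourceAnalysis // hypothetical analyser
}

// BuildDepIndex analyses every path exactly once, as at the start of a full run.
func BuildDepIndex(paths []string, analyse func(string) SourceAnalysis) *DepIndex {
	idx := &DepIndex{entries: make(map[string]SourceAnalysis, len(paths)), analyse: analyse}
	for _, p := range paths {
		idx.entries[p] = analyse(p)
	}
	return idx
}

// Refresh re-analyses one changed file; every other entry stays as it was.
func (d *DepIndex) Refresh(modulePath string) {
	d.entries[modulePath] = d.analyse(modulePath)
}

// DependencySignatures collects the signatures of each indexed module that
// the given module imports.
func (d *DepIndex) DependencySignatures(modulePath string) []string {
	var sigs []string
	for _, imp := range d.entries[modulePath].Imports {
		if dep, ok := d.entries[imp]; ok {
			sigs = append(sigs, dep.Signatures...)
		}
	}
	return sigs
}

func main() {
	fake := map[string]SourceAnalysis{
		"payment.go":  {Imports: []string{"discount.go"}},
		"discount.go": {Signatures: []string{"discount.ValidateCode(code string) bool"}},
	}
	idx := BuildDepIndex([]string{"payment.go", "discount.go"}, func(p string) SourceAnalysis { return fake[p] })
	fmt.Println(idx.DependencySignatures("payment.go"))
}
```

In a watch loop, a file-change event would call something like `Refresh` for the changed path only, so the cost of a regen stays proportional to one file rather than the whole project.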

The two layers solve different problems:

**Agent context** is about *development-time* navigation. It's hierarchical, human-readable, and loaded selectively. It describes architecture and invariants. It lives in the repo and is maintained alongside the code it describes.

**Runtime LLM context** is about *generation-time* quality. It's merged from two levels, injected into system prompts, and exempt from token budgets. It describes conventions and patterns specific to the target project — things an LLM can't infer from source code alone.

Conflating the two leads to either bloated system prompts (dumping agent context into every generation request) or under-informed agents (giving them only the user-facing conventions doc with no architectural guidance). Keeping them separate means each audience gets exactly what it needs.

*Next: the cross-platform bugs we hit shipping a Go CLI — detector boundary escapes and Windows path separators.*
