{"slug": "ai-context-engineering-for-ai-coding-agents-md-cursor-rules-rag", "title": "[AI] Context Engineering for AI Coding: AGENTS.md, Cursor Rules & RAG", "summary": "METR, an AI safety and capability research organization, conducted a randomized controlled trial in 2025 with 16 experienced open-source developers working on 246 real-world tasks. The study found that developers using AI tools were 19% slower on complex tasks, despite predicting a 24% speedup, due to context quality issues rather than model quality. The bottleneck is context quality, leading to the discipline of context engineering, which includes rule files like AGENTS.md to provide architectural constraints and behavioral standards for AI agents.", "body_md": "In 2025, METR — an AI safety and capability research organization — ran a rigorous randomized controlled trial. Sixteen experienced open-source developers worked on 246 real-world tasks, each randomly assigned to either use AI coding tools freely or not at all.\n\nThe result was counterintuitive: developers using AI tools were **19% slower** on complex tasks.\n\nBefore the study, those same developers predicted AI would make them **24% faster**. After completing the experiment — still believing they had gone faster — their subjective confidence remained completely unshaken.\n\nThe finding did not make headlines for the reason people assumed. The headline was not \"AI is useless.\" The headline was this: **the bottleneck is not model quality. It is context quality.**\n\nThe developers who slowed down were spending significant time on what researchers call \"verification overhead\" and \"workflow friction\" — the effort required to correct AI output that did not understand the architectural constraints, naming conventions, existing utility functions, and established patterns of the codebase they were working in. The AI was generating code. 
It was generating code for an imaginary system.\n\nThis part of the series is about solving that problem.\n\nSeries Orientation: This article is Part 2 of the AI Code Review & Vibe Coding series, detailing the context engineering practices needed to align AI generation with codebase conventions. For the preceding guide on initial tools and non-technical vibe coding, see [Part 1 — Vibe Coding & The Production Wall].\n\nScope note: This article focuses specifically on code-review-level context engineering — the practices individual engineers and teams use to make AI agents produce reviewable, architecturally correct code on an existing codebase. If you are interested in platform-level context infrastructure — building an organizational AI Platform layer, internal RAG systems at scale, or enterprise knowledge management — see [Context Engineering: Domain-Driven Design for AI] in the AI-Driven Playbook series.\n\nEvery time you open a new session with an AI coding tool, you begin from zero. The model knows nothing about:\n\nWithout this information, the AI operates like a very fast, very confident junior developer who has never seen your codebase before and will reproduce whatever pattern was most common in its training data — not whatever pattern is correct for your system.\n\n**Context engineering** is the discipline of structuring and delivering organizational knowledge to AI agents in a form they can reliably use. It is, as the industry consensus now describes it, the \"DevOps moment\" for AI — the operational layer that separates experimental AI assistance from reliable production-grade AI collaboration.\n\nModern AI coding environments support context at multiple layers. Understanding the hierarchy is the foundation of any effective context strategy.\n\nRule files are plain-text configuration files that are automatically injected into every AI interaction. 
They are the most important and most underutilized form of context.\n\n**AGENTS.md (or CLAUDE.md / GEMINI.md)**\n\nThese files — stored at the root of your repository — are read by AI agents before they begin any task. They function as the agent's standing orders: architectural constraints, behavioral standards, and explicit prohibitions that apply to everything the agent does.\n\nA well-structured `AGENTS.md` covers:\n\n```\n# Project Architecture\nThis is a Kratos v2 microservice using Clean Architecture.\nLayer rules:\n- api/ = contracts only (proto + generated code)\n- internal/service/ = adapter layer only, no business logic\n- internal/biz/ = business logic, NO direct database calls\n- internal/data/ = persistence only, GORM + PostgreSQL\n\n# Mandatory Standards\n- All context must propagate through function parameters\n- Use errgroup for managed goroutines only\n- SQL queries must use parameterized inputs — NEVER string concatenation\n- Secrets come from environment variables or Kratos Config — NEVER hardcode\n\n# What NOT To Do\n- Do not use global state\n- Do not expose raw database errors to HTTP/gRPC responses\n- Do not create new patterns without checking internal/util first\n```\n\nThe specificity is the point. A generic instruction like \"follow clean architecture\" produces inconsistent results. A specific instruction like \"the biz layer must never import `gorm.DB` directly\" produces far more consistent ones, because the agent can check its own output against a concrete rule.\n\n**Cursor Rules (.cursorrules)**\n\nCursor's rule files work similarly to AGENTS.md but are native to the Cursor IDE. 
They support scoped rules — you can define different behavior for different file patterns, enforce language-specific standards, and specify which files should never be modified by the AI. The block below is a schematic of the intent rather than literal syntax: Cursor reads rule files as natural-language instructions, and glob scoping is configured in its project rules under `.cursor/rules/`, so treat these keys as shorthand for rules you would write out in prose.\n\n```\n[rules]\nname = Go Microservice Standards\nglob = **/*.go\n\n[security]\nnever_hardcode_secrets = true\nrequire_parameterized_queries = true\nforbid_global_state = true\n\n[architecture]\nenforce_layer_boundaries = true\nrequire_context_propagation = true\n```\n\nThe practical effect: your AI assistant now operates with your standards embedded, not as an afterthought you patch into every prompt.\n\nConsider a request: *\"Retrieve a user profile by email in the service layer.\"*\n\n**Without a Rule File (AGENTS.md)**: The AI will typically write a GORM query directly inside the adapter service layer, bypassing Clean Architecture design:\n\n```\n// File: internal/service/user.go\nfunc (s *UserService) GetProfileByEmail(ctx context.Context, req *pb.GetProfileReq) (*pb.GetProfileReply, error) {\n    var user biz.User\n    // VIOLATION: Direct database access leaking into the service layer\n    if err := s.db.WithContext(ctx).Where(\"email = ?\", req.Email).First(&user).Error; err != nil {\n        return nil, err\n    }\n    return &pb.GetProfileReply{Name: user.Name, Email: user.Email}, nil\n}\n```\n\n**With a Rule File (AGENTS.md)**: The AI respects layer isolation, routing GORM access exclusively through the persistence domain (repository) and business use case:\n\n```\n// File: internal/service/user.go\nfunc (s *UserService) GetProfileByEmail(ctx context.Context, req *pb.GetProfileReq) (*pb.GetProfileReply, error) {\n    // CORRECT: Service calls the biz layer orchestrator (UseCase)\n    user, err := s.userUseCase.FindByEmail(ctx, req.Email)\n    if err != nil {\n        return nil, err\n    }\n    return &pb.GetProfileReply{Name: user.Name, Email: user.Email}, nil\n}\n```\n\nEven with rule files in place, long sessions degrade. 
This is the \"context rot\" phenomenon: as a session accumulates failed attempts, corrected errors, and discarded planning notes, the signal-to-noise ratio in the context window drops. The model may prioritize recent noise over foundational constraints.\n\n**The Fresh Session Strategy**\n\nHigh-performing engineering teams treat AI sessions like stateless functions: one distinct task per session. When you complete a bug fix, close the session. When you begin a new feature, open a fresh one. The operational rule: task boundaries are session boundaries.\n\n**Structured Handovers**\n\nWhen a session grows long before the task is complete, perform a structured handover:\n\n`PLAN.md` or `HANDOVER.md` file in your project directory\n\nThis eliminates context rot while preserving all meaningful progress.\n\n**Compaction Commands**\n\nModern coding agents (Claude Code, Cursor) include `/compact` or `/summarize` commands. Use them proactively when a session runs long — before the model hits its context limit and before performance degrades. A compacted summary is a much higher-quality input than an accumulating stream of raw conversation.\n\nRule files establish standards. Session management controls noise. Repository indexing solves a different problem: giving the AI accurate knowledge of what already exists in your codebase.\n\n**The N+1 Discovery Problem**\n\nWithout repository context, AI agents routinely implement functions that already exist. They create new database tables that duplicate existing ones. They define error types that collide with established patterns. They import packages that violate your dependency graph. Not because they are incapable of doing better — because they do not know what already exists.\n\n**Manual Selection vs. Full-Repo Scanning**\n\nMost AI coding tools offer the ability to scan an entire repository automatically. This sounds valuable and is often counterproductive. 
A large codebase injected wholesale into context adds significant noise — irrelevant files, outdated patterns, deprecated modules. The principle: manually select only the files directly relevant to the task.\n\nFor a task modifying user authentication, the relevant context is:\n\nNot the entire codebase.\n\n**Semantic Memory Banks**\n\nMore sophisticated teams maintain curated \"memory bank\" files — structured markdown documents that describe the codebase's architecture, key patterns, and important decisions in a form optimized for AI consumption:\n\n```\n# Memory Bank: Authentication Domain\n\n## Architecture\n- Auth service handles JWT issuance and validation\n- User identity stored in PostgreSQL via GORM, users table\n- Sessions use Redis with 24h TTL (see internal/data/session_repo.go)\n- MFA implemented via TOTP (internal/service/mfa_service.go)\n\n## Key Patterns\n- All auth errors return domain errors, never raw DB errors\n- Rate limiting is middleware-level (internal/middleware/rate_limiter.go)\n- Refresh tokens are hashed before storage (see HashToken in internal/util/crypto.go)\n\n## Common Mistakes to Avoid\n- Do NOT check password directly — always use bcrypt.CompareHashAndPassword\n- Do NOT log token values — only log token IDs\n- Do NOT implement new crypto — use internal/util/crypto.go exclusively\n```\n\nThese memory banks are updated when significant architectural decisions are made and committed to the repository alongside code.\n\nFor large engineering organizations — those with hundreds of services, mature documentation, and complex architectural standards — static rule files are insufficient. 
The relevant context for any given task changes too rapidly and exists in too many places to manage manually.\n\n**Retrieval-Augmented Generation (RAG)** for code context works by:\n\nThe operational result: an AI agent working on a payments feature automatically retrieves the relevant payment service interfaces, the ADR explaining why you chose the current transaction model, and the runbook for the payment provider integration — without the engineer manually curating that context.\n\n**ADRs as Machine-Readable Judgment**\n\nArchitecture Decision Records deserve special attention. When committed in a structured format and indexed into a RAG pipeline, ADRs transform from static documentation into active constraints:\n\n```\n# ADR-047: Event-Sourcing for Order State Transitions\n\n## Status: Accepted (2025-03)\n## Context\nDirect state mutation of order records creates audit trail gaps and makes rollback scenarios complex.\n## Decision\nAll order state transitions are implemented as events, appended to the events table.\nThe current state is derived by replaying events, not by direct column updates.\n## Consequences\n- New order state logic MUST add new event types, NOT modify existing ones\n- Order queries require projection logic (see internal/projection/order_projector.go)\n- Do NOT write directly to orders.status — always publish an OrderStateTransitioned event\n```\n\nAn AI agent with access to this ADR will not generate direct `UPDATE orders SET status = ?`\n\nqueries for order state changes. Without it, it almost certainly will.\n\n**MCP Servers as Context Infrastructure**\n\nThe Model Context Protocol (MCP), released by Anthropic and now adopted across the industry, provides a standardized interface for serving context to AI agents. 
Rather than building bespoke integrations for each AI tool, organizations build MCP servers — lightweight services that expose specific organizational knowledge (documentation, code patterns, ticket context) through a standard protocol.\n\nThe shift this enables: context infrastructure becomes a shared organizational asset rather than a per-engineer configuration problem.\n\nThe industry now has a name for operating context infrastructure at organizational scale: **ContextOps**.\n\nThe operational loop is: **Ingest → Validate → Structure → Serve → Audit → Refine**.\n\nOrganizations that treat context as throwaway configuration — updated ad hoc, inconsistently formatted, stored in unindexed markdown files — experience the METR result: AI that slows teams down. Organizations that treat context as infrastructure — versioned, validated, monitored — experience meaningfully different outcomes.\n\nIf your team does not have any context infrastructure today, the practical starting point is a three-step sequence:\n\n**Step 1: Write an AGENTS.md (one afternoon)**\n\nFocus on the highest-value content first:\n\n**Step 2: Establish session discipline (one team discussion)**\n\nAgree on task-based session boundaries. Add a compaction step to your team norms: before any session exceeds 20 substantive exchanges, compact and continue in a fresh session.\n\n**Step 3: Build your first memory bank (one sprint)**\n\nPick your most critical domain — authentication, payments, whatever carries the highest risk. Document it in a memory bank format. Add a rule to your code review checklist: \"Was the relevant memory bank file updated as part of this PR?\"\n\nThe marginal improvement from even basic context infrastructure is significant. 
Teams that complete these three steps report substantially fewer AI-generated PRs that violate architectural standards, require significant rework, or introduce security issues the memory bank explicitly prohibits.\n\nConsider a task: \"Implement a new endpoint to export user transaction history as a CSV.\"\n\n**Without context engineering**, an AI agent will:\n\n`users` and `transactions` directly in the service layer\n\n`internal/util/csv_writer.go`\n\n**With effective context engineering**, the same agent:\n\n`csv_writer.go` and uses it rather than reimplementing\n\n`internal/service/report_service.go` and applies it\n\n`internal/middleware/` as specified in your AGENTS.md\n\nThis is not a different model. It is the same model with correct context. The output difference is substantial.\n\nContext engineering is not a replacement for code review. It is a force multiplier on code review. When AI agents operate with accurate, comprehensive context, the output they produce:\n\nThe result: human reviewers spend less time on pattern violations and architectural corrections, and more time on the genuinely high-value review tasks — logical correctness, edge case handling, and the security behaviors that require judgment rather than rule application.\n\nPart 3 covers what those high-value review tasks are: the full taxonomy of AI-generated bugs, from the ones automated tools catch to the ones that only careful human review finds.\n\n*Next: Part 3 — AI Bug Taxonomy: From Silent Logic Failures to Slopsquatting*\n\n*This post was originally published on my blog at Context Engineering for AI Coding: AGENTS.md, Cursor Rules & RAG.*\n\n**Hi, I'm Lê Tuấn Anh (vesviet) 👋**\n\n*I am a Senior Go Backend Architect & Distributed Systems Engineer with 17+ years of experience building high-traffic platforms (25M+ requests/month).*\n\n*If you enjoyed this deep-dive, let's connect on LinkedIn or explore my consulting services at tanhdev.com/hire.*", "url": 
"https://wpnews.pro/news/ai-context-engineering-for-ai-coding-agents-md-cursor-rules-rag", "canonical_source": "https://dev.to/vesviet/ai-context-engineering-for-ai-coding-agentsmd-cursor-rules-rag-17lb", "published_at": "2026-06-29 23:42:03+00:00", "updated_at": "2026-06-30 00:19:30.878618+00:00", "lang": "en", "topics": ["artificial-intelligence", "ai-research", "ai-safety", "developer-tools", "large-language-models"], "entities": ["METR", "AGENTS.md", "CLAUDE.md", "GEMINI.md", "Kratos", "GORM", "PostgreSQL", "Clean Architecture"], "alternates": {"html": "https://wpnews.pro/news/ai-context-engineering-for-ai-coding-agents-md-cursor-rules-rag", "markdown": "https://wpnews.pro/news/ai-context-engineering-for-ai-coding-agents-md-cursor-rules-rag.md", "text": "https://wpnews.pro/news/ai-context-engineering-for-ai-coding-agents-md-cursor-rules-rag.txt", "jsonld": "https://wpnews.pro/news/ai-context-engineering-for-ai-coding-agents-md-cursor-rules-rag.jsonld"}}