{"slug": "agentforge-28-production-grade-skills-that-make-ai-agents-ship-reliable-code", "title": "AgentForge–28 production-grade skills that make AI agents ship reliable code", "summary": "AgentForge launched a set of 28 production-grade skills and workflows designed to make AI coding agents build reliable software by enforcing structured development processes used by senior engineers at companies like Google, Netflix, and Stripe. The system provides step-by-step workflows with checkpoints, evidence-based verification, and anti-rationalization defenses across seven slash commands covering the full development lifecycle from specification to operations. AgentForge aims to solve the problem of AI agents shipping prototypes instead of production code by replacing vague suggestions with battle-tested, verifiable development practices.", "body_md": "**Forge production-grade AI agents.**\n\n*28 battle-tested skills. 6 specialist personas. 5 reference checklists. One mission: make AI agents build software like senior engineers.*\n\nAI coding agents are fast. They're also reckless.\n\nThey skip specs. They \"forget\" tests. They ship without review. They treat \"it works on my machine\" as a success criterion. In short, they build prototypes, not production software.\n\n**AgentForge fixes this.**\n\nWe don't give agents vague suggestions. We give them **structured, battle-tested workflows** that encode how senior engineers actually build software, the same workflows that power teams at Google, Netflix, and Stripe. Every skill has steps, checkpoints, anti-rationalization defenses, and evidence-based verification. 
When an agent follows these, it ships code you can trust.\n\n```\n  DEFINE          PLAN           BUILD          VERIFY         REVIEW          SHIP           OPS\n ┌──────┐      ┌──────┐      ┌──────┐      ┌──────┐      ┌──────┐      ┌──────┐      ┌──────┐\n │ Idea │ ───▶ │ Spec │ ───▶ │ Code │ ───▶ │ Test │ ───▶ │  QA  │ ───▶ │  Go  │ ───▶ │ Run  │\n │Refine│      │  PRD │      │ Impl │      │Debug │      │ Gate │      │ Live │      │ Ops  │\n └──────┘      └──────┘      └──────┘      └──────┘      └──────┘      └──────┘      └──────┘\n  /spec          /plan          /build        /test         /review       /ship\n```\n\n7 slash commands. 28 skills. 6 phases. Zero excuses.\n\n| | Other Prompt Packs | AgentForge |\n|---|---|---|\n| Structure | Vague advice | Step-by-step workflows with checkpoints |\n| Verification | \"Make sure it works\" | Evidence-based exit criteria (tests, builds, runtime data) |\n| Anti-cheating | None | Rationalization tables that call out excuses agents use to skip steps |\n| Scope | Generic coding tips | Full lifecycle: spec → plan → build → verify → review → ship → ops |\n| Quality gates | None | Built-in CI pipeline with 8 automated checks |\n| Cross-reference | Silos | Every skill references related skills; no duplication |\n\n7 slash commands that map to the development lifecycle. Each one activates the right skills automatically.\n\n| What you're doing | Command | Key principle |\n|---|---|---|\n| Define what to build | `/spec` | Spec before code |\n| Plan how to build it | `/plan` | Small, atomic tasks |\n| Build incrementally | `/build` | One slice at a time |\n| Prove it works | `/test` | Tests are proof |\n| Review before merge | `/review` | Improve code health |\n| Simplify the code | `/code-simplify` | Clarity over cleverness |\n| Ship to production | `/ship` | Faster is safer |\n\nWant fewer manual steps once the spec exists? 
**`/build auto`** generates the plan and implements every task in a single approved pass: you approve the plan once, then it runs autonomously. It removes the human stepping *between* tasks, not the verification: every task is still test-driven and committed individually, and it pauses on failures or risky steps.\n\nSkills also activate automatically based on what you're doing: designing an API triggers `api-and-interface-design`, building UI triggers `frontend-ui-engineering`, and so on.\n\n**Claude Code (recommended)**\n\n**Marketplace install:**\n\n```\n/plugin marketplace add borhen68/SkillEngine\n/plugin install agentforge@borhen-agentforge\n```\n\n**SSH errors?** The marketplace clones repos via SSH. If you don't have SSH keys set up on GitHub, either add your SSH key or use the full HTTPS URL to force HTTPS cloning:\n\n```\n/plugin marketplace add https://github.com/borhen68/SkillEngine.git\n/plugin install agentforge@borhen-agentforge\n```\n\n**Local / development:**\n\n```\ngit clone https://github.com/borhen68/SkillEngine.git\nclaude --plugin-dir /path/to/agentforge\n```\n\n**Cursor**\n\nCopy any `SKILL.md` into `.cursor/rules/`, or reference the full `skills/` directory. See [docs/cursor-setup.md](/borhen68/SkillEngine/blob/main/docs/cursor-setup.md).\n\n**Antigravity CLI**\n\nInstall as a native plugin for skills, subagents, and slash commands. See [docs/antigravity-setup.md](/borhen68/SkillEngine/blob/main/docs/antigravity-setup.md).\n\n**Install from the repo:**\n\n```\nagy plugin install https://github.com/borhen68/SkillEngine.git\n```\n\n**Install from a local clone:**\n\n```\ngit clone https://github.com/borhen68/SkillEngine.git agentforge\nagy plugin install ./agentforge\n```\n\n**Gemini CLI**\n\nInstall as native skills for auto-discovery, or add to `GEMINI.md` for persistent context. 
See [docs/gemini-cli-setup.md](/borhen68/SkillEngine/blob/main/docs/gemini-cli-setup.md).\n\n**Install from the repo:**\n\n```\ngemini skills install https://github.com/borhen68/SkillEngine.git --path skills\n```\n\n**Install from a local clone:**\n\n```\ngemini skills install ./agentforge/skills/\n```\n\n**Windsurf**\n\nAdd skill contents to your Windsurf rules configuration. See [docs/windsurf-setup.md](/borhen68/SkillEngine/blob/main/docs/windsurf-setup.md).\n\n**OpenCode**\n\nUses agent-driven skill execution via `AGENTS.md` and the `skill` tool.\n\n**GitHub Copilot**\n\nUse agent definitions from `agents/` as Copilot personas and put skill content in `.github/copilot-instructions.md`. See [docs/copilot-setup.md](/borhen68/SkillEngine/blob/main/docs/copilot-setup.md).\n\n**Kiro IDE & CLI**\n\nSkills for Kiro live under `.kiro/skills/` and can be stored at the project or global level. Kiro also supports `AGENTS.md`. See the Kiro docs at [https://kiro.dev/docs/skills/](https://kiro.dev/docs/skills/).\n\n**Codex / Other Agents**\n\nSkills are plain Markdown, so they work with any agent that accepts system prompts or instruction files. See [docs/getting-started.md](/borhen68/SkillEngine/blob/main/docs/getting-started.md).\n\nThe commands above are entry points. The pack includes 28 skills total: 23 lifecycle skills, 4 operations skills, and the `using-agentforge` meta-skill. Each skill is a structured workflow with steps, verification gates, and anti-rationalization tables. 
You can also reference any skill directly.\n\n**Define**\n\n- [idea-refine](/borhen68/SkillEngine/blob/main/skills/idea-refine/SKILL.md)\n- [spec-driven-development](/borhen68/SkillEngine/blob/main/skills/spec-driven-development/SKILL.md)\n\n**Build**\n\n- [test-driven-development](/borhen68/SkillEngine/blob/main/skills/test-driven-development/SKILL.md)\n- [context-engineering](/borhen68/SkillEngine/blob/main/skills/context-engineering/SKILL.md)\n- [source-driven-development](/borhen68/SkillEngine/blob/main/skills/source-driven-development/SKILL.md)\n- [doubt-driven-development](/borhen68/SkillEngine/blob/main/skills/doubt-driven-development/SKILL.md)\n- [frontend-ui-engineering](/borhen68/SkillEngine/blob/main/skills/frontend-ui-engineering/SKILL.md)\n- [api-and-interface-design](/borhen68/SkillEngine/blob/main/skills/api-and-interface-design/SKILL.md)\n\n**Verify**\n\n- [debugging-and-error-recovery](/borhen68/SkillEngine/blob/main/skills/debugging-and-error-recovery/SKILL.md)\n\n**Review**\n\n- [code-simplification](/borhen68/SkillEngine/blob/main/skills/code-simplification/SKILL.md)\n- [security-and-hardening](/borhen68/SkillEngine/blob/main/skills/security-and-hardening/SKILL.md)\n- [performance-optimization](/borhen68/SkillEngine/blob/main/skills/performance-optimization/SKILL.md)\n\n**Ship**\n\n- [ci-cd-and-automation](/borhen68/SkillEngine/blob/main/skills/ci-cd-and-automation/SKILL.md)\n- [deprecation-and-migration](/borhen68/SkillEngine/blob/main/skills/deprecation-and-migration/SKILL.md)\n- [documentation-and-adrs](/borhen68/SkillEngine/blob/main/skills/documentation-and-adrs/SKILL.md)\n- [observability-and-instrumentation](/borhen68/SkillEngine/blob/main/skills/observability-and-instrumentation/SKILL.md)\n- [shipping-and-launch](/borhen68/SkillEngine/blob/main/skills/shipping-and-launch/SKILL.md)\n\n**Ops**\n\n- [cost-optimization](/borhen68/SkillEngine/blob/main/skills/cost-optimization/SKILL.md)\n- [data-engineering](/borhen68/SkillEngine/blob/main/skills/data-engineering/SKILL.md)\n- [ai-ops](/borhen68/SkillEngine/blob/main/skills/ai-ops/SKILL.md)\n\nPre-configured specialist personas for targeted reviews:\n\n- [test-engineer](/borhen68/SkillEngine/blob/main/agents/test-engineer.md)\n- [security-auditor](/borhen68/SkillEngine/blob/main/agents/security-auditor.md)\n- [web-performance-auditor](/borhen68/SkillEngine/blob/main/agents/web-performance-auditor.md) (`/webperf`)\n- [site-reliability-engineer](/borhen68/SkillEngine/blob/main/agents/site-reliability-engineer.md)\n\nQuick-reference material that skills pull in when needed:\n\n- [security-checklist.md](/borhen68/SkillEngine/blob/main/references/security-checklist.md)\n- [performance-checklist.md](/borhen68/SkillEngine/blob/main/references/performance-checklist.md)\n- [accessibility-checklist.md](/borhen68/SkillEngine/blob/main/references/accessibility-checklist.md)\n- [reliability-checklist.md](/borhen68/SkillEngine/blob/main/references/reliability-checklist.md)\n\nEvery skill follows a consistent anatomy:\n\n```\n┌─────────────────────────────────────────────────┐\n│  SKILL.md                                       │\n│                                                 │\n│  ┌─ Frontmatter 
─────────────────────────────┐  │\n│  │ name: lowercase-hyphen-name               │  │\n│  │ description: Guides agents through [task].│  │\n│  │              Use when…                    │  │\n│  └───────────────────────────────────────────┘  │\n│  Overview         → What this skill does        │\n│  When to Use      → Triggering conditions       │\n│  Process          → Step-by-step workflow       │\n│  Rationalizations → Excuses + rebuttals         │\n│  Red Flags        → Signs something's wrong     │\n│  Verification     → Evidence requirements       │\n└─────────────────────────────────────────────────┘\n```\n\n**Key design choices:**\n\n- **Process, not prose.** Skills are workflows agents follow, not reference docs they read. Each has steps, checkpoints, and exit criteria.\n- **Anti-rationalization.** Every skill includes a table of common excuses agents use to skip steps (e.g., \"I'll add tests later\") with documented counter-arguments.\n- **Verification is non-negotiable.** Every skill ends with evidence requirements: tests passing, build output, runtime data. \"Seems right\" is never sufficient.\n- **Progressive disclosure.** The `SKILL.md` is the entry point. 
Supporting references load only when needed, keeping token usage minimal.\n\n```\nagent-skills/\n├── skills/                            # 28 skills (23 lifecycle + 4 ops + 1 meta)\n│   ├── interview-me/                  #   Define\n│   ├── idea-refine/                   #   Define\n│   ├── spec-driven-development/       #   Define\n│   ├── planning-and-task-breakdown/   #   Plan\n│   ├── incremental-implementation/    #   Build\n│   ├── context-engineering/           #   Build\n│   ├── source-driven-development/     #   Build\n│   ├── doubt-driven-development/      #   Build\n│   ├── frontend-ui-engineering/       #   Build\n│   ├── test-driven-development/       #   Build\n│   ├── api-and-interface-design/      #   Build\n│   ├── browser-testing-with-devtools/ #   Verify\n│   ├── debugging-and-error-recovery/  #   Verify\n│   ├── code-review-and-quality/       #   Review\n│   ├── code-simplification/           #   Review\n│   ├── security-and-hardening/        #   Review\n│   ├── performance-optimization/      #   Review\n│   ├── git-workflow-and-versioning/   #   Ship\n│   ├── ci-cd-and-automation/          #   Ship\n│   ├── deprecation-and-migration/     #   Ship\n│   ├── documentation-and-adrs/        #   Ship\n│   ├── observability-and-instrumentation/ # Ship\n│   ├── shipping-and-launch/           #   Ship\n│   ├── chaos-engineering/             #   Ops\n│   ├── cost-optimization/             #   Ops\n│   ├── data-engineering/              #   Ops\n│   ├── ai-ops/                        #   Ops\n│   └── using-agentforge/              #   Meta: how to use this pack\n├── agents/                            # 5 specialist personas\n├── references/                        # 5 supplementary checklists\n├── hooks/                             # Session lifecycle hooks\n├── scripts/                           # Validation & build automation\n├── .claude/commands/                  # 7 slash commands (Claude Code)\n├── .gemini/commands/                  # 7 slash commands 
(Gemini CLI)\n├── commands/                          # 8 slash commands (Antigravity CLI)\n├── plugin.json                        # Antigravity plugin manifest\n├── package.json                       # Node.js tooling & scripts\n├── Makefile                           # Local development workflows\n└── docs/                              # Setup guides per tool\n```\n\nThis repository includes a comprehensive validation and quality pipeline:\n\n```\n# Install dependencies\nnpm install\n\n# Run full validation suite\nnpm test\n\n# Or use Make\nmake ci\n```\n\n**Available commands:**\n\n| Command | What It Does |\n|---|---|\n| `npm run validate` | Validate all skill files for anatomy compliance |\n| `npm run validate:strict` | Same, but warnings block CI |\n| `npm run quality:cross-skill` | Check cross-skill consistency and references |\n| `npm run quality:agents` | Validate agent persona files |\n| `npm run test:hooks` | Test session lifecycle hooks |\n| `npm run build:packages` | Build `.zip` packages for distribution |\n| `npm run stats` | Show project statistics dashboard |\n\n**Quality gates enforced:**\n\n- YAML frontmatter validation (name, description, max length)\n- Required sections: Overview, When to Use, Common Rationalizations, Red Flags, Verification\n- Cross-skill reference integrity (no dead links)\n- Internal markdown link validation\n- Description quality (must contain both \"what\" and \"when\" signals)\n- Token estimation and size warnings\n- Code block language specifier checks\n- Agent persona consistency\n- Lifecycle coverage completeness\n\n\"I watched an AI agent ship a 'working' feature that had no tests, no error handling, and a SQL injection vulnerability. It was 'done' in 20 minutes. It would have taken 2 days to fix in production.\" (Every engineering lead, 2024-2025)\n\nAI agents are incredible accelerators. They're also incredible liability generators, because they optimize for *speed*, not *correctness*. 
They don't know what they don't know, and they don't know that they don't know it.\n\n**AgentForge is the guardrail.**\n\nEvery skill in this pack encodes hard-won judgment from production engineering:\n\n- **When** to write a spec (always, for anything non-trivial)\n- **What** to test (behavior, not implementation; edge cases, not just happy path)\n- **How** to review (five axes, not just \"does it compile\")\n- **When** to ship (when rollback is faster than fix-forward)\n\nThese aren't theoretical ideals. They're the workflows that separate teams that sleep through launches from teams that don't.\n\nThis pack draws from the best engineering cultures in the world:\n\n- **Google:** Hyrum's Law, Beyoncé Rule, test pyramid, change sizing, trunk-based development, code as liability\n- **Netflix:** Chaos engineering, circuit breakers, graceful degradation\n- **Stripe:** API design, backward compatibility, developer experience\n- **Amazon:** Two-pizza teams, service boundaries, operational readiness\n\nEvery principle is embedded directly into the step-by-step workflows agents follow, not as footnotes but as non-negotiable steps.\n\nWe accept contributions that make agents more reliable, not more clever.\n\nSee [docs/skill-anatomy.md](/borhen68/SkillEngine/blob/main/docs/skill-anatomy.md) for the format specification and [CONTRIBUTING.md](/borhen68/SkillEngine/blob/main/CONTRIBUTING.md) for guidelines. Every PR goes through the same quality gates the skills enforce: we eat our own dog food.\n\nMIT licensed: use these skills in your projects, teams, and tools. 
Build something great.", "url": "https://wpnews.pro/news/agentforge-28-production-grade-skills-that-make-ai-agents-ship-reliable-code", "canonical_source": "https://github.com/borhen68/SkillEngine", "published_at": "2026-06-11 23:35:50+00:00", "updated_at": "2026-06-11 23:48:58.498083+00:00", "lang": "en", "topics": ["ai-agents", "ai-tools", "ai-products", "ai-startups", "artificial-intelligence"], "entities": ["AgentForge", "Google", "Netflix", "Stripe"], "alternates": {"html": "https://wpnews.pro/news/agentforge-28-production-grade-skills-that-make-ai-agents-ship-reliable-code", "markdown": "https://wpnews.pro/news/agentforge-28-production-grade-skills-that-make-ai-agents-ship-reliable-code.md", "text": "https://wpnews.pro/news/agentforge-28-production-grade-skills-that-make-ai-agents-ship-reliable-code.txt", "jsonld": "https://wpnews.pro/news/agentforge-28-production-grade-skills-that-make-ai-agents-ship-reliable-code.jsonld"}}