[ARTICLE · art-24611] src=github.com pub= topic=ai-agents verified=true sentiment=↑ positive

AgentForge: 28 production-grade skills that make AI agents ship reliable code

AgentForge launched a set of 28 production-grade skills and workflows designed to make AI coding agents build reliable software by enforcing structured development processes used by senior engineers at companies like Google, Netflix, and Stripe. The system provides step-by-step workflows with checkpoints, evidence-based verification, and anti-rationalization defenses across seven slash commands covering the full development lifecycle from specification to operations. AgentForge aims to solve the problem of AI agents shipping prototypes instead of production code by replacing vague suggestions with battle-tested, verifiable development practices.

9 min read · published Jun 11, 2026

Forge production-grade AI agents.

28 battle-tested skills. 6 specialist personas. 5 reference checklists. One mission: make AI agents build software like senior engineers.

AI coding agents are fast. They're also reckless.

They skip specs. They "forget" tests. They ship without review. They treat "it works on my machine" as a success criterion. In short, they build prototypes, not production software.

AgentForge fixes this.

We don't give agents vague suggestions. We give them structured, battle-tested workflows that encode how senior engineers actually build software — the same workflows that power teams at Google, Netflix, and Stripe. Every skill has steps, checkpoints, anti-rationalization defenses, and evidence-based verification. When an agent follows these, it ships code you can trust.

  DEFINE          PLAN           BUILD          VERIFY         REVIEW          SHIP           OPS
 ┌──────┐      ┌──────┐      ┌──────┐      ┌──────┐      ┌──────┐      ┌──────┐      ┌──────┐
 │ Idea │ ───▶ │ Spec │ ───▶ │ Code │ ───▶ │ Test │ ───▶ │  QA  │ ───▶ │  Go  │ ───▶ │ Run  │
 │Refine│      │  PRD │      │ Impl │      │Debug │      │ Gate │      │ Live │      │ Ops  │
 └──────┘      └──────┘      └──────┘      └──────┘      └──────┘      └──────┘      └──────┘
  /spec          /plan          /build        /test         /review       /ship

7 slash commands. 28 skills. 6 phases. Zero excuses.

| | Other Prompt Packs | AgentForge |
|---|---|---|
| Structure | Vague advice | Step-by-step workflows with checkpoints |
| Verification | "Make sure it works" | Evidence-based exit criteria (tests, builds, runtime data) |
| Anti-cheating | None | Rationalization tables that call out excuses agents use to skip steps |
| Scope | Generic coding tips | Full lifecycle: spec → plan → build → verify → review → ship → ops |
| Quality gates | None | Built-in CI pipeline with 8 automated checks |
| Cross-reference | Silos | Every skill references related skills; no duplication |

7 slash commands that map to the development lifecycle. Each one activates the right skills automatically.

| What you're doing | Command | Key principle |
|---|---|---|
| Define what to build | /spec | Spec before code |
| Plan how to build it | /plan | Small, atomic tasks |
| Build incrementally | /build | One slice at a time |
| Prove it works | /test | Tests are proof |
| Review before merge | /review | Improve code health |
| Simplify the code | /code-simplify | Clarity over cleverness |
| Ship to production | /ship | Faster is safer |

Want fewer manual steps once the spec exists? /build auto generates the plan and implements every task in a single approved pass. You approve the plan once, then it runs autonomously. It removes the human stepping between tasks, not the verification: every task is still test-driven and committed individually, and it pauses on failures or risky steps.
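The /build auto behavior can be pictured as a simple loop. The sketch below is illustrative only and is not AgentForge's implementation: the Task shape, run_plan, and the run_tests and commit callbacks are hypothetical names introduced here. It shows the one property the text stresses, that automation removes the stepping between tasks but still halts at a failing test run or a risky step.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    """One small, atomic task from the approved plan (hypothetical shape)."""
    name: str
    risky: bool = False


def run_plan(
    tasks: List[Task],
    run_tests: Callable[[Task], bool],
    commit: Callable[[Task], None],
) -> List[str]:
    """Work through tasks in order and return a log of what happened.

    The loop halts at the first risky task or failing test run, so
    verification is never skipped even though stepping is automated.
    """
    log: List[str] = []
    for task in tasks:
        if task.risky:
            log.append(f"paused: {task.name} is marked risky")
            break
        if not run_tests(task):
            log.append(f"paused: tests failed for {task.name}")
            break
        commit(task)  # each task gets its own commit
        log.append(f"committed: {task.name}")
    return log


# Example run with stand-in callbacks (assumptions, not real tooling).
plan = [Task("add schema"), Task("add endpoint"), Task("drop old table", risky=True)]
committed: List[str] = []
for entry in run_plan(plan, run_tests=lambda t: True, commit=lambda t: committed.append(t.name)):
    print(entry)
```

In this sketch a human would resume the run after resolving whatever caused the pause; how the real command resumes is not described in the source.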

Skills also activate automatically based on what you're doing: designing an API triggers api-and-interface-design, building UI triggers frontend-ui-engineering, and so on.

Claude Code (recommended)

Marketplace install:

/plugin marketplace add borhen68/SkillEngine
/plugin install agentforge@borhen-agentforge

SSH errors? The marketplace clones repos via SSH. If you don't have SSH keys set up on GitHub, either add your SSH key or use the full HTTPS URL to force HTTPS cloning:

/plugin marketplace add https://github.com/borhen68/SkillEngine.git
/plugin install agentforge@borhen-agentforge

Local / development:

git clone https://github.com/borhen68/SkillEngine.git
claude --plugin-dir /path/to/agentforge

Cursor

Copy any SKILL.md into .cursor/rules/, or reference the full skills/ directory. See docs/cursor-setup.md.

Antigravity CLI

Install as a native plugin for skills, subagents, and slash commands. See docs/antigravity-setup.md.

Install from the repo:

agy plugin install https://github.com/borhen68/SkillEngine.git

Install from a local clone:

git clone https://github.com/borhen68/SkillEngine.git agentforge
agy plugin install ./agentforge

Gemini CLI

Install as native skills for auto-discovery, or add to GEMINI.md for persistent context. See docs/gemini-cli-setup.md.

Install from the repo:

gemini skills install https://github.com/borhen68/SkillEngine.git --path skills

Install from a local clone:

gemini skills install ./agentforge/skills/

Windsurf

Add skill contents to your Windsurf rules configuration. See docs/windsurf-setup.md.

OpenCode

Uses agent-driven skill execution via AGENTS.md and the skill tool.

GitHub Copilot

Use agent definitions from agents/ as Copilot personas and skill content in .github/copilot-instructions.md. See docs/copilot-setup.md.

Kiro IDE & CLI

Skills for Kiro live under .kiro/skills/ and can be stored at the project or global level. Kiro also supports AGENTS.md. See the Kiro docs: https://kiro.dev/docs/skills/

Codex / Other Agents

Skills are plain Markdown, so they work with any agent that accepts system prompts or instruction files. See docs/getting-started.md.

The commands above are entry points. The pack includes 28 skills total: 24 lifecycle skills, 4 operations skills, plus the using-agentforge meta-skill. Each skill is a structured workflow with steps, verification gates, and anti-rationalization tables. You can also reference any skill directly.

| Phase | Skills |
|---|---|
| Define | idea-refine, spec-driven-development |
| Build | test-driven-development, context-engineering, source-driven-development, doubt-driven-development, frontend-ui-engineering, api-and-interface-design |
| Verify | debugging-and-error-recovery |
| Review | code-simplification, security-and-hardening, performance-optimization |
| Ship | ci-cd-and-automation, deprecation-and-migration, documentation-and-adrs, observability-and-instrumentation, shipping-and-launch |
| Ops | cost-optimization, data-engineering, ai-ops |

Pre-configured specialist personas for targeted reviews:

  • test-engineer
  • security-auditor
  • web-performance-auditor (/webperf)
  • site-reliability-engineer

Quick-reference material that skills pull in when needed:

  • security-checklist.md
  • performance-checklist.md
  • accessibility-checklist.md
  • reliability-checklist.md

Every skill follows a consistent anatomy:

┌─────────────────────────────────────────────────┐
│  SKILL.md                                       │
│                                                 │
│  ┌─ Frontmatter ─────────────────────────────┐  │
│  │ name: lowercase-hyphen-name               │  │
│  │ description: Guides agents through [task].│  │
│  │              Use when…                    │  │
│  └───────────────────────────────────────────┘  │
│  Overview         → What this skill does        │
│  When to Use      → Triggering conditions       │
│  Process          → Step-by-step workflow       │
│  Rationalizations → Excuses + rebuttals         │
│  Red Flags        → Signs something's wrong     │
│  Verification     → Evidence requirements       │
└─────────────────────────────────────────────────┘
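To make the anatomy concrete, here is a minimal Python sketch that splits a SKILL.md into frontmatter and body and reports which sections are absent. It is not the repository's validator. The function names, the assumption that sections appear as Markdown headings, and the flat key: value frontmatter parsing are simplifications made for illustration, and the sample skill is invented.

```python
import re
from typing import Dict, List, Tuple

# Section names taken from the anatomy diagram; matched as substrings of headings.
REQUIRED_SECTIONS = [
    "Overview", "When to Use", "Process",
    "Rationalizations", "Red Flags", "Verification",
]


def split_frontmatter(text: str) -> Tuple[Dict[str, str], str]:
    """Split '---' delimited frontmatter from the Markdown body.

    Only flat 'key: value' pairs are handled; real YAML needs a YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta: Dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return meta, "\n".join(lines[i + 1:])
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # no closing delimiter: treat as having no frontmatter


def missing_sections(body: str, required: List[str] = REQUIRED_SECTIONS) -> List[str]:
    """Return required section names that no heading in the body mentions."""
    headings = [
        m.group(1).strip().lower()
        for m in re.finditer(r"^#{1,6}\s+(.+)$", body, re.MULTILINE)
    ]
    return [name for name in required if not any(name.lower() in h for h in headings)]


# Invented sample that deliberately omits the Verification section.
SAMPLE = """---
name: example-skill
description: Guides agents through an example task. Use when demonstrating the format.
---
## Overview
## When to Use
## Process
## Common Rationalizations
## Red Flags
"""

meta, body = split_frontmatter(SAMPLE)
print(meta["name"])            # → example-skill
print(missing_sections(body))  # → ['Verification']
```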

Key design choices:

  • Process, not prose. Skills are workflows agents follow, not reference docs they read. Each has steps, checkpoints, and exit criteria.
  • Anti-rationalization. Every skill includes a table of common excuses agents use to skip steps (e.g., "I'll add tests later") with documented counter-arguments.
  • Verification is non-negotiable. Every skill ends with evidence requirements: tests passing, build output, runtime data. "Seems right" is never sufficient.
  • Progressive disclosure. The SKILL.md is the entry point. Supporting references load only when needed, keeping token usage minimal.

agent-skills/
├── skills/                            # 28 skills (24 lifecycle + 4 ops + 1 meta)
│   ├── interview-me/                  #   Define
│   ├── idea-refine/                   #   Define
│   ├── spec-driven-development/       #   Define
│   ├── planning-and-task-breakdown/   #   Plan
│   ├── incremental-implementation/    #   Build
│   ├── context-engineering/           #   Build
│   ├── source-driven-development/     #   Build
│   ├── doubt-driven-development/      #   Build
│   ├── frontend-ui-engineering/       #   Build
│   ├── test-driven-development/       #   Build
│   ├── api-and-interface-design/      #   Build
│   ├── browser-testing-with-devtools/ #   Verify
│   ├── debugging-and-error-recovery/  #   Verify
│   ├── code-review-and-quality/       #   Review
│   ├── code-simplification/           #   Review
│   ├── security-and-hardening/        #   Review
│   ├── performance-optimization/      #   Review
│   ├── git-workflow-and-versioning/   #   Ship
│   ├── ci-cd-and-automation/          #   Ship
│   ├── deprecation-and-migration/     #   Ship
│   ├── documentation-and-adrs/        #   Ship
│   ├── observability-and-instrumentation/ # Ship
│   ├── shipping-and-launch/           #   Ship
│   ├── chaos-engineering/             #   Ops
│   ├── cost-optimization/             #   Ops
│   ├── data-engineering/              #   Ops
│   ├── ai-ops/                        #   Ops
│   └── using-agentforge/              #   Meta: how to use this pack
├── agents/                            # 5 specialist personas
├── references/                        # 5 supplementary checklists
├── hooks/                             # Session lifecycle hooks
├── scripts/                           # Validation & build automation
├── .claude/commands/                  # 7 slash commands (Claude Code)
├── .gemini/commands/                  # 7 slash commands (Gemini CLI)
├── commands/                          # 8 slash commands (Antigravity CLI)
├── plugin.json                        # Antigravity plugin manifest
├── package.json                       # Node.js tooling & scripts
├── Makefile                           # Local development workflows
└── docs/                              # Setup guides per tool

This repository includes a comprehensive validation and quality pipeline:

npm install

npm test

make ci

Available commands:

| Command | What It Does |
|---|---|
| npm run validate | Validate all skill files for anatomy compliance |
| npm run validate:strict | Same, but warnings block CI |
| npm run quality:cross-skill | Check cross-skill consistency and references |
| npm run quality:agents | Validate agent persona files |
| npm run test:hooks | Test session lifecycle hooks |
| npm run build:packages | Build .zip packages for distribution |
| npm run stats | Show project statistics dashboard |
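As an illustration of what a reference check like the one listed above might involve, the following Python sketch walks a directory of Markdown files and reports relative links whose targets do not exist. This is an assumption about the general technique, not the project's actual script (which runs through npm); dead_links and the link pattern are hypothetical and cover only inline [text](target) links.

```python
import re
from pathlib import Path
from typing import List, Tuple

# Inline Markdown links: [text](target). Reference-style links are not covered.
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Targets that are not local files: URLs with a scheme, or same-page anchors.
NON_LOCAL = re.compile(r"^(?:[a-z][a-z0-9+.-]*:|#)", re.IGNORECASE)


def dead_links(root: Path) -> List[Tuple[str, str]]:
    """Return (file, target) pairs for relative links whose target is missing."""
    broken: List[Tuple[str, str]] = []
    for md in sorted(root.rglob("*.md")):
        for target in LINK.findall(md.read_text(encoding="utf-8")):
            if NON_LOCAL.match(target):
                continue
            path = target.split("#", 1)[0]  # drop any #fragment
            if not (md.parent / path).exists():
                broken.append((md.relative_to(root).as_posix(), target))
    return broken
```

Calling dead_links(Path("skills")) from a checkout would return an empty list when every relative link resolves, which is the condition a CI step could assert on.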

Quality gates enforced:

  • YAML frontmatter validation (name, description, max length)
  • Required sections: Overview, When to Use, Common Rationalizations, Red Flags, Verification
  • Cross-skill reference integrity (no dead links)
  • Internal markdown link validation
  • Description quality (must contain both "what" and "when" signals)
  • Token estimation and size warnings
  • Code block language specifier checks
  • Agent persona consistency
  • Lifecycle coverage completeness
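The first and fifth gates can be sketched in a few lines of Python. This is a hedged illustration, not the pipeline's code: frontmatter_problems is a hypothetical helper, the 1024-character limit is an assumed placeholder because the real threshold is not stated, and treating the phrase "Use when" as the "when" signal (with any text before it as the "what" signal) is a heuristic inferred from the description template in the skill anatomy.

```python
import re
from typing import Dict, List

# "lowercase-hyphen-name": lowercase alphanumeric words joined by single hyphens.
NAME_PATTERN = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")
MAX_DESCRIPTION_CHARS = 1024  # assumed limit; the real value is not given


def frontmatter_problems(
    meta: Dict[str, str], max_chars: int = MAX_DESCRIPTION_CHARS
) -> List[str]:
    """Return human-readable problems found in a skill's frontmatter."""
    problems: List[str] = []
    name = meta.get("name", "")
    description = meta.get("description", "")

    if not NAME_PATTERN.match(name):
        problems.append("name must be lowercase words joined by hyphens")

    if not description:
        problems.append("description is missing")
        return problems

    if len(description) > max_chars:
        problems.append(f"description exceeds {max_chars} characters")

    # Heuristic: text before "use when" says what; the phrase itself says when.
    what_part, found, _ = description.lower().partition("use when")
    if not found:
        problems.append('description lacks a "when" signal')
    if not what_part.strip():
        problems.append('description lacks a "what" signal')
    return problems


print(frontmatter_problems({"name": "Bad_Name", "description": "Use when unsure."}))
```

Returning a list of problems rather than raising on the first one lets a validator report every issue in a file in a single pass.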

"I watched an AI agent ship a 'working' feature that had no tests, no error handling, and a SQL injection vulnerability. It was 'done' in 20 minutes. It would have taken 2 days to fix in production." — Every engineering lead, 2024-2025

AI agents are incredible accelerators. They're also incredible liability generators — because they optimize for speed, not correctness. They don't know what they don't know, and they don't know that they don't know it.

AgentForge is the guardrail.

Every skill in this pack encodes hard-won judgment from production engineering:

  • When to write a spec (always, for anything non-trivial)
  • What to test (behavior, not implementation; edge cases, not just happy path)
  • How to review (five axes, not just "does it compile")
  • When to ship (when rollback is faster than fix-forward)

These aren't theoretical ideals. They're the workflows that separate teams that sleep through launches from teams that don't.

This pack draws from the best engineering cultures in the world:

  • Google: Hyrum's Law, Beyoncé Rule, test pyramid, change sizing, trunk-based development, code as liability
  • Netflix: Chaos engineering, circuit breakers, graceful degradation
  • Stripe: API design, backward compatibility, developer experience
  • Amazon: Two-pizza teams, service boundaries, operational readiness

Every principle is embedded directly into the step-by-step workflows agents follow — not as footnotes, but as non-negotiable steps.

We accept contributions that make agents more reliable, not more clever.

See docs/skill-anatomy.md for the format specification and CONTRIBUTING.md for guidelines. Every PR goes through the same quality gates the skills enforce — eat your own dog food.

MIT — use these skills in your projects, teams, and tools. Build something great.
