AgentForge: 28 production-grade skills that make AI agents ship reliable code

AgentForge launched a set of 28 production-grade skills and workflows designed to make AI coding agents build reliable software by enforcing structured development processes used by senior engineers at companies like Google, Netflix, and Stripe. The system provides step-by-step workflows with checkpoints, evidence-based verification, and anti-rationalization defenses across seven slash commands covering the full development lifecycle from specification to operations. AgentForge aims to solve the problem of AI agents shipping prototypes instead of production code by replacing vague suggestions with battle-tested, verifiable development practices.

Forge production-grade AI agents.

28 battle-tested skills. 6 specialist personas. 5 reference checklists. One mission: make AI agents build software like senior engineers.

AI coding agents are fast. They're also reckless. They skip specs. They "forget" tests. They ship without review. They treat "it works on my machine" as a success criterion. In short, they build prototypes, not production software.

AgentForge fixes this. We don't give agents vague suggestions. We give them structured, battle-tested workflows that encode how senior engineers actually build software: the same workflows that power teams at Google, Netflix, and Stripe. Every skill has steps, checkpoints, anti-rationalization defenses, and evidence-based verification. When an agent follows these, it ships code you can trust.

     DEFINE        PLAN         BUILD         VERIFY        REVIEW         SHIP          OPS
    ┌──────┐      ┌──────┐      ┌──────┐      ┌──────┐      ┌──────┐      ┌──────┐      ┌──────┐
    │ Idea │ ───▶ │ Spec │ ───▶ │ Code │ ───▶ │ Test │ ───▶ │  QA  │ ───▶ │  Go  │ ───▶ │ Run  │
    │Refine│      │ PRD  │      │ Impl │      │Debug │      │ Gate │      │ Live │      │ Ops  │
    └──────┘      └──────┘      └──────┘      └──────┘      └──────┘      └──────┘      └──────┘
     /spec         /plan         /build        /test        /review        /ship

7 slash commands. 28 skills. 7 phases. Zero excuses.

| | Other Prompt Packs | AgentForge |
|---|---|---|
| Structure | Vague advice | Step-by-step workflows with checkpoints |
| Verification | "Make sure it works" | Evidence-based exit criteria (tests, builds, runtime data) |
| Anti-cheating | None | Rationalization tables that call out excuses agents use to skip steps |
| Scope | Generic coding tips | Full lifecycle: spec → plan → build → verify → review → ship → ops |
| Quality gates | None | Built-in CI pipeline with 8 automated checks |
| Cross-reference | Silos | Every skill references related skills; no duplication |

7 slash commands that map to the development lifecycle. Each one activates the right skills automatically.

| What you're doing | Command | Key principle |
|---|---|---|
| Define what to build | `/spec` | Spec before code |
| Plan how to build it | `/plan` | Small, atomic tasks |
| Build incrementally | `/build` | One slice at a time |
| Prove it works | `/test` | Tests are proof |
| Review before merge | `/review` | Improve code health |
| Simplify the code | `/code-simplify` | Clarity over cleverness |
| Ship to production | `/ship` | Faster is safer |

Want fewer manual steps once the spec exists? `/build auto` generates the plan and implements every task in a single approved pass: you approve the plan once, then it runs autonomously. It removes the human hand-off between tasks, not the verification. Every task is still test-driven and committed individually, and the run pauses on failures or risky steps.

Skills also activate automatically based on what you're doing: designing an API triggers `api-and-interface-design`, building UI triggers `frontend-ui-engineering`, and so on.

Claude Code (recommended)

Marketplace install:

    /plugin marketplace add borhen68/SkillEngine
    /plugin install agentforge@borhen-agentforge

SSH errors? The marketplace clones repos via SSH. If you don't have SSH keys set up on GitHub, either add your SSH key or use the full HTTPS URL to force HTTPS cloning:

    /plugin marketplace add https://github.com/borhen68/SkillEngine.git
    /plugin install agentforge@borhen-agentforge

Local / development:

    git clone https://github.com/borhen68/SkillEngine.git
    claude --plugin-dir /path/to/agentforge

Cursor

Copy any `SKILL.md` into `.cursor/rules/`, or reference the full `skills/` directory. See [docs/cursor-setup.md](/borhen68/SkillEngine/blob/main/docs/cursor-setup.md).

Antigravity CLI

Install as a native plugin for skills, subagents, and slash commands. See [docs/antigravity-setup.md](/borhen68/SkillEngine/blob/main/docs/antigravity-setup.md).

Install from the repo:

    agy plugin install https://github.com/borhen68/SkillEngine.git

Install from a local clone:

    git clone https://github.com/borhen68/SkillEngine.git
    agy plugin install ./agentforge

Gemini CLI

Install as native skills for auto-discovery, or add to `GEMINI.md` for persistent context. See [docs/gemini-cli-setup.md](/borhen68/SkillEngine/blob/main/docs/gemini-cli-setup.md).

Install from the repo:

    gemini skills install https://github.com/borhen68/SkillEngine.git --path skills

Install from a local clone:

    gemini skills install ./agentforge/skills/

Windsurf

Add skill contents to your Windsurf rules configuration. See [docs/windsurf-setup.md](/borhen68/SkillEngine/blob/main/docs/windsurf-setup.md).

OpenCode

Uses agent-driven skill execution via `AGENTS.md` and the `skill` tool.

GitHub Copilot

Use agent definitions from `agents/` as Copilot personas and skill content in `.github/copilot-instructions.md`. See [docs/copilot-setup.md](/borhen68/SkillEngine/blob/main/docs/copilot-setup.md).

Kiro IDE & CLI

Skills for Kiro live under `.kiro/skills/` and can be stored at the project or global level. Kiro also supports Agents.md. See the Kiro docs at https://kiro.dev/docs/skills/

Codex / Other Agents

Skills are plain Markdown, so they work with any agent that accepts system prompts or instruction files. See [docs/getting-started.md](/borhen68/SkillEngine/blob/main/docs/getting-started.md).

The commands above are entry points. The pack includes 28 skills total: 23 lifecycle skills, 4 operations skills, plus the `using-agentforge` meta-skill. Each skill is a structured workflow with steps, verification gates, and anti-rationalization tables. You can also reference any skill directly.
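
Referencing a skill directly comes down to locating its `SKILL.md`. As a rough illustration, the sketch below resolves a skill name to a file inside a local clone, assuming the `skills/<name>/SKILL.md` layout that the skill links in this README follow. The helper names `skill_file` and `list_skills` are hypothetical and not part of the pack.

```python
from pathlib import Path


def skill_file(repo_root: Path, skill_name: str) -> Path:
    """Return the SKILL.md path for one skill, e.g. skills/idea-refine/SKILL.md."""
    path = repo_root / "skills" / skill_name / "SKILL.md"
    if not path.is_file():
        raise FileNotFoundError(
            f"no SKILL.md for skill {skill_name!r} under {repo_root}"
        )
    return path


def list_skills(repo_root: Path) -> list[str]:
    """Names of all directories under skills/ that contain a SKILL.md."""
    skills_dir = repo_root / "skills"
    return sorted(p.parent.name for p in skills_dir.glob("*/SKILL.md"))
```

With a clone in a directory of your choosing, `skill_file(Path("path/to/clone"), "idea-refine").read_text()` would return the text to paste into an agent's instruction file, and `list_skills(...)` gives a quick inventory to compare against the 28 skills the pack advertises.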

| Phase | Skills |
|---|---|
| Define | [idea-refine](/borhen68/SkillEngine/blob/main/skills/idea-refine/SKILL.md), [spec-driven-development](/borhen68/SkillEngine/blob/main/skills/spec-driven-development/SKILL.md) |
| Build | [test-driven-development](/borhen68/SkillEngine/blob/main/skills/test-driven-development/SKILL.md), [context-engineering](/borhen68/SkillEngine/blob/main/skills/context-engineering/SKILL.md), [source-driven-development](/borhen68/SkillEngine/blob/main/skills/source-driven-development/SKILL.md), [doubt-driven-development](/borhen68/SkillEngine/blob/main/skills/doubt-driven-development/SKILL.md), [frontend-ui-engineering](/borhen68/SkillEngine/blob/main/skills/frontend-ui-engineering/SKILL.md), [api-and-interface-design](/borhen68/SkillEngine/blob/main/skills/api-and-interface-design/SKILL.md) |
| Verify | [debugging-and-error-recovery](/borhen68/SkillEngine/blob/main/skills/debugging-and-error-recovery/SKILL.md) |
| Review | [code-simplification](/borhen68/SkillEngine/blob/main/skills/code-simplification/SKILL.md), [security-and-hardening](/borhen68/SkillEngine/blob/main/skills/security-and-hardening/SKILL.md), [performance-optimization](/borhen68/SkillEngine/blob/main/skills/performance-optimization/SKILL.md) |
| Ship | [ci-cd-and-automation](/borhen68/SkillEngine/blob/main/skills/ci-cd-and-automation/SKILL.md), [deprecation-and-migration](/borhen68/SkillEngine/blob/main/skills/deprecation-and-migration/SKILL.md), [documentation-and-adrs](/borhen68/SkillEngine/blob/main/skills/documentation-and-adrs/SKILL.md), [observability-and-instrumentation](/borhen68/SkillEngine/blob/main/skills/observability-and-instrumentation/SKILL.md), [shipping-and-launch](/borhen68/SkillEngine/blob/main/skills/shipping-and-launch/SKILL.md) |
| Ops | [cost-optimization](/borhen68/SkillEngine/blob/main/skills/cost-optimization/SKILL.md), [data-engineering](/borhen68/SkillEngine/blob/main/skills/data-engineering/SKILL.md), [ai-ops](/borhen68/SkillEngine/blob/main/skills/ai-ops/SKILL.md) |

Pre-configured specialist personas for targeted reviews:

- [test-engineer](/borhen68/SkillEngine/blob/main/agents/test-engineer.md)
- [security-auditor](/borhen68/SkillEngine/blob/main/agents/security-auditor.md)
- [web-performance-auditor](/borhen68/SkillEngine/blob/main/agents/web-performance-auditor.md) (`/webperf`)
- [site-reliability-engineer](/borhen68/SkillEngine/blob/main/agents/site-reliability-engineer.md)

Quick-reference material that skills pull in when needed:

- [security-checklist.md](/borhen68/SkillEngine/blob/main/references/security-checklist.md)
- [performance-checklist.md](/borhen68/SkillEngine/blob/main/references/performance-checklist.md)
- [accessibility-checklist.md](/borhen68/SkillEngine/blob/main/references/accessibility-checklist.md)
- [reliability-checklist.md](/borhen68/SkillEngine/blob/main/references/reliability-checklist.md)

Every skill follows a consistent anatomy:

    ┌─────────────────────────────────────────────────┐
    │  SKILL.md                                       │
    │                                                 │
    │  ┌─ Frontmatter ─────────────────────────────┐  │
    │  │ name: lowercase-hyphen-name               │  │
    │  │ description: Guides agents through task.  │  │
    │  │              Use when…                    │  │
    │  └───────────────────────────────────────────┘  │
    │  Overview         → What this skill does        │
    │  When to Use      → Triggering conditions       │
    │  Process          → Step-by-step workflow       │
    │  Rationalizations → Excuses + rebuttals         │
    │  Red Flags        → Signs something's wrong     │
    │  Verification     → Evidence requirements       │
    └─────────────────────────────────────────────────┘

Key design choices:

- Process, not prose. Skills are workflows agents follow, not reference docs they read. Each has steps, checkpoints, and exit criteria.
- Anti-rationalization.
Every skill includes a table of common excuses agents use to skip steps (e.g., "I'll add tests later") with documented counter-arguments.
- Verification is non-negotiable. Every skill ends with evidence requirements: tests passing, build output, runtime data. "Seems right" is never sufficient.
- Progressive disclosure. The `SKILL.md` is the entry point. Supporting references load only when needed, keeping token usage minimal.

    agent-skills/
    ├── skills/                                 # 28 skills (23 lifecycle + 4 ops + 1 meta)
    │   ├── interview-me/                       #   Define
    │   ├── idea-refine/                        #   Define
    │   ├── spec-driven-development/            #   Define
    │   ├── planning-and-task-breakdown/        #   Plan
    │   ├── incremental-implementation/         #   Build
    │   ├── context-engineering/                #   Build
    │   ├── source-driven-development/          #   Build
    │   ├── doubt-driven-development/           #   Build
    │   ├── frontend-ui-engineering/            #   Build
    │   ├── test-driven-development/            #   Build
    │   ├── api-and-interface-design/           #   Build
    │   ├── browser-testing-with-devtools/      #   Verify
    │   ├── debugging-and-error-recovery/       #   Verify
    │   ├── code-review-and-quality/            #   Review
    │   ├── code-simplification/                #   Review
    │   ├── security-and-hardening/             #   Review
    │   ├── performance-optimization/           #   Review
    │   ├── git-workflow-and-versioning/        #   Ship
    │   ├── ci-cd-and-automation/               #   Ship
    │   ├── deprecation-and-migration/          #   Ship
    │   ├── documentation-and-adrs/             #   Ship
    │   ├── observability-and-instrumentation/  #   Ship
    │   ├── shipping-and-launch/                #   Ship
    │   ├── chaos-engineering/                  #   Ops
    │   ├── cost-optimization/                  #   Ops
    │   ├── data-engineering/                   #   Ops
    │   ├── ai-ops/                             #   Ops
    │   └── using-agentforge/                   #   Meta: how to use this pack
    ├── agents/                                 # 5 specialist personas
    ├── references/                             # 5 supplementary checklists
    ├── hooks/                                  # Session lifecycle hooks
    ├── scripts/                                # Validation & build automation
    ├── .claude/commands/                       # 7 slash commands (Claude Code)
    ├── .gemini/commands/                       # 7 slash commands (Gemini CLI)
    ├── commands/                               # 8 slash commands (Antigravity CLI)
    ├── plugin.json                             # Antigravity plugin manifest
    ├── package.json                            # Node.js tooling & scripts
    ├── Makefile                                # Local development workflows
    └── docs/                                   # Setup guides per tool

This repository includes a comprehensive
validation and quality pipeline:

    # Install dependencies
    npm install

    # Run full validation suite
    npm test

    # Or use Make
    make ci

Available commands:

| Command | What It Does |
|---|---|
| `npm run validate` | Validate all skill files for anatomy compliance |
| `npm run validate:strict` | Same, but warnings block CI |
| `npm run quality:cross-skill` | Check cross-skill consistency and references |
| `npm run quality:agents` | Validate agent persona files |
| `npm run test:hooks` | Test session lifecycle hooks |
| `npm run build:packages` | Build .zip packages for distribution |
| `npm run stats` | Show project statistics dashboard |

Quality gates enforced:

- YAML frontmatter validation (name, description, max length)
- Required sections: Overview, When to Use, Common Rationalizations, Red Flags, Verification
- Cross-skill reference integrity (no dead links)
- Internal markdown link validation
- Description quality (must contain both "what" and "when" signals)
- Token estimation and size warnings
- Code block language specifier checks
- Agent persona consistency
- Lifecycle coverage completeness

> "I watched an AI agent ship a 'working' feature that had no tests, no error handling, and a SQL injection vulnerability. It was 'done' in 20 minutes. It would have taken 2 days to fix in production."
>
> Every engineering lead, 2024-2025

AI agents are incredible accelerators. They're also incredible liability generators, because they optimize for speed, not correctness. They don't know what they don't know, and they don't know that they don't know it.

AgentForge is the guardrail. Every skill in this pack encodes hard-won judgment from production engineering:

- When to write a spec (always, for anything non-trivial)
- What to test (behavior, not implementation; edge cases, not just happy path)
- How to review (five axes, not just "does it compile")
- When to ship (when rollback is faster than fix-forward)

These aren't theoretical ideals. They're the workflows that separate teams that sleep through launches from teams that don't.
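
To make two of the quality gates listed earlier concrete (frontmatter validation and required sections), here is a minimal sketch of what such a check could look like. It is not the repository's validator, which runs through `npm run validate` and whose implementation is not shown in this README. The function names are hypothetical, Python is used only for illustration, and the sketch assumes that sections appear as Markdown headings and that frontmatter consists of simple `key: value` lines between `---` markers.

```python
import re

# Section names copied from the "Required sections" quality gate.
REQUIRED_SECTIONS = [
    "Overview",
    "When to Use",
    "Common Rationalizations",
    "Red Flags",
    "Verification",
]

# Assumed reading of "lowercase-hyphen-name": lowercase words joined by hyphens.
NAME_PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")

# Assumption: sections are written as Markdown headings of any level.
HEADING_PATTERN = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def parse_frontmatter(text: str) -> dict[str, str]:
    """Read simple `key: value` pairs from a leading '---' block (no YAML library)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # the closing '---' was never found


def check_skill(text: str) -> list[str]:
    """Return a list of problems found in one SKILL.md; an empty list means it passes."""
    problems = []
    frontmatter = parse_frontmatter(text)
    if not NAME_PATTERN.match(frontmatter.get("name", "")):
        problems.append("frontmatter: name must be lowercase-hyphen-name")
    if not frontmatter.get("description"):
        problems.append("frontmatter: description is missing")
    headings = {m.group(1) for m in HEADING_PATTERN.finditer(text)}
    for section in REQUIRED_SECTIONS:
        if section not in headings:
            problems.append(f"missing section: {section}")
    return problems


# Hypothetical skill file that is missing three of the required sections.
sample = """---
name: example-skill
description: Guides agents through an example task. Use when demonstrating.
---
## Overview
## When to Use
"""
print(check_skill(sample))
```

For the sample above, the check reports the three absent sections (Common Rationalizations, Red Flags, Verification) and nothing about the frontmatter. Returning a list of problems instead of raising on the first one mirrors how a CI gate usually wants to report everything wrong with a file in a single run.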

This pack draws from the best engineering cultures in the world:

- Google: Hyrum's Law, Beyoncé Rule, test pyramid, change sizing, trunk-based development, code as liability
- Netflix: Chaos engineering, circuit breakers, graceful degradation
- Stripe: API design, backward compatibility, developer experience
- Amazon: Two-pizza teams, service boundaries, operational readiness

Every principle is embedded directly into the step-by-step workflows agents follow, not as footnotes but as non-negotiable steps.

We accept contributions that make agents more reliable, not more clever. See [docs/skill-anatomy.md](/borhen68/SkillEngine/blob/main/docs/skill-anatomy.md) for the format specification and [CONTRIBUTING.md](/borhen68/SkillEngine/blob/main/CONTRIBUTING.md) for guidelines. Every PR goes through the same quality gates the skills enforce: eat your own dog food.

MIT: use these skills in your projects, teams, and tools. Build something great.