Production-grade Machine Learning, Data Science & MLOps skills for AI coding agents.
Coding agents are great generalists but make the same ML mistakes over and over: leaking preprocessing into cross-validation, scoring imbalanced data with accuracy, forgetting `model.eval()`, building RAG with dense-only retrieval. `agent-ml-skills` is a curated pack of 15 battle-tested skills that teach your agent how an experienced ML engineer actually works, so it stops guessing.
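To make one of the mistakes listed above concrete, here is a minimal sketch of why accuracy misleads on imbalanced data. The labels and the 1% positive rate are made up for illustration, and the helper functions are hypothetical stand-ins written with the standard library only, not code taken from any skill in this pack.

```python
# Hypothetical data: 1,000 labels with a 1% positive rate (e.g. fraud).
y_true = [1] * 10 + [0] * 990
# A "model" that always predicts the majority class.
y_pred = [0] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true positives that the model actually found."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if (tp + fn) else 0.0


acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"accuracy={acc:.3f} recall={rec:.3f}")
```

The do-nothing classifier scores 99% accuracy while finding none of the positives, which is why a skill aimed at rare targets would steer an agent toward recall, precision, or threshold-aware metrics instead.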
Works with Codex, Claude Code, Cursor, and OpenCode.
Install all skills into your agent with one command, with no separate setup step and no dependencies:
npx agent-ml-skills install --target codex
npx agent-ml-skills install --target claude
npx agent-ml-skills install --target cursor --scope project
npx agent-ml-skills install --target opencode
npx agent-ml-skills install --target all
Browse what's inside first:
npx agent-ml-skills list
Then restart your agent (or start a new session) and it will pick the right skill up automatically when your task matches.
A skill is a single Markdown file with YAML frontmatter telling the agent when to use it and how to do the task well:
---
name: sklearn-pipelines
description: Use when building scikit-learn models that must not leak preprocessing...
---
...workflow, code patterns, pitfalls, hand-off...
Agents that support skills load the `description` up front and pull in the full body only when the task matches, so you get expert guidance without bloating every prompt.
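As a rough illustration of the two-stage loading described above, the sketch below splits a skill file into its frontmatter and body. It is an assumption-laden stand-in: `split_skill` is a hypothetical helper, it only handles flat `key: value` lines rather than full YAML, and it does not claim to match how any particular agent reads skills.

```python
# The sample skill text is copied from the example above, elisions included.
SKILL_TEXT = """---
name: sklearn-pipelines
description: Use when building scikit-learn models that must not leak preprocessing...
---
...workflow, code patterns, pitfalls, hand-off...
"""


def split_skill(text):
    """Return (metadata dict, body) for a skill file with simple frontmatter."""
    lines = text.splitlines()
    stripped = [line.strip() for line in lines]
    if not stripped or stripped[0] != "---":
        raise ValueError("skill file must start with a '---' frontmatter line")
    try:
        end = stripped.index("---", 1)  # position of the closing delimiter
    except ValueError:
        raise ValueError("frontmatter is not closed with '---'") from None
    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:])
    return meta, body


meta, body = split_skill(SKILL_TEXT)
# Stage 1: only the short description is needed to decide whether a task matches.
print(meta["name"], "->", meta["description"])
# Stage 2: the full body is read only once the skill has been selected.
print(len(body), "characters of guidance")
```

Keeping the description short matters under this scheme, since it is the only part that is always in view.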
| Skill | Use when… |
|---|---|
| exploratory-data-analysis | Starting on a new dataset: profiling, distributions, correlations, leakage & viz. |
| data-cleaning | Handling missing values, duplicates, types, outliers, with train-only imputation. |
| feature-engineering | Encoding, scaling, datetime/text/aggregation features, leakage-safe target encoding. |
| pandas-patterns | Writing idiomatic, vectorized, memory-efficient pandas (no `SettingWithCopyWarning`). |
| imbalanced-data | The target is rare (fraud/churn/disease): metrics, SMOTE, class weights, thresholds. |

| Skill | Use when… |
|---|---|
| sklearn-pipelines | Building scikit-learn models that must not leak preprocessing into CV. |
| pytorch-training-loop | Writing/reviewing a PyTorch loop: eval modes, AMP, checkpointing, devices. |
| model-evaluation | Choosing metrics, validating, calibration, confusion-matrix analysis. |
| hyperparameter-tuning | Optimizing params: random vs Optuna, leakage-safe CV, early stopping, budget. |

| Skill | Use when… |
|---|---|
| llm-finetuning | Fine-tuning an LLM: full vs LoRA/QLoRA, data formatting, transformers/PEFT/TRL. |
| rag-pipeline | Building RAG: chunking, embeddings, hybrid + reranking retrieval, eval. |

| Skill | Use when… |
|---|---|
| experiment-tracking | Experiments need comparing/reproducing: MLflow/W&B, what to log, registry. |
| reproducible-ml | A result must be reproducible: seeds, env pinning, data versioning, CUDA determinism. |
| ml-debugging | A model won't learn, loss is NaN, or metrics look too good: a diagnosis decision tree. |
| model-serving | Deploying behind an API: FastAPI, safe artifacts, batching, ONNX, monitoring. |
npx agent-ml-skills <command> [options]

Commands
  list                 List available skills
  install              Install skills into an agent

Options
  -t, --target <name>  codex | claude | opencode | cursor | all
  --scope <scope>      global (default) | project
  --skills <a,b,c>     comma-separated subset (default: all)
  --dir <path>         install into a custom directory (overrides target)
  -f, --force          overwrite existing skills
  -h, --help           show this help

Examples
  npx agent-ml-skills install --target claude --skills rag-pipeline,llm-finetuning --scope project
  npx agent-ml-skills install --dir ./my-agent/skills
  npx agent-ml-skills install --target codex --force
| Target | Global | Project |
|---|---|---|
| Codex | `~/.codex/skills` | `.codex/skills` |
| Claude Code | `~/.claude/skills` | `.claude/skills` |
| OpenCode | `~/.config/opencode/skills` | `.opencode/skills` |
| Cursor | n/a | `.cursor/rules` (flat `.md` rules) |
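The table above can be read as a lookup from target and scope to a directory. The sketch below encodes it that way. `SKILL_DIRS` and `resolve_skill_dir` are hypothetical names used for illustration and are not the installer's actual code; treating project paths as relative to a project root is also an assumption.

```python
from pathlib import Path

# Directory layout copied from the table above; None marks an unsupported combination.
SKILL_DIRS = {
    "codex": {"global": "~/.codex/skills", "project": ".codex/skills"},
    "claude": {"global": "~/.claude/skills", "project": ".claude/skills"},
    "opencode": {"global": "~/.config/opencode/skills", "project": ".opencode/skills"},
    "cursor": {"global": None, "project": ".cursor/rules"},
}


def resolve_skill_dir(target, scope="global", project_root="."):
    """Map a (target, scope) pair to the directory listed in the table."""
    if target not in SKILL_DIRS:
        raise ValueError(f"unknown target: {target!r}")
    if scope not in ("global", "project"):
        raise ValueError(f"unknown scope: {scope!r}")
    location = SKILL_DIRS[target][scope]
    if location is None:
        raise ValueError(f"{target} has no {scope} skills directory")
    if scope == "global":
        return Path(location).expanduser()  # expand "~" to the home directory
    return Path(project_root) / location  # assumed: relative to the project root


print(resolve_skill_dir("claude", scope="project", project_root="/repo"))
print(resolve_skill_dir("codex"))
```

Under this reading, Cursor needs `--scope project`, which matches the `--target cursor --scope project` form of the install command.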
- **Leakage-safe by default.** Every data skill fits transforms on train only.
- **Concrete over abstract.** Real code patterns, not vague advice.
- **Pitfalls included.** Each skill ends with the mistakes agents actually make.
- **Composable.** Skills hand off to each other (EDA → cleaning → features → pipeline → eval → serving).
- **Zero-dependency installer.** Pure Node, nothing to install, nothing to trust.
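"Fits transforms on train only" can be illustrated with a tiny standard-library sketch. `StandardScaler1D` is a hypothetical stand-in for a real scaler and the numbers are invented; the point is only that statistics learned in `fit` must come from the training split and are then reused unchanged on held-out data.

```python
from statistics import mean, pstdev


class StandardScaler1D:
    """Toy scaler: learns mean/std in fit(), applies them in transform()."""

    def fit(self, values):
        self.mean_ = mean(values)
        self.std_ = pstdev(values) or 1.0  # avoid dividing by zero on constant data
        return self

    def transform(self, values):
        return [(v - self.mean_) / self.std_ for v in values]


train = [1.0, 2.0, 3.0, 4.0, 5.0]
test = [100.0]  # held-out data the model must not see during fitting

# Leakage-safe: statistics come from the training split only.
scaler = StandardScaler1D().fit(train)
train_scaled = scaler.transform(train)
test_scaled = scaler.transform(test)

# Leaky: fitting on train + test lets held-out values shift the statistics.
leaky = StandardScaler1D().fit(train + test)

print(f"train-only mean={scaler.mean_:.2f}, leaky mean={leaky.mean_:.2f}")
```

The leaky mean is pulled toward the held-out value, so any model trained on data scaled that way has indirectly seen the test set.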
New skills and improvements are very welcome; see **CONTRIBUTING.md**. Every skill is validated in CI:
node scripts/validate-skills.mjs
Open a skill request if there's an ML workflow you want your agent to master.
Built by **Param Bhavsar** (Google Summer of Code '19 @ TensorFlow, ex-HSBC). If this saves you a debugging session, a ⭐ helps others find it.