[ARTICLE · art-23095] src=github.com pub= topic=ai-agents verified=true sentiment=↑ positive

Agent-ML-skills – Teach Codex/Claude/Cursor to stop making ML mistakes

Agent-ML-skills, a curated pack of 15 battle-tested machine learning skills, has been released to teach AI coding agents like Codex, Claude Code, and Cursor how to avoid common ML mistakes such as data leakage and scoring imbalanced data with accuracy. The skills install with a single command and provide expert guidance on tasks from exploratory data analysis to model serving, without bloating prompts. The tool aims to stop agents from guessing and instead work like experienced ML engineers.

4 min read · published Jun 6, 2026

Production-grade Machine Learning, Data Science & MLOps skills for AI coding agents.

Coding agents are great generalists but make the same ML mistakes over and over: leaking preprocessing into cross-validation, scoring imbalanced data with accuracy, forgetting model.eval(), building RAG with dense-only retrieval. agent-ml-skills is a curated pack of 15 battle-tested skills that teach your agent how an experienced ML engineer actually works, so it stops guessing.
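To make one of these mistakes concrete, here is a minimal, library-free sketch of why accuracy misleads on imbalanced data. The labels and the always-predict-majority "model" are invented for illustration and are not taken from the skill pack.

```python
# Toy labels: 1 = rare positive class (e.g. fraud), 0 = majority class.
y_true = [1] * 2 + [0] * 98
# A "model" that always predicts the majority class.
y_pred = [0] * 100


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that the model found."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if (tp + fn) else 0.0


print(f"accuracy = {accuracy(y_true, y_pred):.2f}")  # accuracy = 0.98
print(f"recall   = {recall(y_true, y_pred):.2f}")    # recall   = 0.00
```

The model looks 98% accurate while finding none of the positives, which is why a metric such as recall, precision, or F1 is usually the safer headline number when the target is rare.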

Works with Codex, Claude Code, Cursor, and OpenCode.

Install all skills into your agent with one command. There is no separate setup step and there are no dependencies:

npx agent-ml-skills install --target codex

npx agent-ml-skills install --target claude

npx agent-ml-skills install --target cursor --scope project

npx agent-ml-skills install --target opencode

npx agent-ml-skills install --target all

Browse what's inside first:

npx agent-ml-skills list

Then restart your agent (or start a new session) and it will pick the right skill up automatically when your task matches.

A skill is a single Markdown file with YAML frontmatter telling the agent when to use it and how to do the task well:

---
name: sklearn-pipelines
description: Use when building scikit-learn models that must not leak preprocessing...
---

...workflow, code patterns, pitfalls, hand-off...
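As a rough illustration of the file layout above, the following sketch splits a skill file into its frontmatter fields and Markdown body. It assumes flat `key: value` frontmatter and uses a hypothetical helper named `parse_skill`; it is not the parser any particular agent uses, and real YAML (nested or quoted values) would need a proper YAML library.

```python
def parse_skill(text: str) -> tuple[dict[str, str], str]:
    """Split a skill file into (frontmatter fields, Markdown body).

    Assumes the file starts with a '---' line, followed by flat
    `key: value` lines, followed by a closing '---' line.
    """
    lines = text.splitlines()
    stripped = [line.strip() for line in lines]
    if not stripped or stripped[0] != "---":
        raise ValueError("skill file must start with a '---' frontmatter line")
    try:
        end = stripped.index("---", 1)  # position of the closing delimiter
    except ValueError:
        raise ValueError("frontmatter is not closed with '---'") from None

    meta: dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed frontmatter line: {line!r}")
        meta[key.strip()] = value.strip()

    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


example = """---
name: sklearn-pipelines
description: Use when building scikit-learn models that must not leak preprocessing...
---

...workflow, code patterns, pitfalls, hand-off...
"""

meta, body = parse_skill(example)
print(meta["name"])
print(body)
```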

Agents that support skills load the description up front and pull in the full body only when the task matches, so you get expert guidance without bloating every prompt.
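The load-on-match behaviour described above can be sketched roughly as follows. Everything here is an assumption made for illustration: the `SkillEntry` type, the naive keyword-overlap matcher (real agents decide relevance with the model itself), and the `rag-pipeline` description text are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SkillEntry:
    name: str
    description: str              # short text that is always in context
    load_body: Callable[[], str]  # invoked only when the skill is selected


def words(text: str) -> set[str]:
    """Lowercased words with surrounding punctuation stripped."""
    return {w.strip(".,:;()").lower() for w in text.split()}


def select_skill(task: str, skills: list[SkillEntry],
                 min_overlap: int = 2) -> Optional[SkillEntry]:
    """Pick the skill whose description shares the most words with the task."""
    task_words = words(task)
    best, best_score = None, 0
    for skill in skills:
        score = len(task_words & words(skill.description))
        if score > best_score:
            best, best_score = skill, score
    return best if best_score >= min_overlap else None


loaded: list[str] = []  # records which bodies were actually read


def make_loader(name: str) -> Callable[[], str]:
    def load() -> str:
        loaded.append(name)
        return f"(full body of {name})"
    return load


skills = [
    SkillEntry(
        "sklearn-pipelines",
        "Use when building scikit-learn models that must not leak preprocessing...",
        make_loader("sklearn-pipelines"),
    ),
    SkillEntry(
        "rag-pipeline",
        "Use when building RAG with chunking, embeddings, hybrid retrieval",  # hypothetical wording
        make_loader("rag-pipeline"),
    ),
]

chosen = select_skill("Build a scikit-learn pipeline without preprocessing leakage", skills)
if chosen is not None:
    print(chosen.name, "->", chosen.load_body())
print("bodies loaded:", loaded)
```

Only the matching skill's body is read; the other skill costs nothing beyond its one-line description.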

| Skill | Use when… |
| --- | --- |
| exploratory-data-analysis | Starting on a new dataset: profiling, distributions, correlations, leakage & viz. |
| data-cleaning | Handling missing values, duplicates, types, outliers, with train-only imputation. |
| feature-engineering | Encoding, scaling, datetime/text/aggregation features, leakage-safe target encoding. |
| pandas-patterns | Writing idiomatic, vectorized, memory-efficient pandas (no SettingWithCopyWarning). |
| imbalanced-data | The target is rare (fraud/churn/disease): metrics, SMOTE, class weights, thresholds. |
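"Train-only imputation", mentioned for data-cleaning above, means the fill value is learned from the training split and then reused unchanged on every other split. The sketch below shows the idea with the standard library only; the data and helper names are invented for illustration and are not code from the skill itself.

```python
from statistics import mean


def fit_mean_imputer(train_values):
    """Learn the fill value from the training split only."""
    observed = [v for v in train_values if v is not None]
    return mean(observed)


def apply_imputer(values, fill_value):
    """Replace missing entries with a previously learned fill value."""
    return [fill_value if v is None else v for v in values]


train = [10.0, None, 14.0, 12.0]
test = [None, 100.0]  # the test split happens to contain an extreme value

fill = fit_mean_imputer(train)           # computed without looking at test
train_clean = apply_imputer(train, fill)
test_clean = apply_imputer(test, fill)   # reuses the train statistic

# Leaky alternative, for comparison: the statistic sees train + test.
leaky_fill = fit_mean_imputer(train + test)

print("train-only fill:", fill)
print("leaky fill:     ", leaky_fill)
print("test after imputation:", test_clean)
```

The leaky fill value is pulled toward the extreme test point, so a model trained on data imputed that way has indirectly seen the test set.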

| Skill | Use when… |
| --- | --- |
| sklearn-pipelines | Building scikit-learn models that must not leak preprocessing into CV. |
| pytorch-training-loop | Writing/reviewing a PyTorch loop: eval modes, AMP, checkpointing, devices. |
| model-evaluation | Choosing metrics, validating, calibration, confusion-matrix analysis. |
| hyperparameter-tuning | Optimizing params: random vs Optuna, leakage-safe CV, early stopping, budget. |
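"Leakage-safe CV" in the entries above comes down to refitting every preprocessing step inside each fold. The following is a library-free sketch of that loop, assuming a simple standard scaler and contiguous folds; in scikit-learn the same effect is normally achieved by putting the scaler and model in one pipeline and cross-validating the pipeline as a whole. All names and data here are illustrative.

```python
from statistics import mean, pstdev


def k_fold_indices(n: int, k: int):
    """Yield (train_idx, val_idx) for k contiguous folds (no shuffling)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, val
        start += size


def fit_scaler(values):
    """Return (mean, std) learned from the given values only."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # guard against zero variance
    return mu, sigma


def transform(values, scaler):
    mu, sigma = scaler
    return [(v - mu) / sigma for v in values]


x = [1.0, 2.0, 3.0, 4.0, 5.0, 60.0]  # the last value is an outlier

fold_means = []
for fold, (train_idx, val_idx) in enumerate(k_fold_indices(len(x), k=3)):
    scaler = fit_scaler([x[i] for i in train_idx])      # fit on the training fold only
    x_train = transform([x[i] for i in train_idx], scaler)
    x_val = transform([x[i] for i in val_idx], scaler)  # reuse the train statistics
    fold_means.append(scaler[0])
    print(f"fold {fold}: train mean = {scaler[0]:.2f}, val indices = {val_idx}")
    # ...fit and score a model on (x_train, x_val) here...
```

Each fold gets its own scaler, so the validation rows never influence the statistics used to transform them.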

| Skill | Use when… |
| --- | --- |
| llm-finetuning | Fine-tuning an LLM: full vs LoRA/QLoRA, data formatting, transformers/PEFT/TRL. |
| rag-pipeline | Building RAG: chunking, embeddings, hybrid + reranking retrieval, eval. |
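Hybrid retrieval, as named in the rag-pipeline entry, combines a dense (embedding) ranking with a keyword ranking. One common way to merge the two is reciprocal rank fusion, sketched below. The source does not say which fusion method the skill recommends, so treat this as an illustrative choice; the document IDs and the constant `k = 60` (a widely used default) are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k: int = 60):
    """Fuse several ranked lists of doc IDs; higher fused score ranks first.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken alphabetically by ID.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


dense_hits = ["doc_a", "doc_b", "doc_c"]    # hypothetical embedding-search results
keyword_hits = ["doc_c", "doc_a", "doc_d"]  # hypothetical keyword-search results

fused = reciprocal_rank_fusion([dense_hits, keyword_hits])
print(fused)  # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Documents found by both retrievers rise to the top, and `doc_d`, which dense-only retrieval would have missed entirely, still makes the candidate list that a reranker would then reorder.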

| Skill | Use when… |
| --- | --- |
| experiment-tracking | Experiments need comparing/reproducing: MLflow/W&B, what to log, registry. |
| reproducible-ml | A result must be reproducible: seeds, env pinning, data versioning, CUDA determinism. |
| ml-debugging | A model won't learn, loss is NaN, or metrics look too good: a diagnosis decision tree. |
| model-serving | Deploying behind an API: FastAPI, safe artifact, batching, ONNX, monitoring. |

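Seeding is the simplest of the reproducibility measures listed for reproducible-ml. The sketch below shows one way to make a train/validation split repeatable with a private, seeded generator. It covers only Python's standard `random` module; NumPy, PyTorch, and CUDA each keep their own random state and need to be seeded separately. The function name and parameters are illustrative.

```python
import random


def split_indices(n: int, val_fraction: float, seed: int):
    """Shuffle 0..n-1 with a private, seeded RNG and split into (train, val)."""
    rng = random.Random(seed)  # local RNG, unaffected by global seeding elsewhere
    indices = list(range(n))
    rng.shuffle(indices)
    n_val = int(n * val_fraction)
    return indices[n_val:], indices[:n_val]


train_a, val_a = split_indices(10, 0.2, seed=42)
train_b, val_b = split_indices(10, 0.2, seed=42)

print(val_a == val_b)  # True: same seed, same split
```

Passing the seed explicitly, rather than relying on a global `random.seed(...)` call, keeps the split stable even if other code draws random numbers in between.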
npx agent-ml-skills <command> [options]

Commands
  list                       List available skills
  install                    Install skills into an agent

Options
  -t, --target <name>        codex | claude | opencode | cursor | all
      --scope <scope>        global (default) | project
      --skills <a,b,c>       comma-separated subset (default: all)
      --dir <path>           install into a custom directory (overrides target)
  -f, --force                overwrite existing skills
  -h, --help                 show this help

Examples

npx agent-ml-skills install --target claude --skills rag-pipeline,llm-finetuning --scope project

npx agent-ml-skills install --dir ./my-agent/skills

npx agent-ml-skills install --target codex --force

| Target | Global | Project |
| --- | --- | --- |
| Codex | ~/.codex/skills | .codex/skills |
| Claude Code | ~/.claude/skills | .claude/skills |
| OpenCode | ~/.config/opencode/skills | .opencode/skills |
| Cursor | | .cursor/rules (flat .md rules) |
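For scripting around the installer, the locations above can be expressed as a small lookup. This is a sketch transcribed from the listed paths, not code from the package. It assumes the Cursor entry is project-scoped because its path is relative, and `skills_dir` is a hypothetical helper name.

```python
from pathlib import Path

# (target, scope) -> directory, transcribed from the locations listed above.
# Assumption: Cursor has only a project-level location.
INSTALL_DIRS = {
    ("codex", "global"): "~/.codex/skills",
    ("codex", "project"): ".codex/skills",
    ("claude", "global"): "~/.claude/skills",
    ("claude", "project"): ".claude/skills",
    ("opencode", "global"): "~/.config/opencode/skills",
    ("opencode", "project"): ".opencode/skills",
    ("cursor", "project"): ".cursor/rules",
}


def skills_dir(target: str, scope: str = "global", project_root: str = ".") -> Path:
    """Resolve the directory where skills for `target` would be found."""
    try:
        raw = INSTALL_DIRS[(target, scope)]
    except KeyError:
        raise ValueError(f"no known {scope} location for target {target!r}") from None
    if raw.startswith("~"):
        return Path(raw).expanduser()  # global paths live under the home directory
    return Path(project_root) / raw    # project paths are relative to the repo root


print(skills_dir("claude", "project", project_root="/work/my-app"))
```

A check such as `skills_dir("claude", "project").is_dir()` is then enough to confirm that a project-scoped install landed where expected.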

- Leakage-safe by default. Every data skill fits transforms on train only.
- Concrete over abstract. Real code patterns, not vague advice.
- Pitfalls included. Each skill ends with the mistakes agents actually make.
- Composable. Skills hand off to each other (EDA → cleaning → features → pipeline → eval → serving).
- Zero-dependency installer. Pure Node, nothing to install, nothing to trust.

New skills and improvements are very welcome; see **CONTRIBUTING.md**. Every skill is validated in CI:

node scripts/validate-skills.mjs

Open a skill request if there's an ML workflow you want your agent to master.

Built by **Param Bhavsar** (Google Summer of Code '19 @ TensorFlow, ex-HSBC). If this saves you a debugging session, a ⭐ helps others find it.
