{"slug": "agent-ml-skills-teach-codex-claude-cursor-to-stop-making-ml-mistakes", "title": "Agent-ML-skills – Teach Codex/Claude/Cursor to stop making ML mistakes", "summary": "Agent-ML-skills, a curated pack of 15 battle-tested machine learning skills, has been released to teach AI coding agents like Codex, Claude Code, and Cursor how to avoid common ML mistakes such as data leakage and scoring imbalanced data with accuracy. The skills install with a single command and provide expert guidance on tasks from exploratory data analysis to model serving, without bloating prompts. The tool aims to stop agents from guessing and instead work like experienced ML engineers.", "body_md": "**Production-grade Machine Learning, Data Science & MLOps skills for AI coding agents.**\n\nCoding agents are great generalists but make the same **ML mistakes over and over**: leaking preprocessing into cross-validation, scoring imbalanced data with accuracy, forgetting `model.eval()`, building RAG with dense-only retrieval. 
`agent-ml-skills` is a curated pack of **15 battle-tested skills** that teach your agent how an experienced ML engineer actually works — so it stops guessing.\n\nWorks with **Codex, Claude Code, Cursor, and OpenCode**.\n\nInstall all skills into your agent with a single `npx` command, with **nothing to install first and no dependencies**:\n\n```\n# Codex\nnpx agent-ml-skills install --target codex\n\n# Claude Code\nnpx agent-ml-skills install --target claude\n\n# Cursor\nnpx agent-ml-skills install --target cursor --scope project\n\n# OpenCode\nnpx agent-ml-skills install --target opencode\n\n# Everything, everywhere\nnpx agent-ml-skills install --target all\n```\n\nBrowse what's inside first:\n\n```\nnpx agent-ml-skills list\n```\n\nThen restart your agent (or start a new session) and it will pick up the right skill automatically when your task matches.\n\nA skill is a single Markdown file with YAML frontmatter telling the agent **when** to use it and **how** to do the task well:\n\n```\n---\nname: sklearn-pipelines\ndescription: Use when building scikit-learn models that must not leak preprocessing...\n---\n\n# scikit-learn Pipelines\n...workflow, code patterns, pitfalls, hand-off...\n```\n\nAgents that support skills load the `description` up front and pull in the full body only when the task matches — so you get expert guidance **without bloating every prompt**.\n\n| Skill | Use when… |\n|---|---|\n| `exploratory-data-analysis` | Starting on a new dataset — profiling, distributions, correlations, leakage & viz. |\n| `data-cleaning` | Handling missing values, duplicates, types, outliers — with train-only imputation. |\n| `feature-engineering` | Encoding, scaling, datetime/text/aggregation features, leakage-safe target encoding. |\n| `pandas-patterns` | Writing idiomatic, vectorized, memory-efficient pandas (no `SettingWithCopyWarning`). |\n| `imbalanced-data` | The target is rare (fraud/churn/disease) — metrics, SMOTE, class weights, thresholds. 
|\n\n| Skill | Use when… |\n|---|---|\n| `sklearn-pipelines` | Building scikit-learn models that must not leak preprocessing into CV. |\n| `pytorch-training-loop` | Writing/reviewing a PyTorch loop — eval modes, AMP, checkpointing, devices. |\n| `model-evaluation` | Choosing metrics, validating, calibration, confusion-matrix analysis. |\n| `hyperparameter-tuning` | Optimizing params — random vs Optuna, leakage-safe CV, early stopping, budget. |\n\n| Skill | Use when… |\n|---|---|\n| `llm-finetuning` | Fine-tuning an LLM — full vs LoRA/QLoRA, data formatting, transformers/PEFT/TRL. |\n| `rag-pipeline` | Building RAG — chunking, embeddings, hybrid + reranking retrieval, eval. |\n\n| Skill | Use when… |\n|---|---|\n| `experiment-tracking` | Experiments need comparing/reproducing — MLflow/W&B, what to log, registry. |\n| `reproducible-ml` | A result must be reproducible — seeds, env pinning, data versioning, CUDA determinism. |\n| `ml-debugging` | A model won't learn, loss is NaN, or metrics look too good — a diagnosis decision tree. |\n| `model-serving` | Deploying behind an API — FastAPI, safe artifact loading, batching, ONNX, monitoring. 
|\n\n```\nnpx agent-ml-skills <command> [options]\n\nCommands\n  list                       List available skills\n  install                    Install skills into an agent\n\nOptions\n  -t, --target <name>        codex | claude | opencode | cursor | all\n      --scope <scope>        global (default) | project\n      --skills <a,b,c>       comma-separated subset (default: all)\n      --dir <path>           install into a custom directory (overrides target)\n  -f, --force                overwrite existing skills\n  -h, --help                 show this help\n```\n\n**Examples**\n\n```\n# Just the LLM skills, into the current project\nnpx agent-ml-skills install --target claude --skills rag-pipeline,llm-finetuning --scope project\n\n# Into a custom agent directory\nnpx agent-ml-skills install --dir ./my-agent/skills\n\n# Re-install and overwrite\nnpx agent-ml-skills install --target codex --force\n```\n\n| Target | Global | Project |\n|---|---|---|\n| Codex | `~/.codex/skills` | `.codex/skills` |\n| Claude Code | `~/.claude/skills` | `.claude/skills` |\n| OpenCode | `~/.config/opencode/skills` | `.opencode/skills` |\n| Cursor | — | `.cursor/rules` (flat `.md` rules) |\n\n- **Leakage-safe by default.** Every data skill fits transforms on train only.\n- **Concrete over abstract.** Real code patterns, not vague advice.\n- **Pitfalls included.** Each skill ends with the mistakes agents actually make.\n- **Composable.** Skills hand off to each other (EDA → cleaning → features → pipeline → eval → serving).\n- **Zero-dependency installer.** Pure Node, nothing to install, nothing to trust.\n\nNew skills and improvements are very welcome — see **CONTRIBUTING.md**. Every skill is validated in CI:\n\n```\nnode scripts/validate-skills.mjs\n```\n\nOpen a [skill request](https://github.com/param087/agent-ml-skills/issues/new?template=skill_request.md) if there's an ML workflow you want your agent to master.\n\nBuilt by **Param Bhavsar** — Google Summer of Code '19 @ TensorFlow, ex-HSBC. 
If this saves you a debugging session, a ⭐ helps others find it.", "url": "https://wpnews.pro/news/agent-ml-skills-teach-codex-claude-cursor-to-stop-making-ml-mistakes", "canonical_source": "https://github.com/param087/agent-ml-skills", "published_at": "2026-06-06 02:21:42+00:00", "updated_at": "2026-06-06 03:17:49.430460+00:00", "lang": "en", "topics": ["ai-agents", "machine-learning", "mlops", "ai-tools", "ai-products"], "entities": ["Codex", "Claude Code", "Cursor", "OpenCode", "Agent-ML-skills"], "alternates": {"html": "https://wpnews.pro/news/agent-ml-skills-teach-codex-claude-cursor-to-stop-making-ml-mistakes", "markdown": "https://wpnews.pro/news/agent-ml-skills-teach-codex-claude-cursor-to-stop-making-ml-mistakes.md", "text": "https://wpnews.pro/news/agent-ml-skills-teach-codex-claude-cursor-to-stop-making-ml-mistakes.txt", "jsonld": "https://wpnews.pro/news/agent-ml-skills-teach-codex-claude-cursor-to-stop-making-ml-mistakes.jsonld"}}