Cloud Skills Are Still Just Skills

Anthropic released two new premium Claude Code skills, `/ultraplan` and `/ultrareview`, that generate multi-phase implementation plans and merge-readiness assessments by processing code through a closed, cloud-based pipeline. A developer who tested the skills found their core review quality comparable to existing open-source pipelines, with the main improvement being a verification step that filters false positives before presenting findings. Unlike previous Claude Code skills shipped as readable markdown, the new "ultra" versions hide their prompts and pipeline logic, preventing users from learning from or adapting the methodology.

I ran `/ultraplan` and `/ultrareview` against my own skills for a week. They’re good. They have a verification loop I hadn’t built yet. But they’re the same kind of pipeline I’ve been running locally for months, wrapped in a remote execution layer you can’t read, billed after three free runs.

What They Do

`/ultraplan` generates a multi-phase implementation plan with verification checkpoints. `/ultrareview` runs a code review pipeline and produces a merge-readiness assessment. Both run on Anthropic’s cloud. Your code gets sent up, processed through stages you can’t inspect, and results come back. I’ve seen reports of $5-$20 per run depending on codebase size.

You can’t read the prompts. You can’t see the pipeline. Every previous Claude Code skill shipped as readable markdown in the install directory. `/review`, `/plan`, `/simplify`. You could open them, learn from them, fork them. Not these.

My A/B Test

I ran `/ultrareview` against my own pr-review pipeline (https://vexjoy.com/posts/pipelines-not-prompts/) on the same PRs. Same codebase, same changes.

Core review quality was comparable. Both catch the same class of issues. Security findings, logic errors, style violations. My pipeline dispatches three parallel reviewers and merges their findings. The ultra version produces similar coverage.

Where `/ultrareview` added something was at the end. It runs a verification step that checks whether review findings hold up against the actual code, then produces a merge-readiness verdict. My pipeline didn’t do that.

That verification step catches false positives, the kind I wrote about in my piece on adaptive thinking variance (https://vexjoy.com/posts/adaptive-thinking-variance/), where a CRITICAL finding sends you down a rabbit hole investigating something that isn’t a real problem. Re-checking findings against the code before showing them to the developer is a real improvement.

So I built it.
A verification agent that takes review output, re-reads the relevant code sections, filters out findings that don’t hold up, and scores merge-readiness by what remains. Took a day. Nothing novel about it, just the step you eventually add after watching false positives waste enough of your time.

The Tension

Anthropic’s own engineers have been saying the right things. Barry Zhang and Mahesh Murag, both on the Claude Code team, have publicly said variations of “Stop building agents. Build Skills.” Skills are open. Inspectable. Composable. Build them and share them.

My entire toolkit (https://github.com/notque/vexjoy-agent) is built on that premise. Skills as methodology, agents as domain knowledge, the handyman pulling the right tool from the toolbox (https://vexjoy.com/posts/the-handyman-principle-why-your-ai-forgets-everything/). It works because you can read a skill, understand what it does, and adapt it.

`/ultraplan` and `/ultrareview` are the same kind of pipeline I run on my laptop. Phases that produce artifacts, verification steps, parallel agents. Nothing about cloud execution gives them capabilities local execution can’t. I run 10 parallel review agents during full codebase reviews (https://vexjoy.com/posts/pipelines-not-prompts/). Worktrees handle isolation. Headless sessions handle parallelism.

I learned more about code review methodology by reading Claude Code’s `/review` skill than from most blog posts about the topic. The prompts showed what the tool prioritized, what it checked first, how it structured findings. That made me a better skill author. `/ultrareview` produces output but doesn’t teach you how to build a review pipeline.

The Pattern

I’ve seen this before. You build an open ecosystem. People invest in it. They build skills, share patterns, develop expertise. Then you ship premium features that use the same architecture but close the implementation.

Review and planning aren’t the only ones.
Anthropic also launched Claude Security (https://claude.com/product/claude-security), a cloud-hosted vulnerability scanner that traces data flows, validates findings through adversarial verification, and suggests patches. Same approach. Runs remotely, no visibility into the prompts or pipeline. Currently in beta for Enterprise with Team and Max access coming.

A security review pipeline is something you can build with the skill architecture that already exists. Scan, verify, patch. Three agents with phase gates between them.

I’m not saying Anthropic promised to keep everything open. And these features aren’t bad. But shipping them as opaque cloud services when they could have been open local skills is a choice. The message used to be “here’s how we build skills, now you build them too.” Now it also includes “here are skills we built that you can’t see, and they cost extra.”

What It Means

I can take a review skill, pair it with a Go agent, wrap it in a pipeline that saves artifacts at phase boundaries. I can inspect every piece. When something breaks, I can diagnose it because I can read the prompts. You can’t compose what you can’t read. And you can’t diagnose what you can’t inspect.

If Anthropic ships more features this way, the ecosystem splits. Open skills you build on. Closed skills you pay for. The closed ones will probably be better out of the box because Anthropic has more resources. The open ones will be more adaptable because you can modify them.

That split favors people who already know how to build skills. For everyone else, the premium tier becomes the default because the alternative requires expertise the closed skills no longer help you develop.

I recreated the verification step and it sits in my toolkit where I can see it, modify it, and compose it with everything else. But I’ve got months of skill-building experience behind that.
The shift from open to opaque makes it harder for new people to build that experience by studying how the built-in skills work.

These are prompt pipelines producing artifacts through phased methodology. That’s what skills are. The question is whether Anthropic ships new capabilities as open skills people can learn from, or as closed services people pay for. The last month points in one direction.

Maybe they’ll open these up eventually, or ship open alternatives alongside premium versions. But we’ll see.
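
For anyone who wants a starting point, the verification step described earlier (re-read the cited code, drop findings that don't hold up, score merge-readiness by what remains) can be sketched in a few lines. This is an illustration under stated assumptions, not the implementation from the toolkit and not what `/ultrareview` does internally. The `Finding` fields, the severity weights, the verdict names, and the `block_at` threshold are all hypothetical. The `evidence_present` check is a deliberately naive stand-in: a real verifier would more likely hand each finding and the surrounding code back to a model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical severity weights; not taken from any real tool.
SEVERITY_WEIGHT = {"CRITICAL": 10, "MAJOR": 4, "MINOR": 1}


@dataclass(frozen=True)
class Finding:
    path: str      # file the reviewer flagged
    line: int      # 1-based line number the reviewer cited
    severity: str  # expected to be a key of SEVERITY_WEIGHT
    evidence: str  # code the reviewer claims is at that location
    claim: str     # human-readable description of the issue


def evidence_present(finding: Finding, sources: Dict[str, str], window: int = 3) -> bool:
    """Naive re-check: does the quoted code really appear near the cited line?"""
    text = sources.get(finding.path)
    quoted = finding.evidence.strip()
    if text is None or not quoted:
        return False  # unknown file or no evidence: the finding does not hold up
    lines = text.splitlines()
    start = max(0, finding.line - 1 - window)
    end = min(len(lines), finding.line + window)
    return quoted in "\n".join(lines[start:end])


def verify(
    findings: List[Finding],
    sources: Dict[str, str],
    check: Callable[[Finding, Dict[str, str]], bool] = evidence_present,
) -> Tuple[List[Finding], List[Finding]]:
    """Split findings into (kept, dropped) by re-checking each against the code."""
    kept: List[Finding] = []
    dropped: List[Finding] = []
    for finding in findings:
        (kept if check(finding, sources) else dropped).append(finding)
    return kept, dropped


def merge_readiness(kept: List[Finding], block_at: int = 10) -> Tuple[str, int]:
    """Score what survived verification and map the score to a verdict."""
    score = sum(SEVERITY_WEIGHT.get(f.severity, 1) for f in kept)
    if score >= block_at:
        return "BLOCK", score
    if score > 0:
        return "FIX_FIRST", score
    return "READY", score
```

Passing `check` as a parameter keeps the filter-then-score structure separate from how a finding is actually re-checked, so the naive substring test can be swapped for a model call without touching the rest.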