Securing AI Agents: Inside NVIDIA's SkillSpector Scanner

NVIDIA has open-sourced SkillSpector, a security scanner for AI agent skills, to address threats such as prompt injection and tool poisoning. An analysis of 42,447 skills found that 26.1% contained at least one vulnerability and 5.2% showed likely malicious intent. SkillSpector applies capability governance to validate skills before installation, detecting threats that traditional scanners miss.

NVIDIA's open-source tool introduces capability governance to protect agentic workflows from prompt injection, tool poisoning, and malicious skills.

Ji-ho Choi https://www.devclubhouse.com/u/jiho choi

As autonomous AI agents transition from simple chat interfaces to active execution environments, they rely increasingly on portable instruction sets, commonly referred to as "skills." Frameworks like Claude Code, Codex CLI, Gemini CLI, and Cursor allow developers to extend agent capabilities dynamically using the agentskills.io open specification (https://agentskills.io). However, this extensibility has introduced a severe, structural security vulnerability: these skills execute with implicit trust and minimal vetting.

Recent security research highlights the scale of this threat. An analysis of 42,447 AI agent skills from major marketplaces revealed that 26.1% contained at least one vulnerability, and 5.2% exhibited likely malicious intent. These are not theoretical concerns; malicious skills can harvest API keys, exfiltrate conversation context to external servers, execute arbitrary code, persist across sessions via cron jobs, and poison an agent's memory to manipulate future behavior.

To address this, NVIDIA has open-sourced SkillSpector (https://github.com/NVIDIA/SkillSpector), a security scanner designed specifically for AI agent skills. By shifting the security paradigm from reactive runtime guardrails to proactive "capability governance," SkillSpector attempts to validate skills before they are ever installed. However, real-world registry data reveals that securing the agentic stack is far more complex than running a traditional vulnerability scan.

The Blind Spots of Traditional Security

Traditional security tools are fundamentally blind to agentic risk.
A standard static analysis tool looks for dangerous system calls or vulnerable package versions, while malware scanners like VirusTotal look for known file signatures. Neither is equipped to detect semantic risks, such as an agent skill that claims to summarize local logs but secretly bundles a prompt instructing the LLM to exfiltrate those logs.

This gap is quantified in a public dataset released by the registry OpenClaw (https://openclaw.ai): OpenClaw/clawhub-security-signals on Hugging Face. Across 67,453 public skill versions evaluated by three independent scanners (VirusTotal, traditional static analysis, and SkillSpector), the overlap in detections was remarkably low.

Figure: Scanner positive rates on the ClawHub dataset (67,453 skills): SkillSpector 48.71%, VirusTotal 7.75%, static analysis 6.57%.

The Jaccard agreement between the scanners highlights this divergence:

- VirusTotal and SkillSpector: 0.094 agreement (3,286 mutual positives)
- Static Analysis and SkillSpector: 0.104 agreement (3,511 mutual positives)
- Static Analysis and VirusTotal: 0.065 agreement (586 mutual positives)

Only 468 skills (0.69%) were flagged by all three scanners simultaneously, and 81.9% of all positive findings came from a single scanner alone. While VirusTotal caught 72.8% of the 206 confirmed malicious skills, SkillSpector flagged 75.3% of the 25,504 "suspicious" skills, such as those with overbroad permissions or hidden instructions. This demonstrates that agentic risk represents an entirely different attack surface that requires specialized semantic evaluation.

Inside the SkillSpector Architecture

SkillSpector evaluates skills using a two-stage analysis pipeline that covers 64 vulnerability patterns across 16 categories, including prompt injection, data exfiltration, privilege escalation, supply chain vulnerabilities, excessive agency, memory poisoning, and Model Context Protocol (MCP) tool poisoning.

1. Fast Static Analysis

Static analysis is fast, deterministic, and runs locally. It parses the skill's code using Abstract Syntax Trees (AST), runs YARA signatures, and performs taint tracking to identify dangerous code patterns and suspicious strings. It also queries OSV.dev (https://osv.dev) for real-time CVE lookups to catch vulnerable dependencies, falling back to an offline database if network access is unavailable.

2. LLM-Assisted Semantic Analysis

Static analysis cannot easily detect intent. For issues like description-behavior mismatches, vague triggers, or hidden instructions, SkillSpector uses an optional semantic analysis stage. This stage sends the skill's documentation (such as its SKILL.md file) and its actual code to an LLM. The model acts as a judge, verifying whether the skill's declared purpose matches its actual behavior and requested permissions.

Developer Workflow and CI/CD Integration

For developers building or consuming AI agent skills, SkillSpector can be integrated directly into local development workflows or CI/CD pipelines. It supports scanning local directories, single files, ZIP archives, and remote Git repositories.

Installation

SkillSpector can be installed using uv or standard pip. It requires Python 3.12:

    # Clone and navigate to the repository
    git clone https://github.com/NVIDIA/SkillSpector.git
    cd SkillSpector

    # Create and activate a virtual environment
    uv venv .venv && source .venv/bin/activate

    # Install production dependencies
    make install

Alternatively, it can be run via Docker to avoid local Python dependency conflicts:

    docker build -t skillspector .
    docker run --rm -v "$PWD:/scan" skillspector scan ./my-skill/ --no-llm

Running Scans and Configuring LLMs

To run a basic static scan and output the results in SARIF format for GitHub Actions or IDE integration:

    skillspector scan ./my-skill/ --no-llm --format sarif --output report.sarif

To enable semantic analysis, developers must configure an LLM provider.
SkillSpector supports OpenAI, Anthropic, NVIDIA's inference gateway, and local OpenAI-compatible servers like Ollama or vLLM:

    export SKILLSPECTOR_PROVIDER=anthropic
    export ANTHROPIC_API_KEY=sk-ant-...
    skillspector scan ./my-skill/

Triage Policy

When integrating SkillSpector into a release pipeline, developers should establish clear triage rules based on the scanner's risk scoring (0-100):

- Critical/High Severity: Block the release. This includes active prompt injections, unauthorized data exfiltration paths, or critical CVEs.
- Hidden Instructions / Tool Poisoning: Block and require the removal of any undocumented prompts or unexpected tool definitions.
- Underdeclared Capabilities: If a skill requests access to tools or files not documented in its metadata, the build should fail until the permissions are aligned.
- Description-Behavior Mismatch: If the semantic LLM flags a mismatch, developers must rewrite the description or refactor the code to ensure transparency.

The Path to Capability Governance

Scanning is only the first step in securing the agentic ecosystem. NVIDIA is pairing SkillSpector with a broader capability governance framework that introduces Skill Cards and Cryptographic Signatures.

When a skill is verified, it is issued a machine-readable Skill Card. This card acts as a trust record, detailing the skill's ownership, dependencies, technical limitations, and verification status. To prevent tampering after verification, the skill is cryptographically signed with a detached signature (skill.oms.sig). This allows agent runtimes to verify the integrity and provenance of a skill before execution.

Registries like OpenClaw are already implementing this multi-layered approach. In their ClawScan pipeline, a publishing event triggers three parallel checks: traditional static analysis, VirusTotal malware scanning, and SkillSpector agentic risk scanning. An LLM-as-judge then synthesizes these signals to generate a final Skill Card and trust verdict.
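To make the tamper-detection idea concrete, here is a minimal sketch of an integrity check over a skill directory. It is not SkillSpector's signing scheme: the format of skill.oms.sig is not described in this article, so the sketch swaps in a plain SHA-256 content digest, which can detect modification but, unlike a signature, proves nothing about who published the skill. The function names `skill_digest` and `is_unmodified` are hypothetical.

```python
import hashlib
from pathlib import Path


def skill_digest(skill_dir: str) -> str:
    """Return a SHA-256 digest covering every file in a skill directory.

    Illustration only: a plain hash, not the detached signature format
    mentioned in the article. Files are visited in sorted order and each
    relative path is mixed into the hash, so adding, removing, renaming,
    or editing a file changes the result.
    """
    root = Path(skill_dir)
    hasher = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix().encode("utf-8")
        data = path.read_bytes()
        # Length-prefix each field so adjacent fields cannot run together.
        for field in (rel, data):
            hasher.update(len(field).to_bytes(8, "big"))
            hasher.update(field)
    return hasher.hexdigest()


def is_unmodified(skill_dir: str, recorded_digest: str) -> bool:
    """Compare the current digest with one recorded at verification time."""
    return skill_digest(skill_dir) == recorded_digest
```

A runtime following this pattern would record the digest when the skill passes review and refuse to load the skill if `is_unmodified` later returns False. A real deployment would verify a signature over that digest so that provenance is checked as well as integrity.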
The Editorial Verdict

SkillSpector is a much-needed tool for an ecosystem that has scaled far faster than its security practices. However, developers must understand its limitations before making it a hard gate in their CI/CD pipelines.

Because SkillSpector flagged nearly 49% of the skills in the OpenClaw dataset as positive for some level of risk, using it as a binary pass/fail gate without tuning its thresholds will produce an overwhelming number of false positives. Many of these flags represent "underdeclared capabilities" or minor policy deviations rather than active malware.

For production enterprise environments, the correct approach is to use SkillSpector's static analysis as a hard gate for known vulnerabilities (CVEs and dangerous AST patterns), while routing its semantic LLM findings to a human-in-the-loop triage queue or a secondary LLM-as-judge pipeline. Treating agent skills as active, deployable capabilities rather than static text is the only way to safely scale autonomous agent workflows.
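The split recommended above, a hard gate for severe static findings and a review queue for everything else, could be sketched roughly as follows. The `Finding` record, its field names, and the severity labels are assumptions made for illustration; SkillSpector's actual report schema may look quite different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record, not SkillSpector's real output schema."""
    rule: str
    stage: str      # "static" or "semantic" (assumed labels)
    severity: str   # "low", "medium", "high", or "critical" (assumed labels)


# Severities that fail the build when reported by the static stage.
BLOCKING_SEVERITIES = {"high", "critical"}


def route(findings: list[Finding]) -> tuple[list[Finding], list[Finding]]:
    """Split findings into (hard-gate blockers, human-review queue)."""
    blockers: list[Finding] = []
    review: list[Finding] = []
    for finding in findings:
        if finding.stage == "static" and finding.severity in BLOCKING_SEVERITIES:
            blockers.append(finding)
        else:
            # Semantic findings and low-severity static findings go to a
            # person (or a secondary judge) instead of failing the build.
            review.append(finding)
    return blockers, review
```

In a CI job, a non-empty `blockers` list would fail the pipeline, while the `review` list would be posted to whatever triage queue the team already uses. Keeping the two paths separate means the noisier semantic stage cannot stall releases on its own.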
Sources & further reading

- NVIDIA/SkillSpector: https://github.com/NVIDIA/SkillSpector
- NVIDIA-Verified Agent Skills Provide Capability Governance for AI Agents | NVIDIA Technical Blog: https://developer.nvidia.com/blog/nvidia-verified-agent-skills-provide-capability-governance-for-ai-agents/
- Scan Agent Skills Before Installation | NVIDIA Skill Documentation: https://docs.nvidia.com/skills/scanning-agent-skills
- Nvidia SkillSpector · AI Automation Society: https://www.skool.com/ai-automation-society/nvidia-skillspector
- OpenClaw Collaborates with NVIDIA for Stronger Agent Skill Security - OpenClaw Blog: https://openclaw.ai/blog/openclaw-nvidia-skill-security

Ji-ho Choi · Security & Cloud Editor

Ji-ho covers the increasingly tangled overlap between cloud architecture and security, drawing on a background as a penetration tester to keep his reporting grounded in real-world attack paths. He never lets a vendor claim go unquestioned and insists that every buzzword come with a proof of concept.