{"slug": "hardening-api-scan-boundaries-in-skill-scanner-with-sqry-as-the-review-map", "title": "Hardening API Scan Boundaries in skill-scanner, with sqry as the Review Map", "summary": "A developer hardened the REST API boundaries in Cisco's skill-scanner repository, a tool for scanning Agent Skill packages, by adding authentication, rate limiting, and input validation. The hardening branch, codex/harden-api-scan-boundaries, modified 24 files with 1186 insertions and 210 deletions, focusing on shared modules like archive_limits.py and fs_limits.py to prevent bugs in API, CLI, and other paths. The developer used sqry, a semantic code query tool, to analyze 20,445 symbols across 202 files and identify cross-cutting security concerns.", "body_md": "On 14 June 2026 I cloned [cisco-ai-defense/skill-scanner](https://github.com/cisco-ai-defense/skill-scanner), set up the locked `uv` environment, and worked through one small but important question: what does it take to make the REST API safer when the API can scan local directories, accept uploaded ZIP files, run optional analyzers, and queue batch work in the background?\n\nI am not pretending this is a universal API security methodology, or that one branch makes a whole product \"secure\" in the abstract. This is a narrower story, and I think the narrowness is the useful part: a concrete pass over one public Python repository, with a hardening branch called `codex/harden-api-scan-boundaries`, ending in commit `2cfa313` and draft [PR #119](https://github.com/cisco-ai-defense/skill-scanner/pull/119), where the evidence was code, tests, docs, and a graph of the repository rather than a confident read of the obvious files.\n\nThe branch changed 24 files, with `1186 insertions` and `210 deletions`. 
The main implementation files were `skill_scanner/api/router.py`, `skill_scanner/core/analyzer_factory.py`, `skill_scanner/core/extractors/content_extractor.py`, `skill_scanner/core/loader.py`, and `skill_scanner/core/scanner.py`, plus two new shared modules: `skill_scanner/core/archive_limits.py` and `skill_scanner/core/fs_limits.py`.\n\n`skill-scanner` scans Agent Skill packages. It has CLI paths, Python library paths, eval paths, pre-commit hook paths, and a FastAPI router that exposes endpoints for direct skill scans, uploaded ZIP scans, batch scans, batch-result polling, health checks, and analyzer listing.\n\nThat matters because the REST API does not sit in front of a simple database lookup. It sits in front of local filesystem access, archive extraction, analyzer construction, optional remote-service analyzers such as VirusTotal and Cisco AI Defense, LLM-backed analysis, scanner traversal, loader discovery, and report generation. A bug in one visible route handler can be obvious. A missing bound in a shared loader, reached through API, CLI, evals, tests, and scanner methods, is much easier to miss.\n\nThe first setup step was boring and necessary:\n\n```\nuv sync --frozen --all-extras --dev\n```\n\nThat gave the API dependencies, analyzer extras, pytest, lint tooling, and the project commands needed to move from reading code to running it. The repository also had clear contribution constraints in `CONTRIBUTING.md`: include tests for changed behaviour, update docs where behaviour or configuration changes, use a conventional commit, keep the `uv.lock` model intact, and verify with the repository's normal commands.\n\nThe hardening target became four broad risk classes:\n\nThere is also the basic API boundary: scan work and scan-result retrieval now require `X-API-Key`, and the expensive endpoints have process-local rate limiting. 
Root, health, and analyzer-listing endpoints remain informational.\n\nThe tool that changed the review was [sqry](https://github.com/verivus-oss/sqry), version `20.0.5`. sqry uses \"semantic\" in the compiler sense: it parses code into ASTs, builds a graph of symbols and relationships, and answers structural questions from that graph. It is not an embedding search tool, and it is not just grep with better ranking.\n\nThe local index for this repository had `20,445` symbols across `202` files, with relation support enabled. The graph manifest recorded `26,120` edges across `200` Python files, one Ruby file, and one shell file. That is the practical reason it helped here: the API hardening problem crossed API request models, FastAPI handlers, shared scan implementation, analyzer construction, scanner traversal, loader discovery, archive extraction, documentation, and tests.\n\nThe first useful query was not clever:\n\n```\nsqry query 'path:skill_scanner/api/router.py AND kind:function'\n```\n\nIt returned `98` function symbols from `skill_scanner/api/router.py` in about `35 ms` on this checkout. More importantly, it produced a checklist that included `scan_skill`, `_scan_skill_impl`, `scan_uploaded_skill`, `scan_batch`, `get_batch_scan_result`, `run_batch_scan`, `_validate_path`, `_count_batch_candidates`, and `_build_analyzers`.\n\nThat sounds mundane until you compare it with a manual route read. A manual read tends to start from decorators and then follow the code that looks important. sqry gave me the public route handlers and the helpers in one structural inventory, before I had decided which parts mattered.\n\nThe scanner side was the same:\n\n```\nsqry query 'path:skill_scanner/core/scanner.py AND kind:function'\n```\n\nThat returned `76` function symbols in about `31 ms`, including `SkillScanner.scan_skill`, `SkillScanner.scan_directory`, and `_find_skill_directories`. 
The useful distinction was between single-skill scanning, directory discovery, and module-level convenience functions. For a hardening pass, that distinction is load-bearing.\n\nThen the review shifted from \"where is this string?\" to \"what code can reach this behaviour?\"\n\n```\nsqry graph direct-callers _validate_path --json\n```\n\nsqry reported four direct callers: `_resolve_policy`, `_scan_skill_impl`, `scan_batch`, and `run_batch_scan`. That made the path gate concrete. It was not enough to harden the direct `/scan` path. The same gate needed to cover policy paths, direct skill paths, batch roots before queuing, and batch execution inside the background task.\n\nThe loader trace was the bigger warning:\n\n```\nsqry graph direct-callers 'SkillLoader.load_skill' --json\n```\n\nThat returned `92` direct callers across evals, API code, CLI code, scanner code, and tests. This is where plain text search is weak. You can find `load_skill` text matches, but you still have to reason manually about which are method calls, convenience wrappers, test helpers, and shared execution paths. sqry made the broad shared surface visible, which is why the fix did not stop at the API router. The loader itself needed a bounded contract.\n\nThe same pattern showed up in analyzer construction. `build_analyzers` had `11` direct callers across API, CLI, hooks, evals, and tests. That meant `llm_consensus_runs` needed two checks: request-model validation at the API edge, and a second cap inside the analyzer factory so non-API callers get the same invariant.\n\nFor `LLMAnalyzer._consensus_analyze`, sqry reported one direct caller, `LLMAnalyzer.analyze_async`, which kept the execution-side analysis focussed. The cap belongs before construction reaches the analyzer loop.\n\nPlain `rg` still had a place for exact strings, route decorators, docs, and final sanity checks. 
The difference is that sqry gave the graph-backed layer: functions and methods instead of arbitrary text, same-name symbols separated across API, CLI, hooks, evals and tests, and caller/callee traces for security-sensitive helpers.\n\nThe API path boundary now fails closed. `_validate_path` rejects null bytes, resolves the supplied path, and denies access unless `SKILL_SCANNER_ALLOWED_ROOTS` is configured and the resolved path is inside one of those roots. If no roots are configured, API filesystem access is denied.\n\nThat is a deliberate posture. An API that scans local paths should not assume that \"current working directory\" is a sensible trust boundary, and it should not silently accept arbitrary absolute paths because the caller knows them.\n\nThe upload path changed in a similarly blunt way. `/scan-upload` still checks the client-provided filename to require a `.zip` upload, but the server no longer uses that filename for the staging path. Uploaded bytes are written to:\n\n```\nzip_path = temp_dir / \"upload.zip\"\n```\n\nThat small line removes an entire class of filename-controlled staging behaviour. Around it, the upload flow now streams in `1 MB` chunks, enforces a `50 MB` upload limit, reads ZIP EOCD metadata before constructing `ZipFile`, rejects ZIPs over `500` entries, rejects uncompressed ZIP contents over `200 MB`, rejects path traversal entries by resolving each destination under the extraction root, rejects symlink entries, checks again after extraction that no symlink appeared on disk, and only then searches the extracted tree for `SKILL.md` using a bounded walk.\n\nThe EOCD preflight lives in `skill_scanner/core/archive_limits.py` as `read_zip_member_count`. It reads the ZIP end-of-central-directory metadata, including the ZIP64 case, before the code has to build a `ZipFile` object and iterate the archive. 
The same helper is used by the API upload handler and by `ContentExtractor`, so archive member-count limits are not two unrelated implementations that can drift.\n\nThe traversal helpers live in `skill_scanner/core/fs_limits.py`:\n\n```\niter_directory_bounded\nwalk_directory_bounded\n```\n\nBoth are based on `os.scandir`, and both count entries as they are yielded rather than first materialising a whole tree. They are now used by API batch preflight, scanner directory discovery, loader file discovery, lenient markdown synthesis, and uploaded-tree search. That is the kind of change that looks less exciting than a route patch, but it is exactly where the graph evidence mattered. If the loader has 92 direct callers, the loader cannot depend on the API being the only adult in the room.\n\nBatch scanning now validates the batch root, counts candidates before queueing background work, rejects requests over the configured candidate limit, and passes bounds into `SkillScanner.scan_directory`:\n\n```\nmax_candidates=MAX_BATCH_SKILLS\nmax_entries_visited=MAX_BATCH_PATHS_VISITED\n```\n\nThe default values in the API are `100` candidate skills and `10,000` filesystem entries. The scanner then passes loader bounds into `SkillLoader.load_skill`, which means the per-skill load step is part of the same bounded execution path rather than an unbounded second phase.\n\nThe analyzer boundary changed too. `llm_consensus_runs` is capped in the API request models with Pydantic, and again in `build_analyzers`. The API no longer exposes a remote-callable Cisco AI Defense URL override; the analyzer factory can still use operator-controlled arguments and environment configuration, including `AI_DEFENSE_API_URL`, but the public request model does not let a caller pick the remote endpoint for the server.\n\nFinally, scan endpoints now require `X-API-Key` backed by `SKILL_SCANNER_API_KEY`. 
`/scan`, `/scan-upload`, `/scan-batch`, and `/scan-batch/{scan_id}` all check it. The result cache for batch scans is also bounded: `1,000` entries, with a `3600` second TTL. The rate limiter is deliberately process-local, configurable through `SKILL_SCANNER_API_RATE_LIMIT_REQUESTS` and `SKILL_SCANNER_API_RATE_LIMIT_WINDOW_SECONDS`; that is useful for this server, but it is not a distributed quota system, and the docs should make that kind of caveat visible.\n\nThe branch did not stop at implementation. Tests were added or updated across:\n\n`tests/test_api_endpoints.py`\n\n`tests/test_api_deep.py`\n\n`tests/test_analyzer_factory.py`\n\n`tests/test_loader.py`\n\n`tests/test_scanner.py`\n\n`tests/test_extractors.py`\n\n`tests/test_cli_tui_api_fixes.py`\n\nThe focussed verification command was:\n\n```\nuv run pytest \\\n  tests/test_api_endpoints.py \\\n  tests/test_api_deep.py \\\n  tests/test_analyzer_factory.py \\\n  tests/test_loader.py \\\n  tests/test_scanner.py \\\n  tests/test_extractors.py \\\n  tests/test_cli_tui_api_fixes.py \\\n  -q\n```\n\nOn the current checkout, that collected `216` tests and returned `215 passed, 1 skipped` on Python `3.13.13`, with only third-party deprecation warnings. The process report also records a broader non-integration, non-LLM, non-e2e run at `1308 passed, 5 skipped, 7 deselected`, plus `ruff check .` and `git diff --check` during the contribution.\n\nThe documentation updates matter because this is not only a code contract. `.env.example`, API docs, operations docs, endpoint detail pages, and generated reference docs now describe `SKILL_SCANNER_API_KEY`, `SKILL_SCANNER_ALLOWED_ROOTS`, rate limits, traversal limits, archive limits, batch limits, and the LLM consensus cap. 
A security control that exists only in code is easier to bypass operationally than one that is named in the configuration surface people actually read.\n\nThe useful lesson here is not \"AI found security bugs\". That is too vague, and frankly not the interesting part.\n\nThe useful lesson is that AI-assisted review gets much better when the agent is forced to work from repository facts that can be rerun: symbol inventories, caller traces, callee traces, exact changed files, test names, and concrete verification commands. A model can read the most obvious route handler and sound convincing. A graph can show that the helper under discussion has four direct callers, or that a loader method has 92 direct callers, and that changes the review from opinion to coverage.\n\nThat is where sqry was valuable. It made the review faster, but the speed was not the main win. The main win was not having to trust a first-pass mental map of the codebase. The map was queryable, and when the map said the loader was shared across API, CLI, eval, scanner, and tests, the fix moved down into the loader. When the map said analyzer construction was shared, the consensus cap moved into the factory as well as the API request model.\n\nThis is also why I do not like abstract claims about \"secure by design\" unless the design names the boundary and the evidence. 
In this branch, the claims are more modest and more useful: API path access fails closed without configured roots; uploaded filenames no longer control staging paths; archive expansion has member, size, traversal, and symlink checks; batch discovery and scanner traversal have explicit limits; loader discovery has explicit limits; LLM consensus runs are capped at both the request and factory boundary; the focussed suite passes.\n\nThose are claims a maintainer can inspect.\n\nThe same pattern showed up while working through issues in [NVIDIA SkillSpector](https://github.com/NVIDIA/SkillSpector/issues): Stage 2 LLM batch failures, retry and concurrency behaviour, unanalyzed findings, ingest-layer bounds, and whitespace-padding detection all ended up being boundary questions. Different repository, different implementation, same shape of problem.\n\nThis is the part that feels important to me. AI-assisted development can help us ship faster, but faster shipping also means we can expose larger attack surfaces sooner: more API entry points, more archive and clone paths, more model calls, more background work, more places where a scanner accepts untrusted input. The answer is not to slow everything down by default; it is to make boundary review part of the shipping motion, with concrete limits, tests, and code-graph evidence before the surface gets too wide to reason about.\n\n`SKILL_SCANNER_ALLOWED_ROOTS` being absent means no API path access, not \"scan whatever path was supplied\".\n\nThanks for reading this far. I hope this is useful if you are hardening an API that wraps local filesystem work, archive extraction, or other expensive scanner-style behaviour. 
The bit I would reuse first is not any single line of code, it is the habit of asking the repository graph where the boundary actually runs before deciding where the fix belongs.", "url": "https://wpnews.pro/news/hardening-api-scan-boundaries-in-skill-scanner-with-sqry-as-the-review-map", "canonical_source": "https://dev.to/wernerk_au/hardening-api-scan-boundaries-in-skill-scanner-with-sqry-as-the-review-map-dip", "published_at": "2026-06-14 12:24:42+00:00", "updated_at": "2026-06-14 12:40:41.776895+00:00", "lang": "en", "topics": ["developer-tools", "ai-safety", "ai-infrastructure"], "entities": ["Cisco", "skill-scanner", "sqry", "FastAPI", "VirusTotal", "Cisco AI Defense"], "alternates": {"html": "https://wpnews.pro/news/hardening-api-scan-boundaries-in-skill-scanner-with-sqry-as-the-review-map", "markdown": "https://wpnews.pro/news/hardening-api-scan-boundaries-in-skill-scanner-with-sqry-as-the-review-map.md", "text": "https://wpnews.pro/news/hardening-api-scan-boundaries-in-skill-scanner-with-sqry-as-the-review-map.txt", "jsonld": "https://wpnews.pro/news/hardening-api-scan-boundaries-in-skill-scanner-with-sqry-as-the-review-map.jsonld"}}