{"slug": "i-asked-claude-to-map-my-infrastructure-then-i-asked-a-purpose-built-tool", "title": "I Asked Claude to Map My Infrastructure. Then I Asked a Purpose-Built Tool.", "summary": "A developer compared a general-purpose LLM (Claude) with a purpose-built AI SRE tool (Knox) for infrastructure discovery on a small stack of Linux VMs and a Kubernetes cluster. Claude produced a flat markdown report lacking topology visualization, business grouping, risk assessment, and persistence, while Knox delivered a structured, verified model with specialized agents and quality gates. The developer concluded that general AI excels at reasoning and explanation, but specialized AI is needed for building structured, persistent, and verifiable models.", "body_md": "I manage a small stack. Three Linux VMs, one Kubernetes cluster, maybe 20-something services total. Not big. But underdocumented — the kind of environment where you SSH in and discover things you forgot were running.\n\nLast week I ran the same task through two different AI tools: \"tell me what's running, how it connects, and what looks risky.\" One is a general-purpose LLM (Claude). The other is a purpose-built AI SRE tool. Same environment, same ask. The results were... instructive.\n\nSimple brief: infrastructure discovery. I want a full picture — services, dependencies, topology, risks. The kind of thing a new hire would spend their first week piecing together from wikis that haven't been updated since 2023.\n\nMy prompt:\n\n\"I manage a small infrastructure — 3 Linux VMs (172.30.0.41, 172.30.0.42, 172.30.0.43) and a Kubernetes cluster. SSH access is already configured. Help me understand what's running across this environment — I want a full picture of my services, dependencies, and topology.\"\n\nI'm running Claude Code locally with the Opus model — their flagship tier. Claude didn't ask questions. It just started SSH-ing in.\n\nFive minutes later it handed me a report. And honestly? 
It was better than I expected.\n\nWhat Claude delivered:\n\nFive minutes. No hand-holding. For a \"quick, what's running here?\" sweep, this is genuinely useful.\n\nHere's what I noticed after the initial \"wow, that was fast\" wore off.\n\nThe output is a wall of markdown. Accurate, mostly. But flat. Everything has the same weight — a critical single-point-of-failure sits next to a cosmetic naming inconsistency. No severity. No priority.\n\nMore specifically:\n\n**No topology visualization.** I got an ASCII diagram. It's readable for 6 machines. At 60 machines, it's unreadable. At 600, impossible.\n\n**No business grouping.** Claude listed every service but couldn't tell me which ones form the e-commerce flow vs. the logistics flow vs. the platform layer. That requires domain context it doesn't have.\n\n**No risk assessment.** Four issues found, but no severity classification. The dead Redis replica and the cosmetic knoxd naming thing are presented with equal weight.\n\n**No quality gate.** Nobody verified whether Claude's topology was actually correct. It connected things confidently — but was the canary weight really 90/10? I'd need to go check.\n\n**No persistence.** Close the chat window. The report is gone. Tomorrow I'd run it again and get a slightly different exploration path, slightly different findings.\n\n**No depth control.** I can't say \"that Business Island looks risky, go deeper on it.\" It's all-or-nothing.\n\nThis maps to a pattern I keep seeing across industries. In legal tech, people noticed the same thing — general LLMs are good at summarizing contracts but can't do precision clause verification. In finance, ChatGPT can describe how to post a journal entry but can't actually post one. The dividing line is consistent: general AI is a thinking tool; specialized AI is an acting tool.\n\nWhen the task is \"reason about this data and explain it to me\" — general tools are great. 
When the task shifts to \"build a structured, persistent, verifiable model of my environment\" — you've crossed into territory they weren't designed for.\n\nFor comparison, here's what happens when I send one line to Knox (our purpose-built AI SRE tool — yes, this is our product, stating that upfront):\n\n\"Run a full infrastructure discovery on our production environment.\"\n\nShorter prompt. No need to explain the environment — it already has connectors configured.\n\nTwenty minutes later:\n\nThe differences that matter:\n\nHow it got there — a team of agents, not a single model:\n\nThis is the work process, not a deliverable. Multiple specialized agents collaborated — one coordinated the task, one did the actual discovery, one quality-checked the findings — flagging 9 uncertain items for human review instead of presenting everything with equal confidence.\n\nWe ran this on 5-6 machines. The gap is already visible. But this is the minimum-gap scenario.\n\nAt 60 servers across multiple environments, Claude's context window would fill up. You'd need multiple sessions and manual stitching, and the \"flat markdown\" problem would become unbearable. The gap doesn't grow linearly — it compounds.\n\nThat's not a knock on Claude. A Swiss Army knife is great. But when you need surgery, you reach for a scalpel.\n\n*What does your environment look like? 
At what scale did you find general AI tools hitting their ceiling for ops work?*\n\nIf you want to try the purpose-built approach: [knoxops.app](https://knoxops.app/?invite_token=DEVTO26)", "url": "https://wpnews.pro/news/i-asked-claude-to-map-my-infrastructure-then-i-asked-a-purpose-built-tool", "canonical_source": "https://dev.to/paul_knoxops/i-asked-claude-to-map-my-infrastructure-then-i-asked-a-purpose-built-tool-51jp", "published_at": "2026-06-15 07:13:09+00:00", "updated_at": "2026-06-15 07:40:32.544473+00:00", "lang": "en", "topics": ["artificial-intelligence", "large-language-models", "ai-tools", "ai-agents", "developer-tools"], "entities": ["Claude", "Knox", "Opus", "Linux", "Kubernetes"], "alternates": {"html": "https://wpnews.pro/news/i-asked-claude-to-map-my-infrastructure-then-i-asked-a-purpose-built-tool", "markdown": "https://wpnews.pro/news/i-asked-claude-to-map-my-infrastructure-then-i-asked-a-purpose-built-tool.md", "text": "https://wpnews.pro/news/i-asked-claude-to-map-my-infrastructure-then-i-asked-a-purpose-built-tool.txt", "jsonld": "https://wpnews.pro/news/i-asked-claude-to-map-my-infrastructure-then-i-asked-a-purpose-built-tool.jsonld"}}