{"slug": "i-built-a-free-audit-tool-that-runs-12-checks-in-parallel-against-any-domain-is", "title": "I built a free audit tool that runs 12 checks in parallel against any domain. Here is the architecture.", "summary": "The article describes the architecture of Canopy Guard, a free website audit tool built by the author that runs 12 parallel checks on any domain, combining SEO, AEO, and GEO visibility scoring with a security posture assessment. The backend uses a Node.js Express server on Railway with TypeScript, while the frontend is a React app on Vercel, executing all scan modules simultaneously via `Promise.all` to produce a single report in about 15 seconds. Each module checks specific aspects like DNS resolution, TLS, security headers, HTML structure, schema markup, and AI crawl risk, with results normalized into 0-1 scores for visibility and security.", "body_md": "I spent the past few months building Canopy Guard, a free website audit tool that combines SEO, AEO, and GEO visibility scoring with a full security posture check. One scan, one report, about 15 seconds.\nThis is the technical breakdown of how it works.\nThe problem\nI audit websites for clients as part of my regular work. Every engagement started with the same routine: run the site through an SEO checker, then a separate security header scanner, then manually check for structured data, then look at robots.txt. Four tools, four tabs, four different report formats, and none of them cross-referenced their findings.\nI wanted a single scan that checked everything and surfaced the gaps between visibility and security.\nArchitecture\nThe backend is a Node.js Express server written in TypeScript, deployed on Railway. The frontend is a React app on Vercel.\nWhen a user enters a domain, the frontend POSTs to /api/scan on the Railway backend. 
The backend runs 12 scan modules in parallel using Promise.all:\nconst [dns, tls, headers, htmlStructure, schema, qa, geo,\ncrawlRisk, endpoints, links, vulns, bizLogic] =\nawait Promise.all([\ncheckDNS(domain),\ncheckTLS(domain),\ncheckSecurityHeaders(domain),\ncheckHTMLStructure(domain),\ncheckSchemaMarkup(domain),\ncheckQADensity(domain),\ncheckGEO(domain),\ncheckAICrawlRisk(domain),\ncheckExposedEndpoints(domain),\ncheckInternalLinking(domain),\ncheckVulnerabilities(domain),\ncheckBusinessLogic(domain),\n]);\nEach module is an async function that fetches specific data from the target domain and returns structured results.\nThe scan modules\nDNS: Resolves the domain via Google's public DNS API (dns.google/resolve). Returns whether the domain resolves and the IP address.\nTLS: Checks HTTPS reachability, HSTS header presence and max-age value, and whether HTTP redirects to HTTPS.\nSecurity Headers: Checks for all six critical headers: Content-Security-Policy, Strict-Transport-Security, X-Frame-Options, X-Content-Type-Options, Referrer-Policy, and Permissions-Policy.\nHTML Structure: Fetches the full page HTML and parses it for H1 count, meta description presence and length, canonical URL match, and page title.\nSchema Markup: Extracts all JSON-LD script blocks, parses them, identifies FAQPage and Organization types, and flags structural errors like missing @context.\nQ&A Density: Strips HTML tags, splits into sentences, and calculates the ratio of question-pattern sentences to total sentences. 
This measures how \"answer engine ready\" the content is.\nGEO: Measures chunking efficiency (how well content divides into ~350-token blocks based on header/paragraph structure), citation precision (ratio of specific data points to generic text), and checks for llms.txt at the domain root.\nAI Crawl Risk: Fetches robots.txt, classifies the policy as PERMISSIVE/BALANCED/RESTRICTIVE/NONE, checks for AI-specific bot blocks (GPTBot, Anthropic, Google-Extended, CCBot, ByteSpider), and looks for crawl-delay directives.\nExposed Endpoints: This one was interesting to build. It probes 12 common sensitive paths (/.env, /.git/config, /graphql, etc.). The tricky part: sites with catch-all redirects return 200 for every path. So the module first fetches a guaranteed-nonsense path to detect catch-all behavior. If detected, it compares each probe's response body length and content-type against the catch-all fingerprint to filter out false positives.\nInternal Linking: Counts unique internal links on the homepage and samples a few to estimate link depth.\nVulnerabilities: Checks server headers for version disclosure and outdated software signatures.\nBusiness Logic: Checks for author/publisher attribution markup and cross-references sitemap URLs against homepage links to find orphaned pages.\nScoring\nEach module feeds into a scoring function that normalizes results to 0-1:\nconst seo_score = scoreSEO(htmlStructure, links);\nconst aeo_score = scoreAEO(schema, qa);\nconst geo_score = scoreGEO(geo);\nconst security_posture_score = scoreSecurity(\ntls, headers, crawlRisk, endpoints, vulns\n);\nThe scoring weights are calibrated based on what actually impacts discoverability and security posture. For example, in SEO scoring, crawlability gets the highest weight (0.25) because nothing else matters if bots cannot reach your page. 
In security scoring, TLS validity (0.15) and security headers (0.25 distributed across 6 headers) carry the most weight.\nCross-Reference Intelligence\nThis is the differentiator. After scoring, the report engine maps findings across layers:\ngeo_branch.llms_txt_status vs ai_crawl_risk.robots_policy: If llms.txt is MISSING and robots is PERMISSIVE, flag as CRITICAL. AI scrapers have access with no citation guidance.\napplication_security.exposed_endpoints vs GEO context: If endpoints are exposed, AI RAG parsers can index internal routes from JavaScript bundles.\nbusiness_logic_gaps.data_provenance_leak vs overall visibility: If content has no attribution markup, AI training sets can ingest without linking back.\nLead capture\nWhen a user wants their PDF report, they enter their email. The frontend sends the lead data to the Railway backend, which writes it to a Notion database via the Notion API. Name, email, domain, all four scores, full report JSON, and a Status field (New/Reviewed/Booked/Closed).\nThe PDF generates entirely in-browser using a print-ready HTML template opened in a new window.\nWhat I would do differently\nIf I were starting over, I would add a headless browser module (Playwright) for JavaScript-rendered sites. The current HTML parser uses server-side fetch, which misses content rendered client-side. That is the biggest gap in the current scan accuracy.\nI would also add a competitor comparison feature: scan two domains side by side and diff the results.\nTry it\nFree, no signup: <a href=\"https://thecanopyguard.com\">https://thecanopyguard.com</a>\nThe code is not open source yet, but I am considering it. 
Would love feedback on the scoring methodology, especially the GEO layer.\nAdam McClarin, CISSP\nMeraki is Love Digital | Soulful Tech", "url": "https://wpnews.pro/news/i-built-a-free-audit-tool-that-runs-12-checks-in-parallel-against-any-domain-is", "canonical_source": "https://dev.to/meraki6966/i-built-a-free-audit-tool-that-runs-12-checks-in-parallel-against-any-domain-here-is-the-2icg", "published_at": "2026-05-22 17:53:17+00:00", "updated_at": "2026-05-22 18:04:08.209582+00:00", "lang": "en", "topics": ["developer-tools", "cybersecurity", "products"], "entities": ["Canopy Guard", "Node.js", "Express", "TypeScript", "Railway", "React", "Vercel"], "alternates": {"html": "https://wpnews.pro/news/i-built-a-free-audit-tool-that-runs-12-checks-in-parallel-against-any-domain-is", "markdown": "https://wpnews.pro/news/i-built-a-free-audit-tool-that-runs-12-checks-in-parallel-against-any-domain-is.md", "text": "https://wpnews.pro/news/i-built-a-free-audit-tool-that-runs-12-checks-in-parallel-against-any-domain-is.txt", "jsonld": "https://wpnews.pro/news/i-built-a-free-audit-tool-that-runs-12-checks-in-parallel-against-any-domain-is.jsonld"}}