{"slug": "turn-any-company-website-into-structured-b2b-data-one-api-call", "title": "Turn any company website into structured B2B data (one API call)", "summary": "A developer built an API that turns any company website into structured B2B data in a single call. The API reads live site content, never guesses missing fields, and returns clean JSON with company name, sector, description, social links, contact email, and tech stack. It uses a two-pass tech detection system and strict schema validation to ensure reliability.", "body_md": "If you've ever needed to go from a company's website to clean, structured data — its name, sector, a short description, social links, a contact email, and the technologies it runs on — you know the options aren't great:\n\n- **Build your own scraper.** Brittle, and every site is different. You'll spend more time maintaining selectors than using the data.\n- **Pay a heavyweight data provider.** Expensive, and the data is often a stale snapshot from months ago.\n- **Paste HTML into an LLM and pray.** Sometimes you get valid JSON. Sometimes you get a hallucinated CEO email that doesn't exist.\n\nI kept hitting this wall while working with lists of company domains, so I built a small API that does one thing well: **send a company URL, get back clean JSON.**\n\n## The two rules that shaped it\n\n**1. It reads the live site at request time.** Not a database snapshot from last quarter. If a company rebranded yesterday, you get today's version.\n\n**2. It never guesses.** This was the hardest constraint to enforce with an LLM in the pipeline. Missing fields come back as `null` — never invented. 
If there's no contact email on the site, you get `\"email\": null`, not a plausible-looking fake you'd import straight into your CRM.\n\n## What a call looks like\n\nAnd the response:\n\n## How it works under the hood\n\nA few design decisions, for the curious:\n\n- **Two-pass tech detection.** A fast pattern-matching pass first (think Wappalyzer-style fingerprints), then an LLM enrichment pass only for what patterns can't catch. Cheaper and faster than going full-LLM on everything.\n- **Hard content trimming before the LLM.** Page content is capped before any model call. This keeps latency and cost predictable instead of exploding on heavy JS-rendered sites.\n- **Caching with a 14-day TTL.** Repeat lookups on the same domain return in ~200 ms instead of re-scraping. The `cached` field in the response tells you which path you hit.\n- **Strict schema validation.** Every response is validated against a strict schema (Pydantic v2) before it leaves the API. Either the JSON conforms, or you get a proper error — never half-broken output.\n\n## Use cases I built it for\n\n- **Lead enrichment:** turn a list of prospect domains into CRM-ready records.\n- **Tech-based targeting:** filter prospects by their stack (\"show me companies running Shopify\").\n- **Data hygiene:** verify and refresh company records against the live web instead of stale databases.\n\n## Try it\n\nThere's a free tier (100 requests/month), enough to test it against your own data:\n\n👉 [AI Live Company Enrichment & Tech Detector on RapidAPI](https://rapidapi.com/coinduciel143/api/ai-live-company-enrichment-tech-detector)\n\nI'd genuinely love feedback from other builders — on the positioning, the pricing, and especially: **what field would you want it to extract next?** Drop a comment below.", "url": "https://wpnews.pro/news/turn-any-company-website-into-structured-b2b-data-one-api-call", "canonical_source": 
"https://dev.to/cdcsaas/turn-any-company-website-into-structured-b2b-data-one-api-call-38p0", "published_at": "2026-06-13 00:08:28+00:00", "updated_at": "2026-06-13 00:43:24.013439+00:00", "lang": "en", "topics": ["ai-products", "ai-tools", "ai-infrastructure"], "entities": ["RapidAPI", "Pydantic", "Wappalyzer"], "alternates": {"html": "https://wpnews.pro/news/turn-any-company-website-into-structured-b2b-data-one-api-call", "markdown": "https://wpnews.pro/news/turn-any-company-website-into-structured-b2b-data-one-api-call.md", "text": "https://wpnews.pro/news/turn-any-company-website-into-structured-b2b-data-one-api-call.txt", "jsonld": "https://wpnews.pro/news/turn-any-company-website-into-structured-b2b-data-one-api-call.jsonld"}}