Replacing Fragile CSS Selectors with LLM-Powered Zero-Shot JSON Extraction

A developer proposes replacing fragile CSS selectors with LLM-powered zero-shot JSON extraction for web scraping. By converting cleaned HTML to Markdown and using a predefined JSON schema, the approach makes pipelines resilient to UI changes and reduces maintenance overhead. The method shifts effort from selector upkeep to schema definition, enabling more robust data collection.

Zero-shot JSON extraction replaces brittle CSS selectors with Large Language Models that semantically map unstructured web content to predefined schemas. By processing cleaned HTML or Markdown through an LLM context window, scraping pipelines become resilient to UI changes, A/B tests, and dynamic class names. This approach shifts data engineering effort from constant selector maintenance to high-level schema definition, enabling agentic data collection.

Web scraping pipelines eventually hit the same bottleneck: selector maintenance. Traditional data extraction relies on identifying structural patterns in the Document Object Model (DOM). You write rules targeting specific HTML nodes using tools like XPath, BeautifulSoup, or Cheerio. A standard selector might look like div.product-details span:nth-child(3) b.price-tag.

This structural dependency is a massive liability. Modern front-end development practices have made static selectors unreliable. Frameworks like React and Vue dynamically inject DOM nodes. CSS-in-JS libraries generate hashed class names (.css-1a2b3c), and utility frameworks like Tailwind CSS replace semantic class names with utility classes (.tw-mt-4) that say nothing about the content they style. When a front-end team deploys a minor UI update or runs an A/B test, the structural path changes. The scraping pipeline breaks, and the extraction script returns null values. Data engineering teams are forced into a reactive cycle, spending their time updating scripts for specific target sites instead of building core infrastructure.
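To make the failure mode concrete, here is a minimal sketch using only Python's standard library. The two markup snippets, the product data, and the `extract_price` helper are hypothetical and exist only for illustration. The path is a rough XPath counterpart of the CSS selector above, written in the limited XPath subset that `xml.etree.ElementTree` supports; it is not an exact translation.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical markup before a front-end redesign.
BEFORE = """
<html><body>
  <div class="product-details">
    <span>Acme Kettle</span>
    <span>In stock</span>
    <span><b class="price-tag">$19.99</b></span>
  </div>
</body></html>
"""

# Hypothetical markup after the redesign: same price, but the class
# names are now hashed or utility-based and one span has been removed.
AFTER = """
<html><body>
  <div class="css-1a2b3c">
    <span>Acme Kettle</span>
    <span><b class="tw-mt-4">$19.99</b></span>
  </div>
</body></html>
"""

# Rough XPath counterpart of: div.product-details span:nth-child(3) b.price-tag
PRICE_PATH = ".//div[@class='product-details']/span[3]/b[@class='price-tag']"


def extract_price(markup: str) -> Optional[str]:
    """Return the price text, or None when the structural path no longer matches."""
    root = ET.fromstring(markup)
    node = root.find(PRICE_PATH)
    return node.text if node is not None else None


if __name__ == "__main__":
    print(extract_price(BEFORE))  # the selector matches the original layout
    print(extract_price(AFTER))   # the same selector finds nothing after the redesign
```

The price is still on the page in the second snippet; only the path to it has changed, which is exactly the kind of breakage that forces a selector update.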
The maintenance cost scales linearly with the number of domains you track.

Zero-shot extraction removes the structural dependency entirely. Instead of telling the code where to look in the DOM, you tell an LLM what data you want. This represents a shift from structural mapping to semantic mapping. Large Language Models understand the concept of a "price", a "product description", or an "author name" based on the surrounding context. If a site moves the price from an