# CrawlForge v4.2.2: New CLI + 3 Tools for Local AI Scraping

CrawlForge v4.2.2 introduces a new command-line interface (CLI) and three new tools, shifting its focus toward local, API-key-free web scraping for AI. The new tools include `extract_with_llm`, which defaults to local Ollama models for structured data extraction, and `scrape_template`, which provides pre-built scrapers for sites like Amazon and GitHub. The release also adds a tool to list local Ollama models, with all new features accessible via the CLI and sharing the same credit system as the existing API.

Today we are shipping CrawlForge v4.2.2, our biggest release since launch. It brings three new tools, a standalone command-line interface, and a quiet shift in how we think about web scraping for AI: most of it should run locally, on your own machine, without API keys.

This post is the umbrella for everything in 4.2.2. Three deep-dive guides follow in the next nine days.

## Table of Contents

- [What Shipped](#what-shipped)
- [The New CrawlForge CLI](#the-new-crawlforge-cli)
- [Extract With LLM: Local AI Extraction](#extract-with-llm-local-ai-extraction)
- [Scrape Template: Ten Sites, One Call](#scrape-template-ten-sites-one-call)
- [list_ollama_models: Free Model Discovery](#list_ollama_models-free-model-discovery)
- [Old Workflow vs v4.2.2 Workflow](#old-workflow-vs-v422-workflow)
- [Credit Costs](#credit-costs)
- [How to Upgrade](#how-to-upgrade)
- [What Is Next](#what-is-next)

## What Shipped

v4.2.2 adds four things:

- `@crawlforge/cli` -- a standalone command-line tool exposing all 23 CrawlForge tools to your shell. No MCP client required.
- `extract_with_llm` -- LLM-powered structured extraction that defaults to local Ollama. No external API key needed.
- `scrape_template` -- pre-built scrapers for Amazon, LinkedIn, GitHub, YouTube, Reddit, Hacker News, Stack Overflow, npm, Product Hunt, and Twitter/X.
- `list_ollama_models` -- a free discovery tool that lists models on your local Ollama instance.

Tool count goes from 20 to 23.
The CLI is brand new -- it is not a tool, it is a delivery channel.

    +----------------+     +-------------------+     +----------------+
    |  Your Shell    | <-- |  @crawlforge/cli  | <-- |  CrawlForge    |
    |  cron, CI      |     |  JSON in/out      |     |  API + Tools   |
    +----------------+     +-------------------+     +----------------+
                                    ^
                     No MCP handshake. Just HTTPS + stdout.

## The New CrawlForge CLI

The CLI is the shortest path from intent to scraped data. You install it once, set an environment variable, and every CrawlForge tool becomes a command:

    npm install -g @crawlforge/cli
    export CRAWLFORGE_API_KEY="cf_live_your_key_here"

    crawlforge scrape https://example.com
    crawlforge search "best MCP servers 2026"
    crawlforge research "AI agent frameworks" --depth 3

Why does this matter? Because MCP is great for AI agents, but a lot of scraping work is not an AI agent task. It is a cron job. A CI step. A one-off pull from your terminal. For that, you want JSON on stdout that pipes into `jq`, not a JSON-RPC handshake.

### Why have a CLI when MCP already exists?

MCP is optimized for AI agents picking tools dynamically. The CLI is optimized for humans typing commands and scripts piping JSON. Different shapes for different jobs:

| Workflow | Best fit |
|---|---|
| Claude/Cursor agent | MCP |
| Cron job | CLI |
| GitHub Actions step | CLI |
| One-off terminal | CLI |
| Server in a loop | Raw API |

All three paths hit the same backend, share the same credit balance, and use the same API key.

Read the [complete CrawlForge CLI guide](https://www.crawlforge.dev/blog/web-scraping-cli-complete-guide) for the full command reference and real-world workflows.

## Extract With LLM: Local AI Extraction

`extract_with_llm` is structured extraction powered by a language model. You hand it a URL and a schema, it gives you back JSON. The new part is that it defaults to local Ollama rather than calling OpenAI or Anthropic.
For example, a request that pulls three fields from a Hacker News item:

    {
      "url": "https://news.ycombinator.com/item?id=123456",
      "schema": {
        "type": "object",
        "properties": {
          "title": { "type": "string" },
          "points": { "type": "number" },
          "comments": { "type": "number" }
        }
      },
      "provider": "ollama",
      "model": "llama3.1:8b"
    }

Three things follow from the local-first default:

- No third-party API costs. The LLM is free. You only pay 3 CrawlForge credits per extraction.
- No data leaving your machine. Scraped content stays on localhost.
- No new API key to manage. If Ollama is installed, you are done.

### When to still use OpenAI or Anthropic

Local models are great for predictable schemas (titles, prices, counts, ratings). For long-form reasoning -- summarizing a 10,000-word article, classifying nuanced sentiment, extracting fields that require world knowledge -- a frontier model still wins. Switch providers with one parameter:

    crawlforge extract https://example.com \
      --provider anthropic \
      --model claude-sonnet-4-6

You pay the provider's per-token cost plus 3 CrawlForge credits. Same schema, same output shape.

Detailed guide: [extract data with local LLMs](https://www.crawlforge.dev/blog/extract-data-with-local-llms-ollama).

## Scrape Template: Ten Sites, One Call

`scrape_template` is for the long tail of scraping requests that all look the same: "get me product data from Amazon", "get me a GitHub repo's metadata", "get me the top posts on Hacker News today". You should not need to write CSS selectors for these. We did it once, we maintain it, you call it.
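When templates are called from a script rather than typed into a shell, a thin wrapper can keep the call sites uniform. The following is a minimal sketch and not part of CrawlForge: it assumes the `crawlforge template <name>` subcommand with the `--url` and `--top` flags used in this post's examples, and it assumes the CLI writes JSON to stdout. The helper names `build_template_command` and `run_template` are hypothetical.

```python
import json
import subprocess
from typing import Any, List, Optional


def build_template_command(
    template: str,
    url: Optional[str] = None,
    top: Optional[int] = None,
) -> List[str]:
    """Build the argv list for one template call.

    The subcommand and flag names are assumptions taken from this post's
    examples, not from a published CLI reference.
    """
    cmd = ["crawlforge", "template", template]
    if url is not None:
        cmd += ["--url", url]
    if top is not None:
        cmd += ["--top", str(top)]
    return cmd


def run_template(
    template: str,
    url: Optional[str] = None,
    top: Optional[int] = None,
) -> Any:
    """Run the CLI and parse its stdout as JSON (assumed output format)."""
    result = subprocess.run(
        build_template_command(template, url=url, top=top),
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return json.loads(result.stdout)
```

Building the argument list separately from running it means the command can be inspected or logged without the CLI installed. Calling `run_template("hackernews", top=10)` needs the CLI on `PATH` and a valid API key in the environment.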
From the shell, each template is a single command:

    crawlforge template amazon --url "https://www.amazon.com/dp/B0CHX1W1XY"
    crawlforge template github --url "https://github.com/anthropics/anthropic-sdk-python"
    crawlforge template hackernews --top 10

Ten templates ship in this release:

| Template | What it returns | Credits |
|---|---|---|
| `amazon` | Product title, price, rating, reviews, images | 1 |
| `linkedin` | Profile name, headline, experience, skills | 1 |
| `github` | Repo metadata, stars, languages, README | 1 |
| `youtube` | Video title, views, channel, transcript | 1 |
| `reddit` | Post title, score, comments, top replies | 1 |
| `hackernews` | Story title, points, URL, comments | 1 |
| `stackoverflow` | Question, answers, accepted, vote counts | 1 |
| `npm` | Package metadata, weekly downloads, versions | 1 |
| `producthunt` | Product name, tagline, upvotes, makers | 1 |
| `tweet` | Tweet text, author, engagement, replies | 1 |

Full walkthrough with code: [scrape Amazon, LinkedIn, and GitHub with one tool](https://www.crawlforge.dev/blog/scrape-amazon-linkedin-github-templates).

## list_ollama_models: Free Model Discovery

Most useful as a sanity check before running `extract_with_llm`. Lists every model on your local Ollama instance with name, size, and modified date.

    crawlforge extract --list-ollama-models

Costs zero credits. It does no scraping, no LLM call -- it just queries Ollama's local API on `127.0.0.1:11434` and returns the result. If you have ever wondered which model you actually have installed, this is the answer.

## Old Workflow vs v4.2.2 Workflow

| Task | Pre-4.2.2 | v4.2.2 |
|---|---|---|
| Scrape from your terminal | curl + custom parser, or boot a Node REPL | crawlforge scrape