[ARTICLE · art-24203] src=spidra.io pub= topic=ai-tools verified=true sentiment=↑ positive

Spidra API Python tutorial: scrape any website with Python

Spidra released a Python SDK that allows developers to scrape any website — including those with JavaScript rendering, anti-bot protections, and CAPTCHAs — using a single package and plain English prompts. The SDK handles browser automation, anti-bot bypass, and AI extraction on Spidra's infrastructure, returning structured data without requiring users to manage proxies, stealth plugins, or selector maintenance. The tool is available now with a free API key from app.spidra.io.

18 min read · published Jun 10, 2026

Web scraping with Python has a well-worn path. You start with requests and BeautifulSoup for simple static pages. Then you hit a JavaScript-rendered site and reach for Playwright. Then you hit Cloudflare and spend two hours debugging stealth plugins. Then your selectors break because the site was redesigned.

Spidra's Python SDK cuts across that whole progression. You install one package, describe what you want in plain English, and get back structured data from any website. The browser rendering, anti-bot bypass, CAPTCHA solving, and AI extraction all happen on Spidra's infrastructure. You get clean results back.

This tutorial walks through the entire Python SDK from installation to crawling a full website. All code examples come directly from the SDK and will work as written.

Prerequisites #

  • Python 3.9 or higher
  • A Spidra API key (get one free at app.spidra.io under Settings → API Keys)

Installation #

pip install spidra

Once installed, store your API key as an environment variable. Never hardcode it in your scripts.

export SPIDRA_API_KEY="spd_YOUR_API_KEY"

Setting up the client #

Everything in the SDK flows through a single SpidraClient instance. You initialise it once and then access all functionality through its namespaced attributes.

from spidra import SpidraClient

spidra = SpidraClient(api_key="spd_YOUR_API_KEY")

In practice, pull the key from your environment:

import os
from spidra import SpidraClient

spidra = SpidraClient(api_key=os.environ["SPIDRA_API_KEY"])
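Indexing os.environ directly raises a bare KeyError when the variable is unset. A minimal sketch of a friendlier lookup follows. Note that load_api_key is a hypothetical helper written for this tutorial, not part of the SDK:

```python
import os


def load_api_key(env=None, name="SPIDRA_API_KEY"):
    """Return the API key from the environment, or raise a readable error.

    Hypothetical helper for illustration; not part of the Spidra SDK.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        # Fail early with a message that says what to fix.
        raise RuntimeError(f"{name} is not set. Export it before running this script.")
    return key
```

The returned string can then be passed as api_key when constructing SpidraClient.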

The client exposes five namespaces:

| Namespace | What it does |
| --- | --- |
| spidra.scrape | Scrape one to three URLs with browser automation and AI extraction |
| spidra.batch | Process up to 50 URLs in parallel |
| spidra.crawl | Discover and scrape pages across an entire site |
| spidra.logs | Access the history of every scrape your API key has made |
| spidra.usage | Check credit and request consumption |

Async by default, sync anywhere #

The SDK is async-first. Every method is an async function that you await inside an async context.

import asyncio
import os
from spidra import SpidraClient, ScrapeParams, ScrapeUrl

spidra = SpidraClient(api_key=os.environ["SPIDRA_API_KEY"])

async def main():
    job = await spidra.scrape.run(ScrapeParams(
        urls=[ScrapeUrl(url="https://news.ycombinator.com")],
        prompt="Extract the top 5 post titles and their point scores",
        output="json",
    ))
    print(job.result.content)

asyncio.run(main())

If you are working in a regular script, a Django view, a Flask route, or a Jupyter notebook, use the _sync counterpart instead. It handles the event loop automatically, including environments like Jupyter where calling asyncio.run() directly would fail.

from spidra import SpidraClient, ScrapeParams, ScrapeUrl
import os

spidra = SpidraClient(api_key=os.environ["SPIDRA_API_KEY"])

job = spidra.scrape.run_sync(ScrapeParams(
    urls=[ScrapeUrl(url="https://news.ycombinator.com")],
    prompt="Extract the top 5 post titles and their point scores",
    output="json",
))

print(job.result.content)

Every method in the SDK has both versions. The rest of this tutorial uses the _sync variants in the examples for simplicity, but the async versions work identically: drop the _sync suffix and add await.
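For intuition on why a sync wrapper matters in notebooks: asyncio.run() raises RuntimeError when an event loop is already running in the current thread. The sketch below shows one common workaround, running the coroutine in a short-lived worker thread with its own loop. It illustrates the general technique only. How the SDK's _sync methods are implemented internally is not described in this tutorial, and run_coro_sync is a hypothetical name:

```python
import asyncio
import threading


def run_coro_sync(coro):
    """Run a coroutine to completion from synchronous code.

    Illustrative sketch only; not the Spidra SDK's implementation.
    """
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so asyncio.run() is safe.
        return asyncio.run(coro)

    # A loop is already running (for example in Jupyter): use a worker
    # thread, which gets a fresh event loop of its own.
    outcome = {}

    def worker():
        try:
            outcome["value"] = asyncio.run(coro)
        except BaseException as exc:  # re-raised in the calling thread below
            outcome["error"] = exc

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

The trade-off in this approach is that the calling thread blocks until the coroutine finishes, which is the behaviour you want from a synchronous call.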

Part 1: Scraping a page #

The scrape namespace handles single-page scraping. You can pass up to three URLs per request and they run in parallel.

Your first scrape

from spidra import SpidraClient, ScrapeParams, ScrapeUrl
import os

spidra = SpidraClient(api_key=os.environ["SPIDRA_API_KEY"])

job = spidra.scrape.run_sync(ScrapeParams(
    urls=[ScrapeUrl(url="https://news.ycombinator.com")],
))

print(job.result.content)

Without a prompt, Spidra returns the raw page content as Markdown. The page loads in a real browser, JavaScript executes, and the full rendered content is converted to clean Markdown. That is what ends up in job.result.content.

How the job lifecycle works

When you call run_sync(), the SDK submits the job, then polls in the background every 3 seconds until it is done. From your side it looks synchronous. Under the hood, the job moves through these states:

waiting → active → completed (or failed)

waiting means the job is queued. active means the browser is running. completed means the result is ready. failed means something went wrong.

If you want to submit a job and check on it later rather than waiting for it to finish, use submit() and get() separately (submit_sync() and get_sync() in synchronous code):

from spidra import SpidraClient, ScrapeParams, ScrapeUrl
import os, time

spidra = SpidraClient(api_key=os.environ["SPIDRA_API_KEY"])

queued = spidra.scrape.submit_sync(ScrapeParams(
    urls=[ScrapeUrl(url="https://example.com")],
    prompt="Extract the main headline",
))

print(f"Job submitted: {queued.job_id}")

time.sleep(5)
status = spidra.scrape.get_sync(queued.job_id)

if status.status == "completed":
    print(status.result.content)
elif status.status == "failed":
    print(f"Failed: {status.error}")
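The example above sleeps once and checks once. If you manage the lifecycle yourself, you will usually want a loop with a deadline. Here is a minimal sketch of such a loop, written against a generic callable so it runs without the SDK. The names poll_until_done and fetch_status are hypothetical; in practice fetch_status might wrap a call such as spidra.scrape.get_sync(job_id) and return its status field:

```python
import time

# Terminal states from the job lifecycle described above.
TERMINAL_STATES = {"completed", "failed"}


def poll_until_done(fetch_status, interval=3.0, timeout=120.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it returns a terminal state or time runs out.

    Illustrative sketch; not the SDK's own polling code.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout} seconds")
        sleep(interval)
```

The defaults mirror the 3-second interval and 120-second limit that this tutorial gives for run_sync(). Passing sleep and clock as parameters keeps the loop easy to test.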

Part 2: Extracting data with prompts #

The prompt field is what makes Spidra different from a plain headless browser scraper. Instead of writing CSS selectors to find elements, you describe what you want in plain English and the AI figures out where it is on the page.

job = spidra.scrape.run_sync(ScrapeParams(
    urls=[ScrapeUrl(url="https://news.ycombinator.com")],
    prompt="Extract the top 10 post titles and their point scores",
    output="json",
))

print(job.result.content)

Setting output="json" tells the AI to return structured JSON rather than formatted text. The default is "markdown".

The AI reads the rendered page the way a person would. It knows a number next to a currency symbol is a price, that a short bold line at the top of a product page is probably the title, and that a longer block of text is probably a description. You do not need to know the class names or DOM structure of the page.

That said, Spidra also fully supports CSS selectors and XPath for browser actions if you prefer to be explicit about where to find things. We will cover that in the browser actions section.

Part 3: Enforcing output shape with JSON schema #

Plain prompts are flexible but not predictable. The AI decides what fields to return and what to name them. That works for exploration but it is a problem in production where a database or downstream service expects a specific shape every time.

The schema field solves this. Pass a JSON Schema object and the AI must return data matching it exactly. Fields marked as required always appear in the output. If the page does not have a value for a required field, it comes back as None rather than being silently omitted.

job = spidra.scrape.run_sync(ScrapeParams(
    urls=[ScrapeUrl(url="https://jobs.example.com/senior-engineer")],
    prompt="Extract the job listing details. Normalize salary to a USD number.",
    output="json",
    schema={
        "type": "object",
        "required": ["title", "company", "remote"],
        "properties": {
            "title":           {"type": "string"},
            "company":         {"type": "string"},
            "remote":          {"type": ["boolean", "null"]},
            "salary_min":      {"type": ["number", "null"]},
            "salary_max":      {"type": ["number", "null"]},
            "employment_type": {
                "type": ["string", "null"],
                "enum": ["full_time", "part_time", "contract", None]
            },
            "skills": {"type": "array", "items": {"type": "string"}},
        },
    },
))

print(job.result.content)

When you provide a schema, output is automatically set to "json". You do not need to set it yourself.
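Even with a schema in place, it can be worth checking a record before writing it to a database, since a required field may be present but empty. A small sketch of such a check follows. The helper check_required is hypothetical, and it assumes the extracted record has already been parsed into a Python dict:

```python
def check_required(record, schema):
    """Return (missing, null) lists of required field names.

    missing: required keys absent from the record.
    null:    required keys present with a value of None.
    Hypothetical helper; not part of the Spidra SDK.
    """
    required = schema.get("required", [])
    missing = [name for name in required if name not in record]
    null = [name for name in required if name in record and record[name] is None]
    return missing, null


schema = {"type": "object", "required": ["title", "company", "remote"]}
record = {"title": "Senior Engineer", "company": None}

missing, null = check_required(record, schema)
print("missing:", missing)
print("null:", null)
```

This only inspects the top-level required list. Full JSON Schema validation (types, enums, nested objects) needs a dedicated validator.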

If you use Pydantic for data validation in your application, you can generate the schema from your existing models rather than writing it by hand:

from pydantic import BaseModel
from typing import Optional

class JobListing(BaseModel):
    title: str
    company: str
    remote: Optional[bool] = None
    salary_min: Optional[float] = None
    salary_max: Optional[float] = None

job = spidra.scrape.run_sync(ScrapeParams(
    urls=[ScrapeUrl(url="https://jobs.example.com/senior-engineer")],
    prompt="Extract the job listing details",
    schema=JobListing.model_json_schema(),
))

One schema definition in your codebase. Works in your application logic and in your scraping requests.

Part 4: Browser actions #

Some pages require interaction before the content you want is visible. A cookie banner blocking everything. A search form that needs filling. Lazy-loaded content that only appears after scrolling. Tabs that hide data until clicked.

The actions list inside each ScrapeUrl lets you interact with the page before extraction runs. Actions execute in order inside the browser.

from spidra import BrowserAction

job = spidra.scrape.run_sync(ScrapeParams(
    urls=[
        ScrapeUrl(
            url="https://example.com/products",
            actions=[
                BrowserAction(type="click", selector="#accept-cookies"),
                BrowserAction(type="wait", duration=1000),
                BrowserAction(type="scroll", to="80%"),
            ],
        ),
    ],
    prompt="Extract all product names and prices visible on the page",
))

For click, check, and uncheck actions, you have two options for targeting elements:

  • selector for a CSS selector or XPath expression like "#accept-cookies" or ".submit-btn"
  • value for a plain English description like "Accept cookies button", which Spidra uses to locate the element with AI

Both are valid and you can mix them in the same actions list:

actions=[
    BrowserAction(type="click", selector="#accept-cookies"),  # CSS selector
    BrowserAction(type="click", value="Search button"),        # plain English
]

Use whichever is more convenient for the page you are working with.

All available actions

| Action | What it does | Key fields |
| --- | --- | --- |
| click | Clicks a button, link, or any element | selector or value |
| type | Types text into an input field | selector, value |
| check | Checks a checkbox | selector or value |
| uncheck | Unchecks a checkbox | selector or value |
| wait | Waits for a number of milliseconds | duration |
| scroll | Scrolls to a percentage of the page height | to (e.g. "80%") |
| forEach | Finds matching elements and processes each one | value, mode |

The forEach action

forEach is the most powerful action in the SDK. It finds a set of matching elements on the page and processes each one individually, then combines all the results into a single output.

It works in three modes:

  • inline reads the content of each matched element directly. Use this for product cards, table rows, or any content that lives inside the element.
  • navigate follows each element as a link, loads the destination page, and scrapes it. Use this when the data you want is on detail pages you need to click into.
  • click clicks each element to expand or reveal content, then scrapes what appears. Use this for accordions, modals, or expandable sections.

job = spidra.scrape.run_sync(ScrapeParams(
    urls=[
        ScrapeUrl(
            url="https://directory.example.com/companies",
            actions=[
                BrowserAction(type="click", value="Accept cookies"),
                BrowserAction(
                    type="forEach",
                    value="Find all company listing cards",
                    mode="navigate",
                    max_items=20,
                    item_prompt="Extract company name, website, and industry",
                    pagination={
                        "nextSelector": "a.next-page",
                        "maxPages": 3
                    }
                ),
            ],
        ),
    ],
    output="json",
))

This dismisses the cookie banner, finds every company card on the page, navigates into each company's profile page, extracts the company details, and repeats across three pages of pagination. All in a single request.

Part 5: Proxy and geo-targeting #

Some sites block requests from cloud infrastructure IP ranges. Others show different content depending on where you are browsing from. Setting use_proxy=True routes the request through a residential proxy.

job = spidra.scrape.run_sync(ScrapeParams(
    urls=[ScrapeUrl(url="https://www.amazon.de/gp/bestsellers")],
    prompt="List the top 10 products with name and price",
    use_proxy=True,
    proxy_country="de",
))

proxy_country accepts:

  • A two-letter ISO country code like "us", "de", "gb", "fr", "jp"
  • "eu" to rotate randomly across all 27 EU member states
  • "global", or omit it, for no country preference
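If the country comes from user input or a config file, a quick client-side check can catch obvious mistakes before a request is sent. The sketch below encodes only the three accepted forms listed above. The helper normalize_proxy_country is hypothetical, lowercasing is an assumption based on the lowercase examples in this tutorial, and it does not verify that a two-letter code is a real ISO country:

```python
def normalize_proxy_country(value):
    """Return a cleaned proxy_country value, or None for no preference.

    Hypothetical helper; checks shape only, not ISO validity.
    """
    if value is None:
        return None
    code = value.strip().lower()
    if code in ("eu", "global"):
        return code
    if len(code) == 2 and code.isascii() and code.isalpha():
        return code
    raise ValueError(f"unsupported proxy_country: {value!r}")


print(normalize_proxy_country("DE"))
print(normalize_proxy_country("global"))
```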

Proxy usage is billed from your bandwidth quota, not your credits. There is no credit multiplier for enabling proxy routing.

Part 6: Scraping pages behind a login #

To access content that requires authentication, pass your session cookies as a raw cookie header string. Log in through your browser, open DevTools, copy the Cookie header from any authenticated request, and pass it here.

job = spidra.scrape.run_sync(ScrapeParams(
    urls=[ScrapeUrl(url="https://app.example.com/dashboard")],
    prompt="Extract the monthly revenue and active user count",
    cookies="session=abc123; auth_token=xyz789",
))

Both standard cookie format (name=value; name2=value2) and Chrome DevTools paste format work. Cookies are passed ephemerally to the browser worker and never stored by Spidra.
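If your cookies live in a dict (for example, loaded from a secrets store) rather than a pasted header, the standard name=value; name2=value2 string is easy to build, and the standard library can parse one back for inspection. A short sketch follows; build_cookie_header and parse_cookie_header are hypothetical helper names, and only Python's http.cookies module is used:

```python
from http.cookies import SimpleCookie


def build_cookie_header(cookies):
    """Join a dict of cookie names and values into a raw Cookie header string."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


def parse_cookie_header(header):
    """Parse a raw Cookie header string into a dict, e.g. to inspect what you copied."""
    jar = SimpleCookie()
    jar.load(header)
    return {name: morsel.value for name, morsel in jar.items()}


header = build_cookie_header({"session": "abc123", "auth_token": "xyz789"})
print(header)
print(parse_cookie_header(header))
```

The resulting string is the same shape as the value passed to cookies in the example above.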

Part 7: Stripping boilerplate with extract_content_only #

By default Spidra returns the full page content including navigation, headers, footers, and sidebars. If you only want the main content, turn on extract_content_only. It strips the noise before the AI sees the page, which reduces token usage and keeps the result focused.

job = spidra.scrape.run_sync(ScrapeParams(
    urls=[ScrapeUrl(url="https://blog.example.com/long-article")],
    prompt="Summarize this article in three sentences",
    extract_content_only=True,
))

This is particularly useful for article pages, documentation, and any page where the main content is surrounded by heavy navigation.

Part 8: Screenshots #

Capture screenshots of scraped pages for debugging, monitoring, or archival.

job = spidra.scrape.run_sync(ScrapeParams(
    urls=[ScrapeUrl(url="https://example.com")],
    screenshot=True,
    full_page_screenshot=True,
))

print(job.result.screenshots)  # list of URLs

screenshot=True captures the visible viewport. full_page_screenshot=True captures the entire scrollable page.

Part 9: Controlling polling behaviour #

By default run_sync() polls every 3 seconds and gives up after 120 seconds. For complex pages or large crawls that take longer, pass a PollOptions object to override both.

from spidra import PollOptions

job = spidra.scrape.run_sync(
    ScrapeParams(
        urls=[ScrapeUrl(url="https://example.com")],
        prompt="Extract all content from this page",
    ),
    PollOptions(poll_interval=5, timeout=180),
)

PollOptions works on batch.run_sync() and crawl.run_sync() too.

Part 10: Batch scraping #

When you have a list of URLs to process, the batch endpoint handles up to 50 at a time in parallel. Each URL runs in its own independent worker.

Note that batch URLs are plain strings, not ScrapeUrl objects. Per-URL browser actions are not supported in batch mode.

from spidra import SpidraClient, BatchScrapeParams
import os

spidra = SpidraClient(api_key=os.environ["SPIDRA_API_KEY"])

batch = spidra.batch.run_sync(BatchScrapeParams(
    urls=[
        "https://shop.example.com/product/1",
        "https://shop.example.com/product/2",
        "https://shop.example.com/product/3",
    ],
    prompt="Extract product name, price, and whether it is in stock",
    output="json",
))

print(f"{batch.completed_count}/{batch.total_urls} completed")

for item in batch.items:
    if item.status == "completed":
        print(item.url, item.result)
    else:
        print(f"Failed: {item.url} — {item.error}")

Batch with schema

The same schema enforcement that works in single scraping works in batch. Every item returns data matching the same shape:

batch = spidra.batch.run_sync(BatchScrapeParams(
    urls=urls,
    prompt="Extract the product details",
    schema={
        "type": "object",
        "required": ["name", "price"],
        "properties": {
            "name":      {"type": "string"},
            "price":     {"type": ["number", "null"]},
            "currency":  {"type": ["string", "null"]},
            "available": {"type": ["boolean", "null"]}
        }
    }
))

Managing batches

Once a batch is running, you have a few additional operations available:

Retrying failures. If some items fail due to transient errors, retry just those without re-running the ones that already succeeded:

if batch.failed_count > 0:
    spidra.batch.retry_sync(batch_id)  # batch_id: the ID of the batch to retry

Cancelling a batch. Stop a running batch and get credits refunded for anything that has not started yet:

response = spidra.batch.cancel_sync(batch_id)
print(f"Cancelled {response.cancelled_items} items, refunded {response.credits_refunded} credits")

Listing past batches:

from spidra import BatchListParams

page = spidra.batch.list_sync(BatchListParams(page=1, limit=20))

for job in page.jobs:
    print(job.uuid, job.status, f"{job.completed_count}/{job.total_urls}")

Processing large URL lists

The batch endpoint caps at 50 URLs per request. For larger lists, chunk them and process in batches:

import os, json
from spidra import SpidraClient, BatchScrapeParams

spidra = SpidraClient(api_key=os.environ["SPIDRA_API_KEY"])

def scrape_url_list(urls: list[str], prompt: str, batch_size: int = 50) -> list:
    all_results = []

    for i in range(0, len(urls), batch_size):
        chunk = urls[i:i + batch_size]
        print(f"Processing batch {i // batch_size + 1} of {-(-len(urls) // batch_size)}...")

        batch = spidra.batch.run_sync(BatchScrapeParams(
            urls=chunk,
            prompt=prompt,
            output="json",
        ))

        for item in batch.items:
            if item.status == "completed":
                all_results.append({
                    "url": item.url,
                    "data": item.result
                })
            else:
                print(f"  Failed: {item.url}")

    return all_results

urls = [f"https://example.com/product/{i}" for i in range(1, 201)]
results = scrape_url_list(urls, "Extract product name and price")

with open("results.jsonl", "w") as f:
    for record in results:
        f.write(json.dumps(record) + "\n")

print(f"Saved {len(results)} results")

Part 11: Crawling entire websites #

Batch scraping works when you already have a list of URLs. Crawling is for when you want Spidra to discover pages for you.

You give it a starting URL, describe which pages to follow, and describe what to extract from each one. Spidra loads the base URL, finds links matching your crawl instruction, visits each one, and applies your transform instruction to every page it visits.

from spidra import SpidraClient, CrawlParams, PollOptions
import os

spidra = SpidraClient(api_key=os.environ["SPIDRA_API_KEY"])

job = spidra.crawl.run_sync(
    CrawlParams(
        base_url="https://competitor.com/blog",
        crawl_instruction="Follow links to blog posts only. Skip tag pages, category pages, and the homepage.",
        transform_instruction="Extract the post title, author name, publish date, and a one-sentence summary.",
        max_pages=20,
        use_proxy=True,
    ),
    PollOptions(timeout=360),
)

for page in job.result:
    print(page.url, page.data)

Three fields are required: base_url, crawl_instruction, and transform_instruction.

crawl_instruction tells the crawler which links to follow. transform_instruction tells the AI what to extract from each page it visits. max_pages defaults to 5 and goes up to 20. Pass a higher timeout in PollOptions for larger crawls since the default 120 seconds may not be enough.

The same use_proxy, proxy_country, and cookies options from single scraping all work here too.

Downloading the raw content

Once a crawl completes, you can fetch the raw HTML and Markdown for every page that was crawled. The URLs are signed and expire after an hour.

response = spidra.crawl.pages_sync(job_id)

for page in response.pages:
    print(page.url, page.status)

Re-extracting with a different prompt

If you crawled a site and later want to pull out different information, you do not have to re-crawl. extract() runs a new AI pass over the already-crawled content and only charges transformation credits.

queued = spidra.crawl.extract_sync(
    completed_job_id,
    "Extract only product SKUs and prices as structured JSON",
)

result = spidra.crawl.get_sync(queued.job_id)

Browsing crawl history

from spidra import CrawlHistoryParams

response = spidra.crawl.history_sync(CrawlHistoryParams(page=1, limit=10))
print(f"Total crawl jobs: {response.total}")

stats = spidra.crawl.stats_sync()
print(f"All-time crawls: {stats.total}")

Part 12: Logs and usage #

Browsing your scrape logs

Every request your API key makes is logged automatically. You can filter by status, URL, date range, and more.

from spidra import ScrapeLogsParams

response = spidra.logs.list_sync(ScrapeLogsParams(
    status="failed",
    search_term="amazon.com",
    date_start="2025-01-01",
    date_end="2025-12-31",
    page=1,
    limit=20,
))

for log in response.logs:
    print(log.urls[0].get("url"), log.status, log.credits_used)

To get full details of a single log entry including the extraction output:

log = spidra.logs.get_sync(log_uuid)
print(log.result_data)

Checking usage

Track your credit and request consumption over time:

rows = spidra.usage.get_sync("30d")  # "7d" | "30d" | "weekly"

for row in rows:
    print(row.date, row.requests, row.credits)

"7d" gives one row per day for the last week. "30d" gives the last 30 days. "weekly" gives one row per week for the last seven weeks.
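Per-row output is often more useful once it is rolled up. The sketch below totals a list of usage rows and finds the heaviest day. It assumes each row exposes date, requests, and credits as in the loop above, and that requests and credits are numeric. The helper summarize_usage is hypothetical, and SimpleNamespace objects with made-up values stand in for real rows so the example runs without the SDK:

```python
from types import SimpleNamespace


def summarize_usage(rows):
    """Aggregate usage rows into totals. Hypothetical helper, not an SDK call."""
    rows = list(rows)
    total_requests = sum(row.requests for row in rows)
    total_credits = sum(row.credits for row in rows)
    busiest = max(rows, key=lambda row: row.credits, default=None)
    return {
        "requests": total_requests,
        "credits": total_credits,
        # Guard against division by zero when there was no traffic.
        "credits_per_request": total_credits / total_requests if total_requests else 0.0,
        "busiest_date": busiest.date if busiest else None,
    }


# Stand-in rows with illustrative numbers only.
sample_rows = [
    SimpleNamespace(date="2026-06-01", requests=10, credits=25),
    SimpleNamespace(date="2026-06-02", requests=30, credits=55),
]
summary = summarize_usage(sample_rows)
print(summary)
```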

Part 13: Error handling #

Every API error maps to a typed exception class. Catch exactly what you care about and let everything else bubble up.

from spidra import (
    SpidraError,
    SpidraAuthenticationError,
    SpidraInsufficientCreditsError,
    SpidraRateLimitError,
    SpidraServerError,
)

try:
    job = spidra.scrape.run_sync(ScrapeParams(
        urls=[ScrapeUrl(url="https://example.com")],
        prompt="Extract the main headline",
    ))
    print(job.result.content)

except SpidraAuthenticationError:
    print("API key is missing or invalid. Check your SPIDRA_API_KEY.")

except SpidraInsufficientCreditsError:
    print("Account is out of credits. Top up at app.spidra.io.")

except SpidraRateLimitError:
    print("Rate limit hit. Wait before retrying.")

except SpidraServerError as e:
    print(f"Server error ({e.status}): {e.message}. Retry is usually safe.")

except SpidraError as e:
    print(f"API error {e.status}: {e.message}")

| Exception | HTTP status | When it fires |
| --- | --- | --- |
| SpidraAuthenticationError | 401 | API key missing or invalid |
| SpidraInsufficientCreditsError | 403 | No credits remaining |
| SpidraRateLimitError | 429 | Too many requests |
| SpidraServerError | 500 | Unexpected error on Spidra's side |
| SpidraError | any | Base class for all Spidra exceptions |

All exceptions expose .status for the HTTP code and .message for a human-readable explanation.
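Rate-limit and server errors are the two cases where retrying makes sense. A common pattern is exponential backoff, sketched below. To keep the block runnable without the SDK, it defines stand-in exception classes. In real code you would pass SpidraRateLimitError and SpidraServerError as the retryable tuple. The helper call_with_retry and its default delays are illustrative choices, not SDK behaviour:

```python
import time


class RateLimitError(Exception):
    """Stand-in for SpidraRateLimitError so this sketch runs without the SDK."""


class ServerError(Exception):
    """Stand-in for SpidraServerError."""


def call_with_retry(fn, retryable=(RateLimitError, ServerError),
                    attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying retryable errors with exponential backoff.

    Waits base_delay, then 2x, then 4x, and so on between attempts.
    The last failure is re-raised unchanged.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except retryable:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

With the SDK, fn could be a lambda wrapping spidra.scrape.run_sync(...). Authentication and insufficient-credit errors are deliberately left out of the retryable tuple because retrying cannot fix them.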

Also check the ai_extraction_failed flag in the result. If AI extraction fails for any reason, Spidra falls back to returning the raw page Markdown and sets this flag so your code can detect it:

job = spidra.scrape.run_sync(ScrapeParams(
    urls=[ScrapeUrl(url="https://example.com")],
    prompt="Extract the main headline",
))

if job.result.ai_extraction_failed:
    raw = job.result.data[0].markdown_content
    print("Extraction failed, falling back to raw content")
else:
    print(job.result.content)

Putting it all together: a complete pipeline #

Here is a full example that uses browser actions with forEach to collect job listings from a directory, enforces a schema on the output, handles errors properly, and saves results to JSONL:

import os, json
from spidra import (
    SpidraClient,
    ScrapeParams,
    ScrapeUrl,
    BrowserAction,
    SpidraError,
    SpidraInsufficientCreditsError,
)

spidra = SpidraClient(api_key=os.environ["SPIDRA_API_KEY"])

JOB_SCHEMA = {
    "type": "object",
    "required": ["title", "company", "location"],
    "properties": {
        "title":           {"type": "string"},
        "company":         {"type": "string"},
        "location":        {"type": ["string", "null"]},
        "remote":          {"type": ["boolean", "null"]},
        "salary_min":      {"type": ["number", "null"]},
        "salary_max":      {"type": ["number", "null"]},
        "employment_type": {
            "type": ["string", "null"],
            "enum": ["full_time", "part_time", "contract", None]
        },
    },
}

def collect_listings(board_url: str) -> list:
    try:
        job = spidra.scrape.run_sync(ScrapeParams(
            urls=[
                ScrapeUrl(
                    url=board_url,
                    actions=[
                        BrowserAction(type="click", value="Accept cookies"),
                        BrowserAction(
                            type="forEach",
                            value="Find all job listing cards",
                            mode="navigate",
                            max_items=50,
                            item_prompt="Extract job title, company, location, remote status, salary range, and employment type",
                            pagination={
                                "nextSelector": "a.next-page",
                                "maxPages": 3
                            }
                        ),
                    ],
                )
            ],
            output="json",
            schema=JOB_SCHEMA,
        ))

        if job.result.ai_extraction_failed:
            print(f"Warning: AI extraction failed for {board_url}")
            return []

        content = job.result.content
        return content if isinstance(content, list) else [content]

    except SpidraInsufficientCreditsError:
        print("Out of credits. Stopping.")
        return []
    except SpidraError as e:
        print(f"Error scraping {board_url}: {e.message}")
        return []

boards = [
    "https://jobs.example.com/engineering",
    "https://careers.anothersite.com/remote",
]

all_jobs = []
for board in boards:
    print(f"Collecting from {board}...")
    listings = collect_listings(board)
    all_jobs.extend(listings)
    print(f"  Got {len(listings)} listings")

with open("jobs.jsonl", "w") as f:
    for job in all_jobs:
        f.write(json.dumps(job) + "\n")

print(f"\nDone. {len(all_jobs)} jobs saved to jobs.jsonl")
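Because the pipeline collects from several boards, the same listing can appear more than once. A small post-processing sketch follows, with two hypothetical helpers: parse_jsonl reads records back from JSONL lines, and dedupe drops repeats. Treating (title, company) as the identity of a listing is an assumption you may want to adjust:

```python
import json


def parse_jsonl(lines):
    """Parse an iterable of JSONL lines into a list of records, skipping blanks."""
    return [json.loads(line) for line in lines if line.strip()]


def dedupe(records, keys=("title", "company")):
    """Keep the first record for each distinct combination of the given keys."""
    seen = set()
    unique = []
    for record in records:
        identity = tuple(record.get(key) for key in keys)
        if identity not in seen:
            seen.add(identity)
            unique.append(record)
    return unique


# Usage with the file written above:
#     with open("jobs.jsonl", encoding="utf-8") as f:
#         jobs = dedupe(parse_jsonl(f))
```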

All scrape parameters #

For reference, here is the full list of parameters you can pass to ScrapeParams:

| Parameter | Type | Description |
| --- | --- | --- |
| urls | list | Up to 3 ScrapeUrl objects. Each takes a url and optional actions. |
| prompt | str | What to extract, in plain English |
| output | str | "markdown" (default) or "json" |
| schema | dict | JSON Schema for a guaranteed output shape |
| use_proxy | bool | Route through a residential proxy |
| proxy_country | str | Two-letter country code or "eu" / "global" |
| extract_content_only | bool | Strip nav, ads, and boilerplate before AI extraction |
| screenshot | bool | Capture a viewport screenshot |
| full_page_screenshot | bool | Capture a full-page screenshot |
| cookies | str | Raw Cookie header string for authenticated pages |

If you want to go deeper on any part of the SDK:

  • Browser actions guide covers every option for each action type including all forEach parameters
  • Structured output guide covers schemas in depth including Pydantic integration and schema limits
  • Stealth mode guide has the full country list and proxy options
  • Authenticated scraping guide covers how to get cookies from your browser and the formats Spidra accepts

Get your API key at app.spidra.io. The free plan has 300 credits and no card required.
