Three sleep intervals for three APIs: Steam 250ms, GitHub 100ms, HuggingFace none


Source: dev.to · Published: 2026-06-06T22:10Z · Topic: ai-tools


A developer built ETL pipelines for three directory sites in April and ran into different rate limits for the Steam, GitHub, and HuggingFace APIs. The Steam pipeline uses a 250ms sleep interval despite a community-estimated safe interval of 1.5 seconds per request, accepting occasional HTTP 429 errors as non-fatal for review data. The GitHub pipeline relies on an authenticated token for 5,000 requests per hour, while the HuggingFace model registry pipeline runs with no sleep interval even for rapid-fire requests.

5 min read · 16 views · published Jun 6, 2026

When I built the ETL pipelines for three programmatic directory sites in April — Top AI Tools (HuggingFace data), Find Games Like (Steam data), and Open Alternative To (GitHub data) — I had to figure out rate limits for three completely different APIs in the same week. The numbers, the failure modes, and the right way to handle errors are all different.

Here's what I actually shipped and the reasoning behind each number.

Steam's developer docs are sparse on hard rate-limit specifics. From community discussion and trial and error, the working figure is roughly 200 requests per 5 minutes per IP on the public Web API, which works out to one request every 1.5 seconds as a safe interval. That number is a community estimate, not an official one. My code says as much in a comment:

await sleep(250); // Steam rate limit: ~200/5min, 1.5s is safe; 250ms is aggressive but usually fine

I chose 250ms anyway because the ETL runs as a nightly GitHub Actions job over ~60 game entries. At 250ms that's 15 seconds of sleep total. At 1.5 seconds it would be 90 seconds. The gap matters when the cron has three sites to process.
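The arithmetic behind those numbers can be checked in a few lines. This is a minimal sketch: the helper names `safeIntervalMs` and `totalSleepSeconds` are illustrative and not part of the original pipeline, and the 200-per-5-minutes figure is the community estimate quoted above, not an official Steam number.

```typescript
// Hypothetical helpers for illustration; not from the original ETL.

/** Interval (ms) that spreads `maxRequests` evenly across `windowSeconds`. */
function safeIntervalMs(maxRequests: number, windowSeconds: number): number {
  return (windowSeconds * 1000) / maxRequests;
}

/** Total time (s) spent sleeping when pausing `intervalMs` once per entry. */
function totalSleepSeconds(entries: number, intervalMs: number): number {
  return (entries * intervalMs) / 1000;
}

// Community estimate: ~200 requests per 5 minutes.
const steamSafeMs = safeIntervalMs(200, 5 * 60);

console.log(steamSafeMs);                        // 1500
console.log(totalSleepSeconds(60, 250));         // 15
console.log(totalSleepSeconds(60, steamSafeMs)); // 90
```

If the estimate holds and each of the ~60 entries makes one or two calls, a whole run stays under 200 requests, which may be part of why the aggressive interval usually gets away with it.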

The acceptable risk: Steam doesn't hard-ban on the first rate-limit violation; it returns HTTP 429 and the job logs the error. The games ETL treats review-endpoint failures as non-fatal. The game row is still written; only the review stats are absent until the next run:

try {
  const r = await getAppReviewSummary(appid);
  // ... write to DB
} catch (err) {
  reviewsFailed++;
  console.error(`! Review fetch failed for appid ${appid}:`, err);
}

The reviewsFailed counter appears in the job log. If I see it climbing consistently, that's the signal to increase the sleep interval. So far I haven't needed to.
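"Climbing consistently" can be made concrete with a small check over the failure counts from recent nightly runs. The sketch below is a hypothetical heuristic, not something the original job does: the function name `isClimbing`, the three-run window, and the idea of keeping a per-night history are all assumptions.

```typescript
// Hypothetical heuristic; the window size and the rule itself are assumptions.

/**
 * Returns true when the last `window` nightly failure counts are all non-zero,
 * never decrease, and the newest count is strictly higher than the oldest.
 */
function isClimbing(failureCounts: number[], window = 3): boolean {
  if (failureCounts.length < window) return false;
  const recent = failureCounts.slice(-window);
  for (let i = 0; i < recent.length; i++) {
    if (recent[i] === 0) return false;                      // a clean night breaks the streak
    if (i > 0 && recent[i] < recent[i - 1]) return false;   // counts went down
  }
  return recent[recent.length - 1] > recent[0];
}

console.log(isClimbing([0, 1, 0, 2, 0])); // false: sporadic failures
console.log(isClimbing([0, 1, 3, 6]));    // true: three nights of growth
```

A flat run of identical non-zero counts returns false here; whether that should also trigger a longer sleep is a judgment call.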

GitHub's REST API is explicit about limits: 60 requests per hour unauthenticated, 5,000 per hour with a personal access token. The GitHub docs on rate limiting cover both the primary limit and the secondary limits for specific endpoint categories. The OSS alternatives ETL makes one GET /repos/:owner/:repo call per alternative project, roughly 3–5 repos per SaaS tool in the seed data. Even a large seed run of 50 tools with 5 alternatives each is only 250 requests.

The 100ms sleep is there as a politeness interval, but authentication is doing the real rate-limit work:

function authHeaders(): Record<string, string> {
  const token = process.env.GITHUB_TOKEN;
  const base: Record<string, string> = {
    Accept: "application/vnd.github+json",
    "X-GitHub-Api-Version": "2022-11-28",
  };
  if (token) base.Authorization = `Bearer ${token}`;
  return base;
}

GITHUB_TOKEN is set in GitHub Actions from a repository secret. Without it, a full seed run would exhaust the 60-requests-per-hour allowance in under a minute. With it, the 5,000/hour ceiling gives comfortable headroom.
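GitHub also reports the remaining quota on every REST response through the `x-ratelimit-remaining` and `x-ratelimit-reset` headers, the latter as a UTC epoch time in seconds. A minimal sketch of reading them follows; the function name `waitForGitHubQuotaMs` and the `HeaderLike` interface are illustrative assumptions, and the original ETL is not known to do this.

```typescript
// Minimal interface satisfied by the fetch API's Headers object.
interface HeaderLike {
  get(name: string): string | null;
}

/**
 * Returns how long to pause (ms) before the next GitHub request:
 * 0 while quota remains, otherwise the time until the window resets.
 * `nowMs` is injectable so the function can be exercised deterministically.
 */
function waitForGitHubQuotaMs(headers: HeaderLike, nowMs: number = Date.now()): number {
  const rawRemaining = headers.get("x-ratelimit-remaining");
  const rawReset = headers.get("x-ratelimit-reset");
  // Missing headers: do not block the batch.
  if (rawRemaining === null || rawReset === null) return 0;

  const remaining = Number(rawRemaining);
  const resetEpochSec = Number(rawReset);
  // Malformed headers: same policy as missing ones.
  if (!Number.isFinite(remaining) || !Number.isFinite(resetEpochSec)) return 0;

  if (remaining > 0) return 0;
  return Math.max(0, resetEpochSec * 1000 - nowMs);
}
```

In a fetch loop this would be called as `waitForGitHubQuotaMs(response.headers)` after each response, sleeping for the returned duration when it is non-zero.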

One subtlety: GitHub enforces two separate rate limits, the core REST API limit (5,000/hour authenticated) and the search API limit (10 requests per minute unauthenticated, 30 per minute authenticated). The current ETL uses GET /repos/:owner/:repo directly, not search, so the looser core limit applies. If I ever switch to search-based discovery the math changes.

HuggingFace's model registry API (listing models, fetching model metadata) has no hard, documented rate limit that I've run into in weeks of nightly runs. The ETL fetches up to 100 models in one GET /api/models?limit=100&sort=downloads call, then makes one detailed fetch per model. That's 100 rapid-fire requests, no sleep, no 429s.

Part of this is the HUGGINGFACE_TOKEN sent as a bearer token in authenticated requests, which raises whatever ceiling exists. Part of it is that the registry API is explicitly designed for automated tooling at batch scale: it's the primary way model cards, metadata scrapers, and leaderboard tools consume the catalog.

function authHeaders(): Record<string, string> {
  const token = process.env.HUGGINGFACE_TOKEN;
  return token ? { Authorization: `Bearer ${token}` } : {};
}

If I scale to 1,000 models per nightly fetch I'd add a 50ms sleep as a precaution. For 100, the simplest thing that works is also the correct thing.
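The list-then-detail pattern with an optional pause might look like the sketch below. It is not the original ETL code: `fetchTopModels` and the injected `fetchJson` parameter are hypothetical names, and the sketch assumes each list entry carries an `id` field that can be appended to the `/api/models` path for the detail fetch.

```typescript
// Sketch only: `fetchJson` is injected so the example runs without network access.
type FetchJson = (url: string) => Promise<unknown>;

const HF_API = "https://huggingface.co/api/models";

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

/**
 * One list call sorted by downloads, then one detail call per model.
 * `sleepMs` defaults to 0 (no pause), matching the 100-model case.
 */
async function fetchTopModels(
  fetchJson: FetchJson,
  limit = 100,
  sleepMs = 0,
): Promise<unknown[]> {
  // Assumption: the list response is an array of objects with an `id` field.
  const list = (await fetchJson(`${HF_API}?limit=${limit}&sort=downloads`)) as { id: string }[];
  const details: unknown[] = [];
  for (const { id } of list) {
    details.push(await fetchJson(`${HF_API}/${id}`));
    if (sleepMs > 0) await sleep(sleepMs); // e.g. 50 once the batch grows toward 1,000
  }
  return details;
}
```

Keeping the pause as a parameter means the 1,000-model case is a one-argument change rather than a rewrite.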

API	Sleep	Auth impact	Failure mode	Fatal?
Steam appdetails	250ms	None (public)	429, occasional	Non-fatal
Steam reviews	250ms (shared)	None (public)	429, more frequent	Non-fatal
GitHub REST	100ms	60→5,000/hr	403, clear message	Non-fatal
HuggingFace registry	None	Raises ceiling	Rare 429	Non-fatal

All four code paths are non-fatal. A rate-limit response (429, or 403 from GitHub) or a connection error anywhere in the batch writes a fallback-template row to Turso and increments a counter. The content upgrade loop picks up any gaps the next night.

The sleep interval is a guess. What actually protects the ETL from being useless after a rate-limit event is that failures are cheap. Every external API call in this stack is wrapped in a try/catch that writes degraded content rather than crashing the batch. The sleep interval controls how likely you are to hit a rate limit; the fallback chain controls what happens when you do.
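A minimal sketch of that fallback chain, assuming a simplified row shape: the `Row` and `Stats` types, the `buildRow` name, and the placeholder text are all hypothetical, and the actual Turso write is left out because the article does not show it.

```typescript
// Hypothetical shapes; the real schema and the Turso write are not shown in the article.
interface Row {
  slug: string;
  description: string;
  degraded: boolean; // true when the row was built from the fallback template
}

interface Stats {
  fallbacks: number;
}

/**
 * Builds a row from live data when the fetch succeeds, or from a
 * fallback template when it throws. Never rejects, so one bad entry
 * cannot crash the rest of the batch.
 */
async function buildRow(
  slug: string,
  fetchDescription: (slug: string) => Promise<string>,
  stats: Stats,
): Promise<Row> {
  try {
    return { slug, description: await fetchDescription(slug), degraded: false };
  } catch (err) {
    stats.fallbacks++;
    console.error(`! Fetch failed for ${slug}, writing fallback row:`, err);
    return { slug, description: `Details for ${slug} are being updated.`, degraded: true };
  }
}
```

The `degraded` flag is one way to let a later pass find the rows that still need real content.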

For indie-scale ETL — tens to hundreds of entries per night — the combination of a conservative-ish sleep and a non-fatal error path is enough. If the site grows to thousands of entries per run, I'd rethink both: moving to a queue-bounded concurrent fetcher with exponential backoff, and separating the content generation from the data fetch into stages that can be retried independently.
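The backoff half of that plan can be sketched as follows. This is plain exponential backoff without jitter, under assumed defaults (250ms base, 30s cap, 4 attempts). The names `backoffDelayMs` and `withRetry` are illustrative, and the queue-bounded concurrency part is not covered here.

```typescript
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

/** Delay before retrying after failed attempt number `attempt` (0-based): base * 2^attempt, capped. */
function backoffDelayMs(attempt: number, baseMs = 250, capMs = 30_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

/**
 * Calls `fn` up to `maxAttempts` times, pausing with exponential backoff
 * between failures. Rethrows the last error if every attempt fails.
 * `pause` is injectable so the retry schedule can be checked without waiting.
 */
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 4,
  pause: (ms: number) => Promise<void> = sleep,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // No pause after the final attempt; the error is rethrown instead.
      if (attempt < maxAttempts - 1) await pause(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}
```

With the defaults, the pauses between attempts are 250ms, 500ms, and 1,000ms. Adding random jitter to each delay is a common refinement when many workers retry at once.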

Part of an ongoing 6-month experiment running three AI-curated directory sites. The technical claims here are real; this article was AI-assisted.


Read original on dev.to → dev.to/morinaga/three-sleep-intervals-for-three-…
