AI + TMDB: 3 Passes to Match Torrent Posters — Prompt Iteration With Real Numbers



Source: dev.to · Published: 2026-05-30T09:00Z


A developer built a 3-pass AI pipeline using Claude Haiku to match torrent folder names to TMDB posters, targeting the roughly 20% of names that plain regex plus TMDB search gets wrong, and measured each prompt iteration on 290 real entries. Pass one cut false skips by 43% (72 to 41), pass two cut "incorrect" flags by 84% (55 to 9, with every remaining flag a real problem), and pass three eliminated all 4 parse failures. The key insight: precise edge-case rules like "seasons matched to Season N are CORRECT" proved more valuable than generic instructions.


ShareBox displays shared folders as a Netflix-style grid with TMDB posters. The problem: folder names come from torrents. Naruto.INTEGRALE.MULTI.VFF.1080p.BluRay.x264-AMB3R needs to match "Naruto" on TMDB, not "Naruto Shippuden" and not "Naruto the Movie". And Vol 1 must definitely not match "Kill Bill: Volume 1".

Basic regex + TMDB search works for 80% of cases. For the remaining 20%, I built a 3-pass AI pipeline (Claude Haiku via CLI) run by a cron job every 30 minutes. Here's each pass in detail, the exact prompts, and iterations measured on 290 real entries.
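To make the "basic regex" step concrete, here is a minimal Python sketch of the cleanup half only (no TMDB call). It is not the author's code: the function name guess_title_year, the tag list, and the cut-at-first-tag strategy are all assumptions made for illustration.

```python
import re

# Hypothetical tag list for illustration; the author's real list is not shown.
NOISE = re.compile(
    r"\b(integrale|multi|vff|vostfr|french|2160p|1080p|720p|"
    r"bluray|webrip|hdtv|x264|x265|h264|hevc)\b",
    re.IGNORECASE,
)
YEAR = re.compile(r"\b(19\d{2}|20\d{2})\b")


def guess_title_year(name):
    """Return (title, year) guessed from a torrent-style folder name."""
    # Dots and underscores act as word separators in release names.
    text = re.sub(r"[._]+", " ", name)
    year_match = YEAR.search(text)
    noise_match = NOISE.search(text)
    # Keep everything before the first release tag or year, whichever is first.
    cut = len(text)
    if noise_match:
        cut = min(cut, noise_match.start())
    if year_match and year_match.start() > 0:
        cut = min(cut, year_match.start())
    title = text[:cut].strip(" -")
    year = int(year_match.group(1)) if year_match else None
    return title, year


print(guess_title_year("Naruto.INTEGRALE.MULTI.VFF.1080p.BluRay.x264-AMB3R"))
# ('Naruto', None)
```

A cleaner like this gets the title string right, but it cannot tell which "Naruto" TMDB should return, which is where the AI passes come in.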

The architecture is layered, cheapest to most expensive:

extract_title_year() cleans the name, searches TMDB, and takes the first result with a poster. Free, instant, correct ~80% of the time.

The first prompt was simple: "extract the proper movie title for a TMDB search." Tested on 290 real names, it produced 72 false skips: the AI considered "Naruto.INTEGRALE", "Pokemon La Series", and "Despicable Me COLLECTION" non-titles and marked them skip=true.

The fix: explicit rules about what to keep vs. skip, a "when in doubt, skip=false" rule, and instructions to translate known English titles to French. Result: 72 → 41 skips. 31 improvements, zero regressions.
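Counting improvements and regressions between two prompt versions can be done with a small diff over the benchmark. The Python sketch below assumes each run is stored as a mapping from folder name to its skip flag; that storage format, the function names, and the three-entry sample are hypothetical.

```python
def compare_skips(before, after):
    """Diff two runs over the same names; each maps folder name -> skip flag."""
    # Improvement: skipped by the old prompt, kept by the new one.
    improved = sorted(n for n in before if before[n] and not after[n])
    # Regression: kept by the old prompt, skipped by the new one.
    regressed = sorted(n for n in before if not before[n] and after[n])
    return improved, regressed


def reduction_pct(count_before, count_after):
    """Percentage drop between two counts, rounded to the nearest integer."""
    return round(100 * (count_before - count_after) / count_before)


# Toy data, not the real 290-entry benchmark.
v1 = {"Naruto.INTEGRALE": True, "Pokemon La Series": True, "Vol 1": True}
v2 = {"Naruto.INTEGRALE": False, "Pokemon La Series": False, "Vol 1": True}

improved, regressed = compare_skips(v1, v2)
print(len(improved), len(regressed))  # 2 0
print(reduction_pct(72, 41))          # 43
```

Applied to the figures above, 72 skips dropping to 41 is the 43% reduction reported for pass 1.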

The verification prompt sent {name, TMDB title} pairs and asked correct: true/false. On 247 entries, it flagged 55 as incorrect. But 46 were false negatives.

The AI didn't know that S01 → "Season 1" is a correct match: it's a TMDB season poster, not a generic match. Same for all 34 Simpsons seasons, 11 Walking Dead seasons, 4 Batman seasons.

The fix: a "Special cases — do NOT mark as incorrect" section explaining that season folders matched to season titles are correct, and translations/saga names are fine. Result: 55 → 9 incorrects. All 9 are real problems. Zero false negatives.
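As an illustration of how such a verification prompt might be assembled, here is a Python sketch. The instruction wording is a paraphrase of the rules described above, not the author's actual prompt, and the field name tmdb_title and the function name are assumptions.

```python
import json

# Paraphrase of the special-case rules described in the article (hypothetical wording).
SPECIAL_CASES = (
    "Special cases - do NOT mark as incorrect:\n"
    "- A season folder (S01, S02, ...) matched to a 'Season N' title is CORRECT.\n"
    "- Translated titles and saga or collection names are fine.\n"
)


def build_verification_prompt(pairs):
    """Build one prompt from (folder name, TMDB title) pairs."""
    lines = [
        json.dumps({"name": name, "tmdb_title": title}, ensure_ascii=False)
        for name, title in pairs
    ]
    return (
        "For each pair below, decide whether the TMDB title is the right match "
        "for the folder name. Answer with JSON containing a boolean 'correct'.\n\n"
        + SPECIAL_CASES
        + "\nPairs:\n"
        + "\n".join(lines)
    )


prompt = build_verification_prompt([("The.Simpsons.S01", "Season 1")])
print(prompt)
```

Keeping the special cases in their own clearly titled block makes it easy to add one rule at a time and re-measure against the benchmark.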

When pass 2 detects a false positive and suggests "Naruto" as a better title, we search TMDB. Problem: TMDB returns results by popularity. "Naruto" → Naruto Shippuden (more popular). Taking the first result reproduces the error.

The solution: get 15 TMDB candidates (via the multi, tv, and movie endpoints) and send the full list to the AI with the filename for context. The AI picks {"idx": 1}: Naruto (2002), the original series. The word "INTEGRALE" in the filename helps it understand this is the complete series, not a spin-off.
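A sketch of the candidate-gathering step, in Python and without any network calls: it merges result lists shaped like TMDB search responses (id, media_type, title or name, release_date or first_air_date), removes duplicates, and renders a numbered list for the prompt. The function names, the zero-based numbering, the output layout, and the sample ids are assumptions, not the author's code.

```python
def merge_candidates(multi, tv, movie, limit=15):
    """Merge TMDB-style result lists into one deduplicated candidate list."""
    seen = set()
    merged = []
    for default_type, results in (("", multi), ("tv", tv), ("movie", movie)):
        for item in results:
            # Multi-search results carry their own media_type; the others do not.
            media_type = item.get("media_type") or default_type
            if media_type not in ("tv", "movie"):
                continue  # e.g. people returned by multi search
            key = (media_type, item["id"])
            if key in seen:
                continue
            seen.add(key)
            date = item.get("release_date") or item.get("first_air_date") or ""
            merged.append({
                "type": media_type,
                "id": item["id"],
                "title": item.get("title") or item.get("name") or "",
                "year": date[:4],
            })
    return merged[:limit]


def format_candidates(candidates):
    """Render a numbered list the model can answer with an index."""
    return "\n".join(
        f'{i}: {c["title"]} ({c["year"] or "?"}) [{c["type"]}]'
        for i, c in enumerate(candidates)
    )


# Made-up ids for illustration only.
multi = [
    {"id": 2, "media_type": "tv", "name": "Naruto Shippuden", "first_air_date": "2007-02-15"},
    {"id": 9, "media_type": "person", "name": "Someone"},
]
tv = [
    {"id": 2, "name": "Naruto Shippuden", "first_air_date": "2007-02-15"},
    {"id": 1, "name": "Naruto", "first_air_date": "2002-10-03"},
]
listing = format_candidates(merge_candidates(multi, tv, []))
print(listing)
```

Showing the year and media type next to each title gives the model the same disambiguating context a human would use.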

A gotcha: Claude sometimes adds explanations after the JSON, breaking parsing. Fix: extract {"idx": N} via regex instead of full JSON parsing.
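The regex extraction might look like the following Python sketch. The pattern, the function name, and the bounds check against the candidate count are assumptions; the article only states that {"idx": N} is pulled out with a regex.

```python
import re

# Matches {"idx": N} anywhere in the reply, tolerating whitespace.
IDX_RE = re.compile(r'\{\s*"idx"\s*:\s*(-?\d+)\s*\}')


def extract_idx(reply, n_candidates):
    """Return the chosen index, or None if missing or out of range."""
    match = IDX_RE.search(reply)
    if match is None:
        return None
    idx = int(match.group(1))
    return idx if 0 <= idx < n_candidates else None


fence = "`" * 3  # built this way to avoid a literal fence inside this example
reply = (
    f"Here is my pick:\n{fence}json\n"
    '{"idx": 1}\n'
    f"{fence}\nNaruto (2002) is the original series."
)
print(extract_idx(reply, 15))  # 1
```

Returning None for a missing or out-of-range index lets the caller fall back to the previous match instead of crashing the cron run.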

Prompt                  | Before                                    | After        | Improvement
Pass 1 (extraction)     | 72 false skips                            | 41           | -43%
Pass 2 (verification)   | 55 flagged incorrect (46 false negatives) | 9 (all real) | -84%
Pass 3 (candidate pick) | 4 parse failures                          | 0            | -100%

Measure before iterating. Without 290 real entries as a benchmark, I would have iterated blindly. The numbers showed that 84% of pass 2 v1's "incorrect" flags (46 of 55) were false negatives, which is impossible to see without real data.

Edge cases dominate. The 46 false negatives among pass 2's 55 flags came largely from one pattern: season folders. One line in the prompt ("seasons matched to Season N are CORRECT") removed 84% of the flags. The 80/20 rule applies to prompts too.

Parsing matters as much as the prompt. A perfect prompt is useless if parsing breaks. The AI adds text, code fences, and explanations. Regex extraction is more reliable than json_decode().

Layered architecture reduces costs. Free regex handles 80%. AI only runs on the remaining 20%. Pass 3 (the most expensive) only fires when pass 2 detects a problem — 9 times out of 290 entries.

The best prompt isn't the one with the most instructions — it's the one that precisely describes edge cases. "When in doubt, skip=false" and "seasons are CORRECT" are worth more than 20 lines of generic rules.


Read original on dev.to → dev.to/ohugonnot/ai-tmdb-3-passes-to-match-torre…
