All tests run on an 8-year-old MacBook Air.
This is Part 3 of my series on training a card game AI with Google Colab.
Part 1: Google Colab basics
In Part 2, I got the RL side working. The next challenge: feeding real game state into the model automatically. That meant recognizing cards from screenshots. I spent about two weeks on it. It didn't work. Here's exactly what happened.
The pipeline I was aiming for:
Card image (JPG)
↓ OpenCV
Card name recognized
↓ Match against TOML
Effect + cost retrieved
↓ RL environment
Board evaluated (good move / bad move)
Simple in theory. Painful in practice.
The plan was template matching — compare a screenshot crop against ~300 master card images from GitHub, find the closest match.
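import cv2

img = cv2.imread("card.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(...)  # returns (retval, image); arguments elided here
# input_card and template are the preprocessed crop and one master image
result = cv2.matchTemplate(input_card, template, cv2.TM_CCOEFF_NORMED)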
Straightforward enough. Except it kept failing.
After 6+ hours of debugging, I found three recurring issues:
Mistake 1: Not normalizing image size
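The master images and screenshot crops weren't the same dimensions. Even a few pixels of difference tanks template matching accuracy, because cv2.matchTemplate compares pixels at a fixed scale and never rescales the template for you.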
Mistake 2: Threshold set too strict
if score > 0.99: # This almost never triggers
Even visually identical images often scored below 0.95 due to minor rendering differences. 3px of misalignment was enough to break it.
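A more forgiving approach is to score the crop against every template, take the best one, and only then apply a looser cutoff. The sketch below uses plain NumPy and synthetic images. `ncc_score` computes zero-mean normalized correlation, which is the quantity `TM_CCOEFF_NORMED` reports when crop and template are the same size. The `best_match` helper and the 0.90 threshold are assumptions to tune, not values from my experiments.

```python
import numpy as np


def ncc_score(a: np.ndarray, b: np.ndarray) -> float:
    """Zero-mean normalized correlation of two same-size grayscale images."""
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def best_match(crop, templates, threshold=0.90):
    """Return (name, score) for the top template; name is None if below threshold."""
    scores = {name: ncc_score(crop, tpl) for name, tpl in templates.items()}
    name = max(scores, key=scores.get)
    score = scores[name]
    return (name if score >= threshold else None), score


rng = np.random.default_rng(0)
# Three fake "master" cards as random grayscale images.
templates = {
    f"card_{i}": rng.integers(0, 256, (70, 50)).astype(np.uint8) for i in range(3)
}
# Simulate rendering differences: the true card plus mild pixel noise.
noisy = templates["card_1"].astype(np.float64) + rng.normal(0, 20, (70, 50))
crop = np.clip(noisy, 0, 255).astype(np.uint8)

name, score = best_match(crop, templates)
```

With these synthetic inputs the right card scores below 0.99 but still far above the other templates, so a strict cutoff would throw away a match that is clearly correct. Ranking first and thresholding second keeps the cutoff as a sanity check instead of the whole decision.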
Mistake 3: Comparing in color
Color variation between the master JPG and the screenshot was enough to cause mismatches. Always convert to grayscale first.
The core problem: these are original game cards that no AI model has been trained on. And the visual difference between master data and a smartphone screenshot — even cropped carefully — was just enough to break every approach I tried.
Giving up was the right call — but I learned something useful in the process.
If you have a decent machine, this problem is actually pretty solvable: just continuously capture the screen and process frames in real time. No template matching needed.
The reason I struggled wasn't the approach itself — it was the hardware constraint. I was trying to push an 8-year-old MacBook Air to its absolute limit, minimizing every layer of the stack to see how far I could get.
Turns out, some problems genuinely need more horsepower. That's a valid finding too.
Two weeks, multiple approaches, zero working card recognition. But I understand exactly why it failed — and what it would take to make it work.
If you're attempting something similar: normalize your image sizes, go grayscale early, and don't set your threshold above 0.95. And if your cards are from an obscure offline game, don't expect Vision AI to save you 😅