I run a Make.com pipeline that produces daily sports betting articles. Odds API in, API-Football in, aggregation in the middle, GPT-4o for the writing, Google Docs out. Looks great on the diagram. Worked beautifully in testing.
Then it shipped. And within a week we had articles confidently telling readers that a Spanish second-division side was "the reigning Champions League winners," that a 38-year-old striker had "just signed his first professional contract," and — my personal favourite — that a match scheduled for Saturday would "kick off this past Tuesday."
Every fact technically plausible. Every fact completely wrong.
This post is what I changed to stop that happening. Not a clever prompt trick. Not a model swap. A structural change in how the data flows into the LLM call.
The instinct when GPT invents facts is to blame the model. Bigger model, better prompt, lower temperature. Most of the time that's wrong. In a pipeline, hallucination is almost always caused by the prompt asking GPT to know things it has no way to know.
My original system prompt looked something like this:
You are a sports betting journalist. Write a 600-word match preview for the upcoming match. Cover team form, head-to-head history, key players, and betting angles. Be authoritative and confident.
Then the user prompt contained the two team names and the match date.
Read that carefully. I told GPT to write authoritatively about team form, h2h history, and key players — without giving it any of those things. So it did exactly what an LLM does when asked a confident question with no data: it generated something that sounds like sports journalism, drawn from training data that's months or years stale, hallucinating the gaps.
The fix isn't "tell GPT not to hallucinate." That instruction does almost nothing. The fix is to remove the need to hallucinate: give GPT every fact it needs and forbid it from going beyond those facts.
This is what people mean when they say "data injection." It's not jargon — it's a constraint pattern.
Here's the before and after at the module level.
Before:
Odds API → API-Football → Aggregator → GPT-4o → Google Docs
The Aggregator module passed a clean object to GPT. The prompt asked GPT to write a preview. Everything between "here are two team names" and "600 words of confident content" was a black box that GPT filled with plausible-sounding nonsense.
After:
Odds API → API-Football → Aggregator → Data Validator →
Structured Fact Block Builder → GPT-4o (constrained) →
Output Validator → Google Docs (or Error Queue)
Three extra modules. None of them are AI. All of them exist to make sure GPT only ever sees verified facts and is structurally prevented from inventing more.
Let's go through them.
Hallucinations are often born in null fields. If the API returns no head-to-head data for two teams that haven't played in five years, and the Aggregator forwards h2h: null to GPT, the prompt will still ask GPT to "cover head-to-head history." Guess what GPT does with that.
The Data Validator is a Make.com router that checks every field the article will reference. If any required field is missing, the flow branches: either skip the article entirely, fetch a fallback, or proceed but mark that field as "data unavailable" — which the prompt later treats as a hard constraint.
Example: I require the last 5 matches for each team. If the API returns fewer than 5 for either side, the flow doesn't fail. It sets a form_data_complete: false flag, and the prompt builder downstream handles it differently.
In code-equivalent terms (your Make.com setup will use a Router + filters, but the logic is this):
// Check that every field the article will reference is present.
function validateMatchData(data) {
  const required = {
    home_team_name: data.home_team?.name,
    away_team_name: data.away_team?.name,
    match_date_iso: data.fixture?.date,
    kickoff_time_local: data.fixture?.kickoff_local,
    venue: data.fixture?.venue?.name,
    home_last_5: data.home_team?.last_5_results,
    away_last_5: data.away_team?.last_5_results,
    odds_home: data.odds?.home,
    odds_draw: data.odds?.draw,
    odds_away: data.odds?.away,
  };

  // Treat null, undefined, and empty strings as missing.
  const missing = Object.entries(required)
    .filter(([, v]) => v === null || v === undefined || v === '')
    .map(([k]) => k);

  if (missing.length > 0) {
    // Route to error queue, don't generate
    return { action: 'skip', reason: `missing: ${missing.join(', ')}` };
  }

  // Everything required is present: hand the facts to the next module.
  return { action: 'proceed', facts: required };
}
This single step killed more hallucinations than any prompt change ever did. Most of the bad articles in the early days were ones GPT shouldn't have been asked to write at all, because the underlying data was incomplete.
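The softer branch mentioned earlier, where thin form data sets a flag instead of skipping the article, could be sketched like this. The helper name `assessFormData`, the `MIN_FORM_MATCHES` constant, and the two `*_matches_available` fields are assumptions for illustration; only the `form_data_complete` flag and the `last_5_results` fields come from the flow described above.

```javascript
// Hypothetical helper: flags incomplete form data without failing the flow.
const MIN_FORM_MATCHES = 5; // the pipeline wants the last 5 matches per team

function assessFormData(data) {
  // Fall back to an empty list if the API returned nothing usable.
  const asList = (v) => (Array.isArray(v) ? v : []);
  const home = asList(data.home_team?.last_5_results);
  const away = asList(data.away_team?.last_5_results);

  return {
    form_data_complete:
      home.length >= MIN_FORM_MATCHES && away.length >= MIN_FORM_MATCHES,
    home_matches_available: home.length,
    away_matches_available: away.length,
  };
}

// Example: the away side only has 3 recent results.
const sample = {
  home_team: { last_5_results: ['W', 'D', 'L', 'W', 'W'] },
  away_team: { last_5_results: ['L', 'W', 'D'] },
};
console.log(assessFormData(sample));
```

A downstream prompt builder can then read `form_data_complete` and either shorten the form section or mark it as "data unavailable" rather than leaving the model to fill the gap.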
This is the core of the data injection pattern. Instead of asking GPT to write about teams using its training data, the prompt builder constructs a deterministic, structured fact block from the validated API data — and the system prompt is rewritten to say only use the facts in the block below.
Here's what gets injected:
=== MATCH FACTS (USE ONLY THESE) ===
FIXTURE
- Home team: AFC Bournemouth
- Away team: Villarreal CF
- Competition: Friendly (pre-season)
- Date: Saturday, 9 August 2025
- Kickoff: 15:00 local time
- Venue: Vitality Stadium, Bournemouth
HOME TEAM RECENT FORM (last 5, most recent first)
1. W 2-1 vs Brentford (29 July)
2. D 1-1 vs Bristol City (24 July)
3. L 0-2 vs Real Sociedad (20 July)
4. W 3-0 vs Yeovil Town (15 July)
5. W 1-0 vs Hashtag United (10 July)
AWAY TEAM RECENT FORM (last 5, most recent first)
1. L 0-1 vs Aston Villa (2 August)
2. W 2-0 vs Levante (28 July)
3. D 1-1 vs Real Betis (24 July)
4. W 4-1 vs Cádiz B (20 July)
5. W 2-1 vs Mirandés (16 July)
HEAD-TO-HEAD (last 5 meetings)
- No previous competitive meetings
- 2 previous friendlies: 1 win each, both 1-0
CURRENT ODDS (decimal)
- Home win: 2.10
- Draw: 3.40
- Away win: 3.20
KEY PLAYERS (from current squad data)
- Home: Solanke, Semenyo, Tavernier (top scorers this pre-season)
- Away: Pino, Moreno, Baena (confirmed in pre-season starting XI)
=== END OF FACTS ===
Notice what's not in the block: nothing speculative, nothing about "form trends," nothing about player narratives. Just data points the API confirmed.
The Make.com module that builds this is just a text aggregator with template variables — no AI involved. It's deterministic and auditable. If the article says "Solanke scored twice last weekend," I can trace exactly which API field that came from. If it doesn't trace, it's a hallucination.
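For readers who think in code rather than Make.com modules, here is a minimal sketch of the same templating step. It is not the actual module: `buildFactBlock`, `formatResults`, the `match_date_display` field, and the shape of each result object (`outcome`, `score`, `opponent`, `date`) are all assumed for illustration, and only a subset of the block's sections is shown.

```javascript
// Hypothetical template functions: plain string building, no AI involved.
function formatResults(results) {
  // One numbered line per result, e.g. "1. W 2-1 vs Brentford (29 July)".
  return results
    .map((r, i) => `${i + 1}. ${r.outcome} ${r.score} vs ${r.opponent} (${r.date})`)
    .join('\n');
}

function buildFactBlock(facts) {
  return [
    '=== MATCH FACTS (USE ONLY THESE) ===',
    'FIXTURE',
    `- Home team: ${facts.home_team_name}`,
    `- Away team: ${facts.away_team_name}`,
    `- Date: ${facts.match_date_display}`,
    `- Kickoff: ${facts.kickoff_time_local} local time`,
    `- Venue: ${facts.venue}`,
    'HOME TEAM RECENT FORM (last 5, most recent first)',
    formatResults(facts.home_last_5),
    'AWAY TEAM RECENT FORM (last 5, most recent first)',
    formatResults(facts.away_last_5),
    'CURRENT ODDS (decimal)',
    `- Home win: ${facts.odds_home.toFixed(2)}`,
    `- Draw: ${facts.odds_draw.toFixed(2)}`,
    `- Away win: ${facts.odds_away.toFixed(2)}`,
    '=== END OF FACTS ===',
  ].join('\n');
}

// Example with one result per side, taken from the block above.
const block = buildFactBlock({
  home_team_name: 'AFC Bournemouth',
  away_team_name: 'Villarreal CF',
  match_date_display: 'Saturday, 9 August 2025',
  kickoff_time_local: '15:00',
  venue: 'Vitality Stadium, Bournemouth',
  home_last_5: [{ outcome: 'W', score: '2-1', opponent: 'Brentford', date: '29 July' }],
  away_last_5: [{ outcome: 'L', score: '0-1', opponent: 'Aston Villa', date: '2 August' }],
  odds_home: 2.1,
  odds_draw: 3.4,
  odds_away: 3.2,
});
console.log(block);
```

Because the output is a pure function of the validated input, the same input always produces the same block, which is what makes every line in it traceable to an API field.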
The system prompt that consumes the fact block is the second half of the data injection pattern. It's not "be a great sports writer." It's a contract.
You are a sports betting content writer. You are writing a 600-word
match preview using ONLY the facts provided in the MATCH FACTS block.
HARD CONSTRAINTS:
1. Every factual claim you make MUST be directly supported by the
MATCH FACTS block. Do not introduce facts, statistics, transfers,
injuries, manager quotes, historical context, or player biographies
that are not in the block.
2. If a topic would normally require information not in the block
(e.g. detailed h2h history, specific injury news, manager tactics),
either omit it or explicitly write "no recent data available."
3. Do not invent player names. Only reference players named in
KEY PLAYERS.
4. Do not invent dates or times. Use only the date/time in FIXTURE.
5. Do not state outcomes as certain. Use language like "favoured,"
"value bet," "the odds suggest" — never "will win" or "guaranteed."
6. If the facts contradict your training data (e.g. a player you
remember as injured is listed as a key player), trust the facts.
The block reflects the current state.
STRUCTURE:
- Lead paragraph: fixture, venue, kickoff
- Form section: recent results for both sides (use the data, don't
characterise it beyond what 5 results can support)
- Betting angle: discuss the odds rationally
- Closing: brief value-bet suggestion based on odds + form
TONE: neutral, German sports-journalism register. No hype.
Two things matter here. First, constraint #6 — "trust the facts over your training data." This is the line that prevents the model from "correcting" current data with outdated training memories. Without it I'd see GPT helpfully writing about a player who'd retired six months ago because training data said they were a star.
Second, constraint #2 — explicit permission to write "no recent data available." Without that, GPT will fill silence with invention. With it, GPT will happily skip a topic and the article reads as appropriately measured rather than confidently wrong.
Temperature stays low (0.3-0.4). Not zero — you still want readable prose — but low enough that the model isn't reaching for creative completions.
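Put together, the constrained call might be assembled roughly as follows. This is a sketch of a request body in the shape OpenAI's Chat Completions endpoint accepts (`model`, `temperature`, `messages`); `buildChatRequest` is a hypothetical helper, and sending the fact block as the user message is an assumption, since the block could equally be appended to the system prompt.

```javascript
// Hypothetical helper: assembles the request body, does not send it.
function buildChatRequest(systemPrompt, factBlock) {
  return {
    model: 'gpt-4o',
    temperature: 0.3, // low end of the 0.3-0.4 range used in this pipeline
    messages: [
      { role: 'system', content: systemPrompt }, // the contract
      { role: 'user', content: factBlock },      // the only facts allowed
    ],
  };
}

// Example with shortened placeholder strings.
const request = buildChatRequest(
  'You are a sports betting content writer. Use ONLY the facts provided in the MATCH FACTS block.',
  '=== MATCH FACTS (USE ONLY THESE) ===\n- Home team: AFC Bournemouth\n=== END OF FACTS ==='
);
console.log(JSON.stringify(request, null, 2));
```

Keeping the payload construction in one small function means the temperature and message layout are set in exactly one place, which makes them easy to audit alongside the fact block.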
No matter how tight the prompt, occasional hallucinations slip past. The Output Validator is a final Make.com step that runs the generated article through a set of regex and lookup checks against the fact block.
The checks I run:
- Dates: every date mentioned in the article is compared against match_date_iso. If a date in the article doesn't match, route to manual review.
- Player names: every name is checked against the KEY PLAYERS list and the squad data from the API. Unknown names = flag.
- Results: every result mentioned is checked against the last_5_results data for both teams. If it's neither a real recent result nor framed as a prediction, flag.
Articles that fail any check route to a "Needs Review" Google Doc folder. Articles that pass go straight to the publish queue.
Roughly 5-8% of articles trip a flag in any given week. Maybe a third of those are actual hallucinations; the rest are false positives from the regex (a player whose name overlaps with a common word, etc.). I'd rather review false positives than ship false facts.
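To make the regex-and-lookup idea concrete, here is a rough sketch of two such checks: one for dates and one for scorelines. It is an illustration under assumptions, not the production validator. The function names, the `facts` shape (`match_date`, `result_dates`, `known_scores`), and the choice to allow dates of listed recent results alongside the fixture date are all hypothetical, and the patterns only cover "9 August"-style dates and "2-1"-style scores.

```javascript
const MONTHS =
  'January|February|March|April|May|June|July|August|September|October|November|December';

// Collect every "9 August"-style date mentioned in the article.
function findDates(article) {
  const pattern = new RegExp(`\\b(\\d{1,2}) (${MONTHS})\\b`, 'g');
  return [...article.matchAll(pattern)].map((m) => `${Number(m[1])} ${m[2]}`);
}

// Collect every "2-1"-style scoreline mentioned in the article.
function findScorelines(article) {
  return [...article.matchAll(/\b(\d{1,2})-(\d{1,2})\b/g)].map((m) => m[0]);
}

function validateArticle(article, facts) {
  const flags = [];

  // Lookup check 1: dates must be the fixture date or a listed result date.
  const allowedDates = new Set([facts.match_date, ...facts.result_dates]);
  for (const d of findDates(article)) {
    if (!allowedDates.has(d)) flags.push(`unknown date: ${d}`);
  }

  // Lookup check 2: scorelines must appear in the known recent results.
  const knownScores = new Set(facts.known_scores);
  for (const s of findScorelines(article)) {
    if (!knownScores.has(s)) flags.push(`unknown scoreline: ${s}`);
  }

  return { action: flags.length > 0 ? 'review' : 'publish', flags };
}

// Example: the last sentence contains an invented date and score.
const article =
  'Bournemouth host Villarreal on 9 August. They beat Brentford 2-1 on 29 July ' +
  'and opened pre-season with a 4-0 win on 5 August.';
const result = validateArticle(article, {
  match_date: '9 August',
  result_dates: ['29 July'],
  known_scores: ['2-1'],
});
console.log(result);
// flags: ['unknown date: 5 August', 'unknown scoreline: 4-0']
```

Checks this blunt will over-flag (a predicted scoreline, for instance, would be caught by the second check), which matches the false-positive rate described above: the validator's job is to route doubtful articles to a human, not to decide on its own.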
Three things, in priority order.
One: hallucination is a data problem, not a model problem. Every hour I'd spent on prompt engineering before this change had near-zero impact. The hour I spent building the Data Validator and the Fact Block had a 10x impact. If your pipeline produces confidently wrong output, look at what you're not giving the model before you look at the model itself.
Two: deterministic modules around the LLM are non-negotiable. Make.com (or n8n, or whatever you're using) is great at exactly this — building the boring, predictable, auditable scaffolding around the unpredictable LLM call. If your flow is "API → GPT → Output," you have no scaffolding. The LLM is doing work it shouldn't be doing.
Three: explicit permission to fail gracefully beats implicit pressure to perform. Telling the model "write a great preview" creates pressure to invent when data is thin. Telling it "if data is missing, write that data is missing" releases that pressure. Counterintuitively, the article quality went up after I gave the model permission to say "no recent data available" — because the alternative was always invention, and invention always read worse than honest gaps.
If you're building content pipelines and hitting hallucination problems, start with the Data Validator. It's the cheapest change with the biggest impact, and it surfaces a lot of upstream data issues you didn't know you had.
I'll write up the n8n version of this same pattern in a follow-up — same logic, different scaffolding.