Let’s be honest: there is nothing quite like the feeling of moving a machine learning model off a messy local Jupyter Notebook and seeing it live, breathing, and clickable on the internet.
A few weeks ago, I decided to tackle a fun challenge: predicting the knockout stages of the upcoming 2026 FIFA World Cup. Instead of just looking at standard text readouts on my terminal, I wanted to build a complete, interactive visual experience that anyone could play with.
Here is the story of how I built the data pipeline, why I chose my model, and the cheeky trick I used to keep the app online 24/7 for free.
I wanted to keep this workspace modular and clean. Instead of one massive script that does everything, I split the workflow across four dedicated stages:
01_data_alignment.ipynb
02_historical_feature_store.ipynb
03_local_validation_and_baseline.ipynb
app.py
For the predictive heavy lifting, I went with LightGBM.
International football data is tabular but filled with tricky, non-linear relationships (e.g., how a team's current attacking form clashes with an opponent's specific defensive structure). LightGBM handles these interactions beautifully. Plus, it trains incredibly fast and lets me output a probability for each match outcome rather than just a boring binary "Win/Loss" result.
I wanted this app to be intuitive for regular football fans, not just techies.
I built a sidebar playground where you can match any two qualified nations head-to-head and tweak "volatility sliders" to simulate massive underdog upsets.
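This post does not spell out how the volatility sliders change a forecast, so the following is only one plausible approach and not the app's actual code: treat volatility as a temperature that flattens the model's probabilities toward a coin flip. The helper name `apply_volatility` and the example numbers are assumptions.

```python
def apply_volatility(probabilities, volatility):
    """Flatten a probability distribution as volatility grows.

    volatility = 1.0 leaves the distribution unchanged; larger values pull
    every outcome toward equal likelihood, so upsets become more common.
    Hypothetical helper, not taken from the project's code.
    """
    if volatility <= 0:
        raise ValueError("volatility must be positive")
    # Raising each probability to 1/volatility is temperature scaling.
    scaled = [p ** (1.0 / volatility) for p in probabilities]
    total = sum(scaled)
    return [s / total for s in scaled]


# Illustrative favorite-vs-underdog forecast (made-up numbers).
favorite_vs_underdog = [0.80, 0.20]
for v in (1.0, 2.0, 4.0):
    adjusted = apply_volatility(favorite_vs_underdog, v)
    print(v, [round(p, 3) for p in adjusted])
```

In a Streamlit app, the `volatility` value could come from something like `st.sidebar.slider("Volatility", 1.0, 5.0, 1.0)`, with the adjusted probabilities then feeding the match simulation.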
But the hardest part? The bracket. Streamlit is amazing, but drawing a fully responsive, clean 16-team tournament tree using standard components is tough. To fix it, I injected raw custom CSS blocks directly into the layout. Now, it draws crisp vector grid lines mapping progression straight from the Round of 16 down to a gold-bordered podium crowning the grand champion.
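If you have not injected CSS into Streamlit before, the general pattern is to build an HTML string containing a `<style>` block and pass it to `st.markdown` with `unsafe_allow_html=True`. The sketch below builds such a string for a single round. The class names, colors, and the `build_round_html` helper are invented for illustration and are not the app's real stylesheet.

```python
import html

# Hypothetical styles: class names and colors are illustrative only.
BRACKET_CSS = """
<style>
.bracket-round { display: grid; gap: 12px; }
.bracket-match { border: 1px solid #888; border-radius: 6px; padding: 6px 10px; }
.bracket-champion { border: 2px solid gold; font-weight: bold; }
</style>
"""


def build_round_html(matches, champion=None):
    """Return an HTML snippet for one bracket round.

    matches is a list of (team_a, team_b) tuples. Team names are escaped
    so that arbitrary text cannot inject markup into the page.
    """
    parts = [BRACKET_CSS, '<div class="bracket-round">']
    for team_a, team_b in matches:
        parts.append(
            f'<div class="bracket-match">{html.escape(team_a)} vs {html.escape(team_b)}</div>'
        )
    if champion is not None:
        parts.append(f'<div class="bracket-champion">{html.escape(champion)}</div>')
    parts.append("</div>")
    return "\n".join(parts)


snippet = build_round_html([("Team A", "Team B"), ("Team C", "Team D")], champion="Team A")
print(snippet)
```

Inside `app.py`, the resulting string would be rendered with `st.markdown(snippet, unsafe_allow_html=True)`.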
I deployed the app for free using Streamlit Community Cloud. It’s an incredible service, but there is a catch: if your app doesn't get traffic for 12 hours, the server puts it to sleep. The next visitor gets a slow "This app is hibernating" screen.
To bypass this for my portfolio, I set up a simple GitHub Actions workflow (.github/workflows/keep_alive.yml). Every 10 hours, it automatically pushes an empty commit (one that changes no files) to the repository. This tricks the cloud server into seeing active development, resetting the hibernation timer to zero. Result? The app stays awake 24/7!
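The heart of a keep-alive job like this is just two Git commands. As a rough sketch, here they are wrapped in a Python script that a scheduled job could call. The commit message, remote, and branch name are assumptions, and the real keep_alive.yml may be structured quite differently. The one non-negotiable piece is Git's `--allow-empty` flag, which permits a commit with no file changes.

```python
import subprocess


def keep_alive_commands(message="chore: keep app awake", remote="origin", branch="main"):
    """Build the Git commands for an empty keep-alive commit.

    The defaults are placeholders; adjust them to match the real repository.
    """
    return [
        ["git", "commit", "--allow-empty", "-m", message],
        ["git", "push", remote, branch],
    ]


def run_keep_alive(dry_run=True):
    """Print the commands, and execute them only when dry_run is False."""
    for command in keep_alive_commands():
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError if Git exits with an error.
            subprocess.run(command, check=True)


# Dry run by default so the sketch is safe to execute anywhere.
run_keep_alive(dry_run=True)
```

On the scheduling side, a cron expression such as `0 */10 * * *` fires at 00:00, 10:00, and 20:00 UTC, so the gap between runs never exceeds 10 hours and stays inside the 12-hour window. Scheduled GitHub Actions runs can start late, so it is worth leaving that margin.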
The entire project is public, and I’d love for you to play around with it.
To give you a sneak peek of the dashboard in action, here is the full knockout bracket generated by the pipeline's baseline run:
[Figure: 2026 World Cup Predictive Bracket Simulation]
As you can see, the LightGBM engine forecasts an absolute heavyweight clash for the final, ultimately predicting France to edge out Argentina to lift the trophy, while nations like England and Brazil fall just short in the semifinals.