GNN vs. Trees: High-Speed Hybrid Architecture for XLA Runtime Prediction

A developer built a hybrid GNN and decision-tree architecture for predicting XLA compiler runtime, achieving high speed and low memory footprint. The solution outperforms standard deep GNNs by shifting heavy lifting to structured feature engineering. The project is open-sourced as a modular MLOps package.

Introduction

A common trap in Machine Learning engineering is deploying over-parameterized models where simpler, structurally informed pipelines can deliver identical precision at a fraction of the cost. To prove this hypothesis, I spent 24 hours reverse-engineering the "Google - Fast or Slow? Predict AI Model Runtime" competition from 2023.

The Challenge

Google's XLA compiler needs to pick the best physical memory layout and tile configurations for complex operation graphs. The wrong choice causes dramatic slowdowns. Benchmarking every configuration on actual TPUs is too expensive. We need an intelligent proxy model to rank configurations by speed.

The Standard Approach vs. My Hybrid Architecture

Top-tier competition submissions favored massive, deep GNNs coupled with heavy MLP ranking heads optimized via Pairwise Margin Ranking Loss. While powerful, these architectures leave significant optimization potential on the table regarding memory footprints and inference speeds. My solution shifts the heavy lifting from continuous gradient propagation to structured feature engineering.

Five Technical Benefits for Google Infrastructure

The Nexus to Frontier AI Gemini

Compiler optimizations at the XLA layer are not theoretical exercises. The ability to predict graph runtimes and optimize memory layout sequences is exactly what allows Google to train and serve massive LLMs like Gemini efficiently.

Codebase & Notebooks

The entire project has been converted from a research script into a modular, production-ready MLOps package. Review the architecture and run the experiments:

GitHub:
Kaggle:

If you find this engineering study or the resource-aware pipeline design helpful, please consider leaving a star on the repository.
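
For readers who want something concrete, here is a minimal sketch of the two ideas contrasted under "The Standard Approach vs. My Hybrid Architecture": summarizing an operation graph as a fixed-length feature vector that a tree model could consume, and scoring a ranking with a pairwise margin loss. It is not the code from the repository. The function names, the choice of features, and the toy graph are all assumptions made for illustration, and it uses only the Python standard library.

```python
from collections import Counter


def graph_features(opcodes, edges, vocab):
    """Summarize an op graph as a fixed-length vector (hypothetical feature set).

    opcodes: list of opcode names, one per node
    edges:   list of (src, dst) node-index pairs
    vocab:   ordered list of opcode names to count
    """
    in_degree = Counter(dst for _, dst in edges)
    counts = Counter(opcodes)
    # Node count, edge count, largest fan-in, then one count per opcode.
    features = [len(opcodes), len(edges), max(in_degree.values(), default=0)]
    features.extend(counts.get(op, 0) for op in vocab)
    return features


def pairwise_margin_loss(scores, runtimes, margin=1.0):
    """Mean hinge loss over all pairs whose measured runtimes differ.

    A lower score means "predicted faster". A pair contributes zero loss
    when the truly faster config is scored lower by at least `margin`.
    """
    total, pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if runtimes[i] < runtimes[j]:  # config i is truly faster than j
                total += max(0.0, margin - (scores[j] - scores[i]))
                pairs += 1
    return total / pairs if pairs else 0.0


# Toy graph (assumed for illustration): two dot ops feeding one add.
toy_features = graph_features(
    opcodes=["dot", "add", "dot"],
    edges=[(0, 1), (2, 1)],
    vocab=["add", "dot", "conv"],
)

# Three configs with measured runtimes 1.0 < 2.0 < 3.0.
good_loss = pairwise_margin_loss([0.0, 2.0, 1.0], [1.0, 3.0, 2.0])
bad_loss = pairwise_margin_loss([2.0, 0.0, 1.0], [1.0, 3.0, 2.0])
```

In a hybrid setup along these lines, vectors like `toy_features` (possibly concatenated with a small learned graph embedding) would be fed to a gradient-boosted tree ranker, and the pairwise loss would serve as a quick check that the predicted ordering matches the measured one. Here `good_loss` is zero because every pair is ordered correctly with the full margin, while `bad_loss` is positive because the ordering is reversed.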