Pedro Rocha
CTO of xelerate.tech
Porto, 4th of May 2026
**One Model Won’t Save You: How We Actually Built Our AI Stack**
I was grabbing coffee with a peer last week who was tearing his hair out trying to find the “perfect” AI platform for his entire company. He wanted one contract, one interface, and one bill.
I told him he was chasing a ghost.
The “one-size-fits-all” AI platform is a marketing myth. If you try to force a single model to handle everything from your system architecture to your HR emails, you’re going to end up with a tool that is either too expensive for the simple stuff or too “dumb” for the heavy lifting. We stopped looking for the “Holy Grail” and started building a toolbox instead.
The Real Cost of “Intelligence”
Before you sign a vendor agreement, you have to look past the flashy demos and talk about the plumbing. Specifically: Privacy and the Bill.
Self-Hosting: If you have the hardware and the budget, hosting your own models (like Llama or Mistral) is the gold standard for privacy. Your data never leaves your VPC. But let’s be honest: the infrastructure costs and the specialized talent needed to maintain those clusters will make your CFO’s eyes water.
Pay-per-token: This is the most transparent way to scale. You pay for what you use. It’s great for high-volume automation, but if you don’t have tight guardrails, a single buggy loop in development can burn through a month’s budget by lunch.
Subscriptions: These are predictable, which finance loves. But you have to watch the “soft limits.” Once you hit that invisible ceiling, your high-performance model suddenly feels like it’s running on a 56k modem.
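To make the pay-per-token "guardrail" point concrete, here is a minimal sketch of a hard spend cap that stops a runaway loop. Everything in it is an assumption for illustration: the `TokenBudget` class, the `run_runaway_loop` helper, and the price and cap figures are hypothetical and do not reflect any vendor's actual pricing or API. Amounts are kept as integer micro-dollars so the arithmetic stays exact.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a call would push spend past the configured cap."""


class TokenBudget:
    """Hypothetical spend tracker; all amounts are integer micro-USD."""

    def __init__(self, cap_micro_usd: int, price_micro_usd_per_token: int):
        self.cap_micro_usd = cap_micro_usd
        self.price_micro_usd_per_token = price_micro_usd_per_token
        self.spent_micro_usd = 0

    def charge(self, tokens: int) -> int:
        """Record a call's token usage, refusing it if the cap would be exceeded."""
        cost = tokens * self.price_micro_usd_per_token
        if self.spent_micro_usd + cost > self.cap_micro_usd:
            raise BudgetExceeded(
                f"call costing {cost} micro-USD would exceed cap of "
                f"{self.cap_micro_usd} (already spent {self.spent_micro_usd})"
            )
        self.spent_micro_usd += cost
        return cost


def run_runaway_loop(budget: TokenBudget, tokens_per_call: int, max_iterations: int) -> int:
    """Simulate a buggy loop that keeps calling a model; return the calls allowed."""
    calls = 0
    for _ in range(max_iterations):
        try:
            # In a real system the model call would happen only after this check passes.
            budget.charge(tokens_per_call)
        except BudgetExceeded:
            break  # The guardrail stops the loop instead of the invoice doing it.
        calls += 1
    return calls


if __name__ == "__main__":
    # Assumed figures: 10 micro-USD per token, 1 USD cap for a dev environment.
    dev_budget = TokenBudget(cap_micro_usd=1_000_000, price_micro_usd_per_token=10)
    allowed = run_runaway_loop(dev_budget, tokens_per_call=2000, max_iterations=10_000)
    print(f"calls allowed before the cap: {allowed}")
```

The design choice worth noting is that the check happens before the spend is recorded, so the cap is never crossed, only approached. Most API providers also offer account-level spending limits, which are a sensible second layer on top of an application-level check like this one.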
A quick observation:
I’ve noticed a lot of teams ignore the “Digital Sovereignty” gap. Right now, Europe is essentially sitting in the bleachers while the US and China sprint ahead. For us, this isn’t just about tech; it’s about control. Every time a European company relies solely on a foreign model, we’re renting our cognitive infrastructure. It’s a risk we have to manage, not just a line item we can ignore.
Our Stack: Why We Chose What We Chose
We didn’t pick one winner. We picked the best tools for the specific jobs our people do every day.
1. For the Dev Team: GitHub Copilot Pro+
We didn’t overthink this one. Our code lives on GitHub. Our devs live in VS Code. Copilot Pro+ gives us seamless integration without the “context-switching tax.” It provides access to top-tier models, and the subscription, given the limits it includes, is the most cost-effective way to keep a high-output engineering team happy.
2. For Everything Else: Gemini (Google Workspace)
To support the rest of our operations (marketing, HR, admin) we lean on Gemini. Since it’s already bundled into our Google Workspace subscription, the barrier to entry was zero. It handles the spreadsheets and the emails perfectly fine without us having to manage another separate subscription.
3. For Automation: n8n + OpenAI
Then there’s the glue holding our background processes together. We’ve standardised on n8n paired with OpenAI for our heavy-duty automation. While Gemini handles the “human-facing” work in our docs, this duo handles the logic under the hood. We use n8n to build the workflows that connect our CRM, Notion, BambooHR, Google Workspace, and internal databases, and then call OpenAI’s API (and a bunch of MCP servers) whenever we need a specific “brain” to parse data or make a decision. It’s the ultimate pay-as-you-go setup. We only spend tokens when a trigger actually fires, which keeps our automation overhead lean without sacrificing the raw power of the GPT models.
The “Run Book”: Matching IQ to the Task
To make sure we aren’t burning through our premium subscriptions on tasks a calculator could do, we implemented a specific “Run Book” for our engineering tasks. We treat AI tokens like a resource, not a free-for-all.
| Task Category | Model Choice | Why? |
|---|---|---|
| Complex Planning / System Design | Claude Opus 4.7/4.6 | High-level reasoning for architectural changes. |
| Security Reviews & Hard Tests | Claude Opus 4.7 | You don’t take shortcuts with security. |
| Standard Code Reviews | Claude Opus 4.6 | High accuracy with faster response times. |
| Simple Fixes / Minor Changes | Claude Sonnet 4.6 / GPT-5.4 | Fast, punchy, and keeps costs low. |
The Goal: Maximize the number of issues implemented by a single Agent Session. We want the AI to solve the problem in one go, not engage in a twenty-minute chat that drains our rate limits.
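One way to keep a Run Book like this from living only in a wiki page is to encode it as a lookup table. The sketch below is an illustration under stated assumptions, not a description of how the routing is actually wired up: the `TaskCategory` enum, the `RUN_BOOK` dictionary, and the `pick_model` function are hypothetical names, the model strings are the display names from the table rather than real API identifiers, and treating the first listed model as the preferred one is an assumption.

```python
from enum import Enum


class TaskCategory(Enum):
    """Task categories taken from the Run Book table."""
    COMPLEX_PLANNING = "complex_planning"
    SECURITY_REVIEW = "security_review"
    STANDARD_CODE_REVIEW = "standard_code_review"
    SIMPLE_FIX = "simple_fix"


# Display names from the table, NOT vendor API identifiers.
# Assumption: where two models are listed, the first is the preferred one.
RUN_BOOK = {
    TaskCategory.COMPLEX_PLANNING: ["Claude Opus 4.7", "Claude Opus 4.6"],
    TaskCategory.SECURITY_REVIEW: ["Claude Opus 4.7"],
    TaskCategory.STANDARD_CODE_REVIEW: ["Claude Opus 4.6"],
    TaskCategory.SIMPLE_FIX: ["Claude Sonnet 4.6", "GPT-5.4"],
}


def pick_model(category: TaskCategory, unavailable: frozenset = frozenset()) -> str:
    """Return the first Run Book model for the category that is not marked unavailable."""
    for model in RUN_BOOK[category]:
        if model not in unavailable:
            return model
    raise LookupError(f"no available model for {category.name}")


if __name__ == "__main__":
    print(pick_model(TaskCategory.SECURITY_REVIEW))
    # Fall back to the second option if the first is rate-limited.
    print(pick_model(TaskCategory.SIMPLE_FIX, unavailable=frozenset({"Claude Sonnet 4.6"})))
```

Note that the security category deliberately has no fallback here: if its single model is unavailable, the lookup fails loudly rather than quietly downgrading, which matches the "no shortcuts with security" row in the table.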
The Bottom Line
Stop looking for the “Best AI.” It doesn’t exist. Instead, look at your workflow and figure out where you need a scalpel and where you need a sledgehammer. Your team will be more productive, and your budget will actually stay in the black.