I've worked on two teams with opposite philosophies of how to demo to a client, and I still go back and forth on which one is right.
One builds demos for impact. Everything is mocked: the screens are polished, the flow is choreographed, and the result that lands on screen is the best possible version of itself. The backend isn't really doing the work and, crucially, the AI isn't really being called. It's a beautiful film of the product.
The other builds functional proof-of-concepts. As much as possible is real and wired up (real services, real model calls, real data), and then you iterate on top of it. The first version is rougher and less choreographed, but what you're looking at is actually happening.
Both "work" in the sense that both can win a room. They just optimise for different things, and the gap between them gets a lot wider the moment AI is involved.
It's easy to be snobby about mocked demos, so let me steelman them first, because the reasons are good ones.
A mock is fast. You're not blocked on infrastructure, model access, data pipelines, or the ten unglamorous things that have to exist before a real flow runs end to end. You can demo a product that doesn't exist yet.
A mock is controlled. Demos fail in stupid, memorable ways: a timeout, a rate limit, a model having a bad day in front of the one person you needed to impress. A mock removes that variance. The story you rehearsed is the story they see.
And a mock sells the vision, not the current state. Early on, what you're really validating is desire: do people want this? A crisp mock answers that question without you having to build the thing first. For a non-technical stakeholder deciding whether to fund the next phase, a polished mock can be exactly the right artifact.
None of that is dishonest. It's a legitimate strategy with real upsides.
The functional PoC gives up some polish and a lot of control in exchange for one thing: what you show is true.
That truth compounds. A functional PoC isn't thrown away after the meeting; it's the first commit of the product. You iterate on it instead of rebuilding from a slide deck. The feedback you get is real feedback, because people are reacting to real behaviour, not to your best-case storyboard. And the hard parts surface now, while they're cheap, instead of after a contract is signed and the timeline is fixed.
I've felt this directly. In one live demo, someone asked the assistant about errors that had occurred, and its answer blended what had gone well in with what had gone wrong. The person watching reacted on the spot: this needs to be more concise. In the same session, someone asked whether it could do a particular thing that, because of an internal constraint, it simply couldn't. Two concrete pieces of feedback and two new tickets, in the span of one demo. With a mock, you can't even attempt those questions: the script answers what it was scripted to, and nothing real is being tested.
It's slower to first wow. But it never has to walk anything back.
For ordinary software, the distance between a mock and the real thing is mostly polish: the real version will be a bit slower, a bit less pretty, a few edge cases will misbehave. Manageable. With AI, the distance is substance, because a mock hides the two properties that define how the product will actually feel: latency, since a real model call is never instant, and variance, since a real model is not always correct.
So when you demo AI with a mock, you're not just smoothing over rough edges. You're selling away the two risks the project actually has. The client falls in love with an instant, always-correct assistant, and then the team has to build something that is neither of those by default. That gap doesn't close itself; someone pays for it later, usually in eroded trust during the build.
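One way to make those two risks visible before the meeting is to measure them. What follows is a minimal sketch under assumptions, not a real evaluation setup: `profile_assistant` and `mocked_assistant` are hypothetical names invented for illustration, and `ask` stands in for whatever function wraps your actual model call. It sends the same prompt several times and reports how long the calls took and how many distinct answers came back.

```python
import statistics
import time
from collections import Counter
from typing import Callable


def profile_assistant(ask: Callable[[str], str], prompt: str, runs: int = 5) -> dict:
    """Call `ask` repeatedly with one prompt; summarise latency and answer spread."""
    latencies = []
    answers = Counter()
    for _ in range(runs):
        start = time.perf_counter()
        answer = ask(prompt)
        latencies.append(time.perf_counter() - start)
        # Exact-match counting is a crude proxy for variance, but it is
        # enough to show whether the answer is the same every time.
        answers[answer.strip()] += 1
    return {
        "runs": runs,
        "median_latency_s": statistics.median(latencies),
        "max_latency_s": max(latencies),
        "distinct_answers": len(answers),
    }


def mocked_assistant(prompt: str) -> str:
    # Hypothetical mock: a canned answer with no model call behind it.
    return "Three jobs failed overnight."


print(profile_assistant(mocked_assistant, "What went wrong last night?"))
```

Run against the mock, this reports a single distinct answer and negligible latency. Passing a function that wraps a real model call as `ask` is how you would check whether the live system comes anywhere near that.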
Honestly, it depends on what you're trying to learn from the demo:
My own bias leans toward the functional PoC, and it's leaned further the more I work with AI, because with AI the risk lives exactly in what the mock paints over. A functional PoC doesn't have to be ugly or slow to build, either: the AG-UI demo I put together recently is real end to end (real model, real latency, real rendered widgets) and it's about a hundred lines. The "real is too expensive to demo" assumption is often less true than it looks.
But I hold it loosely. A mock that honestly sells a vision, followed by a team that closes the gap, is a perfectly good way to build a company. The failure mode isn't mocking; it's mocking the risky parts and then quietly hoping reality will cooperate.
It doesn't have to be binary. I've worked on projects that ran both at once: a mocked path to guarantee a clean end-to-end walkthrough (the story you can always tell without something breaking mid-meeting) and the real system alongside it, to probe live with different cases. The mock de-risks the narrative; the real part invites the hard questions. You get the controlled wow and the honest feedback in the same session, as long as everyone in the room knows which half is which.
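The hybrid setup can be sketched as a thin router in front of both paths. This is an illustrative sketch, not how those projects were actually built: `DemoAssistant`, `Reply`, the `source` label, and `fake_live_backend` are all hypothetical names, and the live backend here is a stand-in so the example runs without model credentials. The point it tries to show is the labelling: every answer carries a tag saying which half produced it.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Reply:
    text: str
    source: str  # "scripted" or "live"; meant to be shown on screen


class DemoAssistant:
    """Routes rehearsed questions to a script and everything else to the live system."""

    def __init__(self, script: Dict[str, str], live_backend: Callable[[str], str]):
        self.script = script
        self.live_backend = live_backend

    def ask(self, question: str, force_live: bool = False) -> Reply:
        # Rehearsed walkthrough: answer from the script unless the presenter opts out.
        if not force_live and question in self.script:
            return Reply(self.script[question], source="scripted")
        # Off-script (or forced) questions go to the real system, failures and all.
        return Reply(self.live_backend(question), source="live")


def fake_live_backend(question: str) -> str:
    # Stand-in for a real model call so this sketch runs on its own.
    return f"(live answer to: {question})"


assistant = DemoAssistant(
    script={"What failed last night?": "Three jobs failed; two were retried successfully."},
    live_backend=fake_live_backend,
)

for q in ["What failed last night?", "Can you roll back the deploy?"]:
    reply = assistant.ask(q)
    print(f"[{reply.source}] {reply.text}")
```

Keeping the `source` tag visible in the UI is the design choice that matters: the scripted path protects the story, and the tag keeps it honest.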
Maybe the real question isn't "mock or functional." It's which moment you're optimising for: the signature at the end of the demo, or the trust at the end of the first sprint. Sometimes those point the same way. With AI, more often than I'd like, they don't.
How does your team demo, and have you ever been bitten by the gap between the demo and the thing you shipped? I'd genuinely like to hear it.
Related: When the Chat Builds Its Own Interface (a functional demo, end to end) and LLM-as-Judge Is Three Decisions (on measuring the variance a mock hides).
Originally published on javieraguilar.ai