Building Pulso: What it Actually Takes to Put Agentic AI in a Solo Practice

wpnews.pro

Most of what gets written about “agentic AI” lives a long way from the people it’s supposed to help. We talk about autonomous agents, orchestration, RAG, benchmarks. Meanwhile the nutritionist, the psychologist, the dentist who runs their own practice is still copying appointments into a notebook, typing up records at midnight, and answering “do you take insurance?” for the thousandth time.

Pulso started in that gap. This isn’t a pitch. It’s an honest log of the engineering and product problems we hit while turning real AI capability into something that fits the day, and the budget, of a health professional who works mostly alone. Some of these challenges we chose. Others chose us.

The first hard call in a new product is rarely a feature. It’s the foundation. Where this runs, how it grows, and what it costs while there’s still no revenue to justify anything fancy.

We moved off a heavier, more “enterprise” setup and onto Railway, and it was a deliberate choice. Not because it’s the cheapest or the trendiest, but because it gives us three things that matter right now:

The word that did the heavy lifting was modular. We designed the system in layers that can be separated without rewriting the product: the messaging layer (WhatsApp through a BSP), the agent layer (the AI flows), the domain layer (journeys, records, scheduling), and the usage and billing layer. Each one can scale, get swapped, or become its own service later without dragging the rest down.

The lesson we keep coming back to: in the validation stage, the best infrastructure isn’t the one that survives a million users. It’s the one that doesn’t get in the way of learning fast, and that you won’t have to throw away once the learning turns into traction.

This is probably the most important decision we made, and it isn’t technical.

The classic way to get product wrong is simple. You imagine a solution, bring it to the customer finished, and ask “would you want this?”. The answer is almost always “yes, love it”. And it’s almost always a polite lie. People don’t want to let you down, so they praise ideas they’ll never use. That bias has a name and a whole body of work behind it. It’s the spine of The Mom Test by Rob Fitzpatrick, and it echoes Uri Levine’s “fall in love with the problem, not the solution”. The practical rule is to stop asking whether people like your idea and start digging into the problem and what they actually do.

That’s exactly how we worked. Instead of building a bundle of features and asking if people liked it, we started from a real pilot, ran it through several validation sessions, and listened to the problem before proposing anything. Every feature came out of a concrete pain we watched happen, not a roadmap we invented in a room:

The difference is the method. We listened to the problem and brought the solution, instead of bringing a solution and asking if they’d like it. Sounds like a detail. It’s the difference between building what sells and building what fills up the repo.

There’s an almost absurd distance right now between what agentic AI can already do and what an independent professional actually has in their hands day to day.

Models hold a natural conversation, transcribe, summarize, structure, schedule, remind. And yet most small practices still run on a personal WhatsApp, a spreadsheet, and memory. The technology exists. The access doesn’t. It’s packaged into expensive, complex tools built for big clinics and networks, with onboarding that assumes an IT team the solo professional never had.

That’s the thesis behind Pulso, and it’s what moved us from the start: use that gap. Take real AI capability, the same kind that powers polished products today, and wrap it so a person working alone can use it on day one, in the channel they already live in (WhatsApp), at a fair price.

It was never about having the newest model. It’s about removing the friction between that model and the person who’d gain the most from it. The innovation here is as much about distribution and packaging as it is about technology.

This is the least glamorous challenge and the most decisive one for any product that ships AI. Every message a model handles has a cost. Every transcription has a cost. Every agent that answers on its own is spending real money. If you don’t measure that per customer, you’re flying blind, and you find out the hard way when the bill arrives.

So per-tenant usage control wasn’t a late “nice to have”. It was foundation. We built a usage meter tied to the pricing model, and gave it a name the customer understands without thinking about tokens: Pulsos.

The trick is turning a technical cost into a unit the customer can reason about. A Pulso is that unit. Instead of tokens, message fees, and model bills, the professional buys a quota of Pulsos for the period (monthly, quarterly, or yearly), and every action draws it down by a weight that mirrors what it actually costs us:

The professional sees this in plain language (“info” and “outreach”, never “utility” and “marketing”) and always knows how much of the pack is left. Underneath, that exact same weight feeds our unit economics. We know the margin per plan, per customer, per type of use, and we can answer the question that kills AI startups: does this customer make money?

The message weights aren’t the whole cost, though. The single most expensive thing we run is the conversation agent, the back-and-forth that figures out what the patient wants and carries a booking all the way to a confirmed slot. It’s multi-turn by nature, and even with the right model and aggressive prompt caching, those turns add up fast. That’s the point: the most valuable feature is also one of the priciest to operate, and you have to see that cost per customer before you scale it, not after.

So there’s a second layer of cost engineering in how we use the models themselves. We route each task to the right one. A fast, cached model (Claude Haiku) handles the day-to-day patient conversations and the basic tasks. A stronger one (Claude Sonnet) takes the heavier work where quality matters more than speed: generating care journeys, importing medical records, and powering the literature agent. There’s a fallback for when something breaks. The expensive model only shows up where it earns its keep.

The lesson: in an AI product, unit economics isn’t a finance spreadsheet, it’s an architecture decision. Measure per tenant from day one and you can scale with confidence. Skip it and you scale the losses.

Those four were the pillars. But some problems only surface once you put agentic AI in front of a real patient, on WhatsApp, about health. These three earn a spot in the article.

When the channel is about health, and sometimes mental health, “the agent got it wrong” stops being a bug and becomes a risk. We had to build crisis-signal detection with an AI confirmation step before anything fires, specifically so we don’t trigger a crisis response on a false positive (someone filling in an “emergency contact” field at signup, for example). And there had to be an off switch: a flag that pulls that detection out of production in seconds if it ever starts doing more harm than good. Safety in a health product is less about the happy path and more about failing carefully.

WhatsApp is not a friendly API. There’s a 24-hour window for conversations, templates that need approval, automatic categorization by Meta (which decides on its own whether your message is utility or marketing), and sending limits. We went with each professional bringing their own number, activated with help from our team, and built the whole template and rate-limit layer around rules we don’t control. A good chunk of Pulso’s invisible engineering is exactly that: making a smooth experience fit inside a platform full of sharp edges.

Here’s something we learned about distribution. You can’t scale this with a signup link, and the reason is easy to underestimate. A professional’s schedule is their professional life. Handing it over to an AI is not a casual yes. So we stopped chasing volume and put trust first. What actually works is high-touch: short, focused meetings with demos we’ve prepared in advance, backed by real test numbers, so the person feels the product working before a purchase is even on the table.

And selling AI with screenshots doesn’t work anyway. Nobody believes it until they try it. So we turned the demo itself into a product. The lead gets a code, sends it to a number, and lives the experience in their own hands: books a slot, dictates a record by voice, watches the assistant answer questions on its own. The same number serves both the demo and the real flow, with smart routing between the two. Building that was its own challenge, but it’s what makes the “oh, now I get it” land in minutes, and it’s what earns the right to touch the schedule later.

The next challenges are already on the horizon, and they follow the same logic of catching a window while it’s open: digital signing of records and prescriptions with the Brazilian ICP-Brasil standard, deeper calendar integration, and fitting into a regulatory landscape that’s opening up to digital in healthcare. Each one is, again, the same exercise: take a capability that already exists and make it reachable for someone who works alone.

If there’s one line that sums up what we’ve learned building Pulso, it’s this. The most advanced technology in the world only becomes value when it fits into a real person’s routine, at a price they’re willing to pay. The rest is engineering to close that distance. Building Pulso: What it Actually Takes to Put Agentic AI in a Solo Practice was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

source & further reading

pub.towardsai.net — original article Can Your Computer Run Nvidia’s 550B Model? Not Even Close, and the Reason Is Fascinating Agent AI Sprawl Nobody Owns The Multimodal Lakehouse: Data Engineering’s Answer to AI’s Messiest Problem

Building Pulso: What it Actually Takes to Put Agentic AI in a Solo Practice

Run your AI side-project on zahid.host