{"slug": "the-ai-implementation-process-i-use-with-every-client", "title": "The AI Implementation Process I Use With Every Client", "summary": "An engineer outlines a five-phase AI implementation process used with clients: scoping, proof of concept, integration, evaluation, and operations. Each phase has an exit criterion that must be met before proceeding, with the goal of avoiding common project failures. The process emphasizes adversarial testing, idempotency keys, approval queues, and token budget alerts.", "body_md": "Most AI projects do not fail at the model. They fail in the six weeks before anyone writes a prompt, and in the six weeks after the demo lands in a Slack channel and nobody knows who owns it. I have run enough of these now (from one-off automations to multi-agent content systems running unattended) that the process has converged into something stable. This is the version I actually use.\n\nIt has five phases: scoping, POC, integration, evaluation, operations. Each phase has an exit criterion. If we cannot meet the exit criterion, we do not move forward. That single rule has saved more projects than any clever architecture choice.\n\nScoping ends with a written document that names the workflow being automated, the system of record it touches, the success metric in hours or dollars, the data we have access to, and the smallest possible first slice. No model is chosen yet. No code is written. If we cannot produce that document, the engagement stops here and the client keeps the document.\n\nThe hardest part of scoping is resisting the urge to solve the interesting problem. Clients almost always describe the AI-shaped fantasy (\"an agent that handles all support tickets\") when the real opportunity is narrower and uglier (\"triage tier-1 tickets that mention billing, route to the right queue, draft a reply for human approval\"). The narrower version ships. 
The fantasy does not.\n\nI run scoping as three sessions:\n\n**Exit criterion:** a one-page scope with a single first slice, a measurable success metric, and a named human owner on the client side. No owner, no project.\n\nThe POC has one job: kill the project cheaply if it cannot work. I treat the POC as adversarial. I am trying to find the reason this will not ship, before we spend integration money on it.\n\nConcretely, a POC for me looks like this:\n\nThe POC answers four questions in order:\n\n| Question | What \"no\" means |\n|---|---|\n| Does the model produce the right shape of output reliably? | Schema issues, structured-output failures. Fixable. |\n| Does it produce the right content on easy cases? | Capability gap. Sometimes fixable with retrieval or examples. |\n| Does it handle the long tail without catastrophic failures? | The real risk. Often the project killer. |\n| Can we detect when it is wrong? | If no, the project cannot ship to production. Full stop. |\n\nThat last question is the one most people skip. An AI system you cannot evaluate is an AI system you cannot trust, and an AI system you cannot trust is a demo, not a product. I have walked away from POCs that worked 90% of the time because there was no signal to catch the 10%.\n\n**Exit criterion:** measurable performance on the eval set that the client agrees is good enough to justify integration cost, plus a documented failure mode list.\n\nThis is where most of the actual work lives, and where most of my time goes. The model is usually the easy part by now. The integration is what makes it real.\n\nMy default stack for production AI work:\n\nThree integration details I now treat as non-negotiable:\n\nAny external action (send email, create ticket, post to CRM) gets an idempotency key derived from the input. 
Retries are inevitable, duplicate side effects are not.\n\n``` python\ndef idempotency_key(workflow_id: str, input_hash: str, step: str) -> str:\n    # Deterministic: the same workflow, step, and input always map to the same key.\n    return f\"{workflow_id}:{step}:{input_hash}\"\n```\n\nI always build the approval queue before I build the auto-send. Even if the client wants full automation eventually, shipping with human review for the first 2 to 4 weeks catches the failure modes the eval set missed. Turning approval off later is one config change.\n\nToken budgets per execution, hard cutoffs, alerts at 50/80/100% of monthly budget. I have seen a single retry loop burn $400 in an hour. Never again.\n\n**Exit criterion:** the system runs end to end on real production data, with logging, retries, idempotency, and a kill switch. Not perfect outputs yet, but the pipes are sound.\n\nEvaluation is not a phase you finish. It is a system you build once and keep running forever. But there is a discrete block of work to set it up, and that is what this phase is.\n\nI build three layers of evaluation:\n\nThe trap here is treating eval as a one-time gate. Models change. Prompts drift. Data shifts. The eval set has to be re-run on every change and the production telemetry has to feed back into growing the eval set. If a real production failure happens, it goes into the eval set the same day.\n\n**Exit criterion:** the client can answer \"is the system still working correctly?\" without calling me.\n\nThis is the phase that separates a project that survives from one that dies six months in when something breaks and nobody knows where to look.\n\nWhat I deliver in operations:\n\nA few opinions, after running this loop enough times:\n\nThe shape of this process is not unique to my work. What is mine is the calibration: which phases I now know to invest in, which exit criteria I refuse to skip, and which mistakes I have made enough times to write them down. 
That last category is the actual deliverable when you hire someone like me, more than the code.\n\nIf you are scoping an AI implementation and want a second pair of eyes on it before you commit budget, I am happy to look at it. Reach out at [lazar-milicevic.com/#contact](https://lazar-milicevic.com/#contact), or browse the rest of the blog for more on evaluation, RAG, and getting agents into production.", "url": "https://wpnews.pro/news/the-ai-implementation-process-i-use-with-every-client", "canonical_source": "https://dev.to/lamingsrb/the-ai-implementation-process-i-use-with-every-client-5a6i", "published_at": "2026-06-29 06:24:31+00:00", "updated_at": "2026-06-29 06:57:24.094111+00:00", "lang": "en", "topics": ["artificial-intelligence", "ai-products", "ai-agents", "mlops", "developer-tools"], "entities": ["Slack", "CRM"], "alternates": {"html": "https://wpnews.pro/news/the-ai-implementation-process-i-use-with-every-client", "markdown": "https://wpnews.pro/news/the-ai-implementation-process-i-use-with-every-client.md", "text": "https://wpnews.pro/news/the-ai-implementation-process-i-use-with-every-client.txt", "jsonld": "https://wpnews.pro/news/the-ai-implementation-process-i-use-with-every-client.jsonld"}}