Making AI-Generated Code Fail Gracefully

A developer building an AI-powered video editing tool implemented silent self-correction for LLM-generated code failures, achieving a 90%+ success rate. Instead of showing users raw Python tracebacks, the system feeds error messages back to the LLM for automatic retries, with a 70-80% success rate on the first retry. The approach transforms user experience from seeing cryptic errors to receiving friendly messages like "Hmm, let me try that a different way..." followed by successful execution.

If your app generates code with an LLM and executes it, you already know the dirty secret: it fails a lot. Not catastrophically, just wrong method names, bad assumptions about state, off-by-one stuff. The kind of errors a human would fix in 10 seconds. The question is what your user sees when that happens.

The Problem

Version 1 of my app showed users raw Python tracebacks when a generated script failed. Something like:

    Script execution failed:
    Traceback (most recent call last):
      File "<string>", line 3, in <module>
        items = timeline.GetItemsInTrack("video", 1)
    AttributeError: 'Timeline' object has no attribute 'GetItemsInTrack'

The LLM got the method name wrong: it's GetItemListInTrack, not GetItemsInTrack. An easy fix. But my users are video editors, not Python developers. That traceback means nothing to them except "it broke."

The Fix: Silent Self-Correction

Instead of showing the error, I send it back to the LLM with context:

"The previous script failed with: AttributeError: 'Timeline' object has no attribute 'GetItemsInTrack'. Generate a corrected script."

The LLM sees its own mistake, fixes the method name, and the corrected script runs. The user sees:

"Hmm, let me try that a different way..."

Then 2 seconds later:

"✓ Set opacity to 50% on 12 clips"

They never see the error. It just works on the second attempt.

The Implementation (High Level)

The retry loop is simple:

1. LLM generates a script
2. Script fails validation or execution
3. Send the error message back to the LLM as a new prompt
4. LLM generates a corrected script
5. Try again (up to 2 retries)
6. If all retries fail, show a friendly message suggesting simpler commands

The key insight: LLMs are surprisingly good at fixing their own mistakes when you show them the exact error. The success rate on retry is much higher than the first attempt because the error message narrows the solution space.

Friendly Validation Messages

Not all failures are execution errors.
Some scripts get rejected before they run because they violate sandbox rules (my app runs generated code in a restricted environment). Instead of showing "Script contains blocked import: 'os'", the user sees:

"That operation would need external libraries that aren't available. Try rephrasing: most operations work with the built-in tools."

Different failure modes get different messages. The user gets guidance on what to try next, not a technical explanation of why it broke.

What I Learned

Users don't care about errors; they care about results. If you can fix it silently, fix it silently.

LLMs are good debuggers of their own output. Feeding the error back works 70-80% of the time on the first retry.

Three retries is the sweet spot. One isn't enough (sometimes the fix introduces a new error). Two catches most errors. Three is for that last 10-20% that need complex logic reevaluations.

Friendly messages need to be actionable. "Something went wrong" is useless. "Try a simpler version of your request" gives the user a next step.

The QThread signal collision was the real bug. I spent hours debugging why retries weren't working before realizing Qt's built-in finished signal was shadowing my custom one in packaged builds. Renamed it and everything clicked. If you're subclassing QThread, don't name your signals finished or started.

The UX Difference

Before: 30% of commands showed a traceback. Users assumed the app was broken.

After: 90%+ of commands succeed (including retries). The 10% that fail get a conversational message. Users assume the app is smart but has limits, which is exactly right.

Building Cutting Room AI: natural language video editing for DaVinci Resolve Studio. Available now FOR FREE: NickValenciaTech.com
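
Appendix: a sketch of the retry loop

The retry loop described earlier (generate, fail, feed the error back, try again, fall back to a friendly message) might look roughly like this. It is a minimal sketch, not the app's actual code: run_with_retries, Outcome, FakeTimeline, fake_generate, fake_execute, and the wording of the final fallback message are all hypothetical names and values made up for illustration, and the "LLM" is a stub that fixes the method name once it sees an error in the prompt.

```python
from dataclasses import dataclass
from typing import Callable

RETRY_NOTICE = "Hmm, let me try that a different way..."
# Hypothetical wording for the final fallback message.
FRIENDLY_FAILURE = "I couldn't get that one to work. Try a simpler version of your request."


@dataclass
class Outcome:
    ok: bool
    message: str   # text shown to the user
    attempts: int  # how many scripts were generated and tried


def run_with_retries(
    request: str,
    generate: Callable[[str], str],  # prompt -> script (stands in for the LLM call)
    execute: Callable[[str], str],   # script -> success message; raises on failure
    notify: Callable[[str], None],   # shows an interim message to the user
    max_retries: int = 2,
) -> Outcome:
    prompt = request
    for attempt in range(1, max_retries + 2):  # first attempt plus retries
        script = generate(prompt)
        try:
            message = execute(script)
        except Exception as exc:  # validation and execution errors alike
            if attempt <= max_retries:
                notify(RETRY_NOTICE)
                error = f"{type(exc).__name__}: {exc}"
                # Feed the exact error back so the next attempt can fix it.
                prompt = (
                    f"{request}\n\nThe previous script failed with: {error}. "
                    "Generate a corrected script."
                )
        else:
            return Outcome(True, message, attempt)
    return Outcome(False, FRIENDLY_FAILURE, max_retries + 1)


class FakeTimeline:
    """Stand-in for the real timeline object, for demonstration only."""

    def GetItemListInTrack(self, track_type, index):
        return ["clip"] * 12


def fake_generate(prompt: str) -> str:
    # Pretend LLM: wrong method name first, corrected once it sees an error.
    method = "GetItemListInTrack" if "failed with" in prompt else "GetItemsInTrack"
    return (
        f'items = timeline.{method}("video", 1)\n'
        'result = "Set opacity to 50% on " + str(len(items)) + " clips"\n'
    )


def fake_execute(script: str) -> str:
    namespace = {"timeline": FakeTimeline()}
    exec(script, namespace)  # demo only; a real app would run this sandboxed
    return namespace["result"]


if __name__ == "__main__":
    outcome = run_with_retries(
        "Set opacity to 50% on all clips", fake_generate, fake_execute, print
    )
    print(outcome.message)
```

The raw error string only ever goes into the next prompt. The user-facing channel (notify and Outcome.message) carries nothing but the conversational messages. max_retries defaults to 2 here to match the numbered steps; it is a parameter so the retry budget is easy to tune.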
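
Appendix: a sketch of pre-run validation with friendly messages

The idea of rejecting a script before it runs and mapping each failure mode to its own user-facing message could be sketched as below. This is an illustration under assumptions, not the app's real validator: the BLOCKED_MODULES list, the ValidationError class, the failure-kind names, and the syntax-error message are hypothetical. A static import scan like this is also not a complete sandbox on its own; it only shows the "technical detail in, friendly message out" mapping.

```python
import ast

# Hypothetical blocklist; a real sandbox would define its own rules.
BLOCKED_MODULES = {"os", "sys", "subprocess", "socket"}

# One user-facing message per failure mode.
FRIENDLY = {
    "blocked_import": (
        "That operation would need external libraries that aren't available. "
        "Try rephrasing: most operations work with the built-in tools."
    ),
    "syntax_error": (
        "I couldn't turn that into a working script. "
        "Try a simpler version of your request."
    ),
}
DEFAULT_MESSAGE = "Try a simpler version of your request."


class ValidationError(Exception):
    """Carries a machine-readable kind plus the technical detail."""

    def __init__(self, kind: str, detail: str):
        super().__init__(detail)
        self.kind = kind
        self.detail = detail  # for logs or the retry prompt, not for users


def validate(script: str) -> None:
    """Raise ValidationError if the script breaks a rule; return None if OK."""
    try:
        tree = ast.parse(script)
    except SyntaxError as exc:
        raise ValidationError(
            "syntax_error", f"Script has a syntax error: {exc.msg}"
        ) from exc

    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # module is None for "from . import x"
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in BLOCKED_MODULES:
                raise ValidationError(
                    "blocked_import", f"Script contains blocked import: '{root}'"
                )


def friendly_message(error: ValidationError) -> str:
    """Translate a validation failure into guidance the user can act on."""
    return FRIENDLY.get(error.kind, DEFAULT_MESSAGE)


if __name__ == "__main__":
    try:
        validate("import os\nos.remove('project.drp')")
    except ValidationError as err:
        print(friendly_message(err))
```

Keeping detail and the friendly text separate means the technical string is still available to send back to the LLM on a retry, while the user only ever sees the guidance.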