{"slug": "i-shipped-one-messy-python-script-here-s-the-10-point-checklist-that-got-it", "title": "I Shipped One Messy Python Script. Here's the 10-Point Checklist That Got It There.", "summary": "A developer outlines a 10-point checklist for shipping AI-generated Python scripts reliably, based on a typical messy script with hardcoded paths and bare exception handling. The checklist covers error handling, configuration management, input validation, logging, and testing, aiming to close the gap between 'runs on my machine' and production-ready code.", "body_md": "AI can generate a functional Python script in approximately ninety seconds. The script executes successfully, and the user proceeds to the next task.\n\nHowever, the script often persists beyond its initial use. It may include a hard-coded Downloads path or a comment such as # edit before running!!. Subsequently, someone might duplicate it as script_v2_FINAL.py. Weeks later, a teammate may attempt to use the script with a slightly different CSV file, resulting in a raw traceback or, more problematically, the script silently producing an incomplete file and exiting with a success code.\n\nThat gap between \"runs on my machine\" and \"I'd let someone else run this\" — that distance is the actual engineering work. And AI almost never closes it for you, because you never asked. (That's an observation from watching a lot of generated coThis post provides a detailed analysis of a deliberately typical, disorganized script. I will identify each defect and demonstrate the necessary modifications to prepare it for distribution. By the conclusion, readers will have a ten-point checklist applicable to their own code, requiring only Python and pytest.on your own code today, with zero tools beyond Python and pytest.\n\nThe \"before\" — a script that works exactly once\n\nThe following is the initial script. It aggregates expenses from a CSV file by category and flags entries exceeding a specified limit. Although it functions as intended, this apparent success can be misleading. (The script is abbreviated for brevity; the overall structure remains representative.)\n\nimport csv\n\nINPUT = \"/Users/me/Downloads/expenses.csv\" # <- edit this before running!!\n\nLIMIT = 500\n\nprint(\"starting...\")\n\nrows = []\n\ntry:\n\nf = open(INPUT)\n\nr = csv.reader(f)\n\nnext(r) # skip header\n\nfor row in r:\n\nrows.append(row)\n\nexcept:\n\nprint(\"error reading file\")\n\nfor row in rows:\n\ncat = row[1]\n\namt = float(row[2])\n\n# ...total by category, flag over LIMIT, print a report...\n\nout = open(\"report.txt\", \"w\")\n\nCount the landmines:\n\nA hardcoded home path (/Users/me/Downloads/expenses.csv) you must edit in source to use the tool.\n\nA bare except: that swallows every error — including bugs in your own code — and then keeps running on an empty rows list.\n\nPositional column access (row[1], row[2]) that explodes on a missing column and gives no clue which one.\n\nfloat(row[2]) that crashes the whole run on one malformed cell.\n\nprint() for everything — diagnostics, results, and the \"FLAGGED\" lines all jammed into stdout together.\n\nA non-atomic write (out = open(\"report.txt\", \"These issues are common and representative of typical AI-generated Python scripts. The following sections will address how to prepare such scripts for reliable distribution. --help.\n\nNone of this is exotic. This is what working AI-generated Python looks like. Let's ship it.\n\nThe checklist (run this on any script)\n\nThis is the evaluation rubric I apply before sharing code with others. Each criterion is scored from 0 to 2, for a maximum of twenty points. Apply strict standards; a 'partially' addressed item receives a score of 1. Each entry presents a Failure and corresponding Fix for efficient review.\n\nError handling — Failure: a bare except: (automatic 0), or raw tracebacks on bad input. Fix: wrap I/O, parsing, and network calls in specific exceptions; failures produce actionable messages and a nonzero exit code.\n\nSecrets & config — Failure: hardcoded keys, tokens, or home paths. Fix: config comes from arguments or env. Grep your own code for api_key =, password =, token =, sk-, Bearer, and absolute home paths before you ship.\n\nInputs & validation — Failure: the script assumes well-formed input. Fix: check every external input — what happens on an empty file? A missing column? a path with spaces?\n\nLogging & observability — Failure: print() for diagnostics, or total silence on failure. Fix: logging with levels; user output separated from debug noise.\n\nTests — Failure: none (most scripts). Fix: a pytest suite covering the happy path and at least three failure modes, with all tests running green.\n\nDependency hygiene — Failure: undeclared or unpinned deps, dead imports. Fix: declare dependencies with version bounds; imports match declared deps.\n\nInterface & UX — Failure: values you edit in source to run it. Fix: a real CLI (--help, exit codes) or a documented API.\n\nPackaging & install — Failure: \"clone it and run python script.py and hope.\" Fix: pip install . works, an entry point is defined, it runs from any directory.\n\nDocumentation — Failure: no runnable example. Fix: a README with one-line purpose, install, a copy-pasteable example, and expected output.\n\nPortability — Failure: open() without encoding= blows up on a non-UTF-8 file; OS-specific paths wiA critical rule for effective use of this checklist is that every finding must reference a specific file and line number. General statements such as 'improve error handling' are insufficient. Instead, precise observations like 'open() at line 14 crashes on a missing file' are required. Ambiguous findings are unlikely to be addressed.dling\" is banned. \"open() at line 14 crashes on a missing file\" is the standard. Vague findings never get fixed.\n\nNow the three fixes that matter most.\n\nFix 1: bare except → specific exceptions + real exit codes\n\nThe guiding principle is concise: always catch specific exceptions rather than broad ones, and handle exceptions at the program's boundaries rather than within core logic. Internal code should raise exceptions without handling them directly. Only the main() function should convert errors into user-facing messages and exit codes. Exit codes must be meaningful: 0 for success, 1 for runtime failure, and 2 for usage errors. Otherwise, shell pipelines and scheduled jobs that depend on the script may fail unpredictably.\n\nHere's the validation from the shipped core.py. Notice it names the missing column instead of dying on row[1]:\n\nclass AppError(Exception):\n\n\"\"\" Expected, user-facing failure. Message says what to do, not just what broke.\"\"\"\n\nREQUIRED_COLUMNS = (\"date\", \"category\", \"amount\")\n\ndef read_expenses(path: Path) -> list[dict[str, str]]:\n\nif not path.is_file():\n\nraise AppError(f\"input file not found: {path}\")\n\nwith open(path, newline=\"\", encoding=\"utf-8\") as f:\n\nreader = csv.DictReader(f)\n\nif reader. fieldnames is None:\n\nraise AppError(f\"{path} is empty — expected a header row\")\n\nmissing = [c for c in REQUIRED_COLUMNS if c not in reader.fieldnames]\n\nif missing:\n\nraise AppError(f\"{path} is missing required column(s): {', '.join(missing)}\")\n\nreturn list(reader)\n\nA bad row no longer kills the run — it's skipped with a warning and counted. And the top of main() does the translating:\n\ntry:\n\nrows = read_expenseImportantly, this approach catches only AppError exceptions. Unexpected exceptions are allowed to propagate, displaying a full traceback. Suppressing such errors makes diagnosis difficult, whereas explicit tracebacks provide actionable information. Avoid wrapping main() in a generic exception handler that outputs only a vague error message.aceback — because a swallowed crash is undiagnosable, while a loud one tells you exactly what to fix. Don't wrap main() in a catch-all that prints \"something went wrong.\"\n\nThat output write also became atomic — write to a .tmp file, then rename it into place — so a crash never leaves a cA common misconception is to conflate program output with logging. If the script is intended to print a report to standard output, use print() for that purpose. Logging, which provides diagnostic information, should be directed to standard error. Mixing these streams can disrupt output redirection for users.t goes to stderr. Mix them, then run script.py > out.csv for every user.\n\nSo the \"starting...\" and \"read N rows\" lines became leveled logs to stderr, wired to a verbosity flag — quiet by default, -v for milestones, -vv for debug:\n\ndef setup_logging(verbosity: int = 0) -> None:\n\nlevel = [logging.WARNING, logging.INFO, logging.DEBUG][min(verbosity, 2)]\n\nlogging.basicConfig(level=level, handlers=[logging.StreamHandler(sys.stderr)])\n\nMaintain discipline in logging levels: detailed information for each row should be logged at the DEBUG level; summary actions such as 'wrote report.csv' should use INFO; and warnings like 'skipped 3 rows with missing amount' should use WARNING. INFO-level messages should occur only a constant number of times per execution, never within per-row loops. Additionally, use the lazy formatting style (log.debug(\"row %d amount=%s\", i, amt)) to avoid unnecessary string formatting when the log level is not active.\n\nFix 3: loose script → installable, tested CLI\n\nThe hardcoded INPUT path was replaced by an argparse CLI with --help, --limit, --output, --force, -v, and --version. The loose file became a src/ layout package with a pyproject. toml declaring an entry point:\n\n[project.scripts]\n\nexpense-report = \"expense_report.cli:main\"\n\nThat's what turns \"clone it and run python script.py and hope\" into a real command. And the pytest suite covers the happy path plus the failure modes the original couldn't survive: missing file, empty file, missing column, non-numeric amount (skipped, not fatal), and refusing to overwrite an existing report without --force.\n\nI verified the whole chain in a fresh virtual environment before writing this:\n\npython3 -m venv .venv && .venv/bin/pip install \".[dev]\"\n\n.venv/bin/python -m pytest # 16 passed\n\n.venv/bin/expense-report --help\n\nAll sixteen tests pass, the installation process is clean, and the script executes correctly from any directory. Demonstrating evidence of reliability. The checklist provided is intended for your use, and the preceding example demonstrates the complete methodology. Evaluate one of your completed scripts against the ten categories, address the most critical issues first, and verify each fix by reproducing the corresponding failure. The determination of whether a script 'needs work' is subjective, but scripts with low scores in error handling, testing, and packaging should not be distributed. A low score in error handling, tests, and packaging is one you shouldn't hand to anyone yet.\n\nIf you'd rather your AI assistant enforce this loop instead of doing it by hand, I packaged the discipline as eight Claude Code skills — the scored audit (ship-check), plus harden-errors, add-logging, make-cli, add-tests, package-it, write-readme, and a release-prep gate — along with the full before/after sample project you just read through. Full disclosure: I built it, and it's a paid kit ($19) at jackiecole.gumroad.com/l/lcscdf. The checklist in this post stands on its own either way.", "url": "https://wpnews.pro/news/i-shipped-one-messy-python-script-here-s-the-10-point-checklist-that-got-it", "canonical_source": "https://dev.to/2glitch/i-shipped-one-messy-python-script-heres-the-10-point-checklist-that-got-it-there-58jo", "published_at": "2026-06-13 11:21:55+00:00", "updated_at": "2026-06-13 11:47:28.847953+00:00", "lang": "en", "topics": ["developer-tools", "artificial-intelligence", "ai-tools"], "entities": ["Python", "pytest"], "alternates": {"html": "https://wpnews.pro/news/i-shipped-one-messy-python-script-here-s-the-10-point-checklist-that-got-it", "markdown": "https://wpnews.pro/news/i-shipped-one-messy-python-script-here-s-the-10-point-checklist-that-got-it.md", "text": "https://wpnews.pro/news/i-shipped-one-messy-python-script-here-s-the-10-point-checklist-that-got-it.txt", "jsonld": "https://wpnews.pro/news/i-shipped-one-messy-python-script-here-s-the-10-point-checklist-that-got-it.jsonld"}}