{"slug": "ai-made-development-faster-testing-needs-to-stop-living-in-spreadsheets", "title": "AI Made Development Faster. Testing Needs to Stop Living in Spreadsheets.", "summary": "A developer built testboat, a structured testing system that treats test artifacts like code, to address the bottleneck of proving what was tested and whether a release is safe as AI agents accelerate development. The tool creates a .testboat directory with connected YAML files for requirements, test cases, execution plans, bugs, and reports, enabling teams to query release evidence instead of relying on memory.", "body_md": "AI agents are making software development faster.\n\nThat is great.\n\nBut there is a problem I do not think we are talking about enough:\n\n**testing is not speeding up in the same way.**\n\nIn many teams, testing is still held together by spreadsheets, meeting notes, screenshots, chat messages, and the memory of a few experienced QA engineers.\n\nThat worked when delivery was slower.\n\nIt becomes fragile when one developer can use multiple agents to change code across several modules in a single afternoon.\n\nThe bottleneck is no longer \"can we write more test cases?\"\n\nThe bottleneck is:\n\nCan the team prove what was tested, why it was tested, what failed, what was fixed, and whether the release is safe?\n\nThat is the problem I built `testboat` for.\n\nThe sentence I worry about most is not:\n\nWe did not test this.\n\nAt least that is honest.\n\nThe dangerous sentence is:\n\nI think we tested this.\n\nThat sentence usually means the team has test artifacts, but they are disconnected:\n\nEach piece may be useful on its own.\n\nBut when a Tech Lead asks, \"Which requirements are not covered?\" or a founder asks, \"Can we release today?\", the team has to reconstruct the answer manually.\n\nThat is not a testing process.\n\nThat is institutional memory under pressure.\n\nAI agents are very good at increasing throughput.\n\nThey can:\n\nBut faster change 
creates more testing uncertainty.\n\nIf an agent changes the authentication module, what should be rerun?\n\nIf a test fails, is it a product bug, a flaky automation script, or an environment issue?\n\nIf a developer says \"fixed\", has the failed test actually been rerun?\n\nIf a release report says \"main flows passed\", where is the evidence?\n\nWithout a structured system, QA becomes the human buffer. Tech Leads become risk translators. Founders buy uncertainty with every release.\n\nThat is not sustainable.\n\n`testboat` treats test artifacts like code.\n\nIt creates a `.testboat/` directory in your project:\n\n```\n.testboat/\n  .active\n  draft/\n    strategy.yaml\n    tags.yaml\n    cases/\n      TC-001.yaml\n    bugs/\n      BUG-001.yaml\n    executions/\n      plans/\n      results/\n      execution-matrix.yaml\n      automate/\n    reports/\n```\n\nThe important part is not \"YAML is nice.\"\n\nThe important part is **connection**.\n\nA requirement connects to a test case through `req_id`.\n\nA test case connects to an execution plan.\n\nAn execution plan connects to an automation script.\n\nA result connects back to the test case.\n\nA bug can connect to both the test case and the failing result.\n\nThe latest execution state is summarized in an execution matrix.\n\nReports are generated from the same artifacts, not written from memory.\n\nThat changes the conversation.\n\nInstead of asking:\n\nDid we test login?\n\nYou can ask:\n\nShow me every auth test case, its latest result, open bugs, and whether the release exit criteria passed.\n\nQA should not have to be the team's memory database.\n\nWith `testboat`, a test case is a structured file:\n\n```\nid: TC-001\ntitle: Login with wrong password returns 401\nstatus: ready\npriority: P1\nautomation: to-automate\ntags:\n  sprint: v1.0\n  type: functional\n  module: auth\nreq_id: STORY-001\nsteps:\n  - action: Enter wrong password\n    expected: API returns 401\nexpected_result: User sees a 
clear error message\n```\n\nIt is diffable.\n\nIt is reviewable.\n\nIt has a state:\n\n```\ndraft -> ready -> pass / fail / blocked / skipped\n```\n\nThat means QA can maintain testing facts instead of constantly answering questions from memory.\n\nTech Leads need quality gates, not just good intentions.\n\n`testboat validate` runs pre-report checks:\n\nThat last part matters.\n\nYour `strategy.yaml` can define severity rules and exit criteria. For example, P0 and P1 bugs must be zero before release.\n\nSo the report is not just a nice HTML page.\n\nIt is generated after the system checks whether the release evidence is healthy enough.\n\nThis is the kind of thing that can eventually belong in CI.\n\nFounders do not need to read every test case.\n\nBut they do need release confidence.\n\n\"Main flows passed\" is not enough.\n\nThe useful questions are:\n\n`testboat` generates strategy, sprint, and closure reports from the actual test artifacts.\n\nThat gives leadership evidence instead of vibes.\n\nThe goal is not to replace QA.\n\nThe goal is to give AI agents a testing workflow they can follow.\n\n`testboat enable` creates agent-specific instructions for tools like Claude, Copilot, Cursor, Kiro, and others.\n\nAn agent can then follow a repeatable SOP:\n\nThat is the difference between \"AI wrote some tests\" and \"AI participated in the testing lifecycle.\"\n\nIf the auth module changed, you should not ask:\n\nCan someone test login?\n\nYou should be able to do this:\n\n```\ntestboat case list --module auth\ntestboat matrix show\n```\n\nThen rerun the affected tests and record results:\n\n```\ntestboat result record TC-001 pass --type automated --by \"AI\"\n```\n\nIf a failure appears:\n\n```\ntestboat bug add \\\n  --title \"Wrong password returns 500 instead of 401\" \\\n  --tc TC-001 \\\n  --severity major \\\n  --priority P1\n```\n\nAnd after the fix, the bug should not jump straight to \"closed.\"\n\nIt should move through retest:\n\n```\nfixed -> pending-retest -> verified -> closed\n```\n\nThat is the loop teams need when development is moving faster.\n\nAI is making code cheaper to produce.\n\nThat does not automatically make releases safer.\n\nIf anything, it makes weak testing systems more visible.\n\nThe next layer of AI engineering is not just faster code generation.\n\nIt is turning the surrounding engineering practices into systems that agents can participate in.\n\nTesting is one of those practices.\n\nThat is why I built `testboat`.\n\nNot to generate more test cases.\n\nTo make testing traceable, reviewable, versioned, validated, and reportable.\n\n```\npip install testboat\ntestboat init\ntestboat enable cursor\ntestboat strategy create\n```\n\nProject:\n\n[https://github.com/lijma/testboat](https://github.com/lijma/testboat)\n\nDocs:\n\n[https://lijma.github.io/testboat/](https://lijma.github.io/testboat/)\n\nHow does your team know a release is actually ready?\n\nIs that answer stored in a system, or mostly in people's heads?", "url": "https://wpnews.pro/news/ai-made-development-faster-testing-needs-to-stop-living-in-spreadsheets", "canonical_source": "https://dev.to/marvin_ma_597e184518c2221/ai-made-development-faster-testing-needs-to-stop-living-in-spreadsheets-4ap0", "published_at": "2026-06-17 03:29:24+00:00", "updated_at": "2026-06-17 03:51:38.756448+00:00", "lang": "en", "topics": ["developer-tools", "ai-agents"], "entities": ["testboat", "QA"], "alternates": {"html": "https://wpnews.pro/news/ai-made-development-faster-testing-needs-to-stop-living-in-spreadsheets", "markdown": "https://wpnews.pro/news/ai-made-development-faster-testing-needs-to-stop-living-in-spreadsheets.md", "text": "https://wpnews.pro/news/ai-made-development-faster-testing-needs-to-stop-living-in-spreadsheets.txt", "jsonld": "https://wpnews.pro/news/ai-made-development-faster-testing-needs-to-stop-living-in-spreadsheets.jsonld"}}