AI Readiness Radar

A new AI readiness radar framework warns that AI coding tools will amplify existing engineering dysfunctions rather than fix them, urging teams to assess their foundations in codified standards, feedback loops, and trusted reviews before adopting AI agents.

Hot take: AI isn’t going to save your dysfunctional engineering team… it’s going to expose it. Rather than being a silver bullet that fixes weak engineering practices, it’s going to accelerate what is already there, both the good and the bad. In a mature engineering culture it can help with shipping safely at pace, but for an immature team it’s going to amplify dysfunction, meaning you ship more of the wrong thing faster (AI-generated slop at scale). The teams that benefit most from AI development workflows aren’t the ones adopting the tooling the quickest, they’re the ones who have the foundations in place that allow it to work.

So before you or your team jumps straight into picking up AI coding agents, it’s worth actually asking “are we really ready for AI?”

That’s what my AI readiness radar is for. I’ve created this to help teams self-assess their foundations so that they can know whether it’s the right time to add AI or if it’ll just cause a mess.

The five AI readiness signals

1. Codified standards

Has the team expressed what good and good enough looks like in the form of meaningful documentation, rather than it living in people’s heads or by convention?

- Definition of done
- Engineering guidelines
- Non-functional requirements
- Security standards
- Quality statements and philosophies
- Coding practices and standards
- Testing best practices
- System documentation
- Architectural decision documentation
- Product direction and market fit documentation
- User stories

AI needs context in order to produce useful outputs. Without documented standards it has nothing to base its work on, and the code and product it generates drift away from what you intended. In practice that means teams end up hand-holding the AI through every single step to compensate for that, slowing them down massively.
The goal here isn’t documents for their own sake, but to capture the team’s view of what good looks like well enough that it can be consistently applied by a person or an AI agent.

2. Feedback loops

Has the development pipeline got meaningful tests and observability throughout that catch the failures and regressions that AI development introduces?

- Shift-left testing (risk analysis and testing of ideas, user stories or specs)
- Inner loop (short feedback loop tests during development)
- Outer loop (deployed environment and integrated system tests)
- The right tests and coverage throughout the stack
- Shift-right testing (observability and customer feedback tests)
- Tests that are trusted and not flaky
- “You break it, you fix it” team mentality for managing tests
- Automated over manual testing, running in a pipeline
- Exploratory testing
- Non-functional testing
- Snapshot and UX tests
- KPI and DORA metrics

Faster development only helps when you know when something is going wrong. Without meaningful feedback loops, faster development can easily introduce failures like regressions, bad behaviour changes and unmaintainable code patterns in a way that’s expensive to fix. With AI this can happen at pace, causing problems too quickly for anybody to reasonably spot and remediate manually, which is why trusted and deep testing coverage is needed.

3. Trusted reviews

Do teams have the capability and capacity to meaningfully review the outputs of software development, rather than just rubber-stamping things?
- Mature Pull/Merge request reviews
- Engineering capability, knowing what good looks like
- Agreed standards and guidelines for coding and engineering
- Product team reviews against company needs
- Quality evaluations (validation and verification)
- Short review lead times and turnaround time
- Low-ego review processes (what works rather than personal preferences)
- Team pragmatism to know what good enough is, rather than gold plating

Human review becomes the central control in your AI development process; without it there’s no governance on whether what’s being shipped is good, and teams start to lose an understanding of their own code and product. Reviews that are slow, inconsistently applied or ego-driven create a bottleneck that cancels out any speed gains from adding AI. From what I’ve seen, the quality of human-in-the-loop reviews matters as much as their existence, because a rubber stamp can be pretty much useless for quality when working at pace.

4. Quality ownership

Has the team taken on quality as a whole team concern, or is there siloed thinking where quality is thrown over the wall for someone else to think about?

- Quality engineering practices
- Engineers owning their own testing
- No trade offs to other teams
- Quality capability, knowing what needs testing
- Short turnaround time on fixes
- Psychological safety to raise and own quality issues
- Pragmatism to avoid gold plating
- Team-owned test environments

AI development really does remove the traditional QA phase and gatekeeping. If development teams are treating quality as someone else’s problem and throwing things over the wall, then there’s no safety net baked in and issues will reach production faster and in greater volume. I’ve worked with teams who only spot issues when something goes wrong in front of the customer, and that won’t cut it with the faster pace and volumes we get with AI development.
Teams need enough quality capability and ownership distributed through the development lifecycle to build the right quality guardrails in from the start, not wait to be told something went wrong at the end.

5. Context & understanding

Does the team actually understand what good enough means and looks like for their product, market fit, code and infrastructure to be able to work as a product development house, or do they blindly work as a feature factory?

- Team mission statements
- Team consensus on what’s important
- Domain and market fit knowledge
- Whole team maturity in identifying product needs
- Team can explain why things are being built
- Pragmatism around what good enough means
- High trust around shipping, knowing it’ll be right
- Stakeholder alignment
- Product fit to sales, branding and marketing is understood by the team
- Understanding the customer and user personas

Without a shared understanding of the product, its customer base and what “good enough” looks like, AI can accelerate output without creating end user value. Teams end up shipping faster but without any judgement of whether what they’re shipping was the right thing for the team, product or the company. In my experience this is one of the hardest signals to score, because teams often think they have enough context until they’re deeply questioned on what good enough really means across all quality dimensions. Context is what allows a team to use AI deliberately, not just widely.

How to assess AI readiness with a radar

Using a radar helps teams visualise and easily spot gaps in their foundational maturity, which lets them start conversations and create plans to build the capability that’ll support adopting AI development. Each of the foundational signals is mapped onto an axis across four maturity bands:

Codified standards
No maturity (nothing happening) – Nothing is written down; “good” lives in individuals’ heads and is discovered only when someone gets it wrong.
Low (getting started) – Some docs exist (a README, a DoD) but they’re stale, scattered, or contradicted by how the team actually works.
Moderate (getting there) – Most standards (DoD, coding/testing practices, NFRs, key architecture decisions) are documented, current, and referenced in reviews.
High (living and breathing) – Standards are living, version-controlled, owned, routinely updated, and an agent could be pointed at them to produce on-standard work.

Feedback loops
No maturity (nothing happening) – Failures are found in production or by users; there’s no reliable test signal during development.
Low (getting started) – Tests exist but are patchy, slow, or flaky enough that people ignore or skip them; observability is minimal.
Moderate (getting there) – Trusted automated tests run across inner and outer loops with reasonable coverage, plus some production observability.
High (living and breathing) – Fast, trusted feedback spans shift-left through shift-right (specs to observability), failures surface in minutes, and the team fixes flakes as a matter of course.

Trusted reviews
No maturity (nothing happening) – Reviews are absent or pure rubber-stamping; merges happen on trust or speed alone.
Low (getting started) – Reviews happen but are inconsistent, ego-driven, or bottlenecked by a single gatekeeper and long lead times.
Moderate (getting there) – Reviews reliably catch real issues against agreed standards, with reasonable turnaround and low-ego norms.
High (living and breathing) – Reviews are fast, pragmatic and standard-anchored, and the team trusts them enough to scale review throughput, which is the binding constraint on AI-paced output.

Quality ownership
No maturity (nothing happening) – Quality is someone else’s job, thrown over the wall; engineers don’t test their own work.
Low (getting started) – Some engineers own quality but it’s inconsistent, and there’s still a fallback assumption that QA will catch it.
Moderate (getting there) – Quality is a whole team concern; engineers own their testing and there’s psychological safety to raise issues.
High (living and breathing) – Quality is built in by default, the team knows what guardrails to embed into agents and skills, and no gatekeeper safety net is needed.

Context & understanding
No maturity (nothing happening) – The team builds what it’s told with no grasp of product, market fit, or why… a pure feature factory.
Low (getting started) – Some people understand the “why,” but it’s uneven and the team can’t consistently judge “good enough”.
Moderate (getting there) – Most of the team understands product direction, customers, and trade-offs, and can explain why things are built.
High (living and breathing) – The whole team shares domain and market-fit understanding, aligns with stakeholders, and makes deliberate trade-offs on what’s worth shipping.

I’ve put together some example archetypes to show how the radar can be used to visualise the readiness of different types of team. These are based on the different types of teams that I’ve worked with in the past:

A. Startup engineering team

Technically proficient with high engineering capability and decent feedback loops. These teams may miss deeper product context in favour of putting things out there to see what works, and likely haven’t documented or formalised any standards.

To address their gaps to support AI they would need to think about:

- Documenting standards
- Documenting engineering guidelines
- Documenting product and market fit

B. Feature factory

Usually found in big organisations (like banks), the engineers in these teams are removed from decision making and just do what they’re told without being involved in the “what” of the product.
To address their gaps to support AI they would need to think about:

- Building in ownership of quality rather than throwing it over the wall
- Documenting product and market fit
- Increasing testing capabilities within the team to cover more quality attributes

C. Solo or rockstar developer

An individual or small team with a lot of context but who doesn’t have to work with others. They usually don’t have formalised ways of working, collaboration techniques like review processes, or formalised testing.

To address their gaps to support AI they would need to think about:

- Understanding gaps in knowledge about holistic product design and quality
- Building in engineering standards and practices
- Building formalised quality engineering practices

D. Mature product development team

A team in a mature product development house, likely cross-functional with high autonomy, that has had coaching on specialist topics (like quality and security). They likely have most of the foundations in place and only need to expand on their existing capabilities to make them work well with AI:

- Spot gaps in documentation to fill in
- Extend existing testing to increase coverage
- Write down the common sense stuff “everybody knows”

Engineering team AI maturity self assessment

Teams can use the radar to run a team self assessment that provides a realistic and honest view of where they are with the foundational signals. The aim of this is not to point fingers or blame the team or individuals, but to create a view of what needs to be put into place or improved to allow AI development to be a success.

A self assessment could look like:

- Introduce the session and its purpose, and highlight psychological safety so that people provide honest feedback.
- Talk through the radar and each of the signals, providing examples to show what each of them means.
- Invite team members to individually score the team against each of the signals (not themselves, but team maturity).
- Plot each team member’s scores on the radar and discuss any differences in scoring per signal.
- Agree a consensus team score (if agreement cannot be reached, I’d recommend going for the lowest score).

From here the team can locate the weakest signals and start working to fix them before moving into AI usage. Note that readiness is needed across all signals, not just one or a few of them. Teams should also reassess their readiness periodically, to ensure that new gaps haven’t appeared and that existing gaps have been fixed.

A note on self assessment: many engineering teams don’t know what they don’t know (see my talk on using Quality Radars https://www.youtube.com/watch?v=lFbFKEwlK9w&list=PLKBhokJ0qd3 Qms3DloAbdq0zTGLQ0pFE&t=1s for more on this), so a self assessment may overstate how prepared the team is. Teams should be open to learning more about the foundations of software development, especially quality, or bring in a coach or expert to help with their assessment in order to gain a realistic picture.

Continued assessment

So how does the radar sit alongside TMMi, DORA or Team Topologies? It doesn’t compete with them, it sits underneath as a conversation tool. It can give you an idea of whether your foundations are solid enough to start using AI in your development process, but it won’t tell you whether the AI you add is actually working. That’s what DORA metrics and developer experience signals are for.

Once you’re accelerating your development using AI, you have to actually measure whether AI usage is genuinely improving delivery or creating slop and making your product worse. If those metrics show things getting worse, then that’s your cue to come back to the radar, re-assess your readiness and fix the foundations that buckle under the pace… rather than blaming the tooling.

Thanks for taking the time to read

If you found this helpful and would like to learn more, be sure to check out my other posts on the blog.
You can also connect with me on LinkedIn for additional content, updates and discussions; I’d love to hear your thoughts and continue the conversation.
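
For teams that would rather tally the self assessment in a script than on a whiteboard, here is a minimal sketch of the scoring steps described earlier. The 0 to 3 numbering of the four maturity bands, the function names and the sample votes are all assumptions made for illustration; the radar itself doesn’t prescribe any of them. The only rule taken from the post is falling back to the lowest score per signal when the team can’t agree.

```python
# Band labels come from the radar; mapping them to 0-3 is an assumption.
BANDS = ["No maturity", "Low", "Moderate", "High"]

SIGNALS = [
    "Codified standards",
    "Feedback loops",
    "Trusted reviews",
    "Quality ownership",
    "Context & understanding",
]


def team_scores(votes):
    """Collapse individual votes into one score per signal.

    Uses the lowest vote per signal, which is the suggested fallback
    when the team cannot reach consensus in discussion.
    """
    return {s: min(person[s] for person in votes.values()) for s in SIGNALS}


def score_spread(votes):
    """Return highest minus lowest vote per signal, to flag disagreements."""
    spread = {}
    for s in SIGNALS:
        values = [person[s] for person in votes.values()]
        spread[s] = max(values) - min(values)
    return spread


def weakest_signals(team):
    """Return the signal(s) sitting in the lowest band."""
    lowest = min(team.values())
    return [s for s in SIGNALS if team[s] == lowest]


# Hypothetical votes from three team members (each scoring the team, not themselves).
votes = {
    "engineer_a": {"Codified standards": 1, "Feedback loops": 2, "Trusted reviews": 2,
                   "Quality ownership": 2, "Context & understanding": 1},
    "engineer_b": {"Codified standards": 0, "Feedback loops": 2, "Trusted reviews": 3,
                   "Quality ownership": 2, "Context & understanding": 2},
    "tester_c": {"Codified standards": 1, "Feedback loops": 1, "Trusted reviews": 1,
                 "Quality ownership": 1, "Context & understanding": 2},
}

if __name__ == "__main__":
    team = team_scores(votes)
    spread = score_spread(votes)
    for s in SIGNALS:
        print(f"{s}: {BANDS[team[s]]} (vote spread {spread[s]})")
    print("Fix first:", ", ".join(weakest_signals(team)))
```

A large spread on a signal is a prompt for conversation rather than a number to average away, which is why the sketch reports it separately from the team score.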