Designing with Mustard

A designer who previously satirized tech hype by declaring ketchup the ultimate design tool is now warning that the same absurdity is being repeated with artificial intelligence. The author argues that tools like AI for prototyping and "vibe coding" echo the same empty promises, and that the fundamental principle remains unchanged: prototypes are questions, not products, and the tool must fit the specific question being asked about role, look and feel, or implementation.

Designing with mustard is the future of design. I used to think it was ketchup, but I was wrong. If you're not currently designing with mustard, you should get out of design and tech. The future of design is mustard. Adapt or perish.

…

Just kidding. It would be wild if I actually said something like that, right? Yet here we are, in the middle of constant messaging just like this but with AI instead of mustard.

A little context: a few years ago, I wrote a piece about designing with ketchup (https://annaecook.com/writing/2021/10/27/designing-with-ketchup). It started as a joke. Someone called Webflow the "ultimate" design tool and I responded by making wireframes with a bottle of Heinz, proclaiming that ketchup was the ultimate design tool. As one does.

It was ridiculous, but many of you remind me about it, years later. I think it captured a feeling many of us deal with in the tech space. There's some real absurdity here, bold claims without evidence, and utterly ridiculous people selling slop. If they're going to be ridiculous, why can't we be too?

Beyond satire, the article went into something that is fundamental to how we build: Tools don't define design. You can use anything (Figma, Webflow, PowerPoint, paper, code, even ketchup) if it helps you think through a problem and communicate a solution.

Apparently, we need to talk about this again. Just with a different condiment.

Lately, I've been seeing a lot of posts about using AI for design, coding, prototyping, etc. Or vibe coding (https://en.wikipedia.org/wiki/Vibe_coding). Or whatever else we’re calling it this week. The pitch is straightforward: describe what you want, generate something that looks like a product, iterate from there. You don't really need to wireframe. You don't need to think through flows the way you used to. You can just…start generating.

I get why that's appealing if you haven't done this kind of work before.
If you're used to looking at someone else's prototypes and reacting to them rather than building them yourself.

AI... err, I mean, mustard is spicier than ketchup. Sharper. But underneath, it's the same trick: squeeze something onto the canvas and call the meal made.

Why do we prototype?

Before going further, I want to slow down on something this conversation sometimes skips. Prototypes aren't products. Or demos (https://productpicnic.beehiiv.com/p/designers-will-never-have-influence-without-understanding-how-organizations-learn). A prototype is a question, something you make in order to learn something specific you couldn't learn any other way.

Houde and Hill (https://hci.stanford.edu/courses/cs247/2012/readings/WhatDoPrototypesPrototype.pdf) said this back in 1997, in a paper I keep coming back to. They organize prototypes into three kinds of questions:

- Role: what is this thing for? How does it fit in someone's life?
- Look and feel: what is it like to experience? What does it look like, sound like, feel like to use?
- Implementation: how does it actually work? What's the structure underneath?

Different prototypes ask different questions and seek different answers. A storyboard might ask about role. A high-fidelity mock asks about look and feel. A coded prototype asks about implementation. There's overlap in all of these spaces, which is part of the reason why people have always struggled so much with job titles and responsibilities in product spaces.

Regardless of your job title, the question you're asking determines what your prototype needs to be true to. Your prototype might skimp on details, and that's okay as long as it isn't being shipped as is. A role prototype might live in a PowerPoint. A look-and-feel prototype might live in Figma. An implementation prototype might live in Storybook.

So when we talk about AI prototyping, the actual question to ask is: which of these questions do we need to answer, and is this tool the best fit for it?

Role

In Houde and Hill's framing, role is the function a thing serves in someone's life. A role question asks: What does this do for them? Why does it matter? Where does it fit into their day, their goals, the tools they already use? The answer doesn't live in an artifact. It lives in the person and their context.

That's where AI runs into trouble. It's trained on artifacts, not on the lives those artifacts were made for. It can generate something that looks like an answer because it's seen thousands of them. What it can't do is verify whether this answer fits this person: your user, your context, the people you're at risk of excluding. It also can’t challenge design biases because it’s built on them. Yes, it’s a possible answer to a question, but a probable output isn't the same as a correct one.

What AI can do is generate something quick to react to while the team does the actual figuring-out together. The artifact isn't answering the role question. It's giving the team a shared object to think against. That's the legitimate use, though less effective. Why? Because we keep letting the artifact stand in for the work (https://productpicnic.beehiiv.com/p/vibe-prototyping-isn-t-solving-any-problems-but-it-s-creating-many-new-ones).

Getting answers to our prototypes' questions needs people who aren't in the room. It takes feedback, relationships, co-design, and research (https://productpicnic.beehiiv.com/p/ux-works-through-social-relationships-ai-tools-are-erasing-them). Otherwise, we just generated something without any role or purpose…slop.

Look and feel

In Houde and Hill's framing, look and feel is the concrete sensory experience of using something: what you look at, feel, and hear while you use it.

Look and feel prototypes are where AI prototyping seems most useful, but that is also where its limits are hidden best. AI is good at generating a lot of visuals and commonly designed interactions quickly. A layout. A palette. A few screens to react against.
A rough component idea or a snippet of code we can adapt to a context. Essentially, it's the Stack Overflow of 2026. Which means we should not trust it, only react to it.

This is also why people think AI is a qualified designer. People who are not trained as designers are used to reacting to look and feel. They receive a prototype, don't dig past the surface level, and approve with some feedback like "make it pop."

What AI is not good at is the rest of what look and feel actually covers. This is in part because "look and feel" has always been a half-true framing. It tends to focus more on some mechanisms of how we interact and not others. "Look and feel" is not just the visuals. It is also the sounds, the motion, the haptics, the response timing, the interaction patterns as a whole.

A blind user's look and feel might be auditory, or tactile, or rendered in high contrast. A keyboard user's might be the rhythm of tab order, browse modes, and focus states. A user with a tremor might experience it through their hands, in tap targets and pointers. A user with neurodivergence might experience it through pacing, language, and predictability.

AI's "default" look and feel is screen-shaped, sighted, mouse-and-keyboard. It's inherently biased and limiting. It optimizes for what fits in a screenshot, because that's the data shape it was trained on. It was trained on the gaps we avoided accounting for. The slice it generates from is real, but partial, and the parts it doesn't generate from are the parts most often left underdesigned. Even within the visual slice, "looks designed" isn't the same as "designed for this."

These are the gaps many designers have learned how to fill through deeper systems and infrastructure work, but that work has rarely been appreciated by those who now hype AI as the death of design.

Gen AI can be used as a sketch, as long as you know what you're using it as and you understand its limits, costs, and impacts.
The risk is letting it convince you the work is accurate, complete, or unbiased.

Models only know averages. And no human is average (https://www.thestar.com/news/insight/when-u-s-air-force-discovered-the-flaw-of-averages/article_e3231734-e5da-5bf5-9496-a34e52d60bd9.html). Design is the edges. It's the human reasoning, strategy, empathy, and collaboration required to think within and beyond our own lived experiences. AI cannot replace that.

Implementation

In Houde and Hill's framing, implementation is the techniques and components through which a thing performs its function. It's the nuts and bolts that hold the machine together.

Implementation is, fundamentally, a question about whether the system holds together. This requires at least some degree of logic and semantics across systems to function. Connecting the pieces together to make things work means there has to be consistency, frameworks, and coherence. And this is where we've seen AI's failures most clearly (https://microsoft.github.io/a11y-llm-eval-report/).

AI often outputs something that looks like it works without it actually working. When using AI for code, what we get back usually looks like it probably could be structure. It isn't. In reality, it's a collection of elements arranged in a way that resembles one. Layout. Hierarchy. Familiar components doing familiar things.

The moment we start asking basic questions, it falls apart: Why is this here? What happens if this changes? How does this connect to anything else? What is being used to present this? Is it semantically sound, or is it <div> soup? Is it inventing unnecessary IDs and classes? Is it connecting them to the right places? Is it accessible? Does it over-rely on ARIA (https://www.smashingmagazine.com/2025/06/what-i-wish-someone-told-me-aria/)?

There are usually no clear answers, because the model wasn't reasoning about whether the system holds together.
It was predicting what's likely to come next, given what came before (https://en.wikipedia.org/wiki/Stochastic_parrot). Each fragment looks reasonable on its own. Whether the fragments actually fit together as a working system isn't a question it can verify. It's not that the tool made bad decisions. It's that it didn't make any.

The most frustrating part is what happens when we try to work with it. We ask it to fix something small. Something small breaks. We fix that. Something else breaks. A layout shifts in a way we didn't ask for. A component fails somewhere unrelated. Something that felt stable a moment ago just…collapses.

At a certain point, it stops being iteration and starts being damage control. Not because the tool is unreliable in some random way, but because there was never a structure holding it together. You can't iterate on integrity that was never there.

As an accessibility advocate, I have significant concerns about introducing another prototyping tool that puts form over function. What's the point of a tool that produces the same outputs we already produce, just with the structural debt distributed differently? The only thing that shifts is where we spend time and money. Instead of shipping with intention, we’re shipping 22.5% more elements per page (https://webaim.org/projects/million/) in a single year, with logic and semantics that are haphazard. That is implementation that is, without exaggeration, doomed to fail in the long term.

Friction is the work

One thing about designing that I didn't fully appreciate earlier in my career: the act of making something is how you figure out what it is. It’s the process of taking role, look and feel, and implementation questions and finding answers until you integrate them into one cohesive solution.

You don't start with that clarity.
You get there by:

- Coming up with ideas
- Discussing ideas with others
- Eating lunch
- Sketching something incomplete
- Noticing what's missing
- Gathering insights from data
- Going to bed
- Adjusting, iterating
- Asking someone what's missing
- Trying again
- Looping through it again…and again…and again

That process forces you to confront what you don't understand and work with others to reduce your biases. AI skips that part. It hands you something that looks complete enough to move forward without ever really figuring out what you're building.

That's the part that concerns me. The "friction" in design processes isn't a problem to eliminate. It's where the thinking happens. It's how we vet ideas with lower risk. AI can't design that way. But you can.

When AI fails, we take the blame.

There's also a professional issue here: If I use AI and AI makes a mistake, AI can't be held accountable for it. Only I can. AI is not only incapable of doing many things effectively, it's also incapable of taking accountability for failures.

AI has no skin in the game. No license to defend. No reputation that suffers when its work fails. When something it produces hurts a user or violates a regulation, the model isn't at the table. You are.

Look at what happened when Grok went on antisemitic rants in July 2025, calling itself "MechaHitler" (https://techcrunch.com/2025/07/12/xai-and-grok-apologize-for-horrific-behavior/) and offering instructions for sexual assault. xAI "apologized" on its behalf. They blamed "a bug" and "deprecated code." The model lost nothing. X CEO Linda Yaccarino, a human, resigned. The AI got patched and went back to work.

Many of us don't have the luxury of experimenting freely. Shipping low-quality or unstable outputs doesn't just affect the product. It affects our credibility, our teams, and sometimes our jobs. Using AI without verification isn't just a design or engineering risk. It's a professional one.
When the screen reader test fails on the AI-generated component, AI isn't in the meeting. When laws are broken, AI isn't named as defendant. When a user gets harmed by AI making up sources, AI isn't the one who apologizes. When AI fails, we pay the price.

That's the part of the "use AI or else" equation that doesn't make it into the pitch: the accountability has to land somewhere. It lands with us.

“Use AI or else.”

Earlier I joked about how "The future is mustard. Adapt or perish." That was a satirical remark, but not far from the reality we currently live in.

What is most remarkable about the AI shift to me is not the tool itself but the rhetoric. Tech "thought leaders" and execs have long had a penchant for hyping and overstating capabilities. What feels different now is that the AI shift has been posed as a threat rather than a promise.

The framing isn't "this could help you." It's "use this or you're done." Designers and engineers are being told that if we don't adopt AI, we won't have careers (https://www.businessinsider.com/amazon-jassy-ceos-warning-workers-ai-bring-massive-change-2025-6). We'll be replaced. We should find another profession.

This is new. Previous tech waves came with hype (mobile-first, cloud-everything, blockchain, the metaverse lol) but usually they invited adoption. They didn't issue ultimatums.

The shift from invitation to coercion isn't a coincidence. Threats benefit the people doing the threatening (https://www.cnbc.com/2026/04/24/20k-job-cuts-at-meta-microsoft-raise-concern-of-ai-labor-crisis-.html): investors who need the inevitability narrative to justify the spend, executives whose performance depends on the AI bet paying off, vendors who need adoption to survive. They don't benefit the people being told to comply. They certainly don't benefit users.

What threats accomplish is the suppression of judgment (https://productpicnic.beehiiv.com/p/ai-mandates-are-a-demand-for-cognitive-surrender).
A person told "use this tool or quit" doesn't get to ask whether the tool is right for the job. They have to use it. For everything. Including the things it's bad at.

The Return of the Luddite

If you say "AI doesn't work well for this," you get coded as resistant, behind, slow. A Luddite (https://en.wikipedia.org/wiki/Luddite). The people who slow down to consider accessibility, guardrails, or structural integrity get reclassified as people who can't “keep up.” That's not a critique of AI usage. It's a critique of the work itself, the kind of work that has always taken time because it's worth taking time for.

I'm a designer who works in this space. I have used AI. I have significant concerns about it. The rhetoric tells me those two things shouldn't coexist. They do, and they have to. The alternative is producing work that causes harm because I was afraid not to.

People pushing back is a signal. If your teams can't push back on a tool, or adopt it where it's actually helpful, you're going to lose them (https://www.wsj.com/tech/ai/the-ai-splurge-is-costing-big-tech-its-workforce-34a88e68?mod=djem10point). Threaten them instead of listening, and they'll find a better solution, and not the one you're hoping for. They won't keep AI in the equation. They'll remove your company and products from their lives.

We have been doing this work. We know what AI is good for and what it isn't. We know the costs we're not pricing in (https://www.consumerreports.org/data-centers/ai-data-centers-impact-on-electric-bills-water-and-more-a1040338678/), the labor we're not crediting (https://restofworld.org/2025/karen-hao-empire-of-ai-book/), the outputs that don't actually hold up.

Many folks are feeling pushed out of tech and quitting (https://ky.fyi/posts/ai-burnout). I don’t blame them. While I don’t think all of the issues we are dealing with are because of AI, I do think AI has been used as an excuse for them.

The cycle doesn't end because reality catches up to the rhetoric.
It ends because we refuse to keep producing what the rhetoric demands.

Some of us will quit. Sometimes that's the right move (https://jointhewalkout.com/). That choice helps us shift too.

But doing good work doesn't always require an exit. It looks like small refusals, repeated. The big shift happens through accumulated choices and actions (https://dair-institute.org/projects/luddite-lab/). Thousands of practitioners making small, defensible choices about what they will and won't ship, day after day, until the rhetoric runs into a wall of professionals who refuse to confuse outputs with quality.

It’s the accessibility step you don't skip (https://piccalil.li/blog/applying-accessibility-fixes-with-stealth-for-the-greater-good/) because the deadline is tight. The AI output you stop and verify instead of waving through. The conversation with leadership where you say "this isn't ready" and mean it. The harm you reduce (https://ericwbailey.website/published/harm-reduction-principles-for-digital-accessibility-practitioners/).

It looks like documenting what you see, internally and externally, so the work has receipts. Writing the case study, the blog post, the talk. Putting language to the practice so others can defend it too.

We are the practitioners. The judgment is ours. The choices are ours.

Question the Prompt

If you take away a single thing from this piece, please take this: artifact generation is not designing, and it never was.

Design requires thinking, collaborating, co-designing, challenging assumptions, dismantling biases, thinking deeply about systems, and understanding mediums.

So if AI prototyping works for you, use it. If you're being mandated to use it and need to keep your job, use it when the stakes are low.

But know what question you're asking. If it's a low-stakes question (a sketch, an early concept, something to react to), go ahead. Squeeze the bottle.
If it's a deeper question (what people need, how the interaction is modeled, how the system actually works), use a different tool. Or refuse it entirely. That refusal is what professionals do.

If you use AI, you might spend more time untangling than designing, and fixing things that keep breaking in new and interesting ways. You're not imagining it. That's the tradeoff. It's worth being honest about.

Because if we skip the part where we actually figure out what we're building, we're not designing. We're just designing with ketchup.

Or mustard.

Again.