Designing Forms an AI Agent Can Actually Submit


Most form codebases I have read were designed against one mental model of the submitter.

A person.

A person who reads each label.

A person who watches the screen between submit and the confirmation banner.

A person whose retries look like fast double clicks, not like a queued workflow that came back online.

A person whose definition of bot is "obviously not a person."

That mental model is still correct for a lot of traffic. It is becoming less correct over time.

Increasingly, the entity submitting a form is an AI agent acting on behalf of a person. Desktop clients with tool calling, MCP-aware agents, browser automation agents, and computer-use models all submit forms on someone's behalf. The form on the other side rarely knows.

This article is about what to change in your form so an agent can submit it reliably, without you having to ship an API just for them.

I will use FORMLOVA as the working example, because that is the codebase I work in. FORMLOVA is a chat-first form service whose primary surface is an MCP server (129 tools across 25 categories) and whose secondary surface is a hosted form page. Operators and respondents can both reach it via chat or page; in both cases, at least one side of the interaction can be an agent.

The patterns themselves are not FORMLOVA-specific.

Before talking about code, it helps to fix the requirements.

An agent submitting on behalf of a user typically needs five things from your form:

1. A way to identify each field by meaning, not by pixel position.
2. A way to learn the validation rules before sending values.
3. A way to submit safely if the network blinks or the user retries.
4. A way to read the confirmation result without depending on a toast.
5. A way to prove it is legitimate without solving an "are you a human" puzzle.

If any of those five is missing, the agent will either fail silently or burn user trust with retries. Neither is a good outcome for your conversion rate.

The interesting product question is not "should we have these properties." It is "which of these can we expose as a tool call, and which has to live in the rendered page so that browser agents can still operate the form from outside our walls."

In FORMLOVA, the answer is hybrid: each property is exposed both ways. The tool surface is the canonical contract. The rendered page mirrors it.

The first thing to fix is the field identity.

A field name like field_3 or q_short_answer_12 works for a renderer. It does not work for an agent.

The agent needs a stable, semantic identifier that survives layout redesigns.

type FieldDescriptor = {
  // Stable across the life of the response. Never recycled.
  stableId: string;
  // Semantic name reused across forms. e.g. "email", "company", "consent_marketing".
  semanticName: string;
  // Position-only id used for current rendering.
  renderId: string;
  label: { default: string; locales?: Record<string, string> };
  type: "text" | "email" | "phone" | "url" | "select" | "checkbox" | "date" | string;
  required: boolean;
  validation?: ValidationRule[];
  helpText?: string;
};

Two practical rules:

1. The semanticName is what agents reason about. Pick a small vocabulary and reuse it across forms.
2. The stableId is what the submission payload references. Do not regenerate it on each publish, or you will silently break agents that cached the previous shape.

In FORMLOVA, the form schema has 29 field types (text, textarea, number, radio, checkbox, dropdown, date, datetime, time, email, phone, url, file_upload, matrix, signature, address, rating_scale, NPS, linear_scale, slider, opinion_scale, ranking, picture_choice, yes_no, country, legal, statement, section_break, hidden_field). Every one of them carries a stable id, a semantic name, and a snapshot of the label that was active at the time the response was collected. An agent that wants to fill a yes_no field can do so by semantic name without parsing the rendered UI.

If your form lives behind an MCP server, the same descriptors should be returned as part of the form's tool schema. Agents will read it. Treat it as a public contract.
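To make the split between semanticName and stableId concrete, here is a minimal sketch of how an agent might turn values it holds by semantic name into a payload keyed by stable id. The FieldRef type, the buildPayload helper, and the fld_ ids are hypothetical names for illustration, not part of FORMLOVA's API.

```typescript
// Hypothetical, trimmed-down descriptor: only the properties this sketch needs.
type FieldRef = { stableId: string; semanticName: string; required: boolean };

// Map values the agent holds by semantic name onto the stable ids the payload uses.
// Required fields with no usable value are reported instead of being sent empty.
function buildPayload(
  fields: FieldRef[],
  bySemanticName: Record<string, unknown>,
): { values: Record<string, unknown>; missing: string[] } {
  const values: Record<string, unknown> = {};
  const missing: string[] = [];
  for (const field of fields) {
    const value = bySemanticName[field.semanticName];
    if (value === undefined || value === "") {
      if (field.required) missing.push(field.semanticName);
      continue;
    }
    values[field.stableId] = value;
  }
  return { values, missing };
}

const fields: FieldRef[] = [
  { stableId: "fld_01", semanticName: "email", required: true },
  { stableId: "fld_02", semanticName: "company", required: false },
  { stableId: "fld_03", semanticName: "consent_marketing", required: true },
];

// The agent knows the email and company but has not asked the user about consent yet.
const payload = buildPayload(fields, { email: "ada@example.com", company: "Acme" });
```

In this run, payload.missing contains "consent_marketing", which tells the agent to go back to the user before submitting rather than guess.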

Agents are good at filling values. They are less good at guessing your invisible rules.

The classic anti-pattern is rejecting a submission with a friendly error message, but never publishing the rule that produced it. A human reader reads the message and retries. An agent has to round-trip every guess.

A small contract avoids that:

type ValidationRule =
  | { kind: "required" }
  | { kind: "min_length"; value: number }
  | { kind: "max_length"; value: number }
  | { kind: "pattern"; regex: string; description: string }
  | { kind: "enum"; values: string[] }
  | { kind: "domain_allow"; suffixes: string[] }
  | { kind: "duplicate_prevention"; window: "form" | "user" | "off" };

If the form publishes the rules, an agent can check its values against them before sending, instead of round-tripping every guess.

The rules do not have to be exhaustive. They have to be honest. A rule the form silently enforces but does not declare is a contract violation against the agent reader.
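As a rough illustration of what a published rule set buys the agent, the sketch below checks a value against the declared rules before anything is sent. LocalRule mirrors the ValidationRule union above without duplicate_prevention, which only the server can decide. The checkValue helper and the way domain_allow is matched (as a suffix of the whole value) are assumptions made for this example.

```typescript
// Subset of the ValidationRule union; duplicate_prevention is left out
// because it cannot be evaluated on the submitter's side.
type LocalRule =
  | { kind: "required" }
  | { kind: "min_length"; value: number }
  | { kind: "max_length"; value: number }
  | { kind: "pattern"; regex: string; description: string }
  | { kind: "enum"; values: string[] }
  | { kind: "domain_allow"; suffixes: string[] };

// Returns the kinds of the rules a value breaks; an empty array means it passes.
function checkValue(value: string, rules: LocalRule[]): string[] {
  const failures: string[] = [];
  for (const rule of rules) {
    let ok = true;
    switch (rule.kind) {
      case "required":
        ok = value.trim().length > 0;
        break;
      case "min_length":
        ok = value.length >= rule.value;
        break;
      case "max_length":
        ok = value.length <= rule.value;
        break;
      case "pattern":
        ok = new RegExp(rule.regex).test(value);
        break;
      case "enum":
        ok = rule.values.indexOf(value) !== -1;
        break;
      case "domain_allow": {
        // Assumption: each suffix is compared with the end of the whole value.
        const lower = value.toLowerCase();
        ok = rule.suffixes.some((suffix) => lower.endsWith(suffix.toLowerCase()));
        break;
      }
    }
    if (!ok) failures.push(rule.kind);
  }
  return failures;
}

const emailRules: LocalRule[] = [
  { kind: "required" },
  { kind: "max_length", value: 254 },
  { kind: "domain_allow", suffixes: ["@example.com"] },
];
```

An agent can call checkValue for each field, fix what it can, and ask the user about the rest, all before the first POST.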

FORMLOVA publishes the duplicate prevention rule explicitly. When a form is created or updated, the operator picks between "email-based duplicate prevention," "cookie-based duplicate prevention (same browser)," and "no duplicate prevention." The choice is part of the form's published contract. An agent attempting to submit twice for the same intent can know in advance whether the second submission will be accepted or treated as a duplicate.

A similar choice is offered for conditional logic. The workflow engine in lib/webhooks/workflows.ts uses the operators eq, neq, gt, gte, lt, lte, contains, not_contains, is_empty, is_not_empty, with logical combinators all (AND) and any (OR). These operators are part of the form's published behavior, so an agent can predict what will happen when its submission lands.
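To show how an agent could predict a workflow outcome from published conditions, here is a small evaluator for the ten operators and the two combinators. The Condition and ConditionGroup shapes are hypothetical: the operator names come from the text, but the storage format and the string/number coercion rules below are assumptions, not FORMLOVA's implementation.

```typescript
type Operator =
  | "eq" | "neq" | "gt" | "gte" | "lt" | "lte"
  | "contains" | "not_contains" | "is_empty" | "is_not_empty";

// Hypothetical condition shape used only for this sketch.
type Condition = { field: string; op: Operator; value?: string | number };
type ConditionGroup = { combinator: "all" | "any"; conditions: Condition[] };

function testCondition(cond: Condition, values: Record<string, unknown>): boolean {
  const actual = values[cond.field];
  // Assumption: missing values behave like an empty string.
  const text = actual === undefined || actual === null ? "" : String(actual);
  const expected = cond.value === undefined ? "" : String(cond.value);
  switch (cond.op) {
    case "eq": return text === expected;
    case "neq": return text !== expected;
    // Assumption: ordering operators compare numerically.
    case "gt": return Number(text) > Number(expected);
    case "gte": return Number(text) >= Number(expected);
    case "lt": return Number(text) < Number(expected);
    case "lte": return Number(text) <= Number(expected);
    case "contains": return text.indexOf(expected) !== -1;
    case "not_contains": return text.indexOf(expected) === -1;
    case "is_empty": return text === "";
    case "is_not_empty": return text !== "";
  }
}

// "all" is a logical AND over the conditions, "any" is a logical OR.
function testGroup(group: ConditionGroup, values: Record<string, unknown>): boolean {
  const results = group.conditions.map((c) => testCondition(c, values));
  return group.combinator === "all" ? results.every(Boolean) : results.some(Boolean);
}

const routeRule: ConditionGroup = {
  combinator: "all",
  conditions: [
    { field: "plan", op: "eq", value: "enterprise" },
    { field: "seats", op: "gte", value: 50 },
  ],
};
```

With this rule, a submission of { plan: "enterprise", seats: 120 } matches and one with seats: 10 does not, so the agent can tell the user in advance which path the response will take.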

Network blinks happen. So do user-side retries. So do queue replays in agent platforms.

If your form treats every POST as a new intent, you will create duplicates the user did not ask for.

The fix is small:

type FormSubmission = {
  formId: string;
  values: Record<string, unknown>;
  submissionIntentId: string;
  submittedOnBehalfOf?: {
    actorType: "human" | "agent" | "automation";
    consentToken?: string;
  };
};

The submissionIntentId is generated by the submitter, not the server. It should be a UUID the agent assigns once per intent. The server uses it as the deduplication key.

On the server side:

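async function submitForm(input: FormSubmission) {
  // Look up a prior response for this intent before writing anything.
  const existing = await db.responses.findByIntentId(input.submissionIntentId);
  if (existing) {
    // A retry of the same intent: hand back the original record.
    return { status: "duplicate", canonical: existing };
  }
  // First time this intent is seen: store it.
  // Two concurrent retries can both pass the lookup above; the unique index
  // described below is what rejects the second insert.
  const created = await db.responses.create({
    ...input,
    receivedAt: new Date(),
  });
  return { status: "created", canonical: created };
}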

You do not need a distributed lock for most form traffic. You need a unique index on (form_id, submission_intent_id) and a clear status response so the agent can decide what to do next.

A user filling once and clicking submit twice should produce one record. An agent retrying after a 502 should also produce one record. Same mechanism, both cases.
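The behavior can be seen end to end in a small in-memory sketch. Here a Map keyed by the pair (formId, submissionIntentId) stands in for the unique index; the ResponseStore class and the resp_ id format are hypothetical and only serve the example.

```typescript
type StoredResponse = { responseId: string; formId: string; submissionIntentId: string };
type SubmitOutcome = { status: "created" | "duplicate"; canonical: StoredResponse };

// In-memory stand-in for a table with a unique index on (form_id, submission_intent_id).
class ResponseStore {
  private byIntent = new Map<string, StoredResponse>();
  private nextId = 1;

  submit(formId: string, submissionIntentId: string): SubmitOutcome {
    // JSON-encode the pair so that no choice of ids can collide on the key.
    const key = JSON.stringify([formId, submissionIntentId]);
    const existing = this.byIntent.get(key);
    if (existing) return { status: "duplicate", canonical: existing };
    const created: StoredResponse = {
      responseId: "resp_" + this.nextId++,
      formId,
      submissionIntentId,
    };
    this.byIntent.set(key, created);
    return { status: "created", canonical: created };
  }
}

const store = new ResponseStore();
// Assigned once per intent by the submitter and reused on every retry.
const intentId = "3f2b6c1e-0000-4000-8000-000000000001";
const first = store.submit("form_a", intentId);
const retry = store.submit("form_a", intentId); // e.g. the agent retrying after a 502
```

The retry comes back with status "duplicate" and the same canonical record as the first call, so the agent can report one submission to the user no matter how many times the request was sent.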

FORMLOVA goes one step further on the server side. The capacity check for forms with a participant limit uses a PostgreSQL RPC called insert_response_with_capacity_check that combines SELECT ... FOR UPDATE and INSERT in the same transaction. The intent id deduplication and the capacity check are both single-trip operations, so agents do not have to choose between throughput and correctness.

The respondent identifier is also reused as a long-term key. Every response carries a respondent_identifier derived either from the email address (when one was submitted) or from a salted hash of IP and user agent (when one was not). This means that a respondent retrying through a different network still resolves to the same identifier when an email was provided, and the agent can avoid creating duplicate respondent profiles.

Toast notifications are great for humans. They are useless for agents.

If the confirmation only exists for a few seconds in a client-side animation, the agent has to guess whether submit succeeded.

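The confirmation surface for an agent should be structured data returned in the submit response itself, so the result can be read without watching the UI.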

A minimum shape:

type SubmitResponse = {
  status: "created" | "duplicate" | "rejected";
  responseId?: string;
  canonicalUrl?: string;
  fieldErrors?: Array<{ id: string; rule: string; message: string }>;
  postSubmit?: {
    autoReplyScheduled: boolean;
    redirectUrl?: string;
    expectedFollowUp?: "email" | "review" | "none";
  };
};

That structure makes it possible for an agent to tell its user, "your submission was recorded as ABC123. An email confirmation will arrive shortly. The team typically responds within two business days." That sentence is the actual product surface, not the toast.
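A minimal sketch of the agent side, assuming the SubmitResponse shape above: a function that turns the structured result into the sentence the user actually hears. The describeOutcome name and the exact wording are made up for illustration.

```typescript
// Same shape as the SubmitResponse type above, repeated so the sketch is self-contained.
type SubmitResponse = {
  status: "created" | "duplicate" | "rejected";
  responseId?: string;
  canonicalUrl?: string;
  fieldErrors?: Array<{ id: string; rule: string; message: string }>;
  postSubmit?: {
    autoReplyScheduled: boolean;
    redirectUrl?: string;
    expectedFollowUp?: "email" | "review" | "none";
  };
};

// Build the user-facing summary from the response alone, with no UI involved.
function describeOutcome(res: SubmitResponse): string {
  if (res.status === "rejected") {
    const problems = (res.fieldErrors ?? []).map((e) => e.id + ": " + e.message);
    return problems.length > 0
      ? "The form rejected the submission. " + problems.join("; ")
      : "The form rejected the submission without field-level details.";
  }
  const prefix = res.status === "duplicate"
    ? "This was already recorded as "
    : "Your submission was recorded as ";
  const parts: string[] = [prefix + (res.responseId ?? "an unnamed response") + "."];
  if (res.postSubmit?.autoReplyScheduled) parts.push("An email confirmation is scheduled.");
  if (res.postSubmit?.expectedFollowUp === "review") parts.push("The team will review it.");
  return parts.join(" ");
}
```

Because every branch reads from the response body, the same function works whether the submit went through a tool call or a plain HTTP POST.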

A common mistake here is to conflate "auto-reply enabled" with "auto-reply delivered." They are not the same. FORMLOVA treats the auto-reply state as a small enum: not_required, pending, sent, failed. The submit response can include the current state, and a follow-up tool call can re-read it after the asynchronous email job completes. The agent never has to guess.

The thank-you page state has the same shape. The fact that the respondent saw a thank-you page does not mean the team has received and acknowledged the response. FORMLOVA's thank-you pages support both a basic message and a structured blocks shape (text, image, button, link, video, divider; up to twenty entries) with conditional logic. An agent reading the form's published shape can know in advance whether the thank-you page will say "received" or "received and routed to your account manager," and it can mirror that statement to the user.

This is the place most teams get wrong, and it is the one that hurts conversion the most.

A captcha designed around "is this a human" treats any agent as an attacker. That is the wrong question for the next few years.

The right question is, "is this submission a legitimate intent."

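A legitimate agent submission has properties a fraud submission usually does not, such as a declared actor type and a consent token that traces back to the user who delegated the task.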

A reasonable policy is to keep a friction layer for unknown sources, and to relax it for actor-declared submissions whose history looks legitimate:

async function defenceCheck(input: FormSubmission, request: Request) {
  if (input.submittedOnBehalfOf?.actorType === "agent" && hasValidConsentToken(input)) {
    return passSoftCheck(request);
  }
  if (looksLikeBrowserAutomation(request) && !input.submittedOnBehalfOf) {
    return blockOrChallenge(request);
  }
  return defaultTurnstileCheck(request);
}

The point is not the exact rules. The point is that "all non-humans are bad" stops being the right default. The category of "non-human submitter the user explicitly delegated to" needs a different lane.

FORMLOVA already maintains a separate lane for the sales-pitch problem. After submit, each response on forms with spam_filter_enabled = true is asynchronously classified into legitimate, sales, or suspicious by a lightweight OpenRouter-hosted model (about $0.0002 per response). This classification does not block the submit; it shapes downstream analysis, filtering, and ownership. The architectural lesson generalizes: do not refuse the submission at the wall. Refine it after, with state on the record. The same lesson applies to agent submitters: route them, do not block them, and let the workflow downstream decide.

This is the part most relevant when the agent is on the operator side rather than the respondent side, but it shapes the entire MCP contract and is worth covering.

Some operations are externally irreversible: sending bulk emails, replying to a respondent, deleting a form, unpublishing a form, removing a team member, ending an A/B test, deleting a response, restoring a previous form version. If an MCP client tells a model "you can call any tool," and the model decides to call send_bulk_email, the cost of being wrong is high.

FORMLOVA classifies tools into four levels:

L0  read-only           execute immediately
L1  reversible write    execute immediately (version history covers the rollback path)
L2  respondent-facing   server-side review state machine (publish_form)
L3  externally final    HMAC-signed confirmation_token required

The eleven L3 tools and the L2 publish_form all require a confirmation_token signed with HMAC-SHA256 and valid for 5 minutes. The token is issued by the server only after the model has presented the right summary to the user and the user has explicitly approved. Even if the model misreads its own instructions, it cannot bluff its way past the gate without a current token.

This matters for agent submitters too, because the same form can be operated from the respondent side and the operator side, and we want the agent to feel exactly as safe to talk to as a careful intern. The contract is "you can read freely, you can write reversibly, anything else needs a fresh user-signed token."
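The gate itself can be sketched in a few lines. This is an illustration under stated assumptions, not FORMLOVA's code: the registry below lists only a handful of tools (update_form_title is a made-up L1 example), and the token check is an injected stand-in that looks tokens up in a table, where a real server would verify the HMAC-SHA256 signature.

```typescript
type Level = "L0" | "L1" | "L2" | "L3";
type Decision = "execute" | "needs_confirmation" | "unknown_tool";
type TokenCheck = (token: string, tool: string, nowMs: number) => boolean;

// Hypothetical registry; a real server would cover every tool it exposes.
const toolLevels = new Map<string, Level>([
  ["get_form_schema", "L0"],
  ["update_form_title", "L1"], // made-up tool name for the reversible-write level
  ["publish_form", "L2"],
  ["send_bulk_email", "L3"],
]);

// L0 and L1 run immediately; L2 and L3 run only with a token that verifies.
function authorize(tool: string, token: string | undefined, verify: TokenCheck, nowMs: number): Decision {
  const level = toolLevels.get(tool);
  if (level === undefined) return "unknown_tool";
  if (level === "L0" || level === "L1") return "execute";
  return token !== undefined && verify(token, tool, nowMs) ? "execute" : "needs_confirmation";
}

const FIVE_MINUTES_MS = 5 * 60 * 1000;

// Stand-in verifier: checks that the token was issued for this tool and is
// at most five minutes old. It does NOT check a signature.
function makeVerifier(issued: Map<string, { tool: string; issuedAtMs: number }>): TokenCheck {
  return (token, tool, nowMs) => {
    const entry = issued.get(token);
    return entry !== undefined && entry.tool === tool && nowMs - entry.issuedAtMs <= FIVE_MINUTES_MS;
  };
}

const issuedTokens = new Map([["tok_1", { tool: "send_bulk_email", issuedAtMs: 0 }]]);
const verifyToken = makeVerifier(issuedTokens);
```

Binding the token to a single tool name and a short time window is what keeps an approval for one action from being replayed against another.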

If you are already running an MCP server, this is the part that ties it together.

The form should not only exist as a rendered page. It should also exist as a tool an MCP client can call.

A first cut of the tool surface:

list_forms                -> Returns form descriptors.
get_form_schema(formId)   -> Returns fields and validation rules.
submit_form(formId, ...)  -> Returns SubmitResponse, including duplicate handling.
get_form_status(formId)   -> Returns whether the form is open, capped, scheduled.

That is the minimum the agent reader needs. Operator-side tools like response search, status updates, exclusions, and reports are a separate concern.
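As one possible reading of the status tool, the sketch below derives "open", "capped", or "scheduled" from a form record. The FormRecord properties (opensAtMs, capacity, responseCount) and the helper names are assumptions made for the example; the text only names the three states.

```typescript
type FormStatus = "open" | "capped" | "scheduled";

// Hypothetical record shape; only what the status decision needs.
type FormRecord = {
  formId: string;
  title: string;
  opensAtMs?: number;   // when set, the form is not open before this time
  capacity?: number;    // when set, the maximum number of responses
  responseCount: number;
};

// get_form_status: a not-yet-open form reports "scheduled", a full one "capped".
function getFormStatus(form: FormRecord, nowMs: number): FormStatus {
  if (form.opensAtMs !== undefined && nowMs < form.opensAtMs) return "scheduled";
  if (form.capacity !== undefined && form.responseCount >= form.capacity) return "capped";
  return "open";
}

// list_forms: return light descriptors rather than full schemas.
function listForms(forms: FormRecord[], nowMs: number): Array<{ formId: string; title: string; status: FormStatus }> {
  return forms.map((f) => ({ formId: f.formId, title: f.title, status: getFormStatus(f, nowMs) }));
}

const demoForms: FormRecord[] = [
  { formId: "form_a", title: "Demo request", responseCount: 12 },
  { formId: "form_b", title: "Workshop signup", capacity: 30, responseCount: 30 },
  { formId: "form_c", title: "Beta waitlist", opensAtMs: 2000, responseCount: 0 },
];
```

An agent that checks the status first can tell the user "this workshop is full" instead of submitting and relaying a rejection.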

In FORMLOVA, this minimum surface lives in the forms category, with adjacent categories handling the read side (responses, pulse), the response-management side (response-management, filtering), and the workflow side (webhooks, email-sequences, scheduling, smart-notifications). An agent that wants to operate a form has a stable categorical map of what is available, not a flat list of 129 verbs to guess at.

I want to call out one place where the abstract advice can mislead.

The argument is not "expose everything as MCP and delete the dashboard." It is "the canonical operational surface should not depend on a person scanning the dashboard."

FORMLOVA's dashboard is still there, because some tasks are better done visually.

The honest claim is that the chat (MCP) surface and the dashboard are different shapes of the same product. Chat is sequential and great for intent-driven workflows ("show me responses from this week, exclude sales pitches, draft a reply to the three demo requests"). The dashboard is parallel and great for scanning. An agent submitter benefits from the same architecture, because the form's published shape is the same shape the dashboard renders.

This pattern does not solve fraud entirely. It just stops treating "agent" and "fraud" as the same word.

It does not solve consent. You still need to record who delegated to the agent, and how. A consent token is a placeholder for a real consent record, not a substitute for one.

It does not solve every accessibility case. Screen readers and AT consumers also benefit from semantic field naming, but they have their own contract you should still meet.

It does not solve the visual side. Humans are still submitting too, and the visual design still matters for them.

What it does is stop the form from quietly failing for an audience that will keep growing.

If you only do five things from this article, do these:

1. Give every field a stable semantic name in addition to a render id.
2. Publish your validation rules in a machine-readable form.
3. Accept a submitter-assigned intent id and deduplicate on it.
4. Return a structured confirmation in the submit response, not only in UI.
5. Stop using "are you a human" as your only bot signal.

If you can do five more, do these:

6. Classify side effects by blast radius (L0..L3) and require confirmation tokens for the irreversibly final ones.
7. Surface the auto-reply / notification / classification state on the response record, not in a separate notification log.
8. Track the respondent across forms with a stable identifier that does not depend on which fields a given form happened to ask for.
9. Expose form, schema, submit, and status as MCP tools so agents can talk to your product without scraping.
10. Keep the visual surface honest with the agent surface. Both are reading the same form.

Each is a small change in isolation. Together they change what kind of submitter your form is honest with.

source & further reading

dev.to: original article
