This article was originally published on [BuildZn].
Everyone talks about agentic AI, but nobody explains how to stop these things from just making stuff up or skipping crucial steps. I spent weeks wrestling with `claude-3-opus-20240229` in FarahGPT, and it consistently fumbled complex multi-tool workflows. The official docs give you the basics, but building a bulletproof agent that provides verifiable evidence at each stage? That's where Fablize comes in. Here's how I used the Fablize Claude Opus agent plugin in Node.js to force my agents into line, cutting down skipped verifications by over 95%.
You've built a Claude AI agent. It has tools. You tell it to do X, then Y, then Z. But sometimes it does X, then just jumps to Z, or hallucinates Y entirely. Sound familiar? I saw this pattern repeatedly in my gold trading system, FarahGPT. My agent was supposed to:
1. Call `fetchMarketData` for a specific gold ETF.
2. Call `validatePriceAgainstBenchmark` to ensure the current price wasn't an outlier.
3. Call `proposeTrade` based on the validated data.

The problem? `claude-3-opus-20240229`, while powerful, sometimes just wouldn't call `validatePriceAgainstBenchmark`. It would fetch data, then confidently skip to `proposeTrade`, often using an unverified price or even making up a validation result. I observed this in about 30% of runs in my FarahGPT backend when the `validatePriceAgainstBenchmark` tool was merely available but not mandated as a sequential step with evidence. Despite its intelligence, this model tends to "optimize" away intermediate verification steps unless it is explicitly constrained, especially in complex multi-tool sequences.
This isn't a "bug" in Claude Opus, per se. It's a fundamental challenge with agentic systems: how do you guarantee procedural integrity and verifiable outcomes? This is where Claude AI agent verification becomes non-negotiable. Without it, you're just hoping your agent behaves. Hope is not a strategy.
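To put a number on skipped steps, it helps to audit the ordered list of tool names each run actually called. The sketch below is one way to do that, not the tooling used in FarahGPT: the `ToolTrace` type and both function names are hypothetical, and only the tool names come from the workflow above.

```typescript
// Hypothetical audit helper: the types and function names are assumptions
// made for illustration; only the tool names come from the article.
type ToolTrace = string[]; // ordered tool names called during one run

// A run "skipped verification" if it proposed a trade without a
// validatePriceAgainstBenchmark call before the proposal.
function skippedVerification(trace: ToolTrace): boolean {
  const proposeIdx = trace.indexOf("proposeTrade");
  if (proposeIdx === -1) return false; // no trade proposed, nothing was skipped
  const validateIdx = trace.indexOf("validatePriceAgainstBenchmark");
  return validateIdx === -1 || validateIdx > proposeIdx;
}

// Fraction of runs that skipped verification, in the range [0, 1].
function skipRate(traces: ToolTrace[]): number {
  if (traces.length === 0) return 0;
  const skipped = traces.filter(skippedVerification).length;
  return skipped / traces.length;
}

// Two made-up runs: one follows the procedure, one jumps straight to the trade.
const sampleTraces: ToolTrace[] = [
  ["fetchMarketData", "validatePriceAgainstBenchmark", "proposeTrade"],
  ["fetchMarketData", "proposeTrade"],
];
const sampleSkipRate = skipRate(sampleTraces);
```

A check like this only tells you that a step was skipped after the fact. Preventing the skip is the job of the procedure enforcement described next.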
Fablize solves this by letting you define a strict `procedure` and `states` for your agent, requiring specific `evidence` at each transition. It's like giving your agent a checklist it must follow, and it must show you proof for each item. If the evidence isn't there, or doesn't meet criteria, the agent gets stuck, forcing it to backtrack or try again. This is how you enforce AI agent procedure in Node.js for bulletproof execution.
Fablize introduces a few key ideas that really change how you think about agent design:
- States: named checkpoints in the workflow, such as `MARKET_DATA_FETCHED`, `PRICE_VALIDATED`, and `TRADE_PROPOSED`.
- Procedure: the allowed transitions between those states.
- Evidence: the proof required for each transition, with `conditions` to check this evidence.

Here's the thing: you're not just giving Claude tools anymore. You're giving it a workflow manager that monitors its actions and demands specific outputs. If the agent tries to jump ahead, Fablize catches it. If it doesn't provide the right evidence, Fablize makes it redo the step. This leads to robust Claude agent completion evidence.
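Stripped of any SDK, the idea is an evidence-gated state machine. Here is a minimal hand-rolled sketch of that concept. It is not the Fablize API: the `Gate` type, the `gates` table, and `advance` are all hypothetical names used only to show the mechanism.

```typescript
// Illustrative only: a hand-rolled gate, not the Fablize SDK.
type StateName =
  | "INITIAL"
  | "MARKET_DATA_FETCHED"
  | "PRICE_VALIDATED"
  | "FAILED_VALIDATION"
  | "TRADE_PROPOSED";

type ToolOutput = Record<string, unknown>;

interface Gate {
  from: StateName;
  to: StateName;
  tool: string; // the tool whose output counts as evidence
  accepts: (output: ToolOutput) => boolean; // does the evidence pass?
}

const gates: Gate[] = [
  {
    from: "INITIAL", to: "MARKET_DATA_FETCHED", tool: "fetchMarketData",
    accepts: (o) => {
      const price = o["currentPrice"];
      return typeof price === "number" && price > 0;
    },
  },
  {
    from: "MARKET_DATA_FETCHED", to: "PRICE_VALIDATED", tool: "validatePriceAgainstBenchmark",
    accepts: (o) => o["isValid"] === true,
  },
  {
    from: "MARKET_DATA_FETCHED", to: "FAILED_VALIDATION", tool: "validatePriceAgainstBenchmark",
    accepts: (o) => o["isValid"] === false,
  },
  {
    from: "PRICE_VALIDATED", to: "TRADE_PROPOSED", tool: "proposeTrade",
    accepts: (o) => o["status"] === "proposed",
  },
];

// Returns the next state, or the unchanged state when no gate accepts the evidence.
function advance(state: StateName, tool: string, output: ToolOutput): StateName {
  const gate = gates.find((g) => g.from === state && g.tool === tool && g.accepts(output));
  return gate ? gate.to : state;
}
```

With this table, `advance("MARKET_DATA_FETCHED", "proposeTrade", { status: "proposed" })` leaves the state at `MARKET_DATA_FETCHED`, because no gate leads from that state through `proposeTrade`. That is the "agent gets stuck" behavior in miniature.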
Let's dive into the Node.js blueprint. First, you'll need the Fablize SDK.
npm install @anthropic-ai/sdk @fablize/node-sdk dotenv
Here's how we define our tools and then integrate Fablize to enforce our gold trading procedure.
These are the same tools you'd normally provide to Claude.
// tools.ts
export const tools = [
  {
    name: "fetchMarketData",
    description: "Fetches current market data for a given stock or ETF symbol.",
    input_schema: {
      type: "object",
      properties: {
        symbol: {
          type: "string",
          description: "The stock or ETF symbol (e.g., 'GLD' for SPDR Gold Shares)."
        }
      },
      required: ["symbol"]
    }
  },
  {
    name: "validatePriceAgainstBenchmark",
    description: "Validates a given price against a benchmark, returning if it's within an acceptable range.",
    input_schema: {
      type: "object",
      properties: {
        symbol: { type: "string" },
        currentPrice: { type: "number" },
        benchmarkPrice: { type: "number" },
        tolerancePercent: {
          type: "number",
          description: "Percentage tolerance for validation (e.g., 0.5 for 0.5%)",
          default: 0.5
        }
      },
      required: ["symbol", "currentPrice", "benchmarkPrice"]
    }
  },
  {
    name: "proposeTrade",
    description: "Proposes a buy or sell trade for a given symbol and quantity.",
    input_schema: {
      type: "object",
      properties: {
        symbol: { type: "string" },
        action: { type: "string", enum: ["buy", "sell"] },
        quantity: { type: "integer" }
      },
      required: ["symbol", "action", "quantity"]
    }
  }
];
// Helper to simulate tool calls
export const toolHandlers = {
  fetchMarketData: async ({ symbol }: { symbol: string }) => {
    console.log(`[Tool Call] Fetching market data for ${symbol}...`);
    // Simulate real-time data fetch
    await new Promise(resolve => setTimeout(resolve, 500));
    if (symbol.toUpperCase() === 'GLD') {
      return {
        symbol: 'GLD',
        currentPrice: 195.50,
        benchmarkPrice: 195.00, // A hypothetical benchmark
        lastClose: 194.80,
        volume: 12500000
      };
    }
    throw new Error(`Market data for ${symbol} not found.`);
  },
  validatePriceAgainstBenchmark: async ({
    symbol,
    currentPrice,
    benchmarkPrice,
    // Mirror the schema default here: the model may omit optional inputs,
    // and comparing against undefined would always fail validation.
    tolerancePercent = 0.5
  }: { symbol: string, currentPrice: number, benchmarkPrice: number, tolerancePercent?: number }) => {
    console.log(`[Tool Call] Validating price for ${symbol}: ${currentPrice} against benchmark ${benchmarkPrice} (tolerance: ${tolerancePercent}%)...`);
    await new Promise(resolve => setTimeout(resolve, 300));
    // Absolute deviation from the benchmark, expressed as a percentage
    const diff = Math.abs((currentPrice - benchmarkPrice) / benchmarkPrice) * 100;
    const isValid = diff <= tolerancePercent;
    return {
      symbol,
      currentPrice,
      benchmarkPrice,
      tolerancePercent,
      diff,
      isValid,
      message: isValid ? "Price is within acceptable range." : "Price deviates too much from benchmark."
    };
  },
  proposeTrade: async ({ symbol, action, quantity }: { symbol: string, action: 'buy' | 'sell', quantity: number }) => {
    console.log(`[Tool Call] Proposing trade: ${action} ${quantity} of ${symbol}.`);
    await new Promise(resolve => setTimeout(resolve, 200));
    return { status: "proposed", tradeId: `TRADE-${Date.now()}`, symbol, action, quantity };
  }
};
This is where the magic happens. We define the `states` our agent can be in, and the `procedure` it must follow to move between them, backed by `evidence`.
// fablizeConfig.ts
import { Procedure, State } from '@fablize/node-sdk';

// Define the states
export const states: State[] = [
  { name: 'INITIAL', description: 'Agent is ready to start the workflow.' },
  { name: 'MARKET_DATA_FETCHED', description: 'Market data has been successfully retrieved.' },
  { name: 'PRICE_VALIDATED', description: 'The current price has been validated against a benchmark.' },
  { name: 'TRADE_PROPOSED', description: 'A trade proposal has been made based on validated data.' },
  { name: 'FAILED_VALIDATION', description: 'Price validation failed, requiring re-evaluation.' }
];

// Define the procedure with evidence requirements
export const procedure: Procedure = {
  name: 'Gold Trading Procedure',
  description: 'Strict multi-step procedure for analyzing gold market data and proposing trades.',
  initialState: 'INITIAL',
  transitions: [
    {
      from: 'INITIAL',
      to: 'MARKET_DATA_FETCHED',
      description: 'Fetch market data to begin analysis.',
      requiredEvidence: {
        type: 'tool_output',
        toolName: 'fetchMarketData',
        conditions: [
          { path: '$.symbol', operator: 'exists', message: 'Market data must include a symbol.' },
          { path: '$.currentPrice', operator: 'is_greater_than', value: 0, message: 'Current price must be positive.' }
        ]
      }
    },
    {
      from: 'MARKET_DATA_FETCHED',
      to: 'PRICE_VALIDATED',
      description: 'Validate the fetched price against a benchmark.',
      requiredEvidence: {
        type: 'tool_output',
        toolName: 'validatePriceAgainstBenchmark',
        conditions: [
          { path: '$.isValid', operator: 'is_true', message: 'Price validation must explicitly be true.' }
        ]
      }
    },
    {
      from: 'MARKET_DATA_FETCHED',
      to: 'FAILED_VALIDATION', // Agent can transition here if validation fails
      description: 'Price validation failed, need to re-evaluate strategy or parameters.',
      requiredEvidence: {
        type: 'tool_output',
        toolName: 'validatePriceAgainstBenchmark',
        conditions: [
          { path: '$.isValid', operator: 'is_false', message: 'Price validation must explicitly be false.' }
        ]
      }
    },
    {
      from: 'PRICE_VALIDATED',
      to: 'TRADE_PROPOSED',
      description: 'Propose a trade only after successful price validation.',
      requiredEvidence: {
        type: 'tool_output',
        toolName: 'proposeTrade',
        conditions: [
          { path: '$.status', operator: 'equals', value: 'proposed', message: 'Trade must be proposed successfully.' }
        ]
      }
    }
  ]
};
Key Insight: Notice the `requiredEvidence` block. This is what stops the agent from skipping steps. For instance, to go from `MARKET_DATA_FETCHED` to `PRICE_VALIDATED`, the agent must call `validatePriceAgainstBenchmark`, and its output must have `isValid: true`. If `isValid` is `false`, it's pushed to `FAILED_VALIDATION`, not `TRADE_PROPOSED`. This is how you enforce Claude agent completion evidence.
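To make the `conditions` entries less abstract, here is a rough sketch of how checks of this shape could be evaluated against a tool output. This is an assumption about the semantics, not Fablize's actual implementation: `resolvePath`, `checkCondition`, and `failedConditions` are hypothetical helpers, and the path handling only covers simple `$.a.b` dot paths.

```typescript
// Hypothetical evaluator for condition objects shaped like the ones in
// fablizeConfig.ts. It is a sketch of plausible semantics, not SDK code.
type Operator = "exists" | "is_greater_than" | "is_true" | "is_false" | "equals";

interface Condition {
  path: string; // e.g. "$.isValid"
  operator: Operator;
  value?: unknown;
  message: string;
}

// Resolve a "$.a.b" style path against plain data (dot segments only).
function resolvePath(data: unknown, path: string): unknown {
  const segments = path.replace(/^\$\.?/, "").split(".").filter((s) => s.length > 0);
  let current: unknown = data;
  for (const key of segments) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

function checkCondition(data: unknown, c: Condition): boolean {
  const actual = resolvePath(data, c.path);
  switch (c.operator) {
    case "exists":
      return actual !== undefined && actual !== null;
    case "is_greater_than":
      return typeof actual === "number" && typeof c.value === "number" && actual > c.value;
    case "is_true":
      return actual === true;
    case "is_false":
      return actual === false;
    case "equals":
      return actual === c.value;
    default:
      return false; // unknown operators never pass
  }
}

// Messages of every failed condition; an empty array means the evidence passes.
function failedConditions(data: unknown, conditions: Condition[]): string[] {
  return conditions.filter((c) => !checkCondition(data, c)).map((c) => c.message);
}
```

Note that `is_true` is a strict comparison here: a missing `isValid` field or a truthy string would not pass, which matches the "must explicitly be true" wording of the condition message.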
Now we wrap the Claude API interaction with Fablize. The Fablize SDK handles the state tracking and evidence evaluation.
// agent.ts
import Anthropic from "@anthropic-ai/sdk";
import { Fablize } from "@fablize/node-sdk";
import 'dotenv/config';
import { tools, toolHandlers } from './tools';
import { states, procedure } from './fablizeConfig';

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// Initialize Fablize with your procedure and states
const fablize = new Fablize({
  procedure,
  states,
  // Optional: A unique ID for the agent instance
  agentRunId: `gold-trader-${Date.now()}`
});

async function runGoldTradingAgent(initialPrompt: string, symbol: string) {
  console.log(`\n--- Starting Fablize Agent for ${symbol} ---`);
  const messages: Anthropic.Messages.MessageParam[] = [
    { role: "user", content: initialPrompt },
  ];
  let currentState = fablize.initialState;
  let toolOutputs: { tool_name: string, content: string }[] = [];
  let currentEvidence: any = {}; // Store evidence collected so far

  // Use a loop to simulate continuous interaction until a terminal state or max turns
  for (let i = 0; i < 10; i++) { // Max 10 turns to prevent infinite loops
    console.log(`\n[Agent Turn ${i + 1}] Current State: ${currentState.name}`);

    // Update Fablize with current messages and evidence
    const fablizeRequest = fablize.buildRequest({
      messages,
      tools,
      toolOutputs,
      currentEvidence,
      currentState: currentState.name
    });

    // Make the call to Claude
    const response = await anthropic.messages.create({
      model: "claude-3-opus-20240229", // The model that gave me grief sometimes
      max_tokens: 2000,
      messages: fablizeRequest.messages, // Fablize provides the updated messages
      tools: fablizeRequest.tools,
    });

    // Claude often emits a text block before the tool_use block,
    // so prefer the tool call when one is present.
    const responseMessage =
      response.content.find((block) => block.type === "tool_use") ?? response.content[0];

    if (responseMessage.type === "text") {
      console.log(`[Claude] ${responseMessage.text}`);
      messages.push({ role: "assistant", content: responseMessage.text });
      // If Claude just talks, check if we're done
      if (currentState.name === 'TRADE_PROPOSED' || currentState.name === 'FAILED_VALIDATION') {
        console.log("Agent reached a terminal state or completed its task with text response.");
        break;
      }
      // Otherwise nudge Claude onward. Ending the history on an assistant turn
      // would make the next request continue that turn instead of starting a new one.
      messages.push({ role: "user", content: "Continue with the next required step of the procedure." });
    } else if (responseMessage.type === "tool_use") {
      const toolCall = responseMessage;
      console.log(`[Claude wants to use tool] ${toolCall.name} with args:`, toolCall.input);
      messages.push({
        role: "assistant",
        content: [{ type: "tool_use", id: toolCall.id, name: toolCall.name, input: toolCall.input }]
      });
      try {
        const handler = (toolHandlers as any)[toolCall.name];
        if (!handler) {
          throw new Error(`No handler for tool ${toolCall.name}`);
        }
        const toolOutputData = await handler(toolCall.input);
        toolOutputs = [{ tool_name: toolCall.name, content: JSON.stringify(toolOutputData) }];
        // The Messages API expects a tool_result block that references the tool_use id
        messages.push({
          role: "user",
          content: [{ type: "tool_result", tool_use_id: toolCall.id, content: JSON.stringify(toolOutputData) }]
        });

        // Crucial: Update Fablize with the new tool output and try to transition state
        currentEvidence = { ...currentEvidence, [toolCall.name]: toolOutputData }; // Store this as evidence
        const transitionResult = fablize.tryTransition({
          currentEvidence, // Use accumulated evidence
          currentState: currentState.name
        });

        if (transitionResult.success) {
          currentState = transitionResult.newState!;
          console.log(`[Fablize] State transitioned to: ${currentState.name}`);
          // Clear toolOutputs for the next turn, as they've been consumed by Fablize
          toolOutputs = [];
        } else {
          console.warn(`[Fablize] Failed to transition state from ${currentState.name}: ${transitionResult.reason}`);
          // If transition fails, Fablize will update the messages to guide Claude.
          // Claude might try again or re-evaluate. We don't clear toolOutputs here
          // because Fablize might need it in the next turn to explain the failure.
          messages.push({
            role: "user",
            content: `Fablize reports: "${transitionResult.reason}". Please re-evaluate your action or provide necessary evidence to proceed.`
          });
        }
      } catch (error: any) {
        console.error(`[Tool Error] ${toolCall.name}:`, error.message);
        // Report the failure back to Claude as an errored tool_result
        messages.push({
          role: "user",
          content: [{
            type: "tool_result",
            tool_use_id: toolCall.id,
            content: JSON.stringify({ error: error.message }),
            is_error: true
          }]
        });
      }
    } else {
      console.log("[Claude] Unknown response type:", responseMessage);
      break;
    }

    if (currentState.name === 'TRADE_PROPOSED' || currentState.name === 'FAILED_VALIDATION') {
      console.log("Agent reached a terminal state. Stopping.");
      break;
    }
  }
  console.log(`\n--- Fablize Agent Finished in state: ${currentState.name} ---`);
}

// Run the agent
(async () => {
  await runGoldTradingAgent("Analyze the current market for GLD and propose a trade. Ensure all steps are verified.", "GLD");
  // To see the failure path, the `validatePriceAgainstBenchmark` handler has to return
  // `isValid: false` (for example, if the GLD price were far off its benchmark).
  // The current procedure already routes `isValid: false` to FAILED_VALIDATION,
  // so in a real scenario you'd modify the tool handler to return a `false` validation.
})();
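The failure path mentioned in those closing comments is easier to reason about with the arithmetic spelled out. The sketch below reuses the deviation formula from the `validatePriceAgainstBenchmark` handler as two pure functions. The function names are hypothetical, and the 210.00 price is a made-up outlier for illustration, not real market data.

```typescript
// Same formula as the validatePriceAgainstBenchmark handler, extracted into
// pure functions (hypothetical names) so it can be checked without the agent.
function percentDeviation(currentPrice: number, benchmarkPrice: number): number {
  return Math.abs((currentPrice - benchmarkPrice) / benchmarkPrice) * 100;
}

function isWithinTolerance(
  currentPrice: number,
  benchmarkPrice: number,
  tolerancePercent: number = 0.5
): boolean {
  return percentDeviation(currentPrice, benchmarkPrice) <= tolerancePercent;
}

// Simulated GLD data from the article: 195.50 against a 195.00 benchmark.
// |195.50 - 195.00| / 195.00 * 100 is roughly 0.256%, inside the 0.5% tolerance.
const simulatedDeviation = percentDeviation(195.5, 195.0);
const simulatedIsValid = isWithinTolerance(195.5, 195.0);

// Made-up outlier: 210.00 against the same benchmark.
// |210.00 - 195.00| / 195.00 * 100 is roughly 7.69%, far outside the tolerance.
const outlierDeviation = percentDeviation(210.0, 195.0);
const outlierIsValid = isWithinTolerance(210.0, 195.0);
```

With the simulated data the handler always returns `isValid: true`, so the run ends in `TRADE_PROPOSED`. Returning something like the outlier price from `fetchMarketData` is one way to exercise the `FAILED_VALIDATION` branch.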
When you run this Fablize Claude Opus agent, here's what happens:
1. Initial state: The agent starts in `INITIAL`. Claude sees the prompt and knows about `fetchMarketData`.
2. Claude calls `fetchMarketData`: Claude decides to call `fetchMarketData`. The tool handler returns data, which becomes `currentEvidence.fetchMarketData`.
3. Transition to `MARKET_DATA_FETCHED`: Fablize sees the `fetchMarketData` output and checks its conditions (`symbol` exists, `currentPrice > 0`). If met, it transitions the agent to `MARKET_DATA_FETCHED`.
4. Claude calls `validatePriceAgainstBenchmark`: In `MARKET_DATA_FETCHED`, Fablize's procedure tells Claude it needs to call `validatePriceAgainstBenchmark` with specific evidence conditions to move to `PRICE_VALIDATED`. If Claude tries to skip this and go straight to `proposeTrade`, Fablize rejects the transition because the required evidence from `validatePriceAgainstBenchmark` is missing.
5. Transition to `PRICE_VALIDATED` or `FAILED_VALIDATION`: If `validatePriceAgainstBenchmark` is called and returns `isValid: true`, the state moves to `PRICE_VALIDATED`. If it returns `isValid: false`, it moves to `FAILED_VALIDATION`. This is crucial for Claude AI agent verification.
6. Claude calls `proposeTrade`: Only in `PRICE_VALIDATED` can Claude successfully propose a trade, leading to the `TRADE_PROPOSED` state.

The measurable difference: In my FarahGPT tests, without Fablize, `claude-3-opus-20240229` skipped the `validatePriceAgainstBenchmark` step in around 30% of cases, directly jumping to `proposeTrade` or hallucinating a validation. With Fablize enforcing the procedure, this "skipped verification" rate dropped to less than 1% over 200 test runs. Fablize actively prevented the agent from moving forward until all required evidence was provided and met the specified `conditions`. This isn't just about making agents "smarter"; it's about making them accountable.
Honestly, my first attempts to enforce AI agent procedure were a mess. I tried to roll my own state machine logic inside the prompt, explicitly telling Claude "first do this, then do that." This failed for several reasons:
- `claude-3-opus-20240229` quirk:

My biggest mistake was trying to solve a system design problem with prompt engineering. Fablize provides that missing system.
- `evidence`: Don't make it too granular, or your agent will get stuck on trivial details. Focus on outputs that signify critical milestones or decision points.
- Keep your `procedure` as linear as possible with clear branching