Trace-to-Training: how agent runs become learning data

Source: dev.to · Published: 2026-06-26T01:50Z · Topic: machine-learning

WasmAgent introduces a framework that converts agent execution traces into training data for supervised fine-tuning (SFT) and direct preference optimization (DPO) without human labeling. Its compliance engine evaluates runs, ranks outcomes, and exports typed ComplianceEvalRecords. On IFEval with Qwen2.5-1.5B-Q4, the full repair loop (full_pcl) reaches a 54.7% pass rate, 8.7 percentage points above prompt retry. The system uses compliance verification as the reward signal, so models can learn from failure traces.

Every agent run is a data point. Most frameworks throw it away.

WasmAgent keeps it: evaluated by the compliance engine, ranked by outcome, and exported as a typed ComplianceEvalRecord ready for SFT or DPO training. No human labeling.

```typescript
import { ComplianceRun } from "@wasmagent/compliance";

const run = new ComplianceRun({
  mode: "full_pcl",   // "direct" | "prompt_retry" | "full_pcl"
  taskSpec: {
    instruction: "Write a summary in exactly 3 bullet points.",
    constraints: [{ type: "format", rule: "bullet_count", value: 3 }],
  },
});

// `agent` and `input` are your own agent instance and task input, defined elsewhere.
const result = await run.execute(agent, input);
// result.complianceEvalRecord → typed, versioned, schema-validated
```
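To make the `bullet_count` constraint concrete, here is a minimal sketch of how such a rule could be checked. `FormatConstraint`, `CheckResult`, and `checkBulletCount` are hypothetical names for illustration, not the `@wasmagent/compliance` API, and the sketch assumes a bullet is a line that starts with `-`, `*`, or `•` followed by whitespace.

```typescript
// Hypothetical types for illustration; not the @wasmagent/compliance API.
interface FormatConstraint {
  type: "format";
  rule: "bullet_count";
  value: number;
}

interface CheckResult {
  passed: boolean;
  found: number;
  expected: number;
}

// Assumption: a bullet is a line whose first non-space character is
// "-", "*", or "•", followed by at least one whitespace character.
function checkBulletCount(output: string, constraint: FormatConstraint): CheckResult {
  const found = output
    .split("\n")
    .filter((line) => /^\s*[-*•]\s+/.test(line)).length;
  return { passed: found === constraint.value, found, expected: constraint.value };
}

const constraint: FormatConstraint = { type: "format", rule: "bullet_count", value: 3 };
const threeBullets = "- one\n- two\n- three";
const twoBullets = "Summary:\n- one\n- two";

console.log(checkBulletCount(threeBullets, constraint)); // passed: true, found: 3
console.log(checkBulletCount(twoBullets, constraint));   // passed: false, found: 2
```

A check like this is deterministic, which is what lets a pass/fail result stand in for a human label.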

**direct**: one shot, record pass/fail.

**prompt_retry**: retry once with a rephrased prompt.

**full_pcl**: full repair loop, run → evaluate → patch/regenerate → re-evaluate → record the entire trace.
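The repair loop described for full_pcl can be sketched roughly as follows. This is an illustration under assumptions, not WasmAgent's implementation: `repairLoop`, `Generate`, and `Evaluate` are hypothetical names, the strategy order is assumed, and generation is synchronous for brevity where a real agent call would be async.

```typescript
// Hypothetical sketch of a full_pcl-style loop; all names are illustrative.
type Strategy = "direct" | "patch" | "regenerate";

interface Attempt {
  strategy: Strategy;
  output: string;
  passed: boolean;
}

// `previous` is the last failing output, or null on the first attempt.
type Generate = (strategy: Strategy, previous: string | null) => string;
type Evaluate = (output: string) => boolean;

// Tries each strategy in order, records every attempt, stops at the first pass.
function repairLoop(
  generate: Generate,
  evaluate: Evaluate,
  strategies: Strategy[] = ["direct", "patch", "regenerate"],
): Attempt[] {
  const attempts: Attempt[] = [];
  let previous: string | null = null;
  for (const strategy of strategies) {
    const output = generate(strategy, previous);
    const passed = evaluate(output);
    attempts.push({ strategy, output, passed });
    if (passed) break;
    previous = output;
  }
  return attempts;
}

// Stub agent: canned outputs per strategy, for demonstration only.
const stubOutputs: Record<Strategy, string> = {
  direct: "one long paragraph",
  patch: "- a\n- b",
  regenerate: "- a\n- b\n- c",
};
const hasThreeBullets: Evaluate = (text) =>
  text.split("\n").filter((line) => line.startsWith("- ")).length === 3;

const trace = repairLoop((strategy) => stubOutputs[strategy], hasThreeBullets);
console.log(trace.map((a) => `${a.strategy}:${a.passed}`));
// [ "direct:false", "patch:false", "regenerate:true" ]
```

The point of returning the whole `attempts` array rather than only the final output is that the failed attempts are kept as data.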

IFEval × Qwen2.5-1.5B-Q4 (3 seeds × 50 samples):

Mode	Pass rate	Std dev
prompt_retry	46.0%	±2.0pp
full_pcl	54.7%	±1.2pp

full_pcl beats prompt_retry by 8.7pp (54.7% vs 46.0%). The drop in standard deviation across seeds (±2.0pp → ±1.2pp) matters for production reliability.
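For readers who want to check figures like these, here is a sketch of how a "mean ± std dev" summary can be computed from per-seed pass counts. The per-seed counts below are made up for illustration and are not the benchmark's results; the article also does not say whether it reports sample or population standard deviation, so the sample form (n - 1) is an assumption.

```typescript
// Pass rate per seed, in percent.
function passRates(passCounts: number[], samplesPerSeed: number): number[] {
  return passCounts.map((count) => (100 * count) / samplesPerSeed);
}

function mean(xs: number[]): number {
  return xs.reduce((acc, x) => acc + x, 0) / xs.length;
}

// Sample standard deviation (n - 1 denominator). Assumption: the article
// does not state which estimator it uses.
function sampleStdDev(xs: number[]): number {
  const m = mean(xs);
  const sumSquares = xs.reduce((acc, x) => acc + (x - m) ** 2, 0);
  return Math.sqrt(sumSquares / (xs.length - 1));
}

// Made-up counts for 3 seeds × 50 samples; NOT the article's per-seed data.
const rates = passRates([20, 25, 30], 50); // 40%, 50%, 60%
const summary = `${mean(rates).toFixed(1)}% ±${sampleStdDev(rates).toFixed(1)}pp`;
console.log(summary); // "50.0% ±10.0pp"

// Gap between the two reported means, in percentage points.
const gapPp = (54.7 - 46.0).toFixed(1);
console.log(`+${gapPp}pp`); // "+8.7pp"
```

With only 3 seeds the standard deviation is a rough estimate, so it is best read as an indication of run-to-run spread rather than a tight bound.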

Reproduce: bun packages/compliance/benchmarks/ifeval/run.ts --limit=50 --seed=42

When full_pcl repairs a failing output, RepairPlanner records every attempt:

```typescript
// Inside ComplianceEvalRecord
attempts: [
  { strategy: "direct",     output: "...", passed: false },
  { strategy: "patch",      output: "...", passed: false },
  { strategy: "regenerate", output: "...", passed: true  },
]
```

The full sequence — what failed, what was tried, what worked — is what feeds DPO training. The model learns from failure traces, not just final outputs.
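One way a recorded attempt sequence could be turned into preference pairs is sketched below. This is an assumption about the conversion, not the exporter WasmAgent ships: `toPreferencePairs` is a hypothetical helper, and the `prompt`/`chosen`/`rejected` layout is the one commonly used for preference datasets.

```typescript
// Shape mirrors the `attempts` entries shown above; names are illustrative.
interface RecordedAttempt {
  strategy: string;
  output: string;
  passed: boolean;
}

interface PreferencePair {
  prompt: string;
  chosen: string;
  rejected: string;
}

// Pairs the last passing output (chosen) with every failed output (rejected).
// Returns [] when nothing passed, since there is no preferred output to learn from.
function toPreferencePairs(prompt: string, attempts: RecordedAttempt[]): PreferencePair[] {
  const winner = attempts.filter((a) => a.passed).pop();
  if (!winner) return [];
  return attempts
    .filter((a) => !a.passed)
    .map((a) => ({ prompt, chosen: winner.output, rejected: a.output }));
}

const recorded: RecordedAttempt[] = [
  { strategy: "direct", output: "one paragraph", passed: false },
  { strategy: "patch", output: "- a\n- b", passed: false },
  { strategy: "regenerate", output: "- a\n- b\n- c", passed: true },
];

const pairs = toPreferencePairs("Write a summary in exactly 3 bullet points.", recorded);
console.log(pairs.length); // 2
```

A trace with two failures and one success yields two pairs, so a single repaired run contributes more training signal than its final output alone.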

```typescript
import { RolloutForkRunner, RolloutRanker } from "@wasmagent/core";

// `agent`, `input`, and `taskSpec` are the same values used with ComplianceRun above.
const runner = new RolloutForkRunner({ forks: 4 });
const rollouts = await runner.run(agent, input, taskSpec);

const ranked = new RolloutRanker().rank(rollouts);
// ranked[0] → chosen (SFT)
// ranked[1..] → rejected (DPO pairs)
```

The compliance verifier is the reward signal. No human annotation.
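The idea of using a verifier score as the reward can be sketched without the library. Everything below is hypothetical (`rankByCompliance`, `toTrainingData`, and the binary verifier are illustrative, not the `RolloutRanker` implementation). One design choice in the sketch: rollouts that tie with the best score are left out of the DPO pairs, because a pair with equal reward carries no preference.

```typescript
interface Rollout { id: string; output: string; }
interface ScoredRollout extends Rollout { score: number; }
interface SftExample { prompt: string; completion: string; }
interface DpoPair { prompt: string; chosen: string; rejected: string; }

// Verifier returns a reward in [0, 1]; higher means more compliant.
type Verifier = (output: string) => number;

// Scores each rollout and sorts best-first. Array.prototype.sort is stable
// (ES2019+), so ties keep their input order.
function rankByCompliance(rollouts: Rollout[], verify: Verifier): ScoredRollout[] {
  return rollouts
    .map((r) => ({ ...r, score: verify(r.output) }))
    .sort((a, b) => b.score - a.score);
}

// Best rollout becomes the SFT example; strictly worse rollouts become rejected.
function toTrainingData(
  prompt: string,
  ranked: ScoredRollout[],
): { sft: SftExample | null; dpo: DpoPair[] } {
  const best = ranked[0];
  if (best === undefined) return { sft: null, dpo: [] };
  const dpo = ranked
    .slice(1)
    .filter((r) => r.score < best.score)
    .map((r) => ({ prompt, chosen: best.output, rejected: r.output }));
  return { sft: { prompt, completion: best.output }, dpo };
}

// Binary verifier for the "exactly 3 bullets" constraint (illustrative).
const exactlyThreeBullets: Verifier = (text) =>
  text.split("\n").filter((line) => line.startsWith("- ")).length === 3 ? 1 : 0;

const sampleRollouts: Rollout[] = [
  { id: "a", output: "- x\n- y" },
  { id: "b", output: "- x\n- y\n- z" },
  { id: "c", output: "- p\n- q\n- r" },
  { id: "d", output: "no bullets here" },
];

const rankedSample = rankByCompliance(sampleRollouts, exactlyThreeBullets);
const data = toTrainingData("Write a summary in exactly 3 bullet points.", rankedSample);
const jsonl = data.dpo.map((pair) => JSON.stringify(pair)).join("\n");
console.log(rankedSample.map((r) => r.id)); // [ "b", "c", "a", "d" ]
```

Here `jsonl` holds one JSON object per line, a common on-disk format for preference datasets.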

```shell
git clone https://github.com/WasmAgent/wasmagent-js
cd wasmagent-js
bun test packages/compliance/   # 113 pass / 0 fail
```

Code: packages/compliance · RolloutForkRunner · RolloutRanker

Series: AEP (part 1) · MCP Trust Pack (part 2) · Trace-to-Training (part 3)

Read original on dev.to → dev.to/telleroutlook/trace-to-training-how-agent…
