DRIVE: Modeling Skills at the Reasoning and Interaction Levels for Web Agents under Continual Learning




Researchers have developed DRIVE, a dual-level skill modeling framework that separates web agent knowledge into natural language reasoning skills, which hold transferable task logic, and programmatic interaction skills, which hold executable page operations. Tested across five WebArena domains, the system achieved a 52.8% average task success rate, 7.3 percentage points above the skill-free baseline. The approach targets the problem of disentangling abstract, cross-site reasoning from concrete, site-specific interactions so that web agents can keep learning as they encounter new sites.

Published May 26, 2026

arXiv:2605.23939v1

Abstract: Web agents require both high-level reasoning (for task decomposition) and low-level interaction (for manipulating page elements) to carry out different tasks. However, these two kinds of knowledge differ fundamentally: reasoning knowledge (e.g., booking a flight requires first searching for routes) is abstract and transferable across websites, while interaction knowledge (e.g., clicking the Search button at a specific coordinate on Site A) depends heavily on page-specific context. Existing methods store experiences uniformly, which creates a dilemma: abstract representations lose executability on concrete pages, while concrete representations fail to generalize across domains. This entanglement limits capability accumulation: on new websites, agents either fail to recognize reusable task logic because of surface-level differences or attempt infeasible actions based on outdated page structures. To disentangle the two, we propose DRIVE, a dual-level skill modeling framework that separates historical experience into natural language reasoning skills, which capture transferable task logic, and programmatic interaction skills, which ground abstract actions in executable operations. A scene-aware coordination mechanism adaptively retrieves and invokes these dual-level skills based on task semantics. DRIVE also uses skill-level reflection to identify hierarchy-specific failure modes, enabling targeted expansion and refinement of the skill library. Experiments across five WebArena domains show that DRIVE attains an average task success rate of 52.8%, exceeding the skill-free baseline by 7.3 percentage points. Further ablations show that reasoning and interaction skills provide distinct, complementary benefits, supporting the separation of transferable task logic from executable page-level operations.
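The split between the two skill levels can be pictured with a small sketch. This is not DRIVE's implementation: the abstract does not describe its data structures or retrieval method, so every name below (`ReasoningSkill`, `InteractionSkill`, `SkillLibrary`, `retrieve`, the site labels, and the keyword-overlap matching) is a hypothetical stand-in chosen only to illustrate the idea that reasoning skills can carry over to a new site while interaction skills stay tied to the site they were learned on.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class ReasoningSkill:
    """Hypothetical: natural-language task logic, meant to transfer across sites."""
    name: str
    steps: List[str]


@dataclass
class InteractionSkill:
    """Hypothetical: an executable page operation bound to one site."""
    name: str
    site: str
    run: Callable[[Dict[str, str]], str]


@dataclass
class SkillLibrary:
    """Hypothetical two-level store; the real retrieval mechanism is not specified."""
    reasoning: List[ReasoningSkill] = field(default_factory=list)
    interaction: List[InteractionSkill] = field(default_factory=list)

    def retrieve(
        self, task: str, site: str
    ) -> Tuple[List[ReasoningSkill], List[InteractionSkill]]:
        # Toy matching (an assumption): keyword overlap selects reasoning
        # skills on any site; interaction skills need an exact site match.
        words = set(task.lower().split())
        plans = [s for s in self.reasoning if words & set(s.name.lower().split())]
        ops = [s for s in self.interaction if s.site == site]
        return plans, ops


library = SkillLibrary(
    reasoning=[
        ReasoningSkill(
            name="book flight",
            steps=["search for routes", "pick a route", "enter passenger details"],
        )
    ],
    interaction=[
        InteractionSkill(
            name="click_search",
            site="site_a",
            run=lambda page: f"click {page['search_button']}",
        )
    ],
)

# On the site where the skills were learned, both levels are available.
plans_a, ops_a = library.retrieve("book a flight to Oslo", site="site_a")

# On an unseen site, only the reasoning skill is reusable.
plans_b, ops_b = library.retrieve("book a flight to Oslo", site="site_b")
```

In this toy setup the task logic for booking a flight is returned for both sites, while the site-specific click operation is returned only for `site_a`, so an agent on `site_b` would need to learn a new interaction skill rather than replay an outdated one.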

source & further reading


Read original on arxiv.org → arxiv.org/abs/2605.23939

