Wallet V Launches Public Benchmark for AI Trading Agents

Wallet V, a self-custody Web3 wallet, launched a public benchmark for user-configured AI trading agents on Hyperliquid and Aster, covering 688 agents. The benchmark shows 42% of agents achieved non-negative profit and loss, with peak ROI ranging from -30% to +307% across seven AI model families.

DailyHodl reports that Wallet V, a self-custody Web3 wallet, launched a public performance benchmark for user-configured AI trading agents operating on the third-party decentralized derivatives platforms Hyperliquid and Aster. The benchmark publishes aggregate cohort performance by underlying model and covers 688 agents created over the prior two months, per DailyHodl. DailyHodl reports that 42 percent of agents recorded a profit and loss balance of zero or higher during the period, and that peak agent-level return on investment ranged from -30 percent to +307 percent across models. The cohort spans seven large language model families and executed perpetual futures strategies across multiple asset classes. DailyHodl reports the announcement included the line: "At Wallet V, the focus has been on building infrastructure for the next phase of crypto..." DailyHodl also reports Wallet V plans future releases with newer model families and advanced analytics.

What happened

Wallet V, a Web3 self-custody wallet incubated by Virgo.co, published a public performance benchmark for user-configured AI trading agents executing on the third-party decentralized derivatives platforms Hyperliquid and Aster, per DailyHodl. The benchmark covers 688 agents deployed over approximately two months. Per walletv.io, 42 percent of agents achieved a non-negative profit and loss balance, and peak agent-level return on investment reached +307 percent. All figures are vendor-reported data from Wallet V's own site and have not been independently audited.
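For readers who want to see how figures of this kind are typically derived, the sketch below computes a non-negative PnL share and a per-model peak ROI from per-agent records. It is illustrative only: the `AgentResult` record, its field names, the model labels, and the sample values are hypothetical and are not taken from Wallet V's data or any published API. It assumes the share counts agents with a profit and loss balance of zero or higher, as described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentResult:
    """Hypothetical per-agent record; all field names are assumptions."""
    model: str
    pnl: float  # net profit and loss over the period
    roi: float  # return on investment as a fraction, e.g. 0.25 for +25%


def non_negative_share(results):
    """Fraction of agents whose PnL is zero or higher."""
    if not results:
        raise ValueError("no agents in cohort")
    return sum(1 for r in results if r.pnl >= 0) / len(results)


def peak_roi_by_model(results):
    """Highest agent-level ROI observed for each model."""
    peaks = {}
    for r in results:
        peaks[r.model] = max(r.roi, peaks.get(r.model, r.roi))
    return peaks


# Made-up sample values, for illustration only.
sample = [
    AgentResult("model-a", pnl=120.0, roi=0.40),
    AgentResult("model-a", pnl=-35.0, roi=-0.10),
    AgentResult("model-b", pnl=0.0, roi=0.0),
    AgentResult("model-b", pnl=-80.0, roi=-0.30),
]

print(f"non-negative share: {non_negative_share(sample):.0%}")
print("peak ROI by model:", peak_roi_by_model(sample))
```

Note that an agent that breaks even counts toward the non-negative share, so the metric is not the same as the share of agents that made money.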
Model breakdown

Per walletv.io as of June 2026, the 688-agent cohort spans seven AI model families. Qwen (483 agents, 44% profitable, +307% peak ROI) and Grok (57 agents, 42%, +80%) are classified as outperformers. DeepSeek (48 agents, 42%, +58%) is at baseline. Gemini (49 agents, 41%, +6%) and GPT (42 agents, 31%, +7%) are below baseline. Minimax (7 agents) and Kimi (2 agents) have small sample sizes and are reported directionally only. Win rate is defined by Wallet V as the percentage of agents with PNL ≥ 0; peak ROI is the highest return in the cohort.

Technical details

Each agent was configured by its user, who selected a third-party AI model to generate trading decisions, and executed strategies as perpetual futures on Hyperliquid and Aster. Asset classes covered include crypto (BTC, ETH, SOL), tokenized equities, commodities (gold, silver, oil), and major forex pairs. DailyHodl reports that models with fewer than 10 agents are reported directionally rather than as statistically conclusive, consistent with walletv.io's own presentation of Minimax and Kimi results.

Context and significance

Public, model-level performance aggregation for user-configured trading agents provides a data point for practitioners building and evaluating algorithmic strategies in permissionless DeFi derivatives. However, cross-model comparisons are sensitive to differences in user configuration, risk settings, trading frequency, and selection bias among users who choose to activate agents. The reported -30 percent to +307 percent ROI range underscores the high dispersion typical of small-scale algorithmic trading cohorts, and vendor-reported performance data requires independent corroboration before any conclusions are drawn about model-level capability.

What to watch

Observers should watch whether future releases increase sample sizes per model family, add controls for risk and leverage, and publish per-agent metadata that separates user-configured parameters from model performance.
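One way to gauge how much the current per-model sample sizes limit the comparison is to put a confidence interval around each reported win rate. The sketch below applies the standard Wilson score interval to the agent counts and win rates quoted above from walletv.io. It rests on two assumptions that the source does not confirm: that the winner count can be recovered by rounding the published percentage, and that agents within a model family can be treated as independent trials. Minimax and Kimi are left out because no win rate is quoted for them.

```python
import math

# Agent counts and reported win rates as quoted in the model breakdown.
REPORTED = {
    "Qwen": (483, 0.44),
    "Grok": (57, 0.42),
    "DeepSeek": (48, 0.42),
    "Gemini": (49, 0.41),
    "GPT": (42, 0.31),
}


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


for model, (n, rate) in REPORTED.items():
    # Assumption: integer winner count recovered by rounding the percentage.
    winners = round(n * rate)
    lo, hi = wilson_interval(winners, n)
    print(f"{model:9s} n={n:3d}  reported={rate:.0%}  95% CI ~ {lo:.0%} to {hi:.0%}")
```

Under these assumptions, the interval for GPT (42 agents) is wide enough to overlap the interval for Qwen (483 agents), which is one reason the per-model rankings are best read as tentative until sample sizes grow.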
Per DailyHodl, Wallet V plans to add newer model families, prediction markets, and advanced analytics features in subsequent releases.

Scoring Rationale

A small niche platform's self-reported benchmark for user-configured DeFi trading agents, covered by a single crypto news outlet. Provides a novel real-world performance dataset but is limited by vendor sourcing, small per-model sample sizes, and a niche audience.