IRTS-ToolBench

mentions: 1, type: Organization

// recent coverage 1 mentions

2026-06-16 04:00

arxiv.org

large-language-models

Towards Verifiable Agentic Data Science: Solving Irregular TSQA Via Tool-Grounded Reasoning

Researchers introduced IRTS-ToolBench, a benchmark of 1,700 questions across 10 task types and 13 domains, to evaluate large language models and AI agents on irregular time series question answering. …

// co-occurs with top 1 entities

arXiv 1