Synthetic Contrastive Reasoning for Multi-Table Q&A

wpnews.pro


[ARTICLE · art-23123] src=arxiv.org pub=2026-06-06T04:00Z topic=large-language-models verified=true sentiment=↑ positive


Researchers have developed a synthetic contrastive reasoning-trace dataset for multi-table question answering (QA) by generating validated positive and plausible negative traces using heterogeneous large language models (LLMs). Fine-tuning open-weight LLMs with Contrastive Preference Optimization (CPO) on these preference pairs yielded absolute average improvements of 9.7% to 16.3% over standard QA supervised fine-tuning across Qwen3-14B, Mistral-8B, and Llama-3.1-8B, with gains up to 21 percentage points on the MMQA benchmark. The approach addresses the lack of reasoning supervision in existing multi-table QA resources, and evaluations confirm the generated traces are largely faithful, coherent, and meaningfully contrastive.

1 min read · published Jun 6, 2026

arXiv:2606.05382v1. Abstract: Multi-table question answering requires models to retrieve relevant evidence, link schemas, and perform compositional reasoning across relational tables. Existing multi-table Q&A resources typically provide questions and final answers but lack reasoning supervision that explains how answers are derived. To address this gap, we construct a synthetic contrastive reasoning-trace dataset for MMQA by generating validated positive traces and plausible negative traces with heterogeneous LLMs. We then use the resulting preference pairs to fine-tune open-weight LLMs with Contrastive Preference Optimization (CPO). Across Qwen3-14B, Mistral-8B, and Llama-3.1-8B, CPO achieves absolute average improvements over Q&A supervised fine-tuning ranging from 9.7% to 16.3%, with gains up to 21 percentage points on MMQA. Ablations show that heterogeneous positive and negative trace generators strengthen the contrastive signal, and automated as well as human evaluations indicate that the generated pairs are largely faithful, coherent, and meaningfully contrastive.
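To make the training signal concrete, here is a minimal sketch of how one preference pair could be scored under the commonly stated form of the CPO objective: a preference term on the gap between the two traces plus a likelihood term on the preferred trace. The abstract does not give the authors' implementation or hyperparameters, so the `PreferencePair` fields, the `beta` and `nll_weight` values, and the idea of passing in precomputed sequence log-probabilities are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class PreferencePair:
    """Hypothetical record layout for one contrastive example."""
    question: str
    chosen_trace: str    # validated positive reasoning trace
    rejected_trace: str  # plausible but flawed negative trace


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def cpo_loss(logp_chosen: float, logp_rejected: float,
             beta: float = 0.1, nll_weight: float = 1.0) -> float:
    """CPO-style loss for a single pair.

    logp_chosen / logp_rejected are the policy's summed token
    log-probabilities of each trace given the question (assumed to be
    computed elsewhere). beta and nll_weight are illustrative defaults,
    not values reported in the source.
    """
    # Preference term: push the chosen trace above the rejected one.
    prefer = -log_sigmoid(beta * (logp_chosen - logp_rejected))
    # Likelihood term: keep probability mass on the chosen trace.
    nll = -logp_chosen
    return prefer + nll_weight * nll


pair = PreferencePair(
    question="Which customer placed the most orders?",
    chosen_trace="Join orders to customers on customer_id, count, take max.",
    rejected_trace="Count rows in customers, take max.",
)
# Made-up log-probabilities, used only to exercise the function.
loss_tied = cpo_loss(logp_chosen=-10.0, logp_rejected=-10.0)
loss_separated = cpo_loss(logp_chosen=-10.0, logp_rejected=-30.0)
```

Unlike DPO, this formulation needs no frozen reference model, which is part of why it is attractive for fine-tuning 8B to 14B open-weight models. The loss falls as the rejected trace becomes less likely relative to the chosen one, so a negative that is plausible yet clearly worse gives a useful gradient.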

source & further reading

arxiv.org — original article


Read original on arxiv.org → arxiv.org/abs/2606.05382



