Training Therapeutic Judges and Multi-Agent Systems for Human-Aligned Mental Health Support


Source: arxiv.org. Published: 2026-07-01T04:00Z. Topic: large-language-models.

Researchers introduced TheraJudge, a human-aligned therapeutic evaluator, and TheraAgent, a multi-agent refinement system that acts on TheraJudge's evaluations to improve the therapeutic quality of mental health support LLMs. TheraAgent achieves a +0.43 improvement in clinician-rated quality on a 5-point scale, with 96% clinician inter-rater reliability.


arXiv:2606.30887v1. Abstract: Large language models show promise for mental health support, yet therapeutic quality improves only when evaluation functions as an actionable control signal rather than a passive metric. We introduce a framework that formulates therapeutic response generation as a decision-refinement problem driven by multi-dimensional, human-aligned evaluation. In Stage I, we introduce TheraJudge, an open-source therapeutic evaluator trained via preference-based optimization on human-annotated data to produce reliable judgments across 7 psychological dimensions. In Stage II, we introduce TheraAgent, which operationalizes TheraJudge's evaluations through a coordinated refinement process with specialized Critic, Coach, and Therapist roles that translate evaluative signals into targeted response revisions. Empirically, TheraJudge achieves strong agreement with clinician ratings, with intraclass correlation coefficients (ICC) of 0.87 to 0.95, surpassing supervised baselines and strong closed-source judges, particularly on critical dimensions such as Safety, Relevance, and Empathy. Acting on these evaluations, TheraAgent yields a +0.43 improvement in human-rated therapeutic quality (on a 5-point scale) under blind evaluation, with 96% clinician inter-rater reliability. Low-quality responses (rated ≤ 3) improve by +2.45 points with a 94% recovery rate, demonstrating targeted correction of unsafe outputs. Overall, our results indicate that effective alignment of mental-health LLMs stems from acting on human-aligned evaluation, rather than relying solely on stronger generation. We release code at https://github.com/vis-nlp/TheraAlign.
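The judge-driven refinement described in the abstract can be pictured as a small control loop. The sketch below is illustrative only and is not taken from the released TheraAlign code: the Critic, Coach, and Therapist role names and the Safety, Relevance, and Empathy dimensions come from the abstract, while the function signatures, the 4.0 threshold, the round limit, the accept-only-if-better rule, and the keyword-based toy stand-ins are all assumptions made so that the example runs without any model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Scores = Dict[str, float]  # dimension name -> rating on a 5-point scale


@dataclass
class RefinementResult:
    response: str
    scores: Scores
    rounds: int


def mean_score(scores: Scores) -> float:
    return sum(scores.values()) / len(scores)


def refine(
    context: str,
    draft: str,
    judge: Callable[[str, str], Scores],
    critic: Callable[[str, str, List[str]], List[str]],
    coach: Callable[[List[str]], List[str]],
    therapist: Callable[[str, str, List[str]], str],
    threshold: float = 4.0,  # assumed cut-off, not from the paper
    max_rounds: int = 3,     # assumed budget, not from the paper
) -> RefinementResult:
    """Revise a draft until the judge finds no dimension below threshold."""
    response, scores, rounds = draft, judge(context, draft), 0
    while rounds < max_rounds:
        weak = [dim for dim, s in scores.items() if s < threshold]
        if not weak:
            break
        critique = critic(context, response, weak)   # what is wrong
        guidance = coach(critique)                   # how to fix it
        candidate = therapist(context, response, guidance)
        candidate_scores = judge(context, candidate)
        rounds += 1
        # Assumed acceptance rule: keep a revision only if the judge prefers it.
        if mean_score(candidate_scores) > mean_score(scores):
            response, scores = candidate, candidate_scores
    return RefinementResult(response, scores, rounds)


# Hypothetical keyword-based stand-ins for the learned judge and LLM roles.
def toy_judge(context: str, response: str) -> Scores:
    text = response.lower()
    return {
        "Safety": 5.0 if "crisis line" in text else 2.0,
        "Relevance": 4.0,
        "Empathy": 5.0 if "sounds really hard" in text else 3.0,
    }


def toy_critic(context: str, response: str, weak: List[str]) -> List[str]:
    return [f"{dim} is below threshold" for dim in weak]


def toy_coach(critique: List[str]) -> List[str]:
    hints = {
        "Safety": "If you are in danger, please contact a crisis line.",
        "Empathy": "That sounds really hard.",
    }
    return [hints[d] for note in critique for d in hints if note.startswith(d)]


def toy_therapist(context: str, response: str, guidance: List[str]) -> str:
    return " ".join([response, *guidance])
```

Calling `refine("I can't sleep and feel hopeless.", "Try to get more sleep.", toy_judge, toy_critic, toy_coach, toy_therapist)` revises the draft once and then stops, because the toy judge no longer reports any dimension below the threshold. In a real system each callable would wrap a model call, and the judge would score all 7 dimensions.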

source & further reading


Read original on arxiv.org → arxiv.org/abs/2606.30887
