
PEBS: Per-rater Empirical-Bayes Shrinkage for RLHF Reward-Model Calibration

Researchers introduced PEBS, a per-rater empirical-Bayes shrinkage estimator for calibrating reward models in RLHF, which reduces within-user held-out RMSE by 8.58% on PRISM and 9.66% on PluriHarms compared to the pooled population-slope baseline. PEBS fits per-rater affine calibrators and applies Morris-James-Stein shrinkage toward the population mean without retraining the reward model.

Published Jun 29, 2026 · 1 min read

arXiv:2606.27578v1. Abstract: Reward models for Reinforcement Learning from Human Feedback (RLHF) pool preferences across thousands of annotators and fit one global affine calibrator. This collapses raters with systematically different rating-scale offsets and slopes into a single average-rater fit that matches no individual annotator. PEBS is a per-rater empirical-Bayes shrinkage estimator: it fits per-rater affine calibrators on a held-out slice of each annotator's ratings and applies Morris-James-Stein empirical-Bayes shrinkage toward the population mean, in closed form and without retraining the reward model. On PRISM, PEBS reduces within-user held-out RMSE by 8.58% relative to the pooled population-slope baseline. The procedure replicates on PluriHarms harm ratings (Qwen-2.5 base, in-family), with a 9.66% RMSE reduction relative to the same baseline. PEBS is a closed-form post-hoc estimator for annotator-specific affine calibration in RLHF reward modeling; it leaves the reward base model unchanged and estimates only the rater-level map applied at inference time to new ratings.
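The abstract describes the procedure only at a high level, so the following is a minimal sketch of the general idea, not the authors' implementation. It assumes each rater's calibrator is an ordinary least-squares fit of ratings on reward-model scores, that the between-rater variance is estimated by a simple method-of-moments rule, and that intercepts and slopes are shrunk independently toward their across-rater means. The function names (`fit_affine`, `eb_shrink`, `calibrate_raters`) and the input layout are hypothetical.

```python
import numpy as np

def fit_affine(scores, ratings):
    """OLS fit ratings ~ a + b * scores.

    Returns (coef, var): coef = [a, b] and var = their sampling variances.
    Assumes at least three points and non-constant scores.
    """
    x = np.asarray(scores, dtype=float)
    y = np.asarray(ratings, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = float(resid @ resid) / max(x.size - 2, 1)  # residual variance
    cov = sigma2 * np.linalg.pinv(X.T @ X)              # coefficient covariance
    return coef, np.diag(cov).copy()

def eb_shrink(est, var):
    """Shrink per-rater estimates toward their mean (assumed shrinkage rule).

    B_i = var_i / (var_i + tau2) is the weight on the population mean,
    with tau2 a method-of-moments between-rater variance truncated at zero.
    """
    est = np.asarray(est, dtype=float)
    var = np.asarray(var, dtype=float)
    mu = est.mean()
    tau2 = max(est.var(ddof=1) - var.mean(), 0.0)
    if tau2 == 0.0:
        return np.full_like(est, mu)  # no detectable rater spread: fully pool
    B = var / (var + tau2)
    return mu + (1.0 - B) * (est - mu)

def calibrate_raters(data):
    """data: {rater_id: (scores, ratings)} -> {rater_id: (intercept, slope)}."""
    ids = list(data)
    fits = [fit_affine(*data[r]) for r in ids]
    coefs = np.array([c for c, _ in fits])      # shape (n_raters, 2)
    variances = np.array([v for _, v in fits])  # shape (n_raters, 2)
    # Simplification: shrink intercept and slope separately, ignoring their covariance.
    shrunk = np.column_stack(
        [eb_shrink(coefs[:, j], variances[:, j]) for j in range(2)]
    )
    return {r: (float(a), float(b)) for r, (a, b) in zip(ids, shrunk)}

if __name__ == "__main__":
    # Synthetic raters with different offsets and slopes (illustrative data only).
    rng = np.random.default_rng(0)
    demo = {}
    for r in range(5):
        s = rng.normal(size=12)
        demo[r] = (s, rng.normal() + rng.normal(1.0, 0.3) * s + rng.normal(0, 0.5, 12))
    for rater, (a, b) in calibrate_raters(demo).items():
        print(rater, round(a, 3), round(b, 3))
```

Under these assumptions, raters with few or noisy ratings are pulled strongly toward the population calibrator, while raters with many consistent ratings keep values close to their own fit. The reward model itself is never touched: only the per-rater `(intercept, slope)` map applied to its scores changes.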
