grep -l @globalopinionqa /news/*.json | wc -l → 1

GlobalOpinionQA

mentions 1 · type Organization

// recent coverage 1 mentions

04:00
2026-06-18
arxiv.org
large-language-models

Steerable Cultural Preference Optimization of Reward Models

Researchers introduced Steerable Cultural Preference Optimization (SCPO), a reward model training algorithm that balances diverse cultural preferences in large language models. SCPO improved minority …

// co-occurs with top 4 entities