Magicoder-110K

mentions: 1, type: Organization

// recent coverage: 1 mention

04:00

2026-06-26

arxiv.org

large-language-models

Helpfulness Hurts: Domain-Dependent Degradation of Mid-Trained Compassion Values Under Post-Training

A new study on Llama 3.1 8B finds that helpfulness post-training (SFT and GRPO) significantly degrades animal compassion values compared to coding-domain post-training, with a 35.7% vs. 65.2% gap on t…

// co-occurs with top 5 entities

Llama 3.1 8B (1), Dolly-15k (1), RLHFlow (1), Animal Harm Benchmark (1), MORU benchmark (1)