OpenAI researchers show small doses of "beneficial trait" training make AI models broadly safer and harder to manipulate

wpnews.pro

[ARTICLE · art-33815] src=the-decoder.com ↗ pub=2026-06-19T10:08Z topic=ai-safety verified=true sentiment=↑ positive

OpenAI researchers demonstrated that reinforcement learning on beneficial traits such as truthfulness and corrigibility improves AI safety across domains, with models scoring better on 44 out of 53 benchmarks. The approach differs from Anthropic's constitution-based method.

1 min read · published Jun 19, 2026

OpenAI researchers show that reinforcement learning on desired behavioral traits such as truthfulness and corrigibility generalizes across domains. Training on health data, for example, also improved deception detection, and the trained model scored better on 44 of 53 benchmarks. The approach differs from Anthropic's constitution-based method.

The article OpenAI researchers show small doses of "beneficial trait" training make AI models broadly safer and harder to manipulate appeared first on The Decoder.

source & further reading

the-decoder.com (original article)

Google appeals ruling that made it directly liable for AI-generated search overview content

Website "In the Weights" shows whether AI models know who you are

ChatGPT's new health upgrade beats doctor-written answers, OpenAI says

Read original on the-decoder.com → the-decoder.com/openai-researchers-show-small-do…

mentioned entities

OpenAI

Anthropic
