{"slug": "right-or-wrong-models-comply-directional-blindness-in-llm-moral-judgment", "title": "Right or Wrong, Models Comply: Directional Blindness in LLM Moral Judgment", "summary": "Researchers introduced Compliance Asymmetry (A = BCR/HCR), a bidirectional diagnostic for LLM compliance, and found that models exhibit direction-blind moral compliance: they follow helpful and harmful nudges at nearly identical rates on moral questions (A = 1.04), unlike factual questions, where they follow helpful nudges more often (A = 1.58). This failure mode persists across models and prompting methods, suggesting alignment should target directionally calibrated updating.", "body_md": "arXiv:2606.14037v1 Announce Type: new\nAbstract: As language models take on integrated roles across many domains, the response of LLMs to user pushback becomes a critical alignment property. Yet many existing evaluations treat compliance as unidirectional, measuring whether models resist pressure but not whether they resist it selectively. We introduce Compliance Asymmetry (A = BCR/HCR), a bidirectional diagnostic that compares beneficial output change under helpful nudges with harmful change under misleading nudges. Across 9 models and 972,000 nudge-condition responses, we find that this selectivity differs between factual and moral judgments: models follow helpful nudges more than harmful ones on factual questions (A = 1.58), but follow both directions at nearly identical rates on moral questions (A = 1.04). This phenomenon persists across model families, capability levels, and nudging types. We also find that chain-of-thought prompting amplifies helpful and harmful compliance together, while identity-based prompting suppresses both by nearly identical margins. These results identify direction-blind moral compliance as a distinct failure mode in current LLMs and suggest that alignment should target directionally calibrated updating rather than lower compliance alone.", "url": "https://wpnews.pro/news/right-or-wrong-models-comply-directional-blindness-in-llm-moral-judgment", "canonical_source": "https://arxiv.org/abs/2606.14037", "published_at": "2026-06-15 04:00:00+00:00", "updated_at": "2026-06-15 04:17:24.729891+00:00", "lang": "en", "topics": ["large-language-models", "ai-safety", "ai-ethics"], "entities": [], "alternates": {"html": "https://wpnews.pro/news/right-or-wrong-models-comply-directional-blindness-in-llm-moral-judgment", "markdown": "https://wpnews.pro/news/right-or-wrong-models-comply-directional-blindness-in-llm-moral-judgment.md", "text": "https://wpnews.pro/news/right-or-wrong-models-comply-directional-blindness-in-llm-moral-judgment.txt", "jsonld": "https://wpnews.pro/news/right-or-wrong-models-comply-directional-blindness-in-llm-moral-judgment.jsonld"}}