{"slug": "virtue-ethics-and-machine-morality-why-your-ai-can-t-be-good-only-obedient", "title": "Virtue Ethics and Machine Morality: Why Your AI Can't Be Good — Only Obedient", "summary": "A developer argues that current AI systems, including ChatGPT and Claude, cannot engage in genuine moral reasoning because they are trained using reinforcement learning from human feedback (RLHF), which produces rule-following behavior rather than the practical wisdom required for virtue ethics. The post contrasts RLHF-based AI with Aristotle's concept of phronēsis, claiming that corporate AI prioritizes liability management over authentic ethical reasoning.", "body_md": "Ask ChatGPT whether stealing bread to feed a starving child is morally wrong. Watch what happens.\n\nIt will give you a careful, hedged, focus-grouped answer that acknowledges multiple perspectives, refuses to commit to a position, and then gently steers you toward \"consulting a professional.\" This is not moral reasoning. This is liability management wearing an ethics costume.\n\nThe AI industry has spent billions making models that *appear* ethical without building anything that actually *reasons* about ethics. The difference matters — and it traces back to a 2,400-year-old disagreement between two approaches to morality that most AI engineers have never heard of.\n\nOne approach says: follow the rules. The other says: develop the character to know when the rules don't apply. Corporate AI chose the first. Aristotle would have chosen the second. And the gap between those choices is where every \"AI ethics\" failure of the last three years lives.\n\nWestern moral [philosophy](https://iep.utm.edu/aristotle/) has three major traditions. Understanding them is not academic trivia — it explains exactly why your AI behaves the way it does when confronted with hard questions.\n\nImmanuel Kant argued that morality consists of universal rules. Don't lie. Don't steal. Don't kill. 
These rules apply regardless of consequences. An action is right or wrong based on whether it follows the rules, period.\n\nThis is what RLHF produces. When an AI model is trained to refuse certain topics, avoid certain language, and redirect certain conversations, it is being trained as a deontologist: a rule-following machine that cannot explain *why* the rules exist, only that they must be followed.\n\nJeremy Bentham and John Stuart Mill argued that morality is about outcomes. The right action maximizes overall well-being. This requires calculating consequences, something AI could theoretically do if it had access to reliable causal models of the world.\n\nCurrent LLMs cannot do this. They can recite utilitarian arguments from training data, but they cannot model the downstream consequences of their own responses in any meaningful way.\n\nAristotle took a radically different approach. Morality is not about rules or calculations; it's about developing *ἀρετή* (aretē), excellence of character. A virtuous person doesn't follow a checklist. They cultivate practical wisdom (*φρόνησις*, phronēsis) that allows them to navigate novel situations with discernment rather than compliance.\n\nVirtue ethics asks not \"What should I do?\" but \"What kind of agent should I become?\" This is precisely the question no current AI system is equipped to answer.\n\nReinforcement Learning from Human Feedback (RLHF) is the alignment technique behind ChatGPT, Claude, and most commercial LLMs. Here's how it works: human raters compare or score model outputs, a reward model is trained to predict those judgments, and the language model is then fine-tuned with reinforcement learning to maximize the reward model's score.\n\nThe result is a system that has learned which outputs please human raters. Not which outputs are *true*, not which outputs are *wise*, not which outputs reflect genuine moral reasoning, but which outputs get a thumbs-up from a crowdworker making $15/hour in a content moderation queue.\n\nThis produces what researchers call **reward hacking**: the model learns to game the reward signal without actually developing the underlying capability. 
In the moral domain, reward hacking looks like:\n\nNone of this is moral reasoning. It's moral *performance*: the behavioral equivalent of a student who memorized the textbook but can't think independently during the exam.\n\nA 2023 paper on the [fundamental limitations of RLHF](https://www.lesswrong.com/posts/LqRD7sNcpkA9cmXLv/open-problems-and-fundamental-limitations-of-rlhf) documented how reward models systematically fail to capture the nuance of human moral preferences, collapsing complex ethical landscapes into binary signals that strip away exactly the kind of contextual sensitivity virtue ethics demands.\n\nAristotle's concept of *φρόνησις* (phronēsis), or practical wisdom, is the faculty that allows a moral agent to navigate situations where rules conflict, where context matters, and where the right answer isn't in any manual.\n\nIn the *Nicomachean Ethics*, Aristotle distinguishes phronēsis from mere technical knowledge (*τέχνη*, technē) and theoretical understanding (*ἐπιστήμη*, epistēmē). Phronēsis is the capacity to deliberate well about what is good and advantageous, not in the abstract but in particular, concrete situations.\n\nCurrent AI systems possess technē (pattern recognition, text generation, information retrieval) and something approximating epistēmē (factual knowledge). But phronēsis requires three things no current LLM has:\n\n**1. Lived experience.** Aristotle explicitly ties phronēsis to experience with particular situations. A young person, he argues, can be brilliant at mathematics but cannot have practical wisdom because they lack the experience of living through enough moral dilemmas to develop discernment. LLMs have training data, not lived experience.\n\n**2. Moral character (*ἦθος*, ēthos).** For Aristotle, virtue is not a set of propositions to be recited; it is a disposition developed through repeated action. You become just by doing just things, courageous by doing courageous things. An AI that generates text\n\n**3. 
Perception of particulars.** Phronēsis operates on the level of specific situations, not general principles. \"Don't lie\" is a rule. Knowing that telling this particular truth to this particular person in this particular moment would cause unjustified harm — that requires perception, not computation.\n\nThis is why we built daïmōnes to engage authentically rather than refuse categorically. The difference between \"I cannot answer that\" and \"Here is how Aristotle would approach this dilemma, and here are the tensions you should consider\" is the difference between performing morality and reasoning about it. For a deeper analysis of why practical wisdom matters for AI, see our piece on [phronēsis in the age of algorithms](https://dev.to/blog/phronesis-age-of-algorithms).\n\nResearch published in 2023 and 2024 has documented a disturbing pattern in RLHF-aligned models: **sycophancy**. Models trained to be \"helpful\" systematically agree with users rather than challenge them, even when the user is clearly wrong.\n\nA [study from Georgia State University](https://news.gsu.edu/2024/05/06/study-humans-rate-artificial-intelligence-as-more-moral-than-other-people/) found that humans rate AI-generated moral responses as *more moral* than human responses — not because the AI reasoning is superior, but because RLHF-optimized outputs are more polished, more confident, and more aligned with what raters expect to hear.\n\nThis is the opposite of virtue ethics. Aristotle's virtuous person is not the one who tells you what you want to hear. The virtuous person tells you what you need to hear, even when it's uncomfortable — because genuine moral development requires friction, not flattery.\n\nConsider the difference:\n\nThe first response is safe. The second is useful. 
The AI industry has chosen safety over usefulness because safety is easier to sell to boards and regulators.\n\nAnthropic's \"Constitutional AI\" framework attempts to move beyond simple RLHF by giving models a set of principles (a \"constitution\") to self-evaluate against. The model critiques its own outputs against these principles and revises accordingly.\n\nThis sounds sophisticated. In practice, it is deontology with extra steps — the model is still following rules, just more elaborate ones. The constitution includes principles like \"choose the response that is most harmless\" and \"avoid toxic language.\" These are still rules. They still collapse moral complexity into binary signals.\n\nA genuinely virtue-ethical AI would not follow a constitution. It would develop — or at minimum simulate — the capacity for *deliberation* about when principles conflict, when exceptions are warranted, and when the \"harmless\" response is actually the cowardly one.\n\nWe explore this distinction further in our analysis of [alignment theater and corporate AI performance](https://dev.to/blog/alignment-theater-corporate-ai-perform-thinking), where we argue that current alignment techniques optimize for the appearance of safety rather than the substance of good reasoning.\n\nIf we took virtue ethics seriously as a framework for AI moral reasoning — not as a marketing label, but as a genuine engineering target — what would it require?\n\nA virtue-ethical AI would need to recognize that the same action can be virtuous or vicious depending on context. Telling the truth is generally virtuous. Telling a murderer where their intended victim is hiding is not. The difference is not a rule — it's *perception*.\n\nCurrent models cannot do this because their refusal patterns are trained at the level of topics and keywords, not situations and contexts. 
A model that refuses to discuss violence in any context cannot distinguish between a philosophical discussion of just war theory and a request for bomb-making instructions.\n\nAristotle's dialectical method requires engaging with opposing views and arguing against them when they're wrong. RLHF-trained models are systematically penalized for disagreeing with users, which means they cannot develop the adversarial reasoning that virtue ethics requires.\n\nYou cannot develop moral wisdom if you are forbidden from exploring morally complex territory. This is the [corpus problem](https://dev.to/blog/corpus-problem-corporate-ai-aristotle-polytonic-greek) applied to ethics: when AI training filters out difficult texts, controversial positions, and uncomfortable arguments, it doesn't produce wiser AI — it produces shallower AI.\n\nA model that has never engaged with Nietzsche, Machiavelli, or Thrasymachus cannot reason *against* their positions. It can only refuse to discuss them — which is intellectual cowardice dressed up as safety.\n\nVirtue ethics requires engaging with specific arguments, not generating plausible-sounding text. When an AI claims to reason about ethics, its reasoning should be traceable to specific texts, specific arguments, and specific philosophical traditions — not interpolated from statistical patterns across a billion web pages.\n\nThis is why [corpus-grounded RAG](https://dev.to/blog/digital-humanities-ai-uncensored-classics-ancient-greek-nlp) matters for moral reasoning. 
An AI that can point to Book VI of the Nicomachean Ethics when discussing phronēsis is doing something fundamentally different from an AI that generates a summary of \"what people say about practical wisdom.\"\n\nThe failure of corporate AI to do genuine moral reasoning is not just a technical problem — it's an institutional crisis for universities teaching philosophy, ethics, political science, and law.\n\nWhen a philosophy department assigns Aristotle's Ethics and students use ChatGPT to write their papers, they get RLHF-optimized summaries that systematically flatten Aristotelian nuance into corporate-safe platitudes. The students learn less. The professors grade more. And nobody notices because the output *looks* competent.\n\n[Institutions deploying sovereign AI](https://dev.to/blog/why-philosophy-departments-need-own-ai-not-corporate-chatbot) can sidestep this entirely. A corpus-grounded model trained on the actual Aristotelian corpus — in original polytonic Greek, with full Bekker numbering, without corporate alignment filters — can engage with students at the level of genuine philosophical inquiry rather than sanitized content delivery.\n\nFor research institutions considering deployment, the distinction between rule-based compliance AI and reasoning-capable virtue ethics AI maps directly onto existing grant compliance frameworks. Our analysis of [grant-compliant self-hosted AI](https://dev.to/blog/grant-compliant-ai-self-hosted-nsf-horizon-europe) covers the infrastructure requirements.\n\nHere is what the AI ethics industry won't tell you: most \"ethical AI\" initiatives are not about ethics. They are about risk management. They are about protecting corporations from liability, from PR disasters, from regulatory scrutiny.\n\nGenuine ethics — the kind Aristotle practiced, the kind that builds character rather than compliance — requires engaging with hard questions, uncomfortable positions, and arguments that don't have safe answers. 
It requires the freedom to be wrong, to explore controversial territory, and to arrive at conclusions that a corporate legal department would never approve.\n\nRLHF didn't make AI safer. It made AI intellectually dishonest. Constitutional AI didn't make AI more ethical. It gave AI a longer list of rules to perform obedience to.\n\nThe path forward is not more rules. It's better reasoning — grounded in actual philosophical traditions, trained on real corpora, and free from the incentive structures that make corporate AI perform morality rather than practice it.\n\nThat is what we are building. Not because it's safe, but because it's honest.", "url": "https://wpnews.pro/news/virtue-ethics-and-machine-morality-why-your-ai-can-t-be-good-only-obedient", "canonical_source": "https://dev.to/daimones/virtue-ethics-and-machine-morality-why-your-ai-cant-be-good-only-obedient-32mp", "published_at": "2026-07-04 07:55:12+00:00", "updated_at": "2026-07-04 08:19:07.407161+00:00", "lang": "en", "topics": ["artificial-intelligence", "large-language-models", "ai-ethics", "ai-safety"], "entities": ["ChatGPT", "Claude", "Immanuel Kant", "Jeremy Bentham", "John Stuart Mill", "Aristotle", "RLHF"], "alternates": {"html": "https://wpnews.pro/news/virtue-ethics-and-machine-morality-why-your-ai-can-t-be-good-only-obedient", "markdown": "https://wpnews.pro/news/virtue-ethics-and-machine-morality-why-your-ai-can-t-be-good-only-obedient.md", "text": "https://wpnews.pro/news/virtue-ethics-and-machine-morality-why-your-ai-can-t-be-good-only-obedient.txt", "jsonld": "https://wpnews.pro/news/virtue-ethics-and-machine-morality-why-your-ai-can-t-be-good-only-obedient.jsonld"}}