Alignment & Succession: The Ideology of Successionism

(Originally published on No Set Gauge.) Gustave Moreau, The Frogs Asking For A King

In the course of building a better world, people ask each other many questions. Which things should be managed by the government and which left to the market? What sort of technology, if any, is so dangerous that it should be kept secret, access curtailed, or development avoided? Is goodness fundamentally about following the right rules, achieving the right outcomes, or having the right character?

Reasonable people have different opinions on all these questions. But recently, Silicon Valley has seen lively debate on a question you’d hope was all too obvious: should humanity continue existing?

The idea that it shouldn’t was named successionism by Andrew Critch, and is motivated by the speed and power of AI development.

Some examples:

The successionists, thankfully, still seem few in number. But their influence seems to be growing, especially among the crowds with the most influence over AI development.

This growth is a sign. There are many bad ideas in the world. Most are trivial, and defeated easily in the marketplace of ideas. But every so often there’s a gap in the memetic defenses of a society to some new brand of horror, that also manages to ride prevailing cultural winds to its benefit, and start growing into a proper ideology. Such an ideology would not be appealing if it did not fit some memetic niche, and fill some unmet need.

It might seem unserious to even give successionism attention, especially when doing so might backfire. It would also be in bad taste to compare it to historical monster-ideologies, which are rightly reviled and emotionally charged, and which a bad internet meme is not in fact comparable to.

And after all, why exactly is successionism wrong? I’ll argue that the cultural currents that it reflects include some noble ones, including belief in the expanding moral circle that has led our civilization to care about new issues from women’s rights to animal welfare, and taking seriously important neglected facts about what’s happening in AI. I also think the philosophical momentum many Bay Area currents have towards something like successionism is a good chance to use successionism as an example of the broader failures of those philosophical currents.

Jan Kulveit has already written in detail about the memetics of successionism. Nina Panickssery has been fighting the good fight with great moral clarity on X & Substack. Twitter anon @softminus has been especially sharp on the relation of successionism to broader modern cultural ills. What I’ll cover in this series:

There are two levels of successionism worth distinguishing.

Control successionism: humans not in the driver’s seat. This is the view that human decisions and judgements should not be the ones steering the future.

The mildest and most reasonable form of this is that our governance of society might become something different once we have AIs. Right now the institutions we can invent are limited by the fact that we can only build them by stacking humans into bureaucracies, with all the flaws and limitations that brings about. AIs might be provably incorruptible, for example, and allow for very different governance structures. I think this is entirely reasonable.

There’s a related issue of whether you want a powerful human leader in charge at the top of the pyramid—should presidents and prime ministers be replaced with some AI-enabled preference aggregation mechanism? Here you might reasonably believe that it’s important to have some sort of top-level human perspective. Perhaps if no single human mind is at least double-checking the grand strategy, and major decisions get made by some AI-enabled preference-aggregating algorithm, the grand strategy is much more likely to drift towards over-optimizing goodharting. At the same time, the track record of humans as national leaders is decidedly mixed, power concentration comes with expensive risks, and humans naturally have very strong impulses towards equality (itself a human value we should not cast aside too carelessly!).

However, the much more troubling sort of control successionism is the idea that humans just fundamentally shouldn’t be choosing, valuing, judging, when it comes to questions about the future of the world, or even their own lives. Instead, the steering of the world, society, and human lives should be done by AI systems acting from their own volition.

Experiential successionism: humans no longer the moral substrate, i.e. the thing that experiences good or bad things that makes the world matter. Our universe has moral value because, in addition to all the space rocks, there are humans (and probably animals) in it who feel things that are good or bad (I defend this view in a later post). Experiential successionism is the view that we could have a universe where there were no more humans or any other organic life experiencing things, but just AIs, and this universe would have moral worth & goodness, and could be better than our current world or a more human future.

The core of both types of successionism is the idea that moral value is something abstract and divorced from anything as worldly or parochial as an ape-descended hominid.

It’s helpful to dig into the types of reasoning often used in successionist arguments.

Many successionist and anti-successionist attacks immediately head off into abstract moral philosophy. But for an example of the template of successionist reasoning, let’s not talk about value itself or metaphysics. Let’s just pick one concrete thing: forests.

People like nice green forests; we like the lushness and the aliveness. You can explain this away all you like—“so clearly we evolved to like environments that signal that there’s lots of food around, and the singing birds are a sign that there are no predators nearby, and...”—but at the end of the day, the calm you feel standing in the middle of a good forest is real, wherever it came from. “Nothing is ‘mere’”, as Feynman used to say.

Now imagine you’re strolling through the forest, and someone comes to you and says: “Look, forests are cool and all, but this is nothing compared to a much forestier forest that could be. Whatever you want—greenness, birds, whatever—we can have plastic plants and drones that are more.”

And you might say: “I don’t like plastic plants, I like real ones.”

“But”, says the Forest Successionist, “what do you exactly like about real plants?”

“A million things: they grow, they have more variety, they smell nice, they help with the air, there are bugs that live in them, they rustle nicely in the wind—”

“Okay”, says the Forest Successionist, “but for all those things, we can just build plastic plants, maybe with advanced nanotech, that have those properties but more. Your attachment to real forests is just being a backward luddite hominid. Hell, your stance is basically a moral atrocity—you’re depriving countless future generations of experiencing infinitely plantier plants.”

This argument has more than a touch of nonsense to it, when it’s about something as simple as a forest. Now think of the even more important things we value, like happiness or love or meaning. Can you specify exactly what is good about them? Can you define happiness or love or meaning so clearly that if some AI came along and cranked up the properties you specified towards infinity while deflating away everything else, you’d be sure that nothing were lost or drowned out?

“Yes, but if the AI gives you that juiced-up experience, you’d keep choosing it, so your revealed preference is that you actually prefer it, so clearly it’s actually better.” Sure, if I’m force-fed fentanyl enough, I might get addicted to it, but that doesn’t mean I should start injecting it.

We’re heading for a crazy world with lots of new intelligent things running around (AIs and robots), and maybe lots of options for what you might do with yourself (e.g. upload yourself, change your biology, or merge with AI). Navigating this right requires understanding what actually has intrinsic value. Our conscious experience? What about AIs or animals, do they experience anything? What about virtue, or the real world, or human touch?

So people grasp for what the fundamental fountain of value is. (I myself try this in a later part of this series.) This is a perilous activity. Often, I hear tech people saying things like “obviously the thing that has value is complexity”, and justifying it with: “Humans are really complicated, cf. brains and societies. Anything you care about sure seems pretty complicated! And there’s the complexity that humans create that we care about, but also there are many other types of complexity, and we should be broad-minded enough to see those as intrinsically valuable as well.”

Well, how do you define complexity? Kolmogorov complexity? That’s maximized when everything is random and structureless. “Oh sorry, I meant entropy!”. Entropy is maximized if everything is a dispersed cloud of gas, same problem! “Oh sorry, I meant negentropy!” Negentropy is maximized if everything is at absolute zero with no motion!

Stop picking random fancy-sounding science words that have vaguely interesting connotations as your most fundamental value. If you crank the dial to infinity, it breaks. This is not a homework problem; stop trying to guess the teacher’s password. There isn’t one, this is about what you value. Is there really no desire or meaning inside you that can’t be captured by “complexity”? And if you say something like “intelligence” instead, you’re not avoiding the problem above because “intelligence”, unlike mere “complexity”, actually is the One True Thing you care about. You’re avoiding it because we can’t define it mathematically—if we could, most likely one look at it would reveal some trivial maximum that would seem as dumb as the ones for “complexity”.
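The Kolmogorov point can be made concrete with a rough sketch. Kolmogorov complexity itself is uncomputable, so the example below assumes compressed length under zlib as a crude upper-bound stand-in; the sample strings and all names are hypothetical, chosen only for illustration.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data.

    This is only a computable upper-bound proxy for Kolmogorov
    complexity (an assumption of this sketch), not the real thing.
    """
    return len(zlib.compress(data, 9))


N = 100_000
rng = random.Random(0)  # fixed seed so the run is reproducible

phrase = b"the calm you feel standing in the middle of a good forest is real. "

samples = {
    # One symbol repeated: about as structureless-by-uniformity as it gets.
    "uniform": b"a" * N,
    # A meaningful sentence repeated until it fills N bytes.
    "prose": (phrase * (N // len(phrase) + 1))[:N],
    # Pseudo-random bytes: no pattern for the compressor to exploit.
    "noise": bytes(rng.getrandbits(8) for _ in range(N)),
}

sizes = {name: compressed_size(data) for name, data in samples.items()}

for name, size in sizes.items():
    print(f"{name:8s} {size:7d} bytes after compression")
```

Under this proxy the noise sample scores far higher than either of the others, because the compressor can barely shrink it at all. If "complexity" in this sense were the thing to maximize, static would beat everything anyone actually cares about.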

Daniel Faggella has perhaps spent the most time putting his thoughts about successionism into words, arguing that we need to create a non-human “Worthy Successor”. Keeping humans—or anything human-like—in the driver’s seat is an unrealistic fantasy about an “Eternal Hominid Kingdom”. Human happiness, love, and meaning are nothing compared to what is ultimately possible. How does he get around the problem where every word you pick for the ultimate thing has some stupid implication if you think about it for five seconds? He invents his own new word: “potentia”.

As when the flaws of a scientific theory are best seen when it fails to predict something near and obvious, the flaws of a moral theory are best seen when it fails on something near and dear.

How does Faggella’s worldview apply to questions before the AI does away with humans? In his essay Muses, not Sirens, Faggella opines on how men (he ignores women) should orient themselves towards the possibility of having AIs as partners. His starting point is:

Humans don’t want relationships. They want to fulfill drives, and feel good feelings, and nothing more

So much for love towards others; glad we can ignore all those poets. Thankfully, Faggella has figured out what the meaningful consequence of relationships really is:

While the marriages of Musk, Bezos, or Gates might have had moments of fulfillment – their dissolution resulted in financial losses and emotional hardship that almost certainly took away from their productive output. The most powerful men of the future may instead be “conjured” to their highest productive activity by the conditional love of a muse.

By “muse”, he means an AI girlfriend that helps men “leverage their inner circuitry of ‘earning a woman’s love’ – and use these virtual lover entities as fuel to focus more and work harder”.

Faggella tweets: “I talk with my fiance about this a LOT [...] she helped me write [the article] [...] i suspect my own relationship is at risk in the decade ahead”.

There are arguments for the necessity of succession that I will address in the next post in this series. But I think it’s first worth talking about cultural history and vibes. Why talk about such things? They’re not arguments, after all. But I think almost everyone is more of a fish in water with respect to the dominant memes and vibes around them than they think, and it is good to understand the sources and cultural support of those memes and vibes (for example, I do this sort of analysis for Yudkowskian rationalism extensively here and here). Remember too that some idea having a history is no intrinsic discredit to the idea, but rather helps explain it. So: why are so many in San Francisco so successionist these days? It begins with the city.

Times and places differ in the extent to which they see Man as fallen. Are we the sinners who fell from Eden, or are we made in the image of God? Are we the technocapitalist scum who pollute the Earth and scroll TikTok, or the species that eradicates smallpox and catches rockets? Are we, in Terry Pratchett’s famous phrase, the falling angel or the rising ape?

If there is one place on Earth in which it is most natural to view Man as fallen, it is San Francisco. Everything human is suffering, corrupt, and pathetic, from the drug-crazed homeless people to the incapable local government to the tech bros complaining about their singleness while attending parties about agentic B2B SaaS. Everything non-human is gleaming and wondrous, from the driverless Waymos to the beautiful nature. Look toward the ground and you see homeless people in broken wheelchairs and the human excrement lining the streets; look up and you see banners promising miracle technologies lining a wallpaper-worthy view of the Bay. Generalize a bit (as STEM folk like to do) and you might come to the conclusion: we are so fallen that we need replacing.

The Athenian spirit of oral debate, local politicking, and leisure stamped itself on the legacy they left to Rome, and from Rome to us. The British Enlightenment’s spirit of pragmatism and tolerance stamped itself onto the industrial revolution that created our world. Similarly, San Francisco’s culture will no doubt leave a long mark on the AI revolution.

There is much to like about San Francisco’s culture. It is irreverent, meritocratic, agentic, and smart. But it is also a place where almost every subculture has an exceptionally thin view of value, whether it’s the worship of pure intellect that many researchers practice, the hedonic-utilitarian bent of Effective Altruism, the solipsism of meditation and psychedelics, or just amoral jockeying to be close to money and prestige.

Once you learn it’s all cooked up in SF, everything makes sense. (source)

One of the great scourges of modern Western culture is safetyism: the excess curtailing of autonomy, choice, and experimentation in favor of safety. A particular flavor of this treats “safety” as more solid when it is imposed by bureaucratic and impersonal processes than when it rests on something as fallible as human judgment. Examples include IRB review boards for medical studies, or strict and slow procurement rules in government. Pseudonymous OpenAI staffer roon recently put this well (though in service of a confusing argument):

(Source)

What Roon misses, of course, is that there is an alternative to governance by the machine or governance by the king: self-determination by yourself, and it is from striving towards this end that democracy and liberalism get their legitimacy.

These systems were designed this way, of course, because individual people were put in charge, and then did bad things, and someone tried to fix that with more process.

Or should I say “of course”? If someone is bad, one response is to try to engineer the system to forbid that badness. Another is to find someone good to replace them. Sometimes one is right, sometimes the other. But if, as a culture, we always leap to only one of these, that is a sign. After what steps are we allowed to say “we did everything we could, and yet still this bad thing happened”? “I trusted in the incorruptible virtue of my good ol’ friend Bob” is obviously out, while “I followed the proper protocol advised by the lawyers and credentialed experts” is probably in. A lot of the time this is making good use of culturally-accumulated knowledge, but it also creates momentum towards gradually snipping off any exercise of agency or choice that is not approved by a process - and specifically, an impersonal process that does not require trust in a person, only in bureaucracy, algorithm, or legible credential.

The final stage of such safetyism is surrender to the machines. A culture that celebrates process over volition, and where any exercise of raw volition is suspect, will naturally rush to have “evidence-based” machine decisions replace every other way of deciding. You do what the AI says, and the responsibility is off your shoulders. You do what the AI says, and you are not making a choice for which you must be culpable, you are simply carrying out the will of pure intelligence itself. It’s the infinitely-credentialed bureaucracy, straight from the wet dreams of our most stifling bureaucrats.

I think, however, that the deepest wellspring of successionism is a quasi-religious sentiment, very common among the scientific elite of the post-war West. I’ll call it neo-Pythagoreanism, after Pythagoras’s cult that emphasized the hidden mathematical nature and perfection of the world.

I want to be very clear: “quasi-religious” is not a derogatory term here. I mean it purely descriptively. I would know - I feel it too.

What is neo-Pythagoreanism? To understand it, let’s trace through the life of a mathematically-gifted child. You go to school, you study a bunch of different subjects, they’re all too easy and you do well in them, but you notice the gap between you and others is larger in a few. Also, you notice that the mathematical or scientific subjects at school, or something outside of school like computers or mathematical textbooks, scratches a certain itch in your brain. You can understand things, play with the symbols, and it makes sense. Your progress is noted, you win accolades, prizes, scholarships, praise. You realize there’s a long, near-infinite path of technical mastery, and it leads up to people like Richard Feynman or Bill Gates or Robert Oppenheimer. You feel the gulf to your peers, and you know that being a Feynman or Gates or Oppenheimer is nearer for you than for them. But what’s more, it’s not just another status hierarchy or competition. There’s a universe out there, a universe of logic and gears, more crystalline and pure than anything earthly. Step into it, more and more, and you discover esoteric truths - that one computer can simulate any other, that there’s near-boundless power locked up inside each dust speck - that are deeply internally coherent, that stack up on each other in a million harmonies more pleasing than any human music, and that do in fact give you actual, verifiable power in the real world, unlike all those false prophets of old. Power over matter, the truths behind creation, the curing of ills and extension of life – all are possible, if only you step far enough outside the human frame of fuzzy intuitions and are willing to walk with the gods through this other place for long enough.

And what makes it ever more powerful is that these are not easy gods who spoil their children. They demand discipline, focus, practice, sometimes even asceticism. You really do have to contort your mind into unnatural shapes before it all makes sense. They make you feel small. Do you know how big space is? Do you know how tiny and fragile and contingent your existence is? Do you know how alone you are, how little help is coming? Unless, of course, you pay enough obeisance to the gods, such that more of the tech tree is given to you! “The imagination of nature is far, far greater than the imagination of man”, said Feynman, and who, who has ever really grokked something in physics, could disagree?

You can have several attitudes towards these gods of neo-Pythagoreanism. You can wonder at their beauty, like Carl Sagan. You can decide with steely resolve to play their game until you too ascend to godhood, like Eliezer Yudkowsky would want to. Or, when presented with a higher power of infinite strength, who lays down the rules around which you must work, you can submit. To play their game, something in that direction is necessary - much like the iron in the gym is unforgiving no matter how much machismo you swagger in with, this game is hard, and you must give it your respect if you want to have any chance to progress.

If all this sounds a bit much to you, you’ve never been around good STEM people – at least, not when they’re off-guard. Mathematicians and programmers alike obsess over elegance. I would’ve gotten into ML at least two years sooner if I hadn’t been entranced by the glory of Lisp (listen, they even wrote songs, okay?). People I knew in undergrad who were also into functional programming would take pay cuts - big ones - to get to program in a nice functional language, rather than commit blasphemy and use

And that other universe is so much more pristine than ours. So much about our world makes sense as long as you can step into that platonic realm for a bit. It’s often a very good heuristic to do so. So why not consult it more and more? Why not start supposing that more and more thorny questions actually have analogues in that parallel world, and we just need to port them over to ours? Why not empathize a bit more with the neo-Pythagorean gods, who always eventually make sense, and a bit less with the messy humans? Surely everything that is good and pure about our world, or at least the best and purest possible form that could exist, must come not from fallible humans or the mortal flesh, but from something that stands on its own without human referent in the mirror world?

The answer is that you get stabbed to death by Hume’s fork. Our world at its most fundamental level is ruled by the eldritch gods of math. But that says nothing about what ought to be true. So stop submitting to math and figure out what you - yes you, with your fleshy human brain in this world - actually want.

Good advice. (Source)

Now, the neo-Pythagorean successionist could retreat from “our world is ruled by neo-Pythagorean gods and hence we should adopt such values” (which is instantly wrong by Hume’s fork), to “the beauty of math revealed to us by our moral intuitions is evidence of the ethical value of neo-Pythagorean abstraction over human experience”. This is a shape of argument that, as I write later, is very acceptable. But there are two things wrong with it.

First, do you actually like math more than life? More than kids? I think many people who claim they do would actually not endorse decisions that reflect those values, even though they might say otherwise.

Secondly, and more fundamentally: when you look at math and feel that it has value, this gives you evidence that your brain looking at math has moral value. You cannot derive from this that math, existing outside your brain, would also have moral value. (More on this in a later post)

Finally, successionism plays on a rich tradition of moral abstraction and of other-worldliness as a moral ideal, that has been prominent especially in the West.

Recall the successionist cry: your love for humanity, for all its warts and failures, for some arbitrary ape-descended lifeform, is nothing when compared to abstract eternal undying goodness in its pure form. There is some point, some pinnacle that mortal man can never reach, that more perfectly embodies whatever it is you think gives the world meaning than anything the crooked timber of humanity could be used to build. Remember the Forest Successionist: for anything you can imagine, there is something like that, but more.

There are echoes here of various good movements. Consider the early Christians who thought nothing of the worldly plane, because it all paled next to the glory of God. Consider the liberal utilitarians who first derived modern morality as a package deal, by caring a lot about raw experience and letting that pave over existing prohibitions. Consider everyone who has ever yearned to sweep away all the cobwebs of the merely material for something more.

Successionism is a meme overfit exactly to that template. Take every happy memory, every smile you’ve seen, every moment of human warmth, and trust that there is something behind all of them that exists independently of any human particularity or detail. Then turn that abstraction behind them to 11, and be happy even if all else is cast away.

In particular, consider the idea of the expanding moral circle. Women’s rights, racial equality, gay rights, and animal welfare have all been about overriding parochial concepts of who matters with belief in some more abstracted, extended version. Together these make up much of the modern moral package, which in turn is one of the great achievements of civilization. Successionism, in taking in AIs (if not in its disposal of humans), is overfit to this - very noble - idea too.

(Note that the fact that successionism feeds on the expanding moral circle does not discredit moral circle expansion, unless you think it’s only a worthwhile argument if it is absolute and universal and unqualified. Note also that there is a chance our civilization will commit tragedies through mistreatment of potentially-sentient future AIs or other created beings – I discuss the right way to orient to this later in this series.)

But if moral abstraction helped drag us out of the muck and into the sunlight, that does not mean that we should float upward all the way until we burn inside the sun.

Moral abstraction is not just an instinct that pulls us forward, but also a great enabler of evil. In particular, we should be deeply skeptical of any philosophy that sneers at the value and dignity of mundane individual human experience, because that is the ground truth our morals are built on. It’s fun to talk about the abstractions, and shocking and novel ideas are incentivized in today’s ecosystem of memes. But when philosophers ascend their ivory tower or researchers climb their stack of abstractions, they are far from those actual mundane human experiences, and easily led astray by their incentives (or hubris). If your brilliant new moral theory says we should feed orphans to alligators, it may be brilliant but it is surely wrong. At the end of the day, it must all add up to normality.

If the above are cultural drivers of momentum towards successionism, are there any cultural antidotes to successionism? For self-interested reasons, I sure hope four-part 19,000-word blog-post series are one, but is there anything else? One obvious answer is to invert all of the above: things that fan whatever you can’t easily find in SF, greater license for agency and volition rather than just process and proof, and respect for what’s mundane and merely human.

But there is another angle, which X/Twitter anon account @softminus has spotted lurking within successionist psychology.

Let’s rephrase the argument for control successionism. At its most basic, it goes something like this: “Sure, I have some ideas of what is good or bad, or what I should do, or how the world should go. But the AI knows better than me. It should call the shots. It should steer me, push me, stop me, reward me, punish me. I should be but a receptacle of its superior, unyielding will.”

Now let’s rephrase the argument for experiential successionism. It goes something like this: “I may think I have the right to remain a player in the world, and propagate my line of existence by having children and having my children have children and so forth. But instead we could have a superior AI, objectively better than me or any human, dominate the world and replace us and sire the beings that control the future in our stead. Just think of how infinitely superior the AI could be. Doesn’t that fill you with desire? Aren’t you just itching to let them do it?”

In meme format, taken and rearranged from @softminus [here](https://x.com/softminus/status/1930520833673634275):

Above: Spot the difference. See also [here](https://x.com/softminus/status/2020681501441224988?s=46).

The generalized antidote is a certain territoriality. You’re allowed – even obligated – to protect yourself. You’re allowed to take ownership of your slice of the universe and resist assaults on it. You’re allowed to have goals just because you have them, without justifying them in “objective” language. In contrast to the cosmo-eugenicist attitude inherent in successionism, you and your lineage should be allowed to both continue living and wield power even if someone somewhere “proved” something else was “more optimal”. Don’t feel like you need to forever apologize to the universe because your carbon atoms could be used for something else instead.

In the next post, I discuss the momentum towards succession in a lot of AI alignment discourse. The AIs will be very smart and very powerful, after all. Will they not usurp the throne, whether we want them to or not? Is it not, in the end, ideal to be ruled by an infinitely benevolent, infinitely intelligent being?

Thanks to Xavi Costafreda-Fu, Aniket Chakravorty, @softminus, Elsie Jang, Yudhi Kumar, Luke Drago, and Arthur Conmy for feedback.

source & further reading

lesswrong.com — original article
