# Trans-Humeanism. The Problem of Induction Revisited

> Source: <https://www.lesswrong.com/posts/gdPcvCpATLA7mPZ4X/trans-humeanism-the-problem-of-induction-revisited>
> Published: 2026-05-28 18:10:49+00:00

*I'm writing this up as a quick sketch of an argument that I don't think anyone has explicitly made yet. I am about to start the PIBBSS Fellowship so won't have time to develop it fully, but I believe it could give a useful perspective on why alignment is a difficult new problem for science to deal with. It's also short and general enough that I think others could develop this framing quite quickly into something useful.*

Induction is one of the ways in which we acquire knowledge. I'll take the classic example from philosophy seminars.

I'm in a park, observing swans. I notice that all the swans I see are white. I conclude that the next swan I see will also be white, and more generally that all swans are white.

Hume famously argued that there is no basis for the kind of guarantee this method appears to provide. He observed that we believe the Sun will continue to rise in the morning only because we have seen it rise on all previous mornings, and past regularity alone does not logically secure future regularity.

Science typically uses a lot of induction. We make a series of observations, identify a regularity and try to infer the reason behind the regularity.

Induction works well when the objects you're generalizing over are slow-moving. The swans don't evolve whilst you’re counting all the white ones, and the Solar System doesn’t redesign itself on timescales relevant to your attempts to demonstrate the Sun's commitment to sunrising.

Classical science gets away with induction because nature is patient. In some cases it is so patient that we can take the observations of sunrises and planetary motion and infer theoretical reasons why they should be so. This is Newtonian gravity.

Prosaic AI safety is predominantly inductive. You study a system, establish safety claims, and generalize across inputs, time, and deployment contexts. Evals, red-teaming, circuit discovery, and control protocols all seem to follow this pattern.

Unfortunately, whilst nature is patient, frontier labs are not. They are actively trying to develop new models, scaffolding, and whatnot to improve their models' capabilities. In trying to build a science of AI, we are faced with a target that is not only moving but is being actively evolved.

As an analogy, imagine counting swans in the park, but by the time you observe the last swan the first one has already evolved into some new species. The term "swan" in your inductive statement "All swans are white" no longer picks out a clear, consistent object.

Thus AI safety as a scientific endeavour is faced with a new problem: its objects of study are not stable, and so its conclusions (and consequently its safety guarantees) are likely to have a finite shelf life, after which they become irrelevant.
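The shelf-life idea can be made concrete with a toy sketch. Everything in it is an illustrative assumption rather than a claim about real systems: the object of study drifts linearly by `drift_per_step` per unit time, a claim counts as expired once accumulated drift exceeds `tolerance`, and the function names are hypothetical. It only shows the shape of the argument: with zero drift a claim never expires, and with nonzero drift a study that takes longer than the shelf life is stale before it finishes.

```python
from typing import Optional


def shelf_life(drift_per_step: float, tolerance: float,
               horizon: int = 10_000) -> Optional[int]:
    """Return the first step at which accumulated drift exceeds `tolerance`.

    Assumes (for illustration only) that the studied object drifts linearly.
    Returns None if the claim stays within tolerance for the whole horizon,
    which is the "patient nature" case.
    """
    for t in range(horizon + 1):
        if t * drift_per_step > tolerance:
            return t
    return None


def stale_on_arrival(study_steps: int, drift_per_step: float,
                     tolerance: float) -> bool:
    """True if the claim expires before the study establishing it is done.

    This is the swan-counting analogy: the first swan has changed species
    by the time the last swan has been observed.
    """
    life = shelf_life(drift_per_step, tolerance)
    return life is not None and life <= study_steps


if __name__ == "__main__":
    # Patient nature: no drift, so the inductive claim never expires.
    print(shelf_life(drift_per_step=0.0, tolerance=0.055))
    # Actively evolved target: the claim expires after a finite number of steps.
    print(shelf_life(drift_per_step=0.01, tolerance=0.055))
    # A ten-step study of that target is out of date before it concludes.
    print(stale_on_arrival(study_steps=10, drift_per_step=0.01, tolerance=0.055))
```

The linear drift model is the weakest assumption here. If the rate of change is itself unknown or accelerating, `shelf_life` cannot be computed in advance at all.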

A further complication here is that the process of evolution itself is becoming increasingly dominated by those very objects of study. Strongly self-modifying AI systems do not seem too far away. At this point we lose even a handle on the speed at which, and the extent to which, our objects of study evolve.

[I've already written about this](https://www.lesswrong.com/posts/X3buTt6yWZKjd8vSZ/management-of-substrate-sensitive-ai-capabilities-mossaic) for the field of mechanistic interpretability, in more detail though with less clarity. Above is what I take to be the philosophical core of that paper's threat model, now distilled and made more generally applicable. It is also one of the problems that my organization, [Groundless](https://groundless.ai/), is trying to take seriously.

Here are a few rough notes for future development. I'm not strongly attached to any of them, and won't have time to return fully to them for the next few months. I invite people to engage with them.
