[ARTICLE · art-16735] src=lesswrong.com pub= topic=ai-safety verified=true sentiment=· neutral

Trans-Humeanism. The Problem of Induction Revisited

A new philosophical argument, termed "Trans-Humeanism," contends that artificial intelligence safety faces a fundamental scientific challenge because its objects of study (AI systems) are unstable and rapidly evolving. The argument, presented by a researcher beginning a PIBBSS Fellowship, claims that classical science can rely on induction because natural phenomena are patient, whereas frontier AI labs are actively developing new models that change before safety conclusions can be reliably drawn. This instability means safety guarantees have a finite shelf life, a problem that intensifies as self-modifying AI systems accelerate their own evolution.

3 min read · published May 28, 2026

I'm writing this up as a quick sketch of an argument that I don't think anyone has explicitly made yet. I am about to start the PIBBSS Fellowship so won't have time to develop it fully, but I believe it could give a useful perspective on why alignment is a difficult new problem for science to deal with. It's also short and general enough that I think others could develop this framing quite quickly into something useful.

Induction is one of the ways in which we acquire knowledge. I'll take the classic example from philosophy seminars.

I'm in a park, observing swans. I notice that all the swans I see are white. I conclude that the next swan I see will also be white and, more generally, that all swans are white.

Hume famously argued that there is no rational basis for the kind of guarantee obtained by this method. He claimed that we believe the Sun will continue to rise in the morning only because we have seen it rise on all previous mornings.

Science typically uses a lot of induction. We make a series of observations, identify a regularity and try to infer the reason behind the regularity.

Induction works well when the objects you're generalizing over are slow-moving. The swans don't evolve whilst you’re counting all the white ones, and the Solar System doesn’t redesign itself on timescales relevant to your attempts to demonstrate the Sun's commitment to sunrising.

Classical science gets away with induction because nature is patient. In some cases it is so patient that we can take the observations of sunrises and planetary motion and infer theoretical reasons why they should be so. This is Newtonian gravity.

Prosaic AI safety is predominantly inductive. You study a system, establish safety claims, and generalize across inputs, time, and deployment contexts. Evals, red-teaming, circuit discovery, and control protocols all seem to follow this pattern.

Unfortunately, whilst nature is patient, frontier labs are not. They are actively trying to develop new models, scaffolding, and whatnot to improve their models' capabilities. In trying to build a science of AI, we are faced with a target that is not only moving but is being actively evolved.

As an analogy, imagine counting swans in the park, but by the time you observe the last swan the first one has already evolved into some new species. The term "swan" in your inductive statement "All swans are white" no longer picks out a clear, consistent object.
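To make the analogy concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the `Swan` class, the `survey` function, and the idea that each swan changes colour at a fixed time step. It models nothing about real AI systems. It only shows how every individual observation can support "all swans are white" while the claim is already false by the time the survey ends.

```python
import math
from dataclasses import dataclass


@dataclass
class Swan:
    # Hypothetical parameter: the time step at which this swan "evolves".
    # math.inf means it never changes.
    turns_black_at: float

    def colour(self, t: float) -> str:
        return "white" if t < self.turns_black_at else "black"


def survey(swans, step=1):
    """Observe the swans one at a time, `step` time units apart.

    Returns a pair:
      - whether every observation, at the moment it was made, was white
      - whether "all swans are white" is still true when the survey ends
    """
    observations = [swan.colour(i * step) for i, swan in enumerate(swans)]
    end_time = len(swans) * step
    all_observed_white = all(c == "white" for c in observations)
    still_true_at_end = all(s.colour(end_time) == "white" for s in swans)
    return all_observed_white, still_true_at_end


# Patient nature: the swans never change during the survey.
patient = [Swan(math.inf) for _ in range(10)]
print(survey(patient))   # (True, True)

# Impatient park: swan i changes colour 3 steps after we look at it.
evolving = [Swan(i + 3) for i in range(10)]
print(survey(evolving))  # (True, False)
```

In the second case the observer never sees a single black swan, so the inductive evidence looks exactly as good as in the first case. The difference only shows up if you compare how long the survey takes with how fast the swans change.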

Thus AI safety as a scientific endeavour is faced with a new problem: its objects of study are not stable, and so its conclusions (and consequently its safety guarantees) are likely to have a finite shelf life, after which they become irrelevant.

A further complication here is that the process of evolution itself is becoming increasingly dominated by those very objects of study. Strongly self-modifying AI systems do not seem too far away. At that point we lose our handle on even the speed at which, and the extent to which, our objects of study evolve.

I've already written about this for the field of mechanistic interpretability, in more detail though with less clarity. Above is what I take to be the philosophical core of that paper's threat model, now distilled and made more generally applicable. It is also one of the problems that my organization, Groundless, is trying to take seriously.

Here are a few rough notes for future development. I'm not strongly attached to any of them, and won't have time to return fully to them for the next few months. I invite people to engage with them.
