Do Language Models Know What Not to Say? Causal Evidence for Statistical Preemption in LLMs

A new computational study provides causal evidence that large language models acquire knowledge of unacceptable grammatical constructions through statistical preemption, a mechanism previously theorized in Construction Grammar. Across four experiments with 120 English verb-construction pairings (dative, causative, and locative), researchers found that LLM surprisal patterns correlate strongly with human acceptability judgments (r = 0.79) and that manipulating competing-form frequencies in fine-tuning shifts model behavior in the predicted direction. The findings indicate that neural language models learn negative linguistic knowledge through distributional competition, without explicit negative evidence.

arXiv:2605.23039v1

Abstract: How do learners acquire knowledge of what is unacceptable without negative evidence? Construction Grammar proposes statistical preemption: exposure to a conventional form (e.g., "donated the books to the library") preempts structurally possible but unattested alternatives ("donated the library the books"). We present a computational study that, for the first time, directly dissociates statistical preemption from the competing entrenchment hypothesis in large language models within a single converging design. Across four experiments spanning 120 English verb-construction pairings (dative, causative, locative), we show that (1) LLM surprisal patterns correlate strongly with human acceptability judgments ($r = 0.79$), validated against three independent behavioral datasets; (2) these patterns are driven by competing-form frequency rather than overall verb frequency, confirmed by non-circular partial correlations; (3) preemption sensitivity scales as a power law with model size; and (4) a controlled fine-tuning intervention causally demonstrates that manipulating competing-form frequencies shifts preemption behavior in the predicted direction, with reverse-direction controls ruling out frequency-sensitivity confounds. These results provide converging evidence that neural language models acquire negative linguistic knowledge through distributional competition, the core mechanism posited by Construction Grammar.
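The abstract names three quantitative ingredients: surprisal differences between competing forms, partial correlations that separate competing-form frequency from overall verb frequency, and a power-law fit against model size. The abstract does not describe the authors' actual pipeline, so the following is only a minimal sketch of how such quantities are commonly computed. All function names and all numbers are hypothetical and chosen for illustration; the token probabilities are invented, not model output.

```python
import math

def surprisal_bits(token_probs):
    """Total surprisal in bits of a sequence, given per-token probabilities."""
    return -sum(math.log2(p) for p in token_probs)

def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, linearly controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

def fit_power_law(sizes, values):
    """Least-squares fit of values ~ a * sizes**b in log-log space; returns (a, b)."""
    lx = [math.log(s) for s in sizes]
    ly = [math.log(v) for v in values]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    b = sum((u - mx) * (v - my) for u, v in zip(lx, ly)) / sum((u - mx) ** 2 for u in lx)
    a = math.exp(my - b * mx)
    return a, b

# Invented per-token probabilities for the two competing forms (assumption).
conventional = [0.5, 0.25, 0.5]    # stands in for "donated the books to the library"
unattested = [0.5, 0.125, 0.25]    # stands in for "donated the library the books"

# Positive score: the unattested form is more surprising than its competitor.
preemption_score = surprisal_bits(unattested) - surprisal_bits(conventional)
print(preemption_score)  # 2.0 bits for these toy numbers
```

In this framing, `partial_corr(score, competing_freq, verb_freq)` would ask whether competing-form frequency still predicts the preemption score once overall verb frequency is held fixed, and `fit_power_law(model_sizes, sensitivities)` would return the exponent `b` of the scaling relation. Both usages assume hypothetical per-item and per-model lists that are not given in the abstract.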