Model Collapse as Cultural Evolution

Source: arxiv.org · Published: 2026-05-25T04:00Z · Topic: large-language-models

A new study posted to arXiv argues that model collapse in large language models (LLMs), the progressive degradation of models trained on their own outputs, follows patterns predicted by iterated learning theory from cultural evolution. Researchers self-trained LLaMA-2-7B and Mistral-7B over 10 generations in English, German, and Turkish, and found that compositionality first rises and then falls under unfiltered self-training. The models' regularization gradients also closely match human behavioral data ($R^2 = 0.94$). According to the authors, the findings reframe model collapse as a cultural transmission phenomenon and yield concrete principles for designing self-training pipelines.

arXiv:2605.23054v1
Abstract: Model collapse, the progressive degradation of LLMs trained on their own outputs, has been characterized statistically but lacks a linguistic explanation for which structures degrade, in what order, and why. We show that iterated learning theory from cultural evolution fills this gap. We derive five falsifiable predictions, distinguish those uniquely discriminative for the theory from confirmatory ones, and test them by self-training LLaMA-2-7B and Mistral-7B over 10 generations in English, German, and Turkish. The critical discriminative finding: compositionality follows a non-monotonic trajectory (initially rising, then falling) under unfiltered self-training. This signature persists with maximally regular seed data (ruling out noise removal) and is sustained only by task-grounded filtering, not random filtering, providing the first LLM-scale evidence for the compression-communication tradeoff. All predictions are confirmed with large effect sizes (Hedges' $g > 1.6$; $\mathrm{BF}_{10} > 100$), and LLM regularization gradients closely match human behavioral data ($R^2 = 0.94$). These results reframe model collapse as a cultural transmission phenomenon and yield concrete principles for self-training pipeline design.
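For readers unfamiliar with the effect-size statistic quoted in the abstract, here is a minimal sketch of how Hedges' $g$ is commonly computed for two independent samples: Cohen's $d$ with a pooled standard deviation, multiplied by the usual small-sample correction factor. The function name `hedges_g` and the sample numbers are illustrative assumptions, not the paper's code or data, and the paper may use a different variant of the statistic.

```python
import math
from statistics import mean, variance


def hedges_g(a, b):
    """Hedges' g for two independent samples (illustrative helper, not from the paper)."""
    n1, n2 = len(a), len(b)
    # Pooled variance from the two unbiased sample variances.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    # Cohen's d: mean difference in pooled standard deviation units.
    d = (mean(a) - mean(b)) / math.sqrt(pooled_var)
    # Common approximation to the small-sample bias correction.
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return d * correction


# Made-up scores for two hypothetical conditions, purely for illustration.
g = hedges_g([2, 4, 6], [1, 2, 3])
print(round(g, 3))
```

By the conventional rule of thumb, values above 0.8 already count as large, so the reported $g > 1.6$ indicates group means more than one and a half pooled standard deviations apart.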

source & further reading

Read original on arxiv.org → arxiv.org/abs/2605.23054
