Constrained Semantic Decompression in LLMs through Persian Proverb-Conditioned Story Generation

source: arxiv.org · published: 2026-06-12T04:00Z · topic: large-language-models

In a preprint posted to arXiv, researchers introduce a constrained semantic decompression task that tests whether large language models can expand Persian proverbs into morally faithful narratives. Their study, built on the new Proverb Aligned Narrative Dataset (PAND), reports a persistent "decompression gap": LLMs often produce fluent text yet fail to instantiate the moral and causal structure that the proverb encodes. Explicit reasoning and iterative refinement partially mitigate these failures, which suggests that many errors stem from difficulty translating abstract meaning into narrative form rather than from a complete lack of relevant knowledge.

1 min read · published Jun 12, 2026

arXiv:2606.12599v1 · Announce Type: new

Abstract: Transforming a dense, abstract proverb into an engaging and morally faithful narrative requires deep cultural understanding and robust semantic grounding. We frame this problem as a constrained semantic decompression task and study proverb-conditioned story generation as a testbed for abstraction-to-realization in large language models (LLMs). Focusing on Persian, we introduce the Proverb Aligned Narrative Dataset (PAND), which pairs proverbs with human-written stories and explicit meanings. Using a hybrid evaluation framework that combines a human-calibrated LLM-as-a-Judge with structural metrics, we analyze model behavior across multiple prompting regimes. Our findings reveal a persistent decompression gap: current LLMs often achieve strong surface-level fluency while failing to faithfully instantiate the underlying moral and causal structure encoded in proverbs. We further show that explicit reasoning and iterative refinement can partially mitigate these failures, suggesting that many decompression errors arise from difficulties in translating abstract meaning into narrative form rather than from a complete lack of relevant knowledge. Our proposed task naturally extends to other forms of compressed cultural knowledge.
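The abstract mentions iterative refinement guided by a judge but gives no implementation details. The sketch below is only an illustration of what such a loop might look like, not the authors' method. Every name in it (refine_story, toy_generate, toy_judge, the 0.8 threshold, the three-round cap) is a hypothetical choice made for this example, and the generator and judge are toy stand-ins for real LLM calls.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signatures: a generator takes (proverb, meaning, previous_story,
# feedback) and returns a story; a judge takes (story, meaning) and returns
# (score in [0, 1], textual feedback). Neither comes from the paper.
Generator = Callable[[str, str, Optional[str], Optional[str]], str]
Judge = Callable[[str, str], Tuple[float, str]]


def refine_story(
    proverb: str,
    meaning: str,
    generate: Generator,
    judge: Judge,
    threshold: float = 0.8,  # assumed stopping score, not from the paper
    max_rounds: int = 3,     # assumed refinement budget, not from the paper
) -> Tuple[str, float, int]:
    """Draft a story, then revise it until the judge score reaches the threshold."""
    story = generate(proverb, meaning, None, None)
    score, feedback = judge(story, meaning)
    rounds = 0
    while score < threshold and rounds < max_rounds:
        candidate = generate(proverb, meaning, story, feedback)
        cand_score, cand_feedback = judge(candidate, meaning)
        rounds += 1
        # Keep the revision only if the judge rates it higher than the current draft.
        if cand_score > score:
            story, score, feedback = candidate, cand_score, cand_feedback
    return story, score, rounds


def toy_generate(proverb, meaning, previous, feedback):
    """Toy stand-in for an LLM: the first draft omits the moral, revisions add it."""
    if previous is None:
        return f"A farmer once heard the saying: {proverb}."
    return f"{previous} In the end he learned that {meaning}."


def toy_judge(story, meaning):
    """Toy stand-in for an LLM judge: checks only that the moral appears verbatim."""
    if meaning in story:
        return 1.0, "moral is instantiated"
    return 0.3, "story never enacts the moral"
```

Calling refine_story("<proverb>", "patience pays off", toy_generate, toy_judge) runs one revision round with these toy components. In a real setup the judge would need to assess moral and causal fidelity rather than string overlap, which is exactly the distinction the decompression gap draws between surface fluency and faithful instantiation.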

source & further reading

Read original on arxiv.org → arxiv.org/abs/2606.12599
