Dunning-Kruger and the Communication Tax

Communication between humans and LLMs incurs a "communication tax" due to mismatched domain expertise, requiring one party to simplify or adapt their output. This tax is formalized as a cost function combining information loss (Kullback-Leibler divergence) and cognitive encoding/decoding effort, which increases with capability mismatch. The author notes that while human conversations use social lubrication to manage this tax, current LLM calibration methods (like preference prompts) are clumsy and force users to explicitly define adaptation dimensions without visibility into what the model internalizes.

The sociological impact of the Dunning-Kruger effect is often understated and hidden under the expectation of social lubrication, but this breaks down during human-LLM communication. Humans and LLMs occupy different ends of domain expertise: humans understand the world, society, and other humans better, but LLMs have access to nearly all the world's knowledge. One may imagine alignment as making AI kinder, more helpful, and otherwise more anthropomorphic, but the conditions for this may be harder to meet than expected, because alignment is rooted in a communication "tax" paid during a conversation.

In the simplest case, a human consults an LLM about an area they are unfamiliar with, much like searching Google, so the LLM has more domain knowledge than the human. Here the human typically expects the LLM to "dumb down" the topic, either implicitly or on request, so that the human can understand. This "dumbing down" is a tax the LLM pays, in tokens and electricity, to communicate with someone who has less domain knowledge. On the other hand, when the human is the expert and the LLM is responding in the average register it has learned, the tax flips.
The human pays it in long preference prompts, in re-prompting, and in the mental work of filtering out hedges that shouldn't have been there. In human-to-human conversation this gets papered over by social lubrication such as tone of voice, softening when you're not sure, and backing up when you notice you've lost the other person. Across enough interactions both sides build a model of each other, and the lubrication becomes what the calibration rests on.

LLMs have mechanisms aimed at this, including user preference prompts, CLAUDE.md and similar project-level rules, and explicit memory features in some chatbots. They are how the calibration is built today, and they work to a point, but the mechanisms are clumsy and carry hidden failure modes. The user has to write the calibration explicitly. Worse, the user has to figure out which dimensions to name in the first place, and the dimensions of human-side cost are not obvious. The user has no visibility into what the model actually internalized versus what it discarded, and calibration written for one task tends to misfire in another.

So I tried to quantify and understand the mechanism underlying this calibration.

The framework

Communication runs between two agents, a sender s and a receiver r, over time t. The question is what produces aversion in the sender as the interaction unfolds.

Capability

Let $K_s, K_r \in \mathbb{R}_{\ge 0}$ denote the capability of sender and receiver in the relevant content domain (the resolution at which content can be represented, manipulated, and transmitted without loss). Capability is domain-specific: agents can have high K in one domain and low K in another. K is treated as approximately stationary on the timescale of a single interaction and allowed to vary across interactions.
Communication cost

The communication cost $C_t$ is the cost incurred in making the exchange work at time t:

$$C_t = D_{\mathrm{KL}}\left(P^{\text{intended}}_t \,\|\, P^{\text{transmitted}}_t\right) + W^{\text{encode}}_t + W^{\text{decode}}_t$$

where $P^{\text{intended}}_t$ is the distribution over content the sender means to convey at time t, $P^{\text{transmitted}}_t$ is the distribution that actually reaches the receiver after encoding, transmission, and decoding, $D_{\mathrm{KL}}$ is the Kullback–Leibler divergence capturing information loss in the channel, $W^{\text{encode}}_t$ is the cognitive work the sender pays to compress content into a form the receiver can decode, and $W^{\text{decode}}_t$ is the receiver's decoding work.

$C \to 0$ when capabilities match for the content domain and compression schemes are shared (the shared-prior case). C rises with capability mismatch and with mismatched compression schemes.

A units note. $D_{\mathrm{KL}}$ is information-theoretic (bits or nats); $W^{\text{encode}}$ and $W^{\text{decode}}$ are effort-valued. The scalar form $C = D_{\mathrm{KL}} + W^{\text{encode}} + W^{\text{decode}}$ requires a units convention that casts the work terms in information-theoretic units (for instance, the channel capacity each side commits to the exchange). The cleaner reading keeps the components as a vector; the vector extension below takes this up.

Tokens are not the cost unit. Token counts can be correlated with C in some regimes and anti-correlated in others; the cost is information-theoretic at root.

Expected cost

The expected cost $E_t$ is the agent's prior at time t on what C should be. E is set initially by category cues (kinship, shared profession, prior interactions, stated specifications) and updated by experience. E is partly produced by the interlocutor's earlier outputs: the prior is endogenous to the system, not exogenous.

E is not updated by Bayesian rules in general. Real agents exhibit prior stickiness (priors that persist despite contrary evidence) and category snap-back (priors that reset to a category default after each interaction ends). These deviations from rational updating are accommodated as empirical inputs.
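To make the definitions above concrete, here is a minimal Python sketch, not a definitive implementation. It assumes discrete distributions over a shared set of outcomes and work terms that have already been converted to bits. The function names are hypothetical, and the sticky update in `expected_cost_trajectory` is an illustrative stand-in (simple exponential smoothing) for the non-Bayesian behavior described above, not a rule taken from the text.

```python
import math


def kl_divergence_bits(p, q):
    """D_KL(p || q) in bits for discrete distributions over the same outcomes."""
    if len(p) != len(q):
        raise ValueError("p and q must cover the same outcomes")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # a term with p_i = 0 contributes nothing
        if q_i == 0.0:
            return math.inf  # intended content that never reaches the receiver
        total += p_i * math.log2(p_i / q_i)
    return total


def communication_cost(p_intended, p_transmitted, w_encode, w_decode):
    """Return C_t as (components, scalar).

    Assumption: w_encode and w_decode are already expressed in bits, which is
    the units convention the scalar sum needs. The tuple keeps the three
    components separate for readers who prefer not to add them.
    """
    components = (
        kl_divergence_bits(p_intended, p_transmitted),
        w_encode,
        w_decode,
    )
    return components, sum(components)


def expected_cost_trajectory(category_default, observed_costs, stickiness=0.8):
    """Hypothetical sticky update for E_t (an assumption, not the author's rule).

    E starts at the category default, which also models snap-back at the start
    of each new interaction, and moves only a fraction (1 - stickiness) of the
    way toward each observed cost.
    """
    if not 0.0 <= stickiness <= 1.0:
        raise ValueError("stickiness must lie in [0, 1]")
    expected = category_default
    trajectory = [expected]
    for cost in observed_costs:
        expected = stickiness * expected + (1.0 - stickiness) * cost
        trajectory.append(expected)
    return trajectory


if __name__ == "__main__":
    intended = [0.5, 0.25, 0.25]
    transmitted = [0.25, 0.5, 0.25]
    parts, total = communication_cost(intended, transmitted, w_encode=1.0, w_decode=0.5)
    print("components (bits):", parts)
    print("scalar cost (bits):", total)
    print("E trajectory:", expected_cost_trajectory(2.0, [total, total, total]))
```

With these toy distributions the divergence term is 0.25 bits, and with identical distributions and zero work the cost is exactly zero, matching the shared-prior case. Setting `stickiness` close to 1 reproduces a prior that barely moves despite contrary evidence.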
The endogeneity is the structurally novel point. $E_t$ is a functional of the cost history {C t′ } {t′