Mitigating Position Bias in Transformers via Layer-Specific Positional Embedding Scaling

Researchers introduced LPES, a layer-specific positional embedding scaling method that mitigates the 'lost-in-the-middle' problem in LLMs by assigning distinct scaling factors to each layer, achieving up to 11.2% accuracy gain on key-value retrieval without fine-tuning or latency increase.

arXiv:2606.27705v1 Announce Type: new Abstract: Large Language Models LLMs still struggle with the lost-in-the-middle'' problem, where critical information located in the middle of long-context inputs is often underrepresented or lost. While existing methods attempt to address this by combining multi-scale rotary position embeddings RoPE , they typically suffer from high latency or rely on suboptimal hand-crafted scaling strategies. To overcome these limitations, we introduce a layer-specific positional embedding scaling~ LPES method that assigns distinct scaling factors to each layer. LPES achieves a more balanced attention distribution without fine-tuning model parameters or increasing inference delay. A specially designed genetic algorithm is employed to efficiently select the optimal scaling factors for each layer by incorporating B\'{e}zier curves to significantly reduce the search space. Extensive experiments demonstrate that LPES effectively mitigates positional attention bias and delivers consistent improvements across multiple long-context benchmarks, yielding up to an $11.2$\% accuracy gain on the key-value retrieval dataset.