# RAG-Coding: Enhancing LLM Medical Coding with Structured External Knowledge

> Source: <https://arxiv.org/abs/2605.27377>
> Published: 2026-05-28 04:00:00+00:00

arXiv:2605.27377v1
Abstract: We present RAG-Coding, an agentic method for automated ICD-10-CM coding. RAG-Coding orchestrates four large language model (LLM) agents and grounds their coding decisions in external knowledge sources (e.g., the official coding tabular list and guidelines). By retrieving and cross-referencing relevant knowledge from these sources, the agents enhance coding accuracy and ensure clinical compliance. On the MDACE dataset, RAG-Coding outperforms the best LLM-based baseline by 8-13% in micro-F1 and 2-8% in macro-F1 across multiple LLM backbones. Compared to PLM-ICD, the state-of-the-art pretrained language model method, RAG-Coding achieves higher micro recall (+11%), while PLM-ICD achieves higher micro precision (+6%), yielding comparable micro- and macro-F1. Ablations show stepwise gains, highlighting the importance of incorporating external knowledge. We also release MDACE-2025, which updates the original dataset with expert re-annotations based on the latest 2025 ICD-10-CM guidelines. This update features more fine-grained code labels and enables evaluation against current clinical standards.
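The abstract reports both micro- and macro-averaged scores, and the gap between them matters for ICD coding, where a few codes are very frequent and most are rare. The sketch below illustrates the difference on toy data. It is not the paper's evaluation code: the function names (`prf`, `micro_macro`), the example codes, and the choice to macro-average over every code that appears in either the gold or predicted sets are all assumptions made for illustration.

```python
from collections import Counter


def prf(tp, fp, fn):
    """Precision, recall, and F1 from raw counts (0.0 when undefined)."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def micro_macro(gold, pred):
    """Micro and macro (precision, recall, F1) for multi-label code sets.

    gold, pred: lists of sets of code strings, one set per document.
    Assumption: macro averages over all codes seen in gold or pred.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        for code in g & p:
            tp[code] += 1
        for code in p - g:
            fp[code] += 1
        for code in g - p:
            fn[code] += 1

    # Micro: pool counts over all codes, then compute one score.
    micro = prf(sum(tp.values()), sum(fp.values()), sum(fn.values()))

    # Macro: score each code separately, then take the unweighted mean.
    labels = set(tp) | set(fp) | set(fn)
    per_label = [prf(tp[c], fp[c], fn[c]) for c in labels]
    n = len(per_label)
    macro = tuple(sum(s[i] for s in per_label) / n for i in range(3)) if n else (0.0, 0.0, 0.0)
    return micro, macro


# Hypothetical toy data: three documents with gold and predicted code sets.
gold = [{"E11.9", "I10"}, {"I10"}, {"J45.909"}]
pred = [{"E11.9", "I10", "Z79.4"}, {"I10"}, {"J45.901"}]

micro, macro = micro_macro(gold, pred)
print("micro P/R/F1:", [round(x, 3) for x in micro])
print("macro P/R/F1:", [round(x, 3) for x in macro])
```

In this toy case the frequent code `I10` is predicted correctly twice, which lifts the micro scores, while the three codes that are each missed or spuriously predicted once pull the macro scores down. The same mechanism is why a system can gain more in micro-F1 than in macro-F1, or trade precision for recall while keeping F1 roughly unchanged.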
