{"slug": "reinforcement-learning-with-metacognitive-feedback", "title": "Reinforcement Learning with Metacognitive Feedback", "summary": "Researchers introduced reinforcement learning with metacognitive feedback (RLMF) to improve large language models' ability to express uncertainty faithfully. The method outperformed standard reinforcement learning by up to 63% on calibration tasks, enhancing models' self-assessment of knowledge boundaries. This approach positions metacognitive performance as a novel reinforcement signal for more trustworthy AI.", "body_md": "# Computer Science > Computation and Language\n\n[Submitted on 30 Jun 2026]\n\n# Title: Reinforcement Learning with Metacognitive Feedback Elicits Faithful Uncertainty Expression in LLMs\n\n[View PDF](/pdf/2606.32032)\n\n[HTML (experimental)](https://arxiv.org/html/2606.32032v1)\n\nAbstract: Metacognition is a critical component of intelligence that describes the ability to monitor and regulate one's own cognitive processes. Yet LLMs exhibit systemic deficiencies in key metacognitive faculties: they hallucinate with high confidence, fail to recognize knowledge boundaries, and misrepresent their internal uncertainty--undermining trustworthiness and reliability. Since monitoring task performance and adapting behavior accordingly are central to metacognition, we posit that models capable of accurately judging their own performance are better positioned to improve it. We operationalize this idea via two novel mechanisms: reinforcement learning with metacognitive feedback (RLMF), a paradigm to refine completion rankings during preference optimization based on the quality of a model's self-judgments of performance, and metacognitive data selection, which uses similar self-judgments to identify high-value training examples, outperforming naive active learning. 
We apply these innovations to the problem of faithful calibration (FC), a task that is itself fundamentally metacognitive: the goal is to align expressed with intrinsic uncertainty, difficult even for frontier LLMs. We adopt a two-stage, decoupled approach, first using these methods to calibrate the faithfulness of models' self-reported confidence scores, then mapping to natural, context-adaptable linguistic uncertainty via targeted output editing. Extensive experiments show RLMF achieves generalizable, state-of-the-art FC on diverse tasks while preserving accuracy. Further, RLMF surpasses standard RL by up to 63% while enhancing models' ability to assess and express their own capability limits. This positions RLMF as a promising paradigm to enhance LLM metacognition toward improved abilities and alignment, and suggests metacognitive performance as an effective RL signal to overcome limits of prior intrinsic feedback methods.", "url": "https://wpnews.pro/news/reinforcement-learning-with-metacognitive-feedback", "canonical_source": "https://arxiv.org/abs/2606.32032", "published_at": "2026-07-01 05:34:21+00:00", "updated_at": "2026-07-01 05:49:43.133526+00:00", "lang": "en", "topics": ["large-language-models", "ai-safety", "ai-research", "machine-learning"], "entities": [], "alternates": {"html": "https://wpnews.pro/news/reinforcement-learning-with-metacognitive-feedback", "markdown": "https://wpnews.pro/news/reinforcement-learning-with-metacognitive-feedback.md", "text": "https://wpnews.pro/news/reinforcement-learning-with-metacognitive-feedback.txt", "jsonld": "https://wpnews.pro/news/reinforcement-learning-with-metacognitive-feedback.jsonld"}}