AI search agents often confirm what they already know instead of actually researching the web

Leading AI search agents such as GPT-5.4 and Kimi K2.6 primarily use the web to confirm pre-existing knowledge rather than to conduct genuine research, according to researchers at the Harbin Institute of Technology. A new time-based benchmark, LiveBrowseComp, which tests only events from the last 90 days, revealed that the models' performance collapses when they cannot rely on memory, which reshuffles the existing rankings.

On established benchmarks, leading AI search agents like GPT-5.4 and Kimi K2.6 don't appear to do much actual research. They mostly use the web to confirm what they already learned during training. Researchers at the Harbin Institute of Technology found this using a new time-based benchmark called LiveBrowseComp, which only asks about events from the last 90 days. Once the models can't fall back on memory, their performance falls apart and the existing rankings get reshuffled.

The article "AI search agents often confirm what they already know instead of actually researching the web" (https://the-decoder.com/ai-search-agents-often-confirm-what-they-already-know-instead-of-actually-researching-the-web/) appeared first on The Decoder (https://the-decoder.com).