Can AI Guess What You Know? Performance Comparison of Large Language Models for Human Domain Knowledge Estimation From Communication Logs

wpnews.pro

[ARTICLE · art-13635] src=arxiv.org ↗ pub=2026-05-25T04:00Z topic=large-language-models verified=true sentiment=· neutral

Researchers tested seven large language models on their ability to infer employee expertise from 27,188 Slack messages written by 43 users. Gemini 2.5 Flash achieved the lowest estimation error, with a mean absolute error (MAE) of 21.13%. The study compared zero-shot model estimates against self-reported skill ratings from 27 participants and found that GPT models produced significantly larger discrepancies and that accuracy depended only weakly on message volume. The findings demonstrate the feasibility of automated expertise mapping from communication logs while highlighting its current limits and the need for privacy-preserving deployments.

Published May 25, 2026

arXiv:2605.22971v1. Abstract: Employees often struggle to identify "who knows what," leading to organizational productivity losses. We investigate whether Large Language Models (LLMs) can infer individual domain knowledge directly from long-term Slack logs. Analyzing 27,188 messages from 43 users, we evaluated seven models (including Gemini, Claude, and GPT families) by comparing their zero-shot estimates against self-reported skill ratings from 27 participants. Gemini 2.5 Flash achieved the lowest error (MAE 21.13%), while GPT models showed significantly larger discrepancies. Notably, estimation accuracy depended only weakly on message volume, indicating that more text alone does not guarantee better inference. These findings demonstrate the feasibility and current limits of automated expertise mapping, highlighting the need for privacy-preserving deployments and richer, structure-aware representations of human knowledge.
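For readers unfamiliar with the metric, the sketch below shows how a mean absolute error between model estimates and self-reported ratings could be computed. It is illustrative only: the abstract does not describe the rating scale or the evaluation code, so the 0-100 scale, the function name `mean_absolute_error`, and all sample values here are assumptions, not data from the study.

```python
from statistics import mean


def mean_absolute_error(estimates, self_reports):
    """Average absolute gap between model estimates and self-reported ratings."""
    if len(estimates) != len(self_reports):
        raise ValueError("both sequences must have the same length")
    if not estimates:
        raise ValueError("at least one rating pair is required")
    return mean(abs(e - s) for e, s in zip(estimates, self_reports))


# Hypothetical skill ratings on an assumed 0-100 scale (not from the paper).
model_estimates = [70, 40, 90, 20]
self_reported = [50, 60, 80, 30]

mae = mean_absolute_error(model_estimates, self_reported)
print(f"MAE: {mae:.2f} percentage points")
```

Under this assumed scale, a lower value means the model's guesses sit closer to what participants said about themselves, which is the sense in which the abstract ranks the seven models.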

source & further reading

Read original on arxiv.org → arxiv.org/abs/2605.22971
