Self-supervised User Profile Generation for Personalization


arxiv.org · published 2026-06-05 · topic: large-language-models

Researchers have developed BUMP, a self-supervised framework that trains large language models to generate personalized user profiles without requiring labeled data from downstream tasks. The system uses a bidirectional ranking objective to learn from raw user interaction histories, matching or exceeding the performance of methods that depend on expensive annotated supervision. This approach could enable more scalable personalization across recommendation, search, and dialogue systems by eliminating the need for task-specific labels.

1 min read · published Jun 5, 2026

arXiv:2606.05336v1

Abstract: Personalizing large language models (LLMs) has become a central challenge as LLMs are deployed across recommendation, search, dialogue, and content generation -- settings where the same query should yield different answers for different users. A promising route is to summarize each user's interaction history into a natural-language memory or profile and prepend it to the prompt to facilitate personalization. Existing methods learn such profile generators with explicit rewards derived from labeled downstream tasks, which are expensive and sparse because they require annotated supervision for every target task. To address this challenge, we introduce Bidirectional User Modeling via Profiles (BUMP), a self-supervised framework that trains a profile generator without any downstream labels. Specifically, given a user's interaction history, we use GRPO to train an LLM to emit a free-form textual profile under a bidirectional in-batch ranking objective: a small LLM judge measures (i) how well the generated profile, used as a query, ranks the user's own held-out interactions above interactions from other users in the batch, and (ii) how well a held-out interaction, used as a query, ranks the user's own profile above profiles of other users. Both directions are scored with multi-positive NDCG and combined into a dense reward per rollout; other users in the batch supply free negatives, so every training example yields supervision from raw interaction logs alone. Evaluated on the LaMP benchmark, BUMP matches or outperforms closed-source APIs and prior methods relying on labeled rewards, while requiring no task labels during training.
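The bidirectional reward described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the function names, the word-overlap `toy_judge` (a stand-in for the small LLM judge), the equal weighting of the two directions, the averaging over held-out interactions, and the use of one profile per user are all assumptions. The abstract states only that both directions are scored with multi-positive NDCG and combined into one reward. The last function shows the group-relative reward normalization commonly associated with GRPO, again as an assumption about how such a reward might be consumed.

```python
import math
from statistics import mean, pstdev


def ndcg_multi_positive(scores, is_positive):
    """NDCG with binary relevance where several candidates may be positive.

    Ties in `scores` are broken by input order (an arbitrary choice here).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, i in enumerate(order) if is_positive[i])
    n_pos = sum(is_positive)
    if n_pos == 0:
        return 0.0
    # Ideal ranking places all positives at the top.
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(n_pos))
    return dcg / idcg


def toy_judge(query, candidate):
    """Hypothetical stand-in for the LLM judge: Jaccard word overlap."""
    q, c = set(query.lower().split()), set(candidate.lower().split())
    return len(q & c) / len(q | c) if (q | c) else 0.0


def bidirectional_reward(judge, profiles, held_out, user):
    """Reward for `user`'s profile given a batch.

    profiles: one profile string per user in the batch.
    held_out: held_out[u] is the list of held-out interactions of user u.
    """
    # Direction (i): profile as query, rank every held-out interaction in
    # the batch; the user's own interactions are the positives.
    candidates = [(u, x) for u, items in enumerate(held_out) for x in items]
    scores = [judge(profiles[user], x) for _, x in candidates]
    positives = [u == user for u, _ in candidates]
    profile_to_items = ndcg_multi_positive(scores, positives)

    # Direction (ii): each held-out interaction as query, rank all profiles;
    # with one profile per user there is a single positive per query.
    per_item = []
    for x in held_out[user]:
        s = [judge(x, p) for p in profiles]
        pos = [u == user for u in range(len(profiles))]
        per_item.append(ndcg_multi_positive(s, pos))
    items_to_profile = mean(per_item)

    # Equal weighting is an assumption of this sketch.
    return 0.5 * (profile_to_items + items_to_profile)


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within a group of rollouts (GRPO-style)."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    profiles = ["likes sci-fi movies space", "enjoys cooking italian recipes"]
    held_out = [["watched space movies", "rated sci-fi film"],
                ["saved italian pasta recipes"]]
    print(bidirectional_reward(toy_judge, profiles, held_out, user=0))
```

In this toy batch, a profile that matches the user's held-out interactions ranks them above the other user's interaction in both directions, so it earns a higher reward than a mismatched profile would. Other users in the batch act as negatives, and no task labels are involved.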

source & further reading

arxiv.org — original article

Read original on arxiv.org → arxiv.org/abs/2606.05336

