{"slug": "profillm-utility-aligned-agentic-user-profiling-for-industrial-ride-hailing", "title": "ProfiLLM: Utility-Aligned Agentic User Profiling for Industrial Ride-Hailing Dispatch", "summary": "Researchers from DiDi introduced ProfiLLM, an agentic LLM data pipeline that generates utility-aligned user profiles for industrial ride-hailing dispatch. Deployed on DiDi's production dispatcher, ProfiLLM achieved up to +6.14% relative AUC improvement in outcome prediction and up to +4.35% GMV gain in dispatching simulation, with consistent gains in a 14-day online A/B test including +0.47% GMV, +0.33% Completion Rate, and -0.82% Cancel-Before-Accept rate.", "body_md": "arXiv:2606.18803v1\nAbstract: Bringing Large Language Models (LLMs) into industrial ride-hailing dispatch as semantic feature extractors over platform-scale behavioral logs is a compelling but under-explored data systems problem. Production matching pipelines remain dominated by structured numerical features, yet decisive behavioral signals (e.g., a driver's habitual aversion to certain regions) are inherently contextual and naturally expressible as LLM-generated user profiles. However, scaling such profiling to a live, millisecond-latency dispatcher faces three intertwined constraints rarely addressed together: on a platform with millions of daily orders, logs exceed any LLM's context window by orders of magnitude; most users are long-tail, with too few interactions for per-user profiling; and surface-fluent profiles do not necessarily improve downstream prediction utility. We present ProfiLLM, an agentic LLM data pipeline that operationalizes utility-aligned user profiling for production matching systems through two modules. (1) Tool-Augmented Global Knowledge Mining equips an LLM agent with 27 analytical tools to mine platform-scale data, producing reusable global knowledge, adaptive user clustering rules, and region-level supply-demand priors. 
(2) Utility-Aligned Profile Exploration generates multiple candidate profiles per cluster, evaluates them via a lightweight downstream utility proxy, iteratively refines the best candidates and constructs preference pairs for DPO fine-tuning. Deployed on DiDi's production dispatcher, ProfiLLM achieves up to +6.14% relative AUC improvement in outcome prediction, up to +4.35% GMV gain in dispatching simulation, and consistent improvements in a 14-day online A/B test including +0.47% GMV, +0.33% Completion Rate, and -0.82% Cancel-Before-Accept rate.", "url": "https://wpnews.pro/news/profillm-utility-aligned-agentic-user-profiling-for-industrial-ride-hailing", "canonical_source": "https://arxiv.org/abs/2606.18803", "published_at": "2026-06-18 04:00:00+00:00", "updated_at": "2026-06-18 04:23:19.862549+00:00", "lang": "en", "topics": ["large-language-models", "ai-agents", "ai-infrastructure", "machine-learning"], "entities": ["DiDi", "ProfiLLM", "LLM"], "alternates": {"html": "https://wpnews.pro/news/profillm-utility-aligned-agentic-user-profiling-for-industrial-ride-hailing", "markdown": "https://wpnews.pro/news/profillm-utility-aligned-agentic-user-profiling-for-industrial-ride-hailing.md", "text": "https://wpnews.pro/news/profillm-utility-aligned-agentic-user-profiling-for-industrial-ride-hailing.txt", "jsonld": "https://wpnews.pro/news/profillm-utility-aligned-agentic-user-profiling-for-industrial-ride-hailing.jsonld"}}