{"slug": "a-three-phase-foundation-model-for-tax-aware-personalized-portfolio-management", "title": "A Three-Phase Foundation Model for Tax-Aware Personalized Portfolio Management", "summary": "Researchers introduced a three-phase deep reinforcement learning system for personalized portfolio management that overcomes ticker lock-in, monolithic objectives, and static user models. Phase 1 uses a self-supervised cross-asset encoder with a Chronos time series foundation model; Phase 2 employs a Mixture-of-Experts actor-critic for six investment objectives; Phase 3 adds a LoRA personalization layer fine-tuned on individual brokerage data. The system enables tax-aware, goal-adaptive portfolio management without retraining for new assets.", "body_md": "arXiv:2606.30997v1 Announce Type: new\nAbstract: We present a three-phase deep reinforcement learning system for personalized portfolio management that addresses three limitations shared by all prior financial RL work: 1) ticker lock-in, 2) monolithic objectives, and 3) static user models. Phase 1 pretrains a ticker-identity-free cross-asset encoder via self-supervised learning on a multi-asset corpus, augmented by a frozen parallel branch using Chronos, a T5-based time series foundation model, fused via a learned gating mechanism. To our knowledge, this is the first application of a time series foundation model to portfolio management RL. The encoder generalizes to any publicly traded asset via a 50-dimensional observable metadata vector that requires no retraining for new tickers. Phase 2 fine-tunes a MoE (Mixture of Experts) portfolio actor-critic with PPO under an objective-conditioned reward that simultaneously serves six distinct investment goals sampled per episode: short-term alpha, short-term gain, long-term gain, capital preservation, tax-loss harvesting, and long-term-gains-only. 
A MoE architecture assigns each objective to a specialized expert head (momentum, growth, defensive, tax-aware), and a learned intent router blends experts based on the active objective and current market regime, which eliminates cross-objective gradient conflict. Phase 3 adds a lightweight personalization layer further adapted at inference time to each individual via a 76-parameter LoRA module fine-tuned on real brokerage transaction history, inferring investment objectives from revealed trading behavior rather than questionnaires. A natural language intent parser converts free-form goals directly into structured investment objective parameters.", "url": "https://wpnews.pro/news/a-three-phase-foundation-model-for-tax-aware-personalized-portfolio-management", "canonical_source": "https://arxiv.org/abs/2606.30997", "published_at": "2026-07-01 04:00:00+00:00", "updated_at": "2026-07-01 04:25:17.576276+00:00", "lang": "en", "topics": ["machine-learning", "large-language-models", "ai-research", "ai-agents"], "entities": ["Chronos", "PPO", "MoE", "LoRA"], "alternates": {"html": "https://wpnews.pro/news/a-three-phase-foundation-model-for-tax-aware-personalized-portfolio-management", "markdown": "https://wpnews.pro/news/a-three-phase-foundation-model-for-tax-aware-personalized-portfolio-management.md", "text": "https://wpnews.pro/news/a-three-phase-foundation-model-for-tax-aware-personalized-portfolio-management.txt", "jsonld": "https://wpnews.pro/news/a-three-phase-foundation-model-for-tax-aware-personalized-portfolio-management.jsonld"}}