{"slug": "end-to-end-model-that-listens-sees-thinks-and-responds-on-video-in-real-time", "title": "End-to-end model that listens, sees, thinks and responds on video in real time", "summary": "Alibaba unveiled Wan Streamer, an AI agent capable of real-time video interaction that can see, hear, and respond to users, marking a significant advancement beyond voice-only AI systems.", "body_md": "Min Choi\n@minchoi\nWe are cooked.\n\nChina's Alibaba just revealed Wan Streamer.\n\nAI agents can now see you, hear you, and talk back on video in real time.\n\nThis is not voice mode anymore 🤯\n3:25 AM · Jun 26, 2026\n371K\nViews", "url": "https://wpnews.pro/news/end-to-end-model-that-listens-sees-thinks-and-responds-on-video-in-real-time", "canonical_source": "https://twitter.com/minchoi/status/2070347790115565792", "published_at": "2026-06-27 09:27:59+00:00", "updated_at": "2026-06-27 10:05:31.313454+00:00", "lang": "en", "topics": ["artificial-intelligence", "ai-agents", "computer-vision", "natural-language-processing"], "entities": ["Alibaba", "Wan Streamer", "Min Choi"], "alternates": {"html": "https://wpnews.pro/news/end-to-end-model-that-listens-sees-thinks-and-responds-on-video-in-real-time", "markdown": "https://wpnews.pro/news/end-to-end-model-that-listens-sees-thinks-and-responds-on-video-in-real-time.md", "text": "https://wpnews.pro/news/end-to-end-model-that-listens-sees-thinks-and-responds-on-video-in-real-time.txt", "jsonld": "https://wpnews.pro/news/end-to-end-model-that-listens-sees-thinks-and-responds-on-video-in-real-time.jsonld"}}