04:00
2026-06-03
arxiv.org
computer-vision
AVTrack: Audio-Visual Tracking in Human-centric Complex Scenes
Researchers introduced AVTrack, a new human-centric audio-visual instance segmentation dataset designed for dynamic real-world scenarios. The dataset features challenging conditions such as camera mot…