X-Stream: Exploring MLLMs as Multiplexers for Multi-Stream Understanding Paper • 2606.02482 • Published 13 days ago • 35
OmniInteract: Benchmarking Real-World Streaming Interaction for Real-Time Omnimodal Assistants Paper • 2605.26485 • Published 19 days ago • 3
UI-KOBE: Knowledge-Oriented Behavior Exploration for Lightweight Graph-Guided GUI Agents Paper • 2605.29534 • Published 17 days ago • 15
AURA: Always-On Understanding and Real-Time Assistance via Video Streams Paper • 2604.04184 • Published Apr 5 • 51
CoVe Collection • Learning Multi-turn Tool Use via Constraint-Guided Verification • 2 items • Updated Feb 26 • 1
VGGT-Det: Mining VGGT Internal Priors for Sensor-Geometry-Free Multi-View Indoor 3D Object Detection Paper • 2603.00912 • Published Mar 1 • 40
CoVe: Training Interactive Tool-Use Agents via Constraint-Guided Verification Paper • 2603.01940 • Published Mar 2 • 24
3DGS-DET: Empower 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for 3D Object Detection Paper • 2410.01647 • Published Oct 2, 2024 • 31