InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing Paper • 2603.09877 • Published 2 days ago • 31
Stepping VLMs onto the Court: Benchmarking Spatial Intelligence in Sports Paper • 2603.09896 • Published 2 days ago • 22
Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence Paper • 2603.07660 • Published 4 days ago • 74
Visual Para-Thinker: Divide-and-Conquer Reasoning for Visual Comprehension Paper • 2602.13310 • Published about 1 month ago • 8
Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models Paper • 2602.07026 • Published Feb 2 • 138