WebVR: Benchmarking Multimodal LLMs for WebPage Recreation from Videos via Human-Aligned Visual Rubrics Paper • 2603.13391 • Published 11 days ago • 19
PhysGame: Uncovering Physical Commonsense Violations in Gameplay Videos Paper • 2412.01800 • Published Dec 2, 2024 • 6
OmniBench: Towards The Future of Universal Omni-Language Models Paper • 2409.15272 • Published Sep 23, 2024 • 30