Q-ARVD: Quantizing Autoregressive Video Diffusion Models Paper • 2605.21072 • Published 10 days ago • 21
Mix-Quant: Quantized Prefilling, Precise Decoding for Agentic LLMs Paper • 2605.20315 • Published 11 days ago • 28
ReactiveGWM: Steering NPC in Reactive Game World Models Paper • 2605.15256 • Published 16 days ago • 28
Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling Paper • 2604.28185 • Published 30 days ago • 90
Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms Paper • 2604.23775 • Published Apr 26 • 45
FreeSwim: Revisiting Sliding-Window Attention Mechanisms for Training-Free Ultra-High-Resolution Video Generation Paper • 2511.14712 • Published Nov 18, 2025 • 2
Gated Condition Injection without Multimodal Attention: Towards Controllable Linear-Attention Transformers Paper • 2603.27666 • Published Mar 29 • 18
MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification Paper • 2603.15726 • Published Mar 16 • 187
Anatomy of a Lie: A Multi-Stage Diagnostic Framework for Tracing Hallucinations in Vision-Language Models Paper • 2603.15557 • Published Mar 16 • 29
ViFeEdit: A Video-Free Tuner of Your Video Diffusion Transformer Paper • 2603.15478 • Published Mar 16 • 24
SliderEdit: Continuous Image Editing with Fine-Grained Instruction Control Paper • 2511.09715 • Published Nov 12, 2025 • 11