GEMS: Agent-Native Multimodal Generation with Memory and Skills Paper • 2603.28088 • Published 6 days ago • 81 • 4
VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward Paper • 2603.26599 • Published 8 days ago • 57 • 3
DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models Paper • 2603.26164 • Published 9 days ago • 152 • 4
The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook Paper • 2604.02029 • Published 3 days ago • 115 • 4
Project Imaging-X: A Survey of 1000+ Open-Access Medical Imaging Datasets for Foundation Model Development Paper • 2603.27460 • Published 7 days ago • 63 • 4
ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers Paper • 2603.24414 • Published 10 days ago • 175 • 4
MiroEval: Benchmarking Multimodal Deep Research Agents in Process and Outcome Paper • 2603.28407 • Published 6 days ago • 62 • 5
MiroEval: Benchmarking Multimodal Deep Research Agents in Process and Outcome Paper • 2603.28407 • Published 6 days ago • 62 • 5
SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization Paper • 2604.02268 • Published 3 days ago • 81 • 4
Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning Paper • 2603.04597 • Published Mar 4 • 210 • 4
Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding Paper • 2603.19235 • Published 16 days ago • 94 • 5
SAMA: Factorized Semantic Anchoring and Motion Alignment for Instruction-Guided Video Editing Paper • 2603.19228 • Published 16 days ago • 67 • 4
HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning Paper • 2603.17024 • Published 18 days ago • 107 • 5
Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models Paper • 2603.17051 • Published 18 days ago • 106 • 7
Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models Paper • 2603.22212 • Published 12 days ago • 124 • 10
PixelSmile: Toward Fine-Grained Facial Expression Editing Paper • 2603.25728 • Published 9 days ago • 116 • 4
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published 16 days ago • 313 • 7