Canvas-to-Image: Compositional Image Generation with Multimodal Controls Paper • 2511.21691 • Published Nov 26, 2025 • 36
Agentic Learner with Grow-and-Refine Multimodal Semantic Memory Paper • 2511.21678 • Published Nov 26, 2025 • 12
ENACT: Evaluating Embodied Cognition with World Modeling of Egocentric Interaction Paper • 2511.20937 • Published Nov 26, 2025 • 16
Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation Paper • 2512.10949 • Published Dec 11, 2025 • 47
Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving Paper • 2512.10739 • Published Dec 11, 2025 • 47
OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification Paper • 2512.10756 • Published Dec 11, 2025 • 35
MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens Paper • 2603.23516 • Published Mar 6 • 48
AVControl: Efficient Framework for Training Audio-Visual Controls Paper • 2603.24793 • Published 25 days ago • 26
Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models Paper • 2603.17051 • Published Mar 17 • 109
RealMaster: Lifting Rendered Scenes into Photorealistic Video Paper • 2603.23462 • Published 26 days ago • 33
Gen-Searcher: Reinforcing Agentic Search for Image Generation Paper • 2603.28767 • Published 20 days ago • 57
FileGram: Grounding Agent Personalization in File-System Behavioral Traces Paper • 2604.04901 • Published 14 days ago • 40
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published 18 days ago • 481