Embarrassingly Simple Self-Distillation Improves Code Generation • Paper • 2604.01193 • Published 9 days ago • 34 • 6
GBQA: A Game Benchmark for Evaluating LLMs as Quality Assurance Engineers • Paper • 2604.02648 • Published 8 days ago • 41 • 3
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver • Paper • 2604.08377 • Published 2 days ago • 141 • 5
HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents • Paper • 2604.07430 • Published 3 days ago • 126 • 3
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding • Paper • 2604.05015 • Published 5 days ago • 222 • 8
Adam's Law: Textual Frequency Law on Large Language Models • Paper • 2604.02176 • Published 9 days ago • 164 • 6
Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents • Paper • 2604.06132 • Published 4 days ago • 107 • 5
MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale • Paper • 2604.04771 • Published 5 days ago • 110 • 4
OpenWorldLib: A Unified Codebase and Definition of Advanced World Models • Paper • 2604.04707 • Published 5 days ago • 195 • 12
InCoder-32B-Thinking: Industrial Code World Model for Thinking • Paper • 2604.03144 • Published 8 days ago • 222 • 3
LIBERO-Para: A Diagnostic Benchmark and Metrics for Paraphrase Robustness in VLA Models • Paper • 2603.28301 • Published 12 days ago • 77 • 5
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression • Paper • 2604.04921 • Published 5 days ago • 95 • 4
CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery • Paper • 2604.01658 • Published 9 days ago • 52 • 3
Brainstacks: Cross-Domain Cognitive Capabilities via Frozen MoE-LoRA Stacks for Continual LLM Learning • Paper • 2604.01152 • Published 9 days ago • 5 • 4
NearID: Identity Representation Learning via Near-identity Distractors • Paper • 2604.01973 • Published 9 days ago • 29 • 3