When Does Sparsity Mitigate the Curse of Depth in LLMs Paper • 2603.15389 • Published 1 day ago • 3
When Does Sparsity Mitigate the Curse of Depth in LLMs Paper • 2603.15389 • Published 1 day ago • 3
When Does Sparsity Mitigate the Curse of Depth in LLMs Paper • 2603.15389 • Published 1 day ago • 3
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem Paper • 2512.24873 • Published Dec 31, 2025 • 107
Running on CPU Upgrade Featured 3.05k The Smol Training Playbook 📚 3.05k The secrets to building world-class LLMs
Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning Paper • 2508.08221 • Published Aug 11, 2025 • 50
Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization Paper • 2510.13554 • Published Oct 15, 2025 • 58
Diffusion Language Models Know the Answer Before Decoding Paper • 2508.19982 • Published Aug 27, 2025 • 27
Diffusion Language Models Know the Answer Before Decoding Paper • 2508.19982 • Published Aug 27, 2025 • 27