We Can't Understand AI Using our Existing Vocabulary Paper • 2502.07586 • Published Feb 11, 2025 • 12
Why Do Reasoning Models Lose Coverage? The Role of Data and Forks in the Road Paper • 2605.17026 • Published 26 days ago • 4
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences Paper • 2404.03715 • Published Apr 4, 2024 • 62
Weak-to-Strong Jailbreaking on Large Language Models Paper • 2401.17256 • Published Jan 30, 2024 • 16
WARM: On the Benefits of Weight Averaged Reward Models Paper • 2401.12187 • Published Jan 22, 2024 • 19