Synthetic Sandbox for Training Machine Learning Engineering Agents Paper • 2604.04872 • Published Apr 6 • 14
OmniOPD: Logit-Free On-Policy Distillation via Speculative Verification Paper • 2606.01476 • Published May 31 • 9
SWE-Together: Evaluating Coding Agents in Interactive User Sessions Paper • 2606.29957 • Published 3 days ago • 12
Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries Article • aminediroHF, qgallouedec, kashif, lewtun, edbeeching, albertvillanova, nouamanetazi, lvwerra, sergiopaniego • Mar 10 • 166
A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image Analysis Paper • 2405.14839 • Published May 23, 2024