SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28, 2025 • 125
Systran/faster-whisper-large-v3 Automatic Speech Recognition • Updated Nov 23, 2023 • 702k • 535
deepseek-ai/DeepSeek-R1-0528 Text Generation • 685B • Updated May 29, 2025 • 1.05M • 2.41k