lsteno/Qwen3-4B-Instruct-2507-RLM-RLVR-depth2-recursive-r64-a128-lr1e-5-adapter Reinforcement Learning • Updated 3 days ago • 14
lsteno/Qwen3-4B-Instruct-2507-RLM-RLVR-FullFT-lr5e-6-depth1-v1 Text Generation • 4B • Updated 25 days ago • 118