Recent Activity
MultiRL/qwen3_4b_easy_rl_our_adv_final
4B • Updated
• 1
MultiRL/qwen3_1.7b_easy_rl_final_group_norm
2B • Updated
• 49
MultiRL/qwen3_1.7b_easy_rl_final_gamma_1
2B • Updated
• 2
MultiRL/qwen3_4b_base_easy_rl_final
4B • Updated
MultiRL/qwen3_4b_base_sft_final
4B • Updated
• 3
MultiRL/qwen3_4b_easy_rl_new
4B • Updated
MultiRL/qwen3_1.7b_easy_rl_gspo
2B • Updated
• 1
4B • Updated
MultiRL/qwen3_1.7b_easy_rl_final_step120
2B • Updated
• 2
MultiRL/qwen3_4b_medium_rl_final
4B • Updated
• 1
MultiRL/qwen3_4b_sft_one_act
4B • Updated
MultiRL/qwen3_1.7b_easy_rl_reinforce_ori
2B • Updated
• 2
MultiRL/qwen3_1.7b_easy_rl_reinforce_alpha_0.5
2B • Updated
MultiRL/qwen3_1.7b_easy_rl_reinforce_alpha_1
2B • Updated
• 1
MultiRL/qwen3_1.7b_easy_rl_reinforce_alpha_0
2B • Updated
• 1
MultiRL/qwen3_1.7b_sft_one_act
2B • Updated
• 1
MultiRL/qwen3_1.7b_easy_rl_final
2B • Updated
• 40
MultiRL/qwen3_4b_easy_rl_final
4B • Updated
MultiRL/qwen3_1.7b_sft_final
2B • Updated
• 39
MultiRL/qwen3_4b_sft_final
4B • Updated
• 1
MultiRL/qwen3_1.7b_easy_rl_new
2B • Updated
• 3
MultiRL/qwen3_4b_standard_medium_rl
MultiRL/qwen3_4b_standard_easy_rl
MultiRL/qwen3_4b_medium_rl_progress_C
MultiRL/qwen3_4b_medium_rl
4B • Updated
• 3
MultiRL/qwen3_4b_instruct_sft
MultiRL/qwen3_1.7b_easy_rl_test_task_group
2B • Updated
• 2
MultiRL/qwen3_1.7b_easy_rl_test
2B • Updated
MultiRL/qwen3_1.7b_sudoku_sft
2B • Updated
MultiRL/qwen3_1.7b_easy_reinforce_batch_32_by_pass
2B • Updated