MultiRL

non-profit

Recent Activity

KimSHine updated a model 27 days ago

MultiRL/qwen3_4b_sudoku_multi_act_sft_final_new

KimSHine published a model 27 days ago

MultiRL/qwen3_4b_sudoku_multi_act_sft_final_new

KimSHine updated a model 28 days ago

MultiRL/qwen3_4b_sudoku_one_act_sft_final_new

MultiRL's models (171)

MultiRL/qwen3_4b_easy_rl_our_adv_final

4B • Updated Dec 22, 2025 • 1

MultiRL/qwen3_1.7b_easy_rl_final_group_norm

2B • Updated Dec 22, 2025 • 49

MultiRL/qwen3_1.7b_easy_rl_final_gamma_1

2B • Updated Dec 18, 2025 • 2

MultiRL/qwen3_4b_base_easy_rl_final

4B • Updated Dec 18, 2025

MultiRL/qwen3_4b_base_sft_final

4B • Updated Dec 17, 2025 • 3

MultiRL/qwen3_4b_easy_rl_new

4B • Updated Dec 16, 2025

MultiRL/qwen3_1.7b_easy_rl_gspo

2B • Updated Dec 16, 2025 • 1

MultiRL/qwen3_4b_sft_new

4B • Updated Dec 15, 2025

MultiRL/qwen3_1.7b_easy_rl_final_step120

2B • Updated Dec 15, 2025 • 2

MultiRL/qwen3_4b_medium_rl_final

4B • Updated Dec 15, 2025 • 1

MultiRL/qwen3_4b_sft_one_act

4B • Updated Dec 14, 2025

MultiRL/qwen3_1.7b_easy_rl_reinforce_ori

2B • Updated Dec 14, 2025 • 2

MultiRL/qwen3_1.7b_easy_rl_reinforce_alpha_0.5

2B • Updated Dec 14, 2025

MultiRL/qwen3_1.7b_easy_rl_reinforce_alpha_1

2B • Updated Dec 14, 2025 • 1

MultiRL/qwen3_1.7b_easy_rl_reinforce_alpha_0

2B • Updated Dec 14, 2025 • 1

MultiRL/qwen3_1.7b_sft_one_act

2B • Updated Dec 14, 2025 • 1

MultiRL/qwen3_1.7b_easy_rl_final

2B • Updated Dec 13, 2025 • 40

MultiRL/qwen3_4b_easy_rl_final

4B • Updated Dec 13, 2025

MultiRL/qwen3_1.7b_sft_final

2B • Updated Dec 11, 2025 • 39

MultiRL/qwen3_4b_sft_final

4B • Updated Dec 11, 2025 • 1

MultiRL/qwen3_1.7b_easy_rl_new

2B • Updated Dec 6, 2025 • 3

MultiRL/qwen3_4b_standard_medium_rl

4B • Updated Dec 6, 2025

MultiRL/qwen3_4b_standard_easy_rl

4B • Updated Dec 5, 2025

MultiRL/qwen3_4b_medium_rl_progress_C

4B • Updated Dec 5, 2025

MultiRL/qwen3_4b_medium_rl

4B • Updated Dec 4, 2025 • 3

MultiRL/qwen3_4b_instruct_sft

4B • Updated Dec 1, 2025

MultiRL/qwen3_1.7b_easy_rl_test_task_group

2B • Updated Dec 1, 2025 • 2

MultiRL/qwen3_1.7b_easy_rl_test

2B • Updated Nov 30, 2025

MultiRL/qwen3_1.7b_sudoku_sft

2B • Updated Nov 28, 2025

MultiRL/qwen3_1.7b_easy_reinforce_batch_32_by_pass

2B • Updated Nov 26, 2025