Reasoning_eval

university

https://chtholly17.github.io/

AI & ML interests

None defined yet.

Recent Activity

Chtholly17 authored a paper 17 days ago

Harness Updating Is Not Harness Benefit: Disentangling Evolution Capabilities in Self-Evolving LLM Agents

dwenlong submitted a paper 24 days ago

Directional Alignment Mitigates Reward Hacking in Reinforcement Learning for Language Models

Chtholly17 submitted a paper 25 days ago

From Seeing to Thinking: Decoupling Perception and Reasoning Improves Post-Training of Vision-Language Models

Models: 18

ReasoningEval/huatuo_sft_m23k_grpo_qwen3-14b

15B • Updated Nov 3, 2025 • 4

ReasoningEval/huatuo_sft_m23k_grpo_qwen3-8b

8B • Updated Nov 3, 2025 • 9

ReasoningEval/huatuo_sft_m23k_grpo_llama31-8b

8B • Updated Nov 3, 2025 • 1

ReasoningEval/openr1_sft_PRIME_grpo_qwen3-14b

15B • Updated Nov 3, 2025 • 1

ReasoningEval/openr1_sft_PRIME_grpo_qwen3-8b

8B • Updated Nov 3, 2025 • 5

ReasoningEval/openr1_sft_PRIME_grpo_llama31-8b

8B • Updated Nov 3, 2025 • 5

ReasoningEval/openr1_sft_qwen3-8b

8B • Updated Oct 29, 2025 • 2

ReasoningEval/openr1_sft_qwen3-14b

425k • Updated Oct 28, 2025 • 2

ReasoningEval/openr1_sft_llama31-8b

8B • Updated Oct 28, 2025 • 3

ReasoningEval/huatuo_sft_qwen3-8b

8B • Updated Oct 28, 2025 • 3

Datasets: 0

None public yet