arxiv:2601.21590
xiaotong
xtongji
AI & ML interests
None yet
Recent Activity
upvoted a paper 20 days ago
Multi-Task GRPO: Reliable LLM Reasoning Across Tasks
authored a paper 27 days ago
Bourbaki: Self-Generated and Goal-Conditioned MDPs for Theorem Proving
Organizations
None yet