LARK-Lab/SWITCH-Phase3-GRPO-LoRA-Qwen3-8B
Text Generation
Large Language Models
Attention Amnesia in Hybrid LLMs: When CoT Fine-Tuning Breaks Long-Range Recall, and How to Fix It
EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL