Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled Text Generation • 28B • Updated 6 days ago • 53.2k • 551
INTELLECT-2: A Reasoning Model Trained Through Globally Decentralized Reinforcement Learning Paper • 2505.07291 • Published May 12, 2025 • 15
MemoryRewardBench: Benchmarking Reward Models for Long-Term Memory Management in Large Language Models Paper • 2601.11969 • Published Jan 17, 2026 • 27
Toward Efficient Agents: Memory, Tool learning, and Planning Paper • 2601.14192 • Published Jan 20, 2026 • 56
Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression Paper • 2506.09482 • Published Jun 11, 2025 • 45