Beyond Individual Intelligence: Surveying Collaboration, Failure Attribution, and Self-Evolution in LLM-based Multi-Agent Systems Paper • 2605.14892 • Published 16 days ago • 48
Multi-Objective and Mixed-Reward Reinforcement Learning via Reward-Decorrelated Policy Optimization Paper • 2605.13641 • Published 17 days ago • 49
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published 18 days ago • 193
AI CFD Scientist: Toward Open-Ended Computational Fluid Dynamics Discovery with Physics-Aware AI Agents Paper • 2605.06607 • Published 18 days ago • 2
TextLDM: Language Modeling with Continuous Latent Diffusion Paper • 2605.07748 • Published 22 days ago • 26
Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning Paper • 2605.06130 • Published 23 days ago • 111
SEIF: Self-Evolving Reinforcement Learning for Instruction Following Paper • 2605.07465 • Published 22 days ago • 29
Mean Mode Screaming: Mean–Variance Split Residuals for 1000-Layer Diffusion Transformers Paper • 2605.06169 • Published 23 days ago • 231
SkillOS: Learning Skill Curation for Self-Evolving Agents Paper • 2605.06614 • Published 23 days ago • 46
Step-level Optimization for Efficient Computer-use Agents Paper • 2604.27151 • Published about 1 month ago • 18
From Context to Skills: Can Language Models Learn from Context Skillfully? Paper • 2604.27660 • Published 27 days ago • 165
jackf857/qwen3-8b-base-beta-dpo-hh-helpful-4xh200-batch-64-20260424-013732 Text Generation • 8B • Updated Apr 24 • 178 • 1