T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning Paper • 2603.03790 • Published 17 days ago • 120
InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery Paper • 2602.08990 • Published Feb 9 • 75
QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining Paper • 2602.07085 • Published Feb 6 • 189
Closing the Loop: Universal Repository Representation with RPG-Encoder Paper • 2602.02084 • Published Feb 2 • 85
Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives Paper • 2601.20833 • Published Jan 28 • 183
Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning Paper • 2601.06943 • Published Jan 11 • 214
DramaBench: A Six-Dimensional Evaluation Framework for Drama Script Continuation Paper • 2512.19012 • Published Dec 22, 2025 • 17
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length Paper • 2512.04677 • Published Dec 4, 2025 • 175
PICABench: How Far Are We from Physically Realistic Image Editing? Paper • 2510.17681 • Published Oct 20, 2025 • 65
LongCodeZip: Compress Long Context for Code Language Models Paper • 2510.00446 • Published Oct 1, 2025 • 108
StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs Paper • 2509.22220 • Published Sep 26, 2025 • 66
RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation Paper • 2509.16198 • Published Sep 19, 2025 • 129
MachineLearningLM: Continued Pretraining Language Models on Millions of Synthetic Tabular Prediction Tasks Scales In-Context ML Paper • 2509.06806 • Published Sep 8, 2025 • 63
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR Paper • 2508.14029 • Published Aug 19, 2025 • 119
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published Aug 25, 2025 • 216
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published Feb 13, 2025 • 192
MetaChain: A Fully-Automated and Zero-Code Framework for LLM Agents Paper • 2502.05957 • Published Feb 9, 2025 • 15