arxiv:2605.14678
Haoran Zhang
zzzhr97
AI & ML interests
Large Language Models, Large Reasoning Models
Recent Activity
upvoted a paper about 14 hours ago
π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows
submitted a paper about 14 hours ago
π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows
authored a paper about 20 hours ago
π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows