Papers
arxiv:2604.11811

M^star: Every Task Deserves Its Own Memory Harness

Published on Apr 10
Authors:
,
,
,
,
,

Abstract

M$^\star$ automatically discovers task-optimized memory systems for large language model agents through program evolution, demonstrating superior performance across conversation, embodied planning, and expert reasoning tasks.

Large language model agents rely on specialized memory systems to accumulate and reuse knowledge during extended interactions. Recent architectures typically adopt a fixed memory design tailored to specific domains, such as semantic retrieval for conversations or skills reused for coding. However, a memory system optimized for one purpose frequently fails to transfer to others. To address this limitation, we introduce M^star, a method that automatically discovers task-optimized memory harnesses through executable program evolution. Specifically, M^star models an agent memory system as a memory program written in Python. This program encapsulates the data Schema, the storage Logic, and the agent workflow Instructions. We optimize these components jointly using a reflective code evolution method; this approach employs a population-based search strategy and analyzes evaluation failures to iteratively refine the candidate programs. We evaluate M^star on four distinct benchmarks spanning conversation, embodied planning, and expert reasoning. Our results demonstrate that M^star improves performance over existing fixed-memory baselines robustly across all evaluated tasks. Furthermore, the evolved memory programs exhibit structurally distinct processing mechanisms for each domain. This finding indicates that specializing the memory mechanism for a given task explores a broad design space and provides a superior solution compared to general-purpose memory paradigms.

Community

Sign up or log in to comment

Get this paper in your agent:

hf papers read 2604.11811
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2604.11811 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2604.11811 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2604.11811 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.