Long-term memory
📖 3 min read · Updated 2026-04-19
Long-term memory is what the agent remembers across sessions. Without it, every conversation starts fresh and the agent feels like an amnesiac intern. With it, the agent knows your preferences, recalls past decisions, builds context over time. But long-term memory is also where agents get weird: they remember wrong things, forget important things, or leak info across users. Design it deliberately or don't ship it.
What belongs in long-term memory and what doesn't
Four storage models
Most production agents use two stores: a key-value store for stable facts (loaded on every session) and a vector store for rich history (queried on demand).
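A minimal sketch of that two-store layout, with in-memory dicts standing in for real databases (the `MemoryStores` class and its method names are illustrative, not a real library):

```python
# Two-store layout: a key-value profile (small, loaded every session)
# and a searchable history store (large, queried on demand).
# In-memory dicts stand in for real databases.

class MemoryStores:
    def __init__(self):
        self.profiles = {}   # user_id -> {field: value}, stable facts
        self.history = {}    # user_id -> [text, ...], rich session history

    def set_fact(self, user_id, key, value):
        self.profiles.setdefault(user_id, {})[key] = value

    def load_profile(self, user_id):
        # Called on session start: always relevant, cheap to load whole.
        return dict(self.profiles.get(user_id, {}))

    def append_history(self, user_id, text):
        self.history.setdefault(user_id, []).append(text)

    def search_history(self, user_id, query):
        # Stand-in for vector similarity search: naive substring match.
        return [t for t in self.history.get(user_id, [])
                if query.lower() in t.lower()]
```

The split keeps the always-loaded profile tiny while the history store can grow without bloating every prompt.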
What to write, when
Write policy matters more than read policy: the wrong writes are how long-term memory gets noisy and stale.
- Always write: user-explicit "remember this" requests.
- Sometimes write: strong preferences inferred from repeated behavior. (But tell the user: "I'll remember you prefer X. Is that right?")
- Rarely write: end-of-session summaries when the session was meaningful.
- Never write: PII that wasn't explicitly opted in, one-off chatter, anything the user said in passing.
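The tiers above can be sketched as a gate applied before anything reaches long-term memory. The keyword heuristics here are purely illustrative stand-ins for whatever classifier you actually use:

```python
# Write-policy gate mirroring the tiers above. The marker lists are
# illustrative heuristics, not production-grade PII/intent detection.

EXPLICIT_MARKERS = ("remember this", "always use", "from now on")
PII_MARKERS = ("ssn", "credit card", "passport")

def write_decision(text, repeated_times=0, pii_opt_in=False):
    lower = text.lower()
    if any(m in lower for m in PII_MARKERS) and not pii_opt_in:
        return "never"      # PII without explicit opt-in: never store
    if any(m in lower for m in EXPLICIT_MARKERS):
        return "always"     # user-explicit "remember this" request
    if repeated_times >= 3:
        return "confirm"    # inferred preference: confirm with the user first
    return "skip"           # one-off chatter: don't write
```

Note that the inferred-preference tier returns `"confirm"` rather than writing silently, matching the "Is that right?" check above.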
Retrieval: when the agent pulls memory in
Memory is only valuable if it's retrieved at the right moment. Three retrieval patterns:
- On session start: load a small user profile (name, prefs). Always relevant, cheap.
- On demand: expose a recall(topic) tool. The agent decides when it needs older context.
- Automatic: vector-search the current turn against memory and inject top-k matches. Works well for chatty assistants, overkill for task agents.
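A sketch of the on-demand pattern: a recall(topic) tool the agent can call when it decides older context is needed. The word-overlap scoring is a trivial stand-in for embedding similarity, and the sample memories are invented for illustration:

```python
# On-demand retrieval: the agent calls recall(topic) only when it
# needs older context. Word overlap stands in for vector similarity.

def recall(memories, topic, k=2):
    """Return up to k stored memories most related to `topic`."""
    topic_words = set(topic.lower().split())

    def score(text):
        return len(topic_words & set(text.lower().split()))

    ranked = sorted(memories, key=score, reverse=True)
    return [m for m in ranked[:k] if score(m) > 0]

memories = [
    "user prefers metric units",
    "user is planning a trip to Japan",
    "user dislikes long answers",
]
```

Exposing this as a tool (rather than injecting matches on every turn) keeps task agents cheap: retrieval happens only when the model asks for it.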
A worked example: remembering a preference
Session 1. User says: "Please always use metric units in your answers." Agent saves: {user_id: 42, pref_units: "metric"} into a key-value store.
Session 2, a week later. User: "How far is the moon?" Agent loads user profile on start, sees pref_units: "metric", answers "about 384,400 km" instead of miles. The agent "remembers" without ever being told again.
That's the payoff. A 20-byte row in a key-value store turns a stateless model into one that feels like it knows you.
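The whole worked example fits in a few lines of code. A plain dict stands in for the key-value store:

```python
# The worked example in code: session 1 stores the preference,
# session 2 loads it on start and adapts the answer.

kv_store = {}

# Session 1: user says "Please always use metric units."
kv_store[42] = {"pref_units": "metric"}

# Session 2, a week later: load the profile on session start.
profile = kv_store.get(42, {})
distance_km = 384_400
if profile.get("pref_units") == "metric":
    answer = f"about {distance_km:,} km"
else:
    answer = f"about {int(distance_km * 0.621371):,} miles"
```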
Staleness: the silent killer
A preference from 2024 might be wrong in 2026. A fact stored once becomes a lie the agent confidently repeats. Build in:
- TTL on volatile facts. "User's job title" probably has a 1-year expiration.
- Update semantics. When new info contradicts old, overwrite or supersede. Don't append both.
- Confidence decay. Facts stored 2 years ago should be hedged ("last you mentioned...").
- User-facing controls. Let users see, edit, and delete what the agent has stored about them. Legally often required, usability-wise always good.
Privacy and isolation
Long-term memory is where data leaks happen. Make sure:
- Every memory row is scoped to a user id.
- Retrieval queries always include the user id filter.
- Vector searches can't return another user's embeddings.
- PII in memory obeys your retention policy.
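The cross-user barrier comes down to one rule: every read path takes a user id and filters on it, so there is no query shape that can return another user's rows. A sketch with an illustrative row layout:

```python
# User-scoped retrieval: the user_id filter is a mandatory parameter
# of the read path, not an optional WHERE clause someone can forget.
# Row layout is illustrative.

rows = [
    {"user_id": 1, "text": "prefers metric units"},
    {"user_id": 2, "text": "allergic to peanuts"},
]

def fetch_memories(rows, user_id):
    # No code path exists that returns rows without this filter.
    return [r["text"] for r in rows if r["user_id"] == user_id]
```

The same principle applies to vector stores: scope the index (or the metadata filter) per user so a similarity search physically cannot surface another user's embeddings.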
Pitfalls
- Saving everything. Vector store gets noisy, retrieval returns garbage. Be picky about writes.
- Never deleting. Users ask you to forget; honor it.
- Append-only with no supersede. Agent recalls contradictory facts and gets confused.
- Retrieval on every turn. Expensive and often unhelpful. Prefer on-demand via a tool.
- No cross-user barrier. A single missing filter leaks user A's memories to user B.
What to do with this
- Start small: a key-value store for 5-10 clearly useful fields. Add more only when evidence supports it.
- Read episodic memory for remembering whole past interactions.
- Read procedural memory for remembering "how to do this task."