Comparison guide

AI memory tools, side by side

A direct comparison of the major AI memory systems for LLM agents: SmartMemory, Mem0, Zep, Letta (formerly MemGPT), claude-mem, and plain vector stores. We name what each tool is good at, where its design diverges, and when to pick it.

Last updated: 2026-04-30

At a glance

| Capability | SmartMemory | Mem0 | Zep | Letta / MemGPT | claude-mem | Vector store |
| --- | --- | --- | --- | --- | --- | --- |
| Typed memory layers (5 cognitive types) | 5 types | — | — | — | — | — |
| Knowledge graph (default, no paywall) | ✓ | $249/mo Pro | ✓ | — | — | — |
| Multi-hop graph retrieval | ✓ | — | — | — | — | — |
| Bitemporal (event + system time) | ✓ | — | — | — | — | — |
| Observable 11-stage extraction pipeline | 11 stages | — | — | — | — | — |
| Native MCP server | ✓ | — | Graphiti MCP | — | — | — |
| Last public benchmark | In progress | 49.0% LongMemEval | 71.2% LongMemEval | 74.0% LoCoMo | None published | N/A |
| GitHub stars (snapshot) | — | ~54K | ~25K | — | ~70K | N/A |

Benchmark numbers are cited from third-party sources (LongMemEval, arXiv:2504.19413, and vendor publications). We do not republish raw scores. See the docs for our own benchmark plan (BENCH-AGENT-1).
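The bitemporal row above refers to tracking two independent timelines per fact: when something became true in the world (event time) and when the system recorded it (system time). A minimal sketch in plain Python; the class and field names are our illustration, not any vendor's API:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class BitemporalFact:
    """One fact with two independent timelines (illustrative)."""
    statement: str
    event_time: datetime               # when it became true in the world
    system_time: datetime = field(     # when the store learned about it
        default_factory=lambda: datetime.now(timezone.utc)
    )

# The event happened in January, but the agent only ingested it in March.
fact = BitemporalFact(
    statement="Bob became CTO of Acme",
    event_time=datetime(2026, 1, 15, tzinfo=timezone.utc),
    system_time=datetime(2026, 3, 2, tzinfo=timezone.utc),
)

# "As-of" queries can now answer two different questions:
# what was true on Feb 1, and what did we *know* on Feb 1.
as_of = datetime(2026, 2, 1, tzinfo=timezone.utc)
was_true = fact.event_time <= as_of    # True: Bob was already CTO
was_known = fact.system_time <= as_of  # False: the store hadn't seen it yet
print(was_true, was_known)
```

A single-timeline store conflates these two questions, which is why late-arriving facts silently rewrite history.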

Tool by tool

Mem0

What it's good at

Universal memory layer with one of the largest communities in the category (~54K GitHub stars) and an exclusive AWS Strands partnership. The obvious "drop in to a chatbot" choice.

Where it diverges

Vector search by default; the knowledge graph feature is gated behind the $249/mo Pro tier. An independent benchmark puts it at ~49.0% on LongMemEval (arXiv:2504.19413), the lowest among the tools we track that publish numbers there.

When to pick it

Pick Mem0 when you want a simple memory bolt-on for a chatbot, ecosystem reach matters more than typed structure, and graph features behind a paywall are acceptable.

Zep / Graphiti

What it's good at

Long-term agent memory built on a temporal knowledge graph (Graphiti). Strong public benchmarks: ~71.2% on LongMemEval and ~94.8% on DMR (deep memory retrieval). Recently shipped a Graphiti MCP server.

Where it diverges

Zep's pipeline is more opaque from the outside: there is no exposed per-stage trajectory (timing, tokens, cost) comparable to what SmartMemory's 11-stage pipeline surfaces.
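For concreteness, per-stage observability means each pipeline stage emits a trace record you can inspect and aggregate. A hypothetical shape of such a trace; the field and stage names below are our illustration, not SmartMemory's or Zep's actual schema:

```python
from dataclasses import dataclass

@dataclass
class StageTrace:
    """Trace record for one stage of an extraction pipeline (illustrative)."""
    stage: str
    duration_ms: float
    tokens_in: int
    tokens_out: int
    cost_usd: float

# With per-stage traces you can total cost and find the slowest stage;
# with an opaque pipeline you only ever see the final output.
trajectory = [
    StageTrace("chunking", 12.0, 1800, 1800, 0.0),
    StageTrace("entity_extraction", 640.0, 1800, 220, 0.0031),
    StageTrace("relation_linking", 410.0, 950, 140, 0.0018),
]
total_cost = sum(t.cost_usd for t in trajectory)
slowest = max(trajectory, key=lambda t: t.duration_ms).stage
print(f"{total_cost:.4f} {slowest}")  # 0.0049 entity_extraction
```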

When to pick it

Pick Zep when you want a mature graph-backed memory layer with strong published benchmarks, and per-stage pipeline observability isn't a hard requirement.

Letta (formerly MemGPT)

What it's good at

Stateful agent memory with an OS-inspired virtual-context approach. Backed by a NeurIPS paper and the Letta Leaderboard. Reports ~74.0% LoCoMo with GPT-4o mini.
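The virtual-context idea works like OS memory paging: when the in-context buffer exceeds its token budget, older messages are evicted to an archival store and can be paged back in by search. A toy sketch of the mechanism (tiny budget, whitespace "tokenizer", substring search all stand-ins, not Letta's implementation):

```python
TOKEN_BUDGET = 8  # tiny budget so eviction triggers in this demo

context: list[str] = []   # what the LLM actually sees
archival: list[str] = []  # out-of-context storage, searchable

def tokens(msg: str) -> int:
    return len(msg.split())  # stand-in for a real tokenizer

def add_message(msg: str) -> None:
    context.append(msg)
    # Evict oldest messages until the visible context fits the budget.
    while sum(tokens(m) for m in context) > TOKEN_BUDGET:
        archival.append(context.pop(0))

def page_in(query: str) -> list[str]:
    """Search archival storage; a real system would use semantic search."""
    return [m for m in archival if query.lower() in m.lower()]

add_message("Bob is CTO of Acme")
add_message("User prefers dark mode")
add_message("Meeting moved to Friday")

print(context)         # only the recent messages that fit the budget
print(page_in("acme")) # the evicted fact, recovered from archival storage
```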

Where it diverges

Letta ships memory inside an agent framework. If you have already built on LangChain, LlamaIndex, or the Anthropic Agent SDK, adopting Letta means adopting their agent runtime alongside their memory.

When to pick it

Pick Letta when you want an opinionated full-agent framework with memory included. Pick SmartMemory when you want memory primitives that compose with whatever stack you already have.

claude-mem

What it's good at

Zero-config Claude Code session capture via PostToolUse hooks. ~70K GitHub stars, currently the most-starred project in the AI-memory tracker. Backed by SQLite + ChromaDB.

Where it diverges

Stores compressed text blobs of sessions — no typed entity extraction, no relations, no knowledge graph, no published benchmarks. Tightly coupled to Claude Code; not portable to Cursor, MCP clients, or your own agent code.

When to pick it

Pick claude-mem when Claude Code is the only surface you care about. Pick SmartMemory when memory needs to survive across multiple agent runtimes and tools.

Vector store (Pinecone, pgvector, Chroma)

What it's good at

The right primitive for semantic similarity over flat chunks. Mature, well-understood, abundant managed and self-hosted options. If your task is genuinely "find similar text," you do not need a memory layer.
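"Find similar text" reduces to nearest-neighbor search over embedding vectors. A dependency-free sketch with toy 3-dimensional embeddings; a real store would use a model's embeddings and an approximate-nearest-neighbor index:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy corpus of (text, embedding). Real embeddings have hundreds of dims.
corpus = [
    ("Acme raised a Series B", [0.9, 0.1, 0.0]),
    ("How to cook risotto",    [0.0, 0.2, 0.9]),
    ("Acme hired a new CTO",   [0.8, 0.3, 0.1]),
]

query = [0.9, 0.15, 0.0]  # pretend embedding of "news about Acme funding"
ranked = sorted(corpus, key=lambda item: cosine(query, item[1]), reverse=True)
print(ranked[0][0])  # Acme raised a Series B
```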

Where it diverges

A vector store cannot tell you that Bob is the CTO of Acme, that a fact supersedes a prior one, or that two memories are about the same entity. Anything beyond similarity is your code on top of the embedding.
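Here is what "your code on top of the embedding" looks like in practice: even the simplest typed fact with supersession requires a structure the vector store does not provide. An illustrative sketch, not any vendor's API:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Relation:
    """A typed (subject, predicate, object) fact with supersession."""
    subject: str
    predicate: str
    obj: str
    superseded_by: Optional["Relation"] = None

store: list[Relation] = []

def assert_fact(subject: str, predicate: str, obj: str) -> Relation:
    """Record a fact; mark any prior fact with the same key as superseded."""
    fact = Relation(subject, predicate, obj)
    for prior in store:
        if (prior.subject, prior.predicate) == (subject, predicate) \
                and prior.superseded_by is None:
            prior.superseded_by = fact
    store.append(fact)
    return fact

assert_fact("Acme", "has_cto", "Alice")
assert_fact("Acme", "has_cto", "Bob")  # supersedes the Alice fact

current = [f for f in store if f.superseded_by is None]
print(current[0].obj)  # Bob
```

The superseded fact stays queryable for history, but only the current one answers "who is the CTO now", which is exactly the distinction a flat similarity index cannot make.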

When to pick it

Pick a plain vector store for retrieval-only RAG over a static corpus. Pick SmartMemory when agents need to accumulate, reason over, and update structured knowledge.

Quick decision flow

  1. Do you only use Claude Code and want zero-config session memory? → claude-mem.

  2. Do you only need to personalize a chatbot (remember names, preferences) and want the simplest possible bolt-on? → Mem0.

  3. Are you starting from scratch and want an opinionated agent framework with memory built in? → Letta.

  4. Do you need a typed memory system that drops into LangChain / LlamaIndex / Anthropic SDK / MCP, with bitemporal accuracy, an observable extraction pipeline, and multi-tenant isolation? → SmartMemory.

  5. Is your task literally just "find similar text in a fixed corpus"? → A plain vector store. No memory layer needed.


Ready to try SmartMemory?

Free tier — 1,000 memories, full knowledge graph, MCP server. No credit card.