Mnemosyne vs Zep

An honest, technical comparison for users who need memory systems for AI agents and are evaluating Zep's Graphiti temporal knowledge graph against Mnemosyne's local-first approach.

Last updated: 2026-05-13 · Mnemosyne v2.8.0

TL;DR: These are fundamentally different deployment philosophies. Zep is a cloud-native context engineering platform with best-in-class temporal reasoning and a sophisticated knowledge graph engine. Mnemosyne is a local-first memory layer optimized for simplicity, zero-network deployments, and full user control. Zep excels at temporal fact tracking; Mnemosyne excels at being a lightweight, dependency-free memory layer you fully own.

Architecture

Dimension	Mnemosyne	Zep
Process model	In-process Python library	SaaS cloud platform (Graphiti engine)
IPC overhead	Zero (direct function calls)	HTTPS + JSON serialization to api.getzep.com
Database	SQLite (single file, WAL mode)	Proprietary temporal knowledge graph (Graphiti)
Embedding model	fastembed ONNX -- BAAI/bge-small-en-v1.5 (~67MB)	Managed by Zep (model details not disclosed)
Vector search	sqlite-vec (int8/bit/float32) or numpy fallback	Managed vector store within Graphiti
Cold start	Instant (if models cached locally)	API key configuration + network round-trip
Runtime memory	~10-20MB per session (SQLite + ONNX)	Zero local resources (everything remote)
Self-hosted option	Yes (default)	No (Community Edition deprecated, SaaS only)

Memory Model

Mnemosyne: BEAM (Bilevel Episodic-Associative Memory)

Three SQLite tables:

Tier	Purpose	Behavior
Working memory	Hot, recent context	Auto-injected into prompts. TTL-based eviction (default 24h). Max 10,000 items. FTS5 indexed.
Episodic memory	Long-term consolidated storage	Populated by `sleep()` consolidation. Hybrid vector + FTS5 search.
Scratchpad	Temporary agent workspace	Not searchable, not consolidated. Cleared explicitly. Max 1,000 items.

Additional: TripleStore -- temporal knowledge graph with valid_from/valid_until for point-in-time queries.
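
To make the point-in-time idea concrete, here is a minimal sketch using only Python's standard `sqlite3` module. The table layout and the `assert_fact`/`fact_at` helpers are hypothetical and chosen for illustration; they are not Mnemosyne's actual schema or API. Only the `valid_from`/`valid_until` column names come from the description above.

```python
import sqlite3

# Hypothetical schema for illustration; not Mnemosyne's actual table layout.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE triples (
        subject     TEXT NOT NULL,
        predicate   TEXT NOT NULL,
        object      TEXT NOT NULL,
        valid_from  TEXT NOT NULL,  -- ISO-8601 date, inclusive
        valid_until TEXT            -- ISO-8601 date, exclusive; NULL = still valid
    )
    """
)


def assert_fact(subject, predicate, obj, valid_from):
    """Close any open fact for (subject, predicate), then insert the new one."""
    conn.execute(
        "UPDATE triples SET valid_until = ? "
        "WHERE subject = ? AND predicate = ? AND valid_until IS NULL",
        (valid_from, subject, predicate),
    )
    conn.execute(
        "INSERT INTO triples VALUES (?, ?, ?, ?, NULL)",
        (subject, predicate, obj, valid_from),
    )


def fact_at(subject, predicate, when):
    """Return the object that was valid at date `when`, or None."""
    row = conn.execute(
        "SELECT object FROM triples "
        "WHERE subject = ? AND predicate = ? AND valid_from <= ? "
        "AND (valid_until IS NULL OR valid_until > ?)",
        (subject, predicate, when, when),
    ).fetchone()
    return row[0] if row else None


assert_fact("alice", "works_at", "Acme", "2024-01-01")
assert_fact("alice", "works_at", "Globex", "2025-06-01")

print(fact_at("alice", "works_at", "2024-09-15"))  # -> Acme
print(fact_at("alice", "works_at", "2025-07-01"))  # -> Globex
```

The older row is never deleted; it is closed by setting `valid_until`, which is what allows the earlier answer to be recovered later.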

Core operations: remember(), recall(), sleep() -- intentionally simple.
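
As a rough mental model of how the three operations relate, the toy class below mimics the working/episodic split in plain Python. Everything here is an assumption made for illustration: the `ToyMemory` class, its method signatures, the `now` parameter, and the rule that `sleep()` moves items past their TTL into episodic storage. The real library's signatures and consolidation logic may differ.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ToyMemory:
    """Illustrative two-tier store; names and behavior are assumptions."""

    ttl_seconds: float = 24 * 3600                # 24h, the default TTL cited above
    working: list = field(default_factory=list)   # (timestamp, text) pairs
    episodic: list = field(default_factory=list)  # consolidated text

    def remember(self, text, now=None):
        """Append a memory to the working tier with a timestamp."""
        self.working.append((time.time() if now is None else now, text))

    def sleep(self, now=None):
        """Move items older than the TTL from working to episodic memory."""
        now = time.time() if now is None else now
        expired = [(t, s) for t, s in self.working if now - t >= self.ttl_seconds]
        self.working = [(t, s) for t, s in self.working if now - t < self.ttl_seconds]
        self.episodic.extend(s for _, s in expired)
        return len(expired)

    def recall(self, query):
        """Naive substring match over both tiers (a stand-in for hybrid search)."""
        q = query.lower()
        hits = [s for _, s in self.working if q in s.lower()]
        hits += [s for s in self.episodic if q in s.lower()]
        return hits


mem = ToyMemory()
mem.remember("User prefers dark mode", now=0)
mem.remember("Deploy target is staging", now=90_000)
moved = mem.sleep(now=100_000)  # only the first item is older than 24h
```

After `sleep()`, the older memory lives in the episodic tier but is still returned by `recall("dark")`, while the newer one stays in working memory.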

Zep: Graphiti Temporal Knowledge Graph

Per-user knowledge graphs built by the Graphiti framework. Three graph element types (two kinds of node and one kind of edge):

Element	Purpose
Entity Nodes	Represent people, organizations, concepts extracted from conversations
Entity Edges	Relationships between entities with temporal validity
Episodic Nodes	Time-stamped events and facts linked to entities

Nodes and edges carry valid_at/invalid_at timestamps for full temporal versioning. Facts can be superseded or expired without data loss -- the graph maintains the full edit history.

Core operations: Add memory via API/SDK, retrieve via hybrid semantic + temporal + graph search.

Zep operation	Mnemosyne equivalent	Gap?
Add memory -- Structured fact extraction via LLM, entity resolution, graph insertion	`remember()` with `extract=True` enables LLM fact extraction. `extract_entities=True` captures entity mentions.	Partial gap. Zep's entity resolution and structured extraction are fully automatic and core to the system. Mnemosyne's are opt-in and simpler.
Recall -- Hybrid semantic + temporal + graph traversal, sub-200ms retrieval	`recall()` -- hybrid (vector + FTS5 + importance) x recency decay. Single-pass.	Design difference. Zep has deeper temporal reasoning (valid_at/invalid_at edges). Mnemosyne is simpler by choice.
Temporal query -- Point-in-time queries, fact versioning, "what did we know and when"	TripleStore `valid_from`/`valid_until` supports temporal queries but without Zep's automatic fact evolution tracking	Gap. Zep's Graphiti engine was purpose-built for temporal reasoning and achieves 94.8% on the DMR benchmark. Mnemosyne has basic temporal support.

Retrieval

Feature	Mnemosyne	Zep
Vector search	sqlite-vec (cosine distance)	Managed vector search within Graphiti
Keyword search	SQLite FTS5	Not publicly documented
Graph search	TripleStore (subject-predicate-object, temporal)	Native knowledge graph traversal across Entity Nodes, Edges, Episodic Nodes
Temporal search	`temporal_weight` + `temporal_halflife` params on `recall()`	Full temporal versioning with `valid_at`/`invalid_at` on every node and edge. Point-in-time queries.
Scoring formula	`vec_weight * vec + fts_weight * fts + importance_weight * importance`, then multiplied by a recency decay factor	Hybrid semantic + temporal + graph scoring (proprietary)
Default weights	50% vector, 30% FTS, 20% importance	Not publicly documented
Configurable?	Yes -- per-query `vec_weight`, `fts_weight`, `importance_weight` params	Limited (API-level filters, not scoring weights)
Reranking	None (single-pass)	Not publicly documented
Retrieval latency	~2-10ms (in-process, no network)	Sub-200ms (network round-trip to Zep cloud)
Temporal benchmark	Not benchmarked	94.8% on DMR (Deep Memory Retrieval)
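
The Mnemosyne scoring formula in the table can be sketched in a few lines. The weights below are the documented defaults (50% vector, 30% FTS, 20% importance). The shape of the recency decay is an assumption: this sketch uses an exponential half-life, and the 24-hour default for `temporal_halflife` is a placeholder value, not a documented one. The `hybrid_score` function name is also hypothetical, and `temporal_weight` is left out because the source does not say how it is combined.

```python
def hybrid_score(vec, fts, importance, age_hours,
                 vec_weight=0.5, fts_weight=0.3, importance_weight=0.2,
                 temporal_halflife=24.0):
    """Weighted sum of the three signals, scaled by an assumed recency decay.

    vec, fts, and importance are expected to be normalized to [0, 1].
    """
    base = vec_weight * vec + fts_weight * fts + importance_weight * importance
    # Assumed decay: the score halves every `temporal_halflife` hours.
    decay = 0.5 ** (age_hours / temporal_halflife)
    return base * decay


fresh = hybrid_score(vec=0.9, fts=0.4, importance=0.5, age_hours=0)
day_old = hybrid_score(vec=0.9, fts=0.4, importance=0.5, age_hours=24)
keyword_heavy = hybrid_score(vec=0.9, fts=0.4, importance=0.5, age_hours=0,
                             vec_weight=0.2, fts_weight=0.7, importance_weight=0.1)
```

With the default weights a fresh memory scores about 0.67, and under the assumed decay the same memory one half-life later scores about half of that. Passing different weights per query, as in the last call, shifts the ranking toward keyword matches.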

Entity Extraction

Feature	Mnemosyne	Zep
Method	Regex patterns + pure Python Levenshtein distance	LLM-driven extraction with automatic entity resolution
Patterns	`@mentions`, `#hashtags`, "quoted phrases", capitalized sequences (2-5 words)	Full entity recognition from conversation context
Fuzzy matching	Levenshtein distance with prefix/substring bonuses	Automatic entity resolution and normalization
Storage	TripleStore triples: `(memory_id, "mentions", "entity_name")`	Entity Nodes in per-user knowledge graph
Speed	~0.01ms per extraction	Network-dependent (remote LLM call)
Opt-in?	`extract_entities=True` on `remember()`	Automatic (core part of ingestion pipeline)

Verdict: Mnemosyne's regex approach is fast, local, and dependency-free but misses nuanced entity relationships. Zep's automatic entity resolution is more powerful but requires network calls and is not transparent. This is a deliberate trade-off: speed and local control versus NLP sophistication.
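
For a sense of what the regex-plus-Levenshtein approach involves, here is a small self-contained sketch. The regular expressions and the `extract_entities` and `levenshtein` functions are written for this page and are assumptions, not Mnemosyne's actual patterns or code. The prefix/substring bonuses mentioned in the table are omitted.

```python
import re

# Illustrative patterns for the four categories listed in the table.
PATTERNS = [
    re.compile(r"@(\w+)"),                                  # @mentions
    re.compile(r"#(\w+)"),                                  # #hashtags
    re.compile(r'"([^"]+)"'),                               # "quoted phrases"
    re.compile(r"\b([A-Z][a-z]+(?: [A-Z][a-z]+){1,4})\b"),  # 2-5 capitalized words
]


def extract_entities(text):
    """Return unique entity strings in pattern order, then match order."""
    found = []
    for pattern in PATTERNS:
        for match in pattern.findall(text):
            if match not in found:
                found.append(match)
    return found


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, keeping two rows in memory."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


entities = extract_entities(
    'Ask @maria about #billing and the "refund policy" with Acme Data Systems.'
)
print(entities)
print(levenshtein("kitten", "sitting"))  # -> 3
```

The sketch also shows the limitation named in the verdict: it finds surface strings, but nothing in it knows that "Acme" and "Acme Data Systems" refer to the same organization unless the edit distance happens to be small.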

Integrations

MCP (Model Context Protocol)

	Mnemosyne	Zep
Tools	6 tools	13 tools
Transports	stdio + SSE	stdio + SSE
SDKs	Python (core library)	Python, TypeScript, Go

Mnemosyne MCP tools (6):

Tool	Description
`mnemosyne_remember`	Store a memory (supports entity extraction, fact extraction, bank selection)
`mnemosyne_recall`	Search memories with hybrid scoring and configurable weights
`mnemosyne_sleep`	Run consolidation cycle
`mnemosyne_scratchpad_read`	Read agent scratchpad
`mnemosyne_scratchpad_write`	Write to scratchpad
`mnemosyne_get_stats`	Get memory statistics

mnemosyne mcp                               # stdio transport (Claude Desktop, etc.)
mnemosyne mcp --transport sse --port 8080   # SSE transport (web clients)
mnemosyne mcp --bank project_a              # scoped to a specific bank

Zep MCP tools (13): Includes user/session management, fact retrieval, graph traversal, temporal queries, and more -- reflecting its broader surface area as a cloud platform.

Hermes Agent Integration

	Mnemosyne	Zep
Hermes	Native (in-process, no serialization). 15 tools + 3 hooks via `plugin.yaml`.	MCP client only (network round-trip per call)
Hooks	`pre_llm_call`, `on_session_start`, `post_tool_call`	None (stateless API)

Deployment & Ownership

Feature	Mnemosyne	Zep
Self-hosted	Yes (`pip install mnemosyne`)	No (Community Edition deprecated; SaaS only)
Offline capable	Yes (zero network dependency)	No (requires internet access to api.getzep.com)
Data locality	All data on your disk (SQLite files)	All data in Zep's cloud (US-hosted)
Privacy	Full -- data never leaves your machine	Data processed and stored by third party
Vendor lock-in	None -- open source, standard SQLite format	High -- proprietary Graphiti engine, no export path documented
Cost	Free (MIT licensed)	Free tier (1K credits), Flex $125/mo, Flex Plus $375/mo, Enterprise custom
Credit system	N/A	Credit-based pricing -- different operations consume different credit amounts

Pricing (Zep)

Zep uses a credit-based pricing model. Different operations (add memory, search, graph traversal) consume different numbers of credits.

Tier	Price	Credits	Best for
Free	$0	1,000	Evaluation and prototyping
Flex	$125/mo	~25,000	Individual developers, small projects
Flex Plus	$375/mo	~100,000	Small teams, production agents
Enterprise	Custom	Custom	Scale deployments, SLAs, dedicated support

Mnemosyne is free (MIT license) with no usage caps, no credit systems, and no recurring costs. You only pay for the compute you run it on.
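
To compare the paid tiers on a per-credit basis, the arithmetic can be worked out from the table. The credit counts are the approximate figures listed above. How many credits a given operation consumes is not stated here, so the sketch stops at price per 1,000 credits; the `price_per_thousand` helper is hypothetical.

```python
# (monthly price in USD, approximate monthly credits) from the pricing table
TIERS = {
    "Flex": (125, 25_000),
    "Flex Plus": (375, 100_000),
}


def price_per_thousand(price_usd, credits):
    """USD per 1,000 credits for a tier."""
    return price_usd * 1000 / credits


per_thousand = {name: price_per_thousand(*tier) for name, tier in TIERS.items()}
for name, usd in per_thousand.items():
    print(f"{name}: ${usd:.2f} per 1,000 credits")
```

On these figures Flex works out to $5.00 per 1,000 credits and Flex Plus to $3.75, so the larger tier is about 25% cheaper per credit.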

Additional Features

Mnemosyne-specific (not in Zep)

Feature	Module	Description
Streaming	`core/streaming.py`	`MemoryStream` with push (callbacks) and pull (iterator) patterns. Thread-safe event buffer.
Delta sync	`core/streaming.py`	`DeltaSync` -- incremental synchronization between Mnemosyne instances with checkpointed resume.
Pattern detection	`core/patterns.py`	`PatternDetector` -- temporal (hour/weekday), content (keyword frequency, co-occurrence), sequence patterns.
Memory compression	`core/patterns.py`	`MemoryCompressor` -- dictionary-based, RLE, and semantic compression strategies.
Plugin system	`core/plugins.py`	`MnemosynePlugin` base class with 4 lifecycle hooks. Discovers plugins from `~/.hermes/mnemosyne/plugins/`.
Diagnostics	`diagnose.py`	PII-safe health check -- dependencies, database state, vector readiness. No memory content or API keys.
Memory banks	`BankManager`	Named, isolated memory banks with per-bank SQLite files. No credit limits, no per-bank pricing.
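
The push and pull patterns named in the Streaming row can be illustrated with a small thread-safe buffer. This is a sketch of the general technique only: the `ToyStream` class and its `subscribe`, `publish`, and `drain` methods are hypothetical names and do not reflect the `MemoryStream` API in `core/streaming.py`.

```python
import threading
from collections import deque


class ToyStream:
    """Thread-safe event buffer with push (callbacks) and pull (iterator)."""

    def __init__(self, maxlen=1000):
        self._buffer = deque(maxlen=maxlen)  # oldest events drop off when full
        self._callbacks = []
        self._lock = threading.Lock()

    def subscribe(self, callback):
        """Push pattern: register a callable invoked on every published event."""
        with self._lock:
            self._callbacks.append(callback)

    def publish(self, event):
        """Buffer the event, then notify subscribers."""
        with self._lock:
            self._buffer.append(event)
            callbacks = list(self._callbacks)
        # Call subscribers outside the lock so a callback can publish safely.
        for callback in callbacks:
            callback(event)

    def drain(self):
        """Pull pattern: yield buffered events oldest first, removing them."""
        while True:
            with self._lock:
                if not self._buffer:
                    return
                event = self._buffer.popleft()
            yield event


seen = []
stream = ToyStream()
stream.subscribe(seen.append)
stream.publish({"op": "remember", "id": 1})
stream.publish({"op": "recall", "id": 2})
pulled = list(stream.drain())
```

Here `seen` was filled by the push path as each event arrived, while `pulled` was filled by the consumer iterating at its own pace; both end up with the same two events.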

Zep-specific (not in Mnemosyne)

Feature	Description
Best-in-class temporal reasoning	94.8% on DMR benchmark. `valid_at`/`invalid_at` timestamps on every node and edge. Full fact versioning.
Per-user knowledge graphs	Automatic entity resolution, relationship extraction, and graph building per user/session.
Context engineering	Zep positions itself as a "context engineering" platform -- managing what context AI agents need, when they need it.
Managed infrastructure	Zero ops burden. No database to manage, no models to download, no updates to apply.
Multi-language SDKs	Official Python, TypeScript, and Go SDKs with idiomatic APIs.
Automatic entity resolution	Entities are automatically normalized and deduplicated across conversations.

Performance Characteristics

Metric	Mnemosyne	Zep
Recall latency (10K corpus)	~2-10ms -- in-process SQLite + sqlite-vec, no HTTP overhead	Sub-200ms -- HTTPS round-trip to Zep cloud + Graphiti traversal
IPC model	Direct Python function call	HTTPS POST to api.getzep.com -> JSON serialization -> response parsing
Storage footprint	~50-100MB SQLite file per 10K memories	Zero local storage (all remote)
Model download	One-time ~67MB (fastembed ONNX)	None (models managed by Zep)
Runtime memory	~10-20MB per session	Minimal (thin API client)
Network dependency	None	Full -- every operation requires internet access
Temporal query perf	~1-5ms (SQLite index scan)	Sub-200ms (Graphiti temporal traversal)

Important caveat on latency numbers: Mnemosyne's raw latency advantage comes from being an in-process library with no network calls. But Zep's sub-200ms retrieval includes sophisticated temporal graph traversal that Mnemosyne does not attempt. Latency is not quality -- these systems are doing different work per query.

When to Choose What

Choose Mnemosyne if:

You want a `pip install` setup with no cloud dependencies or API keys
You need the fastest possible recall latency for interactive agent loops
You require full data privacy -- memory content must never leave your machine
You're running in resource-constrained or air-gapped environments
You're building a single-user, single-machine agent (Hermes, Claude Desktop, etc.)
You want an MCP-compatible memory layer (stdio + SSE) with no usage caps
You want memory banks with per-project isolation without per-bank pricing
You want full control over the memory model and data format
You need offline capability -- your agent works without internet

Choose Zep if:

You need best-in-class temporal reasoning (94.8% DMR, fact versioning, point-in-time queries)
You need automatic entity resolution and relationship extraction at scale
You need per-user knowledge graphs for multi-user applications
You want zero ops burden -- no databases, no model downloads, no updates
You're building a production SaaS that needs managed memory infrastructure
You need multi-language SDK support (Python, TypeScript, Go) with first-class treatment
You're okay with credit-based pricing and can budget for recurring costs
You don't need self-hosted or offline capability

Neither is "better." They optimize for fundamentally different constraints: ownership and simplicity versus managed sophistication and temporal reasoning depth.

Known Gaps in Mnemosyne (honest list)

Gap	Severity	Workaround
No automatic entity normalization	Medium	`extract_entities=True` captures entities; fuzzy matching helps but doesn't resolve coreference
No per-user knowledge graphs	Medium for multi-user apps	Per-bank SQLite isolation provides domain separation, not per-user isolation
Temporal reasoning is basic (TripleStore)	Medium for temporal-heavy use cases	`temporal_weight` + `temporal_halflife` on `recall()` covers recency, not versioned fact evolution
No managed cloud option	Low for self-hosted users	Export/import JSON for migration; backup SQLite files directly
No TypeScript or Go SDK	Medium for non-Python environments	MCP protocol is language-agnostic; use MCP client in any language
No automatic fact versioning/deprecation	Medium	Manual `invalidate(memory_id, replacement_id=new_id)`

This page compares Mnemosyne v2.8.0 against Zep's current cloud offering as of May 2026. Zep's Community Edition was deprecated and is no longer available for self-hosting. Every Mnemosyne feature listed has been verified against the source code. If anything here is wrong, please open an issue -- we'll fix it.