Mnemosyne vs Zep
An honest, technical comparison for users who need memory systems for AI agents and are evaluating Zep's Graphiti temporal knowledge graph against Mnemosyne's local-first approach.
Last updated: 2026-05-13 · Mnemosyne v2.8.0
TL;DR: These are fundamentally different deployment philosophies. Zep is a cloud-native context engineering platform with best-in-class temporal reasoning and a sophisticated knowledge graph engine. Mnemosyne is a local-first memory layer optimized for simplicity, zero-network deployments, and full user control. Zep excels at temporal fact tracking; Mnemosyne excels at being a lightweight, dependency-free memory layer you fully own.
Architecture
| Dimension | Mnemosyne | Zep |
|---|---|---|
| Process model | In-process Python library | SaaS cloud platform (Graphiti engine) |
| IPC overhead | Zero (direct function calls) | HTTPS + JSON serialization to api.getzep.com |
| Database | SQLite (single file, WAL mode) | Proprietary temporal knowledge graph (Graphiti) |
| Embedding model | fastembed ONNX -- BAAI/bge-small-en-v1.5 (~67MB) | Managed by Zep (model details not disclosed) |
| Vector search | sqlite-vec (int8/bit/float32) or numpy fallback | Managed vector store within Graphiti |
| Cold start | Instant (if models cached locally) | API key configuration + network round-trip |
| Runtime memory | ~10-20MB per session (SQLite + ONNX) | Zero local resources (everything remote) |
| Self-hosted option | Yes (default) | No (Community Edition deprecated, SaaS only) |
Memory Model
Mnemosyne: BEAM (Bilevel Episodic-Associative Memory)
Three SQLite tables:
| Tier | Purpose | Behavior |
|---|---|---|
| Working memory | Hot, recent context | Auto-injected into prompts. TTL-based eviction (default 24h). Max 10,000 items. FTS5 indexed. |
| Episodic memory | Long-term consolidated storage | Populated by sleep() consolidation. Hybrid vector + FTS5 search. |
| Scratchpad | Temporary agent workspace | Not searchable, not consolidated. Cleared explicitly. Max 1,000 items. |
Additional: TripleStore -- temporal knowledge graph with valid_from/valid_until for point-in-time queries.
Core operations: remember(), recall(), sleep() -- intentionally simple.
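To make the valid_from/valid_until idea concrete, here is a minimal sketch of a point-in-time lookup over a temporal triple table. The table name, column layout, and the `facts_at` helper are assumptions made for illustration; they are not Mnemosyne's actual TripleStore schema or API.

```python
import sqlite3

# Hypothetical schema: illustrates valid_from/valid_until only,
# not Mnemosyne's real TripleStore layout.
SCHEMA = """
CREATE TABLE triples (
    subject     TEXT NOT NULL,
    predicate   TEXT NOT NULL,
    object      TEXT NOT NULL,
    valid_from  TEXT NOT NULL,  -- ISO-8601 date, inclusive
    valid_until TEXT            -- ISO-8601 date, exclusive; NULL = still valid
);
"""

def facts_at(conn, subject, predicate, when):
    """Return the objects valid for (subject, predicate) at the date `when`."""
    rows = conn.execute(
        """
        SELECT object FROM triples
        WHERE subject = ? AND predicate = ?
          AND valid_from <= ?
          AND (valid_until IS NULL OR valid_until > ?)
        ORDER BY valid_from
        """,
        (subject, predicate, when, when),
    ).fetchall()
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?, ?, ?)",
    [
        ("alice", "works_at", "Acme", "2024-01-01", "2025-06-01"),
        ("alice", "works_at", "Globex", "2025-06-01", None),
    ],
)

print(facts_at(conn, "alice", "works_at", "2025-01-15"))  # ['Acme']
print(facts_at(conn, "alice", "works_at", "2025-07-01"))  # ['Globex']
```

ISO-8601 date strings sort lexicographically, so plain string comparison is enough for the range check in this sketch.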
Zep: Graphiti Temporal Knowledge Graph
Per-user knowledge graphs built by the Graphiti framework. Three graph element types:
| Element | Purpose |
|---|---|
| Entity Nodes | Represent people, organizations, concepts extracted from conversations |
| Entity Edges | Relationships between entities with temporal validity |
| Episodic Nodes | Time-stamped events and facts linked to entities |
Nodes and edges carry valid_at/invalid_at timestamps for full temporal versioning. Facts can be superseded or expired without data loss -- the graph maintains the full edit history.
Core operations: Add memory via API/SDK, retrieve via hybrid semantic + temporal + graph search.
| Operation | Mnemosyne equivalent | Gap? |
|---|---|---|
| Add memory -- Structured fact extraction via LLM, entity resolution, graph insertion | remember() with extract=True enables LLM fact extraction. extract_entities=True captures entity mentions. | Partial gap. Zep's entity resolution and structured extraction are fully automatic and core to the system. Mnemosyne's are opt-in and simpler. |
| Recall -- Hybrid semantic + temporal + graph traversal, sub-200ms retrieval | recall() -- hybrid (vector + FTS5 + importance) x recency decay. Single-pass. | Design difference. Zep has deeper temporal reasoning (valid_at/invalid_at edges). Mnemosyne is simpler by choice. |
| Temporal query -- Point-in-time queries, fact versioning, "what did we know and when" | TripleStore valid_from/valid_until supports temporal queries but without Zep's automatic fact evolution tracking | Gap. Zep's Graphiti engine was purpose-built for temporal reasoning and achieves 94.8% on the DMR benchmark. Mnemosyne has basic temporal support. |
Retrieval
| Feature | Mnemosyne | Zep |
|---|---|---|
| Vector search | sqlite-vec (cosine distance) | Managed vector search within Graphiti |
| Keyword search | SQLite FTS5 | Not publicly documented |
| Graph search | TripleStore (subject-predicate-object, temporal) | Native knowledge graph traversal across Entity Nodes, Edges, Episodic Nodes |
| Temporal search | temporal_weight + temporal_halflife params on recall() | Full temporal versioning with valid_at/invalid_at on every node and edge. Point-in-time queries. |
| Scoring formula | vec_weight x vec + fts_weight x fts + importance_weight x importance, then x recency decay | Hybrid semantic + temporal + graph scoring (proprietary) |
| Default weights | 50% vector, 30% FTS, 20% importance | Not publicly documented |
| Configurable? | Yes -- per-query vec_weight, fts_weight, importance_weight params | Limited (API-level filters, not scoring weights) |
| Reranking | None (single-pass) | Not publicly documented |
| Retrieval latency | ~2-10ms (in-process, no network) | Sub-200ms (network round-trip to Zep cloud) |
| Temporal benchmark | Not benchmarked | 94.8% on DMR (Deep Memory Retrieval) |
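The Mnemosyne scoring formula in the table above can be sketched in a few lines. The weighted sum and the default weights come from the table; the exponential half-life form of the recency decay, the 24-hour default half-life, and the `hybrid_score` name are assumptions for illustration. How `temporal_weight` enters the real computation is not described here, so it is left out.

```python
def hybrid_score(vec, fts, importance, age_hours,
                 vec_weight=0.5, fts_weight=0.3, importance_weight=0.2,
                 halflife_hours=24.0):
    """Weighted sum of the three signals, scaled by a recency decay.

    All inputs are assumed to be normalized to [0, 1]. The decay halves
    the score every `halflife_hours` (assumed form, not confirmed).
    """
    base = vec_weight * vec + fts_weight * fts + importance_weight * importance
    decay = 0.5 ** (age_hours / halflife_hours)
    return base * decay

fresh = hybrid_score(vec=0.8, fts=0.5, importance=1.0, age_hours=0)
day_old = hybrid_score(vec=0.8, fts=0.5, importance=1.0, age_hours=24)
print(fresh, day_old)  # the day-old score is half the fresh one
```

Passing different `vec_weight`, `fts_weight`, and `importance_weight` values per call mirrors the per-query configurability listed in the table.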
Entity Extraction
| Feature | Mnemosyne | Zep |
|---|---|---|
| Method | Regex patterns + pure Python Levenshtein distance | LLM-driven extraction with automatic entity resolution |
| Patterns | @mentions, #hashtags, "quoted phrases", capitalized sequences (2-5 words) | Full entity recognition from conversation context |
| Fuzzy matching | Levenshtein distance with prefix/substring bonuses | Automatic entity resolution and normalization |
| Storage | TripleStore triples: (memory_id, "mentions", "entity_name") | Entity Nodes in per-user knowledge graph |
| Speed | ~0.01ms per extraction | Network-dependent (remote LLM call) |
| Opt-in? | extract_entities=True on remember() | Automatic (core part of ingestion pipeline) |
Verdict: Mnemosyne's regex approach is fast, local, and dependency-free but misses nuanced entity relationships. Zep's automatic entity resolution is more powerful but requires network calls and is not transparent. This is a deliberate trade-off: speed and local control versus NLP sophistication.
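For a sense of what the regex-plus-Levenshtein approach involves, here is a minimal sketch covering the four pattern families from the table and a plain edit-distance function. The exact regular expressions and function names are assumptions, not Mnemosyne's source, and the prefix/substring bonuses are omitted.

```python
import re

# Illustrative patterns only; the real ones may differ.
PATTERNS = [
    re.compile(r"@(\w+)"),                                   # @mentions
    re.compile(r"#(\w+)"),                                   # #hashtags
    re.compile(r'"([^"]+)"'),                                # "quoted phrases"
    re.compile(r"\b((?:[A-Z][a-z]+ ){1,4}[A-Z][a-z]+)\b"),   # 2-5 capitalized words
]

def extract_entities(text):
    """Return unique entity strings in pattern order, then match order."""
    found = []
    for pattern in PATTERNS:
        for match in pattern.findall(text):
            if match not in found:
                found.append(match)
    return found

def levenshtein(a, b):
    """Edit distance via the classic two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

text = 'Ping @dana about #billing: "invoice export" broke for Acme Corp.'
print(extract_entities(text))
print(levenshtein("Acme Corp", "Acme Corp."))  # 1
```

The sketch shows the trade-off named in the verdict: it needs no model or network call, but "Acme Corp" and "the company" would never be linked, because nothing here resolves coreference.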
Integrations
MCP (Model Context Protocol)
| | Mnemosyne | Zep |
|---|---|---|
| Tools | 6 tools | 13 tools |
| Transports | stdio + SSE | stdio + SSE |
| SDKs | Python (core library) | Python, TypeScript, Go |
Mnemosyne MCP tools (6):
| Tool | Description |
|---|---|
| mnemosyne_remember | Store a memory (supports entity extraction, fact extraction, bank selection) |
| mnemosyne_recall | Search memories with hybrid scoring and configurable weights |
| mnemosyne_sleep | Run consolidation cycle |
| mnemosyne_scratchpad_read | Read agent scratchpad |
| mnemosyne_scratchpad_write | Write to scratchpad |
| mnemosyne_get_stats | Get memory statistics |
```bash
mnemosyne mcp                               # stdio transport (Claude Desktop, etc.)
mnemosyne mcp --transport sse --port 8080   # SSE transport (web clients)
mnemosyne mcp --bank project_a              # scoped to a specific bank
```
Zep MCP tools (13): Includes user/session management, fact retrieval, graph traversal, temporal queries, and more -- reflecting its broader surface area as a cloud platform.
Hermes Agent Integration
| | Mnemosyne | Zep |
|---|---|---|
| Hermes | Native (in-process, no serialization). 15 tools + 3 hooks via plugin.yaml. | MCP client only (network round-trip per call) |
| Hooks | pre_llm_call, on_session_start, post_tool_call | None (stateless API) |
Deployment & Ownership
| Feature | Mnemosyne | Zep |
|---|---|---|
| Self-hosted | Yes (pip install mnemosyne) | No (Community Edition deprecated; SaaS only) |
| Offline capable | Yes (zero network dependency) | No (requires internet access to api.getzep.com) |
| Data locality | All data on your disk (SQLite files) | All data in Zep's cloud (US-hosted) |
| Privacy | Full -- data never leaves your machine | Data processed and stored by third party |
| Vendor lock-in | None -- open source, standard SQLite format | High -- proprietary Graphiti engine, no export path documented |
| Cost | Free (MIT licensed) | Free tier (1K credits), Flex $125/mo, Flex Plus $375/mo, Enterprise custom |
| Credit system | N/A | Credit-based pricing -- different operations consume different credit amounts |
Pricing (Zep)
Zep uses a credit-based pricing model. Different operations (add memory, search, graph traversal) consume different numbers of credits.
| Tier | Price | Credits | Best for |
|---|---|---|---|
| Free | $0 | 1,000 | Evaluation and prototyping |
| Flex | $125/mo | ~25,000 | Individual developers, small projects |
| Flex Plus | $375/mo | ~100,000 | Small teams, production agents |
| Enterprise | Custom | Custom | Scale deployments, SLAs, dedicated support |
Mnemosyne is free (MIT license) with no usage caps, no credit systems, and no recurring costs. You only pay for the compute you run it on.
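Because the tiers bundle different credit amounts, it can help to compare them on a per-credit basis. The sketch below uses only the prices and approximate credit counts from the table above; since different operations consume different numbers of credits, it says nothing about cost per operation.

```python
# Prices and approximate monthly credits, taken from the pricing table.
TIERS = {
    "Flex": (125, 25_000),
    "Flex Plus": (375, 100_000),
}

def cost_per_1k_credits(price_usd, credits):
    """Effective USD cost of 1,000 credits at a given tier."""
    return price_usd * 1000 / credits

for name, (price, credits) in TIERS.items():
    print(f"{name}: ${cost_per_1k_credits(price, credits):.2f} per 1,000 credits")
```

On these figures, Flex works out to $5.00 and Flex Plus to $3.75 per 1,000 credits.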
Additional Features
Mnemosyne-specific (not in Zep)
| Feature | Module | Description |
|---|---|---|
| Streaming | core/streaming.py | MemoryStream with push (callbacks) and pull (iterator) patterns. Thread-safe event buffer. |
| Delta sync | core/streaming.py | DeltaSync -- incremental synchronization between Mnemosyne instances with checkpointed resume. |
| Pattern detection | core/patterns.py | PatternDetector -- temporal (hour/weekday), content (keyword frequency, co-occurrence), sequence patterns. |
| Memory compression | core/patterns.py | MemoryCompressor -- dictionary-based, RLE, and semantic compression strategies. |
| Plugin system | core/plugins.py | MnemosynePlugin base class with 4 lifecycle hooks. Discovers plugins from ~/.hermes/mnemosyne/plugins/. |
| Diagnostics | diagnose.py | PII-safe health check -- dependencies, database state, vector readiness. No memory content or API keys. |
| Memory banks | BankManager | Named, isolated memory banks with per-bank SQLite files. No credit limits, no per-bank pricing. |
Zep-specific (not in Mnemosyne)
| Feature | Description |
|---|---|
| Best-in-class temporal reasoning | 94.8% on DMR benchmark. valid_at/invalid_at timestamps on every node and edge. Full fact versioning. |
| Per-user knowledge graphs | Automatic entity resolution, relationship extraction, and graph building per user/session. |
| Context engineering | Zep positions itself as a "context engineering" platform -- managing what context AI agents need, when they need it. |
| Managed infrastructure | Zero ops burden. No database to manage, no models to download, no updates to apply. |
| Multi-language SDKs | Official Python, TypeScript, and Go SDKs with idiomatic APIs. |
| Automatic entity resolution | Entities are automatically normalized and deduplicated across conversations. |
Performance Characteristics
| Metric | Mnemosyne | Zep |
|---|---|---|
| Recall latency (10K corpus) | ~2-10ms -- in-process SQLite + sqlite-vec, no HTTP overhead | Sub-200ms -- HTTPS round-trip to Zep cloud + Graphiti traversal |
| IPC model | Direct Python function call | HTTPS POST to api.getzep.com -> JSON serialization -> response parsing |
| Storage footprint | ~50-100MB SQLite file per 10K memories | Zero local storage (all remote) |
| Model download | One-time ~67MB (fastembed ONNX) | None (models managed by Zep) |
| Runtime memory | ~10-20MB per session | Minimal (thin API client) |
| Network dependency | None | Full -- every operation requires internet access |
| Temporal query perf | ~1-5ms (SQLite index scan) | Sub-200ms (Graphiti temporal traversal) |
Important caveat on latency numbers: Mnemosyne's raw latency advantage comes from being an in-process library with no network calls. But Zep's sub-200ms retrieval includes sophisticated temporal graph traversal that Mnemosyne does not attempt. Latency is not quality -- these systems are doing different work per query.
When to Choose What
Choose Mnemosyne if:
- You need pip install with zero cloud dependencies or API keys
- You need the fastest possible recall latency for interactive agent loops
- You require full data privacy -- memory content must never leave your machine
- You're running on resource-constrained or air-gapped environments
- You're building a single-user, single-machine agent (Hermes, Claude Desktop, etc.)
- You want an MCP-compatible memory layer (stdio + SSE) with no usage caps
- You want memory banks with per-project isolation without per-bank pricing
- You want full control over the memory model and data format
- You need offline capability -- your agent works without internet
Choose Zep if:
- You need best-in-class temporal reasoning (94.8% DMR, fact versioning, point-in-time queries)
- You need automatic entity resolution and relationship extraction at scale
- You need per-user knowledge graphs for multi-user applications
- You want zero ops burden -- no databases, no model downloads, no updates
- You're building a production SaaS that needs managed memory infrastructure
- You need multi-language SDK support (Python, TypeScript, Go) with first-class treatment
- You're okay with credit-based pricing and can budget for recurring costs
- You don't need self-hosted or offline capability
Neither is "better." They optimize for fundamentally different constraints: ownership and simplicity versus managed sophistication and temporal reasoning depth.
Known Gaps in Mnemosyne (honest list)
| Gap | Severity | Workaround |
|---|---|---|
| No automatic entity normalization | Medium | extract_entities=True captures entities; fuzzy matching helps but doesn't resolve coreference |
| No per-user knowledge graphs | Medium for multi-user apps | Per-bank SQLite isolation provides domain separation, not per-user isolation |
| Temporal reasoning is basic (TripleStore) | Medium for temporal-heavy use cases | temporal_weight + temporal_halflife on recall() covers recency, not versioned fact evolution |
| No managed cloud option | Low for self-hosted users | Export/import JSON for migration; backup SQLite files directly |
| No TypeScript or Go SDK | Medium for non-Python environments | MCP protocol is language-agnostic; use MCP client in any language |
| No automatic fact versioning/deprecation | Medium | Manual invalidate(memory_id, replacement_id=new_id) |
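The last row means the caller is responsible for linking an outdated memory to its replacement. The toy store below illustrates that pattern. Only the `invalidate(memory_id, replacement_id=...)` call shape comes from the table; the `ToyStore` class, its fields, and the `resolve` helper are hypothetical and do not reflect Mnemosyne's implementation.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class Record:
    content: str
    valid: bool = True
    replaced_by: Optional[str] = None

class ToyStore:
    """Stand-in for a memory store; not Mnemosyne's implementation."""

    def __init__(self) -> None:
        self.records: Dict[str, Record] = {}

    def add(self, memory_id: str, content: str) -> None:
        self.records[memory_id] = Record(content)

    def invalidate(self, memory_id: str, replacement_id: Optional[str] = None) -> None:
        """Mark a memory as outdated and optionally point at its successor."""
        record = self.records[memory_id]
        record.valid = False
        record.replaced_by = replacement_id

    def resolve(self, memory_id: Optional[str]) -> Optional[str]:
        """Follow replacement links to the newest valid content, if any."""
        seen = set()
        while memory_id is not None and memory_id not in seen:
            seen.add(memory_id)
            record = self.records[memory_id]
            if record.valid:
                return record.content
            memory_id = record.replaced_by
        return None

store = ToyStore()
store.add("m1", "Alice works at Acme")
store.add("m2", "Alice works at Globex")
store.invalidate("m1", replacement_id="m2")  # the manual step
print(store.resolve("m1"))  # Alice works at Globex
```

A system with automatic fact versioning would detect the contradiction between the two memories and create this link itself; here the agent or developer has to make the call.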
This page compares Mnemosyne v2.8.0 against Zep's current cloud offering as of May 2026. Zep's Community Edition was deprecated and is no longer available for self-hosting. Every Mnemosyne feature listed has been verified against the source code. If anything here is wrong, please open an issue -- we'll fix it.