Mnemosyne vs Zep

An honest, technical comparison for users who need memory systems for AI agents and are evaluating Zep's Graphiti temporal knowledge graph against Mnemosyne's local-first approach.

Last updated: 2026-05-13 · Mnemosyne v2.8.0

TL;DR: These are fundamentally different deployment philosophies. Zep is a cloud-native context engineering platform with best-in-class temporal reasoning and a sophisticated knowledge graph engine. Mnemosyne is a local-first memory layer optimized for simplicity, zero-network deployments, and full user control. Zep excels at temporal fact tracking; Mnemosyne excels at being a lightweight, dependency-free memory layer you fully own.


Architecture

| Dimension | Mnemosyne | Zep |
| --- | --- | --- |
| Process model | In-process Python library | SaaS cloud platform (Graphiti engine) |
| IPC overhead | Zero (direct function calls) | HTTPS + JSON serialization to api.getzep.com |
| Database | SQLite (single file, WAL mode) | Proprietary temporal knowledge graph (Graphiti) |
| Embedding model | fastembed ONNX -- BAAI/bge-small-en-v1.5 (~67MB) | Managed by Zep (model details not disclosed) |
| Vector search | sqlite-vec (int8/bit/float32) or numpy fallback | Managed vector store within Graphiti |
| Cold start | Instant (if models cached locally) | API key configuration + network round-trip |
| Runtime memory | ~10-20MB per session (SQLite + ONNX) | Zero local resources (everything remote) |
| Self-hosted option | Yes (default) | No (Community Edition deprecated, SaaS only) |

Memory Model

Mnemosyne: BEAM (Bilevel Episodic-Associative Memory)

Three SQLite tables:

| Tier | Purpose | Behavior |
| --- | --- | --- |
| Working memory | Hot, recent context | Auto-injected into prompts. TTL-based eviction (default 24h). Max 10,000 items. FTS5 indexed. |
| Episodic memory | Long-term consolidated storage | Populated by sleep() consolidation. Hybrid vector + FTS5 search. |
| Scratchpad | Temporary agent workspace | Not searchable, not consolidated. Cleared explicitly. Max 1,000 items. |

Additional: TripleStore -- temporal knowledge graph with valid_from/valid_until for point-in-time queries.

Core operations: remember(), recall(), sleep() -- intentionally simple.
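
To make the valid_from/valid_until idea concrete, here is a minimal sketch of a point-in-time lookup using only Python's built-in sqlite3 module. The table layout, the column types, and the `fact_as_of` helper are assumptions made for illustration; they are not Mnemosyne's actual TripleStore schema or API.

```python
import sqlite3

# Hypothetical schema for illustration only (not Mnemosyne's real TripleStore).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE triples (
        subject     TEXT NOT NULL,
        predicate   TEXT NOT NULL,
        object      TEXT NOT NULL,
        valid_from  TEXT NOT NULL,  -- ISO-8601 date, inclusive
        valid_until TEXT            -- ISO-8601 date, exclusive; NULL = still valid
    )
    """
)
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?, ?, ?)",
    [
        ("alice", "works_at", "Acme", "2023-01-01", "2025-03-01"),
        ("alice", "works_at", "Globex", "2025-03-01", None),
    ],
)


def fact_as_of(subject, predicate, as_of):
    """Return the object valid for (subject, predicate) on `as_of`, or None."""
    row = conn.execute(
        """
        SELECT object FROM triples
        WHERE subject = ? AND predicate = ?
          AND valid_from <= ?
          AND (valid_until IS NULL OR valid_until > ?)
        ORDER BY valid_from DESC
        LIMIT 1
        """,
        (subject, predicate, as_of, as_of),
    ).fetchone()
    return row[0] if row else None


print(fact_as_of("alice", "works_at", "2024-06-15"))
print(fact_as_of("alice", "works_at", "2026-01-01"))
```

ISO-8601 date strings sort lexicographically, so plain string comparison is enough for the validity window in this sketch.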

Zep: Graphiti Temporal Knowledge Graph

Per-user knowledge graphs built by the Graphiti framework. Three graph element types:

| Element | Purpose |
| --- | --- |
| Entity Nodes | Represent people, organizations, concepts extracted from conversations |
| Entity Edges | Relationships between entities with temporal validity |
| Episodic Nodes | Time-stamped events and facts linked to entities |

All nodes and edges carry valid_at/invalid_at timestamps for full temporal versioning. Facts can be superseded or expired without data loss -- the graph maintains the full edit history.

Core operations: Add memory via API/SDK, retrieve via hybrid semantic + temporal + graph search.

| Zep operation | Mnemosyne equivalent | Gap? |
| --- | --- | --- |
| Add memory -- Structured fact extraction via LLM, entity resolution, graph insertion | remember() with extract=True enables LLM fact extraction. extract_entities=True captures entity mentions. | Partial gap. Zep's entity resolution and structured extraction are fully automatic and core to the system. Mnemosyne's are opt-in and simpler. |
| Recall -- Hybrid semantic + temporal + graph traversal, sub-200ms retrieval | recall() -- hybrid (vector + FTS5 + importance) × recency decay. Single-pass. | Design difference. Zep has deeper temporal reasoning (valid_at/invalid_at edges). Mnemosyne is simpler by choice. |
| Temporal query -- Point-in-time queries, fact versioning, "what did we know and when" | TripleStore valid_from/valid_until supports temporal queries but without Zep's automatic fact evolution tracking | Gap. Zep's Graphiti engine was purpose-built for temporal reasoning and achieves 94.8% on the DMR benchmark. Mnemosyne has basic temporal support. |

Retrieval

| Feature | Mnemosyne | Zep |
| --- | --- | --- |
| Vector search | sqlite-vec (cosine distance) | Managed vector search within Graphiti |
| Keyword search | SQLite FTS5 | Not publicly documented |
| Graph search | TripleStore (subject-predicate-object, temporal) | Native knowledge graph traversal across Entity Nodes, Edges, Episodic Nodes |
| Temporal search | temporal_weight + temporal_halflife params on recall() | Full temporal versioning with valid_at/invalid_at on every node and edge. Point-in-time queries. |
| Scoring formula | (vec_weight × vec + fts_weight × fts + importance_weight × importance) × recency decay | Hybrid semantic + temporal + graph scoring (proprietary) |
| Default weights | 50% vector, 30% FTS, 20% importance | Not publicly documented |
| Configurable? | Yes -- per-query vec_weight, fts_weight, importance_weight params | Limited (API-level filters, not scoring weights) |
| Reranking | None (single-pass) | Not publicly documented |
| Retrieval latency | ~2-10ms (in-process, no network) | Sub-200ms (network round-trip to Zep cloud) |
| Temporal benchmark | Not benchmarked | 94.8% on DMR (Deep Memory Retrieval) |
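
The Mnemosyne scoring formula in the table can be sketched in a few lines. This is an illustration under stated assumptions, not the library's implementation: the function name `hybrid_score`, the exponential half-life form of the recency decay, and the 24-hour default half-life are all hypothetical, and the sketch leaves out temporal_weight because the table does not say how it is blended in. Only the three weights and their 50/30/20 defaults come from the table.

```python
def hybrid_score(
    vec,
    fts,
    importance,
    age_hours,
    vec_weight=0.5,
    fts_weight=0.3,
    importance_weight=0.2,
    halflife_hours=24.0,  # assumed value, for illustration only
):
    """Weighted sum of the three signals, multiplied by a recency decay.

    All three signals are expected to be normalized to the range [0, 1].
    """
    base = vec_weight * vec + fts_weight * fts + importance_weight * importance
    # Assumed decay shape: the score halves every `halflife_hours`.
    decay = 0.5 ** (age_hours / halflife_hours)
    return base * decay


fresh = hybrid_score(vec=0.8, fts=0.5, importance=1.0, age_hours=0.0)
day_old = hybrid_score(vec=0.8, fts=0.5, importance=1.0, age_hours=24.0)
print(f"fresh={fresh:.3f} day_old={day_old:.3f}")
```

With these inputs the weighted sum is 0.4 + 0.15 + 0.2 = 0.75, and one assumed half-life later the score drops to 0.375.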

Entity Extraction

| Feature | Mnemosyne | Zep |
| --- | --- | --- |
| Method | Regex patterns + pure Python Levenshtein distance | LLM-driven extraction with automatic entity resolution |
| Patterns | @mentions, #hashtags, "quoted phrases", capitalized sequences (2-5 words) | Full entity recognition from conversation context |
| Fuzzy matching | Levenshtein distance with prefix/substring bonuses | Automatic entity resolution and normalization |
| Storage | TripleStore triples: (memory_id, "mentions", "entity_name") | Entity Nodes in per-user knowledge graph |
| Speed | ~0.01ms per extraction | Network-dependent (remote LLM call) |
| Opt-in? | extract_entities=True on remember() | Automatic (core part of ingestion pipeline) |

Verdict: Mnemosyne's regex approach is fast, local, and dependency-free but misses nuanced entity relationships. Zep's automatic entity resolution is more powerful but requires network calls and is not transparent. This is a deliberate trade-off: speed and local control versus NLP sophistication.
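
For a feel of what the regex-plus-Levenshtein approach looks like, here is a rough standalone sketch. The exact patterns, the `extract_entities` and `levenshtein` function names, and the output shape are assumptions for illustration; Mnemosyne's real patterns and its prefix/substring bonuses are not reproduced here.

```python
import re

# Hypothetical patterns modelled on the four categories listed in the table.
PATTERNS = {
    "mention": re.compile(r"@(\w+)"),
    "hashtag": re.compile(r"#(\w+)"),
    "quoted": re.compile(r'"([^"]+)"'),
    # Runs of 2-5 capitalized words, e.g. "New York City".
    "capitalized": re.compile(r"\b((?:[A-Z][a-z]+\s+){1,4}[A-Z][a-z]+)\b"),
}


def extract_entities(text):
    """Return a dict mapping each pattern name to the list of matches in `text`."""
    return {kind: pattern.findall(text) for kind, pattern in PATTERNS.items()}


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,  # deletion
                    current[j - 1] + 1,  # insertion
                    previous[j - 1] + (char_a != char_b),  # substitution
                )
            )
        previous = current
    return previous[-1]


sample = 'Ping @dana about #roadmap: "launch plan" for the New York City office.'
print(extract_entities(sample))
print(levenshtein("kitten", "sitting"))
```

The sketch also shows the limitation named in the verdict: the patterns find surface mentions, but nothing here links "NYC" to "New York City" or infers relationships between entities.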


Integrations

MCP (Model Context Protocol)

| Feature | Mnemosyne | Zep |
| --- | --- | --- |
| Tools | 6 tools | 13 tools |
| Transports | stdio + SSE | stdio + SSE |
| SDKs | Python (core library) | Python, TypeScript, Go |

Mnemosyne MCP tools (6):

| Tool | Description |
| --- | --- |
| mnemosyne_remember | Store a memory (supports entity extraction, fact extraction, bank selection) |
| mnemosyne_recall | Search memories with hybrid scoring and configurable weights |
| mnemosyne_sleep | Run consolidation cycle |
| mnemosyne_scratchpad_read | Read agent scratchpad |
| mnemosyne_scratchpad_write | Write to scratchpad |
| mnemosyne_get_stats | Get memory statistics |

Starting the MCP server:

    mnemosyne mcp                               # stdio transport (Claude Desktop, etc.)
    mnemosyne mcp --transport sse --port 8080   # SSE transport (web clients)
    mnemosyne mcp --bank project_a              # scoped to a specific bank

Zep MCP tools (13): Includes user/session management, fact retrieval, graph traversal, temporal queries, and more -- reflecting its broader surface area as a cloud platform.

Hermes Agent Integration

| Feature | Mnemosyne | Zep |
| --- | --- | --- |
| Hermes | Native (in-process, no serialization). 15 tools + 3 hooks via plugin.yaml. | MCP client only (network round-trip per call) |
| Hooks | pre_llm_call, on_session_start, post_tool_call | None (stateless API) |

Deployment & Ownership

| Feature | Mnemosyne | Zep |
| --- | --- | --- |
| Self-hosted | Yes (pip install mnemosyne) | No (Community Edition deprecated; SaaS only) |
| Offline capable | Yes (zero network dependency) | No (requires internet access to api.getzep.com) |
| Data locality | All data on your disk (SQLite files) | All data in Zep's cloud (US-hosted) |
| Privacy | Full -- data never leaves your machine | Data processed and stored by third party |
| Vendor lock-in | None -- open source, standard SQLite format | High -- proprietary Graphiti engine, no export path documented |
| Cost | Free (MIT licensed) | Free tier (1K credits), Flex $125/mo, Flex Plus $375/mo, Enterprise custom |
| Credit system | N/A | Credit-based pricing -- different operations consume different credit amounts |

Pricing (Zep)

Zep uses a credit-based pricing model. Different operations (add memory, search, graph traversal) consume different numbers of credits.

| Tier | Price | Credits | Best for |
| --- | --- | --- | --- |
| Free | $0 | 1,000 | Evaluation and prototyping |
| Flex | $125/mo | ~25,000 | Individual developers, small projects |
| Flex Plus | $375/mo | ~100,000 | Small teams, production agents |
| Enterprise | Custom | Custom | Scale deployments, SLAs, dedicated support |

Mnemosyne is free (MIT license) with no usage caps, no credit systems, and no recurring costs. You only pay for the compute you run it on.
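
For budgeting, the paid tiers above reduce to a per-credit price. The short sketch below does that arithmetic using the approximate credit counts from the pricing table. The `credits_per_op` argument is a hypothetical placeholder: how many credits a given Zep operation actually consumes is not stated here, so plug in the real figure from Zep's pricing documentation.

```python
# Approximate figures copied from the pricing table (credit counts are marked "~" there).
TIERS = {
    "Flex": {"usd_per_month": 125, "credits": 25_000},
    "Flex Plus": {"usd_per_month": 375, "credits": 100_000},
}


def usd_per_credit(tier):
    """Monthly price divided by the tier's approximate credit allowance."""
    entry = TIERS[tier]
    return entry["usd_per_month"] / entry["credits"]


def monthly_ops_budget(tier, credits_per_op):
    """Number of operations that fit in a tier at a hypothetical per-operation cost."""
    return TIERS[tier]["credits"] // credits_per_op


for name in TIERS:
    print(f"{name}: ${usd_per_credit(name):.5f} per credit")
```

That works out to roughly $0.005 per credit on Flex and $0.00375 per credit on Flex Plus.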


Additional Features

Mnemosyne-specific (not in Zep)

| Feature | Module | Description |
| --- | --- | --- |
| Streaming | core/streaming.py | MemoryStream with push (callbacks) and pull (iterator) patterns. Thread-safe event buffer. |
| Delta sync | core/streaming.py | DeltaSync -- incremental synchronization between Mnemosyne instances with checkpointed resume. |
| Pattern detection | core/patterns.py | PatternDetector -- temporal (hour/weekday), content (keyword frequency, co-occurrence), sequence patterns. |
| Memory compression | core/patterns.py | MemoryCompressor -- dictionary-based, RLE, and semantic compression strategies. |
| Plugin system | core/plugins.py | MnemosynePlugin base class with 4 lifecycle hooks. Discovers plugins from ~/.hermes/mnemosyne/plugins/. |
| Diagnostics | diagnose.py | PII-safe health check -- dependencies, database state, vector readiness. No memory content or API keys. |
| Memory banks | BankManager | Named, isolated memory banks with per-bank SQLite files. No credit limits, no per-bank pricing. |

Zep-specific (not in Mnemosyne)

| Feature | Description |
| --- | --- |
| Best-in-class temporal reasoning | 94.8% on DMR benchmark. valid_at/invalid_at timestamps on every node and edge. Full fact versioning. |
| Per-user knowledge graphs | Automatic entity resolution, relationship extraction, and graph building per user/session. |
| Context engineering | Zep positions itself as a "context engineering" platform -- managing what context AI agents need, when they need it. |
| Managed infrastructure | Zero ops burden. No database to manage, no models to download, no updates to apply. |
| Multi-language SDKs | Official Python, TypeScript, and Go SDKs with idiomatic APIs. |
| Automatic entity resolution | Entities are automatically normalized and deduplicated across conversations. |

Performance Characteristics

| Metric | Mnemosyne | Zep |
| --- | --- | --- |
| Recall latency (10K corpus) | ~2-10ms -- in-process SQLite + sqlite-vec, no HTTP overhead | Sub-200ms -- HTTPS round-trip to Zep cloud + Graphiti traversal |
| IPC model | Direct Python function call | HTTPS POST to api.getzep.com -> JSON serialization -> response parsing |
| Storage footprint | ~50-100MB SQLite file per 10K memories | Zero local storage (all remote) |
| Model download | One-time ~67MB (fastembed ONNX) | None (models managed by Zep) |
| Runtime memory | ~10-20MB per session | Minimal (thin API client) |
| Network dependency | None | Full -- every operation requires internet access |
| Temporal query perf | ~1-5ms (SQLite index scan) | Sub-200ms (Graphiti temporal traversal) |

Important caveat on latency numbers: Mnemosyne's raw latency advantage comes from being an in-process library with no network calls. But Zep's sub-200ms retrieval includes sophisticated temporal graph traversal that Mnemosyne does not attempt. Latency is not quality -- these systems are doing different work per query.


When to Choose What

Choose Mnemosyne if:

  • You need pip install with zero cloud dependencies or API keys
  • You need the fastest possible recall latency for interactive agent loops
  • You require full data privacy -- memory content must never leave your machine
  • You're running on resource-constrained or air-gapped environments
  • You're building a single-user, single-machine agent (Hermes, Claude Desktop, etc.)
  • You want an MCP-compatible memory layer (stdio + SSE) with no usage caps
  • You want memory banks with per-project isolation without per-bank pricing
  • You want full control over the memory model and data format
  • You need offline capability -- your agent works without internet

Choose Zep if:

  • You need best-in-class temporal reasoning (94.8% DMR, fact versioning, point-in-time queries)
  • You need automatic entity resolution and relationship extraction at scale
  • You need per-user knowledge graphs for multi-user applications
  • You want zero ops burden -- no databases, no model downloads, no updates
  • You're building a production SaaS that needs managed memory infrastructure
  • You need multi-language SDK support (Python, TypeScript, Go) with first-class treatment
  • You're okay with credit-based pricing and can budget for recurring costs
  • You don't need self-hosted or offline capability

Neither is "better." They optimize for fundamentally different constraints: ownership and simplicity versus managed sophistication and temporal reasoning depth.


Known Gaps in Mnemosyne (honest list)

| Gap | Severity | Workaround |
| --- | --- | --- |
| No automatic entity normalization | Medium | extract_entities=True captures entities; fuzzy matching helps but doesn't resolve coreference |
| No per-user knowledge graphs | Medium for multi-user apps | Per-bank SQLite isolation provides domain separation, not per-user isolation |
| Temporal reasoning is basic (TripleStore) | Medium for temporal-heavy use cases | temporal_weight + temporal_halflife on recall() covers recency, not versioned fact evolution |
| No managed cloud option | Low for self-hosted users | Export/import JSON for migration; back up SQLite files directly |
| No TypeScript or Go SDK | Medium for non-Python environments | MCP is language-agnostic; use an MCP client in any language |
| No automatic fact versioning/deprecation | Medium | Manual invalidate(memory_id, replacement_id=new_id) |
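
The manual versioning workaround in the last row boils down to closing out the old record and pointing it at its replacement, rather than deleting it. The sketch below illustrates that pattern with plain sqlite3. The `memories` table, its columns, and the `toy_remember` / `toy_invalidate` helpers are hypothetical stand-ins written for this example; they are not Mnemosyne's API or schema.

```python
import sqlite3

# Hypothetical table for illustration; the real storage layout may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE memories (
        id            INTEGER PRIMARY KEY,
        content       TEXT NOT NULL,
        valid_until   TEXT,                             -- NULL = still current
        superseded_by INTEGER REFERENCES memories(id)   -- link to the newer fact
    )
    """
)


def toy_remember(content):
    """Insert a fact and return its row id."""
    cursor = conn.execute("INSERT INTO memories (content) VALUES (?)", (content,))
    return cursor.lastrowid


def toy_invalidate(memory_id, replacement_id, at):
    """Mark a fact as no longer current and record what replaced it."""
    conn.execute(
        "UPDATE memories SET valid_until = ?, superseded_by = ? WHERE id = ?",
        (at, replacement_id, memory_id),
    )


def current_facts():
    """Facts that have not been invalidated, oldest first."""
    rows = conn.execute(
        "SELECT content FROM memories WHERE valid_until IS NULL ORDER BY id"
    )
    return [row[0] for row in rows]


old_id = toy_remember("Alice works at Acme")
new_id = toy_remember("Alice works at Globex")
toy_invalidate(old_id, replacement_id=new_id, at="2025-03-01")
print(current_facts())
```

The old row stays in the table, so history is preserved; what is missing compared with Zep is the automatic detection that the second fact contradicts the first.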

This page compares Mnemosyne v2.8.0 against Zep's current cloud offering as of May 2026. Zep's Community Edition was deprecated and is no longer available for self-hosting. Every Mnemosyne feature listed has been verified against the source code. If anything here is wrong, please open an issue -- we'll fix it.