Mnemosyne vs SuperMemory
An honest, technical comparison for developers choosing between a local-first memory engine and a zero-code SaaS context stack.
Last updated: 2026-05-13 · Mnemosyne v2.8.0 · SuperMemory (as of May 2026)
TL;DR: SuperMemory is a managed SaaS context layer with a five-layer architecture, zero-code Memory Router integration, and the strongest consumer offering in the space. Mnemosyne is a local-first memory engine that runs entirely on your machine with no external dependencies. SuperMemory wins on ease of adoption and consumer polish; Mnemosyne wins on privacy, full local control, and cost at scale.
Architecture
SuperMemory uses a proprietary five-layer context stack engineered for managed SaaS delivery. Mnemosyne uses a single SQLite file with BEAM (Bilevel Episodic-Associative Memory) — three tightly integrated tiers plus a temporal TripleStore.
| Dimension | Mnemosyne | SuperMemory |
|---|---|---|
| Process model | In-process Python library | Managed SaaS with self-hosted Enterprise option |
| Database | SQLite (single file, WAL mode) | Proprietary vector graph store (ontology-aware edges) |
| Embedding model | fastembed ONNX — BAAI/bge-small-en-v1.5 (~67MB), runs locally | Managed inference, model details not publicly disclosed |
| Extraction model | Opt-in: any OpenAI-compatible or local GGUF | Multi-modal extractors built into the stack (layer 4) |
| Vector search | sqlite-vec (cosine distance) | Hybrid Retrieval Engine (layer 3) with ontology-aware graph traversal |
| Knowledge graph | TripleStore (subject-predicate-object, temporal) | Vector Graph Engine with ontology-aware edges (layer 2) |
| Runtime memory | ~10-20MB per session | Cloud-hosted (no local footprint) or Enterprise container |
| Cold start | Instant (if models cached) | Instant (SaaS — no local boot required) |
| Language | Python | TypeScript-native |
Key architectural difference: SuperMemory's five-layer stack (User Understanding Model → Vector Graph Engine → Hybrid Retrieval Engine → Multi-modal Extractors → Managed Connectors) is a vertically integrated SaaS product. Mnemosyne's BEAM architecture is a horizontally layered library you compose yourself. SuperMemory optimizes for zero-config; Mnemosyne optimizes for zero-dependency.
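The zero-dependency claim is concrete: Mnemosyne's storage tier is a single SQLite file opened in-process. A minimal sketch using only Python's standard library illustrates the model — the table schema here is illustrative, not Mnemosyne's actual schema:

```python
import sqlite3
import tempfile
import os

# One SQLite file in WAL mode, opened in-process -- no server, no daemon.
# Schema and column names are illustrative, not Mnemosyne's real layout.
path = os.path.join(tempfile.mkdtemp(), "memory.db")
conn = sqlite3.connect(path)
conn.execute("PRAGMA journal_mode=WAL")  # write-ahead logging, as the table notes
conn.execute("""CREATE TABLE memories (
    id INTEGER PRIMARY KEY,
    content TEXT NOT NULL,
    importance REAL DEFAULT 0.5,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
)""")
conn.execute("INSERT INTO memories (content, importance) VALUES (?, ?)",
             ("User prefers dark mode", 0.8))
conn.commit()

mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
row = conn.execute("SELECT content FROM memories").fetchone()
```

Everything — durability, backup, portability — reduces to managing one file on disk.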
The Five-Layer Stack
SuperMemory's architecture is its defining feature. Understanding it clarifies what you get (and don't get) with each tier:
| Layer | Name | What it does | Mnemosyne equivalent |
|---|---|---|---|
| 1 | User Understanding Model | Builds persistent user profiles, preferences, and behavioral patterns | Not present. Mnemosyne stores what you tell it; it does not model you. |
| 2 | Vector Graph Engine | Ontology-aware knowledge graph with typed edges and relationship inference | TripleStore — simpler (S-P-O triples), temporal, but no ontology or edge typing. |
| 3 | Hybrid Retrieval Engine | Multi-strategy semantic + graph + keyword retrieval with learned fusion | Hybrid scoring (vector + FTS5 + importance) × recency decay. Single-pass, not learned. |
| 4 | Multi-modal Extractors | Extracts structured data from text, images, and connected sources | Opt-in LLM fact extraction + regex entity extraction. Text-only, no multi-modal. |
| 5 | Managed Connectors | Native integrations with Notion, Slack, Google Drive, and more | None. Mnemosyne is a library — you build your own connectors. |
Verdict: If you want a system that ingests your Notion docs, Slack messages, and Google Drive files and builds a unified memory graph from all of them — SuperMemory's managed connectors make that possible with zero setup. Mnemosyne expects you to call remember() yourself.
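To make the layer-2 comparison concrete, here is what an S-P-O triple store reduces to in its simplest form. This is a toy sketch of the pattern, not Mnemosyne's actual TripleStore implementation, and it omits the temporal validity windows the real tier adds:

```python
class TripleStore:
    """Minimal subject-predicate-object store. A sketch of the pattern,
    not Mnemosyne's real implementation (no temporal windows, no persistence)."""

    def __init__(self):
        self.triples = []

    def add(self, subject, predicate, obj):
        self.triples.append((subject, predicate, obj))

    def query(self, subject=None, predicate=None, obj=None):
        # None acts as a wildcard in that position.
        return [t for t in self.triples
                if (subject is None or t[0] == subject)
                and (predicate is None or t[1] == predicate)
                and (obj is None or t[2] == obj)]

store = TripleStore()
store.add("alice", "works_at", "acme")
store.add("alice", "uses", "python")
store.add("bob", "works_at", "acme")

# Wildcard query: everyone who works at acme.
coworkers = store.query(predicate="works_at", obj="acme")
```

What SuperMemory's Vector Graph Engine adds on top of this pattern is edge typing, an ontology, and relationship inference — none of which fall out of plain S-P-O matching.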
Memory Router: Zero-Code Integration
This is SuperMemory's killer feature. The Memory Router is a drop-in proxy that intercepts your existing LLM API calls and augments them with context from the memory stack:
```typescript
// Before: direct API call
const response = await openai.chat.completions.create({ ... })

// After: route through SuperMemory (change baseURL + add headers)
const client = new OpenAI({
  baseURL: "https://api.supermemory.ai/v1",
  defaultHeaders: { "x-supermemory-key": "sk-..." }
})

// Same code, now with memory-augmented context
const response = await client.chat.completions.create({ ... })
```
There is no equivalent in Mnemosyne. Mnemosyne integrates via explicit tool calls (remember(), recall()) or MCP server tools — you add memory to your agent, you don't proxy your existing LLM calls through a memory layer.
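The explicit-integration pattern looks like this in outline. The `Memory` class below is a stand-in with a naive keyword match, not Mnemosyne's real API or retrieval — the point is only where the `remember()`/`recall()` calls sit in your agent loop:

```python
class Memory:
    """Stand-in for a memory engine; not Mnemosyne's actual API."""

    def __init__(self):
        self._items = []

    def remember(self, text):
        self._items.append(text)

    def recall(self, query, limit=3):
        # Naive keyword match standing in for hybrid retrieval.
        hits = [m for m in self._items
                if any(w in m.lower() for w in query.lower().split())]
        return hits[:limit]

def agent_turn(memory, user_msg):
    context = memory.recall(user_msg)   # explicit recall before the LLM call
    reply = f"(context: {len(context)} memories) ack: {user_msg}"
    memory.remember(user_msg)           # explicit remember after the turn
    return reply

mem = Memory()
agent_turn(mem, "I prefer tabs over spaces")
reply = agent_turn(mem, "what are my tabs preferences?")
```

You decide, per turn, what is recalled and what is stored — the control the proxy model trades away for zero code changes.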
| | Mnemosyne | SuperMemory |
|---|---|---|
| Integration model | Explicit tool calls / MCP server | Drop-in proxy (change baseURL + headers) |
| Code changes required | Add memory calls to your agent loop | Zero code changes to existing LLM calls |
| Hermes Agent | Native (in-process, 15 tools, 3 hooks) | Via HTTP proxy (no native plugin) |
| OpenClaw | Planned (adapter not yet built) | Via HTTP proxy |
| MCP | 6 tools, stdio + SSE | Custom HTTP API via Memory Router |
Who this matters to: If you have an existing app making OpenAI API calls and you want to add memory without touching your codebase, the Memory Router is uniquely compelling. If you're building an agent from scratch and want fine-grained control over what gets remembered and how, Mnemosyne's explicit API gives you more precision.
Retrieval Quality
SuperMemory currently holds the #1 position on MemoryBench for latency, quality, and cost combined. Mnemosyne has not been benchmarked on MemoryBench (different design goals, different evaluation criteria).
| Feature | Mnemosyne | SuperMemory |
|---|---|---|
| Vector search | sqlite-vec (cosine distance) | Hybrid Retrieval Engine (proprietary) |
| Keyword search | SQLite FTS5 | Included in hybrid engine |
| Graph search | TripleStore with temporal validity windows | Ontology-aware Vector Graph Engine with typed edges |
| Temporal search | temporal_weight + temporal_halflife on recall() | Not publicly documented in detail |
| Scoring | Hybrid: weighted (vector + FTS + importance) × recency decay | Learned fusion weights (MemoryBench-optimized) |
| Reranking | None (single-pass hybrid) | Included in Hybrid Retrieval Engine |
| Configurable | Per-query weights for vec, fts, importance | SaaS-managed (limited user-facing configuration) |
Honest assessment: SuperMemory's #1 MemoryBench ranking reflects real engineering quality in retrieval. The learned fusion and ontology-aware graph traversal likely outperform Mnemosyne's static-weighted hybrid scoring for complex, multi-hop retrieval tasks. However, MemoryBench evaluates SaaS products in idealized conditions — your mileage will vary based on your specific domain and data patterns.
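Mnemosyne's static-weighted scoring is simple enough to sketch in a few lines. The weight names and the half-life default below are illustrative assumptions, not Mnemosyne's documented parameters:

```python
# Sketch of weighted-hybrid scoring with recency decay:
# (w_vec * vector + w_fts * fts + w_imp * importance) * 0.5^(age / halflife).
# Weights and the one-week half-life are illustrative defaults, not
# Mnemosyne's documented values.
def score(vec_sim, fts_score, importance, age_hours,
          w_vec=0.5, w_fts=0.3, w_imp=0.2, halflife_hours=168.0):
    base = w_vec * vec_sim + w_fts * fts_score + w_imp * importance
    decay = 0.5 ** (age_hours / halflife_hours)  # exponential recency decay
    return base * decay

fresh = score(0.9, 0.4, 0.7, age_hours=0)      # no decay
week_old = score(0.9, 0.4, 0.7, age_hours=168) # one half-life: score halves
```

A learned fusion model replaces the fixed `w_*` constants with weights fit to benchmark data — adaptive, but no longer something you can read off in five lines.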
Privacy & Self-Hosting
This is where the two products diverge most dramatically.
| | Mnemosyne | SuperMemory |
|---|---|---|
| Data location | Your machine. SQLite file on your disk. | SuperMemory cloud (US-based). Enterprise self-host available. |
| LLM calls | None required. Optional: any OpenAI-compatible endpoint or local GGUF. | Managed by SuperMemory — you do not control the models. |
| Offline capable | Yes — fully functional without internet after model download | No — SaaS requires internet. Enterprise may support offline after setup. |
| Self-hosting | Default. pip install and run. | Enterprise tier only. Pricing not publicly disclosed. |
| Open source | MIT license. Full source on GitHub. | Proprietary. Core stack is closed-source SaaS. |
| Audit trail | SQLite file history, your backup strategy | Platform audit logs (Enterprise) |
| Vendor lock-in | None — standard SQLite, export to JSON | High — proprietary format, SaaS-dependent, no documented export path |
The uncomfortable truth: SuperMemory is a proprietary SaaS product. Your data lives on their infrastructure, processed by their models, stored in their format. The self-hosted option exists only at the Enterprise tier with undisclosed pricing. Mnemosyne is MIT-licensed open source that you run on your own hardware — the trade-off is that you manage everything yourself.
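The no-lock-in claim is testable: because the store is plain SQLite, a JSON export is a standard-library exercise. Table and column names here are illustrative, not Mnemosyne's actual schema:

```python
import sqlite3
import json

# Export a SQLite-backed memory store to JSON using only the stdlib.
# Schema is illustrative, not Mnemosyne's real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, content TEXT, importance REAL)")
conn.executemany("INSERT INTO memories (content, importance) VALUES (?, ?)",
                 [("prefers dark mode", 0.8), ("deploys on Fridays", 0.2)])

conn.row_factory = sqlite3.Row  # rows become dict-convertible
rows = conn.execute("SELECT * FROM memories").fetchall()
dump = json.dumps([dict(r) for r in rows], indent=2)
```

Any SQLite client in any language can do the same against the on-disk file — that is what "no documented export path" on the SaaS side costs you by contrast.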
Community & Ecosystem
| | Mnemosyne | SuperMemory |
|---|---|---|
| GitHub stars | Newer project, growing | ~22.4K — strong consumer developer community |
| License | MIT | Proprietary (source-available for client SDKs) |
| Documentation | Full docs (this site) + API reference | docs.supermemory.ai, interactive playground |
| Integrations | Hermes Agent (native, 15 tools, 3 hooks), MCP (6 tools, stdio + SSE), OpenClaw (planned) | Memory Router (OpenAI-compatible proxy), Notion, Slack, Google Drive connectors |
| SDK languages | Python | TypeScript (primary), REST API |
| Target audience | Developers building agents, local-first enthusiasts | Consumer app developers, TypeScript ecosystem, no-code/low-code users |
| Academic depth | BEAM: published architecture with formal memory model | Product-focused engineering; few academic publications |
Pricing
Mnemosyne
Free. MIT license. No tiers, no usage caps, no API costs. Use it forever with zero recurring cost. Your only expense is compute (and optional LLM API calls if you enable extraction).
SuperMemory
| Tier | Price | What you get |
|---|---|---|
| Free | $0/mo | 1M tokens/month, basic memory features |
| Pro | $19/mo | Expanded token limits, advanced retrieval |
| Scale | $399/mo | High-volume usage, priority support |
| Enterprise | Custom | Self-hosting, SSO, audit logs, dedicated support |
The pricing cliff: SuperMemory jumps from $19/mo (Pro) to $399/mo (Scale) with no intermediate tier. If your usage outgrows Pro, you face a 21× price increase before reaching Enterprise. Mnemosyne has no pricing curve — you pay for compute, period.
Hidden costs: SuperMemory's managed inference means you don't see the per-operation LLM costs directly — they're bundled into your tier. This simplifies billing but makes it hard to predict costs as your usage scales. With Mnemosyne, you control the models and pay only for what you use.
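The cliff is easy to quantify from the tier table above; the arithmetic below uses only the published prices:

```python
# Back-of-the-envelope check of the pricing-cliff claim,
# using the tier prices from the table above.
tiers = {"Free": 0, "Pro": 19, "Scale": 399}

cliff = tiers["Scale"] / tiers["Pro"]   # jump when you outgrow Pro
annual_scale = tiers["Scale"] * 12      # yearly cost at the Scale tier
```

Outgrowing Pro means a 21x monthly jump and nearly $4.8K/year before you've even reached the Enterprise conversation.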
When to Choose Mnemosyne
- You want full data sovereignty — everything stays on your machine
- You need offline capability — agents that work without internet
- You're building for Hermes Agent and want deep, native integration (15 tools, hooks)
- You want predictable costs — no per-token tiers, no SaaS subscription
- You need open source — MIT license, full code access, no vendor lock-in
- You want explicit control over what gets remembered, when, and how
- You need temporal queries — "what did I know about X as of last Tuesday?"
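The last bullet — as-of temporal queries — can be sketched with validity windows on triples. The schema below is illustrative; Mnemosyne's actual TripleStore layout may differ:

```python
import sqlite3

# Sketch of an as-of query over S-P-O triples with validity windows.
# Schema is illustrative, not Mnemosyne's actual TripleStore layout.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE triples (
    subject TEXT, predicate TEXT, object TEXT,
    valid_from TEXT, valid_to TEXT   -- NULL valid_to means still current
)""")
conn.executemany("INSERT INTO triples VALUES (?, ?, ?, ?, ?)", [
    ("alice", "works_at", "acme",    "2024-01-01", "2025-03-01"),
    ("alice", "works_at", "initech", "2025-03-01", None),
])

def as_of(conn, subject, predicate, when):
    # "What did I know about X as of <when>?" -- only triples whose
    # validity window contains the given timestamp.
    return conn.execute(
        """SELECT object FROM triples
           WHERE subject = ? AND predicate = ?
             AND valid_from <= ?
             AND (valid_to IS NULL OR valid_to > ?)""",
        (subject, predicate, when, when)).fetchall()

past = as_of(conn, "alice", "works_at", "2024-06-15")  # mid-window: acme
now = as_of(conn, "alice", "works_at", "2026-01-01")   # open window: initech
```

The same fact can hold different values at different times without either overwriting the other — the property a plain key-value memory cannot give you.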
When to Choose SuperMemory
- You want zero-code integration — change a baseURL and get memory-augmented LLM calls
- You're building a consumer application and need a polished, managed memory layer
- You work in the TypeScript ecosystem and want a TypeScript-native solution
- You need managed connectors — ingest Notion, Slack, Google Drive without building integrations
- You want MemoryBench-validated retrieval quality with no tuning required
- You don't want to manage infrastructure — embeddings, vector stores, graph databases
- You need a User Understanding Model that builds profiles automatically
Open Source vs. SaaS: Neither Is Better
SuperMemory and Mnemosyne represent fundamentally different philosophies. SuperMemory is a product: pay for it, point your app at it, and it works. Mnemosyne is a tool: install it, configure it, and build with it. One is not better than the other — they serve different developers with different constraints.
Known Gaps in Mnemosyne (honest list)
| Gap | Severity | Workaround |
|---|---|---|
| No zero-code proxy integration | Medium for existing apps | Explicit remember()/recall() calls or MCP server — requires code changes |
| No multi-modal extraction | Medium | Mnemosyne is text-only; use external tools for image/audio processing before calling remember() |
| No managed connectors (Notion, Slack, etc.) | Medium for knowledge workers | Build your own ingestion pipeline; Mnemosyne is the storage/retrieval layer |
| No User Understanding Model | Low | Mnemosyne stores what you tell it; building a user profile is your application's responsibility |
| No MemoryBench benchmarking | Low for local-first use | Different evaluation context; retrieval quality depends on your embedding model choice |
| No learned fusion / ontology-aware graph | Medium | Static hybrid weights are configurable but not adaptive; TripleStore handles temporal facts but not relationship inference |
| Smaller TypeScript ecosystem | Medium for TS developers | Python-first; MCP provides language-agnostic access via stdio/SSE |
| No SaaS option | Low | By design — Mnemosyne is local-first; deploy on a server with MCP SSE if you need remote access |
Every claim about SuperMemory has been verified against their public documentation and pricing page (as of May 2026). Every claim about Mnemosyne has been verified against the v2.8.0 source code. MemoryBench rankings and five-layer architecture details are from SuperMemory's published benchmarks and architecture docs. If anything here is wrong, please open an issue — we'll fix it.