Mnemosyne vs SuperMemory

An honest, technical comparison for developers choosing between a local-first memory engine and a zero-code SaaS context stack.

Last updated: 2026-05-13 · Mnemosyne v2.8.0 · SuperMemory (as of May 2026)

TL;DR: SuperMemory is a managed SaaS context layer with a five-layer architecture, zero-code Memory Router integration, and the strongest consumer offering in the space. Mnemosyne is a local-first memory engine that runs entirely on your machine with no external dependencies. SuperMemory wins on ease of adoption and consumer polish; Mnemosyne wins on privacy, full local control, and cost at scale.


Architecture

SuperMemory uses a proprietary five-layer context stack engineered for managed SaaS delivery. Mnemosyne uses a single SQLite file with BEAM (Bilevel Episodic-Associative Memory) — three tightly integrated tiers plus a temporal TripleStore.

| Dimension | Mnemosyne | SuperMemory |
|---|---|---|
| Process model | In-process Python library | Managed SaaS with self-hosted Enterprise option |
| Database | SQLite (single file, WAL mode) | Proprietary vector graph store (ontology-aware edges) |
| Embedding model | fastembed ONNX — BAAI/bge-small-en-v1.5 (~67MB), runs locally | Managed inference, model details not publicly disclosed |
| Extraction model | Opt-in: any OpenAI-compatible or local GGUF | Multi-modal extractors built into the stack (layer 4) |
| Vector search | sqlite-vec (cosine distance) | Hybrid Retrieval Engine (layer 3) with ontology-aware graph traversal |
| Knowledge graph | TripleStore (subject-predicate-object, temporal) | Vector Graph Engine with ontology-aware edges (layer 2) |
| Runtime memory | ~10-20MB per session | Cloud-hosted (no local footprint) or Enterprise container |
| Cold start | Instant (if models cached) | Instant (SaaS — no local boot required) |
| Language | Python | TypeScript-native |

Key architectural difference: SuperMemory's five-layer stack (User Understanding Model → Vector Graph Engine → Hybrid Retrieval Engine → Multi-modal Extractors → Managed Connectors) is a vertically integrated SaaS product. Mnemosyne's BEAM architecture is a horizontally layered library you compose yourself. SuperMemory optimizes for zero-config; Mnemosyne optimizes for zero-dependency.
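To make the temporal TripleStore tier concrete, here is a toy sketch of the idea, an illustration only, not Mnemosyne's actual classes or API: facts are subject-predicate-object triples carrying validity windows, so asserting a new value closes the old one instead of overwriting it.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class Triple:
    subject: str
    predicate: str
    obj: str
    valid_from: datetime
    valid_until: Optional[datetime] = None  # None = still current

class TripleStore:
    """Toy subject-predicate-object store with temporal validity windows."""

    def __init__(self):
        self.triples: list[Triple] = []

    def assert_fact(self, s: str, p: str, o: str, when: Optional[datetime] = None):
        when = when or datetime.now(timezone.utc)
        # Supersede any still-open value for the same (subject, predicate)
        for t in self.triples:
            if t.subject == s and t.predicate == p and t.valid_until is None:
                t.valid_until = when
        self.triples.append(Triple(s, p, o, when))

    def query(self, s: str, p: str, as_of: Optional[datetime] = None) -> list[str]:
        as_of = as_of or datetime.now(timezone.utc)
        return [
            t.obj for t in self.triples
            if t.subject == s and t.predicate == p
            and t.valid_from <= as_of
            and (t.valid_until is None or as_of < t.valid_until)
        ]
```

Because old values are closed rather than deleted, an `as_of` query can answer "what did I believe at time T" without any extra bookkeeping.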


The Five-Layer Stack

SuperMemory's architecture is its defining feature. Understanding it clarifies what you get (and don't get) with each tier:

| Layer | Name | What it does | Mnemosyne equivalent |
|---|---|---|---|
| 1 | User Understanding Model | Builds persistent user profiles, preferences, and behavioral patterns | Not present. Mnemosyne stores what you tell it; it does not model you. |
| 2 | Vector Graph Engine | Ontology-aware knowledge graph with typed edges and relationship inference | TripleStore — simpler (S-P-O triples), temporal, but no ontology or edge typing. |
| 3 | Hybrid Retrieval Engine | Multi-strategy semantic + graph + keyword retrieval with learned fusion | Hybrid scoring (vector + FTS5 + importance) × recency decay. Single-pass, not learned. |
| 4 | Multi-modal Extractors | Extracts structured data from text, images, and connected sources | Opt-in LLM fact extraction + regex entity extraction. Text-only, no multi-modal. |
| 5 | Managed Connectors | Native integrations with Notion, Slack, Google Drive, and more | None. Mnemosyne is a library — you build your own connectors. |

Verdict: If you want a system that ingests your Notion docs, Slack messages, and Google Drive files and builds a unified memory graph from all of them — SuperMemory's managed connectors make that possible with zero setup. Mnemosyne expects you to call remember() yourself.
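What "call remember() yourself" looks like in practice: a DIY connector is just a loop that fetches items from a source and stores each one. The sketch below is illustrative; `MemoryStore` is a stand-in stub, not Mnemosyne's actual API, and the export format is hypothetical.

```python
# Minimal DIY "connector": you fetch the documents yourself, and memory
# is an explicit remember() call per item. MemoryStore is a stub here.
class MemoryStore:
    def __init__(self):
        self.memories = []

    def remember(self, text: str, source: str, importance: float = 0.5):
        self.memories.append({"text": text, "source": source, "importance": importance})

def ingest_notion_export(store: MemoryStore, pages: list[dict]) -> int:
    """Push a batch of exported pages into memory; returns count ingested."""
    count = 0
    for page in pages:
        body = page.get("body", "").strip()
        if not body:
            continue  # skip empty pages rather than storing noise
        store.remember(body, source=f"notion:{page['id']}", importance=0.6)
        count += 1
    return count
```

The upside of writing this yourself is that the filtering, importance assignment, and source tagging are all under your control; the downside is exactly what the verdict above says: SuperMemory ships this for you.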


Memory Router: Zero-Code Integration

This is SuperMemory's killer feature. The Memory Router is a drop-in proxy that intercepts your existing LLM API calls and augments them with context from the memory stack:

```ts
// Before: direct API call
const response = await openai.chat.completions.create({ ... })

// After: route through SuperMemory (change baseURL + add headers)
const client = new OpenAI({
  baseURL: "https://api.supermemory.ai/v1",
  defaultHeaders: { "x-supermemory-key": "sk-..." }
})
// Same code, now with memory-augmented context
const response = await client.chat.completions.create({ ... })
```

There is no equivalent in Mnemosyne. Mnemosyne integrates via explicit tool calls (remember(), recall()) or MCP server tools — you add memory to your agent, you don't proxy your existing LLM calls through a memory layer.
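For contrast, the explicit-integration pattern looks roughly like this: recall() before the model call to build context, remember() after it to persist the exchange. This is a hedged sketch; `InMemoryStore` and `call_llm` are stand-in stubs, not Mnemosyne's or OpenAI's actual APIs.

```python
class InMemoryStore:
    """Stub store: crude keyword overlap stands in for hybrid retrieval."""

    def __init__(self):
        self.items: list[str] = []

    def remember(self, text: str) -> None:
        self.items.append(text)

    def recall(self, query: str, limit: int = 5) -> list[str]:
        words = set(query.lower().split())
        hits = [t for t in self.items if words & set(t.lower().split())]
        return hits[:limit]

def agent_turn(store: InMemoryStore, call_llm, user_msg: str) -> str:
    context = store.recall(user_msg, limit=5)        # explicit retrieval
    prompt = "\n".join(context) + "\n\nUser: " + user_msg
    reply = call_llm(prompt)
    store.remember(f"User said: {user_msg}")         # explicit persistence
    store.remember(f"Assistant replied: {reply}")
    return reply
```

The key difference from the proxy model: every read and write is a line of code you wrote, so you decide exactly what enters memory and what retrieval feeds the prompt.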

| | Mnemosyne | SuperMemory |
|---|---|---|
| Integration model | Explicit tool calls / MCP server | Drop-in proxy (change baseURL + headers) |
| Code changes required | Add memory calls to your agent loop | Zero code changes to existing LLM calls |
| Hermes Agent | Native (in-process, 15 tools, 3 hooks) | Via HTTP proxy (no native plugin) |
| OpenClaw | Planned (adapter not yet built) | Via HTTP proxy |
| MCP | 6 tools, stdio + SSE | Custom HTTP API via Memory Router |

Who this matters to: If you have an existing app making OpenAI API calls and you want to add memory without touching your codebase, the Memory Router is uniquely compelling. If you're building an agent from scratch and want fine-grained control over what gets remembered and how, Mnemosyne's explicit API gives you more precision.


Retrieval Quality

SuperMemory currently holds the #1 position on MemoryBench for latency, quality, and cost combined. Mnemosyne has not been benchmarked on MemoryBench (different design goals, different evaluation criteria).

| Feature | Mnemosyne | SuperMemory |
|---|---|---|
| Vector search | sqlite-vec (cosine distance) | Hybrid Retrieval Engine (proprietary) |
| Keyword search | SQLite FTS5 | Included in hybrid engine |
| Graph search | TripleStore with temporal validity windows | Ontology-aware Vector Graph Engine with typed edges |
| Temporal search | temporal_weight + temporal_halflife on recall() | Not publicly documented in detail |
| Scoring | Hybrid: (vector + FTS5 + importance), scaled by recency decay | Learned fusion weights (MemoryBench-optimized) |
| Reranking | None (single-pass hybrid) | Included in Hybrid Retrieval Engine |
| Configurable | Per-query weights for vec, fts, importance | SaaS-managed (limited user-facing configuration) |

Honest assessment: SuperMemory's #1 MemoryBench ranking reflects real engineering quality in retrieval. The learned fusion and ontology-aware graph traversal likely outperform Mnemosyne's static-weighted hybrid scoring for complex, multi-hop retrieval tasks. However, MemoryBench evaluates SaaS products in idealized conditions — your mileage will vary based on your specific domain and data patterns.
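One plausible reading of static-weighted hybrid scoring, sketched below: a weighted sum of the vector, keyword (FTS), and importance signals, scaled by an exponential recency half-life. The weights and the exact decay form are illustrative assumptions, not Mnemosyne's actual formula.

```python
# Illustrative static hybrid scoring: weighted sum of three signals,
# then exponential recency decay. All weights here are assumptions.
def hybrid_score(vec_sim: float, fts_score: float, importance: float,
                 age_days: float, halflife_days: float = 30.0,
                 w_vec: float = 0.6, w_fts: float = 0.25, w_imp: float = 0.15) -> float:
    base = w_vec * vec_sim + w_fts * fts_score + w_imp * importance
    decay = 0.5 ** (age_days / halflife_days)  # score halves every halflife_days
    return base * decay
```

With these weights, a 60-day-old memory at a 30-day half-life keeps only a quarter of its base score, which is the "recency decay" trade-off: old-but-relevant memories need high vector similarity or importance to compete. Learned fusion, by contrast, tunes these weights from feedback rather than fixing them per query.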


Privacy & Self-Hosting

This is where the two products diverge most dramatically.

| | Mnemosyne | SuperMemory |
|---|---|---|
| Data location | Your machine. SQLite file on your disk. | SuperMemory cloud (US-based). Enterprise self-host available. |
| LLM calls | None required. Optional: any OpenAI-compatible endpoint or local GGUF. | Managed by SuperMemory — you do not control the models. |
| Offline capable | Yes — fully functional without internet after model download | No — SaaS requires internet. Enterprise may support offline after setup. |
| Self-hosting | Default. pip install and run. | Enterprise tier only. Pricing not publicly disclosed. |
| Open source | MIT license. Full source on GitHub. | Proprietary. Core stack is closed-source SaaS. |
| Audit trail | SQLite file history, your backup strategy | Platform audit logs (Enterprise) |
| Vendor lock-in | None — standard SQLite, export to JSON | High — proprietary format, SaaS-dependent, no documented export path |

The uncomfortable truth: SuperMemory is a proprietary SaaS product. Your data lives on their infrastructure, processed by their models, stored in their format. The self-hosted option exists only at the Enterprise tier with undisclosed pricing. Mnemosyne is MIT-licensed open source that you run on your own hardware — the trade-off is that you manage everything yourself.


Community & Ecosystem

| | Mnemosyne | SuperMemory |
|---|---|---|
| GitHub stars | Newer project, growing | ~22.4K — strong consumer developer community |
| License | MIT | Proprietary (source-available for client SDKs) |
| Documentation | Full docs (this site) + API reference | docs.supermemory.ai, interactive playground |
| Integrations | Hermes Agent (native, 15 tools, 3 hooks), MCP (6 tools, stdio + SSE), OpenClaw (planned) | Memory Router (OpenAI-compatible proxy), Notion, Slack, Google Drive connectors |
| SDK languages | Python | TypeScript (primary), REST API |
| Target audience | Developers building agents, local-first enthusiasts | Consumer app developers, TypeScript ecosystem, no-code/low-code users |
| Academic depth | BEAM: published architecture with formal memory model | Fewer academic publications; product-focused engineering |

Pricing

Mnemosyne

Free. MIT license. No tiers, no usage caps, no API costs. Use it forever with zero recurring cost. Your only expense is compute (and optional LLM API calls if you enable extraction).

SuperMemory

| Tier | Price | What you get |
|---|---|---|
| Free | $0/mo | 1M tokens/month, basic memory features |
| Pro | $19/mo | Expanded token limits, advanced retrieval |
| Scale | $399/mo | High-volume usage, priority support |
| Enterprise | Custom | Self-hosting, SSO, audit logs, dedicated support |

The pricing cliff: SuperMemory jumps from $19/mo (Pro) to $399/mo (Scale) with no intermediate tier. If your usage outgrows Pro, you face a 21× price increase before reaching Enterprise. Mnemosyne has no pricing curve — you pay for compute, period.

Hidden costs: SuperMemory's managed inference means you don't see the per-operation LLM costs directly — they're bundled into your tier. This simplifies billing but makes it hard to predict costs as your usage scales. With Mnemosyne, you control the models and pay only for what you use.
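The cliff in numbers, using only the tier prices quoted above (per-tier token limits beyond the Free tier's 1M/month are not public, so this shows the price jump, not a cost-per-token comparison):

```python
# Published SuperMemory tier prices in USD/month (as quoted above).
TIERS = {"Free": 0, "Pro": 19, "Scale": 399}

def step_up_ratio(from_tier: str, to_tier: str) -> float:
    """Price multiple when moving up a tier (from_tier must be paid)."""
    return TIERS[to_tier] / TIERS[from_tier]

cliff = step_up_ratio("Pro", "Scale")  # 399 / 19 = 21.0
```

That 21× multiple is the whole story of the cliff: there is no tier between $19 and $399, so any growth past Pro's limits lands directly on Scale pricing.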


When to Choose Mnemosyne

  • You want full data sovereignty — everything stays on your machine
  • You need offline capability — agents that work without internet
  • You're building for Hermes Agent and want deep, native integration (15 tools, hooks)
  • You want predictable costs — no per-token tiers, no SaaS subscription
  • You need open source — MIT license, full code access, no vendor lock-in
  • You want explicit control over what gets remembered, when, and how
  • You need temporal queries — "what did I know about X as of last Tuesday?"

When to Choose SuperMemory

  • You want zero-code integration — change a baseURL and get memory-augmented LLM calls
  • You're building a consumer application and need a polished, managed memory layer
  • You work in the TypeScript ecosystem and want a TypeScript-native solution
  • You need managed connectors — ingest Notion, Slack, Google Drive without building integrations
  • You want MemoryBench-validated retrieval quality with no tuning required
  • You don't want to manage infrastructure — embeddings, vector stores, graph databases
  • You need a User Understanding Model that builds profiles automatically

Neither Wins: Open Source vs. SaaS

SuperMemory and Mnemosyne represent fundamentally different philosophies. SuperMemory is a product: pay for it, point your app at it, and it works. Mnemosyne is a tool: install it, configure it, and build with it. One is not better than the other — they serve different developers with different constraints.


Known Gaps in Mnemosyne (honest list)

| Gap | Severity | Workaround |
|---|---|---|
| No zero-code proxy integration | Medium for existing apps | Explicit remember()/recall() calls or MCP server — requires code changes |
| No multi-modal extraction | Medium | Mnemosyne is text-only; use external tools for image/audio processing before calling remember() |
| No managed connectors (Notion, Slack, etc.) | Medium for knowledge workers | Build your own ingestion pipeline; Mnemosyne is the storage/retrieval layer |
| No User Understanding Model | Low | Mnemosyne stores what you tell it; building a user profile is your application's responsibility |
| No MemoryBench benchmarking | Low for local-first use | Different evaluation context; retrieval quality depends on your embedding model choice |
| No learned fusion / ontology-aware graph | Medium | Static hybrid weights are configurable but not adaptive; TripleStore handles temporal facts but not relationship inference |
| Smaller TypeScript ecosystem | Medium for TS developers | Python-first; MCP provides language-agnostic access via stdio/SSE |
| No SaaS option | Low | By design — Mnemosyne is local-first; deploy on a server with MCP SSE if you need remote access |

Every claim about SuperMemory has been verified against their public documentation and pricing page (as of May 2026). Every claim about Mnemosyne has been verified against the v2.8.0 source code. MemoryBench rankings and five-layer architecture details are from SuperMemory's published benchmarks and architecture docs. If anything here is wrong, please open an issue — we'll fix it.