Mnemosyne vs SuperMemory

An honest, technical comparison for developers choosing between a local-first memory engine and a zero-code SaaS context stack.

Last updated: 2026-05-13 · Mnemosyne v2.8.0 · SuperMemory (as of May 2026)

TL;DR: SuperMemory is a managed SaaS context layer with a five-layer architecture, zero-code Memory Router integration, and the strongest consumer offering in the space. Mnemosyne is a local-first memory engine that runs entirely on your machine with no external dependencies. SuperMemory wins on ease of adoption and consumer polish; Mnemosyne wins on privacy, full local control, and cost at scale.


Architecture

SuperMemory uses a proprietary five-layer context stack engineered for managed SaaS delivery. Mnemosyne uses a single SQLite file with BEAM (Bilevel Episodic-Associative Memory) — three tightly integrated tiers plus a temporal TripleStore.

| Dimension | Mnemosyne | SuperMemory |
|---|---|---|
| Process model | In-process Python library | Managed SaaS with self-hosted Enterprise option |
| Database | SQLite (single file, WAL mode) | Proprietary vector graph store (ontology-aware edges) |
| Embedding model | fastembed ONNX — BAAI/bge-small-en-v1.5 (~67MB), runs locally | Managed inference, model details not publicly disclosed |
| Extraction model | Opt-in: any OpenAI-compatible or local GGUF | Multi-modal extractors built into the stack (layer 4) |
| Vector search | sqlite-vec (cosine distance) | Hybrid Retrieval Engine (layer 3) with ontology-aware graph traversal |
| Knowledge graph | TripleStore (subject-predicate-object, temporal) | Vector Graph Engine with ontology-aware edges (layer 2) |
| Runtime memory | ~10-20MB per session | Cloud-hosted (no local footprint) or Enterprise container |
| Cold start | Instant (if models cached) | Instant (SaaS — no local boot required) |
| Language | Python | TypeScript-native |

Key architectural difference: SuperMemory's five-layer stack (User Understanding Model → Vector Graph Engine → Hybrid Retrieval Engine → Multi-modal Extractors → Managed Connectors) is a vertically integrated SaaS product. Mnemosyne's BEAM architecture is a horizontally layered library you compose yourself. SuperMemory optimizes for zero-config; Mnemosyne optimizes for zero-dependency.
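To make the temporal TripleStore tier concrete, here is a toy sketch of the idea, an illustration only, not Mnemosyne's actual classes or API: facts are subject-predicate-object triples carrying validity windows, so asserting a new value closes the old one instead of overwriting it.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class Triple:
    subject: str
    predicate: str
    obj: str
    valid_from: datetime
    valid_until: Optional[datetime] = None  # None = still current

class TripleStore:
    """Toy subject-predicate-object store with temporal validity windows."""

    def __init__(self):
        self.triples: list[Triple] = []

    def assert_fact(self, s: str, p: str, o: str, when: Optional[datetime] = None):
        when = when or datetime.now(timezone.utc)
        # Supersede any still-open value for the same (subject, predicate)
        for t in self.triples:
            if t.subject == s and t.predicate == p and t.valid_until is None:
                t.valid_until = when
        self.triples.append(Triple(s, p, o, when))

    def query(self, s: str, p: str, as_of: Optional[datetime] = None) -> list[str]:
        as_of = as_of or datetime.now(timezone.utc)
        return [
            t.obj for t in self.triples
            if t.subject == s and t.predicate == p
            and t.valid_from <= as_of
            and (t.valid_until is None or as_of < t.valid_until)
        ]
```

Because old values are closed rather than deleted, an `as_of` query can answer "what did I believe at time T" without any extra bookkeeping.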


The Five-Layer Stack

SuperMemory's architecture is its defining feature. Understanding it clarifies what you get (and don't get) with each tier:

| Layer | Name | What it does | Mnemosyne equivalent |
|---|---|---|---|
| 1 | User Understanding Model | Builds persistent user profiles, preferences, and behavioral patterns | Not present. Mnemosyne stores what you tell it; it does not model you. |
| 2 | Vector Graph Engine | Ontology-aware knowledge graph with typed edges and relationship inference | TripleStore — simpler (S-P-O triples), temporal, but no ontology or edge typing. |
| 3 | Hybrid Retrieval Engine | Multi-strategy semantic + graph + keyword retrieval with learned fusion | Hybrid scoring (vector + FTS5 + importance) × recency decay. Single-pass, not learned. |
| 4 | Multi-modal Extractors | Extracts structured data from text, images, and connected sources | Opt-in LLM fact extraction + regex entity extraction. Text-only, no multi-modal. |
| 5 | Managed Connectors | Native integrations with Notion, Slack, Google Drive, and more | None. Mnemosyne is a library — you build your own connectors. |

Verdict: If you want a system that ingests your Notion docs, Slack messages, and Google Drive files and builds a unified memory graph from all of them — SuperMemory's managed connectors make that possible with zero setup. Mnemosyne expects you to call remember() yourself.
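What "call remember() yourself" looks like in practice: a DIY connector is just a loop that fetches items from a source and stores each one. The sketch below is illustrative; `MemoryStore` is a stand-in stub, not Mnemosyne's actual API, and the export format is hypothetical.

```python
# Minimal DIY "connector": you fetch the documents yourself, and memory
# is an explicit remember() call per item. MemoryStore is a stub here.
class MemoryStore:
    def __init__(self):
        self.memories = []

    def remember(self, text: str, source: str, importance: float = 0.5):
        self.memories.append({"text": text, "source": source, "importance": importance})

def ingest_notion_export(store: MemoryStore, pages: list[dict]) -> int:
    """Push a batch of exported pages into memory; returns count ingested."""
    count = 0
    for page in pages:
        body = page.get("body", "").strip()
        if not body:
            continue  # skip empty pages rather than storing noise
        store.remember(body, source=f"notion:{page['id']}", importance=0.6)
        count += 1
    return count
```

The upside of writing this yourself is that the filtering, importance assignment, and source tagging are all under your control; the downside is exactly what the verdict above says: SuperMemory ships this for you.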


Memory Router: Zero-Code Integration

This is SuperMemory's killer feature. The Memory Router is a drop-in proxy that intercepts your existing LLM API calls and augments them with context from the memory stack:

```ts
// Before: direct API call
const response = await openai.chat.completions.create({ ... })

// After: route through SuperMemory (change baseURL + add headers)
const client = new OpenAI({
  baseURL: "https://api.supermemory.ai/v1",
  defaultHeaders: { "x-supermemory-key": "sk-..." }
})
// Same code, now with memory-augmented context
const response = await client.chat.completions.create({ ... })
```

There is no equivalent in Mnemosyne. Mnemosyne integrates via explicit tool calls (remember(), recall()) or MCP server tools — you add memory to your agent, you don't proxy your existing LLM calls through a memory layer.
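For contrast, the explicit-integration pattern looks roughly like this: recall() before the model call to build context, remember() after it to persist the exchange. This is a hedged sketch; `InMemoryStore` and `call_llm` are stand-in stubs, not Mnemosyne's or OpenAI's actual APIs.

```python
class InMemoryStore:
    """Stub store: crude keyword overlap stands in for hybrid retrieval."""

    def __init__(self):
        self.items: list[str] = []

    def remember(self, text: str) -> None:
        self.items.append(text)

    def recall(self, query: str, limit: int = 5) -> list[str]:
        words = set(query.lower().split())
        hits = [t for t in self.items if words & set(t.lower().split())]
        return hits[:limit]

def agent_turn(store: InMemoryStore, call_llm, user_msg: str) -> str:
    context = store.recall(user_msg, limit=5)        # explicit retrieval
    prompt = "\n".join(context) + "\n\nUser: " + user_msg
    reply = call_llm(prompt)
    store.remember(f"User said: {user_msg}")         # explicit persistence
    store.remember(f"Assistant replied: {reply}")
    return reply
```

The key difference from the proxy model: every read and write is a line of code you wrote, so you decide exactly what enters memory and what retrieval feeds the prompt.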

| | Mnemosyne | SuperMemory |
|---|---|---|
| Integration model | Explicit tool calls / MCP server | Drop-in proxy (change baseURL + headers) |
| Code changes required | Add memory calls to your agent loop | Zero code changes to existing LLM calls |
| Hermes Agent | Native (in-process, 15 tools, 3 hooks) | Via HTTP proxy (no native plugin) |
| OpenClaw | Planned (adapter not yet built) | Via HTTP proxy |
| MCP | 6 tools, stdio + SSE | Custom HTTP API via Memory Router |

Who this matters to: If you have an existing app making OpenAI API calls and you want to add memory without touching your codebase, the Memory Router is uniquely compelling. If you're building an agent from scratch and want fine-grained control over what gets remembered and how, Mnemosyne's explicit API gives you more precision.


Retrieval Quality

SuperMemory currently holds the #1 position on MemoryBench for latency, quality, and cost combined. Mnemosyne has not been benchmarked on MemoryBench (different design goals, different evaluation criteria).

| Feature | Mnemosyne | SuperMemory |
|---|---|---|
| Vector search | sqlite-vec (cosine distance) | Hybrid Retrieval Engine (proprietary) |
| Keyword search | SQLite FTS5 | Included in hybrid engine |
| Graph search | TripleStore with temporal validity windows | Ontology-aware Vector Graph Engine with typed edges |
| Temporal search | temporal_weight + temporal_halflife on recall() | Not publicly documented in detail |
| Scoring | Hybrid: (vector + FTS5 + importance), scaled by recency decay | Learned fusion weights (MemoryBench-optimized) |
| Reranking | None (single-pass hybrid) | Included in Hybrid Retrieval Engine |
| Configurable | Per-query weights for vec, fts, importance | SaaS-managed (limited user-facing configuration) |

Honest assessment: SuperMemory's #1 MemoryBench ranking reflects real engineering quality in retrieval. The learned fusion and ontology-aware graph traversal likely outperform Mnemosyne's static-weighted hybrid scoring for complex, multi-hop retrieval tasks. However, MemoryBench evaluates SaaS products in idealized conditions — your mileage will vary based on your specific domain and data patterns.
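One plausible reading of static-weighted hybrid scoring, sketched below: a weighted sum of the vector, keyword (FTS), and importance signals, scaled by an exponential recency half-life. The weights and the exact decay form are illustrative assumptions, not Mnemosyne's actual formula.

```python
# Illustrative static hybrid scoring: weighted sum of three signals,
# then exponential recency decay. All weights here are assumptions.
def hybrid_score(vec_sim: float, fts_score: float, importance: float,
                 age_days: float, halflife_days: float = 30.0,
                 w_vec: float = 0.6, w_fts: float = 0.25, w_imp: float = 0.15) -> float:
    base = w_vec * vec_sim + w_fts * fts_score + w_imp * importance
    decay = 0.5 ** (age_days / halflife_days)  # score halves every halflife_days
    return base * decay
```

With these weights, a 60-day-old memory at a 30-day half-life keeps only a quarter of its base score, which is the "recency decay" trade-off: old-but-relevant memories need high vector similarity or importance to compete. Learned fusion, by contrast, tunes these weights from feedback rather than fixing them per query.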


Privacy & Self-Hosting

This is where the two products diverge most dramatically.

| | Mnemosyne | SuperMemory |
|---|---|---|
| Data location | Your machine. SQLite file on your disk. | SuperMemory cloud (US-based). Enterprise self-host available. |
| LLM calls | None required. Optional: any OpenAI-compatible endpoint or local GGUF. | Managed by SuperMemory — you do not control the models. |
| Offline capable | Yes — fully functional without internet after model download | No — SaaS requires internet. Enterprise may support offline after setup. |
| Self-hosting | Default. pip install and run. | Enterprise tier only. Pricing not publicly disclosed. |
| Open source | MIT license. Full source on GitHub. | Proprietary. Core stack is closed-source SaaS. |
| Audit trail | SQLite file history, your backup strategy | Platform audit logs (Enterprise) |
| Vendor lock-in | None — standard SQLite, export to JSON | High — proprietary format, SaaS-dependent, no documented export path |

The uncomfortable truth: SuperMemory is a proprietary SaaS product. Your data lives on their infrastructure, processed by their models, stored in their format. The self-hosted option exists only at the Enterprise tier with undisclosed pricing. Mnemosyne is MIT-licensed open source that you run on your own hardware — the trade-off is that you manage everything yourself.


Community & Ecosystem

| | Mnemosyne | SuperMemory |
|---|---|---|
| GitHub stars | Newer project, growing | ~22.4K — strong consumer developer community |
| License | MIT | Proprietary (source-available for client SDKs) |
| Documentation | Full docs (this site) + API reference | docs.supermemory.ai, interactive playground |
| Integrations | Hermes Agent (native, 15 tools, 3 hooks), MCP (6 tools, stdio + SSE), OpenClaw (planned) | Memory Router (OpenAI-compatible proxy), Notion, Slack, Google Drive connectors |
| SDK languages | Python | TypeScript (primary), REST API |
| Target audience | Developers building agents, local-first enthusiasts | Consumer app developers, TypeScript ecosystem, no-code/low-code users |
| Academic depth | BEAM: published architecture with formal memory model | Fewer academic publications; product-focused engineering |

Pricing

Mnemosyne

Free. MIT license. No tiers, no usage caps, no API costs. Use it forever with zero recurring cost. Your only expense is compute (and optional LLM API calls if you enable extraction).

SuperMemory

| Tier | Price | What you get |
|---|---|---|
| Free | $0/mo | 1M tokens/month, basic memory features |
| Pro | $19/mo | Expanded token limits, advanced retrieval |
| Scale | $399/mo | High-volume usage, priority support |
| Enterprise | Custom | Self-hosting, SSO, audit logs, dedicated support |

The pricing cliff: SuperMemory jumps from $19/mo (Pro) to $399/mo (Scale) with no intermediate tier. If your usage outgrows Pro, you face a 21× price increase before reaching Enterprise. Mnemosyne has no pricing curve — you pay for compute, period.

Hidden costs: SuperMemory's managed inference means you don't see the per-operation LLM costs directly — they're bundled into your tier. This simplifies billing but makes it hard to predict costs as your usage scales. With Mnemosyne, you control the models and pay only for what you use.
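The cliff in numbers, using only the tier prices quoted above (per-tier token limits beyond the Free tier's 1M/month are not public, so this shows the price jump, not a cost-per-token comparison):

```python
# Published SuperMemory tier prices in USD/month (as quoted above).
TIERS = {"Free": 0, "Pro": 19, "Scale": 399}

def step_up_ratio(from_tier: str, to_tier: str) -> float:
    """Price multiple when moving up a tier (from_tier must be paid)."""
    return TIERS[to_tier] / TIERS[from_tier]

cliff = step_up_ratio("Pro", "Scale")  # 399 / 19 = 21.0
```

That 21× multiple is the whole story of the cliff: there is no tier between $19 and $399, so any growth past Pro's limits lands directly on Scale pricing.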


When to Choose Mnemosyne

  • You want full data sovereignty — everything stays on your machine
  • You need offline capability — agents that work without internet
  • You're building for Hermes Agent and want deep, native integration (15 tools, hooks)
  • You want predictable costs — no per-token tiers, no SaaS subscription
  • You need open source — MIT license, full code access, no vendor lock-in
  • You want explicit control over what gets remembered, when, and how
  • You need temporal queries — "what did I know about X as of last Tuesday?"

When to Choose SuperMemory

  • You want zero-code integration — change a baseURL and get memory-augmented LLM calls
  • You're building a consumer application and need a polished, managed memory layer
  • You work in the TypeScript ecosystem and want a TypeScript-native solution
  • You need managed connectors — ingest Notion, Slack, Google Drive without building integrations
  • You want MemoryBench-validated retrieval quality with no tuning required
  • You don't want to manage infrastructure — embeddings, vector stores, graph databases
  • You need a User Understanding Model that builds profiles automatically

Neither Wins: Open Source vs. SaaS

SuperMemory and Mnemosyne represent fundamentally different philosophies. SuperMemory is a product: pay for it, point your app at it, and it works. Mnemosyne is a tool: install it, configure it, and build with it. One is not better than the other — they serve different developers with different constraints.


Known Gaps in Mnemosyne (honest list)

| Gap | Severity | Workaround |
|---|---|---|
| No zero-code proxy integration | Medium for existing apps | Explicit remember()/recall() calls or MCP server — requires code changes |
| No multi-modal extraction | Medium | Mnemosyne is text-only; use external tools for image/audio processing before calling remember() |
| No managed connectors (Notion, Slack, etc.) | Medium for knowledge workers | Build your own ingestion pipeline; Mnemosyne is the storage/retrieval layer |
| No User Understanding Model | Low | Mnemosyne stores what you tell it; building a user profile is your application's responsibility |
| No MemoryBench benchmarking | Low for local-first use | Different evaluation context; retrieval quality depends on your embedding model choice |
| No learned fusion / ontology-aware graph | Medium | Static hybrid weights are configurable but not adaptive; TripleStore handles temporal facts but not relationship inference |
| Smaller TypeScript ecosystem | Medium for TS developers | Python-first; MCP provides language-agnostic access via stdio/SSE |
| No SaaS option | Low | By design — Mnemosyne is local-first; deploy on a server with MCP SSE if you need remote access |

Every claim about SuperMemory has been verified against their public documentation and pricing page (as of May 2026). Every claim about Mnemosyne has been verified against the v2.8.0 source code. MemoryBench rankings and five-layer architecture details are from SuperMemory's published benchmarks and architecture docs. If anything here is wrong, please open an issue — we'll fix it.