Mnemosyne vs Hindsight

An honest, technical comparison for users running memory systems locally with Hermes Agent and OpenClaw.

Last updated: 2026-05-13 · Mnemosyne v2.8.0

TL;DR: They are not direct competitors. Hindsight is a memory engine with sophisticated NLP and multi-signal retrieval. Mnemosyne is a memory layer optimized for simplicity, speed, and single-machine deployments. Choose based on what you actually need.


Architecture

| Dimension | Mnemosyne | Hindsight Self-Hosted |
|---|---|---|
| Process model | In-process Python library | Separate Docker containers (FastAPI + PostgreSQL) |
| IPC overhead | Zero (direct function calls) | HTTP + JSON serialization to localhost:8888 |
| Database | SQLite (single file, WAL mode) | PostgreSQL + pgvector extension |
| Embedding model | fastembed ONNX, BAAI/bge-small-en-v1.5 (~67MB) | sentence-transformers PyTorch (~500MB) |
| Vector search | sqlite-vec (int8/bit/float32) or numpy fallback | pgvector HNSW (mature, optimized) |
| Cold start | Instant (if models cached locally) | ~5–10s (Docker container boot + model loading) |
| Runtime memory | ~10–20MB per session (SQLite + ONNX) | ~100–300MB (PostgreSQL pool + PyTorch) |

Memory Model

Mnemosyne: BEAM (Bilevel Episodic-Associative Memory)

Three SQLite tables:

| Tier | Purpose | Behavior |
|---|---|---|
| Working memory | Hot, recent context | Auto-injected into prompts. TTL-based eviction (default 24h). Max 10,000 items. FTS5 indexed. |
| Episodic memory | Long-term consolidated storage | Populated by sleep() consolidation. Hybrid vector + FTS5 search. |
| Scratchpad | Temporary agent workspace | Not searchable, not consolidated. Cleared explicitly. Max 1,000 items. |
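The working-memory behavior in the table (TTL-based eviction plus an item cap) can be sketched with the standard library alone. This is an illustrative toy, not Mnemosyne's implementation: the table name `working_memory`, its columns, and the `open_store` and `evict` helpers are all assumptions made for the example.

```python
import sqlite3
import time

TTL_SECONDS = 24 * 3600   # default TTL cited in the table above
MAX_ITEMS = 10_000        # item cap cited in the table above


def open_store(path=":memory:"):
    """Create a toy working-memory table (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS working_memory ("
        "id INTEGER PRIMARY KEY, content TEXT NOT NULL, created_at REAL NOT NULL)"
    )
    return conn


def evict(conn, now=None, ttl=TTL_SECONDS, max_items=MAX_ITEMS):
    """Drop rows older than the TTL, then trim the oldest rows beyond the cap."""
    now = time.time() if now is None else now
    conn.execute("DELETE FROM working_memory WHERE created_at < ?", (now - ttl,))
    conn.execute(
        "DELETE FROM working_memory WHERE id NOT IN ("
        "SELECT id FROM working_memory ORDER BY created_at DESC, id DESC LIMIT ?)",
        (max_items,),
    )
    conn.commit()
```

Passing `now` explicitly keeps the eviction deterministic in tests; a real system would more likely run it on write or on a timer.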

Additional: TripleStore, a temporal knowledge graph with valid_from/valid_until for point-in-time queries.
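A point-in-time query over triples with validity intervals might look like the sketch below. Only the `valid_from`/`valid_until` column names come from this page; the rest of the schema, the `open_triples` and `as_of` helpers, the half-open interval convention, and the use of ISO date strings are assumptions for illustration.

```python
import sqlite3


def open_triples():
    """Create a toy triple table; valid_until NULL means 'still valid'."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT, "
        "valid_from TEXT NOT NULL, valid_until TEXT)"
    )
    return conn


def as_of(conn, subject, predicate, when):
    """Return the objects valid for (subject, predicate) at time `when`.

    Intervals are treated as half-open: valid_from <= when < valid_until.
    ISO 8601 date strings compare correctly as plain text.
    """
    rows = conn.execute(
        "SELECT object FROM triples "
        "WHERE subject = ? AND predicate = ? "
        "AND valid_from <= ? AND (valid_until IS NULL OR valid_until > ?) "
        "ORDER BY valid_from",
        (subject, predicate, when, when),
    )
    return [r[0] for r in rows]
```

Closing the old triple's interval instead of deleting it is what lets earlier states remain queryable.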

Core operations: remember(), recall(), sleep() — intentionally simple.

Hindsight: Retain / Recall / Reflect

| Hindsight operation | Mnemosyne equivalent | Gap? |
|---|---|---|
| Retain: LLM-driven fact extraction, entity normalization, 2–5 structured facts per chunk | remember() stores raw text + optional embedding. extract=True enables LLM fact extraction. | Partial gap. Mnemosyne can extract facts via LLM but has no automatic entity normalization. |
| Recall: 4-way parallel (semantic + BM25 + graph + temporal), RRF fusion, cross-encoder rerank | recall(): hybrid (vector + FTS5 + importance) × recency decay. Single-pass. | Design difference. Mnemosyne is simpler by choice. |
| Reflect: agentic loop with tool calling, mental models, disposition traits | sleep(): LLM summarization of working → episodic. No agentic loop, no mental models. | Gap. Mnemosyne does consolidation, not reasoning about knowledge. |
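For readers unfamiliar with the RRF fusion step named in the Recall row, here is a minimal sketch of reciprocal rank fusion in its textbook form. It is not Hindsight's code: the function name `rrf_fuse`, the constant `k=60` (a commonly used default), and the tie-breaking rule are assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists: score(d) = sum of 1 / (k + rank of d in each list)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Four hypothetical result lists, one per retrieval strategy.
semantic = ["m1", "m2", "m3"]
bm25 = ["m2", "m1", "m4"]
graph = ["m2", "m5"]
temporal = ["m3", "m2"]
fused = rrf_fuse([semantic, bm25, graph, temporal])
```

Because RRF uses only ranks, it can combine strategies whose raw scores are not comparable (cosine similarity versus BM25, for example).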

Retrieval

| Feature | Mnemosyne | Hindsight Self-Hosted |
|---|---|---|
| Vector search | sqlite-vec (cosine distance) | pgvector HNSW |
| Keyword search | SQLite FTS5 | PostgreSQL full-text + BM25 |
| Graph search | TripleStore (subject-predicate-object, temporal) | Native knowledge graph with co-occurrence tracking |
| Temporal search | temporal_weight + temporal_halflife params on recall() | Native date parsing, occurred_start/end, temporal recall strategy |
| Scoring formula | (vec_weight × vec + fts_weight × fts + importance_weight × importance) × recency decay | 4-way parallel retrieval → RRF fusion → cross-encoder rerank |
| Default weights | 50% vector, 30% FTS, 20% importance | Learned fusion weights |
| Configurable? | Yes, via per-query vec_weight, fts_weight, importance_weight params | Yes, via configurable strategies |
| Reranking | None (single-pass) | Cross-encoder rerank |
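The Mnemosyne scoring formula in the table can be written out as a small function. The weight names and defaults follow the table; everything else is an assumption made for illustration, in particular that the three signals are normalized to [0, 1], that recency decay is an exponential half-life, and that the half-life is 24 hours. The helper name `hybrid_score` is hypothetical.

```python
def hybrid_score(vec, fts, importance, age_hours,
                 vec_weight=0.5, fts_weight=0.3, importance_weight=0.2,
                 halflife_hours=24.0):
    """Weighted sum of the three signals, multiplied by a recency decay.

    Assumes vec, fts, and importance are already normalized to [0, 1].
    The decay halves the score every `halflife_hours` (assumed form).
    """
    base = vec_weight * vec + fts_weight * fts + importance_weight * importance
    decay = 0.5 ** (age_hours / halflife_hours)
    return base * decay


# A fresh, moderately relevant memory versus a stale, highly relevant one.
fresh = hybrid_score(vec=0.6, fts=0.6, importance=0.6, age_hours=1.0)
stale = hybrid_score(vec=0.9, fts=0.9, importance=0.9, age_hours=48.0)
```

Under these assumptions the decay is multiplicative, so a large enough age can outweigh any relevance advantage. That is the practical difference from treating recency as a fourth additive term.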

Entity Extraction (v2)

| Feature | Mnemosyne | Hindsight Self-Hosted |
|---|---|---|
| Method | Regex patterns + pure Python Levenshtein distance | spaCy NLP pipeline + LLM extraction |
| Patterns | @mentions, #hashtags, "quoted phrases", capitalized sequences (2–5 words) | Full NLP: named entities, noun phrases, coreference |
| Fuzzy matching | Levenshtein distance with prefix/substring bonuses | Trigram/full resolution strategies. Entity co-occurrence tracking. |
| Storage | TripleStore triples: (memory_id, "mentions", "entity_name") | Structured entity table with normalization |
| Speed | ~0.01ms per extraction | Heavier (spaCy model loading + inference) |
| Opt-in? | extract_entities=True on remember() | Always on |

Verdict: Mnemosyne's regex approach is fast and dependency-free but misses many entity types that spaCy catches. This is a deliberate trade-off: speed and simplicity over NLP accuracy.
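To make the trade-off concrete, the sketch below shows what a regex-plus-Levenshtein extractor of this kind could look like. The regular expressions, the `extract_entities` and `levenshtein` helpers, and the omission of the prefix/substring bonuses are all assumptions; Mnemosyne's actual patterns may differ.

```python
import re

# Illustrative patterns for the four categories listed in the table.
PATTERNS = [
    re.compile(r"@(\w+)"),                                  # @mentions
    re.compile(r"#(\w+)"),                                  # #hashtags
    re.compile(r'"([^"]+)"'),                               # "quoted phrases"
    re.compile(r"\b((?:[A-Z][a-z]+ ){1,4}[A-Z][a-z]+)\b"),  # 2-5 capitalized words
]


def extract_entities(text):
    """Return unique matches in pattern order, preserving first occurrence."""
    found = []
    for pattern in PATTERNS:
        for match in pattern.findall(text):
            if match not in found:
                found.append(match)
    return found


def levenshtein(a, b):
    """Edit distance (insert, delete, substitute), two-row dynamic programming."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]
```

A single capitalized word such as a sentence-initial "Ping" is skipped by the last pattern, which shows the kind of entity a regex approach misses and an NLP pipeline would catch.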


Fact Extraction (v2)

| Feature | Mnemosyne | Hindsight Self-Hosted |
|---|---|---|
| Method | LLM-driven: sends text to LLM, parses 2–5 factual statements | LLM-driven Retain pipeline with provenance tracking |
| Fallback chain | Remote OpenAI-compatible API → local ctransformers GGUF → skip (graceful) | N/A (runs inside container) |
| Storage | TripleStore: (memory_id, "fact", fact_text) | Structured fact table with evidence tracking |
| Opt-in? | extract=True on remember() | Always on via Retain |
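The fallback chain in the table (remote API, then local model, then a graceful skip) follows a common pattern that can be sketched without any LLM dependency. The function names and the two stand-in backends below are hypothetical; they simulate an outage and a trivial local extractor, not real API or ctransformers calls.

```python
def extract_facts(text, extractors, max_facts=5):
    """Try each backend in order; return [] if every backend fails (graceful skip)."""
    for extractor in extractors:
        try:
            facts = extractor(text)
        except Exception:
            continue  # backend unavailable: fall through to the next one
        if facts:
            return facts[:max_facts]  # cap at 5, matching the 2-5 range in the table
    return []


def remote_api(text):
    """Stand-in for a remote OpenAI-compatible call; simulates an outage."""
    raise ConnectionError("no API reachable")


def local_model(text):
    """Stand-in for a local model; naively treats each sentence as a fact."""
    return [s.strip() for s in text.split(".") if s.strip()]


facts = extract_facts(
    "Alice moved to Lisbon. She works at Acme.",
    [remote_api, local_model],
)
```

Returning an empty list instead of raising means storing the memory never fails just because extraction did.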

Integrations

MCP (Model Context Protocol) — v2

Mnemosyne provides an MCP server with 6 tools and 2 transports:

| Tool | Description |
|---|---|
| mnemosyne_remember | Store a memory (supports entity extraction, fact extraction, bank selection) |
| mnemosyne_recall | Search memories with hybrid scoring and configurable weights |
| mnemosyne_sleep | Run consolidation cycle |
| mnemosyne_scratchpad_read | Read agent scratchpad |
| mnemosyne_scratchpad_write | Write to scratchpad |
| mnemosyne_get_stats | Get memory statistics |

    mnemosyne mcp                               # stdio transport (Claude Desktop, etc.)
    mnemosyne mcp --transport sse --port 8080   # SSE transport (web clients)
    mnemosyne mcp --bank project_a              # scoped to a specific bank

Hermes Agent Integration

15 tools and 3 hooks via plugin.yaml.

Hooks: pre_llm_call (context injection), on_session_start (session init), post_tool_call (memory capture)

Hindsight Integration

Hindsight exposes a custom HTTP API on port 8888. A native openclaw-hindsight plugin exists for OpenClaw; Hermes integrates through an HTTP client.

| Integration | Mnemosyne | Hindsight |
|---|---|---|
| Hermes | Native (in-process, no serialization) | HTTP client |
| OpenClaw | Planned (adapter not yet built) | Native plugin exists |
| MCP | 6 tools, stdio + SSE | Custom HTTP API |
| Cross-machine | Export/import JSON only | Any agent with HTTP access to port 8888 |

Memory Banks (v2)

| Feature | Mnemosyne | Hindsight |
|---|---|---|
| Named banks | BankManager: create, list, delete, rename banks | banks table with strict isolation |
| Isolation | Per-bank SQLite file under data_dir/banks/<name>/ | PostgreSQL schema-level isolation |
| Usage | Mnemosyne(bank="work") or mnemosyne mcp --bank work | API-level bank selection |
| Multi-tenancy | No access control | HindClaw extension (JWT/API key multi-tenancy) |
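Per-bank file isolation amounts to mapping a bank name to its own database path. The sketch below shows one way to do that safely. Only the `data_dir/banks/<name>/` layout comes from the table; the file name `memory.db`, the name validation rule, and both helper functions are assumptions.

```python
import re
import sqlite3
from pathlib import Path

# Restrict bank names so they cannot escape the banks/ directory (assumed rule).
_BANK_NAME = re.compile(r"[A-Za-z0-9_-]+")


def bank_db_path(data_dir, name):
    """Map a bank name to its SQLite file under data_dir/banks/<name>/."""
    if not _BANK_NAME.fullmatch(name):
        raise ValueError(f"invalid bank name: {name!r}")
    return Path(data_dir) / "banks" / name / "memory.db"  # file name is hypothetical


def open_bank(data_dir, name):
    """Open (creating if needed) the bank's database in WAL mode."""
    path = bank_db_path(data_dir, name)
    path.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    return conn
```

This isolates data by file, not by permission: any process that can read the data directory can open any bank, which is why the table lists no access control.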

Additional Features (v2)

Mnemosyne-specific

| Feature | Module | Description |
|---|---|---|
| Streaming | core/streaming.py | MemoryStream with push (callbacks) and pull (iterator) patterns. Thread-safe event buffer. |
| Delta sync | core/streaming.py | DeltaSync: incremental synchronization between Mnemosyne instances with checkpointed resume. |
| Pattern detection | core/patterns.py | PatternDetector: temporal (hour/weekday), content (keyword frequency, co-occurrence), and sequence patterns. |
| Memory compression | core/patterns.py | MemoryCompressor: dictionary-based, RLE, and semantic compression strategies. |
| Plugin system | core/plugins.py | MnemosynePlugin base class with 4 lifecycle hooks. Discovers plugins from ~/.hermes/mnemosyne/plugins/. |
| Diagnostics | diagnose.py | PII-safe health check covering dependencies, database state, and vector readiness. Reports no memory content or API keys. |
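The push and pull patterns mentioned for streaming can be illustrated with a small thread-safe buffer. This is a generic sketch, not the MemoryStream class itself: the class name `EventStream`, its method names, and the bounded `deque` are assumptions.

```python
import threading
from collections import deque


class EventStream:
    """Toy thread-safe buffer offering push (callbacks) and pull (iterator) access."""

    def __init__(self, maxlen=1000):
        self._lock = threading.Lock()
        self._buffer = deque(maxlen=maxlen)  # oldest events drop when full
        self._callbacks = []

    def subscribe(self, callback):
        """Push pattern: register a callable invoked on every published event."""
        with self._lock:
            self._callbacks.append(callback)

    def publish(self, event):
        with self._lock:
            self._buffer.append(event)
            callbacks = list(self._callbacks)
        for callback in callbacks:  # run outside the lock so callbacks can publish too
            callback(event)

    def drain(self):
        """Pull pattern: yield buffered events oldest first, removing each one."""
        while True:
            with self._lock:
                if not self._buffer:
                    return
                event = self._buffer.popleft()
            yield event
```

Subscribers see each event as it is published, while `drain()` lets a consumer catch up on its own schedule. The lock is held only while touching shared state, never while user code runs.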

Hindsight-specific (not in Mnemosyne)

| Feature | Description |
|---|---|
| Automatic entity normalization | "Abdias" and "Abdias J" resolved to same entity automatically |
| Cross-encoder reranking | Second-pass neural reranking of retrieval results |
| Mental models | Agent reasoning about user preferences and traits |
| Agentic reflection | Tool-calling loop during Reflect phase |
| Conflict detection | Automatic contradiction detection and merging |
| Multi-machine sharing | Network API for distributed agents |
| Multi-tenancy | Per-user isolation with access control via HindClaw |

Performance Characteristics

| Metric | Mnemosyne | Hindsight Self-Hosted |
|---|---|---|
| Recall latency (10K corpus) | ~2–10ms (in-process SQLite + sqlite-vec, no HTTP overhead) | ~50–200ms (HTTP round-trip + PostgreSQL + 4-way retrieval + rerank) |
| IPC model | Direct Python function call | HTTP POST to localhost:8888 → JSON serialization → response parsing |
| Storage footprint | ~50–100MB SQLite file per 10K memories | ~200–500MB PostgreSQL + WAL per 10K memories |
| Model download | One-time ~67MB (fastembed ONNX) | One-time ~500MB (sentence-transformers PyTorch) |
| Runtime memory | ~10–20MB per session | ~100–300MB (PostgreSQL pool + PyTorch runtime) |

Important caveat on latency numbers: Mnemosyne's latency advantage comes from being an in-process library calling SQLite directly, compared to HTTP round-trips to a local Docker container. This is an architectural advantage, not a retrieval-quality advantage.


When to Choose What

Choose Mnemosyne if:

  • You want pip install with zero containers
  • You need the fastest possible recall latency for interactive agent loops
  • You're running on a resource-constrained environment (VPS, ephemeral VM, CI)
  • You're building a single-user, single-machine agent (Hermes, Claude Desktop, etc.)
  • You want an MCP-compatible memory layer (stdio + SSE)
  • You want full control over the memory model and don't need automatic "magic"
  • You want memory banks with per-bank SQLite isolation without standing up PostgreSQL

Choose Hindsight Self-Hosted if:

  • You need entity resolution ("Abdias" and "Abdias J" are the same person)
  • You need automatic structured fact extraction from raw text
  • You need cross-machine memory sharing via network API
  • You need multi-tenant memory banks with access control
  • You need temporal reasoning with automatic date extraction
  • You need the highest recall quality (4-way retrieval + cross-encoder rerank)
  • You're okay with Docker + PostgreSQL complexity as a trade-off for richer capabilities

Neither is "better." They serve different points on the simplicity-sophistication spectrum.


Known Gaps in Mnemosyne (honest list)

| Gap | Severity | Workaround |
|---|---|---|
| No automatic entity normalization | Medium | extract_entities=True captures entities; fuzzy matching helps but doesn't resolve coreference |
| No cross-machine network API | Medium for multi-agent setups | Export/import JSON; same-machine sharing via shared SQLite file. Direct import from Hindsight is supported, so you can migrate without data loss. |
| No cross-encoder reranking | Low for most queries | Hybrid scoring with configurable weights covers common cases |
| No automatic conflict detection | Medium | Manual invalidate(memory_id, replacement_id=new_id) |
| No multi-tenancy / access control | High for SaaS use cases | Use per-bank SQLite isolation for domain separation |
| No mental models / agentic reflection | Low | sleep() does consolidation; reasoning about knowledge is the caller's job |
| OpenClaw adapter not yet built | Medium for OpenClaw users | Hermes integration is native; OpenClaw requires MCP adapter work |

This page was rewritten for v2.8.0 after community feedback about inaccurate comparisons. Every feature listed for Mnemosyne has been verified against the source code. If anything here is wrong, please open an issue — we'll fix it.