Mnemosyne vs Hindsight

An honest, technical comparison for users running memory systems locally with Hermes Agent and OpenClaw.

Last updated: 2026-05-13 · Mnemosyne v2.8.0

TL;DR: They are not direct competitors. Hindsight is a memory engine with sophisticated NLP and multi-signal retrieval. Mnemosyne is a memory layer optimized for simplicity, speed, and single-machine deployments. Choose based on what you actually need.

Architecture

Dimension	Mnemosyne	Hindsight Self-Hosted
Process model	In-process Python library	Separate Docker containers (FastAPI + PostgreSQL)
IPC overhead	Zero (direct function calls)	HTTP + JSON serialization to localhost:8888
Database	SQLite (single file, WAL mode)	PostgreSQL + pgvector extension
Embedding model	fastembed ONNX — BAAI/bge-small-en-v1.5 (~67MB)	sentence-transformers PyTorch (~500MB)
Vector search	sqlite-vec (int8/bit/float32) or numpy fallback	pgvector HNSW (mature, optimized)
Cold start	Instant (if models cached locally)	~5–10s (Docker container boot + model loading)
Runtime memory	~10–20MB per session (SQLite + ONNX)	~100–300MB (PostgreSQL pool + PyTorch)
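To make the "single file, WAL mode" entry concrete, here is a minimal sketch using only Python's standard `sqlite3` module. It does not use Mnemosyne itself; the file name `memory.db` and the `notes` table are placeholders chosen for illustration.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_wal_database(path: Path) -> tuple[sqlite3.Connection, str]:
    """Open a SQLite file and request write-ahead logging (WAL)."""
    conn = sqlite3.connect(path)
    # The PRAGMA returns the journal mode that is actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, mode


with tempfile.TemporaryDirectory() as tmp:
    # "memory.db" is a placeholder name, not Mnemosyne's real file name.
    conn, mode = open_wal_database(Path(tmp) / "memory.db")
    conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute("INSERT INTO notes (body) VALUES (?)", ("hello",))
    conn.commit()
    row_count = conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
    conn.close()

print(mode, row_count)
```

WAL needs a real file, so an in-memory database would not report this mode. Everything here runs inside the calling process, which is the property the "IPC overhead" row refers to.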

Memory Model

Mnemosyne: BEAM (Bilevel Episodic-Associative Memory)

Three SQLite tables:

Tier	Purpose	Behavior
Working memory	Hot, recent context	Auto-injected into prompts. TTL-based eviction (default 24h). Max 10,000 items. FTS5 indexed.
Episodic memory	Long-term consolidated storage	Populated by `sleep()` consolidation. Hybrid vector + FTS5 search.
Scratchpad	Temporary agent workspace	Not searchable, not consolidated. Cleared explicitly. Max 1,000 items.
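The working-memory row describes TTL-based eviction plus an item cap. The following is a rough sketch of how such a policy can be expressed against a plain SQLite table. The schema, the `evict` helper, and the oldest-first trimming rule are assumptions made for illustration, not Mnemosyne's actual implementation.

```python
import sqlite3

TTL_SECONDS = 24 * 60 * 60  # default TTL from the table above (24h)
MAX_ITEMS = 10_000          # item cap from the table above


def make_working_memory() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE working_memory ("
        " id INTEGER PRIMARY KEY,"
        " content TEXT NOT NULL,"
        " created_at REAL NOT NULL)"
    )
    return conn


def evict(conn, now, ttl=TTL_SECONDS, max_items=MAX_ITEMS):
    """Drop expired rows, then trim the oldest rows beyond the cap."""
    conn.execute("DELETE FROM working_memory WHERE created_at < ?", (now - ttl,))
    conn.execute(
        "DELETE FROM working_memory WHERE id NOT IN ("
        " SELECT id FROM working_memory"
        " ORDER BY created_at DESC, id DESC LIMIT ?)",
        (max_items,),
    )
    conn.commit()


def contents(conn):
    rows = conn.execute("SELECT content FROM working_memory ORDER BY created_at")
    return [row[0] for row in rows]


now = 1_000_000.0  # fixed timestamp so the example is deterministic
conn = make_working_memory()
conn.executemany(
    "INSERT INTO working_memory (content, created_at) VALUES (?, ?)",
    [
        ("two days ago", now - 2 * 86_400),
        ("an hour ago", now - 3_600),
        ("just now", now),
    ],
)

evict(conn, now)               # TTL removes the two-day-old row
after_ttl = contents(conn)
evict(conn, now, max_items=1)  # cap keeps only the newest row
after_cap = contents(conn)
print(after_ttl, after_cap)
```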

Additional: TripleStore — temporal knowledge graph with valid_from/valid_until for point-in-time queries.
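A point-in-time query over `valid_from`/`valid_until` can be sketched with a single SQL predicate. The table layout, the `as_of` helper, and the half-open interval convention (a fact is valid from `valid_from` up to but not including `valid_until`) are assumptions for illustration; the real TripleStore schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE triples ("
    " subject TEXT, predicate TEXT, object TEXT,"
    " valid_from TEXT NOT NULL,"
    " valid_until TEXT)"  # NULL means "still valid"
)
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?, ?, ?)",
    [
        ("alice", "works_at", "Acme", "2023-01-01", "2024-06-01"),
        ("alice", "works_at", "Globex", "2024-06-01", None),
    ],
)


def as_of(conn, subject, predicate, when):
    """Return the objects that were valid at ISO date `when`."""
    rows = conn.execute(
        "SELECT object FROM triples"
        " WHERE subject = ? AND predicate = ?"
        " AND valid_from <= ?"
        " AND (valid_until IS NULL OR valid_until > ?)"
        " ORDER BY valid_from",
        (subject, predicate, when, when),
    ).fetchall()
    return [row[0] for row in rows]


print(as_of(conn, "alice", "works_at", "2023-07-15"))  # ['Acme']
print(as_of(conn, "alice", "works_at", "2025-01-01"))  # ['Globex']
```

ISO 8601 date strings sort lexicographically, which is why plain string comparison is enough in this sketch.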

Core operations: remember(), recall(), sleep() — intentionally simple.

Hindsight: Retain / Recall / Reflect

Operation	Mnemosyne equivalent	Gap?
Retain — LLM-driven fact extraction, entity normalization, 2–5 structured facts per chunk	`remember()` stores raw text + optional embedding. `extract=True` enables LLM fact extraction.	Partial gap. Mnemosyne can extract facts via LLM but has no automatic entity normalization.
Recall — 4-way parallel (semantic + BM25 + graph + temporal), RRF fusion, cross-encoder rerank	`recall()` — hybrid (vector + FTS5 + importance) × recency decay. Single-pass.	Design difference. Mnemosyne is simpler by choice.
Reflect — Agentic loop with tool calling, mental models, disposition traits	`sleep()` — LLM summarization of working → episodic. No agentic loop, no mental models.	Gap. Mnemosyne does consolidation, not reasoning about knowledge.
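For readers unfamiliar with RRF (reciprocal rank fusion), the idea behind combining several ranked lists is small enough to sketch. This is the textbook formula, where each list contributes `1 / (k + rank)` per item. The constant `k=60` is the commonly used default and an assumption here; Hindsight's actual constants, fusion weights, and cross-encoder rerank step are not shown.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of IDs with reciprocal rank fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from four retrieval strategies.
semantic = ["m1", "m2", "m3"]
bm25 = ["m2", "m1", "m4"]
graph = ["m2", "m5"]
temporal = ["m3", "m2"]

fused = rrf_fuse([semantic, bm25, graph, temporal])
print(fused)  # ['m2', 'm1', 'm3', 'm5', 'm4']
```

`m2` wins because it appears near the top of all four lists, even though it is first in only two of them. That is the behavior that makes rank fusion robust to one strategy scoring on a different scale than another.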

Retrieval

Feature	Mnemosyne	Hindsight Self-Hosted
Vector search	sqlite-vec (cosine distance)	pgvector HNSW
Keyword search	SQLite FTS5	PostgreSQL full-text + BM25
Graph search	TripleStore (subject-predicate-object, temporal)	Native knowledge graph with co-occurrence tracking
Temporal search	`temporal_weight` + `temporal_halflife` params on `recall()`	Native date parsing, `occurred_start/end`, temporal recall strategy
Scoring formula	`vec_weight × vec + fts_weight × fts + importance_weight × importance`, then × recency decay	4-way parallel retrieval → RRF fusion → cross-encoder rerank
Default weights	50% vector, 30% FTS, 20% importance	Learned fusion weights
Configurable?	Yes — per-query `vec_weight`, `fts_weight`, `importance_weight` params	Yes — configurable strategies
Reranking	None (single-pass)	Cross-encoder rerank
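The Mnemosyne scoring formula in the table can be written out directly. The sketch below uses the documented default weights (50/30/20). The exponential half-life form of the recency decay and the parameter name `halflife_hours` are assumptions, since the table only says "recency decay"; how `temporal_weight` enters the real formula is not modeled.

```python
import math


def hybrid_score(
    vec,
    fts,
    importance,
    age_hours,
    *,
    vec_weight=0.5,
    fts_weight=0.3,
    importance_weight=0.2,
    halflife_hours=24.0,  # hypothetical parameter name and default
):
    """Weighted sum of the three signals, scaled by recency decay."""
    base = vec_weight * vec + fts_weight * fts + importance_weight * importance
    decay = 0.5 ** (age_hours / halflife_hours)  # assumed decay shape
    return base * decay


fresh = hybrid_score(0.8, 0.5, 1.0, age_hours=0.0)
day_old = hybrid_score(0.8, 0.5, 1.0, age_hours=24.0)
print(round(fresh, 3), round(day_old, 3))  # 0.75 0.375

# Per-query weights: rank purely by vector similarity.
vector_only = hybrid_score(
    0.8, 0.5, 1.0, age_hours=0.0,
    vec_weight=1.0, fts_weight=0.0, importance_weight=0.0,
)
```

Under these assumptions, a memory one half-life old scores exactly half of what an otherwise identical fresh memory scores.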

Entity Extraction (v2)

Feature	Mnemosyne	Hindsight Self-Hosted
Method	Regex patterns + pure Python Levenshtein distance	spaCy NLP pipeline + LLM extraction
Patterns	`@mentions`, `#hashtags`, `"quoted phrases"`, capitalized sequences (2–5 words)	Full NLP: named entities, noun phrases, coreference
Fuzzy matching	Levenshtein distance with prefix/substring bonuses	Trigram/full resolution strategies. Entity co-occurrence tracking.
Storage	TripleStore triples: `(memory_id, "mentions", "entity_name")`	Structured entity table with normalization
Speed	~0.01ms per extraction	Heavier (spaCy model loading + inference)
Opt-in?	`extract_entities=True` on `remember()`	Always on

Verdict: Mnemosyne's regex approach is fast and dependency-free but misses many entity types that spaCy catches. This is a deliberate trade-off: speed and simplicity over NLP accuracy.
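To show what a regex-plus-Levenshtein approach looks like in practice, here is a small dependency-free sketch. The four patterns are a plausible reading of the "Patterns" row, not Mnemosyne's actual expressions, and the prefix/substring bonuses from the "Fuzzy matching" row are not modeled.

```python
import re

# Illustrative patterns; the real ones may differ.
PATTERNS = {
    "mentions": re.compile(r"@(\w+)"),
    "hashtags": re.compile(r"#(\w+)"),
    "quoted": re.compile(r'"([^"]+)"'),
    # Two to five capitalized words in a row.
    "capitalized": re.compile(r"\b(?:[A-Z][a-z]+ ){1,4}[A-Z][a-z]+\b"),
}


def extract_entities(text):
    """Return every match for each pattern, in order of appearance."""
    return {name: pattern.findall(text) for name, pattern in PATTERNS.items()}


def levenshtein(a, b):
    """Edit distance via the classic two-row dynamic programme."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,        # deletion
                    current[j - 1] + 1,     # insertion
                    previous[j - 1] + cost  # substitution
                )
            )
        previous = current
    return previous[-1]


text = 'Met @maria about #launch at "Project Orion" with Grace Hopper yesterday.'
entities = extract_entities(text)
print(entities)
print(levenshtein("Abdias", "Abdias J"))  # 2
```

The sketch also shows the limitation named in the verdict: a single capitalized word such as "Met" or a lowercase name is never captured, and a small edit distance between "Abdias" and "Abdias J" suggests a match without actually resolving the two to one entity.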

Fact Extraction (v2)

Feature	Mnemosyne	Hindsight Self-Hosted
Method	LLM-driven: sends text to LLM, parses 2–5 factual statements	LLM-driven Retain pipeline with provenance tracking
Fallback chain	Remote OpenAI-compatible API → local ctransformers GGUF → skip (graceful)	N/A (runs inside container)
Storage	TripleStore: `(memory_id, "fact", fact_text)`	Structured fact table with evidence tracking
Opt-in?	`extract=True` on `remember()`	Always on via Retain
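The fallback chain in the table (remote API, then local model, then skip) follows a common pattern that can be sketched without any LLM at all. The extractor functions below are stand-ins: `remote_extractor` simulates an unreachable API and `local_extractor` fakes a model by splitting sentences. All names are hypothetical.

```python
from typing import Callable, Sequence

Extractor = Callable[[str], list[str]]


def extract_with_fallback(text: str, extractors: Sequence[Extractor]) -> list[str]:
    """Try each backend in order; return [] if none produces facts."""
    for extractor in extractors:
        try:
            facts = extractor(text)
        except Exception:
            continue  # backend unavailable, try the next one
        if facts:
            return facts[:5]  # the table describes 2-5 facts per call
    return []  # graceful skip: the memory is stored without facts


def facts_to_triples(memory_id: str, facts: list[str]) -> list[tuple[str, str, str]]:
    """Shape facts like the storage row: (memory_id, "fact", fact_text)."""
    return [(memory_id, "fact", fact) for fact in facts]


def remote_extractor(text: str) -> list[str]:
    raise ConnectionError("simulated: remote API unreachable")


def local_extractor(text: str) -> list[str]:
    # Stand-in for a local model: treat each sentence as one fact.
    return [part.strip() for part in text.split(".") if part.strip()]


facts = extract_with_fallback(
    "Ana moved to Lisbon. She works on compilers.",
    [remote_extractor, local_extractor],
)
triples = facts_to_triples("mem-1", facts)
skipped = extract_with_fallback("anything", [remote_extractor])
print(triples, skipped)
```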

Integrations

MCP (Model Context Protocol) — v2

Mnemosyne provides an MCP server with 6 tools and 2 transports:

Tool	Description
`mnemosyne_remember`	Store a memory (supports entity extraction, fact extraction, bank selection)
`mnemosyne_recall`	Search memories with hybrid scoring and configurable weights
`mnemosyne_sleep`	Run consolidation cycle
`mnemosyne_scratchpad_read`	Read agent scratchpad
`mnemosyne_scratchpad_write`	Write to scratchpad
`mnemosyne_get_stats`	Get memory statistics

mnemosyne mcp                              # stdio transport (Claude Desktop, etc.)
mnemosyne mcp --transport sse --port 8080  # SSE transport (web clients)
mnemosyne mcp --bank project_a             # scoped to a specific bank
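For the stdio transport, MCP clients such as Claude Desktop typically launch the server from an `mcpServers` entry in their JSON config. The sketch below builds such an entry in Python from the commands shown above. It assumes the `mnemosyne` executable is on the client's `PATH`, and the server label `"mnemosyne"` is an arbitrary choice.

```python
import json


def mcp_server_entry(bank=None):
    """Build a stdio server entry for the documented `mnemosyne mcp` command."""
    args = ["mcp"]
    if bank is not None:
        args += ["--bank", bank]
    # Assumes `mnemosyne` resolves on PATH for the client process.
    return {"command": "mnemosyne", "args": args}


config = {"mcpServers": {"mnemosyne": mcp_server_entry(bank="project_a")}}
print(json.dumps(config, indent=2))
```

The printed JSON can be merged into the client's existing configuration file; where that file lives depends on the client and operating system.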

Hermes Agent Integration

The Hermes plugin provides 15 tools and 3 hooks via plugin.yaml.

Hooks: pre_llm_call (context injection), on_session_start (session init), post_tool_call (memory capture)

Hindsight Integration

Hindsight exposes a custom HTTP API on port 8888. A native openclaw-hindsight plugin exists for OpenClaw; Hermes integrates through an HTTP client.

	Mnemosyne	Hindsight
Hermes	Native (in-process, no serialization)	HTTP client
OpenClaw	Planned (adapter not yet built)	Native plugin exists
MCP	6 tools, stdio + SSE	Custom HTTP API
Cross-machine	Export/import JSON only	Any agent with HTTP access to port 8888

Memory Banks (v2)

Feature	Mnemosyne	Hindsight
Named banks	`BankManager` — create, list, delete, rename banks	`banks` table with strict isolation
Isolation	Per-bank SQLite file under `data_dir/banks/<name>/`	PostgreSQL schema-level isolation
Usage	`Mnemosyne(bank="work")` or `mnemosyne mcp --bank work`	API-level bank selection
Multi-tenancy	No access control	HindClaw extension (JWT/API key multi-tenancy)
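Per-bank isolation through separate SQLite files can be sketched as follows. Only the `data_dir/banks/<name>/` layout comes from the table; the file name `memory.db`, the `memories` schema, and the helper names are assumptions, and this is not the `BankManager` API.

```python
import sqlite3
import tempfile
from contextlib import closing
from pathlib import Path


def bank_db_path(data_dir: Path, bank: str) -> Path:
    """Resolve the database file for a bank under data_dir/banks/<name>/."""
    if not bank or bank in {".", ".."} or "/" in bank or "\\" in bank:
        raise ValueError(f"invalid bank name: {bank!r}")
    bank_dir = data_dir / "banks" / bank
    bank_dir.mkdir(parents=True, exist_ok=True)
    return bank_dir / "memory.db"  # file name is an assumption


def open_bank(data_dir: Path, bank: str) -> sqlite3.Connection:
    conn = sqlite3.connect(bank_db_path(data_dir, bank))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memories ("
        " id INTEGER PRIMARY KEY, content TEXT NOT NULL)"
    )
    return conn


def store_in_bank(data_dir: Path, bank: str, content: str) -> None:
    with closing(open_bank(data_dir, bank)) as conn:
        conn.execute("INSERT INTO memories (content) VALUES (?)", (content,))
        conn.commit()


def read_bank(data_dir: Path, bank: str) -> list[str]:
    with closing(open_bank(data_dir, bank)) as conn:
        rows = conn.execute("SELECT content FROM memories ORDER BY id")
        return [row[0] for row in rows]


with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    store_in_bank(data_dir, "work", "Quarterly review is on Friday")
    store_in_bank(data_dir, "personal", "Dentist appointment at 3pm")
    work = read_bank(data_dir, "work")
    personal = read_bank(data_dir, "personal")

print(work, personal)
```

Because each bank is a separate file, a query against one bank cannot see another bank's rows. Any process that can read the directory can still open every file, which is why the table lists "No access control".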

Additional Features (v2)

Mnemosyne-specific

Feature	Module	Description
Streaming	`core/streaming.py`	`MemoryStream` with push (callbacks) and pull (iterator) patterns. Thread-safe event buffer.
Delta sync	`core/streaming.py`	`DeltaSync` — incremental synchronization between Mnemosyne instances with checkpointed resume.
Pattern detection	`core/patterns.py`	`PatternDetector` — temporal (hour/weekday), content (keyword frequency, co-occurrence), sequence patterns.
Memory compression	`core/patterns.py`	`MemoryCompressor` — dictionary-based, RLE, and semantic compression strategies.
Plugin system	`core/plugins.py`	`MnemosynePlugin` base class with 4 lifecycle hooks. Discovers plugins from `~/.hermes/mnemosyne/plugins/`.
Diagnostics	`diagnose.py`	PII-safe health check — dependencies, database state, vector readiness. No memory content or API keys.
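The push and pull patterns mentioned for streaming can be illustrated with a small thread-safe buffer. This is a generic sketch of the two consumption styles, not the `MemoryStream` class; the class name `EventStream` and its methods are hypothetical.

```python
import threading
from collections import deque


class EventStream:
    """Bounded event buffer with push (callbacks) and pull (iterator)."""

    def __init__(self, maxlen=1000):
        self._lock = threading.Lock()
        self._buffer = deque(maxlen=maxlen)  # oldest events drop when full
        self._callbacks = []

    def subscribe(self, callback):
        with self._lock:
            self._callbacks.append(callback)

    def publish(self, event):
        with self._lock:
            self._buffer.append(event)
            callbacks = list(self._callbacks)
        # Call subscribers outside the lock so a slow callback
        # cannot block other publishers.
        for callback in callbacks:
            callback(event)

    def drain(self):
        """Yield buffered events until the buffer is empty."""
        while True:
            with self._lock:
                if not self._buffer:
                    return
                event = self._buffer.popleft()
            yield event


stream = EventStream()
pushed = []
stream.subscribe(pushed.append)  # push: invoked on every publish

stream.publish({"type": "remember", "id": 1})
stream.publish({"type": "remember", "id": 2})

pulled = list(stream.drain())    # pull: consume at the reader's pace
print(pushed == pulled, len(pulled))
```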

Hindsight-specific (not in Mnemosyne)

Feature	Description
Automatic entity normalization	"Abdias" and "Abdias J" resolved to same entity automatically
Cross-encoder reranking	Second-pass neural reranking of retrieval results
Mental models	Agent reasoning about user preferences and traits
Agentic reflection	Tool-calling loop during Reflect phase
Conflict detection	Automatic contradiction detection and merging
Multi-machine sharing	Network API for distributed agents
Multi-tenancy	Per-user isolation with access control via HindClaw

Performance Characteristics

Metric	Mnemosyne	Hindsight Self-Hosted
Recall latency (10K corpus)	~2–10ms — in-process SQLite + sqlite-vec, no HTTP overhead	~50–200ms — HTTP round-trip + PostgreSQL + 4-way retrieval + rerank
IPC model	Direct Python function call	HTTP POST to localhost:8888 → JSON serialization → response parsing
Storage footprint	~50–100MB SQLite file per 10K memories	~200–500MB PostgreSQL + WAL per 10K memories
Model download	One-time ~67MB (fastembed ONNX)	One-time ~500MB (sentence-transformers PyTorch)
Runtime memory	~10–20MB per session	~100–300MB (PostgreSQL pool + PyTorch runtime)

Important caveat on latency numbers: Mnemosyne's latency advantage comes from running in-process and calling SQLite directly, whereas Hindsight pays for an HTTP round-trip to a local Docker container on every call. This is an architectural advantage, not a retrieval-quality advantage.
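One way to see why the transport matters is to time the same trivial lookup with and without a JSON encode/decode step. The harness below is an illustration only: it simulates just the serialization part of a localhost HTTP call (no sockets, no database) and says nothing about either project's real latency. Absolute numbers depend on the machine.

```python
import json
import time


def lookup(store, key):
    return store.get(key, "")


def direct_call(store, key):
    """In-process style: a plain function call."""
    return lookup(store, key)


def serialized_call(store, key):
    """Same lookup, plus the JSON encode/decode an HTTP API would add."""
    request = json.loads(json.dumps({"key": key}))
    response = json.dumps({"value": lookup(store, request["key"])})
    return json.loads(response)["value"]


def mean_seconds(fn, *args, repeats=10_000):
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) / repeats


store = {"k": "v"}
direct = mean_seconds(direct_call, store, "k")
serialized = mean_seconds(serialized_call, store, "k")
print(f"direct: {direct * 1e6:.2f} us, with JSON round trip: {serialized * 1e6:.2f} us")
```

Both paths return the same value; only the cost of getting there differs, which mirrors the distinction between latency and retrieval quality drawn above.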

When to Choose What

Choose Mnemosyne if:

You want pip install with zero containers
You need the fastest possible recall latency for interactive agent loops
You're running on a resource-constrained environment (VPS, ephemeral VM, CI)
You're building a single-user, single-machine agent (Hermes, Claude Desktop, etc.)
You want an MCP-compatible memory layer (stdio + SSE)
You want full control over the memory model and don't need automatic "magic"
You want memory banks with per-bank SQLite isolation without standing up PostgreSQL

Choose Hindsight Self-Hosted if:

You need entity resolution ("Abdias" and "Abdias J" are the same person)
You need automatic structured fact extraction from raw text
You need cross-machine memory sharing via network API
You need multi-tenant memory banks with access control
You need temporal reasoning with automatic date extraction
You need the highest recall quality (4-way retrieval + cross-encoder rerank)
You're okay with Docker + PostgreSQL complexity as a trade-off for richer capabilities

Neither is "better." They serve different points on the simplicity-sophistication spectrum.

Known Gaps in Mnemosyne (honest list)

Gap	Severity	Workaround
No automatic entity normalization	Medium	`extract_entities=True` captures entities; fuzzy matching helps but doesn't resolve coreference
No cross-machine network API	Medium for multi-agent setups	Export/import JSON; same-machine sharing via a shared SQLite file. Mnemosyne can also import directly from Hindsight, so existing data can be migrated without loss
No cross-encoder reranking	Low for most queries	Hybrid scoring with configurable weights covers common cases
No automatic conflict detection	Medium	Manual `invalidate(memory_id, replacement_id=new_id)`
No multi-tenancy / access control	High for SaaS use cases	Per-bank SQLite isolation gives domain separation, but it does not enforce access control
No mental models / agentic reflection	Low	`sleep()` does consolidation; reasoning about knowledge is the caller's job
OpenClaw adapter not yet built	Medium for OpenClaw users	Hermes integration is native; OpenClaw requires MCP adapter work

This page was rewritten for v2.8.0 after community feedback about inaccurate comparisons. Every feature listed for Mnemosyne has been verified against the source code. If anything here is wrong, please open an issue — we'll fix it.