Mnemosyne vs Hindsight
An honest, technical comparison for users running memory systems locally with Hermes Agent and OpenClaw.
Last updated: 2026-05-13 · Mnemosyne v2.8.0
TL;DR: They are not direct competitors. Hindsight is a memory engine with sophisticated NLP and multi-signal retrieval. Mnemosyne is a memory layer optimized for simplicity, speed, and single-machine deployments. Choose based on what you actually need.
Architecture
| Dimension | Mnemosyne | Hindsight Self-Hosted |
|---|---|---|
| Process model | In-process Python library | Separate Docker containers (FastAPI + PostgreSQL) |
| IPC overhead | Zero (direct function calls) | HTTP + JSON serialization to localhost:8888 |
| Database | SQLite (single file, WAL mode) | PostgreSQL + pgvector extension |
| Embedding model | fastembed ONNX — BAAI/bge-small-en-v1.5 (~67MB) | sentence-transformers PyTorch (~500MB) |
| Vector search | sqlite-vec (int8/bit/float32) or numpy fallback | pgvector HNSW (mature, optimized) |
| Cold start | Instant (if models cached locally) | ~5–10s (Docker container boot + model loading) |
| Runtime memory | ~10–20MB per session (SQLite + ONNX) | ~100–300MB (PostgreSQL pool + PyTorch) |
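The "SQLite (single file, WAL mode)" entry refers to SQLite's write-ahead logging. As a rough illustration, this is how a single-file database can be switched to WAL with Python's standard sqlite3 module. The file name and the table are assumptions made for the example, not Mnemosyne's actual schema.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_wal_database(path: Path) -> sqlite3.Connection:
    """Open (or create) a SQLite file and switch it to write-ahead logging."""
    conn = sqlite3.connect(path)
    # WAL lets readers proceed while a writer is active; the mode persists in the file.
    conn.execute("PRAGMA journal_mode=WAL")
    return conn


# Illustrative usage with a throwaway file (hypothetical name, not Mnemosyne's layout).
tmp_dir = tempfile.mkdtemp()
conn = open_wal_database(Path(tmp_dir) / "memory.db")
journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
conn.execute("CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)")
conn.commit()
```

Because everything lives in one file, backing up or moving the store means copying that file (plus its -wal and -shm companions while a connection is open).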
Memory Model
Mnemosyne: BEAM (Bilevel Episodic-Associative Memory)
Three SQLite tables:
| Tier | Purpose | Behavior |
|---|---|---|
| Working memory | Hot, recent context | Auto-injected into prompts. TTL-based eviction (default 24h). Max 10,000 items. FTS5 indexed. |
| Episodic memory | Long-term consolidated storage | Populated by sleep() consolidation. Hybrid vector + FTS5 search. |
| Scratchpad | Temporary agent workspace | Not searchable, not consolidated. Cleared explicitly. Max 1,000 items. |
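The working-memory row combines two limits: a TTL (default 24h) and a cap of 10,000 items. A minimal sketch of how such eviction could be expressed over a SQLite table follows. The table name, column names, and the "drop oldest first" policy for the cap are assumptions for illustration, not Mnemosyne's real implementation.

```python
import sqlite3
import time

DEFAULT_TTL_SECONDS = 24 * 60 * 60  # 24h default from the table above
MAX_ITEMS = 10_000                  # working-memory cap from the table above


def evict(conn, now=None, ttl=DEFAULT_TTL_SECONDS, max_items=MAX_ITEMS):
    """Delete expired rows, then trim the oldest rows beyond the cap."""
    now = time.time() if now is None else now
    conn.execute("DELETE FROM working WHERE created_at < ?", (now - ttl,))
    conn.execute(
        """
        DELETE FROM working WHERE id NOT IN (
            SELECT id FROM working ORDER BY created_at DESC, id DESC LIMIT ?
        )
        """,
        (max_items,),
    )
    conn.commit()


def contents(conn):
    """Return remaining contents, newest first."""
    rows = conn.execute("SELECT content FROM working ORDER BY created_at DESC")
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE working (id INTEGER PRIMARY KEY, content TEXT, created_at REAL)"
)
now = 1_000_000.0  # fixed clock so the example is deterministic
conn.executemany(
    "INSERT INTO working (content, created_at) VALUES (?, ?)",
    [("fresh", now - 60), ("stale", now - 25 * 3600), ("recent", now - 3600)],
)
evict(conn, now=now)
after_ttl = contents(conn)       # "stale" is older than 24h and is removed
evict(conn, now=now, max_items=1)
after_cap = contents(conn)       # only the newest item survives a cap of 1
```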
Additional: TripleStore — temporal knowledge graph with valid_from/valid_until for point-in-time queries.
Core operations: remember(), recall(), sleep() — intentionally simple.
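To make the TripleStore's point-in-time idea concrete, here is a small sketch of a temporal triple table queried "as of" a date. Only the valid_from/valid_until column names come from the description above; the rest of the schema, the as_of helper, and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE triples (
        subject TEXT, predicate TEXT, object TEXT,
        valid_from TEXT NOT NULL,  -- ISO-8601 date, compares correctly as text
        valid_until TEXT           -- NULL means the fact is still valid
    )
    """
)
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?, ?, ?)",
    [
        ("alice", "works_at", "Acme", "2023-01-01", "2024-06-30"),
        ("alice", "works_at", "Globex", "2024-07-01", None),
    ],
)


def as_of(conn, subject, predicate, date):
    """Return the objects that were valid for (subject, predicate) on `date`."""
    rows = conn.execute(
        """
        SELECT object FROM triples
        WHERE subject = ? AND predicate = ?
          AND valid_from <= ?
          AND (valid_until IS NULL OR valid_until >= ?)
        ORDER BY valid_from
        """,
        (subject, predicate, date, date),
    )
    return [row[0] for row in rows]
```

With this layout, as_of(conn, "alice", "works_at", "2024-01-15") returns the earlier employer and a 2025 date returns the current one, without deleting history.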
Hindsight: Retain / Recall / Reflect
| Operation | Mnemosyne equivalent | Gap? |
|---|---|---|
| Retain — LLM-driven fact extraction, entity normalization, 2–5 structured facts per chunk | remember() stores raw text + optional embedding. extract=True enables LLM fact extraction. | Partial gap. Mnemosyne can extract facts via LLM but has no automatic entity normalization. |
| Recall — 4-way parallel (semantic + BM25 + graph + temporal), RRF fusion, cross-encoder rerank | recall() — hybrid (vector + FTS5 + importance) × recency decay. Single-pass. | Design difference. Mnemosyne is simpler by choice. |
| Reflect — Agentic loop with tool calling, mental models, disposition traits | sleep() — LLM summarization of working → episodic. No agentic loop, no mental models. | Gap. Mnemosyne does consolidation, not reasoning about knowledge. |
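For readers unfamiliar with the RRF fusion mentioned in the Recall row, the following is a generic sketch of Reciprocal Rank Fusion over four ranked lists. It illustrates the technique only: the constant k=60 is the conventional default from the RRF literature, and the memory IDs are made up. Hindsight's actual parameters and code are not shown here.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of IDs: each list contributes 1 / (k + rank) per ID."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Four hypothetical retrieval strategies, each returning its own ranking.
semantic = ["m3", "m1", "m7"]
bm25 = ["m1", "m3", "m9"]
graph = ["m1", "m4"]
temporal = ["m7", "m1"]

fused = rrf_fuse([semantic, bm25, graph, temporal])
```

An item that appears in every list (m1 here) rises to the top even when no single strategy ranked it first everywhere, which is the main appeal of rank-based fusion over raw score mixing.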
Retrieval
| Feature | Mnemosyne | Hindsight Self-Hosted |
|---|---|---|
| Vector search | sqlite-vec (cosine distance) | pgvector HNSW |
| Keyword search | SQLite FTS5 | PostgreSQL full-text + BM25 |
| Graph search | TripleStore (subject-predicate-object, temporal) | Native knowledge graph with co-occurrence tracking |
| Temporal search | temporal_weight + temporal_halflife params on recall() | Native date parsing, occurred_start/end, temporal recall strategy |
| Scoring formula | vec_weight × vec + fts_weight × fts + importance_weight × importance, then × recency decay | 4-way parallel retrieval → RRF fusion → cross-encoder rerank |
| Default weights | 50% vector, 30% FTS, 20% importance | Learned fusion weights |
| Configurable? | Yes — per-query vec_weight, fts_weight, importance_weight params | Yes — configurable strategies |
| Reranking | None (single-pass) | Cross-encoder rerank |
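The Mnemosyne scoring formula in the table can be written out in a few lines. This is a sketch under stated assumptions: the default weights are the ones listed above, but the exponential half-life form of the recency decay, the 24-hour half-life, and the assumption that all inputs are normalized to [0, 1] are illustrative choices, not confirmed details of recall().

```python
def hybrid_score(vec, fts, importance, age_hours,
                 vec_weight=0.5, fts_weight=0.3, importance_weight=0.2,
                 halflife_hours=24.0):
    """Weighted sum of the three signals, multiplied by a recency decay."""
    base = vec_weight * vec + fts_weight * fts + importance_weight * importance
    # Assumed decay shape: the score halves every `halflife_hours`.
    decay = 0.5 ** (age_hours / halflife_hours)
    return base * decay


# Hypothetical candidates: (vector sim, FTS score, importance, age in hours).
candidates = {
    "old_but_exact": (0.9, 0.9, 0.5, 72.0),
    "fresh_partial": (0.6, 0.4, 0.5, 1.0),
}
ranked = sorted(candidates, key=lambda name: hybrid_score(*candidates[name]),
                reverse=True)
```

Under these assumptions the fresher, weaker match outranks the older, stronger one, which shows how much the decay term can dominate. Raising the half-life or shifting weight toward vec_weight changes that balance per query.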
Entity Extraction (v2)
| Feature | Mnemosyne | Hindsight Self-Hosted |
|---|---|---|
| Method | Regex patterns + pure Python Levenshtein distance | spaCy NLP pipeline + LLM extraction |
| Patterns | @mentions, #hashtags, "quoted phrases", capitalized sequences (2–5 words) | Full NLP: named entities, noun phrases, coreference |
| Fuzzy matching | Levenshtein distance with prefix/substring bonuses | Trigram/full resolution strategies. Entity co-occurrence tracking. |
| Storage | TripleStore triples: (memory_id, "mentions", "entity_name") | Structured entity table with normalization |
| Speed | ~0.01ms per extraction | Heavier (spaCy model loading + inference) |
| Opt-in? | extract_entities=True on remember() | Always on |
Verdict: Mnemosyne's regex approach is fast and dependency-free but misses many entity types that spaCy catches. This is a deliberate trade-off: speed and simplicity over NLP accuracy.
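The regex-plus-Levenshtein approach is small enough to sketch in full. The patterns below approximate the four categories listed in the table (@mentions, #hashtags, quoted phrases, capitalized sequences of 2 to 5 words); the exact expressions and function names are assumptions for illustration and will differ from Mnemosyne's own.

```python
import re

MENTION = re.compile(r"@(\w+)")
HASHTAG = re.compile(r"#(\w+)")
QUOTED = re.compile(r'"([^"]+)"')
# Two to five consecutive capitalized words, e.g. "Grace Brewster Hopper".
CAPITALIZED = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+){1,4}\b")


def extract_entities(text):
    """Return unique entities in pattern order, preserving first occurrence."""
    found = []
    for pattern in (MENTION, HASHTAG, QUOTED, CAPITALIZED):
        for match in pattern.finditer(text):
            # Patterns with a capture group strip the sigil or the quotes.
            entity = match.group(1) if pattern.groups else match.group(0)
            if entity not in found:
                found.append(entity)
    return found


def levenshtein(a, b):
    """Pure-Python edit distance using two rolling rows."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


sample = 'Ping @abdias about #mnemosyne and the "memory bank" design with Grace Brewster Hopper.'
entities = extract_entities(sample)
```

The sketch also shows the limitation named in the verdict: a lowercase name, a single capitalized word, or a pronoun referring back to an entity is simply not matched, and edit distance can say that "Abdias" and "Abdias J" are close without deciding that they are the same person.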
Fact Extraction (v2)
| Feature | Mnemosyne | Hindsight Self-Hosted |
|---|---|---|
| Method | LLM-driven: sends text to LLM, parses 2–5 factual statements | LLM-driven Retain pipeline with provenance tracking |
| Fallback chain | Remote OpenAI-compatible API → local ctransformers GGUF → skip (graceful) | N/A (runs inside container) |
| Storage | TripleStore: (memory_id, "fact", fact_text) | Structured fact table with evidence tracking |
| Opt-in? | extract=True on remember() | Always on via Retain |
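The fallback chain (remote API, then local model, then a graceful skip) follows a common pattern that can be sketched generically. The two backends below are stand-ins written for this example; they do not call a real API or a real GGUF model, and the function names are hypothetical.

```python
from typing import Callable, List, Sequence

Extractor = Callable[[str], List[str]]


def extract_with_fallback(text: str, extractors: Sequence[Extractor],
                          max_facts: int = 5) -> List[str]:
    """Try each backend in order; return [] if none produces facts."""
    for extractor in extractors:
        try:
            facts = extractor(text)
        except Exception:
            continue  # backend unavailable or failed: fall through to the next
        if facts:
            return facts[:max_facts]
    return []  # graceful skip: the memory is stored without extracted facts


def remote_api(text: str) -> List[str]:
    """Stand-in for a remote OpenAI-compatible call that is unreachable."""
    raise ConnectionError("remote endpoint unreachable")


def local_model(text: str) -> List[str]:
    """Stand-in for a local model: naive sentence split, not real inference."""
    return [part.strip() for part in text.split(".") if part.strip()]


text = "Alice moved to Berlin. She works at Acme."
facts = extract_with_fallback(text, [remote_api, local_model])
skipped = extract_with_fallback(text, [remote_api])
```

The key property is that a failed extraction never blocks the write: the caller gets an empty list instead of an exception.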
Integrations
MCP (Model Context Protocol) — v2
Mnemosyne provides an MCP server with 6 tools and 2 transports:
| Tool | Description |
|---|---|
| mnemosyne_remember | Store a memory (supports entity extraction, fact extraction, bank selection) |
| mnemosyne_recall | Search memories with hybrid scoring and configurable weights |
| mnemosyne_sleep | Run consolidation cycle |
| mnemosyne_scratchpad_read | Read agent scratchpad |
| mnemosyne_scratchpad_write | Write to scratchpad |
| mnemosyne_get_stats | Get memory statistics |
mnemosyne mcp # stdio transport (Claude Desktop, etc.)
mnemosyne mcp --transport sse --port 8080 # SSE transport (web clients)
mnemosyne mcp --bank project_a # scoped to a specific bank
Hermes Agent Integration
15 tools and 3 hooks via plugin.yaml.
Hooks: pre_llm_call (context injection), on_session_start (session init), post_tool_call (memory capture)
Hindsight Integration
Custom HTTP API on port 8888. Native openclaw-hindsight plugin exists for OpenClaw. Hermes integration via HTTP client.
| Integration | Mnemosyne | Hindsight |
|---|---|---|
| Hermes | Native (in-process, no serialization) | HTTP client |
| OpenClaw | Planned (adapter not yet built) | Native plugin exists |
| MCP | 6 tools, stdio + SSE | Custom HTTP API |
| Cross-machine | Export/import JSON only | Any agent with HTTP access to port 8888 |
Memory Banks (v2)
| Feature | Mnemosyne | Hindsight |
|---|---|---|
| Named banks | BankManager — create, list, delete, rename banks | banks table with strict isolation |
| Isolation | Per-bank SQLite file under data_dir/banks/<name>/ | PostgreSQL schema-level isolation |
| Usage | Mnemosyne(bank="work") or mnemosyne mcp --bank work | API-level bank selection |
| Multi-tenancy | No access control | HindClaw extension (JWT/API key multi-tenancy) |
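Per-bank isolation by file is easy to picture with a short sketch. The data_dir/banks/<name>/ layout comes from the table above; the database file name, the name validation rule, and the helper functions are assumptions made for this example and are not the BankManager API.

```python
import re
import sqlite3
import tempfile
from pathlib import Path

BANK_NAME = re.compile(r"[A-Za-z0-9_-]+")


def bank_db_path(data_dir: Path, bank: str) -> Path:
    """Map a bank name to its own SQLite file under data_dir/banks/<name>/."""
    if not BANK_NAME.fullmatch(bank):
        # Reject separators and dots so a name cannot escape the banks directory.
        raise ValueError(f"invalid bank name: {bank!r}")
    return data_dir / "banks" / bank / "memory.db"  # file name is hypothetical


def open_bank(data_dir: Path, bank: str) -> sqlite3.Connection:
    """Create the bank directory if needed and open its private database."""
    path = bank_db_path(data_dir, bank)
    path.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memories (id INTEGER PRIMARY KEY, content TEXT)"
    )
    return conn


data_dir = Path(tempfile.mkdtemp())
work = open_bank(data_dir, "work")
personal = open_bank(data_dir, "personal")
work.execute("INSERT INTO memories (content) VALUES (?)", ("report due Friday",))
work.commit()
work_count = work.execute("SELECT COUNT(*) FROM memories").fetchone()[0]
personal_count = personal.execute("SELECT COUNT(*) FROM memories").fetchone()[0]
```

This kind of isolation separates data, not users: anyone who can read the directory can read every bank, which is why the table lists "No access control" for multi-tenancy.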
Additional Features (v2)
Mnemosyne-specific
| Feature | Module | Description |
|---|---|---|
| Streaming | core/streaming.py | MemoryStream with push (callbacks) and pull (iterator) patterns. Thread-safe event buffer. |
| Delta sync | core/streaming.py | DeltaSync — incremental synchronization between Mnemosyne instances with checkpointed resume. |
| Pattern detection | core/patterns.py | PatternDetector — temporal (hour/weekday), content (keyword frequency, co-occurrence), sequence patterns. |
| Memory compression | core/patterns.py | MemoryCompressor — dictionary-based, RLE, and semantic compression strategies. |
| Plugin system | core/plugins.py | MnemosynePlugin base class with 4 lifecycle hooks. Discovers plugins from ~/.hermes/mnemosyne/plugins/. |
| Diagnostics | diagnose.py | PII-safe health check — dependencies, database state, vector readiness. No memory content or API keys. |
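The push and pull patterns named in the Streaming row can be illustrated with a small thread-safe buffer. The class below is a hypothetical sketch written for this page; its name and methods are not MemoryStream's real interface.

```python
import threading
from collections import deque


class EventStream:
    """Hypothetical push/pull event buffer guarded by a lock."""

    def __init__(self, maxlen=1000):
        self._lock = threading.Lock()
        self._buffer = deque(maxlen=maxlen)  # oldest events drop when full
        self._callbacks = []

    def subscribe(self, callback):
        """Push pattern: register a callable invoked on every new event."""
        with self._lock:
            self._callbacks.append(callback)

    def publish(self, event):
        """Buffer the event, then notify subscribers outside the lock."""
        with self._lock:
            self._buffer.append(event)
            callbacks = list(self._callbacks)
        for callback in callbacks:
            callback(event)

    def drain(self):
        """Pull pattern: yield buffered events oldest first, removing them."""
        while True:
            with self._lock:
                if not self._buffer:
                    return
                event = self._buffer.popleft()
            yield event


stream = EventStream()
seen = []
stream.subscribe(seen.append)
events = [{"op": "remember", "id": 1}, {"op": "sleep", "id": 2}]
for event in events:
    stream.publish(event)
pulled = list(stream.drain())
```

Callbacks run outside the lock so that a slow subscriber cannot block other publishers, at the cost of not guaranteeing callback ordering across threads.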
Hindsight-specific (not in Mnemosyne)
| Feature | Description |
|---|---|
| Automatic entity normalization | "Abdias" and "Abdias J" resolved to same entity automatically |
| Cross-encoder reranking | Second-pass neural reranking of retrieval results |
| Mental models | Agent reasoning about user preferences and traits |
| Agentic reflection | Tool-calling loop during Reflect phase |
| Conflict detection | Automatic contradiction detection and merging |
| Multi-machine sharing | Network API for distributed agents |
| Multi-tenancy | Per-user isolation with access control via HindClaw |
Performance Characteristics
| Metric | Mnemosyne | Hindsight Self-Hosted |
|---|---|---|
| Recall latency (10K corpus) | ~2–10ms — in-process SQLite + sqlite-vec, no HTTP overhead | ~50–200ms — HTTP round-trip + PostgreSQL + 4-way retrieval + rerank |
| IPC model | Direct Python function call | HTTP POST to localhost:8888 → JSON serialization → response parsing |
| Storage footprint | ~50–100MB SQLite file per 10K memories | ~200–500MB PostgreSQL + WAL per 10K memories |
| Model download | One-time ~67MB (fastembed ONNX) | One-time ~500MB (sentence-transformers PyTorch) |
| Runtime memory | ~10–20MB per session | ~100–300MB (PostgreSQL pool + PyTorch runtime) |
Important caveat on latency numbers: Mnemosyne's latency advantage comes from being an in-process library calling SQLite directly, compared to HTTP round-trips to a local Docker container. This is an architectural advantage, not a retrieval-quality advantage.
When to Choose What
Choose Mnemosyne if:
- You want pip install with zero containers
- You need the fastest possible recall latency for interactive agent loops
- You're running on a resource-constrained environment (VPS, ephemeral VM, CI)
- You're building a single-user, single-machine agent (Hermes, Claude Desktop, etc.)
- You want an MCP-compatible memory layer (stdio + SSE)
- You want full control over the memory model and don't need automatic "magic"
- You want memory banks with per-bank SQLite isolation without standing up PostgreSQL
Choose Hindsight Self-Hosted if:
- You need entity resolution ("Abdias" and "Abdias J" are the same person)
- You need automatic structured fact extraction from raw text
- You need cross-machine memory sharing via network API
- You need multi-tenant memory banks with access control
- You need temporal reasoning with automatic date extraction
- You need the highest recall quality (4-way retrieval + cross-encoder rerank)
- You're okay with Docker + PostgreSQL complexity as a trade-off for richer capabilities
Neither is "better." They serve different points on the simplicity-sophistication spectrum.
Known Gaps in Mnemosyne (honest list)
| Gap | Severity | Workaround |
|---|---|---|
| No automatic entity normalization | Medium | extract_entities=True captures entities; fuzzy matching helps but doesn't resolve coreference |
| No cross-machine network API | Medium for multi-agent setups | Export/import JSON; same-machine sharing via a shared SQLite file. Hindsight data can also be imported directly, so migrating does not lose data. |
| No cross-encoder reranking | Low for most queries | Hybrid scoring with configurable weights covers common cases |
| No automatic conflict detection | Medium | Manual invalidate(memory_id, replacement_id=new_id) |
| No multi-tenancy / access control | High for SaaS use cases | Use per-bank SQLite isolation for domain separation |
| No mental models / agentic reflection | Low | sleep() does consolidation; reasoning about knowledge is the caller's job |
| OpenClaw adapter not yet built | Medium for OpenClaw users | Hermes integration is native; OpenClaw requires MCP adapter work |
This page was rewritten for v2.8.0 after community feedback about inaccurate comparisons. Every feature listed for Mnemosyne has been verified against the source code. If anything here is wrong, please open an issue — we'll fix it.