# Configuration

Mnemosyne is configured via environment variables and an optional Hermes `config.yaml` file. All settings have sensible defaults, so you can start using it with zero configuration.
## Environment Variables

### Storage & Memory Tiers

| Variable | Default | Description |
|---|---|---|
| `MNEMOSYNE_DATA_DIR` | `~/.hermes/mnemosyne/data` | Root data directory for SQLite database files |
| `MNEMOSYNE_WM_MAX_ITEMS` | 10000 | Maximum items in working memory (per session) |
| `MNEMOSYNE_WM_TTL_HOURS` | 24 | Working memory TTL in hours |
| `MNEMOSYNE_EP_LIMIT` | 50000 | Episodic recall scan limit |
| `MNEMOSYNE_SLEEP_BATCH` | 5000 | Sleep/consolidation batch size |
| `MNEMOSYNE_SP_MAX` | 1000 | Maximum scratchpad entries |
| `MNEMOSYNE_RECENCY_HALFLIFE` | 168 | Recency decay half-life in hours (168 = 1 week) |
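For example, these settings can be tuned with shell exports before launching the agent (the values below are illustrative, not recommendations):

```shell
# Smaller working memory with a faster recency decay
export MNEMOSYNE_DATA_DIR="$HOME/.hermes/mnemosyne/data"
export MNEMOSYNE_WM_MAX_ITEMS=5000
export MNEMOSYNE_WM_TTL_HOURS=12
export MNEMOSYNE_RECENCY_HALFLIFE=72   # 3 days instead of 1 week
```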
### Vector & Retrieval Weights

Mnemosyne uses hybrid scoring (vector + FTS5 + importance) for recall. These weights are configurable and auto-normalized per query.

| Variable | Default | Description |
|---|---|---|
| `MNEMOSYNE_VEC_TYPE` | int8 | Vector quantization: float32, int8, or bit |
| `MNEMOSYNE_VEC_WEIGHT` | 0.5 | Vector similarity weight in hybrid scoring |
| `MNEMOSYNE_FTS_WEIGHT` | 0.3 | FTS5 text relevance weight |
| `MNEMOSYNE_IMPORTANCE_WEIGHT` | 0.2 | Importance score weight |
| `MNEMOSYNE_TEMPORAL_HALFLIFE_HOURS` | 24 | Temporal boost half-life in hours |
| `MNEMOSYNE_BEAM_OPTIMIZATIONS` | false | Enable BEAM benchmark optimizations (OR semantics, larger scans) |
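A rough sketch of how per-query weight normalization might combine the three signals (this mirrors the documented defaults but is not Mnemosyne's actual implementation):

```python
def hybrid_score(vec_sim: float, fts_rel: float, importance: float,
                 w_vec: float = 0.5, w_fts: float = 0.3,
                 w_imp: float = 0.2) -> float:
    """Illustrative hybrid recall score: weights are normalized to sum to 1,
    so overriding one weight rescales the others proportionally."""
    total = w_vec + w_fts + w_imp
    w_vec, w_fts, w_imp = w_vec / total, w_fts / total, w_imp / total
    return w_vec * vec_sim + w_fts * fts_rel + w_imp * importance
```

With the defaults, a candidate scoring 0.8 on vector similarity, 0.4 on FTS5 relevance, and 0.5 on importance would land at 0.62.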
### Tiered Degradation

Episodic memories degrade through three tiers over time: Tier 1 keeps full content, Tier 2 is an LLM-generated summary, and Tier 3 is an entity-extracted compressed signal.

| Variable | Default | Description |
|---|---|---|
| `MNEMOSYNE_TIER2_DAYS` | 30 | Days before Tier 1 degrades to Tier 2 |
| `MNEMOSYNE_TIER3_DAYS` | 180 | Days before Tier 2 degrades to Tier 3 |
| `MNEMOSYNE_TIER1_WEIGHT` | 1.0 | Recall score multiplier for Tier 1 memories |
| `MNEMOSYNE_TIER2_WEIGHT` | 0.5 | Recall score multiplier for Tier 2 memories |
| `MNEMOSYNE_TIER3_WEIGHT` | 0.25 | Recall score multiplier for Tier 3 memories |
| `MNEMOSYNE_DEGRADE_BATCH` | 100 | Max memories per degradation cycle |
| `MNEMOSYNE_SMART_COMPRESS` | true | Enable entity-aware sentence extraction for Tier 3 |
| `MNEMOSYNE_TIER3_MAX_CHARS` | 300 | Max characters for Tier 3 compressed content |
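The tier thresholds can be pictured as a simple age check. This sketch assumes both thresholds are measured from the memory's creation time; whether `MNEMOSYNE_TIER3_DAYS` counts from creation or from entering Tier 2 is an assumption here:

```python
TIER2_DAYS = 30   # MNEMOSYNE_TIER2_DAYS
TIER3_DAYS = 180  # MNEMOSYNE_TIER3_DAYS

def tier_for_age(age_days: float) -> int:
    """Illustrative tier selection based on a memory's age in days."""
    if age_days >= TIER3_DAYS:
        return 3  # entity-extracted compressed signal
    if age_days >= TIER2_DAYS:
        return 2  # LLM-summarized
    return 1      # full content
```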
### Veracity Weights

Memories carry veracity labels that affect their recall weight. These multipliers tune how strongly each veracity level influences retrieval.

| Variable | Default | Description |
|---|---|---|
| `MNEMOSYNE_STATED_WEIGHT` | 1.0 | Recall multiplier for explicitly stated facts |
| `MNEMOSYNE_INFERRED_WEIGHT` | 0.7 | Recall multiplier for inferred facts |
| `MNEMOSYNE_TOOL_WEIGHT` | 0.5 | Recall multiplier for tool-generated output |
| `MNEMOSYNE_IMPORTED_WEIGHT` | 0.6 | Recall multiplier for imported facts |
| `MNEMOSYNE_UNKNOWN_WEIGHT` | 0.8 | Recall multiplier for uncategorized/legacy facts |
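Conceptually, these multipliers act as a lookup applied to a candidate's recall score. A minimal sketch (the label strings here are assumptions based on the variable names, not Mnemosyne's actual internals):

```python
VERACITY_WEIGHTS = {
    "stated": 1.0,    # MNEMOSYNE_STATED_WEIGHT
    "inferred": 0.7,  # MNEMOSYNE_INFERRED_WEIGHT
    "tool": 0.5,      # MNEMOSYNE_TOOL_WEIGHT
    "imported": 0.6,  # MNEMOSYNE_IMPORTED_WEIGHT
    "unknown": 0.8,   # MNEMOSYNE_UNKNOWN_WEIGHT
}

def apply_veracity(base_score: float, veracity: str) -> float:
    """Scale a recall score by the memory's veracity multiplier;
    unrecognized labels fall back to the 'unknown' weight."""
    return base_score * VERACITY_WEIGHTS.get(veracity, VERACITY_WEIGHTS["unknown"])
```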
### Embeddings

Mnemosyne uses `BAAI/bge-small-en-v1.5` via fastembed (384 dimensions, local ONNX) for embedding generation. No external API key is needed.

| Variable | Default | Description |
|---|---|---|
| `MNEMOSYNE_EMBEDDING_MODEL` | `BAAI/bge-small-en-v1.5` | Embedding model (local fastembed or `openai/*` API) |
### Local LLM (Sleep Summarization)

The sleep/consolidation cycle can use a local or remote LLM to summarize grouped working memories before promoting them to episodic memory.

| Variable | Default | Description |
|---|---|---|
| `MNEMOSYNE_LLM_ENABLED` | true | Enable LLM-based summarization during sleep |
| `MNEMOSYNE_LLM_BASE_URL` | — | OpenAI-compatible API base URL for remote LLM |
| `MNEMOSYNE_LLM_API_KEY` | — | API key for the remote LLM endpoint |
| `MNEMOSYNE_LLM_MODEL` | — | Model name for the remote LLM endpoint |
| `MNEMOSYNE_LLM_MAX_TOKENS` | 2048 | Max output tokens for generated summaries |
| `MNEMOSYNE_LLM_N_THREADS` | 4 | CPU threads for local LLM inference |
| `MNEMOSYNE_LLM_N_CTX` | 2048 | Context window size for the local LLM |
| `MNEMOSYNE_LLM_REPO` | TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF | GGUF model repository |
| `MNEMOSYNE_LLM_FILE` | tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf | GGUF model filename |
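For example, to route sleep summarization to a remote OpenAI-compatible endpoint instead of the local GGUF model (the URL, key, and model name below are placeholders):

```shell
export MNEMOSYNE_LLM_ENABLED=true
export MNEMOSYNE_LLM_BASE_URL="https://llm.example.com/v1"  # placeholder endpoint
export MNEMOSYNE_LLM_API_KEY="sk-your-key-here"             # placeholder key
export MNEMOSYNE_LLM_MODEL="your-model-name"                # placeholder model
export MNEMOSYNE_LLM_MAX_TOKENS=2048
```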
### Host LLM Backend (Hermes Integration)

When running inside Hermes, the host agent can provide an LLM backend for consolidation, avoiding duplicate API calls.

| Variable | Default | Description |
|---|---|---|
| `MNEMOSYNE_HOST_LLM_ENABLED` | false | Route LLM calls through Hermes' authenticated client |
| `MNEMOSYNE_HOST_LLM_PROVIDER` | — | Provider name override (e.g. openai-codex) |
| `MNEMOSYNE_HOST_LLM_MODEL` | — | Model name override |
| `MNEMOSYNE_HOST_LLM_N_CTX` | 32000 | Prompt context budget for host LLM path |
### Multi-Agent Identity

Filter memories by author and channel for multi-agent or multi-tenant deployments.

| Variable | Default | Description |
|---|---|---|
| `MNEMOSYNE_AUTHOR_ID` | — | Default author identifier |
| `MNEMOSYNE_AUTHOR_TYPE` | — | Author category: human, agent, or system |
| `MNEMOSYNE_CHANNEL_ID` | — | Default channel identifier for cross-session memory |
### Auto-Sleep & SHMR

| Variable | Default | Description |
|---|---|---|
| `MNEMOSYNE_AUTO_SLEEP_ENABLED` | false | Enable auto-sleep after N turns |
| `MNEMOSYNE_SHMR_BATCH_SIZE` | 50 | SHMR batch size for harmony cycles |
| `MNEMOSYNE_SHMR_MAX_ITERATIONS` | 3 | SHMR max iterations per cluster |
| `MNEMOSYNE_SHMR_SIMILARITY_THRESHOLD` | 0.70 | SHMR cosine similarity threshold |
| `MNEMOSYNE_SHMR_HARMONY_THRESHOLD` | 0.60 | SHMR harmony score threshold |
| `MNEMOSYNE_LOG_TOOLS` | false | Opt-in tool call logging in hermes_plugin |
| `MNEMOSYNE_MCP_BANK` | default | Default bank for MCP server |
## YAML Configuration

Mnemosyne integrates with the Hermes `config.yaml` file (not a standalone `mnemosyne.yaml`). Configure settings under the `memory.mnemosyne` key:

```yaml
# config.yaml (Hermes project root)
memory:
  mnemosyne:
    # Auto-sleep consolidation
    auto_sleep: false        # Auto-run sleep() when working memory exceeds threshold
    sleep_threshold: 50      # Working memory count before auto-sleep triggers

    # Vector configuration
    vector_type: int8        # Quantization: float32, int8, or bit

    # Pattern filtering
    ignore_patterns:         # Content patterns to skip during remember()
      - "be ACTIVE"          # Skill refinement boilerplate
      - "nothing to change"  # No-op responses
      - "skill.*refined"     # Wildcard match
```

Config file lookup order (first found wins):

1. `./config.yaml` (current working directory)
2. `~/.hermes/mnemosyne/config.yaml` (user-level)
3. Environment variables (fallback for any setting not in `config.yaml`)
## Programmatic Configuration

No config file is needed. You can create a Mnemosyne instance directly in Python:

```python
from mnemosyne import Mnemosyne

mem = Mnemosyne(
    session_id="my-agent",      # Scope working memories to this session
    db_path="./my-memory.db",   # Custom SQLite database path (optional)
    bank="work",                # Named memory bank for isolation
    author_id="alice",          # Multi-agent identity
    author_type="human",        # "human", "agent", or "system"
    channel_id="team-slack",    # Cross-session channel
)
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `session_id` | `str` | `"default"` | Session identifier for scoping working memories |
| `db_path` | `str \| None` | `None` | Path to the SQLite database file (auto-created if `None`) |
| `bank` | `str \| None` | `None` | Named memory bank; each bank gets its own isolated SQLite file under `data_dir/banks/` |
| `author_id` | `str \| None` | `None` | Author identifier for multi-agent memory scoping |
| `author_type` | `str \| None` | `None` | Author category: `"human"`, `"agent"`, or `"system"` |
| `channel_id` | `str \| None` | `None` | Channel identifier for cross-session shared memory |
Mnemosyne requires no external API keys for basic usage. Embeddings use `BAAI/bge-small-en-v1.5` via fastembed (384-dimensional, local ONNX runtime). The sleep/consolidation cycle defaults to a local TinyLlama GGUF model. You only need API keys if you configure a remote LLM endpoint via `MNEMOSYNE_LLM_BASE_URL`.