BEAM Architecture Overview

BEAM (Biologically-Inspired Episodic-Associative Memory) is the core architecture powering Mnemosyne. It models human memory organization with three memory tiers plus a TripleStore knowledge graph, each optimized for different retention and retrieval patterns.

The Three Memory Tiers + TripleStore



graph TB
  subgraph Input["Agent Input"]
      I1[Observation]
      I2[Conversation]
      I3[Tool Result]
  end

  subgraph BEAM["BEAM Memory System"]
      WM[Working Memory
      Short-term, up to 10K entries
      <100ms access]
      EM[Episodic Memory
      Long-term experiences
      Hybrid retrieval]
      SP[Scratchpad
      Temporary workspace
      Up to 1K entries]
  end

  I1 -->|remember| WM
  I2 -->|remember| WM
  I3 -->|remember| WM
  WM -->|sleep consolidation| EM
  EM -->|recall| Agent

  style WM fill:#e0f2fe,stroke:#0284c7
  style EM fill:#fef3c7,stroke:#d97706
  style SP fill:#faf5ff,stroke:#9333ea

Working Memory

The agent's immediate context window. Holds recent observations, conversation history, and active tool results. Fastest access (<100ms), limited capacity (up to 10,000 entries), and subject to eviction via sleep consolidation.

Key characteristics:

  • Latency: Sub-100ms median
  • Capacity: 10,000 entries by default (configurable via MNEMOSYNE_WM_MAX_ITEMS)
  • TTL: 24 hours by default (configurable via MNEMOSYNE_WM_TTL_HOURS)
  • Persistence: Survives process restarts via SQLite

Episodic Memory

Long-term storage for experiences, conversations, and events. Organized chronologically with rich metadata. Uses hybrid retrieval combining vector similarity and full-text search.

Key characteristics:

  • Retention: Permanent (until explicit deletion)
  • Retrieval: Hybrid (vector + FTS5) with configurable scoring weights (vec_weight, fts_weight, importance_weight, defaulting to 50/30/20)
  • Temporal decay: Tunable via temporal_weight and temporal_halflife parameters on recall()
  • Consolidation: Promoted from Working Memory via sleep()

TripleStore (Knowledge Graph)

Structured knowledge stored as subject-predicate-object triples. Managed by the separate TripleStore class (not part of the main Mnemosyne class).

Key characteristics:

  • Format: Subject-predicate-object triples with confidence and validity timestamps
  • Query: Filter by subject, predicate, or object via TripleStore.query()
  • Access: Separate TripleStore class or module-level add_triple() / query_triples() helpers
  • Schema: Flexible, no predefined ontology
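
The filter-by-any-position query behavior described above can be sketched in a few lines. This is an illustrative stand-in, not the real TripleStore class; field names beyond subject/predicate/object (e.g. confidence) follow the characteristics listed on this page:

```python
from dataclasses import dataclass

@dataclass
class Triple:
    subject: str
    predicate: str
    object: str
    confidence: float = 1.0

class MiniTripleStore:
    def __init__(self):
        self.triples: list[Triple] = []

    def add(self, s: str, p: str, o: str, confidence: float = 1.0) -> None:
        self.triples.append(Triple(s, p, o, confidence))

    def query(self, subject=None, predicate=None, object=None):
        # None means "match anything" for that position.
        return [t for t in self.triples
                if (subject is None or t.subject == subject)
                and (predicate is None or t.predicate == predicate)
                and (object is None or t.object == object)]
```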

Scratchpad

A temporary workspace for reasoning, planning, and intermediate computations. Analogous to human working memory's manipulation buffer.

Key characteristics:

  • Capacity: 1,000 entries by default (configurable via MNEMOSYNE_SP_MAX)
  • Use case: Chain-of-thought, planning, scratch calculations
  • API: scratchpad_write(), scratchpad_read(), scratchpad_clear()
  • Lifetime: Session-bound
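
A minimal sketch of a bounded, session-bound scratchpad in the spirit of the scratchpad_write() / scratchpad_read() / scratchpad_clear() API named above. The drop-oldest eviction policy is an assumption:

```python
from collections import deque

class MiniScratchpad:
    def __init__(self, max_entries: int = 1000):
        # deque(maxlen=...) silently drops the oldest entry when full.
        self.entries = deque(maxlen=max_entries)

    def write(self, thought: str) -> None:
        self.entries.append(thought)

    def read(self) -> list[str]:
        return list(self.entries)

    def clear(self) -> None:
        self.entries.clear()
```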

Data Flow



sequenceDiagram
  participant Agent
  participant WM as Working Memory
  participant SP as Scratchpad
  participant EM as Episodic Memory
  participant SM as TripleStore (Knowledge Graph)

  Agent->>WM: remember(content)
  WM->>WM: store + index
  Agent->>SP: scratchpad_write(thought)
  SP->>SP: store
  Agent->>EM: recall(query)
  EM->>Agent: return results
  Agent->>SM: add_triple(s, p, o)
  SM->>SM: store triple
  Agent->>SM: query_triples(s, p, o)
  SM->>Agent: return triples

Sleep Consolidation

Call mem.sleep() to run a consolidation cycle that:

  1. Selects Working Memory entries older than TTL/2 (12 hours by default)
  2. Groups candidates by source
  3. Summarizes each group via LLM or AAAK text substitution
  4. Promotes summaries to Episodic Memory
  5. Evicts the original Working Memory entries

Sleep is triggered explicitly — it does not run automatically on a timer.
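
The five steps can be sketched as a single pass over Working Memory. This assumes each entry is a dict with "content", "source", and "ts" keys (the real entry schema is not shown on this page), and `summarize` stands in for the LLM step:

```python
import time
from collections import defaultdict

def sleep(working_memory: list[dict], ttl_hours: float = 24.0,
          summarize=lambda texts: " | ".join(texts), now=None):
    now = time.time() if now is None else now
    cutoff = ttl_hours / 2 * 3600                       # TTL/2, 12h by default
    old = [e for e in working_memory
           if now - e["ts"] > cutoff]                   # 1. select candidates
    groups = defaultdict(list)
    for e in old:
        groups[e["source"]].append(e["content"])        # 2. group by source
    episodic = [summarize(c) for c in groups.values()]  # 3+4. summarize, promote
    remaining = [e for e in working_memory
                 if e not in old]                       # 5. evict originals
    return episodic, remaining
```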

V2 Features

Mnemosyne v2.8.0 builds on the core BEAM architecture with these capabilities:

Entity Extraction

remember() accepts extract_entities=True to automatically detect named entities (mentions, hashtags, proper nouns) using regex patterns and Levenshtein fuzzy matching. Extracted entities are stored as TripleStore triples.
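
The mention and hashtag cases reduce to simple regex scans. A sketch under that assumption; Mnemosyne's actual patterns, proper-noun handling, and Levenshtein fuzzy-matching step are not shown on this page:

```python
import re

MENTION = re.compile(r"@(\w+)")   # e.g. @alice
HASHTAG = re.compile(r"#(\w+)")   # e.g. #beam

def extract_entities(text: str) -> dict[str, list[str]]:
    """Return the mention and hashtag entities found in text."""
    return {
        "mentions": MENTION.findall(text),
        "hashtags": HASHTAG.findall(text),
    }
```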

Fact Extraction

remember() accepts extract=True to send content through an LLM pipeline that extracts 2–5 structured factual statements, stored as TripleStore triples. The extraction pipeline gracefully falls back through remote API → local GGUF → skip.
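
The remote API → local GGUF → skip fallback order can be sketched as a try-each chain. The two extractor callables here are placeholders; their real signatures in Mnemosyne are not documented on this page:

```python
def extract_facts(content: str, remote=None, local=None) -> list[str]:
    """Try each available extractor in order; return [] if all fail (skip)."""
    for extractor in (remote, local):
        if extractor is None:
            continue
        try:
            return extractor(content)
        except Exception:
            continue  # fall through to the next backend
    return []  # skip: no extractor available, or all backends failed
```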

Temporal Recall

recall() now supports temporal_weight, query_time, and temporal_halflife parameters for fine-grained control over how recency influences retrieval. Point-in-time queries allow retrieving memories as they existed at a specific timestamp.
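
A half-life parameter suggests the standard exponential decay curve: a memory's recency factor halves every `halflife` hours. A sketch under that assumption (whether recall() uses exactly this shape, or this blending rule, is not stated here):

```python
def temporal_factor(age_hours: float, halflife_hours: float) -> float:
    """Recency factor: 1.0 for brand-new memories, 0.5 after one half-life."""
    return 0.5 ** (age_hours / halflife_hours)

def decayed_score(base: float, age_hours: float,
                  temporal_weight: float, halflife_hours: float) -> float:
    """Blend the static relevance score with the recency factor."""
    recency = temporal_factor(age_hours, halflife_hours)
    return (1 - temporal_weight) * base + temporal_weight * recency
```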

Configurable Scoring

The hybrid scoring weights (vec_weight, fts_weight, importance_weight) are now configurable per query rather than being fixed at compile time. Default: 50/30/20.
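
The 50/30/20 fusion is a weighted sum of the three component scores. A sketch, assuming each component is normalised to [0, 1] (the normalisation step is not described on this page):

```python
def hybrid_score(vec: float, fts: float, importance: float,
                 vec_weight: float = 0.5, fts_weight: float = 0.3,
                 importance_weight: float = 0.2) -> float:
    """Weighted blend of vector similarity, FTS relevance, and importance."""
    return vec * vec_weight + fts * fts_weight + importance * importance_weight
```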

Memory Banks

Named banks provide isolated SQLite databases per project or context. Create via Mnemosyne(bank="project-a") or mnemosyne mcp --bank project-a.
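
The isolation model is one SQLite file per named bank. A sketch of that mapping; the on-disk layout below (~/.hermes/mnemosyne/banks/<name>.db) is an assumption, as only the one-database-per-bank behaviour is stated on this page:

```python
from pathlib import Path

def bank_db_path(bank: str,
                 root: Path = Path.home() / ".hermes" / "mnemosyne") -> Path:
    """Hypothetical mapping from a bank name to its own SQLite file."""
    return root / "banks" / f"{bank}.db"
```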

Streaming & Patterns

MemoryStream provides push/pull event patterns for real-time notifications. PatternDetector discovers temporal, content, and sequence patterns across stored memories. DeltaSync enables incremental synchronization between instances.
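
The push/pull distinction can be sketched with a queue plus subscriber callbacks: push delivers events to subscribers immediately, while pull drains the queue on demand. The method names below are assumptions, since MemoryStream's actual API is not shown on this page:

```python
from collections import deque

class MiniStream:
    def __init__(self):
        self.queue = deque()
        self.subscribers = []

    def subscribe(self, callback) -> None:
        self.subscribers.append(callback)

    def push(self, event) -> None:
        self.queue.append(event)
        for callback in self.subscribers:  # push pattern: notify immediately
            callback(event)

    def pull(self):
        # Pull pattern: consumer drains events at its own pace.
        return self.queue.popleft() if self.queue else None
```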

Plugin System

Plugins subclass MnemosynePlugin, a base class exposing 4 lifecycle hooks. Built-in plugins: LoggingPlugin, MetricsPlugin, FilterPlugin. Plugins are discovered from ~/.hermes/mnemosyne/plugins/.
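
A base class with lifecycle hooks typically looks like the sketch below. The four hook names here (on_remember, on_recall, on_sleep, on_shutdown) are hypothetical; this page only states that MnemosynePlugin defines 4 hooks, not what they are called:

```python
class SketchPlugin:
    """Hypothetical stand-in for a plugin base class with no-op hooks."""
    def on_remember(self, content): pass
    def on_recall(self, query, results): pass
    def on_sleep(self, promoted_count): pass
    def on_shutdown(self): pass

class CountingPlugin(SketchPlugin):
    """Minimal metrics-style plugin: count remember calls."""
    def __init__(self):
        self.remembers = 0

    def on_remember(self, content):
        self.remembers += 1
```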

Design Philosophy

Mnemosyne's architecture prioritizes:

  1. Locality: All data in SQLite — no network calls for retrieval
  2. Transparency: Every operation is inspectable and debuggable
  3. Efficiency: Sub-100ms retrieval for typical workloads
  4. Flexibility: Works standalone or as a Hermes plugin
  5. Privacy: No data leaves your machine by default

Comparison with Vector-Only Stores

Unlike pure vector databases (Pinecone, Weaviate), Mnemosyne combines dense embeddings with structured storage, full-text search, and temporal relationships. This hybrid approach delivers more precise retrieval for agent use cases where exact facts matter as much as semantic similarity.