Memory Confidence: Veracity Signal

Mnemosyne (since v2.3) supports a veracity field on every memory, so you know how trustworthy each one is. Not all memories are created equal: something you explicitly stated deserves more weight than something the AI inferred from context.

Confidence Levels

Level     Meaning                    Weight
--------  -------------------------  ------
stated    You explicitly said this   1.0x
inferred  AI deduced from context    0.7x
tool      Automated system output    0.5x
imported  Cross-provider import      0.6x
unknown   Legacy or uncategorized    0.8x

Why It Matters

AI agents store a lot of inferred and tool-generated memories. Over time, these can drown out what you actually said. Veracity lets you:

  • Weight direct user statements higher than AI inferences in recall
  • Filter queries to only your explicit preferences
  • Review and audit inferred memories periodically
  • Track contamination in your memory system

Without veracity signals, a joke you made about liking Haskell could rank equally with your explicit preference for Python.

Writing With Confidence
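
The examples below assume mem is an already-initialized Mnemosyne client.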

# Explicit user statement
mem.remember("Dark mode is preferred", veracity="stated")

# AI inference
mem.remember("User likely prefers dark mode based on theme choices", veracity="inferred")

# Tool output
mem.remember("Deploy succeeded at 3:14 PM", veracity="tool")

Recalling With Confidence Filters

# Only get things I explicitly said
results = mem.recall("preference", veracity="stated")

# Only tool outputs
results = mem.recall("deploy", veracity="tool")

# Everything, unfiltered
results = mem.recall("preference")  # no veracity filter

How Recall Weighting Works

During recall(), each result's score gets multiplied by its veracity weight:

  • A stated memory with score 0.9 stays at 0.9
  • An inferred memory with score 0.9 becomes 0.63
  • A tool memory with score 0.9 becomes 0.45

Inferred memories need to be much more semantically relevant to surface above stated ones. The system still surfaces inferences when they're genuinely relevant, but they start at a disadvantage.
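
Conceptually, the weighting step amounts to this (a sketch, not the library's actual internals; the dictionary mirrors the default weights from the Configuration section below, and weighted_score is an illustrative name):

# Sketch only: final rank = raw semantic relevance x veracity weight
VERACITY_WEIGHTS = {
    "stated": 1.0,
    "inferred": 0.7,
    "tool": 0.5,
    "imported": 0.6,
    "unknown": 0.8,
}

def weighted_score(semantic_score, veracity):
    return semantic_score * VERACITY_WEIGHTS[veracity]

weighted_score(0.9, "stated")    # 0.9
weighted_score(0.9, "inferred")  # 0.63
weighted_score(0.9, "tool")      # 0.45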

get_contaminated(): Audit Your Memory

# Get all non-stated memories for review
contaminated = mem.get_contaminated(limit=50)

for m in contaminated:
    print(f"[{m['veracity']}] {m['content'][:100]}")
    # Review and optionally promote

Returns episodic memories whose veracity is anything other than stated, sorted by importance descending so the most impactful contamination surfaces first.
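
In effect, the call behaves like this filter-and-sort (a sketch over plain dicts; the real method queries the database, and get_contaminated_sketch is a hypothetical name):

# Sketch: keep non-stated memories, most important first
def get_contaminated_sketch(memories, limit=50):
    suspect = [m for m in memories if m["veracity"] != "stated"]
    suspect.sort(key=lambda m: m["importance"], reverse=True)
    return suspect[:limit]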

Configuration

MNEMOSYNE_STATED_WEIGHT=1.0
MNEMOSYNE_INFERRED_WEIGHT=0.7
MNEMOSYNE_TOOL_WEIGHT=0.5
MNEMOSYNE_IMPORTED_WEIGHT=0.6
MNEMOSYNE_UNKNOWN_WEIGHT=0.8
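
One way these variables might be consumed (a sketch; Mnemosyne's actual config loading may differ, and _weight is an illustrative helper):

import os

def _weight(name, default):
    return float(os.environ.get(name, default))

STATED_WEIGHT = _weight("MNEMOSYNE_STATED_WEIGHT", 1.0)
INFERRED_WEIGHT = _weight("MNEMOSYNE_INFERRED_WEIGHT", 0.7)
TOOL_WEIGHT = _weight("MNEMOSYNE_TOOL_WEIGHT", 0.5)
IMPORTED_WEIGHT = _weight("MNEMOSYNE_IMPORTED_WEIGHT", 0.6)
UNKNOWN_WEIGHT = _weight("MNEMOSYNE_UNKNOWN_WEIGHT", 0.8)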

Migration

New databases get the veracity column via CREATE TABLE. Existing databases get it via an ALTER TABLE migration in init_beam(). All existing rows default to unknown, so there is no data loss and the change is fully backward compatible.
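
For reference, the migration step amounts to something like this (a sketch that assumes a SQLite backend and a table named memories; the real migration lives in init_beam() and its names may differ):

import sqlite3

def add_veracity_column(db_path):
    con = sqlite3.connect(db_path)
    # Only add the column if a previous run hasn't already done so
    cols = [row[1] for row in con.execute("PRAGMA table_info(memories)")]
    if "veracity" not in cols:
        # Existing rows pick up the 'unknown' default; no rows are rewritten
        con.execute("ALTER TABLE memories ADD COLUMN veracity TEXT DEFAULT 'unknown'")
        con.commit()
    con.close()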

Combined With Tiers

Veracity and tier degradation work together:

  • A tier-3 (cold) stated memory: 0.25 x 1.0 = 0.25 total weight
  • A tier-3 (cold) tool memory: 0.25 x 0.5 = 0.125 total weight

You control both dimensions independently.
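
A quick sketch of the combined math (tier 3's 0.25 comes from the examples above; the tier-1 and tier-2 values, and all names here, are illustrative assumptions):

TIER_WEIGHTS = {1: 1.0, 2: 0.5, 3: 0.25}  # tiers 1 and 2 assumed for illustration
VERACITY_WEIGHTS = {"stated": 1.0, "inferred": 0.7, "tool": 0.5,
                    "imported": 0.6, "unknown": 0.8}

def combined_weight(tier, veracity):
    # The two dimensions multiply; each is tunable independently
    return TIER_WEIGHTS[tier] * VERACITY_WEIGHTS[veracity]

combined_weight(3, "stated")  # 0.25
combined_weight(3, "tool")    # 0.125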

Practical Example: Preference Cleanup

# I told my agent: "I prefer Python over JavaScript"
mem.remember("User prefers Python over JavaScript", veracity="stated")

# Agent later inferred: "User seems to like Haskell" based on a joke
mem.remember("User seems to like Haskell", veracity="inferred")

# Four days later...
what_i_said = mem.recall("language preference", veracity="stated")
# Returns my real preference, not the joke-inference

# Quarterly audit
suspect = mem.get_contaminated()
for s in suspect:
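    # confirm_with_user is a placeholder for your own review step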
    if confirm_with_user(s["content"]):
        mem.update(s["id"], veracity="stated")
    else:
        mem.forget(s["id"])

Audit tip: Run get_contaminated() as part of a weekly or monthly cleanup. Inferred memories pile up quietly. A quick audit keeps your memory system aligned with what you actually believe and prefer.