Memory Confidence: Veracity Signal
Mnemosyne (since v2.3) attaches a veracity field to every memory so you know how trustworthy each one is. Not all memories are created equal: something you explicitly stated deserves more weight than something the AI inferred from context.
Confidence Levels
| Level | Meaning | Weight |
|---|---|---|
| stated | You explicitly said this | 1.0x |
| inferred | AI deduced from context | 0.7x |
| tool | Automated system output | 0.5x |
| imported | Cross-provider import | 0.6x |
| unknown | Legacy or uncategorized | 0.8x |
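The table above can be captured as a simple mapping. This is an illustrative sketch (the constant and function names are assumptions, not Mnemosyne internals):

```python
# Veracity levels mapped to recall weights, mirroring the table above.
# Names here are illustrative, not Mnemosyne's internal identifiers.
VERACITY_WEIGHTS = {
    "stated": 1.0,
    "inferred": 0.7,
    "tool": 0.5,
    "imported": 0.6,
    "unknown": 0.8,
}

def weight_for(veracity: str) -> float:
    # Unrecognized labels fall back to the "unknown" weight
    return VERACITY_WEIGHTS.get(veracity, VERACITY_WEIGHTS["unknown"])
```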
Why It Matters
AI agents store a lot of inferred and tool-generated memories. Over time, these can drown out what you actually said. Veracity lets you:
- Weight direct user statements higher than AI inferences in recall
- Filter queries to only your explicit preferences
- Review and audit inferred memories periodically
- Track contamination in your memory system
Without veracity signals, a joke you made about liking Haskell could rank equally with your explicit preference for Python.
Writing With Confidence
```python
# Explicit user statement
mem.remember("Dark mode is preferred", veracity="stated")

# AI inference
mem.remember("User likely prefers dark mode based on theme choices", veracity="inferred")

# Tool output
mem.remember("Deploy succeeded at 3:14 PM", veracity="tool")
```
Recalling With Confidence Filters
```python
# Only get things I explicitly said
results = mem.recall("preference", veracity="stated")

# Only tool outputs
results = mem.recall("deploy", veracity="tool")

# Everything, unfiltered
results = mem.recall("preference")  # no veracity filter
```
How Recall Weighting Works
During recall(), each result's score gets multiplied by its veracity weight:
- A stated memory with score 0.9 stays 0.9
- An inferred memory with score 0.9 becomes 0.63
- A tool memory becomes 0.45
Inferred memories need to be much more semantically relevant to surface above stated ones. The system still surfaces inferences when they're genuinely relevant, but they start at a disadvantage.
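The weighting above can be sketched in a few lines. The result dicts and field names here stand in for raw similarity hits and are assumptions, not Mnemosyne's actual schema:

```python
# Minimal sketch of veracity-weighted recall scoring.
WEIGHTS = {"stated": 1.0, "inferred": 0.7, "tool": 0.5}

def rank(results):
    # Multiply each raw similarity score by its veracity weight,
    # then sort best-first on the weighted score.
    for r in results:
        r["weighted"] = r["score"] * WEIGHTS[r["veracity"]]
    return sorted(results, key=lambda r: r["weighted"], reverse=True)

hits = [
    {"content": "likes Haskell (joke?)", "score": 0.9, "veracity": "inferred"},
    {"content": "prefers Python",        "score": 0.8, "veracity": "stated"},
]
ranked = rank(hits)
# The stated memory (0.8 * 1.0 = 0.80) outranks the semantically
# stronger inference (0.9 * 0.7 = 0.63).
```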
get_contaminated(): Audit Your Memory
```python
# Get all non-stated memories for review
contaminated = mem.get_contaminated(limit=50)
for m in contaminated:
    print(f"[{m['veracity']}] {m['content'][:100]}")
# Review and optionally promote
```
Returns episodic memories where veracity is NOT stated. Sorted by importance descending so the most impactful contamination surfaces first.
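Conceptually this is a filter-and-sort. A sketch of the equivalent query over a hypothetical `episodic` table (the table and column names are assumptions, not Mnemosyne's actual schema):

```python
import sqlite3

# Hypothetical schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE episodic (id INTEGER, content TEXT, veracity TEXT, importance REAL)"
)
conn.executemany("INSERT INTO episodic VALUES (?, ?, ?, ?)", [
    (1, "prefers Python", "stated", 0.9),
    (2, "likes Haskell?", "inferred", 0.8),
    (3, "deploy ok", "tool", 0.3),
])

# get_contaminated(): everything NOT explicitly stated,
# most important first, capped by the limit.
rows = conn.execute(
    "SELECT id, content, veracity FROM episodic "
    "WHERE veracity != 'stated' ORDER BY importance DESC LIMIT ?",
    (50,),
).fetchall()
```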
Configuration
```
MNEMOSYNE_STATED_WEIGHT=1.0
MNEMOSYNE_INFERRED_WEIGHT=0.7
MNEMOSYNE_TOOL_WEIGHT=0.5
MNEMOSYNE_IMPORTED_WEIGHT=0.6
MNEMOSYNE_UNKNOWN_WEIGHT=0.8
```
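These variables could be read as overrides of the defaults roughly like this. The parsing is an illustrative sketch, not Mnemosyne's actual config loader:

```python
import os

DEFAULTS = {"stated": 1.0, "inferred": 0.7, "tool": 0.5,
            "imported": 0.6, "unknown": 0.8}

def load_weights(env=os.environ):
    # Each MNEMOSYNE_<LEVEL>_WEIGHT env var overrides its default.
    return {
        level: float(env.get(f"MNEMOSYNE_{level.upper()}_WEIGHT", default))
        for level, default in DEFAULTS.items()
    }

weights = load_weights({"MNEMOSYNE_TOOL_WEIGHT": "0.3"})
```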
Migration
New databases get the veracity column via CREATE TABLE. Existing databases get it via ALTER TABLE migration in init_beam(). All existing rows default to unknown. No data loss, fully backward compatible.
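The ALTER TABLE path described above looks roughly like this sketch, again over a hypothetical `episodic` table (names assumed):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simulate a pre-veracity database with existing data.
conn.execute("CREATE TABLE episodic (id INTEGER, content TEXT)")
conn.execute("INSERT INTO episodic VALUES (1, 'prefers Python')")

# Idempotent migration: add the column only if it is missing.
cols = [row[1] for row in conn.execute("PRAGMA table_info(episodic)")]
if "veracity" not in cols:
    # Existing rows pick up the 'unknown' default; no data is touched.
    conn.execute(
        "ALTER TABLE episodic ADD COLUMN veracity TEXT DEFAULT 'unknown'"
    )

row = conn.execute("SELECT content, veracity FROM episodic").fetchone()
```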
Combined With Tiers
Veracity and tier degradation work together:
- A tier-3 (cold) + stated memory: 0.25 x 1.0 = 0.25 total weight
- A tier-3 (cold) + tool memory: 0.25 x 0.5 = 0.125 total weight
You control both dimensions independently.
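The combination is a straight product of the two factors. In this sketch only the tier-3 value (0.25) comes from the text; the tier-1 and tier-2 weights are illustrative assumptions:

```python
# Combined weight = tier weight * veracity weight.
TIER_WEIGHTS = {1: 1.0, 2: 0.5, 3: 0.25}  # only tier-3 (cold) is from the docs
VERACITY_WEIGHTS = {"stated": 1.0, "tool": 0.5}

def combined_weight(tier: int, veracity: str) -> float:
    # The two dimensions are independent, so they simply multiply.
    return TIER_WEIGHTS[tier] * VERACITY_WEIGHTS[veracity]
```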
Practical Example: Preference Cleanup
```python
# I told my agent: "I prefer Python over JavaScript"
mem.remember("User prefers Python over JavaScript", veracity="stated")

# Agent later inferred: "User seems to like Haskell" based on a joke
mem.remember("User seems to like Haskell", veracity="inferred")

# Four days later...
what_i_said = mem.recall("language preference", veracity="stated")
# Returns my real preference, not the joke-inference

# Quarterly audit
suspect = mem.get_contaminated()
for s in suspect:
    if confirm_with_user(s["content"]):
        mem.update(s["id"], veracity="stated")
    else:
        mem.forget(s["id"])
```
Audit tip: Run get_contaminated() as part of a weekly or monthly cleanup. Inferred memories pile up quietly. A quick audit keeps your memory system aligned with what you actually believe and prefer.