memeX · Documentation

memeX Concepts

How memeX's cognitive memory architecture works — multi-store (episodic / semantic / procedural), provenance, time-decay, working set, and the RAM ↔ human-memory mapping that derives the design.

The architecture isn’t designed — it’s derived. Start from a small set of axioms about what memory fundamentally is, and the shape of memeX is forced by them.

The 9 axioms

1 · Bounded vs unbounded

The LLM's effective state is bounded; the work it reasons about is unbounded. Memory exists to bridge this gap.

2 · Multiple types

Episodic (events), semantic (facts), procedural (how-to). Collapsing them into one store is the original sin of every RAG system.

3 · Memory has a lifecycle

Encode → store → retrieve → consolidate → reconsolidate → forget. Most systems implement only the first three.

4 · Truth has a source + timestamp

A fact without provenance is a poison pill in a multi-agent system.

5 · Falsifiability

A claim that can't be falsified isn't knowledge — it's opinion.

6 · Working set

Cognition has a global workspace plus a long-term store. Retrieval without a working set is amnesia.

7 · Repetition creates primitives

Patterns that recur become atomic units (chunking).

8 · Time erodes truth

Stale memory is worse than no memory.

9 · Retrieval cost wins

Memory's value is retrieval cost, not storage cost. Hoarding has negative utility.

The dual metaphor: RAM hardware + human memory

memeX is modeled jointly on the structure of RAM and on what cognitive science says about human memory.

| RAM hardware | Human memory | memeX |
|---|---|---|
| Cell (smallest addressable unit) | Engram | Concept node |
| Bank (parallel storage region) | Memory system | stores/episodic, stores/semantic, stores/vector |
| Row addressing | Cued recall | recall(query) |
| Refresh cycle (DRAM autorefresh) | Reconsolidation | last_confirmed_at + working-set touch() on retrieval |
| ECC (error correction) | Pattern separation | NLI / conflict detection |
| Cache hierarchy (L1/L2/L3) | Working memory | WorkingSet (LRU cache) |
| Bus bandwidth | Attention budget | token-budget enforcement |
| Read/write ports (parallel) | Sensory channels | MCP + HTTP + CLI frontends |
| Wear leveling | Consolidation (episodic → semantic) | lifecycle/consolidation.py |
| DMA (direct CPU bypass) | Priming | MCP stdio (zero-copy editor → engine) |
| Page table (address translation) | Naming/aliasing | concept IDs + same_as edges |

What’s forced by the axioms

Multi-store, not single graph (axiom 2)

Three separate stores with different schemas, different decay rates, different retrieval defaults:

  • Episodic: time-indexed event log (sessions, conversations, incidents)
  • Semantic: typed property graph (concepts, decisions, contracts)
  • Procedural: ordered workflows (deploys, runbooks)

Multi-encoding per fact (axiom 3)

Every node is stored in four encodings: text, embedding, graph node, and an executable predicate. Retrieval picks the encoding by query type.

Provenance-or-die (axiom 4)

Every node and every edge carries source, created_at, last_confirmed_at, confidence. The schema rejects nodes without these.
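A minimal sketch of what schema-level rejection could look like. The four field names come from the text above; the `Provenance` and `Node` classes, the `[0, 1]` confidence range, and the example source `adr-001` are illustrative assumptions, not memeX's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Provenance:
    """Hypothetical provenance record; field names follow the text above."""
    source: str
    created_at: datetime
    last_confirmed_at: datetime
    confidence: float  # assumed to live in [0, 1]

    def __post_init__(self) -> None:
        if not self.source.strip():
            raise ValueError("source must be non-empty")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")
        if self.last_confirmed_at < self.created_at:
            raise ValueError("last_confirmed_at precedes created_at")


@dataclass(frozen=True)
class Node:
    """Hypothetical node: provenance has no default, so it cannot be omitted."""
    node_id: str
    text: str
    provenance: Provenance

    def __post_init__(self) -> None:
        # Reject None or any other stand-in passed where provenance belongs.
        if not isinstance(self.provenance, Provenance):
            raise TypeError("every node must carry a Provenance record")


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
node = Node("c1", "Deploys go through staging first", Provenance("adr-001", now, now, 0.9))
```

Validating in `__post_init__` means an invalid node can never be constructed, so downstream code does not need to re-check provenance.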

Falsifiability requirement (axiom 5)

Every node has an optional verification predicate. Nodes without verification are downgraded over time. This prevents drift.

Working set as first-class structure (axiom 6)

WorkingSet is a bounded LRU cache that biases retrieval ranking — recently touched concepts get higher rank (priming).
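One way to picture a bounded LRU working set that biases ranking, built on `collections.OrderedDict`. The class name and `touch()` appear in this document; the capacity, the `boost()` method, the bonus value, and the `rerank()` helper are assumptions made for illustration.

```python
from collections import OrderedDict


class WorkingSet:
    """Bounded LRU set of concept IDs (illustrative, not memeX's actual class)."""

    def __init__(self, capacity: int = 32) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._items = OrderedDict()  # concept_id -> None, oldest first

    def touch(self, concept_id: str) -> None:
        # Mark as most recently used, then evict the oldest entries if over capacity.
        self._items[concept_id] = None
        self._items.move_to_end(concept_id)
        while len(self._items) > self._capacity:
            self._items.popitem(last=False)

    def __contains__(self, concept_id: object) -> bool:
        return concept_id in self._items

    def boost(self, concept_id: str, bonus: float = 0.25) -> float:
        # Assumed priming rule: a flat bonus for anything currently in the set.
        return bonus if concept_id in self._items else 0.0


def rerank(scored: dict, ws: WorkingSet) -> list:
    """Order concept IDs by base score plus working-set bonus, best first."""
    return sorted(scored, key=lambda cid: scored[cid] + ws.boost(cid), reverse=True)
```

For example, with a capacity of 2, touching `a`, `b`, `a`, `c` in that order evicts `b`, and a recently touched `a` with base score 0.5 outranks an untouched concept scored 0.6.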

Chunking via co-retrieval (axiom 7)

Track which nodes are retrieved together. When co-retrieval frequency crosses a threshold, the subgraph compresses into a chunk node.
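The tracking step can be sketched with a pair counter. This is a deliberate simplification: it counts co-retrieved pairs rather than whole subgraphs, and the `CoRetrievalTracker` name, its methods, and the threshold value are hypothetical.

```python
from collections import Counter
from itertools import combinations


class CoRetrievalTracker:
    """Counts how often two node IDs show up in the same retrieval result."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self._pair_counts = Counter()

    def record(self, retrieved_ids) -> None:
        # Sort and de-duplicate so (a, b) and (b, a) map to the same key.
        for pair in combinations(sorted(set(retrieved_ids)), 2):
            self._pair_counts[pair] += 1

    def chunk_candidates(self) -> list:
        """Pairs whose co-retrieval count has reached the threshold."""
        return sorted(p for p, n in self._pair_counts.items() if n >= self.threshold)


tracker = CoRetrievalTracker(threshold=2)
tracker.record(["deploy", "rollback", "db"])
tracker.record(["rollback", "deploy"])
candidates = tracker.chunk_candidates()
```

Here only the `deploy`/`rollback` pair reaches the threshold; a fuller version would merge overlapping pairs into a subgraph before creating the chunk node.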

Time-aware confidence + active forgetting (axiom 8)

Confidence decays on an Ebbinghaus exponential (default half-life 90 days) unless re-confirmed. Active forgetting deletes nodes that meet stale-and-orphaned criteria.
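The decay rule can be written directly from the half-life formula this document gives, confidence(t) = base * 0.5^(days/half_life). The function name and the choice to measure elapsed time from the last confirmation are assumptions.

```python
from datetime import datetime, timedelta, timezone


def decayed_confidence(
    base: float,
    last_confirmed_at: datetime,
    now: datetime,
    half_life_days: float = 90.0,
) -> float:
    """Halve the base confidence for every half-life elapsed since confirmation."""
    elapsed_days = max((now - last_confirmed_at).total_seconds() / 86400.0, 0.0)
    return base * 0.5 ** (elapsed_days / half_life_days)


confirmed = datetime(2024, 1, 1, tzinfo=timezone.utc)
after_one_half_life = decayed_confidence(0.8, confirmed, confirmed + timedelta(days=90))
after_two_half_lives = decayed_confidence(0.8, confirmed, confirmed + timedelta(days=180))
```

With the default 90-day half-life, a node confirmed at confidence 0.8 drops to 0.4 after 90 days and to 0.2 after 180; re-confirming resets `last_confirmed_at` and with it the clock.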

Retrieval is budget-bounded by construction (axiom 9)

Every retrieval is (query, token_budget) → minimal subgraph. There is no “give me everything.” The API doesn’t allow unbounded retrieval.
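A rough sketch of budget enforcement as greedy packing over an already-ranked result list. The function name is hypothetical, and the whitespace word count stands in for a real tokenizer; how memeX actually selects the minimal subgraph is not specified here.

```python
def select_within_budget(ranked, token_budget, count_tokens=lambda text: len(text.split())):
    """Pick ranked (node_id, text) items, best first, without exceeding the budget.

    Returns the chosen IDs and the number of tokens they consume.
    """
    if token_budget <= 0:
        # No unbounded or empty budget: the caller must state a positive limit.
        raise ValueError("token_budget must be positive")
    chosen, used = [], 0
    for node_id, text in ranked:
        cost = count_tokens(text)
        if used + cost > token_budget:
            continue  # too large for the remaining budget; a later item may still fit
        chosen.append(node_id)
        used += cost
    return chosen, used


ranked = [("a", "one two three"), ("b", "four five six seven"), ("c", "eight")]
ids, used = select_within_budget(ranked, token_budget=4)
```

In this example `a` (3 tokens) is taken, `b` (4 tokens) no longer fits, and `c` (1 token) fills the remaining budget. Because `token_budget` is a required argument, there is no call shape that means "return everything".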

Foundational data structures

Each layer has a CS-foundational data structure underneath:

  • Property graph: typed nodes + typed edges, primary key indexing
  • Inverted index (rank_bm25): token → docs mapping for keyword retrieval
  • Brute-force k-NN with cosine (numpy): sufficient at v1 scale; an HNSW upgrade is a Protocol-conforming swap
  • LRU cache (OrderedDict): WorkingSet, the L1 layer
  • Time-indexed B-tree: episodic events, ordered by timestamp
  • Reciprocal Rank Fusion: combines BM25 + vector ranks; empirically robust to the choice of its constant (k=60 is the conventional default)
  • Exponential decay: Ebbinghaus forgetting curve, confidence(t) = base * 0.5^(days/half_life)
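As an illustration of the fusion step in the list above, Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over every ranking it appears in. The function name and the tie-break on ID are assumptions; the formula and k=60 are the standard ones.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first ID lists into one ranking.

    Each document scores sum(1 / (k + rank)) across the lists that contain it,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_ranking = ["a", "b", "c"]
vector_ranking = ["b", "c", "a"]
fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
```

Because only ranks are used, BM25 scores and cosine similarities never need to be normalized onto a common scale before they are combined.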

Software architecture: SOLID with Protocols

  • Repository pattern with Protocol-based interfaces: every store is defined as a Protocol so future swaps don’t require core changes.
  • Constructor dependency injection: Engine takes its stores + retriever + provider as constructor args.
  • Factory method: Engine.build_default() for the conventional wiring.
  • Strategy pattern: retrieval strategies (BM25, hybrid) are interchangeable.
  • Provider pattern: embeddings as NoOpEmbeddingProvider + FastEmbedProvider, swappable at runtime.
  • Façade: Engine as the single entry point; frontends never reach into stores.
  • Layered architecture: frontends → engine → core → stores.
  • Lazy loading: heavy resources (embedding models) load on first use.
  • Fail loud: the HTTP daemon refuses non-localhost binding without auth.
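The patterns above combine roughly as follows. `Engine`, `Engine.build_default()`, and `NoOpEmbeddingProvider` are names from this document; the Protocol names, method signatures, the in-memory store, and the omission of the retriever are simplifying assumptions for the sketch.

```python
from typing import Optional, Protocol


class SemanticStore(Protocol):
    """Hypothetical store interface; any class with these methods conforms."""
    def put(self, node_id: str, text: str) -> None: ...
    def get(self, node_id: str) -> Optional[str]: ...


class EmbeddingProvider(Protocol):
    def embed(self, text: str) -> list: ...


class InMemorySemanticStore:
    def __init__(self) -> None:
        self._nodes = {}

    def put(self, node_id: str, text: str) -> None:
        self._nodes[node_id] = text

    def get(self, node_id: str) -> Optional[str]:
        return self._nodes.get(node_id)


class NoOpEmbeddingProvider:
    def embed(self, text: str) -> list:
        return []  # no model loaded; keyword-only retrieval would still work


class Engine:
    """Facade: frontends call the engine and never touch a store directly."""

    def __init__(self, store: SemanticStore, provider: EmbeddingProvider) -> None:
        # Constructor injection: collaborators are passed in, not created here.
        self._store = store
        self._provider = provider

    @classmethod
    def build_default(cls) -> "Engine":
        # Factory method for the conventional wiring.
        return cls(InMemorySemanticStore(), NoOpEmbeddingProvider())

    def add(self, node_id: str, text: str) -> None:
        self._provider.embed(text)
        self._store.put(node_id, text)

    def get(self, node_id: str) -> Optional[str]:
        return self._store.get(node_id)


engine = Engine.build_default()
engine.add("c1", "Deploys go through staging first")
```

Since `Protocol` relies on structural typing, a replacement store only has to provide the same methods; it does not need to inherit from anything in the core.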

Polyglot by design

The Protocol-based interfaces mean any layer can be reimplemented in any language and dropped into the same architecture. v1 ships all-Python because it’s the fastest path to correctness; future hot paths can be replaced module-by-module.

The HTTP API is the universal cross-language boundary. Client bindings can be generated automatically from the OpenAPI spec served at /openapi.json.

Watchful, not passive

memeX isn’t a database the agent queries — it’s an active observer:

  • Every add / link / recall / validate auto-emits an episodic event.
  • progress() exposes the full activity log.
  • The AI can introspect what it’s already done and avoid repeating itself.
  • Future versions detect contradictions between memories and surface them.
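A toy sketch of the auto-emit behaviour, assuming each engine operation appends an event before returning. `add`, `recall`, and `progress()` are named in this document; the `Event` fields, the `ObservedEngine` class, and the substring matching used for recall are hypothetical stand-ins.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    """Hypothetical episodic event: what happened, to what, and when."""
    op: str
    detail: str
    at: datetime


class ObservedEngine:
    def __init__(self) -> None:
        self._nodes = {}
        self._events = []

    def _emit(self, op: str, detail: str) -> None:
        self._events.append(Event(op, detail, datetime.now(timezone.utc)))

    def add(self, node_id: str, text: str) -> None:
        self._nodes[node_id] = text
        self._emit("add", node_id)

    def recall(self, query: str) -> list:
        # Naive substring match, standing in for real retrieval.
        hits = [nid for nid, text in self._nodes.items() if query.lower() in text.lower()]
        self._emit("recall", query)
        return hits

    def progress(self) -> list:
        """Return a copy of the activity log, oldest event first."""
        return list(self._events)


observed = ObservedEngine()
observed.add("c1", "Rollback procedure for the payments service")
hits = observed.recall("rollback")
ops = [event.op for event in observed.progress()]
```

Reading `progress()` before acting lets an agent check whether it has already added or recalled something in the current session.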