memeX Concepts — Quefly Docs

The architecture isn’t designed — it’s derived. Start from a small set of axioms about what memory fundamentally is, and the shape of memeX is forced by them.

The 9 axioms

1 · Bounded vs unbounded

The LLM's effective state is bounded; the work it reasons about is unbounded. Memory exists to bridge this gap.

2 · Multiple types

Episodic (events), semantic (facts), procedural (how-to). Collapsing them into one store is the original sin of every RAG system.

3 · Memory has a lifecycle

Encode → store → retrieve → consolidate → reconsolidate → forget. Most systems implement only the first three.

4 · Truth has a source + timestamp

A fact without provenance is a poison pill in a multi-agent system.

5 · Falsifiability

A claim that can't be falsified isn't knowledge — it's opinion.

6 · Working set

Cognition has a global workspace plus a long-term store. Retrieval without a working set is amnesia.

7 · Repetition creates primitives

Patterns that recur become atomic units (chunking).

8 · Time erodes truth

Stale memory is worse than no memory.

9 · Retrieval cost wins

Memory's value is retrieval cost, not storage cost. Hoarding has negative utility.

The dual metaphor: RAM hardware + human memory

memeX is modeled jointly on the structure of RAM hardware and on what cognitive science says about human memory.

RAM hardware	Human memory	memeX
Cell (smallest addressable unit)	Engram	`Concept` node
Bank (parallel storage region)	Memory system	`stores/episodic`, `stores/semantic`, `stores/vector`
Row addressing	Cued recall	`recall(query)`
Refresh cycle (DRAM autorefresh)	Reconsolidation	`last_confirmed_at` + working-set `touch()` on retrieval
ECC (error correction)	Pattern separation	NLI / conflict detection
Cache hierarchy (L1/L2/L3)	Working memory	`WorkingSet` (LRU cache)
Bus bandwidth	Attention budget	token-budget enforcement
Read/write ports (parallel)	Sensory channels	MCP + HTTP + CLI frontends
Wear leveling	Consolidation (episodic → semantic)	`lifecycle/consolidation.py`
DMA (direct CPU bypass)	Priming	MCP stdio (zero-copy editor → engine)
Page table (address translation)	Naming/aliasing	concept IDs + `same_as` edges

What’s forced by the axioms

Multi-store, not single graph (axiom 2)

Three separate stores with different schemas, different decay rates, different retrieval defaults:

Episodic: time-indexed event log (sessions, conversations, incidents)
Semantic: typed property graph (concepts, decisions, contracts)
Procedural: ordered workflows (deploys, runbooks)

Multi-encoding per fact (axiom 3)

Every node is stored as text, an embedding, a graph node, and an executable predicate. Retrieval picks the encoding by query type.

Provenance-or-die (axiom 4)

Every node and every edge carries source, created_at, last_confirmed_at, confidence. The schema rejects nodes without these.
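
One way to picture the rejection rule is a record type that cannot be constructed without provenance. This is a minimal sketch, not the memeX schema: the class names `Provenance` and `Node`, the [0, 1] confidence range, and the ordering check on timestamps are assumptions; only the four field names come from the text above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Provenance:
    """Hypothetical provenance record; field names follow the text above."""
    source: str
    created_at: datetime
    last_confirmed_at: datetime
    confidence: float

    def __post_init__(self) -> None:
        # Reject records that cannot be traced or trusted.
        if not self.source.strip():
            raise ValueError("source must be non-empty")
        if self.last_confirmed_at < self.created_at:
            raise ValueError("last_confirmed_at precedes created_at")
        if not 0.0 <= self.confidence <= 1.0:  # assumed range
            raise ValueError("confidence must be in [0, 1]")


@dataclass(frozen=True)
class Node:
    """Hypothetical node type: provenance is a required field, not optional."""
    id: str
    text: str
    provenance: Provenance


now = datetime.now(timezone.utc)
node = Node("c1", "example fact", Provenance("runbook.md", now, now, 0.9))
```

Because `provenance` has no default, omitting it raises a `TypeError` at construction time, and an empty source raises a `ValueError`.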

Falsifiability requirement (axiom 5)

A node may carry a verification predicate, an executable check of its claim. Nodes without one are downgraded over time, which limits drift.
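
The downgrade rule could look roughly like the following. This is a sketch under stated assumptions: `VerifiableNode`, `revalidate`, the multiplicative penalty of 0.9, and the choice to zero the confidence of a failed check are all hypothetical and not taken from memeX.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class VerifiableNode:
    """Hypothetical node with an optional executable verification predicate."""
    text: str
    confidence: float
    verify: Optional[Callable[[], bool]] = None


def revalidate(node: VerifiableNode, penalty: float = 0.9) -> VerifiableNode:
    if node.verify is None:
        node.confidence *= penalty   # no predicate: downgrade (assumed factor)
    elif not node.verify():
        node.confidence = 0.0        # predicate failed: the claim is falsified
    return node                      # predicate passed: confidence unchanged


checked = revalidate(VerifiableNode("2 + 2 == 4", 0.8, verify=lambda: 2 + 2 == 4))
unchecked = revalidate(VerifiableNode("the team prefers tabs", 0.8))
```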

Working set as first-class structure (axiom 6)

WorkingSet is a bounded LRU cache that biases retrieval ranking — recently touched concepts get higher rank (priming).
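
A bounded LRU with a ranking bias can be sketched on top of `OrderedDict`. The class name `WorkingSetSketch`, the `rerank` method, and the multiplicative boost of 1.5 are illustrative assumptions; only `touch()` and the LRU behavior are named in the text.

```python
from collections import OrderedDict


class WorkingSetSketch:
    """Bounded LRU of recently touched concept IDs (illustrative only)."""

    def __init__(self, capacity: int = 3) -> None:
        self.capacity = capacity
        self._items = OrderedDict()

    def touch(self, concept_id: str) -> None:
        # Move to the most-recent end, evicting the least-recent entry if full.
        self._items[concept_id] = None
        self._items.move_to_end(concept_id)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)

    def __contains__(self, concept_id: str) -> bool:
        return concept_id in self._items

    def rerank(self, scored: dict, boost: float = 1.5) -> list:
        # Multiply the score of primed concepts; the boost value is an assumption.
        adjusted = {
            cid: score * (boost if cid in self else 1.0)
            for cid, score in scored.items()
        }
        return sorted(adjusted, key=adjusted.get, reverse=True)


ws = WorkingSetSketch(capacity=2)
for cid in ["a", "b", "c"]:
    ws.touch(cid)  # "a" is evicted once "c" arrives
ranking = ws.rerank({"a": 1.0, "b": 0.8, "c": 0.5})
```

Here `b` overtakes `a` because it is still in the working set (0.8 × 1.5 = 1.2 against 1.0), while the evicted `a` keeps its raw score.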

Chunking via co-retrieval (axiom 7)

Track which nodes are retrieved together. When co-retrieval frequency crosses a threshold, the subgraph compresses into a chunk node.
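
As a simplified illustration, co-retrieval can be tracked pairwise with a counter. This sketch only finds candidate pairs; compressing a larger subgraph into a chunk node is out of scope here. The name `CoRetrievalTracker` and the threshold value are hypothetical.

```python
from collections import Counter
from itertools import combinations


class CoRetrievalTracker:
    """Counts how often pairs of node IDs appear in the same result set."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold  # assumed promotion threshold
        self.pair_counts = Counter()

    def record(self, retrieved_ids: list) -> None:
        # Sort so that (a, b) and (b, a) map to the same key.
        for pair in combinations(sorted(set(retrieved_ids)), 2):
            self.pair_counts[pair] += 1

    def chunk_candidates(self) -> list:
        # Pairs retrieved together at least `threshold` times.
        return sorted(
            pair for pair, count in self.pair_counts.items()
            if count >= self.threshold
        )


tracker = CoRetrievalTracker(threshold=2)
tracker.record(["deploy", "rollback", "oncall"])
tracker.record(["rollback", "deploy"])
candidates = tracker.chunk_candidates()
```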

Time-aware confidence + active forgetting (axiom 8)

Confidence decays on an Ebbinghaus exponential (default half-life 90 days) unless re-confirmed. Active forgetting deletes nodes that meet stale-and-orphaned criteria.
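
The decay curve is straightforward to write down. In the sketch below the function names, the confidence floor of 0.05, and the reading of "orphaned" as "has no edges" are assumptions; the 90-day half-life is the default stated above.

```python
from datetime import datetime, timedelta, timezone


def decayed_confidence(base: float, last_confirmed_at: datetime,
                       now: datetime, half_life_days: float = 90.0) -> float:
    """confidence(t) = base * 0.5 ** (days / half_life)."""
    days = max((now - last_confirmed_at).total_seconds() / 86400.0, 0.0)
    return base * 0.5 ** (days / half_life_days)


def should_forget(confidence: float, edge_count: int, floor: float = 0.05) -> bool:
    # Assumed criteria: "stale" = confidence under a floor, "orphaned" = no edges.
    return confidence < floor and edge_count == 0


confirmed = datetime(2024, 1, 1, tzinfo=timezone.utc)
after_90 = decayed_confidence(0.8, confirmed, confirmed + timedelta(days=90))
after_180 = decayed_confidence(0.8, confirmed, confirmed + timedelta(days=180))
```

A node confirmed at confidence 0.8 falls to 0.4 after one half-life and to 0.2 after two, unless re-confirmation resets `last_confirmed_at`.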

Retrieval is budget-bounded by construction (axiom 9)

Every retrieval is (query, token_budget) → minimal subgraph. There is no “give me everything”: the API does not expose an unbounded retrieval call.
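
One simple way to honor a budget is greedy selection over an already ranked list. This is a sketch, not the memeX retriever: the function name, the whitespace-based token count, and the skip-and-continue policy are assumptions.

```python
def select_within_budget(ranked: list, token_budget: int) -> list:
    """Greedily keep ranked (node_id, text) pairs until the budget is spent."""
    if token_budget <= 0:
        raise ValueError("token_budget must be positive")
    chosen, used = [], 0
    for node_id, text in ranked:
        cost = len(text.split())  # crude whitespace token count (assumption)
        if used + cost > token_budget:
            continue              # skip items that do not fit; try smaller ones
        chosen.append(node_id)
        used += cost
    return chosen


ranked = [
    ("n1", "deploys require two approvals"),
    ("n2", "rollback procedure for the billing service is documented"),
    ("n3", "oncall rotates weekly"),
]
picked = select_within_budget(ranked, token_budget=8)
```

With a budget of 8, `n1` (4 tokens) fits, `n2` (8 tokens) would overflow and is skipped, and `n3` (3 tokens) still fits. Requiring a positive budget mirrors the rule that there is no unbounded call.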

Foundational data structures

Each layer rests on a standard computer-science data structure or algorithm:

Property graph — typed nodes + typed edges, primary key indexing
Inverted index (rank_bm25) — token → docs mapping for keyword retrieval
Brute-force k-NN with cosine (numpy) — at v1 scale; HNSW upgrade is a Protocol-conforming swap
LRU cache (OrderedDict) — WorkingSet, the L1 layer
Time-indexed B-tree — episodic events, ordered by timestamp
Reciprocal Rank Fusion — combining BM25 + vector ranks; empirically robust to the choice of constant (k=60 is the conventional default)
Exponential decay — Ebbinghaus forgetting curve, confidence(t) = base * 0.5^(days/half_life)
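
Of the items above, Reciprocal Rank Fusion is the least self-explanatory, so here is a minimal sketch of it. Each document scores the sum of 1 / (k + rank) over the lists it appears in. The function name and the sample rankings are illustrative, not memeX code.

```python
def reciprocal_rank_fusion(rankings: list, k: int = 60) -> list:
    """Fuse ranked lists: score(d) = sum over lists of 1 / (k + rank(d))."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score, breaking ties by ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_ranking = ["a", "b", "c"]    # hypothetical keyword results
vector_ranking = ["b", "d", "a"]  # hypothetical embedding results
fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
```

`b` wins because it ranks high in both lists (1/62 + 1/61), ahead of `a` (1/61 + 1/63). Documents that appear in only one list trail behind. RRF uses ranks only, so BM25 and cosine scores never need to be normalized against each other.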

Software architecture: SOLID with Protocols

Repository pattern with Protocol-based interfaces — every store defined as a Protocol so future swaps don’t require core changes.
Constructor dependency injection — Engine takes its stores + retriever + provider as constructor args.
Factory method — Engine.build_default() for the conventional wiring.
Strategy pattern — retrieval strategies (BM25, hybrid) interchangeable.
Provider pattern — embeddings as NoOpEmbeddingProvider + FastEmbedProvider, swappable at runtime.
Façade — Engine as the single entry point; frontends never reach into stores.
Layered architecture — frontends → engine → core → stores.
Lazy loading — heavy resources (embedding models) load on first use.
Fail loud — HTTP daemon refuses non-localhost binding without auth.
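
Several of these patterns fit in a few lines. The sketch below combines a Protocol-defined store, constructor injection, a `build_default()` factory, and the façade rule. `SemanticStore`, `InMemorySemanticStore`, and `EngineSketch` are hypothetical names, and the method signatures are assumptions rather than the real memeX interfaces.

```python
from typing import Optional, Protocol


class SemanticStore(Protocol):
    """Hypothetical store interface; any object with these methods conforms."""
    def put(self, node_id: str, text: str) -> None: ...
    def get(self, node_id: str) -> Optional[str]: ...


class InMemorySemanticStore:
    """Conforms structurally; no inheritance from the Protocol is needed."""

    def __init__(self) -> None:
        self._nodes = {}

    def put(self, node_id: str, text: str) -> None:
        self._nodes[node_id] = text

    def get(self, node_id: str) -> Optional[str]:
        return self._nodes.get(node_id)


class EngineSketch:
    """Facade: frontends call the engine and never touch the store directly."""

    def __init__(self, semantic: SemanticStore) -> None:
        self._semantic = semantic  # constructor dependency injection

    @classmethod
    def build_default(cls) -> "EngineSketch":
        return cls(InMemorySemanticStore())  # factory for the conventional wiring

    def add(self, node_id: str, text: str) -> None:
        self._semantic.put(node_id, text)

    def recall(self, node_id: str) -> Optional[str]:
        return self._semantic.get(node_id)


engine = EngineSketch.build_default()
engine.add("c1", "example fact")
```

Swapping in a different backend means passing another object with the same two methods to the constructor; `EngineSketch` itself does not change.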

Polyglot by design

The Protocol-based interfaces mean any layer can be reimplemented in any language and dropped into the same architecture. v1 ships all-Python because it’s the fastest path to correctness; future hot paths can be replaced module-by-module.

The HTTP API is the universal cross-language boundary. Client bindings can be generated from the OpenAPI spec served at /openapi.json.

Watchful, not passive

memeX isn’t a database the agent queries — it’s an active observer:

Every add / link / recall / validate auto-emits an episodic event.
progress() exposes the full activity log.
The AI can introspect what it’s already done and avoid repeating itself.
Future versions detect contradictions between memories and surface them.
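
The auto-emission behavior described above can be approximated with a decorator. This is a sketch only: `observed`, `ObservedEngine`, and the event fields are hypothetical, and only the operation names and `progress()` come from the text.

```python
import functools
from datetime import datetime, timezone


def observed(method):
    """Wrap a method so each call appends an event to self._events."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        result = method(self, *args, **kwargs)
        self._events.append({
            "op": method.__name__,
            "args": args,
            "at": datetime.now(timezone.utc),
        })
        return result
    return wrapper


class ObservedEngine:
    """Illustrative engine whose public operations log themselves."""

    def __init__(self) -> None:
        self._events = []
        self._notes = {}

    @observed
    def add(self, node_id, text):
        self._notes[node_id] = text

    @observed
    def recall(self, node_id):
        return self._notes.get(node_id)

    def progress(self):
        # Return a copy so callers cannot mutate the log.
        return list(self._events)


observed_engine = ObservedEngine()
observed_engine.add("c1", "example fact")
observed_engine.recall("c1")
ops = [event["op"] for event in observed_engine.progress()]
```

After one `add` and one `recall`, the log holds two events in call order, which is what lets an agent check what it has already done.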