MEMM OS

v0.1.0 · Free · Open Source · AI-Native Context Architecture

The Soul of your AI.

A local, portable, and universal 🧠 for any LLM — from local models to frontier systems.

MEMM OS is the native intelligence layer that captures your reasoning, wisdom, and project history. Stop the re-explaining, the inconsistent results, and the token burn; give your AI a persistent memory you actually control.

Persistent, inspectable memory that grows and refines over time. Built on files. Owned by you.

Also available for Windows and Linux  ·  View on GitHub ↗

+43%
Context Accuracy

Higher retrieval precision than vector-only RAG pipelines. Consistent, reliable results.

87%
Token Savings

Stop burning money on noise. Inject only high-signal reasoning that actually matters.

<10ms
Query Latency

Pure Rust scoring engine. Sub-10ms contextual awareness at any scale.

Your AI has amnesia.

Three problems every engineer knows too well.

01

The blank slate.

Every conversation starts from scratch.
You've explained your error handling conventions a hundred times. Your AI hasn't retained any of it.

02

The bloated context file.

CLAUDE.md grows until you stop trusting it.
No structure. No scoring. No way to know what the AI actually reads. A 2,000-word file injected wholesale into every query.

03

The tool silo.

Your Claude context doesn't help Cursor.
Every tool has its own memory. You maintain three diverging copies of the same knowledge.

Four Pillars of Sovereign Memory.

01

Your memory, in plain text.

Memories are Markdown files with YAML frontmatter. You can open them, edit them, version them with Git, and share them with your team. No database. No embeddings. No black box.

Pillar: Files you can read
---
type: knowledge
area: architecture
importance: 0.9
---
## Error Handling Convention

Always use Result<T, AppError>.
Propagate with `?` operator.
Never panic in library code.
02

Scored, tiered, and token-aware.

A 6-signal scoring engine (BM25 + semantic + graph + recency + importance + access frequency) ranks your memories by relevance to each query. A tiered content model (L0 / L1 / L2) loads the right level of detail within your token budget. No over-stuffing. No under-informing.

Pillar: Context that earns its token budget
Query: "How do we handle errors?"
error-handling-convention.md 1.94
rust-patterns.md 1.23
api-design.md 0.41
BM25 · Semantic · Graph · Recency · Importance · Access
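The six signals above can be pictured as a weighted sum over per-memory scores. The sketch below is illustrative only: the 0..1 normalization and the weights are assumptions for the example, not MEMM OS's actual implementation.

```rust
// Illustrative 6-signal relevance score. Each signal is assumed
// to be pre-normalized to 0..1; the weights are placeholders,
// not MEMM OS's real values.
struct Signals {
    bm25: f64,       // lexical keyword match
    semantic: f64,   // semantic similarity to the query
    graph: f64,      // link-graph proximity to related memories
    recency: f64,    // how recently the memory was updated
    importance: f64, // author-assigned importance (frontmatter)
    access: f64,     // how often the memory is retrieved
}

fn relevance(s: &Signals) -> f64 {
    0.30 * s.bm25
        + 0.25 * s.semantic
        + 0.15 * s.graph
        + 0.10 * s.recency
        + 0.15 * s.importance
        + 0.05 * s.access
}

fn main() {
    let convention = Signals {
        bm25: 0.9, semantic: 0.8, graph: 0.7,
        recency: 0.5, importance: 0.9, access: 0.8,
    };
    let unrelated = Signals {
        bm25: 0.1, semantic: 0.2, graph: 0.1,
        recency: 0.9, importance: 0.3, access: 0.1,
    };
    // The on-topic convention note outranks the unrelated one,
    // even though the unrelated note is fresher.
    assert!(relevance(&convention) > relevance(&unrelated));
    println!("{:.2} vs {:.2}", relevance(&convention), relevance(&unrelated));
}
```

Because recency is only one of six signals, a fresh but off-topic memory cannot crowd out an older, high-importance convention.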
03

Connect once, works everywhere.

MEMM OS runs an MCP server that exposes your memory to Claude Code, Claude Desktop, Cursor, Windsurf, Codex, and any local model — simultaneously. A single ground truth. No duplication.

Pillar: One workspace, every tool
Claude Desktop
Cursor
Windsurf
Claude Code
Codex
Local LLMs
↑ All served by one MCP server ↑
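In practice, pointing a client like Claude Desktop at an MCP server is a single config entry. The fragment below is a sketch of that shape; the server name, command path, and arguments are placeholders, not MEMM OS's documented invocation.

```json
{
  "mcpServers": {
    "memm-os": {
      "command": "/path/to/memm-os",
      "args": ["serve", "--stdio"]
    }
  }
}
```

Each client gets the same entry, so every tool reads from the same workspace instead of keeping its own copy.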
04

Memory that doesn't rot.

A built-in governance layer surfaces stale memories (decay), contradictions (conflicts), redundancy (consolidation candidates), and temporary debris (scratch cleanup). A 0–100 health score keeps you honest about the state of your knowledge base.

Pillar: Memory you can govern
87/100 Health Score
3 decayed memories
1 conflict detected
2 consolidation candidates
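One way to picture the health score is as a penalty model over governance findings. The weights below are illustrative assumptions, chosen only so the example figures above line up; they are not MEMM OS's documented formula.

```rust
// Hypothetical health-score sketch: start from 100 and subtract
// a penalty per governance finding. The per-finding weights are
// illustrative, not MEMM OS's real values.
fn health_score(decayed: u32, conflicts: u32, consolidation: u32) -> u32 {
    let penalty = 2 * decayed + 5 * conflicts + consolidation;
    100u32.saturating_sub(penalty)
}

fn main() {
    // 3 decayed, 1 conflict, 2 consolidation candidates
    // -> 100 - (6 + 5 + 2) = 87
    println!("{}/100", health_score(3, 1, 2));
}
```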

Born for the AI Era.

Legacy tools solve human amnesia. MEMM OS is built to solve AI amnesia.

Legacy Adapters (e.g. Obsidian)

A different epoch.

Human-first note-taking adapted to AI through a patchwork of 1000+ plugins. Convoluted, fragile, and cognitively heavy, solving a problem that didn't exist when these tools were built.

  • 1000+ Plugins to Duct-Tape AI
  • Manual Link Maintenance
  • High Setup & Sync Complexity
  • No Context Scoring or Governance
  • Amnesic by default
AI-Native (MEMM OS)

Integrated from day one.

A context engine built from line zero for the LLM era. No plugins — just a direct, high-fidelity bridge between your knowledge and your AI.

  • Zero-Config Intelligence
  • Automated Knowledge Scoring
  • Built-in Governance & Health
  • Native MCP Connectivity
  • Grows & refines over time

From files to compound intelligence.

Three steps to a memory worth trusting.

01

Build your workspace.

Create your first memories: who you are, your stack, your conventions, your architectural decisions. Drop external documents into inbox/ and promote them to the right category.

02

Connect your tools.

Point Claude Desktop, Cursor, or Claude Code at your MCP server. They call get_context with every query and receive the most relevant memories, scored and budgeted.
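Under MCP, that lookup is a standard tools/call request. The sketch below shows what it might look like; the get_context argument names (query, token_budget) are assumptions about MEMM OS's tool schema, not its documented interface.

```json
{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "tools/call",
  "params": {
    "name": "get_context",
    "arguments": {
      "query": "How do we handle errors?",
      "token_budget": 2000
    }
  }
}
```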

03

Let it compound.

Every session adds knowledge. The scoring engine learns what you use. Governance surfaces what needs updating. Your AI gets smarter with your project over time.

The Technical Matrix.

Compare the architecture of MEMM OS against traditional vector systems, legacy tools, and manual approaches.

| Capability | MEMM OS | Vector RAG | Manual (CLAUDE.md) | Legacy (Obsidian) |
| --- | --- | --- | --- | --- |
| You can read the memory | ✓ | ✗ | ✓ | ✓ |
| Inspectable retrieval | Yes (Transparent) | No (Blackbox) | Manual only | No |
| See what AI will retrieve | ✓ (Simulation view) | ✗ | ✗ | ✗ |
| Real-time token budgeting | Yes (L0/L1/L2) | Static window | None | None |
| Works across all tools | ✓ (Native MCP) | ✗ Siloed | Partial | ✗ Plugins |
| Local model support | Native / Hybrid | Limited / API | High | Via plugins |
| No database infrastructure | ✓ Zero infra | ✗ Vector DB | ✓ Plain file | Local app |
| Context health & decay | ✓ Automated | Static DB | Rotting | Human only |
| Scales with complexity | ✓ | Partial | ✗ | — |
| Built-in governance | ✓ | ✗ | ✗ | ✗ |
| Sovereign portability | Filesystem-Native | DB Lock-in | High | Filesystem |

Built for Engineers.

What's under the hood before you download anything.

Stack

Built with Tauri v2 (Rust backend, React frontend). The scoring engine is pure Rust — sub-10ms query times. MCP server runs as stdio (CLI) and HTTP/SSE (IDEs) simultaneously. Memory stored as local files — git-friendly, backup-friendly, editor-friendly.

Open Source

MIT licensed. The full source is on GitHub. The file format is human-readable and documented. Your memory is yours — forever, no vendor lock-in.

🔒

Privacy

Nothing leaves your machine. No analytics, no telemetry, no cloud sync. The MCP server binds to 127.0.0.1 only. Your wisdom stays yours.

Start with what you know. Files.

MEMM OS is free, open source, and works with the AI tools you already use. Download the app, set up your workspace in 5 minutes, and stop re-explaining your entire context every session.

Read the academic paper · Read the technical whitepaper · View on GitHub