Architecture And Operating Model

Purpose

AI Context OS is a filesystem-first memory layer for AI agents.

It is not meant to be a chat product and not meant to be tied to a single external tool. The system is intended to act as a persistent, compounding knowledge base that can be edited and maintained by humans and AI together, while remaining readable and portable as plain files.

The product thesis is:

  • files are the source of truth
  • context should be routed and maintained, not rediscovered from scratch every time
  • integrations should be adapters, not the canonical system
  • the memory graph should improve over time as the user works

Core storage model

The canonical data model lives in the workspace filesystem.

What is a Canonical Memory (and what is not)

A canonical memory is a Markdown file that the system explicitly recognizes. It must have:

  • Valid YAML frontmatter
  • Required fields: id, type, and l0 (the ontology field is optional)
  • Content structured by levels via <!-- L1 --> and <!-- L2 -->

Important Note:

  • Not every .md file in the repository is a memory.
  • Files like README.md, pure documentation in docs/, and bare Markdown files without canonical frontmatter are intentionally ignored by the indexer.
  • The scanner no longer auto-injects frontmatter into non-compliant Markdown files. This preserves the repository’s native documents untouched.
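The recognition rule above can be sketched as a small check. This is a minimal illustration, not the indexer's actual code: the function names are hypothetical, the required fields are assumed to be id, type, and l0, and frontmatter is read with a naive "key: value" line reader rather than a real YAML parser.

```typescript
// Result of checking one Markdown file against the canonical-memory contract.
interface CanonicalCheck {
  canonical: boolean;
  reasons: string[]; // why the file was rejected; empty when canonical
}

// Assumption: these three fields are the required ones.
const REQUIRED_FIELDS = ["id", "type", "l0"];

// Naive frontmatter reader (hypothetical helper): returns the top-level
// "key: value" pairs, or null when the block is missing or never closed.
function readFrontmatter(text: string): Record<string, string> | null {
  const lines = text.split(/\r?\n/);
  if (lines[0].trim() !== "---") return null; // must open with a fence
  const fields: Record<string, string> = {};
  for (let i = 1; i < lines.length; i++) {
    const line = lines[i];
    if (line.trim() === "---") return fields; // closing fence found
    const colon = line.indexOf(":");
    if (colon > 0) {
      fields[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
    }
  }
  return null; // frontmatter was never closed
}

function checkCanonicalMemory(text: string): CanonicalCheck {
  const reasons: string[] = [];
  const fields = readFrontmatter(text);
  if (fields === null) {
    reasons.push("missing or unterminated frontmatter");
  } else {
    for (const key of REQUIRED_FIELDS) {
      if (!fields[key]) reasons.push("missing required field: " + key);
    }
  }
  if (text.indexOf("<!-- L1 -->") === -1) reasons.push("missing <!-- L1 --> marker");
  if (text.indexOf("<!-- L2 -->") === -1) reasons.push("missing <!-- L2 --> marker");
  return { canonical: reasons.length === 0, reasons: reasons };
}
```

A file that fails the check is simply skipped, never rewritten, which matches the rule that the scanner does not inject frontmatter.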

Other elements of the filesystem:

  • journal pages are files
  • tasks are files
  • rules are files
  • scratch output is files
  • router and adapter artifacts are files

The app also uses a local file for telemetry and observability:

  • location: {workspace}/.cache/memory-usage.json (usage telemetry, access counts, last access)
  • purpose: telemetry, context-request history, optimization suggestions
  • non-canonical: it is not the source of truth for memory content. The frontmatter of memories no longer contains last_access or access_count to keep git commits clean.

This distinction is critical. The filesystem is the knowledge base. The cache is support infrastructure.
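To make the split concrete, here is a hedged sketch of how access telemetry could be recorded in the cache without touching the memory file. The schema of memory-usage.json is not specified in this document, so the entry shape, the field names (borrowed from the old frontmatter keys access_count and last_access), and the recordAccess helper are all assumptions.

```typescript
// Hypothetical shape of one entry in {workspace}/.cache/memory-usage.json.
interface UsageEntry {
  access_count: number;
  last_access: string; // ISO 8601 timestamp
}

// Assumption: the cache is a plain object keyed by memory id.
type UsageCache = Record<string, UsageEntry>;

// Returns a new cache with one more access recorded for `memoryId`.
// The input cache is not mutated, and the memory's Markdown file is
// never written, so telemetry does not show up in git history.
function recordAccess(cache: UsageCache, memoryId: string, now: Date): UsageCache {
  const previous = cache[memoryId];
  const updated: UsageCache = {};
  for (const key of Object.keys(cache)) updated[key] = cache[key];
  updated[memoryId] = {
    access_count: (previous ? previous.access_count : 0) + 1,
    last_access: now.toISOString(),
  };
  return updated;
}
```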

Progressive memory model

Each memory has three levels:

  • L0: one-line summary in frontmatter
  • L1: operational summary
  • L2: full detail

The current system uses explicit markers:

---
id: example-memory
type: context
ontology: entity
l0: "One-line summary"
importance: 0.8
tags: [example]
related: [other-memory]
protected: true
---

<!-- L1 -->
Operational summary.

<!-- L2 -->
Long-form detail.

This model is important because it allows the engine to load just enough context.
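A minimal sketch of how the markers could be used to split a memory body into levels. The function names are hypothetical, and the choice that a higher level also includes the lower ones is an assumption, not something this document specifies.

```typescript
interface MemoryLevels {
  l1: string; // operational summary
  l2: string; // full detail
}

// Splits the body of a memory (the text after the frontmatter) into its
// L1 and L2 sections using the explicit HTML-comment markers.
function splitLevels(body: string): MemoryLevels {
  const L1_MARKER = "<!-- L1 -->";
  const L2_MARKER = "<!-- L2 -->";
  const l1Start = body.indexOf(L1_MARKER);
  const l2Start = body.indexOf(L2_MARKER);
  if (l1Start === -1 || l2Start === -1 || l2Start < l1Start) {
    throw new Error("memory body must contain <!-- L1 --> followed by <!-- L2 -->");
  }
  return {
    l1: body.slice(l1Start + L1_MARKER.length, l2Start).trim(),
    l2: body.slice(l2Start + L2_MARKER.length).trim(),
  };
}

// Returns only as much text as the requested level calls for.
// Assumption: each level includes the levels below it.
function renderAtLevel(l0: string, levels: MemoryLevels, level: "L0" | "L1" | "L2"): string {
  if (level === "L0") return l0;
  if (level === "L1") return l0 + "\n\n" + levels.l1;
  return l0 + "\n\n" + levels.l1 + "\n\n" + levels.l2;
}
```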

Progressive loading: MCP vs Non-MCP

It is critical to understand the real capabilities of the L1/L2 distinction:

  • With MCP: Progressive loading is real. The backend engine parses the memory, separates L1 and L2, and decides how much to return based on score and budget.
  • Without MCP: Strict isolation does not exist. The static router tells the agent “read L1 first”, but if the agent opens the file natively, it can read the whole file. It is a well-guided convention, not a hard technical barrier.

Current memory classification model

Operational type

The existing type field is still the system’s operational classification. It is part of the current contract and is used across Rust, TypeScript, routing, CRUD, folder inference, and UI behavior.

Current values:

  • context
  • daily
  • intelligence
  • project
  • resource
  • skill
  • task
  • rule
  • scratch

This field should currently be understood as “how the system treats this memory operationally”.

Ontology layer

The system now includes an optional ontology field:

  • source
  • entity
  • concept
  • synthesis

This layer exists to improve AI reasoning without replacing the current storage model.

The ontology is meant to answer:

  • what kind of thing is this file semantically?
  • should the AI treat it as raw input, a real-world object, an abstract idea, or a distilled output?

Why both layers exist

There was an important product decision behind this:

  • users should remain free to organize their lives with folders however they want
  • the system should not depend only on folder names to understand content
  • the AI needs a semantic layer that survives future folder changes

So the intended long-term separation is:

  • folders: for human organization
  • type: currently operational system classification
  • ontology: semantic classification for AI
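The two classification layers could be expressed as separate types, which keeps them from being confused with each other. This is an illustrative sketch built from the values listed above; it is not the contents of src/lib/types.ts, and the interface and guard names are hypothetical.

```typescript
// Operational classification: how the system treats the memory.
type MemoryType =
  | "context" | "daily" | "intelligence" | "project" | "resource"
  | "skill" | "task" | "rule" | "scratch";

// Semantic classification: what the file is, independent of its folder.
type Ontology = "source" | "entity" | "concept" | "synthesis";

// Hypothetical frontmatter shape, based on the example memory shown earlier.
interface MemoryFrontmatter {
  id: string;
  type: MemoryType;
  l0: string;
  ontology?: Ontology; // optional semantic layer
  importance?: number;
  tags?: string[];
  related?: string[];
  protected?: boolean;
}

const MEMORY_TYPES: MemoryType[] = [
  "context", "daily", "intelligence", "project", "resource",
  "skill", "task", "rule", "scratch",
];
const ONTOLOGIES: Ontology[] = ["source", "entity", "concept", "synthesis"];

// Runtime guards for values read from untyped frontmatter.
function isMemoryType(value: string): value is MemoryType {
  return (MEMORY_TYPES as string[]).indexOf(value) !== -1;
}
function isOntology(value: string): value is Ontology {
  return (ONTOLOGIES as string[]).indexOf(value) !== -1;
}

// The example memory from this document, typed.
const exampleFrontmatter: MemoryFrontmatter = {
  id: "example-memory",
  type: "context",
  ontology: "entity",
  l0: "One-line summary",
  importance: 0.8,
  tags: ["example"],
  related: ["other-memory"],
  protected: true,
};
```

Because ontology is optional, existing memories that only carry type stay valid under this shape.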

Workspace structure

Current stable workspace

The workspace balances human freedom with system needs:

workspace/
├── inbox/
├── sources/
├── .ai/
│   ├── rules/
│   ├── skills/
│   ├── catalog.md
│   ├── index.yaml
│   └── config.yaml
├── claude.md
├── .cursorrules
├── .windsurfrules
└── .cache/
    └── memory-usage.json

Users can create any folder structure they want. The system infers metadata dynamically based on paths:

  • The first segment of the relative path is used as folder_category.
  • Paths in .ai/rules/ receive system_role = rule.
  • Paths in .ai/skills/ receive system_role = skill.

Protected locations like .ai/, inbox/, and sources/ are managed by the system and have restricted write access.
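The path-based inference described above can be sketched as a pure function. The names below are hypothetical, and two details are assumptions: a file sitting directly at the workspace root gets an empty folder_category, and only files inside .ai/rules/ or .ai/skills/ (not the directories themselves) receive a system_role.

```typescript
interface PathMetadata {
  folderCategory: string;              // first segment of the relative path
  systemRole: "rule" | "skill" | null; // set only under .ai/rules/ or .ai/skills/
  protectedLocation: boolean;          // inside .ai/, inbox/, or sources/
}

const PROTECTED_ROOTS = [".ai", "inbox", "sources"];

function inferPathMetadata(relativePath: string): PathMetadata {
  // Normalise Windows separators and drop empty or "." segments.
  const segments = relativePath
    .replace(/\\/g, "/")
    .split("/")
    .filter(function (s) { return s !== "" && s !== "."; });

  // Assumption: a root-level file has no folder category.
  const root = segments.length > 1 ? segments[0] : "";

  let systemRole: "rule" | "skill" | null = null;
  if (root === ".ai" && segments.length > 2) {
    if (segments[1] === "rules") systemRole = "rule";
    if (segments[1] === "skills") systemRole = "skill";
  }

  return {
    folderCategory: root,
    systemRole: systemRole,
    protectedLocation: PROTECTED_ROOTS.indexOf(root) !== -1,
  };
}
```

For example, inferPathMetadata("projects/alpha/plan.md") yields folderCategory "projects" with no system role, while ".ai/rules/style.md" yields system role "rule" in a protected location.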

Meaning of inbox

inbox/ is now the intake area for future ingestion workflows.

Its role is:

  • hold raw or semi-raw material pending processing
  • give the user an obvious drop zone for incoming material
  • provide a clean first step for future ingest commands and UI

Important:

  • inbox/ is not currently a full memory folder
  • files there can be opened as raw files in the explorer
  • they are not yet part of the regular memory index unless they are moved into a memory folder or future ingestion logic promotes them

Routing model

The router is designed to be neutral-first and adapter-based. The architecture has shifted from a single monolithic router to an intermediate representation (RouterManifest) that generates four distinct outputs:

  1. Static Router (claude.md, .cursorrules, .windsurfrules)
    • Valuable by itself without MCP.
    • Includes: Main workspace rules, reading/writing rules, structural breakdown, compact L0 index, relative paths, and basic ontology for each memory.
    • Purposefully ignores: Heavy structured metadata, access telemetry, detailed provenance networks.
  2. Enriched Catalog (.ai/catalog.md)
    • Human-readable supplementary view.
    • Provides deep-dives into metadata that the static router omits: tags, relations (related, derived_from), status, triggers, requirements, and protected flags.
  3. Structured Index (.ai/index.yaml)
    • A machine-oriented serializable manifest for stable tool integration.
  4. MCP Prelude
    • A much shorter, tool-oriented router injected dynamically for connected agents only. It focuses on the available tools rather than on large memory indices.

This matters because the product should not become conceptually owned by one tool, and heavy metadata no longer clutters the static router.
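The idea of one intermediate representation feeding several renderers can be sketched as follows. The real RouterManifest is not shown in this document, so the RouterManifestSketch shape, the renderer names, and the output layout are all hypothetical. Only two of the four outputs are illustrated.

```typescript
// Hypothetical per-memory entry in the intermediate representation.
interface ManifestEntry {
  id: string;
  path: string;       // relative path inside the workspace
  l0: string;         // one-line summary
  ontology?: string;
  tags: string[];
  related: string[];
  protected: boolean;
}

interface RouterManifestSketch {
  rules: string[];
  entries: ManifestEntry[];
}

// Static router: workspace rules plus a compact L0 index with relative
// paths and basic ontology. Heavy metadata is deliberately left out.
function renderStaticRouter(manifest: RouterManifestSketch): string {
  const rules = manifest.rules.map(function (r) { return "- " + r; });
  const index = manifest.entries.map(function (e) {
    const ontology = e.ontology ? " [" + e.ontology + "]" : "";
    return "- " + e.id + " (" + e.path + ")" + ontology + ": " + e.l0;
  });
  return ["# Rules"].concat(rules, ["", "# Memory index"], index).join("\n");
}

// Enriched catalog: the metadata the static router omits.
function renderCatalog(manifest: RouterManifestSketch): string {
  return manifest.entries
    .map(function (e) {
      return [
        "## " + e.id,
        "tags: " + e.tags.join(", "),
        "related: " + e.related.join(", "),
        "protected: " + String(e.protected),
      ].join("\n");
    })
    .join("\n\n");
}
```

Keeping both renderers as pure functions of the same manifest is what makes the outputs derived artifacts: they can be regenerated at any time and never need to be edited by hand.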

Query and scoring model

The engine currently works by:

  1. scanning workspace memories
  2. scoring them against a query
  3. using budget-aware loading rules
  4. deciding L0, L1, or L2
  5. returning loaded vs unloaded memory context
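Steps 3 to 5 can be sketched as a greedy, budget-aware planner. This is an illustration under stated assumptions, not the engine's algorithm: scores are taken as already computed in [0, 1], the 0.7 and 0.4 thresholds are invented for the example, and per-level token costs are assumed to be known up front.

```typescript
type LoadLevel = "L0" | "L1" | "L2";

interface ScoredMemory {
  id: string;
  score: number; // relevance to the query, assumed to lie in [0, 1]
  cost: { L0: number; L1: number; L2: number }; // hypothetical token cost per level
}

interface LoadDecision {
  id: string;
  level: LoadLevel | null; // null means the memory stays unloaded
}

// Visits memories from highest to lowest score. Each memory gets the
// deepest level its score justifies, downgraded until it fits the
// remaining budget, or is left unloaded when nothing fits.
function planLoading(memories: ScoredMemory[], budget: number): LoadDecision[] {
  const ranked = memories.slice().sort(function (a, b) { return b.score - a.score; });
  let remaining = budget;
  return ranked.map(function (m): LoadDecision {
    // Illustrative thresholds only.
    const wanted: LoadLevel[] =
      m.score >= 0.7 ? ["L2", "L1", "L0"] :
      m.score >= 0.4 ? ["L1", "L0"] :
      ["L0"];
    for (const level of wanted) {
      if (m.cost[level] <= remaining) {
        remaining -= m.cost[level];
        return { id: m.id, level: level };
      }
    }
    return { id: m.id, level: null };
  });
}
```

The result separates loaded from unloaded memories in a single pass, which mirrors the last step of the list above.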

Current scoring is deterministic and heuristic-first. It combines:

  • heuristic semantic score
  • BM25-style lexical score
  • recency
  • importance
  • access frequency
  • graph proximity
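One simple way to combine these signals deterministically is a fixed weighted sum. The sketch below assumes each signal has already been normalised to [0, 1]; the weights and the half-life recency formula are invented for illustration and are not the engine's actual values.

```typescript
// One normalised value per signal named in the list above.
interface Signals {
  semantic: number;
  lexical: number;
  recency: number;
  importance: number;
  accessFrequency: number;
  graphProximity: number;
}

// Illustrative weights only. They sum to 1 so the result stays in [0, 1].
const WEIGHTS: Signals = {
  semantic: 0.3,
  lexical: 0.25,
  recency: 0.1,
  importance: 0.15,
  accessFrequency: 0.1,
  graphProximity: 0.1,
};

function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

// Deterministic weighted sum: the same inputs always give the same score.
function combineScore(signals: Signals): number {
  const keys = Object.keys(WEIGHTS) as (keyof Signals)[];
  let total = 0;
  for (const key of keys) total += WEIGHTS[key] * clamp01(signals[key]);
  return total;
}

// Hypothetical recency signal: exponential decay that halves every
// `halfLifeDays` days since the memory was last touched.
function recencySignal(ageDays: number, halfLifeDays: number): number {
  return Math.pow(0.5, Math.max(0, ageDays) / halfLifeDays);
}
```

A fixed formula like this keeps ranking explainable: any score can be traced back to the six inputs and their weights.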

The system does not currently use Tantivy or production embeddings as its active retrieval core.

The intended future direction is likely:

  • stronger local lexical retrieval first
  • semantic layers second
  • local embeddings used initially for maintenance and assistance, not as the only source of retrieval truth

Governance model

The system already contains a governance layer that inspects the knowledge base for quality issues.

Current governance areas:

  • contradictions between related memories
  • decay candidates
  • consolidation suggestions
  • scratch cleanup candidates

This is conceptually close to the “lint” operation described in LLM-maintained wiki patterns, even though the full ingest-query-lint cycle is not finished yet.

Ingestion model: intended direction

The intended ingestion workflow is:

  1. a source lands in inbox/
  2. the user triggers ingestion explicitly
  3. the system reads the source
  4. the system proposes ontology, summary, tags, and affected pages
  5. the system creates new knowledge or updates existing memories
  6. the source is marked as processed or moved
  7. the action is logged

This is not fully implemented yet. Right now, only the first structural piece exists: the inbox itself.

External AI vs local AI roles

The current system already integrates with external AI tools through adapters and connectors. Connectors are defined by their real capabilities:

  • Native MCP (e.g., Claude Desktop, Claude Code): Direct MCP/stdio support and full system tool access.
  • Remote MCP (e.g., Cursor, Windsurf): HTTP/SSE MCP when the app is open; otherwise they fall back to static rules such as .cursorrules.
  • Bridge (e.g., ChatGPT Web, Copilot): No real MCP. Full reliance on static context snapshots.
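The three connector classes above map to a small decision function. The type and function names here are hypothetical labels for the behaviour described in the list, not identifiers from the codebase.

```typescript
type ConnectorKind = "native-mcp" | "remote-mcp" | "bridge";

// How a connected tool actually receives context.
type ContextChannel =
  | "mcp-stdio"        // direct MCP over stdio
  | "mcp-http-sse"     // MCP over HTTP/SSE, served while the app is open
  | "static-rules"     // generated files such as .cursorrules
  | "static-snapshot"; // static context snapshot, no MCP at all

function resolveChannel(kind: ConnectorKind, appOpen: boolean): ContextChannel {
  switch (kind) {
    case "native-mcp":
      return "mcp-stdio";
    case "remote-mcp":
      // Remote MCP only works while the app is running.
      return appOpen ? "mcp-http-sse" : "static-rules";
    case "bridge":
      return "static-snapshot";
  }
}

// True when progressive L1/L2 loading can be enforced by the backend.
function usesMcp(channel: ContextChannel): boolean {
  return channel === "mcp-stdio" || channel === "mcp-http-sse";
}
```

For instance, resolveChannel("remote-mcp", false) returns "static-rules", so that agent is in the convention-only regime described under progressive loading.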

Longer-term, there is also room for local model support for background maintenance tasks such as:

  • classification
  • deduplication hints
  • ontology suggestions
  • ingestion summaries
  • update proposals
  • governance assistance

That future direction should not replace external-tool integration. It should complement it.

Hard Protections

The system now enforces the protected flag in the backend rather than relying on convention:

  • Backend operations (save_memory, delete_memory) and MCP tools prevent modifications to protected files.
  • To edit a protected memory, the agent or user must first perform an explicit operation that sets protected: false, and only then alter the content.
  • Generated system artifacts (claude.md, .ai/index.yaml) also reject raw file manipulations to preserve integrity.
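The protected-flag rules described under Hard Protections can be sketched with an in-memory store. This is a TypeScript stand-in for illustration only: the real save_memory command lives in the backend, and the store shape, the result type, and the function names below are assumptions.

```typescript
interface StoredMemory {
  id: string;
  protected: boolean;
  content: string;
}

// Hypothetical in-memory store keyed by memory id.
type MemoryStore = Record<string, StoredMemory>;

type WriteResult = { ok: true } | { ok: false; reason: string };

// Refuses to overwrite a protected memory. New memories start unprotected.
function saveMemory(store: MemoryStore, id: string, content: string): WriteResult {
  const existing = store[id];
  if (existing && existing.protected) {
    return { ok: false, reason: "memory is protected; unlock it explicitly first" };
  }
  store[id] = { id: id, protected: false, content: content };
  return { ok: true };
}

// The explicit unlock (or re-lock) step: changes only the flag.
function setProtected(store: MemoryStore, id: string, value: boolean): WriteResult {
  const existing = store[id];
  if (!existing) return { ok: false, reason: "unknown memory: " + id };
  existing.protected = value;
  return { ok: true };
}

// Derived artifacts named in this document; raw writes to them are rejected.
const DERIVED_ARTIFACTS = ["claude.md", ".cursorrules", ".ai/index.yaml"];

function canWriteRaw(relativePath: string): boolean {
  return DERIVED_ARTIFACTS.indexOf(relativePath) === -1;
}
```

Splitting the unlock into its own call means a content edit can never silently clear the flag as a side effect.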

Product principle about folders

One of the most important decisions reached in the design discussion was this:

  • users should keep the freedom to organize their world using folders however they want
  • the app should not force a single lifestyle taxonomy onto everyone

Therefore the future architecture should preserve:

  • human folder freedom
  • AI semantic understanding through frontmatter
  • small system-owned zones only where the product truly needs them

Operational Rules

These rules ensure the memory repository stays clean and predictable.

For Humans:

  • Use canonical memories only when the file must be part of the context system. Do not convert general repository documentation into AI memories unless conceptually intended.
  • Set protected: true on any memory that should stay immutable until a human confirms a change.
  • Use system folders for their intended purpose: inbox/ for staging, and .ai/skills/ strictly for system-level logic and capabilities.

For AIs:

  • Do not assume every .md is a memory. Only read those matching the canonical standards.
  • If MCP is available, use the MCP tools directly. If not, follow the Static Router bootstrap manually.
  • Consult .ai/catalog.md only when the Static Router points to useful metadata that it does not show itself.
  • Never edit derived system artifacts (claude.md, .cursorrules, .ai/index.yaml, etc.) as if they were sources of truth.
  • To modify a protected entry, follow the explicit unlock step first (set protected: false), then make the edit.

Invariants to preserve

  • src/lib/types.ts must mirror src-tauri/src/core/types.rs
  • new Rust commands must be registered in core/mod.rs, commands/mod.rs, and lib.rs
  • L0/L1/L2 should remain explicit
  • adapter artifacts are derived, not canonical
  • ontology should enhance the system without breaking existing operational behavior