EXOCORTEX CORE INFRASTRUCTURE

LCM

Lossless Context Management

Infinite conversation history. Bounded context. Zero information loss.

[Diagram: 18 messages compressed into 3 summaries (sum_01: 5 msgs, sum_02: 7 msgs, sum_03: 6 msgs), rolled up into 1 root summary (sum_ROOT, depth 3, 18 msgs)]
15:1 compression ratio
The Problem

Context Windows Have Hard Limits

LLMs can only process so many tokens at once. As conversations grow, older context vanishes. LCM removes that limit.

🪟

Fixed Context Windows

Even 128K context windows fill up. When they do, the model forgets, permanently.

4K–128K
📈

Infinite History Growth

Chats accumulate indefinitely. A year of daily use = millions of tokens of history.

∞
💀

Forgetting & Hallucination

Without context, models hallucinate or contradict past decisions and facts.

↑ Errors
⚖️

The Tradeoff

More history = smarter responses, but costs more tokens and risks overflow.
LCM solves this with lossless compression. Instead of discarding old messages, we compress them into summaries while preserving the full information graph. The model always has context; it just arrives at different levels of detail depending on depth.
The Architecture

The Summary DAG

A Directed Acyclic Graph of compressed conversation history. Each node knows its lineage β€” where it came from and what it contains.

Message (leaf)
Summary depth 1
Summary depth 2
Summary depth 3+
[Diagram: the summary DAG. sum_ROOT (depth 3, 18 msgs) sits atop three depth-2 summaries: sum_01 (Jan–Mar 2024, 5 msgs), sum_02 (Apr–Aug 2024, 6 msgs), sum_03 (Sep 2024–Mar 2025, 7 msgs). Beneath them, nine depth-1 topic summaries (Project Alpha, API Design, Auth Review, DB Migration, Perf Tuning, Deployment, V2 Planning, LCM System, Blueprints) compress the individual user/assistant messages msg_001–msg_019. Levels: ROOT, LEVEL 2, LEVEL 1, MESSAGES. 19 nodes → 4.]

sum_ROOT

type: summary
depth: 3
descendants: 18
earliest: —
latest: —
children: 3
Field Type Description
id string Unique identifier (e.g. sum_01a, msg_003)
content string Summary text or original message content
lineage string[] Parent summary IDs; enables DAG traversal
descendant_count number Total messages in this subtree
earliest_at timestamp When the oldest message in this node was created
latest_at timestamp When the newest message in this node was created
depth number DAG depth (0 = leaf message, higher = more compressed)
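The node shape above can be sketched as a small Python dataclass. Field names follow the table; the class name, defaults, and timestamp representation are assumptions for illustration, not the actual implementation:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class LCMNode:
    """One node in the summary DAG; fields mirror the table above."""
    id: str                                            # e.g. "sum_01a" or "msg_003"
    content: str                                       # summary text or raw message
    lineage: list[str] = field(default_factory=list)   # parent summary IDs
    descendant_count: int = 1                          # total messages in this subtree
    earliest_at: Optional[str] = None                  # timestamp of oldest message
    latest_at: Optional[str] = None                    # timestamp of newest message
    depth: int = 0                                     # 0 = leaf; higher = more compressed

# A leaf message and the root summary that (transitively) contains it:
leaf = LCMNode(id="msg_003", content="...", lineage=["sum_01a"])
root = LCMNode(id="sum_ROOT", content="...", descendant_count=18, depth=3)
```

Because every node carries its own lineage and descendant count, traversal code never needs a global index of the DAG.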
How It Works

Traversing the DAG

LCM tools let you move between compressed summaries and expanded detail on demand.

1

User Asks About a Past Topic

The user asks something that references old conversation. The system needs to find relevant context without loading everything into the context window.

"What did we decide about the LCM compression strategy?"
2

System Runs lcm_grep Across Compressed History

Instead of scanning all messages, it searches summaries. The DAG structure means we only traverse relevant branches.

lcm_grep("LCM compression strategy")
3

Matching Summaries Identified → lcm_expand

Found sum_03b (LCM System) and sum_02b (Perf Tuning). The system expands these specific branches to recover the full message subtree.

lcm_expand(["sum_03b", "sum_02b"], maxDepth=2)
4

Detailed Context Delivered to the Model

The expanded messages are injected into the prompt. The model sees the full relevant conversation history: exactly what it needs, nothing more.

→ 12 messages expanded from 2 summaries → model context
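The four steps above can be sketched over a toy in-memory DAG. Everything here (the dict layout, the tiny corpus, the function bodies) is a hypothetical stand-in to show the grep → expand flow, not the real implementation:

```python
import re

# node_id -> (content, child_ids); leaf messages have no children
DAG = {
    "sum_ROOT": ("year in review", ["sum_03b"]),
    "sum_03b":  ("LCM System: compression strategy decided", ["msg_016", "msg_017"]),
    "msg_016":  ("user: how should LCM compress history?", []),
    "msg_017":  ("assistant: summarize into a DAG, 15:1 ratio", []),
}

def lcm_grep(pattern: str) -> list[str]:
    """Search summary/message content; return matching node IDs."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [nid for nid, (text, _) in DAG.items() if rx.search(text)]

def lcm_expand(ids: list[str], max_depth: int = 2) -> list[str]:
    """Walk the DAG downward from the given IDs, collecting contents."""
    out: list[str] = []
    def walk(nid: str, depth: int) -> None:
        text, children = DAG[nid]
        out.append(text)
        if depth < max_depth:
            for child in children:
                walk(child, depth + 1)
    for nid in ids:
        walk(nid, 0)
    return out

# Step 2: search summaries only, not every raw message.
hits = lcm_grep("compression strategy")
# Step 3: expand just the matching branches.
context = lcm_expand(hits, max_depth=2)
# Step 4: `context` is what gets injected into the model prompt.
```

Only the matched branch is expanded, so the rest of the history stays compressed and costs no prompt tokens.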

LCM API Reference

lcm_grep

Full-text or regex search across compacted summaries and messages. Returns matching snippets with IDs.

lcm_grep("compression", scope="both")

lcm_expand

Traverses the DAG from summary IDs downward, recovering the full subtree of messages and child summaries.

lcm_expand(["sum_03b"], maxDepth=3)

lcm_describe

Returns metadata for any LCM item: token counts, depth, compression ratio, lineage, timestamps.

lcm_describe("sum_ROOT")

lcm_expand_query

Delegated search + expand in one call. Greps for matching summaries, then expands the top results to answer a specific question.

lcm_expand_query(query, prompt)
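One plausible reading of lcm_expand_query is as a composition of the two primitives above. The internals here are guesses; only the call shape comes from the reference, and the stand-in functions exist just to make the sketch runnable:

```python
def lcm_expand_query(query: str, prompt: str, top_k: int = 2, max_depth: int = 2) -> dict:
    """Grep for matching summaries, expand the top hits, and return the
    recovered context alongside the question it should answer."""
    hits = lcm_grep(query)[:top_k]            # search step
    context = lcm_expand(hits, max_depth)     # expand step
    return {"prompt": prompt, "context": context}

# Minimal stand-ins so the sketch runs on its own:
def lcm_grep(query: str) -> list[str]:
    return ["sum_03b", "sum_02b", "sum_01c"]

def lcm_expand(ids: list[str], max_depth: int) -> list[str]:
    return [f"<expanded subtree of {i}>" for i in ids]

result = lcm_expand_query("LCM compression strategy",
                          "What did we decide about compression?")
```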
Performance

Compression Ratios

LCM achieves dramatic compression while preserving every piece of information. Here's the math.

Messages
30,000 tokens
↓
Summaries
2,000 tokens
↓
Root
400 tokens
compression_ratio = original_tokens / summary_tokens = 30,000 / 2,000 = 15:1
15:1
Typical compression ratio
100%
Information preserved
~2KB
Per-summary overhead
O(log n)
Access complexity
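The ratios above fall out of the formula directly. A quick check, using the token counts from the pipeline shown:

```python
def compression_ratio(original_tokens: int, summary_tokens: int) -> float:
    """original_tokens / summary_tokens, as in the formula above."""
    return original_tokens / summary_tokens

level1 = compression_ratio(30_000, 2_000)   # messages -> level-1 summaries: 15:1
to_root = compression_ratio(30_000, 400)    # messages -> root summary: 75:1
```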
Hands-On

Interactive Demo

Explore a simulated DAG. Search for topics, see results highlight, and expand to see the full subtree.

πŸ•ΈοΈ Simulated DAG β€” Click to Select
ROOT Ξ± sum Ξ² sum auth api db perf m1 m2 m3 m4 m5 m6 m7 m8
πŸ“‚ Expansion View β€” Click a result to expand
// lcm_expand output will appear here
Select a node from the DAG or run a search to see its expanded subtree.
Integration

LCM in the Exocortex

LCM doesn't work alone. It's one half of a two-layer memory system combining semantic search with historical context.

Layer 1

πŸ” Vector Embeddings

Every conversation chunk and summary is embedded into a vector. This enables semantic similarity search β€” finding topics by meaning, not keywords.

Layer 2

🧠 LCM Summary DAG

Hierarchical summarization preserves the full conversation tree. When relevant summaries are found, they can be fully expanded to recover exact context.

Synergy

⚡ Vector Search + LCM Expansion

Vector search finds the right summaries fast. LCM expansion restores the full detail. Together: semantic recall + perfect fidelity.

Storage

💾 What Gets Stored

Raw messages, summary DAG nodes, vector embeddings, and metadata (timestamps, lineage, token counts). Old messages can be evicted from the vector store while remaining accessible via LCM.

💬
User Message
New conversation occurs
→
🔢
Embed
Vector embedding + LCM summary
→
🔍
Vector Search
Find semantically relevant summaries
→
🌳
LCM Expand
Traverse DAG, recover full subtree
→
🧠
Model Context
Full relevant history injected
The Exocortex Memory Model:

When the user asks about something, the system first runs a vector similarity search across embedded summaries. This returns the most semantically relevant nodes. Then lcm_expand is called on those nodes to pull the full conversation history. The result: the model gets exactly the context it needs, with perfect accuracy, regardless of how old the conversation is.
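The two-layer loop can be sketched end to end. The bag-of-words "embedding" and cosine ranking below are toy stand-ins for a real embedding model, and the summary corpus is invented; the point is the shape: rank summaries by semantic similarity, then hand the winners to lcm_expand:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Layer 1 index: summary ID -> embedded summary text (invented corpus).
SUMMARIES = {
    "sum_03b": "LCM system compression strategy for summaries",
    "sum_02b": "database performance tuning notes",
}
INDEX = {sid: embed(text) for sid, text in SUMMARIES.items()}

def recall(question: str, top_k: int = 1) -> list[str]:
    q = embed(question)
    ranked = sorted(INDEX, key=lambda sid: cosine(q, INDEX[sid]), reverse=True)
    hits = ranked[:top_k]                 # layer 1: vector search finds the nodes
    return [SUMMARIES[s] for s in hits]   # layer 2: lcm_expand(hits) would go here

top = recall("what was our compression strategy")
```

In the real system the last line of recall would call lcm_expand on the hit IDs, so the model receives the full subtree rather than the summary text alone.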