The compactor watches each channel’s context size and triggers background compaction before the channel fills up. The channel never blocks on compaction — it keeps responding to users while old context is summarized in the background.
Not an LLM Process
The compactor is not an LLM process. It’s a programmatic monitor that watches a number (context token count) and spawns workers when thresholds are crossed.
// From src/agent/compactor.rs
pub struct Compactor {
    pub channel_id: ChannelId,
    pub deps: AgentDeps,
    pub history: Arc<RwLock<Vec<Message>>>,
    is_compacting: Arc<RwLock<bool>>,
}
The LLM work (summarization + memory extraction) happens in the compaction worker it spawns, not in the compactor itself.
Tiered Thresholds
The compactor uses three thresholds:
[defaults.compaction]
background_threshold = 0.80  # 80% context usage
aggressive_threshold = 0.85  # 85% context usage
emergency_threshold = 0.95   # 95% context usage
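To make the percentages concrete, the sketch below converts each threshold into an absolute token count. The helper names `threshold_tokens` and `default_trigger_points` are assumptions for illustration and do not come from the Spacebot codebase.

```rust
/// Hypothetical helper: converts a fractional threshold into a token count.
/// Rounds to the nearest token to sidestep floating-point edge cases.
fn threshold_tokens(context_window: usize, threshold: f64) -> usize {
    (context_window as f64 * threshold).round() as usize
}

/// Returns (background, aggressive, emergency) trigger points in tokens
/// for the default thresholds 0.80, 0.85, and 0.95.
fn default_trigger_points(context_window: usize) -> (usize, usize, usize) {
    (
        threshold_tokens(context_window, 0.80),
        threshold_tokens(context_window, 0.85),
        threshold_tokens(context_window, 0.95),
    )
}
```

With a 200,000-token window, the three tiers trigger at roughly 160,000, 170,000, and 190,000 estimated tokens.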
Background Compaction (>80%)
Summarize oldest 30% of messages:
// From src/agent/compactor.rs
let fraction = 0.3;
let compaction_worker = spawn_compaction_worker(fraction);
The worker:
Reads the oldest 30% of messages
Uses an LLM to summarize them
Extracts any memorable facts, decisions, preferences
Saves extracted memories to the memory graph
Replaces original messages with a summary
Aggressive Compaction (>85%)
Summarize oldest 50% of messages:
Same process, more aggressive. Fires when background compaction didn’t reclaim enough space.
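The two worker-based tiers differ only in how much of the history they hand to the worker. A minimal sketch of that mapping follows; `WorkerTier`, `fraction_for`, and `messages_to_compact` are hypothetical names, and rounding down is an assumption, since the source does not say how fractional message counts are handled.

```rust
/// Simplified stand-in for the two worker-based compaction tiers.
#[derive(Debug, Clone, Copy, PartialEq)]
enum WorkerTier {
    Background,
    Aggressive,
}

/// Fraction of the oldest messages each tier summarizes (30% and 50%).
fn fraction_for(tier: WorkerTier) -> f64 {
    match tier {
        WorkerTier::Background => 0.3,
        WorkerTier::Aggressive => 0.5,
    }
}

/// Number of oldest messages handed to the worker.
/// Assumption: fractional counts are rounded down.
fn messages_to_compact(history_len: usize, tier: WorkerTier) -> usize {
    (history_len as f64 * fraction_for(tier)).floor() as usize
}
```

For a 45-message history this gives 13 messages for the background tier and 22 for the aggressive tier.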
Emergency Truncation (>95%)
Drop the oldest half of the messages without LLM summarization:
// From src/agent/compactor.rs
async fn emergency_truncate(&self) -> Result<()> {
    let mut history = self.history.write().await;
    let cutoff = history.len() / 2;
    history.splice(
        0..cutoff,
        vec![Message::assistant(
            "[Emergency truncation: oldest messages removed to prevent overflow]"
        )]
    );
    Ok(())
}
Fast, with no LLM call and no spawned worker: the truncation runs inline rather than in the background. It only fires when context is critically full and background compaction hasn’t completed yet.
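The same splice can be tried outside the agent with a plain `Vec<String>`. This is a simplified sketch: it drops the async lock and the `Message` type, and returning the removed count is an addition for illustration.

```rust
const TRUNCATION_MARKER: &str =
    "[Emergency truncation: oldest messages removed to prevent overflow]";

/// Drops the oldest half of `history` and puts a single marker in its place.
/// Returns how many messages were removed.
fn emergency_truncate(history: &mut Vec<String>) -> usize {
    let cutoff = history.len() / 2;
    // `splice` removes the range and inserts the replacement; counting the
    // returned iterator consumes it and tells us how many items were dropped.
    history
        .splice(0..cutoff, [TRUNCATION_MARKER.to_string()])
        .count()
}
```

Note that the marker itself occupies one slot, so a six-message history ends up with four entries: the marker plus the three newest messages.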
Compaction Flow
Channel completes a turn
After the channel finishes processing a user message, it calls the compactor.
Compactor checks context size
// From src/agent/compactor.rs
let context_window = **rc.context_window.load();
let usage = estimated_tokens as f32 / context_window as f32;
Threshold check
let action = if usage >= config.emergency_threshold {
    Some(CompactionAction::EmergencyTruncate)
} else if usage >= config.aggressive_threshold {
    Some(CompactionAction::Aggressive)
} else if usage >= config.background_threshold {
    Some(CompactionAction::Background)
} else {
    None
};
Spawn compaction worker (or emergency truncate)
Emergency truncation is synchronous and fast. Background and aggressive spawn a worker.
tokio::spawn(async move {
    let result = run_compaction(&deps, &prompt, &history, fraction).await;
});
Worker summarizes and extracts
The compaction worker:
Reads old messages
Summarizes them into a cohesive narrative
Extracts memories using memory_save tool
Returns summary
Summary swaps into history
let mut history = self.history.write().await;
history.splice(
    0..compacted_count,
    vec![Message::assistant(summary)]
);
The channel sees the summary on its next turn.
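The size check and threshold check from the flow above can be combined into one self-contained function. This is a sketch under assumptions: `CompactionConfig` here is a hypothetical struct mirroring the `[defaults.compaction]` keys, and `choose_action` is an illustrative name, not a function from the source.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum CompactionAction {
    Background,
    Aggressive,
    EmergencyTruncate,
}

/// Hypothetical struct mirroring the `[defaults.compaction]` keys.
struct CompactionConfig {
    background_threshold: f32,
    aggressive_threshold: f32,
    emergency_threshold: f32,
}

/// Computes context usage and picks the most severe tier whose threshold
/// is met, checking from emergency down to background.
fn choose_action(
    estimated_tokens: usize,
    context_window: usize,
    config: &CompactionConfig,
) -> Option<CompactionAction> {
    let usage = estimated_tokens as f32 / context_window as f32;
    if usage >= config.emergency_threshold {
        Some(CompactionAction::EmergencyTruncate)
    } else if usage >= config.aggressive_threshold {
        Some(CompactionAction::Aggressive)
    } else if usage >= config.background_threshold {
        Some(CompactionAction::Background)
    } else {
        None
    }
}
```

Checking the highest threshold first matters: a channel at 98% usage also exceeds the background threshold, but it should get the emergency path.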
Compaction Worker Prompt
The compaction worker gets a focused system prompt:
You are a compaction worker.
Your job:
1. Summarize the provided conversation turns into a cohesive narrative
2. Extract any memorable facts, decisions, or preferences
3. Save them using memory_save
Guidelines:
- Preserve important context
- Maintain chronological flow
- Don't lose critical information
- Extract structured memories for anything worth remembering
The worker has access to:
memory_save — Create typed memories
No other tools (no shell, file, exec)
Token Estimation
Context size is estimated using a simple heuristic:
// From src/agent/compactor.rs
pub fn estimate_history_tokens(history: &[Message]) -> usize {
    history.iter().map(|msg| {
        match msg {
            Message::User { content, .. } => estimate_content_tokens(content),
            Message::Assistant { content, .. } => estimate_content_tokens(content),
            Message::ToolCall { .. } => 50, // Approximate
            Message::ToolResult { .. } => 100,
        }
    }).sum()
}
fn estimate_content_tokens(content: &str) -> usize {
    // Rough approximation: 1 token ≈ 4 characters.
    // Note: str::len() counts UTF-8 bytes, so non-ASCII text counts slightly higher.
    content.len() / 4
}
This is deliberately conservative. Better to compact slightly early than overflow.
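The heuristic is easy to check by hand. The sketch below reproduces it with a simplified `Message` enum; the field-less `ToolCall` and `ToolResult` variants are an assumption made to keep the example self-contained, and the real type carries more data.

```rust
/// Simplified stand-in for the real `Message` type.
enum Message {
    User { content: String },
    Assistant { content: String },
    ToolCall,
    ToolResult,
}

/// Roughly four bytes per token, rounded down.
fn estimate_content_tokens(content: &str) -> usize {
    content.len() / 4
}

/// Sums per-message estimates, using flat costs for tool traffic.
fn estimate_history_tokens(history: &[Message]) -> usize {
    history
        .iter()
        .map(|msg| match msg {
            Message::User { content } | Message::Assistant { content } => {
                estimate_content_tokens(content)
            }
            Message::ToolCall => 50,
            Message::ToolResult => 100,
        })
        .sum()
}
```

A 400-byte user message, an 82-byte assistant reply, one tool call, and one tool result come to 100 + 20 + 50 + 100 = 270 estimated tokens.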
Compaction Lock
Only one compaction runs per channel at a time:
// From src/agent/compactor.rs
// Field on Compactor: is_compacting: Arc<RwLock<bool>>
async fn spawn_compaction_worker(&self, action: CompactionAction) {
    let mut is_compacting = self.is_compacting.write().await;
    if *is_compacting {
        return; // a compaction is already running, so skip this check
    }
    *is_compacting = true;
    drop(is_compacting);
    // Clone the Arc so the spawned task can clear the flag when it finishes.
    let flag = Arc::clone(&self.is_compacting);
    tokio::spawn(async move {
        // ... compaction work ...
        *flag.write().await = false;
    });
}
If compaction is already running, new checks are skipped.
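The check-then-set behavior can be illustrated without an async runtime. The sketch below swaps in a standard-library `AtomicBool` with `compare_exchange` instead of the `RwLock<bool>` used by the compactor, so it shows the idea rather than the actual implementation; `CompactionGuard`, `try_begin`, and `finish` are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical per-channel guard; not the type used in compactor.rs.
#[derive(Clone, Default)]
struct CompactionGuard {
    running: Arc<AtomicBool>,
}

impl CompactionGuard {
    /// Atomically flips the flag from false to true.
    /// Returns true only for the caller that won the right to compact.
    fn try_begin(&self) -> bool {
        self.running
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }

    /// Clears the flag so a later check can start a new compaction.
    fn finish(&self) {
        self.running.store(false, Ordering::Release);
    }
}
```

Because the test and the set happen in one atomic step, two checks arriving at the same time cannot both start a worker. Whichever primitive is used, the flag should also be cleared on the failure path, or the channel would never compact again.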
Summary Stacking
Multiple compactions stack chronologically:
[Summary 1: turns 1-50]
[Summary 2: turns 51-100]
[Summary 3: turns 101-150]
[Turn 151]
[Turn 152]
...
Eventually, old summaries themselves get compacted into meta-summaries.
During compaction, the worker extracts memories:
{
  "name": "memory_save",
  "input": {
    "content": "User decided to refactor the auth module to use dependency injection",
    "memory_type": "decision",
    "importance": 0.8
  }
}
This ensures context that gets summarized away is preserved as structured knowledge.
Context Window Configuration
Per-agent context window:
[defaults]
context_window = 200000  # 200k tokens (Claude Sonnet/Opus)
[defaults.compaction]
background_threshold = 0.80
aggressive_threshold = 0.85
emergency_threshold = 0.95
For smaller models:
[defaults]
context_window = 128000  # 128k tokens
[defaults.compaction]
background_threshold = 0.70  # Compact earlier
aggressive_threshold = 0.80
emergency_threshold = 0.90
Branch Compaction
Branches inherit large channel histories and can overflow on first call. They have built-in pre-flight compaction:
// From src/agent/branch.rs
self.maybe_compact_history();
fn maybe_compact_history(&mut self) {
    let estimated = estimate_history_tokens(&self.history);
    let context_window = **self.deps.runtime_config.context_window.load();
    let usage = estimated as f32 / context_window as f32;
    if usage > 0.6 {
        self.force_compact_history();
    }
}
If a branch overflows despite this, it retries with compaction:
const MAX_OVERFLOW_RETRIES: usize = 2;
match agent.prompt(&prompt).await {
    Err(error) if is_context_overflow_error(&error.to_string()) => {
        self.force_compact_history();
        current_prompt = "Continue where you left off. Older context has been compacted.";
    }
    // ... (success and other error arms omitted in this excerpt)
}
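The excerpt above shows a single match arm. One way the full retry loop could look is sketched below, with closures standing in for `agent.prompt(..)` and `force_compact_history()`. The function name `prompt_with_overflow_retry`, the string-matching body of `is_context_overflow_error`, and the way `MAX_OVERFLOW_RETRIES` bounds the loop are all assumptions, not the code from src/agent/branch.rs.

```rust
const MAX_OVERFLOW_RETRIES: usize = 2;

/// Hypothetical check; the real function may match other provider errors.
fn is_context_overflow_error(message: &str) -> bool {
    let lower = message.to_lowercase();
    lower.contains("context") && lower.contains("overflow")
}

/// Calls `prompt_fn`; on a context-overflow error, compacts and retries
/// with a continuation prompt, up to MAX_OVERFLOW_RETRIES times.
fn prompt_with_overflow_retry<P, C>(
    initial_prompt: &str,
    mut prompt_fn: P,
    mut compact_fn: C,
) -> Result<String, String>
where
    P: FnMut(&str) -> Result<String, String>,
    C: FnMut(),
{
    let mut current_prompt = initial_prompt.to_string();
    let mut retries = 0;
    loop {
        match prompt_fn(&current_prompt) {
            Ok(response) => return Ok(response),
            Err(error)
                if retries < MAX_OVERFLOW_RETRIES && is_context_overflow_error(&error) =>
            {
                retries += 1;
                compact_fn();
                current_prompt =
                    "Continue where you left off. Older context has been compacted."
                        .to_string();
            }
            // Non-overflow errors, or overflow after the retry budget is spent.
            Err(error) => return Err(error),
        }
    }
}
```

Bounding the retries keeps a branch from looping forever when compaction cannot shrink the history enough.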
Worker Compaction
Workers run in segments and compact between segments:
// From src/agent/worker.rs
const TURNS_PER_SEGMENT: usize = 25;
// After 25 turns
let estimated = estimate_history_tokens(&history);
if estimated as f32 / context_window as f32 > 0.7 {
    compact_worker_history(&mut history);
}
Worker compaction is simpler than channel compaction — just drop oldest tool calls and keep only the task description + recent turns.
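A rough sketch of that "keep the task, keep the tail" idea follows. It is a simplification: it treats the first entry as the task description and drops every entry between it and the most recent ones, rather than selecting tool calls specifically. The `keep_recent` parameter is hypothetical, and the real `compact_worker_history` takes only the history.

```rust
/// Hypothetical sketch: keeps the first entry (the task description) and
/// the `keep_recent` newest entries, dropping everything in between.
fn compact_worker_history(history: &mut Vec<String>, keep_recent: usize) {
    // Nothing to drop if the task plus the recent tail covers the history.
    if history.len() <= keep_recent + 1 {
        return;
    }
    let drop_end = history.len() - keep_recent;
    // Remove indices 1..drop_end, leaving index 0 and the tail intact.
    history.drain(1..drop_end);
}
```

With ten entries and `keep_recent = 3`, the result is the task description followed by the last three turns.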
Compaction Observability
Compaction events are logged:
tracing::info!(
    channel_id = %self.channel_id,
    usage = %format!("{:.1}%", usage * 100.0),
    ?action,
    "compaction triggered"
);
On completion:
tracing::info!(
    channel_id = %channel_id,
    turns_compacted = turns_compacted,
    "compaction completed"
);
On failure:
tracing::error!(
    channel_id = %channel_id,
    %error,
    "compaction failed"
);
Error Handling
If compaction fails:
Background/aggressive compaction — Logged, compaction lock released, channel continues normally
Emergency truncation — Always succeeds (synchronous drop)
The channel is never blocked by compaction failures. Worst case: emergency truncation kicks in and drops old messages without summarization.
Best Practices
When to adjust thresholds
Lower thresholds (compact earlier):
Small context window models (less than 128k tokens)
Very active channels with rapid message flow
You want more aggressive memory extraction
Higher thresholds (compact later):
Large context window models (200k+ tokens)
Slow-moving conversations
You want to preserve more raw context
How to tune compaction aggressiveness
Conservative (preserve more context):
[defaults.compaction]
background_threshold = 0.85
aggressive_threshold = 0.90
emergency_threshold = 0.95
Aggressive (compact earlier):
[defaults.compaction]
background_threshold = 0.70
aggressive_threshold = 0.80
emergency_threshold = 0.90
When to use emergency truncation
Emergency truncation is a safety valve. It fires when:
Context is critically full (>95%)
Background compaction hasn’t completed yet
The next user message would overflow
If you’re hitting emergency truncation frequently:
Lower background/aggressive thresholds
Increase context window
Check if compaction workers are failing
Compaction vs Memory
Compaction and memory are complementary:
Compaction — Manages context window size. Summarizes old messages. Runs automatically.
Memory — Extracts structured knowledge. Stores facts, preferences, decisions. Queried by branches.
During compaction, both happen:
Messages are summarized (compaction)
Important facts are extracted as typed memories (memory)
The summary keeps context coherent. The memories make knowledge queryable.
Next Steps
Memory System: Learn how compaction extracts memories
Branches: See how branches handle context overflow
Workers: Understand worker segmentation and compaction
Configuration: Full compaction configuration reference