Epic 4 — Context Engine Enhancements
Date: 2026-03-10

This document maps agent-core’s (openjiuwen) context engine to Exo’s enhanced exo-context package, helping contributors familiar with either framework navigate both.


1. Agent-Core Overview

Agent-core’s context engine lives in openjiuwen/core/context_engine/ (also referenced as context/) and provides layered context management for long-running agent conversations.

Key Components

ContextManager — The central orchestrator that coordinates all context operations. Manages the conversation history buffer, delegates to specialized managers for offloading, compression, and windowing, and enforces token budgets before LLM calls.

OffloadManager — Handles message-level offloading for oversized content. When a message exceeds a configurable character threshold, the manager replaces it with an [[OFFLOAD: handle=<id>]] marker and stores the original content for on-demand retrieval. This keeps the active context small while preserving access to full content when needed.
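The replacement-with-marker step can be sketched roughly as follows. This is a simplified stand-in, not OffloadManager’s actual implementation; the `store` dict and the `threshold` default are illustrative:

```python
import uuid

def offload_oversized(messages, store, threshold=10_000):
    """Replace any message whose content exceeds `threshold` characters
    with an [[OFFLOAD: handle=<id>]] marker, stashing the original
    content in `store` for on-demand retrieval."""
    result = []
    for msg in messages:
        content = msg.get("content") or ""
        if len(content) > threshold:
            handle = uuid.uuid4().hex[:8]
            store[handle] = content
            msg = {**msg, "content": f"[[OFFLOAD: handle={handle}]]"}
        result.append(msg)
    return result
```

The active history keeps only the short marker; any component holding the store can resolve the handle back to the full content later.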

CompressManager — LLM-based dialogue compression. Detects verbose tool-call chains (sequences of assistant messages with tool_calls followed by tool results) and replaces them with concise summaries generated by an LLM call. Dramatically reduces context size for multi-step tool use without losing semantic content.
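Detecting such chains can be sketched as below. This is an illustration of the heuristic, not CompressManager’s code; the real detection rules may differ:

```python
def find_tool_chains(messages):
    """Return (start, end) index pairs covering runs of assistant
    messages carrying tool_calls plus their tool-result messages --
    the spans a summarizer would replace with a concise summary."""
    chains, start = [], None
    for i, msg in enumerate(messages):
        in_chain = (
            (msg["role"] == "assistant" and msg.get("tool_calls"))
            or msg["role"] == "tool"
        )
        if in_chain and start is None:
            start = i
        elif not in_chain and start is not None:
            chains.append((start, i))
            start = None
    if start is not None:
        chains.append((start, len(messages)))
    return chains
```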

RoundManager — Round-level windowing that understands dialogue structure. Defines a “round” as a user message through the next assistant response without tool calls. Trims history to the most recent N rounds while preserving complete round boundaries, preventing mid-conversation truncation that confuses models.
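The round-boundary rule can be illustrated with a small sketch (hypothetical helpers, not RoundManager’s implementation):

```python
def split_rounds(messages):
    """Group messages into rounds: each round runs from a user message
    through the next assistant reply that carries no tool_calls."""
    rounds, current = [], []
    for msg in messages:
        if msg["role"] == "user" and current:
            rounds.append(current)  # defensive: close a dangling round
            current = []
        current.append(msg)
        if msg["role"] == "assistant" and not msg.get("tool_calls"):
            rounds.append(current)
            current = []
    if current:
        rounds.append(current)
    return rounds

def window_rounds(messages, n):
    """Keep only the most recent n complete rounds."""
    return [m for r in split_rounds(messages)[-n:] for m in r]
```

Because trimming happens on round boundaries, a tool-call exchange is never cut in half.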

TokenBudgetManager — Accurate token counting and enforcement via tiktoken. Supports two encodings:

| Encoding | Models |
| --- | --- |
| `cl100k_base` | GPT-3.5-turbo, GPT-4, GPT-4-turbo |
| `o200k_base` | GPT-4o, GPT-4o-mini, o1, o3 |

Counts tokens for all messages in history and trims oldest messages when the total exceeds a configurable budget, acting as a final safety net.
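The safety-net behavior amounts to dropping from the front until the budget fits. A minimal sketch, with a pluggable counter standing in for the tiktoken-based one:

```python
def enforce_token_budget(messages, count_tokens, max_tokens):
    """Trim oldest messages first until the summed token count fits
    within max_tokens. `count_tokens` maps one message to an int; the
    real manager would plug in a tiktoken-based counter here."""
    trimmed = list(messages)
    total = sum(count_tokens(m) for m in trimmed)
    while trimmed and total > max_tokens:
        total -= count_tokens(trimmed[0])
        trimmed.pop(0)
    return trimmed
```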

ContextConfig — Configuration dataclass controlling all context behavior:

| Field | Purpose |
| --- | --- |
| `max_history_rounds` | Maximum conversation rounds to keep |
| `summary_threshold` | Message count before summarization triggers |
| `offload_threshold` | Message size before offloading triggers |
| `max_tokens` | Hard token budget for context |
| `encoding` | Tiktoken encoding name |

Processing Order

Agent-core applies context management in this sequence before each LLM call:

  1. Offload oversized individual messages
  2. Compress verbose tool-call chains
  3. Window to recent N rounds
  4. Summarize if threshold exceeded
  5. Enforce hard token budget

2. Exo Equivalent

Exo’s context engine lives in the exo-context package (packages/exo-context/) and implements the same capabilities as event-driven ContextProcessor subclasses that plug into a ProcessorPipeline — no monolithic manager needed.

Architecture Difference

Where agent-core uses a single ContextManager that calls specialized managers in sequence, Exo uses a composable processor pipeline:

```python
# Agent-core: monolithic manager
context_manager = ContextManager(config)
context_manager.process(history)  # calls all managers internally

# Exo: composable pipeline
pipeline = ProcessorPipeline()
pipeline.register(MessageOffloader(max_message_size=10_000))
pipeline.register(DialogueCompressor(summarizer=my_summarizer))
pipeline.register(RoundWindowProcessor())
pipeline.register(SummarizeProcessor())
pipeline.register(TokenBudgetProcessor(max_tokens=100_000))
await pipeline.fire("pre_llm_call", ctx, {})
```

Each processor is independent, opt-in, and can be registered in any order (though the recommended order matches agent-core’s sequence).
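The dispatch mechanism can be approximated in a few lines. This is a sketch of the idea only; exo’s actual ContextProcessor/ProcessorPipeline API is richer, and the `events` attribute used here is an assumption:

```python
class ContextProcessor:
    """Sketch of the processor base class: subclasses declare which
    events they handle and mutate the context in process()."""
    events = ("pre_llm_call",)

    async def process(self, ctx, payload):
        raise NotImplementedError

class ProcessorPipeline:
    def __init__(self):
        self._processors = []

    def register(self, processor):
        self._processors.append(processor)

    async def fire(self, event, ctx, payload):
        # Dispatch in registration order to processors subscribed
        # to this event.
        for p in self._processors:
            if event in p.events:
                await p.process(ctx, payload)
```

Because processors only see events they subscribe to, a post_tool_call processor like ToolResultOffloader can coexist in the same pipeline without touching the pre_llm_call path.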

Component Mapping

| Agent-Core Component | Exo Equivalent | Notes |
| --- | --- | --- |
| `ContextManager` | `ProcessorPipeline` | Pipeline dispatches to registered processors by event |
| `OffloadManager` | `MessageOffloader` + `ToolResultOffloader` | Split into message-level and tool-result offloading |
| `CompressManager` | `DialogueCompressor` | LLM-agnostic via injected summarizer callable |
| `RoundManager` | `RoundWindowProcessor` | Uses `ContextConfig.history_rounds` for window size |
| `TokenBudgetManager` | `TokenBudgetProcessor` + `TiktokenCounter` | Separate counter class for reuse |
| `ContextConfig` | `ContextConfig` | Frozen dataclass with extra dict for extensibility |
| `[[OFFLOAD: handle=<id>]]` markers | `[[OFFLOADED: handle=off_<hex>]]` markers | Same concept, slightly different format |
| (no equivalent) | `Context` with fork/merge | Per-task hierarchical state with parent-chain inheritance |
| (no equivalent) | `ContextState` | Hierarchical key-value state (reads inherit, writes isolate) |
| (no equivalent) | `Checkpoint` / `CheckpointStore` | Versioned snapshots of context state |
| (no equivalent) | `TokenTracker` / `TokenStep` | Per-agent, per-step token usage tracking |
| (no equivalent) | Context tools (planning, knowledge, file, reload) | Agent-callable tools for self-managing context |

Key Exo Additions Beyond Agent-Core

Hierarchical Context (Context + ContextState) — Exo introduces a fork/merge lifecycle for task decomposition. A parent context can fork() child contexts that inherit state but write in isolation. On merge(), the child’s state changes and net token delta flow back to the parent.
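The read-inherit / write-isolate semantics can be sketched in plain Python. This is illustrative only; exo’s actual Context/ContextState API carries more (token deltas, task identity, and so on):

```python
class State:
    """Hypothetical sketch of hierarchical state: reads fall back
    through the parent chain, writes stay local until merge()."""
    def __init__(self, parent=None):
        self.parent = parent
        self.local = {}

    def fork(self):
        return State(parent=self)

    def get(self, key, default=None):
        if key in self.local:
            return self.local[key]
        if self.parent is not None:
            return self.parent.get(key, default)
        return default

    def set(self, key, value):
        self.local[key] = value

    def merge(self):
        """Flow this child's local changes back into the parent."""
        self.parent.local.update(self.local)
```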

Automation Modes — ContextConfig.mode supports three presets:

| Mode | `history_rounds` | `summary_threshold` | `offload_threshold` |
| --- | --- | --- | --- |
| `pilot` | 100 | (disabled) | (disabled) |
| `copilot` | 20 | 10 | 50 |
| `navigator` | 10 | 5 | 20 |

Context Tools — Agents can self-manage their context via built-in tools: reload_offloaded(handle) retrieves offloaded content, planning tools manage a task checklist, knowledge tools search workspace artifacts, and file tools provide safe filesystem access.
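Resolving an offload marker back to its content is conceptually just a store lookup. A sketch of the idea; the real reload_offloaded tool’s signature and storage backend are exo-internal:

```python
import re

# Matches the marker format described above,
# e.g. [[OFFLOADED: handle=off_1a2b3c]]
MARKER = re.compile(r"\[\[OFFLOADED: handle=(off_[0-9a-f]+)\]\]")

def resolve_offloaded(text, store):
    """Expand every offload marker in `text` using `store`;
    unknown handles are left in place untouched."""
    return MARKER.sub(lambda m: store.get(m.group(1), m.group(0)), text)
```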


3. Configuration Comparison

Agent-Core ContextConfig

```python
# openjiuwen/core/context_engine/config.py
@dataclass
class ContextConfig:
    max_history_rounds: int = 20
    summary_threshold: int = 10
    offload_threshold: int = 50
    max_tokens: int = 100_000
    encoding: str = "cl100k_base"
```

Exo ContextConfig

```python
# exo-context: config.py
@dataclass(frozen=True)
class ContextConfig:
    mode: AutomationMode = "copilot"    # pilot / copilot / navigator
    history_rounds: int = 20            # ≡ max_history_rounds
    summary_threshold: int = 10         # same semantics
    offload_threshold: int = 50         # same semantics
    enable_retrieval: bool = False      # RAG integration toggle
    neuron_names: tuple[str, ...] = ()  # prompt composition neurons
    extra: dict[str, Any] = field(      # extensible — token_budget, encoding, etc.
        default_factory=dict
    )
```

Field Mapping

| Agent-Core Field | Exo Field | Notes |
| --- | --- | --- |
| `max_history_rounds` | `history_rounds` | Same default (20) |
| `summary_threshold` | `summary_threshold` | Identical |
| `offload_threshold` | `offload_threshold` | Identical |
| `max_tokens` | `extra["token_budget"]` | Moved to extra dict to keep frozen schema stable |
| `encoding` | `extra["token_encoding"]` | Same — defaults to `cl100k_base` if not set |
| (no equivalent) | `mode` | Automation level presets |
| (no equivalent) | `enable_retrieval` | RAG toggle |
| (no equivalent) | `neuron_names` | Prompt composition |

Factory Function

Exo provides make_config(mode, **overrides) to create configs with sensible defaults per automation level, avoiding manual threshold tuning:

```python
from exo.context import make_config

config = make_config("navigator")  # aggressive compression settings
config = make_config("pilot")      # minimal processing, large context window
```
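A plausible sketch of such a factory, using the preset values from the automation-mode table in section 2. This is illustrative; the real make_config returns a frozen ContextConfig rather than a dict:

```python
# Per-mode presets (values from the automation-mode table; None
# means the corresponding processor is effectively disabled).
MODE_PRESETS = {
    "pilot":     {"history_rounds": 100, "summary_threshold": None, "offload_threshold": None},
    "copilot":   {"history_rounds": 20,  "summary_threshold": 10,   "offload_threshold": 50},
    "navigator": {"history_rounds": 10,  "summary_threshold": 5,    "offload_threshold": 20},
}

def make_config(mode="copilot", **overrides):
    """Merge a mode's preset thresholds with caller overrides."""
    settings = {"mode": mode, **MODE_PRESETS[mode]}
    settings.update(overrides)
    return settings
```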

4. Migration Table

| Agent-Core Path | Exo Import | Notes |
| --- | --- | --- |
| `openjiuwen.core.context_engine.ContextManager` | `exo.context.ProcessorPipeline` | Composable processor dispatch (replaces monolithic manager) |
| `openjiuwen.core.context_engine.OffloadManager` | `exo.context.MessageOffloader` | Message-level offloading with `[[OFFLOADED: handle=...]]` markers |
| (tool result offloading in `OffloadManager`) | `exo.context.ToolResultOffloader` | Truncates large tool results, stores full content in state |
| `openjiuwen.core.context_engine.CompressManager` | `exo.context.DialogueCompressor` | LLM-based tool-chain compression via injected summarizer |
| `openjiuwen.core.context_engine.RoundManager` | `exo.context.RoundWindowProcessor` | Round-level history windowing |
| `openjiuwen.core.context_engine.TokenBudgetManager` | `exo.context.TokenBudgetProcessor` | Hard token budget enforcement |
| `openjiuwen.core.context_engine.ContextConfig` | `exo.context.ContextConfig` | Frozen config with extra dict for extensibility |
| (tiktoken usage inline) | `exo.context.TiktokenCounter` | Standalone token counter with model→encoding mapping |
| (no equivalent) | `exo.context.Context` | Per-task context with fork/merge lifecycle |
| (no equivalent) | `exo.context.ContextState` | Hierarchical key-value state |
| (no equivalent) | `exo.context.ContextProcessor` | ABC for event-driven processors |
| (no equivalent) | `exo.context.SummarizeProcessor` | Marks history for summarization at threshold |
| (no equivalent) | `exo.context.Checkpoint` | Versioned context state snapshot |
| (no equivalent) | `exo.context.CheckpointStore` | Manages checkpoint versions per task |
| (no equivalent) | `exo.context.TokenTracker` | Per-agent, per-step token usage tracking |
| (no equivalent) | `exo.context.TokenStep` | Single LLM call token record |
| (no equivalent) | Context tools (`get_context_tools()`) | Agent-callable tools for reload, planning, knowledge, files |

All public symbols are re-exported from exo.context (the package __init__.py), so from exo.context import MessageOffloader works as a convenience import.

Processor Event Mapping

| Agent-Core Processing Step | Exo Processor | Event |
| --- | --- | --- |
| Offload oversized messages | `MessageOffloader` | `pre_llm_call` |
| Compress tool chains | `DialogueCompressor` | `pre_llm_call` |
| Window to N rounds | `RoundWindowProcessor` | `pre_llm_call` |
| Mark for summarization | `SummarizeProcessor` | `pre_llm_call` |
| Enforce token budget | `TokenBudgetProcessor` | `pre_llm_call` |
| Truncate tool results | `ToolResultOffloader` | `post_tool_call` |
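The post_tool_call step differs from the others in that it truncates rather than removes. A sketch of the idea (hypothetical helper, not ToolResultOffloader itself; the preview length is illustrative):

```python
import uuid

def truncate_tool_result(result, state, max_chars=2_000):
    """Keep a preview of an oversized tool result inline and stash
    the full text in `state` under an off_<hex> handle."""
    if len(result) <= max_chars:
        return result
    handle = f"off_{uuid.uuid4().hex[:8]}"
    state[handle] = result
    return result[:max_chars] + f"\n[[OFFLOADED: handle={handle}]]"
```

The agent still sees the head of the result, and can pull the rest on demand via the reload tool described in section 2.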