Operator Pattern & Self-Optimization — agent-core to Exo Mapping
Epic: 10 — Operator Pattern with Self-Optimization Date: 2026-03-11
This document maps agent-core’s (openJiuwen) agent_evolving/ system to
Exo’s operator pattern in the exo-train package, helping contributors
familiar with either framework navigate both.
1. Agent-Core Overview
Agent-core’s self-optimization system lives in openjiuwen/agent_evolving/ and
enables iterative improvement of agent parameters (prompts, tool descriptions,
memory configs) through textual gradients.
Key Components
Operator ABC — The atomic unit of execution and optimization. Each
operator wraps a single agent capability and exposes tunable parameters via
get_tunables(), state snapshots via get_state()/load_state(), and
parameter mutation via set_parameter(). Concrete implementations:
| Operator | Domain | Tunable Parameters |
|---|---|---|
| LLMCallOperator | LLM calls | system_prompt, user_prompt |
| ToolCallOperator | Tool invocations | tool_description, tool_filter |
| MemoryCallOperator | Memory retrieval | enabled, max_retries |
TunableSpec — Declares what an optimizer can modify on an operator.
Each spec has a name, kind (one of PROMPT, CONTINUOUS, DISCRETE,
TOOL_SELECTOR, MEMORY_SELECTOR), optional path, and constraint string.
Analogous to nn.Module.parameters() in PyTorch.
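The shape of a tunable declaration can be sketched as follows. This is a minimal illustration, not agent-core's actual class: the field names and kind names come from the description above, while the enum string values, the defaults, and the `prompt_tunables` helper are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TunableKind(str, Enum):
    """Kinds named above; the lowercase string values are assumed."""
    PROMPT = "prompt"
    CONTINUOUS = "continuous"
    DISCRETE = "discrete"
    TOOL_SELECTOR = "tool_selector"
    MEMORY_SELECTOR = "memory_selector"


@dataclass(frozen=True)
class TunableSpec:
    """Declares one parameter an optimizer may modify (hypothetical layout)."""
    name: str
    kind: TunableKind
    path: Optional[str] = None  # optional location of the value on the operator
    constraint: str = ""        # free-form constraint string


def prompt_tunables(specs: list[TunableSpec]) -> list[str]:
    """Hypothetical helper: names of the specs a prompt optimizer would touch."""
    return [s.name for s in specs if s.kind is TunableKind.PROMPT]


specs = [
    TunableSpec("system_prompt", TunableKind.PROMPT, constraint="non-empty"),
    TunableSpec("max_retries", TunableKind.DISCRETE, constraint="0..5"),
]
prompt_names = prompt_tunables(specs)
```

Freezing the dataclass means an optimizer can read a spec but cannot mutate it; changes go through the operator's set_parameter() instead.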
Trainer — Orchestrates the full optimization loop: forward pass →
trajectory extraction → update generation → candidate selection → checkpointing.
Manages an EvolveCheckpoint for operator state persistence and resume across
training runs.
TracerTrajectoryExtractor — Builds DAG-linked TrajectoryStep sequences
from agent-core’s Session.tracer() spans. Each step is typed (LLM, TOOL,
MEMORY, WORKFLOW, AGENT) and links to its originating operator via
operator_id.
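To make the DAG linking concrete, here is a small sketch of typed steps joined by parent links. The `step_id` and `parent_ids` fields and the `ancestors` helper are assumptions for illustration; only the step kinds and the `operator_id` link come from the description above.

```python
from dataclasses import dataclass
from enum import Enum


class StepKind(str, Enum):
    LLM = "llm"
    TOOL = "tool"
    MEMORY = "memory"
    WORKFLOW = "workflow"
    AGENT = "agent"


@dataclass(frozen=True)
class TrajectoryStep:
    step_id: str                       # assumed field
    kind: StepKind
    operator_id: str                   # operator that produced this step
    parent_ids: tuple[str, ...] = ()   # assumed encoding of the DAG edges


def ancestors(steps: dict[str, TrajectoryStep], step_id: str) -> list[str]:
    """Walk parent links and return every upstream step id once."""
    seen: list[str] = []
    stack = list(steps[step_id].parent_ids)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(steps[current].parent_ids)
    return seen


steps = {
    s.step_id: s
    for s in [
        TrajectoryStep("plan", StepKind.AGENT, "planner"),
        TrajectoryStep("search", StepKind.TOOL, "web_search", ("plan",)),
        TrajectoryStep("recall", StepKind.MEMORY, "memory", ("plan",)),
        TrajectoryStep("answer", StepKind.LLM, "summarizer", ("search", "recall")),
    ]
}
upstream = ancestors(steps, "answer")
blamed_operators = sorted({steps[s].operator_id for s in upstream})
```

Walking upstream from a failing step is one way an optimizer could find which operators contributed to it.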
InstructionOptimizer — Textual gradient-based prompt optimization.
Two-phase loop: backward() analyzes failures to generate natural-language
gradients describing what went wrong; step() rewrites prompts to address
those issues.
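The two-phase loop can be sketched with a deterministic stand-in for the model. `TinyInstructionOptimizer`, `fake_llm`, and the request strings are hypothetical; the real optimizer's prompts and signatures are not shown in this document.

```python
from typing import Callable


class TinyInstructionOptimizer:
    """Hypothetical optimizer: backward() collects critiques, step() rewrites."""

    def __init__(self, prompt: str, llm_fn: Callable[[str], str]) -> None:
        self.prompt = prompt
        self.llm_fn = llm_fn
        self.gradients: list[str] = []

    def backward(self, failures: list[dict]) -> None:
        """Ask the model what went wrong for each failing case."""
        for case in failures:
            critique = self.llm_fn(
                f"CRITIQUE prompt={self.prompt!r} input={case['input']!r} "
                f"expected={case['expected']!r} got={case['output']!r}"
            )
            self.gradients.append(critique)

    def step(self) -> str:
        """Rewrite the prompt to address the critiques, then clear them."""
        if not self.gradients:
            return self.prompt
        feedback = "; ".join(self.gradients)
        self.prompt = self.llm_fn(
            f"REWRITE prompt={self.prompt!r} feedback={feedback!r}"
        )
        self.gradients.clear()
        return self.prompt


def fake_llm(request: str) -> str:
    """Deterministic stand-in for a real model call."""
    if request.startswith("CRITIQUE"):
        return "summary was too long"
    return "Summarize the following text in one sentence."


opt = TinyInstructionOptimizer("Summarize the following text.", fake_llm)
opt.backward([{"input": "long article", "expected": "short", "output": "very long"}])
new_prompt = opt.step()
```

The natural-language critique plays the role a numeric gradient plays in ordinary training: it says in which direction the prompt should change.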
ToolOptimizerBase — Multi-stage beam search for tool description
optimization: generate → evaluate → select → refine.
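A generic beam search over candidate descriptions illustrates the four stages. This is a sketch of the general technique, not agent-core's implementation; `toy_generate` and `toy_score` are hypothetical stand-ins for LLM-based generation and trajectory-based evaluation.

```python
from typing import Callable


def beam_search_descriptions(
    seed: str,
    generate: Callable[[str], list[str]],
    score: Callable[[str], float],
    beam_width: int = 2,
    rounds: int = 2,
) -> str:
    """Generate, evaluate, select, and refine for a fixed number of rounds."""
    beam = [seed]
    for _ in range(rounds):
        # Generate: expand every survivor and keep the survivors as candidates.
        candidates = set(beam)
        for desc in beam:
            candidates.update(generate(desc))
        # Evaluate and select: keep the top beam_width, ties broken by text.
        ranked = sorted(candidates, key=lambda d: (-score(d), d))
        beam = ranked[:beam_width]
        # Refine: the next round's generate() call builds on the survivors.
    return beam[0]


def toy_generate(desc: str) -> list[str]:
    """Propose two edits: document the output format or the input."""
    return [desc + " Returns JSON.", desc + " Input: city name."]


def toy_score(desc: str) -> float:
    """Reward descriptions that document both the input and the output."""
    return float(("Returns JSON." in desc) + ("Input: city name." in desc))


best = beam_search_descriptions("Look up weather.", toy_generate, toy_score)
```

Keeping more than one survivor per round lets a description that scores slightly lower now still contribute a better refinement later.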
MemoryOptimizerBase — Optimizes memory retrieval configuration
(enable/disable, retry counts) based on trajectory analysis.
SingleDimUpdater / MultiDimUpdater — Updaters compose optimizers.
SingleDimUpdater wraps one optimizer; MultiDimUpdater composes multiple
domain-specific optimizers with attribution (attributes failures to the
responsible domain before running that domain’s optimizer).
3-Dimension Evolution
Agent-core’s evolution operates across three independent dimensions, each with its own optimization strategy:
- Prompt: Textual gradients rewrite system/user prompts (via InstructionOptimizer)
- Tool: Beam search improves tool descriptions (via ToolOptimizerBase)
- Memory: Configuration tuning for memory retrieval (via MemoryOptimizerBase)
These can run independently (SingleDimUpdater) or jointly with failure
attribution (MultiDimUpdater).
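Joint optimization with attribution can be sketched as routing each failure to one domain before any optimizer runs. The `failed_step_kind` field and the lambda optimizers are assumptions for illustration; how agent-core actually attributes failures is not described here.

```python
from collections import defaultdict
from typing import Callable

# A domain optimizer takes its failures and returns parameter updates.
DomainOptimizer = Callable[[list[dict]], dict[str, str]]


def attribute(failures: list[dict]) -> dict[str, list[dict]]:
    """Group failures by the domain of the step blamed for them."""
    by_domain: dict[str, list[dict]] = defaultdict(list)
    for failure in failures:
        by_domain[failure["failed_step_kind"]].append(failure)
    return dict(by_domain)


def multi_dim_update(
    failures: list[dict], optimizers: dict[str, DomainOptimizer]
) -> dict[str, str]:
    """Run each domain's optimizer only on the failures attributed to it."""
    updates: dict[str, str] = {}
    for domain, domain_failures in attribute(failures).items():
        if domain in optimizers:
            updates.update(optimizers[domain](domain_failures))
    return updates


optimizers = {
    "llm": lambda fs: {"system_prompt": f"revised after {len(fs)} failure(s)"},
    "tool": lambda fs: {"tool_description": f"revised after {len(fs)} failure(s)"},
    "memory": lambda fs: {"max_retries": "3"},
}
failures = [
    {"case": "q1", "failed_step_kind": "llm"},
    {"case": "q2", "failed_step_kind": "llm"},
    {"case": "q3", "failed_step_kind": "tool"},
]
updates = multi_dim_update(failures, optimizers)
```

Because no failure was attributed to memory, the memory optimizer never runs and its parameters stay untouched.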
Checkpoint/Resume
Agent-core persists EvolveCheckpoint objects containing operator states,
optimizer states, and training metadata. Training can resume from any saved
checkpoint, restoring all operator parameters and optimizer gradient history.
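A round trip through a JSON file shows the idea. The payload layout and both helper functions are hypothetical; the document only states that operator states, optimizer states, and training metadata are persisted.

```python
import json
import tempfile
from pathlib import Path


def save_checkpoint(
    path: Path, operators_state: dict, optimizer_state: dict, epoch: int
) -> None:
    """Write operator state, optimizer state, and metadata as one JSON file."""
    payload = {
        "operators_state": operators_state,
        "optimizer_state": optimizer_state,
        "metadata": {"epoch": epoch},
    }
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")


def load_checkpoint(path: Path) -> dict:
    """Read a checkpoint back; the caller passes each part to load_state()."""
    return json.loads(path.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    ckpt_path = Path(tmp) / "epoch_3.json"
    save_checkpoint(
        ckpt_path,
        operators_state={"summarizer": {"system_prompt": "Summarize in one sentence."}},
        optimizer_state={"gradients": ["summary was too long"]},
        epoch=3,
    )
    restored = load_checkpoint(ckpt_path)
```

Storing the optimizer's gradient history next to the operator parameters is what lets a resumed run continue from where it stopped.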
2. Exo Equivalent
Exo’s operator pattern lives in exo-train (packages/exo-train/)
and was added alongside the existing EvolutionPipeline/SynthesisPipeline
rather than replacing them — the two paradigms serve different purposes.
Architecture Difference
Where agent-core uses a monolithic Trainer that owns the entire optimization
loop, Exo separates concerns into composable pieces that integrate with
the existing Trainer ABC lifecycle:
# Agent-core: monolithic trainer
trainer = AgentEvolvingTrainer(agent, dataset, optimizer)
trainer.train(epochs=5)
# Exo: composable trainer + updater + optimizer
optimizer = InstructionOptimizer(operators, llm_fn=my_llm)
updater = SingleDimUpdater(optimizer)
trainer = OperatorTrainer(updater=updater, evaluator=my_eval_fn)
trainer.check_agent(agent)
trainer.check_dataset(train_data, test_data)
trainer.check_config(OperatorTrainConfig(epochs=5))
trainer.mark_validated()
metrics = await trainer.train()

The key difference: Exo’s OperatorTrainer inherits from the Trainer ABC,
gaining the lifecycle state machine (CREATED → VALIDATED → TRAINING → COMPLETED)
and validation guards for free.
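The guard behavior can be illustrated with a stripped-down state machine. `LifecycleTrainer` is a hypothetical stand-in, not Exo's Trainer ABC: it implements only two of the checks, and the check rules are assumptions.

```python
import asyncio
from enum import Enum


class TrainerState(Enum):
    CREATED = "created"
    VALIDATED = "validated"
    TRAINING = "training"
    COMPLETED = "completed"


class LifecycleTrainer:
    """Hypothetical trainer showing only the validation guards."""

    def __init__(self) -> None:
        self.state = TrainerState.CREATED
        self._checked: set[str] = set()

    def check_agent(self, agent: object) -> None:
        if agent is None:
            raise ValueError("agent is required")
        self._checked.add("agent")

    def check_config(self, config: dict) -> None:
        if config.get("epochs", 0) < 1:
            raise ValueError("epochs must be >= 1")
        self._checked.add("config")

    def mark_validated(self) -> None:
        missing = {"agent", "config"} - self._checked
        if missing:
            raise RuntimeError(f"missing checks: {sorted(missing)}")
        self.state = TrainerState.VALIDATED

    async def train(self) -> str:
        if self.state is not TrainerState.VALIDATED:
            raise RuntimeError(f"cannot train from state {self.state.name}")
        self.state = TrainerState.TRAINING
        await asyncio.sleep(0)  # placeholder for the real optimization loop
        self.state = TrainerState.COMPLETED
        return self.state.name


trainer = LifecycleTrainer()
try:
    asyncio.run(trainer.train())  # rejected: nothing has been validated yet
    premature_error = ""
except RuntimeError as exc:
    premature_error = str(exc)

trainer.check_agent(object())
trainer.check_config({"epochs": 1})
trainer.mark_validated()
final_state = asyncio.run(trainer.train())
```

Calling train() before mark_validated() fails fast, so a misconfigured run is rejected before any optimization work starts.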
Component Mapping
| Agent-Core Component | Exo Equivalent | Notes |
|---|---|---|
| Operator ABC | Operator ABC (operator/base.py) | Same interface: get_tunables(), get_state()/load_state(), invoke() |
| LLMCallOperator | LLMCallOperator (operator/llm_call.py) | Adds LLMCallTrace recording |
| ToolCallOperator | ToolCallOperator (operator/tool_call.py) | Adds ToolCallTrace recording |
| MemoryCallOperator | MemoryCallOperator (operator/memory_call.py) | Adds MemoryCallTrace with retry logic |
| TunableSpec | TunableSpec (operator/base.py) | Frozen dataclass; adds path and constraint fields |
| TunableKind | TunableKind (operator/base.py) | StrEnum; adds TOOL_SELECTOR and MEMORY_SELECTOR kinds |
| InstructionOptimizer | InstructionOptimizer (optimizer.py) | Two-phase backward/step; preserves {{...}} template variables |
| ToolOptimizerBase | ToolOptimizer (optimizer.py) | Four-stage beam search pipeline |
| MemoryOptimizerBase | (handled by MemoryCallOperator tunables) | Memory optimization via operator tunables rather than separate optimizer |
| SingleDimUpdater | SingleDimUpdater (updater/) | Wraps single BaseOptimizer |
| MultiDimUpdater | MultiDimUpdater (updater/) | Domain-specific composition with attribution |
| Trainer (agent_evolving) | OperatorTrainer (operator_trainer.py) | Extends Exo’s Trainer ABC with operator lifecycle |
| TracerTrajectoryExtractor | DefaultTrajectoryExtractor (trajectory/extractor.py) | Dict-based instead of tracer-span-based; TrajectoryExtractor ABC for custom implementations |
| TrajectoryStep | TrajectoryStep (trajectory/types.py) | Adds StepKind enum, ExecutionSpec, Trajectory container |
| EvolveCheckpoint | OperatorCheckpoint + CheckpointManager | Protocol-based; FileCheckpointStore for JSON persistence |
Key Exo Additions Beyond Agent-Core
BaseOptimizer ABC — Agent-core’s optimizers are standalone classes.
Exo introduces a formal BaseOptimizer ABC with bind(), backward(),
step(), add_trajectory(), and requires_forward_data() — giving all
optimizers a uniform interface.
TextualParameter — Explicit container for optimizer gradients, keyed by
(operator_id, target). Agent-core stores gradients implicitly in optimizer
state.
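A uniform optimizer interface with explicit gradient containers might look roughly like the sketch below. The method names and the (operator_id, target) key come from the text above; every signature, the `value` field, and the toy `AppendHintOptimizer` subclass are assumptions rather than Exo's actual definitions.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class TextualParameter:
    operator_id: str
    target: str
    value: str  # assumed field holding the current text
    gradients: list[str] = field(default_factory=list)


class BaseOptimizer(ABC):
    def __init__(self) -> None:
        self.params: dict[tuple[str, str], TextualParameter] = {}
        self.trajectories: list[dict] = []

    def bind(self, params: list[TextualParameter]) -> None:
        """Register parameters under their (operator_id, target) key."""
        for p in params:
            self.params[(p.operator_id, p.target)] = p

    def add_trajectory(self, trajectory: dict) -> None:
        self.trajectories.append(trajectory)

    def requires_forward_data(self) -> bool:
        return False

    @abstractmethod
    def backward(self, evaluated_cases: list[dict]) -> None: ...

    @abstractmethod
    def step(self) -> dict[tuple[str, str], str]: ...


class AppendHintOptimizer(BaseOptimizer):
    """Toy subclass: stores failure reasons, then appends them as hints."""

    def backward(self, evaluated_cases: list[dict]) -> None:
        for case in evaluated_cases:
            if not case["passed"]:
                key = (case["operator_id"], case["target"])
                self.params[key].gradients.append(case["reason"])

    def step(self) -> dict[tuple[str, str], str]:
        updates = {}
        for key, param in self.params.items():
            if param.gradients:
                hints = ", ".join(param.gradients)
                updates[key] = f"{param.value} Avoid: {hints}."
                param.gradients.clear()
        return updates


opt = AppendHintOptimizer()
opt.bind([TextualParameter("summarizer", "system_prompt", "Summarize the text.")])
opt.backward([
    {"operator_id": "summarizer", "target": "system_prompt",
     "passed": False, "reason": "bullet lists"},
])
updates = opt.step()
```

Because the gradients live on the parameter object rather than inside the optimizer, they can be inspected or checkpointed without knowing which optimizer produced them.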
Updater Protocol — Formal protocol separating update logic from training.
Supports both single-domain and multi-domain optimization with the same interface.
CheckpointManager Protocol — Pluggable checkpoint policy (should_save,
build, restore) with DefaultCheckpointManager implementation supporting
periodic and improvement-triggered saves.
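A pluggable policy of this kind can be expressed as a typing.Protocol. The three method names come from the text above; their parameters, the checkpoint dict layout, and `PeriodicOrImprovedManager` are assumptions made for this sketch and do not reproduce DefaultCheckpointManager.

```python
from typing import Optional, Protocol


class CheckpointManager(Protocol):
    """Hypothetical signatures for the three policy hooks."""

    def should_save(self, epoch: int, score: float) -> bool: ...
    def build(self, epoch: int, score: float, operators_state: dict) -> dict: ...
    def restore(self, checkpoint: dict) -> dict: ...


class PeriodicOrImprovedManager:
    """Saves every N epochs, and whenever the score beats the best so far."""

    def __init__(self, every_n_epochs: int = 2) -> None:
        self.every_n_epochs = every_n_epochs
        self.best_score: Optional[float] = None

    def should_save(self, epoch: int, score: float) -> bool:
        improved = self.best_score is None or score > self.best_score
        if improved:
            self.best_score = score
        return improved or epoch % self.every_n_epochs == 0

    def build(self, epoch: int, score: float, operators_state: dict) -> dict:
        return {
            "epoch": epoch,
            "best_score": self.best_score,
            "operators_state": dict(operators_state),
        }

    def restore(self, checkpoint: dict) -> dict:
        """Recover the policy's own state and hand back the operator states."""
        self.best_score = checkpoint["best_score"]
        return checkpoint["operators_state"]


manager: CheckpointManager = PeriodicOrImprovedManager(every_n_epochs=2)
scores = {1: 0.5, 2: 0.4, 3: 0.4, 4: 0.7}
decisions = [manager.should_save(epoch, score) for epoch, score in scores.items()]
```

With these scores, epochs 1 and 4 are saved because the score improved, epoch 2 is saved because it falls on the period, and epoch 3 is skipped.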
Lifecycle State Machine — OperatorTrainer inherits Exo’s Trainer
validation phase (check_agent, check_dataset, check_reward, check_config,
mark_validated), preventing training on invalid configurations.
3. Code Comparison
Defining Operators
# Agent-core
from openjiuwen.agent_evolving import LLMCallOperator
op = LLMCallOperator(
    operator_id="summarizer",
    system_prompt="Summarize the following text.",
)
tunables = op.get_tunables()  # dict of TunableSpec
# Exo
from exo.train.operator import LLMCallOperator
op = LLMCallOperator(
    name="summarizer",
    system_prompt="Summarize the following text.",
    llm_fn=my_llm_fn,
)
tunables = op.get_tunables()  # list[TunableSpec]

Running an Optimization Loop
# Agent-core
from openjiuwen.agent_evolving import (
    InstructionOptimizer, SingleDimUpdater, Trainer
)
optimizer = InstructionOptimizer(llm=meta_llm)
updater = SingleDimUpdater(optimizer)
trainer = Trainer(agent, train_data, updater)
trainer.train(epochs=3)
# Checkpoint saved implicitly
# Exo
from exo.train.operator_trainer import OperatorTrainer, OperatorTrainConfig
from exo.train.optimizer import InstructionOptimizer
from exo.train.updater import SingleDimUpdater
optimizer = InstructionOptimizer(operators=agent.operators, llm_fn=meta_llm)
updater = SingleDimUpdater(optimizer)
trainer = OperatorTrainer(updater=updater, evaluator=eval_fn)
# Validation phase (required)
trainer.check_agent(agent)
trainer.check_dataset(train_data, test_data)
trainer.check_config(OperatorTrainConfig(epochs=3, checkpoint_dir="./ckpts"))
trainer.mark_validated()
# Training phase
metrics = await trainer.train()  # async, returns TrainMetrics
# Checkpoint managed via CheckpointManager protocol

Textual Gradient Flow
# Agent-core: implicit gradient flow
optimizer.backward(failing_cases) # writes gradients internally
updates = optimizer.step() # returns new parameter values
agent.apply_updates(updates)
# Exo: explicit TextualParameter gradients
optimizer.backward(evaluated_cases) # writes TextualParameter.gradients
updates = optimizer.step() # Updates = dict[(op_id, target), value]
for (op_id, target), value in updates.items():
    operators[op_id].set_parameter(target, value)

Multi-Domain Optimization
# Agent-core
multi = MultiDimUpdater({
    "llm": InstructionOptimizer(llm),
    "tool": ToolOptimizer(llm),
    "memory": MemoryOptimizer(llm),
})
updates = multi.update(trajectories, cases)
# Exo: same pattern
from exo.train.optimizer import InstructionOptimizer, ToolOptimizer
from exo.train.updater import MultiDimUpdater
multi = MultiDimUpdater({
    "llm": InstructionOptimizer(operators, llm_fn=meta_llm),
    "tool": ToolOptimizer(operators, llm_fn=meta_llm),
})
updates = multi.update(trajectories, evaluated_cases)

4. How EvolutionPipeline/SynthesisPipeline Coexist
The operator pattern and existing evolution system serve different paradigms:
| Aspect | EvolutionPipeline | Operator Pattern |
|---|---|---|
| Optimizes | Training data (synthesis + augmentation) | Agent parameters (prompts, tool descriptions) |
| Strategy | EvolutionStrategy ABC (synthesise/train/evaluate) | BaseOptimizer ABC (backward/step) |
| Trainer | Pluggable (VeRLTrainer, custom) | OperatorTrainer (textual gradients) |
| Data flow | SynthesisPipeline → TrajectoryDataset → training | Trajectories → optimizers → parameter updates |
| Use case | Fine-tuning, RL, data augmentation | Prompt engineering, tool description tuning |
Composition Points
The two systems compose naturally:
- EvolutionStrategy using operators: An EvolutionStrategy.train() method can internally use OperatorTrainer to optimize agent parameters as part of a broader evolution loop.
- Operator optimization using SynthesisPipeline: OperatorTrainer can use SynthesisPipeline to augment its training cases before running optimization.
- Shared trajectory infrastructure: Both systems use TrajectoryDataset for data capture. The operator system adds finer-grained TrajectoryStep for attribution, but these coexist with message-level TrajectoryItem.
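# Example: EvolutionStrategy that uses operator optimization internally
# (EvolutionStrategy itself comes from the existing evolution pipeline module.)
from exo.train.operator_trainer import OperatorTrainer, OperatorTrainConfig
from exo.train.optimizer import InstructionOptimizer
from exo.train.updater import SingleDimUpdater

class OperatorEvolutionStrategy(EvolutionStrategy):
    async def train(self, agent, data, epoch):
        optimizer = InstructionOptimizer(agent.operators, llm_fn=self.llm)
        updater = SingleDimUpdater(optimizer)
        trainer = OperatorTrainer(updater=updater, evaluator=self.eval_fn)
        # Validation is required before every training run
        trainer.check_agent(agent)
        trainer.check_dataset(data)
        trainer.check_config(OperatorTrainConfig(epochs=1))
        trainer.mark_validated()
        await trainer.train()

5. Migration Table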
| Agent-Core Path | Exo Import | Symbol |
|---|---|---|
| openjiuwen.agent_evolving.Operator | exo.train.operator.Operator | ABC with get_tunables(), invoke(), get_state()/load_state() |
| openjiuwen.agent_evolving.LLMCallOperator | exo.train.operator.LLMCallOperator | Wraps LLM calls; tunables: system_prompt, user_prompt |
| openjiuwen.agent_evolving.ToolCallOperator | exo.train.operator.ToolCallOperator | Wraps tool invocations; tunables: tool_description |
| openjiuwen.agent_evolving.MemoryCallOperator | exo.train.operator.MemoryCallOperator | Wraps memory retrieval; tunables: enabled, max_retries |
| openjiuwen.agent_evolving.TunableSpec | exo.train.operator.TunableSpec | Frozen dataclass declaring tunable parameters |
| openjiuwen.agent_evolving.TunableKind | exo.train.operator.TunableKind | StrEnum: PROMPT, CONTINUOUS, DISCRETE, TOOL_SELECTOR, MEMORY_SELECTOR |
| openjiuwen.agent_evolving.InstructionOptimizer | exo.train.optimizer.InstructionOptimizer | Textual gradient prompt optimization (backward/step) |
| openjiuwen.agent_evolving.ToolOptimizerBase | exo.train.optimizer.ToolOptimizer | Beam search tool description optimization |
| openjiuwen.agent_evolving.MemoryOptimizerBase | (via MemoryCallOperator tunables) | Memory config optimization through operator tunables |
| openjiuwen.agent_evolving.SingleDimUpdater | exo.train.updater.SingleDimUpdater | Single-optimizer wrapper |
| openjiuwen.agent_evolving.MultiDimUpdater | exo.train.updater.MultiDimUpdater | Multi-domain composition with attribution |
| openjiuwen.agent_evolving.Trainer | exo.train.operator_trainer.OperatorTrainer | Extends Exo Trainer ABC with operator optimization loop |
| openjiuwen.agent_evolving.TracerTrajectoryExtractor | exo.train.trajectory.DefaultTrajectoryExtractor | Dict-based extraction (replaces tracer-span-based) |
| openjiuwen.agent_evolving.TrajectoryStep | exo.train.trajectory.TrajectoryStep | Frozen dataclass with StepKind, operator_id, timing |
| openjiuwen.agent_evolving.EvolveCheckpoint | exo.train.checkpointing.OperatorCheckpoint | Checkpoint with operators_state, updater_state, best_score |
| (no equivalent) | exo.train.operator.base.TunableKind.TOOL_SELECTOR | New kind for tool selection parameters |
| (no equivalent) | exo.train.operator.base.TunableKind.MEMORY_SELECTOR | New kind for memory selection parameters |
| (no equivalent) | exo.train.optimizer.BaseOptimizer | Formal ABC for all optimizers |
| (no equivalent) | exo.train.optimizer.TextualParameter | Explicit gradient container per operator |
| (no equivalent) | exo.train.updater.Updater | Protocol for update strategies |
| (no equivalent) | exo.train.checkpointing.CheckpointManager | Protocol for checkpoint policy |
| (no equivalent) | exo.train.checkpointing.DefaultCheckpointManager | Periodic + improvement-triggered saves |
| (no equivalent) | exo.train.checkpointing.FileCheckpointStore | JSON file persistence for checkpoints |
| (no equivalent) | exo.train.trajectory.StepKind | StrEnum: LLM, TOOL, MEMORY, WORKFLOW, AGENT |
| (no equivalent) | exo.train.trajectory.ExecutionSpec | Execution metadata (case_id, execution_id, seed, tags) |
| (no equivalent) | exo.train.trajectory.Trajectory | Container with steps + optional DAG edges |