AgentOpt · allenanie · Jun 5, 2025 · Jun 6, 2025 · Jun 9, 2025 · Jun 9, 2025
diff --git a/OVERVIEW.md b/OVERVIEW.md
@@ -1,32 +1,57 @@
 # Overview of Trace and Development Guide
 
-The library of Trace is designed to be a lightweight, modularized package to allow developers to easily try new ideas on generative optimization and integrate learning wtih their pipelines. 
-
-Currently, the Trace library has three main modules collected under the `opto` top module. 
-
-1. `opto.trace` provides the infrastructure for tracing computational workflows. It defines two primitives `trace.node` and `@trace.bundle`. They can be applied to Python objects and methods, respectively, which define the root nodes and operators of the directed acyclic graph (DAG) of computation. They both have a `trainable` flag. When set `True`, the wrapped objects are viewed as *parameters* of the computational worflow. Users can use `trace.node` and `@trace.bundle` to declare the data and computation that they wish to trace and/or adapt, and we call the resulting workflow defined by these two primitives a *traced* workflow. When running a traced workflow, a DAG will be automatiically created by Trace as a data structure, which will later be sent to optimizers in `opto.optimizers`for updates (upon calling `node.backward` with soem feedback).
-
-2. `opto.optimizers` has a collection of generative optimization algorithms, whose API is defined by an abstract class `Optimizer`. Think them like gradient algorithms. Their job is to propose a new version of the parameters (i.e. those set with `trainable=True`) when receiving a computational graph (DAG) and the feedback given to the computed output. Typically, these algorithms can be viewed as an LLM agent, which makes calls to LLM to analyze the computational graph and the feedback, and to propose updates. In Trace library, we provide implementation of several popular optimizers, such `OptoPrime`, `TextGrad`, and `OPRO`.
-
-3. `opto.trainers` are a collection of training algorithms (under the `AlgorithmBase` class) that use optimizers in `opto.optimizers` as subroutines to improve a given workflow following a feedback oracle constructed by datasets, interactive environments, etc. While `Optimizer` defines a low-level *optimization* API, `AlgorithmBase` defines a high-level *learning* API which standarizes the format of agent (by the `Module` class created by `@trace.model`), the data loader (by the `DataLoader` class), and the feedback oracle (by the `AutoGuide` class). With this common abstraction, we offer training algorithms, from the basic `MinibatchAlgorithm` which trains minibatches of samples to search algorithms like `BeamSearch`. The `AlgorithmBase` also handles logging of the training process. While there are overlapping between the functions of `Optimizer` and `AlgorithmBase`, the main distinction is that algorithms under `AlgorithmBase` are meta algorithms, as they should work for different optimizers in `opto.optimizers`.
-
-
-4. `opto.utils` has a collection of helper functions and backends, which are reusable for various applications. This includes, e.g., abstraction of LLMs, database, etc. Making use of all these utils would requie installing optional depedencies.
-
-
-In summary, `opto.trace` is the infrastructure, `opto.optimizers` are algorithms that process feedback and propose new parameter candidates, and `opto.trainers` are algorithms built on top of `opto.trace` and `opto.optimizers` to train learning agents.
-
-## Common Workflow of Using Trace
-
-1. Use `trace.node` and `@trace.bundle` to define the traceable workflow and its trainable parameter. 
-2. Wrap the workflow as a `trace.Module` using `@trace.model`
-3. Create a dataloader using `DataLoader` and define the feedback oracle (an analogy of loss function) using `AutoGuide`. 
-4. Create a trainer from `opto.trainers` using optimizers from `opto.optimizers` and the above module, dataloader, and feedback oracle.
+The Trace library is a lightweight, modular package designed to allow developers to experiment easily with generative optimization and integrate feedback-driven learning into their computational workflows.
+The library has four modules within the `opto` top-level namespace:
+
+1. `opto.trace` provides the infrastructure for converting Python code, as it executes, into symbolic directed acyclic graphs (DAGs).
+It defines two tracing primitives:
+    - `trace.node`: Wraps Python objects, designating them as nodes within the computational graph.
+    - `@trace.bundle`: Decorates Python methods/functions, marking them as operators within the graph.
+
+Each primitive has a `trainable` flag. 
+When set to `True`, these marked nodes and bundles become the trainable *parameters* of the workflow.
+By using these primitives, developers can create a *traced workflow* represented as a DAG.
+This DAG structure is automatically constructed at runtime, capturing both computational dependencies and trainable parameters, ready for optimization.
+
+2. `opto.optimizers` has an abstract class `Optimizer` that defines the interface for algorithms that take computation DAGs and associated feedback objects as input and output new values for the trainable parameters.
+These algorithms are analogous to gradient-based optimizers in PyTorch, but are typically implemented as generative optimization agents, leveraging LLMs to analyze feedback and propose parameter updates.
+We provide implementations of several generative optimizers:
+    - `OptoPrime`
+    - `TextGrad`
+    - `OPRO`
+
+3. `opto.trainers` has the `AlgorithmBase` abstraction that orchestrates the overall training process.
+Trainers manage data handling, tracing control, feedback collection, optimizer invocation, and iterating/stopping. Specifically, a trainer:
+    - Controls data sampling (via `DataLoader`).
+    - Determines when DAGs are constructed and when feedback (e.g. via `AutoGuide`) is collected.
+    - Invokes `optimizers` for parameter updates, possibly repeatedly, and manages the training loop.
+    - Logs training progress.
+
+Although `optimizers` handle lower-level optimization decisions, trainers under `AlgorithmBase` manage broader training logic and are designed to be compatible across various `optimizers`.
+We provide implementations of common trainers: `MinibatchAlgorithm` (basic minibatch training) and `BeamSearch` (example of search-based training).
+
+4. `opto.utils` has a collection of reusable helper functions and backend utilities, including abstractions for:
+    - Large Language Models (LLMs)
+    - Databases
+    - Miscellaneous support tools.
+
+Note: Some utilities might require installing optional dependencies.
+
+## Concise Summary of Abstractions
+  - `trace`: Infrastructure to construct symbolic computational DAGs.
+  - `optimizers`: Receive DAG and feedback, output parameter values.
+  - `trainer`: Manages DAG construction, data sampling, feedback collection, optimizer invocation, and training workflow control.
+
+## Common Workflow for Using Trace
+
+1. Define a traceable workflow with `trace.node` and `@trace.bundle`, marking trainable parameters. 
+2. Wrap this workflow into a `trace.Module` with `@trace.model`.
+3. Define a dataloader (`DataLoader`) and feedback oracle (analogous to a loss function, using e.g. `AutoGuide`). 
+4. Instantiate a trainer from `opto.trainers`, specifying the optimizer from `opto.optimizers` alongside the module, dataloader, and feedback oracle defined above.
 5. Run the trainer. 
 
-
-## Common Workflow of Improving Trace
-- **Developing new optimization agent** Contribute to `trace.optimizers` and design new algorithms under `Optimizer`
-- **Developing new learning algorithms** Contribute to `trace.trainers` (and `trace.optimizers` when necessary). Design new algorithms under `AlgorithmBase`, new dataloader under `DataLoader`, or new feedback oracle under `AutoGuide`. 
-- **Improving infrastructure**  Propose updates to change `opto.trace` (e.g., to improve UI, add new tracing, etc.)
-- **Onboarding other utility tools** Add to `opto.utils` and update `setup.py` with optional requirements.
+## Guidelines for Improving and Extending Trace
+  - **New optimization agents**: Contribute to `opto.optimizers` by subclassing the `Optimizer` abstraction.
+  - **New learning algorithms**: Contribute to `opto.trainers` (and `opto.optimizers` if necessary). Design new algorithms by subclassing `AlgorithmBase`, new dataloaders under `DataLoader`, or new feedback oracles under `AutoGuide`.
+  - **Improving infrastructure**: Propose modifications to `opto.trace` to improve tracing capability or user experience, or to add functionality.
+  - **Onboarding other utility tools**: Add helpful tools to `opto.utils` and update `setup.py` accordingly for optional dependencies.
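
The division of labor described in the overview (a feedback oracle produces feedback, an optimizer turns feedback into new parameter values, a trainer owns the loop) can be illustrated with a toy sketch. Nothing here uses the Trace API: `guide`, `ToyOptimizer`, and `ToyTrainer` are hypothetical stand-ins, and the "parameter" is a plain integer rather than a traced node.

```python
from typing import Dict, List, Tuple


def guide(prediction: int, target: int) -> str:
    """Toy feedback oracle: returns textual feedback rather than a numeric loss."""
    if prediction < target:
        return "too low"
    if prediction > target:
        return "too high"
    return "correct"


class ToyOptimizer:
    """Toy optimizer (hypothetical): maps feedback to a new parameter value."""

    def __init__(self, params: Dict[str, int]):
        self.params = params

    def step(self, feedback: str) -> None:
        # A generative optimizer would ask an LLM here; this one nudges by 1.
        if feedback == "too low":
            self.params["offset"] += 1
        elif feedback == "too high":
            self.params["offset"] -= 1


class ToyTrainer:
    """Toy trainer (hypothetical): owns data iteration, feedback collection, and the loop."""

    def __init__(self, params: Dict[str, int], optimizer: ToyOptimizer):
        self.params = params
        self.optimizer = optimizer

    def train(self, dataset: List[Tuple[int, int]], num_epochs: int) -> None:
        for _ in range(num_epochs):
            for x, target in dataset:
                prediction = x + self.params["offset"]  # the "workflow" being tuned
                self.optimizer.step(guide(prediction, target))


params = {"offset": 0}
trainer = ToyTrainer(params, ToyOptimizer(params))
trainer.train([(1, 4), (2, 5), (10, 13)], num_epochs=2)
print(params["offset"])
```

Because the trainer only calls `optimizer.step(feedback)`, a different optimizer could be swapped in without touching the loop, which is the compatibility property the overview attributes to trainers under `AlgorithmBase`.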
diff --git a/...timizers_tests/test_trainer_refactored.py → examples/gsm8k_trainer_example.py b/...timizers_tests/test_trainer_refactored.py → examples/gsm8k_trainer_example.py
@@ -2,16 +2,16 @@
 import numpy as np
 from opto import trace
 from opto.utils.llm import LLM, LiteLLM
-from opto.optimizers.utils import print_color
 from opto.optimizers import OptoPrime
-from opto.trainer.algorithms.basic_algorithm import BatchedFeedback
+from opto.trainer.algorithms.basic_algorithms import MinibatchAlgorithm
+from opto.trainer.loggers import DefaultLogger, TensorboardLogger
 from opto.trainer.guide import VerbalJudgeGuide
 from typing import Any
 
 
 @trace.model
 class Learner:
-    # A basic LLM agent.
+    """ A basic LLM agent. """
 
     def __init__(self, system_prompt: str = "You're a helpful agent",
                  user_prompt_template: str = "Query: {message}",
@@ -22,9 +22,15 @@ def __init__(self, system_prompt: str = "You're a helpful agent",
 
     @trace.bundle()
     def model(self, system_prompt: str, user_prompt_template: str, message: str) -> str:
-        """ Call the LLM model. system_prompt specifies
-        the behavior of the agent. user prompt is the input to the agent, which
-        is formatted as user_prompt_template.format(message=message)."""
+        """Call the LLM model.
+
+        Args:
+            system_prompt: the system prompt to the agent. By tuning this prompt, we can control the behavior of the agent. For example, it can be used to provide instructions to the agent (such as how to reason about the problem, how to answer the question), or provide in-context examples of how to solve the problem.
+            user_prompt_template: the user prompt template to the agent. It is used to format the input to the agent as user_prompt_template.format(message=message).
+            message: the input to the agent. It can be a query, a task, code, etc.
+        Returns:
+            The response from the agent.
+        """
 
         if '{message}' not in user_prompt_template:
             raise ValueError("user_prompt_template must contain '{message}'")
@@ -39,42 +45,48 @@ def forward(self, message: Any) -> Any:
         """ Forward pass of the agent. """
         return self.model(self.system_prompt, self.user_prompt_template, message)
 
-class Logger:
-    def log(self, *messages, color=None, **kwargs):
-        print_color(messages, color=color)
+
+Guide = VerbalJudgeGuide
+Logger = TensorboardLogger
 
 
 def main():
     # set seed
     seed = 42
     num_epochs = 1
     batch_size = 1
-    eval_frequency = 1
-    teacher_model = "gpt-4o-mini" #"gpt-4o-mini_2024-07-18"
-    student_model = "gpt-35-turbo_1106"
+    eval_frequency = -1
+    verbose = True
+    teacher_model = None  # use default model
+    student_model = None  # use default model
 
     np.random.seed(seed)
 
-    train_dataset = datasets.load_dataset('openai/gsm8k', 'main')['train'][
-                    :10]  # NOTE for now, we train on a smaller portion
+    # In this example, we use the GSM8K dataset, which is a dataset of math word problems.
+    # We will look at the training error of the agent on a small portion of this dataset.
+    train_dataset = datasets.load_dataset('openai/gsm8k', 'main')['train'][:10]
     train_dataset = dict(inputs=train_dataset['question'], infos=train_dataset['answer'])
-    test_dataset = train_dataset  # NOTE for now, we just look at training error
-
-    agent = Learner(llm=LiteLLM(model="gpt-3.5-turbo"))
-
-    guide = VerbalJudgeGuide(model=teacher_model)
-
-    alg = BatchedFeedback(agent=agent,
-                          optimizer=OptoPrime(agent.parameters()),
-                          logger=Logger())
-
+    test_dataset = train_dataset
+
+    agent = Learner(llm=LLM(student_model))
+    guide = Guide(model=teacher_model)
+    # set use_json_object_format=False if LLM does not support JSON object format
+    optimizer = OptoPrime(agent.parameters())
+    logger = Logger(verbose=verbose)
+
+    alg = MinibatchAlgorithm(
+            agent=agent,
+            optimizer=optimizer,
+            logger=logger)
+
     alg.train(guide,
               train_dataset,
               num_epochs=num_epochs,
               batch_size=batch_size,
               eval_frequency=eval_frequency,
               test_dataset=test_dataset,
-              num_threads=3)
+              num_threads=3,
+              verbose='output' if verbose else False)
 
 
 if __name__ == "__main__":

diff --git a/examples/minibatch_bbh_aynsc/run_bigbench_trace_async.py b/examples/minibatch_bbh_aynsc/run_bigbench_trace_async.py
@@ -10,7 +10,7 @@
 import autogen
 import pickle
 import os
-from opto.trainer.algorithms.basic_algorithm import MinibatchAlgorithm, evaluate
+from opto.trainer.algorithms.basic_algorithms import MinibatchAlgorithm, evaluate
 from opto.trainer.guide import AutoGuide
 
 

diff --git a/opto/optimizers/__init__.py b/opto/optimizers/__init__.py
@@ -1,7 +1,9 @@
-from opto.optimizers.optoprime import OptoPrime
+from opto.optimizers.optoprime import OptoPrime as OptoPrimeV1
 from opto.optimizers.optoprimemulti import OptoPrimeMulti
 from opto.optimizers.opro import OPRO
 from opto.optimizers.textgrad import TextGrad
-from opto.optimizers.optoprime_batchopt import OptoprimeBatchOpt
+from opto.optimizers.optoprime_v2 import OptoPrimeV2
 
-__all__ = ["OPRO", "OptoPrime", "OptoPrimeMulti", "TextGrad", "OptoprimeBatchOpt"]
+OptoPrime = OptoPrimeV1
+
+__all__ = ["OPRO", "OptoPrime", "OptoPrimeMulti", "TextGrad", "OptoPrimeV2", "OptoPrimeV1"]
diff --git a/opto/optimizers/optimizer.py b/opto/optimizers/optimizer.py
@@ -54,10 +54,19 @@ def trace_graph(self):
 
     def step(self, bypassing=False, *args, **kwargs):
         update_dict = self.propose(*args, **kwargs)
+        self.project(update_dict)
         if not bypassing:
             self.update(update_dict)
         return update_dict  # TODO add reasoning
 
+    def project(self, update_dict: Dict[ParameterNode, Any]):
+        """Project the update dictionary onto the feasible set."""
+        for p, d in update_dict.items():
+            if p.trainable:
+                for projection in p.projections:
+                    d = projection.project(d)
+            update_dict[p] = d
+
     def propose(self, *args, **kwargs):
         """Propose the new data of the parameters based on the feedback."""
         return self._step(*args, **kwargs)
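
To make the new `project` step concrete, here is a minimal self-contained sketch of the same loop. It does not import Trace: `ToyParameter` and `ClipProjection` are hypothetical stand-ins for `ParameterNode` and a projection object, and the only interface assumed from the diff is a `project(value)` method plus the `trainable` and `projections` attributes. Since `bundle` defaults to `projections=None`, the sketch normalizes `None` to an empty list in the constructor; whether `ParameterNode` does the same is not visible in this diff.

```python
from typing import Any, Dict, List, Optional


class ClipProjection:
    """Hypothetical projection: clamps a number into [low, high]."""

    def __init__(self, low: float, high: float):
        self.low = low
        self.high = high

    def project(self, x: float) -> float:
        return max(self.low, min(self.high, x))


class ToyParameter:
    """Hypothetical stand-in for a parameter node."""

    def __init__(self, name: str, trainable: bool = True,
                 projections: Optional[List[Any]] = None):
        self.name = name
        self.trainable = trainable
        # Normalize None to [] so that iterating over projections is always safe.
        self.projections = list(projections or [])


def project(update_dict: Dict[ToyParameter, Any]) -> None:
    """Apply each trainable parameter's projections, in order, to its proposed value."""
    for p, d in update_dict.items():
        if p.trainable:
            for projection in p.projections:
                d = projection.project(d)
        # Reassigning an existing key while iterating does not resize the dict.
        update_dict[p] = d


temperature = ToyParameter("temperature", projections=[ClipProjection(0.0, 1.0)])
frozen = ToyParameter("frozen", trainable=False, projections=[ClipProjection(0.0, 1.0)])
plain = ToyParameter("plain")  # no projections given

updates = {temperature: 1.7, frozen: 5.0, plain: -2.0}
project(updates)
print({p.name: v for p, v in updates.items()})
```

Only the trainable parameter with a projection is changed; the non-trainable and projection-free entries pass through untouched. Note that in the diff `project` runs before the `bypassing` check, so the dictionary returned by `step` is projected even when the update is not applied.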

diff --git a/opto/optimizers/optoprime.py b/opto/optimizers/optoprime.py
@@ -259,6 +259,7 @@ def __init__(
         max_tokens=4096,
         log=True,
         prompt_symbols=None,
+        use_json_object_format=True,  # whether to use json object format for the response when calling LLM
         **kwargs,
     ):
         super().__init__(parameters, *args, propagator=propagator, **kwargs)
@@ -294,6 +295,7 @@ def __init__(
         self.prompt_symbols = copy.deepcopy(self.default_prompt_symbols)
         if prompt_symbols is not None:
             self.prompt_symbols.update(prompt_symbols)
+        self.use_json_object_format = use_json_object_format
 
     def default_propagator(self):
         """Return the default Propagator object of the optimizer."""
@@ -478,11 +480,7 @@ def construct_update_dict(
         for node in self.parameters:
             if node.trainable and node.py_name in suggestion:
                 try:
-                    from black import format_str, FileMode
                     formatted_suggestion = suggestion[node.py_name]
-                    # use black formatter for code reformatting
-                    if type(formatted_suggestion) == str and 'def' in formatted_suggestion:
-                        formatted_suggestion = format_str(formatted_suggestion, mode=FileMode())
                     update_dict[node] = type(node.data)(formatted_suggestion)
                 except (ValueError, KeyError) as e:
                     # catch error due to suggestion missing the key or wrong data type
@@ -561,15 +559,13 @@ def call_llm(
             {"role": "system", "content": system_prompt},
             {"role": "user", "content": user_prompt},
         ]
-
+
+        response_format = {"type": "json_object"} if self.use_json_object_format else None
         try:  # Try tp force it to be a json object
-            response = self.llm(
-                messages=messages,
-                response_format={"type": "json_object"},
-                max_tokens=max_tokens,
-            )
+            response = self.llm(messages=messages, max_tokens=max_tokens, response_format=response_format)
         except Exception:
             response = self.llm(messages=messages, max_tokens=max_tokens)
+
         response = response.choices[0].message.content
 
         if verbose:
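
The request-then-fallback pattern in this hunk can be sketched in isolation as follows. This is an illustration under assumptions, not the code in `optoprime.py`: `call_with_optional_json` and `strict_backend` are hypothetical names, and unlike the diff (which passes `response_format=None` when the flag is off) the sketch omits the key entirely and only retries when there is something to strip.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, str]


def call_with_optional_json(llm: Callable[..., Any],
                            messages: List[Message],
                            max_tokens: int,
                            use_json_object_format: bool = True) -> Any:
    """Request a JSON object if enabled; fall back to a plain call if that fails."""
    kwargs: Dict[str, Any] = {"messages": messages, "max_tokens": max_tokens}
    if use_json_object_format:
        kwargs["response_format"] = {"type": "json_object"}
    try:
        return llm(**kwargs)
    except Exception:
        if "response_format" not in kwargs:
            raise  # nothing to strip, so a retry would be identical
        kwargs.pop("response_format")
        return llm(**kwargs)


def strict_backend(**kwargs: Any) -> str:
    """Hypothetical backend that rejects the response_format argument."""
    if "response_format" in kwargs:
        raise TypeError("response_format is not supported")
    return f"plain reply to {len(kwargs['messages'])} message(s)"


msgs = [{"role": "user", "content": "hello"}]
print(call_with_optional_json(strict_backend, msgs, max_tokens=64))
print(call_with_optional_json(strict_backend, msgs, max_tokens=64,
                              use_json_object_format=False))
```

Setting the flag to `False` up front avoids the failed first request for backends known not to support JSON object output, which appears to be the motivation for the new `use_json_object_format` option.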

diff --git a/opto/optimizers/optoprime_batchopt.py → opto/optimizers/optoprime_v2.py b/opto/optimizers/optoprime_batchopt.py → opto/optimizers/optoprime_v2.py
@@ -3,7 +3,7 @@
 from opto.optimizers.optoprime import OptoPrime
 
 
-class OptoprimeBatchOpt(OptoPrime):
+class OptoPrimeV2(OptoPrime):
     # This is generic representation prompt, which just explains how to read the problem.
     representation_prompt = dedent(
         """

diff --git a/opto/trace/__init__.py b/opto/trace/__init__.py
@@ -4,6 +4,7 @@
 from opto.trace.broadcast import apply_op
 import opto.trace.propagators as propagators
 import opto.trace.operators as operators
+import opto.trace.projections as projections
 
 from opto.trace.nodes import Node, GRAPH
 from opto.trace.nodes import node

diff --git a/opto/trace/bundle.py b/opto/trace/bundle.py
@@ -39,6 +39,7 @@ def bundle(
     catch_execution_error=True,
     allow_external_dependencies=False,
     overwrite_python_recursion=False,
+    projections=None,
 ):
     """Wrap a function as a FunModule which returns node objects.
 
@@ -53,6 +54,7 @@ def bundle(
         catch_execution_error (bool, optional): Whether to catch exceptions during operator execution. Defaults to True.
         allow_external_dependencies (bool, optional): Whether to allow external dependencies. Defaults to False.
         overwrite_python_recursion (bool, optional): Whether to overwrite Python recursion behavior. Defaults to False.
+        projections (List[Projection], optional): List of projections to apply when updating the trainable parameter. Defaults to None.
 
     Returns:
         FunModule: The wrapped function that returns node objects.
@@ -70,6 +72,7 @@ def decorator(fun):
             allow_external_dependencies=allow_external_dependencies,
             overwrite_python_recursion=overwrite_python_recursion,
             _ldict=prev_f_locals,  # Get the locals of the calling function
+            projections=projections,
         )
         return fun_module
 
@@ -124,6 +127,7 @@ def __init__(
         catch_execution_error=True,
         allow_external_dependencies=False,
         overwrite_python_recursion=False,
+        projections=None,
         _ldict=None,
     ):
 
@@ -183,10 +187,12 @@ def __init__(
                 signature = re.search(r"\s*(def.*:)", source).group(1)
             else:
                 signature = signature_sr.group(1)
+
             self.parameter = ParameterNode(
                 self.info["source"],
                 name="__code",
                 constraint="The code should start with:\n" + signature,
+                projections=projections,
             )
 
     @property