Enable AFlow Optimization for All Evaluators #36

@RuishanFang

Description:
The AFlowOptimizer has been integrated into our framework to automatically optimize agent workflows. It currently works only with humaneval_evaluator. To extend this capability to all supported benchmarks, each evaluator must be made compatible with the optimizer's requirements.

Proposed Evaluators to Extend

  • aime_evaluator.py
  • bbh_evaluator.py
  • drop_evaluator.py
  • gaia_evaluator.py
  • gsm8k_evaluator.py
  • hotpotqa_evaluator.py
  • ifeval_evaluator.py
  • math_evaluator.py
  • mbpp_evaluator.py
  • mmlu_pro_evaluator.py
  • swebench_evaluator.py

Implementation Considerations:

  • Implement the async_evaluate Method
  • Define and Load Datasets for Optimization and Testing
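
As a rough illustration of the two considerations above, here is a minimal sketch of what an optimizer-compatible evaluator might look like. Everything in it is an assumption made for illustration: the class name `ExactMatchEvaluator`, the `split_dataset` helper, the `question`/`answer` field names, the exact-match scoring rule, and the `async_evaluate(workflow, example)` signature are hypothetical and are not taken from the existing humaneval_evaluator or from AFlowOptimizer. Each real benchmark would need its own data loader and scoring logic.

```python
import asyncio
import random
from typing import Any, Awaitable, Callable, Dict, List, Tuple

# Hypothetical types: an example is a dict, a workflow is an async callable.
Example = Dict[str, Any]
Workflow = Callable[[str], Awaitable[str]]


def split_dataset(
    examples: List[Example], dev_fraction: float = 0.2, seed: int = 42
) -> Tuple[List[Example], List[Example]]:
    """Shuffle deterministically, then split into (optimization, test) sets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * dev_fraction)
    return shuffled[:cut], shuffled[cut:]


class ExactMatchEvaluator:
    """Hypothetical evaluator that scores one example at a time, asynchronously."""

    def __init__(self, examples: List[Example], dev_fraction: float = 0.2) -> None:
        # dev_data is what an optimizer would iterate on; test_data is held out.
        self.dev_data, self.test_data = split_dataset(examples, dev_fraction)

    async def async_evaluate(self, workflow: Workflow, example: Example) -> float:
        """Run the workflow on one example and return a score in [0, 1]."""
        prediction = await workflow(example["question"])
        expected = str(example["answer"]).strip()
        return 1.0 if prediction.strip() == expected else 0.0

    async def evaluate_split(self, workflow: Workflow, split: str = "dev") -> float:
        """Average the per-example scores over the chosen split."""
        data = self.dev_data if split == "dev" else self.test_data
        if not data:
            return 0.0
        scores = await asyncio.gather(
            *(self.async_evaluate(workflow, ex) for ex in data)
        )
        return sum(scores) / len(scores)


async def echo_workflow(question: str) -> str:
    """Stand-in workflow for demonstration; a real one would call an agent."""
    return question


if __name__ == "__main__":
    toy = [{"question": str(i), "answer": str(i)} for i in range(10)]
    evaluator = ExactMatchEvaluator(toy, dev_fraction=0.2)
    print(asyncio.run(evaluator.evaluate_split(echo_workflow, "dev")))
```

Keeping the optimization and test splits separate in the evaluator means the optimizer can score candidate workflows on one split while final results are reported on data it never saw. The fixed seed is only there to make the split reproducible across runs.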

References:
