
Conversation

@nickfraser nickfraser commented Aug 22, 2025

Builds on #1356. Introduces a BenchmarkSearchMixin, which determines how experiments are generated with the benchmark utils. The first implementation, GridSearchMixin, replicates the behaviour of the benchmark utils before this PR.

The second is RandomSearchMixin, which allows a sampling strategy to be specified per parameter, for example:

act_equalization:
  rand_type: choices
  rand_values: [null, "layerwise", "fx"]
act_equalization_alpha:
  rand_type: linear
  rand_values: [0.05, 0.95]
gptq:
  rand_type: const
  rand_values: true
learned_round_lr:
  rand_type: log2
  rand_values: [0.0001, 0.1]
learned_round_scale_lr:
  rand_type: exp2
  rand_values: [0.0001, 0.1]
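For reference, each rand_type above can be read as a sampling rule. Below is a minimal, hypothetical sketch of how such entries might be sampled; the function name and the exact distributions (in particular whether log2 and exp2 differ) are assumptions, not the actual benchmark-utils implementation.

```python
import math
import random

def sample_param(rand_type, rand_values, rng=random):
    """Hypothetical sampler for one config entry; not the real implementation."""
    if rand_type == "const":
        return rand_values                      # fixed value, e.g. gptq: true
    if rand_type == "choices":
        return rng.choice(rand_values)          # uniform over a discrete set
    lo, hi = rand_values
    if rand_type == "linear":
        return rng.uniform(lo, hi)              # uniform on [lo, hi]
    if rand_type in ("log2", "exp2"):
        # Assumed: log-uniform sampling, i.e. uniform in log2-space,
        # then mapped back with 2**x.
        return 2.0 ** rng.uniform(math.log2(lo), math.log2(hi))
    raise ValueError(f"unknown rand_type: {rand_type}")
```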

Running as follows:

python llm_rand_benchmark.py --config benchmark_rand_template.yaml --dry-run --seed 1 --num-experiments 10

Will give the following output:

Num. experiments: 10
Benchmark args.:
        config: benchmark_rand_template.yaml
        results_folder: ./
        gpus: 0
        num_gpus_per_process: 1
        max_num_retries: 1
        dry_run: True
        num_experiments: 10
        max_experimental_configs: 100000
        seed: 1
Non-default args.:
        --act-equalization: type: choices, values: [None, 'layerwise', 'fx']
        --act-equalization-alpha: type: linear, min: 0.05, max: 0.95
        --gptq: type: const, value: True
        --learned-round-lr: type: log2, min: 0.0001, max: 0.1
        --learned-round-scale-lr: type: exp2, min: 0.0001, max: 0.1

Note, a limitation with the current approach is that the worker queue generation (i.e., the random search space) and execution (i.e., the experiments) are two separate steps, meaning that it's difficult to integrate experiment feedback into the search space selection (e.g., for Bayesian optimization or simulated annealing).
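The two-phase structure behind that limitation can be sketched as follows. The function and method names here are illustrative only, loosely based on the helpers visible in this PR (standardize_args, gen_search_space), not the actual API.

```python
# Illustrative two-phase structure; names are hypothetical.
def run_benchmark(entrypoint_utils, script_args, num_experiments):
    # Phase 1: the worker queue (i.e., the random search space) is
    # generated entirely up front.
    args_dict = entrypoint_utils.standardize_args(script_args)
    queue = entrypoint_utils.gen_search_space(args_dict, script_args)

    # Phase 2: experiments are executed from the frozen queue. Since no
    # results exist during Phase 1, experiment feedback cannot steer the
    # sampling (as Bayesian optimization or simulated annealing would need).
    results = []
    for experiment in queue[:num_experiments]:
        results.append(entrypoint_utils.run_experiment(experiment))
    return results
```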

@nickfraser nickfraser marked this pull request as ready for review December 2, 2025 12:43
Collaborator Author

@nickfraser nickfraser left a comment

Switch to Mixin.

'Error: weight_quant_rescaling_init must be positive.'
if (int(args.gptq) + int(args.gpfq) + int(args.qronos)) > 1:
    warn("GPTQ, GPFQ, and/or Qronos are enabled together.")
    assert False, "GPTQ, GPFQ, and/or Qronos are enabled together."
Collaborator Author

Let me know if you prefer me to revert this. In my view, this should be a failure rather than just a warning. Could also be a separate PR.

@nickfraser nickfraser requested a review from pablomlago December 2, 2025 13:07
'Error: weight_quant_rescaling_init must be positive.'
if (int(args.gptq) + int(args.gpfq) + int(args.qronos)) > 1:
    warn("GPTQ, GPFQ, and/or Qronos are enabled together.")
    assert False, "GPTQ, GPFQ, and/or Qronos are enabled together."
Collaborator

I'm fine with it, being such a small change. Nit:

Suggested change
assert False, "GPTQ, GPFQ, and/or Qronos are enabled together."
assert (int(args.gptq) + int(args.gpfq) + int(args.qronos)) <= 1, "GPTQ, GPFQ, and/or Qronos cannot be enabled together."

class GridSearchMixin(BenchmarkSearchMixin):

    @classmethod
    def standardize_args(cls, script_args: str) -> Dict[str, List]:
Collaborator

Suggested change
def standardize_args(cls, script_args: str) -> Dict[str, List]:
def standardize_args(cls, script_args: Namespace) -> Dict[str, List]:

pass


class BenchmarkSearchMixin(ABC):
Collaborator

@pablomlago pablomlago Dec 2, 2025

Indicate that the concrete classes need to provide the abstract class variable argument_parser, e.g.

    @property
    @abstractmethod
    def argument_parser(self) -> ArgumentParser:
        pass
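A concrete mixin would then satisfy the abstract property, e.g. (sketch only; the parser contents are placeholders, not the real benchmark arguments):

```python
from abc import ABC, abstractmethod
from argparse import ArgumentParser

class BenchmarkSearchMixin(ABC):

    @property
    @abstractmethod
    def argument_parser(self) -> ArgumentParser:
        pass

class GridSearchMixin(BenchmarkSearchMixin):

    @property
    def argument_parser(self) -> ArgumentParser:
        # Placeholder arguments; the real parser lives in the benchmark utils.
        parser = ArgumentParser()
        parser.add_argument("--config", type=str)
        return parser
```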

class RandomSearchMixin(BenchmarkSearchMixin):

    @classmethod
    def standardize_args(cls, script_args: str) -> Dict[str, List]:
Collaborator

Suggested change
def standardize_args(cls, script_args: str) -> Dict[str, List]:
def standardize_args(cls, script_args: Namespace) -> Dict[str, List]:

return id_str


class RandomSearchMixin(BenchmarkSearchMixin):
Collaborator

@pablomlago pablomlago Dec 2, 2025

There is a subset of the functionality of standardize_args which is shared between GridSearchMixin and RandomSearchMixin, e.g. YAML reading. Is it possible to extract the common functionality into the abstract parent class?

return args_dict

@staticmethod
def parse_config_args(args: List[str]) -> Namespace:
Collaborator

Most of the arguments are shared between GridSearchMixin and RandomSearchMixin; I would consider extracting the common logic.

print(f"\t{key}: {value}")


class GridSearchBenchmarkUtils(BenchmarkUtils, GridSearchMixin):
Collaborator

@pablomlago pablomlago Dec 2, 2025

Can we remove these? I would rather go for having multiple inheritance in the entrypoints, e.g.

class ImagenetPTQBenchmarkUtils(BenchmarkUtils, GridSearchMixin):
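That composition can be sketched as follows, with placeholder class bodies; LLMRandBenchmarkUtils is a hypothetical name for a random-search entrypoint, not a class in this PR:

```python
# Placeholder classes illustrating composition at the entrypoint.
class BenchmarkUtils:
    def run(self):
        return "run experiments"

class GridSearchMixin:
    def gen_search_space(self):
        return "grid search space"

class RandomSearchMixin:
    def gen_search_space(self):
        return "random search space"

# Each entrypoint picks its search strategy directly, so pre-combined
# classes like GridSearchBenchmarkUtils become unnecessary:
class ImagenetPTQBenchmarkUtils(BenchmarkUtils, GridSearchMixin):
    pass

class LLMRandBenchmarkUtils(BenchmarkUtils, RandomSearchMixin):
    pass
```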

q = q[start_index:end_index]
args_dict = entrypoint_utils.standardize_args(script_args)
# Generate a list of experiments
q = entrypoint_utils.gen_search_space(args_dict, script_args)
Collaborator

Maybe it is worth renaming q to something more self-explanatory.
