
Conversation

@nickfraser nickfraser commented Aug 22, 2025

Builds on #1356. Introduces a BenchmarkSearchMixin, which determines how experiments are generated with the benchmark utils. The first implementation, GridSearchMixin, replicates the behaviour of the benchmark utils before this PR.

The second is RandomSearchMixin, which allows a sampling strategy to be specified per parameter, for example:

act_equalization:
  rand_type: choices
  rand_values: [null, "layerwise", "fx"]
act_equalization_alpha:
  rand_type: linear
  rand_values: [0.05, 0.95]
gptq:
  rand_type: const
  rand_values: true
learned_round_lr:
  rand_type: log2
  rand_values: [0.0001, 0.1]
learned_round_scale_lr:
  rand_type: exp2
  rand_values: [0.0001, 0.1]
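For reference, each rand_type above can be read as a sampling rule. Below is a minimal, hypothetical sketch of how such entries might be sampled; the function name and the exact distributions (in particular whether log2 and exp2 differ) are assumptions, not the actual benchmark-utils implementation.

```python
import math
import random

def sample_param(rand_type, rand_values, rng=random):
    """Hypothetical sampler for one config entry; not the real implementation."""
    if rand_type == "const":
        return rand_values                      # fixed value, e.g. gptq: true
    if rand_type == "choices":
        return rng.choice(rand_values)          # uniform over a discrete set
    lo, hi = rand_values
    if rand_type == "linear":
        return rng.uniform(lo, hi)              # uniform on [lo, hi]
    if rand_type in ("log2", "exp2"):
        # Assumed: log-uniform sampling, i.e. uniform in log2-space,
        # then mapped back with 2**x.
        return 2.0 ** rng.uniform(math.log2(lo), math.log2(hi))
    raise ValueError(f"unknown rand_type: {rand_type}")
```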

Running as follows:

python llm_rand_benchmark.py --config benchmark_rand_template.yaml --dry-run --seed 1 --num-experiments 10

Will give the following output:

Num. experiments: 10
Benchmark args.:
        config: benchmark_rand_template.yaml
        results_folder: ./
        gpus: 0
        num_gpus_per_process: 1
        max_num_retries: 1
        dry_run: True
        num_experiments: 10
        max_experimental_configs: 100000
        seed: 1
Non-default args.:
        --act-equalization: type: choices, values: [None, 'layerwise', 'fx']
        --act-equalization-alpha: type: linear, min: 0.05, max: 0.95
        --gptq: type: const, value: True
        --learned-round-lr: type: log2, min: 0.0001, max: 0.1
        --learned-round-scale-lr: type: exp2, min: 0.0001, max: 0.1

Note, a limitation with the current approach is that the worker queue generation (i.e., the random search space) and execution (i.e., the experiments) are two separate steps, meaning that it's difficult to integrate experiment feedback into the search space selection (e.g., for Bayesian optimization or simulated annealing).
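The two-phase structure behind that limitation can be sketched as follows. The function and method names here are illustrative only, loosely based on the helpers visible in this PR (standardize_args, gen_search_space), not the actual API.

```python
# Illustrative two-phase structure; names are hypothetical.
def run_benchmark(entrypoint_utils, script_args, num_experiments):
    # Phase 1: the worker queue (i.e., the random search space) is
    # generated entirely up front.
    args_dict = entrypoint_utils.standardize_args(script_args)
    queue = entrypoint_utils.gen_search_space(args_dict, script_args)

    # Phase 2: experiments are executed from the frozen queue. Since no
    # results exist during Phase 1, experiment feedback cannot steer the
    # sampling (as Bayesian optimization or simulated annealing would need).
    results = []
    for experiment in queue[:num_experiments]:
        results.append(entrypoint_utils.run_experiment(experiment))
    return results
```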

@nickfraser nickfraser marked this pull request as ready for review December 2, 2025 12:43
Collaborator Author

@nickfraser nickfraser left a comment

Switch to Mixin.

'Error: weight_quant_rescaling_init must be positive.'
if (int(args.gptq) + int(args.gpfq) + int(args.qronos)) > 1:
    warn("GPTQ, GPFQ, and/or Qronos are enabled together.")
    assert False, "GPTQ, GPFQ, and/or Qronos are enabled together."
Collaborator Author

Let me know if you prefer me to revert this. In my view, this should be a failure rather than just a warning. Could also be a separate PR.

@nickfraser nickfraser requested a review from pablomlago December 2, 2025 13:07
'Error: weight_quant_rescaling_init must be positive.'
if (int(args.gptq) + int(args.gpfq) + int(args.qronos)) > 1:
    warn("GPTQ, GPFQ, and/or Qronos are enabled together.")
    assert False, "GPTQ, GPFQ, and/or Qronos are enabled together."
Collaborator

I'm fine with it, being such a small change. Nit:

Suggested change
assert False, "GPTQ, GPFQ, and/or Qronos are enabled together."
assert (int(args.gptq) + int(args.gpfq) + int(args.qronos)) <= 1, "GPTQ, GPFQ, and/or Qronos cannot be enabled together."

class GridSearchMixin(BenchmarkSearchMixin):

    @classmethod
    def standardize_args(cls, script_args: str) -> Dict[str, List]:
Collaborator

Suggested change
def standardize_args(cls, script_args: str) -> Dict[str, List]:
def standardize_args(cls, script_args: Namespace) -> Dict[str, List]:

pass


class BenchmarkSearchMixin(ABC):
Collaborator

@pablomlago pablomlago Dec 2, 2025

Indicate that the concrete classes need to provide the abstract class variable argument_parser, e.g.

    @property
    @abstractmethod
    def argument_parser(self) -> ArgumentParser:
        pass
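A concrete mixin would then satisfy the abstract property, e.g. (sketch only; the parser contents are placeholders, not the real benchmark arguments):

```python
from abc import ABC, abstractmethod
from argparse import ArgumentParser

class BenchmarkSearchMixin(ABC):

    @property
    @abstractmethod
    def argument_parser(self) -> ArgumentParser:
        pass

class GridSearchMixin(BenchmarkSearchMixin):

    @property
    def argument_parser(self) -> ArgumentParser:
        # Placeholder arguments; the real parser lives in the benchmark utils.
        parser = ArgumentParser()
        parser.add_argument("--config", type=str)
        return parser
```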

class RandomSearchMixin(BenchmarkSearchMixin):

    @classmethod
    def standardize_args(cls, script_args: str) -> Dict[str, List]:
Collaborator

Suggested change
def standardize_args(cls, script_args: str) -> Dict[str, List]:
def standardize_args(cls, script_args: Namespace) -> Dict[str, List]:

return id_str


class RandomSearchMixin(BenchmarkSearchMixin):
Collaborator

@pablomlago pablomlago Dec 2, 2025

There is a subset of the functionality of standardize_args which is shared between GridSearchMixin and RandomSearchMixin, e.g. YAML reading. Is it possible to extract the common functionality into the abstract parent class?

return args_dict

@staticmethod
def parse_config_args(args: List[str]) -> Namespace:
Collaborator

Most of the arguments are shared between GridSearchMixin and RandomSearchMixin; I would consider extracting the common logic.

print(f"\t{key}: {value}")


class GridSearchBenchmarkUtils(BenchmarkUtils, GridSearchMixin):
Collaborator

@pablomlago pablomlago Dec 2, 2025

Can we remove these? I would rather go for having multiple inheritance in the entrypoints, e.g.

class ImagenetPTQBenchmarkUtils(BenchmarkUtils, GridSearchMixin):
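That composition can be sketched as follows, with placeholder class bodies; LLMRandBenchmarkUtils is a hypothetical name for a random-search entrypoint, not a class in this PR:

```python
# Placeholder classes illustrating composition at the entrypoint.
class BenchmarkUtils:
    def run(self):
        return "run experiments"

class GridSearchMixin:
    def gen_search_space(self):
        return "grid search space"

class RandomSearchMixin:
    def gen_search_space(self):
        return "random search space"

# Each entrypoint picks its search strategy directly, so pre-combined
# classes like GridSearchBenchmarkUtils become unnecessary:
class ImagenetPTQBenchmarkUtils(BenchmarkUtils, GridSearchMixin):
    pass

class LLMRandBenchmarkUtils(BenchmarkUtils, RandomSearchMixin):
    pass
```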

q = q[start_index:end_index]
args_dict = entrypoint_utils.standardize_args(script_args)
# Generate a list of experiments
q = entrypoint_utils.gen_search_space(args_dict, script_args)
Collaborator

Maybe it is worth renaming q to something more self-explanatory.
