-
Notifications
You must be signed in to change notification settings - Fork 238
Feat (benchmark): Allow easy specification of different search algorithms within benchmark scripts #1357
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
…n be "plugged into" a benchmark configuration
…SearchBenchmarkUtils`
…ive rounding algorithms are enabled
nickfraser
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switch to Mixin.
| 'Error: weight_quant_rescaling_init must be positive.' | ||
| if (int(args.gptq) + int(args.gpfq) + int(args.qronos)) > 1: | ||
| warn("GPTQ, GPFQ, and/or Qronos are enabled together.") | ||
| assert False, "GPTQ, GPFQ, and/or Qronos are enabled together." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know if you prefer me to revert this. In my view, this should be fail instead of a warning. Could also be a separate PR.
| 'Error: weight_quant_rescaling_init must be positive.' | ||
| if (int(args.gptq) + int(args.gpfq) + int(args.qronos)) > 1: | ||
| warn("GPTQ, GPFQ, and/or Qronos are enabled together.") | ||
| assert False, "GPTQ, GPFQ, and/or Qronos are enabled together." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine being such a small change. Nit:
| assert False, "GPTQ, GPFQ, and/or Qronos are enabled together." | |
| assert (int(args.gptq) + int(args.gpfq) + int(args.qronos)) <= 1, "GPTQ, GPFQ, and/or Qronos cannot be enabled together." |
| class GridSearchMixin(BenchmarkSearchMixin): | ||
|
|
||
| @classmethod | ||
| def standardize_args(cls, script_args: str) -> Dict[str, List]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def standardize_args(cls, script_args: str) -> Dict[str, List]: | |
| def standardize_args(cls, script_args: Namespace) -> Dict[str, List]: |
| pass | ||
|
|
||
|
|
||
| class BenchmarkSearchMixin(ABC): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indicate that the concrete classes need to provide the abstract class variable argument_parser, e.g.
@property
@abstractmethod
def argument_parser(self) -> ArgumentParser:
pass
| class RandomSearchMixin(BenchmarkSearchMixin): | ||
|
|
||
| @classmethod | ||
| def standardize_args(cls, script_args: str) -> Dict[str, List]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def standardize_args(cls, script_args: str) -> Dict[str, List]: | |
| def standardize_args(cls, script_args: Namespace) -> Dict[str, List]: |
| return id_str | ||
|
|
||
|
|
||
| class RandomSearchMixin(BenchmarkSearchMixin): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a subset of the functionality of standardize_args which is shared for GridSearchMixin and RandomSearchMixing, e.g. YAML reading. Is it possible to extract the common functionality to the abstract parent class.
| return args_dict | ||
|
|
||
| @staticmethod | ||
| def parse_config_args(args: List[str]) -> Namespace: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the arguments are shared between GridSearchMixin and RandomSearchMixin, I would consider extracting the common logic.
| print(f"\t{key}: {value}") | ||
|
|
||
|
|
||
| class GridSearchBenchmarkUtils(BenchmarkUtils, GridSearchMixin): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove these? I would rather go for having multiple inheritance in the entrypoints, e.g.
class ImagenetPTQBenchmarkUtils(BenchmarkUtils, GridSearchMixin):
| q = q[start_index:end_index] | ||
| args_dict = entrypoint_utils.standardize_args(script_args) | ||
| # Generate a list of experiments | ||
| q = entrypoint_utils.gen_search_space(args_dict, script_args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it is worth renaming q to something more self-explanatory.
Builds on #1356. Introduces a
BenchmarkSearchMixinwhich determines how we generate experiments with the benchmark utils. The first oneGridSearchMixin, replicates the behaviour of the benchmark utils before this PR was made.The second one is a
RandomSearchMixin, which allow specification of various search algorithms that can be used to specify parameters, for example:Running as follows:
Will give the following output:
Note, a limitation with the current approach is that the worker queue generation (i.e., the random search space) and execution (i.e., the experiments) are two separate steps, meaning that it's difficult to integrate experiment feedback into the search space selection (e.g., for Bayesian optimization or simulated annealing).