
i-colbert commented Nov 6, 2025

Reason for this PR

Initial implementation of GPTAQ: Efficient Finetuning-Free Quantization for Asymmetric Calibration

Reference implementation: https://github.com/Intelligent-Computing-Lab-Panda/GPTAQ/tree/main

Here are some 3-bit asymmetric weight-only quantization results for several Qwen3 models (config and reproduction details below):

              0.6B   1.7B    4B
BF16          18.6   15.2   12.2
GPTQ          25.0   19.1   13.6
GPFQ          23.9   17.5   13.6
GPTAQ         23.1   17.8   13.6
Learned Round 22.4   18.0   14.0
Qronos        21.4   16.9   13.2

Notable details

The reference implementation adds a new $$\alpha$$ parameter here to scale its difference matrix $$P$$, but this parameter is not mentioned in the paper. We follow the reference implementation by defaulting to self.alpha=0.25; see the full discussion here.
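To illustrate the asymmetric-calibration idea behind GPTAQ, the sketch below computes the least-squares weight correction that matches the full-precision output $$W X$$ through the quantized-input path $$X_q$$, and damps it by $$\alpha = 0.25$$ as the reference code does with its difference matrix. This is a schematic numpy illustration, not the brevitas implementation; all variable names are invented for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))
X = rng.standard_normal((16, 256))             # full-precision layer inputs
Xq = X + 0.05 * rng.standard_normal(X.shape)   # inputs after upstream quantization

# Asymmetric calibration targets the full-precision output W @ X even though
# the quantized layer will actually see Xq.  The least-squares correction is
#     D = W (X - Xq) Xq^T (Xq Xq^T)^{-1},
# and (illustratively) we damp it with alpha, as the reference GPTAQ code
# damps its difference matrix.
H = Xq @ Xq.T
D = W @ (X - Xq) @ Xq.T @ np.linalg.inv(H)
alpha = 0.25                                   # default from the reference repo

err_plain = np.linalg.norm(W @ X - W @ Xq)               # no correction
err_full = np.linalg.norm(W @ X - (W + D) @ Xq)          # full LS correction
err_damped = np.linalg.norm(W @ X - (W + alpha * D) @ Xq)
print(err_full < err_damped < err_plain)
```

Because D is the least-squares optimum, any step of size alpha in (0, 1] strictly reduces the output error on the calibration data, so the comparison printed above holds.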

Config and reproduction details

The above table was collected using layerwise Hadamard rotations for 3-bit asymmetric weight-only quantization. The config is given below:

bos_preprocessing: sequence
dataset_eval_split: test
dtype: bfloat16
eval: true
gpxq_act_order: true
gpxq_block_name: model.layers
rotation: layerwise
rotation_mode: had
rotation_orphan_sink: true
weight_bit_width: 3
weight_param_method: mse
weight_quant_granularity: per_channel
weight_quant_type: asym
weight_scale_precision: float_scale
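As a rough illustration of what weight_bit_width: 3 with weight_quant_type: asym and weight_quant_granularity: per_channel means, here is a minimal numpy sketch of per-output-channel asymmetric 3-bit fake quantization. It uses a simple min/max grid rather than the MSE scale search that weight_param_method: mse actually requests, and none of the names come from brevitas.

```python
import numpy as np

def fake_quant_asym_per_channel(w, bits=3):
    """Per-output-channel asymmetric fake quantization (illustrative only).

    Uses a min/max grid; the config above instead searches scales by MSE.
    """
    qmax = 2**bits - 1                       # 3 bits -> integer grid 0..7
    wmin = w.min(axis=1, keepdims=True)      # one (scale, zero-point) per row
    wmax = w.max(axis=1, keepdims=True)
    scale = np.where(wmax > wmin, (wmax - wmin) / qmax, 1.0)
    zero = np.round(-wmin / scale)
    q = np.clip(np.round(w / scale) + zero, 0, qmax)
    return (q - zero) * scale                # dequantized ("fake quant") weights

w = np.random.default_rng(0).standard_normal((4, 32))
wq = fake_quant_asym_per_channel(w)
```

Each output channel gets its own float scale and zero-point (float_scale above), so the per-element reconstruction error is bounded by half of that channel's scale.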

The data is collected via:

brevitas_ptq_llm --config=<path/to/config.yml>

where you can select algorithms by adding --gptq, --gpfq, --qronos, or --gptaq to the CLI args. For Learned Round, the results were collected with --learned-round=linear_round --learned-round-fast-update. Note that the scales were not learned alongside the rounding offset; this ensures all algorithms in the table use the same grid heuristic for a fair comparison.

Changes Made in this PR

  • Creates a new GPTAQ class that can leverage gpfq_mode like GPFQ and Qronos
  • Adds a new apply_gptaq function to the Hugging Face entry point in brevitas_examples

Testing Summary

TODO

Risk Highlight

  • This PR includes code from another work (please detail).
  • This PR contains API-breaking changes.
  • This PR depends on work in another PR (please provide links/details).
  • This PR introduces new dependencies (please detail).
  • There are coverage gaps not covered by tests.
  • Documentation updates required in subsequent PR.

Checklist

  • Code comments added to any hard-to-understand areas, if applicable.
  • Changes generate no new warnings.
  • Updated any relevant tests, if applicable.
  • No conflicts with destination dev branch.
  • I reviewed my own code changes.
  • Initial CI/CD passing.
  • 1+ reviews given, and any review issues addressed and approved.
  • Post-review full CI/CD passing.
