Feat (gptaq): initial implementation of GPTAQ #1411

i-colbert · 2025-11-06T21:13:32Z

Reason for this PR

Initial implementation of GPTAQ: Efficient Finetuning-Free Quantization for Asymmetric Calibration

Reference implementation: https://github.com/Intelligent-Computing-Lab-Panda/GPTAQ/tree/main

Here are some 3-bit asymmetric weight-only quantization results for several Qwen3 models (config and reproduction details below):

Method	0.6B	1.7B	4B
BF16	18.6	15.2	12.2
GPTQ	25.0	19.1	13.6
GPFQ	23.9	17.5	13.6
GPTAQ	23.1	17.8	13.6
Learned Round	22.4	18.0	14.0
Qronos	21.4	16.9	13.2

Notable details

The reference implementation introduces an $$\alpha$$ parameter here that scales the difference matrix $$P$$; this parameter is not mentioned in the paper. We follow the reference implementation by defaulting self.alpha=0.25; see full discussion here.

Config and reproduction details

The results in the table above were collected using layerwise Hadamard rotations for 3-bit asymmetric weight-only quantization. The config is given below:

bos_preprocessing: sequence
dataset_eval_split: test
dtype: bfloat16
eval: true
gpxq_act_order: true
gpxq_block_name: model.layers
rotation: layerwise
rotation_mode: had
rotation_orphan_sink: true
weight_bit_width: 3
weight_param_method: mse
weight_quant_granularity: per_channel
weight_quant_type: asym
weight_scale_precision: float_scale
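To make the weight settings above concrete (weight_bit_width: 3, weight_quant_type: asym, weight_quant_granularity: per_channel), here is a minimal pure-Python sketch of a 3-bit asymmetric per-channel grid. It is not Brevitas code: the function names are hypothetical, and it derives the scale from each channel's min/max range, whereas the config uses weight_param_method: mse, which selects the scale differently.

```python
def quantize_asym(weights, bit_width=3):
    """Quantize one output channel to an unsigned asymmetric integer grid.

    Assumption: the scale comes from the channel's min/max range. The config
    in this PR uses an MSE-based method instead; min/max keeps the sketch short.
    """
    n_levels = 2 ** bit_width  # 8 integer codes (0..7) for 3 bits
    w_min, w_max = min(weights), max(weights)
    # Fall back to a unit scale for a constant channel to avoid dividing by zero.
    scale = (w_max - w_min) / (n_levels - 1) or 1.0
    # The zero-point shifts the grid so that w_min maps to code 0.
    zero_point = round(-w_min / scale)
    codes = [
        min(max(round(w / scale) + zero_point, 0), n_levels - 1)
        for w in weights
    ]
    dequantized = [(q - zero_point) * scale for q in codes]
    return codes, dequantized, scale, zero_point


def quantize_per_channel(weight_matrix, bit_width=3):
    """Apply the grid independently to each row (one row per output channel)."""
    return [quantize_asym(row, bit_width) for row in weight_matrix]


codes, dequantized, scale, zero_point = quantize_asym([-1.0, 0.0, 0.8, 2.5])
print(codes, scale, zero_point)
```

Because each channel gets its own scale and zero-point, a channel with a narrow weight range is not forced onto the coarse grid of a channel with a wide one.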

The data is collected via:

brevitas_ptq_llm --config=<path/to/config.yml>

where you can select an algorithm by adding --gptq, --gpfq, --qronos, or --gptaq to the CLI args. For Learned Round, the results were collected with --learned-round=linear_round --learned-round-fast-update. Notably, the scales were not learned alongside the rounding offset. This ensures that all the algorithms in the table use the same grid heuristic, for a fair comparison.
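As a convenience, the per-algorithm invocations described above can be assembled with a small helper. This is only a sketch: the helper name and the config.yml path are hypothetical, and it uses only the flags quoted in this description. Any other arguments a run needs (for example, the model to quantize) are not listed here and are left out.

```python
import shlex

# Flags quoted in this PR description, keyed by the row names of the results table.
ALGORITHM_FLAGS = {
    "GPTQ": ["--gptq"],
    "GPFQ": ["--gpfq"],
    "GPTAQ": ["--gptaq"],
    "Learned Round": ["--learned-round=linear_round", "--learned-round-fast-update"],
    "Qronos": ["--qronos"],
}


def build_command(config_path, algorithm):
    """Return the argument list for one run (hypothetical helper, not part of Brevitas)."""
    return ["brevitas_ptq_llm", f"--config={config_path}", *ALGORITHM_FLAGS[algorithm]]


for name in ALGORITHM_FLAGS:
    # Print each command instead of running it; pass the list to subprocess.run to execute.
    print(shlex.join(build_command("config.yml", name)))
```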

Changes Made in this PR

Created a new GPTAQ class that can leverage gpfq_mode, like GPFQ and Qronos
Added a new apply_gptaq function to the Hugging Face entry point in brevitas_examples

Testing Summary

TODO

Risk Highlight

This PR includes code from another work (please detail).
This PR contains API-breaking changes.
This PR depends on work in another PR (please provide links/details).
This PR introduces new dependencies (please detail).
There are coverage gaps not covered by tests.
Documentation updates required in subsequent PR.

Checklist

Code comments added to any hard-to-understand areas, if applicable.
Changes generate no new warnings.
Updated any relevant tests, if applicable.
No conflicts with destination dev branch.
I reviewed my own code changes.
Initial CI/CD passing.
1+ reviews given, and any review issues addressed and approved.
Post-review full CI/CD passing.

i-colbert added 2 commits November 6, 2025 21:10

Feat (gptaq): adding GPTAQ and connection to LLM entrypoint

3e5024f

Fix (gptaq): pre-commit fixes

7f272d0
