i-colbert (Collaborator) commented Dec 24, 2025

Reason for this PR

When block_rotation_dim is specified, the current implementation applies block rotations to both online and fused rotation matrices. In some scenarios it is beneficial to apply block rotations only to the online matrices and keep the fused rotations as full-vector Hadamard matrices. This PR adds an option to disable block rotations for fused rotations while keeping them for online rotations.

Changes Made in this PR

  • Added disable_block_rotation_for_fused to control whether block rotations are applied to fused rotations in _compute_rotations
  • Added CLI flag --disable-block-rotation-for-fused to the argument parser
  • Modified the rotation equalization logic to conditionally apply block rotations
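
For reviewers, the intended behavior can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in this PR: the helpers hadamard, block_hadamard, select_rotation, and build_parser are hypothetical, the --block-rotation-dim spelling is assumed, and plain Python lists stand in for the real tensors. Only the names disable_block_rotation_for_fused, block_rotation_dim, and --disable-block-rotation-for-fused come from the description above.

```python
import argparse
import math
from typing import List, Optional

Matrix = List[List[float]]


def hadamard(n: int) -> Matrix:
    """Orthonormal Hadamard matrix via Sylvester's construction (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1.0]]
    while len(h) < n:
        # Sylvester step: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    scale = 1.0 / math.sqrt(n)
    return [[x * scale for x in row] for row in h]


def block_hadamard(dim: int, block_dim: int) -> Matrix:
    """Block-diagonal matrix with a block_dim x block_dim Hadamard in each block."""
    if dim % block_dim:
        raise ValueError("dim must be a multiple of block_dim")
    block = hadamard(block_dim)
    out = [[0.0] * dim for _ in range(dim)]
    for start in range(0, dim, block_dim):
        for i, row in enumerate(block):
            out[start + i][start:start + block_dim] = row
    return out


def select_rotation(
    dim: int,
    block_rotation_dim: Optional[int],
    is_fused: bool,
    disable_block_rotation_for_fused: bool = False,
) -> Matrix:
    """Hypothetical helper: pick a block or full-vector rotation for one matrix."""
    use_block = block_rotation_dim is not None and not (
        is_fused and disable_block_rotation_for_fused
    )
    if use_block:
        return block_hadamard(dim, block_rotation_dim)
    return hadamard(dim)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Assumed spelling; the description only names block_rotation_dim.
    parser.add_argument("--block-rotation-dim", type=int, default=None)
    parser.add_argument("--disable-block-rotation-for-fused", action="store_true")
    return parser
```

With the flag set, a fused rotation falls back to the dense full-vector Hadamard matrix, while an online rotation with the same block_rotation_dim stays block-diagonal. Without the flag, both use the block form, which matches the behavior described before this change.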

Testing Summary

Risk Highlight

  • This PR includes code from another work (please detail).
  • This PR contains API-breaking changes.
  • This PR depends on work in another PR (please provide links/details).
  • This PR introduces new dependencies (please detail).
  • There are coverage gaps not covered by tests.
  • Documentation updates required in subsequent PR.

Checklist

  • Code comments added to any hard-to-understand areas, if applicable.
  • Changes generate no new warnings.
  • Updated any relevant tests, if applicable.
  • No conflicts with destination dev branch.
  • I reviewed my own code changes.
  • Initial CI/CD passing.
  • 1+ reviews given, and any review issues addressed and approved.
  • Post-review full CI/CD passing.
