ml-dsa: use Barrett reduction instead of integer division to prevent side-channels #1144
This was originally disclosed as a security advisory but @tarcieri said a public PR was fine.
Summary
A timing side-channel was discovered in the Decompose algorithm, which is used during ML-DSA signing to generate hints for the signature.
Details
The analysis was performed using a constant-time analyzer that examines compiled assembly code for instructions with data-dependent timing behavior. The analyzer flags:
The `decompose` function used a hardware division instruction to compute `r1.0 / TwoGamma2::U32`. This function is called during signing through `high_bits()` and `low_bits()`, which process values derived from secret key components:
- `(&w - &cs2).low_bits()`, where `cs2` is derived from secret key component `s2`
- `Hint::new()` calls `high_bits()` on values derived from secret key component `t0`

Original Code:
Impact
The dividend (`r1.0`) is derived from secret key material. An attacker with precise timing measurements could extract information about the signing key by observing timing variations in the division operation.
Mitigation
Replacing division with constant-time Barrett reduction mitigates this risk. Since `TwoGamma2` is a compile-time constant, we precompute the multiplicative inverse.

See our blog post on how we avoided side-channels in our Go implementation of ML-DSA for more information.
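For readers unfamiliar with the technique, here is a minimal sketch of division by a fixed constant using a precomputed multiplier and a branch-free correction step. It is not the code in this PR: `ConstDivisor`, `div_rem`, and the two constants are hypothetical names for illustration, and it covers only the quotient/remainder step, not the rest of Decompose. It assumes inputs below 2^32 and divisors below 2^31 (the ML-DSA values of 2·γ2 are 190464 and 523776), and that integer multiplication runs in data-independent time on the target CPU, which should be checked in the compiled output.

```rust
/// Hypothetical helper (not from the PR): division by a fixed divisor `d`
/// using a precomputed multiplier instead of a hardware divide.
pub struct ConstDivisor {
    d: u32,
    /// floor(2^32 / d), computed once in a const context.
    m: u64,
}

impl ConstDivisor {
    /// `d` must be nonzero and below 2^31 for the correction step to be valid.
    pub const fn new(d: u32) -> Self {
        Self { d, m: (1u64 << 32) / d as u64 }
    }

    /// Returns (x / d, x % d) without a division instruction or a
    /// data-dependent branch.
    pub fn div_rem(&self, x: u32) -> (u32, u32) {
        // Estimate the quotient; it is either exact or one too small.
        let q_est = ((x as u64).wrapping_mul(self.m) >> 32) as u32;
        // Provisional remainder, in the range [0, 2d).
        let r = x.wrapping_sub(q_est.wrapping_mul(self.d));
        // ge = 1 if r >= d, else 0. If r < d the subtraction wraps and
        // sets the top bit, because d < 2^31.
        let ge = (r.wrapping_sub(self.d) >> 31) ^ 1;
        // All-ones mask when a correction is needed, zero otherwise.
        let mask = 0u32.wrapping_sub(ge);
        (q_est.wrapping_add(ge), r.wrapping_sub(self.d & mask))
    }
}

/// 2 * gamma2 for ML-DSA-44, where gamma2 = (q - 1) / 88.
pub const TWO_GAMMA2_44: ConstDivisor = ConstDivisor::new(190_464);
/// 2 * gamma2 for ML-DSA-65 and ML-DSA-87, where gamma2 = (q - 1) / 32.
pub const TWO_GAMMA2_65_87: ConstDivisor = ConstDivisor::new(523_776);
```

Because the divisor is known at compile time, the multiplier is computed once during constant evaluation, so no division on secret data happens at runtime. An optimizing compiler can still rewrite branch-free code, so a sketch like this is only a starting point and the generated assembly needs the same analysis described above.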
(The loop decomposition change isn't strictly speaking necessary, but the loop bounds do create false positives for my testing utility that looks at the compiled output. Let me know if you'd prefer that removed.)