
@ChALkeR ChALkeR commented Nov 29, 2025

Summary

See also #1851
This is a subset of that PR

Quick benchmarks

Ops/sec are in thousands: 3408 means 3_408_000 divisions per second, so each "op" below is a batch of 1000 divisions. Measured on an M3 (Release build).
N is a 32-byte value, s is a 16-byte value.
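For readers who want to reproduce numbers of this shape, here is a minimal sketch of such a micro-benchmark. It is not the harness used in this PR: the `bench` helper, the iteration count, and the concrete values of `N` and `s` are assumptions (the PR states only the operand sizes).

```javascript
// Hypothetical operands: only the sizes (32 and 16 bytes) come from the PR.
const N = (1n << 256n) - 189n; // assumed 32-byte value
const s = (1n << 128n) - 159n; // assumed 16-byte value

// Runs fn(a, b) `iterations` times and reports thousands of ops per second.
// Operands are passed as arguments and results are folded into `sink`
// so the engine is less likely to optimize the division away.
function bench(label, fn, a, b, iterations = 100000) {
  let sink = 0n;
  const start = Date.now();
  for (let i = 0; i < iterations; i++) {
    sink ^= fn(a, b);
  }
  const elapsedMs = Math.max(Date.now() - start, 1); // avoid dividing by 0
  const kOpsPerSec = iterations / elapsedMs; // (ops / ms) equals (k ops / s)
  return { label, kOpsPerSec, sink };
}

const cases = [
  ["s / N", (a, b) => a / b, s, N],
  ["N / s", (a, b) => a / b, N, s],
  ["N % s", (a, b) => a % b, N, s],
];

for (const [label, fn, a, b] of cases) {
  const r = bench(label, fn, a, b);
  console.log(`${r.label} x ${r.kOpsPerSec.toFixed(0)}k ops/sec`);
}
```

`Date.now()` has millisecond resolution, so the iteration count has to be large enough for the loop to run well past 1 ms for the figure to mean anything.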

Before:

0 / 100 x 3710 ops/sec @ 269μs/op (0ns..1000μs)
0 % 100 x 3897 ops/sec @ 256μs/op (0ns..1999μs)
0 / N x 2874 ops/sec @ 347μs/op (0ns..1000μs)
0 % N x 2906 ops/sec @ 344μs/op (0ns..1000μs)

21 / 100 x 3408 ops/sec @ 293μs/op (0ns..1000μs)
21 % 100 x 3621 ops/sec @ 276μs/op (0ns..1000μs)
100 / 21 x 3437 ops/sec @ 290μs/op (0ns..1000μs)
100 % 21 x 3047 ops/sec @ 328μs/op (0ns..1000μs)

s / (s + 1) x 3559 ops/sec @ 280μs/op (0ns..1000μs)
s % (s + 1) x 3607 ops/sec @ 277μs/op (0ns..1000μs)
(s + 1) / s x 3272 ops/sec @ 305μs/op (0ns..4ms)
(s + 1) % s x 3217 ops/sec @ 310μs/op (0ns..1000μs)

s / N x 2920 ops/sec @ 342μs/op (0ns..1000μs)
s % N x 2959 ops/sec @ 337μs/op (0ns..1000μs)
N / s x 1031 ops/sec @ 969μs/op (0ns..2ms)
N % s x 1036 ops/sec @ 965μs/op (0ns..2ms)

1 / 64 x 3440 ops/sec @ 290μs/op (0ns..1000μs)
1 / N x 2969 ops/sec @ 336μs/op (0ns..1000μs)
64 / 1 x 3308 ops/sec @ 302μs/op (0ns..1000μs)
N / 1 x 659 ops/sec @ 1516μs/op (999μs..2ms)

2 / 64 x 3700 ops/sec @ 270μs/op (0ns..1000μs)
2 / N x 2936 ops/sec @ 340μs/op (0ns..1000μs)
64 / 2 x 3132 ops/sec @ 319μs/op (0ns..1000μs)
N / 2 x 684 ops/sec @ 1462μs/op (999μs..2ms)

This PR (#1852):

0 / 100 x 12090 ops/sec @ 82μs/op (0ns..1000μs)
0 % 100 x 12148 ops/sec @ 82μs/op (0ns..5ms)
0 / N x 13027 ops/sec @ 76μs/op (0ns..1000μs)
0 % N x 13961 ops/sec @ 71μs/op (0ns..1000μs)

21 / 100 x 12117 ops/sec @ 82μs/op (0ns..1000μs)
21 % 100 x 12796 ops/sec @ 78μs/op (0ns..1000μs)
100 / 21 x 11611 ops/sec @ 86μs/op (0ns..4ms)
100 % 21 x 11698 ops/sec @ 85μs/op (0ns..2ms)

s / (s + 1) x 20154 ops/sec @ 49μs/op (0ns..1000μs)
s % (s + 1) x 20397 ops/sec @ 49μs/op (0ns..1000μs)
(s + 1) / s x 19873 ops/sec @ 50μs/op (0ns..1000μs)
(s + 1) % s x 21124 ops/sec @ 47μs/op (0ns..1000μs)

s / N x 13008 ops/sec @ 76μs/op (0ns..1000μs)
s % N x 13987 ops/sec @ 71μs/op (0ns..1000μs)
N / s x 1603 ops/sec @ 623μs/op (0ns..1000μs)
N % s x 1601 ops/sec @ 624μs/op (0ns..1000μs)

1 / 64 x 12476 ops/sec @ 80μs/op (0ns..1000μs)
1 / N x 12883 ops/sec @ 77μs/op (0ns..1000μs)
64 / 1 x 11541 ops/sec @ 86μs/op (0ns..1000μs)
N / 1 x 908 ops/sec @ 1101μs/op (999μs..2ms)

2 / 64 x 12249 ops/sec @ 81μs/op (0ns..1000μs)
2 / N x 12934 ops/sec @ 77μs/op (0ns..1000μs)
64 / 2 x 11827 ops/sec @ 84μs/op (0ns..1000μs)
N / 2 x 907 ops/sec @ 1102μs/op (999μs..2ms)
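
To make the before/after comparison easier to read, the sketch below computes speedup ratios for a few rows. The throughput figures are copied from the two tables above; the `speedup` helper and the choice of rows are just for illustration.

```javascript
// Throughput in thousands of ops/sec, copied from the Before / This PR tables.
const rows = [
  { op: "0 / 100", before: 3710, after: 12090 },
  { op: "s / (s + 1)", before: 3559, after: 20154 },
  { op: "N / s", before: 1031, after: 1603 },
  { op: "N / 1", before: 659, after: 908 },
];

// Ratio of new throughput to old throughput.
function speedup(before, after) {
  return after / before;
}

for (const { op, before, after } of rows) {
  console.log(`${op}: ${speedup(before, after).toFixed(2)}x`);
}
```

For these rows the gain is largest when both operands are small or of similar size (roughly 3.3x and 5.7x) and smallest when a 32-byte dividend is divided by a smaller divisor (roughly 1.4x to 1.6x).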

Test Plan

@meta-cla meta-cla bot added the CLA Signed label Nov 29, 2025

ChALkeR commented Nov 29, 2025

cc @paulmillr, here is what I get (in this bench, ops/sec are actual values rather than thousands, on an M3)

Before:

ed25519.sign x 334 ops/sec @ 2ms/op (1999μs..9ms)
ed25519.getPublicKey x 677 ops/sec @ 1476μs/op (999μs..7ms)
ed25519.verify x 67 ops/sec @ 14ms/op (13ms..20ms)
x25519.getPublicKey x 119 ops/sec @ 8ms/op (8ms..27ms)
secp256k1.sign x 350 ops/sec @ 2ms/op (1999μs..4ms)
secp256k1.getPublicKey x 436 ops/sec @ 2ms/op (1999μs..3ms)
secp256k1.verify x 31839 ops/sec @ 31μs/op (0ns..57ms)
schnorr.sign x 44 ops/sec @ 22ms/op (20ms..32ms)
schnorr.getPublicKey x 445 ops/sec @ 2ms/op (1999μs..3ms)
schnorr.verify x 57 ops/sec @ 17ms/op (16ms..20ms)

After:

ed25519.sign x 409 ops/sec @ 2ms/op (1999μs..3ms)
ed25519.getPublicKey x 847 ops/sec @ 1180μs/op (999μs..3ms)
ed25519.verify x 78 ops/sec @ 12ms/op (11ms..14ms)
x25519.getPublicKey x 149 ops/sec @ 6ms/op (5ms..7ms)
secp256k1.sign x 481 ops/sec @ 2ms/op (1999μs..3ms)
secp256k1.getPublicKey x 557 ops/sec @ 1795μs/op (999μs..18ms)
secp256k1.verify x 45175 ops/sec @ 22μs/op (0ns..1000μs)
schnorr.sign x 58 ops/sec @ 17ms/op (16ms..31ms)
schnorr.getPublicKey x 589 ops/sec @ 1697μs/op (999μs..4ms)
schnorr.verify x 73 ops/sec @ 13ms/op (12ms..14ms)

This change looks safe, in the sense that it shouldn't affect timings (I'm less sure about the #1851-style approach, or even about using a comparison before the divide)
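
For context, one reading of "comparison before divide" is a fast path that skips the division when the dividend is smaller than the divisor. The sketch below shows that idea at the JS level only. It is not the code in this PR or in #1851, and `divRemWithCompare` is a hypothetical name.

```javascript
// Illustration only: a compare-first fast path for non-negative operands.
// When 0 <= a < b, the quotient is 0 and the remainder is a, so no
// division is needed. The branch depends on the operand values, which is
// why a shortcut like this could make timing depend on the inputs.
function divRemWithCompare(a, b) {
  if (b === 0n) {
    throw new RangeError("Division by zero");
  }
  if (a >= 0n && b > 0n && a < b) {
    return { q: 0n, r: a }; // fast path: result known without dividing
  }
  return { q: a / b, r: a % b }; // general path: truncating division
}

console.log(divRemWithCompare(21n, 100n)); // fast path
console.log(divRemWithCompare(100n, 21n)); // general path
```

Code that handles secret values, such as the signing operations benchmarked above, is where such an input-dependent branch would matter most.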

@paulmillr

Amazing. This kind of change can make a huge impact on the whole ecosystem.

@ChALkeR ChALkeR force-pushed the chalker/perf/1/bigint-div branch from ca0ceaa to 32bc589 Compare November 29, 2025 02:39
@ChALkeR ChALkeR force-pushed the chalker/perf/1/bigint-div branch from 32bc589 to 0df19fd Compare November 29, 2025 02:40
Co-authored-by: Andrew Toth <andrewstoth@gmail.com>
