-
Notifications
You must be signed in to change notification settings - Fork 75
Reapply "BREAKING: Make random_mod platform-independent" (#1017) #1018
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Going to see if I can trace down the hang on CI since I am having a hard time reproducing it locally. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1018 +/- ##
==========================================
+ Coverage 79.86% 79.88% +0.01%
==========================================
Files 163 163
Lines 17737 17740 +3
==========================================
+ Hits 14166 14171 +5
+ Misses 3571 3569 -2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Yeah, this is really weird, I couldn't reproduce it outside CI $ uname -a
Linux instance-20251126-235717 6.12.48+deb13-cloud-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.12.48-1 (2025-09-20) x86_64 GNU/Linux
$ cross test --target aarch64-unknown-linux-gnu --release
info: syncing channel updates for 'stable-x86_64-unknown-linux-gnu'
stable-x86_64-unknown-linux-gnu unchanged - rustc 1.91.1 (ed61e7d7e 2025-11-07)
info: checking for self-update
Finished `release` profile [optimized] target(s) in 0.06s
Running unittests src/lib.rs (/target/aarch64-unknown-linux-gnu/release/deps/crypto_bigint-3e7ac94d7ecc1145)
running 421 tests
test const_choice::tests::from_u64_lsb ... ok
test const_choice::tests::from_wide_word_le ... ok
test const_choice::tests::from_word_gt ... ok
test const_choice::tests::from_word_lt ... ok
[...]
test src/uint/div.rs - uint::div::Uint<LIMBS>::wrapping_rem_vartime (line 186) ... ok
test result: ok. 19 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.03s
$ |
|
Yeah, indeed. And it passes on |
|
The test case that’s failing looks to be Far as I can tell, That first |
|
I get the same behavior locally under aarch64-linux in an Alpine VM via UTM on my Mac. Only under |
|
Ah, This is without |
|
So to recap: |
Testing a theory that this may get linux-aarch64 to decide to do the comparison.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
|
Hoookay, pretty sure we have a rustc/LLVM bug on our hands. Following is recorded for posterity and/or for the LLVM bug report. I took a different approach: starting from master and working my way down, this does not hang: diff --git a/src/uint/rand.rs b/src/uint/rand.rs
index d36dc26..5997b3a 100644
--- a/src/uint/rand.rs
+++ b/src/uint/rand.rs
@@ -137,14 +137,10 @@ where
let hi_word_modulus = modulus.as_ref().as_ref()[n_limbs - 1].0;
let mask = !0 >> hi_word_modulus.leading_zeros();
- let mut hi_word = next_word()? & mask;
loop {
- while hi_word > hi_word_modulus {
- hi_word = next_word()? & mask;
- }
// Set high limb
- n.as_mut()[n_limbs - 1] = Limb::from_le_bytes(hi_word.to_le_bytes());
+ n.as_mut()[n_limbs - 1] = Limb::from_le_bytes((next_word()? & mask).to_le_bytes());
// Set low limbs
for i in 0..n_limbs - 1 {
// Need to deserialize from little-endian to make sure that two 32-bit limbs
@@ -157,7 +153,6 @@ where
if n.ct_lt(modulus).into() {
break;
}
- hi_word = next_word()? & mask;
}
Ok(())
}
This (diff from the prior diff) does: diff --git a/src/uint/rand.rs b/src/uint/rand.rs
index 5997b3a..1503267 100644
--- a/src/uint/rand.rs
+++ b/src/uint/rand.rs
@@ -139,8 +139,6 @@ where
let mask = !0 >> hi_word_modulus.leading_zeros();
loop {
- // Set high limb
- n.as_mut()[n_limbs - 1] = Limb::from_le_bytes((next_word()? & mask).to_le_bytes());
// Set low limbs
for i in 0..n_limbs - 1 {
// Need to deserialize from little-endian to make sure that two 32-bit limbs
@@ -148,6 +146,8 @@ where
// byte stream.
n.as_mut()[i] = Limb::from_le_bytes(next_word()?.to_le_bytes());
}
+ // Set high limb
+ n.as_mut()[n_limbs - 1] = Limb::from_le_bytes((next_word()? & mask).to_le_bytes());
// If the high limb is equal to the modulus' high limb, it's still possible
// that the full uint is too big so we check and repeat if it is.
if n.ct_lt(modulus).into() {If I dump the assembly for a monomorphized diff --git a/src/uint.rs b/src/uint.rs
index a5276cd..1478107 100644
--- a/src/uint.rs
+++ b/src/uint.rs
@@ -2,8 +2,9 @@
#![allow(clippy::needless_range_loop, clippy::many_single_char_names)]
-use core::fmt;
+use core::{convert::Infallible, fmt};
+use chacha20::ChaCha20Rng;
#[cfg(feature = "serde")]
use serdect::serde::{Deserialize, Deserializer, Serialize, Serializer};
use subtle::{Choice, ConditionallySelectable, ConstantTimeEq};
@@ -61,6 +62,12 @@ pub(crate) mod boxed;
#[cfg(feature = "rand_core")]
mod rand;
+/// my func
+#[inline(never)]
+pub fn my_random_mod(rng: &mut ChaCha20Rng, n: &mut Uint<5>, modulus: &NonZero<Uint<5>>, n_bits: u32) -> Result<(), Infallible> {
+ return rand::random_mod_core(rng, n, modulus, n_bits)
+}
+
/// Stack-allocated big unsigned integer.
///
/// Generic over the given number of `LIMBS`There is a pretty substantial diff: --- good_5.s 2025-11-26 21:40:49
+++ bad_5.s 2025-11-26 21:40:45
@@ -31,7 +31,7 @@
cinc w8, w8, ne
sub x20, x8, #1
cmp x20, #4
- b.hi .LBB2_9
+ b.hi .LBB2_11
ldr x8, [x2, x20, lsl #3]
mov x9, #-1
ldr x27, [x2, #32]
@@ -41,21 +41,27 @@
ldp x25, x26, [x2, #16]
lsr x24, x9, x8
mov x21, x0
- cbz x20, .LBB2_7
-.LBB2_2:
- mov x0, x21
- bl <R as rand_core::TryRngCore>::try_next_u64
+ cbz x20, .LBB2_8
mov x28, xzr
- and x8, x0, x24
- str x8, [x19, x20, lsl #3]
+ b .LBB2_4
.LBB2_3:
+ and w0, w8, #0x1
+ bl subtle::black_box
+ tst w0, #0xff
+ mov x28, xzr
+ b.ne .LBB2_10
+.LBB2_4:
mov x0, x21
bl <R as rand_core::TryRngCore>::try_next_u64
add x8, x28, #1
str x0, [x19, x28, lsl #3]
- cmp x20, x8
+ cmp x8, x20
mov x28, x8
- b.ne .LBB2_3
+ b.ne .LBB2_4
+ mov x0, x21
+ bl <R as rand_core::TryRngCore>::try_next_u64
+ and x8, x0, x24
+ str x8, [x19, x20, lsl #3]
ldp x8, x9, [x19]
cmp x8, x22
cset w8, lo
@@ -69,50 +75,24 @@
ngc x9, xzr
cmp x8, x25
sbcs xzr, x9, xzr
+ ldr x9, [x19, #32]
cset w8, lt
- subs x8, x10, x8
- ldr x10, [x19, #32]
- ngc x9, xzr
- cmp x8, x26
- sbcs xzr, x9, xzr
- cset w8, lt
- subs x9, x10, x27
- ngc x10, xzr
- cmp x9, x8
- sbc x8, x10, xzr
+ subs x10, x10, x8
+ ngc x11, xzr
+ subs x9, x9, x27
+ ngc x8, xzr
+ cmp x10, x26
+ sbcs xzr, x11, xzr
+ b.ge .LBB2_3
+ cmp x9, #1
+ cinc x8, x8, hs
+ b .LBB2_3
+.LBB2_7:
and w0, w8, #0x1
bl subtle::black_box
tst w0, #0xff
- b.eq .LBB2_2
-.LBB2_5:
- .cfi_def_cfa wsp, 96
- ldp x20, x19, [sp, #80]
- ldp x22, x21, [sp, #64]
- ldp x24, x23, [sp, #48]
- ldp x26, x25, [sp, #32]
- ldp x28, x27, [sp, #16]
- ldp x29, x30, [sp], #96
- .cfi_def_cfa_offset 0
- .cfi_restore w19
- .cfi_restore w20
- .cfi_restore w21
- .cfi_restore w22
- .cfi_restore w23
- .cfi_restore w24
- .cfi_restore w25
- .cfi_restore w26
- .cfi_restore w27
- .cfi_restore w28
- .cfi_restore w30
- .cfi_restore w29
- ret
-.LBB2_6:
- .cfi_restore_state
- and w0, w8, #0x1
- bl subtle::black_box
- tst w0, #0xff
- b.ne .LBB2_5
-.LBB2_7:
+ b.ne .LBB2_10
+.LBB2_8:
mov x0, x21
bl <R as rand_core::TryRngCore>::try_next_u64
and x8, x0, x24
@@ -138,11 +118,34 @@
ngc x8, xzr
cmp x10, x26
sbcs xzr, x11, xzr
- b.ge .LBB2_6
+ b.ge .LBB2_7
cmp x9, #1
cinc x8, x8, hs
- b .LBB2_6
-.LBB2_9:
+ b .LBB2_7
+.LBB2_10:
+ .cfi_def_cfa wsp, 96
+ ldp x20, x19, [sp, #80]
+ ldp x22, x21, [sp, #64]
+ ldp x24, x23, [sp, #48]
+ ldp x26, x25, [sp, #32]
+ ldp x28, x27, [sp, #16]
+ ldp x29, x30, [sp], #96
+ .cfi_def_cfa_offset 0
+ .cfi_restore w19
+ .cfi_restore w20
+ .cfi_restore w21
+ .cfi_restore w22
+ .cfi_restore w23
+ .cfi_restore w24
+ .cfi_restore w25
+ .cfi_restore w26
+ .cfi_restore w27
+ .cfi_restore w28
+ .cfi_restore w30
+ .cfi_restore w29
+ ret
+.LBB2_11:
+ .cfi_restore_state
adrp x2, .Lanon.2cba860f59aa28f6ecc8a20933206b3b.3
add x2, x2, :lo12:.Lanon.2cba860f59aa28f6ecc8a20933206b3b.3
mov x0, x20Compared against a much smaller diff if monomorphized to --- good_4.s 2025-11-26 21:48:27
+++ not_bad_4.s 2025-11-26 21:48:34
@@ -40,20 +40,19 @@
ldp x25, x26, [x2, #16]
lsr x22, x9, x8
cbz x20, .LBB2_5
-.LBB2_2:
- mov x0, x21
- bl <R as rand_core::TryRngCore>::try_next_u64
mov x27, xzr
- and x8, x0, x22
- str x8, [x19, x20, lsl #3]
.LBB2_3:
mov x0, x21
bl <R as rand_core::TryRngCore>::try_next_u64
add x8, x27, #1
str x0, [x19, x27, lsl #3]
- cmp x20, x8
+ cmp x8, x20
mov x27, x8
b.ne .LBB2_3
+ mov x0, x21
+ bl <R as rand_core::TryRngCore>::try_next_u64
+ and x8, x0, x22
+ str x8, [x19, x20, lsl #3]
ldp x8, x9, [x19]
cmp x8, x23
cset w8, lo
@@ -75,7 +74,8 @@
and w0, w8, #0x1
bl subtle::black_box
tst w0, #0xff
- b.eq .LBB2_2
+ mov x27, xzr
+ b.eq .LBB2_3
b .LBB2_6
.LBB2_5:
mov x0, x21The full .section .text.crypto_bigint::uint::my_random_mod,"ax",@progbits
.globl crypto_bigint::uint::my_random_mod
.p2align 2
.type crypto_bigint::uint::my_random_mod,@function
crypto_bigint::uint::my_random_mod:
.cfi_startproc
stp x29, x30, [sp, #-96]!
.cfi_def_cfa_offset 96
stp x28, x27, [sp, #16]
stp x26, x25, [sp, #32]
stp x24, x23, [sp, #48]
stp x22, x21, [sp, #64]
stp x20, x19, [sp, #80]
mov x29, sp
.cfi_def_cfa w29, 96
.cfi_offset w19, -8
.cfi_offset w20, -16
.cfi_offset w21, -24
.cfi_offset w22, -32
.cfi_offset w23, -40
.cfi_offset w24, -48
.cfi_offset w25, -56
.cfi_offset w26, -64
.cfi_offset w27, -72
.cfi_offset w28, -80
.cfi_offset w30, -88
.cfi_offset w29, -96
.cfi_remember_state
lsr w8, w3, #6
tst w3, #0x3f
cinc w8, w8, ne
sub x20, x8, #1
cmp x20, #4
b.hi .LBB2_11
ldr x8, [x2, x20, lsl #3]
mov x9, #-1
ldr x27, [x2, #32]
ldp x22, x23, [x2]
mov x19, x1
clz x8, x8
ldp x25, x26, [x2, #16]
lsr x24, x9, x8
mov x21, x0
cbz x20, .LBB2_8
mov x28, xzr
b .LBB2_4
.LBB2_3:
and w0, w8, #0x1
bl subtle::black_box
tst w0, #0xff
mov x28, xzr
b.ne .LBB2_10
.LBB2_4:
mov x0, x21
bl <R as rand_core::TryRngCore>::try_next_u64
add x8, x28, #1
str x0, [x19, x28, lsl #3]
cmp x8, x20
mov x28, x8
b.ne .LBB2_4
mov x0, x21
bl <R as rand_core::TryRngCore>::try_next_u64
and x8, x0, x24
str x8, [x19, x20, lsl #3]
ldp x8, x9, [x19]
cmp x8, x22
cset w8, lo
subs x8, x9, x8
ngc x9, xzr
cmp x8, x23
ldp x8, x10, [x19, #16]
sbcs xzr, x9, xzr
cset w9, lt
subs x8, x8, x9
ngc x9, xzr
cmp x8, x25
sbcs xzr, x9, xzr
ldr x9, [x19, #32]
cset w8, lt
subs x10, x10, x8
ngc x11, xzr
subs x9, x9, x27
ngc x8, xzr
cmp x10, x26
sbcs xzr, x11, xzr
b.ge .LBB2_3
cmp x9, #1
cinc x8, x8, hs
b .LBB2_3
.LBB2_7:
and w0, w8, #0x1
bl subtle::black_box
tst w0, #0xff
b.ne .LBB2_10
.LBB2_8:
mov x0, x21
bl <R as rand_core::TryRngCore>::try_next_u64
and x8, x0, x24
str x8, [x19, x20, lsl #3]
ldp x8, x9, [x19]
cmp x8, x22
cset w8, lo
subs x8, x9, x8
ngc x9, xzr
cmp x8, x23
ldp x8, x10, [x19, #16]
sbcs xzr, x9, xzr
cset w9, lt
subs x8, x8, x9
ngc x9, xzr
cmp x8, x25
sbcs xzr, x9, xzr
ldr x9, [x19, #32]
cset w8, lt
subs x10, x10, x8
ngc x11, xzr
subs x9, x9, x27
ngc x8, xzr
cmp x10, x26
sbcs xzr, x11, xzr
b.ge .LBB2_7
cmp x9, #1
cinc x8, x8, hs
b .LBB2_7
.LBB2_10:
.cfi_def_cfa wsp, 96
ldp x20, x19, [sp, #80]
ldp x22, x21, [sp, #64]
ldp x24, x23, [sp, #48]
ldp x26, x25, [sp, #32]
ldp x28, x27, [sp, #16]
ldp x29, x30, [sp], #96
.cfi_def_cfa_offset 0
.cfi_restore w19
.cfi_restore w20
.cfi_restore w21
.cfi_restore w22
.cfi_restore w23
.cfi_restore w24
.cfi_restore w25
.cfi_restore w26
.cfi_restore w27
.cfi_restore w28
.cfi_restore w30
.cfi_restore w29
ret
.LBB2_11:
.cfi_restore_state
adrp x2, .Lanon.2cba860f59aa28f6ecc8a20933206b3b.3
add x2, x2, :lo12:.Lanon.2cba860f59aa28f6ecc8a20933206b3b.3
mov x0, x20
mov w1, #5
bl core::panicking::panic_bounds_checkAnd here is .section .text.crypto_bigint::uint::my_random_mod,"ax",@progbits
.globl crypto_bigint::uint::my_random_mod
.p2align 2
.type crypto_bigint::uint::my_random_mod,@function
crypto_bigint::uint::my_random_mod:
.cfi_startproc
stp x29, x30, [sp, #-96]!
.cfi_def_cfa_offset 96
str x27, [sp, #16]
stp x26, x25, [sp, #32]
stp x24, x23, [sp, #48]
stp x22, x21, [sp, #64]
stp x20, x19, [sp, #80]
mov x29, sp
.cfi_def_cfa w29, 96
.cfi_offset w19, -8
.cfi_offset w20, -16
.cfi_offset w21, -24
.cfi_offset w22, -32
.cfi_offset w23, -40
.cfi_offset w24, -48
.cfi_offset w25, -56
.cfi_offset w26, -64
.cfi_offset w27, -80
.cfi_offset w30, -88
.cfi_offset w29, -96
.cfi_remember_state
lsr w8, w3, #6
tst w3, #0x3f
cinc w8, w8, ne
sub x20, x8, #1
cmp x20, #3
b.hi .LBB2_7
ldr x8, [x2, x20, lsl #3]
mov x9, #-1
mov x19, x1
ldp x23, x24, [x2]
mov x21, x0
clz x8, x8
ldp x25, x26, [x2, #16]
lsr x22, x9, x8
cbz x20, .LBB2_5
mov x27, xzr
.LBB2_3:
mov x0, x21
bl <R as rand_core::TryRngCore>::try_next_u64
add x8, x27, #1
str x0, [x19, x27, lsl #3]
cmp x8, x20
mov x27, x8
b.ne .LBB2_3
mov x0, x21
bl <R as rand_core::TryRngCore>::try_next_u64
and x8, x0, x22
str x8, [x19, x20, lsl #3]
ldp x8, x9, [x19]
cmp x8, x23
cset w8, lo
subs x8, x9, x8
ngc x9, xzr
cmp x8, x24
ldp x8, x10, [x19, #16]
sbcs xzr, x9, xzr
cset w9, lt
subs x8, x8, x9
ngc x9, xzr
cmp x8, x25
sbcs xzr, x9, xzr
cset w8, lt
subs x9, x10, x26
ngc x10, xzr
cmp x9, x8
sbc x8, x10, xzr
and w0, w8, #0x1
bl subtle::black_box
tst w0, #0xff
mov x27, xzr
b.eq .LBB2_3
b .LBB2_6
.LBB2_5:
mov x0, x21
bl <R as rand_core::TryRngCore>::try_next_u64
and x8, x0, x22
str x8, [x19, x20, lsl #3]
ldp x8, x9, [x19]
cmp x8, x23
cset w8, lo
subs x8, x9, x8
ngc x9, xzr
cmp x8, x24
ldp x8, x10, [x19, #16]
sbcs xzr, x9, xzr
cset w9, lt
subs x8, x8, x9
ngc x9, xzr
cmp x8, x25
sbcs xzr, x9, xzr
cset w8, lt
subs x9, x10, x26
ngc x10, xzr
cmp x9, x8
sbc x8, x10, xzr
and w0, w8, #0x1
bl subtle::black_box
tst w0, #0xff
b.eq .LBB2_5
.LBB2_6:
.cfi_def_cfa wsp, 96
ldp x20, x19, [sp, #80]
ldr x27, [sp, #16]
ldp x22, x21, [sp, #64]
ldp x24, x23, [sp, #48]
ldp x26, x25, [sp, #32]
ldp x29, x30, [sp], #96
.cfi_def_cfa_offset 0
.cfi_restore w19
.cfi_restore w20
.cfi_restore w21
.cfi_restore w22
.cfi_restore w23
.cfi_restore w24
.cfi_restore w25
.cfi_restore w26
.cfi_restore w27
.cfi_restore w30
.cfi_restore w29
ret
.LBB2_7:
.cfi_restore_state
adrp x2, .Lanon.c0613eab67b71300c185314e527e149f.3
add x2, x2, :lo12:.Lanon.c0613eab67b71300c185314e527e149f.3
mov x0, x20
mov w1, #4
bl core::panicking::panic_bounds_check |
This reverts 5784b13, reapplying 62b90b8. See also: RustCrypto#1018 (comment) It turns out that the previous approach ran afoul of a likely LLVM (or possibly rustc) bug. This time, I tried removing `random_mod_core` entirely and instead writing `random_mod` and `try_random_mod` as rejection loops over `random_bits` and `try_random_bits`. Somehow, this does not (seem to) run afoul of the bug.
|
@mrdomino Impressive sleuthing here, kudos. |
## Breaking changes - `Encoding::Repr` is no longer required to implement `Copy`, so consumers of `Encoding::Repr` will need to explicitly call `clone`. - The numbers produced by both `random_bits` and `random_mod` will generally be different, and calling these functions will leave the RNG in a different state, than before. ## Fixes - Adds a mitigation for rust-lang/rust#149522, modifying `Word::borrowing_sub` to use `overflowing_sub` instead of `WideWord` casts. ## Summary This essentially applies #285 to `random_bits` as well as `random_mod`. Both functions behave the same way now, with the only difference being that `random_mod` adds rejection sampling; otherwise both will produce the same numbers over the same entropy stream. Questions of platform dependence are now easy; we do not define these algorithms in terms of sequences of machine words but of bytes. Randomly sampled `Uint`s are now always constructed little-endian over the entropy stream. This does not preclude future machine-specific optimizations, but given how perilous the landscape has been (e.g. #1018), I’ve elected to err in the direction of parsimony rather than performance for this change. This leverages the existing work making `Uint` implement `Encoding`. It additionally needs `Encoding` on `BoxedUint` to make `RandomMod` and `RandomBits` work there; this is implemented, but requires dropping the `Copy` constraint on `Encoding::Repr`. Fixes #1009 [0]: rust-lang/rust#149522
This reverts commit 5784b13.