-
Notifications
You must be signed in to change notification settings - Fork 415
Eliminate zero-initialization overhead with internal MaybeUninit support #531
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
This change updates the internal implementation to use `MaybeUninit<u8>` for output buffers, eliminating unnecessary memory initialization overhead when filling output buffers, while keeping all public APIs unchanged.
|
Do you have a specific use case in mind for this optimization? In my imagination, the most performance-sensitive callers of the XOF are repeatedly filling a buffer full of random bytes, and in that case the cost of zeroing it is only paid once. Also this is a spicy question, but couldn't a performance sensitive caller use std::mem::MaybeUninit;
fn main() {
let mut buf: MaybeUninit<[u8; 1024]> = MaybeUninit::uninit();
let array: &mut [u8] = unsafe { buf.assume_init_mut() };
for byte in array.iter_mut() {
*byte = 99;
}
for byte in array {
assert_eq!(*byte, 99);
}
}(In other words, could we provide functionally the same capability by adding a line to the docs that says "we promise this function won't read the buffer; |
yes, you are right. Benches on my machine confirms no perf gain However, there is no sound way to fill uninit buffer with current API .assume_init_mut() cannot be used to initialize a MaybeUninit. but since such optimization doesn't make sense I'm okay to actually close the PR. another note: when features let mut output = Vec::with_capacity(OUTPUT_SIZE);
let mut buf = BorrowedBuf::from(output.spare_capacity_mut());
let mut hasher = blake3::Hasher::new();
hasher.update(INPUT_DATA);
hasher.finalize_xof().read_buf(buf.unfilled()).unwrap();
unsafe { output.set_len(OUTPUT_SIZE) };and it wont require any changes of blake3 |
I'm curious where that quote comes from, and I can't find a source. Here's the closest thing I can find to an authoritative opinion on this, from 2023:
Emphasis mine. So I shouldn't be telling anyone to do this (and the |
Summary
This PR adds internal support for
MaybeUninit<u8>output buffers throughout the BLAKE3 implementation, avoiding unnecessary memory initialization and improving performance for output operations. All public APIs remain completely unchanged.Motivation
When filling output buffers with hash data, Rust currently requires the output slice
&mut [u8]to be fully initialized before writing. This means callers must zero-initialize buffers before passing them to BLAKE3, even though BLAKE3 will immediately overwrite those valuesThis was previously proposed in #154 but was never merged. This PR provides a comprehensive internal implementation while maintaining complete API compatibility.
Changes
Public API
✅ No changes to public API - all existing methods work exactly as before with identical signatures
Changes
OutputReader::fill_uninit()method accepting&mut [MaybeUninit<u8>]Platform::xof_many_uninit()for uninitialized buffersffi_avx512::xof_many_uninit()for AVX-512 pathfill_one_block()now works withMaybeUninit<u8>internally