-
Notifications
You must be signed in to change notification settings - Fork 137
Implementing ByteSlice and ByteSliceMut for Vec<u8> #1045
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
9b595b1 to
b588db3
Compare
src/lib.rs
Outdated
| unsafe impl ByteSlice for Vec<u8> {} | ||
|
|
||
| #[cfg(any(feature = "alloc", test))] | ||
| // SAFETY: This uses safe a method from stdlib. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The safety comment needs to prove that the safety conditions of SplitByteSlice are totally satisfied. Unfortunately, I believe safety conditions are unsatisfiable in this case. It asks:
In particular, given
B: SplitByteSliceandb: B, ifb.deref()returns a byte slice with addressaddrand lengthlen, then ifsplit <= len,b.split_at(split)will return(first, second)such that:
first's address isaddrand its length issplitsecond's address isaddr + splitand its length islen - split
Your implementation does not satisfy the second condition, because the two Vecs will almost certainly not be adjacent in memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To add a bit of extra detail here: Your implementations of ByteSlice and ByteSliceMut are fine so long as you remove the implementation of SplitByteSlice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we give up the restriction of only allowing pair of Self to be returned from SplitByteSlice::split_at, we can use a type similar to VecSlice, from my example, and can split a Vec for free. And then we can also implement SplitByteSlice for VecSlice and then we can split a Vec however many times we want at zero cost. Is there a reason we would not want to remove this restriction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC this is still unsound - how do you prevent the CanDrop variant from being dropped before the DontDrop variant? If that happens, then the DontDrop variant points to freed memory. I could imagine solving this with lifetimes, but you might need to make SplitByteSlice even more complex to be able to carry those lifetimes (or maybe not? I'm genuinely not sure).
In the abstract I think this is an interesting idea, but it will take some work to rejigger the API to support it without adding a lot of complexity that other users (who aren't using Vec) will have to know about.
Is there a reason that your use case doesn't allow either just operating on slices (e.g. borrowing the Vec) or making a clone of the Vec into a RefCell<[u8]> (which we do support)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a new thing I tried that is probably better than the previous implementation. The VecSlice type now looks like the following
pub struct VecSlice {
slice: VirtualVec,
ghost: Rc<GhostVec>,
}
struct VirtualVec {
ptr: *mut u8,
len: usize,
cap: usize,
}
struct GhostVec(Vec<u8>);With every split, new VirtualVecs are created but the GhostVec is shared by all the splits and will only drop when the last VecSlice is dropped.
Although, now when I think about it, this is just a more sophisticated way of achieving what can easily be achieved by calling split_at on Vec not as Vec but as slice. This doesn't seem to provide much value I guess.
Regarding your question, I was just curious about if it was possible to split Vec at zero cost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joshlf removed all the unnecessary code. It now only contains implementations of ByteSlice and ByteSliceMut
645e425 to
a5fc43b
Compare
1fffa6a to
a4d7277
Compare
| } | ||
| } | ||
|
|
||
| // TODO(#429): Add a "SAFETY" comment and remove this `allow`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add safety comments here and below? Our policy is not to introduce any new undocumented unsafe code while we burn down the existing TODOs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On further reflection, I don't believe the safety conditions of ByteSlice are satisfiable:
Safety
Implementations of
ByteSlicemust promise that their implementations ofDerefandDerefMutare "stable". In particular, givenB: ByteSliceandb: B,bmust always dereference to a byte slice with the same address and length. This is true for bothb.deref()andb.deref_mut(). IfB: CopyorB: Clone, then the same is also true of copies or clones ofb. For example,b.deref_mut()must return a byte slice with the same address and length asb.clone().deref().
For Vec, a growable buffer, this cannot be the case: the length changes as elements are added and removed, and the address changes as elements are added.
| } | ||
| } | ||
|
|
||
| // TODO(#429): Add a "SAFETY" comment and remove this `allow`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On further reflection, I don't believe the safety conditions of ByteSlice are satisfiable:
Safety
Implementations of
ByteSlicemust promise that their implementations ofDerefandDerefMutare "stable". In particular, givenB: ByteSliceandb: B,bmust always dereference to a byte slice with the same address and length. This is true for bothb.deref()andb.deref_mut(). IfB: CopyorB: Clone, then the same is also true of copies or clones ofb. For example,b.deref_mut()must return a byte slice with the same address and length asb.clone().deref().
For Vec, a growable buffer, this cannot be the case: the length changes as elements are added and removed, and the address changes as elements are added.
|
|
||
| // TODO(#429): Add a "SAFETY" comment and remove this `allow`. | ||
| #[allow(clippy::undocumented_unsafe_blocks)] | ||
| unsafe impl<'a> ByteSliceMut for &'a mut [u8] {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joshlf, in light of the above comment, I'm not certain that this satisfies our documented safety condition either. It's trivial to define a function that changes both the length and address of a &mut [u8] between calls to deref:
fn change_add_and_length(mut slice: &mut [u8]) {
slice = &mut slice[1..];
}I think this indicates that our safety documentation on ByteSlice is missing some critical nuance.
Adresses #992