Closed
Changes from 5 commits
136 changes: 136 additions & 0 deletions text/0000-zero-page-optimization.md
@@ -0,0 +1,136 @@
- Feature Name: zero_page_optimization
- Start Date: 2018-04-09
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary
[summary]: #summary

Extend the null pointer optimization to every value inside the zero page, since
a valid reference can never hold such a value.

# Motivation
[motivation]: #motivation

Modern operating systems normally [trap null pointer accesses](https://en.wikipedia.org/wiki/Zero_page).
This means valid pointers never take values inside the zero page, and we
can exploit this to gain ~12 bits of storage for secondary variants.

[Inside Rust std](https://github.com/rust-lang/rust/blob/ca26ef321c44358404ef788d315c4557eb015fb2/src/liballoc/heap.rs#L238),
we use a "dangling" pointer for ZST allocations; this involves a somewhat

I still don't understand how this relates at all to the motivation for this RFC.

verbose logic.

Outside std, `futures-util` also
[uses 1](https://github.com/rust-lang-nursery/futures-rs/blob/856fde847d4062f5d2af5d85d6640028297a10f1/futures-util/src/lock.rs#L157-L169)
as a special pointer value.

However, this is neither documented in the nomicon nor always true. For
instance, microcontrollers without an MMU don't implement such guards at all,
and `0` and `1` are valid addresses where the entrypoint lies. See
[Cortex-M4](https://developer.arm.com/docs/ddi0439/latest/programmers-model/system-address-map)'s
design for one such example.
Contributor

It is true that address 0 is valid in Cortex-M and that you can’t validly create a Rust reference &T to it, but it’s not like arbitrary data can end up there by chance. That address is reserved for some early boot detail that most applications don’t deal with directly. In the cortex-m-rt crate there is not even a corresponding Rust item; it is entirely dealt with in the linker script.

So I don’t think there is a problem here in practice.

Contributor

@SimonSapin Are you suggesting that access to 0 should be strictly unsafe?

Contributor

No, I’m only saying that the ARM Cortex case is not really relevant to the "Rust makes bad assumptions" argument. But then what do you mean by "access to 0"?

Contributor

i.e. do I have to use *mut _ or *const _ if I want to access part of the "null range"?


Such crates should not assume anything about Rust ABI internals, but in the
case of this `BiLock`, we rely on compressing the state into a `usize` so that
we can perform atomic operations without a mutex. In practice, the entrypoint
at `0` is unlikely to hold Rust code; it usually holds platform-specific
bootstrap assembly. Other factors such as alignment are also involved, so in
practice the special values cannot collide with a real address. Still, this RFC
proposes a more principled and typed way to write such code.
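
To make the "compress it into a usize" idea concrete, here is a minimal sketch of a lock word that reserves `0` and `1` as special states. It is not the `futures-util` implementation; `LockWord`, `UNLOCKED` and `LOCKED` are names invented for this illustration, and the third state (a pointer to a waiting task stored in the same word) is only described in a comment.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical encoding for this sketch: two small integers are reserved as
// states; any other value would be the address of a waiter (not modelled here).
const UNLOCKED: usize = 0;
const LOCKED: usize = 1;

struct LockWord {
    state: AtomicUsize,
}

impl LockWord {
    fn new() -> Self {
        LockWord { state: AtomicUsize::new(UNLOCKED) }
    }

    /// Try to move from UNLOCKED to LOCKED in one atomic step.
    fn try_lock(&self) -> bool {
        self.state
            .compare_exchange(UNLOCKED, LOCKED, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
    }

    /// Release the lock and return the previous word so that a caller could
    /// inspect whether it was LOCKED or a waiter address.
    fn unlock(&self) -> usize {
        self.state.swap(UNLOCKED, Ordering::Release)
    }
}
```

The scheme is only sound if no real pointer can ever equal `0` or `1`, which is exactly the assumption this RFC wants the type system to express.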

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

This change should be transparent for most users; the following description is
targeted at people dealing with FFI or unsafe.

The recently stabilized `NonNull` type will have stricter requirements:
the pointer must not lie in the zero page. `NonNull::dangling` will be
deprecated in favor of this optimization.

If it's deprecated, what's the replacement?

Contributor Author

In favour of the zero page optimization. That is, using an enumeration instead.

@hanna-kruppe, Apr 14, 2018

I don't understand. You propose that the current way to get a NonNull that is non-null and aligned is deprecated. What non-deprecated thing can current users of that method do instead to get a NonNull that is similarly valid with the new invariant?

(Leaving aside the question of whether it's OK to change the meaning of NonZero like this after stabilization.)

Contributor Author

The issue is that NonNull::dangling() seems to be just a hack where Option<NonNull<T>> should be used. NonNull::dangling() advocates less idiomatic coding, and Option<NonNull<T>> should be a perfect fit as a replacement.

That is a big claim that requires a fair bit of support given that the API was accepted and stabilized.

Furthermore, using Option is not equivalent to using a dangling pointer since it "uses up" the null value: e.g. Vec<T> contains a NonNull<T> and this makes Option<Vec<T>> the same size as Vec<T>, if it used Option<NonNull<T>> instead, Option<Vec<T>> would be bigger.

Contributor

@ishitatsuyuki Sorry, I don’t see how NonNull::dangling is related to Option<NonNull<_>> at all. dangling is for creating an arbitrary pointer that is correctly aligned without being null. It is used for zero-size allocations, for example in Vec: https://github.com/rust-lang/rust/blob/fb730d75d4c1c05c90419841758300b6fbf01250/src/liballoc/raw_vec.rs#L93

Contributor Author

@rkruppe Can I suggest that a code search only showed usage for "optional allocation", for either ZST or an absent node in the linked data structure? Also, the original intent of this addition seems to be "we need this to interact with allocator": rust-lang/rust#45527

@SimonSapin Using NonNull::dangling is a convention inside the alloc-related functions, but it's not expressed through types. Using an enum makes it less error-prone, catching the cases where we may pass a dangling pointer to the underlying allocator.

@ExpHP, Apr 14, 2018

@ishitatsuyuki If I understand correctly, you are saying that if the following occurred:

  • Vec<T> instead stored Option<NonNull<T>>
  • NonNull<T> was changed to forbid pointers in the null page

Then Option<Vec<T>> could still receive optimization? If this is the case, it might help to demonstrate this explicitly.

That said, I think part of the concern here is that there are places where Vec<T> benefits specifically from the fact that dangling() is aligned. e.g., a slice can be constructed directly from the pointer without having to branch on None. ISTM that would be impossible when using Option<NonNull<T>> as it must remain possible to take a reference to the option.

Edit: Or wait... maybe it is possible. The pointer for Some(vec![]) would be null, and the representation of None::<Vec<T>> would begin with 1 where the Option<NonNull<T>> is stored. Hm...

Edit 2: but then what about Vec<Option<T>>? We end up with an Option<NonNull<Option<T>>> whose None representation is 1, which is not aligned when interpreted as a pointer. Or something like that. My brain hurts.


During the migration, we should mitigate the impact with a crater run. If changing
the behavior directly is unacceptable, then we'll have to create a new type instead.

`&T`, `&mut T` and `NonNull<T>` will share the same range semantics:
they will never take a value inside the zero page. We will optimize the layout
of an enumeration in a way similar to before, except that we will allow
discriminants up to the zero page size minus one (typically 4095, for a
4096-byte zero page).
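
For reference, the existing null pointer optimization that this builds on can be observed with `size_of`. The helper name `niche_sizes` is made up for this sketch; the equalities it reports are the layout guarantees Rust already documents for `Option<&T>` and `Option<NonNull<T>>`.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Returns `(size of the pointer, size of the pointer wrapped in one Option)`
/// for `&u8` and `NonNull<u8>`.
fn niche_sizes() -> [(usize, usize); 2] {
    [
        (size_of::<&u8>(), size_of::<Option<&u8>>()),
        (size_of::<NonNull<u8>>(), size_of::<Option<NonNull<u8>>>()),
    ]
}
```

Each pair holds two equal numbers because `None` is stored as the null address. A second level of nesting, such as `Option<Option<&u8>>`, typically needs extra space today because null is the only spare value; that is the gap the extra zero page values are meant to fill.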

The compiler will also attempt to compress discriminants: for example, an
`Option<Option<&T>>` will be flattened internally, so its layout will be similar
to:

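```rust
// Conceptual layout of `Option<Option<&T>>` after flattening (illustrative only).
enum Flattened<'a, T> {
    NoneInner,   // `Some(None)`: discriminant 0
    NoneOuter,   // `None`: discriminant 1
    Some(&'a T), // `Some(Some(r))`: the remaining values
}
```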

Note that here, we assign discriminants from inner to outer. This makes the
representation match when a reference is taken.
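
The inner-to-outer numbering can be mimicked by hand. The sketch below encodes an `Option<Option<NonNull<T>>>` into a single word; `encode`, `decode` and the 4096-byte `ZERO_PAGE_SIZE` are assumptions made for illustration, not something the compiler does today.

```rust
use std::ptr::NonNull;

/// Assumed zero page size for this sketch (the RFC makes it target-specific).
const ZERO_PAGE_SIZE: usize = 4096;

/// Pack the nested option into one word, numbering `None`s from inner to outer.
fn encode<T>(value: Option<Option<NonNull<T>>>) -> usize {
    match value {
        Some(None) => 0, // inner `None` keeps the classic null representation
        None => 1,       // outer `None` takes the next free zero page value
        Some(Some(p)) => {
            let addr = p.as_ptr() as usize;
            assert!(addr >= ZERO_PAGE_SIZE, "pointer lies inside the assumed zero page");
            addr
        }
    }
}

/// Inverse of `encode`. Words 2..ZERO_PAGE_SIZE are unused in this sketch.
fn decode<T>(word: usize) -> Option<Option<NonNull<T>>> {
    match word {
        0 => Some(None),
        1 => None,
        addr => {
            assert!(addr >= ZERO_PAGE_SIZE, "unused zero page value");
            Some(NonNull::new(addr as *mut T))
        }
    }
}
```

Because the inner `None` is still `0`, a reference to the inner `Option<NonNull<T>>` sees the same bit pattern it would see without the outer layer.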

The exact behavior of this optimization should be documented upon implementation,
so that unsafe code can rely on it.

To take advantage of the zero page optimization, `transmute` to and from `usize`.
Compilation will fail if the optimization is not permitted on
the target.

A crate attribute `zero_page_size` will be exposed for configuring the exact
size of the zero page. This is mainly targeted at microcontroller runtimes.

A `zero_page_size` `#[cfg]` attribute will also be exposed, so that code can
provide a fallback instead of failing in cases like the above.
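
A fallback keyed on the proposed cfg might look like the sketch below. Note that `zero_page_size` is only the name proposed in this RFC: no released compiler sets it, so today the second function is always the one that gets compiled (recent compilers may also warn about the unknown cfg name). The function name `lock_word_strategy` is hypothetical.

```rust
// Compiled only if the (proposed, currently nonexistent) cfg is set.
#[cfg(zero_page_size)]
fn lock_word_strategy() -> &'static str {
    "zero page tagged pointer"
}

// Portable path for targets where the optimization is unavailable.
#[cfg(not(zero_page_size))]
fn lock_word_strategy() -> &'static str {
    "portable fallback"
}
```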

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

We will add a target-specific default to determine the availability and size
of the zero page. The zero page range starts from 0, and must be at least one
byte so that old code relying on null pointer optimization will not break.

For the defined range, the compiler must ensure that no pointer whose value
lies inside the range can be created safely. On microcontrollers, a crude solution
would be to place a nop sled at the entrypoint.
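
As a library-level approximation of that invariant, a checked constructor can reject any address inside the range. `OutsideZeroPage` and `ASSUMED_ZERO_PAGE_SIZE` are hypothetical names for this sketch; the RFC itself proposes enforcing the rule in the compiler and in `NonNull`, not in a wrapper type.

```rust
use std::ptr::NonNull;

/// Assumed size of the zero page; the RFC would take this from the target.
const ASSUMED_ZERO_PAGE_SIZE: usize = 4096;

/// A pointer that is known not to lie inside the assumed zero page.
struct OutsideZeroPage<T>(NonNull<T>);

impl<T> OutsideZeroPage<T> {
    /// Returns `None` for null and for every other address in the zero page.
    fn new(ptr: *mut T) -> Option<Self> {
        if (ptr as usize) < ASSUMED_ZERO_PAGE_SIZE {
            None
        } else {
            NonNull::new(ptr).map(OutsideZeroPage)
        }
    }

    fn as_ptr(&self) -> *mut T {
        self.0.as_ptr()
    }
}
```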

This optimization only applies to pointer-like values (which can be dereferenced),
and `std::num::NonZero` keeps its current behavior.

The pointer internals will also be adapted to use this scheme: `Unique<T>` should
be refactored to use an enum internally.

# Drawbacks
[drawbacks]: #drawbacks

- This can create divergence between platforms, although whether that is preferable
to undefined behavior is debatable.
- Compressing discriminants is not straightforward.

# Rationale and alternatives
[alternatives]: #alternatives

## On the "null range"

- If we allow the zero page range to be set to "none", `Option<&T>`'s layout
becomes Rust-specific and can no longer be used in FFI. FFI should still be
possible on microcontrollers, so such a breaking change isn't acceptable.
- We could also allow a range of very large values to serve as the "invalid page".
However, this may be incompatible with our current internals, where `0` is considered `null`.

# Prior art
[prior-art]: #prior-art

Not applicable: Null pointer optimization is Rust specific, and this enhancement
is Rust specific too.

# Unresolved questions
[unresolved]: #unresolved-questions

- Can we suggest a better alternative than `transmute`? `transmute` is too
error-prone, even though we're trying to make the code more "safe".
- We could also store data in the lower bits of a pointer, utilizing the alignment
requirement. Also, amd64 pointers technically use only 48 bits, so we could
exploit that space as well. These optimizations are less portable and should be
filed in another RFC.
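
For context, the low-bit trick mentioned in the last item can be sketched as follows. `pack` and `unpack` are illustrative helpers, not part of this proposal; they rely only on the fact that an aligned pointer to `T` has `align_of::<T>() - 1` as a mask of always-zero low bits.

```rust
use std::mem::align_of;

/// Store a small tag in the low bits that alignment guarantees to be zero.
fn pack<T>(ptr: *const T, tag: usize) -> usize {
    let mask = align_of::<T>() - 1;
    let addr = ptr as usize;
    assert_eq!(addr & mask, 0, "pointer must be aligned");
    assert!(tag <= mask, "tag does not fit in the alignment bits");
    addr | tag
}

/// Split a packed word back into the original pointer and the tag.
fn unpack<T>(word: usize) -> (*const T, usize) {
    let mask = align_of::<T>() - 1;
    ((word & !mask) as *const T, word & mask)
}
```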