Add a note about uninhabited-struct layout optimization #346
@@ -217,3 +217,47 @@ Cross-referencing to other discussions:
* https://github.com/rust-lang/rfcs/issues/1397
* https://github.com/rust-lang/rust/issues/17027
* https://github.com/rust-lang/unsafe-code-guidelines/issues/176

## Uninhabited `struct`s should all be ZSTs
Just from reading this initially it's not immediately clear what ZSTs are. Maybe make this from
to

It makes conceptual sense that if something is uninhabited, it shouldn't take up any space.
In safe code that works great, but we tried it and ran into problems, so it's not likely to happen.

The biggest problem is related to field projection during initialization. Take this code:

```rust
pub fn make_pair<T0, T1>(a0: impl Fn() -> T0, a1: impl Fn() -> T1) -> Box<(T0, T1)> {
    let mut mu = Box::<(T0, T1)>::new_uninit();
    unsafe {
        let p0 = &raw mut (*mu.as_mut_ptr()).0;
        p0.write(a0());

        let p1 = &raw mut (*mu.as_mut_ptr()).1;
        p1.write(a1());

        mu.assume_init()
    }
}
```

Is that *sound*? It sure looks reasonable -- after all, it initialized both the fields -- but
it depends on exactly what the layout rules are.

(Aside: Note that a production-ready version of that function should also handle unwinding cleanup
of the first value if constructing the second panicked, but for simplicity of presentation we're
ignoring that part here because leaking is still *sound*.)
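
The unwinding cleanup mentioned in the aside could be sketched roughly as follows. This is only an illustration, not part of the proposed text: `DropOnUnwind` and `make_pair_guarded` are hypothetical names, and the sketch assumes a toolchain where `Box::new_uninit` and `&raw mut` are available, as in the example above. It takes the raw pointer once and derives both field pointers from it, so the guard's pointer is not invalidated by a later reborrow of the box.

```rust
use std::mem;
use std::ptr;

/// Hypothetical guard: drops the pointee in place if it is itself dropped,
/// which in this sketch only happens while unwinding.
struct DropOnUnwind<T>(*mut T);

impl<T> Drop for DropOnUnwind<T> {
    fn drop(&mut self) {
        // SAFETY: the guard is only created after the pointee was initialized,
        // and it is dropped before the allocation it points into is freed.
        unsafe { ptr::drop_in_place(self.0) }
    }
}

pub fn make_pair_guarded<T0, T1>(a0: impl Fn() -> T0, a1: impl Fn() -> T1) -> Box<(T0, T1)> {
    let mut mu = Box::<(T0, T1)>::new_uninit();
    unsafe {
        // Take the raw pointer once and project both fields from it.
        let p = mu.as_mut_ptr();

        let p0 = &raw mut (*p).0;
        p0.write(a0());
        // From here on, a panic must clean up the first field.
        let guard = DropOnUnwind(p0);

        let p1 = &raw mut (*p).1;
        p1.write(a1());

        // Both fields are initialized; the box now owns them.
        mem::forget(guard);
        mu.assume_init()
    }
}
```

If `a1` panics, the guard is dropped first (running the destructor of the first field) and the still-uninitialized box is dropped afterwards, which only frees the allocation. The layout question discussed in this section is unchanged by the guard.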

For something simple like `make_pair::<u8, i32>`, it's clearly fine. But with `make_pair::<u32, !>`
it's *only* sound if we *don't* let `(u32, !)` become a ZST. We need the allocation for the box
to be large enough to write that `u32` without being an obviously-UB out-of-bounds write.
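
One way to observe the relevant sizes is sketched below. It is an illustration under assumptions: `Never` is a hypothetical empty enum standing in for `!` (which cannot be named as a type argument on stable Rust), and `pair_sizes` is a made-up helper. Tuple layout is otherwise unspecified, so the first number is what a given compiler happens to choose, not a documented guarantee.

```rust
use std::mem::{size_of, MaybeUninit};

/// Hypothetical stand-in for `!` on stable Rust: an enum with no variants.
enum Never {}

/// Returns (size of `(u32, Never)`, size of `MaybeUninit<(u32, Never)>`).
fn pair_sizes() -> (usize, usize) {
    (
        size_of::<(u32, Never)>(),
        size_of::<MaybeUninit<(u32, Never)>>(),
    )
}
```

`MaybeUninit<T>` is documented to have the same size as `T`, so the two numbers have to agree. For the `u32` write in `make_pair` to stay in bounds, the first number also has to be at least `size_of::<u32>()`, which is exactly what turning the pair into a ZST would break.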

Thus if we wanted to always have uninhabited product types be ZSTs, we'd need to give up on certain
other rules, perhaps the one that `T` and `MaybeUninit<T>` always have the same size. So far, the
simpler, less-error-prone experience for writing unsafe code has won out over the minimal space
savings possible from shrinking the types. After all, although such code is not necessarily fully
unreachable (something like `make_pair(|| a, || loop { … })` still needs to allocate the space even
though it never reaches the `assume_init` part), it's unlikely to occur frequently.

There *is* still interest in doing optimizations like this on *sum* types, however. There's more
scottmcm marked this conversation as resolved.
Outdated
to potentially be gained there since one variant of a `union` or `enum` being uninhabited doesn't
Member
I am very surprised to see
Member
Author
Yeah, I wrote that without thinking about it that hard, and looking again you're right, I don't see this happening on
keep the whole *value* from being uninhabited the way an uninhabited field does in a `struct`.
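
For the sum-type case, a small sketch like the following can be used to see what a particular compiler does with an uninhabited variant. `Never` and `result_sizes` are hypothetical names introduced for illustration, and the sizes of default-representation enums are not guaranteed, so this is only a way to look at current behavior.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for `!` on stable Rust: an enum with no variants.
enum Never {}

/// Returns (size of `Result<u32, Never>`, size of a bare `u32`).
/// The `Err` variant can never be constructed, so the whole value stays
/// inhabited and a compiler is free to drop the variant and its tag.
fn result_sizes() -> (usize, usize) {
    (size_of::<Result<u32, Never>>(), size_of::<u32>())
}
```

Unlike the pair case, shrinking here does not conflict with field projection: no pointer to an `Err` payload can ever be written through without already having a value of type `Never`.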
I think this might be a typo?
Suggested change
I think the name “ZST” does not capture the underlying idea very well. The real thing we want is a -∞ sized type, ZST is just one specific way of implementing it. Although I don’t have a better title yet.
Also, do you think it’s a good idea to also add the discussion of the `Inhabitted` trait here? In case someone reading this comes up with that idea again.
That's not as clear as you'd think. Rust has unsafe, and types need a layout, so "a ZST with always-false validity invariant" actually has some important benefits. With a -∞ sized/aligned type you can't even start executing a function that has a `!` variable in it because its whole stack frame would be uninhabited; yet it's easy to write such a function: just `panic!()`.
Size is a natural number, so a size of -∞ doesn't even make sense. (One could define notions of size where that does make sense, but that's not the discussion we are having here. The notion of size here is the notion currently used in Rust. Given the proposed resolution for this question, it's also not useful to consider these other notions of size.)
Therefore, ZST is exactly the right term here. There's no way a type can be smaller than that.
Currently size is (at the very least) the amount of bytes required to uniquely represent an arbitrary member of the set of a type, in the Information Theory sense.
The minimum value possible there is zero: if person A needs to describe an element x of type T to person B, and in order to do so, no bytes need to be sent to B, that's because B already knows all it needs to know about x.
And that can only happen when T is either a singleton set or the empty set.
I'm guessing that the -oo notion comes from the logarithm in the Information Theory definition. If we take the logarithm of the size of the set described by the type T, we get:
- 0 when |T| = 1
- -oo when |T| = 0

But another restriction we impose over sizes is that they be non-negative. This (I believe) is because we need sizes to describe a layout budget.
Thus, on one hand the logarithm tells us the number of bits is at least -oo, but on the other hand we know it to also be at least 0. So it ends up being 0.
A ZST :)
Those two concerns get put together by the use of the ceiling operator; the logarithm of an integer is not necessarily an integer, even when it's not -∞, so we actually do ⌈log₂|T|⌉ where we choose to define ⌈-∞⌉=0.
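
The clamped formula can be written out as a small sketch. `min_bits` is a hypothetical helper that counts bits rather than bytes and ignores alignment and padding; it only illustrates how the 0-inhabitant and 1-inhabitant cases both land on zero.

```rust
/// Minimum number of bits needed to tell `inhabitants` distinct values apart:
/// ceil(log2(n)), with the n = 0 case (where log2 is -infinity) clamped to 0.
fn min_bits(inhabitants: u64) -> u32 {
    if inhabitants <= 1 {
        // Empty and singleton types both need no bits at all.
        0
    } else {
        // ceil(log2(n)) for n >= 2 is the bit length of (n - 1).
        u64::BITS - (inhabitants - 1).leading_zeros()
    }
}
```

For example, `min_bits(0)` and `min_bits(1)` are both 0, while `min_bits(3)` is 2 because the logarithm is rounded up.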