Skip to content
Merged
Changes from 1 commit
Commits
Show all changes
35 commits
Select commit Hold shift + click to select a range
51bed0f
UnsafeAliased: port text from hackmd
RalfJung Aug 1, 2023
d599c1e
typos and nits
RalfJung Aug 1, 2023
c2bbec7
polyfill; Send/Sync question
RalfJung Aug 1, 2023
60ca0fa
remove some wrong cells
RalfJung Aug 2, 2023
6910494
detail example
RalfJung Aug 4, 2023
a1a05fb
Unpin hack migration plan
RalfJung Aug 4, 2023
a10d70c
remove stable feature gate
RalfJung Aug 5, 2023
9535a37
add alternative: drop aliasing rules for mutable references
RalfJung Aug 7, 2023
06b52df
link to real-world intrusive linked list example
RalfJung Aug 9, 2023
0369352
add get_mut_pinned method
RalfJung Aug 10, 2023
3e97a7e
fix typo
RalfJung Aug 15, 2023
a8f2516
add !Unpin impl to UnsafeAliased
RalfJung Aug 15, 2023
a55adde
add some links to prior discussions
RalfJung Sep 13, 2023
5f03b25
rename UnsafeAliased → UnsafePinned
RalfJung Nov 4, 2023
ce937b8
explain the naming choice
RalfJung Nov 4, 2023
60c255f
update UnsafeUnpin trait notes
RalfJung Nov 4, 2023
9881c94
explain why this is so much more awkward than UnsafeCell
RalfJung Nov 11, 2023
36b694d
be more clear about the soundness issues around '&mut UnsafePinned'
RalfJung Nov 25, 2023
7158ce2
also explain a non-alternative
RalfJung Nov 25, 2023
5ff0df8
fix a typo: mutation -> aliasing
RalfJung Nov 26, 2023
086412d
we should block niches
RalfJung Nov 26, 2023
7190a78
drawback: losing too much noalias
RalfJung Nov 28, 2023
d793712
elaborate on `Unpin + !UnsafeUnpin`
RalfJung Nov 28, 2023
9327537
fix typo
RalfJung Dec 22, 2023
86e0188
add example for why this is tied to pinning
RalfJung Feb 10, 2024
14b9019
update UnsafePinned docs regarding public API exposure
RalfJung Feb 11, 2024
5b2d690
Send and Sync
RalfJung Mar 10, 2024
1f31c7b
fix code example
RalfJung May 3, 2024
76a4035
make UnsafePinned derive Copy, Send, Sync
RalfJung May 3, 2024
bfb8c4b
expand back-compat note
RalfJung May 3, 2024
e98f367
add fixed code example
RalfJung May 29, 2024
474389f
add open question around naming
RalfJung May 29, 2024
3efb695
update filename and RFC number
RalfJung Jun 17, 2024
519aeb6
Clean up trailing whitespace
traviscross Jun 18, 2024
b950de5
Add tracking issue for RFC 3467
traviscross Jun 18, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
337 changes: 337 additions & 0 deletions text/0000-unsafe-aliased.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,337 @@
# `unsafe_aliased`

- Feature Name: `unsafe_aliased`
- Start Date: 2022-11-05
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Summary
[summary]: #summary

Add a type `UnsafeAliased` that acts on `&mut` in a similar way to how `UnsafeCell` acts on `&`: it opts-out of the aliasing guarantees.
However, `&mut UnsafeAliased` can still be `mem::swap`ed, so this is not a free ticket for arbitrary aliasing of mutable references.
Abstractions built on top of `UnsafeAliased` must ensure proper encapsulation when handing such aliases references to clients (e.g. via pinning).

This type is then used in generator lowering, finally fixing [#63818](https://github.com/rust-lang/rust/issues/63818).

# Motivation
[motivation]: #motivation

Let's say you want to write a type with a self-referential pointer:

```rust
#![feature(pin_macro)]
#![feature(negative_impls)]
use std::ptr;
use std::pin::{pin, Pin};

pub struct S {
data: i32,
ptr_to_data: *mut i32,
}

impl !Unpin for S {}
Comment thread
RalfJung marked this conversation as resolved.
Outdated

impl S {
pub fn new() -> Self {
S { data: 42, ptr_to_data: ptr::null_mut() }
}

pub fn get_data(self: Pin<&mut Self>) -> i32 {
// SAFETY: We're not moving anything.
let this = unsafe { Pin::get_unchecked_mut(self) };
if this.ptr_to_data.is_null() {
this.ptr_to_data = ptr::addr_of_mut!(this.data);
}
// SAFETY: if the pointer is non-null, then we are pinned and it points to the `data` field.
unsafe { this.ptr_to_data.read() }
}

pub fn set_data(self: Pin<&mut Self>, data: i32) {
// SAFETY: We're not moving anything.
let this = unsafe { Pin::get_unchecked_mut(self) };
if this.ptr_to_data.is_null() {
this.ptr_to_data = ptr::addr_of_mut!(this.data);
}
// SAFETY: if the pointer is non-null, then we are pinned and it points to the `data` field.
unsafe { this.ptr_to_data.write(data) }
}
}

fn main() {
let mut s = pin!(S::new());
s.as_mut().set_data(42);
println!("{}", s.as_mut().get_data());
}
```

This kind of code is implicitly generated by rustc all the time when an `async fn` has a local variable of reference type that is live across a yield point.
The problem is that this code has UB under our aliasing rules: the `&mut S` inside the `self` argument of `get_data` aliases with `ptr_to_data`!

This simple code only has UB under Stacked Borrows but not under the LLVM aliasing rules; more complex variants of this -- still in the realm of what `async fn` generates -- also have UB under the LLVM aliasing rules.

<details>

<summary>A more complex variant</summary>

```rust
#![feature(pin_macro)]
Comment thread
RalfJung marked this conversation as resolved.
Outdated
#![feature(negative_impls)]
use std::ptr;
use std::pin::{pin, Pin};
use std::task::Poll;

pub struct S {
state: i32,
data: i32,
ptr_to_data: *mut i32,
}

impl !Unpin for S {}

impl S {
pub fn new() -> Self {
S { state: 0, data: 42, ptr_to_data: ptr::null_mut() }
Comment thread
RalfJung marked this conversation as resolved.
Outdated
}

fn poll(self: Pin<&mut Self>) -> Poll<()> {
// SAFETY: We're not moving anything.
let this = unsafe { Pin::get_unchecked_mut(self) };
match this.state {
0 => {
// The first time, set up the pointer.
this.ptr_to_data = ptr::addr_of_mut!(this.data);
// Now yield.
this.state += 1;
Poll::Pending
}
1 => {
// After coming back from the yield, write to the pointer.
unsafe { this.ptr_to_data.write(42) };
// And read our local variable `data`.
// THIS IS UB! `this` is derived from the `noalias` pointer
// `self` but we did a write to `this.data` in the previous
// line when writing to `ptr_to_data`. The compiler is allowed
// to reorder this and the previous line and then the output
// would change.
println!("{}", this.data);
// Done!
Poll::Ready(())
}
_ => unreachable!(),
}
}
}

fn main() {
let mut s = pin!(S::new());
while let Poll::Pending = s.as_mut().poll() {}
}
```

</details>

<br>

Beyond self-referential types, a similar problem also comes up with intrusive linked lists: the nodes of such a list often live on the stack frames of the functions participating in the list, but also have incoming pointers from other list elements.
When a function takes a mutable reference to its stack-allocated node, that will alias the pointers from the neighboring elements.
`Pin` is used to ensure that the list elements don't just move elsewhere (which would invalidate those incoming pointers), but there still is the problem that an `&mut Node` is actually not a unique pointer due to these aliases -- so we need a way for the to opt-out of the aliasing rules.

<br>

The goal of this RFC is to offer a way of writing such self-referential types and intrusive collections without UB.
We don't want to change the rules for mutable references in general (that would also affect all the code that doesn't do anything self-referential), instad we want to be able to tell the compiler that this code is doing funky aliasing and that should be taken into account for optimizations.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

To write this code in a UB-free way, wrap the fields that are *targets* of self-referential pointers in an `UnsafeAliased`:

```rust
pub struct S {
data: UnsafeAliased<i32>, // <!---- here
ptr_to_data: *mut i32,
}
```

This lets the compiler know that mutable references to `data` might still have aliases, and so optimizations cannot assume that no aliases exist. That's entirely analogous to how `UnsafeCell` lets the compiler know that *shared* references to this field might undergo mutation.

However, what is *not* analogous is that `&mut S`, when handed to safe code *you do not control*, must still be unique pointers!
This is because of methods like `mem::swap` that can still assume that their two arguments are non-overlapping.
(`mem::swap` on two `&mut UnsafeAliased<i32>` may soundly assume that they do not alias.)
In other words, the safety invariant of `&mut S` still requires full ownership of the entire memory range `S` is stored at; for the duration that a function holds on to the borrow, nobody else may read and write this memory.
But again, this is a *safety invariant*; it only applies to safe code you do not control. You can write your own code handling `&mut S` and as long as that code is careful not to make use of this memory in the wrong way, potential aliasing is fine.
You can also hand out a `Pin<&mut S>` to external safe code; the `Pin` wrapper imposes exactly the restrictions needed to ensure that this remains sound (e.g., it prevents `mem::swap`).


# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

API sketch:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should UnsafeAliased be !Unpin? This (negative) impl listed isn't in the sketch, but it must be for the earlier guide-level line about Pin<&mut S> soundness to be accurate.

By analogy to how the other autotraits are treated, I argue that yes, it should be not-Unpin. The user can always implement Unpin if desired.


```rust
/// The type `UnsafeAliased<T>` lets unsafe code violate
/// the rule that `&mut UnsafeAliased<T>` may never alias anything else.
///
/// However, it is still very risky to have two `&mut UnsafeAliased<T>` that alias
/// each other. For instance, passing those two references to `mem::swap`
/// would cause UB. The general advice is to still have a single mutable
/// reference, and then a bunch of raw pointers that alias it.
/// If you want to pass such a reference to external safe code,
/// you need to use techniques such as pinning (`Pin<&mut TypeWithUnsafeAliased>`)
/// which ensure that both the mutable reference and its aliases remain valid.
///
/// Further note that this does *not* lift the requirement that shared references
/// must be read-only! Use `UnsafeCell` for that.
#[lang = "unsafe_cell_mut"]
Comment thread
RalfJung marked this conversation as resolved.
Outdated
#[repr(transparent)]
struct UnsafeAliased<T: ?Sized> {
value: T,
}

impl<T: ?Sized> !Send for UnsafeAliased<T> {}

impl<T> UnsafeCell<T> {
Comment thread
RalfJung marked this conversation as resolved.
Outdated
/// Constructs a new instance of `UnsafeCell` which will wrap the specified
Comment thread
RalfJung marked this conversation as resolved.
Outdated
/// value.
pub fn new(value: T) -> UnsafeAliased<T> {
UnsafeAliased { value }
}

pub fn into_inner(self) -> T {
self.value
}

/// Get read-write access to the contents of an `UnsafeAliased`.
pub fn get_mut(&mut self) -> *mut T {
Comment thread
RalfJung marked this conversation as resolved.
Outdated
self as *mut _ as *mut T
}

/// Get read-only access to the contents of a shared `UnsafeAliased`.
/// Note that `&UnsafeAliased<T>` is read-only if `&T` is read-only.
/// This means that if there is mutation of the `T`, future reads from the
/// `*const T` returned here are UB!
///
/// ```rust
/// unsafe {
/// let mut x = UnsafeAliased::new(0);
/// let ref1 = &mut *addr_of_mut!(x);
/// let ref2 = &mut *addr_of_mut!(x);
/// let ptr = ref1.get(); // read-only pointer, assumes immutability
/// ref2.get_mut.write(1);
Comment thread
RalfJung marked this conversation as resolved.
Outdated
/// ptr.read(); // UB!
/// }
/// ```
pub fn get(&self) -> *const T {
self as *const _ as *const T
}

pub fn raw_get_mut(this: *mut Self) -> *mut T {
this as *mut T
}

pub fn raw_get(this: *const Self) -> *const T {
this as *const T
}
}
```

The comment about aliasing `&mut` being "risky" refers to the fact that their safety invariant still asserts exclusive ownership.
This implies that `duplicate` in the following example, while not causing immediate UB, is still unsound:

```rust
pub struct S {
data: UnsafeAliased<i32>,
}

impl S {
fn new(x: i32) -> Self {
S { data: UnsafeAliased::new(x) }
}

fn duplicate<'a>(s: &'a mut S) -> (&'a mut S, &'a mut S) {
let s1 = unsafe { (&s).transmute_copy() };
let s2 = s;
(s1, s2)
}
}
```

The unsoundness is easily demonstrated by using safe code to cause UB:

```rust
let mut s = S::new(42);
let (s1, s2) = s.duplicate(); // no UB
mem::swap(s1, s2); // UB
```

We could even soundly make `get_mut` return an `&mut T`, given that the safety invariant is not affected by `UnsafeAliased`.
But that would probably not be useful and only cause confusion.
Comment on lines +293 to +
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This confuses me a bit, I expect to be able to write this kind of function:

fn safe_ffi_pointer_get() -> &mut UnsafePinned<u32> {
    let ptr: *mut u32 = unsafe { ffi_pointer_get() };
    let ptr: *mut UnsafePinned<u32> = ptr.cast::<UnsafePinned<u32>>();
    let r = unsafe { &mut *ptr };
}

But if get_mut_unchecked returned &mut T, then one could write:

let r = safe_ffi_pointer_get();
let r = r.get_mut_unchecked();

And thus get access to the &mut u32, even though it might still be referenced & modified by other pointers from e.g. C.

Copy link
Copy Markdown
Member Author

@RalfJung RalfJung Nov 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even though it might still be referenced & modified by other pointers from e.g. C.

If that might be the case, you should use Pin<&mut UnsafePinned<u32>> as your return type.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I find it a bit unnatural to have to wrap a pointer to u32 in Pin, but it's probably something one will get used to.

Another thing that I noticed was that this essentially gives Pin<P> another job. It not only ensures that the pointee is never moved, but also that uniqueness cannot be assumed. This will make writing unsafe pin projections much more miserable, since that is an additional requirement one needs to uphold:

fn project(self: Pin<&mut Self>) -> Pin<&mut Field> {
    // SAFETY: We do not move out of `this` and do not give it to code that might assume uniqueness.
    let this = unsafe { self.get_mut_unchecked() };
    // SAFETY: `self` was pinned and `field` is structually pinned.
    unsafe { Pin::new_unchecked(&mut this.field) }
}

I don't think there is another way though. This just makes we want safe field projections even more.
There might be generic code that assumes that Pin<&mut T> is actually unique, but if T is self referential, that assumption is wrong today, so I don't think that this is a problem.

Copy link
Copy Markdown
Member Author

@RalfJung RalfJung Nov 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I find it a bit unnatural to have to wrap a pointer to u32 in Pin, but it's probably something one will get used to.

You explained yourself why it needs to be pinned: "it might still be referenced & modified by other pointers". It's not a pointer to u32, it's a pointer to UnsafePinned<u32>.

Another thing that I noticed was that this essentially gives Pin<P> another job. It not only ensures that the pointee is never moved, but also that uniqueness cannot be assumed. This will make writing unsafe pin projections much more miserable, since that is an additional requirement one needs to uphold:

I don't follow. This RFC does not introduce any requirements for Pin that do not already exist. Uniqueness assumptions are completely unaffected by Pin. That is indeed the problem that motivates UnsafePinned in the first place.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even though it might still be referenced & modified by other pointers from e.g. C.

If that might be the case, you should use Pin<&mut UnsafePinned<u32>> as your return type.

what happens if I do this:

#[repr(transparent)]
struct Foo {
    field: UnsafePinned<u32>,
}

impl Unpin for Foo {}

fn safe_ffi_pointer_get() -> Pin<&mut Foo> {
    let ptr: *mut u32 = unsafe { ffi_pointer_get() };
    let ptr: *mut Foo = ptr.cast::<Foo>();
    let r = unsafe { &mut *ptr };
    Pin::new(r)
}

let r = safe_ffi_pointer_get().into_inner();
let r = r.field.get_mut_unchecked();

So I think Unpin has to be made unsafe regardless?

Copy link
Copy Markdown
Member Author

@RalfJung RalfJung Nov 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've written a bit more about Unpin + !UnsafeUnpin in the RFC. Basically, these types are a foot-gun -- someone did impl<T> Unpin MyType<T> rather than impl<T: Unpin> Unpin MyType<T>. It's not impossible that this is sound, but it certainly needs extra scrutiny. However, having such a type be Unpin + !UnsafeUnpin is still better than it being just Unpin; the known poll_fn-related cases of UB in real code would have all been mitigated by this and only issues like the one I describe here remain.

We also likely would never have found out about the poll_fn situation without that UB in real code... but that's not a great motivation to make code have UB, IMO.^^

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about types T: Unpin + !UnsafeUnpin? I would expect that those are malformed, since you can go safely from Pin<&mut T> -> &mut T but this can violate the safety invariants of &mut T...

I guess that would have to be a type that syntactically contains UnsafePinned but does not actually make use of pinning. It doesn't make a ton of sense. See the poll_fn debacle for the kind of problems one can cause with an impl Unpin for; UnsafeUnpin is an unsafe trait to prevent exactly that.

Yes, impl Unpin for is a problem.

Basically UnsafeUnpin as a trait has a "safety invariant" that is exactly the same as that of Unpin, but then it also additionally has validity effects (direct impact on the operational semantics) via the noalias for &mut. We should only add noalias for types that satisfy the Unpin invariant, but we can't trust the Unpin trait so we need a new one. Also, like for Freeze, it's unclear whether we want to allow programmers to "hide" an UnsafePinned with a manual impl UnsafeUnpin.

Would it be far fetched to make Unpin: UnsafeUnpin? That would prevent the existence of types Unpin + !UnsafeUnpin which IMO don't have a usecase and are foot-guny. Probably need a crater run to see if this breaks something, or do you know of some issue with this?

It's a shame that we are pessimizing all &mut UnsafePinned when we only really need to do then for pinned pointers, but due to a lack of native &pin types we can't really do any better.

IMO this isn't so bad -- at least for RFL, all types that will be !UnsafeUnpin will only be used when pinned. It would be pretty unnatural to use a &mut UnsafePinned directly.
For futures, I don't know enough, but to me it would seem strange to "use" a future that is not pinned.


I've written a bit more about Unpin + !UnsafeUnpin in the RFC. Basically, these types are a foot-gun -- someone did impl<T> Unpin MyType<T> rather than impl<T: Unpin> Unpin MyType<T>. It's not impossible that this is sound, but it certainly needs extra scrutiny. However, having such a type be Unpin + !UnsafeUnpin is still better than it being just Unpin; the known poll_fn-related cases of UB in real code would have all been mitigated by this and only issues like the one I describe here remain.

The way I see T: Unpin + !UnsafeUnpin is that there is no UB unless you give an aliasing Pin<&mut T> (or &mut T) to arbitrary (safe) code (e.g. swap). I am not sure that it is possible to exhibit UB with solely safe code. Because in order to create aliasing Pin<&mut T> one still needs unsafe. So I agree that this is far better than the status quo.

If the suggestion above to make Unpin: UnsafeUnpin, is there some other way to prevent the existence of types with Unpin + !UnsafeUnpin? For example creating a lint?

We also likely would never have found out about the poll_fn situation without that UB in real code... but that's not a great motivation to make code have UB, IMO.^^

Agreed.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be far fetched to make Unpin: UnsafeUnpin?

I don't see how that can be done backwards-compatibly.
And if we want do devise a plan for a breaking change over an edition boundary, it should IMO involve replacing Unpin by UnsafeUnpin wholesale... though we'd have to figure out the status of the Unpin bounds then, which are common, in contrast to Freeze which currently cannot be used in bounds (but that feature is every now and then requested).

I'd prefer to not tackle that challenge in this RFC.

The way I see T: Unpin + !UnsafeUnpin is that there is no UB unless you give an aliasing Pin<&mut T> (or &mut T) to arbitrary (safe) code (e.g. swap).

Well, the poll_fn situation didn't involve any such aliasing and still the example that moves the future is still UB.

If the suggestion above to make Unpin: UnsafeUnpin, is there some other way to prevent the existence of types with Unpin + !UnsafeUnpin? For example creating a lint?

That would basically lint against all manual impl Unpin. I don't know the async ecosystem very well but this feels rather disruptive?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I see T: Unpin + !UnsafeUnpin is that there is no UB unless you give an aliasing Pin<&mut T> (or &mut T) to arbitrary (safe) code (e.g. swap).

Well, the poll_fn situation didn't involve any such aliasing and still the example that moves the future is still UB.

Oh I overlooked that. Still, having Unpin + !UnsafeUnpin types feels weird to me.

If the suggestion above to make Unpin: UnsafeUnpin, is there some other way to prevent the existence of types with Unpin + !UnsafeUnpin? For example creating a lint?

That would basically lint against all manual impl Unpin. I don't know the async ecosystem very well but this feels rather disruptive?

Not so sure about that. pin-project generates Unpin impls. An example from tokio:

pin_project! {
    /// Stream for the [`take`](super::AsyncReadExt::take) method.
    #[derive(Debug)]
    #[must_use = "streams do nothing unless you `.await` or poll them"]
    #[cfg_attr(docsrs, doc(cfg(feature = "io-util")))]
    pub struct Take<R> {
        #[pin]
        inner: R,
        // Add '_' to avoid conflicts with `limit` method.
        limit_: u64,
    }
}

Here the Take<R>: Unpin if R: Unpin and if R: Unpin implies R: UnsafeUnpin, then this impl would not trigger the lint.

I just don't think we should allow T: Unpin + !UnsafeUnpin.

Copy link
Copy Markdown
Member Author

@RalfJung RalfJung Nov 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if R: Unpin implies R: UnsafeUnpin

But that's the thing, it doesn't, currently.

I don't see a good migration path towards not allowing T: Unpin + !UnsafeUnpin. And I don't think we should block having UnsafePinned on such a migration path. We surely should block stabilizing UnsafeUnpin on figuring out what we want to do here, but even stabilizing UnsafePinned does not require stabilizing UnsafeUnpin.


Reference diff:

```diff
* Breaking the [pointer aliasing rules]. `&mut T` and `&T` follow LLVM’s scoped
- [noalias] model, except if the `&T` contains an [`UnsafeCell<U>`].
+ [noalias] model, except if the `&T` contains an [`UnsafeCell<U>`] or
+ the `&mut T` contains an [`UnsafeAliased<U>`].
```

Async generator lowering changes:
- Fields that represent local variables whose address is taken across a yield point must be wrapped in `UnsafeAliased`.

Codegen and Miri changes:

- We have a `Unique` auto trait similar to `Freeze` that is implemented if the type does not contain any by-val `UnsafeAliased`.
- `noalias` on mutable references is only emitted for `Unique` types. (This replaces the current hack where it is only emitted for `Unpin` types.)
- Miri will do `SharedReadWrite` retagging inside `UnsafeAliased` similar to what it does inside `UnsafeCell` already. (This replaces the current `Unpin`-based hack.)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's clear that &UnsafeAliased<T> carries a Shared tag, and that mutating through it is UB (assuming no UnsafeCell is present). What's not clear is how retagging impacts extant sibling write-capable provenance.

This very much gets into some undecided specifics to specify precisely, but some specifics are needed. Namely, if I do

let mut s = pin!(S::new());
s.as_mut().get_data(); // init ptr
black_box(s.as_ref()); // retag &S
s.as_mut().set_data(42); // write through ptr

is this guaranteed to be permitted? If it is, then IIUC, &UnsafeAliased<T> can protect the bytes such that writing to them would be UB, but must not retag them in such a way as to completely invalidate sibling write-capable provenance.

Copy link
Copy Markdown
Member Author

@RalfJung RalfJung Aug 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea where existing siblings matter in your example.

In SB, each as_mut creates a new SRW child of s. Each as_ref creates a new SRO child. Both of these actions preserve existing SRW and SRO children of s.

In TB, as_mut will just return the original provenance -- no retagging for !Unique mutable references. as_ref will create a read-only child that lives as long as there is no write.

Your example doesn't even exploit any of that though. It would be allowed even if creating a new SRW or SRO child first killed all previously created children, since you are creating new pointer with calls to as_mut/as_ref all the time. Did you mean to keep the first as_mut pointer around and use it again after the as_ref?


### Comparison with some other types that affect aliasing

- `UnsafeCell`: disables aliasing (and affects but does not fully disable dereferenceable) behind shared refs, i.e. `&UnsafeCell<T>` is special. `UnsafeCell<&T>` (by-val, fully owned) is not special at all and basically like `&T`; `&mut UnsafeCell<T>` is also not special.
- `UnsafeAliased`: disables aliasing (and affects but does not fully disable dereferenceable) behind mutable refs, i.e. `&mut UnsafeAliased<T>` is special. `UnsafeAliased<&mut T>` (by-val, fully owned) is not special at all and basically like `&mut T`; `&UnsafeAliased<T>` is also not special.
- [`MaybeDangling`](https://github.com/rust-lang/rfcs/pull/3336): disables aliasing and dereferencable *of all references (and boxes) directly inside it*, i.e. `MaybeDanling<&[mut] T>` is special. `&[mut] MaybeDangling<T>` is not special at all and basically like `&[mut] T`.
Comment thread
RalfJung marked this conversation as resolved.
Outdated

# Drawbacks
[drawbacks]: #drawbacks

It's yet another wrapper type adjusting our aliasing rules and very easy to mix up with `UnsafeCell` or [`MaybeDangling`](https://github.com/rust-lang/rfcs/pull/3336).
Furthermore, it is an extremely subtle wrapper type, as the `duplicate` example shows.
`Unique` is a very strange trait, since it does not come with much of a safety requirement -- unlike `Freeze`, which quite clearly expresses some statement about the invariant of a type when shared, it is much less clear what the corresponding statement would be for `Unique`. (More work on RustBelt is needed to determine the details here.)

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

The proposal in this RFC matches [what was discussed with the lang team a long time ago](https://github.com/rust-lang/rust/issues/63818#issuecomment-526579977).

However, of course one could imagine alternatives:

- Keep the status quo. The current sitaution is that we only make aliasing requirements on mutable references if the type they point to is `Unpin`. This is unsatisfying: `Unpin` was never meant to have this job. A consequence is that a stray `impl Unpin` on a `Wrapper<T>`-style type can [lead to subtle miscompilations](https://internals.rust-lang.org/t/surprising-soundness-trouble-around-pollfn/17484/15) since it re-adds aliasing requirements for the inner `T`. Contrast this with the `UnsafeCell` situation, where it is not possible for (stable) code to just `impl Freeze for T` in the wrong way -- `UnsafeCell` is *always* recognized by the compiler.

On the other hand, `UnsafeAliased` is rather quirky in its behavior and having two marker traits (`Unique` and `Unpin`) might be too confusing, so sticking with `Unpin` might not be too bad in comparison.

If we do that, however, it seems preferrable to transition `Unpin` to an unsafe trait. There *is* a clear statement about the types' invariants associated with `Unpin`, so an `impl Unpin` already comes with a proof obligation. It just happens to be the case that in a module without unsafe, one can always arrange all the pieces such that the proof obligation is satisifed. This is mostly a coincidence and related to the fact that we don't have safe field projections on `Pin`. That said, solving this also requires solving the trouble around `Drop` and `Pin`, where effectively an `impl Drop` does an implicit `get_mut_unchecked`, i.e., implicitly assumes the type is `Unpin`.
- `UnsafeAliased` could affect aliasing guarantees *both* on mutable and shared references. This would avoid the currently rather subtle situation that arises when one of many aliasing `&mut UnsafeAliased<T>` is cast or coerced to `&UnsafeAliased<T>`: that is a read-only shared reference and all aliases must stop writing! It would make this type strictly more 'powerful' than `UnsafeCell` in the sense that replacing `UnsafeCell` by `UnsafeAliased` would always be correct. (Under the RFC as written, `UnsafeCell` and `UnsafeAliased` can be nested to remove aliasing requirements from both shared and mutable references.)

If we don't do this, we could consider removing `get` since since it seems too much like a foot-gun. But that makes shared references to `UnsafeAliased` fairly pointless. Shared references to generators/futures are basically useless so it is unclear what the potential use-cases here are.
- Instead of introducing a new type, we could say that `UnsafeCell` affects *both* shared and mutable references. That would lose some optimization potential on types like `&mut Cell<T>`, but would avoid the footgun of coercing an `&mut UnsafeAliased<T>` to `&UnsafeAliased<T>`. That said, so far the author is not aware of Miri detecting code that would run into this footgun (and Miri is [able to do detect such issues](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=aab417b535f7dbd266fbfe470ea208c7)).


# Prior art
[prior-art]: #prior-art

This is somewhat like `UnsafeCell`, but for mutable instead of shared references.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

- How do we transition code that relies on `Unpin` opting-out of aliasing guarantees, to the new type? Futures and generators just need a compiler patch, but there is probably other code that needs adjusting (e.g., Rust-for-Linux uses pinning to handle all sorts of self-referntial things in the Linux Kernel). Note that all such code is explicitly unsupported right now; the `Unpin` loophole has always explicitly been declared as temporary, unstable, and not something that we promise will actually work.
- The name of the type needs to be bikeshed. `UnsafeAliased` might be too close to `UnsafeCell`, but that is a deliberate choice to indicate that this type has an effect when it appears in the *pointee*, unlike types like `MaybeUninit` or [`MaybeDangling`](https://github.com/rust-lang/rfcs/pull/3336) that have an effect when wrapped around the *pointer*. Like `UnsafeCell`, the aliasing allowed here is "interior". Other possible names: `UnsafeSelfReferential`, `UnsafePinned`, ...
- Relatedly, in which module should this type live?
- Should this type `derive(Copy)`? `UnsafeCell` does not, which is unfortunate because it now means some people might use `T: Copy` as indication that there is no `UnsafeCell` in `T`.
- `Unpin` [also affects the `dereferenceable` attribute](https://github.com/rust-lang/rust/pull/106180), so the same would happen for this type. Is that something we want to guarantee, or do we hope to get back `dereferenceable` when better semantics for it materialize on the LLVM side?

# Future possibilities
[future-possibilities]: #future-possibilities

- Similar to how we might want the ability to project from `&UnsafeCell<Struct>` to `&UnsafeCell<Field>`, the same kind of projection could also be interesting for `UnsafeAliased`.