Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unsafe statics #2937

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

withoutboats
Copy link
Contributor

@withoutboats withoutboats commented May 29, 2020

Add unsafe statics, which are like statics except they are unsafe to declare and use, and they are not required to implement Sync. Unlike static muts, they do not allow mutable references.

cc @RalfJung

If it turns out this is more complex or disputed than I initially imagine, I may not be able to steward this to merge. But it seems like a really straightforward revision of static muts to make them no longer a misfeature.

Rendered

@withoutboats withoutboats added the T-lang Relevant to the language team, which will review and decide on the RFC. label May 29, 2020
@withoutboats
Copy link
Contributor Author

One thing I think is interesting: if we were to advance the unsafe references idea ever as well, we would end up in a state where unsafe feels a lot more elucidated.

Many kinds of items would have an "unsafe" variant, which in some cases relaxes some of the requirements (eg unsafe statics do not need to be sync, unsafe references do not need to point to valid data) and in other cases (unsafe functions and traits) does not. But all introduce points of abstraction at which invariants are introduced to be maintained by use sites (which must be inside unsafe scopes).

This would normalize unsafe Rust compared to today. Right now the "unsafe superset" is a hodgepodge of items which are not conceptually well-unified with the safe subset (raw pointers, static muts).

@RalfJung
Copy link
Member

Many kinds of items would have an "unsafe" variant, which in some cases relaxes some of the requirements (eg unsafe statics do not need to be sync, unsafe references do not need to point to valid data) and in other cases (unsafe functions and traits) does not. But all introduce points of abstraction at which invariants are introduced to be maintained by use sites (which must be inside unsafe scopes).

Except unsafe blocks and unsafe impl, which discharge those invariants.

@withoutboats
Copy link
Contributor Author

withoutboats commented May 29, 2020

Except unsafe blocks and unsafe impl, which discharge those invariants.

Yea, exactly, thats what I meant by "unsafe scopes." But maybe you're just refering to the fact that we use the same keyword for both sides of this contract, which - yea, whether that was a good idea or not a separate question. 😅

# Summary
[summary]: #summary

Replace static muts with unsafe statics, which are error prone.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Editorial note: this sounds like you are saying that unsafe statics are error prone.

Comment on lines +64 to +65
creating new invariant abstraction points. It's unclear what use cases this would have, but it seems
like something which could be compelling.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, so that would be using an unsafe static for "communication" between a library and its environment, and make that communication subject to extra constraints? Yes that sounds potentially interesting indeed.

(I struggled at first to understand this paragraph, it would help if you said explicitly that a hypothetical crate would make such an unsafe static part of its public API.)

declaration point of the static which does not obey the safe static rules unsafe. It is a more
direct expression of user intent than creating a RacyUnsafeCell: you want to create a static that
cannot be proven by the compiler to uphold the requirements of a safe static, so you create an
unsafe static.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In particular, unsafe static is somewhat more direct because it attaches the safety comment to the value (well, place) of concern, whereas RacyUnsafeCell would have a safety comment at its unsafe impl that basically says something like "the only instance of this type is a static and it is being used as follows". The latter feels like an awkward indirection.

@comex
Copy link

comex commented May 31, 2020

I'd intuitively expect unsafe static to work exactly the same as static except for requiring unsafe to access, similar to unsafe trait's relation to trait. It feels really surprising to me that it would also turn off the Sync check. That feels almost analogous to the common misconception that unsafe "turns off the borrow checker", in the sense that it implicitly disables a safety check.

@kennytm
Copy link
Member

kennytm commented May 31, 2020

cc rust-lang/rust#53639

@kennytm
Copy link
Member

kennytm commented May 31, 2020

The unsafe keyword in Rust can either define (unsafe trait) or "discharge" (unsafe impl, unsafe {} block) the unsafe obligation. Currently unsafe fn is in both categories, and it is now considered mistake in #2585 and eventually won't discharge the unsafe effect in some future edition.

In this RFC, unsafe static again takes both the define and discharge roles:

  • It discharges the safe invariance which is "static must be Sync"
  • It defines the unsafe effect which its usage must be unsafe.

perhaps this causes the surprises in the above comments.


Is it possible to split the two roles? Here is an alternative:

  1. unsafe static just declares the item unsafe to use, and nothing more.
  2. static no longer errors on non-Sync types. The check is converted to detect unsafe code usage, similar to accessing a static mut or union field — trying to access a non-Sync static would require an unsafe block.

This means the following is allowed (perhaps with a lint)...

static X: Cell<i32> = Cell::new(0);

but doing these would require unsafe:

// not ok:
let x = &X;
// ok:
let x = unsafe { &X };

// not ok:
let x = X.get();
// ok:
let x = unsafe { X.get() };

// not ok:
X.set(10);
// ok:
unsafe { X.set(10) };

// not ok:
let x = &mut X;
// never ok:
let x = unsafe { &mut X };

// not ok even if Y is Copy (??) (does `static Y: *mut T = ...` make sense?):
let y = Y;
// ok if Y is Copy (??):
let y = unsafe { Y };
error[E0133]: use of non-Sync static is unsafe and requires unsafe block
 --> src/main.rs:4:13
  |
4 |     let x = &X;
  |             ^^ use of non-Sync static
  |
  = note: non-Sync statics cannot be safely shared by multiple threads: aliasing violations or data races will cause undefined behavior

One big drawback of such proposal is that the Sync check is deferred from declaration time to usage time, which is quite the opposite direction of how Rust performs code checking. It also does not explicitly tell the reader that using X is unsafe, which is one of the rationales of this RFC.

@retep998
Copy link
Member

retep998 commented Jun 3, 2020

I like the idea of a static that requires unsafe to access, but I don't like that it also implicitly removes the Sync check.

I'm opposed to statics in general allowing non Sync types but requiring unsafe for such types, because one common usage of statics is in unsafe code that has to do callbacks with no context and in such a case the user might never realize their mistake because everything is in unsafe blocks already!

Since before 1.0, Rust has had a feature called `static mut`, which allows users to define statics
to which mutable references can be taken. Actually taking a mutable reference is unsafe. However,
this feature is incredibly footgunny and almost impossible to use correctly, since creating a
mutable reference to shared memory is undefined behavior, even if you never dereference that
Copy link
Member

@Aaron1011 Aaron1011 Jun 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you say 'shared memory', do you mean 'memory that already has a immutable/mutable reference to it'? The current wording makes it sound as though creating any mutable reference is undefined behavior, since a static is always 'shared' by virtue of having global scope.

@joshtriplett
Copy link
Member

I share the semantic questions raised in other comments about whether the keyword unsafe alone should imply ignoring Sync. I also have an implementation question, however:

Could the RFC please document the expectation for whether code generation for unsafe static will be able to emit the same code that would have been emitted with static mut? Should the compiler, in theory, be able to emit the same code for an UnsafeCell that it would emit for a global variable in another language? Are there cases where this change would prevent emitting efficient code?

(Some of this has been discussed on other issues, but I'd like to see it captured in the RFC.)

Copy link
Contributor

@nikomatsakis nikomatsakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good to me.

@nikomatsakis
Copy link
Contributor

I just read over the comments. I agree that having the "unsafe static is not Sync" stood out to me on first read as a bit of a surprise, though I overlooked it; it might be nice if the unsafe keyword had exactly one effect (making usage unsafe), while keeping other checks in place. On the other hand, that certainly works against the core motivation here, so we'd have to have some mechanism for opting out from the Sync requirement.

I found @kennytm's idea intriguing, but I think I would prefer for the "opt-out" to be an explicit mechanism. As a general rule, I'd prefer to avoid the compiler having to "figure out" if a type if Sync to know whether a mechanism is unsafe -- I'd rather we structure any trait obligations in terms of requirements (this must be true or the program errors) rather than tests (if this is true, one thing happens, but otherwise another), particularly when it comes to "inter-procedural" (in this case, inter-item) requirements.

I think the question of whether this use of unsafe plays "two roles" is an interesting one. It doesn't really feel to me that the Sync invariant is discharged in any way -- it seems like it is simply being moved to the usage site. This is I guess analogous to the way that one could define a function like unsafe fn(x: *const u32) -> u32 { *x }, but that case feels different to me. With a function, one can plausibly have different invariants for the caller from those being discharged in the body. But with a static, the Sync invariant is just always moved to the usage site.

(As a side note, if/when we do adopt some notion of safety and const fn, there would also be a potential interaction with unsafe static. For example, based on how unsafe fn operates, one might expect that unsafe static X: u32 = call_to_unsafe_const_fn(...) would compile, although I think that this makes little sense, at least given @RalfJung's interpretation of safety, which is more about being deterministic than unsafe.)

@RalfJung
Copy link
Member

This is I guess analogous to the way that one could define a function like unsafe fn(x: *const u32) -> u32 { *x }, but that case feels different to me. With a function, one can plausibly have different invariants for the caller from those being discharged in the body. But with a static, the Sync invariant is just always moved to the usage site.

This analogy doesn't really work. Non-Sync statics are not themselves UB, they can be used to cause UB, so the proof obligation to discharge is not" your type must be Sync". There's no "Sync invariant". There's just a requirement to not cause data races, which is independent of statics.

@nikomatsakis
Copy link
Contributor

@RalfJung

Yes. I guess the point is that an "ordinary" safe static is sort of analogous to function like

// can call from multiple threads because we require `T: Sync`
fn foo<T: Sync> {
   unsafe { ... }
}

whereas the unsafe variant might be more like

// can't call from multiple threads
unsafe fn foo<T> { }

In any case, I think it remains true that the unsafe keyword here is not being used to both discharge an obligation and to add one, but merely to add proof obligations. However, it does also remove a requirement that was serving to discharge the obligation before (the Sync requirement).

@Lokathor
Copy link
Contributor

So is there a reason we're not calling them unsync statics? Would a positional keyword add too much complexity?

@nikomatsakis
Copy link
Contributor

I think there would be two reasons:

  • Extra keyword, extra "concept" in some sense, though of course you could view it as a more precise expression of the concept (I just mean that there aren't other "unsync" things in the language)
  • The idea of being able to maybe attach add'l invariants of your own that are not related to race conditions, though I don't quite know what that would be

@brain0
Copy link

brain0 commented Jun 16, 2020

I don't think this feature is needed or desirable. As fair as I see it, the only thing that's currently impossible with static only (and thus requires static mut) is non-Sync types.

However, all of this can be solved without adding yet another language feature, analogous to UnsafeCell and Cell/RefCell:

Add a type UnsafeSyncCell (preliminary, need a better name) to std:

pub struct UnsafeSyncCell<T>(ManuallyDrop<T>);

unsafe impl<T> Sync for UnsafeSyncCell<T> { }

impl<T> UnsafeSyncCell<T> {
    /// # Safety
    ///
    /// Must guarantee that this is only called from the correct thread.
    pub unsafe fn get(&self) -> &T { &self.0 }

    /// # Safety
    ///
    /// See above.
    pub unsafe fn into_inner(self) -> T { ManuallyDrop::into_inner(self.0) }
}

Since this is mainly intended to be used in statics, I think it's okay that the value is not dropped automatically. If someone wants to use this outside of statics, they would need to write their own drop glue.

On top of that, one could build other types with safe interfaces, for example one that stores the thread id when the type is constructed and checks it on every access. Such a type could even drop the value if the thread is the correct one and abort otherwise.

There may be other issues with this besides drop that I'm not seeing right now, but I don't think there's any hard blockers.

Adding such a library type would allow deprecating (and eventually removing) static mut without adding a new language feature as a replacement. It would eliminate all the above discussions about unsafe semantics, since it's a library type like any other that follows the usual semantics. It would fit perfectly into the rest of Rust, where it's normal to compose types (think of Arc<Mutex<_>>). It's also opt-in only. And it does not use any std-only features, so it could even be built and prototyped in an external crate.

@rpjohnst
Copy link

@brain0 That alternative is already under discussion in rust-lang/rust#53639, and this is a counter-proposal designed to avoid some of its problems.

@brain0
Copy link

brain0 commented Jun 16, 2020

I'm sorry, I did not know that.

One argument in the other thread was that such a rare use case does not justify adding such types to std. I think it's the opposite, and adding a language feature for something that can be easily solved in a library is not justified.

@bluss
Copy link
Member

bluss commented Dec 20, 2020

I liked @dtolnay's minimal alternative: rust-lang/rust#53639 (comment) which will look like just a small special case feature (and removal of static mut). This should be less to learn for everyone.

The drawback of special-casing UnsafeCell though is: it's a special case which doesn't transfer to abstractions built ontop of UnsafeCell - and the rebuttal of that, which I think I'm happy with, is that those abstractions have a chance to add Sync impls, that UnsafeCell doesn't have.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
Status: Rejected/Not lang
Development

Successfully merging this pull request may close these issues.