From 67aca498c6b17b8888f17c5035e2c6f257e32c5c Mon Sep 17 00:00:00 2001 From: Christopher Durham Date: Mon, 23 May 2022 09:20:57 -0500 Subject: [PATCH] Put a bound on collection misbehavior As currently written, when a logic error occurs in a collection's trait parameters, this allows *completely arbitrary* misbehavior, so long as it does not cause undefined behavior in std. However, because the extent of misbehavior is not specified, it is allowed for *any* code in std to start misbehaving in arbitrary ways which are not formally UB; consider the theoretical example of a global which gets set on an observed logic error. Because the misbehavior is only bound by not resulting in UB from safe APIs and the crate-level encapsulation boundary of all of std, this makes writing user unsafe code that utilizes std theoretically impossible, as it now relies on undocumented QOI that unrelated parts of std cannot be caused to misbehave by a misuse of std::collections APIs. In practice, this is a nonconcern, because std has reasonable QOI and an implementation that takes advantage of this freedom is essentially a malicious implementation and only compliant by the most langauage-lawyer reading of the documentation. To close this hole, we just add a small clause to the existing logic error paragraph that ensures that any misbehavior is limited to the collection which observed the logic error, making it more plausible to prove the soundness of user unsafe code. This is not meant to be formal; a formal refinement would likely need to mention that values derived from the collection can also misbehave after a logic error is observed, as well as define what it means to "observe" a logic error in the first place. This fix errs on the side of informality in order to close the hole without complicating a normal reading which can assume a reasonable nonmalicious QOI. See also [discussion on IRLO][1]. [1]: https://internals.rust-lang.org/t/using-std-collections-and-unsafe-anything-can-happen/16640 --- library/alloc/src/collections/binary_heap.rs | 7 ++++--- library/alloc/src/collections/btree/map.rs | 6 +++--- library/alloc/src/collections/btree/set.rs | 6 +++--- library/std/src/collections/hash/map.rs | 3 ++- library/std/src/collections/hash/set.rs | 15 ++++++++------- 5 files changed, 20 insertions(+), 17 deletions(-) diff --git a/library/alloc/src/collections/binary_heap.rs b/library/alloc/src/collections/binary_heap.rs index c3c1d0c92a86b..839088eac21ea 100644 --- a/library/alloc/src/collections/binary_heap.rs +++ b/library/alloc/src/collections/binary_heap.rs @@ -166,9 +166,10 @@ mod tests; /// item's ordering relative to any other item, as determined by the [`Ord`] /// trait, changes while it is in the heap. This is normally only possible /// through [`Cell`], [`RefCell`], global state, I/O, or unsafe code. The -/// behavior resulting from such a logic error is not specified (it -/// could include panics, incorrect results, aborts, memory leaks, or -/// non-termination) but will not be undefined behavior. +/// behavior resulting from such a logic error is not specified, but will +/// be encapsulated to the `BinaryHeap` that observed the logic error and not +/// result in undefined behavior. This could include panics, incorrect results, +/// aborts, memory leaks, and non-termination. /// /// # Examples /// diff --git a/library/alloc/src/collections/btree/map.rs b/library/alloc/src/collections/btree/map.rs index 264c217c9ef23..28b44e3acb9da 100644 --- a/library/alloc/src/collections/btree/map.rs +++ b/library/alloc/src/collections/btree/map.rs @@ -64,9 +64,9 @@ pub(super) const MIN_LEN: usize = node::MIN_LEN_AFTER_SPLIT; /// It is a logic error for a key to be modified in such a way that the key's ordering relative to /// any other key, as determined by the [`Ord`] trait, changes while it is in the map. This is /// normally only possible through [`Cell`], [`RefCell`], global state, I/O, or unsafe code. -/// The behavior resulting from such a logic error is not specified (it could include panics, -/// incorrect results, aborts, memory leaks, or non-termination) but will not be undefined -/// behavior. +/// The behavior resulting from such a logic error is not specified, but will be encapsulated to the +/// `BTreeSet` that observed the logic error and not result in undefined behavior. This could +/// include panics, incorrect results, aborts, memory leaks, and non-termination. /// /// Iterators obtained from functions such as [`BTreeMap::iter`], [`BTreeMap::values`], or /// [`BTreeMap::keys`] produce their items in order by key, and take worst-case logarithmic and diff --git a/library/alloc/src/collections/btree/set.rs b/library/alloc/src/collections/btree/set.rs index d6733425288d4..20ef834eaeef3 100644 --- a/library/alloc/src/collections/btree/set.rs +++ b/library/alloc/src/collections/btree/set.rs @@ -23,9 +23,9 @@ use super::Recover; /// It is a logic error for an item to be modified in such a way that the item's ordering relative /// to any other item, as determined by the [`Ord`] trait, changes while it is in the set. This is /// normally only possible through [`Cell`], [`RefCell`], global state, I/O, or unsafe code. -/// The behavior resulting from such a logic error is not specified (it could include panics, -/// incorrect results, aborts, memory leaks, or non-termination) but will not be undefined -/// behavior. +/// The behavior resulting from such a logic error is not specified, but will be encapsulated to the +/// `BTreeSet` that observed the logic error and not result in undefined behavior. This could +/// include panics, incorrect results, aborts, memory leaks, and non-termination. /// /// Iterators returned by [`BTreeSet::iter`] produce their items in order, and take worst-case /// logarithmic and amortized constant time per item returned. diff --git a/library/std/src/collections/hash/map.rs b/library/std/src/collections/hash/map.rs index 969f5dde4f05d..4ec423eb27f31 100644 --- a/library/std/src/collections/hash/map.rs +++ b/library/std/src/collections/hash/map.rs @@ -54,7 +54,8 @@ use crate::sys; /// the [`Eq`] trait, changes while it is in the map. This is normally only /// possible through [`Cell`], [`RefCell`], global state, I/O, or unsafe code. /// The behavior resulting from such a logic error is not specified, but will -/// not result in undefined behavior. This could include panics, incorrect results, +/// be encapsulated to the `HashMap` that observed the logic error and not +/// result in undefined behavior. This could include panics, incorrect results, /// aborts, memory leaks, and non-termination. /// /// The hash table implementation is a Rust port of Google's [SwissTable]. diff --git a/library/std/src/collections/hash/set.rs b/library/std/src/collections/hash/set.rs index 4ac0e081c2e2d..da0572047eca8 100644 --- a/library/std/src/collections/hash/set.rs +++ b/library/std/src/collections/hash/set.rs @@ -33,13 +33,14 @@ use super::map::{map_try_reserve_error, RandomState}; /// In other words, if two keys are equal, their hashes must be equal. /// /// -/// It is a logic error for an item to be modified in such a way that the -/// item's hash, as determined by the [`Hash`] trait, or its equality, as -/// determined by the [`Eq`] trait, changes while it is in the set. This is -/// normally only possible through [`Cell`], [`RefCell`], global state, I/O, or -/// unsafe code. The behavior resulting from such a logic error is not -/// specified (it could include panics, incorrect results, aborts, memory -/// leaks, or non-termination) but will not be undefined behavior. +/// It is a logic error for a key to be modified in such a way that the key's +/// hash, as determined by the [`Hash`] trait, or its equality, as determined by +/// the [`Eq`] trait, changes while it is in the map. This is normally only +/// possible through [`Cell`], [`RefCell`], global state, I/O, or unsafe code. +/// The behavior resulting from such a logic error is not specified, but will +/// be encapsulated to the `HashSet` that observed the logic error and not +/// result in undefined behavior. This could include panics, incorrect results, +/// aborts, memory leaks, and non-termination. /// /// # Examples ///