Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement os_string_pathbuf_leak #125966

Merged
merged 2 commits into from
Jun 8, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 19 additions & 0 deletions library/std/src/ffi/os_str.rs
Original file line number Diff line number Diff line change
Expand Up @@ -533,6 +533,25 @@ impl OsString {
unsafe { Box::from_raw(rw) }
}

/// Consumes and leaks the `OsString`, returning a mutable reference to the contents,
/// `&'a mut OsStr`.
///
/// The caller has free choice over the returned lifetime, including 'static.
/// Indeed, this function is ideally used for data that lives for the remainder of
/// the program’s life, as dropping the returned reference will cause a memory leak.
///
/// It does not reallocate or shrink the `OsString`, so the leaked allocation may include
/// unused capacity that is not part of the returned slice. If you want to discard excess
/// capacity, call [`into_boxed_os_str`], and then [`Box::leak`] instead.
/// However, keep in mind that trimming the capacity may result in a reallocation and copy.
///
/// [`into_boxed_os_str`]: Self::into_boxed_os_str
#[unstable(feature = "os_string_pathbuf_leak", issue = "125965")]
#[inline]
pub fn leak<'a>(self) -> &'a mut OsStr {
OsStr::from_inner_mut(self.inner.leak())
}

/// Part of a hack to make PathBuf::push/pop more efficient.
#[inline]
pub(crate) fn as_mut_vec_for_path_buf(&mut self) -> &mut Vec<u8> {
Expand Down
9 changes: 9 additions & 0 deletions library/std/src/ffi/os_str/tests.rs
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,15 @@ fn test_os_string_clear() {
assert_eq!(0, os_string.inner.as_inner().len());
}

#[test]
fn test_os_string_leak() {
let os_string = OsString::from("have a cake");
let (len, cap) = (os_string.len(), os_string.capacity());
let leaked = os_string.leak();
assert_eq!(leaked.as_encoded_bytes(), b"have a cake");
unsafe { drop(String::from_raw_parts(leaked as *mut OsStr as _, len, cap)) }
}

#[test]
fn test_os_string_capacity() {
let os_string = OsString::with_capacity(0);
Expand Down
19 changes: 19 additions & 0 deletions library/std/src/path.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1226,6 +1226,25 @@ impl PathBuf {
self
}

/// Consumes and leaks the `PathBuf`, returning a mutable reference to the contents,
/// `&'a mut Path`.
///
/// The caller has free choice over the returned lifetime, including 'static.
/// Indeed, this function is ideally used for data that lives for the remainder of
/// the program’s life, as dropping the returned reference will cause a memory leak.
///
/// It does not reallocate or shrink the `PathBuf`, so the leaked allocation may include
/// unused capacity that is not part of the returned slice. If you want to discard excess
/// capacity, call [`into_boxed_path`], and then [`Box::leak`] instead.
/// However, keep in mind that trimming the capacity may result in a reallocation and copy.
///
/// [`into_boxed_path`]: Self::into_boxed_path
#[unstable(feature = "os_string_pathbuf_leak", issue = "125965")]
#[inline]
pub fn leak<'a>(self) -> &'a mut Path {
Path::from_inner_mut(self.inner.leak())
}

/// Extends `self` with `path`.
///
/// If `path` is absolute, it replaces the current path.
Expand Down
10 changes: 10 additions & 0 deletions library/std/src/path/tests.rs
Original file line number Diff line number Diff line change
Expand Up @@ -126,6 +126,16 @@ fn into() {
assert_eq!(static_cow_path, owned_cow_path);
}

#[test]
fn test_pathbuf_leak() {
let string = "/have/a/cake".to_owned();
let (len, cap) = (string.len(), string.capacity());
let buf = PathBuf::from(string);
let leaked = buf.leak();
assert_eq!(leaked.as_os_str().as_encoded_bytes(), b"/have/a/cake");
unsafe { drop(String::from_raw_parts(leaked.as_mut_os_str() as *mut OsStr as _, len, cap)) }
}

#[test]
#[cfg(unix)]
pub fn test_decompositions_unix() {
Expand Down
5 changes: 5 additions & 0 deletions library/std/src/sys/os_str/bytes.rs
Original file line number Diff line number Diff line change
Expand Up @@ -176,6 +176,11 @@ impl Buf {
self.inner.extend_from_slice(&s.inner)
}

#[inline]
pub fn leak<'a>(self) -> &'a mut Slice {
unsafe { mem::transmute(self.inner.leak()) }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These transmutes aren't really correct, but there's a bunch of other transmutes in the same file, so whatever.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

( They're relying on an assumption of slice representations being equal which is an assumption we've been gradually undermining. )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's fair, after this PR goes live on Nightly, I'll submit another PR replacing those transmutes with abstractions based on slice::from_raw_parts

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though now that I think about it, I'm not sure how would one construct an unsized type...

Copy link
Member

@workingjubilee workingjubilee Jun 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@its-the-shrimp The key thing is these bits:

pub fn as_slice(&self) -> &Slice {
// SAFETY: Slice just wraps [u8],
// and &*self.inner is &[u8], therefore
// transmuting &[u8] to &Slice is safe.
unsafe { mem::transmute(&*self.inner) }
}
#[inline]
pub fn as_mut_slice(&mut self) -> &mut Slice {
// SAFETY: Slice just wraps [u8],
// and &mut *self.inner is &mut [u8], therefore
// transmuting &mut [u8] to &mut Slice is safe.
unsafe { mem::transmute(&mut *self.inner) }
}

I don't think this is really safe, aiui, I believe this is "we get away with it because we're std and if the compiler changes its mind we'll change with it, instantly".

}

#[inline]
pub fn into_box(self) -> Box<Slice> {
unsafe { mem::transmute(self.inner.into_boxed_slice()) }
Expand Down
5 changes: 5 additions & 0 deletions library/std/src/sys/os_str/wtf8.rs
Original file line number Diff line number Diff line change
Expand Up @@ -138,6 +138,11 @@ impl Buf {
self.inner.shrink_to(min_capacity)
}

#[inline]
pub fn leak<'a>(self) -> &'a mut Slice {
unsafe { mem::transmute(self.inner.leak()) }
}

#[inline]
pub fn into_box(self) -> Box<Slice> {
unsafe { mem::transmute(self.inner.into_box()) }
Expand Down
5 changes: 5 additions & 0 deletions library/std/src/sys_common/wtf8.rs
Original file line number Diff line number Diff line change
Expand Up @@ -325,6 +325,11 @@ impl Wtf8Buf {
self.bytes.shrink_to(min_capacity)
}

#[inline]
pub fn leak<'a>(self) -> &'a mut Wtf8 {
unsafe { Wtf8::from_mut_bytes_unchecked(self.bytes.leak()) }
}

/// Returns the number of bytes that this string buffer can hold without reallocating.
#[inline]
pub fn capacity(&self) -> usize {
Expand Down
Loading