Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix handling of malicious Readers in read_to_end #80895

Merged
merged 4 commits into from
Jan 14, 2021

Conversation

sfackler
Copy link
Member

@sfackler sfackler commented Jan 11, 2021

A malicious Read impl could return overly large values from read, which would result in the guard's drop impl setting the buffer's length to greater than its capacity! To fix this, the drop impl now uses the safe truncate function instead of set_len which ensures that this will not happen. The result of calling the function will be nonsensical, but that's fine given the contract violation of the Read impl.

The Guard type is also used by append_to_string which does not pass untrusted values into the length field, so I've copied the guard type into each function and only modified the one used by read_to_end. We could just keep a single one and modify it, but it seems a bit cleaner to keep the guard code close to the functions and related specifically to them.

To fix this, we now assert that the returned length is not larger than the buffer passed to the method.

For reference, this bug has been present for ~2.5 years since 1.20: ecbb896.

Closes #80894.

@rust-highfive
Copy link
Collaborator

r? @shepmaster

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 11, 2021
@camelid camelid added A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jan 11, 2021
@rust-log-analyzer
Copy link
Collaborator

The job mingw-check failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
    Checking addr2line v0.14.0
error[E0282]: type annotations needed
   --> library/std/src/io/mod.rs:381:9
    |
381 |     let ret;
    |         ^^^ consider giving `ret` a type
error: aborting due to previous error

For more information about this error, try `rustc --explain E0282`.
error: could not compile `std`
error: could not compile `std`

To learn more, run the command again with --verbose.
command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "check" "--target" "x86_64-unknown-linux-gnu" "-Zbinary-dep-depinfo" "-j" "16" "--release" "--color" "always" "--features" "panic-unwind backtrace compiler-builtins-c" "--manifest-path" "/checkout/library/test/Cargo.toml" "--message-format" "json-render-diagnostics"
failed to run: /checkout/obj/build/bootstrap/debug/bootstrap check
Build completed unsuccessfully in 0:01:46

ret = Ok(g.len - start_len);
break;
}
Ok(0) => return Ok(g.len - start_len),
Ok(n) => g.len += n,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we check for an overflow here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but I would rather keep the codepath oriented as much as possible towards non-malicious implementations. Overflow can only be caused by bogus return values, and just means that a nonsensical result will be returned.

Copy link
Contributor

@Qwaz Qwaz Jan 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Malicious Read can shrink g.len with an overflow, which in turn can lead to a violation of append_to_string()'s second requirement. (and append_to_string() should probably be marked as an unsafe function)

// The unsafety in this function is twofold:
//
// 1. We're looking at the raw bytes of `buf`, so we take on the burden of UTF-8
// checks.
// 2. We're passing a raw buffer to the function `f`, and it is expected that
// the function only *appends* bytes to the buffer. We'll get undefined
// behavior if existing bytes are overwritten to have non-UTF-8 data.
fn append_to_string<F>(buf: &mut String, f: F) -> Result<usize>
where
F: FnOnce(&mut Vec<u8>) -> Result<usize>,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the context:

fn read_to_string(&mut self, buf: &mut String) -> Result<usize> {
// Note that we do *not* call `.read_to_end()` here. We are passing
// `&mut Vec<u8>` (the raw contents of `buf`) into the `read_to_end`
// method to fill it up. An arbitrary implementation could overwrite the
// entire contents of the vector, not just append to it (which is what
// we are expecting).
//
// To prevent extraneously checking the UTF-8-ness of the entire buffer
// we pass it to our hardcoded `read_to_end` implementation which we
// know is guaranteed to only read data into the end of the buffer.
append_to_string(buf, |b| read_to_end(self, b))
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right. We'll need to replace that with a checked_add then.

This kind of mess is one of the things that the ReadBuf proposal is designed to avoid, so things should get less janky when that's implemented.

@sfackler
Copy link
Member Author

Updated to assert that the returned length is actually valid for the input buffer.

// particular, that could break read_to_string if the shortened buffer doesn't end on a UTF-8 boundary.
// The minimal check would just be a checked_add, but this assert is a bit more precise and should be
// just about the same cost.
assert!(n <= g.buf.len() - g.len);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to slice the slice that's been passed into the read call in the first place and then take its length? As in:

let buffer = &mut g.buf[g.len..];
match r.read(buffer) { 
    // ...
    Ok(n) => g.len += buffer[..n].len(),
}

or something along those lines? It has an added benefit of also detecting when returned n is greater than the length of the length of the buffer provided, and may result in slightly less code generated overall.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check already is looking for n being greater than the length of the buffer, but agreed that structuring it like that is a bit more clear. Will update later today.

@sfackler
Copy link
Member Author

r? @m-ou-se

@rust-highfive rust-highfive assigned m-ou-se and unassigned shepmaster Jan 11, 2021
@sfackler
Copy link
Member Author

cc @Mark-Simulacrum not sure if you think we should to backport to beta?

@Mark-Simulacrum
Copy link
Member

I think that seems fine - it's not really necessary IMO given this has been like this for a long time now, but giving it's a soundness fix seems appropriate. I'm going to unilaterally beta-accept but will wait till it lands on master before backporting.

@Mark-Simulacrum Mark-Simulacrum added beta-accepted Accepted for backporting to the compiler in the beta channel. beta-nominated Nominated for backporting to the compiler in the beta channel. labels Jan 12, 2021
@sfackler
Copy link
Member Author

Yeah, makes sense to me, particularly since the patch is so small.

@m-ou-se
Copy link
Member

m-ou-se commented Jan 13, 2021

@bors r+

@bors
Copy link
Contributor

bors commented Jan 13, 2021

📌 Commit e6c07b0 has been approved by m-ou-se

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 13, 2021
m-ou-se added a commit to m-ou-se/rust that referenced this pull request Jan 14, 2021
Fix handling of malicious Readers in read_to_end

A malicious `Read` impl could return overly large values from `read`, which would result in the guard's drop impl setting the buffer's length to greater than its capacity! ~~To fix this, the drop impl now uses the safe `truncate` function instead of `set_len` which ensures that this will not happen. The result of calling the function will be nonsensical, but that's fine given the contract violation of the `Read` impl.~~

~~The `Guard` type is also used by `append_to_string` which does not pass untrusted values into the length field, so I've copied the guard type into each function and only modified the one used by `read_to_end`. We could just keep a single one and modify it, but it seems a bit cleaner to keep the guard code close to the functions and related specifically to them.~~

To fix this, we now assert that the returned length is not larger than the buffer passed to the method.

For reference, this bug has been present for ~2.5 years since 1.20: rust-lang@ecbb896.

Closes rust-lang#80894.
bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 14, 2021
Rollup of 17 pull requests

Successful merges:

 - rust-lang#79982 (Add missing methods to unix ExitStatusExt)
 - rust-lang#80017 (Suggest `_` and `..` if a pattern has too few fields)
 - rust-lang#80169 (Recommend panic::resume_unwind instead of panicking.)
 - rust-lang#80217 (Add a `std::io::read_to_string` function)
 - rust-lang#80444 (Add as_ref and as_mut methods for Bound)
 - rust-lang#80567 (Add Iterator::intersperse_with)
 - rust-lang#80829 (Get rid of `DepConstructor`)
 - rust-lang#80895 (Fix handling of malicious Readers in read_to_end)
 - rust-lang#80966 (Deprecate atomic::spin_loop_hint in favour of hint::spin_loop)
 - rust-lang#80969 (Use better ICE message when no MIR is available)
 - rust-lang#80972 (Remove unstable deprecated Vec::remove_item)
 - rust-lang#80973 (Update books)
 - rust-lang#80980 (Fixed incorrect doc comment)
 - rust-lang#80981 (Fix -Cpasses=list and llvm version print with -vV)
 - rust-lang#80985 (Fix stabilisation version of slice_strip)
 - rust-lang#80990 (llvm: Remove the unused context from CreateDebugLocation)
 - rust-lang#80991 (Fix formatting specifiers doc links)

Failed merges:

 - rust-lang#80944 (Use Option::map_or instead of `.map(..).unwrap_or(..)`)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit ce48709 into rust-lang:master Jan 14, 2021
@rustbot rustbot added this to the 1.51.0 milestone Jan 14, 2021
@Mark-Simulacrum Mark-Simulacrum modified the milestones: 1.51.0, 1.50.0 Jan 18, 2021
@Mark-Simulacrum Mark-Simulacrum removed the beta-nominated Nominated for backporting to the compiler in the beta channel. label Jan 18, 2021
bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 20, 2021
…ulacrum

[beta] backports

This backports:

*  Update RLS and Rustfmt rust-lang#81027
*  bump rustfmt to v1.4.32 rust-lang#81093
*  Fix handling of malicious Readers in read_to_end rust-lang#80895
*  Fix broken ./x.py install rust-lang#80514
*  Fix x.py install not working with relative prefix rust-lang#80797
*  [security] Update mdbook rust-lang#80688
*  rustdoc: Render visibilities succinctly rust-lang#80368

r? `@ghost`
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Mar 18, 2021
Clarify docs for Read::read's return value

Right now the docs for `Read::read`'s return value are phrased in a way that makes it easy for the reader to assume that the return value is never larger than the passed buffer. This PR clarifies that this is a requirement for implementations of the trait, but that callers have to expect a buggy yet safe implementation failing to do so, especially if unchecked accesses to the buffer are done afterwards.

I fell into this trap recently, and when I noticed, I looked at the docs again and had the feeling that I might not have been the first one to miss this.

The same issue of trusting the return value of `read` was also present in std itself for about 2.5 years and only fixed recently, see rust-lang#80895.

I hope that clarifying the docs might help others to avoid this issue.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Mar 19, 2021
Clarify docs for Read::read's return value

Right now the docs for `Read::read`'s return value are phrased in a way that makes it easy for the reader to assume that the return value is never larger than the passed buffer. This PR clarifies that this is a requirement for implementations of the trait, but that callers have to expect a buggy yet safe implementation failing to do so, especially if unchecked accesses to the buffer are done afterwards.

I fell into this trap recently, and when I noticed, I looked at the docs again and had the feeling that I might not have been the first one to miss this.

The same issue of trusting the return value of `read` was also present in std itself for about 2.5 years and only fixed recently, see rust-lang#80895.

I hope that clarifying the docs might help others to avoid this issue.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Mar 19, 2021
Clarify docs for Read::read's return value

Right now the docs for `Read::read`'s return value are phrased in a way that makes it easy for the reader to assume that the return value is never larger than the passed buffer. This PR clarifies that this is a requirement for implementations of the trait, but that callers have to expect a buggy yet safe implementation failing to do so, especially if unchecked accesses to the buffer are done afterwards.

I fell into this trap recently, and when I noticed, I looked at the docs again and had the feeling that I might not have been the first one to miss this.

The same issue of trusting the return value of `read` was also present in std itself for about 2.5 years and only fixed recently, see rust-lang#80895.

I hope that clarifying the docs might help others to avoid this issue.
JohnTitor added a commit to JohnTitor/rust that referenced this pull request Mar 19, 2021
Clarify docs for Read::read's return value

Right now the docs for `Read::read`'s return value are phrased in a way that makes it easy for the reader to assume that the return value is never larger than the passed buffer. This PR clarifies that this is a requirement for implementations of the trait, but that callers have to expect a buggy yet safe implementation failing to do so, especially if unchecked accesses to the buffer are done afterwards.

I fell into this trap recently, and when I noticed, I looked at the docs again and had the feeling that I might not have been the first one to miss this.

The same issue of trusting the return value of `read` was also present in std itself for about 2.5 years and only fixed recently, see rust-lang#80895.

I hope that clarifying the docs might help others to avoid this issue.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Mar 19, 2021
Clarify docs for Read::read's return value

Right now the docs for `Read::read`'s return value are phrased in a way that makes it easy for the reader to assume that the return value is never larger than the passed buffer. This PR clarifies that this is a requirement for implementations of the trait, but that callers have to expect a buggy yet safe implementation failing to do so, especially if unchecked accesses to the buffer are done afterwards.

I fell into this trap recently, and when I noticed, I looked at the docs again and had the feeling that I might not have been the first one to miss this.

The same issue of trusting the return value of `read` was also present in std itself for about 2.5 years and only fixed recently, see rust-lang#80895.

I hope that clarifying the docs might help others to avoid this issue.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Mar 19, 2021
Clarify docs for Read::read's return value

Right now the docs for `Read::read`'s return value are phrased in a way that makes it easy for the reader to assume that the return value is never larger than the passed buffer. This PR clarifies that this is a requirement for implementations of the trait, but that callers have to expect a buggy yet safe implementation failing to do so, especially if unchecked accesses to the buffer are done afterwards.

I fell into this trap recently, and when I noticed, I looked at the docs again and had the feeling that I might not have been the first one to miss this.

The same issue of trusting the return value of `read` was also present in std itself for about 2.5 years and only fixed recently, see rust-lang#80895.

I hope that clarifying the docs might help others to avoid this issue.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` beta-accepted Accepted for backporting to the compiler in the beta channel. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Heap buffer overflow in read_to_end_with_reservation()