Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

discussion/tracking issue for #[must_use] functions in the standard library #48926

Closed
zackmdavis opened this issue Mar 11, 2018 · 37 comments
Closed
Labels
C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@zackmdavis
Copy link
Member

RFC 1940 authorized use of the #[must_use] attribute on functions (with the result that invocations of the function behave as if the return type was #[must_use].

This has been implemented for a while under a fn_must_use feature gate (tracking issue), and PR #48925 proposes stabilization.

#[must_use] has been added to the comparison methods in the standard library (.eq, .lt, &c.), but we never got a consensus as to what other standard library methods, if any, should get it. In principle, any function or with no side effects could be must-use (e.g., .len() on collections), but adding dozens or hundreds of annotations to the standard library feels kind of dramatic: perhaps must-use should instead be reserved for functions with "unusually result-like" semantics—after all, users who want to be really sure they're not uselessly throwing away a return value can always opt in to the (allow-by-default) unused-results lint.

If we wanted to be very conservative, we could refuse to stabilize until we've made a decision on this: if we were to stabilize first, any new #[must_use] annotations in the standard library would be insta-stable. But, maybe this isn't a big deal (cargo passes cap-lints allow to dependencies, so tinkering with lints shouldn't break people's builds).

@pietroalbini pietroalbini added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. labels Mar 13, 2018
@scottmcm
Copy link
Member

I'm not sure "result-like" is broad enough. Maybe something like "things that may have side effects, but which ought never be used for their side effects, especially if often expensive". As a canonical example of such a thing, Iterator::collect. Perhaps more controversially, Clone::clone.

I definitely don't think this needs to block stabilization, because any deny(warnings) impact will be the same whether with stabilization or after it.

kennytm added a commit to kennytm/rust that referenced this issue Apr 3, 2018
Add #[must_use] to a few standard library methods

Chosen to start a precedent of using it on ones that are potentially-expensive and where using it for side effects is particularly discouraged.

Discuss :)

```rust
warning: unused return value of `std::iter::Iterator::collect` which must be used: if you really need to exhaust the iterator, consider `.for_each(drop)` instead
  --> $DIR/fn_must_use_stdlib.rs:19:5
   |
LL |     "1 2 3".split_whitespace().collect::<Vec<_>>();
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

warning: unused return value of `std::borrow::ToOwned::to_owned` which must be used: cloning is often expensive and is not expected to have side effects
  --> $DIR/fn_must_use_stdlib.rs:21:5
   |
LL |     "hello".to_owned();
   |     ^^^^^^^^^^^^^^^^^^^

warning: unused return value of `std::clone::Clone::clone` which must be used: cloning is often expensive and is not expected to have side effects
  --> $DIR/fn_must_use_stdlib.rs:23:5
   |
LL |     String::from("world").clone();
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```

cc rust-lang#48926
kennytm added a commit to kennytm/rust that referenced this issue Apr 3, 2018
Add #[must_use] to a few standard library methods

Chosen to start a precedent of using it on ones that are potentially-expensive and where using it for side effects is particularly discouraged.

Discuss :)

```rust
warning: unused return value of `std::iter::Iterator::collect` which must be used: if you really need to exhaust the iterator, consider `.for_each(drop)` instead
  --> $DIR/fn_must_use_stdlib.rs:19:5
   |
LL |     "1 2 3".split_whitespace().collect::<Vec<_>>();
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

warning: unused return value of `std::borrow::ToOwned::to_owned` which must be used: cloning is often expensive and is not expected to have side effects
  --> $DIR/fn_must_use_stdlib.rs:21:5
   |
LL |     "hello".to_owned();
   |     ^^^^^^^^^^^^^^^^^^^

warning: unused return value of `std::clone::Clone::clone` which must be used: cloning is often expensive and is not expected to have side effects
  --> $DIR/fn_must_use_stdlib.rs:23:5
   |
LL |     String::from("world").clone();
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```

cc rust-lang#48926
kennytm added a commit to kennytm/rust that referenced this issue Apr 4, 2018
Add #[must_use] to a few standard library methods

Chosen to start a precedent of using it on ones that are potentially-expensive and where using it for side effects is particularly discouraged.

Discuss :)

```rust
warning: unused return value of `std::iter::Iterator::collect` which must be used: if you really need to exhaust the iterator, consider `.for_each(drop)` instead
  --> $DIR/fn_must_use_stdlib.rs:19:5
   |
LL |     "1 2 3".split_whitespace().collect::<Vec<_>>();
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

warning: unused return value of `std::borrow::ToOwned::to_owned` which must be used: cloning is often expensive and is not expected to have side effects
  --> $DIR/fn_must_use_stdlib.rs:21:5
   |
LL |     "hello".to_owned();
   |     ^^^^^^^^^^^^^^^^^^^

warning: unused return value of `std::clone::Clone::clone` which must be used: cloning is often expensive and is not expected to have side effects
  --> $DIR/fn_must_use_stdlib.rs:23:5
   |
LL |     String::from("world").clone();
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```

cc rust-lang#48926
@scottmcm
Copy link
Member

scottmcm commented Apr 5, 2018

As of #49533, this is now on Iterator::collect, Clone::clone, and ToOwned::to_owned.

@jonhoo
Copy link
Contributor

jonhoo commented Apr 20, 2018

I think the non-assigning arithmetic functions are good candidates for this (see #50124).

@shepmaster
Copy link
Member

Another metric could be: "things that (are highly likely to) allocate".

@shepmaster
Copy link
Member

There's a concrete question around String::from_raw_parts, if people are interested in chiming in. My position is that such a function is mostly used for the purpose of converting back from FFI to be dropped, and shouldn't use the attribute.

@scottmcm
Copy link
Member

scottmcm commented May 7, 2018

A thought: I agree we don't want to clutter code with #[must_use] everywhere. But would we be open to the warnings if they came free? Is there some cheap conservative heuristic that would give a bunch of the warnings without the annotation noise? Perhaps "any &self method on a : Freeze type"?

@shepmaster
Copy link
Member

Perhaps "any &self method

I was starting to think down these lines:

Any function which:

  • returns non-() (and maybe some others, like !?)
  • has no &mut arguments of any kind (including and beyond self)

And then we opt-out of things with internal mutability?

We may still end up with a lot of attributes if we follow @Manishearth's suggestion to have explanatory text on every function. (Not sure if I agree with that or not).

@est31
Copy link
Member

est31 commented May 15, 2018

In principle, any function or with no side effects could be must-use (e.g., .len() on collections), but adding dozens or hundreds of annotations to the standard library feels kind of dramatic

The Rust language already has a concept of side-effect freedom which is const fn. So what about adding a warn-by-default lint to the compiler that complains about any unused result of a const fn invocation? This would remove the need to add #[must_use] everywhere, and wouldn't water it down.

MUST is a very strong word. When I see #[must_use], I think that if the result isn't used, this is most likely a dangerous bug. Using the attribute to lint for dead code seems wrong to me because it would water down the strong meaning of MUST. Of course, dead code might hide a bug as well so this distinction is a bit muddy.

@jonhoo
Copy link
Contributor

jonhoo commented May 15, 2018

Hmm, I'm not sure that warning on unused const fn results is sufficient. I could be mistaken, but isn't it currently impossible to write a const fn that, say, does allocation (e.g., one that returns HashMap::new())?

@sfackler
Copy link
Member

HashMap::new doesn't allocate.

@jonhoo
Copy link
Contributor

jonhoo commented May 15, 2018

Sorry, you're right. I guess String::from would be a better example.

@est31
Copy link
Member

est31 commented May 16, 2018

I've made an implementation of my lint idea here: #50805

@jonhoo allocating functions can be const fn. String::from is not const fn for a different reason: it needs a const impl Trait for Struct like feature as e.g. proposed in this RFC. But there is no good reason why the language can't be extended to make String::from const as well.

Edit: fixed link

@est31
Copy link
Member

est31 commented May 29, 2018

I've opened an RFC now as it seems that it requires one: rust-lang/rfcs#2450

@jonhoo
Copy link
Contributor

jonhoo commented Jul 11, 2018

As discussed in #52201, I think another set of candidates for this is methods like compare_and_swap on atomics; methods that do not return Result, but nonetheless need to have their return value checked to determine the outcome of the operation.

@LukasKalbertodt
Copy link
Member

Just to give another example: I just noticed a bug in code that I wrote which could have been caught early with a proper #[must_use] lint. I assumed that a Vec has at leastone element and I wanted to remove the last element. So I wrote this:

vec.pop();

I didn't think about what would happen if the vec is empty. This empty case is a special case I need to cover elsewhere in my code. If the statement above would have caused a warning, I would have thought about it. So I guess the Option<T> returned by pop() is kind of "result like"?

I think marking pop with #[must_use] is a good idea. I think it's extremely rare that the programmer doesn't care about the returned value and also doesn't care about whether or not the method actually modified the vector.

@shepmaster
Copy link
Member

shepmaster commented Oct 23, 2018

I recently watched a newcomer to Rust write this shape of code:

let mut s = String::new();
stdin.read_line(&mut s);
s.trim();
if s == "hello" { /* ... */ }

They either believed that trim operates in-place or simply forgot to use the result, but a must_use on that would have helped them.

@rkjnsn
Copy link
Contributor

rkjnsn commented Nov 27, 2018

I think marking pop with #[must_use] is a good idea. I think it's extremely rare that the programmer doesn't care about the returned value and also doesn't care about whether or not the method actually modified the vector.

I think I would lean more toward #[must_use] being reserved for situations where ignoring the return value is almost certainly a programmer error. It does not seem unreasonable that code might inspect the last element and, if a certain condition is met, discard it using pop. (Indeed, I have written similar code in other languages.)

In contrast, something like calling trim without checking the return value is surely an error, as there is no possible use for doing so.

@BatmanAoD
Copy link
Member

@Jasperav That might be too big a change to make except at an edition boundary, but Rust 2021 is right around the corner 😃

@BatmanAoD
Copy link
Member

It would also be appropriate to create an off-by-default clippy lint implementing that behavior, I think.

@scottmcm
Copy link
Member

No need for a clippy lint; it already exists in rustc: https://doc.rust-lang.org/rustc/lints/listing/allowed-by-default.html#unused-results

@cuviper
Copy link
Member

cuviper commented Apr 22, 2020

FYI, I've added a must_use on mem::replace in #71256. It doesn't quite meet the rough guidelines here about being useless otherwise, but I made the suggestion to use direct assignment if you don't need that return value. This idea came from noticing such unused values in some of my own code.

@scottmcm
Copy link
Member

I like that change. I'd support a guideline for that kind of thing, perhaps

Add #[must_use] if there's a simpler construct that you can suggest in the message when the result is unused

That also covers things like the attribute that was added on split_off to suggest truncate.

@KodrAus KodrAus added the Libs-Tracked Libs issues that are tracked on the team's project board. label Jul 29, 2020
@m-ou-se
Copy link
Member

m-ou-se commented Oct 22, 2020

Is there any progress towards clearer guidelines for #[must_use]? #78225 adds #[must_use] for Iterator::count and Iterator::last, which seem to fall outside of the current rough guidelines. It would be nice to have some documentation to link and check PRs like that against.

@camsteffen
Copy link
Contributor

I propose the following guideline:

Add #[must_use] to a function when, for any perceivable usage, not using the return value is a logic error or sub-optimal usage.

I disagree with the following:

MUST is a very strong word. When I see #[must_use], I think that if the result isn't used, this is most likely a dangerous bug.

"must" does not indicate the severity of the bug; It indicates the level of certainty that not using the value is wrong or sub-optimal.

I believe with this guideline, Iterator::count and Iterator::last can clearly be included.

@camsteffen
Copy link
Contributor

After thinking some more, I think I can do better. I see three distinct reasons for adding #[must_use] (and all valid reasons IMHO). So here is another attempt at a comprehensive guideline:

Add #[must_use] if at least one of the following conditions is true:

  1. The function has no effect other than producing a return value (a pure function).
  2. There exists a better (more idiomatic, performant, etc.) way to do the same thing if the return value is not used.
  3. The return value can indicate an exceptional outcome.

To say more about those 3 conditions:

  1. Can be determined just by looking at the function itself. Unfortunately this is very common. But at the same time it should not cause any false positives. Examples: checked_add, Vec::new, str::trim, Iterator::map (includes builder-like patterns).

  2. Can be determined by considering other API or language features. Thus it is more nuanced, but with the right justification, it should also be a rather confident decision. Examples: mem::replace, String::split_off, Iterator::last, Iterator::collect.

  3. Can be determined by asking, "Does this function sometimes not do what it normally does?". I believe this is inescapably subjective: you need to define "normal" and "exceptional". For example, we probably don't want to include Vec::pop because None is "not very exceptional". But this is arguably the most important category for helping people avoid logic errors. By adding #[must_use], you are declaring, "If you're going to ignore the return value, you should do so explicitly." (how to idiomatically ignore a #[must_use] value is separate discussion). The good news is, for most instances, this is applied by simply returning Result. However, this category could also include mem::compare_and_swap.

Looking at the 3 conditions from another perspective:

  1. Detects noop or possibly logic errors if the function was expected to modify input.
  2. Detects non-idiomatic or sub-optimal usage.
  3. Detects logic errors.

Condition #2 seems to stand out as "not as severe". But personally I don't think this is something to get hung up on - any unused_must_use warning could easily point to a huge logic error.

Bonus condition #4: The return value is literally designed to determine control flow (looking at you, ops::ControlFlow). Seems like a very strong case for #[must_use], but for a very unique reason that may never be repeated 😆.

It might be nice to have a special annotation for #1 like #[must_use(pure)] to avoid writing a bunch of similar error messages.

@scottmcm
Copy link
Member

scottmcm commented Jan 29, 2021

@m-ou-se No, I don't know of any. It really needs someone to pick it up and find a "hits the majority" subset so that must_use doesn't just become a "put it on everything" annotation.

Unfortunately this is very common

It might be nice to have a special annotation for #‍1 like #[must_use(pure)] to avoid writing a bunch of similar error messages.

Yeah, exactly. But because it's so common -- and the downside to not using something is often so minor -- I think this is a place for a lint to heuristically do it.

We know that unused_results is too strong, but it seems very plausible to start making lints to cut down on the frequency of needing extra annotations about this stuff.

For example, we could start with a lint for any const fns that are only passed &impl Freeze or impl Copy + 'static arguments. (Idea reference above.) Basically piggy-back on existing annotation work that's already happening, but also as a separate lint so people could decide they don't care if there's a few unnecessary calls to u32::wrapping_add, say, since they're very reliably get optimized out, even if they do want to know about the more important must_use warnings.


We could also consider an #[important_side_effect] annotation that would let the heuristics be more aggressive, though logistically transitioning to that might be awkward.

@shepmaster
Copy link
Member

a lint to heuristically do it

Looking at this from the other direction, manually annotating a bunch of functions that fit into these categories (whatever they may be) could be a way to gauge the efficacy of any proposed heuristic. Effectively, this would be like how lifetime elision was established based on feedback from real world usages.

@jkugelman
Copy link
Contributor

@camsteffen just brought this discussion to my attention.
cc @joshtriplett

Recently I independently asked around on Zulip and IRLO about adding #[must_use] throughout the standard library. Several Rust team members and many community members at large gave the green light.

With that go ahead, I put together a list of around 800 changes and have been working them off bit by bit. I'm about halfway done at this point. You can see the tracking issue here:

I'd appreciate everyone's feedback as I continue working on this.

@camsteffen
Copy link
Contributor

It looks like your work is focused on the "pure function" case, which is the most straightforward.

@jkugelman
Copy link
Contributor

Done! The MCP was accepted and the tracking issue is closed. std is now chock full of #[must_use]s.

@camsteffen
Copy link
Contributor

The MCP adds clear guidelines for some cases that should have #[must_use], but I don't think it is a complete guideline. For example it would not include String::split_off. Namely cases 2 and 3 from my previous post are still open-ended. And some PRs were blocked on this discussion. What would happen to those PRs today if this issue is closed?

@jkugelman
Copy link
Contributor

jkugelman commented Nov 2, 2021

I concur. The guidelines in the MCP were for what I was using to scope my changes. I did not intend them to be proscriptive. There are certainly more cases that should be added.

I fudged my own rules a bit, even. To wit, I added a number of functions that take and return &mut when it was clear the mut is a passthrough and the function doesn't itself modify anything. For example:

HashMap::key_mut(&mut self) -> &mut K;
HashMap::get_mut(&mut self) -> &mut V;
HashMap::get_key_value(&mut self) -> (&K, &V);
HashMap::get_key_value_mut(&mut self) -> (&mut K, &mut V);

edmorley added a commit to heroku/libcnb.rs that referenced this issue Dec 8, 2021
If a semicolon discards the result of a function or method tagged with
`#[must_use]`, the compiler will emit a lint message warning that the
return value is expected to be used. 

This lint suggests adding `#[must_use]` to public functions that:
- return something that's not already marked as `must_use`
- have no mutable *reference* args (having a mutable owned arg is fine,
  since as the method took ownership, the caller can only access the result
  via the returned self, so it therefore still must be used)
- don't mutate statics

Docs:
https://doc.rust-lang.org/reference/attributes/diagnostics.html#the-must_use-attribute
https://rust-lang.github.io/rust-clippy/master/index.html#must_use_candidate

For more on best practices of when to use `must_use`, see:
rust-lang/rust#48926 (comment)

Fixes #57.
GUS-W-10222390.
edmorley added a commit to heroku/libcnb.rs that referenced this issue Dec 9, 2021
If a semicolon discards the result of a function or method tagged with
`#[must_use]`, the compiler will emit a lint message warning that the
return value is expected to be used. 

This lint suggests adding `#[must_use]` to public functions that:
- return something that's not already marked as `must_use`
- have no mutable *reference* args (having a mutable owned arg is fine,
  since as the method took ownership, the caller can only access the result
  via the returned self, so it therefore still must be used)
- don't mutate statics

Docs:
https://doc.rust-lang.org/reference/attributes/diagnostics.html#the-must_use-attribute
https://rust-lang.github.io/rust-clippy/master/index.html#must_use_candidate

For more on best practices of when to use `must_use`, see:
rust-lang/rust#48926 (comment)

Fixes #57.
GUS-W-10222390.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests