Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reintroduce into_future in .await desugaring #90737

Merged
merged 5 commits into from
Dec 3, 2021
Merged

Conversation

eholk
Copy link
Contributor

@eholk eholk commented Nov 9, 2021

This is a reintroduction of the remaining parts from #65244 that have not been relanded yet.

This isn't quite ready to merge yet. The last attempt was reverting due to performance regressions, so we need to make sure this does not introduce those issues again.

Issues #67644, #67982

/cc @yoshuawuyts

@rust-highfive
Copy link
Collaborator

r? @michaelwoerister

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 9, 2021
@wesleywiser
Copy link
Member

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Nov 9, 2021
@bors
Copy link
Contributor

bors commented Nov 9, 2021

⌛ Trying commit 60f29450eeca029679562a3a0914975e5d8303a4 with merge 63f6208b5cec7496fd972de8245a4ff9d4da9e8b...

@bors
Copy link
Contributor

bors commented Nov 9, 2021

☀️ Try build successful - checks-actions
Build commit: 63f6208b5cec7496fd972de8245a4ff9d4da9e8b (63f6208b5cec7496fd972de8245a4ff9d4da9e8b)

@rust-timer
Copy link
Collaborator

Queued 63f6208b5cec7496fd972de8245a4ff9d4da9e8b with parent d608229, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (63f6208b5cec7496fd972de8245a4ff9d4da9e8b): comparison url.

Summary: This change led to very large relevant regressions 😿 in compiler performance.

  • Very large regression in instruction counts (up to 9.4% on incr-full builds of deeply-nested-async)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Nov 10, 2021
@yoshuawuyts
Copy link
Member

yoshuawuyts commented Nov 10, 2021

Neat; we're doing a lot better than last time, where we had a 700% performance regression which basically made async/await unusable. This time around we're doing a lot better, and some amount of performance regression should probably be expected since we're actually doing more work.

I went in and verified that the output assembly is the same for both examples. Even though we're doing more work, we don't seem to be missing out on optimizing any of the generated code. [Future codegen | IntoFuture codegen | gist]

Personally I feel like this regression is small enough that we should prioritize landing this part of the async/.await RFC. It's been almost two years since we disabled IntoFuture as part of a short term fix, with little movement since. And comparing futures performance today to what it was back then, even with IntoFuture semantics enabled async/.await still compiles many times faster than it did in 20191.

How do other folks feel about this performance regression? Are there potentially any low-hanging performance optimizations we could apply to speed this up? If there are none, would folks be okay if we proposed to merge this sooner and optimize later?

Footnotes

  1. If I understand the table right, -95% means "20x as fast".

@eholk
Copy link
Contributor Author

eholk commented Nov 10, 2021

If I'm reading the detailed performance diff right, it looks like a lot of the extra time is being spent in evaluate_obligation and normalize_projection_ty. Given that this PR causes each await to introduce a new requirement for IntoFuture, this seems expected.

I'm not sure what the usual guidelines for accepting a regression are, but I'd be in favor of taking this one. Most of the benchmarks showed little change. The biggest regression was in deeply-nested-async, which is a pattern that probably doesn't come up too often (although the benchmark is based off a real-world scenario, so it's no unheard of). It would be nice to implement this part of the async/await RFC, even if that comes at a bit of a cost.

As far as ideas for reducing the regression, I think it'd be worth looking at the way generators get converted to futures. Maybe there's something that could be done that's symbiotic with this change and would lead to less work being done in total.

@eholk eholk marked this pull request as ready for review November 10, 2021 18:43
@michaelwoerister
Copy link
Member

Thanks for the PR, @eholk & @yoshuawuyts! This looks good to me. It would be nice to reduce the regression but it seems acceptable. The compiler just has to do more work so it is expected that compile time increase for the cases where this is exercised.

I am not particularly familiar with lowering (especially wrt async) so it would be good to have another pair of eyes on this:
r? @tmandry

@apiraino apiraino added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Nov 11, 2021
assert_eq!(7181, std::mem::size_of_val(&mixed_sizes()));
assert_eq!(3076, std::mem::size_of_val(&joined()));
assert_eq!(3076, std::mem::size_of_val(&joined_with_noop()));
assert_eq!(6157, std::mem::size_of_val(&mixed_sizes()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unexpected! Do you know why these sizes change?

It looks like an improvement but I'd like to understand what's going on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We weren't sure either! The previous PR also shrunk the size of the generated futures, without really commenting on it either. It's particularly surprising because neither the numbers before or after this changed are aligned the way we expected them to.

We reasoned that this change would be alright since it doesn't regress, there is precedence for accepting this in #65244, and it doesn't lead to any other changes as far as we can tell. Even if we don't fully understand why the current results are what they are, or what is causing the change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if this is right, but one guess is it has something to do with the match <future> { ... } becoming match InfoFuture::into_future(<future>) { ... }. Maybe the into_future call moves the future into the scrutinee position of the match, rather than borrowing from the outer scope?

Judging by the comment on this test, it seems like the first two changes aren't a big deal, since they are just a few bytes. The last change is surprising because it improves by exactly 1024. This means we were able to eliminate one of the BigFuts from the closure. By my count we should only need 5, so I'm not sure why we have 6 copies now or why we had 7 copies before. I wouldn't be surprised to see 8 or 5, I guess, but 6 and 7 are both weird. Could we be smarter about laying out the closures? Maybe there's a move from one slot to another that's forced by the way we laid out the closures?

Copy link
Member

@tmandry tmandry Nov 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting theory, if we're turning a borrow into a move that can definitely improve things! See #62321 for instance – if I recall correctly, borrowing a local in a generator causes the analysis to assume the local is always live from that point, preventing it from reclaiming the storage.

src/test/ui/async-await/issue-70594.stderr Outdated Show resolved Hide resolved
@bors
Copy link
Contributor

bors commented Nov 21, 2021

☔ The latest upstream changes (presumably #89580) made this pull request unmergeable. Please resolve the merge conflicts.

This is a reintroduction of the remaining parts from
rust-lang#65244 that have not been relanded
yet.

Issues rust-langGH-67644, rust-langGH-67982
Copy link
Member

@tmandry tmandry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

You can r=me after comments, also happy to take another look.

src/test/ui/async-await/await-into-future.rs Outdated Show resolved Hide resolved
src/test/ui/async-await/issue-70594.stderr Show resolved Hide resolved
Add a note about `IntoFuture` in error messages where T is not a future.

Change await-into-future.rs to be a run-pass test.
@eholk
Copy link
Contributor Author

eholk commented Dec 2, 2021

Thanks for the suggestions. I believe they are all addressed, so feel free to take a look. I don't know if I have permission to do r=me, but I'll try it out to see.

@bors r=tmandry

@rust-log-analyzer

This comment has been minimized.

@eholk
Copy link
Contributor Author

eholk commented Dec 3, 2021

@bors r=tmandry

@bors
Copy link
Contributor

bors commented Dec 3, 2021

@eholk: 🔑 Insufficient privileges: Not in reviewers

@tmandry
Copy link
Member

tmandry commented Dec 3, 2021

@bors r+

@bors
Copy link
Contributor

bors commented Dec 3, 2021

📌 Commit 3268bf1 has been approved by tmandry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 3, 2021
@bors
Copy link
Contributor

bors commented Dec 3, 2021

⌛ Testing commit 3268bf1 with merge 532d2b1...

@bors
Copy link
Contributor

bors commented Dec 3, 2021

☀️ Test successful - checks-actions
Approved by: tmandry
Pushing 532d2b1 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Dec 3, 2021
@bors bors merged commit 532d2b1 into rust-lang:master Dec 3, 2021
@rustbot rustbot added this to the 1.59.0 milestone Dec 3, 2021
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (532d2b1): comparison url.

Summary: This change led to very large relevant regressions 😿 in compiler performance.

  • Very large regression in instruction counts (up to 9.3% on incr-full builds of deeply-nested-async)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression

@eholk
Copy link
Contributor Author

eholk commented Dec 4, 2021

@rustbot label: +perf-regression-triaged

We saw similar regressions during the earlier try job and decided it they were acceptable. See this earlier comment for some details. The summary is that the regressions are only in async heavy code and are expected because we are generating code that introduces more trait obligations. Overall, the performance is still much faster than the earlier attempt to land this change, and there are a lot of users requesting this feature, so it's worth taking the regression.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.