Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid placing immutable lvalues in allocas and zeroing them to cancel cleanup. #11445

Closed
eddyb opened this issue Jan 10, 2014 · 8 comments
Closed
Labels
A-codegen Area: Code generation

Comments

@eddyb
Copy link
Member

eddyb commented Jan 10, 2014

This is the current LLVM IR generated at --opt-level=0 for a seemingly noop transmute:

fn foo(x: ~[u8]) -> ~str {unsafe{transmute(x)}}
define internal { i64, i64, [0 x i8] }* @foo::h5f52f1de17121161af::v0.0({ i64, %tydesc*, i8*, i8*, i8 }*, { i64, i64, [0 x i8] }* noalias) unnamed_addr #4 {
"function top level":
  %__arg = alloca { i64, i64, [0 x i8] }*
  %__self = alloca { i64, i64, [0 x i8] }*
  %2 = alloca { i8*, i32 }
  store { i64, i64, [0 x i8] }* %1, { i64, i64, [0 x i8] }** %__arg
  %3 = load { i64, i64, [0 x i8] }** %__arg
  store { i64, i64, [0 x i8] }* %3, { i64, i64, [0 x i8] }** %__self
  %4 = bitcast { i64, i64, [0 x i8] }** %__arg to i8*
  call void @llvm.memset.p0i8.i64(i8* %4, i8 0, i64 8, i32 8, i1 false)
  %5 = load { i64, i64, [0 x i8] }** %__self
  br label %"normal return"

"normal return":                                  ; preds = %"function top level"
  %6 = bitcast { i64, i64, [0 x i8] }** %__arg to i32**
  call void @"_$UP$u32::glue_drop::hb6a3c7b062d25f8far"({}* null, i32** %6)
  ret { i64, i64, [0 x i8] }* %5

unwind:                                           ; No predecessors!
  %7 = landingpad { i8*, i32 } personality i32 (i32, i32, i64, %"struct.std::rt::unwind::libunwind::_Unwind_Exception[#1]"*, %"enum.std::rt::unwind::libunwind::_Unwind_Context[#1]"*)* @rust_eh_personality
          cleanup
  store { i8*, i32 } %7, { i8*, i32 }* %2
  br label %cleanup

cleanup:                                          ; preds = %unwind
  %8 = bitcast { i64, i64, [0 x i8] }** %__arg to i32**
  call void @"_$UP$u32::glue_drop::hb6a3c7b062d25f8far"({}* null, i32** %8)
  %9 = load { i8*, i32 }* %2
  resume { i8*, i32 } %9
}

Wherever x is moved out of, a memset call is used to zero the pointer, to "cancel the cleanup" (drop glue ignores NULL pointers and objects with a drop flag of 0).

The generated IR could look like this, without affecting semantics:

define internal { i64, i64, [0 x i8] }* @foo::h5f52f1de17121161af::v0.0({ i64, %tydesc*, i8*, i8*, i8 }*, { i64, i64, [0 x i8] }* noalias) unnamed_addr #4 {
"function top level":
  %__self = alloca { i64, i64, [0 x i8] }*
  store { i64, i64, [0 x i8] }* %1 { i64, i64, [0 x i8] }** %__self
  %2 = load { i64, i64, [0 x i8] }** %__self
  br label %"normal return"

"normal return":                                  ; preds = %"function top level"
  ret { i64, i64, [0 x i8] }* %2

(it might be even possible to remove the alloca used for the inlined transmute call)

The reason I wasn't able to implement this behavior in #11252 is the differing scope between the cleanup creation and the cleanup cancellation. It becomes obvious when we have multiple branches:

fn foo<T: Clone>(x: T) {
    if rand() % 1 == 0 {
        x // moves out of x (zeroes x to cancel cleanup)
    } else {
        x.clone() // doesn't move out of x
    }
}

@huonw recalled someone mentioning requiring to move out of x in both branches, but I don't know if that's a viable solution.

cc @nikomatsakis @pcwalton @thestinger

@alexcrichton
Copy link
Member

I'm tempted to close this as a dupe of #5016, but there's certainly a lot here. So long as we continue to zero out on movements, we can't promote anything to values (they have to be in alloca slots)

I think that @nikomatsakis is working on a patch that does this and requires that you move out of a variable in all branches (to completely eliminate zeroing on movement)

@eddyb
Copy link
Member Author

eddyb commented Jan 10, 2014

I found out about #5016 a bit too late, but yeah, that's the main blocker. Pretty much any immutable lvalue that is passed by value (i.e. fits in a register) could otherwise be a ByValue datum.

I did take a look at @nikomatsakis' patch and cleanup does look a little better, but it still uses zeroing for Lvalues (it actually makes it clear that's the only way to deal with Lvalues, at least in the current implementation).

@nikomatsakis
Copy link
Contributor

I am working towards this goal, though my current branch does not yet remove zeroing, it just makes the semantics and lifetimes of temporaries better defined and more easily controlled. That said -- we always keep local variables in allocas and I don't expect this to change, because it enables better integration with debuginfo.

@eddyb
Copy link
Member Author

eddyb commented Jan 14, 2014

That debuginfo integration is only required when compiling with -Z extra-debug-info - at that point, the compile/runtime cost is understandable.

@nikomatsakis
Copy link
Contributor

It doesn't seem worth it to me to treat local variables differently depending on whether -Z extra-debug-info is set or not. That said, it may be that once we implement #5016, it's not a big change to make it possible to have by-ref or by-value lvalues, in which case I'm not opposed, but I think readability and general cleanliness of trans takes top priority in this case.

Another way to cleanup that generated code would be to do a simple analysis that detects infallible functions and avoids generating landing pads. We'd probably have to store fallibility in the crate metadata, but it need not affect language semantics in any way.

@thestinger
Copy link
Contributor

@nikomatsakis: we can put debug info on registers with llvm.dbg.value

@steveklabnik steveklabnik added the A-codegen Area: Code generation label Jan 23, 2015
@steveklabnik
Copy link
Member

Triage: updated code:

fn foo(x: Vec<u8>) -> String {unsafe{std::mem::transmute(x)}}

llvm ir:

define internal void @_ZN3foo20h33c0f4c7a8c48249eaaE(%"3.collections::string::String"* noalias nocapture sret dereferenceable(24), %"3.collections::vec::Vec<u8>"* noalias nocapture dereferenceable(24) %x) unnamed_addr #0 {
entry-block:
  %dropflag_hint_6 = alloca i8
  store i8 61, i8* %dropflag_hint_6
  %1 = bitcast %"3.collections::string::String"* %0 to %"3.collections::vec::Vec<u8>"*
  %2 = bitcast %"3.collections::vec::Vec<u8>"* %x to i8*
  %3 = bitcast %"3.collections::vec::Vec<u8>"* %1 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %3, i8* %2, i64 24, i32 8, i1 false)
  %4 = bitcast %"3.collections::vec::Vec<u8>"* %x to i8*
  call void @llvm.memset.p0i8.i64(i8* %4, i8 29, i64 24, i32 8, i1 false)
  call void @"_ZN31collections..vec..Vec$LT$u8$GT$9drop.359317hfad10bb9521885b9E"(%"3.collections::vec::Vec<u8>"* %x)
  ret void
}

It looks very different, is this fixed?

@eddyb
Copy link
Member Author

eddyb commented Feb 2, 2016

Yes, static and dynamic drop have both gotten much better since (although filling drop still exists in marginal cases).

@eddyb eddyb closed this as completed Feb 2, 2016
flip1995 pushed a commit to flip1995/rust that referenced this issue Sep 7, 2023
fix some comments

Thank you for making Clippy better!

We're collecting our changelog from pull request descriptions.
If your PR only includes internal changes, you can just write
`changelog: none`. Otherwise, please write a short comment
explaining your change.

It's also helpful for us that the lint name is put within backticks (`` ` ` ``),
and then encapsulated by square brackets (`[]`), for example:
```
changelog: [`lint_name`]: your change
```

If your PR fixes an issue, you can add `fixes #issue_number` into this
PR description. This way the issue will be automatically closed when
your PR is merged.

If you added a new lint, here's a checklist for things that will be
checked during review or continuous integration.

- \[x] Followed [lint naming conventions][lint_naming]
- \[x] Added passing UI tests (including committed `.stderr` file)
- \[x] `cargo test` passes locally
- \[x] Executed `cargo dev update_lints`
- \[x] Added lint documentation
- \[x] Run `cargo dev fmt`

[lint_naming]: https://rust-lang.github.io/rfcs/0344-conventions-galore.html#lints

Note that you can skip the above if you are just opening a WIP PR in
order to get feedback.

Delete this line and everything above before opening your PR.

---

*Please write a short comment explaining your change (or "none" for internal only changes)*

changelog: none
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation
Projects
None yet
Development

No branches or pull requests

5 participants