Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miscompilation of SIMD when crossing target_feature boundaries #55059

Closed
raphlinus opened this issue Oct 14, 2018 · 7 comments · Fixed by #55073
Closed

Miscompilation of SIMD when crossing target_feature boundaries #55059

raphlinus opened this issue Oct 14, 2018 · 7 comments · Fixed by #55073

Comments

@raphlinus
Copy link
Contributor

This is a reduced example of a problem I've run into trying to make safe SIMD wrappers. The idea here is to have a newtype that can only be constructed when the capability is dynamically detected. However, the compiler seems to get confused about calling conventions when calling into code with target_feature enabled from code that doesn't.

#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

#[target_feature(enable = "avx")]
unsafe fn avx_mul(a: __m256, b: __m256) -> __m256 {
    _mm256_mul_ps(a, b)
}

#[target_feature(enable = "avx")]
unsafe fn avx_store(p: *mut f32, a: __m256) {
    _mm256_storeu_ps(p, a)
}

#[target_feature(enable = "avx")]
unsafe fn avx_setr(a: f32, b: f32, c: f32, d: f32, e: f32, f: f32, g: f32, h: f32) -> __m256 {
    _mm256_setr_ps(a, b, c, d, e, f, g, h)
}

#[target_feature(enable = "avx")]
unsafe fn avx_set1(a: f32) -> __m256 {
    _mm256_set1_ps(a)
}

struct Avx(__m256);

fn mul(a: Avx, b: Avx) -> Avx {
    unsafe { Avx(avx_mul(a.0, b.0)) }
}

fn set1(a: f32) -> Avx {
    unsafe { Avx(avx_set1(a)) }
}

fn setr(a: f32, b: f32, c: f32, d: f32, e: f32, f: f32, g: f32, h: f32) -> Avx {
    unsafe { Avx(avx_setr(a, b, c, d, e, f, g, h)) }
}

unsafe fn store(p: *mut f32, a: Avx) {
    avx_store(p, a.0);
}

pub fn main() {
    let mut result = [0.0f32; 8];
    let a = mul(setr(0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0), set1(0.25));
    unsafe { store(result.as_mut_ptr(), a)}
    println!("{:?}", result);
}

(Playground)

Output:

[0.0, 5.0, 12.0, 21.0, 0.0, 0.0, 0.0, 0.0]

Errors:

   Compiling playground v0.0.1 (file:///playground)
    Finished release [optimized] target(s) in 0.59s
     Running `target/release/playground`

In a debug build, the answer is [0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75] as expected. Notice that the first 3 values of the miscompiled version are [0, 1, 2, 3] * [4, 5, 6, 7], suggesting that the halves are getting scrambled (and this is confirmed by looking at the generated asm).

Also, this just crashes on Windows.

Same miscompilation happens if I move the Avx() newtype wrapper up into the top four functions.

It's possible I don't understand the rules for what's safe to do in SIMD. If that's the case, the limitations on passing values across function boundaries should be documented.

@hanna-kruppe
Copy link
Contributor

Looks like #50154 (the newtype is irrelevant for how the vectors are passed)

@raphlinus
Copy link
Contributor Author

@rkruppe You are correct, thanks for the identification. This is extremely unfortunate, as I think it's going to affect most efforts to wrap SIMD in safe, generic abstractions. It generally won't hit the more traditional technique of writing an unsafe block by hand, where the whole block is inside the target_feature, but even there the guarantees are weak.

I started working on a fearless_simd crate, but this bug makes the name of that a lie until it's fixed.

alexcrichton added a commit to alexcrichton/rust that referenced this issue Oct 14, 2018
The issue of passing around SIMD types as values between functions has
seen [quite a lot] of [discussion], and although we thought [we fixed
it][quite a lot] it [wasn't]! This PR is a change to rustc to, again,
try to fix this issue.

The fundamental problem here remains the same, if a SIMD vector argument
is passed by-value in LLVM's function type, then if the caller and
callee disagree on target features a miscompile happens. We solve this
by never passing SIMD vectors by-value, but LLVM will still thwart us
with its argument promotion pass to promote by-ref SIMD arguments to
by-val SIMD arguments.

This commit is an attempt to thwart LLVM thwarting us. We, just before
codegen, will take yet another look at the LLVM module and demote any
by-value SIMD arguments we see. This is a very manual attempt by us to
ensure the codegen for a module keeps working, and it unfortunately is
likely producing suboptimal code, even in release mode. The saving grace
for this, in theory, is that if SIMD types are passed by-value across
a boundary in release mode it's pretty unlikely to be performance
sensitive (as it's already doing a load/store, and otherwise
perf-sensitive bits should be inlined).

The implementation here is basically a big wad of C++. It was largely
copied from LLVM's own argument promotion pass, only doing the reverse.
In local testing this...

Closes rust-lang#50154
Closes rust-lang#52636
Closes rust-lang#54583
Closes rust-lang#55059

[quite a lot]: rust-lang#47743
[discussion]: rust-lang#44367
[wasn't]: rust-lang#50154
bors added a commit that referenced this issue Oct 14, 2018
rustc: Fix (again) simd vectors by-val in ABI

The issue of passing around SIMD types as values between functions has
seen [quite a lot] of [discussion], and although we thought [we fixed
it][quite a lot] it [wasn't]! This PR is a change to rustc to, again,
try to fix this issue.

The fundamental problem here remains the same, if a SIMD vector argument
is passed by-value in LLVM's function type, then if the caller and
callee disagree on target features a miscompile happens. We solve this
by never passing SIMD vectors by-value, but LLVM will still thwart us
with its argument promotion pass to promote by-ref SIMD arguments to
by-val SIMD arguments.

This commit is an attempt to thwart LLVM thwarting us. We, just before
codegen, will take yet another look at the LLVM module and demote any
by-value SIMD arguments we see. This is a very manual attempt by us to
ensure the codegen for a module keeps working, and it unfortunately is
likely producing suboptimal code, even in release mode. The saving grace
for this, in theory, is that if SIMD types are passed by-value across
a boundary in release mode it's pretty unlikely to be performance
sensitive (as it's already doing a load/store, and otherwise
perf-sensitive bits should be inlined).

The implementation here is basically a big wad of C++. It was largely
copied from LLVM's own argument promotion pass, only doing the reverse.
In local testing this...

Closes #50154
Closes #52636
Closes #54583
Closes #55059

[quite a lot]: #47743
[discussion]: #44367
[wasn't]: #50154
alexcrichton added a commit to alexcrichton/rust that referenced this issue Oct 16, 2018
The issue of passing around SIMD types as values between functions has
seen [quite a lot] of [discussion], and although we thought [we fixed
it][quite a lot] it [wasn't]! This PR is a change to rustc to, again,
try to fix this issue.

The fundamental problem here remains the same, if a SIMD vector argument
is passed by-value in LLVM's function type, then if the caller and
callee disagree on target features a miscompile happens. We solve this
by never passing SIMD vectors by-value, but LLVM will still thwart us
with its argument promotion pass to promote by-ref SIMD arguments to
by-val SIMD arguments.

This commit is an attempt to thwart LLVM thwarting us. We, just before
codegen, will take yet another look at the LLVM module and demote any
by-value SIMD arguments we see. This is a very manual attempt by us to
ensure the codegen for a module keeps working, and it unfortunately is
likely producing suboptimal code, even in release mode. The saving grace
for this, in theory, is that if SIMD types are passed by-value across
a boundary in release mode it's pretty unlikely to be performance
sensitive (as it's already doing a load/store, and otherwise
perf-sensitive bits should be inlined).

The implementation here is basically a big wad of C++. It was largely
copied from LLVM's own argument promotion pass, only doing the reverse.
In local testing this...

Closes rust-lang#50154
Closes rust-lang#52636
Closes rust-lang#54583
Closes rust-lang#55059

[quite a lot]: rust-lang#47743
[discussion]: rust-lang#44367
[wasn't]: rust-lang#50154
kennytm added a commit to kennytm/rust that referenced this issue Oct 18, 2018
rustc: Fix (again) simd vectors by-val in ABI

The issue of passing around SIMD types as values between functions has
seen [quite a lot] of [discussion], and although we thought [we fixed
it][quite a lot] it [wasn't]! This PR is a change to rustc to, again,
try to fix this issue.

The fundamental problem here remains the same, if a SIMD vector argument
is passed by-value in LLVM's function type, then if the caller and
callee disagree on target features a miscompile happens. We solve this
by never passing SIMD vectors by-value, but LLVM will still thwart us
with its argument promotion pass to promote by-ref SIMD arguments to
by-val SIMD arguments.

This commit is an attempt to thwart LLVM thwarting us. We, just before
codegen, will take yet another look at the LLVM module and demote any
by-value SIMD arguments we see. This is a very manual attempt by us to
ensure the codegen for a module keeps working, and it unfortunately is
likely producing suboptimal code, even in release mode. The saving grace
for this, in theory, is that if SIMD types are passed by-value across
a boundary in release mode it's pretty unlikely to be performance
sensitive (as it's already doing a load/store, and otherwise
perf-sensitive bits should be inlined).

The implementation here is basically a big wad of C++. It was largely
copied from LLVM's own argument promotion pass, only doing the reverse.
In local testing this...

Closes rust-lang#50154
Closes rust-lang#52636
Closes rust-lang#54583
Closes rust-lang#55059

[quite a lot]: rust-lang#47743
[discussion]: rust-lang#44367
[wasn't]: rust-lang#50154
alexcrichton added a commit to alexcrichton/rust that referenced this issue Oct 19, 2018
The issue of passing around SIMD types as values between functions has
seen [quite a lot] of [discussion], and although we thought [we fixed
it][quite a lot] it [wasn't]! This PR is a change to rustc to, again,
try to fix this issue.

The fundamental problem here remains the same, if a SIMD vector argument
is passed by-value in LLVM's function type, then if the caller and
callee disagree on target features a miscompile happens. We solve this
by never passing SIMD vectors by-value, but LLVM will still thwart us
with its argument promotion pass to promote by-ref SIMD arguments to
by-val SIMD arguments.

This commit is an attempt to thwart LLVM thwarting us. We, just before
codegen, will take yet another look at the LLVM module and demote any
by-value SIMD arguments we see. This is a very manual attempt by us to
ensure the codegen for a module keeps working, and it unfortunately is
likely producing suboptimal code, even in release mode. The saving grace
for this, in theory, is that if SIMD types are passed by-value across
a boundary in release mode it's pretty unlikely to be performance
sensitive (as it's already doing a load/store, and otherwise
perf-sensitive bits should be inlined).

The implementation here is basically a big wad of C++. It was largely
copied from LLVM's own argument promotion pass, only doing the reverse.
In local testing this...

Closes rust-lang#50154
Closes rust-lang#52636
Closes rust-lang#54583
Closes rust-lang#55059

[quite a lot]: rust-lang#47743
[discussion]: rust-lang#44367
[wasn't]: rust-lang#50154
@gnzlbg
Copy link
Contributor

gnzlbg commented Oct 20, 2018

@rkruppe AFAICT this example has undefined (or target dependent) behavior. This:

fn mul(a: Avx, b: Avx) -> Avx {
    unsafe { Avx(avx_mul(a.0, b.0)) }
}

calls #[target_feature(enable = "avx")] unconditionally. Fixing the undefined behavior would probably still lead to incorrect code gen due to #50154 , but fixing that bug would not remove the undefined / target-dependent behavior from this one.


@raphlinus If you compile the example with -C target-feature=+avx undefined behavior should not trigger independently of the optimization level that you use.

@hanna-kruppe
Copy link
Contributor

@gnzlbg This part of the issue text implies to me that the runtime check for the target feature being present is just omitted for brevity:

The idea here is to have a newtype that can only be constructed when the capability is dynamically detected.

@raphlinus
Copy link
Contributor Author

No, the runtime check is not omitted for brevity. The Avx type is explicitly designed so that it's impossible to instantiate a value of it unless the avx feature has been detected; all of the constructors for it are unsafe with the exception of those that do the dynamic check.

Not requiring compile time feature setting is an explicit goal of this work. Of course, it's possible I'm misunderstanding something, but wanted to make sure my intent here is clear.

@hanna-kruppe
Copy link
Contributor

No, the runtime check is not omitted for brevity. The Avx type is explicitly designed so that it's impossible to instantiate a value of it unless the avx feature has been detected; all of the constructors for it are unsafe with the exception of those that do the dynamic check.

I specifically meant that the constructors that do the runtime checks are omitted. The program in the issue text contains no checks, static or dynamic, for any target_feature.

@raphlinus
Copy link
Contributor Author

Ah, then we're on the same page. Sorry for any confusion.

Manishearth added a commit to Manishearth/rust that referenced this issue Oct 20, 2018
The issue of passing around SIMD types as values between functions has
seen [quite a lot] of [discussion], and although we thought [we fixed
it][quite a lot] it [wasn't]! This PR is a change to rustc to, again,
try to fix this issue.

The fundamental problem here remains the same, if a SIMD vector argument
is passed by-value in LLVM's function type, then if the caller and
callee disagree on target features a miscompile happens. We solve this
by never passing SIMD vectors by-value, but LLVM will still thwart us
with its argument promotion pass to promote by-ref SIMD arguments to
by-val SIMD arguments.

This commit is an attempt to thwart LLVM thwarting us. We, just before
codegen, will take yet another look at the LLVM module and demote any
by-value SIMD arguments we see. This is a very manual attempt by us to
ensure the codegen for a module keeps working, and it unfortunately is
likely producing suboptimal code, even in release mode. The saving grace
for this, in theory, is that if SIMD types are passed by-value across
a boundary in release mode it's pretty unlikely to be performance
sensitive (as it's already doing a load/store, and otherwise
perf-sensitive bits should be inlined).

The implementation here is basically a big wad of C++. It was largely
copied from LLVM's own argument promotion pass, only doing the reverse.
In local testing this...

Closes rust-lang#50154
Closes rust-lang#52636
Closes rust-lang#54583
Closes rust-lang#55059

[quite a lot]: rust-lang#47743
[discussion]: rust-lang#44367
[wasn't]: rust-lang#50154
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants