PhantomData fields in repr(C) structs change ABI on aarch64 #56877

Closed
glandium opened this issue Dec 16, 2018 · 49 comments
Assignees
Labels
A-FFI Area: Foreign function interface (FFI) A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. P-high High priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@glandium
Contributor

Take this testcase:

#[repr(C)]
pub struct Foo {
    pub a: f32,
    pub b: f32,
}

#[no_mangle]
pub extern "C" fn foo(f: Foo) -> bool {
    f.a != f.b
}

#[repr(C)]
pub struct Bar {
    pub a: f32,
    pub b: f32,
    pub _unit: std::marker::PhantomData<()>,
}

#[no_mangle]
pub extern "C" fn bar(f: Bar) -> bool {
    f.a != f.b
}

Compile with:

rustc +nightly --target=aarch64-pc-windows-msvc --emit asm test.rs --crate-type staticlib -C opt-level=2

And check the output test.s file:

	.text
	.section	.text,"xr",one_only,foo
	.globl	foo
	.p2align	2
foo:
.seh_proc foo
	fcmp	s0, s1
	cset	w0, ne
	ret
	.section	.xdata,"dr",associative,foo
	.seh_handlerdata
	.section	.text,"xr",one_only,foo
	.seh_endproc

	.section	.text,"xr",one_only,bar
	.globl	bar
	.p2align	2
bar:
.seh_proc bar
	lsr	x8, x0, #32
	fmov	s0, w0
	fmov	s1, w8
	fcmp	s0, s1
	cset	w0, ne
	ret
	.section	.xdata,"dr",associative,bar
	.seh_handlerdata
	.section	.text,"xr",one_only,bar
	.seh_endproc

One would expect foo and bar to have the same code, but that's not the case here.
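For context, a minimal sketch showing that the two structs have the same in-memory layout, so the difference above is in the call ABI only. The helper `layout_of` is a name introduced here for illustration; the sizes in the comments assume a mainstream target where `f32` is 4 bytes with 4-byte alignment.

```rust
use std::marker::PhantomData;
use std::mem::{align_of, size_of};

#[repr(C)]
pub struct Foo {
    pub a: f32,
    pub b: f32,
}

#[repr(C)]
pub struct Bar {
    pub a: f32,
    pub b: f32,
    pub _unit: PhantomData<()>,
}

/// Hypothetical helper: returns (size, alignment) of `T` in bytes.
pub fn layout_of<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}
```

On such a target both `layout_of::<Foo>()` and `layout_of::<Bar>()` give `(8, 4)`, which is why tools like bindgen and cbindgen can treat the `PhantomData` field as absent.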

This is the root cause of https://bugzilla.mozilla.org/show_bug.cgi?id=1512519

@glandium
Contributor Author

Here is the non-optimized llvm-ir:

target datalayout = "e-m:w-p:64:64-i32:32-i64:64-i128:128-n32:64-S128"
target triple = "aarch64-pc-windows-msvc"

%Bar = type { [0 x i32], float, [0 x i32], float, [0 x i8], %"core::marker::PhantomData<()>", [0 x i8] }
%"core::marker::PhantomData<()>" = type {}

; Function Attrs: nounwind uwtable
define zeroext i1 @foo([2 x float]) unnamed_addr #0 {
start:
  %abi_cast = alloca [2 x float], align 4
  %f = alloca { float, float }, align 4
  store [2 x float] %0, [2 x float]* %abi_cast, align 4
  %1 = bitcast { float, float }* %f to i8* 
  %2 = bitcast [2 x float]* %abi_cast to i8* 
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %1, i8* align 4 %2, i64 8, i1 false)
  %3 = bitcast { float, float }* %f to float*
  %4 = load float, float* %3, align 4
  %5 = getelementptr inbounds { float, float }, { float, float }* %f, i32 0, i32 1
  %6 = load float, float* %5, align 4
  %7 = fcmp une float %4, %6
  ret i1 %7
}

; Function Attrs: nounwind uwtable
define zeroext i1 @bar(i64) unnamed_addr #0 {
start:
  %abi_cast = alloca i64, align 8
  %f = alloca %Bar, align 4
  store i64 %0, i64* %abi_cast, align 8
  %1 = bitcast %Bar* %f to i8* 
  %2 = bitcast i64* %abi_cast to i8* 
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %1, i8* align 8 %2, i64 8, i1 false)
  %3 = bitcast %Bar* %f to float*
  %4 = load float, float* %3, align 4
  %5 = getelementptr inbounds %Bar, %Bar* %f, i32 0, i32 3
  %6 = load float, float* %5, align 4
  %7 = fcmp une float %4, %6
  ret i1 %7
}

For reference, the llvm-ir for the same source code, for x86_64:

target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%Bar = type { [0 x i32], float, [0 x i32], float, [0 x i8], %"core::marker::PhantomData<()>", [0 x i8] }
%"core::marker::PhantomData<()>" = type {}

; Function Attrs: nounwind nonlazybind uwtable
define zeroext i1 @foo(double) unnamed_addr #0 {
start:
  %abi_cast = alloca double, align 8
  %f = alloca { float, float }, align 4
  store double %0, double* %abi_cast, align 8
  %1 = bitcast { float, float }* %f to i8* 
  %2 = bitcast double* %abi_cast to i8* 
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %1, i8* align 8 %2, i64 8, i1 false)
  %3 = bitcast { float, float }* %f to float*
  %4 = load float, float* %3, align 4
  %5 = getelementptr inbounds { float, float }, { float, float }* %f, i32 0, i32 1
  %6 = load float, float* %5, align 4
  %7 = fcmp une float %4, %6
  ret i1 %7
}

; Function Attrs: nounwind nonlazybind uwtable
define zeroext i1 @bar(double) unnamed_addr #0 {
start:
  %abi_cast = alloca double, align 8
  %f = alloca %Bar, align 4
  store double %0, double* %abi_cast, align 8
  %1 = bitcast %Bar* %f to i8* 
  %2 = bitcast double* %abi_cast to i8* 
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %1, i8* align 8 %2, i64 8, i1 false)
  %3 = bitcast %Bar* %f to float*
  %4 = load float, float* %3, align 4
  %5 = getelementptr inbounds %Bar, %Bar* %f, i32 0, i32 3
  %6 = load float, float* %5, align 4
  %7 = fcmp une float %4, %6
  ret i1 %7
}

The notable difference is i64 vs. double. Note that the above is the output of --emit llvm-ir with no opt-level. I don't know if that matches what rustc passes to llvm. (I don't remember what the option to output the llvm-ir before/after all passes is)

@glandium
Contributor Author

It's interesting to note that x86-64 takes the two floats as a single double, while aarch64 takes two floats, but the main problem here is that bar is taking an i64 instead of two floats on aarch64...

@glandium
Contributor Author

BTW, this also happens on aarch64-unknown-linux-gnu, and I've now confirmed that the IR before the first optimization pass is the same.

@estebank estebank added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. A-FFI Area: Foreign function interface (FFI) labels Dec 17, 2018
@parched
Contributor

parched commented Dec 17, 2018

This happens because homogeneous aggregates are treated differently in the PCS compared to inhomogeneous aggregates. Whether this is a bug or not depends on how zero-sized types are defined to behave in #[repr(C)] types. Should they be completely ignored?

@emilio
Contributor

emilio commented Dec 17, 2018

Yes, or at least that's the assumption that all of bindgen / cbindgen / the improper_ctypes lint make.

@nikomatsakis nikomatsakis added I-nominated T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 18, 2018
@nikomatsakis
Contributor

Nominating: this is likely to be a very high priority for FF over the next month or so, and if we can get a fix in now that will be a big win.

@parched
Contributor

parched commented Dec 18, 2018

FWIW the code to fix is here. The question is: should just PhantomData be ignored, or all ZSTs, or just ZSTs with alignment=1?

EDIT: or actually here but perhaps PhantomData should be stripped out of the type much earlier.

@glandium
Contributor Author

glandium commented Dec 18, 2018

There's nothing platform-dependent in that code, how come this only affects aarch64?

Edit: Oh, right #56877 (comment)

@pnkfelix
Member

discussed at T-compiler meeting. P-high. assigning to self to make sure it doesn't get lost. (@nikomatsakis says "maybe we're close to a fix", which I assume is based on the comment thread here).

@pnkfelix pnkfelix self-assigned this Dec 20, 2018
@pnkfelix pnkfelix added P-high High priority and removed I-nominated labels Dec 20, 2018
@jrmuizel
Contributor

FWIW, cbindgen explicitly ignores PhantomData. Other ZSTs are not ignored. However, that seems like a bug in cbindgen: mozilla/cbindgen#262

@arielb1
Contributor

arielb1 commented Dec 21, 2018

FWIW the code to fix is here. The question is: should just PhantomData be ignored, or all ZSTs, or just ZSTs with alignment=1?

If the struct has actual padding, then it won't be a homogeneous aggregate. Therefore, the only risk here is for structs whose alignment is increased by a ZST, but only up to the size of the type itself. I don't think there's a hazard in regarding these as homogeneous aggregates.

For example,

#[repr(C)]
struct Foo {
    a: f32,
    b: f32,
    c: [f64; 0]
}
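To make the alignment point concrete, here is a small sketch that compares the struct above with the same struct minus its zero-sized field. The names `Plain`, `WithZst`, `plain_layout` and `with_zst_layout` are introduced here for illustration only. On aarch64 and x86_64, where `f64` is 8-byte aligned, the ZST raises the alignment from 4 to 8 while the size stays at 8.

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
pub struct Plain {
    pub a: f32,
    pub b: f32,
}

#[repr(C)]
pub struct WithZst {
    pub a: f32,
    pub b: f32,
    // Zero-sized, but still contributes f64's alignment to the struct.
    pub c: [f64; 0],
}

/// (size, alignment) of the struct without the zero-sized field.
pub fn plain_layout() -> (usize, usize) {
    (size_of::<Plain>(), align_of::<Plain>())
}

/// (size, alignment) of the struct with the zero-sized `[f64; 0]` field.
pub fn with_zst_layout() -> (usize, usize) {
    (size_of::<WithZst>(), align_of::<WithZst>())
}
```

The alignment grows only up to the size of the type itself, so no padding is introduced, which is the case described above.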

EDIT: or actually here but perhaps PhantomData should be stripped out of the type much earlier.

No. self.field has to return fields in source order for things to work.

@parched
Contributor

parched commented Dec 21, 2018

Therefore, the only risk here is for structs whose alignment is increased by a ZST, but only up to the size of the type itself. I don't think there's a hazard in regarding these as homogeneous aggregates.

Unfortunately I don't think that's the case. In C

struct X {
    float a;
    float b;
    int x []; // or int x[0]
};

is not a homogeneous aggregate. I think we have to have a special case just for PhantomData.

@arielb1
Contributor

arielb1 commented Dec 21, 2018

I think we have to have a special case just for PhantomData.

Or some other kind of emptiness-tracking that treats [T; 0] as being ABI-relevant while not treating PhantomData as being ABI-relevant.

@nagisa nagisa assigned nikomatsakis and unassigned pnkfelix Jan 3, 2019
@nagisa
Member

nagisa commented Jan 3, 2019

Discussed at the T-compiler. It feels as if this needs a strategy of some sort before any implementation work can proceed.

@nikomatsakis
Contributor

Looking over the homogeneous-aggregate code, it already checks the following conditions:

  • Each field maps to same "unit" (register, etc) of size U
  • The offset of field F is equal to U*F (i.e., no padding between the fields)
  • The total size of the struct is equal to U*N where N is the number of fields (i.e., no padding at the end)

It seems like we could basically keep all of these conditions, but filter the field list to those with non-zero size, and everything would be fine. In particular, the concerns that @arielb1 raised here regarding alignment are being checked I believe.

So roughly speaking the definition of a "homogeneous aggregate" would be something like:

  • Let F be a list of fields whose types have non-zero size and N be the number of such fields
    • Let type(F[x]) be the type of the field with index x
  • Let U(T) be the "unit" used to pass a value of type T
    • I'm not entirely sure how to define a unit, but it's a concept pre-existing in the code
  • There must be some unit U0 used to pass each field f in F
    • that is, for all f in F, U(f) = U0
  • The offset of each field with index i must be i * sizeof(U0)
  • The total size of the aggregate must be N * sizeof(U0)

If all conditions are met, the type is a "homogeneous aggregate".
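The definition above could be sketched roughly as follows. This is not rustc's actual code: `Unit`, `Field` and `homogeneous_unit` are hypothetical names, each non-zero-sized field is assumed to be a single scalar occupying exactly one unit, and fields are assumed to be listed in offset order.

```rust
/// Hypothetical stand-in for the "unit" (register kind) used to pass a scalar.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Unit {
    Float32,
    Float64,
    Int32,
    Int64,
}

impl Unit {
    /// Size of the unit in bytes.
    pub fn size(self) -> u64 {
        match self {
            Unit::Float32 | Unit::Int32 => 4,
            Unit::Float64 | Unit::Int64 => 8,
        }
    }
}

/// Simplified field description; `unit` is `None` for zero-sized fields.
#[derive(Clone, Copy, Debug)]
pub struct Field {
    pub offset: u64,
    pub size: u64,
    pub unit: Option<Unit>,
}

/// Returns the common unit if the struct is a homogeneous aggregate under
/// the proposed definition, ignoring zero-sized fields.
pub fn homogeneous_unit(fields: &[Field], total_size: u64) -> Option<Unit> {
    let mut common: Option<Unit> = None;
    let mut n: u64 = 0; // number of non-zero-sized fields seen so far
    for f in fields.iter().filter(|f| f.size != 0) {
        let u = f.unit?; // a sized field with no unit cannot be homogeneous
        match common {
            None => common = Some(u),
            Some(u0) if u0 != u => return None, // two different units
            _ => {}
        }
        // No padding between fields: field i must sit at i * sizeof(U0).
        if f.size != u.size() || f.offset != n * u.size() {
            return None;
        }
        n += 1;
    }
    let u0 = common?; // no sized fields at all
    // No padding at the end: total size must be N * sizeof(U0).
    if total_size == n * u0.size() {
        Some(u0)
    } else {
        None
    }
}
```

With this filter, a struct of two `f32` fields plus a trailing zero-sized field at offset 8 is classified the same way as the plain two-`f32` struct.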

@nikomatsakis
Contributor

Does that sound about right? This seems like it wouldn't be too hard to implement.

@arielb1
Contributor

arielb1 commented Jan 3, 2019

@nikomatsakis

This does not handle the C case with an empty array

struct X {
    float a;
    float b;
    int x []; // or int x[0]
};

We need some way of defining empty arrays as "homogeneous aggregate breaking" while defining PhantomData as not.

@nikomatsakis
Contributor

@arielb1

I see, I missed that subtlety.

It comes down to what the "filter" is that we apply to the types -- I specified "non-zero-size", but it could easily be "exclude empty structs". Or, perhaps, exclude anything that is zero-sized except for arrays whose length is non-zero (or whose element types are not zero-sized).

It seems to come down to whether we want a "whitelist" or a "blacklist" here.

It occurs to me that there is one other concern:

It is possible to define zero-sized structs in C with suitable compiler extensions (see this section of the nascent unsafe code guidelines for examples). Can anyone validate whether a struct like struct Foo { f64 x; f64 y; Bar bar; }; struct Bar { }; in C would be treated as a homogeneous aggregate? I created this godbolt example but I can't quite tell how to interpret it https://godbolt.org/z/hkaEda =)

@nikomatsakis
Contributor

It seems clear though that if you have a #[repr(C)] empty struct that is embedded into another #[repr(C)] struct, it should be an aggregate iff the C compiler would consider it to be one.

We have some freedom to do otherwise for structs and types that are not #[repr(C)].

@arielb1
Contributor

arielb1 commented Jan 4, 2019

I used return b.y to read things out instead of doing an addition (https://godbolt.org/z/eQHDrX). I first thought that an empty C struct behaves like an empty array, but that was an error: it does not. I think that an empty #[repr(Rust)] struct should behave "like PhantomData". I'll note that this is already sort of "improper_ctypes" territory, so we shouldn't define it, but it feels like a better idea than special-casing PhantomData.

struct Foo {};

struct Baz {
    float x;
    float y;
    struct Foo b;
};

@arielb1
Contributor

arielb1 commented Jan 4, 2019

Can someone who knows the code look at it? I think C compilers specifically have a problem with zero-length arrays and structs, but not with positive-length arrays or structs.

More examples:

// still a homogeneous aggregate, despite having a non-0-sized struct and float.
struct Foo2 { float t; };
struct Baz2 {
    struct Foo2 x;
    float y[1];
};

float sizeof_baz2(struct Baz2 b) {
    return b.y[0];
}

@nikomatsakis
Contributor

OK, so let me try to open up this branch as a PR and we'll take it from there. I have to figure out the testing, most of all.

@parched
Contributor

parched commented Jan 16, 2019

@nikomatsakis

It comes down to what the "filter" is that we apply to the types

The filter at the high level is "all fields that are removed from the rust struct declaration when you convert it to a C struct declaration", e.g. PhantomData, but more generally it's all repr(Rust) fields because they can't be declared in C. As a consequence of the Rust struct and C struct needing to be compatible they must also be zero-sized, non-alignment inducing and non-padding inducing. However, there's no real need to check for that in the homogeneity calculation because those must always be improper C types as the struct size wouldn't be the same.

The homogeneous_aggregate function is fine as is for non-repr(Rust) types, even for

#[repr(C)]
struct Foo {
    a: f32,
    b: f32,
    c: [i32; 0]
}

To be clear, there are three potential problematic array fields in C to deal with

  • flexible array members, e.g., int x[]
  • zero-length arrays, e.g., int x[0]
  • variable-length arrays (VLAs), e.g., int x[n]

The latter I don't think can even be passed by value in rust currently, e.g.,

#[repr(C)]
struct X {
    x : [i32],
}

pub extern "C" fn use_x(x : X) {
}
   |
11 | pub extern "C" fn use_x(x : X) {
   |                         ^ doesn't have a size known at compile-time
   |

The first two look to be handled differently on x86 😟 https://godbolt.org/z/o0z52H (although my x86 assembly isn't very good so maybe someone else can verify).


Something I found interesting in https://github.com/rust-rfcs/unsafe-code-guidelines/blob/9c9840297ca47d3085876cec7b59bb92d8554591/reference/src/layout/structs-and-tuples.md#function-call-abi-compatibility

#[repr(transparent)] can only be applied to structs with a single
field whose type T has non-zero size, along with some number of
other fields whose types are all zero-sized (typically
std::marker::PhantomData fields). The struct then takes on the "ABI
behavior" of the type T that has non-zero size.

This seems like the kind of filter we want here too, is there code for that somewhere we can reuse?

Alternatively that got me thinking, can the use case of adding PhantomData to repr(C) structs be covered using repr(transparent)?
e.g., instead of

#[repr(C)]
pub struct Bar {
    pub a: f32,
    pub b: f32,
    pub _unit: std::marker::PhantomData<()>,
}

do

#[repr(C)]
pub struct BarC {
    pub a: f32,
    pub b: f32,
}

#[repr(transparent)]
pub struct Bar {
    pub c: BarC,
    pub _unit: std::marker::PhantomData<()>,
}

then make PhantomData an improper C type again and the existing rustc code works?
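A compilable sketch of that wrapper idea, assuming the inner `#[repr(C)]` struct is named `BarC` as the `c: BarC` field suggests. `#[no_mangle]` is left off here only to keep the snippet free of exported symbols.

```rust
use std::marker::PhantomData;

/// The plain C-compatible struct; this is what the C declaration describes.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct BarC {
    pub a: f32,
    pub b: f32,
}

/// Transparent wrapper: takes on the ABI of its single non-zero-sized field,
/// so the PhantomData marker cannot influence how the value is passed.
#[repr(transparent)]
#[derive(Clone, Copy)]
pub struct Bar {
    pub c: BarC,
    pub _unit: PhantomData<()>,
}

/// Same body as the original `bar`, reading the floats through the wrapper.
pub extern "C" fn bar(f: Bar) -> bool {
    f.c.a != f.c.b
}
```

The cost of this design is the extra level of field access (`f.c.a` instead of `f.a`) at every use site.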

@parched
Contributor

parched commented Jan 17, 2019

The first two look to be handled differently on x86 😟 https://godbolt.org/z/o0z52H (although my x86 assembly isn't very good so maybe someone else can verify).

Actually, it looks like ABI for flexible array members for x86 changed in GCC 4.4 but clang uses the old behavior, am I interpreting that right? Does rust claim that flexible array members are representable in rust as zero sized arrays, if so there might be an issue, if not it's a potential foot-gun if someone assumes so.

@nagisa
Member

nagisa commented Jan 18, 2019

By the way, I’ve found just today that

struct banana {
    int peach[0];
}

and

struct banana {
    int peach[];
}

are not equivalent. Namely the following code is UB with the first structure but fine with the second, suggesting that only the second structure is variable-length...

int foo(struct banana x) {
    return x.peach[0];
}

EDIT: I’m surprised I’ve found it only today as codebase at my $dayjob is littered with the former variant for VL aggregates...

@nikomatsakis
Contributor

nikomatsakis commented Jan 18, 2019

@parched

The filter at the high level is "all fields that are removed from the rust struct declaration when you convert it to a C struct declaration",...but more generally it's all #[repr(Rust)] fields because they can't be declared in C.

This doesn't sound right. If a #[repr(Rust)] struct has non-zero-size, it can't just be removed when converted to C -- rather, the type itself cannot be represented in C. But that data is still there and can't be ignored.

Does rust claim that flexible array members are representable in rust as zero sized arrays

I don't think we have another way to declare it, really, so presently the answer is yes. I presume here that "flexible array member" means T[] vs T[0]? (As @nagisa noted, I always considered foo[0] and foo[] equivalent in C and have definitely seen lots of code that uses foo[0]...)

@nikic
Contributor

nikic commented Jan 18, 2019

@nagisa int peach[0] is not legal C, but commonly accepted as a compiler extension from pre-C99 times. Similarly int peach[1] is generally excluded from optimizations that depend on the array length, because it is a common flexible array member idiom in pre-C99 code (or for that matter, code that needs to interoperate with C++).

Edit: That is, if reading OOB of a zero or one sized array in tail position results in a miscompile, that's usually a compiler bug, even though it's technically UB. Your particular example is a bit odd because the struct is passed by value, and the usual guarantees for the struct hack patterns probably don't apply there.

@nikomatsakis
Contributor

Some notes:

  • First off, when @arielb1 wrote "VLA" I was presuming they meant what @parched calls "flexible array members". =) But perhaps we should adopt the "flexible" terminology.
  • Second, we probably need to address (at the lang level) the fact that we can't draw all the distinctions that C can (e.g., flexible vs zero-length etc). But that will take some time.

I am wondering if we can agree on a pragmatic compromise to get us going forward, and leave some amount of work for later. To some extent, I think my PR represents a decent shot at such a compromise, since it seems to match clang (maybe?) and older gcc, and it allows one to represent flexible arrays using zero-length arrays. It's not perfect if the C code uses zero-length arrays in the newer sense (not the older, flexible sense), though you can add a "dummy" member after the ZLA (e.g., PhantomData<()>) as a workaround.

A more conservative rule might be to just filter out zero-sized, repr(Rust) structs (which would certainly cover phantomdata). This would not match the C behavior for zero-sized C structs or arrays, but those are quite unusual. On the other hand, it feels kind of strictly worse than my current PR, because it is basically just wrong more often (put another way, my PR does match C behavior for zero-sized structs).

At least, this is how I understand it now.

@nikomatsakis
Contributor

(I should probably try to draw up some concrete examples to make my points here)

@parched
Contributor

parched commented Jan 21, 2019

@parched

The filter at the high level is "all fields that are removed from the rust struct declaration when you convert it to a C struct declaration",...but more generally it's all #[repr(Rust)] fields because they can't be declared in C.

This doesn't sound right. If a #[repr(Rust)] struct has non-zero-size, it can't just be removed when converted to C -- rather, the type itself cannot be represented in C. But that data is still there and can't be ignored.

Yes, but what I meant in the sentence following that was that we don't need to worry about it in the ABI computation, because code like that would be ill-formed regardless of the function call ABI, i.e., the memory layouts wouldn't be compatible.


To be clear, what I gather from the disassembly here is

struct X {
    float a;
    float b;
    int c [];
};

struct Y {
    float a;
    float b;
    int c[0];
};

are treated the same on:

  • AArch64 GCC
  • AArch64 Clang
  • x86_64 GCC >= 4.4

but differently on:

  • x86_64 GCC < 4.4
  • x86_64 Clang

IMO rust should match the second one (struct Y) where they differ, because it doesn't really make sense to pass a struct with a flexible array member by value because it will get sliced, or does it? Whereas I can imagine cases where you do want to pass a struct with a zero-sized array by value, like:

#define NUM_THINGS 5 // change this to suit, might be zero

struct Y {
    float a;
    float b;
    int things[NUM_THINGS];
};

A more conservative rule might be to just filter out zero-sized, repr(Rust) structs (which would certainly cover phantomdata). This would not match the C behavior for zero-sized C structs or arrays, but those are quite unusual.

I think this is the way to go. As I understand it, it would match the C behaviour in those cases, no?

@nikomatsakis
Contributor

@parched

If I understand you, you are arguing that we should not consider [T; 0] to be a "flexible array member". In effect, we would basically be saying that we have no way (in Rust) to express that. I find this logic in particular persuasive:

it doesn't really make sense to pass a struct with a flexible array member by value because it will get sliced

However, that leaves me in a bit of a quandary. In particular, what set of ZST are we going to exclude when performing the aggregate test? At minimum, that set should include #[repr(Rust)] structs (which includes PhantomData).

I'm not sure what #[repr(C)] types it should include -- it sounds to me like you are arguing it should include [T; 0] and probably all zero-sized structs. This would then match the first set of compilers you gave:

  • AArch64 GCC
  • AArch64 Clang
  • x86_64 GCC >= 4.4

but it would not match:

  • x86_64 GCC < 4.4
  • x86_64 Clang

Given that GCC changed behavior here, I think matching the new behavior makes sense. And obviously matching AArch64 is our real goal here (I'm not sure how important this test is for x86_64).

I think this is the way to go. As I understand it, it would match the C behaviour in those cases, no?

I'm not entirely sure what you meant by this. My "conservative" proposal is just to filter out repr-rust things, but in that case we would NOT match the C behavior, since a struct with a zero-sized repr-C thing would not be considered a homogeneous aggregate. This seems like an ok intermediate step but obviously not the final goal, which ought to be compatibility with C compilers.

@nikomatsakis
Contributor

nikomatsakis commented Jan 22, 2019

I'd like to land something ASAP. I'd be happy to modify my PR to one of two things:

  • Only filter out repr-rust and phantomdata types
  • Filter out all zero-sized types

Either fixes the immediate problem. The latter seems to me to be more compatible with C-Rust interop. The question boils down to what you consider the Rust type [T; 0] to map to in C -- is it T[0] or T[]?

If we think about compatibility with newer GCC (>= 4.4), then I think the two options work out like this:

Pattern | Filter only rust | Filter all ZST
repr(C) struct with phantomdata in Rust decl | |
repr(C) struct with empty repr(C) struct | |
repr(C) struct with T[0] in C decl and [T; 0] in Rust decl | |
repr(C) struct with T[] in C decl and [T; 0] in Rust decl | |

The first line corresponds to this issue. However, as @parched points out, the final line may be a moot point, since such types ought not to be passed by value.

(Note: I wonder if we want something like FlexibleArray<T> as a special marker type for translating T[] to Rust?)

@nikomatsakis
Contributor

After some discussion on Zulip, I am currently leaning towards "just filter rust types" as a simple, "conservative" step. I think likely we will want to make further steps here eventually but this should solve the immediate problem and seems uncontroversial.

@emilio
Contributor

emilio commented Jan 23, 2019

I agree that filtering all ZSTs seems better. In particular, because:

  • The only case it breaks is the "repr(C) struct with T[] in C decl and [T; 0] in Rust decl". Which is just something you never want to pass by value.
  • Adding a [T; 0] to a #[repr(C)] struct that does not appear on the C declaration is a common hack to make your alignment correct (before #[repr(align)] at least), and it makes it work properly too.

But doing it first for rust zsts and then for all zsts sounds fine too.
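As an illustration of the alignment hack mentioned above, the sketch below compares a zero-length array field with the `#[repr(align)]` attribute. The struct and function names are made up for this example, and the expected numbers assume `u32` is 4-byte aligned, as on mainstream targets.

```rust
use std::mem::{align_of, size_of};

/// Pre-`#[repr(align)]` idiom: a zero-length array of a more-aligned type
/// raises the struct's alignment without adding any data.
#[repr(C)]
pub struct HackAligned {
    pub bytes: [u8; 8],
    pub _align: [u32; 0],
}

/// The same requirement stated directly with `#[repr(align)]`.
#[repr(C, align(4))]
pub struct AttrAligned {
    pub bytes: [u8; 8],
}

/// (size, alignment) of the hack-aligned and attribute-aligned structs.
pub fn layouts() -> [(usize, usize); 2] {
    [
        (size_of::<HackAligned>(), align_of::<HackAligned>()),
        (size_of::<AttrAligned>(), align_of::<AttrAligned>()),
    ]
}
```

Both structs end up with the same size and alignment, but only the first one carries an extra field that the ABI classification has to decide whether to ignore.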

@nikomatsakis
Contributor

nikomatsakis commented Jan 23, 2019

I've updated the PR to that approach, FYI.

@nikomatsakis
Contributor

Discussed some with @eddyb on Discord (privmsg). They too feel that we should just screen out all ZSTs. They expressed a preference to modify the "homogeneous aggregate" test -- instead of "screening out fields", we would change the return value for zero-size types so that it conveys that there is no data there. Getting this return value would not invalidate homogeneity. I can try to mock that up.

@eddyb
Member

eddyb commented Jan 23, 2019

(my point was that the right intuition, IMO, is that "no leaf fields" should not be an error condition - it so happens that ZST and "no leaf fields" are equivalent, but I prefer modelling the latter in the return type, e.g. Result<Option<...>, NotHomogenous> where 0-element arrays and 0-field structs return Ok(None))
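A rough sketch of what such a return type could look like on a toy type model. `Reg`, `Ty`, `Heterogeneous` and `homogeneous` are hypothetical names, not rustc's, and the sketch deliberately ignores offsets and padding to focus on the "no data" versus "heterogeneous" distinction.

```rust
/// Hypothetical register kind for a scalar leaf.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Reg {
    F32,
    F64,
    I32,
    I64,
}

/// Error: the type contains leaves of different kinds.
#[derive(Debug, PartialEq, Eq)]
pub struct Heterogeneous;

/// Toy type model: scalars, fixed-length arrays, and structs.
pub enum Ty {
    Scalar(Reg),
    Array(Box<Ty>, usize),
    Struct(Vec<Ty>),
}

/// Ok(Some(r)): every leaf is `r`. Ok(None): there are no leaves at all.
/// Err(Heterogeneous): at least two different leaf kinds were found.
pub fn homogeneous(ty: &Ty) -> Result<Option<Reg>, Heterogeneous> {
    match ty {
        Ty::Scalar(r) => Ok(Some(*r)),
        // A zero-length array contributes no data, whatever its element type.
        Ty::Array(elem, len) => {
            if *len == 0 {
                Ok(None)
            } else {
                homogeneous(elem)
            }
        }
        Ty::Struct(fields) => {
            let mut found: Option<Reg> = None;
            for f in fields {
                match (found, homogeneous(f)?) {
                    (_, None) => {} // "no data" fields never break homogeneity
                    (None, Some(r)) => found = Some(r),
                    (Some(prev), Some(r)) if prev == r => {}
                    _ => return Err(Heterogeneous),
                }
            }
            Ok(found)
        }
    }
}
```

Note that this sketch treats a zero-length array as "no data", which corresponds to screening out all ZSTs. Whether that matches a given C compiler for `T[0]` and `T[]` is exactly the open question in this thread.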

@pnkfelix
Member

visiting for triage. Seems like things are progressing smoothly here.

@parched
Contributor

parched commented Jan 25, 2019

If we think about compatibility with newer GCC (>= 4.4), then I think the two options work out like this:

Pattern | Filter only rust | Filter all ZST
repr(C) struct with phantomdata in Rust decl | |
repr(C) struct with empty repr(C) struct | |
repr(C) struct with T[0] in C decl and [T; 0] in Rust decl | |
repr(C) struct with T[] in C decl and [T; 0] in Rust decl | |

I'm not sure this is correct; at least for AArch64 it's

Pattern | Filter only rust | Filter all ZST
repr(C) struct with phantomdata in Rust decl | |
repr(C) struct with empty repr(C) struct | |
repr(C) struct with T[0] in C decl and [T; 0] in Rust decl | |
repr(C) struct with T[] in C decl and [T; 0] in Rust decl | |
repr(C) struct with __attribute__(align) in C decl and [T; 0] in Rust decl | |

Note I added an extra line to cover @emilio's use case from before #[repr(align)] was a thing.

Centril added a commit to Centril/rust that referenced this issue Jan 25, 2019
…ates, r=eddyb

distinguish "no data" from "heterogeneous" in ABI

Ignore zero-sized types when computing whether something is a homogeneous aggregate, except be careful of VLA.

cc rust-lang#56877

r? @arielb1
cc @eddyb
@nikomatsakis
Contributor

@parched I see, I misinterpreted your slot regarding gcc somehow. You wrote that T[0] and T[] are treated as equivalent on:

  • AArch64 GCC // i.e., even when GCC >= 4.4!
  • AArch64 Clang
  • x86_64 GCC >= 4.4

It seems strange to me that this would vary per platform, but I guess I can imagine this being something that is part of a platform's ABI. That's certainly a pain. =) Right now, we at least treat the "homogeneous aggregate" test as being independent of the platform.

Hmm. Well that's a wrench in the works. :) I still think we should land my PR, even though it's going to behave differently on AArch64 for such structs (because it seems strictly better than the status quo), but I guess we may want to do a follow-up.

@pnkfelix
Member

visiting for T-compiler meeting triage: PR #57645 stopped the immediate bug here, but there are questions about whether we got this right.

So this issue was left open, but it may be worthwhile to open a fresh issue dedicated to investigation and close this one.

@pnkfelix
Member

Opened #58487 to track remaining issues here now that PR #57645 has landed.
