Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Clang][CodeGen] Emit load of GEP after EmitMemberExpr #110487

Closed
wants to merge 4 commits into from

Conversation

bwendling
Copy link
Collaborator

@bwendling bwendling commented Sep 30, 2024

We were missing a load of the value after emitting the MemberExpr. This
was causing __builtin_dynamic_object_size to return 0 incorrectly.

Fixes: #110385
Signed-off-by: Bill Wendling morbo@google.com
Cc: Jan Hendrik Farr kernel@jfarr.cc
Cc: Kees Cook kees@kernel.org

We were missing a load of the value after emitting the MemberExpr. This
was causing __builtin_dynamic_object_size to return 0 incorrectly.

Signed-off-by: Bill Wendling <morbo@google.com>
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen labels Sep 30, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Sep 30, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Bill Wendling (bwendling)

Changes

We were missing a load of the value after emitting the MemberExpr. This was causing __builtin_dynamic_object_size to return 0 incorrectly.


Full diff: https://github.com/llvm/llvm-project/pull/110487.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/CGExpr.cpp (+1)
  • (added) clang/test/CodeGen/attr-counted-by-pr110385.c (+74)
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index df4994ba9af6e1..4e26e9f7076777 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -1169,6 +1169,7 @@ llvm::Value *CodeGenFunction::EmitLoadOfCountedByField(
     LValue LV = EmitMemberExpr(ME);
     Address Addr = LV.getAddress();
     Res = Addr.emitRawPointer(*this);
+    Res = Builder.CreateAlignedLoad(Res->getType(), Res, getPointerAlign());
   } else if (StructBase->getType()->isPointerType()) {
     LValueBaseInfo BaseInfo;
     TBAAAccessInfo TBAAInfo;
diff --git a/clang/test/CodeGen/attr-counted-by-pr110385.c b/clang/test/CodeGen/attr-counted-by-pr110385.c
new file mode 100644
index 00000000000000..1f11bc1a04c81b
--- /dev/null
+++ b/clang/test/CodeGen/attr-counted-by-pr110385.c
@@ -0,0 +1,74 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fstrict-flex-arrays=2 -emit-llvm -o - %s | FileCheck %s
+
+struct bch_val {
+  unsigned long int __nothing[0];
+};
+
+struct bch_xattr {
+  struct bch_val v;
+  unsigned char x_type;
+  unsigned char x_name_len;
+  unsigned char x_name[] __attribute__((__counted_by__(x_name_len)));
+};
+
+struct bkey_s_c {
+  const struct bch_val *v;
+};
+
+struct bkey_s_c_xattr {
+  union {
+    const struct bch_xattr *v;
+    struct bkey_s_c s_c;
+  };
+};
+
+// CHECK-LABEL: define dso_local i64 @test1(
+// CHECK-SAME: ptr [[K_COERCE:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[K:%.*]] = alloca [[STRUCT_BKEY_S_C:%.*]], align 8
+// CHECK-NEXT:    [[XATTR:%.*]] = alloca [[STRUCT_BKEY_S_C_XATTR:%.*]], align 8
+// CHECK-NEXT:    [[COERCE_DIVE:%.*]] = getelementptr inbounds nuw [[STRUCT_BKEY_S_C]], ptr [[K]], i32 0, i32 0
+// CHECK-NEXT:    store ptr [[K_COERCE]], ptr [[COERCE_DIVE]], align 8
+// CHECK-NEXT:    [[TMP0:%.*]] = getelementptr inbounds nuw [[STRUCT_BKEY_S_C_XATTR]], ptr [[XATTR]], i32 0, i32 0
+// CHECK-NEXT:    [[V:%.*]] = getelementptr inbounds nuw [[STRUCT_BKEY_S_C]], ptr [[K]], i32 0, i32 0
+// CHECK-NEXT:    [[TMP1:%.*]] = load ptr, ptr [[V]], align 8
+// CHECK-NEXT:    [[ADD_PTR:%.*]] = getelementptr inbounds [[STRUCT_BCH_VAL:%.*]], ptr [[TMP1]], i64 0
+// CHECK-NEXT:    store ptr [[ADD_PTR]], ptr [[TMP0]], align 8
+// CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds nuw [[STRUCT_BKEY_S_C_XATTR]], ptr [[XATTR]], i32 0, i32 0
+// CHECK-NEXT:    [[TMP3:%.*]] = load ptr, ptr [[TMP2]], align 8
+// CHECK-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds [[STRUCT_BCH_XATTR:%.*]], ptr [[TMP3]], i32 0, i32 2
+// CHECK-NEXT:    [[DOT_COUNTED_BY_LOAD:%.*]] = load i8, ptr [[DOT_COUNTED_BY_GEP]], align 4
+// CHECK-NEXT:    [[TMP4:%.*]] = zext i8 [[DOT_COUNTED_BY_LOAD]] to i64
+// CHECK-NEXT:    [[TMP5:%.*]] = mul nuw i64 [[TMP4]], 1
+// CHECK-NEXT:    [[TMP6:%.*]] = icmp sgt i64 [[TMP4]], -1
+// CHECK-NEXT:    [[TMP7:%.*]] = select i1 [[TMP6]], i64 [[TMP5]], i64 0
+// CHECK-NEXT:    ret i64 [[TMP7]]
+//
+unsigned long int test1(struct bkey_s_c k) {
+  struct bkey_s_c_xattr xattr = (struct bkey_s_c_xattr){
+      .v = (struct bch_xattr *)(k.v - __builtin_offsetof(struct bch_xattr, v))
+  };
+
+  return __builtin_dynamic_object_size(xattr.v->x_name, 0);
+}
+
+// CHECK-LABEL: define dso_local i64 @test2(
+// CHECK-SAME: ptr noundef [[XATTR:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[XATTR_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    store ptr [[XATTR]], ptr [[XATTR_ADDR]], align 8
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[XATTR_ADDR]], align 8
+// CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds nuw [[STRUCT_BKEY_S_C_XATTR:%.*]], ptr [[TMP0]], i32 0, i32 0
+// CHECK-NEXT:    [[TMP2:%.*]] = load ptr, ptr [[TMP1]], align 8
+// CHECK-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds [[STRUCT_BCH_XATTR:%.*]], ptr [[TMP2]], i32 0, i32 2
+// CHECK-NEXT:    [[DOT_COUNTED_BY_LOAD:%.*]] = load i8, ptr [[DOT_COUNTED_BY_GEP]], align 4
+// CHECK-NEXT:    [[TMP3:%.*]] = zext i8 [[DOT_COUNTED_BY_LOAD]] to i64
+// CHECK-NEXT:    [[TMP4:%.*]] = mul nuw i64 [[TMP3]], 1
+// CHECK-NEXT:    [[TMP5:%.*]] = icmp sgt i64 [[TMP3]], -1
+// CHECK-NEXT:    [[TMP6:%.*]] = select i1 [[TMP5]], i64 [[TMP4]], i64 0
+// CHECK-NEXT:    ret i64 [[TMP6]]
+//
+unsigned long int test2(struct bkey_s_c_xattr *xattr) {
+  return __builtin_dynamic_object_size(xattr->v->x_name, 0);
+}

@bwendling bwendling added this to the LLVM 19.X Release milestone Sep 30, 2024
@bwendling bwendling changed the title [Clang][CodeGen] Emit load of value [Clang][CodeGen] Emit load of GEP after EmitMemberExpr Sep 30, 2024
@Cydox
Copy link
Contributor

Cydox commented Sep 30, 2024

Fixes the kernel issue for me.

I closed #110437 in favor of this one

@Cydox
Copy link
Contributor

Cydox commented Sep 30, 2024

Wait, this introduces a regression when the inner struct is directly nested without using a pointer like so:

Without this change the code below will return 64, with my fix it will also return 64, with this fix it will SEGFAULT.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

struct variable {
        int a;
        int b;
        int length;
        short array[] __attribute__((counted_by(length)));
};

struct bucket {
        int a;
        struct variable growable;
//        struct variable *growable;
//        int b;
};

int main(int argc, char *argv[])
{
        struct bucket *p;
        struct variable *v;

        p = malloc(sizeof(*p) + sizeof(*p->growable.array) * 32);
        p->growable.length = 32;


//        printf("%zu\n", __builtin_dynamic_object_size(v->array, 1));

//        p->growable = v;
        printf("%zu\n", __builtin_dynamic_object_size(p->growable.array, 1));

        return 0;
}

@Cydox
Copy link
Contributor

Cydox commented Sep 30, 2024

Reopened my fix as #110497 and also adjusted the test to also check for the case of a nested struct without indirection.

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is sort of the same comment I've made on other related patches... but I think the fundamental issue here is that StructAccessBase is skipping over CK_LValueToRValue casts. Once you jump over such a cast, you're looking at expressions which are at a different level of indirection from the member access you care about.

If StructAccessBase just returns the cast itself, instead of trying to understand the value that's getting loaded, you shouldn't need a special case for MemberExpr.

@bwendling
Copy link
Collaborator Author

This is sort of the same comment I've made on other related patches... but I think the fundamental issue here is that StructAccessBase is skipping over CK_LValueToRValue casts. Once you jump over such a cast, you're looking at expressions which are at a different level of indirection from the member access you care about.

I understand, but there are ~30 different types of implicit casts and it's hard to know which can be ignored and which shouldn't.

If StructAccessBase just returns the cast itself, instead of trying to understand the value that's getting loaded, you shouldn't need a special case for MemberExpr.

I did an experiment and it has the same issue. If I return the LValueToRValue cast and run EmitLValue on it the 'load' is still missing.

In a separate PR, I wanted to create a MemberExpr based off of what the StructAccessBase returns, because all of the code to generate the correct access to that MemberExpr already exist, but you and Aaron had issues with that. I understand wanting to keep the AST in a sane state and not wanting to add nodes indiscriminately, but it's really limiting what can be done.

Here are two ways I'd like to handle this (in order of preference):

  1. In CodeGen, when given and Expr in a __builtin_dynamic_object_size that has a FAM with __counted_by, I'd like to build a new Expr pointing to the count field and run that through the proper Emit*() method.
  2. Convert a __builtin_dynamic_object_size with a FAM with __counted_by into the appropriate calculations in Sema.

For (1), building a new Expr is easier than emitting the "base" pointer to the structure and tacking on a reference to the count's FieldDecl, which has its own issues (like preventing us from loading the base pointer twice).

Option (2) isn't super awesome, because we lose the fact that the original expression was a __builtin_dynamic_object_size call. But maybe that's not important to keep around....

@efriedma-quic
Copy link
Collaborator

If I return the LValueToRValue cast and run EmitLValue on it

I'm surprised EmitLValue didn't error out on that. The result of an LValueToRValue cast is an rvalue. Calling EmitPointerWithAlignment on it should work, I think.

I understand, but there are ~30 different types of implicit casts and it's hard to know which can be ignored and which shouldn't.

The complexity here is, unfortunately, mostly unavoidable... maybe worth looking into some simplified classification of casts for common usage.

Convert a __builtin_dynamic_object_size with a FAM with __counted_by into the appropriate calculations in Sema.

This might be reasonable.

because we lose the fact that the original expression was a __builtin_dynamic_object_size call

If we went this route, probably we'd wrap the calculation in a dedicated node that contains both the original AST, and the lowered computation, or something like that.

@bwendling
Copy link
Collaborator Author

If I return the LValueToRValue cast and run EmitLValue on it

I'm surprised EmitLValue didn't error out on that. The result of an LValueToRValue cast is an rvalue. Calling EmitPointerWithAlignment on it should work, I think.

I understand, but there are ~30 different types of implicit casts and it's hard to know which can be ignored and which shouldn't.

The complexity here is, unfortunately, mostly unavoidable... maybe worth looking into some simplified classification of casts for common usage.

Convert a __builtin_dynamic_object_size with a FAM with __counted_by into the appropriate calculations in Sema.

This might be reasonable.

because we lose the fact that the original expression was a __builtin_dynamic_object_size call

If we went this route, probably we'd wrap the calculation in a dedicated node that contains both the original AST, and the lowered computation, or something like that.

I went ahead and used the EmitPointerWithAlignment for all (it worked without needing special casing). I'll look into doing this in Sema as a potential cleanup.

@Cydox
Copy link
Contributor

Cydox commented Oct 1, 2024

I went ahead and used the EmitPointerWithAlignment for all (it worked without needing special casing). I'll look into doing this in Sema as a potential cleanup.

With that change now the compiler segfaults when compiling the example C file I posted above:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

struct variable {
        int a;
        int b;
        int length;
        short array[] __attribute__((counted_by(length)));
};

struct bucket {
        int a;
        struct variable growable;
//        struct variable *growable;
//        int b;
};

int main(int argc, char *argv[])
{
        struct bucket *p;
        struct variable *v;

        p = malloc(sizeof(*p) + sizeof(*p->growable.array) * 32);
        p->growable.length = 32;


//        printf("%zu\n", __builtin_dynamic_object_size(v->array, 1));

//        p->growable = v;
        printf("%zu\n", __builtin_dynamic_object_size(p->growable.array, 1));

        return 0;
}

}
LValueBaseInfo BaseInfo;
TBAAAccessInfo TBAAInfo;
Address Addr = EmitPointerWithAlignment(StructBase, &BaseInfo, &TBAAInfo);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the example C file that causes the compiler to segfault, StructBase is not a pointer here, so this fails.

Here's a dump of StructBase:

MemberExpr 0x1240b890 'struct variable' lvalue ->growable 0x1240ac20
`-ImplicitCastExpr 0x1240b878 'struct bucket *' <LValueToRValue>
  `-DeclRefExpr 0x1240b858 'struct bucket *' lvalue Var 0x1240afe8 'p' 'struct bucket *'

So you have to distinguish between the MemberExpr and the pointer case right here, at which point you are basically back to my PR (although I guess the case for DeclRefExpr can be removed additionally; that does break tests, so I'll check carefully it the behavior is still correct with that case removed)

@bwendling
Copy link
Collaborator Author

I reverted my last commit. This leaves the original patch, which seems to work. @efriedma-quic, would you be okay with this patch while I work to improve the code in follow-up?

@Cydox
Copy link
Contributor

Cydox commented Oct 1, 2024

I reverted my last commit. This leaves the original patch, which seems to work. @efriedma-quic, would you be okay with this patch while I work to improve the code in follow-up?

The original (and current) patch in this PR still introduces a regression. So it should not be merged in my opinion.

Look at the following C file (test.c):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

struct variable {
        int a;
        int b;
        int length;
        short array[] __attribute__((counted_by(length)));
};

struct bucket {
        int a;
        struct variable growable;
};

int main(int argc, char *argv[])
{
        struct bucket *p;

        p = malloc(sizeof(*p) + sizeof(*p->growable.array) * 32);
        p->growable.length = 32;

        printf("%zu\n", __builtin_dynamic_object_size(p->growable.array, 1));

        return 0;
}

Compiling this file with clang 19.1.0 and running it:

$ clang --version
clang version 19.1.0 (Fedora 19.1.0-1.fc41)
Target: x86_64-redhat-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Configuration file: /etc/clang/x86_64-redhat-linux-gnu-clang.cfg

$ clang test.c

$ ./a.out
64

Correctly prints 64.

But compiling and running it with the original (and current) patch in this PR:

$ clang --version
clang version 20.0.0git (git@github.com:llvm/llvm-project.git 2de76472e0d1417b64da5aa2c1138eb365685d3a)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/jan/llvm-project/build2/bin

$ clang test.c

$ ./a.out
Segmentation fault (core dumped)

Will result in a binary that crashes with a segmentation fault.


My PR #110497 adds a test case for this scenario and does not have the same issue: https://github.com/Cydox/llvm-project/blob/15b69329a97706ada7d5e8cbeb76508aa55b418f/clang/test/CodeGen/attr-counted-by-pr110385.c#L61

@Cydox
Copy link
Contributor

Cydox commented Oct 2, 2024

Interesting. You compiled with -O2, I compiled without optimization.
When I compile it with -O2, I now get 0.

I've uploaded a gist with the disassembly generated with

clang -mllvm --x86-asm-syntax=intel -S test.c -o ./test-no-optimize.S

and

clang -O2 -mllvm --x86-asm-syntax=intel -S test.c -o ./test-optimize-O2.S

Can you do the same?

https://gist.github.com/Cydox/8b48a7d36dbcf6376d5a2d44afe5cf57

@Cydox
Copy link
Contributor

Cydox commented Oct 2, 2024

@bwendling I think you accidentally compiled the wrong branch. Looks like that is in builtin_get_counted_by, but this PR is in bdos-member-expr-error

@bwendling
Copy link
Collaborator Author

@bwendling I think you accidentally compiled the wrong branch. Looks like that is in builtin_get_counted_by, but this PR is in bdos-member-expr-error

Yeah, sorry about that.

@Cydox
Copy link
Contributor

Cydox commented Oct 2, 2024

No worries

@bwendling bwendling closed this Oct 3, 2024
@bwendling bwendling deleted the bdos-member-expr-error branch October 3, 2024 05:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen clang Clang issues not falling into any other category
Projects
Development

Successfully merging this pull request may close these issues.

__builtin_dynamic_object_size() fails to return correct size depending on depth of flexible array
4 participants