Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[IR] Allow fast math flags on calls with homogeneous FP struct types #110506

Merged
merged 9 commits into from
Oct 2, 2024

Conversation

MacDue
Copy link
Member

@MacDue MacDue commented Sep 30, 2024

This extends FPMathOperator to allow calls that return literal structs of homogeneous floating-point or vector-of-floating-point types.

The intended use case for this is to support FP intrinsics that return multiple values (such as llvm.sincos).

This extends FPMathOperator to allow calls that return literal structs
of homogeneous floating-point or vector-of-floating-point types.

The intended use case for this is to support FP intrinsics that return
multiple values (such as `llvm.sincos`).
@llvmbot
Copy link
Collaborator

llvmbot commented Sep 30, 2024

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-llvm-ir

Author: Benjamin Maxwell (MacDue)

Changes

This extends FPMathOperator to allow calls that return literal structs of homogeneous floating-point or vector-of-floating-point types.

The intended use case for this is to support FP intrinsics that return multiple values (such as llvm.sincos).


Full diff: https://github.com/llvm/llvm-project/pull/110506.diff

6 Files Affected:

  • (modified) llvm/docs/LangRef.rst (+10-9)
  • (modified) llvm/include/llvm/IR/DerivedTypes.h (+4)
  • (modified) llvm/include/llvm/IR/Operator.h (+12-2)
  • (modified) llvm/lib/IR/Type.cpp (+7-6)
  • (modified) llvm/test/Bitcode/compatibility.ll (+20)
  • (modified) llvm/unittests/IR/InstructionsTest.cpp (+34-6)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 3f39d58b322a4f..1eb2982385fda0 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -12472,9 +12472,8 @@ instruction's return value on the same edge).
 The optional ``fast-math-flags`` marker indicates that the phi has one
 or more :ref:`fast-math-flags <fastmath>`. These are optimization hints
 to enable otherwise unsafe floating-point optimizations. Fast-math-flags
-are only valid for phis that return a floating-point scalar or vector
-type, or an array (nested to any depth) of floating-point scalar or vector
-types.
+are only valid for phis that return a floating-point scalar or vector type,
+possibly within an array (nested to any depth), or a homogeneous struct literal.
 
 Semantics:
 """"""""""
@@ -12523,8 +12522,8 @@ class <t_firstclass>` type.
 #. The optional ``fast-math flags`` marker indicates that the select has one or more
    :ref:`fast-math flags <fastmath>`. These are optimization hints to enable
    otherwise unsafe floating-point optimizations. Fast-math flags are only valid
-   for selects that return a floating-point scalar or vector type, or an array
-   (nested to any depth) of floating-point scalar or vector types.
+   for selects that return a floating-point scalar or vector type, possibly
+   within an array (nested to any depth), or a homogeneous struct literal.
 
 Semantics:
 """"""""""
@@ -12762,8 +12761,8 @@ This instruction requires several arguments:
 #. The optional ``fast-math flags`` marker indicates that the call has one or more
    :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
    otherwise unsafe floating-point optimizations. Fast-math flags are only valid
-   for calls that return a floating-point scalar or vector type, or an array
-   (nested to any depth) of floating-point scalar or vector types.
+   for calls that return a floating-point scalar or vector type, possibly within
+   an array (nested to any depth), or a homogeneous struct literal.
 
 #. The optional "cconv" marker indicates which :ref:`calling
    convention <callingconv>` the call should use. If none is
@@ -20528,7 +20527,8 @@ the explicit vector length.
    more :ref:`fast-math flags <fastmath>`. These are optimization hints to
    enable otherwise unsafe floating-point optimizations. Fast-math flags are
    only valid for selects that return a floating-point scalar or vector type,
-   or an array (nested to any depth) of floating-point scalar or vector types.
+   possibly within an array (nested to any depth), or a homogeneous struct
+   literal.
 
 Semantics:
 """"""""""
@@ -20586,7 +20586,8 @@ is the pivot.
    more :ref:`fast-math flags <fastmath>`. These are optimization hints to
    enable otherwise unsafe floating-point optimizations. Fast-math flags are
    only valid for merges that return a floating-point scalar or vector type,
-   or an array (nested to any depth) of floating-point scalar or vector types.
+   possibly within an array (nested to any depth), or a homogeneous struct
+   literal.
 
 Semantics:
 """"""""""
diff --git a/llvm/include/llvm/IR/DerivedTypes.h b/llvm/include/llvm/IR/DerivedTypes.h
index 975c142f1a4572..a24801d8bdf834 100644
--- a/llvm/include/llvm/IR/DerivedTypes.h
+++ b/llvm/include/llvm/IR/DerivedTypes.h
@@ -301,6 +301,10 @@ class StructType : public Type {
   ///  {<vscale x 2 x i32>, <vscale x 4 x i64>}}
   bool containsHomogeneousScalableVectorTypes() const;
 
+  /// Return true if this struct is non-empty and all element types are the
+  /// same.
+  bool containsHomogeneousTypes() const;
+
   /// Return true if this is a named struct that has a non-empty name.
   bool hasName() const { return SymbolTableEntry != nullptr; }
 
diff --git a/llvm/include/llvm/IR/Operator.h b/llvm/include/llvm/IR/Operator.h
index 88b9bfc0be4b15..22ffcc730e7b68 100644
--- a/llvm/include/llvm/IR/Operator.h
+++ b/llvm/include/llvm/IR/Operator.h
@@ -15,6 +15,7 @@
 #define LLVM_IR_OPERATOR_H
 
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/TypeSwitch.h"
 #include "llvm/IR/Constants.h"
 #include "llvm/IR/FMF.h"
 #include "llvm/IR/GEPNoWrapFlags.h"
@@ -351,8 +352,17 @@ class FPMathOperator : public Operator {
     case Instruction::Select:
     case Instruction::Call: {
       Type *Ty = V->getType();
-      while (ArrayType *ArrTy = dyn_cast<ArrayType>(Ty))
-        Ty = ArrTy->getElementType();
+      TypeSwitch<Type *>(Ty)
+          .Case([&](StructType *StructTy) {
+            if (!StructTy->isLiteral() || !StructTy->containsHomogeneousTypes())
+              return;
+            Ty = StructTy->elements().front();
+          })
+          .Case([&](ArrayType *ArrTy) {
+            do {
+              Ty = ArrTy->getElementType();
+            } while ((ArrTy = dyn_cast<ArrayType>(Ty)));
+          });
       return Ty->isFPOrFPVectorTy();
     }
     default:
diff --git a/llvm/lib/IR/Type.cpp b/llvm/lib/IR/Type.cpp
index 3784ad28d7219d..f618263f79c313 100644
--- a/llvm/lib/IR/Type.cpp
+++ b/llvm/lib/IR/Type.cpp
@@ -430,13 +430,14 @@ bool StructType::containsScalableVectorType(
 }
 
 bool StructType::containsHomogeneousScalableVectorTypes() const {
-  Type *FirstTy = getNumElements() > 0 ? elements()[0] : nullptr;
-  if (!FirstTy || !isa<ScalableVectorType>(FirstTy))
+  if (getNumElements() <= 0 || !isa<ScalableVectorType>(elements().front()))
     return false;
-  for (Type *Ty : elements())
-    if (Ty != FirstTy)
-      return false;
-  return true;
+  return containsHomogeneousTypes();
+}
+
+bool StructType::containsHomogeneousTypes() const {
+  ArrayRef<Type *> ElementTys = elements();
+  return !ElementTys.empty() && all_equal(ElementTys);
 }
 
 void StructType::setBody(ArrayRef<Type*> Elements, bool isPacked) {
diff --git a/llvm/test/Bitcode/compatibility.ll b/llvm/test/Bitcode/compatibility.ll
index ea29ff634a43bb..4fe9d9b11f8831 100644
--- a/llvm/test/Bitcode/compatibility.ll
+++ b/llvm/test/Bitcode/compatibility.ll
@@ -1122,6 +1122,26 @@ define void @fastMathFlagsForArrayCalls([2 x float] %f, [2 x double] %d1, [2 x <
   ret void
 }
 
+declare { float, float } @fmf_struct_f32()
+declare { double, double } @fmf_struct_f64()
+declare { <4 x double>, <4 x double> } @fmf_struct_v4f64()
+
+; CHECK-LABEL: fastMathFlagsForStructCalls(
+define void @fastMathFlagsForStructCalls({ float, float } %f, { double, double } %d1, { <4 x double>, <4 x double> } %d2) {
+  %call.fast = call fast { float, float } @fmf_struct_f32()
+  ; CHECK: %call.fast = call fast { float, float } @fmf_struct_f32()
+
+  ; Throw in some other attributes to make sure those stay in the right places.
+
+  %call.nsz.arcp = notail call nsz arcp { double, double } @fmf_struct_f64()
+  ; CHECK: %call.nsz.arcp = notail call nsz arcp { double, double } @fmf_struct_f64()
+
+  %call.nnan.ninf = tail call nnan ninf fastcc { <4 x double>, <4 x double> } @fmf_struct_v4f64()
+  ; CHECK: %call.nnan.ninf = tail call nnan ninf fastcc { <4 x double>, <4 x double> } @fmf_struct_v4f64()
+
+  ret void
+}
+
 ;; Type System
 %opaquety = type opaque
 define void @typesystem() {
diff --git a/llvm/unittests/IR/InstructionsTest.cpp b/llvm/unittests/IR/InstructionsTest.cpp
index 481fe96607e48e..9d8056a768af2f 100644
--- a/llvm/unittests/IR/InstructionsTest.cpp
+++ b/llvm/unittests/IR/InstructionsTest.cpp
@@ -1559,12 +1559,40 @@ TEST(InstructionsTest, FPCallIsFPMathOperator) {
       CallInst::Create(AVFFnTy, AVFCallee, {}, ""));
   EXPECT_TRUE(isa<FPMathOperator>(AVFCall));
 
-  Type *AAVFTy = ArrayType::get(AVFTy, 2);
-  FunctionType *AAVFFnTy = FunctionType::get(AAVFTy, {});
-  Value *AAVFCallee = Constant::getNullValue(PtrTy);
-  std::unique_ptr<CallInst> AAVFCall(
-      CallInst::Create(AAVFFnTy, AAVFCallee, {}, ""));
-  EXPECT_TRUE(isa<FPMathOperator>(AAVFCall));
+  Type *StructITy = StructType::get(ITy, ITy);
+  FunctionType *StructIFnTy = FunctionType::get(StructITy, {});
+  Value *StructICallee = Constant::getNullValue(PtrTy);
+  std::unique_ptr<CallInst> StructICall(
+      CallInst::Create(StructIFnTy, StructICallee, {}, ""));
+  EXPECT_FALSE(isa<FPMathOperator>(StructICall));
+
+  Type *NamedStructFTy = StructType::create({FTy, FTy}, "AStruct");
+  FunctionType *NamedStructFFnTy = FunctionType::get(NamedStructFTy, {});
+  Value *NamedStructFCallee = Constant::getNullValue(PtrTy);
+  std::unique_ptr<CallInst> NamedStructFCall(
+      CallInst::Create(NamedStructFFnTy, NamedStructFCallee, {}, ""));
+  EXPECT_FALSE(isa<FPMathOperator>(NamedStructFCall));
+
+  Type *MixedStructTy = StructType::get(FTy, ITy);
+  FunctionType *MixedStructFnTy = FunctionType::get(MixedStructTy, {});
+  Value *MixedStructCallee = Constant::getNullValue(PtrTy);
+  std::unique_ptr<CallInst> MixedStructCall(
+      CallInst::Create(MixedStructFnTy, MixedStructCallee, {}, ""));
+  EXPECT_FALSE(isa<FPMathOperator>(MixedStructCall));
+
+  Type *StructFTy = StructType::get(FTy, FTy);
+  FunctionType *StructFFnTy = FunctionType::get(StructFTy, {});
+  Value *StructFCallee = Constant::getNullValue(PtrTy);
+  std::unique_ptr<CallInst> StructFCall(
+      CallInst::Create(StructFFnTy, StructFCallee, {}, ""));
+  EXPECT_TRUE(isa<FPMathOperator>(StructFCall));
+
+  Type *StructVFTy = StructType::get(VFTy, VFTy);
+  FunctionType *StructVFFnTy = FunctionType::get(StructVFTy, {});
+  Value *StructVFCallee = Constant::getNullValue(PtrTy);
+  std::unique_ptr<CallInst> StructVFCall(
+      CallInst::Create(StructVFFnTy, StructVFCallee, {}, ""));
+  EXPECT_TRUE(isa<FPMathOperator>(StructVFCall));
 }
 
 TEST(InstructionsTest, FNegInstruction) {

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this extension.

llvm/include/llvm/IR/Operator.h Outdated Show resolved Hide resolved
type, or an array (nested to any depth) of floating-point scalar or vector
types.
are only valid for phis that return a floating-point scalar or vector type,
possibly within an array (nested to any depth), or a homogeneous struct literal.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove "possibly", and elaborate on what a homogenous struct means (also it's not a literal?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be "homogeneous literal struct". "literal" here means "unnamed".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove "possibly"

I think removing that makes it sound like the float/vector type is required to be within an array or struct type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just list the cases that work

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've listed the cases under the "Fast-Math Flags" section and updated the instructions to reference that. I think that's a little clearer than trying to squash call the cases into a single sentence (and makes future updates easier).

I've also updated one of the struct examples to point out it's a homogeneous struct.

@@ -1122,6 +1122,26 @@ define void @fastMathFlagsForArrayCalls([2 x float] %f, [2 x double] %d1, [2 x <
ret void
}

declare { float, float } @fmf_struct_f32()
declare { double, double } @fmf_struct_f64()
declare { <4 x double>, <4 x double> } @fmf_struct_v4f64()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test some cases with more than 2 entries

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And empty and an odd number of entries

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And complex nested structs

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nested structs and empty structs are not supported (so can't have the attributes in the IR without an error), so I'll instead add those tests to the InstructionsTest.cpp unittest

llvm/unittests/IR/InstructionsTest.cpp Show resolved Hide resolved

%call.nnan.ninf = tail call nnan ninf fastcc { <4 x double>, <4 x double> } @fmf_struct_v4f64()
; CHECK: %call.nnan.ninf = tail call nnan ninf fastcc { <4 x double>, <4 x double> } @fmf_struct_v4f64()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about struct of array?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add a test, but it's not a supported case (as I don't currently see a need for it).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add a test with nofpclass attributes on the return / argument? The intent was it would be allowed for the same types as FPMathOperator

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nofpclass used a separate check, so I had to update it to support struct types (in the last commit). Not sure if it should be part of this PR, or moved to a later PR though?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:X86 labels Sep 30, 2024
Copy link

github-actions bot commented Sep 30, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the modified clang tests are failing on the pre-merge CI?

�_bk;t=1727733748123�Failed Tests (3):
�_bk;t=1727733748123�  Clang :: CodeGen/X86/cx-complex-range.c
�_bk;t=1727733748123�  Clang :: CodeGen/cx-complex-range.c
�_bk;t=1727733748123�  Clang :: CodeGen/nofpclass.c

llvm/include/llvm/IR/Operator.h Outdated Show resolved Hide resolved
llvm/include/llvm/IR/Operator.h Outdated Show resolved Hide resolved
@MacDue
Copy link
Member Author

MacDue commented Oct 1, 2024

It looks like the modified clang tests are failing on the pre-merge CI?

�_bk;t=1727733748123�Failed Tests (3):
�_bk;t=1727733748123�  Clang :: CodeGen/X86/cx-complex-range.c
�_bk;t=1727733748123�  Clang :: CodeGen/cx-complex-range.c
�_bk;t=1727733748123�  Clang :: CodeGen/nofpclass.c

Ah thanks, they need to be updated for the nofpclass changes too now 👍

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM from my side, but please wait on @arsenm's approval as well.

@MacDue MacDue merged commit 95f00a6 into llvm:main Oct 2, 2024
9 checks passed
@MacDue MacDue deleted the fmf_struct_literals branch October 2, 2024 09:05
VitaNuo pushed a commit to VitaNuo/llvm-project that referenced this pull request Oct 2, 2024
…lvm#110506)

This extends FPMathOperator to allow calls that return literal structs
of homogeneous floating-point or vector-of-floating-point types.

The intended use case for this is to support FP intrinsics that return
multiple values (such as `llvm.sincos`).
VitaNuo pushed a commit to VitaNuo/llvm-project that referenced this pull request Oct 2, 2024
…lvm#110506)

This extends FPMathOperator to allow calls that return literal structs
of homogeneous floating-point or vector-of-floating-point types.

The intended use case for this is to support FP intrinsics that return
multiple values (such as `llvm.sincos`).
Sterling-Augustine pushed a commit to Sterling-Augustine/llvm-project that referenced this pull request Oct 3, 2024
…lvm#110506)

This extends FPMathOperator to allow calls that return literal structs
of homogeneous floating-point or vector-of-floating-point types.

The intended use case for this is to support FP intrinsics that return
multiple values (such as `llvm.sincos`).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 clang Clang issues not falling into any other category llvm:ir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants