[IR] Allow fast math flags on calls with homogeneous FP struct types #110506

MacDue · 2024-09-30T13:10:11Z

This extends FPMathOperator to allow calls that return literal structs of homogeneous floating-point or vector-of-floating-point types.

The intended use case for this is to support FP intrinsics that return multiple values (such as llvm.sincos).

This extends FPMathOperator to allow calls that return literal structs of homogeneous floating-point or vector-of-floating-point types. The intended use case for this is to support FP intrinsics that return multiple values (such as `llvm.sincos`).

llvmbot · 2024-09-30T13:10:50Z

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-llvm-ir

Author: Benjamin Maxwell (MacDue)

Changes

This extends FPMathOperator to allow calls that return literal structs of homogeneous floating-point or vector-of-floating-point types.

The intended use case for this is to support FP intrinsics that return multiple values (such as llvm.sincos).

Full diff: https://github.com/llvm/llvm-project/pull/110506.diff

6 Files Affected:

(modified) llvm/docs/LangRef.rst (+10-9)
(modified) llvm/include/llvm/IR/DerivedTypes.h (+4)
(modified) llvm/include/llvm/IR/Operator.h (+12-2)
(modified) llvm/lib/IR/Type.cpp (+7-6)
(modified) llvm/test/Bitcode/compatibility.ll (+20)
(modified) llvm/unittests/IR/InstructionsTest.cpp (+34-6)

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 3f39d58b322a4f..1eb2982385fda0 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -12472,9 +12472,8 @@ instruction's return value on the same edge).
 The optional ``fast-math-flags`` marker indicates that the phi has one
 or more :ref:`fast-math-flags <fastmath>`. These are optimization hints
 to enable otherwise unsafe floating-point optimizations. Fast-math-flags
-are only valid for phis that return a floating-point scalar or vector
-type, or an array (nested to any depth) of floating-point scalar or vector
-types.
+are only valid for phis that return a floating-point scalar or vector type,
+possibly within an array (nested to any depth), or a homogeneous struct literal.
 
 Semantics:
 """"""""""
@@ -12523,8 +12522,8 @@ class <t_firstclass>` type.
 #. The optional ``fast-math flags`` marker indicates that the select has one or more
    :ref:`fast-math flags <fastmath>`. These are optimization hints to enable
    otherwise unsafe floating-point optimizations. Fast-math flags are only valid
-   for selects that return a floating-point scalar or vector type, or an array
-   (nested to any depth) of floating-point scalar or vector types.
+   for selects that return a floating-point scalar or vector type, possibly
+   within an array (nested to any depth), or a homogeneous struct literal.
 
 Semantics:
 """"""""""
@@ -12762,8 +12761,8 @@ This instruction requires several arguments:
 #. The optional ``fast-math flags`` marker indicates that the call has one or more
    :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
    otherwise unsafe floating-point optimizations. Fast-math flags are only valid
-   for calls that return a floating-point scalar or vector type, or an array
-   (nested to any depth) of floating-point scalar or vector types.
+   for calls that return a floating-point scalar or vector type, possibly within
+   an array (nested to any depth), or a homogeneous struct literal.
 
 #. The optional "cconv" marker indicates which :ref:`calling
    convention <callingconv>` the call should use. If none is
@@ -20528,7 +20527,8 @@ the explicit vector length.
    more :ref:`fast-math flags <fastmath>`. These are optimization hints to
    enable otherwise unsafe floating-point optimizations. Fast-math flags are
    only valid for selects that return a floating-point scalar or vector type,
-   or an array (nested to any depth) of floating-point scalar or vector types.
+   possibly within an array (nested to any depth), or a homogeneous struct
+   literal.
 
 Semantics:
 """"""""""
@@ -20586,7 +20586,8 @@ is the pivot.
    more :ref:`fast-math flags <fastmath>`. These are optimization hints to
    enable otherwise unsafe floating-point optimizations. Fast-math flags are
    only valid for merges that return a floating-point scalar or vector type,
-   or an array (nested to any depth) of floating-point scalar or vector types.
+   possibly within an array (nested to any depth), or a homogeneous struct
+   literal.
 
 Semantics:
 """"""""""
diff --git a/llvm/include/llvm/IR/DerivedTypes.h b/llvm/include/llvm/IR/DerivedTypes.h
index 975c142f1a4572..a24801d8bdf834 100644
--- a/llvm/include/llvm/IR/DerivedTypes.h
+++ b/llvm/include/llvm/IR/DerivedTypes.h
@@ -301,6 +301,10 @@ class StructType : public Type {
   ///  {<vscale x 2 x i32>, <vscale x 4 x i64>}}
   bool containsHomogeneousScalableVectorTypes() const;
 
+  /// Return true if this struct is non-empty and all element types are the
+  /// same.
+  bool containsHomogeneousTypes() const;
+
   /// Return true if this is a named struct that has a non-empty name.
   bool hasName() const { return SymbolTableEntry != nullptr; }
 
diff --git a/llvm/include/llvm/IR/Operator.h b/llvm/include/llvm/IR/Operator.h
index 88b9bfc0be4b15..22ffcc730e7b68 100644
--- a/llvm/include/llvm/IR/Operator.h
+++ b/llvm/include/llvm/IR/Operator.h
@@ -15,6 +15,7 @@
 #define LLVM_IR_OPERATOR_H
 
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/TypeSwitch.h"
 #include "llvm/IR/Constants.h"
 #include "llvm/IR/FMF.h"
 #include "llvm/IR/GEPNoWrapFlags.h"
@@ -351,8 +352,17 @@ class FPMathOperator : public Operator {
     case Instruction::Select:
     case Instruction::Call: {
       Type *Ty = V->getType();
-      while (ArrayType *ArrTy = dyn_cast<ArrayType>(Ty))
-        Ty = ArrTy->getElementType();
+      TypeSwitch<Type *>(Ty)
+          .Case([&](StructType *StructTy) {
+            if (!StructTy->isLiteral() || !StructTy->containsHomogeneousTypes())
+              return;
+            Ty = StructTy->elements().front();
+          })
+          .Case([&](ArrayType *ArrTy) {
+            do {
+              Ty = ArrTy->getElementType();
+            } while ((ArrTy = dyn_cast<ArrayType>(Ty)));
+          });
       return Ty->isFPOrFPVectorTy();
     }
     default:
diff --git a/llvm/lib/IR/Type.cpp b/llvm/lib/IR/Type.cpp
index 3784ad28d7219d..f618263f79c313 100644
--- a/llvm/lib/IR/Type.cpp
+++ b/llvm/lib/IR/Type.cpp
@@ -430,13 +430,14 @@ bool StructType::containsScalableVectorType(
 }
 
 bool StructType::containsHomogeneousScalableVectorTypes() const {
-  Type *FirstTy = getNumElements() > 0 ? elements()[0] : nullptr;
-  if (!FirstTy || !isa<ScalableVectorType>(FirstTy))
+  if (getNumElements() <= 0 || !isa<ScalableVectorType>(elements().front()))
     return false;
-  for (Type *Ty : elements())
-    if (Ty != FirstTy)
-      return false;
-  return true;
+  return containsHomogeneousTypes();
+}
+
+bool StructType::containsHomogeneousTypes() const {
+  ArrayRef<Type *> ElementTys = elements();
+  return !ElementTys.empty() && all_equal(ElementTys);
 }
 
 void StructType::setBody(ArrayRef<Type*> Elements, bool isPacked) {
diff --git a/llvm/test/Bitcode/compatibility.ll b/llvm/test/Bitcode/compatibility.ll
index ea29ff634a43bb..4fe9d9b11f8831 100644
--- a/llvm/test/Bitcode/compatibility.ll
+++ b/llvm/test/Bitcode/compatibility.ll
@@ -1122,6 +1122,26 @@ define void @fastMathFlagsForArrayCalls([2 x float] %f, [2 x double] %d1, [2 x <
   ret void
 }
 
+declare { float, float } @fmf_struct_f32()
+declare { double, double } @fmf_struct_f64()
+declare { <4 x double>, <4 x double> } @fmf_struct_v4f64()
+
+; CHECK-LABEL: fastMathFlagsForStructCalls(
+define void @fastMathFlagsForStructCalls({ float, float } %f, { double, double } %d1, { <4 x double>, <4 x double> } %d2) {
+  %call.fast = call fast { float, float } @fmf_struct_f32()
+  ; CHECK: %call.fast = call fast { float, float } @fmf_struct_f32()
+
+  ; Throw in some other attributes to make sure those stay in the right places.
+
+  %call.nsz.arcp = notail call nsz arcp { double, double } @fmf_struct_f64()
+  ; CHECK: %call.nsz.arcp = notail call nsz arcp { double, double } @fmf_struct_f64()
+
+  %call.nnan.ninf = tail call nnan ninf fastcc { <4 x double>, <4 x double> } @fmf_struct_v4f64()
+  ; CHECK: %call.nnan.ninf = tail call nnan ninf fastcc { <4 x double>, <4 x double> } @fmf_struct_v4f64()
+
+  ret void
+}
+
 ;; Type System
 %opaquety = type opaque
 define void @typesystem() {
diff --git a/llvm/unittests/IR/InstructionsTest.cpp b/llvm/unittests/IR/InstructionsTest.cpp
index 481fe96607e48e..9d8056a768af2f 100644
--- a/llvm/unittests/IR/InstructionsTest.cpp
+++ b/llvm/unittests/IR/InstructionsTest.cpp
@@ -1559,12 +1559,40 @@ TEST(InstructionsTest, FPCallIsFPMathOperator) {
       CallInst::Create(AVFFnTy, AVFCallee, {}, ""));
   EXPECT_TRUE(isa<FPMathOperator>(AVFCall));
 
-  Type *AAVFTy = ArrayType::get(AVFTy, 2);
-  FunctionType *AAVFFnTy = FunctionType::get(AAVFTy, {});
-  Value *AAVFCallee = Constant::getNullValue(PtrTy);
-  std::unique_ptr<CallInst> AAVFCall(
-      CallInst::Create(AAVFFnTy, AAVFCallee, {}, ""));
-  EXPECT_TRUE(isa<FPMathOperator>(AAVFCall));
+  Type *StructITy = StructType::get(ITy, ITy);
+  FunctionType *StructIFnTy = FunctionType::get(StructITy, {});
+  Value *StructICallee = Constant::getNullValue(PtrTy);
+  std::unique_ptr<CallInst> StructICall(
+      CallInst::Create(StructIFnTy, StructICallee, {}, ""));
+  EXPECT_FALSE(isa<FPMathOperator>(StructICall));
+
+  Type *NamedStructFTy = StructType::create({FTy, FTy}, "AStruct");
+  FunctionType *NamedStructFFnTy = FunctionType::get(NamedStructFTy, {});
+  Value *NamedStructFCallee = Constant::getNullValue(PtrTy);
+  std::unique_ptr<CallInst> NamedStructFCall(
+      CallInst::Create(NamedStructFFnTy, NamedStructFCallee, {}, ""));
+  EXPECT_FALSE(isa<FPMathOperator>(NamedStructFCall));
+
+  Type *MixedStructTy = StructType::get(FTy, ITy);
+  FunctionType *MixedStructFnTy = FunctionType::get(MixedStructTy, {});
+  Value *MixedStructCallee = Constant::getNullValue(PtrTy);
+  std::unique_ptr<CallInst> MixedStructCall(
+      CallInst::Create(MixedStructFnTy, MixedStructCallee, {}, ""));
+  EXPECT_FALSE(isa<FPMathOperator>(MixedStructCall));
+
+  Type *StructFTy = StructType::get(FTy, FTy);
+  FunctionType *StructFFnTy = FunctionType::get(StructFTy, {});
+  Value *StructFCallee = Constant::getNullValue(PtrTy);
+  std::unique_ptr<CallInst> StructFCall(
+      CallInst::Create(StructFFnTy, StructFCallee, {}, ""));
+  EXPECT_TRUE(isa<FPMathOperator>(StructFCall));
+
+  Type *StructVFTy = StructType::get(VFTy, VFTy);
+  FunctionType *StructVFFnTy = FunctionType::get(StructVFTy, {});
+  Value *StructVFCallee = Constant::getNullValue(PtrTy);
+  std::unique_ptr<CallInst> StructVFCall(
+      CallInst::Create(StructVFFnTy, StructVFCallee, {}, ""));
+  EXPECT_TRUE(isa<FPMathOperator>(StructVFCall));
 }
 
 TEST(InstructionsTest, FNegInstruction) {

nikic

I'm fine with this extension.

llvm/include/llvm/IR/Operator.h

arsenm · 2024-09-30T15:23:18Z

llvm/docs/LangRef.rst

-type, or an array (nested to any depth) of floating-point scalar or vector
-types.
+are only valid for phis that return a floating-point scalar or vector type,
+possibly within an array (nested to any depth), or a homogeneous struct literal.


Remove "possibly", and elaborate on what a homogenous struct means (also it's not a literal?)

Should be "homogeneous literal struct". "literal" here means "unnamed".

Remove "possibly"

I think removing that makes it sound like the float/vector type is required to be within an array or struct type.

Just list the cases that work

I've listed the cases under the "Fast-Math Flags" section and updated the instructions to reference that. I think that's a little clearer than trying to squash call the cases into a single sentence (and makes future updates easier).

I've also updated one of the struct examples to point out it's a homogeneous struct.

arsenm · 2024-09-30T15:24:05Z

llvm/test/Bitcode/compatibility.ll

@@ -1122,6 +1122,26 @@ define void @fastMathFlagsForArrayCalls([2 x float] %f, [2 x double] %d1, [2 x <
  ret void
 }

+declare { float, float } @fmf_struct_f32()
+declare { double, double } @fmf_struct_f64()
+declare { <4 x double>, <4 x double> } @fmf_struct_v4f64()


Test some cases with more than 2 entries

And empty and an odd number of entries

And complex nested structs

Nested structs and empty structs are not supported (so can't have the attributes in the IR without an error), so I'll instead add those tests to the InstructionsTest.cpp unittest

llvm/unittests/IR/InstructionsTest.cpp

arsenm · 2024-09-30T15:26:14Z

llvm/test/Bitcode/compatibility.ll

+
+  %call.nnan.ninf = tail call nnan ninf fastcc { <4 x double>, <4 x double> } @fmf_struct_v4f64()
+  ; CHECK: %call.nnan.ninf = tail call nnan ninf fastcc { <4 x double>, <4 x double> } @fmf_struct_v4f64()
+


What about struct of array?

I can add a test, but it's not a supported case (as I don't currently see a need for it).

Can you also add a test with nofpclass attributes on the return / argument? The intent was it would be allowed for the same types as FPMathOperator

nofpclass used a separate check, so I had to update it to support struct types (in the last commit). Not sure if it should be part of this PR, or moved to a later PR though?

github-actions · 2024-09-30T18:54:17Z

✅ With the latest revision this PR passed the C/C++ code formatter.

nikic

It looks like the modified clang tests are failing on the pre-merge CI?

�_bk;t=1727733748123�Failed Tests (3):
�_bk;t=1727733748123�  Clang :: CodeGen/X86/cx-complex-range.c
�_bk;t=1727733748123�  Clang :: CodeGen/cx-complex-range.c
�_bk;t=1727733748123�  Clang :: CodeGen/nofpclass.c

llvm/include/llvm/IR/Operator.h

MacDue · 2024-10-01T08:47:52Z

It looks like the modified clang tests are failing on the pre-merge CI?

�_bk;t=1727733748123�Failed Tests (3):
�_bk;t=1727733748123�  Clang :: CodeGen/X86/cx-complex-range.c
�_bk;t=1727733748123�  Clang :: CodeGen/cx-complex-range.c
�_bk;t=1727733748123�  Clang :: CodeGen/nofpclass.c

Ah thanks, they need to be updated for the nofpclass changes too now 👍

llvm/lib/IR/Attributes.cpp

llvm/include/llvm/IR/Operator.h

nikic

LGTM from my side, but please wait on @arsenm's approval as well.

…lvm#110506) This extends FPMathOperator to allow calls that return literal structs of homogeneous floating-point or vector-of-floating-point types. The intended use case for this is to support FP intrinsics that return multiple values (such as `llvm.sincos`).

llvmbot added the llvm:ir label Sep 30, 2024

MacDue mentioned this pull request Sep 30, 2024

[IR] Add llvm.sincos intrinsic #109825

Open

tschuett requested review from arsenm, nikic and tschuett September 30, 2024 13:18

nikic reviewed Sep 30, 2024

View reviewed changes

llvm/include/llvm/IR/Operator.h Outdated Show resolved Hide resolved

Remove TypeSwitch

6881540

arsenm reviewed Sep 30, 2024

View reviewed changes

Update clang tests

5cc741f

llvmbot added clang Clang issues not falling into any other category backend:X86 labels Sep 30, 2024

Fixups

8d3353c

MacDue force-pushed the fmf_struct_literals branch from ad12ad2 to 8d3353c Compare September 30, 2024 17:14

Update nofpclass

04b7d82

MacDue force-pushed the fmf_struct_literals branch from 6e7bebc to 04b7d82 Compare September 30, 2024 18:55

nikic reviewed Oct 1, 2024

View reviewed changes

llvm/include/llvm/IR/Operator.h Outdated Show resolved Hide resolved

llvm/include/llvm/IR/Operator.h Outdated Show resolved Hide resolved

Update nofpclass clang tests

77479c6

arsenm reviewed Oct 1, 2024

View reviewed changes

llvm/lib/IR/Attributes.cpp Show resolved Hide resolved

MacDue added 2 commits October 1, 2024 09:11

Use auto

1869ca2

Remove TODO

6bfe0bc

arsenm reviewed Oct 1, 2024

View reviewed changes

llvm/include/llvm/IR/Operator.h Outdated Show resolved Hide resolved

nikic approved these changes Oct 1, 2024

View reviewed changes

MacDue force-pushed the fmf_struct_literals branch from b4ef422 to 4687a18 Compare October 1, 2024 15:34

Short circuit check

98bd284

MacDue force-pushed the fmf_struct_literals branch from 4687a18 to 98bd284 Compare October 1, 2024 16:11

arsenm approved these changes Oct 1, 2024

View reviewed changes

MacDue merged commit 95f00a6 into llvm:main Oct 2, 2024
9 checks passed

MacDue deleted the fmf_struct_literals branch October 2, 2024 09:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[IR] Allow fast math flags on calls with homogeneous FP struct types #110506

[IR] Allow fast math flags on calls with homogeneous FP struct types #110506

MacDue commented Sep 30, 2024

llvmbot commented Sep 30, 2024 •

edited

Loading

nikic left a comment

arsenm Sep 30, 2024

nikic Sep 30, 2024

MacDue Sep 30, 2024

arsenm Sep 30, 2024

MacDue Sep 30, 2024

arsenm Sep 30, 2024

tschuett Sep 30, 2024

tschuett Sep 30, 2024

MacDue Sep 30, 2024

arsenm Sep 30, 2024

MacDue Sep 30, 2024

arsenm Sep 30, 2024

MacDue Sep 30, 2024

arsenm Oct 1, 2024

github-actions bot commented Sep 30, 2024 •

edited

Loading

nikic left a comment

MacDue commented Oct 1, 2024

nikic left a comment


		%call.nnan.ninf = tail call nnan ninf fastcc { <4 x double>, <4 x double> } @fmf_struct_v4f64()
		; CHECK: %call.nnan.ninf = tail call nnan ninf fastcc { <4 x double>, <4 x double> } @fmf_struct_v4f64()

[IR] Allow fast math flags on calls with homogeneous FP struct types #110506

[IR] Allow fast math flags on calls with homogeneous FP struct types #110506

Conversation

MacDue commented Sep 30, 2024

llvmbot commented Sep 30, 2024 • edited Loading

nikic left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

github-actions bot commented Sep 30, 2024 • edited Loading

nikic left a comment

Choose a reason for hiding this comment

MacDue commented Oct 1, 2024

nikic left a comment

Choose a reason for hiding this comment

llvmbot commented Sep 30, 2024 •

edited

Loading

github-actions bot commented Sep 30, 2024 •

edited

Loading