-
Notifications
You must be signed in to change notification settings - Fork 11.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[IR] Allow fast math flags on calls with homogeneous FP struct types #110506
Conversation
This extends FPMathOperator to allow calls that return literal structs of homogeneous floating-point or vector-of-floating-point types. The intended use case for this is to support FP intrinsics that return multiple values (such as `llvm.sincos`).
@llvm/pr-subscribers-clang @llvm/pr-subscribers-llvm-ir Author: Benjamin Maxwell (MacDue) ChangesThis extends FPMathOperator to allow calls that return literal structs of homogeneous floating-point or vector-of-floating-point types. The intended use case for this is to support FP intrinsics that return multiple values (such as Full diff: https://github.com/llvm/llvm-project/pull/110506.diff 6 Files Affected:
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 3f39d58b322a4f..1eb2982385fda0 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -12472,9 +12472,8 @@ instruction's return value on the same edge).
The optional ``fast-math-flags`` marker indicates that the phi has one
or more :ref:`fast-math-flags <fastmath>`. These are optimization hints
to enable otherwise unsafe floating-point optimizations. Fast-math-flags
-are only valid for phis that return a floating-point scalar or vector
-type, or an array (nested to any depth) of floating-point scalar or vector
-types.
+are only valid for phis that return a floating-point scalar or vector type,
+possibly within an array (nested to any depth), or a homogeneous struct literal.
Semantics:
""""""""""
@@ -12523,8 +12522,8 @@ class <t_firstclass>` type.
#. The optional ``fast-math flags`` marker indicates that the select has one or more
:ref:`fast-math flags <fastmath>`. These are optimization hints to enable
otherwise unsafe floating-point optimizations. Fast-math flags are only valid
- for selects that return a floating-point scalar or vector type, or an array
- (nested to any depth) of floating-point scalar or vector types.
+ for selects that return a floating-point scalar or vector type, possibly
+ within an array (nested to any depth), or a homogeneous struct literal.
Semantics:
""""""""""
@@ -12762,8 +12761,8 @@ This instruction requires several arguments:
#. The optional ``fast-math flags`` marker indicates that the call has one or more
:ref:`fast-math flags <fastmath>`, which are optimization hints to enable
otherwise unsafe floating-point optimizations. Fast-math flags are only valid
- for calls that return a floating-point scalar or vector type, or an array
- (nested to any depth) of floating-point scalar or vector types.
+ for calls that return a floating-point scalar or vector type, possibly within
+ an array (nested to any depth), or a homogeneous struct literal.
#. The optional "cconv" marker indicates which :ref:`calling
convention <callingconv>` the call should use. If none is
@@ -20528,7 +20527,8 @@ the explicit vector length.
more :ref:`fast-math flags <fastmath>`. These are optimization hints to
enable otherwise unsafe floating-point optimizations. Fast-math flags are
only valid for selects that return a floating-point scalar or vector type,
- or an array (nested to any depth) of floating-point scalar or vector types.
+ possibly within an array (nested to any depth), or a homogeneous struct
+ literal.
Semantics:
""""""""""
@@ -20586,7 +20586,8 @@ is the pivot.
more :ref:`fast-math flags <fastmath>`. These are optimization hints to
enable otherwise unsafe floating-point optimizations. Fast-math flags are
only valid for merges that return a floating-point scalar or vector type,
- or an array (nested to any depth) of floating-point scalar or vector types.
+ possibly within an array (nested to any depth), or a homogeneous struct
+ literal.
Semantics:
""""""""""
diff --git a/llvm/include/llvm/IR/DerivedTypes.h b/llvm/include/llvm/IR/DerivedTypes.h
index 975c142f1a4572..a24801d8bdf834 100644
--- a/llvm/include/llvm/IR/DerivedTypes.h
+++ b/llvm/include/llvm/IR/DerivedTypes.h
@@ -301,6 +301,10 @@ class StructType : public Type {
/// {<vscale x 2 x i32>, <vscale x 4 x i64>}}
bool containsHomogeneousScalableVectorTypes() const;
+ /// Return true if this struct is non-empty and all element types are the
+ /// same.
+ bool containsHomogeneousTypes() const;
+
/// Return true if this is a named struct that has a non-empty name.
bool hasName() const { return SymbolTableEntry != nullptr; }
diff --git a/llvm/include/llvm/IR/Operator.h b/llvm/include/llvm/IR/Operator.h
index 88b9bfc0be4b15..22ffcc730e7b68 100644
--- a/llvm/include/llvm/IR/Operator.h
+++ b/llvm/include/llvm/IR/Operator.h
@@ -15,6 +15,7 @@
#define LLVM_IR_OPERATOR_H
#include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/TypeSwitch.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/FMF.h"
#include "llvm/IR/GEPNoWrapFlags.h"
@@ -351,8 +352,17 @@ class FPMathOperator : public Operator {
case Instruction::Select:
case Instruction::Call: {
Type *Ty = V->getType();
- while (ArrayType *ArrTy = dyn_cast<ArrayType>(Ty))
- Ty = ArrTy->getElementType();
+ TypeSwitch<Type *>(Ty)
+ .Case([&](StructType *StructTy) {
+ if (!StructTy->isLiteral() || !StructTy->containsHomogeneousTypes())
+ return;
+ Ty = StructTy->elements().front();
+ })
+ .Case([&](ArrayType *ArrTy) {
+ do {
+ Ty = ArrTy->getElementType();
+ } while ((ArrTy = dyn_cast<ArrayType>(Ty)));
+ });
return Ty->isFPOrFPVectorTy();
}
default:
diff --git a/llvm/lib/IR/Type.cpp b/llvm/lib/IR/Type.cpp
index 3784ad28d7219d..f618263f79c313 100644
--- a/llvm/lib/IR/Type.cpp
+++ b/llvm/lib/IR/Type.cpp
@@ -430,13 +430,14 @@ bool StructType::containsScalableVectorType(
}
bool StructType::containsHomogeneousScalableVectorTypes() const {
- Type *FirstTy = getNumElements() > 0 ? elements()[0] : nullptr;
- if (!FirstTy || !isa<ScalableVectorType>(FirstTy))
+ if (getNumElements() <= 0 || !isa<ScalableVectorType>(elements().front()))
return false;
- for (Type *Ty : elements())
- if (Ty != FirstTy)
- return false;
- return true;
+ return containsHomogeneousTypes();
+}
+
+bool StructType::containsHomogeneousTypes() const {
+ ArrayRef<Type *> ElementTys = elements();
+ return !ElementTys.empty() && all_equal(ElementTys);
}
void StructType::setBody(ArrayRef<Type*> Elements, bool isPacked) {
diff --git a/llvm/test/Bitcode/compatibility.ll b/llvm/test/Bitcode/compatibility.ll
index ea29ff634a43bb..4fe9d9b11f8831 100644
--- a/llvm/test/Bitcode/compatibility.ll
+++ b/llvm/test/Bitcode/compatibility.ll
@@ -1122,6 +1122,26 @@ define void @fastMathFlagsForArrayCalls([2 x float] %f, [2 x double] %d1, [2 x <
ret void
}
+declare { float, float } @fmf_struct_f32()
+declare { double, double } @fmf_struct_f64()
+declare { <4 x double>, <4 x double> } @fmf_struct_v4f64()
+
+; CHECK-LABEL: fastMathFlagsForStructCalls(
+define void @fastMathFlagsForStructCalls({ float, float } %f, { double, double } %d1, { <4 x double>, <4 x double> } %d2) {
+ %call.fast = call fast { float, float } @fmf_struct_f32()
+ ; CHECK: %call.fast = call fast { float, float } @fmf_struct_f32()
+
+ ; Throw in some other attributes to make sure those stay in the right places.
+
+ %call.nsz.arcp = notail call nsz arcp { double, double } @fmf_struct_f64()
+ ; CHECK: %call.nsz.arcp = notail call nsz arcp { double, double } @fmf_struct_f64()
+
+ %call.nnan.ninf = tail call nnan ninf fastcc { <4 x double>, <4 x double> } @fmf_struct_v4f64()
+ ; CHECK: %call.nnan.ninf = tail call nnan ninf fastcc { <4 x double>, <4 x double> } @fmf_struct_v4f64()
+
+ ret void
+}
+
;; Type System
%opaquety = type opaque
define void @typesystem() {
diff --git a/llvm/unittests/IR/InstructionsTest.cpp b/llvm/unittests/IR/InstructionsTest.cpp
index 481fe96607e48e..9d8056a768af2f 100644
--- a/llvm/unittests/IR/InstructionsTest.cpp
+++ b/llvm/unittests/IR/InstructionsTest.cpp
@@ -1559,12 +1559,40 @@ TEST(InstructionsTest, FPCallIsFPMathOperator) {
CallInst::Create(AVFFnTy, AVFCallee, {}, ""));
EXPECT_TRUE(isa<FPMathOperator>(AVFCall));
- Type *AAVFTy = ArrayType::get(AVFTy, 2);
- FunctionType *AAVFFnTy = FunctionType::get(AAVFTy, {});
- Value *AAVFCallee = Constant::getNullValue(PtrTy);
- std::unique_ptr<CallInst> AAVFCall(
- CallInst::Create(AAVFFnTy, AAVFCallee, {}, ""));
- EXPECT_TRUE(isa<FPMathOperator>(AAVFCall));
+ Type *StructITy = StructType::get(ITy, ITy);
+ FunctionType *StructIFnTy = FunctionType::get(StructITy, {});
+ Value *StructICallee = Constant::getNullValue(PtrTy);
+ std::unique_ptr<CallInst> StructICall(
+ CallInst::Create(StructIFnTy, StructICallee, {}, ""));
+ EXPECT_FALSE(isa<FPMathOperator>(StructICall));
+
+ Type *NamedStructFTy = StructType::create({FTy, FTy}, "AStruct");
+ FunctionType *NamedStructFFnTy = FunctionType::get(NamedStructFTy, {});
+ Value *NamedStructFCallee = Constant::getNullValue(PtrTy);
+ std::unique_ptr<CallInst> NamedStructFCall(
+ CallInst::Create(NamedStructFFnTy, NamedStructFCallee, {}, ""));
+ EXPECT_FALSE(isa<FPMathOperator>(NamedStructFCall));
+
+ Type *MixedStructTy = StructType::get(FTy, ITy);
+ FunctionType *MixedStructFnTy = FunctionType::get(MixedStructTy, {});
+ Value *MixedStructCallee = Constant::getNullValue(PtrTy);
+ std::unique_ptr<CallInst> MixedStructCall(
+ CallInst::Create(MixedStructFnTy, MixedStructCallee, {}, ""));
+ EXPECT_FALSE(isa<FPMathOperator>(MixedStructCall));
+
+ Type *StructFTy = StructType::get(FTy, FTy);
+ FunctionType *StructFFnTy = FunctionType::get(StructFTy, {});
+ Value *StructFCallee = Constant::getNullValue(PtrTy);
+ std::unique_ptr<CallInst> StructFCall(
+ CallInst::Create(StructFFnTy, StructFCallee, {}, ""));
+ EXPECT_TRUE(isa<FPMathOperator>(StructFCall));
+
+ Type *StructVFTy = StructType::get(VFTy, VFTy);
+ FunctionType *StructVFFnTy = FunctionType::get(StructVFTy, {});
+ Value *StructVFCallee = Constant::getNullValue(PtrTy);
+ std::unique_ptr<CallInst> StructVFCall(
+ CallInst::Create(StructVFFnTy, StructVFCallee, {}, ""));
+ EXPECT_TRUE(isa<FPMathOperator>(StructVFCall));
}
TEST(InstructionsTest, FNegInstruction) {
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with this extension.
llvm/docs/LangRef.rst
Outdated
type, or an array (nested to any depth) of floating-point scalar or vector | ||
types. | ||
are only valid for phis that return a floating-point scalar or vector type, | ||
possibly within an array (nested to any depth), or a homogeneous struct literal. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove "possibly", and elaborate on what a homogenous struct means (also it's not a literal?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be "homogeneous literal struct". "literal" here means "unnamed".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove "possibly"
I think removing that makes it sound like the float/vector type is required to be within an array or struct type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just list the cases that work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've listed the cases under the "Fast-Math Flags" section and updated the instructions to reference that. I think that's a little clearer than trying to squash call the cases into a single sentence (and makes future updates easier).
I've also updated one of the struct examples to point out it's a homogeneous struct.
llvm/test/Bitcode/compatibility.ll
Outdated
@@ -1122,6 +1122,26 @@ define void @fastMathFlagsForArrayCalls([2 x float] %f, [2 x double] %d1, [2 x < | |||
ret void | |||
} | |||
|
|||
declare { float, float } @fmf_struct_f32() | |||
declare { double, double } @fmf_struct_f64() | |||
declare { <4 x double>, <4 x double> } @fmf_struct_v4f64() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test some cases with more than 2 entries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And empty and an odd number of entries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And complex nested structs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nested structs and empty structs are not supported (so can't have the attributes in the IR without an error), so I'll instead add those tests to the InstructionsTest.cpp
unittest
|
||
%call.nnan.ninf = tail call nnan ninf fastcc { <4 x double>, <4 x double> } @fmf_struct_v4f64() | ||
; CHECK: %call.nnan.ninf = tail call nnan ninf fastcc { <4 x double>, <4 x double> } @fmf_struct_v4f64() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about struct of array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add a test, but it's not a supported case (as I don't currently see a need for it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add a test with nofpclass attributes on the return / argument? The intent was it would be allowed for the same types as FPMathOperator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nofpclass
used a separate check, so I had to update it to support struct types (in the last commit). Not sure if it should be part of this PR, or moved to a later PR though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either way
ad12ad2
to
8d3353c
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
6e7bebc
to
04b7d82
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the modified clang tests are failing on the pre-merge CI?
�_bk;t=1727733748123�Failed Tests (3):
�_bk;t=1727733748123� Clang :: CodeGen/X86/cx-complex-range.c
�_bk;t=1727733748123� Clang :: CodeGen/cx-complex-range.c
�_bk;t=1727733748123� Clang :: CodeGen/nofpclass.c
Ah thanks, they need to be updated for the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM from my side, but please wait on @arsenm's approval as well.
b4ef422
to
4687a18
Compare
4687a18
to
98bd284
Compare
…lvm#110506) This extends FPMathOperator to allow calls that return literal structs of homogeneous floating-point or vector-of-floating-point types. The intended use case for this is to support FP intrinsics that return multiple values (such as `llvm.sincos`).
…lvm#110506) This extends FPMathOperator to allow calls that return literal structs of homogeneous floating-point or vector-of-floating-point types. The intended use case for this is to support FP intrinsics that return multiple values (such as `llvm.sincos`).
…lvm#110506) This extends FPMathOperator to allow calls that return literal structs of homogeneous floating-point or vector-of-floating-point types. The intended use case for this is to support FP intrinsics that return multiple values (such as `llvm.sincos`).
This extends FPMathOperator to allow calls that return literal structs of homogeneous floating-point or vector-of-floating-point types.
The intended use case for this is to support FP intrinsics that return multiple values (such as
llvm.sincos
).