Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flang][Lower] Always generate namelist group locally #109303

Closed
wants to merge 1 commit into from

Conversation

tblah
Copy link
Contributor

@tblah tblah commented Sep 19, 2024

Namelists are allowed inside of OpenMP reduction regions. Those regions map the symbol for the reduction variable to a thread-local reduction variable, but this mapping is escaped if the namelist is generated as a global variable because the global variable will refer the original symbol mapping. See #101907 for a code example.

We can't do this only when inside of an OpenMP region because the IO function might be inside of a subroutine/function which is called from inside the OpenMP region.

There is also no restriction against namelist variables for the reduce clause of do concurrent in F23. We currently don't seem to implement initial values for reduction variables (F23 table 11.1), but when we do I expect this would see the same issue as OpenMP.

A disadvantage of doing this is that the namelist has to be constructed in place every time. If for some reason this is not hoisted out of hot loops (e.g. it is inside of another subroutine called in the loop body) this might have some performance impact (although I suspect this won't be huge compared to the context switches required for I/O).

Fixes #101907

Namelists are allowed inside of OpenMP reduction regions. Those regions
map the symbol for the reduction variable to a thread-local reduction
variable, but this mapping is escaped if the namelist is generated as a
global variable because the global variable will refer the original
symbol mapping. See llvm#101907 for a code example.

We can't do this only when inside of an OpenMP region because the IO
function might be inside of a subroutine/function which is called from
inside the OpenMP region.

There is also no restriction against namelist variables for the `reduce`
clause of `do concurrent` in F23. We currently don't seem to implement
initial values for reduction variables (F23 table 11.1), but when we do
I expect this would see the same issue as OpenMP.

A disadvantage of doing this is that the namelist has to be constructed
in place every time. If for some reason this is not hoisted out of hot
loops (e.g. it is inside of another subroutine called in the loop body)
this might have some performance impact (although I suspect this won't
be huge compared to the context switches required for I/O).

Fixes llvm#101907
@llvmbot llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir labels Sep 19, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Sep 19, 2024

@llvm/pr-subscribers-flang-fir-hlfir

Author: Tom Eccles (tblah)

Changes

Namelists are allowed inside of OpenMP reduction regions. Those regions map the symbol for the reduction variable to a thread-local reduction variable, but this mapping is escaped if the namelist is generated as a global variable because the global variable will refer the original symbol mapping. See #101907 for a code example.

We can't do this only when inside of an OpenMP region because the IO function might be inside of a subroutine/function which is called from inside the OpenMP region.

There is also no restriction against namelist variables for the reduce clause of do concurrent in F23. We currently don't seem to implement initial values for reduction variables (F23 table 11.1), but when we do I expect this would see the same issue as OpenMP.

A disadvantage of doing this is that the namelist has to be constructed in place every time. If for some reason this is not hoisted out of hot loops (e.g. it is inside of another subroutine called in the loop body) this might have some performance impact (although I suspect this won't be huge compared to the context switches required for I/O).

Fixes #101907


Patch is 25.72 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/109303.diff

3 Files Affected:

  • (modified) flang/lib/Lower/IO.cpp (+6-54)
  • (modified) flang/test/Lower/namelist-common-block.f90 (+54-10)
  • (modified) flang/test/Lower/namelist.f90 (+63-10)
diff --git a/flang/lib/Lower/IO.cpp b/flang/lib/Lower/IO.cpp
index 9e98b230b676f6..4306f71438419b 100644
--- a/flang/lib/Lower/IO.cpp
+++ b/flang/lib/Lower/IO.cpp
@@ -467,7 +467,6 @@ getNamelistGroup(Fortran::lower::AbstractConverter &converter,
   const auto &details =
       symbol.GetUltimate().get<Fortran::semantics::NamelistDetails>();
   mlir::MLIRContext *context = builder.getContext();
-  mlir::StringAttr linkOnce = builder.createLinkOnceLinkage();
   mlir::Type idxTy = builder.getIndexType();
   mlir::Type sizeTy =
       fir::runtime::getModel<std::size_t>()(builder.getContext());
@@ -487,37 +486,10 @@ getNamelistGroup(Fortran::lower::AbstractConverter &converter,
 
   // Define variable names, and static descriptors for global variables.
   DefinedIoProcMap definedIoProcMap = getDefinedIoProcMap(converter);
-  bool groupIsLocal = hasLocalDefinedIoProc(definedIoProcMap);
   stringAddress(symbol);
-  for (const Fortran::semantics::Symbol &s : details.objects()) {
-    stringAddress(s);
-    if (!Fortran::lower::symbolIsGlobal(s)) {
-      groupIsLocal = true;
-      continue;
-    }
-    // A global pointer or allocatable variable has a descriptor for typical
-    // accesses. Variables in multiple namelist groups may already have one.
-    // Create descriptors for other cases.
-    if (!IsAllocatableOrObjectPointer(&s)) {
-      std::string mangleName =
-          Fortran::lower::mangle::globalNamelistDescriptorName(s);
-      if (builder.getNamedGlobal(mangleName))
-        continue;
-      const auto expr = Fortran::evaluate::AsGenericExpr(s);
-      fir::BoxType boxTy =
-          fir::BoxType::get(fir::PointerType::get(converter.genType(s)));
-      auto descFunc = [&](fir::FirOpBuilder &b) {
-        auto box = Fortran::lower::genInitialDataTarget(
-            converter, loc, boxTy, *expr, /*couldBeInEquivalence=*/true);
-        b.create<fir::HasValueOp>(loc, box);
-      };
-      builder.createGlobalConstant(loc, boxTy, mangleName, descFunc, linkOnce);
-    }
-  }
 
   // Define the list of Items.
-  mlir::Value listAddr =
-      groupIsLocal ? builder.create<fir::AllocaOp>(loc, listTy) : mlir::Value{};
+  mlir::Value listAddr = builder.create<fir::AllocaOp>(loc, listTy);
   std::string listMangleName = groupMangleName + ".list";
   auto listFunc = [&](fir::FirOpBuilder &builder) {
     mlir::Value list = builder.create<fir::UndefOp>(loc, listTy);
@@ -575,21 +547,12 @@ getNamelistGroup(Fortran::lower::AbstractConverter &converter,
       list = builder.create<fir::InsertValueOp>(loc, listTy, list, descAddr,
                                                 builder.getArrayAttr(idx));
     }
-    if (groupIsLocal)
-      builder.create<fir::StoreOp>(loc, list, listAddr);
-    else
-      builder.create<fir::HasValueOp>(loc, list);
+    builder.create<fir::StoreOp>(loc, list, listAddr);
   };
-  if (groupIsLocal)
-    listFunc(builder);
-  else
-    builder.createGlobalConstant(loc, listTy, listMangleName, listFunc,
-                                 linkOnce);
+  listFunc(builder);
 
   // Define the group.
-  mlir::Value groupAddr = groupIsLocal
-                              ? builder.create<fir::AllocaOp>(loc, groupTy)
-                              : mlir::Value{};
+  mlir::Value groupAddr = builder.create<fir::AllocaOp>(loc, groupTy);
   auto groupFunc = [&](fir::FirOpBuilder &builder) {
     mlir::Value group = builder.create<fir::UndefOp>(loc, groupTy);
     // group name [const char *groupName]
@@ -617,20 +580,9 @@ getNamelistGroup(Fortran::lower::AbstractConverter &converter,
         loc, groupTy, group,
         getNonTbpDefinedIoTableAddr(converter, definedIoProcMap),
         builder.getArrayAttr(builder.getIntegerAttr(idxTy, 3)));
-    if (groupIsLocal)
-      builder.create<fir::StoreOp>(loc, group, groupAddr);
-    else
-      builder.create<fir::HasValueOp>(loc, group);
+    builder.create<fir::StoreOp>(loc, group, groupAddr);
   };
-  if (groupIsLocal) {
-    groupFunc(builder);
-  } else {
-    fir::GlobalOp group = builder.createGlobal(
-        loc, groupTy, groupMangleName,
-        /*isConst=*/true, /*isTarget=*/false, groupFunc, linkOnce);
-    groupAddr = builder.create<fir::AddrOfOp>(loc, group.resultType(),
-                                              group.getSymbol());
-  }
+  groupFunc(builder);
   assert(groupAddr && "missing namelist group result");
   return groupAddr;
 }
diff --git a/flang/test/Lower/namelist-common-block.f90 b/flang/test/Lower/namelist-common-block.f90
index d16f886dadb5bc..70fe64a002de63 100644
--- a/flang/test/Lower/namelist-common-block.f90
+++ b/flang/test/Lower/namelist-common-block.f90
@@ -1,4 +1,4 @@
-! RUN: bbc -emit-fir -hlfir=false -o - %s | FileCheck %s
+! RUN: bbc -emit-hlfir -o - %s | FileCheck %s
 
 ! Test that allocatable or pointer item from namelist are retrieved correctly
 ! if they are part of a common block as well.
@@ -17,12 +17,56 @@ subroutine print_t()
   end subroutine
 end
 
-! CHECK-LABEL: fir.global linkonce @_QFNt.list constant : !fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>> {
-! CHECK: %[[CB_ADDR:.*]] = fir.address_of(@c_) : !fir.ref<!fir.array<56xi8>>
-! CHECK: %[[CB_CAST:.*]] = fir.convert %[[CB_ADDR]] : (!fir.ref<!fir.array<56xi8>>) -> !fir.ref<!fir.array<?xi8>>
-! CHECK: %[[OFFSET:.*]] = arith.constant 8 : index
-! CHECK: %[[COORD:.*]] = fir.coordinate_of %[[CB_CAST]], %[[OFFSET]] : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
-! CHECK: %[[CAST_BOX:.*]] = fir.convert %[[COORD]] : (!fir.ref<i8>) -> !fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>
-! CHECK: %[[CAST_BOX_NONE:.*]] = fir.convert %[[CAST_BOX]] : (!fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>) -> !fir.ref<!fir.box<none>>
-! CHECK: %[[RES:.*]] = fir.insert_value %{{.*}}, %[[CAST_BOX_NONE]], [1 : index, 1 : index] : (!fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>, !fir.ref<!fir.box<none>>) -> !fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>
-! CHECK: fir.has_value %[[RES]] : !fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>
+! CHECK-LABEL:   func.func private @_QFPprint_t() attributes {fir.host_symbol = @_QQmain, llvm.linkage = #{{.*}}<internal>} {
+! CHECK:           %[[VAL_0:.*]] = fir.alloca !fir.box<!fir.ptr<i32>>
+! CHECK:           %[[VAL_1:.*]] = fir.address_of(@c_) : !fir.ref<!fir.array<56xi8>>
+! CHECK:           %[[VAL_2:.*]] = fir.convert %[[VAL_1]] : (!fir.ref<!fir.array<56xi8>>) -> !fir.ref<!fir.array<?xi8>>
+! CHECK:           %[[VAL_3:.*]] = arith.constant 0 : index
+! CHECK:           %[[VAL_4:.*]] = fir.coordinate_of %[[VAL_2]], %[[VAL_3]] : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
+! CHECK:           %[[VAL_5:.*]] = fir.convert %[[VAL_4]] : (!fir.ref<i8>) -> !fir.ref<i32>
+! CHECK:           %[[VAL_6:.*]]:2 = hlfir.declare %[[VAL_5]] {uniq_name = "_QFEi"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
+! CHECK:           %[[VAL_7:.*]] = fir.convert %[[VAL_1]] : (!fir.ref<!fir.array<56xi8>>) -> !fir.ref<!fir.array<?xi8>>
+! CHECK:           %[[VAL_8:.*]] = arith.constant 8 : index
+! CHECK:           %[[VAL_9:.*]] = fir.coordinate_of %[[VAL_7]], %[[VAL_8]] : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
+! CHECK:           %[[VAL_10:.*]] = fir.convert %[[VAL_9]] : (!fir.ref<i8>) -> !fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>
+! CHECK:           %[[VAL_11:.*]]:2 = hlfir.declare %[[VAL_10]] {fortran_attrs = #{{.*}}<pointer>, uniq_name = "_QFEp"} : (!fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>) -> (!fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>, !fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>)
+! CHECK:           %[[VAL_12:.*]] = arith.constant 6 : i32
+! CHECK:           %[[VAL_13:.*]] = fir.address_of(@_QQclXcb818d0ca98c80cc21b98226ec6d98a0) : !fir.ref<!fir.char<1,71>>
+! CHECK:           %[[VAL_14:.*]] = fir.convert %[[VAL_13]] : (!fir.ref<!fir.char<1,71>>) -> !fir.ref<i8>
+! CHECK:           %[[VAL_15:.*]] = arith.constant 16 : i32
+! CHECK:           %[[VAL_16:.*]] = fir.call @_FortranAioBeginExternalListOutput(%[[VAL_12]], %[[VAL_14]], %[[VAL_15]]) fastmath<contract> : (i32, !fir.ref<i8>, i32) -> !fir.ref<i8>
+! CHECK:           %[[VAL_17:.*]] = fir.alloca !fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>
+! CHECK:           %[[VAL_18:.*]] = fir.undefined !fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>
+! CHECK:           %[[VAL_19:.*]] = fir.address_of(@_QQclX6900) : !fir.ref<!fir.char<1,2>>
+! CHECK:           %[[VAL_20:.*]] = fir.convert %[[VAL_19]] : (!fir.ref<!fir.char<1,2>>) -> !fir.ref<i8>
+! CHECK:           %[[VAL_21:.*]] = fir.insert_value %[[VAL_18]], %[[VAL_20]], [0 : index, 0 : index] : (!fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>, !fir.ref<i8>) -> !fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>
+! CHECK:           %[[VAL_22:.*]] = fir.embox %[[VAL_6]]#1 : (!fir.ref<i32>) -> !fir.box<!fir.ptr<i32>>
+! CHECK:           fir.store %[[VAL_22]] to %[[VAL_0]] : !fir.ref<!fir.box<!fir.ptr<i32>>>
+! CHECK:           %[[VAL_23:.*]] = fir.convert %[[VAL_0]] : (!fir.ref<!fir.box<!fir.ptr<i32>>>) -> !fir.ref<!fir.box<none>>
+! CHECK:           %[[VAL_24:.*]] = fir.insert_value %[[VAL_21]], %[[VAL_23]], [0 : index, 1 : index] : (!fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>, !fir.ref<!fir.box<none>>) -> !fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>
+! CHECK:           %[[VAL_25:.*]] = fir.address_of(@_QQclX7000) : !fir.ref<!fir.char<1,2>>
+! CHECK:           %[[VAL_26:.*]] = fir.convert %[[VAL_25]] : (!fir.ref<!fir.char<1,2>>) -> !fir.ref<i8>
+! CHECK:           %[[VAL_27:.*]] = fir.insert_value %[[VAL_24]], %[[VAL_26]], [1 : index, 0 : index] : (!fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>, !fir.ref<i8>) -> !fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>
+! CHECK:           %[[VAL_28:.*]] = fir.address_of(@c_) : !fir.ref<!fir.array<56xi8>>
+! CHECK:           %[[VAL_29:.*]] = fir.convert %[[VAL_28]] : (!fir.ref<!fir.array<56xi8>>) -> !fir.ref<!fir.array<?xi8>>
+! CHECK:           %[[VAL_30:.*]] = arith.constant 8 : index
+! CHECK:           %[[VAL_31:.*]] = fir.coordinate_of %[[VAL_29]], %[[VAL_30]] : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
+! CHECK:           %[[VAL_32:.*]] = fir.convert %[[VAL_31]] : (!fir.ref<i8>) -> !fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>
+! CHECK:           %[[VAL_33:.*]] = fir.convert %[[VAL_32]] : (!fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>) -> !fir.ref<!fir.box<none>>
+! CHECK:           %[[VAL_34:.*]] = fir.insert_value %[[VAL_27]], %[[VAL_33]], [1 : index, 1 : index] : (!fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>, !fir.ref<!fir.box<none>>) -> !fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>
+! CHECK:           fir.store %[[VAL_34]] to %[[VAL_17]] : !fir.ref<!fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>
+! CHECK:           %[[VAL_35:.*]] = fir.alloca tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>
+! CHECK:           %[[VAL_36:.*]] = fir.undefined tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>
+! CHECK:           %[[VAL_37:.*]] = fir.address_of(@_QQclX7400) : !fir.ref<!fir.char<1,2>>
+! CHECK:           %[[VAL_38:.*]] = fir.convert %[[VAL_37]] : (!fir.ref<!fir.char<1,2>>) -> !fir.ref<i8>
+! CHECK:           %[[VAL_39:.*]] = fir.insert_value %[[VAL_36]], %[[VAL_38]], [0 : index] : (tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>, !fir.ref<i8>) -> tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>
+! CHECK:           %[[VAL_40:.*]] = arith.constant 2 : i64
+! CHECK:           %[[VAL_41:.*]] = fir.insert_value %[[VAL_39]], %[[VAL_40]], [1 : index] : (tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>, i64) -> tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>
+! CHECK:           %[[VAL_42:.*]] = fir.insert_value %[[VAL_41]], %[[VAL_17]], [2 : index] : (tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>, !fir.ref<!fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>) -> tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>
+! CHECK:           %[[VAL_43:.*]] = fir.address_of(@default.nonTbpDefinedIoTable) : !fir.ref<tuple<i64, !fir.ref<!fir.array<0xtuple<!fir.ref<none>, !fir.ref<none>, i32, i1>>>, i1>>
+! CHECK:           %[[VAL_44:.*]] = fir.convert %[[VAL_43]] : (!fir.ref<tuple<i64, !fir.ref<!fir.array<0xtuple<!fir.ref<none>, !fir.ref<none>, i32, i1>>>, i1>>) -> !fir.ref<none>
+! CHECK:           %[[VAL_45:.*]] = fir.insert_value %[[VAL_42]], %[[VAL_44]], [3 : index] : (tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>, !fir.ref<none>) -> tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>
+! CHECK:           fir.store %[[VAL_45]] to %[[VAL_35]] : !fir.ref<tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>>
+! CHECK:           %[[VAL_46:.*]] = fir.convert %[[VAL_35]] : (!fir.ref<tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<2xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>>) -> !fir.ref<tuple<>>
+! CHECK:           %[[VAL_47:.*]] = fir.call @_FortranAioOutputNamelist(%[[VAL_16]], %[[VAL_46]]) fastmath<contract> : (!fir.ref<i8>, !fir.ref<tuple<>>) -> i1
+! CHECK:           %[[VAL_48:.*]] = fir.call @_FortranAioEndIoStatement(%[[VAL_16]]) fastmath<contract> : (!fir.ref<i8>) -> i32
diff --git a/flang/test/Lower/namelist.f90 b/flang/test/Lower/namelist.f90
index a96bbbfad0cd6b..9a8a6f827a02b1 100644
--- a/flang/test/Lower/namelist.f90
+++ b/flang/test/Lower/namelist.f90
@@ -114,11 +114,35 @@ subroutine sss
 ! CHECK-LABEL: c.func @_QPglobal_pointer
 subroutine global_pointer
   real,pointer,save::ptrarray(:)
-  ! CHECK:     %[[V_4:[0-9]+]] = fir.call @_FortranAioBeginExternalListOutput
-  ! CHECK:     %[[V_5:[0-9]+]] = fir.address_of(@_QFglobal_pointerNmygroup) : !fir.ref<tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>>
-  ! CHECK:     %[[V_6:[0-9]+]] = fir.convert %[[V_5]] : (!fir.ref<tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>>) -> !fir.ref<tuple<>>
-  ! CHECK:     %[[V_7:[0-9]+]] = fir.call @_FortranAioOutputNamelist(%[[V_4]], %[[V_6]]) fastmath<contract> : (!fir.ref<i8>, !fir.ref<tuple<>>) -> i1
-  ! CHECK:     %[[V_8:[0-9]+]] = fir.call @_FortranAioEndIoStatement(%[[V_4]]) fastmath<contract> : (!fir.ref<i8>) -> i32
+  ! CHECK:           %[[VAL_0:.*]] = arith.constant 1 : i64
+  ! CHECK:           %[[VAL_2:.*]] = arith.constant 10 : i32
+  ! CHECK:           %[[VAL_3:.*]] = fir.address_of(@_QFglobal_pointerEptrarray) : !fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>
+  ! CHECK:           %[[VAL_4:.*]] = fir.declare %[[VAL_3]] {fortran_attrs = {{.*}}<pointer>, uniq_name = "_QFglobal_pointerEptrarray"} : (!fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>) -> !fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>
+  ! CHECK:           %[[VAL_5:.*]] = fir.address_of(@_QQclX51b4dc7bdbe935748434745744d2a699) : !fir.ref<!fir.char<1,58>>
+  ! CHECK:           %[[VAL_6:.*]] = fir.convert %[[VAL_5]] : (!fir.ref<!fir.char<1,58>>) -> !fir.ref<i8>
+  ! CHECK:           %[[VAL_7:.*]] = fir.call @_FortranAioBeginExternalListOutput(%[[VAL_2]], %[[VAL_6]], %{{.*}}) fastmath<contract> : (i32, !fir.ref<i8>, i32) -> !fir.ref<i8>
+  ! CHECK:           %[[VAL_8:.*]] = fir.alloca !fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>
+  ! CHECK:           %[[VAL_9:.*]] = fir.undefined !fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>
+  ! CHECK:           %[[VAL_10:.*]] = fir.address_of(@_QQclX707472617272617900) : !fir.ref<!fir.char<1,9>>
+  ! CHECK:           %[[VAL_11:.*]] = fir.convert %[[VAL_10]] : (!fir.ref<!fir.char<1,9>>) -> !fir.ref<i8>
+  ! CHECK:           %[[VAL_12:.*]] = fir.insert_value %[[VAL_9]], %[[VAL_11]], [0 : index, 0 : index] : (!fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>, !fir.ref<i8>) -> !fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>
+  ! CHECK:           %[[VAL_13:.*]] = fir.convert %[[VAL_3]] : (!fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>) -> !fir.ref<!fir.box<none>>
+  ! CHECK:           %[[VAL_14:.*]] = fir.insert_value %[[VAL_12]], %[[VAL_13]], [0 : index, 1 : index] : (!fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>, !fir.ref<!fir.box<none>>) -> !fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>
+  ! CHECK:           fir.store %[[VAL_14]] to %[[VAL_8]] : !fir.ref<!fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>
+  ! CHECK:           %[[VAL_15:.*]] = fir.alloca tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>
+  ! CHECK:           %[[VAL_16:.*]] = fir.undefined tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>
+  ! CHECK:           %[[VAL_17:.*]] = fir.address_of(@_QQclX6D7967726F757000) : !fir.ref<!fir.char<1,8>>
+  ! CHECK:           %[[VAL_18:.*]] = fir.convert %[[VAL_17]] : (!fir.ref<!fir.char<1,8>>) -> !fir.ref<i8>
+  ! CHECK:           %[[VAL_19:.*]] = fir.insert_value %[[VAL_16]], %[[VAL_18]], [0 : index] : (tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>, !fir.ref<i8>) -> tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>
+  ! CHECK:           %[[VAL_20:.*]] = fir.insert_value %[[VAL_19]], %[[VAL_0]], [1 : index] : (tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>, i64) -> tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>
+  ! CHECK:           %[[VAL_21:.*]] = fir.insert_value %[[VAL_20]], %[[VAL_8]], [2 : index] : (tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>, !fir.ref<!fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>) -> tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>
+  ! CHECK:           %[[VAL_22:.*]] = fir.address_of(@default.nonTbpDefinedIoTable) : !fir.ref<tuple<i64, !fir.ref<!fir.array<0xtuple<!fir.ref<none>, !fir.ref<none>, i32, i1>>>, i1>>
+  ! CHECK:           %[[VAL_23:.*]] = fir.convert %[[VAL_22]] : (!fir.ref<tuple<i64, !fir.ref<!fir.array<0xtuple<!fir.ref<none>, !fir.ref<none>, i32, i1>>>, i1>>) -> !fir.ref<none>
+  ! CHECK:           %[[VAL_24:.*]] = fir.insert_value %[[VAL_21]], %[[VAL_23]], [3 : index] : (tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>, !fir.ref<none>) -> tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>
+  ! CHECK:           fir.store %[[VAL_24]] to %[[VAL_15]] : !fir.ref<tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>>
+  ! CHECK:           %[[VAL_25:.*]] = fir.convert %[[VAL_15]] : (!fir.ref<tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>>>>>, !fir.ref<none>>>) -> !fir.ref<tuple<>>
+  ! CHECK:           %[[VAL_26:.*]] = fir.call @_FortranAioOutputNamelist(%[[VAL_7]], %[[VAL_25]]) fastmath<contract> : (!fir.ref<i8>, !fir.ref<tuple<>>) -> i1
+  ! CHECK:           %[[VAL_27:.*]] = fir.call @_FortranAioEndIoStatement(%[[VAL_7]]) fastmath<contract> : (!fir.ref<i8>) -> i32
   namelist/mygroup/ptrarray
   write(10, nml=mygroup)
 end
@@ -132,11 +156,40 @@ module mmm
 subroutine rename_sub
   use mmm, bbb => aaa
   rrr = 3.
-  ! CHECK:     %[[V_4:[0-9]+]] = fir.call @_FortranAioBeginExternalListOutput
-  ! CHECK:     %[[V_5:[0-9]+]] = fir.address_of(@_QMmmmNaaa) : !fir.ref<tuple<!fir.ref<i8>, i64, !fir.ref<!fir.array<1xtuple<!fir.ref<i8>, !fir.ref<!fir.box<none>...
[truncated]

Copy link
Contributor

@vdonaldson vdonaldson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we look at compilation options to inform this decision, either now or in the long run?
In any case, I'm inclined to think it would be best to retain this code, even if it is programmatically deactivated for the time being.

@jeanPerier
Copy link
Contributor

Is this really legal OpenMP?

I found this restriction in OpenMP standard section 5.3:

Variables that appear in namelist statements, in variable format expressions, and in expressions for statement function definitions, must not be privatized

Maybe this does not really apply to REDUCE, but i it could then be an omission from OpenMP given the spirit of 5.3 restriction.

Regarding DO CONCURRENT, I have not found a very convincing point about what happens with variables in LOCAL_INIT, LOCAL, or REDUCE that appeared in a namelist. I think it would deserve a standard clarification, if namelist IO is possible inside it (which I do not think is forbidden).

I share Val's concern that namelist IO can be used with huge variable lists in old programs in parts that make little sense to parallelize, and that spending time creating the namelist object again and again at each namelist reference could be a significant hit.

@tblah
Copy link
Contributor Author

tblah commented Sep 20, 2024

Thanks for taking a look.

Could we look at compilation options to inform this decision, either now or in the long run?

I was originally thinking about doing this only when -fopenmp is passed, but there is do concurrent as well.

Is this really legal OpenMP?

I agree this probably shouldn't be legal OpenMP, but so far as I can tell, restrictions on privatisation do not apply to reductions. The upstream ticket indicates that this is supported by both ifort and gfortran. But I would be very happy to instead add a semantic check preventing this if sufficient consensus could be reached to deviate from OpenMP and F23.

I agree that creating a very large namelist could be a performance hit. Do those programs declare all variables inside of the namelist inside of a common block (i.e. is this an optimization that users expect)?. If some of the variables are local then this optimization could not be performed anyway.

@jeanPerier
Copy link
Contributor

I agree that creating a very large namelist could be a performance hit. Do those programs declare all variables inside of the namelist inside of a common block (i.e. is this an optimization that users expect)?. If some of the variables are local then this optimization could not be performed anyway.

I lack expertise about old code to judge here. I went ahead and ran an instrumented compiler through some of our regression tests. Out of about 400 tests using namelists, 150 fell into cases where the namelist data structure could be made global (37%). I am not sure however that this is an optimization that user expect given gfortran and ifort do not seem to be doing it, but nvfortran and classic flang are doing it I think.

tblah added a commit to tblah/llvm-project that referenced this pull request Oct 1, 2024
This is allowed by the OpenMP and F23 standards. But variables in
a namelist are not allowed in OpenMP privatisation. I suspect this
was an oversight.

If we allow this we run into problems masking the original symbol with
the symbol for the reduction variable when the variable is accessed via
a namelist initialised as a global variable. See llvm#101907. One solution
for this would be to force the namelist to always be initilized inside
of the block in which it is used (therefore using the correct mapping
for the reduction variable), but this could make some production
applications slow.

I tentatively think it is probably better to disallow a (perhaps mistaken)
edge case of the standards with (I think) little practical use, than to make
real applications slow in order to make this work. If reviewers would rather
keep to the letter of the standard, see llvm#109303 which implements the
alternative solution. I'm open to either path forward.

Fixes llvm#101907
tblah added a commit that referenced this pull request Oct 2, 2024
This is allowed by the OpenMP and F23 standards. But variables in a
namelist are not allowed in OpenMP privatisation. I suspect this was an
oversight.

If we allow this we run into problems masking the original symbol with
the symbol for the reduction variable when the variable is accessed via
a namelist initialised as a global variable. See #101907. One solution
for this would be to force the namelist to always be initilized inside
of the block in which it is used (therefore using the correct mapping
for the reduction variable), but this could make some production
applications slow.

I tentatively think it is probably better to disallow a (perhaps
mistaken) edge case of the standards with (I think) little practical
use, than to make real applications slow in order to make this work. If
reviewers would rather keep to the letter of the standard, see #109303
which implements the alternative solution. I'm open to either path
forward.

Fixes #101907
@tblah
Copy link
Contributor Author

tblah commented Oct 2, 2024

Closing in favor of #110671

@tblah tblah closed this Oct 2, 2024
VitaNuo pushed a commit to VitaNuo/llvm-project that referenced this pull request Oct 2, 2024
…10671)

This is allowed by the OpenMP and F23 standards. But variables in a
namelist are not allowed in OpenMP privatisation. I suspect this was an
oversight.

If we allow this we run into problems masking the original symbol with
the symbol for the reduction variable when the variable is accessed via
a namelist initialised as a global variable. See llvm#101907. One solution
for this would be to force the namelist to always be initilized inside
of the block in which it is used (therefore using the correct mapping
for the reduction variable), but this could make some production
applications slow.

I tentatively think it is probably better to disallow a (perhaps
mistaken) edge case of the standards with (I think) little practical
use, than to make real applications slow in order to make this work. If
reviewers would rather keep to the letter of the standard, see llvm#109303
which implements the alternative solution. I'm open to either path
forward.

Fixes llvm#101907
VitaNuo pushed a commit to VitaNuo/llvm-project that referenced this pull request Oct 2, 2024
…10671)

This is allowed by the OpenMP and F23 standards. But variables in a
namelist are not allowed in OpenMP privatisation. I suspect this was an
oversight.

If we allow this we run into problems masking the original symbol with
the symbol for the reduction variable when the variable is accessed via
a namelist initialised as a global variable. See llvm#101907. One solution
for this would be to force the namelist to always be initilized inside
of the block in which it is used (therefore using the correct mapping
for the reduction variable), but this could make some production
applications slow.

I tentatively think it is probably better to disallow a (perhaps
mistaken) edge case of the standards with (I think) little practical
use, than to make real applications slow in order to make this work. If
reviewers would rather keep to the letter of the standard, see llvm#109303
which implements the alternative solution. I'm open to either path
forward.

Fixes llvm#101907
Sterling-Augustine pushed a commit to Sterling-Augustine/llvm-project that referenced this pull request Oct 3, 2024
…10671)

This is allowed by the OpenMP and F23 standards. But variables in a
namelist are not allowed in OpenMP privatisation. I suspect this was an
oversight.

If we allow this we run into problems masking the original symbol with
the symbol for the reduction variable when the variable is accessed via
a namelist initialised as a global variable. See llvm#101907. One solution
for this would be to force the namelist to always be initilized inside
of the block in which it is used (therefore using the correct mapping
for the reduction variable), but this could make some production
applications slow.

I tentatively think it is probably better to disallow a (perhaps
mistaken) edge case of the standards with (I think) little practical
use, than to make real applications slow in order to make this work. If
reviewers would rather keep to the letter of the standard, see llvm#109303
which implements the alternative solution. I'm open to either path
forward.

Fixes llvm#101907
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang Flang issues not falling into any other category
Projects
None yet
4 participants