Flang can't vectorize the loop in s115 of TSVC while Clang can vectorize the loop written in C.
! Fortran version
      subroutine s115 (ntimes,ld,n,ctime,dtime,a,b,c,d,e,aa,bb,cc)
      integer ntimes, ld, n, i, nl, j
      real a(n), b(n), c(n), d(n), e(n), aa(ld,n), bb(ld,n), cc(ld,n)
      call init(ld,n,a,b,c,d,e,aa,bb,cc,'s115 ')
      do 10 j = 1,n
        do 20 i = j+1, n
          a(i) = a(i) - aa(i,j) * a(j)
  20    continue
  10  continue
      call dummy(ld,n,a,b,c,d,e,aa,bb,cc,1.)
      end
$ flang-new -v -O3 -flang-experimental-integer-overflow s115.f -S -Rpass=vector
flang-new version 20.0.0git (https://github.com/llvm/llvm-project.git 2c770675ce36402b51a320ae26f369690c138dc1)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /path/to/build/bin
Build config: +assertions
Found candidate GCC installation: /usr/lib/gcc/aarch64-redhat-linux/11
Selected GCC installation: /usr/lib/gcc/aarch64-redhat-linux/11
Candidate multilib: .;@m64
Selected multilib: .;@m64
 "/path/to/build/bin/flang-new" -fc1 -triple aarch64-unknown-linux-gnu -S -fcolor-diagnostics -mrelocation-model pic -pic-level 2 -pic-is-pie -target-cpu generic -target-feature +outline-atomics -target-feature +v8a -target-feature +fp-armv8 -target-feature +neon -fversion-loops-for-stride -flang-experimental-integer-overflow -Rpass=vector -resource-dir /path/to/build/lib/clang/20 -mframe-pointer=non-leaf -O3 -o /dev/null -x f95-cpp-input s115.f
// C version
#define LEN 32000
#define LEN2 256
float a[LEN], b[LEN], c[LEN], d[LEN], e[LEN];
float aa[LEN2][LEN2], bb[LEN2][LEN2], cc[LEN2][LEN2];

int s115() {
  init( "s115 ");
  for (int j = 0; j < LEN2; j++) {
    for (int i = j+1; i < LEN2; i++) {
      a[i] -= aa[j][i] * a[j];
    }
  }
  dummy(a, b, c, d, e, aa, bb, cc, 0.);
  return 0;
}
$ clang -O3 s115.c -S -Rpass=vector
s115.c:10:4: remark: vectorized loop (vectorization width: 4, interleaved count: 2) [-Rpass=loop-vectorize]
   10 |     for (int i = j+1; i < LEN2; i++) {
      |     ^
If j+1 overflows, the inner loop's lower bound wraps around, so the accesses to a(i) and a(j) may overlap, and that possibility prevents vectorization. IIRC, compilers don't have to consider that case.
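The overflow argument above can be sketched in C as follows. This is only an illustration of the reasoning, not code from Flang or LLVM: the helper names wrapping_succ, inner_loop_may_touch_j and s115_overlap_selfcheck are hypothetical, and the wraparound is emulated explicitly so the sketch itself contains no undefined behavior. It models the Fortran inner loop "do i = j+1, n" and asks whether the index j can fall inside the range of i.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: j + 1 with two's-complement wraparound,
   written out explicitly so this sketch never overflows a signed int. */
static int wrapping_succ(int j) {
    return (j == INT_MAX) ? INT_MIN : j + 1;
}

/* Hypothetical helper: models the inner loop "do i = j+1, n".
   Returns true if i can ever equal j, i.e. if a(i) and a(j) may overlap. */
static bool inner_loop_may_touch_j(int j, int n) {
    int lo = wrapping_succ(j);   /* inner loop lower bound */
    return lo <= j && j <= n;    /* is j inside [lo, n]? */
}

/* A few checks of the reasoning. */
static void s115_overlap_selfcheck(void) {
    /* Without overflow, lo = j + 1 > j, so i never equals j. */
    assert(!inner_loop_may_touch_j(1, 256));
    assert(!inner_loop_may_touch_j(255, 256));
    /* With wraparound at j = INT_MAX, lo becomes INT_MIN and the
       range [INT_MIN, n] contains j when n = INT_MAX. */
    assert(wrapping_succ(INT_MAX) == INT_MIN);
    assert(inner_loop_may_touch_j(INT_MAX, INT_MAX));
}
```

Calling s115_overlap_selfcheck() from a main function runs the checks. The only case in which the sketch reports a possible overlap is the wrapped one, which is the case the issue argues a compiler does not have to consider.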
Author: Yusuke MINATO (yus3710-fj)