Loop vectorization gives different answer - fortran

I am building some unit tests and find that my code gives a slightly different result when vectorized. In my example case below, an array a is summed in one dimension and added to an initial value x. Most elements of a are too small to change x. The code is:
module datamod
use ISO_FORTRAN_ENV, only : dp => REAL64
implicit none
! -- Array dimensions are large enough for gfortran to vectorize
integer, parameter :: N = 6
integer, parameter :: M = 10
real(dp) :: x(N), a(N,M)
contains
subroutine init_ax
! -- Set a and x so the issue manifests
x = 0.
x(1) = 0.1e+03_dp
a = 0.
! -- Each negative component is too small to individually change x(1)
! -- But the positive component is just big enough
a( 1, 1) = -0.4e-14_dp
a( 1, 2) = -0.4e-14_dp
a( 1, 3) = -0.4e-14_dp
a( 1, 4) = 0.8e-14_dp
a( 1, 5) = -0.4e-14_dp
end subroutine init_ax
end module datamod
program main
use datamod, only : a, x, N, M, init_ax
implicit none
integer :: i, j
call init_ax
! -- The loop in question
do i=1,N
do j=1,M
x(i) = x(i) + a(i,j)
enddo
enddo
write(*,'(a,e26.18)') 'x(1) is: ', x(1)
end program main
The code gives the following results in gfortran without and with loop vectorization. Note that -ftree-vectorize is included in -O3, so the problem also manifests when using -O3.
mach5% gfortran -O2 main.f90 && ./a.out
x(1) is: 0.100000000000000014E+03
mach5% gfortran -O2 -ftree-vectorize main.f90 && ./a.out
x(1) is: 0.999999999999999858E+02
I know that certain compiler options can change the answer, such as -fassociative-math. However, none of those are included in the standard -O3 optimization package according to the gcc optimization options page.
It seems to me as though the vectorized code is adding up all components of a first, and then adding to x. However, this is incorrect because the code as written requires each component of a to be added to x.
What is going on here? Can loop vectorization change the answer in some cases? gfortran versions 4.7 and 5.3 had the same problem, but Intel 16.0 and PGI 15.10 did not.

I copied the code you provided (to a file called test.f90) and then I compiled and ran it using version 4.8.5 of gfortran. I found that results from the -O2 and -O2 -ftree-vectorize options differ just as your results differ. However, when I simply used -O3, I found that the results matched -O2.
$ gfortran --version
GNU Fortran (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)
Copyright (C) 2015 Free Software Foundation, Inc.
GNU Fortran comes with NO WARRANTY, to the extent permitted by law.
You may redistribute copies of GNU Fortran
under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING
$ gfortran -O2 test.f90 && ./a.out
x(1) is: 0.100000000000000014E+03
$ gfortran -O2 -ftree-vectorize test.f90 && ./a.out
x(1) is: 0.999999999999999858E+02
$ gfortran -O3 test.f90 && ./a.out
x(1) is: 0.100000000000000014E+03
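As a check on the hypothesis in the question (that the vectorized code sums the elements of a first and only then adds the total to x), the two orders of summation can be written out by hand. This is only a sketch of the arithmetic, not a statement about what gfortran actually generates; the names x_seq, x_pre, and s are made up for the illustration, and it should be compiled without options that allow reassociation.

```fortran
program sum_order
  use iso_fortran_env, only : dp => real64
  implicit none
  ! Same nonzero values as a(1,1:5) in the question
  real(dp), parameter :: a(5) = [ -0.4e-14_dp, -0.4e-14_dp, -0.4e-14_dp, &
                                   0.8e-14_dp, -0.4e-14_dp ]
  real(dp) :: x_seq, x_pre, s
  integer :: j

  ! Order written in the source: add each element to x in turn
  x_seq = 0.1e+03_dp
  do j = 1, size(a)
    x_seq = x_seq + a(j)
  end do

  ! Hypothesized order: sum the elements first, then add the total to x
  s = 0.0_dp
  do j = 1, size(a)
    s = s + a(j)
  end do
  x_pre = 0.1e+03_dp + s

  write(*,'(a,e26.18)') 'element by element:  ', x_seq
  write(*,'(a,e26.18)') 'sum first, then add: ', x_pre
  ! Distance between adjacent REAL64 values near 100
  write(*,'(a,e26.18)') 'spacing(x):          ', spacing(0.1e+03_dp)
end program sum_order
```

Near 100 the spacing between adjacent REAL64 values is about 1.4e-14, so an addend smaller than about 0.7e-14 in magnitude is rounded away. Each -0.4e-14 is lost on its own, 0.8e-14 is not, and neither is the combined total of about -0.8e-14. If the hypothesis is right, the first two lines should correspond to the -O2 and -O2 -ftree-vectorize results above.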

Related

Bound checking for empty arrays --- behavior of various compilers

Update 20210914: Absoft support confirms that the behavior of af95 / af90 described below is unintended and indeed a bug. Absoft developers will work to resolve it. The other compilers act correctly in this regard. Thanks to @Vladimir F for the answer, comments, and suggestions.
I have the impression that Fortran is cool with arrays of size 0. However, with Absoft Pro 21.0, I encountered a (strange) error involving such arrays. In contrast, gfortran, ifort, nagfor, pgfortran, sunf95, and g95 are all happy with the same piece of code.
Below is a minimal working example.
! testempty.f90
!!!!!! A module that offends AF90/AF95 !!!!!!!!!!!!!!!!!!!!!!!!
module empty_mod
implicit none
private
public :: foo
contains
subroutine foo(n)
implicit none
integer, intent(in) :: n
integer :: a(0)
integer :: b(n - 1)
call bar(a) ! AF90/AF95 is happy with this line.
call bar(b) ! AF90/AF95 is angry with this line.
end subroutine foo
subroutine bar(x)
implicit none
integer, intent(out) :: x(:)
x = 1 ! BAR(B) annoys AF90/AF95 regardless of this line.
end subroutine bar
end module empty_mod
!!!!!! Module ends !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!! Main program !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
program testempty
use empty_mod, only : foo
implicit none
call foo(2) ! AF90/AF95 is happy with this line.
call foo(1) ! AF90/AF95 is angry with this line.
write (*, *) 'Succeed!' ! Declare victory when arriving here.
end program testempty
!!!!!! Main program ends !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Name this piece of code as testempty.f90. Then run
$ af95 -no-pie -et -Rb -g -O0 -o atest testempty.f90
$ ./atest
This is what happened on my machine (Ubuntu 20.04, linux 5.4.0-77-generic, x86_64):
./atest
? FORTRAN Runtime Error:
? Subscript 1 is out of range for dimension 1 for array
? B with bounds 1:
? File testempty.f90; Line 19
? atest, run-time exception on Mon Sep 13 14:08:41 2021
? Program counter: 000000001004324B
? Signal SIGABRT, Abort
? Traceback follows
OBJECT PC ROUTINE LINE SOURCE
libpthread.so.0 000000001004324B raise N/A N/A
atest 00000000004141F3 __abs_f90rerr N/A N/A
atest 000000000041CA81 _BOUNDS_ERROR N/A N/A
atest 00000000004097B4 __FOO.in.EMPTY_MO N/A N/A
atest 000000000040993A MAIN__ 40 testempty.f90
atest 000000000042A209 main N/A N/A
libc.so.6 000000000FD0C0B3 __libc_start_main N/A N/A
atest 000000000040956E _start N/A N/A
So af95 was annoyed by call bar(b). With af90, the result was the same.
I tested the same code using gfortran, ifort, nagfor, pgfortran, sunf95, and g95. All of them were quite happy with the code even though I imposed bound checking explicitly. Below is the Makefile for the tests.
# This Makefile tests the following compilers on empty arrays.
#
# af95: Absoft 64-bit Pro Fortran 21.0.0
# gfortran: GNU Fortran (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
# ifort: ifort (IFORT) 2021.2.0 20210228
# nagfor: NAG Fortran Compiler Release 7.0(Yurakucho) Build 7036
# pgfortran: pgfortran (aka nvfortran) 21.3-0 LLVM 64-bit x86-64
# sunf95: Oracle Developer Studio 12.6
# g95: G95 (GCC 4.0.3 (g95 0.94!) Jan 17 2013)
#
# Tested on Ubuntu 20.04 with Linux 5.4.0-77-generic x86_64
.PHONY: test clean
test:
make -s gtest
make -s itest
make -s ntest
make -s ptest
make -s stest
make -s 9test
make -s atest
gtest: FC = gfortran -Wall -Wextra -fcheck=all
itest: FC = ifort -warn all -check all
ntest: FC = nagfor -C
ptest: FC = pgfortran -C -Mbounds
stest: FC = sunf95 -w3 -xcheck=%all -C
9test: FC = g95 -Wall -Wextra -fbounds-check
atest: FC = af95 -no-pie -et -Rb
%test: testempty.f90
$(FC) -g -O0 -o $@ $<
./$@
clean:
rm -f *.o *.mod *.dbg *test
Questions:
Is the behavior of af95/af90 standard-conforming?
Does my code contain anything that violates the Fortran standards?
In general, is it considered dangerous to involve empty arrays in Fortran code? Sometimes they are inevitable given that data sizes are often unknown before runtime.
By "standards", I mean 2003, 2008, and 2018.
Thank you very much for any comments or criticism.
(The same question is posed on Fortran Discourse, and I hope it does not violate the rules here.)
The program looks OK to me. Zero-sized arrays are perfectly possible in Fortran although I admit I normally do not have automatic ones - but that is just a coincidence.
I think it is a compiler bug in the Absoft compiler or its array bounds checker.
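To illustrate what a conforming compiler is expected to do with a zero-sized automatic array, here is a minimal sketch modelled on the code in the question. The module and procedure names (zero_size_demo, fill, demo) are made up for this example.

```fortran
module zero_size_demo
  implicit none
contains
  subroutine fill(x)
    integer, intent(out) :: x(:)
    x = 1                      ! assigns no elements when size(x) == 0
  end subroutine fill

  subroutine demo(n)
    integer, intent(in) :: n
    integer :: b(n - 1)        ! automatic array; zero-sized when n == 1
    call fill(b)
    write(*,'(a,i0)') 'n        = ', n
    write(*,'(a,i0)') 'size(b)  = ', size(b)
    write(*,'(a,i0)') 'lbound   = ', lbound(b, 1)
    write(*,'(a,i0)') 'ubound   = ', ubound(b, 1)
    write(*,'(a,i0)') 'sum(b)   = ', sum(b)   ! sum of an empty array is 0
  end subroutine demo
end module zero_size_demo

program main
  use zero_size_demo, only : demo
  implicit none
  call demo(2)   ! b has one element
  call demo(1)   ! b has zero elements, bounds 1:0
end program main
```

For n = 1 the array b has bounds 1:0 and size 0, and no element of it is ever referenced, so a bounds checker should have nothing to complain about.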

Strange behavior of "gfortran -Wconversion"

Consider the following code.
! test.f90
program test
use iso_fortran_env, only: INT64, REAL64
print *, real(0_INT64, REAL64)
print *, real(1000_INT64, REAL64)
print *, real(huge(0_INT64), REAL64)
end program test
When compiling it with gfortran in the following way:
$ gfortran -Wconversion -std=f2008 test.f90
I got the following warning:
test.f90:5:18:
5 | print *, real(huge(0_INT64), REAL64)
| 1
Warning: Change of value in conversion from ‘INTEGER(8)’ to ‘REAL(8)’ at (1) [-Wconversion]
Note that gfortran is happy with the first two conversions, but not the last one.
Question: Is the warning illustrated above an expected behavior of gfortran? I thought that NO warning should be produced in any of the three cases, since the conversion is done explicitly by REAL( , REAL64).
Here is the version information of my gfortran:
$ gfortran --version
GNU Fortran (Ubuntu 9.3.0-10ubuntu2) 9.3.0
As a reference, ifort 19.1.127 compiles test.f90 without any complaint:
$ ifort -warn all -stand f08 test.f90
Thank you very much for any comments or criticism.
Answer by @dave_thompson_085 in the comments:
“0 and 1000 can be represented exactly in REAL64 (and even in REAL32). HUGE(INT64) is 9223372036854775807 and it cannot. REAL64 has 53 bits for the 'mantissa' (really, significand), and after subtracting the sign and adding the hidden bit this supports just under 16 decimal digits of magnitude. 9223372036854775807 is 19 decimal digits. This is not a diagnostic required by the standard, so it's up to each 'processor' (compiler) what to do about it.”
Thank you very much, @dave_thompson_085.
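The point about exact representability can be checked with a short program. This is only a sketch; the helper is_exact is a made-up name, and it should only be used for values well below 2**63 in magnitude so that converting back to INT64 cannot overflow.

```fortran
program conv_check
  use iso_fortran_env, only : int64, real64
  implicit none
  integer(int64) :: k

  ! Number of significand bits in REAL64, hidden bit included
  print *, 'digits(REAL64)       = ', digits(1.0_real64)

  ! Integers up to 2**53 in magnitude survive the round trip
  k = 1000_int64
  print *, '1000 exact?            ', is_exact(k)
  k = 2_int64**53
  print *, '2**53 exact?           ', is_exact(k)

  ! 2**53 + 1 needs 54 significant bits, so it is rounded
  k = 2_int64**53 + 1_int64
  print *, '2**53 + 1 exact?       ', is_exact(k)

  ! huge(0_int64) = 2**63 - 1 is rounded up to 2**63
  k = huge(0_int64)
  print *, 'huge rounds to 2**63?  ', real(k, real64) == 2.0_real64**63

contains

  ! Hypothetical helper: true if k converts to REAL64 and back unchanged
  logical function is_exact(k)
    integer(int64), intent(in) :: k
    is_exact = int(real(k, real64), int64) == k
  end function is_exact
end program conv_check
```

The conversions here act on variables rather than literal constants, so they show the rounding at run time regardless of whether the compiler chooses to warn about it.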

Truncation of deferred-length string when passing as optional

I am passing optional deferred-length strings (character(len=:), allocatable, optional) between subroutines and getting unexpected behavior in GCC.
Here is my minimal example, where I pass an optional string through an interface routine to a routine that sets it:
$ cat main.f90
module deepest_call_m
implicit none
contains
subroutine deepest_call(str)
character(len=:), allocatable, intent(OUT), optional :: str
if (present(str)) str = '12345'
write(*,*) 'at bot of deepest_call, str is "'//trim(str)//'"'
end subroutine deepest_call
end module deepest_call_m
module interface_call_m
implicit none
contains
subroutine interface_call(str)
use deepest_call_m, only : deepest_call
character(len=:), allocatable, intent(OUT), optional :: str
call deepest_call(str=str)
write(*,*) 'at bot of interface_call, str is "'//trim(str)//'"'
end subroutine interface_call
end module interface_call_m
program main
use interface_call_m, only : interface_call
implicit none
character(len=:), allocatable :: str
call interface_call(str=str)
write(*,*) 'at bot of main, str is "'//trim(str)//'"'
end program main
(Note that, for simplicity, I'm not wrapping the write statements in if(present) and if(allocated), although that would be necessary for a real implementation.)
In Intel 16.0 and 2019_U4, and PGI 15.10, this routine gives the expected result: str is set in deepest_call and remains the same through interface_call and in main:
$ ifort --version
ifort (IFORT) 16.0.0 20150815
Copyright (C) 1985-2015 Intel Corporation. All rights reserved.
$ ifort main.f90 && ./a.out
at bot of deepest_call, str is "12345"
at bot of interface_call, str is "12345"
at bot of main, str is "12345"
However, with gfortran 4.8.5:
$ gfortran --version
GNU Fortran (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36)
Copyright (C) 2015 Free Software Foundation, Inc.
GNU Fortran comes with NO WARRANTY, to the extent permitted by law.
You may redistribute copies of GNU Fortran
under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING
$ gfortran main.f90 && ./a.out
at bot of deepest_call, str is "12345"
at bot of interface_call, str is ""
at bot of main, str is ""
The string has been truncated when returning from the deepest_call in interface_call. With gfortran 7.3.0 and 8.2.0, the code crashes at runtime when no compile-time options are provided:
[chaud106@epyc-login-1-0 Testing]$ gfortran --version && gfortran main.f90 && ./a.out
GNU Fortran (GCC) 8.2.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
at bot of deepest_call, str is "12345"
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0x2af39a82733f in ???
#1 0x400c68 in ???
#2 0x400d9c in ???
#3 0x400f0e in ???
#4 0x2af39a813494 in ???
#5 0x400878 in ???
#6 0xffffffffffffffff in ???
Segmentation fault
However, compiling with debugging and run-time checking options recovers the previous truncation behavior:
$ gfortran --version
GNU Fortran (GCC) 8.2.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ gfortran -g -fbacktrace -Wall -Wextra -std=f2008 -fcheck=all -Og main.f90 && ./a.out
at bot of deepest_call, str is "12345"
at bot of interface_call, str is ""
at bot of main, str is ""
This looks a lot like a compiler bug to me, so perhaps I just need to post on the bugzilla. But I would like some feedback here first, specifically:
Is my test case standard-conforming? I'm specifically curious about the str=str argument to deepest_call. I use this structure often to pass an argument as optional if and only if it's optional in the present scope, but I couldn't easily find this in the standard and I'm not sure if it's really valid. Passing just str seemed to give the same behavior, though.
Are there any simple workarounds? Given that this issue affects a wide range of versions (4.8.5 to 8.2.0) simply avoiding them is not feasible.
Is anybody aware of this behavior for other versions of gfortran, or other compilers? I only have easy access to GCC, Intel, and PGI.
Your code is standard compliant.
An optional dummy argument may be used as an actual argument in a subsequent call, with or without the keyword, and whether or not it is present. (Of course, it may be absent only if the corresponding dummy argument of the called procedure is also optional.) That is:
call interface_call(str)
is just as correct as
call interface_call(str=str)
As to (suboptimal) workarounds (I see this fail with gfortran 9.1.0, but working with gfortran 10.0.0 20190625), you could consider having the arguments as not deferred-length. After all, you're using trim everywhere, so trailing whitespace would be chomped.
Your code is just one small change away from not being compliant: the write statements aren't protected by a presence check for the optional dummy. If the optional dummy is not present in the call chain then the program would be invalid.
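A version of the test case with the presence and allocation checks in place might look like the sketch below. The module name guarded_m is made up, and the checks are nested because Fortran does not guarantee short-circuit evaluation of .and.. This makes the program valid when the argument is absent; it is not a workaround for the gfortran behavior described in the question.

```fortran
module guarded_m
  implicit none
contains
  subroutine deepest_call(str)
    character(len=:), allocatable, intent(out), optional :: str
    if (present(str)) then
      str = '12345'
      write(*,*) 'deepest_call: str is "'//str//'"'
    else
      write(*,*) 'deepest_call: str is not present'
    end if
  end subroutine deepest_call

  subroutine interface_call(str)
    character(len=:), allocatable, intent(out), optional :: str
    ! An absent optional dummy is passed on as absent
    call deepest_call(str)
    if (present(str)) then
      if (allocated(str)) then
        write(*,*) 'interface_call: str is "'//str//'"'
      else
        write(*,*) 'interface_call: str is not allocated'
      end if
    else
      write(*,*) 'interface_call: str is not present'
    end if
  end subroutine interface_call
end module guarded_m

program main
  use guarded_m, only : interface_call
  implicit none
  character(len=:), allocatable :: str

  call interface_call(str)       ! argument present
  if (allocated(str)) then
    write(*,*) 'main: str is "'//str//'"'
  else
    write(*,*) 'main: str is not allocated'
  end if

  call interface_call()          ! argument absent
end program main
```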

"Fortran runtime error: End of file" while writing

I have written a piece of code, compiled with GNU Fortran (GCC) 7.2.1 20171128 on Arch Linux, that tries to write to a file. The unit is opened with the newunit=... specifier, a Fortran 2008 feature.
When trying to write to the file, the code crashes, raising the error Fortran runtime error: End of file.
Non-working code
Here's a minimal non-working version of the code. If the file does not exist, the code crashes with gfortran 7.2.1
program foo
implicit none
character(len=80) :: filename
character(len=5) :: nchar
integer :: ilun=1
call title(1, nchar)
! nchar = '00001'
filename = trim(nchar)//'.txt'
write(*, '(a, "<", a, ">")') 'filename ', trim(filename)
open(newunit=ilun, file=trim(filename), form='formatted', status='replace')
write(ilun, '(a1,a12,a10)') '#', 'Family', 'Count'
close(ilun)
end program foo
subroutine title(n, nchar)
implicit none
integer, intent(in) :: n
character(len=5), intent(out) :: nchar
write(nchar, '(i0.5)') n
end subroutine title
Here's the command I'm using: rm -f 00001.txt; gfortran foo.f90 -o a.out && ./a.out.
Working code
By comparison, the following code compiles and works perfectly on the same machine
program foo
implicit none
character(len=80) :: filename
character(len=5) :: nchar
integer :: ilun=1
! call title(1, nchar)
nchar = '00001'
filename = trim(nchar)//'.txt'
write(*, '(a, "<", a, ">")') 'filename ', trim(filename)
open(newunit=ilun, file=trim(filename), form='formatted', status='replace')
write(ilun, '(a1,a12,a10)') '#', 'Family', 'Count'
close(ilun)
end program foo
Here's the command I'm using: rm -f 00001.txt; gfortran foo.f90 -o a.out && ./a.out.
Important note
Both codes work well when compiled using ifort (any version tried between ifort15 and ifort18) as well as GNU Fortran (GCC) 6.4.1 20171003 and GNU Fortran (GCC) 7.2.0, so there seems to be an issue introduced in version 7.2.1 of gfortran or on the version bundled with Arch Linux.
A few comments
If you uncomment nchar = '00001' in the non-working example, it still doesn't work.
If you change newunit=ilun to unit=ilun, with e.g. ilun=10 set beforehand, it works in any case.
System details
OS: GNU Linux
Distribution: Arch Linux (up-to-date as of 15-12-2017)
$ uname -a
Linux manchot 4.14.4-1-ARCH #1 SMP PREEMPT Tue Dec 5 19:10:06 UTC 2017 x86_64 GNU/Linux
$ gfortran --version
GNU Fortran (GCC) 7.2.1 20171128
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
This issue is related to the Arch Linux distribution of Gfortran 7.2.1. It has now been fixed (see https://bugs.archlinux.org/task/56768).
If you encounter the issue, you should update your installation using
pacman -Syu gcc-fortran
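When chasing an I/O failure like this one, it can help to ask the runtime for a status code and message instead of letting the program abort. The following is a minimal sketch using the same file name and format as the question; the program name and the variables lun, ios, and msg are made up for the example.

```fortran
program newunit_demo
  implicit none
  character(len=*), parameter :: filename = '00001.txt'
  character(len=256) :: msg
  integer :: lun, ios

  ! newunit= lets the runtime pick an unused unit number
  open(newunit=lun, file=filename, form='formatted', status='replace', &
       action='write', iostat=ios, iomsg=msg)
  if (ios /= 0) then
    write(*,'(2a)') 'open failed: ', trim(msg)
    error stop 1
  end if

  ! iostat/iomsg report the failure instead of aborting
  write(lun, '(a1,a12,a10)', iostat=ios, iomsg=msg) '#', 'Family', 'Count'
  if (ios /= 0) write(*,'(2a)') 'write failed: ', trim(msg)

  close(lun)
end program newunit_demo
```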

"call to function log10f cannot be vectorized" with Intel Fortran

I’ve created a short example of a very simple loop that should vectorize. The message
call to function log10f cannot be vectorized
is what I do not understand.
Why isn’t a vectorized version of alog10 available from the library?
program test
real a(100)
do i = 1,100
a(i) = 4.31 + alog10(max(50.0, real(i)))
end do
call sub(a)
stop
end
Compiled with ifort like
ifort -o x.o -c -O3 -xAVX -mkl -ip -fp-model precise -w -ftz -align all -fno-alias -FR
-convert big_endian -g -vec_report3 -opt_report_phase hlo -opt-report-phase hpo
-opt-report-phase ipo_inl x.f90
I get the report
INLINING OPTION VALUES:
-inline-factor: 100
-inline-min-size: 30
-inline-max-size: 230
-inline-max-total-size: 2000
-inline-max-per-routine: disabled
-inline-max-per-compile: disabled
<x.f90;1:12;IPO INLINING;MAIN__;0>
INLINING REPORT: (MAIN__) [1/1=100.0%]
-> for_stop_core(EXTERN)
-> sub_(EXTERN)
-> log10f(EXTERN)
-> for_set_reentrancy(EXTERN)
-> for_set_fpe_(EXTERN)
HPO VECTORIZER REPORT (MAIN__) LOG OPENED ON Wed Dec 31 12:48:17 2014
<x.f90;-1:-1;hpo_vectorization;MAIN__;0>
HPO Vectorizer Report (MAIN__)
x.f90(6:11-6:11):VEC:MAIN__: vectorization support: call to function log10f cannot be
vectorized
x.f90(6): (col. 11) remark: loop was not vectorized: statement cannot be vectorized
loop was not vectorized: statement cannot be vectorized
More of a comment than an answer ...
Why don't you try a vectorised expression? Perhaps
a = log10(real([(i, i=1, 100)]))
Caveat It's your responsibility to ensure that the syntax of the snippet is correct.
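For comparison, here is a sketch that computes the expression from the question both as an explicit loop and as a whole-array expression, and prints the largest difference between the two. The program name and the array b are made up for this example, and the call to the external sub from the question is replaced by a print. Whether a given compiler vectorizes either form is not guaranteed; the optimization report still has to be checked.

```fortran
program test_array
  implicit none
  integer :: i
  real :: a(100), b(100)

  ! Explicit loop, as in the question (generic log10 instead of alog10)
  do i = 1, 100
    a(i) = 4.31 + log10(max(50.0, real(i)))
  end do

  ! Whole-array form: build 1..100 with an implied-do, convert to real,
  ! then apply the elemental intrinsics max and log10
  b = 4.31 + log10(max(50.0, real([(i, i = 1, 100)])))

  print *, 'max abs difference: ', maxval(abs(a - b))
end program test_array
```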