I am currently investigating robust methods for the summation of arrays, and implemented the algorithm published by Shewchuk in "Adaptive Precision Floating-Point Arithmetic and Fast Robust Geometric Predicates". While the implemented algorithm works as expected in gfortran, ifort optimizes the countermeasures away.
To give some context, here is my code:
module test_mod
contains
function shewchukSum( array ) result(res)
implicit none
real,intent(in) :: array(:)
real :: res
integer :: xIdx, yIdx, i, nPartials
real :: partials(100), hi, lo, x, y
nPartials = 0
do xIdx=1,size(array)
i = 0
x = array(xIdx)
! Calculate the partial sums
do yIdx=1,nPartials
y = partials(yIdx)
hi = x + y
if ( abs(x) < abs(y) ) then
lo = x - (hi - y)
else
lo = y - (hi - x)
endif
x = hi
! If a round-off error occurred, store it. Exact comparison intended
if ( lo == 0. ) cycle
i = i + 1 ; partials(i) = lo
enddo ! yIdx
nPartials = i + 1 ; partials( nPartials ) = x
enddo ! xIdx
res = sum( partials(:nPartials) )
end function
end module
And the calling test program is
program test
use test_mod
implicit none
print *, sum([1.e0, 1.e16, 1.e0, -1.e16])
print *,shewchukSum([1.e0, 1.e16, 1.e0, -1.e16])
end program
Compiling with gfortran produces the correct results at all optimization levels:
./a.out
0.00000000
2.00000000
ifort, however, produces zeros for all optimizations above -O0:
./a.out
0.00000000
0.00000000
I tried to debug the code and went down to the assembly level and figured out that ifort is optimizing away the calculation of lo and the operations after if ( lo == 0. ) cycle .
Is there a possibility to force ifort to perform the complete operation for all levels of optimization? This addition is a critical part of the calculations, and I want it to run as fast as possible.
For comparison, gfortran at -O2 executes this code approximately eight to ten times faster than ifort at -O0 (measured for arrays of length >100k).
When it comes to floating point operations, ifort's defaults generally favor performance over strict correctness.
There are a number of options to control the floating point behaviour. Using ifort 16 and the option -assume protect_parens I get the expected behaviour even at higher optimization levels.
Additionally, there are the options -fp-model precise and -fp-model source (the latter implies -assume protect_parens), which may also be of interest to you. The default for -fp-model is fast=1, which "allows value-unsafe optimizations".
Naturally, these may have an impact on performance, so other options around the floating point behaviour are also worth considering.
Much further detail can be found in an Intel publication.
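To make the role of the parentheses concrete, here is a minimal sketch of the single step that gets optimized away. The program name and the values are illustrative only and are not taken from the original code. Algebraically lo is always zero, so a compiler that is allowed to reassociate may fold it to zero; under strict single-precision evaluation it should instead hold the round-off error of x + y.

```fortran
program two_sum_demo
  implicit none
  real :: x, y, hi, lo

  x = 1.0e0
  y = 1.0e16

  hi = x + y
  ! abs(x) < abs(y), so this is the first branch of the summation loop.
  ! The parentheses must be honoured: x - ((x + y) - y) is zero in exact
  ! arithmetic, but in floating point it recovers what was lost in hi.
  lo = x - (hi - y)

  print *, 'hi = ', hi
  print *, 'lo = ', lo
end program two_sum_demo
```

Compiling this once with the default options and once with -assume protect_parens (or -fp-model source) is a quick way to check whether a given compiler version keeps the computation of lo.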
In Fortran, is it possible to print data in a tabular manner, without losing information, when more space is needed than specified?
For instance consider the program
! format.f90
program main
real(8) :: arr(5)
arr = [0.0, 1.111, 22.22, 333.3, 444444444444444444.44]
print '(F10.3)', arr
end program main
Then by default the output for the last entry will be replaced by stars, indicating the lack of space.
>> ifort format.f90 -o format.bin
>> ./format.bin
0.000
1.111
22.220
333.300
**********
By comparison, C-style format specifiers automatically increase the column width when required, e.g.
// format.c
#include <stdio.h>
int main () {
double arr[5] = {0.0, 1.111, 22.22, 333.3, 444444444444444444.44};
for(int i=0; i<5; i++) {
printf("%10.3f\n", arr[i]);
}
}
>> gcc format.c -o format.bin
>> ./format.bin
0.000
1.111
22.220
333.300
444444444444444416.000
Is it possible to obtain such behavior in Fortran with built-in features?
Options that don't fulfill the requirements
G descriptor. The G descriptor reliably outputs data in a readable tabular format and automatically adds an exponent when needed. However, it also wastes space when no exponent is needed, and it doesn't line up the decimal points. For example, when switching F10.3 for G11.4,"¶" (paragraph sign added for emphasis):
>> ifort format.f90 -o format.bin
>> ./format.bin
0.000 ¶
1.111 ¶
22.22 ¶
333.3 ¶
0.4444E+18¶
Building a formatting API based on the F0 specifier. The specifier F0.3 would allow variable-width output, but doesn't allow specifying a minimum width. This could be solved using a wrapper function akin to leftpad, but a built-in or widely-used solution would be preferable for a better chance of actually being used in a codebase. As an example:
! format.f90
program main
real(8) :: arr(5)
integer :: i
arr = [0.0, 1.111, 22.22, 333.3, 444444444444444444.44]
! more complicated print statement, because 'float2char'
! cannot be 'elemental' due to needing the 'allocatable' property.
print '(A)', (float2char('(F0.3)', 10, arr(i)), i=1,5)
contains
function float2char(format, width, value) result(r)
character(:), allocatable :: r
character(*), intent(in) :: format
integer, intent(in) :: width
real(8), intent(in) :: value
character(64) :: buffer ! better: calculate size from value?
write(buffer, format) value
allocate(character(max(width, len_trim(buffer))) :: r)
r(:) = trim(buffer) ! (:) needed to prevent reallocation in recent compilers
r(:) = adjustr(r)
end function float2char
end program main
>> ifort format.f90 -o format.bin
>> ./format.bin
.000
1.111
22.220
333.300
444444452740661248.000
Yes, so in Fortran the fixed-width edit descriptors really are FIXED width. Sometimes useful, often annoying.
One thing you can do is to use the G edit descriptor, which is similar to %g in C, namely that it switches to scientific format when the number is large or small. That allows very large or small values to fit in a fixed width field. Note however that with G editing the d is the number of significant digits, not the number of digits after the decimal point as with F editing. Also it leaves space at the end for the exponent even if the number is in the range that no exponent is needed.
Your example could look like
! format.f90
program main
real(8) :: arr(4)
arr = [0.0, 1.111, 222222222222.222, 3.333]
print '(F10.3)', arr
print *, 'With G edit'
print '(G10.4)', arr
end program main
with output
0.000
1.111
**********
3.333
With G edit
0.000
1.111
0.2222E+12
3.333
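If the exponent form is not wanted either, one possible workaround is to write into a fixed-width buffer first and fall back to F0.3 only when the field overflows. The following is a rough sketch, not a built-in feature: the function name fmt_or_grow is made up for this example, the 64-character fallback buffer is an arbitrary assumption that is too small for very large magnitudes, and the assignments to r rely on automatic (re)allocation of deferred-length characters (Fortran 2003).

```fortran
program adaptive_format
  implicit none
  real(8) :: arr(4)
  integer :: i

  arr = [0.0d0, 1.111d0, 222222222222.222d0, 3.333d0]
  do i = 1, size(arr)
    print '(A)', fmt_or_grow(arr(i))
  end do

contains

  ! Hypothetical helper: F10.3 when the value fits, F0.3 otherwise.
  function fmt_or_grow(value) result(r)
    real(8), intent(in) :: value
    character(:), allocatable :: r
    character(10) :: fixed
    character(64) :: wide      ! assumed large enough for the fallback

    write(fixed, '(F10.3)') value
    if (index(fixed, '*') == 0) then
      r = fixed                ! value fits: keep the aligned 10-character field
    else
      write(wide, '(F0.3)') value
      r = trim(wide)           ! field overflowed: let the width grow
    end if
  end function fmt_or_grow

end program adaptive_format
```

Values that fit keep their column alignment, and only the oversized entries widen their line, which is close to what printf("%10.3f") does.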
I am testing some very simple equivalence errors when precision is an issue and was hoping to perform the operations in extended double precision (so that I knew what the answer would be in ~19 digits) and then perform the same operations in double precision (where there would be roundoff error in the 16th digit), but somehow my double precision arithmetic is maintaining 19 digits of accuracy.
When I perform the operations in extended double, then hardcode the numbers into another Fortran routine, I get the expected errors, but is there something strange going on when I assign an extended double precision variable to a double precision variable here?
program code_gen
implicit none
integer, parameter :: Edp = selected_real_kind(17)
integer, parameter :: dp = selected_real_kind(8)
real(kind=Edp) :: alpha10, x10, y10, z10
real(kind=dp) :: alpha8, x8, y8, z8
real(kind = dp) :: pi_dp = 3.1415926535897932384626433832795028841971693993751058209749445
integer :: iter
integer :: niters = 10
print*, 'tiny(x10) = ', tiny(x10)
print*, 'tiny(x8) = ', tiny(x8)
print*, 'epsilon(x10) = ', epsilon(x10)
print*, 'epsilon(x8) = ', epsilon(x8)
do iter = 1,niters
x10 = rand()
y10 = rand()
z10 = rand()
alpha10 = x10*(y10+z10)
x8 = x10
x8 = x8 - pi_dp
x8 = x8 + pi_dp
y8 = y10
y8 = y8 - pi_dp
y8 = y8 + pi_dp
z8 = z10
z8 = z8 - pi_dp
z8 = z8 + pi_dp
alpha8 = alpha10
write(*, '(a, es30.20)') 'alpha8 .... ', x8*(y8+z8)
write(*, '(a, es30.20)') 'alpha10 ... ', alpha10
if( alpha8 .gt. x8*(y8+z8) ) then
write(*, '(a)') 'ERROR(.gt.)'
elseif( alpha8 .lt. x8*(y8+z8) ) then
write(*, '(a)') 'ERROR(.lt.)'
endif
enddo
end program code_gen
where rand() is the gfortran function found here.
If we are speaking about only one precision type (take, for example, double), then we can denote machine epsilon as E16 which is approximately 2.22E-16. If we take a simple addition of two Real numbers, x+y, then the resulting machine expressed number is (x+y)*(1+d1) where abs(d1) < E16. Likewise, if we then multiply that number by z, the resulting value is really (z*((x+y)*(1+d1))*(1+d2)) which is nearly (z*(x+y)*(1+d1+d2)) where abs(d1+d2) < 2*E16. If we now move to extended double precision, then the only thing that changes is that E16 turns to E20 and has a value of around 1.08E-19.
My hope was to perform the analysis in extended double precision so that I could compare two numbers which should be equal but show that, on occasion, roundoff error will cause comparisons to fail. By assigning x8=x10, I was hoping to create a double precision 'version' of the extended double precision value x10, where only the first ~16 digits of x8 conform to the values of x10, but upon printing out the values, it shows that all 20 digits are the same and the expected double precision roundoff error is not occurring as I would expect.
It should also be noted that before this attempt, I wrote a program which actually writes another program where the values of x, y, and z are 'hardcoded' to 20 decimal places. In this version of the program, the comparisons of .gt. and .lt. failed as expected, but I am not able to duplicate the same failures by casting an extended double precision value as a double precision variable.
In an attempt to further 'perturb' the double precision values and add roundoff error, I have subtracted, then added back, pi from my double precision variables, which should leave those variables with some double precision roundoff error, but I am still not seeing that in the final result.
As the gfortran documentation you link states, the function result of rand is a default real value (single precision). Such a value can be represented exactly by each of your other real types.
That is, x10=rand() assigns a single precision value to the extended precision variable x10. It does so exactly. This same value now stored in x10 is assigned to the double precision variable x8, but this remains exactly representable as double precision.
There is sufficient precision in the single-as-double that the calculations using double and extended types return the same value. [See the note at the end of this answer.]
If you wish to see real effects of loss of precision, then start by using an extended or double precision value. For example, rather than using rand (returning a single precision value), use the intrinsic random_number
call random_number(x10)
(which has the advantage of being standard Fortran). Unlike a function, which in (nearly) all cases returns a value type regardless of the end use of the value, this subroutine will give you a precision corresponding to the argument. You will (hopefully) see much as you will from your "hard-coded" experiment.
Alternatively, as agentp commented, it may be more intuitive to start with a double precision value
call random_number(x8); x10=x8 ! x8 and x10 have the precision of double precision
call random_number(y8); y10=y8
call random_number(z8); z10=z8
and perform the calculations from that starting point: those extra bits will then start to show.
In summary, when you do x8=x10 you are getting the first few bits of x8 corresponding to those of x10, but many of those bits and those that follow in x10 are all zero.
When it comes to your pi_dp perturbation, you are again assigning a single precision (this time a literal constant) value to a double precision variable. Just having all those digits doesn't make it anything other than a default real literal. You can specify a different kind of literal with a _Edp suffix, as described in other answers.
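A small sketch of the literal-kind point, using a hypothetical program and variable names: both variables are double precision, but only the second literal carries a kind suffix, so the first is rounded to default real before it is assigned.

```fortran
program literal_kind
  implicit none
  integer, parameter :: dp = selected_real_kind(15)
  real(dp) :: pi_default, pi_dp

  ! Default real literal: rounded to single precision, then converted.
  pi_default = 3.14159265358979323846
  ! Literal of kind dp: keeps double precision accuracy.
  pi_dp = 3.14159265358979323846_dp

  print *, 'without suffix: ', pi_default
  print *, 'with _dp suffix:', pi_dp
  print *, 'difference:     ', pi_dp - pi_default
end program literal_kind
```

The difference printed on the last line should be of the order of single precision round-off rather than zero.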
Finally, one also then has to worry about what the compiler does with regards to optimization.
My thesis is that starting from the single precision value, the calculations performed are representable exactly in both double and extended precision (with the same values). For other calculations, or from a starting point with more bits set, or representations (for example, on some systems or with other compilers the numeric type with kind selected_real_kind(17) may have quite different characteristics such as a different radix) that needn't be the case.
This was largely based on guessing and hoping that it explained the observation. Fortunately, there are ways to test this idea. As we're talking about IEEE arithmetic we can consider the inexact flag. If that flag isn't raised during the computation we can be happy.
With gfortran there is the compilation option -ffpe-trap=inexact which will make the inexact flag signalling. With gfortran 5.0 the intrinsic module ieee_exceptions is supported which can be used in a portable/standard manner.
You can consider this flag for further experimentation: if it is raised then you can expect to see differences between the two precisions.
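As a rough sketch of such an experiment (the program and variable names are made up here, and it assumes a compiler that supports ieee_exceptions and does not reorder the flag calls around the arithmetic), one can clear the inexact flag, do one multiplication, and query the flag, first with operands that came from single precision and then with operands generated in double precision.

```fortran
program inexact_flag_demo
  use, intrinsic :: ieee_exceptions
  implicit none
  integer, parameter :: sp = kind(1.0), dp = kind(1.0d0)
  real(sp) :: xs, ys
  real(dp) :: x, y, prod
  logical :: raised

  ! Case 1: double precision operands that started as single precision
  call random_number(xs); call random_number(ys)
  x = xs; y = ys
  call ieee_set_flag(ieee_inexact, .false.)
  prod = x*y
  call ieee_get_flag(ieee_inexact, raised)
  print *, 'single-sourced operands:', prod, ' inexact raised: ', raised

  ! Case 2: operands generated directly in double precision
  call random_number(x); call random_number(y)
  call ieee_set_flag(ieee_inexact, .false.)
  prod = x*y
  call ieee_get_flag(ieee_inexact, raised)
  print *, 'double-sourced operands:', prod, ' inexact raised: ', raised
end program inexact_flag_demo
```

If the thesis above holds, the flag should stay clear in the first case and will very likely be raised in the second; aggressive optimization may change what is observed, so it is worth trying at -O0 first.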
In the Fortran code given below, I have made all numbers involved in the calculation of PI double precision, but the value of PI I get looks like a single precision number with a long run of zeros or nines at the end. How do I make the program give PI in double precision? I am using the gfortran compiler.
!This program determines the value of pi using Monte-Carlo algorithm.
program findpi
implicit none
double precision :: x,y,radius,truepi,cnt
double precision,allocatable,dimension(:) :: pi,errpi
integer :: seedsize,i,t,iter,j,k,n
integer,allocatable,dimension(:) :: seed
!Determining the true value of pi to compare with the calculated value
truepi=4.D0*ATAN(1.D0)
call random_seed(size=seedsize)
allocate(seed(seedsize))
do i=1,seedsize
call system_clock(t) !Using system clock to randomise the seed to
!random number generator
seed(i)=t
enddo
call random_seed(put=seed)
n=2000 !Number of times value of pi is determined
allocate(pi(n),errpi(n))
do j=1,n
iter=n*100 !Number of random points
cnt=0.D0
do i=1,iter
call random_number(x)
call random_number(y)
radius=sqrt(x*x + y*y)
if (radius < 1) then
cnt = cnt+1.D0
endif
enddo
pi(j)=(4.D0*cnt)/dble(iter)
print*, j,pi(j)
enddo
open(10,file="pi.dat",status="replace")
write(10,"(F15.8,I10)") (pi(k),k,k=1,n)
call system("gnuplot --persist piplot.gnuplot")
end program findpi
Your calculation is in double precision, but I see two issues:
The first is a systematic error... You determine pi by
pi(j)=(4.D0*cnt)/dble(iter)
iter is at most 2000*100, so 1/iter is at least 5e-6, so you can't resolve anything finer than that ;-)
The second issue is that your IO routines print the results in single precision! The line
write(10,"(F15.8,I10)") (pi(k),k,k=1,n)
and more specifically the format specifier "(F15.8,I10)" needs to be adjusted. At the moment it tells the compiler to use 15 characters overall to print the number, with 8 digits after the decimal point. As a first measure, you could use *:
write(10,*) (pi(k),k,k=1,n)
Alternatively, specify a wider field explicitly. The following uses 22 characters in total, with 15 digits after the decimal point:
write(10,"(F22.15,I10)") (pi(k),k,k=1,n)
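Both points can be checked in isolation with a short sketch like the one below. The program name is invented for illustration; it prints the same double precision value with the three formats discussed and then the step size of the estimate. Since cnt changes in steps of 1, the estimate 4*cnt/iter can only change in steps of 4/iter.

```fortran
program show_digits
  implicit none
  double precision :: truepi
  integer :: iter

  truepi = 4.d0*atan(1.d0)
  iter = 2000*100

  write(*,"(F15.8)")  truepi   ! 8 digits after the decimal point
  write(*,"(F22.15)") truepi   ! 15 digits after the decimal point
  write(*,*) truepi            ! list-directed: width chosen by the compiler

  ! Granularity of the Monte-Carlo estimate for this number of points
  write(*,"(A,ES10.3)") 'smallest step of the estimate: ', 4.d0/dble(iter)
end program show_digits
```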
The following is the code I have written to find the DFT of sine(x) over a period.
program fftw_test
implicit none
INTEGER FFTW_MEASURE
PARAMETER (FFTW_MEASURE=0)
INTEGER FFTW_ESTIMATE
PARAMETER (FFTW_ESTIMATE=64)
INTEGER FFTW_FORWARD
PARAMETER (FFTW_FORWARD=-1)
integer, parameter :: n = 8
integer :: i
double complex, dimension(0:n-1) :: input, output
double precision, parameter :: pi = 3.141592653, h = 2.0d0*pi/(n)
integer*8 :: plan
call dfftw_plan_dft_1d(plan, n, input, output, fftw_forward, fftw_measure)
do i = 0, n-1
input(i) = cmplx(sin(h*i), 0)
end do
call dfftw_execute_dft(plan, input, output)
output = output/n
output(0) = cmplx(0,0) ! setting oddball wavenumber to be 0
call dfftw_destroy_plan(plan)
do i = -n/2, n/2-1, 1
write(*, *) i, output(i+(n/2))
end do
end program
I am aware of the r2c (real to complex) function in the FFTW library, but I was advised to use the normal c2c function. So I defined the input function as a complex number with real part = sine(x) and imaginary part 0.
The DFT of sine(x) is supposed to be fk(-1) = cmplx(0, -0.5) and fk(1) = cmplx(0, 0.5) where fk(k) means the fourier coefficient of the k wavenumber
The output I received is as follows.
-4 ( 0.0000000000000000 , 0.0000000000000000 )
-3 ( 3.2001271327131496E-008,-0.49999998518472011 )
-2 ( -1.0927847071684482E-008, 1.4901161193847656E-008)
-1 ( -1.0145577183762535E-008, 1.4815279864022202E-008)
0 ( -1.0927847071684482E-008, 0.0000000000000000 )
1 ( -1.0145577183762535E-008, -1.4815279864022202E-008)
2 ( -1.0927847071684482E-008, -1.4901161193847656E-008)
3 ( 3.2001271327131496E-008, 0.49999998518472011 )
I am getting fk(-3) = cmplx(~0, -0.5) and fk(3) = cmplx(~0, 0.5). If I increase the grid size to 16, 32 or so I get -n/2 -1 and n/2 -1 wavenumbers with the required values instead of the -1 and 1 wavenumbers.
Does this have something to do with the way FFTW stores the output in the output array ? Or am I going wrong anywhere else ?
Also, I don't seem to be getting 'proper 0' where I should be. It is instead numbers of the order of 10^(-8) which I believe is the smallest my datatype double can hold. Is that something I should be worried about ?
As @VladimirF already said, the ordering of the values is a bit different from what you might expect. The first half of the array holds the positive frequencies, the second half holds the negative frequencies in reverse order (see this link). And you might have to check the sign convention used by FFTW.
The problem with accuracy stems from your single precision value for pi and the use of cmplx which produces single precision complex numbers (use the keyword argument kind). In this case you could simply assign your real value to the complex variables. Applying these two changes yields a precision of ~1e-10. This can be improved by supplying a better approximation for pi (i.e. more than 10 digits).
E.g. the value pi = 3.141592653589793d0 yields results with accuracy of 1e-16.
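To see the ordering and the precision fixes without involving FFTW at all, here is a sketch that uses a naive O(n^2) DFT in place of the library call. That substitution is only for illustration; the exponent sign of -1 is assumed to match FFTW_FORWARD as declared in the question. Output index k is mapped to wavenumber k for k < n/2 and to k - n otherwise.

```fortran
program dft_order
  implicit none
  integer, parameter :: dp = kind(1.0d0), n = 8
  real(dp), parameter :: pi = 3.141592653589793_dp, h = 2.0_dp*pi/n
  complex(dp) :: input(0:n-1), output(0:n-1)
  integer :: j, k, wavenumber

  do j = 0, n-1
    input(j) = sin(h*j)     ! assign the real value directly; imaginary part is 0
  end do

  ! Naive forward DFT: X(k) = sum_j x(j) * exp(-2*pi*i*j*k/n)
  do k = 0, n-1
    output(k) = (0.0_dp, 0.0_dp)
    do j = 0, n-1
      output(k) = output(k) + input(j)*exp(cmplx(0.0_dp, -h*j*k, kind=dp))
    end do
  end do
  output = output/n

  ! First half: wavenumbers 0 .. n/2-1; second half: -n/2 .. -1
  do k = 0, n-1
    wavenumber = merge(k, k - n, k < n/2)
    write(*,*) wavenumber, output(k)
  end do
end program dft_order
```

With this sign convention the coefficient at wavenumber 1 should come out close to (0, -0.5) and the one at wavenumber -1 (stored at index n-1) close to (0, 0.5), with the remaining entries near machine epsilon rather than 1e-8.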
I'm rewriting some code to make a program compile with the gfortran compiler as opposed to ifort compiler I usually use. The code follows:
Subroutine SlideBits (WORD, BITS, ADDR)
Implicit None
Integer(4) WORD
Integer(4) BITS
Integer(4) ADDR
Integer(4) ADDR1
ADDR1 = 32 - ADDR
WORD = (WORD .And. (.Not.ISHFT(1,ADDR1))) .Or. ISHFT(BITS,ADDR1)
End
When I compile the above code using the gfortran compiler, I receive this error:
WORD = (WORD .And. (.Not.ISHFT(1,ADDR1))) .Or. ISHFT(BITS,ADDR1)
Error: Operand of .NOT. operator at (1) is INTEGER(4)
All three of the variables coming into the subroutine are integers. I've looked around a bit and the gfortran wiki states that the gfortran compiler should be able to handle logical statements being applied to integer values. Several other sites I've visited either quote from the gnu wiki or agree with it. This is the first time I've seen this error as the Intel Fortran compiler (ifort) I normally use compiles cleanly.
The comments/answers above "may .Not. be" the correct responses, depending on your ultimate objective.
The likely purpose of that "WORD = .." statement is .NOT. to arrive at a boolean/logical result, but rather to obtain a kind of integer enumerator.
To see this, first "ignore" the bit shifting (iShift() etc), and just look at something like IntR = Int1 .Or. Int2. This will produce a "proper" integer result. The value will depend on not only the values of the int's, but also on their declared "type" (e.g. Integer(1), Integer(2), etc)
That is, the resulting value of WORD will be a "proper" integer; something like "33504" .. or whatever, (likely) .NOT. a 0/1 or -1/0 or .True./.False. etc
If you replace = Int1 .Or. Int2 with = (Int1 /= 0) .Or. (Int2 /= 0) ... you will get an "integer logical" (i.e. 0/1 etc) and WILL NOT produce the desired enumerator ... if that is what you are looking for.
The .Or. on two Int's is a kind of bit-wise addition that produces a new num based on how the bits align/word size etc.
e.g. 3 == 011, 2 = 010 ... so, 3 .Or. 2 ==> 011 = 3
e.g. 3 == 011, 5 = 101 ... so, 3 .Or. 5 ==> 111 = 7
e.g. 5 == 101, 5 = 101 ... so, 5 .Or. 5 ==> 101 = 5
... similarly the .And. provides a kind of multiplication.
This technique is sometimes used to create enumerators, much as powers of two (1,2,4,8...) are used to assign values. Then, any sum of those values can be decomposed into its constituent elements. For instance, if a(1) = 2, and a(2) = 8, then the sum 10 can be decomposed to show the selections were the 2nd and 4th elements of (1,2,4,8,...) etc.
It may help conceptualise this by noting that bit-shifting is like multiplying by 2 (for left shift) and dividing by 2 (for right shift).
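For reference, the same bit arithmetic can be written with the standard intrinsics IOR, IAND, ISHFT and BTEST, which every Fortran compiler accepts for integers. The sketch below is illustrative only (the program and variable names are not from the original code); it reproduces the small examples above and decomposes the sum 10 into its powers of two.

```fortran
program bitwise_demo
  implicit none
  integer :: flags, bit

  print *, ior(3, 2), ior(3, 5), ior(5, 5)   ! 3 7 5
  print *, iand(3, 2), iand(3, 5)            ! 2 1
  print *, ishft(3, 1), ishft(8, -1)         ! 6 4 (left shift doubles, right shift halves)

  ! Combine two flag values and recover which ones were set
  flags = ior(2, 8)                          ! 10
  do bit = 0, 3
    if (btest(flags, bit)) print *, 'element', bit + 1, 'of (1,2,4,8) is set'
  end do
end program bitwise_demo
```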
BTW, you don't need to restrict to Fortran for this. Whack it into a VBA function and see the result in your spreadsheet. VBA does not have bit shift intrinsics, but they are available ... in any case it will demonstrate the Int1 .Or. Int2 behaviour even without bit shifting, such as
Function TwoIntsOr(Int1 As Long, Int2 As Long) As Long
'
TwoIntsOr = Int1 Or Int2
'
End Function
-- .Or. in Fortran
Function TwoIntsOr(Int1, Int2)
Integer :: TwoIntsOr
Integer, Intent(In) :: Int1, Int2
!
TwoIntsOr = Int1 .Or. Int2
!
End Function
It is not standard Fortran to apply logical/boolean operators to integer variables. If the goal is a boolean result, the ideal solution would be to convert the types to logical. If, as it appears from casual examination, the code is really doing bit-wise operations, then it would be better to use the IAND and IOR intrinsic functions.
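Assuming the original statement was indeed meant as a bit-wise operation (this is an interpretation of the code, not something the question confirms), the subroutine could be rewritten with standard intrinsics roughly as follows. The intent attributes and the small driver program are additions for the sake of a self-contained example.

```fortran
program slidebits_demo
  implicit none
  integer(4) :: w

  w = 0
  call SlideBits(w, 1, 30)
  print *, w          ! ADDR1 = 2, so bit 2 is set: prints 4

contains

  subroutine SlideBits (WORD, BITS, ADDR)
    integer(4), intent(inout) :: WORD
    integer(4), intent(in)    :: BITS, ADDR
    integer(4) :: ADDR1

    ADDR1 = 32 - ADDR
    ! Clear bit ADDR1 of WORD, then OR in BITS shifted to that position.
    WORD = ior(iand(WORD, not(ishft(1, ADDR1))), ishft(BITS, ADDR1))
  end subroutine SlideBits

end program slidebits_demo
```

This form compiles with both gfortran and ifort and does not depend on how either compiler represents .true. and .false. internally.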
gfortran is expecting booleans for the logical operators and the code is providing integers. Use comparisons with zero instead of logical operators.
WORD = ((WORD /= 0) .And. (ISHFT(1,ADDR1) == 0)) .Or. (ISHFT(BITS,ADDR1) /= 0)
gfortran and ifort use different representations for .true. and .false. values, so it's best to stick to booleans when that's what the code needs. In a conversion from ifort to gfortran I got bit by the former representing .true. as -1 and the latter using 1 for the same purpose, instead of the traditional (C-like) not 0.