I've been having some trouble with variables precision in my code...
For a while i've been declaring variables as real(kind=8) :: var and I now understand that this is not very portable and has some other complications, but basically I'm getting a lot of imprecision in numerical calculations.
Now I'm using:
INTEGER, PARAMETER :: R8 = SELECTED_REAL_KIND (30, 300)
with variable declaration: real(R8) :: var1,var2.
Before I usually initialized variables as var1 = 1.0D0 and now I'm using var1 = 1.0_R8 but what should I do with var1 = 1.0D-20? I've run a simple code which proved that 1.0D-20 won't give me an accurate result but something like 10.0_r8**(-20.0_r8) will. Is there an easier way to define the variable? I know that 1D-20 is very small but the code I'm using really needs 30 decimal case precision.
Thank you for your help!
There's really two things happening here. First, declaring an exponent using d notation is the same as declaring it as type double precision.
Second, the r8 variable you declare requires more precision than most (all?) 8-byte representations. So you're actually declaring most of your variables as quad, then initializing them as double, which is the source of your problem.
As mentioned in the comments, the answer to your explicit question is to declare exponents using the following notation
real(mytype) :: a = 1.23e-20_mytype
This notation is cumbersome but easy to get used to for constant declaration.
Here's a little sample code I used to test your types:
program main
use ISO_FORTRAN_ENV, only : REAL64 => REAL64, REAL128
implicit none
INTEGER, PARAMETER :: R8 = SELECTED_REAL_KIND (30, 300)
real(r8) :: x, y, z
real(REAL64) :: a, b, c
x = 1.23d-20
y = 1.23e-20_r8
z = 1.23_r8*(10._r8**-20)
write(*,*) x
write(*,*) y
write(*,*) z
a = 1.23d-20
b = 1.23e-20_REAL64
c = 1.23_REAL64*(10._REAL64**-20)
write(*,*) a
write(*,*) b
write(*,*) c
write(*,*) 'Types: ', REAL64, R8, REAL128
end program main
for Intel 16.0, this gives:
mach5% ifort main.f90 && ./a.out
1.230000000000000057423043720037598E-0020
1.230000000000000000000000000000000E-0020
1.230000000000000000000000000000002E-0020
1.230000000000000E-020
1.230000000000000E-020
1.230000000000000E-020
Types: 8 16 16
Related
This question already has answers here:
How to pass array to a procedure which is passed as an argument to another procedure using Fortran
(2 answers)
Closed 1 year ago.
I'm writing a subroutine that can take function names are argument.
In my example, it is call call_test(ga), where ga is a function ga(x).
My old practice works fine if x is a scalar.
The problem is the program failed if x is an array.
The minimal sample that can reproduce the problem is the code below:
module fun
implicit none
private
public :: ga, call_test
contains
subroutine call_test(fn)
double precision,external::fn
double precision::f
double precision,dimension(0:2)::x0
x0 = 0.111d0
print*,'Input x0=',x0
print*,'sze x0:',size(x0)
f = fn(x0)
print*,'fn(x0)=',f
end subroutine call_test
function ga(x) result(f)
double precision,dimension(0:) :: x
double precision::f
print*,'size x inside ga:',size(x)
print*,'x in ga is:=',x
f = sum(x)
end function ga
end module
program main
use fun
call call_test(ga)
end program main
Using latest ifort, my execution result is:
Input x0= 0.111000000000000 0.111000000000000 0.111000000000000
sze x0: 3
size x inside ga: 140732712329264
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
a.out 000000010C6EC584 Unknown Unknown Unknown
libsystem_platfor 00007FFF20610D7D Unknown Unknown Unknown
a.out 000000010C6C62B2 _MAIN__ 32 main.f90
a.out 000000010C6C5FEE Unknown Unknown Unknown
Using gfortran the result is
Input x0= 0.11100000000000000 0.11100000000000000 0.11100000000000000
sze x0: 3
size x inside ga: 0
x in ga is:=
fn(x0)= 0.0000000000000000
My question is why is this, and how to make it work. A functional code solution based on my code is highly appreciated.
#IanBush is right in his comment. You need an explicit interface as the function argument takes an assumed-shape dummy argument. Since you have some other deprecated features in your code, here is a modern reimplementation of it in the hope of improving Fortran community coding practice,
module fun
use iso_fortran_env, only: RK => real64
implicit none
private
public :: ga, call_test
abstract interface
function fn_proc(x) result(f)
import RK
real(RK), intent(in) :: x(0:)
real(RK) :: f
end function fn_proc
end interface
contains
subroutine call_test(fn)
implicit none
procedure(fn_proc) :: fn
real(RK) :: f
real(RK) :: x0(0:2)
x0 = 0.111_RK
write(*,"(*(g0,:,' '))") 'Input x0 =',x0
write(*,"(*(g0,:,' '))") 'sze x0:',size(x0)
f = fn(x0)
write(*,"(*(g0,:,' '))") 'fn(x0) =', f
end subroutine call_test
function ga(x) result(f)
real(RK), intent(in) :: x(0:)
real(RK) :: f
write(*,"(*(g0,:,' '))") 'size x inside ga:',size(x)
write(*,"(*(g0,:,' '))") 'x in ga is:',x
f = sum(x)
end function ga
end module
program main
use fun
call call_test(ga)
end program main
Note the use of the abstract interface. The import statement merely imports the symbol RK for usage within the abstract. Without it, you will have to use iso_fortran_env to declare RK as real64. Do not use double precision, it is deprecated and does not necessarily mean 64bit real number. Also, note the conversion of 0.111d0 to 0.111_RK. Also, I strongly recommend not using single-letter variable names. They are ugly, error-prone, and non-informative. Modern Fortran allows variable names up to 63 characters long. There is no harm in having long but descriptive variable names. Here is the code output,
$main
Input x0 = 0.11100000000000000 0.11100000000000000 0.11100000000000000
sze x0: 3
size x inside ga: 3
x in ga is: 0.11100000000000000 0.11100000000000000 0.11100000000000000
fn(x0) = 0.33300000000000002
I try to introduce complex-valued array and variables in Fortran. For the following code
program
integer :: i , j !two dimensional real array
real(dp) :: m1(3,2)
complex(dp) :: a1(3,2),a2(3,2), c0, c1
!assigning some values to the array numbers
c0 = (1.0_dp, 0.0_dp)
c1 = (0.000000001_dp, 0.0_dp)
do i=1,3
do j = 1, 2
a1(i,j) = c0*i*j
end do
end do
do i=1,3
do j = 1, 2
a2(i,j) = c0*i*j + c1
end do
end do
do i=1,3
do j = 1, 2
if (dabs( dreal(a1(i,j)) - dreal(a2(i,j))) > 1.e-6 then
write (*,*), 'warning',i,j, a1(i,j), a2(i,j)
end if
end do
end do
write (*,*), a1(1,1), a2(1,1)
end program
ifort gives me
complex(dp) :: a1(3,2), a2(3,2)
-----------^
why complex(dp) requires compile-time constant and how to fix it? Thank you.
The kind parameter dp must be a constant, see Fortran - setting kind/precision of a variable at run time
However, you do not have the same problem as in the link, you did not try to define dp at all! First, you must use IMPLICIT NONE, it is absolutely necessary for safe programming and the biggest problem with your code. Then it will tell you that the type of dp is not declared.
You just define the kind constant in one of the usual ways, as a constant. The most simple is:
program main
!NECESSARY!
implicit none
integer, parameter :: dp = kind(1.d0)
Note that I named the program main, you have to name your program. Or just omit the program.
More about defining real kinds can be found at Fortran 90 kind parameter
Another note: Forget dabs() and dreal(). dabs() is an old remnant of Fortran 66 and dreal() is not standard Fortran at all. Just use abs and either dble() or better real( ,dp).
I have a seemingly simple problem: I want to detect whether a floating point addition in Fortran will overflow by doing something like the following:
real*8 :: a, b, c
a = ! some value
b = ! some value
if (b > DOUBLE_MAX - a) then
! handle overflow
else
c = a + b
The problem is that I don't know what DOUBLE_MAX should be. I'm aware of how floating point numbers are represented according to IEEE 754 but the largest value representable by a double precision floating point number seems to be too large for a variable of type real*8 (i.e. if I try to assign 1.7976931348623157e+308 to such a variable gfortran complains). C and C++ have predefined constants/generic functions for this purpose but I couldn't find a Fortran equivalent.
Note: I'm aware that real*8 is not really part of the standard but there seems to be no other way to reliably specify that a floating point number should use the double precision format.
Something like this?
real(REAL64) function func( a, b )
use, intrinsic :: iso_fortran_env, only: REAL64, INT64
use, intrinsic :: ieee_arithmetic, only: ieee_value, ieee_set_flag, IEEE_OVERFLOW, IEEE_QUIET_NAN
implicit none
real(REAL64), intent(in) :: a, b
real(REAL64), parameter :: MAX64 = huge(0.0_REAL64)
if ( b > MAX64-a ) then
! Set IEEE_OVERFLOW flag and return NaN
call ieee_set_flag(IEEE_OVERFLOW,.true.)
func = ieee_value(func,IEEE_QUIET_NAN)
else
func = a + b
end if
return
end function func
All I could find for intrinsic ieee_exceptions module is:
https://github.com/gcc-mirror/gcc/blob/master/libgfortran/ieee/ieee_exceptions.F90
For setting NaN value see post.
There are likely better ways to detect overflow, but the precise answer to your question is to use the huge function. HUGE(a) returns the largest possible number representable by the type a.
Is there an equivalent for the 'a' format specifier known from C in Fortran?
C Example:
printf("%a\n",43.1e6); // 0x1.48d3bp+25
Exporting floating point numbers in hexadecimal format prevents rounding errors. While the rounding errors are usually negligible, it is still advantageous to be able to restore a saved value exactly. Note, that the hexadecimal representation produced by printf is portable and human readable.
How can I export and parse floating point numbers in Fortran like I do in C using the 'a' specifier?
If you want to have full precision, the best way is to use unformatted files, such as this:
program main
real :: r
integer :: i
r = -4*atan(1.)
open(20,access="stream")
write (20) r
close(20)
end program main
(I used stream access, which is new to Fortran 2003, because
it is usually less confusing than normal unformatted access). You can then use, for example, od -t x1 fort.20 to look at this as a hex dump.
You can also use TRANSFER to copy the bit pattern to an integer and then use the Z edit descriptor.
If you really want to mimic the %a specifier, you'll have to roll your own. Most machines now use IEEE format. Use TRANSFER for copying the pattern to an integer, then pick that apart using IAND (and multiplications or divisions by powers of two for shifting).
Another option would be to let the C library do your work for you and interface via C binding. This rather depends on a modern compiler (some F2003 features used).
module x
use, intrinsic :: iso_c_binding
private
public :: a_fmt
interface
subroutine doit(a, dest, n) bind(C)
import
real(kind=c_double), value :: a
character(kind=c_char), intent(out) :: dest(*)
integer, value :: n
end subroutine doit
end interface
interface a_fmt
module procedure a_fmt_float, a_fmt_double
end interface a_fmt
contains
function a_fmt_float(a) result(res)
real(kind=c_float), intent(in) :: a
character(len=:), allocatable :: res
res = a_fmt_double (real(a, kind=c_double))
end function a_fmt_float
function a_fmt_double(a) result(res)
real(kind=c_double), intent(in) :: a
character(len=:), allocatable :: res
character(len=30) :: dest
integer :: n
call doit (a, dest, len(dest))
n = index(dest, achar(0))
res = dest(1:n)
end function a_fmt_double
end module x
program main
use x
implicit none
double precision :: r
integer :: i
r = -1./3.d0
do i=1,1030
print *,a_fmt(r)
r = - r * 2.0
end do
end program main
#include <stdio.h>
void doit (double a, char *dest, int n)
{
snprintf(dest, n-1, "%a", a);
}
I would like to calculate gamma(-170.1) using the program below:
program arithmetic
! program to do a calculation
real(8) :: x
x = GAMMA(-170.1)
print *, x
end program
but I get the error:
test.f95:4.10:
x = GAMMA(-170.1)
1
Error: Result of GAMMA underflows its kind at (1)
when I compile with gfortran. According to Maple gamma(-170.1) = 5.191963205*10^(-172) which I think should be within the range of the exponent of the variable x as I've defined it.
The below modification of your program should work. Remember that in Fortran the RHS is evaluated before assigning to the LHS, and that floating point literals are of default kind, that is single precision. Thus, making the argument to GAMMA double precision the compiler chooses the double precision GAMMA.
program arithmetic
! program to do a calculation
integer, parameter :: dp = kind(1.0d0)
real(dp) :: x
x = GAMMA(-170.1_dp)
print *, x
end program
-170.0 may be treated as a float. If so, changing it to a double should resolve the issue.