I have a do while loop in my program whose continuation condition keeps giving me off-by-one errors, and I can't figure out why. It looks like this:
do while (ii .le. nri .and. ed(ii) .le. e1)
! do some stuff...
ii = ii + 1
end do
where ii and nri are scalar integers, e1 is a scalar real, and ed is a real array of length nri. What I expect to happen after the last run is that since ii.le.nri returns .false. the second condition is never tested, and I don't get any off-by-one problems. I've verified with the debugger that ii.le.nri really does return .false. - and yet the program crashes.
To verify my assumption that only one condition is tested, I even wrote a small test program, which I compiled with the same compiler options:
program iftest
implicit none
if (returns_false() .and. returns_true()) then
print *, "in if block"
end if
contains
function returns_true()
implicit none
logical returns_true
print *, "in returns true"
returns_true = .true.
end function
function returns_false()
implicit none
logical returns_false
print *, "in returns false"
returns_false = .false.
end function
end program
Running this program outputs, as I expected, only
$ ./iftest
in returns false
and exits. The second test is never run.
Why doesn't this apply to my do while clause?
In contrast to some languages, Fortran does not guarantee any particular order of evaluation of compound logical expressions, nor does it guarantee short-circuit evaluation. In the case of your code, on the last pass through the while loop the value of ii is set to nri+1. It is legitimate for your compiler to have generated code which tests ed(nri+1)<=e1 and thereby refers to an element outside the bounds of ed. This may well be the cause of your program's crash.
Your expectations are contrary to the Fortran standard's prescriptions for the language.
If you haven't already done so, try recompiling your code with array-bounds checking switched on and see what happens.
As to why your test didn't smoke out this issue, well I suspect that all your test really shows is that your compiler generates a different order of execution for different species of condition and that you are not really comparing like-for-like.
Extending the answer by High Performance Mark, here is one way to rewrite the loop:
ii_loop: do
if (ii .gt. nri) exit ii_loop
if (ed(ii) .gt. e1) exit ii_loop
! do some stuff
ii = ii + 1
end do ii_loop
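For anyone who wants to try this in isolation, here is a minimal self-contained sketch of the same loop. The program name, the array size and the values of ed and e1 are made up for illustration; e1 is chosen larger than every element so that only the index test can end the loop.

```fortran
program safe_scan
  implicit none
  integer, parameter :: nri = 5                 ! hypothetical array length
  real :: ed(nri) = [1.0, 2.0, 3.0, 4.0, 5.0]   ! hypothetical data
  real :: e1
  integer :: ii

  e1 = 10.0   ! hypothetical threshold, larger than every element of ed
  ii = 1
  ii_loop: do
    ! The index is tested on its own, so ed(ii) is never referenced
    ! once ii has moved past the end of the array.
    if (ii > nri) exit ii_loop
    if (ed(ii) > e1) exit ii_loop
    ! do some stuff
    ii = ii + 1
  end do ii_loop

  print *, 'loop ended with ii =', ii
end program safe_scan
```

With these values the loop leaves ii at nri+1 (that is, 6) without ever touching ed(6), so it should also run cleanly with array-bounds checking enabled.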
While benchmarking 'subtracting a vector from a matrix', I noticed Fortran compilers appear to be performing some sort of optimization when I reuse variables/code. It looks like the arrays are being reused from cache memory, however I'm not sure.
I believe this optimization is causing discrepancies in my benchmark results and would like to identify the specific type of optimization and, if possible, turn it off.
For example, the following code compares two cases and then adds a Case 3 that is identical to Case 1. However, the time reported for Case 3 is much less than that for Case 1.
program main
implicit none
integer :: n = 1E7
real*8, dimension(3) :: a
real*8, allocatable, dimension(:, :) :: b, c
real :: start, finish
integer :: i
allocate(b(n, 3))
allocate(c(n, 3))
call random_number(a)
call random_number(b)
! Case 1: Do loop
call cpu_time(start)
do i = 1, 3
c(:, i) = b(:, i) - a(i)
enddo
call cpu_time(finish)
print*, 'do-loop : ', finish-start
! Case 2: Spread
call cpu_time(start)
c = b - spread(a, dim=1, ncopies=n)
call cpu_time(finish)
print*, 'spread : ', finish-start
! Case 3: Do loop (again)
call cpu_time(start)
do i = 1, 3
c(:, i) = b(:, i) - a(i)
enddo
call cpu_time(finish)
print*, 'do-loop : ', finish-start
end program main
This produces similar results with Intel and GNU compilers as shown below. I have tried investigating using flags like -O0 and -qopt-report, but cannot understand why the code behaves so. Because the arrays are large, ulimit -s unlimited might be required (on Linux) to avoid a segmentation fault.
$ ifort reuse.f90 && ./a.out
do-loop : 0.2072840
spread : 0.4781271
do-loop : 3.6670923E-02
$ gfortran reuse.f90 && ./a.out
do-loop : 0.232345015
spread : 0.342370987
do-loop : 4.52849865E-02
At least in Linux, the memory allocator uses the "optimistic memory allocation strategy" (or see Why can Fortran allocate such large arrays? for Fortran). It assumes that there will be enough memory, assigns the virtual address space and that is all. The memory pages are only assigned when you access the memory by assigning some values (or trying to read the undefined garbage).
That has two implications:
If you requested too much memory, the allocate may still succeed and the program may crash later.
The first access will take more time.
To remove the problem with the latter, initialize the memory first, e.g. C = 0.
There are other reasons why you should disregard the first runs of any tests and always run them multiple times - not just one long test, but multiple short runs. There are various turbo modes in modern CPUs that may take some time to start, for example.
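As a rough sketch of that advice (not a rigorous benchmark), the do-loop case from the question could be timed like this. The program name, the kind parameter dp, the repetition count of 3 and the final checksum print are my own additions for illustration.

```fortran
program first_touch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 10000000      ! same size as in the question
  real(dp) :: a(3)
  real(dp), allocatable :: b(:, :), c(:, :)
  real :: start, finish
  integer :: i, rep

  allocate(b(n, 3), c(n, 3))
  call random_number(a)
  call random_number(b)   ! b is written here, so its pages are already touched
  c = 0.0_dp              ! write to c once before any timing starts

  ! Time the same loop several times instead of trusting a single run.
  do rep = 1, 3
    call cpu_time(start)
    do i = 1, 3
      c(:, i) = b(:, i) - a(i)
    end do
    call cpu_time(finish)
    print *, 'do-loop, repetition', rep, ':', finish - start
  end do

  ! Use the result so the timed work is not trivially dead code.
  print *, 'checksum:', sum(c)
end program first_touch
```

If the explanation above applies on your system, the first repetition should no longer stand out the way Case 1 did in the question, although some variation between repetitions is still to be expected.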
I just wanted a second opinion. I am newer to gfortran and I have this code:
program Assignmenttwo
!Nik Wrye
!CDS251-001
!Homework #2/Assignment #2
!September 9th, 2021
!This program is to Next, write a do loop that iterates 10 million
!times. In the loop, add 1.e-7 (one ten-millionth) to each variable (The variable should
!appear on both sides of the equal sign.) After the loop, print out each variable with
!a label.
implicit none
!declaring variables
real*4:: Numone, Numtwo
!initializing variables
Numone = 1.0
Numtwo = 2.0
do while (Numone < 1.1.and.Numtwo<2.1)!this do statement will cycle the loop until it has hit the 10 millionth time
Numone = Numone+1.e-7 !adding the desired amount
Numtwo = Numtwo+1.e-7
enddo
print*, Numone
print*, Numtwo
end program Assignmenttwo
I am expecting the output to be 1.1 and 2.1 but I am getting 1.1 and 2.0. Any ideas?
Arguably the point of this assignment is that you get 1.1 but 2.0. This is down to the behaviour of floating point, the full details of which you can read about elsewhere.
You can do the simple test
print *, 1.+1e-7, 2.+1e-7
print *, 1.+1e-7/=1., 2.+1e-7/=2.
without doing the loops to see the effect.
Instead of repeating massive detail about the mechanics, I'll just mention that you can confirm that 1e-7 is "too small" to have an effect by using the nearest intrinsic function
print *, nearest(2., 1.)
With double precision you'll get different answers. (And this is why careful consideration of which real type to use in any case is important.)
Don't forget that Fortran no longer allows real DO control where increments may indeed be too small to have an effect.
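To make "too small" a bit more concrete, here is a small sketch using the spacing intrinsic, which returns the gap between adjacent representable numbers near its argument. It assumes the default real kind is IEEE single precision, which is typical but not guaranteed; the program and variable names are made up.

```fortran
program spacing_demo
  implicit none
  real :: one, two

  one = 1.0
  two = 2.0

  ! Gap between neighbouring default reals near each starting value
  print *, 'spacing near 1.0:', spacing(one)
  print *, 'spacing near 2.0:', spacing(two)
  print *, 'increment       :', 1.e-7

  ! With round-to-nearest, an increment below half the local spacing
  ! is rounded away, so the sum equals the original value.
  print *, 'increment > half spacing near 1.0?', 1.e-7 > 0.5 * spacing(one)
  print *, 'increment > half spacing near 2.0?', 1.e-7 > 0.5 * spacing(two)
end program spacing_demo
```

Under that assumption one would expect T for the first comparison and F for the second, which matches the behaviour described above: the addition changes Numone but leaves Numtwo at 2.0.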
So I have this code in Fortran:
REAL*8 DELTA,XI,SO,S
SO=0.273333328465621
S=0.323333333556851
XI=0.01
DELTA =SO-S ! DELTA = -0.0500000050912297
IF(DELTA.GE.0.0)XI=XI/10
This code with those values always ends up evaluating the IF as true and executes the XI division (i.e. XI=0.001 afterwards). I think this is weird behavior, but my job is to replicate it in C#.
Compiled with Intel Fortran, no optimizations and full debug information, as part of a 32-bit DLL.
Any ideas why this happens?
The following doesn't execute the IF statement, with either gfortran or ifort.
program test_delta
double precision DELTA
DELTA = -0.0500001
IF (DELTA .GE. 0.0) then
write (*, *) "IF-statement executed"
ENDIF
end program test_delta
One change is that I added the expected "then" to the IF statement. Otherwise both compilers issued error messages.
I'm adapting some Fortran code I didn't write, and I don't have a lot of Fortran experience myself. I just found a situation where some malformed input got silently ignored, and would like to change that code to do something more appropriate. If this were C, then I'd do something like
fprintf(stderr, "There was an error of kind foo");
exit(EXIT_FAILURE);
But in fortran, the best I know how to do looks like
write(*,*) 'There was an error of kind foo'
stop
which lacks the choice of output stream (minor issue) and exit status (major problem).
How can I terminate a fortran program with a non-zero exit status?
In case this is compiler-dependent, a solution which works with gfortran would be nice.
The stop statement allows an integer or character value. It seems likely that these will be output to stderr when that exists, but as stderr is OS-dependent, it is unlikely that the Fortran language standard requires that, if it says anything at all. It is also likely that, if you use the numeric option, the exit status will be set. I tried it with gfortran on a Mac, and that was the case:
program TestStop
integer :: value
write (*, '( "Input integer: " )', advance="no")
read (*, *) value
if ( value > 0 ) then
stop 0
else
stop 9
end if
end program TestStop
While precisely what stop with an integer or string will do is OS-dependent, the statement is part of the language and will always compile. call exit is a GNU extension and might not link on some OSes.
In addition to stop n, there is also error stop n since Fortran 2008.
With gfortran under Windows, they both send the error number to the OS, as can be seen with a subsequent echo %errorlevel%. The statement error stop can also be passed an error message.
program bye
read *, n
select case (n)
case (1); stop 10
case (2); error stop 20
case (3); error stop "Something went wrong"
case (4); error stop 2147483647
end select
end program
I couldn't find anything about STOP in the gfortran 4.7.0 keyword index, probably because it is a language keyword and not an intrinsic. Nevertheless, there is an EXIT intrinsic which seems to do just what I was looking for: exit with a given status. And the Fortran Wiki has a small example of using stderr which mentions a constant ERROR_UNIT. So my code now looks like this:
USE ISO_FORTRAN_ENV, ONLY : ERROR_UNIT
[…]
WRITE(ERROR_UNIT,*) 'There was an error of kind foo'
CALL EXIT(1)
This at least compiles. Testing still pending, but it should work. If someone knows a more elegant or more appropriate solution, feel free to offer alternative answers to this question.
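For comparison, here is a sketch that combines ERROR_UNIT with the error stop statement mentioned in the other answer, which avoids the non-standard EXIT call. It needs a compiler that supports Fortran 2008; the program name and the input_ok flag are hypothetical stand-ins for a real input check.

```fortran
program fail_demo
  use, intrinsic :: iso_fortran_env, only: error_unit
  implicit none
  logical :: input_ok

  input_ok = .false.   ! hypothetical stand-in for a real validity check

  if (.not. input_ok) then
    ! Report the problem on the standard error unit ...
    write (error_unit, '(a)') 'There was an error of kind foo'
    ! ... and terminate with a non-zero stop code.
    error stop 1
  end if

  print *, 'input accepted'
end program fail_demo
```

As noted above, how the stop code reaches the operating system is processor-dependent, so it is worth checking the exit status (echo $? or echo %errorlevel%) with your own compiler.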
I've read about the save statement in the (Intel's) language reference document, but I cannot quite grasp what it does. Could someone explain to me in simple language what it means when the save statement is included in a module ?
In principle, when a module goes out of scope, the variables of that module become undefined -- unless they are declared with the SAVE attribute, or a SAVE statement is used. "Undefined" means that you are not allowed to rely on the variable having the previous value if you again use the module -- it might have the previous value when you re-access the module, or it might not -- there is no guarantee. But many compilers don't do this for module variables -- the variables probably retain their values -- it isn't worth the effort for the compiler to figure out whether a module remains in scope or not and probably module variables are treated as global variables -- but don't rely on that! To be safe, either use "save" or "use" the module from the main program so that it never goes out of scope.
"save" is also important in procedures, to store "state" across invocations of the subroutine or function (as written by @ire_and_curses) -- "first invocation" initializations, counters, etc.
subroutine my_sub (y)
integer :: var
integer, save :: counter = 0
logical, save :: FirstCall = .TRUE.
counter = counter + 1
write (*, *) counter
if (FirstCall) then
FirstCall = .FALSE.
....
end if
var = ....
etc.
In this code fragment, "counter" will report the number of invocations of subroutine my_sub. Though actually in Fortran >=90 one can omit the "save" because the initialization in the declaration implies "save".
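Since the fragment above is not compilable as it stands, here is a minimal runnable sketch of just the counter part. The module, subroutine and program names are made up for illustration.

```fortran
module counter_mod
  implicit none
contains
  subroutine count_calls()
    ! Explicit save for clarity; the initialization alone would imply it.
    integer, save :: counter = 0
    counter = counter + 1
    print *, 'call number', counter
  end subroutine count_calls
end module counter_mod

program demo_save
  use counter_mod
  implicit none
  call count_calls()
  call count_calls()
  call count_calls()
end program demo_save
```

The three calls report 1, 2 and 3, because counter is initialized only once and keeps its value between invocations.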
In contrast to the module case, with modern compilers, without the save attribute or initialization-on-a-declaration, it is normal for local variables of procedures to lose their values across invocations. So if you attempt to use "var" on a later call before redefining it in that call, the value is undefined and probably won't be the value calculated on a previous invocation of the procedure.
This is different from the behavior of many FORTRAN 77 compilers, some of which retained the values of all local variables, even though this wasn't required by the language standard. Some old programs were written relying on this non-standard behavior -- these programs will fail on the newer compilers. Many compilers have an option to use the non-standard behavior and "save" all local variables.
LATER EDIT: update with a code example that shows incorrect usage of a local variable that should have the save attribute but doesn't:
module subs
contains
subroutine asub (i, control)
implicit none
integer, intent (in) :: i
logical, intent (in) :: control
integer, save :: j = 0
integer :: k
j = j + i
if ( control ) k = 0
k = k + i
write (*, *) 'i, j, k=', i, j, k
end subroutine asub
end module subs
program test_saves
use subs
implicit none
call asub ( 3, .TRUE. )
call asub ( 4, .FALSE. )
end program test_saves
Local variable k of the subroutine is intentionally misused -- in this program it is initialized in the first call since control is TRUE, but on the second call control is FALSE, so k is not redefined. But without the save attribute k is undefined, so using its value is illegal.
Compiling the program with gfortran, I found that k retained its value anyway:
i, j, k= 3 3 3
i, j, k= 4 7 7
Compiling the program with ifort and aggressive optimization options, k lost its value:
i, j, k= 3 3 3
i, j, k= 4 7 4
Using ifort with debugging options, the problem was detected at runtime!
i, j, k= 3 3 3
forrtl: severe (193): Run-Time Check Failure. The variable 'subs_mp_asub_$K' is being used without being defined
Normally, local variables go out of scope once execution leaves the current procedure, and so have no 'memory' of their value on previous invocations. SAVE is a way of specifying that a variable in a procedure should maintain its value from one call to the next. It's useful when you want to store state in a procedure, for example to keep a running total or maintain a variable's configuration.
There's a good explanation here, with an example.
A short explanation could be: the attribute save says that the value of a variable must be preserved across different calls to the same subroutine/function. Otherwise normally when you return from a subroutine/function, "local" variables lose their values since the memory where those vars were stored is released. It is like static in C, if you know this language.