I'm adapting some Fortran code that I didn't write, and I don't have much Fortran experience myself. I just found a situation where some malformed input got silently ignored, and would like to change that code to do something more appropriate. If this were C, then I'd do something like
fprintf(stderr, "There was an error of kind foo");
exit(EXIT_FAILURE);
But in fortran, the best I know how to do looks like
write(*,*) 'There was an error of kind foo'
stop
which lacks the choice of output stream (minor issue) and exit status (major problem).
How can I terminate a fortran program with a non-zero exit status?
In case this is compiler-dependent, a solution which works with gfortran would be nice.
The stop statement allows an integer or character value. It seems likely that these will be output to stderr when that exists, but as stderr is OS dependent, it is unlikely that the Fortran language standard requires that, if it says anything at all. It is also likely that if you use the numeric option, the exit status will be set. I tried it with gfortran on a Mac, and that was the case:
program TestStop
integer :: value
write (*, '( "Input integer: " )', advance="no")
read (*, *) value
if ( value > 0 ) then
stop 0
else
stop 9
end if
end program TestStop
While precisely what stop with an integer or string will do is OS-dependent, the statement is part of the language and will always compile. call exit is a GNU extension and might not link on some OSes.
In addition to stop n, there is also error stop n since Fortran 2008.
With gfortran under Windows, they both send the error number to the OS, as can be seen with a subsequent echo %errorlevel%. The statement error stop can also be passed an error message.
program bye
read *, n
select case (n)
case (1); stop 10
case (2); error stop 20
case (3); error stop "Something went wrong"
case (4); error stop 2147483647
end select
end program
I couldn't find anything about STOP in the gfortran 4.7.0 keyword index, probably because it is a language keyword and not an intrinsic. Nevertheless, there is an EXIT intrinsic which seems to do just what I was looking for: exit with a given status. And the Fortran wiki has a small example of using stderr which mentions a constant ERROR_UNIT. So my code now looks like this:
USE ISO_FORTRAN_ENV, ONLY : ERROR_UNIT
[…]
WRITE(ERROR_UNIT,*) 'There was an error of kind foo'
CALL EXIT(1)
This at least compiles. Testing still pending, but it should work. If someone knows a more elegant or more appropriate solution, feel free to offer alternative answers to this question.
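A variant of the code above that avoids the EXIT extension could combine ERROR_UNIT with error stop. This is only a minimal sketch, assuming a compiler with Fortran 2008 support; the program name fail_demo and the input check are made up for illustration.

```fortran
program fail_demo
  ! ERROR_UNIT is the standard named constant for the error output unit.
  use, intrinsic :: iso_fortran_env, only: error_unit
  implicit none
  integer :: n, ios

  ! iostat is non-zero if the input is malformed or missing.
  read (*, *, iostat=ios) n
  if (ios /= 0) then
    write (error_unit, '(a)') 'There was an error of kind foo: could not read an integer'
    ! Constant stop code; how it reaches the OS is processor-dependent.
    error stop 1
  end if

  print '(a, i0)', 'Read ', n
end program fail_demo
```

With gfortran the stop code was reported earlier in this thread to end up as the process exit status, but the standard leaves that to the processor, so it is worth checking on the target system.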
I have a parallel fortran code in which I want only the rank=0 process to be able to write to stdout, but I don't want to have to litter the code with:
if(rank==0) write(*,*) ...
so I was wondering if doing something like the following would be a good idea, or whether there is a better way?
program test
use mpi
implicit none
integer :: ierr
integer :: nproc
integer :: rank
integer :: stdout
call mpi_init(ierr)
call mpi_comm_rank(mpi_comm_world, rank, ierr)
call mpi_comm_size(mpi_comm_world, nproc, ierr)
select case(rank)
case(0)
stdout = 6
case default
stdout = 7
open(unit=stdout, file='/dev/null')
end select
write(stdout,*) "Hello from rank=", rank
call mpi_finalize(ierr)
end program test
This gives:
$ mpirun -n 10 ./a.out
Hello from rank= 0
Thanks for any advice!
There are two disadvantages to your solution:
This "clever" solution actually obscures the code, since it lies: stdout isn't stdout any more. If someone reads the code he/she will think that all processes are writing to stdout, while in reality they aren't.
If you want all processes to write to stdout at some point, what will you do then? Add more tricks?
If you really want to stick with this trick, please don't use "stdout" as a variable for the unit number, but e.g. "master" or anything that indicates you're not actually writing to stdout. Furthermore, you should be aware that the number 6 isn't always stdout. Fortran 2003 allows you to check the unit number of stdout, so you should use that if you can.
My advice would be to stay with the if(rank==0) statements. They are clearly indicating what happens in the code. If you use lots of similar i/o statements, you could write subroutines for writing only for rank 0 or for all processes. These can have meaningful names that indicate the intended usage.
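The subroutine idea could look roughly like the following. It is a sketch, not a tested library: the module name rank_io and the routines write_master and write_all are hypothetical names, and it assumes the same "use mpi" interface as the question. It also uses OUTPUT_UNIT from ISO_FORTRAN_ENV rather than hard-coding unit 6.

```fortran
module rank_io
  use mpi
  use, intrinsic :: iso_fortran_env, only: output_unit
  implicit none
  private
  public :: write_master, write_all

contains

  ! Print msg on rank 0 only; all other ranks do nothing.
  subroutine write_master(msg)
    character(len=*), intent(in) :: msg
    integer :: rank, ierr
    ! Looked up on every call for simplicity; caching the rank is an option.
    call mpi_comm_rank(mpi_comm_world, rank, ierr)
    if (rank == 0) write (output_unit, '(a)') msg
  end subroutine write_master

  ! Print msg on every rank, prefixed with the rank number.
  subroutine write_all(msg)
    character(len=*), intent(in) :: msg
    integer :: rank, ierr
    call mpi_comm_rank(mpi_comm_world, rank, ierr)
    write (output_unit, '(a, i0, a, a)') '[rank ', rank, '] ', msg
  end subroutine write_all

end module rank_io
```

A call such as call write_master('Hello from the master rank') then states at the call site which processes are expected to print.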
mpirun comes with the option to redirect stdout from each process into separate files. For example, -output-filename out would result in out.1.0, out.1.1, ... which you then can monitor using whatever way you like (I use tail -f). Next to if(rank.eq.0) this is the cleanest solution I think.
I am not so concerned with the two disadvantages mentioned by steabert. We can work that out by introducing another file descriptor that clearly indicates that it is stdout only on master process, e.g. stdout -> stdout0.
But my concern is this: /dev/null will work in a UNIX-like environment. Will it work in a Windows environment? How about the funky BlueGene systems?
As the title says, really.
Is there anything against not using stop, like this:
PROGRAM myprog
.
. < do stuff >
.
END PROGRAM myprog
rather than using an explicit stop, as in this:
PROGRAM myprog
.
. < do stuff >
.
STOP
END PROGRAM myprog
I see a lot of older Fortran code that has a STOP before the END PROGRAM statement, but is it really needed there?
On our Cray machine, having a STOP statement at the end of the program writes the string "STOP" to STDERR, which is a bit annoying...
The code
stop
end program
is redundant as far as the program return value is concerned in modern Fortran. A stop with no integer or character stop-code should return a 0 exit code to the OS if exit codes are supported. If end program is encountered the behavior is the same, returning 0 to the OS.
The difference arises in program output. As you've noted, stop produces output. The standard (Fortran 2008 cl. 8.4) says
When an image is terminated by a STOP or ERROR STOP statement, its stop code, if any, is made available
in a processor-dependent manner. If any exception (14) is signaling on that image, the processor shall issue a
warning indicating which exceptions are signaling; this warning shall be on the unit identified by the named
constant ERROR_UNIT (13.8.2.8). It is recommended that the stop code is made available by formatted output
to the same unit.
This recommends the stop-code be made available on standard error, which is where your STOP output is coming from. If you had given a stop-code to stop, it would have been output with STOP. Additionally, if there are floating point exceptions signalling, you will get additional output on standard error detailing that condition.
If you don't desire the additional output from stop and are not using it to return a non-zero error code to the OS, you can omit it from your program.
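As an illustration of that last point, here is a small sketch in which stop appears only on the failure path and normal termination simply falls through to end program. The program name quiet_exit and the argument-count check are invented for the example.

```fortran
program quiet_exit
  implicit none
  integer :: nargs

  nargs = command_argument_count()
  if (nargs > 1) then
    ! Failure path: the stop code is the only reason to use STOP here.
    stop 2
  end if

  print '(a)', 'done'
  ! No STOP needed: reaching END PROGRAM terminates the program normally.
end program quiet_exit
```

Whether the stop code 2 also shows up on standard error or as the exit status is processor-dependent, as the quoted clause says.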
There is probably a historical reason for the stop,end ending of the main program, but my brief skimming of a FORTRAN66 manual did not enlighten me.
I have a do while loop in my program, whose condition to continue keeps giving me off-by-one errors and I can't figure out why. It looks like this:
do while (ii .le. nri .and. ed(ii) .le. e1)
! do some stuff...
ii = ii + 1
end do
where ii and nri are scalar integers, e1 is a scalar real, and ed is a real array of length nri. What I expect to happen after the last run is that since ii.le.nri returns .false. the second condition is never tested, and I don't get any off-by-one problems. I've verified with the debugger that ii.le.nri really does return .false. - and yet the program crashes.
To verify my assumption that only one condition is tested, I even wrote a small test program, which I compiled with the same compiler options:
program iftest
implicit none
if (returns_false() .and. returns_true()) then
print *, "in if block"
end if
contains
function returns_true()
implicit none
logical returns_true
print *, "in returns true"
returns_true = .true.
end function
function returns_false()
implicit none
logical returns_false
print *, "in returns false"
returns_false = .false.
end function
end program
Running this program outputs, as I expected, only
$ ./iftest
in returns false
and exits. The second test is never run.
Why doesn't this apply to my do while clause?
In contrast to some languages Fortran does not guarantee any particular order of evaluation of compound logical expressions. In the case of your code, at the last go round the while loop the value of ii is set to nri+1. It is legitimate for your compiler to have generated code which tests ed(nri+1)<=e1 and thereby refer to an element outside the bounds of ed. This may well be the cause of your program's crash.
Your expectations are contrary to the Fortran standard's prescriptions for the language.
If you haven't already done so, try recompiling your code with array-bounds checking switched on and see what happens.
As to why your test didn't smoke out this issue, well I suspect that all your test really shows is that your compiler generates a different order of execution for different species of condition and that you are not really comparing like-for-like.
Extending the answer by High Performance Mark, here is one way to rewrite the loop:
ii_loop: do
if (ii .gt. nri) exit ii_loop
if (ed(ii) .gt. e1) exit ii_loop
! do some stuff
ii = ii + 1
end do ii_loop
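If keeping the do while form matters, another option is to move the two tests into a small function, so that ed(ii) is only referenced after the index has been checked. This is a sketch with made-up sample data; the names safe_loop and in_range are not from the original code.

```fortran
program safe_loop
  implicit none
  integer, parameter :: nri = 5
  real :: ed(nri) = [1.0, 2.0, 3.0, 4.0, 5.0]  ! sample data for illustration
  real :: e1 = 10.0
  integer :: ii

  ii = 1
  do while (in_range(ii))
    ! do some stuff
    ii = ii + 1
  end do
  print '(a, i0)', 'loop ended with ii = ', ii

contains

  ! True only if i is a valid index AND ed(i) <= e1.
  logical function in_range(i)
    integer, intent(in) :: i
    in_range = .false.
    ! The assignment runs only when the index test passed,
    ! so ed(i) is never referenced out of bounds.
    if (i <= nri) in_range = ed(i) <= e1
  end function in_range

end program safe_loop
```

With this sample data every element is below e1, so the loop runs off the end of the array index range and stops cleanly at ii = 6 instead of reading ed(6).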
In C/C++ language loop statements we use exit(0), or exit(1) or other values. What is needed of that value, what is the role of that value in a loop when we exit the loop, and what is the meaning of 1 and 0 in exit()?
exit() will terminate the process, not the loop. For the argument, it's the exit status (0, EXIT_SUCCESS, EXIT_FAILURE) : http://www.opengroup.org/onlinepubs/000095399/functions/exit.html
A suggestion : you should search and read the documentation of functions or language feature before asking.
You don't use exit() to exit a loop. exit will exit your whole program.
The number you supply to the exit function will be the exit code of your program.
Typically an exit value of 0 indicates that the program finished successfully.
A non-zero value is usually an error identifier for the program, used to indicate that the program failed; the number usually indicates why it failed.
Ummm...no, you have this badly wrong.
In Cish, exit terminates the entire program. If you just want to stop looping, you use break. If you want to stop the current loop iteration and proceed to the next one, you use continue.
The value supplied to exit is returned as the exit value of your entire program (like you had returned that value from main). What the OS does with that value is up to the OS. What it means is also more or less up to you, although some OS'es define anything other than 0 as an error for various utilities.
Common statements for leaving a loop in C and C++ are: break, continue, goto, return, throw (C++ only) and exit. There are other functions that can exit a loop, but you can research them yourself.
The break statement exits the nearest loop. Execution resumes after the end of the loop.
The continue statement causes execution to start at the top of the loop. Statements after continue will not be executed. This may or may not exit a loop, depending on your pedantics.
The goto statement can be used to exit a loop. Place a label outside of the loop and use goto.
The return statement will exit the function. If the function is main, it will terminate the program.
The throw statement will exit the loop and the current function. Execution resumes at the nearest catch statement or terminates the program if no relevant catch statements are found.
The exit function will exit a loop and terminate a program. The values passed to exit will be passed to the Operating System after the program terminates. Some OSes allow the return value of a program to be used in a script. The values of 0 and 1 for exit are OS dependent; some use 0 to indicate successful termination. Other values may give reasons for the termination.
Other methods to exit a loop, some of which depend on how the implementation handles undefined behavior:
Satisfy the loop's terminating condition.
Divide by zero.
Dereference a null pointer.
Square root of a negative number.
Calculate the length of a C-string without a terminating NUL character.
If you clarify your question, you will get better answers.
At the risk of sounding redundant, it's the exit code for the process. Typically, you can define a range of values indicating different degrees of success or failure. Traditionally, 0 is success and 1 (or just non-zero) is failure. Then the program or function that invoked yours can examine the value (if it's so inclined) and glean from it some idea as to whether or not your process was successful.
For example, you could have a program that copies a file from location A to B. If you are unable to copy the file because location B is write-protected, you could return -1, and any other program that utilizes yours as a step in its process now knows that you failed, and that expecting the file at location B to be complete and accessible is a bad idea.
You don't care to know if you are in a loop or not if you use the exit() function. It will quit the program and return the given integer as the return value.
There is no loop return value or such thing in C/C++.
The value only matters if an external program is going to use it to determine if the program ran successfully or not. Usually a program that runs successfully returns 0 or greater, and if it fails it returns -1.
For hysterical ... sorry, historical ... reasons.
Going back to Unix days, and still in Linux, Windows, etc., your program does not just run alone - it gets called from somewhere (another program, or a shell such as bash or the DOS prompt), and the caller might want to know if it succeeded or not...
You may be confusing return(x) and exit(x). return sends info back to the calling function, exit shuts down the whole process, giving an exit status to the environment that started the process (if any).
The values returned from functions in C are often error values, but their meaning depends on the semantics of the process; the meaning of exit depends on the semantics of the environment.
Think of it like a ship; return is like a message given to an immediate supervisor, and might indicate why the decks couldn't be cleaned today, and why the project was abandoned; exit is a message corked up in a bottle to describe why and how the ship sank.