I am trying to write a series of values for time (real values) into a dat file in FORTRAN. This is a part of an MPI code and the code runs for a long time. So I would like to extract data at every time step and print it into a file and read the file any time during the execution of the program. Currently, the problem I am facing is, the values of time are not written into the file until the program ends. I have put the open statement before the do loop and the close statement after the end of do loop.
The parts of my code look like:
open(unit=57,file='inst.dat')
do loop starts
.
.
.
write(57,*) time
.
.
.
end do
close(57)
try call flush(unit). Check your compiler docs as this is i think an extension.
You mention MPI: For parallel codes I think you need to give each thread its own file/unit,
or take other measures to avoid conflicts.
From Gfortran manual:
Beginning with the Fortran 2003 standard, there is a FLUSH statement that should be preferred over the FLUSH intrinsic.
The FLUSH intrinsic and the Fortran 2003 FLUSH statement have identical effect: they flush the runtime library's I/O buffer so that the data becomes visible to other processes. This does not guarantee that the data is committed to disk.
On POSIX systems, you can request that all data is transferred to the storage device by calling the fsync function, with the POSIX file descriptor of the I/O unit as argument (retrieved with GNU intrinsic FNUM). The following example shows how:
! Declare the interface for POSIX fsync function
interface
function fsync (fd) bind(c,name="fsync")
use iso_c_binding, only: c_int
integer(c_int), value :: fd
integer(c_int) :: fsync
end function fsync
end interface
! Variable declaration
integer :: ret
! Opening unit 10
open (10,file="foo")
! ...
! Perform I/O on unit 10
! ...
! Flush and sync
flush(10)
ret = fsync(fnum(10))
! Handle possible error
if (ret /= 0) stop "Error calling FSYNC"
How about closing the file after every time step (assuming a reasonable amount of time elapses between time steps)?
do loop starts
.
.
!Note: an if statement should wrap the following so that it is
!only called by one processor.
open(unit=57,file='inst.dat')
write(57,*) time
close(57)
.
.
end do
Alternatively if the time between time steps is short, writing the data after blocks of 10, 100, ... iterations may be more efficient.
Related
When I run my fortran code in parallel on a linux cluster with mpirun I get a sigbus error.
It occurs while reading a file, the timing is irregular, and sometimes it proceeds without error.
I have tried debug compilation options like -g, but I haven't gotten any information on what line the error is coming from.
Actually the code was executed previously in three different clusters without this error, but the error is only occurring on this machine.
I personally suspect this is related to the performance of the machine (especially storage i/o), but I am not sure.
The program code is simple. Each process executed by mpirun reads the file corresponding to its rank as follows.
!!!!!!!!!! start of code
OPEN(11, FILE='FILE_NAME_WITH_RANK', FORM='UNFORMATTED')
READ(11,*) ISIZE
ALLOCATE(SOME_VARIABLE(ISIZE))
DO I = 1, ISIZE
READ(11,*) SOME_VARIABLE(I)
ENDDO
READ(11,*) ISIZE2
ALLOCATE(SOME_VARIABLE2(ISIZE2))
DO I = 1, ISIZE2
READ(11,*) SOME_VARIABLE2(I)
ENDDO
! MORE VARIABLES
CLOSE(11)
!!!!!!!!!! end of code
I used 191 cpu, and the total size of 191 files it loads is about 11 GB.
The cluster used for execution consists of 24 nodes with 16 cpu each (384 cpu total) and uses common storage that is shared with another cluster.
I ran the code in parallel by specifying nodes 1 through 12 as the hostfile.
Initially, I had 191 cpu read all files at the same time out of sequence.
After doing so, the program ended with a sigbus error. Also, for some nodes, the ssh connection was delayed, and the bashrc file cannot be found by node with stale file handle error.
The stale file handle error waited a bit and it seemed to recover by itself, but I'm not sure what the system administrator did.
So, I changed it to the following code so that only one cpu can read the file at a time.
!!!!!!!!!! start of code
DO ICPU = 0, NUMBER_OF_PROCESS-1
IF(ICPU.EQ.MY_PROCESS) CALL READ_FILE
CALL MPI_BARRIER(MPI_COMMUNICATOR,IERR)
ENDDO
!!!!!!!!!! end of code
This seemed to work fine for single execution, but if I ran more than one of these programs at the same time, the first mpirun stopped and both ended with a sigbus error eventually.
My next attempt is to minimize the execution of the read statement by deleting the do statement when reading the array. However, due to limited time, I couldn't test the effectiveness of this modification.
Here are some additional information.
If I execute a search or copy a file with an explorer such as nautilus while running a parallel program, nautilus does not respond or the running program raise sigbus. In severe cases, I wasn't able to connect the VNC server with stale file handle errors.
I use OpenMPI 2.1.1, GNU Fortran 4.9.4.
I compile the program with following
$OPENMPIHOME/bin/mpif90 -mcmodel=large -fmax-stack-var-size-64 -cpp -O3 $SOURCE -o $EXE
I execute the program with following in gnome terminal
$OPENMPIHOME/bin/mpirun -np $NP -x $LD_LIBRARY_PATH --hostfile $HOSTFILE $EXE
The cluster is said to be running commercial software like FLUENT without problems.
Summing up the above, my personal suspicion is that the storage of the cluster is dismounted due to the excessive disk I/O generated by my code, but I don't know if this makes sense because I have no cluster knowledge.
If yes, I wonder if there is a way to minimize the disk I/O, if it is enough to proceed with the vectorized I/O mentioned above, or if there is an additional part.
I would appreciate it if you could tell me anything about the problem.
Thanks in advance.
!!!
I wrote an example code. As mentioned above, it may not be easy to reproduce because the occurrence varies depending on the machine.
PROGRAM BUSWRITE
IMPLICIT NONE
INTEGER, PARAMETER :: ISIZE1 = 10000, ISIZE2 = 20000, ISIZE3 = 30000
DOUBLE PRECISION, ALLOCATABLE :: ARRAY1(:), ARRAY2(:), ARRAY3(:)
INTEGER :: I
INTEGER :: I1, I2, I3
CHARACTER*3 CPUNUM
INCLUDE 'mpif.h'
INTEGER ISTATUS(MPI_STATUS_SIZE)
INTEGER :: IERR, NPES, MYPE
CALL MPI_INIT(IERR)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD,NPES,IERR)
CALL MPI_COMM_RANK(MPI_COMM_WORLD,MYPE,IERR)
I1=MOD(MYPE/100,10)+48
I2=MOD(MYPE/10 ,10)+48
I3=MOD(MYPE ,10)+48
CPUNUM=CHAR(I1)//CHAR(I2)//CHAR(I3)
OPEN(11, FILE=CPUNUM//'.DAT', FORM='UNFORMATTED')
ALLOCATE(ARRAY1(ISIZE1))
ALLOCATE(ARRAY2(ISIZE2))
ALLOCATE(ARRAY3(ISIZE3))
DO I = 1, ISIZE1
ARRAY1(I) = I
WRITE(11) ARRAY1(I)
ENDDO
DO I = 1, ISIZE2
ARRAY2(I) = I**2
WRITE(11) ARRAY2(I)
ENDDO
DO I = 1, ISIZE3
ARRAY3(I) = I**3
WRITE(11) ARRAY3(I)
ENDDO
CLOSE(11)
CALL MPI_FINALIZE(IERR)
END PROGRAM
mpif90 -ffree-line-length-0 ./buswrite.f90 -o ./buswrite
mpirun -np 32 ./buswrite
I've got 32 000.DAT ~ 031.DAT
PROGRAM BUSREAD
IMPLICIT NONE
INTEGER, PARAMETER :: ISIZE1 = 10000, ISIZE2 = 20000, ISIZE3 = 30000
DOUBLE PRECISION, ALLOCATABLE :: ARRAY1(:), ARRAY2(:), ARRAY3(:)
INTEGER :: I
INTEGER :: I1, I2, I3
CHARACTER*3 CPUNUM
INCLUDE 'mpif.h'
INTEGER ISTATUS(MPI_STATUS_SIZE)
INTEGER :: IERR, NPES, MYPE
CALL MPI_INIT(IERR)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD,NPES,IERR)
CALL MPI_COMM_RANK(MPI_COMM_WORLD,MYPE,IERR)
I1=MOD(MYPE/100,10)+48
I2=MOD(MYPE/10 ,10)+48
I3=MOD(MYPE ,10)+48
CPUNUM=CHAR(I1)//CHAR(I2)//CHAR(I3)
OPEN(11, FILE=CPUNUM//'.DAT', FORM='UNFORMATTED')
ALLOCATE(ARRAY1(ISIZE1))
ALLOCATE(ARRAY2(ISIZE2))
ALLOCATE(ARRAY3(ISIZE3))
DO I = 1, ISIZE1
READ(11) ARRAY1(I)
IF(ARRAY1(I).NE.I) STOP
ENDDO
DO I = 1, ISIZE2
READ(11) ARRAY2(I)
IF(ARRAY2(I).NE.I**2) STOP
ENDDO
DO I = 1, ISIZE3
READ(11) ARRAY3(I)
IF(ARRAY3(I).NE.I**3) STOP
ENDDO
CLOSE(11)
CALL MPI_BARRIER(MPI_COMM_WORLD,IERR)
IF(MYPE.EQ.0) WRITE(*,*) 'GOOD'
CALL MPI_FINALIZE(IERR)
END PROGRAM
mpif90 -ffree-line-length-0 ./busread.f90 -o ./busread
mpirun -np 32 ./busread
I've got 'GOOD' output text from terminal as expected, but the machine in question is terminated with a sigbus error while running busread.
The issue was not observed after a device reboot. Even though I ran 4 programs at the same time under the same conditions, no problem occurred. In addition, other teams that used the device also had similar problems, which were resolved after reboot. The conclusion is a bit ridiculous, but if there are any people experiencing similar problems, I would like to summarize it as follows.
If your program terminates abnormally due to a memory error (like sigbus and sigsegv) while reading or writing a file, you can check the following.
Make sure there are no errors in your program. Check whether the time of occurrence of the error is constant or irregular, whether other programs have the same symptoms, whether it runs well on other machines, and whether there is a problem when run with a memory error checking tool such as valgrind.
Optimize the file I/O part. In the case of fortran, processing an entire array is tens of times faster than processing by element.
Immediately after an error occurs, try ssh connection to the machine (or node) to check whether the connection is smooth and that the file system is well accessed. If you cannot access the bashrc file or an error such as stale file handle occurs, please contact the system manager after combining the above reviewed information.
If someone has anything to add or if this post isn't appropriate, please let me know.
I'm running a distributed model stripped to its bare minimum below:
integer, parameter :: &
nx = 1200,& ! Number of columns in grid
ny = 1200,& ! Number of rows in grid
nt = 6000 ! Number of timesteps
integer :: it ! Loop counter
real :: var1(nx,ny), var2(nx,ny), var3(nx,ny), etc(nx,ny)
! Create netcdf to write model output
call check( nf90_create(path="out.nc",cmode=nf90_clobber, ncid=nc_out_id) )
! Loop over time
do it = 1,nt
! Calculate a lot of variables
...
! Write some variables in out.nc at each timestep
CALL check( nf90_put_var(ncid=nc_out_id, varid=var1_varid, values=var1, &
start = (/ 1, 1, it /), count = (/ nx, ny, 1 /)) )
! Close the netcdf otherwise it is not readable:
if (it == nt) call check( nf90_close(nc_out_id) )
enddo
I'm in the development stage of the model so, it inevitably crashes at unexpected points (usually at the Calculate a lot of variables stage), which means that, if the model crashes at timestep it =3000, 2999 timesteps will be written to the netcdf output file, but I will not be able to read the file because the file has not been closed. Still, the data have been written: I currently have a 2GB out.nc file that I can't read. When I ncdump the file it shows
netcdf out.nc {
dimensions:
x = 1400 ;
y = 1200 ;
time = UNLIMITED ; // (0 currently)
variables:
float var1 (time, y, x) ;
data:
}
My questions are: (1) Is there a way to close the file retrospectively, even outside Fortran, to be able to read the data that have already been written? (2) Alternatively, is there another way to write the file in Fortran that would make the file readable even without closing it?
When nf90_close is called, buffered output is written to disk and the file ID is relinquished so it can be reused. The problem is most likely due to buffered output not having been written to the disk when the program terminates due to a crash, meaning that only the changes you made in "define mode" are present in the file (as shown by ncdump).
You therefore need to force the data to be written to the disk more often. There are three ways of doing this (as far as I am aware).
nf90_sync - which synchronises the buffered data to disk when called. This gives you the most control over when to output data (every loop step, or every n loop steps, for example), which can allow you to optimize for speed vs robustness, but introduces more programming and checking overhead for you.
Thanks to #RussF for this idea. Creating or opening the file using the nf90_share flag. This is the recommended approach if the netCDF file is intended to be used by multiple readers/writers simultaneously. It is essentially the same as an automatic implementation of nf90_sync for writing data. It gives less control, but also less programming overhead. Note that:
This only applies to netCDF-3 classic or 64-bit offset files.
Finally, an option I wouldn't recommend, but am including for completeness (and I guess there may be situations where this is the best option, although none spring to mind) - closing and reopening the file. I don't recommend this, because it will slow down your program, and adds greater possibility of causing errors.
I wanted to run a code (or an external executable) for a specified amount of time. For example, in Fortran I can
call system('./run')
Is there a way I can restrict its run to let's say 10 seconds, for example as follows
call system('./run', 10)
I want to do it from inside the Fortran code, example above is for system command, but I want to do it also for some other subroutines of my code. for example,
call performComputation(10)
where performComputation will be able to run only for 10 seconds. The system it will run on is Linux.
thanks!
EDITED
Ah, I see - you want to call a part of the current program a limited time. I see a number of options for that...
Option 1
Modify the subroutines you want to run for a limited time so they take an additional parameter, which is the number of seconds they may run. Then modify the subroutine to get the system time at the start, and then in their processing loop get the time again and break out of the loop and return to the caller if the time difference exceeds the maximum allowed number of seconds.
On the downside, this requires you to change every subroutine. It will exit the subroutine cleanly though.
Option 2
Take advantage of a threading library - e.g. pthreads. When you want to call a subroutine with a timeout, create a new thread that runs alongside your main program in parallel and execute the subroutine inside that thread of execution. Then in your main program, sleep for 10 seconds and then kill the thread that is running your subroutine.
This is quite easy and doesn't require changes to all your subroutines. It is not that elegant in that it chops the legs off your subroutine at some random point, maybe when it is least expecting it.
Imagine time running down the page in the following example, and the main program actions are on the left and the subroutine actions are on the right.
MAIN SUBROUTINE YOUR_SUB
... something ..
... something ...
f_pthread_create(,,,YOUR_SUB,) start processing
sleep(10) ... calculate ...
... calculate ...
... calculate ...
f_pthread_kill()
... something ..
... something ...
Option 3
Abstract out the subroutines you want to call and place them into their own separate executables, then proceed as per my original answer below.
Whichever option you choose, you are going to have to think about how you get the results from the subroutine you are calling - will it store them in a file? Does the main program need to access them? Are they in global variables? The reason is that if you are going to follow options 2 or 3, there will not be a return value from the subroutine.
Original Answer
If you don't have timeout, you can do
call system('./run & sleep 10; kill $!')
Yes there is a way. take a look at the linux command timeout
# run command for 10 seconds and then send it SIGTERM kill message
# if not finished.
call system('timeout 10 ./run')
Example
# finishes in 10 seconds with a return code of 0 to indicate success.
sleep 10
# finishes in 1 second with a return code of `124` to indicate timed out.
timeout 1 sleep 10
You can also choose the type of kill signal you want to send by specifying the -s parameter. See man timeout for more info.
I am writing a parallel VTK file (pvti) from my fortran CFD solver. The file is really just a list of all the individual files for each piece of the data. Running MPI, if I have each process write the name of its individual file to standard output
print *, name
then I get a nice list of each file, ie
block0.vti
block1.vti
block2.vti
This is exactly the sort of list I want. But if I write to a file
write(9,*) name
then I only get one output in the file. Is there a simple way to replicate the standard output version of this without transferring data?
You could try adapting the following which uses MPI-IO, which is really the only way to ensure ordered files from multiple processes. It does assume an end of line character and that all the lines are the same length (padded with blanks if required) but I think that's about it.
Program ascii_mpiio
! simple example to show MPI-IO "emulating" Fortran
! formatted direct access files. Note can not use the latter
! in parallel with multiple processes writing to one file
! is behaviour is not defined (and DOES go wrong on certain
! machines)
Use mpi
Implicit None
! All the "lines" in the file will be this length
Integer, Parameter :: max_line_length = 30
! We also need to explicitly write a carriage return.
! here I am assuming ASCII
Character, Parameter :: lf = Achar( 10 )
! Buffer to hold a line
Character( Len = max_line_length + 1 ) :: line
Integer :: me, nproc
Integer :: fh
Integer :: record
Integer :: error
Integer :: i
! Initialise MPI
Call mpi_init( error )
Call mpi_comm_rank( mpi_comm_world, me , error )
Call mpi_comm_size( mpi_comm_world, nproc, error )
! Create a MPI derived type that will contain a line of the final
! output just before we write it using MPI-IO. Note this also
! includes the carriage return at the end of the line.
Call mpi_type_contiguous( max_line_length + 1, mpi_character, record, error )
Call mpi_type_commit( record, error )
! Open the file. prob want to change the path and name
Call mpi_file_open( mpi_comm_world, '/home/ian/test/mpiio/stuff.dat', &
mpi_mode_wronly + mpi_mode_create, &
mpi_info_null, fh, error )
! Set the view for the file. Note the etype and ftype are both RECORD,
! the derived type used to represent a whole line, and the displacement
! is zero. Thus
! a) Each process can "see" all of the file
! b) The unit of displacement in subsequent calls is a line.
! Thus if we have a displacement of zero we write to the first line,
! 1 means we write to the second line, and in general i means
! we write to the (i+1)th line
Call mpi_file_set_view( fh, 0_mpi_offset_kind, record, record, &
'native', mpi_info_null, error )
! Make each process write to a different part of the file
Do i = me, 50, nproc
! Use an internal write to transfer the data into the
! character buffer
Write( line, '( "This is line ", i0, " from ", i0 )' ) i, me
!Remember the line feed at the end of the line
line( Len( line ):Len( line ) ) = lf
! Write with a displacement of i, and thus to line i+1
! in the file
Call mpi_file_write_at( fh, Int( i, mpi_offset_kind ), &
line, 1, record, mpi_status_ignore, error )
End Do
! Close the file
Call mpi_file_close( fh, error )
! Tidy up
Call mpi_type_free( record, error )
Call mpi_finalize( error )
End Program ascii_mpii
Also please note you're just getting lucky with your standard output "solution", you're not guaranteed to get it all nice sorted.
Apart from having the writes from different ranks well mixed, your problem is that the Fortran OPEN statement probably truncates the file to zero length, thus obliterating the previous content instead of appending to it. I'm with Vladimir F on this and would write this file only in rank 0. There are several possible cases, some of which are listed here:
each rank writes a separate VTK file and the order follows the ranks or the actual order is not significant. In that case you could simply use a DO loop in rank 0 from 0 to #ranks-1 to generate the whole list.
each rank writes a separate VTK file, but the order does not follow the ranks, e.g. rank 0 writes block3.vti, rank 1 writes block12.vti, etc. In that case you can use MPI_GATHER to collect the block number from each process into an array at rank 0 and then loop over the elements of the array.
some ranks write a VTK file, some don't, and the block order does not follow the ranks. It's similar to the previous case - just have the ranks that do not write a block send a negative block number and then rank 0 would skip the negative array elements.
block numbering follows ranks order but not all ranks write a block. In that case you can use MPI_GATHER to collect one LOGICAL value from each rank that indicates if it has written a block or not.
If you are not in a hurry, you can force the output from different tasks to be in order:
! Loop over processes in order
DO n = 0,numProcesses-1
! Write to file if it is my turn
IF(nproc == n)THEN
! Write output here
ENDIF
! This call ensures that all processes wait for each other
#ifdef MPI
CALL MPI_Barrier(mpi_comm_world,ierr)
#endif
ENDDO
This solution is simple, but not efficient for very large output. This does not seem to be your case. Make sure you flush the output buffer after each write. If using this method, make sure to do tests before implementing, as success is not guaranteed on all architectures. This method works for me for outputting large NetCDF files without the need to pass the data around.
I get the following error with MPICH-2.1.5 and PGI compiler;
Fatal error in PMPI_Waitall: Invalid MPI_Request, error stack:
PMPI_Waitall(311): MPI_Waitall(count=4, req_array=0x2ca0ae0, status_array=0x2c8d220) failed
PMPI_Waitall(288): The supplied request in array element 0 was invalid (kind=0)
in the following example Fortran code for a stencil based algorithm,
Subroutine data_exchange
! data declaration
integer request(2*neighbor),status(MPI_STATUS_SIZE,2*neighbor)
integer n(neighbor),iflag(neighbor)
integer itag(neighbor),neigh(neighbor)
! Data initialization
request = 0; n = 0; iflag = 0;
! Create data buffers to send and recv
! Define values of n,iflag,itag,neigh based on boundary values
! Isend/Irecv look like this
ir=0
do i=1,neighbor
if(iflag(i).eq.1) then
ir=ir+1
call MPI_Isend(buf_send(i),n(i),MPI_REAL,neigh(i),itag(i),MPI_COMM_WORLD,request(ir),ierr)
ir=ir+1
call MPI_Irecv(buf_recv(i),nsize,MPI_REAL,neigh(i),MPI_ANY_TAG,MPI_COMM_WORLD,request(ir),ierr)
endif
enddo
! Calculations
call MPI_Waitall(2*neighbor,request,status,ierr)
end subroutine
The error occurs when the array_of_request in mpi_waitall gets a null value (request(i)=0). The null value in array_of_request comes up when the conditional iflag(i)=1 is not satisfied. The straight forward solution is to comment out the conditional but then that would introduce overheads of sending and receiving messages of 0 sizes which is not feasible for large scale systems (1000s of cores).
As per the MPI-forum link, the array_of_requests list may contain null or inactive handles.
I have tried following,
not initializing array_of_requests,
resizing array_of_request to match the MPI_isend + MPI_irecv count,
assigning dummy values to array_of_request
I also tested the very same code with MPICH-1 as wells as OpenMPI 1.4 and the code works without any issue.
Any insights would be really appreciated!
You could just move the first increment of ir into the conditional as well. Then you would have all handles in request(1:ir) at the and of the loop and issue:
call MPI_Waitall(ir,request(1:ir),status(:,1:ir),ierr)
This would make sure all requests are initialized properly.
Another thing: does n(i) in MPI_Isend hold the same value as nsize in the corresponding MPI_Irecv?
EDIT:
After consulting the MPI Standard (3.0, Ch. 3.7.3) I think you need to initialize the request array to MPI_REQUEST_NULL if you want give the whole request array to MPI_Waitall.