length of .dat file is shorter than expected [duplicate]

length of .dat file is shorter than expected [duplicate] - fortran

This question already has an answer here:
How to flush stdout in Fortran 90?
(1 answer)
Closed 10 months ago.
When I write the following .dat file, the file is only 6397 lines long even though the nested do loops correctly iterate 6400 times. Also, when I output the data NX(BR,BC) to Terminal, everything is fine. Only the .dat file is missing a few lines. Any idea what the issue might be?
Note NX is defined as REAL(8),DIMENSION(0:2000,0:2000) :: NX
Here's the rest of the relevant part of the code:
OPEN(UNIT=18,FILE='test_file.dat',STATUS='UNKNOWN',ACTION='WRITE')
I = 0 ! for debugging
DO BC = 0,39
DO BR = 0,159
! do some calculations here to calculate NX(BR,BC)
! for debugging
I = I + 1
WRITE(18,*) NX(BR,BC)
ENDDO
ENDDO
WRITE(*,*) I ! outputs 6400

I tested your code with my raspberry pi 4 (GNU Fortran (GCC) 11.2.0) and got exactly 6400 lines of zeros.
Maybe the write buffer is not flushed correctly. Have you tried closing and re-opening the file: How to carriage return and flush in Fortran?
close(18)

Related

Fortran ftell returning to wrong position

Here is a snippet of simple code that reads line from file, then returns to previous position and re-reads same line:
program main
implicit none
integer :: unit, pos, stat
character(128) :: buffer
! Open file as formatted stream
open( NEWUNIT=unit, FILE="data.txt", ACCESS="stream", FORM="formatted", STATUS="old", ACTION="read", IOSTAT=stat )
if ( stat /= 0 ) error stop
! Skip 2 lines
read (unit,*) buffer
read (unit,*) buffer
! Store position
pos = ftell(unit)
! Read & write next line
read (unit,*) buffer
write (*,*) "buffer=", trim(buffer)
! Return to previous position
call fseek(unit,pos,0)
! pos = ftell(unit) ! <-- ?!
! Read & write next line (should be same output)
read (unit,*) buffer
write (*,*) "buffer=", trim(buffer)
! Close file stream
close (UNIT=unit)
end program main
The "data.txt" is just a dummy file with 4 lines:
1
2
3
4
Now when I compile the snippet (gfortran 9.3.0) and run it, I get an answer:
buffer=3
buffer=4
which is wrong, as they should be same. More interestingly when I add an additional ftell (commented line in the snippet) after 'fseek' I get correct answer:
buffer=3
buffer=3
Any idea why it does that? or am I using ftell and fseek incorrectly?

gfortran's documentation for FTELL and FSEEK clearly states that these routines are provided for backwards compatibility with g77. As your code is using NEWUNIT, ERROR STOP, and STREAM access, you are not compiling old moldy code. You ought to use standard conforming methods as pointed out by #Vladimir.
A quick debugging session shows that FTELL and FSEEK are using a 0-based reference for the file position while the inquire method of modern Fortran is 1 based. There could be an off-by-one type bug in gfortran, but as FTELL and FSEEK are for backwards compatibility with g77 (an unmaintained 15+ year old compiler), someone would need to do some code spelunking to determine the intended behavior. I suspect none of the current, active, gfortran developers care enough to explore the problem. So, to fix your problem
program main
implicit none
integer pos, stat, unit
character(128) buffer
! Open file as formatted stream
open(NEWUNIT=unit, FILE="data.txt", ACCESS="stream", FORM="formatted", &
& STATUS="old", ACTION="read", IOSTAT=stat)
if (stat /= 0) stop
! Skip 2 lines
read (unit,*) buffer
read (unit,*) buffer
! Store position
inquire(unit, pos=pos)
! Read & write next line
read (unit,*) buffer
write (*,*) "buffer=", trim(buffer)
! Reread & write line (should be same output)
read (unit,*,pos=pos) buffer
write (*,*) "buffer=", trim(buffer)
! Close file stream
close (UNIT=unit)
end program main

Retrospectively closing a NetCDF file created with Fortran

I'm running a distributed model stripped to its bare minimum below:
integer, parameter :: &
nx = 1200,& ! Number of columns in grid
ny = 1200,& ! Number of rows in grid
nt = 6000 ! Number of timesteps
integer :: it ! Loop counter
real :: var1(nx,ny), var2(nx,ny), var3(nx,ny), etc(nx,ny)
! Create netcdf to write model output
call check( nf90_create(path="out.nc",cmode=nf90_clobber, ncid=nc_out_id) )
! Loop over time
do it = 1,nt
! Calculate a lot of variables
...
! Write some variables in out.nc at each timestep
CALL check( nf90_put_var(ncid=nc_out_id, varid=var1_varid, values=var1, &
start = (/ 1, 1, it /), count = (/ nx, ny, 1 /)) )
! Close the netcdf otherwise it is not readable:
if (it == nt) call check( nf90_close(nc_out_id) )
enddo
I'm in the development stage of the model so, it inevitably crashes at unexpected points (usually at the Calculate a lot of variables stage), which means that, if the model crashes at timestep it =3000, 2999 timesteps will be written to the netcdf output file, but I will not be able to read the file because the file has not been closed. Still, the data have been written: I currently have a 2GB out.nc file that I can't read. When I ncdump the file it shows
netcdf out.nc {
dimensions:
x = 1400 ;
y = 1200 ;
time = UNLIMITED ; // (0 currently)
variables:
float var1 (time, y, x) ;
data:
}
My questions are: (1) Is there a way to close the file retrospectively, even outside Fortran, to be able to read the data that have already been written? (2) Alternatively, is there another way to write the file in Fortran that would make the file readable even without closing it?

When nf90_close is called, buffered output is written to disk and the file ID is relinquished so it can be reused. The problem is most likely due to buffered output not having been written to the disk when the program terminates due to a crash, meaning that only the changes you made in "define mode" are present in the file (as shown by ncdump).
You therefore need to force the data to be written to the disk more often. There are three ways of doing this (as far as I am aware).
nf90_sync - which synchronises the buffered data to disk when called. This gives you the most control over when to output data (every loop step, or every n loop steps, for example), which can allow you to optimize for speed vs robustness, but introduces more programming and checking overhead for you.
Thanks to #RussF for this idea. Creating or opening the file using the nf90_share flag. This is the recommended approach if the netCDF file is intended to be used by multiple readers/writers simultaneously. It is essentially the same as an automatic implementation of nf90_sync for writing data. It gives less control, but also less programming overhead. Note that:
This only applies to netCDF-3 classic or 64-bit offset files.
Finally, an option I wouldn't recommend, but am including for completeness (and I guess there may be situations where this is the best option, although none spring to mind) - closing and reopening the file. I don't recommend this, because it will slow down your program, and adds greater possibility of causing errors.

Why can't my program print my statements?

There is 3 columns in the file i'm reading and I want to average each column and take the std. The code compiles now, but nothing is being printed.
Here is my code:
program cardata
implicit none
real, dimension(291) :: x
intEGER I,N
double precision date, odometer, fuel
real :: std=0
real :: xbar=0
open(unit=10, file="car.dat", FOrm="FORMATTED", STATUS="OLD", ACTION="READ")
read(10,*) N
do I=1,N
read(10,*) x(I)
xbar= xbar +x(I)
enddo
xbar = xbar/N
DO I =1,N
std =std +((x(I) -xbar))**2
enddo
std = SQRT((std / (N - 1)))
print*,'mean:',xbar
print*, 'std deviation:',std
close(unit=10)
end program cardata
I am fairly new to this, any input will be greatly appreciated.
Example of car.dat:
date odometer fuel
19930114 298 22.4
19930118 566 18.1
19930118 800 18.9
19930121 960 15.8
19930125 1247 19.8
19930128 1521 17.1
19930128 1817 19.8
19930202 2079 18.0
19930202 2342 10.0
19930209 2511 16.4
19930212 2780 16.7
19930214 3024 19.0
19930215 3320 17.7
19930302 3560 16.4
19930312 3853 18.8
19930313 4105 18.5

From the car.dat that you gave in the comments, it's surprising that the program doesn't show anything. When I run it, it shows a very clear runtime error:
$ gfortran -o cardata cardata.f90
$ ./cardata
At line 12 of file cardata.f90 (unit = 10, file = 'car.dat')
Fortran runtime error: Bad integer for item 1 in list input
You seem to be copying code from another example without really understanding what it does. The code, as you wrote it, expects the file car.dat to be in a certain format: First an integer, which corresponds to the number of items in the file, then a single real per line. So something like this:
5
1.2
4.1
2.2
0.4
-5.2
But with your example, the first line contains text (that is, the description of the different columns), and when it tries to parse that into an integer (N) it must fail.
I will not give you the complete example, as I have the nagging suspicion that this is some sort of homework from which you are supposed to learn something. But here are a few hints:
You can easily read several values per line:
read(10, *) date(I), odometer(I), fuel(I)
I'm assuming here that, different to your program, date, odometer, and fuel are arrays. date and odometer can be integer, but fuel must be real (or double precision, but that's not necessary for these values).
You need to jump over the first line before you can start. You can just read the line into a dummy character(len=64) variable. (I picked len=64, but you can pick any other length that you feel confident with, but it should be long enough to actually contain the whole line.)
The trickiest bit is how to get your N as it is not given at the beginning of the file. What you can do is this:
N = 0
readloop : do
read(10, fmt=*, iostat=ios) date(N+1), odometer(N+1), fuel(N+1)
if (ios /= 0) exit readloop
N = N + 1
end do readloop
Of course you need to declare INTEGER :: ios at the beginning of your program. This will try to read the values into the next position on the arrays, and if it fails (usually because it has reached the end of the file) it will just end.
Note that this again expects date, odometer, and fuel to be arrays, and moreover, to be arrays large enough to contain all values. If you can't guarantee that, I recommend reading up on allocatable arrays and how to dynamically increase their size.

MPI write to file sequentially

I am writing a parallel VTK file (pvti) from my fortran CFD solver. The file is really just a list of all the individual files for each piece of the data. Running MPI, if I have each process write the name of its individual file to standard output
print *, name
then I get a nice list of each file, ie
block0.vti
block1.vti
block2.vti
This is exactly the sort of list I want. But if I write to a file
write(9,*) name
then I only get one output in the file. Is there a simple way to replicate the standard output version of this without transferring data?

You could try adapting the following which uses MPI-IO, which is really the only way to ensure ordered files from multiple processes. It does assume an end of line character and that all the lines are the same length (padded with blanks if required) but I think that's about it.
Program ascii_mpiio
! simple example to show MPI-IO "emulating" Fortran
! formatted direct access files. Note can not use the latter
! in parallel with multiple processes writing to one file
! is behaviour is not defined (and DOES go wrong on certain
! machines)
Use mpi
Implicit None
! All the "lines" in the file will be this length
Integer, Parameter :: max_line_length = 30
! We also need to explicitly write a carriage return.
! here I am assuming ASCII
Character, Parameter :: lf = Achar( 10 )
! Buffer to hold a line
Character( Len = max_line_length + 1 ) :: line
Integer :: me, nproc
Integer :: fh
Integer :: record
Integer :: error
Integer :: i
! Initialise MPI
Call mpi_init( error )
Call mpi_comm_rank( mpi_comm_world, me , error )
Call mpi_comm_size( mpi_comm_world, nproc, error )
! Create a MPI derived type that will contain a line of the final
! output just before we write it using MPI-IO. Note this also
! includes the carriage return at the end of the line.
Call mpi_type_contiguous( max_line_length + 1, mpi_character, record, error )
Call mpi_type_commit( record, error )
! Open the file. prob want to change the path and name
Call mpi_file_open( mpi_comm_world, '/home/ian/test/mpiio/stuff.dat', &
mpi_mode_wronly + mpi_mode_create, &
mpi_info_null, fh, error )
! Set the view for the file. Note the etype and ftype are both RECORD,
! the derived type used to represent a whole line, and the displacement
! is zero. Thus
! a) Each process can "see" all of the file
! b) The unit of displacement in subsequent calls is a line.
! Thus if we have a displacement of zero we write to the first line,
! 1 means we write to the second line, and in general i means
! we write to the (i+1)th line
Call mpi_file_set_view( fh, 0_mpi_offset_kind, record, record, &
'native', mpi_info_null, error )
! Make each process write to a different part of the file
Do i = me, 50, nproc
! Use an internal write to transfer the data into the
! character buffer
Write( line, '( "This is line ", i0, " from ", i0 )' ) i, me
!Remember the line feed at the end of the line
line( Len( line ):Len( line ) ) = lf
! Write with a displacement of i, and thus to line i+1
! in the file
Call mpi_file_write_at( fh, Int( i, mpi_offset_kind ), &
line, 1, record, mpi_status_ignore, error )
End Do
! Close the file
Call mpi_file_close( fh, error )
! Tidy up
Call mpi_type_free( record, error )
Call mpi_finalize( error )
End Program ascii_mpii
Also please note you're just getting lucky with your standard output "solution", you're not guaranteed to get it all nice sorted.

Apart from having the writes from different ranks well mixed, your problem is that the Fortran OPEN statement probably truncates the file to zero length, thus obliterating the previous content instead of appending to it. I'm with Vladimir F on this and would write this file only in rank 0. There are several possible cases, some of which are listed here:
each rank writes a separate VTK file and the order follows the ranks or the actual order is not significant. In that case you could simply use a DO loop in rank 0 from 0 to #ranks-1 to generate the whole list.
each rank writes a separate VTK file, but the order does not follow the ranks, e.g. rank 0 writes block3.vti, rank 1 writes block12.vti, etc. In that case you can use MPI_GATHER to collect the block number from each process into an array at rank 0 and then loop over the elements of the array.
some ranks write a VTK file, some don't, and the block order does not follow the ranks. It's similar to the previous case - just have the ranks that do not write a block send a negative block number and then rank 0 would skip the negative array elements.
block numbering follows ranks order but not all ranks write a block. In that case you can use MPI_GATHER to collect one LOGICAL value from each rank that indicates if it has written a block or not.

If you are not in a hurry, you can force the output from different tasks to be in order:
! Loop over processes in order
DO n = 0,numProcesses-1
! Write to file if it is my turn
IF(nproc == n)THEN
! Write output here
ENDIF
! This call ensures that all processes wait for each other
#ifdef MPI
CALL MPI_Barrier(mpi_comm_world,ierr)
#endif
ENDDO
This solution is simple, but not efficient for very large output. This does not seem to be your case. Make sure you flush the output buffer after each write. If using this method, make sure to do tests before implementing, as success is not guaranteed on all architectures. This method works for me for outputting large NetCDF files without the need to pass the data around.

Retrieve data from file written in FORTRAN during program run

I am trying to write a series of values for time (real values) into a dat file in FORTRAN. This is a part of an MPI code and the code runs for a long time. So I would like to extract data at every time step and print it into a file and read the file any time during the execution of the program. Currently, the problem I am facing is, the values of time are not written into the file until the program ends. I have put the open statement before the do loop and the close statement after the end of do loop.
The parts of my code look like:
open(unit=57,file='inst.dat')
do loop starts
.
.
.
write(57,*) time
.
.
.
end do
close(57)

try call flush(unit). Check your compiler docs as this is i think an extension.
You mention MPI: For parallel codes I think you need to give each thread its own file/unit,
or take other measures to avoid conflicts.

From Gfortran manual:
Beginning with the Fortran 2003 standard, there is a FLUSH statement that should be preferred over the FLUSH intrinsic.
The FLUSH intrinsic and the Fortran 2003 FLUSH statement have identical effect: they flush the runtime library's I/O buffer so that the data becomes visible to other processes. This does not guarantee that the data is committed to disk.
On POSIX systems, you can request that all data is transferred to the storage device by calling the fsync function, with the POSIX file descriptor of the I/O unit as argument (retrieved with GNU intrinsic FNUM). The following example shows how:
! Declare the interface for POSIX fsync function
interface
function fsync (fd) bind(c,name="fsync")
use iso_c_binding, only: c_int
integer(c_int), value :: fd
integer(c_int) :: fsync
end function fsync
end interface
! Variable declaration
integer :: ret
! Opening unit 10
open (10,file="foo")
! ...
! Perform I/O on unit 10
! ...
! Flush and sync
flush(10)
ret = fsync(fnum(10))
! Handle possible error
if (ret /= 0) stop "Error calling FSYNC"

How about closing the file after every time step (assuming a reasonable amount of time elapses between time steps)?
do loop starts
.
.
!Note: an if statement should wrap the following so that it is
!only called by one processor.
open(unit=57,file='inst.dat')
write(57,*) time
close(57)
.
.
end do
Alternatively if the time between time steps is short, writing the data after blocks of 10, 100, ... iterations may be more efficient.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js