Parallel IO & Append - fortran

When I run my small-scale parallel codes, I typically output N files (N being number of processors) in the form fileout.dat.xxx where xxx is the processor number (using I3.3) and then just cat them into a single fileout.dat file after the code is finished.
My question is can I use ACCESS='append' or POSITION='append' in the OPEN statement and have all processors write to the same file?

In practice, no. POSITION='append' merely means that the file pointer is positioned at the end of the file after the OPEN statement executes. It is still possible to change the file position afterwards, e.g. with BACKSPACE, REWIND, or similar statements. Thus Fortran's POSITION='append' does not correspond to the POSIX O_APPEND, and hence a POSIX OS cannot ensure that all writes only append to the file and never overwrite older data.
Furthermore, if you run your code on a cluster, be aware that O_APPEND does not work reliably on many networked file systems such as NFS.
In order to do parallel I/O with several processes/threads writing to a single file, use ACCESS='direct' or ACCESS='stream' and have the processes agree on which records/byte ranges to write to.

Related

Thread 1 reads from file as thread 2 writes to same file

Thread 1 (T1) creates the file using
FILE *MyFile = tmpfile();
Thread 2 (T2) then starts writing to the file. While thread 2 is writing, thread 1 occasionally reads from the file.
I set it up such that T2 is temporarily suspended while T1 is reading. But since T1 only ever reads a part of the file that T2 won't be writing to (the file is written sequentially), I'm wondering whether suspending T2 is actually necessary. I know this would be fine if the FILE were replaced by a fixed-size array or vector; I'm just wondering how disk differs from memory.
Edit.
The writes are done using fseek and fwrite. The reads are done using fseek and fread. I assumed that was a given but maybe not from some of the comments. I suppose if T1 fseeks to position X at the same time as T2 fseeks to position Y then who knows where the next read or write will start from. Will take a look at pipes, Thanks for the help.
Mixing reads and writes on a FILE is not even safe when dealing with a single thread. From the manpage of fopen:
Reads and writes may be intermixed on read/write streams in any order. Note that ANSI C
requires that a file positioning function intervene between output and input, unless an input
operation encounters end-of-file. (If this condition is not met, then a read is allowed to
return the result of writes other than the most recent.) Therefore it is good practice (and
indeed sometimes necessary under Linux) to put an fseek(3) or fgetpos(3) operation between
write and read operations on such a stream. This operation may be an apparent no-op (as in
fseek(..., 0L, SEEK_CUR) called for its synchronizing side effect).
So don't assume reads and writes are magically synchronized for you and protect access to the FILE with a mutex.

Do we need mutex to perform multithreading file IO

I'm trying to do random writes (a benchmark test) to a file using multiple threads (pthreads). It looks like if I comment out the mutex lock, the created file ends up smaller than it should be, as if some writes are getting lost (always by some multiple of the chunk size). But if I keep the mutex, the size is always exact.
Does my code have a problem somewhere else, so that the mutex is not really required (as suggested by #evan), or is the mutex necessary here?
void *DiskWorker(void *threadarg) {
    FILE *theFile = fopen(fileToWrite, "a+");
    ....
    for (long i = 0; i < noOfWrites; ++i) {
        //pthread_mutex_lock (&mutexsum);
        // For random access
        fseek(theFile, randomArray[i] * chunkSize, SEEK_SET);
        fputs(data, theFile);
        // Or for sequential access (in this case the two lines above would not be here)
        fprintf(theFile, "%s", data);
        // sequential access end
        fflush(theFile);
        //pthread_mutex_unlock(&mutexsum);
    }
    .....
}
You are opening a file using "append mode". According to C11:
Opening a file with append mode ('a' as the first character in the
mode argument) causes all subsequent writes to the file to be forced
to the then current end-of-file, regardless of intervening calls to
the fseek function.
The C standard does not specify exactly how this should be implemented, but on POSIX systems it is usually implemented using the O_APPEND flag of the open function, with data flushed using the write function. Note that the fseek call in your code should therefore have no effect on where the data lands.
I think POSIX requires this, as it describes how redirecting output in append mode (>>) is done by the shell:
Appended output redirection shall cause the file whose name results
from the expansion of word to be opened for output on the designated
file descriptor. The file is opened as if the open() function as
defined in the System Interfaces volume of POSIX.1-2008 was called
with the O_APPEND flag. If the file does not exist, it shall be
created.
And since most programs use the FILE interface to send data to stdout, this probably requires fopen to use open with O_APPEND and to write data with write (and not functions like pwrite).
So if fopen with the 'a' mode on your system uses O_APPEND, flushing is done using write, and your kernel and filesystem correctly implement the O_APPEND flag, then using a mutex should have no effect, as the writes do not interleave:
If the O_APPEND flag of the file status flags is set, the file
offset shall be set to the end of the file prior to each write and no
intervening file modification operation shall occur between changing
the file offset and the write operation.
Note that not all filesystems support this behavior. Check this answer.
As for my answer to your previous question, my suggestion was to remove mutex as it should have no effect on the size of a file (and it didn't have any effect on my machine).
Personally, I never really used O_APPEND and would be hesitant to do so, as its behavior might not be supported at some level, plus its behavior is weird on Linux (see "bugs" section of pwrite).
You definitely need a mutex because you are issuing several different file commands. The underlying file subsystem can't possibly know how many file commands you are going to call to complete your whole operation.
So you need the mutex.
In your situation you may find you get better performance by putting the mutex outside the loop. The reason is that, otherwise, switching between threads may cause excessive seeking between different parts of the disk. Hard disks take about 10 ms to move the read/write head, so that could potentially slow things down a lot.
So it might be a good idea to benchmark that.

Are fortran file read/write semantics different between Fortran77 and Fortran90?

I've inherited both the source code and a compiled executable for a Fortran77 program. I do not know what compiler was used to make the existing executable, however I'm using GCC 4.9.2. Among other things, the program reads and writes records from/to a binary file. The number of records is also stored in the beginning of the file.
My problem is that the executable I compile myself incorrectly reads the number of records from the file, and consequently throws an error when it tries to read past the last record. The binary data file was generated using the existing executable (the one I inherited), and surprisingly, when I use the existing executable to read from the data file, it works as expected.
My question is: could the Fortran READ/WRITE statements have different semantics (such as file layout) depending on the platform, Fortran version, or compiler type/version?
For what it's worth, the read and write code is
WRITE(INLIB,REC=1)NPROD,(ELMNTS(K),K=1,ITE),(ATOMWT(KK),KK=1,
+ ITE),IOUT,INFILE,ITERM,IBM,LINEPR
READ(INLIB,REC=1)NPROD,(ELMNTS(K),K=1,ITE),(ATOMWT(KK),KK=1,ITE)
+ ,IOUT,INFILE,ITERM,IBM,LINEPR
NPROD is the number of records in the file. When I put a break point after the READ statement, I can see that NPROD is about 300,000, when I know for a fact that there are only about 2,000 records.
This is the code for opening the file:
OPEN(UNIT=INLIB,FILE='PRODUCT.BIN',ACCESS='DIRECT',RECL=1188,
+ STATUS='OLD')
With regard to endianess, I think my current platform is compatible with the binary data file because if I open it in a hex editor, I can see some legible ASCII text that makes sense in the context of the program.

Atomic writing to file on linux

Is there a way to dump a buffer to file atomically?
By "atomically" I mean: if for example someone terminates my application during writing, I'd like to have file in either before- or after-writing state, but not in a corrupted intermediate state.
If the answer is "no", then probably it could be done with a really small buffers?
For example, can I dump two consecutive int32_t variables with a single 8-byte fwrite (on an x64 platform) and be sure that both of those int32s are written, or neither of them, but never just one?
I recommend writing to a temporary file and then doing a rename(2) on it.
#include <cstdio>
#include <fstream>

std::ofstream o("file.tmp"); // Write to a temporary file
o << "my data";
o.close();
// Perform an atomic move operation... needed so readers can't open a partially written file
std::rename("file.tmp", "file.real");

Unexpected "padding" in a Fortran unformatted file

I don't understand the format of unformatted files in Fortran.
For example:
open (3,file=filename,form="unformatted",access="sequential")
write(3) matrix(i,:)
outputs a column of a matrix into a file. I've discovered that it pads the file with 4 bytes on either end, however I don't really understand why, or how to control this behavior. Is there a way to remove the padding?
For unformatted IO, Fortran compilers typically write the length of the record at the beginning and end of the record. Most, but not all, compilers use four bytes. This aids in reading records; e.g., the length at the end assists with a BACKSPACE operation. You can suppress this with the new stream IO mode of Fortran 2003, which was added for compatibility with other languages. Use access='stream' in your open statement.
I never use sequential access with unformatted output for this exact reason. However, it depends on the application, and sometimes it is convenient to have a record-length indicator (especially for unstructured data). As suggested by steabert in Looking at binary output from fortran on gnuplot, you can avoid this by using the keyword argument ACCESS='DIRECT', in which case you need to specify the record length. This method is convenient for efficient storage of large multi-dimensional structured data (constant record length). The following example writes an unformatted file whose size equals the size of the array:
REAL(KIND=4),DIMENSION(10) :: a = 3.141
INTEGER :: reclen
INQUIRE(iolength=reclen)a
OPEN(UNIT=10,FILE='direct.out',FORM='UNFORMATTED',&
ACCESS='DIRECT',RECL=reclen)
WRITE(UNIT=10,REC=1)a
CLOSE(UNIT=10)
END
Note that this is not the ideal approach in the sense of portability. In an unformatted file written with direct access, there is no information about the size of each element. A readme text file that describes the data size does the job fine for me, and I prefer this method over the padding in sequential mode.
Fortran IO is record based, not stream based. Every time you write something through write() you are writing not only the data but also begin and end markers for that record. Both record markers hold the size of that record. This is why writing a bunch of reals in a single write (one record: one begin marker, the bunch of reals, one end marker) produces a different file size from writing each real in a separate write (multiple records, each with one begin marker, one real, and one end marker). This is extremely important if you are writing large matrices, as you could balloon the file size if the records are written improperly.
Fortran unformatted IO: I am quite familiar with the differing outputs of the Intel and GNU compilers. Fortunately, my experience dating back to 1970s IBM systems allowed me to decode things. GNU pads records with 4-byte integer counters giving the record length. Intel uses a 1-byte counter and a number of embedded coding values to signify a continuation record or the end of a count. One can still have very long record lengths even though only 1 byte is used.
I have software compiled with the GNU compiler that I had to modify so it could read an unformatted file generated by either compiler, so it has to detect which format it finds. Reading an unformatted file generated by the Intel compiler (which follows the "old" IBM days) takes "forever" using GNU's fgetc or opening the file in stream mode. Converting the file to what GNU expects makes things up to 100 times faster. Whether you want to bother with detection and conversion depends on your file size. I reduced my program's startup time (which opens a large unformatted file) from 5 minutes down to 10 seconds. I had to add options to convert back again if the user wants to take the file back to an Intel-compiled program. It's all a pain, but there you go.