I have found several questions and answers about this, but among "stream", "recl", HDF, etc., I am a bit lost (I am quite a newbie).
I apologize if there is somewhere a plain answer to my question.
This is my problem: I want to insert into an existing Fortran code a "WRITE" statement that produces an unformatted file that I can subsequently read with another post-processing Fortran code that I have written. By portability I mean that I can do this regardless of the compiler and compilation flags used and, ideally, between different platforms (computers). Is this possible? How? If not, what are the compromises that I must accept?
If the answer is supported by a detailed, but not too complicated, explanation, it would be highly appreciated.
p.s.
I want to use unformatted files because they are lighter and the I/O operations should be faster than with formatted files. Correct?
Update #1
From a comment it seems that it is not strictly possible to obtain an unformatted file which is portable to different machines. Therefore, let us assume I want to use a single machine. I am using ifort and gfortran. If it is not possible with Fortran 90, I think I can use Fortran 2003. For me it is a bit complicated to control the compilation flags used to compile the original code, but if it is necessary I can work to control that aspect too.
I'm asking this question because I have been working on a project that requires collecting a lot of data REALLY fast, depending on the scenario. 5.7GBytes with a capital BYTE per second or 11.4GBytes per second.
We are working with a small striped raid array using 3 Samsung Pro NVME (for 11.4GB/s we have a larger array).
Currently, the project has been developed on Windows. I wanted to make things as portable as possible, so I focused on using the C++ Standard Library; however, no matter what I did, I could not crack transferring files faster than 1.5GB/s.
The strategy was simple: create a couple of huge swap buffers and write them directly to disk as a huge unformatted binary file.
Using std::ofstream
and benchmarking manually setting varied buffer sizes through:
rdbuf()->pubsetbuf(buffer, BUFFER_SIZE);
open(Filename, std::ios::binary|std::ios::trunc);
followed by my managed write loop, I was able to find a sweet spot, but never able to crack 1.5GB/s
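For reference, the std::ofstream approach described above might look roughly like the sketch below. The function name write_with_buffer and its parameters are made up for illustration, and the effect of pubsetbuf() on a file stream is implementation-defined, so the buffer size should be treated as a hint rather than a guarantee.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Writes `data` to `path` through a std::ofstream that was given a
// caller-sized stream buffer. Returns true if every step succeeded.
// (Hypothetical helper, not taken from the original project.)
bool write_with_buffer(const std::string& path,
                       const std::vector<char>& data,
                       std::size_t buffer_size,
                       std::size_t chunk_size)
{
    if (chunk_size == 0) return false;

    // Declared before the stream so that it outlives the stream.
    std::vector<char> stream_buffer(buffer_size);

    std::ofstream out;
    // pubsetbuf() has to be called before open() to have any effect.
    out.rdbuf()->pubsetbuf(stream_buffer.data(),
                           static_cast<std::streamsize>(stream_buffer.size()));
    out.open(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;

    // The managed write loop: hand the data over one chunk at a time.
    for (std::size_t offset = 0; offset < data.size(); offset += chunk_size) {
        const std::size_t n = std::min(chunk_size, data.size() - offset);
        out.write(data.data() + offset, static_cast<std::streamsize>(n));
        if (!out) return false;
    }

    out.close();
    return !out.fail();
}
```

Benchmarking then amounts to timing calls such as write_with_buffer("out.bin", data, 1 << 20, 1 << 24) for a range of buffer and chunk sizes.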
I then found the Windows SDK and its CreateFile function
In particular, the create file function using the FILE_FLAG_NO_BUFFERING flag.
This was a game-changer: as long as I made sure I fed it sector-aligned data (in my case everything needed to be a multiple of 512 bytes), I was suddenly able to take full advantage of the RAID array's throughput.
I revisited the std::ofstream function in an attempt to work with more OS-agnostic functions; however, even though one can specify zero buffer for std::ofstream, there doesn't appear to be any documentation with regards to any caveats to using that function with no buffer.
std::ofstream allows 64-bit values for its write size, unlike the Windows SDK's WriteFile, which only accepts a DWORD. With WriteFile, the maximum write size is the largest multiple of 512 one can squeeze into a uint32_t, and you must manage your writes in a loop if your file exceeds 4GB (mine do).
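The chunking arithmetic for that loop can be sketched as follows. This only plans the per-call lengths and does not call WriteFile itself; the names kSectorSize, kMaxChunk and plan_chunks are invented for this example, and the 512-byte sector size is the value from my setup, not something to assume in general.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Assumed sector size; on a real system this should be queried.
constexpr std::uint32_t kSectorSize = 512;

// Largest multiple of kSectorSize that still fits in a 32-bit length.
constexpr std::uint32_t kMaxChunk =
    std::numeric_limits<std::uint32_t>::max() / kSectorSize * kSectorSize;

// Splits a total byte count into lengths that each fit in 32 bits.
// If total_bytes is a multiple of kSectorSize, every returned length is
// one as well, because kMaxChunk itself is a multiple of kSectorSize.
std::vector<std::uint32_t> plan_chunks(std::uint64_t total_bytes)
{
    std::vector<std::uint32_t> chunks;
    while (total_bytes > 0) {
        const std::uint32_t n = total_bytes > kMaxChunk
            ? kMaxChunk
            : static_cast<std::uint32_t>(total_bytes);
        chunks.push_back(n);
        total_bytes -= n;
    }
    return chunks;
}
```

Each planned length would then be passed to one unbuffered write call, advancing the source pointer by the same amount.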
This just raises the question, is Microsoft simply not giving the C++ Standard Library Devs access to the necessary OS-level system calls to take advantage of Ultra-high-speed drive arrays? Or am I missing something in how to use the C++ Standard Library to its full potential?
"is Microsoft simply not giving the C++ Standard Library Devs..."
You might notice that the product you're using is called Microsoft Visual Studio. The Standard Library developers for Visual Studio work at Microsoft, although in a different team than the Windows developers.
The reason is a bit simpler: the Visual C++ devs can't possibly know and optimize for all possible usage scenarios. It's a bit unusual to do text formatting at such high speeds. Remember, the point of ostream is to provide operator<<. ofstream is for formatted output to files. But for high-speed I/O you want binary output anyway.
To put it bluntly, the bandwidth you're aiming for is within the ballpark of the physical limits of current commodity hardware (~24GByte/s for 16× PCIe 4), and in my own work I found it very challenging to reach single-core memory transfer rates above 8GByte/s without the use of "dark magic" (aka hand-crafted assembly and optimized system call code); it involved carefully aligning the memory accesses and making use of vector extensions. But most importantly, reaching these levels of optimization requires being aware of the kind of data that is being processed and what kind of access patterns to expect, and/or building caching intermediaries to accommodate the underlying hardware.
Such optimizations are plainly and simply outside the scope of general-purpose standard libraries. Standard libraries in their implementation must adhere to the behaviours written down in the specification, and some of these requirements tend to collide with what has to be done to make the most of the underlying hardware.
So I'm sorry to tell you, but you'll probably have to bite the bullet and use the low level system APIs directly, bypassing the standard library.
I have been using the very very old Turbo C++ 3.0 compiler.
During the usage of this compiler, I have become used to functions like getch(), getche() and most importantly clrscr().
Now I have started using Visual C++ 2010 Express. This is causing a lot of problems, as most of these functions (I found this out now) are non-standard and are not available in Visual C++.
What am I to do now?
Always try to avoid them if possible, or use their alternatives:
for getch() --- cin.get()
for clrscr() --- system("cls") // try to avoid system commands; see [System][1]
For any others, you can search for replacements.
The real question is what you are trying to do, globally.
getch and clrscr have never been portable. If you're trying
to create masks or menus in a console window, you should look
into curses or ncurses: these offer a portable solution for
such things. If it's just paging, you can probably get away
with simply outputting a large number of '\n' (for clrscr),
and std::cin.get() for getch. (But beware that this will only
return once the user has entered a new line, and will only read
one character of the line, leaving the rest in the buffer. It
is definitely not a direct replacement for getch. In fact,
std::getline or std::cin.ignore might be better choices.)
Edit:
Adding some more possibilities:
First, as Joachim Pileborg suggested in his comment, if
portability is an issue, there may be platform specific
functions for much of what you are trying to do. If all you're
concerned about is Windows (and it probably is, since system(
"cls" ) and getch() don't work elsewhere), then his comment
may be a sufficient answer.
Second, for many consoles (including xterm and recent console
windows under Windows), the escape sequence "\x1b[2J" should
clear the screen. (Note that if a hex digit directly followed
the escape, as in "\x1b2J", you would have to enter it as two
separate string literals, "\x1b" "2J", since otherwise it would
be interpreted as two characters, the first with the impossible
hex value of 0x1b2.) Don't forget about possible issues of
redirection and flushing, however.
Finally, if you're doing anything non-trivial, you should look
into curses (or ncurses, they're the same thing, but with
different implementations). It's a bit more effort to put into
action (you need explicit initialization, etc.), but it has
a getch function which does exactly what you want, and it also
has functions for explicitly positioning the cursor, etc. which
may also make your code simpler. (The original curses was
developed to support the original vi editor, at UCB. Any
editor-like task not being developed in its own window would
benefit enormously from it.)
Well,
People, I have found the one best solution that can be used everywhere.
I simply googled the definitions of clrscr() and gotoxy(), created a header file, and added these definitions to it. Thus, I can include this file and do everything that I was doing before.
But I have a query too.
windows.h is used in those definitions. Suppose I compile the file and make an exe file. Will I then be able to run it on a Linux machine?
In my opinion the answer has to be yes. But please tell me if I am wrong, and also tell me why I am wrong.
I was wondering if it was a good idea to load/save an array of a certain type of structure using fstream. Note, I am talking about loading/saving to a binary file. Should I be loading/saving independent variables such as int, float, boolean rather than a struct? The reason I ask is that I've heard that a structure might have some type of padding which might offset the save/load.
A structure may contain padding, which will be written to the file. This is no big deal if the file is going to be read back on the same platform, using code emitted by the same compiler that did the write. However, this is difficult to guarantee, and if you cannot guarantee it, you should normally write the data in some textual format, such as XML, JSON or whatever.
Without serialization, your binary data will not be portable across different platforms (and compilers). So if you need portability, then you need to serialize the data before storing it in a file as binary.
Have a look at these:
Boost Serialization Tutorial
Boost Serializable Concept
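If pulling in Boost is not an option, a hand-rolled version of the same idea is to write each field separately in a fixed byte order, so that neither struct padding nor the host's endianness ends up in the file. The sketch below is only an illustration under that assumption: the Sample struct and the helper names are hypothetical, and little-endian was picked arbitrarily as the on-disk order.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical record. In memory it is typically padded to 8 bytes,
// but only 5 bytes of actual data need to be stored.
struct Sample {
    std::uint8_t  flag;
    std::uint32_t count;
};

// Appends a 32-bit value in little-endian order, whatever the host uses.
void put_u32_le(std::vector<std::uint8_t>& out, std::uint32_t v)
{
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<std::uint8_t>(v >> shift));
    }
}

// Reads a 32-bit little-endian value from four consecutive bytes.
std::uint32_t get_u32_le(const std::uint8_t* p)
{
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Field-by-field encoding: always exactly 5 bytes, no padding.
std::vector<std::uint8_t> serialize(const Sample& s)
{
    std::vector<std::uint8_t> out;
    out.push_back(s.flag);
    put_u32_le(out, s.count);
    return out;
}

Sample deserialize(const std::vector<std::uint8_t>& bytes)
{
    if (bytes.size() != 5) {
        throw std::invalid_argument("expected exactly 5 bytes");
    }
    Sample s{};
    s.flag  = bytes[0];
    s.count = get_u32_le(&bytes[1]);
    return s;
}
```

The resulting byte vector can be written with ofstream::write in binary mode. Floating-point fields need extra care, since their representation is not guaranteed to be the same on every platform.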
It's not deprecated (it's not part of any formal spec, where should it be deprecated?), but it's extremely not portable and probably the worst way to go about serialising stuff. Use Boost.Serialization, or a similar library.
As you pointed out in your question, this will happen when writing structs this way. If you want your files to be portable across platforms, e.g. a file written on Linux i686 to be opened on Solaris on SPARC, then even writing individual floats won't work.
Try writing your data to something like text or XML and then zip/tar the files to make one document of them.
As Neil said, prefer textual representation of data. The XML format may be overkill. Simpler versions are Comma Separated Value (CSV) and one value per text line.
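A one-value-per-line text format can be sketched like this. The function names are made up for the example; the one real detail to watch is the output precision, which is set to max_digits10 here so that finite doubles read back as exactly the same values.

```cpp
#include <iomanip>
#include <istream>
#include <limits>
#include <ostream>
#include <vector>

// Writes one value per line with enough digits to round-trip a double.
// NaN and infinity are not handled by this sketch.
void write_values(std::ostream& out, const std::vector<double>& values)
{
    out << std::setprecision(std::numeric_limits<double>::max_digits10);
    for (double v : values) {
        out << v << '\n';
    }
}

// Reads whitespace-separated values until the stream ends or fails.
std::vector<double> read_values(std::istream& in)
{
    std::vector<double> values;
    double v;
    while (in >> v) {
        values.push_back(v);
    }
    return values;
}
```

Pass a std::ofstream and std::ifstream to store the values in a file; the text is larger than the binary form, but it does not depend on struct layout or byte order.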
I have to work on a Fortran program, which used to be compiled using Microsoft Compaq Visual Fortran 6.6. I would prefer to work with gfortran but I have met lots of problems.
The main problem is that the generated binaries have different behaviours. My program takes an input file and then has to generate an output file. But sometimes, when using the binary compiled by gfortran, it crashes before its end, or gives different numerical results.
This is a program written by researchers which uses a lot of floating-point numbers.
So my question is: what are the differences between these two compilers which could lead to this kind of problem?
edit:
My program computes the values of some parameters and there are numerous iterations. At the beginning, everything goes well. After several iterations, some NaN values appear (only when compiled by gfortran).
edit:
Thank you, everybody, for your answers.
So I used the Intel compiler, which helped me by giving some useful error messages.
The origin of my problems is that some variables are not initialized properly. It looks like when compiling with Compaq Visual Fortran these variables automatically take 0 as a value, whereas with gfortran (and Intel) they take random values, which explains some numerical differences that add up over the following iterations.
So now the solution is a better understanding of the program to correct these missing initializations.
There can be several reasons for such behaviour.
What I would do is:
Switch off any optimization
Switch on all debug options. If you have access to e.g. the Intel compiler, use ifort -CB -CU -debug -traceback. If you have to stick to gfortran, use valgrind; its output is somewhat less human-readable, but it's often better than nothing.
Make sure there are no implicitly typed variables: use implicit none in all the modules and all the code blocks.
Use consistent float types. I personally always use real*8 as the only float type in my codes. If you are using external libraries, you might need to change call signatures for some routines (e.g., BLAS has different routine names for single and double precision variables).
If you are lucky, it's just that some variable doesn't get initialized properly, and you'll catch it with one of these techniques. Otherwise, as M.S.B. was suggesting, a deeper understanding of what the program really does is necessary. And, yes, it might be necessary to just check the algorithm manually, starting from the point where you say 'some NaN values appear'.
Different compilers can emit different instructions for the same source code. If a numerical calculation is on the boundary of working, one set of instructions might work, and another not. Most compilers have options to use more conservative floating point arithmetic, versus optimizations for speed -- I suggest checking the compiler options that you are using for the available options. More fundamentally this problem -- particularly that the compilers agree for several iterations but then diverge -- may be a sign that the numerical approach of the program is borderline. A simplistic solution is to increase the precision of the calculations, e.g., from single to double. Perhaps also tweak parameters, such as a step size or similar parameter. Better would be to gain a deeper understanding of the algorithm and possibly make a more fundamental change.
I don't know about the crash, but some differences in the results of numerical code on an Intel machine can be due to one compiler using 80-bit extended precision and the other 64-bit doubles, even if not for variables but perhaps for temporary values. Moreover, floating-point computation is sensitive to the order in which elementary operations are performed. Different compilers may generate different sequences of operations.
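The order sensitivity is easy to demonstrate. The sketch below is in C++ rather than Fortran, but the same IEEE arithmetic applies; the function names are made up, and the example values were chosen so that the small term is lost in one grouping and kept in the other.

```cpp
// Same three operands, two different groupings of the additions.
double sum_left_to_right(double a, double b, double c)
{
    return (a + b) + c;
}

double sum_right_to_left(double a, double b, double c)
{
    return a + (b + c);
}
```

With a = 1e20, b = -1e20 and c = 1.0, the first grouping cancels the large terms first and returns 1.0, while the second absorbs the 1.0 into -1e20 (where it is below the rounding granularity) and returns 0.0. A compiler that reorders such sums, for instance under aggressive floating-point optimization flags, can therefore change results.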
Differences in different type implementations, differences in various non-Standard vendor extensions, could be a lot of things.
Here are just some of the language features that differ (look at gfortran and Intel). Programs written to the Fortran standard behave the same on every compiler, but a lot of people don't know which features are standard and which are vendor extensions, and so use the extensions; when the code is compiled with a different compiler, troubles arise.
If you post the code somewhere I could take a quick look at it; otherwise, like this, 'tis hard to say for certain.
I am trying to partially truncate (or shorten) an existing file, using fstream. I have tried writing an EOF character, but this seems to do nothing.
Any help would be appreciated...
I don't think you can. There are many functions for moving "up and down" the wrapper hierarchy for HANDLE<->int<->FILE *, at least on Windows, but there is no "proper" way to extract the FILE * from an iostreams object (if indeed it is even implemented with one).
You may find this question to be of assistance.
Personally I would strongly recommend steering clear of iostreams, they're poorly designed, heavily C++, and nasty to look at. Take a look at Boost's iostreams, or wrap stdio.h if you need to use classes.
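The relevant function is ftruncate(). Note that it is a POSIX function rather than part of stdio: it takes a file descriptor, which you can obtain from a FILE * with fileno().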
The Boost.Interprocess library defines a portable truncate function. For some reason it is not documented, but you can find it in this header file.
It'll depend on the OS. Most OSes support this, but in different ways. On Windows, there's SetEndOfFile(), which truncates at the current file position. On Unix and similar systems, you call ftruncate() (or truncate()) with the length you want the file to have. Other OSes undoubtedly use other methods.
I bit the bullet in the end and read the part of the file to be kept to an array then re-wrote it. It's not the best solution - but as the files will always be small I have decided to accept this method.