C++ write filestream - c++

I am writing an application that produces and logs a lot of data in the form of ASCII and binary output (not mixed, one or the other depending on the log). The application is single-threaded (should make things easier) and I want to write my data to disk in the order that it was generated. I need to implement a write(char* data) method that takes a null-terminated character array and writes it to disk. Ideally, I want the function to buffer the data and return before the data is actually written to disk...I figure that there must be some way for Windows to setup a thread and do this in the background. The only thing that I care about is that I get the data in the log file in the order that it was written. What is the best way to do this? Someone else implemented the current write method and it looks like:
void writeData(const char* data, int size)
{
if (fp != 0)
fwrite (data, 1, size, fp);
}
fp is the file pointer.
C++ Stdio.h header:
http://www.cplusplus.com/reference/cstdio/fwrite/

In multi-thread, may be you can use something like log queue.
In single-thread, the order is guaranteed

If you are talking Windows-only then you pretty much have two options: Overlapped I/O through the WinAPI or setting up a separate thread in your program to handle file I/O (which can potentially be cross-platform by using pthreads)

Related

Does my C++ code handle 100GB+ file copying? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 4 years ago.
Improve this question
I need a cross-platform portable function that is able to copy a 100GB+ binary file to a new destination. My first solution was this:
void copy(const string &src, const string &dst)
{
FILE *f;
char *buf;
long len;
f = fopen(src.c_str(), "rb");
fseek(f, 0, SEEK_END);
len = ftell(f);
rewind(f);
buf = (char *) malloc((len+1) * sizeof(char));
fread(buf, len, 1, f);
fclose(f);
f = fopen(dst.c_str(), "a");
fwrite(buf, len, 1, f);
fclose(f);
}
Unfortunately, the program was very slow. I suspect the buffer had to keep 100GB+ in the memory. I'm tempted to try the new code (taken from Copy a file in a sane, safe and efficient way):
std::ifstream src_(src, std::ios::binary);
std::ofstream dst_ = std::ofstream(dst, std::ios::binary);
dst_ << src_.rdbuf();
src_.close();
dst_.close();
My question is about this line:
dst_ << src_.rdbuf();
What does the C++ standard say about it? Does the code compiled to byte-by-byte transfer or just whole-buffer transfer (like my first example)?
I'm curious does the << compiled to something useful for me? Maybe I don't have to invest my time on something else, and just let the compiler do the job inside the operator? If the operator translates to looping for me, why should I do it myself?
PS: std::filesystem::copy is impossible as the code has to work for C++11.
The crux of your question is what happens when you do this:
dst_ << src_.rdbuf();
Clearly this is two function calls: one to istream::rdbuf(), which simply returns a pointer to a streambuf, followed by one to ostream::operator<<(streambuf*), which is documented as follows:
After constructing and checking the sentry object, checks if sb is a null pointer. If it is, executes setstate(badbit) and exits. Otherwise, extracts characters from the input sequence controlled by sb and inserts them into *this until one of the following conditions are met: [...]
Reading this, the answer to your question is that copying a file in this way will not require buffering the entire file contents in memory--rather it will read a character at a time (perhaps with some chunked buffering, but that's an optimization that shouldn't change our analysis).
Here is one implementation: https://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-api-4.6/a01075_source.html (__copy_streambufs). Essentially it a loop calling sgetc() and sputc() repeatedly until EOF is reached. The memory required is small and constant.
The C++ standard (I checked C++98, so this should be extremely compatible) says in [lib.ostream.inserters]:
basic_ostream<charT,traits>& operator<<
(basic_streambuf<charT,traits> *sb);
Effects: If sb is null calls setstate(badbit) (which may throw ios_base::failure).
Gets characters from sb and inserts them in *this. Characters are read from sb and inserted until any of the following occurs:
end-of-file occurs on the input sequence;
inserting in the output sequence fails (in which case the character to be inserted is not extracted);
an exception occurs while getting a character from sb.
If the function inserts no characters, it calls setstate(failbit) (which may throw ios_base::failure (27.4.4.3)). If an exception was thrown while extracting a character, the function set failbit in error state, and if failbit is on in exceptions() the caught exception is rethrown.
Returns: *this.
This description says << on rdbuf works on a character-by-character basis. In particular, if inserting of a character fails, that exact character remains unread in the input sequence. This implies that an implementation cannot just extract the whole contents into a single huge buffer upfront.
So yes, there's a loop somewhere in the internals of the standard library that does a byte-by-byte (well, charT really) transfer.
However, this does not mean that the whole thing is completely unbuffered. This is simply about what operator<< does internally. Your ostream object will still accumulate data internally until its buffer is full, then call write (or whatever low-level function your OS uses).
Unfortunately, the program was very slow.
Your first solution is wrong for a very simple reason: it reads the entire source file in memory, then write it entirely.
Files have been invented (perhaps in the 1960s) to handle data that don't fit in memory (and has to be in some "slower" storage, at that time hard disks or drums, or perhaps even tapes). And they have always been copied by "chunks".
The current (Unix-like) definition of file (as a sequence of bytes than is open-ed, read, write-n, close-d) is more recent than 1960s. Probably the late 1970s or early 1980s. And it comes with the notion of streams (which has been standardized in C with <stdio.h> and in C++ with std::fstream).
So your program has to work (like every file copying program today) for files much bigger than the available memory.You need some loop to read some buffer, write it, and repeat.
The size of the buffer is very important. If it is too small, you'll make too many IO operations (e.g. system calls). If it is too big, IO might be inefficient or even not work.
In practice, the buffer should today be much less than your RAM, typically several megabytes.
Your code is more C like than C++ like because it uses fopen. Here is a possible solution in C with <stdio.h>. If you code in genuine C++, adapt it to <fstream>:
void copyfile(const char*destpath, const char*srcpath) {
// experiment with various buffer size
#define MYBUFFERSIZE (4*1024*1024) /* four megabytes */
char* buf = malloc(MYBUFFERSIZE);
if (!buf) { perror("malloc buf"); exit(EXIT_FAILURE); };
FILE* filsrc = fopen(srcpath, "r");
if (!filsrc) { perror(srcpath); exit(EXIT_FAILURE); };
FILE* fildest = fopen(destpath, "w");
if (!fildest) { perror(destpath); exit(EXIT_FAILURE); };
for (;;) {
size_t rdsiz = fread(buf, 1, MYBUFFERSIZE, filsrc);
if (rdsiz==0) // end of file
break;
else if (rdsiz<0) // input error
{ perror("fread"); exit(EXIT_FAILURE); };
size_t wrsiz = fwrite(buf, rdsiz, 1, fildest);
if (wrsiz != 1) { perror("fwrite"); exit(EXIT_FAILURE); };
}
if (fclose(filsrc)) { perror("fclose source"); exit(EXIT_FAILURE); };
if (fclose(fildest)) { perror("fclose dest"); exit(EXIT_FAILURE); };
}
For simplicity, I am reading the buffer in byte components and writing it as a whole. A better solution is to handle partial writes.
Apparently dst_ << src_.rdbuf(); might do some loop internally (I have to admit I never used it and did not understand that at first; thanks to Melpopene for correcting me). But the actual buffer size matters a big lot. The two other answers (by John Swinck and by melpomene) focus on that rdbuf() thing. My answer focus on explaining why copying can be slow when you do it like in your first solution, and why you need to loop and why the buffer size matters a big lot.
If you really care about performance, you need to understand implementation details and operating system specific things. So read Operating systems: three easy pieces. Then understand how, on your particular operating system, the various buffering is done (there are several layers of buffers involved: your program buffers, the standard stream buffers, the kernel buffers, the page cache). Don't expect your C++ standard library to buffer in an optimal fashion.
Don't even dream of coding in standard C++ (without operating system specific stuff) an optimal or very fast copying function. If performance matters, you need to dive in OS specific details.
On Linux, you might use time(1), oprofile(1), perf(1) to measure your program's performance. You could use strace(1) to understand the various system calls involved (see syscalls(2) for a list). You might even code (in a Linux specific way) using directly the open(2), read(2), write(2), close(2) and perhaps readahead(2), mmap(2), posix_fadvise(2), madvise(2), sendfile(2) system calls.
At last, large file copying are limited by disk IO (which is the bottleneck). So even by spending days in optimizing OS specific code, you won't win much. The hardware is the limitation. You probably should code what is the most readable code for you (it might be that dst_ << src_.rdbuf(); thing which is looping) or use some library providing file copy. You might win a tiny amount of performance by tuning the various buffer sizes.
If the operator translates to looping for me, why should I do it myself?
Because you have no explicit guarantee on the actual buffering done (at various levels). As I explained, buffering matters for performance. Perhaps the actual performance is not that critical for you, and the ordinary settings of your system and standard library (and their default buffers sizes) might be enough.
PS. Your question contains at least 3 different questions (but related ones). I don't find it clear (so downvoted it), because I did not understand what is the most relevant one. Is it : performance? robustness? meaning of dst_ << src_.rdbuf();? Why is the first solution slow? How to copy large files quickly?

WriteFile overlapped and fwrite equivalent

On Windows, the WriteFile() function has a parameter called lpOverlapped which lets you specify an offset at which to write to the file.
I was wondering, is there is an fwrite() cross-platform equivalent of that?
I see that if the file is opened with the rb+ flag, I might be able to use fseek() to write to a particular offset. My question is - will this approach be equivalent to the overlapped WriteFile(), and will it produce the same behaviour on all platforms?
Background
The reason I need this is because I am writing blocked compressed data streams to a file, and I want to be able to load a specific block from the file and be able to decompress it. So, basically if I keep track of where the block begins in a file, I can load the block and decompress it in a more efficient manner. I know that there are probably better ways to do this, but I need this solution for some backwards compatibility.
Assuming you are okay with using POSIX functions and not just things from the C or C++ standard libraries, the solution is pwrite (aka: positioned write).
ssize_t rc = pwrite(file_handle, data_ptr, data_size, destination_offset);
I think you are confusing "overlapped" and "overwrite"/"offset." I didn't study up on the specifics of why Microsoft explicitly says overlapped writes include a parameter for offset (I think it makes sense as I describe below). In general, when Microsoft talks about "overlapped" IO, they are talking about how to synchronize events like starting to write the file, receiving notification that the write completed, and starting another write to the file which might or might not overlap with a previous write. In this last case, by overlap I mean what you would think that overlap means, ie overlaps within the contents of the file. Whereas Microsoft means that writing the file overlaps in time with your thread running, or not. Note that this gets very complicated if more than one thread can write the same file.
If possible, and surely if you want portable code, you want to avoid all this nonsense and just do the simplest write possible in each context, which means avoid Microsoft optimizations like "overlapped IO" unless you really need performance. (And if you need absolutely optimal performance, you might want to cache the file yourself and manage the overlaps, then write it once from start to finish.)
While pwrite is probably the best solution, there is an alternative that sticks with stdio functions. Unfortunately, to make it thread-safe, you're using non-standard "stdio" to take direct control of the FILE*'s internal lock, and the names aren't portable. Specifically, POSIX defines one set of "take/release file lock" names and Windows defines another set (_lock_file/_unlock_file).
That said, you could use these semi-portable constructs to use stdio functions to ensure no buffering conflicts (pwrite to fileno(some_FILE_star) could cause problems if the FILE* buffer overlaps the pwrite location, since pwrite won't fix up the buffer):
// Error checking omitted; you should actually check returns in real code
size_t pfwrite(const void *ptr, size_t size, size_t n,
size_t offset, FILE *stream) {
// Take FILE*'s lock and hold it for entire transaction
flockfile(stream); // _lock_file on Windows
// Record position
long origpos = ftell(stream);
// Seek to desired offset and write
fseek(stream, offset, SEEK_SET); // Possibly offset * size, not just offset?
size_t written = fwrite(ptr, size, n, stream);
// Seek back to original position
fseek(stream, origpos, SEEK_SET);
// Release FILE*'s lock now that transaction complete
funlockfile(stream); // _unlock_file on Windows
return written;
}

What's the difference between read() and getc()

I have two code segments:
while((n=read(0,buf,BUFFSIZE))>0)
if(write(1,buf,n)!=n)
err_sys("write error");
while((c=getc(stdin))!=EOF)
if(putc(c,stdout)==EOF)
err_sys("write error");
Some sayings on internet make me confused. I know that standard I/O does buffering automatically, but I have passed a buf to read(), so read() is also doing buffering, right? And it seems that getc() read data char by char, how much data will the buffer have before sending all the data out?
Thanks
While both functions can be used to read from a file, they are very different. First of all on many systems read is a lower-level function, and may even be a system call directly into the OS. The read function also isn't standard C or C++, it's part of e.g. POSIX. It also can read arbitrarily sized blocks, not only one byte at a time. There's no buffering (except maybe at the OS/kernel level), and it doesn't differ between "binary" and "text" data. And on POSIX systems, where read is a system call, it can be used to read from all kind of devices and not only files.
The getc function is a higher level function. It usually uses buffered input (so input is read in blocks into a buffer, sometimes by using read, and the getc function gets its characters from that buffer). It also only returns a single characters at a time. It's also part of the C and C++ specifications as part of the standard library. Also, there may be conversions of the data read and the data returned by the function, depending on if the file was opened in text or binary mode.
Another difference is that read is also always a function, while getc might be a preprocessor macro.
Comparing read and getc doesn't really make much sense, more sense would be comparing read with fread.

writing in file in MPI

I've written a program in MPI and I want to execute it under 3 or more cores (or computers) and I want to write some thing in file in each processor, for two cores I've created two different files,but I don't know what should I do for more cores,is it necessary to create separate file for each processor?if yes, so how should I do it based on my code,and is it different for multi computer ?
void main()////it is just for 2 cores,what should I do if I use
more cores or more computers?
{
FILE*fp=fopen("C:\\a.txt","w");
FILE*fp1=fopen("C:\\b.txt","w");
if(Id==0)
{
here I write in "fp"
}
else
{
here I write in "fp1"
}
}
If you are using this model, yes, you would need to create a file per process. The reason is that if you have two processes write to the same file, they would most likely end up overwriting each other unless you do something with MPI to ensure that they serialize themselves.
Depending on the data you're trying to write, you could take a look at the I/O section of MPI. There is functionality in there for parallel I/O but it can be a bit tricky.
void main() {
int me;
char filename[1000];
MPI_Comm_rank(MPI_COMM_WORLD, &me);
sprintf(filename,"C:\\a%05d.txt",me);
FILE *fp=fopen(filename,"w");
here I write in "fp"
}
This is probably the hint you required, but usually the correct approach is one of:
collect data (see MPI_Gather() to start with) an let only the process with rank==0 perform I/O
if I/O is important or if all the data cannot fit the memory of one node (isn't it the same actually?), use MPI-IO.

Consistency of two C FILE* streams on a single file

I need to implement a simple "spill to disk" layer for large volume of data coming off a network socket. I was hoping to have two C FILE* streams, one used by a background thread writing to the file, one used by a front end thread reading it.
The two streams are so one thread can be writing at one offset, while the other is reading elsewhere - without taking a lock and blocking the other thread.
There will be a paging mechanism so the reads/writes are at random access locations - not necessarily sequential.
One more caveat, this needs to work on Windows and Linux.
The question: after the fwrite to the first stream has returned, is that written data guaranteed to be immediately visible to an fread on the second stream?
If not, what other options might I consider?
So Posix pread/pwrite functions turned out to be what I needed. Here's a version for Win32:
size_t pread64(int fd, void* buf, size_t nbytes, __int64 offset)
{
OVERLAPPED ovl;
memset(&ovl, 0, sizeof(ovl));
*((__int64*)&ovl.Offset)=offset;
DWORD nBytesRead;
if (!ReadFile((HANDLE)_get_osfhandle(fd), buf, nbytes, &nBytesRead, &ovl))
return -1;
return nBytesRead;
}
size_t pwrite64(int fd, void* buf, size_t nbytes, __int64 offset)
{
OVERLAPPED ovl;
memset(&ovl, 0, sizeof(ovl));
*((__int64*)&ovl.Offset)=offset;
DWORD nBytesWritten;
if (!WriteFile((HANDLE)_get_osfhandle(fd), buf, nbytes, &nBytesWritten, &ovl))
return -1;
return nBytesWritten;
}
(And thank you everyone for input on this - much appreciated).
This sounds like a great fit for memory-mapped I/O. It's guaranteed to be coherent, very fast, and keeping track of multiple pointers is straightforward.
You'll need different functions to set up the memory mapping on different OSes, but the actual I/O is completely portable (using pointer deference).
linux: open, mmap
Windows: CreateFileMapping, MapViewOfFile
This definitely will not give you the semantics you want. If you disabled buffering, it might be reasonable to expect it to work, but I still don't think there are any guarantees. Stdio/FILE is really not the right tool for specialized IO needs like this.
The POSIX way to do what you want is with file descriptors and the pread/pwrite functions. I suspect there's a Windows way (or you could emulate them based on some other underlying Windows primitive) but I don't know it.
Also Ben's suggestion of using memory-mapped IO is a very good one, assuming the file fits in your address space.