What is the secret to the speed of the filesystem::copy function? - c++

I am trying to match the speed of filesystem::copy when reading the contents of a file and writing them to a new file (a "copy operation"), but I can't reach that speed.
The following is a simple example of my attempt:
void Copy(const wstring &fromPath, const wstring &toPath) {
    ifstream readFile(fromPath.c_str(), ios_base::binary | ios_base::ate);
    char *fileContent = NULL;
    if (!readFile) { cout << "Cannot open the file.\n"; return; }
    ofstream writeFile(toPath.c_str(), ios_base::binary);
    streampos size = readFile.tellg();
    readFile.seekg(0, ios_base::beg);
    fileContent = new char[size];
    readFile.read(fileContent, size);
    writeFile.write(fileContent, size);
    readFile.close();
    writeFile.close();
    delete[] fileContent;
}
The previous code is able to copy a 1.48 GB .iso file in about 8 to 9 seconds, while filesystem::copy copies the same file in 1 to 2 seconds at most.
Note: I don't want to use C++17 at the moment.
What can I do to make my function as fast as filesystem::copy?

Your implementation needs to allocate a buffer the size of the whole file. That is wasteful; you could just read 64 KB, write 64 KB, and repeat for the next blocks.
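A minimal sketch of that chunked approach, keeping the question's wstring signature (the wide-path stream constructor is a Microsoft extension, and the 64 KB buffer size is just a starting point to experiment with):

#include <fstream>
#include <iostream>
#include <string>
#include <vector>

// Read a fixed-size block, write it, repeat: memory use stays at 64 KB
// no matter how large the file is.
void ChunkedCopy(const std::wstring &fromPath, const std::wstring &toPath) {
    std::ifstream readFile(fromPath.c_str(), std::ios_base::binary);
    std::ofstream writeFile(toPath.c_str(), std::ios_base::binary);
    if (!readFile || !writeFile) { std::cout << "Cannot open the file.\n"; return; }

    std::vector<char> buffer(64 * 1024);
    while (readFile) {
        readFile.read(buffer.data(), buffer.size());   // may read less at end of file
        std::streamsize got = readFile.gcount();
        if (got > 0)
            writeFile.write(buffer.data(), got);
    }
}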
There's cost to paging memory in and out. If you read the whole thing then write the whole thing, you end up paging in and out the whole file twice.
It could be that multiple threads might read/write separately (provided read stays ahead). That may speed things up.
With hardware support, there might not even be a need for the data to go all the way to the CPU. Yet, your implementation probably ends up doing just that. It would be very hard for the compiler to reason about what you do or don't do with fileContent.
There are countless other tricks the implementation of filesystem::copy might be using. You could go see how it is coded; there are plenty of open-source implementations.
There's a caveat though: The implementation of the standard library might rely on specific behaviours that the language doesn't guarantee. So you can't simply copy the code to a different compiler/architecture/platform.

Related

Does my C++ code handle 100GB+ file copying? [closed]

I need a cross-platform portable function that is able to copy a 100GB+ binary file to a new destination. My first solution was this:
void copy(const string &src, const string &dst)
{
    FILE *f;
    char *buf;
    long len;
    f = fopen(src.c_str(), "rb");
    fseek(f, 0, SEEK_END);
    len = ftell(f);
    rewind(f);
    buf = (char *) malloc((len + 1) * sizeof(char));
    fread(buf, len, 1, f);
    fclose(f);
    f = fopen(dst.c_str(), "a");
    fwrite(buf, len, 1, f);
    fclose(f);
}
Unfortunately, the program was very slow. I suspect the buffer had to keep 100 GB+ in memory. I'm tempted to try the new code (taken from Copy a file in a sane, safe and efficient way):
std::ifstream src_(src, std::ios::binary);
std::ofstream dst_ = std::ofstream(dst, std::ios::binary);
dst_ << src_.rdbuf();
src_.close();
dst_.close();
My question is about this line:
dst_ << src_.rdbuf();
What does the C++ standard say about it? Does this compile to a byte-by-byte transfer, or to a whole-buffer transfer (like my first example)?
I'm curious whether the << compiles to something useful for me. Maybe I don't have to invest my time in something else, and can just let the compiler do the job inside the operator? If the operator translates to looping for me, why should I do it myself?
PS: std::filesystem::copy is impossible as the code has to work for C++11.
The crux of your question is what happens when you do this:
dst_ << src_.rdbuf();
Clearly this is two function calls: one to istream::rdbuf(), which simply returns a pointer to a streambuf, followed by one to ostream::operator<<(streambuf*), which is documented as follows:
After constructing and checking the sentry object, checks if sb is a null pointer. If it is, executes setstate(badbit) and exits. Otherwise, extracts characters from the input sequence controlled by sb and inserts them into *this until one of the following conditions are met: [...]
Reading this, the answer to your question is that copying a file in this way will not require buffering the entire file contents in memory--rather it will read a character at a time (perhaps with some chunked buffering, but that's an optimization that shouldn't change our analysis).
Here is one implementation: https://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-api-4.6/a01075_source.html (__copy_streambufs). Essentially it is a loop calling sgetc() and sputc() repeatedly until EOF is reached. The memory required is small and constant.
The C++ standard (I checked C++98, so this should be extremely compatible) says in [lib.ostream.inserters]:
basic_ostream<charT,traits>& operator<<
(basic_streambuf<charT,traits> *sb);
Effects: If sb is null calls setstate(badbit) (which may throw ios_base::failure).
Gets characters from sb and inserts them in *this. Characters are read from sb and inserted until any of the following occurs:
end-of-file occurs on the input sequence;
inserting in the output sequence fails (in which case the character to be inserted is not extracted);
an exception occurs while getting a character from sb.
If the function inserts no characters, it calls setstate(failbit) (which may throw ios_base::failure (27.4.4.3)). If an exception was thrown while extracting a character, the function sets failbit in error state, and if failbit is on in exceptions() the caught exception is rethrown.
Returns: *this.
This description says << on rdbuf works on a character-by-character basis. In particular, if inserting a character fails, that exact character remains unread in the input sequence. This implies that an implementation cannot just extract the whole contents into a single huge buffer upfront.
So yes, there's a loop somewhere in the internals of the standard library that does a byte-by-byte (well, charT really) transfer.
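To make that concrete, here is a rough sketch (not the actual library code) of what such a loop looks like; real implementations such as libstdc++'s __copy_streambufs add chunked fast paths on top, but the observable behaviour is the same:

#include <streambuf>
#include <string>

// Conceptual version of operator<<(streambuf*): pull one character at a
// time and push it to the destination until EOF or an insertion failure.
std::streamsize copy_streambuf(std::streambuf *in, std::streambuf *out) {
    typedef std::char_traits<char> traits;
    std::streamsize count = 0;
    traits::int_type c = in->sgetc();                 // peek, don't consume yet
    while (!traits::eq_int_type(c, traits::eof())) {
        if (traits::eq_int_type(out->sputc(traits::to_char_type(c)), traits::eof()))
            break;                                    // insertion failed: the character stays unread
        ++count;
        c = in->snextc();                             // consume it, peek at the next one
    }
    return count;
}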
However, this does not mean that the whole thing is completely unbuffered. This is simply about what operator<< does internally. Your ostream object will still accumulate data internally until its buffer is full, then call write (or whatever low-level function your OS uses).
Unfortunately, the program was very slow.
Your first solution is wrong for a very simple reason: it reads the entire source file into memory, then writes it back out in its entirety.
Files were invented (perhaps in the 1960s) to handle data that doesn't fit in memory (and has to live in some "slower" storage, at that time hard disks or drums, or perhaps even tapes). And they have always been copied in "chunks".
The current (Unix-like) definition of a file (as a sequence of bytes that is open-ed, read, write-n, close-d) is more recent than the 1960s, probably the late 1970s or early 1980s. And it comes with the notion of streams (which have been standardized in C with <stdio.h> and in C++ with std::fstream).
So your program has to work (like every file-copying program today) for files much bigger than the available memory. You need some loop to read into a buffer, write it out, and repeat.
The size of the buffer is very important. If it is too small, you'll make too many IO operations (e.g. system calls). If it is too big, IO might be inefficient or even not work.
In practice, the buffer should today be much less than your RAM, typically several megabytes.
Your code is more C-like than C++-like because it uses fopen. Here is a possible solution in C with <stdio.h>. If you code in genuine C++, adapt it to <fstream>:
void copyfile(const char *destpath, const char *srcpath) {
    // experiment with various buffer sizes
#define MYBUFFERSIZE (4*1024*1024) /* four megabytes */
    char *buf = malloc(MYBUFFERSIZE);
    if (!buf) { perror("malloc buf"); exit(EXIT_FAILURE); };
    FILE *filsrc = fopen(srcpath, "rb");
    if (!filsrc) { perror(srcpath); exit(EXIT_FAILURE); };
    FILE *fildest = fopen(destpath, "wb");
    if (!fildest) { perror(destpath); exit(EXIT_FAILURE); };
    for (;;) {
        size_t rdsiz = fread(buf, 1, MYBUFFERSIZE, filsrc);
        if (rdsiz == 0) {   // end of file, or an input error
            if (ferror(filsrc)) { perror("fread"); exit(EXIT_FAILURE); };
            break;
        }
        size_t wrsiz = fwrite(buf, rdsiz, 1, fildest);
        if (wrsiz != 1) { perror("fwrite"); exit(EXIT_FAILURE); };
    }
    if (fclose(filsrc)) { perror("fclose source"); exit(EXIT_FAILURE); };
    if (fclose(fildest)) { perror("fclose dest"); exit(EXIT_FAILURE); };
    free(buf);
}
For simplicity, I am reading into the buffer in byte-sized elements and writing it back out as a single element. A better solution is to handle partial writes, for example as sketched below.
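A sketch of such a partial-write-aware helper, in the same plain <stdio.h> style as the code above:

#include <stdio.h>

/* Write `size` bytes from buf, looping on short writes. The caller checks
   that the return value equals `size` and consults ferror() otherwise. */
size_t write_all(const char *buf, size_t size, FILE *f) {
    size_t done = 0;
    while (done < size) {
        size_t n = fwrite(buf + done, 1, size - done, f);
        if (n == 0)          /* no progress: an error occurred */
            break;
        done += n;
    }
    return done;
}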
Apparently dst_ << src_.rdbuf(); might do some loop internally (I have to admit I never used it and did not understand that at first; thanks to melpomene for correcting me). But the actual buffer size matters a great deal. The two other answers (by John Swinck and by melpomene) focus on that rdbuf() thing. My answer focuses on explaining why copying can be slow when you do it as in your first solution, why you need to loop, and why the buffer size matters so much.
If you really care about performance, you need to understand implementation details and operating system specific things. So read Operating systems: three easy pieces. Then understand how, on your particular operating system, the various buffering is done (there are several layers of buffers involved: your program buffers, the standard stream buffers, the kernel buffers, the page cache). Don't expect your C++ standard library to buffer in an optimal fashion.
Don't even dream of writing an optimal or very fast copying function in standard C++ alone (without operating-system-specific code). If performance matters, you need to dive into OS-specific details.
On Linux, you might use time(1), oprofile(1), and perf(1) to measure your program's performance. You could use strace(1) to understand the various system calls involved (see syscalls(2) for a list). You might even code (in a Linux-specific way) directly against the open(2), read(2), write(2), close(2), and perhaps readahead(2), mmap(2), posix_fadvise(2), madvise(2), and sendfile(2) system calls.
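For illustration only, a Linux-specific sketch using open(2) and sendfile(2) between two regular files could look roughly like this (error reporting trimmed; sendfile between two regular files needs a reasonably recent kernel):

#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <unistd.h>

// Linux-only: let the kernel move the bytes, avoiding user-space buffers.
bool copy_sendfile(const char *src, const char *dst) {
    int in = open(src, O_RDONLY);
    if (in < 0) return false;
    struct stat st;
    if (fstat(in, &st) < 0) { close(in); return false; }
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, st.st_mode & 0777);
    if (out < 0) { close(in); return false; }

    off_t remaining = st.st_size;
    while (remaining > 0) {
        ssize_t sent = sendfile(out, in, nullptr, remaining);  // kernel-side copy
        if (sent <= 0) break;                                  // error: check errno
        remaining -= sent;
    }
    close(in);
    close(out);
    return remaining == 0;
}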
At last, large-file copying is limited by disk IO (which is the bottleneck). So even by spending days optimizing OS-specific code, you won't win much. The hardware is the limitation. You probably should write whatever is the most readable code for you (it might be that dst_ << src_.rdbuf(); thing, which does the looping) or use some library providing file copy. You might win a tiny amount of performance by tuning the various buffer sizes.
If the operator translates to looping for me, why should I do it myself?
Because you have no explicit guarantee on the actual buffering done (at various levels). As I explained, buffering matters for performance. Perhaps the actual performance is not that critical for you, and the ordinary settings of your system and standard library (and their default buffer sizes) might be enough.
PS. Your question contains at least 3 different (but related) questions. I don't find it clear (so I downvoted it), because I did not understand which one is the most relevant. Is it performance? Robustness? The meaning of dst_ << src_.rdbuf();? Why the first solution is slow? How to copy large files quickly?

How to copy a file from one location to another in a fast way with C++ program? [duplicate]

This question already has answers here:
Copy a file in a sane, safe and efficient way (9 answers)
I am trying to understand the code behind the copy command, which copies a file from one place to another. I studied C++ file system basics and have written the following code for my task.
#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main()
{
    cout << "Copy file\n";
    string from, to;
    cout << "Enter file address: ";
    cin >> from;
    ifstream in(from, ios::in | ios::binary);
    if (!in)
    {
        cout << "could not find file " << from << endl;
        return 1;
    }
    cout << "Enter file destination: ";
    cin >> to;
    ofstream out(to, ios::out | ios::binary);
    char ch;
    while (in.get(ch))
    {
        out.put(ch);
    }
    cout << "file has been copied\n";
    in.close();
    out.close();
}
Though this code works, it is much slower than the copy command of my OS, which is Windows. I want to know how I can make my program faster and reduce the difference between my program's time and my OS's copy command's time.
Reading one byte at a time is going to waste a lot of time in function calls... use a bigger buffer:
char ch[4096];
while (in) {
    in.read(ch, sizeof(ch));
    out.write(ch, in.gcount());
}
(you may want to add some more error handling, e.g. out may go in a bad state and the like)
(the most C++-ish way is reported here, but it takes advantage of streambuf functionality that a beginner typically has little reason to know about, and to me it is also far less instructive; a sketch of it follows below)
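For completeness, that streambuf-based version is essentially a one-liner; the library's operator<<(streambuf*) performs the read/write loop internally and chooses its own chunking:

#include <fstream>

// Copy by handing the source's stream buffer to the destination stream.
void copy_rdbuf(const char *from, const char *to) {
    std::ifstream in(from, std::ios::binary);
    std::ofstream out(to, std::ios::binary);
    out << in.rdbuf();
}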
You have correctly opened the file for binary read and binary write. However, instead of reading characters (which is not meaningful in binary format), use istream::read and ostream::write.
Like other answers say, use bigger buffers. I'd go for 1MB.
But there's a lot more to it.
Also, avoid the stream library and FILE stuff. They buffer the data, so you get two memcpy calls instead of one.
Disabling buffering on the streams can achieve a similar result, but I think you're better off using the system calls directly.
And one last thing, on the "do it yourself" front. You must check the return values from read and write calls. They may read/write less bytes than you ask them to.
If you can manage a circular buffer, you should switch between reading and writing whenever the function returns short... the disk may be more ready for reading or for writing at any given moment, so there is no point in wasting time waiting instead of switching to the other thing you have to do.
And now the very last thing you might want to explore: look into the sendfile system call. It was built to speed up web servers by doing all the copying in the kernel and avoiding context switches and memcpys, but it may serve here if it works with two disk file descriptors.

buffered std::ifstream to read from disk only once (C++)

Is there a way to add buffering to a std::ifstream in the sense that seeking (seekg) and reading multiple times wouldn't cause any more reads than necessary?
I'd basically like to read a chunk of file using stream multiple times but I'd want to have the chunk read from disk only once.
The question is probably a bit off because I want to mix buffered reads and streams...
For example:
char filename[] = "C:\\test.txt";
fstream inputfile;
char buffer[20] = {};   // zero-initialized so it is safe to print as a C string
inputfile.open(filename, ios::in | ios::binary);
inputfile.seekg(2, ios::beg);
inputfile.read(buffer, 3);
cout << buffer << std::endl;
inputfile.seekg(2, ios::beg);
inputfile.read(buffer, 3);
cout << buffer << std::endl;
I'd want to have to read from disk only once.
Personally, I wouldn't worry about reading from the file multiple times: the system will keep the used buffers hot anyway. However, depending on the location of the file and swap space, different disks may be used.
The file stream itself does support a setbuf() function which could theoretically set the internally used buffer to a size chosen by the user. However, the only arguments which have to be supported and need to have an effect are setbuf(0, 0), which has quite the opposite effect, i.e., the stream becomes unbuffered.
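If you want to experiment with it anyway, the replacement buffer has to be supplied before the file is opened on common implementations; whether anything beyond the mandated setbuf(0, 0) case actually takes effect is implementation-defined:

#include <fstream>
#include <vector>

int main() {
    std::vector<char> buf(1 << 20);                 // try a 1 MB stream buffer
    std::ifstream in;
    in.rdbuf()->pubsetbuf(buf.data(), buf.size());  // call before open()
    in.open("C:\\test.txt", std::ios::binary);
    // ... seekg()/read() as in the question ...
}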
I guess, the easiest way to guarantee that the data isn't read from the stream again is to use a std::stringstream and use that instead of the file stream after initial reading, e.g.:
std::stringstream inputfile;
inputfile << std::ifstream(filename).rdbuf();
inputfile.seekg(0, std::ios_base::beg);
If it is undesirable to read the entire file stream first, a filtering stream could be used which reads the file whenever it reaches a section it hasn't read yet. However, creating a corresponding stream buffer isn't that trivial, and since I consider the original objective already questionable, I doubt it would have much of a benefit. Of course, you could create a simple stream which just does the initialization in the constructor and use that instead (sketched below).
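A minimal sketch of that last idea, with a made-up class name:

#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: slurp the whole file once in the constructor, then
// serve every later seekg()/read() from memory.
class buffered_ifstream : public std::stringstream {
public:
    explicit buffered_ifstream(const std::string &filename)
        : std::stringstream(std::ios::in | std::ios::out | std::ios::binary) {
        *this << std::ifstream(filename, std::ios::binary).rdbuf();
        seekg(0, std::ios::beg);
    }
};

Usage would be, for example, buffered_ifstream inputfile("C:\\test.txt"); followed by the same seekg()/read() calls as in the question.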

Writing huge txt files without overloading RAM

I need to write the results of a process to a txt file. The process is very long, and the amount of data to be written is huge (~150 GB). The program works fine, but the problem is that the RAM gets overloaded and, at a certain point, it just stops.
The program is simple:
ofstream f;
f.open(filePath);
for (int k = 0; k < nDataset; k++) {
    // treat element of dataset
    f << result;
}
f.close();
Is there a way of writing this file without overloading the memory?
You should flush output periodically.
For example:
if (k%10000 == 0) f.flush();
I'd like to suggest something like this:
ogzstream f;   // ogzstream: gzip-compressed output stream (e.g. from the gzstream library)
f.open(filePath);
string s("");
for (int k = 0; k < nDataset; k++) {
    // treat element of dataset
    s.append(result);
    if (s.length() >= OPTIMUM_BUFFER_SIZE) {
        f << s;
        f.flush();
        s.clear();
    }
}
f << s;
f.flush();
f.close();
Basically, you accumulate the data in memory rather than sending it straight to the stream, so you don't have to worry about when the stream gets flushed; and when you do send it, you make sure it is flushed to the actual file. Some ideas for the OPTIMUM_BUFFER_SIZE can be found here and here.
I'm not exactly sure whether string or vector is the best option for the buffer. Will do some research myself and update the answer or you can refer to Effective STL by Scott Meyers.
If that truly is the code where your program gets stuck, then your explanation of the problem is wrong.
There's no text file. Your igzstream is not dealing with text, but a gzip archive.
There's no data being written. The code you show reads from the stream.
I don't know what your program does with result, because you didn't show that. But if it accumulates results into a collection in memory, that will grow. You'll need to find a way to process all your data without loading all of it into RAM at the same time.
Your memory usage could be from the decompressor. For some compression algorithms, an entire block has to be stored in memory. In such cases it's best to break the file into blocks and compress each separately (possibly pre-initializing a dictionary with the results of the previous block). I don't think that gzip is such an algorithm, however. You may need to find a library that supports streaming.

ifstream vs. fread for binary files

Which is faster? ifstream or fread.
Which should I use to read binary files?
fread() puts the whole file into memory.
So after fread, accessing the buffer it creates is fast.
Does ifstream::open() put the whole file into memory?
or does it access the hard disk every time we run ifstream::read()?
So... does ifstream::open() == fread()?
or (ifstream::open(); ifstream::read(file_length);) == fread()?
Or shall I use ifstream::rdbuf()->read()?
edit:
My readFile() method now looks something like this:
void readFile()
{
    std::ifstream fin;
    fin.open("largefile.dat", std::ifstream::binary | std::ifstream::in);

    // in each of these small read methods, there is at least one fin.read()
    // call inside.
    readHeaderInfo(fin);
    readPreference(fin);
    readMainContent(fin);
    readVolumeData(fin);
    readTextureData(fin);

    fin.close();
}
Will the multiple fin.read() calls in the small methods slow down the program?
Shall I only use 1 fin.read() in the main method and pass the buffer into the small methods? I guess I am going to write a small program to test.
Thanks!
Are you really sure about fread putting the whole file into memory? File access can be buffered, but I doubt that you really get the whole file put into memory. I think ifstream::read just uses fread under the hood in a more C++ conformant way (and is therefore the standard way of reading binary information from a file in C++). I doubt that there is a significant performance difference.
To use fread, the file has to be open. It doesn't just take a file and put it into memory at once. So ifstream::open == fopen and ifstream::read == fread.
The C++ stream API is usually a little bit slower than the C file API if you use the high-level API, but it provides a cleaner/safer API than C.
If you want speed, consider using memory mapped files, though there is no portable way of doing this with standard library.
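As an illustration of the non-portable route, a POSIX mapping might be set up roughly like this (Windows would use CreateFileMapping/MapViewOfFile instead):

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Map an entire file read-only; the returned pointer can then be read like
// an in-memory buffer. The caller eventually calls munmap(ptr, length).
const char *map_file(const char *path, size_t &length) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return nullptr;
    struct stat st;
    if (fstat(fd, &st) < 0) { close(fd); return nullptr; }
    length = static_cast<size_t>(st.st_size);
    void *p = mmap(nullptr, length, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                                   // the mapping outlives the descriptor
    return p == MAP_FAILED ? nullptr : static_cast<const char *>(p);
}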
As to which is faster, see my comment. For the rest:
Neither of these methods automatically reads the whole file into memory. They both read as much as you specify.
At least for ifstream I am sure that the IO is buffered, so there will not necessarily be a disk access for every read you make.
See this question for the C++-way of reading binary files.
The idea with C++ file streams is that some or all of the file is buffered in memory (based on what it thinks is optimal) and that you don't have to worry about it.
I would use ifstream::read() and just tell it how much you need.
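For example, to read just a fixed-size header in one call (the struct layout here is made up, and it assumes the struct matches the on-disk format byte for byte):

#include <cstdint>
#include <fstream>

// Hypothetical header, only to illustrate reading an exact number of bytes.
struct FileHeader {
    std::uint32_t magic;
    std::uint32_t version;
    std::uint64_t payloadSize;
};

bool readHeader(std::ifstream &fin, FileHeader &hdr) {
    fin.read(reinterpret_cast<char *>(&hdr), sizeof(hdr));
    return fin.gcount() == static_cast<std::streamsize>(sizeof(hdr));
}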
Use stream operator:
DWORD processPid = 0;
std::ifstream myfile("C:/Temp/myprocess.pid", std::ios::binary);
if (myfile.is_open())
{
    myfile >> processPid;
    myfile.close();
    std::cout << "PID: " << processPid << std::endl;
}