In my application I'm trying to merge sorted files (keeping them sorted, of course), so I have to iterate through each element in both files and write the smaller one to a third file. This is pretty slow on big files. Since I don't see any other choice (the iteration has to be done), I'm trying to optimize file loading. I have some RAM that I can use for buffering: instead of reading 4 bytes from each file every time, I could read something like 100 MB at once and work from that buffer until it is empty, then refill it. But I guess ifstream is already doing that. Would my own buffer give me more performance, and why? If fstream does buffer, can I change the size of that buffer?
added
My current code looks like that (pseudocode)
// this is done in loop
int i1 = input1.read_integer();
int i2 = input2.read_integer();
if (!input1.eof() && !input2.eof())
{
if (i1 < i2)
{
output.write(i1);
input2.seek_back(sizeof(int));
} else {
input1.seek_back(sizeof(int));
output.write(i2);
}
} else {
if (input1.eof())
output.write(i2);
else if (input2.eof())
output.write(i1);
}
What I don't like here is
seek_back - I have to seek back to previous position as there is no way to peek 4 bytes
too much reading from file
if one of the streams is in EOF it still continues to check that stream instead of putting contents of another stream directly to output, but this is not a big issue, because chunk sizes are almost always equal.
Can you suggest improvement for that?
Thanks.
Without getting into the discussion on stream buffers, you can get rid of the seek_back and generally make the code much simpler by doing:
using namespace std;
merge(istream_iterator<int>(file1), istream_iterator<int>(),
istream_iterator<int>(file2), istream_iterator<int>(),
ostream_iterator<int>(cout, "\n")); // delimiter keeps the output values apart
Edit:
Added binary capability
#include <algorithm>
#include <iterator>
#include <fstream>
#include <iostream>
struct BinInt
{
int value;
operator int() const { return value; }
friend std::istream& operator>>(std::istream& stream, BinInt& data)
{
return stream.read(reinterpret_cast<char*>(&data.value),sizeof(int));
}
};
int main()
{
// Open in binary mode, since BinInt reads raw bytes rather than text.
std::ifstream file1("f1.txt", std::ios::binary);
std::ifstream file2("f2.txt", std::ios::binary);
std::merge(std::istream_iterator<BinInt>(file1), std::istream_iterator<BinInt>(),
std::istream_iterator<BinInt>(file2), std::istream_iterator<BinInt>(),
std::ostream_iterator<int>(std::cout, "\n"));
}
In decreasing order of performance (best first):
memory-mapped I/O
OS-specific ReadFile or read calls.
fread into a large buffer
ifstream.read into a large buffer
ifstream and extractors
A program like this should be I/O bound, meaning it should be spending at least 80% of its time waiting for completion of reading or writing a buffer, and if the buffers are reasonably big, it should be keeping the disk heads busy. That's what you want.
Don't assume it is I/O bound, without proof. A way to prove it is by taking several stackshots. If it is, most of the samples will show the program waiting for I/O completion.
It is possible that it is not I/O bound, meaning you may find other things going on in some of the samples that you never expected. If so, then you know what to fix to speed it up. I have seen some code like this spending much more time than necessary in the merge loop, testing for end-of-file, getting data to compare, etc. for example.
You can just use the read function of an ifstream to read large blocks.
http://www.cplusplus.com/reference/iostream/istream/read/
The second parameter is the number of bytes. You should make this a multiple of 4 in your case - maybe 4096? :)
Simply read a chunk at a time and work on it.
As martin-york said, this may not have any beneficial effect on your performance, but try it and find out.
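A minimal sketch of the "read a chunk at a time" idea, assuming the file holds raw native-endian ints. The helper name readIntsInChunks and the chunkInts parameter are made up for illustration; gcount() tells you how many bytes the last read actually delivered, which matters for the final, shorter chunk.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper: reads a binary file of native-endian ints in large
// blocks and returns all values. chunkInts is an assumed tuning knob.
std::vector<int> readIntsInChunks(const std::string& path,
                                  std::size_t chunkInts = 1 << 20)
{
    std::vector<int> result;
    std::ifstream in(path, std::ios::binary);
    if (!in) return result;

    std::vector<int> chunk(chunkInts);
    for (;;) {
        in.read(reinterpret_cast<char*>(chunk.data()),
                static_cast<std::streamsize>(chunk.size() * sizeof(int)));
        // gcount() is the number of bytes actually read, even on a short read.
        const std::size_t got = static_cast<std::size_t>(in.gcount()) / sizeof(int);
        if (got == 0) break;  // end of file (or error): nothing more to process
        result.insert(result.end(), chunk.begin(),
                      chunk.begin() + static_cast<std::ptrdiff_t>(got));
    }
    return result;
}
```

In a real merge you would work on each chunk in place instead of collecting everything into one vector; the vector is only there to keep the sketch short.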
I think it is very likely that you can improve performance by reading big chunks.
Try opening the file with ios::binary as an argument, then use istream::read to read the data.
If you need maximum performance, I would actually suggest skipping iostreams altogether, and using cstdio instead. But I guess this is not what you want.
Unless there is something very special about your data it is unlikely that you will improve on the buffering that is built into the std::fstream object.
The std::fstream objects are designed to be very efficient for general-purpose file access. It does not sound like you are doing anything special by accessing the data 4 bytes at a time. You can always profile your code to see where the time is actually spent.
Maybe if you share the code with us we could spot some major inefficiencies.
Edit:
I don't like your algorithm. Seeking back and forth may be hard on the stream, especially if the number lies across a buffer boundary. I would read only one number each time through the loop.
Try this:
Note: This is not optimal, and it assumes text stream input of numbers (while yours looks binary), but I am sure you can use it as a starting point.
#include <fstream>
#include <iostream>
// Return the current val (that was the smaller value)
// and replace it with the next value in the stream.
int getNext(int& val, std::istream& str)
{
int result = val;
str >> val;
return result;
}
int main()
{
std::ifstream f1("f1.txt");
std::ifstream f2("f2.txt");
std::ofstream re("result");
int v1;
int v2;
f1 >> v1;
f2 >> v2;
// While there are values in both stream
// Output one value and replace it using getNext()
while(f1 && f2)
{
// Parenthesize the conditional: << binds tighter than ?:
re << ((v1 < v2) ? getNext(v1, f1) : getNext(v2, f2)) << '\n';
}
// At this point one (or both) stream(s) is(are) empty.
// So dump the other stream.
for(;f1;f1 >> v1)
{
// Note if the stream is at the end it will
// never enter the loop
re << v1 << '\n';
}
for(;f2;f2 >> v2)
{
re << v2 << '\n';
}
}
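Since the files in the question look binary, here is a sketch of the same one-value look-ahead idea for raw native-endian ints, with no seeking back. IntSource and mergeSortedIntFiles are names I made up for illustration, and this relies on the stream's own buffering rather than a hand-rolled one.

```cpp
#include <fstream>
#include <ios>
#include <string>

// Hypothetical helper: keeps one look-ahead value from a binary int file.
struct IntSource {
    std::ifstream in;
    int current = 0;
    bool valid = false;

    explicit IntSource(const std::string& path) : in(path, std::ios::binary)
    {
        advance();
    }

    // Load the next value; valid becomes false once the file is exhausted.
    void advance()
    {
        valid = static_cast<bool>(
            in.read(reinterpret_cast<char*>(&current), sizeof current));
    }
};

// Merges two sorted binary int files into outPath.
void mergeSortedIntFiles(const std::string& a, const std::string& b,
                         const std::string& outPath)
{
    IntSource s1(a), s2(b);
    std::ofstream out(outPath, std::ios::binary);

    // Write the look-ahead value of s and move s forward by one.
    auto emit = [&out](IntSource& s) {
        out.write(reinterpret_cast<const char*>(&s.current), sizeof s.current);
        s.advance();
    };

    while (s1.valid && s2.valid) emit(s1.current <= s2.current ? s1 : s2);
    while (s1.valid) emit(s1);  // drain whichever input is left
    while (s2.valid) emit(s2);
}
```

Each value is read exactly once, so the "peek 4 bytes" problem goes away: the peeked value simply lives in current until it is written.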
Related
I'm trying to read and write a few megabytes of data stored in files, consisting out of 8 floats converted to strings per line, to my SSD.
Looking up C++ code and implementing some of the answers here for reading and writing files yielded me this code for reading a file:
std::stringstream file;
std::fstream stream;
stream.open("file.txt", std::fstream::in);
file << stream.rdbuf();
stream.close();
And this code for writing files:
stream.write(file.str().data(), file.tellg());
The problem is, that this code is very slow, compared to the speed of my SSD. My SSD has a reading speed of 2400 MB/s and a writing speed of 1800 MB/s.
But my program has a read speed of only 180.6 MB/s and a write speed of 25.11 MB/s.
Because some asked how I measure the speed, I obtain a std::chrono::steady_clock::time_point using std::chrono::steady_clock::now() and then do a std::chrono::duration_cast.
Using the same 5.6MB large file and dividing the file size by the measured time, I get the megabytes per second.
How can I increase the speed of reading and writing to files, while using only standard C++ and STL?
I made a short evaluation for you.
I have written a test program, that first creates a test file.
Then I applied several improvements:
I switched on all compiler optimizations.
For the string, I use resize to avoid reallocations.
Reading from the stream is drastically improved by setting a bigger input buffer.
Please check whether you can use one of these ideas in your solution.
Edit
Strip down test program to pure reading:
#include <string>
#include <iterator>
#include <iostream>
#include <fstream>
#include <chrono>
#include <algorithm>
constexpr size_t NumberOfExpectedBytes = 80'000'000;
constexpr size_t SizeOfIOStreamBuffer = 1'000'000;
static char ioBuffer[SizeOfIOStreamBuffer];
const std::string fileName{ "r:\\log.txt" };
void writeTestFile() {
if (std::ofstream ofs(fileName); ofs) {
for (size_t i = 0; i < 2'000'000; ++i)
ofs << "text,text,text,text,text,text," << i << "\n";
}
}
int main() {
//writeTestFile();
// Make string with big buffer
std::string completeFile{};
completeFile.resize(NumberOfExpectedBytes);
if (std::ifstream ifs(fileName); ifs) {
// Increase buffer size for buffered input
ifs.rdbuf()->pubsetbuf(ioBuffer, SizeOfIOStreamBuffer);
// Time measurement start
auto start = std::chrono::system_clock::now();
// Read complete file
std::copy(std::istreambuf_iterator<char>(ifs), {}, completeFile.begin());
// Time measurement evaluation
auto end = std::chrono::system_clock::now();
auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
// How long did it take?
std::cout << "Elapsed time: " << elapsed.count() << " ms\n";
}
else std::cerr << "\n*** Error. Could not open source file\n";
return 0;
}
With that I achieve 123.2 MB/s.
You can try to copy the whole file at once and see if that improves the speed:
#include <algorithm>
#include <fstream>
#include <iterator>
int main() {
std::ifstream is("infile");
std::ofstream os("outfile");
std::copy(std::istreambuf_iterator<char>(is), std::istreambuf_iterator<char>{},
std::ostreambuf_iterator<char>(os));
// or simply: os << is.rdbuf()
}
In your sample, the slow part is likely the repeated calls to getline(). While this is somewhat implementation-dependent, typically a call to getline eventually boils down to an OS call to retrieve the next line of text from an open file. OS calls are expensive, and should be avoided in tight loops.
Consider a getline implementation that incurs ~1ms of overhead. If you call it 1000 times, each reading ~80 characters, you've acquired a full second of overhead. If, on the other hand, you call it once and read 80,000 characters, you've removed 999ms of overhead and the function will likely return nearly instantaneously.
(This is also one reason games and the like implement custom memory management rather than just malloc and newing all over the place.)
For reading: Read the entire file at once, if it'll fit in memory.
See: How do I read an entire file into a std::string in C++?
Specifically, see the slurp answer towards the bottom. (And take to heart the comment about using a std::vector instead of a char[] array.)
If it won't all fit in memory, manage it in large chunks.
For writing: build your output in a stringstream or similar buffer, and then write it in one step, or in large chunks, to minimize the number of OS round trips.
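As a rough sketch of the writing advice, assuming one float per line: format everything into a std::string first, then hand the whole thing to the stream in a single write(). The function name writeLinesAtOnce and the 16-bytes-per-value reserve guess are mine, not from the question.

```cpp
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper: formats all values into one buffer, then issues a
// single write() instead of one stream insertion per value.
bool writeLinesAtOnce(const std::string& path, const std::vector<float>& values)
{
    std::string buffer;
    buffer.reserve(values.size() * 16);  // rough guess at bytes per value
    for (float v : values) {
        buffer += std::to_string(v);
        buffer += '\n';
    }

    std::ofstream out(path, std::ios::binary);
    out.write(buffer.data(), static_cast<std::streamsize>(buffer.size()));
    return static_cast<bool>(out);
}
```

For output too large to hold in memory, the same loop could flush the buffer every few megabytes instead of only once at the end.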
Looks like you are outputting formatted numbers to a file. There are two bottlenecks already: formatting the numbers into human readable form and the file I/O.
The best performance you can achieve is to keep the data flowing. Starting and stopping requires overhead penalties.
I recommend double buffering with two or more threads.
One thread formats the data into one or more buffers. Another thread writes the buffers to the file. You'll need to adjust the size and quantity of buffers to keep the data flowing. When one thread finishes a buffer, the thread starts processing another buffer.
For example, you could have the writing thread use fstream.write() to write the entire buffer.
The double buffering with threads can also be adapted for reading. One thread reads the data from the file into one or more buffers and another thread formats the data (from the buffers) into internal format.
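To make the writing side concrete, here is a minimal sketch using only the standard library. The formatting thread fills text buffers and a second thread writes them out. The function name and the valuesPerBuffer parameter are assumptions for illustration, and the queue is unbounded for brevity; a real implementation would probably cap the number of buffers in flight.

```cpp
#include <condition_variable>
#include <cstddef>
#include <fstream>
#include <ios>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical sketch: the calling thread formats numbers into text buffers
// while a second thread writes finished buffers to the file.
void writeFormattedInBackground(const std::string& path,
                                const std::vector<int>& values,
                                std::size_t valuesPerBuffer = 4096)
{
    std::queue<std::string> ready;  // finished buffers waiting to be written
    std::mutex m;
    std::condition_variable cv;
    bool done = false;

    std::thread writer([&] {
        std::ofstream out(path, std::ios::binary);
        for (;;) {
            std::unique_lock<std::mutex> lock(m);
            cv.wait(lock, [&] { return done || !ready.empty(); });
            if (ready.empty()) break;  // done and nothing left to write
            std::string buf = std::move(ready.front());
            ready.pop();
            lock.unlock();             // write without holding the lock
            out.write(buf.data(), static_cast<std::streamsize>(buf.size()));
        }
    });

    std::string current;
    std::size_t count = 0;
    auto handOff = [&] {
        {
            std::lock_guard<std::mutex> lock(m);
            ready.push(std::move(current));
        }
        cv.notify_one();
        current.clear();
        count = 0;
    };

    for (int v : values) {
        current += std::to_string(v);
        current += '\n';
        if (++count == valuesPerBuffer) handOff();
    }
    if (!current.empty()) handOff();

    {
        std::lock_guard<std::mutex> lock(m);
        done = true;
    }
    cv.notify_one();
    writer.join();
}
```

Whether this beats a single-threaded version depends on how expensive the formatting is compared with the write, so it is worth measuring before committing to the extra complexity.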
I am using boost::iostreams::mapped_file_source to read a text file from a specific position to a specific position and to manipulate each line (compiled using g++ -Wall -O3 -lboost_iostreams -o test main.cpp):
#include <iostream>
#include <string>
#include <cstring> // memchr
#include <boost/iostreams/device/mapped_file.hpp>
int main() {
boost::iostreams::mapped_file_source f_read;
f_read.open("in.txt");
long long int alignment_offset(0);
// set the start point
const char* pt_current(f_read.data() + alignment_offset);
// set the end point
const char* pt_last(f_read.data() + f_read.size());
const char* pt_current_line_start(pt_current);
std::string buffer;
while (pt_current && (pt_current != pt_last)) {
if ((pt_current = static_cast<const char*>(memchr(pt_current, '\n', pt_last - pt_current)))) {
buffer.assign(pt_current_line_start, pt_current - pt_current_line_start + 1);
// do something with buffer
pt_current++;
pt_current_line_start = pt_current;
}
}
return 0;
}
Currently, I would like to make this code handle gzip files as well and modify the code like this:
#include<iostream>
#include<boost/iostreams/device/mapped_file.hpp>
#include<boost/iostreams/filter/gzip.hpp>
#include<boost/iostreams/filtering_streambuf.hpp>
#include<boost/iostreams/filtering_stream.hpp>
#include<boost/iostreams/stream.hpp>
int main() {
boost::iostreams::stream<boost::iostreams::mapped_file_source> file;
file.open(boost::iostreams::mapped_file_source("in.txt.gz"));
boost::iostreams::filtering_streambuf< boost::iostreams::input > in;
in.push(boost::iostreams::gzip_decompressor());
in.push(file);
std::istream std_str(&in);
std::string buffer;
while(1) {
std::getline(std_str, buffer);
if (std_str.eof()) break;
// do something with buffer
}
}
This code also works well, but I don't know how to set the start point (pt_current) and the end point (pt_last) as in the first code. Could you let me know how I can set the two values in the second code?
The answer is no, that's not possible. The compressed stream would need to have indexes.
The real question is: why? You are using a memory-mapped file. Doing on-the-fly compression/decompression is only going to reduce performance and increase memory consumption.
If you're not short on actual file storage, then you should probably consider a binary representation, or keep the text as it is.
Binary representation could sidestep most of the complexity involved when using text files with random access.
Some inspirational samples:
Simplest way to read a CSV file mapped to memory?
Using boost::iostreams::mapped_file_source with std::multimap
Iterating over mmaped gzip file with boost
What you're basically discovering is that text files aren't random access, and compression makes indexing essentially fuzzy (there is no precise mapping from compressed stream offset to uncompressed stream offset).
Look at the zran.c example in the zlib distribution as mentioned in the zlib FAQ:
28. Can I access data randomly in a compressed stream?
No, not without some preparation. If when compressing you periodically use Z_FULL_FLUSH, carefully write all the pending data at those points, and keep an index of those locations, then you can start decompression at those points. You have to be careful to not use Z_FULL_FLUSH too often, since it can significantly degrade compression. Alternatively, you can scan a deflate stream once to generate an index, and then use that index for random access. See examples/zran.c
¹ you could specifically look at parallel implementations such as e.g. pbzip2 or pigz; These will necessarily use these "chunks" or "frames" to schedule the load across cores
I'm having a problem reading from a binary file (*.dat) using the .read(reinterpret_cast<char*>(&x), sizeof(x)) command: there is always an error about the existence of the file, even when the file exists or has been created successfully. Here is the code:
#include <iostream>
#include <string>
#include <fstream>
using namespace std;
struct x{
char name[10],pass[10];
};
int main()
{
x x1,x2;
fstream inout;
inout.open("test.dat" ,ios::binary);
if(!inout)
{
cout<<"Error";
exit(1);
}
cout<<"Enter your name:";
cin>>x1.name;
inout.write(reinterpret_cast <const char*> (&x1.name), sizeof(x1));
cout<<"Enter your name:";
cin>>x1.pass;
inout.write(reinterpret_cast <const char*> (&x1.pass), sizeof(x1));
while(inout.read(reinterpret_cast <char*> (&x2.name), sizeof(x1)))
{
cout<<x2.name;//here is my problem cannot read!!
}
inout.close();
}
Use std::flush after your write operations.
// ... Write x1.name and x1.pass
inout << std::flush;
// ... Read x2.name in while loop.
inout.close();
There is a problem with your output to the file.
First you are writing the struct x1 to the file where only the name field is filled
inout.write(reinterpret_cast <const char*> (&x1.name), sizeof(x1));
and afterwards:
inout.write(reinterpret_cast <const char*> (&x1.pass), sizeof(x1));
You start writing from the address of x1.pass but you are writing sizeof(x1) bytes.
sizeof(x1) is 20 here but its only 10 bytes from the start of x1.pass to the end of the struct, so you are writing 10 bytes of unknown data from the stack into your file.
So this is the first thing that your file may not contain what you expect it to contain.
The next thing is that after writing your data the stream is sitting at the end of the file and you try to read from there. You have to move the position back to the beginning of the stream to read the stuff you just wrote. For example use:
inout.seekg(0, std::ios::beg);
If you mix reads and writes on the same stream, you need to call flush or a file positioning function in between.
MSDN says:
When a basic_fstream object is used to perform file I/O, although the underlying buffer contains separately designated positions for reading and writing, the current input and current output positions are tied together, and therefore, reading some data moves the output position.
GNU Stdlib:
As you can see, ‘+’ requests a stream that can do both input and output. When using such a stream, you must call fflush (see Stream Buffering) or a file positioning function such as fseek (see File Positioning) when switching from reading to writing or vice versa. Otherwise, internal buffers might not be emptied properly.
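A small sketch of the write, reposition, read sequence on a single std::fstream. The helper name writeThenReadBack is made up. Note the open mode: as far as I know, passing only ios::binary to a std::fstream replaces the default in | out, so the open itself fails. That would explain the "Error" the question runs into before any reading or writing happens.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>

// Hypothetical helper: writes text to a file, then reads it back through
// the same std::fstream object.
std::string writeThenReadBack(const std::string& path, const std::string& text)
{
    // in | out is needed for a bidirectional stream; trunc creates the file
    // if it does not exist yet (in | out alone fails on a missing file).
    std::fstream io(path, std::ios::in | std::ios::out |
                          std::ios::binary | std::ios::trunc);
    if (!io) return {};

    io.write(text.data(), static_cast<std::streamsize>(text.size()));
    io.flush();
    io.seekg(0, std::ios::beg);  // reposition before switching to input

    std::string result(text.size(), '\0');
    io.read(&result[0], static_cast<std::streamsize>(result.size()));
    result.resize(static_cast<std::size_t>(io.gcount()));
    return result;
}
```

Calling writeThenReadBack("test.dat", "abc") should hand back the same three characters; without the seekg() the read would start at the end of the file and return nothing.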
Reading into raw C-style arrays from an input stream is not as idiomatic as a simple call to operator>>(). You also have to prevent buffer overruns by keeping track of both the bytes allocated for the buffer and the bytes being read into the buffer.
Reading into the buffer can be done by using the input stream method getline(). The following example shows the extraction into x1.name; the same would be done for x1.pass:
if (std::cin.getline(x1.name, sizeof(x1.name))) {
}
The second argument is the maximum number of bytes to be read. It is useful in that the stream won't write past the allocated bounds of the array. The next thing to do is just write it to the file as you have done:
if (std::cin.getline(x1.name, sizeof(x1.name))) {
inout.write(reinterpret_cast<char*>(&x1.name), std::cin.gcount());
}
std::cin.gcount() is the number of characters that were read from the input stream. It is a much more reliable alternative to sizeof(x1.name) in that it returns the number of characters written, not the characters allotted.
Now, bidirectional file streams are a bit tricky. They have to be coordinated in the right way. As explained in the other answers, bidirectional file streams (or std::fstreams) share a joint buffer for both input and output. The position indicators that mark positions in the input and output sequence are both affected by any input and output operations that may occur. As such, the file stream position has to be "moved" back before performing input. This can be done by a call to either seekg() or seekp(). Either will suffice since, as I said, the position indicators are bound to each other:
if (std::cin.getline(x1.pass, sizeof(x1.pass))) {
inout.write(reinterpret_cast<char*>(&x1.pass), std::cin.gcount());
inout.seekg(0, std::ios_base::beg);
}
Notice how this was done after the extraction into x1.pass. We can't do it after x1.name because we would be overwriting the stream on the second call to write().
As you can see, extracting into raw C-style arrays isn't pretty; you have to manage more things than you should. Fortunately, C++ comes to the rescue with its standard string class, std::string. Use this for simpler I/O:
Make both name and pass standard C++ strings (std::string) instead of raw C-arrays. This allows you to pass in the size as the second argument to your read() and write() calls:
#include <string>
struct x {
std::string name;
std::string pass;
};
// ...
if (std::cin >> x1.name) {
inout.write(x1.name.data(), x1.name.size());
}
if (std::cin >> x1.pass) {
inout.write(x1.pass.data(), x1.pass.size());
inout.seekg(0, std::ios_base::beg);
}
std::string allows us to leverage its dynamic nature and its capacity for maintaining the size of the buffer. We no longer have to use getline() but now a simple call to operator>>() and an if() check.
This was not possible before, but now that we're using std::string we can also combine both extractions to achieve the following:
if (std::cout << "Enter your name: " && std::cin >> x1.name &&
std::cout << "Enter your pass: " && std::cin >> x1.pass) {
inout.write(x1.name.data(), x1.name.size());
inout.write(x1.pass.data(), x1.pass.size());
inout.seekg(0, std::ios_base::beg);
}
And finally, the last extraction would simply be this:
while (inout >> x2.name)
{
std::cout << x2.name;
}
Have stumbled upon this code to insert the contents of a file into a vector. Seems like a useful thing to learn how to do:
#include <iostream>
#include <fstream>
#include <iterator> // std::istreambuf_iterator
#include <vector>
int main() {
typedef std::vector<char> fileContainer;
std::ifstream testFile("testfile.txt");
fileContainer container;
container.assign(
(std::istreambuf_iterator<char>(testFile)),
std::istreambuf_iterator<char>());
return 0;
}
It works, but is this the best way to do such a thing? That is, to take the contents of any file type and insert them into an appropriate STL container. Is there a more efficient way of doing this than the above? As I understand it, it creates a testFile instance of ifstream and fills it with the contents of testfile.txt, then that copy is copied again into the container through assign. Seems like a lot of copying?
As for speed/efficiency, I'm not sure how to estimate the file size and use the reserve function with that; if I use reserve, it even appears to slow this code down. At the moment, swapping out vector and just using a deque seems to be quite a bit more efficient.
I'm not sure that there's a best way, but using the two iterator
constructor would be more idiomatic:
FileContainer container( (std::istreambuf_iterator<char>( testFile )),
(std::istreambuf_iterator<char>()) );
(I notice that you have the extra parentheses in your assign. They
aren't necessary there, but they are when you use the constructor.)
With regards to performance, it would be more efficient to pre-allocate
the data, something like:
FileContainer container( actualSizeOfFile );
std::copy( std::istreambuf_iterator<char>( testFile ),
std::istreambuf_iterator<char>(),
container.begin() );
This is slightly dangerous; if your estimation is too small, you'll
encounter undefined behavior. To avoid this, you could also do:
FileContainer container;
container.reserve( estimatedSizeOfFile );
container.insert( container.begin(),
std::istreambuf_iterator<char>( testFile ),
std::istreambuf_iterator<char>() );
Which of these two is faster will depend on the implementation; the last
time I measured (with g++), the first was slightly faster, but if you're
actually reading from file, the difference probably isn't measurable.
The problem with these two methods is that, despite other answers, there
is no portable way of finding the file size other than by actually
reading the file. Non-portable methods exist for some systems (fstat
under Unix), but on other systems, like Windows, there is no means
of finding the exact number of char you can read from a text file.
And of course, there's no guarantee that the results of tellg() will
even convert to an integral type, and that if it does, that they won't
be a magic cookie, with no numerical significance.
Having said that, in practice, the use of tellg() suggested by other
posters will often be "portable enough" (Windows and most Unix, at
least), and the results will often be "close enough"; they'll usually be
a little too high under Windows (since the results will count the
carriage return characters which won't be read), but in a lot of cases,
that's not a big problem. In the end, it's up to you to decide what
your requirements are with regards to portability and precision of the
size.
it creates a testFile instance of ifstream and fills it with the contents of testfile.txt
No, it opens testfile.txt and calls the handle testFile. There is one copy being made, from disk to memory. (Except that I/O is commonly done by another copy through kernel space, but you're not going to avoid that in a portable way.)
As for speed/efficiency, i'm not sure how to estimate the file size and use the reserve function with that
If the file is a regular file:
std::ifstream testFile("testfile.txt");
testFile.seekg(0, std::ios::end);
std::streampos size = testFile.tellg();
testFile.seekg(0, std::ios::beg);
std::vector<char> container;
container.reserve(size);
Then fill container as before. Or construct it as std::vector<char> container(size) and fill it with
testFile.read(&container.front(), size);
Which one is faster should be determined by profiling.
The std::ifstream is not filled with the contents of the file; the contents are read on demand. Some kind of buffering is involved, so the file is read in chunks of some kilobytes. Since stream iterators are InputIterators, it should be more efficient to call reserve on the vector first, but only if you already have that information or can make a good guess; otherwise you would have to iterate through the file contents twice.
People much more frequently want to read from a file into a string than a vector. If you can use that, you might want to see the answer I posted to a previous question.
A minor edit of the fourth test there will give this:
std::vector<char> s4;
file.seekg(0, std::ios::end);
s4.resize(file.tellg());
file.seekg(0, std::ios::beg);
file.read(&s4[0], s4.size());
My guess is that this should give performance essentially indistinguishable from the code using a string. Depending on your compiler/standard library, this is likely to be substantially faster than your current code (again, see the timing results there for some idea of the difference you're likely to see).
Also note that this gives a little extra ability to detect and diagnose errors. For example, you can check whether you successfully read the entire file by comparing s4.size() to file.gcount() (and/or check for file.eof()). This also makes it a bit easier to prevent problems by limiting the amount you read, in case somebody decides to see what happens when/if they try to use your program to read a file that's, say, 6 terabytes.
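Here is a sketch of what that checking could look like, assuming you want a hard cap on how much gets read. The helper name readCapped, the maxBytes parameter, and the complete flag are all made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper: reads at most maxBytes of a file and reports through
// `complete` whether the whole file was read.
std::vector<char> readCapped(const std::string& path, std::size_t maxBytes,
                             bool& complete)
{
    complete = false;
    std::vector<char> data;
    std::ifstream file(path, std::ios::binary);
    if (!file) return data;

    file.seekg(0, std::ios::end);
    const std::streamoff size = file.tellg();
    file.seekg(0, std::ios::beg);
    if (size < 0) return data;  // tellg() failed

    const std::size_t wanted =
        std::min(static_cast<std::size_t>(size), maxBytes);
    data.resize(wanted);

    std::size_t got = 0;
    if (wanted > 0) {
        file.read(&data[0], static_cast<std::streamsize>(wanted));
        got = static_cast<std::size_t>(file.gcount());  // bytes actually read
    }
    data.resize(got);
    complete = (got == static_cast<std::size_t>(size));
    return data;
}
```

The caller can then decide what an incomplete read means: an error, or simply "the file was bigger than I am willing to load".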
There is definitely a better way if you want to make it efficient. You can check the file size, pre-allocate vector and read directly into vector's memory. A simple example:
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <iostream>
using namespace std;
int main ()
{
int fd = open ("test.data", O_RDONLY);
if (fd == -1)
{
perror ("open");
return EXIT_FAILURE;
}
struct stat info;
int res = fstat (fd, &info);
if (res != 0)
{
perror ("fstat");
return EXIT_FAILURE;
}
std::vector<char> data;
if (info.st_size > 0)
{
data.resize (info.st_size);
ssize_t x = read (fd, &data[0], data.size ());
if (x != info.st_size)
{
perror ("read");
return EXIT_FAILURE;
}
cout << "Data (" << info.st_size << "):\n";
cout.write (&data[0], data.size ());
}
}
There are other more efficient ways for some tasks. For example, to copy file without transferring data to and from user space, you can use sendfile etc.
It does work, and it is convenient, but there are many situations where it is a bad idea.
Error handling in a user-edited file, for example. If the user has hand edited a data file or it has been imported from a spreadsheet or even a database with lax field definitions, then this method of filling the vector will result in a simple error with no detail.
In order to process the file and report where the error happened, you need to read it line by line and attempt the conversion to a number on each line. Then you can report the line number and the text that failed to convert. This is extremely useful. Without this feature the user is left to wonder which line caused the problem instead of being able to immediately fix it.
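A minimal sketch of that line-by-line approach, assuming one integer per line. parseIntLines is a made-up name, and throwing std::runtime_error is just one possible way to report the problem.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: parses one integer per line and reports the first
// line that fails to convert, including its line number and text.
std::vector<int> parseIntLines(std::istream& in)
{
    std::vector<int> values;
    std::string line;
    std::size_t lineNo = 0;
    while (std::getline(in, line)) {
        ++lineNo;
        try {
            std::size_t used = 0;
            const int v = std::stoi(line, &used);
            if (used != line.size())
                throw std::invalid_argument("trailing characters");
            values.push_back(v);
        } catch (const std::logic_error&) {
            // std::invalid_argument and std::out_of_range both derive
            // from std::logic_error.
            throw std::runtime_error("line " + std::to_string(lineNo) +
                                     ": cannot convert \"" + line + "\"");
        }
    }
    return values;
}
```

Feeding it a std::istringstream holding "1\nabc\n3\n" raises an error whose message starts with "line 2". Note that this strict version also rejects lines ending in a carriage return, so files with Windows line endings would need trimming first.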
I have an array of precomputed integers with a fixed size of 15M values. I need to load these values at program start. Currently it takes up to 2 minutes to load; the file size is ~130 MB. Is there any way to speed up loading? I'm free to change the save process as well.
std::array<int, 15000000> keys;
std::string config = "config.dat";
// how array is saved
std::ofstream out(config.c_str());
std::copy(keys.cbegin(), keys.cend(),
std::ostream_iterator<int>(out, "\n"));
// load of array
std::ifstream in(config.c_str());
std::copy(std::istream_iterator<int>(in),
std::istream_iterator<int>(), keys.begin());
in.close();
Thanks in advance.
SOLVED. Used the approach proposed in accepted answer. Now it takes just a blink.
Thanks all for your insights.
You have two issues regarding the speed of your write and read operations.
First, std::copy cannot do a block copy optimization when writing to an output_iterator because it doesn't have direct access to underlying target.
Second, you're writing the integers out as ascii and not binary, so for each iteration of your write output_iterator is creating an ascii representation of your int and on read it has to parse the text back into integers. I believe this is the brunt of your performance issue.
The raw storage of your array (assuming a 4 byte int) should only be 60MB, but since each character of an integer in ascii is 1 byte any ints with more than 4 characters are going to be larger than the binary storage, hence your 130MB file.
There is not an easy way to solve your speed problem portably (so that the file can be read on different endian or int sized machines) or when using std::copy. The easiest way is to just dump the whole of the array to disk and then read it all back using fstream.write and read, just remember that it's not strictly portable.
To write:
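std::fstream out(config.c_str(), std::ios::out | std::ios::binary);
// write() takes a const char*, so the int pointer has to be cast.
out.write( reinterpret_cast<const char*>(keys.data()), keys.size() * sizeof(int) );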
And to read:
std::fstream in(config.c_str(), std::ios::in | std::ios::binary);
// read() takes a char*, so the int pointer has to be cast.
in.read( reinterpret_cast<char*>(keys.data()), keys.size() * sizeof(int) );
----Update----
If you are really concerned about portability you could easily use a portable format (like your initial ascii version) in your distribution artifacts then when the program is first run it could convert that portable format to a locally optimized version for use during subsequent executions.
Something like this perhaps:
std::array<int, 15000000> keys;
// data.txt are the ascii values and data.bin is the binary version
if(!file_exists("data.bin")) {
std::ifstream in("data.txt");
std::copy(std::istream_iterator<int>(in),
std::istream_iterator<int>(), keys.begin());
in.close();
std::fstream out("data.bin", std::ios::out | std::ios::binary);
out.write( reinterpret_cast<const char*>(keys.data()), keys.size() * sizeof(int) );
} else {
std::fstream in("data.bin", std::ios::in | std::ios::binary);
in.read( reinterpret_cast<char*>(keys.data()), keys.size() * sizeof(int) );
}
If you have an install process this preprocessing could also be done at that time...
Attention. Reality check ahead:
Reading integers from a large text file is an I/O-bound operation unless you're doing something completely wrong (like using C++ streams for this). Loading 15M integers from a text file takes less than 2 seconds on an AMD64 @ 3 GHz when the file is already buffered (and only a bit longer if it has to be fetched from a sufficiently fast disk). Here's a quick & dirty routine to prove my point (that's why I do not check for all possible errors in the format of the integers, nor close my files at the end, because I exit() anyway).
$ wc nums.txt
15000000 15000000 156979060 nums.txt
$ head -n 5 nums.txt
730547560
-226810937
607950954
640895092
884005970
$ g++ -O2 read.cc
$ time ./a.out <nums.txt
=>1752547657
real 0m1.781s
user 0m1.651s
sys 0m0.114s
$ cat read.cc
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <vector>
int main()
{
int c; // getchar() returns int; a char cannot reliably hold EOF
int num=0;
int pos=1;
int line=1;
std::vector<int> res;
while(c=getchar(),c!=EOF)
{
if (c>='0' && c<='9')
num=num*10+c-'0';
else if (c=='-')
pos=0;
else if (c=='\n')
{
res.push_back(pos?num:-num);
num=0;
pos=1;
line++;
}
else
{
printf("I've got a problem with this file at line %d\n",line);
exit(1);
}
}
// make sure the optimizer does not throw vector away, also a check.
unsigned sum=0;
for (int i=0;i<res.size();i++)
{
sum=sum+(unsigned)res[i];
}
printf("=>%d\n",sum);
}
UPDATE: and here's my result when read the text file (not binary) using mmap:
$ g++ -O2 mread.cc
$ time ./a.out nums.txt
=>1752547657
real 0m0.559s
user 0m0.478s
sys 0m0.081s
code's on pastebin:
http://pastebin.com/NgqFa11k
What do I suggest
1-2 seconds is a realistic lower bound for loading this data on a typical desktop machine. 2 minutes sounds more like a 60 MHz microcontroller reading from a cheap SD card. So either you have an undetected/unmentioned hardware condition, or your implementation of C++ streams is somehow broken or unusable. I suggest establishing a lower bound for this task on your machine by running my sample code.
If the integers are saved in binary format and you're not concerned with endianness problems, try reading the entire file into memory at once (fread) and casting the pointer to int *.
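A sketch of the fread route, under the same assumptions (raw native-endian ints, element count known up front). The helper name loadIntsWithFread is made up. It reads straight into a std::vector<int>, which sidesteps the pointer cast entirely.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: loads a binary file of native-endian ints with a
// single fread() directly into int storage.
std::vector<int> loadIntsWithFread(const char* path, std::size_t expectedCount)
{
    std::vector<int> values(expectedCount);
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return {};

    // fread() returns the number of complete elements actually read.
    const std::size_t got =
        values.empty() ? 0
                       : std::fread(values.data(), sizeof(int), values.size(), f);
    std::fclose(f);

    values.resize(got);  // shrink if the file held fewer values than expected
    return values;
}
```

For the 15M-value case that would be loadIntsWithFread("config.dat", 15000000); comparing the returned size with the expected count is a cheap sanity check on the file.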
You could precompile the array into a .o file, which wouldn't need to be recompiled unless the data changes.
thedata.hpp:
static const int NUM_ENTRIES = 5;
extern int thedata[NUM_ENTRIES];
thedata.cpp:
#include "thedata.hpp"
int thedata[NUM_ENTRIES] = {
10
,200
,3000
,40000
,500000
};
To compile this:
# make thedata.o
Then your main application would look something like:
#include <iostream>
#include "thedata.hpp"
using namespace std;
int main() {
for (int i=0; i<NUM_ENTRIES; i++) {
cout << thedata[i] << endl;
}
}
Assuming the data doesn't change often, and that you can process the data to create thedata.cpp, then this is effectively instant loadtime. I don't know if the compiler would choke on such a large literal array though!
Save the file in a binary format.
Write the file by taking a pointer to the start of your int array and convert it to a char pointer. Then write the 15000000*sizeof(int) chars to the file.
And when you read the file, do the same in reverse: read the file as a sequence of chars, take a pointer to the beginning of the sequence, and convert it to an int*.
Of course, this assumes that endianness isn't an issue.
For actually reading and writing the file, memory mapping is probably the most sensible approach.
If the numbers never change, preprocess the file into a C++ source and compile it into the application.
If the numbers can change, so that you have to keep them in a separate file loaded at startup, then avoid reading them number by number using C++ IO streams. C++ IO streams are a nice abstraction, but they carry too much overhead for a task as simple as loading a bunch of numbers fast. In my experience, a huge part of the run time is spent parsing the numbers and another part accessing the file char by char.
(Assuming your file is more than a single long line.) Read the file line by line using std::getline(), and parse the numbers out of each line using std::strtol() rather than streams. This avoids a huge part of the overhead. You can get more speed out of the streams by crafting your own variant of std::getline() that reads the input ahead (using istream::read()); the standard std::getline() also reads input char by char.
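The getline() plus strtol() combination could look roughly like this. parseLinesWithStrtol is a made-up name, and this version simply stops at the first thing on a line that is not a number instead of reporting it.

```cpp
#include <cerrno>
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads line by line and pulls whitespace-separated
// integers out of each line with std::strtol() instead of operator>>.
std::vector<long> parseLinesWithStrtol(std::istream& in)
{
    std::vector<long> values;
    std::string line;
    while (std::getline(in, line)) {
        const char* p = line.c_str();
        for (;;) {
            char* end = nullptr;
            errno = 0;
            const long v = std::strtol(p, &end, 10);
            if (end == p) break;  // no further number on this line
            if (errno != ERANGE) values.push_back(v);  // skip out-of-range input
            p = end;              // continue after the parsed number
        }
    }
    return values;
}
```

It takes any std::istream, so it can be tried on a std::istringstream before pointing it at the real std::ifstream.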
Use a buffer of 1000 (or even 15M, you can modify this size as you please) integers, not integer after integer. Not using a buffer is clearly the problem in my opinion.
If the data in the file is binary, you don't have to worry about endianness, and you're on a system that supports it, use the mmap system call. See this article on IBM's website:
High-performance network programming, Part 2: Speed up processing at both the client and server
Also see this SO post:
When should I use mmap for file access?
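For a POSIX system, a minimal mmap sketch might look like the following, assuming the file holds raw native-endian ints. The helper name loadIntsWithMmap is made up. It copies the values into a vector to keep the example short; a real program would more likely keep the mapping alive and use the data in place.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cstddef>
#include <vector>

// Hypothetical helper (POSIX only): maps a binary int file read-only and
// copies the values out of the mapping.
std::vector<int> loadIntsWithMmap(const char* path)
{
    std::vector<int> values;
    const int fd = open(path, O_RDONLY);
    if (fd == -1) return values;

    struct stat info;
    if (fstat(fd, &info) == 0 && info.st_size > 0) {
        const std::size_t bytes = static_cast<std::size_t>(info.st_size);
        void* p = mmap(nullptr, bytes, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            const int* first = static_cast<const int*>(p);
            values.assign(first, first + bytes / sizeof(int));
            munmap(p, bytes);
        }
    }
    close(fd);
    return values;
}
```

On Windows the equivalent would go through a different API, so this is not portable as written.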