Block-level copying of data between streambuffers

I would like to copy data efficiently between std::streambuf instances. That is, I would like to shovel blocks of data between them, as opposed to performing character-by-character copying. For example, this is not what I am looking for:
stringbuf in{ios_base::in};
stringbuf out{ios_base::out};
copy(istreambuf_iterator<char>{&in},
     istreambuf_iterator<char>{},
     ostreambuf_iterator<char>{&out});
There exists syntactic sugar for this, with a bit more error checking:
ostream os{&out};
os << &in;
Here's a snippet of the implementation of operator<<(basic_streambuf<..>*) in my standard library (Mac OS X, Xcode 7):
typedef istreambuf_iterator<_CharT, _Traits> _Ip;
typedef ostreambuf_iterator<_CharT, _Traits> _Op;
_Ip __i(__sb);
_Ip __eof;
_Op __o(*this);
size_t __c = 0;
for (; __i != __eof; ++__i, ++__o, ++__c)
{
    *__o = *__i;
    if (__o.failed())
        break;
}
The bottom line is: this is still per-character copying. I was hoping the standard library would use an algorithm that relies on the block-level member functions of streambuffers, sgetn and sputn, as opposed to per-character transport. Does the standard library provide such an algorithm, or do I have to roll my own?

I'm afraid that the answer is: it is not possible with the current design of the standard library. The reason is that streambuffers completely hide the character sequence they manage. This makes it impossible to directly copy bytes from the get area of one streambuffer to the put area of another.
If the "input" streambuffer would expose its internal buffer, then the "output" streambuffer could just use sputn(in.data(), in.size()). Or more obviously: if the output buffer also exposed its internal buffer, then one could use plain memcpy to shovel bytes between the two. Other I/O libraries operate in this fashion: the stream implementation of Google's Protocol Buffers, for example. Boost IOStreams has an optimized implementation to copy between streams. In both cases, efficient block-level copying is possible because the streambuffer equivalent provides access to its intermediary buffer.
In fact, streambuffers ironically do not even need to have a buffer: when operating unbuffered, each read/write goes directly to the underlying device. Presumably this is one reason why the standard library does not support introspection. The unfortunate consequence is that no efficient copying between input and output streambuffers is possible. Block-level copying requires an intermediary buffer, and a copy algorithm would operate as follows (see the sketch after the list):
1. Read from the input streambuffer via sgetn into the intermediary buffer.
2. Write from the intermediary buffer into the output streambuffer via sputn.
3. Go to 1. until the input is exhausted or writes to the output streambuffer fail.
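A minimal sketch of such a loop, assuming the function name and the 4 KiB block size are arbitrary choices:

#include <ios>
#include <streambuf>

// Sketch: block-level copy through an intermediary buffer; returns the
// number of characters written.
std::streamsize copy_streambuf(std::streambuf& in, std::streambuf& out)
{
    char block[4096];
    std::streamsize total = 0;
    for (;;)
    {
        std::streamsize n = in.sgetn(block, sizeof block);   // step 1
        if (n == 0)
            break;                                           // input exhausted
        std::streamsize written = out.sputn(block, n);       // step 2
        total += written;
        if (written != n)
            break;                                           // output failed
    }
    return total;
}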

Related

Reading large strings in C++ -- is there a safe fast way?

http://insanecoding.blogspot.co.uk/2011/11/how-to-read-in-file-in-c.html reviews a number of ways of reading an entire file into a string in C++. The key code for the fastest option looks like this:
std::string contents;
in.seekg(0, std::ios::end);
contents.resize(in.tellg());
in.seekg(0, std::ios::beg);
in.read(&contents[0], contents.size());
Unfortunately, this is not safe as it relies on the string being implemented in a particular way. If, for example, the implementation were sharing strings, then modifying the data at &contents[0] could affect strings other than the one being read. (More generally, there's no guarantee that this won't trash arbitrary memory -- it's unlikely to happen in practice, but it's not good practice to rely on that.)
C++ and the STL are designed to provide features that are as efficient as C, so one would expect there to be a version of the above that was just as fast but guaranteed to be safe.
In the case of vector<T>, there are functions that provide access to the raw data:
T* vector::data();
const T* vector::data() const;
The first of these can be used to read a vector<T> efficiently. Unfortunately, the string equivalent only provides the const variant:
const char* string::data() const noexcept;
So this cannot be used to read a string efficiently. (Presumably the non-const variant is omitted to support the shared string implementation.)
I have also checked the string constructors, but the ones that accept a char* copy the data -- there's no option to move it.
Is there a safe and fast way of reading the whole contents of a file into a string?
It may be worth noting that I want to read into a string rather than a vector<char> so that I can access the resulting data using an istringstream. There's no equivalent of that for vector<char>.
If you really want to avoid copies, you can slurp the file into a std::vector<char>, and then roll your own std::basic_stringbuf to pull data from the vector.
You can then declare a std::istringstream and use std::basic_ios::rdbuf to replace the input buffer with your own one.
The caveat is that if you choose to call istringstream::str it will invoke std::basic_stringbuf::str and will require a copy. But then, it sounds like you won't be needing that function, and can actually stub it out.
Whether you get better performance this way would require actual measurement. But at least you avoid having to have two large contiguous memory blocks during the copy. Additionally, you could use something like std::deque as your underlying structure if you want to cope with truly huge files that cannot be allocated in contiguous memory.
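For example, here is a minimal sketch of such a buffer, taking the simpler route of deriving from std::streambuf directly and pointing the get area at the vector's storage (the class name is made up):

#include <istream>
#include <streambuf>
#include <vector>

// Sketch: a read-only streambuf whose get area is the vector's own
// storage, so the slurped data is never copied.
struct vector_streambuf : std::streambuf
{
    explicit vector_streambuf(std::vector<char>& v)
    {
        setg(v.data(), v.data(), v.data() + v.size());
    }
};

// Usage:
//   std::vector<char> data = /* slurped from the file */;
//   vector_streambuf buf(data);
//   std::istream is(&buf);   // formatted reads, no extra copy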
It's also worth mentioning that if you're really just streaming that data you are essentially double-buffering by reading it into a string first. Unless you also require the contents in memory for some other purpose, the buffering inside std::ifstream is likely to be sufficient. If you do slurp the file, you may get a boost by turning buffering off.
I think using &string[0] is just fine, and it should work with the widely used standard library implementations (even if it is technically UB).
But since you mention that you want to put the data into an istringstream, here's an alternative:
1. Read the data into a char array (new char[in.tellg()]).
2. Construct a stringstream (without the leading 'i').
3. Insert the data with stringstream::write.
The istringstream would have to copy the data anyway, because a std::stringstream doesn't store a std::string internally as far as I'm aware, so you can leave the std::string out and put the data into the stream directly.
EDIT: Actually, instead of the manual allocation (or make_unique), you could also use the vector<char> you mentioned this way.
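A minimal sketch of that approach, using the vector<char> variant (the function name and file path are placeholders):

#include <cstddef>
#include <fstream>
#include <sstream>
#include <vector>

std::stringstream slurp_to_stream(const char* path)
{
    std::ifstream in(path, std::ios::binary);
    in.seekg(0, std::ios::end);
    std::vector<char> data(static_cast<std::size_t>(in.tellg()));
    in.seekg(0, std::ios::beg);
    in.read(data.data(), data.size());

    std::stringstream ss;                 // note: no leading 'i'
    ss.write(data.data(), data.size());   // one copy into the stream's buffer
    return ss;
}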

Boost::knuth_morris_pratt over a std::istream

I would like to use boost::algorithm::knuth_morris_pratt over some huge files (several hundred gigabytes). This means I can't just read the whole file into memory nor mmap it; I need to read it in chunks.
knuth_morris_pratt operates on an iterator, so I guess it is possible to make it read input data "lazily" (on demand); it would be a matter of writing a "lazy" iterator for some file access class like ifstream, or better, istream.
I would like to know if there is some adapter available (already written) that adapts istream to Boost's knuth_morris_pratt so that it won't read all file data all at once?
I know there is a boost::spirit::istream_iterator, but it lacks some methods (like operator+), so it would have to be modified to work.
On StackOverflow there's an implementation of a bidirectional_iterator here, but it still requires some work before it can be used with knuth_morris_pratt.
Are there any istream iterators that are already written, tested and working?
Update: I can't do mmap, because my software should work on multiple operating systems, both on 32-bit and 64-bit architectures. Also, very often I don't have the files anyway; they're being generated on the fly. That's why I'm looking for a solution that involves streams.
You should simply memory map it.
In practice, 64-bit processors usually have a 48-bit address space, which is enough for 256 terabytes of memory.
Last I checked, Linux allows 128TB of virtual address space per process on x86-64
(from https://superuser.com/a/168121/75721)
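If memory mapping is viable after all, here is a minimal sketch using Boost.Iostreams (the file name and pattern are placeholders; note the return type of knuth_morris_pratt_search differs between Boost versions):

#include <boost/algorithm/searching/knuth_morris_pratt.hpp>
#include <boost/iostreams/device/mapped_file.hpp>
#include <string>

int main()
{
    boost::iostreams::mapped_file_source file("huge.bin");
    const std::string pattern = "needle";
    auto hit = boost::algorithm::knuth_morris_pratt_search(
        file.data(), file.data() + file.size(),
        pattern.begin(), pattern.end());
    // 'hit' locates the first match (or the end of the corpus if none).
}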
Spirit's istream_iterator is actually its multi_pass adaptor, and it has different design goals. Unless you have a way to flush the buffer, it will grow indefinitely (by default allocating a deque buffer).

What exactly is streambuf? How do I use it?

I'm trying to learn a bit more about how I/O streams work in C++, and I'm really confused at when to use what.
What exactly is a streambuf?
When do I use a streambuf, as compared to a string, an istream, or a vector? (I already know the last three, but not how streambuf compares to them, if it does at all.)
With the help of streambuf, we can work at an even lower level: it allows access to the underlying buffers.
Here are some good examples: Copy, load, redirect and tee using C++ streambufs. On the comparison question, this might be helpful.
See the IOstream Library for more details.
Stream buffers represent input or output devices and provide a low level interface for unformatted I/O to that device. Streams, on the other hand, provide a higher level wrapper around the buffer by way of basic unformatted I/O functions and especially via formatted I/O functions (i.e., operator<< and operator>> overloads). Stream objects may also manage a stream buffer's lifetime.
For example a file stream has an internal file stream buffer. The stream manages the lifetime of the buffer and the buffer is what provides actual read and write capabilities to a file. The stream's formatting operators ultimately access the stream buffer's unformatted I/O functions, so you only ever have to use the stream's I/O functions, and don't need to touch the buffer's I/O functions directly.
Another way to understand the differences is to look at the different uses they make of locale objects. Streams use the facets that have to do with formatting such as numpunct and num_get. You can also expect that the overloads of stream operator<< and operator>> for custom time or money data types will use the time and money formatting facets. Stream buffers, however, use the codecvt facets in order to convert between the units their interface uses and bytes. So, for example, the interface for basic_streambuf<char16_t> uses char16_t and so basic_streambuf<char16_t> internally uses codecvt<char16_t, char, mbstate_t> by default to convert the formatted char16_t units written to the buffer to char units written to the underlying device. So you can see that streams are mostly for formatting and stream buffers provide a low level interface for unformatted input or output to devices which may use a different, external encoding.
You can use a stream buffer when you want only unformatted access to an I/O device. You can also use stream buffers if you want to set up multiple streams that share a stream buffer (though you'll have to carefully manage the lifetime of the buffer). There are also special purpose stream buffers you might want to use, such as wbuffer_convert in C++11 which acts as a façade for a basic_streambuf<char> to make it look like a wide character stream buffer. It uses the codecvt facet it's constructed with instead of using the codecvt facet attached to any locale. You can usually achieve the same effect by simply using a wide stream buffer imbued with a locale that has the appropriate facet.
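For instance, a minimal sketch of two streams sharing one stream buffer (the file name is a placeholder):

#include <fstream>
#include <ostream>

int main()
{
    std::ofstream file("log.txt");
    std::ostream second(file.rdbuf());   // second stream, same filebuf

    file   << "written via the first stream\n";
    second << "written via the second stream\n";
    // The streams perform the formatting; the shared filebuf does the
    // actual device I/O, so both lines end up in the same file.
}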

Performance of ostream_iterator for writing numeric data to a file?

I've got various std::vector instances with numeric data in them, primarily int16_t, int32_t, etc. I'd like to dump this data to a file in as fast a manner as possible. If I use an ostream_iterator, will it write the entire block of memory in a single operation, or will it iterate over the elements of the vector, issuing a write operation for each one?
A stream iterator and a vector will definitely not use a block copy in any implementation I'm familiar with. If the vector's item type were a class rather than a POD, for example, a direct copy would be a bad thing. I suspect the ostream will format the output as well, rather than writing the values directly (i.e., ASCII instead of binary output).
You might have better luck with boost::copy, as it's specifically optimized to do block writes when possible, but the most practical solution is to operate on the vector memory directly using &v[0].
Most ofstream implementations I know of do buffer data, so you probably will not end up doing an excessive number of writes. The buffer in the ofstream has to fill up before an actual write is done, and most OSes buffer file data underneath this, too. The interplay of these is not at all transparent from the C++ application level; selection of buffer sizes and so on is left up to the implementation.
C++ does provide a way to supply your own buffer to an ostream's streambuf. You can try calling pubsetbuf like this:
char *mybuffer = new char[bufsize];
os.rdbuf()->pubsetbuf(mybuffer, bufsize);
The downside is that this doesn't necessarily do anything. Some implementations just ignore it.
The other option you have if you want to buffer things and still use ostream_iterator is to use an ostringstream, e.g.:
ostringstream buffered_chars;
copy(data.begin(), data.end(), ostream_iterator<char>(buffered_chars, " "));
string buffer(buffered_chars.str());
Then once all your data is buffered, you can write the entire buffer using one big ostream::write(), POSIX I/O, etc.
This can still be slow, though, since you're doing formatted output, and you have to have two copies of your data in memory at once: the raw data and the formatted, buffered data. If your application pushes the limits of memory already, this isn't the greatest way to go, and you're probably better off using the built-in buffering that ofstream gives you.
Finally, if you really want performance, the fastest way to do this is to dump the raw memory to disk using ostream::write() as Neil suggests, or to use your OS's I/O functions. The disadvantage here is that your data isn't formatted, your file probably isn't human-readable, and it isn't easily readable on architectures with a different endianness than the one you wrote from. But it will get your data to disk fast and without adding memory requirements to your application.
The quickest (but most horrible) way to dump a vector will be to write it in one operation with ostream::write:
os.write( (char *) &v[0], v.size() * sizeof( v[0] ) );
You can make this a bit nicer with a template function:
template <typename T>
std::ostream & DumpVec( std::ostream & os, const std::vector<T> & v ) {
    return os.write( reinterpret_cast<const char *>( &v[0] ), v.size() * sizeof( T ) );
}
which allows you to say things like:
vector <unsigned int> v;
ofstream f( "file.dat" );
...
DumpVec( f, v );
Reading it back in will be a bit problematic, unless you prefix the write with the size of the vector somehow (or the vectors are fixed-size), and even then you will have problems across different endiannesses and/or 32- vs. 64-bit architectures, as several people have pointed out.
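A minimal sketch of the size-prefix idea (fixed-width count; the endianness and word-size caveats still apply, and the function names are made up):

#include <cstdint>
#include <istream>
#include <ostream>
#include <vector>

template <typename T>
void DumpVecSized( std::ostream & os, const std::vector<T> & v ) {
    std::uint64_t n = v.size();   // fixed-width prefix
    os.write( reinterpret_cast<const char *>( &n ), sizeof n );
    os.write( reinterpret_cast<const char *>( v.data() ), n * sizeof( T ) );
}

template <typename T>
void ReadVecSized( std::istream & is, std::vector<T> & v ) {
    std::uint64_t n = 0;
    is.read( reinterpret_cast<char *>( &n ), sizeof n );
    v.resize( n );
    is.read( reinterpret_cast<char *>( v.data() ), n * sizeof( T ) );
}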
I guess that's implementation-dependent. If you don't get the performance you want, you can always memory-map the result file and memcpy the std::vector data into the mapping.
If you construct the ostream_iterator with an ofstream, that will make sure the output is buffered:
ofstream ofs("file.txt");
ostream_iterator<int> osi(ofs, ", ");
copy(v.begin(), v.end(), osi);
The ofstream object is buffered, so anything written to the stream gets buffered before being written to disk.
You haven't written how you want to use the iterators (I'll presume std::copy) or whether you want to write the data in binary or as text.
I would expect a decent implementation of std::copy to dispatch to std::memcpy for PODs with raw pointers as iterators (Dinkumware, for example, does so). However, with ostream iterators, I don't think any implementation of std::copy will do this, as it doesn't have direct access to the ostream's buffer to write into.
The streams themselves do buffer, though.
In the end, I would write the simplest possible code first, and measure this. If it's fast enough, move on to the next problem. If this is code of the sort that cannot be fast enough, you'll have to resort to OS-specific tricks anyway.
It will iterate over the elements. Iterators don't let you mess with more than one item at a time. Also, IIRC, it will convert your integers to their ASCII representations.
If you want to write everything in the vector, in binary, to the file in one step via an ostream, you want something like:
template<class T>
void WriteArray(std::ostream& os, const std::vector<T>& v)
{
    os.write(reinterpret_cast<const char*>(&v[0]), v.size() * sizeof(T));
}

C++ stl stringstream direct buffer access

This should be pretty common, yet I find it fascinating that I couldn't find any straightforward solution.
Basically I read in a file over the network into a stringstream. This is the declaration:
std::stringstream membuf(std::ios::in | std::ios::out | std::ios::binary);
Now I have some C library that wants direct access to the read chunk of memory. How do I get that? Read-only access is OK. After the C function is done, I dispose of the stringstream; I have no further need for it.
str() copies the buffer, which seems unnecessary and doubles the memory.
Am I missing something obvious? Maybe a different stl class would work better.
Edit:
Apparently, stringstream is not guaranteed to be stored contiguously. What is?
If I use vector<char>, how do I get the byte buffer?
You can take full control of the buffer by writing the buffer yourself and using it in the stringstream:
stringstream membuf(std::ios::in | std::ios::out | std::ios::binary);
membuf.rdbuf(yourVeryOwnStreamBuf);
Your own buffer should be derived from basic_streambuf, and override the sync() and overflow() methods appropriately.
For your internal representation you could probably use something like vector< char >, and reserve() it to the needed size so that no reallocations and copies are done.
This implies you know an upper bound for the space needed in advance. But if you don't know the size in advance and need a contiguous buffer in the end, copies are of course unavoidable.
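A minimal sketch of such a buffer, writing straight into a vector<char> (the class name is made up; sync() can stay a no-op since everything is already in memory):

#include <streambuf>
#include <vector>

// Sketch: an output streambuf that appends into a vector<char>. The data
// stays contiguous, so it can be handed to a C library via
// buf.data.data() and buf.data.size().
struct vector_outbuf : std::streambuf
{
    std::vector<char> data;

protected:
    int_type overflow(int_type ch) override
    {
        if (!traits_type::eq_int_type(ch, traits_type::eof()))
            data.push_back(traits_type::to_char_type(ch));
        return traits_type::not_eof(ch);
    }

    std::streamsize xsputn(const char* s, std::streamsize n) override
    {
        data.insert(data.end(), s, s + n);   // bulk append, no per-char calls
        return n;
    }
};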
std::stringstream doesn't (necessarily) store its buffer contiguously but can allocate chunks as it is gradually filled. If you then want all of its data in a contiguous region of memory then you will need to copy it and that is what str() does for you.
Of course, if you want to use or write a class with a different storage strategy then you can, but you don't then need to use std::stringstream at all.
You can call str() to get back a std::string. From there you can call c_str() on the std::string to get a char*. Note that c_str() isn't officially supported for this use, but everyone uses it this way :)
Edit
This is probably a better solution: std::istream::read. From the example on that page:
// allocate memory (length determined beforehand, e.g. via seekg/tellg):
char * buffer = new char [length];
// read data as a block:
is.read (buffer, length);
Well, if you are seriously concerned about storage, you can get closer to the metal. basic_stringstream has a method, rdbuf(), which returns its basic_stringbuf (which is derived from basic_streambuf). You can then use the eback(), egptr(), and gptr() pointers to access characters directly out of the buffer; note that these are protected members, so you will need to derive your own class to reach them. I've used this machinery in the past to implement a custom buffer with my desired semantics, so it is do-able.
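For example, a minimal sketch of exposing the get area through a thin derived class (the class name is made up for illustration):

#include <sstream>

// Sketch: eback()/gptr()/egptr() are protected, so a small derived class
// is one way to expose the get area of a stringbuf without copying.
// Note: egptr() marks the end of the *current* get area, which an
// implementation may grow lazily as the stream is read.
struct exposing_stringbuf : std::stringbuf
{
    using std::stringbuf::stringbuf;
    const char* begin() const { return eback(); }   // start of get area
    const char* end()   const { return egptr(); }   // one past the end
};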
Beware, this is not for the faint of heart! Set aside a few days, read Standard C++ IOStreams and Locales, or similar nitpicky reference, and be careful...