Should I preallocate std::stringstream? - c++

I use std::stringstream extensively to construct strings and error messages in my application. The stringstreams are usually very short-lived automatic variables.
Will such usage cause heap reallocation for every variable? Should I switch from temporary to class-member stringstream variable?
In latter case, how can I reserve stringstream buffer? (Should I initialize it with a large enough string or is there a more elegant method?)

Have you profiled your execution, and found them to be a source of slow down?
Consider their usage. Are they mostly for error messages outside the normal flow of your code?
As far as reserving space...
Some implementations probably reserve a small buffer before any allocation takes place for the stringstream. Many implementations of std::string do this.
Another option might be (untested!)
std::string str;
str.reserve(50);
std::stringstream sstr(str);
You might find some more ideas in this gamedev thread.
edit:
Mucking around with the stringstream's rdbuf might also be a solution. This approach is probably Very Easy To Get Wrong though, so please be sure it's absolutely necessary. Definitely not elegant or concise.

Although "mucking around with the stringstream's rdbuf...is probably Very Easy To Get Wrong", I went ahead and hacked together a proof-of-concept anyway for fun, as it has always bugged me that there is no easy way to reserve storage for stringstream. Again, as @luke said, you are probably better off optimizing what your profiler tells you needs optimizing, so this is just to address "What if I want to do it anyway?".
Instead of mucking around with stringstream's rdbuf, I made my own, which does pretty much the same thing. It implements only the minimum, and uses a string as a buffer. Don't ask me why I called it a VECTOR_output_stream. This is just a quickly-hacked-together thing.
constexpr auto preallocated_size = 256;
auto stream = vector_output_stream(preallocated_size);
stream << "My parrot ate " << 3 << " cookies.";
cout << stream.str() << endl;
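The class behind that snippet is not shown here. A minimal sketch of the idea, assuming a stream buffer that appends to a std::string whose capacity is reserved up front, might look like this. The names reserving_buf and reserving_ostream are hypothetical and are not the original vector_output_stream.

```cpp
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical stream buffer: every character written is appended to a
// std::string whose capacity was reserved in the constructor.
class reserving_buf : public std::streambuf {
public:
    explicit reserving_buf(std::size_t reserved) { data_.reserve(reserved); }
    const std::string& str() const { return data_; }

protected:
    // Called for single characters because no put area is ever set up.
    int_type overflow(int_type ch) override {
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            data_.push_back(traits_type::to_char_type(ch));
        }
        return traits_type::not_eof(ch);
    }

    // Called for character runs; appends them in one go.
    std::streamsize xsputn(const char* s, std::streamsize n) override {
        data_.append(s, static_cast<std::size_t>(n));
        return n;
    }

private:
    std::string data_;
};

// Hypothetical ostream wrapper that owns the buffer above.
class reserving_ostream : public std::ostream {
public:
    explicit reserving_ostream(std::size_t reserved)
        : std::ostream(nullptr), buf_(reserved) {
        rdbuf(&buf_);  // attach the buffer once it is constructed; also clears badbit
    }
    const std::string& str() const { return buf_.str(); }

private:
    reserving_buf buf_;
};
```

As long as the output stays within the reserved capacity, the string should not need to reallocate; beyond that it grows like any other std::string.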

The Bad
This is an old question, but even as of C++1z/C++2a in Visual Studio 2019, stringstream has no ideal way of reserving a buffer.
The other answers to this question do not work at all, for the following reasons:
calling reserve on an empty string yields an empty string, so stringstream constructor doesn't need to allocate to copy the contents of that string.
seekp on a stringstream still seems to be undefined behavior and/or does nothing.
The Good
This code segment works as expected, with ss being preallocated with the requested size.
std::string dummy(reserve, '\0');
std::stringstream ss(dummy);
dummy.clear();
dummy.shrink_to_fit();
The code can also be written as a one-liner std::stringstream ss(std::string(reserve, '\0'));.
The Ugly
What really happens in this code segment is the following:
dummy is preallocated with the reserve, and the buffer is subsequently filled with null bytes (required for the constructor).
stringstream is constructed with dummy. This copies the entire string's contents into an internal buffer, which is preallocated.
dummy is then cleared and shrunk, freeing its allocation.
This means that in order to preallocate a stringstream, two allocations, one fill, and one copy take place. The worst part is that during the expression, twice as much memory as the desired allocation is needed. Yikes!
For most use cases, this might not matter at all and it's OK to take the extra fill and copy hit to have fewer reallocations.
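One caveat with seeding the stream this way: str() returns the whole underlying sequence, including filler bytes that were never overwritten. A minimal sketch of one way to deal with that, assuming you only want what was actually written, is to cut the result at tellp(). The helper name build_message is hypothetical.

```cpp
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: formats a message in a stream seeded with
// `reserve` null bytes and returns only the characters actually written.
std::string build_message(std::size_t reserve) {
    std::stringstream ss(std::string(reserve, '\0'));
    ss << "My parrot ate " << 3 << " cookies.";

    // The put position marks the end of the written text; anything after
    // it in str() is leftover filler from the seed string.
    const std::streamoff written = ss.tellp();
    return ss.str().substr(0, static_cast<std::size_t>(written));
}
```

Note that the substr call makes yet another copy, so this only pays off if avoiding reallocations during formatting matters more than the extra copy at the end.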

I'm not sure, but I suspect that the stringbuf of a stringstream is closely tied to the resulting string. So I suspect that you can use ss.seekp(reserved-1); ss.put('\0'); to reserve the given number of bytes inside the underlying string of ss. Actually I'd like to see something like ss.seekp(reserved); ss.trunc();, but there is no trunc() method for streams.

Related

Optimize file reading C++

string str1, str2;
vector<string> vec;
ifstream infile;
infile.open("myfile.txt");
while (! infile.eof() )
{
getline(infile,str1);
istringstream is;
is >> str1;
while (is >> str2)
{
vec.push_back(str2);
}
}
What the code does is read the string from the file and stores it into a vector.
Performance needs to be given priority. How can I optimize this code to make the reading faster?
As others have already pointed out (see for example herohuyongtao's answer), the loop condition and how you put str1 into the istringstream must be fixed.
However, there is an important issue here that everybody has missed so far: you don't need the istringstream at all!
vec.reserve(the_number_of_words_you_expect_at_least);
while (infile >> str1) {
vec.push_back(str1);
}
It gets rid of the inner loop which you didn't need in the first place, and doesn't create an istringstream in each iteration.
If you need to parse further each line and you do need an istringstream, create it outside of the loop and set its string buffer via istringstream::str(const string& s).
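A minimal sketch of that reuse pattern, assuming each line should be split into whitespace-separated words (read_words is a hypothetical name):

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits every line of `in` into whitespace-separated words.
std::vector<std::string> read_words(std::istream& in) {
    std::vector<std::string> words;
    std::string line;
    std::string word;
    std::istringstream tokens;  // constructed once, reused for every line

    while (std::getline(in, line)) {
        tokens.clear();    // reset eofbit/failbit left over from the previous line
        tokens.str(line);  // replace the buffer contents and rewind
        while (tokens >> word) {
            words.push_back(word);
        }
    }
    return words;
}
```

The call to clear() matters: after the inner loop finishes, the stream is in a failed state, and without resetting it every later extraction would fail immediately.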
I can easily imagine your loops being very slow: Heap allocation on Windows is outrageously slow (compared to Linux); I got bitten once.
Andrei Alexandrescu presents (in some sense) a similar example in his talk Writing Quick Code in C++, Quickly. The surprising thing is that doing unnecessary heap allocations in a tight loop like the one above can be slower than the actual file IO. I was surprised to see that.
You didn't tag your question as C++11 but here is what I would do in C++11.
while (infile >> str1) {
vec.emplace_back(std::move(str1));
}
This move constructs the string at the back of the vector, without copying in. We can do it because we don't need the contents of str1 after we have put it into the vector. In other words, there is no need to copy it into a brand new string at the back of the vector, it is sufficient to just move its contents there. The first loop with the vec.push_back(str1); may potentially copy the contents of str1 which is really unnecessary.
The string implementation in gcc 4.7.2 is currently copy on write, so the two loops have identical performance; it doesn't matter which one you use. For now.
Unfortunately, copy on write strings are now forbidden by the standard. I don't know when the gcc developers are going to change the implementation. If the implementation changes, it may make a difference in performance whether you move (emplace_back(std::move(s))) or you copy (push_back(s)).
If C++98 compatibility is important for you, then go with push_back(). Even if the worst thing happens in the future and your string is copied (it isn't copied now), that copy can be turned into a memmove() / memcpy() which is blazing fast, most likely faster than reading the contents of the file from the hard disk so file IO will most likely remain the bottleneck.
Before any optimization, you need to change
while (! infile.eof() ) // problem 1
{
getline(infile,str1);
istringstream is;
is >> str1; // problem 2
while (is >> str2){
vec.push_back(str2);
}
}
to
while ( getline(infile,str1) ) // 1. don't use eof() in a while-condition
{
istringstream is(str1); // 2. put str1 to istringstream
while (is >> str2){
vec.push_back(str2);
}
}
to make it work as you expected.
P.S. For the optimization part, you don't need to think too much about it unless it becomes a bottleneck. Premature optimization is the root of all evil. However, if you do want to speed it up, check out @Ali's answer for further info.
The loop condition is wrong, but that is not a performance issue. Let's assume this IO loop is indeed your application's bottleneck. Even if it is not, optimizing it can be a good educational exercise or just weekend fun.
You have quite a few temporaries and cases of dynamic memory allocation in the loop.
Calling std::vector::reserve() before the loop will improve it a bit. Reallocating manually to emulate a 1.2x growth factor instead of 2x beyond some size will help as well. std::list may be more appropriate, though, if the file size is unpredictable.
Using std::istringstream as a tokenizer is far from optimal. Switching to an iterator-based "view" tokenizer (Boost has one) should improve the speed a lot.
If you need it to be very fast and have enough RAM, you can memory map the file before reading it. Boost.Iostreams can get you there quickly. Generally, though, you can be twice as fast without Boost (Boost is not bad, but it has to be generic and work on a dozen compilers).
If you are a blessed person using Unix/Linux as your development environment, run your program under valgrind --tool=cachegrind and you will see all problematic places and how bad they are relative to one another. Also, valgrind --tool=massif will let you identify numerous small heap-allocated objects, which are generally not tolerable in high-performance code.
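To illustrate the "view" tokenizer idea mentioned above without Boost, here is a minimal sketch that assumes C++17 (for std::string_view) and plain whitespace separators. The function name tokenize is hypothetical. The returned views point into the original text, so no per-token allocation takes place.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper: returns views of the whitespace-separated tokens in `text`.
// The views are only valid while the storage behind `text` is alive.
std::vector<std::string_view> tokenize(std::string_view text) {
    std::vector<std::string_view> tokens;
    std::size_t i = 0;
    while (i < text.size()) {
        // Skip leading whitespace.
        while (i < text.size() && std::isspace(static_cast<unsigned char>(text[i]))) {
            ++i;
        }
        const std::size_t start = i;
        // Advance to the end of the token.
        while (i < text.size() && !std::isspace(static_cast<unsigned char>(text[i]))) {
            ++i;
        }
        if (i > start) {
            tokens.push_back(text.substr(start, i - start));
        }
    }
    return tokens;
}
```

Whether this is actually faster than std::istringstream for a given workload is something to confirm with a profiler.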
The fastest, though not fully portable, approach is to load the file into a memory mapped region (see wiki mmap).
Given that you know the size of the file, you now can define forward iterators (possibly pointer to const char) on that memory region which you can use to find the tokens which separate your file into "strings".
Essentially, you repeatedly get a pair of pointers, one pointing to the first character and one to the end of each "string". From that pair of iterators create your std::string.
This approach has subtle issues though:
You need to take care of the character encoding of the file, possibly converting from this encoding to the desired encoding used by your std::string (presumably UTF-8).
The "token" that separates strings (usually \n) may be platform dependent, or may depend on which program created the file.
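The pointer-pair scan described above can be sketched as follows. This assumes the separator is '\n' and that the bytes are already in the encoding you want; obtaining the mapped region itself (mmap or a platform equivalent) is not shown, and split_lines is a hypothetical name. Any contiguous range of chars works as input.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits the range [first, last) on '\n'.
// The range stands in for a memory mapped file.
std::vector<std::string> split_lines(const char* first, const char* last) {
    std::vector<std::string> lines;
    const char* begin = first;
    for (const char* p = first; p != last; ++p) {
        if (*p == '\n') {
            lines.emplace_back(begin, p);  // build the string from the pointer pair
            begin = p + 1;
        }
    }
    if (begin != last) {
        lines.emplace_back(begin, last);  // last line without trailing '\n'
    }
    return lines;
}
```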

Is it recommended to std::move a string into containers that is going to be overwritten?

I have the following code
std::vector<std::string> lines;
std::string currentLine;
while(std::getline(std::cin, currentLine)) {
// // option 1
// lines.push_back(std::move(currentLine));
// // option 2
// lines.push_back(currentLine);
}
I see different costs for the two
The first approach will clear currentLine, making the getline need to allocate a new buffer for the string. But it will use the buffer for the vector instead.
The second approach will make getline be able to reuse the buffer, and require a new buffer allocation for the in-vector string.
In such situations, is there a "better" way? Can the compiler optimize the one or other approach more efficiently? Or are there clever string implementations that make one option way more performant than the other?
Given the prevalence of the short string optimization, my immediate guess is that in many cases none of this will make any difference at all -- with SSO, a move ends up copying the contained data anyway (even if the source is an rvalue so it's eligible as the source for a move).
Between the two you've given, I think I'd tend to favor the non-moving version, but I doubt it's going to make a big difference either way. Given that (most of the time) you're going to be re-using the source immediately after the move, I doubt that moving is really going to do a lot of good (even at best). Assuming SSO isn't involved, your choice is between creating a new string in the vector to hold a copy of the string you read, or moving from the string you read and (in essence) creating a new string to hold the next line in the next iteration. Either way, the expensive part (allocating a buffer to hold the string and copying data into that buffer) is going to be pretty much the same.
As far as: "is there a better way" goes, I can think of at least a couple possibilities. The most obvious would be to memory map the file, then walk through that buffer, find the ends of lines, and use emplace_back to create strings in the vector directly from the data in the buffer, with no intermediate strings at all.
That does have the minor disadvantage of memory mapping not being standardized -- if you can't live with that level of non-portability, you can read the whole file into a buffer instead of memory mapping.
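A minimal sketch of the portable variant, assuming '\n' line endings: read the whole stream into one buffer, then construct each vector element directly from a slice of that buffer. The name read_lines is hypothetical.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads all of `in` into one buffer, then builds the
// lines in place in the vector, with no intermediate per-line string.
std::vector<std::string> read_lines(std::istream& in) {
    std::ostringstream contents;
    contents << in.rdbuf();  // copy the entire stream into memory
    const std::string buffer = contents.str();

    std::vector<std::string> lines;
    std::size_t start = 0;
    while (start < buffer.size()) {
        std::size_t end = buffer.find('\n', start);
        if (end == std::string::npos) {
            end = buffer.size();
        }
        lines.emplace_back(buffer, start, end - start);  // substring constructor
        start = end + 1;
    }
    return lines;
}
```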
The next possibility after that would be to create a class with an interface like a const string's, that just maintains a pointer to the data in the big buffer instead of making a copy of it (e.g., Clang uses something like this). This will typically reduce total allocation, heap fragmentation, etc., but if you (for example) need to modify the strings afterward, it's unlikely to be of much (if any) use.

NULL terminated string and its length

I have legacy code that receives some proprietary message, parses it and creates a bunch of static char arrays (embedded in the class representing the message) to represent NULL-terminated strings. Afterwards, pointers to the strings are passed all around and finally serialized to some buffer.
Profiling shows that str*() methods take a lot of time.
Therefore I would like to use memcpy() wherever possible. To achieve it I need a way to associate a length with a pointer to a NULL-terminated string. I thought about:
Using std::string looks less efficient, since it requires memory allocation and thread synchronization.
I can use std::pair<pointer to string, length>. But in this case I need to maintain length "manually".
What do you think?
use std::string
Profiling shows that str*() methods take a lot of time
Sure they do ... operating on any array takes a lot of time.
Therefore I would like to use memcpy() wherever possible. To achieve it I need a way to associate a length with a pointer to a NULL-terminated string. I thought about:
memcpy is not really any faster than strcpy. In fact, if you perform a strlen to find out how much you are going to memcpy, then strcpy is almost certainly faster.
Using std::string looks less efficient, since it requires memory allocation and thread synchronization
It may look less efficient but there are a lot of better minds than yours or mine that have worked on it
I can use std::pair<pointer to string, length>. But in this case I need to maintain length "manually".
That's one way to save yourself time on the length calculation. Obviously you need to maintain the length manually. This is effectively how Windows BSTRs work (though the length is stored in memory immediately before the actual string data). std::string, for example, already does this ...
What do you think?
I think your question is asked terribly. There is no real question asked which makes answering next to impossible. I advise you actually ask specific questions in the future.
Use std::string. It's an advice already given, but let me explain why:
One, it uses a custom memory allocation scheme. Your char* strings are probably malloc'ed. That means they are worst-case aligned, which really isn't needed for a char[]. std::string doesn't suffer from needless alignment. Furthermore, common implementations of std::string use the "Small String Optimization", which eliminates a heap allocation altogether and improves locality of reference. The string size will be on the same cache line as the char[] itself.
Two, it keeps the string length, which is indeed a speed optimization. Most str* functions are slower because they don't have this information up front.
A second option would be a rope class, e.g. from SGI. This can be more efficient by eliminating some string copies.
Your post doesn't explain where the str*() function calls are coming from; passing around char * certainly doesn't invoke them. Identify the sites that actually do the string manipulation and then try to find out if they're doing so inefficiently. One common pitfall is that strcat first needs to scan the destination string for the terminating 0 character. If you call strcat several times in a row, you can end up with a O(N^2) algorithm, so be careful about this.
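One way to sidestep the repeated scan that strcat performs is to carry the current length along and append at that offset. A minimal sketch, assuming a caller-provided buffer that already holds a terminated string of length `used` (append_at is a hypothetical helper, not a standard function):

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: appends `src` at offset `used` in `dst` and returns the
// new length. If the result (plus terminating 0) would not fit in `capacity`
// bytes, `dst` is left unchanged and `used` is returned as is.
std::size_t append_at(char* dst, std::size_t capacity, std::size_t used, const char* src) {
    const std::size_t n = std::strlen(src);
    if (used + n + 1 > capacity) {
        return used;  // does not fit
    }
    std::memcpy(dst + used, src, n + 1);  // +1 also copies the terminating 0
    return used + n;
}
```

Each append then costs time proportional to the appended piece only, instead of rescanning everything written so far.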
Replacing strcpy by memcpy doesn't make any significant difference; strcpy doesn't do an extra pass to find the length of the string, it's simply (conceptually!) a character-by-character copy that stops when it encounters the terminating 0. This is not much more expensive than memcpy, and always cheaper than strlen followed by memcpy.
The way to gain performance on string operations is to avoid copies where possible; don't worry about making the copying faster, instead try to copy less! And this holds for all string (and array) implementations, whether it be char *, std::string, std::vector<char>, or some custom string / array class.
What do I think? I think that you should do what everyone else obsessed with pre-optimization does. You should find the most obscure, unmaintainable, yet intuitively (to you anyway) high-performance way you can and do it that way. Sounds like you're onto something with your pair<char*,len> with malloc/memcpy idea there.
Whatever you do, do NOT use pre-existing, optimized wheels that make maintenance easier. Being maintainable is simply the least important thing imaginable when you're obsessed with intuitively measured performance gains. Further, as you well know, you're quite a bit smarter than those who wrote your compiler and its standard library implementation. So much so that you'd be seriously silly to trust their judgment on anything; you should really consider rewriting the entire thing yourself because it would perform better.
And ... the very LAST thing you'll want to do is use a profiler to test your intuition. That would be too scientific and methodical, and we all know that science is a bunch of bunk that's never gotten us anything; we also know that personal intuition and revelation is never, ever wrong. Why waste the time measuring with an objective tool when you've already intuitively grasped the situation's seemingliness?
Keep in mind that I'm being 100% honest in my opinion here. I don't have a sarcastic bone in my body.

How to allocate more memory for a buffer in C++?

I have pointer str:
char* str = new char[10];
I use the memory block str points to to store data.
How can I allocate more bytes for the buffer pointed to by str and not lose old data stored in the buffer?
Use std::string instead. It will do what you need without you worrying about allocation, copy etc. You can still access the raw memory via the c_str() function.
Even std::vector<char> will work well for you.
new[] another buffer, copy the data there (use memcpy() for that), then delete[] the old one, assign the new buffer address to the pointer originally holding the old buffer address.
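Those steps can be sketched as follows. The function name grow_buffer is hypothetical, and the sketch assumes the old buffer came from new char[].

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: allocates a buffer of new_size bytes, copies as much of
// the old contents as fits, frees the old buffer and returns the new one.
char* grow_buffer(char* old_buf, std::size_t old_size, std::size_t new_size) {
    char* new_buf = new char[new_size];
    const std::size_t to_copy = old_size < new_size ? old_size : new_size;
    std::memcpy(new_buf, old_buf, to_copy);
    delete[] old_buf;  // the old pointer must not be used after this
    return new_buf;
}
```

Typical use is str = grow_buffer(str, 10, 20);, after which str must still be released with delete[].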
You cannot do that with new. For that you need to use the good old malloc, realloc, and free (do not mix malloc/realloc/free and new/delete).
The realloc function is what you are looking for. You have to use malloc/free instead of new/delete to use it.
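A minimal sketch of the realloc route, assuming the buffer was obtained from malloc (never from new) and a non-zero new size. The wrapper name resize_buffer is hypothetical.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: resizes a malloc'ed buffer in place or by moving it.
// Returns false on failure, in which case `buf` is still valid and unchanged.
bool resize_buffer(char*& buf, std::size_t new_size) {
    void* p = std::realloc(buf, new_size);
    if (p == nullptr) {
        return false;  // keep the old pointer; assigning p here would leak it
    }
    buf = static_cast<char*>(p);
    return true;
}
```

Going through a temporary pointer is the important part: writing buf = realloc(buf, n) directly loses the only reference to the old block when realloc fails.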
If you are really using C++, the most correct solution would be to use std::vector. That assumes you are not treating the data as a standard string; if you are, you should use std::string (which manages its buffer much like a std::vector<char>, so no big deal). You are creating at least 10 chars. This hints that you are probably quite sure you'll need 10 chars, but maybe you'll need more. Maybe you are worried about the performance problems involved in allocating and deallocating memory. In that case, you can create your string and then reserve the capacity you expect to need, so there won't be any reallocation at least until you reach that limit.
#include <string>

int main()
{
    std::string s;
    s.reserve( 10 ); // capacity for at least 10 chars, size stays 0
    // do whatever with s
}
As others have already pointed out, using std::string or std::vector frees you from worrying about copying, resizing or deleting the reserved memory.
You have to allocate a different, bigger string array, and copy over the data from str to that new string array.
Allocation is a bit like finding a parking place.
You're asking here if it's possible to add a trailer to your car that has been parked for a few days.
The answer is that in C there is something called realloc that allows you to do the following:
If there is already enough room to add the trailer, do so. If not, park in another place big enough for both your trailer and your car, which is equivalent to copying your data.
In other words you'll get strong and random performance hits.
So what would you do in the real world? If you knew you might need to add some trailers to your car you'd probably park in a bigger place than required. And when exceeding the size required for the place, you'd move your car and your trailers to a place with a nice margin for future trailers.
That's precisely what the STL's string and vector are doing for you. You can even give them a hint about the size of your future trailers by calling "reserve". Using std::string is probably the best answer to your problem.
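To see the effect of the "reserve" hint, one can count how often the capacity changes while appending. This is a minimal sketch with a hypothetical function name; the exact growth pattern without the hint depends on the standard library implementation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: appends `count` characters to a string that was given
// a reserve hint, and counts how many times its capacity changed on the way.
std::size_t count_capacity_changes(std::size_t reserve_hint, std::size_t count) {
    std::string s;
    s.reserve(reserve_hint);
    std::size_t changes = 0;
    std::size_t capacity = s.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        s.push_back('x');
        if (s.capacity() != capacity) {  // the string moved to a bigger parking place
            ++changes;
            capacity = s.capacity();
        }
    }
    return changes;
}
```

With a hint at least as large as the final size, the count stays at zero; without it, the string has to move several times as it grows.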
You can use realloc: http://www.cplusplus.com/reference/clibrary/cstdlib/realloc/
I would add that this approach is not the favored c++ approach (depending on your needs you could use std::vector<char> for instance).

Is there a way to reduce ostringstream malloc/free's?

I am writing an embedded app. In some places, I use std::ostringstream a lot, since it is very convenient for my purposes. However, I just discovered that the performance hit is extreme since adding data to the stream results in a lot of calls to malloc and free. Is there any way to avoid it?
My first thought was making the ostringstream static and resetting it using ostringstream::str(""). However, this can't be done as I need the functions to be reentrant.
Well, Booger's solution would be to switch to sprintf(). It's unsafe, and error-prone, but it is often faster.
Not always though. We can't use it (or ostringstream) on my real-time job after initialization because both perform memory allocations and deallocations.
Our way around the problem is to jump through a lot of hoops to make sure that we perform all string conversions at startup (when we don't have to be real-time yet). I do think there was one situation where we wrote our own converter into a fixed-sized stack-allocated array. We have some constraints on size we can count on for the specific conversions in question.
For a more general solution, you may consider writing your own version of ostringstream that uses a fixed-sized buffer (with error-checking on the bounds being stayed within, of course). It would be a bit of work, but if you have a lot of those stream operations it might be worth it.
If you know how big the data is before creating the stream you could use ostrstream whose constructor can take a buffer as a parameter. Thus there will be no memory management of the data.
Probably the approved way of dealing with this would be to create your own basic_stringbuf object to use with your ostringstream. For that, you have a couple of choices. One would be to use a fixed-size buffer, and have overflow simply fail when/if you try to create output that's too long. Another possibility would be to use a vector as the buffer. Unlike std::string, vector guarantees that appending data will have amortized constant complexity. It also never releases data from the buffer unless you force it to, so it'll normally grow to the maximum size you're dealing with. From that point, it shouldn't allocate or free memory unless you create a string that's beyond the length it currently has available.
std::ostringstream is a convenience interface. It links a std::string to a std::ostream by providing a custom std::streambuf. You can implement your own std::streambuf. That allows you to do the entire memory management. You still get the nice formatting of std::ostream, but you have full control over the memory management. Of course, the consequence is that you get your formatted output in a char[] - but that's probably no big problem if you're an embedded developer.
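A minimal sketch of such a stream buffer, assuming a fixed-size char array inside the object and "fail when full" semantics. The class name fixed_buf is hypothetical. It performs no heap allocation of its own and can live on the stack, which keeps the calling function reentrant.

```cpp
#include <cstddef>
#include <ostream>
#include <streambuf>

// Hypothetical fixed-capacity stream buffer: output goes into an internal
// char array. Once the array is full, further writes fail and the owning
// ostream sets badbit, because the default overflow() reports failure.
template <std::size_t N>
class fixed_buf : public std::streambuf {
    static_assert(N > 0, "need room for the terminating 0");

public:
    fixed_buf() {
        setp(data_, data_ + N - 1);  // keep the last byte for the terminating 0
    }

    // Number of characters written so far.
    std::size_t size() const { return static_cast<std::size_t>(pptr() - pbase()); }

    // Terminates the text written so far and returns it as a C string.
    const char* c_str() {
        *pptr() = '\0';
        return data_;
    }

private:
    char data_[N];
};
```

Usage would look like fixed_buf<64> buf; std::ostream out(&buf); out << "x=" << 42; followed by buf.c_str(). Checking the stream state afterwards tells you whether the output was truncated.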