std::to_string vs stringstream - c++

The code below shows 2 solutions (std::to_string and std::stringstream) that convert an int m_currentSoundTime to std::string. Is std::to_string or std::stringstream faster?
// Compute current sound time in minute and convert to string
stringstream currentTime;
currentTime << m_currentSoundTime / 60;
m_currentSoundTimeInMinute = currentTime.str();
or
m_currentSoundTimeInMinute = to_string( m_currentSoundTime / 60 );

In any reasonable library implementation to_string will be at least as fast as stringstream for this. However, if you wanted to put 10 ints into a string, stringstream will likely be faster. If you were to do to_string(a) + ", " + to_string(b) + /*...*/ every operation would probably cause an allocation and a copy from the previous string to the new allocation - not true with stringstream.
More importantly, it's pretty obvious from your example code that to_string is cleaner for dealing with converting a single int to a string.
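To make the multi-value case concrete, here is a minimal sketch of both approaches. The helper names join_with_plus and join_with_stream are made up for illustration, and which one is faster depends on your standard library, so measure before relying on either.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: builds "1, 2, 3" by chaining operator+.
// Each + may create a temporary string that is copied into the result.
std::string join_with_plus(const std::vector<int>& values) {
    std::string out;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i != 0) out = out + ", ";
        out = out + std::to_string(values[i]);
    }
    return out;
}

// Hypothetical helper: same result, accumulated in a single stream buffer.
std::string join_with_stream(const std::vector<int>& values) {
    std::ostringstream out;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i != 0) out << ", ";
        out << values[i];
    }
    return out.str();
}
```

Both functions return the same text; only the number of intermediate strings differs.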

This blog post tests several int-to-string conversion methods (using GCC 4.7 on Ubuntu 13.04). In this case to_string is somewhat slower than stringstream, but the result probably depends strongly on the compiler and standard library.

Related

Fast move CString to std::string

I'm working in a codebase with a mixture of CString, const char* and std::string (non-unicode), where all new code uses std::string exclusively. I've now had to do the following:
{
CString tempstring;
load_cstring_legacy_method(tempstring);
stdstring = tempstring;
}
and worry about performance. The strings are DNA sequences so we can easily have 100+ of them with each of them ~3M characters. Note that adjusting load_cstring_legacy_method is not an option. I did a quick test:
// 3M
const int stringsize = 3000000;
const int repeat = 1000;
std::chrono::steady_clock::time_point startTime = std::chrono::steady_clock::now();
for ( int i = 0; i < repeat; ++i ){
CString cstring('A', stringsize);
std::string stdstring(cstring); // Comment out
cstring.Empty();
}
std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - startTime).count() << " ms" << std::endl;
and commenting out the std::string gives 850 ms; with the assignment it's 3600 ms. The magnitude of the difference is surprising, so I guess the benchmark might not be doing what I expect. Assuming there is a penalty, is there a way I can avoid it?
So your question is to make the std::string construction faster?
On my machine, comparing this
std::string stdstring(cstring); // 4741 ms
I get better performance this way:
std::string stdstring(cstring, stringsize); // 3419 ms
or if the std::string already exists like the first part of your question suggests:
stdstring.assign(cstring, stringsize); // 3408 ms
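The likely reason the two-argument forms are quicker is that std::string no longer has to scan for the terminating '\0' before copying. Here is a small sketch of the same idea; since CString only exists on Windows, a plain const char* stands in for it, and assign_known_length is a hypothetical name.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copies exactly `length` characters from `data` into `out`.
// Because the length is supplied, std::string does not need to search for
// the terminating '\0' first.
void assign_known_length(std::string& out, const char* data, std::size_t length) {
    out.assign(data, length);
}
```

With a real CString you would pass its character pointer and GetLength() instead of the stand-in buffer.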
Use a more efficient memory allocator. Something like a memory arena/region would substantially help with allocation costs.
If you're really, really desperate, you could theoretically combine ReleaseBuffer with some hideous allocator hacks to avoid the copy altogether. This would involve a lot of pain, though.
In addition, if you have a serious problem, you could consider changing your string implementation. The std::string that ships with Visual Studio employs SSO, or Small String Optimization. This does exactly what it sounds like: it optimizes very small strings, which are quite common all around but not necessarily good for this use case. Another implementation like COW could be more appropriate (be super careful if doing so in a multi-threaded environment).
Finally, if you're using an old version of VS, you should also consider upgrading. Move semantics are a huge instawin as far as performance goes.
CString is probably the Unicode version, which explains the slowness. The generic conversion routine cannot assume that the characters used are limited to "ACGT".
You can, however, and shamelessly take advantage of that.
{
CString tempstring;
load_cstring_legacy_method(tempstring);
int len = tempstring.GetLength();
stdstring.reserve(len);
for(int i = 0; i != len; ++i)
{
stdstring.push_back(static_cast<char>(tempstring[i]));
}
}
Portable? Only so far as CString is, so Windows variants.
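If you want to try the narrowing loop away from Windows, here is a sketch that uses std::wstring as a stand-in for a Unicode CString. The name narrow_ascii is hypothetical, and the function assumes every character fits in a char (true for "ACGT"); anything else would be silently truncated.

```cpp
#include <string>

// Hypothetical stand-in for the CString loop: std::wstring plays the role
// of the wide string. Assumes all characters are plain ASCII.
std::string narrow_ascii(const std::wstring& wide) {
    std::string narrow;
    narrow.reserve(wide.size());  // one allocation up front
    for (wchar_t ch : wide) {
        narrow.push_back(static_cast<char>(ch));
    }
    return narrow;
}
```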

static_cast vs boost::lexical_cast

I am trying to concatenate an integer to a known string, and I have found that there are several ways to do it, two of those being:
int num=13;
string str = "Text" + static_cast<ostringstream*>( &(ostringstream() << num) )->str();
or I could also use boost libraries' lexical_cast:
int num=13;
string str= "Text" + boost::lexical_cast<std::string>(num);
Is the use of boost::lexical_cast more efficient in any way, since I already know the conversion type (int to string)? Or is static_cast just as effective, without having to rely on external libraries?
string str = "Text" + static_cast<ostringstream*>( &(ostringstream() << num) )->str();
This is ugly and not easily readable. Adding to this the fact that lexical_cast does almost exactly this underneath we can definitely say that using lexical_cast is "better".
In C++11, however, we have to_string overloads.
string str = "Text" + to_string(num);
Which is the best option provided your compiler supports it.
See also How to convert a number to string and vice versa in C++
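For comparison, here is a sketch of the to_string version next to a pre-C++11 fallback that uses a named ostringstream instead of the static_cast on a temporary. Both function names are made up for this example.

```cpp
#include <sstream>
#include <string>

// C++11 and later: std::to_string does the conversion directly.
std::string label_with_to_string(int num) {
    return "Text" + std::to_string(num);
}

// Before C++11: a named stream avoids the cast on a temporary ostringstream.
std::string label_with_stream(int num) {
    std::ostringstream out;
    out << "Text" << num;
    return out.str();
}
```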

C++ faster way to do string addition?

I'm finding standard string addition to be very slow so I'm looking for some tips/hacks that can speed up some code I have.
My code is basically structured as follows:
inline void add_to_string(string data, string &added_data) {
if(added_data.length()<1) added_data = added_data + "{";
added_data = added_data+data;
}
int main()
{
int some_int = 100;
float some_float = 100.0;
string some_string = "test";
string added_data;
added_data.reserve(1000*64);
for(int ii=0;ii<1000;ii++)
{
//variables manipulated here
some_int = ii;
some_float += ii;
some_string.assign(ii%20,'A');
//then we concatenate the strings!
stringstream fragment;
fragment<<some_int <<","<<some_float<<","<<some_string;
add_to_string(fragment.str(),added_data);
}
return 0;
}
Doing some basic profiling, I'm finding that a ton of time is being used in the for loop. Are there some things I can do that will significantly speed this up? Will it help to use c strings instead of c++ strings?
String addition is not the problem you are facing. std::stringstream is known to be slow due to its design. On every iteration of your for-loop the stringstream is responsible for at least 2 allocations and 2 deletions. The cost of each of these 4 operations is likely more than that of the string addition.
Profile the following and measure the difference:
std::string stringBuffer;
for(int ii=0;ii<1000;ii++)
{
//variables manipulated here
some_int = ii;
some_float += ii;
some_string.assign(ii%20,'A');
//then we concatenate the strings!
char buffer[128];
sprintf(buffer, "%i,%f,%s",some_int,some_float,some_string.c_str());
stringBuffer = buffer;
add_to_string(stringBuffer ,added_data);
}
Ideally, replace sprintf with _snprintf or the equivalent supported by your compiler.
As a rule of thumb, use stringstream for formatting by default and switch to the faster and less safe functions like sprintf, itoa, etc. whenever performance matters.
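Here is a sketch of the bounded variant using std::snprintf, which is standard since C++11. The helper name format_fragment and the choice to return an empty string on truncation are assumptions made for this example, not part of the answer above.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats "int,float,string" into a fixed stack buffer.
// Returns an empty string if formatting fails or the result does not fit.
std::string format_fragment(int i, float f, const std::string& s) {
    char buffer[128];
    const int written = std::snprintf(buffer, sizeof buffer, "%d,%f,%s", i, f, s.c_str());
    if (written < 0 || static_cast<std::size_t>(written) >= sizeof buffer) {
        return std::string();
    }
    return std::string(buffer, static_cast<std::size_t>(written));
}
```

Unlike plain sprintf, an overly long some_string cannot overrun the buffer here; it just produces an empty result that the caller can check.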
Edit: that, and what didierc said: added_data += data;
You can save lots of string operations if you do not call add_to_string in your loop.
I believe this does the same (although I am not a C++ expert and do not know exactly what stringstream does):
stringstream fragment;
for(int ii=0;ii<1000;ii++)
{
//variables manipulated here
some_int = ii;
some_float += ii;
some_string.assign(ii%20,'A');
//then we concatenate the strings!
fragment<<some_int<<","<<some_float<<","<<some_string;
}
// inlined add_to_string call without the if-statement ;)
added_data = "{" + fragment.str();
I see you used the reserve method on added_data, which should help by avoiding multiple reallocations of the string as it grows.
You should also use the += string operator where possible:
added_data += data;
I think that the above should save some significant time by avoiding unnecessary copies of added_data back and forth through a temporary string when doing the concatenation.
This += operator is a simpler version of the string::append method, it just copies data directly at the end of added_data. Since you made the reserve, that operation alone should be very fast (almost equivalent to a strcpy).
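A minimal sketch of reserve plus +=, using a made-up helper called repeat_append. It assumes the final size is known in advance, which is what makes the single reserve call possible.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: appends `piece` `count` times.
// reserve() sizes the buffer once, so each += only copies characters
// onto the end instead of building a new temporary string.
std::string repeat_append(const std::string& piece, std::size_t count) {
    std::string out;
    out.reserve(piece.size() * count);
    for (std::size_t i = 0; i < count; ++i) {
        out += piece;
    }
    return out;
}
```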
But why go through all this, when you are already using a stringstream to handle input? Keep it all in there to begin with!
The stringstream class is indeed not very efficient.
You may have a look at the stringstream class for more information on how to use it, if necessary, but your solution of using a string as a buffer seems to avoid that class speed issue.
At any rate, stay away from any attempt at reimplementing the speed-critical code in pure C unless you really know what you are doing. Some other SO posts support the idea of doing it, but I think it's best (read: safer) to rely as much as possible on the standard library, which will be enhanced over time and takes care of many corner cases you (or I) wouldn't think of. If your input data format is set in stone, then you might start thinking about taking that road, but otherwise it's premature optimization.
If you start added_data with a "{", you would be able to remove the if from your add_to_string method: the if gets executed exactly once, when the string is empty, so you might as well make it non-empty right away.
In addition, your add_to_string makes a copy of the data; this is not necessary, because it does not get modified. Accepting the data by const reference should speed things up for you.
Finally, changing your added_data from string to sstream should let you append to it in a loop, without the sstream intermediary that gets created, copied, and thrown away on each iteration of the loop.
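Put together, the first and last suggestions might look roughly like this: one stream for the whole loop, seeded with "{" so no if is needed. The wrapper function build_added_data and its iterations parameter are assumptions added so the sketch can stand alone.

```cpp
#include <sstream>
#include <string>

// Hypothetical rewrite of the question's loop: a single ostringstream,
// started with "{", replaces the per-iteration stream and add_to_string.
std::string build_added_data(int iterations) {
    std::ostringstream added_data;
    added_data << "{";
    float some_float = 100.0f;
    std::string some_string;
    for (int ii = 0; ii < iterations; ++ii) {
        // variables manipulated here, as in the question
        some_float += ii;
        some_string.assign(ii % 20, 'A');
        added_data << ii << "," << some_float << "," << some_string;
    }
    return added_data.str();
}
```

If you keep add_to_string instead, its first parameter would become const string &data so that no copy is made on each call.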
Please have a look at Twine used in LLVM.
A Twine is a kind of rope, it represents a concatenated string using a
binary-tree, where the string is the preorder of the nodes. Since the
Twine can be efficiently rendered into a buffer when its result is used,
it avoids the cost of generating temporary values for intermediate string
results -- particularly in cases when the Twine result is never
required. By explicitly tracking the type of leaf nodes, we can also avoid
the creation of temporary strings for conversions operations (such as
appending an integer to a string).
It may be helpful in solving your problem.
How about this approach?
This is a DevPartner for MSVC 2010 report.
string newstring = stringA & stringB;
I don't think strings are slow; it's the conversions that can make it slow,
and maybe your compiler, which might check variable types for mismatches.

String addition or subtraction operators

How to add or subtract the value of string? For example:
std::string number_string;
std::string total;
cout << "Enter value to add";
std::getline(std::cin, number_string);
total = number_string + number_string;
cout << total;
This just appends the string, so it won't work. I know I can use the int data type, but I need to use string.
You can use atoi(number_string.c_str()) to convert the string to an integer.
If you are concerned about properly handling non-numeric input, strtol is a better choice, albeit a little more wordy. http://www.cplusplus.com/reference/cstdlib/strtol/
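A sketch of what the wordier strtol version could look like. The helper name parse_long and its policy of rejecting trailing characters are assumptions for this example.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical helper: parses the whole string as a base-10 long.
// Returns false for empty input, trailing garbage, or out-of-range values;
// `value` is only written on success.
bool parse_long(const std::string& text, long& value) {
    const char* begin = text.c_str();
    char* end = nullptr;
    errno = 0;
    const long parsed = std::strtol(begin, &end, 10);
    if (end == begin || *end != '\0' || errno == ERANGE) {
        return false;
    }
    value = parsed;
    return true;
}
```

By contrast, atoi gives you no way to tell "0" apart from input that is not a number at all.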
You will want to work with integers the entire time, and then convert to a std::string at the very end.
Here is a solution that works if you have a C++11 capable compiler:
#include <string>
std::string sum(std::string const & old_total, std::string const & input) {
int const total = std::stoi(old_total);
int const addend = std::stoi(input);
return std::to_string(total + addend);
}
Otherwise, use boost:
#include <string>
#include <boost/lexical_cast.hpp>
std::string sum(std::string const & old_total, std::string const & input) {
int const total = boost::lexical_cast<int>(old_total);
int const addend = boost::lexical_cast<int>(input);
return boost::lexical_cast<std::string>(total + addend);
}
The function first converts each std::string into an int (a step that you will have to do, no matter what approach you take), then adds them, and then converts it back to a std::string. In other languages, like PHP, that try to guess what you mean and add them, they are doing this under the hood, anyway.
Both of these solutions have a number of advantages. They are faster, they report their errors with exceptions rather than silently appearing to work, and they don't require extra intermediary conversions.
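If you would rather handle bad input locally than let the exception propagate, a wrapper along these lines is one option. try_sum is a hypothetical name; std::invalid_argument and std::out_of_range are the exceptions std::stoi throws.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical wrapper: adds two decimal strings and reports failure
// through the return value instead of an exception.
bool try_sum(const std::string& lhs, const std::string& rhs, std::string& result) {
    try {
        const int a = std::stoi(lhs);
        const int b = std::stoi(rhs);
        result = std::to_string(a + b);
        return true;
    } catch (const std::invalid_argument&) {
        return false;  // no digits could be parsed, e.g. "abc"
    } catch (const std::out_of_range&) {
        return false;  // value does not fit in an int
    }
}
```

Note that std::stoi accepts input such as "12abc" and returns 12; pass its optional position argument if trailing characters should count as an error. The sketch also does not guard against a + b overflowing.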
The Boost solution does require a bit of work to set up, but it is definitely worth it. Boost is probably the most important tool of any C++ developer's work, except maybe the compiler. You will need it for other things because they have already done top-notch work solving many problems that you will have in the future, so it is best for you to start getting experience with it. The work required to install Boost is much less than the time you will save by using it.

Which is efficient, itoa or sprintf?

I am in the process of building my first C++ application, and choosing efficient C++ libraries to rely on at this stage is one of the design considerations I am looking at.
Consequently, I want to convert an integer type to a string and am deciding whether to use:
sprintf(string, "%d", x);
Or
Integer to ASCII
itoa(x, string);
Can anyone suggest which one of these routes is more efficient, and possibly why?
Thanks.
They're both efficient. It's probably much more relevant to note that itoa() is not part of the C++ standard, and as such is not available in many common runtimes. (In particular, it's not part of libstdc++, so it's not available on Mac OS X or Linux.)
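Where itoa is missing, a portable stand-in is easy to sketch on top of std::snprintf (standard since C++11). The name int_to_string and the 32-byte buffer are choices made for this example.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: portable replacement for itoa(x, buffer) in base 10.
std::string int_to_string(int x) {
    char buffer[32];  // comfortably larger than any 32- or 64-bit int in decimal
    const int written = std::snprintf(buffer, sizeof buffer, "%d", x);
    if (written <= 0) {
        return std::string();  // formatting error
    }
    return std::string(buffer, static_cast<std::size_t>(written));
}
```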
Don't use either of these. Use std::stringstream and so on.
std::stringstream ss;
ss << x;
ss.str(); // Access the std::string
Either way, it's quite unlikely that converting to string will be a significant part of your application's execution time.
From a pure algorithm viewpoint one can argue that itoa would be faster since sprintf has the additional cost of parsing the format descriptor string. However without benchmarking the cost of the two functions in an implementation, with a non-trivial work load, one cannot be sure.
Also this isn't apples to apples comparison since both functions aren't equivalent. sprintf can do much more formatting than itoa does, apart from the fact that the former is a standard function while the latter isn't.
Aside: If you can use C++11 you can use to_string, which returns a std::string. If you want representations other than decimal you may do this:
int i = 1234;
std::stringstream ss;
ss << std::hex << i; // hexadecimal
ss << std::oct << i; // octal
ss << std::dec << i; // decimal
std::bitset<sizeof(int) * std::numeric_limits<unsigned char>::digits> b(i);
ss << b; // binary
std::string str = ss.str();
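Note that the snippet above writes all four representations into the same stream, so str ends up holding them back to back. If you want each one as its own string, a sketch with one small function per base could look like this. The function names are made up, and CHAR_BIT is used in place of numeric_limits<unsigned char>::digits.

```cpp
#include <bitset>
#include <climits>
#include <sstream>
#include <string>

// Hypothetical helpers: each uses its own stream, so a base manipulator
// set for one conversion cannot leak into the next.
std::string to_hex(int i) {
    std::ostringstream ss;
    ss << std::hex << i;
    return ss.str();
}

std::string to_oct(int i) {
    std::ostringstream ss;
    ss << std::oct << i;
    return ss.str();
}

// Binary has no stream manipulator, so std::bitset does the work.
// The result is zero-padded to the full width of an int.
std::string to_binary(int i) {
    return std::bitset<sizeof(int) * CHAR_BIT>(i).to_string();
}
```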