I have a simple scenario. I need to join two C-strings together into a std::string. I have decided to do this in one of two ways:
Solution 1
void ProcessEvent(char const* pName) {
std::string fullName;
fullName.reserve(50); // Ensure minimal reallocations for small event names (50 is an arbitrary limit).
fullName += "com.domain.events.";
fullName += pName;
// Use fullName as needed
}
Solution 2
void ProcessEvent(char const* pName) {
std::ostringstream ss;
ss << "com.domain.events." << pName;
std::string fullName{ss.str()};
// Use fullName as needed
}
I like solution 2 better because the code is more natural. Solution 1 seems like a response to a measurable bottleneck from performance testing. However, Solution 1 exists for 2 reasons:
It's a light optimization to reduce allocations. Event management in this application is used quite frequently so there might be benefits (but no measurements have been taken).
I've heard criticism regarding STL streams WRT performance. Some have recommended to only use stringstream when doing heavy string building, especially those involving number conversions and/or usage of manipulators.
Is it a premature pessimization to prefer solution 2 for its simplicity? Or is it a premature optimization to choose solution 1? I'm wondering if I'm overly concerned about STL streams.
Let's measure it
A quick test with the following functions:
void func1(const char* text) {
std::string s;
s.reserve(50);
s += "com.domain.event.";
s += text;
}
void func2(const char* text) {
std::ostringstream oss;
oss << "com.domain.event." << text;
std::string s = oss.str();
}
Running each 100 000 times in a loop gives the following results on average on my computer (using gcc-4.9.1):
func1 : 37 milliseconds
func2 : 87 milliseconds
That is, func1 is more than twice as fast.
That being said, I would recommend using the clearest, most readable syntax until you really need the performance. Implement a testable program first, then optimize if it's too slow.
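For anyone who wants to repeat the measurement, here is a minimal sketch of a timing harness. The helper `time_ms` and the two `build_*` functions are hypothetical names, not the code that produced the numbers above, and the builders return their string so the work has an observable result.

```cpp
#include <chrono>
#include <sstream>
#include <string>

// Hypothetical helper: runs fn() `iterations` times and returns the
// elapsed wall-clock time in milliseconds.
template <class Fn>
double time_ms(Fn fn, int iterations) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        fn();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Same approach as func1: reserve, then append twice.
std::string build_with_reserve(const char* text) {
    std::string s;
    s.reserve(50);
    s += "com.domain.event.";
    s += text;
    return s;
}

// Same approach as func2: format through an ostringstream.
std::string build_with_stream(const char* text) {
    std::ostringstream oss;
    oss << "com.domain.event." << text;
    return oss.str();
}
```

A call such as `time_ms([] { build_with_reserve("click"); }, 100000)` gives one sample. An optimizing compiler may still discard unused results, so treat the numbers as rough.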
Edit:
As suggested by @Ken P:
void func3(const char* text) {
std::string s = "com.domain.event." + std::string{text};
}
func3 : 27 milliseconds
The simplest solution is often the fastest.
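If you do want a single allocation without guessing a limit such as 50, one option is to size the reservation from the actual lengths. This is only a sketch; `make_event_name` is a hypothetical name and it has not been measured against the functions above.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: reserves exactly prefix length + name length,
// so the two appends should not need to reallocate.
std::string make_event_name(const char* name) {
    static const char prefix[] = "com.domain.events.";
    std::string full;
    full.reserve(sizeof(prefix) - 1 + std::strlen(name)); // -1 drops the '\0'
    full += prefix;
    full += name;
    return full;
}
```

The extra `strlen` call is not free, so whether this beats the plain `+` version is again something only a measurement can tell.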
You didn't mention the third alternative: not pre-allocating anything at all in the string and just letting the optimizer do what it's best at.
Given these two functions, func1 and func3:
void func1(const char* text) {
std::string s;
s.reserve(50);
s += "com.domain.event.";
s += text;
std::cout << s;
}
void func3(const char* text) {
std::string s;
s += "com.domain.event.";
s += text;
std::cout << s;
}
It can be seen in the example at http://goo.gl/m8h2Ks that the gcc assembly for func1 just for reserving space for 50 characters will add an additional 3 instructions compared to when no pre-allocation is done in func3. One of the calls is a string append call, which in turn will give some overhead:
leaq 16(%rsp), %rdi
movl $50, %esi
call std::basic_string<char>::append(char const*, unsigned long)
Looking at the code alone doesn't guarantee that func3 is faster than func1, though, just because it has fewer instructions. Cache behaviour and other factors also contribute to the actual performance, which can only be properly assessed by measuring, as others pointed out.
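Besides reading assembly, one cheap way to see what the reservation buys is to watch the capacity while appending. The sketch below uses a hypothetical function `count_regrowths`; the count without a reservation depends on the standard library's growth policy and small-string buffer, so no particular value is assumed for that case.

```cpp
#include <cstddef>
#include <string>

// Hypothetical probe: counts how often the capacity changes while the
// two pieces are appended, with or without an up-front reserve(50).
int count_regrowths(bool pre_reserve) {
    std::string s;
    if (pre_reserve) {
        s.reserve(50);
    }
    int regrowths = 0;
    std::size_t last = s.capacity();
    const char* pieces[] = {"com.domain.event.", "click"};
    for (const char* piece : pieces) {
        s += piece;
        if (s.capacity() != last) { // the string had to grow its storage
            ++regrowths;
            last = s.capacity();
        }
    }
    return regrowths;
}
```

With the reservation the appends fit in the existing storage, so the count is zero. Without it, the result tells you how many growth steps your implementation needed for this input.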
Related
Trying out the example in Section 5.9.2 Class monotonic_buffer_resource of the following article on Polymorphic Memory Resources by Pablo Halpern :
Doc No: N3816
Date: 2013-10-13
Author: Pablo Halpern
phalpern@halpernwightsoftware.com
Polymorphic Memory Resources - r1
(Originally N3525 – Polymorphic Allocators)
The article claims that :
The monotonic_buffer_resource class is designed for very fast memory allocations
in situations where memory is used to build up a few objects and then is released all
at once when those objects go out of scope.
and that :
A particularly good use for a monotonic_buffer_resource is to provide memory for
a local variable of container or string type. For example, the following code
concatenates two strings, looks for the word “hello” in the concatenated string, and
then discards the concatenated string after the word is found or not found. The
concatenated string is expected to be no more than 80 bytes long, so the code is
optimized for these short strings using a small monotonic_buffer_resource [...]
I've benchmarked the example using the Google Benchmark library and Boost.Container 1.69's polymorphic resources, compiled and linked to release binaries with g++-8 on an Ubuntu 18.04 LTS Hyper-V virtual machine, with the following code:
// overload using pmr::string
static bool find_hello(const boost::container::pmr::string& s1, const boost::container::pmr::string& s2)
{
using namespace boost::container;
char buffer[80];
pmr::monotonic_buffer_resource m(buffer, 80);
pmr::string s(&m);
s.reserve(s1.length() + s2.length());
s += s1;
s += s2;
return s.find("hello") != pmr::string::npos;
}
// overload using std::string
static bool find_hello(const std::string& s1, const std::string& s2)
{
std::string s{};
s.reserve(s1.length() + s2.length());
s += s1;
s += s2;
return s.find("hello") != std::string::npos;
}
static void allocator_local_string(::benchmark::State& state)
{
CLEAR_CACHE(2 << 12);
using namespace boost::container;
pmr::string s1(35, 'c'), s2(37, 'd');
for (auto _ : state)
{
::benchmark::DoNotOptimize(find_hello(s1, s2));
}
}
// pmr::string with monotonic buffer resource benchmark registration
BENCHMARK(allocator_local_string)->Repetitions(5);
static void allocator_global_string(::benchmark::State& state)
{
CLEAR_CACHE(2 << 12);
std::string s1(35, 'c'), s2(37, 'd');
for (auto _ : state)
{
::benchmark::DoNotOptimize(find_hello(s1, s2));
}
}
// std::string using std::allocator and global allocator benchmark registration
BENCHMARK(allocator_global_string)->Repetitions(5);
Here are the results :
How is the pmr::string benchmark so slow compared to std::string?
I assume std::string's std::allocator should use "new" on the reserve call, and construct each character afterwards when calling :
s += s1;
s += s2;
Comparing that to a pmr::string using a polymorphic allocator that holds the monotonic_buffer_resource, reserving memory should boil down to simply pointer arithmetic, necessitating no "new" as the char buffer should be sufficient. Subsequently, it would construct each character as std::string does.
So, considering that the only differing operations between the pmr::string version of find_hello and the std::string version of find_hello is the call to reserve memory, with pmr::string using stack allocation and std::string using heap allocation :
Is my benchmark wrong?
Is my interpretation of how allocation should occur wrong?
Why is the pmr::string benchmark approximately 5 times slower than the std::string benchmark?
There is a combination of things that makes boost pmr::basic_string slower:
Construction of the pmr::monotonic_buffer_resource has some cost (17 nano-seconds here).
pmr::basic_string::reserve reserves more than was requested. It reserves 96 bytes in this case, which is more than the 80 bytes you have in the buffer.
Reserving in pmr::basic_string is not free, even when the buffer is big enough (extra 8 nano-seconds here).
The concatenation of strings is costly (extra 64 ns here).
pmr::basic_string::find has a suboptimal implementation. This is the real cause of the poor speed. GCC's std::basic_string::find uses __builtin_memchr to find the first character that might match, whereas Boost does it all in one big loop. Apparently this is the main cost, and what makes Boost run slower than std.
So, after increasing the buffer and comparing boost::container::string with boost::container::pmr::string, the pmr version comes out slightly slower (293 ns vs. 276 ns). This is because new and delete are actually quite fast in such micro-benchmarks, and are faster than the complicated machinery of the pmr (just 17 ns for construction). In fact, the default Linux/gcc new/delete reuse the same pointer again and again. This optimization has a very simple and fast implementation that also works well with the CPU cache.
As a proof, try this out (without optimization):
for (int i=0 ; i < 10 ; ++i)
{
char * ptr = new char[96];
std::cout << (void*) ptr << '\n';
delete[] ptr;
}
This prints the same pointer again and again.
The theory is that in a real program, where new/delete don't behave that nicely and can't reuse the same block again and again, new/delete slow down execution much more and cache locality becomes quite poor. In such a case the pmr plus a local buffer is worth it.
Conclusion: the implementation of boost pmr string is slower than gcc's string. The pmr machinery is slightly more costly than the default and simple scenario of the new/delete.
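For comparison, here is a sketch of the same function written against the C++17 standard library's `std::pmr` instead of Boost (a swapped-in library, so the timings above do not apply to it). The name `find_hello_pmr` and the 256-byte buffer are assumptions; the buffer is deliberately larger than the expected string so that a reservation that rounds up still fits.

```cpp
#include <cstddef>
#include <memory_resource>
#include <string>
#include <string_view>

// Hypothetical std::pmr variant of find_hello. The buffer size (256) is
// an assumption chosen to leave slack above the ~72 characters needed.
bool find_hello_pmr(std::string_view s1, std::string_view s2) {
    std::byte buffer[256];
    // Allocations are carved out of `buffer`; if it runs out, the resource
    // falls back to its upstream (the default resource, i.e. the heap).
    std::pmr::monotonic_buffer_resource pool(buffer, sizeof(buffer));
    std::pmr::string s(&pool);
    s.reserve(s1.size() + s2.size());
    s.append(s1);
    s.append(s2);
    return s.find("hello") != std::pmr::string::npos;
}
```

Whether this is faster than plain std::string on a given platform still has to be benchmarked; the points above about cheap new/delete in micro-benchmarks apply here too.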
I'm working in a codebase with a mixture of CString, const char* and std::string (non-unicode), where all new code uses std::string exclusively. I've now had to do the following:
{
CString tempstring;
load_cstring_legacy_method(tempstring);
stdstring = tempstring;
}
and worry about performance. The strings are DNA sequences so we can easily have 100+ of them with each of them ~3M characters. Note that adjusting load_cstring_legacy_method is not an option. I did a quick test:
// 3M
const int stringsize = 3000000;
const int repeat = 1000;
std::chrono::steady_clock::time_point startTime = std::chrono::steady_clock::now();
for ( int i = 0; i < repeat; ++i ){
CString cstring('A', stringsize);
std::string stdstring(cstring); // Comment out
cstring.Empty();
}
std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - startTime).count() << " ms" << std::endl;
Commenting out the std::string line gives 850 ms; with the assignment it's 3600 ms. The magnitude of the difference is surprising, so I suspect the benchmark might not be doing what I expect. Assuming there is a penalty, is there a way I can avoid it?
So your question is to make the std::string construction faster?
On my machine, comparing this
std::string stdstring(cstring); // 4741 ms
I get better performance this way:
std::string stdstring(cstring, stringsize); // 3419 ms
or if the std::string already exists like the first part of your question suggests:
stdstring.assign(cstring, stringsize); // 3408 ms
Use a more efficient memory allocator. Something like a memory arena/region would substantially help with allocation costs.
If you're really, really desperate, you could theoretically combine ReleaseBuffer with some hideous allocator hacks to avoid the copy altogether. This would involve a lot of pain, though.
In addition, if you have a serious problem, you could consider changing your string implementation. The std::string that ships with Visual Studio employs SSO, or Small String Optimization. This does exactly what it sounds like: it optimizes very small strings, which are quite common in general but not necessarily helpful for this use case. Another implementation, such as COW, could be more appropriate (be super careful if doing so in a multi-threaded environment).
Finally, if you're using an old version of VS, you should also consider upgrading. Move semantics are a huge instawin as far as performance goes.
CString is probably the Unicode version, which explains the slowness. The generic conversion routine cannot assume that the characters used are limited to "ACGT".
You can, however, and you can shamelessly take advantage of that.
{
CString tempstring;
load_cstring_legacy_method(tempstring);
int len = tempstring.GetLength();
stdstring.reserve(len);
for(int i = 0; i != len; ++i)
{
stdstring.push_back(static_cast<char>(tempstring[i]));
}
}
Portable? Only so far as CString is, so Windows variants.
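Since CString only exists on Windows, here is the same idea as a portable sketch in which std::wstring stands in for a wide-character CString. That substitution and the function name `narrow_ascii` are assumptions made for illustration; the per-character cast is only valid when every character is known to fit in a char, as with "ACGT".

```cpp
#include <string>

// Hypothetical stand-in: std::wstring plays the role of a Unicode CString.
// Each wide character is narrowed with a plain cast, which is only correct
// when the input is known to be ASCII (e.g. DNA letters).
std::string narrow_ascii(const std::wstring& wide) {
    std::string out;
    out.reserve(wide.size()); // one allocation for the whole sequence
    for (wchar_t ch : wide) {
        out.push_back(static_cast<char>(ch));
    }
    return out;
}
```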
I have the following mock up code of a class which uses an attribute to set a filename:
#include <iostream>
#include <iomanip>
#include <sstream>
class Test {
public:
Test() { id_ = 1; }
/* Code which modifies ID */
void save() {
std::string filename ("file_");
filename += getID();
std::cout << "Saving into: " << filename <<'\n';
}
private:
const std::string getID() {
std::ostringstream oss;
oss << std::setw(4) << std::setfill('0') << id_;
return oss.str();
}
int id_;
};
int main () {
Test t;
t.save();
}
My concern is about the getID method. At first sight it seems pretty inefficient since I am creating the ostringstream and its corresponding string to return. My questions:
1) Since it returns const std::string is the compiler (GCC in my case) able to optimize it?
2) Is there any way to improve the performance of the code? Maybe move semantics or something like that?
Thank you!
Creating an ostringstream, just once, prior to an expensive operation like opening a file, doesn't matter to your program's efficiency at all, so don't worry about it.
However, you should worry about one bad habit exhibited in your code. To your credit, you seem to have identified it already:
1) Since it returns const std::string is the compiler (GCC in my case) able to optimize it?
2) Is there any way to improve the performance of the code? Maybe move semantics or something like that?
Yes. Consider:
class Test {
// ...
const std::string getID();
};
int main() {
std::string x;
Test t;
x = t.getID(); // HERE
}
On the line marked // HERE, which assignment operator is called? We want to call the move assignment operator, but that operator is prototyped as
string& operator=(string&&);
and the argument we're actually passing to our operator= is of type "reference to an rvalue of type const string" — i.e., const string&&. The rules of const-correctness prevent us from silently converting that const string&& to a string&&, so when the compiler is creating the set of assignment-operator functions it's possible to use here (the overload set), it must exclude the move-assignment operator that takes string&&.
Therefore, x = t.getID(); ends up calling the copy-assignment operator (since const string&& can safely be converted to const string&), and you make an extra copy that could have been avoided if only you hadn't gotten into the bad habit of const-qualifying your return types.
Also, of course, the getID() member function should probably be declared as const, because it doesn't need to modify the *this object.
So the proper prototype is:
class Test {
// ...
std::string getID() const;
};
The rule of thumb is: Always return by value, and never return by const value.
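The overload-resolution argument above can be checked with a small sketch. `Tracker` and the four functions are hypothetical; the type only counts which assignment operator ran on the target object.

```cpp
// Hypothetical type that records which assignment operator was used.
struct Tracker {
    int copy_assigned = 0;
    int move_assigned = 0;
    Tracker() = default;
    Tracker(const Tracker&) = default;
    Tracker(Tracker&&) = default;
    Tracker& operator=(const Tracker&) { ++copy_assigned; return *this; }
    Tracker& operator=(Tracker&&) noexcept { ++move_assigned; return *this; }
};

const Tracker make_const() { return Tracker(); } // const-qualified return
Tracker make_plain() { return Tracker(); }       // plain return by value

// Assigning from a const rvalue can only bind to operator=(const Tracker&).
Tracker assign_from_const() {
    Tracker x;
    x = make_const();
    return x;
}

// Assigning from a non-const rvalue selects operator=(Tracker&&).
Tracker assign_from_plain() {
    Tracker x;
    x = make_plain();
    return x;
}
```

Inspecting the returned objects shows one copy assignment for the const version and one move assignment for the plain one.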
1) Since it returns const std::string is the compiler (GCC in my case)
able to optimize it?
It makes no sense to return a const object unless you are returning by reference.
2) Is there any way to improve the performance of the code? Maybe move
semantics or something like that?
If id_ does not change, just create the value in the constructor. A static method may help:
static std::string format_id(int id) {
std::ostringstream oss;
oss << std::setw(4) << std::setfill('0') << id;
return oss.str();
}
And then:
Test::Test()
: id_(1)
, id_str_(format_id(id_))
{ }
Update:
This answer is not entirely valid for the problem, because id_ does change. I will not remove it because someone may find it useful for their case. Anyway, I wanted to clarify some points:
format_id must be static in order to be used in the member initializer list.
There was a mistake in the code (now corrected), which used the member variable id_.
It makes no sense to return a const object by value, because returning by value will just copy (ignoring optimizations) the result to a new variable, which is in the scope of the caller (and might be not const).
My advice
One option is to update the id_str_ field any time id_ changes (you must have a setter for id_). Given that you're already changing the member id_, I assume there will be no issue updating another.
This approach allows getID() to be implemented as a simple getter (it should be const, by the way) with no performance issues, and the string field is computed only when the ID changes.
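A minimal sketch of that advice follows. The class name `CachedId` and the members `set_id` and `id_string` are hypothetical, since the question does not show how id_ is modified.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical class: the formatted ID is recomputed only in the setter,
// so the getter is a cheap const accessor.
class CachedId {
public:
    explicit CachedId(int id) : id_(id), id_str_(format_id(id)) {}

    void set_id(int id) {
        id_ = id;
        id_str_ = format_id(id); // keep the cached string in sync
    }

    const std::string& id_string() const { return id_str_; }

private:
    static std::string format_id(int id) {
        std::ostringstream oss;
        oss << std::setw(4) << std::setfill('0') << id;
        return oss.str();
    }

    int id_;
    std::string id_str_;
};
```

Returning a const reference avoids a copy per call, at the cost that callers must not keep the reference beyond the object's lifetime.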
One possibility would be to do something like this:
std::string getID(int id) {
std::string ret(4, '0');
ret += std::to_string(id); // e.g. "0000" + "7" gives "00007"
return ret.substr(ret.length() - 4); // keep only the last four characters
}
If you're using an implementation that includes the short string optimization (e.g., VC++) chances are pretty good that this will give a substantial speed improvement (a quick test with VC++ shows it at around 4-5 times as fast).
OTOH, if you're using an implementation that does not include short string optimization, chances are pretty good it'll be substantially slower. For example, running the same test with g++ produces code that's about 4-5 times slower.
One more point: if your ID number might be more than 4 digits long, this doesn't give the same behavior--it always returns a string of exactly 4 characters rather than the minimum of 4 created by the stringstream code. If your ID numbers may exceed 9999, then this code simply won't work for you.
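If IDs above 9999 are possible, a variant that pads only when the number is shorter than four digits keeps the "minimum of 4" behaviour. This is a sketch under the assumption that IDs are non-negative; `pad_id` is a hypothetical name and it has not been timed against the versions above.

```cpp
#include <string>

// Hypothetical variant: left-pads with '0' up to a minimum width of 4
// and leaves longer numbers untouched. Assumes id >= 0.
std::string pad_id(int id) {
    const std::string digits = std::to_string(id);
    if (digits.size() >= 4) {
        return digits;
    }
    return std::string(4 - digits.size(), '0') + digits;
}
```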
You could change getID in this way:
std::string getID() {
thread_local std::ostringstream oss;
oss.str(""); // replaces the buffer contents with the given (empty) string
oss.clear(); // resets the error flags
oss << std::setw(4) << std::setfill('0') << id_;
return oss.str();
}
This way it won't create a new ostringstream every single time.
In your case it isn't worth it (as Chris Dodd says, opening a file and writing to it is likely to be 10-100x more expensive); this is just for reference.
Also consider that in any reasonable library implementation std::to_string will be at least as fast as stringstream.
1) Since it returns const std::string is the compiler (GCC in my case)
able to optimize it?
There is a rationale for this practice, but it's essentially obsolete (e.g. Herb Sutter recommended returning const values for non-primitive types).
With C++11 it is strongly advised to return values as non-const so that you can take full advantage of rvalue references.
About this topic you can take a look at:
Purpose of returning by const value?
Should I return const objects?
It seems to me that defining the << operator (operator<<) to work directly with strings is more elegant than having to work with ostringstreams and then converting back to strings. Is there a reason why C++ doesn't do this out of the box?
#include <string>
#include <sstream>
#include <iostream>
using namespace std;
template <class T>
string& operator<<(string& s, T a) {
ostringstream ss;
ss << a;
s.append(ss.str());
return s;
}
int main() {
string s;
// this prints out: "inserting text and a number (1)"
cout << (s << "inserting text and a number (" << 1 << ")\n");
// normal way
ostringstream os;
os << "inserting text and a number(" << 1 << ")\n";
cout << os.str();
}
Streams contain additional state. Imagine if this were possible:
std::string str;
int n = 1234;
str << std::hex;
str << n;
return str; // would return "4d2" (std::hex alone adds no "0x" prefix; that needs std::showbase)
In order to maintain this additional state, strings would have to have storage for this state. The C++ standards committee (and C++ programmers in general) have generally frowned upon superfluous resource consumption, under the motto "pay only for what you use". So, no extra fields in the string class.
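The state in question is easy to observe on a real stream. In this sketch (the function name `hex_twice` is hypothetical), the std::hex flag set by the first insertion is still in effect for the second one.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical demo: formatting flags live in the stream object and
// persist across insertions.
std::string hex_twice(int a, int b) {
    std::ostringstream oss;
    oss << std::hex << a; // switches the stream to hexadecimal
    oss << ' ' << b;      // still hexadecimal: the flag was stored
    return oss.str();
}
```

`hex_twice(1234, 255)` yields "4d2 ff"; a plain std::string has nowhere to keep that flag between the two operations.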
The subjective answer: I think the std::string class was quite poorly designed to begin with, especially compared to other parts of C++'s excellent standard library, and adding features to std::string is just going to make things worse. This is a very subjective opinion, so feel free to dismiss me as a raving lunatic.
The problem with the idea of strings being output streams is that they would become too heavy.
Strings are intended to "hold string data", not to format output. Output streams have a heavy "state" which can be manipulated (see <iomanip>) and thus has to be stored. If strings were output streams, that state would have to be stored for every string in every program, even though almost none of them would be used as an output stream; it would be a huge waste of resources.
C++ follows the "zero overhead" design principle (or at least no more overhead than strictly necessary). A standard string class that carried this unnecessary overhead would be a huge violation of that principle. If that were the case, what would people do in overhead-critical cases? Use C-strings... ouch!
In C++11, an alternative is to use the operator+= with std::to_string to append to a string, which can also be chained like the operator<< of the output stream. You can wrap both += and to_string in a nice operator<< for string if you like:
template <class Number>
std::string& operator<<(std::string& s, Number a) {
return s += std::to_string(a);
}
std::string& operator<<(std::string& s, const char* a) {
return s += a;
}
std::string& operator<<(std::string& s, const std::string &a) {
return s += a;
}
Your example, updated using this method: http://ideone.com/4zbVtD
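Since the linked example may not stay available, here is a self-contained sketch of the same idea. It differs from the code above in two labelled ways: the operators live in a hypothetical namespace `string_ops`, and the template is constrained to arithmetic types with std::enable_if so it does not try to call std::to_string on unrelated types.

```cpp
#include <string>
#include <type_traits>

namespace string_ops { // hypothetical namespace, for illustration only

// Numbers go through std::to_string; the enable_if constraint is an
// addition not present in the answer above.
template <class Number,
          typename std::enable_if<std::is_arithmetic<Number>::value, int>::type = 0>
std::string& operator<<(std::string& s, Number a) {
    return s += std::to_string(a);
}

inline std::string& operator<<(std::string& s, const char* a) {
    return s += a;
}

inline std::string& operator<<(std::string& s, const std::string& a) {
    return s += a;
}

} // namespace string_ops

// Usage: chained appends, each returning the same string.
std::string describe(int n) {
    using namespace string_ops;
    std::string s;
    s << "inserting text and a number (" << n << ")";
    return s;
}
```

Note that a char argument counts as arithmetic here and would be appended as its numeric code, which may not be what you want.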
Probably lost in the depths of time now but formatted output was always associated with streams in C (since they didn't have "real" strings) and this may have been carried over into C++ (which was, after all, C with classes). In C, the way to format to a string is to use sprintf, a variation on fprintf, the output-to-stream function.
This is obviously conjecture on my part, but someone probably thought, as you do, that these formatting features of streams would be brilliant to have on strings as well, so they subclassed the stream classes to produce one that used a string as its "output".
That seems the elegant solution to getting it working as quickly as possible. Otherwise, you would have had formatting code duplicated in streams and strings.
How do I add or subtract the numeric values held in strings? For example:
std::string number_string;
std::string total;
std::cout << "Enter value to add";
std::getline(std::cin, number_string);
total = number_string + number_string;
std::cout << total;
This just appends one string to the other, so it doesn't work. I know I could use the int data type, but I need to use string.
You can use atoi(number_string.c_str()) to convert the string to an integer.
If you are concerned about properly handling non-numeric input, strtol is a better choice, albeit a little more wordy. http://www.cplusplus.com/reference/cstdlib/strtol/
You will want to work with integers the entire time, and then convert to a std::string at the very end.
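As a sketch of what the strtol route looks like with error checking, consider the following. The wrapper name `parse_long` and its bool-plus-out-parameter interface are assumptions chosen for illustration.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical wrapper: returns true and stores the value only if the
// whole string is a valid base-10 number that fits in a long.
bool parse_long(const std::string& text, long& out) {
    errno = 0;
    char* end = nullptr;
    const long value = std::strtol(text.c_str(), &end, 10);
    if (end == text.c_str()) return false; // no digits were consumed
    if (*end != '\0') return false;        // trailing non-numeric input
    if (errno == ERANGE) return false;     // value out of range for long
    out = value;
    return true;
}
```

By contrast, atoi gives no way to tell a genuine 0 from a failed conversion.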
Here is a solution that works if you have a C++11 capable compiler:
#include <string>
std::string sum(std::string const & old_total, std::string const & input) {
int const total = std::stoi(old_total);
int const addend = std::stoi(input);
return std::to_string(total + addend);
}
Otherwise, use boost:
#include <string>
#include <boost/lexical_cast.hpp>
std::string sum(std::string const & old_total, std::string const & input) {
int const total = boost::lexical_cast<int>(old_total);
int const addend = boost::lexical_cast<int>(input);
return boost::lexical_cast<std::string>(total + addend);
}
The function first converts each std::string into an int (a step that you will have to do, no matter what approach you take), then adds them, and then converts the result back to a std::string. Languages such as PHP, which try to guess what you mean and add the values, do the same thing under the hood anyway.
Both of these solutions have a number of advantages. They are faster, they report their errors with exceptions rather than silently appearing to work, and they don't require extra intermediary conversions.
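To illustrate the point about exceptions, here is a sketch of how a caller might handle bad input with the C++11 version. The function `try_sum` is a hypothetical wrapper; it adds in long long so that the sum of two ints cannot overflow.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical wrapper: returns false instead of propagating the
// exceptions that std::stoi throws on bad input.
bool try_sum(const std::string& a, const std::string& b, std::string& out) {
    try {
        const long long total =
            static_cast<long long>(std::stoi(a)) + std::stoi(b);
        out = std::to_string(total);
        return true;
    } catch (const std::invalid_argument&) {
        return false; // no leading digits to convert, e.g. "abc"
    } catch (const std::out_of_range&) {
        return false; // the number does not fit in an int
    }
}
```

One caveat: std::stoi stops at the first non-digit, so input like "12abc" converts to 12 without an exception. Pass its optional position argument if you need to reject that.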
The Boost solution does require a bit of work to set up, but it is definitely worth it. Boost is probably the most important tool of any C++ developer's work, except maybe the compiler. You will need it for other things because they have already done top-notch work solving many problems that you will have in the future, so it is best for you to start getting experience with it. The work required to install Boost is much less than the time you will save by using it.