Are arguments copied if they are stored in the following cases?
Literal string passed:
std::string globalStr;
void store(const std::string &str)
{
globalStr = str;
}
store("Literal");
Variable string passed:
std::string globalStr;
void store(const std::string &str)
{
globalStr = str;
}
store(varStr);
And what if globalStr is a reference?
std::string &globalStr;
void store(const std::string &str)
{
globalStr = str;
}
store("Literal"); //Should this cause any problem?
store(varStr);
Does C++ optimize to prevent making unnecessary copies in any of the above cases?
Does C++ optimize to prevent making unnecessary copies in any of the above cases?
No.
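The standard just guarantees that, in the 1st and 2nd cases, the value of str will be copied to globalStr, using std::string::operator=().
Depending on your standard library implementation, a deep copy might be avoided if std::string uses a copy-on-write optimization (some pre-C++11 implementations did; the C++11 rules effectively rule it out).
The 3rd case will not compile, because a reference must be initialized when it is declared. Even once initialized, a reference cannot be reseated: globalStr = str would assign to the object it refers to.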
In the first two examples, you are binding a reference: void store(const std::string &str). That in itself does not copy.
But the statement that assigns to the global variable, globalStr = str;, does make a copy.
In your third example, std::string &globalStr; does not compile: a reference needs to be initialized!
You can try to set up for a move by writing:
void store(std::string &&str)
{
globalStr = std::move(str);
}
store(std::move(varStr));
As in any case involving compiler optimization: it depends on the compiler.
If the compiler is programmed to (and can in the situation given to it) prove to itself that a copy is unnecessary, it won't perform that copy.
For your particular question, neither of the first two methods could perform unnecessary copies anyway, because the inputs are passed by reference into the store function. They are copied when you assign them to globalStr, but that copy is necessary because globalStr is a value variable; it is its own copy, and therefore cannot point to some other copy.
And this line:
std::string &globalStr;
can't compile. A reference must be assigned something to refer to at the time it is declared.
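A minimal sketch of what the third case looks like once the reference is initialized. The names target, globalRef and storeThroughRef are hypothetical and not from the question. It suggests that assigning through the reference still copies the string into the referred-to object rather than rebinding the reference:

```cpp
#include <cassert>
#include <string>

std::string target = "initial";   // hypothetical object for the reference to bind to
std::string &globalRef = target;  // OK: the reference is initialized at its declaration

void storeThroughRef(const std::string &str)
{
    globalRef = str;  // copies str into target; globalRef still refers to target
}

void demoReference()
{
    std::string varStr = "Variable";
    storeThroughRef("Literal");     // the temporary std::string lives until the call ends
    assert(target == "Literal");
    storeThroughRef(varStr);
    assert(target == "Variable");
    assert(&globalRef == &target);  // the reference was never rebound
}
```

Because the data is copied into target, passing a literal is not a problem here; it would only become one if the address of the temporary were kept.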
Related
I just started working with c++11 r-values. I read some tutorials, but I haven't found the answer.
What is the best way (the most efficient way) to set a class variable? Is the code below correct? (Let's assume std::string has a move constructor and a move assignment operator defined.)
class StringWrapper
{
private:
std::string str_;
public:
StringWrapper() : str_("") {}
void setString1(std::string&& str) {
str_ = std::move(str);
}
void setString2(const std::string& str) {
str_ = std::move(str);
}
// other possibility?
};
int main() {
std::string myText("text");
StringWrapper x1, x2;
x1.setString?("text"); // I guess here should be setString1
x2.setString?(myText); // I guess here should be setString2
}
I know that the compiler can optimize my code and/or that I can use overloaded functions. I'd just like to know what the best way is.
Herb Sutter's advice on this is to start with the standard C++98 approach:
void setString(const std::string& str) {
str_ = str;
}
And if you need to optimize for rvalues add an overload that takes an rvalue reference:
void setString(std::string&& str) noexcept {
str_ = std::move(str);
}
Note that most implementations of std::string use the small string optimization so that if your strings are small a move is the same as a copy anyway and you wouldn't get any benefit.
It is tempting to use pass-by-value and then move (as in Adam Hunyadi's answer) to avoid having to write multiple overloads. But Herb pointed out that it does not re-use any existing capacity of str_. If you call it multiple times with lvalues it will allocate a new string each time. If you have a const std::string& overload then it can re-use existing capacity and avoid allocations.
If you are really clever you can use a templated setter that uses perfect forwarding but to get it completely correct is actually quite complicated.
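To see which operation each of the two overloads triggers, here is a small sketch. Counted is a hypothetical stand-in for std::string that only counts copies and moves, and Wrapper is an illustrative class, so this shows the overload selection rather than real allocation behaviour:

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for std::string that counts copy and move operations.
struct Counted {
    static int copies;
    static int moves;
    Counted() = default;
    Counted(const Counted &) { ++copies; }
    Counted(Counted &&) noexcept { ++moves; }
    Counted &operator=(const Counted &) { ++copies; return *this; }
    Counted &operator=(Counted &&) noexcept { ++moves; return *this; }
};
int Counted::copies = 0;
int Counted::moves = 0;

class Wrapper {
    Counted value_;
public:
    void set(const Counted &v) { value_ = v; }                 // chosen for lvalues
    void set(Counted &&v) noexcept { value_ = std::move(v); }  // chosen for rvalues
};

void demoOverloads()
{
    Wrapper w;
    Counted c;
    Counted::copies = Counted::moves = 0;
    w.set(c);             // lvalue: one copy assignment
    assert(Counted::copies == 1 && Counted::moves == 0);
    w.set(std::move(c));  // rvalue: one move assignment, no copy
    assert(Counted::copies == 1 && Counted::moves == 1);
}
```

With a real std::string the copy-assignment path is also the one that can reuse the existing capacity of the member, which is the point made above.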
Compiler designers are clever folk. Use the crystal clear and therefore maintainable
void setString(const std::string& str) {
str_ = str;
}
and let the compiler worry about optimisations. Pretty please, with sugar on top.
Better still, don't masquerade code as being encapsulated. If you intend to provide such a method, then why not simply make str_ public? (Unless you intend to make other adjustments to your object if the member changes.)
Finally, why don't you like the default constructor of std::string? Ditch str_("").
The version taking an rvalue reference will not bind to an lvalue (in your case, myText); you would have to std::move it, which leaves the caller's string in a moved-from state that is risky to keep using. The version taking a const lvalue reference should be slower when called with an rvalue, because the temporary is constructed and then copied instead of being moved.
The compiler could possibly optimize the overhead away though.
Your best bet would actually be:
void setString(std::string str)
{
str_ = std::move(str);
}
Here the parameter is guaranteed to be initialized with the copy constructor for lvalue arguments and with the move constructor for rvalue arguments.
Update:
Chris Dew pointed out that constructing and move assigning a string is actually more expensive than copy constructing. I am now convinced that using a const& argument is the better option. :D
You could use a templated setString with forwarding references:
class StringWrapper
{
private:
std::string str_;
public:
template<typename T>
void setString(T&& str) {
str_ = std::forward<T>(str);
}
};
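One possible refinement of the forwarding setter, offered only as a sketch: constrain the template so it accepts only arguments that can actually be assigned to a std::string. The class name ForwardingWrapper and the get() accessor are made up for the illustration, and the constraint does not cover every subtlety of a fully correct forwarding setter:

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

class ForwardingWrapper {
    std::string str_;
public:
    // Participates in overload resolution only when
    // `str_ = std::forward<T>(value)` is well-formed.
    template <typename T,
              typename = typename std::enable_if<
                  std::is_assignable<std::string &, T &&>::value>::type>
    void setString(T &&value)
    {
        str_ = std::forward<T>(value);  // copy for lvalues, move for rvalues
    }

    const std::string &get() const { return str_; }
};

void demoForwarding()
{
    ForwardingWrapper w;
    std::string s = "lvalue";
    w.setString(s);                         // T = std::string&, copy assignment
    assert(w.get() == "lvalue" && s == "lvalue");
    w.setString("literal");                 // T = const char(&)[8], no temporary string
    assert(w.get() == "literal");
    w.setString(std::string("temporary"));  // T = std::string, move assignment
    assert(w.get() == "temporary");
}
```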
The following function accepts a string as an argument, and returns another, after some processing.
Is it fair enough to assume that the compiler will perform move optimizations, and I will not end up having the contents of string copied after every invocation? Should this function follow the copy elision [(N)RVO]?
Is this, as a practice, advisable?
std::string foo(std::string const& s)
{ // Perform sanity check on s
// ...
std::stringstream ss;
// do something and store in ss
// ...
return ss.str();
}
Because, otherwise, I generally follow the practice of returning strings by reference. So, to say, my function signature would have been:
void foo (std::string const& inValue, std::string& outValue);
ss.str() will create a temporary string. If you are assigning that string to a new instance like
std::string bar = foo("something");
then either copy elision or move semantics will kick in.
Now if you have an already created string and you are assigning the return value of foo to it, then move assignment will kick in:
std::string bar;
// do stuff
bar = foo("something");
I prefer this method because it does not require you to have an object already created, whereas
void foo (std::string const& inValue, std::string& outValue);
would have you create an empty string just to pass it to the function to be filled. This means you have a construction and an assignment, where with the first example you could have just a construction.
According to this it is more optimized when you return the value:
the Return Value Optimization (RVO), allows the compiler to optimize away the copy by having the caller and the callee use the same chunk of memory for both “copies”.
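A rough way to check this for yourself is to return a type that counts its copies. Tracked and makeTracked below are hypothetical names used only for this sketch. Since C++11 a local returned by value is either elided or moved, so the copy counter is expected to stay at zero:

```cpp
#include <cassert>

// Hypothetical type that counts copies; moves are allowed but not counted.
struct Tracked {
    static int copies;
    Tracked() = default;
    Tracked(const Tracked &) { ++copies; }
    Tracked(Tracked &&) noexcept {}
    Tracked &operator=(const Tracked &) { ++copies; return *this; }
    Tracked &operator=(Tracked &&) noexcept { return *this; }
};
int Tracked::copies = 0;

Tracked makeTracked()
{
    Tracked local;
    return local;  // elided (NRVO) or, failing that, moved; never copied
}

void demoReturnByValue()
{
    Tracked::copies = 0;
    Tracked a = makeTracked();  // initialization: elision or move construction
    Tracked b;
    b = makeTracked();          // assignment from a temporary: move assignment
    (void)a;
    assert(Tracked::copies == 0);
}
```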
While I was reading my C++ book and programming some of the examples, a question came to my mind.
...
private:
const string someString;
public:
MyClass(const string& someString) : someString(someString) {}
const string& getSomeString() const { return someString; }
...
Does declaring someString as a reference actually make a difference?
...
private:
const string& someString;
public:
MyClass(const string& someString) : someString(someString) {}
const string& getSomeString() const { return someString; }
...
If so, what are the advantages/disadvantages or use cases (since both examples compile fine)?
The latter will easily lead to dangling references as it just points to some object not controlled by your class. So I would avoid that. (As always, unless you have a good reason.)
Also a notable difference: In the second case, the string in your class "will change" if the string used to construct it does, as you are only referencing it. This would not happen in the first case as you own your own copy of the string.
If you want a string of your own, don't store a reference to somebody else's string.
In the second case, the lifetime of the string you're storing a reference to must exceed that of the object you're storing it in.
For example,
MyClass instance("bad");
would leave a dangling reference in the member.
You also have the less fatal but confusing spooky action at a distance:
std::string s = "Hello";
MyClass instance(s);
s = "World";
std::cout << instance.getSomeString(); // Prints 'World'
Reference members are very rarely a good solution, in my experience.
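The difference can be put side by side in a short sketch. OwningHolder and ReferencingHolder are hypothetical names standing in for the two versions of MyClass from the question:

```cpp
#include <cassert>
#include <string>

class OwningHolder {
    const std::string value_;   // the class keeps its own copy
public:
    explicit OwningHolder(const std::string &v) : value_(v) {}
    const std::string &get() const { return value_; }
};

class ReferencingHolder {
    const std::string &value_;  // the class only observes the caller's string
public:
    explicit ReferencingHolder(const std::string &v) : value_(v) {}
    const std::string &get() const { return value_; }
};

void demoHolders()
{
    std::string s = "Hello";
    OwningHolder owning(s);
    ReferencingHolder referencing(s);
    s = "World";
    assert(owning.get() == "Hello");       // unaffected by the later change
    assert(referencing.get() == "World");  // sees the change through the reference
}
```

Constructing ReferencingHolder from a literal or any other temporary would leave value_ dangling, which is why the sketch only ever binds it to a named string that outlives it.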
The string referenced in the constructor comes from outside the class. The reference member is only valid as long as the original string is valid. Consider this example:
MyClass *p;
{
string temp = "hello";
p = new MyClass(temp);
}
cout << p->getSomeString(); // reference to destroyed object
This code is wrong because the string temp which is referred to in the class no longer exists.
The problem can manifest more subtly.
const char *text = "Hello";
MyClass c(text);
cout << c.getSomeString(); // reference to destroyed object
This code is also wrong because the temporary std::string object created for the constructor call no longer exists by the time of the next line.
If you declare someString as a const string it will contain that value that is passed to it in the constructor.
However, with someString being a const string&, it holds the address of a string which is stored somewhere outside of the class, which the class can't guarantee will still exist at any point in the future, so you should avoid this one.
If someString should not be an observer, then by storing a reference you're implying the wrong semantics. I would avoid that.
On the other hand, if it should be an observer then you obviously do have to store a reference.
There is a set of good rules for deciding whether to pass by value or by const reference:
If the function intends to change the argument as a side effect, take it by non-const reference.
If the function doesn't modify its argument and the argument is of primitive type, take it by value.
Otherwise take it by const reference, except in the following case: if the function would then need to make a copy of the const reference anyway, take it by value.
How do these rules apply to the constructor and setter below?
class A
{
public:
A(string str) : mStr(str) {} // here which is better,
// pass by value or const reference?
void setString(string str) { mStr = str; } // how about here?
private:
string mStr;
};
In this particular case, and assuming C++11 and move construction/assignment for strings, you should take the argument by value and move it to the member for the constructor.
A::A(string str) : mStr(std::move(str)) {}
The case of the setter is a bit trickier and I am not sure whether you really want/need to optimize every bit of it... If you want to optimize the most you could provide two overloads, one taking an rvalue reference and another taking a const lvalue reference. At any rate, the const lvalue reference is probably a good enough approach:
void A::setString(string const& str) { mStr = str; }
Why the difference?
In the case of the constructor, the member is not yet built, so it is going to need to allocate memory. You can move that memory allocation (and actual copying of the data, but that is the lesser cost) to the interface, so that if the caller has a temporary it can be forwarded without an additional memory allocation.
In the case of assignment, things are a bit more complicated. If the current capacity of the string is large enough to hold the new value, then no allocation is required, but if the string is not large enough, then it will need to reallocate. If the allocation is moved to the interface (by-value argument), it will always be executed, even when it is unnecessary. If the allocation is done inside the function (const reference argument), then for a small set of cases (those where the argument is a temporary that is larger than the current buffer) an allocation that could otherwise have been avoided would be done.
The article you cite is not a good reference for software
engineering. (It is also likely out of date, given that it
talks about move semantics and is dated from 2003.)
The general rule is simple: pass class types by const reference,
and other types by value. There are explicit exceptions: in
keeping with the conventions of the standard library, it is also
usual to pass iterators and functional objects by value.
Anything else is optimization, and shouldn't be undertaken until
the profiler says you have to.
In this case it is better to pass the argument by const reference. Reason: string is a class type, you don't modify it, and it can be arbitrarily big.
It is always better to use the member initialization list (MIL) to initialize your values, as it provides the following advantages:
1) In the assignment version, mStr is first default-constructed and then assigned a new value on top of the default-constructed one. Using the MIL avoids this wasted construction because the arguments in the list are used as constructor arguments.
2) It's the only place to initialize const members, unless these are just integers, for which you can use an enum in the class: enum T{v=2};
3) It's the place to initialize references.
This is what I would suggest:
#include <iostream>
#include <string>
class A
{
private:
std::string mStr;
public:
A(std::string str):mStr(str){}
// if you are using a standard older than C++11, then take
// const std::string &str instead
void setString(std::string str)
{
mStr = str;
}
const std::string getString() const
{
return mStr;
}
};
int main()
{
A a("this is a 1st test.");
a.setString("this is a 2nd test.");
std::cout<<a.getString()<<std::endl;
}
Have a look at this:
http://www.cplusplus.com/forum/articles/17820/
I want to know if the compiler is allowed to automatically use the move constructor for wstring in the following setter method (without an explicit call to std::move):
void SetString(std::wstring str)
{
m_str = str; // Will str be moved into m_str automatically or is std::move(str) needed?
}
From what I've read it sounds as though the compiler is not allowed to make this decision since str is an lvalue, but it seems pretty obvious that using move here would not change program behavior.
Barring move, will some other sort of copy elision be applied?
[is] the compiler [...] allowed to automatically use the move constructor
Yes, it would be nice. But this is not only an optimization, this has real impact on the language.
Consider a move-only type like unique_ptr:
std::unique_ptr<int> f()
{
std::unique_ptr<int> up;
return up; // this is ok although unique_ptr is non-copyable.
}
Let's assume your rule were included in the C++ standard, called the rule of "argument's last occurrence".
void SetString(std::unique_ptr<int> data)
{
m_data = data; // this must be ok because this is "argument's last occurrence"
}
Checking if an identifier is used in a return is easy. Checking if it is "argument's last occurrence" isn't.
void SetString(std::unique_ptr<int> data)
{
if (condition) {
m_data = data; // this is argument's last occurrence
} else {
data.foo();
m_data = data; // this is argument's last occurrence too
}
// many lines of code without access to data
}
This is valid code too. So each compiler would be required to check for "argument's last occurrence", which isn't an easy thing. To do so, it would have to scan the whole function just to decide if the first line is valid. It is also difficult for a human to reason about if you have to scroll two pages down to check this.
No, the compiler isn't allowed to in C++11. And it probably won't be allowed to in future standards, because this feature is very difficult to implement in compilers in general, and it is just a convenience for the user.
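As a concrete illustration of why the explicit std::move matters, here is a minimal sketch of a setter that takes a move-only parameter. DataHolder, SetData and get are hypothetical names, not taken from the question:

```cpp
#include <cassert>
#include <memory>
#include <utility>

class DataHolder {
    std::unique_ptr<int> m_data;
public:
    void SetData(std::unique_ptr<int> data)
    {
        // m_data = data;          // would not compile: unique_ptr is not copyable
        m_data = std::move(data);  // a named parameter is an lvalue, so the move is explicit
    }
    const int *get() const { return m_data.get(); }
};

void demoUniquePtrSetter()
{
    DataHolder holder;
    std::unique_ptr<int> p(new int(42));
    holder.SetData(std::move(p));  // the caller must also opt in to giving up ownership
    assert(p == nullptr);          // a moved-from unique_ptr is guaranteed to be empty
    assert(holder.get() != nullptr && *holder.get() == 42);
}
```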
No, move semantics will not be used here, since str could still be used in the code that follows. In fact, even if the parameter were an rvalue reference, you would still have to force the move with std::move. If you want to use move semantics, I would advise taking wstring&& str in the function and then using std::move.
No, the compiler is not allowed to, for several reasons, not only because it is difficult to do. Copy and move can have side effects, and you need to know when you can expect each to be used. For example, it is well known that returning a local object will move it: you expect that, it is documented, and it is OK.
So, we have the following possibilities:
Your example:
void SetString(std::wstring str)
{
m_str = str;
}
For r-values: a move (or elided construction) into str, plus a copy into m_str. For l-values: a copy into str and a copy into m_str.
We can do it “better” manually:
void SetString( std::wstring str)
{
m_str = std::move(str);
}
For r-values: a move (or elided construction) into str, plus a move into m_str. For l-values: a copy into str and a move into m_str.
If for some reason (you want it to compile without C++11 without changes, but automatically take advantage of C++11 when porting the code?) you don't want to “manually optimize” the code, you can do:
void SetString(const std::wstring& str)
{
m_str = str;
}
For r-values: a reference bound in str, plus a copy into m_str. For l-values: a reference bound in str and a copy into m_str. Never two copies.
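The three variants can be compared with a counting type. This is only a sketch: Probe is a hypothetical stand-in for std::wstring, and the function names are invented for the comparison. The rvalue calls use std::move on a named object so that no construction is elided and the counts are deterministic:

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for std::wstring that counts copies and moves.
struct Probe {
    static int copies;
    static int moves;
    Probe() = default;
    Probe(const Probe &) { ++copies; }
    Probe(Probe &&) noexcept { ++moves; }
    Probe &operator=(const Probe &) { ++copies; return *this; }
    Probe &operator=(Probe &&) noexcept { ++moves; return *this; }
};
int Probe::copies = 0;
int Probe::moves = 0;

Probe m_value;

void SetByValueCopy(Probe p) { m_value = p; }             // by value, then copy
void SetByValueMove(Probe p) { m_value = std::move(p); }  // by value, then move
void SetByConstRef(const Probe &p) { m_value = p; }       // const reference, then copy

void reset() { Probe::copies = Probe::moves = 0; }

void demoSetters()
{
    Probe x;

    reset(); SetByValueCopy(x);             // lvalue: copy + copy
    assert(Probe::copies == 2 && Probe::moves == 0);
    reset(); SetByValueCopy(std::move(x));  // rvalue: move + copy
    assert(Probe::copies == 1 && Probe::moves == 1);

    reset(); SetByValueMove(x);             // lvalue: copy + move
    assert(Probe::copies == 1 && Probe::moves == 1);
    reset(); SetByValueMove(std::move(x));  // rvalue: move + move
    assert(Probe::copies == 0 && Probe::moves == 2);

    reset(); SetByConstRef(x);              // lvalue: one copy
    assert(Probe::copies == 1 && Probe::moves == 0);
    reset(); SetByConstRef(std::move(x));   // rvalue: still one copy
    assert(Probe::copies == 1 && Probe::moves == 0);
}
```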