Pass by value or const reference? [duplicate] - c++

This question already has answers here:
Are the days of passing const std::string & as a parameter over?
(13 answers)
Closed 9 years ago.
There is a good set of rules for deciding whether to pass by value or by const reference:
If the function intends to change the argument as a side effect, take
it by non-const reference.
If the function doesn't modify its argument and the argument is of
primitive type, take it by value.
Otherwise take it by const reference, except in the following
cases: If the function would then need to make a copy of the const
reference anyway, take it by value.
For a constructor like the following, how do I decide?
class A
{
public:
A(string str) : mStr(str) {} // here which is better,
// pass by value or const reference?
void setString(string str) { mStr = str; } // how about here?
private:
string mStr;
};

In this particular case, and assuming C++11 and move construction/assignment for strings, you should take the argument by value and move it to the member for the constructor.
A::A(string str) : mStr(std::move(str)) {}
The case of the setter is a bit trickier and I am not sure whether you really want/need to optimize every bit of it... If you want to optimize the most you could provide two overloads, one taking an rvalue reference and another taking a const lvalue reference. At any rate, the const lvalue reference is probably a good enough approach:
void A::setString(string const& str) { mStr = str; }
Why the difference?
In the case of the constructor, the member is not yet built, so it is going to need to allocate memory. You can move that memory allocation (and actual copying of the data, but that is the lesser cost) to the interface, so that if the caller has a temporary it can be forwarded without an additional memory allocation.
In the case of assignment things are a bit more complicated. If the current capacity of the string is large enough to hold the new value, then no allocation is required, but if it is not large enough, the string will need to reallocate. If the allocation is moved to the interface (by-value argument), it will always be executed, even when it is unnecessary. If the allocation is done inside the function (const reference argument), then for a small set of cases (those where the argument is a temporary that is larger than the current buffer) an allocation that could otherwise have been avoided will be done.
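A minimal sketch of the two recommendations above, assuming C++11. It reuses the question's class A; the accessor str() and the rvalue setter overload are additions for illustration (the latter is the optional "optimize the most" variant mentioned in the answer).

```cpp
#include <string>
#include <utility>

class A {
public:
    // Constructor: mStr does not exist yet, so an allocation is needed anyway.
    // Taking by value lets a caller with a temporary hand over its buffer.
    explicit A(std::string str) : mStr(std::move(str)) {}

    // Setter, lvalue case: copy-assign, so mStr can reuse its existing
    // buffer when its capacity is already large enough.
    void setString(const std::string& str) { mStr = str; }

    // Setter, rvalue case (optional overload): take over the temporary's buffer.
    void setString(std::string&& str) { mStr = std::move(str); }

    // Accessor added for this sketch only.
    const std::string& str() const { return mStr; }

private:
    std::string mStr;
};
```

With only the const reference setter, a temporary argument is copied; whether that matters is something to measure rather than assume.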

The article you cite is not a good reference for software
engineering. (It is also likely out of date, given that it
talks about move semantics and is dated from 2003.)
The general rule is simple: pass class types by const reference,
and other types by value. There are explicit exceptions: in
keeping with the conventions of the standard library, it is also
usual to pass iterators and functional objects by value.
Anything else is optimization, and shouldn't be undertaken until
the profiler says you have to.

In this case it is better to pass the argument by const reference. Reason: string is a class type, you don't modify it, and it can be arbitrarily large.

It is always better to use the member initialization list (MIL) to initialize your members, as it has the following advantages:
1) The assignment version default-constructs mStr and then assigns a new value on top of the default-constructed one. Using the MIL avoids this wasted construction because the arguments in the list are used as constructor arguments.
2) It's the only place in a constructor to initialize const members (since C++11 a default member initializer also works), unless these are just integers, for which you can use an enum in the class: enum T{v=2};
3) It's the place to initialize references.
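Points 2 and 3 can be sketched as follows. The class Logger and its members are hypothetical names chosen for illustration; the point is only that a const member and a reference member cannot be assigned in the constructor body.

```cpp
#include <string>

// Hypothetical class with a const member and a reference member.
class Logger {
public:
    Logger(const std::string& prefix, int& counter)
        : prefix_(prefix),   // const member: cannot be assigned in the body
          counter_(counter)  // reference member: must be bound here
    {}

    // Increments the caller's counter through the reference and
    // returns the prefix followed by the new count.
    std::string tag() {
        ++counter_;
        return prefix_ + std::to_string(counter_);
    }

private:
    const std::string prefix_;
    int& counter_;
};
```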
This is what I would suggest:
#include <iostream>
#include <string>
class A
{
private:
std::string mStr;
public:
A(std::string str):mStr(str){}
//if you are using a compiler older than C++11, then take
//const std::string &str instead
void setString(std::string str)
{
mStr = str;
}
const std::string& getString() const
{
return mStr;
}
};
int main()
{
A a("this is a 1st test.");
a.setString("this is a 2nd test.");
std::cout<<a.getString()<<std::endl;
}
Have a look at this:
http://www.cplusplus.com/forum/articles/17820/

Related

Moving parameter to data member: Take parameter by copy or rvalue-ref?

I have a class object entity which gobbles up a string and shoves it into its member on construction (for the sake of this argument this could be any old member function).
Now I can do this in at least two ways: I might accept a std::string copy and move this into a member (inefficient_entity). However I could also directly take by rvalue-ref and just continue the move into the data member (efficient_entity).
Is there a performance difference between the two? I'm asking because it's way more convenient to take by copy and let the call site decide if it wants to move or copy the string. The other way, I would probably need to create an overload set, which can grow huge given that my constructor could also accept multiple arguments in the same manner.
Will this be optimized out anyway or do I have to worry?
Demo
#include <string>
#include <cstdio>
struct efficient_entity
{
efficient_entity(std::string&& str)
: str_ { std::move(str) }
{ }
std::string str_;
};
struct inefficient_entity
{
inefficient_entity(std::string str)
: str_ { std::move(str) }
{ }
std::string str_;
};
int main()
{
std::string create_here = "I guarantee you this string is absolutely huge!";
efficient_entity(std::move(create_here));
inefficient_entity(std::move(create_here));
}
Note: Afaik there was a talk on cppcon which covered exactly that but I can't find it anymore (appreciated if someone could drop a hint).
Passing by value costs you one extra move.
Passing by reference, on the other hand, requires you to write two functions: const T & or T &&. Or 2N functions for N parameters, or a template.
I prefer the first one by default, and use the second one in low-level utilities that should be fast.
Having just a single T && overload is viable, but highly unorthodox. This gives you one extra move only for copies, which have to be written as func(std::string(value)) (or func(auto(value)) in C++23).
The supposed benefit here is that all copies are explicit. I would only do this for heavy types, and only if you like this style.
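A small sketch of the rvalue-only style, assuming C++11. Entity, make_copy_of and make_by_moving are hypothetical names; the static_asserts only document which argument categories the constructor accepts.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical sink type that only accepts rvalues.
struct Entity {
    explicit Entity(std::string&& s) : str_(std::move(s)) {}
    std::string str_;
};

// An lvalue cannot be passed directly, so every copy is visible at the call site.
static_assert(!std::is_constructible<Entity, std::string&>::value,
              "lvalues are rejected");
static_assert(std::is_constructible<Entity, std::string>::value,
              "rvalues are accepted");

// Explicit copy at the call site, followed by one move into the member.
inline Entity make_copy_of(const std::string& value) {
    return Entity(std::string(value));
}

// No copy at all; 'value' is left in a moved-from state.
inline Entity make_by_moving(std::string& value) {
    return Entity(std::move(value));
}
```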

Should a std::string class member be a pointer?

And why/why not?
Say I have a class which takes a string in the constructor and stores it. Should this class member be a pointer, or just a value?
class X {
X(const std::string& s): s(s) {}
const std::string s;
};
Or...
class X {
X(const std::string* s): s(s) {}
const std::string* s;
};
If I was storing a primitive type, I'd take a copy. If I was storing an object, I'd use a pointer.
I feel like I want to copy that string, but I don't know when to decide that. Should I copy vectors? Sets? Maps? Entire JSON files...?
EDIT:
Sounds like I need to read up on move semantics. But regardless, I'd like to make my question a little more specific:
If I have a 10 megabyte file as a const string, I really don't want to copy that.
If I'm newing up 100 objects, passing a 5 character const string into each one's constructor, none of them ought to have ownership. Probably just take a copy of the string.
So (assuming I'm not completely wrong) it's obvious what to do from outside the class, but when you're designing class GenericTextHaver, how do you decide the method of text-having?
If all you need is a class that takes a const string in its constructor, and allows you to get a const string with the same value out of it, how do you decide how to represent it internally?
Should a std::string class member be a pointer?
No
And why not?
Because std::string, like every other object in the standard library, and every other well-written object in c++ is designed to be treated as a value.
It may or may not use pointers internally - that is not your concern. All you need to know is that it's beautifully written and behaves extremely efficiently (actually more efficient than you can probably imagine right now) when treated like a value... particularly if you use move-construction.
I feel like I want to copy that string, but I don't know when to decide that. Should I copy vectors? Sets? Maps? Entire JSON files...?
Yes. A well-written class has "value semantics" (this means it's designed to be treated like a value) - therefore copied and moved.
Once upon a time, when I was first writing code, pointers were often the most efficient way to get a computer to do something quickly. These days, with memory caches, pipelines and prefetching, copying is almost always faster. (yes, really!)
In a multi-processor environment, copying is very much faster in all but the most extreme cases.
If I have a 10 megabyte file as a const string, I really don't want to copy that.
If you need a copy of it, then copy it. If you really just mean to move it, then std::move it.
If I'm newing up 100 objects, passing a 5 character const string into each one's constructor, none of them ought to have ownership. Probably just take a copy of the string.
A 5-character string is so cheap to copy that you should not even think about it. Just copy it. Believe it or not, std::string is written with the full knowledge that most strings are short, and they're often copied. There won't even be any memory allocation involved.
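The "no memory allocation" claim refers to the small-string optimization. The helper below is a rough probe, not a portable guarantee: stored_inline is a hypothetical name, and the standard does not require implementations to store short strings inside the object, although the common ones do.

```cpp
#include <cstdint>
#include <string>

// Returns true if the character buffer lives inside the std::string object
// itself, i.e. no separate heap block is used for the characters.
// Assumption: comparing addresses as integers is meaningful on the target
// platform, which holds on mainstream flat-memory systems.
inline bool stored_inline(const std::string& s) {
    const auto obj   = reinterpret_cast<std::uintptr_t>(&s);
    const auto chars = reinterpret_cast<std::uintptr_t>(s.data());
    return chars >= obj && chars < obj + sizeof(std::string);
}
```

On implementations with the optimization, a 5-character string typically reports true, while a string of a thousand characters cannot fit in the object and reports false.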
So (assuming I'm not completely wrong) it's obvious what to do from outside the class, but when you're designing class GenericTextHaver, how do you decide the method of text-having?
Express the code in the most elegant way you can that succinctly conveys your intent. Let the compiler make decisions about how the machine code will look - that's its job. Hundreds of thousands of people have given their time to ensure that it does that job better than you ever will.
If all you need is a class that takes a const string in its constructor, and allows you to get a const string with the same value out of it, how do you decide how to represent it internally?
In almost all cases, store a copy. If 2 instances actually need to share the same string then consider something else, like a std::shared_ptr. But in that case, they probably would not only need to share a string so the 'shared state' should be encapsulated in some other object (ideally with value semantics!)
OK, stop talking - show me how the class should look
class X {
public:
// either like this - take a copy and move into place
X(std::string s) : s(std::move(s)) {}
// or like this - which gives a *minuscule* performance improvement in a
// few corner cases
/*
X(const std::string& s) : s(s) {} // from a const ref
X(std::string&& s) : s(std::move(s)) {} // from an r-value reference
*/
// ok - you made s const, so this whole class is now not assignable
const std::string s;
// another way is to have a private member and a const accessor
// you will then be able to assign an X to another X if you wish
/*
const std::string& value() const {
return s;
}
private:
std::string s;
*/
};
If the constructor truly "takes a string and stores it", then of course your class needs to contain a std::string data member. A pointer would only point at some other string that you don't actually own, let alone "store":
struct X
{
explicit X(std::string s) : s_(std::move(s)) {}
std::string s_;
};
Note that since we're taking ownership of the string, we may as well take it by value and then move from the constructor argument.
In most cases you will want to store a copy. If X only held a pointer and the std::string were destroyed outside of X, X would not know about it, and using the pointer would be undefined behavior. However, if we want to do this without taking any copies, a natural thing to do might be to use std::unique_ptr<std::string> and call std::move on it:
class X {
public:
std::unique_ptr<std::string> m_str;
X(std::unique_ptr<std::string> str)
: m_str(std::move(str)) { }
};
By doing this, note that the original std::unique_ptr will be empty. The ownership of the data has been transferred. The nice thing about this is that it protects the data without needing the overhead of a copy.
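The ownership transfer described above can be sketched like this, assuming C++11. The accessor text() is an addition for illustration and is not part of the answer's class.

```cpp
#include <memory>
#include <string>
#include <utility>

class X {
public:
    // Takes ownership of the string; the caller's unique_ptr ends up empty.
    explicit X(std::unique_ptr<std::string> str) : m_str(std::move(str)) {}

    // Accessor added for this sketch only. Assumes a non-null pointer was passed.
    const std::string& text() const { return *m_str; }

private:
    std::unique_ptr<std::string> m_str;
};
```

A caller writes X x(std::move(p)); afterwards p compares equal to nullptr and x is the sole owner of the string.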
Alternately, if you still want it accessible from the outside world, you can use an std::shared_ptr<std::string> instead, though in this case care must be taken.
Yes, generally speaking, it is fine to have a class that holds a pointer to an object but you will need to implement a more complex behaviour in order to make your class safe. First, as one of the previous responders noticed it is dangerous to keep a pointer to the outside string as it can be destroyed without the knowledge of the class X. This means that the initializing string must be copied or moved when an instance of X is constructed. Secondly, since the member X.s now points to the string object allocated on the heap (with operator new), the class X needs a destructor to do the proper clean-up:
class X {
public:
X(const string& val) {
cout << "copied " << val << endl;
s = new string(val);
}
X(string&& val) {
cout << "moved " << val << endl;
s = new string(std::move(val));
}
~X() {
delete s;
}
private:
const string *s;
};
int main() {
string s = "hello world";
X x1(s); // copy
X x2("hello world"); // move
return 0;
}
Note, that theoretically you can have a constructor that takes a const string* as well. It will require more checks (nullptr), will support only copy semantics, and it may look as follows:
X(const string* val) : s(nullptr) {
if(val != nullptr)
s = new string(*val);
}
These are techniques. When you design your class the specifics of the problem at hand will dictate whether to have a value or a pointer member.
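One caveat with the raw-pointer version: a destructor alone is not enough, because the compiler-generated copy operations would copy the pointer and both objects would later delete the same string. A possible completion, assuming deep-copy semantics are wanted (the copy operations and the value() accessor are additions, not part of the answer):

```cpp
#include <string>
#include <utility>

class X {
public:
    explicit X(const std::string& val) : s(new std::string(val)) {}
    explicit X(std::string&& val) : s(new std::string(std::move(val))) {}

    // Deep copy, so each X owns its own string.
    X(const X& other) : s(new std::string(*other.s)) {}

    // Copy-and-swap: 'other' is already a copy, so take over its pointer;
    // the old string is released when 'other' is destroyed.
    X& operator=(X other) {
        std::swap(s, other.s);
        return *this;
    }

    ~X() { delete s; }

    // Accessor added for this sketch only.
    const std::string& value() const { return *s; }

private:
    const std::string* s;
};
```

This is a lot of code to get what a plain std::string member provides for free, which is the argument made by the other answers.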

C++ How to detect when nullptr is passed to function where std::string is expected?

I could not find threads giving a clear answer on this.
I have a constructor, for example:
FanBookPost::FanBookPost(Fan* owner, std::string content);
Fan is another class in my code, but content is the problematic one:
Since std::string is not a pointer, I would expect that calling the constructor with (fan, nullptr) would not be possible. However, it compiles! ... and crashes at runtime with EXC_BAD_ACCESS.
Is this avoidable?
The problem here is that the crash will occur when the constructor of std::string is called (as a nullptr interpreted as const char* is accessed there). There is nothing you can do about this except tell other people not to do it. It's not a problem of your constructor and thus not your responsibility (besides, you can't prevent it).
What you're observing is that you can implicitly create a std::string from a character pointer (nullptr in this case) which is then passed to the function. However creating a string from a null pointer is not allowed. There is nothing wrong with your method signature, just the client use that violates the std::string constructor's contract.
How about using a proxy / wrapper type, if you really want to be safe:
template<typename T>
struct e_t
{
public:
inline e_t ( e_t const & other )
: m_value( other.m_value )
{}
inline T & value( void ) { return m_value; }
inline operator T&() { return m_value; }
inline e_t( const T& c ) : m_value( c ) {}
private:
T m_value;
};
void FanBookPost(int* owner, e_t<std::string> content) {
}
int main()
{
int n = 0;
//FanBookPost(&n, 0); // compiler error
//FanBookPost(&n, nullptr); // compiler error
//FanBookPost(&n, ""); // unfortunately compiler error too
FanBookPost(&n, std::string(""));
}
The problem is that std::string has a non-explicit ctor that takes a char const * as its sole (required) parameter. This gives an implicit conversion from nullptr to std::string, but gives undefined behavior, because that ctor specifically requires a non-null pointer.
There are a few ways to prevent this. Probably the most effective would be to take a (non-const) reference to a std::string, which will require passing a (non-temporary) string as the parameter.
FanBookPost::FanBookPost(Fan* owner, std::string &content);
This does have the unfortunate side effect of giving the function the ability to modify the string that's passed. It also means that (with a conforming compiler; see the note at the end) you won't be able to pass nullptr or a string literal to the function--you'll have to pass an actual instance of std::string.
If you want to be able to pass a string literal, you can then add an overload that takes a char const * parameter, and possibly one that takes a nullptr_t parameter as well. The former would check for a non-null pointer before creating a string and calling the function that takes a reference to a string, and the latter would do something like log the error and unconditionally kill the program (or, just possibly, log the error and throw an exception).
That's annoying and inconvenient, but may be superior to the current situation.
Note: unfortunately, the last time I checked, MS VC++ did not conform in this respect. It allows passing a temporary object by non-const reference. Normally that's fairly harmless (it just lets you modify the temporary, but that normally has no visible side effects). In this case it's much more troublesome though, since you're depending on it specifically to prevent passing a temporary object.
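A variation on the nullptr_t overload idea, assuming C++11: instead of failing at runtime, the overload can be deleted so that a literal nullptr (or a literal 0) is rejected at compile time while the by-value std::string signature stays unchanged. The Fan stand-in and the accessors are placeholders for this sketch. A null char const * held in a variable still reaches std::string and is still undefined behavior.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

struct Fan {};  // stand-in for the asker's Fan class

class FanBookPost {
public:
    FanBookPost(Fan* owner, std::string content)
        : owner_(owner), content_(std::move(content)) {}

    // Exact match for a literal nullptr, so overload resolution picks this
    // deleted overload and the call fails to compile.
    FanBookPost(Fan* owner, std::nullptr_t) = delete;

    // Accessors added for this sketch only.
    Fan* owner() const { return owner_; }
    const std::string& content() const { return content_; }

private:
    Fan* owner_;
    std::string content_;
};

static_assert(!std::is_constructible<FanBookPost, Fan*, std::nullptr_t>::value,
              "passing nullptr is rejected at compile time");
static_assert(std::is_constructible<FanBookPost, Fan*, const char*>::value,
              "string literals and char pointers still work");
```

C++23 applies the same idea inside the library by deleting the std::string constructor that takes std::nullptr_t.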

Should I always move on `sink` constructor or setter arguments?

struct TestConstRef {
std::string str;
TestConstRef(const std::string& mStr) : str{mStr} { }
};
struct TestMove {
std::string str;
TestMove(std::string mStr) : str{std::move(mStr)} { }
};
After watching GoingNative 2013, I understood that sink arguments should always be passed by value and moved with std::move. Is TestMove::ctor the correct way of applying this idiom? Is there any case where TestConstRef::ctor is better/more efficient?
What about trivial setters? Should I use the following idiom or pass a const std::string&?
struct TestSetter {
std::string str;
void setStr(std::string mStr) { str = std::move(mStr); }
};
The simple answer is: yes.
The reason is quite simple as well: if you store by value you might either need to move (from a temporary) or make a copy (from an l-value). Let us examine what happens in both situations, with both ways.
From a temporary
if you take the argument by const-ref, the temporary is bound to the const-ref and cannot be moved from again, thus you end up making a (useless) copy.
if you take the argument by value, the value is initialized from the temporary (moving), and then you yourself move from the argument, thus no copy is made.
One limitation: a class without an efficient move constructor (such as std::array<T, N>), because then you make two copies instead of one.
From an l-value (or const temporary, but who would do that...)
if you take the argument by const-ref, nothing happens there, and then you copy it (cannot move from it), thus a single copy is made.
if you take the argument by value, you copy it into the argument and then move from it, thus a single copy is made.
One limitation: the same... classes for which moving is akin to copying.
So, the simple answer is that in most cases, by using a sink you avoid unnecessary copies (replacing them by moves).
The single limitation is classes for which the move constructor is as expensive (or nearly as expensive) as the copy constructor; in which case having two moves instead of one copy is "worse". Thankfully, such classes are rare (arrays are one case).
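The counts in this analysis can be checked with an instrumented type. This is a sketch only: Tracer, Counts and measure are hypothetical helpers, and the rvalue case is written with std::move on a named object so that copy elision cannot change the numbers.

```cpp
#include <utility>

// Hypothetical instrumented type that counts its copies and moves.
struct Tracer {
    static int copies;
    static int moves;
    Tracer() = default;
    Tracer(const Tracer&) { ++copies; }
    Tracer(Tracer&&) noexcept { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

struct TestConstRef {
    Tracer t;
    explicit TestConstRef(const Tracer& x) : t(x) {}
};

struct TestMove {
    Tracer t;
    explicit TestMove(Tracer x) : t(std::move(x)) {}
};

struct Counts {
    int copies;
    int moves;
};

// Resets the counters, runs 'build', and reports what it cost.
template <typename F>
Counts measure(F build) {
    Tracer::copies = 0;
    Tracer::moves = 0;
    build();
    return Counts{Tracer::copies, Tracer::moves};
}
```

Constructing from std::move(a) costs one copy with TestConstRef and two moves with TestMove; constructing from the lvalue a costs one copy with TestConstRef and one copy plus one move with TestMove.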
A bit late, as this question already has an accepted answer, but anyways... here's an alternative:
struct Test {
std::string str;
Test(std::string&& mStr) : str{std::move(mStr)} { } // 1
Test(const std::string& mStr) : str{mStr} { } // 2
};
Why would that be better? Consider the two cases:
From a temporary (case // 1)
Only one move-constructor is called for str.
From an l-value (case // 2)
Only one copy-constructor is called for str.
It probably can't get any better than that.
But wait, there is more:
No additional code is generated on the caller's side! The calling of the copy- or move-constructor (which might be inlined or not) can now live in the implementation of the called function (here: Test::Test) and therefore only a single copy of that code is required. If you use by-value parameter passing, the caller is responsible for producing the object that is passed to the function. This might add up in large projects and I try to avoid it if possible.

Is it possible to take a parameter by const reference, while banning conversions so that temporaries aren't passed instead?

Sometimes we like to take a large parameter by reference, and also to make the reference const if possible to advertise that it is an input parameter. But by making the reference const, the compiler then allows itself to convert data if it's of the wrong type. This means it's not as efficient, but more worrying is the fact that I think I am referring to the original data; perhaps I will take its address, not realizing that I am, in effect, taking the address of a temporary.
The call to bar in this code fails. This is desirable, because the reference is not of the correct type. The call to bar_const is also of the wrong type, but it silently compiles. This is undesirable for me.
#include<vector>
using namespace std;
int vi;
void foo(int &) { }
void bar(long &) { }
void bar_const(const long &) { }
int main() {
foo(vi);
// bar(vi); // compiler error, as expected/desired
bar_const(vi);
}
What's the safest way to pass a lightweight, read-only reference? I'm tempted to create a new reference-like template.
(Obviously, int and long are very small types. But I have been caught out with larger structures which can be converted to each other. I don't want this to silently happen when I'm taking a const reference. Sometimes, marking the constructors as explicit helps, but that is not ideal)
Update: I imagine a system like the following: Imagine having two functions X byVal(); and X& byRef(); and the following block of code:
X x;
const_lvalue_ref<X> a = x; // I want this to compile
const_lvalue_ref<X> b = byVal(); // I want this to fail at compile time
const_lvalue_ref<X> c = byRef(); // I want this to compile
That example is based on local variables, but I want it to also work with parameters. I want to get some sort of error message if I'm accidentally passing a ref-to-temporary or a ref-to-a-copy when I think I'm passing something lightweight such as a ref-to-lvalue. This is just a 'coding standard' thing - if I actually want to allow passing a ref to a temporary, then I'll use a straightforward const X&. (I'm finding this piece on Boost's FOREACH to be quite useful.)
Well, if your "large parameter" is a class, the first thing to do is ensure that you mark any single parameter constructors explicit (apart from the copy constructor):
class BigType
{
public:
explicit BigType(int);
};
This applies to constructors which have default parameters which could potentially be called with a single argument, also.
Then it won't be automatically converted to since there are no implicit constructors for the compiler to use to do the conversion. You probably don't have any global conversion operators which make that type, but if you do, then
If that doesn't work for you, you could use some template magic, like:
template <typename T>
void func(const T &); // causes an undefined reference at link time.
template <>
void func(const BigType &v)
{
// use v.
}
If you can use C++11 (or parts thereof), this is easy:
void f(BigObject const& bo){
// ...
}
void f(BigObject&&) = delete; // or just undefined
This will work, because binding to an rvalue ref is preferred over binding to a reference-to-const for a temporary object.
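Applied to the question's own bar_const, the deleted overload might look like the sketch below (C++11 assumed). The trait can_call_bar_const is a hypothetical detection helper used only to state, at compile time, which calls are accepted.

```cpp
#include <type_traits>
#include <utility>

inline void bar_const(const long&) {}  // accepts lvalues of type long
void bar_const(long&&) = delete;       // catches temporaries, including converted ones

// Detection helper: true if bar_const(Arg) would compile.
template <typename Arg, typename = void>
struct can_call_bar_const : std::false_type {};

template <typename Arg>
struct can_call_bar_const<Arg, decltype(bar_const(std::declval<Arg>()))>
    : std::true_type {};

static_assert(can_call_bar_const<long&>::value,
              "a long lvalue binds directly");
static_assert(!can_call_bar_const<int&>::value,
              "int -> long would create a temporary, so the call is rejected");
static_assert(!can_call_bar_const<long>::value,
              "rvalues are rejected");
```

With these declarations, the question's bar_const(vi) call on an int no longer compiles silently.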
You can also exploit the fact that only a single user-defined conversion is allowed in an implicit conversion sequence:
struct BigObjWrapper{
BigObjWrapper(BigObject const& o)
: object(o) {}
BigObject const& object;
};
void f(BigObjWrapper wrap){
BigObject const& bo = wrap.object;
// ...
}
This is pretty simple to solve: stop taking values by reference. If you want to ensure that a parameter is addressable, then make it an address:
void bar_const(const long *) { }
That way, the user must pass a pointer. And you can't get a pointer to a temporary (unless the user is being terribly malicious).
That being said, I think your thinking on this matter is... wrongheaded. It comes down to this point.
perhaps I will take its address, not realizing that I am, in effect, taking the address of a temporary.
Taking the address of a const& that happens to be a temporary is actually fine. The problem is that you cannot store it long-term. Nor can you transfer ownership of it. After all, you got a const reference.
And that's part of the problem. If you take a const&, your interface is saying, "I'm allowed to use this object, but I do not own it, nor can I give ownership to someone else." Since you do not own the object, you cannot store it long-term. This is what const& means.
Taking a const* instead can be problematic. Why? Because you don't know where that pointer came from. Who owns this pointer? const& has a number of syntactic safeguards to prevent you from doing bad things (so long as you don't take its address). const* has nothing; you can copy that pointer to your heart's content. Your interface says nothing about whether you are allowed to own the object or transfer ownership to others.
This ambiguity is why C++11 has smart pointers like unique_ptr and shared_ptr. These pointers can describe real memory ownership relations.
If your function takes a unique_ptr by value, then you now own that object. If it takes a shared_ptr, then you now share ownership of that object. There are syntactic guarantees in place that ensure ownership (again, unless you take unpleasant steps).
In the event of your not using C++11, you should use Boost smart pointers to achieve similar effects.
You can't, and even if you could, it probably wouldn't help much.
Consider:
void another(long const& l)
{
bar_const(l);
}
Even if you could somehow prevent the binding to a temporary as input to
bar_const, functions like another could be called with the reference
bound to a temporary, and you'd end up in the same situation.
If you can't accept a temporary, you'll need to use a reference to a
non-const, or a pointer:
void bar_const(long const* l);
requires an lvalue to initialize it. Of course, a function like
void another(long const& l)
{
bar_const(&l);
}
will still cause problems. But if you globally adopt the convention to
use a pointer if object lifetime must extend beyond the end of the call,
then hopefully the author of another will think about why he's taking
the address, and avoid it.
I think your example with int and long is a bit of a red herring as in canonical C++ you will never pass builtin types by const reference anyway: You pass them by value or by non-const reference.
So let's assume instead that you have a large user defined class. In this case, if it's creating temporaries for you then that means you created implicit conversions for that class. All you have to do is mark all converting constructors (those that can be called with a single parameter) as explicit and the compiler will prevent those temporaries from being created automatically. For example:
class Foo
{
explicit Foo(int bar) { }
};
(Answering my own question thanks to this great answer on another question I asked. Thanks @hvd.)
In short, marking a function parameter as volatile means that it cannot be bound to an rvalue. (Can anybody nail down a standard quote for that? Temporaries can be bound to const&, but not to const volatile & apparently. This is what I get on g++-4.6.1. (Extra: see this extended comment stream for some gory details that are way over my head :-) ))
void foo( const volatile Input & input, Output & output) {
}
foo(input, output); // compiles. good
foo(get_input_as_value(), output); // compile failure, as desired.
But, you don't actually want the parameters to be volatile. So I've written a small wrapper to const_cast the volatile away. So the signature of foo becomes this instead:
void foo( const_lvalue<Input> input, Output & output) {
}
where the wrapper is:
template<typename T>
struct const_lvalue {
const T * t;
const_lvalue(const volatile T & t_) : t(const_cast<const T*>(&t_)) {}
const T* operator-> () const { return t; }
};
This can be created from an lvalue only.
Any downsides? It might mean that I accidentally misuse an object that is truly volatile, but then again I've never used volatile before in my life. So this is the right solution for me, I think.
I hope to get in the habit of doing this with all suitable parameters by default.
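For comparison, here is a sketch of the same wrapper that avoids volatile entirely by deleting the rvalue constructor, in the way std::ref rejects temporaries. This is a swapped-in technique, not the approach of the answer above; Input and read_size are placeholders.

```cpp
#include <memory>
#include <type_traits>

// Wrapper that can only be created from an lvalue.
template <typename T>
struct const_lvalue {
    const T* t;
    const_lvalue(const T& t_) : t(std::addressof(t_)) {}
    const_lvalue(const T&&) = delete;  // temporaries, const or not, land here
    const T* operator->() const { return t; }
    const T& operator*() const { return *t; }
};

struct Input { int size; };  // placeholder for the answer's Input type

// Takes the wrapper by value, like the answer's foo.
inline int read_size(const_lvalue<Input> input) { return input->size; }

static_assert(std::is_convertible<Input&, const_lvalue<Input>>::value,
              "lvalues are accepted");
static_assert(std::is_convertible<const Input&, const_lvalue<Input>>::value,
              "const lvalues are accepted");
static_assert(!std::is_convertible<Input, const_lvalue<Input>>::value,
              "temporaries are rejected");
```

Because T is fixed by the class template, const T&& is an ordinary rvalue reference here rather than a forwarding reference.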