Efficient way to construct object with string member - c++

Suppose I have a class ThisHasAStringMember. Assume it has a private member which is a string, and I want to efficiently get the string value, preferring a move over a copy where possible. Would the following two constructors accomplish that?
class ThisHasAStringMember
{
public:
// ctors
ThisHasAStringMember(const std::string str) : m_str(str) {}
ThisHasAStringMember(std::string &&str) : m_str(std::move(str)) {}
// getter (no setter)
std::string value() { return m_str; }
private:
std::string m_str;
};
Do I need the double ampersand before the str parameter in the second constructor?
Is this the right way to accomplish this?

First, it's better to mark your constructors as explicit.
Second, change the first constructor in your solution to take a const reference, so that passing an lvalue does not cost an extra copy:
// ctors
ThisHasAStringMember(const std::string& str) : m_str(str) {}
ThisHasAStringMember(std::string &&str) : m_str(std::move(str)) {}
This approach is optimal from a performance point of view (one copy constructor call for an lvalue and one move constructor call for an rvalue), but it is tedious to write two constructors every time. And if you have N such members, you need 2^N constructors.
There are a few alternatives:
A single constructor that takes the parameter by value. Yes, this was inefficient in C++98, but in C++11, when you need a full copy anyway, it is an option.
ThisHasAStringMember(std::string str) : m_str(std::move(str)) {}
When an lvalue is passed there will be one copy constructor call and one move constructor call. When an rvalue is passed there will be two move constructor calls. Yes, you have one extra move constructor call in each case, but a move is often very cheap (or may even be optimized away by the compiler) and the code is very simple.
A single constructor that takes the parameter by rvalue reference:
ThisHasAStringMember(std::string&& str) : m_str(std::move(str)) {}
If you pass an lvalue, you have to copy it explicitly at the call site, e.g.
ThisHasAStringMember(copy(someStringVar)) (here copy is a simple function template that returns a copy of its argument). You will still have one extra move constructor call for lvalues. For rvalues there is no overhead. Personally I like this approach: every place where the parameter is copied is explicit, so you won't make accidental copies in performance-critical places.
Make the constructor a template and use perfect forwarding:
template <typename String,
std::enable_if_t<std::is_constructible_v<std::string, String>>* = nullptr>
ThisHasAStringMember(String&& str) : m_str(std::forward<String>(str))
{}
You will have no overhead for either rvalues or lvalues, but the constructor has to be a template, which in most cases means defining it in a header.
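The copy and move counts quoted above can be checked with a small sketch. Everything here is illustrative and not part of the question's code: Counted is a hypothetical stand-in for std::string that counts its own copy and move constructor calls, and TwoOverloads, ByValue, Forwarding, Tally, from_lvalue and from_rvalue are names I made up for the comparison.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for std::string that counts copies and moves.
struct Counted {
    static inline int copies = 0;
    static inline int moves = 0;
    Counted() = default;
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) noexcept { ++moves; }
    static void reset() { copies = moves = 0; }
};

// The const& / && overload pair.
struct TwoOverloads {
    explicit TwoOverloads(const Counted& c) : m(c) {}
    explicit TwoOverloads(Counted&& c) : m(std::move(c)) {}
    Counted m;
};

// A single by-value constructor that moves from its parameter.
struct ByValue {
    explicit ByValue(Counted c) : m(std::move(c)) {}
    Counted m;
};

// A constrained perfect-forwarding constructor.
struct Forwarding {
    template <typename T,
              std::enable_if_t<std::is_constructible_v<Counted, T>>* = nullptr>
    explicit Forwarding(T&& c) : m(std::forward<T>(c)) {}
    Counted m;
};

struct Tally { int copies; int moves; };

// Constructs a Holder from an lvalue and reports what it cost.
template <typename Holder>
Tally from_lvalue() {
    Counted src;
    Counted::reset();
    Holder h(src);
    (void)h;
    return {Counted::copies, Counted::moves};
}

// Constructs a Holder from an rvalue (xvalue) and reports what it cost.
template <typename Holder>
Tally from_rvalue() {
    Counted src;
    Counted::reset();
    Holder h(std::move(src));
    (void)h;
    return {Counted::copies, Counted::moves};
}
```

With this setup the by-value constructor pays exactly one extra move in both cases, while the overload pair and the forwarding constructor pay a single copy or a single move.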

Related

Copy constructor called endlessly from lambda capture group

I wrote the following class to create values of any type which are either fixed or recalculated every time the call operator is used on them:
template <typename T>
class DynamicValue {
private:
std::variant<T, std::function<T()>> getter;
public:
DynamicValue(const T& constant) : getter(constant){};
template <typename F, typename = std::enable_if_t<std::is_invocable_v<F>>>
DynamicValue(F&& function) : getter(function) {}
DynamicValue(const T* pointer) : DynamicValue([pointer]() { return *pointer; }) {}
DynamicValue(const DynamicValue& value) : getter(value.getter) {}
DynamicValue(DynamicValue& value) : DynamicValue((const DynamicValue&) value) {}
~DynamicValue() {}
T operator()() const { return getter.index() == 0 ? std::get<T>(getter) : std::get<std::function<T()>>(getter)(); }
};
I also wrote this function, which takes a DynamicValue<int> and returns another DynamicValue<int> which returns its value plus 1:
DynamicValue<int> plus1(DynamicValue<int> a) {
return [a] { return a() + 1; };
}
However, when I attempt to use it, the program crashes:
DynamicValue<int> a = 1;
DynamicValue<int> b = plus1(a);
You can try a live example here.
After some testing, I think the problem lies in the copy constructor, which is being called endlessly, but I'm not sure how to fix it. How can I avoid this behavior?
Some important pieces of this code:
A lambda captures a DynamicValue object by value (copy).
The lambda is used to initialize a std::variant as a std::function alternative.
There is no explicit move constructor for DynamicValue, so the template for invocable objects is used as the move constructor.
The problematic code path starts with the request to construct a DynamicValue object from the lambda. This invokes the template constructor, which attempts to copy the lambda into the std::function alternative of the variant. So far, so good. Copying (not moving) the lambda copies the captured object without problems.
However, this procedure works when the CopyConstructible named requirement is satisfied. Part of this named requirement is being MoveConstructible. In order for a lambda to satisfy MoveConstructible, all of its captures have to satisfy that named requirement. Is this the case for DynamicValue? What happens when your standard library tries to move the lambda (hence also the captured object), with copying as the fallback? While DynamicValue has no explicit move constructor, it is invocable...
When F is DynamicValue<T>, the template constructor serves as the move constructor. It tries to initialize the variant by converting the source DynamicValue (the captured copy of a in the question's code) into a std::function. This is allowed, a copy of the source is made, and the process continues until the copy needs to be moved, at which point the move constructor is again invoked. This time, it tries to initialize the variant by converting the copy of the source DynamicValue into a std::function. This is allowed, a copy of the copy of the source is made, and the process continues until the copy of the copy needs to be moved, at which point the move constructor is again invoked. Etc.
Instead of moving the DynamicValue into the new object, each "move constructor" tries to move the DynamicValue into the variant of the new object. This would add another layer of overhead with each move, except the recursive calls blow up before construction finishes.
The solution is to make DynamicValue move constructible. There are at least two ways to do this.
1) Explicitly provide a move constructor.
DynamicValue(DynamicValue&& value) : getter(std::move(value.getter)) {}
2) Exclude DynamicValue from being a template argument to the template constructor.
template <typename F, typename = std::enable_if_t<std::is_invocable_v<F>>,
typename = std::enable_if_t<!std::is_same_v<std::decay_t<F>, DynamicValue>>>
DynamicValue(F&& function) : getter(function) {}
Note that this excludes only DynamicValue<T> from being a template argument, not DynamicValue<U> when U is not T. That might be another issue to contemplate.
You might want to see if this also fixes whatever problem led you to define a second copy constructor. That may have been a band-aid approach that did not address this underlying issue.
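For reference, here is a trimmed-down sketch of the class with option 2 applied. It is not the asker's exact code: I left out the pointer constructor and both hand-written copy constructors, folded the two conditions into one enable_if_t, used std::is_invocable_r_v instead of std::is_invocable_v, and named the std::function alternative explicitly with std::in_place_type. Treat those as my assumptions.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>
#include <utility>
#include <variant>

template <typename T>
class DynamicValue {
    std::variant<T, std::function<T()>> getter;

public:
    DynamicValue(const T& constant) : getter(constant) {}

    // Accept any callable returning T, except DynamicValue itself.
    // With DynamicValue excluded, the implicitly generated copy and
    // move constructors handle copying and moving.
    template <typename F,
              typename = std::enable_if_t<
                  std::is_invocable_r_v<T, F&> &&
                  !std::is_same_v<std::decay_t<F>, DynamicValue>>>
    DynamicValue(F&& function)
        : getter(std::in_place_type<std::function<T()>>,
                 std::forward<F>(function)) {}

    T operator()() const {
        if (const T* constant = std::get_if<T>(&getter)) {
            return *constant;  // fixed value
        }
        return std::get<std::function<T()>>(getter)();  // recalculated value
    }
};

// Same helper as in the question.
DynamicValue<int> plus1(DynamicValue<int> a) {
    return [a] { return a() + 1; };
}
```

The lambda returned by plus1 can now be copied and moved without the template constructor being picked for its captured DynamicValue, so the recursion never starts.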

Why C++ std::function pass the functor by value instead of universal reference?

The constructor of std::function looks like this (at least in libc++):
namespace std {
template<class _Rp, class ..._ArgTypes>
class function<_Rp(_ArgTypes...)> {
// ...
base_func<_Rp(_ArgTypes...)> __func;
public:
template<typename _Fp>
function(_Fp __f) : __func(std::move(__f)) {}
template<typename _Fp>
function& operator=(_Fp&& __f) {
function(std::forward<_Fp>(__f)).swap(*this);
return *this;
}
};
}
It provides a constructor from an arbitrary functor and an assignment operator from an arbitrary functor. The constructor uses pass-by-value but the assignment operator uses pass-by-universal-reference.
My question is why the constructor of std::function doesn't pass by universal (forwarding) reference in the same way as the assignment operator? For example, it could do:
namespace std {
template<class _Rp, class ..._ArgTypes>
class function<_Rp(_ArgTypes...)> {
// ...
base_func<_Rp(_ArgTypes...)> __func;
public:
template<typename _Fp>
function(_Fp&& __f) : __func(std::forward<_Fp>(__f)) {}
template<typename _Fp>
function& operator=(_Fp&& __f) {
function(std::forward<_Fp>(__f)).swap(*this);
return *this;
}
};
}
I am curious what the rationale is behind treating assignment and construction differently here. Thanks!
This is what is called a "sink parameter". A sink parameter is a parameter of a method that needs to be "taken" from the caller and stored in the object (as a data member). The caller usually doesn't need or use the object after the call.
The best practice for a sink parameter is to pass it by value and move from it into the object. Let's see why:
Option 1: pass by reference
class X; // expensive to copy type with cheap move
struct A
{
X x_;
A(const X& x) : x_{x} {}
//              ^~~~~
//              this is always a copy
};
In this case there will always be at least 1 copy that cannot be elided.
Option 2: pass by value and then move from
class X; // expensive to copy type with cheap move
struct A
{
X x_;
A(X x) : x_{std::move(x)} {}
//       ^~~~~~~~~~~~~~~~
//       this is now a move
};
We got rid of the copy in the initialization of A::x_, but we still have a copy when passing the parameter, or do we?
If the caller does the right thing, we don't. There are two cases. First, the caller still needs its own copy of the object after the call (which is fairly unusual). In this case, yes, a copy will be made, but that is because the caller requires it, not because of a flaw in the design of our class A.
Second, the caller doesn't need the object after passing it. In this case it moves the argument or, better yet, passes a prvalue. Since C++17, with the new temporary materialization rules, the object is then created directly as the parameter:
Pass an xvalue
auto test()
{
X x{};
A a{std::move(x)}; // 2 moves (from arg to parameter and from parameter to `A::x_`)
}
Pass a prvalue
auto test()
{
A a{X{}}; // just the move in the initialization of `A::x_`
}
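To make the move counts in those two comments concrete, here is a minimal sketch. It assumes C++17 and a hypothetical X that counts its own copy and move constructor calls; the helper names moves_for_xvalue and moves_for_prvalue are mine.

```cpp
#include <cassert>
#include <utility>

// Hypothetical X that records how often it is copied or moved.
struct X {
    static inline int copies = 0;
    static inline int moves = 0;
    X() = default;
    X(const X&) { ++copies; }
    X(X&&) noexcept { ++moves; }
};

struct A {
    X x_;
    A(X x) : x_{std::move(x)} {}  // sink parameter: by value, then move
};

// Caller hands over an existing object with std::move.
int moves_for_xvalue() {
    X x{};
    X::moves = 0;
    A a{std::move(x)};  // move into the parameter, then move into A::x_
    (void)a;
    return X::moves;
}

// Caller passes a prvalue; the parameter is initialized in place.
int moves_for_prvalue() {
    X::moves = 0;
    A a{X{}};  // only the move into A::x_
    (void)a;
    return X::moves;
}
```

Neither call makes a copy; the only difference is whether the parameter itself costs a move.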
Option 3: lvalue and rvalue reference overloads
Yes, this will achieve the same level of performance, but why have 2 overloads when you can write and maintain just 1 method?
class X; // expensive to copy type with cheap move
struct A
{
X x_;
A(const X& x) : x_{x} {}
A(X&& x) : x_{std::move(x)} {}
};
This is unneeded complexity, and it blows up when you have multiple sink parameters in one method.
Option 4: pass by forwarding reference:
Again, possible. But it can have some subtle but pretty serious problems:
If the method isn't already a template, you need to make it one, which adds complexity and brings other problems: for example, the method now accepts any type.
This is even worse for a constructor, because the template is now a viable candidate for copy construction, which can really mess things up: it is a better match than the copy constructor when copying from a non-const object.
Another problem is that it cannot always be used:
If you want to accept something that is not simply T&&, e.g. if X is a template, then in template <class T> A(X<T>&& x) the parameter is not a forwarding reference but an rvalue reference, and you need an lvalue reference overload as well.
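The copy constructor problem is easy to reproduce. In this sketch Greedy and Constrained are hypothetical types, and the via_template flag only exists to show which constructor ran; the enable_if_t guard is one common way to keep the template out of copy construction, not the only one.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Unconstrained forwarding constructor.
struct Greedy {
    bool via_template = false;
    Greedy() = default;
    template <class T>
    Greedy(T&&) : via_template(true) {}
};

// Same constructor, but disabled when T is (a reference to) Constrained.
struct Constrained {
    bool via_template = false;
    Constrained() = default;
    template <class T,
              class = std::enable_if_t<
                  !std::is_same_v<std::decay_t<T>, Constrained>>>
    Constrained(T&&) : via_template(true) {}
};

bool greedy_copy_uses_template() {
    Greedy original;
    Greedy copy(original);  // T deduces to Greedy&: exact match, beats const Greedy&
    return copy.via_template;
}

bool greedy_const_copy_uses_template() {
    Greedy original;
    Greedy copy(std::as_const(original));  // tie, so the non-template copy ctor wins
    return copy.via_template;
}

bool constrained_copy_uses_template() {
    Constrained original;
    Constrained copy(original);  // template is disabled, copy ctor is used
    return copy.via_template;
}
```

Copying a non-const Greedy silently goes through the template, which is exactly the surprise described above.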

move or copy when passing arguments to the constructor and member functions

The following is an example of my typical code. I have a lot of objects that look like this:
struct Config
{
Config();
Config(const std::string& cType, const std::string& nType); //additional variables omitted
Config(Config&&) = default;
Config& operator=(Config&&) = default;
bool operator==(const Config& c) const;
bool operator!=(const Config& c) const;
void doSomething(const std::string& str);
bool doAnotherThing(const MyOtherObject& obj);
void doYetAnotherThing(int value1, unsigned long value2, const std::string& value3, MyEnums::Seasons value4, const std::vector<MySecondObject>& value5);
std::string m_controllerType;
std::string m_networkType;
//...
};
//...
Config::Config(const std::string& cType, const std::string& nType) :
m_controllerType(cType),
m_networkType(nType)
{
}
My motivations and general understanding of the subject:
use const references in constructors and methods to avoid double-copying when passing objects.
simple types - pass by value; classes and structs - pass by const reference (or simple reference when I need to modify them)
force the compiler to create the default move constructor and move assignment so that it can do its fancy magic, which also lets me avoid writing the boring ctor() : m_v1(std::move(v1)), m_v2(std::move(v2)), m_v3(std::move(v3)) {}.
if it performs badly, use libc and raw pointers, then wrap it at class and write a comment.
I have a strong feeling that my rules of thumb are flawed and simply incorrect.
After reading cppreference, Scott Meyers, the C++ standard, Stroustrup and so on, I feel like: "Yeah, I understand every word here, but it still doesn't make any sense". The only thing I kind of understood is that move semantics makes sense when my class contains non-copyable types, like std::mutex and std::unique_ptr.
I've seen a lot of code where people pass complex objects by value, like large strings, vectors and custom classes. I believe this is where move semantics happen, but, again, how can you pass an object to a function by move? If I am correct, it would leave the object in a "kind-of-null-state", making it unusable.
So, the questions are:
- How do I correctly decide between pass-by-value and pass-by-reference?
- Do I need to provide both copy and move constructors?
- Do I need to explicitly write move and copy constructors? May I use = default? My classes are mostly POD objects, so there is no complex logic involved.
- When debugging, I can always write std::cout << "move\n"; or std::cout << "copy\n"; in constructors of my own classes, but how do I know what happens with classes from stdlib?
P.S. It may look like a cry of desperation (it is), not a valid SO question. I simply don't know how to formulate my problems better than this.
If it is a primitive type, pass by value. Locality of reference wins.
If you aren't going to store a copy of it, pass by value or const&.
If you want to store a copy of it, and it is very cheap to move and modestly expensive to copy, pass by value.
If something has a modest cost to move, and is a sink parameter, consider pass by rvalue reference. Users will be forced to std::move.
Consider providing a way for callers to emplace-construct into the field in highly generic code, or where you need every ounce of performance.
The Rule of 0/3/5 describes how you should handle copy assign/construct/destroy. Ideally you follow the rule of 0; copy/move/destruct is all =default in anything except resource management types. If you want to implement any of copy/move/destruct, you need to implement, =default or =delete every other one of the 5.
If you are only taking 1 argument to a setter, consider writing both the && and const& versions of the setter. Or just exposing the underlying object. Move-assignment sometimes reuses storage and that is efficient.
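As a sketch of the setter advice, here is what the && and const& pair could look like. Widget and set_name are hypothetical names, not taken from the Config struct in the question.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Widget {
public:
    // Lvalue argument: copy-assign. The member's existing buffer may be
    // reused if it is already large enough, so this can avoid an allocation.
    void set_name(const std::string& name) { name_ = name; }

    // Rvalue argument: move-assign, no copy of the characters.
    void set_name(std::string&& name) { name_ = std::move(name); }

    const std::string& name() const { return name_; }

private:
    std::string name_;
};
```

A caller that still needs its string passes it as is; a caller that is done with it passes a temporary or wraps it in std::move.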
Emplacing looks like this:
struct emplace_tag {};
struct wrap_foo {
template<class...Ts>
wrap_foo(emplace_tag, Ts&&...ts):
foo( std::forward<Ts>(ts)... )
{}
template<class T0, class...Ts>
wrap_foo(emplace_tag, std::initializer_list<T0> il, Ts&&...ts):
foo( il, std::forward<Ts>(ts)... )
{}
private:
Foo foo;
};
There are a myriad of other ways you can permit "emplace" construction. See emplace_back or emplace in standard containers as well (where they use placement ::new to construct objects, forwarding the objects passed in).
Emplace construct even permits direct construction without even a move using objects with an operator T() setup properly. But that is something that is beyond the scope of this question.
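A usage sketch of the first emplace constructor, under some assumptions: Foo is a hypothetical type that I made non-copyable (which also suppresses its move operations) to show that it is built in place, and the get() accessor is added only so the result can be inspected.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical payload type that can be neither copied nor moved.
struct Foo {
    std::string label;
    int count;
    Foo(std::string l, int c) : label(std::move(l)), count(c) {}
    Foo(const Foo&) = delete;
    Foo& operator=(const Foo&) = delete;
};

struct emplace_tag {};

struct wrap_foo {
    // Forwards the arguments straight to Foo's constructor, so the
    // member is constructed in place without a temporary Foo.
    template <class... Ts>
    wrap_foo(emplace_tag, Ts&&... ts) : foo(std::forward<Ts>(ts)...) {}

    const Foo& get() const { return foo; }

private:
    Foo foo;
};
```

For example, wrap_foo w(emplace_tag{}, "widget", 3); builds the contained Foo directly from the two arguments.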

Should I always move on `sink` constructor or setter arguments?

struct TestConstRef {
std::string str;
TestConstRef(const std::string& mStr) : str{mStr} { }
};
struct TestMove {
std::string str;
TestMove(std::string mStr) : str{std::move(mStr)} { }
};
After watching GoingNative 2013, I understood that sink arguments should always be passed by value and moved with std::move. Is TestMove::ctor the correct way of applying this idiom? Is there any case where TestConstRef::ctor is better/more efficient?
What about trivial setters? Should I use the following idiom or pass a const std::string&?
struct TestSetter {
std::string str;
void setStr(std::string mStr) { str = std::move(mStr); }
};
The simple answer is: yes.
The reason is quite simple as well, if you store by value you might either need to move (from a temporary) or make a copy (from a l-value). Let us examine what happens in both situations, with both ways.
From a temporary
if you take the argument by const-ref, the temporary is bound to the const-ref and cannot be moved from again, thus you end up making a (useless) copy.
if you take the argument by value, the value is initialized from the temporary (moving), and then you yourself move from the argument, thus no copy is made.
One limitation: a class without an efficient move constructor (such as std::array<T, N>), because then you make two copies instead of one.
From a l-value (or const temporary, but who would do that...)
if you take the argument by const-ref, nothing happens there, and then you copy it (cannot move from it), thus a single copy is made.
if you take the argument by value, you copy it in the argument and then move from it, thus a single copy is made.
One limitation: the same... classes for which moving is akin to copying.
So, the simple answer is that in most cases, by using a sink you avoid unnecessary copies (replacing them by moves).
The single limitation is classes for which the move constructor is as expensive (or nearly as expensive) as the copy constructor, in which case having two moves instead of one copy is worse. Thankfully, such classes are rare (arrays are one case).
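To see why arrays are the awkward case, compare what a "move" does for std::array and std::vector. This is a small sketch with made-up helper names; the vector check relies on the buffer pointer being handed over on move construction, which is what mainstream standard library implementations do.

```cpp
#include <array>
#include <cassert>
#include <utility>
#include <vector>

// Moving a std::array<int, N> moves each int, and moving an int is a copy,
// so the source still holds the same values afterwards.
bool array_move_is_a_copy() {
    std::array<int, 4> src{1, 2, 3, 4};
    std::array<int, 4> dst = std::move(src);
    return src == dst;
}

// Moving a std::vector<int> hands over the heap buffer instead of
// touching the elements.
bool vector_move_transfers_buffer() {
    std::vector<int> src{1, 2, 3, 4};
    const int* buffer = src.data();
    std::vector<int> dst = std::move(src);
    return dst.data() == buffer;
}
```

So for an array the by-value sink really does pay for the elements twice, while for a vector or string the second step is only a pointer swap.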
A bit late, as this question already has an accepted answer, but anyways... here's an alternative:
struct Test {
std::string str;
Test(std::string&& mStr) : str{std::move(mStr)} { } // 1
Test(const std::string& mStr) : str{mStr} { } // 2
};
Why would that be better? Consider the two cases:
From a temporary (case // 1)
Only one move-constructor is called for str.
From an l-value (case // 2)
Only one copy-constructor is called for str.
It probably can't get any better than that.
But wait, there is more:
No additional code is generated on the caller's side! The calling of the copy- or move-constructor (which might be inlined or not) can now live in the implementation of the called function (here: Test::Test) and therefore only a single copy of that code is required. If you use by-value parameter passing, the caller is responsible for producing the object that is passed to the function. This might add up in large projects and I try to avoid it if possible.

The C++11 way of initializing data members from arguments

Seeing as C++11 supports move semantics, when initializing data members from arguments, should we attempt to move the value instead of copying it?
Here's an example showing how I would approach this in pre-C++11:
struct foo {
std::vector<int> data;
explicit foo(const std::vector<int>& data)
: data(data)
{
}
};
Here, the copy constructor would be called.
In C++11, should we get into the habit of writing like this:
struct foo {
std::vector<int> data;
explicit foo(std::vector<int> data)
: data(std::move(data))
{
}
};
Here, the move constructor would be called... as well as the copy constructor if the argument passed is an lvalue, but the benefit is that if an rvalue was passed, the move constructor would be called instead of the copy one.
I'm wondering if there's something I'm missing.
My initial answer to your question was:
Don't copy data that you want to move. You can add a constructor using a rvalue reference, if performance is a problem:
explicit foo(std::vector<int>&& data)
: data(std::move(data)) // thanks to Kerrek SB
{
}
Not exactly matching your question, but reading "Rule-of-Three becomes Rule-of-Five with C++11?" seems to be useful.
Edit:
However, the accepted answer to "Passing/Moving parameters of a constructor in C++0x" seems to advocate your approach, especially with more than one parameter. Otherwise there would be a combinatorial explosion of variants.
Passing by value in the copy constructor only helps when the argument is movable, otherwise you could end up with up to two copies (one for the argument passing, one for the member construction). So I'd say it's better to write a copy and a move constructor separately.
Passing by value makes sense for the assignment operator if you have a properly implemented swap function, though:
Foo & operator=(Foo other) { this->swap(std::move(other)); return *this; }
Now if other is moveable, Foo's move constructor comes in during argument construction, and if other is merely copyable, then the one necessary copy is made during argument construction, but in both cases you get to use the moving version of swap, which ought to be cheap. But this relies on the existence of a move constructor!
So note that out of "construction", "swap" and "assignment" you will have to implement two properly, and only the third can take advantage of the other two. Since swap should be no-throw, using the swap trick in the assignment operator is basically the only option.
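A minimal sketch of that assignment-by-value idiom, assuming a hypothetical Buffer class. Unlike the one-liner above, it swaps with the by-value parameter through a plain swap(Buffer&) rather than an rvalue overload; that is my simplification.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class Buffer {
public:
    explicit Buffer(std::vector<int> data) : data_(std::move(data)) {}

    // Construction is implemented properly (defaulted here).
    Buffer(const Buffer&) = default;
    Buffer(Buffer&&) noexcept = default;

    // Swap is implemented properly and does not throw.
    void swap(Buffer& other) noexcept { data_.swap(other.data_); }

    // Assignment reuses the other two: `other` is copy- or move-constructed
    // from the argument, then its contents are swapped into *this.
    Buffer& operator=(Buffer other) noexcept {
        swap(other);
        return *this;
    }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<int> data_;
};
```

Assigning from an lvalue costs the one necessary copy while building the parameter; assigning from a temporary costs no copy at all.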
Yes, you are doing it correctly. Any time you need a copy of a value, do it in the parameters by passing by value.
The following is correct:
struct foo {
std::vector<int> data;
explicit foo(std::vector<int> data)
: data(std::move(data))
{
}
};
You should stick with:
struct foo {
std::vector<int> data;
explicit foo(const std::vector<int>& data)
: data(data)
{
}
};
In which case "data" is only copied.
In the second case:
struct foo {
std::vector<int> data;
explicit foo(std::vector<int> data)
: data(std::move(data))
{
}
};
"data" is first copied and then moved, which is more expensive than just copying. Remember that moving is not free, even though it is probably a lot cheaper than copying.
On the other hand you might consider adding the following (in addition to or instead of the first).
struct foo {
std::vector<int> data;
explicit foo(std::vector<int>&& data)
: data(std::move(data))
{
}
};
Use this where you know that "data" will not be used after the constructor call, in which case you can just move it.