Create a temporary to pass to rvalue reference - c++

I want to code several recursively interacting merge functions, which I think should have signatures: T&& merge_XYZ(T&& a, T&& b);
They will tend to be used recursively with lines such as:
return merge_XYZ( std::move(x), std::move(y) );
Each of these several merge functions will steal the contents of one of the inputs and inject those contents into the other input and return the result. Typically, they will have x and y which are names for what were rvalue references and thus should be converted back to rvalue references by std::move (correct me if I'm wrong).
But rarely, they have x and/or y that are references to objects whose contents must not be stolen. I definitely don't want to write alternate non-stealing versions of these functions. Rather, I want the caller to deal with that in these rare cases. So my main question is whether the correct way to do that is to explicitly invoke copy construction, such as:
T temp = merge_QRS( T(x), T(y) ); // use x and y without stealing yet
return merge_XYZ( merge_MNO( std::move(x), std::move(y) ), std::move(temp) );
Main question: Is T(x) the right way to force a temporary copy to be created at that point?
Other questions:
Is T temp = the correct way to make sure the call to merge_QRS in the above code occurs before the call to merge_MNO but otherwise inexpensively forward the temporary from that into the first operand of merge_XYZ? If I used T&& temp instead does it end up holding a pointer to modified T(x) after the life of T(x)?
Is T&& the right return type (as opposed to T) for chaining a lot of these together?
How does the above compare to:
T tx = x;
T&& temp = merge_QRS( std::move(tx), T(y) ); // use x and y without stealing yet
return merge_XYZ( merge_MNO( std::move(x), std::move(y) ), std::move(temp) );
Assuming merge_QRS will be modifying tx and returning an rvalue reference to that, is that behavior all defined?
Writing this question may have helped me realize I could be mixing together two situations that ought not to be mixed: objects you don't want to steal from vs. objects you don't want to steal from yet. Is my original merge_QRS( T(y), T(x)) right (only if consumed within the same expression) for objects I don't want to steal from? But in the case I tried as an example, should I have the following:
T tx = x; // Make copies which can be stolen from
T ty = y;
return merge_XYZ( merge_MNO( std::move(x), std::move(y) ),
merge_QRS( std::move(tx), std::move(ty) ) );
I think I may still be confused about stealing the contents vs. stealing the identity. If I return by T&& I'm stealing the identity of one input in addition to stealing the contents of the other. When do I get away with stealing an identity? If I return by T I'm never stealing an identity, and sometimes failing to steal an identity is inefficient.

Main question: Is T(x) the right way to force a temporary copy to be created at that point?
Yes
Is T temp = the correct way to make sure the call to merge_QRS in the above code occurs before the call to merge_MNO but otherwise inexpensively forward the temporary from that into the first operand of merge_XYZ?
Yes
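The sequencing point can be illustrated with a small sketch. The order in which function arguments are evaluated is unspecified, so a named local is one way to pin merge_QRS before merge_MNO. Everything below is a hypothetical stand-in rather than the asker's code: the struct T, the placeholder function bodies, the call_log vector and the combine wrapper are assumptions, and the merge functions take T by value for simplicity.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's T.
struct T { int v; };

// Records the order in which the merge functions run (illustration only).
std::vector<std::string> call_log;

// Placeholder bodies: each "merge" just adds the payloads.
T merge_QRS(T a, T b) { call_log.push_back("QRS"); return T{a.v + b.v}; }
T merge_MNO(T a, T b) { call_log.push_back("MNO"); return T{a.v + b.v}; }
T merge_XYZ(T a, T b) { call_log.push_back("XYZ"); return T{a.v + b.v}; }

T combine(T& x, T& y) {
    // The initialization of temp completes before the next statement starts,
    // so QRS works on untouched copies of x and y before MNO may steal from them.
    T temp = merge_QRS(T(x), T(y));
    return merge_XYZ(merge_MNO(std::move(x), std::move(y)), std::move(temp));
}
```

Had merge_QRS been written directly as the second argument of merge_XYZ, the compiler would have been free to evaluate it after merge_MNO.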
If I used T&& temp instead does it end up holding a pointer to modified T(x) after the life of T(x)?
Yes. That's a dangling reference, which unfortunately the compiler won't catch.
Is T&& the right return type (as opposed to T) for chaining a lot of these together?
To be honest, it doesn't smell good to me.
You may want to reconsider your data model to be something more standard, i.e.:
T merge(T x, T y)
{
// do some merging
return x;
}
Copy-elision and RVO will eliminate any redundant copies. Now you can move items in, pass copies or pass temporaries. There's only one piece of logic to maintain and your code has value-semantics... which is always better (TM).
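As a minimal sketch of how the by-value signature serves both kinds of caller, assume T is a vector of strings (the alias Bag and the loop body are illustrative choices, not part of the question). The caller decides at the call site whether to copy or to move:

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's T.
using Bag = std::vector<std::string>;

// Takes both inputs by value, so the function owns them and may steal freely.
Bag merge(Bag x, Bag y) {
    x.reserve(x.size() + y.size());
    for (auto& s : y) {
        x.push_back(std::move(s)); // steal each element of y
    }
    return x; // a by-value parameter is moved (not copied) on return
}
```

Calling merge(a, b) copies the arguments and leaves a and b untouched, while merge(std::move(a), std::move(b)) hands their contents over without copying the strings.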

tl;dr: With value semantics you get:
moves happening where possible
No need to explicitly copy arguments
No dangling references in case of passing prvalues to the function
Consider
struct X
{
X() = default;
X(int y) : a(y) {}
X(X &&r) : a(r.a) { std::cout << "move X..."; }
X(X const &r) : a(r.a) { std::cout << "copy X..."; }
int a;
};
with foo:
X&& foo(X &&a) { return std::move(a); }
and bar:
X bar(X a) { return a; }
Then, when executing the following code:
std::cout << "foo:\n";
X x{ 55 };
std::cout << "a: ";
X && a = foo(std::move(x)); // fine
std::cout << "\nb: ";
X && b = foo(X(x)); // !! dangling: foo returns an xvalue, not a prvalue, so the temporary's lifetime is not extended
std::cout << "\nc: ";
X c = foo(std::move(x)); // fine, c is move-constructed
std::cout << "\nd: ";
X d = foo(X(x)); // fine
std::cout << "\ne: ";
X && e = foo(X{ 12 }); // !! dangling...
std::cout << "\nf: ";
X f = foo(X{ 12 }); // fine
std::cout << "\n\nbar:\n";
X y{ 55 };
std::cout << "a: ";
X && q = bar(std::move(y)); // fine
std::cout << "\nb: ";
X && r = bar(y); // no explicit copy required, supported by syntax
std::cout << "\nc: ";
X s = bar(std::move(y)); // fine
std::cout << "\nd: ";
X t = bar(y); // fine, no explicit copy required either
std::cout << "\ne: ";
X && u = bar(X{ 12 }); // fine
std::cout << "\nf: ";
X v = bar(X{ 12 }); // fine
std::cout << "\n";
we obtain
foo:
a:
b: copy X...
c: move X...
d: copy X...move X...
e:
f: move X...
bar:
a: move X...move X...
b: copy X...move X...
c: move X...move X...
d: copy X...move X...
e: move X...
f: move X...
on VS 2015 and g++ 5.2.
So the only copies made (with bar) are in cases b and d, which is the desired behaviour anyway but you get rid of the possibly dangling references at the cost of 1-2 moves per operation (which afaik even may be optimized out in some cases as well).

Related

lambda capture by reference changes variable without mutable keyword [duplicate]

Short example:
#include <iostream>
int main()
{
int n;
[&](){n = 10;}(); // OK
[=]() mutable {n = 20;}(); // OK
// [=](){n = 10;}(); // Error: a by-value capture cannot be modified in a non-mutable lambda
std::cout << n << "\n"; // "10"
}
The question: Why do we need the mutable keyword? It's quite different from traditional parameter passing to named functions. What's the rationale behind?
I was under the impression that the whole point of capture-by-value is to allow the user to change the temporary -- otherwise I'm almost always better off using capture-by-reference, aren't I?
Any enlightenments?
(I'm using MSVC2010 by the way. AFAIK this should be standard)
It requires mutable because by default, a function object should produce the same result every time it's called. This is the difference between an object orientated function and a function using a global variable, effectively.
Your code is almost equivalent to this:
#include <iostream>
class unnamed1
{
int& n;
public:
unnamed1(int& N) : n(N) {}
/* OK. Your this is const but you don't modify the "n" reference,
but the value pointed by it. You wouldn't be able to modify a reference
anyway even if your operator() was mutable. When you assign a reference
it will always point to the same var.
*/
void operator()() const {n = 10;}
};
class unnamed2
{
int n;
public:
unnamed2(int N) : n(N) {}
/* OK. Your this pointer is not const (since your operator() is "mutable" instead of const).
So you can modify the "n" member. */
void operator()() {n = 20;}
};
class unnamed3
{
int n;
public:
unnamed3(int N) : n(N) {}
/* BAD. Your this is const so you can't modify the "n" member. */
void operator()() const {n = 10;}
};
int main()
{
int n;
unnamed1 u1(n); u1(); // OK
unnamed2 u2(n); u2(); // OK
//unnamed3 u3(n); u3(); // Error
std::cout << n << "\n"; // "10"
}
So you could think of lambdas as generating a class with operator() that defaults to const unless you say that it is mutable.
You can also think of all the variables captured inside [] (explicitly or implicitly) as members of that class: copies of the objects for [=] or references to the objects for [&]. They are initialized when you declare your lambda as if there was a hidden constructor.
I was under the impression that the whole point of capture-by-value is to allow the user to change the temporary -- otherwise I'm almost always better off using capture-by-reference, aren't I?
The question is, is it "almost"? A frequent use-case appears to be to return or pass lambdas:
void registerCallback(std::function<void()> f) { /* ... */ }
void doSomething() {
std::string name = receiveName();
registerCallback([name]{ /* do something with name */ });
}
I think that mutable isn't a case of "almost". I consider "capture-by-value" like "allow me to use its value after the captured entity dies" rather than "allow me to change a copy of it". But perhaps this can be argued.
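A short sketch of that "use the value after the captured entity dies" reading, with a hypothetical make_greeter function invented for illustration. The local string is gone by the time the callable runs, but the closure holds its own copy:

```cpp
#include <functional>
#include <string>

// Hypothetical helper: returns a callable that outlives the local `name`.
std::function<std::string()> make_greeter() {
    std::string name = "world";
    // Capture by value: the closure stores its own copy of name.
    // Capturing by reference here would leave a dangling reference.
    return [name] { return "hello, " + name; };
}
```

Nothing in the lambda modifies name, so mutable never enters the picture.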
You have to understand what capture means! it's capturing not argument passing! let's look at some code samples:
int main()
{
using namespace std;
int x = 5;
int y;
auto lamb = [x]() {return x + 5; };
y= lamb();
cout << y<<","<< x << endl; //outputs 10,5
x = 20;
y = lamb();
cout << y << "," << x << endl; //output 10,20
}
As you can see even though x has been changed to 20 the lambda is still returning 10 ( x is still 5 inside the lambda)
Changing x inside the lambda means changing the lambda itself at each call (the lambda is mutating at each call). To enforce correctness the standard introduced the mutable keyword. By specifying a lambda as mutable you are saying that each call to the lambda could cause a change in the lambda itself. Let's see another example:
int main()
{
using namespace std;
int x = 5;
int y;
auto lamb = [x]() mutable {return x++ + 5; };
y= lamb();
cout << y<<","<< x << endl; //outputs 10,5
x = 20;
y = lamb();
cout << y << "," << x << endl; //outputs 11,20
}
The above example shows that by making the lambda mutable, changing x inside the lambda "mutates" the lambda at each call with a new value of x that has nothing to do with the actual value of x in the main function.
FWIW, Herb Sutter, a well-known member of the C++ standardization committee, provides a different answer to that question in Lambda Correctness and Usability Issues:
Consider this straw man example, where the programmer captures a local variable by
value and tries to modify the
captured value (which is a member variable of the lambda object):
int val = 0;
auto x = [=](item e) // look ma, [=] means explicit copy
{ use(e,++val); }; // error: count is const, need ‘mutable’
auto y = [val](item e) // darnit, I really can’t get more explicit
{ use(e,++val); }; // same error: count is const, need ‘mutable’
This feature appears to have been added out of a concern that the user
might not realize he got a copy, and in particular that since lambdas
are copyable he might be changing a different lambda’s copy.
His paper is about why this should be changed in C++14. It is short, well written, worth reading if you want to know "what's on [committee member] minds" with regards to this particular feature.
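The concern about "changing a different lambda's copy" can be made concrete with a small sketch. The function name demo_copied_state is made up for illustration; it returns the values seen by the original closure and by its copy:

```cpp
#include <utility>

// Returns {value from the original closure, value from its copy}.
std::pair<int, int> demo_copied_state() {
    int val = 0;
    auto counter = [val]() mutable { return ++val; };

    counter();                      // original's val: 0 -> 1
    auto copy = counter;            // the copy starts with val == 1
    copy();                         // copy's val: 1 -> 2
    int from_copy = copy();         // copy's val: 2 -> 3
    int from_original = counter();  // original's val: 1 -> 2, unaffected by the copy

    return {from_original, from_copy};
}
```

Any code that receives the lambda by value, as many standard algorithms do, works on such a copy, so state accumulated there is invisible to the caller's original.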
You need to think about what the closure type of your lambda is. Every time you write a lambda expression, the compiler creates a closure type, which is nothing less than an unnamed class with data members (the environment where the lambda expression was declared) and a function call operator, operator(). When you capture a variable by copy, the compiler adds a data member to the closure type, and because operator() is const by default, that member is read-only inside the lambda body. That's the reason they call it a "closure": in some way, you are closing your lambda expression by copying the variables from the enclosing scope into the lambda's scope. When you use the keyword mutable, operator() is no longer const, so the captured copy can be modified. Because it is still a copy, those changes are not propagated to the enclosing scope but are kept inside the stateful lambda.
Always try to imagine the resulting closure type of your Lambda expression, that helped me a lot, and I hope it can help you too.
See this draft, under 5.1.2 [expr.prim.lambda], subclause 5:
The closure type for a lambda-expression has a public inline function call operator (13.5.4) whose parameters
and return type are described by the lambda-expression’s parameter-declaration-clause and trailing-return-type
respectively. This function call operator is declared const (9.3.1) if and only if the lambda-expression’s
parameter-declaration-clause is not followed by mutable.
Edit on litb's comment:
Maybe they thought of capture-by-value so that outside changes to the variables aren't reflected inside the lambda? References work both ways, so that's my explanation. Don't know if it's any good though.
Edit on kizzx2's comment:
The most times when a lambda is to be used is as a functor for algorithms. The default constness lets it be used in a constant environment, just like normal const-qualified functions can be used there, but non-const-qualified ones can't. Maybe they just thought to make it more intuitive for those cases, who know what goes on in their mind. :)
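To illustrate the "constant environment" point, here is a rough sketch in which a hypothetical helper, call_through_const_ref, invokes a callable through a const reference, much as code holding a const functor would. A default (const) lambda works there, and a mutable one would not:

```cpp
// Hypothetical helper: calls f through a const reference.
template <class F>
int call_through_const_ref(const F& f) {
    return f(); // requires a const-qualified operator()
}

int demo_const_call() {
    int base = 40;
    auto add_two = [base] { return base + 2; }; // operator() is const by default

    // auto counter = [base]() mutable { return ++base; };
    // call_through_const_ref(counter); // would not compile: operator() is non-const

    return call_through_const_ref(add_two);
}
```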
I was under the impression that the
whole point of capture-by-value is to
allow the user to change the temporary
-- otherwise I'm almost always better off using capture-by-reference, aren't
I?
n is not a temporary. n is a member of the lambda-function-object that you create with the lambda expression. The default expectation is that calling your lambda does not modify its state, therefore it is const to prevent you from accidentally modifying n.
To extend Puppy's answer, lambda functions are intended to be pure functions. That means every call given a unique input set always returns the same output. Let's define input as the set of all arguments plus all captured variables when the lambda is called.
In pure functions output solely depends on input and not on some internal state. Therefore any lambda function, if pure, does not need to change its state and is therefore immutable.
When a lambda captures by reference, writing to captured variables strains the concept of a pure function, because all a pure function should do is return an output, though the lambda itself certainly does not mutate because the writing happens to external variables. Even in this case a correct usage implies that if the lambda is called with the same input again, the output will be the same every time, despite these side effects on by-ref variables. Such side effects are just ways to return some additional output (e.g. update a counter) and could be reformulated into a pure function, for example returning a tuple instead of a single value.
I also was wondering about it and the simplest explanation why [=] requires explicit mutable is in this example:
int main()
{
int x {1};
auto lbd = [=]() mutable { return x += 5; };
printf("call1:%d\n", lbd());
printf("call2:%d\n", lbd());
return 0;
}
Output:
call1:6
call2:11
By words:
You can see that the x value is different at the second call (1 for the call1 and 6 for the call2).
A lambda object keeps a captured variable by value (has its own
copy) in case of [=].
The lambda can be called several times.
And in general case we have to have the same value of the captured variable to have the same predictable behavior of the lambda based on the known captured value, not updated during the lambda work. That's why the default behavior assumed const (to predict changes of the lambda object members) and when a user is aware of consequences he takes this responsibility on himself with mutable.
Same with capturing by value. For my example:
auto lbd = [x]() mutable { return x += 5; };
There is now a proposal to alleviate the need for mutable in lambda declarations: n3424
You might see the difference, if you check 3 different use cases of lambda:
Capturing an argument by value
Capturing an argument by value with 'mutable' keyword
Capturing an argument by reference
case 1:
When you capture an argument by value, a few things happen:
You are not allowed to modify the argument inside the lambda
The value of the argument remains the same whenever the lambda is called, no matter what the argument's value is at the time of the call.
so for example:
{
int x = 100;
auto lambda1 = [x](){
// x += 2; // compile time error. not allowed
// to modify an argument that is captured by value
return x * 2;
};
cout << lambda1() << endl; // 100 * 2 = 200
cout << "x: " << x << endl; // 100
x = 300;
cout << lambda1() << endl; // in the lambda, x remain 100. 100 * 2 = 200
cout << "x: " << x << endl; // 300
}
Output:
200
x: 100
200
x: 300
case 2:
Here, when you capture an argument by value and use the 'mutable' keyword, similar to the first case, you create a "copy" of this argument. This "copy" lives in the "world" of the lambda, but now, you can actually modify the argument within the lambda-world, so its value is changed, and saved and it can be referred to, in the future calls of this lambda. Again, the outside "life" of the argument might be totally different (value wise):
{
int x = 100;
auto lambda2 = [x]() mutable {
x += 2; // when capture by value, modify the argument is
// allowed when mutable is used.
return x;
};
cout << lambda2() << endl; // 100 + 2 = 102
cout << "x: " << x << endl; // in the outside world - x remains 100
x = 200;
cout << lambda2() << endl; // 104, as the 102 is saved in the lambda world.
cout << "x: " << x << endl; // 200
}
Output:
102
x: 100
104
x: 200
case 3:
This is the easiest case, as there are no longer two lives of x. Now there is only one value for x, and it's shared between the outside world and the lambda world.
{
int x = 100;
auto lambda3 = [&x]() {
x += 10; // modifying x through a reference capture is allowed; mutable is not needed
return x;
};
cout << lambda3() << endl; // 110
cout << "x: " << x << endl; // 110
x = 400;
cout << lambda3() << endl; // 410.
cout << "x: " << x << endl; // 410
}
Output:
110
x: 110
410
x: 410

Does cv::Point3f assignment operator do a "deep" copy?

The following class method Augmented3dPoint::getWorldPoint() returns a reference to its member cv::Point3f world_point;
class Augmented3dPoint {
private:
cv::Point3f world_point;
public:
cv::Point3f& getWorldPoint () {
return world_point;
}
};
I am calling this in main() through the following code (totalPointCloud is std::vector<Augmented3dPoint> totalPointCloud;)
cv::Point3f f;
f = totalPointCloud[i].getWorldPoint(); // <---- Probably "deep" copy applied, why?
f.x = 300; // Try to change a value to see if it is reflected on the original world_point
f = totalPointCloud[i].getWorldPoint();
std::cout << f.x << f.y << f.z << std::endl; // The change is not reflected
//and I get as the result the original world_point,
//which means f is another copy of world_point with 300 in X coordinate
What I want to do is achieve the minimum copying of variables. But, the previous code apparently does a "deep" copy...
a) Is that correct or there is another explanation?
b) I have tried the following
cv::Point3f& f = totalPointCloud[i].getWorldPoint();
f.x = 300;
f = totalPointCloud[i].getWorldPoint();
std::cout << f.x << f.y << f.z << std::endl;
which seems to directly affect class member variable world_point and avoids a "deep" copy, since its X coordinate is now 300. Is there any other way around?
Thanks a lot.
a) Is that correct or there is another explanation?
Seems correct, although not necessarily framed in a helpful way. You just need to think of a Point3f as a value. When you get the value, you get the value, not a reference to it.
Which leads me to
b) Is there any other way around?
Not really. If you want a reference to a value, you can use a reference to it, a pointer to it, or a wrapper type with the same semantics as a reference or pointer.
So things like
cv::Point3f& f = totalPointCloud[i].getWorldPoint();
cv::Point3f* f1 = &totalPointCloud[i].getWorldPoint();
std::reference_wrapper<cv::Point3f> f2 = std::ref(totalPointCloud[i].getWorldPoint());
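The difference between these forms can be sketched without OpenCV. The Point3 struct below is a hypothetical stand-in for cv::Point3f, and the initial coordinates and helper function names are assumptions made for illustration:

```cpp
#include <functional>

// Hypothetical stand-in for cv::Point3f.
struct Point3 { float x, y, z; };

class Augmented3dPoint {
    Point3 world_point{1.f, 2.f, 3.f};
public:
    Point3& getWorldPoint() { return world_point; }
};

// Assigning to a Point3 object copies the value; the member is untouched.
float write_through_copy(Augmented3dPoint& p) {
    Point3 f = p.getWorldPoint();
    f.x = 300.f;
    return p.getWorldPoint().x;
}

// Binding a reference aliases the member; the write is visible in the object.
float write_through_reference(Augmented3dPoint& p) {
    Point3& f = p.getWorldPoint();
    f.x = 300.f;
    return p.getWorldPoint().x;
}

// std::reference_wrapper behaves like a rebindable reference.
float write_through_wrapper(Augmented3dPoint& p) {
    std::reference_wrapper<Point3> f = std::ref(p.getWorldPoint());
    f.get().x = 300.f;
    return p.getWorldPoint().x;
}
```

For a small struct of three floats the copy is cheap, so the choice is more about whether the write should be visible in the original than about performance.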

How can a function in C++ return either a value or a reference with minimal copying?

I have a function that delegates to two others, returning either a reference or value depending on some runtime condition:
X by_value() { ... }
const X& by_reference() { ... }
?? foo(bool b) {
if (b) {
return by_value();
} else {
return by_reference();
}
}
I'd like to choose the return type of my function so that callers induce minimal copying; e.g.:
const X& x1 = foo(true); // No copies
const X& x2 = foo(false); // No copies
X x3 = foo(true); // No copies, one move (or zero via RVO)
X x4 = foo(false); // One copy
In all cases except the last, there shouldn't be a need (based on the runtime behavior) to copy the return value.
If the return type of foo is X, then there will be an extra copy in case 2; but if the return type is const X&, then cases 1 and 3 are undefined behavior.
Is it possible, via returning some sort of proxy, to ensure that the above uses have minimal copies?
Explanation: Since there's been significant pushback of the form "you're doing it wrong", I thought I'd explain the reason for this.
Imagine I have an array of type T or function<T()> (meaning the elements of this array are either of type T, or they're functions returning T). By the "value" of an element of this array, I mean, either the value itself or the return value when the function is evaluated.
If this get_value_of_array(int index) returns by value, then in the cases where the array contains just an element, I'm forced to do an extra copy. This is what I'm trying to avoid.
Further note: If the answer is, "That's impossible", that's fine with me. I'd love to see a proof of this, though - ideally of the form "Suppose there were a type Proxy<X> that solved your problem. Then..."
What you are looking for is a sum-type (that is, a type whose possible values are "the possible X values plus the possible X const& values").
In C++, these are usually called variant. These are usually implemented as a tag plus an appropriately sized and aligned array, and only hold exactly one value at runtime. Alternatively, they are implemented with dynamic allocation and the classic visitor pattern.
For example, with Boost.Variant, you could declare your function to return boost::variant<X, X const&> (live example):
boost::variant<X, X const&> foo(bool b) {
if (b) {
return by_value();
} else {
return by_reference();
}
}
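For readers without Boost, the same sum-type idea can be sketched with the standard library. This is only an illustration, assuming C++17: std::variant cannot hold a reference directly, so the reference alternative is wrapped in std::reference_wrapper. The definitions of X, by_value, by_reference, the alias XOrRef, and the helper value_of are hypothetical stand-ins, not part of the original question or answer.

```cpp
#include <cassert>
#include <functional>
#include <variant>

struct X { int id = 0; };

// Hypothetical stand-ins for the question's by_value() and by_reference().
inline X by_value() { return X{1}; }
inline const X& by_reference() { static const X stored{2}; return stored; }

// Either an owned X or a non-owning reference to an X that lives elsewhere.
using XOrRef = std::variant<X, std::reference_wrapper<const X>>;

inline XOrRef foo(bool b) {
    if (b) {
        return by_value();             // the variant owns the X
    }
    return std::cref(by_reference());  // the variant only refers to the X
}

// Uniform read access, whichever alternative is active.
inline const X& value_of(const XOrRef& v) {
    if (const X* owned = std::get_if<X>(&v)) {
        return *owned;
    }
    return std::get<std::reference_wrapper<const X>>(v).get();
}

inline void variant_demo() {
    XOrRef a = foo(true);
    XOrRef b = foo(false);
    assert(std::holds_alternative<X>(a));
    assert(value_of(a).id == 1);
    assert(&value_of(b) == &by_reference());  // no copy of the stored X was made
}
```

The caller still has to go through the variant (here via value_of) rather than binding directly to X or const X&, which is the price of deferring the choice to run time.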
I think this is impossible because whether the caller decides to move or copy the return value (whether it's from a proxy or from your class itself) is a compile-time decision, whereas what you want is to make it a run-time decision. Overload resolution cannot happen at run-time.
The only way out that I can see is to have the callee decide this, i.e. by providing a T & parameter which it can either move-assign to or copy-assign to depending on what it deems appropriate.
Alternatively, you can pass an aligned_storage<sizeof(T)> buffer and have the callee construct the value inside it, if you don't think the caller can be expected to make a "null" instance of some sort.
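A minimal sketch of the out-parameter variant of this idea, where the callee decides between move-assignment and copy-assignment at run time. The helpers make_fresh and stored are hypothetical, and std::string merely stands in for the question's X.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-ins: one source produces a fresh value, the other
// exposes a long-lived stored object.
inline std::string make_fresh() { return std::string(64, 'f'); }
inline const std::string& stored() {
    static const std::string s(64, 's');
    return s;
}

// The callee picks move-assignment or copy-assignment at run time.
inline void foo(bool b, std::string& out) {
    if (b) {
        out = make_fresh();  // temporary: move-assigned into out
    } else {
        out = stored();      // lvalue: copy-assigned into out
    }
}

inline void out_param_demo() {
    std::string out;         // the caller must supply an "empty" instance
    foo(true, out);
    assert(out == std::string(64, 'f'));
    foo(false, out);
    assert(out == stored());
}
```

Note that the stored case still pays for one copy; this approach avoids extra copies only relative to returning by value and then assigning.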
Well, if you really want to achieve this, here's one rather ugly way:
X *foo(bool b, X *value) {
if (b) {
*value = get_value();
} else {
value = get_pointer_to_value();
}
return value;
}
Example usage:
void examplefunc() {
X local_store;
X *result;
result = foo(true, &local_store);
assert(result == &local_store);
use_x_value(*result);
result = foo(false, &local_store);
assert(result != &local_store);
use_x_value(*result);
}
The above approach is cumbersome: it needs two local variables and forces you to use the return value through a pointer. It also exposes a raw pointer, which can't be nicely converted to a smart pointer (putting local_store on the heap to allow using a smart pointer would make this approach even more complex, not to mention add the overhead of heap allocation). Also, local_store is always default-constructed, but if you don't need to make examplefunc re-entrant, it can be made static (or use thread-local storage for a multi-threaded version).
So I have a hard time imagining where you would actually want to use this. It'd be simpler to always just return a copied value (and let the compiler take care of copy elision when it can), or always return a reference, or maybe always return a shared_ptr.
Your goal is bad: having multiple return types, where one is a copy and the other is a reference, makes the function unpredictable.
Assuming foo is a member function of some classes A and B, this:
X A::foo() { return X(); }
X a = A().foo();
is well defined, whereas this:
const X& B::foo() { return some_internal_x; }
const X& b = B().foo();
leaves b as a dangling reference.
You may be able to accomplish what you want (still a bit fuzzy) by passing the variable type to get_value_of_array. This would allow it to return two different types and make adjustments based on whether the member of the array is a function or array.
#include <iostream>
#include <utility>
struct X
{
X() { std::cout << "Construct" << std::endl; }
X(X const&) { std::cout << "Copy" << std::endl; }
X(X&&) { std::cout << "Move" << std::endl; }
};
const X array;
X function() { return X(); }
template<typename ReturnType>
ReturnType get_value_of_array(bool);
template<>
const X& get_value_of_array<const X&>(bool /*isarray*/)
{
// if (isarray == false) return the cached result of function()
return array; // gotta build the example yo!
}
template<>
X get_value_of_array<X>(bool isarray)
{
return isarray ? array : std::move(function());
}
int main()
{
// Optimizations may vary.
const X& x1 = get_value_of_array<decltype(x1)>(true); // No copies or moves
const X& x2 = get_value_of_array<decltype(x2)>(false); // No copies or moves.
X x3 = get_value_of_array<decltype(x3)>(true); // One copy, one move.
X x4 = get_value_of_array<decltype(x4)>(false); // Two moves.
}
Thanks to Cheers and hth. - Alf for the implementation of X.
At the time I'm writing this answer it's not a requirement that the argument to the function foo should be computed at run time, or should be allowed to be other than literally false or true, thus:
#include <utility>
#include <iostream>
namespace my {
using std::cout; using std::endl;
class X
{
private:
X& operator=( X const& ) = delete;
public:
X() {}
X( X const& )
{ cout << "Copy" << endl; }
X( X&& )
{ cout << "Move" << endl; }
};
} // my
auto foo_true()
-> my::X
{ return my::X(); }
auto foo_false()
-> my::X const&
{ static my::X const static_x; return static_x; }
#define foo( arg ) foo_##arg()
auto main() -> int
{
using namespace my;
cout << "A" << endl; const X& x1 = foo(true); // No copies
cout << "B" << endl; const X& x2 = foo(false); // No copies
cout << "C" << endl; X x3 = foo(true); // No copies, one move (or zero via RVO)
cout << "D" << endl; X x4 = foo(false); // One copy
}

Why does C++11's lambda require "mutable" keyword for capture-by-value, by default?

Short example:
#include <iostream>
int main()
{
int n;
[&](){n = 10;}(); // OK
[=]() mutable {n = 20;}(); // OK
// [=](){n = 10;}(); // Error: a by-value capture cannot be modified in a non-mutable lambda
std::cout << n << "\n"; // "10"
}
The question: Why do we need the mutable keyword? It's quite different from traditional parameter passing to named functions. What's the rationale behind it?
I was under the impression that the whole point of capture-by-value is to allow the user to change the temporary -- otherwise I'm almost always better off using capture-by-reference, aren't I?
Any enlightenments?
(I'm using MSVC2010 by the way. AFAIK this should be standard)
It requires mutable because by default, a function object should produce the same result every time it's called. This is effectively the difference between an object-oriented function and a function using a global variable.
Your code is almost equivalent to this:
#include <iostream>
class unnamed1
{
int& n;
public:
unnamed1(int& N) : n(N) {}
/* OK. Your this is const but you don't modify the "n" reference,
but the value pointed by it. You wouldn't be able to modify a reference
anyway even if your operator() was mutable. When you assign a reference
it will always point to the same var.
*/
void operator()() const {n = 10;}
};
class unnamed2
{
int n;
public:
unnamed2(int N) : n(N) {}
/* OK. Your this pointer is not const (since your operator() is "mutable" instead of const).
So you can modify the "n" member. */
void operator()() {n = 20;}
};
class unnamed3
{
int n;
public:
unnamed3(int N) : n(N) {}
/* BAD. Your this is const so you can't modify the "n" member. */
void operator()() const {n = 10;}
};
int main()
{
int n;
unnamed1 u1(n); u1(); // OK
unnamed2 u2(n); u2(); // OK
//unnamed3 u3(n); u3(); // Error
std::cout << n << "\n"; // "10"
}
So you could think of lambdas as generating a class with operator() that defaults to const unless you say that it is mutable.
You can also think of all the variables captured inside [] (explicitly or implicitly) as members of that class: copies of the objects for [=] or references to the objects for [&]. They are initialized when you declare your lambda as if there was a hidden constructor.
I was under the impression that the whole point of capture-by-value is to allow the user to change the temporary -- otherwise I'm almost always better off using capture-by-reference, aren't I?
The question is, is it "almost"? A frequent use-case appears to be to return or pass lambdas:
void registerCallback(std::function<void()> f) { /* ... */ }
void doSomething() {
std::string name = receiveName();
registerCallback([name]{ /* do something with name */ });
}
I think that mutable isn't a case of "almost". I consider "capture-by-value" like "allow me to use its value after the captured entity dies" rather than "allow me to change a copy of it". But perhaps this can be argued.
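A small sketch of that "use its value after the captured entity dies" reading, with a hypothetical make_greeter function: the returned closure carries its own copy of name, so it remains valid after the local has been destroyed, and no mutable is needed because the copy is only read.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Returns a callable that outlives the local variable it was built from.
inline std::function<std::string()> make_greeter() {
    std::string name = "world";  // destroyed when make_greeter returns
    // The closure owns its own copy of name. Capturing [&name] instead
    // would leave a dangling reference once this function returns.
    return [name] { return "hello, " + name; };
}

inline void greeter_demo() {
    auto greet = make_greeter();
    assert(greet() == "hello, world");
}
```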
You have to understand what capture means! It's capturing, not argument passing! Let's look at some code samples:
int main()
{
using namespace std;
int x = 5;
int y;
auto lamb = [x]() {return x + 5; };
y= lamb();
cout << y<<","<< x << endl; //outputs 10,5
x = 20;
y = lamb();
cout << y << "," << x << endl; //output 10,20
}
As you can see even though x has been changed to 20 the lambda is still returning 10 ( x is still 5 inside the lambda)
Changing x inside the lambda means changing the lambda itself at each call (the lambda is mutating at each call). To enforce correctness, the standard introduced the mutable keyword. By specifying a lambda as mutable, you are saying that each call to the lambda could cause a change in the lambda itself. Let's see another example:
int main()
{
using namespace std;
int x = 5;
int y;
auto lamb = [x]() mutable {return x++ + 5; };
y= lamb();
cout << y<<","<< x << endl; //outputs 10,5
x = 20;
y = lamb();
cout << y << "," << x << endl; //outputs 11,20
}
The above example shows that by making the lambda mutable, changing x inside the lambda "mutates" the lambda at each call with a new value of x that has nothing to do with the actual value of x in the main function.
FWIW, Herb Sutter, a well-known member of the C++ standardization committee, provides a different answer to that question in Lambda Correctness and Usability Issues:
Consider this straw man example, where the programmer captures a local variable by
value and tries to modify the
captured value (which is a member variable of the lambda object):
int val = 0;
auto x = [=](item e) // look ma, [=] means explicit copy
{ use(e,++val); }; // error: count is const, need ‘mutable’
auto y = [val](item e) // darnit, I really can’t get more explicit
{ use(e,++val); }; // same error: count is const, need ‘mutable’
This feature appears to have been added out of a concern that the user
might not realize he got a copy, and in particular that since lambdas
are copyable he might be changing a different lambda’s copy.
His paper is about why this should be changed in C++14. It is short, well written, worth reading if you want to know "what's on [committee member] minds" with regards to this particular feature.
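The concern about "changing a different lambda's copy" can be made concrete with a short sketch (the function and variable names here are illustrative only). Each copy of a closure carries its own copy of the captured value, so mutations made through one copy are invisible to the other and to the original variable.

```cpp
#include <cassert>

inline void copies_have_separate_state() {
    int val = 0;
    auto counter = [val]() mutable { return ++val; };
    auto copy = counter;     // copies the closure, including its val (still 0)

    assert(counter() == 1);
    assert(counter() == 2);
    assert(copy() == 1);     // unaffected by the calls made on counter
    assert(val == 0);        // the original local was never touched
}
```

This matters in practice because closures are copied whenever they are passed by value, so a caller may end up inspecting a different copy from the one that was actually mutated.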
You need to think about the closure type of your lambda. Every time you write a lambda expression, the compiler creates a closure type, which is nothing more than an unnamed class with data members (the environment in which the lambda expression was declared) and an implemented function call operator, operator(). When you capture a variable by value, the compiler adds a copy of it as a data member of the closure type. Because operator() is const by default, that member is read-only inside the lambda body, so you can't change it there. That's the reason it is called a "closure": in some way, you are closing your lambda expression by copying the variables from the enclosing scope into the lambda's scope. When you use the keyword mutable, operator() is no longer const, so the captured copy can be modified. Because it is still a copy, changes made to it are not propagated to the enclosing scope but are kept inside the stateful lambda.
Always try to imagine the resulting closure type of your lambda expression; that helped me a lot, and I hope it can help you too.
See this draft, under 5.1.2 [expr.prim.lambda], subclause 5:
The closure type for a lambda-expression has a public inline function call operator (13.5.4) whose parameters and return type are described by the lambda-expression’s parameter-declaration-clause and trailing-return-type respectively. This function call operator is declared const (9.3.1) if and only if the lambda-expression’s parameter-declaration-clause is not followed by mutable.
Edit on litb's comment:
Maybe they thought of capture-by-value so that outside changes to the variables aren't reflected inside the lambda? References work both ways, so that's my explanation. Don't know if it's any good though.
Edit on kizzx2's comment:
Lambdas are most often used as functors for algorithms. The default constness lets a lambda be used in a const context, just as const-qualified member functions can be used there while non-const-qualified ones can't. Maybe they just wanted to make it more intuitive for those cases; who knows what goes on in their minds. :)
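To illustrate the point about const contexts, here is a sketch assuming C++17 (for std::is_invocable_v and message-less static_assert). The helper call_through_const is hypothetical; it only has const access to the callable it receives, so a non-mutable lambda can be called through it while a mutable one cannot.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper that only has const access to the callable.
template <typename F>
int call_through_const(const F& f) { return f(); }

inline void const_callability() {
    int n = 3;
    auto plain = [n] { return n * 2; };
    auto stateful = [n]() mutable { return ++n; };

    assert(call_through_const(plain) == 6);
    // call_through_const(stateful);  // would not compile: operator() is non-const

    // The same facts, checked at compile time.
    static_assert(std::is_invocable_v<const decltype(plain)&>);
    static_assert(!std::is_invocable_v<const decltype(stateful)&>);
    static_assert(std::is_invocable_v<decltype(stateful)&>);

    assert(stateful() == 4);  // fine through a non-const object
}
```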
I was under the impression that the whole point of capture-by-value is to allow the user to change the temporary -- otherwise I'm almost always better off using capture-by-reference, aren't I?
n is not a temporary. n is a member of the lambda-function-object that you create with the lambda expression. The default expectation is that calling your lambda does not modify its state, therefore it is const to prevent you from accidentally modifying n.
To extend Puppy's answer, lambda functions are intended to be pure functions. That means every call with the same input set always returns the same output. Let's define input as the set of all arguments plus all captured variables when the lambda is called.
In pure functions, output depends solely on input and not on some internal state. Therefore any lambda function, if pure, does not need to change its state and is therefore immutable.
When a lambda captures by reference, writing to the captured variables strains the concept of a pure function, because all a pure function should do is return an output. The lambda itself certainly does not mutate, though, because the writes go to external variables. Even in this case, correct usage implies that if the lambda is called with the same input again, the output will be the same every time, despite these side effects on by-reference variables. Such side effects are just ways to return some additional output (e.g. updating a counter) and could be reformulated as a pure function, for example by returning a tuple instead of a single value.
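The reformulation mentioned above might look like the following sketch. The lambda names and the counter are illustrative assumptions: the first lambda updates an external counter through a by-reference capture, while the second takes the counter as input and returns the updated value in a tuple.

```cpp
#include <cassert>
#include <tuple>

inline void pure_vs_side_effect() {
    int calls = 0;
    // Side-effecting version: writes to an external counter by reference.
    auto square_counted = [&calls](int v) { ++calls; return v * v; };

    // Pure reformulation: the counter goes in as an argument and comes
    // back out as part of the result.
    auto square_pure = [](int v, int calls_so_far) {
        return std::make_tuple(v * v, calls_so_far + 1);
    };

    assert(square_counted(4) == 16);
    assert(calls == 1);

    auto result = square_pure(4, 0);
    assert(std::get<0>(result) == 16);
    assert(std::get<1>(result) == 1);
    assert(square_pure(4, 0) == result);  // same input, same output
}
```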
I also was wondering about it and the simplest explanation why [=] requires explicit mutable is in this example:
int main()
{
int x {1};
auto lbd = [=]() mutable { return x += 5; };
printf("call1:%d\n", lbd());
printf("call2:%d\n", lbd());
return 0;
}
Output:
call1:6
call2:11
In words:
You can see that the value of x is different at the second call (1 for call1 and 6 for call2).
With [=], the lambda object keeps a captured variable by value (it has its own copy).
The lambda can be called several times.
In the general case, we want the captured variable to keep the same value, so that the lambda behaves predictably based on the known captured value, which is not updated while the lambda runs. That's why the default is const (so that changes to the lambda object's members are predictable), and when users are aware of the consequences, they take on this responsibility themselves with mutable.
The same applies when the variable is captured by value explicitly. For my example:
auto lbd = [x]() mutable { return x += 5; };
There is now a proposal to alleviate the need for mutable in lambda declarations: n3424
You might see the difference, if you check 3 different use cases of lambda:
Capturing an argument by value
Capturing an argument by value with 'mutable' keyword
Capturing an argument by reference
case 1:
When you capture an argument by value, a few things happen:
You are not allowed to modify the argument inside the lambda
The value of the argument inside the lambda remains the same whenever the lambda is called, no matter what the argument's value is in the enclosing scope at that time.
For example:
{
int x = 100;
auto lambda1 = [x](){
// x += 2; // compile time error. not allowed
// to modify an argument that is captured by value
return x * 2;
};
cout << lambda1() << endl; // 100 * 2 = 200
cout << "x: " << x << endl; // 100
x = 300;
cout << lambda1() << endl; // in the lambda, x remains 100. 100 * 2 = 200
cout << "x: " << x << endl; // 300
}
Output:
200
x: 100
200
x: 300
case 2:
Here, when you capture an argument by value and use the 'mutable' keyword, similar to the first case, you create a "copy" of this argument. This "copy" lives in the "world" of the lambda, but now, you can actually modify the argument within the lambda-world, so its value is changed, and saved and it can be referred to, in the future calls of this lambda. Again, the outside "life" of the argument might be totally different (value wise):
{
int x = 100;
auto lambda2 = [x]() mutable {
x += 2; // modifying a by-value capture is
// allowed when mutable is used.
return x;
};
cout << lambda2() << endl; // 100 + 2 = 102
cout << "x: " << x << endl; // in the outside world - x remains 100
x = 200;
cout << lambda2() << endl; // 104, as the 102 is saved in the lambda world.
cout << "x: " << x << endl; // 200
}
Output:
102
x: 100
104
x: 200
case 3:
This is the easiest case, as x no longer has two separate lives. Now there is only one value for x, and it's shared between the outside world and the lambda world.
{
int x = 100;
auto lambda3 = [&x]() {
x += 10; // allowed even without mutable: a by-reference capture
// modifies the outer x, not the lambda's own state.
return x;
};
cout << lambda3() << endl; // 110
cout << "x: " << x << endl; // 110
x = 400;
cout << lambda3() << endl; // 410.
cout << "x: " << x << endl; // 410
}
Output:
110
x: 110
410
x: 410