Lambda capturing - what is the difference between & and this? [duplicate] - c++

If I need to create a lambda that calls a member function, should I capture by reference or capture 'this'? My understanding is that '&' captures only the variables that are used, whereas 'this' captures all member variables. So is it better to use '&'?
class MyClass {
public:
int mFunc() {
// accesses member variables
}
std::function<int()> get() {
//return [this] () { return this->mFunc(); };
// or
//return [&] () { return this->mFunc(); };
}
private:
// member variables
};

For the specific example you've provided, capturing by this is what you want. Conceptually, capturing this by reference doesn't make a whole lot of sense, since you can't change the value of this, you can only use it as a pointer to access members of the class or to get the address of the class instance. Inside your lambda function, if you access things which implicitly use the this pointer (e.g. you call a member function or access a member variable without explicitly using this), the compiler treats it as though you had used this anyway. You can list multiple captures too, so if you want to capture both members and local variables, you can choose independently whether to capture them by reference or by value. The following article should give you a good grounding in lambdas and captures:
https://crascit.com/2015/03/01/lambdas-for-lunch/
Also, your example uses std::function as the return type through which the lambda is passed back to the caller. Be aware that std::function isn't always as cheap as you may think, so if you are able to use a lambda directly rather than having to wrap it in a std::function, it will likely be more efficient. The following article, while not directly related to your original question, may still give you some useful material relating to lambdas and std::function (see the section An alternative way to store the function object, but the article in general may be of interest):
https://crascit.com/2015/06/03/on-leaving-scope-part-2/

Here is a good explanation of what &, this and the others indicate when used in the capture list.
In your case, assuming that all you need to do is call a member function of the instance that the currently executing method's this points to, putting this in your capture list should be enough.

Capturing this and capturing by reference are two orthogonal concepts. You can use one, both, or none. It doesn't make sense to capture this by reference but you can capture other variables by reference while capturing this by value.

It's not a clear-cut situation where one is better than the other. Rather, the two (at least potentially) accomplish slightly different things. For example, consider code like this:
#include <iostream>
class foo {
int bar = 0;
public:
void baz() {
int bar = 1;
auto thing1 = [&] { bar = 2; };
auto thing2 = [this] { this->bar = 3; };
std::cout << "Before thing1: local bar: " << bar << ", this->bar: " << this->bar << "\n";
thing1();
std::cout << "After thing1: local bar: " << bar << ", this->bar: " << this->bar << "\n";
thing2();
std::cout << "After thing2: local bar: " << bar << ", this->bar: " << this->bar << "\n";
}
};
int main() {
foo f;
f.baz();
}
As you can see, capturing this captures only the variables that can be referred to via this. In this case, we have a local variable that shadows an instance variable (yes, that's often a bad idea, but in this case we're using it to show part of what each does). As we see when we run the program, we get different results from capturing this vs. an implicit capture by reference:
Before thing1: local bar: 1, this->bar: 0
After thing1: local bar: 2, this->bar: 0
After thing2: local bar: 2, this->bar: 3
As to the specifics of capturing everything vs. only what you use: neither will capture any variable you don't use. But, since this is a pointer, capturing that one variable gives you access to everything it points at. That's not unique to this though. Capturing any pointer will give you access to whatever it points at.

Related

Calling std::function destructor while still executing it

I want to dynamically change the behaviour of a class's methods, so I implemented each of these methods as a call to the operator() of a std::function. At any given time the std::function holds a copy of one lambda, which depends on values known only after the class is constructed.
The lambdas change the state of the class, so they reset a container holding the behaviours of all dynamic methods.
When I tried this, I was not able to access the lambda's captures after resetting the container.
The following snippet reproduces the problem:
std::vector< std::function<void(std::string)> > vector;
int main() {
//Change class state when variable value will be known
std::string variableValue = "hello";
auto function = [variableValue](std::string arg) {
std::cout <<"From capture list, before: "<< variableValue << std::endl;
std::cout <<"From arg, before: " << arg << std::endl;
vector.clear();
std::cout << "From capture list, after: " << variableValue << std::endl;
std::cout << "From arg, after: " << arg << std::endl;
};
vector.push_back(function);
//Dynamic method execution
vector[0](variableValue);
return 0;
}
Producing output:
From capture list, before: hello
From arg, before: hello
From capture list, after:
From arg, after: hello
where variableValue is invalidated after the vector was cleared.
Is the capture list invalidation an expected result?
Is it safe to use any other local variable, not only those in the capture list, after calling the std::function destructor?
Is there a suggested way / pattern to accomplish the same behaviour in a safer way (excluding huge switches/if on class states)?
We can get rid of the std::function, lambda and vector for this question. Since lambdas are just syntactic sugar for classes with a function-call operator, your testcase is effectively the same as this:
struct Foo
{
std::string variableValue = "hello";
void bar(std::string arg)
{
std::cout <<"From capture list, before: "<< variableValue << std::endl;
std::cout <<"From arg, before: " << arg << std::endl;
delete this; // ugrh
std::cout << "From capture list, after: " << variableValue << std::endl;
std::cout << "From arg, after: " << arg << std::endl;
}
};
int main()
{
Foo* ptr = new Foo();
ptr->bar("hello"); // main has no variableValue of its own, so pass a literal
}
The function argument is fine because it's a copy, but after delete this the member Foo::variableValue no longer exists, so your program has undefined behaviour from trying to use it.
Common wisdom is that continuing to run the function itself is legal (because function definitions aren't objects and cannot be "deleted"; they are just a fundamental property of your program), as long as you leave the encapsulating class's members well enough alone.
I would, however, advise avoiding this pattern unless you really need it. It'll be easy to confuse people as to the ownership responsibilities of your class (even when "your class" is autonomously-generated from a lambda expression!).
Is the capture list invalidation an expected result?
Yes.
Is it safe to use any other local variable, not only those in the capture list, after calling the std::function destructor?
Yes.
Is there a suggested way / pattern to accomplish the same behaviour in a safer way (excluding huge switches/if on class states)?
That's impossible to say for sure without understanding what it is that you're trying to do. But you could try playing around with storing shared_ptrs in your vector instead… Just be careful not to capture a shared_ptr in the lambda itself, or it'll never be cleaned up! Capturing a weak_ptr instead can be good for this; it can be "converted" to a shared_ptr inside the lambda body, which will protect the lambda's life for the duration of said body.
std::function's destructor destroys the object's target if the object is non-empty, where the target is the wrapped callable object.
In your case, the target is a lambda expression. When you use a lambda expression, the compiler generates a "non-union non-aggregate class type" that contains the captures-by-value as data members and has operator() as a member function.
When you execute vector.clear(), the destructors of its elements are run, and therefore the destructors of the closure's captures-by-value, which are member variables, are run.
As for captures-by-reference, "the reference variable's lifetime ends when the lifetime of the closure object ends."
So, it is not safe to access any capture, whether by value or by reference, after std::function's destructor runs.
What about the actual operator()? "Functions are not objects," so they don't have lifetimes. So, the mere execution of the operator() after the destructor has been run should be fine, as long as you don't access any captures. See the conditions under which one can safely delete this.

Why can't I change the value of a variable captured by copy in a lambda function?

I'm reading the C++ Programming Language by B. Stroustrup in its section 11.4.3.4 "mutable Lambdas", which says the following:
Usually, we don’t want to modify the state of the function object (the
closure), so by default we can’t. That is, the operator()() for the
generated function object (§11.4.1) is a const member function. In the
unlikely event that we want to modify the state (as opposed to
modifying the state of some variable captured by reference; §11.4.3),
we can declare the lambda mutable.
I don't understand why the default for the operator()() is const when the variable is captured by value. What's the rationale for this? What could go wrong when I change the value of a variable that is copied into the function object?
One can think of lambdas as classes with an operator()() that is const by default. That is, it cannot change the state of the object. Consequently, a lambda that captures only by value behaves like a regular function and produces the same result every time it is called with the same arguments. If we instead declare the lambda as mutable, it can modify the internal state of the object and return different results for different calls depending on that state. This is not very intuitive and is therefore discouraged.
For example, with mutable lambda, this can happen:
#include <iostream>
int main()
{
int n = 0;
auto lam = [=]() mutable {
n += 1;
return n;
};
std::cout << lam() << "\n"; // Prints 1
std::cout << n << "\n"; // Prints 0
std::cout << lam() << "\n"; // Prints 2
std::cout << n << "\n"; // Prints 0
}
It is easier to reason about const data.
By defaulting to const, brief lambdas are easier to reason about. If you want mutability you can ask for it.
Many function objects in std are copied around; const objects that are copied have simpler state to track.

Are C++ lambdas true closures? Capturing by reference

In the code below, I create a lambda that captures a local variable by reference. Note that it is a pointer, so, if C++ lambdas are true closures, it should survive the lifetime of the function that creates the lambda.
However, when I call it again, rather than creating a new local variable (a new environment) it reuses the same as before, and in fact, captures exactly the same pointer as before.
This seems wrong. Either, C++ lambdas are not true closures, or is my code incorrect?
Thank you for any help
#include <iostream>
#include <functional>
#include <memory>
std::function<int()> create_counter()
{
std::shared_ptr<int> counter = std::make_shared<int>(0);
auto f = [&] () -> int { return ++(*counter); };
return f;
}
int main()
{
auto counter1 = create_counter();
auto counter2 = create_counter();
std::cout << counter1() << std::endl;
std::cout << counter1() << std::endl;
std::cout << counter2() << std::endl;
std::cout << counter2() << std::endl;
std::cout << counter1() << std::endl;
return 0;
}
This code returns:
1
2
3
4
5
But I was expecting it to return:
1
2
1
2
3
Further edit:
Thank you for pointing out the error in my original code. I see now that what is happening is that the pointer gets deleted after the invocation of create_counter, and the next call simply reuses the same memory address.
Which brings me to my real question then, what I want to do is this:
std::function<int()> create_counter()
{
int counter = 0;
auto f = [&] () -> int { return ++counter; };
return f;
}
If C++ lambdas were true closures, each local counter will coexist with the returned function (the function carries its environment--at least part of it). Instead, counter is destroyed after the invocation of create_counter, and calling the returned function creates a segmentation fault. That is not the expected behaviour of a closure.
Marco A has suggested a workaround: capture the pointer by copy. That increases the reference count, so the pointee is not destroyed after create_counter returns. That feels like a kludge but, as Marco pointed out, it works and does exactly what I was expecting.
Jarod42 proposes to declare the variable and initialize it as part of the capture list. But that defeats the purpose of the closure, as the variables are then local to the function, not to the environment where the function is created.
apple apple proposes using a static counter. But that is a workaround to avoid the destruction of the variable at the end of create_counter, and it means that all returned functions share the same variable, not the environment under which they run.
So I guess the conclusion (unless somebody can shed more light) is that lambdas in C++ are not true closures.
thank you again for your comments.
The shared pointer is destroyed at the end of the function scope and the memory is freed, so you're storing a dangling reference:
std::function<int()> create_counter()
{
std::shared_ptr<int> counter = std::make_shared<int>(0);
auto f = [&]() -> int { return ++(*counter); };
return f;
} // counter gets destroyed
This invokes undefined behavior. Test it for yourself by substituting the integer with a class or struct and checking whether the destructor actually gets called.
Capturing by value would have incremented the use count of the shared pointer and prevented the problem:
auto f = [=]() -> int { return ++(*counter); };
^
As mentioned, you have dangling reference as the local variable is destroyed at end of the scope.
You can simplify your function to
std::function<int()> create_counter()
{
int counter = 0;
return [=] () mutable -> int { return ++counter; };
}
or even (in C++14)
auto create_counter()
{
return [counter = 0] () mutable -> int { return ++counter; };
}
Lambda expression — A lambda expression specifies an object specified inline, not just a function without a name, capable of capturing variables in scope.
Lambdas can frequently be passed around as objects.
In addition to its own function parameters, a lambda expression can refer to local variables in the scope of its definition.
Closures -
Closures are special functions that can capture the environment, i.e. variables within a lexical scope.
A closure is any function that closes over the environment in which it
was defined. This means that it can access variables, not in its
parameter list.
What is the C++-specific part here?
A closure is a general concept in programming that originated from functional programming. When we talk about closures in C++, they always come with lambda expressions (some scholars prefer to include function objects in this).
In C++, a lambda expression is the syntax used to create a special temporary object that behaves similarly to how function objects behave.
The C++ standard specifically refers to this type of object as a
closure object. This is a little bit at odds with the broader
definition of a closure, which refers to any function, anonymous or
not, that captures variables from the environment they are defined in.
As far as the standard is concerned, all instantiations of lambda expressions are closure objects, even if they don’t have any captures in their capture group.
https://pranayaggarwal25.medium.com/lambdas-closures-c-d5f16211de9a
if you want 1 2 3 4 5, you can also try this
std::function<int()> create_counter()
{
static int counter = 0;
auto f = [&] () -> int { return ++counter; };
return f;
}
If the variable is captured by value, then it is copy-constructed from the original variable. If it is captured by reference, you can treat the capture as another reference to the same object.

Lambda function performance impact of capture

I just wrote a pretty big capture:
[this, &newIndex, &indexedDirs, &filters, &flags, &indexRecursion](){...
I use this lambda (indexRecursion) for a recursion over thousands of elements and wondered whether it would be more efficient to use the catch-all capture [&]. Since I don't know how captures are implemented, I would appreciate an explanation with some background.
Usually you can think of a lambda as equivalent to this:
class ANON {
int data;
public:
void operator ()(void) const {
cout << data << endl;
}
} lambda;
// auto lambda = [data]() {cout << data << endl;}
This should give you an idea of how capture is implemented. Capture all (be it by copy = or reference &) will probably be no more than syntactic sugar for specifying all used/available variables for capture in the current scope.
But since ...
[..] An implementation may define the closure type differently from what is described below provided this does not alter the observable behavior of the program other than by changing: [..] the size and/or alignment of the closure type [..]
[N4431 §5.1.2/3]
... it would be legal for an implementation to use some sort of "black magic" for capture all by reference lambdas and just use a pointer to the captured stack frame, rewriting accesses to the variables as accesses to some offset of that pointer:
class ANON {
void * stack_frame;
public:
void operator ()(void) const {
cout << *reinterpret_cast<int *>(static_cast<char *>(stack_frame) + 8) << endl; // the byte offset 8 is illustrative
}
} lambda;
So using & might (some day) be more efficient, but as already said, this is left to the implementation and is nothing to rely upon.
Internally, lambdas are /usually/ implemented as ad-hoc classes whose single instance is constructed at the point of the lambda definition and which expose a function-call operator to be called later. So lambda performance should be compared with passing a method to a function using std::bind.
Captures aren't mystical entities either. With reference captures, the entities they refer to are destroyed when they go out of scope as usual, so beware if your lambda outlives that scope: inside its body it may refer to objects that have already been destroyed.

capture member variable by value

How would I catch a member variable by value when using C++11 lambda expressions?
Using the [my_member] syntax doesn't seem to work, and implicit capture uses the this pointer. What I need is a way to explicitly specify the capture type of member variables. Is that possible?
My workaround for now is:
void member_function()
{
std::shared_ptr<my_member_class> my_member_copy = my_member; // this shouldn't be necessary
std::async([=]{ std::cout << *my_member_copy; });
// std::async([=]{ std::cout << *my_member_; }); // wrong, my member could be potentially out of scope
}
I don't think you can capture a member by value. You can capture this, but since the member is part of this, you'll be using the shared member and not a new variable.
Without knowing what type your member is, something like this should work:
auto copy = my_member;
std::async([copy]{ std::cout << copy; });
I don't understand why you're using a shared_ptr in your example, if you want to capture by value surely shared_ptr is the last thing you should consider.
Unfortunately, I don't think there is a straight-forward way to do this, but I can think of a couple of ways to capture a member without making an extra copy.
The first option is similar to your example but uses a reference for the local variable:
void member_function()
{
std::shared_ptr<my_member_class> &my_member_ref = my_member;
// Copied by the lambda capture
std::async([my_member_ref]{ std::cout << *my_member_ref; });
}
Note that there is a bug in GCC versions before 4.6.2 that causes the value not to be copied. See Capturing reference variable by copy in C++0x lambda.
A second approach would be to use bind to make the copy:
void member_function()
{
// Copied by std::bind
std::async(std::bind([](const std::shared_ptr<my_member_class>& my_member){
std::cout << *my_member; }, my_member));
}
In this example, bind will make its own copy of my_member, and this copy will then be passed to the lambda expression by reference.
Since your question is about C++11 this is not really an answer, but in C++14 you can do this:
void member_function()
{
std::async([my_member=my_member]{ std::cout << *my_member; });
}
It does the same thing as your own "work-around" (if my_member is a shared_ptr).
auto& copy = my_member;
std::async([copy]{ std::cout << copy; });
auto& (above) also works and obviates copying twice. Although this approach is more syntax than passing [this], it obviates passing into the closure a dependency on the object [this] points to.
Right now, I faced the same problem and solved it myself:
Capture the this pointer.
then write this->member syntax inside the lambda:
That is,
std::async([this]{ std::cout << this->my_member_; } );
// ^^^^ ^^^^^^ use this syntax
// |
// ^
// capture `this` as well
It works for me. I hope it should work for you too. However, I'm not completely satisfied with this knowledge. After my work, I'll look for the reason why this syntax is required, or it is a compiler bug. I'm using GCC 4.5.0 (MinGW).
Well, I found the following topic which says this pointer should be captured in order to use the member of the class.
Capturing this required to access member functions?