I have a (hopefully) simple question about lambda expressions:
#include <cstdio>
#include <vector>
#include <algorithm>
//----------------------------------------------------------------
void DoSomething()
//----------------------------------------------------------------
{
std::vector<int> elements;
elements.push_back(1);
elements.push_back(2);
int ref = 1;
auto printhit = [=](int iSomeNumber)
{
if (ref == iSomeNumber)
{
printf("Hit: %d\n", iSomeNumber);
}
else
{
printf("No Hit: %d\n", iSomeNumber);
}
};
ref = 2;
std::for_each(elements.begin(), elements.end(), printhit);
}
Now, my question is: when I define printhit with capture [=], it prints "Hit: 1". If I capture by reference with [&], it prints "Hit: 2".
I somehow expected that the substitution is done within for_each, so that "Hit: 2" is printed no matter how I grant access to "ref".
Can anyone explain this to me?
Thanks,
Markus
The capture happens at the location where you declare the lambda. Just like if you were to create a class object at that point and pass ref to its constructor.
Your example is equivalent to this:
class Functor
{
public:
Functor(int r) :ref(r) {}
void operator()(int iSomeNumber) const
{
if (ref == iSomeNumber)
{
printf("Hit: %d\n", iSomeNumber);
}
else
{
printf("No Hit: %d\n", iSomeNumber);
}
}
private:
int ref;
};
void DoSomething()
//----------------------------------------------------------------
{
std::vector<int> elements;
elements.push_back(1);
elements.push_back(2);
int ref = 1;
Functor printhit(ref);
ref = 2;
std::for_each(elements.begin(), elements.end(), printhit);
}
I believe the following parts of the C++ standard apply:
5.1.2.14:
An entity is captured by copy if it is implicitly captured and the capture-default is = or if it is explicitly
captured with a capture that does not include an &. For each entity captured by copy, an unnamed nonstatic
data member is declared in the closure type. The declaration order of these members is unspecified.
The type of such a data member is the type of the corresponding captured entity if the entity is not a
reference to an object, or the referenced type otherwise. [ Note: If the captured entity is a reference to a
function, the corresponding data member is also a reference to a function. —end note ]
5.1.2.21:
When the lambda-expression is evaluated, the entities that are captured by copy are used to direct-initialize
each corresponding non-static data member of the resulting closure object. (For array members, the array
elements are direct-initialized in increasing subscript order.) These initializations are performed in the
(unspecified) order in which the non-static data members are declared. [ Note: This ensures that the
destructions will occur in the reverse order of the constructions. —end note ]
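To see what those two paragraphs mean in practice, here is a minimal sketch. The helper name observe_captures is made up for illustration; it simply runs the same "capture, then modify ref" sequence as the question and reports what each kind of closure sees.

```cpp
#include <utility>

// Hypothetical helper: returns {value seen through [=], value seen through [&]}.
std::pair<int, int> observe_captures()
{
    int ref = 1;
    auto by_copy = [=] { return ref; };       // ref (currently 1) is copied into the closure here
    auto by_reference = [&] { return ref; };  // the closure only stores a reference to ref
    ref = 2;                                  // changes the original, not the stored copy
    return { by_copy(), by_reference() };     // first == 1, second == 2
}
```

The by-copy closure was initialized when the lambda expression was evaluated, so the later assignment to ref is invisible to it.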
What would be the point of having them both operate the same way? The point of [=] is to support capture by copy instead of by reference.
Imagine if [=] wasn't available: if you know a runtime value at the point in the code where the lambda is defined and want the lambda to use it ever after, how could that value be made available to the lambda code? While DoSomething() is running, by-reference [&] access to its local ref variable might serve. But what if you want the lambda to outlive the local scope in DoSomething() that contains it, or want to change the value of ref without that affecting future calls to the lambda?
Conceptually, the language could forbid all these things (changes to ref, or calls of the lambda after ref has changed or gone out of scope), or the programmer could go to elaborate lengths to put the value of ref somewhere for the lambda to use (e.g. on the heap, with the need to manage deallocation, or in some static buffer, with re-entrance and thread-safety issues). To make this convenient, the language provides [=]: the compiler-generated closure effectively takes responsibility for storing and destroying the copy of ref.
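A small sketch of the "lambda outlives the scope" case, assuming a made-up factory called make_matcher (not part of the question). Because ref is captured by copy, the returned closure stays valid after the function has returned:

```cpp
#include <functional>

// Hypothetical factory: the returned closure owns its own copy of ref,
// so it does not depend on make_matcher's stack frame.
std::function<bool(int)> make_matcher(int ref)
{
    return [=](int candidate) { return candidate == ref; };
}
```

With [&] instead of [=], the same function would hand back a closure that refers to a parameter which no longer exists.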
Looking at various examples with lambda expressions, I came across behavior I did not expect. In this code, I expect that the variables captured by value will not change inside the lambda. Running the code shows the opposite. Can someone explain this? Thanks.
#include <iostream>
#include <functional>
using namespace std;
class Fuu{
public:
Fuu(int v):m_property(v){}
void setProperty(int v ){m_property = v;}
std::function<void(int)> lmbSet {[=](int v){ m_property = v;}};
std::function<void(char *)> lmbGet {[=](char * v){ cout<< v << " "<< m_property << endl;}};
std::function<void()> lmb4Static {[=]{ cout<<"static: "<< s_fild << endl;}};
void call() const {
lmbGet("test") ;
}
static int s_fild ;
int m_property;
};
int Fuu::s_fild = 10;
int main()
{
Fuu obj(3);
obj.call();
obj.setProperty(5);
obj.call();//expect 3
obj.lmbSet(12);
obj.call(); //expect 3
obj.lmb4Static();
Fuu::s_fild = 11;
obj.lmb4Static(); //expect 10
return 0;
}
That is a common problem. Common enough that the implicit capture of this in a lambda with = was deprecated in C++20. What your code is doing is:
std::function<void(int)> lmbSet {[=](int v){ m_property = v;}};
It is effectively doing this:
std::function<void(int)> lmbSet {[this](int v){ this->m_property = v;}};
The lambda captures the this pointer by copy, not m_property at all.
In general, and formally since C++20, the preferred way to do this is to capture this explicitly (in which case no = is necessary, though C++20 also accepts [=, this]):
std::function<void(int)> lmbSet {[this](int v){ m_property = v;}};
To make a copy of the current object and modify that copy, you'd need to capture *this instead (available since C++17; the closure then holds a copy of the object), and to modify that copy you'd need to mark your lambda mutable:
std::function<void(int)> lmbSet {[*this](int v) mutable { m_property = v;}};
It doesn't seem very useful, but perhaps that helps explain what's going on.
[Note: it's not valid to copy *this until the construction completes, or it's copying an object before its lifetime starts, and that will result in undefined behavior.]
Capture by value only takes copies of local variables (none in your example) and this pointer. Capturing this by value is equivalent to capturing non-static data members by reference. Static member variables, like globals, are not captured at all - you are accessing them directly instead.
If you actually need to keep a copy of non-local data, you can do it using "generalized capture" syntax which was added in C++14:
[m_property=m_property](char * v){ cout<< v << " "<< m_property << endl;}
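To make the difference concrete, here is a minimal sketch with a made-up type Gauge standing in for Fuu. One member function returns a closure that captures this, the other uses a C++14 init-capture to store a copy of the member (the capture name saved is my own choice):

```cpp
#include <functional>

struct Gauge {                 // hypothetical stand-in for Fuu
    int m_property = 3;

    // [this]: the closure stores the pointer and reads the live member.
    std::function<int()> live_reader() {
        return [this] { return m_property; };
    }

    // Init-capture (C++14): the closure stores its own copy of the member.
    std::function<int()> snapshot_reader() {
        return [saved = m_property] { return saved; };
    }
};
```

If m_property is changed after both closures have been created, only the one from live_reader sees the new value.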
In this case [=] effectively just copies this as a pointer. Then, when you access data members in your closure the copy of this is dereferenced, and if you modify that member it modifies the original member. If you think about it, it isn't possible that [=] copies each member. You would have an infinite recursion of copies as each lmbSet would make a new copy of lmbGet, which in turn would make a new copy of lmbSet.
Note that this solution requires that the rule of 3/5/0 be applied. If you try to make a copy of an instance of Fuu, the captured this pointer will be copied and point to the original instance and not to the new copy. In general, capturing this or pointers to data members in default member initializers should be avoided.
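The copy hazard can be reproduced with a few lines. This is only a sketch: Holder and read_through_copy are invented names, and the type relies on the implicitly generated copy constructor, which is exactly the situation being warned about.

```cpp
#include <functional>

struct Holder {                // hypothetical type
    explicit Holder(int v) : value(v) {}
    int value;
    // The closure stores the this pointer of the object being constructed.
    std::function<int()> get { [this] { return value; } };
};

// Copy a Holder, change the copy, then ask the copy's lambda for the value.
int read_through_copy()
{
    Holder original(1);
    Holder copy = original;    // copies the std::function and, with it, the captured pointer
    copy.value = 2;
    return copy.get();         // still reads original.value
}
```

The call returns 1, not 2, because the copied closure keeps pointing at original.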
If I create a lambda in a function and capture a variable to the lambda using std::move, when does the move happen? Is it when the lambda is created or when the lambda is executed?
Take the following code for example ... when do the various moves happen? Is it thread safe if myFunction is called on one thread and testLambda is executed on another thread?
class MyClass {
private:
// Only accessed on thread B
std::vector<int> myStuff;
// Called from thread A with new data
void myFunction(const std::vector<int>&& theirStuff) {
// Stored to be called on thread B
auto testLambda = [this, _theirStuff{ std::move(theirStuff) }]() {
myStuff = std::move(_theirStuff);
};
// ... store lambda
}
// Elsewhere on thread A
void someOtherFunction() {
std::vector<int> newStuff = { 1, 2, .... n };
gGlobalMyClass->myFunction(std::move(newStuff));
}
If I create a lambda in a function and capture a variable to the lambda using std::move, when does the move happen? Is it when the lambda is created or when the lambda is executed?
If you had written what I believe you intended to write, then the answer would be: both. Currently, the answer is: neither. You have a lambda capture _theirStuff { std::move(theirStuff) }. This basically declares a member of the closure type, which will be initialized when the closure object is created as if it were
auto _theirStuff { std::move(theirStuff) };
You also have
myStuff = std::move(_theirStuff);
in the lambda body.
However, your parameter theirStuff is actually an rvalue reference to a const std::vector<int>. Thus, _theirStuff { std::move(theirStuff) } is not actually going to perform a move, because a const std::vector cannot be moved from. Most likely, you wanted to write std::vector<int>&& theirStuff instead.
Furthermore, as pointed out by @JVApen in the comments below, your lambda is not mutable. Therefore, _theirStuff will actually be const inside the lambda body as well and, thus, also cannot be moved from. Consequently, your code above, despite all the std::move, will actually make a copy of the vector every time. If you had written
void myFunction(std::vector<int>&& theirStuff)
{
auto testLambda = [this, _theirStuff { std::move(theirStuff) }]() {
myStuff = std::move(_theirStuff);
};
}
You would be moving theirStuff into _theirStuff when the closure object is created. And you would be copying _theirStuff into myStuff when the lambda is called. If you had written
void myFunction(std::vector<int>&& theirStuff)
{
auto testLambda = [this, _theirStuff { std::move(theirStuff) }]() mutable {
myStuff = std::move(_theirStuff);
};
}
Then you would be moving theirStuff into _theirStuff when the closure object is created. And you would be moving _theirStuff into myStuff when the lambda is called. Note that, as a consequence, your lambda then cannot really be called twice. I mean, it can, but it will only really work once since _theirStuff will be empty after the first time the lambda is called…
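Here is a compilable sketch of that last variant. Receiver and makeTask are made-up names standing in for MyClass and myFunction, and the closure is returned rather than stored so the example stays short:

```cpp
#include <utility>
#include <vector>

struct Receiver {                        // hypothetical stand-in for MyClass
    std::vector<int> myStuff;

    auto makeTask(std::vector<int>&& theirStuff)
    {
        // Move 1 happens here, when the closure object is created.
        return [this, pending = std::move(theirStuff)]() mutable {
            // Move 2 happens here, when the closure is called.
            myStuff = std::move(pending);
        };
    }
};
```

Right after makeTask returns, the caller's vector has already been emptied while myStuff is still untouched; myStuff only receives the elements once the closure is invoked.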
Also, note that the above description is only valid for the particular combination of types in your example. There is no general definition of what it actually means to move an object; that is entirely up to the particular type of the object. It may not even mean anything.
std::move itself does not really do anything. All it does is cast the given expression to an rvalue reference. If you then initialize another object from the result of std::move, or assign the result to an object, overload resolution will pick a move constructor or move assignment operator (if one exists) instead of the normal copy constructor or copy assignment operator. It is then up to the implementation of the move constructor/move assignment operator of the respective type to actually perform a move, i.e., do whatever it is that's supposed to be done for the particular type in case of initialization or assignment from an rvalue. So, in a way, what you do when you apply std::move is that you advertise the respective object as "this may be moved from". Whether or not it actually will be moved from (and, if so, what that actually means) is up to the implementation.
In the particular case of std::vector, the move constructor guarantees not only that the contents of the original vector are taken over, but also that the original object is empty afterwards; move assignment with the default allocator behaves the same way in practice, although the standard only promises a valid but unspecified state there. In many other cases, it may be undefined behavior to do anything with an object that was moved from (except, maybe, destroy it; that one can be pretty much taken for granted as a type that doesn't at least allow that would be pretty much useless; typically, you will at least be able to assign a new value to an object that was moved from, but even that is not guaranteed in general). You always have to check for the particular type at hand what condition an object is guaranteed to be in after having been moved from…
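The "std::move is only a cast" point can be checked with std::vector directly. The two function names below are invented for this sketch; each returns {source size, target size} after constructing target from std::move(source):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// const source: std::move yields const std::vector<int>&&, which only the
// copy constructor accepts, so the source keeps its elements.
std::pair<std::size_t, std::size_t> sizes_after_const_move()
{
    const std::vector<int> source{1, 2, 3};
    std::vector<int> target = std::move(source);
    return { source.size(), target.size() };
}

// non-const source: the move constructor is chosen, and for std::vector
// it leaves the source empty.
std::pair<std::size_t, std::size_t> sizes_after_move()
{
    std::vector<int> source{1, 2, 3};
    std::vector<int> target = std::move(source);
    return { source.size(), target.size() };
}
```

Both functions contain the same std::move call; only the constness of the source decides whether a copy or a move takes place.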
How does one exploit structured binding and tuples to return objects local to a function?
In a function, I am making local objects that reference each other, and I want to return these objects in a tuple and use structured binding to identify them whenever I call the function. I currently have this:
std::tuple<Owner&&, State<Controller>&&, State<Ancillary>&&, State<Compressor>&&>
inline makeOwner() {
State<Controller>&& controller = State<Controller>();
State<Ancillary>&& ancillary = State<Ancillary>();
State<Compressor>&& compressor = State<Compressor>();
Owner&& owner = Owner(controller, ancillary, compressor);
return {owner, controller, ancillary, compressor};
}
// using the function later
const &&[owner, controller, ancillary, compressor] = makeOwner();
This does not work, and I get an error saying the return value isn't convertible to a tuple of the aforementioned return type. I'm not sure why this is the case, since the types match up to the declarations.
Ultimately, I'm trying to create a convenience function so I don't have to type the four lines in the function every time I want to make a new Owner. This is my attempt at using structured binding to make this easier.
EDIT:
I should note that I want the bindings in the last line to reference the objects inside of owner. So, copies are insufficient.
I want the bindings in the last line to reference the objects inside of owner.
Let's ignore all of the new language features and go back to basics. How do you expect this to work?
int&& f() { return 0; }
int&& r = f();
You want r to be a reference to the local variable inside of f? But that gets destroyed at the end of the execution of f(). This code compiles but r is a dangling reference.
The only way for this to be safe is to ensure that f() returns a reference to an object that definitely outlives the function. Maybe it's a local static, maybe it's global, maybe it's a member variable of the class that f is a member function of, etc:
int global = 0;
int&& f() { return std::move(global); }
int&& r = f(); // okay, r is a reference to global, no dangling
Or, if that doesn't make sense, then you need to return an object by value. You can still take a reference to it. Or not:
int f() { return 0; }
int&& r = f(); // okay, lifetime extension
int i = f(); // okay, prvalue elision
The same underlying principles apply once we add in all the complexities of tuple and structured bindings. Either return local, non-static objects by value, or return some other objects by reference. But do not return local, non-static objects by reference.
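As a minimal sketch of the "return by value" option with a tuple, assuming a made-up factory make_values and plain int/std::string in place of the question's types:

```cpp
#include <string>
#include <tuple>
#include <utility>

// Hypothetical factory: both objects are returned by value, so the caller
// owns them and nothing refers to a destroyed local.
std::tuple<int, std::string> make_values()
{
    int number = 42;
    std::string label = "answer";
    return std::make_tuple(number, std::move(label));
}
```

A caller can then write auto [number, label] = make_values(); and the bindings name elements of a tuple that the caller itself owns.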
Ultimately, I'm trying to create a convenience function so I don't have to type the four lines in the function every time I want to make a new Owner. This is my attempt at using structured binding to make this easier.
Why not just make a type?
struct X {
X() : owner(controller, ancillary, compressor) { }
X(X const&) = delete;
X& operator=(X const&) = delete;
State<Controller> controller;
State<Ancillary> ancillary;
State<Compressor> compressor;
Owner owner;
};
// lifetime extension on the X, no copies anywhere
// note that owner is last
auto&& [controller, ancillary, compressor, owner] = X();
// no lifetime extension, but also no copies because
// prvalue elision
auto [controller, ancillary, compressor, owner] = X();
inline auto makeOwner() {
struct bundle {
State<Controller> controller;
State<Ancillary> ancillary;
State<Compressor> compressor;
Owner owner = Owner(controller, ancillary, compressor);
bundle(bundle const&)=delete;
bundle& operator=(bundle const&)=delete;
};
return bundle{};
}
// using the function later
// the names bind in member declaration order, so owner comes last
const auto&& [controller, ancillary, compressor, owner] = makeOwner();
Here we use the fact that structs, even ones the caller cannot name, can be unbundled like tuples.
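Since State and Owner are not shown in the question, here is a self-contained sketch of the same idea with stand-in types. Part, Owner, Bundle and makeBundle are all hypothetical; the only assumption carried over is that the owner keeps references to its parts. It needs C++17 for guaranteed copy elision and structured bindings.

```cpp
struct Part { int id; };                 // hypothetical stand-in for State<T>

struct Owner {                           // hypothetical: keeps references to its parts
    Part& controller;
    Part& ancillary;
};

struct Bundle {
    Bundle() = default;
    Bundle(Bundle const&) = delete;      // a copy would still refer to the original's parts
    Bundle& operator=(Bundle const&) = delete;

    Part controller{1};
    Part ancillary{2};
    Owner owner{controller, ancillary};  // declared last, so the parts exist first
};

// Returned as a prvalue: with C++17 guaranteed elision no copy or move is made.
inline Bundle makeBundle() { return Bundle{}; }
```

Binding the result with auto&& [controller, ancillary, owner] = makeBundle(); gives names for the members of one object, and owner's references point at those very members.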
In the code below, I create a lambda that captures a local variable by reference. Note that it is a pointer, so, if C++ lambdas are true closures, it should survive the lifetime of the function that creates the lambda.
However, when I call it again, rather than creating a new local variable (a new environment) it reuses the same as before, and in fact, captures exactly the same pointer as before.
This seems wrong. Either C++ lambdas are not true closures, or my code is incorrect?
Thank you for any help
#include <iostream>
#include <functional>
#include <memory>
std::function<int()> create_counter()
{
std::shared_ptr<int> counter = std::make_shared<int>(0);
auto f = [&] () -> int { return ++(*counter); };
return f;
}
int main()
{
auto counter1 = create_counter();
auto counter2 = create_counter();
std::cout << counter1() << std::endl;
std::cout << counter1() << std::endl;
std::cout << counter2() << std::endl;
std::cout << counter2() << std::endl;
std::cout << counter1() << std::endl;
return 0;
}
This code returns:
1
2
3
4
5
But I was expecting it to return:
1
2
1
2
3
Further edit:
Thank you for pointing out the error in my original code. I see now that what is happening is that the pointer gets deleted after the invocation of create_counter, and the next call simply reuses the same memory address.
Which brings me to my real question then, what I want to do is this:
std::function<int()> create_counter()
{
int counter = 0;
auto f = [&] () -> int { return ++counter; };
return f;
}
If C++ lambdas were true closures, each local counter will coexist with the returned function (the function carries its environment--at least part of it). Instead, counter is destroyed after the invocation of create_counter, and calling the returned function creates a segmentation fault. That is not the expected behaviour of a closure.
Marco A has suggested a workaround: capture the pointer by copy. That increases the reference count, so the object does not get destroyed after create_counter. That feels like a kludge but, as Marco pointed out, it works and does exactly what I was expecting.
Jarod42 proposes to declare the variable, and initialize it as part of the capture list. But that defeats the purpose of the closure, as the variables are then local to the function, not to the environment where the function is created.
apple apple proposes using a static counter. But that is a workaround to avoid the destruction of the variable at the end of create_counter, and it means that all returned functions share the same variable, not the environment under which they run.
So I guess the conclusion (unless somebody can shed more light) is that lambdas in C++ are not true closures.
Thank you again for your comments.
The shared pointer is destroyed at the end of the function scope and the memory is freed: you're storing a dangling reference.
std::function<int()> create_counter()
{
std::shared_ptr<int> counter = std::make_shared<int>(0);
auto f = [&]() -> int { return ++(*counter); };
return f;
} // counter gets destroyed
Calling the lambda afterwards therefore invokes undefined behavior. Test it for yourself by substituting the integer with a class or struct and checking whether the destructor actually gets called.
Capturing by value would have incremented the use count of the shared pointer and prevented the problem:
auto f = [=]() -> int { return ++(*counter); }; // note the [=]: counter is captured by copy
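Put together, the fixed function might look like the sketch below (naming the capture explicitly as [counter] is my choice; [=] has the same effect here). Each call creates a fresh int, and the closure's own shared_ptr copy keeps it alive:

```cpp
#include <functional>
#include <memory>

std::function<int()> create_counter()
{
    auto counter = std::make_shared<int>(0);
    // The closure stores its own shared_ptr, so the int outlives this function.
    return [counter] { return ++(*counter); };
}
```

With this version, two counters created by separate calls count independently, which gives the 1 2 1 2 3 sequence the question was expecting.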
As mentioned, you have a dangling reference, as the local variable is destroyed at the end of the scope.
You can simplify your function to
std::function<int()> create_counter()
{
int counter = 0;
return [=] () mutable -> int { return ++counter; };
}
or even (in C++14)
auto create_counter()
{
return [counter = 0] () mutable -> int { return ++counter; };
}
Lambda expression: a lambda expression specifies an object defined inline, not just a function without a name, that is capable of capturing variables in scope.
Lambdas can frequently be passed around as objects.
In addition to its own function parameters, a lambda expression can refer to local variables in the scope of its definition.
Closures:
Closures are special functions that can capture the environment, i.e. variables within a lexical scope.
A closure is any function that closes over the environment in which it
was defined. This means that it can access variables that are not in its
parameter list.
What is the C++-specific part here?
A closure is a general concept in programming that originated from functional programming. When we talk about closures in C++, they always come with lambda expressions (some authors also count function objects).
In C++, a lambda expression is the syntax used to create a special temporary object that behaves similarly to how function objects behave.
The C++ standard specifically refers to this type of object as a
closure object. This is a little bit at odds with the broader
definition of a closure, which refers to any function, anonymous or
not, that captures variables from the environment they are defined in.
As far as the standard is concerned, all instantiations of lambda expressions are closure objects, even if they don’t have any captures in their capture group.
https://pranayaggarwal25.medium.com/lambdas-closures-c-d5f16211de9a
If you want 1 2 3 4 5, you can also try this:
std::function<int()> create_counter()
{
static int counter = 0;
auto f = [&] () -> int { return ++counter; };
return f;
}
If the variable is captured by value, then it is copy constructed from the original variable. If by reference, you can treat them as different references to the same object.
I just wrote a pretty big capture:
[this, &newIndex, &indexedDirs, &filters, &flags, &indexRecursion](){...
I use this lambda (indexRecursion) for a recursion with thousands of elements and asked myself whether it would be more efficient to use the "global" capture [&]. Since I have no clue how capture is implemented, I need some explanation, ideally with background.
Usually you can think of a lambda as equivalent to this:
class ANON {
int data;
public:
void operator ()(void) const {
cout << data << endl;
}
} lambda;
// auto lambda = [data]() {cout << data << endl;}
This should give you an idea of how capture is implemented. Capture all (be it by copy = or reference &) will probably be no more than syntactic sugar for specifying all used/available variables for capture in the current scope.
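For the by-reference case, a hand-written counterpart could look like the sketch below. The class name ByReferenceClosure is invented, and a real compiler is free to lay out its closure type differently; the point is only that each reference capture behaves like a reference member.

```cpp
// Hand-written counterpart of: auto lambda = [&data] { return ++data; };
class ByReferenceClosure {
    int& data;                                  // one reference member per captured variable
public:
    explicit ByReferenceClosure(int& d) : data(d) {}
    int operator()() const { return ++data; }   // const, yet the referenced int can change
};
```

Listing six names in the capture list, or writing [&] and using the same six variables, would both correspond to six such members, so there is no obvious reason for one spelling to be slower than the other.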
But since ...
[..] An implementation may define the closure type differently from what is described below provided this does not alter the observable behavior of the program other than by changing: [..] the size and/or alignment of the closure type [..]
[N4431 §5.1.2/3]
... it would be legal for an implementation to use some sort of "black magic" for capture all by reference lambdas and just use a pointer to the captured stack frame, rewriting accesses to the variables as accesses to some offset of that pointer:
class ANON {
void * stack_frame;
public:
void operator ()(void) const {
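// pointer arithmetic is done on char *, since it is not allowed on void *
cout << *reinterpret_cast<int *>(static_cast<char *>(stack_frame) + 8) << endl;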
}
} lambda;
So using & might (some day) be more efficient, but as already said this is up to the implementation and nothing to be relied upon.
Internally, lambdas are /usually/ implemented as ad-hoc classes whose single instance is constructed at the point of the lambda's definition and which expose a function call operator to be called later. So lambda performance should be compared with passing a method to a function using std::bind.
Captures aren't mystical entities either. If captures are reference captures, the entities they refer to are destroyed when they go out of scope as usual, so beware if your lambda isn't local: inside its body it may refer to objects that have already been destroyed.