Are C++ lambdas true closures? Capturing by reference - c++

In the code below, I create a lambda that captures a local variable by reference. Note that it is a pointer, so, if C++ lambdas are true closures, it should survive the lifetime of the function that creates the lambda.
However, when I call it again, rather than creating a new local variable (a new environment) it reuses the same as before, and in fact, captures exactly the same pointer as before.
This seems wrong. Either C++ lambdas are not true closures, or my code is incorrect. Which is it?
Thank you for any help
#include <iostream>
#include <functional>
#include <memory>
std::function<int()> create_counter()
{
std::shared_ptr<int> counter = std::make_shared<int>(0);
auto f = [&] () -> int { return ++(*counter); };
return f;
}
int main()
{
auto counter1 = create_counter();
auto counter2 = create_counter();
std::cout << counter1() << std::endl;
std::cout << counter1() << std::endl;
std::cout << counter2() << std::endl;
std::cout << counter2() << std::endl;
std::cout << counter1() << std::endl;
return 0;
}
This code returns:
1
2
3
4
5
But I was expecting it to return:
1
2
1
2
3
Further edit:
Thank you for pointing out the error in my original code. I see now that the pointer gets deleted after the invocation of create_counter, and the next call to create_counter simply reuses the same memory address.
Which brings me to my real question then, what I want to do is this:
std::function<int()> create_counter()
{
int counter = 0;
auto f = [&] () -> int { return ++counter; };
return f;
}
If C++ lambdas were true closures, each local counter will coexist with the returned function (the function carries its environment--at least part of it). Instead, counter is destroyed after the invocation of create_counter, and calling the returned function creates a segmentation fault. That is not the expected behaviour of a closure.
Marco A has suggested a workaround: capture the pointer by copy. That increases the reference count, so the pointee does not get destroyed after create_counter returns. That feels like a kludge but, as Marco pointed out, it works and does exactly what I was expecting.
Jarod42 proposes to declare the variable, and initialize it as part of the capture list. But that defeats the purpose of the closure, as the variables are then local to the function, not to the environment where the function is created.
apple apple proposes using a static counter. But that is a workaround to avoid the destruction of the variable at the end of create_counter, and it means that all returned functions share the same variable, not the environment under which they run.
So I guess the conclusion (unless somebody can shed more light) is that lambdas in C++ are not true closures.
thank you again for your comments.

The shared pointer is destroyed at the end of the function scope and its memory is freed, so you're storing a dangling reference:
std::function<int()> create_counter()
{
std::shared_ptr<int> counter = std::make_shared<int>(0);
auto f = [&]() -> int { return ++(*counter); };
return f;
} // counter gets destroyed
Invoking the returned lambda is therefore undefined behavior. Test it for yourself by substituting the integer with a class or struct and checking whether the destructor actually gets called.
Capturing by value copies the shared pointer into the closure, which increments its use count and prevents the problem:
auto f = [=]() -> int { return ++(*counter); };
^

As mentioned, you have dangling reference as the local variable is destroyed at end of the scope.
You can simplify your function to
std::function<int()> create_counter()
{
int counter = 0;
return [=] () mutable -> int { return ++counter; };
}
or even (in C++14)
auto create_counter()
{
return [counter = 0] () mutable -> int { return ++counter; };
}

Lambda expression: a lambda expression defines an object inline, not just a function without a name, and that object can capture variables in scope.
Lambdas can frequently be passed around as objects.
In addition to its own function parameters, a lambda expression can refer to local variables in the scope of its definition.
Closures:
Closures are special functions that can capture the environment, i.e. variables within a lexical scope.
A closure is any function that closes over the environment in which it
was defined. This means that it can access variables, not in its
parameter list.
What is the C++-specific part here?
A closure is a general concept in programming that originated in functional programming. In C++, closures always come from lambda expressions (some authors also count hand-written function objects).
In C++, a lambda expression is the syntax used to create a special temporary object that behaves much like a function object.
The C++ standard specifically refers to this type of object as a
closure object. This is a little bit at odds with the broader
definition of a closure, which refers to any function, anonymous or
not, that captures variables from the environment they are defined in.
As far as the standard is concerned, all instantiations of lambda expressions are closure objects, even if they don’t have any captures in their capture group.
https://pranayaggarwal25.medium.com/lambdas-closures-c-d5f16211de9a

if you want 1 2 3 4 5, you can also try this
std::function<int()> create_counter()
{
static int counter = 0;
auto f = [&] () -> int { return ++counter; };
return f;
}

If the variable is captured by value, then the captured copy is copy-constructed from the original variable. If it is captured by reference, you can treat the two as different references to the same object.

Related

Trying to return a function from a function with an argument function within it

I am curious if that's even possible to create a static function in another function and then return that static function with an argument function within it. So far what I've tried doesn't work at all, and when I use raw function pointers the code fails to compile.
#include <iostream>
#include <functional>
//both do not work but this one doesn't even compile
/*
void (*func(void (*foo)()))()
{
static void (*copy)();
copy = [&]() { foo(); };
return copy;
}
*/
std::function<void(void)> Foo(std::function<void(void)> func)
{
static std::function<void(void)> temp = [&]() { func(); };
return temp;
}
int main()
{
Foo([]() { std::cout << 123 << '\n'; });
}
The problem with your commented func is that a lambda which captures anything cannot convert to a pointer to function. Lambda captures provide data saved at initialization to be used when called, and a plain C++ function does not have any data other than its passed arguments. This capability is actually one of the big reasons std::function is helpful and we don't just use function pointers for what it does. std::function is more powerful than a function pointer.
Of course, func could instead just do copy = foo; or just return foo;, avoiding the lambda issue.
One problem with Foo is that the lambda captures function parameter func by reference, and then is called after Foo has returned and the lifetime of func has ended. A lambda which might be called after the scope of its captures is over should not capture them by reference, and std::function is an easy way to get into that bad situation. The lambda should use [=] or [func] instead.
Note that your static inside Foo is not like the static in front of an actual function declaration. It makes temp a function-static variable, which is initialized only the very first time and then kept. If Foo is called a second time with a different lambda, it will return the same thing it did the first time. If that's not what you want, just drop that static. Linkage of the functions/variables is not an issue here.
(It's a bit strange that Foo puts a std::function inside a lambda inside a std::function which all just directly call the next, instead of just using the original std::function. But I'll assume that's just because it's a simplified or learning-only example. If the lambda did something additional or different, this would be a fine way to do it.)
Your main ignores the function returned from Foo. Maybe you meant to call what Foo returns, with
int main()
{
Foo([]() { std::cout << 123 << '\n'; })();
}
It works, you simply never invoke the function:
#include <iostream>
#include <functional>
std::function<void(void)> Foo(std::function<void(void)> func)
{
static std::function<void(void)> temp = [&]() { func(); };
return temp;
}
int main()
{
// Notice final `()`
Foo([]() { std::cout << 123 << '\n'; })();
}
But you are capturing the function by reference, so you have to make sure the capturer doesn't outlive it.
Otherwise you could copy capture it.

Constructor parameter access member field of object under construction

I have the following structure that has a member function pointer. My question is, is there a way for the lambda that I pass to it to refer to its own member variable? i.e.
struct Foo
{
int aVar{1};
int (*funcPtr)(int);
Foo(int (*func)(int)) : funcPtr(func) {};
};
auto bar1 = Foo([](int a) {std::cout << a;}) // ok
auto bar2 = Foo([](int a) {std::cout << a + bar2.aVar;}) // Not ok but is there a way to access the member variable of the object currently being defined?
Can I achieve something to this effect?
What I would like to achieve here is a process to automatically generate objects based on the lambda you pass in. e.g: bar2 above is an object that can return anything plus its stored value. i.e. I would like bar2.funcPtr(5) == 5 + bar2.aVar to be an invariant of the class. In the future I might need another object that can return anything minus its stored value, and I only need to pass the corresponding lambda to do that (if the lambda can access the member fields), without defining a new class or method.
The lambda must have the signature int(int) but your lambda has the signature void(int) so that's the first problem.
The other is that the lambda must capture bar2. You could use std::function for that.
#include <iostream>
#include <functional>
struct Foo {
int aVar{1};
std::function<int(int)> funcPtr;
Foo(std::function<int(int)> func) : funcPtr(func) {};
int call(int x) { return funcPtr(x); }
};
int main() {
Foo bar2([&](int a) { return a + bar2.aVar; });
std::cout << bar2.call(2); // prints 3
}
A more practical solution would be to not tie the lambda to the instance for which it was originally created but to let it take a Foo& as an argument (and not to capture it). Moving it around doesn't become such a hassle with this approach.
Example:
struct Foo {
int aVar;
int(*funcPtr)(Foo&, int); // takes a Foo& as an argument
Foo(int x, int(*func)(Foo&, int)) : aVar(x), funcPtr(func) {};
int call(int x) { return funcPtr(*this, x); }
};
int main() {
Foo bar2(10, [](Foo& f, int a) { return a + f.aVar; });
std::cout << (bar2.call(5) == bar2.aVar + 5) << '\n'; // prints 1 (true)
Foo bar3(20, [](Foo& f, int a) { return a * f.aVar; });
std::cout << bar2.call(2) << '\n'; // 10 + 2 = 12
std::cout << bar3.call(2) << '\n'; // 20 * 2 = 40
std::swap(bar2, bar3);
std::cout << bar2.call(2) << '\n'; // swapped, now 40
std::cout << bar3.call(2) << '\n'; // swapped, now 12
}
Foo( /* ... */ )
As used in the shown code this is a constructor call. This is constructing a new object, right there.
Before the new object can be created and its constructor get invoked, the parameters to the constructor must be evaluated. This is how C++ works. There are no alternatives, or workarounds, that end up constructing the object first and only then evaluate its constructor's parameters afterwards.
For this reason it is logically impossible for a lambda that gets passed as a parameter to the constructor to "refer to its own member variable". There is nothing in existence that has "its own member variable" at this point. The object's construction has not begun, and you cannot refer to an object or a member of an object that does not exist. The object cannot be constructed until the constructor's parameters have been evaluated. C++ does not work this way.
You will need to come up with some alternative mechanism for your class. At which point you will discover another fatal design flaw that dooms the shown approach:
int (*funcPtr)(int);
This is a plain function pointer. In order for a lambda to reference an object that it's related to, in some form or fashion, it must capture the object (by reference, most likely). However, lambdas that capture (by value or by reference) cannot be converted to a plain function pointer. Only capture-less lambdas can be converted to a plain function pointer.
At the bare minimum this must be a std::function, instead.
And now that it's a std::function, you can capture its object, by reference, in the lambda, and assign it to the std::function.
But this is not all, there is another problem that you must deal with: in order for all of this to work it is no longer possible for the object to be moved or copied in any way. This is because the lambda captured a reference to the original object that was constructed, full stop.
And the fact that the constructed object gets copied or moved does not, in some form or fashion, modify the lambda so it now magically captures the reference to the copy or the moved instance of the original object.
None of these are insurmountable problems, but they will require quite a bit of work to address, in order to have a well-formed C++ program as a result.

What happens if you return a reference to a local variable through a lambda function object?

#include <iostream>
auto get_lambda()
{
int i = 5;
auto lambda = [&i]() { std::cout << i << '\n'; };
return lambda;
}
int main()
{
auto lambda = get_lambda();
lambda();
}
Inside the 'get_lambda' function I define the local variable 'i'.
The function then returns the lambda object that has one capture reference to that local variable.
Inside 'main', I call that lambda, and 'i' turns out to be uninitialized memory.
The variable 'i' is located on the stack of get_lambda. This stack is no longer valid when the function returns.
Why does this code even compile and what exactly happens to the variable 'i', is it still useable outside of get_lambda?
In your case, you are invoking undefined behaviour. The i name is local to get_lambda() function, and once i gets out of scope, it gets destroyed. So, with your lambda, you are now storing a reference to something that isn't there anymore. This is also known as a dangling reference. Capture the local variable by value instead:
auto lambda = [i]() { std::cout << i << '\n'; };
or:
auto lambda = [=]() { std::cout << i << '\n'; };
You are indeed allowed to capture locals by reference in the lambda's capture-list. Hence, no compiler error. Depending on the compiler, a warning might be issued.
i is only available in its scope, which is inside get_lambda(). Obviously it is not usable outside its scope, which is what you are attempting when calling lambda() later.
Just like using a dangling pointer, this is undefined behavior. Similarly, the compiler will let you "use" such a pointer, but it's your duty as a programmer to know that you are breaking a rule.

Why can't I change the value of a variable captured by copy in a lambda function?

I'm reading the C++ Programming Language by B. Stroustrup in its section 11.4.3.4 "mutable Lambdas", which says the following:
Usually, we don’t want to modify the state of the function object (the
closure), so by default we can’t. That is, the operator()() for the
generated function object (§11.4.1) is a const member function. In the
unlikely event that we want to modify the state (as opposed to
modifying the state of some variable captured by reference; §11.4.3),
we can declare the lambda mutable.
I don't understand why the default for the operator()() is const when the variable is captured by value. What's the rationale for this? What could go wrong when I change the value of a variable, which is copied into the function object?
One can think of lambdas as classes with operator()(), which by default is defined as const. That is, it cannot change the state of the object (the by-copy captures). Consequently, as long as nothing it refers to externally changes, the lambda behaves like a regular function and produces the same result every time it is called with the same arguments. If instead we declare the lambda as mutable, it can modify the internal state of the object and return a different result on different calls depending on that state. This is not very intuitive and therefore discouraged.
For example, with mutable lambda, this can happen:
#include <iostream>
int main()
{
int n = 0;
auto lam = [=]() mutable {
n += 1;
return n;
};
std::cout << lam() << "\n"; // Prints 1
std::cout << n << "\n"; // Prints 0
std::cout << lam() << "\n"; // Prints 2
std::cout << n << "\n"; // Prints 0
}
It is easier to reason about const data.
By defaulting to const, brief lambdas are easier to reason about. If you want mutability you can ask for it.
Many function objects in std are copied around; const objects that are copied have simpler state to track.

When is a variable reference in a C++11 lambda expression resolved?

I have a (hopefully) simple question about lambda expressions:
#include <cstdio>
#include <vector>
#include <algorithm>
//----------------------------------------------------------------
void DoSomething()
//----------------------------------------------------------------
{
std::vector<int> elements;
elements.push_back(1);
elements.push_back(2);
int ref = 1;
auto printhit = [=](int iSomeNumber)
{
if (ref == iSomeNumber)
{
printf("Hit: %d\n", iSomeNumber);
}
else
{
printf("No Hit: %d\n", iSomeNumber);
}
};
ref = 2;
std::for_each(elements.begin(), elements.end(), printhit);
}
Now, my question is: When I define printhit with capture [=], it prints "Hit: 1". If I pass it by reference [&], it prints "Hit: 2".
I somehow expected, that the substitution is done within for_each, so that "Hit: 2" is printed no matter how I grant access to "ref".
Can anyone explain this to me?
Thanks,
Markus
The capture happens at the location where you declare the lambda. Just like if you were to create a class object at that point and pass ref to its constructor.
Your example is equivalent to this:
class Functor
{
public:
Functor(int r) :ref(r) {}
void operator()(int iSomeNumber) const
{
if (ref == iSomeNumber)
{
printf("Hit: %d\n", iSomeNumber);
}
else
{
printf("No Hit: %d\n", iSomeNumber);
}
}
private:
int ref;
};
void DoSomething()
//----------------------------------------------------------------
{
std::vector<int> elements;
elements.push_back(1);
elements.push_back(2);
int ref = 1;
Functor printhit(ref);
ref = 2;
std::for_each(elements.begin(), elements.end(), printhit);
}
I believe the following parts of the C++ standard apply:
5.1.2.14:
An entity is captured by copy if it is implicitly captured and the capture-default is = or if it is explicitly
captured with a capture that does not include an &. For each entity captured by copy, an unnamed nonstatic
data member is declared in the closure type. The declaration order of these members is unspecified.
The type of such a data member is the type of the corresponding captured entity if the entity is not a
reference to an object, or the referenced type otherwise. [ Note: If the captured entity is a reference to a
function, the corresponding data member is also a reference to a function. —end note ]
5.1.2.21:
When the lambda-expression is evaluated, the entities that are captured by copy are used to direct-initialize
each corresponding non-static data member of the resulting closure object. (For array members, the array
elements are direct-initialized in increasing subscript order.) These initializations are performed in the
(unspecified) order in which the non-static data members are declared. [ Note: This ensures that the
destructions will occur in the reverse order of the constructions. —end note ]
What would be the point of having them both operate the same way? The point of [=] is to support capture by copy instead of by reference.
Imagine if [=] wasn't available: if you know a runtime value at the point in the code where the lambda's defined and want the lambda to use it ever after, how could that value be made available to the lambda code? While DoSomething() is running by-ref [&] access to its local ref variable might serve, but what if you want to have the lambda's lifetime outlive the local scope in DoSomething() that contains it, or want to change the value of ref without that affecting future calls to the lambda? Conceptually, you could have the language forbid all these things (use after ref is changed or changes to ref or calls of the lambda after ref is changed or out of scope), or the programmer could go to elaborate lengths to put the value of ref somewhere for the lambda to use (e.g. on the heap, with the need to manage deallocation, or in some static buffer with re-entrance and thread-safety issues), but to make it convenient the language provides [=]. The compiler-generated lambda effectively takes responsibility for storing and destructing/deallocating the copy of ref.