C++ nested lambda bug in VS2010 with lambda parameter capture? - c++

I'm using Visual Studio 2010, which apparently has some buggy behavior with lambdas, and I have this nested lambda, where the outer lambda returns a second lambda wrapped as a std::function (cf. "Higher-order Lambda Functions" on MSDN):
int x = 0;
auto lambda = [&]( int n )
{
return std::function<void()>(
[&] // Note capture
{
x = n;
}
);
};
lambda( -10 )(); // Call outer and inner lambdas
assert( -10 == x ); // Fails!
This compiles but fails at the assert. Specifically, n in the inner lambda is uninitialized (0xCCCCCCCC), but x is successfully modified to its value. If I change the inner lambda's capture clause to "[&,n]", the assert passes as expected. Is this a bug with VS2010 or have I not understood how lambda capture works?

It is not a bug: n goes out of scope when the outer lambda returns, so the reference captured by the inner lambda is already dangling by the time you call it.
int x = 0;
auto lambda = [&]( int n )
{
return std::function<void()>( // n is local to "lambda" and is destroyed when it returns, so the captured reference to n is invalid by the time the std::function is called
[&]
{
x = n; // Undefined behaviour
}
);
};
auto tmp = lambda(-10);
// n is no longer valid
tmp(); // calls tmp, which uses a reference to n, which is already destroyed
assert( -10 == x ); // Fails!

This is similar to the case of simply returning a reference to a local. What caught you out is that the compiler did not issue a warning. So it is not a bug in the compiler, just a missing warning.
std::function<int()> F(int n)
{
return [&]{ return n; }; //no warning
}
int& F2(int n)
{
return n; //warning
}
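The fix mentioned in the question (capturing n by value) can be written out as a minimal sketch. The names make_setter and set_and_read are hypothetical and not part of the original post; the point is only that x is captured by reference because it outlives the returned function, while n is copied into the closure.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper (not from the original post).
// x is captured by reference: the caller owns it and keeps it alive.
// n is captured by value: it is a parameter and dies when make_setter returns.
std::function<void()> make_setter(int& x, int n)
{
    return [&x, n] { x = n; };
}

// Builds the setter, calls it while x is still alive, and returns x.
int set_and_read(int n)
{
    int x = 0;
    auto setter = make_setter(x, n);  // n is copied into the closure here
    setter();                         // safe: x is alive, n is a copy
    assert(x == n);
    return x;
}
```

Calling set_and_read(-10) leaves x equal to -10, which is what the original assert expected.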

Related

When is it safe to capture a lambda inside another lambda by reference?

Suppose you have the following program:
static std::function<int(int)> pack_a_lambda( std::function<int(int)> to_be_packed ) {
return [=]( int value ) {
return to_be_packed( value * 4 );
};
}
int main() {
auto f = pack_a_lambda( []( int value ) {
return value * 2;
} );
int result = f( 2 );
std::cout << result << std::endl; // should print 16
return 0;
}
I haven't tried the exact code above, because I tested it in Google Test and then slightly edited it into the form above. So, the function pack_a_lambda takes a lambda by value as input. Here, I believe the temporary lambda is copied. Then, when we create the new lambda, we again capture the copied lambda to_be_packed by value. It works, and it seems to me it should be safe.
Now suppose we capture that lambda by reference instead:
static std::function<int(int)> pack_a_lambda( std::function<int(int)> to_be_packed ) {
return [&]( int value ) {
return to_be_packed( value * 4 );
};
}
In my specific use case, the resulting lambda executes four times faster. In the simplified example above I couldn't reproduce this difference, though. In fact, here it seems that capturing the lambda by reference makes it ever-so-slightly slower. So there is clearly some performance difference.
But is it safe? The argument to_be_packed is copied, but it's still a temporary, right? That should make it unsafe, but I'm not sure. My UB sanitizer and AddressSanitizer do not complain, but I concede that doesn't prove anything. If I pass to_be_packed by reference...
static std::function<int(int)> pack_a_lambda( const std::function<int(int)> &to_be_packed ) {
return [&]( int value ) {
return to_be_packed( value * 4 );
};
}
...the AddressSanitizer complains, which is not surprising, because the lambda I pass into the function is also a temporary. So that leaves example two: Is it safe or not, and what are possible reasons it might be faster to execute in some cases?
static std::function<int(int)> pack_a_lambda( std::function<int(int)> to_be_packed ) {
return [&]( int value ) {
return to_be_packed( value * 4 );
};
}
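is undefined behavior, because the returned lambda holds a reference to a local variable (the parameter to_be_packed), which is destroyed when the function returns.
Capturing by value is the safe way here.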
static std::function<int(int)> pack_a_lambda(const std::function<int(int)>& to_be_packed ) {
return [&]( int value ) {
return to_be_packed( value * 4 );
};
}
might be correct, but you have to ensure that the lifetime of the passed argument is longer than that of the returned std::function.
auto func = std::function([]( int value ) {
return value * 2;
});
auto f = pack_a_lambda(func); // OK
// auto f2 = pack_a_lambda([](int){ return 42;}); // KO: temporary std::function created
Because a temporary can bind to a const reference, it is safer in that case to delete the rvalue overload:
static std::function<int(int)> pack_a_lambda(std::function<int(int)>&&) = delete;
When is it safe to capture a lambda inside another lambda by reference?
Same as with any captured object: it is safe when the lifetime of the captured object is longer than the capturing lambda.
In your example, you capture a function argument. Its lifetime ends when the function returns. But you return the capturing lambda to the outside of the function. There, the captured reference will be invalid.
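If the concern with the by-value version is the extra copy of the std::function, one option is to move the parameter into the closure. The following is only a sketch, assuming C++14 (for the init-capture); the capture name packed and the helper packed_result are hypothetical. It is not claimed to reproduce the speed-up reported in the question.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Same signature as in the question, but the parameter is moved into the
// closure, so the returned function owns its callee and nothing dangles.
static std::function<int(int)> pack_a_lambda(std::function<int(int)> to_be_packed)
{
    return [packed = std::move(to_be_packed)](int value) {
        return packed(value * 4);
    };
}

// Hypothetical driver mirroring main() from the question.
int packed_result(int input)
{
    auto f = pack_a_lambda([](int value) { return value * 2; });
    return f(input);  // computes (input * 4) * 2
}
```

With an input of 2 this yields 16, the value the question's main() expects to print.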

Why is this recursive lambda function unsafe?

This question comes from Can lambda functions be recursive? . The accepted answer says the recursive lambda function shown below works.
std::function<int (int)> factorial = [&] (int i)
{
return (i == 1) ? 1 : i * factorial(i - 1);
};
However, it is pointed out by a comment that
such a function cannot be returned safely
, and the reason is supplied in this comment:
returning it destroys the local variable, and the function has a reference to that local variable.
I don't understand the reason. As far as I know, capturing variables is equivalent to retaining them as data members (by-value or by-reference according to the capture list). So what is "local variable" in this context? Also, the code below compiles and works correctly even with -Wall -Wextra -std=c++11 option on g++ 7.4.0.
#include <iostream>
#include <functional>
int main() {
std::function<int (int)> factorial = [&factorial] (int i)
{
return (i == 1) ? 1 : i * factorial(i - 1);
};
std::cout << factorial(5) << "\n";
}
Why is the function unsafe? Is this problem limited to this function, or lambda expression as a whole?
This is because, in order to be recursive, it uses type erasure and captures the type-erased container by reference.
This allows the lambda to be used inside itself, by referring to it indirectly through the std::function.
However, for it to work, it must capture the std::function by reference, and that object has automatic storage duration.
Your lambda contains a reference to a local std::function. Even if you return the std::function by copy, the lambda will still refer to the old one, which has died.
To make a recursive lambda that is safe to return, you can pass the lambda to itself through an auto parameter and wrap that in another lambda:
auto factorial = [](auto self, int i) -> int {
return (i == 1) ? 1 : i * self(self, i - 1);
};
return [factorial](int i) { return factorial(factorial, i); };
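Since the fragment above ends in a bare return statement, here is a sketch of how it might sit inside a complete function. The factory name make_factorial is hypothetical, C++14 is assumed for the generic lambda, and the base case is written as i <= 1 (rather than i == 1) so that an argument of 0 also terminates.

```cpp
#include <cassert>
#include <functional>

// Hypothetical factory wrapping the technique from the answer.
std::function<int(int)> make_factorial()
{
    // The lambda receives itself as `self`, so it captures nothing.
    auto factorial = [](auto self, int i) -> int {
        return (i <= 1) ? 1 : i * self(self, i - 1);
    };
    // The returned closure owns a copy of `factorial`; no local is referenced.
    return [factorial](int i) { return factorial(factorial, i); };
}
```

The returned std::function can be stored and called after make_factorial has returned, because it holds no reference to anything inside that function.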

Lambda: A by-reference capture that could dangle

Scott Meyers, in Effective Modern C++, says in the chapter on lambdas:
Consider the following code:
void addDivisorFilter()
{
auto calc1 = computeSomeValue1();
auto calc2 = computeSomeValue2();
auto divisor = computeDivisor(calc1, calc2);
filters.emplace_back(
[&](int value) { return value % divisor == 0; }
);
}
This code is a problem waiting to happen. The lambda refers to the local variable divisor, but that variable ceases to exist when addDivisorFilter returns. That's immediately after filters.emplace_back returns, so the function that's added to filters is essentially dead on arrival. Using that filter yields undefined behaviour from virtually the moment it's created.
The question is: why is this undefined behaviour? From what I understand, filters.emplace_back only returns after the lambda expression is complete, and during its execution divisor is valid.
Update
An important piece of information that I forgot to include is:
using FilterContainer = std::vector<std::function<bool(int)>>;
FilterContainer filters;
That's because the vector filters outlives the function call. When the function exits, the vector filters still exists, and the captured reference to divisor is now dangling.
From what I understand, filters.emplace_back only returns after the lambda expression is complete, and during its execution divisor is valid.
That's not true. The vector stores the closure object created by the lambda expression and does not "execute" it; you execute the lambda after the function exits. Technically, the lambda expression creates an object of a closure type (a class with a compiler-generated name) that holds a reference internally, like
#include <vector>
#include <functional>
struct _AnonymousClosure
{
int& _divisor; // this is what the lambda captures
bool operator()(int value) const { return value % _divisor == 0; } // a lambda's call operator is const by default
};
int main()
{
std::vector<std::function<bool(int)>> filters;
// local scope
{
int divisor = 42;
filters.emplace_back(_AnonymousClosure{divisor});
}
// UB here when using filters, as the reference to divisor dangles
}
You are not evaluating the lambda function while addDivisorFilter is active. You are simply adding "the function" to the collection, not knowing when it might be evaluated (possibly long after addDivisorFilter returned).
In addition to @vsoftco's answer, the following modified example code lets you experience the problem:
#include <iostream>
#include <functional>
#include <vector>
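void addDivisorFilter(std::vector<std::function<bool(int)>>& filters)
{
int divisor = 5;
filters.emplace_back(
[&](int value) { return value % divisor == 0; } // divisor is captured by reference
);
} // divisor is destroyed here; the stored lambda now holds a dangling reference
int main()
{
std::vector<std::function<bool(int)>> filters;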
addDivisorFilter(filters);
std::cout << std::boolalpha << filters[0](10) << std::endl;
return 0;
}
When run, this example crashed with a floating point exception, since the reference to divisor is no longer valid when the lambda is evaluated in main. Because the behaviour is undefined, other outcomes are possible.
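One common remedy is to copy divisor into the closure instead of referring to it. The sketch below assumes the FilterContainer alias from the question's update; passing divisor in as a parameter and the helper is_multiple_of are illustrative additions, not part of the book's code.

```cpp
#include <cassert>
#include <functional>
#include <vector>

using FilterContainer = std::vector<std::function<bool(int)>>;

// divisor is captured by value, so the stored filter carries its own copy
// and stays valid after addDivisorFilter returns.
void addDivisorFilter(FilterContainer& filters, int divisor)
{
    filters.emplace_back(
        [divisor](int value) { return value % divisor == 0; }
    );
}

// Hypothetical driver: the filter is used after addDivisorFilter has returned.
bool is_multiple_of(int divisor, int value)
{
    FilterContainer filters;
    addDivisorFilter(filters, divisor);
    return filters[0](value);
}
```

Here the filter is evaluated only after addDivisorFilter has finished, which is exactly the situation that was undefined with the by-reference capture.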

C++ lambda function access write violation

I'm learning how to use C++ lambda functions along with <functional>'s function class. I am trying to solve this Code Golf as practice (challenge is Curry for Dinner)
I have this function:
// This creates a function that runs y a number of
// times equal to x's return value.
function<void()> Curry(function<int()> x, function<void()> y)
{
return [&]() {
for (int i = 0; i < x(); i++)
{
y();
}
};
}
To test this I have this code in my main():
auto x = [](){ return 8; };
auto y = [](){ cout << "test "; };
auto g = Curry(x, y);
This throws Access violation reading location 0xCCCCCCCC. in Functional.h.
Yet when I copy-paste the lambda function from inside Curry() to inside my main like this:
auto x = [](){ return 8; };
auto y = [](){ cout << "test "; };
auto g = [&]() {
for (int i = 0; i < x(); i++)
{
y();
}
};
I get the code running as expected. Why does this happen?
You have a few problems.
Here:
return [&]() {
you capture by reference. Every variable you capture by reference must outlive the lambda: running the lambda after the lifetime of a captured variable that it uses has ended is undefined behavior. Since you return this lambda while capturing local state (the parameters x and y), that is what happens here. (Note I said variables: due to a quirk in the standard, [&] captures variables, not the data referred to by variables, so even capturing reference-typed function arguments by [&] is not safe. This may change in future revisions of the standard... This particular set of rules allows neat optimizations in lambda implementations (reducing [&] lambdas to one pointer's worth of state(!)), but it also introduces the only case in C++ where you in effect have a reference to a reference variable...)
Change it to
return [=]() {
and capture by-value.
Or even:
return [x,y]() {
to list your captures explicitly.
When using a lambda which does not outlive the current scope, I use [&]. Otherwise, I capture by value explicitly the stuff I am going to use, as lifetime is important in that case.
Next:
for (int i = 0; i < x(); i++)
you run x once for every loop iteration. Seems silly!
Instead:
auto max = x();
for (auto i = max; i > 0; --i)
which runs max times, and as it happens works if the return value of x was changed to unsigned int or whatever.
Or:
int max = x();
for (int i = 0; i < max; ++i)
which both runs x once, and behaves better if x returns -1.
Alternatively you can use the obscure operator -->:
int count = x();
while( count --> 0 )
if you want to make your code unreadable. ;)
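Putting the two suggestions together (capture x and y by value, and call x only once) gives roughly the following. This is a sketch rather than the answerer's exact code; the helper count_calls is hypothetical and exists only to exercise Curry.

```cpp
#include <cassert>
#include <functional>

// x and y are copied into the closure, so the returned function stays valid
// after Curry returns. x is evaluated once per invocation of the result.
std::function<void()> Curry(std::function<int()> x, std::function<void()> y)
{
    return [x, y]() {
        const int max = x();
        for (int i = 0; i < max; ++i)
        {
            y();
        }
    };
}

// Hypothetical driver: counts how often y runs for a given return value of x.
int count_calls(int times)
{
    int calls = 0;
    auto g = Curry([times] { return times; }, [&calls] { ++calls; });
    g();  // calls is still alive here, so capturing it by reference is fine
    return calls;
}
```

Note that the counting lambda inside count_calls uses a by-reference capture safely, because the lambda does not outlive the scope of calls.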

How to make an array of Lambda expressions [duplicate]

I was trying to create a vector of lambda, but failed:
auto ignore = [&]() { return 10; }; //1
std::vector<decltype(ignore)> v; //2
v.push_back([&]() { return 100; }); //3
Up to line #2, it compiles fine, but line #3 gives a compilation error:
error: no matching function for call to 'std::vector<main()::<lambda()>>::push_back(main()::<lambda()>)'
I don't want a vector of function pointers or vector of function objects. However, vector of function objects which encapsulate real lambda expressions, would work for me. Is this possible?
Every lambda has a different type—even if they have the same signature. You must use a run-time encapsulating container such as std::function if you want to do something like that.
e.g.:
std::vector<std::function<int()>> functors;
functors.push_back([&] { return 100; });
functors.push_back([&] { return 10; });
All lambda expressions have a different type, even if they are identical character-by-character. You're pushing a lambda of a different type (because it's another expression) into the vector, and that obviously won't work.
One solution is to make a vector of std::function<int()> instead.
auto ignore = [&]() { return 10; };
std::vector<std::function<int()>> v;
v.push_back(ignore);
v.push_back([&]() { return 100; });
On another note, it's not a good idea to use [&] when you're not capturing anything.
While what others have said is relevant, it is still possible to declare and use a vector of lambda, although it's not very useful:
auto lambda = [] { return 10; };
std::vector<decltype(lambda)> vec;
vec.push_back(lambda);
So, you can store any number of lambdas in there, so long as it's a copy/move of lambda!
If your lambda is stateless, i.e., [](...){...}, C++11 allows it to convert implicitly to a function pointer. In theory, a C++11-compliant compiler would be able to compile this:
auto ignore = []() { return 10; }; //1 note missing & in []!
std::vector<int (*)()> v; //2
v.push_back([]() { return 100; }); //3
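A slightly fuller sketch of the function-pointer approach, showing that both a named and an unnamed captureless lambda can go into the same vector and be called through it. The function name sum_of_stateless_lambdas is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Captureless lambdas convert implicitly to int (*)(), so lambdas of
// different closure types can share one vector of function pointers.
int sum_of_stateless_lambdas()
{
    std::vector<int (*)()> v;
    auto ignore = [] { return 10; };
    v.push_back(ignore);               // named lambda, converted to a pointer
    v.push_back([] { return 100; });   // temporary lambda, converted as well
    int total = 0;
    for (auto fn : v)
    {
        total += fn();
    }
    return total;
}
```

This stops working as soon as a lambda captures anything, because a capturing lambda has no conversion to a function pointer.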
You could use a lambda generating function (updated with fix suggested by Nawaz):
#include <vector>
#include <iostream>
int main() {
auto lambda_gen = [] (int i) {return [i](int x){ return i*x;};} ;
using my_lambda = decltype(lambda_gen(1));
std::vector<my_lambda> vec;
for(int i = 0; i < 10; i++) vec.push_back(lambda_gen(i));
int i = 0;
for (auto& lambda : vec){
std::cout << lambda(i) << std::endl;
i++;
}
}
But I think you have basically made your own class at this point. Otherwise, if the lambdas have completely different captures/arguments etc., you probably have to use a tuple.
Each lambda is a different type. You must use std::tuple instead of std::vector.
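For completeness, a minimal sketch of the tuple approach. Unlike a vector, the number and types of the elements are fixed at compile time, and each lambda is reached by index with std::get. The function name call_both and the captured value offset are illustrative.

```cpp
#include <cassert>
#include <tuple>

// Each tuple element keeps its own closure type, so lambdas with different
// captures can be stored side by side without std::function.
int call_both()
{
    int offset = 1;
    auto t = std::make_tuple(
        [] { return 10; },
        [offset] { return 100 + offset; }  // captures by value
    );
    return std::get<0>(t)() + std::get<1>(t)();
}
```

Whether this is preferable to a std::vector of std::function depends on whether the set of lambdas is known at compile time.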