Lambda: A by-reference capture that could dangle - c++

Scott Meyers, in the lambda chapter of Effective Modern C++, writes:
Consider the following code:
void addDivisorFilter()
{
auto calc1 = computeSomeValue1();
auto calc2 = computeSomeValue2();
auto divisor = computeDivisor(calc1, calc2);
filters.emplace_back(
[&](int value) { return value % divisor == 0; }
);
}
This code is a problem waiting to happen. The lambda refers to the local variable divisor, but that variable ceases to exist when addDivisorFilter returns. That's immediately after filters.emplace_back returns, so the function that's added to filters is essentially dead on arrival. Using that filter yields undefined behaviour from virtually the moment it's created.
The question is: why is this undefined behaviour? From what I understand, filters.emplace_back only returns after the lambda expression is complete, and, during its execution, divisor is valid.
Update
An important detail that I forgot to include is:
using FilterContainer = std::vector<std::function<bool(int)>>;
FilterContainer filters;

That's because the scope of the vector filters outlives the one of the function. At function exit, the vector filters still exists, and the captured reference to divisor is now dangling.
From what I understand, filters.emplace_back only returns after the lambda expression is complete, and, during its execution, divisor is valid.
That's not true. The vector stores the closure object created from the lambda expression and does not "execute" it; you execute the lambda after the function exits. Technically, the lambda expression produces a closure (an object of a class with a compiler-generated name) that holds a reference internally, like
#include <vector>
#include <functional>
struct _AnonymousClosure
{
int& _divisor; // this is what the lambda captures
bool operator()(int value) { return value % _divisor == 0; }
};
int main()
{
std::vector<std::function<bool(int)>> filters;
// local scope
{
int divisor = 42;
filters.emplace_back(_AnonymousClosure{divisor});
}
// UB here when using filters, as the reference to divisor dangles
}

You are not evaluating the lambda function while addDivisorFilter is active. You are simply adding "the function" to the collection, not knowing when it might be evaluated (possibly long after addDivisorFilter returned).

In addition to @vsoftco's answer, the following modified example code lets you experience the problem:
#include <iostream>
#include <functional>
#include <vector>
void addDivisorFilter(std::vector<std::function<bool(int)>>& filters)
{
    int divisor = 5;
    filters.emplace_back(
        // captures the local divisor by reference; it dangles after return
        [&](int value) { return value % divisor == 0; }
    );
}
int main()
{
    std::vector<std::function<bool(int)>> filters;
    addDivisorFilter(filters);
    // undefined behaviour: the closure reads the destroyed divisor
    std::cout << std::boolalpha << filters[0](10) << std::endl;
    return 0;
}
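When I ran this example, it terminated with a floating point exception at runtime, because the reference to divisor is no longer valid when the lambda is evaluated in main. Since the behaviour is undefined, other compilers or settings may produce a different result.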
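For contrast, here is a minimal sketch of the usual remedy: capture divisor by value, so the closure carries its own copy. The function name addDivisorFilterByValue and the divisor parameter are hypothetical, introduced only to keep the sketch self-contained; FilterContainer is the alias from the question.

```cpp
#include <functional>
#include <vector>

using FilterContainer = std::vector<std::function<bool(int)>>;

// Hypothetical variant of addDivisorFilter: the container and the divisor
// are passed in so that the sketch does not rely on globals.
void addDivisorFilterByValue(FilterContainer& filters, int divisor)
{
    filters.emplace_back(
        // [divisor] copies the value into the closure, so the closure no
        // longer depends on the lifetime of the local variable.
        [divisor](int value) { return value % divisor == 0; }
    );
}
```

A filter added this way can be called safely after addDivisorFilterByValue has returned, because nothing in the closure refers to the function's stack frame.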

Related

Why is this recursive lambda function unsafe?

This question comes from "Can lambda functions be recursive?". The accepted answer says that the recursive lambda function shown below works.
std::function<int (int)> factorial = [&] (int i)
{
return (i == 1) ? 1 : i * factorial(i - 1);
};
However, a comment points out that "such a function cannot be returned safely", and the reason is supplied in this comment: "returning it destroys the local variable, and the function has a reference to that local variable."
I don't understand the reason. As far as I know, capturing variables is equivalent to retaining them as data members (by value or by reference, according to the capture list). So what is the "local variable" in this context? Also, the code below compiles and works correctly even with the -Wall -Wextra -std=c++11 options on g++ 7.4.0.
#include <iostream>
#include <functional>
int main() {
std::function<int (int)> factorial = [&factorial] (int i)
{
return (i == 1) ? 1 : i * factorial(i - 1);
};
std::cout << factorial(5) << "\n";
}
Why is the function unsafe? Is this problem limited to this function, or lambda expression as a whole?
This is because, in order to be recursive, the lambda uses type erasure and captures the type-erased container by reference.
This allows the lambda to call itself by referring to itself indirectly through the std::function.
However, for this to work, it must capture the std::function by reference, and that object has automatic storage duration.
Your lambda contains a reference to a local std::function. Even if you return the std::function by copy, the lambda inside the copy will still refer to the old one, which has been destroyed.
To make a recursive lambda that is safe to return, you can pass the lambda to itself as an auto parameter (a C++14 generic lambda) and wrap that in another lambda:
auto factorial = [](auto self, int i) -> int {
return (i == 1) ? 1 : i * self(self, i - 1);
};
return [factorial](int i) { return factorial(factorial, i); };
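The fragment above can be placed inside a function to see that the returned closure is self-contained. This is a sketch only: the factory name makeFactorial is hypothetical, it requires C++14, and the base case is written as i <= 1 (an assumption of this sketch) so that an argument of 0 also terminates.

```cpp
// Hypothetical factory: returns a recursive factorial closure by value.
inline auto makeFactorial()
{
    // The generic lambda receives itself as `self`, so it captures nothing
    // and holds no reference to any local variable.
    auto factorial = [](auto self, int i) -> int {
        return (i <= 1) ? 1 : i * self(self, i - 1);
    };
    // The wrapper owns a copy of `factorial`; nothing local escapes by reference.
    return [factorial](int i) { return factorial(factorial, i); };
}
```

Because both closures are copied by value, the result of makeFactorial() stays valid after the function has returned.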

visual studio implementation of "move semantics" and "rvalue reference"

I came across a Youtube video on c++11 concurrency (part 3) and the following code, which compiles and generates correct result in the video.
However, I got a compile error of this code using Visual Studio 2012. The compiler complains about the argument type of toSin(list<double>&&). If I change the argument type to list<double>&, the code compiled.
My question is: what is returned from move(list) in _tmain()? Is it an rvalue reference or just a reference?
#include "stdafx.h"
#include <iostream>
#include <thread>
#include <chrono>
#include <list>
#include <algorithm>
using namespace std;
void toSin(list<double>&& list)
{
//this_thread::sleep_for(chrono::seconds(1));
for_each(list.begin(), list.end(), [](double & x)
{
x = sin(x);
});
for_each(list.begin(), list.end(), [](double & x)
{
int count = static_cast<int>(10*x+10.5);
for (int i=0; i<count; ++i)
{
cout.put('*');
}
cout << endl;
});
}
int _tmain(int argc, _TCHAR* argv[])
{
list<double> list;
const double pi = 3.1415926;
const double epsilon = 0.00000001;
for (double x = 0.0; x<2*pi+epsilon; x+=pi/16)
{
list.push_back(x);
}
thread th(&toSin, /*std::ref(list)*/std::move(list));
th.join();
return 0;
}
This appears to be a bug in MSVC2012 (and, on quick inspection, in MSVC2013 and MSVC2015).
thread does not use perfect forwarding directly, as storing a reference to data (temporary or not) in the originating thread and using it in the spawned thread would be extremely error prone and dangerous.
Instead, it copies each argument into internal storage whose type is the decayed argument type (decay_t of the argument type).
The bug is that when it calls the worker function, it simply passes that internal copy to your procedure. Instead, it should move that internal data into the call.
This does not seem to be fixed in compiler version 19, which I think is MSVC2015 (did not double check), based off compiling your code over here
This is both due to the wording of the standard (it is supposed to invoke a decay_t<F> with decay_t<Ts>... -- which means rvalue binding, not lvalue binding), and because the local data stored in the thread will never be used again after the invocation of your procedure (so logically it should be treated as expiring data, not persistent data).
Here is a work around:
template<class F>
struct thread_rvalue_fix_wrapper {
F f;
template<class...Args>
auto operator()(Args&...args)
-> typename std::result_of<F(Args...)>::type
{
return std::move(f)( std::move(args)... );
}
};
template<class F>
thread_rvalue_fix_wrapper< typename std::decay<F>::type >
thread_rvalue_fix( F&& f ) { return {std::forward<F>(f)}; }
then
thread th(thread_rvalue_fix(&toSin), /*std::ref(list)*/std::move(list));
should work. (tested in MSVC2015 online compiler linked above) Based off personal experience, it should also work in MSVC2013. I don't know about MSVC2012.
What is returned from std::move is indeed an rvalue reference, but that doesn't matter because the thread constructor does not use perfect forwarding for its arguments. First it copies/moves them to storage owned by the new thread. Then, inside the new thread, the supplied function is called using the copies.
Since the copies are not temporary objects, this step won't bind to rvalue-reference parameters.
What the Standard says (30.3.1.2):
The new thread of execution executes
INVOKE( DECAY_COPY(std::forward<F>(f)), DECAY_COPY(std::forward<Args>(args))... )
with the calls to DECAY_COPY being evaluated in the constructing thread.
and
In several places in this Clause the operation DECAY_COPY(x) is used. All such uses mean call the function decay_copy(x) and use the result, where decay_copy is defined as follows:
template <class T> decay_t<T> decay_copy(T&& v)
{ return std::forward<T>(v); }
The value category is lost.
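The quoted definition can be tried out directly. The sketch below is only an illustration: the namespace sketch and the functions consume and callThroughDecayCopy are hypothetical names, and consume stands in for a function such as toSin that accepts only rvalues. It shows that decay_copy always produces a fresh object of the decayed type, whatever the value category of its argument, and that its by-value result can bind to an rvalue-reference parameter.

```cpp
#include <cstddef>
#include <list>
#include <type_traits>
#include <utility>

namespace sketch {

// Same shape as the exposition-only helper quoted from the standard.
template <class T>
typename std::decay<T>::type decay_copy(T&& v)
{
    return std::forward<T>(v);
}

// Hypothetical stand-in for toSin: accepts only rvalues.
inline std::size_t consume(std::list<double>&& values)
{
    return values.size();
}

// `source` is an lvalue, but decay_copy returns by value, so the call
// expression is a prvalue and binds to consume's rvalue-reference parameter.
inline std::size_t callThroughDecayCopy(std::list<double>& source)
{
    return consume(decay_copy(source));
}

// The result type is a plain object type, never a reference.
static_assert(
    !std::is_reference<
        decltype(decay_copy(std::declval<std::list<double>&>()))>::value,
    "decay_copy yields an object, not a reference");

} // namespace sketch
```

The original list is left untouched, since consume only ever sees the copy.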

What is this C++14 construct called which seems to chain lambdas?

This is a follow-up question on this one: Lambda-Over-Lambda in C++14, where the answers explain the code.
It is about a lambda that creates another lambda which, when called, calls the lambda passed to it and passes the return value back to the original lambda, thus returning a new instance of the second lambda.
The example shows how this way lambdas can be chained.
Copy from the original question:
#include <cstdio>
auto terminal = [](auto term) // <---------+
{ // |
return [=] (auto func) // | ???
{ // |
return terminal(func(term)); // >---------+
};
};
auto main() -> int
{
auto hello =[](auto s){ fprintf(s,"Hello\n"); return s; };
auto world =[](auto s){ fprintf(s,"World\n"); return s; };
terminal(stdout)
(hello)
(world) ;
return 0;
}
Is there already a name for this construct and if not what should it be called?
Does it resemble constructs in other languages?
Remark: I'm not interested in whether it is actually useful.
I looked around a bit and turns out the main functionality is reordering the function calls as explained in the answers to the original question.
So world(hello(stdout)); is rewritten to terminal(stdout)(hello)(world); which more generally could be written as compose(stdout)(hello)(world);.
In Haskell this would be written as world . hello $ stdout and is called function composition.
In Clojure it would be (-> stdout hello world) and is called the "thread-first" macro.
I think it is only useful with decent partial application which lambdas provide a little bit, so we could have compose(4)([](int x){ return x + 7; })([](int x){ return x * 2; })([](int x){ return x == 22; }); which should return true if my calculation (and blind coding) is any good.
or to emphasize the partial application:
auto add7 = [](int x){ return x + 7; };
auto dbl = [](int x){ return x * 2; };
auto equal22 = [](int x){ return x == 22; };
assert(compose(4)(add7)(dbl)(equal22));
One major issue with this implementation is probably that the result can't be evaluated, because in the end a lambda is returned, so the construction in this answer might be better suited (functions separated by commas instead of parentheses).
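One way around the problem of the unreachable result is to return a small wrapper object instead of a bare lambda. The following is a sketch under assumptions: the Chain type and its value member are hypothetical, and compose is the name used in the text above rather than an existing library function.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical wrapper: holds the current value and applies the next function.
template <class T>
struct Chain {
    T value;

    // Applies func to the stored value and wraps the result in a new Chain.
    template <class F>
    auto operator()(F func) const
        -> Chain<typename std::decay<decltype(func(std::declval<const T&>()))>::type>
    {
        return {func(value)};
    }
};

// Starts a chain from a seed value.
template <class T>
Chain<T> compose(T seed)
{
    return {std::move(seed)};
}
```

With add7, dbl and equal22 defined as above, compose(4)(add7)(dbl)(equal22).value is true, and the chain can be cut off at any point by reading value.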
terminal(x) returns an applicator that method-chains its return value into terminal for repeated invocation.
But we could instead generalize it.
Suppose you have a function F. F takes an argument, and stuffs it on a stack.
It then examines the stack. If the top of the stack, evaluated on some subset of the stack, would work for invocation, it does it, and pushes the result back onto the stack. In general, such invocation could return a tuple of results.
So:
F(3)(2)(add)(2)(subtract)(7)(3)(multiply)(power)
would evaluate to:
((3+2)-2)^(7*3)
Your terminal does this with 0 argument functions (the first argument) and with 1 argument functions (every argument after that), and only supports 1 return value per invocation.
Doing this with a lambda would be tricky, but what I described is doable in C++.
So one name for it would be stack-based programming.
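A heavily simplified sketch of this idea follows. It is not the general scheme described above: it assumes integer operands and binary operations only, and the names StackMachine, Value, BinaryOp and top are hypothetical. Pushing a number and applying an operation are distinguished by overloading operator().

```cpp
#include <cassert>
#include <vector>

using Value = long long;
using BinaryOp = Value (*)(Value, Value);

// Hypothetical stack machine: numbers are pushed, operations pop two operands.
class StackMachine {
public:
    // Push an operand.
    StackMachine& operator()(int operand)
    {
        stack_.push_back(operand);
        return *this;
    }
    // Pop two operands, apply the operation, push the result.
    StackMachine& operator()(BinaryOp op)
    {
        assert(stack_.size() >= 2);
        Value rhs = stack_.back(); stack_.pop_back();
        Value lhs = stack_.back(); stack_.pop_back();
        stack_.push_back(op(lhs, rhs));
        return *this;
    }
    Value top() const { return stack_.back(); }

private:
    std::vector<Value> stack_;
};

inline Value add(Value a, Value b) { return a + b; }
inline Value subtract(Value a, Value b) { return a - b; }
inline Value multiply(Value a, Value b) { return a * b; }
inline Value power(Value base, Value exponent)
{
    Value result = 1;
    for (Value i = 0; i < exponent; ++i) result *= base;
    return result;
}

// Entry point mirroring the F(...) notation used in the text.
inline StackMachine F(int first)
{
    StackMachine machine;
    machine(first);
    return machine;
}
```

With these definitions, F(3)(2)(add)(2)(subtract)(7)(3)(multiply)(power).top() computes ((3+2)-2)^(7*3), that is 3^21 = 10460353203.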
As far as I know there is no "official" name, yet.
Suggestions:
Lambda chain
Lambda sausage
Curry sausage

Can I declare a variable inside a lambda capture clause?

I want to submit a handler, but I only want it to be executed if a shared pointer is still valid:
// elsewhere in the class:
std::shared_ptr<int> node;
// later on:
const std::weak_ptr<int> slave(node); // can I do this in the capture clause somehow?
const auto hook = [=]()
{
if (!slave.expired())
//do something
else
// do nothing; the class has been destroyed!
};
someService.Submit(hook); // this will be called later, and we don't know whether the class will still be alive
Can I declare slave within the capture clause of the lambda? Something like const auto hook = [std::weak_ptr<int> slave = node,=]().... but unfortunately this doesn't work. I would like to avoid declaring the variable and then copying it (not for performance reasons; I just think it would be clearer and neater if I could create whatever the lambda needs without polluting the enclosing scope).
You can do this using generalized lambda captures in C++14:
const auto hook = [=, slave = std::weak_ptr<int>(node)]()
{
...
};
Note that since there are no parameters or explicit return type, the empty parameter list (()) can be left out.
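To make the init-capture concrete, here is a small self-contained sketch. The helper makeHook is hypothetical, and the sketch uses lock() instead of expired() so that the object is kept alive while the hook works with it; incrementing the int merely stands in for "do something".

```cpp
#include <functional>
#include <memory>

// Hypothetical helper: builds a hook that only acts while the node is alive.
inline std::function<bool()> makeHook(const std::shared_ptr<int>& node)
{
    // Init-capture (C++14): `slave` exists only inside the closure and does
    // not pollute the enclosing scope.
    return [slave = std::weak_ptr<int>(node)]() {
        // lock() yields an empty shared_ptr once the node has been destroyed.
        if (auto alive = slave.lock()) {
            ++*alive;      // "do something" with the live object
            return true;
        }
        return false;      // the node is gone; do nothing
    };
}
```

The hook returns true and increments the value while some shared_ptr still owns the node, and returns false once the last owner has released it.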
As mentioned by chris this is possible in C++14.
If you want to modify the captured value, simply add the mutable specifier.
Here is an example which fills a vector with the values from zero up to the length of the vector minus one.
#include <iostream>
#include <vector>
#include <algorithm>
int main()
{
std::vector<int> container(10);
std::generate(container.begin(), container.end(), [n = 0]() mutable { return n++; });
for (const auto & number : container)
{
std::cout << number << " ";
}
std::cin.ignore();
return 0;
}

What is wrong with my Phoenix lambda expression?

I would expect the following example Boost Phoenix expression to compile.
What am I missing?
int plus(int a,int b)
{
return a+b;
}
int main(int argc,char** argv)
{
auto plus_1 = phx::bind(&plus,1,arg1);
auto value = phx::lambda[phx::val(plus_1)(arg1)]()(1);
std::cout << value << std::endl;
}
auto plus_1 = phx::bind(&plus,1,arg1);
After this line, plus_1 is a function object that takes one int argument and adds one to it.
phx::lambda[plus_1(arg1)](1);
Whoops. This isn't going to work because (as we said above) plus_1 is a function object that takes one int argument and adds one to it. Here, you're trying to invoke it with arg1.
It isn't obvious from your code what you expect it to do. Can you clarify?
====EDIT====
I see you've edited the code in your question. Your code is still wrong but for a different reason now. This:
phx::val(plus_1)(arg1)
... uses val to create a nullary function that returns the plus_1 unary function. You then try to invoke the nullary function with arg1. Boom.
Here is code that executes and does (what I believe) you intend:
#include <iostream>
#include <boost/phoenix/phoenix.hpp>
namespace phx = boost::phoenix;
using phx::arg_names::arg1;
int plus(int a,int b)
{
return a+b;
}
int main()
{
auto plus_1 = phx::bind(&plus, 1, arg1);
int value = phx::bind(phx::lambda[plus_1], arg1)(1);
std::cout << value << std::endl;
}
The first bind takes the binary plus and turns it into a unary function with the first argument bound to 1. The second bind creates a new unary function that is equivalent to the first, but it does so by safely wrapping the first function using lambda. Why is that necessary? Consider the code below, which is equivalent, but without the lambda:
// Oops, wrong:
int value = phx::bind(phx::bind(&plus, 1, arg1), arg1)(1);
Notice that arg1 appears twice. All expressions get evaluated from the inside out. First, we'll bind the inner arg1 to 1, then evaluate the inner bind yielding 2, which we then try to bind and invoke. That's not going to work because 2 isn't callable.
The use of lambda creates a scope for the inner arg1 so it isn't eagerly substituted. But like I said, the use of the second bind, which forces the need for lambda, yields a function that is equivalent to the first. So it's needlessly complicated. But maybe it helped you understand about bind, lambda and Phoenix scopes.
It's not clear to me what you're trying to accomplish by using lambda here, but if you just want to call plus_1 with 1 (resulting in 2), it's much simpler than your attempt:
#include <iostream>
#include <boost/phoenix.hpp>
int plus(int a, int b)
{
return a + b;
}
int main()
{
namespace phx = boost::phoenix;
auto plus_1 = phx::bind(plus, 1, phx::arg_names::arg1);
std::cout << plus_1(1) << '\n';
}
If this isn't what you're trying to accomplish, then you need to describe what you actually want. :-]
Perhaps this can explain it better.
Phoenix is not magic; it is first and foremost C++. It therefore follows the rules of C++.
phx::bind is a function that returns a function object, an object which has an overloaded operator() that calls the function that was bound. Your first statement stores this object into plus_1.
Given all of this, anytime you have the expression plus_1(...), this is a function call. That's what it is; you are saying that you want to call the overloaded operator() function on the type of that object, and that you are going to pass some values to that function.
It doesn't matter whether that expression is in the middle of a [] or not. phx::lambda cannot make C++ change its rules. It can't make plus_1(...) anything other than an immediate function call. Nor can arg1 make plus_1(...) not an immediate function call.
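The point that plus_1 is an ordinary object with an overloaded operator() can be illustrated without Phoenix. The sketch below is a plain C++ analogy, not how Phoenix is implemented: BindFirst is a hypothetical stand-in for the object returned by phx::bind(&plus, 1, arg1), and addInts plays the role of plus.

```cpp
// Plays the role of `plus` from the question.
inline int addInts(int a, int b)
{
    return a + b;
}

// Hypothetical stand-in for a bound function object: it stores the function
// and the first argument, and supplies the second one when called.
struct BindFirst {
    int (*fn)(int, int);
    int first;

    // Writing plus_1(x) is simply a call to this member function.
    int operator()(int second) const
    {
        return fn(first, second);
    }
};
```

Given BindFirst plus_1{&addInts, 1}, the expression plus_1(1) is an immediate call to operator() and yields 2, regardless of where the expression appears.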