Is it safe to cast a lambda function to a function pointer? - c++

I have this code:
void foo(void (*bar)()) {
bar();
}
int main() {
foo([] {
int x = 2;
});
}
However, I'm worried that this will suffer the same fate as:
struct X { int i; };
void foo(X* x) {
x->i = 2;
}
int main() {
foo(&X());
}
Which takes the address of a local variable.
Is the first example completely safe?

A lambda that captures nothing is implicitly convertible to a function pointer with the same parameter list and return type. Only capture-less lambdas can do this; a lambda that captures anything cannot.
Unless you're using VS2010, which didn't implement that part of the standard, since it didn't exist yet when they were writing their compiler.
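To make the rule concrete, here is a minimal sketch that checks the conversion at compile time. The names callback_t, apply and demo_captureless are made up for this illustration and do not come from the question.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical alias, used only for this illustration.
using callback_t = int (*)(int);

// Calls a plain function pointer, much like foo() in the question.
int apply(callback_t cb, int value) { return cb(value); }

int demo_captureless() {
    auto add_one = [](int x) { return x + 1; };                // captures nothing
    int offset = 5;
    auto add_offset = [offset](int x) { return x + offset; };  // captures offset

    // Only the capture-less closure type converts to callback_t.
    static_assert(std::is_convertible<decltype(add_one), callback_t>::value,
                  "a capture-less lambda converts to a function pointer");
    static_assert(!std::is_convertible<decltype(add_offset), callback_t>::value,
                  "a capturing lambda does not convert");

    assert(add_offset(1) == 6);  // the capturing lambda is still callable directly
    (void)add_offset;            // avoid an unused warning when asserts are disabled

    return apply(add_one, 41);   // implicit conversion happens at this call
}
```

Calling demo_captureless() passes add_one through the function pointer parameter; trying apply(add_offset, 41) instead would be rejected by the compiler.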

Yes I believe the first example is safe, regardless of the life-time of all the temporaries created during the evaluation of the full-expression that involves the capture-less lambda-expression.
Per the working draft (n3485) 5.1.2 [expr.prim.lambda] p6
The closure type for a lambda-expression with no lambda-capture has a
public non-virtual non-explicit const conversion function to pointer
to function having the same parameter and return types as the closure
type’s function call operator. The value returned by this conversion
function shall be the address of a function that, when invoked, has
the same effect as invoking the closure type’s function call operator.
The above paragraph says nothing about the pointer-to-function's validity expiring after evaluation of the lambda-expression.
For example, I would expect the following to work:
auto L = []() {
return [](int x, int y) { return x + y; };
};
int foo( int (*sum)(int, int) ) { return sum(3, 4); }
int main() {
foo( L() );
}
While implementation details of clang are certainly not the final word on C++ (the standard is), if it makes you feel any better, the way this is implemented in clang is that when the lambda expression is parsed and semantically analyzed a closure-type for the lambda expression is invented, and a static function is added to the class with semantics similar to the function call operator of the lambda. So even though the life-time of the lambda object returned by 'L()' is over within the body of 'foo', the conversion to pointer-to-function returns the address of a static function that is still valid.
Consider the somewhat analogous case:
struct B {
static int f(int, int) { return 0; }
typedef int (*fp_t)(int, int);
operator fp_t() const { return &f; }
};
int main() {
int (*fp)(int, int) = B{};
fp(3, 4); // You would expect this to be ok.
}
I am certainly not a core C++ expert, but FWIW, this is my interpretation of the letter of the standard, and I feel it is defensible.
Hope this helps.

In addition to Nicol's perfectly correct general answer, I would add some views on your particular fears:
However, I'm worried that this will suffer the same fate as ..., which
takes the address of a local variable.
Of course it does, but this is no problem when you just call it inside foo (in the same way your struct example works), since the surrounding function (main in this case) that defined the local variable/lambda outlives the called function (foo) anyway. It could only become a problem if you saved that local variable or lambda pointer for later use. So
Is the first example completely safe?
Yes, it is, as is the second example, too.

Related

Forwarding additional arguments via lambda function

Here is the minimal reproducible code,
#include <iostream>
#include <string>
void bar(std::string s, int x){
std::cout <<std::endl<< __FUNCTION__<<":"<<s<<" "<<x;
}
using f_ptr = void(*)(std::string);
void foo(f_ptr ptr){
ptr("Hello world");
}
template<typename T> void fun(T f){
static int x;
std::cout <<std::endl<<x++<<std::endl;
f("Hello World");
}
int main()
{
//case:1
f_ptr ptr1 = [](std::string s){bar(s,10);};
// foo(ptr1);
//case:2
static int x =10;
f_ptr ptr2 = [x](std::string s){bar(s,x);};
//foo(ptr2);
//case:3
int y =10;
f_ptr ptr3 = [y](std::string s){bar(s,y);}; /* error*/
foo(ptr3);
//case:4
int z = 12;
fun([z](std::string s){bar(s,z);});
return 0;
}
Error:
main.cpp:25:50: error: cannot convert ‘main()::<lambda(std::string)>’ to ‘f_ptr {aka void (*)(std::basic_string<char>)}’ in initialization
f_ptr ptr3 = [y](std::string s){bar(s,y);}; /* error*/
My questions are,
Is there any way to forward additional arguments like case:3 via a lambda?
What conversion is causing the error in case:3?
In case:4, what is typename T deduced to?
Is there any way to forward additional arguments like case:3 via a lambda?
What conversion is causing the error in case:3?
Lambdas with a capture list can't convert to a function pointer implicitly; lambdas without captures can. You can use std::function instead,
void foo(std::function<void(std::string)> f){
f("Hello world");
}
Or take the lambda directly, as fun does.
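Both options can be sketched as follows. This is only an illustration: bar is changed here to record its arguments in a hypothetical global, last_call, instead of printing, and foo_generic and demo_forwarding are made-up names.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical: records the last call so the result can be inspected.
std::string last_call;

void bar(const std::string& s, int x) {
    last_call = s + " " + std::to_string(x);
}

// Option 1: std::function accepts any compatible callable, capturing or not.
void foo(const std::function<void(std::string)>& f) {
    f("Hello world");
}

// Option 2: accept the closure type directly, as fun does in the question.
template <typename F>
void foo_generic(F f) {
    f("Hello world");
}

std::string demo_forwarding() {
    int y = 10;
    // The capture carries the additional argument along with the callable.
    foo([y](std::string s) { bar(s, y); });
    assert(!last_call.empty());
    return last_call;
}
```

The std::function version costs a type-erased call (and possibly an allocation); the template version avoids that but has to live in a header.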
In case:4,typename T is deduced to what?
The type would be the unique closure type; the lambda expression is a prvalue expression of that type.
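A small sketch of what "unique closure type" means in practice. The helper deduced_as and the function demo_deduction are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <type_traits>

// Returns true when the deduced parameter type T is exactly Expected.
template <typename Expected, typename T>
bool deduced_as(T) {
    return std::is_same<Expected, T>::value;
}

bool demo_deduction() {
    int z = 12;
    auto first = [z](int v) { return v + z; };
    auto second = [z](int v) { return v + z; };  // same text, different type

    static_assert(!std::is_same<decltype(first), decltype(second)>::value,
                  "every lambda expression has its own closure type");
    static_assert(std::is_class<decltype(first)>::value,
                  "a closure type is a class type");

    assert(first(1) == second(1));  // same behavior, still distinct types

    // T is deduced as the closure type of the argument, not a function pointer.
    return deduced_as<decltype(first)>(first) &&
           !deduced_as<decltype(first)>(second);
}
```

So in case:4 each call of fun with a different lambda expression instantiates a separate fun<T>, each with its own static int x.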
Some details, why this compiles while case 3 does not:
//case:2
static int x =10;
f_ptr ptr2 = [x](std::string s){bar(s,x);};
It compiles because it effectively doesn't capture anything, so the lambda can be converted to the function pointer; that conversion is not allowed when something is actually captured. The standard says:
5.1.1/2 A name in the lambda-capture shall be in scope in the context of the lambda expression, and shall be this or refer to a local
variable or reference with automatic storage duration.
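Because this is a "shall" rule, explicitly capturing a static variable is not something the standard permits. Compilers that accept it anyway (GCC does, with a warning) treat the capture as if it were absent, which is a degree of freedom for the implementation rather than specified behavior.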
Be aware that "capturing" a static variable can lead to nasty surprises: the capture list suggests that a copy is made, but nothing is actually copied, so the lambda sees later changes to the variable.
Also be aware that the same applies to global variables (they have no automatic storage duration either).
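The difference between reading a static and capturing a local by copy can be sketched like this. The function name counter_demo is made up, and the static is simply used without a capture list entry, which is the portable way to write it.

```cpp
#include <cassert>

int counter_demo() {
    static int x = 10;
    int y = 10;

    auto reads_static = [] { return x; };   // no capture needed: x is not automatic
    auto copies_local = [y] { return y; };  // y is copied into the closure here

    // A lambda that only uses statics captures nothing, so it still converts.
    int (*fp)() = reads_static;

    x = 20;
    y = 20;

    assert(fp() == reads_static());

    // The static is read at call time (20); the captured copy is still 10.
    return reads_static() * 100 + copies_local();
}
```

The return value packs both observations into one number: the hundreds part comes from the static, the remainder from the captured copy.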

Declaring variable with name `this` inside lambda inside parentheses leads to different results on 3 different compilers

In C++ it's possible to declare a variable with its name in parentheses, like int (x) = 0;. But it seems that if you use this instead of a variable name, a constructor is called instead: A (this); calls A::A(B*). So the first question is why it's different for this; is it because variables can't be named this? And to complicate matters a bit, let's put this inside a lambda:
struct B;
struct A
{
A (B *) {}
};
struct B
{
B ()
{
[this] { A (this); } ();
}
};
Now gcc calls A::A(B*), msvc prints error about missing default constructor and clang prints expected expression (https://godbolt.org/g/Vxe0fF). It's even funnier in msvc - it really creates variable with name this that you can use, so it's definitely a bug (https://godbolt.org/g/iQaaPH). Which compiler is right and what are the reasons for such behavior?
In the C++ standard, §5.1.5 (paragraph 7 for C++11, paragraph 8 in later standards) [expr.prim.lambda]:
The lambda-expression’s compound-statement yields the function-body (8.4) of the function call operator, but
for purposes of name lookup (3.4), determining the type and value of this (9.2.2.1) and transforming id-
expressions referring to non-static class members into class member access expressions using (*this) (9.2.2),
the compound-statement is considered in the context of the lambda-expression. [ Example:
struct S1 {
int x, y;
int operator()(int);
void f() {
[=]()->int {
return operator()(this->x + y); // equivalent to S1::operator()(this->x + (*this).y)
// this has type S1*
};
}
};
— end example ]
Thus, gcc is right. You will notice that there is no exception for the case where you are capturing this. There is, however, a clarification since C++17 for the case where you capture *this, still in §5.1.5 (paragraph 17):
If *this is captured by copy, each odr-use of this is transformed into a pointer to the corresponding
unnamed data member of the closure type, cast (5.4) to the type of this.
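The effect of the quoted paragraph can be checked with a short sketch. Widget and its members are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class, used only to show what `this` means inside a lambda.
struct Widget {
    int value = 7;

    bool this_is_enclosing_object() {
        auto check = [this]() -> Widget* {
            // Inside the body, `this` names the Widget, not the closure object.
            static_assert(std::is_same<decltype(this), Widget*>::value,
                          "this has the type it has in the enclosing function");
            return this;
        };
        return check() == this;
    }

    int read_member() {
        // `value` is treated as (*this).value, using the captured pointer.
        int result = [this] { return value + 1; }();
        assert(result == value + 1);
        return result;
    }
};
```

Since this inside the lambda body is an expression of type Widget*, a statement such as A (this); cannot declare a variable named this.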

Can lambdas translate into functions?

Common knowledge dictated that lambda functions are functors under the hood.
In this video (at about 45:43) Bjarne says:
I mentioned that a lambda translates into a function object, or into a function if that's convenient
I can see how this is a compiler optimization (ie it doesn't change the perception of lambdas as unnamed functors which means that eg lambdas still won't overload) but are there any rules that specify when this is applicable?
Edit
The way I understand the term translate (and that's what I'm asking about) has nothing to do with conversion (I'm not asking whether lambdas are convertible to function ptr etc). By translate I mean "compile lambda expressions into functions instead of function objects".
As mentioned in cppreference :
The lambda expression constructs an unnamed prvalue temporary object of unique unnamed non-union non-aggregate type, known as closure type.
The question is: can this object be omitted and have a plain function instead? If yes, then when and how?
Note: I imagine one such rule being "don't capture anything" but I can't find any reliable sources to confirm it
TLDR: if you only use lambda to convert it to a function pointer (and only invoke it via that function pointer), it is always profitable to omit the closure object. Optimizations which enable this are inlining and dead code elimination. If you do use the lambda itself, it is still possible to optimize the closure away, but requires somewhat more aggressive interprocedural optimization.
I will now try to show how that works under the hood. I will use GCC in my examples, because I'm more familiar with it. Other compilers should do something similar.
Consider this code:
#include <stdio.h>
typedef int (* fnptr_t)(int);
void use_fnptr(fnptr_t fn)
{
printf("fn=%p, fn(1)=%d\n", fn, fn(1));
}
int main()
{
auto lam = [] (int x) { return x + 1; };
use_fnptr((fnptr_t)lam);
}
Now, I compile it and dump intermediate representation (for versions prior to 6, you should add -std=c++11):
g++ test.cc -fdump-tree-ssa
A little cleaned-up and edited (for brevity) dump looks like this:
// _ZZ4mainENKUliE_clEi
main()::<lambda(int)> (const struct __lambda0 * const __closure, int x)
{
return x_1(D) + 1;
}
// _ZZ4mainENUliE_4_FUNEi
static int main()::<lambda(int)>::_FUN(int) (int D.2780)
{
return main()::<lambda(int)>::operator() (0B, _2(D));
}
// _ZZ4mainENKUliE_cvPFiiEEv
main()::<lambda(int)>::operator int (*)(int)() const (const struct __lambda0 * const this)
{
return _FUN;
}
int main() ()
{
struct __lambda0 lam;
int (*<T5c1>) (int) _3;
_3 = main()::<lambda(int)>::operator int (*)(int) (&lam);
use_fnptr (_3);
}
That is, the lambda has two member functions (the function call operator and a conversion operator) and one static member function, _FUN, which simply invokes the lambda with this set to zero (the closure has no state, so the pointer is never used). main calls the conversion operator and passes the result to use_fnptr, exactly as it is written in the source code.
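The same structure can be written by hand, which may make the dump easier to read. This is only a sketch of the shape: Lambda0, invoke and call_through_pointer are made-up names, and a real compiler is free to generate something different.

```cpp
#include <cassert>

// Hand-written stand-in for the closure type of [] (int x) { return x + 1; }.
struct Lambda0 {
    typedef int (*fnptr_t)(int);

    // Function call operator: the lambda body.
    int operator()(int x) const { return x + 1; }

    // Conversion operator: returns the address of the static thunk.
    operator fnptr_t() const { return &invoke; }

private:
    // Static thunk, playing the role of _FUN. The closure has no state, so
    // the thunk can run the body on a fresh, empty object.
    static int invoke(int x) { return Lambda0()(x); }
};

int call_through_pointer(int arg) {
    Lambda0::fnptr_t fn = Lambda0();  // the temporary is gone after this line
    assert(fn != nullptr);
    return fn(arg);                   // the thunk it pointed to is still valid
}
```

The pointer stays usable after the temporary Lambda0 object is destroyed because it refers to a static member function, not to the object.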
I can write:
extern "C" int _ZZ4mainENKUliE_clEi(void *, int);
int main()
{
auto lam = [] (int x) { return x + 1; };
use_fnptr((fnptr_t)lam);
printf("%d %d %d\n", lam(10), _ZZ4mainENKUliE_clEi(&lam, 11), __lambda0::_FUN(12));
printf("%p %p\n", &__lambda0::_FUN, (fnptr_t)lam);
return 0;
}
This program outputs:
fn=0x4005fc, fn(1)=2
11 12 13
0x4005fc 0x4005fc
Now, I think it's pretty obvious, that the compiler should inline lambda (_ZZ4mainENKUliE_clEi) into _FUN (_ZZ4mainENUliE_4_FUNEi), because _FUN is the only caller. And inline operator int (*)(int) into main (because this operator simply returns a constant). GCC does exactly this, when compiling with optimization (-O). You can check this like:
g++ test.cc -O -fdump-tree-einline
Dump file:
// Considering inline candidate main()::<lambda(int)>.
// Inlining main()::<lambda(int)> into static int main():<lambda(int)>::_FUN(int).
static int main()::<lambda(int)>::_FUN(int) (int D.2822)
{
return _2(D) + 1;
}
The closure object is gone. Now, a more complicated case, when lambda itself is used (not a function pointer). Consider:
#include <stdio.h>
#define PRINT(x) printf("%d", (x))
#define PRINT1(x) PRINT(x); PRINT(x); PRINT(x); PRINT(x);
#define PRINT2(x) do { PRINT1(x) PRINT1(x) PRINT1(x) PRINT1(x) } while(0)
__attribute__((noinline)) void use_lambda(auto t)
{
t(1);
}
int main()
{
auto lam = [] (int x) { PRINT2(x); };
use_lambda(lam);
return 0;
}
GCC will not inline lambda, because it is rather huge (that is what I used printf's for):
g++ test2.cc -O2 -fdump-ipa-inline -fdump-tree-einline -fdump-tree-esra
Early inliner's dump:
Considering inline candidate main()::<lambda(int)>
will not early inline: void use_lambda(auto:1) [with auto:1 = main()::<lambda(int)>]/16->main()::<lambda(int)>/19, growth 46 exceeds --param early-inlining-insns
But "early interprocedural scalar replacement of aggregates" pass will do what we want:
;; Function main()::<lambda(int)> (_ZZ4mainENKUliE_clEi, funcdef_no=14, decl_uid=2815, cgraph_uid=12, symbol_order=12)
IPA param adjustments: 0. base_index: 0 - __closure, base: __closure, remove_param
1. base_index: 1 - x, base: x, copy_param
The first parameter (i.e., closure) is not used, and it gets removed. Unfortunately interprocedural SRA is not able to optimize away indirection, which is introduced for captured values (though there are cases when it would be obviously profitable), so there is still some room for enhancements.
From Lambda expressions §5.1.2 p6 (draft N4140)
The closure type for a non-generic lambda-expression with no lambda-capture has a public non-virtual
non-explicit const conversion function to pointer to function with C++ language linkage having the same
parameter and return types as the closure type’s function call operator.
The standard quote has already been posted, I want to add some examples.
You can assign lambdas to function pointers as long as there are no captured variables:
Legal:
int (*f)(int) = [] (int x) { return x + 1; }; // assign lambda to function pointer
int z = f(3); // use the function pointer
Illegal:
int y = 5;
int (*g)(int) = [y] (int x) { return x + y; }; // error
Legal:
int y = 5;
int z = ([y] (int x) { return x + y; })(2); // use lambda directly
(Edit)
Since we can not ask Bjarne what he meant exactly, I want to try a few interpretations.
"translate" meaning "convert"
This is what I understood initially, but it is clear now that the question is not about this possible meaning.
"translate" as used in the C++ standard, meaning "compile" (more or less)
As Sebastian Redl already commented, there are no function objects on the binary level. There are just opcodes and data, and the standard does not talk about, or specify, any binary formats.
"translate" meaning "being semantically equivalent"
This would imply that if A and B are semantically equivalent, the produced binary code for A and B could be the same. The rest of my answer uses this interpretation.
A closure consists of two parts:
the statements in the lambda body, "code"
the captured variable values or references, "data"
This is equivalent to a functor, as already stated in the question.
Functors can be seen as a subset of objects, because they have code and data, but only one member function: the call operator. So closures could be seen as semantically equivalent to a restricted form of objects.
A function on the other hand, has no data associated with it. There are the arguments of course, but these must be supplied by the caller and can change from one invocation to the other. This is a semantic difference to a closure, where the bound variables can not be changed and are not supplied by the caller.
A member function is not something independent, as it can not work without its object, so I think the question refers to a freestanding function.
So no, a lambda is in general not semantically equivalent to a function.
There is the obvious special case of a lambda with no captured variables, where the functor consists only of the code, and this is equivalent to a function.
But, a lambda could be said to be semantically equivalent to a set of functions. Each possible closure (distinct combination of values/references for the bound variables) would be equivalent to one function in that set.
Of course this can only be useful when the bound variables can only have a very limited set of values / are references to only a few different variables, if at all.
For example I see no reason why a compiler could not treat the following two snippets as (almost*) equivalent:
void Test(bool cond, int x)
{
int y;
if(cond) y = 5;
else y = 3;
auto f = [y](int x) { return x + y; };
// more code that
// uses f
}
A clever compiler could see that y can only have the values 5 or 3, and compile as if it would be written like this:
int F1(int x)
{
return x + 5;
}
int F2(int x)
{
return x + 3;
}
void Test(bool cond, int x)
{
int (*f)(int);
if(cond) f = F1;
else f = F2;
// more code that
// uses f
}
(*) Of course it depends on what more code that uses f does exactly.
Another (maybe better) example would be a lambda that always binds the same variable by reference. Then, there is only one possible closure, and so it is equivalent to a function, if the function has access to that variable by other means than by passing it as an argument.
Another observation that might be helpful is that asking
can this object [closure] be omitted and have a plain function instead? If yes,
then when and how?
is more or less the same as asking when and how a member function can be used without the object. Since lambdas are functors, and functors are objects, the two questions are closely related.
The bound variables of the lambda correspond to the data members of the object, and the lambda body corresponds to the body of the member function.
To give another kind of insight, let's have a look at the code produced by clang when compiling the following snippet:
int (*f)() = []() { return 0; };
If you compile this with:
clang++ -std=c++11 -S -o- -emit-llvm a.cc
You get the following LLVM IR for the lambda definition:
define internal i32 @"_ZNK3$_0clEv"(%class.anon* %this) #0 align 2 {
%1 = alloca %class.anon*, align 8
store %class.anon* %this, %class.anon** %1, align 8
%2 = load %class.anon** %1
ret i32 0
}
define internal i32 @"_ZN3$_08__invokeEv"() #1 align 2 {
%1 = call i32 @"_ZNK3$_0clEv"(%class.anon* undef)
ret i32 %1
}
The first function takes a %class.anon* and returns 0: that's the call operator. The second passes an undefined object pointer (undef) to the call operator, which is fine because the closure has no state, and returns the result.
When compiled with -O2, the whole lambda is turned into:
define internal i32 @"_ZN3$_08__invokeEv"() #0 align 2 {
ret i32 0
}
So that's a single function that returns 0.
I mentioned that a lambda translates into a function object, or into a function if that's convenient
That's exactly what clang does! It transforms the lambda into a function object and, when possible, optimizes it into a function.
No, it can't be done. Lambdas are defined to be functors, and I don't see the as-if rule helping here.
[C++14: 5.1.2/6]: The closure type for a non-generic lambda-expression with no lambda-capture has a public non-virtual non-explicit const conversion function to pointer to function with C++ language linkage (7.5) having the same parameter and return types as the closure type’s function call operator. [..]
…followed by similar wording for generic lambdas.

Where are lambda captured variables stored?

How is it possible that this example works? It prints 6:
#include <iostream>
#include <functional>
using namespace std;
void scopeIt(std::function<int()> &fun) {
int val = 6;
fun = [=](){return val;}; //<-- this
}
int main() {
std::function<int()> fun;
scopeIt(fun);
cout << fun();
return 0;
}
Where is the value 6 stored after scopeIt is done being called? If I replace the [=] with a [&], it prints 0 instead of 6.
It is stored within the closure, which - in your code - is then stored within std::function<int()> &fun.
A lambda generates what's equivalent to an instance of a compiler generated class.
This code:
[=](){return val;}
Generates what's effectively equivalent to this... this would be the "closure":
struct UNNAMED_TYPE
{
UNNAMED_TYPE(int val) : val(val) {}
const int val;
// Above, your [=] "equals/copy" syntax means "find what variables
// are needed by the lambda and copy them into this object"
int operator() () const { return val; }
// Above, here is the code you provided
} (val);
// ^^^ note that this DECLARED type is being INSTANTIATED (constructed) too!!
Lambdas in C++ are really just "anonymous" struct functors. So when you write this:
int val = 6;
fun = [=](){return val;};
What the compiler is translating that into is this:
int val = 6;
struct __anonymous_struct_line_8 {
int val;
__anonymous_struct_line_8(int v) : val(v) {}
int operator() () const {
return val; // returns this->val
}
};
fun = __anonymous_struct_line_8(val);
Then, std::function stores that functor via type erasure.
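To show roughly what "stores that functor via type erasure" means, here is a minimal stand-in for std::function<int()>. It is an assumption-laden sketch, not how any real standard library implements std::function; IntCallable, Concept, Model, scope_it and demo_erasure are all made-up names.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Minimal stand-in for std::function<int()>, for illustration only.
class IntCallable {
    // Abstract interface that hides the concrete functor type.
    struct Concept {
        virtual ~Concept() {}
        virtual int call() const = 0;
    };
    // One Model per functor type; it owns a copy of the functor.
    template <typename F>
    struct Model : Concept {
        explicit Model(F f) : functor(std::move(f)) {}
        int call() const override { return functor(); }
        F functor;  // the closure object, captured values included
    };
    std::shared_ptr<const Concept> impl;

public:
    IntCallable() {}
    template <typename F>
    IntCallable(F f) : impl(std::make_shared<Model<F>>(std::move(f))) {}
    int operator()() const { return impl->call(); }
};

void scope_it(IntCallable& fun) {
    int val = 6;
    fun = [=] { return val; };  // val is copied into the closure, then onto the heap
}

int demo_erasure() {
    IntCallable fun;
    scope_it(fun);
    int result = fun();  // the copy of val outlives scope_it
    assert(result == 6);
    return result;
}
```

The captured copy of val lives inside the heap-allocated Model object, which is why it survives the return from scope_it.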
When you use [&] instead of [=], it changes the struct to:
struct __anonymous_struct_line_8 {
int& val; // Notice this is a reference now!
...
So now the object stores a reference to the function's val object, which becomes a dangling (invalid) reference after the function exits (and you get undefined behavior).
The so-called closure type (which is the class type of the lambda expression) has members for each captured entity. Those members are objects for capture by value, and references for capture by reference. They are initialized with the captured entities and live independently within the closure object (the particular object of closure type that this lambda designates).
The unnamed member that corresponds to the value capture of val is initialized with val and accessed from inside the closure type's operator(), which is fine. The closure object may easily have been copied or moved multiple times until that happens, and that's fine too: closure types have implicitly defined move and copy constructors just as normal classes do.
However, when capturing by reference, the lvalue-to-rvalue conversion that is implicitly performed when calling fun in main induces undefined behavior as the object which the reference member referred to has already been destroyed - i.e. we are using a dangling reference.
The value of a lambda expression is an object of class type, and
For each entity
captured by copy, an unnamed non-static data member is declared in the closure type.
([expr.prim.lambda]/14 in C++11)
That is, the object created by the lambda
[=](){return val;}
actually contains a non-static member of int type, whose value is 6, and this object is copied into the std::function object.
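A short sketch of that member in action; demo_copy_semantics is a made-up name for this illustration.

```cpp
#include <cassert>

int demo_copy_semantics() {
    int val = 6;
    auto by_value = [=] { return val; };  // the closure stores its own int
    auto copy = by_value;                 // closure types are copyable

    static_assert(sizeof(by_value) >= sizeof(int),
                  "the closure has room for the captured int");

    val = 99;  // changing the original does not affect either closure

    assert(by_value() == copy());
    return by_value() + copy();
}
```

Both closures still return the value val had when the lambda expression was evaluated, because each holds its own copy.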

Function returning a lambda expression

I wonder if it's possible to write a function that returns a lambda function in C++11. Of course one problem is how to declare such function. Each lambda has a type, but that type is not expressible in C++. I don't think this would work:
auto retFun() -> decltype ([](int x) -> int)
{
return [](int x) { return x; }
}
Nor this:
int(int) retFun();
I'm not aware of any automatic conversions from lambdas to, say, pointers to functions, or some such. Is the only solution handcrafting a function object and returning it?
You don't need a handcrafted function object, just use std::function, to which lambda functions are convertible:
This example returns the integer identity function:
std::function<int (int)> retFun() {
return [](int x) { return x; };
}
For this simple example, you don't need std::function.
From standard §5.1.2/6:
The closure type for a lambda-expression with no lambda-capture has a public non-virtual non-explicit const conversion function to pointer to function having the same parameter and return types as the closure type’s function call operator. The value returned by this conversion function shall be the address of a function that, when invoked, has the same effect as invoking the closure type’s function call operator.
Because your function doesn't have a capture, it means that the lambda can be converted to a pointer to function of type int (*)(int):
typedef int (*identity_t)(int); // works with gcc
identity_t retFun() {
return [](int x) { return x; };
}
That's my understanding, correct me if I'm wrong.
Though the question specifically asks about C++11, for the sake of others who stumble upon this and have access to a C++14 compiler, C++14 now allows deduced return types for ordinary functions. So the example in the question can be adjusted just to work as desired simply by dropping the -> decltype... clause after the function parameter list:
auto retFun()
{
return [](int x) { return x; };
}
Note, however, that this will not work if more than one return <lambda>; appears in the function. This is because a restriction on return type deduction is that all return statements must return expressions of the same type, but every lambda object is given its own unique type by the compiler, so the return <lambda>; expressions will each have a different type.
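When several different lambdas have to be returned, one way around the restriction is to give them a common return type explicitly. The sketch below uses hypothetical names (make_op, make_op_ptr, op_t, demo_make_op); the function pointer variant only works because these lambdas capture nothing.

```cpp
#include <cassert>
#include <functional>

// Two lambdas have two closure types, so `auto` deduction would fail here.
// std::function gives them a common type.
std::function<int(int)> make_op(bool doubling) {
    if (doubling) {
        return [](int x) { return 2 * x; };
    }
    return [](int x) { return x + 1; };
}

// Capture-less lambdas can also share a plain function pointer type.
typedef int (*op_t)(int);

op_t make_op_ptr(bool doubling) {
    if (doubling) {
        return [](int x) { return 2 * x; };
    }
    return [](int x) { return x + 1; };
}

int demo_make_op() {
    assert(make_op(true)(5) == 10);
    assert(make_op_ptr(false)(5) == 6);
    return make_op(false)(5) + make_op_ptr(true)(5);  // 6 + 10
}
```

The function pointer version avoids the overhead of std::function, at the price of ruling out captures.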
You can return a lambda function from another lambda function, since you do not need to specify the return type of a lambda explicitly. Just write something like this at global scope:
auto retFun = []() {
return [](int x) {return x;};
};
You should write like this:
auto returnFunction = [](int x){
return [&x](){
return x;
}();
};
Here the inner lambda is invoked immediately (note the trailing ()), so returnFunction returns the int result rather than a function. Use it like:
int val = returnFunction(someNumber);
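If you cannot use std::function (for example, when running your C++ code on microcontrollers), you can return a void pointer and then cast it back. Lambdas themselves still require C++11, and converting between function pointers and void* is only conditionally supported, so check that your toolchain allows it.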
void* functionThatReturnsLambda()
{
void(*someMethod)();
// your lambda
someMethod = []() {
// code of lambda
};
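// function pointers do not convert to void* implicitly, so cast explicitly
return reinterpret_cast<void*>(someMethod);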
}
int main(int argc, char* argv[])
{
void* myLambdaRaw = functionThatReturnsLambda();
// cast it
auto myLambda = (void(*)())myLambdaRaw;
// execute lambda
myLambda();
}