It is necessary for me to use std::function but I don't know what the following syntax means.
std::function<void()> f_name = []() { FNAME(); };
What is the goal of using std::function? Is it to make a pointer to a function?
std::function is a type erasure object. That means it erases the details of how some operations happen, and provides a uniform run time interface to them. For std::function, the primary [1] operations are copy/move, destruction, and 'invocation' with operator() -- the 'function like call operator'.
In less abstruse English, it means that std::function can contain almost any object that acts like a function pointer in how you call it.
The signature it supports goes inside the angle brackets: std::function<void()> takes zero arguments and returns nothing. std::function< double( int, int ) > takes two int arguments and returns double. In general, std::function supports storing any function-like object whose arguments can be converted-from its argument list, and whose return value can be converted-to its return value.
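To make the conversion rule concrete, here is a minimal sketch (C++11). The names add_ints, Scale, call_with and the demo_* helpers are made up for illustration; the point is only that one std::function<double(int, int)> can hold a plain function, a hand-written function object, and a lambda, as long as the arguments and return value convert.

```cpp
#include <cassert>
#include <functional>

// Hypothetical callables, invented for this illustration.
int add_ints(int a, int b) { return a + b; }   // returns int, which converts to double

struct Scale {                                 // a hand-written function object
    double factor;
    // Takes long parameters: the int arguments convert on the way in.
    double operator()(long a, long b) const {
        return factor * static_cast<double>(a + b);
    }
};

// Accepts anything that fits the double(int, int) signature and calls it with (2, 3).
double call_with(const std::function<double(int, int)>& f) {
    return f(2, 3);
}

double demo_function_pointer() { return call_with(add_ints); }
double demo_function_object()  { return call_with(Scale{0.5}); }
double demo_lambda()           { return call_with([](int a, int b) { return a * b; }); }
```

All three calls go through the same call_with, which never learns what kind of object it was handed. That is the "type erasure" part.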
It is important to know that std::function and lambdas are different, if compatible, beasts.
The next part of the line is a lambda. This is new syntax in C++11 to add the ability to write simple function-like objects -- objects that can be invoked with (). Such objects can be type erased and stored in a std::function at the cost of some run time overhead.
[](){ code } in particular is a really simple lambda. It corresponds to this:
struct some_anonymous_type {
some_anonymous_type() {}
void operator()() const {
code
}
};
an instance of the above simple pseudo-function type. An actual class like the above is "invented" by the compiler, with an implementation-defined unique name (often including symbols that no user-defined type can contain). I do not know whether an implementation could follow the standard without inventing such a class, but every compiler I know of actually creates the class.
The full lambda syntax looks like:
[ capture_list ]( argument_list )
optional_mutable -> return_type
{
code
}
But many parts can be omitted or left empty. The capture_list corresponds to both the constructor of the resulting anonymous type and its member variables, the argument_list the arguments of the operator(), and the return type the return type. The constructor of the lambda instance is also magically called when the instance is created with the capture_list.
[ capture_list ]( argument_list ) -> return_type { code }
basically becomes
struct some_anonymous_type {
// capture_list turned into member variables
some_anonymous_type( /* capture_list turned into arguments */ ):
/* member variables initialized */
{}
return_type operator()( argument_list ) const {
code
}
};
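As a rough side-by-side of that translation, here is a sketch with one captured variable. AddOffset, via_struct and via_lambda are hypothetical names; the real compiler-generated class has no name you can spell.

```cpp
#include <cassert>

// Hand-written stand-in for: [offset](int x) -> int { return x + offset; }
struct AddOffset {
    int offset;                                  // capture turned into a member variable
    explicit AddOffset(int o) : offset(o) {}     // capture list turned into constructor arguments
    int operator()(int x) const { return x + offset; }  // const, like a non-mutable lambda
};

int via_struct(int offset, int x) {
    AddOffset f(offset);
    return f(x);
}

int via_lambda(int offset, int x) {
    auto f = [offset](int x_) -> int { return x_ + offset; };
    return f(x);
}
```

Both functions compute the same thing. The lambda version just lets the compiler write the struct for you.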
Note that C++20 added template parameters to lambdas, and that isn't covered above.
[]<typename T>( std::vector<T> const& v ) { return v.size(); }
[1] In addition, RTTI is stored (typeid), and the cast-back-to-original-type operation is included.
Let's break the line apart:
std::function<void()>
This is a declaration for a function taking no parameters, and returning no value. If the function returned an int, it would look like this:
std::function<int()>
Likewise, if it took an int parameter as well:
std::function<int(int)>
I suspect your main confusion is the next part.
[]() { FNAME(); };
The [] part is called a capture clause. Here you put variables that are local to the declaration of your lambda, and that you want to be available within the lambda function itself. This is saying "I don't want anything to be captured". If this was within a class definition and you wanted the class to be available to the lambda, you might do:
[this]() { FNAME(); };
The next part is the parameter list passed to the lambda, exactly as if it were a regular function. As mentioned earlier, std::function<void()> is a signature for a callable that takes no parameters, so this is empty as well.
The rest of it is the body of the lambda itself, as if it was a regular function, which we can see just calls the function FNAME.
Another Example
Let's say you had the following signature, that is for something that can sum two numbers.
std::function<int(int, int)> sumFunc;
We could now declare a lambda thusly:
sumFunc = [](int a, int b) { return a + b; };
Not sure if you're using MSVC, but here's a link anyway to the lambda expression syntax:
http://msdn.microsoft.com/en-us/library/dd293603.aspx
Lambdas with captures (stateful lambdas) cannot be assigned to each other since they have unique types, even if they look exactly the same.
To be able to store and pass around lambdas with captures, we can use "std::function" to hold a function object constructed by a lambda expression.
Basically, "std::function" lets us assign lambdas with different capture layouts to one and the same function object.
Example:
auto func = [](int a){
cout << "a:" << a << endl;
};
func(40);
//
int x = 10;
func = [x](int a){ //ATTENTION(ERROR!): assigning a new structure to the same object
cout << "x:" << x << ",a:" << a << endl;
};
func(2);
So the above usage will be incorrect.
But if we define a function object with "std::function":
auto func = std::function<void(int)>{};
func = [](int a){
cout << "a:" << a << endl;
};
func(40);
//
int x = 10;
func = [x](int a){ //CORRECT. because of std::function
//...
};
int y = 11;
func = [x,y](int a){ //CORRECT
//...
};
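The same idea also lets you keep lambdas with different captures in one container. A small sketch (run_all and demo_mixed_captures are made-up names):

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Applies every stored callback to `value` and sums the results.
int run_all(const std::vector<std::function<int(int)>>& callbacks, int value) {
    int total = 0;
    for (const auto& cb : callbacks) total += cb(value);
    return total;
}

int demo_mixed_captures() {
    int x = 10;
    int y = 11;
    std::vector<std::function<int(int)>> callbacks;
    callbacks.push_back([](int a) { return a; });              // no capture
    callbacks.push_back([x](int a) { return a + x; });         // captures x
    callbacks.push_back([x, y](int a) { return a + x + y; });  // captures x and y
    return run_all(callbacks, 1);                              // 1 + 11 + 22
}
```

Each push_back wraps a different closure type, but the vector only ever sees std::function<int(int)>.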
This is something that has always been bugging me as a feature of C++ lambda expressions: The type of a C++ lambda expression is unique and anonymous, I simply cannot write it down. Even if I create two lambdas that are syntactically exactly the same, the resulting types are defined to be distinct. The consequence is, that a) lambdas can only be passed to template functions that allow the compile time, unspeakable type to be passed along with the object, and b) that lambdas are only useful once they are type erased via std::function<>.
Ok, but that's just the way C++ does it, I was ready to write it off as just an irksome feature of that language. However, I just learned that Rust seemingly does the same: Each Rust function or lambda has a unique, anonymous type. And now I'm wondering: Why?
So, my question is this:
What is the advantage, from a language designer point of view, to introduce the concept of a unique, anonymous type into a language?
Many standards (especially C++) take the approach of minimizing how much they demand from compilers. Frankly, they demand enough already! If they don't have to specify something to make it work, they have a tendency to leave it implementation defined.
Were lambdas to not be anonymous, we would have to define them. This would have to say a great deal about how variables are captured. Consider the case of a lambda [=](){...}. The type would have to specify which types actually got captured by the lambda, which could be non-trivial to determine. Also, what if the compiler successfully optimizes out a variable? Consider:
const int i = 5; // a local variable; only automatic variables can be named in a capture list
auto f = [i]() { return i; };
An optimizing compiler could easily recognize that the only possible value of i that could be captured is 5, and replace this with auto f = []() { return 5; }. However, if the type is not anonymous, this could change the type or force the compiler to optimize less, storing i even though it didn't actually need it. This is a whole bag of complexity and nuance that simply isn't needed for what lambdas were intended to do.
And, on the off-case that you actually do need a non-anonymous type, you can always construct the closure class yourself, and work with a functor rather than a lambda function. Thus, they can make lambdas handle the 99% case, and leave you to code your own solution in the 1%.
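For that 1% case, a hand-rolled closure class might look like the sketch below. ReturnCaptured, call_twice and demo_named_closure are hypothetical names. Because the type has a name, an ordinary non-template function can mention it in its signature.

```cpp
#include <cassert>

// A closure class written by hand instead of [i]() { return i; }.
struct ReturnCaptured {
    int i;                                   // the "captured" value
    int operator()() const { return i; }
};

// Not a template: the parameter type can be written out because it has a name.
int call_twice(ReturnCaptured f) { return f() + f(); }

int demo_named_closure() {
    const int i = 5;
    return call_twice(ReturnCaptured{i});    // 5 + 5
}
```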
Deduplicator pointed out in comments that I did not address uniqueness as much as anonymity. I am less certain of the benefits of uniqueness, but it is worth noting that the behavior of the following is clear if the types are unique (action will be instantiated twice).
int counter()
{
static int count = 0;
return count++;
}
template <typename FuncT>
void action(const FuncT& func)
{
static int ct = counter();
func(ct);
}
...
for (int i = 0; i < 5; i++)
action([](int j) { std::cout << j << std::endl; });
for (int i = 0; i < 5; i++)
action([](int j) { std::cout << j << std::endl; });
If the types were not unique, we would have to specify what behavior should happen in this case. That could be tricky. Some of the issues that were raised on the topic of anonymity also raise their ugly head in this case for uniqueness.
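A compact way to observe the "instantiated twice" behaviour is the sketch below. It is a variation on the code above with made-up names (next_id, instantiation_id, and the two demo functions), and it returns values instead of printing.

```cpp
#include <cassert>

// One counter shared by every instantiation.
int next_id() {
    static int count = 0;
    return count++;
}

// Each distinct FuncT gets its own instantiation, and therefore its own static `id`.
template <typename FuncT>
int instantiation_id(const FuncT&) {
    static const int id = next_id();
    return id;
}

// One lambda-expression evaluated several times: a single closure type, a single id.
bool demo_same_expression_in_loop() {
    int first = -1;
    int last = -1;
    for (int i = 0; i < 3; ++i) {
        last = instantiation_id([](int) {});
        if (i == 0) first = last;
    }
    return first == last;
}

// Two textually identical lambda-expressions: two closure types, two ids.
bool demo_two_expressions() {
    int a = instantiation_id([](int) {});
    int b = instantiation_id([](int) {});
    return a != b;
}
```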
Lambdas are not just functions, they are a function and a state. Therefore both C++ and Rust implement them as an object with a call operator (operator() in C++, the 3 Fn* traits in Rust).
Basically, [a] { return a + 1; } in C++ desugars to something like
struct __SomeName {
int a;
int operator()() const {
return a + 1;
}
};
then using an instance of __SomeName where the lambda is used.
In Rust, || a + 1 will desugar to something like
{
struct __SomeName {
a: i32,
}
impl FnOnce<()> for __SomeName {
type Output = i32;
extern "rust-call" fn call_once(self, args: ()) -> Self::Output {
self.a + 1
}
}
// And FnMut and Fn when necessary
__SomeName { a }
}
This means that most lambdas must have different types.
Now, there are a few ways we could do that:
With anonymous types, which is what both languages implement. Another consequence of that is that all lambdas must have a different type. But for language designers, this has a clear advantage: Lambdas can be simply described using other already existing simpler parts of the language. They are just syntax sugar around already existing bits of the language.
With some special syntax for naming lambda types: This is however not necessary since lambdas can already be used with templates in C++ or with generics and the Fn* traits in Rust. Neither language ever forces you to type-erase lambdas to use them (with std::function in C++ or Box<Fn*> in Rust).
Also note that both languages do agree that trivial lambdas that do not capture context can be converted to function pointers.
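On the C++ side, that conversion looks like this. It is a minimal sketch; IntOp, apply and demo_function_pointer_conversion are names invented for the example.

```cpp
#include <cassert>

// A plain function-pointer type, as a C API might declare it.
using IntOp = int (*)(int);

int apply(IntOp op, int value) { return op(value); }

int demo_function_pointer_conversion() {
    // A captureless lambda converts implicitly to a matching function pointer.
    IntOp add_one = [](int x) { return x + 1; };
    // The conversion also happens when the lambda is passed directly.
    return apply(add_one, 1) + apply([](int x) { return 2 * x; }, 10);  // 2 + 20
}
```

Adding anything to the capture list (for example [value]) removes the conversion, and the assignment to IntOp stops compiling.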
Describing complex features of a language using simpler features is pretty common. For example both C++ and Rust have range-for loops, and they both describe them as syntax sugar for other features.
C++ defines
for (auto&& [first,second] : mymap) {
// use first and second
}
as being equivalent to
{
init-statement
auto && __range = range_expression ;
auto __begin = begin_expr ;
auto __end = end_expr ;
for ( ; __begin != __end; ++__begin) {
range_declaration = *__begin;
loop_statement
}
}
and Rust defines
for <pat> in <head> { <body> }
as being equivalent to
let result = match ::std::iter::IntoIterator::into_iter(<head>) {
mut iter => {
loop {
let <pat> = match ::std::iter::Iterator::next(&mut iter) {
::std::option::Option::Some(val) => val,
::std::option::Option::None => break
};
SemiExpr(<body>);
}
}
};
which while they seem more complicated for a human, are both simpler for a language designer or a compiler.
Cort Ammon's accepted answer is good, but I think there's one more important point to make about implementability.
Suppose I have two different translation units, "one.cpp" and "two.cpp".
// one.cpp
struct A { int operator()(int x) const { return x+1; } };
auto b = [](int x) { return x+1; };
using A1 = A;
using B1 = decltype(b);
extern void foo(A1);
extern void foo(B1);
The two overloads of foo use the same identifier (foo) but have different mangled names. (In the Itanium ABI used on POSIX-ish systems, the mangled names are _Z3foo1A and, in this particular case, _Z3fooN1bMUliE_E.)
// two.cpp
struct A { int operator()(int x) const { return x + 1; } };
auto b = [](int x) { return x + 1; };
using A2 = A;
using B2 = decltype(b);
void foo(A2) {}
void foo(B2) {}
The C++ compiler must ensure that the mangled name of void foo(A2) in "two.cpp" is the same as the mangled name of extern void foo(A1) in "one.cpp", so that we can link the two object files together. This is the physical meaning of two types being "the same type": it's essentially about ABI-compatibility between separately compiled object files.
The C++ compiler is not required to ensure that B1 and B2 are "the same type." (In fact, it's required to ensure that they're different types; but that's not as important right now.)
What physical mechanism does the compiler use to ensure that A1 and A2 are "the same type"?
It simply burrows through typedefs, and then looks at the fully qualified name of the type. It's a class type named A. (Well, ::A, since it's in the global namespace.) So it's the same type in both cases. That's easy to understand. More importantly, it's easy to implement. To see if two class types are the same type, you take their names and do a strcmp. To mangle a class type into a function's mangled name, you write the number of characters in its name, followed by those characters.
So, named types are easy to mangle.
What physical mechanism might the compiler use to ensure that B1 and B2 are "the same type," in a hypothetical world where C++ required them to be the same type?
Well, it couldn't use the name of the type, because the type doesn't have a name.
Maybe it could somehow encode the text of the body of the lambda. But that would be kind of awkward, because actually the b in "one.cpp" is subtly different from the b in "two.cpp": "one.cpp" has x+1 and "two.cpp" has x + 1. So we'd have to come up with a rule that says either that this whitespace difference doesn't matter, or that it does (making them different types after all), or that maybe it does (maybe the program's validity is implementation-defined, or maybe it's "ill-formed no diagnostic required"). Anyway, mangling lambda types the same way across multiple translation units is certainly a harder problem than mangling named types like A.
The easiest way out of the difficulty is simply to say that each lambda expression produces values of a unique type. Then two lambda types defined in different translation units definitely are not the same type. Within a single translation unit, we can "name" lambda types by just counting from the beginning of the source code:
auto a = [](){}; // a has type $_0
auto b = [](){}; // b has type $_1
auto f(int x) {
return [x](int y) { return x+y; }; // f(1) and f(2) both have type $_2
}
auto g(float x) {
return [x](int y) { return x+y; }; // g(1) and g(2) both have type $_3
}
Of course these names have meaning only within this translation unit. This TU's $_0 is always a different type from some other TU's $_0, even though this TU's struct A is always the same type as some other TU's struct A.
By the way, notice that our "encode the text of the lambda" idea had another subtle problem: lambdas $_2 and $_3 consist of exactly the same text, but they should clearly not be considered the same type!
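The $_2 versus $_3 point can be checked at compile time. The sketch below assumes C++14 (for deduced return types) and uses the made-up names make_adder, make_adder_f and demo_adders.

```cpp
#include <cassert>
#include <type_traits>

auto make_adder(int x)     { return [x](int y) { return x + y; }; }
auto make_adder_f(float x) { return [x](int y) { return x + y; }; }

// Same lambda-expression, different captured values: one closure type.
static_assert(std::is_same<decltype(make_adder(1)), decltype(make_adder(2))>::value,
              "same lambda-expression, same closure type");

// Identical text, but a different lambda-expression: a different closure type.
static_assert(!std::is_same<decltype(make_adder(1)), decltype(make_adder_f(1.0f))>::value,
              "different lambda-expressions, different closure types");

int demo_adders() {
    auto a = make_adder(1);
    auto b = make_adder(2);
    return a(10) + b(10);   // 11 + 12
}
```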
By the way, C++ does require the compiler to know how to mangle the text of an arbitrary C++ expression, as in
template<class T> void foo(decltype(T())) {}
template void foo<int>(int); // _Z3fooIiEvDTcvT__EE, not _Z3fooIiEvT_
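But C++ doesn't (yet) require the compiler to know how to mangle an arbitrary C++ statement. decltype([](){ ...arbitrary statements... }) was ill-formed before C++20; C++20 allows lambdas in unevaluated operands, but each such lambda-expression still has its own distinct type.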
Also notice that it's easy to give a local alias to an unnamed type using typedef/using. I have a feeling that your question might have arisen from trying to do something that could be solved like this.
auto f(int x) {
return [x](int y) { return x+y; };
}
// Give the type an alias, so I can refer to it within this translation unit
using AdderLambda = decltype(f(0));
int of_one(AdderLambda g) { return g(1); }
int main() {
auto f1 = f(1);
assert(of_one(f1) == 2);
auto f42 = f(42);
assert(of_one(f42) == 43);
}
EDITED TO ADD: From reading some of your comments on other answers, it sounds like you're wondering why
int add1(int x) { return x + 1; }
int add2(int x) { return x + 2; }
static_assert(std::is_same_v<decltype(add1), decltype(add2)>);
auto add3 = [](int x) { return x + 3; };
auto add4 = [](int x) { return x + 4; };
static_assert(not std::is_same_v<decltype(add3), decltype(add4)>);
That's because captureless lambdas are default-constructible. (In C++ only as of C++20, but it's always been conceptually true.)
template<class T>
int default_construct_and_call(int x) {
T t;
return t(x);
}
assert(default_construct_and_call<decltype(add3)>(42) == 45);
assert(default_construct_and_call<decltype(add4)>(42) == 46);
If you tried default_construct_and_call<decltype(&add1)>, t would be a default-initialized function pointer and you'd probably segfault. That's, like, not useful.
(Adding to Caleth's answer, but too long to fit in a comment.)
The lambda expression is just syntactic sugar for an anonymous struct (a Voldemort type, because you can't say its name).
You can see the similarity between an anonymous struct and the anonymity of a lambda in this code snippet:
#include <iostream>
#include <typeinfo>
using std::cout;
int main() {
struct { int x; } foo{5};
struct { int x; } bar{6};
cout << foo.x << " " << bar.x << "\n";
cout << typeid(foo).name() << "\n";
cout << typeid(bar).name() << "\n";
auto baz = [x = 7]() mutable -> int& { return x; };
auto quux = [x = 8]() mutable -> int& { return x; };
cout << baz() << " " << quux() << "\n";
cout << typeid(baz).name() << "\n";
cout << typeid(quux).name() << "\n";
}
If that is still unsatisfying for a lambda, it should be likewise unsatisfying for an anonymous struct.
Some languages allow for a kind of duck typing that is a little more flexible. C++ has templates, but that doesn't really help in making an object, from a template, with a member field that can hold a lambda directly rather than through a std::function wrapper.
Why design a language with unique anonymous types?
Because there are cases where names are irrelevant and not useful or even counter-productive. In this case the ability to abstract out their existence is useful because it reduces name pollution, and solves one of the two hard problems in computer science (how to name things). For the same reason, temporary objects are useful.
lambda
The uniqueness is not special to lambdas, or even to anonymous types. It applies to named types in the language as well. Consider the following:
struct A {
void operator()(){};
};
struct B {
void operator()(){};
};
void foo(A);
Note that I cannot pass B into foo, even though the classes are identical. This same property applies to unnamed types.
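A runnable version of the same point is sketched below. The call operators return int so there is something to check, and the template call and demo_nominal_typing are made-up additions.

```cpp
#include <cassert>

struct A { int operator()() const { return 1; } };
struct B { int operator()() const { return 1; } };

int foo(A a) { return a(); }        // accepts only A

template <typename F>
int call(F f) { return f(); }       // accepts any type with a suitable call operator

int demo_nominal_typing() {
    // foo(B{});  // would not compile: B is not A, despite the identical definition
    return foo(A{}) + call(A{}) + call(B{}) + call([] { return 1; });
}
```

A lambda behaves like B here: it cannot be passed to foo, but a template takes it without complaint.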
lambdas can only be passed to template functions that allow the compile time, unspeakable type to be passed along with the object ... erased via std::function<>.
There's a third option for a subset of lambdas: Non-capturing lambdas can be converted to function pointers.
Note that if the limitations of an anonymous type are a problem for a use case, then the solution is simple: A named type can be used instead. Lambdas don't do anything that cannot be done with a named class.
C++ lambdas need distinct types for distinct operations, as C++ binds statically. They are only copy/move-constructible, so mostly you don't need to name their type. But that's all somewhat of an implementation detail.
I'm not sure if C# lambdas have a type, as they are "anonymous function expressions", and they immediately get converted to a compatible delegate type or expression tree type. If they do, it's probably an unpronounceable type.
C++ also has anonymous structs, where each definition leads to a unique type. Here the name isn't unpronounceable, it simply doesn't exist as far as the standard is concerned.
C# has anonymous data types, which it carefully forbids from escaping from the scope they are defined. The implementation gives a unique, unpronounceable name to those too.
Having an anonymous type signals to the programmer that they shouldn't poke around inside their implementation.
Aside:
You can give a name to a lambda's type.
auto foo = []{};
using Foo_t = decltype(foo);
If you don't have any captures, you can use a function pointer type
void (*pfoo)() = foo;
Why use anonymous types?
For types that are automatically generated by the compiler, the choice is to either (1) honor a user's request for the name of the type, or (2) let the compiler choose one on its own.
In the former case, the user is expected to explicitly provide a name each time such a construct appears (C++/Rust: whenever a lambda is defined; Rust: whenever a function is defined). This is a tedious detail for the user to provide each time, and in the majority of cases the name is never referred to again. Thus it makes sense to let the compiler figure out a name for it automatically, and use existing features such as decltype or type inference to reference the type in the few places where it is needed.
In the latter case, the compiler needs to choose a unique name for the type, which would probably be an obscure, unreadable name such as __namespace1_module1_func1_AnonymousFunction042. The language designer could specify precisely how this name is constructed in glorious and delicate detail, but this needlessly exposes an implementation detail to the user that no sensible user could rely upon, since the name is no doubt brittle in the face of even minor refactors. This also unnecessarily constrains the evolution of the language: future feature additions may cause the existing name generation algorithm to change, leading to backward compatibility issues. Thus, it makes sense to simply omit this detail, and assert that the auto-generated type is unutterable by the user.
Why use unique (distinct) types?
If a value has a unique type, then an optimizing compiler can track a unique type across all its use sites with guaranteed fidelity. As a corollary, the user can then be certain of the places where the provenance of this particular value is fully known to the compiler.
As an example, the moment the compiler sees:
let f: __UniqueFunc042 = || { ... }; // definition of __UniqueFunc042 (assume it has a nontrivial closure)
/* ... intervening code */
let g: __UniqueFunc042 = /* some expression */;
g();
the compiler has full confidence that g must necessarily originate from f, without even knowing the provenance of g. This would allow the call to g to be devirtualized. The user would know this too, since the user has taken great care to preserve the unique type of f through the flow of data that led to g.
Necessarily, this constrains what the user can do with f. The user is not at liberty to write:
let q = if some_condition { f } else { || {} }; // ERROR: type mismatch
as that would lead to the (illegal) unification of two distinct types.
To work around this, the user could upcast the __UniqueFunc042 to the non-unique type &dyn Fn(),
let f2 = &f as &dyn Fn(); // upcast
let q2 = if some_condition { f2 } else { &|| {} }; // OK
The trade-off made by this type erasure is that uses of &dyn Fn() complicate the reasoning for the compiler. Given:
let g2: &dyn Fn() = /*expression */;
the compiler has to painstakingly examine the /*expression */ to determine whether g2 originates from f or some other function(s), and the conditions under which that provenance holds. In many circumstances, the compiler may give up: perhaps a human could tell that g2 really comes from f in all situations, but the path from f to g2 was too convoluted for the compiler to decipher, resulting in a virtual call to g2 with pessimistic performance.
This becomes more evident when such objects are delivered to generic (template) functions:
fn h<F: Fn()>(f: F);
If one calls h(f) where f: __UniqueFunc042, then h is specialized to a unique instance:
h::<__UniqueFunc042>(f);
This enables the compiler to generate specialized code for h, tailored for the particular argument of f, and the dispatch to f is quite likely to be static, if not inlined.
In the opposite scenario, where one calls h(f2) with f2: &dyn Fn(), h is instantiated as
h::<&dyn Fn()>(f2);
which is shared among all functions of type &dyn Fn(). From within h, the compiler knows very little about an opaque function of type &dyn Fn() and so could only conservatively call f with a virtual dispatch. To dispatch statically, the compiler would have to inline the call to h::<&dyn Fn()>(f2) at its call site, which is not guaranteed if h is too complex.
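For readers coming from the C++ side of the question, a rough analogue of the two versions of h is sketched below. This is only an approximation of the Rust behaviour described above, and h_static, h_erased and demo_dispatch are made-up names.

```cpp
#include <cassert>
#include <functional>

// Rough analogue of fn h<F: Fn()>(f: F): one instantiation per callable type,
// so the call to f() can be resolved statically.
template <typename F>
int h_static(F f) { return f(); }

// Rough analogue of the type-erased version: a single function shared by all
// callables, which reaches the target through an indirect call.
int h_erased(const std::function<int()>& f) { return f(); }

int demo_dispatch() {
    int base = 40;
    auto f = [base] { return base + 2; };
    return h_static(f) + h_erased(f);   // 42 + 42
}
```

Both calls give the same answer. The difference is only in how much the compiler knows about f inside each function.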
First, lambdas without captures are convertible to function pointers, so they provide some form of genericity.
Now, why are lambdas with captures not convertible to function pointers? Because the function must access the state of the lambda, so this state would need to appear as a function argument.
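The usual C-style workaround makes that extra argument explicit. In the sketch below the names Callback, invoke and demo_explicit_state are hypothetical; the state travels through a void* context parameter while the lambda itself stays captureless.

```cpp
#include <cassert>

// A C-style callback: the state is passed as an explicit extra argument.
using Callback = int (*)(void* context, int value);

int invoke(Callback cb, void* context, int value) { return cb(context, value); }

int demo_explicit_state() {
    int offset = 10;   // the state a capturing lambda would have stored
    // No capture, so the lambda converts to a function pointer;
    // it reads the state through `context` instead.
    Callback add_offset = [](void* context, int value) {
        return value + *static_cast<int*>(context);
    };
    return invoke(add_offset, &offset, 5);   // 5 + 10
}
```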
To avoid name collisions with user code.
Even two lambdas with same implementation will have different types. Which is okay because I can have different types for objects too even if their memory layout is equal.
The examples I have found that capture this in a lambda use it explicitly; e.g.:
capturecomplete = [this](){this->calstage1done();};
But it seems it is also possible to use it implicitly; e.g.:
capturecomplete = [this](){calstage1done();};
I tested this in g++, and it compiled.
Is this standard C++? (and if so, which version), or is it some form of extension?
It is standard and has been this way since C++11 when lambdas were added. According to cppreference.com:
For the purpose of name lookup, determining the type and value of the
this pointer and for accessing non-static class members, the body of
the closure type's function call operator is considered in the context
of the lambda-expression.
struct X {
int x, y;
int operator()(int);
void f()
{
// the context of the following lambda is the member function X::f
[=]()->int
{
return operator()(this->x + y); // X::operator()(this->x + (*this).y)
// this has type X*
};
}
};
It's completely standard and has been since lambdas were introduced in C++11.
You do not need to write this-> there.
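A small self-contained check of both spellings is sketched below. Calibrator is a made-up class, and calstage1done is borrowed from the question.

```cpp
#include <cassert>
#include <functional>

class Calibrator {   // hypothetical class for illustration
public:
    void calstage1done() { ++stage1_count_; }
    int stage1_count() const { return stage1_count_; }

    // Explicit this-> inside the lambda body.
    std::function<void()> explicit_form() {
        return [this]() { this->calstage1done(); };
    }
    // Implicit member access: names are looked up as in the enclosing member function.
    std::function<void()> implicit_form() {
        return [this]() { calstage1done(); };
    }

private:
    int stage1_count_ = 0;
};

int demo_this_capture() {
    Calibrator c;
    c.explicit_form()();
    c.implicit_form()();
    return c.stage1_count();   // both lambdas reached the same object
}
```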
I'm trying to design a C++ macro that needs to look something like this:
#define MY_MACRO(OBJECT, METHOD) \
[](BaseClass* obj) \
{ \
return static_cast<decltype(OBJECT)>(obj)->METHOD();\
}
Basically, a macro that translates into a lambda that calls a given method on a given object. But the lambda needs to take a base class of the object as a parameter (My use case guarantees that the cast will always work). Furthermore, the method to be called might not be on the base class.
The usage for this macro is that I have another method which I cannot modify declared as:
void Foo(std::function<int(BaseClass*)>);
and I need to be able to call it using my macro as a parameter like so:
T x;
Foo(MY_MACRO(x, method)); // match std::function<int(T*)>
However, the macro code doesn't work because I'm not capturing OBJECT, so it's not in scope when I need to pass it to decltype. Conceptually though, all the information the compiler needs is there... How can I do this? Is it possible?
A few constraints:
The lambda's parameter needs to be BaseClass. I can't make it decltype(OBJECT).
My situation does not allow me to capture OBJECT.
I don't have access to the C++14 feature of generalized lambda captures.
I need access to the type of the object without capturing it.
You can do it directly. You are required to capture only when you odr-use the named entity, and unevaluated operands, like those of decltype, don't odr-use anything. This is perfectly fine:
void f(){
int x;
[]{ decltype(x) y = 0; };
}
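Applied to the macro from the question, that could look like the sketch below. The assumptions are mine: OBJECT names a non-reference object whose type derives from BaseClass, Derived and method are invented, and call_foo_like is a stand-in for the real Foo that also takes the object to call on so the result can be checked. Note the added * in the cast, since decltype(OBJECT) is the object type, not a pointer.

```cpp
#include <cassert>
#include <functional>

struct BaseClass { virtual ~BaseClass() = default; };
struct Derived : BaseClass {        // hypothetical derived type
    int method() { return 42; }     // not available on BaseClass
};

// decltype(OBJECT) is an unevaluated operand, so OBJECT needs no capture.
#define MY_MACRO(OBJECT, METHOD)                                   \
    [](BaseClass* obj) {                                           \
        return static_cast<decltype(OBJECT)*>(obj)->METHOD();      \
    }

// Stand-in for the real Foo: takes the callback plus a target to invoke it on.
int call_foo_like(std::function<int(BaseClass*)> f, BaseClass* target) {
    return f(target);
}

int demo_macro() {
    Derived x;
    return call_foo_like(MY_MACRO(x, method), &x);
}
```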
You can add an optional parameter to the lambda with the type that you want, and use decltype on that parameter. Here's an example of the pattern, minus the macro:
int main() {
int foo = 4;
auto lambda = [](double* bar, decltype(foo)* TP = nullptr) {
return static_cast<std::remove_pointer<decltype(TP)>::type>(*bar);
};
double x = 5;
return lambda(&x);
}
I get a pointer to decltype(foo) here because pointer types can easily be defaulted to nullptr to ensure that the parameter is optional. If decltype(foo) already resolves to a pointer type, as in your case if I got it right, you wouldn't need it (and the remove_pointer).
Here's an attempt:
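template <typename T>
auto lambda_maker(int (T::* MF)())
{
// MF is a function parameter, so the lambda has to capture it
return [MF](T* p) -> int { return (p->*MF)(); };
}
// the method name is turned into a pointer to member of OBJ's type
#define MY_MACRO(OBJ, METH) lambda_maker<decltype(OBJ)>(&decltype(OBJ)::METH)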
Here's a lambda wrapper expression defined:
function <int(double)> f =
[](double x) -> int{ return static_cast <int> (x*x); };
It is used like this:
f(someintvalue);
What is the difference between normal functions and wrapped lambdas?
question is - what is the difference between normal function and wrapped lambda?
A normal function is a normal function, and what you call a "wrapped lambda" is actually a function object.
By the way, why use std::function? You could simply write this:
auto f = [](double x) { return static_cast <int> (x*x); };
//call
int result = f(100.0);
Also, I omitted the return type, as it is implicitly known to the compiler from the return expression. No need to write -> int in the lambda expression.
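To see what the wrapper adds on top of auto, here is a short sketch (demo_auto_vs_function is a made-up name). The auto variable has the lambda's own closure type, while the std::function is a separate wrapper type that can later be pointed at a different callable.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

int demo_auto_vs_function() {
    auto direct = [](double x) { return static_cast<int>(x * x); };
    std::function<int(double)> wrapped = direct;

    // The wrapper's type is not the closure type it stores.
    static_assert(!std::is_same<decltype(direct), decltype(wrapped)>::value,
                  "std::function is a wrapper, not the lambda's own type");

    int before = wrapped(3.0);                                   // 9
    // The wrapper can be re-assigned to a different lambda; `direct` cannot.
    wrapped = [](double x) { return static_cast<int>(x) + 1; };
    int after = wrapped(3.0);                                    // 4
    return direct(3.0) + before + after;                         // 9 + 9 + 4
}
```

If you never need to re-assign or store the callable behind a fixed type, auto avoids the wrapper altogether.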
Lambdas can capture surrounding scope ([=] or [&]) resulting in anonymous structs that contain a member function.
Certain tasks, like accessing lambda (types) across Translation Units [1], or taking standard pointer-to-member-function addresses from lambdas may prove hard/useless because the actual type of the automatically generated 'anonymous' type cannot be known (and is in fact implementation defined).
std::function<> was designed to alleviate exactly that problem: being able to bind a function pointer (semantically) to a lambda (or indeed, whatever callable object).
1: in fact the type can have external linkage, but you cannot portably refer to that from another translation unit because the actual type is implementation defined (and need not even be expressible in C++ code; within the original TU you'd be able to use decltype to get the magic type. Thanks, Luc for the good precisions about this)