I'm confident I get the general gist of the constructs, but I can't see the purpose of them in C++. I have read the previous posts on the topic here on SO and elsewhere, but I fail to see why they should be a new language feature.
The things I would like answered are:
What is the difference between a lambda and a template argument accepting a function/functor.
Is a closure just a functor with some set object state (scope?)?
What is the "killer app" for these constructs? or perhaps the typical use case?
Lambdas are really just syntactic sugar for a functor. You could do it all yourself: defining a new class, making member variables to hold the captured values and references, hooking them up in the constructor, writing operator()(), and finally creating an instance and passing it. Or you could use a lambda that's 1/10 as much code and works the same.
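To make the comparison concrete, here is a minimal sketch of the same predicate written both ways. The names GreaterThan, count_with_functor and count_with_lambda are made up for illustration; the point is only that the hand-written class and the lambda should behave identically.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hand-written functor: the "captured" value lives in a member variable,
// is set in the constructor, and is used by operator().
class GreaterThan {
public:
    explicit GreaterThan(int threshold) : threshold_(threshold) {}
    bool operator()(int value) const { return value > threshold_; }

private:
    int threshold_;
};

// Counts elements above `threshold` using the hand-written functor.
std::ptrdiff_t count_with_functor(const std::vector<int>& values, int threshold) {
    return std::count_if(values.begin(), values.end(), GreaterThan(threshold));
}

// Same behavior with a lambda capturing `threshold` by value.
std::ptrdiff_t count_with_lambda(const std::vector<int>& values, int threshold) {
    return std::count_if(values.begin(), values.end(),
                         [threshold](int value) { return value > threshold; });
}

// Both versions are expected to agree on any input.
void check_functor_matches_lambda() {
    const std::vector<int> values = {1, 5, 8, 10};
    assert(count_with_functor(values, 4) == 3);
    assert(count_with_lambda(values, 4) == 3);
}
```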
Lambdas which don't capture can be converted to function pointers. All lambdas can be converted to std::function, or keep their own unique type, which works well in templated algorithms accepting a functor.
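A small sketch of those three ways of receiving a lambda, assuming three illustrative helpers (apply_ptr, apply_erased, apply_generic) that are not part of any library:

```cpp
#include <cassert>
#include <functional>

// Plain function pointer parameter: only non-capturing lambdas convert to it.
int apply_ptr(int (*fn)(int), int x) { return fn(x); }

// std::function parameter: accepts any callable with a matching signature.
int apply_erased(const std::function<int(int)>& fn, int x) { return fn(x); }

// Template parameter: F is deduced as the lambda's own unique closure type.
template <typename F>
int apply_generic(F fn, int x) { return fn(x); }

void check_lambda_conversions() {
    int offset = 10;
    auto twice = [](int v) { return 2 * v; };               // no capture
    auto shifted = [offset](int v) { return v + offset; };  // captures offset

    assert(apply_ptr(twice, 4) == 8);        // converts to int (*)(int)
    assert(apply_erased(twice, 4) == 8);
    assert(apply_erased(shifted, 4) == 14);  // capturing lambda is fine here
    assert(apply_generic(shifted, 4) == 14);
    // apply_ptr(shifted, 4) would not compile: a capturing lambda has no
    // conversion to a function pointer.
}
```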
Ok, you're actually asking a bunch of different questions, possibly because you are not fully familiar with terminology. I'll try to answer all.
What's the difference between a lambda and "operator()"? - Let's reword this to, "What's the difference between a lambda and object with operator()?"
Basically, nothing. The main difference is that a lambda expression creates a function object, while an object with an operator() IS a function object. The end result is similar enough to consider them the same, though: an entity that can be invoked with the (params) syntax.
What's the difference between a closure and a functor? This is also rather confused. Please review this link:
http://en.wikipedia.org/wiki/Closure_(computer_programming)
http://en.wikipedia.org/wiki/Closure_(computer_programming)#C.2B.2B
So, as you can see, a closure is a kind of "functor" that is defined within a scope such that it absorbs the variables available to it within that scope. In other words, it's a function that is built on the fly, during the operation of the program, and that building process is parameterized by the runtime values of the scope containing it. So, in C++, closures are lambdas that use values within the function building the lambda.
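As a rough illustration of a function being "built on the fly" from runtime values, consider this sketch. The name make_adder is hypothetical; the captured variable is copied into the closure when the lambda expression is evaluated.

```cpp
#include <cassert>
#include <functional>

// Builds a function at run time; the result depends on the value of
// `amount` in the enclosing scope at the moment the lambda is created.
std::function<int(int)> make_adder(int amount) {
    return [amount](int x) { return x + amount; };  // captured by value
}

void check_make_adder() {
    auto add_two = make_adder(2);
    auto add_ten = make_adder(10);
    assert(add_two(5) == 7);
    assert(add_ten(5) == 15);
}
```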
What is the difference between a lambda and a template argument accepting a function/functor? - This is again confused. The difference is that they are nothing alike, really. A template "argument" accepting a function/functor is already confused wording, so I'll assume that by "argument" you mean "function", because arguments don't accept anything. In that case: one, although a lambda can accept a functor as an argument, the lambda itself can't be templated (in C++11, at least; C++14 added generic lambdas). Two, the lambda is generally the thing being passed as an argument to a function that accepts a functor.
Is a closure just a functor with some set object state (scope?)?
As you can see from the above link, no. In fact, a closure doesn't even have state, really. A closure is built based UPON the state of the entity that built it; within the functor, though, this isn't state, it's the very construction of the object.
What is the "killer app" for these constructs? or perhaps the typical use case?
I'll reword that to, "Why are these things useful?"
Well, in general the ability to treat any object as a function if it has operator() is extremely useful for a whole array of things. For one, it allows us to extend the behavior of any stdlib algorithm by using either objects or free functions. It's impossible to enumerate the vast number of uses this has.
More specifically speaking of lambda expressions, they simply make this process easier still. The limitations imposed by object definitions made the process of using stdlib algorithms slightly inefficient in some cases (from a development perspective, not program efficiency). For one thing, at the time at least, any object passed as a parameter to a template had to be externally defined. I believe that's also changing, but still... having to create entire objects just to perform basic things that only get used in one place is inconvenient. Lambda expressions make that definition quite easy, within the scope of the place it's being used.
After some searching and testing, I have learned the following facts about lambda expressions: 1) when we write a lambda expression, the compiler creates an anonymous function object class for it and makes the expression an instance of that class; 2) the captured variables of a lambda expression are used to initialize the member data of the created function object; 3) when we store a lambda, we actually get a named instance of the function object; 4) a generic lambda is actually a function object whose operator() is a template; 5) a stored (plain or even generic) lambda expression can be declared and defined with a template; and 6) a stored lambda expression template can even be partially specialized, just as function objects can.
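Point 4 can be sketched as follows, assuming a C++14 compiler. Doubler is a hypothetical hand-written class meant to mirror what the compiler generates for the generic lambda; the actual generated class is unnamed.

```cpp
#include <cassert>
#include <string>

// Hand-written equivalent of a generic lambda: an ordinary class whose
// operator() is a member template.
struct Doubler {
    template <typename T>
    T operator()(const T& value) const { return value + value; }
};

void check_generic_lambda() {
    // Generic lambda (C++14): `auto` makes operator() a template.
    auto doubler = [](const auto& value) { return value + value; };

    assert(doubler(21) == 42);
    assert(doubler(std::string("ab")) == "abab");
    assert(Doubler{}(21) == 42);
    assert(Doubler{}(std::string("ab")) == "abab");
}
```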
Given all the features of lambdas stated above, it seems to me that, through lambdas, we are able to do whatever we used to do with function objects, and regarding efficiency, they should have same performance.
On the other hand, lambdas also have additional advantages: 1) a lambda expression is more understandable than a function object, especially for inline, short functions; and 2) defining a stored lambda can be seen as a kind of syntactic sugar for defining a function object, and making an instance of it.
Therefore, for me, it seems that we have no reasons to define a function object manually any more.
Of course, I also have some concerns about substituting lambdas for function objects universally: 1) for functions longer than, say, 10 lines, defining them as a stored lambda may be unusual (or even awkward, I don't know), and 2) defining lambdas at file level may (or may not, I am not quite sure) cause some unexpected problems.
Here are my questions: is it wise to prefer lambdas to function objects? Are there any other advantages that function objects have but lambdas do not? Are my concerns reasonable? And are there any other concerns that I should be aware of when using lambdas instead of function objects universally?
Thanks for any reply!
Lambdas are a terse syntax for certain kinds of function objects.
They cannot be trivially constructed, they can have exactly one (possibly template) operator(), and their type cannot be named without first having access to an instance and using decltype.
As of C++14, they are not constexpr friendly, and they are not guaranteed to be trivially copyable even if their state should be.
Two lambdas with the same capture types and method do not share a type unless declared at the same spot; this can cause symbol bloat.
You cannot declare other operations in a lambda besides (), like friend bool operator< or ==, or whatever.
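A short sketch of two of these restrictions, assuming a hypothetical named functor AddN for contrast: textually identical lambdas still have distinct types, and only the named class can carry an extra operator such as ==.

```cpp
#include <cassert>
#include <type_traits>

// A named functor can carry extra operations, such as equality.
struct AddN {
    int n;
    int operator()(int x) const { return x + n; }
    friend bool operator==(const AddN& a, const AddN& b) { return a.n == b.n; }
};

void check_lambda_restrictions() {
    // Textually identical lambdas declared at different spots have
    // different closure types.
    auto first = [](int x) { return x + 1; };
    auto second = [](int x) { return x + 1; };
    static_assert(!std::is_same<decltype(first), decltype(second)>::value,
                  "each lambda expression has its own type");

    // The closure type can only be named through an instance and decltype.
    decltype(first) copy = first;
    assert(copy(1) == 2 && second(1) == 2);

    // Instances of the named functor share one type and can be compared.
    assert((AddN{1} == AddN{1}));
    assert(AddN{1}(41) == 42);
}
```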
Given these restrictions, sure, use lambdas. Terseness has lots of utility.
There are several ways to pass callable objects as parameters or to store them for future use. You can create a class with operator(), you can define a function and pass a pointer to it, and, since C++11, you can define a lambda via [](){} syntax.
I appreciate lambda syntax as a shortcut in expressions such as find_if that often beg for a compact callable expression. What I don't understand about lambdas is the desire to use them outside the point of their declaration and risk introducing dangling references and such. C++ already has a powerful way to pass callable objects around which is much safer than lambdas, and in those situations the compact expression of a lambda brings no benefit.
Thus the question: why does C++11 allow use of a lambda outside the function that declares it or the functions called from it (and therefore introduce the risk of dangling references, etc.)? Could you give an example where keeping a lambda alive outside the declaring function would be desirable?
Consider a function which is registered to be called when a future event occurs. It would be convenient to define it as a lambda, but it has to live beyond the scope in which it is defined.
For example:
m_button->setOnClick(YOUR LAMBDA GOES HERE);
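A minimal sketch of that situation, assuming a hypothetical Button class (only setOnClick appears in the line above; click, install_handler and the storage via std::function are made up for illustration). The handler is stored and invoked after the function that created it has returned.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical button type, not a real GUI API.
class Button {
public:
    void setOnClick(std::function<void()> handler) { handler_ = std::move(handler); }
    void click() { if (handler_) handler_(); }  // simulates the future event

private:
    std::function<void()> handler_;
};

// Registers a handler and returns; the lambda outlives this function.
void install_handler(Button& button, std::string& log) {
    std::string label = "clicked";     // local, gone after return
    button.setOnClick([label, &log] {  // label is copied; log must outlive the handler
        log += label;
    });
}

void check_button() {
    std::string log;
    Button button;
    install_handler(button, log);
    button.click();
    button.click();
    assert(log == "clickedclicked");
}
```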
What I don't understand about lambdas is the desire to use them outside the point of their declaration and risk introducing dangling references and such. C++ already has a powerful way to pass callable objects around which is much safer than lambdas, and in those situations the compact expression of a lambda brings no benefit.
(1) Lambda isn't implicitly less safe than any other way of defining function objects. The way of passing a lambda is exactly the same as passing an instance of a named functor.
You can store references in a named functor, and you can capture references in a lambda. Storing a reference to a local object in either of those cases is a severe bug if the function object outlives the scope where those references were bound.
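The symmetry can be sketched like this. AddTo and make_safe_accumulator are illustrative names; the sketch only exercises the references while the referenced object is alive, and shows a by-value capture as one way to avoid the lifetime problem.

```cpp
#include <cassert>
#include <functional>

// Named functor holding a reference: valid only while `total_` refers
// to a live object.
class AddTo {
public:
    explicit AddTo(int& total) : total_(total) {}
    void operator()(int x) const { total_ += x; }

private:
    int& total_;
};

// Returning a functor or lambda that refers to a local of this function
// would dangle; copying the state into the closure avoids that.
std::function<int(int)> make_safe_accumulator(int start) {
    return [start](int x) mutable { return start += x; };
}

void check_reference_lifetimes() {
    int total = 0;
    AddTo functor(total);                           // reference stored in a member
    auto lambda = [&total](int x) { total += x; };  // reference captured
    functor(1);
    lambda(2);
    assert(total == 3);  // fine: `total` is still alive here

    auto acc = make_safe_accumulator(10);
    const int first = acc(5);
    const int second = acc(5);
    assert(first == 15);
    assert(second == 20);  // the state lives inside the closure itself
}
```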
Whether the syntax of lambda is beneficial or not, is a matter of preference. I suppose, one could argue that because lambdas make the definition of functors simpler, it also makes the definition of broken functors simpler.
why does C++11 allow use of a lambda outside the function that declares it or the functions called from it (and therefore introduce the risk of dangling references, etc.)?
Firstly, I imagine such semantic limitation would be hard to implement. You can't make them non-copyable because that would make them useless in standard algorithms.
Secondly, because storing a function object for later use is useful, see (2) and using lambdas isn't more dangerous than using instances of named functors, see (1).
Could you give an example where keeping lambda live outside the declaring function would be desirable?
(2) Just about any asynchronous callback situation. std::async, std::thread, GUI and other event systems. Callable function objects will be stored for later use in those situations and typically the objects do outlive the scope where they were created.
In general, and also in this case, the advantage of lambdas over named functor types is that you get to place the function definition right where it's used. Well, you can never have the definition where it's actually used in a generic situation of asynchronous callbacks, but the point of registering the callback is as close as you can get.
The disadvantage of lambdas is their hard-for-humans-to-parse syntax, which is an explosion of different brackets, braces and parentheses. Again, this is a matter of preference.
In Effective STL, Scott Meyers back in 2001 advises:
Item 46: Consider function objects instead of functions as algorithm parameters
In said chapter, he proceeds to explain that an inline operator() can get inlined into the algorithm's body, but a passed function generally can't. This is because we are actually passing a function pointer.
In support of that I seem to remember that if a function's address is ever taken, the function can't be inlined.
So two questions here. Firstly, is this still true with C++14?
If yes:
Why is there no mechanism to do this automatically (motivation: declaring a functor is a lot less straightforward and readable than declaring a function)?
A lambda without capture is convertible to function pointer, while a capturing lambda can only be passed as a functor. Does this mean we need to capture something only for the sake of the stated optimization?
Is this still true with C++14?
Depends on whether the compiler can inline the whole algorithm. If it can, then it can probably also inline the function call. If not, then the function probably can't be inlined, because the algorithm in that case is instantiated using a function pointer type and so must be able to handle all function pointers of that type.
For instance, g++ can inline a simple algorithm like std::transform but not std::sort.
A lambda without capture is convertible to function pointer, while a capturing lambda can only be passed as a functor. Does this mean we need to capture something only for the sake of the optimization stated at the top?
No. A lambda without capture is still a functor; the algorithm is instantiated using the closure type (the type of the lambda) rather than the function pointer type.
In support of that I seem to remember that if a function's address is ever taken, the function can't be inlined.
You're reading that incorrectly. A function can be inlined into any direct call site, and into any indirect call site if the compiler can trace the function pointer. What the GCC manpage is saying is that a function that is inlined into every call site will not be emitted as a separate function at all (thus reducing binary size), unless its address is taken.
Firstly, is this still true with C++14?
Yes. Of course, now you would generally write lambdas instead of hand-crafted functors.
Why is there no mechanism to do this automatically?
It's a matter of the type system. All functions with a given signature have the same type. Thus, an algorithm that is passed a function pointer gets instantiated to one concrete function, and its code just exists once in the compilation model of C++.
Of course, an optimizer can still specialize the function on one particular argument, but that's a more advanced optimization than just inlining a functor. So yes, there is a mechanism, it's just less likely to be used.
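One way to observe this, as a sketch: give each template instantiation its own static counter. The helpers calls_for and apply_counted are hypothetical stand-ins for an algorithm; two different functions with the same signature end up sharing one instantiation, while two functor types each get their own.

```cpp
#include <cassert>

int inc(int x) { return x + 1; }
int dec(int x) { return x - 1; }

struct Inc { int operator()(int x) const { return x + 1; } };
struct Dec { int operator()(int x) const { return x - 1; } };

// One static counter exists per instantiation, i.e. per distinct F.
template <typename F>
int& calls_for() {
    static int count = 0;
    return count;
}

// Stand-in for an algorithm taking a callable.
template <typename F>
int apply_counted(F f, int x) {
    ++calls_for<F>();
    return f(x);
}

void check_instantiations() {
    const int a = apply_counted(inc, 1);
    const int b = apply_counted(dec, 1);
    assert(a == 2 && b == 0);
    // Both calls used the same instantiation, F = int (*)(int).
    assert(calls_for<int (*)(int)>() == 2);

    const int c = apply_counted(Inc{}, 1);
    const int d = apply_counted(Dec{}, 1);
    assert(c == 2 && d == 0);
    // Each functor type got its own instantiation.
    assert(calls_for<Inc>() == 1);
    assert(calls_for<Dec>() == 1);
}
```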
Does this mean we need to capture something only for the sake of the optimization stated at the top?
No. The conversion to a function pointer is possible, but unless you invoke it explicitly, it won't be done when you pass a lambda to an algorithm.
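This can be checked with a small sketch. deduced_as_pointer is a hypothetical helper that reports what type a template parameter was deduced as; a capture-free lambda stays a closure type unless the conversion is requested explicitly.

```cpp
#include <cassert>
#include <type_traits>

// Reports whether the deduced callable type is a pointer. F is the type
// an algorithm would be instantiated with.
template <typename F>
bool deduced_as_pointer(F) { return std::is_pointer<F>::value; }

int square(int x) { return x * x; }

void check_deduced_types() {
    auto lambda = [](int x) { return x * x; };

    assert(deduced_as_pointer(square));   // function decays to int (*)(int)
    assert(!deduced_as_pointer(lambda));  // closure type, no conversion applied
    assert(deduced_as_pointer(static_cast<int (*)(int)>(lambda)));  // explicit

    int (*ptr)(int) = lambda;  // the conversion also happens on assignment
    assert(ptr(3) == 9);
}
```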
When implementing a callback function in C++, should I still use the C-style function pointer:
void (*callbackFunc)(int);
Or should I make use of std::function:
std::function< void(int) > callbackFunc;
In short, use std::function unless you have a reason not to.
Function pointers have the disadvantage of not being able to capture any context. For example, you won't be able to pass a lambda as a callback if it captures some context variables (but it will work if it doesn't capture any). Calling a non-static member function of an object is thus also not possible, since the object (this-pointer) needs to be captured.(1)
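A sketch of the member-function case, assuming two hypothetical emitters (emit_ptr and emit_fn) that call the callback with the values 1, 2 and 3:

```cpp
#include <cassert>
#include <functional>

struct Counter {
    int total = 0;
    void add(int amount) { total += amount; }
};

// Function-pointer flavor: the callback cannot carry any context.
void emit_ptr(void (*callback)(int)) {
    for (int i = 1; i <= 3; ++i) callback(i);
}

// std::function flavor: the callback may capture whatever it needs.
void emit_fn(const std::function<void(int)>& callback) {
    for (int i = 1; i <= 3; ++i) callback(i);
}

void check_member_callback() {
    Counter counter;
    // The lambda captures `counter`, so it can forward to a non-static member.
    emit_fn([&counter](int v) { counter.add(v); });
    assert(counter.total == 6);

    // emit_ptr only accepts a capture-free callable, so it cannot reach
    // `counter` without a global or an extra context parameter.
    emit_ptr([](int) {});
    assert(counter.total == 6);
}
```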
std::function (since C++11) is primarily to store a function (passing it around doesn't require it to be stored). Hence if you want to store the callback for example in a member variable, it's probably your best choice. But also if you don't store it, it's a good "first choice" although it has the disadvantage of introducing some (very small) overhead when being called (so in a very performance-critical situation it might be a problem but in most it should not). It is very "universal": if you care a lot about consistent and readable code as well as don't want to think about every choice you make (i.e. want to keep it simple), use std::function for every function you pass around.
Think about a third option: if you're about to implement a small function which then reports something via the provided callback function, consider a template parameter, which can then be any callable object, i.e. a function pointer, a functor, a lambda, a std::function, ... The drawback here is that your (outer) function becomes a template and hence needs to be implemented in the header. On the other hand, you get the advantage that the call to the callback can be inlined, as the client code of your (outer) function "sees" the call to the callback with the exact type information being available.
Example for the version with the template parameter (write & instead of && for pre-C++11):
template <typename CallbackFunction>
void myFunction(..., CallbackFunction && callback) {
...
callback(...);
...
}
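A concrete variant of the same pattern might look like this. for_each_even and Summer are hypothetical names chosen for the sketch; the elided parts of the snippet above are not filled in, this is a separate example.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Calls `callback` once per even element. The concrete type of the
// callback is visible here, so the call can be inlined.
template <typename CallbackFunction>
void for_each_even(const std::vector<int>& values, CallbackFunction&& callback) {
    for (int v : values) {
        if (v % 2 == 0) callback(v);
    }
}

// Stateful functor used to show that an lvalue argument is not copied.
struct Summer {
    int total = 0;
    void operator()(int v) { total += v; }
};

void check_template_callback() {
    const std::vector<int> values = {1, 2, 3, 4};

    int sum = 0;
    for_each_even(values, [&sum](int v) { sum += v; });  // lambda
    assert(sum == 6);

    std::function<void(int)> erased = [&sum](int v) { sum -= v; };
    for_each_even(values, erased);                       // std::function
    assert(sum == 0);

    Summer summer;
    for_each_even(values, summer);  // bound by reference through &&
    assert(summer.total == 6);
}
```

With && the parameter binds to lvalues by reference, so the state accumulated in summer is visible to the caller afterwards.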
As you can see in the following table, all of them have their advantages and disadvantages:

|                                      | function ptr | std::function | template param |
|--------------------------------------|--------------|---------------|----------------|
| can capture context variables        | no (1)       | yes           | yes            |
| no call overhead (see comments)      | yes          | no            | yes            |
| can be inlined (see comments)        | no           | no            | yes            |
| can be stored in a class member      | yes          | yes           | no (2)         |
| can be implemented outside of header | yes          | yes           | no             |
| supported without C++11 standard     | yes          | no (3)        | yes            |
| nicely readable (my opinion)         | no           | yes           | (yes)          |
(1) Workarounds exist to overcome this limitation, for example passing the additional data as further parameters to your (outer) function: myFunction(..., callback, data) will call callback(data). That's the C-style "callback with arguments", which is possible in C++ (and by the way heavily used in the WIN32 API) but should be avoided because we have better options in C++.
(2) Unless we're talking about a class template, i.e. the class in which you store the function is a template. But that would mean that on the client side the type of the function decides the type of the object which stores the callback, which is almost never an option for actual use cases.
(3) For pre-C++11, use boost::function
void (*callbackFunc)(int); may be a C style callback function, but it is a horribly unusable one of poor design.
A well designed C style callback looks like void (*callbackFunc)(void*, int); -- it has a void* to allow the code that does the callback to maintain state beyond the function. Not doing this forces the caller to store state globally, which is impolite.
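A minimal sketch of that shape, with hypothetical names (IntCallback, produce_values, Sum). The void* lets the consumer keep its state in a local object instead of a global.

```cpp
#include <cassert>

// C-style callback type: the first parameter carries caller-defined state.
typedef void (*IntCallback)(void* user_data, int value);

// Hypothetical producer: reports the values 1..count through the callback.
void produce_values(int count, IntCallback callback, void* user_data) {
    for (int i = 1; i <= count; ++i) callback(user_data, i);
}

// Consumer state that would otherwise have to be a global.
struct Sum {
    int total;
};

void add_to_sum(void* user_data, int value) {
    static_cast<Sum*>(user_data)->total += value;
}

void check_c_callback() {
    Sum sum = {0};
    produce_values(4, add_to_sum, &sum);
    assert(sum.total == 10);
}
```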
std::function< int(int) > ends up being slightly more expensive than int(*)(void*, int) invocation in most implementations. It is also harder for some compilers to inline. There are std::function clone implementations that rival function pointer invocation overheads (see 'fastest possible delegates' etc.) that may make their way into libraries.
Now, clients of a callback system often need to set up resources and dispose of them when the callback is created and removed, and to be aware of the lifetime of the callback. void(*callback)(void*, int) does not provide this.
Sometimes this is available via code structure (the callback has limited lifetime) or through other mechanisms (unregister callbacks and the like).
std::function provides a means for limited lifetime management (the last copy of the object goes away when it is forgotten).
In general, I'd use a std::function unless performance concerns manifest. If they did, I'd first look for structural changes (instead of a per-pixel callback, how about generating a scanline processor based off of the lambda you pass me? which should be enough to reduce function-call overhead to trivial levels.). Then, if it persists, I'd write a delegate based off fastest possible delegates, and see if the performance problem goes away.
I would mostly only use function pointers for legacy APIs, or for creating C interfaces for communicating between code generated by different compilers. I have also used them as internal implementation details when I am implementing jump tables, type erasure, etc.: when I am both producing and consuming it, and am not exposing it externally for any client code to use, and function pointers do all I need.
Note that you can write wrappers that turn a std::function<int(int)> into an int(void*,int) style callback, assuming there is proper callback lifetime management infrastructure. So as a smoke test for any C-style callback lifetime management system, I'd make sure that wrapping a std::function works reasonably well.
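Such a wrapper could be sketched as a "trampoline" function. call_c_api stands in for a hypothetical C-style API and invoke_function is an illustrative name; lifetime management is deliberately left to the caller here.

```cpp
#include <cassert>
#include <functional>

// Hypothetical C-style API: stores nothing, just invokes the callback.
int call_c_api(int (*callback)(void*, int), void* context, int argument) {
    return callback(context, argument);
}

// Trampoline: recovers the std::function from the context pointer.
int invoke_function(void* context, int argument) {
    auto* fn = static_cast<std::function<int(int)>*>(context);
    return (*fn)(argument);
}

void check_trampoline() {
    int factor = 3;
    std::function<int(int)> scale = [factor](int x) { return x * factor; };
    // `scale` must stay alive for as long as the C API may call back.
    const int result = call_c_api(invoke_function, &scale, 7);
    assert(result == 21);
}
```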
Use std::function to store arbitrary callable objects. It allows the user to provide whatever context is needed for the callback; a plain function pointer does not.
If you do need to use plain function pointers for some reason (perhaps because you want a C-compatible API), then you should add a void * user_context argument so it's at least possible (albeit inconvenient) for it to access state that's not directly passed to the function.
The only reason to avoid std::function is support of legacy compilers that lack this template, which was introduced in C++11.
If supporting pre-C++11 language is not a requirement, using std::function gives your callers more choice in implementing the callback, making it a better option compared to "plain" function pointers. It offers the users of your API more choice, while abstracting out the specifics of their implementation for your code that performs the callback.
std::function may bring a virtual method table (VMT) into the code in some cases, which has some impact on performance.
The other answers answer based on technical merits. I'll give you an answer based on experience.
As a very heavy X-Windows developer who always worked with function pointer callbacks with void* pvUserData arguments, I started using std::function with some trepidation.
But I found that, combined with the power of lambdas and the like, it has freed up my work considerably to be able to, at a whim, throw multiple arguments in, re-order them, ignore parameters the caller wants to supply but I don't need, etc. It really makes development feel looser and more responsive, saves me time, and adds clarity.
On this basis I'd recommend anyone to try using std::function any time they'd normally have a callback. Try it everywhere, for like six months, and you may find you hate the idea of going back.
Yes there's some slight performance penalty, but I write high-performance code and I'm willing to pay the price. As an exercise, time it yourself and try to figure out whether the performance difference would ever matter, with your computers, compilers and application space.
I have just read the classic book "Effective C++, 3rd Edition", and in item 20 the author concludes that built-in types, STL iterators and function object types are more appropriate for pass-by-value. I could well understand the reason for built-in and iterators types, but why should the function object be pass-by-value, as we know it is class-type anyway?
In a typical case, a function object will have little or (more often) no persistent state. In such a case, passing by value may not require actually passing anything at all -- the "value" that's passed is basically little or nothing more than a placeholder for "this is the object".
Given the small amount of code in many function objects, that leads to a further optimization: it's often fairly easy for the compiler to expand the code for the function object inline, so no parameters get passed, and no function call is involved at all.
A compiler may be able to do the same when you pass a pointer or reference instead, but it's not quite as easy -- it's a lot more common that you'll end up with an object being created, its address passed, and then the function call operator for that object being invoked via that pointer.
Edit: It's probably also worth mentioning that the same applies to lambdas, since they're really just function objects in disguise. You don't know the name of the class, but they create a class in the immediately surrounding scope that overloads the function call operator, which is what gets invoked when you "call" the lambda. [Thanks @Mark Garcia.]
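As a small sketch of the stateless case: IsOdd and count_matching are illustrative names, and the static_assert only confirms that the functor has no data members, not that any particular compiler inlines the call.

```cpp
#include <cassert>
#include <type_traits>

// Stateless predicate: no data members, so a copy carries no data.
struct IsOdd {
    bool operator()(int x) const { return x % 2 != 0; }
};

// Takes the predicate by value, the way standard algorithms do.
template <typename Pred>
int count_matching(const int* first, const int* last, Pred pred) {
    int matches = 0;
    for (; first != last; ++first) {
        if (pred(*first)) ++matches;
    }
    return matches;
}

void check_pass_by_value() {
    static_assert(std::is_empty<IsOdd>::value, "IsOdd has no state to copy");
    const int data[] = {1, 2, 3, 4, 5};
    const int odd = count_matching(data, data + 5, IsOdd());
    const int even = count_matching(data, data + 5,
                                    [](int x) { return x % 2 == 0; });
    assert(odd == 3);
    assert(even == 2);
}
```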
The #1 reason to pass function objects by value is because the standard library requires that function objects you pass to its algorithms be copyable. C++11 §25.1/10:
[ Note: Unless otherwise specified, algorithms that take function objects as arguments are permitted to copy
those function objects freely. Programmers for whom object identity is important should consider using a
wrapper class that points to a noncopied implementation object such as reference_wrapper<T> (20.8.3),
or some equivalent solution. —end note ]
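The effect of that note can be sketched with std::for_each and a stateful functor (CallCounter is a made-up name). Passed by value, the algorithm updates a copy; wrapped in std::ref, the calls reach the original object.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Stateful function object: counts how many times it is called.
struct CallCounter {
    int calls = 0;
    void operator()(int) { ++calls; }
};

void check_copy_vs_ref() {
    const std::vector<int> values = {1, 2, 3};

    CallCounter by_value;
    std::for_each(values.begin(), values.end(), by_value);
    assert(by_value.calls == 0);  // the algorithm worked on a copy

    CallCounter by_ref;
    std::for_each(values.begin(), values.end(), std::ref(by_ref));
    assert(by_ref.calls == 3);    // reference_wrapper forwards to the original
}
```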
The other answers do a great job of explaining the rationale.
From Effective STL (since you seem to like Scott Meyers), Item 38: Design functor classes for pass-by-value.
"In both C and C++ function pointers are passed by value. STL Function objects are modeled after function pointers, so the convention in the STL is that function objects, too, are passed by value when passed to and from functions."
This has some benefits and some implications. As @Jerry Coffin said, the compiler can make some optimizations, like inlining the code to avoid function calls (you have to make your functor's operator() inline). A good example of this is the qsort vs std::sort performance comparison, where std::sort using inline functors outperforms qsort by a lot; you can find more information on this in Effective STL, where it is discussed extensively and mentioned in several chapters.
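The two call styles being compared look roughly like this. The sketch only checks that both produce the same ordering; it makes no timing claim, and sort_with_qsort and sort_with_std_sort are illustrative wrappers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// qsort comparator: called through a function pointer with void* arguments.
int compare_ints(const void* a, const void* b) {
    const int lhs = *static_cast<const int*>(a);
    const int rhs = *static_cast<const int*>(b);
    return (lhs > rhs) - (lhs < rhs);
}

std::vector<int> sort_with_qsort(std::vector<int> values) {
    if (!values.empty()) {
        std::qsort(values.data(), values.size(), sizeof(int), compare_ints);
    }
    return values;
}

// std::sort is instantiated with the lambda's own type, so the comparison
// is a candidate for inlining.
std::vector<int> sort_with_std_sort(std::vector<int> values) {
    std::sort(values.begin(), values.end(),
              [](int lhs, int rhs) { return lhs < rhs; });
    return values;
}

void check_sorts_agree() {
    const std::vector<int> input = {5, -1, 3, 3, 0};
    const std::vector<int> expected = {-1, 0, 3, 3, 5};
    assert(sort_with_qsort(input) == expected);
    assert(sort_with_std_sort(input) == expected);
}
```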
There are also several implications: since function objects are passed and returned by value, you have to make sure your objects have well-defined copy mechanisms, are small in size (otherwise copying could get expensive), and are monomorphic (since passing polymorphic objects by value may result in object slicing).