Passing a lambda is really easy in C++11:
func( []( int arg ) {
// code
} ) ;
But I'm wondering, what is the cost of passing a lambda to a function like this? What if func passes the lambda to other functions?
void func( function< void (int arg) > f ) {
doSomethingElse( f ) ;
}
Is the passing of the lambda expensive? Since a function object can be assigned 0,
function< void (int arg) > f = 0 ; // 0 means "not init"
it leads me to think that function objects kind of act like pointers. But without use of new, then it means they might be like value-typed struct or classes, which defaults to stack allocation and member-wise copy.
How are a C++11 "code body" and its group of captured variables passed when you pass a function object "by value"? Is there a lot of excess copying of the code body? Should I mark each function object parameter with const& so that a copy is not made:
void func( const function< void (int arg) >& f ) {
}
Or do function objects somehow pass differently than regular C++ structs?
Disclaimer: my answer is somewhat simplified compared to the reality (I put some details aside) but the big picture is here. Also, the Standard does not fully specify how lambdas or std::function must be implemented internally (the implementation has some freedom) so, like any discussion on implementation details, your compiler may or may not do it exactly this way.
But again, this is a subject quite similar to VTables: the Standard doesn't mandate much but any sensible compiler is still quite likely to do it this way, so I believe it is worth digging into it a little. :)
Lambdas
The most straightforward way to implement a lambda is kind of an unnamed struct:
auto lambda = [](Args...) -> Return { /*...*/ };
// roughly equivalent to:
struct {
Return operator ()(Args...) { /*...*/ }
}
lambda; // instance of the unnamed struct
Just like any other class, when you pass its instances around you never have to copy the code, just the actual data (here, none at all).
Objects captured by value are copied into the struct:
Value v;
auto lambda = [=](Args...) -> Return { /*... use v, captured by value...*/ };
// roughly equivalent to:
struct Temporary { // note: we can't make it an unnamed struct any more since we need
// a constructor, but that's just a syntax quirk
const Value v; // note: capture by value is const by default unless the lambda is mutable
Temporary(Value v_) : v(v_) {}
Return operator ()(Args...) { /*... use v, captured by value...*/ }
}
lambda(v); // instance of the struct
Again, passing it around only means that you pass the data (v) not the code itself.
Likewise, objects captured by reference are referenced into the struct:
Value v;
auto lambda = [&](Args...) -> Return { /*... use v, captured by reference...*/ };
// roughly equivalent to:
struct Temporary {
Value& v; // note: capture by reference is non-const
Temporary(Value& v_) : v(v_) {}
Return operator ()(Args...) { /*... use v, captured by reference...*/ }
}
lambda(v); // instance of the struct
That's pretty much all there is to lambdas themselves (except for a few implementation details I omitted, which are not relevant to understanding how it works).
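To make the point concrete, here is a minimal sketch that counts how often a captured object is copied when the closure is handed to a function template. The Counted type and the by_value/by_cref helpers are hypothetical names invented for this illustration. Assuming no std::function is involved, passing the closure by value copies the captured data once and passing it by const reference copies nothing; the code body itself is never copied.

```cpp
#include <cassert>

// Hypothetical helper type: counts copy constructions of captured objects.
struct Counted {
    static int copies;
    Counted() {}
    Counted(const Counted&) { ++copies; }
};
int Counted::copies = 0;

// The closure type is deduced here, so no std::function wrapper is created.
template <class F> void by_value(F f) { f(); }
template <class F> void by_cref(const F& f) { f(); }

// Number of Counted copies caused by passing the closure by value.
int copies_when_passed_by_value() {
    Counted c;
    auto lambda = [c] { (void)c; };  // c is copied into the closure here
    int before = Counted::copies;
    by_value(lambda);                // copies the closure and its captured Counted
    return Counted::copies - before;
}

// Number of Counted copies caused by passing the closure by const reference.
int copies_when_passed_by_cref() {
    Counted c;
    auto lambda = [c] { (void)c; };
    int before = Counted::copies;
    by_cref(lambda);                 // only a reference to the closure is passed
    return Counted::copies - before;
}
```

Calling copies_when_passed_by_value() and copies_when_passed_by_cref() from main shows the difference between the two parameter styles.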
std::function
std::function is a generic wrapper around any kind of functor (lambdas, standalone/static/member functions, functor classes like the ones I showed, ...).
The internals of std::function are pretty complicated because they must support all those cases. Depending on the exact type of functor this requires at least the following data (give or take implementation details):
A pointer to a standalone/static function.
Or,
A pointer to a copy[see note below] of the functor (dynamically allocated to allow any type of functor, as you rightly noted).
A pointer to the member function to be called.
A pointer to an allocator that is able to both copy the functor and itself (since any type of functor can be used, the pointer-to-functor should be void* and thus there has to be such a mechanism -- probably using polymorphism aka. base class + virtual methods, the derived class being generated locally in the template<class Functor> function(Functor) constructors).
Since it doesn't know beforehand which kind of functor it will have to store (and this is made obvious by the fact that std::function can be reassigned) then it has to cope with all possible cases and make the decision at runtime.
Note: I don't know where the Standard mandates it but this is definitely a new copy, the underlying functor is not shared:
int v = 0;
std::function<void()> f = [=]() mutable { std::cout << v++ << std::endl; };
std::function<void()> g = f;
f(); // 0
f(); // 1
g(); // 0
g(); // 1
So, when you pass a std::function around, it involves at least those four pointers (and indeed, on 64-bit GCC 4.7, sizeof(std::function<void()>) is 32, which is four 64-bit pointers) and optionally a dynamically allocated copy of the functor (which, as I already said, only contains the captured objects; you don't copy the code).
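As a rough illustration of what this means for parameter passing, the sketch below counts copies of a captured object when a std::function is passed by value versus by const reference. Counted, take_by_value and take_by_cref are hypothetical names made up for this example, and the counts assume a typical implementation that clones the stored functor when the std::function itself is copied.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper type: counts copy constructions only (moves are free).
struct Counted {
    static int copies;
    Counted() {}
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) noexcept {}
};
int Counted::copies = 0;

void take_by_value(std::function<void()> f) { f(); }
void take_by_cref(const std::function<void()>& f) { f(); }

// Copies of the captured object caused by passing the wrapper by value.
int copies_for_by_value() {
    Counted c;
    std::function<void()> f = [c] { (void)c; };
    int before = Counted::copies;
    take_by_value(f);  // copies the std::function, which copies its stored functor
    return Counted::copies - before;
}

// Copies of the captured object caused by passing the wrapper by const reference.
int copies_for_by_cref() {
    Counted c;
    std::function<void()> f = [c] { (void)c; };
    int before = Counted::copies;
    take_by_cref(f);   // no new std::function object, so nothing is copied
    return Counted::copies - before;
}
```

Only the difference measured around the call is reported, because the number of copies made while constructing the std::function in the first place is an implementation detail.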
Answer to the question
what is the cost of passing a lambda to a function like this?[context of the question: by value]
Well, as you can see it depends mainly on your functor (either a hand-made struct functor or a lambda) and the variables it contains. The overhead compared to directly passing a struct functor by value is quite negligible, but it is of course much higher than passing a struct functor by reference.
Should I have to mark each function object passed with const& so that a copy is not made?
I'm afraid this is very hard to answer in a generic way. Sometimes you'll want to pass by const reference, sometimes by value, sometimes by rvalue reference so that you can move it. It really depends on the semantics of your code.
The rules concerning which one you should choose are a totally different topic IMO, just remember that they are the same as for any other object.
Anyway, you now have all the keys to make an informed decision (again, depending on your code and its semantics).
See also C++11 lambda implementation and memory model
A lambda-expression is just that: an expression. Once compiled, it results in a closure object at runtime.
5.1.2 Lambda expressions [expr.prim.lambda]
The evaluation of a lambda-expression results in a prvalue temporary
(12.2). This temporary is called the closure object.
The object itself is implementation-defined and may vary from compiler to compiler.
Here is the original implementation of lambdas in clang
https://github.com/faisalv/clang-glambda
If the lambda can be made as a simple function (i.e. it does not capture anything), then it is made exactly the same way. Especially as standard requires it to be compatible with the old-style pointer-to-function with the same signature. [EDIT: it's not accurate, see discussion in comments]
For the rest it is up to the implementation, but I wouldn't worry ahead of time. The most straightforward implementation does nothing but carry the information around, exactly as much as you asked for in the capture. So the effect would be the same as if you had created a class manually, or used some std::bind variant.
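The compatibility with plain function pointers mentioned above can be seen in a small sketch. The names apply and add_one_via_pointer are hypothetical; the only language feature relied on is the implicit conversion of a captureless lambda to a function pointer with the same signature.

```cpp
#include <cassert>

// An old-style API that only accepts a plain function pointer.
int apply(int (*fn)(int), int value) { return fn(value); }

int add_one_via_pointer(int value) {
    // A lambda that captures nothing converts implicitly to int (*)(int).
    int (*fn)(int) = [](int x) { return x + 1; };
    // A lambda with a capture, e.g. [value](int x) { ... }, would not convert.
    return apply(fn, value);
}
```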
Related
I came across a C++17 code base where functions always accept parameters by value, even if a const reference would work, and no semantic reason for passing by value is apparent. The code then explicitly uses a std::move when calling functions. For instance:
A retrieveData(DataReader reader) // reader could be passed by const reference.
{
A a = { };
a.someField = reader.retrieveField(); // retrieveField is a const function.
return a;
}
auto someReader = constructDataReader();
auto data = retrieveData(std::move(someReader)); // Calls explicitly use move all the time.
Defining functions with value parameters by default and counting on move semantics like this seems like a bad practice, but is it? Is this really faster/better than simply passing lvalues by const reference, or perhaps creating a && overload for rvalues if needed?
I'm not sure how many copies modern compilers would do in the above example in case of a call without an explicit move on an lvalue, i.e. retrieveData(r).
I know a lot has been written on the subject of moving, but would really appreciate some clarification here.
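For reference, the copies and moves in question can be counted directly. The following is only a sketch: this DataReader is a hypothetical stand-in with instrumented constructors, and byValue, byConstRef, Cost and measure are names invented for the measurement.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for the real DataReader, instrumented to count transfers.
struct DataReader {
    static int copies;
    static int moves;
    DataReader() {}
    DataReader(const DataReader&) { ++copies; }
    DataReader(DataReader&&) noexcept { ++moves; }
    int retrieveField() const { return 7; }
};
int DataReader::copies = 0;
int DataReader::moves = 0;

int byValue(DataReader reader) { return reader.retrieveField(); }
int byConstRef(const DataReader& reader) { return reader.retrieveField(); }

struct Cost { int copies; int moves; };

// Resets the counters, runs the call, and reports what it cost.
template <class Call>
Cost measure(Call call) {
    DataReader::copies = 0;
    DataReader::moves = 0;
    call();
    return Cost{DataReader::copies, DataReader::moves};
}

Cost costOfLvalueByValue()    { DataReader r; return measure([&r] { byValue(r); }); }
Cost costOfMovedByValue()     { DataReader r; return measure([&r] { byValue(std::move(r)); }); }
Cost costOfLvalueByConstRef() { DataReader r; return measure([&r] { byConstRef(r); }); }
```

In this sketch the by-value function costs one copy for an lvalue and one move for an explicitly moved argument, while the const reference version costs neither.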
std::bind and std::thread share a few design principles. Since both of them store local objects corresponding to the passed arguments, we need to use either std::ref or std::cref if reference semantics are desired:
void f(int& i, double d) { /*...*/ }
void g() {
int x = 10;
std::bind(f, std::ref(x), _1) (3.14);
std::thread t1(f, std::ref(x), 3.14);
//...
}
But I'm intrigued by a recent personal discovery: std::bind will allow you to pass a value in the case above, even though this is not what one usually wants.
std::bind(f, x, _1) (3.14); // Usually wrong, but valid.
However, this is not true for std::thread. The following will trigger a compile error.
std::thread t2(f, x, 3.14); // Usually wrong and invalid: Error!
At first sight I thought this was a compiler bug, but the error is indeed legitimate. It seems that the templated version of std::thread's constructor is not able to correctly deduce the arguments due to the decay-copy requirement (transforming int& into int) imposed by 30.3.1.2.
The question is: Why not require something similar to std::bind's arguments? Or is this apparent inconsistency intended?
Note: I explained why it's not a duplicate in the comment below.
The function object returned by bind is designed for reuse (i.e., it can be called multiple times); it therefore must pass its bound arguments as lvalues, because you don't want to move from said arguments or later calls would see a moved-from bound argument. (Similarly, you want the function object to be called as an lvalue, too.)
This concern is inapplicable to std::thread and friends. The thread function will only be called once, with the provided arguments. It's perfectly safe to move from them because nothing else is going to be looking at them. They are effectively temporary copies, made just for the new thread. Thus the function object is called as an rvalue and the arguments are passed as rvalues.
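The consequence for a function taking int& can be checked at compile time. This is a sketch using a hand-written detection helper (can_call is a hypothetical name, not a standard trait); it only models the value category in which the stored copy is handed to the function, not the full machinery of bind or thread.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

void f(int&, double) {}
typedef void (*Fn)(int&, double);

// Hypothetical detection helper: is F callable with arguments of these types
// and value categories? (Args = int means an rvalue, int& means an lvalue.)
template <class F, class... Args>
auto can_call(int)
    -> decltype(std::declval<F>()(std::declval<Args>()...), std::true_type{});
template <class F, class... Args>
std::false_type can_call(...);

// bind-style call: the stored int copy is passed as an lvalue, which binds to int&.
const bool bind_style_call_ok = decltype(can_call<Fn, int&, double>(0))::value;

// thread-style call: the stored int copy is passed as an rvalue, which cannot
// bind to a non-const int&, hence the compile error.
const bool thread_style_call_ok = decltype(can_call<Fn, int, double>(0))::value;
```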
std::bind was mostly obsolete when it arrived due to the existence of lambdas. With C++14 improvements and C++17 std::apply, the remaining use cases for bind are pretty much gone.
Even in C++11, the cases where bind solved a problem that a lambda did not solve were relatively rare.
On the other hand, std::thread was solving a slightly different problem. It doesn't need the flexibility of bind to "solve every issue", instead it could block what would usually be bad code.
In the bind case, the reference passed to f won't be x but rather a reference to an internally stored copy of x. This is extremely surprising.
#include <functional> // std::bind
#include <iostream>
void f(int& x) {
++x;
std::cout << x << '\n';
}
int main() {
int x = 0;
auto b = std::bind(f, x);
b();
b();
b();
std::cout << x << '\n';
}
prints
1
2
3
0
where the last 0 is the original x, while 1, 2 and 3 are the incremented copy of x stored within b.
With lambda, the difference between a mutable stored state and an external reference can be made clear.
auto b = [&x]{ f(x); };
vs
auto b = [x]()mutable{ f(x); };
one of which copies x then invokes f repeatedly on it, the other passes a reference to x into f.
There really isn't a way to do this with bind without allowing f to access the stored copy of x as a reference.
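The two lambda variants can be compared side by side. This is a small sketch with a silent version of f and two hypothetical wrapper functions that report the caller's x after three invocations.

```cpp
#include <cassert>

void f(int& x) { ++x; }

// [&x]: the closure refers to the caller's x, so the caller sees the increments.
int after_reference_capture() {
    int x = 0;
    auto b = [&x] { f(x); };
    b(); b(); b();
    return x;
}

// [x] mutable: the closure owns a private copy; the caller's x is untouched.
int after_copy_capture() {
    int x = 0;
    auto b = [x]() mutable { f(x); };
    b(); b(); b();
    return x;
}
```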
For std::thread, if you want this mutable local copy behavior you just use a lambda.
std::thread t1([x]()mutable{ f(x); });
In fact, I would argue most of the INVOKE syntax in C++11 seems to be a legacy of not having C++14 power lambdas and std::apply in the language. There are very few cases not solved by lambda and std::apply (apply is needed as lambdas don't easily support moving packs into them and then extracting them inside).
But we don't have a time machine, so we have these multiple parallel ways to express the idea of invoking something in a specific context in C++.
From what I can tell, thread started off with basically the same rules as bind, but was modified in 2010 by N3090 to take on the constraint you've identified.
Using that to bisect the various contributions, I believe you're looking for LWG issue 929.
Ironically, the intention seems to have been to make the thread constructor less constrained. Certainly there's no mention of bind, although this wording was later also applied to async ("Clean-up" section after LWG 1315), so I would say bind got left behind.
It's quite hard to be sure, though, so I would recommend asking the committee itself.
Please note that there are previous answers concerning template functions or member functions but this question is only about non-template non-member functions. std::move() returns T&&, for example but it is a template function.
Is there ever a good reason to use T&& as a return type from a non-template and non-member function, where T is any arbitrary type?
For example when would you ever use the following?
T&& fn()
{
....
return ....
}
I have seen examples where this is used but in all the examples, the developer misunderstood move semantics and should have returned by value and taken advantage of NRVO.
Suppose we were to have a particle system. We want to implement this as a pool of particles. This means that we are going to want to recycle the same particles over and over again, so to reuse resources, we will want to pass around rvalues.
Now, suppose that our particles have a very short life time, but we want something to happen when they "expire" (like incrementing an integer x) but we still want to recycle them. Now, suppose that we want to be able to specify x. But, now what do we do?
On move, we want to be able to call a function to increment a variable, but that variable must fluctuate. We wouldn't want to put this into a destructor because this would involve either templates to derive the exact function call at compile time or we would need a std::function or function pointer to refer to the function from inside a particle, wasting space. What we want to do is to be able to take in an expiring value, do an action, and then forward it. In other words, we want an outgoing-move with a side effect specifically on the conversion from lvalue to rvalue.
It does matter at which point you perform the action: on the lvalue-to-rvalue conversion, or on receiving the value into another Particle with operator= or operator(). Either you do it when an object receives a value or when you take the rvalue in. But suppose we want to do this for many different types of objects: do you want to write 5 or more different move functions for each of 5 different classes, or should we perhaps parameterize over the types with an external templated function?
How else would you do this? You're still going to use std::move() from inside the object, but we want to couple it with a function external to the object, because this would represent a side effect with a move.
Coliru: http://coliru.stacked-crooked.com/a/0ff11890c16f1621
#include <iostream>
#include <string>
struct Particle {
int i;
};
template <typename T>
T&& move( T& obj, int (*fn)(int) , int& i ) {
i = fn(i);
return(std::move(obj));
}
int increment(int i) { return i+1; }
int main() {
// Have some object pool we want to optimize by moving instead of creating new ones.
// We'll use rvalue semantics so we can "inherit" lifetime instead of copying.
Particle pool[2000];
// Fill up the "particles".
for (auto i = 0; i < 2000; ++i) {
pool[i].i = i;
}
// Perform the moves with side effects.
int j = 0;
for (auto i = 0; i < 1999; ++i) {
pool[i+1] = move<Particle>(pool[i], &increment, j);
std::cout << "Moves performed: " << j << '\n';
}
}
Sometimes, it would be handy to have a method that returns *this as an l-value reference, so that you can pass a temporary to a function that accepts an l-value reference.
func(foo_t().lval_ref());
So, why would people want to do this? Suppose a function takes an l-value reference as an output variable. If the caller does not actually need this output, it would be convenient to pass a temporary instead of defining some dummy variable unused, and then static_cast<void>(unused); to suppress any warnings of unused variable.
Similarly, one may want to have a method that returns *this as an r-value reference, which is used the same way as std::move(). Doing this allows you greater flexibility. You can do any tricky things you want within the method implementation :-)
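A minimal sketch of the l-value case might look like this. The body of foo_t and the function func are assumptions made for illustration; only the name lval_ref comes from the text above.

```cpp
#include <cassert>

struct foo_t {
    int value;
    foo_t() : value(0) {}
    // Returns *this as an l-value reference, even when called on a temporary.
    foo_t& lval_ref() { return *this; }
};

// A function with an output parameter the caller may not care about.
int func(foo_t& out) {
    out.value = 42;
    return out.value;
}

// The temporary lives until the end of the full expression, so this is safe.
int call_with_temporary() { return func(foo_t().lval_ref()); }
```

The reference must not be stored beyond the call, since the temporary is destroyed at the end of the statement.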
Often times I see transformer functions that will take a parameter by reference, and also return that same parameter as a function's return value.
For example:
std::string& Lowercase(std::string & str){
std::transform(str.begin(), str.end(), str.begin(), ::tolower);
return str;
}
I understand that this is done as a convenience, and I am under the impression that the compiler will optimize for cases when the return value is not actually used. However, I don't believe the compiler can optimize for newly created return values of non-basic types. For example:
std::tuple<int,std::string,float> function(int const& num, std::string const& str, float const& f){
return std::tuple<int,std::string,float>(num,str,f);
}
Constructors could do almost anything, and although the return type isn't used, it does not mean it would be safe to avoid creating the type. However, in this case, it would be advantageous to not create the type when the return value of the function isn't used.
Is there some kind of way to notify the compiler that if the return type is not being used, it's safe to avoid the creation of the type? This would be function specific, and a decision of the programmers; not something that the compiler could figure out on its own.
In the case of function, if it is not inlined, the compiler might not optimize it away since the tuple has a non-trivial constructor. However, if the function is inlined, the compiler might optimize away the unused return object if its lifetime affects none of the arguments. Also, since tuple is a standard type, I believe most compilers will optimize away that returned variable.
In general, there are two ways the compiler can optimize your code:
(Named) Return value optimization (RVO/NRVO)
R-value reference
RVO
Take following code as example:
struct A {
int v;
};
A foo(int v) {
A a;
a.v = v;
return a;
}
void bar(A& a, int v) {
a.v = v;
}
A f;
f = foo(1);
A b;
bar(b, 1);
In function foo, the variable a is constructed, its member v is modified, and it is returned. In the hand-optimized version bar, a is passed in, modified, and handed back to the caller.
Obviously, f and b are the same.
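And if you know more about C++, you will know that when returning from foo, the result a is copied to the outer f, and the dtor of a is called.
The optimization that turns foo into something like bar is called RVO: it omits the copy of the internal a to the outer f, so modifications go directly to the caller's variable.
Please beware that there are cases where RVO won't apply, such as multiple return statements that return different objects. Note that compilers are allowed to perform this elision even when the copy ctor has side effects, so don't rely on those side effects.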
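Whether the copy is actually elided can be observed by instrumenting the constructors. The sketch below adds a hypothetical transfers counter to A; the result depends on the compiler and its settings, so no particular count is promised here (with NRVO applied it is typically zero).

```cpp
#include <cassert>

struct A {
    static int transfers;  // counts copy and move constructions
    int v;
    A() : v(0) {}
    A(const A& other) : v(other.v) { ++transfers; }
    A(A&& other) noexcept : v(other.v) { ++transfers; }
};
int A::transfers = 0;

A foo(int v) {
    A a;
    a.v = v;
    return a;  // candidate for NRVO: a may be built directly in the caller's object
}

// Reports how many copy/move constructions initializing f from foo(1) needed.
int transfers_for_initialization() {
    A::transfers = 0;
    A f = foo(1);  // initialization, not assignment, so elision can apply
    assert(f.v == 1);
    return A::transfers;
}
```

Printing the return value of transfers_for_initialization() with and without optimizations is a quick way to see what a given compiler does.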
Here is a detailed explanation:
MSDN
CppWiki
RValue
Another way to optimize return values is to use the move constructor (taking an rvalue reference) introduced in C++11. Generally speaking, it's like calling std::swap(vector_lhs, vector_rhs) to swap the internal data pointers and avoid a deep copy.
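The pointer hand-over can be observed with std::vector. This is a small sketch (move_steals_buffer is a hypothetical name) that checks whether the moved-to vector ends up owning the very same buffer instead of a deep copy.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Returns true if move construction transferred the buffer instead of copying it.
bool move_steals_buffer() {
    std::vector<int> source(1000, 7);
    const int* original = source.data();          // remember where the elements live
    std::vector<int> target = std::move(source);  // move constructor, no deep copy
    return target.data() == original && target.size() == 1000;
}
```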
Here is a very good article about optimization:
http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/
Last but not least
At GoingNative 2013, Andrei Alexandrescu gave a talk about how to write fast C++ code. Passing by reference is faster than relying on rvalues (and RVO has some limitations), so if you do care about performance, please pass by reference.
I've got a C++ data-structure that is a required "scratchpad" for other computations. It's not long-lived, and it's not frequently used so not performance critical. However, it includes a random number generator amongst other updatable tracking fields, and while the actual value of the generator isn't important, it is important that the value is updated rather than copied and reused. This means that in general, objects of this class are passed by reference.
If an instance is only needed once, the most natural approach is to construct it wherever needed (perhaps using a factory method or a constructor), and then pass the scratchpad to the consuming method. Consumers' method signatures use pass by reference since they don't know this is the only use, but factory methods and constructors return by value - and you can't pass unnamed temporaries by reference.
Is there a way to avoid clogging the code with nasty temporary variables? I'd like to avoid things like the following:
scratchpad_t<typeX<typeY,potentially::messy>, typename T> useless_temp = factory(rng_parm);
xyz.initialize_computation(useless_temp);
I could make the scratchpad intrinsically mutable and just label all parameters const &, but that doesn't strike me as best-practice since it's misleading, and I can't do this for classes I don't fully control. Passing by rvalue reference would require adding overloads to all consumers of scratchpad, which kind of defeats the purpose - having clear and concise code.
Given the fact that performance is not critical (but code size and readability are), what's the best-practice approach to passing in such a scratchpad? Using C++0x features is OK if required but preferably C++03-only features should suffice.
Edit: To be clear, using a temporary is doable, it's just unfortunate clutter in code I'd like to avoid. If you never give the temporary a name, it's clearly only used once, and the fewer lines of code to read, the better. Also, in constructors' initializers, it's impossible to declare temporaries.
While it is not okay to pass rvalues to functions accepting non-const references, it is okay to call member functions on rvalues, but the member function does not know how it was called. If you return a reference to the current object, you can convert rvalues to lvalues:
class scratchpad_t
{
// ...
public:
scratchpad_t& self()
{
return *this;
}
};
void foo(scratchpad_t& r)
{
}
int main()
{
foo(scratchpad_t().self());
}
Note how the call to self() yields an lvalue expression even though scratchpad_t() is an rvalue.
Please correct me if I'm wrong, but Rvalue reference parameters don't accept lvalue references so using them would require adding overloads to all consumers of scratchpad, which is also unfortunate.
Well, you could use templates...
template <typename Scratch> void foo(Scratch&& scratchpad)
{
// ...
}
If you call foo with an rvalue parameter, Scratch will be deduced to scratchpad_t, and thus Scratch&& will be scratchpad_t&&.
And if you call foo with an lvalue parameter, Scratch will be deduced to scratchpad_t&, and because of reference collapsing rules, Scratch&& will also be scratchpad_t&.
Note that the formal parameter scratchpad is a name and thus an lvalue, no matter if its type is an lvalue reference or an rvalue reference. If you want to pass scratchpad on to other functions, you don't need the template trick for those functions anymore, just use an lvalue reference parameter.
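The two deduction cases can be verified with a type trait. In this sketch scratchpad_t is an empty placeholder type, and deduced_as_lvalue_reference, lvalue_case and rvalue_case are hypothetical names used only to report what Scratch was deduced to.

```cpp
#include <cassert>
#include <type_traits>

struct scratchpad_t {};  // placeholder type for the illustration

// Reports whether the template parameter was deduced as an lvalue reference.
template <typename Scratch>
bool deduced_as_lvalue_reference(Scratch&&) {
    return std::is_lvalue_reference<Scratch>::value;
}

// Called with a named object: Scratch is deduced as scratchpad_t&.
bool lvalue_case() {
    scratchpad_t named;
    return deduced_as_lvalue_reference(named);
}

// Called with a temporary: Scratch is deduced as plain scratchpad_t.
bool rvalue_case() {
    return deduced_as_lvalue_reference(scratchpad_t());
}
```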
By the way, you do realize that the temporary scratchpad involved in xyz.initialize_computation(scratchpad_t(1, 2, 3)); will be destroyed as soon as initialize_computation is done, right? Storing the reference inside the xyz object for later use would be an extremely bad idea.
self() doesn't need to be a member method, it can be a templated function
Yes, that is also possible, although I would rename it to make the intention clearer:
template <typename T>
T& as_lvalue(T&& x)
{
return x;
}
Is the problem just that this:
scratchpad_t<typeX<typeY,potentially::messy>, typename T> useless_temp = factory(rng_parm);
is ugly? If so, then why not change it to this?:
auto useless_temp = factory(rng_parm);
Personally, I would rather see const_cast than mutable. When I see mutable, I'm assuming someone's doing logical const-ness, and don't think much of it. const_cast however raises red flags, as code like this should.
One option would be to use something like shared_ptr (auto_ptr would work too depending on what factory is doing) and pass it by value, which avoids the copy cost and maintains only a single instance, yet can be passed in from your factory method.
If you allocate the object on the heap you might be able to convert the code to something like:
std::auto_ptr<scratch_t> create_scratch();
foo( *create_scratch() );
The factory creates and returns an auto_ptr instead of an object on the stack. The returned auto_ptr temporary will take ownership of the object, but you are allowed to call non-const methods on a temporary and you can dereference the pointer to get a real reference. At the next sequence point the smart pointer will be destroyed and the memory freed. If you need to pass the same scratch_t to different functions in a row you can just capture the smart pointer:
std::auto_ptr<scratch_t> s( create_scratch() );
foo( *s );
bar( *s );
This can be replaced with std::unique_ptr in the upcoming standard.
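A sketch of the same pattern with std::unique_ptr could look as follows. The contents of scratch_t and the body of foo are assumptions made for illustration, as are the one_shot and reused wrappers.

```cpp
#include <cassert>
#include <memory>

struct scratch_t {
    int uses;
    scratch_t() : uses(0) {}
};

// Factory returning ownership of a heap-allocated scratchpad.
std::unique_ptr<scratch_t> create_scratch() {
    return std::unique_ptr<scratch_t>(new scratch_t());
}

// A consumer that takes the scratchpad by non-const reference.
int foo(scratch_t& s) { return ++s.uses; }

// One-shot use: the temporary unique_ptr is destroyed at the end of the statement.
int one_shot() { return foo(*create_scratch()); }

// Reuse: keep the smart pointer alive across several calls.
int reused() {
    std::unique_ptr<scratch_t> s = create_scratch();
    foo(*s);
    foo(*s);
    return s->uses;
}
```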
I marked FredOverflow's response as the answer for his suggestion to use a method to simply return a non-const reference; this works in C++03. That solution requires a member method per scratchpad-like type, but in C++0x we can also write that method more generally for any type:
template <typename T> T & temp(T && temporary_value) {return temporary_value;}
This function simply forwards normal lvalue references, and converts rvalue references into lvalue references. Of course, doing this returns a modifiable value whose result is ignored - which happens to be exactly what I want, but may seem odd in some contexts.
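For completeness, here is a small usage sketch of temp. The scratchpad type and the consume function are hypothetical stand-ins for a real scratchpad and its consumer.

```cpp
#include <cassert>

// Hypothetical scratchpad with an updatable tracking field.
struct scratchpad {
    int draws;
    scratchpad() : draws(0) {}
    int next() { return ++draws; }
};

template <typename T> T& temp(T&& temporary_value) { return temporary_value; }

// A consumer that must update the scratchpad, so it takes a non-const reference.
int consume(scratchpad& s) {
    s.next();
    s.next();
    return s.draws;
}

// Unnamed temporary: valid until the end of the full expression.
int with_temporary() { return consume(temp(scratchpad())); }

// Named object: temp simply forwards the lvalue, so updates remain visible.
int with_named() {
    scratchpad s;
    consume(temp(s));
    return s.draws;
}
```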