Why std::thread accepts a functor by forwarding reference - c++

Why does a std::thread object accept the function parameter by forwarding reference and then make a copy of the object with decay_copy? Wouldn't it be easier to just accept the function object by value?
In general why would one not template a function so as to take a function object by value? Can reference-ness not be imitated with reference_wrappers (which would be more explicit, and also conveniently has a member operator() to call the stored function)?

Why does a std::thread object accept the function parameter by forwarding reference and then make a copy of the object with decay_copy? Wouldn't it be easier to just accept the function object by value?
It has to have a copy of the function object in a storage that it can guarantee lasts as long as the thread it is about to launch lasts.
A by-value parameter of the std::thread constructor would not last that long: the constructor returns, and its parameters are destroyed, long before the created thread may end.
So it must make a copy. If it took its argument by-value, it would make a copy when the constructor was called, then have to make another copy to the persistent storage. By taking it by forwarding reference, it makes exactly one copy.
Now, this second copy could be a move, which reduces the overhead to one extra move. That is still overhead, as not all types are cheap to move.
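To make the counting concrete, here is a minimal sketch that compares the two signatures. Tracker, ByValueHolder, ForwardingHolder and the global counters are hypothetical names invented for this illustration; they only imitate the "store a copy" step and are not how any standard library implements std::thread.

```cpp
#include <cassert>
#include <utility>

// Hypothetical counters and helper type, used only to observe copies and moves.
int g_copies = 0;
int g_moves = 0;

struct Tracker {
    Tracker() {}
    Tracker(const Tracker&) { ++g_copies; }
    Tracker(Tracker&&) noexcept { ++g_moves; }
    void operator()() const {}
};

// Takes the callable by value, then moves the parameter into storage.
struct ByValueHolder {
    Tracker stored;
    explicit ByValueHolder(Tracker f) : stored(std::move(f)) {}
};

// Takes the callable by forwarding reference and builds the stored copy directly.
struct ForwardingHolder {
    Tracker stored;
    template <class F>
    explicit ForwardingHolder(F&& f) : stored(std::forward<F>(f)) {}
};

struct Counts { int copies; int moves; };

template <class Holder>
Counts count_for_lvalue() {
    g_copies = g_moves = 0;
    Tracker t;
    Holder h(t);                 // the caller keeps t
    return Counts{g_copies, g_moves};
}

template <class Holder>
Counts count_for_rvalue() {
    g_copies = g_moves = 0;
    Tracker t;
    Holder h(std::move(t));      // the caller gives t away
    return Counts{g_copies, g_moves};
}

void check_counts() {
    assert(count_for_lvalue<ByValueHolder>().moves == 1);     // the extra move
    assert(count_for_lvalue<ForwardingHolder>().moves == 0);  // one copy and nothing else
}
```

With an lvalue argument the by-value version performs one copy plus one move, while the forwarding version performs one copy only. With std::move(t) it is two moves against one.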
In general why would one not template a function so as to take a function object by value?
Because that mandates an extra move.
Can reference-ness not be imitated with reference_wrappers (which would be more explicit, and also conveniently has a member operator() to call the stored function)?
In the case where you intend to store the function object, taking by forwarding reference saves a move, and doesn't require much extra work on the part of the writer of the function-object taking code.
If the caller passed in a reference wrapper, the value stored would be the reference wrapper itself, so the stored callable would refer to the caller's object instead of owning a copy of it, which has a different meaning.
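A small sketch of that difference, assuming a hypothetical Counter functor and a hypothetical run_from_stored_copy helper that stores a decayed copy of its argument (similar in spirit to decay_copy) and then calls it twice:

```cpp
#include <cassert>
#include <functional>
#include <type_traits>
#include <utility>

// Hypothetical stateful functor: counts how often it was called.
struct Counter {
    int calls = 0;
    void operator()() { ++calls; }
};

// Stores a decayed copy of the argument, then invokes the stored object.
template <class F>
void run_from_stored_copy(F&& f) {
    typename std::decay<F>::type stored(std::forward<F>(f));
    stored();
    stored();
}

int calls_seen_by_caller_plain() {
    Counter c;
    run_from_stored_copy(c);            // stores a copy of c
    return c.calls;                     // the caller's object was never called
}

int calls_seen_by_caller_ref() {
    Counter c;
    run_from_stored_copy(std::ref(c));  // stores a std::reference_wrapper<Counter>
    return c.calls;                     // both calls reached the caller's object
}

void check_reference_wrapper() {
    assert(calls_seen_by_caller_plain() == 0);
    assert(calls_seen_by_caller_ref() == 2);
}
```

In the std::ref case the caller also takes over responsibility for keeping the object alive for as long as the stored wrapper is used.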

Related

Why should arguments be passed by value when used to initialize another object?

When passing objects to functions there is the choice to pass arguments either by value or by const&. Especially when the object is possibly expensive to create and it is internally mutated or used to initialize another object the recommendation is to pass the object by value. For example:
class Foo {
    std::vector<std::string> d_strings;
public:
    Foo(std::vector<std::string> strings): d_strings(std::move(strings)) {}
    // ...
};
The conventional approach would be to declare the strings parameter as std::vector<std::string> const& and copy the argument. The value argument to the constructor above also needs to be copied!
Why is it preferable to pass by value rather than pass by const&?
When passing the strings argument by const& there is a guaranteed copy: there is no other way to get hold of a, well, copy of the argument other than copying it. The question becomes: how is that different when passing by value?
When the argument is passed by value the strings object is clearly used nowhere else and its content can be moved. Move construction of objects that are expensive to copy may still be comparatively cheap. For example, in the case of the std::vector<std::string> the move is just copying a few pointers to the new object and setting a few pointers to indicate to the original object that it shouldn't release anything.
There is still the need to create the argument, though. However, the creation of the argument may be elided without creating a new object. For example
Foo f(std::vector<std::string>{ "one", "two", "three" });
will create a vector with three strings, and construction of the argument to the Foo constructor is most likely elided. In the worst case, the argument is constructed by moving the temporary vector, avoiding a copy, too.
There are, of course, cases where a copy still needs to be created. For example, in the case
std::vector<std::string> v{ "one", "two", "three" };
Foo f(v);
The argument is created by a copy. The ready-made copy is then moved to the member. In this case pass by const& would have been better, because it needs only a copy construction rather than a copy construction (to create the argument) followed by a move construction (to create the member).
That is, passing by value enables possibly eliding a copy entirely and just having a move. In the worst case, when the argument needs to be copied anyway, an additional move needs to be performed. Since the move is generally assumed to be a cheap operation the expectation is that overall pass by value for objects which need to be transferred results in better performance.
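One way to observe this is to check whether the member ends up owning the very same heap buffer as the caller's vector. This is only a sketch: FooByValue, FooByConstRef and the buffer() accessor are hypothetical names added for the comparison, and the check relies on std::vector's move constructor handing over its buffer.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

class FooByValue {
    std::vector<std::string> d_strings;
public:
    explicit FooByValue(std::vector<std::string> strings) : d_strings(std::move(strings)) {}
    const std::string* buffer() const { return d_strings.data(); }
};

class FooByConstRef {
    std::vector<std::string> d_strings;
public:
    explicit FooByConstRef(const std::vector<std::string>& strings) : d_strings(strings) {}
    const std::string* buffer() const { return d_strings.data(); }
};

// The caller gives the vector away: by value, only moves happen.
bool by_value_reuses_buffer_of_rvalue() {
    std::vector<std::string> v{"one", "two", "three"};
    const std::string* original = v.data();
    FooByValue f(std::move(v));
    return f.buffer() == original;
}

// Same call, but const& forces a real copy into a new buffer.
bool by_const_ref_reuses_buffer_of_rvalue() {
    std::vector<std::string> v{"one", "two", "three"};
    const std::string* original = v.data();
    FooByConstRef f(std::move(v));
    return f.buffer() == original;
}

// The caller keeps the vector: the argument has to be copied.
bool by_value_reuses_buffer_of_lvalue() {
    std::vector<std::string> v{"one", "two", "three"};
    const std::string* original = v.data();
    FooByValue f(v);
    return f.buffer() == original;
}

void check_buffers() {
    assert(by_value_reuses_buffer_of_rvalue());
    assert(!by_const_ref_reuses_buffer_of_rvalue());
    assert(!by_value_reuses_buffer_of_lvalue());
}
```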
The statement
Arguments should be passed by value when used to initialize another object
is true, starting with C++11, thanks to the introduction of move semantics.
It could be generalized to:
When a function needs a copy of one of its arguments, pass it by value.
This is actually well detailed in the "Want Speed? Pass by Value." article.
The outline is that, since your function will need a copy of the argument anyway, it is better to have this copy handled at the call site instead of inside the called function. This is because whether the object to be copied is an rvalue is only known at the call site, which is what enables the move optimization: the calling context is well aware that the object is expiring, and thus can move it into the copy that the function requires. If the copy were made inside the function itself, the knowledge that the source object (in the calling context) was an rvalue would not be carried through to the actual place of the copy, losing the opportunity for a move.

call by value vs const call by reference

I am a little confused about the differences between call by value and const call by reference. Could someone please explain this to me? For example, do they both protect against changing the caller's argument, are they fast for all object sizes, does either copy the argument while the other doesn't, and which uses more memory when making a copy?
do they both protect against changing the callers argument
Passing by value creates a copy of the argument provided by the caller, so whatever the function does, it does it on a separate object. This means the original object won't ever be touched, so in this case the answer is "Yes".
Passing by reference to const, on the other hand, lets the function refer to the very same object that the caller provided, but it won't let that function modify it... Unless (as correctly remarked by Luchian Grigore in the comments) the function's implementer uses const_cast<> to cast away the const-ness from the reference, which is something that can be safely done only if it is known that the object bound to the reference was not declared to be of a const type (otherwise, you would get Undefined Behavior).
Since this does not seem to be the most likely scenario considering your question, and considering that in general accepting a reference to const represents a promise that the argument won't be touched, then the answer is that as long as we assume this promise to be fulfilled, passing by reference to const won't alter the argument provided by the caller. So the answer is "Yes" again - with the little caveat I mentioned above.
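For illustration, here is a sketch of that caveat with hypothetical function names. It only exercises the well-defined case, where the object bound to the reference was not declared const:

```cpp
#include <cassert>

// Promises not to modify its argument, then breaks the promise.
void sneaky_increment(const int& x) {
    const_cast<int&>(x) += 1;   // defined only because the caller's object is not const
}

// Works on its own copy; the caller's object cannot be reached from here.
int increment_by_value(int x) {
    x += 1;
    return x;
}

int after_sneaky_increment() {
    int n = 41;                 // not declared const, so the cast above is legal
    sneaky_increment(n);
    return n;
}

int after_increment_by_value() {
    int n = 41;
    increment_by_value(n);
    return n;
}

void check_const_cast() {
    assert(after_sneaky_increment() == 42);    // the caller's object changed
    assert(after_increment_by_value() == 41);  // the caller's object is untouched
}
```

Calling sneaky_increment on an object declared as const int would be undefined behavior, which is why the sketch does not do it.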
are they fast for all object sizes
No. Although you should first define "fast". If the type of the object being passed is expensive to copy (or to move, if a move is performed rather than a copy), then passing by value might be slow. Passing by reference will always cost you the same (the size of an address), no matter what the type of the value you are passing is.
Notice that on some architectures and for some data types (like char) passing by value could be faster than passing by reference, while the opposite is generally true for large enough UDTs.
and which use more memory when making a copy?
Since only one of them is causing a copy, the question has an obvious answer.
The main difference is that passing by const reference (or non-const) doesn't make a copy of the argument. (the copy is actually subject to copy elision, but theoretically it's a copy that's passed to the function when you pass by value)
In some cases, passing by value is just as fast, or even faster (typically when the object is at most the size of a register). You'd usually pass basic types by value, and class-types by reference.
When passing by const reference you can still modify the original value just by casting the const away (via const_cast), but that results in undefined behavior if the original value is const.
Call by value copies the whole object. It does protect the caller's argument, because anything you change is only a copy.
Call by const reference does not copy the object, but because of the "const" it still protects the caller's argument.
Use a const reference.
I suppose that you mean the difference between:
void Fn1(MyType x);
and
void Fn2(const MyType& x);
In the former case, a copy of the object is always created, which makes it slower, especially if the type has a non-trivial copy constructor. The original object will be unaffected by any changes done on the copy within the function, but the copy itself can be changed.
The latter example will not create a copy and will in general be faster. Inside the function, only the const functions can be called on the argument (unless you resort to dirty tricks like casting away constness), thus guaranteeing that the object will not be modified.
IMPORTANT: This discussion doesn't cover types with special semantics, like smart pointers. In that case, call by value will still allow you to change what is logically the same object, i.e. not the smart ptr instance itself but the object it points to.
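A minimal sketch of the smart pointer case, using a hypothetical Account type and deposit function:

```cpp
#include <cassert>
#include <memory>

struct Account { int balance; };

// The shared_ptr itself is copied, but both copies point to the same Account.
void deposit(std::shared_ptr<Account> acct, int amount) {
    acct->balance += amount;   // visible to the caller: same pointee
    acct.reset();              // not visible to the caller: only the local copy is reset
}

int balance_after_deposit() {
    std::shared_ptr<Account> a = std::make_shared<Account>(Account{100});
    deposit(a, 50);
    assert(a != nullptr);      // the caller's pointer is still set
    return a->balance;
}
```

Here balance_after_deposit() returns 150 even though deposit takes its argument by value.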
So here are the answers to your questions:
do they both protect against changing the callers argument: yes, the original object will remain unchanged (excluding tricks)
are they fast for all object sizes: they are not equally fast. Call by reference is in general faster, except for some primitive types, where the speed is more or less the same or call by value may even be marginally faster, depending on compiler optimizations.
does either copy the argument while the other doesn't: call by value creates a copy, call by reference doesn't
which use more memory when making a copy? call by reference doesn't create a copy so the answer is clear
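One other point worth mentioning is that when a function taking a reference is inlined, the compiler can often eliminate the reference entirely. Taking parameters by reference does not by itself make a function inline.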

Should templated functions take lambda arguments by value or by rvalue reference?

GCC 4.7 in C++11 mode is letting me define a function taking a lambda two different ways:
// by value
template<class FunctorT>
void foo(FunctorT f) { /* stuff */ }
And:
// by r-value reference
template<class FunctorT>
void foo(FunctorT&& f) { /* stuff */ }
But not:
// by reference
template<class FunctorT>
void foo(FunctorT& f) { /* stuff */ }
I know that I can un-template the functions and just take std::functions instead, but foo is small and inline and I'd like to give the compiler the best opportunity to inline the calls to f it makes inside. Out of the first two, which is preferable for performance if I specifically know I'm passing lambdas, and why isn't it allowed to pass lambdas to the last one?
FunctorT&& is a universal reference (also called a forwarding reference) and can match anything, not only rvalues. It's the preferred way to pass things in C++11 templates, unless you absolutely need copies, since it allows you to employ perfect forwarding. Access the value through std::forward<FunctorT>(f), which will make f an rvalue again if it was one before, or else will leave it as an lvalue.
FunctorT& is just a simple lvalue reference, and you can't bind temporaries (the result of a lambda expression) to that.
When you create a lambda function you get a temporary object. You cannot bind a temporary to a non-const l-value reference. Actually, you cannot directly create an l-value referencing a lambda function.
When you declare your function template using T&&, the parameter type of the function will be T const& if you pass a const l-value to the function, T& if you pass a non-const l-value to it, and T&& if you pass it a temporary. That is, when passing a temporary the function declaration will take an r-value reference which can be passed without moving an object. When passing the argument explicitly by value, a temporary object is conceptually copied or moved although this copy or move is typically elided. If you only pass temporary objects to your functions, the first two declarations would do the same thing, although the first declaration could introduce a move or copy.
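The deduction rules can be checked with a couple of type traits. In this sketch, deduced_as_lvalue_ref and deduced_as_const are hypothetical helpers that simply report what FunctorT was deduced to:

```cpp
#include <cassert>
#include <type_traits>

// True when FunctorT was deduced as an lvalue reference (an lvalue was passed).
template <class FunctorT>
bool deduced_as_lvalue_ref(FunctorT&&) {
    return std::is_lvalue_reference<FunctorT>::value;
}

// True when the referred-to type is const.
template <class FunctorT>
bool deduced_as_const(FunctorT&&) {
    return std::is_const<typename std::remove_reference<FunctorT>::type>::value;
}

bool lvalue_case() {
    auto f = [] {};
    return deduced_as_lvalue_ref(f) && !deduced_as_const(f);  // parameter type is T&
}

bool const_lvalue_case() {
    const auto f = [] {};
    return deduced_as_lvalue_ref(f) && deduced_as_const(f);   // parameter type is T const&
}

bool temporary_case() {
    return deduced_as_lvalue_ref([] {});                      // parameter type is T&&
}

void check_deduction() {
    assert(lvalue_case());
    assert(const_lvalue_case());
    assert(!temporary_case());
}
```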
This is a good question -- the first part: pass-by-value or use forwarding. I think the second part (having FunctorT& as an argument) has been reasonably answered.
My advice is this: use forwarding only when the function object is known, in advance, to modify values in its closure (or capture list). Best example: std::shuffle. It takes a Uniform Random Number Generator (a function object), and each call to the generator modifies its state. The function object is forwarded into the algorithm.
In every other case, you should prefer to pass by value. This does not prevent you from capturing locals by reference and modifying them within your lambda function. That will work just like you think it should. There should be no overhead for copying, as Dietmar says. Inlining will also apply and references may be optimized out.
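As a sketch of what pass by value does and does not affect, assume a hypothetical repeat function that takes its functor by value:

```cpp
#include <cassert>

// Takes the functor by value and calls it n times.
template <class FunctorT>
void repeat(int n, FunctorT f) {
    for (int i = 0; i < n; ++i) f();
}

// A local captured by reference is still modified through the copied lambda.
int total_after_reference_capture() {
    int total = 0;
    repeat(3, [&total] { ++total; });
    return total;
}

// State captured by value lives inside the lambda; repeat advances only its own copy.
int first_call_on_callers_copy() {
    int count = 0;
    auto counter = [count]() mutable { return ++count; };
    repeat(3, counter);
    return counter();   // the caller's lambda has not been advanced
}

void check_repeat() {
    assert(total_after_reference_capture() == 3);
    assert(first_call_on_callers_copy() == 1);
}
```

The second function is the case where forwarding (or std::ref) would matter: the caller's own functor state is not updated by a by-value call.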

When should I use a reference?

Newbie here, I am reading some code, and I see sometimes the author used the reference in a function as
funca (scalar& a)
// Etc
Sometimes he just uses
funcb (scalar a)
// Etc
What's the difference? Is using a reference a good habit that I should have?
Thank you!
If you call foo(scalar a), the argument a of type scalar will be COPIED from the caller and foo will have its own COPY of the original object.
If you call foo(scalar &b), the argument b will be just a reference to the original object, so you will be able to modify it.
It's faster to pass an object by reference using the &name syntax, since it avoids creating a copy of the given object, but it can be potentially dangerous and sometimes the behavior is unwanted because you simply want an actual copy.
That being said, there's actually an option that disallows the ability to modify the original object for the called function yet avoids creating a copy. It's foo(const scalar &x) which explicitly states that the caller does not want the function foo to modify the object passed as an argument.
Optional reading, carefully:
There's also a way of passing an argument as a raw pointer, which is rare in modern C++. Use with caution: foo(scalar *a). The caller has to provide the address of an object instead of the object itself in this scenario, so the caller would call foo(&a). For the called function foo to be able to modify the object itself in this case, it would need to dereference the pointer a, like this in foo: *a =. The star in front of the variable name says that we want to modify the object stored at the address we have received (the address of the object a, which the caller passed as &a) rather than the address itself.
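The four ways of passing an argument described above, side by side. This is a sketch that assumes scalar is just an alias for double, and the function names are made up for the comparison:

```cpp
#include <cassert>

using scalar = double;   // assumption: the question does not say what scalar is

void by_value(scalar a)            { a = 0; }            // changes only the local copy
void by_reference(scalar& a)       { a = 0; }            // changes the caller's object
scalar by_const_reference(const scalar& a) { return a * 2; }  // read-only access, no copy
void by_pointer(scalar* a)         { if (a) *a = 0; }    // changes the object at the address

scalar value_after_by_value()     { scalar x = 5; by_value(x);     return x; }
scalar value_after_by_reference() { scalar x = 5; by_reference(x); return x; }
scalar value_after_by_pointer()   { scalar x = 5; by_pointer(&x);  return x; }

void check_passing_styles() {
    assert(value_after_by_value() == 5);       // the caller's x is untouched
    assert(value_after_by_reference() == 0);   // the caller's x was modified
    assert(value_after_by_pointer() == 0);     // modified through the pointer
    assert(by_const_reference(5) == 10);
}
```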
Passing a parameter by reference allows the called function to modify its argument in a way that will be visible to the caller after the function returns, while passing by value means that any changes will be limited in scope to the called function. Therefore passing by (non-const) reference typically signifies that the callee intends to modify the argument or, less commonly, use it as an additional "return value".
Additionally, passing by reference means that no copy of the parameter needs to be made; passing by value requires such a copy (which may be detrimental for the memory footprint or runtime performance of your application). For this reason you will often see arguments of class type being passed as a const reference: the callee does not intend to modify the argument but it also wants to avoid a copy being made. Scalar arguments are of very small size, so they do not benefit from this approach.
See also Pass by Reference / Value in C++.
Call by value (funcb (scalar a)) will give the function a copy of the argument, so changes made to the argument are not visible to the caller.
Call by reference (funca (scalar& a)) means that the function operates directly on the argument, so any changes made are directly visible to the caller.
Whether or not call by reference is a good practice depends on the circumstances. If you need the function to modify the argument (and the modifications to be visible to the caller), you obviously want to use call by reference. If you don't want to modify the argument, using non-const reference arguments is misleading (since the signature indicates the argument could be changed), so call by value is more appropriate here. Of course, for more complex types call by value can have a non-trivial overhead. In these cases call by const reference is preferable (funcc(const scalar& c)).

Should I copy an std::function or can I always take a reference to it?

In my C++ application (using Visual Studio 2010), I need to store an std::function, like this:
class MyClass
{
public:
    typedef std::function<int(int)> MyFunction;
    MyClass (MyFunction &myFunction);
private:
    MyFunction m_myFunction; // Should I use this one?
    MyFunction &m_myFunction; // Or should I use this one?
};
As you can see, I added the function argument as a reference in the constructor.
But, what is the best way to store the function in my class?
Can I store the function as a reference since std::function is just a function-pointer and the 'executable code' of the function is guaranteed to stay in memory?
Do I have to make a copy in case a lambda is passed and the caller returns?
My gut feeling says that it's safe to store a reference (even a const-reference). I expect the compiler to generate code for the lambda at compile time, and keep this executable code in 'virtual' memory while the application is running. Therefore the executable code is never 'deleted' and I can safely store a reference to it. But is this really true?
Can I store the function as a reference since std::function is just a function-pointer and the 'executable code' of the function is guaranteed to stay in memory?
std::function is very much not just a function pointer. It's a wrapper around an arbitrary callable object, and manages the memory used to store that object. As with any other type, it's safe to store a reference only if you have some other way to guarantee that the referred object is still valid whenever that reference is used.
Unless you have a good reason for storing a reference, and a way to guarantee that it remains valid, store it by value.
Passing by const reference to the constructor is safe, and probably more efficient than passing a value. Passing by non-const reference is a bad idea, since it prevents you from passing a temporary, so the user can't directly pass a lambda, the result of bind, or any other callable object except std::function<int(int)> itself.
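A sketch of that recommendation: take a const reference, store by value. The call member and the make_adder factory are hypothetical additions that only exist to show that the stored copy outlives the lambda and the temporary std::function it was created from.

```cpp
#include <cassert>
#include <functional>

class MyClass
{
public:
    typedef std::function<int(int)> MyFunction;
    // A const reference binds to temporaries, so a lambda can be passed directly.
    explicit MyClass(const MyFunction& myFunction) : m_myFunction(myFunction) {}
    int call(int x) const { return m_myFunction(x); }
private:
    MyFunction m_myFunction;   // stored by value: independent of the caller's object
};

MyClass make_adder(int offset) {
    // The temporary std::function built from this lambda is gone once the
    // constructor returns; the object keeps its own copy.
    return MyClass([offset](int x) { return x + offset; });
}

int call_after_source_is_gone() {
    MyClass adder = make_adder(10);
    return adder.call(5);
}

void check_stored_copy() {
    assert(call_after_source_is_gone() == 15);
}
```

Had the member been a MyFunction&, the same usage would leave it dangling.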
If you pass the function in to the constructor by reference, and don't make a copy of it, you'll be out of luck when the function goes out of scope outside of this object, as the reference will no longer be valid. That much has been said in the previous answers already.
What I wanted to add was that, instead, you could pass the function by value, not reference, into the constructor. Why? Well, you need a copy of it anyway, so if you pass by value the compiler can optimize away the need to make a copy when a temporary is passed in (such as a lambda expression written in-place).
Of course, however you do things, you potentially make another copy when you assign the passed in function to the variable, so use std::move to eliminate that copy. Example:
class MyClass
{
public:
    typedef std::function<int(int)> MyFunction;
    // takes the function by value and moves it into the member:
    MyClass (MyFunction myFunction): m_myFunction(std::move(myFunction))
    {}
private:
    MyFunction m_myFunction;
};
So, if they pass in an rvalue to the above, the compiler optimises away the first copy into the constructor, and std::move removes the second one :)
If your (only) constructor takes a const reference, you will need to make a copy of it in the function regardless of how it's passed in.
The alternative is to define two constructors, to deal with lvalues and rvalues separately:
class MyClass
{
public:
    typedef std::function<int(int)> MyFunction;
    //takes lvalue and copy constructs the member:
    MyClass (const MyFunction & myFunction): m_myFunction(myFunction)
    {}
    //takes rvalue and move constructs the member:
    MyClass (MyFunction && myFunction): m_myFunction(std::move(myFunction))
    {}
private:
    MyFunction m_myFunction;
};
Now, you handle rvalues differently and eliminate the need to copy in that case by explicitly handling it (rather than letting the compiler handle it for you). This may be marginally more efficient than the first version but is also more code.
The (probably seen a fair bit around here) relevant reference (and a very good read):
http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/
I would suggest you to make a copy:
MyFunction m_myFunction; // preferred and safe!
It is safe because if the original object goes out of scope destructing itself, the copy will still exist in the class instance.
Copy as much as you like. It is copyable. Most algorithms in standard library require that functors are.
However, passing by reference will probably be faster in non-trivial cases, so I'd suggest passing by constant reference and storing by value so you don't have to care about lifecycle management. So:
class MyClass
{
public:
    typedef std::function<int(int)> MyFunction;
    MyClass (const MyFunction &myFunction);
    //       ^^^^^ pass by CONSTANT reference.
private:
    MyFunction m_myFunction; // Always store by value
};
By passing by constant reference you promise the caller that you will not modify the function, while you can still call it. This prevents you from modifying the function by mistake; modifying it intentionally should usually be avoided, because it's less readable than using a return value.
Edit: I originally said "CONSTANT or rvalue" above, but Dave's comment made me look it up and indeed rvalue reference does not accept lvalues.
As a general rule (especially if you're using these for some highly threaded system), pass by value. There is really no way to verify from within a thread that the underlying object is still around with a reference type, so you open yourself up to very nasty race and deadlock bugs.
Another consideration is any hidden state in the std::function, whose modification is very unlikely to be thread-safe. This means that even if the underlying function call is thread-safe, the std::function wrapper's "()" call around it MAY NOT BE. You can recover the desired behavior by always using thread-local copies of the std::function, because each copy has its own isolated state.
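The isolation of copies can be seen without starting any threads. In this sketch the names are hypothetical, and local_copy stands in for the copy that each thread would make for itself:

```cpp
#include <cassert>
#include <functional>

// A callable with hidden state: a mutable lambda wrapped in std::function.
std::function<int()> make_id_source() {
    int last = 0;
    return [last]() mutable { return ++last; };
}

// Copying the std::function copies the stored lambda together with its state.
int original_after_copy_was_used() {
    std::function<int()> next_id = make_id_source();
    std::function<int()> local_copy = next_id;   // e.g. one copy per thread
    local_copy();
    local_copy();                                // advances only the copy
    return next_id();                            // the original starts from scratch
}

int third_call_on_copy() {
    std::function<int()> next_id = make_id_source();
    std::function<int()> local_copy = next_id;
    local_copy();
    local_copy();
    return local_copy();
}

void check_isolated_state() {
    assert(original_after_copy_was_used() == 1);
    assert(third_call_on_copy() == 3);
}
```

Whether per-thread copies are acceptable depends on whether the threads are supposed to share that state. If they are, the state needs its own synchronization instead.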