I've noticed that it's impossible to pass a non-const reference as an argument to std::async.
#include <functional>
#include <future>
void foo(int& value) {}
int main() {
    int value = 23;
    std::async(foo, value);
}
My compiler (GCC 4.8.1) gives the following error for this example:
error: no type named ‘type’ in ‘class std::result_of<void (*(int))(int&)>’
But if I wrap the value passed to std::async in std::reference_wrapper, everything is OK. I assume this is because std::async takes its arguments by value, but I still don't understand the reason for the error.
It's a deliberate design choice/trade-off.
First, it's not necessarily possible to find out whether the functionoid passed to async takes its arguments by reference or not. (If it's not a simple function but a function object, it could have an overloaded function call operator, for example.) So async cannot say, "Hey, let me just check what the target function wants, and I'll do the right thing."
So the design question is, does it take all arguments by reference if possible (i.e. if they're lvalues), or does it always make copies? Making copies is the safe choice here: a copy cannot become dangling, and a copy cannot exhibit race conditions (unless it's really weird). So that's the choice that was made: all arguments are copied by default.
But then, the mechanism is written so that it actually fails to then pass the arguments to a non-const lvalue reference parameter. That's another choice for safety: otherwise, the function that you would expect to modify your original lvalue instead modifies the copy, leading to bugs that are very hard to track down.
But what if you really, really want the non-const lvalue reference parameter? What if you promise to watch out for dangling references and race conditions? That's what std::ref is for. It's an explicit opt-in to the dangerous reference semantics. It's your way of saying, "I know what I'm doing here."
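As a minimal sketch of that opt-in (the names increment and run_with_ref are made up for illustration), wrapping the argument in std::ref lets the task modify the caller's variable. It is then the caller's job to keep value alive and to leave it alone until the future is ready:

```cpp
#include <cassert>
#include <functional>
#include <future>

// Hypothetical task that modifies its argument in place.
void increment(int& value) { ++value; }

// Hypothetical helper: runs increment asynchronously and returns the result.
int run_with_ref() {
    int value = 23;
    // std::ref stores a reference_wrapper<int>, so the task sees `value` itself.
    std::future<void> done =
        std::async(std::launch::async, increment, std::ref(value));
    done.get();    // wait for the task before reading `value` again
    return value;  // the original object was modified
}
```

Without std::ref the call does not compile, which is the deliberate safety net described above.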
std::async (and other functions that do perfect forwarding) look at the type of the argument that you pass to figure out what to do. They do not look at how that argument will eventually be used. So, to pass an object by reference you need to tell std::async that you're using a reference. However, simply passing a reference won't do that. You have to use std::ref(value) to pass value by reference.
The issue itself is only marginally related to std::async(): when defining the result of the operation, std::async() uses std::result_of<...>::type with all of its arguments passed through std::decay<...>::type. This is reasonable because std::async() takes arbitrary types and forwards them to be stored somewhere. To store them, values are needed for the function object as well as for the arguments. Thus, std::result_of<...> is used roughly like this:
typedef std::result_of<void (*(int))(int&)>::type result_type;
... and since an int can't be bound to an int& (the decayed int is treated as an rvalue, and an lvalue is needed to bind to int&), this fails. Failure in this case means that std::result_of<...> doesn't define a nested type.
A follow-up question could be: what is this type that is used to instantiate std::result_of<...>? The idea is that the function type syntax ResultType(ArgumentTypes...) is abused: instead of a result type, a callable type is passed, and std::result_of<...> determines the result type of calling an object of that type with the given list of argument types. For function pointer types this isn't really that interesting, but the callable type can also be a function object type, where overloading needs to be taken into account. So basically, std::result_of<...> is used like this:
typedef void (*function_type)(int&);
typedef std::result_of<function_type(int)>::type result_type; // fails
typedef std::result_of<function_type(std::reference_wrapper<int>)>::type result_type; //OK
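The same check can be sketched without std::result_of, which is deprecated in newer standards. The trait callable_with below is a hypothetical helper written for this illustration, not part of the standard library. It asks whether a callable of type F accepts an rvalue of type Arg, which is roughly the question std::async ends up asking about the decayed arguments:

```cpp
#include <cassert>
#include <functional>
#include <type_traits>
#include <utility>

typedef void (*function_type)(int&);

// Hypothetical trait: true when an F can be called with an rvalue of type Arg.
template <class F, class Arg, class = void>
struct callable_with : std::false_type {};

template <class F, class Arg>
struct callable_with<F, Arg,
                     decltype(void(std::declval<F>()(std::declval<Arg>())))>
    : std::true_type {};

// A decayed int is passed on as an rvalue, which cannot bind to int&.
constexpr bool plain_int_ok = callable_with<function_type, int>::value;
// reference_wrapper<int> converts implicitly to int&, so the call is valid.
constexpr bool wrapped_int_ok =
    callable_with<function_type, std::reference_wrapper<int>>::value;

static_assert(!plain_int_ok, "an rvalue int cannot bind to int&");
static_assert(wrapped_int_ok, "reference_wrapper<int> converts to int&");
```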
(1) I've this code snippet:
void tByRef(int& i) {
    ++i;
}
int main() {
    int i = 0;
    tByRef(i); // ok
    thread t1(tByRef, i); // fail to compile
    thread t2(tByRef, (int&)i); // fail to compile
    thread t3(tByRef, ref(i)); // ok
    return 0;
}
As you can see, the function tByRef accepts an lvalue reference as its parameter so that it can change the value of i. Calling it directly as tByRef(i) compiles.
But when I try to do the same thing through a thread, e.g. thread t1(tByRef, i), it fails to compile. It only compiles once I add ref() around i.
Why is the extra ref call needed here? If it is required for passing by reference, then why does tByRef(i) compile?
(2) I then changed tByRef to be a function template with a && parameter. This time, even t3 fails to compile:
template<typename T>
void tByRef(T&& i) {
++i;
}
I've read that && on a template parameter type is subject to reference collapsing, so it can accept both lvalue and rvalue references. Why do t1, t2 and t3 all fail to compile against it in my sample code?
Thanks.
Threads execute asynchronously from the code that started them. That's kind of the point. This means that, when a thread function actually gets called, the code that started the thread may well have left that callstack. If the user passed a reference to a local variable, that variable may be off the stack by the time the thread function gets called. Basically, passing by reference to a thread function is highly dangerous.
However, in C++, passing a variable by reference is trivial; you just provide the name to the function that takes its parameter by reference. Since it is so dangerous in this particular case, std::thread takes steps to prevent you from doing it.
All arguments to the thread function are copied/moved into internal storage when the thread object is created, and your thread function's parameters are initialized from those copies.
Now, thread could initialize non-const lvalue reference parameters with a reference to the internal object for that parameter. However, a function which specifically takes a non-const lvalue reference is almost always a function that is expected to modify this value in a way that will be visible to others. But... it won't be visible to anyone, because it will be given a reference to an object stored internally in the thread that is accessible to no one else.
In short, whatever you thought was going to happen will not happen. Hence the compile error: thread is specifically designed to detect this circumstance and assume that you've made some kind of mistake.
However, while non-const lvalue reference parameters are inherently dangerous, they can still be useful. So std::ref is used as a way for a user to explicitly ask to pass a reference parameter.
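To make the copy visible, here is a small sketch; compare_address and the two helper functions are hypothetical names used only for this illustration. The thread function checks whether the object it receives is the caller's own variable. With a plain argument it is handed the internal copy, and with std::cref (or std::ref) it is handed the original:

```cpp
#include <cassert>
#include <functional>
#include <thread>

// Hypothetical thread function: records whether `seen` is the caller's object.
void compare_address(const int& seen, const int* original, bool* same) {
    *same = (&seen == original);
}

bool plain_argument_is_same_object() {
    int i = 0;
    bool same = true;
    std::thread t(compare_address, i, &i, &same);  // i is copied into the thread
    t.join();
    return same;
}

bool wrapped_argument_is_same_object() {
    int i = 0;
    bool same = false;
    std::thread t(compare_address, std::cref(i), &i, &same);  // refers to i
    t.join();
    return same;
}
```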
As for why it fails to compile in your second example, tByRef in this case is not the name of a function. It is the name of a template. std::thread expects to be given a value which it can call. A template is not a value, nor is it convertible to a value.
A function template is a construct which generates a function when provided with template parameters. The template name alone is not a function.
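Two ways around this, as a sketch (run_template_threads is a hypothetical helper name): either name one specialization explicitly, so that std::thread receives an ordinary function, or wrap the call in a lambda so that deduction happens inside the lambda body.

```cpp
#include <cassert>
#include <functional>
#include <thread>

template <typename T>
void tByRef(T&& i) {
    ++i;
}

// Hypothetical helper showing two ways to hand the template to std::thread.
int run_template_threads() {
    int i = 0;

    // Explicit template argument: tByRef<int&> is an ordinary function
    // taking int& (int& && collapses to int&).
    std::thread t1(tByRef<int&>, std::ref(i));
    t1.join();

    // A lambda is a callable value; T is deduced as int& inside its body
    // because i is an lvalue there.
    std::thread t2([&i] { tByRef(i); });
    t2.join();

    return i;  // incremented once by each thread
}
```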
1)
Because you need an object to pass to another thread, you have to wrap the variable in an object that mimics a reference but isn't truly a reference.
std::ref docs.
Note the possible implementation section in the std::reference_wrapper documentation: it stores a pointer to the object.
2)
C++ needs a way to deduce the template type, and the compiler creates a new function for each distinct set of template arguments. In this case you must specify the type explicitly:
std::thread t3(tByRef<std::reference_wrapper<int>>, std::ref(i));
If you have a function parameter that is intended to be moved into a variable within the function, would you ever want to use pass by reference instead of pass by value?
For example, is there ever any benefit to using
void func(T &object2move)
{
T obj{std::move(object2move)};
}
vs.
void func(T object2move)
{
T obj{std::move(object2move)};
}
In addition to the above, is the only case where you want to use the following code when you only want func to take in an rvalue?
void func(T object2move)
{
T obj{object2move};
}
Is the answer to these questions dependent on what T is?
If your function will definitely move from the given value, then the best way to express this is by taking an rvalue-reference parameter (T&&). This way, if the user tries to pass an lvalue directly, they will get a compile error. And that forces them to invoke std::move directly on the lvalue, which visibly indicates to the reader that the value will be moved-from.
Using a non-const lvalue reference is always wrong. Non-const lvalue references don't bind to xvalues and prvalues (i.e. expressions where it's OK to move from them), so you're kind of lying with such an interface.
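A short sketch of the rvalue-reference interface, with std::string standing in for T; take_ownership and demo_take_ownership are hypothetical names for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical sink: the && parameter announces "I will move from this".
std::size_t take_ownership(std::string&& source) {
    // Inside the function `source` is a named lvalue, so std::move is
    // still needed to actually move from it.
    std::string owned(std::move(source));
    return owned.size();
}

std::size_t demo_take_ownership() {
    std::string name = "object to move";
    // take_ownership(name);                 // does not compile: name is an lvalue
    return take_ownership(std::move(name));  // the move is visible at the call site
}
```

Temporaries such as take_ownership(std::string("abc")) bind directly, with no std::move needed.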
I inspected the signature of the right-hand side of this assignment, which creates a thread:
std::thread t2 = std::thread(&Vehicle::addID, &v2, 2);
By hovering the mouse over "thread" on the right-hand side, I got:
std::thread::thread<...>(void (Vehicle::*&&_Fx)(int id), Vehicle &_Ax, int &&_Ax)
Now, I know the basics of C function pointer syntax.
But in C++ you often see the class name first, on the left (especially when using templates).
So I understand that the * within this syntax means a pointer to a (public) member function of the class Vehicle that takes an int and returns void (nothing). But what does the && (similar to a move constructor) mean?
reference to reference of / take the reference to the member function object by reference??
Notice how the lvalue argument (&v2) becomes an lvalue reference, and the rvalue arguments (the literal 2 and your &Vehicle::addID) become an rvalue reference.
The constructor template you're using is:
template< class Function, class... Args >
explicit thread( Function&& f, Args&&... args );
// ^^
We can see there that we ask the computer to take the arguments by "universal reference", i.e. as referency as possible, given each one's value category.
So you're seeing the result of that.
It's not part of the type of the pointer-to-member-function: it's something that's become an rvalue-reference-to-pointer-to-member-function because that's how std::thread takes its arguments, for the purpose of being nice and generic. In the case of a function pointer it's redundant, as there's nothing to "move", but for more complex arguments this can be important.
Of course, due to the nasty "spiral rule" we inherited from C, you end up with the && confusingly plonked in the middle of the pointer's type. 🤪😭
tl;dr:
take the reference [pointer — Ed.] to the member function object by reference??
Pretty much.
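The deduction can be observed directly with a small probe that has the same parameter shape as the constructor template. The function deduced_lvalue_reference is a hypothetical helper, not anything std::thread provides:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical probe: T&& on a deduced T is a forwarding reference,
// just like Function&& and Args&&... in std::thread's constructor.
template <class T>
bool deduced_lvalue_reference(T&&) {
    return std::is_lvalue_reference<T>::value;
}

bool lvalue_case() {
    int id = 2;
    return deduced_lvalue_reference(id);  // T = int&, parameter type is int&
}

bool rvalue_case() {
    return deduced_lvalue_reference(2);   // T = int, parameter type is int&&
}
```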
When I generate a new thread (std::thread) with a function the arguments of that function
are by value - not by reference.
So if I define that function with a reference argument (int& nArg)
my compiler (mingw 4.9.2) outputs an error (in compilian-suaeli something like
"missing copy constructor" I guess ;-)
But if I make that reference argument const (const int& nArg) it does not complain.
Can somebody explain please?
If you want to pass a reference, you have to wrap it in a std::reference_wrapper using std::ref. For example:
#include <functional>
#include <thread>
void my_function(int&);
int main()
{
int my_var = 0;
std::thread t(&my_function, std::ref(my_var));
// ...
t.join();
}
std::thread's arguments are used once.
In effect, it stores them in a std::tuple<Ts...> tup. Then it does a f( std::get<Is>(std::move(tup))...).
Passing std::get an rvalue tuple means that it is free to take the state from a value or rvalue reference field in the tuple. Without the tuple being an rvalue, it instead gives a reference to it.
Unless you use reference_wrapper (ie, std::ref/std::cref), the values you pass to std::thread are stored as values in the std::tuple. Which means the function you call is passed an rvalue to the value in the std::tuple.
rvalues can bind to const& but not to &.
Now, the std::tuple above is an implementation detail, an imagined implementation of std::thread. The wording in the standard is harder to read.
Why does the standard say this happens? In general, you should not bind a & parameter to a value which will be immediately discarded. The function thinks that it is modifying something that the caller can see; if the value will be immediately discarded, this is usually an error on the part of the caller.
const& parameters, on the other hand, do bind to values that will be immediately discarded, because we use them for efficiency purposes not just for reference purposes.
Or, roughly, because
const int& x = 7;
is legal
int& x = 7;
is not. The first is a const& to a logically discarded object (it isn't due to reference lifetime extension, but it is logically a temporary).
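The imagined implementation can be sketched for a single argument. invoke_stored is a hypothetical stand-in written for this answer and is not how any particular standard library implements std::thread:

```cpp
#include <cassert>
#include <functional>
#include <tuple>
#include <utility>

// Hypothetical one-argument stand-in: store the argument by value,
// then pass it on as an rvalue exactly once.
template <class F, class T>
auto invoke_stored(F f, T arg) -> decltype(f(std::declval<T>())) {
    std::tuple<T> stored(std::move(arg));
    return f(std::get<0>(std::move(stored)));
}

int twice(const int& x) { return 2 * x; }
void bump(int& x) { ++x; }

int demo_const_ref() {
    return invoke_stored(twice, 21);   // an rvalue int binds to const int&
}

int demo_reference_wrapper() {
    int n = 0;
    // invoke_stored(bump, n);         // does not compile: rvalue int vs int&
    invoke_stored(bump, std::ref(n));  // the stored wrapper converts to int&
    return n;
}
```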
I recently ran into a bug in my code when using boost::bind.
From the boost::bind docs:
The arguments that bind takes are copied and held internally by the returned function object.
I had assumed that the type of the copy that was being held was based on the signature of the function. However, it is actually based on the type of the value passed in.
In my case an implicit conversion was happening to convert the type used in the bind expression to the type received by the function. I was expecting this conversion to happen at the site of the bind, however it happens when the resulting function object is used.
In retrospect I should have been able to figure this out from the fact that using boost::bind gives errors when types are not compatible only at the call site, not the bind site.
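To illustrate the timing, here is a small sketch that uses std::bind as a stand-in for boost::bind (an assumption made to keep the example self-contained; the two behave alike in this respect). The conversions counter and the two helper functions are names invented for the example:

```cpp
#include <cassert>
#include <functional>

int conversions = 0;  // counts int -> Convertible conversions

struct Convertible {
    Convertible(int a) : b(a) { ++conversions; }
    int b;
};

int foo(Convertible bar) { return 2 + bar.b; }

int conversions_at_bind_time() {
    conversions = 0;
    auto bound = std::bind(foo, 3);  // stores an int, not a Convertible
    (void)bound;
    return conversions;
}

int conversions_after_two_calls() {
    conversions = 0;
    auto bound = std::bind(foo, 3);
    bound();
    bound();  // each call converts the stored int again
    return conversions;
}
```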
My question is:
Why does boost::bind work this way?
It seems to give worse compiler error messages
It seems to be less efficient when implicit conversion happens and there are multiple calls to the functor
But given how well Boost is designed I'm guessing there is a reason. Was it behavior inherited from std::bind1st/bind2nd? Is there a subtle reason why this would be hard/impossible to implement? Something else entirely?
To test that second theory I wrote up a little code snippet that seems to work, but there may well be features of bind I haven't accounted for since it's just a fragment:
namespace b = boost;
template<class R, class B1, class A1>
b::_bi::bind_t<R, R (*) (B1), typename b::_bi::list_av_1<B1>::type>
mybind(R (*f) (B1), A1 a1)
{
typedef R (*F) (B1);
typedef typename b::_bi::list_av_1<B1>::type list_type;
return b::_bi::bind_t<R, F, list_type> (f, list_type(B1(a1)));
}
struct Convertible
{
Convertible(int a) : b(a) {}
int b;
};
int foo(Convertible bar)
{
return 2+bar.b;
}
void mainFunc()
{
int x = 3;
b::function<int()> funcObj = mybind(foo, x);
printf("val: %d\n", funcObj());
}
Because the functor may support multiple overloads, which may give different behaviours. Even if this signature could be resolved when you knew all the arguments (and I don't know if Standard C++ can guarantee this facility) bind does not know all the arguments, and therefore it definitely cannot be provided. Therefore, bind does not possess the necessary information.
Edit: Just to clarify, consider
struct x {
void operator()(int, std::vector<float>);
void operator()(float, std::string);
};
int main() {
auto b = std::bind(x(), 1); // convert or not?
}
Even if you were to reflect on the struct and gain the knowledge of its overloads, it's still undecidable as to whether you need to convert the 1 to a float or not.
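A runnable variant of the same idea, as a sketch: the call operators here return an int tag (a change made only so the chosen overload can be observed), and a placeholder supplies the second argument. Which overload runs, and therefore whether the bound 1 is converted to float, is only known at the call:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

struct x {
    // Return values are illustrative tags identifying the chosen overload.
    int operator()(int, std::vector<float>) { return 1; }
    int operator()(float, std::string) { return 2; }
};

int call_with_vector() {
    auto b = std::bind(x(), 1, std::placeholders::_1);
    return b(std::vector<float>{});  // picks the (int, vector) overload
}

int call_with_string() {
    auto b = std::bind(x(), 1, std::placeholders::_1);
    return b(std::string("text"));   // picks the (float, string) overload
}
```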
There are different cases where you need the arguments to be processed at the call site.
The first such example is calling a member function, where you can either have the member called on a copy of the object (boost::bind( &std::vector<int>::push_back, myvector)), which most probably you don't want, or else pass a pointer and have the binder dereference the pointer as needed (boost::bind( &std::vector<int>::push_back, &myvector )). Note that both options can make sense in different programs.
Another important use case is passing an argument by reference to a function. bind will copy performing the equivalent to a pass-by-value call. The library offers the option of wrapping arguments through the helper functions ref and cref, both of which store a pointer to the actual object to be passed, and at the place of call they dereference the pointer (through an implicit conversion). If the conversion to the target type was performed at bind time, then this would be impossible to implement.
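Both cases can be sketched with std::bind, used here as a stand-in for boost::bind so that the example needs no external library. Counter and the three helper functions are hypothetical:

```cpp
#include <cassert>
#include <functional>

struct Counter {
    int total = 0;
    void add(int amount) { total += amount; }
};

int bind_object_by_copy() {
    Counter c;
    auto bound = std::bind(&Counter::add, c, 5);  // binds a copy of c
    bound();
    return c.total;  // the original is untouched
}

int bind_object_by_pointer() {
    Counter c;
    auto bound = std::bind(&Counter::add, &c, 5);  // dereferenced at the call
    bound();
    return c.total;
}

int bind_object_by_ref() {
    Counter c;
    auto bound = std::bind(&Counter::add, std::ref(c), 5);  // unwrapped at the call
    bound();
    return c.total;
}
```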
I think this is due to the fact that bind has to work with any callable entity, be it a function pointer, std::function<>, or your own functor struct with operator(). This makes bind generic on any type that can be called using (). I.e. Bind's implicit requirement on your functor is just that it can be used with ()
If bind was to store the function argument types, it would have to somehow infer them for any callable entity passed in as a type parameter. This would obviously not be as generic, since deducing parameter types of an operator() of a passed-in struct type is impossible without relying on the user to specify some kind of typedef (as an example). As a result the requirement on the functor (or concept) is no longer concrete/simple.
I am not entirely sure this is the reason, but it's one of the things that would be a problem.
EDIT: Another point as DeadMG mentions in another answer, overloads would create ambiguities even for standard function pointers, since the compiler would not be able to resolve the functor type. By storing the types you provide to bind and using (), this problem is also avoided.
A good example would be binding std::future values to some ordinary function taking ordinary types:
Say I want to use an ordinary f(x,y) function in an incredibly asynchronous way. Namely, I want to call it like "f(X.get(), Y.get())". There's a good reason for this- I can just call that line and f's logic will run as soon as both inputs are available (I don't need separate lines of code for the join). To do this I need the following:
1) I need to support implicit conversions "std::future<T> -> T". This means std::future or my custom equivalent needs a cast operator:
operator T() { return get(); }
2) Next, I need to bind my generic function to hide all its parameters
// Hide the parameters
template<typename OUTPUT, typename... INPUTS>
std::function<OUTPUT()> BindVariadic(std::function<OUTPUT(INPUTS...)> f,
INPUTS&&... in)
{
std::function<OUTPUT()> stub = std::bind( f, std::forward<INPUTS>(in)...);
return stub;
}
With a std::bind that does the "std::future<T> -> T" conversion at call time, I only wait for all the input parameters to become available when I ACTUALLY CALL "stub()". If it did the conversion via operator T() at the bind, the logic would silently force the wait when I constructed "stub" instead of when I use it. That might be fatal if "stub()" cannot always run safely in the same thread I built it.
There are other use cases that also forced that design choice. This elaborate one for async processing is simply the one I'm personally familiar with.
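The timing can be sketched without any threads. LazyInt below is a hypothetical stand-in for a future-like type: its conversion operator plays the role of get(), but it simply reads a value that the caller fills in later.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for a future: converting to int "gets" the value.
struct LazyInt {
    const int* source;
    operator int() const { return *source; }
};

int add(int x, int y) { return x + y; }

int demo_late_conversion() {
    int x = 0;
    int y = 0;  // the inputs are not "ready" yet
    std::function<int()> stub = std::bind(add, LazyInt{&x}, LazyInt{&y});
    x = 40;
    y = 2;          // the inputs become available after the bind
    return stub();  // the conversions run here and see the new values
}
```

If the conversion happened at bind time, the stub would have captured the zeros instead.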