std::function - value vs. reference argument - c++

Are there any practical differences between a std::function whose signature takes a parameter by value and one whose signature takes it by const reference? Consider the following code:
auto foo = [] (VeryBigType i) {
};
auto bar = [] (const VeryBigType& i) {
};
std::function<void(VeryBigType)> a;
a = foo;
a = bar;
std::function<void(const VeryBigType&)> b;
b = foo;
b = bar;
This code compiles without issues and works perfectly well. I know that passing by value vs. by reference has performance implications, so foo and bar would behave differently. But are there any differences depending on the std::function template type? For example, are there any implementation, behaviour, or performance differences between
std::function<void(VeryBigType)>(bar) and std::function<void(const VeryBigType&)>(bar), or are these constructs equivalent?

cppreference says that std::function<R(Args...)>::operator() has signature
R operator()(Args... args) const;
and that it calls the stored callable f basically by f(std::forward<Args>(args)...). The performance characteristics depend on both the template argument and the lambda's argument type and I think it would be helpful to just see everything that can happen. In your case, you have 2 std::function types, 2 callables, and 3 possible value categories for the argument, giving you 12 possibilities.
std::function<void(VeryBigType)> f = [](VeryBigType i) { };
If you call this with an lvalue, like
VeryBigType v;
f(v);
This will copy v into the argument of operator(), and then operator() will pass an rvalue to the lambda, which will move the value into i. Total cost: 1 copy + 1 move
If you call this with a prvalue, like
f(VeryBigType{});
Then this will materialize the prvalue into the argument of operator(), then pass an rvalue to the lambda, which will move it into i. Total cost: 1 move
If you call this with an xvalue, like
VeryBigType v;
f(std::move(v));
This will move v into the argument of operator(), which will pass an rvalue to the lambda, which will move it again into i. Total cost: 2 moves.
std::function<void(VeryBigType)> f = [](VeryBigType const &i) { };
If you call this with an lvalue, this will copy once into the argument of operator(), and then the lambda will be given a reference to that argument. Total cost: 1 copy.
If you call this with a prvalue, this will materialize it into the argument of operator(), which will pass a reference to that argument to the lambda. Total cost: nothing.
If you call this with an xvalue, this will move it into the argument of operator(), which will pass a reference to that argument to the lambda. Total cost: 1 move.
std::function<void(VeryBigType const&)> f = [](VeryBigType i) { };
If you call this with an lvalue or xvalue (i.e. with a glvalue), operator() will receive a reference to it. If you call this with a prvalue, it will be materialized into a temporary, and operator() will receive a reference to that. In any case, the inner call to the lambda will always copy. Total cost: 1 copy.
std::function<void(VeryBigType const&)> f = [](VeryBigType const &i) { };
Again, no matter what you call this with, operator() will receive just a reference to it, and the lambda will just receive the same reference. Total cost: nothing.
So, what did we learn? If both the std::function and the lambda take references, you avoid any extraneous copies and moves. Use this when possible.
Putting a by-value lambda inside a by-const-lvalue-reference std::function, however, is a bad idea (unless you have to). Essentially, the lvalue reference "forgets" the value category of the argument, and the argument to the lambda is always copied.
Putting a by-const-lvalue-reference lambda inside a by-value std::function is pretty good performance-wise, but you only need to do so if you're calling into other code that expects a by-value std::function, because otherwise a by-reference std::function achieves the same thing with less copying and moving.
Putting a by-value lambda inside a by-value std::function is slightly worse than putting a by-const-lvalue-reference lambda inside of it, due to an extra move in all calls. It would be better to instead take the argument of the lambda by rvalue reference, which is pretty much the same as taking it by const lvalue reference, except that you can still mutate the argument, just as if you had taken it by value.
TL;DR: By-value and rvalue-reference arguments in a std::function template argument should correspond to by-rvalue-reference or by-const-lvalue-reference arguments in the lambda you put inside the std::function. By-lvalue-reference arguments in the type should correspond to by-lvalue-reference arguments in the lambda. Anything else incurs additional copies or moves, and should only be used when needed.
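The per-call costs listed above can be checked with a small counting type. This is a minimal sketch: Tracked, Cost, Arg and measure are hypothetical names introduced here, Tracked stands in for VeryBigType, and the counts are what mainstream standard library implementations are expected to produce rather than something the standard spells out move by move.

```cpp
#include <functional>
#include <utility>

// Hypothetical stand-in for VeryBigType that counts its copies and moves.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

struct Cost {
    int copies;
    int moves;
};

enum class Arg { Lvalue, Xvalue, Prvalue };

// Calls f once with an argument of the requested value category and
// reports how many copies and moves that single call performed.
template <class Sig>
Cost measure(const std::function<Sig>& f, Arg kind) {
    Tracked v;
    Tracked::copies = 0;
    Tracked::moves = 0;
    switch (kind) {
        case Arg::Lvalue:  f(v);            break;
        case Arg::Xvalue:  f(std::move(v)); break;
        case Arg::Prvalue: f(Tracked{});    break;
    }
    return Cost{Tracked::copies, Tracked::moves};
}
```

With this helper, a std::function<void(Tracked)> holding [](Tracked) {} should report 1 copy and 1 move for Arg::Lvalue, while a std::function<void(const Tracked&)> holding [](const Tracked&) {} should report zero of both for every value category.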

Your question is confusing, because you seem to be aware that there are very large performance differences between a value and a ref argument.
What you seem to not be aware of is that the template type of the function object is what determines how arguments are passed to your lambda function, not the lambda function itself, because of the cdecl calling convention: the caller passes arguments on the stack and then performs the cleanup, and the caller calls through your std::function object.
So, a will always allocate a new copy of your object and pass a reference to it, then clean it up, and b will always pass a reference to the original object.
Edit: as to why that works regardless of how you define the lambda functions, again because of cdecl, both functions expect a pointer as the first argument and do their work on them. The rest of the declarations around the types (type sizes, constness, references etc) are only used to validate the code inside the function and validating that the function itself can be called by your function object (ie, that function will send a pointer as the first argument).

Related

std::async no matching overloaded function found [duplicate]

I've noticed that it's impossible to pass a non-const reference as an argument to std::async.
#include <functional>
#include <future>
void foo(int& value) {}
int main() {
int value = 23;
std::async(foo, value);
}
My compiler (GCC 4.8.1) gives the following error for this example:
error: no type named ‘type’ in ‘class std::result_of<void (*(int))(int&)>’
But if I wrap the value passed to std::async in std::reference_wrapper, everything is OK. I assume this is because std::async takes its arguments by value, but I still don't understand the reason for the error.
It's a deliberate design choice/trade-off.
First, it's not necessarily possible to find out whether the functionoid passed to async takes its arguments by reference or not. (If it's not a simple function but a function object, it could have an overloaded function call operator, for example.) So async cannot say, "Hey, let me just check what the target function wants, and I'll do the right thing."
So the design question is, does it take all arguments by reference if possible (i.e. if they're lvalues), or does it always make copies? Making copies is the safe choice here: a copy cannot become dangling, and a copy cannot exhibit race conditions (unless it's really weird). So that's the choice that was made: all arguments are copied by default.
But then, the mechanism is written so that it actually fails to then pass the arguments to a non-const lvalue reference parameter. That's another choice for safety: otherwise, the function that you would expect to modify your original lvalue instead modifies the copy, leading to bugs that are very hard to track down.
But what if you really, really want the non-const lvalue reference parameter? What if you promise to watch out for dangling references and race conditions? That's what std::ref is for. It's an explicit opt-in to the dangerous reference semantics. It's your way of saying, "I know what I'm doing here."
std::async (and other functions that do perfect forwarding) look at the type of the argument that you pass to figure out what to do. They do not look at how that argument will eventually be used. So, to pass an object by reference you need to tell std::async that you're using a reference. However, simply passing a reference won't do that. You have to use std::ref(value) to pass value by reference.
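As a rough illustration of the std::ref opt-in described above, the sketch below passes a local variable to std::async by reference. The names increment and increment_via_async are made up for this example. The future is waited on before the function returns, so the referenced variable outlives the task.

```cpp
#include <functional>
#include <future>

void increment(int& value) { ++value; }

// Runs increment through std::async and returns the caller's variable afterwards.
int increment_via_async() {
    int value = 23;
    // std::async(increment, value);  // does not compile: the stored copy is passed as an rvalue
    std::future<void> done = std::async(increment, std::ref(value));
    done.get();  // wait for the task, so `value` outlives every use of the reference
    return value;
}
```

Depending on the toolchain, code that uses std::async may need to be built with thread support enabled (for example -pthread with GCC).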
The issue itself is only marginally related to std::async(): When defining the result of the operation, std::async() uses std::result_of<...>::type with all its arguments being std::decay<...>::type'ed. This is reasonable because std::async() takes arbitrary types and forwards them to store them in some location. To store them, values are needed for the function object as well as for the arguments. Thus, std::result_of<...> is used similar to this:
typedef std::result_of<void (*(int))(int&)>::type result_type;
... and since an int argument can't be bound to an int& (the argument type int denotes an rvalue, whereas binding to int& requires an lvalue), this fails. Failure in this case means that std::result_of<...> doesn't define a nested type.
A follow-up question could be: what is this type used to instantiate std::result_of<...>? The idea is that the function type syntax ResultType(ArgumentTypes...) is abused: instead of a result type, a callable type is passed, and std::result_of<...> determines the result type of calling an object of that callable type with the given list of arguments. For function pointer types this isn't really that interesting, but the callable type can also be a function object type, where overloading needs to be taken into account. So basically, std::result_of<...> is used like this:
typedef void (*function_type)(int&);
typedef std::result_of<function_type(int)>::type result_type; // fails
typedef std::result_of<function_type(std::reference_wrapper<int>)>::type result_type; //OK
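The same check can be reproduced without std::result_of (which is deprecated in C++17 and removed in C++20). The trait below, callable_with, is a hypothetical helper written for this illustration. It only covers plain call syntax, not the full INVOKE rules, and it treats the argument type as an rvalue, just as a decayed copy would be.

```cpp
#include <functional>
#include <type_traits>
#include <utility>

// Hypothetical trait: true when a value of type F can be called
// with an rvalue of type Arg using ordinary call syntax.
template <class F, class Arg, class = void>
struct callable_with : std::false_type {};

template <class F, class Arg>
struct callable_with<F, Arg,
                     decltype(void(std::declval<F>()(std::declval<Arg>())))>
    : std::true_type {};

using function_type = void (*)(int&);

// An rvalue int cannot bind to int&, so the call is ill-formed.
static_assert(!callable_with<function_type, int>::value,
              "int rvalue must not bind to int&");
// std::reference_wrapper<int> converts implicitly to int&.
static_assert(callable_with<function_type, std::reference_wrapper<int>>::value,
              "reference_wrapper<int> converts to int&");
// A const lvalue reference parameter accepts the rvalue.
static_assert(callable_with<void (*)(const int&), int>::value,
              "int rvalue binds to const int&");
```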

Function pointers in C++ syntax

I inspected the signature of this right part of this assignment:
creating a thread:
std::thread t2 = std::thread(&Vehicle::addID, &v2, 2);
By hovering the mouse over "thread" on the right-hand side, I got:
std::thread::thread<...>(void (Vehicle::*&&_Fx)(int id), Vehicle &_Ax, int &&_Ax)
Now, I know the basics of C function pointers syntax.
But in C++ you see many times first the class name on the left (especially when using templates)
so I understand that the * within this syntax means a pointer to a (public) member function of the class Vehicle that takes an int and returns void (nothing), but what does the && (similar to a move constructor) mean?
reference to reference of / take the reference to the member function object by reference??
Notice how the lvalue argument (&v2) becomes an lvalue reference, and the rvalue arguments (the literal 2 and your &Vehicle::addID) become an rvalue reference.
The constructor template you're using is:
template< class Function, class... Args >
explicit thread( Function&& f, Args&&... args );
// ^^
We can see there that we ask the computer to take the arguments by "universal reference", i.e. as referency as possible, given each one's value category.
So you're seeing the result of that.
It's not part of the type of the pointer-to-member-function: it's something that's become an rvalue-reference-to-pointer-to-member-function because that's how std::thread takes its arguments, for the purpose of being nice and generic. In the case of a function pointer it's redundant, as there's nothing to "move", but for more complex arguments this can be important.
Of course, due to the nasty "spiral rule" we inherited from C, you end up with the && confusingly plonked in the middle of the pointer's type. 🤪😭
tl;dr:
take the reference [pointer — Ed.] to the member function object by reference??
Pretty much.
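To see how a forwarding reference adapts to each argument's value category, here is a small sketch. The Vehicle definition and the helper is_bound_as_lvalue are assumptions made for this example and are not part of std::thread.

```cpp
#include <type_traits>

// Hypothetical class mirroring the one in the question.
struct Vehicle {
    void addID(int) {}
};

// Reports whether a forwarding-reference parameter was deduced
// as an lvalue reference (true) or an rvalue reference (false).
template <class T>
constexpr bool is_bound_as_lvalue(T&&) {
    return std::is_lvalue_reference<T&&>::value;
}
```

Called with a named object such as v2, the helper returns true. Called with the literal 2 or with &Vehicle::addID, it returns false, which corresponds to the && shown in the hover text.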

Capturing a lambda in another lambda can violate const qualifiers

Consider the following code:
int x = 3;
auto f1 = [x]() mutable
{
return x++;
};
auto f2 = [f1]()
{
return f1();
};
This will not compile, because f1() is not const, and f2 is not declared as mutable. Does this mean that if I have a library function that accepts an arbitrary function argument and captures it in a lambda, I always need to make that lambda mutable, because I don't know what users can pass in? Notably, wrapping f1 in std::function seems to resolve this problem (how?).
Does this mean that if I have a library function that accepts an arbitrary function argument and captures it in a lambda, I always need to make that lambda mutable, because I don't know what users can pass in?
That's a design decision for your library API. You can require client code to pass function objects with a const-qualified operator() (which is the case for non-mutable lambda expressions). If something different is passed, a compiler error is triggered. But if the context might require a function object argument that modifies its state, then yes, you have to make the internal lambda mutable.
An alternative would be to dispatch on the ability to invoke operator() on a const-qualified instance of the given function type. Something along those lines (note that this needs a fix for function objects with both const and non-const operator(), which results in an ambiguity):
template <class Fct>
auto wrap(Fct&& f) -> decltype(f(), void())
{
[fct = std::forward<Fct>(f)]() mutable { fct(); }();
}
template <class Fct>
auto wrap(Fct&& f) -> decltype(std::declval<const Fct&>()(), void())
{
[fct = std::forward<Fct>(f)]() { fct(); }();
}
Notably, wrapping f1 in std::function seems to resolve this problem (how?).
This is a bug in std::function due to its type-erasure and copy semantics. It allows non-const-qualified operator() to be invoked, which can be verified with such a snippet:
const std::function<void()> f = [i = 0]() mutable { ++i; };
f(); // Shouldn't be possible, but unfortunately, it is
This is a known issue; it's worth checking out Titus Winters' complaint about this.
I'll start by addressing your second question first. std::function type erases, and holds a copy of the functor it's initialized with. That means there's a layer of indirection between std::function::operator() and the actual functor's operator().
Envision if you will, holding something in your class by pointer. Then you may call a mutating operation on the pointee from a const member function of your class, because it doesn't affect (in a shallow view) the pointer that the class holds. This is a similar situation to what you observed.
As for your first question... "Always" is too strong a word. It depends on your goal.
If you want to support self mutating functors easily, then you should capture in a mutable lambda. But beware it may affect the library functions you may call now.
If you wish to favor non-mutating operations, then a non-mutable lambda. I say "favor" because as we observed, the type system can be "fooled" with an extra level of indirection. So the approach you prefer is only going to be easier to use, not impossible to go around. This is as the sage advice goes, make correct use of your API easy, and incorrect harder.
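The pointer analogy and the std::function behaviour discussed above can be put side by side in a short sketch. Holder and call_twice_through_const_function are hypothetical names used only for this illustration.

```cpp
#include <functional>

// A const member function may still modify what a pointer member points to.
struct Holder {
    int* target;
    void bump() const { ++*target; }
};

// Calls a mutable lambda twice through a const std::function.
int call_twice_through_const_function() {
    const std::function<int()> f = [i = 0]() mutable { return ++i; };
    f();
    return f();  // the stored lambda kept its state between the two calls
}
```

The second call returns 2, which shows that the const on the std::function did not protect the state of the stored lambda.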

Should I always use `T&&` instead of `const T&` or `T&` to bind to a callback function?

template <typename T>
void myFunction(..., T && callback) {
...
callback(...);
...
}
Is it preferable to use T && than T& or const T&?
Or even simply T to pass by value instead of pass by reference.
Does function or lambdas have the concept of lvalue & rvalue? Can I std::move a function / lambdas?
Does const of const T& enforce that the function cannot modify its closure?
Taking a forwarding reference can make a difference, but you have to call the callback correctly to see it. For functions and lambdas it doesn't matter whether they are an rvalue or an lvalue, but for a functor it can make a difference:
template <typename T>
void myFunction(..., T && callback) {
callback(...);
}
Takes a forwarding reference, and then calls the callback as a lvalue. This can be an error if the function object was passed as an rvalue, and its call operator is defined as
operator()(...) && { ... }
since that is only callable on an rvalue. To make your function work correctly you need to wrap the callback in std::forward, so that you call it with the same value category it had when it was passed to the function. That looks like
template <typename T>
void myFunction(..., T && callback) {
std::forward<T>(callback)(...);
// you should only call this once since callback could move some state into its return value making UB to call it again
}
So, if you want to take rvalues and lvalues, and call the operator as an rvalue or lvalue, then the above approach should be used, since it is just like doing the call in the call site of myFunction.
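A functor with ref-qualified call operators makes the difference visible. In this sketch, Callback, call_plain and call_forwarded are hypothetical names; the two overloads simply report which one was selected.

```cpp
#include <string>
#include <utility>

// Hypothetical functor whose call operator is overloaded
// on the value category of the object it is called on.
struct Callback {
    std::string operator()() &  { return "lvalue"; }
    std::string operator()() && { return "rvalue"; }
};

// Calls the callback by name: a named parameter is always an lvalue.
template <class T>
std::string call_plain(T&& callback) {
    return callback();
}

// Calls the callback with the value category it was passed in with.
template <class T>
std::string call_forwarded(T&& callback) {
    return std::forward<T>(callback)();
}
```

Passing Callback{} to call_plain selects the & overload, whereas call_forwarded selects the && overload for the same argument.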
You probably want to choose either T (with optional std::ref) or T&& and stick with one or the other.
T const& works if you want to let the caller know that myFunction won't modify callback. It doesn't work if callback might be stateful. Of course, if callback's operator() is marked const (or callback is a non-mutable lambda), then myFunction won't modify callback. Essentially, whoever calls myFunction can provide that constness guarantee for themselves.
T& works if you want myFunction to be allowed to take in a stateful functor. The drawback is that T& can't bind to an rvalue (e.g. myFunction([&](Type arg){ /* whatever */ })).
T is good for both stateful and stateless functors, and it works for both rvalues (e.g. lambdas) and lvalues. If somebody who calls myFunction wants changes to callback's state to be observable outside of myFunction (e.g. callback has more than just operator()), they can use std::ref. This is what the C++ standard library seems to go with.
T&& is similarly general-purpose (it can handle rvalues and/or stateful functors), but it doesn't require std::ref to make changes to callback be visible from outside of myFunction.
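The by-value option combined with std::ref might look like the following sketch. Counter and call_three_times are made-up names for illustration.

```cpp
#include <functional>

// Hypothetical stateful functor that counts how often it is called.
struct Counter {
    int calls = 0;
    void operator()() { ++calls; }
};

// Takes the callback by value, so it works on its own copy.
template <class T>
void call_three_times(T callback) {
    callback();
    callback();
    callback();
}
```

Calling call_three_times(c) leaves c.calls untouched because only the copy is modified, while call_three_times(std::ref(c)) makes the three calls visible in c.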
Is it preferable to use T && than T& or const T&?
Universal references allow perfect forwarding, so they are often preferable in generic template functions.
Or even simply T to pass by value instead of pass by reference.
This is often preferable when you wish to take ownership.
Does function or lambdas have the concept of lvalue & rvalue?
Yes. Every expression has a value category.
Can I std::move a function / lambdas?
Yes. You can std::move pretty much anything.
Another matter is whether std::move'd object is moved from. And yet another matter is whether an object can be moved. Lambdas are movable unless they contain non-movable captures.
In the OP's example:
template <typename T>
void myFunction(..., T && callback) {
...
...
}
Passing a T&& will give you the best of both worlds. If we pass an rvalue, callback becomes an rvalue reference, and if we pass an lvalue, it becomes an lvalue reference. When passing callback on, just remember to use std::forward to preserve its value category.
This only works if T is a deduced template parameter; it does not apply when we use something like a std::function.
So what do we pass in the non-templated case? The first thing to decide is whether callback can be called after the function has returned. If the callback will only be called within the scope of myFunction, you are probably better off using an lvalue reference, since you can call through the reference directly.
However, if callback will be called after myFunction has returned, taking an rvalue reference allows you to move callback. This saves you the copy, but it forces the caller to guarantee that callback is not used anywhere else after being passed to myFunction.
In C++ functions (not to be confused with std::function) are essentially just pointers. You can move pointers, though it's the same as copying them.
Lambdas under the hood are just more or less regular structs (or classes, it's the same) that implement operator() and potentially store some state (if your lambda captures it). So the move would function just the same as it would for regular structs.
Similarly, there could be move-only lambdas (because they have move-only values in their state). So sometimes, if you want to accept such lambdas, you need to pass by &&.
However, in some cases you want a copy (for example if you want to execute function in a thread).
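A move-only lambda can be sketched as follows. The names consume and make_move_only_task are hypothetical; the lambda captures a std::unique_ptr, so it cannot be copied and has to be handed over as an rvalue.

```cpp
#include <memory>
#include <utility>

// Takes ownership of the callable by moving (or copying) it into a local, then calls it.
template <class F>
int consume(F&& f) {
    auto owned = std::forward<F>(f);
    return owned();
}

// Builds a move-only lambda: its unique_ptr capture cannot be copied.
inline auto make_move_only_task() {
    return [p = std::make_unique<int>(7)] { return *p; };
}
```

Given auto task = make_move_only_task();, the call consume(std::move(task)) compiles, whereas consume(task) does not, because it would have to copy the closure.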

std::thread arguments (value vs. const)

When I generate a new thread (std::thread) with a function the arguments of that function
are by value - not by reference.
So if I define that function with a reference argument (int& nArg)
my compiler (mingw 4.9.2) outputs an error (in compilian-suaeli something like
"missing copy constructor" I guess ;-)
But if I make that reference argument const (const int& nArg) it does not complain.
Can somebody explain please?
If you want to pass reference, you have to wrap it into std::reference_wrapper thanks to std::ref. Like:
#include <functional>
#include <thread>
void my_function(int&);
int main()
{
int my_var = 0;
std::thread t(&my_function, std::ref(my_var));
// ...
t.join();
}
std::thread's arguments are used once.
In effect, it stores them in a std::tuple<Ts...> tup. Then it does a f( std::get<Is>(std::move(tup))...).
Passing std::get an rvalue tuple means that it is free to take the state from a value or rvalue reference field in the tuple. Without the tuple being an rvalue, it instead gives a reference to it.
Unless you use reference_wrapper (ie, std::ref/std::cref), the values you pass to std::thread are stored as values in the std::tuple. Which means the function you call is passed an rvalue to the value in the std::tuple.
rvalues can bind to const& but not to &.
Now, the std::tuple above is an implementation detail, an imagined implementation of std::thread. The wording in the standard is more obtuse.
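The imagined tuple-based mechanism can be written out without any threads. This is only a sketch of the idea, not how any particular standard library implements std::thread; store_and_call, call_with_moved and copy_into are hypothetical names.

```cpp
#include <cstddef>
#include <functional>
#include <tuple>
#include <type_traits>
#include <utility>

// Calls f with every tuple element accessed through an rvalue tuple.
template <class F, class Tuple, std::size_t... Is>
void call_with_moved(F&& f, Tuple&& tup, std::index_sequence<Is...>) {
    std::forward<F>(f)(std::get<Is>(std::move(tup))...);
}

// Stores decayed copies of the arguments, then invokes f on them as rvalues.
template <class F, class... Args>
void store_and_call(F&& f, Args&&... args) {
    std::tuple<typename std::decay<Args>::type...> tup(std::forward<Args>(args)...);
    call_with_moved(std::forward<F>(f), std::move(tup),
                    std::index_sequence_for<Args...>{});
}

// The const int& parameter accepts the stored rvalue; the int& parameter
// only works because the caller wraps its variable in std::ref.
void copy_into(const int& in, int& out) { out = in; }
```

With int y = 0;, the call store_and_call(copy_into, 7, std::ref(y)) sets y to 7. Passing y without std::ref fails to compile, because the stored int is handed to the function as an rvalue and cannot bind to int&.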
Why does the standard say this happens? In general, you should not bind a & parameter to a value which will be immediately discarded. The function thinks that it is modifying something that the caller can see; if the value will be immediately discarded, this is usually an error on the part of the caller.
const& parameters, on the other hand, do bind to values that will be immediately discarded, because we use them for efficiency purposes not just for reference purposes.
Or, roughly, because
const int& x = 7;
is legal, whereas
int& x = 7;
is not. The first is a const& to a logically discarded object (it isn't due to reference lifetime extension, but it is logically a temporary).