Should I pass an std::function by const-reference? - c++

Let's say I have a function which takes an std::function:
void callFunction(std::function<void()> x)
{
x();
}
Should I pass x by const-reference instead?
void callFunction(const std::function<void()>& x)
{
x();
}
Does the answer to this question change depending on what the function does with it? For example if it is a class member function or constructor which stores or initializes the std::function into a member variable.

If you want performance, pass by value if you are storing it.
Suppose you have a function called "run this in the UI thread".
std::future<void> run_in_ui_thread( std::function<void()> )
which runs some code in the "ui" thread, then signals the future when done. (Useful in UI frameworks where the UI thread is where you are supposed to mess with UI elements)
We have two signatures we are considering:
std::future<void> run_in_ui_thread( std::function<void()> ) // (A)
std::future<void> run_in_ui_thread( std::function<void()> const& ) // (B)
Now, we are likely to use these as follows:
run_in_ui_thread( [=]{
// code goes here
} ).wait();
which will create an anonymous closure (a lambda), construct a std::function out of it, pass it to the run_in_ui_thread function, then wait for it to finish running in the main thread.
In case (A), the std::function is directly constructed from our lambda, which is then used within the run_in_ui_thread. The lambda is moved into the std::function, so any movable state is efficiently carried into it.
In the second case, a temporary std::function is created, the lambda is moved into it, then that temporary std::function is used by reference within the run_in_ui_thread.
So far, so good -- the two of them perform identically. Except the run_in_ui_thread is going to make a copy of its function argument to send to the ui thread to execute! (it will return before it is done with it, so it cannot just use a reference to it). For case (A), we simply move the std::function into its long-term storage. In case (B), we are forced to copy the std::function.
That store makes passing by value more optimal. If there is any possibility you are storing a copy of the std::function, pass by value. Otherwise, either way is roughly equivalent: the only downside to by-value is if you are taking the same bulky std::function and having one sub method after another use it. Barring that, a move will be as efficient as a const&.
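A minimal sketch of that difference, assuming a hypothetical bulky callable `CopyCounter` and two hypothetical storing functions, `store_by_value` and `store_by_cref` (none of these names come from the answer; `g_stored` merely stands in for the long-term storage). It counts how often the target is copied when a temporary is handed over:

```cpp
#include <functional>
#include <utility>

// Hypothetical callable that records how many times it has been copied.
struct CopyCounter {
    static int copies;
    char payload[64] = {};  // bulky state, so a copy is not free
    CopyCounter() = default;
    CopyCounter(const CopyCounter&) { ++copies; }
    CopyCounter(CopyCounter&&) noexcept {}
    void operator()() const {}
};
int CopyCounter::copies = 0;

// Stand-in for the storage that outlives the call (an assumption).
std::function<void()> g_stored;

// (A) by value: the parameter can be moved into storage.
void store_by_value(std::function<void()> f) { g_stored = std::move(f); }

// (B) by const reference: the only way to keep it is to copy it.
void store_by_cref(const std::function<void()>& f) { g_stored = f; }

// Returns the number of target copies made while storing a temporary.
int copies_when_storing(bool by_value) {
    CopyCounter::copies = 0;
    if (by_value) store_by_value(CopyCounter{});
    else store_by_cref(CopyCounter{});
    return CopyCounter::copies;
}
```

On the common standard library implementations, `copies_when_storing(true)` should report no copies at all, while `copies_when_storing(false)` has to copy the target at least once.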
Now, there are some other differences between the two that mostly kick in if we have persistent state within the std::function.
Assume that the std::function stores some object with an operator() const, but it also has some mutable data members which it modifies (how rude!).
In the std::function<> const& case, the mutable data members modified will propagate out of the function call. In the std::function<> case, they won't.
This is a relatively strange corner case.
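To make the corner case concrete, here is a small sketch. `RudeCounter`, `call_by_value`, `call_by_cref` and `second_call_result` are hypothetical names chosen for illustration:

```cpp
#include <functional>

// Hypothetical callable: a const call operator that bumps a mutable counter.
struct RudeCounter {
    mutable int calls = 0;
    int operator()() const { return ++calls; }
};

// Works on a private copy of the caller's std::function.
int call_by_value(std::function<int()> f) { return f(); }

// Works on the caller's own std::function.
int call_by_cref(const std::function<int()>& f) { return f(); }

// Passes the same std::function twice and returns what the second call sees.
int second_call_result(bool by_value) {
    std::function<int()> fn = RudeCounter{};
    if (by_value) {
        call_by_value(fn);
        return call_by_value(fn);  // fresh copy each time: counter restarts
    }
    call_by_cref(fn);
    return call_by_cref(fn);       // same target: counter keeps growing
}
```

By value, each call starts from the caller's untouched state, so the second call returns 1. By const reference, the first call's increment is still visible and the second call returns 2.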
You want to treat std::function like you would any other possibly heavy-weight, cheaply movable type. Moving is cheap, copying can be expensive.

If you're worried about performance, and you aren't defining a virtual member function, then you most likely should not be using std::function at all.
Making the functor type a template parameter permits greater optimization than std::function, including inlining the functor logic. The effect of these optimizations is likely to greatly outweigh the copy-vs-indirection concerns about how to pass std::function.
Faster:
template<typename Functor>
void callFunction(Functor&& x)
{
x();
}

As usual in C++11, passing by value/reference/const-reference depends on what you do with your argument. std::function is no different.
Passing by value allows you to move the argument into a variable (typically a member variable of a class):
struct Foo {
Foo(Object o) : m_o(std::move(o)) {}
Object m_o;
};
When you know your function will move its argument, this is the best solution, because your users can control how they call your function:
Foo f1{Object()}; // move the temporary, followed by a move in the constructor
Foo f2{some_object}; // copy the object, followed by a move in the constructor
Foo f3{std::move(some_object)}; // move the object, followed by a move in the constructor
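The comments above can be checked with a counting stand-in for `Object`. This is only a sketch: the counters, the `Tally` struct, the `Arg` enum and `construct_from` are assumptions added for illustration.

```cpp
#include <utility>

// Hypothetical stand-in for Object that tallies copies and moves.
struct Object {
    static int copies, moves;
    Object() = default;
    Object(const Object&) { ++copies; }
    Object(Object&&) noexcept { ++moves; }
};
int Object::copies = 0;
int Object::moves = 0;

struct Foo {
    Foo(Object o) : m_o(std::move(o)) {}
    Object m_o;
};

struct Tally { int copies, moves; };
enum class Arg { Temporary, Lvalue, Moved };

// Builds a Foo in one of the three ways and reports the work done.
Tally construct_from(Arg how) {
    Object some_object;
    Object::copies = Object::moves = 0;
    switch (how) {
        case Arg::Temporary: { Foo f{Object()}; break; }
        case Arg::Lvalue:    { Foo f{some_object}; break; }
        case Arg::Moved:     { Foo f{std::move(some_object)}; break; }
    }
    return {Object::copies, Object::moves};
}
```

The lvalue case costs one copy plus one move, and the `std::move` case costs two moves. For the temporary, compilers are allowed (and since C++17 required) to construct the parameter in place, so usually only the move inside the constructor remains.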
I believe you already know the semantics of (non)const-references so I won't belabor the point. If you need me to add more explanations about this, just ask and I'll update.

Related

Smart pointer to lambda

I'm trying to make a function that accepts a shared pointer to some functor. With manually crafted functors there're no problems, but with lambda there are. I know that I can't use decltype with lambda - every new lambda declaration creates a new type. Right now I'm writing:
auto lambda = [](int a, float b)->int
{
return 42;
};
using LambdaType = decltype(lambda);
shared_ptr<LambdaType> ptr{ new LambdaType{ lambda } };
It works, but looks ugly. Moreover there's a copy constructor call! Is there any way to simplify?
You could use std::function as type.
Lambdas are merely auto-written invokable objects to make simple code simple. If you want something beyond their default automatic storage behavior, write the type they write yourself.
It is illegal to have a lambda type in an unevaluated context. In an evaluated context, it creates a lambda in automatic storage. You want it on the free store. This requires at least logically a copy.
A horrible hack involving violating the unevaluated context rule, sizeof/alignof, aligned_storage_t, placement new, possibly unbounded compile time recursion (or maybe one with a static_assert), returning pointers to local variables, and the aliasing constructor of shared ptr, and requiring callers to write insane code might avoid calling the copy/move. But it is a bad idea, and simply using invokable objects is easier.
Of course, accepting the copy/move makes it trivial. But at that point, just use std::function unless you need something like varargs.
You state you do not want to force users to use std::function; but std::function would implicitly convert a compatible lambda into itself.
If you are willing to accept a copy, we can do this:
template<class T>
std::shared_ptr<std::decay_t<T>>
auto_shared( T&& t ) {
return std::make_shared<std::decay_t<T>>(std::forward<T>(t));
}
then auto ptr = auto_shared( [x=0]()mutable{ return x++; } ); is a non-type-erased shared pointer to a counting lambda. The lambda is copied (well, moved) into the shared storage.
If you want to avoid that copy, the client can write a manual function object and call make_shared<X>(ctor_args) on it.
There is no reasonable way to separate a lambda's type from its construction in C++ at this point.
If you capture something in a lambda, it becomes algorithmically the same as std::function, so use std::function freely. Also, std::function manages the memory of the captured values, so putting a std::shared_ptr on top of it is not required.
If you capture nothing, the lambda is convertible to a plain function pointer:
int(*ptr)(int,int) = [](int a, int b) -> int {
return a+b;
};
Functions are allocated statically and definitely shouldn't be deleted. So, you don't actually need std::shared_ptr

Construct returned object in calling function's scope

Is it possible to force C++ to construct an object in the scope of a calling function? What I mean is to explicitly do what a return value optimization (RVO) does.
I have some container classes which are in a chain of derivation. Since the classes are constructed with stack data, they can't be returned, so I disabled the copy constructor and assignment operators. For each class, I am providing an iterator. The constructor of each iterator has only one argument: a pointer to the container class. To get the iterator, I want to use this function:
BindPackIterator BindPack::begin(void)
{
return BindPackIterator(this);
}
in this context:
for (auto i=bindpack.begin(); !i.end(); ++i) { i.run(); }
The compiler issues errors, complaining about not being able to copy the BindPackIterator object. Remember, I disabled them.
What I want to happen is for the BindPackIterator to be instantiated in the calling function's scope to avoid either a copy or move operation.
In this particular case, I know I can do a workaround, changing the begin function to return a BindPack pointer,
for(BindPackIterator i=bindpack.begin(); !i.end(); ++i) { i.run(); }
and I've experimented a bit, without success, with decltype and this construction:
auto BindPack::begin(void) -> BindPackIterator
{
return BindPackIterator(this);
}
This is just the example with which I'm currently frustrated. There have been other projects where the obvious solution is for the function to instantiate an object in the calling function's scope. The move constructor (foo&&) helps in some cases, but for objects with many data members, even that can be inefficient. Is there a design pattern that allows object construction/instantiation in the caller's scope?
Putting n.m.'s comment into code, write a constructor for BindPackIterator that takes a BindPack and initializes the iterator in the "begin" state. e.g:
BindPackIterator(BindPack* pack) : pack(pack), pos(0){ }
That you can use in your for loop:
BindPack pack;
for(BindPackIterator i(&pack); !i.end(); ++i){
i.run();
}
Is it fair to say that the answer is "No," it is not possible to construct a returned object in the calling function's scope? Or in other words, you can't explicitly tell the compiler to use RVO.
To be sure, it is a dangerous possibility: stack memory used to construct the object while available in the called function will not be valid in the calling function, even though the values might remain untouched in the abandoned stack frame. This would result in unpredictable behavior.
Upon further consideration, while summing up at the end of this response, I realized that the compiler may not be able to accurately predict the necessary stack size for objects created in the calling function and initialized in a called function, and it would not be possible to dynamically expand the stack frame if the execution had passed to another function. These considerations make my whole idea impossible.
That said, I want to address the workarounds that solve my iterator example.
I had to abandon the idea of using auto like this:
for (auto i=bindpack.begin(); !i.end(); ++i)
Having abandoned auto, and realizing that it's more sensible to explicitly name the variable anyway (if the iterator is different enough to require a new class, it's better to name it to avoid confusion), I am using this constructor:
BindPackIterator(BindPack &ref) : m_ref_pack(ref), m_index(0) { }
in order to be able to write:
for (BindPackIterator i=bindpack; !i.end(); ++i)
preferring to initialize with an assignment. I used to do this when I was last heavily using C++ in the late 1990's, but it's not been working for me recently. The compiler would ask for a copy operator I didn't want to define for reasons stated above. Now I think that problem was due to my collection of constructors and assignment operators I define to pass the -Weffc++ test. Using simplified classes for this example allowed it to work.
Another workaround for an object more complicated than an iterator might be to use a tuple for the constructor argument for objects that need multiple variables to initialize. There could be a casting operator that returns the necessary tuple from the class that initializes the object.
The constructor could look like:
FancyObject(BigHairyTuple val) : m_data1(get<0>(val)), m_data2(get<1>(val)), etc
and the contributing object would define this:
class Foo
{
...
operator BigHairyTuple(void) {
return BigHairyTuple(val1, val2, ...);
}
};
to allow:
FancyObject fo = foo;
I haven't tested this specific example, but I'm working with something similar and it seems likely to work, with some possible minor refinements.
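For what it's worth, the language has since moved in the direction the question asks for. From C++17 on, returning a prvalue such as `BindPackIterator(this)` initializes the caller's object directly, with no copy or move constructor involved, so the original loop compiles even with those operations deleted. The sketch below assumes a C++17 compiler; the class bodies and `count_steps` are invented for illustration and are not the asker's real classes.

```cpp
// Requires C++17 or later (guaranteed copy elision).
class BindPack;

class BindPackIterator {
public:
    explicit BindPackIterator(BindPack* pack) : m_pack(pack), m_index(0) {}
    // Not copyable; declaring these also suppresses the implicit move operations.
    BindPackIterator(const BindPackIterator&) = delete;
    BindPackIterator& operator=(const BindPackIterator&) = delete;

    bool end() const;
    BindPackIterator& operator++() { ++m_index; return *this; }

private:
    BindPack* m_pack;
    int m_index;
};

class BindPack {
public:
    explicit BindPack(int size) : m_size(size) {}
    BindPack(const BindPack&) = delete;
    BindPack& operator=(const BindPack&) = delete;

    // The returned prvalue is constructed in the caller's variable.
    BindPackIterator begin() { return BindPackIterator(this); }
    int size() const { return m_size; }

private:
    int m_size;
};

inline bool BindPackIterator::end() const { return m_index >= m_pack->size(); }

// Uses the loop shape from the question and counts its iterations.
int count_steps(BindPack& bindpack) {
    int steps = 0;
    for (auto i = bindpack.begin(); !i.end(); ++i) { ++steps; }
    return steps;
}
```

Under C++11 or C++14 rules the same code is rejected, because an accessible copy or move constructor is still required there even when the copy is elided in practice.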

const parameter vs const reference parameter

Implementation 1:
foo(const Bar x);
Implementation 2:
foo(const Bar & x);
If the object will not be changed within the function, why would you ever copy it (implementation 1)?
Will this be automatically optimized by the compiler?
Summary: Even though the object is declared as const in the function declaration, it is still possible that the object be edited via some other alias &.
If you are the person writing the library and know that your functions don't do that, or that the object is big enough to justify the dereferencing cost on every operation, then
foo(const Bar & x); is the way to go.
Part 2:
Will this be automatically optimized by the compiler?
Since we established that they are not always equivalent, and the conditions for equivalence are non-trivial, it would generally be very hard for the compiler to ensure them, so almost certainly no.
you ask,
“If the object will not be changed within the function, why would you ever copy it(implementation 1).”
well there are some bizarre situations where an object passed by reference might be changed by other code, e.g.
namespace g { int x = 666; }
void bar( int ) { g::x = 0; }
int foo( int const& a ) { assert( a != 0 ); bar( a ); return 1000/a; } // Oops
int main() { foo( g::x ); }
this has never happened to me though, since the mid 1990s.
so, this aliasing is a theoretical problem for the single argument of that type.
with two arguments of the same type it gets more of a real possibility. for example, an assignment operator might get passed the object that it's called on. when the argument is passed by value (as in the minimal form of the swap idiom) it's no problem, but if not then self-assignment generally needs to be avoided.
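A small sketch of the two-argument case, using hypothetical helpers `add_twice_ref` and `add_twice_val` that are not part of the answer:

```cpp
// Adds `delta` to `target` twice, reading delta through a reference.
void add_twice_ref(int& target, const int& delta) {
    target += delta;  // if delta aliases target, delta has just changed too
    target += delta;
}

// Same operation, but delta is a private copy.
void add_twice_val(int& target, int delta) {
    target += delta;
    target += delta;
}

// Passes the same variable for both arguments.
int aliased_result_ref() { int x = 1; add_twice_ref(x, x); return x; }
int aliased_result_val() { int x = 1; add_twice_val(x, x); return x; }
```

With the reference version the aliased call yields 4 (1 + 1, then 2 + 2), while the by-value version yields the expected 3.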
you further ask,
“Will this be automatically optimized by the compiler?”
no, not in general, for the above mentioned reason
the compiler can generally not guarantee that there will be no aliasing for a reference argument (one exception, though, is where the machine code of a call is inlined)
however, on the third hand, the language could conceivably have supported the compiler in this, e.g. by providing the programmer with a way to explicitly accept any such optimization, like, a way to say ”this code is safe to optimize by replacing pass by value with pass by reference, go ahead as you please, compiler”
Indeed, in those circumstances you would normally use method 2.
Typically, you would only use method 1 if the object is tiny, so that it's cheaper to copy it once than to pay to access it repeatedly through a reference (which also incurs a cost). In TC++PL, Stroustrup develops a complex number class and passes it around by value for exactly this reason.
It may be optimized in some circumstances, but there are plenty of things that can prevent it. The compiler can't avoid the copy if:
the copy constructor or destructor has side effects and the argument passed is not a temporary.
you take the address of x, or a reference to it, and pass it to some code that might be able to compare it against the address of the original.
the object might change while foo is running, for example because foo calls some other function that changes it. I'm not sure whether this is something you mean to rule out by saying "the object will not be changed within the function", but if not then it's in play.
You'd copy it if any of those things matters to your program:
if you want the side effects of copying, take a copy
if you want "your" object to have a different address from the user-supplied argument, take a copy
if you don't want to see changes made to the original during the running of your function, take a copy
You'd also copy it if you think a copy would be more efficient, which is generally assumed to be the case for "small" types like int. Iterators and predicates in standard algorithms are also taken by value.
Finally, if your code plans to copy the object anyway (including by assigning to an existing object) then a reasonable idiom is to take the copy as the parameter in the first place. Then move/swap from your parameter.
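One common shape of that idiom is an assignment operator that takes its argument by value and swaps. The `Label` class below is a made-up example, not something from the question:

```cpp
#include <string>
#include <utility>

// Hypothetical class used only to illustrate the idiom.
class Label {
public:
    explicit Label(std::string text) : m_text(std::move(text)) {}
    Label(const Label&) = default;
    Label(Label&&) = default;

    // The parameter already is the copy (or a move, for rvalue arguments).
    // Self-assignment is harmless because `other` is a separate object.
    Label& operator=(Label other) {
        swap(other);
        return *this;
    }

    void swap(Label& other) noexcept { m_text.swap(other.m_text); }
    const std::string& text() const { return m_text; }

private:
    std::string m_text;
};
```

The caller decides the cost: assigning from an lvalue copies once, and assigning from a temporary or a `std::move`d object does not copy the string at all.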
What if the object is changed from elsewhere?
void f(const SomeType& s);
void g(const SomeType s);
int main() {
SomeType s;
std::thread t([&](){ /* s is non-const here, and we can modify it */ });
// we get a const reference to the object which we see as const,
// but others might not. So they can modify it.
f(s);
// we get a const *copy* of the object,
// so what anyone else might do to the original doesn't matter
g(s);
t.join(); // a joinable std::thread must be joined before it is destroyed
}
What if the object is const, but has mutable members? Then you can still modify the object, and so it's very important whether you have a copy or a reference to the original.
What if the object contains a pointer to another object? If s is const, the pointer will be const, but what it points to is not affected by the constness of s. But creating a copy will (hopefully) give us a deep copy, so we get our own (const) object with a separate (const) pointer pointing to a separate (non-const) object.
There are a number of cases where a const copy is different than a const reference.
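The pointer case is easy to demonstrate. In this sketch `Handle`, `poke_ref` and `poke_copy` are hypothetical names; note that `Handle` uses the compiler-generated copy, which is shallow, so the "hopefully deep" copy mentioned above does not happen here:

```cpp
// Hypothetical type holding a raw pointer to data it does not own.
struct Handle {
    int* target;
};

// h is const, so h.target cannot be reseated,
// but the int it points to is not const and can still be written.
void poke_ref(const Handle& h) { *h.target = 42; }

// The const copy is shallow: it points at the very same int.
void poke_copy(const Handle h) { *h.target = 7; }

int value_after_poke_ref() {
    int value = 0;
    Handle h{&value};
    poke_ref(h);
    return value;
}

int value_after_poke_copy() {
    int value = 0;
    Handle h{&value};
    poke_copy(h);
    return value;
}
```

Both functions change the caller's `value` despite the `const`. A by-value parameter only isolates the callee if the type's copy constructor actually duplicates the pointed-to data.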

c++ unique_ptr argument passing

Suppose I have the following code:
class B { /* */ };
class A {
vector<B*> vb;
public:
void add(B* b) { vb.push_back(b); }
};
int main() {
A a;
B* b(new B());
a.add(b);
}
Suppose that in this case, all raw pointers B* can be handled through unique_ptr<B>.
Surprisingly, I wasn't able to find how to convert this code using unique_ptr. After a few tries, I came up with the following code, which compiles:
class A {
vector<unique_ptr<B>> vb;
public:
void add(unique_ptr<B> b) { vb.push_back(move(b)); }
};
int main() {
A a;
unique_ptr<B> b(new B());
a.add(move(b));
}
So my simple question: is this the way to do it and in particular, is move(b) the only way to do it? (I was thinking of rvalue references but I don't fully understand them.)
And if you have a link with complete explanations of move semantics, unique_ptr, etc. that I was not able to find, don't hesitate to share it.
EDIT According to http://thbecker.net/articles/rvalue_references/section_01.html, my code seems to be OK.
Actually, std::move is just syntactic sugar. With object x of class X, move(x) is just the same as:
static_cast <X&&>(x)
These 2 move functions are needed because casting to an rvalue reference:
prevents function "add" from passing by value
makes push_back use the default move constructor of B
Apparently, I do not need the second std::move in my main() if I change my "add" function to pass by reference (ordinary lvalue ref).
I would like some confirmation of all this, though...
I am somewhat surprised that this is not answered very clearly and explicitly here, nor on any place I easily stumbled upon. While I'm pretty new to this stuff, I think the following can be said.
The situation is a calling function that builds a unique_ptr<T> value (possibly by casting the result from a call to new), and wants to pass it to some function that will take ownership of the object pointed to (by storing it in a data structure for instance, as happens here into a vector). To indicate that ownership has been obtained by the caller, and that it is ready to relinquish it, passing a unique_ptr<T> value is appropriate. There are, as far as I can see, three reasonable modes of passing such a value.
1. Passing by value, as in add(unique_ptr<B> b) in the question.
2. Passing by non-const lvalue reference, as in add(unique_ptr<B>& b)
3. Passing by rvalue reference, as in add(unique_ptr<B>&& b)
Passing by const lvalue reference would not be reasonable, since it does not allow the called function to take ownership (and const rvalue reference would be even more silly than that; I'm not even sure it is allowed).
As far as valid code goes, options 1 and 3 are almost equivalent: they force the caller to write an rvalue as argument to the call, possibly by wrapping a variable in a call to std::move (if the argument is already an rvalue, i.e., unnamed as in a cast from the result of new, this is not necessary). In option 2 however, passing an rvalue (possibly from std::move) is not allowed, and the function must be called with a named unique_ptr<T> variable (when passing a cast from new, one has to assign to a variable first).
When std::move is indeed used, the variable holding the unique_ptr<T> value in the caller is conceptually dereferenced (converted to an rvalue, or rather cast to an rvalue reference), and ownership is given up at this point. In option 1 the dereferencing is real, and the value is moved to a temporary that is passed to the called function (if the called function were to inspect the variable in the caller, it would find that it already holds a null pointer). Ownership has been transferred, and there is no way the called function could decide not to accept it (doing nothing with the argument causes the pointed-to value to be destroyed at function exit; calling the release method on the argument would prevent this, but would just result in a memory leak). Surprisingly, options 2 and 3 are semantically equivalent during the function call, although they require different syntax for the caller. If the called function passes the argument on to another function taking an rvalue (such as the push_back method), std::move must be inserted in both cases, which will transfer ownership at that point. Should the called function forget to do anything with the argument, then the caller will find himself still owning the object if he holds a name for it (as is obligatory in option 2); this is so even in option 3, in spite of the fact that the function prototype asked the caller to agree to the release of ownership (by either calling std::move or supplying a temporary). In summary, the three options do the following:
1. Forces the caller to give up ownership, and is sure to actually claim it.
2. Forces the caller to possess ownership, and to be prepared (by supplying a non-const reference) to give it up; however this is not explicit (no call of std::move is required or even allowed), nor is taking away ownership assured. I would consider this method rather unclear in its intention, unless it is explicitly intended that taking ownership or not is at the discretion of the called function (some use can be imagined, but callers need to be aware).
3. Forces the caller to explicitly indicate giving up ownership, as in 1 (but the actual transfer of ownership is delayed until after the moment of the function call).
Option 3 is fairly clear in its intention; provided ownership is actually taken, it is for me the best solution. It is slightly more efficient than 1 in that no pointer values are moved to temporaries (the calls to std::move are in fact just casts and cost nothing); this might be especially relevant if the pointer is handed through several intermediate functions before its contents are actually moved.
Here is some code to experiment with.
#include <iostream>
#include <memory>
#include <vector>
class B
{
unsigned long val;
public:
B(const unsigned long& x) : val(x)
{ std::cout << "storing " << x << std::endl;}
~B() { std::cout << "dropping " << val << std::endl;}
};
typedef std::unique_ptr<B> B_ptr;
class A {
std::vector<B_ptr> vb;
public:
void add(B_ptr&& b)
{ vb.push_back(std::move(b)); } // or even better use emplace_back
};
void f() {
A a;
B_ptr b(new B(123)),c;
a.add(std::move(b));
std::cout << "---" <<std::endl;
a.add(B_ptr(new B(4567))); // unnamed argument does not need std::move
}
As written, output is
storing 123
---
storing 4567
dropping 123
dropping 4567
Note that values are destroyed in the order in which they are stored in the vector. Try changing the prototype of the method add (adapting other code if necessary to make it compile), and whether or not it actually passes on its argument b. Several permutations of the lines of output can be obtained.
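The claim that only pass-by-value guarantees the transfer can be checked with callees that deliberately do nothing with their argument. This is a sketch; the three `by_*` functions and the `owns_after_*` helpers are hypothetical names, and `B` is reduced to an empty struct:

```cpp
#include <memory>
#include <utility>

struct B {};

// Three callee signatures that all ignore their argument.
void by_value(std::unique_ptr<B>) {}
void by_lvalue_ref(std::unique_ptr<B>&) {}
void by_rvalue_ref(std::unique_ptr<B>&&) {}

// Each helper reports whether the caller still owns the B after the call.
bool owns_after_by_value() {
    std::unique_ptr<B> b(new B());
    by_value(std::move(b));       // moved into the parameter, destroyed on return
    return b != nullptr;
}

bool owns_after_by_lvalue_ref() {
    std::unique_ptr<B> b(new B());
    by_lvalue_ref(b);             // no std::move needed or allowed
    return b != nullptr;
}

bool owns_after_by_rvalue_ref() {
    std::unique_ptr<B> b(new B());
    by_rvalue_ref(std::move(b));  // only a cast: nothing has been moved yet
    return b != nullptr;
}
```

Only the by-value version leaves the caller's pointer null. With either reference form, the caller keeps the object whenever the callee forgets to move from its parameter.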
Yes, this is how it should be done. You are explicitly transferring ownership from main to A. This is basically the same as your previous code, except it's more explicit and vastly more reliable.
So my simple question: is this the way to do it and in particular, is this "move(b)" the only way to do it? (I was thinking of rvalue references but I don't fully understand it so...)
And if you have a link with complete explanations of move semantics, unique_ptr... that I was not able to find, don't hesitate.
Shameless plug, search for the heading "Moving into members". It describes exactly your scenario.
Your code in main could be simplified a little, since C++14:
a.add( make_unique<B>() );
where you can put arguments for B's constructor inside the inner parentheses.
You could also consider a class member function that takes ownership of a raw pointer:
void take(B *ptr) { vb.emplace_back(ptr); }
and the corresponding code in main would be:
a.take( new B() );
Another option is to use perfect forwarding for adding vector members:
template<typename... Args>
void emplace(Args&&... args)
{
vb.emplace_back( std::make_unique<B>(std::forward<Args>(args)...) );
}
and the code in main:
a.emplace();
where, as before, you could put constructor arguments for B inside the parentheses.

Why do std::function instances have a default constructor?

This is probably a philosophical question, but I ran into the following problem:
If you define an std::function, and you don't initialize it correctly, your application will crash, like this:
typedef std::function<void(void)> MyFunctionType;
MyFunctionType myFunction;
myFunction();
If the function is passed as an argument, like this:
void DoSomething (MyFunctionType myFunction)
{
myFunction();
}
Then, of course, it also crashes. This means that I am forced to add checking code like this:
void DoSomething (MyFunctionType myFunction)
{
if (!myFunction) return;
myFunction();
}
Requiring these checks gives me a flash-back to the old C days, where you also had to check all pointer arguments explicitly:
void DoSomething (Car *car, Person *person)
{
if (!car) return; // In real applications, this would be an assert of course
if (!person) return; // In real applications, this would be an assert of course
...
}
Luckily, we can use references in C++, which saves me from writing these checks (assuming that the caller didn't pass the contents of a nullptr to the function):
void DoSomething (Car &car, Person &person)
{
// I can assume that car and person are valid
}
So, why do std::function instances have a default constructor? Without default constructor you wouldn't have to add checks, just like for other, normal arguments of a function.
And in those 'rare' cases where you want to pass an 'optional' std::function, you can still pass a pointer to it (or use boost::optional).
True, but this is also true for other types. E.g. if I want my class to have an optional Person, then I make my data member a Person-pointer. Why not do the same for std::functions? What is so special about std::function that it can have an 'invalid' state?
It does not have an "invalid" state. It is no more invalid than this:
std::vector<int> aVector;
aVector[0] = 5;
What you have is an empty function, just like aVector is an empty vector. The object is in a very well-defined state: the state of not having data.
Now, let's consider your "pointer to function" suggestion:
void CallbackRegistrar(..., std::function<void()> *pFunc);
How do you have to call that? Well, here's one thing you cannot do:
void CallbackFunc();
CallbackRegistrar(..., CallbackFunc);
That's not allowed because CallbackFunc is a function, while the parameter type is a std::function<void()>*. Those two are not convertible, so the compiler will complain. So in order to do the call, you have to do this:
void CallbackFunc();
CallbackRegistrar(..., new std::function<void()>(CallbackFunc));
You have just introduced new into the picture. You have allocated a resource; who is going to be responsible for it? CallbackRegistrar? Obviously, you might want to use some kind of smart pointer, so you clutter the interface even more with:
void CallbackRegistrar(..., std::shared_ptr<std::function<void()>> pFunc);
That's a lot of API annoyance and cruft, just to pass a function around. The simplest way to avoid this is to allow std::function to be empty. Just like we allow std::vector to be empty. Just like we allow std::string to be empty. Just like we allow std::shared_ptr to be empty. And so on.
To put it simply: std::function contains a function. It is a holder for a callable type. Therefore, there is the possibility that it contains no callable type.
Actually, your application should not crash.
§ 20.8.11.1 Class bad_function_call [func.wrap.badcall]
1/ An exception of type bad_function_call is thrown by function::operator() (20.8.11.2.4) when the function wrapper object has no target.
The behavior is perfectly specified.
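A short sketch of what that looks like in code; `throws_when_called` is a hypothetical helper written for this illustration:

```cpp
#include <functional>

// Returns true if invoking `f` throws std::bad_function_call,
// which is what an empty std::function does when called.
bool throws_when_called(const std::function<void()>& f) {
    try {
        f();
    } catch (const std::bad_function_call&) {
        return true;
    }
    return false;
}
```

If nothing catches the exception, std::terminate is called, which is presumably the "crash" described in the question.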
One of the most common use cases for std::function is to register callbacks, to be called when certain conditions are met. Allowing for uninitialized instances makes it possible to register callbacks only when needed, otherwise you would be forced to always pass at least some sort of no-op function.
The answer is probably historical: std::function is meant as a replacement for function pointers, and function pointers had the capability to be NULL. So, when you want to offer easy compatibility to function pointers, you need to offer an invalid state.
The identifiable invalid state is not really necessary since, as you mentioned, boost::optional does that job just fine. So I'd say that std::function's are just there for the sake of history.
There are cases where you cannot initialize everything at construction (for example, when a parameter depends on the effect of another construction that in turn depends on the effect of the first ...).
In these cases you necessarily have to break the loop by admitting an identifiable invalid state that is corrected later.
So you construct the first as "null", construct the second element, and reassign the first.
You can actually avoid the checks if, wherever a function is used, you guarantee that the constructor of the object that embeds it always returns after a valid reassignment.
In the same way that you can add a nullstate to a functor type that doesn't have one, you can wrap a functor with a class that does not admit a nullstate. The former requires adding state, the latter does not require new state (only a restriction). Thus, while i don't know the rationale of the std::function design, it supports the most lean & mean usage, no matter what you want.
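A minimal sketch of such a restricting wrapper, assuming a hypothetical class named `Callback` that accepts only `void()` targets:

```cpp
#include <functional>
#include <stdexcept>
#include <utility>

// Hypothetical wrapper: has no default constructor and rejects empty targets,
// so code receiving a Callback does not need to test it before calling.
class Callback {
public:
    explicit Callback(std::function<void()> fn) : m_fn(std::move(fn)) {
        if (!m_fn) throw std::invalid_argument("Callback requires a target");
    }

    void operator()() const { m_fn(); }

private:
    std::function<void()> m_fn;
};
```

The check happens once, at construction, instead of at every call site. One caveat of this sketch is that a moved-from `Callback` may still end up empty, so it only protects objects that have not been moved from.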
If you just use std::function for callbacks, you can use a simple template helper function that forwards its arguments to the handler if it is not empty:
template <typename Callback, typename... Ts>
void SendNotification(const Callback & callback, Ts&&... vs)
{
if (callback)
{
callback(std::forward<Ts>(vs)...);
}
}
And use it in the following way:
std::function<void(int, double)> myHandler;
...
SendNotification(myHandler, 42, 3.15);