Passing functor object by value vs by reference (C++)

Compare generic integration functions:
template <class F> double integrate(F integrand);
with
template <class F> double integrate(F& integrand);
or
template <class F> double integrate(const F& integrand);
What are the pros and cons of each? STL uses the first approach (pass by value), does it mean it's the most universal one?

Function objects are usually small, so I don't think passing them by value will hurt performance noticeably (compare it to the work the function does in its body). If you pass by value, you can also gain from code analysis, because a by-value parameter is local to the function, so the optimizer can tell when a load from a data member of the functor can be omitted.
If the functor is stateless, passing it as an argument costs nothing at all: the padding byte that the functor occupies doesn't have to have any particular value (in the Itanium ABI used by GCC, at least). When using references, you always have to pass an address.
The last one (const T&) has the drawback that in C++03 it doesn't work for raw functions, because in C++03 the program is ill-formed if you try to apply const to a function type (and this is an SFINAE case). More recent implementations instead ignore const when applied to function types.
The second one (T&) has the obvious drawback that you cannot pass temporary functors.
Long story short, I would generally pass them by value, unless I see a clear benefit in concrete cases.
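To make the comparison concrete, here is a minimal sketch of the three signatures side by side. The midpoint rule on [0, 1], the step count, and every name in it (midpoint_sum, integrate_by_value, Square, cube, and so on) are assumptions made up for illustration; they are not from the question.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical shared helper: midpoint rule on [0, 1] with n steps.
template <class F>
double midpoint_sum(F& f, int n = 1000) {
    const double h = 1.0 / n;
    double sum = 0.0;
    for (int i = 0; i < n; ++i) sum += f((i + 0.5) * h);
    return sum * h;
}

// The three parameter-passing styles under discussion.
template <class F> double integrate_by_value(F f)       { return midpoint_sum(f); }
template <class F> double integrate_by_ref(F& f)        { return midpoint_sum(f); }
template <class F> double integrate_by_cref(const F& f) { return midpoint_sum(f); }

struct Square {
    double operator()(double x) const { return x * x; }
};
double cube(double x) { return x * x * x; }

void demo_signatures() {
    Square sq;
    // By value accepts lvalues, temporaries, and plain functions (decayed to pointers).
    assert(std::fabs(integrate_by_value(sq) - 1.0 / 3.0) < 1e-6);
    assert(std::fabs(integrate_by_value(Square()) - 1.0 / 3.0) < 1e-6);
    assert(std::fabs(integrate_by_value(cube) - 0.25) < 1e-6);

    // By const reference also accepts temporaries, but needs a const operator().
    assert(std::fabs(integrate_by_cref(Square()) - 1.0 / 3.0) < 1e-6);

    // By non-const reference only binds to lvalues.
    assert(std::fabs(integrate_by_ref(sq) - 1.0 / 3.0) < 1e-6);
    // integrate_by_ref(Square());  // does not compile: a temporary cannot bind to F&
}
```

Calling demo_signatures() from main exercises each form; the commented-out line is the case the non-const reference version rejects.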

STL uses the first approach (pass by value)
Sure, the standard libraries pass iterators and functors by value. They are assumed (rightly or wrongly) to be cheap to copy, and this means that if you write an iterator or a functor that is expensive to copy, you might have to find a way to optimize that later.
But that is just for the purposes for which the standard libraries use functors - mostly they're predicates, although there are also things like std::transform. If you're integrating a function, that suggests some kind of mathematics libraries, in which case I suppose you might be much more likely to deal with functions that carry a lot of state. You could for example have a class representing nth order polynomials, with n+1 coefficients as non-static data members.
In that case, a const reference might be better. When using such a functor in standard algorithms like transform, you might wrap it in a little class that performs indirection through a pointer, to ensure that it remains cheap to copy.
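As a rough sketch of that idea, std::cref from <functional> can play the role of the "little class that performs indirection". The Polynomial type and its copy counter below are hypothetical and exist only to make the copies visible.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical polynomial functor; counts how often it is copied.
struct Polynomial {
    std::vector<double> coeffs;  // coeffs[i] multiplies x^i
    static int copies;

    explicit Polynomial(std::vector<double> c) : coeffs(std::move(c)) {}
    Polynomial(const Polynomial& other) : coeffs(other.coeffs) { ++copies; }

    double operator()(double x) const {
        double result = 0.0;
        // Horner's scheme, highest coefficient first.
        for (std::size_t i = coeffs.size(); i-- > 0;) result = result * x + coeffs[i];
        return result;
    }
};
int Polynomial::copies = 0;

void demo_polynomial() {
    const Polynomial p(std::vector<double>{1.0, 2.0, 3.0});  // 1 + 2x + 3x^2
    const std::vector<double> xs{0.0, 1.0, 2.0};
    std::vector<double> ys(xs.size());

    Polynomial::copies = 0;
    std::transform(xs.begin(), xs.end(), ys.begin(), p);  // copies the functor at least once
    assert(Polynomial::copies >= 1);

    Polynomial::copies = 0;
    std::transform(xs.begin(), xs.end(), ys.begin(), std::cref(p));  // copies only the wrapper
    assert(Polynomial::copies == 0);
    assert(ys[0] == 1.0 && ys[1] == 6.0 && ys[2] == 17.0);
}
```

The wrapper is the size of a pointer, so the algorithm can copy it freely while the coefficients stay where they are.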
Taking a non-const reference is potentially annoying to users, since it stops them passing in temporaries.

Given the context, F is expected to be a "callable object" (something like a free function or a class with an operator() defined).
The second version cannot bind to temporaries, so a functor constructed at the call site is rejected. (A free function name is an lvalue, so it does bind.)
The third assumes F::operator() is const, which may not be the case if the call needs to alter the state of F.
The first operates on its own copy, but requires F to be copyable.
None of the three is "universal", but the first is the most likely to work in the common cases.
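A small sketch of the const and copy points, assuming a made-up CountingSquare functor whose call operator is non-const. The sample_* functions are illustrative stand-ins for an integrator; only the parameter passing matters here.

```cpp
#include <cassert>

// Hypothetical functor whose call operator is non-const because it counts calls.
struct CountingSquare {
    int calls = 0;
    double operator()(double x) { ++calls; return x * x; }
};

// Illustrative two-point samplers, one per parameter-passing style.
template <class F> double sample_by_value(F f)       { return 0.5 * (f(0.0) + f(1.0)); }
template <class F> double sample_by_ref(F& f)        { return 0.5 * (f(0.0) + f(1.0)); }
template <class F> double sample_by_cref(const F& f) { return 0.5 * (f(0.0) + f(1.0)); }

void demo_counting() {
    CountingSquare f;
    assert(sample_by_value(f) == 0.5);
    assert(f.calls == 0);  // the copy was called, not f
    assert(sample_by_ref(f) == 0.5);
    assert(f.calls == 2);  // f itself was called twice
    // sample_by_cref(f);  // does not compile: operator() is not const
}
```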

Related

Why does std::promise::set_value() have two overloads

For the case when std::promise<> is instantiated with a non reference type, why does the set_value() method have two distinct overloads as opposed to one pass by value overload?
so instead of the following two
std::promise::set_value(const Type& value);
std::promise::set_value(Type&& value);
just one
std::promise::set_value(Type value);
This has at least the following two benefits
Enable users to move the value into the promise/future when they want, since the API argument is a value type. When copying is not supported it is obvious that the value is going to be copied. Further when the expression being passed into the function is a prvalue it can be completely elided easily by the compiler (especially so in C++17)
It conveys the point that the class requires a copy of the value a lot better and succinctly than two overloads which accomplish the same task.
I was making a similar API (as far as ownership is concerned) and I was wondering what benefits the design decision employed by the C++ standard library has as opposed to what I mentioned.
Thanks!
Passing an argument by value if it needs to be "transferred" unconditionally and then moving from it is a neat little trick, but it does incur at least one mandatory move. Therefore, this trick is best in leaf code that is used rarely or only in situations that are completely under the control of the author.
By contrast, a core library whose users and uses are mostly unknown to the author should not unnecessarily add avoidable costs, and providing two separate reference parameter overloads is more efficient.
In a nutshell, the more leaf and user you are, the more you should favour simplicity over micro-optimizations, and the more library you are, the more you should go out of your way to be general and efficient.
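The extra move can be made visible with a counting type. This is only a sketch: Tracked, ByValueSink and OverloadSink are hypothetical names, and the sinks store by assignment, which is not necessarily how std::promise is implemented.

```cpp
#include <cassert>
#include <utility>

// Hypothetical payload type that counts copy and move operations.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
    Tracked& operator=(const Tracked&) { ++copies; return *this; }
    Tracked& operator=(Tracked&&) noexcept { ++moves; return *this; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

struct ByValueSink {  // one signature: take by value, then move into place
    Tracked stored;
    void set(Tracked v) { stored = std::move(v); }
};

struct OverloadSink {  // two signatures, as set_value() has
    Tracked stored;
    void set(const Tracked& v) { stored = v; }
    void set(Tracked&& v) { stored = std::move(v); }
};

void demo_sinks() {
    Tracked t, u;
    ByValueSink a;
    OverloadSink b;

    Tracked::copies = Tracked::moves = 0;
    a.set(t);  // copy into the parameter, then move into place
    assert(Tracked::copies == 1 && Tracked::moves == 1);

    Tracked::copies = Tracked::moves = 0;
    b.set(t);  // one copy, no extra move
    assert(Tracked::copies == 1 && Tracked::moves == 0);

    Tracked::copies = Tracked::moves = 0;
    a.set(std::move(t));  // move into the parameter, then move into place
    assert(Tracked::copies == 0 && Tracked::moves == 2);

    Tracked::copies = Tracked::moves = 0;
    b.set(std::move(u));  // a single move
    assert(Tracked::copies == 0 && Tracked::moves == 1);
}
```

For a type that is cheap to move the difference is small; for a library that cannot know its users' types, the overload pair avoids it entirely.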

Pass-by-value, overloading or perfect forwarding for class type parameters [closed]

Every time I think about the design of my class I ask myself the same questions: should I pass by value, should I overload on const lvalue reference and rvalue reference, or should I use perfect forwarding?
I often pass by value when the types are cheap to move, and I almost never use perfect forwarding. I overload when there is only 1 parameter, maybe 2 if I really need the perf.
What do you do?
Do you have easy rules of thumb to decide how to pass arguments, for member and non-member functions but also for constructors and all the copy/assignment guys?
Thanks.
So all of the following is opinion-based, but these are the rules I tend to follow when thinking about an API. As always in C++, there are many ways to accomplish the same thing, and people will have different views on exactly what is best.
There are three kinds of parameters we need to think about: in parameters, out parameters, and in/out parameters. The latter two are simple, so we'll cover them first.
Out parameters
Don't use them. Seriously. If your function is going to return a new object, then return it by value. If you're going to return multiple new objects, then return them by value packed in a std::tuple (or std::pair). The caller can use std::tie (or structured bindings in C++17) to unpack them again. This gives the caller the maximum flexibility, and with RVO it's no less efficient than any other method.
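For instance, a function that would otherwise need two out parameters can return a pair and let the caller unpack it with std::tie. The function name min_and_max is made up for this sketch, and it assumes a non-empty input.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical helper: returns both results by value instead of using out parameters.
// Assumes v is non-empty.
std::pair<int, int> min_and_max(const std::vector<int>& v) {
    auto result = std::minmax_element(v.begin(), v.end());
    return std::make_pair(*result.first, *result.second);
}

void demo_min_and_max() {
    int lo = 0, hi = 0;
    std::tie(lo, hi) = min_and_max(std::vector<int>{3, -1, 7, 4});  // unpack into locals
    assert(lo == -1 && hi == 7);
}
```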
In/out parameters
For functions which modify an already-constructed value, use a mutable lvalue reference, i.e. T&. This will prevent callers from passing a temporary, but that's actually a good thing: what would be the point of modifying something you're just going to throw away? Note that some style guides (notably Google's, but also Qt) advocate using a raw pointer (T*) in this situation, so that it's obvious at the call site that the argument will be modified (because you need to say f(&arg)), but I personally don't find this convincing.
In parameters
For pure input parameters, where the function will not modify the argument passed to it, things are a tiny bit more complicated. In general, the best advice is to pass by lvalue-reference-to-const, that is, const T&. This will allow the caller to pass both lvalues and rvalues. However, for small objects (sizeof(T) <= sizeof(void*)), such as int, it can be more efficient to pass by value instead.
An exception though is if you're going to take a copy of a passed argument, for example in a constructor; in this case, it's better to take the parameter by value, because the compiler can turn this into a move for rvalues.
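A minimal sketch of that exception, using a hypothetical Person class that always keeps its own copy of the name:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class that always stores its own copy of the argument.
class Person {
public:
    // Take by value, then move the parameter into the member.
    explicit Person(std::string name) : name_(std::move(name)) {}
    const std::string& name() const { return name_; }
private:
    std::string name_;
};

void demo_person() {
    std::string original = "Ada";
    Person from_lvalue(original);              // copies into the parameter, then moves
    Person from_rvalue(std::string("Grace"));  // no copy of the character data is needed
    assert(original == "Ada");                 // the caller's string is untouched
    assert(from_lvalue.name() == "Ada" && from_rvalue.name() == "Grace");
}
```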
What about T&&?
There are two circumstances where it's appropriate to use arguments of the form T&&. The first is templated forwarding functions where the type of the parameter is the template type, i.e.
template <typename T>
decltype(auto) func(T&& arg) {
    return other_func(std::forward<T>(arg));
}
In this case, although the parameter looks as if it's an rvalue reference, it's actually a forwarding reference (sometimes called a universal reference). Only use a forwarding reference to pass things on to another function via std::forward; if you care about the value category of the argument, then T&& is not appropriate.
The second case is for real rvalue references, where the argument type is not a template parameter. In a very limited number of cases, it can be appropriate to overload on both the const arg& and arg&& forms, to avoid an unnecessary move. This should only be necessary in performance-critical situations in which you're going to copy or move the argument somewhere (for example, std::vector does this for its push_back() method) -- in general I would say it's better to take the argument by value and then move it into place.
Interfaces should express intent.
Optimisations should happen when users complain.
To me, the following interfaces have different meanings:
void foo(thing const& t); // "I won't modify your t. If it's copyable, I might copy it, but that's none of your concern."
void foo(thing t); // "Pass me a copy if you wish, or a temporary, or move your thing into me. What I do with t is up to me".
void foo(thing& t); // "t will be modified."
What follows applies only to the "default" case: "normal", not really big types ("normal sized" vectors, strings, etc.), nothing that looks very expensive in the first place.
In short:
Do whatever you like but be consistent.
There is no best practice which can guarantee you the best performance.
Some detail to this:
I was once at a conference where 3 popular C++ people (Herb Sutter, Andrei Alexandrescu and Scott Meyers) discussed this problem, and each had a different opinion on the best "default" behavior:
all by const reference, or by perfect forwarding, or just by value.
So you won't get a perfect answer here. Compilers also can optimize differently etc.
Here is my personal opinion on this:
I prefer the by-value approach, and if I later notice something becoming slow I start to optimize. I assume modern compilers are smart enough to avoid unnecessary copies, and maybe also to move the object when they see it's no longer used afterwards. I try to keep Return Value Optimization in mind so that the compiler can optimize more easily where needed (either return only one object or only r-values).
Though I have heard that this behavior and the optimization potential vary from compiler to compiler. So, as said before: use what you prefer and stick to one way so it's consistent.

why function objects should be pass-by-value

I have just read the classic book "Effective C++, 3rd Edition", and in item 20 the author concludes that built-in types, STL iterators and function object types are more appropriate for pass-by-value. I could well understand the reason for built-in and iterators types, but why should the function object be pass-by-value, as we know it is class-type anyway?
In a typical case, a function object will have little or (more often) no persistent state. In such a case, passing by value may not require actually passing anything at all: the "value" that's passed is basically little or nothing more than a placeholder for "this is the object".
Given the small amount of code in many function objects, that leads to a further optimization: it's often fairly easy for the compiler to expand the code for the function object inline, so no parameters get passed, and no function call is involved at all.
A compiler may be able to do the same when you pass a pointer or reference instead, but it's not quite as easy. It's a lot more common that you'll end up with an object being created, its address passed, and then the function call operator for that object being invoked via that pointer.
Edit: It's probably also worth mentioning that the same applies to lambdas, since they're really just function objects in disguise. You don't know the name of the class, but they create a class in the immediately surrounding scope that overloads the function call operator, which is what gets invoked when you "call" the lambda. [Thanks @Mark Garcia.]
The #1 reason to pass function objects by value is because the standard library requires that function objects you pass to its algorithms be copyable. C++11 §25.1/10:
[ Note: Unless otherwise specified, algorithms that take function objects as arguments are permitted to copy
those function objects freely. Programmers for whom object identity is important should consider using a
wrapper class that points to a noncopied implementation object such as reference_wrapper<T> (20.8.3),
or some equivalent solution. —end note ]
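What the note means in practice can be sketched with std::for_each and a stateful functor. Accumulator is a hypothetical type; std::ref is the standard helper that produces a reference_wrapper.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stateful functor: accumulates everything it is called with.
struct Accumulator {
    int total = 0;
    void operator()(int x) { total += x; }
};

void demo_accumulator() {
    const std::vector<int> v{1, 2, 3, 4};

    Accumulator by_copy;
    std::for_each(v.begin(), v.end(), by_copy);  // the algorithm works on its own copy
    assert(by_copy.total == 0);

    Accumulator shared;
    std::for_each(v.begin(), v.end(), std::ref(shared));  // calls reach `shared` itself
    assert(shared.total == 10);
}
```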
The other answers do a great job of explaining the rationale.
From Effective STL (since you seem to like Scott Meyers), Item 38: Design functor classes for pass-by-value.
"In both C and C++ function pointers are passed by value. STL Function objects are modeled after function pointers, so the convention in the STL is that function objects, too, are passed by value when passed to and from functions."
This has some benefits and some implications. As @Jerry Coffin said, the compiler can make optimizations such as inlining the code to avoid function calls (the functor's operator() has to be inlinable, for example defined in the class body). A good example is the qsort vs std::sort performance comparison, where std::sort with inline functors outperforms qsort by a lot; Effective STL discusses this extensively and mentions it in several chapters.
This also has several implications: since function objects are passed and returned by value, you have to make sure your objects have well-defined copy semantics, are small (otherwise copying could get expensive), and are monomorphic (since passing polymorphic objects by value may result in object slicing).
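The slicing problem is easy to reproduce. In this sketch Scale and Doubler are made-up names for a small polymorphic functor hierarchy:

```cpp
#include <cassert>

// Hypothetical polymorphic functor hierarchy.
struct Scale {
    virtual ~Scale() = default;
    virtual double operator()(double x) const { return x; }  // identity
};
struct Doubler : Scale {
    double operator()(double x) const override { return 2.0 * x; }
};

double apply_by_value(Scale f, double x) { return f(x); }        // copies only the Scale part
double apply_by_cref(const Scale& f, double x) { return f(x); }  // keeps the dynamic type

void demo_slicing() {
    Doubler d;
    assert(apply_by_value(d, 3.0) == 3.0);  // sliced: Scale::operator() runs
    assert(apply_by_cref(d, 3.0) == 6.0);   // virtual dispatch reaches Doubler
}
```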

What is distinctive for functors compared to normal functions taking values as arguments

I am new to the concept, but from what I have read, the advantage of functors is that they can store values internally and initialize them at construction. Normal functions seem to work in the same fashion, except that they take all their arguments at once at the function call. Most probably I am wrong in some way, but where is the trick, and what is the benefit of functors compared to normal functions?
The core difference is that a functor defines a type not a function. Even stateless functors (without any attached data) can take advantage of this. For example consider the use of std::less inside a sorting algorithm:
template <typename Iterator, typename Comparator>
void sort(Iterator begin, Iterator end, Comparator c) {
    // ...
    if (c(*begin, *end)) { /* ... */ }
    // ...
}
Called as sort(v.begin(), v.end(), std::less<int>());. When the function is called, an instance of std::less<int> is created and passed to the template. Because it is stateless, the cost of passing the function is almost nothing. Inside the function, the call c(a,b) is determined to be a call to c.operator()(a,b), and the compiler knows the type. It can efficiently inline the call (which in this case is simple enough) and substitute it by a single compare instruction.
On the other hand, the equivalent C function qsort takes a function pointer (you cannot pass functions by value). Inside qsort, the compiler does not know what the function called is, and it cannot inline it, so it must perform a function call for each comparison.
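The two call styles look like this side by side. This is only a sketch of the interfaces: compare_ints and the two wrapper functions are hypothetical names, and it makes no timing claim.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <functional>
#include <vector>

// C-style comparator: reached through a function pointer on every comparison.
int compare_ints(const void* a, const void* b) {
    const int x = *static_cast<const int*>(a);
    const int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}

std::vector<int> sorted_with_qsort(std::vector<int> v) {
    std::qsort(v.data(), v.size(), sizeof(int), compare_ints);
    return v;
}

// The comparator's type is a template argument, so the call can be inlined.
std::vector<int> sorted_with_sort(std::vector<int> v) {
    std::sort(v.begin(), v.end(), std::less<int>());
    return v;
}

void demo_sorting() {
    const std::vector<int> input{5, 1, 4, 2, 3};
    const std::vector<int> expected{1, 2, 3, 4, 5};
    assert(sorted_with_qsort(input) == expected);
    assert(sorted_with_sort(input) == expected);
}
```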
Functors serve both to add extra information that can later be used at the place of call (this is impossible with a plain function) and to provide information such as what needs to be called (the same behavior can be obtained otherwise, but with a hit on performance) or other attached information (the type can have nested types/typedefs, information for traits inspection...).
Normal functions, free-standing or member, only have their arguments which will be passed when the function is called. So there is no way to pass extra data to the function.
This is different with a functor. A functor is an instance of an object, and as such can indeed store data passed to its constructor (which you use when passing the functor).
With C++11 things are muddled up a bit, as lambdas can also "store" (not technically correct word) values by using captures. Or by using std::bind which allows you to bind values as arguments when the callable object is actually called.
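A short sketch of both ways of carrying extra data into a call: a functor that stores a value given to its constructor, and a lambda that captures the same value. GreaterThan and the two counting functions are made-up names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical functor: the limit is stored at construction, used at call time.
struct GreaterThan {
    int limit;
    explicit GreaterThan(int l) : limit(l) {}
    bool operator()(int x) const { return x > limit; }
};

std::ptrdiff_t count_with_functor(const std::vector<int>& v, int limit) {
    return std::count_if(v.begin(), v.end(), GreaterThan(limit));
}

// The lambda captures `limit` by value; the compiler generates a similar class.
std::ptrdiff_t count_with_lambda(const std::vector<int>& v, int limit) {
    return std::count_if(v.begin(), v.end(), [limit](int x) { return x > limit; });
}

void demo_counting_if() {
    const std::vector<int> v{1, 5, 8, 3};
    assert(count_with_functor(v, 3) == 2);
    assert(count_with_lambda(v, 3) == 2);
}
```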

"const T &arg" vs. "T arg"

Which of the following examples is the better way of declaring the following function and why?
void myFunction (const int &myArgument);
or
void myFunction (int myArgument);
Use const T & arg if sizeof(T)>sizeof(void*) and use T arg if sizeof(T) <= sizeof(void*)
They do different things. const T& makes the function take a reference to the variable. T arg, on the other hand, calls the copy constructor of the object and passes the copy.
If the copy constructor is not accessible (e.g. it's private), T arg won't work:
class Demo {
public:
    Demo() {}
private:
    Demo(const Demo& t) { }
};
void foo(Demo t) { }
int main() {
    Demo t;
    foo(t); // error: cannot copy `t`.
    return 0;
}
For small values like primitive types (where all matters is the contents of the object, not the actual referential identity; say, it's not a handle or something), T arg is generally preferred. For large objects and objects that you can't copy and/or preserving referential identity is important (regardless of the size), passing the reference is preferred.
Another advantage of T arg is that since it's a copy, the callee cannot maliciously alter the original value. It can freely mutate the variable like any local variables to do its work.
Taken from Move constructors. I like the easy rules
1. If the function intends to change the argument as a side effect, take it by reference/pointer to a non-const object. Example:
void Transmogrify(Widget& toChange);
void Increment(int* pToBump);
2. If the function doesn't modify its argument and the argument is of primitive type, take it by value. Example:
double Cube(double value);
3. Otherwise
3.1. If the function always makes a copy of its argument inside, take it by value.
3.2. If the function never makes a copy of its argument, take it by reference to const.
3.3. Added by me: If the function sometimes makes a copy, then decide on gut feeling: If the copy is done almost always, then take by value. If the copy is done half of the time, go the safe way and take by reference to const.
In your case, you should take the int by value, because you don't intend to modify the argument, and the argument is of primitive type. I think of "primitive type" as either a non-class type or a type without a user defined copy constructor and where sizeof(T) is only a couple of bytes.
There's a popular piece of advice that states that the method of passing ("by value" vs "by const reference") should be chosen depending on the actual size of the type you are going to pass. Even in this discussion you have an answer labeled as "correct" that suggests exactly that.
In reality, basing your decision on the size of the type is not only incorrect, this is a major and rather blatant design error, revealing a serious lack of intuition/understanding of good programming practices.
Decisions based on the actual implementation-dependent physical sizes of the objects must be left to the compiler as often as possible. Trying to "tailor" your code to these sizes by hard-coding the passing method is a completely counterproductive waste of effort in 99 cases out of 100. (Yes, it is true that in the case of the C++ language, the compiler doesn't have enough freedom to use these methods interchangeably; they are not really interchangeable in C++ in the general case. Although, if necessary, a proper size-based [semi-]automatic passing method selection might be implemented through template metaprogramming; but that's a different story.)
The much more meaningful criterion for selecting the passing method when you write the code "by hand" might sound as follows:
Prefer to pass "by value" when you are passing an atomic, unitary, indivisible entity, such as a single non-aggregate value of any type - a number, a pointer, an iterator. Note that, for example, iterators are unitary values at the logical level. So, prefer to pass iterators by value, regardless of whether their actual size is greater than sizeof(void*). (STL implementation does exactly that, BTW).
Prefer to pass "by const reference" when you are passing an aggregate, compound value of any kind. i.e. a value that has exposed pronouncedly "compound" nature at the logical level, even if its size is no greater than sizeof(void*).
The separation between the two is not always clear, but that's how things always are with all such recommendations. Moreover, the separation into "atomic" and "compound" entities might depend on the specifics of your design, so the decision might actually differ from one design to the other.
Note that this rule might produce decisions different from those of the allegedly "correct" size-based method mentioned in this discussion.
As an example, it is interesting to observe that the size-based method will suggest you manually hard-code different passing methods for different kinds of iterators, depending on their physical size. This makes it especially obvious how bogus the size-based method is.
Once again, one of the basic principles from which good programming practices derive is to avoid basing your decisions on physical characteristics of the platform (as much as possible). Instead, your decisions have to be based on the logical and conceptual properties of the entities in your program (as much as possible). The issue of passing "by value" or "by reference" is no exception here.
In C++11, the introduction of move semantics into the language produced a notable shift in the relative priorities of different parameter-passing methods. Under certain circumstances it might become perfectly feasible to pass even complex objects by value.
Should all/most setter functions in C++11 be written as function templates accepting universal references?
Contrary to popular and long-held beliefs, passing by const reference isn't necessarily faster even when you're passing a large object. You might want to read Dave Abrahams' recent article on this very subject.
Edit: (mostly in response to Jeff Hardy's comments): It's true that passing by const reference is probably the "safest" alternative under the largest number of circumstances -- but that doesn't mean it's always the best thing to do. But, to understand what's being discussed here, you really do need to read Dave's entire article quite carefully, as it is fairly technical, and the reasoning behind its conclusions is not always intuitively obvious (and you need to understand the reasoning to make intelligent choices).
Usually for built-in types you can just pass by value. They're small types.
For user-defined types (or templates, when you don't know what is going to be passed) prefer const&. The size of a reference is probably smaller than the size of the type, and it won't incur an extra copy (no call to a copy constructor).
Well, yes ... the other answers about efficiency are true. But there's something else going on here which is important - passing a class by value creates a copy and, therefore, invokes the copy constructor. If you're doing fancy stuff there, it's another reason to use references.
A reference to const T is not worth the typing effort in case of scalar types like int, double, etc. The rule of thumb is that class-types should be accepted via ref-to-const. But for iterators (which could be class-types) we often make an exception.
In generic code you should probably write "T const&" most of the time to be on the safe side. There's also boost's call traits you can use to select the most promising parameter passing type. It basically uses ref-to-const for class types and pass-by-value for scalar types as far as I can tell.
But there are also situations where you might want to accept parameters by value, regardless of how expensive creating a copy can be. See Dave's article "Want Speed? Use pass by value!".
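The call-traits idea mentioned above can be sketched with the standard type traits. This is a hand-rolled illustration, not Boost's actual interface: the names param and Holder are hypothetical, and the scalar/non-scalar split is the assumed selection rule.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hand-rolled sketch: scalars by value, everything else by reference to const.
template <class T>
struct param {
    typedef typename std::conditional<std::is_scalar<T>::value, T, const T&>::type type;
};

static_assert(std::is_same<param<int>::type, int>::value, "int goes by value");
static_assert(std::is_same<param<double*>::type, double*>::value, "pointers go by value");
static_assert(std::is_same<param<std::string>::type, const std::string&>::value,
              "class types go by reference to const");

// Generic code picks the parameter type through the trait.
template <class T>
struct Holder {
    T value;
    void set(typename param<T>::type v) { value = v; }
};

void demo_param() {
    Holder<std::string> h;
    h.set("abc");  // binds a temporary std::string to const std::string&
    assert(h.value == "abc");

    Holder<int> i;
    i.set(7);      // plain int parameter
    assert(i.value == 7);
}
```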
For simple types like int, double and char*, it makes sense to pass it by value. For more complex types, I use const T& unless there is a specific reason not to.
The cost of passing a 4 - 8 byte parameter is as low as you can get. You don't buy anything by passing a reference. For larger types, passing them by value can be expensive.
It won't make any difference for an int, as when you use a reference the memory address still has to be passed, and the memory address (void*) is usually about the size of an integer.
For types that contain a lot of data it becomes far more efficient as it avoids the huge overhead from having to copy the data.
Well the difference between the two doesn't really mean much for ints.
However, when using larger structures (or objects), the first method you used, pass by const reference, gives you access to the structure without needing to copy it. The second, pass by value, instantiates a new structure that has the same value as the argument.
In both cases you see this in the caller
myFunct(item);
To the caller, item will not be changed by myFunct, but the pass by reference will not incur the cost of creating a copy.
There is a very good answer to a similar question over at Pass by Reference / Value in C++
The difference between them is that one passes an int (which gets copied), and one uses the existing int. Since it's a const reference, it doesn't get changed, so it works much the same. The big difference here is that the function can alter the value of the int locally, but not the const reference. (I suppose some idiot could do the same thing with const_cast<>, or at least try to.) For larger objects, I can think of two differences.
First, some objects simply can't get copied, auto_ptr<>s and objects containing them being the obvious example.
Second, for large and complicated objects it's faster to pass by const reference than to copy. It's usually not a big deal, but passing objects by const reference is a useful habit to get into.
Either works fine. Don't waste your time worrying about this stuff.
The only time it might make a difference is when the type is a large struct, which might be expensive to pass on the stack. In that case, passing the arg as a pointer or a reference is (slightly) more efficient.
The problem appears when you are passing objects. If you pass by value, the copy constructor will be called. If you haven't implemented one, then a shallow copy of that object will be passed to the function.
Why is this a problem? If you have pointers to dynamically allocated memory, that memory could be freed when the destructor of the copy is called (when the copy leaves the function's scope). Then, when the destructor of the original runs, you'll have a double free.
Moral: Write your copy constructors.
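A minimal sketch of what that means for a class owning a raw array. Buffer and overwrite_first are hypothetical names; the copy constructor makes a deep copy, so passing by value is safe and the destructors never free the same memory twice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical owning buffer with a deep-copying copy constructor.
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new int[n]()) {}
    Buffer(const Buffer& other) : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + other.size_, data_);  // copy the elements
    }
    Buffer& operator=(Buffer other) {  // copy-and-swap: `other` is already a deep copy
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
        return *this;
    }
    ~Buffer() { delete[] data_; }
    int& operator[](std::size_t i) { return data_[i]; }
private:
    std::size_t size_;
    int* data_;
};

// Takes its argument by value, so it works on a private deep copy.
int overwrite_first(Buffer b) {
    b[0] = 42;
    return b[0];
}

void demo_buffer() {
    Buffer original(3);
    original[0] = 7;
    assert(overwrite_first(original) == 42);
    assert(original[0] == 7);  // the caller's buffer is untouched and still valid
}
```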