The benefit of function objects? - c++

I know that the function objects used in the STL are simply objects that we can call like a function. So it seems that functions and function objects do the same work. If that's true, why should we use a function object rather than a function?

The primary benefit is that calls to function objects (functors) are generally inlineable, whereas calls through function pointers generally are not (the prime example is comparing C's qsort to C++'s std::sort). For non-trivial objects/comparators, C++ should kill C's performance for sorting.
There are other benefits; for example, you can bind or store state in a functor, which you cannot do with a raw function.
Edit
Apologies for no direct reference, but Scott Meyers claims 670% improvement under certain circumstances:
Performance of qsort vs std::sort?
Edit 2
The passage with the performance note is this:
The fact that function pointer parameters inhibit inlining explains an
observation that long-time C programmers often find hard to believe:
C++’s sort virtually always embarrasses C’s qsort when it comes to
speed. Sure, C++ has function and class templates to instantiate and
funny-looking operator() functions to invoke while C makes a simple
function call, but all that C++ “overhead” is absorbed during
compilation. At runtime, sort makes inline calls to its comparison
function (assuming the comparison function has been declared inline
and its body is available during compilation) while qsort calls its
comparison function through a pointer. The end result is that sort
runs faster. In my tests on a vector of a million doubles, it ran up
to 670% faster, but don’t take my word for it, try it yourself. It’s
easy to verify that when comparing function objects and real functions
as algorithm parameters, there’s an abstraction bonus.
-Scott Meyers "Effective STL: 50 Specific Ways to Improve Your Use of the Standard Template Library" - Item 46
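The contrast described in the quoted passage can be sketched roughly as follows. The names compare_doubles, LessDouble, sorted_with_qsort and sorted_with_sort are made up for this illustration, and the sketch only shows the two call styles; it makes no claim about the speedup you would measure.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// Comparator for qsort: it is called through a function pointer
// on every comparison.
int compare_doubles(const void* lhs, const void* rhs) {
    const double a = *static_cast<const double*>(lhs);
    const double b = *static_cast<const double*>(rhs);
    return (a > b) - (a < b);
}

// Comparator for std::sort: the functor's type is a template argument,
// so the compiler can see operator() and is free to inline it.
struct LessDouble {
    bool operator()(double a, double b) const { return a < b; }
};

std::vector<double> sorted_with_qsort(std::vector<double> v) {
    if (!v.empty()) {
        std::qsort(v.data(), v.size(), sizeof(double), compare_doubles);
    }
    return v;
}

std::vector<double> sorted_with_sort(std::vector<double> v) {
    std::sort(v.begin(), v.end(), LessDouble());
    return v;
}
```

Both helpers return the same sorted vector; any difference is in how the comparison is reached, which is what a benchmark on your own compiler and data would have to measure.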

The benefit of a function object over a function is that it can hold state (from wikipedia):
#include <iostream>
#include <iterator>
#include <algorithm>
class CountFrom {
private:
    int &count;
public:
    CountFrom(int &n) : count(n) {}
    int operator()() { return count++; }
};
int main() {
    int state(10);
    std::generate_n(std::ostream_iterator<int>(std::cout, "\n"), 11, CountFrom(state));
    return 0;
}
A regular function cannot hold state like a function object. If I remember correctly it was the way of getting around not having lambdas and closures (before C++11 wikipedia section)...
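For comparison, here is a rough sketch of how the same counting state might be carried by a C++11 lambda instead of a hand-written class. The helper name count_from is invented for this example, and it collects the values into a vector rather than printing them.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Returns `how_many` consecutive integers starting at `start`.
std::vector<int> count_from(int start, int how_many) {
    std::vector<int> out;
    int state = start;
    // The lambda captures `state` by reference, much like CountFrom
    // keeps an int& member.
    std::generate_n(std::back_inserter(out), how_many,
                    [&state] { return state++; });
    return out;
}
```

The compiler generates a closure class for the lambda, so this is the same technique with less typing rather than a different mechanism.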

I think that the best thing about functors is that they can store information internally. Back in the days before std::bind, one would otherwise have had to write lots of separate unary comparison functions so that they could be passed to routines like remove_if.
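A minimal sketch of that point, assuming a made-up predicate class GreaterThan and helper drop_greater_than: the threshold lives inside the object, so one class serves every threshold.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical predicate: the threshold is stored in the object, so there
// is no need for one hand-written function per threshold value.
class GreaterThan {
public:
    explicit GreaterThan(int threshold) : threshold_(threshold) {}
    bool operator()(int value) const { return value > threshold_; }
private:
    int threshold_;
};

// Removes every element greater than `threshold` (erase-remove idiom).
std::vector<int> drop_greater_than(std::vector<int> values, int threshold) {
    values.erase(std::remove_if(values.begin(), values.end(),
                                GreaterThan(threshold)),
                 values.end());
    return values;
}
```

With plain functions, the threshold would have to be hard-coded in each function or kept in a global variable.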

See http://cs.stmarys.ca/~porter/csc/ref/stl/function_objects.html.
The STL uses function objects (functors) as callbacks for sorting and searching containers.
Functors are often templates, and templates are easier to implement as classes. Try expressing greater<T> with a function pointer... keeping in mind that the STL containers are templates, too.

Related

Any reason not to use global lambdas?

We had a function that used a non-capturing lambda internal to itself, e.g.:
void foo() {
    auto bar = [](int a, int b){ return a + b; };
    // code using bar(x,y) a bunch of times
}
Now the functionality implemented by the lambda became needed elsewhere, so I am going to lift the lambda out of foo() into the global/namespace scope. I can either leave it as a lambda, making it a copy-paste option, or change it to a proper function:
auto bar = [](int a, int b){ return a + b; }; // option 1
int bar(int a, int b){ return a + b; } // option 2
void foo() {
    // code using bar(x,y) a bunch of times
}
Changing it to a proper function is trivial, but it made me wonder if there is some reason not to leave it as a lambda? Is there any reason not to just use lambdas everywhere instead of "regular" global functions?
There's one very important reason not to use global lambdas: because it's not normal.
C++'s regular function syntax has been around since the days of C. Programmers have known for decades what that syntax means and how it works (though admittedly that whole function-to-pointer decay thing sometimes bites even seasoned programmers). If a C++ programmer of any skill level beyond "utter newbie" sees a function definition, they know what they're getting.
A global lambda is a different beast altogether. It has different behavior from a regular function. Lambdas are objects, while functions are not. They have a type, but that type is distinct from the type of their function. And so forth.
So now, you've raised the bar in communicating with other programmers. A C++ programmer needs to understand lambdas if they're going to understand what this function is doing. And yes, this is 2019, so a decent C++ programmer should have an idea what a lambda looks like. But it is still a higher bar.
And even if they understand it, the question on that programmer's mind will be... why did the writer of this code write it that way? And if you don't have a good answer for that question (for example, because you explicitly want to forbid overloading and ADL, as in Ranges customization points), then you should use the common mechanism.
Prefer expected solutions to novel ones where appropriate. Use the least complicated method of getting your point across.
I can think of a few reasons you'd want to avoid global lambdas as drop-in replacements for regular functions:
regular functions can be overloaded; lambdas cannot (there are techniques to simulate this, however)
Despite the fact that they are function-like, even a non-capturing lambda like this will occupy memory (generally 1 byte for non-capturing).
as pointed out in the comments, modern compilers will optimize this storage away under the as-if rule
"Why shouldn't I use lambdas to replace stateful functors (classes)?"
classes simply have fewer restrictions than lambdas and should therefore be the first thing you reach for
(public/private data, overloading, helper methods, etc.)
if the lambda has state, then it is all the more difficult to reason about when it becomes global.
We should prefer to create an instance of a class at the narrowest possible scope
a non-capturing lambda becomes a function pointer only through its implicit conversion operator, and no such conversion exists for a lambda that specifies anything in its capture.
classes give us a straightforward way to create function pointers, and they're also what many programmers are more comfortable with
Lambdas with any capture cannot be default-constructed (non-capturing lambdas can be since C++20; before that, no closure type had a default constructor).
Is there any reason not to just use lambdas everywhere instead of "regular" global functions?
A problem of a certain level of complexity requires a solution of at least the same complexity. But if there is a less complex solution for the same problem, then there is really no justification for using the more complex one. Why introduce complexity you don't need?
Between a lambda and a function, a function is simply the less complex kind of entity of the two. You don't have to justify not using a lambda. You have to justify using one. A lambda expression introduces a closure type, which is an unnamed class type with all the usual special member functions, a function call operator, and, in this case, an implicit conversion operator to function pointer, and creates an object of that type. Copy-initializing a global variable from a lambda expression simply does a lot more than just defining a function. It defines a class type with six implicitly-declared functions, defines two more operator functions, and creates an object. The compiler has to do a lot more. If you don't need any of the features of a lambda, then don't use a lambda…
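To make the difference concrete, here is a small sketch that checks a few of these properties at compile time. The names add_function, add_lambda, Closure and call_through_pointer are chosen for this example only.

```cpp
#include <type_traits>

int add_function(int a, int b) { return a + b; }
auto add_lambda = [](int a, int b) { return a + b; };

using Closure = decltype(add_lambda);

// The lambda is an object of an unnamed class type; the function is not
// an object at all.
static_assert(std::is_class<Closure>::value, "closure type is a class");
static_assert(std::is_function<decltype(add_function)>::value,
              "add_function has a function type");
static_assert(!std::is_function<Closure>::value,
              "a closure type is not a function type");

// With no captures, the closure converts implicitly to a function pointer.
static_assert(std::is_convertible<Closure, int (*)(int, int)>::value,
              "non-capturing lambda converts to a function pointer");

int call_through_pointer(int a, int b) {
    int (*fp)(int, int) = add_lambda;  // uses the conversion operator
    return fp(a, b);
}
```

Calling either entity looks the same at the call site; the extra machinery only shows up when you ask about types.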
After asking, I thought of a reason to not do this: Since these are variables, they are prone to Static Initialization Order Fiasco (https://isocpp.org/wiki/faq/ctors#static-init-order), which could cause bugs down the line.
if there is some reason not to leave it as a lambda? Is there any reason not to just use lambdas everywhere instead of "regular" global functions?
We are used to seeing functions rather than global functors, so a global lambda breaks consistency and the principle of least astonishment.
The main differences are:
functions can be overloaded, whereas functors cannot.
functions can be found with ADL, not functors.
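Both differences can be seen in a short sketch. The namespace lib, the type Widget and the names describe and describe_lambda are hypothetical.

```cpp
namespace lib {

struct Widget {};

// Two overloads share one name; overload resolution picks by argument type.
inline int describe(const Widget&) { return 1; }
inline int describe(int) { return 2; }

// A lambda is a variable, so a second `describe_lambda` with a different
// parameter list cannot be declared, and ADL does not consider it.
const auto describe_lambda = [](const Widget&) { return 3; };

}  // namespace lib

int call_function_via_adl() {
    lib::Widget w;
    return describe(w);  // unqualified call: found through ADL on lib::Widget
}

int call_lambda_qualified() {
    lib::Widget w;
    // describe_lambda(w);  // would not compile: ADL only finds functions
    return lib::describe_lambda(w);
}
```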
Lambdas are anonymous functions.
If you are using a named lambda, it means you are basically using a named anonymous function. To avoid this oxymoron, you might as well use a function.

Is there any advantage to implementing functions as free functions rather than members in C++?

I'm interested in the technical logistics. Is there any advantage, such as memory saved, etc., to implementing certain functions that deal with a class as free functions?
In particular, implementing operator overloads as free functions (providing you don't need access to any private members, and even then you can make them use a friend non-member)?
Is a distinct memory address provided for each function of the class each time an object is created?
This answer may help you: Operator overloading : member function vs. non-member function?. In general, free functions are mandatory if you need to implement operators for classes whose source code you cannot change (think of streams), or if the left operand is not of class type (int, for example). If you control the code of the class, then you can freely use member functions.
For your last question: no. Each member function exists once for the whole class, not once per object, so creating an object does not create new copies of its functions. Only virtual functions are typically reached through a table, and that table is per class as well. A member function can be viewed as a free function with a hidden parameter that is a pointer to the object, i.e. o.f(a) is more or less the same as f(&o,a) with a prototype roughly like f(C *this,A a);.
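The o.f(a) versus f(&o,a) picture can be sketched as below. Counter, add_free and demo_total are names invented for this example, and the size check assumes a mainstream implementation that adds no padding to a struct holding a single int.

```cpp
struct Counter {
    int total;
    // Member function: `this` is passed implicitly.
    void add(int amount) { total += amount; }
};

// Free function with the object pointer spelled out as the first parameter.
inline void add_free(Counter* self, int amount) { self->total += amount; }

// Member functions are not stored inside each object, so on typical
// implementations Counter is exactly as large as its one int member.
static_assert(sizeof(Counter) == sizeof(int), "no per-object function storage");

int demo_total() {
    Counter c{0};
    c.add(2);         // o.f(a)
    add_free(&c, 3);  // f(&o, a)
    return c.total;
}
```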
There are various articles about circumstances when implementing functionality using non-member functions is preferred over function members.
Examples include
Scott Meyers (author of books like "Effective C++", "Effective STL", and others) on how non-members improve encapsulation: http://www.drdobbs.com/cpp/how-non-member-functions-improve-encapsu/184401197
Herb Sutter in his Guru of the Week series #84 "Monoliths Unstrung". Essentially he advocates, when it is possible to implement functionality as a member or as a non-member non-friend, prefer the non-member option. http://www.gotw.ca/gotw/084.htm
Non-static member functions have an implicit this parameter. If your function doesn't use any non-static members, it should be either a free function or a static member function, depending on what namespace you want it to be in. This will avoid confusion for human readers (who will be scratching their heads looking for the reason it's not static), and will be a small improvement in code size, with a probably non-measurable gain in performance.
To be clear: in the asm, there's zero difference between a static member function and a non-member function. The choice between static-member and global or static (file scope) is purely a namespace / design quality issue, not a performance issue. (In Unix shared libraries (position-independent code), calling global functions has a level of indirection through the PLT, so prefer static file-scope functions. This is a different meaning of the static keyword vs. global static-member functions, which are globally visible and thus subject to symbol interposition.)
One possible exception to this rule is that wrapper functions that pass on most of their args unchanged to another function benefit from having their args in the same order as the function they call, so they don't have to move them between registers. E.g. if a member function does something simple to a class member and then calls a static member function with the same arg list, the callee takes no implicit this pointer, so all the args have to move over by one register.
Most ABIs use an args-in-registers calling convention. 32bit x86 (other than some Windows calling conventions) is the major exception I know of, where all args are always passed on the stack. 64bit x86 passes the first 6 integer args in registers, and the first 8 FP args in xmm registers (SysV). Or the first 4 args of either type in registers (Windows).
Passing an object pointer will typically take an extra instruction or two at every call site. If the implicit first arg bumps any other args out of the limited set of arg-passing regs, then it will have to be passed on the stack. This adds a few cycles of latency to the critical path involving that arg for the store-load round trip, and extra instructions in the callee as well as the caller. (See the x86 wiki for links to more details about this sort of thing, for that platform).
Inlining of course eliminates this. static functions can also be optimized by modern compilers: because the compiler knows all the calls come from code it can see, it is free to use a non-standard calling convention for them. IDK if any compiler will drop unused args during inter-procedure optimization. Link-time and/or whole-program optimization may also be able to reduce or eliminate overhead from unused args.
Code-size always matters at least a little, since smaller binaries load from disk faster, and miss less in I-cache. I don't expect any measurable speed difference unless you specifically design an experiment that's sensitive to it.
The most important thing to consider when designing classes is, "what is the invariant?" Classes are designed to protect invariants, so a class should be as small as possible to ensure that its invariant is properly protected. The more member/friend functions you have, the more code there is to review.
From this point of view, if a class has members which don't need to be protected (for example, a boolean that the user can change freely through its get/set functions), it is better to make those attributes public and remove the get/set functions (these are, more or less, Bjarne Stroustrup's words).
So, which functions must be declared inside the class and which ones outside? The functions inside should be the minimum set required to protect the invariant, and the functions outside should be any function that can be implemented in terms of those.
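As a rough illustration of that split, consider a hypothetical class Range whose invariant is lo() <= hi(). The class and the helpers length and contains are invented for this sketch.

```cpp
#include <stdexcept>

// Hypothetical class whose invariant is lo() <= hi().
// Only the constructor and the two accessors need to be members.
class Range {
public:
    Range(int lo, int hi) : lo_(lo), hi_(hi) {
        if (lo > hi) throw std::invalid_argument("Range requires lo <= hi");
    }
    int lo() const { return lo_; }
    int hi() const { return hi_; }
private:
    int lo_;
    int hi_;
};

// Non-member, non-friend helpers: they use only the public interface,
// so they cannot break the invariant.
inline int length(const Range& r) { return r.hi() - r.lo(); }
inline bool contains(const Range& r, int x) {
    return r.lo() <= x && x <= r.hi();
}
```

Anyone reviewing whether the invariant holds only has to read the class itself; the helpers can grow in number without adding to that review.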
Operator overloading is another story, because the criteria for putting some operators inside and others outside come from syntactical issues related to implicit conversions and so on:
class A
{
private:
    int i_i;
public:
    A(int i) noexcept : i_i(i) {}
    int val() const noexcept { return i_i; }
    A operator+(A const& other) const noexcept
    { return A(i_i + other.i_i); }
};
A a(5);
cout << (4 + a).val() << endl; // error: no operator+ accepts an int on the left
In this case, since the operator is defined inside the class, the compiler doesn't find the operator, because the first argument is an integer (when an operator is called, the compiler searches free functions and the member functions of the first argument's class).
When declared outside:
class A
{
private:
    int i_i;
public:
    A(int i) noexcept : i_i(i) {}
    int val() const noexcept { return i_i; }
};
inline A operator+(A const& first, A const& other) noexcept
{ return A(first.val() + other.val()); }
A a(5);
cout << (4 + a).val() << endl; // 4 is converted to A(4); prints 9
In this case, the compiler finds the operator and performs an implicit conversion of the first argument from int to A, using the appropriate constructor of A.
In this case, the operator can also be implemented in terms of other functions, so it doesn't need to be a friend, and you can be sure the invariant is not compromised by the additional function. So, in this concrete example, moving the operator outside is good for two reasons.
One strictly technical difference, which also is valid for static vs non-static member functions, might affect performance in extreme scenarios:
For a member function, the this pointer will be passed as an "invisible" parameter to the function. Usually, depending on the parameter types, a fixed number of parameter values can be passed via registers instead of via the stack (registers are faster to read and write).
If the function already takes that number of parameters explicitly, then making it a non-static member function might cause parameters to be passed via the stack instead of via registers, and if that happens, barring optimizations that may or may not happen, the function call will be slower.
However, even if it is slower - in this case, in the vast majority of any use cases that you can dream up, slower is insignificant (but real).
Depending on the subject, member functions may not be the right solution. Member functions rely on the assumption that there is an asymmetry between the arguments of the corresponding non-member function, where one argument is clearly identified as the main subject of the function (that is, the implicitly passed this, which in practice corresponds to passing the object by reference). On the other hand, such an asymmetry often does not exist. In those cases free functions are the best solution.
Regarding execution speed, there isn't any difference, because a method of a class is just a function whose first argument is the this pointer. So it is entirely equivalent to the corresponding non-member function whose first parameter is a pointer to the object.

Variadic function declaration VS a function taking a list

When designing a class or a function, which of the two approaches shown below is better, and why?
class Container {
    //Provide this functionality??
    void addItemVariadic(const Value& val, ...);
    //Or provide this functionality??
    void addItemList(const list<Value>& vals);
};
Is it better to provide a function like addItemVariadic(..), or addItemList(..)?
Or is it better to provide a set of such functions, like further taking some iterators, or is it better to limit the functionality, like just taking a list?
Using variadic functions is dangerous.
If you ever pass an argument that does not have the appropriate type, the compiler will not catch it, and the program will typically crash at run time when the function is called.
By contrast, if you use a std::list, the code simply won't compile, and you avoid the crash.
Btw: I advise you to use std::vector instead of std::list in this case.
Edit1: Possible duplicate question, with a nice solution using operator << to input all the parameters in one shot.
Edit2: To choose between the different std:: containers, there is a decision graph in the answer to this question. The graph targets C++03; it doesn't cover the new containers introduced in C++1.
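One type-checked way to accept several values in a single call is sketched below. It assumes C++11's std::initializer_list, which the answer above does not mention, and it uses int as a placeholder for the question's Value type; addItems and size are names invented for this example.

```cpp
#include <cstddef>
#include <initializer_list>
#include <vector>

typedef int Value;  // placeholder for the question's Value type

class Container {
public:
    // Takes any number of values of one type; an argument of an
    // unrelated type fails to compile instead of failing at run time.
    void addItems(std::initializer_list<Value> vals) {
        items_.insert(items_.end(), vals.begin(), vals.end());
    }
    // Overload for callers that already hold a std::vector.
    void addItems(const std::vector<Value>& vals) {
        items_.insert(items_.end(), vals.begin(), vals.end());
    }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<Value> items_;
};
```

A call such as c.addItems({1, 2, 3}) reads almost like the variadic version while keeping full type checking.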

Most efficient way to process all items in an unknown container?

I'm doing a computation in C++ and it has to be as fast as possible (it is executed 60 times per second with possibly large data). During the computation, a certain set of items has to be processed. However, in different cases, different implementations of the item storage are optimal, so I need to use an abstract class for that.
My question is, what is the most common and most efficient way to do an action with each of the items in C++? (I don't need to change the structure of the container during that.) I have thought of two possible solutions:
Make iterators for the storage classes. (They're also mine, so I can add it.) This is common in Java, but doesn't seem very 'C' to me:
class Iterator {
public:
    bool more() const;
    Item * next();
};
Add a sort of abstract handler, which would be overridden in the computation part and would include the code to be called on each item:
class Handler {
public:
    virtual void process(Item &item) = 0;
};
(Only a function pointer wouldn't be enough because it has to also bring some other data.)
Something completely different?
The second option seems a bit better to me since the items could in fact be processed in a single loop without interruption, but it makes the code quite messy as I would have to make quite a lot of derived classes. What would you suggest?
Thanks.
Edit: To be more exact, the storage data type isn't exactly just an ADT; it has a means of finding only a specific subset of its elements based on some parameters, which I then need to process, so I can't prepare all of them in an array or something.
#include <algorithm>
Have a look at the existing containers provided by the C++ standard, and functions such as for_each.
For a comparison of C++ container iteration to interfaces in "modern" languages, see this answer of mine. The other answers have good examples of what the idiomatic C++ way looks like in practice.
Using templated functors, as the standard containers and algorithms do, will definitely give you a speed advantage over virtual dispatch (although sometimes the compiler can devirtualize calls, don't count on it).
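A minimal sketch of the templated-functor approach, assuming a hypothetical storage class VectorStorage with a member template for_each_item, and a stand-in Item with a single weight field:

```cpp
#include <utility>
#include <vector>

struct Item { int weight; };  // stand-in for the question's Item

// Hypothetical storage class. The callable's type is a template parameter,
// so the call inside the loop is a direct call that the compiler may
// inline; there is no virtual dispatch per element.
class VectorStorage {
public:
    explicit VectorStorage(std::vector<Item> items) : items_(std::move(items)) {}

    template <typename Func>
    void for_each_item(Func func) {
        for (Item& item : items_) func(item);
    }
private:
    std::vector<Item> items_;
};

int total_weight(VectorStorage& storage) {
    int total = 0;
    // The lambda carries the extra data (here `total`) that a bare
    // function pointer could not bring along.
    storage.for_each_item([&total](Item& item) { total += item.weight; });
    return total;
}
```

This assumes the concrete storage type is known where the loop is written. A member template cannot be virtual, so if the storage is chosen at run time, one dispatch per batch is still needed to reach the right for_each_item.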
C++ has iterators already. It's not a particularly "Java" thing. (Note that their interface is different, though, and they're much more efficient than their Java equivalents)
As for the second approach, calling a virtual function for every element is going to hurt performance if you're worried about throughput.
If you can (pre-)sort your data so that all objects of the same type are stored consecutively, then you can select the function to call once, and then apply it to all elements of that type. Otherwise, you'll have to go through the indirection/type check of a virtual function or another mechanism to perform the appropriate action for every individual element.
What gave you the impression that iterators are not very C++-like? The standard library is full of them (see this), and includes a wide range of algorithms that can be used to effectively perform tasks on a wide range of standard container types.
If you use the STL containers you can save re-inventing the wheel and get easy access to a wide variety of pre-defined algorithms. This is almost always better than writing your own equivalent container with an ad-hoc iteration solution.
A function template perhaps:
template <typename C>
void process(C & c)
{
    typedef typename C::value_type type;
    for (type & x : c) { do_something_with(x); }
}
The iteration will use the container's iterators, which is generally as efficient as you can get.
You can specialize the template for specific containers.

Functor class doing work in constructor

I'm using C++ templates to pass in Strategy functors to change my function's behavior. It works fine. The functor I pass is a stateless class with no storage and it just overloads the () operator in the classic functor way.
template <typename Operation> int foo(int a)
{
    int b=Operation()(a);
    /* use b here, etc */
    return b;
}
I do this often, and it works well, and often I'm making templates with 6 or 7 templated functors passed in!
However I worry both about code elegance and also efficiency. The functor is stateless so I assume the Operation() constructor is free and the evaluation of the functor is just as efficient as an inlined function, but like all C++ programmers I always have some nagging doubt.
My second question is whether I could use an alternate functor approach.. one that does not override the () operator, but does everything in the constructor as a side effect!
Something like:
struct Operation {
    Operation(int a, int &b) { b=a*a; }
};
template <typename Operation> int foo(int a)
{
    int b;
    Operation(a,b);
    /* use b here, etc */
    return b;
}
I've never seen anyone use a constructor as the "work" of a functor, but it seems like it should work. Is there any advantage? Any disadvantage? I do like the removal of the strange doubled parenthesis "Operation()(a)", but that's likely just aesthetic.
Any disadvantage?
Ctors do not return any useful value -- cannot be used in chained calls (e.g. foo(bar())).
They can throw.
Design point of view -- ctors are object creation functions, not really meant to be workhorses.
Compilers actually inline the empty constructor of Operation (at least gcc in similar situations does, except when you turned off optimization)
The disadvantage of doing everything in the constructor is that you cannot create a functor with some internal state this way - eg. functor for counting the number of elements satisfying a predicate. Also, using a method of a real object as a functor allows you to store the instance of it for later execution, something you cannot do with your constructor approach.
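The counting case mentioned above might look like the following sketch. EvenCounter and count_even are invented names, and "is even" stands in for whatever predicate is of interest.

```cpp
#include <algorithm>
#include <vector>

// Functor with internal state: counts the even values it is called with.
struct EvenCounter {
    int count = 0;
    void operator()(int value) { if (value % 2 == 0) ++count; }
};

int count_even(const std::vector<int>& values) {
    // std::for_each returns the functor it was given, including its state.
    EvenCounter counter =
        std::for_each(values.begin(), values.end(), EvenCounter());
    return counter.count;
}
```

A constructor-only "functor" has nowhere to keep count between calls, because each use creates and discards a new temporary.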
From a performance POV, the code shown will get completely optimized with both VC and GCC. However, a better strategy often is to take the functor as a parameter; that way you get a lot more flexibility and identical performance characteristics.
I'd recommend defining functors that work with the STL containers, i.e. they should implement operator(). (Following the API of the language you're using is always a good idea.)
That allows your algorithms to be very generic (pass in functions, functors, stl-bind, boost::function, boost::bind, boost::lambda, ...) which is what one usually wants.
This way, you don't need to specify the functor type as a template parameter, just construct an instance and pass it in:
my_algorithm(foo, bar, MyOperation())
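Applied to the question's foo, passing the operation as an ordinary argument could look like this sketch. Square and twice are made-up operations, and the "+ 1" is only a stand-in for whatever the real function does with b.

```cpp
// Same idea as the question's foo, but the operation is a regular
// argument whose type is deduced by the compiler.
template <typename Operation>
int foo(int a, Operation op) {
    int b = op(a);
    return b + 1;  // stand-in for "use b here"
}

// A classic functor.
struct Square {
    int operator()(int a) const { return a * a; }
};

// A plain function works as well.
int twice(int a) { return 2 * a; }
```

The same template then accepts foo(3, Square()), foo(3, twice), or a lambda with captured state, without the caller spelling out any template argument.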
There does not seem to be any point in implementing the constructor in another class.
All you are doing is breaking encapsulation and setting up your class for abuse.
The constructor is supposed to initialize the object into a good state as defined by the class. You are allowing another object to initialize your class. What guarantees do you have that this template class knows how to initialize your class correctly? A user of your class can provide any object that could mess with the internal state of your object in ways not intended.
The class should be self contained and initialize itself to a good state. What you seem to be doing is playing with templates just to see what they can do.