STL Functional -- Why? - c++

In C++ Standard Template Library, there's a 'functional' part, in which many classes have overloaded their () operator.
Does it bring any convenience to use functions as objects in C++?
Why can't we just use function pointer instead? Any examples?

Ofcourse, One can always use Function pointers instead of Function Objects, However there are certain advantages which function objects provide over function pointers, namely:
Better Performance:
One of the most distinct and important advantage is they are more likely to yield better performance. In case of function objects more details are available at compile time so that the compiler can accurately determine and hence inline the function to be called unlike in case of function pointers where the derefencing of the pointer makes it difficult for the compiler to determine the actual function that will be called.
Function objects are Smart functions:
Function objects may have other member functions and attributes.This means that function objects have a state. In fact, the same function, represented by a function object, may have different states at the same time. This is not possible for ordinary functions. Another advantage of function objects is that you can initialize them at runtime before you use/call them.
Power of Generic programming:
Ordinary functions can have different types only when their signatures differ. However, function objects can have different types even when their signatures are the same. In fact, each functional behavior defined by a function object has its own type. This is a significant improvement for generic programming using templates because one can pass functional behavior as a template parameter.

Why can't we just use function pointer instead? Any examples?
Using C style function pointer cannot leverage the advantage of inlining. Function pointer typically requires an additional indirection for lookup.
However, if operator () is overloaded then it's very easy for compiler to inline the code and save an extra call, so increase in performance.
The other advantage of overloaded operator () is that, one can design a function which implicitly considers the function object as argument; no need to pass it as a separate function. Lesser the hand coded program, lesser the bugs and better readability.
This question from Bjarne Stroustrup (C++ inventor) webpage explains that aspect nicely.
C++ Standard (Template) Library uses functional programming with overloaded operator (), if it's needed.

> Does it bring any convenience to use functions as objects in C++?
Yes: The C++ template mechanism allows all other C/C++ programming styles (C style and OOP style, see below).
> Why can't we just use function pointer instead? Any examples?
But we can: A simple C function pointer is an object with a well defined operator(), too.
If we design a library, we do not want to force anyone to use that C pointer style if not desired. It is usually as undesired as forcing everything/everyone to be in/use OOP style; see below.
From C-programmers and functional programmers views, OOP not only tends to be slower but more verbose and in most cases to be the wrong direction of abstraction ("information" is not and should not be an "object"). Because of that, people tend to be confused whenever the word "object" is used in other contexts.
In C++, anything with the desired properties can be seen as an object. In this case, a simple C function pointer is an object, too. This does not imply that OOP paradigms are used when not desired; it is just a proper way to use the template mechanism.
To understand the performance differences, compare the programming(-language) styles/paradigms and their possible optimisations:
C style:
Function pointer with its closure ("this" in OOP, pointer to some structure) as first parameter.
To call the function, the address of the function needs to be accessed first.
That is 1 indirection; no inlining possible.
C++ (and Java) OOP style:
Reference to an object derived from a class with virtual functions.
Reference is 1st pointer.
Pointer to virtual-table is 2nd pointer.
Function pointer in virtual-table is 3rd pointer.
That are 3 indirections; no inlining possible.
C++ template style:
Copy of an object with () function.
No virtual-table since the type of that object is known at compile time.
The address of the function is known at compile time.
That are 0 indirections; inlining possible.
The C++ templates are versatile enough to allow the other two styles above, and in the case of inlining they can even outperform…
compiled functional languages: (excluding JVM and Javascript as target platforms because of missing "proper tail calls")
Function pointer and reference to its closure in machine registers.
It is usually no function "call" but a GOTO like jump.
Functions do not need the stack, no address to jump back, no parameters nor local variables on the stack.
Functions have their garbage collectable closure(s) containing parameters and a pointer to the next function to be called.
For the CPU to predict the jump, the address of the function needs to be loaded to a register as early as possible.
That is 1 indirection with possible jump prediction; everything is nearly as fast as inlined.

The main difference is that function objects are more powerful than plain function pointers as they can hold state. Most algorithms take templates functions rather than plain function pointers, which enable the use of powerful constructs as binders that call functions with different signatures by filling extra arguments with values stored on the functor, or the newer lambdas in C++11. Once the algorithms are designed to take functors it just makes sense to provide a set of predefined generic function objects in the library.
Aside from that there are potential advantages in that in most cases those functors are simple classes for which the compiler has the full definition and can perform inlining of the function calls improving performance. This is the reason why std::sort can be much faster than qsort from the C library.

Related

Comparison with passing criteria as template parameter to sort() results in less overhead than passing criteria function pointer to qsort()?

In Stroustrup's The C++ programming language , Page 431, when he was discussing about the design of the standard libraries, he said,
For example, building the comparison criteria into a sort function is unacceptable because the same
data can be sorted according to different criteria. This is why the C standard library qsort() takes a comparison function as an argument rather than relying on something fixed, say, the < operator. On the other hand, the overhead imposed by a function call for each comparison compromises qsort() as a building block for further library building.
These above make sense to me. But in the second paragraph, he said,
Is that overhead serious? In most cases, probably not. However, the function call overhead can
dominate the execution time for some algorithms and cause users to seek alternatives. The technique of supplying comparison criteria through a template argument described in §13.4 solves that
problem.
In §13.4, the comparison criteria are defined as class with static member functions (which does the comparison). When these classes are used as template parameters, the comparison is still done by their static member functions. It seems to me there would still be overheads for calling the static member function.
What did Stroustrup mean by saying that?
std::sort is a function template. A separate sort instance will be created for each type and comparison operator during compilation. And because for each sort instantiation the type and the comparator is known at compile time, this allows inlining the comparator function and therefore avoiding the cost of a function call.
There is no theoretical reason why sort need be faster than qsort. Some compilers will even inline the function pointer passed to functions 'like' qsort: I believe I have seen gcc or clang do this (not to qsort), and even do so where the function definition was in a different cpp file.
The important part is that sort gets passed the function object as a type as well as an instance. A different function for each such type is generated: templates are factories for functions. At the point where it is called, the exact function called is really easy to determine for each such function instance, so inlining is trivial.
Doing the same with a function pointer is possible, but requires inlining from the point where qsort is invoked, and tracking the immutability of the function pointer carefully, and knowing which function it was to start with. This is far more fragile than the above mechanism in practice.
Similar issues appear with element stride (clearly static when sorting an array, harder to deal with when using qsort) and the like.
Calling a function via a pointer has two overheads: the pointer dereferencing and a function call overhead. This is a runtime process.
Template instantiation is done by compiler. Pointer dereferencing is eliminated as there is no pointer obviously. Function call overhead is optimised out by a compiler by inlining the call.

function pointer returning another function pointer in C++ with locality

Considering the fact that a pointer to a function returning another pointer to another function is the mechanism used in C to introduce some runtime polymorphism/callbacks, What is the equivalent way to implement this in C++ while improving locality and lower the cost about pointers and indirections ?
For example this syntactic sugar can help but I'm not really interested in this, altough it's a nice way to do things in a C++ way instead of a more C-ish typedef, I'm more interested in improving locality while trying to reduce the use of explicit pointers at runtime.
The real reason as to why people use function pointers in C to emulate polymorphism is not performance but the fact that C neither supports real polymorphism nor templates. These are two alternatives you have in C++. All three approaches are compared in this thread.
Note that even though calling a function pointer does not require the additional vtable lookup that virtual function calls do, calling virtual functions and function pointers both suffer from the same major performance problem: Branch prediction in both cases is not as reliable and you tend to end up with more pipeline flushes.
I think you can use Virtual function for meet part of the requirement.

why function objects should be pass-by-value

I have just read the classic book "Effective C++, 3rd Edition", and in item 20 the author concludes that built-in types, STL iterators and function object types are more appropriate for pass-by-value. I could well understand the reason for built-in and iterators types, but why should the function object be pass-by-value, as we know it is class-type anyway?
In a typical case, a function object will have little or (more often) no persistent state. In such a case, passing by value may no require actually passing anything at all -- the "value" that's passed is basically little or nothing more than a placeholder for "this is the object".
Given the small amount of code in many function objects, that leads to a further optimization: it's often fairly easy for the compiler to expand the code for the function object inline, so no parameters get passed, and no function call is involved at all.
A compiler may be able to do the same when you pass a pointer or reference instead, but it's not quite as easy -- a lot more common that you'll end up with an object being created, its address passed, and then the function call operator for that object being invoked via that pointer.
Edit: It's probably also worth mentioning that the same applies to lambdas, since they're really just function objects in disguise. You don't know the name of the class, but they create a class in the immediately surrounding scope that overloads the function call operator, which is what gets invoked when you "call" the lambda. [Thanks #Mark Garcia.]
The #1 reason to pass function objects by value is because the standard library requires that function objects you pass to its algorithms be copyable. C++11 §25.1/10:
[ Note: Unless otherwise specified, algorithms that take function objects as arguments are permitted to copy
those function objects freely. Programmers for whom object identity is important should consider using a
wrapper class that points to a noncopied implementation object such as reference_wrapper<T> (20.8.3),
or some equivalent solution. —end note ]
The other answers do a great job of explaining the rationale.
From Effective STL (since you seems to like Scott Meyers) item 38 Design functor classes for pass-by-value.
"In both C and C++ function pointers are passed by value. STL Function objects are modeled after function pointers, so the convention in the STL is that function objects, too, are passed by value when passed to and from functions."
This has some benefits and some implications, like #Jerry Coffin said, the compiler can make some optimizations like inlining the code to avoid function calls (You have to mark your functor as inline). A good example of this case is the qsort vs std::sort performance comparison, where std::sort using inline functors outperform qsort by a lot, you can find more information on this on Effective STL where it is discussed extensively and mentioned in several chapters.
This also has several implications too, since function objects are passed and returned by value, you have to make sure your object have a well defined copy mechanisms, are small in size (otherwise it could get expensive), and are monomorphic (since passing polymorphic objects by value may result in object slicing).

Why the delegates or closures often referred as true "object-oriented function pointers"?

For example, considering the C#
Unlike function pointers in C or C++, delegates are object-oriented,
type-safe, and secure.
source:
http://msdn.microsoft.com/en-us/library/aa288459%28v=vs.71%29.aspx
Now talking about the C++ only, what is the real difference and what is missing from an OO prospective?
Also from another source
Most C++ programmers have never used member function pointers, and
with good reason. They have their own bizarre syntax (the ->* and .*
operators, for example), it's hard to find accurate information about
them, and most of the things you can do with them could be done better
in some other way. This is a bit scandalous: it's actually easier for
a compiler writer to implement proper delegates than it is to
implement member function pointers!
source:
http://www.codeproject.com/Articles/7150/Member-Function-Pointers-and-the-Fastest-Possible
I find that many programs in C++ uses the ->* syntax, i don't find that this is bizarre or strange; I don't get the point about this potential about the delegates and the attack to the pointers.
The difference between a member pointer and a true closure is that a closure contains the function and the associated state. The two are inseparable.
A member pointer is just the function. In order to call a member pointer, you must provide the state.
It's the difference between a function pointer and a functor. The function object holds the function to be called (as an overload of operator()), but it can also have members. Those members can be private, thus providing encapsulation. The function object is an object by C++ rules, so it has an explicit lifetime. It is constructed and destructed according to C++ rules.
A function pointer has no lifetime; it always exists (unless it's NULL). It has no state to encapsulate.
Therefore, it's not a closure because it can't close over anything.
That's why C++11 lambdas are implemented as explicit objects. That way, they can close over things. They have encapsulated state.
I think people find it difficult to write them, but they behave like ordinary pointers, if they are NULL then its wrong to execute them. If they are written and used properly then there is nothing wrong with them. I never really had problems with,maybe besides forgetting proper syntax.
In c++11, you can use std::function<> class which is a wrapper for function pointer. You can use it for functions, member functions, functions objects and lambdas.
for example:
void foo(int x);
std::function<void(int)> f=foo;
f(1);
or easier (works in VS2010):
void foo(int x);
std::function<decltype(foo)> f=foo;
f(1);
ref: http://en.cppreference.com/w/cpp/utility/functional/function
Member function pointers are not member functions. Function pointers in C/C++ are dangerous in the sense that they could point to deallocated memory or be set to NULL at any time. Generally they are avoided, though not always with things like boost::bind or swapping print functions.
Thus in general they aren't any more dangerous than standard pointers. However, you can oftentimes avoid using function pointers by just calling an object's member method or a static function somewhere else. It compiles to using function pointers on the inside, but you have better guarantees about the existence of that function when it gets called.

Disadvantages of passing around functions?

I'm learning C++ (coming from java) and recently discovered that you can pass functions around. This is really cool and I think immensely useful. Now I was thinking on how I could use this and one of the idea's that popped into my head was a completely customizable class.
The best example of my train of though for completely customizable classes (code) would be say a person class. Person would have all functions pertaining to P. Later Person may pick up a sword (S), so now Person has access to all functions pertaining to both P and S.
Are there limits or performance issues with this? Is this sloppy and just plain frowned upon?
Any insight is educational, thanks.
~Aedon
When passing around functions - i.e. pointers to functions really - calls are always indirect and therefore possibly slower than a direct call (and definitely slower than an inlined call altogether).
The STL is modeled with functors. That is: light function objects that have a operator() member which gets called. This has the advantage of being a very likely candidate for inlining, especially if the functor and operator() are very simple (as e.g. std::less<T>).
See also: http://www.sgi.com/tech/stl/functors.html
There is a slight performance hit since a pointer or reference must be dereferenced before calling the function.
This is a very advantageous feature. Many design patterns and polymorphism depend on pointers to functions. Check out the "Visitor Design Pattern".
Another usage is for a table of functions. For example, you could write a generic menu engine that displays different menus by using different functions.
Also research "Factory design pattern."
There's absolutely nothing wrong with passing around a function, but it's kind of primitive and limiting. Often you have some data that you want to associate with the function, in addition to the parameters you're passing to it. Also you might want to group related functions together and pass them as one. Congratulations, you've just described a C++ class!
If you want to see how C++ can really blur the line, consider a functor. This is a class that has an operator() method, so that you can call it just as you would a function. It has two immediate advantages over a plain function: it can hold state between calls, and it can be inlined by the compiler for superior performance. It's not uncommon for std::sort to outperform the older C qsort for example, because qsort uses a function pointer while std::sort uses a functor.