As per my understanding, when we call a non-inlined function like foo(), program control shifts to the called function's address after storing the caller's location, and then returns back to the caller at the statement following the function call. But when I implement a class with an operator definition, does the same process occur, or does something different happen in favour of the operator function?
An operator overload is just a function with a peculiar name.
The compiler translates use of the operator into a function call.
That is, a + b becomes a.operator+(b) or operator+(a, b), depending on how the overload is defined.
(You can also write those out yourself, and it will behave exactly the same but miss the point.)
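For instance, a minimal sketch (with a hypothetical Money class used only for illustration) showing that the operator syntax and the explicit member-function call are the same thing:
#include <iostream>

// Hypothetical class used only for illustration.
struct Money {
    long cents;

    // Member operator overload: a + b becomes a.operator+(b).
    Money operator+(const Money& other) const {
        return Money{cents + other.cents};
    }
};

int main() {
    Money a{150}, b{250};

    Money c = a + b;             // the compiler rewrites this as a.operator+(b)
    Money d = a.operator+(b);    // the same call, written out by hand

    std::cout << c.cents << ' ' << d.cents << '\n';  // 400 400
}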
Note that function call overhead is something I haven't seen anyone worry about during this millennium. It only takes nanoseconds on a reasonably modern machine, unless you make very expensive argument copies – but then you get rid of the copying, not the function.
You will very likely never encounter a situation where getting rid of function calls is your top-priority speed optimisation.
Virtual function calls can matter in very time-sensitive situations, for instance in a tight loop, but those instances are rare.
(And the overhead for that is not the function call per se, but is caused by the late binding.)
For example, calling int func(int a, int *b, ...) will put a and b into registers r0, r1 and so on, call the function, and return the result in r0 (remark: speaking very generally here, not tied to any specific processor or calling convention; also, let's assume fast-call passing of arguments in registers).
Now, wouldn't it be better if each function were compiled with its arguments passed in the registers that are already preferred for their argument types (pointers in pointer/base registers, array-like data in vector registers, ...), instead of following one of a few calling-convention rules that use the more-or-less strict order of arguments given in the function prototype?
This way the code might avoid a few instructions used to shuffle arguments between registers before and after such a function call, and also often avoid reusing the same base registers over and over, etc. Of course, this would require some extra data for each function (presumably kept in a database or in the object files), but it would be beneficial even in the case of dynamic linking.
So, are there any strong arguments against doing something like that in the 21st century, except maybe historical reasons?
For functions that are called from a single place, or that are at the very least static so the compiler knows every call site (assuming it can prove that the function's address is not passed around as a function pointer), this could be done.
The catch is that almost every time this can be done it is done, but in a slightly different way: the compiler can inline code, and once inlined, the calling convention is no longer needed, because there is no longer a call being made.
But let's get back to the database idea: you could argue that it has no runtime cost. When generating the code, the compiler checks the database and generates the appropriate code. This doesn't really help in any way. You still have a calling convention, but instead of having one (or a few) that is respected by all code, you now have a different calling convention for each function. Sure, the compiler no longer needs to put the first argument in r0, but it needs to put it in r1 for foo, in r5 for bar, etc. There's still overhead for setting up the proper registers with the proper values. Knowing which registers to restore after such a function call also becomes harder: calling conventions clearly specify which registers are volatile (their values are lost upon returning from a called function) and which are non-volatile (their values are preserved).
A far more useful feature is to generate the code of the called function in such a way that it uses the registers that already happen to hold those values. This can happen when inlining code.
To add to this, I believe this is what Rust does in Rust-to-Rust calls. As far as I know, the language does not have a fixed calling convention. Instead, the compiler tries to generate code based on which registers already hold the argument values. Unfortunately I can't seem to find any official docs about this, but this rust-lang discussion may be of help.
Going one step further: not all code paths are known at compile time. Think about function pointers: if I have the following code:
typedef void (*my_function_ptr_t)(int arg1);

// foo, bar and baz are assumed to be ordinary functions matching the pointer type.
void foo(int arg1);
void bar(int arg1);
void baz(int arg1);

my_function_ptr_t get_function(int value) {
    switch (value) {
        case 0:  return foo;
        case 1:  return bar;
        default: return baz;
    }
}

void do_some_stuff(int a, int b) {
    my_function_ptr_t handler = get_function(a);
    handler(b);  // the callee is not known until runtime
}
Under the database proposal foo, bar, and baz can have completely different calling conventions. This either means that you can't actually have working function pointers anymore, or that the database needs to be accessible at runtime, and the compiler will generate code that checks it at runtime in order to properly call a function through a function pointer. This can have some serious overhead, for no actual gain.
This is by no means an exhaustive list of reasons, but the main idea is this: by having each function expect arguments to be in different registers you replace one calling convention with many calling conventions without gaining anything from it.
While searching for the difference between new and malloc, I came across this statement (source):
new is faster than malloc() because an operator is always faster than a function.
Are operators always faster than functions? If so, why? I would really appreciate low-level explanations (you can assume basic compiler, SASS, and hardware knowledge).
new is faster than malloc() because an operator is always faster than a function.
This is completely untrue. In fact, it is quite typical that the default behaviour of a new-expression is to internally call malloc, in which case it cannot possibly be faster.
There is no reason to expect different performance for using one over another as long as the contending programs do the same thing. The reasons to use new instead of malloc are not related to performance.
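To illustrate (a minimal sketch, not the exact code of any particular standard library, though many default implementations behave essentially like this): a replaceable global operator new that forwards to malloc, so a new-expression ends up calling the very function it is claimed to beat:
#include <cstdlib>
#include <new>

// Sketch of a replacement allocation function; the default one in many
// implementations does essentially the same thing.
void* operator new(std::size_t size) {
    if (void* p = std::malloc(size ? size : 1))
        return p;
    throw std::bad_alloc();   // operator new must not return null on failure
}

void operator delete(void* p) noexcept {
    std::free(p);
}

int main() {
    int* p = new int(42);   // the new-expression calls operator new, which calls malloc
    delete p;
}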
Are operators faster than functions?
Calling a function at runtime is potentially slower than not calling a function.
But, as we've found out, an operator can actually internally call a function. Besides, a function call for the abstract machine doesn't necessarily mean that a function will be called at runtime. As long as the compiler is able to produce the result of the function at compile time, or if it is able to expand the call inline, then there is no need for any function call overhead.
So, it depends on what function calls we are discussing. As far as a C++ function call is concerned: It is not necessarily slower than the use of an operator.
Also, do note that all overloaded operators that operate on class types are actually function calls to the operator overload function.
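As a small sketch of that point (using a hypothetical square function): when the compiler can evaluate or inline the call, the abstract-machine function call leaves no trace at runtime.
#include <iostream>

constexpr int square(int x) { return x * x; }

int main() {
    constexpr int a = square(6);   // evaluated at compile time; no call happens at runtime
    int b = square(7);             // trivially inlined by any optimising compiler
    std::cout << a + b << '\n';    // 85
}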
I know of one way to call a function:
func(x, y);
Are there more ways to call a function?
Functions can be invoked
explicitly, by providing a parenthesized argument list after a designation of the function (in the case of constructors this is decidedly not formally correct wording, since they don't have names, but anyway),
implicitly, in particular destructors and default constructors, but also implicit type conversion (see the sketch at the end of this answer),
via operators other than the function call operator (), in particular the copy assignment operator = and the dereferencing operator ->,
in a placement new expression, which invokes a specified allocation function by placing a parenthesized argument list right after new (not sure if this counts as a separate way).
In addition, library facilities can of course invoke functions for you.
I think the above list is exhaustive, but I'm not sure. I remember Andrei Alexandrescu enumerated the constructs that yielded callable thingies, in his Modern C++ Design book, and there was a surprise for me. So there is a possibility that the above is not exhaustive.
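Here is the promised sketch of the implicit cases, with a hypothetical Logger class (the converting constructor and the destructors run without any call syntax appearing in the source):
#include <iostream>
#include <string>

// Hypothetical class used only to illustrate implicit invocation.
struct Logger {
    std::string name;
    Logger(const char* n) : name(n) {}            // converting constructor
    ~Logger() { std::cout << name << " destroyed\n"; }
};

void log_with(const Logger& l) { std::cout << l.name << '\n'; }

int main() {
    Logger a("explicit");    // constructor invoked with an argument list
    log_with("implicit");    // converting constructor invoked implicitly to build a temporary
}                            // destructors invoked implicitly at the end of the scope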
Arbitrary functions can be invoked:
using f(arguments...) notation
via a pointer to the function (whether member or non-)
via a std::function (the implementation's left unspecified, though I'd expect it to use a pointer to function or pointer to member function under the covers, so no new language features) – a short sketch of these three appears at the end of this answer
Class-specific functions are also invoked in certain situations:
constructors are invoked when objects are created on the stack, and when static/global or thread-specific objects or dynamically-allocated objects are dynamically initialised, or with placement new, and as expressions are evaluated
destructors are invoked when objects leave scope, are deleted, threads exit, temporaries are destroyed, and when the destructor is explicitly called ala x.~X()
all manner of operators ([], +=, ==, < etc.) may be invoked during expression evaluation
Arbitrary non-member functions may be run by:
earlier std::atexit() or std::at_quick_exit() registrations (and if those handlers throw, std::terminate may run)
thread creation and asynchronous signals (again the interfaces accept pointer to functions, and there's no reason to think any implementation has or would use any other technique to achieve dispatch)
Specific functions are triggered in very specific situations:
main() is executed by the runtime
std::unexpected and, ultimately, std::terminate are invoked when dynamic exception specifications are violated
It's also possible to use setjmp and longjmp to "jump" back into a function... not quite the same thing as calling it though.
Though not truly "C++", it's also possible to arrange function execution using inline assembly language / linked assembler, writing to executable memory.
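For the "arbitrary functions" bullets above, here is the promised sketch showing the plain call, the function pointer, the pointer to member function, and std::function side by side:
#include <functional>
#include <iostream>

int add(int a, int b) { return a + b; }

struct Calc {
    int bias;
    int apply(int x) const { return x + bias; }
};

int main() {
    std::cout << add(1, 2) << '\n';              // f(arguments...) notation

    int (*fp)(int, int) = &add;                  // via a pointer to the function
    std::cout << fp(3, 4) << '\n';

    Calc c{10};
    int (Calc::*mp)(int) const = &Calc::apply;   // via a pointer to member function
    std::cout << (c.*mp)(5) << '\n';

    std::function<int(int, int)> f = add;        // via a std::function
    std::cout << f(6, 7) << '\n';
}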
C++ is a fairly flexible language, and therefore this is a very vague question, as there can be a hundred different ways of "calling a function" given no limitation on what is allowed.
Remember a function is only really a block of code sitting somewhere in memory. The act of "calling" a function is to some extent the following:
Putting the parameters required in the correct registers/stack locations
Moving the PC (program counter) to the location of the function in memory (this is usually done with a "call"-type machine instruction)
Technically afterwards there might be some "clean-up" code depending on how the compiler implements functions.
In the end, all methods come down to this happening in one way or another.
Perhaps not 100% relevant here but remember that in C++ functions can be members of a class.
class MyClass {
public:
    void myFunction(int A);
};
Usually what happens in this case is that the class object is passed as the first parameter.
So the function call:
myObject.myFunction(A)
is in a way equivalent to calling:
myFunction(myObject,A)
If you look at function objects you will see this kind of behavior (see the function objects reference).
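A hedged sketch of that equivalence using std::invoke (C++17), which takes the object explicitly as the first argument:
#include <functional>
#include <iostream>

class MyClass {
public:
    void myFunction(int A) { std::cout << "got " << A << '\n'; }
};

int main() {
    MyClass myObject;

    myObject.myFunction(1);                           // ordinary member call
    std::invoke(&MyClass::myFunction, myObject, 2);   // object passed as the "first parameter"
}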
Ok so here is a short list:
call the function normally myFunc(a,b);
Function pointers, e.g. typedef int (*funcP)(int, int);
Function objects: overloading the () operator makes your object callable.
C++11's std::function largely replaces function pointers, and I suggest you look into how these work.
lambda functions are also a type of function in a way.
Delegates can have a variety of implementations.
Things like function pointers and delegates are many times used with the concept of a callback
You can use multi-cast delegates. (e.g. boost.signals2 or Qt Signals & slots)
You can bind to a function in a DLL and call it (see: DLL calling).
There are various ways to call functions between processes and over the network, usually referred to as RPC implementations.
In a threaded environment things might also get more interesting as you might want to call functions in a different thread.
See Qt Signals & Slots threaded connections
Also thread pools can be used.
Lastly I suppose it's a good idea to mention meta-programming and the idea of RTTI. This is not as strongly supported as in languages like C#.
If this is to be manually implemented, one would be able to search the list of available functions at run time and call one. By this method it would be possible to match a function at run time against a string name. This is to some extent implemented by Qt's MOC system.
What are we counting as a different way? If I have a function that is a member of a class foo, then I might call it like this:
foo.func(x, y);
If I have a pointer to foo, I would do this
foo->func(x, y);
If I had a class bar that was derived from foo, I might call foo's constructor with an initialization list
bar::bar(const int x, const int y) : foo(x, y) {}
A constructor is just a function, after all.
The following all call the function T::f on object t.
1. t.f();
2. t.T::f();
3. (t.*&T::f)();
I've seen the second one used where the other was not. What is the difference between these two and in what situation should one be preferred?
Thanks.
EDIT: Sorry, I forgot about the second one t.T::f(). I added that one in.
The calls t.f() and (t.*&T::f)() are semantically identical; the former is the "normal" way to call a member function. The call t.T::f() calls T::f() even if f() is an overridden virtual function.
The expression (t.*&T::f)() calls a member function by obtaining a pointer to a member function. The only potential effect that this expression could have is that it might inhibit inlining of the function for some compilers. Using a variable obtained from &T::f to call a member function would be a customization point, but directly calling the function this way is merely obfuscation and potentially a pessimization (if the compiler isn't capable of doing sufficient constant propagation to detect that the address can't change).
What I could imagine is that someone tried to inhibit a virtual function call on t. Of course, it doesn't work that way, because the pointer to member will still call the correct virtual function. To inhibit virtual dispatch you'd use t.T::f().
You should prefer t.f() over (t.*&T::f)(). If you want to inhibit virtual dispatch you'd use t.T::f() otherwise you'd use t.f(). The primary use for inhibiting virtual dispatch is to call the base class version of a function from within an overriding function. Otherwise it is rarely useful.
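A short sketch of that distinction (with a hypothetical derived class D): the plain and pointer-to-member calls both dispatch virtually, while the qualified call pins the base class version:
#include <iostream>

struct T {
    virtual void f() const { std::cout << "T::f\n"; }
    virtual ~T() = default;
};

struct D : T {
    void f() const override { std::cout << "D::f\n"; }
};

int main() {
    D d;
    T& t = d;

    t.f();           // virtual dispatch: prints D::f
    (t.*&T::f)();    // pointer-to-member call: still virtual, prints D::f
    t.T::f();        // qualified call inhibits virtual dispatch: prints T::f
}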
The first one is the regular one; that's the one you should prefer. The pointer-to-member one, (t.*&T::f)(), takes a member function pointer to f, binds it to t, and then calls it.
If there is an actual benefit in that extra trip I am not aware of it. It is the member version of (*&f)() when calling a free function.
The one that you added later on, t.T::f(), is statically dispatched so that the f of T is called even if it were virtual and t were a derived class of T with its own implementation. It effectively inhibits the virtual call mechanism.
The pointer-to-member form is pointless; it's just obfuscated code. No, it doesn't disable virtual dispatch or inlining. It and t.f() do the exact same thing, in each and every case.
Possible Duplicate:
Virtual Functions and Performance C++
Is it correct that a class member function call takes more time than a call to a simple (free) function? What if inheritance and virtual functions are used?
I have tried to collect my functions into a simple interface class (only member functions, no data members), and it appears that I lose time. Is there a way to fix it?
P.S. I'm checking with gcc and icc compilers and using -O3 option.
Premature optimization is the root of all evil
A nonstatic member function takes an additional argument, which is the object (a pointer or reference to it) on which the function is called. This is one overhead. If the function is virtual, then there is also one small indirection in the case of a polymorphic call, that is, the addition of the function's index to the virtual table base offset. Both of these "overheads" are so negligible that you shouldn't worry about them unless the profiler says this is your bottleneck. Which it most likely is not.
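A rough sketch of that hidden argument (the lowering shown is only an approximation; the exact code a compiler generates is implementation-specific):
#include <iostream>

struct Counter {
    int n = 0;
    void bump(int by) { n += by; }   // member function
};

// Roughly what the compiler generates for the member function above:
// the object arrives as an extra first ("this") parameter.
void bump_free(Counter* self, int by) { self->n += by; }

int main() {
    Counter c;
    c.bump(2);                   // member call: &c passed implicitly
    bump_free(&c, 3);            // equivalent free-function call: &c passed explicitly
    std::cout << c.n << '\n';    // 5
}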
Member functions, if they're not virtual, are the same as free functions. There is no overhead in their invocation.
However, in the case of virtual member functions there is overhead, as the call involves an indirection, and even that only applies when you call the virtual function through a pointer or reference (a so-called polymorphic call). If the call is not polymorphic, there is no difference.
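A small sketch of that distinction, using a hypothetical Shape hierarchy: only the call through a pointer or reference needs the vtable indirection; a call on the object itself is resolved statically.
#include <iostream>

struct Shape {
    virtual double area() const { return 0.0; }
    virtual ~Shape() = default;
};

struct Square : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
};

int main() {
    Square sq(3.0);
    Shape* p = &sq;

    std::cout << sq.area() << '\n';   // non-polymorphic call: type known statically,
                                      // typically resolved (and often inlined) at compile time
    std::cout << p->area() << '\n';   // polymorphic call: dispatched through the vtable
}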
There is no additional time penalty involved for member functions. Virtual functions are slightly slower but not by much. Unless you're running an incredibly tight loop, even virtual function overhead is negligible.
For normal functions, a simple "jump" to them is enough, which is very fast. The same goes for non-virtual member functions. For virtual functions, on the other hand, the address to jump to has to be fetched from a table, which of course involves a few more machine instructions and will therefore be slower. However, the difference is negligible and will hardly even be measurable.
In other words, don't worry about it. If you have slowdowns, then it's most likely (like 99.999%) something else. Use a profiler to find out where.