What are custom calling conventions? - c++

What are these? And how am I affected by these as a developer?
Related:
What are the different calling conventions in C/C++ and what do each mean?

A calling convention describes how one function calls another. Parameters and state must be passed to the other function so that it can execute and return control correctly. The way in which this is done has to be standardized and specified, so that the compiler knows how to arrange parameters for consumption by the function being called. There are several standard calling conventions, but the most common are fastcall, stdcall, and cdecl.
Usually, the term custom calling convention is a bit of a misnomer and refers to one of two things:
A non-standard calling convention or one that isn't in widespread use (e.g. if you're building an architecture from scratch).
A special optimization that a compiler/linker can perform that uses a one-shot calling convention for the purpose of improving performance.
In the latter case, some values that would otherwise be pushed onto the stack are stored in registers instead. The compiler makes this decision based on how the parameters are used inside the code. For example, if a parameter is used as the maximum value for a loop index, such that the index is compared against the max on each iteration to see if the loop should continue, that is a good candidate for moving into a register.
If the optimization is carried out, this typically reduces code size and improves performance.
And how am I affected by these as a developer?
From your standpoint as a developer, you probably don't care; this is an optimization that will happen automatically.

Each language, when calling a function, has a convention about what parameters will be passed in register variables vs on the stack, and how return values will be returned.
Sometimes a different convention than the standard one is used and that's referred to as a custom calling convention.
This is most common when interoperating between different languages. For example, C and Pascal have different conventions about how to pass parameters. From C's point of view, the Pascal calling convention could be referred to as a custom calling convention.

I don't think you really need to care.
Normal calling conventions are things like __stdcall and __fastcall. They determine how your call signature is converted into a stack layout, who (caller or callee) is responsible for saving and restoring registers, etc. For example, __fastcall should use more registers where __stdcall would use more stack.
Custom calling conventions are optimised specifically for a particular function and the way it is used. They only happen, IIRC, for functions that are local to a particular module. That's how the compiler knows so much about how it's used, and also how you know that no external caller needs to be able to specify the convention.
Based on that, your compiler will use them automatically where appropriate, your code will run slightly faster and/or take slightly less space, but you don't really need to worry about it.

Unless you are directly manipulating the stack or writing inline assembly that references local variables, it does not affect you. Or unless you're interfacing with libraries linked with different calling conventions.
What it is: most compilers use a standard calling convention such as cdecl, where function arguments are pushed onto the stack in a defined order.

Related

Is there a particular reason for calling conventions with predetermined order of registers used for passing arguments?

For example, calling int func( int a, int * b...) will put a and b into registers r0, r1 and so on, call the function and return the result in r0 (remark: speaking very generally here, not related to any specific processor or calling convention; also, let's assume fast-call passing of arguments in registers).
Now, wouldn't it be better if each function were compiled with arguments passed in the registers already preferred for their argument types (pointers in pointer/base registers, array-like data in vector registers, ...), rather than following one of a few calling-convention rules that use the more-or-less strict order of function arguments as given in the function prototype?
This way the code might avoid a few instructions used to shuffle arguments between registers before and after such a function call, and also often avoid reusing the same few registers over and over. Of course, this would require some extra data for each function (presumably kept in a database or in object files), but it would be beneficial even in the case of dynamic linking.
So, are there any strong arguments against doing something like that in the 21st century, except maybe historical reasons?
For functions that are called from a single place, or that are at the very least static so the compiler knows every call site (assuming it can prove that the function's address is never passed around as a function pointer), this could be done.
The catch is that almost every time this can be done it is done, but in a slightly different way: the compiler can inline code, and once inlined, the calling convention is no longer needed, because there is no longer a call being made.
But let's get back to the database idea: you could argue that this has no runtime cost. When generating the code, the compiler checks the database and generates the appropriate code. This doesn't really help in any way. You still have a calling convention, but instead of having one (or a few) that is respected by all code, you now have a different calling convention for each function. Sure, the compiler no longer needs to put the first argument in r0, but it needs to put it in r1 for foo, in r5 for bar, etc. There's still overhead for setting up the proper registers with the proper values. Knowing what registers to restore after such a function call also becomes harder. Calling conventions specify clearly which registers are volatile (so their values are lost upon returning from a called function) and non-volatile (so their values are preserved).
A far more useful feature is to generate the code of the called function in such a way that it uses the registers that already happen to hold those values. This can happen when inlining code.
To add to this, I believe this is what Rust does in Rust-to-Rust calls. As far as I know, the language does not have a fixed calling convention. Instead, the compiler tries to generate code based on in which registers the values for the arguments are already present. Unfortunately I can't seem to find any official docs about this, but this rust-lang discussion may be of help.
Going one step further: not all code paths are known at compile time. Think about function pointers: if I have the following code:
typedef void (*my_function_ptr_t)(int arg1);

my_function_ptr_t get_function(int value) {
    switch (value) {
        case 0: return foo;
        case 1: return bar;
        default: return baz;
    }
}

void do_some_stuff(int a, int b) {
    my_function_ptr_t handler = get_function(a);
    handler(b);
}
Under the database proposal, foo, bar, and baz can have completely different calling conventions. This either means that you can't actually have working function pointers anymore, or that the database needs to be accessible at runtime, with the compiler generating code that checks it at runtime in order to properly call a function through a function pointer. That can have serious overhead, for no actual gain.
This is by no means an exhaustive list of reasons, but the main idea is this: by having each function expect arguments to be in different registers you replace one calling convention with many calling conventions without gaining anything from it.

Is there a reason some functions don't take a void*?

Many functions accept a function pointer as an argument. atexit and call_once are excellent examples. If these higher level functions accepted a void* argument, such as atexit(&myFunction, &argumentForMyFunction), then I could easily wrap any functor I pleased by passing a function pointer and a block of data to provide statefulness.
As is, there are many cases where I wish I could register a callback with arguments, but the registration function does not allow me to pass any arguments through. atexit only accepts one argument: a function taking 0 arguments. I cannot register a function to clean up after my object, I must register a function which cleans up after all objects of a class, and force my class to maintain a list of all objects needing cleanup.
I always viewed this as an oversight; there seemed no valid reason not to allow a measly 4- or 8-byte pointer to be passed along, unless you were on an extremely limited microcontroller. I always assumed they simply didn't realize how important that extra argument could be until it was too late to redefine the spec. In the case of call_once, the POSIX version accepts no arguments, but the C++11 version accepts a functor (which is virtually equivalent to passing a function and an argument, only the compiler does some of the work for you).
Is there any reason why one would choose not to allow that extra argument? Is there an advantage to accepting only "void functions with 0 arguments"?
I think atexit is just a special case, because whatever function you pass to it is supposed to be called only once. Therefore whatever state it needs to do its job can just be kept in global variables. If atexit were being designed today, it would probably take a void* in order to enable you to avoid using global variables, but that wouldn't actually give it any new functionality; it would just make the code slightly cleaner in some cases.
For many APIs, though, callbacks are allowed to take additional arguments, and not allowing them to do so would be a severe design flaw. For example, pthread_create does let you pass a void*, which makes sense because otherwise you'd need a separate function for each thread, and it would be totally impossible to write a program that spawns a variable number of threads.
Quite a number of the interfaces taking function pointers without a pass-through argument simply come from a different time. However, their signatures can't be changed without breaking existing code. It is sort of a misdesign, but that's easy to say in hindsight: the overall programming style has since moved toward limited use of functional techniques within generally non-functional programming languages. Also, at the time many of these interfaces were created, storing any extra data implied an observable cost even on "normal" computers: aside from the extra storage used, the extra argument also needs to be passed even when it isn't used. Sure, atexit() is hardly bound to be a performance bottleneck, seeing that each registered function is called just once, but if you passed an extra pointer everywhere, you'd surely also have one in qsort()'s comparison function.
Specifically for something like atexit(), it is reasonably straightforward to use a custom global object with which the function objects to be invoked upon exit are registered: just register one function with atexit() that calls all of the functions registered with said global object. Also note that atexit() is only guaranteed to support 32 registered functions, although implementations may support more. It seems ill-advised to use it as a registry for per-object clean-up functions, rather than registering a single function which calls the object clean-up functions, as other libraries may need to register functions too.
That said, I can't see why atexit() is particularly useful in C++, where objects are automatically destroyed upon program termination anyway. Of course, this approach assumes that all objects are somehow held, but that's normally necessary anyway in some form or other, and is typically done using appropriate RAII objects.

Should I use std::function or a function pointer in C++?

When implementing a callback function in C++, should I still use the C-style function pointer:
void (*callbackFunc)(int);
Or should I make use of std::function:
std::function< void(int) > callbackFunc;
In short, use std::function unless you have a reason not to.
Function pointers have the disadvantage of not being able to capture context. You won't be able, for example, to pass a lambda function as a callback if it captures context variables (it will work if it doesn't capture any). Calling a non-static member function of an object is thus also not possible, since the object (this pointer) needs to be captured.(1)
std::function (since C++11) is primarily for storing a function (passing it around doesn't require it to be stored). Hence if you want to store the callback, for example in a member variable, it's probably your best choice. But even if you don't store it, it's a good "first choice", although it has the disadvantage of introducing some (very small) overhead when called (so in a very performance-critical situation it might be a problem, but in most it should not). It is very "universal": if you care about consistent and readable code, and don't want to think about every choice you make (i.e. want to keep it simple), use std::function for every function you pass around.
Think about a third option: if you're about to implement a small function which then reports something via the provided callback function, consider a template parameter, which can then be any callable object, i.e. a function pointer, a functor, a lambda, a std::function, ... The drawback here is that your (outer) function becomes a template and hence needs to be implemented in the header. On the other hand, you get the advantage that the call to the callback can be inlined, as the client code of your (outer) function "sees" the call to the callback with the exact type information available.
Example for the version with the template parameter (write & instead of && for pre-C++11):
template <typename CallbackFunction>
void myFunction(..., CallbackFunction && callback) {
    ...
    callback(...);
    ...
}
As you can see in the following table, all of them have their advantages and disadvantages:

                                         function ptr | std::function | template param
  can capture context variables          no (1)       | yes           | yes
  no call overhead (see comments)        yes          | no            | yes
  can be inlined (see comments)          no           | no            | yes
  can be stored in a class member        yes          | yes           | no (2)
  can be implemented outside of header   yes          | yes           | no
  supported without C++11 standard       yes          | no (3)        | yes
  nicely readable (my opinion)           no           | yes           | (yes)
(1) Workarounds exist to overcome this limitation, for example passing the additional data as further parameters to your (outer) function: myFunction(..., callback, data) will call callback(data). That's the C-style "callback with arguments", which is possible in C++ (and by the way heavily used in the WIN32 API) but should be avoided because we have better options in C++.
(2) Unless we're talking about a class template, i.e. the class in which you store the function is a template. But that would mean that on the client side the type of the function decides the type of the object which stores the callback, which is almost never an option for actual use cases.
(3) For pre-C++11, use boost::function
void (*callbackFunc)(int); may be a C style callback function, but it is a horribly unusable one of poor design.
A well designed C style callback looks like void (*callbackFunc)(void*, int); -- it has a void* to allow the code that does the callback to maintain state beyond the function. Not doing this forces the caller to store state globally, which is impolite.
std::function< int(int) > ends up being slightly more expensive than int(*)(void*, int) invocation in most implementations. It is however harder for some compilers to inline. There are std::function clone implementations that rival function pointer invocation overheads (see 'fastest possible delegates' etc) that may make their way into libraries.
Now, clients of a callback system often need to set up resources and dispose of them when the callback is created and removed, and to be aware of the lifetime of the callback. void(*callback)(void*, int) does not provide this.
Sometimes this is available via code structure (the callback has limited lifetime) or through other mechanisms (unregister callbacks and the like).
std::function provides a means for limited lifetime management (the last copy of the object goes away when it is forgotten).
In general, I'd use a std::function unless performance concerns manifest. If they did, I'd first look for structural changes (instead of a per-pixel callback, how about generating a scanline processor based off the lambda you pass me?), which should be enough to reduce function-call overhead to trivial levels. Then, if the problem persists, I'd write a delegate based off fastest possible delegates, and see if the performance problem goes away.
I would mostly only use function pointers for legacy APIs, or for creating C interfaces for communicating between different compilers generated code. I have also used them as internal implementation details when I am implementing jump tables, type erasure, etc: when I am both producing and consuming it, and am not exposing it externally for any client code to use, and function pointers do all I need.
Note that you can write wrappers that turn a std::function<int(int)> into a int(void*,int) style callback, assuming there are proper callback lifetime management infrastructure. So as a smoke test for any C-style callback lifetime management system, I'd make sure that wrapping a std::function works reasonably well.
Use std::function to store arbitrary callable objects. It allows the user to provide whatever context is needed for the callback; a plain function pointer does not.
If you do need to use plain function pointers for some reason (perhaps because you want a C-compatible API), then you should add a void * user_context argument so it's at least possible (albeit inconvenient) for it to access state that's not directly passed to the function.
The only reason to avoid std::function is support of legacy compilers that lack support for this template, which has been introduced in C++11.
If supporting pre-C++11 language is not a requirement, using std::function gives your callers more choice in implementing the callback, making it a better option compared to "plain" function pointers. It offers the users of your API more choice, while abstracting out the specifics of their implementation for your code that performs the callback.
std::function may introduce an extra indirection (a vtable-like dispatch through its stored callable) in some cases, which has some impact on performance.
The other answers answer based on technical merits. I'll give you an answer based on experience.
As a very heavy X-Windows developer who always worked with function pointer callbacks with void* pvUserData arguments, I started using std::function with some trepidation.
But I find out that combined with the power of lambdas and the like, it has freed up my work considerably to be able to, at a whim, throw multiple arguments in, re-order them, ignore parameters the caller wants to supply but I don't need, etc. It really makes development feel looser and more responsive, saves me time, and adds clarity.
On this basis I'd recommend anyone to try using std::function any time they'd normally have a callback. Try it everywhere, for like six months, and you may find you hate the idea of going back.
Yes there's some slight performance penalty, but I write high-performance code and I'm willing to pay the price. As an exercise, time it yourself and try to figure out whether the performance difference would ever matter, with your computers, compilers and application space.

Why are C++ function calls cheap?

While reading Stroustrup's "The C++ Programming Language", I came across this sentence on p. 108:
"The style of syntax analysis used is usually called recursive descent; it is a popular and straightforward top-down technique. In a language such as C++, in which function calls are relatively cheap, it is also efficient."
Can someone explain why C++ function calls are cheap? I'd be interested in a general explanation, i.e. what makes a function call cheap in any language, as well, if that's possible.
Calling C or C++ functions (in particular when they are not virtual) is quite cheap, since it involves only a few machine instructions and a jump (with a link to the return address) to a known location.
On some other languages (e.g. Common Lisp, when applying an unknown variadic function), it may be more complex.
Actually, you should benchmark: many recent processors are out-of-order & superscalar, so are doing "several things at a time".
However, optimizing compilers are capable of marvellous tricks.
For many functional languages, a called function is in general a closure, and needs some indirection (and also passing of the closed-over values).
Some object-oriented languages (like Smalltalk) may involve searching a dictionary of methods when invoking a selector (on an arbitrary receiver).
Interpreted languages may have considerably larger function-call overhead.
Function calls are cheap in C++ compared to most other languages for one reason: C++ is built upon the concept of function inlining, whereas (for example) Java is built upon the concept of everything-is-a-virtual-function.
In C++, most of the time you're calling a function, you're not actually generating a call instruction. Especially when calling small or template functions, the compiler will most likely inline the code. In such cases, the function call overhead is simply zero.
Even when the function is not inlined, the compiler can make assumptions about what the function does. For example, the Windows x64 calling convention specifies that the registers R12-R15 and XMM6-XMM15 are non-volatile: a function that uses them must save and restore them. When the compiler can see the called function's body and prove which registers it actually touches, it can keep values live in the untouched registers across the call and omit save/restore code that would otherwise be needed. This optimization is much harder when calling a virtual function, since the call target isn't known at compile time.
Sometimes inlining is not possible. Common reasons include the function body not being available at compile time, or the function being too large. In that case the compiler generates a direct call instruction. However, because the call target is fixed, the CPU can prefetch the instructions quite well. Although direct function calls are fast, there is still some overhead, because the caller needs to save some registers on the stack, adjust the stack pointer, etc.
Finally, when using a Java method call or a C++ function with the virtual keyword, the CPU will execute an indirect call. The difference from a direct call is that the target is not fixed, but instead stored in memory. The target function may change during the runtime of the program, which means that the CPU cannot always prefetch the instructions at the function location. Modern CPUs and JIT compilers have various tricks up their sleeve to predict the location of the target function, but it is still not as fast as a direct call.
tldr: function calls in C++ are fast because C++ implements inlining and by default uses direct calls over virtual calls. Many other languages do not implement inlining as well as C++ does and use virtual functions by default.
The cost of a function call is associated with the set of operations required to go from a given scope to another, i.e., from a current execution to the scope of another function. Consider the following code:
void foo(int w) { int x, y, z; ...; }
int main() { int a, b, c; ...; foo(b); ...; }
The execution starts in main(), and you may have some variables loaded into registers/memory. When you reach foo(), the set of variables available for use is different: the values of a, b, c are not reachable by function foo(), and if you run out of available registers, stored values will have to be spilled to memory.
The issue with registers appears in any language. But some languages need more complex operations to change from scope to scope: C++ simply pushes whatever is required by the function onto the memory stack, maintaining a frame per scope (in this case, while running foo(), main()'s frame with a, b, c remains on the stack below foo()'s).
Other languages must allocate and pass forth complex data to allow access for surrounding scope variables. These extra allocations, and even searches for specific labels within the surrounding scopes, can raise the cost of function calls considerably.

Where is stored in memory the reference to current object?

I have a simple question. I know that in a compiled program, when I call a function, a stack frame is generated with the arguments, space for local vars, the return address, and the registers that must be saved.
But in an object-oriented language like C++, where does the compiler store the reference to the current object? Will object->instanceMethod() store the object pointer like an argument on the call stack?
I know the question is generalist, and thanks for the answers.
It's implementation-defined, but in practice you will find that most (all?) C++ compilers generate code which passes the this pointer as a hidden first argument to the function, so you can access it without explicitly specifying it in the method signature.
In C++, when a member function is called, the pointer to the instance on which it will operate (i.e. what will be this inside the function) is implicitly passed alongside the other function arguments. Actually, different systems use different conventions, so some number of such parameters could be passed in registers and never placed on the stack (this tends to be faster), but your conception is basically sound.