int __cdecl funcB(int a, int b) {
return 0;
}
int __stdcall funcA(int a, int b) {
return funcB(a, b);
}
I wrote these two functions, and they have different calling conventions: __stdcall and __cdecl.
My question is: why didn't MSVC report a compile error?
In my view, two functions with different calling conventions can't call each other.
If the caller thinks the callee should clean the stack, and the callee thinks the caller should clean it, that's my problem.
Any answers will be helpful.
In my view, two functions with different calling conventions can't call each other.
That's simply an incorrect view. A calling convention is just a set of rules for how arguments are handled across the call. The compiler generates instructions at each call site and within the body of the function that follow whichever convention the function is defined with.
If the caller thinks the callee should clean the stack, and the callee thinks the caller should clean it, that's my problem.
The problem you are thinking of is when the calling convention is omitted, and different translation units are compiled with different default conventions. The declarations in one TU are used in a manner incompatible with the definition in another TU.
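As a rough illustration of the point above, the sketch below has a __stdcall function calling a __cdecl one. The CC_CDECL and CC_STDCALL macros and all function names are hypothetical, introduced here only so the snippet also builds with compilers that lack the MSVC keywords; with MSVC they expand to the real keywords.

```cpp
#include <cassert>

// Hypothetical portability shim (not from the question): expands to the MSVC
// keywords where they exist, and to nothing elsewhere.
#if defined(_MSC_VER)
#  define CC_CDECL   __cdecl
#  define CC_STDCALL __stdcall
#else
#  define CC_CDECL
#  define CC_STDCALL
#endif

// Callee: under __cdecl the caller removes the arguments after the call.
int CC_CDECL add_cdecl(int a, int b) {
    return a + b;
}

// Caller: uses __stdcall for its own arguments, but each call to add_cdecl
// is compiled with the __cdecl rules taken from add_cdecl's declaration.
int CC_STDCALL add_twice_stdcall(int a, int b) {
    return add_cdecl(a, b) + add_cdecl(a, b);
}

// Usage example: the mixed call chain behaves like ordinary C++.
void check_mixed_calls() {
    assert(add_twice_stdcall(1, 2) == 6);
}
```

The compiler sees the convention of add_cdecl in its declaration, so nothing about the convention of the calling function gets in the way.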
I'm just learning how to create a dll with C++.
This appears:
BOOL WINAPI DllMain(HINSTANCE hinstDLL,DWORD fdwReason,LPVOID lpvReserved)
I cannot understand what "WINAPI" is in DllMain().
I know that a function is :
typeReturn functionName (params) { function body }
typeReturn : is the value that function returns,
functionName : is the name of the function,
params : are the parameters for the function,
{function body} : is the code inside in the function.
...
Following that explanation, what does WINAPI (or __stdcall) mean in that position in C++?
I'm not asking what WINAPI itself means.
************ UPDATE **************
C++ has calling conventions, which determine how each parameter is placed in memory.
Please read the question carefully and avoid marking it as a duplicate, because people learning C/C++ need to learn without falling into confusion.
WINAPI is defined as __stdcall.
Actually, __stdcall is a calling convention, and different calling conventions push parameters in different ways. Below are some of the C/C++ calling conventions:
In x86 :
C calling convention (__cdecl). The main characteristics of __cdecl calling convention are :
Arguments are passed from right to left, and placed on the stack.
Stack cleanup is performed by the caller.
Function name is decorated by prefixing it with an underscore character '_' .
Standard calling convention (__stdcall). The main characteristics of __stdcall calling convention are :
Arguments are passed from right to left, and placed on the stack.
Stack cleanup is performed by the called function.
Function name is decorated by prepending an underscore character and appending a '@' character and the number of bytes of stack space required.
Fast calling convention (__fastcall). The main characteristics of __fastcall calling convention are :
The first two function arguments that require 32 bits or less are placed into registers ECX and EDX. The rest of them are pushed on the stack from right to left.
Arguments are popped from the stack by the called function.
Function name is decorated by prepending a '@' character and appending a '@' and the number of bytes (decimal) of space required by the arguments.
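The decoration rules listed above can be sketched as a few string helpers. This is only an illustration for C-linkage names on 32-bit x86; the helper names are hypothetical, and the rounding of each argument up to 4 bytes is my assumption about how the byte count is formed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Bytes of stack space for the arguments. Assumption: each argument is
// rounded up to a multiple of 4 bytes on 32-bit x86.
std::size_t stack_bytes(const std::vector<std::size_t>& arg_sizes) {
    std::size_t total = 0;
    for (std::size_t size : arg_sizes) {
        total += (size + 3) / 4 * 4;
    }
    return total;
}

// __cdecl: leading underscore only.
std::string decorate_cdecl(const std::string& name) {
    return "_" + name;
}

// __stdcall: leading underscore, then '@' and the argument byte count.
std::string decorate_stdcall(const std::string& name,
                             const std::vector<std::size_t>& arg_sizes) {
    return "_" + name + "@" + std::to_string(stack_bytes(arg_sizes));
}

// __fastcall: leading '@', then '@' and the argument byte count.
std::string decorate_fastcall(const std::string& name,
                              const std::vector<std::size_t>& arg_sizes) {
    return "@" + name + "@" + std::to_string(stack_bytes(arg_sizes));
}

// Usage example for a function f taking one or two small arguments.
void check_decorations() {
    assert(decorate_cdecl("f") == "_f");
    assert(decorate_stdcall("f", {4}) == "_f@4");
    assert(decorate_fastcall("f", {4, 1}) == "@f@8");
}
```

C++ functions without extern "C" get a different, more elaborate mangling, so this sketch does not apply to them.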
In x64 :
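In x64, there is effectively a single, __fastcall-like convention. Keywords such as __cdecl, __stdcall and __fastcall are accepted but ignored (extensions such as __vectorcall aside).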
The x64 Application Binary Interface (ABI) uses a four register fast-call calling convention by default.
Notice :
When you call a function, what happens at the assembly level is all the passed-in parameters are pushed to the stack or placed in registers or placed in static storage, then the program jumps to a different area of code. The new area of code looks at the stack and expects the parameters to be placed there.
Different calling conventions push parameters in different ways. Some might push the first parameter first, or some might push the first param last. Or some might keep a parameter in a register and not push it at all.
By specifying a calling convention, you are telling the compiler how the parameters are to be pushed.
DLL stands for Dynamic-Link Library, which is used on the Windows operating system, so the DLL entry-point function should be marked with "WINAPI" when it runs on Windows.
It seems to me that MSVS ignores the __stdcall directive on my functions. I'm cleaning up the stack manually, but the compiler still appends ADD ESP instructions after each CALL.
This is how I declare the function:
extern "C" void * __stdcall core_call(int addr, ...);
#define function(...) (DWORD WINAPI) core_call(12345, __VA_ARGS__)
return function("Hello", 789);
And this is what the output looks like:
(image of the disassembly; source: server4u.cz)
I've marked with arrows the redundant ADD instructions, which MSVS automatically appends after each call, despite the fact that cleaning the stack is the callee's responsibility (reference: http://en.wikipedia.org/wiki/X86_calling_conventions#List_of_x86_calling_conventions), and this causes my program to crash. If I manually replace the ADD instructions with NOPs, the program works as intended. So, my question is: is there a way to force the compiler to stop adding these instructions?
Thanks.
The problem is here: , ...).
Functions with variable number of arguments cannot be __stdcall.
__stdcall functions must remove all their stack arguments from the stack at the end, but they can't know in advance how much stuff they will receive as parameters.
The same holds for __fastcall functions.
The only applicable calling convention for functions with variable number of arguments is __cdecl, where the caller has to remove the stack parameters after the call. And that's what the compiler uses despite your request to use __stdcall.
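To make that concrete, here is a minimal sketch of a variadic function declared __cdecl. The callee only learns how many arguments arrived through its first parameter, which is why it cannot know at compile time how many bytes to remove. CC_CDECL and sum_ints are hypothetical names; the macro expands to nothing outside MSVC.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical portability shim: real keyword on MSVC, nothing elsewhere.
#if defined(_MSC_VER)
#  define CC_CDECL __cdecl
#else
#  define CC_CDECL
#endif

// Sums `count` int arguments. The number of arguments is known only at run
// time, so the caller, which knows what it pushed, cleans the stack.
int CC_CDECL sum_ints(int count, ...) {
    assert(count >= 0);
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}
```

Each call site, for example sum_ints(2, 10, 20) and sum_ints(3, 1, 2, 3), pushes a different amount of data, and each one removes exactly what it pushed.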
There are (among others) two types of calling conventions: stdcall and cdecl. I have a few questions about them:
1. When a cdecl function is called, how does the caller know whether it should free up the stack? At the call site, does the caller know whether the function being called is a cdecl or a stdcall function? How does it work? How does the caller know if it should free up the stack or not? Or is it the linker's responsibility?
2. If a function declared as stdcall calls a function that has the cdecl calling convention, or the other way round, would this be inappropriate?
3. In general, can we say which call will be faster: cdecl or stdcall?
Raymond Chen gives a nice overview of what __stdcall and __cdecl do.
(1) The caller "knows" to clean up the stack after calling a function because the compiler knows the calling convention of that function and generates the necessary code.
void __stdcall StdcallFunc() {}
void __cdecl CdeclFunc()
{
// The compiler knows that StdcallFunc() uses the __stdcall
// convention at this point, so it generates the proper binary
// for stack cleanup.
StdcallFunc();
}
It is possible to mismatch the calling convention, like this:
LRESULT MyWndProc(HWND hwnd, UINT msg,
WPARAM wParam, LPARAM lParam);
// ...
// Compiler usually complains but there's this cast here...
windowClass.lpfnWndProc = reinterpret_cast<WNDPROC>(&MyWndProc);
So many code samples get this wrong it's not even funny. It's supposed to be like this:
// CALLBACK is #define'd as __stdcall
LRESULT CALLBACK MyWndProc(HWND hwnd, UINT msg,
WPARAM wParam, LPARAM lParam);
// ...
windowClass.lpfnWndProc = &MyWndProc;
However, assuming the programmer doesn't ignore compiler errors, the compiler will generate the code needed to clean up the stack properly since it'll know the calling conventions of the functions involved.
(2) Both ways should work. In fact, this happens quite frequently at least in code that interacts with the Windows API, because __cdecl is the default for C and C++ programs according to the Visual C++ compiler and the WinAPI functions use the __stdcall convention.
(3) There should be no real performance difference between the two.
In CDECL, arguments are pushed onto the stack in reverse order, the caller clears the stack, and the result is returned via a processor register (later I will call it "register A"). STDCALL differs in one respect: the caller doesn't clear the stack; the callee does.
You are asking which one is faster. Neither. You should use the native calling convention as long as you can. Change the convention only if there is no way around it, such as when using external libraries that require a certain convention.
Besides, there are other conventions that a compiler can be told to use by default. For example, Visual C++ defaults to CDECL on x86 but can be switched to FASTCALL (the /Gr option), which is theoretically faster because it passes some arguments in processor registers.
Usually you must give the proper calling convention to callback functions passed to an external library. For example, a callback to qsort from the C library must be CDECL (if the compiler uses another convention by default, we must mark the callback as CDECL), and the various WinAPI callbacks must be STDCALL (almost the whole WinAPI is STDCALL).
Another common case is storing pointers to external functions. For example, to create a pointer to a WinAPI function, its type definition must be marked with STDCALL.
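A small sketch of the qsort case mentioned above: the comparator is explicitly marked with the C convention so it still matches what the C library expects if the project-wide default is changed. CC_CDECL, compare_ints, sort_ints and demo_sort_ints are hypothetical names; the macro is empty outside MSVC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical portability shim: real keyword on MSVC, nothing elsewhere.
#if defined(_MSC_VER)
#  define CC_CDECL __cdecl
#else
#  define CC_CDECL
#endif

// Comparator with the convention qsort expects for its callback.
int CC_CDECL compare_ints(const void* lhs, const void* rhs) {
    int a = *static_cast<const int*>(lhs);
    int b = *static_cast<const int*>(rhs);
    return (a > b) - (a < b);  // negative, zero or positive
}

// Sorts `count` ints in ascending order using the C library's qsort.
void sort_ints(int* values, std::size_t count) {
    assert(values != nullptr || count == 0);
    std::qsort(values, count, sizeof(int), compare_ints);
}

// Usage example: returns true if a small array ends up sorted.
bool demo_sort_ints() {
    int values[] = {3, 1, 2};
    sort_ints(values, 3);
    return values[0] == 1 && values[1] == 2 && values[2] == 3;
}
```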
And below is an example showing how the compiler does it:
/* 1. calling function in C++ */
i = Function(x, y, z);
/* 2. function body in C++ */
int Function(int a, int b, int c) { return a + b + c; }
CDECL:
/* 1. calling CDECL 'Function' in pseudo-assembler (similar to what the compiler outputs) */
push on the stack a copy of 'z', then a copy of 'y', then a copy of 'x'
call (jump to function body; after the function is finished it will jump back here, and the address to jump back to is pushed on the stack by the call)
move contents of register A to 'i' variable
pop all from the stack that we have pushed (copy of x, y and z)
/* 2. CDECL 'Function' body in pseudo-assembler */
/* Now copies of 'a', 'b' and 'c' variables are pushed onto the stack */
copy 'a' (from stack) to register A
copy 'b' (from stack) to register B
add A and B, store result in A
copy 'c' (from stack) to register B
add A and B, store result in A
jump back to caller code (a, b and c still on the stack, the result is in register A)
STDCALL:
/* 1. calling STDCALL in pseudo-assembler (similar to what the compiler outputs) */
push on the stack a copy of 'z', then a copy of 'y', then a copy of 'x'
call
move contents of register A to 'i' variable
/* 2. STDCALL 'Function' body in pseudo-assembler */
pop 'a' from stack to register A
pop 'b' from stack to register B
add A and B, store result in A
pop 'c' from stack to register B
add A and B, store result in A
jump back to caller code (a, b and c are no more on the stack, result in register A)
I noticed a posting that says it does not matter if you call a __stdcall from a __cdecl or vice versa. It does, as soon as the caller assumes the wrong convention for the callee.
The reason: with __cdecl, the arguments passed to the called function are removed from the stack by the calling function; with __stdcall, they are removed by the called function. If you call a __cdecl function as though it were __stdcall, the stack is not cleaned up at all, so any later stack-based reference to arguments or to a return address will use stale data at the current stack pointer. If you call a __stdcall function as though it were __cdecl, the __stdcall function cleans up the arguments on the stack, and then the caller does it again, possibly removing the calling function's return information.
The Microsoft convention for C tries to circumvent this by mangling the names. A __cdecl function is prefixed with an underscore. A __stdcall function is prefixed with an underscore and suffixed with an at sign "@" and the number of bytes to be removed. E.g., __cdecl f(int x) is linked as _f, and __stdcall f(int x) is linked as _f@4 (where sizeof(int) is 4 bytes).
If you manage to get past the linker, enjoy the debugging mess.
I want to improve on @adf88's answer. I feel that the pseudocode for STDCALL does not reflect how it happens in reality. 'a', 'b', and 'c' aren't popped from the stack in the function body. Instead they are popped by the ret instruction (ret 12 would be used in this case), which in one swoop jumps back to the caller and at the same time pops 'a', 'b', and 'c' from the stack.
Here is my version corrected according to my understanding:
STDCALL:
/* 1. calling STDCALL in pseudo-assembler (similar to what the compiler outputs) */
push on the stack a copy of 'z', then copy of 'y', then copy of 'x'
call
move contents of register A to 'i' variable
/* 2. STDCALL 'Function' body in pseudo-assembler */
copy 'a' (from stack) to register A
copy 'b' (from stack) to register B
add A and B, store result in A
copy 'c' (from stack) to register B
add A and B, store result in A
jump back to caller code and at the same time pop 'a', 'b' and 'c' off the stack (a, b and
c are removed from the stack in this step, result in register A)
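The difference between the two sequences can also be played through with a toy model. This is only a sketch under loud assumptions: the "stack" is a std::vector of 4-byte slots that grows upward (a real x86 stack grows downward), the return address is a placeholder value, and all names are made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct CallResult {
    int value;               // what the callee returned ("register A")
    std::size_t slots_left;  // 4-byte stack slots still in use afterwards
};

// Callee body, identical for both conventions: the return address is on top
// of the stack, with a, b and c just below it.
static int sum_args(const std::vector<int>& stack) {
    assert(stack.size() >= 4);
    std::size_t top = stack.size() - 1;
    return stack[top - 1] + stack[top - 2] + stack[top - 3];
}

// Pushes z, y, x (right to left) and then a placeholder return address.
static std::vector<int> push_call(int x, int y, int z) {
    std::vector<int> stack;
    stack.push_back(z);
    stack.push_back(y);
    stack.push_back(x);
    stack.push_back(0);  // stands in for the return address pushed by "call"
    return stack;
}

CallResult simulate_cdecl(int x, int y, int z) {
    std::vector<int> stack = push_call(x, y, z);
    int result = sum_args(stack);
    stack.pop_back();                // callee: plain "ret" pops the return address
    stack.resize(stack.size() - 3);  // caller: "add esp, 12" drops the arguments
    return {result, stack.size()};
}

CallResult simulate_stdcall(int x, int y, int z) {
    std::vector<int> stack = push_call(x, y, z);
    int result = sum_args(stack);
    stack.resize(stack.size() - 4);  // callee: "ret 12" pops return address and arguments
    return {result, stack.size()};
}
```

Both variants return the same sum and leave the model stack empty; they differ only in which side performs the cleanup.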
It's specified in the function type. When you have a function pointer, it's assumed to be cdecl if not explicitly stdcall. This means that if you get a stdcall pointer and a cdecl pointer, you can't exchange them. The two function types can call each other without issues; the only problem is getting one type when you expect the other. As for speed, they both do the same work, just in a very slightly different place, so it's really irrelevant.
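A sketch of what "part of the function type" means for pointers. All names and the CC_* macros are hypothetical; the macros expand to the MSVC keywords under MSVC and to nothing elsewhere, so the type mismatch mentioned in the comment is only diagnosed where the conventions really differ (32-bit x86 with MSVC).

```cpp
#include <cassert>

// Hypothetical portability shim: real keywords on MSVC, nothing elsewhere.
#if defined(_MSC_VER)
#  define CC_CDECL   __cdecl
#  define CC_STDCALL __stdcall
#else
#  define CC_CDECL
#  define CC_STDCALL
#endif

// The convention is written inside the pointer type itself.
typedef int (CC_STDCALL *stdcall_fn)(int);
typedef int (CC_CDECL   *cdecl_fn)(int);

int CC_STDCALL twice(int x)  { return 2 * x; }
int CC_CDECL   thrice(int x) { return 3 * x; }

// Each helper calls through a pointer using the convention in its type.
int apply_stdcall(stdcall_fn fn, int value) {
    assert(fn != nullptr);
    return fn(value);
}

int apply_cdecl(cdecl_fn fn, int value) {
    assert(fn != nullptr);
    return fn(value);
}

// On 32-bit x86 with MSVC, apply_cdecl(twice, 5) should be rejected at
// compile time because the pointer types differ.
```

Passing twice to apply_stdcall and thrice to apply_cdecl works everywhere, since each pointer type matches its function.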
The caller and the callee need to use the same convention at the point of invocation; that's the only way it can reliably work. Both the caller and the callee follow a predefined protocol, for example about who cleans up the stack. If the conventions mismatch, your program runs into undefined behavior and likely just crashes spectacularly.
This is only required per invocation site; the calling code itself can be a function with any calling convention.
You shouldn't notice any real difference in performance between those conventions. If that becomes a problem, you usually need to make fewer calls, for example by changing the algorithm.
Those things are compiler- and platform-specific. Neither the C nor the C++ standard says anything about calling conventions, except for extern "C" in C++.
how does a caller know if it should free up the stack ?
The caller knows the calling convention of the function and handles the call accordingly.
At the call site, does the caller know if the function being called is a cdecl or a stdcall function ?
Yes.
How does it work ?
It is part of the function declaration.
How does the caller know if it should free up the stack or not ?
The caller knows the calling conventions and can act accordingly.
Or is it the linkers responsibility ?
No, the calling convention is part of a function's declaration so the compiler knows everything it needs to know.
If a function which is declared as stdcall calls a function(which has a calling convention as cdecl), or the other way round, would this be inappropriate ?
No. Why should it?
In general, can we say that which call will be faster - cdecl or stdcall ?
I don't know. Test it.
a) When a cdecl function is called by the caller, how does a caller know if it should free up the stack?
The cdecl modifier is part of the function prototype (or function pointer type etc.) so the caller get the info from there and acts accordingly.
b) If a function which is declared as stdcall calls a function(which has a calling convention as cdecl), or the other way round, would this be inappropriate?
No, it's fine.
c) In general, can we say that which call will be faster - cdecl or stdcall?
In general, I would refrain from any such statements. The distinction matters, e.g., when you want to use va_arg functions. In theory, stdcall could be faster and generate smaller code because it allows combining the popping of the arguments with the popping of the locals, but OTOH with cdecl you can do the same thing, too, if you're clever.
The calling conventions that aim to be faster usually do some register-passing.
Calling conventions have nothing to do with the C/C++ programming languages and are rather specifics on how a compiler implements the given language. If you consistently use the same compiler, you never need to worry about calling conventions.
However, sometimes we want binary code compiled by different compilers to inter-operate correctly. When we do so we need to define something called the Application Binary Interface (ABI). The ABI defines how the compiler converts the C/C++ source into machine code. This will include calling conventions, name mangling, and v-table layout. cdecl and stdcall are two different calling conventions commonly used on x86 platforms.
By placing the information on the calling convention into the source header, the compiler will know what code needs to be generated to inter-operate correctly with the given executable.
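A sketch of what that looks like in practice, assuming a made-up library: the header pins the convention with a macro, and both the library and its users compile against the same declaration. MYLIB_CALL, mylib_add and use_mylib are hypothetical names; the macro is empty outside MSVC.

```cpp
#include <cassert>

// What a hypothetical header "mylib.h" might contain: the convention is
// spelled out once so every compiler that includes it agrees on the call.
#if defined(_MSC_VER)
#  define MYLIB_CALL __stdcall
#else
#  define MYLIB_CALL
#endif

extern "C" int MYLIB_CALL mylib_add(int a, int b);

// What the library's own source file might contain.
extern "C" int MYLIB_CALL mylib_add(int a, int b) {
    return a + b;
}

// Client code: the call is generated from the declaration above, whatever
// default convention the client's own project uses.
int use_mylib() {
    return mylib_add(2, 3);
}

// Usage example.
void check_mylib() {
    assert(use_mylib() == 5);
}
```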
Sample code that shows how to create threads using MFC declares the thread function as both static and __cdecl. Why is the latter required? Boost threads don't bother with this convention, so is it just an anachronism?
For example (MFC):
static UINT __cdecl MyFunc(LPVOID pParam)
{
...
}
CWinThread* pThread = AfxBeginThread(MyFunc, ...);
Whereas Boost:
static void func()
{
...
}
boost::thread t;
t.create(&func);
(the code samples might not be 100% correct as I am nowhere near an IDE).
What is the point of __cdecl? How does it help when creating threads?
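__cdecl tells the compiler to use the C calling convention (as opposed to stdcall, fastcall or whatever other calling convention your compiler supports). VC++ uses __cdecl by default for ordinary functions, unless a compiler option such as /Gz changes the default, in which case the explicit keyword matters.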
The calling convention affects things such as how arguments are pushed onto the stack (or registers, in the case of fastcall) and who pops arguments off the stack (caller or callee).
In the case of Boost, I believe it uses template specialization to figure out the appropriate function type and calling convention.
Look at the prototype for AfxBeginThread():
CWinThread* AfxBeginThread(
AFX_THREADPROC pfnThreadProc,
LPVOID pParam,
int nPriority = THREAD_PRIORITY_NORMAL,
UINT nStackSize = 0,
DWORD dwCreateFlags = 0,
LPSECURITY_ATTRIBUTES lpSecurityAttrs = NULL
);
AFX_THREADPROC is a typedef for UINT(AFX_CDECL*)(LPVOID). When you pass a function to AfxBeginThread(), it must match that prototype, including the calling convention.
The MSDN pages on __cdecl and __stdcall (as well as __fastcall and __thiscall) explain the pros and cons of each calling convention.
The boost::thread constructor uses templates to allow you to pass a function pointer or callable function object, so it doesn't have the same restrictions as MFC.
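The contrast can be sketched without any threading at all. Below, run_fixed takes a pointer type modeled on AFX_THREADPROC, so only functions with exactly that signature and convention fit, while run_any is a template that accepts any callable. Both helpers simply call their argument synchronously; all names and the CC_CDECL macro are hypothetical, and this is not how MFC or Boost are actually implemented.

```cpp
#include <cassert>

// Hypothetical portability shim: real keyword on MSVC, nothing elsewhere.
#if defined(_MSC_VER)
#  define CC_CDECL __cdecl
#else
#  define CC_CDECL
#endif

// Fixed pointer type, modeled on UINT(AFX_CDECL*)(LPVOID).
typedef unsigned (CC_CDECL *thread_proc_t)(void*);

// MFC-style entry point: signature and convention must match exactly.
unsigned run_fixed(thread_proc_t proc, void* param) {
    assert(proc != nullptr);
    return proc(param);
}

// Template-style entry point: the callable's type is deduced.
template <typename Callable>
void run_any(Callable fn) {
    fn();
}

// A function that fits the fixed pointer type.
unsigned CC_CDECL count_up(void* param) {
    ++*static_cast<int*>(param);
    return 0;
}

// Usage example: returns how many times the counter was bumped.
int demo_run_both() {
    int counter = 0;
    run_fixed(count_up, &counter);       // plain function, exact signature
    run_any([&counter] { ++counter; });  // lambda, no fixed signature needed
    return counter;
}
```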
Because your thread is going to be called by a runtime function that manages this for you, and that function expects it to be that way. Boost designed it a different way.
Put a breakpoint at the start of your thread function and look at the stack when it gets called, you'll see the runtime function that calls you.
C/C++ compilers by default use the C calling convention (pushing the rightmost parameter onto the stack first, with the caller cleaning up), because it allows functions with a variable number of arguments, such as printf.
The Pascal calling convention pushes the leftmost parameter first and has the callee clean up the stack. This saves a little code at each call site, though it costs you easy variable-argument functions (I read somewhere they're still possible, though you need to use some tricks).
For that reason, the 16-bit Windows API and the classic Mac OS API used the Pascal convention by default. Win32 uses __stdcall instead, which keeps the callee cleanup but pushes arguments right to left, except in certain cases.
If that function has only one param, in theory using either calling convention would be legal, though the compiler may enforce the same calling convention is used to avoid any problem.
The boost libraries were designed with an eye on portability, so they should be agnostic as to which calling convention a particular compiler is using.
The real answer has to do with how Windows internally calls the thread proc routine: it expects the function to abide by a specific calling convention, which in this case is given by a macro, WINAPI, which on my system is defined as:
#define WINAPI __stdcall
This means that the called function is responsible for cleaning up the stack. The reason why boost::thread is able to support arbitrary functions is that it passes a pointer to the function object used in the call to thread::create function to CreateThread. The threadproc associated with the thread simply calls operator() on the function object.
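The mechanism described above is often called a trampoline, and a rough sketch of it follows. To keep it portable and runnable, CreateThread is replaced by a stand-in that calls the thread proc synchronously, and std::function stands in for Boost's function object; every name here is hypothetical and none of this is Boost's actual code.

```cpp
#include <cassert>
#include <functional>

// Hypothetical portability shim: real keyword on MSVC, nothing elsewhere.
#if defined(_MSC_VER)
#  define CC_STDCALL __stdcall
#else
#  define CC_STDCALL
#endif

// Pointer type modeled on the thread procedure Windows expects.
typedef unsigned long (CC_STDCALL *os_thread_proc_t)(void*);

// The only function that needs the OS convention: it recovers the function
// object from the context pointer and invokes its operator().
unsigned long CC_STDCALL trampoline(void* context) {
    assert(context != nullptr);
    std::function<void()>* task = static_cast<std::function<void()>*>(context);
    (*task)();
    return 0;
}

// Stand-in for CreateThread (assumption): runs the proc on the current thread.
unsigned long fake_create_thread(os_thread_proc_t proc, void* context) {
    return proc(context);
}

// Accepts any callable, whatever its own signature or convention.
unsigned long run_task(std::function<void()> task) {
    return fake_create_thread(trampoline, &task);
}

// Usage example: returns how many times the task ran.
int demo_run_task() {
    int hits = 0;
    run_task([&hits] { hits += 1; });
    return hits;
}
```

With a real thread the function object would have to outlive the call that starts the thread; the synchronous stand-in sidesteps that lifetime question.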
The reason MFC requires __cdecl therefore has to do with the way it internally calls the function passed in to the call to AfxBeginThread. There is no good reason to do this unless they were planning on allowing vararg parameters...