What makes the calling convention different? - c++

As far as I know, the calling convention depends on whether the platform is Windows or Linux.
I want to know which of the following is true:

1. Compilers make the calling convention different.
2. Platforms make the calling convention different.

If only 2 is true, is the calling convention defined by the platform, and do compilers just follow the defined convention?

Platforms generally define one or more "standard" calling conventions. Compilers need to follow those conventions if they want to interoperate with other tools or components on the platform using those conventions, but can use their own different calling conventions internally.
The only real requirement is that any caller and callee need to agree on the conventions for the call between them.
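To make that concrete, here is a minimal sketch (assuming x86-64 GCC or Clang; ms_abi is a compiler extension, not standard C++, and add_ms is a made-up name) of a single function using a non-default convention; it works only because every caller sees the same declaration:

    // Declaration opts this one function into the Microsoft x64 convention,
    // even when the rest of the program uses the System V default.
    extern "C" int __attribute__((ms_abi)) add_ms(int a, int b);

    int call_it() {
        // The compiler emits an ms_abi call sequence here because the
        // declaration says so; a mismatched declaration would corrupt the call.
        return add_ms(1, 2);
    }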

Related: "Why does Windows64 use a different calling convention from all other OSes on x86-64?" talks about who defined the calling convention.
In that case, GCC developers effectively decided on behalf of the whole platform for the x86-64 System V ABI.
Obviously compiler devs are the most likely people to be able to design a good one, but once it's set, other people making new compilers have to follow it if they want to be compatible.
Non-Windows OSes chose to follow the same x86-64 System V calling convention because it was pretty well designed, so the set of platforms it covers grew to include all of them, partly because they all use GCC and GCC-compatible compilers. It's not as if developers of different compilers got together to agree on a calling convention they'd all follow for that platform; there was only one major free non-Windows C compiler at the time (the early 2000s).
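To see what such a convention actually pins down, here is a small sketch (register lists from the two x86-64 conventions; sum3 is a made-up function):

    // Integer/pointer argument registers on x86-64:
    //   System V (Linux, macOS, BSD): rdi, rsi, rdx, rcx, r8, r9, then the stack
    //   Microsoft x64 (Windows):      rcx, rdx, r8, r9, then the stack
    // So for this function, `a` arrives in rdi on Linux but in rcx on Windows.
    long sum3(long a, long b, long c) { return a + b + c; }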


Does the standard say anything about coexistence of exceptions and different calling conventions?

For example, what if a function declared with a non-default calling convention calls a normal (default-convention) function that throws? The stack gets unwound and... then what happens? It would have to... I don't even know...
I would guess that it's all just UB. But that would be too sad. Probably only the compilers have something to say about that.
Does the standard actually say something about that?
The calling convention used and how exceptions work internally are implementation-specific. For example, compilers on Linux and Windows use different calling conventions.
The C and C++ standards only specify how functions and exceptions should behave, but normally do not say much about how these features are to be implemented. Every combination of CPU, operating system and compiler may have their own way of implementing certain things.
If you want more information about exactly how the calling conventions are implemented on Linux and Windows with different CPUs/compilers, I recommend Agner Fog's optimization manual number 5 (the one on calling conventions). That manual also contains a chapter on exception handling/stack unwinding.
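To make the scenario from the question concrete, here is a minimal sketch (MSVC syntax on 32-bit x86, where __stdcall really differs from the default __cdecl; the function names are made up):

    #include <stdexcept>

    void helper() { throw std::runtime_error("boom"); }  // default convention

    // The convention changes how arguments are passed and who cleans the stack;
    // the compiler still records unwind information for this frame, so the
    // exception from helper() unwinds through callee() as usual.
    void __stdcall callee(int x) {
        helper();
    }

In practice this works because a single compiler knows both conventions and generates matching unwind data for each; the standard itself stays silent on the mechanism.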

Can C++ binary code become portable through a native C interface? What are the limitations?

I often use the technique of wrapping my high-performance C++ classes with a thin C layer that I compile into shared libraries, and then loading them in other programming languages, such as Python (a minimal sketch is below).
From my reading here and there, I understand that the only requirement for this to work is that the function interfaces use only native types or structs of those types (so int, long, float, double, etc., and pointers to them of any rank).
My question is: assuming full ABI compatibility between the various compilers, is this the only requirement I have to fulfill to get full API compatibility with a shared library?
Why can't C++ libraries be ported? Here's my understanding:
Case 1: Consider the type std::string. Internally it contains a null-terminated char* string and a size integer. The C++ standard doesn't say which of these should come first (right?). Meaning that if I put std::string on a function interface, two different compilers may lay them out in a different order, which will not work.
Case 2: Consider inheritance and vtables for a class with virtual methods. The C++ standard doesn't require any specific position/order for where vtable pointers have to go (right?). They could be at the beginning of the class before any other variable, and they could also be at the end, after all other member variables. So again, interfacing this class on a function will not be consistent.
An additional question following my first one: Doesn't this very problem happen also inside function calls? Or is it that nothing matters after it's compiled to binary, and types have no meaning anymore? Wouldn't RTTI elements cause problems, for example, if I put them in a C wrapper interface?
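For concreteness, here is the kind of wrapper I mean (Matrix and the mat_* names are just illustrative; error handling and exception catching at the boundary are omitted for brevity):

    #include <cstddef>

    // C++ implementation, hidden inside the shared library.
    class Matrix {
    public:
        explicit Matrix(std::size_t n) : n_(n) {}
        double trace() const { return static_cast<double>(n_); }  // placeholder
    private:
        std::size_t n_;
    };

    // C interface: only opaque pointers and native types cross the boundary.
    extern "C" {
        void*  mat_create(std::size_t n) { return new Matrix(n); }
        double mat_trace(void* m)   { return static_cast<Matrix*>(m)->trace(); }
        void   mat_destroy(void* m) { delete static_cast<Matrix*>(m); }
    }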
The reason why there is no C++ ABI is partly that there is no C ABI. As stated by Bjarne Stroustrup (source):
The technical hardest problem is probably the lack of a C++ binary interface (ABI). There is no C ABI either, but on most (all?) Unix platforms there is a dominant compiler and other compilers have had to conform to its calling conventions and structure layout rules - or become unused. In C++ there are more things that can vary - such as the layout of the virtual function table - and no vendor has created a C++ ABI by fiat by eliminating all competitors that did not conform. In the same way as it used to be impossible to link code from two different PC C compilers together, it is generally impossible to link the code from two different Unix C++ compilers together (unless there are compatibility switches).
The lack of an ABI gives more freedom to compiler implementations, and allows the languages to be spread to multiple different types of systems.
On Windows there are some platform-specific dependencies that rely on the way the compiler lays out its output; one example comes from COM, where pure virtual interfaces are required to be laid out in a specific way. So on Windows, most compilers will at least agree on that.
The Windows API uses the stdcall calling convention, so when coding against the Windows API there is a fixed set of rules for how to pass parameters to a function. But again this is system-dependent, and there is nothing preventing you from writing a program that uses a different convention.
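As an illustration of the COM point (a sketch only; ICalculator is a made-up interface, not real COM, which would also involve IUnknown and HRESULTs):

    // A pure virtual interface compiles down to a vtable pointer at offset 0
    // and method slots in declaration order; COM mandates exactly that layout,
    // so any Windows compiler that follows the rule can call the object.
    struct ICalculator {
        virtual int Add(int a, int b) = 0;   // vtable slot 0
        virtual int Sub(int a, int b) = 0;   // vtable slot 1
    };

    struct Calculator : ICalculator {        // built by one compiler...
        int Add(int a, int b) override { return a + b; }
        int Sub(int a, int b) override { return a - b; }
    };

    // ...usable through the interface pointer by code from another compiler.
    int use(ICalculator* c) { return c->Add(2, 3); }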

ISO_C_BINDING between different Fortran and C vendors

Is the concept of the Fortran ISO_C_BINDING module also supported by C/C++ compiler vendors? For example, the size of a C/C++ int can vary between compilers from different vendors. So, with the ISO_C_BINDING module, we know that a Fortran C_INT type is 4 bytes, rather than merely having a kind of 4. But we still don't know the size of an int in C/C++ in general. Am I correct? Is there perhaps a standard C/C++ ISO_C_BINDING-compatible compiler switch?
As far as I know, the standard only demands matching types within the same toolchain, so you are better off using the C compiler from the same vendor. The standard doesn't claim anything about the sizes of the C_ kinds, I think.
Edit: I just looked it up in the standard; it always talks about the companion C compiler.
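For illustration, here is what such a binding typically looks like (a sketch; the function name and interface are made up). ISO_C_BINDING guarantees only that C_INT matches the int of the companion C compiler, whatever its size happens to be there:

    // C side, compiled by the companion C compiler of the Fortran toolchain.
    extern "C" int accumulate(const int* data, int n) {
        int s = 0;
        for (int i = 0; i < n; ++i) s += data[i];
        return s;
    }

    // Matching Fortran interface (shown as a comment for reference):
    //   interface
    //     function accumulate(data, n) bind(C, name="accumulate") result(s)
    //       use iso_c_binding, only: c_int
    //       integer(c_int), intent(in) :: data(*)
    //       integer(c_int), value      :: n
    //       integer(c_int)             :: s
    //     end function
    //   end interface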
Most operating systems expose a C API, which obviously implies the existence of a standard C ABI on that platform. Normally, C compilers use this ABI, but there may be some peculiarities (e.g., the standard calling convention for the 32-bit Windows API is stdcall, which doesn't support variadic functions, so there's a second major calling convention called cdecl; see the sketch below).
The situation for C++ isn't as clear-cut: most operating systems do not expose a C++ API (there are exceptions like BeOS/Haiku), so compiler vendors were free to do whatever they thought best, leading to incompatibilities between compilers from different vendors and sometimes even between different versions of the same compiler. I think at least GCC has stabilized its C++ ABI, but I have no idea about the general situation...
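The stdcall/cdecl sketch referred to above (MSVC keywords, meaningful on 32-bit x86; the function names are made up):

    int __stdcall GetWidgetCount(int group);        // callee pops the arguments
    int __cdecl   LogMessage(const char* fmt, ...); // caller pops the arguments;
                                                    // variadic functions must be
                                                    // cdecl, since only the caller
                                                    // knows how many it pushed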

GCC x64 function calling

As far as I know, there are two possible calling conventions for x64 code: Microsoft x64 and AMD64 (System V).
Now, gcc can be launched with the -mregparm=0 parameter, which doesn't work if we are using the AMD64 calling convention. This happens because the AMD64 convention mandates the use of registers for the first 6 arguments (I'm not really sure why this is done, but I suspect it may be due to stack security issues).
So, here is the question:
Are there strict rules like this (forced register usage) when compiling with gcc under the Microsoft x64 convention? And if yes, how can they be bypassed without breaking ABI compatibility?
I don't know Microsoft Windows (and have never used it), so I probably cannot answer your question about it.
However, the AMD64 Application Binary Interface calling conventions (on Linux and other Unixes) are documented in the AMD64 ABI spec (maybe you should also find and read the equivalent document for the Microsoft calling convention). I believe registers are used for the first 6 arguments for performance reasons (passing values through registers is faster than passing them on the stack), not for security reasons.
And whatever C++ compiler you use, you want it to follow some calling conventions, and these are practically dictated by the system (because you want to be able to call system libraries from your code). So if you break them, you will break ABI compatibility.
But I cannot guess why you are asking such a question. Are you developing a compiler with its own calling conventions? If yes, you should still have some means of calling C libraries, and that requires that, for calls to external C libraries, you follow the ABI conventions governing them. Look at the OCaml compiler for an example.
I don't think you can bypass these without breaking ABI. A function call and how that affects registers etc. is a fundamental part of the platform ABI.
Chances are your program will not work on Windows x64 due to a mismatched function call ABI.
For all the documentation you could want, see this MSDN link
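For quick reference, a sketch of the Microsoft x64 rules those documents spell out (declaration only; the comments carry the content):

    // Microsoft x64: integer/pointer arguments go in rcx, rdx, r8, r9;
    // further arguments go on the stack. The caller also reserves 32 bytes
    // of "shadow space" above the return address, which the callee may use
    // to spill the four register arguments.
    long h(long a, long b, long c, long d, long e);
    // a->rcx, b->rdx, c->r8, d->r9, e->stack

Any code that ignores these rules can no longer call, or be called by, code that follows them, which is why they cannot be bypassed without breaking ABI compatibility.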

What could C/C++ "lose" if they defined a standard ABI?

The title says it all. I am talking about C and C++ specifically, because both consider this an "implementation issue". I think defining a standard interface could ease building a module system on top of it, and enable many other good things.
What could C/C++ "lose" if they defined a standard ABI?
The freedom to implement things in the most natural way on each processor.
I imagine that C in particular has conforming implementations on more different architectures than any other language. Abiding by an ABI optimized for the currently common, high-end, general-purpose CPUs would require unnatural contortions on some of the odder machines out there.
Backwards compatibility on every platform except for the one whose ABI was chosen.
Basically, everyone missed that one of the C++14 proposals actually DID define a standard ABI. It was a standard ABI specifically for libraries that used a subset of C++. You define specific sections of "ABI" code (like a namespace) and it's required to conform to the subset.
Not only that, it was written by THE Herb Sutter, C++ expert and author of the "Exceptional C++" book series.
The proposal goes into many reasons why a portable ABI is difficult, as well as novel solutions.
https://isocpp.org/blog/2014/05/n4028
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4028.pdf
Note that he defines a "target platform" to be a combination of CPU architecture (x64, x86, ARM, etc), OS, and bitness (32/64).
So the goal here is actually having C++ code (Visual Studio) be able to talk to other C++ code (GCC, older Visual Studio, etc.) on the same platform. The goal is not a universal ABI that lets cellphone libraries run on your Windows machine.
This proposal was NOT ratified in C++14; however, it was moved into the "Evolution" phase of C++17 for further discussion/iteration.
https://www.ibm.com/developerworks/community/blogs/5894415f-be62-4bc0-81c5-3956e82276f3/entry/c_14_is_ratified_the_view_from_the_june_2014_c_standard_meeting?lang=en
So as of January 2017, my fingers remain crossed.
Rather than a generic ABI for all platforms (which would be disastrous, as it would be optimal for only one platform), the standards committee could say that each platform must conform to a specific ABI.
But who defines it? The first compiler through the door? In that case its vendor gets an excessive competitive advantage. Or a committee, after 5 years of competing compilers? That would be another horrible idea.
Also, it would not give compilers leeway to research new optimization strategies; you would be stuck with the tricks available at the point when the standard was defined.
The C (or C++) language specifications define the source language. They don't care about the processor running it (A C program could even be interpreted by a human slave, but that would be unethical and not cost-effective).
The ABI is by definition something about the target system. It is related to the processor and the system (and the existing libraries following the ABI).
In the past, it did happen that some processors had proprietary (i.e. undisclosed) specifications (even their machine instruction set was not public), and they had a non-public ABI which was followed by a compiler (respecting, more or less, the language standard).
Defining a programming language doesn't require the same skill set as defining an ABI.
You could even define a new ABI for an existing processor, but that requires a lot of work (patching the compiler, recompiling everything, including the C and C++ standard libraries and all the utilities and libraries you need), so it is generally not worth doing.
Execution speed would suffer drastically on a majority of platforms. So much so that it would likely no longer be reasonable to use the C language for a number of embedded platforms. The standards body could be liable for an antitrust suit brought by the makers of the various chips not compatible with the ABI.
Well, there wouldn't be one standard ABI, but about 1000. You would need one for every combination of OS and processor architecture.
Initially, nothing would be lost. But eventually, somebody would find some horrible bug and they would either fix it, breaking the ABI, or leave it, causing problems.
I think that the situation right now is fine. Any OS is free to define an ABI for itself (and they do), which makes sense. It should be the job of the OS to define its ABI, not the C/C++ standard.
C has always had a de facto standard ABI on each platform, and it is the one other ABIs build on (I mean, the C ABI is the ABI of choice when different languages or systems have to bind to each other): the C ABI is a kind of common denominator among ABIs. C++ is more complex, even though it extends and builds on C, so a standard ABI for C++ is more challenging and may constrain the freedom a C++ compiler has in implementing the target machine code. However, C++ does seem to have a de facto standard ABI on many platforms; see the Itanium C++ ABI.
So the question may not be so much "what could they lose?" but rather "what do they lose?" (if they really lose anything at all).
Side note: keep in mind that ABIs are always architecture- and OS-dependent. So if "standard ABI" means "standard across architectures and platforms", then there has never been and may never be such a thing, except for communication protocols.
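To illustrate the kind of thing the Itanium C++ ABI standardizes, consider name mangling (the symbol name below is what Itanium-ABI compilers such as GCC and Clang produce; add and add_c are made-up functions):

    // Under the Itanium C++ ABI this is encoded as the symbol _Z3addii,
    // so object files from GCC and Clang agree and can link together.
    int add(int a, int b) { return a + b; }

    // extern "C" suppresses C++ mangling and exposes a plain C symbol,
    // which is why the C ABI works as the common denominator.
    extern "C" int add_c(int a, int b) { return a + b; }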