Gcc x64 function calling - c++

As far as I know, there are two possible calling conventions for the x64 code - Microsoft x64 and AMD64.
Now, gcc can be launched with the -mregparm=0 parameter, which doesn't work when the AMD64 calling convention is in use. This happens because the AMD64 convention mandates the use of registers for the first 6 integer arguments (I'm not really sure why this is done, but I suspect it's due to possible stack security issues).
So, here is the question:
Are there some strict rules like this (forced register usage) when compiling with gcc under the Microsoft x64 convention? And, if yes, how can they be bypassed without breaking ABI compatibility?

I don't know Microsoft Windows (and never used it), so I probably cannot answer your question about it.
However, the AMD64 Application Binary Interface calling conventions (on Linux and other Unixes) are documented in the AMD64 ABI spec (you should probably also find and read the equivalent document for the Microsoft calling convention). I believe they use registers for the first six integer arguments for performance reasons (passing values through registers is faster than passing them on the stack), not for security reasons.
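To make the register-versus-stack split concrete, here is a small sketch. The function name `sum8` is made up for illustration; the register assignments in the comments are the ones the System V AMD64 and Microsoft x64 conventions specify for integer arguments, and the compiler applies them without any annotation in the source.

```cpp
#include <cassert>

// Hypothetical function taking eight integer arguments.
// System V AMD64 (Linux, the BSDs, macOS): a..f arrive in rdi, rsi, rdx,
//   rcx, r8, r9; g and h are passed on the stack.
// Microsoft x64 (Windows): a..d arrive in rcx, rdx, r8, r9; e..h are
//   passed on the stack.
extern "C" long long sum8(long long a, long long b, long long c, long long d,
                          long long e, long long f, long long g, long long h) {
    return a + b + c + d + e + f + g + h;
}

// The call site looks the same under either convention; the compiler
// decides where each argument goes.
inline void sum8_usage() {
    assert(sum8(1, 2, 3, 4, 5, 6, 7, 8) == 36);
}
```

Compiling this with `-S` and reading the generated assembly for the call is probably the quickest way to see which convention your toolchain targets.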
And whatever C++ compiler you use, you want it to follow some calling conventions, and these are practically dictated by the system (because you want to be able to call system libraries from your code). So if you break them, you will break the ABI compatibility.
But I cannot guess why you are asking such a question. Are you developing a compiler with its own calling conventions? If so, you should still have some means to call C libraries, and this requires that calls to external C libraries follow the ABI conventions governing them. Look into the OCaml compiler for an example.

I don't think you can bypass these without breaking ABI. A function call and how that affects registers etc. is a fundamental part of the platform ABI.
Chances are your program will not work on Windows x64 due to a mismatched function call ABI.
For all the documentation you could want, see this MSDN link

Related

What makes the calling convention different?

From my knowledge, the calling convention depends on whether the platform is Windows or Linux.
I want to know which of these is true:
1. Compilers make the calling convention different.
2. Platforms make the calling convention different.
If only 2 is true, is the calling convention defined by the platform, and do compilers just follow the defined convention?
Platforms generally define one or more "standard" calling conventions. Compilers need to follow those conventions if they want to interoperate with other tools or components on the platform using those conventions, but can use their own different calling conventions internally.
The only real requirement is that any caller and callee need to agree on the conventions for the call between them.
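As a sketch of "caller and callee just have to agree": GCC and Clang on x86-64 let you pick the convention per function with the `ms_abi` and `sysv_abi` attributes. The macro and function names below are made up for this example, and on other compilers or architectures the macros expand to nothing, so the code still builds but demonstrates nothing special.

```cpp
#include <cassert>

// GCC and Clang on x86-64 accept these attributes; elsewhere the macros
// expand to nothing so the sketch still compiles.
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#  define EXAMPLE_MS_ABI   __attribute__((ms_abi))
#  define EXAMPLE_SYSV_ABI __attribute__((sysv_abi))
#else
#  define EXAMPLE_MS_ABI
#  define EXAMPLE_SYSV_ABI
#endif

// Same source-level signature, two different register assignments.
EXAMPLE_MS_ABI   int add_ms(int a, int b)   { return a + b; }
EXAMPLE_SYSV_ABI int add_sysv(int a, int b) { return a + b; }

// A function pointer type can carry the attribute too, so a call through
// the pointer uses the matching convention.
typedef int (EXAMPLE_MS_ABI *MsAbiFn)(int, int);

inline void mixed_call_usage() {
    MsAbiFn fp = add_ms;
    assert(fp(2, 3) == 5);        // called using the Microsoft convention
    assert(add_sysv(2, 3) == 5);  // called using the System V convention
}
```

Because the declaration tells the compiler which convention each function uses, it emits the right argument setup at every call site. The mismatch problems described above come from calling a function through a declaration that lies about its convention.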
Related: Why does Windows64 use a different calling convention from all other OSes on x86-64? talks about who defined the calling convention.
In that case, GCC developers effectively decided on behalf of the whole platform for the x86-64 System V ABI.
Obviously compiler devs are the most likely people to be able to design a good one, but once it's set, other people making new compilers have to follow it if they want to be compatible.
All non-Windows OSes chose to follow the same x86-64 System V calling convention because it was pretty well designed, so the set of platforms it covers grew to include all non-Windows OSes, partly because they all use GCC and GCC-compatible compilers. It's not like developers of different compilers got together to agree on a calling convention they'd all follow for that platform; there was only one major free non-Windows C compiler at that time (early 2000s).

Can an x86 executable run on any x86 platform given the right runtime libraries?

While I did find similar-ish questions, they did not really answer this specific question.
Can a compiled x86 executable run on any x86 platform given the right runtime libraries?
Say I make a C++17 program without dependencies, could I run this program on Windows 95 or is there some sort of support required by the OS?
I also heard that RTTI (in the case of C++) may not be supported everywhere, is this only due to the processor having to support this feature or does the OS play a role in that? This would imply that new features would maybe not be supported by, e.g., Windows 95.
Edit
What I'm after is whether an executable (e.g., x86) can run on any platform supporting that instruction set or whether certain features, like RTTI, need specific OS support and thus are not available on all platforms supporting that instruction set.
In general you cannot, even if you restricted your universe to x86 hardware - at least not without some conversion of the binary or some platform-specific "loader" for each target platform.
For example, a typical binary emitted by a C or C++ compiler (see footnote 1) will have some minimal dependency on the OS and runtime, for example to load and do runtime linking on the executable. Different platforms have different binary formats (such as PE/COFF on Windows or ELF across various UNIX flavors and Linux) and there isn't any common "x86 format" that would work directly on any platform.
Furthermore, any non-trivial program, and in many cases any program, trivial or not, is going to have platform-specific dependencies on the language runtime. For example, even an empty main() function often requires runtime support to get from the OS-defined "start" method to the main method, and without unusual build options there are often calls at startup to initialize parts of the standard library.
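One way to see the startup work done before main() is a global object with a constructor: something has to run that constructor before main() is entered, and that something is the platform-specific runtime startup code. This is only a sketch, and the names `StartupProbe`, `g_probe` and `runtime_initialized_globals` are made up for it.

```cpp
#include <cassert>
#include <string>

// A global with a non-trivial constructor: the language runtime has to run
// this constructor before main() starts.
struct StartupProbe {
    bool constructed;
    std::string message;
    StartupProbe() : constructed(true), message("initialized before main") {}
};

StartupProbe g_probe;  // hypothetical name, used only for this sketch

// Returns true if the runtime ran the global's constructor.
bool runtime_initialized_globals() {
    return g_probe.constructed && !g_probe.message.empty();
}

// Intended to be called from main(): by then the constructor has run.
inline void startup_usage() {
    assert(runtime_initialized_globals());
}
```

How the runtime finds and runs such constructors differs between executable formats and platforms, which is part of why the same x86 code cannot simply be dropped onto another OS.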
Finally, as you alluded to with your comment about RTTI, various language or platform features may essentially be compiled into the binary and require OS support. RTTI probably doesn't obviously fall into this category, but things like position-independent code, thread-local storage and stack-unwinding support for exception handling often do. The compiled x86 code that uses such features may be quite different on different platforms since it needs to build in assumptions of how those work.
In principle, however, you could imagine this working, at least for some limited subset of programs. For example, while the various executable formats are in practice incompatible, they aren't that different and tools exist to convert between them. So you could certainly implement a minimal runtime on your platform of interest that takes an x86 executable compiled to whatever fixed format you choose and converts at runtime to the local format and runs it.
Beyond that, actually trying to map even standard library calls would be quite difficult since different operating systems use different calling conventions, but it could be possible for "C" functions using some thunks to put things in the right place. C++ is pretty much right out because the ABI there is much more complex and compiler-and-platform specific, and much of the implementation detail is already compiled in for stuff implemented in headers.
In fact, the idea that (a subset of) x86 might provide an interesting intermediate language for cross-platform execution is exactly the idea exploited in Google's NaCl project. Essentially, the NaCl runtime provides platform-agnostic "loading" capabilities which allow x86 code to run more-or-less natively on various platforms. Subsequently other native formats such as ARM were added, but it started as an x86 sandbox. A large part of the project deals with running code that is provably safe (i.e., sandboxed), but it shows that with some infrastructure you can write "portable" x86. A standard C or C++ compiler isn't going to emit NaCl-compatible code directly, however.
1 Really, any compiler that compiles to a native format. I just call out C and C++ since they seem like the ones you are interested in and are widely familiar.
This question misses the point. C++ is, first and foremost, a language to describe the behaviour of a computer program.
Using a compiler to create a native binary executable file to produce that behaviour on an actual computer is the typical way of using the language.
Once you have the binary file, all traces of the source code used to produce it are gone (unless you have built a special version for debugging purposes). The compatibility of the binary file with specific hardware or operating systems is beyond the scope of C++ itself.
The same is true for C, or any other programming language which typically gets compiled to native binary code.
Or, to answer the question more briefly:
Can compiled C++/C code (i.e. an executable) run anywhere given the right runtime libraries?
No.
Can a compiled x86 executable run anywhere given the right runtime libraries?
No, it will only work on x86 hardware, or other hardware (or software, such as a virtual machine) that emulates the x86 instruction set (such as an x64 CPU). In practice, that's very likely to be a far cry from "anywhere."
And even if the hardware matches, an x86 executable will have operating system dependencies. A Windows binary won't run on Linux, even if the hardware is the same. There are various strategies that can make things like this "work" in some situations; Microsoft's Windows Subsystem for Linux is one recent example, which allows Linux binaries to run unchanged on Windows. Again, a far cry from "anywhere."

GCC vs MS C++ compiler for maintaining API backwards binary compatibility

I came from the Linux world and know a lot of articles about maintaining backwards binary compatibility (BC) of a dynamic library API written in C++ language. One of them is "Policies/Binary Compatibility Issues With C++" based on the Itanium C++ ABI, which is used by the GCC compiler. But I can't find anything similar for the Microsoft C++ compiler (from MSVC).
I understand that most of the techniques are applicable to the MS C++ compiler and I would like to discover compiler-specific issues related to ABI differences (v-table layout, mangling, etc.)
So, my questions are the following:
Do you know any differences between MS C++ and GCC compilers when maintaining BC?
Where can I find information about MS C++ ABI or about maintaining BC of API in Windows?
Any related information will be highly appreciated.
Thanks a lot for your help!
First of all, these policies are general and do not apply to GCC only. For example, the rule about changing the private/public access of a function matters for MSVC, which encodes the access specifier in the mangled name, and not for GCC.
So basically these rules are fully applicable to MSVC and to compilers in general.
But...
You should remember:
GCC's C++ ABI has been stable since the 3.4 release, that is, for about 7 years (since 2004), while MSVC breaks its ABI with every major release: MSVC8 (2005), MSVC9 (2008), and MSVC10 (2010) are not compatible with each other.
Some frequently used MSVC flags can break the ABI as well (for example, the exception-handling model).
MSVC has incompatible runtimes for Debug and Release modes.
So yes, you can use these rules, but as usual MSVC has many more quirks.
See also "Some thoughts on binary compatibility"; Qt keeps its ABI stable with MSVC as well.
Note: I have some experience with this, as I follow these rules in CppCMS.
On Windows, you basically have 2 options for long term binary compatibility:
COM
mimicking COM
Check out my post here. There you'll see a way to create DLLs and access DLLs in a binary compatible way across different compilers and compiler versions.
C++ DLL plugin interface
The best rule for MSVC binary compatibility is use a C interface. The only C++ feature you can get away with, in my experience, is single-inheritance interfaces. So represent everything as interfaces which use C datatypes.
Here's a list of things which are not binary compatible:
The STL. The binary format changes even just between debug/release, and depending on compiler flags, so you're best off not using STL cross-module.
Heaps. Do not new / malloc in one module and delete / free in another. There are different heaps which do not know about each other. Another reason the STL won't work cross-modules.
Exceptions. Don't let exceptions propagate from one module to another.
RTTI/dynamic_casting datatypes from other modules.
Don't trust any other C++ features.
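For the exceptions item in the list above, the usual pattern is to catch everything at the module boundary and report failures as plain integers. A minimal sketch follows; `example_parse`, `parse_positive` and the error code values are invented for illustration and are not taken from any real library.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical internal C++ routine that may throw.
static int parse_positive(int value) {
    if (value <= 0) throw std::invalid_argument("value must be positive");
    return value * 2;
}

// Exported boundary function: C linkage, C data types, never throws.
// Returns 0 on success and a non-zero error code otherwise.
extern "C" int example_parse(int value, int* out) noexcept {
    if (out == nullptr) return 1;
    try {
        *out = parse_positive(value);
        return 0;
    } catch (const std::exception&) {
        return 2;  // known failure, reported as a plain integer
    } catch (...) {
        return 3;  // anything else is also stopped at the boundary
    }
}

inline void example_parse_usage() {
    int result = 0;
    assert(example_parse(21, &result) == 0 && result == 42);
    assert(example_parse(-1, &result) == 2);  // the exception became a code
}
```

The caller only ever sees an `int` and a pointer, so it does not matter which exception model or runtime the other module was built with.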
In short, C++ has no consistent ABI, but C does, so avoid C++ features crossing modules. Because single inheritance is a simple v-table, you can usefully use it to expose C++ objects, providing they use C datatypes and don't make cross-heap allocations. This is the approach used by Microsoft themselves as well, e.g. for the Direct3D API. GCC may be useful in providing a stable ABI, but the standard does not require this, and MSVC takes advantage of this flexibility.
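A rough sketch of the single-inheritance interface approach described above. `ICounter`, `Counter` and `create_counter` are made-up names, and a real Windows DLL would also need export declarations and an explicit calling convention, which are left out here to keep the sketch portable.

```cpp
#include <cassert>
#include <cstdint>

// Interface: pure virtual functions only, single inheritance, C data types.
struct ICounter {
    virtual std::int32_t increment() = 0;
    virtual void release() = 0;  // destruction happens inside the module
protected:
    ~ICounter() = default;       // callers cannot delete across the boundary
};

namespace {
// Implementation class, invisible to users of the interface.
class Counter final : public ICounter {
    std::int32_t value_ = 0;
public:
    std::int32_t increment() override { return ++value_; }
    void release() override { delete this; }  // freed by the allocating heap
};
}  // namespace

// Factory with C linkage so its symbol name is not mangled.
extern "C" ICounter* create_counter() { return new Counter(); }

inline void counter_usage() {
    ICounter* c = create_counter();
    assert(c->increment() == 1);
    assert(c->increment() == 2);
    c->release();  // never `delete c` from the calling module
}
```

The `release()` member is what keeps allocation and deallocation in the same module, which addresses the heap item in the list above.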

What could C/C++ "lose" if they defined a standard ABI?

The title says everything. I am talking about C/C++ specifically, because both consider this as "implementation issue". I think, defining a standard interface can ease building a module system on top of it, and many other good things.
What could C/C++ "lose" if they defined a standard ABI?
The freedom to implement things in the most natural way on each processor.
I imagine that C in particular has conforming implementations on more different architectures than any other language. Abiding by an ABI optimized for the currently common, high-end, general-purpose CPUs would require unnatural contortions on some of the odder machines out there.
Backwards compatibility on every platform except for the one whose ABI was chosen.
Basically, everyone missed that one of the C++14 proposals actually DID define a standard ABI. It was a standard ABI specifically for libraries that used a subset of C++. You define specific sections of "ABI" code (like a namespace) and it's required to conform to the subset.
Not only that, it was written by THE Herb Sutter, C++ expert and author of the "Exceptional C++" book series.
The proposal goes into many reasons why a portable ABI is difficult, as well as novel solutions.
https://isocpp.org/blog/2014/05/n4028
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4028.pdf
Note that he defines a "target platform" to be a combination of CPU architecture (x64, x86, ARM, etc), OS, and bitness (32/64).
So the goal here is actually to let C++ code (Visual Studio) talk to other C++ code (GCC, older Visual Studio, etc.) on the same platform. The goal is not a universal ABI that lets cellphone libraries run on your Windows machine.
This proposal was NOT ratified in C++14, however, it was moved into the "Evolution" phase of C++17 for further discussion/iteration.
https://www.ibm.com/developerworks/community/blogs/5894415f-be62-4bc0-81c5-3956e82276f3/entry/c_14_is_ratified_the_view_from_the_june_2014_c_standard_meeting?lang=en
So as of January 2017, my fingers remain crossed.
Rather than a generic ABI for all platforms (which would be disastrous, as it would be optimal for only one platform), the standards committee could say that each platform will conform to a specific ABI.
But who defines it? The first compiler through the door, in which case it gets an excessive competitive advantage? Or a committee after 5 years of compilers (which would be another horrible idea)?
Also, it does not give compilers leeway to research new optimization strategies; you would be stuck with the tricks available at the point where the standard was defined.
The C (or C++) language specifications define the source language. They don't care about the processor running it (A C program could even be interpreted by a human slave, but that would be unethical and not cost-effective).
The ABI is by definition something about the target system. It is related to the processor and the system (and the existing libraries following the ABI).
In the past, it did happen that some processors had proprietary (i.e. undisclosed) specifications (even their machine instruction set was not public), and they had a non-public ABI which was followed by a compiler (respecting more or less the language standard).
Defining a programming language doesn't require the same skill set as defining an ABI.
You could even define a newer ABI for an existing processor, but that requires a lot of work (patching the compiler, recompiling everything, including the C & C++ standard libraries and all the utilities and libraries that you need), so it is generally useless.
Execution speed would suffer drastically on a majority of platforms. So much so that it would likely no longer be reasonable to use the C language for a number of embedded platforms. The standards body could be liable for an antitrust suit brought by the makers of the various chips not compatible with the ABI.
Well, there wouldn't be one standard ABI, but about 1000. You would need one for every combination of OS and processor architecture.
Initially, nothing would be lost. But eventually, somebody would find some horrible bug and they would either fix it, breaking the ABI, or leave it, causing problems.
I think that the situation right now is fine. Any OS is free to define an ABI for itself (and they do), which makes sense. It should be the job of the OS to define its ABI, not the C/C++ standard.
C has always had a standard ABI, and it is even the one most other standard ABIs build on (I mean, the C ABI is the ABI of choice when different languages or systems have to bind to each other). The C ABI is a kind of common ABI among other ABIs. C++ is more complex, although it extends and is thus based on C; indeed, a standard ABI for C++ is more challenging and may restrict the freedom a C++ compiler has in how it generates target machine code. However, C++ does seem to have a standard ABI; see the Itanium C++ ABI.
So the question may not be so much “what could they lose?”, but rather “what do they lose?” (if they really lose anything at all).
Side note: keep in mind that ABIs are always architecture and OS dependent. So if what was meant by “Standard ABI” is “standard across architectures and platforms”, then there may never have been, and may never be, such a thing, apart from communication protocols.
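As an illustration of the C ABI serving as the common ground, here is a sketch of a C++ object exposed through an opaque handle and plain C functions, which is the shape most language bindings expect. The `greeter` type and its functions are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// What a C header would expose: an opaque type and plain functions.
extern "C" {
struct greeter;  // hypothetical opaque handle; callers never see its layout
greeter* greeter_create(const char* name);
std::size_t greeter_name_length(const greeter* g);
void greeter_destroy(greeter* g);
}

// C++ implementation, free to change without affecting callers.
struct greeter {
    std::string name;
};

extern "C" greeter* greeter_create(const char* name) {
    try {
        return new greeter{std::string(name ? name : "")};
    } catch (...) {
        return nullptr;  // no exception escapes through the C interface
    }
}

extern "C" std::size_t greeter_name_length(const greeter* g) {
    return g ? g->name.size() : 0;
}

extern "C" void greeter_destroy(greeter* g) {
    delete g;  // allocated and freed by the same module
}

inline void greeter_usage() {
    greeter* g = greeter_create("abc");
    assert(g != nullptr && greeter_name_length(g) == 3);
    greeter_destroy(g);
}
```

Nothing about `std::string` leaks into the interface, so the caller could be C, another C++ compiler, or a different language's foreign function interface.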

Why is Application Binary Interface important for programming

I don't understand why the ABI is important in the context of developing user-space applications. Is the set of system calls for an operating system considered an ABI? But if so, then aren't all the complexities regarding system calls encapsulated within standard libraries?
So then is ABI compatibility only relevant for running statically linked applications on different platforms, since the system calls would be embedded into the binary?
An ABI defines a set of alignment, calling convention, and data types that are common to a system. This makes an ABI awfully important if you're doing any sort of dynamic linking; as without it code from one application has no way of calling code provided by another.
So, no. ABI compatibility is relevant for all dynamic linking (less so for static).
It's worth emphasizing again that a system's ABI affects inter-application work as well as application-to-operating-system work.
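For the "data types" part of that definition, a small sketch: the size of `long` is decided by the platform ABI (8 bytes under LP64 on 64-bit Linux, 4 bytes under LLP64 on 64-bit Windows), while the fixed-width types are not. `WireHeader` and the helper functions are made-up names for this example.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Fixed-width types have the same size under every ABI that provides them.
static_assert(sizeof(std::int32_t) * CHAR_BIT == 32, "int32_t is 32 bits");
static_assert(sizeof(std::int64_t) * CHAR_BIT == 64, "int64_t is 64 bits");

// 'long' is ABI-dependent: 8 bytes under LP64 (64-bit Linux, macOS, BSD),
// 4 bytes under LLP64 (64-bit Windows) and under typical 32-bit ABIs.
constexpr bool long_is_64_bit() { return sizeof(long) * CHAR_BIT == 64; }

// A record meant to cross module boundaries, written with fixed-width fields
// ordered so that common ABIs insert no padding.
struct WireHeader {
    std::uint32_t length;
    std::uint32_t kind;
    std::uint64_t id;
};
static_assert(sizeof(WireHeader) == 16, "two 4-byte fields plus one 8-byte field");

inline void data_types_usage() {
    // Whichever answer this platform gives, it is fixed by the ABI.
    assert(long_is_64_bit() == (sizeof(long) == sizeof(std::int64_t)));
}
```

Two modules that disagree about `sizeof(long)` cannot safely share a struct containing one, which is exactly the kind of agreement an ABI exists to provide.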
The ABI is more than what system calls are available. It also usually describes the actual way arguments are passed to functions and how structures and objects are laid-out in memory. Without a consistent ABI, code built by different compilers might not be able to call each other -- if you call foo(a,b) and one compiler pushes a and b on the stack while another passes those in registers, you've got an ABI clash.
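The layout half of this can be inspected directly. The sketch below uses a made-up `Sample` struct and only checks properties that hold everywhere; the concrete numbers in the comment are what common x86-64 ABIs typically produce, not something the C++ standard promises.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical struct; its layout is decided by the platform ABI.
// On common x86-64 ABIs: tag at offset 0, count at 4, weight at 8, size 16.
struct Sample {
    char tag;
    int count;
    double weight;
};

// Offsets and padding as seen by this compiler on this platform.
inline std::size_t count_offset()  { return offsetof(Sample, count); }
inline std::size_t weight_offset() { return offsetof(Sample, weight); }
inline std::size_t padding_bytes() {
    return sizeof(Sample) - (sizeof(char) + sizeof(int) + sizeof(double));
}

inline void layout_usage() {
    // Each member sits at a multiple of its own alignment.
    assert(count_offset() % alignof(int) == 0);
    assert(weight_offset() % alignof(double) == 0);
    // The total size is a multiple of the struct's alignment.
    assert(sizeof(Sample) % alignof(Sample) == 0);
}
```

Code from two compilers can pass a `Sample` by pointer only if both arrive at the same offsets, and the ABI document is what makes them do so.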
"ABI" (see Wikipedia) is an umbrella term for all the assumptions an operating system makes about data formats. This includes the layout of executable files and that of any data structure in memory given its C definition.
The term also generally covers formatting requirements between programs written in the same language. Each language has particular features that might result in different conventions within executable formats and memory structures, but all must ultimately generate executables compatible with the OS and data structures compatible with the processor's instruction set.
ABI doesn't matter much if you only care about compiling standard-conformant code. It matters a little when you violate the standard and do unportable things like casting a char * to a long *. It's yet more important when writing a large body of assembly code. Writing something like a linker or a debugger, it can come to embody the bulk of work to be done.
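To illustrate the `char *` to `long *` point: the cast quietly relies on the platform's alignment rules, whereas copying the bytes does not. A minimal sketch, with `load_long` and `store_long` as made-up helper names:

```cpp
#include <cassert>
#include <cstring>

// Unportable: *reinterpret_cast<const long*>(bytes) assumes the buffer is
// suitably aligned for long and breaks the aliasing rules.
// Portable alternative: copy the bytes into a real long object.
inline long load_long(const char* bytes) {
    long value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}

inline void store_long(char* bytes, long value) {
    std::memcpy(bytes, &value, sizeof value);
}

inline void unaligned_usage() {
    char buffer[sizeof(long) + 1] = {};
    store_long(buffer + 1, 123456L);  // deliberately misaligned address
    assert(load_long(buffer + 1) == 123456L);
}
```

The number of bytes copied and their order still depend on the platform (the size of `long` and its endianness), so this fixes the alignment and aliasing problem, not portability of the stored bytes between machines.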
Incompatible ABIs are why, even though OS X, Linux, Solaris, Windows and the *BSDs all run on Intel x86 CPUs, a simple POSIX-only hello world program that uses no vendor-specific or proprietary system calls or libraries generally cannot run on one OS when compiled for another*.
ABI is not really important to programmers as such because we already instinctively know that you cannot run a Windows app on Macs. Even non-programmers (except Hollywood screenwriters) know this. It is important to compiler writers when they need to target a particular environment.
* note: Some OSes like Linux and BSD support foreign ABIs, so that a simple Linux command-line program can sometimes be executed on BSD without modification. And there are of course emulation layers like Wine.
Don't forget in C++ the way name mangling is implemented forms part of the ABI
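A short sketch of why mangling exists and how `extern "C"` opts out of it. The function names are made up; the mangled names in the comments are what the Itanium C++ ABI (used by GCC and Clang) produces, and MSVC uses a different, incompatible scheme.

```cpp
#include <cassert>

// Two overloads: the linker needs two distinct symbol names, so the compiler
// encodes the parameter types into each name.
// Itanium C++ ABI: _Z5scalei and _Z5scaled.
int scale(int x)       { return x * 2; }
double scale(double x) { return x * 2.0; }

// C linkage: the symbol is the plain identifier, so overloading is not
// possible, but anything that speaks the C ABI can find it by name.
extern "C" int scale_int(int x) { return scale(x); }

inline void mangling_usage() {
    assert(scale(3) == 6);       // resolves to the int overload
    assert(scale(1.5) == 3.0);   // resolves to the double overload
    assert(scale_int(4) == 8);   // unmangled entry point
}
```

Running `nm` on the object file (or `dumpbin /symbols` with MSVC) shows the actual symbol names, which is a quick way to check whether two compilers could link against each other's output.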
Only when you want your binary to run in another environment without recompilation are there some places where you may need to take the ABI into account:
you may call a third-party library in your program, and that library may vary between environments (so the ABI is the only thing you can trust);
the syscalls to the OS (if you statically link the syscall wrappers into your binary instead of dynamically linking to libc).
Actually, most developers need not take the ABI into account; only developers of binary loaders and tools need to know more about it.
System calls also follow an ABI - the syscall interface differs from operating system to operating system.
Statically linking your application and the standard library into it will tie it into one syscall ABI. For instance, FreeBSD allows using the Linux syscall ABI only through an emulation module.