ISO_C_BINDING between different Fortran and C vendors

Is the concept of the Fortran ISO_C_BINDING module also supported by C/C++ compiler vendors? For example, the size of a C/C++ int can vary between the compilers from different vendors. So, with the ISO_C_BINDING module, we know that a Fortran C_INT type is 4 bytes; rather than merely having a kind of 4. But, we still don't know the size of an int in general in C/C++. Am I correct? Is there perhaps a standard C/C++ ISO_C_BINDING-compatible compiler switch?

As far as I know, the standard only demands matching types within the same toolchain. Thus you are better off using the C compiler from the same vendor. I don't think the standard claims anything about the sizes of the C_ kinds.
Edit: I just looked it up in the standard; it always refers to the companion C compiler.
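On the C/C++ side of the question, the width of int can at least be checked at compile time, and fixed-width types can be used on the interface instead. The following is a minimal sketch, not something taken from the Fortran standard; the names int_width_bits and add_i32 are hypothetical.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// The C and C++ standards only guarantee a minimum range for int;
// its exact width is chosen by the implementation (in practice, the platform ABI).
static_assert(sizeof(int) * CHAR_BIT >= 16, "int must be at least 16 bits wide");

// Where std::int32_t exists, it has exactly 32 bits on every compiler.
static_assert(sizeof(std::int32_t) * CHAR_BIT == 32, "int32_t is exactly 32 bits");

// Hypothetical helper: reports the width of int in bits on this toolchain.
inline std::size_t int_width_bits() {
    return sizeof(int) * CHAR_BIT;
}

// Hypothetical interface function: using int32_t instead of int pins the
// parameter size regardless of which compiler builds the caller.
extern "C" std::int32_t add_i32(std::int32_t a, std::int32_t b) {
    return a + b;
}
```

This pins the size only. Calling convention and name decoration still come from the platform and the companion compiler.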

Most operating systems expose a C API, which obviously implies the existence of a standard C ABI on that platform. Normally, C compilers use this ABI, but there may be some peculiarities (e.g., the standard calling convention for the Windows API is stdcall, which doesn't support variadic functions, so there's a second major calling convention called cdecl).
The situation for C++ isn't as clear-cut: most operating systems do not expose a C++ API (there are exceptions like BeOS/Haiku), so compiler vendors were free to do whatever they thought best, leading to incompatibilities between compilers from different vendors and sometimes even between different versions of the same compiler. I think at least GCC has stabilized its C++ ABI, but I have no idea about the general situation...

Related

Can C++ binary code become portable through a native C interface? What are the limitations?

I often wrap my high-performance C++ classes with a thin C layer that I compile to shared libraries, and then load them from other programming languages, such as Python.
From my reading here and there, I understand that the only requirement for this to work is that the function interfaces use only native types or structs of these types (so int, long, float, double, etc., and pointers to them of any rank).
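The wrapping technique described above might look like the following minimal sketch. The class Accumulator and the acc_* functions are hypothetical names chosen for illustration; the pattern is an opaque handle plus extern "C" functions that take only native types.

```cpp
#include <cassert>
#include <new>

// Hypothetical C++ class that we want to expose to other languages.
class Accumulator {
public:
    void add(double x) { total_ += x; }
    double total() const { return total_; }
private:
    double total_ = 0.0;
};

extern "C" {

// Opaque handle: callers only ever see a pointer to an incomplete type.
struct acc_handle;

acc_handle* acc_create() {
    // nothrow new returns nullptr instead of throwing across the C boundary.
    return reinterpret_cast<acc_handle*>(new (std::nothrow) Accumulator());
}

// Returns 0 on success and -1 if the handle is null.
int acc_add(acc_handle* h, double x) {
    if (h == nullptr) return -1;
    reinterpret_cast<Accumulator*>(h)->add(x);
    return 0;
}

double acc_total(const acc_handle* h) {
    assert(h != nullptr);
    return reinterpret_cast<const Accumulator*>(h)->total();
}

// The object is destroyed in the same module that created it.
void acc_destroy(acc_handle* h) {
    delete reinterpret_cast<Accumulator*>(h);
}

}  // extern "C"
```

A caller (for example through Python's ctypes) would call acc_create, then acc_add and acc_total, and finally acc_destroy, without ever seeing the C++ type.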
My question is: Assuming full ABI compatibility between various compilers, is this the only requirement I have to fulfill to have full API compatibility with a shared library?
Why can't C++ libraries be ported? Here's my understanding:
Case 1: Consider the type std::string. Internally it contains a char* null-terminated string, and a size integer. The C++ standard doesn't say which of these should come first (right?). Meaning that if I put std::string on a function interface, two different compilers may have them in different order, which will not work.
Case 2: Consider inheritance and vtables for a class with virtual methods. The C++ standard doesn't require any specific position/order for where vtable pointers have to go (right?). They could be at the beginning of the class before any other variable, and they could also be at the end, after all other member variables. So again, interfacing this class on a function will not be consistent.
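Both cases can be probed with standard type traits. This is only a sketch: the helper safe_for_c_interface is a hypothetical name, and passing the check is a rough necessary condition, not a guarantee of cross-compiler compatibility (member sizes and padding can still differ).

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// A plain aggregate of native types: its layout rules are shared with C.
struct PlainPoint {
    int x;
    int y;
};

// A polymorphic class: where the vtable pointer lives is up to the compiler.
struct Shape {
    virtual ~Shape() {}
    virtual double area() const { return 0.0; }
    double scale;
};

static_assert(std::is_standard_layout<PlainPoint>::value,
              "PlainPoint has a C-compatible layout");
static_assert(std::is_polymorphic<Shape>::value,
              "Shape has virtual functions");
static_assert(!std::is_standard_layout<Shape>::value,
              "a class with virtual functions is never standard-layout");

// Hypothetical helper: a conservative filter for types placed on a C interface.
template <typename T>
constexpr bool safe_for_c_interface() {
    return std::is_standard_layout<T>::value &&
           std::is_trivially_copyable<T>::value;
}
```

With this filter, PlainPoint passes while Shape and std::string are rejected, which matches the intuition behind Case 1 and Case 2.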
An additional question following my first one: Doesn't this very problem happen also inside function calls? Or is it that nothing matters after it's compiled to binary, and types have no meaning anymore? Wouldn't RTTI elements cause problems, for example, if I put them in a C wrapper interface?
The reason why there is no C++ ABI, is partly because there is no C ABI. As stated by Bjarne Stroustrup (source):
The technical hardest problem is probably the lack of a C++ binary interface (ABI). There is no C ABI either, but on most (all?) Unix platforms there is a dominant compiler and other compilers have had to conform to its calling conventions and structure layout rules - or become unused. In C++ there are more things that can vary - such as the layout of the virtual function table - and no vendor has created a C++ ABI by fiat by eliminating all competitors that did not conform. In the same way as it used to be impossible to link code from two different PC C compilers together, it is generally impossible to link the code from two different Unix C++ compilers together (unless there are compatibility switches).
The lack of an ABI gives more freedom to compiler implementations, and allows the languages to be spread to multiple different types of systems.
On Windows there are some platform-specific dependencies that rely on the way the compiler lays out its output. One example comes from COM, where pure virtual interfaces are required to be laid out in a specific way. So on Windows, most compilers will at least agree on that.
The Windows API uses the stdcall calling convention, so when coding against the Windows API, there is a fixed set of rules for how to pass parameters to a function. But again, this is system dependent, and there is nothing preventing you from writing a program that uses a different convention.
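In source code, the calling convention is usually selected with a compiler-specific keyword hidden behind a macro. The sketch below assumes that __stdcall is only meaningful on 32-bit x86 Windows; DEMO_CALL, demo_add and demo_add_fn are hypothetical names.

```cpp
#include <cassert>

// Apply stdcall only where it is meaningful (32-bit x86 Windows);
// on other targets the macro expands to nothing and the platform default is used.
#if defined(_WIN32) && (defined(_M_IX86) || defined(__i386__))
#  define DEMO_CALL __stdcall
#else
#  define DEMO_CALL
#endif

extern "C" {

// Hypothetical exported function with an explicit calling convention.
int DEMO_CALL demo_add(int a, int b) {
    return a + b;
}

// The convention is part of the function pointer type, so a caller that
// declares the pointer differently would pass arguments incorrectly.
typedef int (DEMO_CALL *demo_add_fn)(int, int);

}  // extern "C"
```

Both sides of a module boundary have to agree on the macro's expansion, which is why such declarations normally live in one shared header.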

C++ ABI issues list

I've seen a lot of discussion about how C++ doesn't have a Standard ABI quite in the same way that C does. I'm curious as to what, exactly, the issues are. So far, I've come up with
Name mangling
Exception handling
RTTI
Are there any other ABI issues pertaining to C++?
Off the top of my head:
C++ Specific:
  Where the 'this' parameter can be found.
  How virtual functions are called
    i.e. does it use a vtable or other
    What is the layout of the structures used for implementing this.
  How are multiple definitions handled
    Multiple template instantiations
    Inline functions that were not inlined.
  Static Storage Duration Objects
    How to handle creation (in the global scope)
    How to handle creation of function local (how do you add it to the destructor list)
    How to handle destruction (destroy in reverse order of creation)
  You mention exceptions. But also how exceptions are handled outside main()
    i.e. before or after main()
Generic:
  Parameter passing locations
  Return value location
  Member alignment
  Padding
  Register usage (which registers are preserved, which are scratch)
  Size of primitive types (such as int)
  Format of primitive types (floating point format)
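Several of the generic items in the list above (member alignment, padding, size of primitive types) can be observed directly from C++. The following sketch uses a hypothetical struct Record; the actual numbers depend on the platform ABI, so none are hard-coded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical record mixing members with different alignment requirements.
struct Record {
    char   tag;
    double value;
    int    count;
};

// Guaranteed for standard-layout types: the first member sits at offset 0.
static_assert(offsetof(Record, tag) == 0, "first member is at offset 0");

// The total size is always a multiple of the struct's alignment.
static_assert(sizeof(Record) % alignof(Record) == 0, "size is a multiple of alignment");

// Bytes the compiler inserted between or after the members.
inline std::size_t padding_bytes() {
    return sizeof(Record) - (sizeof(char) + sizeof(double) + sizeof(int));
}

// Prints the layout chosen by this particular compiler and target.
inline void print_layout() {
    std::printf("sizeof=%zu alignof=%zu value@%zu count@%zu padding=%zu\n",
                sizeof(Record), alignof(Record),
                offsetof(Record, value), offsetof(Record, count),
                padding_bytes());
}
```

On a typical x86-64 Linux toolchain this reports a size of 24 with 11 bytes of padding, but a different ABI is free to produce other values, which is exactly the point of the list.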
The big problem, in my experience, is the C++ standard library. Even if you had an ABI that dictates how a class should be laid out, different compilers provide different implementations of standard objects like std::string and std::vector.
I'm not saying that it would not be possible to standardize the internal layout of C++ library objects, only that it has not been done before.
The closest thing we have to a standard C++ ABI is the Itanium C++ ABI:
"this document is written as a generic specification, to be usable by C++ implementations on a variety of architectures. However, it does contain processor-specific material for the Itanium 64-bit ABI, identified as such."
The GCC doc explains support of this ABI for C++:
Starting with GCC 3.2, GCC binary conventions for C++ are based
on a written, vendor-neutral C++ ABI that was designed to be specific
to 64-bit Itanium but also includes generic specifications that apply
to any platform. This C++ ABI is also implemented by other compiler
vendors on some platforms, notably GNU/Linux and BSD systems
As was pointed out by @Lindydancer, you need to use the same C++ standard library/runtime as well.
An ABI standard for any language really needs to come from a given platform that wants to support such a thing. Language standards, especially those for C and C++, really cannot do this for many reasons, but mostly because such a thing would make the language less flexible and less portable, and therefore less used. C really doesn't have a defined ABI, but many platforms define one (directly or indirectly). The reason this isn't happening with C++ is that the language is much bigger and changes are made more often. However, Herb Sutter has a very interesting proposal about how to get more platforms to create standard ABIs and how developers can write code that uses the ABI in a standard way:
https://isocpp.org/blog/2014/05/n4028
He points out how C++ has a standard way to link into a platform C ABI but not a C++ ABI via extern "C". I think this proposal could go a long way to allowing interfaces to be defined in terms of C++ instead of C.
I've seen a lot of discussion about how C++ doesn't have a Standard ABI quite in the same way that C does.
What standard C ABI? Appendix J in the C99 standard is 27 pages long. In addition to undefined behavior (and some implementations give some UB a well-defined behavior), it covers unspecified behavior, implementation-defined behavior, locale-specific behavior, and common extensions.

Implement the C standard library in C++

Say an OS/kernel is written with C++ in mind and does not "do" any pure C style stuff, but instead exposes the C standard library built upon a full-fledged C++ standard library. Is this possible? If not, why?
PS: I know the C library is "part of C++", but let's say it's internally based on a C++-based implementation.
Small update: It seems I've stirred up a discussion as to what is "allowed" by my rules here. Generally speaking: the C standard library implementation should use C++ everywhere that is possible/Right (tm). I mostly think about algorithms and acting on static class objects behind the scenes. I'm not really excluding any language features, but instead trying to put the emphasis on a sane C++ implementation. With regard to the setjmp example, I see no reason why valid C (which would either use other C library parts already implemented in C++ or not use any other library functions at all) would be a violation of my "rules" here. If there is no counterpart in the C++ library, why debate the use of it?
Yes, that is possible. It would be much like one exports a C API from a library written in C++, FORTRAN, assembler or most any other language for that matter.
Actually, C++ has the ability to be faster than C in many ways, due to its ability to support many translation-time constructs like expression templates. For this reason, C++ matrix libraries tend to be much more optimised than C ones: they involve fewer temporaries, unroll loops, etc. With new C++0x features like variadic templates, the printf function, for instance, could be much faster and more typesafe than a version implemented in C. It may even be able to honor the interfaces of many C constructs and evaluate some of their arguments (like string literals) at translation time.
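To illustrate the type-safety point about variadic templates, here is a minimal sketch. It is not a printf implementation: it has no format string, and the names concat and append_all are hypothetical. It only shows that each argument's type is known at translation time, so there is no format specifier to get wrong.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Base case of the recursion: no arguments left to append.
inline void append_all(std::ostringstream&) {}

// Recursive case: stream the first argument, then recurse on the rest.
// The type of every argument is deduced at compile time.
template <typename First, typename... Rest>
void append_all(std::ostringstream& out, const First& first, const Rest&... rest) {
    out << first;
    append_all(out, rest...);
}

// Hypothetical type-safe formatter: concatenates any streamable arguments.
template <typename... Args>
std::string concat(const Args&... args) {
    std::ostringstream out;
    append_all(out, args...);
    return out.str();
}
```

For example, concat(1, " + ", 2.5) yields the string "1 + 2.5", and passing a type that cannot be streamed is a compile error instead of undefined behaviour at run time.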
Unfortunately, many people think c is faster than c++ because many people use OOP to mean that all relations and usage must occur through large inheritance hierarchies, virtual dispatch, etc. That caused some early comparisons to be completely different from what is considered good usage these days. If you were to use virtual dispatch where it is appropriate (e.g. like filesystems in the kernel, where they build vtables through function pointers and often basically build c++ in c), you would have no pessimisation from c, and with all of the new features, can be significantly faster.
Not only is speed a possible improvement, but there are places where the implementation would benefit from better type safety. There are common tricks in c (like storing data in void pointers when it must be generic) that break type safety and where c++ can provide strong error checking. This won't always translate through the interfaces to the c library, since those have fixed typing, but it will definitely be of use to the implementers of the library and could assist in some places where it may be possible to extract more information from calls by providing "as-if" interfaces (for instance, an interface that takes a void* might be implemented as a generic interface with a concept check that the argument is implicitly convertible to void*).
I think this would be a great test of the power of c++ over c.
Given that "pure C stuff" has such a large overlap with C++, I fail to see how you'd avoid it entirely in anything, much less an OS kernel. After all, is the + operation "pure C stuff"? :)
That said, you could certainly implement certain C library functions using classes and whatnot. Implement qsort using std::sort? Sure, no problem. Just don't forget your extern "C".
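As a sketch of that idea, a qsort-compatible function can be built on std::sort by sorting element indices and then permuting the raw bytes. The names demo_qsort and compare_ints are hypothetical, and this version makes assumptions a real libc could not: it allocates temporary storage, and an allocation failure would throw across the extern "C" boundary.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <numeric>
#include <vector>

extern "C" {

// Same parameter list as the C library's qsort.
void demo_qsort(void* base, std::size_t count, std::size_t size,
                int (*compar)(const void*, const void*)) {
    if (base == nullptr || count < 2 || size == 0) return;

    unsigned char* bytes = static_cast<unsigned char*>(base);
    // Keep an untouched copy of the input so the comparator sees stable data.
    const std::vector<unsigned char> copy(bytes, bytes + count * size);

    // Sort indices instead of elements, since element size is only known at run time.
    std::vector<std::size_t> order(count);
    std::iota(order.begin(), order.end(), std::size_t(0));
    std::sort(order.begin(), order.end(),
              [&](std::size_t a, std::size_t b) {
                  return compar(copy.data() + a * size, copy.data() + b * size) < 0;
              });

    // Write the elements back in sorted order.
    for (std::size_t i = 0; i < count; ++i) {
        std::memcpy(bytes + i * size, copy.data() + order[i] * size, size);
    }
}

// Example comparator with C linkage, ordering ints ascending.
int compare_ints(const void* a, const void* b) {
    const int x = *static_cast<const int*>(a);
    const int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}

}  // extern "C"
```

Calling demo_qsort(data, n, sizeof(int), compare_ints) on an int array sorts it in ascending order, and from the caller's side the function looks like any C function.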
I see no reason why you couldn't do it, but I also see no reason why someone would use such an implementation. It's going to use a lot more memory, and be at least somewhat slower, than a normal implementation... although it might not be much worse than glibc, whose implementation of stdio is already essentially C++ anyway... (Look up GNU libio... you'll be horrified.)
Kernels like Linux have a very strict ABI, based on syscalls, ioctls, filesystems, and conformance to quite a few standards (POSIX being the major one). Since the ABI has to be stable, its surface is also limited. It would be a lot of work (particularly since you need a minimally useful kernel as well), but these standards could be implemented in any language.
Edit: You mentioned the libc as well. That is not part of the kernel, and the language of the libc can be entirely unrelated to that of the kernel, thanks to the aforementioned ABI. Unlike the kernel, the libc needs to be written in C or in a language with a very good FFI for C. C++ with parts in extern "C" would fit the bill.

GCC vs MS C++ compiler for maintaining API backwards binary compatibility

I came from the Linux world and know a lot of articles about maintaining backwards binary compatibility (BC) of a dynamic library API written in C++ language. One of them is "Policies/Binary Compatibility Issues With C++" based on the Itanium C++ ABI, which is used by the GCC compiler. But I can't find anything similar for the Microsoft C++ compiler (from MSVC).
I understand that most of the techniques are applicable to the MS C++ compiler and I would like to discover compiler-specific issues related to ABI differences (v-table layout, mangling, etc.)
So, my questions are the following:
Do you know any differences between MS C++ and GCC compilers when maintaining BC?
Where can I find information about MS C++ ABI or about maintaining BC of API in Windows?
Any related information will be highly appreciated.
Thanks a lot for your help!
First of all, these policies are general and do not apply to GCC only. For example, the rule about the private/public marking of functions is specific to MSVC, not GCC.
So basically these rules are fully applicable to MSVC and to compilers in general.
But...
You should remember:
GCC has kept its C++ ABI stable since the 3.4 release, i.e. for about 7 years (since 2004), while MSVC breaks its ABI with every major release: MSVC8 (2005), MSVC9 (2008), and MSVC10 (2010) are not compatible with each other.
Some frequently used MSVC flags can break the ABI as well (such as the exception model).
MSVC has incompatible run-times for Debug and Release modes.
So yes, you can use these rules, but as usual with MSVC, there are many more quirks.
See also "Some thoughts on binary compatibility"; Qt keeps its ABI stable with MSVC as well.
Note I have some experience with this as I follow these rules in CppCMS
On Windows, you basically have 2 options for long term binary compatibility:
COM
mimicking COM
Check out my post here. There you'll see a way to create DLLs and access DLLs in a binary compatible way across different compilers and compiler versions.
C++ DLL plugin interface
The best rule for MSVC binary compatibility is to use a C interface. The only C++ feature you can get away with, in my experience, is single-inheritance interfaces. So represent everything as interfaces which use C datatypes.
Here's a list of things which are not binary compatible:
The STL. The binary format changes even just between debug/release, and depending on compiler flags, so you're best off not using STL cross-module.
Heaps. Do not new / malloc in one module and delete / free in another. There are different heaps which do not know about each other. Another reason the STL won't work cross-modules.
Exceptions. Don't let exceptions propagate from one module to another.
RTTI/dynamic_casting datatypes from other modules.
Don't trust any other C++ features.
In short, C++ has no consistent ABI, but C does, so avoid C++ features crossing modules. Because single inheritance is a simple v-table, you can usefully use it to expose C++ objects, providing they use C datatypes and don't make cross-heap allocations. This is the approach used by Microsoft themselves as well, e.g. for the Direct3D API. GCC may be useful in providing a stable ABI, but the standard does not require this, and MSVC takes advantage of this flexibility.
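A COM-like interface following these rules might be sketched as below. The names ICounter, CounterImpl and create_counter are hypothetical; the sketch assumes the conventions described above: single inheritance, only C datatypes in signatures, and the object being freed by the module that allocated it.

```cpp
#include <cassert>
#include <new>

// Pure virtual, single-inheritance interface using only C datatypes.
struct ICounter {
    virtual void increment(int by) = 0;
    virtual int value() const = 0;
    // The object frees itself, so allocation and deallocation share one heap.
    virtual void release() = 0;
protected:
    // Non-virtual and protected: callers cannot delete through the interface,
    // and no destructor entry ends up in the vtable.
    ~ICounter() {}
};

namespace {

// Implementation stays private to the module that exports the factory.
class CounterImpl final : public ICounter {
public:
    void increment(int by) override { value_ += by; }
    int value() const override { return value_; }
    void release() override { delete this; }
private:
    int value_ = 0;
};

}  // namespace

// Factory with C linkage; returns nullptr instead of throwing on failure.
extern "C" ICounter* create_counter() {
    return new (std::nothrow) CounterImpl();
}
```

The caller obtains the object through create_counter, uses it only through the virtual functions, and hands it back with release() rather than delete.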

What could C/C++ "lose" if they defined a standard ABI?

The title says everything. I am talking about C/C++ specifically, because both consider this an "implementation issue". I think defining a standard interface could ease building a module system on top of it, along with many other good things.
What could C/C++ "lose" if they defined a standard ABI?
The freedom to implement things in the most natural way on each processor.
I imagine that C in particular has conforming implementations on more different architectures than any other language. Abiding by an ABI optimized for the currently common, high-end, general-purpose CPUs would require unnatural contortions on some of the odder machines out there.
Backwards compatibility on every platform except for the one whose ABI was chosen.
Basically, everyone missed that one of the C++14 proposals actually DID define a standard ABI. It was a standard ABI specifically for libraries that used a subset of C++. You define specific sections of "ABI" code (like a namespace) and it's required to conform to the subset.
Not only that, it was written by THE Herb Sutter, C++ expert and author of the "Exceptional C++" book series.
The proposal goes into many reasons why a portable ABI is difficult, as well as novel solutions.
https://isocpp.org/blog/2014/05/n4028
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4028.pdf
Note that he defines a "target platform" to be a combination of CPU architecture (x64, x86, ARM, etc), OS, and bitness (32/64).
So the goal here is actually to let C++ code (Visual Studio) talk to other C++ code (GCC, older Visual Studio, etc.) on the same platform. The goal is not a universal ABI that lets cellphone libraries run on your Windows machine.
This proposal was NOT ratified in C++14, however, it was moved into the "Evolution" phase of C++17 for further discussion/iteration.
https://www.ibm.com/developerworks/community/blogs/5894415f-be62-4bc0-81c5-3956e82276f3/entry/c_14_is_ratified_the_view_from_the_june_2014_c_standard_meeting?lang=en
So as of January 2017, my fingers remain crossed.
Rather than defining a generic ABI for all platforms (which would be disastrous, as it would be optimal for only one platform), the standards committee could say that each platform will conform to a specific ABI.
But who defines it? The first compiler through the door? In that case they get an excessive competitive advantage. Or a committee after 5 years of compilers (which would be another horrible idea)?
Also, it does not give compilers the leeway to do further research into new optimization strategies; you would be stuck with the tricks available at the point where the standard was defined.
The C (or C++) language specifications define the source language. They don't care about the processor running it (A C program could even be interpreted by a human slave, but that would be unethical and not cost-effective).
The ABI is by definition something about the target system. It is related to the processor and the system (and the existing libraries following the ABI).
In the past, it did happen that some processors had proprietary (i.e. undisclosed) specification (even their machine instruction set was not public), and they had a non-public ABI which was followed by a compiler (respecting more or less the language standard).
Defining a programming language doesn't require the same skill set as defining an ABI.
You could even define a newer ABI for an existing processor, but that requires a lot of work (patching the compiler, recompiling everything, including the C and C++ standard libraries and all the utilities and libraries that you need), so it is generally useless.
Execution speed would suffer drastically on a majority of platforms. So much so that it would likely no longer be reasonable to use the C language for a number of embedded platforms. The standards body could be liable for an antitrust suit brought by the makers of the various chips not compatible with the ABI.
Well, there wouldn't be one standard ABI, but about 1000. You would need one for every combination of OS and processor architecture.
Initially, nothing would be lost. But eventually, somebody would find some horrible bug and they would either fix it, breaking the ABI, or leave it, causing problems.
I think that the situation right now is fine. Any OS is free to define an ABI for itself (and they do), which makes sense. It should be the job of the OS to define its ABI, not the C/C++ standard.
C has always had a standard ABI, which is even the one used by almost any other standard ABI (I mean, the C ABI is the ABI of choice when different languages or systems have to bind to each other). The C ABI is a kind of common ABI among other ABIs. C++ is more complex, although it extends and is thus based on C, and indeed a standard ABI for C++ is more challenging and may restrict the freedom a C++ compiler has in how it generates target machine code. However, it seems to actually have a standard ABI; see the Itanium C++ ABI.
So the question may not be so much “what could they lose?”, but rather “what do they lose?” (if they really lose anything at all).
Side note: keep in mind that ABIs are always architecture and OS dependent. So if what was meant by “Standard ABI” is “standard across architectures and platforms”, then there may never have been, and may never be, such a thing, other than communication protocols.