GCC vs MS C++ compiler for maintaining API backwards binary compatibility - c++

I come from the Linux world and know of many articles about maintaining backwards binary compatibility (BC) of a dynamic library API written in C++. One of them is "Policies/Binary Compatibility Issues With C++", based on the Itanium C++ ABI, which is used by the GCC compiler. But I can't find anything similar for the Microsoft C++ compiler (MSVC).
I understand that most of the techniques also apply to the MS C++ compiler, and I would like to learn about compiler-specific issues related to ABI differences (v-table layout, mangling, etc.).
So, my questions are the following:
Do you know any differences between MS C++ and GCC compilers when maintaining BC?
Where can I find information about MS C++ ABI or about maintaining BC of API in Windows?
Any related information will be highly appreciated.
Thanks a lot for your help!

First of all, these policies are general and do not refer to GCC only. For example, the rule about the private/public marking of functions is specific to MSVC, not GCC.
So basically these rules are fully applicable to MSVC and to compilers in general as well.
But...
You should remember:
GCC's C++ ABI has been stable since the 3.4 release, which is about 7 years (since 2004), while MSVC breaks its ABI with every major release: MSVC8 (2005), MSVC9 (2008) and MSVC10 (2010) are not compatible with each other.
Some frequently used MSVC flags can break the ABI as well (like the exception model).
MSVC has incompatible run-times for Debug and Release modes.
So yes, you can use these rules, but as usual with MSVC there are many more quirks.
See also "Some thoughts on binary compatibility"; Qt keeps its ABI stable with MSVC as well.
Note: I have some experience with this, as I follow these rules in CppCMS.

On Windows, you basically have 2 options for long term binary compatibility:
COM
mimicking COM
Check out my post here. There you'll see a way to create DLLs and access DLLs in a binary compatible way across different compilers and compiler versions.
C++ DLL plugin interface

The best rule for MSVC binary compatibility is to use a C interface. The only C++ feature you can get away with, in my experience, is single-inheritance interfaces. So represent everything as interfaces which use C datatypes.
Here's a list of things which are not binary compatible:
The STL. The binary format changes even just between debug/release, and depending on compiler flags, so you're best off not using STL cross-module.
Heaps. Do not new / malloc in one module and delete / free in another. There are different heaps which do not know about each other. Another reason the STL won't work cross-modules.
Exceptions. Don't let exceptions propagate from one module to another.
RTTI/dynamic_casting datatypes from other modules.
Don't trust any other C++ features.
In short, C++ has no consistent ABI, but C does, so avoid C++ features crossing modules. Because single inheritance is a simple v-table, you can usefully use it to expose C++ objects, providing they use C datatypes and don't make cross-heap allocations. This is the approach used by Microsoft themselves as well, e.g. for the Direct3D API. GCC may be useful in providing a stable ABI, but the standard does not require this, and MSVC takes advantage of this flexibility.
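A minimal sketch of the pattern described above, assuming C++11: a single-inheritance interface that exposes only C data types, created through an `extern "C"` factory and destroyed by the module that allocated it. The names `ICounter`, `Counter` and `create_counter` are hypothetical and only serve the illustration; in a real Windows DLL the factory would also need an export annotation such as `__declspec(dllexport)`.

```cpp
#include <cstdint>
#include <new>

// Hypothetical interface: pure virtual functions, C data types only,
// single inheritance, no STL types and no exceptions in the signatures.
struct ICounter {
    virtual void add(std::int32_t delta) = 0;
    virtual std::int32_t value() const = 0;
    // Destruction happens inside the module that created the object,
    // so allocation and deallocation always use the same heap.
    virtual void release() = 0;

protected:
    ~ICounter() {}  // callers cannot delete through the interface
};

namespace {

// Implementation detail, never visible to the client module.
class Counter final : public ICounter {
public:
    Counter() : total_(0) {}
    void add(std::int32_t delta) override { total_ += delta; }
    std::int32_t value() const override { return total_; }
    void release() override { delete this; }

private:
    std::int32_t total_;
};

}  // namespace

// C-linkage factory: returns a null pointer instead of throwing,
// so no exception ever crosses the module boundary.
extern "C" ICounter* create_counter() {
    return new (std::nothrow) Counter();
}
```

A client would call `create_counter()`, use the returned pointer through the virtual functions only, and finish with `release()` rather than `delete`.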

Related

C++ ABI issue related to STL

I have searched the net without finding any conclusive answers on the lack of a C++ ABI when it comes to exporting C++ classes across DLL boundaries on Windows.
I can use extern "C" and provide a C-like API to a library, but I would like end users to be able to use classes that use STL containers.
What generic patterns do you use for exporting a class that uses an STL container across DLL boundaries in a safe manner, i.e. what is the best practice?
This question is for experienced library authors.
Regards
Niladri
There's no defined C++ ABI and it does differ between compilers in terms of memory layout, name mangling, RTL etc.
It gets worse than that though. If you are targeting MSVC compiler for example, your dll/exe can be built with different macros that configure the STL to not include iterator checks so that it is faster. This modifies the layout of STL classes and you end up breaking the One Definition Rule (ODR) when you link (but the link still succeeds). If the ODR is violated then your program will crash seemingly randomly.
I would recommend maybe reading Imperfect C++ which has a chapter on the subject of C++ ABIs.
The upshot is that either:
you compile the dll specifically for the target compiler that is going to link to it and name the dll's appropriately (like boost does it). In this case you can't avoid ODR violations so the user of the library must be able to recompile the library themselves with different options.
Provide a portable C API and provide C++ wrapper classes for convenience that are compiled on the client side of the API. This is time consuming but portable.
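The second option can be sketched roughly as follows, assuming C++11. The library keeps its STL container behind an opaque handle and a C API, turns exceptions into error codes, and the convenience class is compiled on the client side, so the `std::vector` and `std::string` the client sees are always the client's own. All names here (`namelist`, `namelist_add`, `NameList`, and so on) are made up for the illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <new>
#include <stdexcept>
#include <string>
#include <vector>

// ---- Library side (hypothetical). Clients only see an opaque handle. ----
struct namelist {
    std::vector<std::string> items;  // never exposed across the boundary
};

extern "C" {

namelist* namelist_create(void) { return new (std::nothrow) namelist(); }

void namelist_destroy(namelist* list) { delete list; }

// Returns 0 on success, -1 on failure; exceptions never leave the library.
int namelist_add(namelist* list, const char* name) {
    if (!list || !name) return -1;
    try {
        list->items.push_back(name);
        return 0;
    } catch (...) {
        return -1;
    }
}

std::size_t namelist_size(const namelist* list) {
    return list ? list->items.size() : 0;
}

// Copies entry `index` into the caller's buffer (truncating to `cap`,
// always NUL-terminated) and returns the full length of the entry.
std::size_t namelist_get(const namelist* list, std::size_t index,
                         char* buf, std::size_t cap) {
    if (!list || index >= list->items.size()) return 0;
    const std::string& s = list->items[index];
    if (buf && cap > 0) {
        const std::size_t n = s.size() < cap - 1 ? s.size() : cap - 1;
        std::memcpy(buf, s.data(), n);
        buf[n] = '\0';
    }
    return s.size();
}

}  // extern "C"

// ---- Client side: header-only wrapper built with the client's compiler. ----
class NameList {
public:
    NameList() : handle_(namelist_create()) {
        if (!handle_) throw std::bad_alloc();
    }
    ~NameList() { namelist_destroy(handle_); }
    NameList(const NameList&) = delete;
    NameList& operator=(const NameList&) = delete;

    void add(const std::string& name) {
        if (namelist_add(handle_, name.c_str()) != 0)
            throw std::runtime_error("namelist_add failed");
    }

    // Rebuilds an STL container from plain C data on the client side.
    std::vector<std::string> to_vector() const {
        std::vector<std::string> out;
        const std::size_t count = namelist_size(handle_);
        for (std::size_t i = 0; i < count; ++i) {
            std::vector<char> buf(namelist_get(handle_, i, nullptr, 0) + 1);
            namelist_get(handle_, i, buf.data(), buf.size());
            out.emplace_back(buf.data());
        }
        return out;
    }

private:
    namelist* handle_;
};
```

The cost is an extra copy per element, which is the price of never sharing a container layout between two differently configured builds.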
Take a look at CppComponents https://github.com/jbandela/cppcomponents
You can export classes and interfaces across dll boundaries even across different compilers.
The library is header-only so nothing to build. To use it, you will need MSVC 2013 and/or GCC 4.8 as it uses C++11 variadic templates extensively.
See presentation from C++Now
https://github.com/boostcon/cppnow_presentations_2014/blob/master/files/cppnow2014_bandela_presentation.pdf?raw=true
https://github.com/jbandela/cppcomponents_cppnow_examples has the examples from the presentation.

C++ Standard Library Portability

I work on large scale, multi platform, real time networked applications. The projects I work on lack any real use of containers or the Standard Library in general, no smart pointers or really any "modern" C++ language features. Lots of raw dynamically allocated arrays are common place.
I would very much like to start using the Standard Library and some of the C++11 spec, however, there are many people also working on my projects that are against because "STL / C++11 isn't as portable, we take a risk using it". We do run software on a wide variety of embedded systems as well as fully fledged Ubuntu/Windows/Mac OS systems.
So, to my question; what are the actual issues of portability with concern to the Standard Library and C++11? Is it just a case of having g++ past a certain version? Are there some platforms that have no support? Are compiled libraries required and if so, are they difficult to obtain/compile? Has anyone had serious issues being burnt by non-portable pure C++?
Library support for the new C++11 Standard is pretty complete in Visual C++ 2012, gcc >= 4.7 and Clang >= 3.1, apart from some concurrency stuff. Compiler support for all the individual language features is another matter. See this link for an up to date overview of supported C++11 features.
For an in-depth analysis of C++ in an embedded/real-time environment, Scott Meyers's presentation materials are really great. It discusses costs of virtual functions, exception handling and templates, and much more. In particular, you might want to look at his analysis of C++ features such as heap allocations, runtime type information and exceptions, which have indeterminate worst-case timing guarantees, which matter for real-time systems.
It's those kinds of issues, not portability, that should be your major concern (if you care about your granny's pacemaker...)
Any compiler for C++ should support some version of the standard library. The standard library is part of C++. Not supporting it means the compiler is not a C++ compiler. I would be very surprised if any of the compilers you're using at the moment don't portably support the C++03 standard library, so there's no excuse there. Of course, the compiler will have to have been updated since 2003, but unless you're compiling for some archaic system that is only supported by an archaic compiler, you'll have no problems.
As for C++11, support is pretty good at the moment. Both GCC and MSVC have a large portion of the C++11 standard library supported already. Again, if you're using the latest versions of these compilers and they support the systems you want to compile for, then there's no reason you can't use the subset of the C++11 standard library that they support - which is almost all of it.
C++ without the standard library just isn't C++. The language and library features go hand in hand.
There are lists of supported C++11 library features for GCC's libstdc++ and MSVC 2012. I can't find anything similar for LLVM's libc++, but they do have a clang c++11 support page.
The people you are talking to are confusing several different issues. C++11 isn't really portable today. I don't think any compiler supports it 100% (although I could be wrong); you can get away with using large parts of it if (and only if) you limit yourself to the most recent compilers on two or three platforms (Windows and Linux, and probably Apple). While these are the most visible platforms, they represent but a small part of all machines. (If you're working on large scale networked applications, Solaris will probably be important, and Sun CC. Unless Sun have greatly changed since I last worked on it, that means that there are even parts of C++03 that you can't count on.)
The STL is a completely different issue. It depends partially on what you mean by the STL, but there is certainly no portability problem today in using std::vector. locale might be problematic on a very few compilers (it was with Sun CC, with both the Rogue Wave and the Stlport libraries), and some of the algorithms, but for the most part, you can pretty much count on all of C++03.
And in the end, what are the alternatives? If you don't have std::vector, you end up implementing something pretty much like it. If you're really worried about the presence of std::vector, wrap it in your own class; if ever it's not available (highly unlikely, unless you go back with a time machine), just reimplement it, exactly like we did in the pre-standard days.
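As a rough illustration of the wrapping idea, here is a minimal sketch of a project-local container that forwards to std::vector. The class name `Array` and its deliberately small interface are assumptions made for the example; the point is only that the rest of the code base depends on this one class, so a replacement implementation would touch a single file.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical project-local container. Application code uses Array<T>
// only, never std::vector directly.
template <typename T>
class Array {
public:
    Array() {}
    explicit Array(std::size_t n) : data_(n) {}

    std::size_t size() const { return data_.size(); }
    bool empty() const { return data_.empty(); }

    void push_back(const T& value) { data_.push_back(value); }

    T& operator[](std::size_t i) { return data_[i]; }
    const T& operator[](std::size_t i) const { return data_[i]; }

private:
    // The only member tied to the standard library. On a platform without
    // a usable std::vector, this would become a hand-written buffer.
    std::vector<T> data_;
};
```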
Use STLPort with your existing compiler, if it supports it. This is no more than a library of code, and you use other libraries without problem, right?
Every permitted implementation-defined behaviour is listed in the publicly available standard draft. There is next to nothing less portable in C++11 than in C++98.

How portable IS C++?

In C++, if I write a simple game like pong using Linux, can that same code be compiled on Windows and OSX? Where can I tell it won't be able to be compiled?
You have three major portability hurdles.
The first, and simplest, is writing C++ code that all the target compilers understand. Note: this is different from writing to the C++ standard. The problem with "writing to the standard" starts with: which standard? You have C++98, C++03, C++TR1 or C++11 or C++14 or C++17? These are all revisions to C++ and the newer one you use the less compliant compilers are likely to be. C++ is very large, and realistically the best you can hope for is C++98 with some C++03 features.
Compilers all add their own extensions, and it's all too easy to unknowingly use them. You would be wise to write to the standard and not to the compiler documentation. Some compilers have a "strict" mode where they will turn off all extensions. You would be wise to do primary development in the compiler which has the most strictures and the best standard compliance. gcc has the -Wstrict family of flags to turn on strict warnings. -ansi will remove extensions which conflict with the standard. -std=c++98 will tell the compiler to work against the C++98 standard and remove GNU C++ extensions.
With that in mind, to remain sane you must restrict yourself to a handful of compilers and only their recent versions. Even writing a relatively simple C library for multiple compilers is difficult. Fortunately, both Linux and OS X use gcc. Windows has Visual C++, but different versions are more like a squabbling family than a single compiler when it comes to compatibility (with the standard or each other), so you'll have to pick a version or two to support. Alternatively, you can use one of the gcc derived compiler environments such as MinGW. Check the list of C++ compilers for compatibility information, but keep in mind this is only for the latest version.
Next is your graphics and sound library. It has to not just be cross platform, it has to look good and be fast on all platforms. These days there's a lot of possibilities, Simple DirectMedia Layer is one. You'll have to choose at what level you want to code. Do you want detailed control? Or do you want an engine to take care of things? There's an existing answer for this so I won't go into details. Be sure to choose one that is dedicated to being cross platform, not just happens to work. Compatibility bugs in your graphics library can sink your project fast.
Finally, there's the simple incompatibilities which exist between the operating systems. POSIX compliance has come a long way, and you're lucky that both Linux and OS X are Unix under the hood, but Windows will always be the odd man out. Things which are likely to bite you mostly have to do with the filesystem. Here's a handful:
Filesystem layout
File path syntax (ie. C:\foo\bar vs /foo/bar)
Mandatory Windows file locking
Differing file permissions systems
Differing models of interprocess communication (ie. fork, shared memory, etc...)
Differing threading models (your graphics library should smooth this out)
There you have it. What a mess, huh? Cross-platform programming is as much a state of mind and statement of purpose as it is a technique. It requires some dedication and extra time. There are some things you can do to make the process less grueling...
Turn on all strictures and warnings and fix them
Turn off all language extensions
Periodically compile and test in Windows, not just at the end
Get a programmer who likes Windows on the project
Restrict yourself to as few compilers as you can
Choose a well maintained, well supported graphics library
Isolate platform specific code (for example, in a subclass)
Treat Windows as a first class citizen
The most important thing is to do this all from the start. Portability is not something you bolt on at the end. Not just your code, but your whole design can become unportable if you're not vigilant.
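One way to read "isolate platform specific code (for example, in a subclass)" is sketched below, assuming C++11. The class names and the single `path_separator` hook are hypothetical; a real project would put file locking, threading and similar differences behind the same kind of interface.

```cpp
#include <string>

// Hypothetical abstraction: everything that differs between operating
// systems sits behind this interface; the rest of the game never uses #ifdef.
class Platform {
public:
    virtual ~Platform() {}
    virtual char path_separator() const = 0;

    // Portable logic written once in terms of the platform-specific hook.
    std::string join(const std::string& dir, const std::string& file) const {
        return dir + path_separator() + file;
    }
};

class WindowsPlatform : public Platform {
public:
    char path_separator() const override { return '\\'; }
};

class PosixPlatform : public Platform {
public:
    char path_separator() const override { return '/'; }
};

// The only place where the build target is inspected.
inline const Platform& current_platform() {
#ifdef _WIN32
    static const WindowsPlatform instance{};
#else
    static const PosixPlatform instance{};
#endif
    return instance;
}
```

Game code would then write `current_platform().join("assets", "ball.png")` and never spell out a separator itself.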
C++ is ultra portable and has compilers available on more platforms than you can shake a stick at. Languages like Java are typically touted as being massively cross platform, ironically they are in fact usually implemented in C++, or C.
That covers "portability". If you actually mean, how cross platform is C++, then not so much: the C++ standard only defines an I/O library suitable for console I/O, i.e. text based, so as soon as you want to develop some kind of GUI, you are going to need to use a GUI framework, and GUI frameworks are historically very platform specific. Windows has multiple "native" GUI frameworks now. The C++ framework made available by Microsoft is still MFC, which wraps the native Win32 API, which is a C API. (WPF and WinForms are available to CLR C++.)
The Apple Mac's GUI framework is called Cocoa, and is an Objective-C library, but it's easy to access Objective-C from C++ in that development environment.
On Linux there are the GTK+ and Qt frameworks, both of which are also ported to Windows and Mac, so one of these frameworks can solve your "how to write a GUI application in C++ once that builds and runs on Windows, Apple Mac and Linux".
Of course, it's difficult to regard Qt as strictly C++ anymore: Qt defines a special markup for signals and slots that requires a pre-compile step.
You can read the standard - if a program respects the standard, it should be compilable on all platforms that have a C++ standard-compliant compiler.
As for 3rd party libraries you might be using, the platform availability is usually specified in the documentation.
When it comes to the GUI, there are cross-platform options (such as Qt), but you should probably ask yourself: do I really want portability when it comes to UI? Sometimes, it's better to have the GUI part platform-specific.
If you are thinking of porting from Linux to Windows, using OpenGL for the graphical part gives you the freedom to run your program on both operating systems as long as you don't use any system-specific functionality.
Compared to C, C++ portability is extremely limited, if not completely nonexistent. For one, you can't disable exceptions (well, you can), because the standard specifically says that's undefined behaviour. Many devices don't even support exceptions. So as for that, C++ is ZERO portable. Plus, given the UB, it's obviously a no-go for zero-fail high-performance real time systems in which exceptions are taboo: undefined behaviour has no place in a zero-fail environment. Then there's the name mangling, which most, if not all, compilers do completely differently. For good portability and inter-compatibility extern "C" would have to be used to export symbols, yet this renders any and all namespace information completely void, resulting in duplicate symbols. One can of course choose to not use namespaces and use unique symbol names. Yet another C++ feature rendered void. Then there's the complexity of the language, which results in implementation difficulties in the various compilers for various architectures. Due to these difficulties, true portability becomes a problem. One can solve this by having a large chain of compiler directives/#ifdefs/macros. Templates? Not even supported by most compilers.
What portability? You mean the semi-portability between a couple of main-stream build targets like MSVC for Windows and GCC for Linux? Even there, in that MAIN-STREAM segment, all the above problems and limitations exist. It's absurd to even think C++ is portable.

Developing embedded software library, C or C++?

I'm in the process of developing a software library to be used for embedded systems like an ARM chip or a TI DSP (mostly for embedded systems, but it would be nice if it could also be used in a PC environment). Obviously this is a pretty broad range of target systems, so being able to easily port to different systems is a priority. The library will be used for interfacing with specific hardware and running some algorithms.
I am thinking C++ is the best option, over C, because it is much easier to maintain and read. I think the additional overhead is worth it for being able to work in the object oriented paradigm. If I was writing for a very specific system, I would work in C but this is not the case.
I'm assuming that these days most compilers for popular embedded systems can handle C++. Is this correct?
Are there any other factors I should consider? Is my line of thinking correct?
If portability is very important for you, especially on an embedded system, then C is certainly a better option than C++. While C++ compilers on embedded platforms are catching up, there's simply no match for the widespread use of C, for which any self-respecting platform has a compliant compiler.
Moreover, I don't think C is inferior to C++ where it comes to interfacing hardware. The amount of abstraction is sufficiently low (i.e. no deep class hierarchies) to make C just as good an option.
There is certainly good support of C++ for ARM. ARM have their own compiler and g++ can also generate EABI compliant ARM code. When it comes to the DSPs, you will have to look at their toolchain to decide what you are going to do. Be aware that the library that comes with a DSP may well not implement the full C or C++ standard library.
C++ is suitable for low-level embedded development and is used in the SymbianOS Kernel. Having said that, you should keep things as simple as possible.
Avoid exceptions which may demand more library support than what is present (therefore use new (std::nothrow) Foo instead of new Foo).
Avoid memory allocations as much as possible and do them as early as possible.
Avoid complex patterns.
Be aware that templates can bloat your code.
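The first two points can be sketched together: allocate once, early, and use the non-throwing form of new so that failure shows up as a null pointer rather than an exception. The `Sample` type and the two helper functions are hypothetical names chosen for the example.

```cpp
#include <cstddef>
#include <new>

// Hypothetical sample record for a data-acquisition library.
struct Sample {
    int channel;
    int value;
};

// Allocates the whole working buffer once, at start-up. With std::nothrow
// the expression yields a null pointer on failure instead of throwing
// std::bad_alloc, so no exception support is needed for this path.
// The trailing () value-initializes the array, so every field starts at zero.
inline Sample* allocate_samples(std::size_t count) {
    return new (std::nothrow) Sample[count]();
}

// Matching release function; deleting a null pointer is a no-op.
inline void free_samples(Sample* samples) {
    delete[] samples;
}
```

The caller is expected to check the result of `allocate_samples` for null before first use and to keep the buffer for the lifetime of the application.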
I have seen many complaints that C++ is "bloated" and inappropriate for embedded systems.
However, in an interview with Stroustrup and Sutter, Bjarne Stroustrup mentioned that he'd seen heavily templated C++ code going into (IIRC) the braking systems of BMWs, as well as in missile guidance systems for fighter aircraft.
What I take away from this is that experts of the language can generate sophisticated, efficient code in C++ that is most certainly suitable for embedded systems. However, a "C With Classes"[1] programmer that does not know the language inside out will generate bloated code that is inappropriate.
The question boils down to, as always: in which language can your team deliver the best product?
[1] I know that sounds somewhat derogatory, but let me say that I know an awful lot of these guys, and they churn out an awful lot of relatively simple code that gets the job done.
C++ compilers for embedded platforms are much closer to 83's C with classes than 98's C++ standard, let alone C++0x. For instance, some platforms we use still compile with a special version of gcc made from gcc-2.95!
This means that your library interface will not be able to provide interfaces with containers/iterators, streams, or such advanced C++ features. You'll have to stick with simple C++ classes, that can very easily be expressed as a C interface with a pointer to a structure as first parameter.
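To make the equivalence concrete, here is a minimal sketch of a simple class expressed both ways: a C interface taking a pointer to a structure as its first parameter, and a thin C++ class over it. The `averager` type and its functions are hypothetical; no heap allocation, templates or standard containers are involved.

```cpp
#include <stdint.h>

extern "C" {

// Plain C state: usable from C and C++ alike, can live on the stack
// or in static storage.
typedef struct {
    uint32_t sum;
    uint32_t count;
} averager;

// Each "method" takes the object as its first parameter.
void averager_init(averager* self) {
    self->sum = 0;
    self->count = 0;
}

void averager_push(averager* self, uint32_t sample) {
    self->sum += sample;
    self->count += 1;
}

// Integer mean of the samples pushed so far; 0 when there are none.
uint32_t averager_mean(const averager* self) {
    return self->count ? self->sum / self->count : 0;
}

}  // extern "C"

// The same thing as a simple C++ class: each member function forwards
// to the C function with the embedded state as first argument.
class Averager {
public:
    Averager() { averager_init(&state_); }
    void push(uint32_t sample) { averager_push(&state_, sample); }
    uint32_t mean() const { return averager_mean(&state_); }

private:
    averager state_;
};
```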
This also means that within your library, you won't be able to use templates to their full power. If you want portability, you will still be restricted to generic containers use of templates, which is, I'm sure you'll admit, only a very tiny part of C++ templates power.
C++ has little or no overhead compared to C if used properly in an embedded environment. C++ has many advantages for information hiding, OO, etc. If your embedded processor is supported by gcc in C then chances are it will also be supported with C++.
On the PC, C++ isn't a problem at all -- high quality compilers are extremely widespread and almost every C compiler is directly associated with a C++ compiler that's quite good, though there are a few exceptions such as lcc and the newly revived pcc.
Larger embedded systems like those based on the ARM are generally quite similar to desktop systems in terms of tool chain availability. In fact, many of the same tools available for desktop machines can also generate code to run on ARM-based machines (e.g., lots of them use ports of gcc/g++). There's less variety for TI DSPs (and a greater emphasis on quality of generated code than source code features), but there are still at least a couple of respectable C++ compilers available.
If you want to work with smaller embedded systems, the situation changes in a hurry. If you want to be able to target something like a PIC or an AVR, C++ isn't really much of an option. In theory, you could get (for example) Comeau to produce a custom port that generated code you could compile on that target's C compiler -- but chances are pretty good that even if you did, it wouldn't work out very well. These systems are really just too limited (especially in memory size) for C++ to fit them well.
Depending on what your intended use is for the library, I think I'd suggest implementing it first as C - but the design should keep in mind how it would be incorporated into a C++ design. Then implement C++ classes on top of and/or along side of the C implementation (there's no reason this step cannot be done concurrently with the first). If your C design is done with a C++ design in mind, it's likely to be as clean, readable and maintainable as the C++ design would be. This is somewhat more work, but I think you'll end up with a library that's useful in more situations.
While you'll find C++ used more and more on various embedded projects, there are still many that restrict themselves to C (and I'd guess this is more often the case than not) - regardless of whether or not the tools support C++. It would be a shame to have a nice library of routines that you could bring to a new project you're working on, but be unable to use them because C++ isn't being used on that particular project.
In general, it's much easier to use a well-designed C library from C++ than the other way around. I've taken this approach with several sets of code including parsing Intel Hex files, a simple command parser, manipulating synchronization objects, FSM frameworks, etc. I'm planning on doing a simple XML parser at some point.
Here's an entirely different C++-vs-C argument: stable ABIs. If your library exports a C ABI, it can be compiled with any compiler that works on the system, because C ABIs are generally platform standards. If your library exports a C++ ABI, it can only be compiled with a matching compiler -- because C++ ABIs are usually not platform standards, and often differ from compiler to compiler and even version to version.
Interestingly, one of the rare exceptions to this is ARM; there's an ARM C++ ABI specification, and all compliant ARM compilers follow it. This is not true on x86; on x86, you're lucky if a C++ library compiled with a 4.1 version of GCC will link correctly with an application compiled with GCC 4.4, and don't even ask about 3.4.6.
Even if you export a C ABI, you can have problems. If your library uses C++ internally, it will then link to libstdc++ for things in the C++ std:: namespace. If your user compiles a C++ application that uses your library, they'll also link to libstdc++ -- and so the overall application gets linked to libstdc++ twice, and their libstdc++ may not be compatible with your libstdc++, which can (or so I understand) lead to odd errors from the intersection of the two. Considerably less likely, but still possible.
All of these arguments only apply because you're writing a library, and they're not showstoppers. But they are things to be aware of.

What could C/C++ "lose" if they defined a standard ABI?

The title says everything. I am talking about C/C++ specifically, because both consider this as "implementation issue". I think, defining a standard interface can ease building a module system on top of it, and many other good things.
What could C/C++ "lose" if they defined a standard ABI?
The freedom to implement things in the most natural way on each processor.
I imagine that C in particular has conforming implementations on more different architectures than any other language. Abiding by an ABI optimized for the currently common, high-end, general-purpose CPUs would require unnatural contortions on some of the odder machines out there.
Backwards compatibility on every platform except for the one whose ABI was chosen.
Basically, everyone missed that one of the C++14 proposals actually DID define a standard ABI. It was a standard ABI specifically for libraries that used a subset of C++. You define specific sections of "ABI" code (like a namespace) and it's required to conform to the subset.
Not only that, it was written by THE Herb Sutter, C++ expert and author of the "Exceptional C++" book series.
The proposal goes into many reasons why a portable ABI is difficult, as well as novel solutions.
https://isocpp.org/blog/2014/05/n4028
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4028.pdf
Note that he defines a "target platform" to be a combination of CPU architecture (x64, x86, ARM, etc), OS, and bitness (32/64).
So the goal here, is actually having C++ code (Visual Studio) be able to talk to other C++ code (GCC, older Visual Studio, etc) on the same platform. It's not a goal of a universal ABI that lets cellphones libraries run on your Windows machine.
This proposal was NOT ratified in C++14, however, it was moved into the "Evolution" phase of C++17 for further discussion/iteration.
https://www.ibm.com/developerworks/community/blogs/5894415f-be62-4bc0-81c5-3956e82276f3/entry/c_14_is_ratified_the_view_from_the_june_2014_c_standard_meeting?lang=en
So as of January 2017, my fingers remain crossed.
Rather than a generic ABI for all platforms (which would be disastrous, as it would be optimal for only one platform), the standards committee could say that each platform will conform to a specific ABI.
But who defines it? The first compiler through the door? In that case they get an excessive competitive advantage. Or a committee, after 5 years of compilers (which would be another horrible idea)?
Also, it does not give compilers leeway to do further research into new optimization strategies; you would be stuck with the tricks available at the point where the standard was defined.
The C (or C++) language specifications define the source language. They don't care about the processor running it (A C program could even be interpreted by a human slave, but that would be unethical and not cost-effective).
The ABI is by definition something about the target system. It is related to the processor and the system (and the existing libraries following the ABI).
In the past, it did happen that some processors had proprietary (i.e. undisclosed) specification (even their machine instruction set was not public), and they had a non-public ABI which was followed by a compiler (respecting more or less the language standard).
Defining a programming language doesn't require the same skill set as defining the ABI.
You could even define a newer ABI for an existing processor, but that requires a lot of work (patching the compiler, recompiling every thing, including C & C++ standard libraries and all utilities and libraries that you need) so is generally useless.
Execution speed would suffer drastically on a majority of platforms. So much so that it would likely no longer be reasonable to use the C language for a number of embedded platforms. The standards body could be liable for an antitrust suit brought by the makers of the various chips not compatible with the ABI.
Well, there wouldn't be one standard ABI, but about 1000. You would need one for every combination of OS and processor architecture.
Initially, nothing would be lost. But eventually, somebody would find some horrible bug and they would either fix it, breaking the ABI, or leave it, causing problems.
I think that the situation right now is fine. Any OS is free to define an ABI for itself (and they do), which makes sense. It should be the job of the OS to define its ABI, not the C/C++ standard.
C has always had a standard ABI, which is even the one most standard ABIs build on (I mean, the C ABI is the ABI of choice when different languages or systems have to bind to each other). The C ABI is a kind of common denominator of other ABIs. C++ is more complex, although it extends and is thus based on C, and indeed a standard ABI for C++ is more challenging and may restrict the freedom a C++ compiler has in its own implementation of the target machine code. However, it seems to actually have a standard ABI; see the Itanium C++ ABI.
So the question may not be so much “what could they lose?”, but rather “what do they lose?” (if they really lose anything at all).
Side note: keep in mind that ABIs are always architecture and OS dependent. So if what was meant by “Standard ABI” is “standard across architectures and platforms”, then there may never have been, nor ever be, such a thing, other than communication protocols.