c++ class instance memory layout once again - c++

I know that this question has been asked previously, but before you give me a minus and report repeated question, ponder a while on this:
In all previous answers everybody says that object memory layout is compiler dependent. How is it then, that shared libraries (*.dll, *.so) can export and import c++ classes, and they can definitely be combined even if coming from different compilers? Consider a DirectX application written under mingw. DirectX was compiled using MSVC++, so how do those environments agree on memory layout? I know that DirectX relies heavily on C++ classes and polymorphism.
Asking differently: let's say that I have a chosen architecture (eg. Windows, intel x86) and I am trying to write a new compiler. How do I know how to access class instance (vtable, member fields) provided by .dll lib compiled by another compiler? Or is it simply like that: M$ has written VC++, and since then it is unwritten standard, and every other compiler does it the same "for compatibility reasons"? And what about linux or other OS-es?
EDIT:
OK I admit, the example with DirectX was bad because of COM specification...
Another example: QT. I am using QT with mingw, but I know there are also available MSVC versions. I don't know if the difference is only in headers, or if shared libs (dll-s) are also different. If they are, does it mean that I have to distribute my app with qt libs included, so if anybody happens to have ones for a different compiler it will not get mixed-up? (Nice memory and code sharing then, right?). Or are they the same and there is some unwritten law about it anyway?
EDIT2:
I have installed a different qt version (msvc 2010) just to see what is and isn't shared. Seems that shared (are they really shared then) libraries are different. Seems that I really have to provide qt-libs with my app then... And this is no small thing (eg. QtGui 8-9MB). What about other, smaller libs, whose authors weren't so kind to provide versions for other compilers? Does it mean that I am stuck with their original compiler? What if I want to use two different libs that were compiled by different compilers?

Basically you're asking about an ABI.
In some cases (e.g., on the Itanium) there's a document that specifies the ABI, and essentially everybody follows the document.
In the case of DirectX, it's a bit the same: Microsoft has published specs for COM, so anybody who follows those specs can get interop with essentially any COM object (within reason -- a 64-bit compiler probably isn't going to work with a 16-bit COM object written for Windows 3.1).
For most other things, you're more or less on your own to figure things out. There's often at least a little published in the way of documentation, but at least in my experience it often skims over some details that end up important, isn't updated entirely dependably, and in some cases is just plain wrong. In most cases, it's also poorly organized so it's pretty much up to you to glean what you can from wherever you can find it, and when you run out of information (or patience) do some reverse engineering to fill in the missing parts.
Edit: For non-COM things like Qt libraries, you have a couple of choices. One is to link to the Qt libraries statically, so the Qt code you need gets linked directly into your executable. If you want to use a DLL, then yes, you're pretty much stuck with distributing one with your app that's going to be specific to the compiler you use to generate the application itself. A different compiler (or even a different version or set of compilation flags with the same compiler) will generally require a different DLL (possibly the same source code, but built to accommodate the compiler change).
There are exceptions to this, but they're pretty much like I've outlined above. For example, the Intel compiler for Windows normally uses Microsoft's standard library, and can use most (if not all) other libraries built for Microsoft's compiler as well. This is pretty much because Intel has bent over backward to assure that their compiler uses the same calling convention, name mangling scheme, etc., as Microsoft's though. It works because they put in a lot of effort to make it work, not because of any unwritten law, or anything like that.

DirectX is a COM-based technology and that is why it can be used in C with various compilers. Under the hood COM interface is a C-like structure which emulates the VMT.
Technically, DirectX has a midl-autogenerated C++ interface, but the general assertion that one can use classes exported in .dlls across different compilers is wrong.
Edit1:
QT dlls built using MSVC won't be compatible with the ones built by gcc, unfortunately, because of the natural reasons: different C++ compiler - different ABI and different run-time library (mingw uses older MSVCRT and that is why pure-C .dlls can be consumed in MSVC and vice versa). Some compilers incidentally or partially intentionally match their ABIs, but this is definitely not the case with MSVC/gcc. QT can also be built as a static library, so to redistribute things one might just link statically.
The name mangling for C++ classes in DLLs depends largely on the compiler front-end used. Many commercial compilers from well-known companies use EDG's C++ parser and that is why the class names an overloaded functions have similar or matching signatures.
Edit2:
"What if I want to use two different libs that were compiled by different compilers?"
If you desperately need some specific functionality from both libraries (I mean some concrete operation, not the framework overall), then the way to go without having the library's source code is to write some wrapper and compile this wrapper to the C-style .dll.
Think of this like having "two different C++-es, C++-1 and C++-2". The problem with ABI/Runtime is no different from using Pascal code from C or linking with some older Fortran libs.

Related

Creating C++ API Library

I'm trying to understand the correct way, or right approach, to provide a reasonably large C++ API for a non-open source project. I do not want to provide a "header only" library as the code base is fairly large and meant to be closed source. The goals are as follows:
Provide a native C++ API that users may instantiate C++ classes, pass data around, all in C++, without a C-only wraper
Allow methods to take as parameters and return C++ objects, especially STL types (std::string, std::vector, etc)
No custom allocators
Most industry standard / canonical method of doing this possible, if such a standard exists
No re-creating COM or using MS COM
Assume all C++ compilers are at least C++11 "compliant"
I am targeting Windows as well as other platforms (Linux).
My understanding is creating a DLL or shared library is out of the question because of the DLL boundary issue. To deal with DLL boundary issue, the DLL and the calling code must be compiled with dynamically linked runtime (and the right version, multithreaded/debug/etc on Windows), and then all compiler settings would need to match with respect to debug symbols (iterator debugging settings, etc). One question I have this is whether, if, say on Windows, we ensure that the compiler settings match in terms of /MD using either default "Debug" and "Releas" settings in Visual Studio, can we really be "safe" in using the DLL in this way (that is passing STL objects back and forth and various things that would certainly be dangerous/fail if there was a mismatch)? Do shared object, *.so in Linux under gcc have the same problem?
Does using static libraries solve this problem? How much do compiler settings need to match between a static library and calling code for which it is linked? Is it nearly the same problem as the DLL (on Windows)?
I have tried to find examples of libraries online but cannot find much guidance on this. Many resources discuss Open Source solution, which seems to be copying header and implementation files into the code base (for non-header-only), which does not work for closed source.
What's the right way to do this? It seems like it should be a common issue; although I wonder if most commercial vendors just use C interfaces.
I am ok with static libraries if that solves the problem. I could also buy into the idea of having a set of X compilers with Y variations of settings (where X and Y are pre-determined list of options to support) and having a build system that generated X * Y shared binary libraries, if that was "safe".
Is the answer is really only to do either C interfaces or create Pure Abstract interfaces with factories? (if so, is there a canonical book or guide for doing this write, that is not implementing Microsoft COM?).
I am aware of Stefanus DuToit's Hourglass Pattern:
https://www.youtube.com/watch?v=PVYdHDm0q6Y
I worry that it is a lot of code duplication.
I'm not complaining about the state of things, I just want to understand the "right" way and hopefully this will serve as a good question for others in similar position.
I have reviewed these Stackoverflow references:
When to use dynamic vs. static libraries
How do I safely pass objects, especially STL objects, to and from a DLL?
Distributing Windows C++ library: how to decide whether to create static or dynamic library?
Static library API question (std::string vs. char*)
Easy way to guarantee binary compatibility for C++ library, C linkage?
Also have reviewed:
https://www.acodersjourney.com/cplusplus-static-vs-dynamic-libraries/
https://blogs.msmvps.com/gdicanio/2016/07/11/the-perils-of-c-interface-dlls/
Given your requirements, you'll need a static library (e.g. .lib under windows) that is linked into your program during the build phase.
The interface to that library will need to be in a header file that declares types and functions.
You might choose to distribute as a set of libraries and header files, if they can be cleanly broken into separate pieces - if you manage the dependencies between the pieces. That's optional.
You won't be able to define your own templated functions or classes, since (with most compilers) that requires distributing source for those functions or classes. Or, if you do, you won't be able to use them as part of the interface to the library (e.g. you might use templated functions/classes internally within the library, but not expose them in the library header file to users of the library).
The main down side is that you will need to build and distribute a version of the library for each compiler and host system (in combination that you support). The C++ standard specifically encourages various types of incompatibilities (e.g. different name mangling) between compilers so, generally speaking, code built with one compiler will not interoperate with code built with another C++ compiler. Practically, your library will not work with a compiler other than the one it is built with, unless those compilers are specifically implemented (e.g. by agreement by both vendors) to be compatible. If you support a compiler that supports a different ABI between versions (e.g. g++) then you'll need to distribute a version of your library built with each ABI version (e.g. the most recent version of the compiler that supports each ABI)
An upside - which follows from having a build of your library for every compiler and host you support - is that there will be no problem using the standard library, including templated types or functions in the standard library. Passing arguments will work. Standard library types can be member of your classes. This is simply because the library and the program using it will be built with the same compiler.
You will need to be rigorous in including standard headers correctly (e.g. not relying on one standard header including another, unless the standard says it actually does - some library vendors are less than rigorous about this, which can cause your code to break when built with different compilers and their standard libraries).
There will mostly be no need for "Debug" and "Release" versions (or versions with other optimisation settings) of your library. Generally speaking, there is no problem with having parts of a program being linked that are compiled with different optimisation settings. (It is possible to cause such things to break,if you - or a programmer using your library in their program - use exotic combinations of build options, so keep those to a minimum). Distributing a "Debug" version of your library will permit stepping through your library with a debugger, which seems counter to your wishes.
None of the above prevents you using custom allocators, but doesn't require it either.
You will not need to recreate COM unless you really want to. In fact, you should aim to ensure your code is as standard as possible - minimise use of compiler-specific features, don't make particular assumptions about sizes of types or layout of types, etc. Using vendor specific features like COM is a no-no, unless those features are supported uniformly by all target compilers and systems you support. Which COM is not.
I don't know if there is a "standard/canonical" way of doing this. Writing code that works for multiple compilers and host systems is a non-trivial task, because there are variations between how different compiler vendors interpret or support the standard. Keeping your code simple is best - the more exotic or recent the language or standard library feature you use, the more likely you are to encounter bugs in some compilers.
Also, take the time to set up a test suite for your library, and maintain it regularly, to be as complete as possible. Test your library with that suite on every combination of compiler/system you support.
Provide a native C++ API that users may instantiate C++ classes, pass data around, all in C++, without a C-only wraper
This excludes COM.
Allow methods to take as parameters and return C++ objects, especially STL types (std::string, std::vector, etc)
This excludes DLLs
Most industry standard / canonical method of doing this possible, if such a standard exists
Not something "standard", but common practises are there. For example, in DLL, pass only raw C stuff.
No re-creating COM or using MS COM
This requires DLL/COM servers
Assume all C++ compilers are at least C++11 "compliant"
Perhaps. Normally yes.
Generally: If source is to be available, use header only (if templates) or h +cpp.
If no source, the best is DLL. Static libraries - you have to build for many compilers and one has to carry on your lib everywhere and link to it.
On linux, gcc uses libstdc++ for the c++ standard library while clang can use libc++ or libstdc++. If your users are building with clang and libc++, they won't be able to link if you only build with gcc and libstdc++ (libc++ and libstdc++ are not binary compatible). So if you wan't to get target both you need two version of you library, one for libstdc++, and another for libc++.
Also, binaries on linux (executable, static library, dynamic library) are not binary compatible between distros. It might work on your distro and not work on someone else distro or even a different version of your distro. Be super careful to test that it work on whichever distro you want to target. Holy Build Box could help you to produce cross-distribution binaries. I heard good thing about it, just never tried it. Worst case you might need to build on all the linux distro you want to support.
https://phusion.github.io/holy-build-box/

C++ ABI issue related to STL

I have searched the net without any conclusive answers to issue related to lack of C++ ABI when it comes to exporting c++ classes accross dll boundaries in windows.
I can use extern c and provide a c like api to a library , but I would like end users to be able to use classes that us stl containers.
What generic patterns do you use for exporting a class that uses stl container across dll boundaries , in a safe manner? e.g. best practice.
This question is for experienced library authors.
Regards
Niladri
There's no defined C++ ABI and it does differ between compilers in terms of memory layout, name mangling, RTL etc.
It gets worse than that though. If you are targeting MSVC compiler for example, your dll/exe can be built with different macros that configure the STL to not include iterator checks so that it is faster. This modifies the layout of STL classes and you end up breaking the One Definition Rule (ODR) when you link (but the link still succeeds). If the ODR is violated then your program will crash seemingly randomly.
I would recommend maybe reading Imperfect C++ which has a chapter on the subject of C++ ABIs.
The upshot is that either:
you compile the dll specifically for the target compiler that is going to link to it and name the dll's appropriately (like boost does it). In this case you can't avoid ODR violations so the user of the library must be able to recompile the library themselves with different options.
Provide a portable C API and provide C++ wrapper classes for convenience that are compiled on the client side of the API. This is time consuming but portable.
Take a look at CppComponents https://github.com/jbandela/cppcomponents
You can export classes and interfaces across dll boundaries even across different compilers.
The library is header-only so nothing to build. To use it, you will need MSVC 2013 and/or GCC 4.8 as it uses C++11 variadic templates extensively.
See presentation from C++Now
https://github.com/boostcon/cppnow_presentations_2014/blob/master/files/cppnow2014_bandela_presentation.pdf?raw=true
https://github.com/jbandela/cppcomponents_cppnow_examples has the examples from the presentation.

Creating a DLL with a different C++ environment

I have a library I'm building which is targeted to be a DLL that is linked into the main solution.
This new DLL is quite complex and I'd like to make use of C++11 features, while the program that will link it most certainly does not. In fact, the main program is currently "cleanly" built using VS2008 and VS2010 (and i think GCC 4.3 for linux?).
What I propose:
Using VS2012 as the IDE and Intel C++ Compiler 2013 for compilation to .dll/.so - for linux - which, as I understand, is basically down to machine form (like an .exe).
While I'm familiar with using C++ to solve problems, I am not fluent in the fundamentals of compilation/linking, etc. Therefore, I'd like to ask the community if
This is possible
If it is possible, how easy is it (as simple as I described?) / what pitfalls or issues can I expect along the way (is it worth it)?
Areas of concern I anticipate:
runtime libraries - I expect this to be the factor that derails this effort. I know nothing about them/how they work except that they might be a problem.
Standard Library implementation differences - should it matter if it's down to DLL form?
threading conflicts - the dll threads and the main programs threads never modify the same data, and actually one of the main program's threads will call the DLL functions.
Bonus: While the above is the route I expect to take, I'd ideally like to have this code open for intellisense, general viewing, etc (essentially for it to become a project in the main solution). Is there a way to specify different runtime libraries/compiler? Can this be done?
EDIT: The main reason for this bonus part is to eliminate the necessary "versioning" conflicts that will arise if the main program and this library are built separately.
NOTE: I'm not using C++11 just for the sake of being newer - strongly typed enums and cross-platform threading code will be huge bonuses for the library.
The question isn't so much "Can an application use a library built with a different compiler ?" (The answer is yes.) but "What C++ features can be used in the public interface of a library built with another compiler and C++ standard library?"
On Windows, the answer is "almost none". Interfaces (classes containing only virtual functions) are about it. No classes with data members. No exceptions. No runtime objects (like iostream instances or strings). No templates.
On Linux, the answer is "lots more but still not many". Classes are ok, as long as the ODR is satisfied. Exceptions will work. Templates too, as long as the definition is exactly the same on both sides. But definitions of standard library types did change between C++03 and C++11, so you won't for example be able to pass std::string or std::vector<int> objects between the application and library (both sides can use these features, but the same object can't cross over).
I'm afraid this is not possible with C++. Especially name mangling can be different. All C++ files linked together need to be compiled with same compiler.
In C++, extern "C" stuff is standard (naming, calling convention), so C libraries can be called from C++, as well as C++ functions declared withing extern "C" block. This exludes classes, templates, overloads, mixing them compiled by different compilers is not workable.
Which is a pity.

How portable IS C++?

In C++, if I write a simple game like pong using Linux, can that same code be compiled on Windows and OSX? Where can I tell it won't be able to be compiled?
You have three major portability hurdles.
The first, and simplest, is writing C++ code that all the target compilers understand. Note: this is different from writing to the C++ standard. The problem with "writing to the standard" starts with: which standard? You have C++98, C++03, C++TR1 or C++11 or C++14 or C++17? These are all revisions to C++ and the newer one you use the less compliant compilers are likely to be. C++ is very large, and realistically the best you can hope for is C++98 with some C++03 features.
Compilers all add their own extensions, and it's all too easy to unknowingly use them. You would be wise to write to the standard and not to the compiler documentation. Some compilers have a "strict" mode where they will turn off all extensions. You would be wise to do primary development in the compiler which has the most strictures and the best standard compliance. gcc has the -Wstrict family of flags to turn on strict warnings. -ansi will remove extensions which conflict with the standard. -std=c++98 will tell the compiler to work against the C++98 standard and remove GNU C++ extensions.
With that in mind, to remain sane you must restrict yourself to a handful of compilers and only their recent versions. Even writing a relatively simple C library for multiple compilers is difficult. Fortunately, both Linux and OS X use gcc. Windows has Visual C++, but different versions are more like a squabbling family than a single compiler when it comes to compatibility (with the standard or each other), so you'll have to pick a version or two to support. Alternatively, you can use one of the gcc derived compiler environments such as MinGW. Check the [list of C++ compilers](less compliant compilers are likely to be) for compatibility information, but keep in mind this is only for the latest version.
Next is your graphics and sound library. It has to not just be cross platform, it has to look good and be fast on all platforms. These days there's a lot of possibilities, Simple DirectMedia Layer is one. You'll have to choose at what level you want to code. Do you want detailed control? Or do you want an engine to take care of things? There's an existing answer for this so I won't go into details. Be sure to choose one that is dedicated to being cross platform, not just happens to work. Compatibility bugs in your graphics library can sink your project fast.
Finally, there's the simple incompatibilities which exist between the operating systems. POSIX compliance has come a long way, and you're lucky that both Linux and OS X are Unix under the hood, but Windows will always be the odd man out. Things which are likely to bite you mostly have to do with the filesystem. Here's a handful:
Filesystem layout
File path syntax (ie. C:\foo\bar vs /foo/bar)
Mandatory Windows file locking
Differing file permissions systems
Differing models of interprocess communication (ie. fork, shared memory, etc...)
Differing threading models (your graphics library should smooth this out)
There you have it. What a mess, huh? Cross-platform programming is as much a state of mind and statement of purpose as it is a technique. It requires some dedication and extra time. There are some things you can do to make the process less grueling...
Turn on all strictures and warnings and fix them
Turn off all language extensions
Periodically compile and test in Windows, not just at the end
Get programmer who likes Windows on the project
Restrict yourself to as few compilers as you can
Choose a well maintained, well supported graphics library
Isolate platform specific code (for example, in a subclass)
Treat Windows as a first class citizen
The most important thing is to do this all from the start. Portability is not something you bolt on at the end. Not just your code, but your whole design can become unportable if you're not vigilant.
C++ is ultra portable and has compilers available on more platforms than you can shake a stick at. Languages like Java are typically touted as being massively cross platform, ironically they are in fact usually implemented in C++, or C.
That covers "portability". If you actually mean, how cross platform is C++, then not so much: The C++ standard only defines an IO library suitable for console IO - i.e. text based, so as soon as you want develop some kind of GUI, you are going to need to use a GUI framework - and GUI frameworks are historically very platform specific. Windows has multiple "native" GUI frameworks now - the C++ framework made available from Microsoft is still MFC - which wraps the native Win32 API which is a C API. (WPF and WinForms are available to CLR C++).
The Apple Mac's GUI framework is called Cocoa, and is an objective-C library, but its easy to access Objective C from C++ in that development environment.
On Linux there is the GTK+ and Qt frameworks that are both actually ported to Windows and Apple, so one of these C++ frameworks can solve your "how to write a GUI application in C++ once that builds and runs on windows, apple mac and linux".
Of course, its difficult to regard Qt as strictly C++ anymore - Qt defines a special markup for signals and slots that requires a pre-compile compile step.
You can read the standard - if a program respects the standard, it should be compilable on all platforms that have a C++ standard-compliant compiler.
As for 3rd party libraries you might be using, the platform availability is usually specified in the documentation.
When GUI comes to question, there are cross-platform options (such as QT), but you should probably ask yourself - do I really want portability when it comes to UI? Sometimes, it's better to have the GUI part platform-specific.
If you are thinking of porting from Linux to Windows, using OPENGL for the graphical part gives you freedom to run your program on both operating systems as long as you don't use any system specific functionality.
Compared to C, C++ portability is extremely limited, if not completely unexisting. For one you can't disable exceptions (well you can), for the standard specifically says that's undefined behaviour. Many devices don't even support exceptions. So as for that, C++ is ZERO portable. Plus seeing the UB, it's obvioulsy a no-go for zero-fail high-performance real time systems in which exceptions are taboo - undefined behaviour has no place in zero-fail environment. Then there's the name mangling which most, if not every, compiler does completely different. For good portability and inter-compatibility extern "C" would have to be used to export symbols, yet this renders any and all namespace information completely void, resulting in duplicate symbols. One can ofcourse choose to not use namespaces and use unique symbol names. Yet another C++ feature rendered void. Then there's the complexity of the language, which results in implementation difficulties in the various compilers for various architectures. Due to these difficulties, true portability becomes a problem. One can solve this by having a large chain of compiler directives/#ifdefs/macros. Templates? Not even supported by most compilers.
What portability? You mean the semi-portability between a couple of main-stream build targets like MSVC for Windows and GCC for Linux? Even there, in that MAIN-STREAM segment, all the above problems and limitations exist. It's retarded to even think C++ is portable.

Preparing a library for plugin support

I searched for this particular question but could not come up with any results, neither here nor on-line in general (maybe because it is a little harder to phrase for me). If it has already been asked, please point me in the right direction.
I am at a point where I would like my libraries/software to be pluggable. I see all these various libraries and systems where plugins are used extensively and the authors boastfully point out (in a good way!) that their software has plugin support.
So my question is, where do I start? Are there any books/on-line-resources that break the ice and may guide one on the do's and dont's of making your library pluggable, define best practices etc.?
You have to understand some things before starting :
There is no support for modules (static or dynamic) in standard C++. Nope. Not yet. Maybe in 2015.
Dlls (or .so on unix-like systems) are dynamically loaded libraries that are compiler/os dependant. So it's a pragmatic solution that fill the need.
So, you'll have to use shared libraries (whatever the file extension, it's the keyword for searches about this subject) as plugin binaries. If your plugin should contain more than runtime code, like graphic resources, you can include your graphic resources in the binary, or have a file format or compressed archibe that contain the binary file.
Whatever the way you setup your plugin files, in C++ the problem is about the interface.
Depending on wich compiler you use, you'll have different ways to "tag" functions or classes as exported/imported (meaning your plugin source code export the code and the user of the plugin should import the code).
Setup clean and clear interface in C++ for the modules, with no templates (because they are compiler and compiler configuration dependant). Those interfaces should be function declarations and class declarations with no inline code and marked exported/imported.
Now, once you've got this, you can use OS-specific API to load/unload dynamic library binaries while the application is running. Once it's done, you can get pointers to functions, again using the OS-specific API. I let you search for it.
Now, there are libraries that provide ways to abstract this in a cross-platform way. I didn't use them yet and they are known to be unperfect because of lack of definitions in the C++ standard, but they could be useful if you're planning to have your application cross-platform:
boost::extension : it's not yet a boost library, nor even proposed yet, and it's developpement is in pause (until some new standard C++ implementations are done) so it might be a bad idea but a lot of people say they use it with success.
POCO libraries have a library for shared libraries that would be the equivalent of boost::extension. Again lot of people say it's useful so I guess it's good enough to be used.
The other alternative, that is easy to setup if you don't need to support tons of target platforms, is to just write some wrapper code around OS-Specific APIs. That's what I did before knowing about boost::extension for example.