RTTI and Portability in C++

If a compiler doesn't "support" RTTI, does that mean that the compiler cannot handle class hierarchies that have virtual functions in them? Or have I been misunderstanding the literature about how RTTI isn't portable, and the issues lie elsewhere?
Thank you all for your comments!

This is probably way more of an answer than you were looking for, but here goes:
Saying that RTTI is not "portable" means that if you use compiler A to build dynamic library A, and compiler B to build application B that links against it, then you cannot use RTTI, because the RTTI implementations of compilers A and B are different. Virtual functions are affected only because the virtual function mechanism may not be binary compatible either.
This issue was very important in the mid-90's, but it is now obsolete. Not because compilers have all become binary compatible with each other, but rather the opposite: C++ developers have now recognized that C++ libraries must be delivered as source code, not as linkable binaries. For those who view C++ as an extension of C, this is very discomforting, but for more modern programmers, who grew up in an open source environment, it is nothing special at all.
What changed between the mid-90's and now is attitudes about what constitutes valuable intellectual property and what doesn't. To wit: there is actually a patent registered with the USPTO on "expression templates." Even its holder realizes that the patent is unenforceable.
C-style "header and binary" libraries were long seen as a way to protect valuable source code. More and more, businesses came to recognize that the obfuscation was more self-defeating than protective: there is very little code out there that actually meets "valuable IP" status. Most people buy libraries not because of the special IP they contain, but because buying is cheaper than rolling their own. In fact, expertise in applying IP is far more valuable than the IP itself. But if no one cares about this IP because they don't know about it, then it is not worth very much.
This is how open source works: IP is freely distributed, and in return the distributors earn consultancy fees for applying that IP. Those who can figure it out for themselves profitably, well, good on them. But that is not the norm. What actually happens is that a developer understands the IP and sells their employer on buying the product that implements it. Yes, whole "developer communities" are founded on this premise.
To make a long story short: binary (and consequently RTTI) compatibility went the way of the dinosaur once the open source movement took off and, concurrently, C++ template libraries became the norm. C++ libraries long ago became "source distributable only," like Perl, Python, JavaScript, etc. To make your C++ compiler work with all the source you compile with it, make sure that RTTI is turned on (indeed, all standard C++ features, like exceptions), and that all the C++ libs you link against are compiled with the same options you used to compile your app.
There is one (and only one) compiler I know of that does not enable RTTI by default, and that is because there are other legacy ways to do the same thing. To read about these, pick up a copy of Don Box's excellent work "Essential COM."

RTTI is not needed for virtual functions.
It is mainly used for dynamic_cast and typeid.
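For illustration, here is a minimal sketch (class names invented) of the two features that actually need RTTI; the virtual call itself keeps working even when RTTI is disabled:

#include <iostream>
#include <typeinfo>

struct Animal { virtual ~Animal() = default; };       // polymorphic base
struct Dog : Animal { void bark() { std::cout << "woof\n"; } };

int main() {
    Animal* a = new Dog;
    if (Dog* d = dynamic_cast<Dog*>(a))      // checked downcast: needs RTTI
        d->bark();
    std::cout << typeid(*a).name() << '\n';  // runtime type query: needs RTTI
    delete a;
}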

The only part of RTTI that is unportable is the format of strings returned from type_info::name().
Even this has a fighting chance, so long as you can find a c++filt tool for your compiler that converts (demangles) such a string back into a compliant C++ type.
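As a hedged example: on GCC and Clang (Itanium ABI) the string can even be demangled in-process with abi::__cxa_demangle, which is compiler-specific rather than standard C++; the same string can also be fed to the command-line tool as c++filt -t:

#include <cstdlib>
#include <cxxabi.h>   // GCC/Clang only
#include <iostream>
#include <typeinfo>
#include <vector>

int main() {
    const char* mangled = typeid(std::vector<int>).name(); // e.g. "St6vectorIiSaIiEE"
    int status = 0;
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::cout << (status == 0 ? readable : mangled) << '\n';
    std::free(readable);  // __cxa_demangle returns malloc'd memory
}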

If a compiler doesn't "support" RTTI, does that mean that the compiler cannot handle class hierarchies that have virtual functions in them?
Generally, all modern C++ compilers support RTTI... so forget about it.
Or have I been misunderstanding the literature about how RTTI isn't portable, and the issues lie elsewhere?
RTTI today is portable and works fine with any modern compiler... however, some special cases can occur.
On ELF platforms (Linux), when you load libraries dynamically (i.e., with dlopen) and try to perform a dynamic_cast to some class across the boundary between library and executable, it may fail if you do not pass the correct flags when linking the executable (-rdynamic).
In almost any other case... it just works.
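A sketch of that ELF special case (all names invented): the plugin hands back a Base*, and without -rdynamic the typeinfo of the derived interface may exist twice, once in the executable and once in the library, making the cast fail:

// plugin_api.h, shared by the executable and the plugin
struct Base { virtual ~Base() {} };
struct Plugin : Base { virtual void run() = 0; };

// main.cpp -- build with: g++ main.cpp -rdynamic -ldl
#include <dlfcn.h>

int main() {
    void* handle = dlopen("./libplugin.so", RTLD_NOW);
    if (!handle) return 1;
    auto create = reinterpret_cast<Base* (*)()>(dlsym(handle, "create_plugin"));
    Base* b = create();
    // Without -rdynamic, the executable's typeinfo for Plugin may not be
    // unified with the plugin's copy, and this cast can return nullptr:
    if (Plugin* p = dynamic_cast<Plugin*>(b))
        p->run();
    dlclose(handle);
    return 0;
}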

Related

Why doesn't the compiler help us with type traits instead of resorting to language quirks?

Type traits are cool and I've used them since they originated in Boost a few years ago. However, when you look at their implementation, it is all language quirks (check out the "How does is_base_of work?" StackOverflow thread).
Why won't the compiler help here? For example, if you want to check whether some class is a base of another, the compiler already knows that, so why can't it tell us? This would make things like concepts so much easier to implement and use. You could use language constructs right there.
I am not sure, but I am guessing that it would also improve performance in general. It is like asking the compiler for help instead of the C++ language.
I suspect that the primary reason will sound something like "we need to maintain backwards compatibility," and I agree, but why shouldn't the compiler be more active in generating generic templated code?
Actually... some do.
The thing is that if something can be implemented in pure C++ code, there is no reason, other than simplifying the code, to hard-wire it into the compiler. It then becomes a matter of trade-offs: is the value brought by the code simplification worth the hard-wiring?
This depends on several points:
correctness (sometimes a library can only partially emulate the trait)
complexity of the code, which translates into maintenance burden
performance
...
Once all those points have been weighed, you can determine whether it's more advantageous to put things in the library or in the compiler; and the most likely outcome is a mixed strategy: a couple of intrinsics in the compiler used as building blocks to provide the required interface in the library.
Note that the maintenance burden is much more significant in a compiler: any C++ developer sufficiently acquainted with the language can delve into a library implementation, whereas the compiler's code is a black box. Therefore, it'll be much easier to debug and patch the library than the compiler, so there is an incentive not to put things in the compiler unless you have a very good reason to.
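To make the trade-off concrete, here is a much-simplified sketch of the library-side quirk next to the intrinsic route. The naive version below mishandles private and ambiguous bases; coping with those is exactly the complexity being weighed, and it is why GCC, Clang and MSVC all expose an __is_base_of intrinsic:

#include <type_traits>

struct B {};
struct D : B {};

// Library side: overload resolution does the work (simplified, non-conforming).
template <typename Base, typename Derived>
struct naive_is_base_of {
    static char probe(const volatile Base*);   // chosen if Derived* converts to Base*
    static long probe(const volatile void*);   // fallback otherwise
    static const bool value =
        sizeof(probe(static_cast<Derived*>(nullptr))) == sizeof(char);
};

// Compiler side: just wrap the intrinsic.
template <typename Base, typename Derived>
struct intrinsic_is_base_of
    : std::integral_constant<bool, __is_base_of(Base, Derived)> {};

static_assert(naive_is_base_of<B, D>::value, "library version");
static_assert(intrinsic_is_base_of<B, D>::value, "intrinsic version");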
It's hard to give an objective answer here, but I suspect the following.
The code using language quirks to find out this stuff has often already been written (Boost, etc).
The compiler does not have to be changed to implement this if it can be done with language quirks (which saves a lot of time in writing, compiling, debugging and testing).
It's basically a "don't fix what isn't broken" mentality.
Compiler help for type traits has always been a design goal. TR1 formally introduced type traits, and included a section that described acceptable incorrect results in some cases to enable writing the type traits in straight C++. When those type traits were added to C++11 (with some name changes that don't affect their implementation) the allowance for incorrect results was removed, effectively requiring compiler help to implement some of them. And even for those that can be implemented in straight C++, compiler writers prefer intrinsics to complicated templates so that you don't have to put a drip pan under your computer to catch the slag as the overworked compiler causes the computer to melt down.
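std::is_union is a concrete example from that category: no pure-C++ trick can tell a union apart from a class, so library implementations simply wrap a compiler intrinsic (GCC, Clang and MSVC all spell it __is_union):

#include <type_traits>

template <typename T>
struct my_is_union : std::integral_constant<bool, __is_union(T)> {};

union U { int i; float f; };
static_assert(my_is_union<U>::value, "detected via the intrinsic");
static_assert(!my_is_union<int>::value, "non-unions rejected");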

What problems can appear when using G++ compiled DLL (plugin) in VC++ compiled application?

I use an application compiled with the Visual C++ compiler. It can load plugins in the form of a .dll. It is rather unimportant what exactly it does; the relevant fact is:
this includes calling functions from the .dll that return a pointer to an object of the application's API, etc.
My question is: what problems may appear when the application calls a function from the .dll, retrieves a pointer from it, and works with it? For example, something that comes to mind is the size of a pointer. Is it different in VC++ and G++? If so, would this crash the application?
I don't want to use the Visual Studio IDE (which is unfortunately the "preferred" way to use the application's SDK). Can I configure G++ to compile like VC++?
PS: I use MinGW GNU G++
As long as both application and DLL are compiled on the same machine, and as long as they both only use the C ABI, you should be fine.
What you can certainly not do is share any sort of C++ construct. For example, you mustn't new[] an array in the main application and let the DLL delete[] it. That's because there is no fixed C++ ABI, and thus no way in which any given compiler knows how a different compiler implements C++ data structures. This is even true for different versions of MSVC++, which are not ABI-compatible.
All C++ language features are going to be entirely incompatible, I'm afraid. Everything from the name-mangling to memory allocation to the virtual-call mechanism are going to be completely different and not interoperable. The best you can hope for is a quick, merciful crash.
If your components only use extern "C" interfaces to talk to one another, you can probably make this work, although even there, you'll need to be careful. Both C++ runtimes will have startup and shutdown code, and there's no guarantee that whichever linker you use to assemble the application will know how to include this code for the other compiler. You pretty much must link g++-compiled code with g++, for example.
If you use C++ features with only one compiler, and use that compiler's linker, then it gets that much more likely to work.
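As a hedged sketch of what such an extern "C" boundary can look like (every name here is invented): an opaque handle plus flat functions, with creation and destruction both kept inside the DLL:

/* widget.h -- the only header shared by the g++-built DLL and the VC++ EXE */
#ifdef __cplusplus
extern "C" {
#endif

typedef struct widget widget;    /* opaque handle: the EXE never looks inside */

widget* widget_create(int size);                 /* allocates inside the DLL */
int     widget_process(widget* w, const char* data);
void    widget_destroy(widget* w);               /* frees inside the same DLL */

#ifdef __cplusplus
}
#endif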
This should be OK if you know what you are doing. But there are some things to watch out for:
I'm assuming the interface between EXE and DLL is a "C" interface, or something COM-like where the only C++ classes exposed are through pure-virtual interfaces. It gets messier if you are exporting a concrete class through a DLL.
32-bit vs. 64-bit. A 32-bit app won't load a 64-bit DLL and vice versa. Make sure they match.
Calling convention. __cdecl vs. __stdcall. Oftentimes Visual Studio apps are compiled with flags that assume __stdcall as the default calling convention (or the function prototype explicitly says so). So make sure that g++ generates code that matches the calling convention expected by the EXE. Otherwise, the exported function might run, but the stack can get trashed on return. If you debug through a crash like this, there's a good chance the cdecl vs. stdcall convention was incorrectly specified. Easy to fix.
C runtimes will likely not be shared between the EXE and DLL, so don't mix and match. A pointer allocated with new or malloc in the EXE should not be released with delete or free in the DLL (and vice versa). Likewise, FILE handles returned by fopen() cannot be shared between the EXE and DLL. You'll likely crash if any of this happens... which leads me to my next point....
C++ header files with inline code cause enough headaches and are the source of the issues I called out in #3. You'll be OK if the interface between DLL and EXE is a pure "C" interface.
Name-mangling issues. If you run into issues where the exported function name doesn't match because of name mangling or a leading underscore, you can fix that up in a .DEF file. At least that's what I've done in the past with Visual Studio; I'm not sure if an equivalent exists in g++/MinGW. Example below. Learn to use "dumpbin.exe /exports" so you can validate that your DLL exports functions with the right names. Using extern "C" will help fix this as well.
EXPORTS
FooBar=_FooBar@12
BlahBlah=??BlahBlah@@QAE@XZ @236 NONAME
Those are the issues that I know of. I can't tell you much more since you didn't explain the interface between the DLL and EXE.
The size of a pointer won't vary; that is dependent on the platform and module bitness and not the compiler (32-bit vs 64-bit and so on).
What can vary is the size of basically everything else, and what will vary are templates.
Padding and alignment of structs tends to be compiler-dependent, and often settings-within-compiler dependent. There are some loose rules, like pointers typically being aligned on a platform-bitness boundary and bools having 3 bytes of padding after them, but it's up to the compiler how to handle that.
Templates, particularly from the STL (which is different for each compiler), may have different members, sizes, padding, and most anything else. The only standard part is the API; the backend is left to the STL implementation (there are some rules, but compilers can still compile templates differently). Passing templates between modules from one build is bad enough, but between different compilers it can often be fatal.
Things which aren't standardized (name mangling) or are highly specific by necessity (memory allocation) will also be incompatible. You can get around both of those issues by destroying objects only from the library that created them (good practice anyway), using STL objects that take a deleter for allocation, and exporting methods using undecorated names and/or C style (extern "C").
I also seem to remember a catch with how the compilers handle virtual destructors in the vtable, with some small difference.
If you can manage to pass only references to your own objects, avoid externally visible templates entirely, and work primarily with pointers and exported or virtual methods, you can avoid the vast majority of the issues (COM does precisely this, for compatibility with most compilers and languages). It can be a pain to write, but if you need that compatibility, it is possible.
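A hedged sketch of that COM-like shape (names invented): nothing but virtual calls crosses the boundary, and destruction is routed back to the module that allocated the object:

struct IRenderer {
    virtual void draw(const char* mesh_name) = 0;
    virtual void release() = 0;   // instead of delete: frees in the creating module
protected:
    ~IRenderer() {}               // clients cannot delete through the interface
};

// The single flat entry point, exported without C++ name mangling:
extern "C" IRenderer* create_renderer();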
To alleviate some, but not all, of the issues, using an alternative to the STL (like Qt's core library) will remove that particular problem. While throwing Qt into any old project is a hideous waste and will cause more bloat than the "boost ALL THE THINGS!!!" philosophy, it can be useful for decoupling the library and the compiler to a greater extent than a stock STL can.
You can't pass C runtime objects between them. For example, you cannot open a FILE buffer in one and pass it to be used in the other. You can't free memory that was allocated on the other side.
The main problems are the function signatures and the way parameters are passed to library code. I've had great difficulty getting VC++ DLLs to work with GNU-based compilers in the past. This was way back when VC++ always cost money and MinGW was the free solution.
My experience was with the DirectX APIs. Slowly, a subset got its binaries modified by enthusiasts, but it was never as up-to-date or reliable, so after evaluating it I switched to a proper cross-platform API: SDL.
This wikipedia article describes the different ways libraries can be compiled and linked. It is rather more in depth than I am able to summarise here.

How to hide C++ source code from customers

I wish to send some components to my customers. The reasons I want to deliver source code are:
1) My class is templatized. The customer might use any template argument, so I can't pre-compile and send a .o file.
2) The customer might use a different gcc compiler version than mine, so I want him to do the compilation at his end.
Now, I can't reveal my source code, for obvious reasons. The most I can do is reveal the .h file. Any ideas how I may achieve this? I am thinking about some hooks in gcc that support decryption before compilation, etc. Is this possible?
In short, I want him to be able to compile this code without being able to peek inside.
Contract = good, obfuscation = ungood.
That said, you can always use a kind of PIMPL idiom to serve your customer with binaries and just templated wrappers in the header(s). The idea is to use an "untyped" separately compiled implementation, where the templated wrapper just provides type safety for client code. That's how one often did things before compilers learned to optimize templates, in order to avoid machine-code-level bloat. But it only provides some measure of protection against trivial copy-and-paste theft, not any protection against someone willing to delve into the machine code.
But perhaps the effort is then greater than just reinventing your functionality?
Just adding some terminology to Alf's answer: the thin template idiom is what you might look at. It basically simulates the functionality of a generic. Don't get confused by the Wikipedia article that pops up in Google; you don't have to use void*...
This, of course, does not guarantee binary compatibility. As usual with 'native' C++, you either compile the component for the customer's platform yourself and deploy the binary, or give them your code... The difference from pure generic component code is that you can do the former at all.
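A hedged sketch of the thin-template shape (names invented): the untyped core ships precompiled, and only a trivial type-safe wrapper goes in the public header:

#include <cstddef>

// Shipped precompiled; its source stays private.
class StackImpl {
public:
    explicit StackImpl(std::size_t elem_size);
    void push_bytes(const void* elem);   // copies elem_size bytes
    void pop_bytes(void* out);
};

// Shipped in the public header; reveals nothing but glue.
// (This simplified form is only safe for trivially copyable T.)
template <typename T>
class Stack {
    StackImpl impl_;
public:
    Stack() : impl_(sizeof(T)) {}
    void push(const T& value) { impl_.push_bytes(&value); }
    T pop() { T out; impl_.pop_bytes(&out); return out; }
};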
Maybe some C++ obfuscators would help: http://www.semdesigns.com/products/obfuscators/CppObfuscationExample.html or Mangle It.
First, if you're going to provide the source code, then you have to provide the source code. Sure, you could encrypt it, but even if GCC had a "decrypt before compile" option, it would need to decrypt the code, and if GCC can decrypt the code, so can your customer.
What you're asking is impossible. (If you find a way to do it, I believe the movie industry might have a multi-million contract for you. They currently have to resort to expensive custom hardware to prevent people from ripping content, and that only works to a limited degree)
As for your "obvious reasons" why you don't want to provide the source code, I don't see why they're obvious. What would happen if you provided the source code?
You have two options:
provide the source code in its entirety, or
compile everything that can be precompiled into a (static or dynamic) library, and provide your customer with that, plus the header files.
what about pimpls?
1) My class is templatized. The customer might use any template argument, so I can't pre-compile and send a .o file.
2) The customer might use a different gcc compiler version than mine, so I want him to do the compilation at his end.
Now, I can't reveal my source code, for obvious reasons. The most I can do is reveal the .h file. Any ideas how I may achieve this? I am thinking about some hooks in gcc that support decryption before compilation, etc. Is this possible?
In short, I want him to be able to compile this code without being able to peek inside.
Consideration 2) above encompasses A) ABI differences, such that the same code compiled with different compiler versions/vendors on the same platform is incompatible, as well as B) differences in system libraries, kernel versions, etc. that the code might depend on. The only general solution is to compile on the specific platforms. Either you do it for all platforms, or you give them all the source code and they do it. That's not just the headers and template implementation; that's your out-of-line functions too. You might mitigate A) a little by building a wall of more interoperable extern "C" functions, but you're basically stuck when it comes to B).
So, can you decrypt during compilation? Only if you ship your own hacked GCC binaries to them, built for their specific system, which is probably more hassle than providing different builds of your own libraries (though it may address the template/header exposure issue).
Alternatively, you could employ source code obfuscation techniques. This is, practically speaking, as good as it gets. I don't know what tools are out there, but it's an approach that people have pursued for decades (though I'm yet to hear anyone recommend it), so there are sure to be some mature tools.
Re templated code: other people have suggested a templated front end to a C-style generic implementation shipped as a precompiled object. That may or may not be practical (it clearly risks performance degradation, and you have to capture the set of type-specific operations you want, e.g. by instantiating a type-specific class derived from an abstract operations base class), but either way the precompiled object still runs afoul of B).
One other thought: clients might take your source code, but they are unlikely to understand it as well as you do. Even if they build more systems dependent on their version of it, in a way they're getting more locked in, and they may have more need of your services in the future. And if you see they've not played fair, you can charge them for it appropriately when the time comes.
It seems that gcc 4.5 comes with support for plugins. So you can provide your own .so which would, for instance, be called before the compilation stage starts. That way you can have all kinds of tricks (decryption of the source file) in there, neatly hidden. This would also be a portable solution, as no change is made to g++ per se.
This is exactly what I was looking for. You can read more here:
http://www.codesynthesis.com/~boris/blog/2010/05/03/parsing-cxx-with-gcc-plugin-part-1/

Where do I learn "what I need to know" about C++ compilers?

I'm just starting to explore C++, so forgive the newbiness of this question. I also beg your indulgence on how open ended this question is. I think it could be broken down, but I think that this information belongs in the same place.
(FYI -- I am working predominantly with the QT SDK and mingw32-make right now and I seem to have configured them correctly for my machine.)
I knew that there was a lot in the language which is compiler-driven -- I've heard about pre-compiler directives, but it seems like someone could write books about the different C++ compilers and their respective parameters. In addition, there are commands which apparently precede make (like qmake, for example; is this something only in QT?).
I would like to know if there is any place which gives an overview of what compilers are out there and what their different options are. I'd also like to know how each of them views makefiles (it seems that there is a difference in syntax between them?).
If there is no website regarding, "Everything you need to know about C++ compilers but were afraid to ask," what would be the best way to go about learning the answers to these questions?
Concerning the "numerous options of the various compilers"
A piece of good news: you needn't worry about the details of most of these options. You will, in due time, delve into them, but only for the very compiler you use, and maybe only for the options that pertain to a particular set of features. As a novice, generally trust the default options or the ones supplied with the make files.
The broad categories of these features (and I may be missing a few) are:
pre-processor defines (now, you may need a few of these)
code generation (target CPU, FPU usage...)
optimization (hints for the compiler to favor speed over size and such)
inclusion of debug info (which is extra data left in the object/binary and which enables the debugger to know where each line of code starts, what the variables names are etc.)
directives for the linker
output type (exe, library, memory maps...)
C/C++ language compliance and warnings (compatibility with previous version of the compiler, compliance to current and past C Standards, warning about common possible bug-indicative patterns...)
compile-time verbosity and help
Concerning an inventory of compilers with their options and features
I know of no such list, but I'm sure one probably exists on the web. However, I suggest that, as a novice, you worry little about these "details" and use whatever free compiler you can find (gcc is certainly a great choice), building experience with the language and the build process. C++ professionals may argue, with good reason and at length, about the merits of various compilers and their associated runtimes etc., but for generic purposes -and then some- the free stuff is all that is needed.
Concerning the build process
The most trivial applications, such as those made of a single unit of compilation (read: a single C/C++ source file), can be built with a simple batch file where the various compiler and linker options are hardcoded and the name of the file is specified on the command line.
For all other cases, it is very important to codify the build process so that it can be done
a) automatically and
b) reliably, i.e. with repeatability.
The "recipe" associated with this build process is often encapsulated in a make file or as the complexity grows, possibly several make files, possibly "bundled together in a script/bat file.
This (make file syntax) you need to get familiar with, even if you use alternatives to make/nmake, such as Apache Ant; the reason is that many (most?) source code packages include a make file.
In a nutshell, make files are text files that allow you to define targets and the associated commands to build each target. Each target is associated with its dependencies, which allows the make logic to decide which targets are out of date and should be rebuilt, and, before rebuilding them, which of their dependencies should also be rebuilt. That way, when you modify, say, an include file (and if the make file is properly configured), any C file that uses this header will be recompiled, and any binary which links with the corresponding obj file will be rebuilt as well. make also includes options to force all targets to be rebuilt, which is sometimes handy to make sure that you truly have a current build (for example, in case some dependencies of a given object are not declared in the make file).
On the Pre-processor:
The pre-processor is the first step toward compiling, although it is technically not part of the compilation. The purposes of this step are:
to remove any comments and extraneous whitespace
to substitute any macro reference with the relevant C/C++ syntax. Some macros, for example, are used to define constant values, such as some email address used in the program; during pre-processing, any reference to this constant value (by convention, such constants are named with ALL_CAPS_AND_UNDERSCORES) is replaced by the actual C string literal containing the email address.
to exclude all conditional compilation branches that are not relevant (#ifdef and the like)
What's important to know about the pre-processor is that its directives are NOT part of the C language proper, and that they serve several important functions, such as the conditional compilation mentioned earlier (used, for example, to have multiple versions of the program, say for different operating systems, or indeed for different compilers).
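A small illustration of both jobs (the macro names and values are made up):

#define ADMIN_EMAIL "admin@example.com"   // ALL_CAPS constant, per convention

#ifdef _WIN32
    #define PATH_SEP '\\'    // this branch survives when targeting Windows
#else
    #define PATH_SEP '/'     // this branch survives everywhere else
#endif

// After pre-processing, the next line reads:
//   const char* contact = "admin@example.com";
const char* contact = ADMIN_EMAIL;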
Taking it from there...
After this manifesto of mine... I encourage you to read just a little more, and then to dive into programming and building binaries. It is a very good idea to try to get a broad picture of the framework etc., but this can be overdone, a bit akin to the exchange student who stays in his/her room reading the Webster dictionary to be "prepared" for meeting native speakers, rather than just "doing it!".
Ideally you shouldn't need to care which C++ compiler you are using. Compatibility with the standard has gotten much better in recent years (even from Microsoft).
Compiler flags obviously differ, but the same features are generally available; it's just a differently named option to, e.g., set the warning level on GCC and ms-cl.
The build system is independent of the compiler; you can use any make with any compiler.
That is a lot of questions in one.
C++ compilers are a lot like hammers: They come in all sizes and shapes, with different abilities and features, intended for different types of users, and at different price points; ultimately they all are for doing the same basic task as the others.
Some are intended for highly specialized applications, like high-performance graphics, and have numerous extensions and libraries to assist the engineer with those types of problems. Others are meant for general purpose use, and aren't necessarily always the greatest for extreme work.
The technique for using each type of hammer varies from model to model—and version to version—but they all have a lot in common. The macro preprocessor is a standard part of C and C++ compilers.
A brief comparison of many C++ compilers is here. Also check out the list of C compilers, since many programs don't use any C++ features and can be compiled by an ordinary C compiler.
C++ compilers don't "view" makefiles. The rules of a makefile may invoke a C++ compiler, but also may "compile" assembly language modules (assembling), process other languages, build libraries, link modules, and/or post-process object modules. Makefiles often contain rules for cleaning up intermediate files, establishing debug environments, obtaining source code, etc., etc. Compilation is one link in a long chain of steps to develop software.
Also, many development environments abstract the makefile into a "project file" which is used by an integrated development environment (IDE) in an attempt to simplify or automate many programming tasks. See a comparison here.
As for learning: choose a specific problem to solve and dive in. The target platform (Linux/Windows/etc.) and problem space will narrow the choices pretty well. Which you choose is often linked to other considerations, such as working for a particular company, or being part of a team. C++ has something like 95% commonality among all its flavors. Learn any one of them well, and learning the next is a piece of cake.

Mixing C/C++ Libraries

Is it possible for gcc to link against a library that was created with Visual C++? If so, are there any conflicts/problems that might arise from doing so?
Some of the comments in the answers here are slightly too general.
While in the specific case mentioned the answer is no (gcc binaries won't link with a VC++ library, AFAIK), the actual means of interlinking code/libraries is a question of the ABI standard being used.
An increasingly common standard in the embedded world is the EABI (or ARM ABI) standard, based on work done during Itanium development (http://www.codesourcery.com/cxx-abi/). If compilers are EABI-compliant, they can produce executables and libraries which will work with each other. An example of multiple toolchains working together is ARM's RVCT compiler, which produces binaries that will work with GCC ARM ABI binaries.
(The CodeSourcery link is down at the moment, but it can be found in Google's cache.)
I would guess not. Usually C++ compilers have quite different methods of name mangling, which means that the linkers will fail to find the correct symbols. This is a good thing, by the way, because C++ compilers are allowed by the standard to have much greater levels of incompatibility than just this, levels that would cause your program to crash, die, eat puppies and smear paint all over the wall.
The usual schemes to work around this involve language-independent techniques like COM or CORBA. A simpler, sanctified method is to use C "wrappers" around your C++ code.
It is not possible. It's usually not even possible to link libraries produced by different versions of the same compiler.
No. Plain and simple :-)
Yes, if you make it a dynamic link library and make the interface C-style. lib.exe will generate import libraries which are compatible with the gcc toolchain.
That will resolve your linking problems. However that is just the start of the problem.
Your larger problems will be things like exceptions and memory allocation.
You must ensure that no exception crosses from VC++ to gcc code; there are no guarantees of compatibility.
Every object from the VC++ library will need to live on the heap, because:
Do not mix gcc new/delete with anything from VC++; bad things will happen. This goes for object construction on the stack too. However, if you make an interface like create_some_obj()/delete_some_obj(), you do not end up using gcc's new to construct VC++ objects. Maybe make a small handler object that handles construction and destruction (see the sketch at the end of this answer). This way you preserve RAII, but still use the C interface for the true interface.
Calling conventions must be correct. In VC++ there are cdecl and stdcall. If gcc tries to call an imported function with the wrong calling convention, bad things will happen.
The bottom line is: keep a simple interface that is ANSI C compliant, and you should be fine. The fact that crazy C++ goes on behind it is okay, as long as it is contained.
Oh, and make sure all the code is re-entrant, or you risk opening a whole 'nother can of worms.
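A minimal sketch of the handler object suggested above (all names invented): the DLL exposes flat create/destroy functions, and a small RAII class on the gcc side guarantees the object is always released by the runtime that allocated it:

struct some_obj;                              // opaque; defined inside the DLL

extern "C" some_obj* create_some_obj();      // implemented in the VC++ DLL
extern "C" void delete_some_obj(some_obj*);  // ditto

class SomeObjHandle {                         // lives in the gcc-compiled EXE
    some_obj* p_;
public:
    SomeObjHandle() : p_(create_some_obj()) {}
    ~SomeObjHandle() { delete_some_obj(p_); } // freed by the DLL's own runtime
    SomeObjHandle(const SomeObjHandle&) = delete;
    SomeObjHandle& operator=(const SomeObjHandle&) = delete;
    some_obj* get() const { return p_; }
};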