Building C++ source code as a library - where to start? - c++

Over the months I've written some nice generic enough functionality that I want to build as a library and link dynamically against rather than importing 50-odd header/source files.
The project is maintained in Xcode and Dev-C++ (I do understand that I might have to go command line to do what I want) and have to link against OpenGL and SDL (dynamically in SDL's case). Target platforms are Windows and OS X.
What am I looking at at all?
What will be the entry point of my
library if it needs one?
What do I have to change in my code?
(calling conventions?)
How do I release it? My understanding
is that headers and the compiled
library (.dll, .dylib(, .framework),
whatever it'll be) need to be
available for the project -
especially as template functionality
can not be included in the library by
nature.
What else I need to be aware of?

I'd recommend building as a statc library rather than a DLL. A lot of the issues of exporting C++ functions and classes go away if you do this, provided you only intend to link with code produced by the same compiler you built the library with.
Building a static library is very easy as it is just an collection of .o/.obj files - a bit like a ZIP file but without compression. There is no need to export anything - just include the library in the list of files that your application links with. To access specific functions or classes, just include the relevant header file. Note you can't get rid of header files - the C++ compilation model, particularly for templates, depends on them.

It can be problematic to export a C++ class library from a dynamic library, but it is possible.
You need to mark each function to be exported from the DLL (syntax depends on the compiler). I'm poking around to see if I can find how to do this from xcode. In VC it's __declspec(dllexport) and in CodeWarrior it's #pragma export on/#pragma export off.
This is perfectly reasonable if you are only using your binary in-house. However, one issue is that C++ methods are named differently by different compilers. This means that nobody who uses a different compiler will be able to use your DLL, unless you are only exporting C functions.
Also, you need to make sure the calling conventions match in the DLL and the DLL's client. This either means you should have the same default calling convention flag passed to the compiler for both the DLL or the client, or better, explicitly set the calling convention on each exported function in the DLL, so that it won't matter what the default is for the client.
This article explains the naming issue:
http://en.wikipedia.org/wiki/Name_decoration

The C++ standard doesn't define a standard ABI, and that's bad news for people trying to build C++ libraries. This means that you get different behavior from your compiled code depending on which flags were used to compile it, and that can lead to mysterious bugs in code that compiles and links just fine.
This extends beyond just different calling conventions - C++ code can be compiled to support or not support RTTI, exception handling, and with various optimizations that can affect the the memory layout of class instances, which C++ code relies on.
So, what can you do? I would build C++ libraries inside my source tree, and make sure that they're built as part of my project's build, and that all the libraries and the code that links to them use the same compiler flags.
Note that name mangling, which was supposed to at least prevent you from linking object files that were compiled with different compilers/compiler flags only mostly works, and there are certain things you can do, especially with GCC, that will result in code that links just fine and fails at runtime.
You have to be extra careful with vendor supplied dynamic C++ libraries (QT on most Linux distributions, for example.) I've seen instances of vendor supplied libraries that were compiled in ways that prevented certain things from working properly. For example, some Redhat Linux releases (maybe all of them) disabled exceptions in QT, which made it impossible to catch exceptions in main() if the exceptions were thrown in a QT callback. Fun.

Related

Why can some libraries built by older compilers link against modern code, and others cannot?

We have a lot of prebuilt libraries (via CMake mostly), built using Visual Studio 2017 v141. When we try to use these against a project using Visual STudio 2019 v142 we see errors like:
Error C1047 The object or library file
‘boost_chrono-vc141-mt-gd-x32-1_68.lib’ was created by a different
version of the compiler than other objects...
On the other hand, we also use pre-compiled .libs from 3rd-party vendors which are over a decade old and these have worked just fine when linked against our codebase.
What determines whether a library needs to be rebuilt, and why can some ancient libraries still be used when others that are only one version behind cannot?
ABI incompatibilities could cause some issues. Even though the C++ standard calls for objects such as std::vector and std::mutex and that they need to have specific public/protected members, how these classes are made is left to the implementation.
In practice, it means that nothing prevents the GNU standard library from having their data fields in another orders than the LLVM standard library, or having completely different private members.
As such, if you try to use a function from a library built with the LLVM libc++ by sending it a GNU libstdc++ vector it causes UB. Even on the same standard library, different versions could have changed something and that could be a problem.
To avoid these issues, popular C++ libraries only use C data structures in their ABIs since (at least for now) every compiler produces the same memory layout for a char*, an int or a struct.
These ABI issues can appears in two places:
When you use dynamic libraries (.so and .dll files) your compiler probably won't say anything and you'll get undefined behavior when you call a function of the library using incompatible C++ objects.
When you use static libraries (.a and .lib files) I'm not really sure, I'm guessing it could either print an error if it sees there's gonna be a problem or successfully compile some Frankenstein monster of a binary that will behave like the above point
I will try to answer some integral parts, but be aware this answer could be incomplete. With more information from peers we will maybe be able to construct a full answer!
The simples kind of linking is linking towards a C library. Since there is no concept of classes and overloading function names, the compiler creators are able to create entry points to functions by their pure name. This seems to be pretty much quasi-standardized since, I myself, haven't encountered a pure C library not at least linkable to my projects. You can select this behaviour in C++ code by prepending a function declaration with extern "C". (This also makes it easy to link against a library from C# code) Here is a detailed explanation about extern "C". But as far as I am aware this behaviour is not standardized; it is just so simple - it seems - there is just one sane solution.
Going into C++ we start to encounter function, variable and struct names repeating. Lets just talk about overloaded functions here. For that compiler creators have to come up with some kind of mapping between void a(); void a(int x); void a(char x); ... and their respective library representation. Since this process also is not standardized (see this thread) and this process is far more complex than the 1 to 1 mapping of C, the ABIs of different compilers or even compiler versions can differ in any way.
Now given two compilers (or linkers I couldn't find a resource wich specifies wich one exactly is responsible for the mangling but since this process is not standardized it could be also outsourced to cthulhu) with different name mangling schemes, create following function entry points (simplified):
compiler1
_a_
_a_int_
_a_char_
compiler2
_a_NULL_
_a_++INT++_
_a_++CHAR++_
Different linkers will not understand the output of your particular process; linker1 will try to search for _a_int_ in a library containing only _a_++INT++_. Since linkers can't use fuzzy string comparison (this could lead to a apocalypse imho) it won't find your function in the library. Also don't be fooled by the simplicity of this example: For every feature like namespace, class, method etc. there has to be a method implemented to map a function name to a entry point or memory structure.
Given your example you are lucky you use libraries of the same publisher who coded some logic to detect old libraries. Usually you will get something along the lines of <something> could not be resolved or some other convoluted, irritating and/or unhelpful error message.
Some info and experience dump regarding Visual Studio and libraries in general:
In general the Visual C++ suite doesn't support crosslinked libs between different versions but you could be lucky and it works. Don't rely on it.
Since VC++ 2015 the ABI of the libraries is guaranteed by microsoft to be compatible as drescherjm commented: link to microsoft documentation
In general when using libraries from different suites you should always be cautious as n. 1.8e9-where's-my-share m. commented here (here is your share btw) about dependencies to other libraries and runtimes. In general in general not having the control over how libraries are built is a huge pita
Edit addressing memory layout incompatibilities in addition to Tzigs answer: different name mangling schemes seem to be partially intentional to protect users against linkage against incompatible libraries. This answer goes into detail about it. The relevant passage from gcc docs:
G++ does not do name mangling in the same way as other C++ compilers. This means that object files compiled with one compiler cannot be used with another.
This effect is intentional [...].
Error C1047
This is caused by /GL Global optimization or /LTGC Link Time Code Generation
These use information in the .obj, to perform global optimizations. When present, VS looks at the compiler which generated the original .lib, and if they are different emits the error. These compilation switches are for code from a single compiler, and not intended for cross version usage.
The other builds which work, don't have the switches, so are compatible.
Visual studio has started to use a new #pragma detect_mismatch
This causes an old build to identify it is incompatible with a new build, by detecting the version change.
Very old builds didn't have / support the pragma, so had no checking.
When you build a lib, its dependencies are loaded and satisified by the linker, but this is not a guarantee of working. The one-definition-rule signs the developer up to a contract, that within a compiled binary, all implementations of the same named function are the same. If this came from different compilers, that may not be true, and so the linker can choose any, causing latent bugs, where mixtures of old and new code are linkeded into the binary.
If the definition or implementation of std::string has changed, it may link, but have code which is flawed.
This new compiler check, causes a fail early, which I thoroughly approve of.

Distributed dll and c++ objects - how some libraries handle this?

Recently I found that passing/exposing objects to/from dll is dangerous if executable and dll was compiled with different compilers or even with different version of the same compiler or even with different settings of the same compiler! To say I was surprised is to say nothing - all this time I thought I can create a dynamic library and safely distribute it.
Now having this knowledge I wonder how certain libraries work? Let's take a dll in system32 folder. With dependency Walker I can see that in, say, d3d11.dll functions exposed with C parameter - I guess this is C convention which gives us a guarantee it will work. Also I know that d3d11 works with COM objects, so there's no risk of creating an object in dll and destroying it in exe.
But now let's take any Qt dll - the functions exported as c++ and have all their names mangled. I remember when I installed the framework I specified my compiler - MSVC2013, but I didn't specified a version! I can use a dll compiled with update1 compiler in update5 compiler and I can't remember any strange behavior. How it works then? How they handled mangled function name resolve, parameters passing?
The usual reason that a library under windows has to be compiled with the same compiler is that the C++ standard library is not ABI compatible between different version of MSVC++. If a library uses the C++ Standard library across the dll boundary this is ok as long as the C++ standard library is identical in the library and the compiler that uses the library.
Qt expose there own strings (QString) and vectors(QVector) etc, instead of std::string and std::vector, therefore Qt doesn't expose the C++ standard library through the dll boundary, hence the Qt dll works with all versions of Visual Studio.

How to embed a C++ library in a C library?

I have a question related to embedding one library in another.
I have a code that is pure C and my users rely on that, they don't want to depend on C++ libraries. However, the need arose to embed a 3rd party library (ICU) into mine. None of the ICU functions would be exported, they would only be used internally in my library. Unfortunately ICU is a C++ library, though it does have a C wrapper. ICU does not use exceptions, but it does use RTTI (abstract base classes).
The question is how could I create my static library so that
ICU is embedded in my library (all references to ICU functions are resolved within my library)
all references to libstdc++ are also resolved and the necessary code is embedded into my library
if a user does not even have libstdc++ installed on their system things work just fine
if a user does happen to use my library within a C++ project then there are no conflicts with whatever libstdc++ (presumably the system libstdc++) he uses.
Is this possible at all? The targeted platforms are pretty much everything: windows (there my library is dynamic), and all sort of unix versions (linux, solaris, aix, hpux - here my library needs to be static).
gcc-4.5 and later does have --static-libstdc++, but as far as I understand it is only for creating shared libs or executables, and not static libs.
Thanks for any help!
The solution to this problem is pretty simple, but may not fall inside the parameters you have set.
The simple rules are:
You can dynamically link a C++ library to a C caller, but you have to wrap the library inside an extern C layer. You design the extern C API and implement it using C++ internals, which are forever hidden from view. [You can also use COM or .NET to do this on Windows.]
You cannot statically link a C++ library to a C caller. The libraries are different and the calling sequences/linker symbols are different. [You often can't even statically link between different versions of the same compiler, and almost never between different compilers.]
In other words, the solution is simple: use dynamic linking. If that's not the right answer, then I don't think there is one.
Just to make things interesting, you can even implement your own plug-in architecture. That's just another name for dynamic linking, but you get to choose the API.
Just to be clear, the only viable portable option I can see is that you link ICU inside its own dynamic library (DLL or SO). Its symbols, C++ libs, RTTI and exceptions all stay inside that. Your static lib links to the ICU dynamic lib by extern C. That's exactly how much of Windows is built: C++ inside DLL, extern C.
You can debug across the boundary, but you cannot export type information. If you need to do that, you will have to use a different API, such as .NET or COM.
I don't know if this will work, but let me at least suggest that you try it!
The excellent LLVM project (origin of clang compiler) has many front-ends and back-ends for different languages, such as C++ and C. And according to this S.O. question it should be possible for LLVM to compile C++ into C which in turn can be compiled as normal.
I imagine this route is a bumpy one, but if it works, it might solve your problem without the need to link dynamically. It all depends on if ICU will compile with LLVM C++.
If you do decide to give it a go, please let us know how you fare!

Wrapping C++ library in XCode

I need some help with wrapping C++ libraries in XCode.
What I want to achieve is to create new library in XCode, import C++ library (I have .a and .h files), wrap it to Obj-C so I can import that library to MonoTouch.
The reason why I do it round way is that when I try to import C++ lib into MonoTouch, because of name mangling I keep getting WrongEntryPoint exceptions. Correct me if I'm wrong but there is no way for me to find out mangled names, which depends on compiler.
Thank you in advance
Correct me if I'm wrong but there is no way for me to find out mangled names, which depends on compiler.
Technically you could. Many compilers share the same mangling syntax, maybe the most useful and long-lasting gift from Itanium ;-)
However it will bring it's own pain (e.g. non-primitive types, other compilers) and maintenance issues as you update your C++ code.
You'll better served by:
writing an ObjectiveC wrapper and use MonoTouch's btouch tool to generated bindings;
writing a C wrapper and using .NET p/invoke to call your code;
The choice it yours but if you think about reusing the C++/C# code elsewhere (e.g. Mono for Android) then using C and p/invoke will be reusable.
I would definitely recommend going the route of wrapping the library in an Obj-C library and using btouch to import the library into MonoTouch. I have recently done this for a C++ library that implemented a Sybase database engine. If you look at my questions you will find quite a few pertaining to wrapping C++ libraries as I posted a few times regarding issues I encountered.
Specifically, you can look at these questions:
Linking to a C++ native library in MonoTouch
Wrapping a C++ library in Objective-C is not hiding the C++ symbols
Application with static library runs on simulator but not on actual device
Undefined symbols when linking PhoneGap static library in MonoTouch
Linker options 'Link all assemblies" and "Link SDK assemblies only" causes undefined symbols in 3rd party static library
I would also recommend, if you are going to go the route of an Obj-C wrapper, that you get btouch to output code and include that in your project rather than including a dll from btouch. From my experience, the code worked more reliably than the dll, although the issues with the dll may have been resolved by now. But take a look at this question regarding the btouch issue:
Exception System.InvalidCastException when calling a method bound with btouch that returns an object. MonoTouch bug?
If you have specific questions/problems in building the Obj-C wrapper then ask them here and post some code and I am sure that I or other members of the community would be able to help you with it.
Bruce, as you assumed, I have problems with wrapping C++ code. After hours and hours of reading and trying, I couldn't wrap the C++ code.
Anyway, I managed to create a simple Obj-C library made of some dummy class, and then import it into another library. That worked fine. However, following same pattern, I included C++ .a file along with .h file (I'm not sure whether .h is mandatory because we can link header files in build options, right??) and when I compiled it, it went fine, the build succeeded, but XCode didn't produce new .a library.
I added linker flags: -ObjC -lNameOfLib
Does C++ Standard Library Type in Build - Linking has to be Static? And Symbols Hidden By Default as well?
It would be great if we could write step-by-step tut, since there are tons of various instructions, but I haven't been able to push it through the end.
I'm confused a bit..
Thank you guys...

MS vs Non-MS C++ compiler compatibility

Thinking of using MinGW as an alternative to VC++ on Windows, but am worried about compatibility issues. I am thinking in terms of behaviour, performance on Windows (any chance a MinGW compiled EXE might act up). Also, in terms of calling the Windows API, third-party DLLs, generatic and using compatible static libraries, and other issues encountered with mixing parts of the same application with the two compilers.
First, MinGW is not a compiler, but an environment, it is bundled with gcc.
If you think of using gcc to compile code and have it call the Windows API, it's okay as it's C; but for C++ DLLs generated by MSVC, you might have a harsh wake-up call.
The main issue is that in C++, each compiler has its own name mangling (or more generally ABI) and its own Standard library. You cannot mix two different ABI or two different Standard Libraries. End of the story.
Clang has a specific MSVC compatibility mode, allowing it to accept code that MSVC accepts and to emit code that is binary compatible with code compiled with MSVC. Indeed, it is even officially supported in Visual Studio.
Obviously, you could also simply do the cross-DLL communication in C to circumvent most issues.
EDIT: Kerrek's clarification.
It is possible to compile a large amount of C++ code developed for VC++ with the MinGW toolchain; however, the ease with which you complete this task depends significantly on how C++-standards-compliant the code is.
If the C++ code utilizes VC++ extensions, such as __uuidof, then you will need to rewrite these portions.
You won't be able to compile ATL & MFC code with MinGW because the ATL & MFC headers utilize a number of VC++ extensions and depend on VC++-specific behaviors:
try-except Statements
__uuidof
throw(...)
Calling a function without forward-declaring it.
__declspec(nothrow)
...
You won't be able to use VC++-generated LIB files, so you can't use MinGW's linker, ld, to link static libraries without recompiling the library code as a MinGW A archive.
You can link with closed-source DLLs; however, you will need to export the symbols of the DLL as a DEF file and use dlltool to make the corresponding A archive (similar to the VC++ LIB file for each DLL).
MinGW's inclusion of the w32api project basically means that code using the Windows C API will compile just fine, although some of the newer functions may not be immediately available. For example, a few months ago I was having trouble compiling code that used some of the "secure" functions (the ones with the _s suffix), but I got around this problem by exporting the symbols of the DLL as a DEF, preparing an up-to-date A archive, and writing forward declarations.
In some cases, you will need to adjust the arguments to the MinGW preprocessor, cpp, to make sure that all header files are properly included and that certain macros are predefined correctly.
What I recommend is just trying it. You will definitely encounter problems, but you can usually find a solution to each by searching on the Internet or asking someone. If for no other reason, you should try it to learn more about C++, differences between compilers, and what standards-compliant code is.