How to distribute C++20 modules?

All the literature about modules is still quite new, and I am struggling with one core concept.
When I build my own modules, is there a conventional or accepted way to package them, after the linkage process, so that they can be distributed as a library?

Broadly speaking, the products of building a module's interface (as distinct from the linker-products of compilation, like a static/shared library) are not sharable between compilers. At least not the way that compiled libraries for the same OS/platform are. Compiled module formats are compiler-specific and may not even be stable between versions of the same compiler.
As such, if you want to ship a pre-compiled library that was built using modules, then just like non-module builds, you will need to ship the textual files that are used to consume that module. Specifically, you need all of the interface units for any modules built into that library. Implementation units need not be shipped, as their products are all in the compiled form of the library (unless they are implementation partitions included by interface units).
Perhaps in the future, compilers for the same platform, or even across platforms, will standardize on a compiled module format. But until then, you're going to have to keep shipping text with your pre-compiled libraries.
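To make that concrete, here is a minimal sketch of what gets shipped and what doesn't (the module name math and the file names are made up; interface-unit extensions vary by compiler, e.g. .ixx for MSVC, .cppm for Clang):

    // math.cppm - module interface unit: ship this text file with the library
    export module math;
    export int add(int a, int b);

    // math.cpp - implementation unit: its object code is already inside the
    // compiled library (libmath.a / math.lib), so this file need not be shipped
    module math;
    int add(int a, int b) { return a + b; }

A consumer then compiles math.cppm with their own compiler, producing that compiler's built module interface, and links against your prebuilt library for the actual definitions.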

Related

How to manage compilation of C++ header-only libraries across shared objects

I'm developing a large software project consisting of many packages which are compiled to shared objects. For performance reasons, I want to compile Eigen 3 (a header-only library) with vector instructions, but the templated methods are being compiled all over the place. How can I ensure that the Eigen functions are compiled into a specific object file?
This software consists of ~2000 individual packages. To keep development going at a reasonable pace, the recommended way of compiling the program is to sparsely check out some of the packages and compile them, after which the program can be executed using precompiled (by some CI system) shared libraries.
The problem is that part of my responsibility is to optimise the CPU time of the program. In order to do so, I wanted to compile the package I am working on (let's call it A.so) with the -march flag so Eigen can exploit modern SIMD processor extensions.
Unfortunately, because Eigen is a header-only library, the Eigen functions are compiled into many different shared objects. For example, one of the most CPU-intensive methods called in A.so is the matrix multiplication kernel, which is compiled into B.so. Many other Eigen functions are compiled into C.so, D.so, etc. Since these objects are compiled for older, more widely implemented instruction set extensions, they are not compiled with AVX, AVX2, etc.
Of course, one possible solution is to include packages B, C, D, etc. into my own sparse compilation but this negates the advantage of compiling only a part of the project. In addition, it leaves me including ever more and more packages if I really want to vectorise all linear algebra operations in the code of package A.
What I am looking for is a way to compile all the Eigen functions that package A uses into A.so, as if the Eigen functions were defined with the static keyword. Is this possible? Is there some mechanism in the compiler/linker that I can leverage to make this happen?
One obvious solution is to hide these symbols. This happens (if I understand the problem properly) because these functions are exported and can be used by other, subsequently loaded libraries.
When you build your library and link it against the other libraries, the dynamic linker reuses whatever it can - including the symbols from the old packages. I hope you don't require these libraries for your own build?
So two options:
Force the loading of A before the other libraries (but if you need the other libraries, I don't think this is doable),
Tell the linker that these functions should not be visible to other libraries (visibility=hidden by default), as sketched below.
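As a rough sketch of that second option, assuming GCC or Clang (the A_API macro and the function name are made up for illustration), you compile package A with hidden visibility by default and re-export only its own API:

    // Build with:
    //   g++ -shared -fPIC -fvisibility=hidden -fvisibility-inlines-hidden \
    //       -march=native a.cpp -o A.so
    // Eigen's template instantiations then stay local to A.so instead of being
    // resolved against the older copies exported by B.so, C.so, ...
    #define A_API __attribute__((visibility("default")))

    A_API void run_analysis();  // hypothetical public entry point of package A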
I saw something similar happening with a badly compiled 3rd-party library. It was built in debug mode, shipped in the product, and all of a sudden one of our libraries experienced a slow down. The map files identified where the culprit debug function came from, as it exported all its symbols by default.
An alternative way to change visibility without modifying the code is to filter symbols during the linking stage using a version script -> https://sourceware.org/binutils/docs/ld/VERSION.html. You'll need something like:
{
  global: *;
  local:
    extern "C++"
    {
      Eigen::*;
      *Eigen::internal::*;
    };
};
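The script is then passed to the linker when building the shared object, e.g. (assuming GNU ld; the file name hide_eigen.map is made up):

    g++ -shared -fPIC -march=native a.cpp -o A.so -Wl,--version-script=hide_eigen.map

The Eigen symbols are then local to A.so, so references inside A.so bind to A.so's own (vectorised) instantiations rather than to the copies exported by B.so and friends.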

Runtime dependency and build dependency concepts

I have been hearing about build dependencies / runtime dependencies. They are quite self-explanatory terms. As far as I understand, a build dependency is a component required at compile time. For example, if A has a build dependency on B, A cannot be built without B. A runtime dependency, on the other hand, is dynamic: if A has a runtime dependency on B, A can be built without B but cannot run without B.
This information, however, is too shallow. I'd like to read about and understand these concepts better. I have been googling but could not find a source; can you please provide me a link or the right keywords to search for?
I'll try to keep it simple and theoretical only.
When you write code that calls a function "func", the compiler needs the function's descriptor (e.g. "int func(char c);", usually available in .h files) to verify argument correctness, and the linker needs the function's implementation (where your actual code resides).
Operating systems provide a mechanism to separate function implementations into different compiled modules. This is usually required for:
Better code reuse (multiple applications can use the same code, with different data context)
More efficient compilation (you don't need to recompile all dependency libraries)
Partial upgrades
Distribution of compiled libraries, without disclosing the source code
To support such functionality, the compiler is provided with function descriptors (.h files) as usual, while the linker is provided with lib files containing function stubs. The operating system is responsible for loading the actual implementation file during the application loading procedure (if it is not already loaded for a different application) and for mapping the actual functions into the memory of the new application.
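As a small self-contained illustration of that compile/link/load pipeline (all file and function names here are invented):

    // func.h - the function descriptor the compiler needs
    int func(char c);

    // func.cpp - the implementation, compiled into a shared module:
    //   g++ -shared -fPIC func.cpp -o libfunc.so
    int func(char c) { return c + 1; }

    // main.cpp - compiled against the descriptor and linked against the stub:
    //   g++ main.cpp -L. -lfunc -o app
    // Building app needs libfunc.so present (build dependency); at startup the
    // OS loader must find and map libfunc.so again (runtime dependency).
    #include "func.h"
    int main() { return func('a'); }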
Dynamic loading functionality is extended to object-oriented languages as well (C++, C#, Java, etc.).
Practical implementations are OS dependent - dynamic linking is implemented as DLL files on Windows or as SO files on Linux.
Special OS-dependent techniques can be used to share context (variables, objects) between different applications that use the same dynamic library.
Meir Tseitlin

What is the theoretical reason for C++ dependency production not being automated?

C++ Buildsystem with ability to compile dependencies beforehand
Java has Maven, which is a pleasure to work with: you simply specify dependencies that are already compiled and deposited in Maven's standard directory, meaning that the location of the dependencies is standardized, as opposed to the usual C/C++ way of having multiple locations (give me a break, like anyone remembers the default install directories for particular deps).
It is massively unproductive for every individual developer to have to, more often than not, find, read about, get familiar with the configure options and build process of, and finally compile, every dependency simply to make a build of a project.
What is the theoretical reason this has not been implemented?
Why would it be difficult to provide packages of the following options with a maven-like declaration format?
version
platform (windows, linux)
src/dev/bin
shared/static
equivalent set of Boost ABI options when applicable
Having to manually go to websites and search out dependencies in the year 2013 for the oldest major programming language is absurd.
There aren't any theoretical reasons. There are a great many practical reasons. There are just too many different ways of handling things in the C++ world to easily standardize on a dependency system:
Implementation differences - C++ is a complicated language, and different implementations have historically varied in how well they support it (how well they can correctly handle various moderate to advanced C++ code). So there's no guarantee that a library could be built in a particular implementation.
Platform differences - Some platforms may not support exceptions. There are different implementations of the standard library, with various pros and cons. Unlike Java's standardized library, Windows and POSIX APIs can be quite different. The filesystem isn't even a part of Standard C++.
Compilation differences - Static or shared? Debug or production build? Enable optional dependencies or not? Unlike Java, which has very stable bytecode, C++'s lack of a standard ABI means that code may not link properly, even if built for the same platform by the same compiler.
Build system differences - Makefiles? (If so, GNU Make, or something else?) Autotools? CMake? Visual Studio project files? Something else?
Historical concerns - Because of C's and C++'s age, popular libraries like zlib predate build systems like Maven by quite a bit. Why should zlib switch to some hypothetical C++ build system when what it's doing works? How can a newer, higher-level library switch to some hypothetical build system if it depends on libraries like zlib?
Two additional factors complicate things:
In Linux, the distro packaging systems do provide standardized repositories of development library headers and binaries, with (generally) standardized ABIs and an easy way of specifying a project's build dependencies. The existence of these (platform-specific) solutions reduces the impetus for a cross-platform solution.
With all of these complicating factors and pre-existing approaches, any attempt to establish a standard build system is going to run into the problem described in XKCD's "Standards":
Situation: There are 14 competing standards.
"14? Riculous! We need to develop one universal standard that covers everyone's use cases."
Soon: There are 15 competing standards.
With all of that said:
There is some hope for the future. For example, CMake seems to be gradually replacing other build systems. Some of the Boost developers have started Ryppl, an attempt to do what you're describing.
(also posted in linked question)
Right now I'm working on a tool able to automatically install all the dependencies of a C/C++ app, with exact version requirements:
compiler
libs
tools (cmake, autotools)
Right now it works, for my app (installing UnitTest++, Boost, Wt, sqlite, and cmake, all in the correct order).
The tool, named «C++ Version Manager» (inspired by the excellent Ruby Version Manager), is coded in bash and hosted on GitHub: https://github.com/Offirmo/cvm
Any advice and suggestions are welcome.
Well, first off, a system that resolves all the dependencies doesn't make you productive by default; potentially it can make you even less productive.
Regarding the differences between languages, I would say that in Java you have packages, which are handy when you have to organize your code and give it a limited horizon; in C++ you don't have an equivalent concept.
In C++, any library that can resolve a symbol is good enough for the compiler; the only real requirements for a library are to have a certain ABI and to resolve the required symbols. There is no automated way to pick the right library, and resolving a symbol is just a matter of linking your call to some actual implementation - even a correct linking phase doesn't guarantee that your app will work.
To this you can add important variables such as the library version, different implementations of the same library, and different libraries with the same method names.
An example is the Mesa library vs. the OpenGL lib from the official drivers, or any lib that offers multiple releases, each of which can resolve all the symbols, although one release is probably more mature than the others - and you can't ask the compiler to pick the right one, because they are all the same for its purposes.
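A tiny sketch of that last point - link order, not correctness, decides which library wins (all names are invented):

    // liba.cpp:  extern "C" int solve() { return 1; }
    // libb.cpp:  extern "C" int solve() { return 2; }
    //
    //   g++ -shared -fPIC liba.cpp -o liba.so
    //   g++ -shared -fPIC libb.cpp -o libb.so
    //   g++ main.cpp -L. -la -lb -o app   // solve() resolves to liba.so
    //   g++ main.cpp -L. -lb -la -o app   // solve() now resolves to libb.so

    // main.cpp - any library providing the symbol satisfies the linker:
    extern "C" int solve();
    int main() { return solve(); }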

Are C++ libs created with different versions of Visual Studio compatible with each other?

I am creating an open-source C++ library using Visual Studio 2005. I would like to provide prebuilt libs along with the source code. Are these libs, built with VS2005, also going to work with newer versions of Visual Studio (esp. VS Express Edition 2008)? Or do I need to provide separate libs per VS version?
Not normally, no. Libraries built with the VS tools are linked against the 'Microsoft C Runtime' (called MSVCRT followed by a version number), which provides C and C++ standard library functions, and if you attempt to run a program that requires two different versions of this runtime, errors will occur.
On top of this, different compiler versions churn out different compiled code, and the code from one compiler version frequently isn't compatible with another's except in the most trivial cases (and if they churned out the same code, there would be no point in having different versions :))
If you are distributing static libraries, you may be able to distribute version-independent libraries, depending on exactly what you are doing. If you are only making calls to the OS, then you may be OK. C RTL functions, maybe. But if you use any C++ Standard Library functions, classes, or templates, then probably not.
If distributing DLLs, you will need separate libraries for each VS version. Sometimes you even need separate libraries for various service-pack levels. And as mentioned by VolkerK, users of your library will have to use compatible compiler and linker settings. And even if you do everything right, users may need to link with other libraries that are somehow incompatible with yours.
Due to these issues, instead of spending time trying to build all these libraries for your users, I'd spend the time making them as easy to build as possible, so that users can build them on their own with minimal fuss.
Generally it's not possible to link against libraries built with different compilers, different versions of the same compiler, and even different settings of the same compiler version and get a working application. (Although it might work for specific subsets of the language and std library.) There is no standard binary interface for C++ - not even one for some common platform as there are in C.
To achieve that, you either need to wrap your library in a C API or you will have to ship a binary for every compiler, compiler version, and compiler setting you want to support.
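A minimal sketch of the C-API approach (the widget names are hypothetical): the header exposes only C types and functions, the C++ hides behind an opaque pointer, and the library's binary interface no longer depends on which C++ compiler built it:

    /* widget.h - consumable from C or from any C++ compiler */
    #ifdef __cplusplus
    extern "C" {
    #endif

    typedef struct widget widget;             /* opaque handle */

    widget* widget_create(void);
    int     widget_frob(widget* w, int n);    /* must not let C++ exceptions escape */
    void    widget_destroy(widget* w);

    #ifdef __cplusplus
    }
    #endif

    // widget.cpp - implemented in C++ inside the library
    #include "widget.h"
    #include "Widget.hpp"  // the real C++ class (hypothetical)

    extern "C" widget* widget_create(void) { return reinterpret_cast<widget*>(new Widget); }
    extern "C" int widget_frob(widget* w, int n) { return reinterpret_cast<Widget*>(w)->frob(n); }
    extern "C" void widget_destroy(widget* w) { delete reinterpret_cast<Widget*>(w); }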
If your library project is a static library, then you'll have to supply a build for every Visual Studio version that you want to support. In the example you gave, that equates to providing both a VS2005 and a VS2008 library.
If your library project is a dynamic library, then you evade the problems somewhat, but it means that users will need to make sure that they use a 'Microsoft C Runtime' compatible with your build environment. You can eliminate that requirement by statically linking the 'Microsoft C Runtime' into your dynamic library.

Building C++ source code as a library - where to start?

Over the months I've written some nice generic enough functionality that I want to build as a library and link dynamically against rather than importing 50-odd header/source files.
The project is maintained in Xcode and Dev-C++ (I do understand that I might have to go command line to do what I want) and have to link against OpenGL and SDL (dynamically in SDL's case). Target platforms are Windows and OS X.
What am I looking at at all?
What will be the entry point of my library, if it needs one?
What do I have to change in my code? (calling conventions?)
How do I release it? My understanding is that headers and the compiled library (.dll, .dylib(, .framework), whatever it'll be) need to be available for the project - especially as template functionality cannot be included in the library by nature.
What else do I need to be aware of?
I'd recommend building as a static library rather than a DLL. A lot of the issues of exporting C++ functions and classes go away if you do this, provided you only intend to link with code produced by the same compiler you built the library with.
Building a static library is very easy, as it is just a collection of .o/.obj files - a bit like a ZIP file but without compression. There is no need to export anything - just include the library in the list of files that your application links with. To access specific functions or classes, just include the relevant header file. Note you can't get rid of header files - the C++ compilation model, particularly for templates, depends on them.
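For example, with the GNU toolchain the whole process is three commands (file names invented):

    g++ -c util.cpp geometry.cpp          # compile to util.o and geometry.o
    ar rcs libmylib.a util.o geometry.o   # archive the object files into a static library
    g++ app.cpp -L. -lmylib -o app        # link the application against it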
It can be problematic to export a C++ class library from a dynamic library, but it is possible.
You need to mark each function to be exported from the DLL (syntax depends on the compiler). I'm poking around to see if I can find how to do this from Xcode. In VC it's __declspec(dllexport) and in CodeWarrior it's #pragma export on/#pragma export off.
This is perfectly reasonable if you are only using your binary in-house. However, one issue is that C++ methods are named differently by different compilers. This means that nobody who uses a different compiler will be able to use your DLL, unless you are only exporting C functions.
Also, you need to make sure the calling conventions match in the DLL and the DLL's client. This either means you should have the same default calling convention flag passed to the compiler for both the DLL or the client, or better, explicitly set the calling convention on each exported function in the DLL, so that it won't matter what the default is for the client.
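A common way to handle both the export marking and the calling convention is a small macro header; this is only a sketch (MYLIB_API, MYLIB_BUILDING, and the function name are invented):

    // mylib_api.h
    #if defined(_WIN32)
      #if defined(MYLIB_BUILDING)   // define only when compiling the DLL itself
        #define MYLIB_API __declspec(dllexport)
      #else
        #define MYLIB_API __declspec(dllimport)
      #endif
      #define MYLIB_CALL __cdecl    // pin the calling convention explicitly
    #else
      #define MYLIB_API __attribute__((visibility("default")))
      #define MYLIB_CALL
    #endif

    MYLIB_API int MYLIB_CALL mylib_compute(int x);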
This article explains the naming issue:
http://en.wikipedia.org/wiki/Name_decoration
The C++ standard doesn't define a standard ABI, and that's bad news for people trying to build C++ libraries. This means that you get different behavior from your compiled code depending on which flags were used to compile it, and that can lead to mysterious bugs in code that compiles and links just fine.
This extends beyond just different calling conventions - C++ code can be compiled to support or not support RTTI, exception handling, and with various optimizations that can affect the memory layout of class instances, which C++ code relies on.
So, what can you do? I would build C++ libraries inside my source tree, and make sure that they're built as part of my project's build, and that all the libraries and the code that links to them use the same compiler flags.
Note that name mangling, which was supposed to at least prevent you from linking object files that were compiled with different compilers or compiler flags, only mostly works; there are certain things you can do, especially with GCC, that will result in code that links just fine but fails at runtime.
You have to be extra careful with vendor-supplied dynamic C++ libraries (Qt on most Linux distributions, for example). I've seen instances of vendor-supplied libraries that were compiled in ways that prevented certain things from working properly. For example, some Red Hat Linux releases (maybe all of them) disabled exceptions in Qt, which made it impossible to catch exceptions in main() if the exceptions were thrown in a Qt callback. Fun.