Shared libraries and C++20 modules

There is very little documentation online on the proper use of C++20 modules in shared libraries. Many folks are clearly interested, but I haven't been able to find a clear solution.
In MSVC, you need to use dllexport when compiling the library, and dllimport when consuming the symbols. This can be done using macros in "legacy C++", but this does not work with C++20 modules, since the code is only compiled once, regardless of preprocessor directives.
This post suggests that you only need to use dllexport now, and that dllimport will be taken care of automatically by the compiler. However, this comes from a comment which has now been deleted, and I couldn't find any reliable source on the topic.
How is one expected to create a shared library using C++20 modules?

Background
A translation unit which declares a module interface or a module partition will be treated as a module unit and will, when compiled, generate both an object file and a binary module interface (BMI).
The BMI is a binary representation of an abstract syntax tree, that is, a data structure representing the syntax and data types of the program. Recall the traditional C++ compilation pipeline:
program -> preprocessor -> lexer -> parser -> code generator -> assembler -> linker
With GCC, we should add the compiler flag -c which tells the compiler to compile and assemble but not link.
Shared libraries, however, are built by the linker, which combines several compiled object files into a shared object. That happens after the BMIs have been built, and the BMIs can be built without linking anything, since those are two different stages.
Module Visibility
In C#, when building a DLL, we have visibility modifiers at class level, i.e. public, private and internal. In C++ we can obtain similar functionality with module partitions.
A module partition, declared with module <module> : <partition>;, is entirely visible inside the module whose primary interface declares export module <module>;, but not outside that module. This resembles internal in C#. If we instead declare the partition with export module <module> : <partition>; and re-export it from the primary interface, its exported declarations become publicly visible. Read more on cppreference.
Example
I have solved that problem with GCC (g++-11), see here.
In essence, you don't need DLL import/export since there are (likely) no headers involved. I have tried inserting these visibility attributes but with complaints from my compiler, so I guess we might not need them after all. Other than that, it's standard procedure. I copy/paste my example here as well:
Main
import <iostream>;
import mathlib;
int main()
{
    int a = 5;
    int b = 6;
    std::cout << "a = " << a << ", b = " << b << '\n';
    std::cout << "a+b = " << mathlib::add(a, b) << '\n';
    std::cout << "a-b = " << mathlib::sub(a, b) << '\n';
    std::cout << "a*b = " << mathlib::mul(a, b) << '\n';
    std::cout << "a/b = " << mathlib::div(a, b) << '\n';
    return 0;
}
Library
export module mathlib;
export namespace mathlib
{
    int add(int a, int b)
    {
        return a + b;
    }
    int sub(int a, int b)
    {
        return a - b;
    }
    int mul(int a, int b)
    {
        return a * b;
    }
    int div(int a, int b)
    {
        return a / b;
    }
}
Makefile
GCC=g++-11 -std=c++20 -fmodules-ts
APP=app
build: std_headers mathlib main
std_headers:
	$(GCC) -xc++-system-header iostream
mathlib: mathlib.cpp
	$(GCC) -c $< -o $@.o
	$(GCC) -shared $@.o -o libmathlib.so
main: main.cpp
	$(GCC) $< -o $(APP) -Xlinker ./libmathlib.so
clean:
	@rm -rf gcm.cache/
	@rm -f *.o
	@rm -f $(APP)
	@rm -f *.so
Running
g++-11 -std=c++20 -fmodules-ts -xc++-system-header iostream
g++-11 -std=c++20 -fmodules-ts -c mathlib.cpp -o mathlib.o
g++-11 -std=c++20 -fmodules-ts -shared mathlib.o -o libmathlib.so
g++-11 -std=c++20 -fmodules-ts main.cpp -o app -Xlinker ./libmathlib.so
./app
a = 5, b = 6
a+b = 11
a-b = -1
a*b = 30
a/b = 0
Now this is clearly platform-specific, but the approach should work on other platforms. I have tested a similar thing with Clang as well (same repo as linked).

C++20 modules have no special relationship with shared libraries. They are primarily a replacement of header files.
This means that you would develop a shared library with C++20 modules in a similar fashion as you would with header files before C++20, at least with my current understanding. You design some API that is exported (unfortunately still using vendor-specific attributes like __declspec(dllexport) or __attribute__((visibility("default")))) and implement it. You build your shared library file (.dll/.so) and an import library for distribution, same way as before. However instead of distributing header files, you would distribute module interface units instead. Module interface units are files containing an export module ABC; declaration at the top.
And executables consuming that shared library would then import that module using import ABC;, instead of #include-ing a header file.
Edit: As was pointed out in the comments, it still seems necessary on Windows to provide a macro switch inside the module interfaces that toggles between the dllexport and dllimport attributes, much as is done with headers. However, I have not experimented with this myself and can only defer to what @jeremyong found in What is the expected relation of C++ modules and dynamic linkage?.
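A macro switch of the kind mentioned in the edit might look roughly like the following. This is only a sketch: the names MATHLIB_API and MATHLIB_BUILDING are made up for illustration, and whether consumers of a module really need the dllimport branch on MSVC is exactly the open question discussed here. In a module interface unit the preprocessor block would sit in the global module fragment (after module; and before export module mathlib;). It is written as ordinary C++ here so that it compiles without module support.

```cpp
// Hypothetical switch: normally the library's own build passes
// -DMATHLIB_BUILDING (or /DMATHLIB_BUILDING) and consumers do not.
// It is defined inline here only to keep the sketch self-contained.
#define MATHLIB_BUILDING 1

#if defined(_WIN32)
#  if defined(MATHLIB_BUILDING)
#    define MATHLIB_API __declspec(dllexport)   // building the DLL
#  else
#    define MATHLIB_API __declspec(dllimport)   // consuming the DLL
#  endif
#else
   // GCC/Clang on ELF or Mach-O: only matters when building with
   // -fvisibility=hidden, since symbols are exported by default otherwise.
#  define MATHLIB_API __attribute__((visibility("default")))
#endif

namespace mathlib {
// Part of the public API of the shared library.
MATHLIB_API int add(int a, int b);
}

// Definition, as it would appear in the library's implementation.
int mathlib::add(int a, int b) { return a + b; }
```

With modules the macro itself is not visible to importers (macros are not exported by named modules), so it only affects how the interface unit is compiled.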

Related

With clang and libstdc++ on Linux, is it currently feasible to use any standard library types in a module interface?

So far it seems to me that including almost any libstdc++ header in a C++ module interface causes compile errors on clang 14.0.0 and the libstdc++ that comes bundled with GCC 11.2.0. I wonder if I am doing something wrong or if this is just not something that is supported yet. (I see that the Clang modules support is "partial", but haven't been able to find what is implemented and what is not.)
Here's a trivial module example that I got to work with clang-14 in Linux, linked with libstdc++. It demonstrates that libstdc++ headers can be used in a module implementation, but this example does not #include anything in the module interface:
// mod_if.cc
export module mod;
export int foo();
// mod.cc
module;
#include <iostream>
module mod;
int foo() {
    std::cout << "Hello world from foo()" << std::endl;
    return 42;
}
// use.cc
import mod;
#include <iostream>
int main() {
    std::cout << foo() << std::endl;
}
This works:
$ CXXFLAGS="-std=c++20 -fmodules -fprebuilt-module-path=prebuilt"
$ clang++ -c $CXXFLAGS -Xclang -emit-module-interface -o prebuilt/mod.pcm mod_if.cc
$ clang++ -c $CXXFLAGS -fmodule-file=prebuilt/mod.pcm mod.cc -o mod.o
$ clang++ $CXXFLAGS use.cc mod.o prebuilt/mod.pcm -o use
$ ./use
Hello world from foo()
42
However, suppose I wanted foo to return a std::string:
// mod_if.cc
module;
#include <string>
export module mod;
export std::string foo();
// mod.cc
module;
#include <string>
module mod;
std::string foo() {
    return "42";
}
// no use.cc needed since the error happens when building mod.cc
This does not compile (first of many similar errors shown):
$ clang++ -c $CXXFLAGS -Xclang -emit-module-interface -o prebuilt/mod.pcm mod_if.cc
$ clang++ -c $CXXFLAGS -fmodule-file=prebuilt/mod.pcm mod.cc -o mod.o
In file included from mod.cc:2:
In file included from /usr/lib64/gcc/x86_64-pc-linux-gnu/11.2.0/../../../../include/c++/11.2.0/string:40:
In file included from /usr/lib64/gcc/x86_64-pc-linux-gnu/11.2.0/../../../../include/c++/11.2.0/bits/char_traits.h:39:
In file included from /usr/lib64/gcc/x86_64-pc-linux-gnu/11.2.0/../../../../include/c++/11.2.0/bits/stl_algobase.h:64:
In file included from /usr/lib64/gcc/x86_64-pc-linux-gnu/11.2.0/../../../../include/c++/11.2.0/bits/stl_pair.h:65:
/usr/lib64/gcc/x86_64-pc-linux-gnu/11.2.0/../../../../include/c++/11.2.0/compare:348:33: error: redefinition of '__cmp_cat_id<std::partial_ordering>'
inline constexpr unsigned __cmp_cat_id<partial_ordering> = 2;
^
/usr/lib64/gcc/x86_64-pc-linux-gnu/11.2.0/../../../../include/c++/11.2.0/bits/stl_pair.h:65:11: note: '/usr/lib64/gcc/x86_64-pc-linux-gnu/11.2.0/../../../../include/c++/11.2.0/compare' included multiple times, additional include site in header from module 'mod.<global>'
# include <compare>
^
/usr/lib64/gcc/x86_64-pc-linux-gnu/11.2.0/../../../../include/c++/11.2.0/bits/stl_pair.h:65:11: note: '/usr/lib64/gcc/x86_64-pc-linux-gnu/11.2.0/../../../../include/c++/11.2.0/compare' included multiple times, additional include site in header from module '<global>'
# include <compare>
^
mod.cc:1:1: note: <global> defined here
module;
^
Is there currently a way to make this code work (without resorting to writing module maps for the libstdc++ headers)? Why does this error happen? It sounds strange that the inline constexpr declaration included in the global module fragment gets exported, but then I don't claim to understand modules well.
Ok, this is something that sort of worked for a large project. Note that this was half a year ago, so the world may have moved on.
I ended up creating a single header, "sys.hh", that #includes pretty much all the system headers used in the project. What seems to be important is that nothing directly or indirectly #included by this file gets #included directly or indirectly (outside the module system) in anything that gets linked into the final binary.
My "sys.hh" looks something like this:
#include <algorithm>
#include <array>
#include <assert.h>
#include <atomic>
#include <bits/std_abs.h>
// 100+ lines omitted, including things like glib, gtk, libjpeg
#include <vector>
#include <x86intrin.h>
#include <zlib.h>
// Macros won't get exported, so whatever the code needs, redefine as
// constexpr (or consteval functions) here. Unfortunately, I don't think
// there's a way to retain the name of the macro; so add an underscore.
// Also put them in a namespace.
#define MEXP(X) constexpr auto X ## _ = X;
namespace sys {
    MEXP(INTENT_PERCEPTUAL);
    MEXP(INTENT_RELATIVE_COLORIMETRIC);
    MEXP(INTENT_SATURATION);
    MEXP(INTENT_ABSOLUTE_COLORIMETRIC);
    MEXP(G_PRIORITY_DEFAULT_IDLE);
}
And my modulemap file contains an entry like this:
module sys {
    header "prebuilt/sys.hh"
    use _Builtin_intrinsics
    export *
}
Compiling this header/module is a bit of an incremental process; you will run into modules that fail to compile because they indirectly include the same headers, so you add them into this file and rebuild until it works.
Note that managing build dependencies becomes much more of a thing with modules. At least half a year ago no good (automatic) ways seemed to exist to discover what needs to be rebuilt. This is made trickier by the fact that the name of the module does not tell where it lives in the source code.

How do I use C++ modules in Clang?

Modules are an alternative to #includes. Clang has a complete implementation for C++. How would I go about if I wanted to use modules using Clang now?
Using
import std.io;
in a C++ source file does not work (compile) yet, as the specification for modules (which includes syntax) isn't final.
The Clang documentation states that, when passing the -fmodules flag, #includes will be rewritten to their appropriate imports. However, checking the preprocessor suggests otherwise (test.cpp only contains #include <stdio.h> and an empty main):
$ clang++-3.5 -fmodules -E test.cpp -o test
$ grep " printf " test
extern int printf (const char *__restrict __format, ...);
Furthermore, compiling this test file with -fmodules vs no flags at all produces the same object file.
What am I doing wrong?
As of this commit, Clang has experimental support for the Modules TS.
Let's take the same example files (with a small change) as in the VS blog post about experimental module support.
First, define the module interface file. By default, Clang recognizes files with cppm extension (and some others) as C++ module interface files.
// file: foo.cppm
export module M;
export int f(int x)
{
    return 2 + x;
}
export double g(double y, int z)
{
    return y * z;
}
Note that the module interface declaration needs to be export module M; and not just module M; like in the VS blog post.
Then consume the module as follows:
// file: bar.cpp
import M;
int main()
{
    f(5);
    g(0.0, 1);
    return 0;
}
Now, precompile the module foo.cppm with
clang++ -fmodules-ts --precompile foo.cppm -o M.pcm
or, if the module interface extension is other than cppm (let's say ixx, as it is with VS), you can use:
clang++ -fmodules-ts --precompile -x c++-module foo.ixx -o M.pcm
Then build the program with
clang++ -fmodules-ts -c M.pcm -o M.o
clang++ -fmodules-ts -fprebuilt-module-path=. M.o bar.cpp
or, if the pcm file name is not the same as the module name, you'd have to use:
clang++ -fmodules-ts -fmodule-file=M.pcm bar.cpp
I've tested these commands on Windows using the r303050 build (15th May 2017).
Note: When using the -fprebuilt-module-path=. option, I get a warning:
clang++.exe: warning: argument unused during compilation: '-fprebuilt-module-path=.' [-Wunused-command-line-argument]
which appears to be incorrect because without that option, the module M is not found.
Like you mentioned, clang does not yet have a C++ syntax for imports, so I doubt that #include directives are going to be literally rewritten as imports when preprocessing a file. Inspecting the preprocessor output may therefore not be the best way to test whether modules are working as intended.
However, if you set -fmodules-cache-path=<path> explicitly, you can observe clang populating it with precompiled module files (*.pcm) during a build - if there are any modules involved.
You'll need to use libc++ (which seems to come with a module.modulemap as of version 3.7.0) if you want to use a modules enabled standard library right now - though in my experience this isn't working entirely just yet.
(Visual Studio 2015's C++ compiler is also supposed to get some form of module support with Update 1 in November)
Independently of the stdlib, you could still use modules in your own code. The clang docs contain a detailed description of the Module Map Language.

Static *template* class member across dynamic library

Edit: the comments below the accepted answer show that it might be an issue with the Android dynamic loader.
I have a header for a template class with a static member. At runtime the address of the static member is used in the library and in the client code. The template is implicitly instantiated both in the library and in the client code. It works fine on Linux and OSX, the symbol is duplicated but marked as "uniqued" as shown by nm (see below).
However when I compile for ARM (Android), the symbol is marked weak in both the DSO and the executable. The loader does not unify and the symbol is effectively duplicated at runtime!
I read these:
two instances of a static member, how could that be?
Static template data members storage
and especially this answer:
https://stackoverflow.com/a/2505528/2077394
and:
http://gcc.gnu.org/wiki/Visibility
but I am still a little bit puzzled. I understand that the visibility attributes help to optimize, but I thought it should work by default. I know the C++ standard does not concern itself with shared libraries, but does that mean that using shared libraries breaks the standard (or at least that this implementation does not conform to the C++ standard)?
Bonus: how can I fix it? (and not using template is not an acceptable answer:))
Header:
template<class T>
struct TemplatedClassWithStatic {
    static int value;
};
template<class T>
int TemplatedClassWithStatic<T>::value = 0;
shared.cpp:
#include "TemplateWithStatic.hpp"
int *addressFromShared() {
    return &TemplatedClassWithStatic<int>::value;
}
main.cpp:
#include "TemplateWithStatic.hpp"
#include <cstdio>
int *addressFromShared();
int main() {
    printf("%p %p\n", addressFromShared(), &TemplatedClassWithStatic<int>::value);
}
And building, looking at the symbols definitions:
producing .so:
g++-4.8 -shared src/shared.cpp -o libshared.so -I include/ -fPIC
compiling and linking main:
g++-4.8 src/main.cpp -I include/ -lshared -L.
symbols are marked as "unique":
nm -C -A *.so a.out | grep 'TemplatedClassWithStatic<int>::value'
libshared.so:0000000000200a70 u TemplatedClassWithStatic<int>::value
a.out:00000000006012b0 u TemplatedClassWithStatic<int>::value
producing .so
~/project/android-ndk-r9/toolchains/arm-linux-androideabi-4.8/prebuilt/darwin-x86_64/bin/arm-linux-androideabi-g++ -o libshared.so src/shared.cpp -I include/ --sysroot=/Users/amini/project/android-ndk-r9/platforms/android-14/arch-arm/ -shared
compiling and linking main
~/project/android-ndk-r9/toolchains/arm-linux-androideabi-4.8/prebuilt/darwin-x86_64/bin/arm-linux-androideabi-g++ src/main.cpp libshared.so -I include/ --sysroot=${HOME}/project/android-ndk-r9/platforms/android-14/arch-arm/ -I ~/project/android-ndk-r9/sources/cxx-stl/gnu-libstdc++/4.8/include -I ~/project/android-ndk-r9/sources/cxx-stl/gnu-libstdc++/4.8/libs/armeabi-v7a/include -I ~/project/android-ndk-r9/sources/cxx-stl/gnu-libstdc++/4.8/include/backward -I ~/project/android-ndk-r9/platforms/android-14/arch-arm/usr/include ~/project/android-ndk-r9/sources/cxx-stl/gnu-libstdc++/4.8/libs/armeabi-v7a/libgnustl_static.a -lgcc
symbols are weak!
nm -C -A *.so a.out | grep 'TemplatedClassWithStatic<int>::value'
libshared.so:00002004 V TemplatedClassWithStatic<int>::value
a.out:00068000 V TemplatedClassWithStatic<int>::value
Edit, note for the context: I was playing with OOLua, a library that helps bind C++ to Lua, and my unit tests were failing when I started to target Android. I don't "own" the code and I would rather not modify it deeply.
Edit, to run it on Android:
adb push libshared.so data/local/tmp/
adb push a.out data/local/tmp/
adb shell "cd data/local/tmp/ ; LD_LIBRARY_PATH=./ ./a.out"
0xb6fd7004 0xb004
Android does not support unique symbols. It is a GNU extension of ELF format that only works with GLIBC 2.11 and above. Android does not use GLIBC at all, it employs a different C runtime called Bionic.
(update) If weak symbols don't work for you (end update) I'm afraid you would have to modify the code such that it does not rely on static data.
There may be some compiler/linker settings that you can tweak to enable this (have you looked at the -fvisibility flag?).
Possibly a GCC attribute modifier may be worth trying (explicitly set __attribute__ ((visibility ("default"))) on the variable).
Failing that, the only workarounds I can suggest are the following (all are somewhat ugly):
1. Explicitly instantiate all forms of the template that are created in the shared library and provide the initializers in its implementation (not in the header). This may or may not work.
2. Like (1), but use a shim function as a Meyers singleton for the shared variable (example below).
3. Allocate a variable in a map for the class based upon RTTI (which might also fail across a shared library boundary).
e.g.
template<class T>
struct TemplatedClassWithStatic {
    // The null T* only selects the matching overload below; it is never dereferenced.
    static int& getValue() { return TemplatedClassWithStatic_getValue(static_cast<T*>(0)); }
};
// Types used by the shared library. These can be forward declarations here, but you run the risk of violating the ODR.
int& TemplatedClassWithStatic_getValue(TypeA*);
int& TemplatedClassWithStatic_getValue(TypeB*);
int& TemplatedClassWithStatic_getValue(TypeC*);
shared.cpp
int& TemplatedClassWithStatic_getValue(TypeA*) {
    static int v = 0;
    return v;
}
int& TemplatedClassWithStatic_getValue(TypeB*) {
    static int v = 0;
    return v;
}
int& TemplatedClassWithStatic_getValue(TypeC*) {
    static int v = 0;
    return v;
}
The executable would also have to provide implementations for any types that it uses to instantiate the template.
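For comparison, workaround (1) from the list can be sketched with an explicit instantiation declaration in the header and a single explicit instantiation definition in the library. This is a variant of what the answer describes (the initializer stays in the header), it has not been tested on Android, and it assumes the symbol is exported from the library with default visibility. The file boundaries are marked with comments; everything is shown in one translation unit so the sketch compiles on its own.

```cpp
// --- TemplateWithStatic.hpp (sketch) ---
template <class T>
struct TemplatedClassWithStatic {
    static int value;
};

template <class T>
int TemplatedClassWithStatic<T>::value = 0;

// Explicit instantiation declaration: every translation unit that includes
// the header is told NOT to instantiate the <int> members itself and to
// expect the definition to come from elsewhere (here: the shared library).
extern template struct TemplatedClassWithStatic<int>;

// --- shared.cpp (sketch) ---
// Explicit instantiation definition: the one and only <int> instance,
// emitted into the shared library.
template struct TemplatedClassWithStatic<int>;

int* addressFromShared() {
    return &TemplatedClassWithStatic<int>::value;
}
```

The executable then refers to an undefined symbol that the dynamic loader resolves against the library, instead of carrying its own weak copy. The cost is that every specialization used across the boundary has to be listed by hand.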

Creating shared libraries in C++ for OSX

I just started programming in C++ and I've realized that I keep having to write the same code over and over again (mostly utility functions).
So, I'm trying to create a shared library and install it in PATH so that I could use the utility functions whenever I needed to.
Here's what I've done so far :-
Create a file utils.h with the following contents :-
#include<iostream>
#include<string>
std::string to_binary(int x);
Create a file utils.cpp with the following contents :-
#include "utils.h"
std::string to_binary(int x) {
    if ( x == 0 ) return "0";
    std::string binary = "";
    while ( x > 0 ) {
        // Prepend each bit so the most significant bit ends up first.
        binary.insert(binary.begin(), ( x & 1 ) ? '1' : '0');
        x >>= 1;
    }
    return binary;
}
Follow the steps mentioned here :-
http://www.techytalk.info/c-cplusplus-library-programming-on-linux-part-two-dynamic-libraries/
Create the library object code : g++ -Wall -fPIC -c utils.cpp
But as the link above is meant for Linux it does not really work on OSX. Could someone suggest reading resources or suggest hints in how I could go about compiling and setting those objects in the path on an OSX machine?
Also, I'm guessing there should be a way to make this cross-platform (i.e. a set of instructions such as a bash script, or a Makefile) so that I can compile this easily across platforms. Any hints on that?
Use the -dynamiclib option to build a dynamic library on OS X:
g++ -dynamiclib -o libutils.dylib utils.cpp
And then use it in your client application:
g++ client.cpp -L/dir/ -lutils
The link you posted is using C and the C compiler. Since you are building C++:
g++ -shared -o libYourLibraryName.so utils.o

Using C++ classes in .so libraries

I'm trying to write a small class library for a C++ course.
I was wondering if it was possible to define a set of classes in my shared object and then using them directly in my main program that demos the library. Are there any tricks involved? I remember reading this long ago (before I started really programming) that C++ classes only worked with MFC .dlls and not plain ones, but that's just the windows side.
C++ classes work fine in .so shared libraries (they also work in non-MFC DLLs on Windows, but that's not really your question). It's actually easier than Windows, because you don't have to explicitly export any symbols from the libraries.
This document will answer most of your questions: http://people.redhat.com/drepper/dsohowto.pdf
The main things to remember are to use the -fPIC option when compiling, and the -shared option when linking. You can find plenty of examples on the net.
My solution/testing
Here's my solution, and it does what I expected.
Code
cat.hh :
#include <string>
class Cat
{
    std::string _name;
public:
    Cat(const std::string & name);
    void speak();
};
cat.cpp :
#include <iostream>
#include <string>
#include "cat.hh"
using namespace std;
Cat::Cat(const string & name):_name(name){}
void Cat::speak()
{
    cout << "Meow! I'm " << _name << endl;
}
main.cpp :
#include <iostream>
#include <string>
#include "cat.hh"
using std::cout;using std::endl;using std::string;
int main()
{
    string name = "Felix";
    cout << "Meet my cat, " << name << "!" << endl;
    Cat kitty(name);
    kitty.speak();
    return 0;
}
Compilation
You compile the shared lib first:
$ g++ -Wall -g -fPIC -c cat.cpp
$ g++ -shared -Wl,-soname,libcat.so.1 -o libcat.so.1 cat.o
Then compile the main executable or C++ program using the classes in the libraries:
$ g++ -Wall -g -c main.cpp
$ g++ -Wall -Wl,-rpath,. -o main main.o libcat.so.1 # -rpath linker option prevents the need to use LD_LIBRARY_PATH when testing
$ ./main
Meet my cat, Felix!
Meow! I'm Felix
$
As I understand it, this is fine so long as all the .so files you link were compiled with the same compiler. Different compilers may mangle symbols differently (or use otherwise incompatible ABIs), in which case linking fails.
That is one of the advantages in using COM on Windows, it defines a standard for putting OOP objects in DLLs. I can compile a DLL using GNU g++ and link it to an EXE compiled with MSVC - or even VB!
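Outside of COM, one common way to sidestep the mangling issue is to expose the class through a small C-linkage interface and keep the C++ types private to the library. The sketch below is only an illustration: the functions cat_create, cat_speak and cat_destroy are hypothetical and not part of the example above, and the Cat class is simplified to return its message instead of printing it.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Ordinary C++ class; callers only ever see an opaque pointer to it.
class Cat {
    std::string name_;
public:
    explicit Cat(const std::string& name) : name_(name) {}
    std::string speak() const { return "Meow! I'm " + name_; }
};

// extern "C" functions get unmangled names, so any compiler (or language)
// that can call C functions can link against them.
extern "C" {

Cat* cat_create(const char* name) { return new Cat(name); }

// Copies the message into buf (NUL-terminated when size > 0) and
// returns the length of the full message.
int cat_speak(const Cat* cat, char* buf, std::size_t size) {
    return std::snprintf(buf, size, "%s", cat->speak().c_str());
}

void cat_destroy(Cat* cat) { delete cat; }

}  // extern "C"
```

A caller would use cat_create("Felix"), pass a char buffer to cat_speak, and release the object with cat_destroy. This only addresses name mangling: standard library types and exceptions still must not cross the boundary, which is why the message is copied into a caller-provided buffer.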