Vector in C++ module causes useless Bad file data GCC output - c++

TL;DR: GCC 11.2.0 (image f7ea55625e09) + C++20 + <vector>'s std::vector<anything> produce useless output. How can I get something I can work with?
Compilation works in:
module cache
single file
separate module file
In module imported at main.cpp:4:1: import mymodule;
mymodule: error: failed to read compiled module: Bad file data eh???????
mymodule: note: compiled module file is 'gcm.cache/mymodule.gcm' exists, 124 912 Bytes
mymodule: fatal error: returning to the gate for a mechanical issue ????????
compilation terminated.
For the fatal (gate) I found only these references (1, 2), from which everything looks okay for my case.
I've tried various simple things with the new C++ modules (C++20, GCC 11.2) and it makes me wonder whether I'm just encountering a compiler bug / missing implementation or not getting something very simple.
Here is a simple C++ code with vector<string>, it compiles just fine with basic flags and outputs what's expected:
# create module cache for system headers
for item in iostream string vector
do
g++ -fmodules-ts -std=c++20 -x c++-system-header $item
done
g++ -Wall -Wextra -Wpedantic -std=c++20 -fmodules-ts main.cpp
// main.cpp
import <iostream>;
import <string>;
import <vector>;
int main() {
std::vector<std::string> vec = std::vector<std::string>{};
vec.push_back("Hello");
vec.push_back("world");
for (auto& item : vec) {
std::cout << item << std::endl;
}
}
$ ./a.out
Hello
world
Here I move the vector creation into a new function, compiles fine, works fine. Still no separate module except for the system headers.
// main.cpp
import <iostream>;
import <string>;
import <vector>;
std::vector<std::string> create() {
std::vector<std::string> vec = std::vector<std::string>{};
vec.push_back("Hello");
vec.push_back("world");
return vec;
}
int main() {
std::vector<std::string> vec = create();
for (auto& item : vec) {
std::cout << item << std::endl;
}
}
And here I move the function to a separate, exported function in a separate module file.
g++ -Wall -Wextra -Wpedantic -std=c++20 -fmodules-ts -c mymodule.cpp
// mymodule.cpp
export module mymodule;
import <string>;
import <vector>;
export std::vector<std::string> create() {
std::vector<std::string> vec = std::vector<std::string>{};
vec.push_back("Hello");
vec.push_back("world");
return vec;
}
which compiles just fine, but when adding to the main.cpp,
import <iostream>;
import <string>;
import <vector>;
import mymodule;
int main() {
std::vector<std::string> vec = create();
for (auto& item : vec) {
std::cout << item << std::endl;
}
}
I get only this:
g++ -Wall -Wextra -Wpedantic -std=c++20 -fmodules-ts mymodule.cpp main.cpp
In module imported at main.cpp:4:1: import mymodule;
mymodule: error: failed to read compiled module: Bad file data eh???????
mymodule: note: compiled module file is 'gcm.cache/mymodule.gcm' exists, 124 912 Bytes
mymodule: fatal error: returning to the gate for a mechanical issue ????????
compilation terminated.
# file gcm.cache/mymodule.gcm
ELF 32-bit LSB no file type, no machine, version 1 (SYSV)
# file gcm.cache/usr/local/include/c++/11.2.0/iostream.gcm
ELF 32-bit LSB no file type, no machine, version 1 (SYSV)
# file a.out
ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, not stripped
And it doesn't seem to be a problem with simple containers nor with the import declarations of the <vector> alone:
// mymodule.cpp
export module mymodule;
import <string>;
import <vector>;
export std::string create() {
return "world";
}
// main.cpp
import <iostream>;
import <string>;
import <vector>;
import mymodule;
int main() {
std::string vec = create();
std::cout << vec << std::endl;
}
And I've tried playing with (and with no real effect):
.cpp vs .mpp extensions, though it shouldn't matter
clearing the cache before each compilation (ref)
global module fragment (7) placement i.e. module; [stuff]; export module mymodule;
compiling separately (-c mymodule.cpp + -c main.cpp) to link manually (fails the same way on main.cpp)
(re)moving export from the function
not calling the function, just importing (import mymodule; to trigger compilation)
switching from std::vector<std::string> to std::vector<int> to see whether the template's argument list causes the problem
switching from std::vector<std::string> to std::pair<int, int> and then to std::pair<std::string, std::string> (with <utility> header module cache) to see whether just <vector> is broken for me
And it looks like the <vector> header is causing the problem. Any ideas how I can pry open GCC to give me something better than "naaah, can't do"? At the least I can generate assembly with -S (3k+ lines) or use hexdump / objdump on gcm.cache/mymodule.gcm and look at the binary, but I'm not sure what to look for because of the useless output.
Edit: It looks like a problem with the architectures perhaps?
using -m64 does nothing for the module cache, remains 32bit
using -m32 (apk install -y g++-multilib on 64bit) returns the same output
Edit 2: So I rewrote it a bit to make it compatible with Clang 12 (b978a93) with the help of this article, but it's not 1:1 and is rather a butchering (see the string_view note). Maybe I'm not seeing the broader picture or something is missing.
I don't think I should be including <string_view> though as that should have been included automatically. Otherwise even if I write my module, I can just start copy-pasting every #include from the implementation until there's none left so I can ensure the file order (then again what'd be the module's point).
// mymodule.cpp
module;
// no proper "import" available yet, so switching to the old includes
#include <string>
#include <vector>
#include <iostream>
export module mymodule;
export std::vector<std::string> create() {
std::vector<std::string> vec = std::vector<std::string>{};
vec.push_back("Hello");
vec.push_back("world");
return vec;
}
// need to wrap printing for a string_view for some reason
export void printme(std::string &item) {
std::cout << item << std::endl;
}
// main.cpp
// includes needed for "auto" and usage of those types
// and were needed also for print(create()) call
// so something seems broken over here
#include <vector>
#include <string>
import mymodule;
int main() {
auto vec = create();
for (std::string item : vec) {
// string_view:142:2: note: declaration of
// 'basic_string_view<_CharT, _Traits>' does not match
// std::cout << item << std::endl;
printme(item);
}
}
clang++ -std=c++20 -c mymodule.cpp -Xclang -emit-module-interface -fimplicit-modules -fimplicit-module-maps -o mymodule.pcm
clang++ -std=c++20 -fprebuilt-module-path=. -fimplicit-modules -fimplicit-module-maps mymodule.cpp main.cpp
So the issue seems to be GCC specific and most likely is a bug judging by the architecture switching (hardcoding/wrong code branch in GCC?). Maybe worth revisiting after >11.2.0.

I'm writing a shared library using C++20 modules (gcc 11.2.0) and get the same "Bad file data" error for std::vector with a custom type. In my case, as already mentioned by @balázs-Árva, this error occurs for most containers: maps, lists, etc.
Following your references I found this issue and now have a temporary solution with just this line in the module:
namespace std _GLIBCXX_VISIBILITY(default){}
This works in my library with other containers. I've looked for a proper solution using compiler options but haven't had any success.
Files / commands
// mymodule.cpp
export module mymodule;
import <string>;
import <vector>;
namespace std _GLIBCXX_VISIBILITY(default){}
export std::vector<std::string> create() {
std::vector<std::string> vec = std::vector<std::string>{};
vec.push_back("Hello");
vec.push_back("world");
return vec;
}
// main.cpp
import <iostream>;
import <string>;
import <vector>;
import mymodule;
int main() {
std::vector<std::string> vec = create();
for (auto& item : vec) {
std::cout << item << std::endl;
}
}
g++ -fmodules-ts -std=c++20 -x c++-system-header iostream
g++ -fmodules-ts -std=c++20 -x c++-system-header string
g++ -fmodules-ts -std=c++20 -x c++-system-header vector
g++ -fmodules-ts -std=c++20 mymodule.cpp main.cpp
$ ./a.out
Hello
world
P.S. I can't comment, so I posted this as an answer; I hope it helps.

Related

std::filesystem::directory_iterator fails listing directories in the folder from which the program is run

I wrote this super simple function:
namespace fs = std::filesystem;
void findFiles()
{
for (auto const& dir_entry : fs::directory_iterator("C:/Users/xxx/Desktop/dev/test-a"))
{
std::cout << dir_entry.path().string() << '\n';
}
}
My file structure is like this:
C:/Users/xxx/Desktop/dev/test-a/toto
C:/Users/xxx/Desktop/dev/test-b/titi
I run the compiled code from test-b. When I use:
directory_iterator("C:/Users/xxx/Desktop/dev/test-a"))
I get toto as a result (as expected).
If I use this instead:
directory_iterator("C:/Users/xxx/Desktop/dev/test-b"))
There's no output, whereas I should get titi. Any idea why?
Compiled with: clang++ -o xxx xxx.cpp -std=c++20
clang version 13.0.1
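No confirmed cause is given for this question. As a first diagnostic step, one could switch to the non-throwing overloads of std::filesystem, which report failures through a std::error_code instead of an exception. The sketch below is only an assumption about how to narrow the problem down; listEntries is a hypothetical helper name, not part of the original code.

```cpp
#include <cassert>
#include <filesystem>
#include <iostream>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper (not from the question): returns the entry names found
// in `dir`, printing any error reported by the filesystem library.
std::vector<std::string> listEntries(const fs::path& dir)
{
    assert(!dir.empty());  // precondition: caller passes a non-empty path

    std::vector<std::string> names;
    std::error_code ec;

    // Non-throwing constructor: on failure `ec` is set instead of throwing.
    fs::directory_iterator it(dir, ec);
    if (ec) {
        std::cerr << "cannot open " << dir << ": " << ec.message() << '\n';
        return names;
    }

    const fs::directory_iterator end;
    while (it != end) {
        names.push_back(it->path().filename().string());
        it.increment(ec);  // non-throwing advance to the next entry
        if (ec) {
            std::cerr << "iteration stopped: " << ec.message() << '\n';
            break;
        }
    }
    return names;
}
```

Calling listEntries on both test-a and test-b, and printing fs::current_path() alongside, would show whether the directory fails to open, whether iteration ends early, or whether it really appears empty to the program.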

Using constexpr and string_view in module

Modern C++ offers constexpr and std::string_view as a convenient alternative to string literals. However, I am unable to link to a "constexpr std::string_view" within a module. By contrast, I am able to use string_view (not constexpr) within the module as well as a "constexpr std::string_view" outside of the module. Furthermore, the problem does not occur for other uses of constexpr within the module, such as for integers.
Below is the minimal code to recreate the error:
module interface unit (my_string.cpp):
export module my_string;
import <string_view>;
export namespace my_string {
struct MyString {
static std::string_view string_at_runtime;
static constexpr std::string_view string_at_compilation{"Hello World at compilation (inside module)"};
static constexpr int number_at_compilation{1};
};
}
module implementation unit (my_string_impl.cpp):
module;
module my_string;
namespace my_string {
std::string_view MyString::string_at_runtime = "Hello World at runtime";
}
hello_world.cpp:
import <iostream>;
import <string_view>;
import my_string;
static constexpr std::string_view hello_world{"Hello World at compilation (outside module)"};
int main(){
std::cout << hello_world << std::endl;
std::cout << my_string::MyString::string_at_runtime << std::endl;
std::cout << my_string::MyString::number_at_compilation << std::endl;
std::cout << my_string::MyString::string_at_compilation << std::endl; //<-- ERROR is here
}
Compilation and attempt to link (using gcc 11.2.0 running on Linux):
g++ -c -fmodules-ts -std=c++20 -xc++-system-header iostream string_view
g++ -c my_string.cpp -fmodules-ts -std=c++20
g++ -c my_string_impl.cpp -fmodules-ts -std=c++20
g++ -c hello_world.cpp -fmodules-ts -std=c++20
g++ -o main my_string.o my_string_impl.o hello_world.o -fmodules-ts -std=c++20
Only the last instruction (linking) results in an error:
g++ -o main my_string.o my_string_impl.o hello_world.o -fmodules-ts -std=c++20
/usr/bin/ld: hello_world.o: in function `main':
hello_world.cpp:(.text+0x97): undefined reference to `my_string::MyString::string_at_compilation'
collect2: error: ld returned 1 exit status
In addition to searching the answers to related questions (e.g. constexpr in modules and constexpr and namespaces) I have reread the GCC Wiki. I have not yet tried this code with other compilers (e.g. clang, msvc).
Is this error a failure in my code or a not-yet-implemented feature in gcc?
I have found a workaround: adding a getter method.
In the module interface unit (my_string.cpp) add:
static std::string_view GetStringAtCompilation();
In the module implementation unit (my_string_impl.cpp) add:
std::string_view MyString::GetStringAtCompilation(){
return string_at_compilation;
}
And now the following line in the "main" function (see hello_world.cpp in question) compiles, links and executes without error:
std::cout << my_string::MyString::GetStringAtCompilation() << std::endl;
I believe the reason the original attempt did not work without a getter method is given in this answer about constexpr and string_view in headers.
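For comparison, here is a minimal single-file sketch without modules (an assumption for illustration, not the build from the question). Since C++17 a static constexpr data member is implicitly inline, so odr-using it normally needs no out-of-class definition. That suggests, though it does not prove, that the link error is specific to how GCC 11 handles the member in a module interface. The names odrUse and checkStrings are hypothetical helpers.

```cpp
#include <cassert>
#include <string_view>

struct MyString {
    // Implicitly inline since C++17: this declaration is also the definition.
    static constexpr std::string_view string_at_compilation{"Hello World at compilation"};

    // The getter workaround, defined in-class here for brevity.
    static std::string_view GetStringAtCompilation() { return string_at_compilation; }
};

// Hypothetical helper: binding a reference odr-uses the member, which is the
// kind of use that needs a definition to exist at link time.
const std::string_view& odrUse() { return MyString::string_at_compilation; }

// Compile-time check that the value is usable in constant expressions.
static_assert(MyString::string_at_compilation.size() == 26);

// Runtime check: the odr-used member and the getter see the same text.
void checkStrings() { assert(odrUse() == MyString::GetStringAtCompilation()); }
```

If this plain translation unit links while the module version does not, the getter remains a reasonable workaround until the compiler's module support matures.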

C++ gcc Namespace not found

I know there are many similar topics but there are equally many unique mistakes that may lead to this problem (so I think). Therefore I ask, after some research.
My problem is that the compiler, GNU GCC, when compiling one file does not see my namespace declared in another file. The IDE (CodeBlocks) evidently does see it as it auto-completes the name of the namespace. I tried to isolate the problem and came up with this:
File main.cpp:
namespace MyName
{
int MyVar;
}
#include "T1.cpp"
int main()
{
return 0;
}
File T1.cpp:
using namespace MyName;
error: 'MyName' is not a name-space name.
In my project I have a header file, say T1.h, and an implementation file T1.cpp — and MyName isn't accessible in either of them.
Any help or guidance would be appreciated.
What's happening is that CodeBlocks is compiling both main.cpp and T1.cpp. Here is what happens when you try to compile each one:
main.cpp:
$ g++ main.cpp
$
T1.cpp
$ g++ T1.cpp
T1.cpp:1:17: error: ‘MyName’ is not a namespace-name
using namespace MyName;
^
T1.cpp:1:23: error: expected namespace-name before ‘;’ token
using namespace MyName;
^
$
T1.cpp, when compiled on its own, has no knowledge of MyName. To fix this, don't include .cpp files, and put your declarations in header files.
Edit: From what I gather, this may be a better way to organize your example:
T1.h:
namespace MyName {
extern int MyVar;
}
T1.cpp
#include "T1.h"
int MyName::MyVar = 5;
main.cpp
#include "T1.h"
#include <iostream>
using namespace MyName;
int main()
{
std::cout << MyVar << std::endl;
return 0;
}
Now it will compile correctly:
$ g++ -c T1.cpp -o T1.o
$ g++ -c main.cpp -o main.o
$ g++ T1.o main.o
$ ./a.out
5

How to pass arguments to a method loaded from a static library in CPP

I'm trying to write a program that uses a static library built from one C++ source file in another C++ program. The first C++ code is hello.cpp:
#include <iostream>
#include <string.h>
using namespace std;
extern "C" void say_hello(const char* name) {
cout << "Hello " << name << "!\n";
}
int main(){
return 0;
}
Then I made a static library from this code, hello.a, using this command:
g++ -o hello.a -static -fPIC hello.cpp -ldl
Here's the second C++ code to use the library, say_hello.cpp:
#include <iostream>
#include <string>
#include <dlfcn.h>
using namespace std;
int main(){
void* handle = dlopen("./hello.a", RTLD_LAZY);
cout<<handle<<"\n";
if (!handle) {
cerr<<"Cannot open library: "<<dlerror()<<'\n';
return 1;
}
typedef void (*hello_t)();
dlerror(); // reset errors
hello_t say_hello = (hello_t) dlsym(handle, "say_hello");
const char *dlsym_error = dlerror();
if (dlsym_error) {
cerr<<"Cannot load symbol 'say_hello': "<<dlsym_error<<'\n';
dlclose(handle);
return 1;
}
say_hello("World");
dlclose(handle);
return 0;
}
Then I compiled say_hello.cpp using:
g++ -W -ldl say_hello.cpp -o say_hello
and ran ./say_hello in the command line. I expected to get Hello World! as output, but I got this instead:
0x8ea4020
Hello ▒▒▒▒!
What is the problem? Is there some trick for making the function's argument compatible, like what we do with ctypes?
If it helps I use a lenny.
EDIT 1:
I have changed the code and used a dynamic library, 'hello.so', which I've created using this command:
g++ -o hello.so -shared -fPIC hello.cpp -ldl
The 6th line of the code changed to:
void* handle = dlopen("./hello.so", RTLD_LAZY);
When I tried to compile say_hello.cpp, I got this error:
say_hello.cpp: In function ‘int main()’:
say_hello.cpp:21: error: too many arguments to function
I also tried to compile it using this line:
g++ -Wall -rdynamic say_hello.cpp -ldl -o say_hello
But the same error was raised. So I removed the argument "World", and then it compiled with no error; but when I run the executable, I get the same output as I mentioned before.
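The "too many arguments" error comes from the declaration typedef void (*hello_t)();, which describes a pointer to a function taking no parameters, while say_hello takes a const char*. A minimal sketch of a matching declaration follows. It deliberately avoids dlopen; fake_dlsym, last_greeting and call_through_void_pointer are hypothetical stand-ins. The cast between void* and a function pointer is only conditionally supported by the C++ standard, although POSIX systems that provide dlsym rely on it.

```cpp
#include <cassert>
#include <string>

// Records the last greeting so the call can be checked without reading stdout.
static std::string last_greeting;

// Stand-in for the function exported by hello.so (same signature as in the question).
extern "C" void say_hello(const char* name)
{
    last_greeting = std::string("Hello ") + name + "!";
}

// Function type matching say_hello: one const char* parameter, no return value.
using hello_sig = void(const char*);

// Hypothetical stand-in for dlsym: hands back the address as an untyped void*.
void* fake_dlsym()
{
    return reinterpret_cast<void*>(&say_hello);
}

std::string call_through_void_pointer(const char* name)
{
    void* address = fake_dlsym();
    assert(address != nullptr);

    // Cast back to the real signature before calling; with void (*)() the
    // call fun(name) would be rejected as having too many arguments.
    hello_sig* fun = reinterpret_cast<hello_sig*>(address);
    fun(name);
    return last_greeting;
}
```

Removing the argument to silence the compiler only hides the mismatch: the function still reads a parameter that was never passed, which would explain the garbage after "Hello".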
EDIT 2:
Based on @Basile Starynkevitch's suggestions, I changed my say_hello.cpp code to this:
#include <iostream>
#include <string>
#include <dlfcn.h>
using namespace std;
int main(){
void* handle = dlopen("./hello.so", RTLD_LAZY);
cout<<handle<<"\n";
if (!handle) {
cerr<<"Cannot open library: "<<dlerror()<<'\n';
return 1;
}
typedef void hello_sig(const char *);
void* hello_ad = dlsym(handle, "say_hello");
if (!hello_ad){
cerr<<"dlsym failed:"<<dlerror()<<endl;
return 1;
}
hello_sig* fun = reinterpret_cast<hello_sig*>(hello_ad);
fun("from main");
fun = NULL;
hello_ad = NULL;
dlclose(handle);
return 0;
}
Before that, I used below line to make a .so file:
g++ -Wall -fPIC -g -shared hello.cpp -o hello.so
Then I compiled say_hello.cpp with this command:
g++ -Wall -rdynamic -g say_hello.cpp -ldl -o say_hello
And then ran it using ./say_hello. Now everything works. Thanks to @Basile Starynkevitch for being patient with my problem.
Functions never have null addresses, so dlsym on a function name (or actually on any name defined in C++ or C) cannot be NULL without failing:
hello_t say_hello = (hello_t) dlsym(handle, "say_hello");
if (!say_hello) {
cerr<<"Cannot load symbol 'say_hello': "<<dlerror()<<endl;
exit(EXIT_FAILURE);
};
And dlopen(3) is documented to dynamically load only dynamic libraries (not static ones!). This implies shared objects (*.so) in ELF format. Read Drepper's paper How To Write Shared Libraries.
I believe you might have found a bug in dlopen (see also its POSIX dlopen specification); it should fail for a static library hello.a; it is always used on position independent shared libraries (like hello.so).
You should dlopen only position independent code shared objects compiled with
g++ -Wall -O -shared -fPIC hello.cpp -o hello.so
or if you have several C++ source files:
g++ -Wall -O -fPIC src1.cc -c -o src1.pic.o
g++ -Wall -O -fPIC src2.cc -c -o src2.pic.o
g++ -shared src1.pic.o src2.pic.o -o yourdynlib.so
You could remove the -O optimization flag, add -g for debugging, or replace it with -O2 if you want.
This works extremely well: my MELT project (a domain specific language to extend GCC) uses this a lot (generating C++ code, forking a compilation like above on the fly, then dlopen-ing the resulting shared object). And my manydl.c example demonstrates that you can dlopen a great many (different) shared objects on Linux (typically millions, and hundreds of thousands at least). Actually the limitation is the address space.
BTW, you should not dlopen something having a main function, since main is by definition defined in the main program calling (perhaps indirectly) dlopen.
Also, order of arguments to g++ matters a lot; you should compile the main program with
g++ -Wall -rdynamic say_hello.cpp -ldl -o say_hello
The -rdynamic flag is required to let the loaded plugin (hello.so) call functions from inside your say_hello program.
For debugging purposes always pass -Wall -g to g++ above.
BTW, you could in principle dlopen a shared object which isn't PIC (i.e. was not compiled with -fPIC); but it is much better to dlopen some PIC shared object.
Read also the Program Library HowTo and the C++ dlopen mini-howto (because of name mangling).
example
File helloshared.cc (my tiny plugin source code in C++) is
#include <iostream>
#include <string.h>
using namespace std;
extern "C" void say_hello(const char* name) {
cout << __FILE__ << ":" << __LINE__ << " hello "
<< name << "!" << endl;
}
and I am compiling it with:
g++ -Wall -fPIC -g -shared helloshared.cc -o hello.so
The main program is in file mainhello.cc :
#include <iostream>
#include <string>
#include <dlfcn.h>
#include <stdlib.h>
using namespace std;
int main() {
cout << __FILE__ << ":" << __LINE__ << " starting." << endl;
void* handle = dlopen("./hello.so", RTLD_LAZY);
if (!handle) {
cerr << "dlopen failed:" << dlerror() << endl;
exit(EXIT_FAILURE);
};
// signature of loaded function
typedef void hello_sig_t(const char*);
void* hello_ad = dlsym(handle,"say_hello");
if (!hello_ad) {
cerr << "dlsym failed:" << dlerror() << endl;
exit(EXIT_FAILURE);
}
hello_sig_t* fun = reinterpret_cast<hello_sig_t*>(hello_ad);
fun("from main");
fun = NULL; hello_ad = NULL;
dlclose(handle);
cout << __FILE__ << ":" << __LINE__ << " ended." << endl;
return 0;
}
which I compile with
g++ -Wall -rdynamic -g mainhello.cc -ldl -o mainhello
Then I am running ./mainhello with the expected output:
mainhello.cc:7 starting.
helloshared.cc:5 hello from main!
mainhello.cc:24 ended.
Please notice that the signature hello_sig_t in mainhello.cc should be compatible (homomorphic, i.e. the same as) with the function say_hello of the helloshared.cc plugin, otherwise it is undefined behavior (and you probably would have a SIGSEGV crash).

Writing/Using C++ Libraries

I am looking for basic examples/tutorials on:
How to write/compile libraries in C++ (.so files for Linux, .dll files for Windows).
How to import and use those libraries in other code.
The code
r.cc :
#include "t.h"
int main()
{
f();
return 0;
}
t.h :
void f();
t.cc :
#include<iostream>
#include "t.h"
void f()
{
std::cout << "OH HAI. I'M F." << std::endl;
}
But how, how, how?!
~$ g++ -fpic -c t.cc # get t.o
~$ g++ -shared -o t.so t.o # get t.so
~$ export LD_LIBRARY_PATH="." # make sure t.so is found when dynamically linked
~$ g++ r.cc t.so # get an executable
The export step is not needed if you install the shared library somewhere along the global library path.