I was trying to write a basic example of opening a shared library and calling a function from it for practice, but I always get a "segmentation fault" when the executable actually runs. Here is the source code:
main.cpp:
#include<iostream>
#include<dlfcn.h>
using namespace std;
typedef void (*API)(unsigned int);
int main(int argc,char** argv){
void* dl;
API api;
unsigned int tmp;
//...
dl=dlopen("pluginA.so",RTLD_LAZY);
api=(API)dlsym(dl,"API");
cin>>tmp;
(*api)(tmp);
dlclose(dl);
//...
return 0;
}
pluginA.cpp:
#include<iostream>
using namespace std;
extern "C" void API(unsigned int N){switch(N){
case 0:cout<<"1\n"<<flush;break;
case 1:cout<<"2\n"<<flush;break;
case 2:cout<<"4\n"<<flush;break;
case 4:cout<<"16\n"<<flush;break;}}
I compiled the two parts with the following commands:
g++ -shared -o pluginA.so -fPIC plugin.cpp
g++ main.cpp -ldl
Here is the output
Segmentation fault (core dumped)
BTW, I also tried calling api(tmp) directly rather than (*api)(tmp), and that doesn't work either. Since api is a pointer, does (*api) make more sense?
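On the api(tmp) versus (*api)(tmp) point: in C and C++ the two spellings are equivalent, because the call operator accepts a pointer to function and dereferencing the pointer just yields the function again. A minimal sketch, where twice, call_plain and call_deref are made-up names used only for illustration:

```cpp
// Same typedef style as in the question, but returning a value so the
// result can be compared.
typedef unsigned int (*API)(unsigned int);

// Hypothetical stand-in for a function that would normally come from dlsym().
unsigned int twice(unsigned int n) { return 2 * n; }

// Calls through the pointer without an explicit dereference.
unsigned int call_plain(API api, unsigned int n) { return api(n); }

// Calls through the pointer with an explicit dereference.
unsigned int call_deref(API api, unsigned int n) { return (*api)(n); }
```

Both helpers compile to the same call, so the choice between them is purely a matter of style and cannot be the cause of a crash.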
I'm not sure what I should do. There are many tutorials online about calling functions in a shared library, but most of them aren't fully coded, or they don't actually work.
I'm also not sure what to do with "__attribute__((visibility("default")))". Should I even write it?
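As for the visibility attribute asked about above: with GCC's default settings every non-static symbol in a shared object is already exported, so the attribute only matters when the plugin is compiled with -fvisibility=hidden. A sketch assuming GCC or Clang; the macro PLUGIN_EXPORT and the function plugin_pow2 are illustrative names, not part of the original code:

```cpp
// Assumption: GCC or Clang. PLUGIN_EXPORT is a hypothetical convenience macro.
#define PLUGIN_EXPORT __attribute__((visibility("default")))

// Stays visible to dlsym() even if the plugin is built with -fvisibility=hidden.
// Returns 2^n; n is assumed to be smaller than the bit width of unsigned int.
extern "C" PLUGIN_EXPORT unsigned int plugin_pow2(unsigned int n) {
    return 1u << n;
}
```

For the small example in the question the attribute can simply be left out, since nothing there hides symbols in the first place.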
EDIT 1
Thanks for all the advice. I finally found out that the whole problem was a typo in the compile command: I mistakenly typed pluginA.o instead of pluginA.so, and that's why it didn't work.
Anyway, here is my revised program, with error handling added and a more complete plugin system:
main.cpp:
#include<dirent.h>
#include<dlfcn.h>
#include<iostream>
#include<cstring>
using namespace std;
typedef bool (*DLAPI)(unsigned int);
int main(){
DIR* dldir=opendir("dl");
struct dirent* dldirf;
void* dl[255];
DLAPI dlapi[255];
unsigned char i,dlc=0;
char dldirfname[255]="./dl/";
unsigned int n;
while((dldirf=readdir(dldir))!=NULL){
if(dldirf->d_name[0]=='.')continue;
strcat(dldirfname,dldirf->d_name);
dl[dlc]=dlopen(dldirfname,RTLD_LAZY);
if(!dl[dlc])cout<<dlerror()<<endl;else{
dlapi[dlc]=(DLAPI)dlsym(dl[dlc],"API");
if(!dlapi[dlc])cout<<dlerror()<<endl;else dlc++;}
dldirfname[5]='\0';}
if(dlc==0){
cerr<<"ERROR:NO DL LOADED"<<endl;
return -1;}
while(true){
cin>>n;
for(i=0;i<dlc;i++)if((*dlapi[i])(n))break;
if(i==dlc)cout<<"NOT FOUND"<<endl;}
for(i=0;i<dlc;i++)dlclose(dl[i]);
return 0;}
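One thing to watch in the revised loader: the fixed-size dldirfname buffer combined with strcat can overflow when a file name is long. A possible alternative is sketched here with std::string and std::vector; the helper name list_plugin_paths is hypothetical:

```cpp
#include <dirent.h>
#include <string>
#include <vector>

// Collects "<dir>/<name>" for every non-hidden entry in dir.
// Returns an empty vector if the directory cannot be opened.
std::vector<std::string> list_plugin_paths(const std::string& dir) {
    std::vector<std::string> paths;
    DIR* d = opendir(dir.c_str());
    if (!d) return paths;
    while (struct dirent* entry = readdir(d)) {
        if (entry->d_name[0] == '.') continue;  // skip ".", ".." and dotfiles
        paths.push_back(dir + "/" + entry->d_name);
    }
    closedir(d);
    return paths;
}
```

Each returned path can then be passed to dlopen via c_str(), and a vector of handles removes the 255-entry limit as well.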
You should read the documentation of dlopen(3) and dlsym, and you should always handle failure. So code it like this:
dl=dlopen("./pluginA.so",RTLD_LAZY);
if (!dl) { fprintf(stderr, "dlopen failure: %s\n", dlerror());
exit (EXIT_FAILURE); };
api=(API)dlsym(dl,"API");
if (!api) { fprintf(stderr, "dlsym failure: %s\n", dlerror());
exit (EXIT_FAILURE); };
The documentation of dlopen explains why you want to pass ./pluginA.so with a ./ prefix.
Finally, you should always compile with all warnings and debug info, so:
g++ -Wall -Wextra -g -shared -o pluginA.so -fPIC plugin.cpp
g++ -Wall -Wextra -g -rdynamic main.cpp -ldl
(It is useful to link the main program with -rdynamic so that the plugin can access its symbols.)
You may want to call dlclose(dl) just before the end of main (calling or returning from a dlsym-ed function will crash your program if you dlclose too early). You might even skip the dlclose entirely (i.e. accept some resource leak). In my experience you can usually dlopen many hundreds of thousands of shared objects (see my manydl.c).
Only once your program is debugged should you add an optimization flag such as -O or -O2 (and perhaps remove the debugging flag -g, but I don't recommend that for beginners).
You should perhaps read Drepper's paper: How To Write Shared Libraries.
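The failure handling and the dlclose timing discussed above can be bundled into a small RAII wrapper. This is only a sketch under the assumption that C++11 is available; DlCloser, LibraryHandle, open_library and find_symbol are hypothetical names, not part of any library:

```cpp
#include <dlfcn.h>
#include <memory>
#include <stdexcept>
#include <string>

// Deleter that closes the handle when its owner goes out of scope.
struct DlCloser {
    void operator()(void* handle) const { if (handle) dlclose(handle); }
};
typedef std::unique_ptr<void, DlCloser> LibraryHandle;

// Opens a shared object or throws with the text reported by dlerror().
LibraryHandle open_library(const std::string& path) {
    void* handle = dlopen(path.c_str(), RTLD_LAZY);
    if (!handle) {
        const char* msg = dlerror();
        throw std::runtime_error(msg ? msg : "dlopen failed");
    }
    return LibraryHandle(handle);
}

// Looks up a symbol or throws with the text reported by dlerror().
void* find_symbol(const LibraryHandle& lib, const char* name) {
    dlerror();  // clear any stale error before the lookup
    void* sym = dlsym(lib.get(), name);
    const char* msg = dlerror();
    if (msg) throw std::runtime_error(msg);
    return sym;
}
```

Because the library is closed when the LibraryHandle is destroyed, the handle has to stay alive for as long as any function pointer obtained from it may still be called.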
I corrected your code a bit and added error checking. Try it to get an idea of what's going on:
#include<iostream>
#include<dlfcn.h>
using namespace std;
typedef void (*API)(unsigned int);
int main(int argc,char** argv)
{
API api;
unsigned int tmp;
//...
void* handle = dlopen("pluginA.so", RTLD_LAZY);
if (!handle)
{
std::cerr << dlerror() << std::endl;
return 1;
}
dlerror();
api = reinterpret_cast<API>(dlsym(handle, "API"));
if (!api)
{
std::cerr << dlerror() << std::endl;
return 2;
}
cin>>tmp;
(*api)(tmp);
dlclose(handle);
//...
return 0;
}
Finally, why did it fail? Use the right path: "./pluginA.so", not "pluginA.so", or give the full path to your plugin.
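The reason the prefix matters is that dlopen treats a name containing a slash as a pathname, while a bare file name is looked up in the dynamic linker's library search path, which normally does not include the current directory. A minimal sketch of a helper that applies this rule; the name plugin_path is made up for illustration:

```cpp
#include <string>

// Turns a bare file name into a path relative to the current directory.
// Names that already contain '/' are returned unchanged.
std::string plugin_path(const std::string& name) {
    if (name.find('/') != std::string::npos) return name;  // already a path
    return "./" + name;
}
```

With this, dlopen(plugin_path("pluginA.so").c_str(), RTLD_LAZY) looks in the working directory instead of the system library directories.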
I have been stuck on a problem for several days when using dlopen/dlsym to work with a shared object. I desperately need your help!
There are three header/source files: Animal.h, Animal.cpp and test.cpp, and two products: libanimal.so and a.out.
Here is the source code:
Animal.h:
// Animal.h
#ifndef STDLIB_ANIMAL_H
#define STDLIB_ANIMAL_H
#include <string>
class Animal {
public:
Animal();
std::string shout();
};
extern "C" Animal* createAnimal();
#endif //STDLIB_ANIMAL_H
Animal.cpp
// Animal.cpp
#include "Animal.h"
Animal::Animal() = default;
std::string Animal::shout() {
return "WOW";
}
Animal* createAnimal() {
return new Animal();
}
and the last one, test.cpp, which contains the main entry point of the application:
// test.cpp
#include <cstdio>
#include <cstdlib>
#include <string>
#include <dlfcn.h>
#include <iostream>
#include "Animal.h"
int main(int argc, char** argv)
{
void *handle;
// Open shared library
handle = dlopen("./libanimal.so", RTLD_LAZY);
if (!handle) {
/* fail to load the library */
fprintf(stderr, "Handle Error: %s\n", dlerror());
return EXIT_FAILURE;
}
auto spawnAnimal = reinterpret_cast<Animal* (*)()>(dlsym(handle, "createAnimal"));
if (!spawnAnimal) {
/* no such symbol */
fprintf(stderr, "No Such Symbol Error: %s\n", dlerror());
dlclose(handle);
return EXIT_FAILURE;
}
auto animal = spawnAnimal();
std::cout << "call animal shout, the address of animal : " << animal <<std::endl;
std::cout << animal->shout() << std::endl;
std::cout << "after calling" << std::endl;
dlclose(handle);
return EXIT_SUCCESS;
}
First, I built the libanimal.so shared library with the following command:
$ g++ -shared -fPIC -Wall -o libanimal.so Animal.cpp -std=c++0x
Then I built the executable, a.out:
$ g++ -Wall test.cpp -ldl -std=c++0x -Wl,--unresolved-symbols=ignore-in-object-files
# --unresolved-symbols=ignore-in-object-files: tell the linker to stop complaining about missing symbols
Both of the above commands ran without any error, but when I execute a.out, an error occurs:
$ ./a.out
call animal shout, the address of animal : 0x7fffe97a4520
Segmentation fault (core dumped)
I have searched lots of blogs and documentation, but all of them show only a really simple example:
The function they export always returns something like a std::string, char or int, never an object of a user-defined class as in the example above (the createAnimal function).
So what's wrong here? Could you please help me figure it out?
Thank you !
BTW, my environment is:
$ g++ --version
g++ (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
OK, my suspicion from the comment has been confirmed. The problem is the early binding of animal->shout(), which resolves at compile time to a call to _ZN6Animal5shoutB5cxx11Ev, and this symbol is nowhere to be found.
If I change it to virtual (late binding, resolved at runtime through a vtable):
// Animal.h
#ifndef STDLIB_ANIMAL_H
#define STDLIB_ANIMAL_H
#include <string>
class Animal {
public:
Animal();
virtual std::string shout();
virtual ~Animal() {} // always put a virtual destructor if you have any virtual methods
};
extern "C" Animal* createAnimal();
#endif //STDLIB_ANIMAL_H
Now it runs.
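A common variant of this fix is to expose only a pure virtual interface plus extern "C" create and destroy functions, so the host never needs any symbol from the plugin's class at link time. The sketch below is an assumption-laden illustration rather than the original code: IAnimal, Dog and destroyAnimal are invented names, and interface and implementation are kept in one file only for brevity.

```cpp
#include <string>

// Interface shared by host and plugin. All calls go through the vtable,
// so the host has no link-time dependency on the plugin.
class IAnimal {
public:
    virtual std::string shout() = 0;
    virtual ~IAnimal() {}
};

// --- the part below would live in the plugin (e.g. libanimal.so) ---
class Dog : public IAnimal {
public:
    std::string shout() override { return "WOW"; }
};

extern "C" IAnimal* createAnimal() { return new Dog(); }

// Destroying inside the plugin keeps new and delete on the same side.
extern "C" void destroyAnimal(IAnimal* animal) { delete animal; }
```

The host would look up createAnimal and destroyAnimal with dlsym and include only the interface header.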
You have an unresolved symbol in your main executable. Do not ignore these error messages.
./tmp/cchZxfAz.o: In function `main':
test.cpp:(.text+0x112): undefined reference to `Animal::shout[abi:cxx11]()'
collect2: error: ld returned 1 exit status
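The Animal::shout[abi:cxx11]() in the linker message is the readable form of a mangled symbol name such as _ZN6Animal5shoutB5cxx11Ev. When only the mangled form is at hand, it can be decoded with the ABI helper abi::__cxa_demangle, available with GCC and Clang. A small sketch; the wrapper name demangle is an illustrative choice:

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Returns the human-readable form of an Itanium-ABI mangled name,
// or the input unchanged if it cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || !text) return mangled;
    std::string result(text);
    std::free(text);  // the buffer returned by __cxa_demangle is malloc-allocated
    return result;
}
```

On the command line, piping nm output through c++filt gives the same translation.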
I have two versions of my app, one works as a dynamic library loaded from a host app and another one works as a standalone executable app.
In my code, there are parts where I need to use different code depending on which version I'm building.
Is there a way to detect if I'm building a dynamic library or a standalone executable app so I can use the same source files for the two versions?
I would appreciate a cross-platform solution.
ADDED:
My dynamic lib or executable code:
#include <iostream>
int main(int argc, const char * argv[])
{
#ifdef IS_DYNAMICLIB
std::cout << "Compiled as a dynamic library!\n";
#else
std::cout << "Compiled as an executable!\n";
#endif
return 0;
}
I build it in the terminal on macOS using the command:
g++ -dynamiclib -undefined suppress -flat_namespace main.cpp -o Test.dylib
Now, I have Test.dylib
Host app code:
#include <iostream>
#include <cstdio>  // for printf
#include <dlfcn.h>
int main(int argc, const char * argv[])
{
void* handle;
typedef void (*func_t)();
handle = dlopen("/Path/to/Test.dylib", RTLD_LAZY);
if (!handle) {
printf("failed to open the library\n");
return 0;
}
func_t mainFunc = (func_t) dlsym(handle, "main");
if (!mainFunc) {
printf("failed to find main method\n");
dlclose(handle);
return 0;
}
mainFunc();
return 0;
}
When I build and run this, I get:
Compiled as an executable!
Program ended with exit code: 0
But I would like it to print:
Compiled as a dynamic library!
Program ended with exit code: 0
How can I do this?
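Note that no compiler defines IS_DYNAMICLIB by itself, so the #ifdef in the code above always takes the #else branch. One common approach, sketched here on the assumption that you control the build command, is to define the macro yourself when building the library (for example by adding -DIS_DYNAMICLIB to the g++ -dynamiclib line) and to leave it out for the standalone build. The function name build_kind is made up for illustration:

```cpp
#include <string>

// IS_DYNAMICLIB is expected to come from the build command (-DIS_DYNAMICLIB);
// it is not set automatically by any compiler.
std::string build_kind() {
#ifdef IS_DYNAMICLIB
    return "dynamic library";
#else
    return "executable";
#endif
}
```

Since -D is understood by GCC and Clang, and MSVC has the equivalent /D, the same source file can serve both builds across platforms.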
(This is like my other question, but this one is about a different, though related, issue.)
I've got a big issue in my project. I have a library that handles XML and can throw exceptions. Using it to create a configuration file class exposed my first bug: exceptions aren't caught in the library at all, whatever the exception.
In the library I wrote:
try {
throw std::exception();
}
catch (...)
{
printf("caught\n");
}
But the exception isn't caught, and std::terminate is called immediately:
terminate called after throwing an instance of 'std::exception'
what(): std::exception
The compilation flags are the simplest ones: -fPIC -std=c++11 -g -Wall for the library, and -std=c++11 -g -Wall for the executable (plus the libraries and variant build defines). Also, I'm using G++ 5.4.0 under Linux (Linux Mint, to be precise).
This is my main:
#include "ns/core/C_Configuration.hpp"
#include <iostream>
using namespace std;
using namespace ns;
int
main (int argc, char** argv)
{
try {
C_Configuration c ("test.xml");
c.load ("test.xml");
} catch (const std::exception& ex) {
cout << ex.what () << endl;
} catch (...) {
cout << "Caught." << endl;
}
return 0;
}
The C_Configuration.hpp :
#include <string>
#include <exception>
namespace ns
{
class C_Configuration
{
public:
C_Configuration (std::string);
bool load (std::string file);
};
} // namespace ns
And, this is the C_Configuration.cpp :
#include "ns/core/C_Configuration.hpp"
#include <cstdio>
using namespace std;
namespace ns
{
C_Configuration::C_Configuration (string)
{ }
bool
C_Configuration::load (string file)
{
try {
throw exception();
} catch (const exception& ex) {
printf ("In C_Configuration : %s\n", ex.what ());
} catch (...) {
printf ("In C_Configuration : caught\n");
}
return true;
}
} // namespace ns
Build commands:
g++ -m64 -g -shared -fPIC -std=c++11 -o libns_framework.so C_Configuration.cpp
g++ -m64 -g -L. -o exe main.cpp -lns_framework
Note: I give this example, but it works as expected; the exception is caught in the library, unlike in my main project. If you want to investigate further, you can check my project code here.
The problem occurs when:
The try-catch block is inside the library;
The try-catch block is outside the library.
In either case, the exception is thrown inside the library. But exceptions thrown outside the library are caught in the executable code:
int
main (int argc, char** argv)
{
try {
throw 1;
} catch (...) {
cout << "Caught" << endl;
}
// Useless code
return 0;
}
This code just writes Caught to the output.
So my question is simple: are C++ exceptions not handled within libraries, or did I just forget a compilation flag? I should add that in the executable code, exceptions work fine.
Thanks for your help.
EDIT: Oh god, my bad. Problem solved. Deep inside my build configuration, an ld had taken the place of g++. Now exceptions work fine. Thanks for your help.
Simple. Never use ld with C++. I changed all ld commands in my project to g++, but it seems I forgot this library.
In short, I was using ld to build my library but g++ for the main executable. So exceptions worked in the executable but not in the library, because ld does not include the C++ libraries that implement the exception system.
According to the GCC manual:
if a library or main executable is supposed to throw or catch exceptions, you must link it using the G++ or GCJ driver, as appropriate for the languages used in the program, or using the option -shared-libgcc, such that it is linked with the shared libgcc.
Shared libraries (in C++ and Java) have that flag set by default, but not main executables. In any case, you should use it on both.
Test Case:
lib.cpp:
#include "lib.hpp"
#include <string>
#include <iostream>
using namespace std;
int function_throws_int() {
try {
throw 2;
}
catch (...) {
cout << "int throws lib" << endl;
throw;
}
return -1;
}
int function_throws_string() {
try {
throw std::string("throw");
}
catch (...) {
cout << "throws string lib" << endl;
throw;
}
}
lib.hpp:
int function_throws_int();
int function_throws_string();
Compile command line:
g++ -m64 -g -shared-libgcc -shared -fPIC -std=c++11 -o libtest.so lib.cpp
main.cpp:
#include "lib.hpp"
#include <string>
#include <iostream>
using namespace std;
int main(int argc, char ** argv) {
try {
function_throws_int();
}
catch (const string & e) {
cout << "string caught main" << endl;
}
catch (int i) {
cout << "int caught main" << endl;
}
return 0;
}
Compile command line:
g++ -m64 -g -shared-libgcc -o main -L. main.cpp -ltest
Execute:
LD_LIBRARY_PATH=. ./main
Output:
int throws lib
int caught main
Consider the following simple program:
#include <mpi.h>
#include <iostream>
#include <stdlib.h>
#include <stdio.h>
#include <string>
#include <vector>
using std::cout;
using std::string;
using std::vector;
vector<float> test;
#ifdef GLOBAL
string hostname;
#endif
int main(int argc, char** argv) {
int rank; // The node id of this processor.
int size; // The total number of nodes.
#ifndef GLOBAL
string hostname;
#endif
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
cout << "Joining the job as processor: " << rank << std::endl;
{
char buf[2048] = "HELLO";
hostname.assign(buf, 2048);
}
test.push_back(1.0f);
cout << "Hostname: " << hostname << "::" << test[0] << std::endl;
MPI_Finalize();
return 0;
}
If I compile/run this with:
mpicxx -c test.cc && mpicxx -lstdc++ test.o -o test && ./test
there is no segmentation fault, but if I run it with:
mpicxx -DGLOBAL -c test.cc && mpicxx -lstdc++ test.o -o test && ./test
then there is a segmentation fault at the hostname.assign() line. In addition, if I remove this assignment, there is a segmentation fault in the string destructor once the main method returns so the assign method isn't the actual culprit.
Notice that the only difference is where the "global" variable hostname gets declared.
I am compiling with MPICH2 version 1.6, and don't really have the option to change this since I am running this on a supercomputer.
If I remove MPI_Init, etc., the error goes away, leading me to believe that there is something unexpected happening with MPI and this global variable.
I found some other examples of this happening to people online, but they all resolved their issues by installing a new version of MPICH, which again is not a possibility for me.
Moreover, I want to know WHY this is happening, more than just a way around it.
Thanks for your time.
Ok, after quite a bit of debugging I have found that the MVAPICH2-1.6 library defines a variable called hostname in:
mpid/ch3/channels/mrail/src/rdma/ch3_shmem_coll.c
Here is the line (55 in this version of the file):
char hostname[SHMEM_COLL_HOSTNAME_LEN];
The compiler didn't complain about the name clash here, but this is almost certainly the culprit since changing the variable name in my program removed the error. I imagine this is changed in later versions of MVAPICH2, but I will file the bug if not.
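A clash like this is possible because, under the Itanium C++ ABI used on Linux, a variable at global scope keeps its plain name as the linker symbol, so a C++ global named hostname and a C global named hostname end up as the same symbol. Besides renaming, one way to stay clear of such collisions is to put the variable in a namespace (or give it internal linkage with static). A minimal sketch; the namespace name app and the helper set_hostname are made up for illustration:

```cpp
#include <string>

namespace app {

// A namespace member gets a mangled linker symbol, unlike a variable
// declared at global scope, so it cannot collide with a C symbol "hostname".
std::string hostname;

void set_hostname(const char* name) { hostname.assign(name); }

}  // namespace app
```

The rest of the program would then refer to app::hostname, and the library's own hostname array is left untouched.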
I'm on Linux; the question concerns shared objects of C++ classes.
The problem comes when my shared objects try to use resources linked into the main executable. I have the following code:
loader.cpp:
#include <dlfcn.h>
#include <iostream>
#include "CommonInfo.h"
using namespace std;
int main(int argc, char** argv) {
for(int i=1; i<argc; ++i) {
string pth = "./";
pth.append(argv[i]);
void* dh = dlopen(pth.c_str(), RTLD_NOW);
if(dh==NULL) {
cerr << dlerror() << endl;
return 1;
}
CommonInfo::GetInfoFunc getInfo = (CommonInfo::GetInfoFunc)(dlsym(dh,"getInfo"));
if(getInfo==NULL) {
cerr << dlerror() << endl;
return 1;
}
CommonInfo* info = getInfo();
cout << "INFO: " << info->getX() << endl;
delete info;
}
return 0;
}
CommonInfo.h:
#include <string>
class CommonInfo {
public:
typedef CommonInfo* (*GetInfoFunc)();
private:
std::string x;
public:
CommonInfo(const std::string& nx);
std::string getX() const;
};
EDIT:
I accidentally forgot to copy and paste the source of CommonInfo.cpp here. Of course, it is there during compilation, so CommonInfo.cpp:
#include "CommonInfo.h"
CommonInfo::CommonInfo(const std::string& nx) : x(nx) {
}
std::string CommonInfo::getX() const {
return x;
}
A Plugin.h header:
#include "CommonInfo.h"
extern "C" CommonInfo* getInfo();
A very simple Plugin.cpp:
#include <iostream>
#include "Plugin.h"
#include "CommonInfo.h"
using namespace std;
CommonInfo* getInfo() {
return new CommonInfo("I'm a cat!");
}
Compiling is done with:
g++ -rdynamic -ldl -Werror CommonInfo.cpp loader.cpp -o loader
g++ -shared -fPIC -Werror Plugin.cpp -o Plugin.so
Running:
./loader Plugin.so
And there goes the error:
./loader: symbol lookup error: ./Plugin.so: undefined symbol: _ZN10CommonInfoC1ERKSs
Indeed, looking inside Plugin.so with nm Plugin.so | grep -i CommonInfo gives a 'U' (undefined) for this symbol, which is perfectly OK.
Also, looking inside the loader binary with nm loader | grep -i CommonInfo, I could find the symbol with 'T', which is also OK.
The question is, shouldn't dlfcn resolve the symbol in question from the main binary? Without this feature it becomes quite hard to use this stuff... Do I have to write a class factory function for CommonInfo, load it with dlfcn from the plugin and call that?
Thanks in advance,
Dennis
I haven't looked closely at your code, but I have in the past found behavior like you describe in the title when I did not link the executable with -E. (Or -Wl,-E when linking with gcc rather than ld.)
Note that not all platforms let the shared libraries take symbols from the calling binary. Linux and the *BSDs allow you to. But if you ever want to port to, say, Windows, you will not be able to use this pattern. I believe there are also some Unix-type OS's that won't let you do this. (It's been a while so I don't remember... Maybe it was Solaris?)
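On platforms where a plugin cannot take symbols from the executable, a portable alternative is for the host to hand the plugin a table of function pointers, so the plugin never has to resolve anything from the host binary. This is a different technique from the -E / -rdynamic approach above, and the sketch below is only an illustration: HostApi, host_make_info and the changed getInfo signature are hypothetical, and host and plugin parts share one file for brevity.

```cpp
#include <string>

// In a real split, the member definitions would stay in the host's CommonInfo.cpp.
class CommonInfo {
    std::string x;
public:
    explicit CommonInfo(const std::string& nx) : x(nx) {}
    std::string getX() const { return x; }
};

// Table of host services handed to the plugin at load time.
struct HostApi {
    CommonInfo* (*make_info)(const char* text);
};

// --- host side: the only place that calls the CommonInfo constructor ---
CommonInfo* host_make_info(const char* text) { return new CommonInfo(text); }

// --- plugin side: uses only what it receives through the table ---
extern "C" CommonInfo* getInfo(const HostApi* host) {
    return host->make_info("I'm a cat!");
}
```

The loader would fill in a HostApi once and pass its address to every getInfo it obtains via dlsym.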