Writing a plugin system? - c++

After many hours of research I have turned up nothing, so I turn to you good folks in hopes of a solution. I am going to be writing a bot in C++, and at some point would like to make a plugin system for it. I know I could just write a scripting language for it; however, I know it's possible to write an API and have the program link to it dynamically at run time. My question is, how do I get that dynamic linkage (like what HexChat has for its plugins)? Are there any elegant solutions, or at least theories on the typical design?

On Linux and POSIX systems, you want to use dlopen(3) and dlsym(3) (or a library wrapping these functions, e.g. GLib from GTK, Qt, POCO, etc.). More precisely:
Build your plugin as a shared library of position-independent code:
gcc -fPIC -Wall -c plugin1.c -o plugin1.pic.o
gcc -fPIC -Wall -c plugin2.c -o plugin2.pic.o
Notice that if the plugin is coded in C++ you'll compile it with g++ and you should declare the plugin functions as extern "C" to avoid name mangling.
Then link your plugin as
gcc -shared -Wall plugin1.pic.o plugin2.pic.o -o plugin.so
You may add dynamic libraries (e.g. a -lreadline at end of command above if your plugin wants GNU readline).
Finally, call dlopen with a full path in your main program, e.g.
void* dlh = dlopen("./plugin.so", RTLD_NOW);
if (!dlh) {
  fprintf(stderr, "dlopen failed: %s\n", dlerror());
  exit(EXIT_FAILURE);
}
(dlh is often a global variable)
Then use dlsym to get the function pointers. So declare their signature in some header included both by program and plugin code like
typedef int readerfun_t (FILE*);
declare some (often) global function pointers
readerfun_t* readplugfun;
and use dlsym on the plugin handle dlh:
readplugfun = (readerfun_t*) dlsym(dlh, "plugin_reader");
if (!readplugfun) {
  fprintf(stderr, "dlsym failed: %s\n", dlerror());
  exit(EXIT_FAILURE);
}
Of course in your plugin source code (e.g. in plugin1.cc) you'll define
extern "C" int plugin_reader (FILE*inf) { // etc...
You might define some constructor (or destructor) functions in your plugin (see GCC function attributes); they would be called at dlopen (or dlclose) time. In C++ you should simply use static objects (their constructor is called at dlopen time and their destructor at dlclose time; hence the name of the function attributes).
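As a minimal sketch of the static-object approach, assuming a plugin source file that is built into plugin.so: the names PluginLifetime and plugin_is_initialized are hypothetical and not part of any existing API. When the file is built into a shared library, the constructor runs during dlopen and the destructor during dlclose; when it is linked directly into a program, they run before main and at exit instead.

```cpp
#include <cstdio>

namespace {

bool g_initialized = false;  // set by the constructor below

// Hypothetical helper type: its only job is to run code at load/unload time.
struct PluginLifetime {
    PluginLifetime() {
        g_initialized = true;
        std::fprintf(stderr, "plugin: loaded\n");
    }
    ~PluginLifetime() {
        std::fprintf(stderr, "plugin: unloading\n");
    }
};

// One static instance per plugin; constructed at dlopen, destroyed at dlclose.
PluginLifetime g_lifetime;

}  // namespace

// Exported with C linkage so the host can find it with dlsym.
extern "C" int plugin_is_initialized() {
    return g_initialized ? 1 : 0;
}
```

The host could dlsym "plugin_is_initialized" right after dlopen to confirm that the plugin's static state was set up.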
At the end of your program call
dlclose(dlh), dlh = NULL;
In practice, you can make a great many dlopen calls (perhaps a million).
You generally want to link your main program with -rdynamic to let its symbols be visible from plugins.
gcc -rdynamic prog1.o prog2.o -o yourprog -ldl
Read the Program Library HowTo, the C++ dlopen mini HowTo, and Drepper's paper How To Write Shared Libraries.
The most important part is to define and document a plugin convention (i.e. "protocol"), that is a set (and API) of functions (to be dlsym-ed) required in your plugin and how to use them, in which order they are called, what is the memory ownership policy, etc. If you allow several similar plugins, you might have some well documented hooks in your main program which calls all the dlsym-ed functions of relevant dlopen-ed plugins. Examples: GCC plugins conventions, GNU make modules, Gedit plugins, ...
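The "hooks" idea can be sketched as a small host-side registry. This is only an illustration under assumed conventions: the class name ReaderHooks is hypothetical, and the rule that a hook returns 0 on success is an assumption, not something the answer specifies. The function type matches the readerfun_t typedef used earlier.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Same signature as in the answer; shared by host and plugins via a header.
typedef int readerfun_t(FILE*);

// Hypothetical host-side registry: one entry per successfully dlsym-ed plugin.
class ReaderHooks {
public:
    // Register a function pointer obtained from dlsym (null pointers are ignored).
    void add(readerfun_t* fn) {
        if (fn != nullptr) hooks_.push_back(fn);
    }

    // Call every hook in registration order; assumed convention: 0 means success.
    // Returns the number of hooks that reported failure.
    int run_all(FILE* in) const {
        int failures = 0;
        for (readerfun_t* fn : hooks_) {
            if (fn(in) != 0) ++failures;
        }
        return failures;
    }

    std::size_t size() const { return hooks_.size(); }

private:
    std::vector<readerfun_t*> hooks_;
};
```

In a real host, each pointer passed to add would come from dlsym(handle, "plugin_reader") on a different plugin handle, and the documented convention would state in which order and on which events run_all is invoked.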

Related

Linkage of standard libraries in C++ code called from ASM

I am developing my "OsDev" project, where I am learning a lot of new things (it is fairly new to me, since I have not coded in C/C++ for a long time due to web development). I figured out in another thread that a C++ function called from ASM needs an extern "C" prefix, but now I have a problem with linking standard library functions, for example from cstdio. I am stuck on this message:
kc.o: In function `kmain':
kernel.cpp:(.text+0x3e4): undefined reference to `strlen`
C++
#include <string.h>
#include <cstdio>
#include "inc/screen.h"
extern "C" void kmain()
{
clearScreen();
kernel_print((char*)"Hello Github! :-)", 0x04);
}
If I try to use strlen(), it won't link. (BTW, including screen.h works for some reason.)
Compiling script
nasm -f elf32 kernel.asm -o kasm.o
g++ -c kernel.cpp -o kc.o -lgcc -m32 -Wall -Wextra -O2
ld -m elf_i386 -T link.ld -o kernel.bin kasm.o kc.o
link.ld
OUTPUT_FORMAT(elf32-i386)
ENTRY(start)
SECTIONS
{
. = 0x100000;
.text : { *(.text) }
.data : { *(.data) }
.bss : { *(.bss) }
}
Thanks for any suggestions. :)
Your code cannot work because kernels can't directly use shared libraries.
Why can't I use shared libraries directly in my kernel?
When an application is loaded by the operating system, all the required files are brought into its address space. This includes the executable file and any dynamic libraries (all ABI-conforming ELF applications will always link with a system library - the C Standard Library or just libc).
But while loading the kernel, only the original executable is loaded. Multiboot 2 (with the GRUB bootloader) will allow you to load kernel modules, which can be dynamic libraries. But still, your kernel must know how to link itself and the kernel modules in physical memory. To do so, you must implement an ELF parser and dynamic linker in your kernel.
Before implementing one, make sure your kernel is mature enough to systematically handle dynamic memory allocation, paging, and other basic features.
How can I use the sweet features of libc?
Usually, you won't use all of the userspace functionality of libc. But things like memcpy, strlen, strncpy and so on are absolutely necessary. You will have to implement these functions on your own, but the upside is that you can change their names. For example, if you prefer camelCase for function names, you can use names like copyMemory, lengthOfString, etc.
https://github.com/SukantPal/Silcos-Kernel
I have built my own kernel, which has a few implementations of the required functions in KernelHost/Source/Util/CircuitPrimitive.cpp. You can look into that. Also, it has a full-fledged module linker. (KernelHost, ModuleFramework, etc. those parent folders contain separate kernel-module source code).
Make sure not to use the standard C headers in your kernel for now. Implement all required functions on your own, including printf.
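A minimal sketch of what such hand-written replacements might look like, using the camelCase names suggested above. The alias size_type is a hypothetical helper introduced here so that no standard header is needed; this is an illustration, not the code from the linked kernel.

```cpp
// No standard headers: derive the size type from the language itself.
using size_type = decltype(sizeof(0));

// strlen replacement: counts characters up to (not including) the terminator.
size_type lengthOfString(const char* s) {
    size_type n = 0;
    while (s[n] != '\0') ++n;
    return n;
}

// memcpy replacement: copies count bytes; the regions must not overlap.
void* copyMemory(void* dst, const void* src, size_type count) {
    unsigned char* d = static_cast<unsigned char*>(dst);
    const unsigned char* s = static_cast<const unsigned char*>(src);
    for (size_type i = 0; i < count; ++i) d[i] = s[i];
    return dst;
}
```

Kernel builds usually also pass -ffreestanding to the compiler so that it does not assume a hosted C library is available.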

"no main" function for linking or execution in C++ [duplicate]

This question already has answers here:
How to change entry point of C program with gcc?
(4 answers)
Closed 5 years ago.
I am trying to compile a function (not called main) that can be integrated in another code or directly executed (after linking).
I tried it on my Mac, and it works well.
I then tested it on Linux (CentOS and Ubuntu). However, the task looks harder than expected on Linux.
The source code is the following one (just to explain the problem)
test.cpp:
#include <cstdio>
#ifdef __cplusplus
extern "C" {
#endif
int test(int argc, char const *argv[]);
#ifdef __cplusplus
}
#endif
int test(int argc, char const *argv[]) {
fprintf(stderr, "%s\n", "test");
return 0;
}
Compilation line on MacOS
g++ -c test.cpp -o test.o && g++ test.o -o test -e _test
and on Linux
g++ -c test.cpp -o test.o && g++ test.o -o test -e test
On macOS I tried clang, g++ and the Intel compiler; all three work fine.
On Linux I tried g++ and the Intel compiler, and I always get the same error.
usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o: In function `_start':
(.text+0x20): undefined reference to `main'
collect2: error: ld returned 1 exit status
Any advice, explanation or solution, on what I am doing wrong or missing would be very helpful.
Thanks
Edit:
Currently, I have a "define" to create a main, but if we have lots of functions we are obliged to compile twice each time (once for the function version and once for the executable), which ultimately makes the code heavier.
As discussed in this topic: is there a GCC compiler/linker option to change the name of main?
To avoid an XY problem: I inherited a bunch of small programs that I want to put together so that they are easier to use (for remote execution, ...). However, each one needs to remain executable independently if needed, for debugging, etc. I am hesitating between using "execv" and just converting each main into a function. I have probably made the wrong choice.
Edit:
The final goal is to have independent programs that can also be called from external software.
Solution:
The solution seems to be to call main through dlopen.
You cannot do that (and even if it appears to work on MacOSX it is implementation specific and undefined behavior).
Linux crt0 is doing more complex stuff than you think.
The C standard (e.g. n1570 for C11) requires a main function for hosted implementations (§5.1.2.2.1):
The function called at program startup is named main. The implementation declares no prototype for this function.
And the C++ standard also requires a main and strongly requires some processing (e.g. construction of static data) to be done before main is running and after it has returned (and various crt0 tricks are implementing that feature on Linux).
If you want to understand gory details (and they are not easy!), study the ABI and the (free software) source code of the implementation of the crt0.
I am trying to compile a function (not called main) that can be integrated in another code
BTW, to use dynamically some code (e.g. plug-ins) from another program, consider using the dynamic linker. I recommend using the POSIX compliant dlopen(3) with dlsym(3) on position-independent code shared libraries. It works on most Unix flavors (including MacOSX & Linux & Solaris & AIX). For C++ code beware of name mangling so read at least the C++ dlopen mini howto.
Read also the Program Library HowTo.
Problems with libraries, they cannot be executed, no ?
I don't understand what that means. You certainly can load a plugin then run code inside it from the main program dlopen-ing it.
(and on Linux, some libraries like libc.so are even specially built to also work as an executable; I don't recommend this practice for your own code)
You might take several days to read Drepper's How To Write Shared Libraries (but it is advanced stuff).
If you want to add some code at runtime, read also this answer and that one.
The final goal is to be able to have independent program. But that we can call from an external software too
You can't do that (and it would make no sense). However, you could have conventions for communicating with other running programs (i.e. processes), using inter-process communication such as pipe(7)-s and many others. Read Advanced Linux Programming first, before coding. Read also Operating Systems: Three Easy Pieces.
The solution looks to be, to a call to the main through a dlopen
Calling the main function via dlopen & dlsym is forbidden by the C++ standard (which disallows using a pointer to main). The main function has a very specific status and role (and is compiled specially; the compiler knows about main).
(perhaps calling main obtained by dlsym would appear to work on some Linux systems, but it certainly is undefined behavior so you should not do that)

C++ force unloading shared library

I'm trying to create an application which reloads a shared library multiple times. But at some point in time, dlmopen fails with error
/usr/lib/libc.so.6: cannot allocate memory in static TLS block
Here is the minimal code reproducing this issue:
#include <dlfcn.h>
#include <cstdio>
#include <vector>
int main() {
for (int i = 0; i < 100; ++i) {
void *lib_so = dlmopen(LM_ID_NEWLM, "lib.so", RTLD_LAZY | RTLD_LOCAL);
if (lib_so == NULL) {
printf("Iteration %i loading failed: %s\n", i, dlerror());
return 1;
}
dlclose(lib_so);
}
return 0;
}
And empty lib.cpp, compiled with
g++ -rdynamic -ldl -Wl,-R . -o test main.cpp
g++ -fPIC -shared lib.cpp -o lib.so
Update
It seems that it crashes even with one thread. The question is: how can I force a library unload or a destruction of unused namespaces created with LM_ID_NEWLM?
There is a built-in limit to the number of link map namespaces available to a process. This is rather poorly documented, in a comment in the man page:
The glibc implementation supports a maximum of 16 namespaces
Once you create a link map namespace, there is no support for 'erasing' it via any APIs. This is just the way it's designed, and there's no real way to get around that without editing the glibc source and adding some hooks.
Using namespaces for reloading of a library is not actually reloading the library - you're simply loading a new copy of the library. This is one of the use cases of the namespaces - if you tried to dlopen the same library multiple times, you would get the same handle to the same library; however if you load the second instance in a different namespace, you won't get the same handle. If you want to accomplish reloading, you need to unload the library using dlclose, which will unload the library once the last remaining reference to the library has been released.
If you want to attempt to 'force unload' a library, then you could try issuing multiple dlclose calls until it unloads; however if you don't know what the library has done (e.g. spawned threads) there may be no way of preventing a crash in that case.
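One way to check whether a dlclose really unloaded a library is the RTLD_NOLOAD flag, which asks dlopen to report on an already resident object without loading it. The sketch below assumes a glibc-style dlopen that supports RTLD_NOLOAD; the helper names is_resident and reload_library are hypothetical.

```cpp
#include <dlfcn.h>

// Returns true if the named shared object is currently mapped into the process.
bool is_resident(const char* soname) {
    void* handle = dlopen(soname, RTLD_LAZY | RTLD_NOLOAD);
    if (handle == nullptr) return false;
    dlclose(handle);  // drop the extra reference taken by the successful dlopen
    return true;
}

// Hypothetical reload helper: closes the old handle, then opens the file again.
// Returns the new handle, or nullptr if the library stayed resident
// (something still holds a reference) or could not be loaded.
void* reload_library(void* old_handle, const char* path) {
    if (old_handle != nullptr) dlclose(old_handle);
    if (is_resident(path)) return nullptr;
    return dlopen(path, RTLD_NOW | RTLD_LOCAL);
}
```

If reload_library returns nullptr because the library is still resident, the old code is still mapped, and a plain dlopen would simply hand back the same copy.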
Older glibc versions might have some bugs related to this:
https://bugzilla.redhat.com/show_bug.cgi?id=89692
https://sourceware.org/bugzilla/show_bug.cgi?id=14898
What version are you using? Try a newer glibc version; your code works fine on my computer (glibc 2.23).

Internal exceptions in shared library terminate end user application

I am building a shared library which uses Boost.thread internally. As a result, Boost.system is also used since Boost.thread depends on that. My shared library exports a C interface, so I want to hide all my internal exception handling and thread usage etc from the end user. It is supposed to be a black box so to speak. However, when I link with a client application, while the program runs fine - as soon as it is time to stop the processing by invoking a library function I get:
terminate called after throwing an instance of 'boost::thread_interrupted'
I catch this exception internally in the library, so I have no idea why it is not actually being caught. The end user's program is not meant to know about or handle Boost exceptions in any way. When building the shared library, I use static linking for both Boost.thread and Boost.system so the outside world is never meant to see them. I am on GCC 4.7 on Ubuntu 12. On Windows, I have no such problems (with either MSVC or MinGW).
(EDIT)
I am editing the question to show a minimalistic example that reproduces the problem, as per the requests in the comments.
Here first is the code for testlib.cpp and testlib.h.
testlib.cpp:
#include <boost/thread/thread.hpp>
void thread_func()
{
while(1)
{
boost::this_thread::interruption_point();
}
}
void do_processing()
{
// Start a thread that will execute the function above.
boost::thread worker(thread_func);
// We assume the thread started properly for the purposes of this example.
// Now let's interrupt the thread.
worker.interrupt();
// And now let's wait for it to finish.
worker.join();
}
And now testlib.h:
#ifndef TESTLIB_H
#define TESTLIB_H
void do_processing();
#endif
I build this into a shared library with the following command:
g++ -static-libgcc -static -s -DNDEBUG -I /usr/boost_1_54_0 -L /usr/boost_1_54_0/stage/lib -Wall -shared -fPIC -o libtestlib.so testlib.cpp -lboost_thread -lboost_system -lpthread -O3
Then, I have the code for a trivial client program which looks as follows:
#include "testlib.h"
#include <cstdio>
int main()
{
do_processing();
printf("Execution completed properly.\n");
return 0;
}
I build the client as follows:
g++ -DNDEBUG -I /usr/boost_1_54_0 -L ./ -Wall -o client client.cpp -ltestlib -O3
When I run the client, I get:
terminate called after throwing an instance of 'boost::thread_interrupted'
Aborted (core dumped)
I am not explicitly catching the thread interruption exception, but according to the Boost documentation Boost.thread does that and terminates the given thread only. I tried explicitly catching the exception from within the thread_func function, but that made no difference.
(End OF EDIT)
(EDIT 2)
It is worth noting that even with -fexceptions turned on, the problem still persists. Also, I tried to throw and catch an exception that is defined in the same translation unit as the code that catches and throws it, with no improvement. In short, all exceptions appear to remain uncaught in the shared library even though I definitely have catch handlers for them. When I compile the client file and the testlib file as part of a single program, that is to say without making testlib into a shared library, everything works as expected.
(End OF EDIT 2)
Any tips?
I finally figured it out. The -static flag should never be specified when -shared is specified. My belief was that it merely told the linker to prefer static versions of libraries that it links, but instead it makes the generated dynamic library unsuitable for dynamic linking which is a bit ironic. But there it is. Removing -static solved all my problems, and I am able to link Boost statically just fine inside my dynamic library which handles exceptions perfectly.
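Separately from the -static fix, the "black box" goal of a C interface is usually served by catching everything at the exported boundary, so that no C++ exception can reach the caller. The following is a minimal sketch of that pattern, not the code from the question: the names internal_processing and do_processing_checked, and the 0 / -1 return convention, are assumptions made for illustration.

```cpp
#include <exception>
#include <stdexcept>

namespace {
// Stand-in for the library's internal C++ code, which may throw.
void internal_processing(bool fail) {
    if (fail) throw std::runtime_error("internal failure");
}
}  // namespace

// Hypothetical exported entry point: 0 on success, -1 on any internal exception.
extern "C" int do_processing_checked(int fail) noexcept {
    try {
        internal_processing(fail != 0);
        return 0;
    } catch (const std::exception&) {
        return -1;  // exception types derived from std::exception
    } catch (...) {
        return -1;  // anything else, e.g. types not derived from std::exception
    }
}
```

A C client then only ever sees an integer status, for example `if (do_processing_checked(0) != 0) { /* report failure */ }`.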
Maybe this?
If you have a library L which throws E, then both L and the
application A MUST be linked against X, the library containing the
definition of E.
Try to link executable against boost, too.
A shared library that itself includes statically linked libraries is not such a good idea, and I don't think that this scenario is well supported in the GNU toolchain.
I think that your particular problem arises from the option -static-libgcc, but I've been unable to compile it on my machine with your options. Not that linking libpthread statically into a dynamic library sounds like a good idea either... What will happen if the main executable wants to create its own threads? Will it be compiled with -pthread? If it is, then the thread functions will be linked twice; if it isn't, it will have the functions but neither the preprocessor macros nor the thread-safe library functions.
My advice is simply not to compile your library statically, that's just not the Linux way.
Actually that should not be a real problem, even if you don't want to rely on the distribution's version of Boost: compile your program against the shared Boost libraries and deploy all these files (libboost_thread.so.1.54.0, libboost_system.so.1.54.0 and libtestlib.so) to the same directory. Then run your program with LD_LIBRARY_PATH=<path-to-so-files>. Since the client is not intended to use Boost directly, it needs neither the Boost headers nor the Boost libraries on its compiler command line. You still have your black box, but now it is formed by 3 .so files instead of just 1.

in gcc how to force symbol resolution at runtime

This is my first post on this site, with high hopes:
I am trying to understand static linking, dynamic linking, shared libraries, static libraries, etc. with gcc. Every time I try to delve into this topic, there is something I don't quite understand.
Some hands-on work:
bash$ cat main.c
#include "printhello.h"
#include "printbye.h"
void main()
{
PrintHello();
PrintBye();
}
bash$ cat printhello.h
void PrintHello();
bash$ cat printbye.h
void PrintBye();
bash$ cat printbye.c
#include <stdio.h>
void PrintBye()
{
printf("Bye bye\n");
}
bash$ cat printhello.c
#include <stdio.h>
void PrintHello()
{
printf("Hello World\n");
}
gcc -Wall -fPIC -c *.c -I.
gcc -shared -Wl,-soname,libcgreet.so.1 -o libcgreet.so.1.0 *.o
ln -sf libcgreet.so.1.0 libcgreet.so
ln -sf libcgreet.so.1.0 libcgreet.so.1
So I have created a shared library.
Now I want to link this shared library with my main program to create an executable.
gcc -Wall -L. main.c -lcgreet -o greet
This works well: if I set LD_LIBRARY_PATH before running greet (or link with the rpath option), it runs.
My question is however different:
Since I am using a shared library anyway, is it not possible to force symbol resolution at runtime (I am not sure about the terminology, but perhaps this is what the book "Linkers and Loaders" calls dynamic linking)? I understand that we may not want to do this because it slows the program down and adds overhead every time we run it, but I am trying to understand it to clear up my concepts.
Does the gcc linker provide any option to delay symbol resolution until runtime, so that it is done against the library the program will actually run with? (The library available at compile time may differ from the one available at runtime if the library has changed.)
I want to be able to do something like:
bash$ gcc main.c -I.
(what option needed here?)
so that I don't have to give the library name, and just tell it that I want to do symbol resolution at runtime, so headers are good enough for now, actual library names are not needed.
Thanks,
Learner For Ever.
Any linker (gcc, ld or any other) only resolves links at build time. That is because the ELF standard (like most others) does not define 'run-time' linkage as you describe it. Libraries are linked either statically (i.e. lib.a) or at start-up time (lib.so, which must be present when the ELF is loaded). However, with a dynamic link, the linker only records in the ELF the name of the file and the symbols it must find; it does not link the file directly. So, if you want to upgrade the lib to a newer version later, you can do so, as long as the system can find the same filename (the path can actually be different) and the same symbol names.
The other option, to get symbols at run-time, is to use dlopen, which has nothing to do with gcc or ld. Simply put, dlopen opens a dynamic library, much as fopen opens a file, and returns a handle, which you then pass to dlsym together with the name of the symbol you want (a function name, for example). dlsym then returns a pointer to that symbol, which you can use to call the function or access the variable. This is how plugins are implemented.
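A minimal sketch of that dlopen/dlsym sequence, looking up cos in the math library at run time. The helper name runtime_cos is hypothetical, and "libm.so.6" is the usual glibc soname, so this is an assumption that holds on typical Linux systems but not everywhere. Depending on the glibc version, the program may need to be linked with -ldl.

```cpp
#include <dlfcn.h>
#include <cstdio>

// Looks up `cos` in the math library at run time and calls it.
// Returns true and stores the result on success; prints dlerror() text on failure.
bool runtime_cos(double x, double* result) {
    // "libm.so.6" is the usual glibc soname; other systems name it differently.
    void* handle = dlopen("libm.so.6", RTLD_LAZY);
    if (handle == nullptr) {
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return false;
    }

    dlerror();  // clear any stale error state before dlsym
    using cos_fn = double (*)(double);
    cos_fn fn = reinterpret_cast<cos_fn>(dlsym(handle, "cos"));
    const char* err = dlerror();
    if (err != nullptr || fn == nullptr) {
        std::fprintf(stderr, "dlsym failed: %s\n", err ? err : "null symbol");
        dlclose(handle);
        return false;
    }

    *result = fn(x);  // call through the pointer obtained at run time
    dlclose(handle);
    return true;
}
```

Nothing about cos is resolved when this code is built; the symbol name is just a string, which is why the same mechanism works for plugins whose names are only known at run time.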
I think you are looking for the ld option '--unresolved-symbols=ignore-all'; yes, it can actually do this (ignore the previous answer). Imagine a scenario where a shared library is loaded late (when the program is already running): it can use all the symbols already resolved/loaded by the main process, with no need to resolve them again. BTW, this does not necessarily make it slow, at least on Linux.