ABI stability on pointers to symbols vs regular symbol lookup - c++

General Scenario
Using dlsym(), I dynamically load a shared object addon from my main thread.
I follow either of these two ways.
Way A
Pass a struct of pointers to symbols to the addon so it can call the host's functions and access other variables, knowing their data type of course.
Way B
Let the addon reference symbols by their extern "C" identifiers and have the runtime look them up normally.
Question
Is there any difference between these two methods regarding ABI stability? For example: would one of these methods give a better chance of compatibility between an addon and the host program if they were compiled in different environments?

One advantage of "Way A" is that it gives you the chance to pass different pointers to different plugins. So you could for example make a "v1" struct of pointers, and then later a "v2" that newer plugins could request.

When everything works, the two methods are equivalent apart from a small performance difference. However, runtime lookup resolves named symbols in the global scope, which can be influenced by flags such as RTLD_GLOBAL passed to dlopen. That can lead to different behaviour in different contexts even though the same addon is used.
So I think method A is better.
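Below is a minimal sketch of Way A combined with the versioning idea mentioned above; all names (host_api_v1, addon_init, and so on) are hypothetical and only for illustration.

// host_api.h -- shared between the host and its addons (hypothetical names).
#include <cstddef>
#include <cstdint>

struct host_api_v1 {
    uint32_t version;                      // table version, 1 for this struct
    void   (*log)(const char *message);    // host-provided logging
    void  *(*alloc)(std::size_t size);     // host-provided allocator
    void   (*release)(void *ptr);          // matching deallocator
};

// Every addon exports this entry point with C linkage so dlsym() can
// find it by its unmangled name.
extern "C" int addon_init(const host_api_v1 *api);

// host.cpp -- loading an addon and handing it the v1 table.
#include <dlfcn.h>
#include <cstdio>
#include <cstdlib>
#include "host_api.h"

static void  host_log(const char *m)   { std::fputs(m, stderr); }
static void *host_alloc(std::size_t n) { return std::malloc(n); }
static void  host_release(void *p)     { std::free(p); }

int load_addon(const char *path) {
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (!handle) { std::fprintf(stderr, "%s\n", dlerror()); return -1; }

    auto init = reinterpret_cast<int (*)(const host_api_v1 *)>(
        dlsym(handle, "addon_init"));
    if (!init) { dlclose(handle); return -1; }

    static const host_api_v1 api = { 1, host_log, host_alloc, host_release };
    return init(&api);   // the addon calls the host only through this table
}

The addon never touches the host's symbol table, so its behaviour does not depend on RTLD_GLOBAL or on how other libraries were loaded, and a later host_api_v2 can be added without breaking addons built against v1.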

Related

Does static initialization (and/or other) code get run when dlopen'ing?

When you dlopen() a shared object, is there a mechanism for having code in that DLL execute without being called explicitly? Specifically, C++ static initialization code for globals/statics which the caller of dlopen() might not know about? I'm pretty sure the answer should be "yes" but I don't remember what mechanism makes that happen, and how to utilize it for running arbitrary code.
Yes: dlopen respects an ELF binary format mechanism for running code at load time.
There are actually two such mechanisms:
An older one uses the special .init and .fini sections, which contain code for the dynamic loader to run at load and unload time. Since section headers may not be present at runtime, there are also DT_INIT and DT_FINI dynamic tags which point to the corresponding code.
The newer mechanism is .init_array and .fini_array and corresponding DT_INIT_ARRAY, DT_INIT_ARRAYSZ, DT_FINI_ARRAY and DT_FINI_ARRAYSZ dynamic tags.
The difference between the two mechanisms is described here.
Going up to the source code level, if you decorate a C function with __attribute__((constructor)), the compiler will use one of those two mechanisms to make it run when the object is dlopened. The same goes for the construction code for global C++ objects requiring dynamic initialization.
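As a minimal sketch (assuming GCC or Clang on an ELF platform, built with something like g++ -shared -fPIC plugin.cpp -o plugin.so):

// plugin.cpp -- code that runs automatically when the object is dlopen'ed.
#include <cstdio>

// Registered via .init_array / DT_INIT_ARRAY; runs at load time.
__attribute__((constructor))
static void on_load() { std::puts("plugin loaded"); }

// Registered via .fini_array / DT_FINI_ARRAY; runs at dlclose / unload time.
__attribute__((destructor))
static void on_unload() { std::puts("plugin unloaded"); }

// Dynamic initialization of a global C++ object is registered the same way,
// so its constructor also runs when the object is loaded.
struct Tracker { Tracker() { std::puts("global constructed"); } };
static Tracker g_tracker;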

C++ Passing std::string by reference to function in dll

I have a problem with passing a std::string by reference to a function in a DLL.
This is function call:
CAFC AFCArchive;
std::string sSSS = std::string("data\\gtasa.afc");
AFCER_PRINT_RET(AFCArchive.OpenArchive(sSSS.c_str()));
//AFCER_PRINT_RET(AFCArchive.OpenArchive(sSSS));
//AFCER_PRINT_RET(AFCArchive.OpenArchive("data\\gtasa.afc"));
This is function header:
#define AFCLIBDLL_API __declspec(dllimport)
AFCLIBDLL_API EAFCErrors CAFC::OpenArchive(std::string const &_sFileName);
I tried to step through the call in the debugger and look at the value of _sFileName inside the function.
Inside the function, _sFileName is set to garbage (for example, t4gs..\n\t).
I tried to detect heap corruption, but no error is reported.
The DLL was compiled with debug settings, and the .exe program was compiled in debug too.
What's wrong?? Help..!
P.S. I used Visual Studio 2013. WinApp.
EDIT
I have changed the function header to this code:
AFCLIBDLL_API EAFCErrors CAFC::CreateArchive(char const *const _pArchiveName)
{
std::string _sArchiveName(_pArchiveName);
...
I really don't know how to fix this bug...
About the heap: it is allocated in the virtual memory of our process, right? In that case the virtual memory is shared between the EXE and the DLL.
The issue has little to do with STL, and everything to do with passing objects across application boundaries.
1) The DLL and the EXE must be compiled with the same project settings. You must do this so that the struct alignment and packing are the same, the members and member functions do not have different behavior, and even more subtle, the low-level implementation of a reference and reference parameters is exactly the same.
2) The DLL and the EXE must use the same runtime heap. To do this, you must use the DLL version of the runtime library.
You would have encountered the same problem if you created a class that does similar things (in terms of memory management) as std::string.
Probably the reason for the memory corruption is that the object in question (std::string in this case) allocates and manages dynamically allocated memory. If the application uses one heap, and the DLL uses another heap, how is that going to work if you instantiated the std::string in say, the DLL, but the application is resizing the string (meaning a memory allocation could occur)?
C++ classes like std::string can be used across module boundaries, but doing so places significant constraints on the modules. Simply put, both modules must use the same instance of the runtime.
So, for instance, if you compile one module with VS2013, then you must do so for the other module. What's more, you must link to the dynamic runtime rather than statically linking the runtime. The latter results in distinct runtime instances in each module.
And it looks like you are exporting member functions. That also requires a common shared runtime. And you should use __declspec(dllexport) on the entire class rather than individual members.
If you control both modules, then it is easy enough to meet these requirements. If you wish to let other parties produce one or other of the modules, then you are imposing a significant constraint on those other parties. If that is a problem, then consider using more portable interop. For example, instead of std::string use const char*.
Now, it's possible that you are already using a single shared instance of the dynamic runtime. In which case the error will be more prosaic. Perhaps the calling conventions do not match. Given the sparse level of detail in your question, it's hard to say anything with certainty.
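If you go the portable-interop route suggested above, a rough sketch looks like the following; the AFCLIB_EXPORTS macro and the enum values are assumptions for illustration, not the asker's actual project settings.

// AFCLib.h -- only plain C types cross the DLL boundary.
#ifdef AFCLIB_EXPORTS                       // defined by the DLL project only
#define AFCLIBDLL_API __declspec(dllexport)
#else
#define AFCLIBDLL_API __declspec(dllimport)
#endif

enum EAFCErrors { AFC_OK, AFC_ERROR };      // placeholder values

class AFCLIBDLL_API CAFC {
public:
    // Takes a NUL-terminated path instead of std::string, so no std::string
    // object is created on one side of the boundary and destroyed on the other.
    EAFCErrors OpenArchive(const char *fileName);
};

// AFCLib.cpp -- inside the DLL, convert to std::string locally; its memory
// is then allocated and freed by the same runtime instance.
#include <string>
#include "AFCLib.h"

EAFCErrors CAFC::OpenArchive(const char *fileName)
{
    std::string sFileName(fileName);        // lives entirely inside the DLL
    // ... open the archive ...
    return AFC_OK;
}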
I encountered a similar problem.
I resolved it by synchronizing the Configuration Properties -> C/C++ settings.
If you want debug mode:
Set _DEBUG definition in Preprocessor Definitions in both projects.
Set /MDd in Code Generation -> Runtime Library in both projects.
If you want release mode:
Remove _DEBUG definition in Preprocessor Definitions in both projects.
Set /MD in Code Generation -> Runtime Library in both projects.
By both projects I mean the exe project and the dll project.
This works for me, especially when I don't want to change any of the DLL's settings but only adjust my project to match them.

memory address of a member variable of argument objects changes when dll function is called

class SomeClass
{
    // some members
    MemberClass one_of_the_mem_;
};
I have a function foo( SomeClass *object ) within a dll, it is being called from an exe.
Problem
The address of one_of_the_mem_ changes while the DLL call is being dispatched.
Details:
before the call is made (from exe):
'&(this).one_of_the_mem_' - `0x00e913d0`
after - in the dll itself :
'&(this).one_of_the_mem_' - `0x00e913dc`
The address of the object itself remains constant. It is only the member whose address shifts by 0xc every time.
I would like some pointers on how to troubleshoot this problem.
Code :
Code from Exe
stat = module->init ( this,
object_a,
&object_b,
object_c,
con_dir
);
Code in DLL
Status_C ModuleClass( SomeClass *object, int index, Config *conf, const char* name)
{
_ASSERT(0); //DEBUGGING HOOK
...
Update 1:
I compared the offsets of the members following Michael's instructions, and they are the same in both cases.
Update 2:
I found a way to dump the class layout and noticed a difference in size; I still have to figure out why that is happening.
Linked is the question I found on how to dump the class layout.
Update 3:
Final update: solved the problem, many thanks to Michael Burr.
It turned out that one of the builds had _USE_32BIT_TIME_T defined and was using a 32-bit time_t, while the other was using a 64-bit time_t, so the two builds generated different layouts for the object. Attached is the difference file.
Your DLL was probably compiled with a different set of compiler options (or maybe even a slightly different header file), and the class layout is different as a result.
This can happen, for example, if one module was built using debug flags and the other wasn't, or if different compiler versions were used: the libraries shipped with different compiler versions can have subtle differences, and if your class incorporates a type defined by such a library you can end up with different layouts.
As a concrete example, with Microsoft's compiler, iterators and containers are sensitive to the release/debug, _SECURE_SCL on/off, and _HAS_ITERATOR_DEBUGGING on/off settings (at least up through VS 2008; VS 2010 may have changed some of this to a certain extent). See http://connect.microsoft.com/VisualStudio/feedback/details/352699/secure-scl-is-broken-in-release-builds for some details.
These kinds of issues make using C++ classes across DLL boundaries a bit more fragile than using straight C interfaces. They can occur in C structures as well, but it seems like C++ libraries have these differences more often (I think that's the nature of having richer functionality).
Another layout-changing issue that occurs every now and then is having a different structure packing option in effect in the different compiles. One thing that can 'hide' this is that pragmas are often used in headers to set structure packing to a certain value, and sometimes you may come across a header that does this without changing it back to the default (or more correctly the previous setting). If you have such a header, it's easy to have it included in the build for one module, but not another.
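As a rough illustration (not the asker's actual code), the snippet below shows how a packing pragma seen by only one module changes member offsets, and how a type such as time_t (the _USE_32BIT_TIME_T case above) changes the size as well:

// layout_check.cpp -- the same logical struct under two packing settings.
#include <cstdio>
#include <ctime>

#pragma pack(push, 1)                  // a header that forgets the matching pop
struct PackedRecord  { char tag; time_t stamp; };
#pragma pack(pop)

struct DefaultRecord { char tag; time_t stamp; };   // default packing

int main()
{
    // With default packing the compiler inserts padding after 'tag', so the
    // sizes (and the offset of 'stamp') differ. An EXE and a DLL compiled
    // with different packing, or a different time_t width, disagree about
    // the layout in exactly the same way.
    std::printf("packed=%u default=%u\n",
                (unsigned)sizeof(PackedRecord), (unsigned)sizeof(DefaultRecord));
    return 0;
}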
That sounds a bit weird; you should show more code. It shouldn't 'move' if it is being passed by reference. It sounds more like a copy of it is being made and the member function is being called on that copy.
Perhaps the DLL was compiled against a different version of the headers than the one you are referencing. Check and make sure the header file is for the same version as the DLL.
Recompile the library if you can.

Creating and using dll's: __declspec(dllimport) vs. GetProcAddress

Imagine we have a solution with 2 projects: MakeDll (a dll app), which creates a dll, and UseDll (an exe app), which uses the dll. Now I know there are basically two ways: one is pleasant, the other is not. The pleasant way is that UseDll links statically against MakeDll.lib, and just dllimports functions and classes and uses them. The unpleasant way is to use LoadLibrary and GetProcAddress, which I can't even imagine how to do with overloaded functions or class members; in other words, anything other than extern "C" functions.
My questions are the following (all regarding the first option)
What exactly does MakeDll.lib contain?
When is MakeDll.dll loaded into my application, and when unloaded? Can I control that?
If I change MakeDll.dll, can I use the new version (provided it is a superset of the old one in terms of interface) without rebuilding UseDll.exe? A special case is when a polymorphic class is exported and a new virtual function is added.
Thanks in advance.
P.S. I am using MS Visual Studio 2008
It basically contains a list of the functions in the DLL, both by name and by ordinal (though almost nobody uses ordinals anymore). The linker uses that to create an import table in UseDLL.exe -- i.e., a reference that says (in essence): "this file depends on function xxx from MakeDll.dll". When the loader loads that executable, it looks at the import table, and (recursively) loads all the DLLs it lists, and (at least conceptually) uses GetProcAddress to find the functions, so it can put their addresses into the executable where they're needed.
It's normally loaded during the process of loading your executable. You can use the /delayload switch to delay its being loaded until a function from that DLL is called.
In general, yes. In the specific case of adding a virtual function, it'll depend on the class' vtable layout staying the same other than the new function being added. Unless you take steps to either assure or verify that yourself, depending on it is a really bad idea.
MakeDll.lib contains a list of stubs for the exported functions and their RVAs into MakeDll.dll.
MakeDll.dll is loaded into the application based on what type of loading is defined for the DLL in question (e.g. DELAYLOAD). Raymond Chen has an interesting article on that.
You can use the new updated version of MakeDll.dll as long as all the RVA offsets used in UseDll.exe have not changed. In the event you change a vtable layout for a polymorphic class, as in add a new function in the middle of the previously defined vtable, you will need to recompile UseDll.exe. Other than that you can use the updated dll with the previously compiled UseDll.exe.
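For reference, the "pleasant way" usually boils down to a shared header along these lines; the MAKEDLL_EXPORTS macro and the sample functions are hypothetical:

// MakeDll.h -- included by both MakeDll.dll and UseDll.exe.
#ifdef MAKEDLL_EXPORTS                     // defined only in the MakeDll project
#define MAKEDLL_API __declspec(dllexport)
#else
#define MAKEDLL_API __declspec(dllimport)
#endif

// Linking UseDll.exe against MakeDll.lib creates an import-table entry
// for each symbol used, which the loader resolves at startup.
MAKEDLL_API int Add(int a, int b);

class MAKEDLL_API Calculator {             // exporting the class exports all
public:                                    // of its members at once
    int Multiply(int a, int b) const;
};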
The unpleasant way is to use LoadLibrary and GetProcAddress which I don't even imagine how is done with overloaded functions or class members, in other words anything else but extern "C" functions.
Yes, this is unpleasant, but it is not as bad as it sounds. If you choose to go through with this option, you'll want to do something like the following:
// Common.h: interface common to both sides.
// Note: member functions are always called through the vtable, so their
// name mangling never matters here; only the factory function below needs
// 'extern "C"' so it can be looked up by its unmangled name.
class ISomething
{
public:
    // Public virtual methods...

    // The object MUST delete itself to ensure memory-allocator coherence.
    // If the two sides link against different runtimes and you don't do
    // this, you'll get a hard crash or worse.
    // Note: 'const' allows you to create constants and delete them
    // without a nasty 'const_cast'.
    virtual void destroy () const = 0;
};

// MakeDLL.cpp: interface implementation.
class Something : public ISomething
{
public:
    // Overrides + other stuff...
    virtual void destroy () const { delete this; }
};

extern "C" ISomething * create () { return new Something(); }
I've successfully deployed such setups with different C++ compilers on both ends (i.e. G++ and MSVC interchanged in all 4 possible combinations).
You may change Something's implementation all you want. However, you may not change the interface without re-compiling on both sides! When you think about it, this is fairly intuitive: both sides rely on the other's definition of ISomething. What you may do to add extra flexibility is use numbered interfaces (as DirectX does) or go with a set of interfaces and test for capabilities (as COM does). The former is really intuitive to set up but requires discipline, and the second, well... would be re-inventing the wheel!
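For completeness, here is a sketch of the consuming side of the setup above, assuming the module is called MakeDll.dll (that name is an assumption). Only the factory has to be located by name; everything else goes through the vtable.

// UseDll.cpp -- consuming side of the ISomething factory.
#include <windows.h>
#include "Common.h"

int main()
{
    HMODULE lib = LoadLibraryA("MakeDll.dll");
    if (!lib) return 1;

    // Locate the single extern "C" factory by its unmangled name.
    auto create = reinterpret_cast<ISomething *(*)()>(
        GetProcAddress(lib, "create"));
    if (!create) { FreeLibrary(lib); return 1; }

    const ISomething *obj = create();
    // ... use obj through its virtual methods ...
    obj->destroy();               // freed by the module that allocated it

    FreeLibrary(lib);
    return 0;
}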

How to mimic the "multiple instances of global variables within the application" behaviour of a static library but using a DLL?

We have an application written in C/C++ which is broken into a single EXE and multiple DLLs. Each of these DLLs makes use of the same static library (utilities.lib).
Any global variable in the utility static library will actually have multiple instances at runtime within the application. There will be one copy of the global variable per module (i.e. DLL or EXE) that utilities.lib has been linked into.
(This is all known and good, but it's worth going over some background on how static libraries behave in the context of DLLs.)
Now my question: we want to change utilities.lib so that it becomes a DLL. It is becoming very large and complex, and we wish to distribute it in DLL form instead of .lib form. The problem is that for this one application we wish to preserve the current behaviour that each application DLL has its own copy of the global variables within the utilities library. How would you go about doing this? Actually we don't need this for all the global variables, only some; but it wouldn't matter if we got it for all.
Our thoughts:
There aren't many global variables within the library that we care about, we could wrap each of them with an accessor that does some funky trick of trying to figure out which DLL is calling it. Presumably we can walk up the call stack and fish out the HMODULE for each function until we find one that isn't utilities.dll. Then we could return a different version depending on the calling DLL.
We could mandate that callers set a particular global variable (maybe also thread local) prior to calling any function in utilities.dll. The utilities DLL could then use this global variable value to determine the calling context.
We could find some way of loading utilities.dll multiple times at runtime. Perhaps we'd need to make multiple renamed copies at build time, so that each application DLL can have its own copy of the utilities DLL. This negates some of the advantages of using a DLL in the first place, but there are other applications for which this "static library" style behaviour isn't needed and which would still benefit from utilities.lib becoming utilities.dll.
You are probably best off simply having utilities.dll export additional functions to allocate and deallocate a structure that contains the variables, and then have each of your other worker DLLs call those functions at runtime when needed, such as in the DLL_PROCESS_ATTACH and DLL_PROCESS_DETACH stages of DllMain(). That way, each DLL gets its own local copy of the variables, and can pass the structure back to utilities.dll functions as an additional parameter.
The alternative is to simply declare the individual variables locally inside each worker DLL directly, and then pass them into utilities.dll as input/output parameters when needed.
Either way, do not have utilities.dll try to figure out context information on its own. It won't work very well.
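A rough sketch of the first suggestion, with hypothetical names (UtilsState, utils_create_state, and so on):

// utilities.h -- the per-module state API, as seen from a worker DLL.
struct UtilsState;                                        // opaque to callers
extern "C" __declspec(dllimport) UtilsState *utils_create_state(void);
extern "C" __declspec(dllimport) void utils_destroy_state(UtilsState *state);
extern "C" __declspec(dllimport) void utils_do_work(UtilsState *state, int arg);

// worker.cpp -- each worker DLL holds its own copy of the state.
#include <windows.h>
#include "utilities.h"

static UtilsState *g_state = nullptr;                     // one per module

BOOL WINAPI DllMain(HINSTANCE, DWORD reason, LPVOID)
{
    switch (reason) {
    case DLL_PROCESS_ATTACH: g_state = utils_create_state(); break;
    case DLL_PROCESS_DETACH: utils_destroy_state(g_state);   break;
    }
    return TRUE;
}

void worker_function()
{
    utils_do_work(g_state, 42);   // utilities.dll operates on this module's state
}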
If I were doing this, I'd factor out all stateful global variables - I would export a COM object or a simple C++ class that contains all the necessary state, and each DLL export would become a method on your class.
Answers to your specific questions:
You can't reliably do a stack trace like that - due to optimizations like tail call optimization or FPO you cannot determine who called you in all cases. You'll find that your program will work in debug, work mostly in release but crash occasionally.
I think you'll find this difficult to manage, and it also puts a demand that your library can't be reentrant with other modules in your process - for instance, if you support callbacks or events into other modules.
This is possible, but you've completely negated the point of using DLL's. Rather than renaming, you could copy into distinct directories and load via full path.