Access violation when using boost::signals2 and unloading DLLs - c++

I'm trying to figure out this problem.
Suppose, you have a code that uses boost::signals2 for communicating between objects. Lets call them "colorscales". Code for these colorscales is usually situated in the same DLL as the code that uses them. Let's call it main.dll
But sometimes code from other DLLs needs to use these objects and this is where the problems begin.
Basically, the application is pretty big and most of the DLLs are loaded to do some work and then they are unloaded. This is not the case with DLL that contain colorscales code, it's neved unloaded during application normal runtime.
So, when one of the DLLs is loaded (lets call it tools.dll) and some code runs, it may want to use these colorscale objects and communicate with them, so I connect to the signals these objects provide.
The problem is that boost is pretty lazy and all clever, and when you disconnect() slots, it doesn't actually erase connection and stuff that is associated with it (like boost::bind object and such). It just sets a flag that this connection is now disconnected and cleans it up on later (actually, it clean up 2 of these objects when you connect new slots and 1 of them when you invoke signal as of version 1.57). You probably already see where this is coming to.
So, you when you don't need more tools, you disconnect these signals and then application unloads tools.dll.
Then at a later stage, some code executes from the main.dll which causes one of colorscale signals invoked. boost::signals2 goes to invoke it, but before it tries to clean up one disconnected slot. This is where access violation happens, because internally connection had a shared_state object or something like this, that tries to clean itself up in a thread-safe way. But it faces the problem, that the code that it tries to call is already not there, because DLL is unloaded, so the Access Violation exception is thrown.
I've tried to fix this by invoking signal with some dummy parameters before DLL is unloaded and also by connecting and then disconnecting more slots (this one was a stupid idea, because it doesn't solve problem, but just multiplies it) some predefined amount of times (2 or 3 times more than there are slots at all).
It worked, or I thought so, because now it doesn't crash instantly, but rather crashes the next time you load the same tools.dll. I still need to figure out where and why does it crash, but it's somewhere else inside boost.
So, I wanted to ask, what are my options of fixing it?
My thoughts were
Implementing my own connection that works in a more simple way
Providing a more simple way to communicate, like, callbacks, for instance
Finding a workaround for boost being so lazy and smart.

Well, it seems that I've found the cause of the crash after the fix.
So, basically, what happens, when you use the workaround described above (calling signal with dummy parameters multiple times), what it does is that it replaces _shared_state object that was created from boost code from main.dll by another _shared_state object that is created from boost code from tools.dll. This object maintains pointer to reference counter (of type derived from boost::detail::sp_counter_base) inside.
Then the tools.dll unloads and the object remains, but its virtual table is pointing to the code that is no longer there. Let's look at the virtual table of the reference counter to understand what's going on.
[0] 0x000007fed8a42fe5 tools.dll!boost::detail::sp_counted_impl_p<...>::`vector deleting destructor'(unsigned int)
[1] 0x000007fed8a4181b tools.dll!boost::detail::sp_counted_impl_p<...>::dispose(void)
[2] 0x000007fed8a4458e tools.dll!boost::detail::sp_counted_base::destroy(void)
[3] 0x000007fed8a43c42 SegyTools.dll!boost::detail::sp_counted_impl_p<...>::get_deleter(class type_info const &)
[4] 0x000007fed8a42da6 tools.dll!boost::detail::sp_counted_impl_p<...>::get_untyped_deleter(void)
As you can see, all these method are connected to the disposal of reference counter, so the problem doesn't arise before you try to do the same trick second time. So, the trick with disconnecting all signals to try to get rid of all the code from tools.dll doesn't work as expected and the next time you try to do the trick, Access Violation occurs.

Related

Why is this QObject subclass created in a different thread than its parent?

I'm running into some Qt threading issues that I don't understand, running Qt5.6 on Windows 7. Everything is installed and built using MSYS2.
I've created a Server object, which manages a Device object on behalf of remote clients. Both are subclasses of QObject. After the Server receives a client request to create a device, the Device is constructed with the Server as its parent.
But when I run the application in GDB, calling this->parent() for the Device returns a NULL pointer, indicating that this object has no parent. Furthermore, this->thread() returns a different thread from the Server object's thread. And the requisite "Cannot create children for a parent that is in a different thread" warning is printed, somewhere in the QObject constructor itself.
This is despite the Device object's constructor being passed the Server as its parent and there being no other explicitly-created threads. There are no explicitly-created QThread objects, and I'm only using the default event loop created by QCoreApplication::exec(). Furthermore, compiling and running this on either Linux or OS X confirms that the parent is set correctly and that both objects live in the same thread.
I understand why a call like setParent() or its moral equivalent will fail, to keep objects and their corresponding event loops in the same thread. But it appears that the thread affinity for the Device object is somehow already set by the time the constructor is called. But why is there more than one thread in the first place? And why only on Windows? And why would the object be created in a seemingly random thread, when it could very well use the thread of the parent that I'm passing in the constructor?
EDIT:
A couple of other things to note. I am not only not calling moveToThread() or anything like that, but there's no concurrency of any kind. This includes things like QRunnable or anything from the QtConcurrent module.
Another thing that actually solves the practical problem, but that I don't understand, is that this goes away if I compile for release. I've been using the debug build during development, which is apparently the only time the strange thread/parent behavior manifests. When I run the release build, everything works swimmingly. Is it possible this is a bug in the MSYS2 build of GDB that I'm using? Maybe there's some way in which Qt is not playing nicely with GDB (or vice versa)?
EDIT 2:
Based on a comment from #Ben, I dug around in more depth in the QObject constructor code (found here which starts on line 821. On line 826, ultimately the function QThreadData::current() is called, and this is where the threads of the parent and child first seem to differ. The Qt internal class QThreadData's static function current() (found here) seems to query thread-local storage to return the current thread's index, I think, which appears different from that of the parent's thread. I don't know enough about the Windows threading libraries to know if this is really what's going on, but after reading the pthread-based version of the code, that's my best guess.
I'm pretty far down the rabbit hole at this point, but it seems like my amended question is: Why does QThreadData::current() return a different thread from the parent? Or why has the thread-local storage containing this data changed between the time the parent was constructed and the child's constructor is called?

Calling shared libraries without releasing the memory

In Ubuntu 14.04, I have a C++ API as a shared library which I am opening using dlopen, and then creating pointers to functions using dlsym. One of these functions CloseAPI releases the API from memory. Here is the syntax:
void* APIhandle = dlopen("Kinova.API.USBCommandLayerUbuntu.so", RTLD_NOW|RTLD_GLOBAL);
int (*CloseAPI) = (int (*)()) dlsym(APIhandle,"CloseAPI");
If I ensure that during my code, the CloseAPI function is always called before the main function returns, then everything seems fine, and I can run the program again the next time. However, if I Ctrl-C and interrupt the program before it has had time to call CloseAPI, then on the next time I run the program, I get a return error whenever I call any of the API functions. I have no documentation saying what this error is, but my intuition is that there is some sort of lock on the library from the previous run of the program. The only thing that allows me to run the program again, is to restart my machine. Logging in and out does not work.
So, my questions are:
1) If my library is a shared library, why am I getting this error when I would have thought a shared library can be loaded by more than one program simultaneously?
2) How can I resolve this issue if I am going to be expecting Ctrl-C to be happening often, without being able to call CloseAPI?
So, if you do use this api correctly then it requires you to do proper clean up after use (which is not really user friendly).
First of all, if you really need to use Ctrl-C, allow program to end properly on this signal: Is destructor called if SIGINT or SIGSTP issued?
Then use a technique with a stack object containing a resource pointer (to a CloseAPI function in this case). Then make sure this object will call CloseAPI in his destructor (you may want to check if CloseAPI wasn't called before). See more in "Effective C++, Chapter 3: Resource Management".
That it, even if you don't call CloseAPI, pointer container will do it for you.
p.s. you should considering doing it even if you're not going to use Ctrl-C. Imagine exception occurred and your program has to be stopped: then you should be sure you don't leave OS in an undefined state.

Waiting for GLContext to be released

Was passed a set of rendering library that is coded with OSG library and run on Window Environment.
In my program, the renderer exists as a member object in my base class in C++. In my class initiation function, I would do all the neccessary steps to initialize the renderer and use the function this renderer class provide accordingly.
However, I have tried to delete my base class, I presumed the renderer member object would be destroyed along with it. However, when I created another instance of the class, the program would crash when I try to access the rendering function within the renderer.
Have enquired about some opinions on this matter and was told that in Windows, upon deleting the class, the renderer would need to release its glContext and this might be indeterminant time in Windows environment pending upon hardware setup
Is this so? If so, what steps could I take beside amending the rendering source code(if I could get it) to resolve the issue?
Thanks
Actually not deleting / releasing the OpenGL context will just create some memory leak but nothing more. Leaving the OpenGL context around should not cause a crash. In fact crashes like yours are often the cause of releasing some object, that's still required by some other part of the program, so not releasing stuff should not be a cause for a crash like yours.
Your issue is looking more like screwed constructor/destructor or operator= then a GL issue.
its just a gues without the actual code to see/test
Most likely you are accessing already deleted pointer somewhere
check all dynamic member variables and pointers inside your class
Had similar problems in the past so check these
trace back pointers in C++
bds 2006 C hidden memory manager conflicts (class new / delete[] vs. AnsiString)
I recommend to take a look at the second link
especially mine own answer, there is nice example of screwed constructor there
Another possible cause
if you are mixing window message code with threads
and accessing visual system calls or objects within threads instead of window code
that can screw up something in the OS and create unexpected crashes ...
at least on windows

Converting a string into a function in c++

I have been looking for a way to dynamically load functions into c++ for some time now, and I think I have finally figure it out. Here is the plan:
Pass the function as a string into C++ (via a socket connection, a file, or something).
Write the string into file.
Have the C++ program compile the file and execute it. If there are any errors, catch them and return it.
Have the newly executed program with the new function pass the memory location of the function to the currently running program.
Save the location of the function to a function pointer variable (the function will always have the same return type and arguments, so
this simplifies the declaration of the pointer).
Run the new function with the function pointer.
The issue is that after step 4, I do not want to keep the new program running since if I do this very often, many running programs will suck up threads. Is there some way to close the new program, but preserve the memory location where the new function is stored? I do not want it being overwritten or made available to other programs while it is still in use.
If you guys have any suggestions for the other steps as well, that would be appreciated as well. There might be other libraries that do things similar to this, and it is fine to recommend them, but this is the approach I want to look into — if not for the accomplishment of it, then for the knowledge of knowing how to do so.
Edit: I am aware of dynamically linked libraries. This is something I am largely looking into to gain a better understanding of how things work in C++.
I can't see how this can work. When you run the new program it'll be a separate process and so any addresses in its process space have no meaning in the original process.
And not just that, but the code you want to call doesn't even exist in the original process, so there's no way to call it in the original process.
As Nick says in his answer, you need either a DLL/shared library or you have to set up some form of interprocess communication so the original process can send data to the new process to be operated on by the function in question and then sent back to the original process.
How about a Dynamic Link Library?
These can be linked/unlinked/replaced at runtime.
Or, if you really want to communicated between processes, you could use a named pipe.
edit- you can also create named shared memory.
for the step 4. we can't directly pass the memory location(address) from one process to another process because the two process use the different virtual memory space. One process can't use memory in other process.
So you need create a shared memory through two processes. and copy your function to this memory, then you can close the newly process.
for shared memory, if in windows, looks Creating Named Shared Memory
http://msdn.microsoft.com/en-us/library/windows/desktop/aa366551(v=vs.85).aspx
after that, you still create another memory space to copy function to it again.
The idea is that the normal memory allocated only has read/write properties, if execute the programmer on it, the CPU will generate the exception.
So, if in windows, you need use VirtualAlloc to allocate the memory with the flag,PAGE_EXECUTE_READWRITE (http://msdn.microsoft.com/en-us/library/windows/desktop/aa366887(v=vs.85).aspx)
void* address = NULL;
address= VirtualAlloc(NULL,
sizeof(emitcode),
MEM_COMMIT|MEM_RESERVE,
PAGE_EXECUTE_READWRITE);
After copy the function to address, you can call the function in address, but need be very careful to keep the stack balance.
Dynamic library are best suited for your problem. Also forget about launching a different process, it's another problem by itself, but in addition to the post above, provided that you did the virtual alloc correctly, just call your function within the same "loadder", then you shouldn't have to worry since you will be running the same RAM size bound stack.
The real problems are:
1 - Compiling the function you want to load, offline from the main program.
2 - Extract the relevant code from the binary produced by the compiler.
3 - Load the string.
1 and 2 require deep understanding of the entire compiler suite, including compiler flag options, linker, etc ... not just the IDE's push buttons ...
If you are OK, with 1 and 2, you should know why using a std::string or anything but pure char *, is an harmfull.
I could continue the entire story but it definitely deserve it's book, since this is Hacker/Cracker way of doing things I strongly recommand to the normal user the use of dynamic library, this is why they exists.
Usually we call this code injection ...
Basically it is forbidden by any modern operating system to access something for exceution after the initial loading has been done for sake of security, so we must fall back to OS wide validated dynamic libraries.
That's said, one you have valid compiled code, if you realy want to achieve that effect you must load your function into memory then define it as executable ( clear the NX bit ) in a system specific way.
But let's be clear, your function must be code position independant and you have no help from the dynamic linker in order to resolve symbol ... that's the hard part of the job.

What could cause SIGSEGV when calling NewObjectArray for JNI in Android?

I just started working with the Android NDK but I keep getting SIGSEGV when I have this call in my C code:
jobjectArray someStringArray;
someStringArray = (*env)->NewObjectArray(env, 10,
(*env)->FindClass(env,"java/lang/String"),(*env)->NewStringUTF(env, ""));
Base on all the example I can find, the above code is correct but I keep getting SIGSERGV and everything is ok if the NewObjectArray line is commented out. Any idea what could cause such a problem?
that looks right, so i'm guessing you've done something else wrong. i assume you're running with checkjni on? you might want to break that up into multiple lines: do the FindClass and check the return value, do the NewStringUTF and check the return value, and then call NewObjectArray.
btw, you might want to pass NULL as the final argument; this pattern of using the empty string as the default value for each element of the array is commonly used (i think it's copy & pasted from some Sun documentation and has spread from there) but it's rarely useful, and it's slightly wasteful. (and it doesn't match the behavior of "new String[10]" in Java.)
I guess one of the possible causes is that in a long-run JNI method, the VM aborts when running out of the per-method-invocation local reference slots (normally 512 slots in Android).
Since FindClass() and NewStringUTF() functions would allocate local references, if you stay in a JNI method for a long time, the VM do not know whether a specific local reference should be recycled or not. So you should explicitly call DeleteLocalRef() to release the acquired local references when not required anymore. If you don't do this, the "zombie" local references will occupy slots in VM, and the VM aborts while running out of all the local reference slots.
In short-run JNI method, this may not be a problem due to all the local references would be recycled when exiting from a JNI method.