Mysterious global variables - COM/STA Apartment Object

Here's the scenario:
-COM DLL, loaded into the address space of the process that uses the DLL.
-Inside the DLL, a couple of global variables exist (let's say var a, var b), along with a global function.
-Process starts up, calls the global function and initializes globals a, b and calls CoInitialize(NULL) - The thread is an STA.
-Then same global function creates an STA COM object
Later in the program, the same thread (the thread that called CoInitialize above and created the STA COM object) calls the same global C function (let's call it func()) in this DLL. In the scope of the C function, the state of the global variables is exactly as expected (i.e. correctly initialized).
The minute func() invokes a COM method on the existing STA COM object, the COM object, which lives in the same DLL, sees completely different copies of the global variables (var a, var b). I took the address of both variables, and the addresses seen in the C function are completely different from those seen in the invoked COM object's method.
What is going on? I thought globals in same address space should be visible across the board.

It's possible that two instances of your DLL are being loaded -- one explicitly by the application that hosts your DLL, and the second through the COM subsystem via CoCreateInstance. The former will look in the DLL search path for the application's process, while the latter will look in the registry for the location of the COM component that implements your COCLASS.
If your DLL has a DllMain (or the InitInstance function if it's an MFC-based DLL), then you can breakpoint it and look at the hinstance argument (or AfxGetInstanceHandle if MFC) to see if (a) you initialize twice and (b) you see two different DLL instance handles. If so, then you're definitely loading twice.
The DLL's location in the file system matters, so you should see if there are copies in separate locations that might be separately loaded based on the rules I mentioned above.
In general, a COM DLL should never be loaded directly. You should break your functionality up into two DLLs, with a COM server DLL dedicated to the COM stuff. You can provide yourself an internal COCLASS interface that will enable you to pass your globals to the COM DLL, if you so wish.

Related

what is the best way to initialize my global data in shared library?

I have a single class which does all the required initialization.
Currently I have declared a global object of this class type, which is instantiated on library load.
I've seen other approaches, like declaring a
BOOL APIENTRY DllMain
entry point for the shared library and doing the actual initialization on process attach.
Does this differ from letting the implicit global initialization do its job? Which way is better?

This is what happens during C++ DLL startup:
The system calls the DLL's entry point, generated by your compiler.
Entry point calls DllMainCRTStartup (name may differ), which initializes C/C++ runtimes and instantiates all global objects.
DllMainCRTStartup then calls user-defined DllMain.
I personally prefer DllMain, because this way I can explicitly control the order of initialization. When you use global objects in different compilation units, they are initialized in an unspecified order, which may bring some unexpected surprises 10 minutes before the deadline.
DllMain also lets you do per-thread initialization, which you cannot achieve with global objects. However, it is not portable to other platforms.
P.S. You do NOT need mutex in DllMain, as all calls to it are already serialized under process-global critical section. I.e. it is guaranteed two threads will not enter it at the same time for any purpose. This is also the reason why you should not communicate with other threads, load other libraries etc. from this function; see MSDN article for explanation.
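As a rough illustration of "explicitly controlling the order of initialization", here is a minimal sketch. The subsystem names (InitConfig, InitLogger) and the InitializeLibrary() entry point are hypothetical and not from the answer above; in a Windows DLL, InitializeLibrary() would be called from DllMain when the reason is DLL_PROCESS_ATTACH, and it would have to respect the usual DllMain restrictions.

```cpp
// Hypothetical subsystems; the names are illustrative only.
// The step counters record the order in which the init functions ran.
static int g_nextStep = 0;
int g_configStep = -1;   // -1 means "not initialized yet"
int g_loggerStep = -1;

void InitConfig() {
    g_configStep = g_nextStep++;
}

// The logger depends on the config subsystem being ready.
bool InitLogger() {
    if (g_configStep < 0) {
        return false;  // dependency missing: refuse to initialize
    }
    g_loggerStep = g_nextStep++;
    return true;
}

// Single entry point that fixes the order, regardless of which
// compilation unit each subsystem lives in. In a Windows DLL this
// would be called from DllMain on DLL_PROCESS_ATTACH.
bool InitializeLibrary() {
    InitConfig();
    return InitLogger();
}
```

Because the order is written down in one function, a dependency such as "logger needs config" no longer relies on how the linker happens to arrange global constructors.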

A couple of things that should never be done from DllMain:
Call LoadLibrary or LoadLibraryEx (either directly or indirectly). This can cause a deadlock or a crash.
Synchronize with other threads. This can cause a deadlock.
Acquire a synchronization object that is owned by code that is waiting to acquire the loader lock. This can cause a deadlock.
Initialize COM threads by using CoInitializeEx. Under certain conditions, this function can call LoadLibraryEx.
Call the registry functions. These functions are implemented in Advapi32.dll. If Advapi32.dll is not initialized before your DLL, the DLL can access uninitialized memory and cause the process to crash.
Call CreateProcess. Creating a process can load another DLL.
Call ExitThread. Exiting a thread during DLL detach can cause the loader lock to be acquired again, causing a deadlock or a crash.
Call CreateThread. Creating a thread can work if you do not synchronize with other threads, but it is risky.
Create a named pipe or other named object (Windows 2000 only). In Windows 2000, named objects are provided by the Terminal Services DLL. If this DLL is not initialized, call to the DLL can cause the process to crash.
Use the memory management function from the dynamic C Run-Time (CRT). If the CRT DLL is not initialized, calls to these functions can cause the process to crash.
Call functions in User32.dll or Gdi32.dll. Some functions load another DLL, which may not be initialized.
Use managed code.

You need a static boolean initialization variable and a mutex. Statically initialize "initialized" to 0. In your DllMain(), make a call to CreateMutex(). Use bInitialOwner=0 and a value for lpName that's unique to your application. Then use WaitForSingleObject() to wait for the mutex. Check if initialized is non-zero. If not, do your initialization, and then set initialized to 1. If initialized is non-zero, do nothing. Finally, release the mutex using ReleaseMutex() and close it using CloseHandle().
Here's some pseudo-code, with error and exception handling omitted:
initialized = 0;
DllMain()
{
    mutex = CreateMutex(..., 0, "some-unique-name");
    result = WaitForSingleObject(mutex, ...);
    if (result == WAIT_OBJECT_0) {
        if (!initialized) {
            // initialization goes here
            initialized = 1;
        }
        // release only if the wait actually acquired the mutex
        ReleaseMutex(mutex);
    }
    CloseHandle(mutex);
}

Hi, I would recommend a singleton class, of which only a single object can be created and used. A singleton class can be created with a private constructor. Suppose your class A is a singleton; its object can then be used in the constructor of each class you want to initialize. Please post some sample code so others can help you better.

CComModule::Unlock();

I've been trying to determine what this function does, however I cannot seem to find it anywhere under the MSDN documentation of the CComModule class.
Could anyone tell me what it is used for?

This function is for DllCanUnloadNow() to work properly.
You know that when you call CoCreateInstance() for an in-proc server COM automagically calls LoadLibraryEx() to load the COM server DLL if necessary. But how long is the DLL kept loaded? In fact COM calls DllCanUnloadNow() for every loaded COM server DLL periodically. If it returns S_OK COM is allowed to call FreeLibrary().
When is it safe to unload the DLL? Obviously you can't unload it until all the objects implemented by the DLL are destroyed. So here comes the "lock count": a global integer variable counting the number of live objects implemented by the DLL.
When a new COM object is created, CComModule::Lock() is called from its constructor (usually the CComObject constructor) and increments the variable; when an object is destroyed, CComModule::Unlock() is called from its destructor and decrements the variable. When CComModule::GetLockCount() returns zero, it means that there are no live objects and it's safe to unload the DLL.
So the lock count is very similar to the reference count implemented by IUnknown. The reference count is per object, the lock count is per COM in-proc server.
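The lock-count idea can be sketched in portable C++ as follows. This is a simplified stand-in, not ATL's actual implementation: ModuleLockCount, Widget and CanUnloadNow() are hypothetical names, and the bool result stands in for the S_OK / S_FALSE that a real DllCanUnloadNow() returns.

```cpp
#include <atomic>

// Simplified stand-in for a module-wide lock count (not ATL's real code).
class ModuleLockCount {
public:
    long Lock()   { return ++count_; }
    long Unlock() { return --count_; }
    long GetLockCount() const { return count_.load(); }
private:
    std::atomic<long> count_{0};
};

ModuleLockCount g_module;  // one instance per DLL

// Any object implemented by the DLL pins the module while it is alive.
class Widget {
public:
    Widget()  { g_module.Lock(); }
    ~Widget() { g_module.Unlock(); }
    Widget(const Widget&) = delete;
    Widget& operator=(const Widget&) = delete;
};

// Mirrors the decision DllCanUnloadNow() makes:
// true stands in for S_OK, false for S_FALSE.
bool CanUnloadNow() {
    return g_module.GetLockCount() == 0;
}
```

While at least one Widget exists, CanUnloadNow() reports false; once the last one is destroyed, the count drops back to zero and unloading becomes safe.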

notify an object when a thread starts

I have an object A which should be notified (A::Notify() method) when some thread starts or dies.
Let's say this thread dynamically loads some DLL file of mine (I can write it).
I believe I should write the DllMain function of this DLL; however, I'm not sure how to get a reference to the A object from this function so I can run its Notify() method.
Any ideas?

A DLL is loaded once in every process. Once loaded, its DllMain is automatically called whenever a thread is created in the process. Assuming A is a global variable, you can do the following:
After you first load the DLL, call an exported function that will set a global pointer to A in the DLL
Whenever DllMain is called with the reason being thread attached, call A via the pointer you have in the DLL.
Another option would be to start a message loop in your exe and pass its thread ID to the DLL. Then, whenever a thread attaches to the DLL, send the message loop a message with the details of the created thread. This is a slightly more complicated solution, but it saves you from having to make the DLL familiar with the A class.
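The first option (a global pointer set through an exported function) might look roughly like this. It is a sketch under assumptions: SetNotifyTarget() and OnThreadEvent() are hypothetical names, the class A here is a placeholder for the real one, and the DllMain wiring is only described in comments so the snippet stays platform-neutral.

```cpp
// Placeholder for the class owned by the host program.
class A {
public:
    void Notify() { ++notifications; }
    int notifications = 0;
};

// Lives inside the DLL: the object to notify, if any.
static A* g_target = nullptr;

// Hypothetical exported function (on Windows it would also be marked
// __declspec(dllexport)). The host calls it once, right after loading
// the DLL, to hand over the A instance.
extern "C" void SetNotifyTarget(A* a) {
    g_target = a;
}

// DllMain would call this when the reason is DLL_THREAD_ATTACH or
// DLL_THREAD_DETACH. Threads that started before the target was set
// are silently ignored.
void OnThreadEvent() {
    if (g_target != nullptr) {
        g_target->Notify();
    }
}
```

Since the call happens inside DllMain, Notify() should stay lightweight and avoid the operations that are unsafe under the loader lock.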

Is it okay to make A::Notify() as static method?
Otherwise, Singleton method might serve the purpose.

So if I understand you right, in your main program you have an instance of class A. When your main program loads certain DLLs, you want it to call A::Notify for that instance?
As far as I'm aware there is no way to pass an additional argument to LoadLibrary.
If A::Notify can be static, or A is a singleton, export a "NotifyA" function from the exe, then have the DLL call LoadLibrary("yourexe") and use GetProcAddress to get the address of NotifyA, which you can then call. (Yes, exe files can export functions just like DLLs!)
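A second option is to write your own LoadLibrary wrapper that calls a second function after DllMain, e.g.
HMODULE MyLoadLibrary(const std::string& dll, A* a)
{
    HMODULE module = LoadLibraryA(dll.c_str());
    if (module == NULL)
        return NULL;
    // "Init" is a function the DLL must export; it receives the A instance.
    typedef void (*InitFn)(A*);
    InitFn init = (InitFn)GetProcAddress(module, "Init");
    if (init != NULL)
        init(a);
    return module;
}
The DLL's Init function can then store the A instance for later.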

CreateRemoteThread, LoadLibrary, and PostThreadMessage. What's the proper IPC method?

Alright, I'm injecting some code into another process using the CreateRemoteThread/LoadLibrary "trick".
I end up with a thread id, and a process with a DLL of my choice spinning up. At least in theory; the DLL does nothing at the moment, so verifying this is a little tricky. For the time being I'm willing to accept it on faith alone. Besides, this question needs to be answered before I push too hard in this direction.
Basically, you can't block in DllMain. However, all I've got to communicate with the remote thread is its id. This practically begs for PostThreadMessage/GetMessage shenanigans, which block. I could spin up another thread in DllMain, but I have no way of communicating its id back to the creating thread and no way of passing another thread's id to the remote one.
In a nutshell, if I'm creating a remote thread in a process how should I be communicating with the original process?

Step zero; the injected DLL should have an entry point, let's call it Init(), that takes a LPCWSTR as its single parameter and returns an int; i.e. the same signature as LoadLibrary() and therefore equally valid as a thread start function address...
Step one; inject using LoadLibrary and a remote thread. Do nothing clever in the injected DLL's DllMain(). Store the HMODULE that is returned as the exit code of the injecting thread; this is the HMODULE of the injected DLL and the return value of LoadLibrary().
Note that this is no longer a reliable approach on x64 if /DYNAMICBASE and ASLR (address space layout randomisation) are enabled, as the HMODULE on x64 is larger than the DWORD value returned from GetExitCodeThread(), and the address space changes mean that the HMODULE's value is no longer as likely to be small enough to fit into the DWORD. A workaround is to use shared memory to communicate the HMODULE.
Step two; load the injected DLL using LoadLibrary into the process that is doing the injecting. Then find the offset of your Init() entrypoint in your address space and subtract from it the HMODULE of your injected DLL in your address space. You now have the relative offset of the Init() function. Take the HMODULE of the injected DLL in the target process (i.e. the value you saved in step one) and add the relative address of Init() to it. You now have the address of Init() in your target process.
Step three; call Init() in the target process using the same 'remote thread' approach that you used to call LoadLibrary(). You can pass a string to the Init() call, this can be anything you fancy.
What I tend to do is pass a unique string key that I use as part of a named pipe name. The Injected DLL and the injecting process now both know the name of a named pipe and you can communicate between them. The Init() function isn't DLLMain() and doesn't suffer from the restrictions that affect DLLMain() (as it's not called from within LoadLibrary, etc) and so you can do normal stuff in it. Once the injected DLL and the injecting process are connected via a named pipe you can pass commands and data results back and forth as you like. Since you pass the Init() function a string you can make sure that the named pipe is unique for this particular instance of your injecting process and this particular injected DLL which means you can run multiple instances of the injecting process at the same time and each process can inject into multiple target processes and all of these communication channels are unique and controllable.
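The address arithmetic in step two can be sketched as below. RemoteAddress() is a hypothetical helper, not part of any API; in real code localBase would be the HMODULE returned by LoadLibrary in the injecting process, localFunction the result of GetProcAddress for Init(), and remoteBase the HMODULE saved in step one, each cast to an integer.

```cpp
#include <cstdint>

// Computes where a function lives in the target process, given where
// the same DLL is loaded locally and remotely. The function sits at
// the same offset from the module base in both processes.
std::uintptr_t RemoteAddress(std::uintptr_t localBase,
                             std::uintptr_t localFunction,
                             std::uintptr_t remoteBase) {
    const std::uintptr_t offset = localFunction - localBase;  // relative offset of Init()
    return remoteBase + offset;
}
```

For example, if the DLL is loaded locally at 0x10000000 with Init() at 0x10001234, and remotely at 0x20000000, the remote Init() is at 0x20001234.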

You don't have the thread id of a live thread in the remote process, because the thread you used to load the DLL exited as soon as your module was successfully loaded into the address space of the target process.
You can easily use the normal interprocess communication methods like named sections/pipes/creating a named window/etc. to communicate with your 'injecting' process.

What happens to global variables declared in a DLL?

Let's say I write a DLL in C++, and declare a global object of a class with a non-trivial destructor. Will the destructor be called when the DLL is unloaded?

In a Windows C++ DLL, all global objects (including static members of classes) will be constructed just before the calling of the DllMain with DLL_PROCESS_ATTACH, and they will be destroyed just after the call of the DllMain with DLL_PROCESS_DETACH.
Now, you must consider three problems:
0 - Of course, global non-const objects are evil (but you already know that, so I'll avoid mentioning multithreading, locks, god-objects, etc.)
1 - The order of construction of objects in different compilation units (i.e. CPP files) is not guaranteed, so you can't hope that object A will be constructed before B if the two objects are instantiated in two different CPPs. This is important if B depends on A. The solution is to move all global objects into the same CPP file: inside a single compilation unit, objects are constructed in the order in which they are defined (and destroyed in the reverse order).
2 - There are things that are forbidden to do in the DllMain. Those things are probably forbidden, too, in the constructors. So avoid locking something. See Raymond Chen's excellent blog on the subject:
Some reasons not to do anything scary in your DllMain
Another reason not to do anything scary in your DllMain: Inadvertent deadlock
Some reasons not to do anything scary in your DllMain, part 3
In this case, lazy initialization could be interesting: The classes remain in an "un-initialized" state (internal pointers are NULL, booleans are false, whatever) until you call one of their methods, at which point they'll initialize themselves. If you use those objects inside the main (or one of the main's descendant functions), you'll be ok because they will be called after execution of DllMain.
3 - Of course, if some global objects in DLL A depend on global objects in DLL B, you should be very, very careful about DLL loading order, and thus dependencies. In this case, DLLs with direct or indirect circular dependencies will cause you an insane amount of headaches. The best solution is to break the circular dependencies.
P.S.: Note that in C++, constructors can throw, and you don't want an exception in the middle of a DLL load, so be sure the constructors of your global objects won't throw without a very, very good reason. As correctly written destructors are not allowed to throw, DLL unloading should be OK in this case.
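The lazy-initialization approach mentioned in point 2 could look something like this minimal sketch. The Settings class and its "default" value are hypothetical; the point is only that the constructor does no real work, so nothing risky runs while the DLL is being loaded, and the real initialization is deferred to the first method call.

```cpp
#include <memory>
#include <mutex>
#include <string>

class Settings {
public:
    // Trivial constructor: no allocation, no locking, nothing that
    // could misbehave while the DLL is being loaded.
    Settings() = default;

    // First call performs the real initialization.
    const std::string& Name() {
        EnsureInitialized();
        return *name_;
    }

    bool IsInitialized() {
        std::lock_guard<std::mutex> lock(mutex_);
        return name_ != nullptr;
    }

private:
    void EnsureInitialized() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!name_) {
            // Placeholder for the expensive or "scary" setup work.
            name_.reset(new std::string("default"));
        }
    }

    std::mutex mutex_;
    std::unique_ptr<std::string> name_;  // null until first use
};

Settings g_settings;  // global object, still un-initialized after DLL load
```

The lock is only taken inside methods, which callers are expected to invoke after DllMain has returned, not from DllMain itself.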

This page from Microsoft goes into the details of DLL initialization and destruction of globals:
http://msdn.microsoft.com/en-us/library/988ye33t.aspx

If you want to see the actual code that gets executed when linking a .dll, take a look at %ProgramFiles%\Visual Studio 8\vc\crt\src\dllcrt0.c.
From inspection, destructors will be called via _cexit() when the internal reference count maintained by the dll CRT hits zero.

It should be called when either the application ends or the DLL is unloaded, whichever comes first. Note that this is somewhat dependent on the actual runtime you're compiling against.
Also, beware non-trivial destructors as there are both timing and ordering issues. Your DLL may be unloaded after a DLL your destructor relies on, which would obviously cause issues.

On Windows, binary image files with the extensions *.exe and *.dll are in PE format.
Such files have an entry point. You can view it with the dumpbin tool:
dumpbin /headers dllname.dll
If you use the C runtime from Microsoft, then your entry point will be something like
*CRTStartup or *DllMainCRTStartup
These functions initialize the C and C++ runtimes and delegate execution to main/WinMain or to DllMain, respectively.
If you use Microsoft's VC compiler, you can look at the source code of these functions in your VC directory:
crt0.c
dllcrt0.c
DllMainCRTStartup does everything needed to initialize and clean up your global variables in the .data sections. In the normal scenario, cleanup happens when it receives the DLL_PROCESS_DETACH notification during DLL unload, for example when:
main or WinMain of the program's startup thread returns
you explicitly call FreeLibrary and the DLL's usage counter drops to zero

When DllMain is called with fdwReason = DLL_PROCESS_DETACH, it means the DLL is being unloaded by the application. This happens before the destructors of global/static objects are called.
