How to solve a problem in using RAII code and non-RAII code together in C++? - c++

We have 3 different libraries, each developed by a different developer, and each was (presumably) well designed. But since some of the libraries are using RAII and some don't, and some of the libraries are loaded dynamically, and the others aren't - it doesn't work.
Each of the developers is saying that what he is doing is right, and making a methodology change just for this case (e.g. creating a RAII singleton in B) would solve the problem, but will look just as an ugly patch.
How would you recommend to solve this problem?
Please see the code to understand the problem:
My code:
static A* Singleton::GetA()
{
static A* pA = NULL;
if (pA == NULL)
{
pA = CreateA();
}
return pA;
}
Singleton::~Singleton() // <-- static object's destructor,
// executed at the unloading of My Dll.
{
if (pA != NULL)
{
DestroyA();
pA = NULL;
}
}
"A" code (in another Dll, linked statically to my Dll) :
A* CreateA()
{
// Load B Dll library dynamically
// do all other initializations and return A*
}
void DestroyA()
{
DestroyB();
}
"B" code (in another Dll, loaded dynamically from A) :
static SomeIfc* pSomeIfc;
void DestroyB()
{
if (pSomeIfc != NULL)
{
delete pSomeIfc; // <-- crashes because the Dll B was unloaded already,
// since it was loaded dynamically, so it is unloaded
// before the static Dlls are unloaded.
pSomeIfc = NULL;
}
}

My answer is the same as every time people go on about singletons: Don't effing do it!
Your singleton causes the destructors to be called too late, after some libraries have been unloaded. If you dropped the "static" and made it a regular object, instantiated in a tighter scope, it'd get destroyed before any libraries got unloaded, and everything should work again.
Just don't use singletons.

First, in the example given, I fail to see how Singleton::~Singleton() has access to pA since you declared pA as a static local variable in Singleton::getA().
Then you explained that "B" library is loaded dynamically in A* Create();. Where is it unloaded then? Why doesn't unloading of "B" library happen in void DestroyA() before invoking DestroyB()?
And what do you call "linking a DLL statically"?
Finally, I don't want to be harsh but there are certainly better designs that putting singletons everywhere: in other words, do you really need the object that manages the lifetime of "A" to be a singleton anyway? You could invoke CreateA() at the beginning of your application and rely on atexit to make the DestroyA() call.
EDIT: Since you're relying on the OS to unload "B" library, why don't you remove DestroyB() call from DestroyA() implementation. Then make the DestroyB() call when "B" library unloads using RAII (or even OS specific mechanisms like calling it from DllMain under Windows and marking it with __attribute__((destructor)) under Linux or Mac).
EDIT2: Apparently, from your comments, your program has a lot of singletons. If they are static singletons (known as Meyers singletons), and they depend on each other, then sooner or later, it's going to break. Controlling the order of destruction (hence the singleton lifetime) requires some work, see Alexandrescu's book or the link given by #Martin in the comments.
References (a bit related, worth reading)
Clean Code Talks - Global State and Singletons
Once Is Not Enough
Performant Singletons
Modern C++ Design, Implementing Singletons

At first this looks like a problem of dueling APIs, but really it's just another static destructor problem.
Generally it's best to avoid doing anything nontrivial from a global or static destructor, for the reason you've discovered but also for other reasons.
In particular: On Windows, destructors for global and static objects in DLLs are called under special circumstances, and there are restrictions on what they may do.
If your DLL is linked with the C run-time library (CRT), the entry point provided by the CRT calls the constructors and destructors for global and static C++ objects. Therefore, these restrictions for DllMain also apply to constructors and destructors and any code that is called from them.
— http://msdn.microsoft.com/en-us/library/ms682583%28VS.85%29.aspx
The restrictions are explained on that page, but not very well. I would just try to avoid the issue, perhaps by mimicking A's API (with its explicit create and destroy functions) instead of using a singleton.

You are seeing the side effects of how atexit works in the MS C runtime.
This is the order:
Call atexit handlers for handlers defined in the executable and static libraries linked with the executable
Free all handles
Call atexit handlers for handlers defined in all dlls
So if you load the dll for B manually via a handle this handle will be freed and B will be unavailable in the destructor for A which is in a dll loaded automatically.
I have had similar problems cleaning up critical sections, sockets and db connections in singletons in the past.
The solution is to not use singletons or if you have to, clean them up before main exits. eg
int main(int argc, char * argv [])
{
A* a = Singleton::GetA();
// Do stuff;
Singleton::cleanup();
return 0;
}
Or even better to use RIIA to clean up after you
int main(int argc, char * argv [])
{
Singleton::Autoclean singletoncleaner; // cleans up singleton when it goes out of scope.
A* a = Singleton::GetA();
// Do stuff;
Singleton::cleanup();
return 0;
}
As a result of the problems with Singletons I try to avoid them completely esp since they make testing hell. If i require access to the same object everywhere and don't want to pass the reference around I construct it as an instance in main, the constructor then registers itself as the instance and you access it like a singleton everywhere else. The only place that has to know it is not a singleton is main.
Class FauxSingleton
{
public:
FauxSingleton() {
// Some construction
if (theInstance == 0) {
theInstance = this;
} else {
// could throw an exception here if it makes sense.
// I generaly don't as I might use two instances in a unit test
}
}
~FauxSingleton() {
if (theInstance == this) {
theInstance = 0;
}
}
static FauxSingleton * instance() {
return theInstance;
}
static FauxSingleton * theInstance;
};
int main(int argc, char * argv [])
{
FauxSingleton fauxSingleton;
// Do stuff;
}
// Somewhere else in the application;
void foo()
{
FauxSingleton faux = FauxSingleton::instance();
// Do stuff with faux;
}
Obviously the constructor and destructor are not thread safe but normally they are called in main before any threads are spawned. This is very useful in CORBA applications where you require an orb for the lifetime of the application and you also require access to it in lots of unrelated places.

The simple answer is don't write your own code that dynamically loads/unloads DLL.
Use a framework that handles all the details.
Once a DLL is loaded it is generally a BAD idea to unload it yourself manually.
There are so many gotchas (as you have found).
The simplest solution is to never to unload the DLL (once loaded). Let that OS do that as the application exits.

Related

How do you know whether main has exited?

In both C and C++, atexit functions are called either inside exit, or after main returns (which notionally calls exit: __libc_start_main(argc,argv) { __libc_constructors(); exit(main(argc,argv)); }).
Is there a way to find out if we're inside the exit sequence? Destructors of C++ global and local statics are registered with atexit, so your code can certainly be called into at this stage. (Interestingly, on some platforms if you try to create a C++ local-static object inside exit, it deadlocks on the exit lock!)
My best attempt so far is as follows:
static bool mainExited = false;
static void watchMain() {
static struct MainWatcher {
~MainWatcher() { mainExited = true; }
} watcher;
}
When you want to watch for exit, you call watchMain(), and mainExited tells you at any time whether or not the exit sequence has begun -- except of course if a later-initialized local-static object is destructing!
Can the technique be improved to correct this, or is there another method that would work?
Aside - the use case!
While the problem is interesting from a language point-of-view (a bit like "can I tell if I'm inside a catch block?"), it's also useful to outline a use-case. I came across the problem while writing some code which will be run with and without a JVM loaded (with either direct calls or calls via JNI). After the JVM exits, the C atexit handlers are called, and JNI_OnUnload is not called if the JNI shared library is not unloaded by the class loader.
Since the shared library's objects can be destructed both by explicit destruction (and should free their resources), and by cleanup at exit, I need to distinguish these two cases safely, since the JVM is gone by the time we reach the exit code! Basically without a bit of sniffing there's no way I can find in the JNI specs/docs for a shared library to know whether the JVM is still there or not, and if it's gone, then it's certainly wrong to try and free up references we have to Java objects.
The real issue here is that the ownership semantics you've listed are messed up. The JVM kinda owns your shared library but also kinda doesn't. You have a bunch of references to Java objects that sometimes you need to clean up but sometimes you don't.
The real solution here is simply to not keep references to Java objects as global variables. Then you won't need to know if the JVM still exists or not when the library is unloaded for whatever reason. Just keep references to Java objects from inside objects referenced by Java and then let the JVM care about whether or not it needs to free them.
In other words, don't make yourself responsible for cleanup on exit in the first place.
Your watcher doesn't need to rely on any static initialization order:
#include <iostream>
struct MainWatcher // : boost::noncopyable
{
enum MainStatus { before, during, after };
MainWatcher(MainStatus &b): flag(b) { flag = during; }
~MainWatcher() { flag = after; }
MainStatus &flag;
};
//////////////////////////////////////////////////////////////////////
// Test suite
//////////////////////////////////////////////////////////////////////
// note: static data area is zero-initialized before static objects constructed
MainWatcher::MainStatus main_flag;
char const *main_word()
{
switch(main_flag)
{
case MainWatcher::before: return "before main()";
case MainWatcher::during: return "during main()";
case MainWatcher::after: return "after main()";
default: return "(error)";
}
}
struct Test
{
Test() { std::cout << "Test created " << main_word() << "\n"; }
~Test() { std::cout << "Test destroyed " << main_word() << "\n"; }
};
Test t1;
int main()
{
MainWatcher watcher(main_flag);
// rest of code
Test t2;
}

Warm up some variables in C++

I have a C++ library that works on some numeric values, this values are not available at compile time but are immediatly available at runtime and are based on machine-related details, in short I need values like display resolution, the number of CPU cores and so on.
The key points of my question are:
I can't ask to the user to input this values ( both the coders/user of my lib and the final user )
I need to do this warm up only once the application starts, it's a 1 time only thing
this values are later used by methods and classes
the possible solutions are:
build a data structure Data, declare some Data dummy where dummy is the name of the variable used to store everything and the contructor/s will handle the one time inizialization for the related values
wrap something like the first solution in a method like WarmUp() and putting this method right after the start for the main() ( it's also a simple thing to remember and use )
the big problems that are still unsolved are:
the user can declare more than 1 data structure since Data it's a type and there are no restrictions about throwing 2-4-5-17 variables of the same type in C++
the WarmUp() method can be a little intrusive in the design of the other classes, it can also happen that the WarmUp() method is used in local methods and not in the main().
I basically need to force the creation of 1 single instance of 1 specific type at runtime when I have no power over the actual use of my library, or at least I need to design this in a way that the user will immediately understand what kind of error is going on keeping the use of the library intuitive and simple as much as possible.
Can you see a solution to this ?
EDIT:
my problems are also more difficult due to the fact that I'm also trying to get a multi-threading compatible data structure.
What about to use lazily-create singleton? E.g
struct Data
{
static Data & instance()
{
static Data data_;
return data_;
}
private:
Data()
{
//init
}
}
Data would be initialized on first use, when you call Data::instance()
Edit: as for multithreading, read efficient thread-safe singleton in C++
Edit2 realisation using boost::call_once
First, as others have said, the obvious (and best) answer to
this is a singleton. Since you've added the multithreading
requirement, however: there are two solutions, depending on
whether the object will be modified by code using the singleton.
(From your description, I gather not.) If not, then it is
sufficient to use the "naïve" implementation of a singleton, and
ensure that the singleton is initialized before threads are
started. If no thread is started before you enter main (and
I would consider it bad practice otherwise), then something like
the following is largely sufficient:
class Singleton
{
static Singleton const* ourInstance;
Singleton();
Singleton( Singleton const& );
Singleton& operator=( Singleton const& );
public:
static Singleton const& instance();
};
and in the implementation:
Singleton const* Singleton::ourInstance = &Singleton::instance();
Singleton const&
Singleton::instance()
{
if ( ourInstance == NULL ) {
ourInstance = new Singleton;
}
return *ourInstance;
}
No locking is necessary, since no thread will be modifying
anything once threading starts.
If the singleton is mutable, then you have to protect all access
to it. You could do something like the above (without the
const, obviously), and leave the locking to the client, but in
such cases, I'd prefer locking in the instance function, and
returning an std::shared_ptr with a deleter which frees the
lock, which was acquired in the instance function. I think
something like the following could work (but I've never actually
needed it, and so haven't tried it):
class Singleton
{
static Singleton* ourInstance;
static std::mutex ourMutex;
class LockForPointer
{
public:
operator()( Singleton* )
{
Singleton::ourMutex.unlock();
}
};
class LockForInstance
{
bool myOwnershipIsTransfered;
public:
LockForInstance
: myOwnershipIsTransfered( false )
{
Singleton::ourMutex.lock();
}
~LockForInstance()
{
if ( !myOwnershipIsTransfered ) {
Singleton::ourMutex.unlock();
}
}
LockForPointer transferOwnership()
{
myOwnershipIsTransfered = true;
return LockForPointer();
}
};
public:
static std::shared_ptr<Singleton> instance();
};
and the implementation:
static Singleton* ourInstance = NULL;
static std::mutex ourMutex;
std::shared_ptr<Singleton>
Singleton::instance()
{
LockForInstance lock;
if ( ourInstance == NULL ) {
ourInstance = new Singleton;
}
return std::shared_ptr<Singleton>( ourInstance, lock.transferOwnership() );
}
This way, the same lock is used for the check for null and for
accessing the data.
Warmup is usually related to performance issues (and makes me think of the processor cache, see the __builtin_prefetch of GCC).
Maybe you want to make a singleton class, there are many solutions to this (e.g. this C++ singleton tutorial).
Also, if performance is the primary concern, you could consider that the configured parameters are given at initialization (or even at installation) time. Then, you could specialize your code, perhaps as simply as having templates and instanciating them at first by emitting (at runtime = initialization time) the appropriate C++ stub source code and compiling it (at "runtime", e.g. at initialization) and dynamically loading it (using plugins & dlopen ...). See also this answer.

Portable thread-safe lazy singleton

Greetings to all.
I'm trying to write a thread safe lazy singleton for future use. Here's the best I could come up with. Can anyone spot any problems with it? The key assumption is that static initialization occurs in a single thread before dynamic initialisations. (this will be used for a commercial project and company is not using boost :(, life would be a breeze otherwise :)
PS: Haven't check that this compiles yet, my apologies.
/*
There are two difficulties when implementing the singleton pattern:
Problem (a): The "global variable instantiation fiasco". TODO: URL
This is due to the unspecified order in which global variables are initialised. Static class members are equivalent
to a global variable in C++ during initialisation.
Problem (b): Multi-threading.
Care must be taken to ensure that the mutex initialisation is handled properly with respect to problem (a).
*/
/*
Things achieved, maybe:
*) Portable
*) Lazy creation.
*) Safe from unspecified order of global variable initialisation.
*) Thread-safe.
*) Mutex is properly initialise when invoked during global variable intialisation:
*) Effectively lock free in instance().
*/
/************************************************************************************
Platform dependent mutex implementation
*/
class Mutex
{
public:
void lock();
void unlock();
};
/************************************************************************************
Threadsafe singleton
*/
class Singleton
{
public: // Interface
static Singleton* Instance();
private: // Static helper functions
static Mutex* getMutex();
private: // Static members
static Singleton* _pInstance;
static Mutex* _pMutex;
private: // Instance members
bool* _pInstanceCreated; // This is here to convince myself that the compiler is not re-ordering instructions.
private: // Singletons can't be coppied
explicit Singleton();
~Singleton() { }
};
/************************************************************************************
We can't use a static class member variable to initialised the mutex due to the unspecified
order of initialisation of global variables.
Calling this from
*/
Mutex* Singleton::getMutex()
{
static Mutex* pMutex = 0; // alternatively: static Mutex* pMutex = new Mutex();
if( !pMutex )
{
pMutex = new Mutex(); // Constructor initialises the mutex: eg. pthread_mutex_init( ... )
}
return pMutex;
}
/************************************************************************************
This static member variable ensures that we call Singleton::getMutex() at least once before
the main entry point of the program so that the mutex is always initialised before any threads
are created.
*/
Mutex* Singleton::_pMutex = Singleton::getMutex();
/************************************************************************************
Keep track of the singleton object for possible deletion.
*/
Singleton* Singleton::_pInstance = Singleton::Instance();
/************************************************************************************
Read the comments in Singleton::Instance().
*/
Singleton::Singleton( bool* pInstanceCreated )
{
fprintf( stderr, "Constructor\n" );
_pInstanceCreated = pInstanceCreated;
}
/************************************************************************************
Read the comments in Singleton::Instance().
*/
void Singleton::setInstanceCreated()
{
_pInstanceCreated = true;
}
/************************************************************************************
Fingers crossed.
*/
Singleton* Singleton::Instance()
{
/*
'instance' is initialised to zero the first time control flows over it. So
avoids the unspecified order of global variable initialisation problem.
*/
static Singleton* instance = 0;
/*
When we do:
instance = new Singleton( instanceCreated );
the compiler can reorder instructions and any way it wants as long
as the observed behaviour is consistent to that of a single threaded environment ( assuming
that no thread-safe compiler flags are specified). The following is thus not threadsafe:
if( !instance )
{
lock();
if( !instance )
{
instance = new Singleton( instanceCreated );
}
lock();
}
Instead we use:
static bool instanceCreated = false;
as the initialisation indicator.
*/
static bool instanceCreated = false;
/*
Double check pattern with a slight swist.
*/
if( !instanceCreated )
{
getMutex()->lock();
if( !instanceCreated )
{
/*
The ctor keeps a persistent reference to 'instanceCreated'.
In order to convince our-selves of the correct order of initialisation (I think
this is quite unecessary
*/
instance = new Singleton( instanceCreated );
/*
Set the reference to 'instanceCreated' to true.
Note that since setInstanceCreated() actually uses the non-static
member variable: '_pInstanceCreated', I can't see the compiler taking the
liberty to call Singleton's ctor AFTER the following call. (I don't know
much about compiler optimisation, but I doubt that it will break up the ctor into
two functions and call one part of it before the following call and the other part after.
*/
instance->setInstanceCreated();
/*
The double check pattern should now work.
*/
}
getMutex()->unlock();
}
return instance;
}
No, this will not work. It is broken.
The problem has little/nothing to do with the compiler. It has to do with the order in which a second CPU will 'see' what the first CPU has done to memory. The memory (and caches) will be consistent, but the timing of WHEN each CPU decides to write or read each part of memory/cache is indeterminate.
So for CPU1:
instance = new Singleton( instanceCreated );
instance->setInstanceCreated();
Let's consider the compiler first. There is NO reason why the compiler doesn't reorder or otherwise alter these functions. Maybe like:
temp_register = new Singleton(instanceCreated);
temp_register->setInstanceCreated();
instance = temp_register;
or many other possibilities - like you said as long as single-threaded observed behaviour is consistent. This DOES include things like " break up the ctor into two functions and call one part of it before the following call and the other part after."
Now, it probably wouldn't break it up into 2 calls, but it would INLINE the ctor, particularly since it is so small. Then, once inlined, everything may be reordered, as if the ctor was broken in 2, for example.
In general, I would say not only is it possible that the compiler reordered things, it is probable - ie for the code you have, there is probably a reordering (once inlined, and inlining is likely) that is 'better' than the order given by the C++ code.
But let's leave that aside, and try to understand the real issues of double-checked locking.
So, let's just assume the compiler didn't reorder anything. What about the CPU? Or more importantly CPUs - plural.
The first CPU, 'CPU1' needs to follow the instructions given by the compiler, in particular, it needs to write to memory the things it has been told to write:
instance,
instanceCreated
other member variable of the Singleton (ie your Singleton does DO something, and has some state, doesn't it?)
Actually, that 'other member variable' stuff is really important. Important for your singleton - that's its real purpose right?, and important for our discussion. So let's give it a name: important_data. ie instance->important_data. And maybe instance->important_function(), which uses important_data. Etc.
As mentioned, let's assume the compiler has written the code such that these items are written in the order you are expecting, namely:
important_data - written inside the ctor, called from
instance = new Singleton(instanceCreated);
instance - assigned right after new/ctor returns
instanceCreated - inside setInstanceCreated()
Now, the CPU hands these writes off to the memory bus. Know what the memory bus does? IT REORDERS THEM. The CPU and architecture has the same constraints as the compiler - ie make sure this one CPU sees things consistently - ie single threaded consistent. So if, for example, instance and instanceCreated are on the same cache-line (highly likely, actually), they might be written together, and since they were just read, that cache-line is 'hot', so maybe they get written FIRST before important_data, so that that cache-line can be retired to make room for the cache-line where important_data lives.
Did you see that? instanceCreated and instance were just committed to memory BEFORE important_data. Note that CPU1 doesn't care, because it is living in a single-threaded world...
So now introduce CPU2:
CPU2 comes in, sees instanceCreated == true and instance != NULL and thus goes off and decides to call Singleton::Instance()->important_function(), which uses important_data, which is uninitialized. CRASH BANG BOOM.
By the way, it gets worse. So far, we've seen that the compiler could reorder, but we're pretending it didn't. Let's go one step further and pretend that CPU1 did NOT reorder any of the memory writing. Are we OK now?
No. Of course not.
Just as CPU1 decided to optimize/reorder its memory writes, CPU2 can REORDER ITS READS!
CPU2 comes in and sees
if (!instanceCreated) ...
so it needs to read instanceCreated. Ever heard of 'speculative execution'? (Great name for a FPS game, by the way). If the memory bus isn't busy doing anything, CPU2 might pre-read some other values 'hoping' that instanceCreated is true. ie it may pre-read important_data for example. Maybe important_data (or the uninitialized, possibly re-claimed-by-the-allocator memory that will become important_data) is already in CPU2's cache. Or maybe (more likely?) CPU2 just free'd that memory, and the allocator wrote NULL in its first 4 bytes (allocators often use that memory for their free-lists), so actually, the memory soon-to-become important_data may actually still be in the write queue of CPU2. In that case, why would CPU2 bother re-reading that memory, when it hasn't even finished writing it yet!? (it wouldn't - it would just get the values from its write-queue.)
Did that make sense? If not, imagine that the value of instance (which is a pointer) is 0x17e823d0. What was that memory doing before it became (becomes) the Singleton? Is that memory still in the write-queue of CPU2?...
Or basically, don't even think about why it might want to do so, but realize that CPU2 might read important_data first, then instanceCreated second. So even though CPU1 may have wrote them in order CPU2 sees 'crap' in important_data, then sees true in instanceCreated (and who knows what in instance!). Again, CRASH BANG BOOM. Or BOOM CRASH BANG, since by now you realize that the order isn't guaranteed...
It's usually better to have a non-lazy singleton which does nothing in its constructor, and then in GetInstance do a thread-safe call once to a function which allocates any expensive resources. You're already creating a Mutex non-lazily, so why not just put the mutex and some kind of Pimpl in your Singleton object?
By the way, this is easier on Posix:
struct Singleton {
static Singleton *GetInstance() {
pthread_once(&control, doInit);
return instance;
}
private:
static void doInit() {
// slight problem: we can't throw from here, or fail
try {
instance = new Singleton();
} catch (...) {
// we could stash an error indicator in a static member,
// and check it in GetInstance.
std::abort();
}
}
static pthread_once_t control;
static Singleton *instance;
};
pthread_once_t Singleton::control = PTHREAD_ONCE_INIT;
Singleton *Singleton::instance = 0;
There do exist pthread_once implementations for Windows and other platforms.
If you wish to see an in-depth discussion of Singletons, the various policies about their lifetime and the thread safety issues, I can only recommend a good read: "Modern C++ Design" by Alexandrescu.
The implementation is presented on the web in Loki, find it here!
And yes, it does hold in a single header file. So I would really encourage you to at least grab the file and read it, and better yet read the book to have the full-blown reflection.
At global scope in your code:
/************************************************************************************
Keep track of the singleton object for possible deletion.
*/
Singleton* Singleton::_pInstance = Singleton::Instance();
This makes your implementation not lazy. Presumably you want to set _pInstance to NULL at global scope, and assign to it after you construct the singleton inside Instance() before you unlock the mutex.
More food for thought from Meyers & Alexandrescu, with Singleton being the specific target: C++ and the Perils of Double-Checked Locking. It's a bit of a prickly problem.

How can I schedule some code to run after all '_atexit()' functions are completed

I'm writing a memory tracking system and the only problem I've actually run into is that when the application exits, any static/global classes that didn't allocate in their constructor, but are deallocating in their deconstructor are deallocating after my memory tracking stuff has reported the allocated data as a leak.
As far as I can tell, the only way for me to properly solve this would be to either force the placement of the memory tracker's _atexit callback at the head of the stack (so that it is called last) or have it execute after the entire _atexit stack has been unwound. Is it actually possible to implement either of these solutions, or is there another solution that I have overlooked.
Edit:
I'm working on/developing for Windows XP and compiling with VS2005.
I've finally figured out how to do this under Windows/Visual Studio. Looking through the crt startup function again (specifically where it calls the initializers for globals), I noticed that it was simply running "function pointers" that were contained between certain segments. So with just a little bit of knowledge on how the linker works, I came up with this:
#include <iostream>
using std::cout;
using std::endl;
// Typedef for the function pointer
typedef void (*_PVFV)(void);
// Our various functions/classes that are going to log the application startup/exit
struct TestClass
{
int m_instanceID;
TestClass(int instanceID) : m_instanceID(instanceID) { cout << " Creating TestClass: " << m_instanceID << endl; }
~TestClass() {cout << " Destroying TestClass: " << m_instanceID << endl; }
};
static int InitInt(const char *ptr) { cout << " Initializing Variable: " << ptr << endl; return 42; }
static void LastOnExitFunc() { puts("Called " __FUNCTION__ "();"); }
static void CInit() { puts("Called " __FUNCTION__ "();"); atexit(&LastOnExitFunc); }
static void CppInit() { puts("Called " __FUNCTION__ "();"); }
// our variables to be intialized
extern "C" { static int testCVar1 = InitInt("testCVar1"); }
static TestClass testClassInstance1(1);
static int testCppVar1 = InitInt("testCppVar1");
// Define where our segment names
#define SEGMENT_C_INIT ".CRT$XIM"
#define SEGMENT_CPP_INIT ".CRT$XCM"
// Build our various function tables and insert them into the correct segments.
#pragma data_seg(SEGMENT_C_INIT)
#pragma data_seg(SEGMENT_CPP_INIT)
#pragma data_seg() // Switch back to the default segment
// Call create our call function pointer arrays and place them in the segments created above
#define SEG_ALLOCATE(SEGMENT) __declspec(allocate(SEGMENT))
SEG_ALLOCATE(SEGMENT_C_INIT) _PVFV c_init_funcs[] = { &CInit };
SEG_ALLOCATE(SEGMENT_CPP_INIT) _PVFV cpp_init_funcs[] = { &CppInit };
// Some more variables just to show that declaration order isn't affecting anything
extern "C" { static int testCVar2 = InitInt("testCVar2"); }
static TestClass testClassInstance2(2);
static int testCppVar2 = InitInt("testCppVar2");
// Main function which prints itself just so we can see where the app actually enters
void main()
{
cout << " Entered Main()!" << endl;
}
which outputs:
Called CInit();
Called CppInit();
Initializing Variable: testCVar1
Creating TestClass: 1
Initializing Variable: testCppVar1
Initializing Variable: testCVar2
Creating TestClass: 2
Initializing Variable: testCppVar2
Entered Main()!
Destroying TestClass: 2
Destroying TestClass: 1
Called LastOnExitFunc();
This works due to the way MS have written their runtime library. Basically, they've setup the following variables in the data segments:
(although this info is copyright I believe this is fair use as it doesn't devalue the original and IS only here for reference)
extern _CRTALLOC(".CRT$XIA") _PIFV __xi_a[];
extern _CRTALLOC(".CRT$XIZ") _PIFV __xi_z[]; /* C initializers */
extern _CRTALLOC(".CRT$XCA") _PVFV __xc_a[];
extern _CRTALLOC(".CRT$XCZ") _PVFV __xc_z[]; /* C++ initializers */
extern _CRTALLOC(".CRT$XPA") _PVFV __xp_a[];
extern _CRTALLOC(".CRT$XPZ") _PVFV __xp_z[]; /* C pre-terminators */
extern _CRTALLOC(".CRT$XTA") _PVFV __xt_a[];
extern _CRTALLOC(".CRT$XTZ") _PVFV __xt_z[]; /* C terminators */
On initialization, the program simply iterates from '__xN_a' to '__xN_z' (where N is {i,c,p,t}) and calls any non null pointers it finds. If we just insert our own segment in between the segments '.CRT$XnA' and '.CRT$XnZ' (where, once again n is {I,C,P,T}), it will be called along with everything else that normally gets called.
The linker simply joins up the segments in alphabetical order. This makes it extremely simple to select when our functions should be called. If you have a look in defsects.inc (found under $(VS_DIR)\VC\crt\src\) you can see that MS have placed all the "user" initialization functions (that is, the ones that initialize globals in your code) in segments ending with 'U'. This means that we just need to place our initializers in a segment earlier than 'U' and they will be called before any other initializers.
You must be really careful not to use any functionality that isn't initialized until after your selected placement of the function pointers (frankly, I'd recommend you just use .CRT$XCT that way its only your code that hasn't been initialized. I'm not sure what will happen if you've linked with standard 'C' code, you may have to place it in the .CRT$XIT block in that case).
One thing I did discover was that the "pre-terminators" and "terminators" aren't actually stored in the executable if you link against the DLL versions of the runtime library. Due to this, you can't really use them as a general solution. Instead, the way I made it run my specific function as the last "user" function was to simply call atexit() within the 'C initializers', this way, no other function could have been added to the stack (which will be called in the reverse order to which functions are added and is how global/static deconstructors are all called).
Just one final (obvious) note, this is written with Microsoft's runtime library in mind. It may work similar on other platforms/compilers (hopefully you'll be able to get away with just changing the segment names to whatever they use, IF they use the same scheme) but don't count on it.
atexit is processed by the C/C++ runtime (CRT). It runs after main() has already returned. Probably the best way to do this is to replace the standard CRT with your own.
On Windows tlibc is probably a great place to start: http://www.codeproject.com/KB/library/tlibc.aspx
Look at the code sample for mainCRTStartup and just run your code after the call to _doexit();
but before ExitProcess.
Alternatively, you could just get notified when ExitProcess gets called. When ExitProcess gets called the following occurs (according to http://msdn.microsoft.com/en-us/library/ms682658%28VS.85%29.aspx):
All of the threads in the process, except the calling thread, terminate their execution without receiving a DLL_THREAD_DETACH notification.
The states of all of the threads terminated in step 1 become signaled.
The entry-point functions of all loaded dynamic-link libraries (DLLs) are called with DLL_PROCESS_DETACH.
After all attached DLLs have executed any process termination code, the ExitProcess function terminates the current process, including the calling thread.
The state of the calling thread becomes signaled.
All of the object handles opened by the process are closed.
The termination status of the process changes from STILL_ACTIVE to the exit value of the process.
The state of the process object becomes signaled, satisfying any threads that had been waiting for the process to terminate.
So, one method would be to create a DLL and have that DLL attach to the process. It will get notified when the process exits, which should be after atexit has been processed.
Obviously, this is all rather hackish, proceed carefully.
This is dependent on the development platform. For example, Borland C++ has a #pragma which could be used for exactly this. (From Borland C++ 5.0, c. 1995)
#pragma startup function-name [priority]
#pragma exit function-name [priority]
These two pragmas allow the program to specify function(s) that should be called either upon program startup (before the main function is called), or program exit (just before the program terminates through _exit).
The specified function-name must be a previously declared function as:
void function-name(void);
The optional priority should be in the range 64 to 255, with highest priority at 0; default is 100. Functions with higher priorities are called first at startup and last at exit. Priorities from 0 to 63 are used by the C libraries, and should not be used by the user.
Perhaps your C compiler has a similar facility?
I've read multiple times you can't guarantee the construction order of global variables (cite). I'd think it is pretty safe to infer from this that destructor execution order is also not guaranteed.
Therefore if your memory tracking object is global, you will almost certainly be unable any guarantees that your memory tracker object will get destructed last (or constructed first). If it's not destructed last, and other allocations are outstanding, then yes it will notice the leaks you mention.
Also, what platform is this _atexit function defined for?
Having the memory tracker's cleanup executed last is the best solution. The easiest way I've found to do that is to explicitly control all the relevant global variables' initialization order. (Some libraries hide their global state in fancy classes or otherwise, thinking they're following a pattern, but all they do is prevent this kind of flexibility.)
Example main.cpp:
#include "global_init.inc"
int main() {
// do very little work; all initialization, main-specific stuff
// then call your application's mainloop
}
Where the global-initialization file includes object definitions and #includes similar non-header files. Order the objects in this file in the order you want them constructed, and they'll be destructed in the reverse order. 18.3/8 in C++03 guarantees that destruction order mirrors construction: "Non-local objects with static storage duration are destroyed in the reverse order of the completion of their constructor." (That section is talking about exit(), but a return from main is the same, see 3.6.1/5.)
As a bonus, you're guaranteed that all globals (in that file) are initialized before entering main. (Something not guaranteed in the standard, but allowed if implementations choose.)
I've had this exact problem, also writing a memory tracker.
A few things:
Along with destruction, you also need to handle construction. Be prepared for malloc/new to be called BEFORE your memory tracker is constructed (assuming it is written as a class). So you need your class to know whether it has been constructed or destructed yet!
class MemTracker
{
enum State
{
unconstructed = 0, // must be 0 !!!
constructed,
destructed
};
State state;
MemTracker()
{
if (state == unconstructed)
{
// construct...
state = constructed;
}
}
};
static MemTracker memTracker; // all statics are zero-initted by linker
On every allocation that calls into your tracker, construct it!
MemTracker::malloc(...)
{
// force call to constructor, which does nothing after first time
new (this) MemTracker();
...
}
Strange, but true. Anyhow, onto destruction:
~MemTracker()
{
OutputLeaks(file);
state = destructed;
}
So, on destruction, output your results. Yet we know that there will be more calls. What to do? Well,...
MemTracker::free(void * ptr)
{
do_tracking(ptr);
if (state == destructed)
{
// we must getting called late
// so re-output
// Note that this might happen a lot...
OutputLeaks(file); // again!
}
}
And lastly:
be careful with threading
be careful not to call malloc/free/new/delete inside your tracker, or be able to detect the recursion, etc :-)
EDIT:
and I forgot, if you put your tracker in a DLL, you will probably need to LoadLibrary() (or dlopen, etc) yourself to up your reference count, so that you don't get removed from memory prematurely. Because although your class can still be called after destruction, it can't if the code has been unloaded.

Destruction of singleton in DLL

I’m trying to create a simple Win32 DLL. As interface between DLL and EXE I use C functions, but inside of DLL i use C++ singleton object. Following is an example of my DLL implementation:
// MyDLLInterface.cpp file --------------------
#include "stdafx.h"
#include <memory>
#include "MyDLLInterface.h"
class MySingleton
{
friend class std::auto_ptr< MySingleton >;
static std::auto_ptr< MySingleton > m_pInstance;
MySingleton()
{
m_pName = new char[32];
strcpy(m_pName, “MySingleton”);
}
virtual ~ MySingleton()
{
delete [] m_pName;
}
MySingleton(const MySingleton&);
MySingleton& operator=(const MySingleton&);
public:
static MySingleton* Instance()
{
if (!m_pInstance.get())
m_pInstance.reset(new MySingleton);
return m_pInstance.get();
}
static void Delete()
{
m_pInstance.reset(0);
}
void Function() {}
private:
char* m_pName;
};
std::auto_ptr<MySingleton> MySingleton::m_pInstance(0);
void MyInterfaceFunction()
{
MySingleton::Instance()->Function();
}
void MyInterfaceUninitialize()
{
MySingleton::Delete();
}
// MyDLLInterface.h file --------------------
#if defined(MY_DLL)
#define MY_DLL_EXPORT __declspec(dllexport)
#else
#define MY_DLL_EXPORT __declspec(dllimport)
#endif
MY_DLL_EXPORT void MyInterfaceFunction();
MY_DLL_EXPORT void MyInterfaceUninitialize();
The problem or the question that i have is following: If i don't call MyInterfaceUninitialize() from my EXEs ExitInstance(), i have a memory leak (m_pName pointer). Why does it happening? It looks like the destruction off MySingleton happens after EXEs exit. Is it possible to force the DLL or EXE to destroy the MySingleton a little bit earlier, so I don't need to call MyInterfaceUninitialize() function?
EDIT:
Thanks for all your help and explanation. Now i understand that this is a design issue. If i want to stay with my current solution, i need to call MyInterfaceUninitialize() function in my EXE. If i don't do it, it also OK, because the singleton destroys itself, when it leaves the EXE scope (but i need to live with disturbing debugger messages). The only way to avoid this behavior, is to rethink the whole implementation.
I can also set my DLL as "Delay Loaded DLLs" under Linker->Input in Visual Studio, to get rid of disturbing debugger messages.
If i don't call MyInterfaceUninitialize() from my EXEs ExitInstance(), i have a memory leak (m_pName pointer). Why does it happening?
This is not a leak, this is the way auto_ptrs are supposed to work. They release the instance when they go out of scope (which in your case is when the dll is unloaded).
It looks like the destruction off MySingleton happens after EXEs exit.
Yes.
Is it possible to force the DLL or EXE to destroy the MySingleton a little bit earlier, so I don't need to call MyInterfaceUninitialize() function?
Not without calling this function.
You can take advantage of the DllMain callback function to take appropriate action when the DLL is loaded/unloaded or a process/thread attaches/detaches. You could then allocate objects per attached process/thread instead of using a singleton since this callback function is executed in the context of the attached thread. With that in mind, also take a look at Thread Local Storage (TLS).
Honestly, for the example you gave, it doesn't matter if you call the Uninitialize method from your ExitInstance. Yes, the debugger will complain about the unreleased memory, but then again, it's a singleton, it's intended to live for an extended duration.
Only if you have some state information in the DLL that needs to be persisted at exit, or if you are dynamically loading/unloading DLLs multiple times, do you need to be diligent about cleaning up. Otherwise, just letting the OS tear down the process at exit is just fine, the reported memory leak is inconsequential at that point.