Warm up some variables in C++

I have a C++ library that works on some numeric values. These values are not available at compile time, but they are immediately available at runtime and depend on machine-related details; in short, I need values like the display resolution, the number of CPU cores and so on.
The key points of my question are:
I can't ask the user to input these values (neither the coders who use my lib nor the final user)
I need to do this warm-up only once, when the application starts; it's a one-time thing
these values are later used by methods and classes
the possible solutions are:
build a data structure Data and declare some Data dummy, where dummy is the name of the variable used to store everything; the constructor(s) will handle the one-time initialization of the related values
wrap something like the first solution in a method like WarmUp() and put a call to this method right at the start of main() (it's also a simple thing to remember and use)
the big problems that are still unsolved are:
the user can declare more than one such data structure, since Data is a type and C++ places no restriction on declaring 2, 4, 5 or 17 variables of the same type
the WarmUp() method can be a little intrusive in the design of the other classes, it can also happen that the WarmUp() method is used in local methods and not in the main().
I basically need to force the creation of 1 single instance of 1 specific type at runtime when I have no power over the actual use of my library, or at least I need to design this in a way that the user will immediately understand what kind of error is going on keeping the use of the library intuitive and simple as much as possible.
Can you see a solution to this ?
EDIT:
my problems are also more difficult due to the fact that I'm also trying to get a multi-threading compatible data structure.

What about using a lazily created singleton? E.g.:
struct Data
{
    static Data & instance()
    {
        static Data data_;
        return data_;
    }
private:
    Data()
    {
        //init
    }
};
Data would be initialized on first use, when you call Data::instance()
Edit: as for multithreading, read efficient thread-safe singleton in C++
Edit2 realisation using boost::call_once
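For readers on C++11 or later, the lazy singleton can be sketched without any extra locking, because the standard guarantees that a function-local static is initialized exactly once even when several threads reach it at the same time. The class name MachineInfo and its single field are hypothetical, chosen only to match the question's "number of CPU cores" example; this is a minimal sketch, not the answer's original code.

```cpp
#include <cassert>
#include <thread>

// Hypothetical holder for machine-dependent values (illustrative only).
class MachineInfo {
public:
    // Since C++11 the initialization of a function-local static runs exactly
    // once, even if several threads call instance() concurrently.
    static const MachineInfo& instance() {
        static const MachineInfo info;
        return info;
    }

    unsigned cpuCores() const { return cpuCores_; }

    // Users of the library cannot create or copy further instances.
    MachineInfo(const MachineInfo&) = delete;
    MachineInfo& operator=(const MachineInfo&) = delete;

private:
    MachineInfo() : cpuCores_(std::thread::hardware_concurrency()) {
        // hardware_concurrency() may return 0 when the value is unknown.
        if (cpuCores_ == 0) cpuCores_ = 1;
    }

    unsigned cpuCores_;
};
```

Because the constructor is private and copying is deleted, user code can only reach the values through MachineInfo::instance(), which addresses the "more than one Data variable" concern from the question.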

First, as others have said, the obvious (and best) answer to
this is a singleton. Since you've added the multithreading
requirement, however: there are two solutions, depending on
whether the object will be modified by code using the singleton.
(From your description, I gather not.) If not, then it is
sufficient to use the "naïve" implementation of a singleton, and
ensure that the singleton is initialized before threads are
started. If no thread is started before you enter main (and
I would consider it bad practice otherwise), then something like
the following is largely sufficient:
class Singleton
{
static Singleton const* ourInstance;
Singleton();
Singleton( Singleton const& );
Singleton& operator=( Singleton const& );
public:
static Singleton const& instance();
};
and in the implementation:
Singleton const* Singleton::ourInstance = &Singleton::instance();
Singleton const&
Singleton::instance()
{
if ( ourInstance == NULL ) {
ourInstance = new Singleton;
}
return *ourInstance;
}
No locking is necessary, since no thread will be modifying
anything once threading starts.
If the singleton is mutable, then you have to protect all access
to it. You could do something like the above (without the
const, obviously), and leave the locking to the client, but in
such cases, I'd prefer locking in the instance function, and
returning an std::shared_ptr with a deleter which frees the
lock, which was acquired in the instance function. I think
something like the following could work (but I've never actually
needed it, and so haven't tried it):
class Singleton
{
static Singleton* ourInstance;
static std::mutex ourMutex;
class LockForPointer
{
public:
void operator()( Singleton* )
{
Singleton::ourMutex.unlock();
}
};
class LockForInstance
{
bool myOwnershipIsTransfered;
public:
LockForInstance()
: myOwnershipIsTransfered( false )
{
Singleton::ourMutex.lock();
}
~LockForInstance()
{
if ( !myOwnershipIsTransfered ) {
Singleton::ourMutex.unlock();
}
}
LockForPointer transferOwnership()
{
myOwnershipIsTransfered = true;
return LockForPointer();
}
};
public:
static std::shared_ptr<Singleton> instance();
};
and the implementation:
Singleton* Singleton::ourInstance = NULL;
std::mutex Singleton::ourMutex;
std::shared_ptr<Singleton>
Singleton::instance()
{
LockForInstance lock;
if ( ourInstance == NULL ) {
ourInstance = new Singleton;
}
return std::shared_ptr<Singleton>( ourInstance, lock.transferOwnership() );
}
This way, the same lock is used for the check for null and for
accessing the data.
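A related way to tie the lock to the access, sketched here under the assumption of a C++11 compiler, is to return a small handle object that owns a std::unique_lock for as long as it lives. This swaps the shared_ptr-with-deleter technique for a plain RAII handle; the names Settings, Access and bumpValue are hypothetical.

```cpp
#include <cassert>
#include <mutex>

class Settings {   // hypothetical mutable singleton
public:
    // Handle that keeps the singleton's mutex locked while it is alive.
    class Access {
    public:
        Settings* operator->() { return target_; }

    private:
        friend class Settings;
        Access(Settings* target, std::mutex& m) : lock_(m), target_(target) {}

        std::unique_lock<std::mutex> lock_;  // released in the destructor
        Settings* target_;
    };

    static Access instance() {
        static Settings settings;  // created on first use
        static std::mutex mutex;   // guards every access to 'settings'
        return Access(&settings, mutex);
    }

    int value = 0;

private:
    Settings() = default;
};

// Example of a multi-step update performed under one lock.
int bumpValue() {
    Settings::Access s = Settings::instance();  // lock held until s is destroyed
    s->value += 1;
    return s->value;
}
```

One caveat of this sketch: two handles alive in the same thread at the same time (for example two calls to instance() in a single expression) would try to lock the non-recursive mutex twice, so each thread should hold at most one handle at a time.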

Warmup is usually related to performance issues (and makes me think of the processor cache, see the __builtin_prefetch of GCC).
Maybe you want to make a singleton class, there are many solutions to this (e.g. this C++ singleton tutorial).
Also, if performance is the primary concern, you could consider that the configured parameters are given at initialization (or even at installation) time. Then you could specialize your code, perhaps as simply as having templates and instantiating them by first emitting (at "runtime", i.e. at initialization time) the appropriate C++ stub source code, compiling it, and dynamically loading it (using plugins & dlopen ...). See also this answer.

Related

What are some C++ alternatives to static objects that could make destruction safer (or more deterministic)?

I'm working on a large code base that, for performance reasons, limits access to one or more resources. A thread pool is a good analogy to my problem - we don't want everyone in the process spinning up their own threads, so a common pool with a producer/consumer job queue exists in an attempt to limit the number of threads running at any given time.
There isn't an elegant way to make ownership of the thread pool clear so, for all intents and purposes, it is a singleton. I speak better in code than in English, so here is an example:
class ThreadPool {
public:
static void SubmitTask(Task&& t) { instance_.SubmitTask(std::move(t)); }
private:
~ThreadPool() {
std::for_each(pool_.begin(), pool_.end(), [](auto &t) {
if (t.joinable()) t.join();
});
}
private:
std::array<std::thread, 5> pool_;
static ThreadPool instance_; // here or anonymous namespace
};
The issue with this pattern is that instance_ doesn't go out of scope until after main has returned, which typically results in races or crashes. Also, keep in mind this is only an analogy for my problem, so better ways to do something asynchronously aren't really what I'm after; just better ways to manage the lifecycle of static objects.
Alternatives I've thought of:
Provide an explicit Terminate function that must be called manually before leaving main.
Not using statics at all and leaving it up to the app to ensure only a single instance exists.
Not using statics at all and crashing the app if more than 1 instance is instantiated.
I also realize that a small, sharp, team could probably make the above code work just fine. However, this code lives within a large organization that has many developers of various skill levels contributing to it.
You could explicitly bind the lifetime to your main function. Either add a static shutdown() method to your ThreadPool that does any cleanup you need, and call it at the end of main(),
or fully bind the lifetime via RAII:
class ThreadPool {
public:
static ThreadPool* get() { return instance_.get(); }
void SubmitTask(Task&& t) { ... }
~ThreadPool() { ... }
private:
ThreadPool() {}
static inline std::unique_ptr<ThreadPool> instance_;
friend class ThreadPoolScope;
};
class ThreadPoolScope {
public:
ThreadPoolScope(){
assert(!ThreadPool::instance_);
ThreadPool::instance_.reset(new ThreadPool());
}
~ThreadPoolScope(){
ThreadPool::instance_.reset();
}
};
int main() {
ThreadPoolScope thread_pool_scope{};
...
}
void some_func() {
ThreadPool::get()->SubmitTask(...);
}
This makes destruction completely deterministic and if you do this with multiple objects, they are automatically destroyed in the correct order.
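The scope-guard idea can be tried out in isolation with a small stand-in class. The sketch below is a C++11 variation (a function-local static unique_ptr instead of a static inline member), and the names Service, ServiceScope and serviceAliveInsideScope are hypothetical placeholders for the thread pool.

```cpp
#include <cassert>
#include <memory>

class Service {   // hypothetical stand-in for the ThreadPool
public:
    // Returns nullptr when no ServiceScope is currently alive.
    static Service* get() { return instance().get(); }

private:
    Service() = default;

    // The slot itself is a static, but it is empty outside of a scope, so
    // nothing interesting is left to destroy after main() returns.
    static std::unique_ptr<Service>& instance() {
        static std::unique_ptr<Service> ptr;
        return ptr;
    }

    friend class ServiceScope;
};

class ServiceScope {
public:
    ServiceScope() {
        assert(!Service::instance() && "only one ServiceScope may exist");
        Service::instance().reset(new Service());
    }
    ~ServiceScope() { Service::instance().reset(); }  // deterministic teardown

    ServiceScope(const ServiceScope&) = delete;
    ServiceScope& operator=(const ServiceScope&) = delete;
};

// The service exists only between construction and destruction of the scope.
bool serviceAliveInsideScope() {
    ServiceScope scope;
    return Service::get() != nullptr;
}
```

The assert in the constructor also covers the "fail loudly if more than one instance is created" alternative from the question, at least in debug builds.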

Synchronizing method calls on shared object from multiple threads

I am thinking about how to implement a class that will contain private data that will eventually be modified by multiple threads through method calls. For synchronization (using the Windows API), I am planning on using a CRITICAL_SECTION object since all the threads will spawn from the same process.
Given the following design, I have a few questions.
template <typename T> class Shareable
{
private:
const LPCRITICAL_SECTION sync; //Can be read and used by multiple threads
T *data;
public:
Shareable(LPCRITICAL_SECTION cs, unsigned elems) : sync{cs}, data{new T[elems]} { }
~Shareable() { delete[] data; }
void sharedModify(unsigned index, T &datum) //<-- Can this be validly called
//by multiple threads with synchronization being implicit?
{
EnterCriticalSection(sync);
/*
The critical section of code involving reads & writes to 'data'
*/
LeaveCriticalSection(sync);
}
};
// Somewhere else ...
DWORD WINAPI ThreadProc(LPVOID lpParameter)
{
Shareable<ActualType> *ptr = static_cast<Shareable<ActualType>*>(lpParameter);
ActualType copyable = /* initialization */;
ptr->sharedModify(validIndex, copyable); //<-- OK, synchronized?
return 0;
}
The way I see it, the API calls will be conducted in the context of the current thread. That is, I assume this is the same as if I had acquired the critical section object from the pointer and called the API from within ThreadProc(). However, I am worried that if the object is created and placed in the main/initial thread, there will be something funky about the API calls.
When sharedModify() is called on the same object concurrently, from multiple threads, will the synchronization be implicit, in the way I described it above?
Should I instead get a pointer to the critical section object and use that instead?
Is there some other synchronization mechanism that is better suited to this scenario?
When sharedModify() is called on the same object concurrently, from multiple threads, will the synchronization be implicit, in the way I described it above?
It's not implicit, it's explicit. There's only one CRITICAL_SECTION and only one thread can hold it at a time.
Should I instead get a pointer to the critical section object and use that instead?
No. There's no reason to use a pointer here.
Is there some other synchronization mechanism that is better suited to this scenario?
It's hard to say without seeing more code, but this is definitely the "default" solution. It's like a singly-linked list -- you learn it first, it always works, but it's not always the best choice.
When sharedModify() is called on the same object concurrently, from multiple threads, will the synchronization be implicit, in the way I described it above?
Implicit from the caller's perspective, yes.
Should I instead get a pointer to the critical section object and use that instead?
No. In fact, I would suggest giving the Shareable object ownership of its own critical section instead of accepting one from the outside (and embracing RAII concepts to write safer code), e.g.:
template <typename T>
class Shareable
{
private:
CRITICAL_SECTION sync;
std::vector<T> data;
struct SyncLocker
{
CRITICAL_SECTION &sync;
SyncLocker(CRITICAL_SECTION &cs) : sync(cs) { EnterCriticalSection(&sync); }
~SyncLocker() { LeaveCriticalSection(&sync); }
};
public:
Shareable(unsigned elems) : data(elems)
{
InitializeCriticalSection(&sync);
}
Shareable(const Shareable&) = delete;
Shareable(Shareable&&) = delete;
~Shareable()
{
{
SyncLocker lock(sync);
data.clear();
}
DeleteCriticalSection(&sync);
}
void sharedModify(unsigned index, const T &datum)
{
SyncLocker lock(sync);
data[index] = datum;
}
Shareable& operator=(const Shareable&) = delete;
Shareable& operator=(Shareable&&) = delete;
};
Is there some other synchronization mechanism that is better suited to this scenario?
That depends. Will multiple threads be accessing the same index at the same time? If not, then there is not really a need for the critical section at all. One thread can safely access one index while another thread accesses a different index.
If multiple threads need to access the same index at the same time, a critical section might still not be the best choice. Locking the entire array might be a big bottleneck if you only need to lock portions of the array at a time. Things like the Interlocked API, or Slim Read/Write locks, might make more sense. It really depends on your thread designs and what you are actually trying to protect.
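To illustrate the per-element alternative without a lock around the whole array, here is a minimal sketch using std::atomic, which plays the same role in portable C++ that the Interlocked functions play in the Windows API. It is not Windows-specific code, and the names AtomicCounters and hammerOneSlot are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Each slot is updated atomically, so simple read-modify-write operations
// such as incrementing a counter need no lock around the whole array.
class AtomicCounters {
public:
    explicit AtomicCounters(std::size_t n) : slots_(n) {
        for (auto& s : slots_) s.store(0);  // explicit zero for pre-C++20 atomics
    }

    void increment(std::size_t i) {
        slots_[i].fetch_add(1, std::memory_order_relaxed);
    }

    long get(std::size_t i) const { return slots_[i].load(); }

private:
    std::vector<std::atomic<long>> slots_;
};

// Several threads increment the same slot; no increment is lost.
long hammerOneSlot(int threads, int perThread) {
    AtomicCounters counters(4);
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counters, perThread] {
            for (int i = 0; i < perThread; ++i) counters.increment(2);
        });
    }
    for (auto& w : workers) w.join();
    return counters.get(2);
}
```

This only helps when each operation touches a single element; anything that must update several elements consistently still needs a lock of some kind.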

Thread safe container

Here is an example container class in pseudocode:
class Container
{
public:
Container(){}
~Container(){}
void add(data item) // 'new' is a reserved word in C++, so the parameter needs another name
{
// addition of data
}
data get(size_t which)
{
// returning some data
}
void remove(size_t which)
{
// delete specified object
}
private:
data d;
};
How can this container be made thread safe? I heard about mutexes: where should these mutexes be placed? Should the mutex be static for the class, or maybe in global scope? What is a good library for this task in C++?
First of all, the mutex should not be a static member of the class if you are going to use more than one instance. There are many cases where you should or shouldn't use mutexes, so without seeing your code it's hard to say. Just remember that they are used to synchronise access to shared data, so it's wise to place them inside methods that modify or rely on the object's state. In your case I would use one mutex to protect the whole object and lock it in all three methods. Like:
class Container
{
public:
Container(){}
~Container(){}
void add(data item)
{
lock_guard<Mutex> lock(mutex);
// addition of data
}
data get(size_t which)
{
lock_guard<Mutex> lock(mutex);
// getting copy of value
// return that value
}
void remove(size_t which)
{
lock_guard<Mutex> lock(mutex);
// delete specified object
}
private:
data d;
Mutex mutex;
};
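A concrete version of the one-mutex-per-object approach could look like the sketch below, which assumes C++11 and stores plain ints in a std::vector. The class name IntContainer and the extra size() method are hypothetical additions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <stdexcept>
#include <vector>

class IntContainer {   // hypothetical concrete container holding ints
public:
    void add(int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push_back(value);
    }

    // Returns a copy, so the caller never holds a reference into the
    // protected storage. Throws std::out_of_range on a bad index.
    int get(std::size_t which) const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.at(which);
    }

    void remove(std::size_t which) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (which >= items_.size()) throw std::out_of_range("IntContainer::remove");
        items_.erase(items_.begin() + static_cast<std::ptrdiff_t>(which));
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    mutable std::mutex mutex_;   // per-instance, not static; mutable for const methods
    std::vector<int> items_;
};
```

Each call is safe on its own, but a sequence such as "check size(), then get()" is not atomic as a whole, since another thread may remove an element in between.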
Intel Thread Building Blocks (TBB) provides a bunch of thread-safe container implementations for C++. It has been open sourced, you can download it from: http://threadingbuildingblocks.org/ver.php?fid=174 .
First: sharing mutable state between threads is hard. You should be using a library that has been audited and debugged.
With that said, there are two distinct functional issues:
you want a container to provide safe atomic operations
you want a container to provide safe multiple operations
The idea of multiple operations is that multiple accesses to the same container must be executed successively, under the control of a single entity. They require the caller to "hold" the mutex for the duration of the transaction so that only it changes the state.
1. Atomic operations
This one appears simple:
add a mutex to the object
at the start of each method grab a mutex with a RAII lock
Unfortunately it's also plain wrong.
The issue is re-entrancy. It is likely that some methods will call other methods on the same object. If those once again attempt to grab the mutex, you get a deadlock.
It is possible to use re-entrant mutexes. They are a bit slower, but allow the same thread to lock a given mutex as much as it wants. The number of unlocks should match the number of locks, so once again, RAII.
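With the C++11 standard library, a re-entrant mutex is available as std::recursive_mutex. The sketch below shows one locking method calling another on the same object; the class name Account and its methods are hypothetical.

```cpp
#include <cassert>
#include <mutex>

class Account {   // hypothetical example class
public:
    void deposit(int amount) {
        std::lock_guard<std::recursive_mutex> lock(mutex_);
        balance_ += amount;
    }

    // Calls deposit() while already holding the mutex. With a plain
    // std::mutex this would be undefined behaviour (typically a deadlock);
    // a recursive mutex lets the same thread lock again.
    void depositTwice(int amount) {
        std::lock_guard<std::recursive_mutex> lock(mutex_);
        deposit(amount);
        deposit(amount);
    }

    int balance() const {
        std::lock_guard<std::recursive_mutex> lock(mutex_);
        return balance_;
    }

private:
    mutable std::recursive_mutex mutex_;
    int balance_ = 0;
};
```

The lock_guard objects take care of matching every lock with an unlock, which is the RAII point made above.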
Another approach is to use dispatching methods:
class C {
public:
void foo() { Lock lock(_mutex); foo_impl(); }
private:
void foo_impl() { /* do something */; foo_impl(); }
};
The public methods are simple forwarders to private work-methods and simply lock. Then one just has to ensure that private methods never take the mutex...
Of course there are risks of accidentally calling a locking method from a work-method, in which case you deadlock. Read on to avoid this ;)
2. Multiple operations
The only way to achieve this is to have the caller hold the mutex.
The general method is simple:
add a mutex to the container
provide a handle on this mutex
cross your fingers that the caller will never forget to hold the mutex while accessing the class
I personally prefer a much saner approach.
First, I create a "bundle of data", which simply represents the class data (+ a mutex), and then I provide a Proxy, in charge of grabbing the mutex. The data is locked so that the proxy only may access the state.
class ContainerData {
protected:
friend class ContainerProxy;
Mutex _mutex;
void foo();
void bar();
private:
// some data
};
class ContainerProxy {
public:
ContainerProxy(ContainerData& data): _data(data), _lock(data._mutex) {}
void foo() { _data.foo(); }
void bar() { foo(); _data.bar(); }
private:
ContainerData& _data;
Lock _lock; // RAII lock, released by the destructor
};
Note that it is perfectly safe for the Proxy to call its own methods. The mutex will be released automatically by the destructor.
The mutex can still be reentrant if multiple Proxies are desired. But really, when multiple proxies are involved, it generally turns into a mess. In debug mode, it's also possible to add a "check" that the mutex is not already held by this thread (and assert if it is).
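The data-plus-proxy arrangement can be made runnable with std::mutex and std::lock_guard. This is a minimal sketch assuming C++11; the names CounterData and CounterProxy and their methods are hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Bundle of state plus its mutex. Everything is private, so only the
// proxy (a friend) can touch it.
class CounterData {
    friend class CounterProxy;
    std::mutex mutex_;
    std::vector<int> values_;
};

// Holding a proxy means holding the mutex; all access goes through it.
class CounterProxy {
public:
    explicit CounterProxy(CounterData& data) : data_(data), lock_(data.mutex_) {}

    void push(int v) { data_.values_.push_back(v); }

    int sum() const {
        int total = 0;
        for (int v : data_.values_) total += v;
        return total;
    }

    // A multi-step operation: safe because the lock is already held for
    // the whole lifetime of the proxy, so no re-locking happens here.
    int pushAndSum(int v) {
        push(v);
        return sum();
    }

private:
    CounterData& data_;
    std::lock_guard<std::mutex> lock_;  // released when the proxy is destroyed
};
```

Since std::lock_guard cannot be copied or moved, the proxy is automatically non-copyable, which keeps the lock tied to a single scope.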
3. Reminder
Using locks is error-prone. Deadlocks are a common cause of error and occur as soon as you have two mutexes (or one and re-entrancy). When possible, prefer using higher level alternatives.
Add a mutex as an instance variable of the class. Initialize it in the constructor, lock it at the very beginning of every method, including the destructor, and unlock it at the end of the method. Adding a global mutex for all instances of the class (a static member or just in global scope) may incur a performance penalty.
There is also a very nice collection of lock-free containers (including maps) by Max Khiszinsky:
LibCDS [1] Concurrent Data Structures
Here is the documentation page:
http://libcds.sourceforge.net/doc/index.html
It can be kind of intimidating to get started, because it is fully generic and requires you register a chosen garbage collection strategy and initialize that. Of course, the threading library is configurable and you need to initialize that as well :)
See the following links for some getting started info:
initialization of CDS and the threading manager
http://sourceforge.net/projects/libcds/forums/forum/1034512/topic/4600301/
the unit tests (cd build && ./build.sh --debug-test for debug build)
Here is base template for 'main':
#include <cds/threading/model.h> // threading manager
#include <cds/gc/hzp/hzp.h> // Hazard Pointer GC
int main()
{
// Initialize \p CDS library
cds::Initialize();
// Initialize Garbage collector(s) that you use
cds::gc::hzp::GarbageCollector::Construct();
// Attach main thread
// Note: this is needed if the main thread accesses libcds containers
cds::threading::Manager::attachThread();
// Do some useful work
...
// Finish main thread - detaches internal control structures
cds::threading::Manager::detachThread();
// Terminate GCs
cds::gc::hzp::GarbageCollector::Destruct();
// Terminate \p CDS library
cds::Terminate();
}
Don't forget to attach any additional threads you are using:
#include <cds/threading/model.h>
int myThreadFunc(void *)
{
// initialize libcds thread control structures
cds::threading::Manager::attachThread();
// Now, you can work with GCs and libcds containers
....
// Finish working thread
cds::threading::Manager::detachThread();
return 0;
}
[1] (not to be confused with Google's compact data structures library)

Portable thread-safe lazy singleton

Greetings to all.
I'm trying to write a thread safe lazy singleton for future use. Here's the best I could come up with. Can anyone spot any problems with it? The key assumption is that static initialization occurs in a single thread before dynamic initialisations. (this will be used for a commercial project and company is not using boost :(, life would be a breeze otherwise :)
PS: I haven't checked that this compiles yet, my apologies.
/*
There are two difficulties when implementing the singleton pattern:
Problem (a): The "global variable instantiation fiasco". TODO: URL
This is due to the unspecified order in which global variables are initialised. Static class members are equivalent
to a global variable in C++ during initialisation.
Problem (b): Multi-threading.
Care must be taken to ensure that the mutex initialisation is handled properly with respect to problem (a).
*/
/*
Things achieved, maybe:
*) Portable
*) Lazy creation.
*) Safe from unspecified order of global variable initialisation.
*) Thread-safe.
*) Mutex is properly initialised when invoked during global variable initialisation.
*) Effectively lock free in instance().
*/
/************************************************************************************
Platform dependent mutex implementation
*/
class Mutex
{
public:
void lock();
void unlock();
};
/************************************************************************************
Threadsafe singleton
*/
class Singleton
{
public: // Interface
static Singleton* Instance();
private: // Static helper functions
static Mutex* getMutex();
private: // Static members
static Singleton* _pInstance;
static Mutex* _pMutex;
private: // Instance members
bool* _pInstanceCreated; // This is here to convince myself that the compiler is not re-ordering instructions.
void setInstanceCreated();
private: // Singletons can't be copied
explicit Singleton( bool* pInstanceCreated );
~Singleton() { }
};
/************************************************************************************
We can't use a static class member variable to initialise the mutex due to the unspecified
order of initialisation of global variables.
Calling this from
*/
Mutex* Singleton::getMutex()
{
static Mutex* pMutex = 0; // alternatively: static Mutex* pMutex = new Mutex();
if( !pMutex )
{
pMutex = new Mutex(); // Constructor initialises the mutex: eg. pthread_mutex_init( ... )
}
return pMutex;
}
/************************************************************************************
This static member variable ensures that we call Singleton::getMutex() at least once before
the main entry point of the program so that the mutex is always initialised before any threads
are created.
*/
Mutex* Singleton::_pMutex = Singleton::getMutex();
/************************************************************************************
Keep track of the singleton object for possible deletion.
*/
Singleton* Singleton::_pInstance = Singleton::Instance();
/************************************************************************************
Read the comments in Singleton::Instance().
*/
Singleton::Singleton( bool* pInstanceCreated )
{
fprintf( stderr, "Constructor\n" );
_pInstanceCreated = pInstanceCreated;
}
/************************************************************************************
Read the comments in Singleton::Instance().
*/
void Singleton::setInstanceCreated()
{
*_pInstanceCreated = true;
}
/************************************************************************************
Fingers crossed.
*/
Singleton* Singleton::Instance()
{
/*
'instance' is initialised to zero the first time control flows over it. So
avoids the unspecified order of global variable initialisation problem.
*/
static Singleton* instance = 0;
/*
When we do:
instance = new Singleton( instanceCreated );
the compiler can reorder instructions in any way it wants as long
as the observed behaviour is consistent to that of a single threaded environment ( assuming
that no thread-safe compiler flags are specified). The following is thus not threadsafe:
if( !instance )
{
lock();
if( !instance )
{
instance = new Singleton( instanceCreated );
}
unlock();
}
Instead we use:
static bool instanceCreated = false;
as the initialisation indicator.
*/
static bool instanceCreated = false;
/*
Double check pattern with a slight twist.
*/
if( !instanceCreated )
{
getMutex()->lock();
if( !instanceCreated )
{
/*
The ctor keeps a persistent reference to 'instanceCreated'.
In order to convince ourselves of the correct order of initialisation (I think
this is quite unnecessary).
*/
instance = new Singleton( instanceCreated );
/*
Set the reference to 'instanceCreated' to true.
Note that since setInstanceCreated() actually uses the non-static
member variable: '_pInstanceCreated', I can't see the compiler taking the
liberty to call Singleton's ctor AFTER the following call. (I don't know
much about compiler optimisation, but I doubt that it will break up the ctor into
two functions and call one part of it before the following call and the other part after.)
*/
instance->setInstanceCreated();
/*
The double check pattern should now work.
*/
}
getMutex()->unlock();
}
return instance;
}
No, this will not work. It is broken.
The problem has little/nothing to do with the compiler. It has to do with the order in which a second CPU will 'see' what the first CPU has done to memory. The memory (and caches) will be consistent, but the timing of WHEN each CPU decides to write or read each part of memory/cache is indeterminate.
So for CPU1:
instance = new Singleton( instanceCreated );
instance->setInstanceCreated();
Let's consider the compiler first. There is NO reason why the compiler doesn't reorder or otherwise alter these functions. Maybe like:
temp_register = new Singleton(instanceCreated);
temp_register->setInstanceCreated();
instance = temp_register;
or many other possibilities - like you said as long as single-threaded observed behaviour is consistent. This DOES include things like " break up the ctor into two functions and call one part of it before the following call and the other part after."
Now, it probably wouldn't break it up into 2 calls, but it would INLINE the ctor, particularly since it is so small. Then, once inlined, everything may be reordered, as if the ctor was broken in 2, for example.
In general, I would say not only is it possible that the compiler reordered things, it is probable - ie for the code you have, there is probably a reordering (once inlined, and inlining is likely) that is 'better' than the order given by the C++ code.
But let's leave that aside, and try to understand the real issues of double-checked locking.
So, let's just assume the compiler didn't reorder anything. What about the CPU? Or more importantly CPUs - plural.
The first CPU, 'CPU1' needs to follow the instructions given by the compiler, in particular, it needs to write to memory the things it has been told to write:
instance,
instanceCreated
other member variable of the Singleton (ie your Singleton does DO something, and has some state, doesn't it?)
Actually, that 'other member variable' stuff is really important. Important for your singleton - that's its real purpose right?, and important for our discussion. So let's give it a name: important_data. ie instance->important_data. And maybe instance->important_function(), which uses important_data. Etc.
As mentioned, let's assume the compiler has written the code such that these items are written in the order you are expecting, namely:
important_data - written inside the ctor, called from
instance = new Singleton(instanceCreated);
instance - assigned right after new/ctor returns
instanceCreated - inside setInstanceCreated()
Now, the CPU hands these writes off to the memory bus. Know what the memory bus does? IT REORDERS THEM. The CPU and architecture have the same constraints as the compiler - ie make sure this one CPU sees things consistently - ie single threaded consistent. So if, for example, instance and instanceCreated are on the same cache-line (highly likely, actually), they might be written together, and since they were just read, that cache-line is 'hot', so maybe they get written FIRST before important_data, so that that cache-line can be retired to make room for the cache-line where important_data lives.
Did you see that? instanceCreated and instance were just committed to memory BEFORE important_data. Note that CPU1 doesn't care, because it is living in a single-threaded world...
So now introduce CPU2:
CPU2 comes in, sees instanceCreated == true and instance != NULL and thus goes off and decides to call Singleton::Instance()->important_function(), which uses important_data, which is uninitialized. CRASH BANG BOOM.
By the way, it gets worse. So far, we've seen that the compiler could reorder, but we're pretending it didn't. Let's go one step further and pretend that CPU1 did NOT reorder any of the memory writing. Are we OK now?
No. Of course not.
Just as CPU1 decided to optimize/reorder its memory writes, CPU2 can REORDER ITS READS!
CPU2 comes in and sees
if (!instanceCreated) ...
so it needs to read instanceCreated. Ever heard of 'speculative execution'? (Great name for a FPS game, by the way). If the memory bus isn't busy doing anything, CPU2 might pre-read some other values 'hoping' that instanceCreated is true. ie it may pre-read important_data for example. Maybe important_data (or the uninitialized, possibly re-claimed-by-the-allocator memory that will become important_data) is already in CPU2's cache. Or maybe (more likely?) CPU2 just free'd that memory, and the allocator wrote NULL in its first 4 bytes (allocators often use that memory for their free-lists), so actually, the memory soon-to-become important_data may actually still be in the write queue of CPU2. In that case, why would CPU2 bother re-reading that memory, when it hasn't even finished writing it yet!? (it wouldn't - it would just get the values from its write-queue.)
Did that make sense? If not, imagine that the value of instance (which is a pointer) is 0x17e823d0. What was that memory doing before it became (becomes) the Singleton? Is that memory still in the write-queue of CPU2?...
Or basically, don't even think about why it might want to do so, but realize that CPU2 might read important_data first, then instanceCreated second. So even though CPU1 may have written them in order, CPU2 sees 'crap' in important_data, then sees true in instanceCreated (and who knows what in instance!). Again, CRASH BANG BOOM. Or BOOM CRASH BANG, since by now you realize that the order isn't guaranteed...
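For comparison, here is how the double-checked pattern is commonly written once C++11 atomics are available, which the original question may not have had access to. The acquire load and release store are what forbid the write and read reorderings described above. This is a sketch under that assumption, and the names Widget and importantData are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

class Widget {   // hypothetical singleton type
public:
    static Widget* instance() {
        // Acquire load: a thread that sees a non-null pointer also sees every
        // write made before the matching release store, including the
        // constructor's writes to importantData_.
        Widget* p = instance_.load(std::memory_order_acquire);
        if (p == nullptr) {
            std::lock_guard<std::mutex> lock(mutex_);
            // Second check under the lock; the mutex already orders this read.
            p = instance_.load(std::memory_order_relaxed);
            if (p == nullptr) {
                p = new Widget();  // intentionally never deleted in this sketch
                // Release store: publishes the fully constructed object.
                instance_.store(p, std::memory_order_release);
            }
        }
        return p;
    }

    int importantData() const { return importantData_; }

private:
    Widget() : importantData_(42) {}

    int importantData_;
    static std::atomic<Widget*> instance_;
    static std::mutex mutex_;
};

// Both have constexpr constructors, so they are initialized before any
// dynamic initialization runs.
std::atomic<Widget*> Widget::instance_{nullptr};
std::mutex Widget::mutex_;
```

In practice a function-local static or std::call_once gives the same guarantee with less code, so the explicit version is mostly useful for understanding what the memory ordering has to achieve.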
It's usually better to have a non-lazy singleton which does nothing in its constructor, and then in GetInstance do a thread-safe call once to a function which allocates any expensive resources. You're already creating a Mutex non-lazily, so why not just put the mutex and some kind of Pimpl in your Singleton object?
By the way, this is easier on Posix:
struct Singleton {
static Singleton *GetInstance() {
pthread_once(&control, doInit);
return instance;
}
private:
static void doInit() {
// slight problem: we can't throw from here, or fail
try {
instance = new Singleton();
} catch (...) {
// we could stash an error indicator in a static member,
// and check it in GetInstance.
std::abort();
}
}
static pthread_once_t control;
static Singleton *instance;
};
pthread_once_t Singleton::control = PTHREAD_ONCE_INIT;
Singleton *Singleton::instance = 0;
There do exist pthread_once implementations for Windows and other platforms.
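Where C++11 is available, std::call_once is the standard-library counterpart of pthread_once and follows the same shape. A minimal sketch, with the class name Resource and its id() method as hypothetical placeholders:

```cpp
#include <cassert>
#include <mutex>

class Resource {   // hypothetical singleton type
public:
    static Resource& getInstance() {
        // The callable runs exactly once, even with concurrent callers; other
        // threads block until it has finished. Unlike pthread_once, an
        // exception thrown here propagates to the caller and a later call
        // will try again.
        std::call_once(once_, [] { instance_ = new Resource(); });
        return *instance_;
    }

    int id() const { return 7; }

private:
    Resource() = default;

    static std::once_flag once_;
    static Resource* instance_;   // intentionally never deleted in this sketch
};

std::once_flag Resource::once_;
Resource* Resource::instance_ = nullptr;
```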
If you wish to see an in-depth discussion of Singletons, the various policies about their lifetime and the thread safety issues, I can only recommend a good read: "Modern C++ Design" by Alexandrescu.
The implementation is presented on the web in Loki, find it here!
And yes, it does fit in a single header file. So I would really encourage you to at least grab the file and read it, and better yet read the book to get the full picture.
At global scope in your code:
/************************************************************************************
Keep track of the singleton object for possible deletion.
*/
Singleton* Singleton::_pInstance = Singleton::Instance();
This makes your implementation not lazy. Presumably you want to set _pInstance to NULL at global scope, and assign to it after you construct the singleton inside Instance() before you unlock the mutex.
More food for thought from Meyers & Alexandrescu, with Singleton being the specific target: C++ and the Perils of Double-Checked Locking. It's a bit of a prickly problem.

Detecting when an object is passed to a new thread in C++?

I have an object for which I'd like to track the number of threads that reference it. In general, when any method on the object is called I can check a thread local boolean value to determine whether the count has been updated for the current thread. But this doesn't help me if the user, say, uses boost::bind to bind my object to a boost::function and uses that to start a boost::thread. The new thread will have a reference to my object, and may hold on to it for an indefinite period of time before calling any of its methods, thus leading to a stale count. I could write my own wrapper around boost::thread to handle this, but that doesn't help if the user boost::bind's an object that contains my object (I can't specialize based on the presence of a member type -- at least I don't know of any way to do that) and uses that to start a boost::thread.
Is there any way to do this? The only means I can think of requires too much work from users -- I provide a wrapper around boost::thread that calls a special hook method on the object being passed in provided it exists, and users add the special hook method to any class that contains my object.
Edit: For the sake of this question we can assume I control the means to make new threads. So I can wrap boost::thread for example and expect that users will use my wrapped version, and not have to worry about users simultaneously using pthreads, etc.
Edit2: One can also assume that I have some means of thread local storage available, through __thread or boost::thread_specific_ptr. It's not in the current standard, but hopefully will be soon.
In general, this is hard. The question of "who has a reference to me?" is not generally solvable in C++. It may be worth looking at the bigger picture of the specific problem(s) you are trying to solve, and seeing if there is a better way.
There are a few things I can come up with that can get you partway there, but none of them are quite what you want.
You can establish the concept of "the owning thread" for an object, and REJECT operations from any other thread, a la Qt GUI elements. (Note that trying to do things thread-safely from threads other than the owner won't actually give you thread-safety, since if the owner isn't checked it can collide with other threads.) This at least gives your users fail-fast behavior.
You can encourage reference counting by having the user-visible objects being lightweight references to the implementation object itself [and by documenting this!]. But determined users can work around this.
And you can combine these two: you can have the notion of thread ownership for each reference, and then have the object become aware of who owns the references. This could be very powerful, but not really idiot-proof.
You can start restricting what users can and cannot do with the object, but I don't think covering more than the obvious sources of unintentional error is worthwhile. Should you be declaring operator& private, so people can't take pointers to your objects? Should you be preventing people from dynamically allocating your object? It depends on your users to some degree, but keep in mind you can't prevent references to objects, so eventually playing whack-a-mole will drive you insane.
So, back to my original suggestion: re-analyze the big picture if possible.
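The "owning thread" option from this answer can be sketched as follows, assuming C++11 or later so that std::thread and std::this_thread::get_id() are available. The class name OwnedByThread and the helper rejectedFromOtherThread() are hypothetical; the point is only the fail-fast check at the top of every method:

```cpp
#include <cassert>
#include <stdexcept>
#include <thread>

class OwnedByThread {
public:
    // The constructing thread becomes the owner.
    OwnedByThread() : owner_(std::this_thread::get_id()) {}

    void touch() { checkOwner(); ++value_; }
    int value() const { checkOwner(); return value_; }

private:
    // Reject any call that does not come from the owning thread.
    void checkOwner() const {
        if (std::this_thread::get_id() != owner_) {
            throw std::logic_error("called from a non-owning thread");
        }
    }

    const std::thread::id owner_;
    int value_ = 0;
};

// Usage: the owner may call touch(); a second thread is rejected.
bool rejectedFromOtherThread() {
    OwnedByThread obj;
    obj.touch();
    bool rejected = false;
    std::thread t([&] {
        try {
            obj.touch();
        } catch (const std::logic_error&) {
            rejected = true;
        }
    });
    t.join();
    return rejected && obj.value() == 1;
}
```

This does not count references; it only makes misuse from another thread fail immediately instead of silently.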
Short of a pimpl style implementation that does a threadid check before every dereference, I don't see how you could do this:
class MyClass;
class MyClassImpl {
    friend class MyClass;
    // threadid_t stands in for the platform's thread id type.
    threadid_t owning_thread;
public:
    void doSomethingThreadSafe();     // path for non-owning threads
    void doSomethingNoSafetyCheck();  // path for the owning thread
};
class MyClass {
    MyClassImpl* impl;
public:
    void doSomething() {
        // __threadid() stands in for the platform call that returns the current thread id.
        if (__threadid() != impl->owning_thread) {
            impl->doSomethingThreadSafe();
        } else {
            impl->doSomethingNoSafetyCheck();
        }
    }
};
Note: I know the OP wants to list threads with active pointers, I don't think that's feasible. The above implementation at least lets the object know when there might be contention. When to change the owning_thread depends heavily on what doSomething does.
Usually you cannot do this programmatically.
Unfortunately, the way to go is to design your program in such a way that you can prove (i.e. convince yourself) that certain objects are shared, and others are thread private.
The current C++ standard does not even have the notion of a thread, so there is no standard portable notion of thread local storage, in particular.
If I understood your problem correctly, I believe this could be done on Windows using the Win32 function GetCurrentThreadId().
Below is a quick and dirty example of how it could be used. Thread synchronisation should rather be done with a lock object.
If you create an object of CMyThreadTracker at the top of every member function of your object to be tracked for threads, the _handle_vector should contain the thread ids that use your object.
#include <process.h>
#include <windows.h>
#include <tchar.h>   // _tmain, _TCHAR
#include <stdio.h>   // printf
#include <vector>
#include <algorithm>
#include <functional>
using namespace std;
class CMyThreadTracker
{
vector<DWORD> & _handle_vector;
DWORD _h;
CRITICAL_SECTION &_CriticalSection;
public:
CMyThreadTracker(vector<DWORD> & handle_vector,CRITICAL_SECTION &crit):_handle_vector(handle_vector),_CriticalSection(crit)
{
EnterCriticalSection(&_CriticalSection);
_h = GetCurrentThreadId();
_handle_vector.push_back(_h);
printf("thread id %08x\n",_h);
LeaveCriticalSection(&_CriticalSection);
}
~CMyThreadTracker()
{
EnterCriticalSection(&_CriticalSection);
vector<DWORD>::iterator ee = std::remove(_handle_vector.begin(),_handle_vector.end(),_h);
_handle_vector.erase(ee,_handle_vector.end());
LeaveCriticalSection(&_CriticalSection);
}
};
class CMyObject
{
vector<DWORD> _handle_vector;
public:
void method1(CRITICAL_SECTION & CriticalSection)
{
CMyThreadTracker tt(_handle_vector,CriticalSection);
printf("method 1\n");
EnterCriticalSection(&CriticalSection);
for(size_t i=0;i<_handle_vector.size();++i)
{
printf(" this object is currently used by thread %08x\n",_handle_vector[i]);
}
LeaveCriticalSection(&CriticalSection);
}
};
CMyObject mo;
CRITICAL_SECTION CriticalSection;
unsigned __stdcall ThreadFunc( void* arg )
{
unsigned int sleep_time = *(unsigned int*)arg;
while ( true)
{
Sleep(sleep_time);
mo.method1(CriticalSection);
}
_endthreadex( 0 );
return 0;
}
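int _tmain(int argc, _TCHAR* argv[])
{
const int kThreadCount = 5;
HANDLE hThreads[kThreadCount];
// Each thread needs its own sleep time that outlives the loop iteration;
// passing the address of a loop-local variable leaves the thread with a dangling pointer.
static unsigned int sleep_times[kThreadCount];
unsigned int threadID;
if (!InitializeCriticalSectionAndSpinCount(&CriticalSection, 0x80000400) )
return -1;
for(int i=0;i<kThreadCount;++i)
{
sleep_times[i] = 1000 *(i+1);
hThreads[i] = (HANDLE)_beginthreadex( NULL, 0, &ThreadFunc, &sleep_times[i], 0, &threadID );
printf("creating thread %08x\n",threadID);
}
// The worker threads loop forever, so this blocks until the process is killed.
WaitForMultipleObjects( kThreadCount, hThreads, TRUE, INFINITE );
return 0;
}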
EDIT1:
As mentioned in the comment, reference dispensing could be implemented as below. A vector could hold the unique thread ids referring to your object. You may also need to implement a custom assignment operator to deal with the object references being copied by a different thread.
class MyClass
{
public:
static MyClass & Create()
{
static MyClass * p = new MyClass();
return *p;
}
static void Destroy(MyClass * p)
{
delete p;
}
private:
MyClass(){}
~MyClass(){};
};
class MyCreatorClass
{
MyClass & _my_obj;
public:
MyCreatorClass():_my_obj(MyClass::Create())
{
}
MyClass & GetObject()
{
//TODO:
// use GetCurrentThreadId to get thread id
// check if the id is already in the vector
// add this to a vector
return _my_obj;
}
~MyCreatorClass()
{
MyClass::Destroy(&_my_obj);
}
};
int _tmain(int argc, _TCHAR* argv[])
{
MyCreatorClass mcc;
MyClass &o1 = mcc.GetObject();
MyClass &o2 = mcc.GetObject();
return 0;
}
The solution I'm familiar with is to state "if you don't use the correct API to interact with this object, then all bets are off."
You may be able to turn your requirements around and make it possible for any threads that reference the object to subscribe to signals from the object. This won't help with race conditions, but allows threads to know when the object has unloaded itself (for instance).
If the problem is "I have an object and want to know how many threads access it", and you can also enumerate your threads, you can solve it with thread local storage.
Allocate a TLS index for your object. Make a private method called "registerThread" which simply sets the thread TLS to point to your object.
The key extension to the poster's original idea is to call registerThread() during every method call. Then you don't need to detect when or who created the thread; the value is just set (often redundantly) during every actual access.
To see which threads have accessed the object, just examine their TLS values.
Upside: simple and pretty efficient.
Downside: solves the posted question but doesn't extend smoothly to multiple objects or dynamic threads that aren't enumerable.
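A rough sketch of the register-on-every-call idea, assuming C++11 or later. It is a variation rather than the exact scheme above: instead of storing a pointer in a TLS slot and inspecting each thread's slot afterwards, it keeps a thread_local flag plus an atomic counter, which is closer to the thread local boolean mentioned in the question. The names Tracked, doWork, threadsSeen and countDistinctThreads are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

class Tracked {
public:
    void doWork() {
        registerThread();  // cheap and usually redundant
        // ... the real work would go here ...
    }

    int threadsSeen() const { return count_.load(); }

private:
    void registerThread() {
        // One flag per thread, shared by every Tracked instance; this is why
        // the sketch only makes sense for a single tracked object.
        thread_local bool registered = false;
        if (!registered) {
            registered = true;
            ++count_;
        }
    }

    std::atomic<int> count_{0};
};

// Usage: the calling thread plus three workers touch the object.
int countDistinctThreads() {
    Tracked obj;
    obj.doWork();
    obj.doWork();  // second call from the same thread is not counted again
    std::vector<std::thread> workers;
    for (int i = 0; i < 3; ++i) {
        workers.emplace_back([&obj] { obj.doWork(); obj.doWork(); });
    }
    for (auto& w : workers) {
        w.join();
    }
    return obj.threadsSeen();
}
```

The counter is never decremented when a thread exits, and a thread that holds a reference without calling a method is still invisible, so the stale-count concern from the question remains.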