Singleton pattern with atomic states - c++

Is it a correct way to make singleton objects by using 2 static atomic and mutex variables to save 2 states: initializing and initialized?
For example, I only need one Application instance running in a program. Its job is to init and terminate external libraries, and to prevent any new Application object from being created.
#include <mutex>
#include <stdexcept>
static bool initialized;
static std::mutex mutex;
Application::Application()
{
std::lock_guard<std::mutex> lock(mutex);
if (initialized) throw std::runtime_error("Application::Application");
if (!init_external_libraries())
throw std::runtime_error("Application::Application");
initialized = true;
}
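Application::~Application()
{
// take the same lock as the constructor so the flag is never written unprotected
std::lock_guard<std::mutex> lock(mutex);
terminate_external_libraries();
initialized = false;
}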

Do I get it right, that init_external_libraries() has to run at most one time?
Atomics won't help you there. Operations on atomics are atomic (storing and loading values in your case), but what happens between those is not.
You could use that nice trick of having a function that has a static object and returns a reference to it. The initialization of a function-local static is guaranteed to happen only once.
It would look something like this:
Object &get_singleton(){
static Object o;
return o;
}
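EDIT: And since C++11 this is threadsafe: if several threads reach the declaration at the same time, only one of them runs the initialization and the others wait for it. Before C++11 the standard made no such promise.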
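To see the function-local static trick under concurrent use, here is a minimal sketch. The constructor counter and the helper construct_from_many_threads are made up for the demonstration and are not part of the question's code; it assumes a C++11 compiler.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Counts how many times the constructor actually runs (demo-only instrumentation).
static std::atomic<int> g_constructions{0};

class Object {
public:
    Object() { ++g_constructions; }
    Object(const Object&) = delete;             // a singleton should not be copied
    Object& operator=(const Object&) = delete;
};

// Since C++11, concurrent callers wait until the first one has finished
// constructing the function-local static.
Object& get_singleton() {
    static Object o;
    return o;
}

// Hypothetical helper: call get_singleton() from several threads at once
// and report how many times the constructor ran.
int construct_from_many_threads(int thread_count) {
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([] { get_singleton(); });
    }
    for (auto& t : threads) t.join();
    return g_constructions.load();
}
```

Note that this only covers the "construct once" part. It does not by itself give you the "throw if a second Application is created" behaviour from the question.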

Related

C++: Thread Safety in a Signal/Slot Library

I'm implementing a Signal/Slot framework, and got to the point that I want it to be thread-safe. I already had a lot of support from the Boost mailing-list, but since this is not really boost-related, I'll ask my pending question here.
When is a signal/slot implementation (or any framework that calls functions outside itself, specified in some way by the user) considered thread-safe? Should it be safe w.r.t. its own data, i.e. the data associated with its implementation details? Or should it also take into account the user's data, which might or might not be modified by whatever functions are passed to the framework?
This is an example given on the mailing-list (Edit: this is an example use-case --i.e. user code--. My code is behind the calls to the Emitter object):
int * somePtr = nullptr;
Emitter<Event> em; // just an object that can emit the 'Event' signal
void mainThread()
{
em.connect<Event>(someFunction);
// now, somehow, 2 threads are created which, at some point
// execute the thread1() and thread2() functions below
}
void someFunction()
{
// can somePtr change after the check but before the set?
if (somePtr)
*somePtr = 17;
}
void cleanupPtr()
{
// this looks safe, but compilers and CPUs can reorder this code:
int *tmp = somePtr;
somePtr = nullptr;
delete tmp;
}
void thread1()
{
em.emit<Event>();
}
void thread2()
{
em.disconnect<Event>(someFunction);
// now safe to cleanup (?)
cleanupPtr();
}
In the above code, it might happen that Event is emitted, causing someFunction to be executed. If somePtr is non-null, but becomes null just after the if, but before the assignment, we're in trouble. From the point of view of thread2, this is not obvious because it is disconnecting someFunction before calling cleanupPtr.
I can see why this could potentially lead to trouble, but whose responsibility is this? Should my library protect the user from using it in every irresponsible but imaginable way?
I suspect there is no clearly good answer, but clarity will come from documenting the guarantees you wish to make about concurrent access to an Emitter object.
One level of guarantee, which to me is what is implied by a promise of thread safety, is that:
Concurrent operations on the object are guaranteed to leave the object in a consistent state (at least, from the point of view of the accessing threads.)
Non-commutative operations will be performed as if they were scheduled serially in some (unknown) order.
Then the question is, what does the emit method promise semantically: passing control to the connected routine, or evaluation of the function? If the former, then your work sounds like it is already done; if the latter, then the 'as-if ordered' requirement would mean that you need to enforce some level of synchronisation.
Users of the library can work with either, provided it is clear what is being promised.
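To make the two possible promises concrete, here is a rough sketch of a toy emitter offering both. The class, the integer connection ids and the names emit / emit_serialized are assumptions for illustration only; they are not the asker's Emitter<Event> API.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <mutex>
#include <utility>
#include <vector>

class Emitter {
public:
    using Slot = std::function<void()>;

    int connect(Slot slot) {
        std::lock_guard<std::mutex> lock(mutex_);
        int id = next_id_++;
        slots_[id] = std::move(slot);
        return id;
    }

    void disconnect(int id) {
        std::lock_guard<std::mutex> lock(mutex_);
        slots_.erase(id);
    }

    // Weaker promise ("passing control"): only the slot list is protected.
    // Slots run outside the lock, so a slot may still be running, or about
    // to run, after disconnect() has returned on another thread.
    void emit() {
        std::vector<Slot> snapshot;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            for (auto& entry : slots_) snapshot.push_back(entry.second);
        }
        for (auto& slot : snapshot) slot();
    }

    // Stronger promise ("as-if ordered"): slots run while the lock is held,
    // so once disconnect() returns the slot is neither running nor runnable.
    // The price: a slot that calls connect()/disconnect() on this emitter deadlocks.
    void emit_serialized() {
        std::lock_guard<std::mutex> lock(mutex_);
        for (auto& entry : slots_) entry.second();
    }

private:
    std::mutex mutex_;
    std::map<int, Slot> slots_;
    int next_id_ = 0;
};
```

With emit_serialized, the "disconnect, then clean up" sequence from the question becomes safe as far as the emitter is concerned; with emit, the user still has to synchronise the cleanup.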
Firstly the simplest possibility: If you don't claim your library to be thread-safe, you don't have to bother about this.
(But even) if you do:
In your example the user would have to take care of thread-safety, since both functions could be dangerous even without using your event system (IMHO, this is a pretty good way to determine who should take care of that kind of problem). A possible way for them to do this in C++11 could be:
#include <mutex>
// A mutex is used to control thread access to a shared resource
std::mutex _somePtr_mutex;
int* somePtr = nullptr;
void someFunction()
{
/*
Create a 'lock_guard' to manage your mutex.
Is the mutex '_somePtr_mutex' already locked?
Yes: Wait until it's unlocked.
No: Lock it and continue execution.
*/
std::lock_guard<std::mutex> lock(_somePtr_mutex);
if(somePtr)
*somePtr = 17;
// End of scope: 'lock' gets destroyed and hence unlocks '_somePtr_mutex'
}
void cleanupPtr()
{
/*
Create a 'lock_guard' to manage your mutex.
Is the mutex '_somePtr_mutex' already locked?
Yes: Wait until it's unlocked.
No: Lock it and continue execution.
*/
std::lock_guard<std::mutex> lock(_somePtr_mutex);
int *tmp = somePtr;
somePtr = nullptr;
delete tmp;
// End of scope: 'lock' gets destroyed and hence unlocks '_somePtr_mutex'
}
The last question is easy. If you say your library is threadsafe, it should be threadsafe. It makes no sense to say it is partly threadsafe, or that it is only threadsafe if you do not abuse it. In that case you have to explain what exactly is not threadsafe.
Now to your first question regarding someFunction:
The operation is not atomic, which means the thread can be interrupted between the if and the assignment. And that will happen, I know that :-) The other thread can erase the pointer at any time, even between two short and fast looking statements.
Now to cleanupPtr:
I am not a compiler expert, but if you want to be sure that your assignment takes place at the point where you wrote it in the code, putting the keyword volatile in front of the declaration of somePtr is not enough. volatile only stops the compiler from keeping the value in a register or optimising the accesses away; in standard C++ it gives neither atomicity nor any ordering guarantee between threads.
If you have a thread situation with one reader thread and one writer thread, a std::atomic variable can (IMHO) be enough to sync them, as long as the values you use to exchange information between the threads are of simple built-in types.
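For the one-writer, one-reader case, a sketch with std::atomic could look like this. The names payload, ready and run_reader_writer are invented for the example; the point is only the release store paired with the acquire load.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                  // plain data, written before the flag is set
std::atomic<bool> ready{false};   // the only variable both threads touch concurrently

void writer() {
    payload = 17;
    // release: everything written above becomes visible to a thread
    // that observes ready == true with an acquire load
    ready.store(true, std::memory_order_release);
}

int reader() {
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();  // wait until the writer has published
    }
    return payload;                 // safe: ordered after the writer's store
}

// Hypothetical driver that runs both sides and returns what the reader saw.
int run_reader_writer() {
    int seen = 0;
    std::thread r([&seen] { seen = reader(); });
    std::thread w(writer);
    w.join();
    r.join();
    return seen;
}
```

This handoff pattern does not fix someFunction/cleanupPtr from the question: checking a pointer and then writing through it is two steps, and the object can be deleted in between, so that case still needs a mutex.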
For other situations you can use a mutex or atomics. I will give you an example with a mutex. I use C++11 for that, but it works similarly with previous versions of C++ using boost.
Using mutex:
int * somePtr = nullptr;
Emitter<Event> em; // just an object that can emit the 'Event' signal
std::recursive_mutex g_mutex;
void mainThread()
{
em.connect<Event>(someFunction);
// now, somehow, 2 threads are created which, at some point
// execute the thread1() and thread2() functions below
}
void someFunction()
{
std::lock_guard<std::recursive_mutex> lock(g_mutex);
// can somePtr change after the check but before the set?
if (somePtr)
*somePtr = 17;
}
void cleanupPtr()
{
std::lock_guard<std::recursive_mutex> lock(g_mutex);
// this looks safe, but compilers and CPUs can reorder this code:
int *tmp = somePtr;
somePtr = nullptr;
delete tmp;
}
void thread1()
{
em.emit<Event>();
}
void thread2()
{
em.disconnect<Event>(someFunction);
// now safe to cleanup (?)
cleanupPtr();
}
I only added a recursive mutex here without changing any other code of the sample, even if it's now cargo code.
There are two kinds of mutex in the std that matter here: std::mutex and std::recursive_mutex. std::mutex is not recursive: a thread that already owns it must not lock it again (the behaviour is undefined, and in practice it usually deadlocks). That can happen if a method which needs mutex protection calls a public method which uses the same mutex. std::recursive_mutex is reentrant for the same thread, which is how I expect a mutex to work.
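A small sketch of the reentrant case, with a made-up Counter class: a public locking method calls another public locking method on the same object, which is fine with std::recursive_mutex and would not be with std::mutex.

```cpp
#include <cassert>
#include <mutex>

class Counter {
public:
    void increment() {
        std::lock_guard<std::recursive_mutex> lock(mutex_);
        ++value_;
    }

    // Public method that calls another public, locking method.
    void increment_twice() {
        std::lock_guard<std::recursive_mutex> lock(mutex_);
        increment();  // locks mutex_ again on the same thread: allowed here
        increment();
    }

    int value() const {
        std::lock_guard<std::recursive_mutex> lock(mutex_);
        return value_;
    }

private:
    mutable std::recursive_mutex mutex_;  // mutable so const methods can lock
    int value_ = 0;
};
```

Each lock_guard releases one level of ownership when it goes out of scope, so the lock and unlock counts always match.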
Atomics (or interlocks in win32) are another way, but only to exchange values between threads or access them concurrently. Your example is missing such values, but in your case, I would look a little deeper in them (std::atomic).
UPDATE
If you are the user of a library which is not explicitly declared as threadsafe by the developer, treat it as not threadsafe and shield every call to it with a mutex lock.
To stick with the example: if you cannot change someFunction, then you have to wrap the function like this:
void threadsafeSomeFunction()
{
std::lock_guard<std::recursive_mutex> lock(g_mutex);
someFunction();
}

Locking/unlocking mutex inside private functions

Imagine you have a big function that locks/unlocks a mutex inside and you want to break the function into smaller functions:
#include <pthread.h>
#include <vector>
class MyClass : public Uncopyable
{
public:
MyClass() : m_mutexBuffer(PTHREAD_MUTEX_INITIALIZER), m_vecBuffer() {}
~MyClass() {}
void MyBigFunction()
{
pthread_mutex_lock(&m_mutexBuffer);
if (m_vecBuffer.empty())
{
pthread_mutex_unlock(&m_mutexBuffer);
return;
}
// DoSomethingWithBuffer1();
unsigned char ucBcc = CalculateBcc(&m_vecBuffer[0], m_vecBuffer.size());
// DoSomethingWithBuffer2();
pthread_mutex_unlock(&m_mutexBuffer);
}
private:
void DoSomethingWithBuffer1()
{
// Use m_vecBuffer
}
void DoSomethingWithBuffer2()
{
// Use m_vecBuffer
}
private:
pthread_mutex_t m_mutexBuffer;
std::vector<unsigned char> m_vecBuffer;
};
How should I go about locking/unlocking the mutex inside the smaller functions?
Should I unlock the mutex first, then lock it straightaway and finally unlock it before returning?
void DoSomethingWithBuffer1()
{
pthread_mutex_unlock(&m_mutexBuffer);
pthread_mutex_lock(&m_mutexBuffer);
// Use m_vecBuffer
pthread_mutex_unlock(&m_mutexBuffer);
}
How should I go about locking/unlocking the mutex inside the smaller functions?
If your semantics require your mutex to be locked during the whole MyBigFunction() operation then you can't simply unlock it and relock it in the middle of the function.
My best bet would be to ignore the mutex in the smaller DoSomethingWithBuffer...() functions, and simply require that these functions are called with the mutex being already locked. This shouldn't be a problem since those functions are private.
On a side note, your mutex usage is incorrect: it is not exception safe, and you have code paths where you don't release the mutex. You should either use C++11's mutex and lock classes or boost's equivalents if you are using C++03. At worst if you can't use boost, write a small RAII wrapper to hold the lock.
In general, try to keep the regions of code within each lock to a minimum (to avoid contention), but avoid unlocking and immediately re-locking the same mutex. Thus, if the smaller functions are not mutually exclusive, they should each use their own independent mutex, and only when they actually access the shared resource.
Another thing you should consider is using RAII for locking and unlocking (as in C++11 with std::lock_guard<>), so that returning from a locked region (either directly or via an uncaught exception) does not leave you in a locked state.
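Putting both pieces of advice together, a sketch with C++11 std::mutex instead of pthread could look like the following. The bodies of the two helpers, plus Append and Size, are placeholders added so the example does something observable; only the locking structure is the point.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

class MyClass {
public:
    void Append(unsigned char byte) {
        std::lock_guard<std::mutex> lock(m_mutexBuffer);
        m_vecBuffer.push_back(byte);
    }

    // Returns false when the buffer was empty and nothing was done.
    bool MyBigFunction() {
        // One RAII lock for the whole operation; it is released on every
        // return path and also if an exception propagates.
        std::lock_guard<std::mutex> lock(m_mutexBuffer);
        if (m_vecBuffer.empty()) return false;
        DoSomethingWithBuffer1();
        DoSomethingWithBuffer2();
        return true;
    }

    std::size_t Size() const {
        std::lock_guard<std::mutex> lock(m_mutexBuffer);
        return m_vecBuffer.size();
    }

private:
    // Precondition for both helpers: the caller already holds m_mutexBuffer.
    // They never lock or unlock it themselves.
    void DoSomethingWithBuffer1() { m_vecBuffer.push_back(0xFF); }           // placeholder work
    void DoSomethingWithBuffer2() { m_vecBuffer.erase(m_vecBuffer.begin()); } // placeholder work

    mutable std::mutex m_mutexBuffer;
    std::vector<unsigned char> m_vecBuffer;
};
```

Because the helpers are private, the "must be called with the lock held" rule only has to be respected inside the class itself.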

Error about std::promise in C++

I am trying to pass my class instance into threads and then return the processed objects from the threads. I've googled about C++ multithreading, and found that std::promise can be helpful.
However, I am stuck at the very beginning. Here is my code:
void callerFunc()
{
//...
std::promise<DataWareHouse> data_chunks;
// DataWareHouse is my customized class
//data_chunks has a vector<vector<double>> member variable
std::thread(&run_thread,data_chunks);
// ............
}
void run_thread(std::promise<DataWareHouse> data_chunks)
{
// ...
vector<vector<double>> results;
// ...
data_chunks.set_value(results);
}
The above code generates an error:
`error C2248: 'std::promise<_Ty>::promise' : cannot access private member declared in class 'std::promise<_Ty>'`
May I know what I am doing wrong and how to fix it?
Many thanks. :-)
Your first problem is that you are using std::thread. std::thread is a low level class which you should build higher abstractions upon. Threading is newly standardized in C++ in C++11, and not all of the rough parts have been filed off yet.
There are three different patterns for using threading in C++11 that might be useful to you.
First, std::async. Second, std::thread mixed with std::packaged_task. And third, dealing with std::thread and std::promise in the raw.
I'll illustrate the third, which is the lowest level and most dangerous, because that is what you asked for. I would advise looking at the first two options.
#include <future>
#include <iostream>
#include <thread>
#include <utility>
#include <vector>
typedef std::vector<double> DataWareHouse;
void run_thread(std::promise<DataWareHouse> data_chunks)
{
    DataWareHouse results;
    results.push_back( 3.14159 );
    data_chunks.set_value(results);
}
std::future<DataWareHouse> do_async_work()
{
    std::promise<DataWareHouse> data_chunks;
    // take the future before the promise is moved into the thread
    std::future<DataWareHouse> retval = data_chunks.get_future();
    // std::promise is move-only, so hand it over with std::move
    std::thread t = std::thread(&run_thread,std::move(data_chunks));
    t.detach(); // detach (or join) before t is destroyed, or std::terminate is called
    return retval;
}
int main() {
    std::future<DataWareHouse> result = do_async_work();
    DataWareHouse vec = result.get(); // block and get the data
    for (double d: vec) {
        std::cout << d << "\n";
    }
}
With std::async, you'd have a function returning DataWareHouse, and it would return a std::future<DataWareHouse> directly.
With std::packaged_task<>, it would take your run_thread and turn it into a packaged_task that can be executed, and a std::future extracted from it.
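For comparison, here is a rough sketch of the first two options. The function names compute, with_async and with_packaged_task are made up for the example, and DataWareHouse is again just a typedef standing in for the real class.

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <utility>
#include <vector>

typedef std::vector<double> DataWareHouse;

// The work itself simply returns its result; no promise is visible here.
DataWareHouse compute() {
    DataWareHouse results;
    results.push_back(3.14159);
    return results;
}

// Option 1: std::async creates the promise/future pair and runs the task.
std::future<DataWareHouse> with_async() {
    return std::async(std::launch::async, compute);
}

// Option 2: std::packaged_task wraps the function; you still own the thread.
DataWareHouse with_packaged_task() {
    std::packaged_task<DataWareHouse()> task(compute);
    std::future<DataWareHouse> result = task.get_future();
    std::thread worker(std::move(task));  // packaged_task is move-only
    DataWareHouse data = result.get();    // blocks until compute() has returned
    worker.join();
    return data;
}
```

In both variants the worker function returns a plain value, which is usually easier to get right than passing a promise around by hand.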
std::promise<> is not copyable, and in calling run_thread() you are implicitly trying to invoke the copy constructor. The error message is telling you that you cannot use the copy constructor since it is marked private.
You need to pass a promise by reference (std::promise<DataWareHouse> &). This is safe if callerFunc() is guaranteed not to return until run_thread() is finished with the object (otherwise you will be using a reference to a destroyed stack-allocated object, and I don't have to explain why that's bad).
You're trying to pass the promise to the thread by value; but you need to pass by reference to get the results back to the caller's promise. std::promise is uncopyable, to prevent this mistake.
std::thread(&run_thread,std::ref(data_chunks)); // note the std::ref
void run_thread(std::promise<DataWareHouse> & data_chunks) // note the &
The error is telling you you cannot copy an std::promise, which you do here:
void run_thread(std::promise<DataWareHouse> data_chunks)
and here:
std::thread(&run_thread,data_chunks); // makes copy of data_chunks
You should pass a reference:
void run_thread(std::promise<DataWareHouse>& data_chunks); // note the &
And then pass an std::reference_wrapper to the thread, otherwise it too will attempt to copy the promise. This is easily done with std::ref:
std::thread(&run_thread, std::ref(data_chunks)); // note the std::ref
Obviously data_chunks must be alive until the thread finishes running, so you will have to join the thread in callerFunc().
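Put together, the by-reference variant could look roughly like this. DataWareHouse is a typedef here because the real class is not shown, and callerFunc is changed to return the result so the example has something to check.

```cpp
#include <cassert>
#include <functional>  // std::ref
#include <future>
#include <thread>
#include <vector>

typedef std::vector<std::vector<double>> DataWareHouse;  // stand-in for the real class

void run_thread(std::promise<DataWareHouse>& data_chunks) {
    DataWareHouse results;
    results.push_back(std::vector<double>(3, 1.0));
    data_chunks.set_value(results);
}

DataWareHouse callerFunc() {
    std::promise<DataWareHouse> data_chunks;
    std::future<DataWareHouse> result = data_chunks.get_future();
    // std::ref makes std::thread store a reference instead of copying the promise
    std::thread worker(&run_thread, std::ref(data_chunks));
    DataWareHouse data = result.get();  // wait for set_value()
    worker.join();                      // data_chunks must outlive the thread
    return data;
}
```

The join before returning is what keeps the reference valid: the promise lives on callerFunc's stack.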

Warm up some variables in C++

I have a C++ library that works on some numeric values. These values are not available at compile time but are immediately available at runtime and are based on machine-related details; in short, I need values like the display resolution, the number of CPU cores and so on.
The key points of my question are:
I can't ask the user to input these values (neither the coders/users of my lib nor the final user)
I need to do this warm up only once, when the application starts; it's a one-time thing
these values are later used by methods and classes
the possible solutions are:
build a data structure Data, declare some Data dummy where dummy is the name of the variable used to store everything, and let the constructor(s) handle the one-time initialization of the related values
wrap something like the first solution in a method like WarmUp() and put this method right after the start of main() (it's also a simple thing to remember and use)
the big problems that are still unsolved are:
the user can declare more than one data structure, since Data is a type and there are no restrictions on declaring 2, 4, 5 or 17 variables of the same type in C++
the WarmUp() method can be a little intrusive in the design of the other classes; it can also happen that the WarmUp() method is used in local methods and not in main().
I basically need to force the creation of one single instance of one specific type at runtime when I have no power over the actual use of my library, or at least I need to design this in a way that the user will immediately understand what kind of error is going on, keeping the use of the library as intuitive and simple as possible.
Can you see a solution to this ?
EDIT:
my problems are also more difficult due to the fact that I'm also trying to get a multi-threading compatible data structure.
What about using a lazily created singleton? E.g.
struct Data
{
    static Data & instance()
    {
        static Data data_;
        return data_;
    }
private:
    Data()
    {
        //init
    }
};
Data would be initialized on first use, when you call Data::instance()
Edit: as for multithreading, read efficient thread-safe singleton in C++
Edit2 realisation using boost::call_once
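If you have C++11, std::call_once is the standard counterpart of boost::call_once. A minimal sketch, assuming the only value to warm up is the core count; the names g_data, init_data, data and init_runs_after_concurrent_use are invented for the example.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

struct Data {
    unsigned cpu_cores = 0;
};

namespace {
std::once_flag g_data_once;
Data g_data;
int g_init_runs = 0;  // demo-only counter, written only inside call_once

void init_data() {
    // hardware_concurrency() is only a hint and may return 0 if unknown
    g_data.cpu_cores = std::thread::hardware_concurrency();
    ++g_init_runs;
}
}  // namespace

// Every caller goes through call_once; init_data runs exactly once,
// and concurrent callers wait until it has finished.
const Data& data() {
    std::call_once(g_data_once, init_data);
    return g_data;
}

// Hypothetical check: hit data() from several threads and report
// how many times the initialization actually ran.
int init_runs_after_concurrent_use() {
    std::vector<std::thread> threads;
    for (int i = 0; i < 4; ++i) threads.emplace_back([] { data(); });
    for (auto& t : threads) t.join();
    return g_init_runs;
}
```

Since data() hands out a const reference, the values are read-only after the warm up and need no further locking.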
First, as others have said, the obvious (and best) answer to
this is a singleton. Since you've added the multithreading
requirement, however: there are two solutions, depending on
whether the object will be modified by code using the singleton.
(From your description, I gather not.) If not, then it is
sufficient to use the "naïve" implementation of a singleton, and
ensure that the singleton is initialized before threads are
started. If no thread is started before you enter main (and
I would consider it bad practice otherwise), then something like
the following is largely sufficient:
class Singleton
{
static Singleton const* ourInstance;
Singleton();
Singleton( Singleton const& );
Singleton& operator=( Singleton const& );
public:
static Singleton const& instance();
};
and in the implementation:
Singleton const* Singleton::ourInstance = &Singleton::instance();
Singleton const&
Singleton::instance()
{
if ( ourInstance == NULL ) {
ourInstance = new Singleton;
}
return *ourInstance;
}
No locking is necessary, since no thread will be modifying
anything once threading starts.
If the singleton is mutable, then you have to protect all access
to it. You could do something like the above (without the
const, obviously), and leave the locking to the client, but in
such cases, I'd prefer locking in the instance function, and
returning an std::shared_ptr with a deleter which frees the
lock, which was acquired in the instance function. I think
something like the following could work (but I've never actually
needed it, and so haven't tried it):
#include <memory>
#include <mutex>
class Singleton
{
    static Singleton* ourInstance;
    static std::mutex ourMutex;
    // Deleter for the shared_ptr: it releases the lock, it does not delete the singleton.
    class LockForPointer
    {
    public:
        void operator()( Singleton* )
        {
            Singleton::ourMutex.unlock();
        }
    };
    class LockForInstance
    {
        bool myOwnershipIsTransfered;
    public:
        LockForInstance()
            : myOwnershipIsTransfered( false )
        {
            Singleton::ourMutex.lock();
        }
        ~LockForInstance()
        {
            if ( !myOwnershipIsTransfered ) {
                Singleton::ourMutex.unlock();
            }
        }
        LockForPointer transferOwnership()
        {
            myOwnershipIsTransfered = true;
            return LockForPointer();
        }
    };
public:
    static std::shared_ptr<Singleton> instance();
};
and the implementation:
Singleton* Singleton::ourInstance = NULL;
std::mutex Singleton::ourMutex;
std::shared_ptr<Singleton>
Singleton::instance()
{
LockForInstance lock;
if ( ourInstance == NULL ) {
ourInstance = new Singleton;
}
return std::shared_ptr<Singleton>( ourInstance, lock.transferOwnership() );
}
This way, the same lock is used for the check for null and for
accessing the data.
Warmup is usually related to performance issues (and makes me think of the processor cache, see the __builtin_prefetch of GCC).
Maybe you want to make a singleton class, there are many solutions to this (e.g. this C++ singleton tutorial).
Also, if performance is the primary concern, you could consider that the configured parameters are given at initialization (or even at installation) time. Then you could specialize your code, perhaps as simply as having templates and instantiating them by first emitting (at runtime = initialization time) the appropriate C++ stub source code, compiling it (at "runtime", e.g. at initialization) and dynamically loading it (using plugins & dlopen ...). See also this answer.

Thread safe container

There is some exemplary class of container in pseudo code:
class Container
{
public:
Container(){}
~Container(){}
void add(data item)
{
// addition of data
}
data get(size_t which)
{
// returning some data
}
void remove(size_t which)
{
// delete specified object
}
private:
data d;
};
How can this container be made thread safe? I heard about mutexes - where should these mutexes be placed? Should the mutex be static for the class or maybe in global scope? What is a good library for this task in C++?
First of all, mutexes should not be static for a class as long as you are going to use more than one instance. There are many cases where you should or shouldn't use them, so without seeing your code it's hard to say. Just remember, they are used to synchronise access to shared data, so it's wise to place them inside methods that modify or rely on the object's state. In your case I would use one mutex to protect the whole object and lock all three methods. Like:
#include <mutex>
class Container
{
public:
    Container(){}
    ~Container(){}
    void add(const data& item)
    {
        std::lock_guard<std::mutex> lock(mutex);
        // addition of data
    }
    data get(size_t which)
    {
        std::lock_guard<std::mutex> lock(mutex);
        // getting copy of value
        // return that value
    }
    void remove(size_t which)
    {
        std::lock_guard<std::mutex> lock(mutex);
        // delete specified object
    }
private:
    data d;
    std::mutex mutex; // one mutex per instance, not static
};
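A filled-in version of the same idea, as a sketch: it assumes the stored data is int kept in a std::vector, and adds size() and the helper fill_from_threads purely so the result can be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <stdexcept>
#include <thread>
#include <vector>

class Container {
public:
    void add(int item) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push_back(item);
    }

    // Returns a copy, so the caller never holds a reference into the container.
    int get(std::size_t which) const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.at(which);  // throws std::out_of_range on a bad index
    }

    void remove(std::size_t which) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (which >= items_.size()) throw std::out_of_range("Container::remove");
        items_.erase(items_.begin() + static_cast<std::ptrdiff_t>(which));
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    mutable std::mutex mutex_;  // per instance; mutable so const methods can lock
    std::vector<int> items_;
};

// Hypothetical stress helper: several threads add concurrently.
std::size_t fill_from_threads(Container& c, int thread_count, int per_thread) {
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
        threads.emplace_back([&c, per_thread] {
            for (int i = 0; i < per_thread; ++i) c.add(i);
        });
    }
    for (auto& th : threads) th.join();
    return c.size();
}
```

Each single call is safe, but a sequence such as "check size(), then get(size() - 1)" is not atomic as a whole, because another thread can remove an element between the two calls.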
Intel Threading Building Blocks (TBB) provides a bunch of thread-safe container implementations for C++. It has been open sourced; you can download it from: http://threadingbuildingblocks.org/ver.php?fid=174 .
First: sharing mutable state between threads is hard. You should be using a library that has been audited and debugged.
Now that this is said, there are two different functional issues:
you want a container to provide safe atomic operations
you want a container to provide safe multiple operations
The idea of multiple operations is that multiple accesses to the same container must be executed successively, under the control of a single entity. They require the caller to "hold" the mutex for the duration of the transaction so that only it changes the state.
1. Atomic operations
This one appears simple:
add a mutex to the object
at the start of each method grab a mutex with a RAII lock
Unfortunately it's also plain wrong.
The issue is re-entrancy. It is likely that some methods will call other methods on the same object. If those once again attempt to grab the mutex, you get a deadlock.
It is possible to use re-entrant mutexes. They are a bit slower, but allow the same thread to lock a given mutex as much as it wants. The number of unlocks should match the number of locks, so once again, RAII.
Another approach is to use dispatching methods:
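class C {
public:
    void foo() { Lock lock(_mutex); foo_impl(); }
    void bar() { Lock lock(_mutex); bar_impl(); }
private:
    void foo_impl() { /* do something */ }
    void bar_impl() { /* do something */; foo_impl(); } // calls the work-method, never the locking foo()
    Mutex _mutex;
};
The public methods are simple forwarders to private work-methods and simply lock. Then one just has to ensure that private methods never take the mutex...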
Of course there are risks of accidentally calling a locking method from a work-method, in which case you deadlock. Read on to avoid this ;)
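Here is a compilable sketch of the dispatching idea with std::mutex. The Tally class and its method names are invented for the example.

```cpp
#include <cassert>
#include <mutex>

class Tally {
public:
    // Public methods: take the lock, then forward to a work-method.
    void add(int amount) {
        std::lock_guard<std::mutex> lock(mutex_);
        add_impl(amount);
    }

    void add_with_bonus(int amount) {
        std::lock_guard<std::mutex> lock(mutex_);
        add_with_bonus_impl(amount);
    }

    int total() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return total_;
    }

private:
    // Work-methods: assume the lock is held and only call other work-methods.
    void add_impl(int amount) { total_ += amount; }
    void add_with_bonus_impl(int amount) {
        add_impl(amount);
        add_impl(1);  // calling add() here instead would lock mutex_ twice
    }

    mutable std::mutex mutex_;
    int total_ = 0;
};
```

A plain, non-recursive std::mutex is enough here, since no code path locks it twice on the same thread.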
2. Multiple operations
The only way to achieve this is to have the caller hold the mutex.
The general method is simple:
add a mutex to the container
provide a handle on this mutex
cross your fingers that the caller will never forget to hold the mutex while accessing the class
I personally prefer a much saner approach.
First, I create a "bundle of data", which simply represents the class data (+ a mutex), and then I provide a Proxy, in charge of grabbing the mutex. The data is locked so that the proxy only may access the state.
class ContainerData {
protected:
    friend class ContainerProxy;
    Mutex _mutex;
    void foo();
    void bar();
private:
    // some data
};
class ContainerProxy {
public:
    ContainerProxy(ContainerData& data): _data(data), _lock(data._mutex) {}
    void foo() { _data.foo(); }
    void bar() { foo(); _data.bar(); }
private:
    ContainerData& _data;
    Lock _lock; // RAII lock on _data._mutex, released by the destructor
};
Note that it is perfectly safe for the Proxy to call its own methods. The mutex will be released automatically by the destructor.
The mutex can still be reentrant if multiple Proxies are desired. But really, when multiple proxies are involved, it generally turns into a mess. In debug mode, it's also possible to add a "check" that the mutex is not already held by this thread (and assert if it is).
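A self-contained sketch of the data-plus-proxy approach, assuming the data is a vector of int. The method names add, size and add_if_absent are made up to show a multi-step operation running under one lock.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

class ContainerProxy;

// The "bundle of data": nothing is reachable except through the proxy.
class ContainerData {
    friend class ContainerProxy;
    std::mutex mutex_;
    std::vector<int> items_;
};

// The proxy holds the lock for its whole lifetime.
class ContainerProxy {
public:
    explicit ContainerProxy(ContainerData& data)
        : data_(data), lock_(data.mutex_) {}

    void add(int item) { data_.items_.push_back(item); }
    std::size_t size() const { return data_.items_.size(); }

    // Multi-step operation: the check and the insert happen under the same
    // lock, and calling add() from here cannot deadlock.
    void add_if_absent(int item) {
        for (int existing : data_.items_) {
            if (existing == item) return;
        }
        add(item);
    }

private:
    ContainerData& data_;
    std::lock_guard<std::mutex> lock_;  // released when the proxy is destroyed
};
```

Keep each proxy in a short scope: while one exists, every other thread that wants the container waits, and creating a second proxy for the same data on the same thread would deadlock with a plain std::mutex.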
3. Reminder
Using locks is error-prone. Deadlocks are a common cause of error and occur as soon as you have two mutexes (or one and re-entrancy). When possible, prefer using higher level alternatives.
Add a mutex as an instance variable of the class. Initialize it in the constructor, lock it at the very beginning of every method, including the destructor, and unlock it at the end of the method. Adding a global mutex for all instances of the class (a static member or just in global scope) may be a performance penalty.
There is also a very nice collection of lock-free containers (including maps) by Max Khiszinsky
LibCDS [1] Concurrent Data Structures
Here is the documentation page:
http://libcds.sourceforge.net/doc/index.html
It can be kind of intimidating to get started, because it is fully generic and requires you register a chosen garbage collection strategy and initialize that. Of course, the threading library is configurable and you need to initialize that as well :)
See the following links for some getting started info:
initialization of CDS and the threading manager
http://sourceforge.net/projects/libcds/forums/forum/1034512/topic/4600301/
the unit tests ((cd build && ./build.sh ----debug-test for debug build)
Here is base template for 'main':
#include <cds/threading/model.h> // threading manager
#include <cds/gc/hzp/hzp.h> // Hazard Pointer GC
int main()
{
// Initialize \p CDS library
cds::Initialize();
// Initialize Garbage collector(s) that you use
cds::gc::hzp::GarbageCollector::Construct();
// Attach main thread
// Note: it is needed if main thread can access to libcds containers
cds::threading::Manager::attachThread();
// Do some useful work
...
// Finish main thread - detaches internal control structures
cds::threading::Manager::detachThread();
// Terminate GCs
cds::gc::hzp::GarbageCollector::Destruct();
// Terminate \p CDS library
cds::Terminate();
}
Don't forget to attach any additional threads you are using:
#include <cds/threading/model.h>
int myThreadFunc(void *)
{
// initialize libcds thread control structures
cds::threading::Manager::attachThread();
// Now, you can work with GCs and libcds containers
....
// Finish working thread
cds::threading::Manager::detachThread();
return 0;
}
[1] Not to be confused with Google's compact datastructures library.