Is this way of creating a static instance thread-safe? - C++

I have the following sample C++ code:
class Factory
{
public:
    static Factory& createInstance()
    {
        static Factory fac;
        return fac;
    }
private:
    Factory()
    {
        //Does something non-trivial
    }
};
Let's assume that createInstance is called by two threads at the same time. So will the resulting object be created properly? What happens if the second thread enters the createInstance call when the first thread is in the constructor of Factory?

C++11 and above: local static creation is thread-safe.
The standard guarantees that:
The creation is synchronized.
Should the creation throw an exception, the next time the flow of execution passes the variable definition, creation will be attempted again.
It is generally implemented with double-checking:
first a thread-local flag is checked, and if set, then the variable is accessed.
if not yet set, then a more expensive synchronized path is taken, and if the variable is created afterward, the thread-local flag is set.
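For illustration, here is a hand-rolled sketch of that double-checked scheme using only standard C++11 primitives (the Widget and instanceByHand names are made up; real compilers emit a hidden guard variable and runtime helper calls rather than exactly this code):
#include <atomic>
#include <mutex>

struct Widget { Widget() { /* something non-trivial */ } };

// Hand-rolled equivalent of "static Widget w;" inside a function.
Widget& instanceByHand()
{
    static std::atomic<Widget*> ptr{nullptr};
    static std::mutex initMutex;

    Widget* p = ptr.load(std::memory_order_acquire);   // fast path: already built?
    if (p == nullptr) {
        std::lock_guard<std::mutex> lock(initMutex);    // slow path: one builder at a time
        p = ptr.load(std::memory_order_relaxed);
        if (p == nullptr) {
            p = new Widget();                           // runs exactly once
            ptr.store(p, std::memory_order_release);    // publish to other threads
        }
    }
    return *p;
}
If the constructor throws, the pointer is never published, so the next caller attempts construction again, which matches the retry guarantee described above.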
C++03 and C++98: the standard knows no thread.
There are no threads as far as the Standard is concerned, and therefore there is no provision in the Standard regarding synchronization across threads.
However, some compilers implement more than the standard mandates, either in the form of extensions or by giving stronger guarantees, so check the documentation of the compilers you're interested in. If they are good quality ones, chances are that they will guarantee it.
Finally, it might not be necessary for it to be thread-safe. If you call this method before creating any thread, then you ensure that it will be correctly initialized before the real multi-threading comes into play, and you'll neatly side-step the issue. A sketch of that approach follows.
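A minimal sketch of that side-stepping approach, assuming the Factory class from the question is in scope and using std::thread as a stand-in for whatever threading API is actually in play:
#include <thread>

int main()
{
    // Touch the singleton while the program is still single-threaded, so
    // construction has finished before any worker thread exists.
    Factory::createInstance();

    std::thread worker([] {
        Factory& f = Factory::createInstance(); // now merely returns the existing object
        (void)f; // ... use f ...
    });
    worker.join();
    return 0;
}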

Looking at this page, I'd say that this is not thread-safe, because the constructor could get called multiple times before the variable is finally assigned. An InterlockedCompareExchange() might be needed, where you create a local copy of the variable, then atomically assign the pointer to a static field via the interlocked function, if the static variable is null.

Of course it's thread safe! Unless you are a complete lunatic and spawn threads from constructors of static objects, you won't have any threads until after main() is called, and the createInstance method is just returning a reference to an already constructed object, so there's no way this can fail. ISO C++ guarantees that the object will be constructed before its first use after main() is called; there's no assurance that this will happen before main() is called, but it has to happen before the first use, and so all systems will perform the initialisation before main() is called. Of course ISO C++ doesn't define behaviour in the presence of threads or dynamic loading, but all compilers for hosted systems provide this support and will try to preserve the semantics specified for singly threaded, statically linked code where possible.

The instantiation (first call) itself is threadsafe.
However, subsequent access will not be, in general. For instance, suppose that after instantiation one thread calls a method that mutates the Factory and another calls some accessor method on it; then you will be in trouble.
For example, if your factory keeps a count of the number of instances created, you will be in trouble without some kind of mutex around that variable.
However, if Factory is truly a class with no state (no member variables), then you will be okay.
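As a sketch of that point, here is the Factory from the question extended with a hypothetical instance counter; the construction of the static is safe on its own (in C++11), but the counter needs its own mutex:
#include <mutex>

class Factory
{
public:
    static Factory& createInstance()
    {
        static Factory fac;   // construction itself is synchronized (C++11)
        return fac;
    }

    // Hypothetical mutable state: concurrent callers must synchronize access themselves.
    void registerProduct()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        ++productCount_;
    }

    int productCount()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return productCount_;
    }

private:
    Factory() { /* something non-trivial */ }
    std::mutex mutex_;
    int productCount_ = 0;
};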

Related

Do global variable constructors/destructors need thread protection?

If I have a class whose sole purpose is to have global static instances (to ensure the code in its constructor is run before main) and it uses a class static variable, does access to this variable need to be protected via mutex?
An example will help:
class WinSock
{
public:
    WinSock()
    {
        if(!(inst++))
            //winsock init
    }
    ~WinSock()
    {
        if(!--inst)
            //winsock deactivate
    }
private:
    static int inst = 0;
};
static WinSock unusedWinSockVar;
This is all in a header that is included by any file using winsock. Does access to inst need to be protected, or is it impossible for this code to be run from multiple threads since threads will be created only once main runs and destroyed before main returns?
Firstly, I don't think that private: static int inst = 0; is a valid construct; my compiler complains loudly. If for simplicity you merely omitted something like int WinSock::inst = 0; that lives in some .cpp file in your project, then it's OK. If not, and your project compiles at all, there is a good chance that each translation unit will use a different variable, resulting in incorrect behavior.
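A minimal sketch of the conventional fix being referred to, with the in-class initializer replaced by a declaration plus a single out-of-line definition:
// WinSock.h
class WinSock
{
public:
    WinSock()  { if (!(inst++)) { /* winsock init */ } }
    ~WinSock() { if (!--inst)   { /* winsock deactivate */ } }
private:
    static int inst;      // declaration only; no initializer here in C++98/03
};

// WinSock.cpp - exactly one translation unit provides the definition
int WinSock::inst = 0;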
Secondly, if any of the static-object constructors creates a new thread, then you need to make your code thread-safe. From the C++ standard, §3.6.2:
If a program starts a thread (30.3), the subsequent initialization of a variable is unsequenced with respect to the initialization of a variable defined in a different translation unit. Otherwise, the initialization of a variable is indeterminately sequenced with respect to the initialization of a variable defined in a different translation unit.
Indeterminate sequencing means that the initializations have no particular ordering, but they will not overlap, so you don't need any additional safeguards. Unsequenced means that constructors in different translation units might overlap, and therefore thread safety is required.
Thirdly, do you even need it done like this? Do you have other static objects that use winsock in their constructors? I really cannot think of any other reason to do it like that.
Given the specific scenario that you describe, this is fine without adding synchronization.
Your concern is that Winsock is initialized (and de-initialized) before (after) main runs; this is guaranteed to be the case. The code is also guaranteed to be called only once, from one thread. This (the fact that there's only one thread) makes synchronization useless.
If other static global objects use Winsock in their constructors (whether or not they spawn threads), that would of course be unsafe, but it wouldn't be any safer with a mutex either. The initialization takes place at an implementation-defined point in time before main.
Therefore, no static global object can use Winsock in a safe, well-defined way using this construct, since either way you don't know whether initialization occurred first. Synchronizing it doesn't change a thing for that detail.
Note: the initialization of inst inside the class declaration isn't allowed as it is.

Access violation in a multithreaded application, C++

I am not very good in multithreading programming so I would like to ask for some help/advice.
In my application I have two threads trying to access a shared object.
One can think of it as two tasks trying to call functions from within another object. For clarity I will show some parts of the program which may not be very relevant but hopefully help to explain my problem better.
Please take a look at the sample code below:
//DataLinkLayer.h
class DataLinkLayer: public iDataLinkLayer {
public:
    DataLinkLayer(void);
    ~DataLinkLayer(void);
};
Where iDataLinkLayer is an interface (abstract class without any implementation) containing pure virtual functions and a reference (pointer) declaration to the instance of the DataLinkLayer object (dataLinkLayer).
// DataLinkLayer.cpp
#include "DataLinkLayer.h"
DataLinkLayer::DataLinkLayer(void) {
    /* In reality the task constructors take a bunch of other parameters
       but they are not relevant (I believe) at this stage. */
    dll_task_1* task1 = new dll_task_1(this);
    dll_task_2* task2 = new dll_task_2(this);
    /* Start multithreading */
    task1->start(); // task1 extends thread class
    task2->start(); // task2 also extends thread class
}
/* sample stub functions for testing */
void DataLinkLayer::from_task_1() {
    printf("Test data Task 1");
}
void DataLinkLayer::from_task_2() {
    printf("Test data Task 2");
}
Implementation of task 1 is below. The dataLinkLayer interface (iDataLinkLayer) pointer is passed to the class constructor so that the task can access the necessary functions of the dataLinkLayer instance.
//data_task_1.cpp
#include "iDataLinkLayer.h" // interface to DataLinkLayer
#include "data_task_1.h"
dll_task_1::dll_task_1(iDataLinkLayer* pDataLinkLayer) {
    this->dataLinkLayer = pDataLinkLayer; // dataLinkLayer declared in dll_task_1.h
}
// Run method - executes the thread
void dll_task_1::run() {
    // program reaches this point and prints the stuff
    this->dataLinkLayer->from_task_1();
}
// more stuff following - not relevant to the problem
...
And task 2 looks similar:
//data_task_2.cpp
#include "iDataLinkLayer.h" // interface to DataLinkLayer
#include "data_task_2.h"
dll_task_2::dll_task_2(iDataLinkLayer* pDataLinkLayer){
    this->dataLinkLayer = pDataLinkLayer; // dataLinkLayer declared in dll_task_2.h
}
// Run method - executes the thread
void dll_task_2::run() {
    // ERROR: 'Access violation reading location 0xcdcdcdd9' is signalled at this point
    this->dataLinkLayer->from_task_2();
}
// more stuff following - not relevant to the problem
...
So if I understand correctly, I access the shared pointer from two different threads (tasks) and that is not allowed.
Frankly, I thought that I would be able to access the object anyway, although the results might be unexpected.
It seems that something goes terribly wrong at the point when dll_task_2 tries to call the function through the pointer to the DataLinkLayer. dll_task_2 has lower priority, hence it is started afterwards. I don't understand why I still cannot at least access the object...
I could use a mutex to lock the variable, but I thought that the primary reason for a mutex is to protect the variable/object.
I am using Microsoft Visual C++ 2010 Express.
I don't know much about multithreading so maybe you can suggest a better solution to this problem as well as explain the reason of the problem.
The address of the access violation is a very small positive offset from 0xcdcdcdcd
Wikipedia says:
CDCDCDCD Used by Microsoft's C++ debugging runtime library to mark uninitialised heap memory
Here is the relevant MSDN page.
The corresponding value after free is 0xdddddddd, so it's likely to be incomplete initialization rather than use-after-free.
EDIT: James asked how optimization could mess up virtual function calls. Basically, it's because the currently standardized C++ memory model makes no guarantees about threading. The C++ standard defines that virtual calls made from within a constructor will use the declaring type of the constructor currently being run, not the final dynamic type of the object. So this means that, from the perspective of the C++ sequential execution memory model, the virtual call mechanism (practically speaking, a v-table pointer) must be set up before the constructor starts running (I believe the specific point is after base subobject construction in the ctor-initializer-list and before member subobject construction).
Now, two things can happen to make the observable behavior different in a threaded scenario:
First, the compiler is free to perform any optimization that would, in the C++ sequential execution model, act as-if the rules were being followed. For example, if the compiler can prove that no virtual calls are made inside the constructor, it could wait and set the v-table pointer at the end of the constructor body instead of the beginning. If the constructor doesn't give out the this pointer, since the caller of the constructor also hasn't received its copy of the pointer yet, then none of the functions called by the constructor can call back (virtually or statically) to the object under construction. But the constructor DOES give away the this pointer.
We have to look closer. If the function to which the this pointer is given is visible to the compiler (i.e. included in the current compilation unit), then the compiler can include its behavior in the analysis. We weren't given that function in this question (the constructor and member functions of class task), but it seems likely that the only thing that happens is that said pointer is stored in a subobject which is also not reachable from outside the constructor.
"Foul!", you cry, "I passed the address of that task subobject to a library CreateThread function, therefore it is reachable and through it, the main object is reachable." Ah, but you do not comprehend the mysteries of the "strict aliasing rules". That library function does not accept a parameter of type task *, now does it? And being a parameter whose type is perhaps intptr_t, but definitely neither task * nor char *, the compiler is permitted to assume, for purposes of as-if optimization, that it does not point to a task object (even if it clearly does). And if it does not point to a task object, and the only place our this pointer got stored is in a task member subobject, then it cannot be used to make virtual calls to this, so the compiler may legitimately delay setting up the virtual call mechanism.
But that's not all. Even if the compiler does set up the virtual call mechanism on schedule, the CPU memory model only guarantees that the change is visible to the current CPU core. Writes may become visible to other CPU cores in a completely different order. Now, the library create-thread function ought to introduce a memory barrier that constrains CPU write reordering, but the fact that Koz's answer, which introduces a critical section (and a critical section definitely includes a memory barrier), changes the behavior suggests that perhaps no memory barrier was present in the original code.
And CPU write reordering can delay not only the v-table pointer write, but also the storage of the this pointer into the task subobject.
I hope you have enjoyed this guided tour of one small corner of the cave of "multithreaded programming is hard".
printf is not, AFAIK, guaranteed to be thread safe. Try surrounding the printf with a critical section.
To do this, call InitializeCriticalSection inside the iDataLinkLayer class. Then wrap each printf in an EnterCriticalSection / LeaveCriticalSection pair. This will prevent both functions from entering the printf simultaneously. (A sketch follows.)
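A rough sketch of that arrangement (here the critical section lives in the concrete DataLinkLayer rather than the abstract interface, and the member name printLock is made up):
#include <windows.h>
#include <cstdio>

class DataLinkLayer /* : public iDataLinkLayer */
{
public:
    DataLinkLayer()  { InitializeCriticalSection(&printLock); }
    ~DataLinkLayer() { DeleteCriticalSection(&printLock); }

    void from_task_1()
    {
        EnterCriticalSection(&printLock);   // serialize the printf calls
        printf("Test data Task 1\n");
        LeaveCriticalSection(&printLock);
    }

    void from_task_2()
    {
        EnterCriticalSection(&printLock);
        printf("Test data Task 2\n");
        LeaveCriticalSection(&printLock);
    }

private:
    CRITICAL_SECTION printLock;
};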
Edit: Try changing this code:
dll_task_1* task1 = new task(this);
dll_task_2* task2 = new task(this);
to
dll_task_1* task1 = new dll_task_1(this);
dll_task_2* task2 = new dll_task_2(this);
I'm guessing that task is in fact the base class of dll_task_1 and dll_task_2... so, more than anything, I'm surprised it compiles...
I think it's not always safe to use 'this' (i.e. to call a member function) before the end of the constructor. It could be that the tasks are calling a member function of DataLinkLayer before the end of the DataLinkLayer constructor, especially if that member function is virtual:
http://www.parashift.com/c++-faq-lite/ctors.html#faq-10.7
I wanted to comment on the creation of the DataLinkLayer.
When I call the DataLinkLayer constructor from main:
int main () {
    DataLinkLayer* dataLinkLayer = new DataLinkLayer();
    while(true); // to keep the main thread running
}
I, of course, do not destruct the object; that is the first thing. Now, inside the DataLinkLayer constructor I initialize many other object instances (not only these two tasks) and pass the dataLinkLayer pointer (using this) to most of them. This is legal, as far as I am concerned. To put it further - it compiles and runs as expected.
What I became curious about is the overall design idea that I am following (if any :) ).
The DataLinkLayer is a parent class that is accessed by several tasks which try to modify its parameters or perform some other processing. Since I want everything to remain as decoupled as possible, I provide only interfaces for the accessors and encapsulate the data so that I don't have any global variables, friend functions etc.
It would have been a pretty easy task if multithreading were not involved. I believe I will encounter many other pitfalls along the way.
Feel free to discuss it please, and thanks for your generous comments!
UPD:
Speaking of passing the iDataLinkLayer interface pointer to the tasks - is this a good way to do it? In Java it would be quite usual to use composition or the so-called strategy pattern to keep things decoupled. However, I am not 100% sure whether it is a good solution in C++... Any suggestions/comments on it?

What exactly is a reentrant function?

Most of the time, the definition of reentrancy is quoted from Wikipedia:
A computer program or routine is described as reentrant if it can be safely called again before its previous invocation has been completed (i.e. it can be safely executed concurrently). To be reentrant, a computer program or routine:
Must hold no static (or global) non-constant data.
Must not return the address to static (or global) non-constant data.
Must work only on the data provided to it by the caller.
Must not rely on locks to singleton resources.
Must not modify its own code (unless executing in its own unique thread storage).
Must not call non-reentrant computer programs or routines.
How is safely defined?
If a program can be safely executed concurrently, does it always mean that it is reentrant?
What exactly is the common thread between the six points mentioned that I should keep in mind while checking my code for reentrant capabilities?
Also,
Are all recursive functions reentrant?
Are all thread-safe functions reentrant?
Are all recursive and thread-safe functions reentrant?
While writing this question, one thing comes to mind:
Are the terms like reentrance and thread safety absolute at all i.e. do they have fixed concrete definitions? For, if they are not, this question is not very meaningful.
1. How is safely defined?
Semantically. In this case, this is not a hard-defined term. It just means "you can do that without risk".
2. If a program can be safely executed concurrently, does it always mean that it is reentrant?
No.
For example, let's have a C++ function that takes both a lock, and a callback as a parameter:
#include <mutex>
typedef void (*callback)();
std::mutex m;
void foo(callback f)
{
    m.lock();
    // use the resource protected by the mutex
    if (f) {
        f();
    }
    // use the resource protected by the mutex
    m.unlock();
}
Another function could well need to lock the same mutex:
void bar()
{
    foo(nullptr);
}
At first sight, everything seems ok… But wait:
int main()
{
    foo(bar);
    return 0;
}
If the lock on mutex is not recursive, then here's what will happen, in the main thread:
main will call foo.
foo will acquire the lock.
foo will call bar, which will call foo.
the 2nd foo will try to acquire the lock, fail and wait for it to be released.
Deadlock.
Oops…
Ok, I cheated, using the callback thing. But it's easy to imagine more complex pieces of code having a similar effect.
3. What exactly is the common thread between the six points mentioned that I should keep in mind while checking my code for reentrant capabilities?
You can smell a problem if your function has/gives access to a modifiable persistent resource, or has/gives access to a function that smells.
(Ok, 99% of our code should smell, then… See last section to handle that…)
So, studying your code, one of those points should alert you:
The function has state (i.e. it accesses a global variable, or even a class member variable).
This function can be called by multiple threads, or could appear twice in the stack while the process is executing (i.e. the function could call itself, directly or indirectly). Functions taking callbacks as parameters smell a lot.
Note that non-reentrancy is viral: a function that could call a possibly non-reentrant function cannot be considered reentrant.
Note, too, that C++ methods smell because they have access to this, so you should study the code to be sure they have no funny interaction.
4.1. Are all recursive functions reentrant?
No.
In multithreaded cases, a recursive function accessing a shared resource could be called by multiple threads at the same moment, resulting in bad/corrupted data.
In singlethreaded cases, a recursive function could use a non-reentrant function (like the infamous strtok), or use global data without handling the fact the data is already in use. So your function is recursive because it calls itself directly or indirectly, but it can still be recursive-unsafe.
4.2. Are all thread-safe functions reentrant?
In the example above, I showed how an apparently threadsafe function was not reentrant. OK, I cheated because of the callback parameter. But then, there are multiple ways to deadlock a thread by having it acquire twice a non-recursive lock.
4.3. Are all recursive and thread-safe functions reentrant?
I would say "yes" if by "recursive" you mean "recursive-safe".
If you can guarantee that a function can be called simultaneously by multiple threads, and can call itself, directly or indirectly, without problems, then it is reentrant.
The problem is evaluating this guarantee… ^_^
5. Are the terms like reentrance and thread safety absolute at all, i.e. do they have fixed concrete definitions?
I believe they do, but then, evaluating whether a function is thread-safe or reentrant can be difficult. This is why I used the term smell above: you can find that a function is not reentrant, but it can be difficult to be sure that a complex piece of code is reentrant.
6. An example
Let's say you have an object, with one method that needs to use a resource:
struct MyStruct
{
    P * p;
    void foo()
    {
        if (this->p == nullptr)
        {
            this->p = new P();
        }
        // lots of code, some using this->p
        if (this->p != nullptr)
        {
            delete this->p;
            this->p = nullptr;
        }
    }
};
The first problem is that if somehow this function is called recursively (i.e. this function calls itself, directly or indirectly), the code will probably crash, because this->p will be deleted at the end of the last call, and still probably be used before the end of the first call.
Thus, this code is not recursive-safe.
We could use a reference counter to correct this:
struct MyStruct
{
    size_t c;
    P * p;
    void foo()
    {
        if (c == 0)
        {
            this->p = new P();
        }
        ++c;
        // lots of code, some using this->p
        --c;
        if (c == 0)
        {
            delete this->p;
            this->p = nullptr;
        }
    }
};
This way, the code becomes recursive-safe… But it is still not reentrant because of multithreading issues: We must be sure the modifications of c and of p will be done atomically, using a recursive mutex (not all mutexes are recursive):
#include <mutex>
struct MyStruct
{
    std::recursive_mutex m;
    size_t c;
    P * p;
    void foo()
    {
        m.lock();
        if (c == 0)
        {
            this->p = new P();
        }
        ++c;
        m.unlock();
        // lots of code, some using this->p
        m.lock();
        --c;
        if (c == 0)
        {
            delete this->p;
            this->p = nullptr;
        }
        m.unlock();
    }
};
And of course, this all assumes the lots of code is itself reentrant, including the use of p.
And the code above is not even remotely exception-safe, but this is another story… ^_^
7. Hey 99% of our code is not reentrant!
It is quite true for spaghetti code. But if you partition your code correctly, you will avoid reentrancy problems.
7.1. Make sure all functions have NO state
They must only use the parameters, their own local variables, other functions without state, and return copies of the data if they return at all.
7.2. Make sure your object is "recursive-safe"
An object method has access to this, so it shares a state with all the methods of the same instance of the object.
So, make sure the object can be used at one point in the stack (i.e. calling method A), and then, at another point (i.e. calling method B), without corrupting the whole object. Design your object to make sure that upon exiting a method, the object is stable and correct (no dangling pointers, no contradicting member variables, etc.).
7.3. Make sure all your objects are correctly encapsulated
No one else should have access to their internal data:
// bad
int & MyObject::getCounter()
{
    return this->counter;
}
// good
int MyObject::getCounter()
{
    return this->counter;
}
// good, too
void MyObject::getCounter(int & p_counter)
{
    p_counter = this->counter;
}
Even returning a const reference could be dangerous if the user retrieves the address of the data, as some other portion of the code could modify it without the code holding the const reference being told.
7.4. Make sure the user knows your object is not thread-safe
Thus, the user is responsible to use mutexes to use an object shared between threads.
The objects from the STL are designed not to be thread-safe (for performance reasons), and thus, if a user wants to share a std::string between two threads, the user must protect access to it with concurrency primitives, as in the sketch below.
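A minimal sketch of that, with the caller supplying the mutex that guards a std::string shared by two threads (names are illustrative):
#include <mutex>
#include <string>
#include <thread>

std::string sharedLog;      // shared between threads, not thread-safe by itself
std::mutex  logMutex;       // the caller-supplied protection

void append(const std::string& line)
{
    std::lock_guard<std::mutex> lock(logMutex);  // serialize all access
    sharedLog += line;
    sharedLog += '\n';
}

int main()
{
    std::thread a([] { append("from thread a"); });
    std::thread b([] { append("from thread b"); });
    a.join();
    b.join();
}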
7.5. Make sure your thread-safe code is recursive-safe
This means using recursive mutexes if you believe the same resource can be used twice by the same thread.
"Safely" is defined exactly as the common sense dictates - it means "doing its thing correctly without interfering with other things". The six points you cite quite clearly express the requirements to achieve that.
The answers to your 3 questions are 3× "no".
Are all recursive functions reentrant?
NO!
Two simultaneous invocations of a recursive function can easily screw up each other, if they access the same global/static data, for example.
Are all thread-safe functions reentrant?
NO!
A function is thread-safe if it doesn't malfunction if called concurrently. But this can be achieved e.g. by using a mutex to block the execution of the second invocation until the first finishes, so only one invocation works at a time. Reentrancy means executing concurrently without interfering with other invocations.
Are all recursive and thread-safe functions reentrant?
NO!
See above.
The common thread:
Is the behavior well defined if the routine is called while it is interrupted?
If you have a function like this:
int add( int a , int b ) {
    return a + b;
}
Then it is not dependent upon any external state. The behavior is well defined.
If you have a function like this:
int gValue = 0;   // global state shared by every caller
int add_to_global( int a ) {
    return gValue += a;
}
The result is not well defined on multiple threads. Information could be lost if the timing was just wrong.
The simplest form of a reentrant function is something that operates exclusively on the arguments passed and constant values. Anything else takes special handling or, often, is not reentrant. And of course the arguments must not reference mutable globals.
Now I have to elaborate on my previous comment. @paercebal's answer is incorrect. In the example code, didn't anyone notice that the mutex, which was supposed to be a parameter, wasn't actually passed in?
I dispute the conclusion, I assert: for a function to be safe in the presence of concurrency it must be re-entrant. Therefore concurrent-safe (usually written thread-safe) implies re-entrant.
Neither thread safe nor re-entrant have anything to say about arguments: we're talking about concurrent execution of the function, which can still be unsafe if inappropriate parameters are used.
For example, memcpy() is thread-safe and re-entrant (usually). Obviously it will not work as expected if called with pointers to the same targets from two different threads. That's the point of the SGI definition, placing the onus on the client to ensure accesses to the same data structure are synchronised by the client.
It is important to understand that in general it is nonsense to have thread-safe operation include the parameters. If you've done any database programming you will understand. The concept of what is "atomic" and might be protected by a mutex or some other technique is necessarily a user concept: processing a transaction on a database can require multiple un-interrupted modifications. Who can say which ones need to be kept in sync but the client programmer?
The point is that "corruption" doesn't have to be messing up the memory on your computer with unserialised writes: corruption can still occur even if all individual operations are serialised. It follows that when you're asking if a function is thread-safe, or re-entrant, the question means for all appropriately separated arguments: using coupled arguments does not constitute a counter-example.
There are many programming systems out there: OCaml is one, and I think Python as well, which have lots of non-reentrant code in them but use a global lock to interleave thread access. These systems are not re-entrant and they're not thread-safe or concurrent-safe; they operate safely simply because they prevent concurrency globally.
A good example is malloc. It is not re-entrant and not thread-safe, because it has to access a global resource (the heap). Using locks doesn't make it safe: it's definitely not re-entrant. If the interface to malloc had been designed properly, it would be possible to make it re-entrant and thread-safe:
malloc(heap*, size_t);
Now it can be safe because it transfers the responsibility for serialising shared access to a single heap to the client. In particular no work is required if there are separate heap objects. If a common heap is used, the client has to serialise access. Using a lock inside the function is not enough: just consider a malloc locking a heap* and then a signal comes along and calls malloc on the same pointer: deadlock: the signal can't proceed, and the client can't either because it is interrupted.
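To illustrate the shape of such an interface, here is a hypothetical sketch (the my_heap/my_malloc names and the trivial bump-pointer scheme are made up purely for illustration): all state lives in the heap object the caller passes in, the function touches no globals and takes no locks, and serializing concurrent use of any one heap is left to the caller.
#include <cstddef>

// Hypothetical re-entrant allocator sketch (alignment ignored for brevity).
struct my_heap
{
    char*       base;       // start of the arena
    std::size_t capacity;   // total bytes available
    std::size_t used;       // bytes handed out so far
};

void* my_malloc(my_heap* h, std::size_t n)
{
    if (h->capacity - h->used < n)
        return nullptr;             // arena exhausted
    void* p = h->base + h->used;    // bump-pointer allocation, no global state
    h->used += n;
    return p;
}

// A thread that owns its own my_heap needs no locking at all; threads that
// deliberately share one my_heap must serialize their calls themselves.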
Generally speaking, locks do not make things thread-safe... they actually destroy safety by inappropriately trying to manage a resource that is owned by the client. Locking has to be done by the object manufacturer; that's the only code that knows how many objects are created and how they will be used.
The "common thread" (pun intended!?) amongst the points listed is that the function must not do anything that would affect the behaviour of any recursive or concurrent calls to the same function.
So for example, static data is an issue because it is owned by all threads; if one call modifies a static variable, then all threads use the modified data, thus affecting their behaviour. Self-modifying code (although rarely encountered, and in some cases prevented) would be a problem, because although there are multiple threads, there is only one copy of the code; the code is essentially static data too.
Essentially to be re-entrant, each thread must be able to use the function as if it were the only user, and that is not the case if one thread can affect the behaviour of another in a non-deterministic manner. Primarily this involves each thread having either separate or constant data that the function works on.
All that said, point (1) is not necessarily true; for example, you might legitimately and by design use a static variable to retain a recursion count to guard against excessive recursion or to profile an algorithm.
A thread-safe function need not be reentrant; it may achieve thread safety by specifically preventing reentrancy with a lock, and point (6) says that such a function is not reentrant. Regarding point (6), a function that calls a thread-safe function that locks is not safe for use in recursion (it will deadlock), and is therefore not said to be reentrant, though it may nonetheless be safe for concurrency, and would still be re-entrant in the sense that multiple threads can have their program counters in such a function simultaneously (just not within the locked region). Maybe this helps to distinguish thread-safety from reentrancy (or maybe adds to your confusion!).
The answers to your "Also" questions are "No", "No" and "No". Just because a function is recursive and/or thread-safe, that doesn't make it re-entrant.
Each of these type of function can fail on all the points you quote. (Though I'm not 100% certain of point 5).
A non-reentrant function maintains a static context of its own. On the first call it creates a new context for you, and on subsequent calls you do not pass the state in again; the function relies on the saved context, which is convenient for things like token analysis, e.g. strtok in C. If you have not cleared the context, there might be errors.
/* strtok example */
#include <stdio.h>
#include <string.h>
int main ()
{
    char str[] ="- This, a sample string.";
    char * pch;
    printf ("Splitting string \"%s\" into tokens:\n",str);
    pch = strtok (str," ,.-");
    while (pch != NULL)
    {
        printf ("%s\n",pch);
        pch = strtok (NULL, " ,.-");
    }
    return 0;
}
In contrast, a reentrant function can be called at any time and gives the same result without side effects, because it keeps no hidden context.
From the thread-safety point of view, thread-safe just means that only one modification of shared (public) data happens at a time in the current process, so you should add a lock guard to ensure only one change to a shared field at a time.
So thread safety and reentrancy are two different things seen from different viewpoints: reentrancy is about not depending on (or clearing) hidden per-call context, such as the tokenizer state above; thread safety is about ordering access to shared data.
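For contrast, POSIX provides strtok_r, which moves the hidden tokenizer state into a caller-owned pointer; that is exactly what makes it reentrant (sketch below, POSIX-only):
/* strtok_r example: the saveptr argument carries the per-call context */
#include <stdio.h>
#include <string.h>
int main ()
{
    char str[] = "- This, a sample string.";
    char * saveptr;
    char * pch = strtok_r (str, " ,.-", &saveptr);
    while (pch != NULL)
    {
        printf ("%s\n", pch);
        pch = strtok_r (NULL, " ,.-", &saveptr);
    }
    return 0;
}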
The terms "Thread-safe" and "re-entrant" mean only and exactly what their definitions say. "Safe" in this context means only what the definition you quote below it says.
"Safe" here certainly doesn't mean safe in the broader sense that calling a given function in a given context won't totally hose your application. Altogether, a function might reliably produce a desired effect in your multi-threaded application but not qualify as either re-entrant or thread-safe according to the definitions. Oppositely, you can call re-entrant functions in ways that will produce a variety of undesired, unexpected and/or unpredictable effects in your multi-threaded application.
A recursive function can be anything, and re-entrant has a stronger definition than thread-safe, so the answers to your numbered questions are all no.
Reading the definition of re-entrant, one might summarize it as meaning a function which will not modify anything beyond what you call it to modify. But you shouldn't rely on only the summary.
Multi-threaded programming is just extremely difficult in the general case. Knowing which parts of one's code are re-entrant is only a part of this challenge. Thread safety is not additive. Rather than trying to piece together re-entrant functions, it's better to use an overall thread-safe design pattern and use this pattern to guide your use of every thread and shared resource in your program.

How do multiple C++ threads execute on a class method?

Let's say we have a C++ class like:
class MyClass
{
    void processArray( <an array of 255 integers> )
    {
        int i ;
        for (i=0;i<255;i++)
        {
            // do something with values in the array
        }
    }
};
and one instance of the class like:
MyClass myInstance ;
and 2 threads which call the processArray method of that instance (depending on how the system executes the threads, probably in a completely irregular order). There is no mutex lock used in that scope, so both threads can enter.
My question is: what happens to i? Does each thread get its own i, or would each entering thread modify i in the for loop, causing i to change weirdly all the time?
i is allocated on the stack. Since each thread has its own separate stack, each thread gets its own copy of i.
Be careful. In the example provided the method processArray seems to be reentrant (it's not clear what happens in // do something with values in the array). If so, no race occurs while two or more threads invoke it simultaneously and therefore it's safe to call it without any locking mechanism.
To enforce this, you could mark both the instance and the method with the volatile qualifier, to let users know that no lock is required.
Andrei Alexandrescu published an interesting article about the volatile qualifier and how it can be used to write correct multithreaded classes. The article is available here:
http://www.ddj.com/cpp/184403766
Since i is a local variable it is stored on the thread's own private stack. Hence, you do not need to protect i with a critical section.
As Adam said, i is a variable stored on the stack and the arguments are passed in, so this is safe. Where you have to be careful and apply mutexes or other synchronization mechanisms is if you are accessing shared member variables of the same instance of the class, or global variables in the program (even scoped statics); see the sketch below.
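A small sketch of that distinction, using a hypothetical running total as the shared member that needs protection while the loop counter stays per-thread:
#include <mutex>

class MyClass
{
public:
    void processArray(const int (&values)[255])
    {
        int localSum = 0;                         // per-thread: lives on each thread's own stack
        for (int i = 0; i < 255; i++)
            localSum += values[i];

        std::lock_guard<std::mutex> lock(totalMutex_);
        total_ += localSum;                       // shared member: must be protected
    }

private:
    std::mutex totalMutex_;
    long total_ = 0;                              // shared by all threads using this instance
};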

How can I create a thread-safe singleton pattern in Windows?

I've been reading about thread-safe singleton patterns here:
http://en.wikipedia.org/wiki/Singleton_pattern#C.2B.2B_.28using_pthreads.29
And it says at the bottom that the only safe way is to use pthread_once - which isn't available on Windows.
Is that the only way of guaranteeing thread safe initialisation?
I've read this thread on SO:
Thread safe lazy construction of a singleton in C++
And it seems to hint at an atomic OS-level compare-and-swap function, which I assume on Windows is:
http://msdn.microsoft.com/en-us/library/ms683568.aspx
Can this do what I want?
Edit: I would like lazy initialisation and for there to only ever be one instance of the class.
Someone on another site mentioned using a global inside a namespace (and he described a singleton as an anti-pattern) - how can it be an "anti-pattern"?
Accepted Answer:
I've accepted Josh's answer as I'm using Visual Studio 2008 - NB: For future readers, if you aren't using this compiler (or 2005) - Don't use the accepted answer!!
Edit:
The code works fine except the return statement - I get an error:
error C2440: 'return' : cannot convert from 'volatile Singleton *' to 'Singleton *'.
Should I modify the return value to be volatile Singleton *?
Edit: Apparently const_cast<> will remove the volatile qualifier. Thanks again to Josh.
A simple way to guarantee cross-platform thread safe initialization of a singleton is to perform it explicitly (via a call to a static member function on the singleton) in the main thread of your application before your application starts any other threads (or at least any other threads that will access the singleton).
Ensuring thread safe access to the singleton is then achieved in the usual way with mutexes/critical sections.
Lazy initialization can also be achieved using a similar mechanism. The usual problem encountered with this is that the mutex required to provide thread-safety is often initialized in the singleton itself which just pushes the thread-safety issue to initialization of the mutex/critical section. One way to overcome this issue is to create and initialize a mutex/critical section in the main thread of your application then pass it to the singleton via a call to a static member function. The heavyweight initialization of the singleton can then occur in a thread-safe manner using this pre-initialized mutex/critical section. For example:
// A critical section guard - create on the stack to provide
// automatic locking/unlocking even in the face of uncaught exceptions
class Guard {
private:
    LPCRITICAL_SECTION CriticalSection;
public:
    Guard(LPCRITICAL_SECTION CS) : CriticalSection(CS) {
        EnterCriticalSection(CriticalSection);
    }
    ~Guard() {
        LeaveCriticalSection(CriticalSection);
    }
};
// A thread-safe singleton
class Singleton {
private:
    static Singleton* Instance;
    static CRITICAL_SECTION InitLock;
    CRITICAL_SECTION InstanceLock;
    Singleton() {
        // Time consuming initialization here ...
        InitializeCriticalSection(&InstanceLock);
    }
    ~Singleton() {
        DeleteCriticalSection(&InstanceLock);
    }
public:
    // Not thread-safe - to be called from the main application thread
    static void Create() {
        InitializeCriticalSection(&InitLock);
        Instance = NULL;
    }
    // Not thread-safe - to be called from the main application thread
    static void Destroy() {
        delete Instance;
        DeleteCriticalSection(&InitLock);
    }
    // Thread-safe lazy initializer
    static Singleton* GetInstance() {
        Guard lock(&InitLock); // named guard object so the lock is held to the end of the function
        if (Instance == NULL) {
            Instance = new Singleton;
        }
        return Instance;
    }
    // Thread-safe operation
    void doThreadSafeOperation() {
        Guard lock(&InstanceLock); // named guard object, not a discarded temporary
        // Perform thread-safe operation
    }
};
// In a .cpp file: definitions of the static members
Singleton* Singleton::Instance = NULL;
CRITICAL_SECTION Singleton::InitLock;
However, there are good reasons to avoid the use of singletons altogether (and why they are sometimes referred to as an anti-pattern):
They are essentially glorified global variables
They can lead to high coupling between disparate parts of an application
They can make unit testing more complicated or impossible (due to the difficulty of swapping real singletons for fake implementations)
An alternative is to make use of a 'logical singleton' whereby you create and initialise a single instance of a class in the main thread and pass it to the objects which require it. This approach can become unwieldy where there are many objects which you want to create as singletons. In this case the disparate objects can be bundled into a single 'Context' object which is then passed around where necessary.
If you are using Visual C++ 2005/2008 you can use the double-checked locking pattern, since "volatile variables behave as fences". This is the most efficient way to implement a lazy-initialized singleton.
From MSDN Magazine:
Singleton* GetSingleton()
{
    volatile static Singleton* pSingleton = 0;
    if (pSingleton == NULL)
    {
        EnterCriticalSection(&cs);
        if (pSingleton == NULL)
        {
            try
            {
                pSingleton = new Singleton();
            }
            catch (...)
            {
                // Something went wrong.
            }
        }
        LeaveCriticalSection(&cs);
    }
    return const_cast<Singleton*>(pSingleton);
}
Whenever you need access to the singleton, just call GetSingleton(). The first time it is called, the static pointer will be initialized. After it's initialized, the NULL check will prevent locking for just reading the pointer.
DO NOT use this on just any compiler, as it's not portable. The standard makes no guarantees on how this will work. Visual C++ 2005 explicitly adds to the semantics of volatile to make this possible.
You'll have to declare and initialize the CRITICAL_SECTION (cs above) elsewhere in your code; a sketch follows. But that initialization is cheap, so lazy initialization is usually not important.
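For completeness, one possible arrangement for that declaration and one-time setup (the InitSingletonLock name is made up; the call just has to happen while the program is still single-threaded):
#include <windows.h>

CRITICAL_SECTION cs;   // the lock used inside GetSingleton()

// Call once, early, before any other thread is started.
void InitSingletonLock()
{
    InitializeCriticalSection(&cs);
}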
While I like the accepted solution, I just found another promising lead and thought I should share it here: One-Time Initialization (Windows)
You can use an OS primitive such as a mutex or critical section to ensure thread-safe initialization; however, this will incur an overhead each time your singleton pointer is accessed (due to acquiring a lock). It's also non-portable.
There is one clarifying point you need to consider for this question. Do you require ...
That one and only one instance of a class is ever actually created
Many instances of a class can be created but there should only be one true definitive instance of the class
There are many samples on the web to implement these patterns in C++. Here's a Code Project Sample
The following explains how to do it in C#, but the exact same concept applies to any programming language that would support the singleton pattern
http://www.yoda.arachsys.com/csharp/singleton.html
What you need to decide is whether you want lazy initialization or not. Lazy initialization means that the object contained inside the singleton is created on the first call to it, e.g.:
MySingleton::getInstance()->doWork();
If that call isn't made until later on, there is a danger of a race condition between the threads, as explained in the article. However, if you put
MySingleton::getInstance()->initSingleton();
at the very beginning of your code, where you can assume it is thread-safe, then you are no longer lazily initializing, and you will require "some" more processing power when your application starts. However, it will solve a lot of headaches about race conditions if you do so.
If you are looking for a more portable and easier solution, you could turn to Boost.
boost::call_once can be used for thread-safe initialization.
It's pretty simple to use, and it will be part of the next C++0x standard (as std::call_once).
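A minimal sketch of that approach using the standardized form, std::call_once (boost::call_once is analogous); the class and member names are illustrative:
#include <mutex>   // std::once_flag, std::call_once

class MySingleton
{
public:
    static MySingleton& getInstance()
    {
        // The lambda runs exactly once, even if several threads race in here.
        std::call_once(initFlag_, [] { instance_ = new MySingleton(); });
        return *instance_;
    }
private:
    MySingleton() = default;
    static std::once_flag initFlag_;
    static MySingleton* instance_;
};

// Out-of-class definitions (one translation unit):
std::once_flag MySingleton::initFlag_;
MySingleton* MySingleton::instance_ = nullptr;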
The question does not say whether the singleton must be lazily constructed or not.
Since many answers assume that, let me discuss that case first:
Given the fact that the language itself is not thread-aware, and given the optimizations compilers perform, writing a portable, reliable C++ singleton is very hard (if not impossible); see "C++ and the Perils of Double-Checked Locking" by Scott Meyers and Andrei Alexandrescu.
I've seen many of the answers resort to a sync object on the Windows platform by using a CRITICAL_SECTION, but a CRITICAL_SECTION can only be used by the threads of one single process, which may not cover every scenario.
MSDN cite: "The threads of a single process can use a critical section object for mutual-exclusion synchronization. ".
And http://msdn.microsoft.com/en-us/library/windows/desktop/ms682530(v=vs.85).aspx
clarifies it further:
A critical section object provides synchronization similar to that provided by a mutex object, except that a critical section can be used only by the threads of a single process.
Now, if "lazy-constructed" is not a requirement, the following solution is both cross-module safe and thread-safe, and even portable:
struct X { };
X * get_X_instance()
{
    static X x;
    return &x;
}
extern int X_singleton_helper = (get_X_instance(), 1);
It's cross-module-safe because we use locally-scoped static object instead of file/namespace scoped global object.
It's thread-safe because X_singleton_helper must be assigned its correct value before entering main or DllMain (it is also not lazily constructed because of this fact); in this expression the comma is an operator, not punctuation.
"extern" is used explicitly here to prevent the compiler from optimizing it out (per the concerns in Scott Meyers' article, the big enemy is the optimizer), and also to keep static-analysis tools such as PC-lint quiet. "Before main/DllMain" is what Scott Meyers calls the "single-threaded startup part" in Effective C++, 3rd edition, Item 4.
However, I'm not entirely sure whether the compiler is allowed to optimize the call to get_X_instance() out according to the language standard; please comment.
There are many ways to do thread-safe singleton initialization on Windows. In fact, some of them are even cross-platform. In the SO thread that you linked to, they were looking for a singleton that is lazily constructed in C, which is a bit more specific and can be a bit trickier to do right, given the intricacies of the memory model you are working under.