Do I need mutex in constructor for field? - c++

Let's assume I have a simple class A with one field in C++. This field is initialized in the constructor. Class A also has a method called doit() for modifing the value of this field. doit() will be called from multiple threads. If I have a mutex only in the doit() method, is this sufficient? Do I have a guarantee that I will never read an uninitialized field (because there is no lock in the constructor)?
Edit: I probably was not clear enough. Is there no issue involving processor cache or something similar? I mean, if there is no mutex for initializing memory region (i.e. my field) - is there no risk that the other thread will read some garbage value?

Your object can only be initialised once, and you won't be able use it before it's initialised, so you don't need a mutex there. You will however need a mutex or other suitable lock in your DoIt function, as you said this will be accessed across multiple threads.
Update for edited question: No, you don't need to worry about processor cache. You must construct your object first, before you can have a handle to it. Only once you have this handle can you pass it to other threads to be used. What I'm trying to say is, the spawned threads must start after the construction of the original object, it is impossible for it to happen the other way around!

It is not possible to call doit() on an object that is not created yet, so you do not need mutex in the constructor.
If doit() is the only method that accesses the field, then you should be fine.
If other methods of your class also access that field, even from a single thread, then you must use a mutex also in these methods.

You first need to construct the object before those pesky threads get their hands on it. The OS will allocate memory for the constructor that is only called by one thread. Ths OS looks after that allocation and therefore nothing needs to be done on your part. Hell you can even create two objects of the same class in two different threads.
You can be very conservative and use a mutex at the start of any method that used that field to lock it, and release it and the end.
Or if you understand the interactions of the various methods with the various algorithms , you can use a mutex for critical sections of code that use that field - i.e. That part of the code needs to be sure that the field is not altered by another thread during processing, but you method can release the lock after the critical section, do something else then perhaps have another critical section.

Related

Is A Member Function Thread Safe?

I have in a Server object multiple thread who are doing the same task. Those threads are init with a Server::* routine.
In this routine there is a infinite loop with some treatments.
I was wondering if it was thread safe to use the same method for multiple threads ? No wonder for the fields of the class, If I want to read or write it I will use a mutex. But what about the routine itself ?
Since a function is an address, those thread will be running in the same memory zone ?
Do I need to create a method with same code for every thread ?
Ps: I use std::mutex(&Server::Task, this)
There is no problem with two threads running the same function at the same time (whether it's a member function or not).
In terms of instructions, it's similar to if you had two threads reading the same field at the same time - that's fine, they both get the same value. It's when you have one writing and one reading, or two writing, that you can start to have race conditions.
In C++ every thread is allocated its own call stack. This means that all local variables which exist only in the scope of a given thread's call stack belong to that thread alone. However, in the case of shared data or resources, such as a global data structure or a database, it is possible for different threads to access these at the same time. One solution to this synchronization problem is to use std::mutex, which you are already doing.
While the function itself might be the same address in memory in terms of its place in the table you aren't writing to it from multiple locations, the function itself is immutable and local variables scoped inside that function will be stacked per thread.
If your writes are protected and the fetches don't pull stale data you're as safe as you could possibly need on most architectures and implementations out there.
Behind the scenes, int Server::Task(std::string arg) is very similar to int Server__Task(Server* this, std::string arg). Just like multiple threads can execute the same function, multiple threads can also execute the same member function - even with the same arguments.
A mutex ensures that no conflicting changes are made, and that each thread sees every prior change. But since code does not chance, you don't need a mutex for it, just like you don't need a mutex for string literals.

C++ - Initialize 2 copies of object owned by different threads

Suppose I have thread A and thread B. Each thread owns a copy of the same object (let's call it "Foo" for simplicity). Certain pieces of data in Foo are initialized from a file on startup. So that means that on startup, I have to set the data read from the file in both copies of Foo. If no data is read from the file, then I still want to set the same data instance in both copies of Foo and not have each thread initialize the data separately.
Due to the architecture of the software I am working on, I am unable to perform this work on construction of either copy of Foo. So I am forced to do some type of Initialize() or similar method that is called after construction of the object. Since this all happens during the initialization phase of the application, I do not need to be concerned about thread-safety, but I am still concerned about the cleanliness of this solution. Here is what I have come up with so far:
ThreadA::Start()
{
...
// Initialize() will read the data from the file if it exists. Otherwise, it will set default values.
m_Foo.Initialize();
// Now call Initialize() for thread B's copy of Foo while running in thread A but before thread B is even started. Pass in thread A's copy of Foo to set the values with. This should be safe to do since thread B has not even started yet.
m_ThreadB.GetFoo().Initialize(m_Foo);
...
}
Is this the cleanest way to initialize the data in both copies of Foo? Thanks!
I think you might be over thinking it because there are threads. If no threads are running, then you can use any method you wish to initialize them because there are no race cases. Don't even think about the threads. For the rest of this, I will call ThreadA and threadB BugbearA and BugbearB instead.
If you only have two Bugbears, and BugbearA always starts before BugbearB, then your solution is the best one can ever come up with.
If the situation is actually more complicated than you let on, I would recommend refactoring the data to have a shared object containing the data that may be initialized from a file, and an unshared object For each Bugbear. The first Bugbear to Start() initializes the shared object. All threads copy that data into their own Foo and start.
This refactor will also set you up for multithreading, if you need it later. All you have to do is provide proper synchronization for the shared object.

Simultaneous first time access of singleton class from two different threads

In my C++ project I have a singleton class. During project execution, sometimes the same singleton class is accessed from two different thread simultaneously. Resulting in two instances of the singleton class is produced, which is a problem.
How to handle such a cases?
Then it's not a singleton :-)
You'll probably need to show us some code but your basic problem will be with the synchronisation areas.
If done right, there is no way that two threads can create two objects of the class. In fact, the class itself should be the place where the singleton nature is being enforced so that erroneous clients cannot corrupt the intent.
The basic structure will be:
lock mutex
if instance doesn't exist:
instance = new object
unlock mutex
Without something like mutex protection (or critical code section or any other manner in which you can guarantee at the language/library level that two threads can't run the code simultaneously), there's a possibility that thread one may be swapped out between the check and the instantiation, leading to two possible instances of your "singleton".
And, as others will no doubt suggest, singletons may well be a bad idea. I'm not quite in the camp where every use is wrong, the usual problem is that people treat them as "god" objects. They can have their uses but there's often a better way, though I won't presume to tell you you need to change that, since I don't know your use case.
If you get two different instances in different threads you are doing something wrong. Threads, unlike processes, share their memory. So memory allocated in one thread (e.g. for an object instance) is also usable in the other.
If your singleton getting two copies its not guarded with mutex. lock a mutex when getting/accessing/setting the internal object.
I believe You are done something like this if(!_instance)_instance = new Singleton() there lies a critical section. which you need to safe guard with a mutex.
Do not use a singleton, It is a well known Anti-Pattern.
A good read:
Singletons: Solving problems you didn’t know you never had since 1995
If you still want to persist and go ahead with it for reasons known only to you, What you need is a thread safe singleton implementation, something like this:
YourClass* YourClass::getInstance()
{
MutexLocker locker(YourClass::m_mutex);
if(!m_instanceFlag)
{
m_instance = new YourClass();
m_instanceFlag = true;
}
return m_instance;
}
Where MutexLocker is a wrapper class for an normally used Mutex, which locks the mutex when creating its instance and unlocks the mutex the function ends.

Passing data structures to different threads

I have an application that will be spawning multiple threads. However, I feel there might be an issue with threads accessing data that they shouldn't be.
Here is the structure of the threaded application (sorry for the crudeness):
MainThread
/ \
/ \
/ \
Thread A Thread B
/ \ / \
/ \ / \
/ \ / \
Thread A_1 Thread A_2 Thread B_1 Thread B_2
Under each lettered thread (which could be many), there will only be two threads and they are fired of sequentially. The issue i'm having is I'm not entirely sure how to pass in a datastructure into these threads.
So, the datastructure is created in MainThread, will be modified in the lettered thread (Thread A, etc) specific to that thread and then a member variable from that datastructure is sent to Letter_Numbered threads.
Currently, the lettered thread class has a member variable and when the class is constructed, the datastructure from mainthread is passed in by reference, invoking the copy constructor so the lettered thread has it's own copy to play with.
The lettered_numbered thread simply takes in a string variable from the data structure within the lettered thread. My question is, is this accceptable? Is there a much better way to ensure each lettered thread gets its own data structure to play with?
Sorry for the somewhat poor explanation, please leave comments and i'll try to clarify.
EDIT:
So my lettered thread constructor should take the VALUE of the data structure, not the reference?
I would have each thread create it's own copy of the datastructure, e.g. you pass the structure in the constructor and then explicitly create a local copy. Then you are guaranteed that the threads have distinct copies. (You say that it's passsed by reference, and that this invokes the copy constructor. I think you mean pass by value? I feel it's better to explicitly make a copy, to leave no doubt and to make your intent clear. Otherwise someone might later come along and change your pass by value to pass by reference as a "smart optimization".)
EDIT: Removed comment about strings. For some reason, I was assuming .NET.
To ensure strings are privately owned, follow the same procedure, create a copy of the string, which you can then freely modify.
There is a pattern called Active Object Pattern wherein each object executes in its own thread. Frameworks like ACE support this. If you have access to such frameworks, you should use those. In any case, i would believe creating a new instance of an object and allowing it to exetute in its own thread is much cleaner that invoking the copy-constructor to make a copy of the object. Else see if you can fit a solution that uses Thread Local Storage.
Have you looked at boost threads?
You would basically create a callable class that has a constructor that takes the parameters the thread is to work on and then launch the thread by passing objects of your callable class, initialized and ready to go.
This is very similar to how Java implements threads and it makes a good amount of sense most of the time from a design point of view.
You aparently are making a copy of the data for each trhead and everything works? then no problem.
Here are some additional thoughts:
If data is read only, you can share a single struct and everything will be ok, as long as each read is small and fast (basic types)
If data needs to be written, but "private" (or contained) to each thread, then send a copy to each thread (what you are doing). Caveat: I assume the data is not too big and a copy does not eat to much resources.
If the data needs to be written and the new values shared between threads, then you need to think about it (read on it) and create a proper design. I like a transactional object to centralize each threads read/write operation. Like a tiny database in memory. Check on thread mutex, semaphores and critical sections). Dealing with huge data set I have used a database to centralize requests (See ODBM). You can also check existing messaging queuing libraries (like MSMQ) to have data change ordered and synchronized.
Hope this helps.
It seems unlikely that you would want each thread to operate on the data and then not at least occasionally have another thread react to what another thread has done to another thread's work on the data. If you are truly independent meaning that no other thread truly will ever care about work that another thread has done, then I suggest making a copy of the data, otherwise in the case where you will want to do work in one thread and make that result of that work available to another thread I would suggest that you, pass a reference/pointer to the object around and then protect access to it via locks so that the threads can work with it, properly, I suggest a multi-read, single writer lock implementation.

How does Multiple C++ Threads execute on a class method

let's say we have a c++ class like:
class MyClass
{
void processArray( <an array of 255 integers> )
{
int i ;
for (i=0;i<255;i++)
{
// do something with values in the array
}
}
}
and one instance of the class like:
MyClass myInstance ;
and 2 threads which call the processArray method of that instance (depending on how system executes threads, probably in a completely irregular order). There is no mutex lock used in that scope so both threads can enter.
My question is what happens to the i ? Does each thread scope has it's own "i" or would each entering thread modify i in the for loop, causing i to be changing weirdly all the time.
i is allocated on the stack. Since each thread has its own separate stack, each thread gets its own copy of i.
Be careful. In the example provided the method processArray seems to be reentrant (it's not clear what happens in // do something with values in the array). If so, no race occurs while two or more threads invoke it simultaneously and therefore it's safe to call it without any locking mechanism.
To enforce this, you could mark both the instance and the method with the volatile qualifier, to let users know that no lock is required.
It has been published an interesting article of Andrei Alexandrescu about volatile qualifier and how it can be used to write correct multithreaded classes. The article is published here:
http://www.ddj.com/cpp/184403766
Since i is a local variable it is stored on the thread's own private stack. Hence, you do not need to protect i with a critical section.
As Adam said, i is a variable stored on the stack and the arguments are passed in so this is safe. When you have to be careful and apply mutexes or other synchronization mechanisms is if you were accessing shared member variables in the same instance of the class or global variables in the program (even scoped statics).