Do I need a mutex for reading? - c++

I have a class that has a state (a simple enum) and that is accessed from two threads. For changing state I use a mutex (boost::mutex). Is it safe to check the state (e.g. compare state_ == ESTABLISHED) or do I have to use the mutex in this case too? In other words do I need the mutex when I just want to read a variable which could be concurrently written by another thread?

It depends.
The C++ language says nothing about threads or atomicity.
But on most modern CPU's, reading an integer is an atomic operation, which means that you will always read a consistent value, even without a mutex.
However, without a mutex, or some other form of synchronization, the compiler and CPU are free to reorder reads and writes, so anything more complex, anything involving accessing multiple variables, is still unsafe in the general case.
Assuming the writer thread updates some data, and then sets an integer flag to inform other threads that data is available, this could be reordered so the flag is set before updating the data. Unless you use a mutex or another form of memory barrier.
So if you want correct behavior, you don't need a mutex as such, and it's no problem if another thread writes to the variable while you're reading it. It'll be atomic unless you're working on a very unusual CPU. But you do need a memory barrier of some kind to prevent reordering in the compiler or CPU.

You have two threads, they exchange information, yes you need a mutex and you probably also need a conditional wait.
In your example (compare state_ == ESTABLISHED) indicates that thread #2 is waiting for thread #1 to initiate a connection/state. Without a mutex or conditionals/events, thread #2 has to poll the status continously.
Threads is used to increase performance (or improve responsiveness), polling usually results in decreased performance, either by consuming a lot of CPU or by introducing latencey due to the poll interval.

Yes. If thread a reads a variable while thread b is writing to it, you can read an undefined value. The read and write operation are not atomic, especially on a multi-processor system.

Generally speaking you don't, if your variable is declared with "volatile". And ONLY if it is a single variable - otherwise you should be really careful about possible races.

actually, there is no reason to lock access to the object for reading. you only want to lock it while writing to it. this is exactly what a reader-writer lock is. it doesn't lock the object as long as there are no write operations. it improves performance and prevents deadlocks. see the following links for more elaborate explanations :
wikipedia
codeproject

The access to the enum ( read or write) should be guarded.
Another thing:
If the thread contention is less and the threads belong to same process then Critical section would be better than mutex.

Related

c++ Multithreading - how writes to a shared variable are handled

As the title suggest - in C++, how writes to a shared variable in threads are handled? Do threads work with their own copy of data? I know what is shared between threads and what is not but I could not find anything on how writes to a variable are handled in multithreading scenario.
Let me give you a little background. In Java, threads work on their own copy of data. We add synchronization to ensure that the writes of thread are visible to other threads. In short, synchronization ensures that the writes are flushed to the main memory.
Is similar construct valid for c/c++? If a thread is writing to a variable, are those writes immediately updated to the main memory or is it the OS that governs how the write to main memory will occur? Do we need to add synchronization to flush those updates?
For e.g. If two threads are incrementing a counter, and I can guarantee that the first thread will always finish first with that particular task (somehow by using delayed start of second thread, may not be able to guarantee it in practice) and there won't be any mangling of instructions between the two threads, then would I always see the affect of both the threads?
Thread 1 { i++; ... do something that takes long time}
Thread 2 { i++; ... do something that takes long time}
Thanks,
RG
Do threads work with their own copy of data?
No, not unless the data is declared thread_local.
synchronization ensures that the writes are flushed to the main memory.
Is similar construct valid for c/c++? If a thread is writing to a variable, are those writes immediately updated to the main memory
Yes, but unless you use synchronization of some sort, the changes may or may not be seen - or read incorrectly, by other threads.
Do we need to add synchronization to flush those updates?
Yes, but I'm not sure I'd use the word flush. "Synchronization" is good.
For e.g. If two threads are incrementing a counter, and I can guarantee that the first thread will always finish first ...
If you can guarantee that, you have synchronization.
would I always see the affect of both the threads?
Yes, if you have the guarantee you talk about.
C++ has no guarantees about how atomicity of reads/writes and flushes to memory work, unless you use the atomic primitives such as std::atomic. Be very careful here. Lock-free programming is extremely hard to get right.
My advice is use locks.

C++ 17 shared_mutex : why read lock is even necessary for reading threads [duplicate]

I have a class that has a state (a simple enum) and that is accessed from two threads. For changing state I use a mutex (boost::mutex). Is it safe to check the state (e.g. compare state_ == ESTABLISHED) or do I have to use the mutex in this case too? In other words do I need the mutex when I just want to read a variable which could be concurrently written by another thread?
It depends.
The C++ language says nothing about threads or atomicity.
But on most modern CPU's, reading an integer is an atomic operation, which means that you will always read a consistent value, even without a mutex.
However, without a mutex, or some other form of synchronization, the compiler and CPU are free to reorder reads and writes, so anything more complex, anything involving accessing multiple variables, is still unsafe in the general case.
Assuming the writer thread updates some data, and then sets an integer flag to inform other threads that data is available, this could be reordered so the flag is set before updating the data. Unless you use a mutex or another form of memory barrier.
So if you want correct behavior, you don't need a mutex as such, and it's no problem if another thread writes to the variable while you're reading it. It'll be atomic unless you're working on a very unusual CPU. But you do need a memory barrier of some kind to prevent reordering in the compiler or CPU.
You have two threads, they exchange information, yes you need a mutex and you probably also need a conditional wait.
In your example (compare state_ == ESTABLISHED) indicates that thread #2 is waiting for thread #1 to initiate a connection/state. Without a mutex or conditionals/events, thread #2 has to poll the status continously.
Threads is used to increase performance (or improve responsiveness), polling usually results in decreased performance, either by consuming a lot of CPU or by introducing latencey due to the poll interval.
Yes. If thread a reads a variable while thread b is writing to it, you can read an undefined value. The read and write operation are not atomic, especially on a multi-processor system.
Generally speaking you don't, if your variable is declared with "volatile". And ONLY if it is a single variable - otherwise you should be really careful about possible races.
actually, there is no reason to lock access to the object for reading. you only want to lock it while writing to it. this is exactly what a reader-writer lock is. it doesn't lock the object as long as there are no write operations. it improves performance and prevents deadlocks. see the following links for more elaborate explanations :
wikipedia
codeproject
The access to the enum ( read or write) should be guarded.
Another thing:
If the thread contention is less and the threads belong to same process then Critical section would be better than mutex.

C/C++ arrays with threads - do I need to use mutexes or locks?

I am new to using threads and have read a lot about how data is shared and protected. But I have also not really got a good grasp of using mutexes and locks to protect data.
Below is a description of the problem I will be working on. The important thing to note is that it will be time-critical, so I need to reduce overheads as much as possible.
I have two fixed-size double arrays.
The first array will provide data for subsequent calculations.
Threads will read values from it, but it will never be modified. An element may be read at some time by any of the threads.
The second array will be used to store the results of the calculations performed by the threads. An element of this array will only ever be updated by one thread, and probably only once when the result value
is written to it.
My questions then:
Do I really need to use a mutex in a thread each time I access the data from the read-only array? If so, could you explain why?
Do I need to use a mutex in a thread when it writes to the result array even though this will be the only thread that ever writes to this element?
Should I use atomic data types, and will there be any significant time overhead if I do?
Many answers to this type of question seem to be - no, you don't need the mutex if your variables are aligned. Would my array elements in this example be aligned, or is there some way to ensure they are?
The code will be implemented on 64bit Linux. I am planning on using Boost libraries for multithreading.
I have been mulling this over and looking all over the web for days, and once posted, the answer and clear explanations came back in literally seconds. There is an "accepted answer," but all the answers and comments were equally helpful.
Do I really need to use a mutex in a thread each time I access the data from the read-only array? If so could you explain why?
No. Because the data is never modified, there cannot be synchronization problem.
Do I need to use a mutex in a thread when it writes to the result array even though this will be the only thread that ever writes to this element?
Depends.
If any other thread is going to read the element, you need synchronization.
If any thread may modify the size of the vector, you need synchronization.
In any case, take care of not writing into adjacent memory locations by different threads a lot. That could destroy the performance. See "false sharing". Considering, you probably don't have a lot of cores and therefore not a lot of threads and you say write is done only once, this is probably not going to be a significant problem though.
Should I use atomic data types and will there be any significant time over head if I do?
If you use locks (mutex), atomic variables are not necessary (and they do have overhead). If you need no synchronization, atomic variables are not necessary. If you need synchronization, then atomic variables can be used to avoid locks in some cases. In which cases can you use atomics instead of locks... is more complicated and beyond the scope of this question I think.
Given the description of your situation in the comments, it seems that no synchronization is required at all and therefore no atomics nor locks.
...Would my array elements in this example be aligned, or is there some way to ensure they are?
As pointed out by Arvid, you can request specific alignment using the alginas keyword which was introduced in c++11. Pre c++11, you may resort to compiler specific extensions: https://gcc.gnu.org/onlinedocs/gcc-5.1.0/gcc/Variable-Attributes.html
Under the two conditions given, there's no need for mutexes. Remember every use of a mutex (or any synchronization construct) is a performance overhead. So you want to avoid them as much as possible (without compromising correct code, of course).
No. Mutexes are not needed since threads are only reading the array.
No. Since each thread only writes to a distinct memory location, no race condition is possible.
No. There's no need for atomic access to objects here. In fact, using atomic objects could affect the performance negatively as it prevents the optimization possibilities such as re-ordering operations.
The only time you need to use Locks is when data is modified on a shared resource. Eg if some threads where used to write data and some used to read data (in both cases from the same resource) then you only need a lock for when writing is done. This is to prevent whats known as "race".
There is good information of race on google for when you make programs that manipulate data on a shared resource.
You are on the right track.
1) For the first array (read only) , you do not need to utilize a mutex lock for it. Since the threads are just reading not altering the data there is no way a thread can corrupt the data for another thread
2) I'm a little confused by this question. If you know that thread 1 will only write an element to array slot 1 and thread 2 will only write to array slot 2 then you do not need a mutex lock. However I'm not sure how your achieving this property. If my above statement is not correct for your situation you would definitely need a mutex lock.
3) Given the definition of atomic:
Atomic types are types that encapsulate a value whose access is guaranteed to not cause data races and can be used to synchronize memory accesses among different threads.
Key note, a mutex lock is atomic meaning that there is only 1 assembly instruction needed to grab/release a lock. If it required 2 assembly instructions to grab/release a lock, a lock would not be thread safe. For example, if thread 1 attempted to grab a lock and was switched to thread 2, thread 2 would grab the lock.
Use of atomic data types would decrease your overhead but not significantly.
4) I'm not sure how you can assure your variables are lined. Since threads can switch at any moment in your program (Your OS determines when a thread switches)
Hope this helps

Reader/Writer Locks (c++)

I have a thread that continuously read a global variable and there is another thread that occasionally update (write) global variable.
What could be the best way to do that and what would be the cost?
Is it possible if I do not put lock on read side and put a lock in writer side?
Thanks
A lock protects the resource/variable and if the readers use it, the writer also should. If the global variable is a primitive type, I would suggest you make it an atomic using std::atomic<>. If it is a complex type, like a class, you should use a lock to ensure that your readers read a consistent state.
I have had much success with spinlocks in situations where you might expect low contention. But if your readers are reading at a high rate and you have many of them. A mutex or atomic should be used.
This scenario doesn't need any locks. You need locks when multiple threads modify state or when multiple threads read some state but you want to make sure that they read an up to date version or the same version.

Why do locks work?

If the locks make sure only one thread accesses the locked data at a time, then what controls access to the locking functions?
I thought that boost::mutex::scoped_lock should be at the beginning of each of my functions so the local variables don't get modified unexpectedly by another thread, is that correct? What if two threads are trying to acquire the lock at very close times? Won't the lock's local variables used internally be corrupted by the other thread?
My question is not boost-specific but I'll probably be using that unless you recommend another.
You're right, when implementing locks you need some way of guaranteeing that two processes don't get the lock at the same time. To do this, you need to use an atomic instruction - one that's guaranteed to complete without interruption. One such instruction is test-and-set, an operation that will get the state of a boolean variable, set it to true, and return the previously retrieved state.
What this does is this allows you to write code that continually tests to see if it can get the lock. Assume x is a shared variable between threads:
while(testandset(x));
// ...
// critical section
// this code can only be executed by once thread at a time
// ...
x = 0; // set x to 0, allow another process into critical section
Since the other threads continually test the lock until they're let into the critical section, this is a very inefficient way of guaranteeing mutual exclusion. However, using this simple concept, you can build more complicated control structures like semaphores that are much more efficient (because the processes aren't looping, they're sleeping)
You only need to have exclusive access to shared data. Unless they're static or on the heap, local variables inside functions will have different instances for different threads and there is no need to worry. But shared data (stuff accessed via pointers, for example) should be locked first.
As for how locks work, they're carefully designed to prevent race conditions and often have hardware level support to guarantee atomicity. IE, there are some machine language constructs guaranteed to be atomic. Semaphores (and mutexes) may be implemented via these.
The simplest explanation is that the locks, way down underneath, are based on a hardware instruction that is guaranteed to be atomic and can't clash between threads.
Ordinary local variables in a function are already specific to an individual thread. It's only statics, globals, or other data that can be simultaneously accessed by multiple threads that needs to have locks protecting it.
The mechanism that operates the lock controls access to it.
Any locking primitive needs to be able to communicate changes between processors, so it's usually implemented on top of bus operations, i.e., reading and writing to memory. It also needs to be structured such that two threads attempting to claim it won't corrupt its state. It's not easy, but you can usually trust that any OS implemented lock will not get corrupted by multiple threads.