C/C++ arrays with threads - do I need to use mutexes or locks? - c++

I am new to using threads and have read a lot about how data is shared and protected. But I have also not really got a good grasp of using mutexes and locks to protect data.
Below is a description of the problem I will be working on. The important thing to note is that it will be time-critical, so I need to reduce overheads as much as possible.
I have two fixed-size double arrays.
The first array will provide data for subsequent calculations.
Threads will read values from it, but it will never be modified. An element may be read at some time by any of the threads.
The second array will be used to store the results of the calculations performed by the threads. An element of this array will only ever be updated by one thread, and probably only once when the result value
is written to it.
My questions then:
Do I really need to use a mutex in a thread each time I access the data from the read-only array? If so, could you explain why?
Do I need to use a mutex in a thread when it writes to the result array even though this will be the only thread that ever writes to this element?
Should I use atomic data types, and will there be any significant time overhead if I do?
Many answers to this type of question seem to be - no, you don't need the mutex if your variables are aligned. Would my array elements in this example be aligned, or is there some way to ensure they are?
The code will be implemented on 64bit Linux. I am planning on using Boost libraries for multithreading.
I have been mulling this over and looking all over the web for days, and once posted, the answer and clear explanations came back in literally seconds. There is an "accepted answer," but all the answers and comments were equally helpful.

Do I really need to use a mutex in a thread each time I access the data from the read-only array? If so could you explain why?
No. Because the data is never modified, there cannot be synchronization problem.
Do I need to use a mutex in a thread when it writes to the result array even though this will be the only thread that ever writes to this element?
Depends.
If any other thread is going to read the element, you need synchronization.
If any thread may modify the size of the vector, you need synchronization.
In any case, take care of not writing into adjacent memory locations by different threads a lot. That could destroy the performance. See "false sharing". Considering, you probably don't have a lot of cores and therefore not a lot of threads and you say write is done only once, this is probably not going to be a significant problem though.
Should I use atomic data types and will there be any significant time over head if I do?
If you use locks (mutex), atomic variables are not necessary (and they do have overhead). If you need no synchronization, atomic variables are not necessary. If you need synchronization, then atomic variables can be used to avoid locks in some cases. In which cases can you use atomics instead of locks... is more complicated and beyond the scope of this question I think.
Given the description of your situation in the comments, it seems that no synchronization is required at all and therefore no atomics nor locks.
...Would my array elements in this example be aligned, or is there some way to ensure they are?
As pointed out by Arvid, you can request specific alignment using the alginas keyword which was introduced in c++11. Pre c++11, you may resort to compiler specific extensions: https://gcc.gnu.org/onlinedocs/gcc-5.1.0/gcc/Variable-Attributes.html

Under the two conditions given, there's no need for mutexes. Remember every use of a mutex (or any synchronization construct) is a performance overhead. So you want to avoid them as much as possible (without compromising correct code, of course).
No. Mutexes are not needed since threads are only reading the array.
No. Since each thread only writes to a distinct memory location, no race condition is possible.
No. There's no need for atomic access to objects here. In fact, using atomic objects could affect the performance negatively as it prevents the optimization possibilities such as re-ordering operations.

The only time you need to use Locks is when data is modified on a shared resource. Eg if some threads where used to write data and some used to read data (in both cases from the same resource) then you only need a lock for when writing is done. This is to prevent whats known as "race".
There is good information of race on google for when you make programs that manipulate data on a shared resource.

You are on the right track.
1) For the first array (read only) , you do not need to utilize a mutex lock for it. Since the threads are just reading not altering the data there is no way a thread can corrupt the data for another thread
2) I'm a little confused by this question. If you know that thread 1 will only write an element to array slot 1 and thread 2 will only write to array slot 2 then you do not need a mutex lock. However I'm not sure how your achieving this property. If my above statement is not correct for your situation you would definitely need a mutex lock.
3) Given the definition of atomic:
Atomic types are types that encapsulate a value whose access is guaranteed to not cause data races and can be used to synchronize memory accesses among different threads.
Key note, a mutex lock is atomic meaning that there is only 1 assembly instruction needed to grab/release a lock. If it required 2 assembly instructions to grab/release a lock, a lock would not be thread safe. For example, if thread 1 attempted to grab a lock and was switched to thread 2, thread 2 would grab the lock.
Use of atomic data types would decrease your overhead but not significantly.
4) I'm not sure how you can assure your variables are lined. Since threads can switch at any moment in your program (Your OS determines when a thread switches)
Hope this helps

Related

c++ Multithreading - how writes to a shared variable are handled

As the title suggest - in C++, how writes to a shared variable in threads are handled? Do threads work with their own copy of data? I know what is shared between threads and what is not but I could not find anything on how writes to a variable are handled in multithreading scenario.
Let me give you a little background. In Java, threads work on their own copy of data. We add synchronization to ensure that the writes of thread are visible to other threads. In short, synchronization ensures that the writes are flushed to the main memory.
Is similar construct valid for c/c++? If a thread is writing to a variable, are those writes immediately updated to the main memory or is it the OS that governs how the write to main memory will occur? Do we need to add synchronization to flush those updates?
For e.g. If two threads are incrementing a counter, and I can guarantee that the first thread will always finish first with that particular task (somehow by using delayed start of second thread, may not be able to guarantee it in practice) and there won't be any mangling of instructions between the two threads, then would I always see the affect of both the threads?
Thread 1 { i++; ... do something that takes long time}
Thread 2 { i++; ... do something that takes long time}
Thanks,
RG
Do threads work with their own copy of data?
No, not unless the data is declared thread_local.
synchronization ensures that the writes are flushed to the main memory.
Is similar construct valid for c/c++? If a thread is writing to a variable, are those writes immediately updated to the main memory
Yes, but unless you use synchronization of some sort, the changes may or may not be seen - or read incorrectly, by other threads.
Do we need to add synchronization to flush those updates?
Yes, but I'm not sure I'd use the word flush. "Synchronization" is good.
For e.g. If two threads are incrementing a counter, and I can guarantee that the first thread will always finish first ...
If you can guarantee that, you have synchronization.
would I always see the affect of both the threads?
Yes, if you have the guarantee you talk about.
C++ has no guarantees about how atomicity of reads/writes and flushes to memory work, unless you use the atomic primitives such as std::atomic. Be very careful here. Lock-free programming is extremely hard to get right.
My advice is use locks.

C++ 17 shared_mutex : why read lock is even necessary for reading threads [duplicate]

I have a class that has a state (a simple enum) and that is accessed from two threads. For changing state I use a mutex (boost::mutex). Is it safe to check the state (e.g. compare state_ == ESTABLISHED) or do I have to use the mutex in this case too? In other words do I need the mutex when I just want to read a variable which could be concurrently written by another thread?
It depends.
The C++ language says nothing about threads or atomicity.
But on most modern CPU's, reading an integer is an atomic operation, which means that you will always read a consistent value, even without a mutex.
However, without a mutex, or some other form of synchronization, the compiler and CPU are free to reorder reads and writes, so anything more complex, anything involving accessing multiple variables, is still unsafe in the general case.
Assuming the writer thread updates some data, and then sets an integer flag to inform other threads that data is available, this could be reordered so the flag is set before updating the data. Unless you use a mutex or another form of memory barrier.
So if you want correct behavior, you don't need a mutex as such, and it's no problem if another thread writes to the variable while you're reading it. It'll be atomic unless you're working on a very unusual CPU. But you do need a memory barrier of some kind to prevent reordering in the compiler or CPU.
You have two threads, they exchange information, yes you need a mutex and you probably also need a conditional wait.
In your example (compare state_ == ESTABLISHED) indicates that thread #2 is waiting for thread #1 to initiate a connection/state. Without a mutex or conditionals/events, thread #2 has to poll the status continously.
Threads is used to increase performance (or improve responsiveness), polling usually results in decreased performance, either by consuming a lot of CPU or by introducing latencey due to the poll interval.
Yes. If thread a reads a variable while thread b is writing to it, you can read an undefined value. The read and write operation are not atomic, especially on a multi-processor system.
Generally speaking you don't, if your variable is declared with "volatile". And ONLY if it is a single variable - otherwise you should be really careful about possible races.
actually, there is no reason to lock access to the object for reading. you only want to lock it while writing to it. this is exactly what a reader-writer lock is. it doesn't lock the object as long as there are no write operations. it improves performance and prevents deadlocks. see the following links for more elaborate explanations :
wikipedia
codeproject
The access to the enum ( read or write) should be guarded.
Another thing:
If the thread contention is less and the threads belong to same process then Critical section would be better than mutex.

Fast thread syncronization

I'm working on a system that have multiple threads and one shared object. There is a number of threads that do read operations very often, but write operations are rare, maybe 3 to 5 per day.
I'm using rwlock for synchronization but the lock acquisition operation it's not fast enough since it happens all the time. So, I'm looking for a faster way of doing it.
Maybe a way of making the write function atomic or looking all threads during the write. Portability it's not a hard requirement, I'm using Linux with GCC 4.6.
Have you considered using read-copy-update with liburcu? This lets you avoid atomic operations and locking entirely on the read path, at the expense of making writes quite a bit slower. Note that some readers might see stale data for a short time, though; if you need the update to take effect immediately, it may not be the best option for you.
You might want to use a multiple objects rather than a single one. Instead of actually sharing the object, create an object that holds the object and an atomic count, then share a pointer to this structure among the threads.
[Assuming that there is only one writer] Each reader will get the pointer, then atomically increment the counter and use the object, after reading, atomically decrement the counter. The writer will create a new object that holds a copy of the original and modify it. Then perform an atomic swap of the two pointers. Now the problem is releasing the old object, which is why you need the count of the readers. The writer needs to continue checking the count of the old object until all readers have completed the work at which point you can delete the old object.
If there are multiple writers (i.e. there can be more than one thread updating the variable) you can follow the same approach but with writers would need to do a compare-and-swap exchange of the pointer. If the pointer from which the updated copy has changed, then the writer restarts the process (deletes it's new object, copies again from the pointer and retries the CAS)
Maybe you could use a spinlock, the threads will busy wait until unlocked. If the threads aren't locked for long it can be much more efficent than mutexes since the locking and unlocking is completed with less instructions.
spinlock is a part of POSIX pthread although optional so I don't know if it's implemented on your system. I used them in a C program on ubuntu but had to compile with -std=gnu99 instead of c99.

Thread safe programming

I keep hearing about thread safe. What is that exactly and how and where can I learn to program thread safe code?
Also, assume I have 2 threads, one that writes to a structure and another one that reads from it. Is that dangerous in any way? Is there anything I should look for? I don't think it is a problem. Both threads will not (well can't ) be accessing the struct at the exact same time..
Also, can someone please tell me how in this example : https://stackoverflow.com/a/5125493/1248779 we are doing a better job in concurrency issues. I don't get it.
It's a very deep topic. At the heart threads are usually about making things go fast by using multiple cores at the same time; or about doing long operations in the background when you don't have a good way to interleave the operation with a 'primary' thread. The latter being very common in UI programming.
Your scenario is one of the classic trouble spots, and one of the first people run into. It's vary rare to have a struct where the members are truly independent. It's very common to want to modify multiple values in the structure to maintain consistency. Without any precautions it is very possible to modify the first value, then have the other thread read the struct and operate on it before the second value has been written.
Simple example would be a 'point' struct for 2d graphics. You'd like to move the point from [2,2] to [5,6]. If you had a different thread drawing a line to that point you could end up drawing to [5,2] very easily.
This is the tip of the iceberg really. There are lots of great books, but learning this space usually goes something like this:
Uh oh, I just read from that thing in an inconsistent state.
Uh oh, I just modified that thing from 2 threads and now it's garbage.
Yay! I learned about locks
Whoa, I have a lot of locks and everything seems to just hang sometimes when I have lots of them locking in nested code.
Hrm. I need to stop doing this locking on the fly, I seem to be missing a lot of places; so I should encapsulate them in a data structure.
That data structure thing was great, but now I seem to be locking all the time and my code is just as slow as a single thread.
condition variables are weird
It's fast because I got clever with how I lock things. Hrm. Sometimes data corrupts.
Whoa.... InterlockedWhatDidYouSay?
Hey, look no lock, I do this thing called a spin lock.
Condition variables. Hrm... I see.
You know what, how about I just start thinking about how to operate on this stuff in completely independent ways, pipelineing my operations, and having as few cross thread dependencies as possible...
Obviously it's not all about condition variables. But there are many problems that can be solved with threading, and probably almost as many ways to do it, and even more ways to do it wrong.
Thread-safety is one aspect of a larger set of issues under the general heading of "Concurrent Programming". I'd suggest reading around that subject.
Your assumption that two threads cannot access the struct at the same time is not good. First: today we have multi-core machines, so two threads can be running at exactly the same time. Second: even on a single core machine the slices of time given to any other thread are unpredicatable. You have to anticipate that ant any arbitrary time the "other" thread might be processing. See my "window of opportunity" example below.
The concept of thread-safety is exactly to answer the question "is this dangerous in any way". The key question is whether it's possible for code running in one thread to get an inconsistent view of some data, that inconsistency happening because while it was running another thread was in the middle of changing data.
In your example, one thread is reading a structure and at the same time another is writing. Suppose that there are two related fields:
{ foreground: red; background: black }
and the writer is in the process of changing those
foreground = black;
<=== window of opportunity
background = red;
If the reader reads the values at just that window of opportunity then it sees a "nonsense" combination
{ foreground: black; background: black }
This essence of this pattern is that for a brief time, while we are making a change, the system becomes inconsistent and readers should not use the values. As soon as we finish our changes it becomes safe to read again.
Hence we use the CriticalSection APIs mentioned by Stefan to prevent a thread seeing an inconsistent state.
what is that exactly?
Briefly, a program that may be executed in a concurrent context without errors related to concurrency.
If ThreadA and ThreadB read and/or write data without errors and use proper synchronization, then the program may be threadsafe. It's a design choice -- making an object threadsafe can be accomplished a number of ways, and more complex types may be threadsafe using combinations of these techniques.
and how and where can I learn to program thread safe code?
boost/libs/thread/ would likely be a good introduction. The topic is quite complex.
The C++11 standard library provides implementations for locks, atomics and threads -- any well written programs which use these would be a good read. The standard library was modeled after boost's implementation.
also, assume I have 2 threads one that writes to a structure and another one that reads from it. Is that dangerous in any way? is there anything I should look for?
Yes, it can be dangerous and/or may produce incorrect results. Just imagine that a thread may run out of its time at any point, and then another thread could then read or modify that structure -- if you have not protected it, it may be in the middle of an update. A common solution is a lock, which can be used to prevent another thread from accessing shared resources during reads/writes.
When writing multithreaded C++ programs on WIN32 platforms, you need to protect certain shared objects so that only one thread can access them at any given time from different threads. You can use 5 system functions to achieve this. They are InitializeCriticalSection, EnterCriticalSection, TryEnterCriticalSection, LeaveCriticalSection, and DeleteCriticalSection.
Also maybe this links can help:
how to make an application thread safe?
http://www.codeproject.com/Articles/1779/Making-your-C-code-thread-safe
Thread safety is a simple concept: is it "safe" to perform operation A on one thread whilst another thread is performing operation B, which may or may not be the same as operation A. This can be extended to cover many threads. In this context, "safe" means:
No undefined behaviour
All invariants of the data structures are guaranteed to be observed by the threads
The actual operations A and B are important. If two threads both read a plain int variable, then this is fine. However, if any thread may write to that variable, and there is no synchronization to ensure that the read and write cannot happen together, then you have a data race, which is undefined behaviour, and this is not thread safe.
This applies equally to the scenario you asked about: unless you have taken special precautions, then it is not safe to have one thread read from a structure at the same time as another thread writes to it. If you can guarantee that the threads cannot access the data structure at the same time, through some form of synchronization such as a mutex, critical section, semaphore or event, then there is not a problem.
You can use things like mutexes and critical sections to prevent concurrent access to some data, so that the writing thread is the only thread accessing the data when it is writing, and the reading thread is the only thread accessing the data when it is reading, thus providing the guarantee I just mentioned. This therefore avoids the undefined behaviour mentioned above.
However, you still need to ensure that your code is safe in the wider context: if you need to modify more than one variable then you need to hold the lock on the mutex across the whole operation rather than for each individual access, otherwise you may find that the invariants of your data structure may not be observed by other threads.
It is also possible that a data structure may be thread safe for some operations but not others. For example, a single-producer single-consumer queue will be OK if one thread is pushing items on the queue and another is popping items off the queue, but will break if two threads are pushing items, or two threads are popping items.
In the example you reference, the point is that global variables are implicitly shared between all threads, and therefore all accesses must be protected by some form of synchronization (such as a mutex) if any thread can modify them. On the other hand, if you have a separate copy of the data for each thread, then that thread can modify its copy without worrying about concurrent access from any other thread, and no synchronization is required. Of course, you always need synchronization if two or more threads are going to operate on the same data.
My book, C++ Concurrency in Action covers what it means for things to be thread safe, how to design thread safe data structures, and the C++ synchronization primitives used for the purpose, such as std::mutex.
Threads safe is when a certain block of code is protected from being accessed by more than one thread. Meaning that the data manipulated always stays in a consistent state.
A common example is the producer consumer problem where one thread reads from a data structure while another thread writes to the same data structure : Detailed explanation
To answer the second part of the question: Imagine two threads both accessing std::vector<int> data:
//first thread
if (data.size() > 0)
{
std::cout << data[0]; //fails if data.size() == 0
}
//second thread
if (rand() % 5 == 0)
{
data.clear();
}
else
{
data.push_back(1);
}
Run these threads in parallel and your program will crash because std::cout << data[0]; might be executed directly after data.clear();.
You need to know that at any point of your thread code, the thread might be interrupted, e.g. after checking that (data.size() > 0), and another thread could become active. Although the first thread looks correct in a single threaded app, it's not in a multi-threaded program.

Do I need a mutex for reading?

I have a class that has a state (a simple enum) and that is accessed from two threads. For changing state I use a mutex (boost::mutex). Is it safe to check the state (e.g. compare state_ == ESTABLISHED) or do I have to use the mutex in this case too? In other words do I need the mutex when I just want to read a variable which could be concurrently written by another thread?
It depends.
The C++ language says nothing about threads or atomicity.
But on most modern CPU's, reading an integer is an atomic operation, which means that you will always read a consistent value, even without a mutex.
However, without a mutex, or some other form of synchronization, the compiler and CPU are free to reorder reads and writes, so anything more complex, anything involving accessing multiple variables, is still unsafe in the general case.
Assuming the writer thread updates some data, and then sets an integer flag to inform other threads that data is available, this could be reordered so the flag is set before updating the data. Unless you use a mutex or another form of memory barrier.
So if you want correct behavior, you don't need a mutex as such, and it's no problem if another thread writes to the variable while you're reading it. It'll be atomic unless you're working on a very unusual CPU. But you do need a memory barrier of some kind to prevent reordering in the compiler or CPU.
You have two threads, they exchange information, yes you need a mutex and you probably also need a conditional wait.
In your example (compare state_ == ESTABLISHED) indicates that thread #2 is waiting for thread #1 to initiate a connection/state. Without a mutex or conditionals/events, thread #2 has to poll the status continously.
Threads is used to increase performance (or improve responsiveness), polling usually results in decreased performance, either by consuming a lot of CPU or by introducing latencey due to the poll interval.
Yes. If thread a reads a variable while thread b is writing to it, you can read an undefined value. The read and write operation are not atomic, especially on a multi-processor system.
Generally speaking you don't, if your variable is declared with "volatile". And ONLY if it is a single variable - otherwise you should be really careful about possible races.
actually, there is no reason to lock access to the object for reading. you only want to lock it while writing to it. this is exactly what a reader-writer lock is. it doesn't lock the object as long as there are no write operations. it improves performance and prevents deadlocks. see the following links for more elaborate explanations :
wikipedia
codeproject
The access to the enum ( read or write) should be guarded.
Another thing:
If the thread contention is less and the threads belong to same process then Critical section would be better than mutex.