Multithreading: when to call mutex.lock?

Multithreading: when to call mutex.lock? - c++

So, I have a ton ob objects, each having several fields, including a c-array, which are modified within their "Update()" method. Now I create several threads, each updating a section of these objects. As far as I understand calling lock() before calling the update function would be useless, since this would essentially cause the updates being called in a sequential order just like they would be without multithreading. Now, there objects have pointers, cross referencing to each other. Do I need to call lock every time ANY field is modified, or just before specific operations (like delete, re-initializing arrays, etc?)

Do I need to call lock every time ANY field is modified, or just before specific operations (like delete, re-initializing arrays, etc?)
Neither. You need to have a lock even to read, to make sure another thread isn't part way through modifying the data you're reading. You might want to use a many reader / one writer lock. I suggest you start by having a single lock (whether a simple mutex or the more elaborate multi-reader/writer lock) and get the code working so you can profile it and see whether you actually need more fine-grained locking, then you'll have a bit more experience and understanding of options and advice about how to manage that.
If you do need fine-grained locking, then the trick is to think about where the locks logically belong - for example - there could be one per object. You'll then need to learn about techniques for avoiding deadlocks. You should do some background reading too.

It depends on the consequences of the data changes you want to make. If each thread is, for example, changing well defined sub-blocks of data and each sub-block is entirely independent of all other sub-blocks then it might make sense to have a mutex per sub-block.
That would allow one thread to deal with one set of sub-blocks whilst another gets a different subset to process.
Having threads make changes without gaining a mutex lock first is going to lead to inconsistencies at best...
If the data and processing isn't subdivisible that way then you would probably have to start thinking about how you might handle whole objects in parallel, ie adopt a coarser granularity and one mutex per object. This is perhaps more likely to be possible - different objects are supposed to be independent of each other, so it should in theory be possible to process their data in parallel.
However the unavoidable truth is that some computer jobs require fast single thread performance. For that one starts seriously needing the right sort of supercomputer and perhaps some jolly long pipelines.

Related

What can two threads/objects do in a collection of objects safely in C++?

So before I start working on a multi threaded program that should interact with multiple objects that are part of a collection.... I want to gain a clear understanding of the concepts involved...
My main concerns are things like deadlocks and such.
Say I have a collection of objects defined like so...
vector<MyObjects> m_objects;
Imagine it being populated with between 100-500 objects.
Now imagine that each of these objects needs to have the ability to communicate with all the other objects at some point. In english, they need to be able to both read and write to all the other objects safely...
I know that in order to write to an object it needs to be locked... But can one object read from another object safely without extra functionality? If so, can an object read from a locked object that is being written to? (My first guess to that last question is no, as that would not make sense)
If anyone has some easily understandable articles/reads on the subject I would love to dive into that for a bit...

You have to lock shared resources within an object. Each objects should keep its on state within them, if that states are not part of shared resources of system you don't need to lock them.

The safest solution would be to lock the whole vector, so only one thread at a time can modify the vector and the objects it contains. However, the question is, if there would still be a point in multithreading then...
Trying to protect every object in the vector with its own lock could easily lead to circular dependencies and is potentially dangerous.
Imagine following scenario, using your Boxer metaphor from the comments above:
Boxer1 tries to hit Boxer2. If the impact of the punch is dependent on Boxer1's current health level, you would have to lock Boxer1 first, because you don't want anyone to change his health while performing the punch operation. At the same time, you would have to lock Boxer2, because you don't want anyone to increase his health during Boxer1's punch (maybe Boxer1 would have knocked Boxer2 out...).
Now if Boxer2 tries to hit Boxer2 simultaneously tries to hit Boxer1, you would also lock Boxer2 first and then Boxer1.
So if both of the threads performing operations on your boxers get to the part where they lock their own boxer at the same time, they would be waiting forever to lock the other boxer and you would have a deadlock.
To prevent such deadlocks, you would have to work out some kind of locking hierarchy.

How can we use multi-threading while working with linked lists

I am fairly new to the concept of multi-threading and was exploring to see some interesting problems in order to get a better idea.
One of my friends suggested the following:
"It is fairly straight-forward to have a linked-list and do the regular insert, search and delete operations. But how would you do these operations if multiple threads need to work on the same list.
How many locks are required minimum. How many locks can we have to have optimized linked list functions?"
Giving some thought, I feel that a single lock should be sufficient to work with. We acquire the lock for every single read and write operation. By this I mean when we are accessing a node data in a list we acquire the lock. When we are inserting/deleting elements, we acquire lock for the complete series of steps.
But I was not able to think of a way where using more locks will give us more optimized performance.
Any help/pointers?

The logical extension of "one lock per list" would be "one lock per item".
The case when this would be useful would e.g. be if you're often only modifying a single item of the list.
For deletion and insertion, acquiring the proper locks gets more complicated, though. You'd have to acquire the lock for the item before and after, and you'd have to make sure to always acquire them in the same order (to prevent deadlocks). And there's of course also special cases to be considered if the root element has to be modified (and possibly also if it's a double-linked list or a circular linked list). This overhead resulting from the more complicated locking logic might lead to your implementation being slower again, especially if you often have to insert and delete from the list.
So I would only consider this if the majority of accesses is the modification of a single node.
If you're searching for peak performance for a specific use case, then in the end, it boils down to implementing both, and running performance comparisons for a typical scenario.

You definitely need at least one semaphore/lock to ensure list integrity.
But, presumably any operation on the list changes at most two nodes: The node being insert/changed/deleted and the adjacent node which points to it. So you could implement locking on a per-node basis, locking at most two nodes for a given operation. This would allow for a degree of concurrency when different threads accesses the list, though you'd need to distinguish between read and write locks to the get full benefit of this approach I think.

If you're new to multi-threading, embrace the notion that premature optimization is a waste of time. Linked lists are a very straight-forward data structure, and you can make it thread-safe by putting a critical section on all reads and writes. These will lock the thread into the CPU for the duration of the execution of the read/insert/delete operation, and ensure thread-safety. They also don't consume the overhead of a mutex lock, or more complicated locking mechanism.
If you want to optimize after the fact, only do so with a valid profiling tool that gives you raw numbers. The linked list operations will never end up being the biggest source of application slowdown, and it will probably never be worth your while to add in the node-level locking being discussed.

Using one lock for the entire list would completely defeat most reasons for multithreading in the first place. By locking the entire list down, you guarantee that only one thread can use the list at a time.
This is certainly safe in the sense that you will have no deadlocks or races, but it is naive and inefficient because you serialize access to the entire list.
A better approach would be to have a lock for each item in the list, and another one for the list itself. The latter would be needed when appending to the list, depending on how the list is implemented (eg, if it maintains a node count seperate from the nodes themselves).
However this might also be less than optimal depending on a number of factors. For instance, on some platforms mutexes might be expensive in terms of resources and time when instantiating the mutex. If space is at a premium, another approach might be to have a fixed-size pool of mutexes from which you draw whenever you need to access an item. These mutexes would have some kind of ownership flag which indicates which node they are allocated to, so that no other mutex would be allocated to that node at the same time.
Another technique is to use reader/write locks, which will allow read access to any thread, but write access to only one, the two being mutually exclusive. However it has been suggested in the literature that in many cases using a reader/write lock is actually less efficient than simply using a plain mutex. This will depend on your actual usage pattern and how the lock is implemented.

You only need to lock when you're writing and you say there's usually only one write, so try read/write locks.

Thread safe programming

I keep hearing about thread safe. What is that exactly and how and where can I learn to program thread safe code?
Also, assume I have 2 threads, one that writes to a structure and another one that reads from it. Is that dangerous in any way? Is there anything I should look for? I don't think it is a problem. Both threads will not (well can't ) be accessing the struct at the exact same time..
Also, can someone please tell me how in this example : https://stackoverflow.com/a/5125493/1248779 we are doing a better job in concurrency issues. I don't get it.

It's a very deep topic. At the heart threads are usually about making things go fast by using multiple cores at the same time; or about doing long operations in the background when you don't have a good way to interleave the operation with a 'primary' thread. The latter being very common in UI programming.
Your scenario is one of the classic trouble spots, and one of the first people run into. It's vary rare to have a struct where the members are truly independent. It's very common to want to modify multiple values in the structure to maintain consistency. Without any precautions it is very possible to modify the first value, then have the other thread read the struct and operate on it before the second value has been written.
Simple example would be a 'point' struct for 2d graphics. You'd like to move the point from [2,2] to [5,6]. If you had a different thread drawing a line to that point you could end up drawing to [5,2] very easily.
This is the tip of the iceberg really. There are lots of great books, but learning this space usually goes something like this:
Uh oh, I just read from that thing in an inconsistent state.
Uh oh, I just modified that thing from 2 threads and now it's garbage.
Yay! I learned about locks
Whoa, I have a lot of locks and everything seems to just hang sometimes when I have lots of them locking in nested code.
Hrm. I need to stop doing this locking on the fly, I seem to be missing a lot of places; so I should encapsulate them in a data structure.
That data structure thing was great, but now I seem to be locking all the time and my code is just as slow as a single thread.
condition variables are weird
It's fast because I got clever with how I lock things. Hrm. Sometimes data corrupts.
Whoa.... InterlockedWhatDidYouSay?
Hey, look no lock, I do this thing called a spin lock.
Condition variables. Hrm... I see.
You know what, how about I just start thinking about how to operate on this stuff in completely independent ways, pipelineing my operations, and having as few cross thread dependencies as possible...
Obviously it's not all about condition variables. But there are many problems that can be solved with threading, and probably almost as many ways to do it, and even more ways to do it wrong.

Thread-safety is one aspect of a larger set of issues under the general heading of "Concurrent Programming". I'd suggest reading around that subject.
Your assumption that two threads cannot access the struct at the same time is not good. First: today we have multi-core machines, so two threads can be running at exactly the same time. Second: even on a single core machine the slices of time given to any other thread are unpredicatable. You have to anticipate that ant any arbitrary time the "other" thread might be processing. See my "window of opportunity" example below.
The concept of thread-safety is exactly to answer the question "is this dangerous in any way". The key question is whether it's possible for code running in one thread to get an inconsistent view of some data, that inconsistency happening because while it was running another thread was in the middle of changing data.
In your example, one thread is reading a structure and at the same time another is writing. Suppose that there are two related fields:
{ foreground: red; background: black }
and the writer is in the process of changing those
foreground = black;
<=== window of opportunity
background = red;
If the reader reads the values at just that window of opportunity then it sees a "nonsense" combination
{ foreground: black; background: black }
This essence of this pattern is that for a brief time, while we are making a change, the system becomes inconsistent and readers should not use the values. As soon as we finish our changes it becomes safe to read again.
Hence we use the CriticalSection APIs mentioned by Stefan to prevent a thread seeing an inconsistent state.

what is that exactly?
Briefly, a program that may be executed in a concurrent context without errors related to concurrency.
If ThreadA and ThreadB read and/or write data without errors and use proper synchronization, then the program may be threadsafe. It's a design choice -- making an object threadsafe can be accomplished a number of ways, and more complex types may be threadsafe using combinations of these techniques.
and how and where can I learn to program thread safe code?
boost/libs/thread/ would likely be a good introduction. The topic is quite complex.
The C++11 standard library provides implementations for locks, atomics and threads -- any well written programs which use these would be a good read. The standard library was modeled after boost's implementation.
also, assume I have 2 threads one that writes to a structure and another one that reads from it. Is that dangerous in any way? is there anything I should look for?
Yes, it can be dangerous and/or may produce incorrect results. Just imagine that a thread may run out of its time at any point, and then another thread could then read or modify that structure -- if you have not protected it, it may be in the middle of an update. A common solution is a lock, which can be used to prevent another thread from accessing shared resources during reads/writes.

When writing multithreaded C++ programs on WIN32 platforms, you need to protect certain shared objects so that only one thread can access them at any given time from different threads. You can use 5 system functions to achieve this. They are InitializeCriticalSection, EnterCriticalSection, TryEnterCriticalSection, LeaveCriticalSection, and DeleteCriticalSection.
Also maybe this links can help:
how to make an application thread safe?
http://www.codeproject.com/Articles/1779/Making-your-C-code-thread-safe

Thread safety is a simple concept: is it "safe" to perform operation A on one thread whilst another thread is performing operation B, which may or may not be the same as operation A. This can be extended to cover many threads. In this context, "safe" means:
No undefined behaviour
All invariants of the data structures are guaranteed to be observed by the threads
The actual operations A and B are important. If two threads both read a plain int variable, then this is fine. However, if any thread may write to that variable, and there is no synchronization to ensure that the read and write cannot happen together, then you have a data race, which is undefined behaviour, and this is not thread safe.
This applies equally to the scenario you asked about: unless you have taken special precautions, then it is not safe to have one thread read from a structure at the same time as another thread writes to it. If you can guarantee that the threads cannot access the data structure at the same time, through some form of synchronization such as a mutex, critical section, semaphore or event, then there is not a problem.
You can use things like mutexes and critical sections to prevent concurrent access to some data, so that the writing thread is the only thread accessing the data when it is writing, and the reading thread is the only thread accessing the data when it is reading, thus providing the guarantee I just mentioned. This therefore avoids the undefined behaviour mentioned above.
However, you still need to ensure that your code is safe in the wider context: if you need to modify more than one variable then you need to hold the lock on the mutex across the whole operation rather than for each individual access, otherwise you may find that the invariants of your data structure may not be observed by other threads.
It is also possible that a data structure may be thread safe for some operations but not others. For example, a single-producer single-consumer queue will be OK if one thread is pushing items on the queue and another is popping items off the queue, but will break if two threads are pushing items, or two threads are popping items.
In the example you reference, the point is that global variables are implicitly shared between all threads, and therefore all accesses must be protected by some form of synchronization (such as a mutex) if any thread can modify them. On the other hand, if you have a separate copy of the data for each thread, then that thread can modify its copy without worrying about concurrent access from any other thread, and no synchronization is required. Of course, you always need synchronization if two or more threads are going to operate on the same data.
My book, C++ Concurrency in Action covers what it means for things to be thread safe, how to design thread safe data structures, and the C++ synchronization primitives used for the purpose, such as std::mutex.

Threads safe is when a certain block of code is protected from being accessed by more than one thread. Meaning that the data manipulated always stays in a consistent state.
A common example is the producer consumer problem where one thread reads from a data structure while another thread writes to the same data structure : Detailed explanation

To answer the second part of the question: Imagine two threads both accessing std::vector<int> data:
//first thread
if (data.size() > 0)
{
std::cout << data[0]; //fails if data.size() == 0
}
//second thread
if (rand() % 5 == 0)
{
data.clear();
}
else
{
data.push_back(1);
}
Run these threads in parallel and your program will crash because std::cout << data[0]; might be executed directly after data.clear();.
You need to know that at any point of your thread code, the thread might be interrupted, e.g. after checking that (data.size() > 0), and another thread could become active. Although the first thread looks correct in a single threaded app, it's not in a multi-threaded program.

Testing concurrent data structure

What are some methods for testing concurrent data structures to make sure the data structs behave correctly when accessed from multiple threads ?

All of the other answers have focused on actually testing the code by putting it through its paces and actually running it in one form or another or politely saying "don't do it yourself, use an existing library".
This is great and all, but IMO, the most important (practical tests are important too) test is to look at the code line by line and for every line of code ask "what happens if I get interrupted by another thread here?" Imagine another thread, running just about any of the other lines/functions during this interruption. Do things still stay consistent? When competing for resources, does the other thread[s] block or spin?
This is what we did in school when learning about concurrency and it is a surprisingly effective approach. Bottom line, I feel that taking the time to prove to yourself that things are consistent and work as expected in all states is the first technique you should use when dealing with this stuff.

Concurrent systems are probabilistic and errors are often difficult to replicate. Therefore you need to run various input/output cases, each tested over time (hours, days, etc) in order to detect possible errors.
Tests for concurrent data structure involves examining the container's state before and after expected events such as insert and delete.

Use a pre-existing, pre-tested library that meets your needs if possible.
Make sure that the code has appropriate self-consistency checks (preferably fast sanity checks), and run your code on as many different types of hardware as possible to help narrow down interesting timing problems.
Have multiple people peer review the code, preferably without a pre-explanation of how it's supposed to work. That way they have to grok the code which should help catch more bugs.
Set up a bunch of threads that do nothing but random operations on the data structures and check for consistency at some rate.

Start with the assumption that your calls to access/modify data are not thread safe and use locks to ensure only a single thread can access/modify any part of the data at a time. Only after you can prove to yourself that a specific type of access is safe outside of the lock by multiple threads at once should you move that code outside of the lock.
Assume worst case scenarios, e.g. that your code will stop right in the middle of some pointer manipulation or another critical point, and that another thread will encounter that data in mid-transition. If that would have a bad result, leave it within the lock.

I normally test these kinds of things by interjecting sleep() calls at appropriate places in the distributed threads/processes.
For instance, to test a lock, put sleep(2) in all your threads at the point of contention, and spawn two threads roughly 1 second apart. The first one should obtain the lock, and the second should have to wait for it.
Most race conditions can be tested by extending this method, but if your system has too many components it may be difficult or impossible to know every possible condition that needs to be tested.

Run your concurrent threads for one or a few days and look what happens. (Sounds strange, but finding out race conditions is such a complex topic that simply trying it is the best approach).

Is checking current thread inside a function ok?

Is it ok to check the current thread inside a function?
For example if some non-thread safe data structure is only altered by one thread, and there is a function which is called by multiple threads, it would be useful to have separate code paths depending on the current thread. If the current thread is the one that alters the data structure, it is ok to alter the data structure directly in the function. However, if the current thread is some other thread, the actual altering would have to be delayed, so that it is performed when it is safe to perform the operation.
Or, would it be better to use some boolean which is given as a parameter to the function to separate the different code paths?
Or do something totally different?
What do you think?

You are not making all too much sense. You said a non-thread safe data structure is only ever altered by one thread, but in the next sentence you talk about delaying any changes made to that data structure by other threads. Make up your mind.
In general, I'd suggest wrapping the access to the data structure up with a critical section, or mutex.

It's possible to use such animals as reader/writer locks to differentiate between readers and writers of datastructures but the performance advantage for typical cases usually wont merit the additional complexity associated with their use.
From the way your question is stated, I'm guessing you're fairly new to multithreaded development. I highly suggest sticking with the simplist and most commonly used approaches for ensuring data integrity (most books/articles you readon the issue will mention the same uses for mutexes/critical sections). Multithreaded development is extremely easy to get wrong and can be difficult to debug. Also, what seems like the "optimal" solution very often doesn't buy you the huge performance benefit you might think. It's usually best to implement the simplist approach that will work then worry about optimizing it after the fact.

There is a trick that could work in case, as you said, the other threads will only make changes only once in a while, although it is still rather hackish:
make sure your "master" thread can't be interrupted by the other ones (higher priority, non fair scheduling)
check your thread
if "master", just change
if other, put off scheduling, if needed by putting off interrupts, make change, reinstall scheduling
really test to see whether there are no issues in your setup.
As you can see, if requirements change a little bit, this could turn out worse than using normal locks.

As mentioned, the simplest solution when two threads need access to the same data is to use some synchronization mechanism (i.e. critical section or mutex).
If you already have synchronization in your design try to reuse it (if possible) instead of adding more. For example, if the main thread receives its work from a synchronized queue you might be able to have thread 2 queue the data structure update. The main thread will pick up the request and can update it without additional synchronization.
The queuing concept can be hidden from the rest of the design through the Active Object pattern. The activ object may also be able to publish the data structure changes through the Observer pattern to other interested threads.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js