If I have
1. mainThread: write data A,
2. Thread_1: read A and write it to into a Buffer;
3. Thread_2: read from the Buffer.
how to synchronize these three threads safely, with not much performance loss? Is there any existing solution to use? I use C/C++ on linux.
IMPORTANT: the goal is to know the synchronization mechanism or algorithms for this particular case, not how mutex or semaphore works.
First, I'd consider the possibility of building this as three separate processes, using pipes to connect them. A pipe is (in essence) a small buffer with locking handled automatically by the kernel. If you do end up using threads for this, most of your time/effort will be spent on creating nearly an exact duplicate of the pipes that are already built into the kernel.
Second, if you decide to build this all on your own anyway, I'd give serious consideration to following a similar model anyway. You don't need to be slavish about it, but I'd still think primarily in terms of a data structure to which one thread writes data, and from which another reads the data. By strong preference, all the necessary thread locking necessary would be built into that data structure, so most of the code in the thread is quite simple, reading, processing, and writing data. The main difference from using normal Unix pipes would be that in this case you can maintain the data in a more convenient format, instead of all the reading and writing being in text.
As such, what I think you're looking for is basically a thread-safe queue. With that, nearly everything else involved becomes borders on trivial (at least the threading part of it does -- the processing involved may not be, but at least building it with multiple threads isn't adding much to the complexity).
It's hard to say how much experience with C/C++ threads you have. I hate to just point to a link but have you read up on pthreads?
https://computing.llnl.gov/tutorials/pthreads/
And for a shorter example with code and simple mutex'es (lock object you need to sync data):
http://students.cs.byu.edu/~cs460ta/cs460/labs/pthreads.html
I would suggest Boost.Thread for this purpose. This is quite good framework with mutexes and semaphores, and it is multiplatform. Here you can find very good tutorial about this.
How exactly synchronize these threads is another problem and needs more information about your problem.
Edit The simplest solution would be to put two mutexes -- one on A and second on Buffer. You don't have to worry about deadlocks in this particular case. Just:
Enter mutex_A from MainThread; Thread1 waits for mutex to be released.
Leave mutex from MainThread; Thread1 enters mutex_A and mutex_Buffer, starts reading from A and writes it to Buffer.
Thread1 releases both mutexes. ThreadMain can enter mutex_A and write data, and Thread2 can enter mutex_Buffer safely read data from Buffer.
This is obviously the simplest solution, and probably can be improved, but without more knowledge about the problem, this is the best I can come up with.
Related
I'm using QReadWriteLock in recursive mode.
This code doesn't by itself make sense, but the issues I have arise from here:
lock->lockForWrite();
lock->lockForRead();
lockForRead is blocked. Note that this is in recursive mode.
The way i see it is that Write is a "superior" lock, it allows me to read and write to the protected data, where Read lock only allows reading.
Also, i think that write lock should not be blocked if the only reader is the same one asking for the write lock.
I can see from the qreadwritelock.cpp source codes that there is no attempt to make it work like what i would like. So it's not a bug, but a feature I find missing.
My question is this, should this kind of recursion be allowed? Are there any problems that arise from this kind of implementation and what would they be?
From QReadWriteLock docs:
Note that the lock type cannot be changed when trying to lock
recursively, i.e. it is not possible to lock for reading in a thread
that already has locked for writing (and vice versa).
So, like you say, it's just the way it works. I personally can't see how allowing reads on the same thread as write locked item would cause problems, but perhaps it requires an inefficient lock implementation?
You could try asking on the QT forums but I doubt you'll get a definitive answer.
Why don't you take the QT source as a starter and have a go at implementing yourself if it's something you need. Writing synchronisation objects can be tricky, but it's a good learning exercise.
I found this question while searching for the same functionality myself.
While thinking about implementing this on my own, I realized that there definitely is a problem arising doing so:
So you want to upgrade your lock from shared (read) to exclusive (write). Doing
lock->unlock();
lock->lockForWrite();
is not what you want, since you want no other thread to gain the write lock right after the current thread released the read lock. But if there Was a
lock->changeModus(WRITE);
or something like that, you will create a deadlock. To gain a write lock, the lock blocks until all current read locks are released. So here, multiple threads will block waiting for each other.
I keep hearing about thread safe. What is that exactly and how and where can I learn to program thread safe code?
Also, assume I have 2 threads, one that writes to a structure and another one that reads from it. Is that dangerous in any way? Is there anything I should look for? I don't think it is a problem. Both threads will not (well can't ) be accessing the struct at the exact same time..
Also, can someone please tell me how in this example : https://stackoverflow.com/a/5125493/1248779 we are doing a better job in concurrency issues. I don't get it.
It's a very deep topic. At the heart threads are usually about making things go fast by using multiple cores at the same time; or about doing long operations in the background when you don't have a good way to interleave the operation with a 'primary' thread. The latter being very common in UI programming.
Your scenario is one of the classic trouble spots, and one of the first people run into. It's vary rare to have a struct where the members are truly independent. It's very common to want to modify multiple values in the structure to maintain consistency. Without any precautions it is very possible to modify the first value, then have the other thread read the struct and operate on it before the second value has been written.
Simple example would be a 'point' struct for 2d graphics. You'd like to move the point from [2,2] to [5,6]. If you had a different thread drawing a line to that point you could end up drawing to [5,2] very easily.
This is the tip of the iceberg really. There are lots of great books, but learning this space usually goes something like this:
Uh oh, I just read from that thing in an inconsistent state.
Uh oh, I just modified that thing from 2 threads and now it's garbage.
Yay! I learned about locks
Whoa, I have a lot of locks and everything seems to just hang sometimes when I have lots of them locking in nested code.
Hrm. I need to stop doing this locking on the fly, I seem to be missing a lot of places; so I should encapsulate them in a data structure.
That data structure thing was great, but now I seem to be locking all the time and my code is just as slow as a single thread.
condition variables are weird
It's fast because I got clever with how I lock things. Hrm. Sometimes data corrupts.
Whoa.... InterlockedWhatDidYouSay?
Hey, look no lock, I do this thing called a spin lock.
Condition variables. Hrm... I see.
You know what, how about I just start thinking about how to operate on this stuff in completely independent ways, pipelineing my operations, and having as few cross thread dependencies as possible...
Obviously it's not all about condition variables. But there are many problems that can be solved with threading, and probably almost as many ways to do it, and even more ways to do it wrong.
Thread-safety is one aspect of a larger set of issues under the general heading of "Concurrent Programming". I'd suggest reading around that subject.
Your assumption that two threads cannot access the struct at the same time is not good. First: today we have multi-core machines, so two threads can be running at exactly the same time. Second: even on a single core machine the slices of time given to any other thread are unpredicatable. You have to anticipate that ant any arbitrary time the "other" thread might be processing. See my "window of opportunity" example below.
The concept of thread-safety is exactly to answer the question "is this dangerous in any way". The key question is whether it's possible for code running in one thread to get an inconsistent view of some data, that inconsistency happening because while it was running another thread was in the middle of changing data.
In your example, one thread is reading a structure and at the same time another is writing. Suppose that there are two related fields:
{ foreground: red; background: black }
and the writer is in the process of changing those
foreground = black;
<=== window of opportunity
background = red;
If the reader reads the values at just that window of opportunity then it sees a "nonsense" combination
{ foreground: black; background: black }
This essence of this pattern is that for a brief time, while we are making a change, the system becomes inconsistent and readers should not use the values. As soon as we finish our changes it becomes safe to read again.
Hence we use the CriticalSection APIs mentioned by Stefan to prevent a thread seeing an inconsistent state.
what is that exactly?
Briefly, a program that may be executed in a concurrent context without errors related to concurrency.
If ThreadA and ThreadB read and/or write data without errors and use proper synchronization, then the program may be threadsafe. It's a design choice -- making an object threadsafe can be accomplished a number of ways, and more complex types may be threadsafe using combinations of these techniques.
and how and where can I learn to program thread safe code?
boost/libs/thread/ would likely be a good introduction. The topic is quite complex.
The C++11 standard library provides implementations for locks, atomics and threads -- any well written programs which use these would be a good read. The standard library was modeled after boost's implementation.
also, assume I have 2 threads one that writes to a structure and another one that reads from it. Is that dangerous in any way? is there anything I should look for?
Yes, it can be dangerous and/or may produce incorrect results. Just imagine that a thread may run out of its time at any point, and then another thread could then read or modify that structure -- if you have not protected it, it may be in the middle of an update. A common solution is a lock, which can be used to prevent another thread from accessing shared resources during reads/writes.
When writing multithreaded C++ programs on WIN32 platforms, you need to protect certain shared objects so that only one thread can access them at any given time from different threads. You can use 5 system functions to achieve this. They are InitializeCriticalSection, EnterCriticalSection, TryEnterCriticalSection, LeaveCriticalSection, and DeleteCriticalSection.
Also maybe this links can help:
how to make an application thread safe?
http://www.codeproject.com/Articles/1779/Making-your-C-code-thread-safe
Thread safety is a simple concept: is it "safe" to perform operation A on one thread whilst another thread is performing operation B, which may or may not be the same as operation A. This can be extended to cover many threads. In this context, "safe" means:
No undefined behaviour
All invariants of the data structures are guaranteed to be observed by the threads
The actual operations A and B are important. If two threads both read a plain int variable, then this is fine. However, if any thread may write to that variable, and there is no synchronization to ensure that the read and write cannot happen together, then you have a data race, which is undefined behaviour, and this is not thread safe.
This applies equally to the scenario you asked about: unless you have taken special precautions, then it is not safe to have one thread read from a structure at the same time as another thread writes to it. If you can guarantee that the threads cannot access the data structure at the same time, through some form of synchronization such as a mutex, critical section, semaphore or event, then there is not a problem.
You can use things like mutexes and critical sections to prevent concurrent access to some data, so that the writing thread is the only thread accessing the data when it is writing, and the reading thread is the only thread accessing the data when it is reading, thus providing the guarantee I just mentioned. This therefore avoids the undefined behaviour mentioned above.
However, you still need to ensure that your code is safe in the wider context: if you need to modify more than one variable then you need to hold the lock on the mutex across the whole operation rather than for each individual access, otherwise you may find that the invariants of your data structure may not be observed by other threads.
It is also possible that a data structure may be thread safe for some operations but not others. For example, a single-producer single-consumer queue will be OK if one thread is pushing items on the queue and another is popping items off the queue, but will break if two threads are pushing items, or two threads are popping items.
In the example you reference, the point is that global variables are implicitly shared between all threads, and therefore all accesses must be protected by some form of synchronization (such as a mutex) if any thread can modify them. On the other hand, if you have a separate copy of the data for each thread, then that thread can modify its copy without worrying about concurrent access from any other thread, and no synchronization is required. Of course, you always need synchronization if two or more threads are going to operate on the same data.
My book, C++ Concurrency in Action covers what it means for things to be thread safe, how to design thread safe data structures, and the C++ synchronization primitives used for the purpose, such as std::mutex.
Threads safe is when a certain block of code is protected from being accessed by more than one thread. Meaning that the data manipulated always stays in a consistent state.
A common example is the producer consumer problem where one thread reads from a data structure while another thread writes to the same data structure : Detailed explanation
To answer the second part of the question: Imagine two threads both accessing std::vector<int> data:
//first thread
if (data.size() > 0)
{
std::cout << data[0]; //fails if data.size() == 0
}
//second thread
if (rand() % 5 == 0)
{
data.clear();
}
else
{
data.push_back(1);
}
Run these threads in parallel and your program will crash because std::cout << data[0]; might be executed directly after data.clear();.
You need to know that at any point of your thread code, the thread might be interrupted, e.g. after checking that (data.size() > 0), and another thread could become active. Although the first thread looks correct in a single threaded app, it's not in a multi-threaded program.
I'm trying to use WaitForSingleObject(fork[leftFork], Infinite); to lock a variable using multiple threads but it doesn't seem to lock anything
I set the Handle fork[5] and then use the code below but it doesn't seem to lock anything.
while(forks[rightFork] == 0 || forks[leftFork] == 0) Sleep(0);
WaitForSingleObject(fork[leftFork], INFINITE);
forks[leftFork]--;
WaitForSingleObject(fork[rightFork], INFINITE);
forks[rightFork]--;
I have tried as a WaitForMultipleObjects as well and same result. When I create the mutex I use fork[i]= CreateMutex(NULL, FALSE,NULL);
I was wondering if this is only good for each thread or do they share it?
First of all, you haven't shown enough code for us to be able to help you with any great certainty of correctness. But, having made that proviso, I'm going to try anyway!
Your use of the word fork suggests to me that you are approaching Windows threading from a pthreads background. Windows threads are a little different. Rather confusingly, the most effective in-process mutex object in Windows is not the mutex, it is in fact the critical section.
The interface for the critical section is much simpler to use, it being essentially an acquire function and a corresponding release function. If you are synchronizing within a single process, and you need a simple lock (rather than, say, a semaphore), you should use critical sections rather than mutexes.
In fact, only yesterday here on Stack Overflow, I wrote a more detailed answer to a question which described the standard usage pattern for critical sections. That post has lots of links to the pertinent sections of MSDN documentation.
Having said that, it would appear that all you are trying to do is to synchronize the decrementing of an array of integer values. If that is so then you can do this most simply in a lock free manner with InterlockIncrement or one of its friends.
You only need to use a mutex when you are performing cross process synchronization. Indeed you should only use a mutex when you are synchronizing across a process because critical sections perform so much better (i.e. faster). Since you are updating a simple array here, and since there is no obviously visible IPC going on, I can only conclude that this really is in-process.
If I'm wrong and you really are doing cross-process work and require a mutex then we would need to see more code. For example I don't see any calls to ReleaseMutex. I don't know exactly how you are creating your mutexes.
If this doesn't help, please edit your question to include more code, and also a high level overview of what you are trying to achieve.
Is it ok to check the current thread inside a function?
For example if some non-thread safe data structure is only altered by one thread, and there is a function which is called by multiple threads, it would be useful to have separate code paths depending on the current thread. If the current thread is the one that alters the data structure, it is ok to alter the data structure directly in the function. However, if the current thread is some other thread, the actual altering would have to be delayed, so that it is performed when it is safe to perform the operation.
Or, would it be better to use some boolean which is given as a parameter to the function to separate the different code paths?
Or do something totally different?
What do you think?
You are not making all too much sense. You said a non-thread safe data structure is only ever altered by one thread, but in the next sentence you talk about delaying any changes made to that data structure by other threads. Make up your mind.
In general, I'd suggest wrapping the access to the data structure up with a critical section, or mutex.
It's possible to use such animals as reader/writer locks to differentiate between readers and writers of datastructures but the performance advantage for typical cases usually wont merit the additional complexity associated with their use.
From the way your question is stated, I'm guessing you're fairly new to multithreaded development. I highly suggest sticking with the simplist and most commonly used approaches for ensuring data integrity (most books/articles you readon the issue will mention the same uses for mutexes/critical sections). Multithreaded development is extremely easy to get wrong and can be difficult to debug. Also, what seems like the "optimal" solution very often doesn't buy you the huge performance benefit you might think. It's usually best to implement the simplist approach that will work then worry about optimizing it after the fact.
There is a trick that could work in case, as you said, the other threads will only make changes only once in a while, although it is still rather hackish:
make sure your "master" thread can't be interrupted by the other ones (higher priority, non fair scheduling)
check your thread
if "master", just change
if other, put off scheduling, if needed by putting off interrupts, make change, reinstall scheduling
really test to see whether there are no issues in your setup.
As you can see, if requirements change a little bit, this could turn out worse than using normal locks.
As mentioned, the simplest solution when two threads need access to the same data is to use some synchronization mechanism (i.e. critical section or mutex).
If you already have synchronization in your design try to reuse it (if possible) instead of adding more. For example, if the main thread receives its work from a synchronized queue you might be able to have thread 2 queue the data structure update. The main thread will pick up the request and can update it without additional synchronization.
The queuing concept can be hidden from the rest of the design through the Active Object pattern. The activ object may also be able to publish the data structure changes through the Observer pattern to other interested threads.
This is an interview question. How do you implement a read/write mutex? There will be multiple threads reading and writing to a resource. I'm not sure how to go about it. If there's any information needed, please let me know.
Update: I'm not sure if my statement above is valid/understandable. But what I really want to know is how do you implement multiple read and multiple writes on a single object in terms of mutex and other synchronization objects needed?
Check out Dekker's algorithm.
Dekker's algorithm is the first known
correct solution to the mutual
exclusion problem in concurrent
programming. The solution is
attributed to Dutch mathematician Th.
J. Dekker by Edsger W. Dijkstra in his
manuscript on cooperating sequential
processes. It allows two threads to
share a single-use resource without
conflict, using only shared memory for
communication.
Note that Dekker's algorithm uses a spinlock (not a busy waiting) technique.
(Th. J. Dekker's solution, mentioned by E. W. Dijkstra in his EWD1303 paper)
The short answer is that it is surprisingly difficult to roll your own read/write lock. It's very easy to miss a very subtle timing problem that could result in deadlock, two threads both thinking they have an "exclusive" lock, etc.
In a nutshell, you need to keep a count of how many readers are active at any particular time. Only when the number of active readers is zero, should you grant a thread write access. There are some design choices as to whether readers or writers are given priority. (Often, you want to give writers the priority, on the assumption that writing is done less frequently.) The (surprisingly) tricky part is to ensure that no writer is given access when there are readers, or vice versa.
There is an excellent MSDN article, "Compound Win32 Synchronization Objects" that takes you through the creation of a reader/writer lock. It starts simple, then grows more complicated to handle all the corner cases. One thing that stood out was that they showed a sample that looked perfectly good-- then they would explain why it wouldn't actually work. Had they not pointed out the problems, you might have never noticed. Well worth a read.
Hope this is helpful.
This sounds like an rather difficult question for an interview; I would not "implement" a read/write mutex, in the sense of writing one from scratch--there are much better off-the-shelf solutions available. The sensible real world thing would be to use an existing mutex type. Perhaps what they really wanted to know was how you would use such a type?
Afaik you need either an atomic compare-and-swap instruction, or you need to be able to disable interrupts. See Compare-and-swap on wikipedia. At least, that's how an OS would implement it. If you have an operating system, stand on it's shoulders, and use an existing library (boost for example).