I am very new to Linux programming so bear with me. I have 2 thread type that perform different operations so I want each one to have it's own mutex. Here is the code I am using , is it good ? If not why ?
static pthread_mutex_t cs_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t cs_mutex2 = PTHREAD_MUTEX_INITALIZER;
void * Thread1(void * lp)
{
int * sock = (int*)lp;
char buffer[2024];
int bytecount = recv(*sock, buffer, 2048, 0);
while (0 == 0)
{
if ((bytecount ==0) || (bytecount == -1))
{
pthread_mutex_lock(&cs_mutex);
//Some uninteresting operations witch plays with set 1 of global variables;
pthread_mutex_unlock(&cs_mutex);
}
}
}
void * Thread2(void * lp)
{
while (0 == 0)
{
pthread_mutex_lock(&cs_mutex2);
//Some uninteresting operations witch plays with some global variables;
pthread_mutex_unlock(&cs_mutex2);
}
}
Normally, a mutex is not thread related.
It ensures that a critical area is only accessed by a single thread.
So if u have some shared areas, like processing the same array by multiple threads, then you must ensure exclusive access for this area.
That means, you do not need a mutex for each thread. You need a mutex for the critical area.
If you only have one driver, there is no advantage to having two cars. Your Thread2 code can only make useful progress while holding cs_mutex2. So there's no point to having more than one thread running that code. Only one thread can hold the mutex at a time, and the other thread can do no useful work.
So all you'll accomplish is that occasionally the thread that doesn't hold the mutex will try to run and have to wait for the other. And occasionally the thread that does hold the mutex will try to release and re-acquire it and get pre-empted by the other.
This is a completely pointless use of threads.
I see three problems here. There's a question your infinite loop, another about your intention in having multiple threads, and there's a future maintainability "gotcha" lurking.
First
int bytecount = recv(*sock, buffer, 2048, 0);
while (0 == 0)
Is that right? You read some stuff from a socket, and start an infinite loop without ever closing the socket? I can only assume that you do some more reading in the loop, but in which case you are waiting for an external event while holding the mutex. In general that's a bad pattern limiting your concurrency. A possibly pattern is to have one thread reading the data and then passing the read data to other threads which do the processing.
Next, you have two different sets of resources each protected by their own mutex. You then intend to have a set of Threads for each resource. But each thread has the pattern
take mutex
lots of processing
release mutex
tiny window (a few machine instructions)
take mutex again
lots of processing
release mutex
next tiny window
There's virtually no opportunity for two threads to work in parallel. I question whether your have need for multiple threads for each resource.
Last there's a potential maintenance issue. I'm just pointing this out for future reference, I don't think you need to do anything right now. You have two functions, intended for use by two threads, but in the end they are just functions that can be called by anyone. If later maintenance results in those functions (or refactored subsets of the functions) then you could get two threads
take mutex 1
take mutex 2
and the other
take mutex 2
take mutex 1
Bingo: deadlock.
Not an easy problem to avoid, but at the very least one can aid the maintainer by careful naming choices and refactoring.
I think your code is correct, however please note 2 things:
It is not exception safe. If exception is thrown from Some uninteresting operations then your mutex will be never unlocked -> deadlock
You could also consider using std::mutex or boost::mutex instead of raw mutexes. For mutex locking it's better to use boost::mutex::scoped_lock (or std:: analog with modern compiler)
void test()
{
// not synch code here
{
boost::mutex::scoped_lock lock(mutex_);
// synchronized code here
}
}
If you have 2 different sets of data and 2 different threads working on these sets -- why do you need mutexes at all? Usually, mutexes are used when you deal with some shared piece of data and you don't want two threads to deal with it simultaneously, so you lock it with mutex, do some stuff, unlock.
Related
I am making my own mutex to synchronize my threads and I am having the following issue:
The same thread seems to re-acquire the mutex right after it releases it
What I have tried:
Telling it to yield execution to another thread (SwitchToThread, Sleep, YieldProcessor)
Increasing delay between loops (Up to 1 second)
Here is how it works:
I have a structure with a state value:
volatile unsigned int state;
When I want to acquire the mutex, I check the state until it has been released (open), then acquire (close) it and break out of the infinite loop and do whatever needs to be done:
unsigned int previous = 0;
for (;;)
{
previous = InterlockedExchangeAdd(&mtx->state,
0);
if (STATE_OPEN == previous)
{
InterlockedExchange(&mtx->state,
STATE_CLOSED);
break;
}
Sleep(delay);
}
Then I simply release it for the next thread to acquire it:
InterlockedExchange(&mtx->state,
STATE_OPEN);
The way I am using it is I simply have one global volatile integer that I add 1 to in one thread and subtract 1 to in another one. Increasing the delay has helped with making it so that the number does not either go very low or very high and get stuck in a loop being executed in just a single thread, but a 1+ second delay is not going to work for my other purposes.
How could I go about making sure that all of the threads get a chance to acquire the mutex and not have it get stuck in a single thread?
The mutex does exactly what it is supposed to do: it prevents multiple threads from running at the same time.
To stop a thread from re-acquiring the mutex, the basic solution is to not access the shared resource which is protected by the mutex. The thread probably should be doing something else.
You may also have a design problem. If you have multiple resources protected by a single mutex, you may have false contention between threads. if each resource had its own mutex, multiple threads could each work on their own resource.
I have question about multi threading in c++. I have a scenario as follows
void ThreadedRead(int32_t thread_num, BinReader reader) {
while (!reader.endOfData) {
thread_buckets[thread_num].clear();
thread_buckets[thread_num] = reader.readnextbatch()
thread_flags[thread_num] = THREAD_WAITING;
while (thread_flags[thread_num] != THREAD_RUNNING) {
// wait until awakened
if (thread_flags[thread_num] != THREAD_RUNNING) {
//go back to sleep
}
}
}
thread_flags[thread_num] = THREAD_FINISHED;
}
No section of the above code writes or access memory shared between threads. Each thread is assigned a thread_num and a unique reader object that it may use to read data.
I want the main thread to be able to notify a thread that is in the THREAD_WAITING state that his state has been changed back to THREAD_RUNNING and he needs to do some work. I don't want to him to keep polling his state.
I understand conditional vars and mutexes can help me. But I'm not sure how to use them because I don't want to acquire or need a lock. How can the mainthread blanket notify all waiting threads that they are now free to read more data?
EDIT:
Just in case anyone needs more details
1) reader reads some files
2) thread_buckets is a vector of vectors of uint16
3) threadflags is a int vector
they have all been resized appropriately
I realize that you wrote that you wanted to avoid condition variables and locks. On the other hand you mentioned that this was because you were not sure about how to use them. Please consider the following example to get the job done without polling:
The trick with the condition variables is that a single condition_variable object together with a single mutex object will do the management for you including the handling of the unique_lock objects in the worker threads. Since you tagged your question as C++ I assume you are talking about C++11 (or higher) multithreading (I guess that C-pthreads may work similarly). Your code could be as follows:
// compile for C++11 or higher
#include <thread>
#include <condition_variable>
#include <mutex>
// objects visible to both master and workers:
std::condition_variable cvr;
std::mutex mtx;
void ThreadedRead(int32_t thread_num, BinReader reader) {
while (!reader.endOfData) {
thread_buckets[thread_num].clear();
thread_buckets[thread_num] = reader.readnextbatch()
std::unique_lock<std::mutex> myLock(mtx);
// This lock will be managed by the condition variable!
thread_flags[thread_num] = THREAD_WAITING;
while (thread_flags[thread_num] == THREAD_WAITING) {
cvr.wait(myLock);
// ...must be in a loop as shown because of potential spurious wake-ups
}
}
thread_flags[thread_num] = THREAD_FINISHED;
}
To (re-)activate the workers from a master thread:
{ // block...
// step 1: usually make sure that there is no worker still preparing itself at the moment
std::unique_lock<std::mutex> someLock(mtx);
// (in your case this would not cover workers currently busy with reader.readnextbatch(),
// these would be not re-started this time...)
// step 2: set all worker threads that should work now to THREAD_RUNNING
for (...looping over the worker's flags...) {
if (...corresponding worker should run now...) {
flag = THREAD_RUNNING;
}
}
// step 3: signalize the workers to run now
cvr.notify_all();
} // ...block, releasing someLock
Notice:
If you just want to trigger all sleeping workers you should control them with a single flag instead of a container of flags.
If you want to trigger single sleeping workers but it doesn't matter which one consider the .notify_one() member function instead of .notify_all(). Note as well that also in this case a single mutex/condition_variable pair is sufficient.
The flags should better be placed in an atomic object such as a global std::atomic<int> or maybe for finer control in a std::vector<std::atomic<int>>.
A good introduction to std::condition_variable which also inspired the suggested solution is given in: cplusplus website
It looks like there are a few issues. For one thing, you do not need the conditional inside of your loop:
while (thread_flags[thread_num] != THREAD_RUNNING);
will work by itself. As soon as that condition is false, the loop will exit.
If all you want to do is avoid checking thread_flags as quickly as possible, just put a yield in the loop:
while (thread_flags[thread_num] != THREAD_RUNNING) yield(100);
This will cause the thread to yield the CPU so that it can do other things while the thread waits for its state to change. This will make make the overhead for polling close to negligible. You can experiment with the sleep duration to find a good value. 100ms is probably on the long side.
Depending on what causes the thread state to change, you could have the thread poll that condition/value directly (with a sleep in still) and not bother with states at all.
There are a lot of options here. If you look up reader threads you can probably find just what you want; having a separate reader thread is very common.
I will make a hypothetical scenario just to be clear about what I need to know.
Let's say I have a single file being updated very often.
I need to read and parse this file by several different threads.
Everytime this file is rewritten, I'm gonna wake a condition mutex so the other threads can do whatever they want to.
My question is:
If I have 10000 threads, the first thread execution will block the execution of the other 9999 ones?
Does it work in parallel or synchronously?
This post has been edited since first posted to address comments below by Jonathan Wakely, and to better distinguish between a condition_variable, a condition (which were both called condition in the first version), and how the wait function operates. Just as important, however, is an exploration of better methods from modern C++, using std::future, std::thread and std::packaged_task, with some discussion regarding buffering and reasonable thread count.
First, 10,000 threads is a lot of threads. The thread scheduler will be highly burdened on all but the very highest performance of computers. Typical quad core workstations under Windows would struggle. It's a sign that some kind of queued scheduling of tasks is in order, typical of servers accepting thousands of connections using perhaps 10 threads, each servicing 1,000 connects. The number of threads is really not important to the question, but that in such a volume of tasks 10,000 threads is impracticable.
To handle synchronization, the mutex doesn't actually do what you're proposing, by itself. The concept you're describing is a type of event object, perhaps an auto reset event, which by itself is a higher level concept. Windows has them as part of its API, but they are fashioned on Linux (and for portable software, usually) with two primitive components, a mutex and a condition variable. Together these create the auto reset event, and other types of "waitable events" as Windows calls them. In C++ these are provided by std::mutex and std::condition_variable.
Mutexes by themselves merely provide locked control over a common resource. In that scenario we are not thinking in terms of clients and a server (or workers and an executive), but we're thinking in terms of competition among peers for a single resource which can only be accessed by one actor (thread) at a time. A mutex can block execution, but it does not release based on an external signal. Mutexes block if another thread has locked the mutex, and wait indefinitely until the owner of the lock releases it. This isn't the scenario you present in the question.
In your scenario, there are many "clients" and one "server" thread. The server is in charge of signalling that something is ready to be processed. All other threads are clients in this design (nothing about the thread itself makes them clients, we merely deem them so by the function they execute). In some discussions, clients are called worker threads.
The clients use a mutex/condition variable pair to wait for a signal. This construct usually takes the form of locking a mutex, then waiting on the condition variable using that mutex. When a thread enters wait on the condition variable, the mutex is unlocked. This is repeated for all client threads who wait for work to be done. A typical client wait example is:
std::mutex m;
std::condition_variable cv;
void client_thread()
{
// Wait until server signals data is ready
std::unique_lock<std::mutex> lk(m); // lock the mutex
cv.wait(lk); // wait on cv
// do the work
}
This is pseudo code showing the mutex/conditional variable used together. std::condition_variable has two overloads of the wait function, this is the simplest one. The intent is that a thread will block, entering into an idle state until the condition_variable is signalled. It is not intended as a complete example, merely to point out these two objects are used together.
Johnathan Wakely's comments below are based on the fact that wait is not indefinite; there is no guarantee that the reason the call is unblocked is because of a signal. The documentation calls this a "spurious wakeup", which occasionally occurs for complex reasons of OS scheduling. The point which Johnathan makes is that code using this pair must be safe to operate even if the wakeup is not because the condition_variable was signalled.
In the parlance of using condition variables, this is known as a condition (not the condition_variable). The condition is an application defined concept, usually illustrated as a boolean in the literature, and often the result of checking a bool, an integer (sometimes of atomic type) or calling a function returning a bool. Sometimes application defined notions of what constitutes a true condition are more complex, but the overall effect of the condition is to determine whether or not the thread, once awakened, should continue to process, or should simply repeat the wait.
One way to satisfy this requirement is the second version of std::condition_variable::wait. The two are declared:
void wait( std::unique_lock<std::mutex>& lock );
template< class Predicate >
void wait( std::unique_lock<std::mutex>& lock, Predicate pred );
Johnathan's point is to insist the second version be used. However, documentation describes (and the fact there are two overloads indicates) that the Predicate is optional. The Predicate is a functor of some kind, often a lambda expression, resolving to true if the wait should unblock, false if the wait should continue waiting, and it is evaluated under lock. The Predicate is synonymous with condition in that the Predicate is one way in which to indicate true or false regarding whether wait should unblock.
Although the Predicate is, in fact, optional, the notion that 'wait' is not perfect in blocking until a signal is received requires that if the first version is used, it is because the application is constructed such that spurious wakes have no consequence (indeed, are part of the design).
Jonathan's citation shows that the Predicate is evaluated under lock, but in generalized forms of the paradigm that's frequently not practicable. std::condition_variable must wait on a locked std::mutex, which may be protecting a variable defining the condition, but sometimes that's not possible. Sometimes the condition is more complex, external, or trivial enough that the std::mutex isn't associated with the condition.
To see how that works in the context of the proposed solution, assume there are 10 client threads waiting for a server to signal that work is to be done, and that work is scheduled in a queue as a container of virtual functors. A virtual functor might be something like:
struct VFunc
{
virtual void operator()(){}
};
template <typename T>
struct VFunctor
{
// Something referring to T, possible std::function
virtual void operator()(){...call the std::function...}
};
typedef std::deque< VFunc > Queue;
The pseudo code above suggests a typical functor with a virtual operator(), returning void and taking no parameters, sometimes known as a "blind call". The key point in suggesting it is the fact Queue can own a collection of these without knowing what is being called, and whatever VFunctors are in Queue could refer to anything std::function might be able to call, which includes member functions of other objects, lambdas, simple functions, etc. If, however, there is only one function signature to be called, perhaps:
typedef std::deque< std::function<void(void)>> Queue
Is sufficient.
For either case, work is to be done only if there are entries in Queue.
To wait, one might use a class like:
class AutoResetEvent
{
private:
std::mutex m;
std::condition_variable cv;
bool signalled;
bool signalled_all;
unsigned int wcount;
public:
AutoResetEvent() : wcount( 0 ), signalled(false), signalled_all(false) {}
void SignalAll() { std::unique_lock<std::mutex> l(m);
signalled = true;
signalled_all = true;
cv.notify_all();
}
void SignalOne() { std::unique_lock<std::mutex> l(m);
signalled = true;
cv.notify_one();
}
void Wait() { std::unique_lock<std::mutex> l(m);
++wcount;
while( !signalled )
{
cv.wait(l);
}
--wcount;
if ( signalled_all )
{ if ( wcount == 0 )
{ signalled = false;
signalled_all = false;
}
}
else { signalled = false;
}
}
};
This is pseudo code of a standard reset event type of waitable object, compatible with Windows CreateEvent and WaitForSingleObject API, functioning the basic same way.
All client threads end up at cv.wait (this can have a timeout in Windows, using the Windows API, but not with std::condition_variable). At some point, the server signals the event with a call to Signalxxx. Your scenario suggests SignalAll().
If notify_one is called, one of the waiting threads is released, and all others remain asleep. Of notify_all is called, then all threads waiting on that condition are released to do work.
The following might be an example of using AutoResetEvent:
AutoResetEvent evt; // probably not a global
void client()
{
while( !Shutdown ) // assuming some bool to indicate shutdown
{
if ( IsWorkPending() ) DoWork();
evt.Wait();
}
}
void server()
{
// gather data
evt.SignalAll();
}
The use of IsWorkPending() satisfies the notion of a condition, as Jonathan Wakely indicates. Until a shutdown is indidated, this loop will process work if it's pending, and wait for a signal otherwise. Spurious wakeups have no negative effect. IsWorkPending() would check Queue.size(), possibly through an object which protects Queue with a std::mutex or some other synchronization mechanism. If work is pending, DoWork() would sequentially pop entries out of Queue until Queue is empty. Upon return, the loop would again wait for a signal.
With all of that discussed, the combination of mutex and condition_variable is related to an old style of thinking, now outdated in the era of C++11/C++14. Unless you have trouble using a compliant compiler, it would be better to investigate the use of std::promise, std::future and either std::async or std::thread with std::packaged_task. For example, using future, promise, packaged_task and thread could entirely replace the discussion above.
For example:
// a function for threads to execute
int func()
{
// do some work, return status as result
return result;
}
Assuming func does the work you require on the files, these typedefs apply:
typedef std::packaged_task< int() > func_task;
typedef std::future< int > f_int;
typedef std::shared_ptr< f_int > f_int_ptr;
typedef std::vector< f_int_ptr > f_int_vec;
std::future can't be copied, so it's stored using a shared_ptr for ease of use in a vector, but there are various solutions.
Next, an example of using these for 10 threads of work
void executive_function()
{
// a vector of future pointers
f_int_vec future_list;
// start some threads
for( int n=0; n < 10; ++n )
{
// a packaged_task calling func
func_task ft( &func );
// get a future from the task as a shared_ptr
f_int_ptr future_ptr( new f_int( ft.get_future() ) );
// store the task for later use
future_list.push_back( future_ptr );
// launch a thread to call task
std::thread( std::move( ft )).detach();
}
// at this point, 10 threads are running
for( auto &d : future_list )
{
// for each future pointer, wait (block if required)
// for each thread's func to return
d->wait();
// get the result of the func return value
int res = d->get();
}
}
The point here is really in the last range-for loop. The vector stores futures, which the packaged_tasks provided. Those tasks are used to launch threads, and the future is key to synchronizing the executive. Once all threads are running, each is "waited on" with a simple call to the future's wait function, after which the return value of func can be obtained. No mutexes or condition_variables involved (that we know of).
This brings me to the subject of processing files in parallel, no matter how you launch a number of threads. If there were a machine which could handle 10,000 threads, then if each thread were a trivial file oriented operation there would be considerable RAM resources devoted to file processing, all duplicating each other. Depending on the API chosen, there are buffers associated with each read operation.
Let's say the file was 10 Mbytes, and 10,000 threads began operating on it, where each thread used 4 Kbyte buffers for processing. Combined, that suggests there would be 40 Mbytes of buffers to process a 10 Mbyte file. It would be less wasteful to simply read the file into RAM, and offer read only access to all threads from RAM.
That notion is further complicated by the fact that multiple tasks reading from various sections of the file at different times may cause heavy thrashing from a standard hard disk (not so for flash sources), if the disk cache can't keep up. More importantly, though, is that 10,000 threads are all calling system API's for reading the file, each with considerable overhead.
If the source material is a candidate for reading entirely into RAM, the threads could be focused on RAM instead of the file, alleviating that overhead, improving performance. The threads could share read access to the contents without locks.
If the source file is too large to read entirely into RAM, it may still be best read in blocks of the source file, have threads process that portion from a shared memory resource, then move to the next block in a series.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 2 years ago.
Improve this question
I was trying to understand how mutexes work. Did a lot of Googling but it still left some doubts of how it works because I created my own program in which locking didn't work.
One absolutely non-intuitive syntax of the mutex is pthread_mutex_lock( &mutex1 );, where it looks like the mutex is being locked, when what I really want to lock is some other variable. Does this syntax mean that locking a mutex locks a region of code until the mutex is unlocked? Then how do threads know that the region is locked? [UPDATE: Threads know that the region is locked, by Memory Fencing ]. And isn't such a phenomenon supposed to be called critical section? [UPDATE: Critical section objects are available in Windows only, where the objects are faster than mutexes and are visible only to the thread which implements it. Otherwise, critical section just refers to the area of code protected by a mutex]
What's the simplest possible mutex example program and the simplest possible explanation on the logic of how it works?
Here goes my humble attempt to explain the concept to newbies around the world: (a color coded version on my blog too)
A lot of people run to a lone phone booth (they don't have mobile phones) to talk to their loved ones. The first person to catch the door-handle of the booth, is the one who is allowed to use the phone. He has to keep holding on to the handle of the door as long as he uses the phone, otherwise someone else will catch hold of the handle, throw him out and talk to his wife :) There's no queue system as such. When the person finishes his call, comes out of the booth and leaves the door handle, the next person to get hold of the door handle will be allowed to use the phone.
A thread is : Each person
The mutex is : The door handle
The lock is : The person's hand
The resource is : The phone
Any thread which has to execute some lines of code which should not be modified by other threads at the same time (using the phone to talk to his wife), has to first acquire a lock on a mutex (clutching the door handle of the booth). Only then will a thread be able to run those lines of code (making the phone call).
Once the thread has executed that code, it should release the lock on the mutex so that another thread can acquire a lock on the mutex (other people being able to access the phone booth).
[The concept of having a mutex is a bit absurd when considering real-world exclusive access, but in the programming world I guess there was no other way to let the other threads 'see' that a thread was already executing some lines of code. There are concepts of recursive mutexes etc, but this example was only meant to show you the basic concept. Hope the example gives you a clear picture of the concept.]
With C++11 threading:
#include <iostream>
#include <thread>
#include <mutex>
std::mutex m;//you can use std::lock_guard if you want to be exception safe
int i = 0;
void makeACallFromPhoneBooth()
{
m.lock();//man gets a hold of the phone booth door and locks it. The other men wait outside
//man happily talks to his wife from now....
std::cout << i << " Hello Wife" << std::endl;
i++;//no other thread can access variable i until m.unlock() is called
//...until now, with no interruption from other men
m.unlock();//man lets go of the door handle and unlocks the door
}
int main()
{
//This is the main crowd of people uninterested in making a phone call
//man1 leaves the crowd to go to the phone booth
std::thread man1(makeACallFromPhoneBooth);
//Although man2 appears to start second, there's a good chance he might
//reach the phone booth before man1
std::thread man2(makeACallFromPhoneBooth);
//And hey, man3 also joined the race to the booth
std::thread man3(makeACallFromPhoneBooth);
man1.join();//man1 finished his phone call and joins the crowd
man2.join();//man2 finished his phone call and joins the crowd
man3.join();//man3 finished his phone call and joins the crowd
return 0;
}
Compile and run using g++ -std=c++0x -pthread -o thread thread.cpp;./thread
Instead of explicitly using lock and unlock, you can use brackets as shown here, if you are using a scoped lock for the advantage it provides. Scoped locks have a slight performance overhead though.
While a mutex may be used to solve other problems, the primary reason they exist is to provide mutual exclusion and thereby solve what is known as a race condition. When two (or more) threads or processes are attempting to access the same variable concurrently, we have potential for a race condition. Consider the following code
//somewhere long ago, we have i declared as int
void my_concurrently_called_function()
{
i++;
}
The internals of this function look so simple. It's only one statement. However, a typical pseudo-assembly language equivalent might be:
load i from memory into a register
add 1 to i
store i back into memory
Because the equivalent assembly-language instructions are all required to perform the increment operation on i, we say that incrementing i is a non-atmoic operation. An atomic operation is one that can be completed on the hardware with a gurantee of not being interrupted once the instruction execution has begun. Incrementing i consists of a chain of 3 atomic instructions. In a concurrent system where several threads are calling the function, problems arise when a thread reads or writes at the wrong time. Imagine we have two threads running simultaneoulsy and one calls the function immediately after the other. Let's also say that we have i initialized to 0. Also assume that we have plenty of registers and that the two threads are using completely different registers, so there will be no collisions. The actual timing of these events may be:
thread 1 load 0 into register from memory corresponding to i //register is currently 0
thread 1 add 1 to a register //register is now 1, but not memory is 0
thread 2 load 0 into register from memory corresponding to i
thread 2 add 1 to a register //register is now 1, but not memory is 0
thread 1 write register to memory //memory is now 1
thread 2 write register to memory //memory is now 1
What's happened is that we have two threads incrementing i concurrently, our function gets called twice, but the outcome is inconsistent with that fact. It looks like the function was only called once. This is because the atomicity is "broken" at the machine level, meaning threads can interrupt each other or work together at the wrong times.
We need a mechanism to solve this. We need to impose some ordering to the instructions above. One common mechanism is to block all threads except one. Pthread mutex uses this mechanism.
Any thread which has to execute some lines of code which may unsafely modify shared values by other threads at the same time (using the phone to talk to his wife), should first be made acquire a lock on a mutex. In this way, any thread that requires access to the shared data must pass through the mutex lock. Only then will a thread be able to execute the code. This section of code is called a critical section.
Once the thread has executed the critical section, it should release the lock on the mutex so that another thread can acquire a lock on the mutex.
The concept of having a mutex seems a bit odd when considering humans seeking exclusive access to real, physical objects but when programming, we must be intentional. Concurrent threads and processes don't have the social and cultural upbringing that we do, so we must force them to share data nicely.
So technically speaking, how does a mutex work? Doesn't it suffer from the same race conditions that we mentioned earlier? Isn't pthread_mutex_lock() a bit more complex that a simple increment of a variable?
Technically speaking, we need some hardware support to help us out. The hardware designers give us machine instructions that do more than one thing but are guranteed to be atomic. A classic example of such an instruction is the test-and-set (TAS). When trying to acquire a lock on a resource, we might use the TAS might check to see if a value in memory is 0. If it is, that would be our signal that the resource is in use and we do nothing (or more accurately, we wait by some mechanism. A pthreads mutex will put us into a special queue in the operating system and will notify us when the resource becomes available. Dumber systems may require us to do a tight spin loop, testing the condition over and over). If the value in memory is not 0, the TAS sets the location to something other than 0 without using any other instructions. It's like combining two assembly instructions into 1 to give us atomicity. Thus, testing and changing the value (if changing is appropriate) cannot be interrupted once it has begun. We can build mutexes on top of such an instruction.
Note: some sections may appear similar to an earlier answer. I accepted his invite to edit, he preferred the original way it was, so I'm keeping what I had which is infused with a little bit of his verbiage.
I stumbled upon this post recently and think that it needs an updated solution for the standard library's c++11 mutex (namely std::mutex).
I've pasted some code below (my first steps with a mutex - I learned concurrency on win32 with HANDLE, SetEvent, WaitForMultipleObjects etc).
Since it's my first attempt with std::mutex and friends, I'd love to see comments, suggestions and improvements!
#include <condition_variable>
#include <mutex>
#include <algorithm>
#include <thread>
#include <queue>
#include <chrono>
#include <iostream>
int _tmain(int argc, _TCHAR* argv[])
{
// these vars are shared among the following threads
std::queue<unsigned int> nNumbers;
std::mutex mtxQueue;
std::condition_variable cvQueue;
bool m_bQueueLocked = false;
std::mutex mtxQuit;
std::condition_variable cvQuit;
bool m_bQuit = false;
std::thread thrQuit(
[&]()
{
using namespace std;
this_thread::sleep_for(chrono::seconds(5));
// set event by setting the bool variable to true
// then notifying via the condition variable
m_bQuit = true;
cvQuit.notify_all();
}
);
std::thread thrProducer(
[&]()
{
using namespace std;
int nNum = 13;
unique_lock<mutex> lock( mtxQuit );
while ( ! m_bQuit )
{
while( cvQuit.wait_for( lock, chrono::milliseconds(75) ) == cv_status::timeout )
{
nNum = nNum + 13 / 2;
unique_lock<mutex> qLock(mtxQueue);
cout << "Produced: " << nNum << "\n";
nNumbers.push( nNum );
}
}
}
);
std::thread thrConsumer(
[&]()
{
using namespace std;
unique_lock<mutex> lock(mtxQuit);
while( cvQuit.wait_for(lock, chrono::milliseconds(150)) == cv_status::timeout )
{
unique_lock<mutex> qLock(mtxQueue);
if( nNumbers.size() > 0 )
{
cout << "Consumed: " << nNumbers.front() << "\n";
nNumbers.pop();
}
}
}
);
thrQuit.join();
thrProducer.join();
thrConsumer.join();
return 0;
}
For those looking for the shortex mutex example:
#include <mutex>
int main() {
std::mutex m;
m.lock();
// do thread-safe stuff
m.unlock();
}
The function pthread_mutex_lock() either acquires the mutex for the calling thread or blocks the thread until the mutex can be acquired. The related pthread_mutex_unlock() releases the mutex.
Think of the mutex as a queue; every thread that attempts to acquire the mutex will be placed on the end of the queue. When a thread releases the mutex, the next thread in the queue comes off and is now running.
A critical section refers to a region of code where non-determinism is possible. Often this because multiple threads are attempting to access a shared variable. The critical section is not safe until some sort of synchronization is in place. A mutex lock is one form of synchronization.
You are supposed to check the mutex variable before using the area protected by the mutex. So your pthread_mutex_lock() could (depending on implementation) wait until mutex1 is released or return a value indicating that the lock could not be obtained if someone else has already locked it.
Mutex is really just a simplified semaphore. If you read about them and understand them, you understand mutexes. There are several questions regarding mutexes and semaphores in SO. Difference between binary semaphore and mutex, When should we use mutex and when should we use semaphore and so on. The toilet example in the first link is about as good an example as one can think of. All code does is to check if the key is available and if it is, reserves it. Notice that you don't really reserve the toilet itself, but the key.
SEMAPHORE EXAMPLE ::
sem_t m;
sem_init(&m, 0, 0); // initialize semaphore to 0
sem_wait(&m);
// critical section here
sem_post(&m);
Reference : http://pages.cs.wisc.edu/~remzi/Classes/537/Fall2008/Notes/threads-semaphores.txt
I'm using the C++ boost::thread library, which in my case means I'm using pthreads. Officially, a mutex must be unlocked from the same thread which locks it, and I want the effect of being able to lock in one thread and then unlock in another. There are many ways to accomplish this. One possibility would be to write a new mutex class which allows this behavior.
For example:
class inter_thread_mutex{
bool locked;
boost::mutex mx;
boost::condition_variable cv;
public:
void lock(){
boost::unique_lock<boost::mutex> lck(mx);
while(locked) cv.wait(lck);
locked=true;
}
void unlock(){
{
boost::lock_guard<boost::mutex> lck(mx);
if(!locked) error();
locked=false;
}
cv.notify_one();
}
// bool try_lock(); void error(); etc.
}
I should point out that the above code doesn't guarantee FIFO access, since if one thread calls lock() while another calls unlock(), this first thread may acquire the lock ahead of other threads which are waiting. (Come to think of it, the boost::thread documentation doesn't appear to make any explicit scheduling guarantees for either mutexes or condition variables). But let's just ignore that (and any other bugs) for now.
My question is, if I decide to go this route, would I be able to use such a mutex as a model for the boost Lockable concept. For example, would anything go wrong if I use a boost::unique_lock< inter_thread_mutex > for RAII-style access, and then pass this lock to boost::condition_variable_any.wait(), etc.
On one hand I don't see why not. On the other hand, "I don't see why not" is usually a very bad way of determining whether something will work.
The reason I ask is that if it turns out that I have to write wrapper classes for RAII locks and condition variables and whatever else, then I'd rather just find some other way to achieve the same effect.
EDIT:
The kind of behavior I want is basically as follows. I have an object, and it needs to be locked whenever it is modified. I want to lock the object from one thread, and do some work on it. Then I want to keep the object locked while I tell another worker thread to complete the work. So the first thread can go on and do something else while the worker thread finishes up. When the worker thread gets done, it unlocks the mutex.
And I want the transition to be seemless so nobody else can get the mutex lock in between when thread 1 starts the work and thread 2 completes it.
Something like inter_thread_mutex seems like it would work, and it would also allow the program to interact with it as if it were an ordinary mutex. So it seems like a clean solution. If there's a better solution, I'd be happy to hear that also.
EDIT AGAIN:
The reason I need locks to begin with is that there are multiple master threads, and the locks are there to prevent them from accessing shared objects concurrently in invalid ways.
So the code already uses loop-level lock-free sequencing of operations at the master thread level. Also, in the original implementation, there were no worker threads, and the mutexes were ordinary kosher mutexes.
The inter_thread_thingy came up as an optimization, primarily to improve response time. In many cases, it was sufficient to guarantee that the "first part" of operation A, occurs before the "first part" of operation B. As a dumb example, say I punch object 1 and give it a black eye. Then I tell object 1 to change it's internal structure to reflect all the tissue damage. I don't want to wait around for the tissue damage before I move on to punch object 2. However, I do want the tissue damage to occur as part of the same operation; for example, in the interim, I don't want any other thread to reconfigure the object in such a way that would make tissue damage an invalid operation. (yes, this example is imperfect in many ways, and no I'm not working on a game)
So we made the change to a model where ownership of an object can be passed to a worker thread to complete an operation, and it actually works quite nicely; each master thread is able to get a lot more operations done because it doesn't need to wait for them all to complete. And, since the event sequencing at the master thread level is still loop-based, it is easy to write high-level master-thread operations, as they can be based on the assumption that an operation is complete (more precisely, the critical "first part" upon which the sequencing logic depends is complete) when the corresponding function call returns.
Finally, I thought it would be nice to use inter_thread mutex/semaphore thingies using RAII with boost locks to encapsulate the necessary synchronization that is required to make the whole thing work.
man pthread_unlock (this is on OS X, similar wording on Linux) has the answer:
NAME
pthread_mutex_unlock -- unlock a mutex
SYNOPSIS
#include <pthread.h>
int
pthread_mutex_unlock(pthread_mutex_t *mutex);
DESCRIPTION
If the current thread holds the lock on mutex, then the
pthread_mutex_unlock() function unlocks mutex.
Calling pthread_mutex_unlock() with a mutex that the
calling thread does not hold will result in
undefined behavior.
...
My counter-question would be - what kind of synchronization problem are you trying to solve with this? Most probably there is an easier solution.
Neither pthreads nor boost::thread (built on top of it) guarantee any order in which a contended mutex is acquired by competing threads.
Sorry, but I don't understand. what will be the state of your mutex in line [1] in the following code if another thread can unlock it?
inter_thread_mutex m;
{
m.lock();
// [1]
m.unlock();
}
This has no sens.
There's a few ways to approach this. Both of the ones I'm going to suggest are going to involve adding an additional piece of information to the object, rather adding a mechanism to unlock a thread from a thread other than the one that owns it.
1) you can add some information to indicate the object's state:
enum modification_state { consistent, // ready to be examined or to start being modified
phase1_complete, // ready for the second thread to finish the work
};
// first worker thread
lock();
do_init_work(object);
object.mod_state = phase1_complete;
unlock();
signal();
do_other_stuff();
// second worker thread
lock()
while( object.mod_state != phase1_complete )
wait()
do_final_work(obj)
object.mod_state = consistent;
unlock()
signal()
// some other thread that needs to read the data
lock()
while( object.mod_state != consistent )
wait();
read_data(obj)
unlock()
Works just fine with condition variables, because obviously you're not writing your own lock.
2) If you have a specific thread in mind, you can give the object an owner.
// first worker
lock();
while( obj.owner != this_thread() ) wait();
do_initial_work(obj);
obj.owner = second_thread_id;
unlock()
signal()
...
This is pretty much the same solution as my first solution, but more flexible in the adding/removing of phases, and less flexible in the adding/removing of threads.
To be honest, I'm not sure how inter thread mutex would help you here. You'd still need a semaphore or condition variable to signal the passing of the work to the second thread.
Small modification to what you already have: how about storing the id of the thread which you want to take the lock, in your inter_thread_whatever? Then unlock it, and send a message to that thread, saying "I want you execute whatever routine it is that tries to take this lock".
Then the condition in lock becomes while(locked || (desired_locker != thisthread && desired_locker != 0)). Technically you've "released the lock" in the first thread, and "taken it again" in the second thread, but there's no way that any other thread can grab it in between, so it's as if you've transferred it directly from one to the other.
There's a potential problem, that if a thread exits or is killed, while it's the desired locker of your lock, then that thread deadlocks. But you were already talking about the first thread waiting for a message from the second thread to say that it has successfully acquired the lock, so presumably you already have a plan in mind for what happens if that message is never received. To that plan, add "reset the desired_locker field on the inter_thread_whatever".
This is all very hairy, though, I'm not convinced that what I've proposed is correct. Is there a way that the "master" thread (the one that's directing all these helpers) can just make sure that it doesn't order any more operations to be performed on whatever is protected by this lock, until the first op is completed (or fails and some RAII thing notifies you)? You don't need locks as such, if you can deal with it at the level of the message loop.
I don't think it is a good idea to say that your inter_thread_mutex (binary_semaphore) can be seen as a model of Lockable. The main issue is that the main feature of your inter_thread_mutex defeats the Locakble concept. If inter_thread_mutex was a model of lockable you will expect in In [1] that the inter_thread_mutex m is locked.
// thread T1
inter_thread_mutex m;
{
unique_lock<inter_thread_mutex> lk(m);
// [1]
}
But as an other thread T2 can do m.unlock() while T1 is in [1], the guaranty is broken.
Binary semaphores can be used as Lockables as far as each thread tries to lock before unlocking. But the main goal of your class is exactly the contrary.
This is one of the reason semaphores in Boost.Interprocess don't use lock/unlock to name the functions, but wait/notify. Curiously these are the same names used by conditions :)
A mutex is a mechanism for describing mutually exclusive blocks of code. It does not make sense for these blocks of code to cross thread boundaries. Trying to use such a concept in such an counter intuitive way can only lead to problems down the line.
It sounds very much like you're looking for a different multi-threading concept, but without more detail it's hard to know what.