Readers-Writers Synchronisation Issue - concurrency

I am having trouble understanding why the first readers-writers problem can starve writer processes, i.e. how the code below gives reader processes priority. Shouldn't a writer process be able to gain the lock when one of the reader processes performs signal(wrt)? Is the waiting list of the semaphore structured in a way that gives priority to reader processes (I can see how writers would be starved by a steady stream of reader processes in a LIFO list), or am I misunderstanding something fundamental here?
semaphore wrt = 1, mutex = 1;
readcount = 0;

writer()
{
    wait(wrt);
    // writing is done
    signal(wrt);
}

reader()
{
    wait(mutex);
    readcount++;
    if (readcount == 1)    // first reader locks writers out
        wait(wrt);
    signal(mutex);

    /// Do the Reading
    /// (Critical Section Area)

    wait(mutex);
    readcount--;
    if (readcount == 0)    // last reader lets writers back in
        signal(wrt);
    signal(mutex);
}

If you always have 2 or more readers active, the signal(wrt) at the end of the reader block will never get called. New readers won't see readcount == 1, so they won't wait on wrt, but they will increase readcount. This lets an unending stream of read requests starve the writer thread. Only if the reader count ever drops to 0 is wrt released, and the writer can finally work. Until that point readers have priority.
This isn't precisely a LIFO approach, but rather a priority scheme in which readers always have priority.
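For contrast, here is a minimal sketch (my own, not from the question) of the usual starvation-free fix: an extra "service queue" turnstile semaphore that both readers and writers pass through in arrival order, so a waiting writer blocks later readers from overtaking it. It assumes POSIX semaphores that wake waiters roughly in FIFO order; the names queue_sem and mutex_sem are illustrative.

#include <semaphore.h>

sem_t queue_sem;   // turnstile; initialise to 1 with sem_init()
sem_t wrt;         // initialise to 1
sem_t mutex_sem;   // protects readcount; initialise to 1
int readcount = 0;

void writer_task()
{
    sem_wait(&queue_sem);    // take a place in line
    sem_wait(&wrt);          // wait for current readers/writer to leave
    sem_post(&queue_sem);    // let the next arrival line up
    // ... write ...
    sem_post(&wrt);
}

void reader_task()
{
    sem_wait(&queue_sem);    // take a place in line, behind any waiting writer
    sem_wait(&mutex_sem);
    if (++readcount == 1)    // first reader locks writers out
        sem_wait(&wrt);
    sem_post(&mutex_sem);
    sem_post(&queue_sem);    // readers leave the line immediately
    // ... read ...
    sem_wait(&mutex_sem);
    if (--readcount == 0)    // last reader lets writers back in
        sem_post(&wrt);
    sem_post(&mutex_sem);
}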

Related

Single reader multiple writers with pthreads and locks and without boost

Consider the following piece of code.
#include <iostream>
#include <vector>
#include <map>
#include <cstdlib>
#include <pthread.h>
using namespace std;

map<pthread_t, vector<int>> map_vec;
vector<pair<pthread_t, int>> how_much_and_where;
pthread_cond_t CV = PTHREAD_COND_INITIALIZER;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

void* writer(void* args)
{
    while (*some condition*)
    {
        int howMuchPush = (rand() % 5) + 1;
        for (int i = 0; i < howMuchPush; ++i)
        {
            // WRITE
            map_vec[pthread_self()].push_back(rand() % 10);
        }
        how_much_and_where.push_back(make_pair(pthread_self(), howMuchPush));
        // Wake up the reader - there's something to read.
        pthread_cond_signal(&CV);
    }
    cout << "writer thread: " << pthread_self() << endl;
    return nullptr;
}

void* reader(void* args)
{
    pair<pthread_t, int> to_do;
    pthread_cond_wait(&CV, &mutex);
    while (*what condition??*)
    {
        to_do = how_much_and_where.front();
        how_much_and_where.erase(how_much_and_where.begin());
        // READ
        cout << to_do.first << " wrote " << endl;
        for (int i = 0; i < to_do.second; i++)
        {
            cout << map_vec[to_do.first][i] << endl;
        }
        // Done reading. Go to sleep.
        pthread_cond_wait(&CV, &mutex);
    }
    return nullptr;
}

//----------------------------------------------------------------------------//

int main()
{
    pthread_t threads[4];
    // Writers
    pthread_create(&threads[0], nullptr, writer, nullptr);
    pthread_create(&threads[1], nullptr, writer, nullptr);
    pthread_create(&threads[2], nullptr, writer, nullptr);
    // Reader
    pthread_create(&threads[3], nullptr, reader, nullptr);
    pthread_join(threads[0], nullptr);
    pthread_join(threads[1], nullptr);
    pthread_join(threads[2], nullptr);
    pthread_join(threads[3], nullptr);
    return 0;
}
Background
Every writer has its own container to which it writes data.
And suppose that there is a reader who knows when a writer has finished writing a chunk of data, and what the size of that chunk is (the reader has a container into which writers push pairs of this data).
Questions
1. Obviously I should put locks on the shared resources - map_vec and how_much_and_where. But I don't understand what, in this case, is the efficient way to position locks on these resources (for example, locking map_vec before every push_back in the for loop? Or locking it before the whole loop - but isn't pushing to a queue a long operation that may make the reader wait too much?), and what is the safe way to position locks in order to prevent deadlocks.
2. I don't understand what the right condition in the while loop should be. I thought that maybe it should run as long as how_much_and_where is not empty, but obviously a situation may occur in which the reader empties how_much_and_where right before a writer adds a pair.
3. Suppose a writer sends a signal while the reader is busy reading some data. As far as I understand, this signal will be ignored, and the pair the writer pushed may never be dealt with (the number of signals received and dealt with would be less than the number of pairs/tasks for the reader). How can I prevent such a scenario?
To simplify things we should decouple the implementation of the general-purpose/reusable producer-consumer queue (or simply "blocking queue" as I usually call it) from the implementation of the actual producers and the consumer (which aren't general-purpose/reusable - they are specific to your program). This will make the code much clearer and more manageable from a design perspective.
1. Implementing a general-purpose (reusable) blocking queue
First you should implement a "blocking queue" that can manage multiple producers and a single consumer. This blocking queue will contain the code that handles multithreading/synchronization, and it can be used by a consumer thread to receive items from several producer threads. Such a blocking queue can be implemented in a lot of different ways (not only with the mutex+cond combo) depending on whether you have 1 or more consumers and 1 or more producers (sometimes it is possible to introduce different kinds of [platform-specific] optimizations when you have only 1 consumer or 1 producer). The simplest queue implementation with a mutex+cond pair automatically handles multiple producers and multiple consumers if needed.
The queue has only an internal container (it can be a non-thread-safe std::queue, vector or list) that holds the items, and an associated mutex+cond pair that protects this container from concurrent access by multiple threads. The queue has to provide two operations:
produce(item): puts one item into the queue and returns immediately. The pseudo code looks like this:
    lock mutex
    add the new item to the internal container
    signal through cond
    unlock mutex
    return
wait_and_get(): if there is at least one item in the queue then it removes the oldest one and returns immediately, otherwise it waits until someone puts an item into the queue with the produce(item) operation.
    lock mutex
    if container is empty:
        wait for cond (pthread_cond_wait)
    remove oldest item
    unlock mutex
    return the removed oldest item
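To make this concrete, here is a minimal C++ sketch of such a blocking queue (my own illustration, with assumed names; it uses a while loop around pthread_cond_wait for the spurious-wakeup reason discussed in the notes at the end):

#include <pthread.h>
#include <deque>

template <typename T>
class BlockingQueue {
public:
    BlockingQueue() {
        pthread_mutex_init(&mutex_, nullptr);
        pthread_cond_init(&cond_, nullptr);
    }
    ~BlockingQueue() {
        pthread_cond_destroy(&cond_);
        pthread_mutex_destroy(&mutex_);
    }
    void produce(const T& item) {
        pthread_mutex_lock(&mutex_);
        items_.push_back(item);       // add the new item to the container
        pthread_cond_signal(&cond_);  // signal through cond
        pthread_mutex_unlock(&mutex_);
    }
    T wait_and_get() {
        pthread_mutex_lock(&mutex_);
        while (items_.empty())        // guard against spurious wakeups
            pthread_cond_wait(&cond_, &mutex_);
        T item = items_.front();      // remove the oldest item
        items_.pop_front();
        pthread_mutex_unlock(&mutex_);
        return item;
    }
private:
    pthread_mutex_t mutex_;
    pthread_cond_t cond_;
    std::deque<T> items_;
};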
2. Implementing your program using the blocking queue
Now that you have a reusable blocking queue to build on we can implement the producers and the consumer along with the main thread that controls things.
The producers
They just throw a bunch of items into the queue (by calling produce(item) of the blocking queue) and then they exit. If the production of items isn't computation-heavy and doesn't require waiting for a lot of IO operations then this will finish very quickly in your example program. To simulate real-world scenarios where the threads do heavy work, you could do the following: on each producer thread put only X items (let's say 5) into the queue, but wait a random number of seconds (say between 1 and 3) between items. Note that after some time your producer threads quit by themselves when they have finished their job.
The consumer
The consumer has an infinite loop in which it always gets the next item from the queue with wait_and_get() and processes it somehow. If it is a special item that signals the end of processing then it breaks out of the infinite loop instead of processing the item. Pseudo code:
Infinite loop:
    get the next item from the queue (wait_and_get())
    if this is the special item indicating the end of processing, break out of the loop
    otherwise, process this item
The main thread
Start all threads including producers and the consumers in any order.
Wait for all producer threads to finish (pthread_join() them).
Remember that producers finish and quit by themselves after some time without external stimuli. When you have finished joining all producers, it means that every producer has quit, so no one will call the produce(item) operation of the queue again. However, the queue may still have unprocessed items and the consumer may still be working on crunching those.
Put the last special "end of processing" item to the queue for the consumer.
When the consumer finishes processing the last item produced by the producers, it will still ask the queue for the next item with wait_and_get() - this would deadlock, waiting for a next item that never arrives. To avoid this, on the main thread we put a last special item into the queue that signals the end of processing for the consumer. Remember that our consumer implementation contains a check for this special item to find out when to finish processing. It is important that this special item is placed into the queue on the main thread only after the producers have finished (after joining them)!
If you have multiple consumers then it's easier to put multiple special "end of processing" items into the queue (1 for each consumer) than to make the queue smart enough to handle multiple consumers with only 1 "end of processing" item. Since the main thread orchestrates the whole thing (thread creation, thread joining, etc.) it knows exactly the number of consumers, so it's easy to put the same number of "end of processing" items into the queue.
Wait for the consumer thread to terminate by joining it.
After putting the end-of-processing special item into the queue, we wait for the consumer thread to process the remaining items (produced by the producers) along with our last special item (produced by the main "coordinator" thread) that asks the consumer to finish. We do this waiting on the main thread by pthread_join()-ing the consumer thread.
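Putting the pieces together, a hedged sketch of how the producers, the consumer, and the main thread could use the BlockingQueue sketched above (the sentinel value -1, the item type int, and the thread counts are illustrative assumptions, not part of the original answer):

#include <pthread.h>
#include <unistd.h>    // sleep()
#include <cstdlib>     // rand()
#include <iostream>

BlockingQueue<int> queue;           // the sketch class from above
const int kEndOfProcessing = -1;    // assumed special "end of processing" item

void* producer(void*) {
    for (int i = 0; i < 5; ++i) {   // produce X=5 items...
        sleep((rand() % 3) + 1);    // ...waiting 1-3 seconds between items
        queue.produce(rand() % 10);
    }
    return nullptr;                 // producers quit by themselves
}

void* consumer(void*) {
    for (;;) {
        int item = queue.wait_and_get();
        if (item == kEndOfProcessing)   // special item: stop processing
            break;
        std::cout << "consumed " << item << std::endl;
    }
    return nullptr;
}

int main() {
    pthread_t producers[3], consumer_thread;
    for (pthread_t& t : producers)
        pthread_create(&t, nullptr, producer, nullptr);
    pthread_create(&consumer_thread, nullptr, consumer, nullptr);
    for (pthread_t& t : producers)      // join all producers first
        pthread_join(t, nullptr);
    queue.produce(kEndOfProcessing);    // only now ask the consumer to finish
    pthread_join(consumer_thread, nullptr);
    return 0;
}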
Additional notes:
In my threaded system implementations the items of the blocking queue are usually pointers - pointers to "job" objects that have to be executed/processed. (You can implement the blocking queue as a template class; in this case the user of the blocking queue can specify the type of the item.) In my case it is easy to put a special "end of processing" item into the queue for the consumer: I usually use a simple NULL job pointer for this purpose. In your case you will have to find out what kind of special value you can use in the queue to signal the end of processing to the consumer.
The producers may have their own queues and a whole bunch of other data structures with which they play around to "produce items" but the consumer doesn't care about those data structures. The consumer cares only about individual items received through its own blocking queue. If a producer wants something from the consumer then it has to send an item (a "job") to the consumer through the queue. The blocking queue instance belongs to the consumer thread - it provides a one-way communication channel between an arbitrary thread and the consumer thread. Even the consumer thread itself can put an item into its own queue (in some cases this is useful).
The pthread_cond_wait documentation says that this function can wake up without actual signaling (although I've never seen a single bug caused by a spurious wakeup of this function in my life). To handle this, the "if container is empty then pthread_cond_wait" part of the code should be replaced with "while the container is empty, pthread_cond_wait", but again, this spurious wakeup thing is probably a Loch Ness monster that is present only on some architectures with specific implementations of the threading primitives, so your code would probably work on desktop machines without caring about this problem.

Non blocking shared memory producer using boost interprocess condition to notify

I am trying to develop an application with one producer and several consumers.
The producer is one process and each consumer is one process. The shared resource is some kind of buffer in shared memory.
The producer should work completely independently from the consumers. It should not be blocked in any case. Therefore the consumers are responsible for checking that the data they read from the shared memory is valid, and for handling the case where the producer has already overwritten the data. (They do this using some kind of hashing. Not important.)
The consumers should be informed when new data is available in the buffer. I think boost interprocess conditions are suitable for this use case. (More suitable would be boost signals2, but this library does not work in an interprocess way.)
Conditions always need a mutex. But I do not need the mutex in my producer. In the consumers I only need the mutex for condition#wait.
Is it OK to only use condition#notify_all in the producer and not use the mutex? Or is this an abuse of the library?
Thanks in advance
It's okay to signal without holding the mutex, but it could lead to worst-case behaviour in rare cases (thread starvation).
Signaling under the mutex guarantees fair scheduling of the waiters under POSIX, as far as I am aware.¹
That said, I think the commenters are right when they smell overcomplication in the design. I'd simplify. Optimize when you need it.
¹ See e.g. here: http://linux.die.net/man/3/pthread_cond_signal
The pthread_cond_broadcast() or pthread_cond_signal() functions may be called by a thread whether or not it currently owns the mutex that threads calling pthread_cond_wait() or pthread_cond_timedwait() have associated with the condition variable during their waits; however, if predictable scheduling behavior is required, then that mutex shall be locked by the thread calling pthread_cond_broadcast() or pthread_cond_signal().
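For illustration, a hedged sketch (mine, with assumed names) of what this looks like with boost::interprocess primitives - the producer notifies without taking the mutex, while the waiting side still needs it for wait():

#include <boost/interprocess/sync/interprocess_mutex.hpp>
#include <boost/interprocess/sync/interprocess_condition.hpp>
#include <boost/interprocess/sync/scoped_lock.hpp>

struct SharedState {    // assumed to be placed in shared memory
    boost::interprocess::interprocess_mutex mutex;
    boost::interprocess::interprocess_condition cond;
};

// Producer: write to the buffer, then notify without holding the mutex.
// Legal per the man page quoted above, but "predictable scheduling" is
// only guaranteed when the notifier holds the mutex.
void producer_publish(SharedState& s) {
    // ... write into the shared buffer ...
    s.cond.notify_all();
}

// Consumer: the mutex is required around wait().
void consumer_wait(SharedState& s) {
    boost::interprocess::scoped_lock<
        boost::interprocess::interprocess_mutex> lock(s.mutex);
    s.cond.wait(lock);   // after waking, validate the data (e.g. by hash)
    // ... read and verify the data ...
}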
The producer should work completely independently from the consumers. It should not be blocked in any case.
Why not? This should not affect performance if you do not lock too frequently. You can have a data counter in shared memory and lock access to that counter only. Data can be stored in a circular buffer in shared memory, and access to it does not need to be locked, because consumers check how much data is available to read using the counter. Of course, the consumers need to read the data fast enough. If the data gets overwritten, the internal consumer counter can be reset to the value of the interprocess counter.
The producer can also store data using many threads. Each thread can calculate the future position of its data at the beginning of the thread and then update the counter after the data has been stored, at the end of the thread. Additional locking is then needed for the future-position calculations so that this value can be passed between threads.
In detail, the non-multithreaded scenario could work like this:
Producer loop:
    receive X samples of data
    lock access to the interprocess counter, increment the counter, unlock the access
Then each consumer has its own internal counter, so that it can compare with the interprocess counter whether and how much data is available to read (simply polling for data):
Consumer loop:
    lock access to the interprocess counter, read the counter value, unlock the access
    compare the read value with the internal counter
    if the values are equal:         // no data available
        sleep, then continue to the beginning of the loop
    else if data was overwritten:    // no need for hashing here; the counter can be used to figure that out, although doing it this way is probably a bit risky
        set the internal counter to the value of the interprocess counter
        continue to the beginning of the loop
    else:
        read the available data
        increment the internal counter
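A hedged C++ sketch of that counter scheme (the struct layout, ring capacity, and function names are my own assumptions; as described above, only the counter is locked, so consumers must detect overwrites themselves):

#include <boost/interprocess/sync/interprocess_mutex.hpp>
#include <boost/interprocess/sync/scoped_lock.hpp>
#include <cstddef>
#include <cstdint>

const std::size_t kRingSize = 1024;    // assumed buffer capacity

struct SharedRing {    // assumed to be placed in shared memory
    boost::interprocess::interprocess_mutex counter_mutex;
    uint64_t write_counter;            // total samples written so far
    int samples[kRingSize];            // circular buffer, deliberately not locked
};

// Producer loop body: store the samples first, then publish by bumping the counter.
void produce(SharedRing& r, const int* data, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        r.samples[(r.write_counter + i) % kRingSize] = data[i];
    boost::interprocess::scoped_lock<
        boost::interprocess::interprocess_mutex> lock(r.counter_mutex);
    r.write_counter += n;
}

// Consumer loop body: poll the counter, then read whatever is available.
void consume(SharedRing& r, uint64_t& internal_counter) {
    uint64_t current;
    {
        boost::interprocess::scoped_lock<
            boost::interprocess::interprocess_mutex> lock(r.counter_mutex);
        current = r.write_counter;
    }
    if (current == internal_counter)    // no data available
        return;                         // caller sleeps and retries
    if (current - internal_counter > kRingSize) {   // data was overwritten
        internal_counter = current;     // reset to the interprocess counter
        return;
    }
    for (; internal_counter < current; ++internal_counter)
        /* read */ (void)r.samples[internal_counter % kRingSize];
}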

Idiom or pattern for N concurrent readers and 1 producer

Using the C++11 standard library (with the help of boost::thread if necessary), is there a clean way to implement an N readers - 1 producer solution where all the readers, once notified at the same time (with std::condition_variable::notify_all(), for example) by the producer, are guaranteed to enter their critical section before the producer eventually enters its critical section a second time? In other words, all the notified readers must observe the same state of the shared resource. Once the producer notifies the N readers, it cannot modify the shared resource until all N readers have finished their reading. Note that boost::barrier is not really what I need, as I do not know N in advance; N may vary from one notification to another.
You could use an atomic counter, with some polling from the producer thread.
When the counter reaches either N or 0 (it's up to you), the producer gets to work and produces whatever it needs to produce. Before notifying the condition variable, the producer sets the counter to 0 (or N).
When a reader is done, it simply increments (or decrements) the counter.
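A hedged C++11 sketch of that idea (all names are mine; the producer polls the atomic counter as suggested, and N may differ from round to round):

#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

std::mutex m;
std::condition_variable cv;
std::atomic<int> readers_left(0);
int generation = 0;    // bumped once per notification round

// Producer: modify the resource, reset the counter, notify, then poll.
void produce_round(int n_readers) {
    {
        std::lock_guard<std::mutex> lock(m);
        // ... modify the shared resource ...
        readers_left = n_readers;   // set the counter before notifying
        ++generation;
    }
    cv.notify_all();
    while (readers_left.load() != 0)    // poll until every reader is done
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
}

// Reader: wait for the next round, read, then decrement the counter.
void read_round(int last_seen_generation) {
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [&] { return generation != last_seen_generation; });
    lock.unlock();    // readers can now read concurrently: the producer
                      // will not modify anything until readers_left == 0
    // ... read the shared resource ...
    --readers_left;   // the last reader releases the producer
}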
What you describe is called a barrier.

How to create a one-to-many wait mechanism in WinAPI

I have a thread that puts data in a buffer and multiple threads that will read their portions of data from the buffer.
How can I create a synchronization mechanism to satisfy these requirements:
The writer thread writes all data into the buffer and lets all reader threads run at the same time (OK, I did that with a semaphore), then waits for all of them to complete their turn. I could not use WaitForMultipleObjects directly, since the threads are not terminating; rather, one round of their loop is ending. Maybe an event for each reader thread, which they signal when their round of the loop ends, so the writer can use WaitForMultipleObjects to wait for all threads to finish their round?
Reader threads read their data, do their job for that round, and somehow let the writer thread put the next data. Please note that the writer should start its next round of the loop only when all of the threads have finished their turn.
How can I implement such a mechanism? As I said, the only thing I can think of is:
Writer:
for (;;)
{
    PutDataIntoBuffer();
    for (i = 0; i < threadCount; ++i)
    {
        ResetEvent(threadEvents[i]); // so that all events will be non-signaled
    }
    ReleaseSemaphore(sem, threadCount, NULL);
    WaitForMultipleObjects(threadCount, threadEvents, TRUE, INFINITE);
}
Readers:
for (;;)
{
    WaitForSingleObject(sem, INFINITE);
    DoWhateverToBeDoneWithData();
    SetEvent(threadEvents[myThreadIndex]); // writer, wait for me too!
}
What are better ways of doing this?
You should use a readers-writer lock.
In Windows there is the Slim Reader/Writer (SRW) lock, which I recommend you have a look at.
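For reference, a minimal hedged sketch of SRW lock usage (available on Vista and later); this illustrates only the locking primitive, not the round-based handshake from the question:

#include <windows.h>

SRWLOCK srwLock = SRWLOCK_INIT;
int sharedData = 0;

void WriterUpdate(int value)
{
    AcquireSRWLockExclusive(&srwLock);  // exclusive mode: one writer
    sharedData = value;
    ReleaseSRWLockExclusive(&srwLock);
}

int ReaderRead()
{
    AcquireSRWLockShared(&srwLock);     // shared mode: many readers at once
    int value = sharedData;
    ReleaseSRWLockShared(&srwLock);
    return value;
}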
If you are using VC10, you'd be better off using the Concurrency::unbounded_buffer class, which will handle everything for you. Just use Concurrency::send or Concurrency::asend to append a message to it, and Concurrency::receive or Concurrency::try_receive to read a message from it. This class implements a FIFO. There can be multiple readers as well as multiple writers.
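A hedged minimal sketch of that API (Concurrency Runtime, <agents.h>, VS2010 and later):

#include <agents.h>
#include <iostream>

int main()
{
    Concurrency::unbounded_buffer<int> buf;  // FIFO message buffer
    Concurrency::send(buf, 42);              // producer side appends a message
    int msg = Concurrency::receive(buf);     // consumer side blocks until a message arrives
    std::cout << msg << std::endl;
    return 0;
}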

What is better for a message queue? mutex & cond or mutex&semaphore?

I am implementing a C++ message queue based on a std::queue.
As I need poppers to wait on an empty queue, I was considering using a mutex for mutual exclusion and a cond for suspending threads on an empty queue, as GLib does with GAsyncQueue.
However, it looks to me like a mutex & semaphore would do the job: I think a semaphore contains an integer counter, and that seems like a pretty high number to reach with pending messages.
A pro of the semaphore is that you don't need to manually check the condition each time you return from a wait, as you know for sure that someone inserted something (when someone inserted 2 items and you are the second thread arriving).
Which one would you choose?
EDIT:
Changed the question in response to #Greg Rogers
A single semaphore does not do the job - you need to be comparing (mutex + semaphore) and (mutex + condition variable).
It is pretty easy to see this by trying to implement it:
void push(T t)
{
    queue.push(t);
    sem.post();
}

T pop()
{
    sem.wait();
    T t = queue.front();
    queue.pop();
    return t;
}
As you can see there is no mutual exclusion when you are actually reading/writing to the queue, even though the signalling (from the semaphore) is there. Multiple threads can call push at the same time and break the queue, or multiple threads could call pop at the same time and break it. Or, a thread could call pop and be removing the first element of the queue while another thread called push.
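For completeness, a hedged sketch of the combined mutex + semaphore version the answer is comparing against (the class shape and POSIX semaphore usage are my assumptions):

#include <mutex>
#include <queue>
#include <semaphore.h>

template <typename T>
class MessageQueue {
public:
    MessageQueue() { sem_init(&sem_, 0, 0); }  // the count tracks queued items
    ~MessageQueue() { sem_destroy(&sem_); }
    void push(const T& t) {
        {
            std::lock_guard<std::mutex> lock(mutex_);  // serialize writers
            queue_.push(t);
        }
        sem_post(&sem_);   // wake (at most) one waiting popper
    }
    T pop() {
        sem_wait(&sem_);   // blocks until at least one item was pushed
        std::lock_guard<std::mutex> lock(mutex_);      // serialize readers
        T t = queue_.front();
        queue_.pop();
        return t;
    }
private:
    std::mutex mutex_;
    sem_t sem_;
    std::queue<T> queue_;
};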
You should use whichever you think is easier to implement, I doubt performance will vary much if any (it might be interesting to measure though).
Personally I use a mutex to serialize access to the list, and wake up the consumer by sending a byte over a socket (produced by socketpair()). That may be somewhat less efficient than a semaphore or condition variable, but it has the advantage of allowing the consumer to block in select()/poll(). That way the consumer can also wait on other things besides the data queue, if it wants to. It also lets you use the exact same queueing code on almost all OS's, since practically every OS supports the BSD sockets API.
Pseudocode follows:
// Called by the producer. Adds a data item to the queue, and sends a byte
// on the socket to notify the consumer, if necessary
void PushToQueue(const DataItem & di)
{
    mutex.Lock();
    bool sendSignal = (queue.size() == 0);
    queue.push_back(di);
    mutex.Unlock();
    if (sendSignal) producerSocket.SendAByteNonBlocking();
}

// Called by consumer after consumerSocket selects as ready-for-read.
// Returns true if (di) was written to, or false if there wasn't anything to read after all.
// Consumer should call this in a loop until it returns false, and then
// go back to sleep inside select() to wait for further data from the producer
bool PopFromQueue(DataItem & di)
{
    consumerSocket.ReadAsManyBytesAsPossibleWithoutBlockingAndThrowThemAway();
    mutex.Lock();
    bool ret = (queue.size() > 0);
    if (ret) queue.pop_front(di);
    mutex.Unlock();
    return ret;
}
If you want to allow multiple simultaneous users of your queue at a time, you should use a semaphore:
sema(10) // ten threads/processes have concurrent access
sema_lock(&sema_obj)
    queue
sema_unlock(&sema_obj)
A mutex will "authorize" only one user at a time:
pthread_mutex_lock(&mutex_obj)
    global_data;
pthread_mutex_unlock(&mutex_obj)
That's the main difference, and you should decide which solution fits your requirements.
But I'd choose the mutex approach, because you don't need to specify how many users can grab your resource.
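To make the contrast concrete, a hedged sketch of both approaches (names taken from the pseudocode above; POSIX APIs assumed):

#include <pthread.h>
#include <semaphore.h>

sem_t sema_obj;    // counting semaphore
pthread_mutex_t mutex_obj = PTHREAD_MUTEX_INITIALIZER;
int global_data = 0;

void init()
{
    sem_init(&sema_obj, 0, 10);   // up to ten threads may enter at once
}

void semaphore_user()
{
    sem_wait(&sema_obj);          // claim one of the ten slots
    // ... access the queue concurrently with up to 9 other threads ...
    sem_post(&sema_obj);
}

void mutex_user()
{
    pthread_mutex_lock(&mutex_obj);   // exactly one thread at a time
    global_data++;                    // exclusive access
    pthread_mutex_unlock(&mutex_obj);
}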