How to wait efficiently on a set of semaphores? - C++

I'm using semaphores with shared memory to communicate between multiple producers and multiple consumers. There are two main kinds of semaphores in my system: "stored semaphores" and "processed semaphores".
The system runs as follows: producers continuously put data into the shared memory and then increase the stored semaphore's value, while the consumers sit in a loop waiting on those stored semaphores. After receiving data from a producer, a consumer processes it and then increases the processed semaphore's value. Producers get their results by waiting on the "processed semaphore".
The producer code:
for (int i = 0; i < nloop; i++) {
    usleep(100);
    strcpy(shared_mem[i], "data for processing");
    sem_post(&shared_mem[i].stored_semaphore);
    if (sem_timedwait(&msg_ptr->processed_semaphore, &ts) == -1) { // waiting for result
        if (errno == ETIMEDOUT) {
            // timed out waiting for the result
        }
        break;
    } else {
        // success
    }
}
The consumer code:
for (int j = 0; j < MAX_MESSAGE; j++) {
    if (sem_trywait(&(shm_ptr->messages[j].stored_semaphore)) == -1) {
        if (errno == EAGAIN) {
            // no data in this slot yet
        }
    } else {
        // success ==> process data,
        // post the result back to the shared memory, and increase
        // the processed semaphore
        strcpy(shared_mem[j].output, "Processed data");
        sem_post(&(shared_mem[j].processed_semaphore));
    }
} // for loop over MAX_MESSAGE
My problem is that the for loop in the consumer code wastes almost 100% CPU, because when there is no data from the producers the loop runs continuously.
My question is whether there is any other way to wait on a set of semaphores (similar to the waiting mechanism of select, poll, or epoll) that does not waste CPU time.
I hope to see your answers. Thanks so much!

As far as I know there isn't a way to wait on a set of semaphores. This means that all accesses need to be funnelled through a single semaphore. You're looping over a set of semaphores, so they collectively can become one object. That consumer needs to know when any of the semaphores has been signalled, so use an additional sem_post on a new semaphore to signal that the set of semaphores has changed.
Your producer code becomes something like this:
....
sem_post(&shared_mem[i].stored_semaphore);
sem_post(&list_changed_semaphore); /* Wake the consumer. */
....
and the consumer:
/* Block until a producer has indicated that it has changed the semaphore list */
if (!sem_wait(&list_changed_semaphore)) {
    /* At least one producer has signalled a change. */
    for (int j = 0; j < MAX_MESSAGE; j++) {
        if (sem_trywait(&(shm_ptr->messages[j].stored_semaphore)) == -1) {
            /* Nothing stored in this slot; try the next one. */
        }
    }
}
Instead of using a semaphore for list_changed_semaphore, you could use a pthread_cond_t condition variable to signal that something in your set of semaphores has changed. The list_changed_semaphore does not need to be a counter as in the example shown here; it only needs to be a single bit indicating that a producer has modified the list.
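To make the single wake-up semaphore idea concrete, here is a minimal sketch. It is not the asker's code: it runs the producer and consumer as two threads in one process (so the semaphores are created with pshared = 0 rather than placed in shared memory), and the names Slot, kSlots, waitOn, and runDemo are hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <semaphore.h>
#include <thread>

constexpr int kSlots = 4;  // hypothetical; stands in for MAX_MESSAGE

struct Slot {
    sem_t stored;     // posted by the producer when the slot holds data
    sem_t processed;  // posted by the consumer when the result is ready
    int   value;
};

// Blocks on a semaphore, retrying if a signal interrupts the wait.
static void waitOn(sem_t* sem) {
    while (sem_wait(sem) == -1 && errno == EINTR) {}
}

// Runs one producer and one consumer for `rounds` passes over the slots
// and returns the sum of all values the consumer received.
int runDemo(int rounds) {
    assert(rounds >= 0);
    Slot slots[kSlots];
    sem_t listChanged;  // the single semaphore the consumer sleeps on
    sem_init(&listChanged, 0, 0);
    for (Slot& s : slots) {
        sem_init(&s.stored, 0, 0);
        sem_init(&s.processed, 0, 0);
        s.value = 0;
    }

    const int total = rounds * kSlots;
    int sum = 0;

    std::thread consumer([&] {
        int consumed = 0;
        while (consumed < total) {
            waitOn(&listChanged);  // sleeps here; no CPU used while idle
            for (Slot& s : slots) {
                if (sem_trywait(&s.stored) == 0) {
                    sum += s.value;          // "process" the data
                    ++consumed;
                    sem_post(&s.processed);  // hand the result back
                }
            }
        }
    });

    int next = 1;
    for (int r = 0; r < rounds; ++r) {
        for (Slot& s : slots) {
            s.value = next++;
            sem_post(&s.stored);
            sem_post(&listChanged);  // wake the consumer
            waitOn(&s.processed);    // wait for the result
        }
    }

    consumer.join();
    for (Slot& s : slots) {
        sem_destroy(&s.stored);
        sem_destroy(&s.processed);
    }
    sem_destroy(&listChanged);
    return sum;
}
```

The consumer still scans every slot, but only after it has been woken, so an idle system costs no CPU. For the multi-process case in the question, the semaphores would presumably live in the shared memory segment and be initialised with a non-zero pshared argument.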

Related

Little bit confused about using Condition Variable for concurrent programming

I am currently learning about game programming with the book 'Game Engine Architecture' by Jason Gregory.
In this book, he shows an example that motivates the use of a condition variable:
[Without Condition Variable]
Queue g_queue;
pthread_mutex_t g_mutex;
bool g_ready = false;
void* ProducerThread(void*)
{
// keep on producing forever...
while (true)
{
pthread_mutex_lock(&g_mutex);
// fill the queue with data
ProduceDataInto(&g_queue);
g_ready = true;
pthread_mutex_unlock(&g_mutex);
// yield the remainder of my timeslice
// to give the consumer a chance to run
pthread_yield();
}
return nullptr;
}
void* ConsumerThread(void*)
{
// keep on consuming forever...
while (true)
{
// wait for the data to be ready
while (true)
{
// read the value into a local,
// making sure to lock the mutex
pthread_mutex_lock(&g_mutex);
const bool ready = g_ready;
pthread_mutex_unlock(&g_mutex);
if (ready) break;
}
// consume the data
pthread_mutex_lock(&g_mutex);
ConsumeDataFrom(&g_queue);
g_ready = false;
pthread_mutex_unlock(&g_mutex);
// yield the remainder of my timeslice
// to give the producer a chance to run
pthread_yield();
}
return nullptr;
}
About this example he says: 'Besides the fact that this example is somewhat contrived, there’s one big problem with it: The consumer thread spins in a tight loop, polling the value of g_ready'
As far as I can tell, pthread_mutex_lock(&g_mutex) is a blocking function: if the calling thread can't acquire the mutex, it goes to sleep.
So isn't the consumer thread not actually in a busy-wait state?
I mean, it doesn't spin at all while it fails to acquire the mutex, does it?
Although pthread_mutex_lock is a blocking function, it blocks only while another thread holds the mutex, and here the mutex is held only briefly. ProduceDataInto and ConsumeDataFrom return immediately, so each loop locks and unlocks the mutex over and over, and both the producer and the consumer loops still spin tightly. In particular, the consumer's inner loop keeps locking, reading g_ready, and unlocking until the flag changes, which is polling.
So there should be a queue-full condition variable to make the producer wait and a queue-empty condition variable to make the consumer wait.
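A minimal sketch of that two-condition-variable design follows. It uses std::mutex and std::condition_variable, the C++ standard library counterparts of pthread_mutex_t and pthread_cond_t, rather than the book's pthread code; BoundedQueue and produceAndConsume are hypothetical names, and the capacity of 4 is arbitrary.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>

class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Sleeps while the queue is full, then enqueues the value.
    void push(int value) {
        std::unique_lock<std::mutex> lock(mutex_);
        notFull_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push(value);
        notEmpty_.notify_one();  // wake a consumer waiting for data
    }

    // Sleeps while the queue is empty, then dequeues one value.
    int pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        notEmpty_.wait(lock, [this] { return !items_.empty(); });
        int value = items_.front();
        items_.pop();
        notFull_.notify_one();   // wake a producer waiting for space
        return value;
    }

private:
    std::size_t capacity_;
    std::queue<int> items_;
    std::mutex mutex_;
    std::condition_variable notEmpty_;
    std::condition_variable notFull_;
};

// Pushes 1..count from a producer thread and sums them in the caller.
long produceAndConsume(int count) {
    BoundedQueue queue(4);
    std::thread producer([&] {
        for (int i = 1; i <= count; ++i) queue.push(i);
    });
    long sum = 0;
    for (int i = 0; i < count; ++i) sum += queue.pop();
    producer.join();
    return sum;
}
```

Neither thread polls here: wait() releases the mutex and puts the thread to sleep until the other side calls notify_one(), and the predicate guards against spurious wake-ups.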

Synchronising main thread and worker thread

In Qt, from the main (GUI) thread I am creating a worker thread to perform an operation that accesses a resource shared by both threads. On a certain action in the GUI, the main thread has to manipulate the resource. I tried using QMutex to lock that resource, but it is continuously used by the worker thread. How can the main thread be notified when it may access it?
I tried using QWaitCondition, but it crashed the application.
Is there any other option to notify and achieve synchronisation between threads?
The code snippet is attached.
void WorkerThread::IncrementCounter()
{
qDebug() << "In Worker Thread IncrementCounter function" << endl;
while(stop == false)
{
mutex.lock();
for(int i = 0; i < 100; i++)
{
for(int j = 0; j < 100; j++)
{
counter++;
}
}
qDebug() << counter;
mutex.unlock();
}
qDebug() << "In Worker Thread Aborting " << endl;
}
//Manipulating the counter value by main thread.
void WorkerThread::setCounter(int value)
{
waitCondition.wait(&mutex);
counter = value;
waitCondition.notify_one();
}
You are using the wait condition completely wrong.
I urge you to read up on mutexes and wait conditions, and maybe look at some examples.
wait() blocks execution until notify_one() or notify_all() is called somewhere else, which of course cannot happen in your code. The mutex passed to wait() must also already be locked by the calling thread, and setCounter() never locks it.
You cannot wait() on a condition on one line and then expect the next two lines to ever run if they contain the only wake-up call.
What you want is to wait() in one thread and notify_XXX() in another.
You could use shared memory from within the same process. Each thread could lock it before writing it, like this:
QSharedMemory *shared=new QSharedMemory("Test Shared Memory");
if(shared->create(1,QSharedMemory::ReadWrite))
{
shared->lock();
// Copy some data to it
char *to = (char*)shared->data();
const char *from = &dataBuffer;
memcpy(to, from, dataSize);
shared->unlock();
}
You should also lock it for reading. If you want to store strings, reading them can be easier than writing them, provided they are zero-terminated. You'll want to convert with .toLatin1() to get a zero-terminated string whose size you can query. You might be able to get access that multiple threads can read through with shared->attach(), but that is more for reading the shared memory of a different process.
You might just use this instead of mutexes. I think that if you try to lock it while something else already holds the lock, the call will simply block until the other holder unlocks it.

Threading and Mutex

I'm working on a program that simulates a gas station. Each car at the station is its own thread. Each car must loop through a single bitmask to check whether a pump is open and, if it is, update the bitmask, fill up, and notify other cars that the pump is now open. My current code works, but there are some issues with load balancing. Ideally all the pumps are used the same amount and all cars get equal fill-ups.
EDIT: My program basically takes a number of cars, pumps, and a length of time to run the test for. During that time, cars will check for an open pump by constantly calling this function.
int Station::fillUp()
{
// loop through the pumps using the bitmask to check if they are available
for (int i = 0; i < pumpsInStation; i++)
{
//Check bitmask to see if pump is open
stationMutex->lock();
if ((freeMask & (1 << i)) == 0 )
{
//Turning the bit on
freeMask |= (1 << i);
stationMutex->unlock();
// Sleeps thread for 30ms and increments counts
pumps[i].fillTankUp();
// Turning the bit back off
stationMutex->lock();
freeMask &= ~(1 << i);
stationCondition->notify_one();
stationMutex->unlock();
// Sleep long enough for all cars to have a chance to fill up first.
this_thread::sleep_for(std::chrono::milliseconds((((carsInStation-1) * 30) / pumpsInStation)-30));
return 1;
}
stationMutex->unlock();
}
// If no pumps are available, wait until one becomes available.
stationCondition->wait(std::unique_lock<std::mutex>(*stationMutex));
return -1;
}
I feel the issue has something to do with locking the bitmask when I read it. Do I need to have some sort of mutex or lock around the if check?
It looks like every car checks the availability of pump #0 first, and if that pump is busy it then checks pump #1, and so on. Given that, it seems expected to me that pump #0 would service the most cars, followed by pump #1 serving the second-most cars, all the way down to pump #(pumpsInStation-1) which only ever gets used in the (relatively rare) situation where all of the pumps are in use simultaneously at the time a new car pulls in.
If you'd like to get better load-balancing, you should probably have each car choose a different random ordering to iterate over the pumps, rather than having them all check the pumps' availability in the same order.
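As a rough sketch of the random-ordering suggestion, each car could build its own shuffled list of pump indices and probe the pumps in that order. The helper name randomPumpOrder is hypothetical and not part of the asker's Station class.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <random>
#include <vector>

// Returns the pump indices 0..pumpCount-1 in a random order.
std::vector<int> randomPumpOrder(int pumpCount, std::mt19937& rng) {
    assert(pumpCount >= 0);
    std::vector<int> order(pumpCount);
    std::iota(order.begin(), order.end(), 0);       // 0, 1, 2, ...
    std::shuffle(order.begin(), order.end(), rng);  // uniform permutation
    return order;
}
```

A car would then iterate over the returned vector instead of counting i from 0, giving each pump roughly the same chance of being tried first. Each thread should own its generator, since std::mt19937 is not safe to share without synchronisation.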
Normally I wouldn't suggest refactoring as it's kind of rude and doesn't go straight to the answer, but here I think it would help you a bit to break your logic into three parts, like so, to better show where the contention lies:
int Station::acquirePump()
{
// loop through the pumps using the bitmask to check if they are available
ScopedLocker locker(&stationMutex);
for (int i = 0; i < pumpsInStation; i++)
{
// Check bitmask to see if pump is open
if ((freeMask & (1 << i)) == 0 )
{
//Turning the bit on
freeMask |= (1 << i);
return i;
}
}
return -1;
}
void Station::releasePump(int n)
{
ScopedLocker locker(&stationMutex);
freeMask &= ~(1 << n);
stationCondition->notify_one();
}
bool Station::fillUp()
{
// If a pump is available:
int i = acquirePump();
if (i != -1)
{
// Sleeps thread for 30ms and increments counts
pumps[i].fillTankUp();
releasePump(i);
// Sleep long enough for all cars to have a chance to fill up first.
this_thread::sleep_for(std::chrono::milliseconds((((carsInStation-1) * 30) / pumpsInStation)-30));
return true;
}
// If no pumps are available, wait until one becomes available.
// wait() takes the lock by reference, so it needs a named lock object.
std::unique_lock<std::mutex> lock(*stationMutex);
stationCondition->wait(lock);
return false;
}
With the code in this form, the load-balancing issue is easier to see, and it is important to fix if you don't want to "exhaust" one pump or if the pump itself might take a lock internally. The issue lies in acquirePump, which checks the pumps for availability in the same order for every car. A simple tweak to balance it better looks like this:
int Station::acquirePump()
{
// loop through the pumps using the bitmask to check if they are available
ScopedLocker locker(&stationMutex);
for (int n = 0, i = startIndex; n < pumpsInStation; ++n, i = (i+1) % pumpsInStation)
{
// Check bitmask to see if pump is open
if ((freeMask & (1 << i)) == 0 )
{
// Change the starting index used to search for a free pump for
// the next car.
startIndex = (startIndex+1) % pumpsInStation;
// Turning the bit on
freeMask |= (1 << i);
return i;
}
}
return -1;
}
Another thing I have to ask is whether it's really necessary (e.g. for memory efficiency) to use bit flags to indicate whether a pump is in use. If you can use an array of atomic flags (for example std::atomic<bool>) instead, you'll be able to avoid locking completely and simply use atomic operations to acquire and release pumps, which avoids creating a traffic jam of blocked threads.
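A minimal sketch of the lock-free variant, assuming one std::atomic<bool> per pump; the class name PumpFlags and its methods are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

class PumpFlags {
public:
    explicit PumpFlags(std::size_t count) : busy_(count) {
        for (auto& flag : busy_) flag.store(false);
    }

    // Tries to claim a free pump; returns its index, or -1 if all are busy.
    int acquire() {
        for (std::size_t i = 0; i < busy_.size(); ++i) {
            bool expected = false;
            // Atomically flips false -> true; fails if another car won the race.
            if (busy_[i].compare_exchange_strong(expected, true))
                return static_cast<int>(i);
        }
        return -1;
    }

    // Marks a previously acquired pump as free again.
    void release(int i) {
        assert(i >= 0 && static_cast<std::size_t>(i) < busy_.size());
        busy_[i].store(false);
    }

private:
    std::vector<std::atomic<bool>> busy_;
};
```

This removes the mutex from the claim and release paths only. A car that gets -1 still needs some way to wait, such as retrying after a short sleep or blocking on a condition variable, so whether it is a net win depends on how often all pumps are busy.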
Imagine that the mutex has a queue associated with it, containing the waiting threads. Now, one of your threads manages to get the mutex that protects the bitmask of occupied stations, checks if one specific place is free. If it isn't, it releases the mutex again and loops, only to go back to the end of the queue of threads waiting for the mutex. Firstly, this is unfair, because the first one to wait is not guaranteed to get the next free slot, only if that slot happens to be the one on its loop counter. Secondly, it causes an extreme amount of context switches, which is bad for performance. Note that your approach should still produce correct results in that no two cars collide while accessing a single filling station, but the behaviour is suboptimal.
What you should do instead is this:
1. lock the mutex to get exclusive access to the possible filling stations
2. locate the next free filling station
3. if none of the stations are free, wait for the condition variable and restart at point 2
4. mark the slot as occupied and release the mutex
5. fill up the car (this is where the sleep in the simulation actually makes sense, the other one doesn't)
6. lock the mutex
7. mark the slot as free and signal the condition variable to wake up others
8. release the mutex again
Just in case that part isn't clear to you, waiting on a condition variable implicitly releases the mutex while waiting and reacquires it afterwards!
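The steps above can be sketched roughly as follows. This is not the asker's Station class: the member names, the 1 ms fill time, and the simulate helper are assumptions chosen to keep the example short.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

class Station {
public:
    explicit Station(int pumpCount) : pumpCount_(pumpCount) {
        assert(pumpCount > 0 && pumpCount <= 32);
    }

    // Blocks until a pump is free, "fills up", and returns the pump used.
    int fillUp() {
        int pump = -1;
        {
            std::unique_lock<std::mutex> lock(mutex_);  // step 1
            // Steps 2 and 3: wait() releases the mutex while asleep and
            // re-runs the search under the lock after every wake-up.
            pumpFreed_.wait(lock, [&] { return (pump = findFreePump()) != -1; });
            busyMask_ |= (1u << pump);                  // step 4
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(1));  // step 5
        {
            std::lock_guard<std::mutex> lock(mutex_);   // step 6
            busyMask_ &= ~(1u << pump);                 // step 7
            ++fillUps_;
            pumpFreed_.notify_one();
        }                                               // step 8
        return pump;
    }

    int fillUps() {
        std::lock_guard<std::mutex> lock(mutex_);
        return fillUps_;
    }

private:
    // Caller must hold mutex_.
    int findFreePump() const {
        for (int i = 0; i < pumpCount_; ++i)
            if ((busyMask_ & (1u << i)) == 0) return i;
        return -1;
    }

    int pumpCount_;
    unsigned busyMask_ = 0;
    int fillUps_ = 0;
    std::mutex mutex_;
    std::condition_variable pumpFreed_;
};

// Runs `cars` threads that each fill up `fillsPerCar` times and
// returns the total number of completed fill-ups.
int simulate(int cars, int pumps, int fillsPerCar) {
    Station station(pumps);
    std::vector<std::thread> threads;
    for (int c = 0; c < cars; ++c)
        threads.emplace_back([&] {
            for (int f = 0; f < fillsPerCar; ++f) station.fillUp();
        });
    for (auto& t : threads) t.join();
    return station.fillUps();
}
```

The mutex is held only while the bitmask is inspected or changed, never during the fill-up itself, and a car that finds no free pump sleeps inside wait() instead of cycling through the mutex queue.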

Concurrent server using pthread API

I am writing a simple client-server application using the pthreads API, which in pseudo code
looks something like this:
static volatile sig_atomic_t g_running = 1;
static volatile sig_atomic_t g_threads = 0;
static pthread_mutex_t g_threads_mutex;
static void signalHandler(int signal)
{
g_running = 0;
}
static void *threadServe(void *params)
{
/* Increment the number of currently running threads. */
pthread_mutex_lock(&g_threads_mutex);
g_threads++;
pthread_mutex_unlock(&g_threads_mutex);
/* handle client's request */
/* decrement the number of running threads */
pthread_mutex_lock(&g_threads_mutex);
g_threads--;
pthread_mutex_unlock(&g_threads_mutex);
}
int main(int argc, char *argv[])
{
/* do all the initialisation
(set up signal handlers, listening socket, ... ) */
/* run the server loop */
while (g_running)
{
int comm_sock = accept(listen_socket, NULL, 0);
pthread_create(&thread_id, NULL, &threadServe, comm_sock) ;
pthread_detach(thread_id);
}
/* wait for all threads that are yet busy processing client requests */
while (1)
{
std::cerr << "Waiting for all threads to finish" << std::endl;;
pthread_mutex_lock(&g_threads_mutex);
if (g_threads <= 0)
{
pthread_mutex_unlock(&g_threads_mutex);
break;
}
pthread_mutex_unlock(&g_threads_mutex);
}
/* clean up */
}
So the server runs in an infinite loop until a signal (SIGINT or SIGTERM) is received. The purpose of the second while loop is to give all the threads that were processing client requests when the signal arrived a chance to finish the work they had already started.
However, I don't like this design very much, because that second while loop is basically a busy loop wasting CPU resources.
I tried searching on Google for some good examples of a threaded concurrent server, but I had no luck. One idea that came to my mind was to use pthread_cond_wait() instead of that loop, but I am not sure whether this would bring further problems.
So the question is how to improve my design; alternatively, please point me to a nice, simple example that deals with a similar problem.
EDIT:
I was considering pthread_join(), but I didn't know how to join with a worker thread,
while the main server loop (with accept() call in it) would be still running.
If I called pthread_join() somewhere after pthread_create()
(instead of pthread_detach()), then the while loop would be blocked until the worker
thread is done and the whole threading would not make sense.
I could use pthread_join() if I spawned all the threads at program start,
but then I would have them around for the entire life of my server,
which I thought might be a little inefficient.
Also after reading man page I understood, that pthread_detach() is exactly
suitable for this purpose.
The busy loop slurping CPU can easily be tamed by adding a usleep(10000); or something like that outside your mutex lock.
It would be more light-weight if you use a std::atomic<int> g_threads; - that way, you could get rid of the mutex altogether.
If you have an array of (active) thread_id's, you could just use a loop of
for (i = 0; i < num_active_threads; i++)
    pthread_join(arr[i], NULL);
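If the worker threads stay detached, the condition-variable idea from the question can replace the final busy loop. The following is only a sketch under assumptions: it uses std::thread, std::mutex, and std::condition_variable instead of the raw pthread calls, and ActiveWorkers and serveAndDrain are hypothetical names.

```cpp
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

class ActiveWorkers {
public:
    // Called by the accepting thread before a worker is started.
    void enter() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++active_;
    }

    // Called by a worker as its last action.
    void leave() {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(active_ > 0);
        ++finished_;
        if (--active_ == 0) idle_.notify_all();
    }

    // Sleeps until no worker is active; returns how many have finished.
    int waitUntilIdle() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return active_ == 0; });
        return finished_;
    }

private:
    std::mutex mutex_;
    std::condition_variable idle_;
    int active_ = 0;
    int finished_ = 0;
};

// Starts one detached worker per client, then blocks until all are done.
int serveAndDrain(int clients) {
    // shared_ptr keeps the counter alive for as long as any worker uses it.
    auto workers = std::make_shared<ActiveWorkers>();
    for (int i = 0; i < clients; ++i) {
        workers->enter();  // counted before the thread starts running
        std::thread([workers] {
            /* handle the client's request here */
            workers->leave();
        }).detach();
    }
    return workers->waitUntilIdle();
}
```

Incrementing the counter in the accepting thread, rather than inside the worker as in the question, closes a window in which the shutdown code could see a count of zero while a freshly created worker has not yet registered itself.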

ThreadQueue - Development for Servers - C++

Today I had an idea: to build a ThreadQueue in C++ for my server application.
unsigned int m_Actives; // Count of active threads
unsigned int m_Maximum;
std::map<HANDLE, unsigned int> m_Queue;
std::map<HANDLE, unsigned int>::iterator m_QueueIt;
In an extra thread I would like to run this loop:
while(true)
{
if(m_Actives != m_Maximum)
{
if(m_Queue.size() > 0)
{
uintptr_t h = _beginthread((void(__cdecl*)(void*))m_QueueIt->first, 0, NULL);
m_Actives++;
}
else
{
Sleep(100); // Little Cooldown, should it be higher? or lower?
}
}
}
m_Maximum is settable and is the maximum thread count. I think that should work, but now I need to wait for each active thread and check whether it has finished or is still alive. For this I would use WaitForSingleObject, but then I need one extra thread per thread, so two threads: one does the actual work, and the other waits for the first one to exit.
But I think that is really bad. What would you do?
You can use WaitForMultipleObjects to wait until any of the started threads has ended.
Or, what is probably better in this case, each thread can signal an event just before it stops. Then the monitor thread only has to wait for and process this event.
But, to be honest, your description and source are rather hard to follow...
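The "signal an event before stopping" idea can be sketched portably as below. This is an analogue, not the Win32 API: a std::condition_variable stands in for the Windows event object, and runLimited together with its parameters is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Runs every task on its own thread, never more than `maximum` at once.
// Returns the highest number of simultaneously active tasks observed.
int runLimited(const std::vector<std::function<void()>>& tasks, int maximum) {
    assert(maximum > 0);
    std::mutex mutex;
    std::condition_variable finished;  // plays the role of the "thread done" event
    int active = 0;
    int peak = 0;
    std::vector<std::thread> threads;

    for (const auto& task : tasks) {
        {
            std::unique_lock<std::mutex> lock(mutex);
            // Sleep (no Sleep(100) polling) until a running task reports completion.
            finished.wait(lock, [&] { return active < maximum; });
            ++active;
            peak = std::max(peak, active);
        }
        threads.emplace_back([&, task] {
            task();
            {
                std::lock_guard<std::mutex> lock(mutex);
                --active;
            }
            finished.notify_one();  // signal the monitor just before exiting
        });
    }
    for (auto& t : threads) t.join();
    return peak;
}
```

The monitor loop needs no second watcher thread per worker: it blocks on one shared signal and wakes only when a slot frees up.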