C++ Producer Consumer, same consumer thread grabs all tasks

I am implementing a producer-consumer project in C++. When I run the program, the same consumer grabs almost all of the work and the other consumer threads get none. Sometimes another thread does get some work, but then that thread takes control for a while. For example, TID 10 could grab almost all of the work, and then all of a sudden TID 12 would take over, with no other consumer threads getting work in between.
Any idea why other threads wouldn't have a chance to grab work?
#include <thread>
#include <iostream>
#include <mutex>
#include <condition_variable>
#include <deque>
#include <csignal>
#include <unistd.h>
using namespace std;
int max_queue_size = 100;
int num_producers = 5;
int num_consumers = 7;
int num_operations = 40;
int operations_created = 0;
thread_local int operations_created_by_this_thread = 0;
int operations_consumed = 0;
thread_local int operations_consumed_by_this_thread = 0;
struct thread_stuff {
int a;
int b;
int operand_num;
char operand;
};
char operands[] = {'+', '-', '/', '*'};
deque<thread_stuff> q;
bool finished = false;
condition_variable cv;
mutex queue_mutex;
void producer(int n) {
while (operations_created_by_this_thread < num_operations) {
int oper_num = rand() % 4;
thread_stuff equation;
equation.a = rand();
equation.b = rand();
equation.operand_num = oper_num;
equation.operand = operands[oper_num];
while ((operations_created - operations_consumed) >= max_queue_size) {
// don't do anything until it has space available
}
{
lock_guard<mutex> lk(queue_mutex);
q.push_back(equation);
operations_created++;
}
cv.notify_all();
operations_created_by_this_thread++;
this_thread::__sleep_for(chrono::seconds(rand() % 2), chrono::nanoseconds(0));
}
{
lock_guard<mutex> lk(queue_mutex);
if(operations_created == num_operations * num_producers){
finished = true;
}
}
cv.notify_all();
}
void consumer() {
while (true) {
unique_lock<mutex> lk(queue_mutex);
cv.wait(lk, [] { return finished || !q.empty(); });
if(!q.empty()) {
thread_stuff data = q.front();
q.pop_front();
operations_consumed++;
operations_consumed_by_this_thread++;
int ans = 0;
switch (data.operand_num) {
case 0:
ans = data.a + data.b;
break;
case 1:
ans = data.a - data.b;
break;
case 2:
ans = data.a / data.b;
break;
case 3:
ans = data.a * data.b;
break;
}
cout << "Operation " << operations_consumed << " processed by PID " << getpid()
<< " TID " << this_thread::get_id() << ": "
<< data.a << " " << data.operand << " " << data.b << " = " << ans << " queue size: "
<< (operations_created - operations_consumed) << endl;
}
this_thread::yield();
if (finished) break;
}
}
void usr1_handler(int signal) {
cout << "Status: Produced " << operations_created << " operations and "
<< (operations_created - operations_consumed) << " operations are in the queue" << endl;
}
void usr2_handler(int signal) {
cout << "Status: Consumed " << operations_consumed << " operations and "
<< (operations_created - operations_consumed) << " operations are in the queue" << endl;
}
int main(int argc, char *argv[]) {
if (argc < 5) {
cout << "Invalid number of parameters passed in" << endl;
exit(1);
}
max_queue_size = atoi(argv[1]);
num_operations = atoi(argv[2]);
num_producers = atoi(argv[3]);
num_consumers = atoi(argv[4]);
// signal(SIGUSR1, usr1_handler);
// signal(SIGUSR2, usr2_handler);
thread producers[num_producers];
thread consumers[num_consumers];
for (int i = 0; i < num_producers; i++) {
producers[i] = thread(producer, num_operations);
}
for (int i = 0; i < num_consumers; i++) {
consumers[i] = thread(consumer);
}
for (int i = 0; i < num_producers; i++) {
producers[i].join();
}
for (int i = 0; i < num_consumers; i++) {
consumers[i].join();
}
cout << "finished!" << endl;
}

You're holding the mutex the whole time, including while you call yield().
Scope the unique_lock like you do in your producer's code, so that only popping from the queue and incrementing the counter happen under the lock.
I see that you have a max queue size. You need a second condition variable for the producer to wait on when the queue is full, and the consumer should signal it as it consumes items.
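The advice above can be sketched as a bounded queue with two condition variables. This is a minimal sketch, not the asker's code: BoundedQueue, run_bounded_demo, and the int payload are hypothetical names chosen for illustration, and the "closed" flag is an assumed stand-in for the question's finished flag.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical bounded queue (illustrative names, not from the question).
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {}

    // Blocks while the queue is full, instead of spinning.
    void push(int value) {
        std::unique_lock<std::mutex> lk(m_);
        not_full_.wait(lk, [this] { return q_.size() < capacity_; });
        q_.push_back(value);
        lk.unlock();               // release before notifying
        not_empty_.notify_one();
    }

    // Blocks while the queue is empty. Returns false once closed and drained.
    bool pop(int& out) {
        std::unique_lock<std::mutex> lk(m_);
        not_empty_.wait(lk, [this] { return closed_ || !q_.empty(); });
        if (q_.empty()) return false;
        out = q_.front();
        q_.pop_front();
        lk.unlock();               // the lock is NOT held while the caller works
        not_full_.notify_one();
        return true;
    }

    // Tells consumers that no more items will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lk(m_);
            closed_ = true;
        }
        not_empty_.notify_all();
    }

private:
    std::mutex m_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
    std::deque<int> q_;
    std::size_t capacity_;
    bool closed_ = false;
};

// Runs the given number of producers and consumers; returns the items consumed.
int run_bounded_demo(int producers, int consumers, int per_producer, std::size_t capacity) {
    BoundedQueue queue(capacity);
    std::atomic<int> consumed{0};
    std::vector<std::thread> prod, cons;
    for (int p = 0; p < producers; ++p)
        prod.emplace_back([&] {
            for (int i = 0; i < per_producer; ++i) queue.push(i);
        });
    for (int c = 0; c < consumers; ++c)
        cons.emplace_back([&] {
            int item;
            while (queue.pop(item)) consumed.fetch_add(1);  // work happens outside the lock
        });
    for (auto& t : prod) t.join();
    queue.close();
    for (auto& t : cons) t.join();
    return consumed.load();
}
```

Calling run_bounded_demo(5, 7, 40, 10) from main should consume all 200 items. How evenly they are spread across consumers still depends on the scheduler.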

Any idea why other threads wouldn't have a chance to grab work?
This poll is troubling:
while ((operations_created - operations_consumed) >= max_queue_size)
{
// don't do anything until it has space available
}
You might try a minimal delay in the loop. As written, this busy-wait is a 'bad neighbor': it can consume an entire core while doing no useful work.

There are a few issues with your code:
Using Normal Variables for Inter-Thread Communication
Here is an example:
int operations_created = 0;
int operations_consumed = 0;
void producer(int n) {
[...]
while ((operations_created - operations_consumed) >= max_queue_size) { }
and later
void consumer() {
[...]
operations_consumed++;
This will work only on x86 architectures without optimizations, i.e. -O0. Once optimizations are enabled, the compiler is allowed to transform the while loop into:
void producer(int n) {
[...]
if ((operations_created - operations_consumed) >= max_queue_size) {
while (true) { }
}
So your program will simply hang here. You can check this on Compiler Explorer:
mov eax, DWORD PTR operations_created[rip]
sub eax, DWORD PTR operations_consumed[rip]
cmp eax, DWORD PTR max_queue_size[rip]
jl .L19 // here is the if before the loop
.L20:
jmp .L20 // here is the empty loop
.L19:
Why is this happening? From the point of view of a single-threaded program, while (condition) { operators } is exactly equivalent to if (condition) while (true) { operators } when the operators do not change the condition.
To fix the issue, we should use std::atomic<int> instead of plain int. Atomics are designed for inter-thread communication, so the compiler will avoid such optimizations and generate the correct assembly.
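As a rough illustration of the std::atomic fix, the sketch below polls two atomic counters. The Counters struct, wait_for_space, and run_atomic_demo are hypothetical names, not part of the question. The yield() inside the loop is an added courtesy and does not replace a condition variable.

```cpp
#include <atomic>
#include <thread>

// Hypothetical counters shared between threads (illustrative names).
struct Counters {
    std::atomic<int> created{0};
    std::atomic<int> consumed{0};
};

// Polls until the backlog drops below limit. Because the counters are
// atomic, every iteration performs a real load; the compiler cannot
// hoist the check out of the loop.
int wait_for_space(const Counters& c, int limit) {
    while (c.created.load() - c.consumed.load() >= limit) {
        std::this_thread::yield();  // be a slightly better neighbor while polling
    }
    return c.created.load() - c.consumed.load();
}

// Starts with a backlog of 5, lets another thread consume everything,
// and returns the backlog observed once space became available.
int run_atomic_demo() {
    Counters c;
    c.created = 5;
    std::thread consumer([&c] {
        for (int i = 0; i < 5; ++i) c.consumed.fetch_add(1);
    });
    int backlog = wait_for_space(c, 3);
    consumer.join();
    return backlog;
}
```

The loop terminates at any optimization level because the loads cannot be elided. It still burns CPU while waiting, which is why blocking on a condition variable is usually preferable.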
Consumer Locks The Mutex while yield()
Have a look at this snippet:
void consumer() {
while (true) {
unique_lock<mutex> lk(queue_mutex);
[...]
this_thread::yield();
[...]
}
Basically, this means that the consumer calls yield() while holding the lock. Since only one consumer can hold the lock at a time (mutex stands for mutual exclusion), that explains why the other consumers cannot consume the work.
To fix this issue, we should unlock queue_mutex before the yield(), i.e.:
void consumer() {
while (true) {
{
unique_lock<mutex> lk(queue_mutex);
[...]
}
this_thread::yield();
[...]
}
Even this does not guarantee that the work is spread evenly; one thread may still do most of the tasks. When the producer calls notify_all(), all waiting threads are woken up, but only one of them gets the mutex. Since each piece of work is tiny, by the time the producer calls notify_all() again, the same thread has finished its work, done the yield(), and is ready for the next item.
So why does this thread lock the mutex rather than another one? I guess it is due to the CPU cache and busy waiting. The thread that just finished the work is "hot": it is in the CPU cache and ready to lock the mutex. Before going to sleep it might also busy-wait on the mutex for a few cycles, which increases its chances of winning even more.
To fix this, we can either remove the sleep in the producer (so it wakes up other threads more often, and they will be "hot" as well), or do a sleep() in the consumer instead of yield() (so this thread becomes "cold" during the sleep).
Anyway, there is no opportunity to do the work in parallel due to mutex, so the fact that same thread does most of the work is completely natural IMO.

Related

Standard way of implementing C++ multi-threading for collecting data streams and processing

I'm new to C++ development. I'm trying to run infinite functions that are independent of each other.
The problem statement is similar to this:
The way I'm trying to implement this is
#include <iostream>
#include <cstdlib>
#include <pthread.h>
#include <unistd.h>
#include <mutex>
int g_i = 0;
std::mutex g_i_mutex; // protects g_i
// increment g_i by 1
void increment_itr()
{
const std::lock_guard<std::mutex> lock(g_i_mutex);
g_i += 1;
}
void *fun(void *s)
{
std::string str;
str = (char *)s;
std::cout << str << " start\n";
while (1)
{
std::cout << str << " " << g_i << "\n";
if(g_i > 1000) break;
increment_itr();
}
pthread_exit(NULL);
std::cout << str << " end\n";
}
void *checker(void *s) {
while (1) {
if(g_i > 1000) {
std::cout<<"**********************\n";
std::cout << "checker: g_i == 100\n";
std::cout<<"**********************\n";
pthread_exit(NULL);
}
}
}
int main()
{
int itr = 0;
pthread_t threads[3];
pthread_attr_t attr;
void *status;
// Initialize and set thread joinable
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
int rc1 = pthread_create(&threads[0], &attr, fun, (void *)&"foo");
int rc2 = pthread_create(&threads[1], &attr, fun, (void *)&"bar");
int rc3 = pthread_create(&threads[2], &attr, checker, (void *)&"checker");
if (rc1 || rc2 || rc3)
{
std::cout << "Error:unable to create thread," << rc1 << rc2 << rc3 << std::endl;
exit(-1);
}
pthread_attr_destroy(&attr);
std::cout << "main func continues\n";
for (int i = 0; i < 3; i++)
{
rc1 = pthread_join(threads[i], &status);
if (rc1)
{
std::cout << "Error:unable to join," << rc1 << std::endl;
exit(-1);
}
std::cout << "Main: completed thread id :" << i;
std::cout << " exiting with status :" << status << std::endl;
}
std::cout << "main end\n";
return 0;
}
This works, but I want to know whether this implementation is a standard approach or whether it can be done in a better way.
You correctly take a lock inside increment_itr, but your fun function is accessing g_i without acquiring the lock.
Change this:
void increment_itr()
{
const std::lock_guard<std::mutex> lock(g_i_mutex);
g_i += 1;
}
To this
int increment_itr()
{
std::lock_guard<std::mutex> lock(g_i_mutex); // the const wasn't actually needed
g_i = g_i + 1;
return g_i; // return the updated value of g_i
}
This is not thread safe:
if(g_i > 1000) break; // access g_i without acquiring the lock
increment_itr();
This is better:
if (increment_itr() > 1000) {
break;
}
A similar fix is needed in checker:
void *checker(void *s) {
    while (1) {
        int i;
        {
            // read g_i under the lock, then release it before printing
            std::lock_guard<std::mutex> lock(g_i_mutex);
            i = g_i;
        }
        if(i > 1000) {
            std::cout<<"**********************\n";
            std::cout << "checker: g_i == 100\n";
            std::cout<<"**********************\n";
            break;
        }
    }
    return NULL;
}
As to your design question. Here's the fundamental issue.
You're proposing a dedicated thread that continuously takes a lock and does some sort of checking on a data structure. If a certain condition is met, it would do some additional processing, such as writing to a database. A thread spinning in an infinite loop is wasteful if nothing in the data structure (the two maps) has changed. Instead, you only want your integrity check to run when something changes. You can use a condition variable to have the checker thread pause until something actually changes.
Here's a better design.
uint64_t g_data_version = 0;
std::condition_variable g_cv; // declared in <condition_variable>
void *fun(void *s)
{
    while (true) {
        // ... wait for data from the source ...
        {
            std::lock_guard<std::mutex> lock(g_i_mutex);
            // update the data in the map while under a lock
            // e.g. g_n++;
            //
            // increment the data version to signal a new revision has been made
            g_data_version += 1;
        }
        // notify the checker thread that something has changed
        g_cv.notify_all();
    }
}
Then your checker function only wakes up when fun signals that something has changed.
void *checker(void *s) {
    while (1) {
        // lock the mutex
        std::unique_lock<std::mutex> lock(g_i_mutex);
        // do the data comparison check here
        // now wait for the data version to change
        uint64_t version = g_data_version;
        while (version == g_data_version) { // loop again on a spurious wake-up
            g_cv.wait(lock); // this atomically unlocks the mutex and waits for a notify call from another thread
        }
    }
}
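The design above can be sketched in a self-contained form with std::thread instead of pthreads. This is only a sketch under assumptions: SharedState, writer, checker, and run_version_demo are hypothetical names, a plain int stands in for the maps, and a done flag is added so that the example terminates.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical shared state (an int stands in for the real data structure).
struct SharedState {
    std::mutex m;
    std::condition_variable cv;
    std::uint64_t version = 0;  // bumped on every update
    int value = 0;
    bool done = false;          // assumption: added so the demo can stop
};

// Applies the given number of updates, notifying the checker after each one.
void writer(SharedState& s, int updates) {
    for (int i = 0; i < updates; ++i) {
        {
            std::lock_guard<std::mutex> lk(s.m);
            ++s.value;
            ++s.version;
        }
        s.cv.notify_all();
    }
    {
        std::lock_guard<std::mutex> lk(s.m);
        s.done = true;
    }
    s.cv.notify_all();
}

// Sleeps until the version changes (or the writer is done), then inspects
// the data. Returns the last value it observed.
int checker(SharedState& s) {
    std::unique_lock<std::mutex> lk(s.m);
    std::uint64_t seen = 0;
    int last = 0;
    while (true) {
        // The predicate form of wait() handles spurious wake-ups.
        s.cv.wait(lk, [&] { return s.done || s.version != seen; });
        seen = s.version;
        last = s.value;  // the "integrity check" would go here
        if (s.done) break;
    }
    return last;
}

// Runs one writer and one checker; returns the final value the checker saw.
int run_version_demo(int updates) {
    SharedState s;
    int last = 0;
    std::thread c([&] { last = checker(s); });
    std::thread w([&] { writer(s, updates); });
    w.join();
    c.join();
    return last;
}
```

The checker may skip intermediate versions if the writer is faster, which is usually acceptable for a "something changed" check. It always sees the final state because done is set only after the last update.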

Multithreading: synchronize threads to perform several steps, race condition

I want to create 15 threads and have them perform 4 successive steps (which I call Init, Process, Terminate and WriteOutputs).
For each step, I want all threads to finish it before any thread moves on to the next step.
I am trying to implement this (see the code below) using a std::condition_variable and its wait() and notify_all() methods, but I cannot get it to work,
and, even worse, I have a race condition.
When counting the number of operations done (which should be 15*4 = 60), some of the messages are sometimes not printed, and m_counter in my class ends up below 60, which should not happen.
I use two std::mutex objects: one for printing messages and another for the step synchronization.
Could someone explain the problem to me?
What would be a solution?
Many thanks in advance
#include<iostream>
#include<thread>
#include<mutex>
#include<condition_variable>
#include<vector>
#include<functional>
class MTHandler
{
public:
MTHandler(){
// 15 threads
std::function<void(int)> funcThread = std::bind(&MTHandler::ThreadFunction, this, std::placeholders::_1);
for (int i=0; i<15; i++){
m_vectThreads.push_back(std::thread(funcThread,i));
}
for (std::thread & th : m_vectThreads) {
th.join();
}
std::cout << "m_counter = " << m_counter << std::endl;
}
private:
enum class ManagerStep{
Init,
Process,
Terminate,
WriteOutputs,
};
std::vector<ManagerStep> m_vectSteps = {
ManagerStep::Init,
ManagerStep::Process,
ManagerStep::Terminate,
ManagerStep::WriteOutputs
};
unsigned int m_iCurrentStep = 0 ;
unsigned int m_counter = 0;
std::mutex m_mutex;
std::mutex m_mutexStep;
std::condition_variable m_condVar;
bool m_finishedAllSteps = false;
unsigned int m_nThreadsFinishedStep = 0;
std::vector<std::thread> m_vectThreads = {};
void ThreadFunction (int id) {
while(!m_finishedAllSteps){
m_mutex.lock();
m_counter+=1;
m_mutex.unlock();
switch (m_vectSteps[m_iCurrentStep])
{
case ManagerStep::Init:{
m_mutex.lock();
std::cout << "thread " << id << " --> Init step" << "\n";
m_mutex.unlock();
break;
}
case ManagerStep::Process:{
m_mutex.lock();
std::cout << "thread " << id << " --> Process step" << "\n";
m_mutex.unlock();
break;
}
case ManagerStep::Terminate:{
m_mutex.lock();
std::cout << "thread " << id << " --> Terminate step" << "\n";
m_mutex.unlock();
break;
}
case ManagerStep::WriteOutputs:{
m_mutex.lock();
std::cout << "thread " << id << " --> WriteOutputs step" << "\n";
m_mutex.unlock();
break;
}
default:
{
break;
}
}
unsigned int iCurrentStep = m_iCurrentStep;
bool isCurrentStepFinished = getIsFinishedStatus();
if (!isCurrentStepFinished){
// wait for other threads to finish current step
std::unique_lock<std::mutex> lck(m_mutexStep);
m_condVar.wait(lck, [iCurrentStep,this]{return iCurrentStep != m_iCurrentStep;});
}
}
}
bool getIsFinishedStatus(){
m_mutexStep.lock();
bool isCurrentStepFinished = false;
m_nThreadsFinishedStep +=1;
if (m_nThreadsFinishedStep == m_vectThreads.size()){
// all threads have completed the current step
// pass to the next step
m_iCurrentStep += 1;
m_nThreadsFinishedStep = 0;
m_finishedAllSteps = (m_iCurrentStep == m_vectSteps.size());
isCurrentStepFinished = true;
}
if (isCurrentStepFinished){m_condVar.notify_all();}
m_mutexStep.unlock();
return isCurrentStepFinished;
}
};
int main ()
{
MTHandler mt;
return 0;
}
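One common way to get the "all threads finish a step before any thread starts the next" behaviour described in this question is a reusable barrier. The following is a minimal sketch under that assumption, not a diagnosis of the code above: StepBarrier and run_steps are hypothetical names, and an atomic counter stands in for the per-step work. (C++20 also provides std::barrier in <barrier>.)

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical reusable barrier built from a mutex and a condition variable.
class StepBarrier {
public:
    explicit StepBarrier(unsigned participants) : participants_(participants) {}

    // Blocks until all participants have arrived, then releases them together.
    void arrive_and_wait() {
        std::unique_lock<std::mutex> lk(m_);
        const unsigned my_generation = generation_;
        if (++arrived_ == participants_) {
            // Last thread to arrive: reset for the next step and wake everyone.
            arrived_ = 0;
            ++generation_;
            cv_.notify_all();
        } else {
            // The generation counter guards against spurious wake-ups and
            // against confusing this step with the next one.
            cv_.wait(lk, [&] { return generation_ != my_generation; });
        }
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    unsigned participants_;
    unsigned arrived_ = 0;
    unsigned generation_ = 0;
};

// Runs the given number of threads through the given number of steps and
// returns how many step executions took place in total.
int run_steps(unsigned threads, unsigned steps) {
    StepBarrier barrier(threads);
    std::atomic<int> counter{0};
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (unsigned step = 0; step < steps; ++step) {
                counter.fetch_add(1);       // stand-in for Init/Process/...
                barrier.arrive_and_wait();  // nobody starts the next step early
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter.load();
}
```

With 15 threads and 4 steps, run_steps(15, 4) counts 60 step executions. Each thread keeps its own step index, so no shared "current step" variable is read outside a lock.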

how to implement wait() for producer/consumer using C++17 (mutex or shared_mutex)

I have a set of producers generating data, each with its own vector. A consumer, in this MWE the main function, should wait for one of the producers to create something, then wake up, process all queues, and wait again.
With a mutex, the critical section is easy to create, but how do I wait for any one of the producers to signal that data is available and wake up main? The producers do not need to block each other, since each could add to its own queue, but each producer must block the consumer, and the consumer must block all the producers.
Here is my MWE, which is only missing the equivalent of Java's obj.wait() and obj.notify().
#include <thread>
#include <mutex>
#include <iostream>
#include <vector>
#include <unistd.h>
using namespace std;
mutex m;
vector<int> event1;
int count1 = 0;
void producer1() {
while(true) {
m.lock(); // enter critical section
cout << "producer1: " << count1 << '\n';
event1.push_back(count1++);
// TODO: wakeup main consumer loop
// in Java this would be obj.notify();
m.unlock();
sleep(1);
}
}
vector<int> event2;
int count2 = 0;
void producer2() {
while(true) {
m.lock(); // enter critical section
cout << "producer2: " << count2 << '\n';
event2.push_back(count2++);
// TODO: wakeup main consumer loop
// in Java this would be obj.notify();
m.unlock(); // leave critical section
sleep(2);
}
}
int main() {
thread t1(producer1);
thread t2(producer2);
while (true) {
// TODO: obj.wait();
// inside the critical section, get all the data from each producer
m.lock();
for (int i = 0; i < event1.size(); i++)
cout << "consumer1: " << event1[i] << '\n';
for (int i = 0; i < event2.size(); i++)
cout << "consumer2: " << event2[i] << '\n';
event1.clear();
event2.clear();
m.unlock();
sleep(10);
}
}
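In C++, the closest counterpart to Java's obj.wait() and obj.notify() is std::condition_variable used together with a mutex. The sketch below shows one way the MWE's TODOs could be filled in. It is an illustration under assumptions: run_wait_notify_demo and producers_running are hypothetical names, the producers emit a fixed number of items without sleeping so the example terminates, and a single mutex is kept as in the MWE.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Two producers fill their own vectors; the calling thread acts as the
// consumer and sleeps until at least one producer has signalled new data.
// Returns the total number of items consumed.
int run_wait_notify_demo(int items_per_producer) {
    std::mutex m;
    std::condition_variable cv;
    std::vector<int> event1, event2;
    int producers_running = 2;  // assumption: lets the consumer know when to stop

    auto producer = [&](std::vector<int>& events) {
        for (int i = 0; i < items_per_producer; ++i) {
            {
                std::lock_guard<std::mutex> lk(m);  // enter critical section
                events.push_back(i);
            }                                       // leave critical section
            cv.notify_one();                        // Java: obj.notify()
        }
        {
            std::lock_guard<std::mutex> lk(m);
            --producers_running;
        }
        cv.notify_one();
    };

    std::thread t1([&] { producer(event1); });
    std::thread t2([&] { producer(event2); });

    int consumed = 0;
    while (true) {
        std::vector<int> batch1, batch2;
        bool finished = false;
        {
            std::unique_lock<std::mutex> lk(m);
            // Java: obj.wait(), with the condition re-checked on every wake-up.
            cv.wait(lk, [&] {
                return !event1.empty() || !event2.empty() || producers_running == 0;
            });
            batch1.swap(event1);  // take all pending data in one go
            batch2.swap(event2);
            finished = (producers_running == 0);
        }
        // Process outside the critical section so producers are not blocked.
        consumed += static_cast<int>(batch1.size() + batch2.size());
        if (finished) break;
    }
    t1.join();
    t2.join();
    return consumed;
}
```

Swapping the vectors out under the lock keeps the critical section short, so the producers are only blocked for the duration of the swap rather than for the whole processing loop.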

Synchronize threads using mutex

I'm trying to understand C++ multithreading and synchronization between many threads.
So I created 2 threads: the first one increments a value and the second one decrements it. What I can't understand is why the resulting value after execution is different from the initial one, since I added to and subtracted from the same value.
static unsigned int counter = 100;
static bool alive = true;
static Lock lock;
std::mutex mutex;
void add() {
while (alive)
{
mutex.lock();
counter += 10;
std::cout << "Counter Add = " << counter << std::endl;
mutex.unlock();
}
}
void sub() {
while (alive)
{
mutex.lock();
counter -= 10;
std::cout << "Counter Sub = " << counter<< std::endl;
mutex.unlock();
}
}
int main()
{
std::cout << "critical section value at the start " << counter << std::endl;
std::thread tAdd(add);
std::thread tSub(sub);
Sleep(1000);
alive = false;
tAdd.join();
tSub.join();
std::cout << "critical section value at the end " << counter << std::endl;
return 0;
}
Output
critical section value at the start 100
critical section value at the end 220
So what I need is a way to keep my value as it is, i.e. counter equal to 100, using those two threads.
The problem is that both threads run an "infinite" loop for 1 second and get greedy with the mutex. Do a print in both functions and see which thread gets the lock more often.
Mutexes are used to synchronize access to resources so that threads will not read or write incomplete or corrupted data; they do not create a neat sequence.
If you want the value to be 100 at the end of execution, you need to use a semaphore so that access to the variable follows an ordered sequence.
I think what you want is to signal to the subtracting thread that you have just successfully added in the add thread, and vice versa. You'll additionally have to communicate which thread is next. A naive solution:
std::atomic<bool> shouldAdd{true}; // needs <atomic>; atomic because it is read outside the lock
void add() {
    while( alive ) {
        if( shouldAdd ) {
            // prefer lock guards over lock() and unlock() for exception safety
            std::lock_guard<std::mutex> lock{mutex};
            counter += 10;
            std::cout << "Counter Add = " << counter << std::endl;
            shouldAdd = false; // hand the turn to sub()
        }
    }
}
void sub() {
    while( alive ) {
        if( !shouldAdd ) {
            std::lock_guard<std::mutex> lock{mutex};
            counter -= 10;
            std::cout << "Counter Sub = " << counter << std::endl;
            shouldAdd = true; // hand the turn back to add()
        }
    }
}
Now add() will busy-wait for sub() to do its job before it tries to acquire the lock again.
To prevent busy waiting, you might choose a condition variable instead of trying to use only a single mutex. You can wait() on the condition variable before you add or subtract, and notify the waiting thread afterwards.
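The condition-variable approach can be sketched as follows. This is an illustration under assumptions: run_alternating_demo and should_add are hypothetical names, and a fixed number of rounds replaces the alive flag and the one-second sleep so that both threads perform the same number of operations.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Two threads strictly alternate: add 10, subtract 10, add 10, ...
// Returns the counter after both threads have completed all rounds.
int run_alternating_demo(int rounds) {
    std::mutex m;
    std::condition_variable cv;
    int counter = 100;
    bool should_add = true;  // whose turn it is; protected by m

    std::thread adder([&] {
        for (int i = 0; i < rounds; ++i) {
            std::unique_lock<std::mutex> lk(m);
            cv.wait(lk, [&] { return should_add; });  // sleep until it is our turn
            counter += 10;
            should_add = false;
            lk.unlock();
            cv.notify_one();                          // wake the subtracting thread
        }
    });

    std::thread subber([&] {
        for (int i = 0; i < rounds; ++i) {
            std::unique_lock<std::mutex> lk(m);
            cv.wait(lk, [&] { return !should_add; });
            counter -= 10;
            should_add = true;
            lk.unlock();
            cv.notify_one();                          // wake the adding thread
        }
    });

    adder.join();
    subber.join();
    return counter;
}
```

Because every addition is paired with exactly one subtraction, the counter ends where it started, and neither thread spins while waiting for its turn.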

Still having race condition with boost::mutex

I am working through an example that causes a race condition, so that I can apply a mutex to it. However, even with the mutex, the race still happens. What's wrong? Here is my code:
#include <iostream>
#include <boost/thread.hpp>
#include <vector>
using namespace std;
class Soldier
{
private:
boost::thread m_Thread;
public:
static int count , moneySpent;
static boost::mutex soldierMutex;
Soldier(){}
void start(int cost)
{
m_Thread = boost::thread(&Soldier::process, this,cost);
}
void process(int cost)
{
{
boost::mutex::scoped_lock lock(soldierMutex);
//soldierMutex.lock();
int tmp = count;
++tmp;
count = tmp;
tmp = moneySpent;
tmp += cost;
moneySpent = tmp;
// soldierMutex.unlock();
}
}
void join()
{
m_Thread.join();
}
};
int Soldier::count, Soldier::moneySpent;
boost::mutex Soldier::soldierMutex;
int main()
{
Soldier s1,s2,s3;
s1.start(20);
s2.start(30);
s3.start(40);
s1.join();
s2.join();
s3.join();
for (int i = 0; i < 100; ++i)
{
Soldier s;
s.start(30);
}
cout << "Total soldier: " << Soldier::count << '\n';
cout << "Money spent: " << Soldier::moneySpent << '\n';
}
It looks like you're not waiting for the threads started in the loop to finish. Change the loop to:
for (int i = 0; i < 100; ++i)
{
Soldier s;
s.start(30);
s.join();
}
edit to explain further
The problem you saw was that the values printed out were wrong, so you assumed there was a race condition in the threads. In fact, the race was in printing the values: they were printed before all the threads had had a chance to execute.
Based on this and your previous post (where it does not seem you have read all the answers yet), what you are looking for is some form of synchronization point to prevent the main() thread from exiting the application (because when the main thread exits the application, all the child threads die).
This is why you call join() all the time: to prevent the main() thread from exiting until the thread has exited. As a result of your usage, though, your loop of threads is not parallel; each thread runs in sequence to completion (so there is no real point in using threads).
Note: join(), like in Java, waits for the thread to complete. It does not start the thread.
A quick look at the Boost documentation suggests that what you are looking for is a thread group, which allows you to wait for all threads in the group to complete before exiting.
// No compiler so this is untested.
// But it should look something like this.
// Note 2: I have not used boost::threads much.
int main()
{
    boost::thread_group group;
    for(int loop = 0; loop < 100; ++loop)
    {
        // Create a thread with the function to make it start and add it to the group.
        // The group takes ownership of the thread and deletes it when the group is destroyed.
        group.add_thread(new boost::thread(<Function To Call>));
    }
    // Make sure main does not exit before all the threads have completed.
    group.join_all();
}
If we go back to your example and retrofit your Soldier class:
int main()
{
    boost::thread_group batallion;
    // Make all the soldiers part of a group.
    // When you start the thread make the thread join the group.
    // (This assumes Soldier is given a constructor taking the group; not shown.)
    Soldier s1(batallion);
    Soldier s2(batallion);
    Soldier s3(batallion);
    s1.start(20);
    s2.start(30);
    s3.start(40);
    // Create 100 soldiers outside the loop
    std::vector<Soldier> lotsOfSoldiers;
    lotsOfSoldiers.reserve(100); // to prevent reallocation in the loop.
    // Because you are using objects we need to
    // prevent copying of them after the thread starts.
    for (int i = 0; i < 100; ++i)
    {
        lotsOfSoldiers.push_back(Soldier(batallion));
        lotsOfSoldiers.back().start(30);
    }
    // Print out values while threads are still running
    // Note you may get here before any thread.
    cout << "Total soldier: " << Soldier::count << '\n';
    cout << "Money spent: " << Soldier::moneySpent << '\n';
    batallion.join_all();
    // Print out values when all threads are finished.
    cout << "Total soldier: " << Soldier::count << '\n';
    cout << "Money spent: " << Soldier::moneySpent << '\n';
}
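The same "start everything, then join everything, then print" idea can also be sketched with the standard library instead of Boost, swapping boost::thread_group for a std::vector of std::thread. Totals and recruit are hypothetical names used only for this illustration; they are not part of the Soldier class above.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical result type standing in for Soldier::count and Soldier::moneySpent.
struct Totals {
    int count;
    int money;
};

// Starts one thread per soldier, lets them all run concurrently, and only
// reads the totals after every thread has been joined.
Totals recruit(int soldiers, int cost) {
    std::mutex m;
    Totals totals{0, 0};
    std::vector<std::thread> threads;
    threads.reserve(soldiers);

    for (int i = 0; i < soldiers; ++i) {
        threads.emplace_back([&m, &totals, cost] {
            std::lock_guard<std::mutex> lock(m);  // protects both fields
            ++totals.count;
            totals.money += cost;
        });
    }

    // The synchronization point: wait for every thread before reading totals.
    for (auto& t : threads) t.join();
    return totals;
}
```

Unlike calling join() inside the creation loop, this keeps the threads running in parallel and still guarantees that the totals are complete when they are read.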