C++ Producer Consumer Problem with condition variable + mutex + pthreads

I need to solve the producer-consumer problem in C++, once for 1 consumer and 6 producers and once for 1 producer and 6 consumers. The statement of the question is below.
Question 1:
Imagine that you are waiting for some friends in a very busy restaurant and you are watching the staff, who wait on tables, bring food from the kitchen to their tables. This is an example of the classic "Producer-Consumer" problem. There is a limit on servers and meals are constantly produced by the kitchen. Consider then that there is a limit on servers (consumers) and an "unlimited" supply of meals being produced by chefs (producers).
One approach to facilitate identification and thus reduce to a "producer-consumer" problem is to limit the number of consumers and thus limit the infinite number of meals produced in the kitchen. Thus, a semaphore is suggested to control the production order of the meals that will be taken by the attendants.
The procedure would be something like:
Create a semaphore;
Create the server and chef threads;
Produce as many meals as you can and keep a record of how many meals are in the queue;
The server thread will run until it manages to deliver all the meals produced in the tables; and
Threads must be "joined" with the main thread.
Also consider that there are 6 chefs and 1 attendant. If you want, you can consider that a chef takes 50 microseconds to produce a meal and the server takes 10 microseconds to deliver the meal on the table. Set a maximum number of customers to serve. Print on the screen, at the end of the execution, which chef is most and least idle and how many meals each chef has produced.
Question 2:
Considering the restaurant described above. Now consider that there are 1 chef and 6 attendants. Assume that a chef takes 50 microseconds to produce a meal and the server takes 15 microseconds to deliver the meal to the table. Set a maximum number of customers to serve.
Print which server is the most and least idle and how many meals each server has delivered.
I managed to solve the case with 6 producers and 1 consumer, but the case with 6 consumers and 1 producer is not working: the program seems to get stuck in a deadlock. I'd be grateful if anyone could help.
#include <iostream>
#include <random>
#include <chrono>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <functional>
#include <deque>
#include <vector>
//The mutex class is a synchronization primitive that can be used to protect shared data
//from being simultaneously accessed by multiple threads.
std::mutex semaforo; //mutex that controls access to the buffer
std::condition_variable notifica; //condition variable that signals a dish was produced/consumed
std::deque<int> buffer; //buffer of integers
const unsigned int capacidade_buffer = 10; //buffer capacity
const unsigned int numero_pratos = 25; //number of dishes to be produced
void produtor()
{
unsigned static int contador_pratos_produzidos = 0;
while (contador_pratos_produzidos < numero_pratos)
{
std::unique_lock<std::mutex> locker(semaforo);
notifica.wait(locker, []
{ return buffer.size() < capacidade_buffer; });
std::this_thread::sleep_for(std::chrono::microseconds(50));
buffer.push_back(contador_pratos_produzidos);
if (contador_pratos_produzidos < numero_pratos)
{
contador_pratos_produzidos++;
}
locker.unlock();
notifica.notify_all();
}
}
void consumidor(int ID, std::vector<int> &consumido)
{
unsigned static int contador_pratos_consumidos = 0;
while (contador_pratos_consumidos < numero_pratos)
{
std::unique_lock<std::mutex> locker(semaforo);
notifica.wait(locker, []
{ return buffer.size() > 0; });
std::this_thread::sleep_for(std::chrono::microseconds(15));
buffer.pop_front();
if (contador_pratos_consumidos < numero_pratos)
{
contador_pratos_consumidos++;
consumido[ID]++;
}
locker.unlock();
notifica.notify_one();
}
}
int main()
{
//vector counting how many dishes each waiter delivered
std::vector<int> consumido(6, 0);
//vector of waiter (consumer) threads
std::vector<std::thread> consumidores;
for (int k = 0; k < 6; k++)
{
consumidores.push_back(std::thread(consumidor, k, std::ref(consumido)));
}
//producer/chef
std::thread p1(produtor);
for (auto &k : consumidores)
{
k.join();
}
p1.join();
int mais_ocioso = 200, menos_ocioso = 0, mais, menos;
for (int k = 0; k < 6; k++)
{
std::cout << "Waiter " << k + 1 << " delivered " << consumido[k] << " dishes\n";
if (consumido[k] > menos_ocioso)
{
menos = k + 1;
menos_ocioso = consumido[k];
}
if (consumido[k] < mais_ocioso)
{
mais = k + 1;
mais_ocioso = consumido[k];
}
}
std::cout << "\nThe most idle was waiter " << mais << " and the least idle was waiter " << menos << "\n";
}

The same exact bug exists in both the consumer and the producer function. I'll explain one of them, and the same bug must also be fixed in the other one.
unsigned static int contador_pratos_consumidos = 0;
while (contador_pratos_consumidos < numero_pratos)
{
This static counter gets accessed and modified by multiple execution threads.
Any non-atomic object that's used by multiple execution threads must be properly sequenced (accessed only when holding an appropriate mutex).
If you focus your attention on the above two lines it should be obvious that this counter is accessed without the protection of any mutex. Once you realize that, the bug is obvious: at some point contador_pratos_consumidos will be exactly one less than numero_pratos. When that happens you can have multiple execution threads evaluating the while condition, at the same time, and all of them will happily conclude that it's true.
Multiple execution threads then enter the while loop. One will succeed in acquiring the mutex and consuming the "product", and finish. The remaining execution threads will wait forever, for another "product" that will never arrive. No more products will ever be produced. No soup for them.
The same bug also exists in the producer, except that the effects of the bug will be rather subtle: more products will end up being produced than there should be.
Of course, pedantically all of this is undefined behavior, so anything can really happen, but these are the typical, usual consequences of this kind of undefined behavior. Both bugs must be fixed in order for this algorithm to work correctly.
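One possible way to apply the fix described above is sketched here. It is not the asker's code: the names Restaurant, chef, waiter and run_restaurant are hypothetical, the counters live next to the buffer instead of being function-local statics, and the 50/15 microsecond sleeps are left out for brevity. The key point is that each thread decides whether to stop only while holding the mutex, and the wait predicate also becomes true once the quota is reached, so no thread is left waiting for a meal that will never arrive.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

constexpr std::size_t kCapacity = 10;  // buffer capacity, as in the question
constexpr unsigned kMeals = 25;        // total meals to produce and deliver

// Hypothetical container for all shared state; everything is guarded by m.
struct Restaurant {
    std::mutex m;
    std::condition_variable cv;
    std::deque<unsigned> buffer;
    unsigned produced = 0;
    unsigned consumed = 0;
};

void chef(Restaurant& r) {
    for (;;) {
        std::unique_lock<std::mutex> lock(r.m);
        // Wake up when there is room, or when the quota has already been met.
        r.cv.wait(lock, [&] { return r.produced == kMeals || r.buffer.size() < kCapacity; });
        if (r.produced == kMeals) return;  // checked while holding the mutex
        r.buffer.push_back(r.produced++);
        lock.unlock();
        r.cv.notify_all();
    }
}

void waiter(Restaurant& r, unsigned& delivered) {
    for (;;) {
        std::unique_lock<std::mutex> lock(r.m);
        // Wake up when a meal is ready, or when every meal has been delivered.
        r.cv.wait(lock, [&] { return r.consumed == kMeals || !r.buffer.empty(); });
        if (r.consumed == kMeals) return;  // nothing left to wait for
        r.buffer.pop_front();
        ++r.consumed;
        ++delivered;  // each waiter owns its own counter
        lock.unlock();
        r.cv.notify_all();  // wakes chefs and any waiters that must now exit
    }
}

// Runs the simulation and returns how many meals each waiter delivered.
std::vector<unsigned> run_restaurant(int chefs, int waiters) {
    Restaurant r;
    std::vector<unsigned> delivered(static_cast<std::size_t>(waiters), 0u);
    std::vector<std::thread> threads;
    for (int i = 0; i < waiters; ++i)
        threads.emplace_back(waiter, std::ref(r), std::ref(delivered[i]));
    for (int i = 0; i < chefs; ++i)
        threads.emplace_back(chef, std::ref(r));
    for (auto& t : threads) t.join();
    return delivered;
}
```

Calling run_restaurant(1, 6) corresponds to Question 2 and run_restaurant(6, 1) to Question 1. In both cases the per-waiter counts should add up to exactly 25, although how the meals are split between waiters depends on scheduling.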

Related

recursive threading with C++ gives a Resource temporarily unavailable

So I'm trying to create a program that implements a function that generates a random number (n) and, based on n, creates n threads. The main thread is responsible for printing the minimum and maximum of the leaves. The depth of the hierarchy, including the main thread, is 3.
I have written the code below:
#include <iostream>
#include <thread>
#include <time.h>
#include <string>
#include <sstream>
using namespace std;
// a structure to keep the needed information of each thread
struct ThreadInfo
{
long randomN;
int level;
bool run;
int maxOfVals;
double minOfVals;
};
// The start address (function) of the threads
void ChildWork(void* a) {
ThreadInfo* info = (ThreadInfo*)a;
// Generate random value n
srand(time(NULL));
double n=rand()%6+1;
// initialize the thread info with n value
info->randomN=n;
info->maxOfVals=n;
info->minOfVals=n;
// the depth of recursion should not be more than 3
if(info->level > 3)
{
info->run = false;
}
// Create n threads and run them
ThreadInfo* childInfo = new ThreadInfo[(int)n];
for(int i = 0; i < n; i++)
{
childInfo[i].level = info->level + 1;
childInfo[i].run = true;
std::thread tt(ChildWork, &childInfo[i]) ;
tt.detach();
}
// checks if any child threads are working
bool anyRun = true;
while(anyRun)
{
anyRun = false;
for(int i = 0; i < n; i++)
{
anyRun = anyRun || childInfo[i].run;
}
}
// once all child threads are done, we find their max and min value
double maximum=1, minimum=6;
for( int i=0;i<n;i++)
{
// cout<<childInfo[i].maxOfVals<<endl;
if(childInfo[i].maxOfVals>=maximum)
maximum=childInfo[i].maxOfVals;
if(childInfo[i].minOfVals< minimum)
minimum=childInfo[i].minOfVals;
}
info->maxOfVals=maximum;
info->minOfVals=minimum;
// we set the info->run value to false, so that the parrent thread of this thread will know that it is done
info->run = false;
}
int main()
{
ThreadInfo info;
srand(time(NULL));
double n=rand()%6+1;
cout<<"n is: "<<n<<endl;
// initializing thread info
info.randomN=n;
info.maxOfVals=n;
info.minOfVals=n;
info.level = 1;
info.run = true;
std::thread t(ChildWork, &info) ;
t.join();
while(info.run);
info.maxOfVals= max<unsigned long>(info.randomN,info.maxOfVals);
info.minOfVals= min<unsigned long>(info.randomN,info.minOfVals);
cout << "Max is: " << info.maxOfVals <<" and Min is: "<<info.minOfVals;
}
The code compiles with no error, but when I execute it, it gives me this :
libc++abi.dylib: terminating with uncaught exception of type
std::__1::system_error: thread constructor failed: Resource
temporarily unavailable Abort trap: 6
You spawn too many threads. It looks a bit like a fork() bomb. Threads are a very heavy-weight system resource. Use them sparingly.
Within the function void ChildWork I see two mistakes:
As someone already pointed out in the comments, you check the info level of a thread and then you go and create some more threads regardless of the previous check.
Within the for loop that spawns your new threads, you increment the info level right before you spawn the actual thread. However you increment a freshly created instance of ThreadInfo here ThreadInfo* childInfo = new ThreadInfo[(int)n]. All instances within childInfo hold a level of 0. Basically the level of each thread you spawn is 1.
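To illustrate the first point, here is a minimal sketch of a depth-limited version that returns before spawning anything once the maximum depth is reached, and joins its children instead of detaching them and polling a run flag. The names MinMax and explore are hypothetical, the seed parameter replaces srand(time(NULL)) so that sibling threads do not all draw the same number, and each node's own value is included in the min/max, which is an assumption about the intended result.

```cpp
#include <algorithm>
#include <random>
#include <thread>
#include <vector>

struct MinMax {
    int min;
    int max;
};

// Each node draws n in [1, 6]. Nodes above max_depth spawn n children,
// join them, and fold the children's results into their own.
MinMax explore(int level, int max_depth, unsigned seed) {
    std::mt19937 rng(seed);
    int n = std::uniform_int_distribution<int>(1, 6)(rng);
    MinMax result{n, n};
    if (level >= max_depth) return result;  // leaf: stop BEFORE spawning

    std::vector<MinMax> child(n);
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) {
        unsigned child_seed = static_cast<unsigned>(rng());
        // Each thread writes only its own slot, so no extra locking is needed.
        threads.emplace_back([&child, i, level, max_depth, child_seed] {
            child[i] = explore(level + 1, max_depth, child_seed);
        });
    }
    for (auto& t : threads) t.join();  // join replaces the busy-wait on run

    for (const MinMax& c : child) {
        result.min = std::min(result.min, c.min);
        result.max = std::max(result.max, c.max);
    }
    return result;
}
```

With max_depth set to 3 the tree has at most 1 + 6 + 36 nodes, so the number of threads stays bounded. A call such as explore(1, 3, std::random_device{}()) from main would start the hierarchy.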
In general avoid using threads to achieve concurrency for I/O bound operations (*). Just use threads to achieve concurrency for independent CPU bound operations. As a rule of thumb you never need more threads than you have CPU cores in your system (**). Having more does not improve concurrency and does not improve performance.
(*) You should always use direct function calls and an event based system to run pseudo concurrent I/O operations. You do not need any threading to do so. For example a TCP server does not need any threads to serve thousands of clients.
(**) This is the ideal case. In practice your software is composed of multiple parts, developed by independent developers and maintained in different modes, so it is ok to have some threads which could be theoretically avoided.
Multithreading is still rocket science in 2019. Especially in C++. Do not do it unless you know exactly what you are doing. Here is a good series of blog posts that handle threads.

Progress bar in Windows activity field?

I've written a C++ program that performs time-consuming calculations, and I want the user to be able to see the progress while the program is running in the background (minimized).
I'd like to use the same effect that Chrome uses when downloading a file.
How do I access this feature? Can I use it in my C++ program?
If the time consuming operation can be performed inside a loop, and depending on whether or not it is a count controlled loop, you may be able to use thread and atomic to solve your problem.
If your processor architecture supports multithreading, you can use threads to run calculations concurrently. The basic use of a thread is to run a function in parallel with the main thread. These operations may effectively be done at the same time, meaning you would be able to use the main thread to check the progress of your time-consuming calculations. With parallel threads comes the problem of data races: if two threads access the same data at the same time and at least one of them writes it, the behavior is undefined. This can be solved with atomic. You could use an atomic_int so that the worker's updates and the main thread's reads of the counter never form a data race.
A viable example:
#include <thread>
#include <mutex>
#include <atomic>
#include <iostream>
//function prototypes
void foo(std::mutex * mtx, std::atomic_int * i);
//main function
int main() {
//first define your variables
std::thread bar;
std::mutex mtx;
std::atomic_int value;
//store initial value just in case
value.store(0);
//create the thread and assign it a task by passing a function and any parameters of the function as parameters of thread
std::thread functionalThread;
functionalThread = std::thread(foo/*function name*/, &mtx, &value/*parameters of the function*/);
//a loop to keep checking value to see if it has reached its final value
//temp variable to hold value so that operations can be performed on it while the main thread does other things
int temp = value.load();
//double to hold percent value
double percent;
while (temp < 1000000000) {
//calculate percent value
percent = 100.0 * double(temp) / 1000000000.0;
//display percent value
std::cout << "The current percent is: " << percent << "%" << std::endl;
//get new value for temp
temp = value.load();
}
//display message when calculations complete
std::cout << "Task is done." << std::endl;
//when you join a thread you are essentially waiting for the thread to finish before the calling thread continues
functionalThread.join();
//cin to hold program from completing to view results
int wait;
std::cin >> wait;
//end program
return 0;
}
void foo(std::mutex * mtx, std::atomic_int * i) {
//function counts to 1,000,000,000 as fast as it can
for (i->store(0); i->load() < 1000000000; i->store(i->load() + 1)) {
//keep i counting
//the first part is the initial value, store() sets the value of the atomic int
//the second part is the exit condition, load() returns the currently stored value of the atomic
//the third part is the increment
}
}

TBB task_arena & task_group usage for scaling parallel_for work

I am trying to use the Threading Building Blocks task_arena. There is a simple array full of '0'. The arena's threads put '1' in the odd places of the array. The main thread puts '2' in the even places.
/* Odd-even arenas tbb test */
#include <tbb/parallel_for.h>
#include <tbb/blocked_range.h>
#include <tbb/task_arena.h>
#include <tbb/task_group.h>
#include <iostream>
using namespace std;
const int SIZE = 100;
int main()
{
tbb::task_arena limited(1); // no more than 1 thread in this arena
tbb::task_group tg;
int myArray[SIZE] = {0};
//! Main thread create another thread, then immediately returns
limited.enqueue([&]{
//! Created thread continues here
tg.run([&]{
tbb::parallel_for(tbb::blocked_range<int>(0, SIZE),
[&](const tbb::blocked_range<int> &r)
{
for(int i = 0; i != SIZE; i++)
if(i % 2 == 0)
myArray[i] = 1;
}
);
});
});
//! Main thread do this work
tbb::parallel_for(tbb::blocked_range<int>(0, SIZE),
[&](const tbb::blocked_range<int> &r)
{
for(int i = 0; i != SIZE; i++)
if(i % 2 != 0)
myArray[i] = 2;
}
);
//! Main thread waiting for 'tg' group
//** it does not create any threads here (doesn't it?) */
limited.execute([&]{
tg.wait();
});
for(int i = 0; i < SIZE; i++) {
cout << myArray[i] << " ";
}
cout << endl;
return 0;
}
The output is:
0 2 0 2 ... 0 2
So the limited.enqueue{tg.run{...}} block doesn't work.
What's the problem? Any ideas? Thank you.
You have created limited arena for one thread only, and by default this slot is reserved for the master thread. Though, enqueuing into such a serializing arena will temporarily boost its concurrency level to 2 (in order to satisfy 'fire-and-forget' promise of the enqueue), enqueue() does not guarantee synchronous execution of the submitted task. So, tg.wait() can start before tg.run() executes and thus the program will not wait when the worker thread is created, joins the limited arena, and fills the array with '1' (BTW, the whole array is filled in each of 100 parallel_for iterations).
So, in order to wait for tg.run() to complete, use limited.execute instead. But it will prevent the automatic boost of the limited concurrency level, and the task will be deferred until tg.wait() is executed by the master thread.
If you want to see asynchronous execution, set arena's concurrency to 2 manually: tbb::task_arena limited(2);
or disable slot reservation for master thread: tbb::task_arena limited(1,0) (but note, it implies additional overheads for dynamic balancing of the number of threads in arena).
P.S. TBB has no points where threads are guaranteed to come (unlike OpenMP). Only enqueue methods guarantee creation of at least one worker thread, but it says nothing about when it will come. See local observer feature to get notification when threads are actually joining arenas.

What is the fastest way to notify another thread that data is available? Any alternatives to spinning?

One of my threads writes data to a circular buffer and another thread needs to process this data ASAP. I was thinking of writing a simple spin loop like this (pseudo-code):
while (true) {
while (!a[i]) {
/* do nothing - just keep checking over and over */
}
// process b[i]
i++;
if (i >= MAX_LENGTH) {
i = 0;
}
}
Above I'm using a to indicate that data stored in b is available for processing. Probably I should also set thread affinity for such a "hot" thread. Of course such a spin is very expensive in terms of CPU, but that's OK for me as my primary requirement is latency.
The question is: should I really write something like that, or do boost or the STL offer something that:
Easier to use.
Has roughly the same (or even better?) latency at the same time occupying less CPU resources?
I think that my pattern is so general that there should be some good implementation somewhere.
Update: It seems my question is still too complicated. Let's just consider the case when I need to write some items to an array in arbitrary order and another thread should read them in the right order as items become available. How can I do that?
Update 2:
I'm adding a test program to demonstrate what I want to achieve and how. At least on my machine it happens to work. I'm using rand to show that I cannot use a general queue and need an array-based structure:
#include "stdafx.h"
#include <string>
#include <boost/thread.hpp>
#include "windows.h" // for Sleep
const int BUFFER_LENGTH = 10;
int buffer[BUFFER_LENGTH];
short flags[BUFFER_LENGTH];
void ProcessorThread() {
for (int i = 0; i < BUFFER_LENGTH; i++) {
while (flags[i] == 0);
printf("item %i received, value = %i\n", i, buffer[i]);
}
}
int _tmain(int argc, _TCHAR* argv[])
{
memset(flags, 0, sizeof(flags));
boost::thread processor = boost::thread(&ProcessorThread);
for (int i = 0; i < BUFFER_LENGTH * 10; i++) {
int x = rand() % BUFFER_LENGTH;
buffer[x] = x;
flags[x] = 1;
Sleep(100);
}
processor.join();
return 0;
}
Output:
item 0 received, value = 0
item 1 received, value = 1
item 2 received, value = 2
item 3 received, value = 3
item 4 received, value = 4
item 5 received, value = 5
item 6 received, value = 6
item 7 received, value = 7
item 8 received, value = 8
item 9 received, value = 9
Is my program guaranteed to work? How would you redesign it, perhaps using some of the existing structures from boost/STL instead of an array? Is it possible to get rid of the spin without affecting latency?
If the consuming thread is put to sleep, it takes a few microseconds for it to wake up. This is process scheduler latency that you cannot avoid unless the thread is busy-spinning as yours does. The thread also needs a real-time FIFO scheduling policy so that it is not descheduled when it is ready to run but has exhausted its time quantum.
So, there is no alternative that could match latency of busy spinning.
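If busy-spinning is kept, the flags should at least be atomic, because the plain short flags in the question form a data race. The following is a minimal sketch using std::atomic with release/acquire ordering. The function name spin_receive_in_order and the slot * slot payload are made up for illustration, and the sketch assumes each slot is published exactly once (the question's test program may write a slot several times).

```cpp
#include <array>
#include <atomic>
#include <thread>
#include <vector>

constexpr int kSlots = 10;

// Publishes the slots in the given order from the calling thread while a
// second thread spins until each slot, in index order, becomes ready.
// Assumes publish_order contains every index 0..kSlots-1 exactly once.
std::vector<int> spin_receive_in_order(const std::vector<int>& publish_order) {
    std::array<int, kSlots> data{};
    std::array<std::atomic<bool>, kSlots> ready{};  // all flags start as false

    std::vector<int> received;
    std::thread consumer([&] {
        for (int i = 0; i < kSlots; ++i) {
            // Busy-wait: the acquire load pairs with the release store below,
            // so data[i] is visible once the flag reads true.
            while (!ready[i].load(std::memory_order_acquire)) {
            }
            received.push_back(data[i]);
        }
    });

    for (int slot : publish_order) {
        data[slot] = slot * slot;                            // write payload first
        ready[slot].store(true, std::memory_order_release);  // then publish it
    }
    consumer.join();
    return received;
}
```

The consumer still burns a full core while waiting, so this only addresses correctness, not CPU usage.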
(Surprising you are using Windows, it is best avoided if you are serious about HFT).
This is what Condition Variables were designed for. std::condition_variable is defined in the C++11 standard library.
What exactly is fastest for your purposes depends on your problem; You can attack it from several angles, but CVs (or derivative implementations) are a good starting point for understanding the subject better and approaching an implementation.
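As a starting point, the array-based scenario from the question could be expressed with a condition variable roughly as follows. This is only a sketch: cv_receive_in_order and the slot + 100 payload are hypothetical, a single mutex guards both arrays, and each slot is assumed to be published exactly once. The consumer sleeps instead of spinning, at the cost of the wake-up latency discussed in the other answer.

```cpp
#include <array>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

constexpr int kSlots = 10;

// The calling thread fills slots in arbitrary order; the consumer thread
// blocks until the next slot in index order is marked ready.
std::vector<int> cv_receive_in_order(const std::vector<int>& publish_order) {
    std::mutex m;
    std::condition_variable cv;
    std::array<int, kSlots> data{};
    std::array<bool, kSlots> ready{};  // guarded by m, all false initially

    std::vector<int> received;
    std::thread consumer([&] {
        for (int i = 0; i < kSlots; ++i) {
            std::unique_lock<std::mutex> lock(m);
            // The predicate is rechecked after every wake-up, so spurious
            // wake-ups and notifications for other slots are harmless.
            cv.wait(lock, [&] { return ready[i]; });
            received.push_back(data[i]);
        }
    });

    for (int slot : publish_order) {
        {
            std::lock_guard<std::mutex> lock(m);
            data[slot] = slot + 100;
            ready[slot] = true;
        }
        cv.notify_one();  // only one consumer exists in this sketch
    }
    consumer.join();
    return received;
}
```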
Consider using the C++11 threading library if your compiler supports it, or the Boost equivalent if not. In your case, look especially at std::future with std::promise.
There is a good book about threading and C++11 threading library:
Anthony Williams. C++ Concurrency in Action (2012)
Example from cppreference.com:
#include <iostream>
#include <future>
#include <thread>
int main()
{
// future from a packaged_task
std::packaged_task<int()> task([](){ return 7; }); // wrap the function
std::future<int> f1 = task.get_future(); // get a future
std::thread(std::move(task)).detach(); // launch on a thread
// future from an async()
std::future<int> f2 = std::async(std::launch::async, [](){ return 8; });
// future from a promise
std::promise<int> p;
std::future<int> f3 = p.get_future();
std::thread( [](std::promise<int>& p){ p.set_value(9); },
std::ref(p) ).detach();
std::cout << "Waiting..." << std::flush;
f1.wait();
f2.wait();
f3.wait();
std::cout << "Done!\nResults are: "
<< f1.get() << ' ' << f2.get() << ' ' << f3.get() << '\n';
}
If you want a fast method then simply drop to making OS calls. Any C++ library wrapping them is going to be slower.
e.g. On Windows your consumer can call WaitForSingleObject(), and your data-producing thread can wake the consumer using SetEvent(). http://msdn.microsoft.com/en-us/library/windows/desktop/ms687032(v=vs.85).aspx
For Unix, here is a similar question with answers: Windows Event implementation in Linux using conditional variables?
Do you really need threading?
A single threaded app is trivially simple and eliminates all the issues with thread safety and the overhead of launching threads. I did a study of threaded vs non threaded code to append text to a log file. The non threaded code was better in every measure of performance.

boost::threads execution ordering

I have a problem with the order of execution of threads created consecutively.
Here is the code.
#include <iostream>
#include <Windows.h>
#include <boost/thread.hpp>
using namespace std;
boost::mutex mutexA;
boost::mutex mutexB;
boost::mutex mutexC;
boost::mutex mutexD;
void SomeWork(char letter, int index)
{
boost::mutex::scoped_lock lock;
switch(letter)
{
case 'A' : lock = boost::mutex::scoped_lock(mutexA); break;
case 'B' : lock = boost::mutex::scoped_lock(mutexB); break;
case 'C' : lock = boost::mutex::scoped_lock(mutexC); break;
case 'D' : lock = boost::mutex::scoped_lock(mutexD); break;
}
cout << letter <<index << " started" << endl;
Sleep(800);
cout << letter<<index << " finished" << endl;
}
int main(int argc , char * argv[])
{
for(int i = 0; i < 16; i++)
{
char x = rand() % 4 + 65;
boost::thread tha = boost::thread(SomeWork,x,i);
Sleep(10);
}
Sleep(6000);
system("PAUSE");
return 0;
}
Each time, a letter (from A to D) and a generation id (i) are passed to the function SomeWork, which runs as a thread. I do not care about the execution order between letters, but for a particular letter, say A, Ax has to start before Ay if x < y.
A random part of a random output of the code is:
B0 started
D1 started
C2 started
A3 started
B0 finished
B12 started
D1 finished
D15 started
C2 finished
C6 started
A3 finished
A9 started
B12 finished
B11 started --> B11 started after B12 finished.
D15 finished
D13 started
C6 finished
C7 started
A9 finished
How can I avoid such conditions?
Thanks.
I solved the problem using condition variables, but I changed the problem a bit. The solution is to keep track of the index of the for loop, so each thread knows when it must not run yet. As far as this code is concerned, there are two other things that I would like to ask about.
First, on my computer, when I set the for-loop bound to 350 I got an access violation; 310 iterations were OK. So I realized that there is a maximum number of threads that can be created. How can I determine this number?
Second, in Visual Studio 2008, the release build of the code showed really strange behaviour: without the condition variable (lines 1 to 3 commented out), the threads were still ordered. How could that happen?
Here is the code:
#include <iostream>
#include <Windows.h>
#include <boost/thread.hpp>
using namespace std;
boost::mutex mutexA;
boost::mutex mutexB;
boost::mutex mutexC;
boost::mutex mutexD;
class cl
{
public:
boost::condition_variable con;
boost::mutex mutex_cl;
char Letter;
int num;
cl(char letter) : Letter(letter) , num(0)
{
}
void doWork( int index, int tracknum)
{
boost::unique_lock<boost::mutex> lock(mutex_cl);
while(num != tracknum) // line 1
con.wait(lock); // line 2
Sleep(10);
num = index;
cout << Letter<<index << endl;
con.notify_all(); // line 3
}
};
int main(int argc , char * argv[])
{
cl A('A');
cl B('B');
cl C('C');
cl D('D');
for(int i = 0; i < 100; i++)
{
boost::thread(&cl::doWork,&A,i+1,i);
boost::thread(&cl::doWork,&B,i+1,i);
boost::thread(&cl::doWork,&C,i+1,i);
boost::thread(&cl::doWork,&D,i+1,i);
}
cout << "************************************************************************" << endl;
Sleep(6000);
system("PAUSE");
return 0;
}
If you have two different threads waiting for the lock, it's entirely non-deterministic which one will acquire it once the lock is released by the previous holder. I believe this is what you are experiencing. Assume B10 is holding the lock, and in the mean time threads are spawned for B11 and B12. B10 releases the lock - it's down to a coin toss as to whether B11 or B12 acquires it next, irrespective of which thread was created first, or even which thread started waiting first.
Perhaps you should implement work queues for each letter, such that you spawn exactly 4 threads, each of which consume work units? This is the only way to easily guarantee ordering in this way. A simple mutex is not going to guarantee ordering if multiple threads are waiting for the lock.
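The work-queue idea could look roughly like the sketch below: one FIFO queue and one worker thread per letter, so jobs for the same letter start in the order they were submitted. It uses the standard library rather than Boost, and LetterQueue, worker and run_in_order are hypothetical names. Recording the index in started stands in for the real SomeWork(letter, index) call.

```cpp
#include <array>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

struct LetterQueue {
    std::mutex m;
    std::condition_variable cv;
    std::deque<int> jobs;      // guarded by m
    bool closed = false;       // guarded by m
    std::vector<int> started;  // written only by this queue's worker
};

void worker(LetterQueue& q) {
    for (;;) {
        std::unique_lock<std::mutex> lock(q.m);
        q.cv.wait(lock, [&] { return q.closed || !q.jobs.empty(); });
        if (q.jobs.empty()) return;  // closed and fully drained
        int index = q.jobs.front();
        q.jobs.pop_front();
        lock.unlock();
        q.started.push_back(index);  // stand-in for SomeWork(letter, index)
    }
}

// letters[i] in 0..3 selects the queue (A..D) that job i belongs to.
// Returns, per letter, the order in which that letter's jobs were started.
std::vector<std::vector<int>> run_in_order(const std::vector<int>& letters) {
    std::array<LetterQueue, 4> queues;
    std::vector<std::thread> workers;
    for (auto& q : queues) workers.emplace_back(worker, std::ref(q));

    for (std::size_t i = 0; i < letters.size(); ++i) {
        LetterQueue& q = queues[static_cast<std::size_t>(letters[i])];
        {
            std::lock_guard<std::mutex> lock(q.m);
            q.jobs.push_back(static_cast<int>(i));
        }
        q.cv.notify_one();
    }
    for (auto& q : queues) {  // tell each worker that no more jobs will come
        {
            std::lock_guard<std::mutex> lock(q.m);
            q.closed = true;
        }
        q.cv.notify_one();
    }
    for (auto& t : workers) t.join();

    std::vector<std::vector<int>> result;
    for (auto& q : queues) result.push_back(q.started);
    return result;
}
```

Because each letter has exactly one worker, ordering within a letter follows from the queue being FIFO, and no thread ever competes with a sibling for the same mutex.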
Even though B11 is started before B12 it is not guaranteed to be given a CPU time slice to execute SomeWork() prior to B12. This decision is up to the OS and its scheduler.
Mutexes are typically used to synchronize access to data between threads, and a concern has been raised about the sequence of thread execution (i.e. data access).
If the threads for group 'A' are executing the same code on the same data then just use one thread. This will eliminate context switching between threads in the group and yield the same result. If the data is changing, consider a producer/consumer pattern. Paul Bridger gives an easy-to-understand producer/consumer example here.
Your threads have dependencies that must be satisfied before they start execution. In your example, B12 depends on B0 and B11. Somehow you have to track that dependency knowledge. Threads with unfinished dependencies must be made to wait.
I would look into condition variables. Each time a thread finishes SomeWork() it would use the condition variable's notify_all() method. Then all of the waiting threads must check if they still have dependencies. If so, go back and wait. Otherwise, go ahead and call SomeWork().
You need some way for each thread to determine if it has unfinished dependencies. This will probably be some globally available entity. You should only modify it when you have the mutex (in SomeWork()). Reading by multiple threads should be safe for simple data structures.