I am trying to implement a double buffer for a typical producer-consumer problem.
1. get_items() produces 10 items at a time.
2. The producer pushes 10 items onto a write queue. Assume for now that there is only one producer.
3. Each consumer consumes one item from the queue. There are many consumers.
My code is below. The idea is simple: consume from readq until it is empty, then swap the queue pointers so that readq points to the filled writeq and writeq points to the emptied queue, which the producer starts filling again. This way the producer and the consumers can work independently without blocking each other, trading space for time.
However, my code does not work with multiple consumers. I start 10 consumer threads, and the program always hangs at .join().
So my code must be buggy, but after examining it carefully I cannot find the bug. The code seems to hang after lk1.unlock(), so it is not stuck in a while loop or anything obvious.
mutex m1;
mutex m2; // using 2 mutex, so when producer is locked, consumer can still run
condition_variable put;
condition_variable fetch;
queue<int> q1;
queue<int> q2;
queue<int>* readq = &q1;
queue<int>* writeq = &q2;
bool flag{ true };
vector<int> get_items() {
vector<int> res;
for (int i = 0; i < 10; i++) {
res.push_back(i);
}
return res;
}
void producer_mul() {
unique_lock<mutex> lk2(m2);
put.wait(lk2, [&]() {return flag == false; }); //producer waits for consumer signal
vector<int> items = get_items();
for (auto it : items) {
writeq->push(it);
}
flag = true; //signal queue is filled
fetch.notify_one();
lk2.unlock();
}
int get_one_item_mul() {
unique_lock<mutex> lk1(m1);
int res;
if (!(*readq).empty()) {
res = (*readq).front(); (*readq).pop();
if ((*writeq).empty() && flag == true) { //if writeq is empty
flag = false;
put.notify_one();
}
}
else {
readq = writeq; // swap queue pointer
while ((*readq).empty()) { // not yet write
if (flag) {
flag = false;
put.notify_one();//start filling process
}
//if (readq->empty()) { //updated due to race. readq now points to writeq, so if producer finished, readq is not empty and flag = true.
fetch.wait(lk1, [&]() {return flag == true; });
//}
}
if (flag) {
writeq = writeq == &q1 ? &q2 : &q1; //swap the writeq to the alternative queue and fill it again
flag = false;
//put.notify_one(); //fill that queue again if needed. In my case only 10 items are produced and consumed, so a second round is not needed, and the code does not work even in this simple case, so this is commented out for now.
}
res = readq->front(); readq->pop();
}
lk1.unlock();
this_thread::sleep_for(10ms);
return res;
}
int main()
{
std::vector<std::thread> threads;
std::packaged_task<void(void)> job1(producer_mul);
vector<std::future<int>> res;
for (int i = 0; i < 10; i++) {
std::packaged_task<int(void)> job2(get_one_item_mul);
res.push_back(job2.get_future());
threads.push_back(std::thread(std::move(job2)));
}
threads.push_back(std::thread(std::move(job1)));
for (auto& t : threads) {
t.join();
}
for (auto& a : res) {
cout << a.get() << endl;
}
return 0;
}
I added some comments, but the idea and the code are fairly simple and self-explanatory.
I am trying to figure out where the problem in my code is. Does it work for multiple consumers? Furthermore, would it work with multiple producers? I do not see a problem, since the locking in the code is not fine-grained: producer and consumer each hold their lock from beginning to end.
Looking forward to the discussion; any help is appreciated.
Update
I fixed the race condition pointed out in one of the answers.
The program is still not working.
Your program contains data races, and therefore exhibits undefined behavior. I see at least two:
producer_mul accesses and modifies flag while holding m2 mutex but not m1. get_one_item_mul accesses and modifies flag while holding m1 mutex but not m2. So flag is not in fact protected against concurrent access.
Similarly, producer_mul accesses writeq pointer while holding m2 mutex but not m1. get_one_item_mul modifies writeq while holding m1 mutex but not m2.
There's also a data race on the queues themselves. Initially, both queues are empty. producer_mul is blocked waiting on flag. Then the following sequence occurs (P for producer thread, C for consumer thread):
C: readq = writeq; // Both now point to the same queue
C: flag = false; put.notify_one(); // This wakes up producer
P: writeq->push(it);
C: if (readq->empty())
The last two lines happen concurrently, with no protection against concurrent access. One thread modifies an std::queue instance while the other accesses that same instance. This is a data race.
There's a data race at the heart of the design. Let's imagine there's just one producer P and two consumers C1 and C2. Initially, P waits on put until flag == false. C1 grabs m1; C2 is blocked on m1.
C1 sets readq = writeq, then unblocks P, then calls fetch.wait(lk1, [&]() {return flag == true; });. This unlocks m1, allowing C2 to proceed. So now P is busy writing to writeq while C2 is busy reading from readq, which is one and the same queue.
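One way to avoid the races listed above is to guard both queues and the swap with a single mutex. The following is only a minimal sketch under that assumption, not the asker's design: the names DoubleBuffer, produce_batch, consume_one and run_double_buffer_demo are hypothetical, and the flag handshake is replaced by a predicate on the queues themselves.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical double buffer: every shared member is guarded by one mutex.
class DoubleBuffer {
public:
    // Producer side: append a whole batch to the write queue.
    void produce_batch(const std::vector<int>& items) {
        {
            std::lock_guard<std::mutex> lk(m_);
            for (int v : items) write_.push(v);
        }
        not_empty_.notify_all();  // wake every consumer waiting for data
    }

    // Consumer side: take one item, swapping the queues when the read queue is empty.
    int consume_one() {
        std::unique_lock<std::mutex> lk(m_);
        not_empty_.wait(lk, [this] { return !read_.empty() || !write_.empty(); });
        if (read_.empty()) std::swap(read_, write_);  // the swap happens under the lock
        int v = read_.front();
        read_.pop();
        return v;
    }

private:
    std::mutex m_;
    std::condition_variable not_empty_;
    std::queue<int> read_;
    std::queue<int> write_;
};

// Starts `consumers` consumer threads and one producer of 10 items (0..9),
// then returns the sum of everything the consumers received.
int run_double_buffer_demo(int consumers) {
    assert(consumers >= 1 && consumers <= 10);  // only 10 items are produced
    DoubleBuffer buf;
    std::vector<int> results(consumers, 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < consumers; ++i)
        threads.emplace_back([&buf, &results, i] { results[i] = buf.consume_one(); });
    threads.emplace_back([&buf] {
        std::vector<int> items;
        for (int i = 0; i < 10; ++i) items.push_back(i);
        buf.produce_batch(items);
    });
    for (auto& t : threads) t.join();
    int sum = 0;
    for (int r : results) sum += r;
    return sum;
}
```

With a single mutex the producer and consumers do block each other briefly, so this gives up the two-mutex idea in exchange for having no unsynchronized access to flag, the pointers, or the queues.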
Related
I am a complete beginner with threads, so I am not able to resolve this problem myself.
I have two threads that should run in parallel. The first thread reads in the data (simulating a receive-queue thread), and once data is ready the second thread (the processing thread) should process it. The problem is that the second thread waits forever for the condition variable to be notified.
If I remove the for loop from the first thread, the condition variable notifies the second thread, but that thread only executes once. Why is the condition variable not notified when it is used inside the for loop?
My goal is to read all the data of a CSV file in the first thread and, in the second thread, store it in a vector depending on each row's content.
Thread one looks like this:
std::mutex mtx;
std::condition_variable condVar;
bool event_angekommen{false};
void simulate_event_readin(CSVLeser leser, int sekunden, std::vector<std::string> &csv_reihe)
{
std::lock_guard<std::mutex> lck(mtx);
std::vector<std::vector<std::string>> csv_daten = leser.erhalteDatenobj();
for (size_t idx = 1; idx < csv_daten.size(); idx++)
{
std::this_thread::sleep_for(std::chrono::seconds(sekunden));
csv_reihe = csv_daten[idx];
event_angekommen = true;
condVar.notify_one();
}
}
Thread two looks like this:
void detektiere_events(Detektion detektion, std::vector<std::string> &csv_reihe, std::vector<std::string> &pir_events)
{
while(1)
{
std::cout<<"Warte"<<std::endl;
std::unique_lock<std::mutex> lck(mtx);
condVar.wait(lck, [] {return event_angekommen; });
std::cout<<"Detektiere Events"<<std::endl;
std::string externes_event_user_id = csv_reihe[4];
std::string externes_event_data = csv_reihe[6];
detektion.neues_event(externes_event_data, externes_event_user_id);
if(detektion.pruefe_Pir_id() == true)
{
pir_events.push_back(externes_event_data);
};
}
}
and my main looks like this:
int main(void)
{
Detektion detektion;
CSVLeser leser("../../Example Data/collectedData_Protocol1.csv", ";");
std::vector<std::string> csv_reihe;
std::vector<std::string> pir_values = {"28161","28211","28261","28461","285612"};
std::vector<std::string> pir_events;
std::thread thread[2];
thread[0] = std::thread(simulate_event_readin, leser, 4, std::ref(csv_reihe));
thread[1] = std::thread(detektiere_events,detektion, std::ref(csv_reihe), std::ref(pir_events));
thread[0].join();
thread[1].join();
}
I'm not a C++ expert, but the code seems understandable enough to see the issue.
Your thread 1 grabs the lock once and doesn't release it until the end of its lifetime. It may signal that the condition is fulfilled, but it never actually releases the lock to allow other threads to act.
To fix this, move std::lock_guard<std::mutex> lck(mtx); inside the loop, after sleeping. This way, the thread will take and release the lock on each iteration, giving the other thread an opportunity to act while this one sleeps.
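The suggestion above can be sketched as follows, with the CSV reader replaced by a plain vector of strings. Beyond what the answer says, this sketch makes two assumptions that are labeled in the comments: the reader waits until the previous row has been consumed, and a hypothetical finished flag lets the processing thread exit. The names simulate_reader, detect_events and run_reader_demo are illustrative only.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

std::mutex mtx;
std::condition_variable cond_var;
bool event_arrived = false;  // guarded by mtx
bool finished = false;       // guarded by mtx; hypothetical shutdown flag
std::string current_row;     // guarded by mtx

void simulate_reader(const std::vector<std::string>& rows) {
    for (const std::string& row : rows) {
        // A real reader would sleep or block on I/O here, outside the lock.
        std::unique_lock<std::mutex> lck(mtx);  // taken and released per iteration
        // Assumption: wait until the previous row was consumed so it is not overwritten.
        cond_var.wait(lck, [] { return !event_arrived; });
        current_row = row;
        event_arrived = true;
        lck.unlock();
        cond_var.notify_all();
    }
    {
        std::unique_lock<std::mutex> lck(mtx);
        cond_var.wait(lck, [] { return !event_arrived; });
        finished = true;  // no more rows will arrive
    }
    cond_var.notify_all();
}

void detect_events(std::vector<std::string>& seen) {
    for (;;) {
        std::unique_lock<std::mutex> lck(mtx);
        cond_var.wait(lck, [] { return event_arrived || finished; });
        if (!event_arrived) return;  // finished and nothing pending
        seen.push_back(current_row);
        event_arrived = false;       // reset so the next wait really waits
        lck.unlock();
        cond_var.notify_all();
    }
}

// Runs both threads once and returns the rows the processing thread saw.
std::vector<std::string> run_reader_demo() {
    const std::vector<std::string> rows = {"a", "b", "c"};
    std::vector<std::string> seen;
    std::thread consumer(detect_events, std::ref(seen));
    std::thread producer(simulate_reader, std::cref(rows));
    producer.join();
    consumer.join();
    assert(seen.size() == rows.size());  // every row was handed over exactly once
    return seen;
}
```

Resetting event_arrived in the processing thread matters as much as the lock scope: without it the wait predicate stays true and the same row would be processed repeatedly.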
std::mutex mutex;
std::condition_variable cv;
uint8_t size = 2;
uint8_t count = size;
uint8_t direction = -1;
const auto sync = [&size, &count, &mutex, &cv, &direction]() //.
{
{
std::unique_lock<std::mutex> lock(mutex);
auto current_direction = direction;
if (--count == 0)
{
count = size;
direction *= -1;
cv.notify_all();
}
else
{
cv.wait(lock,
[&direction, &current_direction]() //.
{ return direction != current_direction; });
}
}
};
As explained in the first unaccepted answer to the reusable barrier question, a 'generation' must be stored inside a barrier object to prevent the next generation from manipulating the wake-up condition of the current generation for a given set of threads. What I do not like about that answer is its ever-growing generation counter. I believe we only need to distinguish between two generations at most, namely when a thread has satisfied the wait condition and started another barrier synchronization call, as the second unaccepted answer suggests. That second solution, however, was somewhat complex, and I believe the snippet above would be enough (it is currently implemented locally inside main but could be abstracted into a struct). Am I correct in my belief that a barrier can be in use by at most 2 generations simultaneously?
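Abstracted into a struct, as the question suggests, the snippet could look roughly like the sketch below. The names FlipBarrier, arrive_and_wait and run_barrier_demo are hypothetical, and a bool phase_ stands in for direction. This two-state scheme is commonly called a sense-reversing barrier; it relies on the phase not being able to flip a second time until every thread of the current round has arrived again.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical wrapper around the lambda from the question.
class FlipBarrier {
public:
    explicit FlipBarrier(std::size_t n) : size_(n), count_(n) { assert(n > 0); }

    void arrive_and_wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        const bool my_phase = phase_;  // the generation this call belongs to
        if (--count_ == 0) {
            count_ = size_;            // re-arm for the next round
            phase_ = !phase_;          // release everyone waiting on my_phase
            cv_.notify_all();
        } else {
            cv_.wait(lock, [this, my_phase] { return phase_ != my_phase; });
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    const std::size_t size_;
    std::size_t count_;
    bool phase_ = false;
};

// Each thread bumps a counter, then waits at the barrier. After the barrier,
// the counter must show that all threads finished the round. Returns the
// number of times that check failed.
int run_barrier_demo(std::size_t threads, int rounds) {
    FlipBarrier barrier(threads);
    std::atomic<int> arrived{0};
    std::atomic<int> violations{0};
    std::vector<std::thread> pool;
    for (std::size_t t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int r = 0; r < rounds; ++r) {
                ++arrived;
                barrier.arrive_and_wait();
                if (arrived.load() < (r + 1) * static_cast<int>(threads)) ++violations;
            }
        });
    }
    for (auto& th : pool) th.join();
    return violations.load();
}
```

A thread that has not yet woken from the wait has not arrived at the next round, so count_ cannot reach zero again before that thread has observed the flipped phase.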
I am learning about multithreading and wanted to simulate the producer-consumer problem (using a semaphore, if I can call it that).
I have a class that holds a queue; the producer pushes ints into the queue, and the consumer retrieves and prints them. I simulated it as follows:
class TestClass{
public:
void producer( int i ){
unique_lock<mutex> l(m);
q.push(i);
if( q.size() )
cnd.notify_all();
}
void consumer(){
unique_lock<mutex> l(m);
while( q.empty() ){
cnd.wait(l);
}
int tmp = q.front();
q.pop();
cout << "Producer got " << tmp << endl;
}
void ConsumerInit( int threads ){
for( int i = 0; i < threads; i++ ){
thrs[i] = thread(&TestClass::consumer, this );
}
for( auto &a : thrs )
a.join();
}
private:
queue<int> q;
vector<thread> thrs;
mutex m;
condition_variable cnd;
};
And I used a small console application to drive it:
int main(){
int x;
TestClass t;
int counter = 0;
while( cin >> x ){
if( x == 0 )
break;
if( x == 1)
t.producer(counter++);
if( x == 2 )
t.ConsumerInit(5);
}
}
So when the user inputs 1, an item is pushed into the queue; if the user inputs 2, the threads are spawned.
Whatever the order of invocation, for example 1 1 and then 2, or 2 1 1, it segfaults. I am not sure why. My understanding of my code is as follows; let's assume the order 2 1 1.
I initialize 5 threads; they see that the queue is empty, so they go to sleep. When I push a number to the queue, it notifies all sleeping threads.
The first one to wake up locks the mutex again, retrieves a number from the queue, and then releases the mutex. When the mutex is released, another thread does the same and unlocks the mutex. The third thread, once the mutex is unlocked, is still in the loop, sees that the queue is empty again, and goes back to sleep, as do all the remaining threads.
Is this logic correct? If so, why does this keep segfaulting? If not, I appreciate all explanations.
Thanks for the help!
//edit
As the answers suggested, I replaced [] with vector.push_back, but now the consumer does nothing with the data: it neither takes it nor prints it.
You aren't expanding the thrs vector when you do
thrs[i] = thread(&TestClass::consumer, this );
You should do
thrs.emplace_back(&TestClass::consumer, this);
That's where the crash would be.
Your issue has nothing to do with multithreading. You are accessing a std::vector out-of-bounds:
for (int i = 0; i < threads; i++) {
thrs[i] = thread(&TestClass::consumer, this);
//...
vector<thread> thrs;
The thrs vector is empty, and you're trying to access as if it has entries.
To show the error, use:
thrs.at(i) = thread(&TestClass::consumer, this);
and you will be greeted with a std::out_of_range exception instead of a segmentation fault.
Your program deadlocks if the input sequence is not of the form 1 1 1 1 1 ... 2, that is, if the number of 1s preceding the 2 is less than five.
Here is the reason:
If the queue holds fewer than 5 elements when the main thread calls ConsumerInit, some of the five created consumer threads will block waiting for the queue to receive elements. Meanwhile, the main thread blocks on the join operation. Since the main thread is waiting for the consumer threads to finish while some of those threads are waiting for data to consume, no progress is made. Hence the deadlock.
Problem is here:
for( auto &a : thrs )
a.join();
The main thread blocks here after you enter 2, waiting for the consumers to finish. So from this point on you think you are entering input, but no cin read is happening.
Remove these two lines and then you can enter 1 and producer/consumer will do their job.
I'm currently using Boost 1.55.0, and I can't understand why this code doesn't work.
The following code is a simplification that has the same problem as my program. Small runs finish, but with bigger runs the threads keep waiting forever.
boost::mutex m1;
boost::mutex critical_sim;
int total= 50000;
class krig{
public:
float dokrig(int in,float *sim, bool *aux, boost::condition_variable *hEvent){
float simnew=0;
boost::mutex::scoped_lock lk(m1);
if (in > 0)
{
while(!aux[in-1]){
hEvent[in-1].wait(lk);
}
simnew=1+sim[in-1];
}
return simnew;
};
};
void Simulnode( int itrd,float *sim, bool *aux, boost::condition_variable *hEvent){
int j;
float simnew;
krig kriga;
for(j=itrd; j<total; j=j+2){
if (fmod(1000.*j,total) == 0.0){
printf (" .progress. %f%%\n",100.*(float)j/(float)total);
}
simnew= kriga.dokrig(j,sim, aux, hEvent);
critical_sim.lock();
sim[j]=simnew;
critical_sim.unlock();
aux[j]=true;
hEvent[j].notify_one();
}
}
int main(int argc, char* argv[])
{
int i;
float *sim = new float[total];
bool *aux = new bool[total];
for(i=0; i<total; ++i)
aux[i]=false;
//boost::mutex m1;
boost::condition_variable *hEvent = new boost::condition_variable[total];
boost::thread_group tgroup;
for(i=0; i<2; ++i) {
tgroup.add_thread(new boost::thread(Simulnode, i,sim, aux, hEvent));
}
tgroup.join_all();
return 0;
}
Curiously, I noticed that if I place the code that is inside dokrig() inline in Simulnode(), it seems to work. Could it be a problem with the scope of the lock?
Can anybody tell me where I am wrong? Thanks in advance.
The problem happens in this part:
aux[j]=true;
hEvent[j].notify_one();
The first line represents a change of the condition that is being monitored by the hEvent condition variable. The second line proclaims this change to the consumer part, that is waiting for that condition to become true.
The problem is that these two steps happen without synchronization with the consumer, which can lead to the following race:
The consumer checks the condition, which is currently false. This happens in a critical section protected by the mutex m1.
A thread switch occurs. The producer changes the condition to true and notifies any waiting consumers.
Threads switch back. The consumer resumes and calls wait. However, it has already missed the notification sent in the previous step, so it will wait forever.
It is important to understand that the purpose of the mutex that is passed to the wait call of the condition variable is not to protect the condition variable itself, but the condition that it monitors (which in this case is the change to aux).
To avoid the data race, writing to aux and the subsequent notify have to be protected by the same mutex:
{
boost::lock_guard<boost::mutex> lk(m1);
aux[j]=true;
hEvent[j].notify_one();
}
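For reference, the same dependency chain can be sketched with the standard library instead of Boost. This is a simplified illustration, not the asker's program: it assumes a single condition variable rather than one per element, and the names run_chain, ready and worker are hypothetical. The point it shows is that the flag is written and the predicate is checked under the same mutex, so a notification cannot slip in between the check and the wait.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Two workers fill sim[] alternately; element j depends on element j-1.
std::vector<float> run_chain(int total) {
    assert(total >= 1);
    std::vector<float> sim(total, 0.0f);  // guarded by m
    std::vector<char> ready(total, 0);    // guarded by m; plays the role of aux
    std::mutex m;
    std::condition_variable cv;

    auto worker = [&](int start) {
        for (int j = start; j < total; j += 2) {
            float value = 0.0f;
            if (j > 0) {
                std::unique_lock<std::mutex> lk(m);
                // The predicate is evaluated while holding m.
                cv.wait(lk, [&] { return ready[j - 1] != 0; });
                value = 1.0f + sim[j - 1];
            }
            {
                // The condition changes while holding the same mutex.
                std::lock_guard<std::mutex> lk(m);
                sim[j] = value;
                ready[j] = 1;
            }
            cv.notify_all();  // one shared condition variable, so wake all waiters
        }
    };

    std::thread t0(worker, 0);
    std::thread t1(worker, 1);
    t0.join();
    t1.join();
    return sim;
}
```

Notifying after releasing the lock is fine here because the waiter re-checks the predicate under the mutex before it blocks.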
I am fairly new to multi-threaded programming, so please forgive my possibly imprecise question. Here is my problem:
I have a function processing data and generating lots of objects of the same type. This is done iterating in several nested loops, so it would be practical to just do all iterations, save these objects in some container and then work on that container in interfacing code doing the next steps. However, I have to create millions of these objects which would blow up the memory usage. These constraints are mainly due to external factors I cannot control.
Generating only a certain amount of data would be ideal, but breaking out of the loops and restarting later at the same point is also impractical. My idea was to do the processing in a separate thread which would be paused after n iterations and resumed once all n objects are completely processed, then resuming, doing n next iterations and so on until all iterations are done. It is important to wait until the thread has done all n iterations, so both threads would not really run in parallel.
This is where my problems begin: How do I do the mutex locking properly here? My approaches produce boost::lock_errors. Here is some code to show what I want to do:
boost::recursive_mutex bla;
boost::condition_variable_any v1;
boost::condition_variable_any v2;
boost::recursive_mutex::scoped_lock lock(bla);
int got_processed = 0;
const int n = 10;
void ProcessNIterations() {
got_processed = 0;
// have some mutex or whatever unlocked here so that the worker thread can
// start or resume.
// my idea: have some sort of mutex lock that unlocks here and a condition
// variable v1 that is notified while the thread is waiting for that.
lock.unlock();
v1.notify_one();
// while the thread is working to do the iterations this function should wait
// because there is no use to proceed until the n iterations are done
// my idea: have another condition v2 variable that we wait for here and lock
// afterwards so the thread is blocked/paused
while (got_processed < n) {
v2.wait(lock);
}
}
void WorkerThread() {
int counter = 0;
// wait for something to start
// my idea: acquire a mutex lock here that was locked elsewhere before and
// wait for ProcessNIterations() to unlock it so this can start
boost::recursive_mutex::scoped_lock internal_lock(bla);
for (;;) {
for (;;) {
// here do the iterations
counter++;
std::cout << "iteration #" << counter << std::endl;
got_processed++;
if (counter >= n) {
// we've done n iterations; pause here
// my idea: unlock the mutex, notify v2
internal_lock.unlock();
v2.notify_one();
while (got_processed > 0) {
// when ProcessNIterations() is called again, resume here
// my idea: wait for v1 reacquiring the mutex again
v1.wait(internal_lock);
}
counter = 0;
}
}
}
}
int main(int argc, char *argv[]) {
boost::thread mythread(WorkerThread);
ProcessNIterations();
ProcessNIterations();
while (true) {}
}
The above code fails after doing 10 iterations in the line v2.wait(lock); with the following message:
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::lock_error> >'
what(): boost::lock_error
How do I do this properly? If this is the way to go, how do I avoid lock_errors?
EDIT: I solved it using a concurrent queue as discussed here. This queue also has a maximum size; once that size is reached, a push simply waits until at least one element has been popped. Therefore, the producer worker can simply go on filling this queue, and the rest of the code can pop entries as needed. No mutex locking needs to be done outside the queue. The queue is here:
template<typename Data>
class concurrent_queue
{
private:
std::queue<Data> the_queue;
mutable boost::mutex the_mutex;
boost::condition_variable the_condition_variable;
boost::condition_variable the_condition_variable_popped;
int max_size_;
public:
concurrent_queue(int max_size=-1) : max_size_(max_size) {}
void push(const Data& data) {
boost::mutex::scoped_lock lock(the_mutex);
while (max_size_ > 0 && the_queue.size() >= max_size_) {
the_condition_variable_popped.wait(lock);
}
the_queue.push(data);
lock.unlock();
the_condition_variable.notify_one();
}
bool empty() const {
boost::mutex::scoped_lock lock(the_mutex);
return the_queue.empty();
}
bool wait_and_pop(Data& popped_value) {
boost::mutex::scoped_lock lock(the_mutex);
bool locked = true;
if (the_queue.empty()) {
locked = the_condition_variable.timed_wait(lock, boost::posix_time::seconds(1));
}
if (locked && !the_queue.empty()) {
popped_value=the_queue.front();
the_queue.pop();
the_condition_variable_popped.notify_one();
return true;
} else {
return false;
}
}
int size() {
boost::mutex::scoped_lock lock(the_mutex);
return the_queue.size();
}
};
This could be implemented using condition variables. Once you've performed N iterations, you call wait() on the condition variable, and when the objects have been processed in another thread, that thread signals the condition variable (notify_one() in Boost and the standard library) to unblock the thread waiting on it.
You probably want some sort of finite-capacity queue, list, or stack in conjunction with a condition variable. When the queue is full, the producer thread waits on the condition variable, and any time a consumer thread removes an element from the queue, it signals the condition variable. That allows the producer to wake up and fill the queue again. If you really wanted to process N elements at a time, then have the workers signal only when there's capacity in the queue for N elements, rather than every time they pull an item out of the queue.
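The batch variant described in the last sentence might be sketched like this, using the standard library rather than Boost. The class BatchQueue and the functions push_batch, pop and run_batch_demo are hypothetical names, and the capacity of 8 and batch size of 4 are arbitrary values chosen for the demo.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical bounded queue whose producer is only woken when a full batch fits.
class BatchQueue {
public:
    BatchQueue(std::size_t capacity, std::size_t batch)
        : capacity_(capacity), batch_(batch) {
        assert(batch > 0 && batch <= capacity);
    }

    // Blocks until there is room for a whole batch, then pushes all items.
    void push_batch(const std::vector<int>& items) {
        assert(items.size() <= batch_);
        std::unique_lock<std::mutex> lk(m_);
        has_room_.wait(lk, [this] { return capacity_ - q_.size() >= batch_; });
        for (int v : items) q_.push(v);
        lk.unlock();
        has_items_.notify_all();
    }

    // Blocks until an item is available and returns it.
    int pop() {
        std::unique_lock<std::mutex> lk(m_);
        has_items_.wait(lk, [this] { return !q_.empty(); });
        int v = q_.front();
        q_.pop();
        const bool room = capacity_ - q_.size() >= batch_;
        lk.unlock();
        if (room) has_room_.notify_one();  // signal only when a full batch fits
        return v;
    }

private:
    const std::size_t capacity_;
    const std::size_t batch_;
    std::mutex m_;
    std::condition_variable has_room_;
    std::condition_variable has_items_;
    std::queue<int> q_;
};

// One producer pushes 5 batches of 4 items (values 0..19); the caller pops all 20.
int run_batch_demo() {
    BatchQueue q(8, 4);
    std::thread producer([&q] {
        for (int b = 0; b < 5; ++b) {
            std::vector<int> batch;
            for (int i = 0; i < 4; ++i) batch.push_back(b * 4 + i);
            q.push_batch(batch);
        }
    });
    int sum = 0;
    for (int i = 0; i < 20; ++i) sum += q.pop();
    producer.join();
    return sum;
}
```

Because both waits use a predicate evaluated under the mutex, a notification that arrives before the waiter blocks is not lost; the waiter simply sees the predicate already satisfied.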