"Segmentation fault (core dumped)" while using pthread_create - c++

So I've got a problem: when I try to create the last thread, it always says the core is dumped. It doesn't matter whether I ask for 5 threads or 2. Here is my code:
#include <cstdlib>
#include <iostream>
#include <string>
#include <mutex>
#include <pthread.h>
#include <condition_variable>

#define NUM_THREADS 4

using namespace std;

struct thread_data
{
    int thread_id;
    int repeat;
};

class our_monitor{
private:
    int buffer[100];
    mutex m;
    int n = 0, lo = 0, hi = 0;
    condition_variable in,out;
    unique_lock<mutex> lk;
public:
    our_monitor():lk(m)
    {
    }
    void insert(int val, int repeat)
    {
        in.wait(lk, [&]{return n <= 100-repeat;});
        for(int i=0; i<repeat; i++)
        {
            buffer[hi] = val;
            hi = (hi + 1) % 100; //ring buffer
            n = n +1; //one more item in buffer
        }
        lk.unlock();
        out.notify_one();
    }
    int remove(int repeat)
    {
        out.wait(lk, [&]{return n >= repeat;});
        int val;
        for(int i=0; i<repeat; i++)
        {
            val = buffer[lo];
            lo = (lo + 1) % 100;
            n -= 1;
        }
        lk.unlock();
        in.notify_one();
        return val;
    }
};

our_monitor mon;

void* produce(void *threadarg)
{
    struct thread_data *my_data;
    my_data = (struct thread_data *) threadarg;
    cout<<"IN produce after paramiters"<< my_data->repeat<<endl;
    int item;
    item = rand()%100 + 1;
    mon.insert(item, my_data->repeat);
    cout<< "Item: "<< item << " Was prodused by thread:"<< my_data->thread_id << endl;
}

void* consume(void *threadarg)
{
    struct thread_data *my_data;
    my_data = (struct thread_data *) threadarg;
    cout<<"IN consume after paramiters"<< my_data->repeat<<endl;
    int item;
    item = mon.remove(my_data->repeat);
    if(item) cout<< "Item: "<< item << " Was consumed by thread:"<< my_data->thread_id << endl;
}

int main()
{
    our_monitor *mon = new our_monitor();
    pthread_t threads[NUM_THREADS];
    thread_data td[NUM_THREADS];
    int rc;
    int i;
    for( i = 0; i < NUM_THREADS; i++ )
    {
        td[i].thread_id = i;
        td[i].repeat = rand()%5 + 1;
        if(i % 2 == 0)
        {
            cout << "main() : creating produce thread, " << i << endl;
            rc = pthread_create(&threads[i], NULL, produce, (void*) &td[i]);
            if (rc)
            {
                cout << "Error:unable to create thread," << rc << endl;
                exit(-1);
            }
        } else
        {
            cout << "main() : creating consume thread, " << i << endl;
            rc = pthread_create(&threads[i], NULL, consume, (void *)&td[i]);
            if (rc)
            {
                cout << "Error:unable to create thread," << rc << endl;
                exit(-1);
            }
        }
    }
    pthread_join(threads[0], NULL);
    pthread_join(threads[1], NULL);
    pthread_join(threads[2], NULL);
    //pthread_exit(NULL);
}
UPD: Now I can't do more than 3 threads, and the threads don't do the functions I want them to do (consume and produce).
UPD 2: Now I get a message like this:
terminate called after throwing an instance of 'terminate called recursively
terminate called recursively
Aborted (core dumped)

From cppreference, regarding std::condition_variable::wait():
"Calling this function if lock.mutex() is not locked by the current
thread is undefined behavior."
http://en.cppreference.com/w/cpp/thread/condition_variable/wait
Unfortunately, the program doesn't crash at the offending wait() call, but later, at lk.unlock(), where you unlock a lock that wasn't locked.
Lock the lock when you enter your functions. I've done a quick check of the rest of your logic, and I'm about 85% sure it's otherwise OK.
While I have you here: this is not strictly necessary, but it's good practice. std::lock_guard and std::unique_lock automatically lock the mutex when they enter scope and unlock it when they leave scope. This helps simplify exception handling and early returns. I recommend you get rid of lk as a member variable and use it as a scoped local variable instead.
void insert(int val, int repeat)
{
    { // Scoped. Somewhat pedantic in this case, but it's always best to signal after the mutex is unlocked
        std::unique_lock<std::mutex> lk(m);
        in.wait(lk, [&]{return n <= 100-repeat;});
        for(int i=0; i<repeat; i++)
        {
            buffer[hi] = val;
            hi = (hi + 1) % 100; //ring buffer
            n = n +1; //one more item in buffer
        }
    }
    out.notify_one();
}
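For completeness (my own sketch, not part of the original answer), remove() can follow exactly the same pattern, holding the lock only while the buffer is touched and notifying after it is released:
int remove(int repeat)
{
    int val = 0;
    { // Scoped for the same reason as in insert()
        std::unique_lock<std::mutex> lk(m);
        out.wait(lk, [&]{return n >= repeat;});
        for(int i=0; i<repeat; i++)
        {
            val = buffer[lo];
            lo = (lo + 1) % 100;
            n -= 1;
        }
    }
    in.notify_one();
    return val;
}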
Ok, now for the final issue. The cool thing about producer/consumer is that we can produce and consume at the same time. However, we just locked our entire functions, so this is no longer possible. What you can do now is move your lock/wait/unlock/work/signal sequence inside the for loop, so the mutex is held for one item at a time.
in pseudocode:
// produce:
while (true)
{
    {
        unique_lock lk(m)
        wait(lk, predicate)
    }
    produce 1
    signal
}
This is equivalent to using semaphores (which the C++11 standard library doesn't have, but you can easily make your own out of a mutex and a condition variable, as shown above.)
// produce:
semaphore in(100);
semaphore out(0);
while (true)
{
    in.down(1) // Subtracts 1 from in.count. Blocks when in.count == 0 (meaning the buffer is full)
    produce 1
    out.up(1) // Adds 1 to out.count
}
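In case it helps, here is a minimal sketch of mine (not the answer author's) of such a counting semaphore built from a mutex and a condition variable; the class and member names are made up:
#include <mutex>
#include <condition_variable>

class counting_semaphore
{
    std::mutex m;
    std::condition_variable cv;
    int count;
public:
    explicit counting_semaphore(int initial) : count(initial) {}
    void up(int k = 1)   // release k units
    {
        {
            std::lock_guard<std::mutex> lk(m);
            count += k;
        }
        cv.notify_all();
    }
    void down(int k = 1) // acquire k units, blocking while fewer are available
    {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [&]{ return count >= k; });
        count -= k;
    }
};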

When main ends, td goes out of scope and ceases to exist, but you passed pointers to it to your threads. You need to make sure td continues to exist as long as any thread might be using it.
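One simple way to guarantee that (a sketch based on the code above, assuming nothing else changes): join every thread you created before main returns, so td outlives all of them.
// in main(), after the creation loop:
for (int i = 0; i < NUM_THREADS; i++)
{
    pthread_join(threads[i], NULL); // td[] stays alive until every thread has finished
}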

Related

If statement passes only when preceded by debug cout line (multi-threading in C)

I created this code for solving CPU-intensive tasks in real time, and potentially as a base for a game engine in the future. For it I created a system with an array of ints that each thread modifies to signal whether it is done with its current task.
The problem occurs when running it with more than 4 threads. When using 6 threads or more, the check if (threadone_private == threadcount) stops working UNLESS I add the debug line cout << threadone_private << endl; before it.
I cannot comprehend why this debug line makes any difference to whether the if condition behaves as expected, nor why it works without it when using 4 threads or fewer.
For this code I'm using:
#include <GL/glew.h>
#include <GLFW/glfw3.h>
#include <iostream>
#include <thread>
#include <atomic>
#include <vector>
#include <string>
#include <fstream>
#include <sstream>
using namespace std;
Right now this code only counts up to 60 trillion, in asynchronous steps of 3 billion, really fast.
Here are the relevant parts of the code:
int thread_done[6] = { 0,0,0,0,0,0 };
atomic<long long int> testvar1 = 0;
atomic<long long int> testvar2 = 0;
atomic<long long int> testvar3 = 0;
atomic<long long int> testvar4 = 0;
atomic<long long int> testvar5 = 0;
atomic<long long int> testvar6 = 0;
void task1(long long int testvar, int thread_number)
{
int continue_work = 1;
for (; ; ) {
while (continue_work == 1) {
for (int i = 1; i < 3000000001; i++) {
testvar++;
}
thread_done[thread_number] = 1;
if (thread_number==0) {
testvar1 = testvar;
}
if (thread_number == 1) {
testvar2 = testvar;
}
if (thread_number == 2) {
testvar3 = testvar;
}
if (thread_number == 3) {
testvar4 = testvar;
}
if (thread_number == 4) {
testvar5 = testvar;
}
if (thread_number == 5) {
testvar6 = testvar;
}
continue_work = 0;
}
if (thread_done[thread_number] == 0) {
continue_work = 1;
}
}
}
And here is the relevant part of the main thread:
int main() {
long long int testvar = 0;
int threadcount = 6;
int threadone_private = 0;
thread thread_1(task1, testvar, 0);
thread thread_2(task1, testvar, 1);
thread thread_3(task1, testvar, 2);
thread thread_4(task1, testvar, 3);
thread thread_5(task1, testvar, 4);
thread thread_6(task1, testvar, 5);
for (; ; ) {
if (threadcount == 0) {
for (int i = 1; i < 3000001; i++) {
testvar++;
}
cout << testvar << endl;
}
else {
while (testvar < 60000000000000) {
threadone_private = thread_done[0] + thread_done[1] + thread_done[2] + thread_done[3] + thread_done[4] + thread_done[5];
cout << threadone_private << endl;
if (threadone_private == threadcount) {
testvar = testvar1 + testvar2 + testvar3 + testvar4 + testvar5 + testvar6;
cout << testvar << endl;
thread_done[0] = 0;
thread_done[1] = 0;
thread_done[2] = 0;
thread_done[3] = 0;
thread_done[4] = 0;
thread_done[5] = 0;
}
}
}
}
}
I expected that since each worker thread only modifies one int in the thread_done array, and since the main thread only ever reads it until all worker threads are waiting, this if (threadone_private == threadcount) should be bulletproof... Apparently I'm missing something important that goes wrong whenever I change this:
threadone_private = thread_done[0] + thread_done[1] + thread_done[2] + thread_done[3] + thread_done[4] + thread_done[5];
cout << threadone_private << endl;
if (threadone_private == threadcount) {
To this:
threadone_private = thread_done[0] + thread_done[1] + thread_done[2] + thread_done[3] + thread_done[4] + thread_done[5];
//cout << threadone_private << endl;
if (threadone_private == threadcount) {
Disclaimer: Concurrent code is quite complicated and easy to get wrong, so it's generally a good idea to use higher level abstractions. There are a whole lot of details that are easy to get wrong without ever noticing. You should think very carefully about doing such low-level programming if you're not an expert. Sadly C++ lacks good built-in high level concurrent constructs, but there are libraries out there that handle this.
It's unclear to me what the whole code is supposed to do anyway. As far as I can see, whether the code ever stops relies purely on timing, even if you did the synchronization correctly, which is completely non-deterministic. Your threads could execute in such a way that thread_done is never all true.
But apart from that, there is at least one correctness issue: you're reading and writing int thread_done[6] = { 0,0,0,0,0,0 }; without synchronization. This is undefined behavior, so the compiler can do what it wants.
What probably happens is that the compiler sees that it can cache the value of threadone_private, since the thread never writes to it, so the value cannot (legally) change. The external call to std::cout means it can't be sure that the value isn't changed behind its back, so it has to re-read the value on each iteration (also, std::cout uses locks, which causes synchronization in most implementations, which again limits what the compiler can assume).
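To illustrate the point (my own sketch, not part of the original answer, and only addressing the visibility problem, not the overall design): making the flag array atomic forces every read to be a real, synchronized load, so the debug line is no longer needed for the check to see fresh values.
#include <atomic>

std::atomic<int> thread_done[6]; // at namespace scope these start at 0

// worker side: publish "I'm done with my slot"
thread_done[thread_number] = 1;

// main-thread side: every load is now a real memory read, repeated on each loop iteration
int threadone_private = 0;
for (int i = 0; i < 6; ++i)
    threadone_private += thread_done[i];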
I cannot see any std::mutex, std::condition_variable or variants of std::lock in your code. Doing multithreading without any of those will never succeed reliably. Because whenever multiple threads modify the same data, you need to make sure only one thread (including your main thread) has access to that data at any given time.
Edit: I noticed you use atomic. I do not have any experience with this, however I know using mutexes works reliably.
Therefore, you need to lock every access (read or write) to that data with a mutex like this:
//somewhere
std::mutex myMutex;
std::condition_variable myCondition;
int workersDone = 0;
/* main thread */
createWorkerThread1();
createWorkerThread2();
{
std::unique_lock<std::mutex> lock(myMutex); //waits until mutex is locked.
while(workersDone != 2) {
myCondition.wait(lock); //the mutex is unlocked while waiting
}
std::cout << "the data is ready now" << std::endl;
} //the lock is destroyed, unlocking the mutex
/* Worker thread */
while(true) {
{
std::unique_lock<std::mutex> lock(myMutex); //waits until mutex is locked
if(read_or_modify_a_piece_of_shared_data() == DATA_FINISHED) {
break; //lock leaves the scope, unlocks the mutex
}
}
prepare_everything_for_the_next_piece_of_shared_data(); //DO NOT access data here
}
//data is processed
++workersDone;
myCondition.notify_one(); //no mutex here. This wakes up the waiting thread
I hope this gives you an idea on how to use mutexes and condition variables to gain thread safety.
Disclaimer: 100% pseudo code ;)
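For what it's worth, here is a minimal compilable version of that pattern (my own sketch, with the actual work elided); the main differences from the pseudocode are that workersDone is only modified while the mutex is held and that the wait uses a predicate:
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex myMutex;
std::condition_variable myCondition;
int workersDone = 0; // shared state, only touched while holding myMutex

void worker()
{
    // ... do this worker's share of the work here ...
    {
        std::lock_guard<std::mutex> lock(myMutex);
        ++workersDone; // modify the shared counter under the mutex
    }
    myCondition.notify_one(); // wake the waiting main thread
}

int main()
{
    std::thread t1(worker);
    std::thread t2(worker);
    {
        std::unique_lock<std::mutex> lock(myMutex);
        // The predicate guards against spurious wakeups; the mutex is unlocked while waiting.
        myCondition.wait(lock, []{ return workersDone == 2; });
        std::cout << "the data is ready now" << std::endl;
    }
    t1.join();
    t2.join();
}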

c++ threading, duplicate/missing threads

I'm trying to write a program that concurrently adds and removes items from a "storehouse". I have a "Monitor" class that handles the "storehouse" operations:
class Monitor
{
private:
mutex m;
condition_variable cv;
vector<Storage> S;
int counter = 0;
bool busy = false;;
public:
void add(Computer c, int index) {
unique_lock <mutex> lock(m);
if (busy)
cout << "Thread " << index << ": waiting for !busy " << endl;
cv.wait(lock, [&] { return !busy; });
busy = true;
cout << "Thread " << index << ": Request: add " << c.CPUFrequency << endl;
for (int i = 0; i < counter; i++) {
if (S[i].f == c.CPUFrequency) {
S[i].n++;
busy = false; cv.notify_one();
return;
}
}
Storage s;
s.f = c.CPUFrequency;
s.n = 1;
// put the new item in a sorted position
S.push_back(s);
counter++;
busy = false; cv.notify_one();
}
}
The threads are created like this:
void doThreadStuff(vector<Computer> P, vector <Storage> R, Monitor &S)
{
int Pcount = P.size();
vector<thread> myThreads;
myThreads.reserve(Pcount);
for (atomic<size_t> i = 0; i < Pcount; i++)
{
int index = i;
Computer c = P[index];
myThreads.emplace_back([&] { S.add(c, index); });
}
for (size_t i = 0; i < Pcount; i++)
{
myThreads[i].join();
}
// printing results
}
Running the program produced the following results:
I'm familiar with race conditions, but this doesn't look like one to me. My bet would be on something reference related, because in the results we can see that for every "missing thread" (threads 1, 3, 10, 25) I get "duplicate threads" (threads 2, 9, 24, 28).
I have tried to create local variables in functions and loops but it changed nothing.
I have heard about threads sharing memory regions, but my previous work should have produced similar results, so I don't think that's the case here, but feel free to prove me wrong.
I'm using Visual Studio 2017
Here you capture local variables by reference in a loop; they are destroyed at the end of every iteration, causing undefined behavior:
for (atomic<size_t> i = 0; i < Pcount; i++)
{
int index = i;
Computer c = P[index];
myThreads.emplace_back([&] { S.add(c, index); });
}
You should capture index and c by value:
myThreads.emplace_back([&S, index, c] { S.add(c, index); });
Another approach would be to pass S, i and c as arguments instead of capturing them by defining the following non-capturing lambda, th_func:
auto th_func = [](Monitor &S, int index, Computer c){ S.add(c, index); };
This way you have to explicitly wrap the arguments that must be passed by reference to the thread's callable object with std::reference_wrapper by means of the function template std::ref(). In your case, only S:
for (atomic<size_t> i = 0; i < Pcount; i++) {
int index = i;
Computer c = P[index];
myThreads.emplace_back(th_func, std::ref(S), index, c);
}
Failing to wrap with std::reference_wrapper the arguments that must be passed by reference will result in a compile-time error. That is, the following won't compile:
myThreads.emplace_back(th_func, S, index, c); // <-- it should be std::ref(S)
See also this question.

C++ Syncing threads in most elegant way

I am trying to solve the following problem. I know there are multiple solutions, but I'm looking for the most elegant way (the least code) to solve it.
I've got 4 threads; 3 of them try to write a unique value (0, 1, or 2) to a volatile integer variable in an infinite loop, and the fourth thread tries to read the value of this variable and print it to stdout, also in an infinite loop.
I'd like to sync the threads so that the thread that writes 0 runs, then the "print" thread, then the thread that writes 1, then the print thread again, and so on...
So what I finally expect to see in the output of the "print" thread is a sequence of zeros, then a sequence of 1s, then 2s, then 0s, and so on...
What is the most elegant and easy way to sync these threads?
This is the program code:
volatile int value;
int thid[4];
int main() {
HANDLE handle[4];
for (int ii=0;ii<4;ii++) {
thid[ii]=ii;
handle[ii] = (HANDLE) CreateThread( NULL, 0, (LPTHREAD_START_ROUTINE) ThreadProc, &thid[ii], 0, NULL);
}
return 0;
}
void WINAPI ThreadProc( LPVOID param ) {
int h=*((int*)param);
switch (h) {
case 3:
while(true) {
cout << value << endl;
}
break;
default:
while(true) {
// setting a unique value to the volatile variable
value=h;
}
break;
}
}
Your problem can be solved with the producer-consumer pattern.
I got inspired by Wikipedia, so here is the link if you want some more details:
https://en.wikipedia.org/wiki/Producer%E2%80%93consumer_problem
I used a random number generator to produce the values for the volatile variable, but you can change that part.
Here is the code: it can be improved in terms of style (using C++11 for the random numbers), but it produces what you expect.
#include <iostream>
#include <sstream>
#include <vector>
#include <stack>
#include <thread>
#include <mutex>
#include <atomic>
#include <condition_variable>
#include <chrono>
#include <stdlib.h> /* srand, rand */
using namespace std;
//random number generation
std::mutex mutRand;//mutex for random number generation (given that the random generator is not thread safe).
int GenerateNumber()
{
std::lock_guard<std::mutex> lk(mutRand);
return rand() % 3;
}
// print function for "thread safe" printing using a stringstream
void print(ostream& s) { cout << s.rdbuf(); cout.flush(); s.clear(); }
// Constants
//
const int num_producers = 3; //the three producers of random numbers
const int num_consumers = 1; //the only consumer
const int producer_delay_to_produce = 10; // in miliseconds
const int consumer_delay_to_consume = 30; // in miliseconds
const int consumer_max_wait_time = 200; // in miliseconds - max time that a consumer can wait for a product to be produced.
const int max_production = 1; // When producers has produced this quantity they will stop to produce
const int max_products = 1; // Maximum number of products that can be stored
//
// Variables
//
atomic<int> num_producers_working(0); // When there's no producer working the consumers will stop, and the program will stop.
stack<int> products; // The products stack, here we will store our products
mutex xmutex; // Our mutex, without this mutex our program will cry
condition_variable is_not_full; // to indicate that our stack is not full between the thread operations
condition_variable is_not_empty; // to indicate that our stack is not empty between the thread operations
//
// Functions
//
// Produce function, producer_id will produce a product
void produce(int producer_id)
{
while (true)
{
unique_lock<mutex> lock(xmutex);
int product;
is_not_full.wait(lock, [] { return products.size() != max_products; });
product = GenerateNumber();
products.push(product);
print(stringstream() << "Producer " << producer_id << " produced " << product << "\n");
is_not_empty.notify_all();
}
}
// Consume function, consumer_id will consume a product
void consume(int consumer_id)
{
while (true)
{
unique_lock<mutex> lock(xmutex);
int product;
if(is_not_empty.wait_for(lock, chrono::milliseconds(consumer_max_wait_time),
[] { return products.size() > 0; }))
{
product = products.top();
products.pop();
print(stringstream() << "Consumer " << consumer_id << " consumed " << product << "\n");
is_not_full.notify_all();
}
}
}
// Producer function, this is the body of a producer thread
void producer(int id)
{
++num_producers_working;
for(int i = 0; i < max_production; ++i)
{
produce(id);
this_thread::sleep_for(chrono::milliseconds(producer_delay_to_produce));
}
print(stringstream() << "Producer " << id << " has exited\n");
--num_producers_working;
}
// Consumer function, this is the body of a consumer thread
void consumer(int id)
{
// Wait until there is any producer working
while(num_producers_working == 0) this_thread::yield();
while(num_producers_working != 0 || products.size() > 0)
{
consume(id);
this_thread::sleep_for(chrono::milliseconds(consumer_delay_to_consume));
}
print(stringstream() << "Consumer " << id << " has exited\n");
}
//
// Main
//
int main()
{
vector<thread> producers_and_consumers;
// Create producers
for(int i = 0; i < num_producers; ++i)
producers_and_consumers.push_back(thread(producer, i));
// Create consumers
for(int i = 0; i < num_consumers; ++i)
producers_and_consumers.push_back(thread(consumer, i));
// Wait for consumers and producers to finish
for(auto& t : producers_and_consumers)
t.join();
return 0;
}
Hope that helps, tell me if you need more info or if you disagree with something :-)
And Good Bastille Day to all French people!
If you want to synchronise the threads, then use a sync object to hold each of the threads in a "ping-pong" or "tick-tock" pattern.
In C++11 you can use condition variables; the example here shows something similar to what you are asking for.
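To make that tick-tock idea concrete, here is a sketch of my own (not the linked example): one mutex, one condition variable, and a turn counter are enough to force the order writer 0, print, writer 1, print, writer 2, print, and so on:
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex m;
std::condition_variable cv;
int value = 0;       // the shared variable; protected by m, so no volatile needed
int next_writer = 0; // whose turn it is to write: 0, 1 or 2
bool ready = false;  // true when a fresh value is waiting to be printed

void writer(int id)
{
    while (true)
    {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [&]{ return !ready && next_writer == id; });
        value = id;      // write this thread's unique value
        ready = true;
        lk.unlock();
        cv.notify_all(); // wake the printer (and the other writers)
    }
}

void printer()
{
    while (true)
    {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [&]{ return ready; });
        std::cout << value << std::endl;
        ready = false;
        next_writer = (next_writer + 1) % 3; // pass the turn to the next writer
        lk.unlock();
        cv.notify_all();
    }
}

int main()
{
    std::thread w0(writer, 0), w1(writer, 1), w2(writer, 2), p(printer);
    w0.join(); w1.join(); w2.join(); p.join(); // runs forever, like the original
}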

Extend the life of threads with synchronization (C++11)

I have a program with a function which takes a pointer as its argument, and a main. The main is creating n threads, each of them running the function on a different memory area depending on the passed argument. The threads are then joined, the main performs some data mixing between the areas and creates n new threads which do the same operation as the old ones.
To improve the program I would like to keep the threads alive, removing the long time necessary to create them. The threads should sleep when the main is working and be notified when they have to run again. In the same way, the main should wait while the threads are working, as it did with join.
I cannot end up with a solid implementation of this; I always fall into a deadlock.
Simple baseline code; any hints about how to modify this would be much appreciated:
#include <thread>
#include <climits>
...
void myfunc(void * p) {
    do_something(p);
}

int main(){
    void * myp[n_threads] {a_location, another_location,...};
    std::thread mythread[n_threads];
    for (unsigned long int j=0; j < ULONG_MAX; j++) {
        for (unsigned int i=0; i < n_threads; i++) {
            mythread[i] = std::thread(myfunc, myp[i]);
        }
        for (unsigned int i=0; i < n_threads; i++) {
            mythread[i].join();
        }
        mix_data(myp);
    }
    return 0;
}
Here is a possible approach using only classes from the C++11 Standard Library. Basically, each thread you create has an associated command queue (encapsulated in std::packaged_task<> objects) which it continuously checks. If the queue is empty, the thread will just wait on a condition variable (std::condition_variable).
While data races are avoided through the use of std::mutex and std::unique_lock<> RAII wrappers, the main thread can wait for a particular job to be terminated by storing the std::future<> object associated to each submitted std::packaged_task<> and calling wait() on it.
Below is a simple program that follows this design. Comments should be sufficient to explain what it does:
#include <thread>
#include <iostream>
#include <sstream>
#include <future>
#include <queue>
#include <condition_variable>
#include <mutex>
#include <vector> // needed for std::vector<std::future<void>> below
// Convenience type definition
using job = std::packaged_task<void()>;
// Some data associated to each thread.
struct thread_data
{
int id; // Could use thread::id, but this is filled before the thread is started
std::thread t; // The thread object
std::queue<job> jobs; // The job queue
std::condition_variable cv; // The condition variable to wait for threads
std::mutex m; // Mutex used for avoiding data races
bool stop = false; // When set, this flag tells the thread that it should exit
};
// The thread function executed by each thread
void thread_func(thread_data* pData)
{
std::unique_lock<std::mutex> l(pData->m, std::defer_lock);
while (true)
{
l.lock();
// Wait until the queue won't be empty or stop is signaled
pData->cv.wait(l, [pData] () {
return (pData->stop || !pData->jobs.empty());
});
// Stop was signaled, let's exit the thread
if (pData->stop) { return; }
// Pop one task from the queue...
job j = std::move(pData->jobs.front());
pData->jobs.pop();
l.unlock();
// Execute the task!
j();
}
}
// Function that creates a simple task
job create_task(int id, int jobNumber)
{
job j([id, jobNumber] ()
{
std::stringstream s;
s << "Hello " << id << "." << jobNumber << std::endl;
std::cout << s.str();
});
return j;
}
int main()
{
const int numThreads = 4;
const int numJobsPerThread = 10;
std::vector<std::future<void>> futures;
// Create all the threads (will be waiting for jobs)
thread_data threads[numThreads];
int tdi = 0;
for (auto& td : threads)
{
td.id = tdi++;
td.t = std::thread(thread_func, &td);
}
//=================================================
// Start assigning jobs to each thread...
for (auto& td : threads)
{
for (int i = 0; i < numJobsPerThread; i++)
{
job j = create_task(td.id, i);
futures.push_back(j.get_future());
std::unique_lock<std::mutex> l(td.m);
td.jobs.push(std::move(j));
}
// Notify the thread that there is work to do...
td.cv.notify_one();
}
// Wait for all the tasks to be completed...
for (auto& f : futures) { f.wait(); }
futures.clear();
//=================================================
// Here the main thread does something...
std::cin.get();
// ...done!
//=================================================
//=================================================
// Posts some new tasks...
for (auto& td : threads)
{
for (int i = 0; i < numJobsPerThread; i++)
{
job j = create_task(td.id, i);
futures.push_back(j.get_future());
std::unique_lock<std::mutex> l(td.m);
td.jobs.push(std::move(j));
}
// Notify the thread that there is work to do...
td.cv.notify_one();
}
// Wait for all the tasks to be completed...
for (auto& f : futures) { f.wait(); }
futures.clear();
// Send stop signal to all threads and join them...
for (auto& td : threads)
{
std::unique_lock<std::mutex> l(td.m);
td.stop = true;
td.cv.notify_one();
}
// Join all the threads
for (auto& td : threads) { td.t.join(); }
}
The concept you want is the threadpool. This SO question deals with existing implementations.
The idea is to have a container for a number of thread instances. Each instance is associated with a function which polls a task queue, and when a task is available, pulls it and runs it. Once the task is over (if it terminates, but that's another problem), the thread simply loops back to the task queue.
So you need a synchronized queue, a thread class which implements the loop on the queue, an interface for the task objects, and maybe a class to drive the whole thing (the pool class).
Alternatively, you could make a very specialized thread class for the task it has to perform (with only the memory area as a parameter for instance). This requires a notification mechanism for the threads to indicate that they are done with the current iteration.
The thread main function would be a loop on that specific task, and at the end of one iteration, the thread signals its end, and wait on condition variables to start the next loop. In essence, you would be inlining the task code within the thread, dropping the need of a queue altogether.
#include <thread>
#include <mutex>
#include <condition_variable>
#include <atomic>

using namespace std;

// semaphore class based on C++11 features
class semaphore {
private:
    mutex mMutex;
    condition_variable mCond;
    int mV;
public:
    semaphore(int v): mV(v){}
    void signal(int count=1){
        unique_lock<mutex> lock(mMutex);
        mV += count;
        if (mV >= 0) mCond.notify_all(); // wake waiters once the count is no longer negative
    }
    void wait(int count = 1){
        unique_lock<mutex> lock(mMutex);
        mV -= count;
        while (mV < 0)
            mCond.wait(lock);
    }
};

template <typename Task>
class TaskThread {
    Task *mTask;
    semaphore *mSemStart, *mSemFinished;
    atomic<bool> mRunning; // atomic instead of volatile so the flag is safely visible across threads
    thread mThread;        // declared last so it starts only after the other members are initialized
public:
    TaskThread(Task *task, semaphore *start, semaphore *finish):
        mTask(task), mSemStart(start), mSemFinished(finish),
        mRunning(true),
        mThread(&TaskThread<Task>::psrun, this){}
    ~TaskThread(){ mThread.join(); }
    void run(){
        do {
            (*mTask)();
            mSemFinished->signal();
            mSemStart->wait();
        } while (mRunning);
    }
    void finish() { // end the thread after the current loop
        mRunning = false;
        mSemStart->signal(); // wake the thread if it is blocked waiting for the next round
    }
private:
    static void psrun(TaskThread<Task> *self){ self->run(); }
};

class MyTask {
public:
    MyTask(){}
    void operator()(){
        // some code here
    }
};

int main(){
    MyTask task1;
    MyTask task2;
    semaphore start(2), finished(0);
    TaskThread<MyTask> t1(&task1, &start, &finished);
    TaskThread<MyTask> t2(&task2, &start, &finished);
    for (int i = 0; i < 10; i++){
        finished.wait(2); // wait for both threads to finish the current round
        start.signal(2);  // let both threads start the next round
    }
    t1.finish();
    t2.finish();
}
The proposed (crude) implementation above relies on the Task type, which must provide an operator() (i.e. a functor-like class). I said earlier that you could incorporate the task code directly in the thread function body, but since I don't know it, I kept it as abstract as I could. There's one condition variable for the start of the threads and one for their end, both encapsulated in semaphore instances.
Seeing the other answer proposing the use of boost::barrier, I can only support this idea: make sure to replace my semaphore class with that class if possible, the reason being that it is better to rely on well tested and maintained external code rather than a self implemented solution for the same feature set.
All in all, both approaches are valid, but the former gives up a tiny bit of performance in favor of flexibility. If the task to be performed takes a sufficiently long time, the management and queue synchronization cost becomes negligible.
Update: code fixed and tested. Replaced a simple condition variable by a semaphore.
It can easily be achieved using a barrier (just a convenience wrapper over a condition variable and a counter). It basically blocks until all N threads have reached the "barrier". It then "recycles" again. Boost provides an implementation.
void myfunc(void * p, boost::barrier& start_barrier, boost::barrier& end_barrier) {
    while (!stop_condition) // You'll need to tell them to stop somehow
    {
        start_barrier.wait ();
        do_something(p);
        end_barrier.wait ();
    }
}

int main(){
    void * myp[n_threads] {a_location, another_location,...};
    boost::barrier start_barrier (n_threads + 1); // child threads + main thread
    boost::barrier end_barrier (n_threads + 1); // child threads + main thread
    std::thread mythread[n_threads];
    for (unsigned int i=0; i < n_threads; i++) {
        // std::ref is needed: the barriers must be passed by reference, not copied
        mythread[i] = std::thread(myfunc, myp[i], std::ref(start_barrier), std::ref(end_barrier));
    }
    start_barrier.wait (); // first unblock the threads
    for (unsigned long int j=0; j < ULONG_MAX; j++) {
        end_barrier.wait (); // mix_data must not execute before the threads are done
        mix_data(myp);
        start_barrier.wait (); // threads must not start new iteration before mix_data is done
    }
    return 0;
}
The following is simple code that compiles and works, performing some random computations. It implements aleguna's concept of a barrier. The task length of each thread is different, so it is really necessary to have a strong synchronization mechanism. I will try to do a pool on the same tasks and benchmark the result, and then maybe try futures as pointed out by Andy Prowl.
#include <iostream>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <chrono>
#include <complex>
#include <random>
const unsigned int n_threads=4; //varying this will not (almost) change the total amount of work
const unsigned int task_length=30000/n_threads;
const float task_length_variation=task_length/n_threads;
unsigned int rep=1000; //repetitions of tasks
class t_chronometer{
private:
std::chrono::steady_clock::time_point _t;
public:
t_chronometer(): _t(std::chrono::steady_clock::now()) {;}
void reset() {_t = std::chrono::steady_clock::now();}
double get_now() {return std::chrono::duration_cast<std::chrono::duration<double>>(std::chrono::steady_clock::now() - _t).count();}
double get_now_ms() {return
std::chrono::duration_cast<std::chrono::duration<double,std::milli>>(std::chrono::steady_clock::now() - _t).count();}
};
class t_barrier {
private:
std::mutex m_mutex;
std::condition_variable m_cond;
unsigned int m_threshold;
unsigned int m_count;
unsigned int m_generation;
public:
t_barrier(unsigned int count):
m_threshold(count),
m_count(count),
m_generation(0) {
}
bool wait() {
std::unique_lock<std::mutex> lock(m_mutex);
unsigned int gen = m_generation;
if (--m_count == 0)
{
m_generation++;
m_count = m_threshold;
m_cond.notify_all();
return true;
}
while (gen == m_generation)
m_cond.wait(lock);
return false;
}
};
using namespace std;
void do_something(complex<double> * c, unsigned int max) {
complex<double> a(1.,0.);
complex<double> b(1.,0.);
for (unsigned int i = 0; i<max; i++) {
a *= polar(1.,2.*M_PI*i/max);
b *= polar(1.,4.*M_PI*i/max);
*(c)+=a+b;
}
}
bool done=false;
void task(complex<double> * c, unsigned int max, t_barrier* start_barrier, t_barrier* end_barrier) {
while (!done) {
start_barrier->wait ();
do_something(c,max);
end_barrier->wait ();
}
cout << "task finished" << endl;
}
int main() {
t_chronometer t;
std::default_random_engine gen;
std::normal_distribution<double> dis(.0,1000.0);
complex<double> cpx[n_threads];
for (unsigned int i=0; i < n_threads; i++) {
cpx[i] = complex<double>(dis(gen), dis(gen));
}
t_barrier start_barrier (n_threads + 1); // child threads + main thread
t_barrier end_barrier (n_threads + 1); // child threads + main thread
std::thread mythread[n_threads];
unsigned long int sum=0;
for (unsigned int i=0; i < n_threads; i++) {
unsigned int max = task_length + i * task_length_variation;
cout << i+1 << "th task length: " << max << endl;
mythread[i] = std::thread(task, &cpx[i], max, &start_barrier, &end_barrier);
sum+=max;
}
cout << "total task length " << sum << endl;
complex<double> c(0,0);
for (unsigned long int j=1; j < rep+1; j++) {
start_barrier.wait (); //give to the threads the missing call to start
if (j==rep) done=true;
end_barrier.wait (); //wait for the call from each tread
if (j%100==0) cout << "cycle: " << j << endl;
for (unsigned int i=0; i<n_threads; i++) {
c+=cpx[i];
}
}
for (unsigned int i=0; i < n_threads; i++) {
mythread[i].join();
}
cout << "result: " << c << " it took: " << t.get_now() << " s." << endl;
return 0;
}

One producer, two consumers acting on one 'queue' produced by producer

Preface: I'm new to multithreaded programming, and a little rusty with C++. My requirements are to use one mutex, and two conditions mNotEmpty and mEmpty. I must also create and populate the vectors in the way mentioned below.
I have one producer thread creating a vector of random numbers of size n*2, and two consumers inserting those values into two separate vectors of size n.
I am doing the following in the producer:
Lock the mutex: pthread_mutex_lock(&mMutex1)
Wait for consumer to say vector is empty: pthread_cond_wait(&mEmpty,&mMutex1)
Push back a value into the vector
Signal the consumer that the vector isn't empty anymore: pthread_cond_signal(&mNotEmpty)
Unlock the mutex: pthread_mutex_unlock(&mMutex1)
Return to step 1
In the consumer:
Lock the mutex: pthread_mutex_lock(&mMutex1)
Check to see if the vector is empty, and if so signal the producer: pthread_cond_signal(&mEmpty)
Else insert value into one of two new vectors (depending on which thread) and remove from original vector
Unlock the mutex: pthread_mutex_unlock(&mMutex1)
Return to step 1
What's wrong with my process? I keep getting segmentation faults or infinite loops.
Edit: Here's the code:
void Producer()
{
srand(time(NULL));
for(unsigned int i = 0; i < mTotalNumberOfValues; i++){
pthread_mutex_lock(&mMutex1);
pthread_cond_wait(&mEmpty,&mMutex1);
mGeneratedNumber.push_back((rand() % 100) + 1);
pthread_cond_signal(&mNotEmpty);
pthread_mutex_unlock(&mMutex1);
}
}
void Consumer(const unsigned int index)
{
for(unsigned int i = 0; i < mNumberOfValuesPerVector; i++){
pthread_mutex_lock(&mMutex1);
if(mGeneratedNumber.empty()){
pthread_cond_signal(&mEmpty);
}else{
mThreadVector.at(index).push_back[mGeneratedNumber.at(0)];
mGeneratedNumber.pop_back();
}
pthread_mutex_unlock(&mMutex1);
}
}
I'm not sure I understand the rationale behind the way you're doing things. In the usual consumer-provider idiom, the provider pushes as many items as possible into the channel, waiting only if there is insufficient space in the channel; it doesn't wait for empty. So the usual idiom would be:
provider (to push one item):
pthread_mutex_lock( &mutex );
while ( ! spaceAvailable() ) {
pthread_cond_wait( &spaceAvailableCondition, &mutex );
}
pushTheItem();
pthread_cond_signal( &itemAvailableCondition );
pthread_mutex_unlock( &mutex );
and on the consumer side, to get an item:
pthread_mutex_lock( &mutex );
while ( ! itemAvailable() ) {
pthread_cond_wait( &itemAvailableCondition, &mutex );
}
getTheItem();
pthread_cond_signal( &spaceAvailableCondition );
pthread_mutex_unlock( &mutex );
Note that for each condition, one side signals, and the other waits. (I don't see any wait in your consumer.) And if there is more than one process on either side, I'd recommend using pthread_cond_broadcast, rather than pthread_cond_signal.
There are a number of other issues in your code. Some of them look more like typos: you should copy/paste actual code to avoid this. Do you really mean to read and pop mGeneratedValues, when you push into mGeneratedNumber, and check whether that is empty? (If you actually do have two different queues, then you're popping from a queue where no one has pushed.) And you don't have any loops waiting for the conditions; you keep iterating through the number of elements you expect (incrementing the counter each time, so you're likely to terminate long before you should). I can't see an infinite loop, but I can readily see an endless wait in pthread_cond_wait in the producer. I don't see a core dump off hand, but what happens when one of the processes terminates (probably the consumer, because it never waits for anything)? If it ends up destroying the mutex or the condition variables, you could get a core dump when another process attempts to use them.
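Putting that idiom to work on your own functions, a corrected version might look roughly like this (my sketch, keeping your names and the one-value-at-a-time handshake your requirements describe; note that every wait is now inside a loop, and each side waits on the condition the other side signals):
void Producer()
{
    srand(time(NULL));
    for (unsigned int i = 0; i < mTotalNumberOfValues; i++) {
        pthread_mutex_lock(&mMutex1);
        while (!mGeneratedNumber.empty()) {          // wait until the consumers have emptied the vector
            pthread_cond_wait(&mEmpty, &mMutex1);
        }
        mGeneratedNumber.push_back((rand() % 100) + 1);
        pthread_cond_broadcast(&mNotEmpty);          // two consumers, so broadcast
        pthread_mutex_unlock(&mMutex1);
    }
}

void Consumer(const unsigned int index)
{
    for (unsigned int i = 0; i < mNumberOfValuesPerVector; i++) {
        pthread_mutex_lock(&mMutex1);
        while (mGeneratedNumber.empty()) {           // wait until the producer has pushed something
            pthread_cond_wait(&mNotEmpty, &mMutex1);
        }
        mThreadVector.at(index).push_back(mGeneratedNumber.at(0));
        mGeneratedNumber.pop_back();
        pthread_cond_signal(&mEmpty);                // tell the producer there is room again
        pthread_mutex_unlock(&mMutex1);
    }
}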
In producer, call pthread_cond_wait only when queue is not empty. Otherwise you get blocked forever due to a race condition.
You might want to consider taking mutex only after condition is fulfilled, e.g.
producer()
{
while true
{
waitForEmpty();
takeMutex();
produce();
releaseMutex();
}
}
consumer()
{
while true
{
waitForNotEmpty();
takeMutex();
consume();
releaseMutex();
}
}
Here is a solution to a problem similar to yours. In this program the producer produces a number, writes it to an array (the buffer) and to a file, then updates a status array entry for that slot. When data appears in the buffer, the consumers start to consume it (each reads the value and writes it to its own file) and update the status to say it has been consumed. When the producer sees that both consumers have consumed the data in a slot, it overwrites it with a new value and goes on. For convenience I have restricted the code to run for 2000 numbers.
// Producer-consumer //
#include <iostream>
#include <fstream>
#include <pthread.h>
#define MAX 100
using namespace std;
int dataCount = 2000;
int buffer_g[100];
int status_g[100];
void *producerFun(void *);
void *consumerFun1(void *);
void *consumerFun2(void *);
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t dataNotProduced = PTHREAD_COND_INITIALIZER;
pthread_cond_t dataNotConsumed = PTHREAD_COND_INITIALIZER;
int main()
{
for(int i = 0; i < MAX; i++)
status_g[i] = 0;
pthread_t producerThread, consumerThread1, consumerThread2;
int retProducer = pthread_create(&producerThread, NULL, producerFun, NULL);
int retConsumer1 = pthread_create(&consumerThread1, NULL, consumerFun1, NULL);
int retConsumer2 = pthread_create(&consumerThread2, NULL, consumerFun2, NULL);
pthread_join(producerThread, NULL);
pthread_join(consumerThread1, NULL);
pthread_join(consumerThread2, NULL);
return 0;
}
void *producerFun(void *)
{
//file to write produced data by producer
const char *producerFileName = "producer.txt";
ofstream producerFile(producerFileName);
int index = 0, producerCount = 0;
while(1)
{
pthread_mutex_lock(&mutex);
if(index == MAX)
{
index = 0;
}
if(status_g[index] == 0)
{
static int data = 0;
data++;
cout << "Produced: " << data << endl;
buffer_g[index] = data;
producerFile << data << endl;
status_g[index] = 5;
index ++;
producerCount ++;
pthread_cond_broadcast(&dataNotProduced);
}
else
{
cout << ">> Producer is in wait.." << endl;
pthread_cond_wait(&dataNotConsumed, &mutex);
}
pthread_mutex_unlock(&mutex);
if(producerCount == dataCount)
{
producerFile.close();
return NULL;
}
}
}
void *consumerFun1(void *)
{
const char *consumerFileName = "consumer1.txt";
ofstream consumerFile(consumerFileName);
int index = 0, consumerCount = 0;
while(1)
{
pthread_mutex_lock(&mutex);
if(index == MAX)
{
index = 0;
}
if(status_g[index] != 0 && status_g[index] != 2)
{
int data = buffer_g[index];
cout << "Cosumer1 consumed: " << data << endl;
consumerFile << data << endl;
status_g[index] -= 3;
index ++;
consumerCount ++;
pthread_cond_signal(&dataNotConsumed);
}
else
{
cout << "Consumer1 is in wait.." << endl;
pthread_cond_wait(&dataNotProduced, &mutex);
}
pthread_mutex_unlock(&mutex);
if(consumerCount == dataCount)
{
consumerFile.close();
return NULL;
}
}
}
void *consumerFun2(void *)
{
const char *consumerFileName = "consumer2.txt";
ofstream consumerFile(consumerFileName);
int index = 0, consumerCount = 0;
while(1)
{
pthread_mutex_lock(&mutex);
if(index == MAX)
{
index = 0;
}
if(status_g[index] != 0 && status_g[index] != 3)
{
int data = buffer_g[index];
cout << "Consumer2 consumed: " << data << endl;
consumerFile << data << endl;
status_g[index] -= 2;
index ++;
consumerCount ++;
pthread_cond_signal(&dataNotConsumed);
}
else
{
cout << ">> Consumer2 is in wait.." << endl;
pthread_cond_wait(&dataNotProduced, &mutex);
}
pthread_mutex_unlock(&mutex);
if(consumerCount == dataCount)
{
consumerFile.close();
return NULL;
}
}
}
There is only one remaining problem: the producer is not independent when producing, that is, it needs to take a lock on the whole array (buffer) before it produces new data, and if the mutex is locked by a consumer it has to wait for it, and vice versa. I am still trying to find a way around that.