Pthread synchronization with barrier - c++

I am trying to synchronize a function I am parallelizing with pthreads.
The issue is that I am getting a deadlock: a thread exits the function while other threads are still waiting at the barrier for the thread that exited. I am unsure whether the pthread_barrier structure takes care of this. Here is an example:
static pthread_barrier_t barrier;

static void* foo(void* arg) {
    // beg and end are assumed to be derived from arg
    for (int i = beg; i < end; i++) {
        if (i > 0) {
            pthread_barrier_wait(&barrier);
        }
    }
    return NULL;
}
int main() {
    // create pthread barrier
    pthread_barrier_init(&barrier, NULL, NUM_THREADS);

    // create thread handles
    //...

    // create threads
    for (int i = 0; i < NUM_THREADS; i++) {
        pthread_create(&thread_handles[i], NULL, &foo, (void*) i);
    }

    // join the threads
    for (int i = 0; i < NUM_THREADS; i++) {
        pthread_join(thread_handles[i], NULL);
    }
}
Here is a solution I tried for foo, but it didn't work (note NUM_THREADS_COPY is a copy of the NUM_THREADS constant, and is decremented whenever a thread reaches the end of the function):
static void* foo(void* arg) {
    for (int i = beg; i < end; i++) {
        if (i > 0) {
            pthread_barrier_wait(&barrier);
        }
    }
    pthread_barrier_init(&barrier, NULL, --NUM_THREADS_COPY);
    return NULL;
}
Is there a solution to updating the number of threads to wait in a barrier for when a thread exits a function?

You need to decide how many threads it will take to pass the barrier before any threads arrive at it. Undefined behavior results from re-initializing the barrier while there are threads waiting at it. Among the plausible manifestations are that some of the waiting threads are prematurely released or that some of the waiting threads never get released, but those are by no means the only unwanted things that could happen. In any case ...
Is there a solution to updating the number of threads to wait in a barrier for when a thread exits a function?
... no, pthreads barriers do not support that.
Since a barrier seems not to be flexible enough for your needs, you probably want to fall back to the general-purpose thread synchronization object: a condition variable (used together with a mutex and some kind of shared variable).
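For illustration, here is a minimal sketch of such a condition-variable-based barrier whose participant count can shrink. The flex_barrier type and the fb_* functions are names made up for this example; they are not part of pthreads, and a thread that is finished must call fb_leave() so the remaining threads stop waiting for it.

#include <pthread.h>

// Hypothetical "flexible barrier": a mutex + condition variable + counters.
struct flex_barrier {
    pthread_mutex_t mtx;
    pthread_cond_t  cv;
    unsigned participants;   // threads still taking part
    unsigned arrived;        // threads waiting in the current cycle
    unsigned long cycle;     // generation counter, guards against spurious wakeups
};

void fb_init(flex_barrier* b, unsigned n) {
    pthread_mutex_init(&b->mtx, NULL);
    pthread_cond_init(&b->cv, NULL);
    b->participants = n;
    b->arrived = 0;
    b->cycle = 0;
}

void fb_wait(flex_barrier* b) {
    pthread_mutex_lock(&b->mtx);
    unsigned long my_cycle = b->cycle;
    if (++b->arrived == b->participants) {   // last arrival releases everybody
        b->arrived = 0;
        b->cycle++;
        pthread_cond_broadcast(&b->cv);
    } else {
        while (my_cycle == b->cycle)         // wait for the cycle to advance
            pthread_cond_wait(&b->cv, &b->mtx);
    }
    pthread_mutex_unlock(&b->mtx);
}

void fb_leave(flex_barrier* b) {             // call when a thread exits the loop
    pthread_mutex_lock(&b->mtx);
    b->participants--;
    if (b->participants > 0 && b->arrived == b->participants) {
        b->arrived = 0;                      // the leaver completed the cycle
        b->cycle++;
        pthread_cond_broadcast(&b->cv);
    }
    pthread_mutex_unlock(&b->mtx);
}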

Related

How to make two threads take turns executing their respective critical sections after one thread ends

In modern C++ with STL threads I want to have two worker threads that take turns doing their work. Only one can be working at a time and each may only get one turn before the other takes a turn. I have this part working.
The added constraint is that one thread needs to keep taking turns after the other thread finishes. But in my code the remaining worker thread deadlocks after the first worker thread finishes. I don't understand why, given that the last thing the first worker did was unlock the mutex and notify the condition variable, which should have woken the second one up. Here's the code:
{
    std::mutex mu;
    std::condition_variable cv;
    int turn = 0;

    auto thread_func = [&](int tid, int iters) {
        std::unique_lock<std::mutex> lk(mu);
        lk.unlock();
        for (int i = 0; i < iters; i++) {
            lk.lock();
            cv.wait(lk, [&] { return turn == tid; });
            printf("tid=%d turn=%d i=%d/%d\n", tid, turn, i, iters);
            fflush(stdout);
            turn = !turn;
            lk.unlock();
            cv.notify_all();
        }
    };

    auto th0 = std::thread(thread_func, 0, 20);
    auto th1 = std::thread(thread_func, 1, 25); // Does more iterations
    printf("Made the threads.\n");
    fflush(stdout);
    th0.join();
    th1.join();
    printf("Both joined.\n");
    fflush(stdout);
}
I don't know whether this is something I don't understand about concurrency in STL threads, or whether I just have a logic bug in my code. Note that there is a question on SO that's similar to this, but without the second worker having to run longer than the first. I can't find it right now to link to it. Thanks in advance for your help.
When one thread is done, the other will wait for a notification that nobody will send. When only one thread is left, you need to either stop using the condition variable or signal the condition variable some other way.
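As a minimal sketch of the second suggestion applied to the code above: a finished counter (an addition for this example, not from the original post) lets the surviving thread's wait predicate pass once its partner has exited, instead of waiting for a turn change that will never come.

#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <thread>

int main() {
    std::mutex mu;
    std::condition_variable cv;
    int turn = 0;
    int finished = 0;  // how many workers have exited their loop

    auto thread_func = [&](int tid, int iters) {
        for (int i = 0; i < iters; i++) {
            std::unique_lock<std::mutex> lk(mu);
            // Proceed when it is our turn OR the other thread is gone.
            cv.wait(lk, [&] { return turn == tid || finished > 0; });
            std::printf("tid=%d turn=%d i=%d/%d\n", tid, turn, i, iters);
            turn = !turn;
            lk.unlock();
            cv.notify_all();
        }
        {
            std::lock_guard<std::mutex> lk(mu);
            ++finished;  // tell the partner not to wait for this thread any more
        }
        cv.notify_all();
    };

    std::thread th0(thread_func, 0, 20);
    std::thread th1(thread_func, 1, 25);
    th0.join();
    th1.join();
}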

How to run a function on a separate thread, if a thread is available

How can I run a function on a separate thread if a thread is available, assuming that I always want k threads running at the same time at any point?
Here's the pseudo-code:
For i = 1 to N
    IF numberOfRunningThreads < k
        // run foo() on another thread
    ELSE
        // run foo()
In summary, once a thread is finished it notifies the other threads that there's a thread available that any of the other threads can use. I hope the description was clear.
My personal approach: just create the k threads and let them call foo repeatedly. You need some counter, protected against race conditions, that is decremented each time before foo is called by any thread. As soon as the desired number of calls has been performed, the threads will exit one after the other (incomplete/pseudo code):
#include <mutex>
#include <thread>
#include <vector>

unsigned int global_counter = 0;  // set by runThreads
std::mutex global_counter_mutex;

void fooRunner()
{
    for (;;)
    {
        {
            std::lock_guard<std::mutex> g(global_counter_mutex);
            if (global_counter == 0)
                break;
            --global_counter;
        }
        foo();
    }
}

void runThreads(unsigned int n, unsigned int k)
{
    global_counter = n;
    std::vector<std::thread> threads(std::min(n, k - 1));
    // k - 1: current thread can be reused, too...
    // (provided it has no other tasks to perform)
    for (auto& t : threads)
    {
        t = std::thread(&fooRunner);
    }
    fooRunner();
    for (auto& t : threads)
    {
        t.join();
    }
}
If you have data to pass to the foo function, then instead of a counter you could use e.g. a FIFO or LIFO queue, whatever appears most appropriate for the given use case. Threads then exit as soon as the buffer gets empty; you'd have to prevent the buffer from running empty prematurely, though, e.g. by prefilling it with all the data to be processed before starting the threads.
A variant might be a combination of both: exiting if the global counter reaches 0, and otherwise waiting (e.g. via a condition variable) for the queue to receive new data, with the main thread continuously filling the queue while the worker threads are already running...
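A minimal sketch of the queue variant with a prefilled buffer; job_queue, job_queue_mutex, and the int job type are placeholders made up here for whatever foo actually needs (incomplete code in the same spirit as above):

#include <mutex>
#include <queue>

std::queue<int> job_queue;        // prefilled with all jobs before the threads start
std::mutex job_queue_mutex;

void foo(int job);                // provided elsewhere

void fooRunner()
{
    for (;;)
    {
        int job;
        {
            std::lock_guard<std::mutex> g(job_queue_mutex);
            if (job_queue.empty())
                break;            // buffer empty: this thread exits
            job = job_queue.front();
            job_queue.pop();
        }
        foo(job);                 // do the work outside the lock
    }
}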
You can use std::thread (from <thread>) and locks to do what you want, but it seems to me that your code could simply be made parallel using OpenMP, like this:
#pragma omp parallel num_threads(k)
#pragma omp for
for (unsigned i = 0; i < N; ++i)
{
    auto t_id = omp_get_thread_num();
    if (t_id < K)
        foo();
    else
        other_foo();
}

Threads in for loop not working correctly

I want to make a program that gets the ids from a database and creates a thread with the same function for each id. It works, but when I add a while loop to the function it just hangs there and doesn't get the next ids.
My code is:
void foo(char* i) {
    while (1) {
        std::cout << i;
    }
}

void makeThreads()
{
    int i;
    MYSQL *sqlhnd = mysql_init(NULL);
    mysql_real_connect(sqlhnd, "127.0.0.1", "root", "h0flcepqE", "Blazor", 3306, NULL, 0);
    mysql_query(sqlhnd, "SELECT id FROM `notifications`");

    MYSQL_RES *confres = mysql_store_result(sqlhnd);
    int totalrows = mysql_num_rows(confres);
    int numfields = mysql_num_fields(confres);
    MYSQL_FIELD *mfield;
    MYSQL_ROW row;

    while ((row = mysql_fetch_row(confres)))
    {
        for (i = 0; i < numfields; i++)
        {
            printf("%s", row[i]);
            std::thread t(foo, row[i]);
            t.join();
        }
    }
}

int main()
{
    makeThreads();
    return 0;
}
Output is:
1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
Thanks
The for loop in question currently creates one thread object and one thread. Period. Joining hides this problem in a way by forcing the main thread to wait for that thread to run to completion. That the thread can't finish is another issue.
Creating a thread and immediately joining forces your program to run sequentially and defeats the point of using threads. Not joining the thread will result in Bad because the thread object will be destroyed at the end of the loop and the thread has not been detached. Destroying an undetached thread is bad. std::terminate does pretty much what it sounds like it does: It hunts down and kills Sarah Connor. Just kidding. It ends your program with all the subtlety of a headsman's axe.
You could detach the threads manually by calling detach, but that's a really, really Bad Idea because you lose control of the thread and your program will exit while the threads are still running.
You need to store these threads and join them later, after the loop that spawns them.
Here's a simple approach to do that:
std::vector<std::thread> threads;
for (i = 0; i < numfields; i++)
{
    std::cout << row[i];
    threads.push_back(std::thread(foo, row[i]));
}
for (std::thread& t : threads)
{
    t.join();
}
Now you will have numfields threads running forever, and I'm sure you can take care of that problem on your own.
t.join();
means the program waits there for the thread t to finish. Since t executes foo, and foo never ends (because of the while(1) loop), you never execute the instructions after the join. That is why you get the uninterrupted 111111 output.

Best way to wakeup multiple thread using pthread

I have created 4 threads with pthread_create. I want them to start running at the very same time, so I added sem_wait(&sem) at the very beginning of the thread procedure. In the main thread I could use something like this, but I don't think it is a good solution:
for (int i = 0; i < 4; i++)
{
    sem_post(&sem);
}
I googled and found pthread_cond_t. However, pthread_cond_broadcast can only wake up threads that are currently waiting. Even if I put pthread_cond_wait at the very beginning of the procedure, it is still not guaranteed that pthread_cond_wait is called before pthread_cond_broadcast (in main thread).
To avoid this, I would have to add a lot of additional code to enforce the calling sequence of wait and broadcast, which is also not elegant.
So, is there a simple way to 'line up' all the threads (make them start to run at the same time)?
There seems to be a sem_post_multiple, but it is a Win32 extension to pthreads. However, I am using Linux (Android).
You are searching for a barrier: pthread_barrier_t. You initialize it with the number of threads (n) and then call pthread_barrier_wait() from every thread. This call will block execution until n threads have reached the barrier.
Example:
int num_threads = 4;
pthread_barrier_t bar;

void* thread_start(void* arg) {
    pthread_barrier_wait(&bar);
    //...
}

int main() {
    pthread_barrier_init(&bar, NULL, num_threads);
    pthread_t thread[num_threads];
    for (int i = 0; i < num_threads; i++) {
        pthread_create(thread + i, NULL, &thread_start, NULL);
    }
    for (int i = 0; i < num_threads; i++) {
        pthread_join(thread[i], NULL);
    }
    pthread_barrier_destroy(&bar);
    return 0;
}

Boost, create thread pool before io_service.post

I was successfully testing an example using boost's io_service:
for (x = 0; x < loops; x++)
{
    // Add work to ioService.
    for (i = 0; i < number_of_threads; i++)
    {
        ioService.post(boost::bind(worker_task, data, pre_data[i]));
    }

    // Now that the ioService has work, use a pool of threads to service it.
    for (i = 0; i < number_of_threads; i++)
    {
        threadpool.create_thread(boost::bind(
            &boost::asio::io_service::run, &ioService));
    }

    // threads in the threadpool will be completed and can be joined.
    threadpool.join_all();
}
This loops several times, and it takes a little long because the threads are created anew for each iteration.
Is there a way to create all the needed threads first, then post the work for each thread inside the loop, and after the work wait until all threads have finished their work?
Something like this:
// start/create threads
for (i = 0; i < number_of_threads; i++)
{
    threadpool.create_thread(boost::bind(
        &boost::asio::io_service::run, &ioService));
}

for (x = 0; x < loops; x++)
{
    // Add work to ioService.
    for (i = 0; i < number_of_threads; i++)
    {
        ioService.post(boost::bind(worker_task, data, pre_data[i]));
    }

    // threads in the threadpool will be completed and can be joined.
    threadpool.join_all();
}
The problem here is that your worker threads will finish immediately after creation, since there is no work to be done. io_service::run() will just return right away, so unless you manage to sneak in one of the post-calls before all worker threads have had an opportunity to call run(), they will all finish right away.
Two ways to fix this:
Use a barrier to stop the workers from calling run() right away. Only unblock them once the work has been posted.
Use an io_service::work object to prevent run from returning. You can destroy the work object once you posted everything (and must do so before attempting to join the workers again).
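As a minimal sketch of the second option, reusing the names from the question (ioService, threadpool, worker_task, data, pre_data, loops, number_of_threads) and adding only the work object; treat it as an outline rather than a drop-in implementation:

#include <memory>

// Keep run() from returning while the work object is alive.
std::unique_ptr<boost::asio::io_service::work> work(
    new boost::asio::io_service::work(ioService));

// Start the pool once; run() now blocks instead of returning immediately.
for (i = 0; i < number_of_threads; i++)
{
    threadpool.create_thread(boost::bind(
        &boost::asio::io_service::run, &ioService));
}

// Post all the work.
for (x = 0; x < loops; x++)
{
    for (i = 0; i < number_of_threads; i++)
    {
        ioService.post(boost::bind(worker_task, data, pre_data[i]));
    }
}

work.reset();           // allow run() to return once the queue is drained
threadpool.join_all();  // now the joins complete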
The loop wasn't really useful.
Here is a better example showing how it works. I am getting data in a callback:
void worker_task(uint8_t* data, uint32_t len)
{
    uint32_t pos = 0;
    while (pos < len)
    {
        pos += process_data(data + pos);
    }
}

void callback_f(uint8_t* data, uint32_t len)
{
    // split data into parts
    uint32_t number_of_data_per_thread = len / number_of_threads;

    // Add work to ioService.
    uint32_t x = 0;
    for (i = 0; i < number_of_threads; i++)
    {
        ioService.post(boost::bind(worker_task, data + x, number_of_data_per_thread));
        x += number_of_data_per_thread;
    }

    // Now that the ioService has work, use a pool of threads to service it.
    for (i = 0; i < number_of_threads; i++)
    {
        threadpool.create_thread(boost::bind(
            &boost::asio::io_service::run, &ioService));
    }

    // threads in the threadpool will be completed and can be joined.
    threadpool.join_all();
}
So this callback gets called very often by the host application (a media stream). If the incoming len is big enough, the thread pool makes sense, because the working time is greater than the time needed to initialize and start the threads.
If the len of the data is small, the advantage of the thread pool is lost, because initializing and starting the threads takes more time than processing the data.
My question now is whether it is possible to have the threads already running and waiting for data. When the callback gets called, push the data to the threads and wait for them to finish. The number of threads is constant (the CPU count).
And because it is a callback from a host application, the pointer to the data is only valid while inside the callback function. This is why I have to wait until all threads have finished their work.
A thread can start working immediately after getting its data, even before the other threads have started. There is no sync problem because every thread has its own memory area of the data.