pthreads program works for some time and then stalls - c++

There is a program I am working on that, after I launch it, works for some time and then stalls. Here is a simplified version of the program:
#include <cstdlib>
#include <iostream>
#include <pthread.h>

pthread_t* thread_handles;
pthread_mutex_t mutex;
pthread_cond_t cond_var = PTHREAD_COND_INITIALIZER;
int thread_count;
const int some_count = 77;
const int numb_count = 5;
int countR = 0;

//Initialize threads
void InitTh(char* arg[]){
    /* Get number of threads */
    thread_count = strtol(arg[1], NULL, 10);
    /* Allocate space for threads */
    thread_handles = (pthread_t*) malloc(thread_count*sizeof(pthread_t));
}

//Terminate threads
void TermTh(){
    for(long thread = 0; thread < thread_count; thread++)
        pthread_join(thread_handles[thread], NULL);
    free(thread_handles);
}

void* DO_WORK(void* replica) {
    /* Does something */
    pthread_mutex_lock(&mutex);
    countR++;
    if (countR == numb_count) pthread_cond_broadcast(&cond_var);
    pthread_mutex_unlock(&mutex);
}

//Some function
void FUNCTION(){
    pthread_mutex_init(&mutex, NULL);
    for(int k = 0; k < some_count; k++){
        for(int j = 0; j < numb_count; j++){
            long thread = (long) j % thread_count;
            pthread_create(&thread_handles[thread], NULL, DO_WORK, (void *)j);;
        }
        /* Wait for threads to finish their jobs */
        pthread_mutex_lock(&mutex);
        if (countR < numb_count) while(pthread_cond_wait(&cond_var,&mutex) != 0);
        countR = 0;
        pthread_mutex_unlock(&mutex);
        /* Does more work */
    }
    pthread_cond_destroy(&cond_var);
    pthread_mutex_destroy(&mutex);
}

int main(int argc, char* argv[]) {
    /* Initialize threads */
    InitTh(argv);
    /* Do some work */
    FUNCTION();
    /* Terminate threads */
    TermTh();
    return 0;
}
When some_count (in my particular case) is less than 76, the program works fine, but if I specify a larger value, the program, as mentioned earlier, works for some time and then stalls. Can somebody point out what I am doing wrong?

In
long thread = (long) j % thread_count;
pthread_create(&thread_handles[thread], NULL, DO_WORK, (void *)j);;
you can "override" initialized thread handles, depending on your actual thread count parameter.

I think you should set the number of threads to numb_count rather than taking it from argv,
then replace
long thread = (long) j % thread_count;
with
long thread = (long) j;
I'm not sure this alone fixes it, but it's needed anyway.
Moreover, it's not really about the number 76 or 77: you have a race condition in how the threads are reused.
Let's say one of your threads has reached the point in DO_WORK where it has unlocked the mutex but has not yet returned from the function (meaning the thread is still running). Then, in the next iteration, you may try to create a new thread into the same handle using:
pthread_create(&thread_handles[thread], NULL, DO_WORK, (void *)j);
To fix it, change:
pthread_mutex_lock(&mutex);
if (countR < numb_count) while(pthread_cond_wait(&cond_var,&mutex) != 0);
countR = 0;
pthread_mutex_unlock(&mutex);
to:
pthread_mutex_lock(&mutex);
if (countR < numb_count) while(pthread_cond_wait(&cond_var,&mutex) != 0);
countR = 0;
for(long thread = 0; thread < numb_count; thread++)
pthread_join(thread_handles[thread], NULL);
pthread_mutex_unlock(&mutex);
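A side note on the wait itself: pthread_cond_wait can return zero on a spurious wakeup, so looping on its return value never re-checks countR. The usual pattern (a fragment meant to slot into the code above, nothing more) loops on the predicate instead:
pthread_mutex_lock(&mutex);
while (countR < numb_count)        /* re-check the condition, not the return value */
    pthread_cond_wait(&cond_var, &mutex);
countR = 0;
pthread_mutex_unlock(&mutex);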

You could try to analyze it using helgrind.
Install valgrind, then launch valgrind --tool=helgrind yourproject and see what helgrind spits out

You are neither initializing your mutex correctly (though that is not what causes the error here), nor storing the threads you create correctly. Try this:
for(int count = 0; count < thread_count; ++count) {
    pthread_create(&thread_handles[count], NULL, DO_WORK, (void *)(count % numb_count));
}
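Pulling the answers together, here is a minimal sketch (not the original code, and only one possible arrangement): every thread created in an iteration gets its own handle slot, and the batch is joined before the next iteration starts, which removes the need for the condition variable and the global handle array entirely.
#include <pthread.h>

const int some_count = 77;   // same constants as in the question
const int numb_count = 5;

void* DO_WORK(void* replica) {
    /* does something with the replica index */
    return NULL;
}

void FUNCTION() {
    pthread_t handles[numb_count];            // one slot per thread actually created
    for (int k = 0; k < some_count; k++) {
        for (long j = 0; j < numb_count; j++)
            pthread_create(&handles[j], NULL, DO_WORK, (void *)j);
        for (long j = 0; j < numb_count; j++)
            pthread_join(handles[j], NULL);   // wait for this batch before the next one
        /* does more work */
    }
}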

Related

Random results when using pthread

I wrote a simple program using pthread but my results are random....
#include <iostream>
#include <cstdlib>
#include <pthread.h>
using namespace std;
#define NTHREADS 2
pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;

void *add(void* numbers){
    pthread_mutex_lock( &mutex1 );
    int *n = (int*) numbers;
    float sum;
    for(int i = 0; i < 5; i++){
        sum = sum + n[i] + 5;
    }
    cout << sum/5 << endl;
    pthread_mutex_unlock( &mutex1 );
}

void *substract(void* numbers){
    pthread_mutex_lock( &mutex1 );
    int *n = (int*) numbers;
    float sum;
    for(int i = 0; i < 5; i++){
        sum = sum + n[i] - 10;
    }
    cout << sum/5 << endl;
    pthread_mutex_unlock( &mutex1 );
}

int main(){
    pthread_t thread_id[NTHREADS];
    int i, j;
    int *numbers = new int[5];
    numbers[0] = 34; numbers[1] = 2; numbers[2] = 77; numbers[3] = 40; numbers[4] = 12;
    pthread_create( &thread_id[0], NULL, add, (void*) numbers);
    pthread_create( &thread_id[1], NULL, substract, (void*) numbers );
    pthread_join( thread_id[0], NULL);
    pthread_join( thread_id[1], NULL);
    exit(EXIT_SUCCESS);
}
The output of the program is random. Sometimes I get
-2.42477e+26
23
Sometimes I get only one strange number, such as
235.69118e+13
(empty space)
I have also tried using only one thread, but the result is still random. For example, when I used a single thread just to calculate add, the result was sometimes 38, which is correct, but sometimes a very strange number.
Where did I go wrong? Thank you.
The reason for the random numbers, as I told you in your previous question, is that you do not initialize sum before using it. There are other issues with your code as well (see the comments), but they are not directly responsible for the random results.
You also do not need a mutex at all in your current code. As a matter of fact, by using a mutex this way you have made your application effectively single-threaded, throwing away all the benefits of multithreading. The only place where you might need a mutex is around the cout call, to ensure the output is not interleaved.
There are various things you need to fix in your code, but the most burning issue, and the one causing your problem, is the use of an uninitialized variable: sum.
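For reference, a minimal sketch of add with that fix applied (substract needs the same one-line change); following the advice above, it also narrows the lock to just the cout call:
void *add(void* numbers) {
    int *n = (int*) numbers;
    float sum = 0;                      // initialize the accumulator
    for (int i = 0; i < 5; i++)
        sum = sum + n[i] + 5;
    pthread_mutex_lock(&mutex1);        // only the shared output stream needs protection
    cout << sum/5 << endl;
    pthread_mutex_unlock(&mutex1);
    return NULL;
}
With the numbers from the question this prints 38 for add.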

OpenMP filling array with two threads in series

I have an array, and I need two threads to fill it with values one after another, using omp_set_lock and omp_unset_lock: the first thread should write the first value, then the second thread should write the second value, and so on. I have no idea how to do that, because in OpenMP you can't explicitly make one thread wait for another. Any ideas?
Why not try the omp_set_lock/omp_unset_lock functions?
omp_lock_t lock;
omp_init_lock(&lock);
bool thread1 = true;   // shared turn flag; declared before the pragma so the for loop directly follows #pragma omp parallel for
#pragma omp parallel for
for (int i = 0; i < arr.size(); ++i) {
    omp_set_lock(&lock);
    if (thread1 == true) {
        arr[i] = fromThread1();
        thread1 = false;
    } else {
        arr[i] = fromThread2();
        thread1 = true;
    }
    omp_unset_lock(&lock);
}
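If strict index order is the real requirement, an alternative worth mentioning (not the lock-based approach above) is the ordered clause: with a chunk size of 1, two threads take turns with the iterations, and the ordered block forces the writes to happen in index order. This is only a sketch; fillFromThread1/fillFromThread2 are placeholders for whatever each thread is supposed to produce.
#include <omp.h>
#include <cstdio>
#include <vector>

int fillFromThread1(int i) { return 2 * i; }      // placeholder work
int fillFromThread2(int i) { return 2 * i + 1; }  // placeholder work

int main() {
    std::vector<int> arr(10);
    #pragma omp parallel for ordered num_threads(2) schedule(static, 1)
    for (int i = 0; i < (int)arr.size(); ++i) {
        int value = (omp_get_thread_num() == 0) ? fillFromThread1(i)
                                                : fillFromThread2(i);
        #pragma omp ordered
        arr[i] = value;            // executed strictly in iteration order
    }
    for (int v : arr) std::printf("%d ", v);
    std::printf("\n");
    return 0;
}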

Locking a part of memory for multithreading

I'm trying to write some code that creates threads that can modify different parts of memory concurrently. I read that a mutex is usually used to lock code, but I'm not sure if I can use that in my situation. Example:
#include <mutex>
#include <thread>
#include <vector>
#include <functional>
using namespace std;
mutex m;

void func(vector<vector<int> > &a, int b)
{
    lock_guard<mutex> lk(m);
    for (int i = 0; i < 10E6; i++) { a[b].push_back(1); }
}

int main()
{
    vector<thread> threads;
    vector<vector<int> > ints(4);
    for (int i = 0; i < 10; i++)
    {
        threads.push_back(thread(func, ref(ints), i % 4));
    }
    for (int i = 0; i < 10; i++) { threads[i].join(); }
    return 0;
}
Currently, the mutex just locks the code inside func, so (I believe) every thread has to wait until the previous one is finished.
I'm trying to get the program to edit the four vectors of ints at the same time, while still making a thread wait whenever another thread is currently editing the same vector.
I think you want the following (one std::mutex per std::vector<int>):
std::mutex m[4];

void func(std::vector<std::vector<int> > &a, int index)
{
    std::lock_guard<std::mutex> lock(m[index]);
    for (int i = 0; i < 10E6; i++) {
        a[index].push_back(1);
    }
}
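Dropping those per-index mutexes back into the question's program gives, roughly, the sketch below. Threads that got different values of i % 4 now run concurrently; only the threads that share an index serialize among themselves.
#include <mutex>
#include <thread>
#include <vector>
#include <functional>

std::mutex m[4];                                  // one mutex per inner vector

void func(std::vector<std::vector<int> > &a, int index)
{
    std::lock_guard<std::mutex> lock(m[index]);   // blocks only writers of this vector
    for (int i = 0; i < 10E6; i++) {
        a[index].push_back(1);
    }
}

int main()
{
    std::vector<std::thread> threads;
    std::vector<std::vector<int> > ints(4);
    for (int i = 0; i < 10; i++) {
        threads.push_back(std::thread(func, std::ref(ints), i % 4));
    }
    for (auto &t : threads) {
        t.join();
    }
    return 0;
}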
Have you considered using a semaphore instead of a mutex?
The following questions might help you:
Semaphore Vs Mutex
When should we use mutex and when should we use semaphore
try:
void func(vector<vector<int> > &a, int b)
{
    for (int i = 0; i < 10E6; i++) {
        lock_guard<mutex> lk(m);
        a[b].push_back(1);
    }
}
You only need to lock your mutex while accessing the shared object (a). The way you implemented func means that one thread must finish running the entire loop before the next can start running.

while true for all threads

#include <stdio.h>
#include <pthread.h>
#include <iostream>
#define nThreads 5
pthread_mutex_t lock;

void *start(void *param) {
    pthread_mutex_lock(&lock);
    while (true)
    {
        //do certain things, mutex to avoid critical section problem
        int * number = (int *) param;
        std::cout << *number;
    }
    pthread_mutex_unlock(&lock);
}

int main()
{
    pthread_mutex_init(&lock, NULL);
    pthread_t tid[nThreads];
    int i = 0;
    for(i = 0; i < nThreads; i++) pthread_create(&tid[i], NULL, start, (void *) &i);
    for(i = 0; i < nThreads; i++) pthread_join(tid[i], NULL);
    pthread_mutex_destroy(&lock);
    return 0;
}
My question is whether all the threads loop infinitely or only the first thread does. If only one thread is looping, how do I make all the threads loop infinitely, and should the mutex be inside the while loop or outside?
Thanks in advance.
If the mutex is outside the loop as you've shown, then only one thread can enter that loop. If that loop runs forever (as while (true) will do if there's no break statement inside), then only one thread will actually get to loop and the rest will be locked out.
Move the mutex around just the code that you need to protect. If you want all the threads looping in parallel, taking turns accessing a common structure, move the mutex inside the loop.
In this case only one thread is in the loop, namely the first thread that manages to lock the mutex. Since it never unlocks the mutex, no other thread can enter, i.e., all the other threads wait indefinitely. I think what you want is this:
while (true)
{
    pthread_mutex_lock(&lock);
    //do certain things, mutex to avoid critical section problem
    int * number = (int *) param;
    std::cout << *number;
    pthread_mutex_unlock(&lock);
}
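For completeness, a sketch of the whole program with the lock moved inside the loop. It also gives each thread its own argument: passing &i to every thread, as in the question, makes them all read the same variable while main keeps changing it, which is a separate race.
#include <iostream>
#include <pthread.h>
#define nThreads 5

pthread_mutex_t lock;
int ids[nThreads];                         // one stable argument per thread

void *start(void *param) {
    int *number = (int *) param;
    while (true) {
        pthread_mutex_lock(&lock);         // protect only the shared output
        std::cout << *number;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main() {
    pthread_mutex_init(&lock, NULL);
    pthread_t tid[nThreads];
    for (int i = 0; i < nThreads; i++) {
        ids[i] = i;
        pthread_create(&tid[i], NULL, start, (void *) &ids[i]);
    }
    for (int i = 0; i < nThreads; i++)
        pthread_join(tid[i], NULL);        // blocks forever, since the threads never exit
    pthread_mutex_destroy(&lock);
    return 0;
}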

Unix fork: monitor child progress

I have an application where a bit of parallel processing would be of benefit. For the purposes of the discussion, let's say there is a directory with 10 text files in it, and I want to start a program, that forks off 10 processes, each taking one of the files, and uppercasing the contents of the file. I acknowledge that the parent program can wait for the children to complete using one of the wait functions, or using the select function.
What I would like to do is have the parent process monitor the progress of each forked process, and display something like a progress bar as the processes run.
My question: what reasonable alternatives do I have for the forked processes to communicate this information back to the parent? What IPC techniques would be reasonable to use?
In this kind of situation, where you only want to monitor progress, the easiest alternative is to use shared memory. Every process updates its progress value (e.g. an integer) in a shared memory block, and the master process reads the block regularly. Basically, you don't need any locking in this scheme. It is also a "polling" style application, because the master can read the information whenever it wants, so you do not need any event handling for the progress data.
If the only progress you need is "how many jobs have completed?", then a simple
while (jobs_running) {
    pid = wait(&status);
    for (i = 0; i < num_jobs; i++)
        if (pid == jobs[i]) {
            jobs_running--;
            break;
        }
    printf("%i/%i\n", num_jobs - jobs_running, num_jobs);
}
will do. For reporting progress while, well, in progress, here are quick-and-dirty implementations of some of the other suggestions.
Pipes:
#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int child(int fd) {
    int i;
    struct timespec ts;
    for (i = 0; i < 100; i++) {
        write(fd, &i, sizeof(i));
        ts.tv_sec = 0;
        ts.tv_nsec = rand() % 512 * 1000000;
        nanosleep(&ts, NULL);
    }
    write(fd, &i, sizeof(i));
    exit(0);
}

int main() {
    int fds[10][2];
    int i, j, total, status[10] = {0};
    for (i = 0; i < 10; i++) {
        pipe(fds[i]);
        if (!fork())
            child(fds[i][1]);
    }
    for (total = 0; total < 1000; sleep(1)) {
        for (i = 0; i < 10; i++) {
            struct pollfd pfds = {fds[i][0], POLLIN};
            for (poll(&pfds, 1, 0); pfds.revents & POLLIN; poll(&pfds, 1, 0)) {
                read(fds[i][0], &status[i], sizeof(status[i]));
                for (total = j = 0; j < 10; j++)
                    total += status[j];
            }
        }
        printf("%i/1000\n", total);
    }
    return 0;
}
Shared memory:
#include <semaphore.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

int child(int *o, sem_t *sem) {
    int i;
    struct timespec ts;
    for (i = 0; i < 100; i++) {
        sem_wait(sem);
        *o = i;
        sem_post(sem);
        ts.tv_sec = 0;
        ts.tv_nsec = rand() % 512 * 1000000;
        nanosleep(&ts, NULL);
    }
    sem_wait(sem);
    *o = i;
    sem_post(sem);
    exit(0);
}

int main() {
    int i, j, size, total;
    void *page;
    int *status;
    sem_t *sems;
    size = sysconf(_SC_PAGESIZE);
    /* round the needed bytes up to a whole page */
    size = (10 * sizeof(*status) + 10 * sizeof(*sems) + size - 1) & ~(size - 1);
    page = mmap(0, size, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0);
    status = page;
    sems = (void *)&status[10];
    for (i = 0; i < 10; i++) {
        status[i] = 0;
        sem_init(&sems[i], 1, 1);
        if (!fork())
            child(&status[i], &sems[i]);
    }
    for (total = 0; total < 1000; sleep(1)) {
        for (total = i = 0; i < 10; i++) {
            sem_wait(&sems[i]);
            total += status[i];
            sem_post(&sems[i]);
        }
        printf("%i/1000\n", total);
    }
    return 0;
}
Error handling etc. elided for clarity.
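Both sketches are plain C. Assuming the shared-memory one is saved as shm_progress.c (the file name is arbitrary), something like cc shm_progress.c -o shm_progress should build it; on some platforms the sem_* functions additionally require linking with -pthread or -lrt. Each run prints a line such as 537/1000 roughly once per second until all ten children have finished.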
A few options (no idea which, if any, will suit you; a lot depends on what you are actually doing, as opposed to the "uppercasing files" analogy):
signals
fifos / named pipes
the STDOUT of the children or other passed handles
message queues (if appropriate)
If all you want is a progress update, by far the easiest way is probably to use an anonymous pipe. The pipe(2) call gives you two file descriptors, one for each end of the pipe. Call it just before you fork, and have the parent read from the first descriptor and the child write to the second. (This works because the child inherits copies of the open file descriptors and of the two-element array holding them; the memory is not shared, it is copy-on-write, but both processes see the same values unless one of them overwrites the array.)
Just earlier today someone told me that they always use a pipe, by which the children can send notification to the parent process that all is going well. This seems a decent solution, and is especially useful in places where you would want to print an error, but no longer have access to stdout/stderr, etc.
Boost.MPI should be useful in this scenario. You may consider it overkill but it's definitely worth investigating:
www.boost.org/doc/html/mpi.html