C++ Boost thread sleep deadlock - c++

I have a problem with the following code:
#include <boost/thread/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <iostream>
#include <sys/types.h>
#include <sys/wait.h>
using namespace std;
void f1(uint count)
{
while(count-- > 0)
{
// boost::this_thread::sleep(boost::posix_time::millisec(1000));
sleep(1);
}
}
void folkflore()
{
int res = fork();
//parent
if ( res )
{
wait(NULL);
}
else
{
unsigned int x = 2;
boost::thread tx(boost::bind(f1, 2));
tx.join();
_exit(-5);
}
}
int main()
{
std::cout << "Main program " << getpid() << std::endl;
unsigned int x = 2;
boost::thread t1(boost::bind(f1, 2));
boost::thread m(folkflore);
m.join();
t1.join();
return 0;
}
[LATER EDIT]
OK, so it looks like boost::this_thread::sleep acquires a mutex behind the scenes, so I guess I'll stick with plain old sleep(), which is good enough for me.
[/LATER EDIT]
From main() I start a thread t1 that counts for 2 seconds, plus another thread that calls fork(): the parent waits for the child, and the child creates yet another thread that also counts for 2 seconds.
The problem is that if I use boost::this_thread::sleep, the program hangs or deadlocks somehow. If I use sleep(), it works fine. Am I doing something wrong here? What is the difference between the two?
From the man page of sleep I get that:
"sleep() makes the calling thread sleep until seconds seconds have elapsed or a signal arrives which is not ignored."
Also, from the Boost docs, boost::this_thread::sleep seems to do the same thing.

You are doing something dangerous here: fork() duplicates the whole process, but only one thread (the one that called fork()) runs in the new process. So the child has a copy of every mutex, but only one thread.
If some other thread held a mutex at the moment of the fork, that mutex stays locked in the child and nothing there will ever unlock it. When your thread tries to lock it in the new process, it waits forever.
Here:
boost::this_thread::sleep(boost::posix_time::millisec(1000));
If you look at Boost's header, sleep looks like this:
this_thread::sleep(get_system_time()+rel_time);
get_system_time() calls tz_convert from libc, which takes a mutex. It looks like another thread had locked it just before the fork, so the child blocks on it forever.
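A minimal sketch of one way to stay out of this trap, assuming the child only needs to wait and exit: after fork(), the child restricts itself to calls that POSIX lists as async-signal-safe, such as sleep() and _exit(), so it never touches a lock that another thread may have held in the parent. The helper name run_child_and_wait is hypothetical and not part of the original code.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks a child that sleeps for `seconds` and then exits
// with `exit_code`. Returns the child's exit status, or -1 on failure.
int run_child_and_wait(unsigned seconds, int exit_code) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;  // fork failed
    }
    if (pid == 0) {
        // Child: only async-signal-safe calls here. No Boost, no iostream,
        // nothing that may need a lock inherited in a locked state.
        sleep(seconds);
        _exit(exit_code);
    }
    // Parent: wait for this specific child and decode its status.
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_child_and_wait(2, 0) from the thread that currently runs folkflore() would give the same two-second wait without creating a Boost thread inside the child.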

Related

Merge two file by sorting them based by timestamp using three threads in C++

I have two file: A,B with rows as csv:timestamp, parameters.. I want to read from one thread one file and from another thread the other file and compare the timestamps in a third thread.
The goal is to construct a vector with the content of the row sorted with the timestamp of both file.
How would you approach this problem?
I did something like this, but it doesn't print out all the timestamps, and maybe I'm missing something:
#include <iostream>
#include <fstream>
#include <string>
#include <mutex>
#include <condition_variable>
#include <thread>
#include <vector>
#include <chrono>
std::mutex mtx;
std::condition_variable cv;
long long timestamp1, timestamp2;
std::vector<long long> timestamps;
bool finished1 = false, finished2 = false;
void thread2() {
std::ifstream file2("a.csv");
std::string line;
while (std::getline(file2, line)) {
std::vector<std::string> values = split(line, ',');
long long current_timestamp = std::stoll(values[4]);
{
std::unique_lock<std::mutex> lock(mtx);
while (timestamp1 >= current_timestamp) {
cv.wait(lock);
}
timestamp2 = current_timestamp;
}
cv.notify_one();
}
{
std::unique_lock<std::mutex> lock(mtx);
finished2 = true;
}
cv.notify_one();
}
void thread3() {
while (!finished1 || !finished2) {
std::unique_lock<std::mutex> lock(mtx);
cv.wait(lock);
if (finished1 && finished2) {
break;
}
if (timestamp1 >= timestamp2) {
timestamps.push_back(timestamp1);
std::cout << timestamp1 <<"\n" << std::flush;
} else {
timestamps.push_back(timestamp2);
std::cout << timestamp2 <<"\n" << std::flush;
}
}
}
#include <algorithm>
int main() {
std::thread t1(thread1);
std::thread t2(thread2);
std::thread t3(thread3);
t1.join();
t2.join();
t3.join();
std::cout << std::is_sorted(timestamps.begin(),timestamps.end());
}
The race condition that results in lost data comes from a common misconception: that notifying a condition variable immediately wakes up any thread waiting on it, and that the woken thread is then guaranteed to run all of its work right away, before it next blocks on a mutex or a condition variable.
This is not the case.
All that notify_one() guarantees is that any execution thread that's listening on the condition variable will be woken up at some point after notify_one() is entered, which may be before or after notify_one() returns. That's it.
So, with that in mind, let's take a look at the sequence of events in thread1 (indentation adjusted):
timestamp1 = current_timestamp;
}
cv.notify_one();
timestamp1 is updated. The mutex is unlocked. The condition variable is notified.
thread3 is now scheduled to be woken up. But you have no guarantee whatsoever that it has actually woken up, let alone that it has managed to relock the mutex. All you have is a nebulous promise from notify_one that this will happen. Eventually.
Meanwhile, back at the ranch:
std::unique_lock<std::mutex> lock(mtx);
while (timestamp2 >= current_timestamp) {
cv.wait(lock);
}
timestamp1 = current_timestamp;
thread1 managed to read the next timestamp and relock the mutex. Modern CPUs are fast. They can accomplish quite a lot before a context switch to another execution thread. This same thread discovers that the while condition is still true.
Based on the fact that the logic waits until the shared current_timestamp is less than the current value I conclude that the timestamps must be in decreasing order. Well, the last time around the block timestamp2, from the other thread was 1000 and current_timestamp was 900; now, the next current_timestamp is 800. It's still less than 1000, so we proceed on our merry way, updating current_timestamp from 900 to 800.
Meanwhile, thread3 is still having a nice dream, and only now beginning to wake up from its slumber, as a result of the prior notify_one (which is now just a distant memory to this execution thread). And thread3 missed the 900 value completely. It was replaced by 800. It's gone forever, never to be seen again.
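To make the lost update concrete, here is a hedged sketch of a one-slot handoff in which the producer may not overwrite a value until the consumer has actually taken it. It is not the code from the question; the names Slot, put, take and transfer are assumptions introduced only for illustration.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical one-value mailbox: put() blocks while the slot is full,
// take() blocks while it is empty, so no value can be overwritten unseen.
class Slot {
public:
    void put(long long value) {
        std::unique_lock<std::mutex> lock(mtx_);
        cv_.wait(lock, [this] { return !full_; });  // wait until consumed
        value_ = value;
        full_ = true;
        cv_.notify_all();
    }
    long long take() {
        std::unique_lock<std::mutex> lock(mtx_);
        cv_.wait(lock, [this] { return full_; });   // wait until produced
        full_ = false;
        cv_.notify_all();
        return value_;  // copied while the lock is still held
    }
private:
    std::mutex mtx_;
    std::condition_variable cv_;
    long long value_ = 0;
    bool full_ = false;
};

// Sends every input value through the slot from a producer thread and
// collects them in the calling thread.
std::vector<long long> transfer(const std::vector<long long>& input) {
    Slot slot;
    std::vector<long long> received;
    std::thread producer([&] {
        for (long long v : input) slot.put(v);
    });
    for (std::size_t i = 0; i < input.size(); ++i) {
        received.push_back(slot.take());
    }
    producer.join();
    return received;
}
```

The point of the full_ flag is that correctness no longer depends on when the notified thread happens to wake up: the state itself, not the notification, says whether a value is pending.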
This is not the only flaw in the shown code's logic. The missing data you're seeing is not due to a single minor oversight, just a few lines of code that need to be fixed. The logic is flawed in several different ways, all of which contribute to the missing data. You will need to completely rework the logic if you still want to use this multi-threaded approach for the described task.
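As one possible rework, sketched under assumptions rather than taken from the question: each reader thread parses its own input into a private vector with no shared state, and the merge happens only after both readers have been joined. This assumes each input is already sorted in ascending order and well formed. The function names and the column parameter are hypothetical, and streams stand in for the two CSV files.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Reads the timestamp found in the given comma-separated column of every line.
std::vector<long long> read_timestamps(std::istream& in, std::size_t column) {
    std::vector<long long> out;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string field;
        std::size_t index = 0;
        while (std::getline(fields, field, ',')) {
            if (index++ == column) {
                out.push_back(std::stoll(field));  // assumes a valid integer
                break;
            }
        }
    }
    return out;
}

// Two reader threads fill private vectors; the caller merges after joining,
// so no mutex or condition variable is needed.
std::vector<long long> merge_timestamps(std::istream& a, std::istream& b,
                                        std::size_t column) {
    std::vector<long long> from_a, from_b;
    std::thread reader_a([&] { from_a = read_timestamps(a, column); });
    std::thread reader_b([&] { from_b = read_timestamps(b, column); });
    reader_a.join();
    reader_b.join();
    std::vector<long long> merged;
    merged.reserve(from_a.size() + from_b.size());
    std::merge(from_a.begin(), from_a.end(), from_b.begin(), from_b.end(),
               std::back_inserter(merged));
    return merged;
}
```

With real files, two std::ifstream objects could be passed in place of the string streams. The trade-off is that both files are held in memory before merging, which may not suit very large inputs.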

Start multiple threads and wait only for one to finish to obtain results

Assume I have a function double someRandomFunction(int n) that takes an integer and returns a double, but is random in the sense that it tries random things to come up with a solution. So even if you run the function with the same arguments, sometimes it takes 10 seconds to finish and other times 40 seconds.
someRandomFunction itself is a wrapper around a black-box function. It takes a while to complete, but I have no control over the black box's main loop, so I can't check a flag variable from within the thread: the heavy computation happens inside the black box.
I would like to start 10 threads calling that function, and I am interested in the result of whichever thread finishes first. I don't care which one it is; I only need one result from these threads.
I found the following code:
std::vector<boost::future<double>> futures;
for (...) {
auto fut = boost::async([i]() { return someRandomFunction(2); });
futures.push_back(std::move(fut));
}
for (...) {
auto res = boost::wait_for_any(futures.begin(), futures.end());
std::this_thread::yield();
std::cout << res->get() << std::endl;
}
This is the closest to what I am looking for, but I still can't see how to make my program terminate the other threads as soon as one thread returns a solution.
I would like to wait for one to finish and then carry on with the result of that one thread to continue my program execution (i.e., I don't want to terminate my program after I obtain that single result, but I would like to use it for the remaining program execution.).
Again, I want to start up 10 threads calling the someRandomFunction and then wait for one thread to finish first, get the result of that thread and stop all the other threads even though they didn't finish their work.
If the data structure supplied to the black-box has some obvious start and end values, one way to make it finish early could be to change the end value while it's computing. It could of course cause all sorts of trouble if you've misunderstood how the black-box must work with the data, but if you are reasonably sure, it can work.
main spawns 100 outer threads that each spawn one inner thread that calls the blackbox. The inner thread receives the blackbox result and notifies all waiting threads that it's done. The outer thread waits for any inner thread to get done and then modifies the data for its own blackbox to trick it to finish.
No polling (except for the spurious wakeup loops) and no detached threads.
#include <iostream>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <vector>
#include <chrono>
// a work package for one black-box
struct data_for_back_box {
int start_here;
int end_here;
};
double blackbox(data_for_back_box* data) {
// time consuming work here:
for(auto v=data->start_here; v<data->end_here; ++v) {
std::this_thread::sleep_for(std::chrono::milliseconds(10));
}
// just a debug
if(data->end_here==0) std::cout << "I was tricked into exiting early\n";
return data->end_here;
}
// synchronizing stuff and result
std::condition_variable cv;
std::mutex mtx;
bool done=false;
double result;
// a wrapper around the real blackbox
void inner(data_for_back_box* data) {
double r = blackbox(data);
// take the lock before touching the shared flag, so the read is not a data race
std::unique_lock<std::mutex> lock(mtx);
if(done) return; // someone has already finished, skip this result
// record the result and notify everyone that we're done
result = r;
done=true;
cv.notify_all();
}
// context setup and wait for any inner wrapper
// to signal "done"
void outer(int n) {
data_for_back_box data{0, 100+n*n};
std::thread work(inner, &data);
{
std::unique_lock<std::mutex> lock(mtx);
while( !done ) cv.wait(lock);
}
// corrupt data for blackbox:
data.end_here = 0;
// wait for this threads blackbox to finish
work.join();
}
int main() {
std::vector<std::thread> ths;
// spawn 100 worker threads
for(int i=0; i<100; ++i) {
ths.emplace_back(outer, i);
}
double saved_result;
{
std::unique_lock<std::mutex> lock(mtx);
while( !done ) cv.wait(lock);
saved_result = result;
} // release lock
// join all threads
std::cout << "got result, joining:\n";
for(auto& th : ths) {
th.join();
}
std::cout << "result: " << saved_result << "\n";
}
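For the "take the first result" part alone, a smaller sketch is possible with std::promise: every worker runs its task, and only the first one to finish gets to set the promise. This is an alternative under stated assumptions, not the answer's method; first_result is a hypothetical name, and the sketch does not stop the slower workers, it merely ignores their results and joins them.

```cpp
#include <atomic>
#include <functional>
#include <future>
#include <stdexcept>
#include <thread>
#include <vector>

// Runs every task on its own thread and returns the result of whichever
// finishes first. All threads are joined before returning.
double first_result(const std::vector<std::function<double()>>& tasks) {
    if (tasks.empty()) {
        throw std::invalid_argument("first_result needs at least one task");
    }
    std::promise<double> winner;
    std::future<double> fut = winner.get_future();
    std::atomic<bool> claimed{false};
    std::vector<std::thread> workers;
    for (const auto& task : tasks) {
        workers.emplace_back([&winner, &claimed, task] {
            double r = task();
            // exchange() returns the previous value, so exactly one
            // worker sees false and is allowed to set the promise.
            if (!claimed.exchange(true)) {
                winner.set_value(r);
            }
        });
    }
    double result = fut.get();  // blocks until the first worker finishes
    for (auto& w : workers) {
        w.join();               // still waits for the slower workers
    }
    return result;
}
```

Because the joins still wait for every black box to return, this only helps when combined with some way of making the remaining calls finish early, such as the end-value trick shown above.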

Linux Multithreading - threads do not produce any output as expected

I am learning multithreading on Linux. I wrote this small program to get comfortable with the concepts. When I run the executable, I see no error, but it doesn't print Hi either. So I made the thread sleep after printing, but I still don't see anything on the console.
I also want to know which thread prints at run time. Can anyone help me?
#include <iostream>
#include <unistd.h>
#include <pthread.h>
using std::cout;
using std::endl;
void* print (void* data)
{
cout << "Hi" << endl;
sleep(10000000);
}
int main (int argc, char* argv[])
{
int t1 = 1, t2 =2, t3 = 3;
pthread_t thread1, thread2, thread3;
int thread_id_1, thread_id_2, thread_id_3;
thread_id_1 = pthread_create(&thread1, NULL, print, 0);
thread_id_2 = pthread_create(&thread2, NULL, print, 0);
thread_id_3 = pthread_create(&thread3, NULL, print, 0);
return 0;
}
Your main thread probably exits and thus the entire process dies. So, the threads don't get a chance to run. It's also possible (quite unlikely but still possible) that you'd see the output from the threads even with your code as-is if the threads complete execution before main thread exits. But you can't rely on that.
Call pthread_join(), which suspends the calling thread until the thread (specified by the thread ID) returns, on the threads after the pthread_create() calls in main():
pthread_join(thread1, NULL);
pthread_join(thread2, NULL);
pthread_join(thread3, NULL);
You can also use an array of pthread_t which would allow you to use a for loop over the pthread_create() and pthread_join() calls.
Or exit only the main thread using pthread_exit(0), which would exit only the calling thread and the remaining threads (the ones you created) will continue execution.
Note that your thread function should return a pointer or NULL:
void* print (void* data)
{
cout << "Hi" << endl;
return NULL;
}
I'm also not sure about the very long sleep right before the threads exit. It is unnecessary and would keep the threads from exiting. Probably not something you wanted.
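The array-and-loop variant mentioned above could look roughly like the following sketch. The worker here just marks a slot so the effect of joining is observable; mark_slot, run_and_join and kThreads are hypothetical names, not taken from the question.

```cpp
#include <pthread.h>

const int kThreads = 3;

// Thread function: increments the int it was given and returns a null
// pointer, as a pthread start routine is expected to return something.
void* mark_slot(void* data) {
    int* slot = static_cast<int*>(data);
    *slot += 1;
    return nullptr;
}

// Creates kThreads threads in a loop, then joins them in a second loop.
// Returns the number of threads that were successfully joined.
int run_and_join(int (&slots)[kThreads]) {
    pthread_t threads[kThreads];
    int started = 0;
    for (int i = 0; i < kThreads; ++i) {
        if (pthread_create(&threads[i], nullptr, mark_slot, &slots[i]) != 0) {
            break;  // stop creating on the first failure
        }
        ++started;
    }
    int joined = 0;
    for (int i = 0; i < started; ++i) {
        if (pthread_join(threads[i], nullptr) == 0) {
            ++joined;
        }
    }
    return joined;
}
```

Each thread gets a pointer to its own slot, so no locking is needed, and main can safely read the slots once run_and_join has returned.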

How do I make threads run sequentially instead of concurrently?

For example I want each thread to not start running until the previous one has completed, is there a flag, something like thread.isRunning()?
#include <iostream>
#include <vector>
#include <thread>
using namespace std;
void hello() {
cout << "thread id: " << this_thread::get_id() << endl;
}
int main() {
vector<thread> threads;
for (int i = 0; i < 5; ++i)
threads.push_back(thread(hello));
for (thread& thr : threads)
thr.join();
cin.get();
return 0;
}
I know that the threads are meant to run concurrently, but what if I want to control the order?
There is no thread.isRunning(). You need some synchronization primitive to do it.
Consider std::condition_variable for example.
One approachable way is to use std::async. Under the current definition of std::async, the associated state of an operation launched by std::async can cause the returned std::future's destructor to block until the operation is complete. This can limit composability and result in code that appears to run in parallel but in reality runs sequentially.
{
std::async(std::launch::async, []{ hello(); });
std::async(std::launch::async, []{ hello(); }); // does not run until hello() completes
}
If the second thread must only start running after the first one has completed, is a thread really needed?
As a solution, I think setting a global flag should work: set the value in the first thread, and check the flag first when the second thread starts.
You can't simply control the order by saying "first thread 1, then thread 2, ..."; you will need to use synchronization (i.e. std::mutex and condition variables such as std::condition_variable_any).
You can create events to block one thread until a certain event has happened.
See cppreference for an overview of the threading mechanisms in C++11.
You will need to use a semaphore or a lock.
If you initialize the semaphore to 0:
Call wait after thread.start(), and call signal/release at the end of the thread's execution function (e.g. the run function in Java, an OnExit function, etc.).
The main thread will then keep waiting until the thread started in the loop has completed its execution.
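The semaphore idea can be sketched in C++11 as follows. Since std::counting_semaphore only arrived in C++20, this assumes a small hand-written Semaphore built from a mutex and a condition variable; that class and run_in_order are hypothetical names for illustration.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Minimal counting semaphore for C++11.
class Semaphore {
public:
    explicit Semaphore(int initial) : count_(initial) {}
    void release() {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            ++count_;
        }
        cv_.notify_one();
    }
    void acquire() {
        std::unique_lock<std::mutex> lock(mtx_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
private:
    std::mutex mtx_;
    std::condition_variable cv_;
    int count_;
};

// Starts n threads one at a time; the next thread is only created after
// the previous one has signalled the semaphore. Returns the run order.
std::vector<int> run_in_order(int n) {
    std::vector<int> order;
    Semaphore finished(0);  // starts at 0, so acquire() blocks
    std::vector<std::thread> threads;
    for (int id = 0; id < n; ++id) {
        threads.emplace_back([&order, &finished, id] {
            order.push_back(id);  // only one worker runs at a time
            finished.release();   // signal at the end of the thread body
        });
        finished.acquire();       // main waits before starting the next
    }
    for (auto& t : threads) {
        t.join();
    }
    return order;
}
```

This gives strictly sequential execution, which is also why one of the other answers asks whether separate threads are needed at all in that case.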
Task-based parallelism can achieve this, but C++ does not currently offer a task model as part of its threading libraries. If you have TBB or PPL, you can use their task-based facilities.
I think you can achieve this using std::mutex and std::condition_variable from C++11. To run the threads sequentially, an array of booleans is used: when a thread is done with its work, it writes true at its own index in the array.
For example:
mutex mtx;
condition_variable cv;
bool ids[10] = { false }; // ids[i] becomes true once thread i has finished
void shared_method(int id) {
unique_lock<mutex> lock(mtx);
if (id != 0) {
while (!ids[id - 1]) {
cv.wait(lock);
}
}
int delay = rand() % 4;
cout << "Thread " << id << " will finish in " << delay << " seconds." << endl;
this_thread::sleep_for(chrono::seconds(delay));
ids[id] = true;
cv.notify_all();
}
void test_condition_variable() {
thread threads[10];
for (int i = 0; i < 10; ++i) {
threads[i] = thread(shared_method, i);
}
for (thread &t : threads) {
t.join();
}
}
Output:
Thread 0 will finish in 3 seconds.
Thread 1 will finish in 1 seconds.
Thread 2 will finish in 1 seconds.
Thread 3 will finish in 2 seconds.
Thread 4 will finish in 2 seconds.
Thread 5 will finish in 0 seconds.
Thread 6 will finish in 0 seconds.
Thread 7 will finish in 2 seconds.
Thread 8 will finish in 3 seconds.
Thread 9 will finish in 1 seconds.

make main program wait for threads to finish

In the following code I create some number of threads, and each thread sleeps for a few seconds.
However, my main program doesn't wait for the threads to finish. I was under the assumption that threads would continue to run until they finished by themselves.
Is there some way of making the threads continue to run even after the calling thread finishes?
#include <pthread.h>
#include <iostream>
#include <cstdio>
#include <cstdlib>
int sample(int min,int max){
int r=rand();
return (r %max+min );
}
void *worker(void *p){
long i = (long) p;
int s = sample(1,10);
fprintf(stdout,"\tid:%ld will sleep: %d \n",i,s);
sleep(s);
fprintf(stdout,"\tid:%ld done sleeping \n",i,s);
}
pthread_t thread1;
int main(){
int nThreads = sample(1,10);
for(int i=0;i<nThreads;i++){
fprintf(stderr,"\t-> Creating: %d of %d\n",i,nThreads);
int iret1 = pthread_create( &thread1, NULL, worker, (void*) i);
pthread_detach(thread1);
}
// sleep(10);//work if this is not commented out.
return 0;
}
Thanks
Edit:
Sorry for not clarifying: is it possible without explicitly keeping track of my currently running threads and using join?
Each program has a main thread: the thread in which your main() function executes. When that thread finishes, the program finishes along with all its threads. If you want your main thread to wait for the other threads, you must use the pthread_join function.
You need to keep track of the threads. You are not doing that, because you are using the same thread1 variable for every thread you create.
You track threads by creating a list (or array) of pthread_t values that you pass to the pthread_create() function. Then you pthread_join() the threads in that list.
edit:
Well, it's really lazy of you to not keep track of running threads. But you can accomplish what you want with a global variable (protected by a mutex) that gets incremented just before a thread finishes. Then, in your main thread, you can check whether that variable has reached the value you want, say nThreads in your sample code.
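A hedged sketch of that counter idea, with one addition that the answer does not mention: a condition variable, so that main can block instead of polling the counter. The worker body is a placeholder and all names here (g_lock, g_finished, counted_worker, run_detached_and_wait) are assumptions for illustration.

```cpp
#include <pthread.h>

static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t g_progress = PTHREAD_COND_INITIALIZER;
static int g_finished = 0;  // number of workers that have completed

// Placeholder worker: the real work would go before the lock is taken.
void* counted_worker(void*) {
    pthread_mutex_lock(&g_lock);
    ++g_finished;                    // incremented just before finishing
    pthread_cond_signal(&g_progress);
    pthread_mutex_unlock(&g_lock);
    return nullptr;
}

// Starts n detached threads and blocks until all that were started have
// bumped the counter. Returns the number of workers that finished.
int run_detached_and_wait(int n) {
    pthread_mutex_lock(&g_lock);
    g_finished = 0;
    pthread_mutex_unlock(&g_lock);

    int started = 0;
    for (int i = 0; i < n; ++i) {
        pthread_t t;
        if (pthread_create(&t, nullptr, counted_worker, nullptr) == 0) {
            pthread_detach(t);       // no pthread_t is kept around
            ++started;
        }
    }

    pthread_mutex_lock(&g_lock);
    while (g_finished < started) {   // loop guards against spurious wakeups
        pthread_cond_wait(&g_progress, &g_lock);
    }
    int finished = g_finished;
    pthread_mutex_unlock(&g_lock);
    return finished;
}
```

Note that this still tracks the threads, only by count rather than by handle, and it gives up the return values that pthread_join would provide.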
You need to join each thread you create:
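int main()
{
int nThreads = sample(1,10);
std::vector<pthread_t> threads(nThreads); // requires #include <vector>
for(int i=0; i<nThreads; i++)
{
// go through long so the int-to-pointer cast matches worker's (long) p
pthread_create( &threads[i], NULL, worker, (void*)(long) i);
}
/* Wait on the other threads */
for(int i=0; i<nThreads; i++)
{
void* status; // receives the thread's return value
pthread_join(threads[i], &status);
}
}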
You learned your assumption was wrong. Main is special. Exiting main will kill your threads. So there are two options:
Use pthread_exit to exit main. This function will allow you to exit main but keep other threads running.
Do something to keep main alive. This can be anything from a loop (stupid and inefficient) to any blocking call. pthread_join is common since it will block but also give you the return status of the threads, if you are interested, and clean up the dead thread resources. But for the purposes of keeping main from terminating any blocking call will do e.g. select, read a pipe, block on a semaphore, etc.
Since Martin showed join(), here's pthread_exit():
int main(){
int nThreads = sample(1,10);
for(int i=0;i<nThreads;i++){
fprintf(stderr,"\t-> Creating: %d of %d\n",i,nThreads);
int iret1 = pthread_create( &thread1, NULL, worker, (void*) i);
pthread_detach(thread1);
}
pthread_exit(NULL);
}