Confusing C++ Thread Behavior


I was reading some literature on C++11 threads and tried the following code:
#include "iostream"
#include "thread"
using namespace std;
class background_task{
int data;
int flag;
public:
background_task(int val):data(val),flag(data%2){}
void operator()(void){
int count = 0;
while(count < 100)
{
if(flag)
cout <<'\n'<<data++;
else
cout <<'\n'<<data--;
count++;
}
}
};
int main(int argc , char** argv){
std::thread T1 {background_task(2)};
std::thread T2 {background_task(3)};
T1.join();
T2.join();
return 0;
}
The output doesn't make sense to me: I am running two threads, so I would expect them to print at almost the same time rather than one waiting for the other to finish. Instead, each thread runs to completion and only then does the next one start, as if the code were synchronous. Am I missing something here?

It's probably because creating a new thread takes some time, and the first thread finishes before the next one begins.
You also have the choice to detach or join a thread:
t1.detach(); // don't care about t1 finishing
or
t1.join(); // wait for t1 to finish

Your operating system need not start the threads at the same time; it need not start them on different cores; it need not provide equal time to each thread. I really don't believe the standard mandates anything of the sort, but I haven't checked the standard to cite the right parts to verify.
You may be able to (no promises!) get the behavior you desire by changing your code to the following. This code is "encouraging" the OS to give more time to both threads, and hopefully allows for both threads to be fully constructed before one of them finishes.
#include <chrono>
#include <iostream>
#include <thread>
class background_task {
public:
background_task(int val) : data(val), flag(data % 2) {}
void operator()() {
int count = 0;
while (count < 100) {
std::this_thread::sleep_for(std::chrono::milliseconds(50));
if (flag)
std::cout << '\n' << data++;
else
std::cout << '\n' << data--;
count++;
}
}
private:
int data;
int flag;
};
int main() {
std::thread T1{background_task(2)};
std::thread T2{background_task(3)};
T1.join();
T2.join();
return 0;
}
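Another way to make the overlap observable, without relying on sleeps, is to hold both threads at a starting gate until both have been created. The following is only a sketch: run_gated and gate_open are illustrative names that do not appear in the question, and the threads log into a vector instead of printing. Even with the gate, the scheduler is still free to run one thread's whole loop before the other's, so the log may or may not be interleaved.

```cpp
#include <atomic>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Runs two threads that each append their id `per_thread` times to a shared log.
// Neither thread starts logging until both threads have been created.
std::vector<int> run_gated(int per_thread) {
    std::atomic<bool> gate_open{false};
    std::mutex log_mutex;
    std::vector<int> log;

    auto worker = [&](int id) {
        while (!gate_open.load()) {
            std::this_thread::yield();  // wait at the gate
        }
        for (int i = 0; i < per_thread; ++i) {
            std::lock_guard<std::mutex> guard(log_mutex);
            log.push_back(id);
        }
    };

    std::thread t1(worker, 1);
    std::thread t2(worker, 2);
    gate_open.store(true);  // both threads exist now; release them together
    t1.join();
    t2.join();
    return log;
}
```

Inspecting the returned vector shows where the ids 1 and 2 alternate and where one thread ran for a long stretch on its own.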

Try the code below; I modified your earlier code to show the result:
#include "iostream"
#include "thread"
using namespace std;
class background_task{
int data;
int flag;
public:
background_task(int val):data(val),flag(data%2){}
void operator()(void){
int count = 0;
while(count < 10000000)
{
if(flag)
cout <<'\n'<<"Yes";
else
cout <<'\n'<<" "<<"No";
count++;
}
}
};
int main(int argc , char** argv){
std::thread T1 {background_task(2)};
std::thread T2 {background_task(3)};
T1.join();
T2.join();
return 0;
}
By the time the second thread starts, the first thread is already done processing; hence you saw what you saw.

In addition to Amir Rasti's answer, I think it's worth mentioning the scheduler.
If you use a while(1) instead, you will see that the output isn't exactly parallel even though the two threads are running "in parallel". The scheduler (part of the operating system) gives each thread time to run, but the amount of time can vary. So one thread may print 100 characters before the scheduler lets the other thread print again.

while(count < 10000)
The loop may finish before the next thread starts. You can see the difference if you increase the loop count or insert a sleep inside the loop.

Related

cpp thread join if two threads rely each other should using join cause deadlock

#include <iostream>
#include <mutex>
#include <condition_variable>
#include <thread>
std::mutex lock_bar_;
std::mutex lock_foo_;
int n = 3;
void foo() {
for (int i = 0; i < n; i++) {
lock_foo_.lock();
// printFoo() outputs "foo". Do not change or remove this line.
std::cout << "1\n";
lock_bar_.unlock();
}
}
void bar() {
for (int i = 0; i < n; i++) {
lock_bar_.lock();
// printBar() outputs "bar". Do not change or remove this line.
std::cout << "2\n";
lock_foo_.unlock();
}
}
int main(){
lock_bar_.lock();
std::thread t1{foo};
std::thread t2{bar};
t1.join(); // line 1
std::cout << "333\n"; // line 2
t2.join(); // line 3
std::cout << "3\n"; // line 4
}
the result is
1
2
1
2
1
2
333
3
or
1
2
1
2
1
333
2
3
My question is: why can this program run without deadlock?
How does join() actually work?
When the program executes line 1, according to cppreference https://en.cppreference.com/w/cpp/thread/thread/join
"Blocks the current thread until the thread identified by *this finishes its execution."
My understanding is that the main thread should stop and wait until thread t1 is finished, then execute line 2 and the rest.
But the program seems to execute line 1 and line 3 together: when thread t1 is finished, it runs line 2; when thread t2 is finished, it executes line 4.
I am confused about join().
If anyone can help, it would be much appreciated.
first edited:
ignore original program
new program is
#include <iostream>
#include <mutex>
#include <condition_variable>
#include <thread>
int n = 10;
bool first = true;
std::condition_variable cv1;
std::condition_variable cv2;
std::mutex m;
void foo() {
std::unique_lock<std::mutex> ul(m, std::defer_lock);
for (int i = 0; i < n; i++) {
ul.lock();
cv1.wait(ul, [&]()->bool {return first;} );
std::cout << "1\n";
// printFoo() outputs "foo". Do not change or remove this line.
first = !first;
ul.unlock();
cv2.notify_all();
}
}
void bar() {
std::unique_lock<std::mutex> ul(m, std::defer_lock);
for (int i = 0; i < n; i++) {
ul.lock();
cv2.wait(ul, [&]()->bool {return !first;} );
// printBar() outputs "bar". Do not change or remove this line.
std::cout << "2\n";
first = !first;
ul.unlock();
cv1.notify_all();
}
}
int main(){
std::thread t1{foo};
std::thread t2{bar};
t1.join();
std::cout << "3\n";
t2.join();
}
same questions

Your threads do very little work. Depending on your OS and the number of CPU cores, the scheduler typically switches between threads only at certain intervals, so there is a reasonable chance that by the time t1.join() returns, t2 has already finished executing (your first output).
If you add some sleeps to the loops in your threads you should see your second output every time as t2 will still be executing when t1.join returns.
Note that unlocking a mutex from a thread that didn't originally lock the mutex has undefined behaviour: https://en.cppreference.com/w/cpp/thread/mutex/unlock

You have made the wrong assumption that mutexes can be locked and unlocked from different threads.
A mutex locked by one thread cannot be unlocked by another thread. The whole lock/unlock process is per thread.
lock_bar_.unlock();
This line in your first function is not valid, because foo() never locked lock_bar_. See ReleaseMutex on Windows (I guess it works that way on other OSes). It releases a mutex previously locked by the current thread, not by any other thread.
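Since a std::mutex has to be unlocked by the thread that locked it, the "hand the turn to the other thread" pattern needs a different primitive. One option is a small semaphore built from a mutex and a condition variable. The sketch below is an illustration only: the Semaphore class and the alternate function are names made up for this example, and the output is collected in a string rather than printed.

```cpp
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>

// Minimal counting semaphore: unlike std::mutex, release() may be called
// from a different thread than the one that called acquire().
class Semaphore {
public:
    explicit Semaphore(int initial) : count_(initial) {}
    void acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
    void release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++count_;
        }
        cv_.notify_one();
    }
private:
    int count_;
    std::mutex mutex_;
    std::condition_variable cv_;
};

// Returns the sequence written by the two threads, e.g. "1212" for n == 2.
std::string alternate(int n) {
    Semaphore foo_turn(1);  // foo goes first
    Semaphore bar_turn(0);
    std::string out;        // only touched by the thread whose turn it is

    std::thread t1([&] {
        for (int i = 0; i < n; ++i) {
            foo_turn.acquire();
            out += '1';
            bar_turn.release();  // hand the turn to the other thread
        }
    });
    std::thread t2([&] {
        for (int i = 0; i < n; ++i) {
            bar_turn.acquire();
            out += '2';
            foo_turn.release();
        }
    });
    t1.join();
    t2.join();
    return out;
}
```

Calling alternate(3) should yield "121212" on every run, because each thread can only proceed after the other has released its turn.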

The answer is that when std::thread t2{} is created, it is queued to run right away, without waiting for any join(). That is why t2 also executes while main is blocked in t1.join(). join() does not mean start().
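A small experiment can make this visible. In the sketch below (worker_runs_before_join is a made-up name for this illustration), the main thread waits for a side effect of the worker before it ever calls join(). The wait loop can only end if the thread was started by its constructor.

```cpp
#include <atomic>
#include <thread>

// Returns true if the worker did its work before join() was ever called.
bool worker_runs_before_join() {
    std::atomic<bool> done{false};
    std::thread worker([&done] { done.store(true); });  // starts running here

    // Deliberately do not call join() yet; just wait for the worker's effect.
    while (!done.load()) {
        std::this_thread::yield();
    }
    bool ran_before_join = done.load();  // true once the loop has exited

    worker.join();  // only waits for the thread to end and cleans it up
    return ran_before_join;
}
```

If join() were what started the thread, this function would spin forever instead of returning.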

Two threads sharing variable C++

So I have two threads that share the same variable, 'counter'. I want to synchronize my threads by only continuing execution once both threads have reached that point. Unfortunately I enter a deadlock state as my thread isn't changing its checking variable. The way I have it is:
volatile int counter = 0;
Thread() {
- some calculations -
counter++;
while(counter != 2) {
std::this_thread::yield();
}
counter = 0;
- rest of the calculations -
}
The idea is that since I have 2 threads, once they reach that point - at different times - they will increment the counter. If the counter isn't equal to 2, then the thread that reached there first will have to wait until the other has incremented the counter so that they are synced up. Does anyone know where the issue lies here?
To add more information about the problem, I have two threads which each perform half of the operations on an array. Once they are done, I want to make sure that they have both finished their calculations. Once they have, I can signal the printer thread to wake up and perform its operation of printing and clearing the array. If I do this before both threads have completed, there will be issues.
Pseudo code:
Thread() {
getLock()
1/2 of the calculations on array
releaseLock()
wait for both to finish - this is the issue
wake up printer thread
}

In situations like this, you must use an atomic counter.
std::atomic_uint counter{0}; // brace initialization also compiles under C++11/14
In the given example, there is also no sign that counter got initialized.
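To make this concrete, here is a sketch of the two-thread rendezvous with an atomic counter. It assumes that one plausible cause of the hang in the question is the reset to 0: the first thread to see the value 2 can reset the counter before the other thread has checked it. The sketch therefore never resets the counter, so it works for a single rendezvous only. The names rendezvous_worker, arrived and passed are illustrative.

```cpp
#include <atomic>
#include <functional>
#include <thread>

// Each worker announces its arrival, then waits until both have arrived.
void rendezvous_worker(std::atomic<int>& arrived, std::atomic<int>& passed) {
    // ... the first half of the calculations would go here ...
    arrived.fetch_add(1);  // atomic read-modify-write, unlike ++ on a volatile int
    while (arrived.load() < 2) {
        std::this_thread::yield();
    }
    // The counter is NOT reset here: resetting it could hide the value 2
    // from the other thread before that thread has observed it.
    passed.fetch_add(1);
    // ... the rest of the calculations would go here ...
}

// Runs two workers and returns how many of them got past the rendezvous.
int run_rendezvous() {
    std::atomic<int> arrived{0};
    std::atomic<int> passed{0};
    std::thread a(rendezvous_worker, std::ref(arrived), std::ref(passed));
    std::thread b(rendezvous_worker, std::ref(arrived), std::ref(passed));
    a.join();
    b.join();
    return passed.load();
}
```

A rendezvous that has to be reused needs more care, for example a generation count, which is why a condition variable or a latch is usually the easier tool.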

You are probably looking for std::condition_variable: a condition variable allows one thread to signal to another thread. Because it doesn't look like you are using the counter for anything other than synchronisation, here is some code from another answer (disclaimer: it's one of my answers) that shows std::condition_variable processing logic on different threads, and performing synchronisation around a value:
unsigned int accountAmount;
std::mutex mx;
std::condition_variable cv;
void depositMoney()
{
// go to the bank etc...
// wait in line...
{
std::unique_lock<std::mutex> lock(mx);
std::cout << "Depositing money" << std::endl;
accountAmount += 5000;
}
// Notify others we're finished
cv.notify_all();
}
void withdrawMoney()
{
std::unique_lock<std::mutex> lock(mx);
// Wait until the money is there; the predicate guards against spurious and missed wakeups
cv.wait(lock, [] { return accountAmount >= 2000; });
std::cout << "Withdrawing money" << std::endl;
accountAmount -= 2000;
}
int main()
{
accountAmount = 0;
// Run both threads simultaneously:
std::thread deposit(&depositMoney);
std::thread withdraw(&withdrawMoney);
// Wait for both threads to finish
deposit.join();
withdraw.join();
std::cout << "All transactions processed. Final amount: " << accountAmount << std::endl;
return 0;
}

I would look into using a countdown latch. The idea is to have one or more threads block until the desired operation is completed. In this case you want to wait until both threads are finished modifying the array.
Here is a simple example:
#include <condition_variable>
#include <mutex>
#include <thread>
class countdown_latch
{
public:
countdown_latch(int count)
: count_(count)
{
}
void wait()
{
std::unique_lock<std::mutex> lock(mutex_);
while (count_ > 0)
condition_variable_.wait(lock);
}
void countdown()
{
std::lock_guard<std::mutex> lock(mutex_);
--count_;
if (count_ == 0)
condition_variable_.notify_all();
}
private:
int count_;
std::mutex mutex_;
std::condition_variable condition_variable_;
};
and usage would look like this
std::atomic<int> result{0}; // needs <atomic>; brace initialization also compiles under C++11/14
countdown_latch latch(2);
void perform_work()
{
++result;
latch.countdown();
}
int main()
{
std::thread t1(perform_work);
std::thread t2(perform_work);
latch.wait();
std::cout << "result = " << result;
t1.join();
t2.join();
}

Why is this piece of C++ code not synchronized

I am learning to write multithreaded applications. So far I run into trouble any time I want my threads to access even the simplest shared resources, despite using a mutex.
For example, consider this code:
using namespace std;
mutex mu;
std::vector<string> ob;
void addSomeAValues(){
mu.lock();
for(int a=0; a<10; a++){
ob.push_back("A" + std::to_string(a));
usleep(300);
}
mu.unlock();
}
void addSomeBValues(){
mu.lock();
for(int b=0; b<10; b++){
ob.push_back("B" + std::to_string(b));
usleep(300);
}
mu.unlock();
}
int main() {
std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
thread t0(addSomeAValues);
thread t1(addSomeBValues);
std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
t0.join();
t1.join();
//Display the results
cout << "Code Run Complete; results: \n";
for(auto k : ob){
cout << k <<endl;
}
//Code running complete, report the time it took
typedef std::chrono::duration<int,std::milli> millisecs_t;
millisecs_t duration(std::chrono::duration_cast<millisecs_t>(end-start));
std::cout << duration.count() << " milliseconds.\n";
return 0;
}
When I run the program, it behaves unpredictably. Sometimes the values A0-9 and B0-9 are printed to the console without a problem, sometimes there is a segmentation fault with a crash report, and sometimes only A0-3 and B0-5 are printed.
If I am missing a core synchronization issue, please help.
Edit: after a lot of useful feedback I changed the code to
#include <iostream>
#include <string>
#include <vector>
#include <mutex>
#include <unistd.h>
#include <thread>
#include <chrono>
using namespace std;
mutex mu;
std::vector<string> ob;
void addSomeAValues(){
for(int a=0; a<10; a++){
mu.lock();
ob.push_back("A" + std::to_string(a));
mu.unlock();
usleep(300);
}
}
void addSomeBValues(){
for(int b=0; b<10; b++){
mu.lock();
ob.push_back("B" + std::to_string(b));
mu.unlock();
usleep(300);
}
}
int main() {
std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now() ;
thread t0(addSomeAValues);
thread t1(addSomeBValues);
std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now() ;
t0.join();
t1.join();
//Display the results
cout << "Code Run Complete; results: \n";
for(auto k : ob){
cout << k <<endl;
}
//Code running complete, report the time it took
typedef std::chrono::duration<int,std::milli> millisecs_t ;
millisecs_t duration( std::chrono::duration_cast<millisecs_t>(end-start) ) ;
std::cout << duration.count() << " milliseconds.\n" ;
return 0;
}
however I get the following output sometimes:
*** Error in `/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment':
double free or corruption (fasttop): 0x00007f19fc000920 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x80a46)[0x7f1a0687da46]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x402dd4]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x402930]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x402a8d]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x402637
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x402278]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x4019cf]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x4041e3]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x404133]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x404088]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb29f0)[0x7f1a06e8d9f0]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7f8e)[0x7f1a060c6f8e]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f1a068f6e1d]
Update & Solution
With the problem I was experiencing (namely: unpredictable executing of the program with intermittent dump of corruption complaints), all was solved by including -lpthread as part of my eclipse build (under project settings).
I am using C++11. It's odd, at least to me, that the program would compile without issuing a complaint that I have not yet linked against pthread.
So to anyone using C++11, std::thread, and linux, make sure you link against pthread otherwise your program runtime will be VERY unpredictable, and buggy.

If you're going to use threads, I'd advise doing the job at least a little differently.
Right now, one thread gets the mutex, does all it's going to do (including sleeping for 3000 microseconds), then quits. Then the other thread does essentially the same thing. This being the case, threads have accomplished essentially nothing positive and a fair amount of negative (synchronization code and such).
Your current code is also unsafe with respect to exceptions -- if an exception were to be thrown inside one of your thread functions, the mutex wouldn't be unlocked, even though that thread could no longer execute.
Finally, right now, you're exposing a mutex, and leaving it to all code that accesses the associated resource to use the mutex correctly. I'd prefer to centralize the mutex locking so it's exception safe, and most of the code can ignore it completely.
// use std::lock_guard, if available.
class lock {
mutex &m;
public:
lock(mutex &m) : m(m) { m.lock(); }
~lock() { m.unlock(); }
};
class synched_vec {
mutex m;
std::vector<string> data;
public:
void push_back(std::string const &s) {
lock l(m);
data.push_back(s);
}
} ob;
void addSomeAValues(){
for(int a=0; a<10; a++){
ob.push_back("A" + std::to_string(a));
usleep(300);
}
}
This also means that if (for example) you decide to use a lock-free (or minimal locking) structure in the future, you should only have to modify the synched_vec, not all the rest of the code that uses it. Likewise, by keeping all the mutex handling in one place, it's much easier to get the code right, and if you do find a bug, much easier to ensure you've fixed it (rather than looking through all the client code).
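As the comment in the code suggests, std::lock_guard can replace the hand-written lock class. The following sketch shows the same wrapper idea with it; SynchedVec, snapshot and fill_from_two_threads are names chosen for this illustration and are not part of the code above.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// A vector whose mutex is an implementation detail: callers never lock anything.
class SynchedVec {
public:
    void push_back(const std::string& s) {
        std::lock_guard<std::mutex> guard(m_);  // unlocks on scope exit, even on exceptions
        data_.push_back(s);
    }
    // Returns a copy, so the caller can read it without holding the lock.
    std::vector<std::string> snapshot() const {
        std::lock_guard<std::mutex> guard(m_);
        return data_;
    }
private:
    mutable std::mutex m_;  // mutable so that const readers can lock it
    std::vector<std::string> data_;
};

// Two threads push ten values each; returns the final number of elements.
std::size_t fill_from_two_threads() {
    SynchedVec ob;
    auto add = [&ob](const std::string& prefix) {
        for (int i = 0; i < 10; ++i) {
            ob.push_back(prefix + std::to_string(i));
        }
    };
    std::thread t0(add, "A");
    std::thread t1(add, "B");
    t0.join();
    t1.join();
    return ob.snapshot().size();
}
```

The order of the A and B entries can differ between runs, but the element count is always 20 because every push_back happens under the lock.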

The code in the question runs without any segmentation faults (after adding headers and replacing the sleep with one that works on my system).
There are two problems with the code, though, that could cause unexpected results:
Each thread holds the mutex for its entire execution. This prevents the other thread from running, so the two threads do not run in parallel! In your case, you should lock only while you are accessing the vector.
Your end time point is taken after creating the threads, not after they are done executing. Both threads are done only once they have both been joined.
Working compilable code with headers, chrono-sleep and the two errors fixed:
#include <chrono>
#include <mutex>
#include <string>
#include <vector>
#include <thread>
#include <iostream>
std::mutex mu;
std::vector<std::string> ob;
void addSomeAValues(){
for(int a=0; a<10; a++){
mu.lock();
ob.push_back("A" + std::to_string(a));
mu.unlock();
std::this_thread::sleep_for(std::chrono::milliseconds(300));
}
}
void addSomeBValues(){
for(int b=0; b<10; b++){
mu.lock();
ob.push_back("B" + std::to_string(b));
mu.unlock();
std::this_thread::sleep_for(std::chrono::milliseconds(300));
}
}
int main() {
std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
std::thread t0(addSomeAValues);
std::thread t1(addSomeBValues);
t0.join();
t1.join();
std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
//Display the results
std::cout << "Code Run Complete; results: \n";
for(auto k : ob){
std::cout << k << std::endl;
}
//Code running complete, report the time it took
typedef std::chrono::duration<int,std::milli> millisecs_t;
millisecs_t duration(std::chrono::duration_cast<millisecs_t>(end-start));
std::cout << duration.count() << " milliseconds.\n";
return 0;
}

Thread pooling in C++11

Relevant questions:
About C++11:
C++11: std::thread pooled?
Will async(launch::async) in C++11 make thread pools obsolete for avoiding expensive thread creation?
About Boost:
C++ boost thread reusing threads
boost::thread and creating a pool of them!
How do I get a pool of threads to send tasks to, without creating and deleting them over and over again? This means persistent threads to resynchronize without joining.
I have code that looks like this:
namespace {
std::vector<std::thread> workers;
int total = 4;
int arr[5] = {0}; // arr[0..3] are the per-thread slots, arr[4] holds their minimum
void each_thread_does(int i) {
arr[i] += 2;
}
}
int main(int argc, char *argv[]) {
for (int i = 0; i < 8; ++i) { // for 8 iterations,
for (int j = 0; j < 4; ++j) {
workers.push_back(std::thread(each_thread_does, j));
}
for (std::thread &t: workers) {
if (t.joinable()) {
t.join();
}
}
// std::min_element returns an iterator, so dereference it to get the value
arr[4] = *std::min_element(arr, arr+4);
}
return 0;
}
Instead of creating and joining threads each iteration, I'd prefer to send tasks to my worker threads each iteration and only create them once.

This is adapted from my answer to another very similar post.
Let's build a ThreadPool class:
class ThreadPool {
public:
void Start();
void QueueJob(const std::function<void()>& job);
void Stop();
bool busy();
private:
void ThreadLoop();
bool should_terminate = false; // Tells threads to stop looking for jobs
std::mutex queue_mutex; // Prevents data races to the job queue
std::condition_variable mutex_condition; // Allows threads to wait on new jobs or termination
std::vector<std::thread> threads;
std::queue<std::function<void()>> jobs;
};
ThreadPool::Start
For an efficient threadpool implementation, once threads are created according to num_threads, it's better not to
create new ones or destroy old ones (by joining). There will be a performance penalty, and it might even make your
application go slower than the serial version. Thus, we keep a pool of threads that can be used at any time (if they
aren't already running a job).
Each thread should be running its own infinite loop, constantly waiting for new tasks to grab and run.
void ThreadPool::Start() {
const uint32_t num_threads = std::thread::hardware_concurrency(); // Max # of threads the system supports
threads.resize(num_threads);
for (uint32_t i = 0; i < num_threads; i++) {
threads.at(i) = std::thread(&ThreadPool::ThreadLoop, this); // a member function needs the object pointer
}
}
ThreadPool::ThreadLoop
The infinite loop function. This is a while (true) loop waiting for the task queue to open up.
void ThreadPool::ThreadLoop() {
while (true) {
std::function<void()> job;
{
std::unique_lock<std::mutex> lock(queue_mutex);
mutex_condition.wait(lock, [this] {
return !jobs.empty() || should_terminate;
});
if (should_terminate) {
return;
}
job = jobs.front();
jobs.pop();
}
job();
}
}
ThreadPool::QueueJob
Add a new job to the pool; use a lock so that there isn't a data race.
void ThreadPool::QueueJob(const std::function<void()>& job) {
{
std::unique_lock<std::mutex> lock(queue_mutex);
jobs.push(job);
}
mutex_condition.notify_one();
}
To use it:
thread_pool->QueueJob([] { /* ... */ });
ThreadPool::busy
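bool ThreadPool::busy() {
bool poolbusy;
{
std::unique_lock<std::mutex> lock(queue_mutex);
poolbusy = !jobs.empty(); // busy while jobs are still queued
}
return poolbusy;
}
The busy() function can be used in a while loop, so that the main thread can wait for the thread pool to complete all the tasks before calling the thread pool destructor. Note that it only reports whether jobs are still queued; a job that a worker has already taken may still be running.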
ThreadPool::Stop
Stop the pool.
void ThreadPool::Stop() {
{
std::unique_lock<std::mutex> lock(queue_mutex);
should_terminate = true;
}
mutex_condition.notify_all();
for (std::thread& active_thread : threads) {
active_thread.join();
}
threads.clear();
}
Once you integrate these ingredients, you have your own dynamic threading pool. These threads always run, waiting for
jobs to do.
I apologize if there are some syntax errors; I typed this code and I have a bad memory. Sorry that I cannot provide
you the complete thread pool code; that would violate my job integrity.
Notes:
The anonymous code blocks are used so that when they are exited, the std::unique_lock variables created within them
go out of scope, unlocking the mutex.
ThreadPool::Stop will not terminate any currently running jobs, it just waits for them to finish via active_thread.join().
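For reference, the pieces can be assembled into one compilable unit roughly as follows. This is a sketch and not the answer's exact code: the class name SimplePool, the pending_ counter and wait_idle() are additions made for this illustration (they replace polling busy()), the destructor takes the role of Stop(), and the workers drain the queue before exiting.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class SimplePool {
public:
    explicit SimplePool(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            // A member function is passed together with the object pointer.
            threads_.emplace_back(&SimplePool::loop, this);
        }
    }
    ~SimplePool() {
        {
            std::lock_guard<std::mutex> lock(m_);
            stop_ = true;
        }
        work_cv_.notify_all();
        for (std::thread& t : threads_) t.join();
    }
    void queue_job(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(m_);
            jobs_.push(std::move(job));
            ++pending_;
        }
        work_cv_.notify_one();
    }
    // Blocks until every queued job has finished running.
    void wait_idle() {
        std::unique_lock<std::mutex> lock(m_);
        idle_cv_.wait(lock, [this] { return pending_ == 0; });
    }
private:
    void loop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(m_);
                work_cv_.wait(lock, [this] { return stop_ || !jobs_.empty(); });
                if (stop_ && jobs_.empty()) return;
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run outside the lock so other workers can fetch jobs
            {
                std::lock_guard<std::mutex> lock(m_);
                --pending_;
            }
            idle_cv_.notify_all();
        }
    }
    std::mutex m_;
    std::condition_variable work_cv_, idle_cv_;
    std::queue<std::function<void()>> jobs_;
    std::size_t pending_ = 0;  // jobs queued or currently running
    bool stop_ = false;
    std::vector<std::thread> threads_;
};

// The question's workload: 8 rounds of 4 tasks on the same 4 threads.
int run_rounds() {
    int arr[4] = {0};
    SimplePool pool(4);
    for (int round = 0; round < 8; ++round) {
        for (int j = 0; j < 4; ++j) {
            pool.queue_job([&arr, j] { arr[j] += 2; });
        }
        pool.wait_idle();  // resynchronize without joining
    }
    return arr[0] + arr[1] + arr[2] + arr[3];
}
```

Each of the four slots is incremented by 2 in each of the 8 rounds, so run_rounds() returns 64, and the four threads are created only once.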

You can use C++ Thread Pool Library, https://github.com/vit-vit/ctpl.
Then the code you wrote can be replaced with the following:
#include <ctpl.h> // or <ctpl_stl.h> if you do not have the Boost library
#include <algorithm>
#include <future>
#include <vector>
int main (int argc, char *argv[]) {
ctpl::thread_pool p(2 /* two threads in the pool */);
int arr[5] = {0}; // arr[0..3] are the task slots, arr[4] holds their minimum
std::vector<std::future<void>> results(4);
for (int i = 0; i < 8; ++i) { // for 8 iterations,
for (int j = 0; j < 4; ++j) {
results[j] = p.push([&arr, j](int){ arr[j] +=2; });
}
for (int j = 0; j < 4; ++j) {
results[j].get();
}
// std::min_element returns an iterator, so dereference it to get the value
arr[4] = *std::min_element(arr, arr + 4);
}
}
You will get the desired number of threads and will not create and delete them over and over again on the iterations.

A pool of threads means that all your threads are running, all the time – in other words, the thread function never returns. To give the threads something meaningful to do, you have to design a system of inter-thread communication, both for the purpose of telling the thread that there's something to do, as well as for communicating the actual work data.
Typically this will involve some kind of concurrent data structure, and each thread would presumably sleep on some kind of condition variable, which would be notified when there's work to do. Upon receiving the notification, one or several of the threads wake up, recover a task from the concurrent data structure, process it, and store the result in an analogous fashion.
The thread would then go on to check whether there's even more work to do, and if not go back to sleep.
The upshot is that you have to design all this yourself, since there isn't a natural notion of "work" that's universally applicable. It's quite a bit of work, and there are some subtle issues you have to get right. (You can program in Go if you like a system which takes care of thread management for you behind the scenes.)

A threadpool is at core a set of threads all bound to a function working as an event loop. These threads will endlessly wait for a task to be executed, or their own termination.
The threadpool job is to provide an interface to submit jobs, define (and perhaps modify) the policy of running these jobs (scheduling rules, thread instantiation, size of the pool), and monitor the status of the threads and related resources.
So for a versatile pool, one must start by defining what a task is, how it is launched, interrupted, what is the result (see the notion of promise and future for that question), what sort of events the threads will have to respond to, how they will handle them, how these events shall be discriminated from the ones handled by the tasks. This can become quite complicated as you can see, and impose restrictions on how the threads will work, as the solution becomes more and more involved.
The current tooling for handling events is fairly barebones(*): primitives like mutexes, condition variables, and a few abstractions on top of that (locks, barriers). But in some cases, these abstractions may turn out to be unfit (see this related question), and one must revert to using the primitives.
Other problems have to be managed too:
signal
i/o
hardware (processor affinity, heterogeneous setup)
How would these play out in your setting?
This answer to a similar question points to an existing implementation meant for boost and the stl.
I offered a very crude implementation of a threadpool for another question, which doesn't address many of the problems outlined above. You might want to build on it. You might also want to have a look at existing frameworks in other languages, to find inspiration.
(*) I don't see that as a problem, quite to the contrary. I think it's the very spirit of C++ inherited from C.

Following [PhD EcE](https://stackoverflow.com/users/3818417/phd-ece)'s suggestion, I implemented the thread pool:
function_pool.h
#pragma once
#include <queue>
#include <functional>
#include <mutex>
#include <condition_variable>
#include <atomic>
#include <cassert>
class Function_pool
{
private:
std::queue<std::function<void()>> m_function_queue;
std::mutex m_lock;
std::condition_variable m_data_condition;
std::atomic<bool> m_accept_functions;
public:
Function_pool();
~Function_pool();
void push(std::function<void()> func);
void done();
void infinite_loop_func();
};
function_pool.cpp
#include "function_pool.h"
Function_pool::Function_pool() : m_function_queue(), m_lock(), m_data_condition(), m_accept_functions(true)
{
}
Function_pool::~Function_pool()
{
}
void Function_pool::push(std::function<void()> func)
{
std::unique_lock<std::mutex> lock(m_lock);
m_function_queue.push(func);
// when we send the notification immediately, the consumer will try to get the lock , so unlock asap
lock.unlock();
m_data_condition.notify_one();
}
void Function_pool::done()
{
std::unique_lock<std::mutex> lock(m_lock);
m_accept_functions = false;
lock.unlock();
// when we send the notification immediately, the consumer will try to get the lock , so unlock asap
m_data_condition.notify_all();
//notify all waiting threads.
}
void Function_pool::infinite_loop_func()
{
std::function<void()> func;
while (true)
{
{
std::unique_lock<std::mutex> lock(m_lock);
m_data_condition.wait(lock, [this]() {return !m_function_queue.empty() || !m_accept_functions; });
if (!m_accept_functions && m_function_queue.empty())
{
//lock will be release automatically.
//finish the thread loop and let it join in the main thread.
return;
}
func = m_function_queue.front();
m_function_queue.pop();
//release the lock
}
func();
}
}
main.cpp
#include "function_pool.h"
#include <string>
#include <iostream>
#include <mutex>
#include <functional>
#include <thread>
#include <vector>
Function_pool func_pool;
class quit_worker_exception : public std::exception {};
void example_function()
{
std::cout << "bla" << std::endl;
}
int main()
{
std::cout << "starting operation" << std::endl;
int num_threads = std::thread::hardware_concurrency();
std::cout << "number of threads = " << num_threads << std::endl;
std::vector<std::thread> thread_pool;
for (int i = 0; i < num_threads; i++)
{
thread_pool.push_back(std::thread(&Function_pool::infinite_loop_func, &func_pool));
}
//here we should send our functions
for (int i = 0; i < 50; i++)
{
func_pool.push(example_function);
}
func_pool.done();
for (unsigned int i = 0; i < thread_pool.size(); i++)
{
thread_pool.at(i).join();
}
}

You can use thread_pool from boost library:
void my_task(){...}
int main(){
int threadNumbers = thread::hardware_concurrency();
boost::asio::thread_pool pool(threadNumbers);
// Submit a function to the pool.
boost::asio::post(pool, my_task);
// Submit a lambda object to the pool.
boost::asio::post(pool, []() {
...
});
}
You also can use threadpool from open source community:
void first_task() {...}
void second_task() {...}
int main(){
int threadNumbers = thread::hardware_concurrency();
pool tp(threadNumbers);
// Add some tasks to the pool.
tp.schedule(&first_task);
tp.schedule(&second_task);
}

Something like this might help (taken from a working app).
#include <memory>
#include <boost/asio.hpp>
#include <boost/thread.hpp>
struct thread_pool {
typedef std::unique_ptr<boost::asio::io_service::work> asio_worker;
thread_pool(int threads) :service(), service_worker(new asio_worker::element_type(service)) {
for (int i = 0; i < threads; ++i) {
auto worker = [this] { return service.run(); };
grp.add_thread(new boost::thread(worker));
}
}
template<class F>
void enqueue(F f) {
service.post(f);
}
~thread_pool() {
service_worker.reset();
grp.join_all();
service.stop();
}
private:
boost::asio::io_service service;
asio_worker service_worker;
boost::thread_group grp;
};
You can use it like this:
thread_pool pool(2);
pool.enqueue([] {
std::cout << "Hello from Task 1\n";
});
pool.enqueue([] {
std::cout << "Hello from Task 2\n";
});
Keep in mind that reinventing an efficient asynchronous queuing mechanism is not trivial.
Boost::asio::io_service is a very efficient implementation, or actually is a collection of platform-specific wrappers (e.g. it wraps I/O completion ports on Windows).

Edit: This now requires C++17 and concepts. (As of 9/12/16, only g++ 6.0+ is sufficient.)
The template deduction is a lot more accurate because of it, though, so it's worth the effort of getting a newer compiler. I've not yet found a function that requires explicit template arguments.
It also now takes any appropriate callable object (and is still statically typesafe!!!).
It also now includes an optional green threading priority thread pool using the same API. This class is POSIX only, though. It uses the ucontext_t API for userspace task switching.
I created a simple library for this. An example of usage is given below. (I'm answering this because it was one of the things I found before I decided it was necessary to write it myself.)
bool is_prime(int n){
// Determine if n is prime.
}
int main(){
thread_pool pool(8); // 8 threads
list<future<bool>> results;
for(int n = 2;n < 10000;n++){
// Submit a job to the pool.
results.emplace_back(pool.async(is_prime, n));
}
int n = 2;
for(auto i = results.begin();i != results.end();i++, n++){
// i is an iterator pointing to a future representing the result of is_prime(n)
cout << n << " ";
bool prime = i->get(); // Wait for the task is_prime(n) to finish and get the result.
if(prime)
cout << "is prime";
else
cout << "is not prime";
cout << endl;
}
}
You can pass async any function with any (or void) return value and any (or no) arguments and it will return a corresponding std::future. To get the result (or just wait until a task has completed) you call get() on the future.
Here's the github: https://github.com/Tyler-Hardin/thread_pool.
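For readers without the library at hand, the same submit-then-collect pattern can be sketched with plain std::async. This is only an approximation: std::async does not reuse threads the way a pool does, and is_prime_sketch and count_primes_below are made-up names, with trial division standing in for the elided is_prime body.

```cpp
#include <future>
#include <vector>

// Hypothetical stand-in for the elided is_prime body (simple trial division).
bool is_prime_sketch(int n) {
    if (n < 2) return false;
    for (int d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}

// Submits one task per candidate in [2, limit) and counts the primes found.
// Assumption: std::async stands in for the library's pool.async, so each
// task may get its own thread instead of sharing a fixed set of workers.
int count_primes_below(int limit) {
    std::vector<std::future<bool>> results;
    for (int n = 2; n < limit; ++n)
        results.push_back(std::async(std::launch::async, is_prime_sketch, n));

    int count = 0;
    for (auto& f : results)
        if (f.get())  // blocks until that task has finished
            ++count;
    return count;
}
```

The futures are collected in submission order, so the i-th get() always corresponds to the i-th submitted argument, just as in the loop above.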

Looks like a thread pool is a very popular problem/exercise :-)
I recently wrote one in modern C++; it’s owned by me and publicly available here - https://github.com/yurir-dev/threadpool
It supports templated return values, core pinning, and ordering of some tasks.
The whole implementation is in two .h files.
So, the code for the original question would look something like this:
#include "tp/threadpool.h"
int arr[5] = { 0 };
concurency::threadPool<void> tp;
tp.start(std::thread::hardware_concurrency());
std::vector<std::future<void>> futures;
for (int i = 0; i < 8; ++i) { // for 8 iterations,
for (int j = 0; j < 4; ++j) {
futures.push_back(tp.push([&arr, j]() {
arr[j] += 2;
}));
}
}
// wait until all pushed tasks are finished.
for (auto& f : futures)
f.get();
// or just tp.end(); // will kill all the threads
arr[4] = *std::min_element(arr, arr + 4);

I found that a pending task's future.get() call hangs on the caller side if the thread pool is terminated while some tasks are still in the task queue. How can I set an exception on the future from inside the thread pool when all it holds is the std::function wrapper?
template <class F, class... Args>
std::future<std::result_of_t<F(Args...)>> enqueue(F &&f, Args &&...args) {
// return_type was used below without being declared
using return_type = std::result_of_t<F(Args...)>;
auto task = std::make_shared<std::packaged_task<return_type()>>(
std::bind(std::forward<F>(f), std::forward<Args>(args)...));
std::future<return_type> res = task->get_future();
{
std::unique_lock<std::mutex> lock(_mutex);
_tasks.push([task]() -> void { (*task)(); });
}
return res;
}
// TASK must be a complete type before std::priority_queue<TASK> is declared
struct TASK {
//int _func_return_value;
std::function<void()> _func;
int priority;
...
};
class StdThreadPool {
std::vector<std::thread> _workers;
std::priority_queue<TASK> _tasks;
...
};
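One thing that may help here: when the last std::packaged_task referring to a shared state is destroyed without having been run, the standard library stores a std::future_error (broken_promise) in that state. So if the pool drops its pending wrappers on shutdown, the caller's get() should throw rather than hang. Below is a minimal sketch of that behavior, assuming a plain std::queue in place of the priority queue; the function name is made up.

```cpp
#include <functional>
#include <future>
#include <memory>
#include <queue>

// Returns true if a future whose task was discarded unrun reports broken_promise.
bool discarded_task_reports_broken_promise() {
    // Stand-in for the pool's task queue (assumption: plain FIFO, no priorities).
    std::queue<std::function<void()>> tasks;

    auto task = std::make_shared<std::packaged_task<int()>>([] { return 42; });
    std::future<int> result = task->get_future();
    tasks.push([task] { (*task)(); });
    task.reset();  // the queued wrapper now holds the only reference

    // Simulate pool shutdown: discard the pending wrappers without running them.
    // Destroying the last packaged_task abandons its shared state.
    std::queue<std::function<void()>>().swap(tasks);

    try {
        result.get();  // does not block: the stored exception is rethrown
    } catch (const std::future_error& e) {
        return e.code() == std::make_error_code(std::future_errc::broken_promise);
    }
    return false;
}
```

If get() still hangs in the real pool, a likely cause is that the queue (and with it the wrappers) outlives the worker threads, so the packaged_task objects are never destroyed.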

The Stroika library has a threadpool implementation.
Stroika ThreadPool.h
ThreadPool p;
p.AddTask ([] () {doIt ();});
Stroika's thread library also supports cooperative cancellation, so when the ThreadPool above goes out of scope, it cancels any running tasks (similar to C++20's std::jthread).

Is there a way to cancel/detach a future in C++11?

I have the following code:
#include <iostream>
#include <future>
#include <chrono>
#include <thread>
using namespace std;
int sleep_10s()
{
this_thread::sleep_for(chrono::seconds(10));
cout << "Sleeping Done\n";
return 3;
}
int main()
{
auto result=async(launch::async, sleep_10s);
auto status=result.wait_for(chrono::seconds(1));
if (status==future_status::ready)
cout << "Success" << result.get() << "\n";
else
cout << "Timeout\n";
}
This is supposed to wait 1 second, print "Timeout", and exit. Instead of exiting, it waits an additional 9 seconds, prints "Sleeping Done", and then segfaults. Is there a way to cancel or detach the future so my code will exit at the end of main instead of waiting for the future to finish executing?

The C++11 standard does not provide a direct way to cancel a task started with std::async. You will have to implement your own cancellation mechanism, such as passing in an atomic flag variable to the async task which is periodically checked.
Your code should not crash though. On reaching the end of main, the std::future<int> object held in result is destroyed, which will wait for the task to finish, and then discard the result, cleaning up any resources used.
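The blocking destructor can be observed directly. The sketch below (scope_duration_ms is a made-up helper) starts a task with launch::async, gives up waiting almost immediately, and then measures how long it takes to leave the scope that owns the future.

```cpp
#include <chrono>
#include <future>
#include <thread>

// Returns how long the inner scope took to complete, in milliseconds.
long long scope_duration_ms(int task_ms) {
    auto start = std::chrono::steady_clock::now();
    {
        auto result = std::async(std::launch::async, [task_ms] {
            std::this_thread::sleep_for(std::chrono::milliseconds(task_ms));
            return 3;
        });
        // Stop waiting after 1 ms, much like the 1 s timeout in the question.
        result.wait_for(std::chrono::milliseconds(1));
    }  // the future's destructor blocks here until the task has finished
    auto elapsed = std::chrono::steady_clock::now() - start;
    return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count();
}
```

Calling scope_duration_ms(200) should report roughly 200 ms or more even though the explicit wait was only 1 ms. This applies to futures obtained from std::async; futures obtained from a std::promise or std::packaged_task do not block in their destructor.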

Here is a simple example using an atomic bool to cancel one or more futures at the same time. The atomic bool may be wrapped inside a Cancellation class (depending on taste).
#include <atomic>
#include <chrono>
#include <functional>
#include <future>
#include <iostream>
#include <thread>
using namespace std;
int long_running_task(int target, const std::atomic_bool& cancelled)
{
// simulate a long running task for target*100ms,
// the task should check for cancelled often enough!
while(target-- && !cancelled)
this_thread::sleep_for(chrono::milliseconds(100));
// return results to the future or raise an error
// in case of cancellation
return cancelled ? 1 : 0;
}
int main()
{
// brace initialization; ATOMIC_VAR_INIT is deprecated since C++20
std::atomic_bool cancellation_token{false};
auto task_10_seconds= async(launch::async,
long_running_task,
100,
std::ref(cancellation_token));
auto task_500_milliseconds = async(launch::async,
long_running_task,
5,
std::ref(cancellation_token));
// do something else (should allow short task
// to finish while the long task will be cancelled)
this_thread::sleep_for(chrono::seconds(1));
// cancel
cancellation_token = true;
// wait for cancellation/results
cout << task_10_seconds.get() << " "
<< task_500_milliseconds.get() << endl;
}

I know this is an old question, but it still comes up as the top result for "detach std::future" when searching. I came up with a simple template based approach to handle this:
template <typename RESULT_TYPE, typename FUNCTION_TYPE>
std::future<RESULT_TYPE> startDetachedFuture(FUNCTION_TYPE func) {
std::promise<RESULT_TYPE> pro;
std::future<RESULT_TYPE> fut = pro.get_future();
std::thread([func](std::promise<RESULT_TYPE> p){p.set_value(func());},
std::move(pro)).detach();
return fut;
}
and you use it like so:
int main(int argc, char ** argv) {
auto returner = []{fprintf(stderr, "I LIVE!\n"); sleep(10); return 123;};
std::future<int> myFuture = startDetachedFuture<int, decltype(returner)>(returner);
sleep(1);
}
output:
$ ./a.out
I LIVE!
$
If myFuture goes out of scope and is destructed, the thread will carry on doing whatever it was doing without causing problems because it owns the std::promise and its shared state. Good for occasions where you only sometimes would prefer to ignore the result of a computation and move on (my use case).
To the OP's question: if you get to the end of main it will exit without waiting for the future to finish.
This macro is unnecessary but saves on typing if you are going to call this frequently.
// convenience macro to save boilerplate template code
#define START_DETACHED_FUTURE(func) \
startDetachedFuture<decltype(func()), decltype(func)>(func)
// works like so:
auto myFuture = START_DETACHED_FUTURE(myFunc);
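One caveat with the template above: if func throws, the exception escapes the detached thread and the program terminates. A possible variant that forwards the exception through the promise is sketched below; startDetachedFutureSafe is a made-up name, and like the original it assumes a non-void RESULT_TYPE.

```cpp
#include <exception>
#include <future>
#include <stdexcept>
#include <thread>
#include <utility>

// Hypothetical variant of startDetachedFuture: same detached-thread idea,
// but an exception thrown by func is stored in the promise instead of
// escaping the thread.
template <typename RESULT_TYPE, typename FUNCTION_TYPE>
std::future<RESULT_TYPE> startDetachedFutureSafe(FUNCTION_TYPE func) {
    std::promise<RESULT_TYPE> pro;
    std::future<RESULT_TYPE> fut = pro.get_future();
    std::thread([func](std::promise<RESULT_TYPE> p) {
        try {
            p.set_value(func());
        } catch (...) {
            // get() on the returned future will rethrow this exception
            p.set_exception(std::current_exception());
        }
    }, std::move(pro)).detach();
    return fut;
}
```

Since FUNCTION_TYPE can be deduced from the argument, a call such as startDetachedFutureSafe<int>([] { return 123; }) only needs the result type spelled out.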
