Thread pool gets stuck - C++

I created a thread pool to distribute 100 computations among 4 threads.
I cannot understand why the following code gets stuck after 4 computations. After each computation the thread should be released, so I expect .joinable() to return false and the program to continue.
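#include <string>
#include <iostream>
#include <vector>
#include <thread>
#include <boost/date_time/posix_time/posix_time.hpp>
#include <boost/thread/thread.hpp>
#include <cmath>
class AClass
{
public:
    void calculation_single(std::vector<double> *s,int index)
    {
        (*s)[index]=std::sqrt(double(index)); // placeholder computation (assumed)
        std::cout<<"["<<index<<"] calculated \n";
    }
    void calculation()
    {
        const uint N_nums=100;
        const uint N_threads=4;
        std::vector<double> A;
        A.assign(N_nums,0.0);
        std::vector<std::thread> thread_pool;
        for(uint i=0;i<N_threads;i++)
            thread_pool.push_back(std::thread());
        uint A_index=0;
        while(A_index<N_nums)
        {
            int free_thread=-1;
            for(uint i=0;i<N_threads && free_thread<0;i++)
                if(!thread_pool[i].joinable())
                    free_thread=i;
            if(free_thread>-1)
                thread_pool[free_thread]=std::thread(&AClass::calculation_single,this,&A,A_index++);
            else
                boost::this_thread::sleep(boost::posix_time::milliseconds(1));
        }
        // wait for tasks to finish
        for(std::thread& th : thread_pool)
            th.join();
    }
};
int main()
{
    AClass obj;
    obj.calculation();
    return 0;
}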

A thread is joinable if it isn't empty, basically.
A thread with a completed task is not empty.
std::thread bob;
bob is not joinable.
Your threads are. Nothing you do makes them not joinable.
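To make the distinction concrete, here is a minimal sketch (the helper name joinable_states is made up for illustration). It samples joinable() on a default-constructed thread, on a thread whose task has already finished, and on the same thread after join():

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: samples joinable() at three points in a thread's life.
std::vector<bool> joinable_states()
{
    std::vector<bool> states;

    std::thread empty;                        // no thread of execution attached
    states.push_back(empty.joinable());       // false

    std::atomic<bool> finished{false};
    std::thread worker([&finished] { finished = true; });
    while (!finished)                         // wait until the task body has run
        std::this_thread::yield();
    states.push_back(worker.joinable());      // still true: finishing does not reset it

    worker.join();                            // join() (or detach()) is what clears it
    states.push_back(worker.joinable());      // false
    return states;
}
```

So a slot in the question's thread_pool only becomes reusable after it has been joined (or detached); joinable() is not a "has finished" test.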
Also, busy waiting makes for a crappy thread pool.
Create a producer-consumer queue, with a pool of threads consuming tasks and an abort method. Feed tasks into the queue via a packaged task and return a std::future<T>. Don't spawn a new thread per task.
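A rough sketch of what such a pool could look like, using only the standard library. The names TaskPool, submit and compute_all are invented here, and the destructor drains the queue rather than aborting, so treat it as a starting point rather than a finished design:

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Minimal fixed-size pool: workers block on a queue instead of busy waiting.
class TaskPool {
public:
    explicit TaskPool(unsigned n_threads) {
        for (unsigned i = 0; i < n_threads; ++i)
            workers_.emplace_back([this] { run(); });
    }

    // Stops accepting wake-ups for new work, lets queued tasks drain, joins the workers.
    ~TaskPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (std::thread& t : workers_) t.join();
    }

    // Wraps the callable in a packaged_task and hands its future to the caller.
    template <typename F>
    auto submit(F f) -> std::future<decltype(f())> {
        using R = decltype(f());
        auto task = std::make_shared<std::packaged_task<R()>>(std::move(f));
        std::future<R> result = task->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push([task] { (*task)(); });
        }
        cv_.notify_one();
        return result;
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
                if (queue_.empty()) return;  // stopping and nothing left to do
                job = std::move(queue_.front());
                queue_.pop();
            }
            job();  // run outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> queue_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};

// Usage sketch: 100 computations on 4 threads, as in the question.
std::vector<double> compute_all() {
    TaskPool pool(4);
    std::vector<std::future<double>> futures;
    for (int i = 0; i < 100; ++i)
        futures.push_back(pool.submit([i] { return i * 0.5; }));
    std::vector<double> results;
    for (auto& f : futures) results.push_back(f.get());
    return results;
}
```

The four threads are created once and reused, and the caller waits on futures instead of polling joinable().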


What could be a better alternative to condition_variables

I am trying to write a multithreaded function; it looks like this:
namespace { // Anonymous namespace instead of static functions.
std::mutex log_mutex;
void Background() {
std::queue<std::string> log_records;
// Exchange data for minimizing lock time.
std::unique_lock lock(log_mutex);
if (log_records.empty()) {
void Log(std::string log){
std::unique_lock lock(log_mutex);
I use Sleep to avoid high CPU usage from looping continuously even when there are no logs. But this has a very visible drawback: the logs are printed in batches. I tried to get around this with condition variables, but then, if there are too many logs in a short time, the waiting thread is put to sleep and woken up many times, leading to even more CPU usage. What can I do to solve this issue?
You can assume there may be many calls to Log per second.
I would probably think of using a counting semaphore for this:
The semaphore would keep a count of the number of messages in the logs (initially zero).
Log clients would write a message and increment by one the number of messages by releasing the semaphore.
A log server would do an acquire on the semaphore, blocking until there was any message in the logs, and then decrementing by one the number of messages.
Log clients get the logs queue lock, push a message, and only then do the release on the semaphore.
The log server can do the acquire before getting the logs queue lock; this would be possible even if there were more readers. For instance: 1 message in the log queue, server 1 does an acquire, server 2 does an acquire and blocks because semaphore count is 0, server 1 goes on and gets the logs queue lock...
#include <algorithm> // for_each
#include <atomic>
#include <chrono> // chrono_literals
#include <future> // async, future
#include <iostream> // cout
#include <mutex> // mutex, unique_lock
#include <queue>
#include <semaphore> // counting_semaphore
#include <string>
#include <thread> // sleep_for
#include <utility> // move
#include <vector>
std::mutex mtx{};
std::queue<std::string> logs{};
std::counting_semaphore c_semaphore{ 0 };
int main()
{
    // Client side: push under the queue lock, then release the semaphore.
    auto log = [](std::string message) {
        std::unique_lock lock{ mtx };
        logs.push(std::move(message));
        c_semaphore.release();
    };
    auto log_client = [&log]() {
        using namespace std::chrono_literals;
        static std::atomic<size_t> s_id{ 1 }; // atomic: several clients start at once
        size_t id{ s_id++ };
        for (;;)
        {
            log(std::to_string(id));
            std::this_thread::sleep_for(id * 100ms);
        }
    };
    // Server side: acquire first (blocks while there are no messages), then lock and pop.
    auto log_server = []() {
        for (;;)
        {
            c_semaphore.acquire();
            std::unique_lock lock{ mtx };
            std::cout << logs.front() << " ";
            logs.pop();
        }
    };
    std::vector<std::future<void>> log_clients(10);
    std::for_each(std::begin(log_clients), std::end(log_clients),
        [&log_client](auto& lc_fut) {
            lc_fut = std::async(std::launch::async, log_client);
        });
    auto ls_fut{ std::async(std::launch::async, log_server) };
    std::for_each(std::begin(log_clients), std::end(log_clients),
        [](auto& lc_fut) { lc_fut.wait(); });
    ls_fut.wait();
}
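The example above runs forever, which makes it awkward to experiment with. Below is a bounded sketch of the same acquire/release protocol. SimpleSemaphore is a hand-rolled stand-in (an assumption, so that it also builds without C++20's <semaphore>), and run_logging is a made-up driver that stops once every message has been consumed:

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Stand-in for std::counting_semaphore so the sketch builds before C++20.
class SimpleSemaphore {
public:
    void release() {
        { std::lock_guard<std::mutex> lock(m_); ++count_; }
        cv_.notify_one();
    }
    void acquire() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::ptrdiff_t count_ = 0;  // number of messages waiting in the queue
};

// Hypothetical driver: n_clients threads each log n_messages, one server drains them.
std::vector<std::string> run_logging(int n_clients, int n_messages) {
    std::mutex mtx;
    std::queue<std::string> logs;
    SimpleSemaphore pending;
    std::vector<std::string> written;  // touched only by the server thread

    const int total = n_clients * n_messages;
    std::thread server([&] {
        for (int i = 0; i < total; ++i) {
            pending.acquire();                      // sleeps until a message exists
            std::lock_guard<std::mutex> lock(mtx);
            written.push_back(std::move(logs.front()));
            logs.pop();
        }
    });

    std::vector<std::thread> clients;
    for (int c = 0; c < n_clients; ++c)
        clients.emplace_back([&, c] {
            for (int m = 0; m < n_messages; ++m) {
                {
                    std::lock_guard<std::mutex> lock(mtx);
                    logs.push("client " + std::to_string(c) + " msg " + std::to_string(m));
                }
                pending.release();                  // publish only after the push
            }
        });

    for (std::thread& t : clients) t.join();
    server.join();
    return written;
}
```

The server never spins and never sleeps for a fixed interval: it is blocked in acquire() exactly while the queue is empty.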

Modern C++. Return data structure from working thread continuing its execution

I need to launch a worker thread, perform some initialization, return a data structure as the initialization result, and continue the thread's execution. What is the best (or a possible) way to achieve this using only modern C++ features? Note that the launched thread should continue executing (it does not terminate after returning the result). Unfortunately, most solutions assume the worker thread terminates.
Pseudo code:
// Executes in WorkerThread context
void SomeClass::Worker_treadfun_with_init()
{
    // 1. Initialization calls...
    // 2. Pass/signal initialization results to caller
    // 3. Continue execution of WorkerThread
}
// Executes in CallerThread context
void SomeClass::Caller()
{
    // 1. Create WorkerThread with "SomeClass::Worker_treadfun_with_init()" thread function
    // 2. Sleep thread for some initialization results
    // 3. Grab results
    // 4. Continue execution of CallerThread
}
I think std::future meets your requirements.
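// Executes in WorkerThread context
void SomeClass::Worker_treadfun_with_init(std::promise<Result> &pro)
{
    // 1. Initialization calls...
    // 2. Pass/signal initialization results to caller
    pro.set_value(result); // result: the Result built during initialization
    // 3. Continue execution of WorkerThread
}
// Executes in CallerThread context
void SomeClass::Caller()
{
    // 1. Create WorkerThread with "SomeClass::Worker_treadfun_with_init()" thread function
    std::promise<Result> pro;
    auto f=pro.get_future();
    // worker_: assumed std::thread member, joined later; pro must stay alive until set_value() has run
    worker_=std::thread(&SomeClass::Worker_treadfun_with_init,this,std::ref(pro));
    // 2. Block until the worker has published its result
    auto result=f.get();
    // 3. Grab results
    // 4. Continue execution of CallerThread
}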
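For reference, a self-contained version of the same idea might look like this. Result, init_then_run and start_and_wait are placeholder names, the stop flag exists only so the sketch can terminate, and the promise is moved into the thread here (rather than passed by reference) so that its lifetime does not depend on the caller:

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <thread>
#include <utility>
#include <vector>

using Result = std::vector<int>;  // hypothetical initialization result

// Worker: initialise, publish the result through the promise, then keep running.
void init_then_run(std::promise<Result> ready, std::atomic<bool>& stop)
{
    Result init{1, 2, 3};               // 1. initialization
    ready.set_value(std::move(init));   // 2. hand the result to the caller
    while (!stop)                       // 3. continue with the thread's own work
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
}

// Caller: blocks only until initialization is done, not until the worker exits.
Result start_and_wait()
{
    std::atomic<bool> stop{false};
    std::promise<Result> ready;
    std::future<Result> fut = ready.get_future();

    std::thread t(init_then_run, std::move(ready), std::ref(stop));
    Result result = fut.get();          // returns once set_value() has been called

    // ... the caller continues here while the worker is still running ...
    assert(t.joinable());

    stop = true;                        // ask the worker to finish, then join it
    t.join();
    return result;
}
```

In real code the thread and the stop flag would more likely be members of SomeClass, so the worker can outlive the call that started it.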
Try using a pointer or reference to the data structure with the answer in it, and std::condition_variable to let you know when the answer has been computed:
#include <iostream>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <chrono>
#include <vector>
std::vector<double> g_my_answer;
std::mutex g_mtx;
std::condition_variable g_cv;
bool g_ready = false;
void Worker_treadfun_with_init()
{
    //Do your initialization here
    {
        std::unique_lock<std::mutex> lck( g_mtx );
        for( double val = 0; val < 10; val += 0.3 )
            g_my_answer.push_back( val );
        g_ready = true;
    }
    //Wake the caller now that the answer is ready
    g_cv.notify_one();
    //Keep doing your other work..., here we'll just sleep
    for( int i = 0; i < 100; ++i )
        std::this_thread::sleep_for( std::chrono::seconds(1) );
}
void Caller()
{
    std::unique_lock<std::mutex> lck(g_mtx);
    std::thread worker_thread = std::thread( Worker_treadfun_with_init );
    //Calling wait will cause current thread to sleep until g_cv.notify_one() is called and g_ready is true.
    //(g_ready is a global, so the lambda must not capture it.)
    g_cv.wait( lck, [](){ return g_ready; } );
    //Print out the answer as the worker thread continues doing its work
    for( auto val : g_my_answer )
        std::cout << val << std::endl;
    //Unlock mutex (or better yet have unique_lock go out of scope)
    // in case the worker thread needs to lock again to finish
    lck.unlock();
    //Make sure to join the worker thread some time later on.
    worker_thread.join();
}
Of course, in actual code you wouldn't use global variables; you would instead pass them by pointer or reference (or as member variables of SomeClass) to the worker function, but you get the point.
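As a sketch of that last remark, the shared pieces can be bundled into one struct and handed to the worker by reference. InitState, worker_with_state and caller_with_state are names made up for this example:

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical bundle of the shared state that would otherwise live in globals.
struct InitState {
    std::mutex mtx;
    std::condition_variable cv;
    bool ready = false;
    std::vector<double> answer;
};

void worker_with_state(InitState& state)
{
    {
        std::lock_guard<std::mutex> lock(state.mtx);
        for (int i = 0; i < 5; ++i)
            state.answer.push_back(i * 0.5);
        state.ready = true;
    }
    state.cv.notify_one();
    // ... the worker would carry on with its remaining work here ...
}

std::vector<double> caller_with_state()
{
    InitState state;
    std::thread t(worker_with_state, std::ref(state));
    std::vector<double> copy;
    {
        std::unique_lock<std::mutex> lock(state.mtx);
        state.cv.wait(lock, [&state] { return state.ready; });
        copy = state.answer;   // copy while holding the lock
    }
    t.join();                  // state must outlive the worker
    return copy;
}
```

The one thing to watch is lifetime: whatever owns the struct has to outlive the worker thread, which is why the sketch joins before returning.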

C++ thread that starts several threads

I am trying to write a program that has to run 2 tasks periodically.
That is, for example, run task 1 every 10 seconds, and run task 2 every 20 seconds.
What I am thinking is to create two threads, each one with a timer. Thread 1 launches a new thread with task 1 every 10 seconds, and Thread 2 launches a new thread with task 2 every 20 seconds.
My question is: how do I launch a new task 1 if the previous task 1 hasn't finished?
while (true)
thread t1 (task1);
I was trying this, but this way it will only launch a new task 1 when the previous one finishes.
Basically I want to implement a task scheduler.
Run task1 every X seconds.
Run task2 every Y seconds.
I was thinking of something like this:
thread t1 (timer1);
thread t2 (timer2);
void timer1()
while (true)
thread t (task1);
the same for timer2 and task2
Perhaps you could create a periodic_task handler that is responsible for scheduling one task every t seconds. And then you can launch a periodic_task with a specific function and time duration from anywhere you want to in your program.
Below I've sketched something out. One valid choice is to detach the thread and let it run forever. Another is to include cancellation to allow the parent thread to cancel/join. I've included functionality to allow the latter (though you could still just detach/forget).
#include <chrono>
#include <condition_variable>
#include <functional>
#include <iostream>
#include <mutex>
#include <string>
#include <thread>
class periodic_task
{
    std::chrono::seconds d_;
    std::function<void()> task_;
    std::mutex mut_;
    std::condition_variable cv_;
    bool cancel_{false};
public:
    periodic_task(std::function<void()> task, std::chrono::seconds s)
        : d_{s}
        , task_(std::move(task))
        {}
    // Thread entry point: run task_ every d_ until cancel() is called.
    void operator()()
    {
        std::unique_lock<std::mutex> lk{mut_};
        auto until = std::chrono::steady_clock::now();
        while (true)
        {
            while (!cancel_ && std::chrono::steady_clock::now() < until)
                cv_.wait_until(lk, until);
            if (cancel_)
                return;
            lk.unlock();   // don't hold the lock while the task runs
            task_();
            lk.lock();
            until += d_;
        }
    }
    void cancel()
    {
        std::unique_lock<std::mutex> lk{mut_};
        cancel_ = true;
        cv_.notify_one();
    }
};
void short_task()
{
    std::cerr << "short\n";
}
void long_task(int i, const std::string& message)
{
    std::cerr << "long " << message << ' ' << i << '\n';
}
int main()
{
    using namespace std::chrono_literals;
    periodic_task task_short{short_task, 7s};
    periodic_task task_long{[](){long_task(5, "Hi");}, 13s};
    std::thread t1{std::ref(task_short)};
    std::thread t2{std::ref(task_long)};
    std::this_thread::sleep_for(60s);   // arbitrary run time before cancelling
    task_short.cancel();
    task_long.cancel();
    t1.join();
    t2.join();
}
You want to avoid using thread::join(): by definition, it waits for the thread to finish. Instead, call thread::detach() before sleeping, so the loop doesn't need to wait.
I'd suggest reading up on it.
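A minimal sketch of the detach approach, assuming it is acceptable for runs of the task to overlap. launch_every is a hypothetical helper; the shared counter is there only so the sketch can tell when the detached threads are done, since they can no longer be joined:

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <memory>
#include <thread>

// Hypothetical timer loop: starts `task` on a fresh detached thread every
// `period`, without waiting for the previous run to finish.
// Returns the number of runs that completed.
int launch_every(std::chrono::milliseconds period, int launches,
                 std::function<void()> task)
{
    auto finished = std::make_shared<std::atomic<int>>(0);
    for (int i = 0; i < launches; ++i) {
        std::thread([task, finished] {
            task();
            ++*finished;
        }).detach();                       // the loop does not wait for this run
        std::this_thread::sleep_for(period);
    }
    // Detached threads cannot be joined, so poll the counter instead.
    while (finished->load() < launches)
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    return finished->load();
}
```

Keep in mind that this starts one thread per run with no upper bound, so a task that regularly takes longer than its period will pile up threads.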

Simplest/Effective approach of calling back a method from a lib file

I am currently calling some methods from an external lib file. Is there a way for these methods to call back functions in my application once they are done, given that these methods might be running in separate threads?
The following diagram shows what I am trying to achieve
I wanted to know what the best way is to send a message back to the calling application. Are there any Boost components that might help?
Update after the edit:
It's not clear what you have. Do you control the thread entry point for the thread started by the external library (this would surprise me)?
If the library function accepts a callback, you control neither the source of the library function nor the thread function it starts in a background thread, and you want to have the callback processed on the original thread, then you could have the callback store a record in some kind of queue that you regularly check from the main thread (no busy loops, of course). Use a lock-free queue, or synchronize access to the queue using e.g. a std::mutex.
Update: Here's such a queuing version (Live on Coliru as well):
#include <chrono>
#include <cstdlib>
#include <functional>
#include <thread>
#include <vector>
// fake external library taking a callback
extern void library_function(int, void(*cb)(int,int));
// our client code
#include <iostream>
#include <mutex>
void callback_handler(int i, int input)
{
    static std::mutex mx;
    std::lock_guard<std::mutex> lk(mx);
    std::cout << "Callback #" << i << " from task for input " << input << "\n";
}
// callback queue
#include <deque>
#include <future>
namespace {
    using pending_callback = std::packaged_task<void()>;
    std::deque<pending_callback> callbacks;
    std::mutex callback_mutex;
    // Runs every queued callback on the calling thread; returns how many ran.
    int process_pending_callbacks() {
        std::lock_guard<std::mutex> lk(callback_mutex);
        int processed = 0;
        while (!callbacks.empty()) {
            callbacks.front()();
            ++processed;
            callbacks.pop_front();
        }
        return processed;
    }
    void enqueue(pending_callback cb) {
        std::lock_guard<std::mutex> lk(callback_mutex);
        callbacks.push_back(std::move(cb));
    }
}
// this wrapper to "fake" a callback (instead queuing the real
// callback_handler)
void queue_callback(int i, int input)
{
    enqueue(pending_callback(std::bind(callback_handler, i, input)));
}
int main()
{
    // do something with delayed processing:
    library_function(3, queue_callback);
    library_function(5, queue_callback);
    // wait for completion, periodically checking for pending callbacks
    for (
        int still_pending = 3 + 5;
        still_pending > 0;
        std::this_thread::sleep_for(std::chrono::milliseconds(10))) // no busy wait
    {
        still_pending -= process_pending_callbacks();
    }
}
// somewhere, in another library:
void library_function(int some_input, void(*cb)(int,int))
{
    std::thread([=] {
        for (int i = 1; i <= some_input; ++i) {
            std::this_thread::sleep_for(std::chrono::milliseconds(rand() % 5000)); // TODO abolish rand()
            cb(i, some_input);
        }
    }).detach();
}
Typical output:
Callback #1 from task for input 5
Callback #2 from task for input 5
Callback #1 from task for input 3
Callback #3 from task for input 5
Callback #2 from task for input 3
Callback #4 from task for input 5
Callback #5 from task for input 5
Callback #3 from task for input 3
Note that output from the two worker threads is interspersed, but because the callbacks queue is FIFO, the sequence of callbacks per worker thread is preserved.
This is what I thought of, before you edited the question: Live on Coliru
#include <functional>
#include <thread>
#include <vector>
extern int library_function(bool);
static std::vector<std::thread> workers; // TODO implement a proper pool
void await_workers()
{
    for(auto& th: workers)
        if (th.joinable()) th.join();
}
// Runs f on a new thread and passes its result to continuation on that same thread.
template <typename F, typename C>
void do_with_continuation(F f, C continuation)
{
    workers.emplace_back([=] () mutable {
            auto result = f();
            continuation(result);
        });
}
#include <iostream>
#include <mutex>
void callback(int result)
{
    static std::mutex mx;
    std::lock_guard<std::mutex> lk(mx);
    std::cout << "Resulting value from callback " << result << "\n";
}
int main()
{
    // do something with delayed processing:
    do_with_continuation(std::bind(library_function, false), callback);
    do_with_continuation(std::bind(library_function, true), callback);
    await_workers();
}
// somewhere, in another library:
#include <chrono>
int library_function(bool some_input)
{
    std::this_thread::sleep_for(std::chrono::seconds(some_input? 6 : 3));
    return some_input ? 42 : 0;
}
It will always print the output in the order:
Resulting value from callback 0
Resulting value from callback 42
Notes:
Make sure you synchronize access to shared state from within such a callback (in this case, std::cout is protected by a lock).
You'd want to make a thread pool, instead of an ever-growing vector of (used) threads.

Setting limit on post queue size with Boost Asio?

I'm using boost::asio::io_service as a basic thread pool. Some threads get added to io_service, the main thread starts posting handlers, the worker threads start running the handlers, and everything finishes. So far, so good; I get a nice speedup over single-threaded code.
However, the main thread has millions of things to post. And it just keeps on posting them, much faster than the worker threads can handle them. I don't hit RAM limits, but it's still kind of silly to be enqueuing so many things. What I'd like to do is have a fixed size for the handler queue, and have post() block if the queue is full.
I don't see any options for this in the Boost ASIO docs. Is this possible?
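I'm using a semaphore to bound the size of the handler queue. The following code illustrates this solution:
void Schedule(boost::function<void()> function) {
    semaphore.wait();  // blocks while the queue is full
    io_service.post(boost::bind(&TaskWrapper, function));
}
void TaskWrapper(boost::function<void()> &function) {
    function(); semaphore.post();  // run the task, then free a slot
}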
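Here is a standard-library-only sketch of the same back-pressure idea, without Boost. BoundedPool, peak_in_flight and run_demo are invented for this example; the free-slot counter guarded by a mutex and condition variable plays the role of the semaphore, so post() blocks while `capacity` tasks are queued or running:

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class BoundedPool {
public:
    BoundedPool(unsigned n_threads, std::size_t capacity)
        : capacity_(capacity), free_slots_(capacity) {
        for (unsigned i = 0; i < n_threads; ++i)
            workers_.emplace_back([this] { run(); });
    }
    ~BoundedPool() {
        { std::lock_guard<std::mutex> lock(m_); stopping_ = true; }
        not_empty_.notify_all();
        for (std::thread& t : workers_) t.join();
    }
    // Blocks until a slot is free (the semaphore wait step), then queues the task.
    void post(std::function<void()> task) {
        {
            std::unique_lock<std::mutex> lock(m_);
            slot_free_.wait(lock, [this] { return free_slots_ > 0; });
            --free_slots_;
            peak_ = std::max(peak_, capacity_ - free_slots_);
            queue_.push(std::move(task));
        }
        not_empty_.notify_one();
    }
    std::size_t peak_in_flight() {
        std::lock_guard<std::mutex> lock(m_);
        return peak_;
    }
private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_);
                not_empty_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
                if (queue_.empty()) return;      // stopping and fully drained
                task = std::move(queue_.front());
                queue_.pop();
            }
            task();
            { std::lock_guard<std::mutex> lock(m_); ++free_slots_; }  // the semaphore post step
            slot_free_.notify_one();
        }
    }
    const std::size_t capacity_;
    std::size_t free_slots_;
    std::size_t peak_ = 0;
    bool stopping_ = false;
    std::mutex m_;
    std::condition_variable slot_free_, not_empty_;
    std::queue<std::function<void()>> queue_;
    std::vector<std::thread> workers_;
};

// Usage sketch: returns {tasks completed, peak number of tasks in flight}.
std::pair<int, std::size_t> run_demo(int n_tasks) {
    std::atomic<int> done{0};
    std::size_t peak = 0;
    {
        BoundedPool pool(2, 4);
        for (int i = 0; i < n_tasks; ++i)
            pool.post([&done] { ++done; });   // blocks whenever 4 tasks are pending
        peak = pool.peak_in_flight();
    }   // destructor drains the queue and joins the workers
    return {done.load(), peak};
}
```

The producer is throttled to the pace of the workers, so the queue never holds more than `capacity` pending tasks no matter how many are posted.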
You can wrap your lambda in another lambda which would take care of counting the "in-progress" tasks, and then wait before posting if there are too many in-progress tasks.
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <future>
#include <iostream>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <vector>
#include <boost/asio.hpp>
class ThreadPool {
  using asio_worker = std::unique_ptr<boost::asio::io_service::work>;
  boost::asio::io_service service;
  asio_worker service_worker;
  std::vector<std::thread> grp;
  std::atomic<int> inProgress{0};
  std::mutex mtx;
  std::condition_variable busy;
  // Decrement under the mutex so a waiting enqueue() cannot miss the notification.
  void taskDone() {
    { std::lock_guard<std::mutex> lock(mtx); inProgress--; }
    busy.notify_one();
  }
public:
  ThreadPool(int threads) : service(), service_worker(new asio_worker::element_type(service)) {
    for (int i = 0; i < threads; ++i) {
      grp.emplace_back([this] { service.run(); });
    }
  }
  template<typename F>
  void enqueue(F && f) {
    std::unique_lock<std::mutex> lock(mtx);
    // limit queue depth = number of threads
    while (inProgress >= static_cast<int>(grp.size())) {
      busy.wait(lock);
    }
    inProgress++;
    service.post([this, f = std::forward<F>(f)]{
      try {
        f();
      }
      catch (...) {
        taskDone();
        throw;
      }
      taskDone();
    });
  }
  ~ThreadPool() {
    service_worker.reset();  // let run() return once the queue is empty
    for (auto& t : grp)
      if (t.joinable())
        t.join();
    service.stop();
  }
};
int main() {
  std::unique_ptr<ThreadPool> pool(new ThreadPool(4));
  for (int i = 1; i <= 20; ++i) {
    pool->enqueue([i] {
      std::string s("Hello from task ");
      s += std::to_string(i) + "\n";
      std::cout << s;
      std::this_thread::sleep_for(std::chrono::seconds(1));
    });
  }
  std::cout << "All tasks queued.\n";
  pool.reset(); // wait for all tasks to complete
  std::cout << "Done.\n";
}
Hello from task 3
Hello from task 4
Hello from task 2
Hello from task 1
Hello from task 5
Hello from task 7
Hello from task 6
Hello from task 8
Hello from task 9
Hello from task 10
Hello from task 11
Hello from task 12
Hello from task 13
Hello from task 14
Hello from task 15
Hello from task 16
Hello from task 17
Hello from task 18
All tasks queued.
Hello from task 19
Hello from task 20
You could use a strand object to queue the events and add a delay in your main thread. Is your program exiting after all the work is posted? If so, you can use the work object, which will give you more control over when your io_service stops.
You could also have main check the state of the threads and wait until one becomes free, or something like that.
//example from the second link
boost::asio::io_service io_service;
boost::asio::io_service::work work(io_service);
Hope this helps.
Maybe try lowering the priority of the main thread so that once the worker threads get busy they starve the main thread and the system self limits.