Boost threads running serially, not in parallel - c++

I'm a complete newbie to multi-threading in C++, and decided to start with the Boost Libraries. Also, I'm using Intel's C++ Compiler (from Parallel Studio 2011) with VS2010 on Vista.
I'm coding a genetic algorithm, and want to exploit the benefits of multi-threading: I want to create a thread for each individual (object) in the population, in order for them to calculate their fitness (heavy operations) in parallel, to reduce total execution time.
As I understand it, whenever I launch a child thread it starts working "in the background", and the parent thread continues to execute the next instruction, right? So, I thought of creating and launching all the child threads I need (in a for loop), and then waiting for them to finish (calling each thread's join() in another for loop) before continuing.
The problem I'm facing is that the first loop won't continue to the next iteration until the newly created thread is done working. Then, the second loop is as good as gone, since all the threads are already joined by the time that loop is hit.
Here are (what I consider to be) the relevant code snippets. Tell me if there is anything else you need to know.
class Poblacion {
// Constructors, destructor and other members
// ...
list<Individuo> _individuos;
void generaInicial() { // This method sets up the initial population.
int i;
// First loop
for(i = 0; i < _tamano_total; i++) {
Individuo nuevo(true);
nuevo.Start(); // Create and launch new thread
// Second loop
list<Individuo>::iterator it;
for(it = _individuos.begin(); it != _individuos.end(); it++) {
And, the threaded object Individuo:
class Individuo {
// Other private members
// ...
boost::thread _hilo;
// Other public members
// ...
void Start() {
_hilo = boost::thread(&Individuo::Run, this);
void Run() {
// These methods operate with/on each instance's own attributes,
// so they *can't* be static
void Join() {
if(_hilo.joinable()) _hilo.join();
Thank you! :D

If that's your real code then you have a problem.
for(i = 0; i < _tamano_total; i++) {
Individuo nuevo(true);
nuevo.Start(); // Create and launch new thread
void Start() {
_hilo = boost::thread(&Individuo::Run, this);
This code creates a new Individuo object on the stack, then starts a thread, passing the this pointer of that stack object to the new thread. It then copies that object into the list and promptly destroys the stack object, leaving a dangling pointer in the new thread. This gives you undefined behaviour.
Since list never moves an object in memory once it has been inserted, you could start the thread after inserting into the list:
for(i = 0; i < _tamano_total; i++) {
_individuos.push_back(Individuo(true)); // add new entry to list
_individuos.back().Start(); // start a thread for that entry
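
The fix above can be sketched as a small self-contained program. This is only an illustration under stated assumptions: it uses std::thread in place of boost::thread, and the names Individual, EvaluatePopulation and the dummy fitness computation are hypothetical stand-ins, not the asker's real code. The class is made non-copyable so that the "copy into the list after starting the thread" mistake cannot compile.

```cpp
#include <cassert>
#include <list>
#include <thread>

// Hypothetical stand-in for Individuo: each instance owns its worker thread.
class Individual {
public:
    explicit Individual(int seed) : seed_(seed), fitness_(0) {}

    // Not copyable: a running thread holds a pointer to this exact object.
    Individual(const Individual&) = delete;
    Individual& operator=(const Individual&) = delete;

    void Start() { thread_ = std::thread(&Individual::Run, this); }
    void Join() { if (thread_.joinable()) thread_.join(); }
    long Fitness() const { return fitness_; }

private:
    void Run() {
        // Placeholder for the expensive fitness computation (assumption).
        long sum = 0;
        for (int i = 1; i <= seed_; ++i) sum += i;
        fitness_ = sum;
    }

    int seed_;
    long fitness_;
    std::thread thread_;
};

// Builds `count` individuals, evaluates them concurrently and
// returns the sum of their fitness values.
long EvaluatePopulation(int count) {
    assert(count >= 0);
    std::list<Individual> population;

    // First loop: construct in place, then start the thread on the stored object.
    for (int i = 0; i < count; ++i) {
        population.emplace_back(i + 1);
        population.back().Start();
    }

    // Second loop: wait for every worker before reading its result.
    long total = 0;
    for (Individual& ind : population) {
        ind.Join();
        total += ind.Fitness();
    }
    return total;
}
```

Calling EvaluatePopulation(4) starts four threads before joining any of them. Because std::list nodes never move, the this pointer captured by each thread stays valid for the lifetime of the list.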


Running a task in a separate thread which should be able to stop on request

I am trying to design an infinite (or a user-defined length) loop that would be independent of my GUI process. I know how to start that loop in a separate thread, so the GUI process is not blocked. However, I would like to have a possibility to interrupt the loop at a press of a button. The complete scenario may look like this:
GUI::startButton->myClass::runLoop... ---> starts a loop in a new thread
GUI::stopButton->myClass::terminateLoop ---> should be able to interrupt the started loop
The problem I have is figuring out how to provide the stop functionality. I am sure there is a way to achieve this in C++. I was looking at a number of multithreading related posts and articles, as well as some lectures on how to use async and futures. Most of the examples did not fit my intended use and/or were too complex for my current state of skills.
MyClass *myClass = new MyClass;
void MyWidget::on_pushButton_start_clicked()
void MyWidget::on_pushButton_stop_clicked()
myClass->stop(); // TBD: how to implement the stop functionality?
std::thread MyClass::start()
return std::thread(&MyClass::runLoop, this);
void MyClass::runLoop()
for(int i = 0; i < 999999; i++)
// do some work
As far as I know, there is no standard way to forcibly terminate a std::thread. Even where a platform allows it, it is not advisable, since it can leave your application in an undefined state.
It would be better to add a check to your MyClass::runLoop method that stops execution in a controlled way as soon as an external condition is fulfilled. This might, for example, be a control variable like this:
std::thread MyClass::start()
_threadRunning = true;
if(_thread.joinable() == true) // If the thread is joinable...
// Join before (re)starting the thread
_thread = std::thread(&MyClass::runLoop, this);
return _thread;
void MyClass::runLoop()
for(int i = 0; i < MAX_ITERATION_COUNT; i++)
if(_threadRunning == false) { break; }
// do some work
Then you can end the thread with:
void MyClass::stopLoop()
_threadRunning = false;
_threadRunning should be a member variable of type std::atomic<bool>.
A plain bool often appears to work on x86, x86_64, ARM and ARM64, but reading and writing it from two threads without synchronization is a data race, which is undefined behaviour in C++, and the compiler is allowed to hoist the check out of the loop. Using std::atomic<bool> also signals to readers that the variable is shared between threads.
Possible MyClass.h:
MyClass() : _threadRunning(false) {}
std::thread start();
void runLoop();
void stopLoop();
std::thread _thread;
std::atomic<bool> _threadRunning;
It might be important to note that, depending on the code in your loop, it might take a while before the thread really stops.
Therefore it might be wise to std::thread::join the thread before restarting it, to make sure only one thread runs at a time.
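
The pieces above can be put together roughly as follows. This is a minimal sketch, not the answer's exact class: the name LoopRunner, the iteration counter and the 1 ms sleep standing in for "do some work" are assumptions made for illustration. Here start() returns void and keeps the std::thread as a member, and stop() both clears the flag and joins.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical class: runs a loop in its own thread until stop() is called.
class LoopRunner {
public:
    LoopRunner() : running_(false), iterations_(0) {}
    ~LoopRunner() { stop(); }   // never destroy a joinable std::thread

    void start() {
        stop();                           // join any previous run first
        assert(!thread_.joinable());
        running_ = true;
        thread_ = std::thread(&LoopRunner::runLoop, this);
    }

    void stop() {
        running_ = false;                 // ask the loop to finish
        if (thread_.joinable()) thread_.join();
    }

    long iterations() const { return iterations_.load(); }

private:
    void runLoop() {
        while (running_.load()) {
            ++iterations_;                // placeholder for "do some work"
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    std::atomic<bool> running_;
    std::atomic<long> iterations_;
    std::thread thread_;
};
```

In a GUI, the start button handler would call start() and the stop button handler would call stop(). Note that stop() blocks until the current iteration finishes, so each iteration should be short.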

How to initiate a thread in a class in C++ 14?

class ThreadOne {
void RealThread();
void EnqueueJob(s_info job);
std::queue<s_info> q_jobs;
H5::H5File* targetFile = new H5::H5File("file.h5", H5F_ACC_TRUNC);
std::condition_variable cv_condition;
std::mutex m_job_q_;
ThreadOne::ThreadOne() {
void ThreadOne::RealThread() {
while (true) {
std::unique_lock<std::mutex> lock(m_job_q_);
cv_condition.wait(lock, [this]() { return !this->q_jobs.empty(); });
s_info info = std::move(q_jobs.front());
//* DO THE JOB *//
void ThreadOne::EnqueueJob(s_info job) {
std::lock_guard<std::mutex> lock(m_job_q_);
ThreadOne *tWrite = new ThreadOne();
I want to make a thread and send it a pointer of an array and its name as a struct(s_info), and then make the thread write it into a file. I think that it's better than creating a thread whenever writing is needed.
I could make a thread pool and allocate jobs to it, but in my situation the same file must not be written concurrently. I think a single thread will be enough, and the program can keep doing CPU-bound work while a write job is in progress.
To sum up, this class (hopefully) gets array pointers and their dataset names, puts them in q_jobs and RealThread writes the arrays into a file.
I referred to a C++ thread pool program and the program initiates threads like this:
std::vector<std::thread> vec_worker_threads;
vec_worker_threads.emplace_back([this]() { this->RealThread(); });
I'm new to C++ and I understand what the code above does, but I don't know how to initiate RealThread in my class without a vector. How can I make an instance of the class that has a thread(RealThread) that's already ready inside it?
From what I can gather, and as already discussed in the comments, you simply want a std::thread member for ThreadOne:
class ThreadOne {
std::thread thread;
ThreadOne::ThreadOne() {
thread = std::thread{&ThreadOne::RealThread, this};
ThreadOne::~ThreadOne() {
// (potentially) notify thread to finish first
ThreadOne tWrite;
Note that I did not start the thread in the member-initializer-list of the constructor in order to avoid the thread accessing other members that have not been initialized yet. (The default constructor of std::thread does not start any thread.)
I also wrote a destructor which will wait for the thread to finish and join it. You must always join (or detach) a thread before its std::thread object is destroyed; otherwise the std::thread destructor calls std::terminate and your program aborts.
Finally, I replaced tWrite from being a pointer to being a class type directly. There is probably no reason for you to use dynamic allocation there and even if you have a need for it, you should be using
auto tWrite = std::make_unique<ThreadOne>();
or equivalent, instead, so that you are not going to rely on manually deleting the pointer at the correct place.
Also note that your current RealThread function seems to never finish. It must return at some point, probably after receiving a notification from the main thread, otherwise thread.join() will wait forever.
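
A minimal sketch of how these points might fit together is shown below. It is an illustration under assumptions, not the asker's real class: Job stands in for s_info, Writer for ThreadOne, and a std::vector<std::string> sink replaces the HDF5 file so the example has no external dependencies. The worker is started in the constructor body, woken through the condition variable, and told to finish by a stop flag set in the destructor.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical job type standing in for s_info.
struct Job {
    std::string name;
    int value;
};

class Writer {
public:
    explicit Writer(std::vector<std::string>& sink) : sink_(sink), stop_(false) {
        // Started in the body so every other member is already initialized.
        thread_ = std::thread(&Writer::RealThread, this);
    }

    ~Writer() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        cv_.notify_one();
        assert(thread_.joinable());
        thread_.join();   // queued jobs are drained before the worker returns
    }

    void EnqueueJob(Job job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();
    }

private:
    void RealThread() {
        for (;;) {
            Job job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stop_ || !jobs_.empty(); });
                if (jobs_.empty()) return;   // stop requested and nothing left
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            // The "write" happens outside the lock so producers are not blocked.
            sink_.push_back(job.name + "=" + std::to_string(job.value));
        }
    }

    std::vector<std::string>& sink_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<Job> jobs_;
    bool stop_;
    std::thread thread_;
};
```

The sink is only read after the Writer has been destroyed, which is why it needs no lock of its own in this sketch.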

Synchronization technique to wait till all objects have been processed

In this code, I first create a thread that keeps running for the lifetime of the program. Then I create objects and add them one by one to a queue. The thread picks up objects from the queue one by one, processes them and deletes them.
class MyClass
std::queue<class MyClass*> MyClassObjQueue;
void ThreadFunctionToProcessAndDeleteObjectsFromQueue()
// Get and Process and then Delete Objects one by one from MyClassObjQueue.
int main()
CreateThread (ThreadFunctionToProcessAndDeleteObjectsFromQueue);
int N = GetNumberOfObjects(); // Call some function that gets value of number of objects
// Create objects and queue them
for (int i=0; i<N; i++)
MyClass* obj = NULL;
obj = new MyClass;
delete obj;
// Wait till all objects have been processed and destroyed (HOW ???)
I am not sure how to wait until all objects have been processed before I quit. One way is to keep checking the size of the queue periodically using a while(1) loop and Sleep, but I think that is a novice way of doing things. I would really like to do it in an elegant way using thread synchronization objects (e.g. a semaphore) so that a synchronization function waits for all objects to finish, but I am not sure how to do that. Any input will be appreciated.
(Note: I've not used synchronization objects to add/delete from queue in the code above. This is only to keep the code simple & readable. I know STL containers are not thread safe)
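
One possible way to avoid the polling loop, sketched here under assumptions, is to keep a count of unprocessed objects and have the main thread wait on a condition variable until it reaches zero. The sketch uses std::thread and std::condition_variable rather than the Windows API from the question, and the names Item, Processor and WaitUntilAllProcessed are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

struct Item { int value; };   // hypothetical stand-in for MyClass

class Processor {
public:
    Processor() : pending_(0), stopping_(false), sum_(0) {
        worker_ = std::thread(&Processor::Run, this);
    }

    ~Processor() {
        { std::lock_guard<std::mutex> lock(m_); stopping_ = true; }
        work_cv_.notify_one();
        worker_.join();
    }

    void Push(std::unique_ptr<Item> item) {
        assert(item);
        {
            std::lock_guard<std::mutex> lock(m_);
            queue_.push(std::move(item));
            ++pending_;                      // counted until fully processed
        }
        work_cv_.notify_one();
    }

    // Blocks until every pushed item has been processed and destroyed.
    void WaitUntilAllProcessed() {
        std::unique_lock<std::mutex> lock(m_);
        idle_cv_.wait(lock, [this] { return pending_ == 0; });
    }

    long Sum() {
        std::lock_guard<std::mutex> lock(m_);
        return sum_;
    }

private:
    void Run() {
        for (;;) {
            std::unique_ptr<Item> item;
            {
                std::unique_lock<std::mutex> lock(m_);
                work_cv_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
                if (queue_.empty()) return;  // stop requested, nothing left
                item = std::move(queue_.front());
                queue_.pop();
            }
            long v = item->value;            // "process" outside the lock
            item.reset();                    // delete the object
            {
                std::lock_guard<std::mutex> lock(m_);
                sum_ += v;
                if (--pending_ == 0) idle_cv_.notify_all();
            }
        }
    }

    std::mutex m_;
    std::condition_variable work_cv_;
    std::condition_variable idle_cv_;
    std::queue<std::unique_ptr<Item>> queue_;
    int pending_;
    bool stopping_;
    long sum_;
    std::thread worker_;
};
```

The counter is decremented only after an item has been processed and deleted, so WaitUntilAllProcessed() cannot return while the worker is still busy with the last object, which a plain "is the queue empty?" check would not guarantee.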

Static Class variable for Thread Count in C++

I am writing a thread based application in C++. The following is sample code showing how I am checking the thread count. I need to ensure that at any point in time, there are only 20 worker threads spawned from my application:
using namespace std;
class ThreadWorkerClass
static int threadCount;
ThreadWorkerClass() // constructors have no return type
threadCount ++;
static int getThreadCount()
return threadCount;
void run()
/* The worker thread execution
* logic is to be written here */
//Reduce count by 1 as worker thread would finish here
threadCount --;
int main()
ThreadWorkerClass twObj;
//Use Boost to start Worker Thread
//Assume max 20 worker threads need to be spawned
if(ThreadWorkerClass::getThreadCount() <= 20)
boost::thread *wrkrThread = new boost::thread(
//Wait for the threads to join
//Something like (*wrkrThread).join();
return 0;
Will this design require me to take a lock on the variable threadCount? Assume that I will be running this code in a multi-processor environment.
The design is not good enough. The problem is that you exposed the constructor, so whether you like it or not, people will be able to create as many instances of your object as they want. You should do some sort of thread pooling, i.e. have a class that maintains a pool of threads and hands one out when it is available. Something like:
class MyThreadClass {
//the method obtaining that thread is responsible for returning it
class ThreadPool {
//create 20 instances of your Threadclass
//This is a blocking function
MyThreadClass getInstance() {
//if a thread from the pool is free give it, else wait
So everything is maintained internally by the pooling class. Never give control over that class to others. You can also add query functions to the pooling class, like hasFreeThreads(), numFreeThreads(), etc.
You can also enhance this design by handing out smart pointers, so you can track how many clients still own a thread.
Making the clients that obtain a thread responsible for releasing it is sometimes dangerous: a process may crash and never give the thread back. There are many solutions to that; the simplest is to keep a timer on each thread, and when time runs out the thread is taken back by force.
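
The "block until something is free" behaviour of getInstance() can be sketched with a mutex and a condition variable. This is not a full thread pool (it still creates one thread per task) and it uses std::thread instead of Boost; SlotLimiter and RunLimited are hypothetical names. It also answers the original question about threadCount indirectly: the count of free slots is only ever touched under a lock.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: hands out at most `max_slots` slots at a time.
class SlotLimiter {
public:
    explicit SlotLimiter(int max_slots) : free_(max_slots) { assert(max_slots > 0); }

    void Acquire() {                       // blocks while no slot is free
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return free_ > 0; });
        --free_;
    }

    void Release() {
        { std::lock_guard<std::mutex> lock(m_); ++free_; }
        cv_.notify_one();
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    int free_;
};

// Runs `tasks` dummy jobs with at most `max_workers` in flight and
// returns the highest number of jobs observed running at once.
int RunLimited(int tasks, int max_workers) {
    SlotLimiter limiter(max_workers);
    std::atomic<int> active(0);
    std::atomic<int> peak(0);
    std::vector<std::thread> threads;

    for (int i = 0; i < tasks; ++i) {
        limiter.Acquire();                 // wait here if the limit is reached
        threads.emplace_back([&] {
            int now = ++active;
            int seen = peak.load();
            while (now > seen && !peak.compare_exchange_weak(seen, now)) {}
            std::this_thread::sleep_for(std::chrono::milliseconds(1));  // dummy work
            --active;
            limiter.Release();             // give the slot back
        });
    }
    for (auto& t : threads) t.join();
    return peak.load();
}
```

With RunLimited(50, 20), the spawning loop stalls whenever 20 jobs are running, so the observed peak never exceeds 20. A production version would release the slot from an RAII guard so that an exception in the job cannot leak it.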

Windows API Thread Pool simple example

[EDIT: thanks to MSalters answer and Raymond Chen's answer to InterlockedIncrement vs EnterCriticalSection/counter++/LeaveCriticalSection, the problem is solved and the code below is working properly. This should provide an interesting simple example of Thread Pool use in Windows]
I can't find a simple example of the following task. My program, for example, needs to increment the values in a huge std::vector by one, so I want to do that in parallel. It needs to do that a bunch of times across the lifetime of the program. I know how to do that using CreateThread at each call of the routine, but I can't work out how to replace the CreateThread calls with the thread pool.
Here is what I do :
class Thread {
virtual void run() = 0 ; // I can inherit an "IncrementVectorThread"
class IncrementVectorThread: public Thread {
IncrementVectorThread(int threadID, int nbThreads, std::vector<int> &vec) : id(threadID), nb(nbThreads), myvec(vec) { };
virtual void run() {
for (int i=(myvec.size()*id)/nb; i<(myvec.size()*(id+1))/nb; i++)
myvec[i]++; //and let's assume myvec is properly sized
int id, nb;
std::vector<int> &myvec;
class ThreadGroup : public std::vector<Thread*> {
ThreadGroup() {
pool = CreateThreadpool(NULL);
cleanupGroup = CreateThreadpoolCleanupGroup();
SetThreadpoolCallbackPool(&cbe, pool);
SetThreadpoolCallbackCleanupGroup(&cbe, cleanupGroup, NULL);
threadCount = 0;
~ThreadGroup() {
PTP_POOL pool;
volatile long threadCount;
} ;
static VOID CALLBACK runFunc(
PVOID Context,
PTP_WORK Work) {
ThreadGroup &thread = *((ThreadGroup*) Context);
long id = InterlockedIncrement(&(thread.threadCount));
DWORD tid = (id-1)%thread.size();
void run_threads(ThreadGroup* thread_group) {
SetThreadpoolThreadMaximum(thread_group->pool, thread_group->size());
SetThreadpoolThreadMinimum(thread_group->pool, thread_group->size());
TP_WORK *worker = CreateThreadpoolWork(runFunc, (void*) thread_group, &thread_group->cbe);
thread_group->threadCount = 0;
for (int i=0; i<thread_group->size(); i++) {
int main() {
ThreadGroup group;
std::vector<int> vec(10000, 0);
for (int i=0; i<10; i++)
group.push_back(new IncrementVectorThread(i, 10, vec));
// now, vec should be == std::vector<int>(10000, 3);
So, if I understood well :
- the command CreateThreadpool creates a bunch of Threads (hence, the call to CreateThreadpoolWork is cheap as it doesn't call CreateThread)
- I can have as many thread pools as I want (if I want to do a thread pool for "IncrementVector" and one for my "DecrementVector" threads, I can).
- if I need to divide my "increment vector" task into 10 threads, instead of calling 10 times CreateThread, I create a single "worker", and Submit it 10 times to the ThreadPool with the same parameter (hence, I need the thread ID in the callback to know which part of my std::vector to increment). Here I couldn't find the thread ID, since the function GetCurrentThreadId() returns the real ID of the thread (ie., something like 1528, not something between 0..nb_launched_threads).
Finally, I am not sure I understood the concept well: do I really need a single worker and not 10 if I split my std::vector into 10 threads?
You're roughly right up to the last point.
The whole idea about a thread pool is that you don't care how many threads it has. You just throw a lot of work into the thread pool, and let the OS determine how to execute each chunk.
So, if you create and submit 10 chunks, the OS may use between 1 and 10 threads from the pool.
You should not care about those thread identities. Don't bother with thread IDs, minimum or maximum number of threads, or stuff like that.
If you don't care about thread identities, then how do you manage what part of the vector to change? Simple. Before creating the threadpool, initialize a counter to zero. In the callback function, call InterlockedIncrement to retrieve and increment the counter. For each submitted work item, you'll get a consecutive integer.
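
The counter idea is not specific to the Windows API. As a portable sketch, std::atomic<int>::fetch_add can play the role of InterlockedIncrement, and a few std::thread workers can stand in for the OS thread pool; both substitutions are assumptions made so the example is self-contained, and IncrementInChunks is a hypothetical name. Unlike the original, where each submitted work item takes one index, each worker here keeps pulling indices until none are left.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Increments every element of `vec` exactly once. The work is split into
// `chunks` pieces that are claimed by `workers` threads in no fixed order.
void IncrementInChunks(std::vector<int>& vec, int chunks, int workers) {
    assert(chunks > 0 && workers > 0);
    std::atomic<int> next(0);   // shared counter handing out chunk indices

    auto work = [&] {
        for (;;) {
            int id = next.fetch_add(1);   // each call returns a distinct index
            if (id >= chunks) return;     // all chunks have been claimed
            std::size_t begin = vec.size() * id / chunks;
            std::size_t end = vec.size() * (id + 1) / chunks;
            for (std::size_t i = begin; i < end; ++i) ++vec[i];
        }
    };

    std::vector<std::thread> pool;
    for (int w = 0; w < workers; ++w) pool.emplace_back(work);
    for (auto& t : pool) t.join();
}
```

Which thread handles which chunk is irrelevant: the chunk index comes from the counter, not from the thread's identity, so the number of workers can differ from the number of chunks.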