Clearing a grid with multithreading - C++

I am trying to clear a (game) grid, but whenever I multithread it, the time it takes to clear the grid increases by 3 seconds.
By my own logic this should not be the case: each Y entry of the array holds many X entries (each X entry stores a class instance), and for every one of them the code calls objects.clear() on its objects member, which itself iterates over every element.
My code:
const int NUM_OF_AVAIL_THREADS = std::thread::hardware_concurrency() * 2;
ThreadPool* pool = new ThreadPool(NUM_OF_AVAIL_THREADS);
vector<future<void>> threads;
void Terrain::clear_grid()
{
for (int y = 0; y < tiles.size(); y++)
{
// Capture y by value: the task may run after the loop has advanced or ended.
threads.push_back(pool->enqueue([this, y]()
{
array<TerrainTile, terrain_width>& h = tiles.at(y);
for (int x = 0; x < h.size(); x++)
{
h.at(x).objects.clear();
}
}));
}
pool->wait_and_clear_threads(threads);
}
TerrainTile looks like this:
class TerrainTile
{
public:
//TerrainTile *up, *down, *left, *right;
vector<TerrainTile*> exits;
bool visited = false;
size_t position_x;
size_t position_y;
TileType tile_type;
vector<TerrainTile*> neighbors;
vector<MovingAsset*> objects;
vector<Tank*> tanks;
vector<MovingAsset*> beams;
vector<MovingAsset*> get_collidable_assets();
void add_collidable_assets(MovingAsset* asset);
void add_neighbor(TerrainTile* neighbor);
};
What the tiles array looks like:
static constexpr size_t terrain_width = 80;
static constexpr size_t terrain_height = 45;
std::array<std::array<TerrainTile, terrain_width>, terrain_height> tiles;
Am I missing something crucial here, or does the cost of creating a thread simply outweigh the time it takes to iterate through the arrays?
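With only 45 rows of 80 tiles, each per-row task is tiny, so the cost of queueing and synchronising a task can plausibly exceed the work it does. One possible mitigation, sketched below, is to split the rows into a handful of contiguous blocks, run one task per block, and capture the block bounds by value so every task sees its own range. The sketch uses std::async rather than the ThreadPool from the question, and Tile, Grid and clear_grid_chunked are hypothetical stand-ins, not the asker's types.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical stand-in for TerrainTile: only the member being cleared.
struct Tile {
    std::vector<int> objects;
};

constexpr std::size_t grid_width = 80;
constexpr std::size_t grid_height = 45;
using Grid = std::array<std::array<Tile, grid_width>, grid_height>;

// Clears every tile's objects using at most num_tasks tasks,
// each owning a contiguous block of rows.
void clear_grid_chunked(Grid& grid, std::size_t num_tasks) {
    assert(num_tasks > 0);
    const std::size_t rows_per_task = (grid.size() + num_tasks - 1) / num_tasks;
    std::vector<std::future<void>> pending;
    for (std::size_t begin = 0; begin < grid.size(); begin += rows_per_task) {
        const std::size_t end = std::min(begin + rows_per_task, grid.size());
        // begin and end are captured by value: the loop variable changes
        // before the task runs, so a reference capture would be unsafe.
        pending.push_back(std::async(std::launch::async, [&grid, begin, end] {
            for (std::size_t y = begin; y < end; ++y)
                for (Tile& tile : grid[y])
                    tile.objects.clear();
        }));
    }
    for (auto& f : pending) f.get();  // wait for every block
}
```

Calling clear_grid_chunked(grid, 4) would clear the whole grid with four tasks. Whether this beats a plain single-threaded loop depends on how many objects each tile holds, so it is worth timing both.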
EDIT: This is the thread pool:
#pragma once
namespace Tmpl8
{
class ThreadPool; //Forward declare
class Worker;
class Worker
{
public:
//Instantiate the worker class by passing and storing the threadpool as a reference
Worker(ThreadPool& s) : pool(s) {}
inline void operator()();
private:
ThreadPool& pool;
};
class ThreadPool
{
public:
ThreadPool(size_t numThreads) : stop(false)
{
for (size_t i = 0; i < numThreads; ++i)
workers.push_back(std::thread(Worker(*this)));
}
~ThreadPool()
{
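{
//Set the flag under the lock so waiting workers observe it reliably
std::unique_lock<std::mutex> lock(queue_mutex);
stop = true; // stop all threads
}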
condition.notify_all();
for (auto& thread : workers)
thread.join();
}
void wait_and_clear_threads(vector<future<void>>& threads)
{
for (future<void>& t : threads)
{
t.wait();
}
threads.clear();
}
template <class T>
auto enqueue(T task) -> std::future<decltype(task())>
{
//Wrap the function in a packaged_task so we can return a future object
auto wrapper = std::make_shared<std::packaged_task<decltype(task())()>>(std::move(task));
//Scope to restrict critical section
{
//lock our queue and add the given task to it
std::unique_lock<std::mutex> lock(queue_mutex);
tasks.push_back([=]
{
(*wrapper)();
});
}
//Wake up a thread to start this task
condition.notify_one();
return wrapper->get_future();
}
private:
friend class Worker; //Gives access to the private variables of this class
std::vector<std::thread> workers;
std::deque<std::function<void()>> tasks;
std::condition_variable condition; //Wakes up a thread when work is available
std::mutex queue_mutex; //Lock for our queue
bool stop = false;
};
inline void Worker::operator()()
{
std::function<void()> task;
while (true)
{
//Scope to restrict critical section
//This is important because we don't want to hold the lock while executing the task,
//because that would make it so only one task can run at a time (i.e. sequentially)
{
std::unique_lock<std::mutex> locker(pool.queue_mutex);
//Wait until some work is ready or we are stopping the threadpool
//Because of spurious wakeups we need to check if there is actually a task available or we are stopping
pool.condition.wait(locker, [this] { return pool.stop || !pool.tasks.empty(); });
if (pool.stop) break;
task = pool.tasks.front();
pool.tasks.pop_front();
}
task();
}
}
} // namespace Tmpl8

Related

Not able to achieve linear speedup using threadpool

I want to count the even values among all pairwise sums up to 100000, and I want to do it using a thread pool. Previously I did it statically, i.e. I allocated the work to all threads at the start, and I achieved linear speedup that way. The drawback was that the threads which started early also finished early (because they had fewer pairs to compute). So instead I want to allocate work dynamically: I initially assign some work to the threads, and as soon as they complete it they come back to take more from the queue. Below is my thread pool code.
main.cpp :
#include <iostream>
#include <random>
#include<chrono>
#include<iomanip>
#include<future>
#include<vector>
#include "../include/ThreadPool.h"
std::random_device rd;
std::mt19937 mt(rd());
std::uniform_int_distribution<int> dist(-10, 10);
auto rnd = std::bind(dist, mt);
int thread_work;
long long pairwise(const int start) {
long long sum = 0;
long long counter = 0;
for(int i = start+1; i <= start+thread_work; i++)
{
for(int j = i-1; j >= 0; j--)
{
sum = i + j;
if(sum%2 == 0)
counter++;
}
}
//std::cout<<counter<<std::endl;
return counter;
}
int main(int argc, char *argv[])
{
// Create pool with x threads
int x;
std::cout<<"Enter num of threads : ";
std::cin>>x;
std::cout<<"Enter thread_work : ";
std::cin>>thread_work;
ThreadPool pool(x);
// Initialize pool
pool.init();
int N = 100000;
long long res = 0;
auto start = std::chrono::high_resolution_clock::now();
for(int i = 0; i < N; i = i + thread_work)
{
std::future<long long int> fut = pool.submit(pairwise,i);
res += fut.get();
}
std::cout<<"total is "<<res<<std::endl;
pool.shutdown();
auto end = std::chrono::high_resolution_clock::now();
double time_taken = std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
time_taken *= 1e-9;
std::cout << "Time taken by program is : " << std::fixed << std::setprecision(9) << time_taken << " secs" << std::endl;
return 0;
}
my SafeQueue.h :
#pragma once
#include <mutex>
#include <queue>
// Thread safe implementation of a Queue using an std::queue
template <typename T>
class SafeQueue {
private:
std::queue<T> m_queue;
std::mutex m_mutex;
public:
SafeQueue() {
}
SafeQueue(SafeQueue& other) {
//TODO:
}
~SafeQueue() {
}
bool empty() {
std::unique_lock<std::mutex> lock(m_mutex);
return m_queue.empty();
}
int size() {
std::unique_lock<std::mutex> lock(m_mutex);
return m_queue.size();
}
void enqueue(T& t) {
std::unique_lock<std::mutex> lock(m_mutex);
m_queue.push(t);
}
bool dequeue(T& t) {
std::unique_lock<std::mutex> lock(m_mutex);
if (m_queue.empty()) {
return false;
}
t = std::move(m_queue.front());
m_queue.pop();
return true;
}
};
and my ThreadPool.h :
#pragma once
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>
#include "SafeQueue.h"
class ThreadPool {
private:
class ThreadWorker {
private:
int m_id;
ThreadPool * m_pool;
public:
ThreadWorker(ThreadPool * pool, const int id)
: m_pool(pool), m_id(id) {
}
void operator()() {
std::function<void()> func;
bool dequeued;
while (!m_pool->m_shutdown) {
{
std::unique_lock<std::mutex> lock(m_pool->m_conditional_mutex);
if (m_pool->m_queue.empty()) {
m_pool->m_conditional_lock.wait(lock);
}
dequeued = m_pool->m_queue.dequeue(func);
}
if (dequeued) {
func();
}
}
}
};
bool m_shutdown;
SafeQueue<std::function<void()>> m_queue;
std::vector<std::thread> m_threads;
std::mutex m_conditional_mutex;
std::condition_variable m_conditional_lock;
public:
ThreadPool(const int n_threads)
: m_threads(std::vector<std::thread>(n_threads)), m_shutdown(false) {
}
ThreadPool(const ThreadPool &) = delete;
ThreadPool(ThreadPool &&) = delete;
ThreadPool & operator=(const ThreadPool &) = delete;
ThreadPool & operator=(ThreadPool &&) = delete;
// Inits thread pool
void init() {
for (int i = 0; i < m_threads.size(); ++i) {
m_threads[i] = std::thread(ThreadWorker(this, i));
}
}
// Waits until threads finish their current task and shutdowns the pool
void shutdown() {
m_shutdown = true;
m_conditional_lock.notify_all();
for (int i = 0; i < m_threads.size(); ++i) {
if(m_threads[i].joinable()) {
m_threads[i].join();
}
}
}
// Submit a function to be executed asynchronously by the pool
template<typename F, typename...Args>
auto submit(F&& f, Args&&... args) -> std::future<decltype(f(args...))> {
// Create a function with bounded parameters ready to execute
std::function<decltype(f(args...))()> func = std::bind(std::forward<F>(f), std::forward<Args>(args)...);
// Encapsulate it into a shared ptr in order to be able to copy construct / assign
auto task_ptr = std::make_shared<std::packaged_task<decltype(f(args...))()>>(func);
// Wrap packaged task into void function
std::function<void()> wrapper_func = [task_ptr]() {
(*task_ptr)();
};
// Enqueue generic wrapper function
m_queue.enqueue(wrapper_func);
// Wake up one thread if its waiting
m_conditional_lock.notify_one();
// Return future from promise
return task_ptr->get_future();
}
};
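One thing that stands out in main.cpp is that fut.get() is called immediately after each submit, so the loop blocks on every task before queueing the next one, and the chunks run one at a time regardless of pool size. A sketch of the submit-everything-then-collect pattern follows. It uses std::async in place of the ThreadPool above to stay self-contained, and count_even_sums and count_all are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <future>
#include <vector>

// Counts pairs (i, j) with first <= i <= last and 0 <= j < i whose sum is even.
long long count_even_sums(int first, int last) {
    long long counter = 0;
    for (int i = first; i <= last; ++i)
        for (int j = i - 1; j >= 0; --j)
            if ((i + j) % 2 == 0) ++counter;
    return counter;
}

// Splits 1..n into chunks, launches every chunk first, and only then
// collects the results, so the chunks can run concurrently.
long long count_all(int n, int chunk) {
    assert(chunk > 0);
    std::vector<std::future<long long>> futures;
    for (int start = 1; start <= n; start += chunk) {
        const int last = std::min(start + chunk - 1, n);
        futures.push_back(
            std::async(std::launch::async, count_even_sums, start, last));
    }
    long long total = 0;
    for (auto& f : futures) total += f.get();  // blocks only after all are queued
    return total;
}
```

The same shape should carry over to the pool: push each future returned by pool.submit into a vector, then call get() on them in a second loop.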

Concurrent program does not terminate

I have the following concurrent program:
template <typename T>
class CircularBuffers{
public:
CircularBuffers(const T &size):_buffer(size),_start(0),_end(0){
assert((size > 1 && "size must be greater than 1"));
}
CircularBuffers(std::initializer_list<T> l):_buffer(l),_start(0),_end(l.size()-1){}
void addValue(const T &value)
{
std::unique_lock<std::mutex> uLock(_mutex);
_end = (_end+1) % _buffer.size();
if (_end==_start){_start = (_start+1) % _buffer.size();}
_buffer[_start] = value;
}
T poll()
{
std::unique_lock<std::mutex> uLock(_mutex);
_read_cond.wait(uLock,[&](){return !isEmpty();});
T element = _buffer[_start];
_start = (_start + 1) % _buffer.size();
return element;
}
bool isFull(){
return _start == (_end+1) % _buffer.size();
}
bool isEmpty()
{
return _start == _end;
}
private:
std::vector<T> _buffer;
int _start;
int _end;
std::mutex _mutex;
std::condition_variable _read_cond;
};
int main()
{
std::shared_ptr<CircularBuffers<int>> circ = std::make_shared<CircularBuffers<int>>(7);
std::vector<std::future<void>> futures_add;
std::vector<std::future<int>> futures_poll;
for (int i=0;i<10;i++)
{
futures_add.emplace_back(std::async(&CircularBuffers<int>::addValue,circ,i));
}
for (int i=0;i<10;i++)
{
futures_poll.emplace_back(std::async(&CircularBuffers<int>::poll,circ));
}
std::for_each(futures_add.begin(),futures_add.end(),[](std::future<void> &ftr){ftr.wait();});
std::for_each(futures_poll.begin(),futures_poll.end(),[](std::future<int> &ftr){ftr.wait();});
}
It never terminates. I suspect the concurrency is the cause, but I cannot see why.
My questions are:
Why does it not end? It gets past the first set of waits, but it gets stuck on the waits for poll.
How could I adapt this code to run forever, in the sense that writing and polling happen in an infinite loop, BUT every time a thread delivers a result its future is removed from the respective vector to make room for a new one, so that I do not end up creating an unbounded number of threads?
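As far as I can tell from the code above, poll waits on _read_cond, but addValue never notifies it, so a poller that starts waiting on an empty buffer is never woken. addValue also stores the value at _start rather than at the slot being filled. Note too that 10 writes into a 7-element buffer that overwrites when full leave fewer than 10 values to read, so 10 blocking polls cannot all complete. A minimal sketch of a bounded buffer that notifies after each write is shown below. The names RingBuffer, push and pop are hypothetical, and the drop-oldest policy when full is an assumption.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <vector>

template <typename T>
class RingBuffer {
public:
    explicit RingBuffer(std::size_t capacity) : buffer_(capacity) {
        assert(capacity > 0);
    }

    // Stores value; when full, the oldest element is overwritten.
    void push(const T& value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            buffer_[(head_ + count_) % buffer_.size()] = value;
            if (count_ == buffer_.size())
                head_ = (head_ + 1) % buffer_.size();  // dropped the oldest
            else
                ++count_;
        }
        not_empty_.notify_one();  // wake one waiting pop()
    }

    // Blocks until an element is available, then removes and returns it.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return count_ > 0; });
        T value = buffer_[head_];
        head_ = (head_ + 1) % buffer_.size();
        --count_;
        return value;
    }

    std::size_t size() {
        std::lock_guard<std::mutex> lock(mutex_);
        return count_;
    }

private:
    std::vector<T> buffer_;
    std::size_t head_ = 0;   // index of the oldest element
    std::size_t count_ = 0;  // number of stored elements
    std::mutex mutex_;
    std::condition_variable not_empty_;
};
```

Tracking an element count instead of comparing start and end indices lets every slot be used and removes the full-versus-empty ambiguity.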

C++ STL Producer multiple consumer where producer waits for free consumer before producing next value

My little consumer-producer problem had me stumped for some time. I didn't want an implementation where one producer pushes some data round-robin to the consumers, filling up their queues of data respectively.
I wanted to have one producer, x consumers, but the producer waits with producing new data until a consumer is free again. In my example there are 3 consumers so the producer creates a maximum of 3 objects of data at any given time. Since I don't like polling, the consumers were supposed to notify the producer when they are done. Sounds simple, but the solution I found doesn't please me. First the code.
#include "stdafx.h"
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <future>
#include <iostream>
#include <map>
#include <mutex>
#include <thread>
std::atomic_int totalconsumed;
class producer {
using runningmap_t = std::map<int, std::pair<std::future<void>, bool>>;
// Secure the map of futures.
std::mutex mutex_;
runningmap_t running_;
// Used for finished notification
std::mutex waitermutex_;
std::condition_variable waiter_;
// The magic number to limit the producer.
std::atomic<int> count_;
bool can_run();
void clean();
// Fake a source, e.g. filesystem scan.
int fakeiter;
int next();
bool has_next() const;
public:
producer() : count_(0), fakeiter(50) {}
void run();
void notify(int value);
void wait();
};
class consumer {
producer& producer_;
public:
consumer(producer& producer) : producer_(producer) {}
void run(int value) {
std::this_thread::sleep_for(std::chrono::milliseconds(42));
std::cout << "Consumed " << value << " on (" << std::this_thread::get_id() << ")" << std::endl;
totalconsumed++;
producer_.notify(value);
}
};
// Only if less than three threads are active, another gets to run.
bool producer::can_run() { return count_.load() < 3; }
// Verify if there's something to consume
bool producer::has_next() const { return 0 != fakeiter; }
// Produce the next value for consumption.
int producer::next() { return --fakeiter; }
// Remove the futures that have reported to be finished.
void producer::clean()
{
for (auto it = running_.begin(); it != running_.end(); ) {
if (it->second.second) {
it = running_.erase(it);
}
else {
++it;
}
}
}
// Runs the producer. Creates a new consumer for every produced value. Max 3 at a time.
void producer::run()
{
while (has_next()) {
if (can_run()) {
auto c = next();
count_++;
auto future = std::async(&consumer::run, consumer(*this), c);
std::unique_lock<std::mutex> lock(mutex_);
running_[c] = std::make_pair(std::move(future), false);
clean();
}
else {
std::unique_lock<std::mutex> lock(waitermutex_);
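// Re-check under the lock so a notification sent before we wait is not lost.
waiter_.wait(lock, [this] { return can_run(); });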
}
}
}
// Consumers diligently tell the producer that they are finished.
void producer::notify(int value)
{
count_--;
mutex_.lock();
running_[value].second = true;
mutex_.unlock();
std::unique_lock<std::mutex> waiterlock(waitermutex_);
waiter_.notify_all();
}
// Wait for all consumers to finish.
void producer::wait()
{
while (!running_.empty()) {
mutex_.lock();
clean();
mutex_.unlock();
std::this_thread::sleep_for(std::chrono::milliseconds(10));
}
}
// Looks like the application entry point.
int main()
{
producer p;
std::thread pthread(&producer::run, &p);
pthread.join();
p.wait();
std::cout << std::endl << std::endl << "Total consumed " << totalconsumed.load() << std::endl;
return 0;
}
The part I don't like is the list of values mapped to the futures, called running_. I need to keep the future around until the consumer is actually done. I can't remove the future from the map in the notify method or else I'll kill the thread that is currently calling notify.
Am I missing something that could simplify this construct?
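#include <array>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <optional>
template<class T>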
struct slotted_data {
std::size_t I;
T t;
};
template<class T>
using sink = std::function<void(T)>;
template<class T, std::size_t N>
struct async_slots {
bool produce( slotted_data<T> data ) {
if (terminate || data.I>=N) return false;
{
auto l = lock();
if (slots[data.I]) return false;
slots[data.I] = std::move(data.t);
}
cv.notify_one();
return true;
}
// rare use of non-lambda cv.wait in the wild!
bool consume(sink<slotted_data<T>> f) {
auto l = lock();
while(!terminate) {
for (auto& slot:slots) {
if (slot) {
auto r = std::move(*slot);
slot = std::nullopt;
f({std::size_t(&slot-slots.data()), std::move(r)}); // invoke in lock
return true;
}
}
cv.wait(l);
}
return false;
}
// easier and safer version:
std::optional<slotted_data<T>> consume() {
std::optional<slotted_data<T>> r;
bool worked = consume([&](auto&& data) { r = std::move(data); });
if (!worked) return {};
return r;
}
void finish() {
{
auto l = lock();
terminate = true;
}
cv.notify_all();
}
private:
auto lock() { return std::unique_lock<std::mutex>(m); }
std::mutex m;
std::condition_variable cv;
std::array< std::optional<T>, N > slots;
bool terminate = false;
};
async_slots provides a fixed number of slots and an awaitable consume. If you try to produce two things in the same slot, produce returns false and ignores the second one.
consume invokes the sink of the data inside the mutex in a continuation passing style. This permits atomic consumption.
We want to invert producer and consumer:
template<class T, std::size_t N>
struct slotted_consumer {
bool consume( std::size_t I, sink<T> sink ) {
std::optional<T> data;
std::condition_variable cv;
std::mutex m;
bool worked = slots.produce(
{
I,
[&](auto&& t){
{
std::unique_lock<std::mutex> l(m);
data.emplace(std::move(t));
}
cv.notify_one();
}
}
);
if (!worked) return false;
std::unique_lock<std::mutex> l(m);
cv.wait(l, [&]()->bool{
return (bool)data;
});
sink( std::move(*data) );
return true;
}
bool produce( T t ) {
return slots.consume(
[&](auto&& f) {
f.t( std::move(t) );
}
);
}
void finish() {
slots.finish();
}
private:
async_slots< sink<T>, N > slots;
};
We have to take some care to execute sink in a context where we are not holding the mutex of async_slots, which is why consume above is so strange.
You share a slotted_consumer< int, 3 > slots. The producing thread repeatedly calls slots.produce(42);. It blocks until a new consumer lines up.
Consumer #2 calls slots.consume( 2, [&](int x){ /* code to consume x */ } ), and #1 and #0 pass their slot numbers as well.
All 3 consumers can be waiting for the next production. The above system defaults to feeding #0 first if it is waiting for more work; we could make it "fair" at a cost of keeping a bit more state.
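If the only requirement is that the producer stalls while three consumers are busy, a counting gate built from a mutex and a condition variable may be enough, without tracking per-slot sinks. The sketch below is an alternative to the answer above, not part of it, and SlotGate and its members are hypothetical names.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Allows at most `limit` holders at a time.
class SlotGate {
public:
    explicit SlotGate(std::size_t limit) : free_(limit) {}

    // Producer side: blocks until a consumer slot is free, then takes it.
    void acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return free_ > 0; });
        --free_;
    }

    // Non-blocking variant: returns false when every slot is taken.
    bool try_acquire() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (free_ == 0) return false;
        --free_;
        return true;
    }

    // Consumer side: called when a consumer has finished its item.
    void release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++free_;
        }
        cv_.notify_one();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::size_t free_;
};
```

The producer would call acquire() before handing a value to a consumer thread, and each consumer would call release() as its last step. In C++20, std::counting_semaphore offers the same acquire/try_acquire/release interface.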

Efficiently waiting for all tasks in a threadpool to finish

I currently have a program with x workers in my thread pool. During the main loop, y tasks are assigned to the workers, and after the tasks are sent out I must wait for all of them to finish before proceeding with the program. I believe my current solution is inefficient; there must be a better way to wait for all tasks to finish, but I am not sure how to go about it.
// called in main after all tasks are enqueued to
// std::deque<std::function<void()>> tasks
void ThreadPool::waitFinished()
{
while(!tasks.empty()) //check if there are any tasks in queue waiting to be picked up
{
//do literally nothing
}
}
More information:
threadpool structure
//worker thread objects
class Worker {
public:
Worker(ThreadPool& s): pool(s) {}
void operator()();
private:
ThreadPool &pool;
};
//thread pool
class ThreadPool {
public:
ThreadPool(size_t);
template<class F>
void enqueue(F f);
void waitFinished();
~ThreadPool();
private:
friend class Worker;
//keeps track of threads so we can join
std::vector< std::thread > workers;
//task queue
std::deque< std::function<void()> > tasks;
//sync
std::mutex queue_mutex;
std::condition_variable condition;
bool stop;
};
or here's a gist of my threadpool.hpp
example of what I want to use waitFinished() for:
while(running)
//....
for all particles alive
push particle position function to threadpool
end for
threadPool.waitFinished();
push new particle position data into openGL buffer
end while
So this way I can send hundreds of thousands of particle position tasks to be done in parallel, wait for them to finish, and put the new data inside the OpenGL position buffers.
This is one way to do what you're trying. Using two condition variables on the same mutex is not for the faint-hearted unless you know what is going on internally. I didn't need the atomic processed member, other than to demonstrate how many items were finished between each run.
The sample workload function generates 100,000 random int values, then sorts them (gotta heat my office one way or another). waitFinished will not return until the queue is empty and no threads are busy.
#include <algorithm>
#include <atomic>
#include <condition_variable>
#include <cstdlib>
#include <deque>
#include <functional>
#include <iostream>
#include <iterator>
#include <mutex>
#include <random>
#include <thread>
#include <vector>
//thread pool
class ThreadPool
{
public:
ThreadPool(unsigned int n = std::thread::hardware_concurrency());
template<class F> void enqueue(F&& f);
void waitFinished();
~ThreadPool();
unsigned int getProcessed() const { return processed; }
private:
std::vector< std::thread > workers;
std::deque< std::function<void()> > tasks;
std::mutex queue_mutex;
std::condition_variable cv_task;
std::condition_variable cv_finished;
std::atomic_uint processed;
unsigned int busy;
bool stop;
void thread_proc();
};
ThreadPool::ThreadPool(unsigned int n)
: processed()
, busy()
, stop()
{
for (unsigned int i=0; i<n; ++i)
workers.emplace_back(std::bind(&ThreadPool::thread_proc, this));
}
ThreadPool::~ThreadPool()
{
// set stop-condition
std::unique_lock<std::mutex> latch(queue_mutex);
stop = true;
cv_task.notify_all();
latch.unlock();
// all threads terminate, then we're done.
for (auto& t : workers)
t.join();
}
void ThreadPool::thread_proc()
{
while (true)
{
std::unique_lock<std::mutex> latch(queue_mutex);
cv_task.wait(latch, [this](){ return stop || !tasks.empty(); });
if (!tasks.empty())
{
// got work. set busy.
++busy;
// pull from queue
auto fn = tasks.front();
tasks.pop_front();
// release lock. run async
latch.unlock();
// run function outside context
fn();
++processed;
latch.lock();
--busy;
cv_finished.notify_one();
}
else if (stop)
{
break;
}
}
}
// generic function push
template<class F>
void ThreadPool::enqueue(F&& f)
{
std::unique_lock<std::mutex> lock(queue_mutex);
tasks.emplace_back(std::forward<F>(f));
cv_task.notify_one();
}
// waits until the queue is empty.
void ThreadPool::waitFinished()
{
std::unique_lock<std::mutex> lock(queue_mutex);
cv_finished.wait(lock, [this](){ return tasks.empty() && (busy == 0); });
}
// a cpu-busy task.
void work_proc()
{
std::random_device rd;
std::mt19937 rng(rd());
// build a vector of random numbers
std::vector<int> data;
data.reserve(100000);
std::generate_n(std::back_inserter(data), data.capacity(), [&](){ return rng(); });
std::sort(data.begin(), data.end(), std::greater<int>());
}
int main()
{
ThreadPool tp;
// run five batches of 100 items
for (int x=0; x<5; ++x)
{
// queue 100 work tasks
for (int i=0; i<100; ++i)
tp.enqueue(work_proc);
tp.waitFinished();
std::cout << tp.getProcessed() << '\n';
}
// destructor will close down thread pool
return EXIT_SUCCESS;
}
Output
100
200
300
400
500
Best of luck.

How to make boost::thread_group execute a fixed number of parallel threads

This is the code to create a thread_group and execute all threads in parallel:
boost::thread_group group;
for (int i = 0; i < 15; ++i)
group.create_thread(aFunctionToExecute);
group.join_all();
This code will execute all threads at once. What I want is to execute them all, but with at most 4 running in parallel. When one terminates, another is started, until there are no more to execute.
Another, more efficient solution would be to have each thread callback to the primary thread when they are finished, and the handler on the primary thread could launch a new thread each time. This prevents the repetitive calls to timed_join, as the primary thread won't do anything until the callback is triggered.
I have something like this:
boost::mutex mutex_;
boost::condition_variable condition_;
const size_t throttle_;
size_t size_;
bool wait_;
template <typename Env, class F>
void eval_(const Env &env, const F &f) {
{
boost::unique_lock<boost::mutex> lock(mutex_);
// Block while every slot is taken, then claim one.
while (size_ >= throttle_) condition_.wait(lock);
++size_;
}
f.eval(env);
{
boost::lock_guard<boost::mutex> lock(mutex_);
--size_;
}
condition_.notify_one();
}
I think you are looking for a thread_pool implementation, which is available here.
Additionally, I have noticed that if you create a vector of std::future, store the futures of many std::async tasks in it, and the function passed to each task contains no blocking code, VS2013 (at least from what I can confirm) launches only as many threads as your machine can handle and reuses them once created.
I created my own simplified interface of boost::thread_group to do this job:
class ThreadGroup : public boost::noncopyable
{
private:
boost::thread_group group;
std::size_t maxSize;
float sleepStart;
float sleepCoef;
float sleepMax;
std::set<boost::thread*> running;
public:
ThreadGroup(std::size_t max_size = 0,
float max_sleeping_time = 1.0f,
float sleeping_time_coef = 1.5f,
float sleeping_time_start = 0.001f) :
boost::noncopyable(),
group(),
maxSize(max_size),
sleepStart(sleeping_time_start),
sleepCoef(sleeping_time_coef),
sleepMax(max_sleeping_time),
running()
{
if(max_size == 0)
this->maxSize = (std::size_t)std::max(boost::thread::hardware_concurrency(), 1u);
assert(max_sleeping_time >= sleeping_time_start);
assert(sleeping_time_start > 0.0f);
assert(sleeping_time_coef > 1.0f);
}
~ThreadGroup()
{
this->joinAll();
}
template<typename F> boost::thread* createThread(F f)
{
float sleeping_time = this->sleepStart;
while(this->running.size() >= this->maxSize)
{
for(std::set<boost::thread*>::iterator it = running.begin(); it != running.end();)
{
const std::set<boost::thread*>::iterator jt = it++;
if((*jt)->timed_join(boost::posix_time::milliseconds((long int)(1000.0f * sleeping_time))))
running.erase(jt);
}
if(sleeping_time < this->sleepMax)
{
sleeping_time *= this->sleepCoef;
if(sleeping_time > this->sleepMax)
sleeping_time = this->sleepMax;
}
}
return *this->running.insert(this->group.create_thread(f)).first;
}
void joinAll()
{
this->group.join_all();
}
void interruptAll()
{
#ifdef BOOST_THREAD_PROVIDES_INTERRUPTIONS
this->group.interrupt_all();
#endif
}
std::size_t size() const
{
return this->group.size();
}
};
Here is an example of use, very similar to boost::thread_group with the main difference that the creation of the thread is a waiting point:
{
ThreadGroup group(4);
for(int i = 0; i < 15; ++i)
group.createThread(aFunctionToExecute);
} // join all at destruction
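For the original requirement (15 jobs, at most 4 running at once), a fixed set of worker threads that claim jobs from a shared index avoids polling with timed_join altogether. The sketch below uses only the standard library rather than Boost, and run_limited is a hypothetical name.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Runs every job, with at most max_parallel of them executing at once.
void run_limited(const std::vector<std::function<void()>>& jobs,
                 std::size_t max_parallel) {
    assert(max_parallel > 0);
    std::atomic<std::size_t> next{0};  // index of the next unclaimed job
    const std::size_t n_workers = std::min(max_parallel, jobs.size());
    std::vector<std::thread> workers;
    workers.reserve(n_workers);
    for (std::size_t w = 0; w < n_workers; ++w) {
        workers.emplace_back([&jobs, &next] {
            // Each worker claims jobs until none are left.
            for (std::size_t i = next.fetch_add(1); i < jobs.size();
                 i = next.fetch_add(1))
                jobs[i]();
        });
    }
    for (auto& t : workers) t.join();
}
```

With a vector of 15 copies of aFunctionToExecute, run_limited(jobs, 4) would start four threads, each picking up the next job as soon as it finishes the previous one. Unlike ThreadGroup above, all jobs must be known up front.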