Automatically delete containers sent to asynchronous functions/io_service - c++

I would like to use an unordered_map as a job or session context object. I would like to allocate it in some function, bundle it with a static function in a function object, and send that function object to an io_service. And, obviously, I do not want to worry about deallocating it.
Any ideas on how to do that?
Thank you!
#include <iostream>
#include <unordered_map>
#include "boost/asio.hpp"
#include "boost/thread.hpp"
using namespace std;
namespace asio = boost::asio;
typedef std::unique_ptr<asio::io_service::work> work_ptr;
typedef boost::function<void(void) > boost_void_void_fun;
class job_processor {
public:
job_processor(int threads) : thread_count(threads) {
service = new asio::io_service();
work = new work_ptr(new asio::io_service::work(*(service)));
for (int i = 0; i < this->thread_count; ++i)
workers.create_thread(boost::bind(&asio::io_service::run, service));
}
void post_task(boost_void_void_fun job) {
this->service->post(job);
}
void drain() {
this->work->reset();
}
void wait() {
this->workers.join_all();
}
private:
int thread_count;
work_ptr * work;
asio::io_service* service;
boost::thread_group workers;
};
typedef std::unordered_map<string, unsigned long> map_t;
class with_static_function {
public:
static void print_map(map_t map) {
for(map_t::iterator it = map.begin(); it != map.end(); ++it)
std::cout << it->first << ":" << it->second << std::endl;
}
static void print__heap_map(map_t* map) {
if(!map) return;
for(map_t::iterator it = map->begin(); it != map->end(); ++it)
std::cout << it->first << ":" << it->second << std::endl;
}
};
int main(int argc, char** argv) {
map_t words;
words["one"] = 1;
// pass the reference;
with_static_function::print_map(words);
job_processor *pr = new job_processor(4);
{
map_t* heap_map = new map_t;
(*heap_map)["two"] = 2;
// I need this variable to the job_processor;
// and I do not want to worry about deallocation.
// should happen automatically somehow.
// I am ok with changing the variable to be a shared_ptr or
// anything else that works.
boost_void_void_fun fun = boost::bind(
&with_static_function::print__heap_map,
heap_map);
fun(); // if binding was done right this should have worked.
pr->post_task(fun);
}
pr->drain();
pr->wait();
delete pr;
return 0;
}

A number of observations:
Stop Emulating Java. Do not use new unless you're implementing an ownership primitive (smart handle/pointer type). Specifically, just create a pr:
job_processor pr(4);
Same goes for all the members of job_processor (you were leaking everything, and if job_processor were copied, you'd get double-free Undefined Behaviour).
The code
// pass the reference;
with_static_function::print_map(words);
passes by value, despite what the comment says, meaning the whole map is copied.
To avoid that copy, fix the print_map signature:
static void print_map(map_t const& map) {
for(map_t::const_iterator it = map.begin(); it != map.end(); ++it)
std::cout << it->first << ":" << it->second << std::endl;
}
Of course, consider just writing
static void print_map(map_t const& map) {
for(auto& e : map)
std::cout << e.first << ":" << e.second << "\n";
}
The "heap" overload of that could be, as my wording implies, an overload. Be sure to remove the useless duplication of code (!):
static void print_map(map_t const* map) {
if (map) print_map(*map);
}
You don't even need that overload because you can simply use a lambda to bind (instead of boost::bind):
auto heap_map = boost::make_shared<map_t>();
heap_map->insert({{"two", 2}, {"three", 3}});
boost_void_void_fun fun = [heap_map] { with_static_function::print_map(*heap_map); };
Complete working program:
#include <iostream>
#include <unordered_map>
#include <boost/asio.hpp>
#include <boost/make_shared.hpp>
#include <boost/thread.hpp>
#include <boost/function.hpp>
#include <boost/bind.hpp>
namespace asio = boost::asio;
typedef boost::function<void(void)> boost_void_void_fun;
class job_processor {
public:
job_processor(int threads) : service(), work(boost::asio::io_service::work(service))
{
for (int i = 0; i < threads; ++i)
workers.create_thread(boost::bind(&asio::io_service::run, &service));
}
void post_task(boost_void_void_fun job) {
service.post(job);
}
void drain() {
work.reset();
}
void wait() {
workers.join_all();
}
private:
asio::io_service service;
boost::optional<asio::io_service::work> work;
boost::thread_group workers;
};
typedef std::unordered_map<std::string, unsigned long> map_t;
namespace with_static_function {
static void print_map(map_t const& map) {
for(auto& e : map)
std::cout << e.first << ":" << e.second << "\n";
}
}
int main() {
// pass the reference;
with_static_function::print_map({ { "one", 1 } });
job_processor pr(4);
{
auto heap_map = boost::make_shared<map_t>();
heap_map->insert({{"two", 2}, {"three", 3}});
boost_void_void_fun fun = [heap_map] { with_static_function::print_map(*heap_map); };
pr.post_task(fun);
}
pr.drain();
pr.wait();
}
Prints
one:1
three:3
two:2
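To see why capturing the shared pointer by value answers the original question, here is a minimal sketch that uses only the standard library (std::shared_ptr and std::function stand in for the Boost equivalents; the names make_job, job_t and lifetime_demo are hypothetical). It assumes nothing about io_service: it only shows that the map lives exactly as long as some copy of the function object holds it.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>

using map_t = std::unordered_map<std::string, unsigned long>;
using job_t = std::function<std::size_t()>;

// Hypothetical helper: builds a job that owns its context map.
// `observer` lets the caller watch the map's lifetime without extending it.
job_t make_job(std::weak_ptr<map_t>& observer) {
    auto ctx = std::make_shared<map_t>();
    ctx->insert({{"two", 2}, {"three", 3}});
    observer = ctx;
    // The lambda copies the shared_ptr, so the map outlives this function.
    return [ctx] { return ctx->size(); };
}

// Walks through the lifetime; returns true once the map has been freed.
bool lifetime_demo() {
    std::weak_ptr<map_t> observer;
    {
        job_t job = make_job(observer);
        job_t queued = job;            // e.g. the copy a work queue would hold
        assert(!observer.expired());   // map is alive while any copy exists
        assert(queued() == 2);
    }                                  // last copy destroyed here
    return observer.expired();
}
```

No explicit delete appears anywhere: destroying the last copy of the job releases the map.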

Related

Passing std::function as a parameter

I am trying to pass a std::function as a parameter. I am having a couple of problems with the syntax.
The code is simple. I want to save a function handler into a std::map. I don't want to make the registerHandler method a template. Here is the code example.
#include <map>
#include <functional>
#include <iostream>
using namespace std;
class message
{
public:
void print(string command)
{
cout << "message id: " << id_ << " content: " << command << std::endl;
}
int id_ = 0;
};
std::map<int, function<void(string)>> functionMap;
void registerHandler(int id, message& messageClass, std::function<void(string)>& func)
{
auto messageHandler = bind(&func, &messageClass, placeholders::_1);
// ERROR #2
// functionMap.insert({id, messageHandler});
}
int main()
{
message msg1;
msg1.id_ = 5000;
// ERROR #1
// registerHandler(msg1.id_, msg1, message::print);
std::map<int, function<void(string)>>::iterator iter;
iter = functionMap.begin();
while (iter != functionMap.end())
{
int key = iter->first;
auto messageHandler = iter->second;
messageHandler("Junk Payload");
iter++;
}
}
ERROR#1
message.cc:37:44: error: invalid use of non-static member function ‘void message::print(std::string)’
registerHandler(msg1.id_, msg1, message::print);
ERROR#2
message.cc: In function ‘void registerHandler(int, message&, std::function<void(std::basic_string<char>)>&)’:
message.cc:24:42: error: no matching function for call to ‘std::map<int, std::function<void(std::basic_string<char>)> >::insert(<brace-enclosed initializer list>)’
functionMap.insert({id, messageHandler});
^
You can make your program work by binding the member function to the object at the call site and storing the resulting function object directly:
#include <map>
#include <functional>
#include <iostream>
using namespace std;
class message
{
public:
void print(string command)
{
cout << "message id: " << id_ << " content: " << command << std::endl;
}
int id_ = 0;
};
std::map<int, function<void(string)>> functionMap;
void registerHandler(int id, message& messageClass, const std::function<void(string)>& func)
{
functionMap.insert({id, func});
}
int main()
{
message msg1;
msg1.id_ = 5000;
registerHandler(msg1.id_, msg1, bind(&message::print, &msg1, placeholders::_1));
std::map<int, function<void(string)>>::iterator iter;
iter = functionMap.begin();
while (iter != functionMap.end())
{
int key = iter->first;
auto messageHandler = iter->second;
messageHandler("Junk Payload");
iter++;
}
}
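As an alternative to bind, a lambda can do the same job. The sketch below is one possible variant, not the answer's own code: it assumes the handler takes the string by const reference, adds a last_ member purely so the effect can be checked, and introduces the hypothetical helpers dispatch and handler_demo.

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <map>
#include <string>
#include <utility>

class message {
public:
    void print(const std::string& command) {
        // Kept in a member as well, only so the effect can be checked.
        last_ = "message id: " + std::to_string(id_) + " content: " + command;
        std::cout << last_ << '\n';
    }
    int id_ = 0;
    std::string last_;
};

using handler_t = std::function<void(const std::string&)>;
std::map<int, handler_t> functionMap;

// Takes the handler by value and moves it into the map; no template needed.
void registerHandler(int id, handler_t func) {
    assert(func != nullptr);  // an empty std::function would throw when called
    functionMap.emplace(id, std::move(func));
}

// Hypothetical helper: invoke the handler registered under `id`, if any.
bool dispatch(int id, const std::string& payload) {
    auto it = functionMap.find(id);
    if (it == functionMap.end()) return false;
    it->second(payload);
    return true;
}

// Usage sketch: the lambda captures msg1 by reference, so msg1 must outlive
// its entry in functionMap.
bool handler_demo() {
    message msg1;
    msg1.id_ = 5000;
    registerHandler(msg1.id_, [&msg1](const std::string& s) { msg1.print(s); });
    const bool called = dispatch(msg1.id_, "Junk Payload");
    functionMap.erase(msg1.id_);  // drop the handler before msg1 goes away
    return called && msg1.last_ == "message id: 5000 content: Junk Payload";
}
```

The same lifetime caveat applies to the bind version above, which stores a pointer to msg1.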

Get sub-map from std::map by number of elements instead of key using iterator

I have a std::map<std::string, std::vector<std::string>> and I need to perform a threaded task on this map by dividing the map into sub-maps and passing each sub-map to a thread.
With a std::vector<T> I would be able to get a sub-vector pretty easily, by doing this:
#include <vector>
#include <string>
int main(void)
{
size_t off = 0;
size_t num_elms = 100; // Made up value
std::vector<uint8_t> full; // Assume filled with stuff
std::vector<uint8_t> sub(std::begin(full) + off, std::begin(full) + off + num_elms);
off = off + num_elms;
}
However, doing the same with std::map<T1, T2> gives a compilation error.
#include <vector>
#include <map>
#include <string>
int main(void)
{
size_t off = 0;
size_t num_elms = 100;
std::map<std::string, std::vector<std::string>> full;
std::map<std::string, std::vector<std::string>> sub(std::begin(full) + off,
std::begin(full) + off + num_elms);
off = off + num_elms;
}
It is the same with other std::map "types", which, from what I have gathered, is down to the iterator.
What is possible is to extract the keys and do something similar to this solution:
#include <map>
#include <vector>
#include <string>
#include <iostream>
void print_map(const std::map<std::string, std::vector<std::string>>& _map)
{
for (const auto& [key, value] : _map)
{
std::cout << "key: " << key << "\nvalues\n";
for (const auto& elm : value)
{
std::cout << "\t" << elm << "\n";
}
}
}
void print_keys(const std::vector<std::string>& keys)
{
std::cout << "keys: \n";
for(const auto& key : keys)
{
std::cout << key << "\n";
}
}
int main(void)
{
std::map<std::string, std::vector<std::string>> full;
full["aa"] = {"aa", "aaaa", "aabb"};
full["bb"] = {"bb", "bbbbb", "bbaa"};
full["cc"] = {"cc", "cccc", "ccbb"};
full["dd"] = {"dd", "dd", "ddcc"};
print_map(full);
std::vector<std::string> keys;
for (const auto& [key, value] : full)
{
(void) value;
keys.emplace_back(key);
}
print_keys(keys);
size_t off = 0;
size_t num_elms = 2;
std::map<std::string, std::vector<std::string>> sub1 (full.find(keys.at(off)), full.find(keys.at(off + num_elms)));
off = off + num_elms;
std::map<std::string, std::vector<std::string>> sub2 (full.find(keys.at(off)), full.find(keys.at(off + num_elms -1)));
std::cout << "sub1:\n";
print_map(sub1);
std::cout << "sub2:\n";
print_map(sub2);
}
However, this has the potential to be extremely inefficient, as the map can be really big (10k+ elements).
So, is there a better way to replicate the std::vector approach with std::map?
A slightly different approach would be to use one of the execution policies added in C++17, like std::execution::parallel_policy. In the example below, the instance std::execution::par is used:
#include <execution>
// ...
std::for_each(std::execution::par, full.begin(), full.end(), [](auto& p) {
// Here you are likely using a thread from a built-in thread pool
auto& vec = p.second;
// do work with "vec"
});
With a slight adaptation, you can reasonably easily pass ranges to print_map, and divide up your map by calling std::next on an iterator.
// Minimal range-for support
template <typename Iter>
struct Range {
Range (Iter b, Iter e) : b(b), e(e) {}
Iter b;
Iter e;
Iter begin() const { return b; }
Iter end() const { return e; }
};
// some shorter aliases
using Map = std::map<std::string, std::vector<std::string>>;
using MapView = Range<Map::const_iterator>;
// not necessarily the whole map
void print_map(MapView map) {
for (const auto& [key, value] : map)
{
std::cout << "key: " << key << "\nvalues\n";
for (const auto& elm : value)
{
std::cout << "\t" << elm << "\n";
}
}
}
int main(void)
{
Map full;
full["aa"] = {"aa", "aaaa", "aabb"};
full["bb"] = {"bb", "bbbbb", "bbaa"};
full["cc"] = {"cc", "cccc", "ccbb"};
full["dd"] = {"dd", "dd", "ddcc"};
// can still print the whole map
print_map({ full.begin(), full.end() });
size_t num_elms = 2;
size_t num_full_views = full.size() / num_elms;
std::vector<MapView> views;
auto it = full.begin();
for (size_t i = 0; i < num_full_views; ++i) {
auto next = std::next(it, num_elms);
views.emplace_back(it, next);
it = next;
}
if (it != full.end()) {
views.emplace_back(it, full.end());
}
for (auto view : views) {
print_map(view);
}
}
In C++20 (or with another ranges library), this can be simplified with std::ranges::drop_view / std::ranges::take_view.
using MapView = decltype(std::declval<Map&>() | std::ranges::views::drop(0) | std::ranges::views::take(0));
for (size_t i = 0; i < full.size(); i += num_elms) {
views.push_back(full | std::ranges::views::drop(i) | std::ranges::views::take(num_elms));
}
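The iterator-splitting idea can also be packaged as a small function. The following is a sketch only; split_map and MapRange are hypothetical names, and it assumes the map is not modified while the ranges are in use. It walks the map once in total, which matters because map iterators are bidirectional and every std::next call costs time proportional to the distance advanced.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Map = std::map<std::string, std::vector<std::string>>;
using MapRange = std::pair<Map::const_iterator, Map::const_iterator>;

// Hypothetical helper: cut `m` into consecutive iterator ranges of at most
// `chunk` elements each. Nothing is copied; the ranges stay valid as long as
// the elements they refer to are not erased.
std::vector<MapRange> split_map(const Map& m, std::size_t chunk) {
    assert(chunk > 0);
    std::vector<MapRange> ranges;
    auto it = m.begin();
    std::size_t remaining = m.size();
    while (remaining > 0) {
        const std::size_t n = remaining < chunk ? remaining : chunk;
        auto next = std::next(it, static_cast<std::ptrdiff_t>(n));
        ranges.emplace_back(it, next);  // half-open range [it, next)
        it = next;
        remaining -= n;
    }
    return ranges;
}
```

Each resulting pair can be handed to a worker that only reads its own range, so the sub-maps never need to be materialized.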

Boost thread pool join tasks without closing the pool [duplicate]

Consider the functions
#include <iostream>
#include <boost/bind.hpp>
#include <boost/asio.hpp>
void foo(const uint64_t begin, uint64_t *result)
{
uint64_t prev[] = {begin, 0};
for (uint64_t i = 0; i < 1000000000; ++i)
{
const auto tmp = (prev[0] + prev[1]) % 1000;
prev[1] = prev[0];
prev[0] = tmp;
}
*result = prev[0];
}
void batch(boost::asio::thread_pool &pool, const uint64_t a[])
{
uint64_t r[] = {0, 0};
boost::asio::post(pool, boost::bind(foo, a[0], &r[0]));
boost::asio::post(pool, boost::bind(foo, a[1], &r[1]));
pool.join();
std::cerr << "foo(" << a[0] << "): " << r[0] << " foo(" << a[1] << "): " << r[1] << std::endl;
}
where foo is a simple "pure" function that performs a calculation on begin and writes the result to the pointer *result.
This function gets called with different inputs from batch. Here dispatching each call to another CPU core might be beneficial.
Now assume the batch function gets called tens of thousands of times. A thread pool shared between all the sequential batch calls would therefore be nice.
Trying this with (for the sake of simplicity only 3 calls)
int main(int argn, char **)
{
boost::asio::thread_pool pool(2);
const uint64_t a[] = {2, 4};
batch(pool, a);
const uint64_t b[] = {3, 5};
batch(pool, b);
const uint64_t c[] = {7, 9};
batch(pool, c);
}
leads to the result
foo(2): 2 foo(4): 4
foo(3): 0 foo(5): 0
foo(7): 0 foo(9): 0
All three lines appear at the same time, even though the computation of foo takes ~3s.
I assume that only the first join really waits for the pool to complete all jobs.
The other batches report invalid results (the initial zero values).
What is the best practice here to reuse the thread pool?
The best practice is to reuse the pool rather than join it after every batch (what would be the use of pooling if you kept creating new pools?).
If you want to be sure you "time" the batches together, I'd suggest using when_all on futures:
#define BOOST_THREAD_PROVIDES_FUTURE_WHEN_ALL_WHEN_ANY
#include <iostream>
#include <boost/bind.hpp>
#include <boost/asio.hpp>
#include <boost/thread.hpp>
uint64_t foo(uint64_t begin) {
uint64_t prev[] = {begin, 0};
for (uint64_t i = 0; i < 1000000000; ++i) {
const auto tmp = (prev[0] + prev[1]) % 1000;
prev[1] = prev[0];
prev[0] = tmp;
}
return prev[0];
}
void batch(boost::asio::thread_pool &pool, const uint64_t a[2])
{
using T = boost::packaged_task<uint64_t>;
T tasks[] {
T(boost::bind(foo, a[0])),
T(boost::bind(foo, a[1])),
};
auto all = boost::when_all(
tasks[0].get_future(),
tasks[1].get_future());
for (auto& t : tasks)
post(pool, std::move(t));
auto [r0, r1] = all.get();
std::cerr << "foo(" << a[0] << "): " << r0.get() << " foo(" << a[1] << "): " << r1.get() << std::endl;
}
int main() {
boost::asio::thread_pool pool(2);
const uint64_t a[] = {2, 4};
batch(pool, a);
const uint64_t b[] = {3, 5};
batch(pool, b);
const uint64_t c[] = {7, 9};
batch(pool, c);
}
Prints
foo(2): 2 foo(4): 4
foo(3): 503 foo(5): 505
foo(7): 507 foo(9): 509
I would consider two further options: generalizing, and message queuing.
Generalized
Make it somewhat more flexible by not hardcoding batch sizes. After all, the pool size is already fixed, so we don't need to "make sure batches fit" or something:
#define BOOST_THREAD_PROVIDES_FUTURE_WHEN_ALL_WHEN_ANY
#include <iostream>
#include <boost/bind.hpp>
#include <boost/asio.hpp>
#include <boost/thread.hpp>
#include <boost/thread/future.hpp>
struct Result { uint64_t begin, result; };
Result foo(uint64_t begin) {
uint64_t prev[] = {begin, 0};
for (uint64_t i = 0; i < 1000000000; ++i) {
const auto tmp = (prev[0] + prev[1]) % 1000;
prev[1] = prev[0];
prev[0] = tmp;
}
return { begin, prev[0] };
}
void batch(boost::asio::thread_pool &pool, std::vector<uint64_t> const a)
{
using T = boost::packaged_task<Result>;
std::vector<T> tasks;
tasks.reserve(a.size());
for(auto begin : a)
tasks.emplace_back(boost::bind(foo, begin));
std::vector<boost::unique_future<T::result_type> > futures;
for (auto& t : tasks) {
futures.push_back(t.get_future());
post(pool, std::move(t));
}
for (auto& fut : boost::when_all(futures.begin(), futures.end()).get()) {
auto r = fut.get();
std::cerr << "foo(" << r.begin << "): " << r.result << " ";
}
std::cout << std::endl;
}
int main() {
boost::asio::thread_pool pool(2);
batch(pool, {2});
batch(pool, {4, 3, 5});
batch(pool, {7, 9});
}
Prints
foo(2): 2
foo(4): 4 foo(3): 503 foo(5): 505
foo(7): 507 foo(9): 509
Generalized2: Variadics Simplify
Contrary to popular belief (and honestly, what usually happens), this time we can leverage variadics to get rid of all the intermediate vectors (every single one of them):
template <typename...T>
void batch(boost::asio::thread_pool &pool, T... a)
{
auto launch = [&pool](uint64_t begin) {
boost::packaged_task<Result> pt(boost::bind(foo, begin));
auto fut = pt.get_future();
post(pool, std::move(pt));
return fut;
};
for (auto& r : {launch(a).get()...}) {
std::cerr << "foo(" << r.begin << "): " << r.result << " ";
}
std::cout << std::endl;
}
If you insist on outputting the results in time, you can still add when_all into the mix (requiring a bit more heroics to unpack the tuple):
template <typename...T>
void batch(boost::asio::thread_pool &pool, T... a)
{
auto launch = [&pool](uint64_t begin) {
boost::packaged_task<Result> pt(boost::bind(foo, begin));
auto fut = pt.get_future();
post(pool, std::move(pt));
return fut;
};
std::apply([](auto&&... rfut) {
Result results[] {rfut.get()...};
for (auto& r : results) {
std::cerr << "foo(" << r.begin << "): " << r.result << " ";
}
}, boost::when_all(launch(a)...).get());
std::cout << std::endl;
}
Both still print the same result.
Message Queuing
This is very natural to Boost, and sort of skips most of the complexity. If you also want to report per batched group, you'd have to coordinate:
#include <iostream>
#include <boost/asio.hpp>
#include <memory>
struct Result { uint64_t begin, result; };
Result foo(uint64_t begin) {
uint64_t prev[] = {begin, 0};
for (uint64_t i = 0; i < 1000000000; ++i) {
const auto tmp = (prev[0] + prev[1]) % 1000;
prev[1] = prev[0];
prev[0] = tmp;
}
return { begin, prev[0] };
}
using Group = std::shared_ptr<size_t>;
void batch(boost::asio::thread_pool &pool, std::vector<uint64_t> begins) {
auto group = std::make_shared<std::vector<Result> >(begins.size());
for (size_t i=0; i < begins.size(); ++i) {
post(pool, [i,begin=begins.at(i),group] {
(*group)[i] = foo(begin);
if (group.unique()) {
for (auto& r : *group) {
std::cout << "foo(" << r.begin << "): " << r.result << " ";
}
std::cout << std::endl;
}
});
}
}
int main() {
boost::asio::thread_pool pool(2);
batch(pool, {2});
batch(pool, {4, 3, 5});
batch(pool, {7, 9});
pool.join();
}
Note that the tasks access group concurrently, which is safe because each task writes only to its own element.
Prints:
foo(2): 2
foo(4): 4 foo(3): 503 foo(5): 505
foo(7): 507 foo(9): 509
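One caveat about the group.unique() check: std::shared_ptr::unique() was deprecated in C++17 and removed in C++20, and the use count is only approximate when several threads hold copies. A possible alternative, sketched here under the assumption that every task calls finish exactly once, is an explicit atomic countdown. BatchGroup and finish are hypothetical names, not part of Boost or of the answer above.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Result { std::uint64_t begin, result; };

// Hypothetical per-batch state, meant to be shared via std::shared_ptr.
class BatchGroup {
public:
    explicit BatchGroup(std::size_t n) : results_(n), remaining_(n) {}

    // Called by task `i` when it is done. Each task writes only its own
    // slot, so the stores do not race. Returns true for exactly one caller:
    // the one that completes the batch.
    bool finish(std::size_t i, Result r) {
        assert(i < results_.size());
        results_[i] = r;
        // acq_rel: the last finisher sees every earlier task's slot.
        return remaining_.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }

    const std::vector<Result>& results() const { return results_; }

private:
    std::vector<Result> results_;
    std::atomic<std::size_t> remaining_;
};
```

Inside the posted lambda, the unique() test would become something like if (group->finish(i, foo(begin))), with the reporting loop reading group->results().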
I just ran into this advanced executor example which is hidden from the documentation:
I realized just now that Asio comes with a fork_executor example which does exactly this: you can "group" tasks and join the executor (which represents that group) instead of the pool. I've missed this for the longest time since none of the executor examples are listed in the HTML documentation.
So without further ado, here's that sample applied to your question:
#define BOOST_BIND_NO_PLACEHOLDERS
#include <boost/asio/thread_pool.hpp>
#include <boost/asio/ts/executor.hpp>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
// A fixed-size thread pool used to implement fork/join semantics. Functions
// are scheduled using a simple FIFO queue. Implementing work stealing, or
// using a queue based on atomic operations, are left as tasks for the reader.
class fork_join_pool : public boost::asio::execution_context {
public:
// The constructor starts a thread pool with the specified number of
// threads. Note that the thread_count is not a fixed limit on the pool's
// concurrency. Additional threads may temporarily be added to the pool if
// they join a fork_executor.
explicit fork_join_pool(std::size_t thread_count = std::thread::hardware_concurrency()*2)
: use_count_(1), threads_(thread_count)
{
try {
// Ask each thread in the pool to dequeue and execute functions
// until it is time to shut down, i.e. the use count is zero.
for (thread_count_ = 0; thread_count_ < thread_count; ++thread_count_) {
boost::asio::dispatch(threads_, [&] {
std::unique_lock<std::mutex> lock(mutex_);
while (use_count_ > 0)
if (!execute_next(lock))
condition_.wait(lock);
});
}
} catch (...) {
stop_threads();
threads_.join();
throw;
}
}
// The destructor waits for the pool to finish executing functions.
~fork_join_pool() {
stop_threads();
threads_.join();
}
private:
friend class fork_executor;
// The base for all functions that are queued in the pool.
struct function_base {
std::shared_ptr<std::size_t> work_count_;
void (*execute_)(std::shared_ptr<function_base>& p);
};
// Execute the next function from the queue, if any. Returns true if a
// function was executed, and false if the queue was empty.
bool execute_next(std::unique_lock<std::mutex>& lock) {
if (queue_.empty())
return false;
auto p(queue_.front());
queue_.pop();
lock.unlock();
execute(lock, p);
return true;
}
// Execute a function and decrement the outstanding work.
void execute(std::unique_lock<std::mutex>& lock,
std::shared_ptr<function_base>& p) {
std::shared_ptr<std::size_t> work_count(std::move(p->work_count_));
try {
p->execute_(p);
lock.lock();
do_work_finished(work_count);
} catch (...) {
lock.lock();
do_work_finished(work_count);
throw;
}
}
// Increment outstanding work.
void
do_work_started(const std::shared_ptr<std::size_t>& work_count) noexcept {
if (++(*work_count) == 1)
++use_count_;
}
// Decrement outstanding work. Notify waiting threads if we run out.
void
do_work_finished(const std::shared_ptr<std::size_t>& work_count) noexcept {
if (--(*work_count) == 0) {
--use_count_;
condition_.notify_all();
}
}
// Dispatch a function, executing it immediately if the queue is already
// loaded. Otherwise adds the function to the queue and wakes a thread.
void do_dispatch(std::shared_ptr<function_base> p,
const std::shared_ptr<std::size_t>& work_count) {
std::unique_lock<std::mutex> lock(mutex_);
if (queue_.size() > thread_count_ * 16) {
do_work_started(work_count);
lock.unlock();
execute(lock, p);
} else {
queue_.push(p);
do_work_started(work_count);
condition_.notify_one();
}
}
// Add a function to the queue and wake a thread.
void do_post(std::shared_ptr<function_base> p,
const std::shared_ptr<std::size_t>& work_count) {
std::lock_guard<std::mutex> lock(mutex_);
queue_.push(p);
do_work_started(work_count);
condition_.notify_one();
}
// Ask all threads to shut down.
void stop_threads() {
std::lock_guard<std::mutex> lock(mutex_);
--use_count_;
condition_.notify_all();
}
std::mutex mutex_;
std::condition_variable condition_;
std::queue<std::shared_ptr<function_base>> queue_;
std::size_t use_count_;
std::size_t thread_count_;
boost::asio::thread_pool threads_;
};
// A class that satisfies the Executor requirements. Every function or piece of
// work associated with a fork_executor is part of a single, joinable group.
class fork_executor {
public:
fork_executor(fork_join_pool& ctx)
: context_(ctx), work_count_(std::make_shared<std::size_t>(0)) {}
fork_join_pool& context() const noexcept { return context_; }
void on_work_started() const noexcept {
std::lock_guard<std::mutex> lock(context_.mutex_);
context_.do_work_started(work_count_);
}
void on_work_finished() const noexcept {
std::lock_guard<std::mutex> lock(context_.mutex_);
context_.do_work_finished(work_count_);
}
template <class Func, class Alloc>
void dispatch(Func&& f, const Alloc& a) const {
auto p(std::allocate_shared<exFun<Func>>(
typename std::allocator_traits<Alloc>::template rebind_alloc<char>(a),
std::move(f), work_count_));
context_.do_dispatch(p, work_count_);
}
template <class Func, class Alloc> void post(Func f, const Alloc& a) const {
auto p(std::allocate_shared<exFun<Func>>(
typename std::allocator_traits<Alloc>::template rebind_alloc<char>(a),
std::move(f), work_count_));
context_.do_post(p, work_count_);
}
template <class Func, class Alloc>
void defer(Func&& f, const Alloc& a) const {
post(std::forward<Func>(f), a);
}
friend bool operator==(const fork_executor& a, const fork_executor& b) noexcept {
return a.work_count_ == b.work_count_;
}
friend bool operator!=(const fork_executor& a, const fork_executor& b) noexcept {
return a.work_count_ != b.work_count_;
}
// Block until all work associated with the executor is complete. While it
// is waiting, the thread may be borrowed to execute functions from the
// queue.
void join() const {
std::unique_lock<std::mutex> lock(context_.mutex_);
while (*work_count_ > 0)
if (!context_.execute_next(lock))
context_.condition_.wait(lock);
}
private:
template <class Func> struct exFun : fork_join_pool::function_base {
explicit exFun(Func f, const std::shared_ptr<std::size_t>& w)
: function_(std::move(f)) {
work_count_ = w;
execute_ = [](std::shared_ptr<fork_join_pool::function_base>& p) {
Func tmp(std::move(static_cast<exFun*>(p.get())->function_));
p.reset();
tmp();
};
}
Func function_;
};
fork_join_pool& context_;
std::shared_ptr<std::size_t> work_count_;
};
// Helper class to automatically join a fork_executor when exiting a scope.
class join_guard {
public:
explicit join_guard(const fork_executor& ex) : ex_(ex) {}
join_guard(const join_guard&) = delete;
join_guard(join_guard&&) = delete;
~join_guard() { ex_.join(); }
private:
fork_executor ex_;
};
//------------------------------------------------------------------------------
#include <algorithm>
#include <iostream>
#include <random>
#include <vector>
#include <boost/bind.hpp>
static void foo(const uint64_t begin, uint64_t *result)
{
uint64_t prev[] = {begin, 0};
for (uint64_t i = 0; i < 1000000000; ++i) {
const auto tmp = (prev[0] + prev[1]) % 1000;
prev[1] = prev[0];
prev[0] = tmp;
}
*result = prev[0];
}
void batch(fork_join_pool &pool, const uint64_t (&a)[2])
{
uint64_t r[] = {0, 0};
{
fork_executor fork(pool);
join_guard join(fork);
boost::asio::post(fork, boost::bind(foo, a[0], &r[0]));
boost::asio::post(fork, boost::bind(foo, a[1], &r[1]));
// fork.join(); // or let join_guard destructor run
}
std::cerr << "foo(" << a[0] << "): " << r[0] << " foo(" << a[1] << "): " << r[1] << std::endl;
}
int main() {
fork_join_pool pool;
batch(pool, {2, 4});
batch(pool, {3, 5});
batch(pool, {7, 9});
}
Prints:
foo(2): 2 foo(4): 4
foo(3): 503 foo(5): 505
foo(7): 507 foo(9): 509
Things to note:
executors can overlap/nest: you can use several joinable fork_executors on a single fork_join_pool and they will join the distinct groups of tasks for each executor
You can get that sense easily when looking at the library example (which does a recursive divide-and-conquer merge sort).
I had a similar problem and ended up using latches. In this case the code would be (I also switched from bind to lambdas):
void batch(boost::asio::thread_pool &pool, const uint64_t a[])
{
uint64_t r[] = {0, 0};
boost::latch latch(2);
boost::asio::post(pool, [&](){ foo(a[0], &r[0]); latch.count_down();});
boost::asio::post(pool, [&](){ foo(a[1], &r[1]); latch.count_down();});
latch.wait();
std::cerr << "foo(" << a[0] << "): " << r[0] << " foo(" << a[1] << "): " << r[1] << std::endl;
}
https://godbolt.org/z/oceP6jjs7

c++ class method that takes arbitrary number of callbacks and stores results

I've been trying to think of a way to have my class method take an arbitrary number of callback functions, run all of them, and then store the output. I think this works, but is there a way I can do this where I don't have to make the user wrap all of the callback functions into a vector? This also just feels messy. Feel free to mention other things that are not ideal.
#include <iostream>
#include <functional>
#include <vector>
class MyObj{
public:
// where I store stuff
std::vector<double> myResults;
// function that is called intermittently
void runFuncs(const std::vector<std::function<double()> >& fs){
if ( myResults.size() == 0){
for( auto& f : fs){
myResults.push_back(f());
}
}else{
int i (0);
for( auto& f : fs){
myResults[i] = f();
i++;
}
}
}
};
int main(int argc, char **argv)
{
auto lambda1 = [](){ return 1.0;};
auto lambda2 = [](){ return 2.0;};
MyObj myThing;
std::vector<std::function<double()> > funcs;
funcs.push_back(lambda1);
funcs.push_back(lambda2);
myThing.runFuncs(funcs);
std::cout << myThing.myResults[0] << "\n";
std::cout << myThing.myResults[1] << "\n";
std::vector<std::function<double()> > funcs2;
funcs2.push_back(lambda2);
funcs2.push_back(lambda1);
myThing.runFuncs(funcs2);
std::cout << myThing.myResults[0] << "\n";
std::cout << myThing.myResults[1] << "\n";
return 0;
}
Something like this, perhaps:
template <typename... Fs>
void runFuncs(Fs... fs) {
myResults = std::vector<double>({fs()...});
}
Then you can call it as
myThing.runFuncs(lambda1, lambda2);
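For completeness, here is how the variadic member might look inside the class. This is a sketch under the assumption that every callback returns something convertible to double; run_demo is a hypothetical helper that mirrors the question's main().

```cpp
#include <cassert>
#include <vector>

class MyObj {
public:
    // where the results are stored
    std::vector<double> myResults;

    // Accepts any number of callables; each must return something
    // convertible to double. Results replace the previous contents.
    template <typename... Fs>
    void runFuncs(Fs&&... fs) {
        // Braced initialization evaluates the calls left to right.
        myResults = {static_cast<double>(fs())...};
    }
};

// Usage sketch mirroring the question's main().
bool run_demo() {
    auto lambda1 = [] { return 1.0; };
    auto lambda2 = [] { return 2.0; };
    MyObj myThing;
    myThing.runFuncs(lambda1, lambda2);
    assert(myThing.myResults.size() == 2);
    myThing.runFuncs(lambda2, lambda1);   // overwrites the earlier results
    return myThing.myResults[0] == 2.0 && myThing.myResults[1] == 1.0;
}
```

Because the vector is reassigned on every call, the separate "first call" and "later call" branches from the question are no longer needed.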

Cannot Return Values When Passing Function by Reference To TBB Task

I'm getting my feet wet with Intel TBB and am trying to figure out why I cannot populate a vector passed in by reference to a TBB Task when I also pass in a function by reference.
Here is the code:
// tbbTesting.cpp : Defines the entry point for the console application.
#include "stdafx.h"
#include "tbb/task.h"
#include <functional>
#include <iostream>
#include <random>
#define NUM_POINTS 10
void myFunc(std::vector<double>& numbers)
{
std::mt19937_64 gen;
std::uniform_real_distribution<double> dis(0.0, 1000.0);
for (size_t i = 0; i < NUM_POINTS; i++)
{
auto val = dis(gen);
std::cout << val << std::endl; //proper values generated
numbers.push_back(val); //why is this failing?
}
std::cout << std::endl;
for (auto i : numbers)
{
std::cout << numbers[i] << std::endl; //garbage values
}
}
class TASK_generateRandomNumbers : public tbb::task
{
public:
TASK_generateRandomNumbers(std::function<void(std::vector<double>&)>& fnc,
std::vector<double>& nums) : _fnc(fnc), _numbers(nums) {}
~TASK_generateRandomNumbers() {};
tbb::task* execute()
{
_fnc(_numbers);
return nullptr;
}
private:
std::function<void(std::vector<double>&)>& _fnc;
std::vector<double>& _numbers;
};
class Manager
{
public:
Manager() { _numbers.reserve(NUM_POINTS); }
~Manager() {}
void GenerateNumbers()
{
_fnc = std::bind(&myFunc, _numbers);
TASK_generateRandomNumbers* t = new(tbb::task::allocate_root())
TASK_generateRandomNumbers(_fnc, _numbers);
tbb::task::spawn_root_and_wait(*t);
}
auto GetNumbers() const { return _numbers; }
private:
std::function<void(std::vector<double>&)> _fnc;
std::vector<double> _numbers;
};
int main()
{
Manager mgr;
mgr.GenerateNumbers();
auto numbers = mgr.GetNumbers(); //returns empty
}
When the execute method performs the operation, I can get values when passing the vector by reference.
When the execute method has to call a function, I get garbage data printed to the console (push_back failing?) and I get an empty container on return.
Can anyone see what I'm missing? Thanks.
I have found a couple of bugs that have nothing to do with tbb.
1) Your myFunc uses the range-based for loop incorrectly. The loop variable is not an index; it is each value in the vector in turn. Your code converts each double to an integer and uses that as an index into the vector, which is why you are getting garbage.
2) When you use std::bind to create a functor, the arguments are copied by value. If you want to pass a reference, you need to wrap the argument in std::ref.
If you are using C++11, you might want to consider using a lambda rather than bind.
I've written a small program using your myFunc in different ways: with and without std::ref, and also with a lambda. You should see that it generates the same numbers three times, but when it prints v1 there is nothing in it, because the generated values were placed in a copy.
#include <vector>
#include <random>
#include <iostream>
#include <functional>
constexpr size_t NUM_POINTS = 10;
void myFunc(std::vector<double>& numbers)
{
std::mt19937_64 gen;
std::uniform_real_distribution<double> dis(0.0, 1000.0);
for (size_t i = 0; i < NUM_POINTS; i++)
{
auto val = dis(gen);
std::cout << val << std::endl; //proper values generated
numbers.push_back(val); //why is this failing? it's not
}
std::cout << std::endl;
}
void printNumbers(std::vector<double>const& numbers)
{
for (auto number : numbers)
{
std::cout << number << std::endl;
}
std::cout << std::endl;
}
int main()
{
std::cout << "generating v1" << std::endl;
std::vector<double> v1;
auto f1 = std::bind(&myFunc, v1);
f1();
printNumbers(v1);
std::cout << "generating v2" << std::endl;
std::vector<double> v2;
auto f2= std::bind(&myFunc, std::ref(v2));
f2();
printNumbers(v2);
std::cout << "generating v3" << std::endl;
std::vector<double> v3;
auto f3 = [&v3]() { myFunc(v3); }; //using a lambda
f3();
printNumbers(v3);
return 0;
}
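The first bug (treating the range-based for variable as an index) can be illustrated in isolation. The function names below are hypothetical; the sketch simply contrasts the two correct loop forms.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Range-based for: `value` is each element itself, not its position.
double sum_by_value(const std::vector<double>& numbers) {
    double total = 0.0;
    for (double value : numbers)
        total += value;          // use the element directly
    return total;
}

// Index-based loop: `i` is a position, so numbers[i] is the right access.
double sum_by_index(const std::vector<double>& numbers) {
    double total = 0.0;
    for (std::size_t i = 0; i < numbers.size(); ++i)
        total += numbers[i];
    return total;
}

// Both loops visit the same elements in the same order, so the sums match.
void check_loops_agree(const std::vector<double>& numbers) {
    assert(sum_by_value(numbers) == sum_by_index(numbers));
}
```

Writing numbers[value] inside the first loop, as the question did, mixes the two forms: a value such as 535.7 would be truncated to 535 and used as an out-of-range index.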