I know the title is a bit broad so let me elaborate.
I have two processes running: one writes into the shared memory, the other reads from it.
To achieve shared memory effect I am using boost::interprocess (btw let me know if there are more convenient libraries).
So I implemented the following:
//Writer
#include <boost/interprocess/windows_shared_memory.hpp>
#include <boost/interprocess/mapped_region.hpp>
#include <cstring>    // std::memset
#include <functional> // std::function
#include <iostream>
#include <memory>     // std::shared_ptr, std::make_shared
namespace ip = boost::interprocess;
class SharedMemory
{
public:
template<typename OpenOrCreate>
SharedMemory(OpenOrCreate criteria, const char* name, ip::mode_t mode, size_t size) :
name_(name),
sm_(std::make_shared<ip::windows_shared_memory>(criteria, name, mode, size))
{
}
template<typename OpenOrCreate>
SharedMemory(OpenOrCreate criteria, const char* name, ip::mode_t mode) :
name_(name),
sm_(std::make_shared<ip::windows_shared_memory>(criteria, name, mode))
{
}
std::shared_ptr<ip::windows_shared_memory> getSM()
{
return sm_;
}
private:
std::function<void()> destroyer_;
std::string name_;
std::shared_ptr<ip::windows_shared_memory> sm_;
};
int main()
{
SharedMemory creator(ip::create_only, "SharedMemory", ip::read_write, 10);
ip::mapped_region region(*creator.getSM(), ip::read_write);
std::memset(region.get_address(), 1, region.get_size());
int status = system("reader.exe");
std::cout << status << std::endl;
}
So I am creating shared memory, writing 1s to it, then calling the reader exe. (I skip the reader part as it's pretty much the same, except that it reads instead of writes.)
This code works fine, I write into memory and the other process reads it and prints my 1's.
But what if I have these two exes running at the same time and I want to write into memory, then notify the other process that there is an update? How do I signal from one exe/process to another?
The scenario is that I am streaming some live data, writing into memory and then telling the other process that there is an update.
I think there are more convenient approaches indeed.
In principle to synchronize between processes you use all the same approaches as synchronizing inside a process (between threads): using synchronization primitives (mutex/critical section, condition variable, semaphores, barriers etc.).
In addition, you need to have a data structure that you synchronize. This is precisely the Achilles' heel at the moment. There is a total absence of data structure here.
Though you can do raw byte access with your own logic, I don't see the appeal of using a high-level library in doing so. Instead I'd use a managed memory segment, that lets you find or construct typed objects by name. This may include your synchronization primitives.
In fact, you can expedite the process by using a message_queue which has all the synchronization already built-in.
Manual Sync: Writer using Segment Manager
I'll provide portable code because I do not have a Windows machine. First, let's think of a data structure. A simple example would be a queue of messages. Let's use a deque<string>.
Not exactly trivial data structures, but the great news is that Boost Interprocess comes with all the nuts and bolts to make things work (using interprocess allocators).
namespace Shared {
using Segment = ip::managed_shared_memory;
using Mgr = Segment::segment_manager;
template <typename T>
using Alloc = bc::scoped_allocator_adaptor<ip::allocator<T, Mgr>>;
template <typename T> using Deque = bc::deque<T, Alloc<T>>;
using String = bc::basic_string<char, std::char_traits<char>, Alloc<char>>;
using DataStructure = Deque<String>;
class Memory {
public:
Memory(const char* name, size_t size)
: name_(name)
, sm_(ip::open_or_create, name, size)
, data_(*sm_.find_or_construct<DataStructure>("data")(
sm_.get_segment_manager()))
{
}
DataStructure& get() { return data_; }
DataStructure const& get() const { return data_; }
private:
std::string name_;
Segment sm_;
DataStructure& data_;
};
} // namespace Shared
There, now we can have the writer be something like:
int main()
{
Shared::Memory creator("SharedMemory", 10*1024*1024);
creator.get().emplace_back("Hello");
creator.get().emplace_back("World");
std::cout << "Total queued: " << creator.get().size() << "\n";
}
Which will print e.g.
Total queued: 2
Total queued: 4
Total queued: 6
Depending on the number of times you ran it.
The Reader side
Now let's do the reader side. In fact it's so much the same, let's put it in the same main program:
int main(int argc, char**)
{
Shared::Memory mem("SharedMemory", 10*1024*1024);
auto& data = mem.get();
bool is_reader = argc > 1;
if (not is_reader) {
data.emplace_back("Hello");
data.emplace_back("World");
std::cout << "Total queued: " << data.size() << "\n";
} else {
std::cout << "Found entries: " << data.size() << "\n";
while (!data.empty()) {
std::cout << "Dequeued " << data.front() << "\n";
data.pop_front();
}
}
}
Simple for a start. Now running e.g. test.exe READER will conversely print something like:
Found entries: 2
Dequeued Hello
Dequeued World
Locking & Synchronization
The goal is to run writer and reader concurrently. That's not safe as it is now, because of a lack of locking and synchronization. Let's add it:
class Memory {
static constexpr size_t max_capacity = 100;
public:
Memory(const char* name, size_t size)
: name_(name)
, sm_(ip::open_or_create, name, size)
, mx_(*sm_.find_or_construct<Mutex>("mutex")())
, cv_(*sm_.find_or_construct<Cond>("condition")())
, data_(*sm_.find_or_construct<DataStructure>("data")(
sm_.get_segment_manager()))
{ }
// ...
private:
std::string name_;
Segment sm_;
Mutex& mx_;
Cond& cv_;
DataStructure& data_;
};
Now let's be careful. Because we want all operations on the data_ queue to be synchronized, we shall not expose it as we did before (with the get() member function). Instead we expose the exact interface of operations we support:
size_t queue_length() const;
void enqueue(std::string message); // blocking when queue at max_capacity
std::string dequeue(); // blocking dequeue
std::optional<std::string> try_dequeue(); // non-blocking dequeue
These all do the locking as required, simply as you'd expect:
size_t queue_length() const {
ip::scoped_lock<Mutex> lk(mx_);
return data_.size();
}
It gets more interesting on the potentially blocking operations. I chose to have a maximum capacity, so enqueue needs to wait for capacity:
// blocking when queue at max_capacity
void enqueue(std::string message) {
ip::scoped_lock<Mutex> lk(mx_);
cv_.wait(lk, [this] { return data_.size() < max_capacity; });
data_.emplace_back(std::move(message));
cv_.notify_one();
}
Conversely, dequeue needs to wait for a message to become available:
// blocking dequeue
std::string dequeue() {
ip::scoped_lock<Mutex> lk(mx_);
cv_.wait(lk, [this] { return not data_.empty(); });
return do_pop();
}
Alternatively, you could make it non-blocking, just optionally returning a value:
// non-blocking dequeue
std::optional<std::string> try_dequeue() {
ip::scoped_lock<Mutex> lk(mx_);
if (data_.empty())
return std::nullopt;
return do_pop();
}
Now in main let's have three versions: writer, reader and continuous reader (where the latter demonstrates the blocking interface):
#include <boost/interprocess/allocators/allocator.hpp>
#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/sync/interprocess_condition.hpp>
#include <boost/interprocess/sync/interprocess_mutex.hpp>
#include <boost/interprocess/sync/scoped_lock.hpp>
#include <boost/container/scoped_allocator.hpp>
#include <boost/interprocess/containers/deque.hpp>
#include <boost/interprocess/containers/string.hpp>
#include <iostream>
#include <iomanip>
#include <optional>
#include <string>
namespace ip = boost::interprocess;
namespace bc = boost::container;
namespace Shared {
using Segment = ip::managed_shared_memory;
using Mgr = Segment::segment_manager;
template <typename T>
using Alloc = bc::scoped_allocator_adaptor<ip::allocator<T, Mgr>>;
template <typename T> using Deque = ip::deque<T, Alloc<T>>;
using String = ip::basic_string<char, std::char_traits<char>, Alloc<char>>;
using DataStructure = Deque<String>;
using Mutex = ip::interprocess_mutex;
using Cond = ip::interprocess_condition;
class Memory {
static constexpr size_t max_capacity = 100;
public:
Memory(const char* name, size_t size)
: name_(name)
, sm_(ip::open_or_create, name, size)
, mx_(*sm_.find_or_construct<Mutex>("mutex")())
, cv_(*sm_.find_or_construct<Cond>("condition")())
, data_(*sm_.find_or_construct<DataStructure>("data")(
sm_.get_segment_manager()))
{ }
size_t queue_length() const {
ip::scoped_lock<Mutex> lk(mx_);
return data_.size(); // caution: racy by design!
}
// blocking when queue at max_capacity
void enqueue(std::string message) {
ip::scoped_lock<Mutex> lk(mx_);
cv_.wait(lk, [this] { return data_.size() < max_capacity; });
data_.emplace_back(std::move(message));
cv_.notify_one();
}
// blocking dequeue
std::string dequeue() {
ip::scoped_lock<Mutex> lk(mx_);
cv_.wait(lk, [this] { return not data_.empty(); });
return do_pop();
}
// non-blocking dequeue
std::optional<std::string> try_dequeue() {
ip::scoped_lock<Mutex> lk(mx_);
if (data_.empty())
return std::nullopt;
return do_pop();
}
private:
std::string name_;
Segment sm_;
Mutex& mx_;
Cond& cv_;
DataStructure& data_;
// Assumes mx_ locked by current thread!
std::string do_pop() {
auto&& tmp = std::move(data_.front());
data_.pop_front();
cv_.notify_all(); // any of the waiters might be a/the writer
return std::string(tmp.begin(), tmp.end());
}
};
} // namespace Shared
int main(int argc, char**)
{
Shared::Memory mem("SharedMemory", 10*1024*1024);
switch (argc) {
case 1:
mem.enqueue("Hello");
mem.enqueue("World");
std::cout << "Total queued: " << mem.queue_length() << "\n";
break;
case 2:
std::cout << "Found entries: " << mem.queue_length() << "\n";
while (auto msg = mem.try_dequeue()) {
std::cout << "Dequeued " << *msg << "\n";
}
break;
case 3:
std::cout << "Continuous reader\n";
while (true) {
std::cout << "Dequeued " << mem.dequeue() << "\n";
}
break;
}
}
Summary, Caution
Note there are some loose ends with the above. Notably, Boost Interprocess has no robust locks, so proper shutdown (never exiting while holding the lock) needs some extra care.
I'd suggest to contrast with ip::message_queue as well:
How to put file in boost::interprocess::managed_shared_memory? (contrasts shared memory, message_queue and pure TCP sockets)
For the error C2664 that MSVC raises on the enqueue call above (the shared String type has no constructor taking a std::string), it can be solved by changing
data_.emplace_back(std::move(message));
to:
data_.emplace_back(message.c_str());
so the shared string is constructed from a const char*. Hope this helps anyone.
Related
I come from the Python world, and as a weekend project I decided to write a simple UDP server in C++. I have a question regarding the correct way of discovering the type of an incoming request. My approach is to have a class for every possible type of request. Upon packet arrival I have to unpack its OPID (operation id) and instantiate the correct class. To do that I have to bind OPIDs to the classes, and the only way I'm familiar with of doing this in C++ involves a huge switch:case block. Doing this doesn't really feel right to me; also, if I understand Uncle Bob correctly, it goes against a few OOP practices. As code describes one's intentions best, here's a Python equivalent of what I'm trying to do in C++.
class BaseOperation:
    OPID = 0

    def process(self, packet_data):
        raise NotImplementedError("blah blah")

class FooOperation(BaseOperation):
    OPID = 1

    def process(self, packet_data):
        print("Foo on the packet!")

class BarOperation(BaseOperation):
    OPID = 2

    def process(self, packet_data):
        print("Bar on the packet!")

opid_mappings = {
    FooOperation.OPID: FooOperation,
    BarOperation.OPID: BarOperation
}
Somewhere in code handling the incoming packet
def handle_connection(packet):
    try:
        operation = opid_mappings[get_opid(packet)]()
    except KeyError:
        print("Unknown OPID")
        return
    operation.process(get_data(packet))
A really quick hack of an object-based solution. This might not be the right way to go in our wonderful new C++11 world of std::function.
If the children of BaseOperation need to store state, go objects!
#include <iostream>
#include <map>
class BaseOperation
{
protected:
int OPID;
public:
virtual ~BaseOperation()
{
}
virtual int operator()() = 0;
};
class FooOperation:public BaseOperation
{
public:
static constexpr int OPID = 1;
FooOperation()
{
}
int operator()()
{
// do parsing
return OPID; // just for convenience so we can tell who was called
}
};
constexpr int FooOperation::OPID; // allocate storage for static
class BarOperation:public BaseOperation
{
public:
static constexpr int OPID = 2;
BarOperation()
{
}
int operator()()
{
// do parsing
return OPID; // just for convenience so we can tell who was called
}
};
constexpr int BarOperation::OPID; // allocate storage for static
std::map<int, BaseOperation*> opid_mappings{
{FooOperation::OPID, new FooOperation()},
{BarOperation::OPID, new BarOperation()}
};
int main()
{
std::cout << "calling OPID 1:" << (*opid_mappings[1])() << std::endl;
std::cout << "calling OPID 2:" << (*opid_mappings[2])() << std::endl;
for (std::pair<int, BaseOperation*> todel: opid_mappings)
{
delete todel.second;
}
return 0;
}
This also ignores the fact that there is probably no need for the map. If the OPIDs are sequential, a good ol' dumb array solves the problem. I like the map because it won't screw up if someone moves a parser handler or inserts one into the middle of the list.
Regardless, this has a bunch of memory management problems, such as the need for the for loop deleting the parser objects at the bottom of main. This could be solved with std::unique_ptr, but this is probably a rabbit hole we don't need to go down.
Odds are really good that the parser doesn't have any state and we can just use a map of OPIDs and std::function.
#include <iostream>
#include <map>
#include <functional>
static constexpr int FooOPID = 1;
int fooOperation()
{
// do parsing
return FooOPID;
}
static constexpr int BarOPID = 2;
int BarOperation()
{
// do parsing
return BarOPID;
}
std::map<int, std::function<int()>> opid_mappings {
{FooOPID, fooOperation},
{BarOPID, BarOperation}
};
int main()
{
std::cout << "calling OPID 1:" << opid_mappings[1]() << std::endl;
std::cout << "calling OPID 2:" << opid_mappings[2]() << std::endl;
return 0;
}
And because the parsers are kind of useless if you aren't passing anything in, one last tweak:
#include <iostream>
#include <map>
#include <functional>
struct Packet
{
//whatever you need here. Probably a buffer reference and a length
};
static constexpr int FooOPID = 1;
int fooOperation(Packet & packet)
{
// do parsing
return FooOPID;
}
static constexpr int BarOPID = 2;
int BarOperation(Packet & packet)
{
// do parsing
return BarOPID;
}
std::map<int, std::function<int(Packet &)>> opid_mappings {
{FooOPID, fooOperation},
{BarOPID, BarOperation}
};
int main()
{
Packet packet;
std::cout << "calling OPID 1:" << opid_mappings[1](packet) << std::endl;
std::cout << "calling OPID 2:" << opid_mappings[2](packet) << std::endl;
return 0;
}
I have a XML file with a sequence of nodes. Each node represents an element that I need to parse and add in a sorted list (the order must be the same of the nodes found in the file).
At the moment I am using a sequential solution:
struct Graphic
{
bool parse()
{
// parsing...
return parse_outcome;
}
};
vector<unique_ptr<Graphic>> graphics;
void producer()
{
for (size_t i = 0; i < N_GRAPHICS; i++)
{
auto g = new Graphic();
if (g->parse())
graphics.emplace_back(g);
else
delete g;
}
}
So, only if the graphic (which is actually an instance of a class derived from Graphic, a Line, a Rectangle and so on, which is why the new) can be properly parsed will it be added to my data structure.
Since I only care about the order in which these graphics are added to my list, I thought to call the parse method asynchronously, such that the producer has the task of reading each node from the file and adding the graphic to the data structure, while the consumers have the task of parsing each graphic whenever a new one is ready to be parsed.
Now I have several consumer threads (created in the main) and my code looks like the following:
queue<pair<Graphic*, size_t>> q;
mutex m;
atomic<size_t> n_elements;
void producer()
{
for (size_t i = 0; i < N_GRAPHICS; i++)
{
auto g = new Graphic();
graphics.emplace_back(g);
q.emplace(make_pair(g, i));
}
n_elements = graphics.size();
}
void consumer()
{
pair<Graphic*, size_t> item;
while (true)
{
{
std::unique_lock<std::mutex> lk(m);
if (n_elements == 0)
return;
n_elements--;
item = q.front();
q.pop();
}
if (!item.first->parse())
{
// here I should remove the item from the vector
assert(graphics[item.second].get() == item.first);
delete item.first;
graphics[item.second] = nullptr;
}
}
}
I run the producer first of all in my main, so that when the first consumer starts the queue is already completely full.
int main()
{
producer();
vector<thread> threads;
for (auto i = 0; i < N_THREADS; i++)
threads.emplace_back(consumer);
for (auto& t : threads)
t.join();
return 0;
}
The concurrent version seems to be at least twice as fast as the original one.
The full code has been uploaded here.
Now I am wondering:
Are there any (synchronization) errors in my code?
Is there a way to achieve the same result faster (or better)?
Also, I noticed that on my computer I get the best result (in terms of elapsed time) if I set the number of threads equal to 8. More (or fewer) threads give me worse results. Why?
Are there any (synchronization) errors in my code?
There aren't any synchronization errors, but the memory management could be better, since your code will leak if parse() throws an exception.
Is there a way to achieve the same result faster (or better)?
Probably. You could use a simple implementation of a thread pool and a lambda that does the parse() for you.
The code below illustrates this approach. I use the thread pool implementation from here.
#include <iostream>
#include <stdexcept>
#include <vector>
#include <memory>
#include <chrono>
#include <utility>
#include <cassert>
#include <ThreadPool.h>
using namespace std;
using namespace std::chrono;
#define N_GRAPHICS (1000*1000*1)
#define N_THREADS 8
struct Graphic;
using GPtr = std::unique_ptr<Graphic>;
static vector<GPtr> graphics;
struct Graphic
{
Graphic()
: status(false)
{
}
bool parse()
{
// waste time
try
{
throw runtime_error("");
}
catch (const runtime_error&)
{
}
status = true;
//return false;
return true;
}
bool status;
};
int main()
{
auto start = system_clock::now();
auto producer_unit = []()-> GPtr {
std::unique_ptr<Graphic> g(new Graphic);
if(!g->parse()){
g.reset(); // if g doesn't parse, return nullptr
}
return g;
};
using ResultPool = std::vector<std::future<GPtr>>;
ResultPool results;
// ThreadPool pool(thread::hardware_concurrency());
ThreadPool pool(N_THREADS);
for(int i = 0; i <N_GRAPHICS; ++i){
// Running async task
results.emplace_back(pool.enqueue(producer_unit));
}
for(auto &t : results){
auto value = t.get();
if(value){
graphics.emplace_back(std::move(value));
}
}
auto duration = duration_cast<milliseconds>(system_clock::now() - start);
cout << "Elapsed: " << duration.count() << endl;
for (size_t i = 0; i < graphics.size(); i++)
{
if (!graphics[i]->status)
{
cerr << "Assertion failed! (" << i << ")" << endl;
break;
}
}
cin.get();
return 0;
}
It is a bit faster (by about 1 s) on my machine, more readable, and removes the need for shared data (synchronization is evil; avoid it, or hide it in a reliable and efficient way).
This project asked for 4 threads driven by a command file with instructions such as SEND, RECEIVE and QUIT. When the file says "2 send", the thread in the second place in the array should wake up and receive its message. How do I make a thread read its message when the command file has a message for it?
The biggest issue I see with your design is the fact that each thread reads its line randomly, independent of any other thread. After that it would have to check whether the current line is actually meant for it, i.e. starting with the appropriate number. What happens if it isn't? Too complicated.
I would split this up into one reader thread and a set of worker threads. The former reads lines from a file and dispatches them to the workers by pushing each one into the appropriate worker's queue, all synchronized with a per-worker mutex and condition variable. The following is implemented in C++11 but should be just as doable in pthread_* style.
#include <thread>
#include <iostream>
#include <queue>
#include <mutex>
#include <fstream>
#include <list>
#include <sstream>
#include <condition_variable>
class worker {
public:
void operator()(int n) {
    while(true) {
        std::unique_lock<std::mutex> l(_m);
        // Wait on a predicate: this handles spurious wakeups and
        // notifications that arrive before we start waiting.
        _c.wait(l, [this] { return !_q.empty(); });
        {
            std::unique_lock<std::mutex> lio(_mm);
            // front(), not back(): print in FIFO order, matching pop()
            std::cerr << "#" << n << " " << _q.front() << std::endl;
        }
        _q.pop();
    }
}
private:
std::mutex _m;
std::condition_variable _c;
std::queue<std::string> _q;
// Only needed to synchronize I/O
static std::mutex _mm;
// Reader may write into our queue
friend class reader;
};
std::mutex worker::_mm;
class reader {
public:
reader(worker & w0,worker & w1,worker & w2,worker & w3) {
_v.push_back(&w0);
_v.push_back(&w1);
_v.push_back(&w2);
_v.push_back(&w3);
}
void operator()() {
std::ifstream fi("commands.txt");
std::string s;
while(std::getline(fi,s)) {
std::stringstream ss(s);
int n;
if((ss >> n >> std::ws) && n >= 0 && static_cast<std::size_t>(n) < _v.size()) {
std::string s0;
if(std::getline(ss,s0)) {
std::unique_lock<std::mutex> l(_v[n]->_m);
_v[n]->_q.push(s0);
_v[n]->_c.notify_one();
}
}
}
std::cerr << "done" << std::endl;
}
private:
std::vector<worker *> _v;
};
int main(int c,char **argv) {
worker w0;
worker w1;
worker w2;
worker w3;
std::thread tw0([&w0]() { w0(0); });
std::thread tw1([&w1]() { w1(1); });
std::thread tw2([&w2]() { w2(2); });
std::thread tw3([&w3]() { w3(3); });
reader r(w0,w1,w2,w3);
std::thread tr([&r]() { r(); });
tr.join();
tw0.join();
tw1.join();
tw2.join();
tw3.join();
}
The example code only reads from "commands.txt" until EOF. I assume you'd like to read continuously like the "tail -f" command. That's however not doable with std::istream.
The code of course is clumsy but I guess it gives you an idea. One should for example add a blocking mechanism if the workers are way too slow processing their stuff and the queues may eat up all the precious RAM.
I can't get code working reliably in a simple VS2012 console application consisting of a producer and consumer that uses a C++11 condition variable. I am aiming at producing a small reliable program (to use as the basis for a more complex program) that uses the 3 argument wait_for method or perhaps the wait_until method from code I have gathered at these websites:
condition_variable:
wait_for,
wait_until
I'd like to use the 3 argument wait_for with a predicate like below except it will need to use a class member variable to be most useful to me later. I am receiving "Access violation writing location 0x__" or "An invalid parameter was passed to a service or function" as errors after only about a minute of running.
Would steady_clock and the 2 argument wait_until be sufficient to replace the 3 argument wait_for? I've also tried this without success.
Can someone show how to get the code below to run indefinitely with no bugs or weird behavior with either changes in wall-clock time from daylight savings time or Internet time synchronizations?
A link to reliable sample code could be just as helpful.
// ConditionVariable.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <condition_variable>
#include <mutex>
#include <thread>
#include <iostream>
#include <queue>
#include <chrono>
#include <atomic>
#define TEST1
std::atomic<int>
//int
qcount = 0; //= ATOMIC_VAR_INIT(0);
int _tmain(int argc, _TCHAR* argv[])
{
std::queue<int> produced_nums;
std::mutex m;
std::condition_variable cond_var;
bool notified = false;
unsigned int count = 0;
std::thread producer([&]() {
int i = 0;
while (1) {
std::this_thread::sleep_for(std::chrono::microseconds(1500));
std::unique_lock<std::mutex> lock(m);
produced_nums.push(i);
notified = true;
qcount = produced_nums.size();
cond_var.notify_one();
i++;
}
cond_var.notify_one();
});
std::thread consumer([&]() {
std::unique_lock<std::mutex> lock(m);
while (1) {
#ifdef TEST1
// Version 1
if (cond_var.wait_for(
lock,
std::chrono::microseconds(1000),
[&]()->bool { return qcount != 0; }))
{
if ((count++ % 1000) == 0)
std::cout << "consuming " << produced_nums.front () << '\n';
produced_nums.pop();
qcount = produced_nums.size();
notified = false;
}
#else
// Version 2
std::chrono::steady_clock::time_point timeout1 =
std::chrono::steady_clock::now() +
//std::chrono::system_clock::now() +
std::chrono::milliseconds(1);
while (qcount == 0)//(!notified)
{
if (cond_var.wait_until(lock, timeout1) == std::cv_status::timeout)
break;
}
if (qcount > 0)
{
if ((count++ % 1000) == 0)
std::cout << "consuming " << produced_nums.front() << '\n';
produced_nums.pop();
qcount = produced_nums.size();
notified = false;
}
#endif
}
});
while (1);
return 0;
}
Visual Studio Desktop Express had 1 important update which it installed and Windows Update has no other important updates. I'm using Windows 7 32-bit.
Sadly, this is actually a bug in VS2012's implementation of condition_variable, and the fix will not be patched in. You'll have to upgrade to VS2013 when it's released.
See:
http://connect.microsoft.com/VisualStudio/feedback/details/762560
First of all, while using condition_variables I personally prefer some wrapper classes like AutoResetEvent from C#:
struct AutoResetEvent
{
typedef std::unique_lock<std::mutex> Lock;
AutoResetEvent(bool state = false) :
state(state)
{ }
void Set()
{
auto lock = AcquireLock();
state = true;
variable.notify_one();
}
void Reset()
{
auto lock = AcquireLock();
state = false;
}
void Wait(Lock& lock)
{
variable.wait(lock, [this] () { return this->state; });
state = false;
}
void Wait()
{
auto lock = AcquireLock();
Wait(lock);
}
Lock AcquireLock()
{
return Lock(mutex);
}
private:
bool state;
std::condition_variable variable;
std::mutex mutex;
};
This may not have exactly the same behavior as the C# type, and may not be as efficient as it could be, but it gets things done for me.
Second, when I need to implement a producer/consumer idiom I try to use a concurrent queue implementation (e.g. a TBB queue) or write one myself. You should also consider doing things properly with the Active Object pattern, but for a simple solution we can use this:
template<typename T>
struct ProductionQueue
{
ProductionQueue()
{ }
void Enqueue(const T& value)
{
{
auto lock = event.AcquireLock();
q.push(value);
}
event.Set();
}
std::size_t GetCount()
{
auto lock = event.AcquireLock();
return q.size();
}
T Dequeue()
{
auto lock = event.AcquireLock();
event.Wait(lock);
T value = q.front();
q.pop();
return value;
}
private:
AutoResetEvent event;
std::queue<T> q;
};
This class has some exception-safety issues and lacks const-ness on its methods, but, like I said, for a simple solution it should fit.
So as a result your modified code looks like this:
int main(int argc, char* argv[])
{
ProductionQueue<int> produced_nums;
unsigned int count = 0;
std::thread producer([&]() {
int i = 0;
while (1) {
std::this_thread::sleep_for(std::chrono::microseconds(1500));
produced_nums.Enqueue(i);
qcount = produced_nums.GetCount();
i++;
}
});
std::thread consumer([&]() {
while (1) {
int item = produced_nums.Dequeue();
{
if ((count++ % 1000) == 0)
std::cout << "consuming " << item << '\n';
qcount = produced_nums.GetCount();
}
}
});
producer.join();
consumer.join();
return 0;
}
I've implemented a thread pool using boost::asio, with a number of boost::thread objects calling boost::asio::io_service::run(). However, a requirement I've been given is to have a way to monitor all threads for "health". My intent is to make a simple sentinel object that can be passed through the thread pool; if it makes it through, then we can assume that the thread is still processing work.
However, given my implementation, I'm not sure how (if) I can monitor all the threads in the pool reliably. I've simply delegated the thread function to boost::asio::io_service::run(), so posting a sentinel object into the io_service instance won't guarantee which thread will actually get that sentinel and do the work.
One option may be to just periodically insert the sentinel, and hope that it gets picked up by each thread at least once in some reasonable amount of time, but that obviously isn't ideal.
Take the following example. Due to the way the handler is coded, in this instance each thread will do the same amount of work, but in reality I will not have control of the handler implementation; some handlers can be long-running while others will be almost immediate.
#include <iostream>
#include <boost/asio.hpp>
#include <vector>
#include <boost/thread.hpp>
#include <boost/bind.hpp>
void handler()
{
std::cout << boost::this_thread::get_id() << "\n";
boost::this_thread::sleep(boost::posix_time::milliseconds(100));
}
int main(int argc, char **argv)
{
boost::asio::io_service svc(3);
std::unique_ptr<boost::asio::io_service::work> work(new boost::asio::io_service::work(svc));
boost::thread one(boost::bind(&boost::asio::io_service::run, &svc));
boost::thread two(boost::bind(&boost::asio::io_service::run, &svc));
boost::thread three(boost::bind(&boost::asio::io_service::run, &svc));
svc.post(handler);
svc.post(handler);
svc.post(handler);
svc.post(handler);
svc.post(handler);
svc.post(handler);
svc.post(handler);
svc.post(handler);
svc.post(handler);
svc.post(handler);
work.reset();
three.join();
two.join();
one.join();
return 0;
}
You can use a common io_service instance between all the threads and a private io_service instance for every thread. Every thread will execute a method like this:
void Mythread::threadLoop()
{
while(/* termination condition */)
{
commonIoService.run_one();
privateIoService.run_one();
commonConditionVariable.timed_wait(time);
}
}
This way, if you want to ensure that some task is executed in a given thread, you only have to post the task to its own io_service.
To post a task in your thread pool you can do:
void MyThreadPool::post(Hander handler)
{
commonIoService.post(handler);
commonConditionVariable.notify_all();
}
The solution I used relies on the fact that I own the implementation of the thread pool objects. I created a wrapper type that updates statistics and copies the user-defined handlers that are posted to the thread pool. Only this wrapper type is ever posted to the underlying io_service. This method lets me keep track of the handlers that are posted/executed without having to be intrusive into the user code.
Here's a stripped down and simplified example:
#include <algorithm>
#include <functional>
#include <iostream>
#include <map>
#include <memory>
#include <vector>
#include <boost/thread.hpp>
#include <boost/asio.hpp>
// Supports scheduling anonymous jobs that are
// executable as returning nothing and taking
// no arguments
typedef std::function<void(void)> functor_type;
// some way to store per-thread statistics
typedef std::map<boost::thread::id, int> thread_jobcount_map;
// only this type is actually posted to
// the asio proactor, this delegates to
// the user functor in operator()
struct handler_wrapper
{
handler_wrapper(const functor_type& user_functor, thread_jobcount_map& statistics)
: user_functor_(user_functor)
, statistics_(statistics)
{
}
void operator()()
{
user_functor_();
// just for illustration purposes, assume a long running job
boost::this_thread::sleep(boost::posix_time::milliseconds(100));
// increment executed jobs
++statistics_[boost::this_thread::get_id()];
}
functor_type user_functor_;
thread_jobcount_map& statistics_;
};
// anonymous thread function, just runs the proactor
void thread_func(boost::asio::io_service& proactor)
{
proactor.run();
}
class ThreadPool
{
public:
ThreadPool(size_t thread_count)
{
threads_.reserve(thread_count);
work_.reset(new boost::asio::io_service::work(proactor_));
for(size_t curr = 0; curr < thread_count; ++curr)
{
boost::thread th(thread_func, boost::ref(proactor_));
// inserting into this map before any work can be scheduled
// on it means that we don't have to lock it for lookups,
// since we don't dynamically add threads
thread_jobcount_.insert(std::make_pair(th.get_id(), 0));
threads_.emplace_back(std::move(th));
}
}
// the only way for a user to get work into
// the pool is to use this function, which ensures
// that the handler_wrapper type is used
void schedule(const functor_type& user_functor)
{
handler_wrapper to_execute(user_functor, thread_jobcount_);
proactor_.post(to_execute);
}
void join()
{
// join all threads in pool:
work_.reset();
proactor_.stop();
std::for_each(
threads_.begin(),
threads_.end(),
[] (boost::thread& t)
{
t.join();
});
}
// just an example showing statistics
void log()
{
std::for_each(
thread_jobcount_.begin(),
thread_jobcount_.end(),
[] (const thread_jobcount_map::value_type& it)
{
std::cout << "Thread: " << it.first << " executed " << it.second << " jobs\n";
});
}
private:
std::vector<boost::thread> threads_;
std::unique_ptr<boost::asio::io_service::work> work_;
boost::asio::io_service proactor_;
thread_jobcount_map thread_jobcount_;
};
struct add
{
add(int lhs, int rhs, int* result)
: lhs_(lhs)
, rhs_(rhs)
, result_(result)
{
}
void operator()()
{
*result_ = lhs_ + rhs_;
}
int lhs_,rhs_;
int* result_;
};
int main(int argc, char **argv)
{
// some "state objects" that are
// manipulated by the user functors
int x = 0, y = 0, z = 0;
// pool of three threads
ThreadPool pool(3);
// schedule some handlers to do some work
pool.schedule(add(5, 4, &x));
pool.schedule(add(2, 2, &y));
pool.schedule(add(7, 8, &z));
// give all the handlers time to execute
boost::this_thread::sleep(boost::posix_time::milliseconds(1000));
std::cout
<< "x = " << x << "\n"
<< "y = " << y << "\n"
<< "z = " << z << "\n";
pool.join();
pool.log();
}
Output:
x = 9
y = 4
z = 15
Thread: 0000000000B25430 executed 1 jobs
Thread: 0000000000B274F0 executed 1 jobs
Thread: 0000000000B27990 executed 1 jobs