Is this the right approach for a thread-safe Queue class? (C++)

I'm wondering if this is the right approach to writing a thread-safe queue in C++?
template <class T>
class Queue
{
public:
    Queue() {}
    void Push(T& a)
    {
        m_mutex.lock();
        m_q.push(a);
        m_mutex.unlock();
    }
    T& Pop()
    {
        m_mutex.lock();
        T& temp = m_q.pop();
        m_mutex.unlock();
        return temp;
    }
private:
    std::queue<T> m_q;
    boost::mutex m_mutex;
};
You get the idea... I'm just wondering if this is the best approach. Thanks!
EDIT:
Because of the questions I'm getting, I wanted to clarify that the mutex is a boost::mutex

I recommend using the Boost threading libraries to assist you with this.
Your code is fine, except that when you write code in C++ like
some_mutex.lock();
// do something
some_mutex.unlock();
then if the code in the // do something section throws an exception, the lock will never be released. The Boost library solves this with classes such as lock_guard, in which an object acquires the lock in its constructor and releases it in its destructor. That way you know your lock will always be released. Other languages accomplish this through try/finally statements, but C++ doesn't support that construct.
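For illustration, a minimal sketch of the Push function using boost::lock_guard (boost::mutex::scoped_lock would work equally well):
void Push(const T& a)
{
    // The guard acquires the mutex here and releases it in its
    // destructor, even if push() throws.
    boost::lock_guard<boost::mutex> guard(m_mutex);
    m_q.push(a);
}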
In particular, what happens when you try to read from a queue with no elements? Does that throw an exception? If so, then your code would run into problems.
When trying to get the first element, you probably want to check if something is there, then go to sleep if not and wait until something is. This is a job for a condition object, also provided by the Boost library, though available at a lower level if you prefer.
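A sketch of what a blocking Pop might look like with boost::condition_variable; the class and member names here are illustrative, not from the original code:
#include <queue>
#include <boost/thread/mutex.hpp>
#include <boost/thread/condition_variable.hpp>

template <class T>
class BlockingQueue
{
public:
    void Push(const T& a)
    {
        boost::lock_guard<boost::mutex> guard(m_mutex);
        m_q.push(a);
        m_cond.notify_one();            // wake one waiting consumer
    }
    T Pop()
    {
        boost::unique_lock<boost::mutex> lock(m_mutex);
        while (m_q.empty())
            m_cond.wait(lock);          // releases the lock while sleeping
        T temp = m_q.front();
        m_q.pop();
        return temp;
    }
private:
    std::queue<T> m_q;
    boost::mutex m_mutex;
    boost::condition_variable m_cond;
};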

Herb Sutter wrote an excellent article last year in Dr. Dobb's Journal, covering all of the major concerns for a thread-safe, lock-free, single-producer, single-consumer queue implementation. (It corrected an implementation published the previous month.)
His follow-up article in the next issue tackled a more generic approach for a multi-user concurrent queue, along with a full discussion of potential pitfalls and performance issues.
There are a few more articles on similar concurrency topics.
Enjoy.

From a threading point of view, that looks about right for a simple, thread-safe queue.
You do have one problem, though: std::queue's pop() does not return the element popped from the queue. What you need to do is:
T Pop()
{
    m_mutex.lock();
    T temp = m_q.front();
    m_q.pop();
    m_mutex.unlock();
    return temp;
}
You don't want to return a reference in this case since the referenced element is being popped from the queue and destroyed.
You also need some public Size() function to tell you how many elements are in the queue (either that, or you'll need to gracefully handle the case where Pop() is called on an empty queue).
Edit: Though, as Eli Courtwright points out, you do have to be careful with the queue operations throwing exceptions, and using Boost is a good idea.

It depends on what your goals are. Crafted in this manner, your "reader" client will block your "writer" client. You might want to consider using a "condition" to avoid deadlocks, etc.

The approach that you are trying to implement is a locking approach. It will work, except that if you use a plain system-provided "mutex" object, its performance might turn out disappointing (its lock-unlock overhead is pretty high). It is hard to say whether it will be good or not, since we don't know what your performance requirements and expectations are.
Since the operations you perform in "locked" segments of your code are rather quick, it might make sense to use a spin-lock instead of a true mutex, or a combination of the two. This will give you much better performance. Then again, maybe your "mutex" already implements all that (no way to know, since you provided no details about what is actually hiding behind that name).
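For example, a minimal spin lock can be built on std::atomic_flag (C++11); whether it actually beats a mutex depends entirely on contention and on how short the critical sections are:
#include <atomic>

class SpinLock
{
public:
    void lock()
    {
        // Busy-wait until we atomically change the flag from clear to set.
        while (m_flag.test_and_set(std::memory_order_acquire))
            ;  // could yield or pause here after a few iterations
    }
    void unlock()
    {
        m_flag.clear(std::memory_order_release);
    }
private:
    std::atomic_flag m_flag = ATOMIC_FLAG_INIT;
};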
And finally, if you happen to be looking for the best performance, you might want to read up on lock-free synchronization approaches, which are a completely different story. Lock-free methods are typically much more difficult to implement, though.

As Jean-Lou Dupont pointed out, your current code is quite prone to deadlocks. When I've done this, I've used three locks: two counted semaphores and one mutex. The counted semaphores signal when there's space available to insert an object, and when there's at least one object to retrieve from the queue.
The mutex is used only while actually putting an item in the queue or retrieving an item from the queue -- but no attempt is ever made at locking the mutex until we know that the insertion or retrieval will be able to succeed immediately.
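A sketch of that arrangement, using C++20's std::counting_semaphore in place of platform semaphores (the ring buffer and all names are illustrative, not the answerer's actual code):
#include <array>
#include <mutex>
#include <semaphore>

template <class T, size_t N>
class BoundedQueue
{
public:
    void Push(const T& item)
    {
        m_space.acquire();            // wait until a slot is known to be free
        {
            std::lock_guard<std::mutex> guard(m_mutex);  // brief critical section
            m_buf[m_tail] = item;
            m_tail = (m_tail + 1) % N;
        }
        m_items.release();            // signal: one more item available
    }
    T Pop()
    {
        m_items.acquire();            // wait until an item is known to exist
        T item;
        {
            std::lock_guard<std::mutex> guard(m_mutex);
            item = m_buf[m_head];
            m_head = (m_head + 1) % N;
        }
        m_space.release();            // signal: one more slot free
        return item;
    }
private:
    std::array<T, N> m_buf;
    size_t m_head = 0, m_tail = 0;
    std::mutex m_mutex;
    std::counting_semaphore<N> m_space{N};   // free slots
    std::counting_semaphore<N> m_items{0};   // filled slots
};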

Related

Communication between threads via shared vector

I am designing a TCP server which takes information from a request and puts everything in a queue to be processed. I am using an asio web server to handle all web interaction. I am looking for an effective way to queue everything up to be processed. I am using boost signals and a global vector to do this right now, similar to this:
void request_handler::handle_request(request &req, reply &rep)
{
    std::string parsedInfo = parse_request(req);
    shared_queue.push_back(parsedInfo);
    new_entry();
}
new_entry is a boost signal
boost::signal<void ()> new_entry;
Right now I have a signal handler class to catch the signal.
void sig_handler::process_next()
{
    boost::try_mutex::scoped_try_lock lock(guard);
    if(!lock)
        return;
    while(!shared_queue.empty())
    {
        ... //Do Stuff
        std::string cur_entry = shared_queue.at(0);
        shared_queue.erase(shared_queue.begin());
        ... //Do more stuff
    }
}
My goal is to clear out the vector queue whenever there is information in it, each time something is pushed onto the vector. I would like to avoid polling as much as possible. I believe this part is working how I expect it to. However, I am getting an occasional crash from, what I believe based on my backtrace, pushing information onto the shared queue. This only happens when I try to do thousands of transactions a second, which makes it hard to debug in a multi-threaded environment. My backtrace is here:
Error: signal 11:
./UpdateServer/build/UpdateServer(_Z7handleri+0x18)[0x469f68]
/lib/x86_64-linux-gnu/libc.so.6(+0x364a0)[0x7fbbdd67a4a0]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZNSsC1ERKSs+0xb)[0x7fbbddfb4f2b]
./UpdateServer/build/UpdateServer[0x4749d0]
./UpdateServer/build/UpdateServer(_ZNSt6vectorISsSaISsEE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPSsS1_EERKSs+0x111)[0x476521]
./UpdateServer/build/UpdateServer(_ZN15request_handler14handle_requestERK7requestR5reply+0x3d3)[0x475873]
./UpdateServer/build/UpdateServer(_ZN10connection11handle_readERKN5boost6system10error_codeEm+0x234)[0x46c774]
./UpdateServer/build/UpdateServer(_ZN5boost4asio6detail14strand_service8dispatchINS1_7binder2INS_3_bi6bind_tIvNS_4_mfi3mf2Iv10connectionRKNS_6system10error_codeEmEENS5_5list3INS5_5valueINS_10shared_ptrIS9_EEEEPFNS_3argILi1EEEvEPFNSK_ILi2EEEvEEEEESB_mEEEEvRPNS2_11strand_implET_+0xcd)[0x47216d]
./UpdateServer/build/UpdateServer(_ZN5boost4asio6detail15wrapped_handlerINS0_10io_service6strandENS_3_bi6bind_tIvNS_4_mfi3mf2Iv10connectionRKNS_6system10error_codeEmEENS5_5list3INS5_5valueINS_10shared_ptrIS9_EEEEPFNS_3argILi1EEEvEPFNSK_ILi2EEEvEEEEEEclISB_mEEvRKT_RKT0_+0xd9)[0x472439]
./UpdateServer/build/UpdateServer(_ZN5boost4asio6detail18completion_handlerINS1_17rewrapped_handlerINS1_7binder2INS1_15wrapped_handlerINS0_10io_service6strandENS_3_bi6bind_tIvNS_4_mfi3mf2Iv10connectionRKNS_6system10error_codeEmEENS8_5list3INS8_5valueINS_10shared_ptrISC_EEEEPFNS_3argILi1EEEvEPFNSN_ILi2EEEvEEEEEEESE_mEESV_EEE11do_completeEPNS1_15task_io_serviceEPNS1_25task_io_service_operationESG_m+0x1e5)[0x4726f5]
./UpdateServer/build/UpdateServer(_ZN5boost4asio6detail14strand_service8dispatchINS1_17rewrapped_handlerINS1_7binder2INS1_15wrapped_handlerINS0_10io_service6strandENS_3_bi6bind_tIvNS_4_mfi3mf2Iv10connectionRKNS_6system10error_codeEmEENS9_5list3INS9_5valueINS_10shared_ptrISD_EEEEPFNS_3argILi1EEEvEPFNSO_ILi2EEEvEEEEEEESF_mEESW_EEEEvRPNS2_11strand_implET_+0x2ad)[0x472aed]
./UpdateServer/build/UpdateServer(_ZN5boost4asio6detail19asio_handler_invokeINS1_7binder2INS1_15wrapped_handlerINS0_10io_service6strandENS_3_bi6bind_tIvNS_4_mfi3mf2Iv10connectionRKNS_6system10error_codeEmEENS7_5list3INS7_5valueINS_10shared_ptrISB_EEEEPFNS_3argILi1EEEvEPFNSM_ILi2EEEvEEEEEEESD_mEES6_SU_EEvRT_PNS4_IT0_T1_EE+0x15f)[0x472d3f]
./UpdateServer/build/UpdateServer(_ZN5boost4asio6detail23reactive_socket_recv_opINS0_17mutable_buffers_1ENS1_15wrapped_handlerINS0_10io_service6strandENS_3_bi6bind_tIvNS_4_mfi3mf2Iv10connectionRKNS_6system10error_codeEmEENS7_5list3INS7_5valueINS_10shared_ptrISB_EEEEPFNS_3argILi1EEEvEPFNSM_ILi2EEEvEEEEEEEE11do_completeEPNS1_15task_io_serviceEPNS1_25task_io_service_operationESF_m+0xce)[0x472ede]
./UpdateServer/build/UpdateServer(_ZN5boost4asio6detail15task_io_service3runERNS_6system10error_codeE+0x79a)[0x47ceea]
./UpdateServer/build/UpdateServer(_ZN5boost4asio10io_service3runEv+0x25)[0x47d1d5]
/usr/lib/libboost_thread.so.1.48.0(+0xdda9)[0x7fbbdecd4da9]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7fbbdd42ee9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fbbdd737cbd]
The line ./UpdateServer/build/UpdateServer(_ZNSt6vectorISsSaISsEE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPSsS1_EERKSs+0x111)[0x476521]
seems to demangle to a std::vector<std::string> insert, which is why I believe my program is crashing on the shared vector insert (I do not believe I have any other vectors of strings in my program). However, I am fairly positive I am using my vector in a safe way, on both the insert and the read.
So I guess my question is: when I am pushing information onto a shared vector, are there race conditions I have to worry about that would cause my crash? And is the approach I am taking feasible, or should I rethink my design in some way? Please let me know if you need any more information; I will be glad to provide anything I can.
Thank you
std data structures are not thread safe (for the most part) and therefore require additional synchronization if accessed by multiple threads simultaneously. In your case, one thread could be calling push_back while another thread is calling erase, which produces undefined behavior. To fix this, both the push_back and the erase need to be protected by the same lock. I recommend you google for thread safety in the C++ standard library and read more about it.
Also, using a vector here is probably not the best choice; look into std::queue instead. When you erase the first element of the vector, it has to copy all of the strings later in the vector down one place, which can be very expensive. A queue does not suffer from this problem.
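A minimal sketch of the fix, assuming a single mutex shared by producer and consumer and a std::queue in place of the vector (the names here are illustrative):
#include <queue>
#include <string>
#include <boost/thread/mutex.hpp>

std::queue<std::string> shared_queue;
boost::mutex queue_guard;

// Producer side (called from handle_request), before firing the signal:
void enqueue(const std::string& parsedInfo)
{
    boost::mutex::scoped_lock lock(queue_guard);
    shared_queue.push(parsedInfo);
}

// Consumer side (called from process_next): pop under the same lock,
// but do the expensive processing outside of it.
bool dequeue(std::string& out)
{
    boost::mutex::scoped_lock lock(queue_guard);
    if (shared_queue.empty())
        return false;
    out = shared_queue.front();
    shared_queue.pop();
    return true;
}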

Multiple mutex locking strategies and why libraries don't use address comparison

There is a widely known way of locking multiple locks, which relies on choosing a fixed linear ordering and acquiring locks according to this ordering.
That was proposed, for example, in the answer to "Acquire a lock on two mutexes and avoid deadlock". In particular, the solution based on address comparison seems quite elegant and obvious.
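For reference, the address-comparison idiom looks roughly like this (assuming the two mutexes are distinct objects):
#include <boost/thread/mutex.hpp>

// Lock two mutexes in a globally consistent order based on their addresses,
// so two threads locking the same pair can never deadlock.
void lock_pair(boost::mutex& a, boost::mutex& b)
{
    if (&a < &b) {
        a.lock();
        b.lock();
    } else {
        b.lock();
        a.lock();
    }
}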
When I tried to check how it is actually implemented, I found, to my surprise, that this solution is not widely used.
To quote the Kernel Docs - Unreliable Guide To Locking:
Textbooks will tell you that if you always lock in the same order, you
will never get this kind of deadlock. Practice will tell you that this
approach doesn't scale: when I create a new lock, I don't understand
enough of the kernel to figure out where in the 5000 lock hierarchy it
will fit.
PThreads doesn't seem to have such a mechanism built in at all.
Boost.Thread came up with a completely different solution: lock() for multiple (2 to 5) mutexes is based on trying to lock as many mutexes as possible at any given moment.
This is the fragment of the Boost.Thread source code (Boost 1.48.0, boost/thread/locks.hpp:1291):
template<typename MutexType1,typename MutexType2,typename MutexType3>
void lock(MutexType1& m1,MutexType2& m2,MutexType3& m3)
{
    unsigned const lock_count=3;
    unsigned lock_first=0;
    for(;;)
    {
        switch(lock_first)
        {
        case 0:
            lock_first=detail::lock_helper(m1,m2,m3);
            if(!lock_first)
                return;
            break;
        case 1:
            lock_first=detail::lock_helper(m2,m3,m1);
            if(!lock_first)
                return;
            lock_first=(lock_first+1)%lock_count;
            break;
        case 2:
            lock_first=detail::lock_helper(m3,m1,m2);
            if(!lock_first)
                return;
            lock_first=(lock_first+2)%lock_count;
            break;
        }
    }
}
where lock_helper returns 0 on success, and otherwise the number of the mutex that could not be locked.
Why is this solution better than comparing addresses or any other kind of IDs? I don't see any problems with pointer comparison that can be avoided using this kind of "blind" locking.
Are there any other ideas on how to solve this problem on a library level?
From the bounty text:
I'm not even sure if I can prove correctness of the presented Boost solution, which seems more tricky than the one with linear order.
The Boost solution cannot deadlock because it never waits while already holding a lock. All locks but the first are acquired with try_lock. If any try_lock call fails to acquire its lock, all previously acquired locks are freed. Also, in the Boost implementation the new attempt will start from the lock it failed to acquire the previous time, and will first wait until it is available; it's a smart design decision.
As a general rule, it's always better to avoid blocking calls while holding a lock; therefore, the solution with try-lock, if possible, is preferred (in my opinion). A particular consequence of lock ordering is that the system as a whole might get stuck. Imagine the very last lock (e.g. the one with the biggest address) was acquired by a thread which was then blocked. Now imagine some other thread needs the last lock and another lock; due to the ordering, it will first get the other one and then wait on the last lock. The same can happen with all the other locks, and the whole system makes no progress until the last lock is released. Of course it's an extreme and rather unlikely case, but it illustrates the inherent problem with lock ordering: the higher a lock's number, the more indirect impact the lock has when acquired.
The shortcoming of the try-lock-based solution is that it can cause livelock, and in extreme cases the whole system might also get stuck for at least some time. Therefore it is important to have some back-off scheme that makes the pauses between locking attempts longer over time, and perhaps randomizes them.
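A back-off variant for two mutexes might look like the following sketch (illustrative only; Boost's real implementation instead rotates which mutex it blocks on first, as shown above):
#include <chrono>
#include <mutex>
#include <thread>

template <typename M1, typename M2>
void lock_with_backoff(M1& m1, M2& m2)
{
    std::chrono::microseconds pause(1);
    for (;;)
    {
        std::unique_lock<M1> l1(m1);            // block on the first lock only
        if (m2.try_lock())                      // never block while holding l1
        {
            l1.release();                       // success: keep both locked
            return;
        }
        l1.unlock();                            // failure: drop everything...
        std::this_thread::sleep_for(pause);     // ...and back off
        pause *= 2;                             // grow the pause to break livelock
    }
}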
Sometimes, lock A needs to be acquired before lock B. Lock B might have either a lower or a higher address, so you can't use address comparison in this case.
Example: When you have a tree data structure, and threads try to read and update nodes, you can protect the tree using a reader-writer lock per node. This only works if your threads always acquire locks top-down, root-to-leaf. The address of the locks does not matter in this case.
You can only use address comparison if it does not matter at all which lock gets acquired first. If this is the case, address comparison is a good solution. But if this is not the case you can't do it.
I guess the Linux kernel requires certain subsystems to be locked before others are. This cannot be done using address comparison.
The "address comparison" and similar approaches, although used quite often, are special cases. They works fine if you have
a lock-free mechanism to get
two (or more) "items" of the same kind or hierarchy level
any stable ordering schema between those items
For example: You have a mechanism to get two "accounts" from a list. Assume that the access to the list is lock-free. Now you have pointers to both items and want to lock them. Since they are "siblings" you have to choose which one to lock first. Here the approach using addresses (or any other stable ordering schema like "account id") is OK.
But the linked Linux text talks about "lock hierarchies". This means locking not between "siblings" (of the same kind) but between "parent" and "children", which might be of different types. This may happen in actual tree structures as well as in other scenarios.
Contrived example: To load a program you must
lock the file inode,
lock the process table
lock the destination memory
These three locks are not "siblings" and are not in a clear hierarchy. The locks are also not taken directly one after the other; each subsystem will take the locks at will. If you consider all the use cases where those three (and more) subsystems interact, you see that there is no clear, stable ordering you can think of.
The Boost library is in the same situation: It strives to provide generic solutions. So they cannot assume the points from above and must fall back to a more complicated strategy.
One scenario where address comparison will fail is if you use the proxy pattern: you can delegate the locks to the same underlying object while the proxies' addresses differ.
Consider the following example:
template<typename MutexType>
class MutexHelper
{
public:
    MutexHelper(MutexType &m) : _m(m) {}
    void lock()
    {
        std::cout << "locking ";
        _m.lock();
    }
    void unlock()
    {
        std::cout << "unlocking ";
        _m.unlock();
    }
private:
    MutexType &_m;
};
If the function
template<typename MutexType1,typename MutexType2,typename MutexType3>
void lock(MutexType1& m1,MutexType2& m2,MutexType3& m3);
actually used address comparison, the following code could produce a deadlock:
Mutex m1;
Mutex m2;
thread1:
    MutexHelper hm1(m1);
    MutexHelper hm2(m2);
    lock(hm1, hm2);
thread2:
    MutexHelper hm2(m2);
    MutexHelper hm1(m1);
    lock(hm1, hm2);
EDIT:
This is an interesting thread that sheds some light on the boost::lock implementation:
thread-best-practice-to-lock-multiple-mutexes
Address comparison does not work for inter-process shared mutexes (named synchronization objects).

Thread safety of multiple-reader/single-writer class

I am working on a set that is frequently read but rarely written.
class A {
    boost::shared_ptr<std::set<int> > _mySet;
public:
    void add(int v) {
        boost::shared_ptr<std::set<int> > tmpSet(new std::set<int>(*_mySet));
        tmpSet->insert(v); // insert to tmpSet
        _mySet = tmpSet; // swap _mySet
    }
    void check(int v) {
        boost::shared_ptr<std::set<int> > theSet = _mySet;
        if (theSet->find(v) != theSet->end()) {
            // do something irrelevant
        }
    }
};
In the class, add() is only called by one thread and check() is called by many threads. check() does not care whether _mySet is the latest or not. Is the class thread-safe? Is it possible that the thread executing check() would observe swap _mySet happening before insert to tmpSet?
This is an interesting use of shared_ptr to implement thread safety.
Whether it is OK depends on the thread-safety guarantees of
boost::shared_ptr. In particular, does it establish some sort of
fence or membar, so that you are guaranteed that all of the writes in
the constructor and insert functions of set occur before any
modification of the pointer value becomes visible.
I can find no thread safety guarantees whatsoever in the Boost
documentation of smart pointers. This surprises me, as I was sure that
there was some. But a quick look at the sources for 1.47.0 show none,
and that any use of boost::shared_ptr in a threaded environment will
fail. (Could someone please point me to what I'm missing. I can't
believe that boost::shared_ptr has ignored threading.)
Anyway, there are three possibilities: you can't use the shared pointer
in a threaded environment (which seems to be the case), the shared
pointer ensures its own internal consistency in a threaded environment,
but doesn't establish ordering with regards to other objects, or the
shared pointer establishes full ordering. Only in the last case will
your code be safe as is. In the first case, you'll need some form of
lock around everything, and in the second, you'll need some sort of
fences or membar to ensure that the necessary writes are actually done
before publishing the new version, and that they will be seen before
trying to read it.
You do need synchronization; it is not thread safe. And it doesn't matter how simple the operation is; even something as simple as shared += value; is not thread safe.
Look here, for example, with regard to the thread safety of shared_ptr: Is boost shared_ptr <XXX> thread safe?
I would also question your allocation/swapping in add() and use of shared_ptr in check()
update:
I went back and re-read the docs for shared_ptr... It is most likely thread-safe in your particular case, since the reference counting for shared_ptr is thread-safe. However, you are (IMHO) adding unnecessary complexity by not using a read/write lock.
In the end, this code should be thread safe:
atomic_store(&_my_set,tmpSet);
and
theSet = atomic_load(&_mySet);
(instead of simple assignments)
But I don't know the current status of atomicity support for shared_ptr.
Note that adding atomicity to shared_ptr in a lock-free manner is a really difficult thing; so even if atomicity is implemented, it may rely on mutexes or user-mode spinlocks and, therefore, may sometimes suffer from performance issues.
Edit: Perhaps a volatile qualifier for the _my_set member variable should also be added... but I'm not sure that it is strictly required by the semantics of atomic operations.
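Putting that together, a sketch of the class using the C++11 free functions std::atomic_load/std::atomic_store for shared_ptr (these exist since C++11 and were deprecated in C++20 in favor of std::atomic<std::shared_ptr>):
#include <memory>
#include <set>

class A {
    std::shared_ptr<std::set<int> > _mySet;
public:
    A() : _mySet(std::make_shared<std::set<int> >()) {}
    void add(int v)                        // single writer
    {
        std::shared_ptr<std::set<int> > tmp(
            new std::set<int>(*std::atomic_load(&_mySet)));
        tmp->insert(v);                    // mutate the private copy
        std::atomic_store(&_mySet, tmp);   // publish it atomically
    }
    bool check(int v) const                // many concurrent readers
    {
        std::shared_ptr<std::set<int> > snap = std::atomic_load(&_mySet);
        return snap->find(v) != snap->end();
    }
};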

How to detect circular calls?

I've been looking for causes for deadlocks and strategies/tools to avoid and detect them.
Another potential cause for deadlocks is to have blocking functions calling other blocking functions in a circular way, so that eventually a call never returns.
Sometimes this is hard to discover, especially in very large projects.
So, are there any tools/libraries/techniques that make it possible to automate the detection of circular calls in a program?
EDIT:
I code mostly in C and C++ so, if possible, give any information about the topic that is applicable to those languages.
Nevertheless, it seems this topic is scarcely covered on SO, so answers for other languages are OK too, although those may deserve a topic of their own if someone finds it relevant.
Thanks.
Circular (or recursive) calls that try to acquire the same non-reentrant lock are one of the easiest blocking scenarios to debug: locking is deterministic and can be easily checked. When the application locks up, fire up the debugger and look at the stack trace to understand what locks are held and why.
As to general solutions for the problem of locking... you can look into some libraries that provide mutex ordering and detect when you are trying to lock a mutex out of order. This type of solution might be complex to implement correctly, but once in place it ensures that you cannot enter a deadlock condition, as it forces all processes to obtain locks in the same order (i.e. if process A holds lock La and tries to acquire lock Lb, for which the ordering is correct, then it can either succeed or block; but whichever process is holding lock Lb cannot try to lock La, as the ordering constraint would not be met).
If you are on Linux, there are two Valgrind tools for detecting deadlocks and race conditions: Helgrind and DRD. They complement each other, and it's worth checking for thread errors with both of them.
On Linux you can use Valgrind to detect deadlocks: use --tool=helgrind.
The best way to detect deadlocks (IMO) is to make a test program that calls all the functions in a random order, in something like 30 different threads, tens of thousands of times.
If you get a deadlock you can use VS2010 "Parallel Stacks" window. Debug->Windows->Parallel Stacks
This window will show you all the stacks, so you can find the methods that are deadlocking.
A simple strategy I use to write thread-safe objects:
A thread safe object should be safe when its public methods are called, so you don't get deadlocks when it is used.
So, the idea is to just lock all the public methods that access the object's data.
Besides that, you need to ensure that within the class's code you never call a public method. If you need to use one of the public methods, then make that method private and wrap the private method with a public method that locks and then calls it.
If you want better lock granularity, you could create objects for each part that has its own lock, and lock them like I suggested. Then use encapsulation to combine those classes into one class.
Example:
class Blah {
    MyData data;
    Lock lock;
public:
    DataItem GetData(int index)
    {
        ReadLock read(lock);
        return LocalGetData(index);
    }
    DataItem FindData(string key)
    {
        ReadLock read(lock);
        DataItem item;
        //find the item, can use LocalGetData() to get the item without deadlocking
        return item;
    }
    void PutData(DataItem item)
    {
        WriteLock write(lock);  // writers need exclusive access
        //put item in database
    }
private:
    DataItem LocalGetData(int index)
    {
        return data[index];
    }
};
You could find a tool that builds a call graph, and check the graph for cycles.
Otherwise, there are a number of strategies for detecting deadlocks or other circularities, but they all depend on having some sort of supporting infrastructure in place.
There are deadlock avoidance strategies, having to do with assigning lock priorities and ordering the locks according to priority. These require code changes and enforcing the standards, though.
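One common form of that infrastructure is a "hierarchical" mutex that records the current thread's lock level and fails fast when locks are taken out of order; the sketch below follows the general shape of Anthony Williams' well-known example, with illustrative names:
#include <climits>
#include <mutex>
#include <stdexcept>

class HierarchicalMutex
{
public:
    explicit HierarchicalMutex(unsigned long level) : m_level(level) {}
    void lock()
    {
        // Each thread may only acquire mutexes with strictly
        // decreasing levels; anything else is an ordering violation.
        if (t_current_level <= m_level)
            throw std::logic_error("mutex hierarchy violated");
        m_mutex.lock();
        m_previous = t_current_level;
        t_current_level = m_level;
    }
    void unlock()
    {
        t_current_level = m_previous;
        m_mutex.unlock();
    }
private:
    std::mutex m_mutex;
    unsigned long const m_level;
    unsigned long m_previous = 0;
    static thread_local unsigned long t_current_level;
};

thread_local unsigned long HierarchicalMutex::t_current_level = ULONG_MAX;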

What is the best way to synchronize container access between multiple threads in a real-time application?

I have std::list<Info> infoList in my application that is shared between two threads. These 2 threads are accessing this list as follows:
Thread 1: uses push_back(), pop_front() or clear() on the list (depending on the situation)
Thread 2: uses an iterator to iterate through the items in the list and do some actions.
Thread 2 is iterating the list like the following:
for(std::list<Info>::iterator i = infoList.begin(); i != infoList.end(); ++i)
{
    DoAction(i);
}
The code is compiled using GCC 4.4.2.
Sometimes ++i causes a segfault and crashes the application. The error occurs in stl_list.h at line 143, on the following line:
_M_node = _M_node->_M_next;
I guess this is a race condition. The list might have changed, or even been cleared, by thread 1 while thread 2 was iterating it.
I used a mutex to synchronize access to this list and all went OK during my initial test. But the system just freezes under stress testing, making this solution totally unacceptable. This application is a real-time application and I need to find a solution so both threads can run as fast as possible without hurting total application throughput.
My question is this:
Thread 1 and Thread 2 need to execute as fast as possible since this is a real-time application. What can I do to prevent this problem and still maintain application performance? Are there any lock-free algorithms available for such a problem?
It's OK if I miss some newly added Info objects in thread 2's iteration, but what can I do to prevent the iterator from becoming a dangling pointer?
Thanks
Your for() loop can potentially keep a lock for a relatively long time, depending on how many elements it iterates. You can get in real trouble if it "polls" the queue, constantly checking if a new element became available. That makes the thread own the mutex for an unreasonably long time, giving few opportunities to the producer thread to break in and add an element. And burning lots of unnecessary CPU cycles in the process.
You need a "bounded blocking queue". Don't write it yourself, the lock design is not trivial. Hard to find good examples, most of it is .NET code. This article looks promising.
In general it is not safe to use STL containers this way. You will have to implement a specific method to make your code thread safe. The solution you choose depends on your needs. I would probably solve this by maintaining two lists, one in each thread, and communicating the changes through a lock-free queue (mentioned in the comments to this question). You could also limit the lifetime of your Info objects by wrapping them in boost::shared_ptr, e.g.:
typedef boost::shared_ptr<Info> InfoReference;
typedef std::list<InfoReference> InfoList;

enum CommandValue
{
    Insert,
    Delete
};

struct Command
{
    CommandValue operation;
    InfoReference reference;
};

typedef LockFreeQueue<Command> CommandQueue;
class Thread1
{
public:
    Thread1(CommandQueue& queue) : m_commands(queue) {}
    void run()
    {
        while (!finished)
        {
            //Process Items and use
            // deleteInfo() or addInfo()
        }
    }
    void deleteInfo(InfoReference reference)
    {
        Command command;
        command.operation = Delete;
        command.reference = reference;
        m_commands.produce(command);
    }
    void addInfo(InfoReference reference)
    {
        Command command;
        command.operation = Insert;
        command.reference = reference;
        m_commands.produce(command);
    }
private:
    CommandQueue& m_commands;
    InfoList m_infoList;
};
class Thread2
{
public:
    Thread2(CommandQueue& queue) : m_commands(queue) {}
    void run()
    {
        while(!finished)
        {
            processQueue();
            processList();
        }
    }
private:
    void processQueue()
    {
        Command command;
        while (m_commands.consume(command))
        {
            switch(command.operation)
            {
            case Insert:
                m_infoList.push_back(command.reference);
                break;
            case Delete:
                m_infoList.remove(command.reference);
                break;
            }
        }
    }
    void processList()
    {
        // Iterate over m_infoList
    }

    CommandQueue& m_commands;
    InfoList m_infoList;
};
int main()
{
    CommandQueue commands;
    Thread1 thread1(commands);
    Thread2 thread2(commands);
    thread1.start();
    thread2.start();
    waitforTermination();
}
This has not been compiled. You still need to make sure that access to your Info objects is thread safe.
I would like to know what the purpose of this list is; it would be easier to answer the question then.
As Hoare said, it is generally a bad idea to try to share data to communicate between two threads, rather you should communicate to share data: ie messaging.
If this list is modelling a queue, for example, you might simply use one of the various ways to communicate (such as sockets) between the two threads. Consumer / Producer is a standard and well-known problem.
If your items are expensive, then only pass the pointers around during communication, you'll avoid copying the items themselves.
In general, it's exquisitely difficult to share data, although it is unfortunately the only way of programming we hear of in school. Normally only low-level implementation of "channels" of communication should ever worry about synchronization and you should learn to use the channels to communicate instead of trying to emulate them.
To prevent your iterator from being invalidated you have to lock the whole for loop. But then the first thread may have difficulty updating the list. I would try to give it a chance to do its job on each (or every Nth) iteration.
In pseudo-code that would look like:
mutex_lock();
for(...)
{
    doAction();
    mutex_unlock();
    thread_yield();  // give first thread a chance
    mutex_lock();
    if(iterator_invalidated_flag)  // set by first thread
        reset_iterator();
}
mutex_unlock();
You have to decide which thread is the more important. If it is the update thread, then it must signal the iterator thread to stop, wait and start again. If it is the iterator thread, it can simply lock the list until iteration is done.
The best way to do this is to use a container that is internally synchronized. TBB and Microsoft's concurrent_queue do this. Anthony Williams also has a good implementation on his blog here
Others have already suggested lock-free alternatives, so I'll answer as if you were stuck using locks...
When you modify a list, existing iterators can become invalidated because they no longer point to valid memory (for a list, this happens when the element an iterator refers to is erased). To prevent invalidated iterators, you could make the producer block on a mutex while your consumer traverses the list, but that would be needless waiting for the producer.
Life would be easier if you used a queue instead of a list, and had your consumer use a synchronized queue<Info>::pop() call instead of iterators that can be invalidated behind your back. If your consumer really needs to gobble chunks of Info at a time, then use a condition variable that'll make your consumer block until queue.size() >= minimum.
The Boost library has a nice portable implementation of condition variables (that even works with older versions of Windows), as well as the usual threading library stuff.
For a producer-consumer queue that uses (old-fashioned) locking, check out the BlockingQueue template class of the ZThreads library. I have not used ZThreads myself, being worried about lack of recent updates, and because it didn't seem to be widely used. However, I have used it as inspiration for rolling my own thread-safe producer-consumer queue (before I learned about lock-free queues and TBB).
A lock-free queue/stack library seems to be in the Boost review queue. Let's hope we see a new Boost.Lockfree in the near future! :)
If there's interest, I can write up an example of a blocking queue that uses std::queue and Boost thread locking.
EDIT:
The blog referenced by Rick's answer already has a blocking queue example that uses std::queue and Boost condvars. If your consumer needs to gobble chunks, you can extend the example as follows:
void wait_for_data(size_t how_many)
{
    boost::mutex::scoped_lock lock(the_mutex);
    while(the_queue.size() < how_many)
    {
        the_condition_variable.wait(lock);
    }
}
You may also want to tweak it to allow time-outs and cancellations.
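For instance, a timed variant is straightforward with the C++11 equivalents (shown here with std::condition_variable::wait_for; Boost has analogous timed waits):
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <queue>

// Returns false if fewer than how_many items arrived before the timeout.
template <class T>
bool wait_for_data(std::queue<T>& the_queue, std::mutex& the_mutex,
                   std::condition_variable& the_condition_variable,
                   size_t how_many, std::chrono::milliseconds timeout)
{
    std::unique_lock<std::mutex> lock(the_mutex);
    return the_condition_variable.wait_for(lock, timeout,
        [&] { return the_queue.size() >= how_many; });
}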
You mentioned that speed was a concern. If your Infos are heavyweight, you should consider passing them around by shared_ptr. You can also try making your Infos fixed-size and using a memory pool (which can be much faster than the heap).
As you mentioned that you don't care if your iterating consumer misses some newly-added entries, you could use a copy-on-write list underneath. That allows the iterating consumer to operate on a consistent snapshot of the list as of when it first started, while, in other threads, updates to the list yield fresh but consistent copies, without perturbing any of the extant snapshots.
The trade-off here is that each update to the list requires locking for exclusive access long enough to copy the entire list. This technique is biased toward having many concurrent readers and less frequent updates.
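A sketch of a copy-on-write list along those lines, holding an immutable snapshot behind a shared_ptr (all names are illustrative):
#include <list>
#include <memory>
#include <mutex>

template <class T>
class CowList
{
public:
    CowList() : m_list(std::make_shared<std::list<T> >()) {}
    // Readers grab a snapshot and iterate it without holding the lock;
    // published lists are never modified afterwards.
    std::shared_ptr<const std::list<T> > snapshot() const
    {
        std::lock_guard<std::mutex> guard(m_mutex);
        return m_list;
    }
    // Writers copy the list, modify the copy, and publish it under the lock.
    void push_back(const T& v)
    {
        std::lock_guard<std::mutex> guard(m_mutex);
        std::shared_ptr<std::list<T> > copy(
            std::make_shared<std::list<T> >(*m_list));
        copy->push_back(v);
        m_list = copy;
    }
private:
    mutable std::mutex m_mutex;
    std::shared_ptr<const std::list<T> > m_list;
};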
Trying to add intrinsic locking to the container first requires you to think about which operations need to behave as atomic groups. For instance, checking whether the list is empty before trying to pop the first element requires an atomic pop-if-not-empty operation; otherwise, the answer to "is the list empty?" can change between when the caller receives the answer and when it attempts to act upon it.
It's not clear in your example above what guarantees the iteration must obey. Must every element in the list eventually be visited by the iterating thread? Does it make multiple passes? What does it mean for one thread to remove an element from the list while another thread is running DoAction() against it? I suspect that working through these questions will lead to significant design changes.
You're working in C++, and you mentioned needing a queue with a pop-if-not-empty operation. I wrote a two-lock queue many years ago using the ACE Library's concurrency primitives, as the Boost thread library was not yet ready for production use, and the chance for the C++ Standard Library including such facilities was a distant dream. Porting it to something more modern would be easy.
This queue of mine -- called concurrent::two_lock_queue -- allows access to the head of the queue only via RAII. This ensures that acquiring the lock to read the head will always be mated with a release of the lock. A consumer constructs a const_front (const access to head element), a front (non-const access to head element), or a renewable_front (non-const access to head and successor elements) object to represent the exclusive right to access the head element of the queue. Such "front" objects can't be copied.
Class two_lock_queue also offers a pop_front() function that waits until at least one element is available to be removed, but, in keeping with std::queue's and std::stack's style of not mixing container mutation and value copying, pop_front() returns void.
In a companion file, there's a type called concurrent::unconditional_pop, which allows one to ensure through RAII that the head element of the queue will be popped upon exit from the current scope.
The companion file error.hh defines the exceptions that arise from use of the function two_lock_queue::interrupt(), used to unblock threads waiting for access to the head of the queue.
Take a look at the code and let me know if you need more explanation as to how to use it.
If you're using C++0x you could internally synchronize list iteration this way:
Assuming the class has a templated list named objects_ and a boost::mutex named mutex_, the toAll method is a member method of the list wrapper:
void toAll(std::function<void (T*)> lambda)
{
    // The lock must be a named object; an unnamed temporary would be
    // destroyed (and the mutex released) immediately.
    boost::mutex::scoped_lock lock(this->mutex_);
    for(auto it = this->objects_.begin(); it != this->objects_.end(); ++it)
    {
        T* object = *it;
        if(object != nullptr)
        {
            lambda(object);
        }
    }
}
Calling:
synchronizedList1->toAll(
    [&](T* object) -> void  // or the class that your list holds
    {
        for(auto it = this->knownEntities->begin(); it != this->knownEntities->end(); ++it)
        {
            // Do something
        }
    }
);
You must be using some threading library. If you are using Intel TBB, you can use concurrent_vector or concurrent_queue. See this.
If you want to continue using std::list in a multi-threaded environment, I would recommend wrapping it in a class with a mutex that provides locked access to it. Depending on the exact usage, it might make sense to switch to an event-driven queue model where messages are passed on a queue that multiple worker threads are consuming (hint: producer-consumer).
I would seriously take Matthieu's thought into consideration. Many problems that are being solved using multi-threaded programming are better solved using message-passing between threads or processes. If you need high throughput and do not absolutely require that the processing share the same memory space, consider using something like the Message-Passing Interface (MPI) instead of rolling your own multi-threaded solution. There are a bunch of C++ implementations available - OpenMPI, Boost.MPI, Microsoft MPI, etc. etc.
I don't think you can get away without any synchronisation at all in this case, as certain operations will invalidate the iterators you are using. With a list, this is fairly limited (basically, only if both threads are trying to manipulate iterators to the same element at the same time), but there is still a danger that you'll be removing an element at the same time you're trying to append one to it.
Are you by any chance holding the lock across DoAction(i)? You obviously only want to hold the lock for the absolute minimum of time that you can get away with in order to maximise the performance. From the code above I think you'll want to decompose the loop somewhat in order to speed up both sides of the operation.
Something along the lines of:
while (processItems) {
    Info item;
    bool hasItem = false;
    lock(mutex);
    if (!infoList.empty()) {
        item = infoList.front();
        infoList.pop_front();
        hasItem = true;
    }
    unlock(mutex);
    if (hasItem)
        DoAction(item);
    delayALittle();
}
And the insert function would still have to look like this:
lock(mutex);
infoList.push_back(item);
unlock(mutex);
Unless the queue is likely to be massive, I'd be tempted to use something like a std::vector<Info> or even a std::vector<boost::shared_ptr<Info> > to minimize the copying of the Info objects (assuming these are somewhat more expensive to copy than a boost::shared_ptr). Generally, operations on a vector tend to be a little faster than on a list, especially if the objects stored in the vector are small and cheap to copy.