I am designing a tcp server which takes information from a request and puts everything in a queue to be processed. I am using a asio web server to handle all web interaction. I am looking for an effective way to queue everything up to be processes. I am using boost signals and global vector to do this right now similar to this.
void request_handler::handle_request(request &req, reply &rep)
{
std::string parsedInfo = parse_request(req);
shared_queue.push_back(parsedInfo);
new_entry();
}
new_entry is a boost signal
boost::signal<void ()> new_entry;
Right now I have a signal handler class to catch the signal.
void sig_handler::process_next()
{
boost::try_mutex::scoped_try_lock lock(guard);
if(!lock)
return;
while(!shared_queue.empty())
{
... //Do Stuff
std::string cur_entry = shared_queue.at(0);
shared_queue.erase(shared_queue.begin());
... //Do more stuff
}
}
My goal is to clear out the vector queue when there is information in it, and every time something is pushed on the vector. I would like to avoid polling as much as possible. I believe this part is working how I expect it too. However I am getting an occasional crash from, what I believe based on my backtrace, pushing information on the shared queue. This is only happening when I trying to do 1000's of transactions a second, which makes it hard to debug in a multi-threaded environment. My back trace is here:
Error: signal 11:
./UpdateServer/build/UpdateServer(_Z7handleri+0x18)[0x469f68]
/lib/x86_64-linux-gnu/libc.so.6(+0x364a0)[0x7fbbdd67a4a0]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZNSsC1ERKSs+0xb)[0x7fbbddfb4f2b]
./UpdateServer/build/UpdateServer[0x4749d0]
./UpdateServer/build/UpdateServer(_ZNSt6vectorISsSaISsEE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPSsS1_EERKSs+0x111)[0x476521]
./UpdateServer/build/UpdateServer(_ZN15request_handler14handle_requestERK7requestR5reply+0x3d3)[0x475873]
./UpdateServer/build/UpdateServer(_ZN10connection11handle_readERKN5boost6system10error_codeEm+0x234)[0x46c774]
./UpdateServer/build/UpdateServer(_ZN5boost4asio6detail14strand_service8dispatchINS1_7binder2INS_3_bi6bind_tIvNS_4_mfi3mf2Iv10connectionRKNS_6system10error_codeEmEENS5_5list3INS5_5valueINS_10shared_ptrIS9_EEEEPFNS_3argILi1EEEvEPFNSK_ILi2EEEvEEEEESB_mEEEEvRPNS2_11strand_implET_+0xcd)[0x47216d]
./UpdateServer/build/UpdateServer(_ZN5boost4asio6detail15wrapped_handlerINS0_10io_service6strandENS_3_bi6bind_tIvNS_4_mfi3mf2Iv10connectionRKNS_6system10error_codeEmEENS5_5list3INS5_5valueINS_10shared_ptrIS9_EEEEPFNS_3argILi1EEEvEPFNSK_ILi2EEEvEEEEEEclISB_mEEvRKT_RKT0_+0xd9)[0x472439]
./UpdateServer/build/UpdateServer(_ZN5boost4asio6detail18completion_handlerINS1_17rewrapped_handlerINS1_7binder2INS1_15wrapped_handlerINS0_10io_service6strandENS_3_bi6bind_tIvNS_4_mfi3mf2Iv10connectionRKNS_6system10error_codeEmEENS8_5list3INS8_5valueINS_10shared_ptrISC_EEEEPFNS_3argILi1EEEvEPFNSN_ILi2EEEvEEEEEEESE_mEESV_EEE11do_completeEPNS1_15task_io_serviceEPNS1_25task_io_service_operationESG_m+0x1e5)[0x4726f5]
./UpdateServer/build/UpdateServer(_ZN5boost4asio6detail14strand_service8dispatchINS1_17rewrapped_handlerINS1_7binder2INS1_15wrapped_handlerINS0_10io_service6strandENS_3_bi6bind_tIvNS_4_mfi3mf2Iv10connectionRKNS_6system10error_codeEmEENS9_5list3INS9_5valueINS_10shared_ptrISD_EEEEPFNS_3argILi1EEEvEPFNSO_ILi2EEEvEEEEEEESF_mEESW_EEEEvRPNS2_11strand_implET_+0x2ad)[0x472aed]
./UpdateServer/build/UpdateServer(_ZN5boost4asio6detail19asio_handler_invokeINS1_7binder2INS1_15wrapped_handlerINS0_10io_service6strandENS_3_bi6bind_tIvNS_4_mfi3mf2Iv10connectionRKNS_6system10error_codeEmEENS7_5list3INS7_5valueINS_10shared_ptrISB_EEEEPFNS_3argILi1EEEvEPFNSM_ILi2EEEvEEEEEEESD_mEES6_SU_EEvRT_PNS4_IT0_T1_EE+0x15f)[0x472d3f]
./UpdateServer/build/UpdateServer(_ZN5boost4asio6detail23reactive_socket_recv_opINS0_17mutable_buffers_1ENS1_15wrapped_handlerINS0_10io_service6strandENS_3_bi6bind_tIvNS_4_mfi3mf2Iv10connectionRKNS_6system10error_codeEmEENS7_5list3INS7_5valueINS_10shared_ptrISB_EEEEPFNS_3argILi1EEEvEPFNSM_ILi2EEEvEEEEEEEE11do_completeEPNS1_15task_io_serviceEPNS1_25task_io_service_operationESF_m+0xce)[0x472ede]
./UpdateServer/build/UpdateServer(_ZN5boost4asio6detail15task_io_service3runERNS_6system10error_codeE+0x79a)[0x47ceea]
./UpdateServer/build/UpdateServer(_ZN5boost4asio10io_service3runEv+0x25)[0x47d1d5]
/usr/lib/libboost_thread.so.1.48.0(+0xdda9)[0x7fbbdecd4da9]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7fbbdd42ee9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fbbdd737cbd]
The line ./UpdateServer/build/UpdateServer(_ZNSt6vectorISsSaISsEE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPSsS1_EERKSs+0x111)[0x476521]
seems to demangle to std::vector insert iterator, which is why I believe my program is crashing on the shared vector insert(I do not believe I have any other vectors of strings in my program), however I am fairly positive I am using my vector in a safe way, on both the insert and the read.
So I guess my question is when I am pushing information onto a shared vector, are there race condition problems I have to worry about that would cause my crash? And is the approach I am taking a feasible approach, or should I rethink my design in someway? Please let me know if you need any more information, I will be glad to provide anything I can.
Thank you
std data structures are not thread safe (for the most part) and therefore require additional synchronization if accessed by multiple threads simultaneously. In your case, one thread could be calling push_back while another thread is calling erase. This will produce undefined behavior. To fix this, both the push_back and the erase need to be protected by the same lock. I recommend you google for thread safety in the c++ standard library and read more about it.
Also, using a vector here is probably not the best choice. You should look into std::queue instead. When you erase the first element of the vector is has to copy all of the strings later in the vector down one, which can be very expensive. queue does not suffer from this problem.
Related
First off, it's been a while since I've used any sort of mutex or semaphore, so go easy on me.
I have implemented a generic logging class that right now only receives a message from other classes and prepends that message with date/time and the level of debug, and then prints the message to stdout.
I would like to implement some sort of queue or buffer that will hold many messages that are sent to the logging class and then write them to a file.
The problem that I'm running into is I can't decide how/where to protect the queue.
Below is some pseudo-code of what I've come up with so far:
logMessage(char *msg, int debugLevel){
formattedMsg = formatMsg(msg, debugLevel) //formats the msg to include date/time & debugLevel
lockMutext()
queue.add(formattedMsg)
unlockMutex()
}
wrtieToFile(){
if (isMessageAvailable()) { //would check to see if there is a message in the queue
lockMutext()
file << queue.getFirst() //would append file with the first available msg from the queue
unlockMutex()
}
}
My questions are:
Do I really need to use the mutex in both places?
Is a mutex really what I'm looking for?
I'm thinking I may need a thread for the writing to the file part - does that sound like a good idea?
FYI I looking for a way to do this without using Boost or any 3rd party library.
EDIT The intended platform is Linux.
EDIT 2 Moved formatMsg to before the mutex lock (thank you #Paul Rubel)
With respect to do you really need the mutex. Think what could happen if you didn't lock things. Unless your queue is thread-safe you probably need to protect both insertion and removal.
Imagine execution contexts changing as you are removing the first element. The add could find the queue in a inconsistent state, and then who knows what could happen.
Regarding creating the message, unless formatMsg makes use of shared resources you can probably more it out of the locked section, which can increase your parallelism.
Extracting the writing to file into its own thread sounds like a reasonable choice, that way the logging threads will not have to make the calls themselves.
correct me if i'm wrong. Multiple callers from multiple threads all trying to access the same resource concurrently.
Maybe you could just have one mutex wrapping the entirety of your logging functionality.
watch out for race conditions.
Edit
Readers take a look at the comments to this answer for some valuable discussion
You can define a global variable which contains the number of element present in the queue or buffer. That means you need to increment or decrement this variable while adding data or removing data from buffer or queue. So you keep this variable inside a mutex for your above logging framework.
Shared memory is giving me a hard time and GDB isn't being much help. I've got 32KB of shared memory allocated, and I used shmat to cast it to a pointer to a struct containing A) a bool and B) a queue of objects containing one std::string, three ints, and one bool, plus assorted methods. (I don't know if this matryoshka structure is how you're supposed to do it, but it's the only way I know. Using a message queue isn't an option, and I need to use multiple processes.)
Pushing one object onto the queue works, but when I try to push a second, the program freezes. No error message, no nothing. What's causing this? I doubt it's a lack of memory, but if it is, how much do I need?
EDIT: In case I was unclear -- the objects in the queue are of a class with the five data members described.
EDIT 2: I changed the class of the queue's entries so that it doesn't use std::string. (Embarrassingly enough, I was able to represent the data with a primitive.) The program still freezes on the second push().
EDIT 3: I tried calling front() from the same queue immediately after the first push(), and it froze the program too. Checking the value of the bool outside the queue, however, worked fine, so it's gotta be something wrong with the queue itself.
EDIT 4: As an experiment, I added an std::queue<int> to the struct I was using for the shared memory. It showed the same behavior -- push() worked once, then front() made it freeze. So it's not a problem with the class I'm using for the queue items, either.
This question suggests I'm not likely to solve this with std::queue. Is that so? Should I use boost like it says? (In my case, I'm executing shmget() and shmat() in the parent process and trying to let two child processes communicate, so it's slightly different.)
EDIT 5: The other child process also freezes when it calls front(). A semaphore ensures this happens after the first push() call.
Putting std::string objects into a shared memory segment can't possibly work.
It should work fine for a single process, but as soon as you try to access it from a second process, you'll get garbage: the string will contain a pointer to heap-allocated data, and that pointer is only valid in the process that allocated it.
I don't know why your program freezes, but it is completely pointless to even think about.
As I said in my comment, your problem stems from attempting to use objects that internally require heap allocation in a structure, which should be self contained (i.e. requires no further dynamically allocated memory).
I would tweak your setup, and change the std::string to some fixed size character array, something like
// this structure fits nicely into a typical cache line
struct Message
{
boost::array<char, 48> some_string;
int a, b, c;
bool c;
};
Now, when you need to post something on the queue, copy the string content into some_string. Of course you should size your strings appropriately (and boost::array probably isn't the best - ideally you want some length information too) but you get the idea...
I've been looking for causes for deadlocks and strategies/tools to avoid and detect them.
Another potential cause for deadlocks is to have blocking functions calling other blocking functions in a circular way, so that eventually a call never returns.
Sometimes this is hard to discover, specially in very large projects.
So, are there any tools/libraries/techiques that allow to automate the detection of circular calls in a program?
EDIT:
I code mostly in C and C++ so, if possible, give any information about the topic that is applicable to those languages.
Nevertheless, it seems this topic is scarcely covered in SO, so answers for other languages are ok too. although maybe those deserve a topic of its own if someone finds it relevant
Thanks.
Circular (or recursive) calls that try to acquire the same non-reentrant lock are one of the easiest to debug blocking scenarios: locking is deterministic, and can be easily checked. When the application locks, fire up the debugger and look at the stack trace to understand what locks are held and why.
As to general solutions for the problem of locking... you can look into some libraries that provide mutex ordering, and detect when you are trying to lock on a mutex out of order. This type of solutions might be complex to implement correctly, but once in place it ensures that you cannot enter a deadlock condition, as it forces all processes to obtain the locks in the same order (i.e. if process A holds lock La, and it tries to acquire lock Lb for which the ordering is correct, then it can either succeed or lock, but whichever process is holding lock Lb cannot try to lock La as the ordering constraint would not be met).
If you are on Linux there 2 Valgrind tools for detecting deadlocks and race conditions: Helgrind, DRD. They both complement each other and it's worth to check for thread errors by both of them.
In linux you can use valgrind to detect deadlocks, use --tool=helgrind.
Best way to detect deadlocks (IMO) is to make a test program that calls all the functions in a random order in like 30 different threads 10000s of times.
If you get a deadlock you can use VS2010 "Parallel Stacks" window. Debug->Windows->Parallel Stacks
This window will show you all the stacks, so you can find the methods that are deadlocking.
A simple strategy I use to write thread-safe objects:
A thread safe object should be safe when its public methods are called, so you don't get deadlocks when it is used.
So, the idea is to just lock all the public methods that access the object's data.
Besides that you need to insure that within the class' code you never call a public method. If you need to use one of the public methods, then make that method private, and wrap the private method with a public method that locks and then calls it.
If you want better lock granularity you could just create objects for each part that has its own lock, and lock it like I suggested. Then use encapsulation to combine those classes to the one class.
Example:
class Blah {
MyData data;
Lock lock;
public:
DataItem GetData(int index)
{
ReadLock read(lock);
return LocalGetData(index);
}
DataItem FindData(string key)
{
ReadLock read(lock);
DataItem item;
//find the item, can use LocalGetData() to get the item without deadlocking
return item;
}
void PutData(DataItem item)
{
ReadLock write(lock);
//put item in database
}
private:
DataItem LocalGetData(int index)
{
return data[index];
}
}
You could find a tool that builds a call graph, and check the graph for cycles.
Otherwise, there are a number of strategies for detecting deadlocks or other circularities, but they all depend on having some sort of supporting infrastructure in place.
There are deadlock avoidance strategies, having to do with assigning lock priorities and ordering the locks according to priority. These require code changes and enforcing the standards, though.
I have std::list<Info> infoList in my application that is shared between two threads. These 2 threads are accessing this list as follows:
Thread 1: uses push_back(), pop_front() or clear() on the list (Depending on the situation)
Thread 2: uses an iterator to iterate through the items in the list and do some actions.
Thread 2 is iterating the list like the following:
for(std::list<Info>::iterator i = infoList.begin(); i != infoList.end(); ++i)
{
DoAction(i);
}
The code is compiled using GCC 4.4.2.
Sometimes ++i causes a segfault and crashes the application. The error is caused in std_list.h line 143 at the following line:
_M_node = _M_node->_M_next;
I guess this is a racing condition. The list might have changed or even cleared by thread 1 while thread 2 was iterating it.
I used Mutex to synchronize access to this list and all went ok during my initial test. But the system just freezes under stress test making this solution totally unacceptable. This application is a real-time application and i need to find a solution so both threads can run as fast as possible without hurting the total applications throughput.
My question is this:
Thread 1 and Thread 2 need to execute as fast as possible since this is a real-time application. what can i do to prevent this problem and still maintain the application performance? Are there any lock-free algorithms available for such a problem?
Its ok if i miss some newly added Info objects in thread 2's iteration but what can i do to prevent the iterator from becoming a dangling pointer?
Thanks
Your for() loop can potentially keep a lock for a relatively long time, depending on how many elements it iterates. You can get in real trouble if it "polls" the queue, constantly checking if a new element became available. That makes the thread own the mutex for an unreasonably long time, giving few opportunities to the producer thread to break in and add an element. And burning lots of unnecessary CPU cycles in the process.
You need a "bounded blocking queue". Don't write it yourself, the lock design is not trivial. Hard to find good examples, most of it is .NET code. This article looks promising.
In general it is not safe to use STL containers this way. You will have to implement specific method to make your code thread safe. The solution you chose depends on your needs. I would probably solve this by maintaining two lists, one in each thread. And communicating the changes through a lock free queue (mentioned in the comments to this question). You could also limit the lifetime of your Info objects by wrapping them in boost::shared_ptr e.g.
typedef boost::shared_ptr<Info> InfoReference;
typedef std::list<InfoReference> InfoList;
enum CommandValue
{
Insert,
Delete
}
struct Command
{
CommandValue operation;
InfoReference reference;
}
typedef LockFreeQueue<Command> CommandQueue;
class Thread1
{
Thread1(CommandQueue queue) : m_commands(queue) {}
void run()
{
while (!finished)
{
//Process Items and use
// deleteInfo() or addInfo()
};
}
void deleteInfo(InfoReference reference)
{
Command command;
command.operation = Delete;
command.reference = reference;
m_commands.produce(command);
}
void addInfo(InfoReference reference)
{
Command command;
command.operation = Insert;
command.reference = reference;
m_commands.produce(command);
}
}
private:
CommandQueue& m_commands;
InfoList m_infoList;
}
class Thread2
{
Thread2(CommandQueue queue) : m_commands(queue) {}
void run()
{
while(!finished)
{
processQueue();
processList();
}
}
void processQueue()
{
Command command;
while (m_commands.consume(command))
{
switch(command.operation)
{
case Insert:
m_infoList.push_back(command.reference);
break;
case Delete:
m_infoList.remove(command.reference);
break;
}
}
}
void processList()
{
// Iterate over m_infoList
}
private:
CommandQueue& m_commands;
InfoList m_infoList;
}
void main()
{
CommandQueue commands;
Thread1 thread1(commands);
Thread2 thread2(commands);
thread1.start();
thread2.start();
waitforTermination();
}
This has not been compiled. You still need to make sure that access to your Info objects is thread safe.
I would like to know what is the purpose of this list, it would be easier to answer the question then.
As Hoare said, it is generally a bad idea to try to share data to communicate between two threads, rather you should communicate to share data: ie messaging.
If this list is modelling a queue, for example, you might simply use one of the various ways to communicate (such as sockets) between the two threads. Consumer / Producer is a standard and well-known problem.
If your items are expensive, then only pass the pointers around during communication, you'll avoid copying the items themselves.
In general, it's exquisitely difficult to share data, although it is unfortunately the only way of programming we hear of in school. Normally only low-level implementation of "channels" of communication should ever worry about synchronization and you should learn to use the channels to communicate instead of trying to emulate them.
To prevent your iterator from being invalidated you have to lock the whole for loop. Now I guess the first thread may have difficulties updating the list. I would try to give it a chance to do its job on each (or every Nth iteration).
In pseudo-code that would look like:
mutex_lock();
for(...){
doAction();
mutex_unlock();
thread_yield(); // give first thread a chance
mutex_lock();
if(iterator_invalidated_flag) // set by first thread
reset_iterator();
}
mutex_unlock();
You have to decide which thread is the more important. If it is the update thread, then it must signal the iterator thread to stop, wait and start again. If it is the iterator thread, it can simply lock the list until iteration is done.
The best way to do this is to use a container that is internally synchronized. TBB and Microsoft's concurrent_queue do this. Anthony Williams also has a good implementation on his blog here
Others have already suggested lock-free alternatives, so I'll answer as if you were stuck using locks...
When you modify a list, existing iterators can become invalidated because they no longer point to valid memory (the list automagically reallocates more memory when it needs to grow). To prevent invalidated iterators, you could make the producer block on a mutex while your consumer traverses the list, but that would be needless waiting for the producer.
Life would be easier if you used a queue instead of a list, and have your consumer use a synchronized queue<Info>::pop_front() call instead of iterators that can be invalidated behind your back. If your consumer really needs to gobble chunks of Info at a time, then use a condition variable that'll make your consumer block until queue.size() >= minimum.
The Boost library has a nice portable implementation of condition variables (that even works with older versions of Windows), as well as the usual threading library stuff.
For a producer-consumer queue that uses (old-fashioned) locking, check out the BlockingQueue template class of the ZThreads library. I have not used ZThreads myself, being worried about lack of recent updates, and because it didn't seem to be widely used. However, I have used it as inspiration for rolling my own thread-safe producer-consumer queue (before I learned about lock-free queues and TBB).
A lock-free queue/stack library seems to be in the Boost review queue. Let's hope we see a new Boost.Lockfree in the near future! :)
If there's interest, I can write up an example of a blocking queue that uses std::queue and Boost thread locking.
EDIT:
The blog referenced by Rick's answer already has a blocking queue example that uses std::queue and Boost condvars. If your consumer needs to gobble chunks, you can extend the example as follows:
void wait_for_data(size_t how_many)
{
boost::mutex::scoped_lock lock(the_mutex);
while(the_queue.size() < how_many)
{
the_condition_variable.wait(lock);
}
}
You may also want to tweak it to allow time-outs and cancellations.
You mentioned that speed was a concern. If your Infos are heavyweight, you should consider passing them around by shared_ptr. You can also try making your Infos fixed size and use a memory pool (which can be much faster than the heap).
As you mentioned that you don't care if your iterating consumer misses some newly-added entries, you could use a copy-on-write list underneath. That allows the iterating consumer to operate on a consistent snapshot of the list as of when it first started, while, in other threads, updates to the list yield fresh but consistent copies, without perturbing any of the extant snapshots.
The trade here is that each update to the list requires locking for exclusive access long enough to copy the entire list. This technique is biased toward having many concurrent readers and less frequent updates.
Trying to add intrinsic locking to the container first requires you to think about which operations need to behave in atomic groups. For instance, checking if the list is empty before trying to pop off the first element requires an atomic pop-if-not-empty operation; otherwise, the answer to the list being empty can change in between when the caller receives the answer and attempts to act upon it.
It's not clear in your example above what guarantees the iteration must obey. Must every element in the list eventually be visited by the iterating thread? Does it make multiple passes? What does it mean for one thread to remove an element from the list while another thread is running DoAction() against it? I suspect that working through these questions will lead to significant design changes.
You're working in C++, and you mentioned needing a queue with a pop-if-not-empty operation. I wrote a two-lock queue many years ago using the ACE Library's concurrency primitives, as the Boost thread library was not yet ready for production use, and the chance for the C++ Standard Library including such facilities was a distant dream. Porting it to something more modern would be easy.
This queue of mine -- called concurrent::two_lock_queue -- allows access to the head of the queue only via RAII. This ensures that acquiring the lock to read the head will always be mated with a release of the lock. A consumer constructs a const_front (const access to head element), a front (non-const access to head element), or a renewable_front (non-const access to head and successor elements) object to represent the exclusive right to access the head element of the queue. Such "front" objects can't be copied.
Class two_lock_queue also offers a pop_front() function that waits until at least one element is available to be removed, but, in keeping with std::queue's and std::stack's style of not mixing container mutation and value copying, pop_front() returns void.
In a companion file, there's a type called concurrent::unconditional_pop, which allows one to ensure through RAII that the head element of the queue will be popped upon exit from the current scope.
The companion file error.hh defines the exceptions that arise from use of the function two_lock_queue::interrupt(), used to unblock threads waiting for access to the head of the queue.
Take a look at the code and let me know if you need more explanation as to how to use it.
If you're using C++0x you could internally synchronize list iteration this way:
Assuming the class has a templated list named objects_, and a boost::mutex named mutex_
The toAll method is a member method of the list wrapper
void toAll(std::function<void (T*)> lambda)
{
boost::mutex::scoped_lock(this->mutex_);
for(auto it = this->objects_.begin(); it != this->objects_.end(); it++)
{
T* object = it->second;
if(object != nullptr)
{
lambda(object);
}
}
}
Calling:
synchronizedList1->toAll(
[&](T* object)->void // Or the class that your list holds
{
for(auto it = this->knownEntities->begin(); it != this->knownEntities->end(); it++)
{
// Do something
}
}
);
You must be using some threading library. If you are using Intel TBB, you can use concurrent_vector or concurrent_queue. See this.
If you want to continue using std::list in a multi-threaded environment, I would recommend wrapping it in a class with a mutex that provides locked access to it. Depending on the exact usage, it might make sense to switch to a event-driven queue model where messages are passed on a queue that multiple worker threads are consuming (hint: producer-consumer).
I would seriously take Matthieu's thought into consideration. Many problems that are being solved using multi-threaded programming are better solved using message-passing between threads or processes. If you need high throughput and do not absolutely require that the processing share the same memory space, consider using something like the Message-Passing Interface (MPI) instead of rolling your own multi-threaded solution. There are a bunch of C++ implementations available - OpenMPI, Boost.MPI, Microsoft MPI, etc. etc.
I don't think you can get away without any synchronisation at all in this case as certain operation will invalidate the iterators you are using. With a list, this is fairly limited (basically, if both threads are trying to manipulate iterators to the same element at the same time) but there is still a danger that you'll be removing an element at the same time you're trying to append one to it.
Are you by any chance holding the lock across DoAction(i)? You obviously only want to hold the lock for the absolute minimum of time that you can get away with in order to maximise the performance. From the code above I think you'll want to decompose the loop somewhat in order to speed up both sides of the operation.
Something along the lines of:
while (processItems) {
Info item;
lock(mutex);
if (!infoList.empty()) {
item = infoList.front();
infoList.pop_front();
}
unlock(mutex);
DoAction(item);
delayALittle();
}
And the insert function would still have to look like this:
lock(mutex);
infoList.push_back(item);
unlock(mutex);
Unless the queue is likely to be massive, I'd be tempted to use something like a std::vector<Info> or even a std::vector<boost::shared_ptr<Info> > to minimize the copying of the Info objects (assuming that these are somewhat more expensive to copy compared to a boost::shared_ptr. Generally, operations on a vector tend to be a little faster than on a list, especially if the objects stored in the vector are small and cheap to copy.
I'm wondering if this is the right approach to writing a thread-safe queue in C++?
template <class T>
class Queue
{
public:
Queue() {}
void Push(T& a)
{
m_mutex.lock();
m_q.push_back(a);
m_mutex.unlock();
}
T& Pop()
{
m_mutex.lock();
T& temp = m_q.pop();
m_mutex.unlock();
return temp;
}
private:
std::queue<t> m_q;
boost::mutex m_mutex;
};
You get the idea... I'm just wondering if this is the best approach. Thanks!
EDIT:
Because of the questions I'm getting, I wanted to clarify that the mutex is a boost::mutex
I recommend using the Boost threading libraries to assist you with this.
Your code is fine, except that when you write code in C++ like
some_mutex.lock();
// do something
some_mutex.unlock();
then if the code in the // do something section throws an exception then the lock will never be released. The Boost library solves this with its classes such as lock_guard in which you initialize an object which acquires a lock in its constructor, and whose destructor releases the lock. That way you know that your lock will always be released. Other languages accomplish this through try/finally statements, but C++ doesn't support this construct.
In particular, what happens when you try to read from a queue with no elements? Does that throw an exception? If so, then your code would run into problems.
When trying to get the first element, you probably want to check if something is there, then go to sleep if not and wait until something is. This is a job for a condition object, also provided by the Boost library, though available at a lower level if you prefer.
Herb Sutter wrote an excellent article last year in Dr. Dobbs Journal, covering all of the major concerns for a thread-safe, lock-free, single-producer, single-consumer queue implementation. (Which made corrections over an implementation published the previous month.)
His followup article in the next issue tackled a more generic approach for a multi-user concurrent queue, along with a full discussion of potential pitfalls and performance issues.
There are a few more articles on similar concurrency topics.
Enjoy.
From a threading-point of view, that looks about right for a simple, thread-safe queue.
You do have one problem, though: std::queue's pop() does not return the element popped from the queue. What you need to do is:
T Pop()
{
m_mutex.lock();
T temp = m_q.front();
m_q.pop();
m_mutex.unlock();
return temp;
}
You don't want to return a reference in this case since the referenced element is being popped from the queue and destroyed.
You also need to have some public Size() function to tell you how many elements are in the queue (either that, or you'll need to gracefully handle the case where Pop() is called and there are no elements in the queue).
Edit: Though, as Eli Courtwright points out, you do have to be careful with the queue operations throwing exceptions, and using Boost is a good idea.
Depends what are your goals. Crafted in this manner, you'll have your "reader" client blocking your "writer" client. You might want to consider using a "condition" to avoid dead-locks etc.
The approach that your are trying to implement is a locking approach. It will work, except that if you use a plain system-provided "mutex" object, it's performance might turn out disappointing (its lock-unlock overhead is pretty high). It is hard to say whether it will be good or not, since we don't know what your performance requirements and expectations are.
Since the operations you perform in "locked" segments of your code are rather quick, it might make sense to use a spin-lock instead of a true mutex, or a combination of the two. This will give you much better performance. Then again, maybe your "mutex" already implements all that (no way to know, since you provided no details about what is actually hiding behind that name).
And finally, if you are happen to be looking for best performance, you might want to read up on lock-free synchronization approach, which is a completely different story. Lock-free methods are typically much more difficult to implement though.
As Jean-Lou Dupont pointed out, your current code is quite prone to deadlocks. When I've done this, I've used three locks: two counted semaphores and one mutex. The counted semaphores signal when:there's space available to insert an objectThere at least one object to retrieve from the queue
The mutex is used only while actually putting an item in the queue or retrieving an item from the queue -- but no attempt is ever made at locking the mutex until we know that the insertion or retrieval will be able to succeed immediately.