How can I develop a producer/consumer pattern which is thread-safe?
In my case, the producer runs in one thread and the consumer runs in another thread.
Is std::deque safe for this purpose?
Can I push_back to the back of a deque in one thread and push_front in another thread?
Edit 1
In my case, I know the maximum number of items in the std::deque (for example, 10). Is there any way to reserve enough space for the items beforehand, so that during processing there is no need to change the size of the queue's memory, and so that pushing data onto the back cannot affect the data at the front?
STL C++ containers are not thread-safe: if you decide to use them, you need proper synchronization (basically std::mutex and std::lock_guard) when pushing/popping elements.
Alternatively, you can use properly designed containers (single-producer single-consumer queues should fit your needs); one example of them is here: http://www.boost.org/doc/libs/1_58_0/doc/html/lockfree.html
Addendum after your edit:
Yep, an SPSC queue is basically a ring buffer and definitely fits your needs.
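To illustrate how such a queue works internally, here is a minimal sketch of a fixed-capacity SPSC ring buffer (class and member names are my own invention; for production code, prefer a tested implementation such as boost::lockfree::spsc_queue):

```cpp
#include <atomic>
#include <cstddef>
#include <optional>

// Minimal single-producer/single-consumer ring buffer.
// Safe only when exactly one thread pushes and one thread pops.
template <typename T, std::size_t Capacity>
class SpscRing {
public:
    bool push(const T& value) {              // producer thread only
        std::size_t head = head_.load(std::memory_order_relaxed);
        std::size_t next = (head + 1) % Capacity;
        if (next == tail_.load(std::memory_order_acquire))
            return false;                    // buffer full
        buffer_[head] = value;
        head_.store(next, std::memory_order_release);
        return true;
    }

    std::optional<T> pop() {                 // consumer thread only
        std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire))
            return std::nullopt;             // buffer empty
        T value = buffer_[tail];
        tail_.store((tail + 1) % Capacity, std::memory_order_release);
        return value;
    }

private:
    T buffer_[Capacity];
    std::atomic<std::size_t> head_{0};       // next slot to write
    std::atomic<std::size_t> tail_{0};       // next slot to read
};
```

Note that one slot is kept unused to distinguish "full" from "empty", so a `SpscRing<T, 11>` holds the 10 items from the question.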
How Can I develop a producer/ consumer pattern which is thread safe?
There are several ways, but using locks and monitors is fairly easy to grasp and doesn't have many hidden caveats. The standard library has std::unique_lock, std::lock_guard and std::condition_variable to implement the pattern. Check out the cppreference page on std::condition_variable for a simple example.
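As a sketch of that monitor pattern (the class and member names here are illustrative, not from any library), a minimal thread-safe queue might look like:

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

// Minimal thread-safe queue built from std::mutex and
// std::condition_variable.
template <typename T>
class ThreadSafeQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(value));
        }                          // unlock before notifying
        cv_.notify_one();
    }

    // Blocks until an element is available.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !queue_.empty(); });
        T value = std::move(queue_.front());
        queue_.pop();
        return value;
    }

private:
    std::queue<T> queue_;
    std::mutex mutex_;
    std::condition_variable cv_;
};
```

The predicate form of `wait` handles spurious wakeups for you, which is one of the hidden caveats the lock-and-monitor approach avoids.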
Is std::deque is safe for this purpose?
It's not safe. You need synchronization.
can I push_back to the back of a deque in one thread and push_front in another thread?
Sure, but you need synchronization. There is a race condition when the queue is empty or has only one element, and also when the queue is full or one short of full, in case you want to limit its size.
I think you mean push_back() and pop_front().
std::deque is not thread-safe on its own.
You will need to serialise access using an std::mutex so the consumer isn't trying to pop while the producer is trying to push.
You should also consider how you handle the following:
How does the consumer behave if the deque is empty when it looks for the next item?
If it enters a wait state then you will need a std::condition_variable to be notified by the producer when the deque has been added to.
You may also need to handle program termination in which the consumer is waiting on the deque and the program is terminated. It could be left 'waiting forever' unless you orchestrate things correctly.
10 items is 'piffle', so I wouldn't bother reserving space. std::deque grows and shrinks automatically, so don't bother with fine-grained tuning until you've built a working application.
Premature optimization is the root of all evil.
NB: It's not clear how you're limiting the queue size, but if the producer fills up the queue and then waits for it to drain, you'll need more waits and condition variables going the other way to coordinate.
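To illustrate that two-way coordination, here is a hedged sketch of a bounded queue with two condition variables, one to wake the consumer when the queue becomes non-empty and one to wake the producer when it is no longer full (all names are illustrative):

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>

// Bounded queue: push() blocks when the queue is at capacity,
// pop() blocks when it is empty.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {}

    void push(T value) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push_back(std::move(value));
        not_empty_.notify_one();   // wake a waiting consumer
    }

    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        T value = std::move(items_.front());
        items_.pop_front();
        not_full_.notify_one();    // wake a waiting producer
        return value;
    }

private:
    std::deque<T> items_;
    std::size_t capacity_;
    std::mutex mutex_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
};
```

A capacity of 10 matches the limit mentioned in the question's edit.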
Related
I use concurrency::task from ppltasks.h heavily in my codebase.
I would like to find an awaitable queue, where I can do "co_await my_queue.pop()". Has anyone implemented one?
Details:
I have one producer thread that pushes elements to a queue, and another receiver thread would be waiting and waking up when elements arrive in the queue. This receiving thread might wait/wake up to handle other tasks in the meantime (using pplpp::when_any).
I don't want a queue with an interface where I have to poll a try_pop method, as that is slow, and I don't want a blocking_pop method, as that means I can't handle other ready tasks in the meantime.
This is basically your standard thread-safe queue implementation, but instead of a condition_variable, you will have to use futures to coordinate the different threads. You can then co_await on the future returned by pop to become ready.
The queue's implementation will need to keep a list of the promises that correspond to the outstanding pop calls. If the queue is non-empty when popping, you can return a ready future immediately. You can use a plain old std::mutex to synchronize concurrent access to the underlying data structures.
I don't know of any implementation that already does this, but it shouldn't be too hard to pull off. Note though that managing all the futures will introduce some additional overhead, so your queue will probably be slightly less efficient than the classic condition_variable-based approach.
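As a rough sketch of that idea (this is not a known library; all names are mine), pop() can return a std::future that is either ready immediately or fulfilled by a later push(). A coroutine would then co_await a suitable awaitable wrapped around that future:

```cpp
#include <deque>
#include <future>
#include <mutex>
#include <queue>
#include <utility>

// Queue whose pop() returns a future: ready at once if data is
// queued, fulfilled later by push() otherwise.
template <typename T>
class AwaitableQueue {
public:
    void push(T value) {
        std::promise<T> waiter;
        bool fulfilled = false;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (!waiters_.empty()) {          // hand off to a waiting pop
                waiter = std::move(waiters_.front());
                waiters_.pop_front();
                fulfilled = true;
            } else {
                items_.push(std::move(value));
            }
        }
        if (fulfilled)                        // fulfil outside the lock
            waiter.set_value(std::move(value));
    }

    std::future<T> pop() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!items_.empty()) {                // non-empty: ready future
            std::promise<T> ready;
            ready.set_value(std::move(items_.front()));
            items_.pop();
            return ready.get_future();
        }
        waiters_.emplace_back();              // remember outstanding pop
        return waiters_.back().get_future();
    }

private:
    std::queue<T> items_;
    std::deque<std::promise<T>> waiters_;
    std::mutex mutex_;
};
```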
Posted a comment, but I might as well write this as the answer, since it's long and I need formatting.
Basically, your two options are:
Lock-free queues, the most popular of which is this:
https://github.com/cameron314/concurrentqueue
They do have try_pop, because the queue uses atomic pointers, and any atomic method (e.g. std::atomic_compare_exchange_weak) can and will "fail" and return false at times, so you are forced to have a spin-lock over them.
You may find queues that abstract this away inside a "pop" which just calls "try_pop" until it succeeds, but that's the same overhead in the background.
Lock-based queues:
These are easier to do on your own, without a third-party library: just wrap every method you need in locks. If you want to 'peek' very often, look into using shared_locks; otherwise a std::lock_guard should be enough to guard all the wrappers. However, this is what you might call a 'blocking' queue, since during an access, whether it is a read or a write, the whole queue is locked.
There are no thread-safe alternatives to these two implementations. If you need a really large queue (e.g. hundreds of GBs of memory worth of objects) under heavy usage, you could consider writing a custom hybrid data structure, but for most use cases moodycamel's queue will be more than sufficient.
I have multiple consumer threads and one producer thread. The producer thread writes data into a map belonging to a certain consumer thread and sends a signal to that consumer thread. I am using mutexes around the map when I am inserting and erasing data. However, this approach does not look efficient in terms of speed. Can you suggest another approach instead of a map that requires mutex locks and unlocks? I think the mutex slows down the transmission.
However, this approach does not look efficient in terms of speed. Can you suggest another approach instead of a map that requires mutex locks and unlocks? I think the mutex slows down the transmission.
You should use a profiler to identify where the bottleneck is.
The producer thread writes data into a map belonging to a certain consumer thread and sends a signal to the consumer thread.
The producer should not be concerned with what kind of data structure the consumer uses - that is the consumer's implementation detail. Keep in mind that inserting a value into a map requires a memory allocation (unless you are using a custom allocator), and memory allocation internally takes locks as well, to protect the state of the heap. The end result is that locking a mutex around the map::insert operation may actually hold the lock for too long.
A simpler and more efficient design would be to have an atomic queue between the producer and consumer (e.g. a pipe, or TBB concurrent_bounded_queue, which pre-allocates its storage so that push/pop operations are really quick). Since your producer communicates directly with each consumer, that queue is one-writer-one-reader, and it can be implemented as a wait-free queue (or a ring buffer a la the C++ disruptor).
Andrei Alexandrescu made a good point that you should measure your code (https://www.facebook.com/notes/facebook-engineering/three-optimization-tips-for-c/10151361643253920), and this is the same advice I would give you: measure your code and see what performance difference you get between a baseline test and your test running single-threaded:
Time required to insert data into the map using a single thread, with the above listed data
Time required to insert data into the map using a single thread, with the above listed data and using mutex locks
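The two measurements above can be sketched roughly as follows (function names and the iteration count are illustrative; absolute numbers will vary by machine, so only the relative difference matters):

```cpp
#include <chrono>
#include <map>
#include <mutex>

// Time a piece of work in microseconds.
template <typename F>
long long time_us(F&& work) {
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(
               stop - start).count();
}

// Baseline: n single-threaded map insertions, no locking.
long long bare_insert_us(int n) {
    std::map<int, int> m;
    return time_us([&] {
        for (int i = 0; i < n; ++i) m[i] = i;
    });
}

// Same insertions, but taking a mutex around each insert.
long long locked_insert_us(int n) {
    std::map<int, int> m;
    std::mutex mtx;
    return time_us([&] {
        for (int i = 0; i < n; ++i) {
            std::lock_guard<std::mutex> lock(mtx);  // per-insert lock
            m[i] = i;
        }
    });
}
```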
If you are still looking for a thread-safe container, you may want to look at Intel's open-source implementation of thread-safe containers at http://www.threadingbuildingblocks.org/docs/help/reference/containers_overview/concurrent_queue_cls.htm .
Also, as a suggestion for the consumer thread implementation, you may want to read the ActiveObject article that Herb Sutter posted on his website: http://herbsutter.com/2010/07/12/effective-concurrency-prefer-using-active-objects-instead-of-naked-threads/
If you can provide some more details, like why the map has to be locked all the time, we may be able to draft up a mechanism that is better performing.
I am creating a program that will receive messages from a remote machine and needs to write the messages to a file on disk. The difficulty I am finding lies in the fact that the aim of this program is to test the performance of the library that receives the messages and, therefore, I need to make sure that writing the messages to disk does not affect the performance of the library. The library delivers the messages to the program via a callback function. One other difficulty is that the solution must be platform independent.
What options do I have?
I thought of the following:
using boost::asio to write to the file, but it seems (see this documentation) that asynchronous writing to a file is in the Windows-specific part of this library - so this cannot be used.
Using boost::interprocess to create a message queue, but this documentation indicates that there are 3 methods by which the messages can be sent, and all of them would require the program to block (implicitly or not) if the message queue is full, which I cannot risk.
creating a std::deque<MESSAGES>, pushing onto the deque from the callback function and popping the messages off while writing to the file (on a separate thread), but STL containers are not guaranteed to be thread-safe. I could lock the pushing onto, and popping off of, the deque, but we are talking about 47 microseconds between successive messages, so I would like to avoid locks altogether.
Does anyone have any more ideas on possible solutions?
STL containers may not be thread-safe, but I haven't ever hit one that cannot be used at different times on different threads. Passing ownership to another thread seems safe.
I have used the following a couple of times, so I know it works:
Create a pointer to a std::vector.
Create a mutex to protect the vector pointer.
Use new to create the std::vector and then reserve() a large size for it.
In the receiver thread:
Lock the mutex whenever adding an item to the queue. This should be a short lock.
Add queue item.
Release the lock.
If you feel like it, signal a condition variable. I sometimes don't: it depends on the design. If the volume is very high and there is no pause on the receive side, just skip the condition variable and poll instead.
On the consumer thread (the disk writer):
Go look for work to do by polling or waiting on a condition variable:
Lock the queue mutex.
Look at the queue length.
If there is work in the queue assign the pointer to a variable in the consumer thread.
Use new and reserve() to create a new queue vector and assign it to the queue pointer.
Unlock the mutex.
Go off and write your items to disk.
delete the used-up queue vector.
Now, depending on your problem, you may end up needing a way to block. For example, in one of my programs, if the queue length ever hits 100,000 items, the producing thread just starts doing 1-second sleeps and complaining a lot. It is one of those things that shouldn't happen, yet does, so you should consider it. Without any limits at all it will just use all the memory on the machine and then crash with an exception, get killed by the OOM killer, or just grind to a halt in a swap storm.
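The hand-off described above can be sketched like this (using std::unique_ptr instead of raw new/delete; the struct name, element type and reserve size are all illustrative):

```cpp
#include <memory>
#include <mutex>
#include <vector>

// Receiver appends under a short lock; the writer swaps in a fresh
// vector and processes the full batch outside the lock.
struct BatchQueue {
    std::mutex mutex;
    std::unique_ptr<std::vector<int>> items =
        std::make_unique<std::vector<int>>();

    BatchQueue() { items->reserve(100000); }

    void produce(int value) {                 // receiver thread
        std::lock_guard<std::mutex> lock(mutex);
        items->push_back(value);
    }

    std::unique_ptr<std::vector<int>> take_batch() {  // writer thread
        auto fresh = std::make_unique<std::vector<int>>();
        fresh->reserve(100000);
        std::lock_guard<std::mutex> lock(mutex);
        items.swap(fresh);                    // short critical section
        return fresh;                         // now holds the old batch
    }
};
```

The mutex is held only for a push_back or a pointer swap, so the receiver thread is never blocked while the writer is busy with the disk.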
boost::thread is platform independent, so you should be able to use it to create a thread to do the blocking writes. To avoid having to lock the container every time a message is placed into it by the main thread, you can use a variation on the double-buffering technique by creating nested containers, such as:
std::deque<std::deque<MESSAGES> >
Then only lock the top level deque when a deque full of messages is ready to be added. The writing thread would in turn only lock the top level deque to pop off a deque full of messages to be written.
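A minimal sketch of that nested-container hand-off (std::string stands in for the real MESSAGES type, and all names here are illustrative):

```cpp
#include <deque>
#include <mutex>
#include <string>
#include <utility>

using Message = std::string;   // placeholder for the real MESSAGES type

// The producer batches messages in a private inner deque and only
// locks the outer deque to hand over a full batch; the writer locks
// only to pop a whole batch off.
struct BatchedMessageQueue {
    std::mutex mutex;
    std::deque<std::deque<Message>> batches;

    // Producer side: called once a local batch is full.
    void submit(std::deque<Message>&& batch) {
        std::lock_guard<std::mutex> lock(mutex);
        batches.push_back(std::move(batch));
    }

    // Writer side: returns true and fills 'out' if a batch was ready.
    bool fetch(std::deque<Message>& out) {
        std::lock_guard<std::mutex> lock(mutex);
        if (batches.empty())
            return false;
        out = std::move(batches.front());
        batches.pop_front();
        return true;
    }
};
```

The lock is taken once per batch rather than once per message, which is the point of the double-buffering idea.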
I have a game with two threads. One generates instances of a custom class and needs to store them (I planned to push them into a queue, but I am not sure whether that is thread-safe; the first thread generates a new instance every 50 ms, and the second can read faster if there are any, or slower - the speed changes over time). The other thread, if the queue is not empty, pops the first element and calculates some things. Is there any thread-safe data structure for this problem in the STL or Boost?
Using std::queue or any similar container will not be thread safe. If you want your access (push/pop) to be thread-safe, while using std::queue, you should use boost::mutex or a similar mechanism to lock before each access. You can look at boost::shared_mutex if you need immutable reads from more than one thread (not sure you need that based on what you described).
Apart from that, you can take a look at boost::interprocess::message_queue, as someone has already mentioned -> http://www.boost.org/doc/libs/1_50_0/boost/interprocess/ipc/message_queue.hpp for the most recent version of boost.
Moreover, there is the concept of lock-free queues (en.wikipedia.org/wiki/Non-blocking_algorithm). I cannot provide an example of such an implementation, but I am sure you can find some if you google around.
I am planning to do the following:
store a deque of pre-built objects to be consumed. The main thread might consume these objects here and there. I have another junky thread used for logging and other non-time-critical but expensive things. When the pre-built objects run low, I will refill them from the junky thread.
Now my question is, is there going to be a race condition here? Technically, one thread is consuming objects from the front while another thread is pushing objects onto the back. As long as I don't let the size run down to zero, it should be fine. The only thing that concerns me is the "size" of this deque. Do STL containers store an integer "size" variable? Would modifying that size variable introduce race conditions?
What's the best way of solving this problem? I don't really want to use locks, because the main thread is performance critical (the reason I pre-built these objects in the first place!)
STL containers are not thread safe, period; don't play with this. Specifically, the deque's elements are usually stored in a chain of short arrays, and that chain is modified when operating on the deque, so there's a lot of room for messing things up.
Another option would be to have two deques, one for reading and another for writing. The main thread reads, and the other writes. When the read deque is empty, switch the deques (just swap two pointers), which would involve a lock, but only occasionally.
The consumer thread would drive the switch, so it would only need to take the lock when switching. The producer thread would need to lock on every write in case the switch happens in the middle of a write, but as you mention, the producer is less performance-critical, so no worries there.
What you're suggesting regarding no locks is indeed dangerous as others mention.
As @sharptooth mentioned, STL containers aren't thread-safe. Are you using a C++11-capable compiler? If so, you could implement a lock-free queue using atomic types. Otherwise you'd need to use assembler for compare-and-swap, or use a platform-specific API (see here). See this question for information on how to do this.
I would emphasise that you should measure performance when using standard thread synchronisation and see if you do actually need a lock-free technique.
There will be a data race even with a non-empty deque.
You'll have to protect all accesses (not just writes) to the deque through locks, or use a queue specifically designed for consumer-producer model in multi-threaded environment (such as Microsoft's unbounded_buffer).