I want to synchronize producer and consumer processes. I use a cyclic buffer without mutexes or semaphores.
This solution allows me to avoid waiting on synchronization.
I use a cyclic buffer in shared memory for the data sharing.
Now I want to allow the consumer process to sleep while the producer has not produced new data.
A regular named semaphore doesn't do the job for me, because it has an integer value. I need some kind of binary semaphore that does not increment, but just signals the consumer whether there is new data or not.
I.e. the producer should not increment the semaphore value, but just set it to 1.
I'm writing in C++ for Linux.
Let's say I am on CentOS 7 x86_64 + GCC 7.
I would like to create a ringbuffer in shared memory.
I have two processes, Producer and Consumer, both sharing a named shared memory region created/accessed through shm_open() + mmap().
If Producer writes something like:
    struct Data {
        uint64_t length;
        char data[100];
    };
to the shared memory at a random time, while the Consumer is constantly polling the shared memory to read, will I have some sort of synchronization issue where the member length is seen but the member data is still in the process of being written? If yes, what's the most efficient technique to avoid the issue?
I see this post:
Shared-memory IPC synchronization (lock-free)
But I would like to get a deeper, more low level of understanding what's required to synchronize between two processes efficiently.
Thanks in advance!
To avoid this, you want to publish the data through std::atomic with acquire-release memory ordering, e.g. an atomic flag or sequence counter stored alongside the payload in shared memory. A store with memory_order_release guarantees that all of the writer's earlier stores become visible before the flag itself, and a load with memory_order_acquire guarantees that none of the reader's subsequent loads are reordered ahead of reading the flag. On weakly ordered processors the compiler enforces this by inserting memory fences.
There are, in addition, locking primitives in POSIX, but the <atomic> header is newer and what you probably want.
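As a concrete illustration, here is a minimal single-slot sketch of that pattern; Slot, publish, and try_consume are made-up names, and error handling plus the actual shm_open()/mmap() setup are omitted:

    #include <atomic>
    #include <cstdint>
    #include <cstring>

    // This struct is placed in the shm_open()+mmap() region.
    // 'seq' is the publication flag: 0 = empty, 1 = data valid.
    struct Slot {
        std::atomic<uint64_t> seq;    // must be lock-free to be address-free
        uint64_t length;
        char data[100];
    };
    // C++17; verifies the atomic is usable across processes
    static_assert(std::atomic<uint64_t>::is_always_lock_free,
                  "need lock-free atomics for shared memory");

    // Producer: fill the payload first, then publish with a release store.
    void publish(Slot* s, const char* src, uint64_t n) {
        std::memcpy(s->data, src, n);
        s->length = n;
        s->seq.store(1, std::memory_order_release);  // earlier stores visible first
    }

    // Consumer: poll the flag with an acquire load; read the payload
    // only after the flag has been observed.
    bool try_consume(Slot* s, char* dst, uint64_t* n) {
        if (s->seq.load(std::memory_order_acquire) != 1)
            return false;
        *n = s->length;
        std::memcpy(dst, s->data, *n);
        return true;
    }

A real ring buffer would use a monotonically increasing sequence counter per slot rather than a single 0/1 flag, but the acquire-release pairing is the same.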
What the Standard Says
From [atomics.lockfree], emphasis added:
Operations that are lock-free should also be address-free. That is, atomic operations on the same memory location via two different addresses will communicate atomically. The implementation should not depend on any per-process state. This restriction enables communication by memory that is mapped into a process more than once and by memory that is shared between two processes.
For lockable atomics, the standard says in [thread.req.lockable.general], emphasis added:
An execution agent is an entity such as a thread that may perform work in parallel with other execution agents. [...] Implementations or users may introduce other kinds of agents such as processes [....]
You will sometimes see the claim that the standard supposedly makes no mention of using the <atomic> primitives with memory shared between processes, only threads. This is incorrect.
However, passing pointers to the other process through shared memory will not work, as the shared memory may be mapped to different parts of the address space, and of course a pointer to any object not in shared memory is right out. Indices and offsets of objects within shared memory will. (Or, if you really need pointers, Boost provides IPC-safe wrappers.)
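For example, here is one hedged way to translate between pointers and offsets; to_offset/from_offset are hypothetical helpers, and base is whatever address mmap() returned in the current process:

    #include <cstdint>

    struct Data { uint64_t length; char data[100]; };

    // 'base' is whatever address mmap() returned in this process;
    // it generally differs between Producer and Consumer.
    uint64_t to_offset(void* base, Data* obj) {
        return static_cast<uint64_t>(
            reinterpret_cast<char*>(obj) - static_cast<char*>(base));
    }

    Data* from_offset(void* base, uint64_t offset) {
        return reinterpret_cast<Data*>(static_cast<char*>(base) + offset);
    }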
Yes, you will ultimately run into data races: not only can length be read before data is written, but parts of those members can also be written out of sync with your process's reads.
Although lock-free is the new trend, I'd suggest going with a simpler tool for your first IPC sync job: the semaphore. On Linux, the following man pages will be useful:
sem_init
sem_wait
sem_post
The idea is to have each process signal to the other that it is currently reading or writing the shared memory segment. With a semaphore, you can write inter-process mutexes:
Producer:

    while true:
        (optionally) create resource
        lock semaphore (sem_wait)
        copy resource to shm
        unlock semaphore (sem_post)

Consumer:

    while true:
        lock semaphore (sem_wait)
        copy resource to local memory, or crunch resource
        unlock semaphore (sem_post)
If, for instance, Producer is writing into shm while Consumer calls sem_wait, Consumer will block until Producer calls sem_post; but you have no guarantee Producer won't go for another loop, writing two times in a row before Consumer is woken up. You have to build a mechanism to ensure Producer and Consumer work in alternation.
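One common mechanism for that is a pair of semaphores, one counting free slots and one counting filled slots. A minimal sketch, assuming a single-slot segment; Shared and payload are made-up names, and error checking is omitted:

    #include <cstring>
    #include <semaphore.h>

    // Lives inside the shared memory segment (hypothetical layout).
    struct Shared {
        sem_t empty;          // "slot is free"  -- initial value 1
        sem_t full;           // "data is ready" -- initial value 0
        char  payload[256];
    };

    // Run once by whichever process creates the segment.
    void init(Shared* shm) {
        sem_init(&shm->empty, /*pshared=*/1, 1);   // pshared=1: cross-process
        sem_init(&shm->full,  /*pshared=*/1, 0);
    }

    void producer_loop(Shared* shm) {
        for (;;) {
            // ... create resource ...
            sem_wait(&shm->empty);                 // wait until the slot is free
            std::strcpy(shm->payload, "resource"); // copy resource to shm
            sem_post(&shm->full);                  // hand it to the consumer
        }
    }

    void consumer_loop(Shared* shm) {
        for (;;) {
            sem_wait(&shm->full);                  // wait until data is ready
            // ... copy shm->payload to local memory, or crunch it ...
            sem_post(&shm->empty);                 // give the slot back
        }
    }

Because sem_init is called with pshared=1 and the semaphores live inside the mapped segment, both processes operate on the same semaphores; Producer can never write twice in a row, because its second sem_wait(&shm->empty) blocks until Consumer posts the slot back.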
Using C++, I'm planning to have a producer process write a data vector and then several consumer processes read that data. There will be a shared memory segment (Boost::Interprocess) where the data vector will be stored. The issue is: I have no control over the order in which the processes will be launched by a third-party application; a consumer might launch before the producer has produced any data. What mechanisms are available to coordinate the processes, so that the consumers can be made to wait patiently until the producer signals the data is ready, no matter the order in which the processes launch?
I guess a named semaphore is a good choice. Your producer and consumer applications should agree on (hard-code) the semaphore's name, something like /mySem (sem_open() names have the form /somename), and only the producer must create and post the semaphore, while the consumers wait for the semaphore's existence and state.
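A minimal sketch of that arrangement, assuming one consumer and omitting error checks; the name /mySem and the 100 ms retry interval are arbitrary choices:

    #include <fcntl.h>      // O_CREAT
    #include <semaphore.h>
    #include <unistd.h>     // usleep

    // Producer: create the semaphore with an initial value of 0, fill the
    // shared memory, then post to signal "data is ready".
    void producer_signal() {
        sem_t* sem = sem_open("/mySem", O_CREAT, 0644, 0);
        // ... write the data vector into the shared memory segment ...
        sem_post(sem);
    }

    // Consumer: keep trying to open the same name until the producer has
    // created it, then block until it is posted.
    void consumer_wait() {
        sem_t* sem;
        while ((sem = sem_open("/mySem", 0)) == SEM_FAILED)
            usleep(100 * 1000);   // not created yet; retry every 100 ms
        sem_wait(sem);
        // ... data is ready to read ...
    }

Note that each sem_post wakes exactly one waiter, so with several consumers the producer would post once per consumer (or use a different signalling scheme).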
If creating the shared memory is the responsibility of the producer process, then you can use a Boost barrier to synchronize startup.
You can create a barrier for the shared-memory creation, and perhaps for some number of jobs to deploy; after reaching this barrier, the consumer processes can continue and process them.
You can look at the details of the Boost barrier in the Boost documentation.
I have 6 threads running in my application continuously. The scenario is:
One thread continuously gets messages and inserts them into a message queue. The other 4 threads can be considered workers, which continuously fetch messages from the queue and process them. The final thread populates the analytics information.
Problem:
The sleep duration for the message-fetching thread is 100 ms; for the worker threads it is 200 ms. When I ran this application, the fetch thread took control and kept inserting into the queue, growing the heap. The worker threads did not get a chance to process the messages and deallocate them. Eventually this results in out-of-memory.
How do I manage this kind of scenario so that equal opportunity is given to the fetch thread and the worker threads?
Thanks in advance :)
You need to add back-pressure to your producer thread. Usually this is done with blocking producer-consumer queues. The producer adds items to the queue; consumers dequeue items and process them. If the queue is empty, consumers block until the producer adds something; if the queue is full, the producer blocks until consumers fetch items from the queue.
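A minimal sketch of such a blocking bounded queue, built from std::mutex and std::condition_variable (BoundedQueue is a hypothetical helper, not a library class):

    #include <condition_variable>
    #include <cstddef>
    #include <mutex>
    #include <queue>

    template <typename T>
    class BoundedQueue {
    public:
        explicit BoundedQueue(std::size_t cap) : cap_(cap) {}

        void push(T item) {                    // producer blocks while full
            std::unique_lock<std::mutex> lk(m_);
            not_full_.wait(lk, [this] { return q_.size() < cap_; });
            q_.push(std::move(item));
            not_empty_.notify_one();
        }

        T pop() {                              // consumer blocks while empty
            std::unique_lock<std::mutex> lk(m_);
            not_empty_.wait(lk, [this] { return !q_.empty(); });
            T item = std::move(q_.front());
            q_.pop();
            not_full_.notify_one();
            return item;
        }

    private:
        std::mutex m_;
        std::condition_variable not_full_, not_empty_;
        std::queue<T> q_;
        std::size_t cap_;
    };

With this in place, no sleep intervals are needed at all: the fetch thread simply blocks in push() whenever the workers fall behind.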
One system of flow-control that I use often is to create a large pool of message objects at startup and never create any more. The objects are stored on a thread-safe, blocking 'pool queue' and circulated around: popped from the pool by producer/s, queued to consumer/s on other blocking queues, and then pushed back onto the pool queue when 'consumed'.
This caps memory use, provides flow-control (if the pool empties, the producer/s block on it until messages are returned from consumers), and eliminates continual new/delete/malloc/free. The more complex and slower bounded queues are not necessary, and all queues need only be large enough to hold the (known) maximum number of messages.
Using 'classic' blocking queues does not require any Sleep() calls.
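A hedged sketch of the pool idea; Message, MsgQueue, and the pool size of 1000 are all made-up. The pool queue itself needs no bound, because the fixed number of preallocated messages is the flow-control cap:

    #include <condition_variable>
    #include <mutex>
    #include <queue>
    #include <vector>

    struct Message { char payload[1024]; };   // hypothetical fixed-size message

    // Simple blocking queue of Message pointers.
    class MsgQueue {
    public:
        void push(Message* m) {
            { std::lock_guard<std::mutex> lk(m_); q_.push(m); }
            cv_.notify_one();
        }
        Message* pop() {                       // blocks while empty
            std::unique_lock<std::mutex> lk(m_);
            cv_.wait(lk, [this] { return !q_.empty(); });
            Message* m = q_.front();
            q_.pop();
            return m;
        }
    private:
        std::mutex m_;
        std::condition_variable cv_;
        std::queue<Message*> q_;
    };

    std::vector<Message> storage(1000);   // allocate once at startup, never again
    MsgQueue pool;                        // holds idle messages
    MsgQueue work;                        // holds filled messages

    int main() {
        for (auto& m : storage) pool.push(&m);   // preload the pool queue
        // Producer thread:  Message* m = pool.pop();  fill *m;  work.push(m);
        //   -> blocks in pool.pop() once all 1000 messages are in flight.
        // Consumer threads: Message* m = work.pop();  process *m;  pool.push(m);
    }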
Your question is a little vague, so I can give you these guidelines instead of code:
Protect shared data with a mutex. In a multi-threaded producer-consumer problem there is usually a race condition on the shared data (the message in your program): one thread attempts to write to the shared memory location while another tries to read from it. The message read by the reader might be corrupted because the writer wrote over it in the middle of the read. You can guard the shared location with a mutex; each thread must acquire this lock in order to read or modify the shared data. This way the consumer can be absolutely sure the data is not being modified underneath it. However, you should note that holding this lock can hold back the producer thread, so you should release the lock as soon as possible.
Use condition variables to notify consumer threads. If you do not use a notification mechanism, all consumer threads have to actively check for data production, which wastes system resources. The consumer threads can instead go to sleep, knowing that the producer thread will notify them whenever a message is ready.
The threading library in C++11 has everything you need to implement a producer-consumer application. If you are not able to upgrade your compiler, you can use the Boost threading library as well.
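Putting both guidelines together, a minimal C++11 sketch; message_ready and the produce/consume functions are illustrative names:

    #include <condition_variable>
    #include <mutex>
    #include <string>

    std::mutex m;                    // protects both the flag and the message
    std::condition_variable cv;
    bool message_ready = false;
    std::string message;             // the shared data

    // Producer thread: modify the shared message under the lock, then notify.
    void produce(const std::string& msg) {
        {
            std::lock_guard<std::mutex> lk(m);
            message = msg;
            message_ready = true;
        }
        cv.notify_one();             // wake one sleeping consumer
    }

    // Consumer thread: sleep until notified instead of actively polling.
    std::string consume() {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [] { return message_ready; });  // predicate guards against
        message_ready = false;                      // spurious wakeups
        return message;
    }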
You want to use a bounded queue which, when full, will block threads trying to enqueue until more space is available.
You can use concurrent_bounded_queue from TBB, or simply use a semaphore initialized to the maximum queue size, decrementing on enqueue and incrementing on dequeue. boost::thread doesn't provide semaphores natively, but you can implement one using a lock and a condition variable.
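For reference, here is one way to implement such a semaphore from a lock and a condition variable; the sketch uses std:: primitives for brevity, but the boost::thread equivalents are analogous:

    #include <condition_variable>
    #include <mutex>

    class Semaphore {
    public:
        explicit Semaphore(unsigned initial) : count_(initial) {}

        void wait() {                          // decrement; block at zero
            std::unique_lock<std::mutex> lk(m_);
            cv_.wait(lk, [this] { return count_ > 0; });
            --count_;
        }

        void post() {                          // increment; wake one waiter
            {
                std::lock_guard<std::mutex> lk(m_);
                ++count_;
            }
            cv_.notify_one();
        }

    private:
        std::mutex m_;
        std::condition_variable cv_;
        unsigned count_;
    };

    // Usage per the answer: Semaphore slots(MAX_QUEUE_SIZE);
    //   producer: slots.wait() before enqueueing;
    //   consumer: slots.post() after dequeueing.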
I have two processes:
Producer
and
Consumer
they have a commonly mmap()ed shared region of memory:
Memory
Now, Producer writes stuff to Memory. Consumer reads stuff from Memory.
I would prefer Consumer not to spin wait when Memory is empty.
I would prefer Producer not to spin wait when Memory is full.
How do I achieve this?
How about using mutexes? Since a mutex sleeps until the resource is available, you won't experience the spin-wait problem.
This is the classic bounded-buffer producer-consumer problem. If your platform supports it, you could use condition variables shared across multiple processes. With such shared condition variables your Producer could signal your Consumer to read Memory when data is available, and vice versa when Memory is empty. Remember to check for spurious wakeups.
You'd need to check whether the Mac OS X pthread implementation supports condition variables shared across processes. See my answer to your mutex-related question to determine how; the answer applies to condition variables as well.
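On platforms that do support it, the setup looks roughly like this (a sketch; Shared, count, and the init/wait/signal helpers are made-up names, and the struct must live inside the mmap()ed region):

    #include <pthread.h>

    // Must be placed inside the mmap()ed shared region.
    struct Shared {
        pthread_mutex_t mtx;
        pthread_cond_t  cond;
        int             count;   // number of items currently in Memory
    };

    // Run once by whichever process creates the mapping.
    void init_shared(Shared* s) {
        pthread_mutexattr_t ma;
        pthread_mutexattr_init(&ma);
        pthread_mutexattr_setpshared(&ma, PTHREAD_PROCESS_SHARED);
        pthread_mutex_init(&s->mtx, &ma);

        pthread_condattr_t ca;
        pthread_condattr_init(&ca);
        pthread_condattr_setpshared(&ca, PTHREAD_PROCESS_SHARED);
        pthread_cond_init(&s->cond, &ca);
        s->count = 0;
    }

    // Consumer: sleep (no spinning) until the Producer signals data.
    void consumer_wait(Shared* s) {
        pthread_mutex_lock(&s->mtx);
        while (s->count == 0)                 // loop handles spurious wakeups
            pthread_cond_wait(&s->cond, &s->mtx);
        // ... read from Memory ...
        s->count--;
        pthread_mutex_unlock(&s->mtx);
    }

    // Producer: add data, then wake the Consumer.
    void producer_signal(Shared* s) {
        pthread_mutex_lock(&s->mtx);
        // ... write to Memory ...
        s->count++;
        pthread_cond_signal(&s->cond);
        pthread_mutex_unlock(&s->mtx);
    }

The symmetric case (Producer sleeping while Memory is full) adds a capacity check and a second condition variable in the same way.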
I have a lot of data that I want to disseminate to many different threads. This data is coming from a single thread. The consuming threads can safely access the container simultaneously.
The data needs to be merged into the container every delta seconds (50 ms < delta < 1 s), during which time the consuming threads need to be locked out, but not blocked. Similarly, when the data producer wants to merge in the data, it should wait until any reading threads are finished (which should be fast), but no one else should start reading, since the update needs to occur as soon as possible.
I'm working on Linux (a platform-specific solution is perfectly fine/expected) and I care about every millisecond. What sort of locking mechanisms should I use, or is there an even better model for this problem?
If there is only one data producer thread and memory is not a consideration, you may want to consider using a merge and swap algorithm.
In it, the writer thread creates a copy of the data structure while readers continue to use the original, merges in the new changes, then performs an exchange of the two structures within a mutex or critical section (or a reader/writer lock). If your Unix platform supports interlocked exchange as an atomic operation, you can perform a lock-free exchange, maximizing read throughput.
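A hedged sketch of merge-and-swap using the C++11 atomic free functions for shared_ptr (deprecated in C++20 in favour of std::atomic<std::shared_ptr>); Snapshot and the helper names are made-up. Readers that grabbed the old snapshot keep it alive through their shared_ptr, which sidesteps the reclamation problem:

    #include <memory>
    #include <vector>

    using Snapshot = std::vector<int>;   // hypothetical container type

    std::shared_ptr<const Snapshot> current = std::make_shared<Snapshot>();

    // Readers: grab the current snapshot atomically; never blocked by the writer.
    std::shared_ptr<const Snapshot> read_snapshot() {
        return std::atomic_load(&current);
    }

    // Single writer: copy, merge, then swap the pointer in one atomic step.
    void merge_and_swap(const std::vector<int>& updates) {
        auto next = std::make_shared<Snapshot>(*std::atomic_load(&current));
        next->insert(next->end(), updates.begin(), updates.end());  // merge
        std::atomic_store(&current,
                          std::shared_ptr<const Snapshot>(std::move(next)));
    }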
It looks like you need to use pthread read/write locks. They allow you to restrict access to one writer OR multiple readers. Look at pthread_rwlock_init to initialize the lock, pthread_rwlock_rdlock to acquire it for reading, and pthread_rwlock_wrlock to acquire it for writing.
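In outline (a sketch with made-up names; error checking omitted):

    #include <pthread.h>

    pthread_rwlock_t rwlock;
    int shared_data[1024];               // the shared container (made-up)

    void setup() {
        pthread_rwlock_init(&rwlock, nullptr);
    }

    void reader_thread() {
        pthread_rwlock_rdlock(&rwlock);  // many readers may hold this at once
        // ... read shared_data ...
        pthread_rwlock_unlock(&rwlock);
    }

    void writer_thread() {
        pthread_rwlock_wrlock(&rwlock);  // waits for readers, excludes everyone
        // ... merge new data into shared_data ...
        pthread_rwlock_unlock(&rwlock);
    }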
Sounds like a good use for pthread read-write locks along with some thread-safe queues. The producer thread inserts items into the queue. The worker pool will pull items off of the queue and process the data. I'm not sure how the output will work but you might want to use a thread-safe queue here as well... maybe a priority queue to automatically merge the data if it makes sense.
The locked queue construct is nothing more than a mutex for exclusive locking, a std::queue for data storage, and a condition variable to wake up threads that are waiting on the queue. The enqueue method grabs the lock, inserts into the queue, releases the lock, and signals the condition. The dequeue method grabs the mutex, waits on the condition using the mutex as a guard, and dequeues any data that is there when it is woken up. This is a pretty standard producer-consumer style queue.
Before you roll your own solution, you might want to check out Boost.MPI and Boost.Thread. They both provide nicer C++ interfaces over the underlying OS implementation. I've used Boost.Thread a lot; it doesn't provide a nice message-passing interface, but it does improve over pthreads.
If you are really into multi-processing, you might want to give Boost.MPI or maybe Apache Qpid serious consideration. I plan on looking into Qpid and AMQP for future projects, since they both provide nice message-based interfaces.