Concurrency of processes for mutex

I have to write a daemon that decides the access policy for mutexes (it establishes which process gets a mutex when more than one wants the same mutex, based on whatever criteria).
For that I established some message codes, e.g. L 1 231 (LOCK mtx_id process_pid).
When a process requests a mutex, it writes a code like the one above to a shared memory zone.
The daemon reads it and puts the process pid in a queue. (For every mutex I keep a queue of processes waiting to get it.)
If the mutex is unlocked, the daemon pops the queue and grants the mutex. (It writes the mutex id and the pid of the process that got it into shared memory, so other processes can read it and know who holds the mutex.)
My question is: how do multiple processes request the same mutex? Creating all of them up front and selecting the requesting process manually does not seem like a good option.
Any help is appreciated. Thank you.

Many OSes have a container (a catalog, directory, or registry) of OS objects that can be stored by name. Once stored in the container, an object can be looked up by name and a reference token returned. That token can then be used to access the object.
A synchronization object like an inter-process mutex is a good candidate for storage in such a container. Multiple processes can then look up the mutex by name and use it.
Such cataloged objects are often reference-counted, so they are only destroyed when the last process holding a token calls for the object to be closed.
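On Windows, for instance, the kernel object namespace is exactly such a catalog. A minimal sketch, assuming a made-up name "Global\MyAppMutex"; every cooperating process runs the same code, and CreateMutexA either creates the object or opens the existing one:
#include <windows.h>
#include <stdio.h>

int main(void)
{
    // Creates the named mutex, or opens it if another process already did.
    HANDLE h = CreateMutexA(NULL, FALSE, "Global\\MyAppMutex");
    if (h == NULL) {
        fprintf(stderr, "CreateMutexA failed: %lu\n", GetLastError());
        return 1;
    }
    WaitForSingleObject(h, INFINITE);   // block until we own the mutex
    puts("in the critical section");
    ReleaseMutex(h);
    CloseHandle(h);   // the kernel destroys the mutex when the last handle closes
    return 0;
}
With this approach no central daemon is needed; the OS itself queues the waiters and decides who gets the mutex next.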
BTW - see comments, your design suc.... has issues :(

Thread safety of curl_multi_remove_handle

It seems like some sources recommend using curl_multi_remove_handle to "invalidate" a curl handle and cause curl_multi_wait to return early. This seems not to be covered by the thread-safety guarantee (if done from another thread), or am I wrong (are the thread-safety guarantees basically just reentrancy guarantees)?
What is the recommended way to signal curl_multi_wait to return early? Is it really required to do it via timeouts? (Under Linux, I would use an eventfd in the epoll set to effectively get "wait on these sockets OR this eventfd OR the given timeout".) It seems I could use custom curl_waitfd structures, but this would require platform-specific setup for dummy sockets.
You must not call curl_multi_remove_handle from thread B if curl_multi_wait for that handle is running in thread A. That will just cause tears and misery.
You can opt to, for example:
use sufficiently short timeouts for curl_multi_wait() so that you don't need to abort it
add a private socket/file descriptor to send data on to abort when you want to
return an error from the progress callback (or another callback) for the transfer(s) you need to stop, by setting a flag that they all check (global, or global-like)
rework your app logic so that you can consider the transfer "dead" without it having stopped yet, and let libcurl run its course and close it later, so you don't have to care much about it finishing a bit after you decided you can ignore it
curl_multi_poll()
After I first wrote this answer, we introduced curl_multi_poll in libcurl. This function is very similar to curl_multi_wait but can also be made to return early with curl_multi_wakeup, thus offering applications a few more alternative approaches.
Unfortunately, curl_multi is not what people these days would deem "thread safe". Yes, you can use a CURLM handle in two different threads, as long as they don't access it at the same time. But hey, this is true for almost any data structure in C or C++.
So, if you have one thread running an event loop with curl_multi_wait(), you cannot use a second thread to add new jobs via curl_multi_add_handle() or remove jobs via curl_multi_remove_handle(). Well, it will work most of the time, but especially under high load you will start getting data corruption and segfaults due to the concurrent access to libcurl's internal data structures.
There are two ways around this problem, but both require a bit of coding:
Use the newer curl_multi_poll() interface, which (unlike curl_multi_wait()) is externally interruptible via curl_multi_wakeup(). Yes, curl_multi_wakeup() is the ONLY function on a CURLM handle that is safe to call concurrently from another thread (or even multiple threads). To add new requests to the event loop or remove requests from it, you need some request queue and a mutex that secures access to that queue. Then, to add a new job, you would do the following (a code sketch follows after these steps):
(thread 1 is running curl_multi_poll() in an endless loop)
thread 2 acquires said mutex
thread 2 posts an "add easy handle request" into the request queue
thread 2 releases said mutex again
thread 2 calls curl_multi_wakeup()
thread 1 acquires the mutex after curl_multi_poll() returns
thread 1 then processes the "add easy handle request" from the request queue and performs curl_multi_add_handle()
thread 1 then releases the mutex again
thread 1 does all other necessary work (in particular call curl_multi_perform() and pass finished transfers to the application etc.)
thread 1 calls curl_multi_poll() again
To remove a job, you would use the same procedure, just let thread 2 post a "remove easy handle request" instead of an "add easy handle request" to the request queue and then let thread 1 call curl_multi_remove_handle() instead of curl_multi_add_handle().
In this solution, ALL calls to the CURLM handle are performed from thread 1, with the sole exception of curl_multi_wakeup(), which is used by other threads to signal thread 1 of new work waiting in the request queue.
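A minimal sketch of this approach, assuming pthreads and libcurl 7.68.0 or newer (which added curl_multi_poll()/curl_multi_wakeup()); to keep it short, the "request queue" holds just one pending easy handle and error checking is omitted:
#include <curl/curl.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static CURL *pending_easy = NULL;   /* the one-slot "request queue" */
static CURLM *multi;

/* thread 2: post an "add easy handle request" and wake up thread 1 */
static void submit_transfer(CURL *easy)
{
    pthread_mutex_lock(&queue_lock);
    pending_easy = easy;
    pthread_mutex_unlock(&queue_lock);
    curl_multi_wakeup(multi);       /* the only thread-safe CURLM call */
}

/* thread 1: every other CURLM call happens here */
static void *event_loop(void *arg)
{
    int running = 0;
    (void)arg;
    for (;;) {
        int numfds;
        curl_multi_poll(multi, NULL, 0, 1000, &numfds);

        /* drain the request queue under the mutex */
        pthread_mutex_lock(&queue_lock);
        if (pending_easy) {
            curl_multi_add_handle(multi, pending_easy);
            pending_easy = NULL;
        }
        pthread_mutex_unlock(&queue_lock);

        curl_multi_perform(multi, &running);
        /* ... curl_multi_info_read() to hand finished transfers back ... */
    }
    return NULL;
}

int main(void)
{
    curl_global_init(CURL_GLOBAL_DEFAULT);
    multi = curl_multi_init();
    pthread_t t;
    pthread_create(&t, NULL, event_loop, NULL);

    CURL *easy = curl_easy_init();
    curl_easy_setopt(easy, CURLOPT_URL, "https://example.com/");
    submit_transfer(easy);          /* safe from any thread */

    pthread_join(t, NULL);          /* the sketch loops forever */
    return 0;
}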
Or use the multi_socket interface, curl_multi_socket_action(), where you have to provide two callbacks to libcurl, with which it reports the file descriptors to watch and a timeout to your application. You then have to call epoll() or a similar OS function yourself to wait for activity (or timeout) in the event loop thread. Then add a mutex again to serialize access to the CURLM handle: your event loop thread should lock that mutex just before it calls curl_multi_socket_action() (or any other function on the CURLM handle) and unlock it immediately after. As curl_multi_socket_action() (unlike curl_multi_poll()) does not sleep, that mutex will be locked only for brief intervals. So other threads can then easily lock that mutex directly themselves, too, and call curl_multi_add_handle() or curl_multi_remove_handle() as needed. Be aware, though, that those intervening additions or removals of handles can modify the active FD set, and that you may need some synchronisation with the event loop thread to notify it of the modified epoll() set.
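A condensed sketch of this second approach, again assuming pthreads; the epoll bookkeeping inside the callbacks is deliberately elided, since the point here is the locking discipline around the CURLM handle:
#include <curl/curl.h>
#include <pthread.h>

static pthread_mutex_t curlm_lock = PTHREAD_MUTEX_INITIALIZER;
static CURLM *multi;

/* libcurl tells us which sockets to watch */
static int socket_cb(CURL *easy, curl_socket_t s, int what,
                     void *userp, void *socketp)
{
    /* ... add/modify/remove socket s in the epoll set based on 'what' ... */
    return 0;
}

/* libcurl tells us the timeout to use for the next epoll wait */
static int timer_cb(CURLM *m, long timeout_ms, void *userp)
{
    /* ... store timeout_ms for the event loop ... */
    return 0;
}

/* event-loop thread: lock only around the (non-sleeping) CURLM call */
static void on_socket_activity(curl_socket_t s, int ev_bitmask)
{
    int running;
    pthread_mutex_lock(&curlm_lock);
    curl_multi_socket_action(multi, s, ev_bitmask, &running);
    pthread_mutex_unlock(&curlm_lock);
}

/* any other thread: the same mutex serializes add/remove with the loop */
void add_transfer(CURL *easy)
{
    pthread_mutex_lock(&curlm_lock);
    curl_multi_add_handle(multi, easy);
    pthread_mutex_unlock(&curlm_lock);
}

int main(void)
{
    curl_global_init(CURL_GLOBAL_DEFAULT);
    multi = curl_multi_init();
    curl_multi_setopt(multi, CURLMOPT_SOCKETFUNCTION, socket_cb);
    curl_multi_setopt(multi, CURLMOPT_TIMERFUNCTION, timer_cb);
    /* ... epoll loop: on activity or timeout, call on_socket_activity() ... */
    return 0;
}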
The first solution is likely easier to implement. You should be able to find libcurl wrappers for both variants on GitHub, but be sure to test them intensively before using them in any critical application.

Cross-Process Mutex Read/Write Locking

I'm trying to implement inter-process communication in C/C++ in a Windows environment.
I am creating a shared memory area backed by the page file, and two processes get a handle to it. It's like this:
Process1: Initialize shared memory area. Wait for Process2 to fill it.
Process2: Get handle to shared memory area. Put stuff in it.
I am creating a named mutex in process1 as well. Now process1 acquires ownership of the mutex soon after creating it (using WaitForSingleObject). Obviously, there is nothing in the memory area yet, so I need to release the mutex. Now I need to wait until the memory is filled instead of trying to acquire the mutex again.
I was thinking of condition variables: process2 signals the condition variable once it fills the memory area, and process1 picks up the information immediately.
However, as per the MS documentation on condition variables, they are not shared across processes, which is clear from their initialization, since they are not named.
Furthermore, the shared memory area can hold at most one element at any given moment, which means process2 cannot refill it after filling it once, unless process1 has extracted the information.
From the given description it seems that condition variables (or monitors) are the best fit for this purpose. So is there a way around this?
Condition variables can be used within a process, but not across processes.
Try a named pipe with PIPE_ACCESS_DUPLEX as the open mode, so that you have communication options from both processes:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365150(v=vs.85).aspx
I have used events for this before. Use two named auto-reset events: a "data ready" event and a "buffer ready" event. The writer waits for "buffer ready", writes the data and sets "data ready". The reader waits for "data ready", reads the memory and sets "buffer ready". If done properly you should not need the mutex.
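A minimal sketch of that protocol, assuming made-up event names and a shared buffer that both processes have already mapped; note that "buffer ready" starts signaled so the writer can fill the buffer first:
#include <windows.h>
#include <stdio.h>

void writer(char *shared_buf)
{
    // Auto-reset events (bManualReset = FALSE); a successful wait resets them.
    HANDLE dataReady = CreateEventA(NULL, FALSE, FALSE, "Local\\DataReady");
    HANDLE bufReady  = CreateEventA(NULL, FALSE, TRUE,  "Local\\BufferReady");
    for (int i = 0; i < 10; i++) {
        WaitForSingleObject(bufReady, INFINITE);  // wait until the buffer is free
        sprintf(shared_buf, "message %d", i);
        SetEvent(dataReady);                      // tell the reader data is there
    }
}

void reader(const char *shared_buf)
{
    HANDLE dataReady = CreateEventA(NULL, FALSE, FALSE, "Local\\DataReady");
    HANDLE bufReady  = CreateEventA(NULL, FALSE, TRUE,  "Local\\BufferReady");
    for (int i = 0; i < 10; i++) {
        WaitForSingleObject(dataReady, INFINITE); // wait for new data
        printf("got: %s\n", shared_buf);
        SetEvent(bufReady);                       // hand the buffer back
    }
}
Since the two events strictly alternate, the reader and writer never touch the buffer at the same time, which is why no mutex is needed.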

Using QLockFile::setStaleLockTime() but lock doesn't become stale?

I'm setting the stale lock time to 100 ms using this code:
QLockFile lock1(fn);
lock1.setStaleLockTime(100);
QVERIFY(lock1.lock());
QLockFile lock2(fn);
lock2.setStaleLockTime(100);
QVERIFY(lock2.lock());
I expected it to block for only 100ms, but it blocks indefinitely. Why is that?
Am I misunderstanding how lock files should become stale? Here's what the docs say:
The value of staleLockTime is used by lock() and tryLock() in order to determine when an existing lock file is considered stale, i.e. left over by a crashed process. This is useful for the case where the PID got reused meanwhile, so one way to detect a stale lock file is by the fact that it has been around for a long time.
You misunderstand something
If the process holding the lock crashes, the lock file stays on disk and can prevent any other process from accessing the shared resource, ever. For this reason, QLockFile tries to detect such a "stale" lock file, based on the process ID written into the file. To cover the situation that the process ID got reused meanwhile, the current process name is compared to the name of the process that corresponds to the process ID from the lock file. If the process names differ, the lock file is considered stale. Additionally, the last modification time of the lock file (30s by default, for the use case of a short-lived operation) is taken into account. If the lock file is found to be stale, it will be deleted.
So not only staleLockTime is checked, but also the process ID and the process name. A lock held by a live process never becomes stale, so you can't use the stale-lock mechanism to put a timeout on lock() this way.
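If the intent is simply to give up after 100 ms when the lock is held by a live process, a minimal sketch using QLockFile::tryLock(), which takes a timeout in milliseconds, would be:
#include <QLockFile>
#include <QDebug>

void example(const QString &fn)
{
    QLockFile lock1(fn);
    lock1.lock();                  // first lock succeeds immediately

    QLockFile lock2(fn);
    if (!lock2.tryLock(100)) {     // gives up after ~100 ms instead of blocking
        qDebug() << "lock is held by a live process; it is not stale";
    }
}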

how to pass data to running thread

When using pthread, I can pass data at thread creation time.
What is the proper way of passing new data to an already running thread?
I'm considering making a global variable and having my thread read from it.
Thanks
That will certainly work. Basically, threads are just lightweight processes that share the same memory space. Global variables, being in that memory space, are available to every thread.
The trick is not with the readers so much as the writers. If you have a simple chunk of global memory, like an int, then assigning to that int will probably be safe. But consider something a little more complicated, like a struct. Just to be definite, let's say we have
struct S { int a; float b; } s1, s2;
Now s1 and s2 are variables of type struct S. We can initialize s1 (using a C99 compound literal here, since plain brace initialization is only allowed at the point of declaration):
s1 = (struct S){ 42, 3.14f };
and we can assign them
s2 = s1;
But when we assign them, the processor isn't guaranteed to complete the assignment of the whole struct in one step -- we say it's not atomic. So let's now imagine two threads:
thread 1:
while (true) {
    printf("{%d,%f}\n", s2.a, s2.b);
    sleep(1);
}
thread 2:
while (true) {
    sleep(1);
    s2 = s1;
    s1.a += 1;
    s1.b += 3.14f;
}
We can see that we'd expect s2 to have the values {42, 3.14}, {43, 6.28}, {44, 9.42} ....
But what we see printed might be anything like
{42,3.14}
{43,3.14}
{43,6.28}
or
{43,3.14}
{44,6.28}
and so on. The problem is that thread 1 may get control and "look at" s2 at any time during that assignment.
The moral is that while global memory is a perfectly workable way to do it, you need to take into account the possibility that your threads will cross over one another. There are several solutions to this, the most basic being semaphores. A semaphore has two operations, confusingly named (from Dutch) P and V.
P waits until the variable is greater than zero and then goes on, subtracting 1 from the variable; V adds 1 to the variable. The only thing special is that they do this atomically -- they can't be interrupted.
Now, if you code it as
thread 1:
while (true) {
    P();
    printf("{%d,%f}\n", s2.a, s2.b);
    V();
    sleep(1);
}
thread 2:
while (true) {
    sleep(1);
    P();
    s2 = s1;
    V();
    s1.a += 1;
    s1.b += 3.14f;
}
and you're guaranteed that you'll never have thread 2 half-completing an assignment while thread 1 is trying to print.
(POSIX has semaphores, by the way; see <semaphore.h>.)
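For completeness, a runnable version of the sketch above using POSIX semaphores, where sem_wait plays the role of P and sem_post plays the role of V:
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <unistd.h>

struct S { int a; float b; } s1 = { 42, 3.14f }, s2;
sem_t sem;   /* binary semaphore: 1 = free, 0 = held */

void *reader(void *arg)
{
    for (;;) {
        sem_wait(&sem);                  /* P */
        printf("{%d,%f}\n", s2.a, s2.b);
        sem_post(&sem);                  /* V */
        sleep(1);
    }
    return NULL;
}

void *writer(void *arg)
{
    for (;;) {
        sleep(1);
        sem_wait(&sem);                  /* P */
        s2 = s1;                         /* now atomic with respect to the reader */
        sem_post(&sem);                  /* V */
        s1.a += 1;
        s1.b += 3.14f;
    }
    return NULL;
}

int main(void)
{
    sem_init(&sem, 0, 1);                /* initial value 1: unlocked */
    pthread_t t1, t2;
    pthread_create(&t1, NULL, reader, NULL);
    pthread_create(&t2, NULL, writer, NULL);
    pthread_join(t1, NULL);              /* never returns in this sketch */
    return 0;
}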
I have been using the message-passing, producer-consumer, queue-based comms mechanism, as suggested by asveikau, for decades without any problems specifically related to multithreading. There are some advantages:
1) The 'threadCommsClass' instances passed on the queue can often contain everything required for the thread to do its work - member/s for input data, member/s for output data, methods for the thread to call to do the work, somewhere to put any error/exception messages and a 'returnToSender(this)' event to call, so returning everything to the requester by some thread-safe means that the worker thread does not need to know about. The worker thread then runs asynchronously on one set of fully encapsulated data that requires no locking. 'returnToSender(this)' might queue the object onto another P-C queue, it might PostMessage it to a GUI thread, it might release the object back to a pool or just dispose() it. Whatever it does, the worker thread does not need to know about it.
2) There is no need for the requesting thread to know anything about which thread did the work - all the requestor needs is a queue to push on. In an extreme case, the worker thread on the other end of the queue might serialize the data and communicate it to another machine over a network, only calling returnToSender(this) when a network reply is received - the requestor does not need to know this detail - only that the work has been done.
3) It is usually possible to arrange for the 'threadCommsClass' instances and the queues to outlive both the requester thread and the worker thread. This greatly eases those problems when the requester or worker are terminated and dispose()'d before the other - since they share no data directly, there can be no AV/whatever. This also blows away all those 'I can't stop my work thread because it's stuck on a blocking API' issues - why bother stopping it if it can be just orphaned and left to die with no possibility of writing to something that is freed?
4) A threadpool reduces to a one-line for loop that creates several work threads and passes them the same input queue.
5) Locking is restricted to the queues. The more mutexes, condVars, critical-sections and other synchro locks there are in an app, the more difficult it is to control it all and the greater the chance of an intermittent deadlock that is a nightmare to debug. With queued messages, (ideally), only the queue class has locks. The queue class must work 100% with multiple producers/consumers, but that's one class, not an app full of uncoordinated locking, (yech!).
6) A threadCommsClass can be raised anytime, anywhere, in any thread and pushed onto a queue. It's not even necessary for the requester code to do it directly, eg. a call to a logger class method, 'myLogger.logString("Operation completed successfully");' could copy the string into a comms object, queue it up to the thread that performs the log write and return 'immediately'. It is then up to the logger class thread to handle the log data when it dequeues it - it may write it to a log file, it may find after a minute that the log file is unreachable because of a network problem. It may decide that the log file is too big, archive it and start another one. It may write the string to disk and then PostMessage the threadCommsClass instance on to a GUI thread for display in a terminal window, whatever. It doesn't matter to the log requesting thread, which just carries on, as do any other threads that have called for logging, without significant impact on performance.
7) If you do need to kill off a thread waiting on a queue, rather than waiting for the OS to kill it on app close, just queue it a message telling it to terminate.
There are surely disadvantages:
1) Shoving data directly into thread members, signaling it to run and waiting for it to finish is easier to understand and will be faster, assuming that the thread does not have to be created each time.
2) Truly asynchronous operation, where the thread is queued some work and, sometime later, returns it by calling some event handler that has to communicate the results back, is more difficult to handle for developers used to single-threaded code and often requires state-machine type design where context data must be sent in the threadCommsClass so that the correct actions can be taken when the results come back. If there is the occasional case where the requestor just has to wait, it can send an event in the threadCommsClass that gets signaled by the returnToSender method, but this is obviously more complex than simply waiting on some thread handle for completion.
Whatever design is used, forget the simple global variables as other posters have said. There is a case for some global types in thread comms - one I use very often is a thread-safe pool of threadCommsClass instances, (this is just a queue that gets pre-filled with objects). Any thread that wishes to communicate has to get a threadCommsClass instance from the pool, load it up and queue it off. When the comms is done, the last thread to use it releases it back to the pool. This approach prevents runaway new(), and allows me to easily monitor the pool level during testing without any complex memory-managers, (I usually dump the pool level to a status bar every second with a timer). Leaking objects, (level goes down), and double-released objects, (level goes up), are easily detected and so get fixed.
Multithreading can be safe and deliver scalable, high-performance apps that are almost a pleasure to maintain/enhance, (almost:), but you have to lay off the simple globals - treat them like tequila - a quick and easy high for now, but you just know they'll blow your head off tomorrow.
Good luck!
Martin
Global variables are bad to begin with, and even worse with multi-threaded programming. Instead, the creator of the thread should allocate some sort of context object that's passed to pthread_create, which contains whatever buffers, locks, condition variables, queues, etc. are needed for passing information to and from the thread.
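A minimal sketch of that context-object pattern, with made-up names; the struct bundles the lock and the data, so nothing global is needed:
#include <pthread.h>
#include <stdio.h>

struct worker_ctx {
    pthread_mutex_t lock;
    int shared_counter;    /* data passed to and from the thread */
};

static void *worker(void *arg)
{
    struct worker_ctx *ctx = (struct worker_ctx *)arg;
    pthread_mutex_lock(&ctx->lock);
    ctx->shared_counter++;           /* touch the shared state under the lock */
    pthread_mutex_unlock(&ctx->lock);
    return NULL;
}

int main(void)
{
    struct worker_ctx ctx = { PTHREAD_MUTEX_INITIALIZER, 0 };
    pthread_t t;
    pthread_create(&t, NULL, worker, &ctx);   /* context object, not a global */
    pthread_join(t, NULL);
    printf("counter = %d\n", ctx.shared_counter);
    return 0;
}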
You will need to build this yourself. The most typical approach requires some cooperation from the other thread, as it would be a bit of a weird interface to "interrupt" a running thread with some data and code to execute on it... That would also have some of the same trickiness as POSIX signals or IRQs, both of which make it easy to shoot yourself in the foot while processing if you haven't carefully thought it through... (Simple example: you can't call malloc inside a signal handler, because you might be interrupted in the middle of malloc and crash while accessing malloc's internal data structures, which are only partially updated.)
The typical approach is to have your thread creation routine basically be an event loop. You can build a queue structure and pass that as the argument to the thread creation routine. Then other threads can enqueue things and the thread's event loop will dequeue it and process the data. Note this is cleaner than a global variable (or global queue) because it can scale to have multiple of these queues.
You will need some synchronization on that queue data structure. Entire books could be written about how to implement queue synchronization, but the simplest approach is a lock and a semaphore. When modifying the queue, threads take the lock. When waiting for something to dequeue, consumer threads wait on a semaphore, which enqueuers increment. It's also a good idea to implement some mechanism to shut down the consumer thread (the sketch below uses a NULL message for that).
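A minimal sketch of such a queue, assuming pthreads and POSIX semaphores; the mutex protects the linked list, the counting semaphore lets the consumer sleep until a message arrives, and a NULL message (one possible convention, made up here) tells the consumer to exit:
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <stdlib.h>

struct node { void *msg; struct node *next; };

struct queue {
    pthread_mutex_t lock;   /* protects head/tail */
    sem_t items;            /* counts queued messages */
    struct node *head, *tail;
};

void queue_init(struct queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    sem_init(&q->items, 0, 0);
    q->head = q->tail = NULL;
}

void enqueue(struct queue *q, void *msg)
{
    struct node *n = (struct node *)malloc(sizeof *n);
    n->msg = msg;
    n->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail) q->tail->next = n; else q->head = n;
    q->tail = n;
    pthread_mutex_unlock(&q->lock);
    sem_post(&q->items);    /* wake one waiting consumer */
}

void *dequeue(struct queue *q)
{
    sem_wait(&q->items);    /* sleep until a message is available */
    pthread_mutex_lock(&q->lock);
    struct node *n = q->head;
    q->head = n->next;
    if (!q->head) q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    void *msg = n->msg;
    free(n);
    return msg;
}

/* consumer: the thread's event loop, as described above */
void *consumer(void *arg)
{
    struct queue *q = (struct queue *)arg;
    for (;;) {
        void *msg = dequeue(q);
        if (!msg) break;                /* NULL = shutdown request */
        printf("processing: %s\n", (const char *)msg);
    }
    return NULL;
}

int main(void)
{
    struct queue q;
    queue_init(&q);
    pthread_t t;
    pthread_create(&t, NULL, consumer, &q);
    enqueue(&q, (void *)"hello");
    enqueue(&q, NULL);                  /* ask the consumer to exit */
    pthread_join(t, NULL);
    return 0;
}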

Difference between event object and condition variable

What is the difference between event objects and condition variables?
I am asking in context of WIN32 API.
Event objects are kernel-level objects. They can be shared across process boundaries, and are supported on all Windows OS versions. They can be used as their own standalone locks to shared resources, if desired. Since they are kernel objects, the OS has limitations on the number of available events that can be allocated at a time.
Condition variables are user-level objects. They cannot be shared across process boundaries, and are only supported on Vista/2008 and later. They do not act as their own locks, but require a separate lock to be associated with them, such as a critical section. Since they are user-level objects, the number of available condition variables is limited only by available memory. When a condition variable is put to sleep, it automatically releases the specified lock object so another thread can acquire it. When the condition variable wakes up, it automatically re-acquires the specified lock object again.
In terms of functionality, think of a condition variable as a logical combination of two objects working together - a keyed event and a lock object. When the condition variable is put to sleep, it resets the event, releases the lock, waits for the event to be signaled, and then re-acquires the lock. For instance, if you use a critical section as the lock object, SleepConditionVariableCS() is similar to a sequence of calls to ResetEvent(), LeaveCriticalSection(), WaitForSingleObject(), and EnterCriticalSection(). Whereas if you use an SRW lock, SleepConditionVariableSRW() is similar to a sequence of calls to ResetEvent(), ReleaseSRWLock...(), WaitForSingleObject(), and AcquireSRWLock...().
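A minimal sketch of the usual condition-variable pattern with a critical section (Vista and later); the 'ready' flag and the names are made up for illustration:
#include <windows.h>
#include <stdio.h>

CRITICAL_SECTION cs;
CONDITION_VARIABLE cv;
BOOL ready = FALSE;

DWORD WINAPI waiter(LPVOID arg)
{
    EnterCriticalSection(&cs);
    while (!ready)                        // loop guards against spurious wakeups
        SleepConditionVariableCS(&cv, &cs, INFINITE);  // releases cs while asleep
    puts("condition met");
    LeaveCriticalSection(&cs);
    return 0;
}

int main(void)
{
    InitializeCriticalSection(&cs);
    InitializeConditionVariable(&cv);
    HANDLE h = CreateThread(NULL, 0, waiter, NULL, 0, NULL);
    Sleep(100);
    EnterCriticalSection(&cs);
    ready = TRUE;
    LeaveCriticalSection(&cs);
    WakeConditionVariable(&cv);           // wakes the sleeping thread
    WaitForSingleObject(h, INFINITE);
    CloseHandle(h);
    DeleteCriticalSection(&cs);
    return 0;
}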
They are very similar, but event objects work across process boundaries, whereas condition variables do not. From the MSDN documentation on condition variables:
Condition variables are user-mode objects that cannot be shared across processes.
From the MSDN documentation on event objects:
Threads in other processes can open a handle to an existing event object by specifying its name in a call to the OpenEvent function.
The most significant difference is that an event object is a kernel object and can be shared across processes, as long as it is still alive when processes/threads try to acquire it. A condition variable, on the contrary, is a user-mode object, which makes it lightweight (it is only the size of a pointer, and there is nothing extra to release after using it) and gives it better performance.
Typically, a condition variable is used together with a lock, since we need to keep the data properly synchronized. Under the hood, Windows condition variables are built on keyed events, which were improved for Vista.
Joe Duffy has a blog post, http://joeduffyblog.com/2006/11/28/windows-keyed-events-critical-sections-and-new-vista-synchronization-features/, that explains this in more detail.