Mutex lock on write only - c++

I have a multithreaded C++ application which holds a complex data structure in memory (cached data).
Everything is great while I just read the data. I can have as many threads as I want access the data.
However the cached structure is not static.
If the requested data item is not available it will be read from database and is then inserted into the data tree. This is probably also not problematic and even if I use a mutex while I add the new data item to the tree that will only take few cycles (it's just adding a pointer).
There is a Garbage Collection process that's executed every now and then. It removes all old items from the tree. To do so I need to lock the whole thing down to make sure that no other process is currently accessing any data that's going to be removed from memory. I also have to lock the tree while I read from the cache so that I don't remove items while they are processed (kind of "the same thing the other way around").
"Pseudocode":
function getItem(key)
lockMutex()
foundItem = walkTreeToFindItem(key)
copyItem(foundItem, safeCopy)
unlockMutex()
return safeCopy
end function
function garbageCollection()
while item = nextItemInTree
if (tooOld) then
lockMutex()
deleteItem(item)
unlockMutex()
end if
end while
end function
What's bothering me: This means, that I have to lock the tree while I'm reading (to avoid the garbage collection to start while I read). However - as a side-effect - I also can't have two reading processes at the same time anymore.
Any suggestions?
Is there some kind of "this is a readonly action that only collides with writes" Mutex?

Look into read-write-lock.
You didn't specify which framework can you use but both pThread and boost have implemented that pattern.

The concept is a "shared reader, single writer" lock as others have stated. In Linux environments you should be able to use pthread_rwlock_t without any framework. I would suggest looking into boost::shared_lock as well.

I suggest a reader-writer lock. The idea is you can acquire a lock for "reading" or for "writing", and the lock will allow multiple readers, but only one writer. Very handy.

In C++17 this type of access (multiple read, single write) is supported directly with std::shared_mutex. I think this was adopted from boost::shared_lock. The topic is also described with an example in Anthony Williams's C++ Concurrency in Action.
https://livebook.manning.com/book/c-plus-plus-concurrency-in-action-second-edition/chapter-3/185.
The essential bits are:
use std::shared_lock on reads where shared access is allowed
existing shared_locks don't prevent a new shared_lock from locking
use std::unique_lock on update/write functions, where exclusive access is needed
will wait until all existing shared reads complete, as well as other writes
precludes any other reads or writes while locked

Related

When can safely access mutex protected variable without locking?

A common pattern of storing config in my code is a "map[string]interface{}" protected by RWMutex, but usually after app initiated (could be triggered in multiple go-routine), the map becomes totally readonly. So I have a feeling that from some point of time on, the RWMutex on read should be unnecessary.
An example of this config map is at
http://play.golang.org/p/tkbj9DBok_
One fact that brought me to think of this is in some of production code it actually doing this way of unprotected access of shared object (though it's mostly readonly after it's init'ed), I understand normal way of using RWMutex to protect, but interesting part is this malformed code haven't run into problem in past months.
Is that true that after some accurate "time point" that writes are flushed from cache into memory and with a guarantee of no more writes needed, reads can actually go without RWMutex.RLock? If YES, when is the time point or how to setup the conditions before lockless access?
As long as no one is modifying the map, it should be safe for multiple threads to read it at once. Unfortunately, without any locking you'll have no way to make sure no one else is reading the map when you want to update it.
So one solution is to never update the map, but instead replace it atomically. The read-copy-update algorithm could be used here. Rather than directly accessing the map, so you need to dereference the pointer to access the map. To update it, you can do the following:
acquire an "update lock" mutex.
make a copy of the map. You want to copy all keys/values manually: simple assignment won't work because maps are reference types.
make your changes to the copy of the map.
use StorePointer from the sync/atomic package to atomically update the pointer to the live map to point to your new map.
release the mutex.
Everything that runs before the atomic update in (4) will see the old map, and everything after will see the new map. At no point will those goroutines be reading from a map that is being written to, so there is no need for an RWMutex.
RLock and RUnlock are very fast operations. If there are no writers they are essentially lockless and take just 1 atomic operation each (http://golang.org/src/sync/rwmutex.go?h=RLock#L29). So unless your application performs poorly because reading configuration is slow I would suggest to use RWLock.
Note, that my initial answer was a proposal to implement Freeze() operation for Register but later I realized that the correct implementation won't be any faster than using RWLock.

Guarded Data Design Pattern

In our application we deal with data that is processed in a worker thread and accessed in a display thread and we have a mutex that takes care of critical sections. Nothing special.
Now we thought about re-working our code where currently locking is done explicitely by the party holding and handling the data. We thought of a single entity that holds the data and only gives access to the data in a guarded fashion.
For this, we have a class called GuardedData. The caller can request such an object and should keep it only for a short time in local scope. As long as the object lives, it keeps the lock. As soon as the object is destroyed, the lock is released. The data access is coupled with the locking mechanism without any explicit extra work in the caller. The name of the class reminds the caller of the present guard.
template<typename T, typename Lockable>
class GuardedData {
GuardedData(T &d, Lockable &m) : data(d), guard(m) {}
boost::lock_guard<Lockable> guard;
T &data;
T &operator->() { return data; }
};
Again, a very simple concept. The operator-> mimics the semantics of STL iterators for access to the payload.
Now I wonder:
Is this approach well known?
Is there maybe a templated class like this already available, e.g. in the boost libraries?
I am asking because I think it is a fairly generic and usable concept. I could not find anything like it though.
Depending upon how this is used, you are almost guaranteed to end up with deadlocks at some point. If you want to operate on 2 pieces of data then you end up locking the mutex twice and deadlocking (unless each piece of data has its own mutex - which would also result in deadlock if the lock order is not consistent - you have no control over that with this scheme without making it really complicated). Unless you use a recursive mutex which may not be desired.
Also, how are your GuardedData objects passed around? boost::lock_guard is not copyable - it raises ownership issues on the mutex i.e. where & when it is released.
Its probably easier to copy parts of the data you need to the reader/writer threads as and when they need it, keeping the critical section short. The writer would similarly commit to the data model in one go.
Essentially your viewer thread gets a snapshot of the data it needs at a given time. This may even fit entirely in a cpu cache sitting near the core that is running the thread and never make it into RAM. The writer thread may modify the underlying data whilst the reader is dealing with it (but that should invalidate the view). However since the viewer has a copy it can continue on and provide a view of the data at the moment it was synchronized with the data.
The other option is to give the view a smart pointer to the data (which should be treated as immutable). If the writer wishes to modify the data, it copies it at that point, modifies the copy and when completes, switches the pointer to the data in the model. This would necessitate blocking all readers/writers whilst processing, unless there is only 1 writer. The next time the reader requests the data, it gets the fresh copy.
Well known, I'm not sure. However, I use a similar mechanism in Qt pretty often called a QMutexLocker. The distinction (a minor one, imho) is that you bind the data together with the mutex. A very similar mechanism to the one you've described is the norm for thread synchronization in C#.
Your approach is nice for guarding one data item at a time but gets cumbersome if you need to guard more than that. Additionally, it doesn't look like your design would stop me from creating this object in a shared place and accessing the data as often as I please, thinking that it's guarded perfectly fine, but in reality recursive access scenarios are not handled, nor are multi-threaded access scenarios if they occur in the same scope.
There seems to be to be a slight disconnect in the idea. Its use conveys to me that accessing the data is always made to be thread-safe because the data is guarded. Often, this isn't enough to ensure thread-safety. Order of operations on protected data often matters, so the locking is really scope-oriented, not data-oriented. You could get around this in your model by guarding a dummy object and wrapping your guard object in a temporary scope, but then why not just use one the existing mutex implementations?
Really, it's not a bad approach, but you need to make sure its intended use is understood.

How to synchronize access to many objects

I have a thread pool with some threads (e.g. as many as number of cores) that work on many objects, say thousands of objects. Normally I would give each object a mutex to protect access to its internals, lock it when I'm doing work, then release it. When two threads would try to access the same object, one of the threads has to wait.
Now I want to save some resources and be scalable, as there may be thousands of objects, and still only a hand full of threads. I'm thinking about a class design where the thread has some sort of mutex or lock object, and assigns the lock to the object when the object should be accessed. This would save resources, as I only have as much lock objects as I have threads.
Now comes the programming part, where I want to transfer this design into code, but don't know quite where to start. I'm programming in C++ and want to use Boost classes where possible, but self written classes that handle these special requirements are ok. How would I implement this?
My first idea was to have a boost::mutex object per thread, and each object has a boost::shared_ptr that initially is unset (or NULL). Now when I want to access the object, I lock it by creating a scoped_lock object and assign it to the shared_ptr. When the shared_ptr is already set, I wait on the present lock. This idea sounds like a heap full of race conditions, so I sort of abandoned it. Is there another way to accomplish this design? A completely different way?
Edit:
The above description is a bit abstract, so let me add a specific example. Imagine a virtual world with many objects (think > 100.000). Users moving in the world could move through the world and modify objects (e.g. shoot arrows at monsters). When only using one thread, I'm good with a work queue where modifications to objects are queued. I want a more scalable design, though. If 128 core processors are available, I want to use all 128, so use that number of threads, each with work queues. One solution would be to use spatial separation, e.g. use a lock for an area. This could reduce number of locks used, but I'm more interested if there's a design which saves as much locks as possible.
You could use a mutex pool instead of allocating one mutex per resource or one mutex per thread. As mutexes are requested, first check the object in question. If it already has a mutex tagged to it, block on that mutex. If not, assign a mutex to that object and signal it, taking the mutex out of the pool. Once the mutex is unsignaled, clear the slot and return the mutex to the pool.
Without knowing it, what you were looking for is Software Transactional Memory (STM).
STM systems manage with the needed locks internally to ensure the ACI properties (Atomic,Consistent,Isolated). This is a research activity. You can find a lot of STM libraries; in particular I'm working on Boost.STM (The library is not yet for beta test, and the documentation is not really up to date, but you can play with). There are also some compilers that are introducing TM in (as Intel, IBM, and SUN compilers). You can get the draft specification from here
The idea is to identify the critical regions as follows
transaction {
// transactional block
}
and let the STM system to manage with the needed locks as far as it ensures the ACI properties.
The Boost.STM approach let you write things like
int inc_and_ret(stm::object<int>& i) {
BOOST_STM_TRANSACTION {
return ++i;
} BOOST_STM_END_TRANSACTION
}
You can see the couple BOOST_STM_TRANSACTION/BOOST_STM_END_TRANSACTION as a way to determine a scoped implicit lock.
The cost of this pseudo transparency is of 4 meta-data bytes for each stm::object.
Even if this is far from your initial design I really think is what was behind your goal and initial design.
I doubt there's any clean way to accomplish your design. The problem that assigning the mutex to the object looks like it'll modify the contents of the object -- so you need a mutex to protect the object from several threads trying to assign mutexes to it at once, so to keep your first mutex assignment safe, you'd need another mutex to protect the first one.
Personally, I think what you're trying to cure probably isn't a problem in the first place. Before I spent much time on trying to fix it, I'd do a bit of testing to see what (if anything) you lose by simply including a Mutex in each object and being done with it. I doubt you'll need to go any further than that.
If you need to do more than that I'd think of having a thread-safe pool of objects, and anytime a thread wants to operate on an object, it has to obtain ownership from that pool. The call to obtain ownership would release any object currently owned by the requesting thread (to avoid deadlocks), and then give it ownership of the requested object (blocking if the object is currently owned by another thread). The object pool manager would probably operate in a thread by itself, automatically serializing all access to the pool management, so the pool management code could avoid having to lock access to the variables telling it who currently owns what object and such.
Personally, here's what I would do. You have a number of objects, all probably have a key of some sort, say names. So take the following list of people's names:
Bill Clinton
Bill Cosby
John Doe
Abraham Lincoln
Jon Stewart
So now you would create a number of lists: one per letter of the alphabet, say. Bill and Bill would go in one list, John, Jon Abraham all by themselves.
Each list would be assigned to a specific thread - access would have to go through that thread (you would have to marshall operations to an object onto that thread - a great use of functors). Then you only have two places to lock:
thread() {
loop {
scoped_lock lock(list.mutex);
list.objectAccess();
}
}
list_add() {
scoped_lock lock(list.mutex);
list.add(..);
}
Keep the locks to a minimum, and if you're still doing a lot of locking, you can optimise the number of iterations you perform on the objects in your lists from 1 to 5, to minimize the amount of time spent acquiring locks. If your data set grows or is keyed by number, you can do any amount of segregating data to keep the locking to a minimum.
It sounds to me like you need a work queue. If the lock on the work queue became a bottle neck you could switch it around so that each thread had its own work queue then some sort of scheduler would give the incoming object to the thread with the least amount of work to do. The next level up from that is work stealing where threads that have run out of work look at the work queues of other threads.(See Intel's thread building blocks library.)
If I follow you correctly ....
struct table_entry {
void * pObject; // substitute with your object
sem_t sem; // init to empty
int nPenders; // init to zero
};
struct table_entry * table;
object_lock (void * pObject) {
goto label; // yes it is an evil goto
do {
pEntry->nPenders++;
unlock (mutex);
sem_wait (sem);
label:
lock (mutex);
found = search (table, pObject, &pEntry);
} while (found);
add_object_to_table (table, pObject);
unlock (mutex);
}
object_unlock (void * pObject) {
lock (mutex);
pEntry = remove (table, pObject); // assuming it is in the table
if (nPenders != 0) {
nPenders--;
sem_post (pEntry->sem);
}
unlock (mutex);
}
The above should work, but it does have some potential drawbacks such as ...
A possible bottleneck in the search.
Thread starvation. There is no guarantee that any given thread will get out of the do-while loop in object_lock().
However, depending upon your setup, these potential draw-backs might not matter.
Hope this helps.
We here have an interest in a similar model. A solution we have considered is to have a global (or shared) lock but used in the following manner:
A flag that can be atomically set on the object. If you set the flag you then own the object.
You perform your action then reset the variable and signal (broadcast) a condition variable.
If the acquire failed you wait on the condition variable. When it is broadcast you check its state to see if it is available.
It does appear though that we need to lock the mutex each time we change the value of this variable. So there is a lot of locking and unlocking but you do not need to keep the lock for any long period.
With a "shared" lock you have one lock applying to multiple items. You would use some kind of "hash" function to determine which mutex/condition variable applies to this particular entry.
Answer the following question under the #JohnDibling's post.
did you implement this solution ? I've a similar problem and I would like to know how you solved to release the mutex back to the pool. I mean, how do you know, when you release the mutex, that it can be safely put back in queue if you do not know if another thread is holding it ?
by #LeonardoBernardini
I'm currently trying to solve the same kind of problem. My approach is create your own mutex struct (call it counterMutex) with a counter field and the real resource mutex field. So every time you try to lock the counterMutex, first you increment the counter then lock the underlying mutex. When you're done with it, you decrement the coutner and unlock the mutex, after that check the counter to see if it's zero which means no other thread is trying to acquire the lock . If so put the counterMutex back to the pool. Is there a race condition when manipulating the counter? you may ask. The answer is NO. Remember you have a global mutex to ensure that only one thread can access the coutnerMutex at one time.

lock file so that it cannot be deleted

I'm working with two independent c/c++ applications on Windows where one of them constantly updates an image on disk (from a webcam) and the other reads that image for processing. This works fine and dandy 99.99% of the time, but every once in a while the reader app is in the middle of reading the image when the writer deletes it to refresh it with a new one.
The obvious solution to me seems to be to have the reader put some sort of a lock on the file so that the writer can see that it can't delete it and thus spin-lock on it until it can delete and update. Is there anyway to do this? Or is there another simple design pattern I can use to get the same sort of constant image refreshing between two programs?
Thanks,
-Robert
Try using a synchronization object, probably a mutex will do. Whenever a process wants to read or write to a file it should first acquire the mutex lock.
Yes, a locking mechanism would help. There are, unfortunately, several to choose from. Linux/Unix e.g. has flock (2), Windows has a similar (but different) mechanism.
Another (somewhat hacky) solution is to just write the file under a temporary name, then rename it. Many filesystems guarantee that a rename is atomic, so this may work. This however depends on the fs, so it's a bit hacky.
If you are willing to go with the Windows API, opening the file with CreateFile and passing in 0 for the dwShareMode will not allow any other application to open the file.
From the documentation:
Prevents other processes from opening a file or device if they
request delete, read, or write access.
Then you'd have to use ReadFile, WriteFile, CloseFile, etc rather than the C standard library functions.
Or, as a really simple kludge, the reader creates a temp file (says, .lock) before starting reading and deletes it afterwards. The write doesn't manipulate the file so long as .lock exists.
That's how Open Office does it (and others) and it's probably the simplest to implement, no matter which platform.
Joe, many solutions have been proposed; I commented on some of them but I'd like to chime in with an overall view and some specifics and recommendations:
You have the following options:
use filesystem locking: under Windows have both the reader and writer open (and create with the CREATE_ALWAYS disposition, respectively) the shared file in OF_SHARE_EXCLUSIVE mode; have both the reader and writer ready to handle ERROR_SHARING_VIOLATION and retry after some predefined period of time (e.g. 250ms)
use file renaming to essentially transfer file ownership: have the writer create a writer-private file (e.g. shared_file.tmpwrite), write to it, close it, then make it publicly available to the reader by renaming it to an agreed-upon "public" name (e.g. simply shared-file); have the reader periodically test for the existence of a file with the agreed-upon "public" name (e.g. shared-file) and, when one is found, attempt to first rename it to a reader-private name (e.g. shared_file.tmpread) before having the reader open it (under the reader-private name); under Windows use MOVEFILE_REPLACE_EXISTING; the rename operation does not have to be atomic for this to work
use other forms of interprocess communication (IPC): under Windows you can create a named mutex, and have both the reader and writer attempt to create (the existing mutex will be returned if it already exists) then acquire the named mutex before opening the shared file for reading or writing
implement your own filesystem-backed locking: take advantage of open(O_CREAT|O_EXCL) or, under Windows, of the CREATE_NEW disposition to atomically create an application lock file; unlike OF_SHARE_EXCLUSIVE approach above, it would be up to you to deal with stale lock files (i.e. lock files left by a process which did not shut down gracefully such as after a crash.)
I would implement method 1.
Method 2 would also work, but it is in a sense reinventing the wheel.
Method 3 arguably has the advantage of allowing your reader process to wait on the writer process and vice-versa, eliminating the need for the arbitrary sleep delays between the retries of methods 1 and 2 (polling); however, if you are OK with polling then you should still use method 1
Method 4 is listed for completeness only, as it is complex to implement (when the lock file is detected to be stale, e.g. by checking whether the PID contained therein still exists, multiple processes can potentially be competing for its removal, which introduces a race condition requiring a second lock, which in turn can become stale etc. etc., e.g.:
process A creates the lock file but dies without removing the lock file
process A restarts and tries to acquire the lock file but realizes it is stale
process B comes out of a sleep delay and also tries to acquire the lock file but realizes it is stale
process A removes the lock file, which it knew to be stale, and recreates it essentially reacquiring the lock
process B removes the lock file, which it (still) thinks is stale (although at this point it is no longer stale and owned by process A) -- violation
Instead of deleting images, what about appending them to the end of the file? This would allow you to keep adding to the file while the reader is still operating without destroying the file. The reader can then delete the image when it's done with it (provided it is necessary) and move onto the next image. Or, the other option would be store the image in a buffer, for writing, and you test the file pointer. If it's set to the head of the file then you can go ahead and write from the buffer to the file. Otherwise, wait until reader finishes and puts the pointer back at the head of the file.
couldn't you store a few images? ('n' sounds like a good number :-)
Not too many to fill your disk, but surely 3 would be enough? if not, you are writing faster than you can process and have a fundamental problem anyhoo (tune to discover 'n').
Cyclically overwrite.

To Mutex or Not To Mutex?

Do I need a mutex if I have only one reader and one writer? The reader takes the next command (food.front()) from the queue and executes a task based on the command. After the command is executed, it pops off the command. The writer to the queue pushes commands onto the queue (food.push()).
Do I need a mutex? My reader (consumer) only executes if food.size() > 0. I am using a reader thread and send thread.
A mutex is used in multi-threaded environments. I don't see mention of threads in your question, so I don't see a need for a mutex.
However, if we assume by reader and writer you mean you have two threads, you need to protect mutual data with a mutex (or other multi-threaded protection scheme.)
What happens when the queue has items, and the reader thread pops something off while the writer thread puts something on? Disaster! With a mutex, you'll be sure only one thread is operating on the queue at a time.
Another method is a lock-free thread-safe queue. It would use atomic operations to ensure the data isn't manipulated incorrectly.
What happens if the reader sees the size is greater than zero, but the structure isn't yet completely updated?
That can be avoided by very carefully coding the updates, but the way to make the code resistant to future tampering updates is to use a mutex.
Assuming the "writer" and the "reader" are in separate threads:
Most probably yes: you could have a "metastable" state between the "writing" event and a "reading" event where the pointers to the structures are consistent.
Of course this depends on the implementation: if an atomic operation is used to update the pointers, you might be good without a mutex.
Depends entirely on the implementation, if you have two different threads accessing the same variables you will need a mutex. Otherwise you may for example end up with an inconsistent count.
Say in write you do ++count and in read you do --count and say the current value is 2. Now note that these statements do not need to be atomic, the ++count may consist of reading variable count, incrementing it and then writing it back again. No a write and read are simultaneously executed and say the first bit of the write is executed is executed (i.e. it loads value 2. then the whole read is executed decrementing the count, but the other thread still had value 2 loaded, which it increments and subsequently writes back to the variable. Now you just lost a read action.
your question depends on two conditions:
there are only two threads, one is producer, the other is consumer
the structure is designed for lock-free
if satisfy both, you can drop the lock, or you need to use a lock to protect the queue structure.
for dropping the lock, must remember to update the header or tailer pointer in the end of steps.