When can safely access mutex protected variable without locking?

When can safely access mutex protected variable without locking? - concurrency

A common pattern of storing config in my code is a "map[string]interface{}" protected by RWMutex, but usually after app initiated (could be triggered in multiple go-routine), the map becomes totally readonly. So I have a feeling that from some point of time on, the RWMutex on read should be unnecessary.
An example of this config map is at
http://play.golang.org/p/tkbj9DBok_
One fact that brought me to think of this is in some of production code it actually doing this way of unprotected access of shared object (though it's mostly readonly after it's init'ed), I understand normal way of using RWMutex to protect, but interesting part is this malformed code haven't run into problem in past months.
Is that true that after some accurate "time point" that writes are flushed from cache into memory and with a guarantee of no more writes needed, reads can actually go without RWMutex.RLock? If YES, when is the time point or how to setup the conditions before lockless access?

As long as no one is modifying the map, it should be safe for multiple threads to read it at once. Unfortunately, without any locking you'll have no way to make sure no one else is reading the map when you want to update it.
So one solution is to never update the map, but instead replace it atomically. The read-copy-update algorithm could be used here. Rather than directly accessing the map, so you need to dereference the pointer to access the map. To update it, you can do the following:
acquire an "update lock" mutex.
make a copy of the map. You want to copy all keys/values manually: simple assignment won't work because maps are reference types.
make your changes to the copy of the map.
use StorePointer from the sync/atomic package to atomically update the pointer to the live map to point to your new map.
release the mutex.
Everything that runs before the atomic update in (4) will see the old map, and everything after will see the new map. At no point will those goroutines be reading from a map that is being written to, so there is no need for an RWMutex.

RLock and RUnlock are very fast operations. If there are no writers they are essentially lockless and take just 1 atomic operation each (http://golang.org/src/sync/rwmutex.go?h=RLock#L29). So unless your application performs poorly because reading configuration is slow I would suggest to use RWLock.
Note, that my initial answer was a proposal to implement Freeze() operation for Register but later I realized that the correct implementation won't be any faster than using RWLock.

Related

Backing up a running rocksdb-instance

I would like to backup a running rocksdb-instance to a location on the same disk in a way that is safe, and without interrupting processing during the backup.
I have read:
Rocksdb Backup Instructions
Checkpoints Documentation
Documentation in rocksdb/utilities/{checkpoint.h,backupable_db.{h,cc}}
My question is whether the call to CreateNewBackupWithMetadata is marked as NOT threadsafe to express, that two concurrent calls to this function will have unsafe behavior, or to indicate that ANY concurrent call on the database will be unsafe. I have checked the implementation, which appears to be creating a checkpoint - which the second article claims are used for online backups of MyRocks -, but I am still unsure, what part of the call is not threadsafe.
I currently interpret this as, it is unsafe, because CreateBackup... calls DisableFileDeletions and later EnableFileDeletions, which, of course, if two overlapping calls are made, may cause trouble. Since the SST-files are immutable, I am not worried about them, but am unsure whether modifying the WAL through insertions can corrupt the backup. I would assume that triggering a flush on backup should prevent this, but I would like to be sure.
Any pointers or help are appreciated.

I ended up looking into the implementation way deeper, and here is what I found:
Recall a rocksdb database consists of Memtables, SSTs and a single WAL, which protects data in the Memtables against crashes.
When you call rocksdb::BackupEngine::CreateBackupWithMetadata, there is no lock taken internally, so this call can race, if two calls are active at the same time. Most notably this call does Disable/EnableFileDeletions, which, if called by one call, while another is still active spells doom for the other call.
The process of copying the files from the database to the backup is protected from modifications while the call is active by creating a rocksdb::Checkpoint, which, if flush_before_backup was set to true, will first flush the Memtables, thus clearing the active WAL.
Internally the call to CreateCustomCheckpoint calls DB::GetLiveFiles in db_filecheckpoint.cc. GetLiveFiles takes the global database lock (_mutex), optionally flushes the Memtables, and retrieves the list of SSTs. If a flush in GetLiveFiles happens while holding the global database-lock, the WAL must be empty at this time, which means the list should always contain the SST-files representing a complete and consistent database state from the time of the checkpoint. Since the SSTs are immutable, and since file deletion through compaction is turned off by the backup-call, you should always get a complete backup without holding writes on the database. However this, of course, means it is not possible to determine the exact last write/sequence number in the backup when concurrent updates happen - at least not without inspecting the backup after it has been created.
For the non-flushing version, there maybe WAL-files, which are retrieved in a different call than GetLiveFiles, with no lock held in between, i.e. these are not necessarily consistent, but I did not investigate further, since the non-flushing case was not applicable to my use.

Guarded Data Design Pattern

In our application we deal with data that is processed in a worker thread and accessed in a display thread and we have a mutex that takes care of critical sections. Nothing special.
Now we thought about re-working our code where currently locking is done explicitely by the party holding and handling the data. We thought of a single entity that holds the data and only gives access to the data in a guarded fashion.
For this, we have a class called GuardedData. The caller can request such an object and should keep it only for a short time in local scope. As long as the object lives, it keeps the lock. As soon as the object is destroyed, the lock is released. The data access is coupled with the locking mechanism without any explicit extra work in the caller. The name of the class reminds the caller of the present guard.
template<typename T, typename Lockable>
class GuardedData {
GuardedData(T &d, Lockable &m) : data(d), guard(m) {}
boost::lock_guard<Lockable> guard;
T &data;
T &operator->() { return data; }
};
Again, a very simple concept. The operator-> mimics the semantics of STL iterators for access to the payload.
Now I wonder:
Is this approach well known?
Is there maybe a templated class like this already available, e.g. in the boost libraries?
I am asking because I think it is a fairly generic and usable concept. I could not find anything like it though.

Depending upon how this is used, you are almost guaranteed to end up with deadlocks at some point. If you want to operate on 2 pieces of data then you end up locking the mutex twice and deadlocking (unless each piece of data has its own mutex - which would also result in deadlock if the lock order is not consistent - you have no control over that with this scheme without making it really complicated). Unless you use a recursive mutex which may not be desired.
Also, how are your GuardedData objects passed around? boost::lock_guard is not copyable - it raises ownership issues on the mutex i.e. where & when it is released.
Its probably easier to copy parts of the data you need to the reader/writer threads as and when they need it, keeping the critical section short. The writer would similarly commit to the data model in one go.
Essentially your viewer thread gets a snapshot of the data it needs at a given time. This may even fit entirely in a cpu cache sitting near the core that is running the thread and never make it into RAM. The writer thread may modify the underlying data whilst the reader is dealing with it (but that should invalidate the view). However since the viewer has a copy it can continue on and provide a view of the data at the moment it was synchronized with the data.
The other option is to give the view a smart pointer to the data (which should be treated as immutable). If the writer wishes to modify the data, it copies it at that point, modifies the copy and when completes, switches the pointer to the data in the model. This would necessitate blocking all readers/writers whilst processing, unless there is only 1 writer. The next time the reader requests the data, it gets the fresh copy.

Well known, I'm not sure. However, I use a similar mechanism in Qt pretty often called a QMutexLocker. The distinction (a minor one, imho) is that you bind the data together with the mutex. A very similar mechanism to the one you've described is the norm for thread synchronization in C#.
Your approach is nice for guarding one data item at a time but gets cumbersome if you need to guard more than that. Additionally, it doesn't look like your design would stop me from creating this object in a shared place and accessing the data as often as I please, thinking that it's guarded perfectly fine, but in reality recursive access scenarios are not handled, nor are multi-threaded access scenarios if they occur in the same scope.
There seems to be to be a slight disconnect in the idea. Its use conveys to me that accessing the data is always made to be thread-safe because the data is guarded. Often, this isn't enough to ensure thread-safety. Order of operations on protected data often matters, so the locking is really scope-oriented, not data-oriented. You could get around this in your model by guarding a dummy object and wrapping your guard object in a temporary scope, but then why not just use one the existing mutex implementations?
Really, it's not a bad approach, but you need to make sure its intended use is understood.

How to modify a data structure while a process is already accessing it ?

I have written a program (suppose X) in c++ which creates a data structure and then uses it continuously.
Now I would like to modify that data structure without aborting the previous program.
I tried 2 ways to accomplish this task :
In the same program X, first I created data structure and then tried to create a child process which starts accessing and using that data structure for some purpose. The parent process continues with its execution and asks the user for any modification like insertion, deletion, etc and takes input from console and subsequently modification is done. The problem here is, it doesn't modify the copy of data structure that the child process was using. Later on, I figured out this won't help because the child process is using its own copy of data structure and hence modifications done via parent process won't be reflected in it. But definitely, I didn't want this to happen. So I went for multithreading.
Instead of creating child process, I created an another thread which access that data structure and uses it and tried to take user input from console in different thread. Even,
this didn't work because of very fast switching between threads.
So, please help me to solve this issue. I want the modification to be reflected in the original data structure. Also I don't want the process (which is accessing and using it continuously) to wait for sometimes since it's time crucial.

First point: this is not a trivial problem. To handle it at all well, you need to design a system, not just a quick hack or two.
First of all, to support the dynamic changing, you'll almost certainly want to define the data structure in code in something like a DLL or .so, so you can load it dynamically.
Part of how to proceed will depend on whether you're talking about data that's stored strictly in memory, or whether it's more file oriented. In the latter case, some of the decisions will depend a bit on whether the new form of a data structure is larger than an old one (i.e., whether you can upgrade in place or no).
Let's start out simple, and assume you're only dealing with structures in memory. Each data item will be represented as an object. In addition to whatever's needed to access the data, each object will provide locking, and a way to build itself from an object of the previous version of the object (lazily -- i.e., on demand, not just in the ctor).
When you load the DLL/.so defining a new object type, you'll create a collection of those the same size as your current collection of existing objects. Each new object will be in the "lazy" state, where it's initialized, but hasn't really been created from the old object yet.
You'll then kick off a thread that walks makes the new collection known to the rest of the program, then walks through the collection of new objects, locking an old object, using it to create a new object, then destroying the old object and removing it from the old collection. It'll use a fairly short timeout when it tries to lock the old object (i.e., if an object is in use, it won't wait for it very long, just go on to the next. It'll iterate repeatedly until all the old objects have been updated and the collection of old objects is empty.
For data on disk, things can be just about the same, except your collections of objects provide access to the data on disk. You create two separate files, and copy data from one to the other, converting as needed.
Another possibility (especially if the data can be upgraded in place) is to use a single file, but embed a version number into each record. Read some raw data, check the version number, and use appropriate code to read/write it. If you're reading an old version number, read with the old code, convert to the new format, and write in the new format. If you don't have space to update in place, write the new record to the end of the file, and update the index to indicate the new position.

Your approach to concurrent access is similar to sharing a cake between a classroom full of blindfolded toddlers. It's no surprise that you end up with a sticky mess. Each toddler will either have to wait their turn to dig in or know exactly which part of the cake she alone can touch.
Translating to code, the former means having a lock or mutex that controls access to a data structure so that only one thread can modify it at any time.
The latter can be done by having a data structure that is modified in place by threads that each know exactly which parts of the data structure they can update, e.g. by passing a struct with details on which range to update, effectively splitting up the data beforehand. These should not overlap and iterators should not be invalidated (e.g. by resizing), which may not be possible for a given problem.
There are many many algorithms for handling resource competition, so this is grossly simplified. Distributed computing is a significant field of computer science dedicated to these kinds problems; study the problem (you didn't give details) and don't expect magic.

Mutex lock on write only

I have a multithreaded C++ application which holds a complex data structure in memory (cached data).
Everything is great while I just read the data. I can have as many threads as I want access the data.
However the cached structure is not static.
If the requested data item is not available it will be read from database and is then inserted into the data tree. This is probably also not problematic and even if I use a mutex while I add the new data item to the tree that will only take few cycles (it's just adding a pointer).
There is a Garbage Collection process that's executed every now and then. It removes all old items from the tree. To do so I need to lock the whole thing down to make sure that no other process is currently accessing any data that's going to be removed from memory. I also have to lock the tree while I read from the cache so that I don't remove items while they are processed (kind of "the same thing the other way around").
"Pseudocode":
function getItem(key)
lockMutex()
foundItem = walkTreeToFindItem(key)
copyItem(foundItem, safeCopy)
unlockMutex()
return safeCopy
end function
function garbageCollection()
while item = nextItemInTree
if (tooOld) then
lockMutex()
deleteItem(item)
unlockMutex()
end if
end while
end function
What's bothering me: This means, that I have to lock the tree while I'm reading (to avoid the garbage collection to start while I read). However - as a side-effect - I also can't have two reading processes at the same time anymore.
Any suggestions?
Is there some kind of "this is a readonly action that only collides with writes" Mutex?

Look into read-write-lock.
You didn't specify which framework can you use but both pThread and boost have implemented that pattern.

The concept is a "shared reader, single writer" lock as others have stated. In Linux environments you should be able to use pthread_rwlock_t without any framework. I would suggest looking into boost::shared_lock as well.

I suggest a reader-writer lock. The idea is you can acquire a lock for "reading" or for "writing", and the lock will allow multiple readers, but only one writer. Very handy.

In C++17 this type of access (multiple read, single write) is supported directly with std::shared_mutex. I think this was adopted from boost::shared_lock. The topic is also described with an example in Anthony Williams's C++ Concurrency in Action.
https://livebook.manning.com/book/c-plus-plus-concurrency-in-action-second-edition/chapter-3/185.
The essential bits are:
use std::shared_lock on reads where shared access is allowed
existing shared_locks don't prevent a new shared_lock from locking
use std::unique_lock on update/write functions, where exclusive access is needed
will wait until all existing shared reads complete, as well as other writes
precludes any other reads or writes while locked

Multi-Threaded Deep Copies

What's the best way to perform a deep copy in the constructor of an object that is part of multi-threaded C++ code?

Depends on your data structures.
I'd guess the problem you're facing (although you don't say so) is potential locking inversion. If you're deep copying, then you will presumably take one or more locks for the various objects you need to copy.
If you can define a DAG (that is, a partial order) where the nodes are every lock in the system, and every combination of locks you might ever want to take is connected by edges, then you can ensure that locks are never taken in different orders in different threads. Hence in particular you won't get locking inversion. A typical rule is to take the "most general" lock last, since this tends to minimise contention.
But if you are deep copying one of a whole bunch of "WidgetBoxes", each containing "Widgets" which are basically indistinguishable, with possible overlap between the contents of boxes, then you have a problem defining a locking order naturally. Further, you'd have to lock the WidgetBox first (even though it's the "most general" object), since without that lock you can't tell what else you need to lock. If Widgets are comparable, you might be able to lock each in order, do your copy, and release everything. Nasty.
An alternative is to define a single lock shared between all Widgets and WidgetBoxes. This might introduce a lot of contention, in which case optimistic locking may improve things, provided that not too many modifications occur concurrently with copies.
Another alternative may be to relax the guarantees you make about the copy - instead of requiring that the full copy be made from an identifiable state of the deep structure, you could first lock the WidgetBox, shallow copy it (using refcounting or whatever - the lock on a refcount is typically an "ultimate inner lock" and hence not an inversion risk), release the WidgetBox lock, then copy each of the Widgets in turn. Use the same approach to copying the Widgets if they have internal structure. The result might contain a Widget in a state it did not achieve until after it was removed from the WidgetBox in another thread, or other such incongruities, so if that's not acceptable then you can't use this approach. But if you only ever lock one object at a time in each thread, then you can't get locking inversion.
A final, possible "nuclear" option, is to make everything immutable, and always copy on modification. If nothing can ever be modified, then you don't need any locks (although you do still need memory barriers when passing references between threads).
If none of these works, then you're out of my experience unless I've forgotten something. I'd speculate that database implementation is somewhere that a lot of lock-related cleverness goes on, and that would be the area to turn for ideas.

fork()
I'm just kidding. But it was too funny to pass up :)
I think onebyone has covered most of your options. Except, perhaps, stepping back and seeing if you can find a way to avoid doing the deep copy in the first place...

My first impulse (I'm not an expert):
Have a lock on that object that the code uses for writing to it. When you're deep copying, lock the object, perform your deep copy, and then unlock it.
Or, am I missing something here?

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js