How to modify a data structure while a process is already accessing it? - C++

I have written a program (call it X) in C++ that creates a data structure and then uses it continuously.
Now I would like to modify that data structure without aborting the program.
I tried two ways to accomplish this:
In the same program X, I first created the data structure and then created a child process that starts accessing and using it. The parent process continues with its execution, asks the user for modifications such as insertion or deletion, takes the input from the console, and performs the modification. The problem is that the modification doesn't affect the copy of the data structure the child process is using. I later figured out why this can't work: the child process uses its own copy of the data structure, so modifications made by the parent process are not reflected in it. That is definitely not what I wanted, so I moved on to multithreading.
Instead of creating a child process, I created another thread that accesses and uses the data structure, and took user input from the console in a different thread. Even this didn't work, because of the very fast switching between threads.
So please help me solve this issue. I want modifications to be reflected in the original data structure. I also don't want the process that is accessing and using it continuously to have to wait, since it is time-critical.

First point: this is not a trivial problem. To handle it at all well, you need to design a system, not just a quick hack or two.
First of all, to support changing the structure dynamically, you'll almost certainly want to define the data structure in code that lives in something like a DLL or .so, so you can load it dynamically.
Part of how to proceed will depend on whether you're talking about data that's stored strictly in memory, or whether it's more file oriented. In the latter case, some of the decisions will depend a bit on whether the new form of a data structure is larger than the old one (i.e., whether you can upgrade in place or not).
Let's start out simple, and assume you're only dealing with structures in memory. Each data item will be represented as an object. In addition to whatever's needed to access the data, each object will provide locking, and a way to build itself from an object of the previous version of the object (lazily -- i.e., on demand, not just in the ctor).
When you load the DLL/.so defining a new object type, you'll create a collection of the new objects the same size as your current collection of existing objects. Each new object will be in the "lazy" state, where it's initialized, but hasn't really been created from the old object yet.
You'll then kick off a thread that first makes the new collection known to the rest of the program, then walks through the collection of new objects, locking the corresponding old object, using it to create the new object, then destroying the old object and removing it from the old collection. It'll use a fairly short timeout when it tries to lock an old object (i.e., if an object is in use, it won't wait for it very long, just go on to the next one). It'll iterate repeatedly until all the old objects have been converted and the collection of old objects is empty.
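To make that concrete, here is a minimal sketch of such a migration thread, assuming structures purely in memory. OldObject, NewObject and fromOld are hypothetical placeholders, and a real system would also need to stop readers from reaching an old object once it has been destroyed; the point is the short try-lock timeout and the repeated passes:

#include <chrono>
#include <memory>
#include <mutex>
#include <vector>

struct OldObject {
    std::timed_mutex lock;   // each object provides its own locking
    int payload = 0;         // stand-in for the real data
};

struct NewObject {
    bool built = false;      // "lazy" state: not yet created from the old object
    long payload = 0;
    void fromOld(const OldObject& o) { payload = o.payload; built = true; }
};

void migrate(std::vector<std::unique_ptr<OldObject>>& oldObjs,
             std::vector<NewObject>& newObjs)
{
    using namespace std::chrono_literals;
    bool anyLeft = true;
    while (anyLeft) {                       // keep passing over the collection
        anyLeft = false;
        for (std::size_t i = 0; i < oldObjs.size(); ++i) {
            if (!oldObjs[i]) continue;      // already converted
            // Short timeout: if the object is in use, skip it and come back later.
            if (!oldObjs[i]->lock.try_lock_for(10ms)) { anyLeft = true; continue; }
            newObjs[i].fromOld(*oldObjs[i]);
            oldObjs[i]->lock.unlock();
            oldObjs[i].reset();             // destroy the old object; its slot now counts as removed
        }
    }
}
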
For data on disk, things can be just about the same, except your collections of objects provide access to the data on disk. You create two separate files, and copy data from one to the other, converting as needed.
Another possibility (especially if the data can be upgraded in place) is to use a single file, but embed a version number into each record. Read some raw data, check the version number, and use appropriate code to read/write it. If you're reading an old version number, read with the old code, convert to the new format, and write in the new format. If you don't have space to update in place, write the new record to the end of the file, and update the index to indicate the new position.
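As an illustration of the version-number scheme, here is a small sketch. RecordV1, RecordV2 and upgrade are made-up names, and a real format would also carry record lengths and an index; the idea is just to branch on the version byte and convert old records as you meet them:

#include <cstdint>
#include <cstdio>

#pragma pack(push, 1)
struct RecordV1 { std::uint8_t version; std::uint32_t value; };
struct RecordV2 { std::uint8_t version; std::uint64_t value; std::uint32_t flags; };
#pragma pack(pop)

RecordV2 upgrade(const RecordV1& old) {
    return RecordV2{2, old.value, 0};
}

bool readRecord(std::FILE* f, RecordV2& out) {
    std::uint8_t version;
    if (std::fread(&version, 1, 1, f) != 1) return false;
    if (version == 1) {                       // old format: read with the old layout, then convert
        RecordV1 r1{version, 0};
        if (std::fread(&r1.value, sizeof r1.value, 1, f) != 1) return false;
        out = upgrade(r1);                    // caller writes it back out in the V2 format
    } else {                                  // current format
        out.version = version;
        if (std::fread(&out.value, sizeof out.value, 1, f) != 1) return false;
        if (std::fread(&out.flags, sizeof out.flags, 1, f) != 1) return false;
    }
    return true;
}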

Your approach to concurrent access is similar to sharing a cake between a classroom full of blindfolded toddlers. It's no surprise that you end up with a sticky mess. Each toddler will either have to wait their turn to dig in or know exactly which part of the cake they alone can touch.
Translating to code, the former means having a lock or mutex that controls access to a data structure so that only one thread can modify it at any time.
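For the first option, a minimal sketch, assuming a shared std::map and placeholder thread bodies; the single mutex makes every read and write of the structure take turns:

#include <map>
#include <mutex>
#include <string>
#include <thread>

std::map<std::string, int> table;   // the shared data structure
std::mutex tableMutex;              // guards every access to it

void worker() {
    for (int i = 0; i < 1000; ++i) {
        std::lock_guard<std::mutex> guard(tableMutex);  // only one thread inside at a time
        ++table["hits"];
    }
}

void updater() {
    std::lock_guard<std::mutex> guard(tableMutex);
    table.erase("hits");            // modifies the one shared map, so every thread sees it
}

int main() {
    std::thread a(worker), b(worker);
    std::thread u(updater);         // the "modification" runs concurrently with the workers
    a.join(); b.join(); u.join();
}
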
The latter can be done by having a data structure that is modified in place by threads that each know exactly which parts of the data structure they can update, e.g. by passing a struct with details on which range to update, effectively splitting up the data beforehand. These should not overlap and iterators should not be invalidated (e.g. by resizing), which may not be possible for a given problem.
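And a sketch of the second option, where the data is split up beforehand and each thread touches only its own, non-overlapping range (the vector size and the per-element work are arbitrary):

#include <thread>
#include <vector>

void updateRange(std::vector<int>& data, std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i)
        data[i] *= 2;                       // touches only this thread's slice
}

int main() {
    std::vector<int> data(1'000'000, 1);    // sized up front: no resizing, so no invalidation
    std::size_t half = data.size() / 2;
    std::thread t1(updateRange, std::ref(data), std::size_t{0}, half);
    std::thread t2(updateRange, std::ref(data), half, data.size());
    t1.join();
    t2.join();
}
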
There are many, many algorithms for handling resource contention, so this is grossly simplified. Distributed computing is a significant field of computer science dedicated to these kinds of problems; study the problem (you didn't give details) and don't expect magic.

Backing up a running rocksdb-instance

I would like to back up a running rocksdb instance to a location on the same disk in a way that is safe, and without interrupting processing during the backup.
I have read:
Rocksdb Backup Instructions
Checkpoints Documentation
Documentation in rocksdb/utilities/{checkpoint.h,backupable_db.{h,cc}}
My question is whether the call to CreateNewBackupWithMetadata is marked as NOT threadsafe to express that two concurrent calls to this function will have unsafe behavior, or to indicate that ANY concurrent call on the database will be unsafe. I have checked the implementation, which appears to create a checkpoint (which the second article claims is used for online backups of MyRocks), but I am still unsure what part of the call is not threadsafe.
My current interpretation is that it is unsafe because CreateBackup... calls DisableFileDeletions and later EnableFileDeletions, which of course may cause trouble if two overlapping calls are made. Since the SST files are immutable, I am not worried about them, but I am unsure whether modifying the WAL through insertions can corrupt the backup. I would assume that triggering a flush on backup should prevent this, but I would like to be sure.
Any pointers or help are appreciated.
I ended up looking much deeper into the implementation, and here is what I found:
Recall that a rocksdb database consists of Memtables, SSTs and a single WAL, which protects data in the Memtables against crashes.
When you call rocksdb::BackupEngine::CreateNewBackupWithMetadata, no lock is taken internally, so this call can race if two calls are active at the same time. Most notably, this call does Disable/EnableFileDeletions; if one call re-enables file deletions while another is still active, that spells doom for the other call.
The process of copying the files from the database to the backup is protected from modifications while the call is active by creating a rocksdb::Checkpoint, which, if flush_before_backup was set to true, will first flush the Memtables, thus clearing the active WAL.
Internally the call to CreateCustomCheckpoint calls DB::GetLiveFiles in db_filecheckpoint.cc. GetLiveFiles takes the global database lock (_mutex), optionally flushes the Memtables, and retrieves the list of SSTs. If a flush in GetLiveFiles happens while holding the global database-lock, the WAL must be empty at this time, which means the list should always contain the SST-files representing a complete and consistent database state from the time of the checkpoint. Since the SSTs are immutable, and since file deletion through compaction is turned off by the backup-call, you should always get a complete backup without holding writes on the database. However this, of course, means it is not possible to determine the exact last write/sequence number in the backup when concurrent updates happen - at least not without inspecting the backup after it has been created.
For the non-flushing version, there may be WAL files, which are retrieved in a different call than GetLiveFiles, with no lock held in between, i.e. these are not necessarily consistent, but I did not investigate further, since the non-flushing case was not applicable to my use.
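Given that, the pragmatic fix on the caller's side is simply to make sure two backup calls never overlap. A minimal sketch, assuming the BackupEngine API declared in the backupable_db.h header cited above (the metadata string and the flush flag are just example arguments):

#include <mutex>
#include <string>
#include "rocksdb/db.h"
#include "rocksdb/utilities/backupable_db.h"

std::mutex backup_mutex;   // caller-side guard, since the call itself takes no lock

rocksdb::Status SafeBackup(rocksdb::BackupEngine* engine, rocksdb::DB* db,
                           const std::string& metadata) {
    std::lock_guard<std::mutex> guard(backup_mutex);   // only one backup at a time
    // flush_before_backup = true flushes the Memtables, so the backup does not
    // depend on the WAL (the flushing case discussed above).
    return engine->CreateNewBackupWithMetadata(db, metadata,
                                               /*flush_before_backup=*/true);
}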

When can you safely access a mutex-protected variable without locking?

A common pattern for storing config in my code is a map[string]interface{} protected by an RWMutex, but usually after the app is initialized (initialization could be triggered from multiple goroutines), the map becomes totally read-only. So I have a feeling that, from some point in time on, the RWMutex on reads should be unnecessary.
An example of this config map is at
http://play.golang.org/p/tkbj9DBok_
One fact that brought me to think of this is that some of our production code actually does this kind of unprotected access to a shared object (though it's mostly read-only after it's initialized). I understand the normal way of protecting it with an RWMutex, but the interesting part is that this malformed code hasn't run into problems in the past months.
Is it true that after some precise "point in time", once writes have been flushed from cache into memory and there is a guarantee that no more writes are needed, reads can actually proceed without RWMutex.RLock? If YES, when is that point, or how do I set up the conditions before lockless access?
As long as no one is modifying the map, it should be safe for multiple threads to read it at once. Unfortunately, without any locking you'll have no way to make sure no one else is reading the map when you want to update it.
So one solution is to never update the map, but instead replace it atomically. The read-copy-update algorithm could be used here. Rather than accessing the map directly, you access it through a pointer, so you need to dereference the pointer to get at the map. To update it, you can do the following:
acquire an "update lock" mutex.
make a copy of the map. You want to copy all keys/values manually: simple assignment won't work because maps are reference types.
make your changes to the copy of the map.
use StorePointer from the sync/atomic package to atomically update the pointer to the live map to point to your new map.
release the mutex.
Everything that runs before the atomic update in (4) will see the old map, and everything after will see the new map. At no point will those goroutines be reading from a map that is being written to, so there is no need for an RWMutex.
RLock and RUnlock are very fast operations. If there are no writers they are essentially lockless and take just one atomic operation each (http://golang.org/src/sync/rwmutex.go?h=RLock#L29). So unless your application performs poorly because reading the configuration is slow, I would suggest keeping the RWMutex.
Note that my initial answer was a proposal to implement a Freeze() operation for Register, but I later realized that a correct implementation wouldn't be any faster than using the RWMutex.

Best way to control access to a string object in multi-threaded program

I've got a "config" class that has a bunch of attributes that "mirror" configuration settings. A single instance of the class is shared throughout the code (using boost shared_ptr objects) and its attributes read by multiple threads (around 100).
Occasionally, the settings may change and a "monitor" thread updates the appropriate attributes in the object.
For integer and bool attributes, I'm using boost atomic so that when an update happens and the monitor thread sets the value, none of the read threads read it in a partially updated state.
However, for string attributes, I'm worried that making them atomic would hurt performance significantly. It seems like a good way to do it would be to have the string attributes actually be pointers to strings. Then, when an update happens, a new string object could be built, and the write to the shared attribute (the string pointer) would only involve writing the address of the new string object. I assume that write would be far quicker than writing a whole new string value into a shared string object.
Doing that, however, means I think I'd want to use shared_ptrs for the string attribs, so that a string object holding the previous value is automatically deleted once all read threads are using the updated string pointer attribute.
So to give an example:
#include <map>
#include <string>
#include <boost/atomic.hpp>
#include <boost/make_shared.hpp>
#include <boost/shared_ptr.hpp>

class Config
{
public:
    // The construct in question: an atomic holding a shared_ptr to the string.
    boost::atomic<boost::shared_ptr<std::string> > configStr1;

    void updateValueInMonitorThread(std::string newValue)
    {
        boost::shared_ptr<std::string> newValuePtr =
            boost::make_shared<std::string>(newValue);
        configStr1 = newValuePtr;   // write only the pointer, not the whole string
    }
};

void threadThatReadsConfig(boost::shared_ptr<Config> theConfig)
{
    std::map<std::string, std::string> thingImWorkingOn;
    thingImWorkingOn[*(theConfig->configStr1.load())] = "some value";
}
Is that overkill? Is there a better way to do it? I really don't like the way the reading threads have to access the value by dereferencing it and calling .load(). Also, is it even threadsafe, or does that stuff actually negate the safety features of the atomic and/or shared_ptr type?
I know I could use a mutex and read lock it when accessed in a "getter" and write lock it when the monitor thread updates the string's value, but I'd like to avoid that as I'm trying to keep the config class simple and it's going to have dozens, possibly hundreds of these string attributes.
Thanks in advance for any suggestions/info!
You are already giving each consumer a shared_ptr to the configuration object. So the threads won't notice if the configuration object isn't always the same object.
That is, when the main configuration changes, generate an entirely new configuration object. That seems like a lot of copying, but I'll bet it happens sufficiently rarely that you won't notice the overhead. Then you can swap the new configuration object in for the old one, and when all the consumers of the old object finish with it, it will disappear.
Obviously, this changes the semantics of the use of a configuration object. A long-running thread which would like to be able to notice configuration changes will have to periodically refresh its configuration object. The easiest way to do that would be just to acquire a new configuration object on every use of configuration data; again, that's unlikely to be too expensive, unless you use a configuration string in a hard loop.
On the plus side, you can make the entire configuration object const, which might allow for some optimizations.
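A minimal sketch of that approach, using std::shared_ptr with the std::atomic_load/std::atomic_store overloads (the Config fields and values here are placeholders); readers take a snapshot, and the old object is freed once the last reader drops it:

#include <memory>
#include <string>

struct Config {
    const std::string dbHost;   // the whole object is immutable once built
    const int timeoutMs;
};

std::shared_ptr<const Config> g_config =
    std::make_shared<const Config>(Config{"localhost", 500});

// Monitor thread: build an entirely new Config and publish it atomically.
void publish(std::string host, int timeoutMs) {
    std::atomic_store(&g_config,
                      std::make_shared<const Config>(Config{std::move(host), timeoutMs}));
}

// Reader threads: grab a snapshot and use it; no locks, no partially updated state.
void reader() {
    std::shared_ptr<const Config> snapshot = std::atomic_load(&g_config);
    // use snapshot->dbHost, snapshot->timeoutMs ...
}
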
The classical method of using mutex variables to lock shared resources (here your string objects) is not only the best but also the most efficient way of handling such situations; otherwise you may get into trouble because of incomplete protection, or you may end up with a solution that has more overhead. In some applications you may improve efficiency by using separate mutex locks for separate objects, so that while one object is being updated, the others remain accessible.
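For example, a per-attribute wrapper along these lines (a sketch using std::shared_mutex rather than the Boost types from the question, with made-up names) lets many readers proceed in parallel while the monitor thread takes the exclusive lock only for the brief write:

#include <mutex>
#include <shared_mutex>
#include <string>

class GuardedString {
public:
    std::string get() const {
        std::shared_lock<std::shared_mutex> lock(mutex_);   // shared: readers don't block each other
        return value_;
    }
    void set(std::string v) {
        std::unique_lock<std::shared_mutex> lock(mutex_);   // exclusive: the writer briefly blocks readers
        value_ = std::move(v);
    }
private:
    mutable std::shared_mutex mutex_;
    std::string value_;
};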

A shared List/Map among several threads in F#?

I'm writing a program that handles many blocking I/O operations at a time by spawning an Agent/MailboxProcessor per operation.
I've got a bunch of files I've cached in memory in a Map which I want to share among these agents. However, I've also got a FileSystemWatcher to callback whenever changes are made to the files, so that I can update the cache.
How do I make this happen without risking the cache being corrupted by multi-threaded reads and writes?
It seems to me that the Map is already based on pointers to objects, so would that automatically solve my problem as I'm simply changing the pointers to the new objects as they are loaded, or is this a broken understanding of it?
Thanks
It seems to me that the Map is already based on pointers to objects, so would that automatically solve my problem as I'm simply changing the pointers to the new objects as they are loaded, or is this a broken understanding of it?
I think your understanding is correct. You can just have a single mutable reference to an immutable Map. Writing a new map to the reference is atomic so there is no need to synchronize that.
When I've seen similar Erlang programs, the systems are set up like this:
You can wrap the FileSystemWatcher in a MailboxProcessor; that way you're handling incoming updates as messages rather than Windows events. Your FileSystemWatcherProcess can hold a list of children who are listening, and push out updates as needed. This is basically the same thing as event-based programming, only with messages and actors instead.
Your FileSystemWatcherProcess should not need to maintain your cache of files; it just blindly pushes out messages.
Or: you have a master process which holds the state of the map. The FileSystemWatcher sends updates to the master. Each child thread holds a reference to the master, so that each time it finishes processing an item or batch of items, it sends a message to the master process requesting the latest Map.
Neither system requires any locking.
Following up on Jon's answer: if you end up having multiple writers, then instead of locks you can always do CAS:

open System.Threading

// x is the shared cache: a mutable reference to an immutable Map.
let x : Map<string, byte[]> ref = ref Map.empty

let updateMap key value =
    let mutable success = false
    while not success do
        let v = !x                                    // snapshot the current map
        let result = Interlocked.CompareExchange(&x.contents, v.Add(key, value), v)
        success <- obj.ReferenceEquals(v, result)     // retry if another writer got in first
And if targeting only .NET 4.0 is an option for you, you shouldn't be inventing all that stuff yourself: there is a System.Collections.Concurrent.ConcurrentDictionary class which already implements a concurrently readable and writable dictionary.

What is the most efficient implementation of a java like object monitor in C++?

In Java each object has a synchronisation monitor. So I guess the implementation is pretty compact in terms of memory usage and hopefully fast as well.
When porting this to C++, what would be the best implementation for it? I think there must be something better than "pthread_mutex_init", or is the object overhead in Java really so high?
Edit: I just checked that pthread_mutex_t on Linux i386 is 24 bytes. That's huge if I have to reserve this space for each object.
In a sense it's worse than pthread_mutex_init, actually. Because of Java's wait/notify you kind of need a paired mutex and condition variable to implement a monitor.
In practice, when implementing a JVM you hunt down and apply every single platform-specific optimisation in the book, and then invent some new ones, to make monitors as fast as possible. If you can't do a really fiendish job of that, you definitely aren't up to optimising garbage collection ;-)
One observation is that not every object needs to have its own monitor. An object which isn't currently synchronised doesn't need one. So the JVM can create a pool of monitors, and each object could just have a pointer field, which is filled in when a thread actually wants to synchronise on the object (with a platform-specific atomic compare and swap operation, for instance). So the cost of monitor initialisation doesn't have to add to the cost of object creation. Assuming the memory is pre-cleared, object creation can be: decrement a pointer (plus some kind of bounds check, with a predicted-false branch to the code that runs gc and so on); fill in the type; call the most derived constructor. I think you can arrange for the constructor of Object to do nothing, but obviously a lot depends on the implementation.
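A sketch of that idea in C++, with a per-object atomic pointer and a compare-and-swap to install the monitor on first use (the pool is faked here by just allocating a Monitor; a real JVM would draw from and recycle a pool):

#include <atomic>
#include <condition_variable>
#include <mutex>

struct Monitor {
    std::mutex mutex;                 // paired mutex + condition variable,
    std::condition_variable cond;     // as Java's wait/notify requires
};

struct Object {
    std::atomic<Monitor*> monitor{nullptr};   // one pointer per object, nothing more
};

Monitor* monitorFor(Object& obj) {
    Monitor* m = obj.monitor.load(std::memory_order_acquire);
    if (m) return m;                          // fast path: already installed
    Monitor* fresh = new Monitor;             // stand-in for "grab one from the pool"
    Monitor* expected = nullptr;
    if (obj.monitor.compare_exchange_strong(expected, fresh))
        return fresh;                         // we won the race; our monitor is now published
    delete fresh;                             // someone else installed one first
    return expected;                          // use theirs
}
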
In practice, the average Java application isn't synchronising on very many objects at any one time, so monitor pools are potentially a huge optimisation in time and memory.
The Sun Hotspot JVM implements thin locks using compare and swap. If an object is locked, then the waiting thread waits on the monitor of the thread which locked the object. This means you only need one heavy lock per thread.
I'm not sure how Java does it, but .NET doesn't keep the mutex (or analog - the structure that holds it is called "syncblk" there) directly in the object. Rather, it has a global table of syncblks, and object references its syncblk by index in that table. Furthermore, objects don't get a syncblk as soon as they're created - instead, it's created on demand on the first lock.
I assume (note, I do not know how it actually does that!) that it uses atomic compare-and-exchange to associate the object and its syncblk in a thread-safe way:
Check the hidden syncblk_index field of our object for 0. If it's not 0, lock it and proceed, otherwise...
Create a new syncblk in global table, get the index for it (global locks are acquired/released here as needed).
Compare-and-exchange to write it into object itself.
If previous value was 0 (assume that 0 is not a valid index, and is the initial value for the hidden syncblk_index field of our objects), our syncblk creation was not contested. Lock on it and proceed.
If previous value was not 0, then someone else had already created a syncblk and associated it with the object while we were creating ours, and we have the index of that syncblk now. Dispose the one we've just created, and lock on the one that we've obtained.
Thus the overhead per-object is 4 bytes (assuming 32-bit indices into syncblk table) in best case, but larger for objects which actually have been locked. If you only rarely lock on your objects, then this scheme looks like a good way to cut down on resource usage. But if you need to lock on most or all your objects eventually, storing a mutex directly within the object might be faster.
Surely you don't need such a monitor for every object!
When porting from Java to C++, it strikes me as a bad idea to just copy everything blindly. The best structure for Java is not the same as the best for C++, not least because Java has garbage collection and C++ doesn't.
Add a monitor to only those objects that really need it. If only some instances of a type need synchronization then it's not that hard to create a wrapper class that contains the mutex (and possibly condition variable) necessary for synchronization. As others have already said, an alternative is to use a pool of synchronization objects with some means of choosing one for each object, such as using a hash of the object address to index the array.
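For the pooled variant, a small sketch: a fixed array of mutexes with the lock for an object chosen by hashing its address (the pool size is arbitrary, and unrelated objects occasionally sharing a mutex is harmless beyond some extra contention):

#include <array>
#include <cstdint>
#include <mutex>

class MutexPool {
public:
    std::mutex& forObject(const void* obj) {
        auto h = reinterpret_cast<std::uintptr_t>(obj);
        return pool_[(h >> 4) % pool_.size()];   // drop low bits (alignment), then index the pool
    }
private:
    std::array<std::mutex, 64> pool_;            // pool size is an arbitrary choice
};
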
I'd use the boost thread library or the new C++0x standard thread library for portability rather than relying on platform specifics at each turn. Boost.Thread supports Linux, MacOSX, win32, Solaris, HP-UX and others. My implementation of the C++0x thread library currently only supports Windows and Linux, but other implementations will become available in due course.