How can I synchronize two processes accessing a file on a NAS? - c++

Here's the thing: I have two applications, written in C++ and running on two machines with different OS (one Linux and one Windows). One of this process is in charge of updating an XML file on a NAS (Network Attached Storage) while the other one reads this file.
Is it possible to synchronize these two processes in order to avoid reading of the file at the same time it's being modified?

You could create a lock file on the server that is created before you do a write, wait then write and delete on completion., Have the read process check for the token before reading the file.
Edit: To address the comments, you can implement a double-checked locking type pattern. Have both reader and writer have a locking file and double check before you do work, something like:
Reader: Check for write lock file, create read lock file, check for write lock file, if exists delete read file and abort.
Writer: Check for read lock file, create write lock file, check for read lock file, if exists delete write lock file and abort.
This will stop your processes trampling on each other but a potential race condition may occur in that the you could potentially have both processes check, create then recheck simultaneously though this will not cause the data to be read in an inconsistent state but will cause both read and write processes to abort for your specified delay

Thank you all for your answers.
At last we managed to resolve our problem, not by using locking commands of the OS (because we were not sure they would propagate correctly to the OS of the NAS head), but by creating lock directories instead of lock files. Directory creation is an atomic operation, and returns an error value if the folder already exists. Therefore, we don't have to check the lock existence before acquiring it, both operations are made in a single step.

OK you need some form of locking mechanism to control accesses.
Most *nix File systems provide this. I suspect it is also available on Windows File System (as this mechanism is used by perl) but it may have another name.
Take a look at the flock().
This is a file locking mechanism. It is an advisory lock so it does not actually lock the file and prevent usage but it provides a mechanism for marking the file. If both applications use the mechanism then you can control accesses to the file.
flock() provides both shared locks (or READ Lock) and exclusive locks (or WRITE Lock). flock will block your thread (in a non busy way) until the file has been unlocked by the user (it also provides NON blocking checks so you can do other things while waiting).
Check out flock in section 2 of the man pages.
int flock(int fd, int operation);
Flock() applies or removes an advisory lock on the file associated with the file
descriptor fd. A lock is applied by specifying an operation parameter that is
one of LOCK_SH or LOCK_EX with the optional addition of LOCK_NB. To unlock an
existing lock operation should be LOCK_UN.

If the files reside on an NFS share you can use fcntl(2) to lock the file. Check question D10 in the Linux NFS FAQ. I have very little experience with windows APIs but from what I've heard they have good POSIX support so you should be able to use fcntl as long as they support POSIX.1-2001.
If you are accessing the files using different protocols (i.e. AFS or SMB) maybe you could set up a simple synchronization server that manages locks via an IPC interface?

Would it be possible to switch from files to a database?
This type of concurency is something that DBMSs manage very well. It need no be expensive or difficult to install. MySql, Postgress or JavaDB would all handle this elegantly at little or no cost.
Failing the database option I would have the writing process write to a "hidden" file name like ".updateinprogress.xml" and rename the file when the update is complete. On most systems "mv" or "ren" is an atomic operation so the reading process either picks up hte old file or the newer file but never a half written one.

Related

Safe access to file from two processes

Suppose I have two processes. One always resides in memory and periodically reads some settings from a file on a disk. If it detects that settings was changed then it applies them.
The other process runs under command line by demand and modifies the settings. Thus the first process only read the file and never write to it while the second can only write to the file.
Should I synchronize the access to the file to ensure that the first process will always get consistent settings i.e. before or after modifications not some intermediate contents? If yes, what is the simplest way to do this in C++.
I'm interested mainly in cross-platform ways. But also curious about Windows- and/or Linux-specific ones.
Use a named semaphore and require either process to hold the semaphore before editing the file on disk. Named semaphores can be connected to by any running application.
Look at man 7 sem_overview for more information on named semaphores on linux machines.
The closest equivalent for windows I can find is http://msdn.microsoft.com/en-us/library/windows/desktop/ms682438(v=vs.85).aspx
You are using C++ so your first option should be to check through the usual cross-platform libs - POCO, Boost, ACE, and so forth to see if there is anything that already does what you require.
You really have two separate issues: (1) file synchronization and (2) notification.
On Linux to avoid having your daemon constantly polling to see if a file has changed you can use inotify calls and set up events that will tell you when the file has been changed by the command line program. It might be simplest to look for IN_CLOSE_WRITE events since a CL prog will presumably be opening, changing, and closing the file.
For synchronization, since you are in control of both programs, you can just use file or record locking e.g. lockf, flock or fcntl.
The most obvious solution is to open the file in exclusive mode. If the file can not be opened, wait some time and try to open the file again. This will prevent possible access/modification conflicts.
The benefit of this approach is that it's simple and doesn't have significant drawbacks.
Of course you could use some synchronization primitives (Mutex, Semaphore depending on the OS) but this would be an overkill in your scenario, when speedy response is not required (waiting 200 msec between open attempts is fine, and writing of config file won't take more).

is there a way to synchronize write to a file among different processes (not threads)

My project requires being run on several different physical machines, which have shared file system among them. One problem arising out of this is how to synchronize write to a common single file. With threads, that can be easily achieved with locks, however my program consists of processes distributed on different machines, which I have no idea how to synchronize. In theory, any way to check whether a file is being opened right now or any lock-like solutions will do, but I just cannot crack out this by myself. A python way would be particularly appreciated.
Just a thought...
Couldn't you put a 'lock' file in the same directory as the file your trying to write to? In your distributed processes check for this lock file. If it exists sleep for x amount and try again. Likewise, when the process that currently has the file open finishes the process deletes the lock file?
So if you have in the simple case 2 processes called A and B:
Process A checks for lock file and if it doesn't exist it creates the lock file and does what it needs to with the file. After it's done it deletes this lock file.
If process A detects the lock file then that means process B has the file, so sleep and try again later....rinse repeat.
There are file locking mechanism to multiple process, even written from multiple languages.
Operating System specific locking mechanisms are used for this. Java JNI, C++, etc languages have implemented these locking mechanisms to synchronize file access within multiple OS Processes[within LTs-Lightweight Threads].
Look in to your Language specific Native file synchronization mechanisms for this.
Following is a Java based Sample:
FileInputStream in = new FileInputStream(file);
try {
java.nio.channels.FileLock lock = in.getChannel().lock();
try {
Reader reader = new InputStreamReader(in, charset);
...
} finally {
lock.release();
}
} finally {
in.close();
}
This locking should be OS independent [work in Unix like systems, Windows, etc].
For this kind to scenarios, I suggest to use Double Locking for better access controlling.

C++ how to check if file is in use - multi-threaded multi-process system

C++:
Is there a way to check if a file has been opened for writing by another process/ class/ device ?
I am trying to read files from a folder that may be accessed by other processes for writing. If I read a file that is simultaneously being written on, both the read and the write process give me errors (the writing is incomplete, I might only get a header).
So I must check for some type of condition before I decide whether to open that specific file.
I have been using boost::filesystem to get my file list. I want compatibility with both Unix and Windows.
You must use a file advisory lock. In Unix, this is flock, in Windows it is LockFile.
However, the fact that your reading process is erroring probably indicates that you have not opened the file in read-only mode in that process. You must specify the correct flags for read-only access or from the OS' perspective you have two writers.
Both operating systems support reader-writer locks, where unlimited readers are allowed, but only in the absence of writers, and only at most one writer at a time will have access.
Since you say your system is multi-process (ie, not multi thread), you can't use a condition variable (unless it's in interprocess shared memory). You also can't use a single writer as a coordinator unless you're willing to shuttle your data there via sockets or shared memory.
From what I understand about boost::filesystem, you're not going to get the granularity you need from that feature-set in order to perform the tasks you're requesting. In general, there are two different approaches you can take:
Use a synchronization mechanism such as a named semaphore visible at the file-system level
Use file-locks (i.e., fcntl or flock on POSIX systems)
Unfortunately both approaches are going to be platform-specific, or at least specific to POSIX vs. Win32.
A very nice solution can be found here using Sutter's active object https://sites.google.com/site/kjellhedstrom2/active-object-with-cpp0x
This is quite advanced but really scaled well on many cores.

boost interprocess file_lock does not work with multiple processes

I seem to be having an issue with boost::interprocess::file_lock
I have process 1 that is essentially
boost::interprocess::file_lock test_lock("testfile.csv");
test_lock.lock();
sleep(1000);
test_lock.unlock();
When I run a second process while the first process is sleeping, I find that I am still able to read testfile.csv. What's worse, I can even overwrite it.
Am I misinterpreting how file_lock works? I was under the impression that calling .lock() gives it exclusive lock over the file and prevents any other process from read/modifying the file.
file_lock is not for locking a file. It is a mutual exclusion object that uses a file as its backing technology. The contents of the file are basically irrelevant; what is relevant is that all instances of a file_lock pointing to that file will respect the locking characteristics of the lock.
As with any mutex-type object, the lock itself is there to protect or otherwise meter access to some other resource.
It has nothing to do with filesystem protection of files.
Reference
To ensure portability the kind of lock you are expecting doesn't exist in boost. There is no kernel level enforceable lock on the Unix/Linux/OSX OS's that boost supports.
see:
http://www.boost.org/doc/libs/1_51_0/doc/html/interprocess/synchronization_mechanisms.html#interprocess.synchronization_mechanisms.file_lock
Because of this the boost interprocess lock is an advisory or cooperative lock. You can accomplish what you are trying to do with boost::interprocess::file_lock but, you must also use the interprocess::file_lock in other processes that might try to read/write the file. Simply try to acquire the lock in your other process before accessing the file.
This is how interprocess::file_lock was designed to be used. You can do it some OS specific way that enforces kernel level locking if you are on some OS's, but if you use the boost::interprocess::file_lock, your code will be portable.
Its not a filesystem lock.
Simply locking in one process is not going to prevent another process from accessing the file. Think of it "like" a mutex. To have your second process honor the file_lock, you need to acquire the lock in the 2nd process as well. Then you will see that proc2 will block, waiting for proc1 to release the lock.

Reading a file in a process while a process is writing in c++

I have a file increasing by time, and need to read the file without any race condition or something in another process in C++ on Windows.
Writing a file is given, and there is no room I can play with it. Only thing I can do is reading it gracefully.
Do you have any idea to handle this case well?
TIA
In Win32 you would have to make sure that every writer opens the file with at least read share access, and every reader opens the file with at least write share access. Further sharing would be required if you have >1 reader or >1 writer.
See here for CreateFile docs, dwShareMode parameter.
You'll almost certainly need to use CreateFile (In both processes) to allow sharing the file at all. If the writing application opens the file in exclusive sharing mode and keeps it open, the reading application won't be able to open the file at all.
From there, preventing race conditions is fairly straightforward: each process will typically use LockFile or LockFileEx to lock a section of the file for exclusive access while it uses data in that section of the file. In general, you want to keep that period of time as short as possible, so you'll lock the section, read/write, and unlock, all about as quickly as possible (i.e., without doing anything else, if you can avoid it).