Autosaving files with multiple instances - c++

I'm writing a Qt/C++ program that does long-running simulations, and to guard against data loss, I wrote some simple autosave behaviour. The program periodically saves to the user's temp directory (using QDir::temp()), and if the program closes gracefully, this file is deleted. If the program starts up and sees the file in that directory, it assumes a previous instance crashed or was forcibly ended, and it prompts the user about loading it.
Now here is the complication - I'd like this functionality to work properly even if multiple instances of the program are used at once. So when the program loads, it can't just look for the presence of an autosave file. If it finds one, it needs to determine if that file was created by a running instance (in which case, there's nothing wrong and nothing to be done) or if it has been left over by an instance that crashed or was forcibly ended (in which case it should prompt the user about loading it).
My program is for Windows/Mac/Linux, so what would be the best way to implement this using Qt or otherwise in a cross-platform fashion?
Edit:
The comments suggested the use of the process identifier, which I can get using QCoreApplication::applicationPid(). I like this idea, but when the program loads and sees a file with a certain PID in the name, how can it look at the other running instances (if any) to see if there is a match?

You can simply use QSaveFile which, as the documentation states:
The QSaveFile class provides an interface for safely writing to files.
QSaveFile is an I/O device for writing text and binary files, without losing existing data if the writing operation fails.
While writing, the contents will be written to a temporary file, and if no error happened, commit() will move it to the final file. This ensures that no data at the final file is lost in case an error happens while writing, and no partially-written file is ever present at the final location. Always use QSaveFile when saving entire documents to disk.
As for multiple instances, you just need to reflect that in the filename.
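To make the PID idea from the question's edit concrete, here is a minimal sketch in plain C++ (no Qt), under a few assumptions: the file name pattern "mysim-autosave-<pid>.dat" and all function names are hypothetical, and the liveness check is best-effort only. The operating system can reuse PIDs, so a leftover file may occasionally match an unrelated live process.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

#ifdef _WIN32
#include <windows.h>
#else
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>
#endif

namespace fs = std::filesystem;

// PID of the calling process (QCoreApplication::applicationPid() in Qt code).
inline long long currentPid() {
#ifdef _WIN32
    return static_cast<long long>(GetCurrentProcessId());
#else
    return static_cast<long long>(getpid());
#endif
}

// Hypothetical naming scheme: <dir>/mysim-autosave-<pid>.dat
inline fs::path autosavePath(const fs::path& dir, long long pid) {
    assert(pid > 0);
    return dir / ("mysim-autosave-" + std::to_string(pid) + ".dat");
}

// Extracts the PID from an autosave file name, or returns -1 if the name does not match.
inline long long pidFromAutosave(const fs::path& file) {
    const std::string prefix = "mysim-autosave-";
    const std::string stem = file.stem().string();
    if (stem.rfind(prefix, 0) != 0) return -1;
    const std::string digits = stem.substr(prefix.size());
    if (digits.empty() || digits.size() > 18 ||
        digits.find_first_not_of("0123456789") != std::string::npos) return -1;
    return std::stoll(digits);
}

// Best-effort check whether a process with this PID currently exists.
inline bool isProcessAlive(long long pid) {
    if (pid <= 0) return false;
#ifdef _WIN32
    HANDLE h = OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION, FALSE,
                           static_cast<DWORD>(pid));
    if (!h) return false;
    DWORD code = 0;
    const bool alive = GetExitCodeProcess(h, &code) && code == STILL_ACTIVE;
    CloseHandle(h);
    return alive;
#else
    // Signal 0 sends nothing; it only performs the existence and permission checks.
    return kill(static_cast<pid_t>(pid), 0) == 0 || errno == EPERM;
#endif
}

// An autosave file is treated as orphaned when no live process owns its PID.
inline bool isOrphanedAutosave(const fs::path& file) {
    const long long pid = pidFromAutosave(file);
    return pid > 0 && !isProcessAlive(pid);
}
```

At startup you would iterate over the temp directory and prompt the user for every file where isOrphanedAutosave() returns true. If you are already using Qt, QLockFile stores the owning PID in the lock file and performs a similar staleness check, so holding a lock file next to each autosave file may be less work than rolling your own.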

Related

Qt5 ini file that needs updating in case of a crash

I'm writing a simple peer to peer instant messenger for a local network. It uses an ini file to parse a UUID to use as an identifier across the network. The ini file is accessed through a QSettings object. I have written functionality to enable multiple instances of the program to be run on the same computer. When the first program is run, it reads the ini file for the first entry and if one exists reads it, and replaces it with "INUSE". When closing, it replaces the key value with the original UUID. If another instance of the program reads the ini file and reads an INUSE as the first key value, it creates another after it, takes it, and puts an INUSE tag on the second key value.
This works fine, however, if the program crashes the UUID that was "INUSE" will be lost and INUSE will remain until manually taken out. How can I account for crashing with a system that accomplishes the same thing?
I've taken a look at QLockFile but can't wrap my head around exactly how I would implement such a system.
Any comments are appreciated.
The current format of the ini file is as follows:
[uuid]
1={uuid1}
2={uuid2}
while program 1 is executing
[uuid]
1=INUSE
2={uuid2}
and after a normal end of program
[uuid]
1={uuid1}
2={uuid2}
Essentially what I need is a way of preserving data between program executions but also signal to other instances that said data is currently being used.
I think the first step is to identify why your program is crashing, so that you can choose the best solution.
QLockFile lets you prevent multiple processes from accessing the same file, so it will only be useful to you if the program is crashing because of that.
Whatever the reason your program is crashing, I would recommend using exceptions to perform the correct actions when this occurs:
try {
    // Some of your code
} catch (const std::exception &e) {
    // Some error occurred; do something about it,
    // like restoring your UUID.
}
You can read more about exceptions here, and you can always use the Qt version, QException.
Hope it helps

Robust way to detect if file has changed

I think this question hasn't been answered for my use-case.
We wish to detect if the user has changed a file without re-reading its contents for the purposes of caching a computation result based on the file contents. Our program is a long-running one that lets the user click a button to perform a computation based on data entered in the program and data stored in external files (sorry, I can't be more specific than that). The external data needs to be read, processed and various data structures need to be built based on it, so we try to cache those between computations to speed up re-computes when the user changes the data in the program itself, but not the data in the external files. However, if the external file has changed, we have to re-read that.
For each external resource we're checking if the modification time and file size have changed, but that's not really all that robust and can lead to user frustration if they have e.g. fileA and fileB with the same size and timestamp, copy fileA to fileC, use fileC as an external resource, and then copy fileB to fileC. The system preserves the modification time of the original file and the sizes are the same, so we don't re-read the external resource.
Our program runs on Windows, macOS and Linux, is written in C++ and we're perfectly OK with using platform-specific code to detect file changes. We're interested in the most robust way to detect if the contents of a file identified by a file path have changed without actually reading the file itself.
I've made this answer a community wiki so others can add their ideas for the various platforms listed in the question.
Linux
MacOS
Windows
Option 1
Set up a thread that watches the directory containing the file. When the directory changes, you'll have to check if the file you care about has actually changed. That may mean opening and re-reading the file, (e.g., to compute the current checksum). But since you have to do this only after a change notification, this overhead may be acceptable.
I believe (but have not verified) that if someone copies a same-size, same-timestamp file over an existing file, you'll get a directory change notification.
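As a rough sketch of the "re-read and checksum after a notification" step in Option 1: the code below hashes the file contents with FNV-1a and compares the result against a cached value. The function names are hypothetical, FNV-1a is just a convenient non-cryptographic choice on my part, and the directory-watching part is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <stdexcept>

namespace fs = std::filesystem;

// 64-bit FNV-1a hash of the file's bytes (non-cryptographic, illustrative choice).
inline std::uint64_t contentHash(const fs::path& p) {
    assert(!p.empty());
    std::ifstream in(p, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + p.string());
    std::uint64_t h = 14695981039346656037ull;   // FNV offset basis
    char buf[4096];
    // read() fails on the final short chunk, so gcount() is checked as well.
    while (in.read(buf, sizeof buf) || in.gcount() > 0) {
        for (std::streamsize i = 0; i < in.gcount(); ++i) {
            h ^= static_cast<unsigned char>(buf[i]);
            h *= 1099511628211ull;               // FNV prime
        }
    }
    return h;
}

// Call this after a change notification for the containing directory.
inline bool needsReload(const fs::path& p, std::uint64_t cachedHash) {
    return contentHash(p) != cachedHash;
}
```

This catches the fileA/fileB case from the question (same size, same timestamp, different bytes), at the cost of reading the file once per notification.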
Option 2
Hold the file open with an opportunistic lock. This involves creating the lock with a call to DeviceIoControl and then issuing a blocking call to GetOverlappedResult, which will unblock when another process attempts to change the file. Your program can then release the lock, allowing the other process to update the file, and know that the file is being changed.

Using temporary files safely

There is a static library I use in my program which can only take filenames as its input, not actual file contents. There is nothing I can do about the library's source code. So I want to: create a brand-new file, store the data to be processed in it, flush it to disk(?), pass its name to the library, then delete it.
But I also want this process to be rather secure:
1) the file must be created anew, without any bogus data (maybe it's not critical, but whatever);
2) anyone but my process must not be able to read or write from/to this file (I want the library to process my actual data, not bogus data some wiseguy managed to plug in);
3) after I'm done with this file, it must be deleted (okay, if someone TerminateProcess() me, I guess there is nothing much can be done, but still).
The library seems to use non-Unicode fopen() to open the given file though, so I am not quite sure how to handle all this, since the program is intended to run on Windows. Any suggestions?
You have a lot of suggestions already, but another option that I don't think has been mentioned is using named pipes. It will depend on the library in question as to whether it works or not, but it might be worth a try. You can create a named pipe in your application using the CreateNamedPipe function, and pass the name of the pipe to the library to operate on (the filename you would pass would be \\.\pipe\PipeName). Whether the library accepts a filename like that or not is something you would have to try, but if it works the advantage is your file never has to actually be written to disk.
This can be achieved using the CreateFile and GetTempFileName functions (if you don't know whether you can write to the current working directory, you may also want to use GetTempPath).
Determine a directory to store your temporary file in; the current directory (".") or the result of GetTempPath would be good candidates.
Use GetTempFileName to create a temporary file name.
Finally, call CreateFile to create the temporary file.
For the last step, there are a few things to consider:
The dwFlagsAndAttributes parameter of CreateFile should probably include FILE_ATTRIBUTE_TEMPORARY.
The dwFlagsAndAttributes parameter should probably also include FILE_FLAG_DELETE_ON_CLOSE to make sure that the file gets deleted no matter what (this probably also works if your process crashes, in which case the system closes all handles for you).
The dwShareMode parameter of CreateFile should probably be FILE_SHARE_READ so that other attempts to open the file will succeed, but only for reading. This means that your library code will be able to read the file, but nobody will be able to write to it.
This article should give you some good guidelines on the issue.
The gist of the matter is this:
The POSIX mkstemp() function is the secure and preferred solution where available. Unfortunately, it is not available in Windows, so you would need to find a wrapper that properly implements this functionality using Windows API calls.
On Windows, the tmpfile_s() function is the only one that actually opens the temporary file atomically (instead of simply generating a filename), protecting you from a race condition. Unfortunately, this function does not allow you to specify which directory the file will be created in, which is a potential security issue.
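For reference, here is roughly what the mkstemp() route looks like on a POSIX system. It does not apply to Windows as-is. This is only a sketch: the class name ScopedTempFile and the default "/tmp/libinput-XXXXXX" pattern are made up for illustration, and interrupted writes (EINTR) are not retried.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <stdlib.h>   // mkstemp (POSIX)
#include <string>
#include <unistd.h>   // write, fsync, close, unlink
#include <utility>

// Creates a private temporary file holding `contents` and removes it on destruction.
class ScopedTempFile {
public:
    explicit ScopedTempFile(const std::string& contents,
                            std::string pattern = "/tmp/libinput-XXXXXX")
        : path_(std::move(pattern)) {
        // mkstemp requires the template to end in six X characters.
        assert(path_.size() >= 6 &&
               path_.compare(path_.size() - 6, 6, "XXXXXX") == 0);
        // Atomically creates and opens a new file (mode 0600) and fills in the name.
        fd_ = mkstemp(path_.data());
        if (fd_ == -1) throw std::runtime_error("mkstemp failed");
        const char* p = contents.data();
        std::size_t left = contents.size();
        while (left > 0) {
            const ssize_t n = ::write(fd_, p, left);
            if (n < 0) { cleanup(); throw std::runtime_error("write failed"); }
            p += n;
            left -= static_cast<std::size_t>(n);
        }
        fsync(fd_);   // push the data to disk before handing the name to the library
    }
    ~ScopedTempFile() { cleanup(); }
    ScopedTempFile(const ScopedTempFile&) = delete;
    ScopedTempFile& operator=(const ScopedTempFile&) = delete;

    const std::string& path() const { return path_; }

private:
    void cleanup() {
        if (fd_ != -1) {
            ::close(fd_);
            ::unlink(path_.c_str());
            fd_ = -1;
        }
    }
    std::string path_;
    int fd_ = -1;
};
```

You would pass tmp.path().c_str() to the library while the object is alive. Because the file is created with owner-only permissions, other unprivileged users cannot open it, although root (or the same user) still can.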
First, you can create the file in the user's temporary folder (e.g. C:\Users\\AppData\Local\Temp); it is a perfect place for such files. Second, when creating the file, you can specify what kind of shared access you allow.
Fragment of CreateFile help page on MSDN:
dwShareMode
0: Prevents other processes from opening a file or device if they request delete, read, or write access.
FILE_SHARE_DELETE: Enables subsequent open operations on a file or device to request delete access. Otherwise, other processes cannot open the file or device if they request delete access. If this flag is not specified, but the file or device has been opened for delete access, the function fails. Note: Delete access allows both delete and rename operations.
FILE_SHARE_READ: Enables subsequent open operations on a file or device to request read access. Otherwise, other processes cannot open the file or device if they request read access. If this flag is not specified, but the file or device has been opened for read access, the function fails.
FILE_SHARE_WRITE: Enables subsequent open operations on a file or device to request write access. Otherwise, other processes cannot open the file or device if they request write access. If this flag is not specified, but the file or device has been opened for write access or has a file mapping with write access, the function fails.
Whilst the suggestions given are good, such as using FILE_SHARE_READ, FILE_FLAG_DELETE_ON_CLOSE, etc., I don't think there is a completely safe way to do this.
I have used Process Explorer to close files that are meant to prevent a second process starting - I did this because the first process got stuck and was "not killable and not dead, but not responding", so I had a valid reason to do this - and I didn't want to reboot the machine at that particular point due to other processes running on the system.
If someone uses a debugger of some sort [including something non-commercial, written specifically for this purpose], attaches to your running process, sets a breakpoint and stops the code, then closes the file you have open, they can write to the file you just created.
You can make it harder, but you can't stop someone with sufficient privileges/skills/capabilities from intercepting your program and manipulating the data.
Note that file/folder protection only works if you reliably know that users don't have privileged accounts on the machine - typical Windows users are either admins right away, or have another account for admin purposes - and I have access to sudo/root on nearly all of the Linux boxes I use at work - there are some fileservers that I don't [and shouldn't] have root access to. But on all the boxes I use myself or can borrow for testing purposes, I can get to a root environment. This is not very unusual.
A solution I can think of is to find a different library that uses a different interface [or get the sources of the library and modify it so that it does]. Not that this prevents a "stop, modify and go" attack using the debugger approach described above.
Create your file in your executable's folder using the CreateFile API. You can give the file name some UUID each time it's created, so that no other process can guess the file name to open it, and set its attribute to hidden. After using it, just delete the file. Is that enough?

How to check if a file is still being written?

How can I check if a file is still being written? I need to wait for a file to be created, written and closed again by another process, so I can go on and open it again in my process.
In general, this is a difficult problem to solve. You can ask whether a file is open, under certain circumstances; however, if the other process is a script, it might well open and close the file multiple times. I would strongly recommend you use an advisory lock, or some other explicit method for the other process to communicate when it's done with the file.
That said, if that's not an option, there is another way. If you look in the /proc/<pid>/fd directories, where <pid> is the numeric process ID of some running process, you'll see a bunch of symlinks to the files that process has open. The permissions on the symlink reflect the mode the file was opened for - write permission means it was opened for write mode.
So, if you want to know if a file is open, just scan over every process's /proc entry, and every file descriptor in it, looking for a writable symlink to your file. If you know the PID of the other process, you can directly look at its proc entry, as well.
This has some major downsides, of course. First, you can only see open files for your own processes, unless you're root. It's also relatively slow, and only works on Linux. And again, if the other process opens and closes the file several times, you're stuck - you might end up seeing it during the closed period, and there's no easy way of knowing if it'll open it again.
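For what it's worth, the /proc scan described above can be sketched with std::filesystem. This is Linux-only and relies on the symlink permission bits mirroring the open mode, as explained earlier. The function names are hypothetical, the target path must be absolute and canonical, and entries that cannot be inspected are simply skipped.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <unistd.h>   // getpid

namespace fs = std::filesystem;

// Returns true if process `pid` holds a descriptor open for writing on `target`.
inline bool hasOpenForWriting(long pid, const fs::path& target) {
    assert(target.is_absolute());
    std::error_code ec;
    const fs::path fdDir = fs::path("/proc") / std::to_string(pid) / "fd";
    for (fs::directory_iterator it(fdDir, ec), end; !ec && it != end; it.increment(ec)) {
        // symlink_status() does not follow the link, so these are the link's own
        // permission bits, which reflect the mode the descriptor was opened with.
        const fs::file_status st = fs::symlink_status(it->path(), ec);
        if (ec) { ec.clear(); continue; }
        if ((st.permissions() & fs::perms::owner_write) == fs::perms::none) continue;
        const fs::path dest = fs::read_symlink(it->path(), ec);
        if (ec) { ec.clear(); continue; }
        if (dest == target) return true;
    }
    return false;
}

// Usage sketch: a file this process is writing is reported while the stream is open.
inline bool exampleSelfCheck() {
    const fs::path p = fs::temp_directory_path() / "procfd-example.tmp";
    std::ofstream out(p);
    const bool seen = hasOpenForWriting(getpid(), fs::canonical(p));
    out.close();
    fs::remove(p);
    return seen;
}
```

To check every process you would loop over the numeric directories in /proc and call hasOpenForWriting() for each, keeping in mind the permission and timing caveats above.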
You could let the writing process write a sentinel file (say "sentinel.ok") after it is finished writing the data file your reading process is interested in. In the reading process you can check for the existence of the sentinel before reading the data file, to ensure that the data file is completely written.
@blu3bird's idea of using a sentinel file isn't bad, but it requires modifying the program that's writing the file.
Here's another possibility that also requires modifying the writer, but it may be more robust:
Write to a temporary file, say "foo.dat.part". When writing is complete, rename "foo.dat.part" to "foo.dat". That way a reader either won't see "foo.dat" at all, or will see a complete version of it.
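A minimal sketch of the writer side, assuming C++17's std::filesystem is available. The function name is hypothetical, and the ".part" suffix follows the example above. On POSIX systems the rename replaces any existing target atomically; I haven't checked what guarantees other platforms give.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>

namespace fs = std::filesystem;

// Writes `data` to "<finalPath>.part" and renames it to finalPath once complete.
inline void writeThenRename(const fs::path& finalPath, const std::string& data) {
    assert(!finalPath.empty());
    fs::path partPath = finalPath;
    partPath += ".part";
    {
        std::ofstream out(partPath, std::ios::binary | std::ios::trunc);
        out.write(data.data(), static_cast<std::streamsize>(data.size()));
        out.flush();
        if (!out) {
            out.close();
            fs::remove(partPath);
            throw std::runtime_error("writing " + partPath.string() + " failed");
        }
    }   // the stream is closed here, before the rename
    fs::rename(partPath, finalPath);
}
```

The reader then only ever looks for "foo.dat" and never touches "foo.dat.part".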
You can try using inotify
http://en.wikipedia.org/wiki/Inotify
If you know that the file will be opened once, written and then closed, it would be possible for your app to wait for the IN_CLOSE_WRITE event.
However, if the behaviour of the other application doing the writing of the file is more like open, write, close, open, write, close... then you'll need some other mechanism for determining when the other app has truly finished with the file.
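For the simple open-once, write, close case, waiting for IN_CLOSE_WRITE might look like the sketch below (Linux only). The class and function names are made up, EINTR is not handled, and events for other files that share a read buffer with the match are discarded.

```cpp
#include <cassert>
#include <fstream>
#include <stdexcept>
#include <string>
#include <sys/inotify.h>
#include <unistd.h>

// Watches one directory for files being closed after having been open for writing.
class CloseWriteWatcher {
public:
    explicit CloseWriteWatcher(const std::string& dir) {
        assert(!dir.empty());
        fd_ = inotify_init1(IN_CLOEXEC);
        if (fd_ == -1) throw std::runtime_error("inotify_init1 failed");
        if (inotify_add_watch(fd_, dir.c_str(), IN_CLOSE_WRITE) == -1) {
            ::close(fd_);
            throw std::runtime_error("inotify_add_watch failed");
        }
    }
    ~CloseWriteWatcher() { ::close(fd_); }
    CloseWriteWatcher(const CloseWriteWatcher&) = delete;
    CloseWriteWatcher& operator=(const CloseWriteWatcher&) = delete;

    // Blocks until a file called `name` in the watched directory is closed by a writer.
    void waitFor(const std::string& name) {
        alignas(inotify_event) char buf[4096];
        for (;;) {
            const ssize_t len = ::read(fd_, buf, sizeof buf);
            if (len <= 0) throw std::runtime_error("reading inotify events failed");
            for (ssize_t off = 0; off < len;) {
                const auto* ev = reinterpret_cast<const inotify_event*>(buf + off);
                if ((ev->mask & IN_CLOSE_WRITE) && ev->len > 0 && name == ev->name)
                    return;
                off += static_cast<ssize_t>(sizeof(inotify_event)) + ev->len;
            }
        }
    }

private:
    int fd_ = -1;
};

// Usage sketch: the watch has to exist before the writer closes the file.
inline bool exampleWait(const std::string& dir) {
    CloseWriteWatcher watcher(dir);
    {
        std::ofstream out(dir + "/inotify-example.dat");   // stands in for the other process
        out << "done";
    }
    watcher.waitFor("inotify-example.dat");   // the event is already queued at this point
    ::unlink((dir + "/inotify-example.dat").c_str());
    return true;
}
```

Note that the watch is placed on the directory rather than the file, so it also works when the file does not exist yet.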

File corruption detection and error handling

I'm a newbie C++ developer and I'm working on an application which needs to write out a log file every so often, and we've noticed that the log file has been corrupted a few times when running the app. The main scenarios seem to be when the program is shutting down, or crashes, but I'm concerned that this isn't the only time that something may go wrong, as the application was born out of a fairly "quick and dirty" project.
It's not critical to have the absolute most up-to-date data saved, so one idea that someone mentioned was to alternately write to two log files, so that if the program crashes at least one will still have proper integrity. But this doesn't smell right to me, as I haven't really seen any other application use this method.
Are there any "best practises" or standard "patterns" or frameworks to deal with this problem?
At the moment I'm thinking of doing something like this -
Write data to a temp file
Check the data was written correctly with a hash
Rename the original file, and put the temp file in place.
Delete the original
Then if anything fails I can roll back by just deleting the temp file, and the original will be untouched.
You must find the reason why the file gets corrupted. If the app crashes unexpectedly, it can't corrupt the file. The only thing that can happen is that the file is truncated (i.e. the last log messages are missing). But the app can't really jump around in the file and modify something elsewhere (unless you call seek in the logging code which would surprise me).
My guess is that the app is multi threaded and the logging code is being called from several threads which can easily lead to data corrupted before the data is written to the log.
You probably forgot to call fsync() every so often, or the data comes in from different threads without proper synchronization among them. Hard to tell without more information (platform, form of corruption you see).
A workaround would be to use logfile rollover, ie. starting a new file every so often.
I really think that you (and others) are wasting your time when you start adding complexity to log files. The whole point of a log is that it should be simple to use and implement, and should work most of the time. To that end, just write the log to an unbuffered stream (like cerr in a C++ program) and live with any, very occasional in my experience, snafus.
OTOH, if you really need an audit trail of everything your app does, for legal reasons, then you should be using some form of transactional storage such as a SQL database.
Not sure if your app is multi-threaded -- if so, consider using the Active Object Pattern (PDF) to put a queue in front of the log and make all writes within a single thread. That thread can commit the log in the background. All log writes will be asynchronous, and in order, but not necessarily written immediately.
The active object can also batch writes.
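A bare-bones sketch of such an active object using only the standard library. The class name AsyncLogger and the callback-style sink are my own choices for illustration; in a real application the sink would append to the log file and flush it.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Queues log lines from any thread; a single worker thread hands them to `sink` in order.
class AsyncLogger {
public:
    explicit AsyncLogger(std::function<void(const std::string&)> sink)
        : sink_(std::move(sink)), worker_([this] { run(); }) {
        assert(sink_);   // an empty callback would throw on the first write
    }
    ~AsyncLogger() {
        { std::lock_guard<std::mutex> lock(mutex_); done_ = true; }
        cv_.notify_one();
        worker_.join();   // lines still queued are written before shutdown
    }
    AsyncLogger(const AsyncLogger&) = delete;
    AsyncLogger& operator=(const AsyncLogger&) = delete;

    void log(std::string line) {
        { std::lock_guard<std::mutex> lock(mutex_); queue_.push(std::move(line)); }
        cv_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cv_.wait(lock, [this] { return done_ || !queue_.empty(); });
            if (queue_.empty()) return;   // only reached once done_ is set
            std::queue<std::string> batch;
            batch.swap(queue_);           // take everything pending as one batch
            lock.unlock();
            for (; !batch.empty(); batch.pop()) sink_(batch.front());
            lock.lock();
        }
    }

    std::function<void(const std::string&)> sink_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::string> queue_;
    bool done_ = false;
    std::thread worker_;   // declared last so the other members exist before it starts
};
```

Callers only ever touch the queue, so lines from different threads cannot interleave within a single write. A hard crash can still lose whatever is sitting in the queue at that moment.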