Concept to implement recovery of file after crash of application (eg. SIGSEGV)

I want to implement a feature so that when my application crashes, it saves the current data to a temporary file that can be recovered on the next launch, as many applications do (e.g. Word).
As far as I could find out, this is typically done by saving the file every few minutes and then loading the last saved file on startup if it exists.
However, I was wondering whether it could also be done by catching all unhandled exceptions and calling the save method when the application crashes.
The advantage would be that I don't have to write to the disk all the time, because SSDs don't like that, and the file would really be from the time of the crash and not 10 minutes old in the worst case.
I've tried this on Linux with
signal(SIGSEGV, crashSave);
where crashSave() is the function that calls the save, and it seems to work. However, I'm not sure whether this will work on Windows as well.
Is there a general reason why I should not do this (apart from the saved file being corrupted in a few cases)? And what is the advantage of the timed autosave that other applications use instead?
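For illustration, a minimal POSIX sketch of the handler approach. It assumes the application keeps an already-serialized snapshot in a fixed buffer during normal operation; g_snapshot, g_snapshot_len, kRecoveryPath and writeSnapshot are hypothetical names, not part of any library. The handler restricts itself to async-signal-safe calls (open, write, fsync, close, raise), because running an ordinary save routine that allocates memory or takes locks inside a SIGSEGV handler is not safe.

```cpp
#include <signal.h>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical snapshot buffer, refreshed by the application while it runs normally.
static char g_snapshot[4096];
static volatile std::size_t g_snapshot_len = 0;
static const char* const kRecoveryPath = "recovery.tmp";  // hypothetical path

// Writes the snapshot using only async-signal-safe calls. Returns true on success.
bool writeSnapshot(const char* path) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return false;
    const std::size_t len = g_snapshot_len;
    std::size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, g_snapshot + done, len - done);
        if (n <= 0) { close(fd); return false; }
        done += static_cast<std::size_t>(n);
    }
    fsync(fd);
    close(fd);
    return true;
}

extern "C" void crashSave(int sig) {
    writeSnapshot(kRecoveryPath);
    // SA_RESETHAND has already restored the default action, so re-raising
    // lets the process terminate with the original signal.
    raise(sig);
}

void installCrashHandler() {
    struct sigaction sa {};
    sa.sa_handler = crashSave;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESETHAND;
    sigaction(SIGSEGV, &sa, nullptr);
}
```

Even in this form the snapshot is only as good as the memory it was built from. After a segmentation fault the in-memory data may already be damaged, which is one reason timed autosave remains the common choice.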

Related

Redirect APPCRASH Dumps (Or turn them off)

I have an application (I didn't write it) that is producing APPCRASH dumps in C:\Windows\SysWOW64. While dumping, the application is crippled but operates at the bare minimum capacity needed to not lose data. The issue is that these dumps are so large that the system is spending most of its time writing them, and the application is falling far behind in processing and will start losing data soon.
The plan is to either disable the dumps entirely, or redirect them to a RAM drive and purge them as soon as they hit the RAM drive.
Now I've looked into using this key:
http://msdn.microsoft.com/en-us/library/windows/desktop/bb787181%28v=vs.85%29.aspx
But all it does is generate a second dump now instead of redirect the original.
The dump is named:
dump-2013_03_31-15_23_55_772.dmp
This is generally the realm of developers on Windows (with stuff like C/C++) so I'd like to hit them up, don't think ServerFault could get me any answers on this.
Additionally: It's not cycling dump files (they'll fill the 20GBs left on the hard drive), so I'm not sure if this is Windows behavior or custom code in the app (if it is... ick!).

To write a dump file, an app has to call the function "MiniDumpWriteDump", so this is not a behavior of the system or something you can control; it is application driven. If it dumps on crashes, it uses "SetUnhandledExceptionFilter" to set its own handling routine before(!) the OS takes over. Unfortunately, I haven't found a way to override this handler from another process, so the only hope left is that the app has a registry entry that switches the behavior off or changes the path (my applications have one for exactly the reason you describe).

Overwriting a file without the risk of a corrupt file

So often my applications want to save files to load again later. Having recently got unlucky with a crash, I want to write the operation in such a way that I am guaranteed to have either the new data or the original data, but not a corrupted mess.
My first idea was to do something along the lines of (to save a file called example.dat):
1. Come up with a unique file name for the target directory, e.g. example.dat.tmp
2. Create that file and write my data to it.
3. Delete the original file (example.dat)
4. Rename ("Move") the temp file to where the original was (example.dat.tmp -> example.dat).
Then at load time the application can follow the following rules:
If no "example.dat" and no "example.dat.tmp", first run / new project, so load in the defaults / create new file.
If "example.dat" and no "example.dat.tmp", then load example.dat (normal load case)
If "example.dat.tmp" exists, offer the user the chance to potentially recover data. If "example.dat" also exists, do not overwrite it without explicit user consent.
However, having done a little research, I found that in addition to OS caching (which I may be able to override with the file flush methods), some disk drives also cache internally and may even lie to the OS, saying they are done. So 4. could complete while the write has not actually reached the disk, and if the system goes down I have lost my data...
I am not sure the disk problem is actually solvable by an application, but are the general rules above the correct thing to do? Should I keep an old recovery copy of the file for longer to be sure? What are the guidelines regarding such things (e.g. acceptable disk usage, whether the user should choose, where to put such files, etc.)?
Also, how should I avoid potential conflicts with the user and other programs over "example.dat.tmp"? I recall sometimes seeing a "~example.dat" from some other software; is that a better convention?
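The load-time rules above can be sketched as a small decision function. This is only an illustration using std::filesystem (C++17); LoadAction and decideLoad are names made up for the example, and the ".tmp" suffix follows the convention in the question.

```cpp
#include <filesystem>
#include <string>

enum class LoadAction {
    CreateDefaults,  // neither file exists: first run / new project
    LoadNormal,      // only the data file exists
    OfferRecovery    // a leftover temp file exists: ask the user before touching anything
};

LoadAction decideLoad(const std::filesystem::path& dataFile) {
    namespace fs = std::filesystem;
    fs::path tmp = dataFile;
    tmp += ".tmp";  // example.dat -> example.dat.tmp
    if (fs::exists(tmp)) return LoadAction::OfferRecovery;
    if (fs::exists(dataFile)) return LoadAction::LoadNormal;
    return LoadAction::CreateDefaults;
}
```

The temp file is checked first because its presence means the last save did not finish, regardless of whether the original is still there.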

If the disk drives report back to the OS that the data is
physically on the disk, and it's not, then there's not much you
can do about it. A lot of disks do cache a certain number of
writes, and report them done, but such disks should have
a battery backup, and finish the physical writes no matter what
(and they won't lose data in case of a system crash, since they
won't even see it).
For the rest, you say you've done some research, so you no doubt
know that you can't use std::ofstream (nor FILE*) for this;
you have to do the actual writes at the system level, and open
the files with special attributes for them to ensure full
synchronization. Otherwise, the operations can stick around in
the OS buffering for a while. And as far as I know,
there's no way of ensuring such synchronization for a rename.
But I'm not sure that it's necessary, if you always keep two
versions: my usual convention in such cases is to write to
a file "example.dat.new", then when I'm done writing, delete
any file named "example.dat.bak", rename "example.dat" to
"example.dat.bak", and then rename "example.dat.new" to
"example.dat". Given this, you should be able to figure out
what did or did not happen, and find the correct file
(interactively, if need be, or insert an initial line with the
timestamp).
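A minimal POSIX sketch of the .new/.bak convention described above. Note the assumptions: it calls fsync() explicitly before the renames instead of opening the file with synchronous attributes, and it does not sync the containing directory. The function name safeSave is hypothetical.

```cpp
#include <cstdio>
#include <string>
#include <fcntl.h>
#include <unistd.h>

// Writes data to path + ".new", forces it to disk, then rotates:
// path + ".bak" is deleted, path becomes path + ".bak", path + ".new" becomes path.
bool safeSave(const std::string& path, const std::string& data) {
    const std::string newPath = path + ".new";
    const std::string bakPath = path + ".bak";

    int fd = open(newPath.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return false;
    std::size_t done = 0;
    while (done < data.size()) {
        ssize_t n = write(fd, data.data() + done, data.size() - done);
        if (n <= 0) { close(fd); return false; }
        done += static_cast<std::size_t>(n);
    }
    // Ask the OS to push the file contents to the device before any rename happens.
    if (fsync(fd) != 0) { close(fd); return false; }
    if (close(fd) != 0) return false;

    std::remove(bakPath.c_str());                // failure ignored: the backup may not exist
    std::rename(path.c_str(), bakPath.c_str());  // failure ignored: first save has no original
    return std::rename(newPath.c_str(), path.c_str()) == 0;
}
```

If a crash interrupts the rotation, at least one of the three names still refers to a complete file, which is what makes the later recovery decision possible.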

You should lock the actual data file while you write its substitute, if there's a chance that a different process could be going through the same protocol that you are describing.
You can use flock for the file lock.
As for your temp file name, you could make your process ID part of it, for instance "example.dat.3124". No other simultaneously running process would generate the same name.
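A small sketch of both suggestions on a POSIX system, assuming flock() from <sys/file.h> is available. FileLock and tempNameFor are hypothetical helper names. The lock is taken non-blocking here so the caller can see immediately whether another holder exists; a blocking LOCK_EX would work as well.

```cpp
#include <string>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

// Builds a temp name such as "example.dat.3124" from the current process ID.
std::string tempNameFor(const std::string& path) {
    return path + "." + std::to_string(static_cast<long>(getpid()));
}

// RAII holder for an exclusive advisory lock; released in the destructor.
class FileLock {
public:
    explicit FileLock(const std::string& path)
        : fd_(open(path.c_str(), O_RDWR | O_CREAT, 0644)) {
        if (fd_ >= 0 && flock(fd_, LOCK_EX | LOCK_NB) != 0) {
            close(fd_);  // someone else holds the lock
            fd_ = -1;
        }
    }
    ~FileLock() {
        if (fd_ >= 0) {
            flock(fd_, LOCK_UN);
            close(fd_);
        }
    }
    FileLock(const FileLock&) = delete;
    FileLock& operator=(const FileLock&) = delete;
    bool held() const { return fd_ >= 0; }

private:
    int fd_;
};
```

One caveat: flock() locks are advisory, and a rename replaces the file that the lock refers to, so locking a separate, never-renamed lock file may be more robust than locking the data file itself. That is a design choice beyond what the answer specifies.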

How to calculate the time to load a file into an application through C++?

I have written code in C++ to open a file in its default application (e.g. a .doc in MS Word), and now I want to calculate the time it takes to open the file in that application.
For that I need to know the percentage of the file loaded into that application. But in the last 7 days I couldn't find any suitable solution. Can anyone help me solve this problem?
If I am using Windows, can the Windows Task Manager help me do this?

What you're trying to do is not only impossible, it doesn't even make sense.
When you play an MP3 in WMP, it doesn't load the whole file into memory. Instead, it maps a little bit of the file at a time into memory so it can decode the MP3 on the fly as it's playing. (I suppose if you play the song all the way through, without stopping or skipping or fast forwarding or rewinding, it will eventually read every byte of the file, probably finishing a few seconds before the song is over, but I doubt that's what you're looking for.)
Likewise, Word doesn't read any entire .doc file into memory (unless it's very small). That's how it's able to edit gigantic files without using huge amounts of memory. (Again, if you page through the whole file, it will probably eventually read every byte—for that matter, it may eventually copy enough of the file into an autosave backup file that it no longer needs to look at the original—but again, I doubt that's what you're looking for.)
If you only care about certain specific applications, and those applications have a COM Automation interface (as both WMP and Word do), they may have methods or events that will tell you when they're done "loading" a file (meaning they've read enough of it to start playing/displaying/etc.), or when they've "finished" with a file (meaning moved on to the next track, or whatever), but there's no generic answer to that; different applications will have different Automation interfaces. (And, as a side note, you really don't want to do COM Automation from C++ unless you really have to; it's much easier from jscript, vbscript, or your favorite .NET language…)

If the third party process does not signal that it has loaded something, e.g., through some output stream, one way will be to view the file handles being opened and closed by the processes. I presume this will be similar to how "task managers" like Process Explorer are able to view file handles of processes. However, if the process does not close the file handle once it is done "loading", then, you will not get an accurate time. Furthermore, you won't be able to get a "live" percentage of how much data has been loaded.

File corruption detection and error handling

I'm a newbie C++ developer and I'm working on an application which needs to write out a log file every so often, and we've noticed that the log file has been corrupted a few times when running the app. The main scenarios seem to be when the program is shutting down or crashes, but I'm concerned that this isn't the only time that something may go wrong, as the application was born out of a fairly "quick and dirty" project.
It's not critical to have the most absolutely up-to-date data saved, so one idea that someone mentioned was to write alternately to two log files, so that if the program crashes at least one will still have proper integrity. But this doesn't smell right to me, as I haven't really seen any other application use this method.
Are there any "best practises" or standard "patterns" or frameworks to deal with this problem?
At the moment I'm thinking of doing something like this -
Write data to a temp file
Check the data was written correctly with a hash
Rename the original file, and put the temp file in place.
Delete the original
Then if anything fails I can roll back by just deleting the temp file, and the original will be untouched.

You must find the reason why the file gets corrupted. If the app crashes unexpectedly, it can't corrupt the file. The only thing that can happen is that the file is truncated (i.e. the last log messages are missing). But the app can't really jump around in the file and modify something elsewhere (unless you call seek in the logging code which would surprise me).
My guess is that the app is multi-threaded and the logging code is being called from several threads, which can easily lead to the data being corrupted before it is written to the log.

You probably forgot to call fsync() every so often, or the data comes in from different threads without proper synchronization among them. Hard to tell without more information (platform, form of corruption you see).
A workaround would be to use logfile rollover, ie. starting a new file every so often.

I really think that you (and others) are wasting your time when you start adding complexity to log files. The whole point of a log is that it should be simple to use and implement, and should work most of the time. To that end, just write the log to an unbuffered stream (like cerr in a C++ program) and live with any snafus, which are very occasional in my experience.
OTOH, if you really need an audit trail of everything your app does, for legal reasons, then you should be using some form of transactional storage such as a SQL database.

Not sure if your app is multi-threaded -- if so, consider using the Active Object pattern to put a queue in front of the log and make all writes within a single thread. That thread can commit the log in the background. All log writes will be asynchronous and in order, but not necessarily written immediately.
The active object can also batch writes.
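A rough sketch of that idea with standard C++11 threading. AsyncLogger is a hypothetical class name, and this is one possible shape of an active-object logger rather than a reference implementation. Callers only push strings onto a queue; a single worker thread drains the queue in batches and is the only code that touches the file.

```cpp
#include <condition_variable>
#include <fstream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

class AsyncLogger {
public:
    explicit AsyncLogger(const std::string& path)
        : out_(path, std::ios::app), worker_([this] { run(); }) {}

    // Signals the worker to finish, then waits until the queue is drained.
    ~AsyncLogger() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_one();
        worker_.join();
    }

    // Callable from any thread; returns as soon as the line is queued.
    void log(std::string line) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(line));
        }
        cv_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cv_.wait(lock, [this] { return done_ || !queue_.empty(); });
            if (queue_.empty() && done_) return;
            std::queue<std::string> batch;
            batch.swap(queue_);  // take everything queued so far as one batch
            lock.unlock();
            while (!batch.empty()) {
                out_ << batch.front() << '\n';
                batch.pop();
            }
            out_.flush();
            lock.lock();
        }
    }

    std::ofstream out_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::string> queue_;
    bool done_ = false;
    std::thread worker_;  // declared last so every other member exists before the thread starts
};
```

Because only one thread writes, lines from different callers cannot interleave inside the file. Lines still in the queue at the moment of a crash are lost, which matches the "not necessarily written immediately" trade-off.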

Ensuring a file is flushed when file created in external process (Win32)

Windows Win32 C++ question about flushing file activity to disk.
I have an external application (run using CreateProcess) which does some file creation, i.e., when it returns it will have created a file with some content.
How can I ensure that the file the process created was really flushed to disk, before I proceed?
By this I mean not the C++ buffers but really flushing disk (e.g. FlushFileBuffers).
Remember that I don't have access to any file HANDLE - this is all of course hidden inside the external process.
I guess I could open up a handle of my own to the file and then use FlushFileBuffers, but it's not clear this would work (since my handle doesn't actually contain anything which needs flushing).
Finally, I want this to run in non-admin userspace so I cannot use FlushFileBuffers on a whole volume.
Any ideas?
UPDATE: Why do I think this is a problem?
I'm working on a data backup application. Essentially it has to create some files as described. It then has to update its internal DB (using the SQLite embedded DB).
I recently had a data corruption issue which occurred during a bluescreen (the cause of which was unrelated to my app).
What I'm concerned about is application integrity during a system crash. And yes, I do care about this because this app is a data backup app.
The use case I'm concerned about is this:
1. A small data file is created using external process. This write is waiting in the OS cache to be written to disk.
2. I update the DB and commit. This is a disk activity. This write is also waiting in the OS cache.
3. A system failure occurs.
As I see it, we're now in a potential race condition. If "1" gets flushed and "2" doesn't then we're fine (as the DB transact wasn't then committed). If neither gets flushed or both get flushed then we're also OK.
As I understand it, the writes will be non-deterministic. i.e., I'm not aware that the OS will guarantee to write "1" before "2". (Am I wrong?)
So, if "2" gets flushed, but "1" doesn't then we have a problem.
What I observed was that the DB was correctly updated, but that the file had garbage in: the last 2 thirds of the data was binary "zeroes". Now, I don't know what it looks like when you have a file part flushed at the time of bluescreen, but I wouldn't be surprised if it looked like that.
Can I guarantee this is the cause? No I cannot guarantee this. I'm just speculating. It could just be that the file was "naturally" corrupted due to disk failure or as a result of the blue screen.
With regards to performance, this is something I believe I can deal with.
For example, the default behaviour of SQLite is to do a full file flush (using FlushFileBuffers) every time you commit a transaction. They are quite clear that if you don't do this then at the time of system crash, you might have a corrupted DB.
Also, I believe I can mitigate the performance hit by only flushing at "checkpoints". For example, writing 50 files, flushing the lot and then writing to the DB.
How likely is all this to be a problem? Beats me. But then my app might well be archiving at or around the time of system failure, so it might be more likely than you think.
Hope that explains why I want to do this.

Why would you want this? The OS will make sure that the data is flushed to the disk in due time. If you access it, it will either return the data from the cache or from disk, so this is transparent for you.
If you need some safety in case of disaster, then you must call FlushFileBuffers, for example by creating a process with admin rights after running the external process. But that can severely impact the performance of the whole machine.
Your only other option is to modify the source of the other process.
[EDIT] The most simple solution is probably to copy the file in your process and then flush the copy (since you have the handle). Save the copy under a name which says "not committed in the database".
Then update the database. Write into the database, "updated from file ...". If this entry already exists next time, don't update the database and skip this step.
Flush the database to disk.
Rename the file to "file has been processed into database". Rename is an atomic operation (so it either happens or not).
If you can't think of a good filename for the different states, then use subfolders and move the file between them.

Well, there are no attractive options here. There is no documented way to retrieve the file handle you need from the process. Although there are undocumented ones, go there (via DuplicateHandle) only with careful consideration.
Yes, calling FlushFileBuffers on a volume handle is the documented way. You can avoid the privilege problem by letting a service make the call. Talk to it from your app with one of the standard process interop mechanisms. A named pipe whose name is prefixed with Global\ is probably the easiest way to get that going.

After your update I think http://sqlite.org/atomiccommit.html gives you the answers you need.
The way SQLite ensures that everything is flushed to disc works. So it works for you as well - take a look at the source.
