Cpp file handling using transaction like sql, commit and rollback - c++

I need to write to multiple files in C++ on Windows. If anything goes wrong, all changes should be rolled back; otherwise the changes to all files should be committed at the same time. Is this possible, and if there is a library for it, please suggest one.

This would require a mechanism for file system operations resembling the database transactions you mention. The C++ standard doesn't provide one, so you'd rely entirely on the operating system, and none of the ones I'm aware of provides it either (maybe there's some specialised Linux distribution that does?).
All you can do is try to get as close as possible, e.g. with the following approach:
Write out all the new files as temporary copies, ideally into a dedicated temporary directory.
Rename all original files/move them into the dedicated directory (we'll keep them as backup for now).
Rename/move all new temporaries to original file names/folders.
Finally delete the backups.
If anything goes wrong, you can delete all the new temporaries again and rename the backups back to their original names. The exception is an error that occurs while deleting the backups: the new files are already in place by then, so you can just leave the remaining backups.
If you keep a log file and add an entry for every task both when it starts and when it completes, you know exactly where an error occurred. You can then safely restore the original state later on even if the recovery itself failed (database management systems use similar logging internally to implement their transactions).
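The steps above can be sketched with std::filesystem (C++17). This is only a minimal sketch under assumptions: PendingWrite, commit_all and the staging directory are hypothetical names, the staging directory is assumed to be dedicated to this operation and on the same volume as the targets so that renames are cheap, and the log file is left out. It narrows the window for inconsistency but is still not a true atomic transaction.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// One pending change: a target path and the complete new contents for it.
struct PendingWrite {
    fs::path target;
    std::string contents;
};

// Hypothetical helper (not a library API): installs all new files, or restores
// the originals on the first failure. Returns true if everything was committed.
bool commit_all(const std::vector<PendingWrite>& writes, const fs::path& staging) {
    assert(!staging.empty());
    std::error_code ec;
    fs::create_directories(staging, ec);
    if (ec) return false;

    const auto fresh = [&](std::size_t i) { return staging / ("new_" + std::to_string(i)); };
    const auto backup = [&](std::size_t i) { return staging / ("old_" + std::to_string(i)); };

    // Step 1: write every new version as a temporary copy.
    for (std::size_t i = 0; i < writes.size(); ++i) {
        std::ofstream out(fresh(i), std::ios::binary | std::ios::trunc);
        out << writes[i].contents;
        out.close();
        if (!out) { fs::remove_all(staging, ec); return false; }
    }

    // Steps 2 and 3: move each original aside, then move the new file into place.
    std::vector<bool> backed_up(writes.size(), false), installed(writes.size(), false);
    bool ok = true;
    for (std::size_t i = 0; i < writes.size() && ok; ++i) {
        if (fs::exists(writes[i].target, ec)) {
            fs::rename(writes[i].target, backup(i), ec);
            if (ec) { ok = false; break; }
            backed_up[i] = true;
        }
        fs::rename(fresh(i), writes[i].target, ec);
        if (ec) ok = false; else installed[i] = true;
    }

    if (!ok) {
        // Rollback: drop the files already installed and put the backups back.
        for (std::size_t i = 0; i < writes.size(); ++i) {
            if (installed[i]) fs::remove(writes[i].target, ec);
            if (backed_up[i]) fs::rename(backup(i), writes[i].target, ec);
        }
    }

    // Step 4: delete the backups (and leftover temporaries); errors are ignored here.
    fs::remove_all(staging, ec);
    return ok;
}
```

A caller would pass the complete new contents of every file in one call, e.g. commit_all({{"a.txt", "..."}, {"b.txt", "..."}}, "staging"), and treat a false return as "nothing changed" on a best-effort basis.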

Related

Do Memory Mapped Files need Mutex when they are read only?

Recently, something happened with our windows c/c++ applications.
We use a DLL to map files to page file, and our applications read these shared files through memory mapping.
Everything is OK when we just run a single instance of application.
Sometimes we get nothing (just zeros), with no error or exception, from the mapped memory when we run 24 instances at the same time.
It seems that this problem happens more often on a slower storage device.
If the files are stored on a slower device (say, AWS EFS), about 6 of the 24 instances hit this problem every time.
But if we move the files to AWS EBS, only about 1 or 2 of the 24 instances hit it, and not every time.
I guess maybe there are some conflicts during massive accessing?
Do I need mutex for these read only files?
The mutex is just for protecting writable objects, am I right?
More information:
Everything happened INSIDE that DLL.
EXEs just use this DLL to get TRUE or FALSE.
The DLL is used to judge whether some given data belong to a certain file.
Some structs describe the data structure of the files. The problem is that a certain struct sometimes reads as 0 when it should not, but not every time.
I logged the parameters inside the DLL; they are passed to the DLL correctly every time.
I still don't know how or why this happened, but I found that I can avoid the problem simply by adding a retry to that judge function.
I still think this is some kind of I/O problem because a retry avoids it, but I have no further evidence.
And maybe the title does not really fit this problem, so I think it's time to close the question.
Finally, I figured it out.
This is NOT a memory mapped file problem, it is a LOGICAL problem.
Our DLL does not have sufficient privileges, so when we shared our data into memory, nobody else could see it!
Our applications are designed to load the data themselves if they cannot find any shared data, which is why EFS and EBS behaved differently.
These applications are very old, no documentation is left, and nobody knows how they work, so I had to dig the information out of the source code ...

With what API do you perform a read-consistent file operation in OS X, analogous to Windows Volume Shadow Service

We're writing a C++/Objective C app, runnable on OSX from versions 10.7 to present (10.11).
Under Windows, there is the concept of a shadow file, which allows you to read a file as it exists at a certain point in time, without having to worry about other processes writing to that file in the interim.
However, I can't find any documentation or online articles discussing a similar feature in OS X. I know that OS X will not lock a file when it's being written to, so is it necessary to do something special to make sure I don't pick up a file that is in the middle of being modified?
Or does the Journaled Filesystem make any special handling unnecessary? I'm concerned that if I have one process that is creating or modifying files (within a single context of, say, an fopen call - obviously I can't be guaranteed of "completeness" if the writing process is opening and closing a file repeatedly during what should be an atomic operation), that a reading process will end up getting a "half-baked" file.
And if JFS does guarantee that readers only see "whole" files, does this extend to FAT32 volumes that may be mounted as external drives?
A few things:
On Unix, once you open a file, if it is replaced (as opposed to modified), your file descriptor continues to access the file you opened, not its replacement.
Many apps will replace rather than modify files, using things like -[NSData writeToFile:atomically:] with YES for atomically:.
Cocoa and the other high-level frameworks do, in fact, lock files when they write to them, but that locking is advisory not mandatory, so other programs also have to opt in to the advisory locking system to be affected by that.
The modern approach is File Coordination. Again, this is a voluntary system that apps have to opt in to.
There is no feature quite like what you described on Windows. If the standard approaches aren't sufficient for your needs, you'll have to build something custom. For example, you could make a copy of the file that you're interested in and, after your copy is complete, compare it to the original to see if it was being modified as you were copying it. If the original has changed, you'll have to start over with a fresh copy operation (or give up). You can use File Coordination to at least minimize the possibility of contention from cooperating programs.
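To illustrate the first two points (replace rather than modify, and an already open descriptor keeps seeing the old file), here is a minimal C++17 sketch for Unix-like systems. The names replace_file and read_all and the ".tmp" suffix are made up for illustration; this is not what NSData does internally, only the same general idea.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <istream>
#include <iterator>
#include <string>

namespace fs = std::filesystem;

// Writer side: write the complete new contents to a sibling temporary file,
// then rename it over the target. Readers see either the old or the new file,
// never a half-written one.
void replace_file(const fs::path& target, const std::string& contents) {
    assert(target.has_filename());
    fs::path tmp = target;
    tmp += ".tmp";  // hypothetical naming scheme for the temporary file
    {
        std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
        out << contents;
    }  // stream is closed here, before the rename
    fs::rename(tmp, target);  // throws std::filesystem::filesystem_error on failure
}

// Reader side: reads everything from an already open stream.
std::string read_all(std::istream& in) {
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

A reader that opened the file before a call to replace_file keeps reading the old contents; a reader that opens it afterwards gets the new contents. This only helps if the writer cooperates by replacing instead of modifying in place.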

C++ Boost Object Serialization - Periodic Saving to Protect Data

I have a program that uses boost serialization that loads on program start up and saves on shutdown.
Every once in a while, the program will crash due to this or that and I expect that to be fairly normal. The problem is that when the program crashes, often the objects are not saved at all. Other times, some will be missing or the data will be corrupted. This could be disastrous if a user loses months and months of data. In a perfect world, every one would backup their data and they could just roll back the data file.
My first solution is to periodically save the objects to a different temporary data file during run time. That way if the program crashes they can revert to the temporary data file with minimal data loss. My concern is the effect on performance. As far as I understand (correct me if I am wrong), once you save an object, it can't be used anymore? If that is the case, then the periodic save routine would involve saving and deleting my pointers, then loading them up again.
My second solution is to simply make a copy of the data file during program start up. The user's loss of data would be limited to that session. However, this may not be sufficient as some users may run the program for days and days.
Any input would be appreciated.
Thanks in advance.
If you save an object graph with boost serialization, that object graph is still available and can be saved again without necessarily reading anything from disk.
If you want to go high-tech and introduce a lot more complexity, you can use Boost Interprocess library with a managed_shared_memory segment. This enables you to actually transparently work directly on a disk file (actually, on memory pages backed by file blocks). This introduces another issue, actually: how to prevent changes from frequently hitting the disk.
Gratuitous advice:
I think the best of all worlds would be if your object graph is (e.g.) a Composite pattern where all nodes are shared immutables. Now serialization is "free" (with Boost), you can easily handle multiple versions of the program state (often a "document" or "database", logically) and efficiently save/load them with Boost Serialization. This pattern facilitates undo/redo, concurrent operations, transactional commit ¹ etc.
¹ (! not without extra work, but in principle)
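A rough sketch of what "shared immutables" could look like, without the Boost Serialization part. Node, NodePtr, make_node and with_child are hypothetical names. Every change produces a new root that shares all untouched nodes with the previous version, so an old version stays valid and can be saved (or used for undo) while the program keeps working on the new one.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Node;
using NodePtr = std::shared_ptr<const Node>;  // nodes are never modified once built

struct Node {
    std::string value;
    std::vector<NodePtr> children;
};

NodePtr make_node(std::string value, std::vector<NodePtr> children = {}) {
    return std::make_shared<Node>(Node{std::move(value), std::move(children)});
}

// Returns a new parent in which child `index` is replaced. All other children
// are shared with the old parent, not copied.
NodePtr with_child(const NodePtr& parent, std::size_t index, NodePtr replacement) {
    assert(parent && index < parent->children.size());
    std::vector<NodePtr> children = parent->children;  // copies pointers only
    children[index] = std::move(replacement);
    return make_node(parent->value, std::move(children));
}
```

Keeping the old root pointer around is then all it takes to have a consistent snapshot for a periodic save.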

c++ program to find a file currently open in gvim?

I want to rename a folder, e.g. "mv -f old_proj_name new_proj_name".
But since a file in it is open in the gvim editor, the rename operation on the folder is not allowed.
The file is not moved to the new folder name.
Manually, I used the Unlocker software to check whether the file is locked by another process.
fopen() does not show that the file is locked when it is open in the gvim editor.
I tried the opendir() API as well, but it didn't help.
Now I want to implement this lock check in my code, so that before doing the rename operation I am able to know whether I can do it successfully or not.
Please guide me.
Regards,
Amol
before doing the rename operation I am able to know whether I can do it successfully or not.
This is a fallacy. You can only know whether you could perform the operation successfully at the time of the check. To know whether you can do it now, you need to check for it now. But when you actually get around to performing it, that "now" will turn to "back then". To have a reliable indication, you need to check again.
Don't you think it will get tiresome really fast?
So there are two ways of dealing with this.
First, you can hope (but never know) that nothing important happens between the check and the actual operation.
Second, you may skip the check altogether and just attempt the operation. If it fails, then you can't do it. There, you have killed two birds with one stone: you have checked whether an operation is possible, and performed it in the case it is indeed possible.
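The second option could look like this in C++17. It is a minimal sketch; try_rename is a made-up helper name, and the error text you get back depends on the platform.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Attempts the rename without any prior check. Returns an empty string on
// success, otherwise the operating system's error message.
std::string try_rename(const fs::path& from, const fs::path& to) {
    assert(!from.empty() && !to.empty());
    std::error_code ec;
    fs::rename(from, to, ec);  // non-throwing overload, reports via ec
    return ec ? ec.message() : std::string();
}
```

If the returned string is not empty, the folder was not renamed and you can tell the user why, e.g. ask them to close the editor and try again.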
Update
If your data is organised in such a way that you have to perform several operations that may fail, and data consistency depends on all these operations succeeding or failing at once, then there's an inherent problem. You can check for some known failure conditions, but (a) you can never check for all possible failure conditions, and (b) any check is valid just for the moment it's performed. So any such check will not be fully reliable. You may be able to prevent some failures but not others. An adequate solution to this would be data storage with proper rollback facility built in, i.e. a database.
Hope it helps.

How to guarantee files that are decrypted during run time are cleaned up?

Using C or C++, After I decrypt a file to disk- how can I guarantee it is deleted if the application crashes or the system powers off and can't clean it up properly? Using C or C++, on Windows and Linux?
Unfortunately, there's no 100% foolproof way to ensure that the file will be deleted in case of a full system crash. Think about what happens if the user just pulls the plug while the file is on disk. No amount of exception handling will protect you from that (the worst) case.
The best thing you can do is not write the decrypted file to disk in the first place. If the file exists in both its encrypted and decrypted forms, that's a point of weakness in your security.
The next best thing you can do is use Brian's suggestion of structured exception handling to make sure the temporary file gets cleaned up. This won't protect you from all possibilities, but it will go a long way.
Finally, I suggest that you check for temporary decrypted files on start-up of your application. This will allow you to clean up after your application in case of a complete system crash. It's not ideal to have those files around for any amount of time, but at least this will let you get rid of them as quickly as possible.
Don't write the file decrypted to disk at all.
If the system is powered off, the file is still on the disk, so the disk and therefore the file can be accessed.
The exception would be the use of an encrypted file system, but that is outside the control of your program.
I don't know if this works on Windows, but on Linux, assuming that you only need one process to access the decrypted file, you can open the file, and then call unlink() to delete the file. The file will continue to exist as long as the process keeps it open, but when it is closed, or the process dies, the file will no longer be accessible.
Of course, the contents of the file are still on the disk, so really you need more than just deleting it: you need to zero out the contents. Is there any reason that the decrypted file needs to be on disk (size?)? It would be better to just keep the decrypted version in memory, preferably marked as unswappable, so it never hits the disk.
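A minimal sketch of the open-then-unlink idea for Linux and other POSIX systems. open_unlinked is a made-up name; the calls themselves (open, unlink, close) are the plain POSIX ones. As said above, the data blocks still end up on the disk, so this only takes care of the name disappearing.

```cpp
#include <cassert>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Creates a new file readable only by the owner, removes its name right away
// and returns the still-open descriptor, or -1 on failure. The data stays
// reachable through the descriptor until it is closed or the process dies.
int open_unlinked(const std::string& path) {
    assert(!path.empty());
    // O_EXCL makes the call fail if something already exists at that path.
    const int fd = ::open(path.c_str(), O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd == -1) return -1;
    if (::unlink(path.c_str()) == -1) {
        ::close(fd);
        return -1;
    }
    return fd;
}
```

After the call there is no directory entry left for a crash to leave behind; the process keeps working with the descriptor via read(), write() and lseek().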
Try to avoid it completely:
If the file is sensitive, the best bet is to not have it written to disk in a decrypted format in the first place.
Protecting against crashes: Structured exception handling:
However, you could add structured exception handling to catch any crashes.
__try and __except
What if they pull the plug?:
There is a way to protect against this...
If you are on Windows, you can use MoveFileEx with the option MOVEFILE_DELAY_UNTIL_REBOOT and a destination of NULL to delete the file on the next startup. This will protect against an accidental computer shutdown leaving an undeleted file behind. You can also ensure that you hold an exclusively opened handle to this file: open it with CreateFile and pass 0 as the share mode, i.e. grant no sharing rights such as FILE_SHARE_READ. That way no one else will be able to read from it while your handle is open.
Other ways to avoid the problem:
All of these are not excuses for having a decrypted file on disk, but:
You could also consider writing to a file whose path is longer than MAX_PATH, using the \\?\ path syntax. This will ensure that the file is not browsable by Windows Explorer.
You should set the file to have the temporary attribute.
You should set the file to have the hidden attribute.
In C (and so, I assume, in C++ too), as long as your program doesn't crash, you could register an atexit() handler to do the cleanup. Just avoid using _exit() or _Exit() since those bypass the atexit() handlers.
As others pointed out, though, it is better to avoid having the decrypted data written to disk. And simply using unlink() (or equivalent) is not sufficient; you need to rewrite some other data over the original data. And journalled file systems make that very difficult.
A process cannot protect or watch itself. Your only possibility is to start up a second process as a kind of watchdog, which regularly checks the health of the decrypting other process. If the other process crashes, the watchdog will notice and delete the file itself.
You can do that using heartbeats (regular polling of the other process to see whether it's still alive), or using interrupts sent from the other process itself, which will trigger a timeout if it has crashed.
You could use sockets to make the connection between the watchdog and your app work, for example.
It's becoming clear that you need some locking mechanism to prevent swapping to the pagefile / swap partition. On POSIX systems, this can be done with the m(un)lock* family of functions.
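A small sketch of how mlock()/munlock() could be wrapped on a POSIX system. LockedBuffer is a hypothetical class, not part of any library. Note that mlock() can fail (for example because of the RLIMIT_MEMLOCK limit), so the sketch records whether the lock actually succeeded instead of assuming it did.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <vector>

// Owns a buffer that the kernel is asked to keep out of swap. The contents
// are wiped before the memory is unlocked and released.
class LockedBuffer {
public:
    explicit LockedBuffer(std::size_t size) : data_(size), locked_(false) {
        assert(size > 0);
        // Linux rounds the address down to a page boundary; some systems may
        // require page-aligned memory here.
        locked_ = (::mlock(data_.data(), data_.size()) == 0);
    }
    ~LockedBuffer() {
        // Write through a volatile pointer so the wipe is not optimised away.
        volatile unsigned char* p = data_.data();
        for (std::size_t i = 0; i < data_.size(); ++i) p[i] = 0;
        if (locked_) ::munlock(data_.data(), data_.size());
    }
    LockedBuffer(const LockedBuffer&) = delete;
    LockedBuffer& operator=(const LockedBuffer&) = delete;

    bool locked() const { return locked_; }  // false if mlock() was refused
    unsigned char* data() { return data_.data(); }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<unsigned char> data_;
    bool locked_;
};
```

The decrypted bytes and the key would live in such a buffer; if locked() returns false, the program has to decide whether to continue with swappable memory or refuse to decrypt.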
There's a problem with deleting the file. It's not really gone.
When you delete files off your hard drive (not counting the recycle bin) the file isn't really gone. Just the pointer to the file is removed.
Ever see those spy movies where they overwrite the hard drive 6, 8, or 24 times, and that's how they know that it's clean? Well, they do that for a reason.
I'd make every effort not to store the file's decrypted data. Or, if you must, store only small amounts of data, or even disjointed data.
If you must, then a try/catch should protect you a bit. Nothing can protect you from a power outage, though.
Best of luck.
Check out tmpfile().
It is part of the C standard library (declared in <stdio.h>), not just BSD UNIX.
It creates a temporary file and automatically unlinks it so that it will be deleted on close.
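For reference, a minimal usage sketch of std::tmpfile() from C++. round_trip_via_tmpfile is just an illustrative name; the function writes some bytes to the anonymous temporary file and reads them back before closing it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Writes `data` to an anonymous temporary file and reads it back.
// Returns an empty string if the temporary file could not be created.
std::string round_trip_via_tmpfile(const std::string& data) {
    std::FILE* f = std::tmpfile();  // opened for update in binary mode
    if (!f) return {};
    std::fwrite(data.data(), 1, data.size(), f);
    std::rewind(f);
    std::string result(data.size(), '\0');
    const std::size_t n = std::fread(&result[0], 1, result.size(), f);
    assert(n <= data.size());
    result.resize(n);
    std::fclose(f);  // the temporary file is removed here
    return result;
}
```

There is never a file name to clean up, but the same caveat as everywhere else applies: the bytes are still written to the file system while the file is open.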
Writing to the file system (even temporarily) is insecure.
Do that only if you really have to.
Optionally you could create an in-memory file system.
Never used one myself so no recommendations but a quick google found a few.
In C++ you should use an RAII tactic:
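#include <cstdio>
#include <fstream>
#include <string>
class Clean_Up_File {
    std::string filename_;
public:
    explicit Clean_Up_File(const std::string& filename) : filename_(filename) { std::ofstream create(filename_); } // open/create file
    ~Clean_Up_File() { std::remove(filename_.c_str()); } // delete file
    Clean_Up_File(const Clean_Up_File&) = delete; // copying would delete the file twice
    Clean_Up_File& operator=(const Clean_Up_File&) = delete;
};
int main()
{
    Clean_Up_File file_will_be_deleted_on_program_exit("my_file.txt");
}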
RAII helps automate a lot of cleanup. You simply create an object on the stack, and have that object do clean up at the end of its lifetime (in the destructor which will be called when the object falls out of scope). ScopeGuard even makes it a little easier.
But, as others have mentioned, this only works in "normal" circumstances. If the user unplugs the computer you can't guarantee that the file will be deleted. And it may be possible to undelete the file (even on UNIX it's possible to "grep the harddrive").
Additionally, as pointed out in the comments, there are some cases where objects don't fall out of scope (for instance, the std::exit(int) function exits the program without leaving the current scope), so RAII doesn't work in those cases. Personally, I never call std::exit(int), and instead I either throw exceptions (which will unwind the stack and call destructors; which I consider an "abnormal exit") or return an error code from main() (which will call destructors and which I also consider an "abnormal exit"). IIRC, sending a SIGKILL also does not call destructors, and SIGKILL can't be caught, so there you're also out of luck.
This is a tricky topic. Generally, you don't want to write decrypted files to disk if you can avoid it. But keeping them in memory doesn't always guarantee that they won't be written to disk as part of a pagefile or otherwise.
I read articles about this a long time ago, and I remember there being some difference between Windows and Linux in that one could guarantee a memory page wouldn't be written to disk and one couldn't; but I don't remember clearly.
If you want to do your due diligence, you can look that topic up and read about it. It all depends on your threat model and what you're willing to protect against. After all, you can use compressed air to chill RAM and pull the encryption key out of that (which was actually on the new Christian Slater spy show, My Own Worst Enemy - which I thought was the best use of cutting edge, accurate, computer security techniques in media yet)
On Linux/Unix, use unlink as soon as you have created the file. The file will be removed as soon as your program closes the file descriptor or exits.
Better yet, the file will be removed even if the whole system crashes - because it is basically removed as soon as you unlink it.
The data will not be physically deleted from the disk, of course, so it still may be available for hacking.
Remember that the computer could be powered down at any time. Then, somebody you don't like could boot up with a Linux live CD, and examine your disk in any level of detail desired without changing a thing. No system that writes plaintext to the disk can be secure against such attacks, and they aren't hard to do.
You could set up a function that will overwrite the file with ones and zeros repeatedly, preferably injecting some randomness, and set it up to run at end of program, or at exit. This will work, provided there are no hardware or software glitches, power failures, or other interruptions, and provided the file system writes to only the sectors it claims to be using (journalling file systems, for example, may leave parts of the file elsewhere).
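Such an overwrite function might look roughly like the following C++17 sketch. overwrite_and_remove and the default of three passes are arbitrary choices for illustration. All the caveats above still apply: it cannot guarantee that the file system or the drive really overwrites the same sectors, and flush() only hands the data to the operating system.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <random>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Overwrites the file in place with random bytes `passes` times, then removes
// it. Returns false if any step fails.
bool overwrite_and_remove(const fs::path& path, int passes = 3) {
    assert(passes > 0);
    std::error_code ec;
    const std::uintmax_t size = fs::file_size(path, ec);
    if (ec) return false;

    std::random_device rd;
    std::vector<char> block(4096);
    for (int pass = 0; pass < passes; ++pass) {
        // in | out opens the existing file without truncating it.
        std::fstream f(path, std::ios::in | std::ios::out | std::ios::binary);
        if (!f) return false;
        for (std::uintmax_t written = 0; written < size;) {
            for (char& c : block) c = static_cast<char>(rd());
            const std::uintmax_t n = std::min<std::uintmax_t>(block.size(), size - written);
            f.write(block.data(), static_cast<std::streamsize>(n));
            written += n;
        }
        f.flush();
        if (!f) return false;
    }
    return fs::remove(path, ec) && !ec;
}
```

It could be called from a destructor or an atexit() handler, with the limits of both as discussed in the other answers.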
Therefore, if you want security, you need to make sure no plaintext is written out, and that also means it cannot be written to swap space or the equivalent. Find out how to mark memory as unswappable on all platforms you're writing for. Make sure decryption keys and the like are treated the same way as plaintext: never written to the disk under any circumstances, and kept in unswappable memory.
Then, your system should be secure against attacks short of hostiles breaking in, interrupting you, and freezing your RAM chips before powering down, so they don't lose their contents before being transferred for examination. Or authorities demanding your key, legally (check your local laws here) or illegally.
Moral of the story: real security is hard.
The method that I am going to implement will be to stream the decryption- so that the only part that is in memory is the part that is decrypted during the read as the data is being used. Here is a diagram of the pipeline:
This will be a streamed implementation, so the only data that is in memory is the data that I am consuming in the application at any given point. This makes some things tricky, considering a lot of traditional file tricks are no longer available. But since the implementation will be stream based, I will still be able to seek to different points of the file; a seek would be translated by the crypt stream into decrypting a different section.
Basically, it will be encrypting blocks of the file at a time - so then if I try to seek to a certain point it will decrypt that block to read. When I read past a block it decrypts the next block and releases the previous (within the crypt stream).
This implementation does not require me to decrypt to a file or to memory and is compatible with other stream consumers and providers (fstream).
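To make the idea concrete, here is a rough sketch of block-wise decryption on demand. Everything in it is an assumption for illustration: BlockDecryptingReader and toy_crypt are made-up names, the "cipher" is a single-byte XOR placeholder that is NOT secure and only stands in for a real block cipher, and it is a plain class rather than a real std::streambuf.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// PLACEHOLDER cipher: XOR with one byte. Not secure, illustration only.
inline void toy_crypt(std::vector<char>& block, char key) {
    for (char& c : block) c = static_cast<char>(c ^ key);
}

// Keeps exactly one decrypted block in memory at a time.
class BlockDecryptingReader {
public:
    BlockDecryptingReader(const std::string& path, std::size_t block_size, char key)
        : in_(path, std::ios::binary), block_size_(block_size), key_(key) {
        assert(block_size > 0);
    }

    // Reads up to `count` plaintext bytes starting at plaintext offset `pos`.
    std::string read_at(std::size_t pos, std::size_t count) {
        std::string out;
        while (count > 0) {
            if (!load_block(pos / block_size_)) break;
            const std::size_t offset = pos % block_size_;
            if (offset >= current_.size()) break;  // past the end of the file
            const std::size_t n = std::min(count, current_.size() - offset);
            out.append(current_.data() + offset, n);
            pos += n;
            count -= n;
        }
        return out;
    }

private:
    // Decrypts block `index` into current_, replacing the previous block.
    bool load_block(std::size_t index) {
        if (has_block_ && index == current_index_) return true;
        in_.clear();
        in_.seekg(static_cast<std::streamoff>(index * block_size_));
        current_.assign(block_size_, '\0');
        in_.read(current_.data(), static_cast<std::streamsize>(block_size_));
        current_.resize(static_cast<std::size_t>(in_.gcount()));
        if (current_.empty()) { has_block_ = false; return false; }
        toy_crypt(current_, key_);
        current_index_ = index;
        has_block_ = true;
        return true;
    }

    std::ifstream in_;
    std::size_t block_size_;
    char key_;
    std::vector<char> current_;
    std::size_t current_index_ = 0;
    bool has_block_ = false;
};
```

Seeking is just a matter of calling read_at with a different offset; only the block that contains that offset gets decrypted. A real implementation would also wipe the previous block before releasing it and would keep the buffer in unswappable memory.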
This is my 'plan'. I have not done this type of work with fstream before and I will likely be posting a question as soon as I am ready to work on this.
Thanks for all the other answers- it was very informative.