How to know when writing a file has been finished? - c++

I'm writing a program that only reads a file while another program is still writing it. I have no control over the writer, so I can't make it signal events, and I don't know the content of the file. I want a way to know when that program has finished writing so that my program can stop operating on the file. I have tried these two methods, but I don't know which one is more reliable and performs better:
1- Rename the file to another name; if that succeeds, rename it back to the original name.
2- Flush the file; if the file size has not changed for a while (e.g. 5 sec), stop the operation.
Which one is better? Is there a better way (more reliable and with better performance)?
I'm using Windows 7 and Qt 5.2 (or Visual Studio) for C++.

Qt provides a class called QFileSystemWatcher which allows you to monitor files and directories.

Related

How to check if a file is used by another process in C++?

I need to check if a file is currently opened by another process, e.g. a text editor (but needs to apply to everything else too).
I tried using std::ofstream::is_open() etc., but this did not work. I could open the file in my text editor while my program was checking if it was open. The program saw it as a closed file and went on. Only if I opened it as another ofstream would this work.
I'm using the filesystem library to copy files and they may only be copied (and later removed) if the file is not currently written to by another process on the client server.
Really curious about this one. Been wondering this for quite some time but never found a good way for it myself.
I'm currently making a program that needs to run on both Linux and Windows. Every 5 seconds it copies all files from directories a, b, c, d to x (the client can configure this in rules). After everything has been copied, the files may be removed. After a day (or whatever interval the client tells the program), all the files in x need to be zipped and archived at location y. Hence the problem: files may only be copied (and deleted) if the other programs that place files in directories a, b, c, d are not touching that specific file right now. Hope that makes the question clearer.
And before anybody starts. Yes I know about the data race condition. I do not care about this for now. The program does absolutely nothing with the contents of a file. And after a file is closed by the other process, it will be closed forever.
I need to check if a file is currently opened by another process
This is heavily operating-system specific (and might be useless).
So first read a good textbook on operating systems.
On Linux specifically, you might use the inotify(7) facilities, the /proc/ pseudo-file system (see proc(5)), or perhaps lsof(8). They work only for local file systems (not remote ones, like NFS). See also Advanced Linux Programming and syscalls(2).
And you could be in for surprises (e.g. another process might be scheduled and remove a file so quickly that you have no time to do anything with it).
For Windows, take more time to read its documentation.
I'm currently making a program that needs to be able to run on both linux and windows. every 5 seconds it copies all files from directory a,b,c,d to x.
You might look, at least for inspiration, inside the source code of rsync.
I don't understand what your actual problem is, but rsync might be part of the solution and is rumored to run on both Windows and Linux

How to force file flushing

Suppose that I have the following code:
#include <chrono>
#include <fstream>
#include <thread>
int main()
{
    std::ofstream f("test.log");
    int i = 0;
    while (true)
    {
        f << i++;
        f.flush();
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
    }
}
(note that I have a flush call after each write operation)
I noticed that this application doesn't update "last modified time" and "size" attributes of the "test.log" file unless I do a right-click on this file or open it.
I guess that this is due to internal buffering (the system doesn't want to perform time-consuming operations such as actual disk I/O unless forced to do so). Am I right?
I need to write an application that should watch for changes in log files created by other applications (I can't change them). At first, I thought about FileSystemWatcher class in C# but I noticed that it has the same behavior (it doesn't fire a corresponding event unless file was closed in a source application or was forced to update by right-clicking that file in Windows Explorer). What can I do then? Call WinAPI functions like GetFileAttributes for every file that I want to look for as often as I can?
There are two separate things here. First, the last modified time on the file MFT record (inode equivalent) is updated every time you write to it.
However the information returned by FindFirstFile and friends is not from the file, it is from information cached in the directory entry. This cache is updated whenever a file is closed which was opened through that directory entry. This is the information displayed by most applications, such as Windows Explorer, and the command prompt DIR command.
If you want to know when a file was updated you need to do the equivalent of a Unix stat operation which reads the MFT record (inode). This requires opening a handle to the file, calling GetFileInformationByHandle and closing the handle again.
The second thing is that there is a good reason not to do this. If a program is writing to a file, it may be partway through the writing process. Therefore the file may be in an invalid (corrupt) state. To ensure that the file is in a valid state you should wait until the file has been closed. This is how you know that the file is now ready to look at.
Once the writing program has finished writing to the file, the directory entry will be updated and FileSystemWatcher will show the file.
If you are absolutely sure you want to see notifications for files which are still in the process of being written, then you can look into the USN change journal as an option. I don't know whether it is kept more up to date than the directory entries; you will have to investigate that.

Syncing independent applications. (How to check if a file was modified by another program on runtime)

It is easier to explain with an example.
When two text editors edit the same text file at the same time and one editor saves the file, the other one notices that it was modified and asks what to do.
How can a program get a signal that a file was modified from outside?
I am working with C++ (though I don't think that matters) on Linux. (A solution for Windows would be good too.)
ISO-C++ does not offer this functionality, so you have to stick with what the operating system provides.
On Linux that would be inotify, on Windows you would use directory change notifications.
① Check the timestamp of the file as close as possible before writing. If it is not what it was when you last opened this file for reading, then beware!
② You can build a checksum of the file and compare this to one you built earlier.
③ Register to a system service which informs you about file activities. This depends on the goodwill of the OS you are using; if this notification service isn't working properly, your stuff will fail. On Linux have a look at Inotify.
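Options ① and ② can be combined into a small snapshot that is taken when the file is read and compared again just before writing. The following is only a sketch, assuming C++17 std::filesystem; the names FileSnapshot, fnv1a_file and take_snapshot are hypothetical, and FNV-1a is used here merely as a simple stand-in for whatever checksum you prefer.

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <ios>

namespace fs = std::filesystem;

// Hypothetical record of what the file looked like at one point in time.
struct FileSnapshot {
    fs::file_time_type mtime{};
    std::uintmax_t size = 0;
    std::uint64_t checksum = 0;  // 64-bit FNV-1a over the file contents

    bool operator==(const FileSnapshot& o) const {
        return mtime == o.mtime && size == o.size && checksum == o.checksum;
    }
    bool operator!=(const FileSnapshot& o) const { return !(*this == o); }
};

// Computes a 64-bit FNV-1a hash of the whole file.
std::uint64_t fnv1a_file(const fs::path& file)
{
    std::ifstream in(file, std::ios::binary);
    std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
    char buf[4096];
    while (in.read(buf, sizeof buf) || in.gcount() > 0) {
        for (std::streamsize i = 0; i < in.gcount(); ++i) {
            h ^= static_cast<unsigned char>(buf[i]);
            h *= 1099511628211ull;  // FNV prime
        }
    }
    return h;
}

// Throws std::filesystem::filesystem_error if the file cannot be inspected.
FileSnapshot take_snapshot(const fs::path& file)
{
    FileSnapshot s;
    s.mtime = fs::last_write_time(file);
    s.size = fs::file_size(file);
    s.checksum = fnv1a_file(file);
    return s;
}
```

Comparing the timestamp and size first is cheap; the checksum catches the case where a file was rewritten with the same size within the timestamp resolution, at the cost of reading the whole file.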

How to check if a file is still being written?

How can I check if a file is still being written? I need to wait for a file to be created, written and closed again by another process, so I can go on and open it again in my process.
In general, this is a difficult problem to solve. You can ask whether a file is open, under certain circumstances; however, if the other process is a script, it might well open and close the file multiple times. I would strongly recommend you use an advisory lock, or some other explicit method for the other process to communicate when it's done with the file.
That said, if that's not an option, there is another way. If you look in the /proc/<pid>/fd directories, where <pid> is the numeric process ID of some running process, you'll see a bunch of symlinks to the files that process has open. The permissions on the symlink reflect the mode the file was opened for - write permission means it was opened for write mode.
So, if you want to know if a file is open, just scan over every process's /proc entry, and every file descriptor in it, looking for a writable symlink to your file. If you know the PID of the other process, you can directly look at its proc entry, as well.
This has some major downsides, of course. First, you can only see open files for your own processes, unless you're root. It's also relatively slow, and only works on Linux. And again, if the other process opens and closes the file several times, you're stuck - you might end up seeing it during the closed period, and there's no easy way of knowing if it'll open it again.
You could let the writing process write a sentinel file (say "sentinel.ok") after it is finished writing the data file your reading process is interested in. In the reading process you can check for the existence of the sentinel before reading the data file, to ensure that the data file is completely written.
@blu3bird's idea of using a sentinel file isn't bad, but it requires modifying the program that's writing the file.
Here's another possibility that also requires modifying the writer, but it may be more robust:
Write to a temporary file, say "foo.dat.part". When writing is complete, rename "foo.dat.part" to "foo.dat". That way a reader either won't see "foo.dat" at all, or will see a complete version of it.
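On the writer side, the write-then-rename approach could be sketched as follows. This assumes C++17 std::filesystem and that the temporary file and the final file live on the same file system, where a rename is normally a single step as far as readers are concerned; publish_file is a hypothetical name, and the ".part" suffix is taken from the example above.

```cpp
#include <filesystem>
#include <fstream>
#include <ios>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical writer-side helper: writes `contents` to "<final_path>.part"
// and renames it to `final_path` only after the data has been written and
// the stream has been closed. Returns false on any failure.
bool publish_file(const fs::path& final_path, const std::string& contents)
{
    fs::path tmp = final_path;
    tmp += ".part";
    {
        std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
        if (!out) return false;
        out << contents;
        out.flush();
        if (!out) return false;
    }  // the stream is closed here, before the rename

    std::error_code ec;
    fs::rename(tmp, final_path, ec);
    return !ec;
}
```

A reader then only ever looks for the final name (for example "foo.dat") and ignores anything ending in ".part".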
You can try using inotify
http://en.wikipedia.org/wiki/Inotify
If you know that the file will be opened once, written and then closed, it would be possible for your app to wait for the IN_CLOSE_WRITE event.
However, if the other application behaves more like open, write, close, open, write, close, ..., then you'll need some other mechanism for determining when it has truly finished with the file.

C++ : Opening a file in non exclusive mode

I have to develop an application which parses a log file and sends specific data to a server. It has to run on both Linux and Windows.
The problem appears when I want to test the log rolling system (which appends .1 to the name of the current log file and creates a new one with the same name). On Windows (I haven't tested on Linux yet) I can't rename a file that I have opened with std::ifstream() (exclusive access?), even if I open it in "input mode" (ios::in).
Is there a cross-platform way to open file in a non-exclusive way?
Is there a way to open file in a non-exclusive way,
Yes, using Win32, passing the various FILE_SHARE_Xxxx flags to CreateFile.
is it cross platform?
No, it requires platform-specific code.
Due to annoying backwards compatibility concerns (DOS applications, being single-tasking, assume that nothing can delete a file out from under them, i.e. that they can fclose() and then fopen() without anything going amiss; Win16 preserved this assumption to make porting DOS applications easier, Win32 preserved this assumption to make porting Win16 applications easier, and it's awful), Windows defaults to opening files exclusively.
The underlying OS infrastructure supports deleting/renaming open files (although I believe it does have the restriction that memory-mapped files cannot be deleted, which I think isn't a restriction found on *nix), but the default opening semantics do not.
C++ has no notion of any of this; the C++ operating environment is much the same as the DOS operating environment--no other applications running concurrently, so no need to control file sharing.
It's not the reading operation that's requiring the exclusive mode, it's the rename, because this is essentially the same as moving the file to a new location.
I'm not sure but I don't think this can be done. Try copying the file instead, and later delete/replace the old file when it is no longer read.
Win32 filesystem semantics require that a file you rename not be open (in any mode) at the time you do the rename. You will need to close the file, rename it, and then create the new log file.
Unix filesystem semantics allow you to rename a file that's open because the filename is just a pointer to the inode.
If you are only reading from the file, I know it can be done with the Windows API function CreateFile. Just specify FILE_SHARE_DELETE | FILE_SHARE_READ | FILE_SHARE_WRITE as the input to dwShareMode.
Unfortunately this is not cross-platform. But there might be something similar for Linux.
See MSDN for more info on CreateFile.
EDIT: Just a quick note about Greg Hewgill's comment. I've just tested with the FILE_SHARE* flags (to be 100% sure), and it is possible to both delete and rename files on Windows if you open them read-only and specify the FILE_SHARE* parameters.
I'd make sure you don't keep files open. This leads to weird stuff if your app crashes for example.
What I'd do:
Abstract (reading / writing / rolling over to a new file) into one class, and arrange closing of the file when you want to roll over to a new one in that class. (this is the neatest way, and since you already have the roll-over code you're already halfway there.)
If you must have multiple read/write access points, need all features of fstreams and don't want to write that complete a wrapper then the only cross platform solution I can think of is to always close the file when you don't need it, and have the roll-over code try to acquire exclusive access to the file a few times when it needs to roll-over before giving up.