How to recursively monitor a directory besides using inotify - c++

I want to make a C++ application that will monitor file changes in its directory and its sub-directories. I know inotify can monitor changes in a single directory level, and that I need to manually add a watch for each sub-directory to detect changes inside it.
I need to know if there is any way other than inotify to recursively monitor changes in a directory on Linux.

To monitor a directory recursively you have to:
Create an inotify(7) instance with inotify_init(2) or inotify_init1(2).
Descend recursively through your directory, calling inotify_add_watch(2) for every node you want to be notified about (add the watch for a directory before scanning it, or you'll lose events; see below).
Wait for events to arrive on the inotify descriptor you received.
Take into account that directory creation may force you to rescan the new directory's contents: you get an event for a new subdirectory of a watched directory, but files can be created in it before you have had a chance to add a watch for it, so you will not be informed of those files' creation.
For this reason, consider putting the whole procedure in a subroutine and calling it each time a new directory is created. This way you may scan files for which you will also receive creation events, but otherwise you can lose events.
Also, you must be prepared to do a complete tree rescan in case you lose events (on a queue overflow).
And believe me, this is far more efficient than doing it the classic way, which can also miss short-lived files that are created and deleted between rescans.
The reason there is no recursive solution to this problem was pointed out above in a comment (the kernel would have to do the search for you, even if you are not interested in it).
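The steps above can be sketched roughly as follows. This is only a minimal illustration, not a complete implementation: the class name RecursiveWatcher and its members addTree, poll, overflowed and watchCount are hypothetical names chosen for this example, and the event mask is an assumption about which changes are of interest. It watches each directory before scanning it, adds watches for subdirectories as they appear, and records a queue overflow so the caller can decide to rescan the whole tree.

```cpp
#include <sys/inotify.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <filesystem>
#include <map>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper (not from the answer): one inotify instance plus a
// map from watch descriptor to the directory that descriptor covers.
class RecursiveWatcher {
public:
    explicit RecursiveWatcher(const fs::path& root) : fd_(inotify_init1(IN_CLOEXEC)) {
        if (fd_ < 0) throw std::system_error(errno, std::generic_category(), "inotify_init1");
        addTree(root);
    }
    ~RecursiveWatcher() { close(fd_); }
    RecursiveWatcher(const RecursiveWatcher&) = delete;
    RecursiveWatcher& operator=(const RecursiveWatcher&) = delete;

    // Watch `dir` first and scan it afterwards, so anything created during
    // the scan still produces an event.
    void addTree(const fs::path& dir) {
        const int wd = inotify_add_watch(fd_, dir.c_str(),
            IN_CREATE | IN_DELETE | IN_MODIFY | IN_MOVED_FROM | IN_MOVED_TO);
        if (wd < 0) return;  // e.g. permission denied, or the directory vanished
        dirs_[wd] = dir;
        std::error_code ec;
        for (fs::directory_iterator it(dir, ec), end; !ec && it != end; it.increment(ec)) {
            if (!it->is_symlink(ec) && it->is_directory(ec)) addTree(it->path());
        }
    }

    // Blocks until at least one event arrives and returns the affected paths.
    // Newly created subdirectories are watched and scanned before returning.
    std::vector<fs::path> poll() {
        alignas(inotify_event) char buf[16 * 1024];
        std::vector<fs::path> changed;
        const ssize_t len = read(fd_, buf, sizeof buf);
        for (ssize_t off = 0; off < len;) {
            const auto* ev = reinterpret_cast<const inotify_event*>(buf + off);
            off += static_cast<ssize_t>(sizeof(inotify_event) + ev->len);
            assert(off <= len);  // the kernel never returns a partial event
            if (ev->mask & IN_Q_OVERFLOW) { overflowed_ = true; continue; }
            if (ev->mask & IN_IGNORED) { dirs_.erase(ev->wd); continue; }
            const auto it = dirs_.find(ev->wd);
            if (it == dirs_.end() || ev->len == 0) continue;
            const char* name = ev->name;
            const fs::path path = it->second / name;
            if ((ev->mask & IN_ISDIR) && (ev->mask & (IN_CREATE | IN_MOVED_TO))) addTree(path);
            changed.push_back(path);
        }
        return changed;
    }

    // True once the kernel reported lost events; the caller should rescan.
    bool overflowed() const { return overflowed_; }
    std::size_t watchCount() const { return dirs_.size(); }

private:
    int fd_;
    bool overflowed_ = false;
    std::map<int, fs::path> dirs_;
};
```

A caller would construct RecursiveWatcher on the root directory and call poll() in a loop. Note that this sketch does not update the stored paths when a watched directory is renamed, and it does not report the files found while scanning a newly created directory; both would need handling in real code.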

Yes, there is the old classic way: simply check the directory (recursively) at intervals. Store a list of all files and directories and their modification dates. Then it's easy to see whether a file or directory is new, has been removed, or has otherwise been modified.
It is time consuming though, and you need to store data about every file and directory, so if you try to do it on the root directory you will use a lot of memory or storage.
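As a rough sketch of this polling approach, assuming C++17 std::filesystem is available: the names Snapshot, Changes, takeSnapshot and diff are hypothetical and exist only for this example. One snapshot maps every path under the root to its last modification time, and comparing two snapshots yields the added, removed and modified entries.

```cpp
#include <cassert>
#include <filesystem>
#include <map>
#include <vector>

namespace fs = std::filesystem;

// Path -> last modification time for everything below a root directory.
using Snapshot = std::map<fs::path, fs::file_time_type>;

struct Changes {
    std::vector<fs::path> added, removed, modified;
};

// Walk the tree once and record every entry's modification time.
// Entries that cannot be read (or vanish mid-scan) are skipped.
Snapshot takeSnapshot(const fs::path& root) {
    assert(fs::is_directory(root));  // precondition for this sketch
    Snapshot snap;
    std::error_code ec;
    for (fs::recursive_directory_iterator
             it(root, fs::directory_options::skip_permission_denied, ec), end;
         !ec && it != end; it.increment(ec)) {
        const auto mtime = fs::last_write_time(it->path(), ec);
        if (!ec) snap.emplace(it->path(), mtime);
    }
    return snap;
}

// Compare two snapshots taken at different times.
Changes diff(const Snapshot& before, const Snapshot& after) {
    Changes changes;
    for (const auto& [path, mtime] : after) {
        const auto old = before.find(path);
        if (old == before.end()) changes.added.push_back(path);
        else if (old->second != mtime) changes.modified.push_back(path);
    }
    for (const auto& [path, mtime] : before) {
        if (after.count(path) == 0) changes.removed.push_back(path);
    }
    return changes;
}
```

A monitoring loop would keep the previous snapshot, sleep for the chosen interval, take a new snapshot and pass both to diff. The memory cost mentioned above is visible here: the map holds one entry per file and directory.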

Probably you want to mix inotify and dnotify to achieve that.

Related

Qt5 - Detecting when new files are added to a directory and retrieving their path

My problem is relatively simple. I have an application where I need to monitor a particular folder (the downloads folder, in my case) for added files. Whenever a file is added to that folder, I want to move that file to a completely different directory. I have been looking at QFileSystemWatcher; however, none of the signals it provides seems to be ideal for my situation.
Here is my current code:
connect(&m_fileWatcher, &QFileSystemWatcher::directoryChanged, this, &FileHandler::directoryChanged);
void FileHandler::directoryChanged(const QString &dir)
{
    // dir is the path of the watched directory whose contents changed
    qDebug() << "File changed...." << dir;
    // Some other logic
}
This signal only gives me a string to work with which is the directory that witnessed a change. I don't know what kind of change took place (add, rename, or delete), and I also have no idea which file has changed.
I understand that I could store all of the files in the directory in some sort of data structure and do some other logic when this signal is emitted, but that doesn't seem very performant in this case since I'm dealing with the user's downloads folder (which could contain thousands of files).
How can I make this work? Should I refer to a different helper class provided by Qt, or is there some other way I can do this while utilizing QFileSystemWatcher? I'm just looking for ideas.
Thank you.
You’ve hit the limit of what the underlying OS provides: notification of change to the content of a directory.
If you wish to identify the file that was:
deleted: you must have a prior list of files available to compare against
added: same as deleted
modified: loop through the directory for the file with the most recent last-modified date
I don't know whether you wish to use a specific filename container class from Qt or just a std::vector<std::filesystem::path> or the like for your cached folder contents.
QFileSystemWatcher only notifies you that a change happened, but not the details of what was changed. So you will have to resort to OS-specific APIs to get the details. For instance, on Windows, you can use ReadDirectoryChangesW() instead of QFileSystemWatcher.

Is it safe enough to store a file in the TEMP directory

Is it safe enough to store a file in the %TEMP% directory via GetTempPath, GetTempFileName and CreateFile for more than two hours? Are there any guarantees that this file won't be deleted earlier?
Thanks in advance.
A file you create in the TEMP directory must be created with CreateFile's FILE_FLAG_DELETE_ON_CLOSE option. This ensures that the file will always be cleaned up and you cannot spray garbage files, even if your program crashes before it has a chance to delete the file itself.
This option then also inevitably forces you to do the Right Thing: keeping the file open while you are using it. That in turn prevents anybody from deleting the file, even if they use a sledgehammer.
Lots of programs don't follow this advice, and a user's TEMP directory tends to be a big old mess, forcing the user to clean it up manually once in a while. They will typically use the built-in Windows "Disk Cleanup" applet, which is exactly the kind of scenario where you will lose the file if you don't follow this advice. Best to use %AppData% instead.
There are no guarantees. This folder is usually not cleared unless the user starts a cleanup.
But anyone can delete files here, and it is wise to do that on a regular basis.
To prevent the file from being deleted, you can keep a handle open (assuming the application is running the whole time) and do not specify FILE_SHARE_DELETE (and, if applicable, neither FILE_SHARE_WRITE).
Alternative:
Use a path in %APPDATA% or %PROGRAMDATA% that you clear yourself regularly, or let the user specify a path.
In addition, you could register a scheduled task to clean the folder regularly.
If you do not want another process to be able to delete your files, just keep them open with a share mode of FILE_SHARE_READ | FILE_SHARE_WRITE. That way any attempt to delete them will fail, but any other process will still be able to read or write them.
BTW: this has nothing to do with the files living in the %TEMP% folder.
If you cannot have a process keep them open all the time, you must rely on other processes (and other users) on your system not doing anything ...

Monitoring a directory for subdirectory complete creation and then launching another process, c++

So I have an idea that I would like to implement and it's as follows:
Monitor a specific directory.
Once a sub-directory is not only created but complete (i.e. a folder that was being downloaded or copied has just finished), the code calls a procedure or a scheme to compress the folder.
I have a sort of an idea of implementing this using ReadDirectoryChangesW. However my question is how to wait for changes, but when a change happens, it waits for its completeness. The second question would be how to identify the subfolder that's completed so I can call the compression scheme and supply it as an argument.
Thank you.
Since it's labelled "winapi", just set the NTFS compression attribute on the subdirectory as soon as you see it. Any new files in that directory will be automatically compressed as they're created.

How to determine when files are done copying for further processing?

Alright so to start this is strictly for Windows and I'd prefer to use C++ over .NET but I'm not opposed to boost::filesystem although if it can be avoided in favor of straight Windows API I'd prefer that.
Now the scenario is an application on another machine I can't change is going to create files in a particular directory on the machine that I need to make backups of and do some extra processing. Currently I've made a little application which will sit and listen for change notifications in a target directory using FindFirstChangeNotification and FindNextChangeNotification windows APIs.
The problem is that while I can get notified when new files are created in the directory, modified, size changes, etc it only notifies once and does not specifically tell me which files. I've looked at ReadDirectoryChangesW as well but it's the same story there except that I can get slightly more specific information.
Now, I can scan the directory and try to acquire locks or open the files to determine what specifically changed since the last notification and whether the files are available for further use. But when a large file is being copied, I've found this isn't good enough: the file won't be ready to be manipulated, and I won't get any other notifications after the first, so there is no way to tell when it has actually finished copying unless, after the first notification, I continually try to acquire a lock until it succeeds.
The only other thing I can think of that would be less hackish would be to have some kind of end token file but since I don't have control over the application creating the files in the first place I don't see how I'd go about doing that and it's still not ideal.
Any suggestions?
This is a fairly common problem and one that doesn't have an easy answer. Acquiring locks is one of the best options when you cannot change the thing at the remote end. Another I have seen is to watch the file at intervals until the size doesn't change for an interval or two.
Other strategies include writing a no-byte file as a trigger when the main file is complete and writing to a temp directory then moving the complete file to the real destination. But to be reliable, it must be the sender who controls this. As the receiver, you are constrained to watching the directory and waiting for the file to settle.
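A minimal sketch of the "wait for the file to settle" idea, written with portable C++17 rather than the Windows API: the names waitUntilStable and waitUntilSettled, the default of two stable checks and the upper bound on polls are all assumptions made for this example. A stable size is only a heuristic, since a sender that pauses for longer than the polling interval will look finished.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <filesystem>
#include <optional>
#include <thread>

namespace fs = std::filesystem;

// Polls `probe` until it has returned the same size `stableChecks` times in
// a row. Returns false if the probe fails or `maxChecks` polls pass first.
template <class Probe>
bool waitUntilStable(Probe probe, std::chrono::milliseconds interval,
                     int stableChecks = 2, int maxChecks = 1000) {
    assert(stableChecks > 0 && maxChecks > 0);
    std::optional<std::uintmax_t> last = probe();
    if (!last) return false;
    int stable = 0;
    for (int i = 0; i < maxChecks; ++i) {
        std::this_thread::sleep_for(interval);
        const std::optional<std::uintmax_t> now = probe();
        if (!now) return false;              // file vanished or became unreadable
        stable = (*now == *last) ? stable + 1 : 0;
        last = now;
        if (stable >= stableChecks) return true;
    }
    return false;                            // still changing after maxChecks polls
}

// Convenience wrapper that probes the size of a file on disk.
bool waitUntilSettled(const fs::path& file, std::chrono::milliseconds interval) {
    return waitUntilStable(
        [&file]() -> std::optional<std::uintmax_t> {
            std::error_code ec;
            const std::uintmax_t size = fs::file_size(file, ec);
            if (ec) return std::nullopt;
            return size;
        },
        interval);
}
```

After a change notification for a file, the receiver could call waitUntilSettled(path, std::chrono::seconds(1)) and then, as suggested above, still try to open or lock the file before processing it.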
It looks like ReadDirectoryChangesW is going to be your best bet. For each file copy operation, you should receive FILE_ACTION_ADDED followed by a bunch of FILE_ACTION_MODIFIED notifications. By the last FILE_ACTION_MODIFIED notification, the file should no longer be locked by the copying process. So, if you try to acquire a lock after each FILE_ACTION_MODIFIED, it should fail until the copy completes. It's not a particularly elegant solution, but there don't seem to be any notifications available for when a file copy completes.
You can process the data once the file is closed, right? So the task is to track when the file is closed. This can be done using file system filter driver. You can write your own or you can use our CallbackFilter product.

Win32 C++ ReadDirectoryChangesW "creation" and "modification" of file difference detect?

Here is the problem: I monitor a directory using Win32 API ReadDirectoryChangesW function. And I need to distinguish between newly created files and modified files. But there are problems... as always :(
Cases:
I monitor the directory for new/modified files (FILE_NOTIFY_CHANGE_FILE_NAME | FILE_NOTIFY_CHANGE_SIZE). Problem: after file creation, both a new-file event and a modify-file event are triggered, but I need only one. How can I avoid that? When a file is modified I get what I want :).
I monitor the directory only for new files (FILE_NOTIFY_CHANGE_FILE_NAME) - NO PROBLEM.
I monitor the directory only for modified files (FILE_NOTIFY_CHANGE_SIZE). Problem: when a new file is created, a modify action is fired along with the file creation event. How can I avoid that?
Of course, I implemented some workarounds. But I want to know if there is any elegant way of handling the problems I described.
You should be catching FILE_NOTIFY_CHANGE_LAST_WRITE, not FILE_NOTIFY_CHANGE_SIZE, for a modified file. Files may be modified without the size changing.
You should also keep a queue of changes and the time they happened and only process the queue after there have been no changes in the past 1-2 seconds. Some applications can do very strange things when creating or modifying files, and you'll most likely want to special case for popular applications if you plan on using this code in the wild.
ReadDirectoryChangesW isn't one of the friendliest winapi functions. You probably can't get around receiving two events on file creation; I'm not completely sure whether you'll get an extra modify for FILE_NOTIFY_CHANGE_LAST_WRITE on creation, but I think you probably will. Using the queue approach will allow you to easily throw out the extra event if it has the same time stamp as the creation event.
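The queue approach from these answers might look something like the sketch below. It is platform-neutral and leaves out the ReadDirectoryChangesW call itself; ChangeQueue, Kind, push and drain are hypothetical names for this example. Two choices here are assumptions rather than something the answers specify: the quiet period is tracked per path instead of for the whole queue, and a modification that arrives while a creation is still pending is folded into the creation instead of being compared by time stamp.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>
#include <vector>

enum class Kind { Created, Modified };

// Holds change events until their path has been quiet for a given period.
class ChangeQueue {
public:
    using Clock = std::chrono::steady_clock;

    explicit ChangeQueue(Clock::duration quiet) : quiet_(quiet) {
        assert(quiet >= Clock::duration::zero());
    }

    // Record an event seen at time `now`. If the path is already pending,
    // keep its first kind (so Created + Modified stays Created) and only
    // refresh the time of the last activity.
    void push(const std::string& path, Kind kind, Clock::time_point now) {
        auto [it, inserted] = pending_.try_emplace(path, Entry{kind, now});
        if (!inserted) it->second.last = now;
    }

    // Remove and return every path with no activity for at least `quiet_`.
    std::vector<std::pair<std::string, Kind>> drain(Clock::time_point now) {
        std::vector<std::pair<std::string, Kind>> ready;
        for (auto it = pending_.begin(); it != pending_.end();) {
            if (now - it->second.last >= quiet_) {
                ready.emplace_back(it->first, it->second.kind);
                it = pending_.erase(it);
            } else {
                ++it;
            }
        }
        return ready;
    }

private:
    struct Entry {
        Kind kind;
        Clock::time_point last;
    };
    Clock::duration quiet_;
    std::map<std::string, Entry> pending_;
};
```

The notification handler would call push with Clock::now() for each FILE_ACTION_ADDED or FILE_ACTION_MODIFIED it receives, and a timer would call drain periodically and process whatever comes back, so each file is reported once, as either created or modified.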