How to see if a subfile of a directory has changed - C++

In Windows, is there an easy way to tell if a folder has a subfile that has changed?
I verified, and the last modified date on the folder does not get updated when a subfile changes.
Is there a registry entry I can set that will modify this behavior?
If it matters, I am using an NTFS volume.
I would ultimately like to have this ability from a C++ program.
Scanning an entire directory recursively will not work for me because the folder is much too large.
Update: I really need a way to do this without a process running while the change occurs. So installing a file system watcher is not optimal for me.
Update 2: The archive bit will not work either, because it has the same problem as the last-modified date: the file's archive bit will be set, but the folders' will not.

This article should help. Basically, you create one or more notification objects, such as:
HANDLE dwChangeHandles[2];

dwChangeHandles[0] = FindFirstChangeNotification(
    lpDir,                         // directory to watch
    FALSE,                         // do not watch subtree
    FILE_NOTIFY_CHANGE_FILE_NAME); // watch file name changes
if (dwChangeHandles[0] == INVALID_HANDLE_VALUE)
{
    printf("\n ERROR: FindFirstChangeNotification function failed.\n");
    ExitProcess(GetLastError());
}

// Watch the subtree for directory creation and deletion.
dwChangeHandles[1] = FindFirstChangeNotification(
    lpDrive,                      // directory to watch
    TRUE,                         // watch the subtree
    FILE_NOTIFY_CHANGE_DIR_NAME); // watch dir name changes
if (dwChangeHandles[1] == INVALID_HANDLE_VALUE)
{
    printf("\n ERROR: FindFirstChangeNotification function failed.\n");
    ExitProcess(GetLastError());
}
and then you wait for a notification:
while (TRUE)
{
    // Wait for notification.
    printf("\nWaiting for notification...\n");
    DWORD dwWaitStatus = WaitForMultipleObjects(2, dwChangeHandles,
                                                FALSE, INFINITE);
    switch (dwWaitStatus)
    {
    case WAIT_OBJECT_0:
        // A file was created, renamed, or deleted in the directory.
        // Restart the notification.
        if (FindNextChangeNotification(dwChangeHandles[0]) == FALSE)
        {
            printf("\n ERROR: FindNextChangeNotification function failed.\n");
            ExitProcess(GetLastError());
        }
        break;

    case WAIT_OBJECT_0 + 1:
        // A directory was created or deleted in the subtree.
        // Restart the notification.
        if (FindNextChangeNotification(dwChangeHandles[1]) == FALSE)
        {
            printf("\n ERROR: FindNextChangeNotification function failed.\n");
            ExitProcess(GetLastError());
        }
        break;

    case WAIT_TIMEOUT:
        // A time-out occurred. This would happen if some value other
        // than INFINITE is used in the Wait call and no changes occur.
        // In a single-threaded environment, you might not want an
        // INFINITE wait.
        printf("\nNo changes in the time-out period.\n");
        break;

    default:
        printf("\n ERROR: Unhandled dwWaitStatus.\n");
        ExitProcess(GetLastError());
        break;
    }
}

This is perhaps overkill, but the IFS Kit from Microsoft or the FDDK from OSR might be an alternative: create your own file system filter driver that monitors all changes to the file system.

ReadDirectoryChangesW. There is some excellent sample code in this CodeProject article.
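For reference, a minimal sketch of the synchronous form of ReadDirectoryChangesW (the directory path and buffer size are arbitrary example values, not from the article):

#include <windows.h>
#include <cstdio>

int main()
{
    // FILE_FLAG_BACKUP_SEMANTICS is required to obtain a directory handle.
    HANDLE hDir = CreateFileW(L"C:\\watched", FILE_LIST_DIRECTORY,
                              FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                              NULL, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
    if (hDir == INVALID_HANDLE_VALUE)
        return 1;

    DWORD buffer[16 * 1024]; // 64 KB, DWORD-aligned as the API requires
    DWORD bytes;
    // Synchronous form: each call blocks until at least one change is recorded.
    while (ReadDirectoryChangesW(hDir, buffer, sizeof(buffer),
                                 TRUE, // watch the subtree
                                 FILE_NOTIFY_CHANGE_FILE_NAME | FILE_NOTIFY_CHANGE_LAST_WRITE,
                                 &bytes, NULL, NULL))
    {
        FILE_NOTIFY_INFORMATION* fni = (FILE_NOTIFY_INFORMATION*)buffer;
        for (;;)
        {
            // FileName is not null-terminated; FileNameLength is in bytes.
            wprintf(L"action %lu: %.*s\n", fni->Action,
                    (int)(fni->FileNameLength / sizeof(WCHAR)), fni->FileName);
            if (fni->NextEntryOffset == 0)
                break;
            fni = (FILE_NOTIFY_INFORMATION*)((BYTE*)fni + fni->NextEntryOffset);
        }
    }
    CloseHandle(hDir);
    return 0;
}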

If you can't run a process when the change occurs, then there's not much you can do except scan the filesystem, and check the modification date/time. This requires you to store each file's last date/time, though, and compare.
You can speed this up by using the archive bit (though it may mess up your backup software, so proceed carefully).
An archive bit is a file attribute present in many computer file systems, notably FAT, FAT32, and NTFS. The purpose of an archive bit is to track incremental changes to files for the purpose of backup, also called archiving.

As the archive bit is a binary bit, it is either 1 or 0, or in this case more frequently called set (1) and clear (0). The operating system sets the archive bit any time a file is created, moved, renamed, or otherwise modified in any way. The archive bit therefore represents one of two states: "changed" and "not changed" since the last backup.

Archive bits are not affected by simply reading a file. When a file is copied, the original file's archive bit is unaffected; however, the copy's archive bit will be set at the time the copy is made.
So the process would be:

1. Clear the archive bit on all the files.
2. Let the file system change over time.
3. Scan all the files; any with the archive bit set have changed.
This will eliminate the need for your program to keep state, and since you're only going over the directory entries (where the bit is stored) and they are clustered, it should be very, very fast.
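For concreteness, here is a minimal sketch of such a scan, assuming a hypothetical ProcessChangedFile() handler (all names here are mine, not from the original answer):

#include <windows.h>
#include <cstdio>
#include <cwchar>
#include <string>

// Hypothetical handler for a file whose archive bit was found set.
static void ProcessChangedFile(const std::wstring& path)
{
    wprintf(L"changed: %s\n", path.c_str());
}

static void ScanArchiveBits(const std::wstring& dir)
{
    WIN32_FIND_DATAW fd;
    HANDLE h = FindFirstFileW((dir + L"\\*").c_str(), &fd);
    if (h == INVALID_HANDLE_VALUE)
        return;
    do
    {
        if (wcscmp(fd.cFileName, L".") == 0 || wcscmp(fd.cFileName, L"..") == 0)
            continue;
        std::wstring path = dir + L"\\" + fd.cFileName;
        if (fd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
        {
            ScanArchiveBits(path); // recurse; only directory entries are read
        }
        else if (fd.dwFileAttributes & FILE_ATTRIBUTE_ARCHIVE)
        {
            ProcessChangedFile(path);
            // Clear the bit so the next scan only reports new changes.
            SetFileAttributesW(path.c_str(),
                               fd.dwFileAttributes & ~FILE_ATTRIBUTE_ARCHIVE);
        }
    } while (FindNextFileW(h, &fd));
    FindClose(h);
}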
If you can run a process during the changes, however, then you'll want to look at the FileSystemWatcher class. Here's an example of how you might use it.
It also exists in .NET (for future searchers of this type of problem)
Perhaps you can leave a process running on the machine watching for changes and creating a file for you to read later.
-Adam

Perhaps you can use the NTFS 5 Change Journal with DeviceIoControl as explained here
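For the curious, a rough sketch of querying and reading the change journal (the volume C: is just an example; opening a volume handle typically requires administrator rights, and error handling is minimal):

#include <windows.h>
#include <winioctl.h>
#include <cstdio>

int main()
{
    // Open the volume itself, not a directory on it.
    HANDLE hVol = CreateFileW(L"\\\\.\\C:", GENERIC_READ,
                              FILE_SHARE_READ | FILE_SHARE_WRITE,
                              NULL, OPEN_EXISTING, 0, NULL);
    if (hVol == INVALID_HANDLE_VALUE)
        return 1;

    USN_JOURNAL_DATA jd = {};
    DWORD cb = 0;
    if (!DeviceIoControl(hVol, FSCTL_QUERY_USN_JOURNAL, NULL, 0,
                         &jd, sizeof(jd), &cb, NULL))
        return 1; // the journal may not be active on this volume

    READ_USN_JOURNAL_DATA rd = {};
    rd.StartUsn = 0;            // in practice: the USN you saved after the last scan
    rd.ReasonMask = 0xFFFFFFFF; // all change reasons
    rd.UsnJournalID = jd.UsnJournalID;

    DWORDLONG raw[8 * 1024];    // 64 KB, 8-byte aligned for USN_RECORD
    if (DeviceIoControl(hVol, FSCTL_READ_USN_JOURNAL, &rd, sizeof(rd),
                        raw, sizeof(raw), &cb, NULL))
    {
        // The output starts with the next USN, followed by the records.
        BYTE* p = (BYTE*)raw + sizeof(USN);
        while (p < (BYTE*)raw + cb)
        {
            USN_RECORD* rec = (USN_RECORD*)p;
            wprintf(L"%.*s (reason 0x%08lx)\n",
                    (int)(rec->FileNameLength / sizeof(WCHAR)),
                    (WCHAR*)((BYTE*)rec + rec->FileNameOffset), rec->Reason);
            p += rec->RecordLength;
        }
    }
    CloseHandle(hVol);
    return 0;
}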

If you are not opposed to using .NET, the FileSystemWatcher class will handle this for you fairly easily.

From the double post someone mentioned: WMI Event Sink
Still looking for a better answer though.

Nothing easy. If you have a running app, you can use the Win32 file change notification APIs (FindFirstChangeNotification) as suggested in the other answers. Warning: circa 2000, the Trend Micro real-time virus scanner would group changes together, making it necessary to use very large buffers when requesting the file system change lists.
If you don't have a running app, you can turn on NTFS journaling and scan the journal for changes (http://msdn.microsoft.com/en-us/library/aa363798(VS.85).aspx), but this can be slower than scanning the whole directory when the number of changes is larger than the number of files.

Related

Could DropBox interfere with DeleteFile()/rename()

I had the following code, which got executed every two minutes all day long:
int successfully_deleted = DeleteFile(dest_filename);
if (!successfully_deleted)
{
    // this never happens
}
rename(source_filename, dest_filename);
Once every several hours the rename() would fail with errno=13 (EACCES). The files involved were all sitting in a Dropbox directory, and I had a hunch that Dropbox could be the cause. I figured it might just be possible that DeleteFile() would return a non-zero successfully_deleted while Dropbox was actually still busy doing some work related to the deletion that prevented rename() from succeeding. What I did next was to change rename() to my_rename(), which attempts a rename() and, upon any failure, Sleep()s for one second and tries a second time. Sure enough, that has worked perfectly ever since. What's more, I get a diagnostic message reporting a first-attempt failure every several hours; it has never failed on the second attempt.
So you could say that the problem is entirely solved... but I would like to understand what might be going on, so as to better defend myself against any related Dropbox issues in the future...
Really I would like to have a new super_delete() function which does not return until the file is properly deleted and finished with in all respects.
Under Windows, a request to delete a file never actually deletes the file right away. It just marks the file's FCB (File Control Block) with a special flag (FCB_STATE_DELETE_ON_CLOSE); the real deletion happens only when the last handle to the file is closed.
The DeleteFile function marks a file for deletion on close. Therefore, the file deletion does not occur until the last handle to the file is closed. Subsequent calls to CreateFile to open the file fail with ERROR_ACCESS_DENIED.
Also, if a section (memory-mapped file) is open on the file, the file cannot even be marked for deletion: the API call fails with STATUS_CANNOT_DELETE. So in general it is not always possible to delete a file.
If other open handles to the file exist (but no section!), then beginning with Windows 10 RS1 there is new functionality for deletion: FileDispositionInformationEx with FILE_DISPOSITION_POSIX_SEMANTICS. In this case:
Normally a file marked for deletion is not actually deleted until all open handles for the file have been closed and the link count for the file is zero. When marking a file for deletion using FILE_DISPOSITION_POSIX_SEMANTICS, the link gets removed from the visible namespace as soon as the POSIX delete handle has been closed, but the file's data streams remain accessible by other existing handles until the last handle has been closed.
#include <windows.h>

// FILE_SHARE_VALID_FLAGS comes from the driver kit headers; in user mode
// it is simply all three share flags combined.
#ifndef FILE_SHARE_VALID_FLAGS
#define FILE_SHARE_VALID_FLAGS (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE)
#endif

ULONG DeletePosix(PCWSTR lpFileName)
{
    HANDLE hFile = CreateFileW(lpFileName, DELETE, FILE_SHARE_VALID_FLAGS, 0, OPEN_EXISTING,
                               FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OPEN_REPARSE_POINT, 0);
    if (hFile == INVALID_HANDLE_VALUE)
    {
        return GetLastError();
    }
    static FILE_DISPOSITION_INFO_EX fdi = { FILE_DISPOSITION_DELETE | FILE_DISPOSITION_POSIX_SEMANTICS };
    ULONG dwError = SetFileInformationByHandle(hFile, FileDispositionInfoEx, &fdi, sizeof(fdi))
        ? NOERROR : GetLastError();
    // win10 rs1: file removed from parent folder here
    CloseHandle(hFile);
    return dwError;
}
Update
Sorry, I didn't read the question correctly the first time. I thought DeleteFile returned error 13.
Now I understand that DeleteFile succeeds but rename fails immediately after.
It could just be a sync issue with the filesystem. After calling DeleteFile, the file will be deleted only when the OS commits the changes to the filesystem, and that may not happen immediately.
If you need to perform multiple operations on the same path, you should have a look at transactions: https://learn.microsoft.com/it-it/windows/desktop/api/winbase/nf-winbase-deletefiletransacteda.
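For illustration only, a minimal sketch of a transacted delete-then-rename using KTM (function and parameter names are mine; note that Microsoft has since deprecated transactional NTFS):

#include <windows.h>
#include <ktmw32.h>
#pragma comment(lib, "KtmW32.lib")

// Either both operations happen or neither does.
bool DeleteAndRenameTransacted(LPCWSTR toDelete, LPCWSTR renameFrom, LPCWSTR renameTo)
{
    HANDLE hTx = CreateTransaction(NULL, NULL, 0, 0, 0, 0, NULL);
    if (hTx == INVALID_HANDLE_VALUE)
        return false;
    bool ok = DeleteFileTransactedW(toDelete, hTx)
           && MoveFileTransactedW(renameFrom, renameTo, NULL, NULL, 0, hTx)
           && CommitTransaction(hTx);
    CloseHandle(hTx); // rolls the transaction back if it was not committed
    return ok;
}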
-- OLD ANSWER --
That is correct. If another application holds handles to that file, DeleteFile will fail.
Citing MSDN docs https://learn.microsoft.com/en-us/windows/desktop/api/winbase/nf-winbase-deletefile :
The DeleteFile function fails if an application attempts to delete a file that has other handles open for normal I/O or as a memory-mapped file (FILE_SHARE_DELETE must have been specified when other handles were opened).
This applies to Dropbox, the antivirus, or, in general, any other application that may open those files.
Dropbox may open the file at any moment to compute its hash (to look for changes). The same goes for the antivirus.

rollback function or design pattern in C++

Right now I am facing a new problem that I can't figure out how to fix. I have two files: one is a video file and the other is a thumbnail, and they have the same name. I want to rename both files using C++. I am using the rename function and it works. This is what I've written:
if (rename(oldVideoPath, newVideoPath) == 0)
{
    if (rename(oldThumbPath, newThumbPath) != 0)
    {
        printf("Fail rename\n");
    }
}
The problem occurs when the video file is renamed successfully but, for some reason, the thumbnail can't be renamed. When this happens, I would like to roll back the renaming of the video file, because the video file name and the thumbnail file name should be the same in my program. What I want is to rename only after both files are okay to rename. Please suggest any design pattern, rollback-like function, or third-party software that could help.
There is no absolutely foolproof way to do this.
Fundamental rule of disk I/O: The filesystem can change at any time. You can't check whether a rename would succeed; your answer is already wrong. You can't be certain that undoing the rename will succeed; somebody else might have taken the name while you briefly weren't using it.
On systems that support hard links, you can use them to get about 90% of the way there, assuming you're not moving between filesystems. Suppose you're renaming A to B and C to D. Then do these things:

1. Create hard link B which links to A. This is written as link("A", "B") in C, using the Unix link(2) system call. Windows users should call CreateHardLink() instead.
2. If (1) succeeded, create hard link D which links to C. Otherwise, return failure now.
3. If (2) succeeded, delete A and C and return success. Otherwise, delete B and return failure. If the deletions fail, there is no obvious means of recovery. In practice, you can probably ignore failed deletions, assuming the reason for failure was "file not found" or equivalent for your platform.
This is still vulnerable to race conditions if someone deletes one of the files out from under you at the wrong time, but that is arguably not an issue since it is largely equivalent to the rename failing (or succeeding) and then the person deleting the file afterwards.
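For concreteness, a minimal Win32 sketch of the three steps above (the function name and error-handling policy are mine):

#include <windows.h>

// Rename A -> B and C -> D using hard links, as described above.
bool RenamePairViaHardLinks(LPCTSTR a, LPCTSTR b, LPCTSTR c, LPCTSTR d)
{
    // Step 1: create hard link B pointing at A.
    if (!CreateHardLink(b, a, NULL))
        return false;                  // nothing to undo yet

    // Step 2: create hard link D pointing at C.
    if (!CreateHardLink(d, c, NULL))
    {
        DeleteFile(b);                 // undo step 1; a failure here has no obvious recovery
        return false;
    }

    // Step 3: drop the old names.
    DeleteFile(a);
    DeleteFile(c);
    return true;
}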
Technically, you should also be opening the containing directory (in O_RDONLY mode) and fsync(2)'ing it after each operation, at least under Unix. If moving between directories, that's both the source and the destination directories. In practice, nobody does this, particularly since it will lead to degraded performance under ext3. Linus takes the position that the filesystem ought to DTRT without this call, but it is formally required under POSIX. As for Windows, I've been unable to find any authoritative reference on this issue on MSDN or elsewhere. So far as I'm aware, Windows does not provide an API for synchronizing directory entries (you can't open() a directory, so you can't get a file descriptor suitable to pass to fsync()).
Nitpick: To some extent, this sort of thing can be done perfectly on transactional filesystems, but just about the only one in common use right now is NTFS, and Microsoft specifically tells developers not to use that feature. If/when btrfs hits stable, transactions might become genuinely useful.
On the Windows platform, starting with Vista, you can use code such as the following.
#include "KtmW32.h"
bool RenameFileTransact( LPCTSTR lpctszOldVideoFile, LPCTSTR lpctszNewVideoFile, LPCTSTR lpctszOldThumbnailFile, LPCTSTR lpctszNewThumbnailFile )
{
bool bReturn = false;
HANDLE hRnameTransaction = CreateTransaction(NULL, NULL, 0, 0, 0, 0, NULL);
if (MoveFileTransacted(lpctszOldVideoFile, lpctszNewVideoFile, NULL, NULL, 0, hRnameTransaction) &&
MoveFileTransacted(lpctszOldThumbnailFile, lpctszNewThumbnailFile, NULL, NULL, 0, hRnameTransaction))
{
if ( CommitTransaction(hRnameTransaction))
{
bReturn = true;
}
}
CloseHandle( hRnameTransaction );
return bReturn;
}
But as @Kevin pointed out above, Microsoft discourages the use of this feature.

Close shared files programmatically

The company I'm working with has a program written in ye olde vb6, which is updated pretty frequently, and most clients run the executable from a mapped network drive. This actually has surprisingly few issues, the biggest of which is automatic updates. Currently the updater program (written in c++) renames the existing exe, then downloads and places the new version into the old version's place. This generally works fine, but in some environments it simply fails.
The solution is running this command from Microsoft:
for /f "skip=4 tokens=1" %a in ('net files') do net files %a /close
This command closes all network files that are shared (well... most) and then the updater can replace the exe.
In C++ I can use the system() function to run that command, or I could redirect the output of net files and iterate through the results, looking for the particular file in question, and then run the net file /close command to close it. But it would be much, much nicer if there were WinAPI functions with similar capabilities, for better reliability and future safety.
Is there any way for me to programmatically find all network shared files and close relevant ones?
You can programmatically do what net file /close does. Just include lmshare.h and link to Netapi32.dll. You have two functions to use: NetFileEnum to enumerate all open network files (on a given computer) and NetFileClose to close them.
Quick (it assumes the program is running on the same server and there are not too many open connections; see the last paragraph) and dirty (no error checking) example:
#include <windows.h>
#include <lm.h>
#pragma comment(lib, "Netapi32.lib")

FILE_INFO_2* pFiles = NULL;
DWORD nRead = 0, nTotal = 0;
NetFileEnum(
    NULL,                            // servername, NULL means localhost
    (LMSTR)L"c:\\directory\\path",   // basepath, directory where VB6 program is (a wide string)
    NULL,                            // username, searches for all users
    2,                               // level, we just need the resource ID
    (LPBYTE*)&pFiles,                // bufptr, need to use a double pointer to get the buffer
    MAX_PREFERRED_LENGTH,            // prefmaxlen, collect as much as possible
    &nRead,                          // entriesread, number of entries stored in pFiles
    &nTotal,                         // totalentries, ignore this
    NULL);                           // resume_handle, ignore this

for (DWORD i = 0; i < nRead; ++i)
    NetFileClose(NULL, pFiles[i].fi2_id);

NetApiBufferFree(pFiles);
Refer to MSDN for details about NetFileEnum and NetFileClose. Note that NetFileEnum may return ERROR_MORE_DATA if more data is available.
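For completeness, a sketch of how the resume handle can be used to drain the enumeration when ERROR_MORE_DATA is returned (names are mine; error handling is minimal):

#include <windows.h>
#include <lm.h>
#pragma comment(lib, "Netapi32.lib")

void CloseAllOpenFilesUnder(LMSTR basePath)
{
    NET_API_STATUS status;
    DWORD_PTR resume = 0;
    do
    {
        FILE_INFO_2* pFiles = NULL;
        DWORD nRead = 0, nTotal = 0;
        status = NetFileEnum(NULL, basePath, NULL, 2, (LPBYTE*)&pFiles,
                             MAX_PREFERRED_LENGTH, &nRead, &nTotal, &resume);
        if (status == NERR_Success || status == ERROR_MORE_DATA)
        {
            for (DWORD i = 0; i < nRead; ++i)
                NetFileClose(NULL, pFiles[i].fi2_id);
        }
        if (pFiles)
            NetApiBufferFree(pFiles);
    } while (status == ERROR_MORE_DATA); // loop until the enumeration is drained
}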

inotify does not raise DELETE_SELF if an open file-fd exists

I am trying to monitor a directory using inotify, and I am registering for ALL the events. Now, I have a requirement in my project to track any MOVE_SELF operations performed on the directory, so that I can detect which new location the monitored directory has moved to. To achieve this I am storing a reference to the open file descriptor (int fd) of the monitored directory, and when I get a MOVE_SELF, I try to get the new path using:
// code to store a reference to the file descriptor of the monitored directory
fd = open(watchPath.c_str(), O_RDONLY);

// code to learn the new location of the moved directory
char fdpath[4096];
char path[4096];
sprintf(fdpath, "/proc/self/fd/%d", fd);
ssize_t sz = readlink(fdpath, path, sizeof(path) - 1); // path will contain the new location after the move happens
if (sz >= 0)
    path[sz] = '\0'; // readlink does not null-terminate
But the side effect of this is, in case I delete the directory, I do not get DELETE_SELF event, because there is still an open file descriptor that I am holding. Could anyone suggest me on how to get around this issue?
Thanks,
-Sandeep
In case someone stumbles into this issue: this is definitely an expected behavior. Inotify does not monitor "files", it monitors "file objects" (aka inodes). An inode does not get removed by kernel until all open file descriptors, pointing to it, are closed.
This is also why IN_DELETE/IN_DELETE_SELF does not get triggered if you remove one of several hard links to file (because hard links share the same inode).
You can partially work around the hard link issue by subscribing to IN_ATTRIB event: it is triggered when the reference count of inode changes (e.g. when one of hard links is deleted), so you can use it to check if the file still exist at the old path.
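A minimal sketch of that subscription (Linux; the watched path is a made-up example):

#include <sys/inotify.h>
#include <unistd.h>
#include <cstdio>

int main()
{
    int fd = inotify_init1(0);
    if (fd < 0) { perror("inotify_init1"); return 1; }

    // IN_ATTRIB fires when the inode's metadata changes, including its
    // link count, so deleting one of several hard links shows up here.
    int wd = inotify_add_watch(fd, "/tmp/watched",
                               IN_ATTRIB | IN_DELETE_SELF | IN_MOVE_SELF);
    if (wd < 0) { perror("inotify_add_watch"); return 1; }

    alignas(inotify_event) char buf[4096];
    ssize_t len = read(fd, buf, sizeof(buf)); // blocks until an event arrives
    for (char* p = buf; p < buf + len; p += sizeof(inotify_event) + ((inotify_event*)p)->len)
    {
        inotify_event* ev = (inotify_event*)p;
        if (ev->mask & IN_ATTRIB)
            printf("metadata changed - re-check the old path with stat()\n");
    }
    close(fd);
    return 0;
}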
As for "open descriptors" issue — I am not aware of any workarounds. Personally, I just don't care. So what if your program temporarily de-synchronizes with disk contents? Even if inotify were completely flawless, you would still need occasional re-sync due to queue overruns and event races.

Use Win API to determine if an instance of an executable is already running

I need to ensure only 1 instance of my C++ application is running.
Using the Win API, how do I:

1. Retrieve information about my current application? GetCurrentProcess() will give me a HANDLE to my application; how do I retrieve information about it?
2. Retrieve a list of all running processes for the user? EnumProcesses() gives a list, but appears to require a pre-allocated buffer, so how do I find out how many processes are currently running?
I need to compare the exe name of my server to the running processes, and raise an error if I find more than one
Note: I cannot use any boost libraries, and I am not interested in using a mutex, seen on similar posts.
You can use the CreateMutex function to create a system-wide named mutex that denotes whether your process is running. GetLastError() will return ERROR_ALREADY_EXISTS if the process is already running:
(void)::CreateMutex( NULL,
                     TRUE,
                     TEXT( "My_Special_Invokation_Test_Mutex" ) );
switch ( ::GetLastError() ) {
    case ERROR_SUCCESS:
        // Process was not running already
        break;
    case ERROR_ALREADY_EXISTS:
        // Process is running already
        break;
    default:
        // Error occurred, not sure whether process is running already.
        break;
}
Now, if you insist on not using a mutex, you can use the CreateFile function instead. Make sure to pass zero for the dwShareMode parameter to get exclusive-access semantics, CREATE_NEW for the dwCreationDisposition parameter (so that you create the file only if it doesn't exist already), and FILE_FLAG_DELETE_ON_CLOSE for the dwFlagsAndAttributes argument so that the file gets deleted once your process terminates. Something like this:
LPCTSTR lockFileName = ...;
(void)::CreateFile( lockFileName,
                    GENERIC_READ,
                    0,
                    NULL,
                    CREATE_NEW,
                    FILE_FLAG_DELETE_ON_CLOSE,
                    NULL );
switch ( ::GetLastError() ) {
    case ERROR_SUCCESS:
        // Process was not running already
        break;
    case ERROR_FILE_EXISTS:
        // Process is running already
        break;
    default:
        // Error occurred, not sure whether process is running already.
        break;
}
See this article about temporary file generation and usage best practices for how to deal with temporary files safely.
To make a long story short: it's certainly possible to use lock files for your task, but I think it's harder to do right.
Updated version of Nawaz's answer:
HANDLE mutex = CreateMutex(0, 0, TEXT("SomeUniqueName"));
switch (GetLastError())
{
    case ERROR_ALREADY_EXISTS:
        // app already running
        break;
    case ERROR_SUCCESS:
        // first instance
        break;
    default:
        // who knows what happened!
        break;
}
This does have a security issue, a malicious application could create a mutex called "SomeUniqueName" before your app starts, which would then prevent your app from being run. To counter this, you can name the mutex based on a hash of some constant system parameter (the MAC address for example). The MSDN documentation has this to say about single instance applications:
If you are using a named mutex to limit your application to a single instance, a malicious user can create this mutex before you do and prevent your application from starting. To prevent this situation, create a randomly named mutex and store the name so that it can only be obtained by an authorized user. Alternatively, you can use a file for this purpose. To limit your application to one instance per user, create a locked file in the user's profile directory.
Since a mutex isn't desired, you can, for example, use a file mapping instead. The documentation for CreateFileMapping says:
If the object exists before the function call, the function returns a handle to the existing object (with its current size, not the specified size), and GetLastError returns ERROR_ALREADY_EXISTS.
If the function fails, the return value is NULL.
This leads to the following no-mutex implementation:
// INVALID_HANDLE_VALUE backs the mapping with the page file rather than a real file.
HANDLE h = CreateFileMapping(INVALID_HANDLE_VALUE, 0, PAGE_READONLY, 0, 4096, name);
bool already_running = !!h && (GetLastError() == ERROR_ALREADY_EXISTS);
Either the call succeeds and the mapping already exists, then another process is already running.
Or, a new mapping is created, or the call fails. In either case, no other process is already running. If the call fails, it almost certainly failed for any other process that may have tried before as well. Since once a call was successful, the mapping already exists, the only possible reason why two identical calls could succeed once and then fail would be "no more handles left", and that just doesn't (well, shouldn't) happen. Anyway, if this does happen, you have a much more serious problem elsewhere.
That thing probably works with every type of named kernel object you pick (i.e. every type of kernel object that has both a Create and an Open version).
A file mapping object has the advantage that if you also want to do IPC (say, forward your commandline to the already running instance, and then exit), then you already have a mapping that you can use (though sure enough a pipe would do mighty fine as well).
But otherwise, I don't see how this (or any other solution) is superior to using the mutex approach in any way. Really, why not use a mutex? It's what they're for.