rollback function or design pattern in C++ - c++

Right now, I am facing a new problem that I can't figure out how to fix. I have two files. One is a video file and the other is a thumbnail. They have the same name. I want to rename these two files using C++. I am using the rename function and it works. This is what I've written:
/* rename() takes the old and the new name; the *Old/*New variables stand for the poster's paths */
if (rename(videoFileOld, videoFileNew) == 0)
{
    if (rename(thumbnailOld, thumbnailNew) != 0)
    {
        /* video already carries the new name, thumbnail still the old one */
        printf("Fail rename \n");
    }
}
The problem occurs when the video file is renamed successfully but for some reason the thumbnail can't be renamed. When this happens, I would like to roll back the renaming of the video file, because the video file name and the thumbnail file name should be the same in my program. What I want to do is to rename only after both files are okay to rename. Please guide me to any design pattern for a function like rollback, or to third-party software.

There is no absolutely foolproof way to do this.
Fundamental rule of disk I/O: The filesystem can change at any time. You can't check whether a rename would succeed; your answer is already wrong. You can't be certain that undoing the rename will succeed; somebody else might have taken the name while you briefly weren't using it.
On systems that support hard links, you can use them to get about 90% of the way there, assuming you're not moving between filesystems. Suppose you're renaming A to B and C to D. Then do these things:
1. Create hard link B which links to A. This is written as link("A", "B") in C, using the Unix link(2) system call. Windows users should call CreateHardLink() instead.
2. If (1) succeeded, create hard link D which links to C. Otherwise, return failure now.
3. If (2) succeeded, delete A and C and return success. Otherwise, delete B and return failure. If the deletions fail, there is no obvious means of recovery. In practice, you can probably ignore failed deletions assuming the reason for failure was "file not found" or equivalent for your platform.
This is still vulnerable to race conditions if someone deletes one of the files out from under you at the wrong time, but that is arguably not an issue since it is largely equivalent to the rename failing (or succeeding) and then the person deleting the file afterwards.
Technically, you should also be opening the containing directory (in O_RDONLY mode) and fsync(2)'ing it after each operation, at least under Unix. If moving between directories, that's both the source and the destination directories. In practice, nobody does this, particularly since it will lead to degraded performance under ext3. Linus takes the position that the filesystem ought to DTRT without this call, but it is formally required under POSIX. As for Windows, I've been unable to find any authoritative reference on this issue on MSDN or elsewhere. So far as I'm aware, Windows does not provide an API for synchronizing directory entries (you can't open() a directory, so you can't get a file descriptor suitable to pass to fsync()).
Nitpick: To some extent, this sort of thing can be done perfectly on transactional filesystems, but just about the only one in common use right now is NTFS, and Microsoft specifically tells developers not to use that feature. If/when btrfs hits stable, transactions might become genuinely useful.
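The three-step hard-link procedure from this answer, written out as a POSIX C++ function with the rollback of step 2 and the directory fsync mentioned above as a separate helper. This is an added example, not code from the answer or the question; the function names and the scratch-directory helpers are my own, and the example assumes all four names live on one filesystem (link(2) cannot cross filesystems).

```cpp
#include <cerrno>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <fcntl.h>
#include <string>
#include <sys/stat.h>
#include <unistd.h>

// Step 3 for one old name. ENOENT is tolerated: somebody else removed the
// file first, which leaves the same end state.
static bool removeOldName(const std::string& old, std::string* err) {
    if (::unlink(old.c_str()) == 0 || errno == ENOENT) return true;
    if (err) *err = "unlink " + old + ": " + std::strerror(errno);
    return false;
}

// Rename A->B and C->D as a unit using hard links:
//   1. link(A, B)
//   2. link(C, D); on failure remove B again (rollback)
//   3. unlink A and C once both new names exist
// link() fails with EEXIST if the target name is already taken, so an existing
// B or D is never overwritten.
bool renamePairWithRollback(const std::string& a, const std::string& b,
                            const std::string& c, const std::string& d,
                            std::string* err) {
    if (::link(a.c_str(), b.c_str()) != 0) {
        if (err) *err = "link " + b + ": " + std::strerror(errno);
        return false;                     // nothing to undo yet
    }
    if (::link(c.c_str(), d.c_str()) != 0) {
        int saved = errno;
        ::unlink(b.c_str());              // rollback: B was created by us
        if (err) *err = "link " + d + ": " + std::strerror(saved);
        return false;
    }
    bool ok = removeOldName(a, err);
    ok = removeOldName(c, err) && ok;     // try both even if the first failed
    return ok;
}

// fsync the directory holding `path` so the new directory entries are durable
// (the POSIX-required step the answer mentions; rarely done in practice).
bool syncParentDir(const std::string& path) {
    std::string::size_type slash = path.rfind('/');
    std::string dir = (slash == std::string::npos) ? "." : path.substr(0, slash);
    if (dir.empty()) dir = "/";
    int fd = ::open(dir.c_str(), O_RDONLY);
    if (fd < 0) return false;
    bool ok = (::fsync(fd) == 0);
    ::close(fd);
    return ok;
}

// Scratch helpers, constructed for the demonstration only.
std::string makeScratchDir() {
    char tmpl[] = "/tmp/renamepair.XXXXXX";
    char* p = ::mkdtemp(tmpl);
    return p ? std::string(p) : std::string();
}
bool touchFile(const std::string& path) {
    int fd = ::open(path.c_str(), O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0) return false;
    ::close(fd);
    return true;
}
bool fileExists(const std::string& path) {
    struct stat st;
    return ::lstat(path.c_str(), &st) == 0;
}

int main() {
    std::string dir = makeScratchDir();
    touchFile(dir + "/A");
    touchFile(dir + "/C");
    std::string err;
    bool ok = renamePairWithRollback(dir + "/A", dir + "/B", dir + "/C", dir + "/D", &err);
    if (!ok) std::fprintf(stderr, "%s\n", err.c_str());
    return ok ? 0 : 1;
}
```

The second half of the test below exercises the rollback: D already exists, so step 2 fails with EEXIST and B is removed again, leaving A and C exactly as they were.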

On the Windows platform, starting from Vista, you can use code such as the following.
#include <windows.h>
#include "KtmW32.h"
#pragma comment(lib, "KtmW32.lib")

bool RenameFileTransact( LPCTSTR lpctszOldVideoFile, LPCTSTR lpctszNewVideoFile, LPCTSTR lpctszOldThumbnailFile, LPCTSTR lpctszNewThumbnailFile )
{
    bool bReturn = false;
    // Kernel Transaction Manager: both moves become visible together, or neither does.
    HANDLE hRenameTransaction = CreateTransaction(NULL, NULL, 0, 0, 0, 0, NULL);
    if (hRenameTransaction == INVALID_HANDLE_VALUE)
    {
        return false;
    }
    if (MoveFileTransacted(lpctszOldVideoFile, lpctszNewVideoFile, NULL, NULL, 0, hRenameTransaction) &&
        MoveFileTransacted(lpctszOldThumbnailFile, lpctszNewThumbnailFile, NULL, NULL, 0, hRenameTransaction))
    {
        if ( CommitTransaction(hRenameTransaction))
        {
            bReturn = true;
        }
    }
    else
    {
        // One of the moves failed: discard whatever the transaction already did.
        RollbackTransaction(hRenameTransaction);
    }
    CloseHandle( hRenameTransaction );
    return bReturn;
}
But as @Kevin pointed out above, Microsoft discourages the use of this good feature.

Related

Protecting against Time-of-check to time-of-use?

I was reading: https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use
They showed this code to be buggy and I totally understand why it's so:
if (access("file", W_OK) != 0) {
    exit(1);
}
// Attacker: symlink("/etc/passwd", "file");
fd = open("file", O_WRONLY);
// Actually writing over /etc/passwd
write(fd, buffer, sizeof(buffer));
But the real question is: how do you protect against this type of exploit?
You can use the O_NOFOLLOW flag. It will cause the open to fail if the basename of the path is a symbolic link. That would solve the described attack.
To cover links along the directory path, you can check whether frealpath(fd, ...) matches what you would expect.
Another way to prevent a process from overwriting /etc/passwd is to run it as non-root so that it won't have permission. Or, you can use chroot - or more generally, a container - to prevent the host system's /etc/passwd being visible to the process.
More generally though, filesystem TOCTOU is unsolvable at the moment on Linux. You would need transaction support either at the filesystem or at the system call level, and both are lacking.
There is no failproof solution.
Be also aware of Rice's theorem. It might be relevant.
But you could adopt a system wide convention (and document it) that every program accessing a given file is using locking facilities like flock(2).
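The O_NOFOLLOW answer above, as an added POSIX C++ example that is not code from any of the answers. It opens with O_NOFOLLOW so a symlink planted as the final path component is refused, then validates the object it actually got: the defensive checks after the open (regular file, and an fstat/lstat device-inode comparison) are my own addition and stand in for the answer's `frealpath` idea; the function and helper names are assumptions.

```cpp
#include <cerrno>
#include <cstdlib>
#include <fcntl.h>
#include <string>
#include <sys/stat.h>
#include <unistd.h>

// Open `path` for writing without following a symlink in the final component
// (a planted link makes open() fail with ELOOP instead of redirecting the
// write). Everything afterwards must go through the returned descriptor, never
// through the path again: the descriptor is bound to the object we checked.
// The post-open checks confirm that object is a regular file and that the name
// still refers to it; a swap between open and lstat is reported as ESTALE.
// Returns the descriptor, or -1 with the errno value stored in *err.
int safeOpenForWrite(const std::string& path, int* err) {
    int fd = ::open(path.c_str(), O_WRONLY | O_NOFOLLOW | O_CLOEXEC);
    if (fd < 0) {
        if (err) *err = errno;
        return -1;
    }
    struct stat byFd;
    if (::fstat(fd, &byFd) != 0) {
        int e = errno;
        ::close(fd);
        if (err) *err = e;
        return -1;
    }
    if (!S_ISREG(byFd.st_mode)) {
        ::close(fd);
        if (err) *err = EINVAL;           // device, FIFO or socket, not a plain file
        return -1;
    }
    struct stat byPath;                   // lstat: do not follow links here either
    if (::lstat(path.c_str(), &byPath) != 0 ||
        byPath.st_dev != byFd.st_dev || byPath.st_ino != byFd.st_ino) {
        ::close(fd);
        if (err) *err = ESTALE;           // the name no longer points at what we opened
        return -1;
    }
    return fd;
}

// Scratch helpers, constructed for the demonstration only.
std::string makeScratchDir() {
    char tmpl[] = "/tmp/nofollow.XXXXXX";
    char* p = ::mkdtemp(tmpl);
    return p ? std::string(p) : std::string();
}
bool createEmptyFile(const std::string& path) {
    int fd = ::open(path.c_str(), O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0) return false;
    ::close(fd);
    return true;
}
```

The test below plants a symlink next to a regular file and checks that the link is refused with ELOOP while the regular file opens normally; a directory is refused as well (EISDIR).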

Is it good to use ntdll.dll in a win32 console application?

Short:
In my C++ project I need to read/write extended file properties. I managed it using alternate data streams (ADS). My problem is, for opening the ADS I need to use the CreateFile API. But it is not fulfilling my needs. NtCreateFile will fulfill all my needs. (Or alternatively NtSetEaFile and NtQueryEaFile.) But NtCreateFile is not directly accessible from a Win32 console application.
I know I can use this function easily via GetProcAddress. But I would like to know your opinion: did I miss something? Some other libs are using this pattern already, for example Chromium (https://github.com/chromium-googlesource-mirror/chromium/blob/1c1996b75d3611f56d14e2b30e7ae4eabc101486/src/sandbox/src/win_utils.cc function: ResolveNTFunctionPtr).
But I'm uncertain, because the C++ project is not a hobby project and I ask myself whether it is dangerous or not.
I guess NtCreateFile is maybe the most secure way to do it, because it is well documented and supported by the winternl.h header, especially because this method has been unchanged since Windows 2000. But what about NtSetEaFile and NtQueryEaFile, which fit my needs perfectly? They are only half documented. Documentation for ZwSetEaFile and ZwQueryEaFile exists (unchanged since Windows 2000).
The reason why I want to do that:
I want to write and read extended properties from files via ADS. But when writing the extended property of a given file for the first time, I need to open the file with OPEN_ALWAYS. If the file does not exist, it will create a new file, even if I only access a stream other than the content stream of the file. To avoid this, I first get the handle of the original file and check with this HANDLE whether the file still exists.
But I don't want to block any file with reduced access rights, because from my point of view that is a very bad pattern. The user needs to have full access to any file at any time. Because of that we open all HANDLEs with the flags FILE_SHARE_DELETE | FILE_SHARE_READ | FILE_SHARE_WRITE. And now I have the race.
auto hFile = CreateFileW(originalPath, …, FILE_SHARE_DELETE | FILE_SHARE_READ | FILE_SHARE_WRITE, …);
// this is the little race: if somebody at least renames originalPath, the
// second CreateFileW call will cause the creation of an empty file with the
// path originalPath (the old path).
auto hADS = CreateFileW(originalPath + adsName, …, FILE_SHARE_DELETE | FILE_SHARE_READ | FILE_SHARE_WRITE, OPEN_ALWAYS, …);
This is a major issue, especially because it happens from time to time in our tests. NtCreateFile will fix it, because I can create the second HANDLE with the help of the first HANDLE; because of that there is no race. Or NtSetEaFile and NtQueryEaFile will help, because I only need one HANDLE.
The thing is that the application does not need to be safe for the future, because ADS works only on NTFS anyway, and who knows when NTFS will be replaced. But I don't want flaky behaviour. I want to trust these methods. I am fine if the API changes in the future and the software needs to adapt to it. But I want to be sure that all Windows versions higher than or equal to 7 can deal with it. Does somebody have some experience to share? I would very much like to hear it.
This question is wrong. Your proposed solution for your problem is not to use NtCreateFile, but to use CreateFile with dwCreationDisposition set to OPEN_EXISTING.
From the documentation:
OPEN_EXISTING
Opens a file or device, only if it exists. If the specified file or device does not exist, the function fails and the last-error code is set to ERROR_FILE_NOT_FOUND.
Simply open the file if it exists and set whatever you want. If the file is renamed, CreateFile returns ERROR_FILE_NOT_FOUND.
THE PROBLEM
Now, to your proposed solution: what is the better method, or why is it not possible to use ntdll.dll in a Win32 console application (???).
Again, your "better" method, GetProcAddress, is "wrong" in the same way as linking against ntdll.dll. In Windows 11, or Windows 12, or Windows 3030 the function may be removed and both solutions (static vs. dynamic import) will fail.
It is not really insecure to use this kind of API if there is documentation. In the case of NtSetEaFile, NtQueryEaFile and NtCreateFile you can find a description in Microsoft's docs (keep in mind NtXxx == ZwXxx).
But this API can change in the future and Microsoft does not guarantee that it will provide the same methods in the next Windows version. If you can, use the public API, because then you are safe. If not, it is a case-by-case decision. In this case the three methods from the API have been unchanged since Windows 2000. Plus, for example, NtSetEaFile and NtQueryEaFile are used by Microsoft for WSL (Windows Subsystem for Linux). And especially NtCreateFile is used by a wide range of open-source projects. So it is very unlikely that this API will change.
In my use case another aspect is important: I wanted to use ADS, but ADS is only supported by NTFS. So using ADS does not ensure future compatibility either. So it was very clear to me to use NtSetEaFile and NtQueryEaFile.
But how can you use this kind of API? Dynamic or static linking is possible; it depends on your needs which is better. For static linking you need to download the latest WDK (Windows Driver Kit) and link against ntdll.lib. For dynamic linking you can access the DLL directly via GetModuleHandle and find the address of the method with GetProcAddress. Under Windows ntdll.dll is accessible from any application. In both cases you don't directly have a header file. You have to define the header file yourself or use the WDK to get it.
In my project dynamic linking was the best choice. The reason was that on every Windows version the right implementation will be chosen, and in case the method is not available I have the chance to deactivate the feature in my software instead of crashing. Microsoft recommends the dynamic way because of the last reason.
Simple pseudocode (dynamic case):
#include <windows.h>

// Hand-written declarations: a user-mode build has no ntifs.h, so the
// structures and the function type are spelled out here (matching the DDK).
typedef LONG NTSTATUS;

typedef struct _FILE_FULL_EA_INFORMATION {
    ULONG  NextEntryOffset;
    UCHAR  Flags;
    UCHAR  EaNameLength;
    USHORT EaValueLength;
    CHAR   EaName[1];
} FILE_FULL_EA_INFORMATION, *PFILE_FULL_EA_INFORMATION;

typedef struct _IO_STATUS_BLOCK {
    union {
        NTSTATUS Status;
        PVOID    Pointer;
    };
    ULONG_PTR Information;
} IO_STATUS_BLOCK, *PIO_STATUS_BLOCK;

typedef NTSTATUS (WINAPI *NtSetEaFileFunction)(IN HANDLE FileHandle,
                                               OUT PIO_STATUS_BLOCK IoStatusBlock,
                                               IN PVOID Buffer,
                                               IN ULONG Length);

// Resolve the export at run time. ntdll.dll is already mapped into every
// process, so GetModuleHandle is enough and no LoadLibrary is needed.
NtSetEaFileFunction ResolveNtSetEaFile()
{
    HMODULE ntdll = GetModuleHandleW(L"ntdll.dll");
    if (ntdll == NULL)
        return nullptr;
    NtSetEaFileFunction function = nullptr;
    FARPROC *function_ptr = reinterpret_cast<FARPROC *>(&function);
    // Look up the export whose signature matches the typedef above
    // (the original looked up "NtQueryEaFile", whose signature differs).
    *function_ptr = GetProcAddress(ntdll, "NtSetEaFile");
    // nullptr here means the export is missing: disable the feature rather than crash.
    return function;
}
// function could be used normally.
The other answer is incorrect. The reason is that the cause of my problem is that I need to use OPEN_ALWAYS. Of course, if you don't need this flag, everything is fine. But in my case there is a point where I need to create the ADS, and it will not be created without the OPEN_ALWAYS flag.

Could DropBox interfere with DeleteFile()/rename()

I had the following code which got executed every two minutes all day long:
BOOL successfully_deleted = DeleteFile(dest_filename);
if (!successfully_deleted)
{
    // this never happens
}
// once every several hours this fails with errno == EACCES (13)
if (rename(source_filename, dest_filename) != 0)
{
    perror("rename");
}
Once every several hours the rename() would fail with errno=13 (EACCES). The files involved were all sitting on a DropBox directory and I had a hunch that DropBox could be the cause. I figured that it might just be possible that the DeleteFile() function may return with a non-zero successfully_deleted but actually DropBox could still be busy doing some stuff in relation to the deletion that prevented rename() from succeeding. What I did next was to change rename() to my_rename() which would attempt a rename() and upon any failure would Sleep() for one second and try a second time. Sure enough that has worked perfectly ever since. What's more, I get a diagnostic message displaying first-attempt failures every several hours. It has never failed on the second attempt.
So you could say that the problem is entirely solved... but I would like to understand what might be going on so as to better defend myself against any related DropBox issues in the future...
Really I would like to have a new super_delete() function which does not return until the file is properly deleted and finished with in all respects.
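A bounded version of the my_rename() retry described in the question, as an added portable C++ example (not the poster's code). It retries only the transient errno the post reports (EACCES, plus EBUSY), returns permanent errors such as ENOENT at once instead of sleeping on them, and gives up after a fixed number of attempts so a file that is held open indefinitely cannot hang the caller. The names and the injectable rename operation are my assumptions.

```cpp
#include <cerrno>
#include <chrono>
#include <cstdio>
#include <functional>
#include <thread>

// What renameWithRetry reports back: success flag, attempts used, final errno.
struct RetryResult {
    bool ok;
    int attempts;
    int lastErrno;
};

// The rename operation to retry. The default is the C library rename the post
// uses; a test can substitute a fake that fails on purpose.
typedef std::function<int(const char*, const char*)> RenameOp;

int libcRename(const char* from, const char* to) {
    return std::rename(from, to);
}

// Retry only the transient failure the post describes (EACCES while something
// else, such as a sync client or antivirus, still holds the file open), wait
// between attempts, and stop after `maxAttempts`. Permanent errors such as
// ENOENT are returned immediately rather than retried.
RetryResult renameWithRetry(const char* from, const char* to, int maxAttempts,
                            std::chrono::milliseconds delay, RenameOp op = libcRename) {
    RetryResult r = {false, 0, 0};
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        r.attempts = attempt;
        if (op(from, to) == 0) {
            r.ok = true;
            r.lastErrno = 0;
            return r;
        }
        r.lastErrno = errno;
        bool transient = (r.lastErrno == EACCES || r.lastErrno == EBUSY);
        if (!transient || attempt == maxAttempts) return r;
        std::this_thread::sleep_for(delay);
    }
    return r;
}
```

The diagnostic the poster mentions belongs in the caller: when `r.ok` is true and `r.attempts` is greater than 1, log a first-attempt failure; when `r.ok` is false, report `r.lastErrno`.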
Under Windows, a request to delete a file never really deletes the file right away. It marks its FCB (File Control Block) with a special flag (FCB_STATE_DELETE_ON_CLOSE). The real deletion happens only when the last file handle is closed.
The DeleteFile function marks a file for deletion on close. Therefore, the file deletion does not occur until the last handle to the file is closed. Subsequent calls to CreateFile to open the file fail with ERROR_ACCESS_DENIED.
Also, if a section (memory-mapped file) is open on the file, the file cannot even be marked for deletion; the API call fails with STATUS_CANNOT_DELETE. So in general it is impossible to always delete a file.
In case other open handles for the file exist (but not a section!), beginning from Windows 10 RS1 there is new functionality for deletion: FileDispositionInformationEx with FILE_DISPOSITION_POSIX_SEMANTICS. In this case:
Normally a file marked for deletion is not actually deleted until all open handles for the file have been closed and the link count for the file is zero. When marking a file for deletion using FILE_DISPOSITION_POSIX_SEMANTICS, the link gets removed from the visible namespace as soon as the POSIX delete handle has been closed, but the file's data streams remain accessible by other existing handles until the last handle has been closed.
#include <windows.h>

// Windows 10 RS1 and later: the name disappears from the directory when this
// handle is closed, even while other FILE_SHARE_DELETE handles stay open.
ULONG DeletePosix(PCWSTR lpFileName)
{
    HANDLE hFile = CreateFileW(lpFileName, DELETE,
                               FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, // FILE_SHARE_VALID_FLAGS is a kernel-mode define
                               0, OPEN_EXISTING,
                               FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OPEN_REPARSE_POINT, 0);
    if (hFile == INVALID_HANDLE_VALUE)
    {
        return GetLastError();
    }
    FILE_DISPOSITION_INFO_EX fdi = { FILE_DISPOSITION_DELETE | FILE_DISPOSITION_POSIX_SEMANTICS };
    ULONG dwError = SetFileInformationByHandle(hFile, FileDispositionInfoEx, &fdi, sizeof(fdi))
                        ? NOERROR : GetLastError();
    // win10 rs1: file removed from parent folder here
    CloseHandle(hFile);
    return dwError;
}
Update
Sorry, I didn't get the question correctly the first time. I thought DeleteFile returned error 13.
Now I understand that DeleteFile succeeds but rename fails immediately after.
It could be just a sync issue with the filesystem. After calling DeleteFile the file will be deleted when the OS commits the changes to the filesystem. That may not happen immediately.
If you need to perform multiple operations to the same path, you should have a look at transactions https://learn.microsoft.com/it-it/windows/desktop/api/winbase/nf-winbase-deletefiletransacteda.
-- OLD ANSWER --
That is correct. If another application has handles to that file, DeleteFile will fail.
Citing MSDN docs https://learn.microsoft.com/en-us/windows/desktop/api/winbase/nf-winbase-deletefile :
The DeleteFile function fails if an application attempts to delete a file that has other handles open for normal I/O or as a memory-mapped file (FILE_SHARE_DELETE must have been specified when other handles were opened).
This applies to dropbox, the antivirus, or in general, any other application that may open those files.
Dropbox may open the file to compute its hash (to look for changes) at any moment. Same goes with the antivirus.

C++ Directory Watching - How to detect copy has ended

I have a folder to which files are copied. I want to watch it and process files as soon as they are copied to the directory. I can detect when a file is in the directory, whether through polling (my current implementation) or, in some tests, using Windows APIs from a few samples I've found online.
The problem is that I detect the file when it is first created and is still being copied. This makes my program, which needs to access the file, throw errors (because the file is not yet complete). How can I detect not when the copying started but when the copying ended? I'm using C++ on Windows, so the answer may be platform dependent but, if possible, I'd prefer it to be platform agnostic.
You could use either lock files or a special naming convention. The easiest is the latter and would work something like this:
Say you want to move a file named "fileA.txt". When copying it to the destination directory, instead, copy it to "fileA.txt.partial" or something like that. When the copy is complete, rename the file from "fileA.txt.partial" to "fileA.txt". So the appearance of "fileA.txt" is atomic as far as the watching program can see.
The other option as mentioned earlier is lock files. So when you copy a file named "fileA.txt", you first create a file called "fileA.txt.lock". When the copying is done, you simply delete the lock file. When the watching program sees "fileA.txt", it should check if "fileA.txt.lock" exists; if it does, it can wait or revisit that file in the future as needed.
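Both conventions from this answer (a ".partial" name renamed into place, and a ".lock" sibling) as an added POSIX C++ example; it is not the answer's code. The producer writes and fsyncs the payload under the temporary name before the rename, so the final name only ever appears complete (rename within one filesystem replaces the directory entry atomically), and the consumer filter skips anything that is still partial or still locked. The suffixes and function names are my assumptions.

```cpp
#include <algorithm>
#include <cstdlib>
#include <dirent.h>
#include <fcntl.h>
#include <string>
#include <sys/stat.h>
#include <unistd.h>
#include <vector>

static const std::string kPartialSuffix = ".partial";
static const std::string kLockSuffix = ".lock";

static bool endsWith(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}
static bool pathExists(const std::string& p) {
    struct stat st;
    return ::lstat(p.c_str(), &st) == 0;
}

// Producer side: write the whole payload to "<name>.partial", flush it to disk,
// then rename to "<name>". A watcher never sees a half-written "<name>".
bool publishFile(const std::string& dir, const std::string& name, const std::string& payload) {
    std::string tmp = dir + "/" + name + kPartialSuffix;
    int fd = ::open(tmp.c_str(), O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0) return false;
    ssize_t n = ::write(fd, payload.data(), payload.size());
    bool ok = (n == static_cast<ssize_t>(payload.size())) && ::fsync(fd) == 0;
    ::close(fd);
    if (!ok) { ::unlink(tmp.c_str()); return false; }   // leave no broken final name behind
    return ::rename(tmp.c_str(), (dir + "/" + name).c_str()) == 0;
}

// Consumer side: names in `dir` that are ready under either convention. Skip
// ".partial" files, skip ".lock" markers themselves, and skip any file whose
// ".lock" sibling still exists (the copy is in progress).
std::vector<std::string> readyFiles(const std::string& dir) {
    std::vector<std::string> out;
    DIR* d = ::opendir(dir.c_str());
    if (!d) return out;
    while (struct dirent* e = ::readdir(d)) {
        std::string n = e->d_name;
        if (n == "." || n == "..") continue;
        if (endsWith(n, kPartialSuffix) || endsWith(n, kLockSuffix)) continue;
        if (pathExists(dir + "/" + n + kLockSuffix)) continue;
        out.push_back(n);
    }
    ::closedir(d);
    std::sort(out.begin(), out.end());
    return out;
}

// Scratch helpers, constructed for the demonstration only.
std::string makeScratchDir() {
    char tmpl[] = "/tmp/watchdir.XXXXXX";
    char* p = ::mkdtemp(tmpl);
    return p ? std::string(p) : std::string();
}
bool createEmptyFile(const std::string& path) {
    int fd = ::open(path.c_str(), O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0) return false;
    ::close(fd);
    return true;
}
```

A watcher using FindFirstChangeNotification or ReadDirectoryChangesW would call readyFiles() on each notification rather than on a timer; the filter is the same either way.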
You should not be polling. Use FindFirstChangeNotification (http://msdn.microsoft.com/en-us/library/windows/desktop/aa364417%28v=vs.85%29.aspx) to watch a directory for changes.
Then use the Wait functions (http://msdn.microsoft.com/en-us/library/windows/desktop/ms687069%28v=vs.85%29.aspx) to wait on change notifications to happen.
Overview and examples here: http://msdn.microsoft.com/en-us/library/windows/desktop/aa365261%28v=vs.85%29.aspx
I'm not sure how exactly file write completion can be determined. Evan Teran's answer is a good idea.
You can use something like this. It is tested and working:
#include <windows.h>
#include <cerrno>
#include <cstdio>
#include <stdexcept>
#include <string>

// Relies on rename failing while another process still has the file open
// without FILE_SHARE_DELETE, which is the case for a copy in progress.
bool IsFileDownloadComplete(const std::wstring& dir, const std::wstring& fileName)
{
    std::wstring originalFileName = dir + fileName;
    std::wstring tempFileName = dir + fileName + L".tmp"; // per-file temp name: a shared "temp" would clash between callers
    while(true)
    {
        // _wrename takes wide strings directly, so no narrowing conversion is needed
        int ret = _wrename(originalFileName.c_str(), tempFileName.c_str());
        if(ret == 0)
            break;
        if(errno == ENOENT)
            return false;   // file vanished: do not spin forever
        Sleep(10);
    }
    /** File is not open. Rename to original. */
    int ret = _wrename(tempFileName.c_str(), originalFileName.c_str());
    if(ret != 0)
        throw std::runtime_error("File rename failed");
    return true;
}

How to see if a subfile of a directory has changed

In Windows, is there an easy way to tell if a folder has a subfile that has changed?
I verified, and the last modified date on the folder does not get updated when a subfile changes.
Is there a registry entry I can set that will modify this behavior?
If it matters, I am using an NTFS volume.
I would ultimately like to have this ability from a C++ program.
Scanning an entire directory recursively will not work for me because the folder is much too large.
Update: I really need a way to do this without a process running while the change occurs. So installing a file system watcher is not optimal for me.
Update2: The archive bit will also not work because it has the same problem as the last modification date. The file's archive bit will be set, but the folders will not.
This article should help. Basically, you create one or more notification objects such as:
// lpDir and lpDrive are the directory and drive paths supplied by the caller
HANDLE dwChangeHandles[2];
dwChangeHandles[0] = FindFirstChangeNotification(
    lpDir,                         // directory to watch
    FALSE,                         // do not watch subtree
    FILE_NOTIFY_CHANGE_FILE_NAME); // watch file name changes
if (dwChangeHandles[0] == INVALID_HANDLE_VALUE)
{
    printf("\n ERROR: FindFirstChangeNotification function failed.\n");
    ExitProcess(GetLastError());
}
// Watch the subtree for directory creation and deletion.
dwChangeHandles[1] = FindFirstChangeNotification(
    lpDrive,                       // directory to watch
    TRUE,                          // watch the subtree
    FILE_NOTIFY_CHANGE_DIR_NAME);  // watch dir name changes
if (dwChangeHandles[1] == INVALID_HANDLE_VALUE)
{
    printf("\n ERROR: FindFirstChangeNotification function failed.\n");
    ExitProcess(GetLastError());
}
and then you wait for a notification:
while (TRUE)
{
    // Wait for notification.
    printf("\nWaiting for notification...\n");
    DWORD dwWaitStatus = WaitForMultipleObjects(2, dwChangeHandles,
                                                FALSE, INFINITE);
    switch (dwWaitStatus)
    {
    case WAIT_OBJECT_0:
        // A file was created, renamed, or deleted in the directory.
        // Restart the notification.
        if ( FindNextChangeNotification(dwChangeHandles[0]) == FALSE )
        {
            printf("\n ERROR: FindNextChangeNotification function failed.\n");
            ExitProcess(GetLastError());
        }
        break;
    case WAIT_OBJECT_0 + 1:
        // Restart the notification.
        if (FindNextChangeNotification(dwChangeHandles[1]) == FALSE )
        {
            printf("\n ERROR: FindNextChangeNotification function failed.\n");
            ExitProcess(GetLastError());
        }
        break;
    case WAIT_TIMEOUT:
        // A time-out occurred. This would happen if some value other
        // than INFINITE is used in the Wait call and no changes occur.
        // In a single-threaded environment, you might not want an
        // INFINITE wait.
        printf("\nNo changes in the time-out period.\n");
        break;
    default:
        printf("\n ERROR: Unhandled dwWaitStatus.\n");
        ExitProcess(GetLastError());
        break;
    }
}
This is perhaps overkill, but the IFS kit from MS or the FDDK from OSR might be an alternative. Create your own filesystem filter driver with simple monitoring of all changes to the filesystem.
ReadDirectoryChangesW
Some excellent sample code in this CodeProject article
If you can't run a process when the change occurs, then there's not much you can do except scan the filesystem, and check the modification date/time. This requires you to store each file's last date/time, though, and compare.
You can speed this up by using the archive bit (though it may mess up your backup software, so proceed carefully).
An archive bit is a file attribute present in many computer file systems, notably FAT, FAT32, and NTFS. The purpose of an archive bit is to track incremental changes to files for the purpose of backup, also called archiving.
As the archive bit is a binary bit, it is either 1 or 0, or in this case more frequently called set (1) and clear (0). The operating system sets the archive bit any time a file is created, moved, renamed, or otherwise modified in any way. The archive bit therefore represents one of two states: "changed" and "not changed" since the last backup.
Archive bits are not affected by simply reading a file. When a file is copied, the original file's archive bit is unaffected, however the copy's archive bit will be set at the time the copy is made.
So the process would be:
Clear the archive bit on all the files
Let the file system change over time
Scan all the files - any with the archive bit set have changed
This will eliminate the need for your program to keep state, and since you're only going over the directory entries (where the bit is stored) and they are clustered, it should be very, very fast.
If you can run a process during the changes, however, then you'll want to look at the FileSystemWatcher class. Here's an example of how you might use it.
It also exists in .NET (for future searchers of this type of problem)
Perhaps you can leave a process running on the machine watching for changes and creating a file for you to read later.
-Adam
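The "store each file's last date/time and compare" approach from the start of this answer, as an added example that is not the answer's own code: a recursive snapshot of modification time and size, and a diff of two snapshots into added, removed and modified paths. It is portable POSIX C++ rather than the Windows archive bit, and it records only regular files, since the directory timestamps are exactly what the question found unreliable; the type and function names are my assumptions.

```cpp
#include <cstdlib>
#include <dirent.h>
#include <fcntl.h>
#include <map>
#include <string>
#include <sys/stat.h>
#include <unistd.h>
#include <utime.h>
#include <vector>

// Per-file state that is cheap to collect from a directory scan.
struct FileState {
    long mtime;
    long long size;
    bool operator==(const FileState& o) const { return mtime == o.mtime && size == o.size; }
};
typedef std::map<std::string, FileState> Snapshot;

// Recurse into subdirectories; record regular files only.
static void scanInto(const std::string& dir, Snapshot& snap) {
    DIR* d = ::opendir(dir.c_str());
    if (!d) return;
    while (struct dirent* e = ::readdir(d)) {
        std::string n = e->d_name;
        if (n == "." || n == "..") continue;
        std::string p = dir + "/" + n;
        struct stat st;
        if (::lstat(p.c_str(), &st) != 0) continue;   // vanished mid-scan
        if (S_ISDIR(st.st_mode)) scanInto(p, snap);
        else if (S_ISREG(st.st_mode))
            snap[p] = FileState{static_cast<long>(st.st_mtime), static_cast<long long>(st.st_size)};
    }
    ::closedir(d);
}

Snapshot takeSnapshot(const std::string& root) {
    Snapshot s;
    scanInto(root, s);
    return s;
}

struct Changes {
    std::vector<std::string> added, removed, modified;
};

// Compare the stored state with a fresh scan. std::map keeps paths sorted, so
// each list comes out in path order.
Changes diffSnapshots(const Snapshot& before, const Snapshot& after) {
    Changes c;
    for (Snapshot::const_iterator it = before.begin(); it != before.end(); ++it) {
        Snapshot::const_iterator jt = after.find(it->first);
        if (jt == after.end()) c.removed.push_back(it->first);
        else if (!(jt->second == it->second)) c.modified.push_back(it->first);
    }
    for (Snapshot::const_iterator jt = after.begin(); jt != after.end(); ++jt)
        if (before.find(jt->first) == before.end()) c.added.push_back(jt->first);
    return c;
}

// Scratch helpers, constructed for the demonstration only.
std::string makeScratchDir() {
    char tmpl[] = "/tmp/snapshot.XXXXXX";
    char* p = ::mkdtemp(tmpl);
    return p ? std::string(p) : std::string();
}
bool writeFile(const std::string& path, const std::string& content) {
    int fd = ::open(path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return false;
    bool ok = ::write(fd, content.data(), content.size()) == static_cast<ssize_t>(content.size());
    ::close(fd);
    return ok;
}
// Move a file's mtime forward so a change is visible even within one second.
bool bumpMtime(const std::string& path, long seconds) {
    struct stat st;
    if (::lstat(path.c_str(), &st) != 0) return false;
    struct utimbuf ub;
    ub.actime = st.st_atime;
    ub.modtime = st.st_mtime + seconds;
    return ::utime(path.c_str(), &ub) == 0;
}
```

The snapshot is the state the answer says you have to keep between runs; serialising it to disk is left to the caller. Size is compared alongside mtime because an mtime can be set back by an application (or by a restore), while a size change is hard to hide.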
Perhaps you can use the NTFS 5 Change Journal with DeviceIoControl as explained here
If you are not opposed to using .NET the FileSystemWatcher class will handle this for you fairly easily.
From the double post someone mentioned: WMI Event Sink
Still looking for a better answer though.
Nothing easy. If you have a running app you can use the Win32 file change notification APIs (FindFirstChangeNotification) as suggested in the other answers. Warning: circa 2000, the Trend Micro real-time virus scanner would group the changes together, making it necessary to use really large buffers when requesting the file system change lists.
If you don't have a running app, you can turn on NTFS journaling and scan the journal for changes (http://msdn.microsoft.com/en-us/library/aa363798(VS.85).aspx), but this can be slower than scanning the whole directory when the number of changes is larger than the number of files.