I want to monitor a folder in my file system. Let's say I want to monitor the folder C:\MyNewFolder.
I have this code to do it:
HANDLE ChangeHandle = FindFirstChangeNotification(_T("C:\\MyNewFolder"), FALSE, FILE_NOTIFY_CHANGE_LAST_WRITE);
for (;;)
{
    DWORD Wait = WaitForSingleObject(ChangeHandle, INFINITE);
    if (Wait == WAIT_OBJECT_0)
    {
        MessageBox(NULL, _T("Change"), _T("Change"), MB_OK);
        FindNextChangeNotification(ChangeHandle);
    }
    else
    {
        break;
    }
}
I want a message box notifying me about any file change in my folder. That code works fine, but I have one problem: I get two notifications for each change. What is the problem with my code?
Thanks.
This is entirely normal. A change to a file usually involves a change to the file data as well as a change to the directory entry. Metadata properties like the file length and the last write date are stored there, so you'll get a notification for both. ReadDirectoryChangesW() doesn't otherwise distinguish between the two.
This is no different from a process making multiple changes to the same file. Be sure you can handle both conditions. This usually involves a timer, so you don't go overboard with the number of operations you perform on a notification. Such a timer is also often required because the process that is changing the file still has a lock on it that prevents you from doing anything with the file until the process closes it, which can be an indeterminate amount of time later.
What you're probably seeing is multiple changes to the same file (e.g. a file being created and then written to, or a file being written to multiple times). Unfortunately FindFirstChangeNotification doesn't tell you what has actually happened.
You're better off using ReadDirectoryChangesW for file notification as it will actually tell you what has changed.
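For illustration, here is a minimal synchronous sketch of ReadDirectoryChangesW, reusing the folder from the question (the buffer size and notification filters are arbitrary choices):

#include <windows.h>
#include <tchar.h>
#include <stdio.h>

int main()
{
    // Open the directory itself; FILE_FLAG_BACKUP_SEMANTICS is required
    // to obtain a directory handle.
    HANDLE hDir = CreateFile(_T("C:\\MyNewFolder"),
                             FILE_LIST_DIRECTORY,
                             FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                             NULL, OPEN_EXISTING,
                             FILE_FLAG_BACKUP_SEMANTICS, NULL);
    if (hDir == INVALID_HANDLE_VALUE)
        return 1;

    // A DWORD array keeps the buffer DWORD-aligned, as the API requires.
    DWORD buffer[1024];
    DWORD bytesReturned;

    // Synchronous call: blocks until something changes, then reports what
    // happened (FILE_ACTION_MODIFIED, FILE_ACTION_ADDED, ...).
    while (ReadDirectoryChangesW(hDir, buffer, sizeof(buffer), FALSE,
                                 FILE_NOTIFY_CHANGE_LAST_WRITE | FILE_NOTIFY_CHANGE_FILE_NAME,
                                 &bytesReturned, NULL, NULL))
    {
        FILE_NOTIFY_INFORMATION* info = (FILE_NOTIFY_INFORMATION*)buffer;
        for (;;)
        {
            // FileName is not null-terminated; FileNameLength is in bytes.
            wprintf(L"action %lu on %.*s\n", (unsigned long)info->Action,
                    (int)(info->FileNameLength / sizeof(WCHAR)), info->FileName);
            if (info->NextEntryOffset == 0)
                break;
            info = (FILE_NOTIFY_INFORMATION*)((BYTE*)info + info->NextEntryOffset);
        }
    }
    CloseHandle(hDir);
    return 0;
}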
I have a Fortran code which needs to read a series of ASCII data files (which all together are about 25 GB). Basically the code opens a given ASCII file, reads the information, uses it to do some operations, and then closes it. Then it opens another file, reads the information, does some operations, and closes it again. And so on with the rest of the ASCII files.
Overall each complete run takes about 10h. I usually need to run several independent calculations with different parameters, and the way I do is to run each independent calculation sequentially, so that at the end if I have 10 independent calculations, the total CPU time is 100h.
A more rapid way would be to run the 10 independent calculations at the same time using different processors on a cluster machine, but the problem is that if a given calculation needs to open and read data from a given ASCII file which has already been opened and is being used by another calculation, then the code obviously gives an error.
I wonder whether there is a way to verify if a given ASCII file is already being used by another calculation, and if so, to ask the code to wait until the ASCII file is finally closed.
Any help would be greatly appreciated.
Many thanks in advance.
Obamakoak.
Two processes should be able to read the same file. Perhaps action="read" on the open statement might help. Must the files be human readable? The I/O would very likely be much faster with unformatted (sometimes called binary) files.
P.S. If your OS doesn't support multiple-read access, you might have to create your own lock system: create a master file that a process opens to check which files are in use and to update said list, closing it immediately after each check or update. To handle collisions on this read/write file, use iostat on the open statement and retry after a delay if there is an error.
I know this is an old thread but I've been struggling with the same issue for my own code.
My first attempt was creating a variable on a certain process (e.g. the master) and accessing this variable exclusively using one-sided passive MPI. This is fancy and works well, but only with newer versions of MPI.
Also, my code seemed happy to open (with READWRITE status) files that were also open in other processes.
Therefore, the easiest workaround, if your program has file access, is to make use of an external lock file, as described here. In your case, the code might look something like this:
A process checks whether the lock file exists by opening it with STATUS='NEW', which fails if the file already exists. It will look something like:
file_exists = .true.
do while (file_exists)
    open(STATUS='NEW', unit=11, file=lock_file_name, iostat=open_stat)
    if (open_stat .eq. 0) then
        file_exists = .false.
        open(STATUS='OLD', ACTION='READWRITE', unit=12, file=data_file_name, iostat=ierr)
        if (ierr .ne. 0) stop
    else
        call sleep(1)
    end if
end do
The file is now opened exclusively by the current process. Do the operations you need to do, such as reading and writing.
When you are done, close the data file and finally the lock file:
close(12,iostat=ierr)
if (ierr.ne.0) stop
close(11,status='DELETE',iostat=ierr)
if (ierr.ne.0) stop
The data file is now again unlocked for the other processes.
I hope this may be useful for other people who have the same problem.
How do I monitor an RTF file to check whether it has been updated for a while (let's say 15 minutes)? If it is not being updated, I want to let the main thread know that the file has not been updated. I am thinking of using the WaitForSingleObject function to wait for any changes in the last 15 minutes. How can I implement this functionality?
I believe what you are looking for is file change notifications. With FindFirstChangeNotification, FindNextChangeNotification, and ReadDirectoryChangesW you can monitor a file or directory for changes, renames, writes, and so on.
Presumably your platform is Windows since you mention WaitForSingleObject. In which case the function you are looking for is ReadDirectoryChangesW. This will allow you to be notified as soon as changes are made, without you performing any polling.
Jim Beveridge has an excellent pair of articles that go into some depth:
http://qualapps.blogspot.com/2010/05/understanding-readdirectorychangesw.html
http://qualapps.blogspot.com/2010/05/understanding-readdirectorychangesw_19.html
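For the "15 minutes without a change" part of the question, a sketch built on the change-notification handle and WaitForSingleObject's timeout parameter might look like this (note these APIs watch a directory, not a single file):

#include <windows.h>
#include <tchar.h>

// Watches a directory; returns true once no change has been seen for
// 15 minutes, so the caller can report the file as stale.
bool WaitUntilStale(LPCTSTR directory)
{
    HANDLE hChange = FindFirstChangeNotification(
        directory, FALSE, FILE_NOTIFY_CHANGE_LAST_WRITE);
    if (hChange == INVALID_HANDLE_VALUE)
        return false;

    const DWORD fifteenMinutes = 15 * 60 * 1000;
    for (;;)
    {
        DWORD wait = WaitForSingleObject(hChange, fifteenMinutes);
        if (wait == WAIT_TIMEOUT)
        {
            // No change arrived within 15 minutes.
            FindCloseChangeNotification(hChange);
            return true;
        }
        if (wait != WAIT_OBJECT_0 || !FindNextChangeNotification(hChange))
        {
            FindCloseChangeNotification(hChange);
            return false;
        }
        // A change arrived; the 15-minute window restarts on the next wait.
    }
}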
You can stat() the file, check its modification date and act appropriately.
You can also periodically compute a checksum of the file and compare it to the previous one.
For RTF files you can also take the size of the file and compare it to the previous size; if it's been modified it's very likely the size will be different.
All those methods will probably introduce more overhead than the system calls mentioned by others.
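For completeness, the stat()-style polling approach translated to Win32 might look like this sketch (the caller supplies the path and keeps the previous timestamp; the same structure also carries the file size, for the size-comparison variant):

#include <windows.h>
#include <tchar.h>

// Polls the file's last-write time; returns true if it changed since the
// previous call. 'previous' is caller-owned state.
bool HasFileChanged(LPCTSTR path, FILETIME* previous)
{
    WIN32_FILE_ATTRIBUTE_DATA data;
    if (!GetFileAttributesEx(path, GetFileExInfoStandard, &data))
        return false; // treat "unreadable" as "unchanged" in this sketch

    bool changed = CompareFileTime(&data.ftLastWriteTime, previous) != 0;
    *previous = data.ftLastWriteTime;
    return changed;
}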
In my opinion, you can achieve this in two ways. You could write a file system filter driver that monitors write operations on the file; however, that is a bit of a stretch.
The other way is simpler. In your main thread, create a hash of your RTF file and cache it. Create an event in the non-signaled state, a callback function, and a worker thread. In the worker thread, wait on the event for 15 minutes. After the timeout, generate the hash of your file again and compare it with the cached hash. If the hash is unchanged, the file has not been updated, so notify your main thread through the callback function.
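A minimal sketch of that design, under some assumptions: NotifyMainThread and g_stopEvent are hypothetical names for your own plumbing, and FNV-1a stands in for whatever hash you prefer.

#include <windows.h>
#include <stdio.h>

// Hypothetical: however your program signals its main thread.
extern void NotifyMainThread(void);
// Hypothetical: event signaled by the main thread to stop the worker.
extern HANDLE g_stopEvent;

// Simple FNV-1a over the file bytes. Reading byte by byte is slow, but it
// keeps the sketch short.
static unsigned long long HashFile(const wchar_t* path)
{
    unsigned long long h = 14695981039346656037ULL;
    FILE* f = _wfopen(path, L"rb");
    if (!f) return 0;
    int c;
    while ((c = fgetc(f)) != EOF)
        h = (h ^ (unsigned long long)(unsigned char)c) * 1099511628211ULL;
    fclose(f);
    return h;
}

static DWORD WINAPI Worker(LPVOID param)
{
    const wchar_t* path = (const wchar_t*)param;
    unsigned long long cached = HashFile(path);
    for (;;)
    {
        // Wait 15 minutes, or return early if shutdown is requested.
        if (WaitForSingleObject(g_stopEvent, 15 * 60 * 1000) == WAIT_OBJECT_0)
            return 0;
        unsigned long long current = HashFile(path);
        if (current == cached)
            NotifyMainThread();   // hash unchanged: file was not updated
        cached = current;
    }
}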
I'm watching the config files of my NodeJS server on Ubuntu using:
for (var index in cfgFiles) {
    fs.watch(cfgFiles[index], function(event, fileName) {
        logger.info("======> EVENT: " + event);
        updateConfigData(fileName);
    });
}
So whenever I save a config file, the "change" event is received at least twice by the handler function for the same file name, causing updateConfigData() to be executed multiple times. I experienced the same behavior when watching config files using C++/inotify.
Does anyone have a clue what causes this behavior?
Short Answer: It is not Node; the file really is changed twice.
Long Answer
I have a very similar approach that I use for my development setup. My manager process watches all JS source files if it is a development machine and restarts children on the cluster.
I had not paid any attention to this since it was just a development setup; but after I read your question, I gave it a look and realized that I have the same behavior.
I edit files on my local computer and my editor updates them over sftp whenever I save. At every save, the change event on the file is triggered twice.
I checked listeners('change') for the FSWatcher object that is returned by the fs.watch call, but it shows my event handler only once.
Then I did the test I should have done first: "touch file.js" on the server, and it triggered only once. So, for me, it was not Node; the file was really changed twice. When a file is opened for writing (instead of appending), it probably triggers a change event because it empties the content. Then when the new content is written, it triggers the event a second time.
This does not cause any problem for me; but if you want to prevent it, you can do an odd-even check in your event handler by keeping a call count for each file and only acting on even-numbered calls.
See my response to a similar question which explains that the problem is being caused by your editor making multiple edits to the file on save.
Is there a way to delete a file under Windows XP (NTFS filesystem) even if there is a lock on that file?
I'm having issues with other processes, e.g. virus scanners, locking files I want to move or delete.
Thanks for any hints!
MoveFileEx allows you to pass the MOVEFILE_DELAY_UNTIL_REBOOT flag, which will cause the file to be moved/deleted when you next reboot. Other than that, you'd have to find and kill whichever other process(es) currently have the file locked, which may not be possible, and is almost certainly not desirable behaviour for most programs.
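A minimal sketch of the reboot-time deletion: a NULL new name combined with MOVEFILE_DELAY_UNTIL_REBOOT means "delete". The operation is recorded in the registry and performed by the session manager at boot, which is why the call needs administrative rights.

#include <windows.h>

bool DeleteOnReboot(LPCTSTR path)
{
    // NULL new name + MOVEFILE_DELAY_UNTIL_REBOOT schedules a delete.
    return MoveFileEx(path, NULL, MOVEFILE_DELAY_UNTIL_REBOOT) != 0;
}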
If the file is locked when you try to delete it then the deletion will fail. If you need the file to be deleted, then you need whatever is locking it to release the lock.
That's really all there is to it. There are no shortcuts here.
If I recall correctly, there's a Microsoft program called Open Handles that you can download which will tell you what process is locking a particular file. Then you just kill that process and it unlocks the file so that you can delete it. This doesn't work if the file is locked by a core operating system process, but should work fine if it's locked by a virus scanner.
I guess if you're trying to do this programmatically rather than manually, you'll need to get your program to invoke oh.exe and process its output accordingly. Then kill the relevant process using the Windows API (to the best of my knowledge, TerminateProcess is the appropriate function) and try deleting the file again.
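Sketching that last step under the stated assumptions; the pid would come from parsing oh.exe's output, which is not shown here. TerminateProcess gives the target no chance to clean up, so treat this as a blunt instrument.

#include <windows.h>

bool KillOwnerAndDelete(DWORD pid, LPCTSTR path)
{
    HANDLE hProc = OpenProcess(PROCESS_TERMINATE | SYNCHRONIZE, FALSE, pid);
    if (hProc == NULL)
        return false;
    BOOL killed = TerminateProcess(hProc, 1);
    if (killed)
        WaitForSingleObject(hProc, 5000); // let it actually exit
    CloseHandle(hProc);
    return killed && DeleteFile(path) != 0;
}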
If you absolutely need to delete the file before proceeding, you may do the following:
#include <stdio.h>
#include <windows.h> // for Sleep()
...
// Error deleting the file: wait a little before trying again.
while (remove("myfile.txt") != 0)
    Sleep(100);
After the loop you can be absolutely sure that the file has been successfully deleted.
You may use some "attempts counter" to exit the loop so you don't wait forever ;)
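A sketch of the loop with such a counter; the 50 attempts and 100 ms delay are arbitrary choices:

#include <stdio.h>
#include <windows.h>

bool DeleteWithRetry(const char* path)
{
    for (int attempt = 0; attempt < 50; ++attempt)
    {
        if (remove(path) == 0)
            return true;   // deleted successfully
        Sleep(100);        // still locked; wait a little and try again
    }
    return false;          // gave up; the file is probably still locked
}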
I have this tool in which a single log-like file is written to by several processes.
What I want to achieve is to have the file truncated when it is first opened, and then have all writes done at the end by the several processes that have it open.
All writes are systematically flushed and mutex-protected so that I don't get jumbled output.
First, a process creates the file, then starts a sequence of other processes, one at a time, that then open the file and write to it (the master sometimes chimes in with additional content; the slave process may or may not be open and writing something).
I'd like, as much as possible, not to use more IPC than what already exists (all I'm doing now is writing to a popen-created pipe). I have no access to external libraries other than the CRT and Win32 API, and I would like not to start writing serialization code.
Here is some code that shows where I've gone:
// open the file: truncate it if we're the 'master', append to it if we're a 'slave'
std::ofstream blah(filename, std::ios::out | (isClient ? std::ios::app : std::ios::openmode(0)));
// do stuff...
// write stuff
myMutex.acquire();
blah << "stuff to write" << std::flush;
myMutex.release();
Well, this does not work: although the output of the slave processes is ordered as expected, what the master writes is either bunched together or in the wrong place, when it appears at all.
I have two questions: is the flag combination given to the ofstream's constructor the right one? And am I going about this the right way?
If you'll be writing a lot of data to the log from multiple threads, you'll need to rethink the design, since all threads will block on trying to acquire the mutex, and in general you don't want your threads blocked from doing work so they can log. In that case, you'd want your worker threads to log entries to a queue (which just requires moving stuff around in memory), and have a dedicated thread pull entries off the queue and write them to the output. That way your worker threads are blocked for as short a time as possible.
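A sketch of that queue-and-writer design using standard C++ threads (the original question is about multiple processes, where a queue like this would live inside a single dedicated logging process):

#include <condition_variable>
#include <fstream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

// Workers only pay for a brief lock and a string move; the writer thread
// owns all file I/O.
class AsyncLog
{
public:
    explicit AsyncLog(const std::string& filename)
        : out_(filename, std::ios::out), done_(false),
          writer_(&AsyncLog::run, this) {}

    ~AsyncLog()
    {
        { std::lock_guard<std::mutex> lk(m_); done_ = true; }
        cv_.notify_one();
        writer_.join();
    }

    void log(std::string line)
    {
        { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(line)); }
        cv_.notify_one();
    }

private:
    void run()
    {
        std::unique_lock<std::mutex> lk(m_);
        for (;;)
        {
            cv_.wait(lk, [this] { return done_ || !q_.empty(); });
            while (!q_.empty())
            {
                std::string line = std::move(q_.front());
                q_.pop();
                lk.unlock();            // write without holding the lock
                out_ << line << '\n';
                out_.flush();
                lk.lock();
            }
            if (done_) return;          // queue drained, shutdown requested
        }
    }

    std::ofstream out_;
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::string> q_;
    bool done_;
    std::thread writer_;
};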
You can do even better than this by using async I/O, but that gets a bit more tricky.
As suggested by reinier, the problem was not in the way I use the files but in the way the programs behave.
The fstreams do just fine.
What I missed was the synchronization between the master and the slave (the former was assuming a particular operation was synchronous when it was not).
Edit: Oh well, there still was a problem with the open flags. The process that opened the file with ios::out did not move the file pointer as needed (erasing text other processes were writing), and using seekp() completely screwed up the output when writing to cout, as another part of the code uses cerr.
My final solution is to keep the mutex and the flush and, for the master process, open the file in ios::out mode (to create or truncate the file), close it, and reopen it using ios::app.
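In code, the master's sequence might look like this sketch, reusing the filename variable from the question:

#include <fstream>

// Master: truncate (or create) the file, then reopen for appending so the
// file pointer behaves like the slaves' from here on.
std::ofstream blah(filename, std::ios::out | std::ios::trunc);
blah.close();
blah.open(filename, std::ios::app);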
I made a little log system that has its own process and handles the writing; the idea is quite simple. The processes that use the log just send entries to a pending queue, which the log process then writes to a file. It's like batch processing in any realtime rendering app. This way you'll get rid of too many open/close file operations. If I can, I'll add the sample code.
How do you create that mutex?
For this to work, it needs to be a named mutex, so that both processes actually lock on the same thing.
You can check that your mutex is actually working correctly with a small piece of code that locks it in one process while another process tries to acquire it.
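A sketch of such a named mutex with the raw Win32 calls; the name is an arbitrary choice, and every cooperating process must use the same one:

#include <windows.h>
#include <tchar.h>

// CreateMutex succeeds even if the mutex already exists; in that case
// GetLastError() returns ERROR_ALREADY_EXISTS and the handle refers to the
// one shared object. Without a name, each process gets its own private
// mutex and the "locking" protects nothing.
HANDLE myMutex = CreateMutex(NULL, FALSE, _T("MyToolLogMutex"));

WaitForSingleObject(myMutex, INFINITE);   // acquire
// ... write and flush ...
ReleaseMutex(myMutex);                    // release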
I suggest blocking such that the text is completely written to the file before releasing the mutex. I've had instances where the text from one task was interrupted by text from a higher-priority thread; it doesn't look very pretty.
Also, put the output into comma-separated format, or some format that can be easily loaded into a spreadsheet. Include the thread ID and a timestamp. The interlacing of the text lines shows how the threads are interacting; the ID column allows you to sort by thread, and timestamps can be used to show sequential access as well as duration. Writing in a spreadsheet-friendly format will let you analyze the log file with an external tool without writing any conversion utilities. This has helped me greatly.
One option is to use ACE::logging. It has an efficient implementation of concurrent logging.