how to monitor a file for changes? - c++

How do I monitor an RTF file to check whether it has been updated within some period (let's say 15 minutes)? If it has not been updated, the main thread should be told that the file is stale. I am thinking of using the WaitForSingleObject function to wait for any changes in the last 15 minutes. How can I implement this functionality?

I believe what you are looking for is file change notifications. With functions such as FindFirstChangeNotification, FindNextChangeNotification, and ReadDirectoryChangesW you can monitor a file or directory for changes, renames, writes, and so on.

Presumably your platform is Windows, since you mention WaitForSingleObject. In that case the function you are looking for is ReadDirectoryChangesW. It will notify you as soon as changes are made, without any polling on your part.
Jim Beveridge has an excellent pair of articles that go into some depth:
http://qualapps.blogspot.com/2010/05/understanding-readdirectorychangesw.html
http://qualapps.blogspot.com/2010/05/understanding-readdirectorychangesw_19.html

You can stat() the file, check its modification date, and act accordingly.
You can also periodically compute a checksum of the file and compare it to the previous one.
For RTF files you can also take the size of the file and compare it to the previous size; if the file has been modified, the size is very likely to differ.
All of those methods will probably introduce more overhead than the system calls mentioned in the other answers.

In my opinion, you can achieve this in two ways. You can write a file system filter driver that monitors write operations on the file; however, that is a bit of a stretch.
The other way is simpler. In your main thread, compute a hash of your RTF file and cache it. Create an event in the non-signaled state, a callback function, and a worker thread. In the worker thread, wait on the event for 15 minutes. After the timeout, generate the hash of the file again and compare it with the cached hash. If they differ, notify your main thread through the callback function.

Related

update a file simultaneously without locking file

Problem: multiple processes want to update a file simultaneously. I do not want to use file locking, because in a highly loaded environment a process may block for a while, which I want to avoid. I want something like this: all processes send their data to a queue or some shared place, and one master process keeps taking data from there and writing it to the file, so that no process gets blocked.
One possibility is socket programming: all processes send data to a single port, and the master keeps listening on that port and stores the data to the file. But what if the master goes down for a few seconds? If that happens, I could write to some file based on a timestamp and sync later, but I am putting that on hold and looking for some other solution (no data loss).
Another possibility might be taking a lock on the particular segment of the file the process wants to write to. Basically each process writes one line. I am not sure how well that would work on a highly loaded system.
Please suggest some solution for this problem.
Have a 0mq instance handle the writes (as you initially proposed for the socket) and have the workers connect to it and add their writes to the queue (examples exist in many languages).
Each process can write to its own file (pid.temp) and periodically rename it (pid-0.data, pid-1.data, ...) so that the master process can grab all of these files.
You may not need to construct something like this. If you do not want processes to get blocked, just use the LOCK_NB flag of Perl's flock. Periodically try to flock; if it does not succeed, continue processing and store the values in an array. Once the file can be locked, write the accumulated data to it from the array.

FindFirstChangeNotification is notifying about changes twice

I want to monitor a folder in my file system. Let's say I want to monitor the folder C:\MyNewFolder.
I have this code to do it:
HANDLE ChangeHandle = FindFirstChangeNotification(_T("C:\\MyNewFolder"), FALSE, FILE_NOTIFY_CHANGE_LAST_WRITE);
for (;;)
{
    DWORD Wait = WaitForSingleObject(ChangeHandle, INFINITE);
    if (Wait == WAIT_OBJECT_0)
    {
        MessageBox(NULL, _T("Change"), _T("Change"), MB_OK);
        FindNextChangeNotification(ChangeHandle);
    }
    else
    {
        break;
    }
}
I want a message box that notifies me about any file change in my folder. The code works fine, but I have one problem: I get two notifications for each change. What is wrong with my code?
Thanks.
This is entirely normal. A change to a file usually involves a change to the file data as well as a change to the directory entry. Metadata properties like the file length and the last write date are stored there. So you'll get a notification for both. ReadDirectoryChangesW() doesn't otherwise distinguish between the two.
This is not different from a process making multiple changes to the same file. Be sure to be able to handle both conditions. This usually involves a timer so you don't go overboard with the number of operations you perform on a notification. Such a timer is also often required because the process that is changing the file still has a lock on it that prevents you from doing anything with the file. Until the process closes the file, an indeterminate amount of time later.
What you're probably seeing is multiple changes to the one file (e.g. a file being created, and then written to, or a file being written to multiple times, etc). Unfortunately FindFirstChangeNotification doesn't tell you what has actually happened.
You're better off using ReadDirectoryChangesW for file notification as it will actually tell you what has changed.

Linux: application responsiveness and select()

I have a C++ console app that uses open() [O_RDWR | O_NONBLOCK], write(), select(), read() and close() to work with device file. Also ioctl() can be called to cancel current operation. At any given time only one user can work with device.
I need to come up with C++ class having libsigc++ signals that get fired when data is available from device.
The problem: when calling select() application becomes unresponsive as it waits for the data. How to make it responsive - by calling select() in worker thread? If so - how will worker thread communicate with main thread? Maybe I should look into boost::asio?
How to make it responsive - by calling select() in worker thread
You can use dup(); this will duplicate your file descriptor, so you can move the entire read operation into another thread. Your write thread and processing thread will then remain responsive even while the read [select()] thread is sleeping.
The signal-emitting overhead of libsigc++ is minimal, so I think you can embed the code inside the read thread itself. The slots can live in a different thread; that is where you will receive your signals.
I think the Thrift source code (entirely boost-based) might be of interest, though Thrift does not use libsigc++.
It sounds as though you've misunderstood select; the purpose of select (or poll, epoll, etc) is not "wait for data" but "wait for one or more events to occur on a series of file descriptors or a timer, or a signal to be raised".
What "responsiveness" is going missing while you're in your select call? You said it's a console app, so you're not talking about a GUI loop; presumably it is IO related? If so, then you need to refactor your select so that waiting for the data you're talking about is just one element; that is, build FD_SETs of ALL the file/socket descriptors (and stdin and stdout are file descriptors) whose input you want to wait on.
Or build a loop that periodically calls select with a short timeout to test for any pending input, and only read when select tells you there is something to read.
It sounds like you have a producer-consumer style problem. There are various way to implement a solution to this problem, but most folks these days tend to use condition variable based approaches (see this C++11 based example).
There are also a number of design patterns that when implemented can help alleviate your concurrency problem, such as:
Half-Sync / Half-Async
A producer-consumer style pattern that introduces a queue between an asynchronous layer that fills the queue with events, and a synchronous layer that processes those events.
Leader / Followers
Multiple threads take turns handling events
A related discussion is available here.

Writing concurrently to a file

I have this tool in which a single log-like file is written to by several processes.
What I want to achieve is to have the file truncated when it is first opened, and then have all writes done at the end by the several processes that have it open.
All writes are systematically flushed and mutex-protected so that I don't get jumbled output.
First, a process creates the file, then starts a sequence of other processes, one at a time, that then open the file and write to it (the master sometimes chimes in with additional content; the slave process may or may not be open and writing something).
I'd like, as much as possible, not to use more IPC than what already exists (all I'm doing now is writing to a popen-created pipe). I have no access to external libraries other than the CRT and the Win32 API, and I would like to avoid writing serialization code.
Here is some code that shows where I've gone:
// open the file: truncate it if we're the 'master', append to it if we're a 'slave'
std::ofstream blah(filename, std::ios::out | (isClient ? std::ios::app : std::ios::openmode(0)));
// do stuff...
// write stuff
myMutex.acquire();
blah << "stuff to write" << std::flush;
myMutex.release();
Well, this does not work: although the output of the slave process is ordered as expected, what the master writes is either bunched together or at the wrong place, when it exists at all.
I have two questions: is the flag combination given to the ofstream constructor the right one? And am I going about this the right way at all?
If you'll be writing a lot of data to the log from multiple threads, you'll need to rethink the design, since all threads will block on trying to acquire the mutex, and in general you don't want your threads blocked from doing work so they can log. In that case, you'd want to write your worker thread to log entries to queue (which just requires moving stuff around in memory), and have a dedicated thread to pull entries off the queue and write them to the output. That way your worker threads are blocked for as short a time as possible.
You can do even better than this by using async I/O, but that gets a bit more tricky.
As suggested by reinier, the problem was not in the way I use the files but in the way the programs behave.
The fstreams do just fine.
What I missed out is the synchronization between the master and the slave (the former was assuming a particular operation was synchronous where it was not).
edit: Oh well, there was still a problem with the open flags. The process that opened the file with ios::out did not move the file pointer as needed (it was erasing text other processes were writing), and using seekp() completely garbled the output when writing to cout, because another part of the code uses cerr.
My final solution is to keep the mutex and the flush, and, for the master process, open the file in ios::out mode (to create or truncate the file), close it and reopen it using ios::app.
I made a little log system that has its own process to handle the writing. The idea is quite simple: the processes that use the logs just send entries to a pending queue, which the log process then writes to a file. It's like batch processing in any realtime rendering app. This way you get rid of excessive open/close file operations. If I can, I'll add some sample code.
How do you create that mutex?
For this to work this needs to be a named mutex so that both processes actually lock on the same thing.
You can check that your mutex is working correctly with a small piece of code that locks it in one process while another process tries to acquire it.
I suggest blocking so that the text is completely written to the file before releasing the mutex. I've had cases where the text from one task was interrupted by text from a higher-priority thread; it doesn't look very pretty.
Also, put the output into comma-separated format, or some other format that can be easily loaded into a spreadsheet. Include a thread ID and a timestamp. The interlacing of the text lines shows how the threads are interacting; the ID lets you sort by thread, and the timestamps show sequential access as well as duration. Writing in a spreadsheet-friendly format lets you analyze the log file with an external tool without writing any conversion utilities. This has helped me greatly.
One option is to use ACE::logging. It has an efficient implementation of concurrent logging.

Setting a timeout on ifstream in C++?

We're trying to read data from 2 usb mice connected to a linux box (this data is used for odometry/localization on a robot). So we need to continuously read from each mouse how much it moved. The problem is that when a mouse is not moving, it doesn't send any data, so the file stream from which we get the data blocks execution and therefore the program can't do the odometry calculations (which involve time measurement for speed).
Is there a way to set a timeout on the input stream (we're using ifstream in C++ and reading from /dev/input/mouse), so that we're able to know when the mouse isn't moving instead of waiting for an event to be received? Or do we need to mess with threads (arggh...)? Any other suggestions are welcome!
Thanks in advance!
A common way to read from multiple file descriptors in linux is to use select(). I suggest starting with the manpage. The basic system flow is as follows:
1) Initialize devices
2) Obtain list of device file descriptors
3) Setup the time out
4) Call select with file descriptors and timeout as parameters - it will block until there is data on one of the file descriptors or the time out is reached
5) Determine why select returned and act accordingly (i.e. call read() on the file descriptor that has data). You may need to internally buffer the result of read until an entire datagram is obtained.
6) loop back to 4.
This can become your program's main loop. If you already have a different main loop, you can run the above without looping, but you will need to ensure the function is called frequently enough that you do not lose data on the devices. You should also ensure that your update rate (i.e. 1/timeout) is fast enough for your primary task.
select can operate on any file descriptor, such as network sockets and anything else that exposes an interface through a file descriptor.
What you're looking for is an asynchronous way to read from an ifstream, like socket communication. The only thing that might help is the readsome function; perhaps it returns when no data is available, but I doubt it helps here.
Using threads would be the best way to handle this.
Take a look at the boost Asio library. This might help you deal with the threading suggested by schnaeder.
No, there is no such method. You'll have to wait for an event, create a custom Timer class and re-poll on a timeout, or use threads.