I am not sure if this is even a valid question, as I am no expert on how the operating system works. One of my programs writes logs to a text file. Another program, an emailer, runs on a scheduler; it emails and archives the log file if it finds one in the folder.
My question is: if the first program is writing to the file at the very moment the email scheduler runs, what will happen? Will the email program be able to mail the file and archive it? If yes, will the program writing the file crash? How can I handle this scenario without crashing either program?
No matter what, your setup will lead to some kind of trouble.
I think the simplest solution would be to have the program that writes the log file do the following shortly (e.g. 5 minutes) before the emailer/archiver is scheduled to run:
start a new file for logging
copy or rename the old file to the file that the emailer/archiver uses.
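A minimal sketch of those two steps in Python, assuming the writer owns the log file and both paths are on the same filesystem. The class name RotatingLog and the two file names are made up for illustration; they are not from the question.

```python
import os


class RotatingLog:
    """Append-only log whose writer hands finished files to another program."""

    def __init__(self, active_path, handoff_path):
        self.active_path = active_path    # file the writer appends to
        self.handoff_path = handoff_path  # file the emailer/archiver picks up
        self._file = open(active_path, "a", encoding="utf-8")

    def write(self, line):
        self._file.write(line + "\n")
        self._file.flush()

    def rotate(self):
        """Close the current file, move it to the hand-off name, start a new one."""
        self._file.close()
        # Assumption: the emailer has already consumed any previous hand-off
        # file, because os.replace overwrites an existing destination.
        os.replace(self.active_path, self.handoff_path)
        self._file = open(self.active_path, "a", encoding="utf-8")

    def close(self):
        self._file.close()
```

The writer would call rotate() on its own timer, a few minutes ahead of the scheduler, so the emailer only ever sees a file that nobody is writing to.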
Related
I have a program that reads files (given to it by the user) from the computer and performs operations on these files. However, the program isn't working. I input a valid file with a valid path and the program says it is reading this valid file, however, it doesn't find the files. I have verified that the method I use to read the files works.
So, this prompts my question. Is it possible for a C++ program to track what files are being read by a specific program, and tell me the path it is trying to read?
For Linux, the strace utility is the answer (as mentioned by Peter in a comment). You probably have it installed already, so just run strace your_program_name and you will see all the system calls the program makes, along with their arguments and return codes. Focus on the open calls (on current systems these usually show up as openat).
I need to check if a file is currently opened by another process, e.g. a text editor (but needs to apply to everything else too).
I tried using std::ofstream::is_open() etc., but this did not work. I could open the file in my text editor while my program was checking if it was open. The program saw it as a closed file and went on. Only if I opened it as another ofstream would this work.
I'm using the filesystem library to copy files and they may only be copied (and later removed) if the file is not currently written to by another process on the client server.
Really curious about this one. Been wondering this for quite some time but never found a good way for it myself.
I'm currently making a program that needs to run on both Linux and Windows. Every 5 seconds it copies all files from directories a, b, c, d to x (the client can configure this in rules). After everything has been copied, the source files may be removed. After a day (or whatever interval the client sets), all the files in x need to be zipped and archived at location y. Hence the problem: a file may only be copied and deleted if the other programs that place files in directories a, b, c, d are not touching that specific file right now. I hope that makes the question clearer.
And before anybody starts. Yes I know about the data race condition. I do not care about this for now. The program does absolutely nothing with the contents of a file. And after a file is closed by the other process, it will be closed forever.
I need to check if a file is currently opened by another process
This is heavily operating-system specific (and the answer might be useless, because it can be out of date by the time you act on it).
So first read a good textbook on operating systems.
On Linux specifically you might use inotify(7) facilities, or /proc/ pseudo-file system (see proc(5)), or perhaps lsof(8). They work only for local file systems (not remote ones, like NFS). See also Advanced Linux Programming and syscalls(2).
And you could have surprises (e.g. another process may get scheduled and remove the file so quickly that you have no time to do anything with it).
For Windows, take the time to read its documentation.
I'm currently making a program that needs to be able to run on both linux and windows. every 5 seconds it copies all files from directory a,b,c,d to x.
You might look, at least for inspiration, inside the source code of rsync.
I don't understand what your actual problem is, but rsync might be part of the solution and is rumored to run on both Windows and Linux
My Java application reads data from a directory and puts it into a shared queue. The consumer takes each event, validates it and saves it to the database. I want to process each file in the directory only once: even if the application is restarted, it should not process a file again but resume from the file where it stopped. Can anyone help me out with this?
Do you have any code that we could look at? That way we can see exactly what you need.
And with what you have you might want to look into how the program can "save progress". You might be able to do something there.
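One way to "save progress" is to keep a small ledger of file names that have already been handled and to reload it at startup. Here is a rough sketch of that idea, written in Python rather than Java just to show the shape. The function names and the JSON ledger are my own assumptions, and the ledger file must live outside the watched directory.

```python
import json
import os


def load_done(ledger_path):
    """Return the set of file names recorded as processed (empty on first run)."""
    if not os.path.exists(ledger_path):
        return set()
    with open(ledger_path, encoding="utf-8") as f:
        return set(json.load(f))


def save_done(ledger_path, done):
    """Write the ledger to a temporary file, then swap it into place."""
    tmp = ledger_path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(sorted(done), f)
    os.replace(tmp, ledger_path)


def process_new_files(directory, ledger_path, handle):
    """Call handle(path) once for every file not yet listed in the ledger."""
    done = load_done(ledger_path)
    processed = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name in done or not os.path.isfile(path):
            continue
        handle(path)
        done.add(name)
        save_done(ledger_path, done)  # persist after each file, not at the end
        processed.append(name)
    return processed
```

If the application dies between handle() and save_done(), that one file is handled again after the restart, so the database insert should tolerate a repeat.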
We have files such as zip and txt files on a Windows SFTP server, and we use Informatica for our ETL jobs. Our concern is that the vendors drop files on the SFTP server at random times, and the files are of different sizes. How can we detect whether a file transfer is complete or not?
Unfortunately, you can't. When your client is asking the FTP server for a list of files, it receives the current state on the server. There's no way of telling if one of the files is currently being written to or not.
So the only way here is to work out some kind of protocol with the vendor. You would need to work with some kind of lock file. If the vendor is writing to the file, they first have to create the lock file. Same goes for your ETL job when reading the file, you would first have to create the lock file and the vendor is not allowed to start a new file writing process until the lock is removed. You get the idea.
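To make the lock file idea concrete, here is a small sketch for a local filesystem. It assumes both sides agree on the lock file's path; the function names are hypothetical. Over SFTP the vendor's client would need an equivalent "create only if it does not exist" operation, which this sketch does not cover.

```python
import os


def try_acquire_lock(lock_path):
    """Return True if we created the lock file, False if someone else holds it."""
    try:
        # O_CREAT | O_EXCL makes the call fail when the file already exists,
        # so the existence check and the creation happen as a single step.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))  # record the owner, handy for debugging
    return True


def release_lock(lock_path):
    """Remove the lock file so the other side may proceed."""
    os.remove(lock_path)
```

Whoever gets True does its reading or writing and then calls release_lock(); whoever gets False waits and retries later.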
It all depends on how foolproof the solution needs to be. Another option is to let the vendor write to a temporary file first. Only when the upload process is finished do they rename the file to its final name.
Writing to a temp file and renaming it once the write is complete is one option, as Socken23 indicated. Others might be:
create empty .ready file once write to target file is done. Your process should then check for the existence of .ready file and read data from the other one.
perform a sequence of file size checks on FTP before starting the ETL. E.g. check file size, wait one minute, check again, wait & check third time. If all sizes match, you may assume the write is complete.
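The size-check option could look roughly like this. It is only a heuristic, and the sketch assumes the file is reachable through a local path; against a real SFTP server you would ask your client library for the remote size instead. The function name and its parameters are made up for the example.

```python
import os
import time


def size_is_stable(path, checks=3, interval=60.0, sleep=time.sleep):
    """Return True if the size is identical across `checks` consecutive reads."""
    last = os.path.getsize(path)
    for _ in range(checks - 1):
        sleep(interval)  # give a slow upload time to make progress
        current = os.path.getsize(path)
        if current != last:
            return False  # still growing (or truncated): not finished yet
        last = current
    return True
```

A stalled upload looks exactly like a finished one to this check, which is why the .ready file or rename approaches are more reliable when the vendor can cooperate.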
How can I check if a file is still being written? I need to wait for a file to be created, written and closed again by another process, so I can go on and open it again in my process.
In general, this is a difficult problem to solve. You can ask whether a file is open, under certain circumstances; however, if the other process is a script, it might well open and close the file multiple times. I would strongly recommend you use an advisory lock, or some other explicit method for the other process to communicate when it's done with the file.
That said, if that's not an option, there is another way. If you look in the /proc/<pid>/fd directories, where <pid> is the numeric process ID of some running process, you'll see a bunch of symlinks to the files that process has open. The permissions on the symlink reflect the mode the file was opened for - write permission means it was opened for write mode.
So, if you want to know if a file is open, just scan over every process's /proc entry, and every file descriptor in it, looking for a writable symlink to your file. If you know the PID of the other process, you can directly look at its proc entry, as well.
This has some major downsides, of course. First, you can only see open files for your own processes, unless you're root. It's also relatively slow, and only works on Linux. And again, if the other process opens and closes the file several times, you're stuck - you might end up seeing it during the closed period, and there's no easy way of knowing if it'll open it again.
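For what it's worth, the /proc scan described above might be sketched like this. It is Linux only, and the function name writers_of is my own. It silently skips processes whose fd directory it is not allowed to read, so without root it only reports your own processes.

```python
import os
import stat


def writers_of(path, proc_root="/proc"):
    """Return PIDs that hold a descriptor on `path` opened for writing."""
    target = os.path.realpath(path)
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are processes
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            link = os.path.join(fd_dir, fd)
            try:
                if os.readlink(link) != target:
                    continue
                mode = os.lstat(link).st_mode  # permissions of the symlink itself
            except OSError:
                continue  # descriptor was closed while we were looking
            if mode & stat.S_IWUSR:
                pids.append(int(entry))
                break
    return pids
```

An empty list only means nobody had the file open for writing at the instant of the scan; it says nothing about whether the writer will open it again.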
You could let the writing process write a sentinel file (say "sentinel.ok") after it is finished writing the data file your reading process is interested in. In the reading process you can check for the existence of the sentinel before reading the data file, to ensure that the data file is completely written.
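A sketch of both sides, assuming the sentinel sits next to the data file and is named by appending ".ok" to the data file's name (that naming is my assumption; any name both processes agree on works):

```python
import os


def write_with_sentinel(data_path, payload):
    """Writer side: write the data file completely, then create the sentinel."""
    with open(data_path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())
    # The empty marker is created only after the data file has been closed.
    with open(data_path + ".ok", "w"):
        pass


def read_if_ready(data_path):
    """Reader side: return the data, or None while the sentinel is missing."""
    if not os.path.exists(data_path + ".ok"):
        return None
    with open(data_path, "rb") as f:
        return f.read()
```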
@blu3bird's idea of using a sentinel file isn't bad, but it requires modifying the program that's writing the file.
Here's another possibility that also requires modifying the writer, but it may be more robust:
Write to a temporary file, say "foo.dat.part". When writing is complete, rename "foo.dat.part" to "foo.dat". That way a reader either won't see "foo.dat" at all, or will see a complete version of it.
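In Python the writer side could be sketched as follows, assuming the ".part" file and the final file are in the same directory so the rename stays on one filesystem. The function name is hypothetical.

```python
import os


def write_then_rename(final_path, payload):
    """Write to "<final_path>.part", then rename it to the final name."""
    part_path = final_path + ".part"
    with open(part_path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before publishing the name
    # os.replace swaps the name in one step, so a reader never sees half a file.
    os.replace(part_path, final_path)
```

The reader needs no changes at all: it simply ignores anything ending in ".part".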
You can try using inotify
http://en.wikipedia.org/wiki/Inotify
If you know that the file will be opened once, written and then closed, it would be possible for your app to wait for the IN_CLOSE_WRITE event.
However, if the other application behaves more like open, write, close, open, write, close, and so on, then you'll need some other mechanism for determining when it has truly finished with the file.
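For the simple "opened once, written, closed" case, waiting for IN_CLOSE_WRITE can be sketched without third-party packages by calling the C library's inotify functions through ctypes. This is Linux only; the helper name wait_for_close_write is made up, and the sketch watches the containing directory so it also works when the file does not exist yet.

```python
import ctypes
import ctypes.util
import os
import struct

IN_CLOSE_WRITE = 0x00000008            # value from <sys/inotify.h>
_EVENT_HEADER = struct.Struct("iIII")  # wd, mask, cookie, len


def wait_for_close_write(directory, filename):
    """Block until `filename` inside `directory` is closed after being written."""
    libc = ctypes.CDLL(ctypes.util.find_library("c") or "libc.so.6",
                       use_errno=True)
    fd = libc.inotify_init()
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init failed")
    try:
        wd = libc.inotify_add_watch(fd, os.fsencode(directory), IN_CLOSE_WRITE)
        if wd < 0:
            raise OSError(ctypes.get_errno(), "inotify_add_watch failed")
        while True:
            buf = os.read(fd, 4096)  # blocks until at least one event arrives
            offset = 0
            while offset < len(buf):
                _wd, mask, _cookie, name_len = _EVENT_HEADER.unpack_from(buf, offset)
                offset += _EVENT_HEADER.size
                # The name is padded with NUL bytes up to name_len.
                name = buf[offset:offset + name_len].rstrip(b"\0")
                offset += name_len
                if mask & IN_CLOSE_WRITE and os.fsdecode(name) == filename:
                    return
    finally:
        os.close(fd)
```

The watch has to be in place before the writer closes the file; a close that happened earlier is not reported.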