Background:
I have an application which plays video files from disc. The first time I play a file, the file reading sometimes lags; the second time it is played there is never any lag. I suspect this is because the file is put into the Windows file cache the first time it is played.
My application is required to play any video at any time (the same video is almost never played twice, so the cache is of little use), which makes the current problem quite critical.
In order to debug this problem I would need to disable Windows XP file caching.
Question
Is there a way to disable Windows XP file caching?
EDIT/More Info
I'm using ffmpeg and have no access to the actual file read calls. The problem can occur even if several other files have been played previously (i.e. after a warm-up).
In general, you can't just force FILE_FLAG_NO_BUFFERING. It requires aligned buffers, and typically these aren't provided. Besides, it's the wrong thing. You don't care whether Windows reads 32KB ahead.
The only thing that you'd like Windows to do is discard the file contents from the cache after you've read them. The correct flag for that is FILE_FLAG_SEQUENTIAL_SCAN. This hints to Windows that you (probably) won't seek back, so there is no reason to keep those bytes in the cache.
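If you do control the open call, a minimal sketch of passing that hint looks roughly like this (the function name and path argument are placeholders):

    #include <windows.h>

    // Sketch: open the video with the sequential-scan hint so the cache
    // manager is less inclined to keep pages around once they have been read.
    HANDLE OpenVideoSequential(const wchar_t* path)   // placeholder helper
    {
        return CreateFileW(path,
                           GENERIC_READ,
                           FILE_SHARE_READ,
                           NULL,                       // default security attributes
                           OPEN_EXISTING,
                           FILE_FLAG_SEQUENTIAL_SCAN,  // hint: read once, front to back
                           NULL);                      // no template file
    }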
You can try passing FILE_FLAG_NO_BUFFERING to CreateFile() to avoid caching. This imposes some requirements on your buffers. Specifically, their size must be a multiple of the sector size and their addresses must be aligned on the sector size. See MSDN for more details.
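A rough sketch of what that looks like in practice, assuming 64 KB is a multiple of the volume's sector size (query the real value with GetDiskFreeSpace); VirtualAlloc is used here because it returns page-aligned memory, which satisfies the address-alignment requirement:

    #include <windows.h>

    // Sketch of a single unbuffered read.  The path is a placeholder.
    bool ReadUnbufferedChunk(const wchar_t* path)
    {
        const DWORD kChunk = 64 * 1024;                 // multiple of the sector size (assumed)
        HANDLE h = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                               OPEN_EXISTING,
                               FILE_FLAG_NO_BUFFERING,  // bypass the system cache
                               NULL);
        if (h == INVALID_HANDLE_VALUE)
            return false;

        void* buf = VirtualAlloc(NULL, kChunk, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
        DWORD read = 0;
        // Reads must start at sector-aligned file offsets and request
        // sector-multiple byte counts.
        BOOL ok = ReadFile(h, buf, kChunk, &read, NULL);

        VirtualFree(buf, 0, MEM_RELEASE);
        CloseHandle(h);
        return ok != FALSE;
    }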
Assuming you have access to the CreateFile call which ultimately opens your file, you can use FILE_FLAG_NO_BUFFERING when you open it:
http://msdn.microsoft.com/en-us/library/aa363858%28VS.85%29.aspx
If you're not calling CreateFile directly yourself, but via some kind of library, you'll need to see if they provide a way to let you set this flag indirectly.
You might also find that the initial delay is caused by loading-up of the enormous number of DLLs which can make up the media stack in Windows, in which case altering the way the media file itself is opened won't help.
You could test this by making an extremely short media file, which you play at app startup, to 'warm-up' the stack.
Related
I'm writing a little C++ program for myself. At the beginning of it, I read a file all the way to the end, and later on, right before the program ends, I need to read that file again from the beginning.
My question is: is it more efficient to keep the file open during execution (even though I won't be using it) and just rewind it when I need it again, or should I close it the first time and then open it again when I need it?
Edit: Just to clarify, my question is not only related to the specific project I'm working on. It is really small (less than 300 lines of code), so there won't be any noticeable performance difference. I'm asking about opening, closing and "rewinding" files in general, so it's applicable to other, bigger projects where performance and memory may actually matter.
If you close and reopen the file, the OS definitely needs to update its locks and the list of resources (open files) for your process. Furthermore, close and open are two system calls (kernel calls), and system calls are not cheap; every system call involves a switch into the kernel and translation of virtual addresses.
Closing the file can (if anything has changed) force the cache to be written to the hard disk, which means a seek time of about 15 ms (a physical move over the platter). It can be even worse in the case of a network drive.
After closing the file, some properties need to be updated, and a file system watcher may be triggered.
An antivirus scan may also be triggered after closing the file, depending on the filename, the path, and the antivirus product.
Furthermore, closing the file carries the risk that you won't be able to open it again because another process has grabbed it. For example, Dropbox reads every file in the Dropbox folder after it changes, so closing and reopening a file does not always work in a Dropbox folder (Dropbox may simply be faster than your program). And who knows how users will use your application; users are inventive, and they share files in ways you didn't think of.
You might be able to measure a gained efficiency in the range of a few nanoseconds if you fseek to the beginning of the file, but I don't think this is worth it when you are only dealing with a single file.
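For what it's worth, the rewind approach is a one-liner with C stdio; a minimal sketch (the file name is a placeholder):

    #include <cstdio>

    // Sketch: rewind an already-open file instead of closing and reopening it.
    int main()
    {
        FILE* f = std::fopen("data.txt", "r");   // placeholder file name
        if (!f) return 1;

        char line[256];
        while (std::fgets(line, sizeof line, f)) {
            // first pass over the file
        }

        std::rewind(f);                          // back to the start, no close/open round trip
        while (std::fgets(line, sizeof line, f)) {
            // second pass, right before the program ends
        }

        std::fclose(f);
        return 0;
    }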
Like others said: try to find other areas of code which you can optimize.
As with all performance issues, the final optimizations vary widely. Measure both implementations against a reasonable data set and take it from there.
As a design choice, it may be simpler to cache the contents of the file in memory once it has been read the first time, and then there is no need to re-read the contents. If the modified content is required, then again, cache the modified data to forgo the second read.
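A minimal sketch of that caching approach, assuming the file fits comfortably in memory (the function name is made up):

    #include <fstream>
    #include <sstream>
    #include <string>

    // Sketch: read the file once and keep its contents in memory, so the
    // second "pass" never touches the disk.
    std::string LoadWholeFile(const char* path)
    {
        std::ifstream in(path, std::ios::binary);
        std::ostringstream buffer;
        buffer << in.rdbuf();        // slurp the entire file
        return buffer.str();
    }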
Is there a way to get all open file handles for a process and arrange them by the time the files were opened? We have a project which requires exactly this: we need to determine which files are opened by DJ software such as Traktor or Serato. The reason we need to know the order is to determine which file is in the first deck and which is in the second one.
Currently we are using Windows internal APIs from the Ntdll.dll (Winternl.h) to determine a list of all opened files for a process. Maybe that's not the best way to do it. Any suggestions are highly appreciated.
We relied on an observed behavior of those APIs on a certain OS version and certain DJ software versions: the list of all open files for a process never got rearranged, i.e. it kept its order. I know that's bad practice, but it was a feature the customer demanded right before the release, so we had to. The problem is that we now have a bug where those handles are sometimes randomly rearranged without any apparent cause, which breaks everything. I thought maybe there would be a field in those Windows structures holding the time a file was opened, but apparently there is no such thing. The documentation for those APIs is quite poor.
I thought about pasting some code, but it's a function 200 lines long, it makes indirect calls into the DLL through function pointers, and all of the structures for the WinAPIs are redefined manually, so it's really hard to read. Actually, the Winternl.h header isn't even included - everything is loaded manually too, like this:
GetProcAddress( GetModuleHandleA("ntdll.dll"), "NtQuerySystemInformation" );
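For reference, that resolution is usually paired with a hand-written function-pointer type, roughly like this sketch (the typedef name is arbitrary; the prototype follows the documented NtQuerySystemInformation signature):

    #include <windows.h>

    // Sketch of the manual-loading pattern described above: declare a
    // function-pointer type and resolve the entry point from ntdll.dll.
    typedef LONG (WINAPI *NtQuerySystemInformation_t)(
        ULONG  SystemInformationClass,
        PVOID  SystemInformation,
        ULONG  SystemInformationLength,
        PULONG ReturnLength);

    NtQuerySystemInformation_t pNtQuerySystemInformation =
        (NtQuerySystemInformation_t)GetProcAddress(
            GetModuleHandleA("ntdll.dll"), "NtQuerySystemInformation");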
It's really a headache for a cross-platform application...
P.S. I have posted a related question here about a cross-platform or Qt way to get open file handles; maybe that stuff will be useful or related.
If it's just to check the behavior on another OS for debugging purposes, you can use the technique of creating the process in debug mode and intercepting, in order, all of the DLL-loading events; here's a good article talking about that.
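A rough sketch of that technique, assuming you only need the load order for debugging (target.exe is a placeholder command line):

    #include <windows.h>
    #include <cstdio>

    // Sketch: start the target under the debugger and log DLL-load events
    // in the order they occur.
    int main()
    {
        STARTUPINFOW        si = { sizeof(si) };
        PROCESS_INFORMATION pi = {};
        wchar_t cmd[] = L"target.exe";            // placeholder command line

        if (!CreateProcessW(NULL, cmd, NULL, NULL, FALSE,
                            DEBUG_ONLY_THIS_PROCESS, NULL, NULL, &si, &pi))
            return 1;

        for (;;) {
            DEBUG_EVENT ev = {};
            if (!WaitForDebugEvent(&ev, INFINITE))
                break;

            if (ev.dwDebugEventCode == LOAD_DLL_DEBUG_EVENT)
                std::printf("DLL loaded (file handle %p)\n", ev.u.LoadDll.hFile);

            bool done = (ev.dwDebugEventCode == EXIT_PROCESS_DEBUG_EVENT);
            ContinueDebugEvent(ev.dwProcessId, ev.dwThreadId, DBG_CONTINUE);
            if (done)
                break;
        }

        CloseHandle(pi.hThread);
        CloseHandle(pi.hProcess);
        return 0;
    }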
I have written code in C++ to open a file in its default application, e.g. a .doc file in MS Word. Now I want to measure the time it takes to open the file in its application.
For that I would need to know what percentage of the file has been loaded into the application, but after 7 days of searching I couldn't find any suitable solution. Can anyone help me solve this problem?
Since I am using Windows, can Windows Task Manager help me do this?
What you're trying to do is not only impossible, it doesn't even make sense.
When you play an MP3 in WMP, it doesn't load the whole file into memory. Instead, it maps a little bit of the file at a time into memory so it can decode the MP3 on the fly as it's playing. (I suppose if you play the song all the way through, without stopping or skipping or fast forwarding or rewinding, it will eventually read every byte of the file, probably finishing a few seconds before the song is over, but I doubt that's what you're looking for.)
Likewise, Word doesn't read an entire .doc file into memory (unless it's very small). That's how it's able to edit gigantic files without using huge amounts of memory. (Again, if you page through the whole file, it will probably eventually read every byte; for that matter, it may eventually copy enough of the file into an autosave backup file that it no longer needs to look at the original. But again, I doubt that's what you're looking for.)
If you only care about certain specific applications, and those applications have a COM Automation interface (as both WMP and Word do), they may have methods or events that will tell you when they're done "loading" a file (meaning they've read enough of it to start playing/displaying/etc.), or when they've "finished" with a file (meaning moved on to the next track, or whatever), but there's no generic answer to that; different applications will have different Automation interfaces. (And, as a side note, you really don't want to do COM Automation from C++ unless you really have to; it's much easier from jscript, vbscript, or your favorite .NET language…)
If the third party process does not signal that it has loaded something, e.g., through some output stream, one way will be to view the file handles being opened and closed by the processes. I presume this will be similar to how "task managers" like Process Explorer are able to view file handles of processes. However, if the process does not close the file handle once it is done "loading", then, you will not get an accurate time. Furthermore, you won't be able to get a "live" percentage of how much data has been loaded.
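For context, launching a document in its default application typically boils down to a single ShellExecute call, which is part of why there is no built-in progress to observe: the shell just starts the registered application and returns. A sketch (the path is a placeholder; link against shell32.lib):

    #include <windows.h>
    #include <shellapi.h>

    // Sketch: ask the shell to open a document with whatever application is
    // registered for its extension.
    int main()
    {
        HINSTANCE r = ShellExecuteW(NULL, L"open", L"report.doc",  // placeholder path
                                    NULL, NULL, SW_SHOWNORMAL);
        // ShellExecute returns as soon as the application has been launched;
        // it reports nothing about how much of the file has been loaded.
        return ((INT_PTR)r > 32) ? 0 : 1;
    }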
I update the hard disk's root directory (information like the long filename, file size, file date, etc.) directly, using the VC++ WriteFile function. However, I've noticed that Windows Explorer does not know about the change until it is refreshed or another Explorer window is opened. I have tried calling SHChangeNotify and SendMessageTimeout, but they fail. My next step would be to fake the removal and re-insertion of an external disk. Please help, thanks.
Do not attempt to modify a filesystem directly while it is mounted (and if explorer can see it, it's mounted). The OS will maintain various cached representations of the filesystem, and modifying it behind the OS's back will result in inconsistencies between the cached representation and the actual FS, potentially corrupting the filesystem and any data in said FS.
Take a look at this serverfault question for some hints on how to perform an unmount.
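If you do go down that road, a rough sketch of locking and dismounting a volume with DeviceIoControl follows; the drive letter is a placeholder, and this is a sketch under the usual FSCTL_LOCK_VOLUME / FSCTL_DISMOUNT_VOLUME pattern, not production code (and never on the system drive):

    #include <windows.h>
    #include <winioctl.h>

    // Sketch: lock and dismount a volume before touching its on-disk
    // structures directly.  volumePath looks like L"\\\\.\\E:" in source.
    bool DismountVolume(const wchar_t* volumePath)
    {
        HANDLE h = CreateFileW(volumePath, GENERIC_READ | GENERIC_WRITE,
                               FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
                               OPEN_EXISTING, 0, NULL);
        if (h == INVALID_HANDLE_VALUE)
            return false;

        DWORD bytes = 0;
        bool ok = DeviceIoControl(h, FSCTL_LOCK_VOLUME, NULL, 0, NULL, 0, &bytes, NULL)
               && DeviceIoControl(h, FSCTL_DISMOUNT_VOLUME, NULL, 0, NULL, 0, &bytes, NULL);

        // Keep the handle open while you work on the raw volume; closing it
        // releases the lock and lets the OS remount the file system.
        if (!ok)
            CloseHandle(h);
        return ok;
    }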
Try turning off the hard disk's write cache; hopefully the OS will then not cache any file system data in RAM. This will hurt I/O performance, but it may help your experiment.
So I have many log files that I need to write to. They are created when the program begins, and they are saved to file when the program closes.
I was wondering if it is better to do:
fopen() at the start of the program, then close the files when the program ends; I would just write to the files when needed. Will anything (such as other file I/O) be slowed down by these files still being "open"?
OR
I save what needs to be written into a buffer, and then open the file, write from the buffer, and close the file when the program ends. I imagine this would be faster?
Well, fopen(3) + fwrite(3) + fclose(3) is a buffered I/O package, so another layer of buffering on top of it might just slow things down.
In any case, go for a simple and correct program. If it seems to run slowly, profile it, and then optimize based on evidence and not guesses.
Short answer:
- A large number of open files shouldn't slow anything down.
- Writing to the files will be buffered anyway.
So you can leave those files open, but do not forget to check the limit on open files in your OS.
Part of the point of log files is being able to figure out what happened when/if your program runs into a problem. Quite a few people also do log file analysis in (near) real-time. Your second scenario doesn't work for either of these.
I'd start with the first approach, but with a high-enough level interface that you could switch to the second if you really needed to. I wouldn't view that switch as a major benefit of the high-level interface though -- the real benefit would normally be keeping the rest of the code a bit cleaner.
There is no good reason to buffer log messages in your program and write them out on exit. Simply write them as they're generated using fprintf. The stdio system will take care of the buffering for you. Of course this means opening the file (with fopen) from the beginning and keeping it open.
For log files, you will probably want a functional interface that flushes the data to disk after each complete message, so that if the program crashes (it has been known to happen), the log information is safe. Leaving stuff in standard I/O buffers means excavating the data from a core dump - which is less satisfactory than having the information on disk safely.
Other I/O really won't be affected by holding one - or even a few - log files open. You lose a few file descriptors, perhaps, but that is not often a serious problem. When it is a problem, you use one file descriptor for one log file - and you keep it open so you can log information. You might elect to map stderr to the log file, leaving that as the file descriptor that's in use.
It's been mentioned that the FILE* returned by fopen is already buffered. For logging, you should probably also look into using the setbuf() or setvbuf() functions to change the buffering behavior of the FILE*.
In particular, you might want to set the buffering mode to line-at-a-time, so the log file is flushed automatically after each line is written. You can also specify the size of the buffer to use.
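A minimal sketch of that setup, opening the log once at startup, requesting line buffering, and flushing explicitly after an important message (the log file name is a placeholder):

    #include <cstdio>

    int main()
    {
        FILE* log = std::fopen("app.log", "a");   // placeholder log file name
        if (!log) return 1;

        // Line-buffered: the stream is flushed whenever a newline is written.
        // (Some platforms treat _IOLBF as full buffering; fflush still works.)
        std::setvbuf(log, NULL, _IOLBF, 8192);

        std::fprintf(log, "program started\n");
        std::fflush(log);                         // make sure this survives a crash

        std::fclose(log);                         // at program end
        return 0;
    }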