A task:
I have a trusted video event detector. I trust to my event detector for 100% and I want to write an uncompressed frame to my avi containter only if my event detector produces "true" result.
For frames, when my event detector is producing "false" I would like to write an empty packet because I want to know that there was a frame without event happening.
Is it possible to keep AVI file alive? Or do I need to write my own player in this case?
Another option is to calculate timestamps manually and set dts/pts to that calculated time.
Drawback: I will need to recalculate timestamps to understand how many frames were between events.
I am using:
av_write_frame(AVFormatContext, AVPacket);
and
av_interleaved_write_frame(AVFormatContext, AVPacket);
What is your suggestion/idea?
Thank you in advance.
Knowing AVI spec, I don't think there is a such thing as an "empty packet" as AVI stores its frames densely without frame timestamps. If file-size is no issue, you can repeat the same frame to indicate undetected event (undo with freezedetect filter) or insert all-zero frame (undo with blackdetect filter). It, however, appears better to use something like matroska container and variable frame rate paired with a lossless h264 (more inline with your alternate option?). Just my 2 cents.
Related
I have a real time video stream, and want to cut some video clips from it by accurate timestamp(pts).
When I receiver an avpacket, I decode it, and do something and cache the avpacket. I don't want to re-encode all avpackets, it cost cpu resource.
There are many gop structure in H.264 stream, usually we should cut the video begin at the key frame, and end at the key frame. Otherwise the front some frames in the video clip would display error.
Now I use av_write_frame to make avpacket to video. But sometimes the length of gop is very long, such as it could be 250, 8.3s(30 frame per second). It means the distance between two I-frame could be 250 frames. The video clip is short, I don't want to add too many unused frames.
How should I do? I think i should insert a i-frame at the start position of video clip. Could I change a p-frame to i-frame?
Thanks your reading!
This is not possible in the generic case, but may be in specific cases. Even then, there are no open source/free tools to do this, and I am unaware of any commercial tools. The reason I say it is not possible in the generic case is each frame can reference up to 16 other frames. So you can not just replace a single frame, You will need to replace all referenced frames. Doing this will likely take almost as much CPU as encoding the whole GOP.
I'm trying to capture an AVI video, using DirectShow AVIMux and FileWriter Filters.
When I connect SampleGrabber filter instead of the AVIMux, I can clearly see that the stream is 30 fps, however upon capturing the video, each frame is duplicated 4 time and I get a 120 frames instead of 30. The movie is 4 times slower than it should be and only the first frame in the set of 4 is a Key Frame.
I tried the same experiment with 8 fps and for each image I received, I had 15 frames in the video. And in case of 15 fps, I got each frame 8 times.
I tried both writing the code in C++ and testing it with Graph Edit Plus.
Is there any way I can control it? Maybe some restrictions on the AVIMux filter?
You don't specify your capture format which could have some bearing on the problem, but generally it sounds like the graph when writing to file has some bottleneck which prevents the stream from continuing to flow at 30fps. The camera is attempting to produce frames at 30fps, and it will do so as long as buffers are recycled for it to fill.
But here the buffers aren't available because the file writer is busy getting them onto the disk. The capture filter is starved and in this situation it increments the "dropped frame" counter which travels with each captured frame. AVIMux uses this count to insert an indicator into the AVI file which says in effect "a frame should have been available here to write to file, but isn't; at playback time repeat the last frame". So the file should have placeholders for 30 frames per second - some filled with actual frames, and some "dropped frames".
Also, you don't mention whether you're muxing in audio, which would be acting as a reference clock for the graph to maintain audio-video sync. When capture completes if also using an audio stream, AVIMux alters the framerate of the video stream to make the duration of the two streams equal. You can check whether AVIMux has altered the framerate of the video stream by dumping the AVI file header (or maybe right click on the file in explorer and look at the properties).
If I had to hazard a guess as to the root of the problem, I'd wager the capture driver has a bug in calculating the dropped frame count which is in turn messing up AVIMux. Does this happen with a different camera?
I have so-called "block's" that stores some of MPEG4 (I,p,p,p,p...) frames.
For every "block", frame starts with an "I" frame, and ends before next "I" frame.
(VOL - "visual_object_sequence_start_code" is allways included before the "I" frame)
I need to be able to play those "block" frames in "backwards" mode.
The thick is that:
It's not possible to just take the last frame in my block and perform decoding, because it's a "P" frame and it needs an "inter frame (I)" to be correctly decoded.
I can't just take my first "I" frame, then pass it to the ffmpeg's "avcodec_decode_video" function and only then pass my last "P" frame to the ffmpeg, because that last "P" frame depends on the "P" frame before it, right? (well.. as far as I've tested this method, my last decoded P frame had artifacts)
Now the way I'm performing backwards playing is - first decoding all of my "block" frames in RGB and store them in memory. (in most cases it would be ~25 frames per block max.) But this method really requires a lot of memory... (especially if frames resolutions is high)
And I have a feeling that this is not the right way to do this...
So I would like to ask, does any one have any suggestions how this "backwards" frame decoding/playing could be performed using FFmpeg?
Thanks
What you are looking at really a research problem: To get a glimps of the overall approach, look at the following paper:
Compressed-Domain Reverse Play of MPEG Video Streams, SPIE International Symposium on Voice, Video, and Data Communications, Boston, MA, November, 1998.
Reverse-play algorithm for MPEG video streaming
MANIPULATING TEMPORAL DEPENDENCIES IN COMPRESSED VIDEO DATA WITH APPLICATIONS TO COMPRESSED-DOMAIN PROCESSING OF MPEG VIDEO.
Essentially, there is still advanced encoding based on key frames, however, you can reverse the process of motion compensation to achieve the reverse flow. This is done by on the fly conversion of P frames into I frames. This does require looking forward but doesn't require that much more memory. Possibly you can save this as a new file and then apply it to standard decoder with reverse play requirements.
However, this is very complex, and i have seen rare softwares doing this practically.
I do not think there is a way around starting from I-frame and decoding all P-frames, due to P-frame depending on the previous frame. To handle decoded frames, they can be saved to a file, or, with limited storage and extra CPU power, older P-frames can be discarded and recomputed later.
At the command level, you can convert input video to a series of images:
ffmpeg -i input_video output%4d.jpg
then reverse their order somehow and convert back to a video:
ffmpeg -r FRAME_RATE -i reverse_output%4d.jpg output_video
You may consider pre-processing, if it is an option.
I'm doing some processing on some very large video files (often up to 16MP), and I need a way to store these videos in a format that allows seeking to specific frames (rather than to times, like ffmpeg). I was planning on just rolling my own format that concatenates all of the individually zlib compressed frames together, and then appends an index on the end that links frame numbers to file byte indices. Before I go about this though, I just wanted to check to make sure I'm not duplicating the functionality of another format/library. Has anyone heard of a format/library that allows lossless compression and random access of videos?
The reason it is hard to seek to a specific frame in most video codecs is that most frames depend on another frame or frames, so frames must be decoded as a group. For this reason, most libraries will only let you seek to the closest I-frame (Intra-frame - independently decodable frame). To actually produce an image from a non-I-frame, data from other frames is required, so you have to decode a number of frames worth of data.
The only ways I have seen this problem solved involve creating an index of some kind on the file. In other words, make a pass through the file and create an index of what frame corresponds to a certain time or section of the file. Since the seeking functions of most libraries are only able to seek to an I frame so you may have to seek to the closest I-frame and then decode from there to the exact frame you want.
If space is not of high importance, I would suggest doing it like you say, but use JPEG compression instead of zlib as it will give you a lot higher compression ratio since it exploits the fact you are dealing with image data.
If space is an issue, P frames (depend on previous frame/frames) can greatly reduce the size of the file. I would not mess with B frames (depend on previous and future frame/frames) since they make it much harder to get things right.
I have solved the problem of seeking to a specific frame in the presence of B and P frames in the past using ffmpeg (libavformat) to demux the video into packets (1 frame's worth of data per packet) and concatenate these into a single file. The important thing is to keep and index into that file so you can find packet bounds for a given frame. If the frame is an I-frame, you can just feed that frame's data into an ffmpeg decoder and it can be decoded. If the frame is a B or P frame, you have to go back to the last I-frame and decode forward from there. This can be quite tricky to get right, especially for B-frames since they are often sent in a different order than how they are displayed.
Some formats allow you to change the number of key frames per second.
For example, I've used ffmpeg to encode to flv at 25 frames per second with 25 key frames per second, and then used a player that was fine in moving to key frames. Basically this allowed me to do frame by frame seeking.
Also the last time I checked quicktime can do frame by frame seek without having to have each frame being a key frame.
May not be applicable to you but that's my thoughts.
I have a two dump files of raw video and raw audio from an encoder and I want to be able to measure the "Lip-sync". Imagine a video of a hammer striking an anvil. I want to go frame by frame and see that when the hammer finally hits the anvil, there is a spike in amplitude on the audio track.
Because of the speed that everything happens at, I cannot merely listen to the audio, i need to see the waveform in time domain.
Are there any tools out there that will let me see both the video and audio?
If you are concerned about validating a decoder then generally from a validation perspective the goal is to check Audio and Video PTS values against a common real time clock.
Raw YUV and PCM files do not include timestamps. If you know the frame-rate and sample-rate you can use a raw yuv file viewer (I wrote my own) to figure out the time (from start of file) of a given frame in the video, and a tool like Audacity to figure out the time form start of file to a start of tone in the audio file. this still may not tell you the whole story since tools usually embed a delay between the audio and video in the ts/ps file. Or you can hook up ab OScope and go old school.