How to efficiently find and decode the Nth frame with libavcodec? - c++

Please note, this is not a duplicate of similar posts!
I want to find and decode the Nth frame, for example the 7th frame.
As I understand it, using time_base I can calculate how many ticks each frame lasts, and by multiplying that by 7 I get the position of the 7th frame. To calculate the ticks I do
AVStream *inStream = getStreamFromAVFormatContext();
int fps = inStream->r_frame_rate.num / inStream->r_frame_rate.den;
AVRational timeBase = inStream->time_base;
// (1/fps) seconds expressed in time_base units; integer math, since 1/fps would truncate to 0
int64_t ticks_per_frame = timeBase.den / (fps * timeBase.num);
int64_t _7thFramePos = ticks_per_frame * 7;
Did I calculate the position of the 7th frame correctly? If so, to go to that frame I just do av_seek_frame(pFormatCtx, -1, _7thFramePos, AVSEEK_FLAG_ANY), right?
What happens if the 7th frame is a P-frame or a B-frame? How do I decode it?
I noticed that the calculated value differs from inStream->codec->ticks_per_frame. Why? Shouldn't they be the same? What is the difference?

This post explains the issue nicely.
http://www.hackerfactor.com/blog/index.php?/archives/307-Picture-Go-Back.html
[1] The comment for the AVStream structure clearly mentions that "r_frame_rate" is a guess and may not be accurate, because even if I have a frame rate of (say) 25 fps, in terms of time_base I may have 24 or 26 frames in a second.
[2] To find the exact frame number you need to decode frames from the start and keep a counter, but that is very inefficient. This can be optimized for some file formats like MP4, where information about every frame is present in the file header.
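The tick arithmetic from the question can be done safely in integer math. A minimal sketch, where the Rational struct is a stand-in for FFmpeg's AVRational and frame_to_ts is a hypothetical helper, assuming a constant frame rate:

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for FFmpeg's AVRational, so the sketch is self-contained.
struct Rational { int num; int den; };

// Timestamp (in time_base ticks) of frame n for a constant-frame-rate stream.
// One frame lasts fps.den/fps.num seconds, which is
// (fps.den * tb.den) / (fps.num * tb.num) ticks of the time base.
int64_t frame_to_ts(int64_t n, Rational fps, Rational tb) {
    return n * fps.den * tb.den / ((int64_t)fps.num * tb.num);
}
```

With a 25/1 frame rate and a 1/90000 time base each frame is 3600 ticks, so frame 7 sits at timestamp 25200. To actually land on a decodable picture, such a timestamp is normally passed to av_seek_frame with AVSEEK_FLAG_BACKWARD (seek to the previous keyframe) rather than AVSEEK_FLAG_ANY, and decoding then proceeds forward until the counter or pts reaches the target, as [2] describes.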

Related

C++ ffmpeg get the current frame

I'm trying to seek to a certain part of a video using ffmpeg. So far I've got this:
int64_t pts = (int64_t)(((double)timestamp_to_go / 1000.0) * (double)time_base.den / (double)time_base.num);
if (av_seek_frame(av_format_ctx, video_stream_index, pts, AVSEEK_FLAG_BACKWARD) < 0)
    exit(0);
This allows me to seek to the closest I-frame. For example, if I try to seek to the 10th second of a video, it seeks to the 8.5th second. This is fine, since I can just decode until I reach the 10th second and go on with my day.
However I couldn't figure out how to get the current frame index. After I seek to the frame using the code above, I need to figure out which frame/timestamp I'm currently at so I can decode until I reach the timestamp desired.
For example: if I try to seek to 10 s and land at 8.5 s as above, for a 30 fps video I need to get 255 so I can decode until I reach the 300th frame, which corresponds to the 10th second.
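Going the other way, the pts of the first packet or frame received after the seek can be mapped back to a frame index. A minimal sketch under the same constant-frame-rate assumption (Rational stands in for AVRational; ts_to_frame is a hypothetical helper):

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for FFmpeg's AVRational.
struct Rational { int num; int den; };

// Frame index of a timestamp: seconds = ts * tb.num / tb.den,
// frame index = seconds * fps.num / fps.den.
int64_t ts_to_frame(int64_t ts, Rational fps, Rational tb) {
    return ts * fps.num * tb.num / ((int64_t)fps.den * tb.den);
}
```

For the example above: landing at 8.5 s with a 1/1000 time base means ts = 8500, which at 30 fps maps to frame 255, and decoding continues until the index reaches 300. In real FFmpeg code the same conversion can be done with av_rescale_q(pts, time_base, av_inv_q(frame_rate)).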

How to get correct frame duration using ffmpeg?

In my file "Video.avi", all pkt_duration attributes in the video AVFrames are equal to 1.
But the best_effort_timestamp attributes in the AVFrames for the first video frames are: 0, then 3, then 4, then 5, then 7, etc., which means for example that the duration of the first frame is 3 and not 1 (the time_base is 1/30).
Is there a better way to know the correct duration of a frame than reading the next frame, getting its best_effort_timestamp, and computing the difference between the two timestamps?
If not, is there a way to read only the header of the next packet, just to get its timestamp, without wasting time decompressing anything?
Thank you for your help.

Get the frame from Video by the time (openCV)

I have a video, and I have a list of important times in this video.
For example:
"frameTime1": "00:00:01.00"
"frameTime2": "00:00:02.50"
"frameTime2": "00:00:03.99"
.
.
.
I get the FPS, and I get the totalFrameCount
If I want to get the frame at one of those times, for example the frame that happens at "frameTime2": "00:00:02.50", I do the following:
FrameIndex = (Time*FPS)/1000; //1000 because 1 second = 1000 milliseconds
In this case 00:00:02.50 = 2500 milliseconds, and the FPS = 29.
So the FrameIndex in this case is 72.5, and I will choose either frame 72 or 73, but I feel that's not accurate enough. Any better solution?
What's the best and accurate way to do this?
The most accurate thing you have at your disposal is the frame time. When you say that an event occurred at 2500ms, where is this time coming from? Why is it not aligned with your framerate? You only have video data points at 2483ms and 2517ms, no way around that.
If you are tracking an object on the video, and you want its position at t=2500, then you can interpolate the position from the known data points. You can do that either by doing linear interpolation between the neighboring frames, or possibly by fitting a curve on the object trajectory and solving for the target time.
If you want to rebuild a complete frame at t=2500 then it's much more complicated and still an open problem.
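For the index computation itself, keeping the fractional frame position makes the ambiguity explicit: the integer part and the next frame bracket the requested time, and the fraction is the interpolation weight the answer mentions. A minimal sketch (frame_position is a hypothetical helper, assuming a constant frame rate):

```cpp
#include <cassert>
#include <cmath>

// Fractional frame position of a time given in milliseconds.
// The integer part is the frame at or before the time; the fraction is
// how far the time sits between that frame and the next one.
double frame_position(double time_ms, double fps) {
    return time_ms * fps / 1000.0;  // 1 second = 1000 milliseconds
}
```

frame_position(2500, 29) is 72.5: the event falls exactly between frames 72 (at ~2483 ms) and 73 (at ~2517 ms), and a tracked object's position at t = 2500 ms can be linearly interpolated between those two frames with weight 0.5.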

Understanding SDL frame animation

I am working through the book SDL game development. In the first project, there is a bit of code meant to move the coords of the rendered frame of a sprite sheet:
void Game::update()
{
m_sourceRectangle.x = 128 * int((SDL_GetTicks()/100)%6);
}
I am having trouble understanding this... I know that it moves m_sourceRectangle 128 pixels along the x axis every 100 ms... but how does it actually work? Can somebody break down each element of this code to help me understand?
I don't understand why SDL_GetTicks() needs to be called to do this...
I also know that %6 is there because there are 6 frames in the animation... but how does it actually do that?
The book says:
Here we have used SDL_GetTicks() to find out the amount of milliseconds since SDL was initialized. We then divide this by the amount of time (in ms) we want between frames and then use the modulo operator to keep it in range of the amount of frames we have in our animation. This code will (every 100 milliseconds) shift the x value of our source rectangle by 128 pixels (the width of a frame), multiplied by the current frame we want, giving us the correct position. Build the project and you should see the animation displayed.
But I am not sure I understand why getting the amount of milliseconds since SDL was initialized works.
The modulo operator takes the remainder of a division. For example, if SDL_GetTicks() returns 2600, dividing by 100 gives 26, and 26 % 6 is 2, so it's frame 2.
If SDL_GetTicks() returns 3300, dividing by 100 gives 33; 33 % 6 is 3; frame 3.
Each frame is displayed for 100 ms, so at T=0 ms it's frame 0, at T=100 ms it's frame 100/100 = 1, at T=200 ms it's frame 200/100 = 2, and so on. So at T=SDL_GetTicks() ms, it's frame SDL_GetTicks()/100. But you only have 6 frames in total, cycling, so at T=SDL_GetTicks() ms it's in fact frame (SDL_GetTicks()/100) % 6.
The assumption here is that frame 0 is displayed when the program starts, which may not be true because there is a lot of startup work that takes time. But for a simple demo illustrating the cycling of frames, it is good enough.
Hope this helps.
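The whole update boils down to one pure computation, which can be checked without SDL at all (ticks here is a stand-in for the value SDL_GetTicks() would return):

```cpp
#include <cassert>
#include <cstdint>

// x offset into the sprite sheet for a given tick count:
// advance one 128-px-wide frame every 100 ms, cycling through 6 frames.
int source_x(uint32_t ticks) {
    return 128 * (int)((ticks / 100) % 6);
}
```

At ticks 0..99 the offset is 0, at 100..199 it is 128, and after frame 5 (offset 640) it wraps back to 0, which is exactly the 6-frame cycle the book describes.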

How to find the length of a video in OpenCV?

I want to find the length of a video capture in OpenCV;
int frameNumbers = (int) cvGetCaptureProperty(video2, CV_CAP_PROP_FRAME_COUNT);
int fps = (int) cvGetCaptureProperty(video2, CV_CAP_PROP_FPS);
int videoLength = frameNumbers / fps;
but this gives me a result which is less than the real answer. What do I have to do?
Actually, I am not sure there is any issue with the functions you tried as of today. However, there is an issue with this snippet: it assumes that frames per second is an integer value, which is not always the case. For example, many videos are encoded at 29.97 FPS, and this code would truncate that to int(29.97) = 29, which obviously results in a larger value in seconds for the video length.
The calculation seems to work fine for me if I use floating point values (float) without truncating them.
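The point about keeping fps in floating point can be checked with plain arithmetic; a minimal sketch with no OpenCV calls (video_length_s is a hypothetical helper):

```cpp
#include <cassert>
#include <cmath>

// Video length in seconds. fps must stay a double (e.g. 29.97);
// truncating it to an int skews the result.
double video_length_s(double frame_count, double fps) {
    return frame_count / fps;
}
```

For 3000 frames at 29.97 fps this gives about 100.1 s, whereas dividing by int(29.97) = 29 would report about 103.4 s.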
See this similar post: OpenCV cannot (yet) correctly capture the number of frames
OpenCV captures only a fraction of the frames from a video file