How can I retrieve the current frame number of a video using OpenCV? Does OpenCV have any built-in function for getting the current frame, or do I have to do it manually?
You can use the "get" method of your capture object like below:
capture.get(CV_CAP_PROP_POS_FRAMES); // retrieves the current frame number
and also:
capture.get(CV_CAP_PROP_FRAME_COUNT); // returns the number of total frames
Btw, these methods return a double value.
You can also use the cvGetCaptureProperty method (if you use the old C interface).
cvGetCaptureProperty(CvCapture* capture, int property_id);
The property_id options are listed below with their definitions:
CV_CAP_PROP_POS_MSEC 0
CV_CAP_PROP_POS_FRAMES 1
CV_CAP_PROP_POS_AVI_RATIO 2
CV_CAP_PROP_FRAME_WIDTH 3
CV_CAP_PROP_FRAME_HEIGHT 4
CV_CAP_PROP_FPS 5
CV_CAP_PROP_FOURCC 6
CV_CAP_PROP_FRAME_COUNT 7
POS_MSEC is the current position in a video file, measured in
milliseconds.
POS_FRAMES is the position of the current frame in the video (e.g. the 55th frame of the video).
POS_AVI_RATIO is the current position given as a number between 0 and 1
(this is actually quite useful when you want to position a trackbar
to allow folks to navigate around your video).
FRAME_WIDTH and FRAME_HEIGHT are the dimensions of the individual
frames of the video to be read (or to be captured at the camera’s
current settings).
FPS is specific to video files and indicates the number of frames
per second at which the video was captured. You will need to know
this if you want to play back your video and have it come out at the
right speed.
FOURCC is the four-character code for the compression codec to be
used for the video you are currently reading.
FRAME_COUNT should be the total number of frames in video, but
this figure is not entirely reliable.
(from the Learning OpenCV book)
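Putting the above together, here is a minimal sketch using the Python bindings (modern constant names; the file name is just a placeholder) that prints the current frame number against the total while reading a video:

import cv2

# minimal sketch: print the current frame number while reading a video file
# ("video.avi" is just a placeholder name)
cap = cv2.VideoCapture("video.avi")
total = cap.get(cv2.CAP_PROP_FRAME_COUNT)      # total number of frames (a double)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    # CAP_PROP_POS_FRAMES is the 0-based index of the frame to be decoded next
    current = cap.get(cv2.CAP_PROP_POS_FRAMES)
    print("frame %d of %d" % (current, total))

cap.release()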
In OpenCV version 3.4, the correct flag is:
cap.get(cv2.CAP_PROP_POS_FRAMES)
The way to do it in OpenCV Python is like this:
import cv2
cam = cv2.VideoCapture(<filename>)
print(cam.get(cv2.CAP_PROP_POS_FRAMES))  # cv2.cv.CV_CAP_PROP_POS_FRAMES in the old 2.x bindings
I'm trying to seek to a certain part of a video using ffmpeg. So far I've got this:
int64_t pts = (int64_t)(((float) timestamp_to_go / 1000) * (double) time_base.den / (double) time_base.num);
if (av_seek_frame(av_format_ctx, video_stream_index, pts, AVSEEK_FLAG_BACKWARD) < 0)
    exit(0);
This allows me to seek to the closest I-frame. For example, if I try to seek to the 10th second of a video, it seeks to the 8.5th second. This is fine, since I can just decode until I reach the 10th second and go on with my day.
However I couldn't figure out how to get the current frame index. After I seek to the frame using the code above, I need to figure out which frame/timestamp I'm currently at so I can decode until I reach the timestamp desired.
For example: if I try to seek to 10 and get 8.5 as in the example above, for a video with 30 fps I need to get 255, so I can decode until I reach the 300th frame, which corresponds to the 10th second.
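A quick sketch of the frame-index arithmetic described above (this assumes a constant frame rate; the 8.5 s seek result and the 30 fps figure are the values from the example):

# sketch of the frame-index arithmetic above, assuming a constant frame rate
fps = 30.0                                        # frame rate from the example
seeked_seconds = 8.5                              # timestamp the seek actually landed on
target_seconds = 10.0                             # timestamp we want to reach
current_frame = int(seeked_seconds * fps)         # 255
target_frame = int(target_seconds * fps)          # 300
frames_to_decode = target_frame - current_frame   # decode 45 more frames
print(current_frame, target_frame, frames_to_decode)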
I am writing a video stabilizer using OpenCV. The algorithm is as follows:
while there are more frames in the video:
take new frame from the video
detect keypoints in the new frame
compute descriptor for new keypoints
match descriptors of the new and the previous frame
filter matches to get good matches
find homography between previous and new frame
apply homography (warpPerspective) to the new frame and thus create "adjusted new frame"
set previous frame to be equal to "adjusted new frame" (descriptors, keypoints)
I have a few questions. Am I on the right track? How do I do the actual stabilization (using a Gaussian filter or something else)?
Here is a possible sequence of steps:
Step 1. Read Frames from a Movie File
Step 2. Collect Salient Points from Each Frame
Step 3. Select Correspondences Between Points
Step 4. Estimating Transform from Noisy Correspondences
Step 5. Transform Approximation and Smoothing
Step 6. Run on the Full Video
You can find more details on each step here:
http://www.mathworks.com/help/vision/examples/video-stabilization-using-point-feature-matching.html
I think you can follow the same steps in OpenCV.
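As a rough illustration (this is my own sketch, not the MathWorks code: it uses ORB features, estimateAffinePartial2D rather than a full homography, and a simple moving-average smoother, and the helper names are mine), steps 2-5 could look like this in OpenCV:

import cv2
import numpy as np

# sketch of steps 2-5: match features between consecutive frames, estimate a
# similarity transform, and smooth the per-frame motion with a moving average.
# ORB + estimateAffinePartial2D + the moving average are illustrative choices.

def frame_to_frame_transform(prev_gray, curr_gray, orb, matcher):
    kp1, des1 = orb.detectAndCompute(prev_gray, None)
    kp2, des2 = orb.detectAndCompute(curr_gray, None)
    if des1 is None or des2 is None:
        return np.eye(2, 3, dtype=np.float64)          # fall back to identity
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:100]
    if len(matches) < 4:
        return np.eye(2, 3, dtype=np.float64)
    src = np.float32([kp1[m.queryIdx].pt for m in matches])
    dst = np.float32([kp2[m.trainIdx].pt for m in matches])
    M, _ = cv2.estimateAffinePartial2D(dst, src, method=cv2.RANSAC)
    return M if M is not None else np.eye(2, 3, dtype=np.float64)

def smooth_trajectory(trajectory, radius=15):
    # moving-average smoothing of the accumulated (dx, dy, da) trajectory
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    padded = np.pad(trajectory, ((radius, radius), (0, 0)), mode="edge")
    return np.column_stack([np.convolve(padded[:, i], kernel, mode="valid")
                            for i in range(trajectory.shape[1])])

orb = cv2.ORB_create(500)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

The remaining work (accumulating the per-frame transforms into a trajectory, replacing it with the smoothed one, and warping each frame with cv2.warpAffine) follows steps 5 and 6 above.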
If you're using Python, you can use my threaded VidGear video-processing library, which now provides real-time video stabilization through its Stabilizer class, with minimal latency and little to no extra computational overhead. Here's a basic usage example for your convenience:
# import libraries
from vidgear.gears import VideoGear
import cv2

# open any valid video stream (e.g. the device at index 0) with stabilization enabled
stream = VideoGear(source=0, stabilize=True).start()

# infinite loop
while True:

    # read stabilized frames
    frame = stream.read()

    # check if frame is None
    if frame is None:
        # if True, break the infinite loop
        break

    # do something with the stabilized frame here

    # show output window
    cv2.imshow("Stabilized Frame", frame)

    key = cv2.waitKey(1) & 0xFF
    # check for 'q' key press
    if key == ord("q"):
        # if 'q' key pressed, break out
        break

# close output window
cv2.destroyAllWindows()

# safely close video stream
stream.stop()
More advanced usage can be found here: https://github.com/abhiTronix/vidgear/wiki/Real-time-Video-Stabilization#real-time-video-stabilization-with-vidgear
I have a video, and I have important times in this video.
For example:
"frameTime1": "00:00:01.00"
"frameTime2": "00:00:02.50"
"frameTime2": "00:00:03.99"
.
.
.
I get the FPS, and I get the totalFrameCount
If I want to get the frames at those times, for example the frame that occurs at "frameTime2": "00:00:02.50", I will do the following:
FrameIndex = (Time*FPS)/1000; // 1000 because 1 second = 1000 milliseconds
In this case 00:00:02.50 = 2500 milliseconds, and the FPS = 29.
So the FrameIndex in this case is 72.5, and I will choose either frame 72 or frame 73, but I feel that's not accurate enough. Is there a better solution?
What's the best and accurate way to do this?
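For reference, the arithmetic described above works out like this (just a quick sketch; the timestamp and FPS values are the ones given in the question):

# quick check of the frame-index arithmetic from the question
time_ms = 2500                            # 00:00:02.50 expressed in milliseconds
fps = 29                                  # frames per second of the video
frame_index = (time_ms * fps) / 1000.0    # 1 second = 1000 milliseconds
print(frame_index)                        # 72.5 -> frame 72 or 73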
The most accurate thing you have at your disposal is the frame time. When you say that an event occurred at 2500ms, where is this time coming from? Why is it not aligned with your framerate? You only have video data points at 2483ms and 2517ms, no way around that.
If you are tracking an object on the video, and you want its position at t=2500, then you can interpolate the position from the known data points. You can do that either by doing linear interpolation between the neighboring frames, or possibly by fitting a curve on the object trajectory and solving for the target time.
If you want to rebuild a complete frame at t=2500 then it's much more complicated and still an open problem.
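For the interpolation approach, here is a tiny sketch of the idea (the frame timestamps come from this answer; the object positions are made-up values purely for illustration):

# linear interpolation of an object position between the two neighbouring frames
t0, t1, t = 2483.0, 2517.0, 2500.0        # neighbouring frame times and target time, in ms
x0, y0 = 120.0, 80.0                      # hypothetical object position at t0
x1, y1 = 126.0, 83.0                      # hypothetical object position at t1
alpha = (t - t0) / (t1 - t0)              # fractional distance between the frames
x = x0 + alpha * (x1 - x0)
y = y0 + alpha * (y1 - y0)
print(x, y)                               # estimated position at t = 2500 ms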
I want to denoise a video using OpenCV and C++. I found on the OpenCV doc site this:
fastNlMeansDenoising(contourImage,contourImage2);
Every time a new frame is loaded, my program should denoise the current frame (contourImage) and write it to contourImage2.
But if I run the code, it returns 0 and exits. What am I doing wrong or is there an alternative way to denoise an image? (It should be fast, because I am processing a video)
Since you are using C++, you are not providing the full set of arguments. Try it this way:
cv::fastNlMeansDenoisingColored(contourImage, contourImage2, 10, 10, 7, 21);
// The original function, as documented, is:
// cv::fastNlMeansDenoising(src[, dst[, h[, templateWindowSize[, searchWindowSize]]]]) → dst
Parameters:
src – Input 8-bit 1-channel, 2-channel or 3-channel image.
dst – Output image with the same size and type as src .
templateWindowSize – Size in pixels of the template patch that is used to compute weights. Should be odd. Recommended value 7 pixels.
searchWindowSize – Size in pixels of the window that is used to compute the weighted average for a given pixel. Should be odd. Affects performance linearly: a greater searchWindowSize means greater denoising time. Recommended value 21 pixels.
h – Parameter regulating filter strength. A big h value perfectly removes noise but also removes image details; a smaller h value preserves details but also preserves some noise.
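For a video, a rough sketch of applying it to every frame with the Python bindings (the parameter values simply mirror the ones above, and the file name is a placeholder; expect it to be slow, since non-local means denoising is expensive):

import cv2

# rough sketch: denoise every frame of a video with non-local means
# ("video.avi" is a placeholder; parameters mirror the values above)
cap = cv2.VideoCapture("video.avi")
while True:
    ok, frame = cap.read()
    if not ok:
        break
    denoised = cv2.fastNlMeansDenoisingColored(frame, None, 10, 10, 7, 21)
    cv2.imshow("denoised", denoised)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()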
I have a very basic question about frame capturing using OpenCV. My code looks like this:
VideoCapture cap(0);
cv::Mat mat;
int i = 0;
while (cap.read(mat) == true) {
    // some code here
    i = i + 1;
}
It works well. However, when I look at the logcat logs from OpenCV, it says
FRAMES Received 225, grabbed 123.
and this grabbed count (123) usually matches the variable 'i' (the number of loop iterations) in my code.
Ideally, my code should be able to read all received frames, shouldn't it? Can someone explain this behavior?
Calling cap.read(mat) takes a certain amount of time, as it has to grab a frame from the video feed, decode it, and convert it to the cv::Mat format. This amount of time appears to be greater than the interval between the frames the camera delivers. You can determine the frames per second of the video capture with the following:
double frames_per_second = cap.get(CV_CAP_PROP_FPS);
Try timing how long your cap.read(mat) call takes, and see whether the ratio of frames received to frames grabbed matches the ratio between the time cap.read(mat) takes to execute and the capture interval (1/frames_per_second).
Source:
http://opencv-srf.blogspot.ca/2011/09/capturing-images-videos.html
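A small sketch of that timing experiment with the Python bindings (cv2.CAP_PROP_FPS is the modern name for the property; the camera index 0 is just an example):

import time
import cv2

# time each cap.read() call and compare it to the capture interval (1 / FPS)
cap = cv2.VideoCapture(0)
frames_per_second = cap.get(cv2.CAP_PROP_FPS)    # CV_CAP_PROP_FPS in the old API
capture_interval = 1.0 / frames_per_second if frames_per_second > 0 else float("nan")

for _ in range(100):
    start = time.perf_counter()
    ok, mat = cap.read()
    elapsed = time.perf_counter() - start
    if not ok:
        break
    print("read took %.1f ms, capture interval is %.1f ms" %
          (elapsed * 1000, capture_interval * 1000))

cap.release()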