I have several videos, about 5 minutes each, along with annotation data containing each object's bounding-box coordinates at each frame number.
I am trying to read the videos and draw lines between the centers of the bounding boxes frame by frame (whenever the current frame number matches a frame number from the ground-truth data). I don't want to do this in one batch pass at the end; updating every 30 or 60 frames would work for me.
Here is my code:
VideoCapture capture(path_video);
if (!capture.isOpened()) {
    cout << "Failed to capture frame/open the file" << "\n";
    return 0;
}
bool stop(false);
while (!stop) {
    capture >> frame;
    if (frame.empty()) {
        break;
    }
    double rate = capture.get(CV_CAP_PROP_FPS);
    int delay = 1000 / rate;
    frmNum = capture.get(CV_CAP_PROP_POS_FRAMES);
    // db is a vector of vectors holding each object's annotation data in its own
    // inner vector, sorted ascending by the objects' start frame numbers.
    for (int i = 0; i < db.size(); i++) {
        if (!db[i].empty()) {
            for (int j = 1; j < db[i].size(); j++) {
                if (frmNum == db[i][j-1].currFrame) {
                    cv::line(frame, db[i][j-1].pnt, db[i][j].pnt, Scalar(255,0,0), 2);
                }
                else {
                    break;
                }
            }
        }
    }
    imshow("Video", frame);
    int key = waitKey(delay);
    if (key == 27) {
        break;
    }
}
I checked, and my if condition does become true, but no line is drawn on the video. I guess I don't see the lines because the frames keep changing and the drawn lines are cleared by the new frames, but I couldn't come up with an alternative way. Thanks for your help.
If you want to draw the annotations on every frame and you don't mind that the information may be "obsolete" in some cases, I would do the following:
have a global image variable called "overlay", holding the most up-to-date (with regard to the current frame number) rendering of the annotations, with the same size as a single frame of the video stream
maintain an array of indices called "last_object_annotation_idx", storing for each object the index of its last annotation already seen/used (initially set to -1 for each object)
in each iteration of the main loop, update the "last_object_annotation_idx" array (that is, for each object check whether the current frame number matches the "currFrame" field of its next annotation - this uses the fact that the annotations are sorted)
if the array changed, redraw the overlay image from the annotations referenced by "last_object_annotation_idx"
finally, add the overlay image to the frame and display the result, as in the sketch below.
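In code, a rough sketch of this could look as follows (reusing the db, frmNum, and frame names from the question, and assuming each annotation carries the currFrame and pnt fields shown there):

cv::Mat overlay; // persists across iterations of the main loop
std::vector<int> last_object_annotation_idx(db.size(), -1);

// inside the main loop, after reading 'frame' and computing 'frmNum':
if (overlay.empty())
    overlay = cv::Mat::zeros(frame.size(), frame.type());
bool changed = false;
for (int i = 0; i < (int)db.size(); i++) {
    int next = last_object_annotation_idx[i] + 1;
    if (next < (int)db[i].size() && frmNum == db[i][next].currFrame) {
        last_object_annotation_idx[i] = next; // consume the next annotation
        changed = true;
    }
}
if (changed) {
    overlay = cv::Scalar(0, 0, 0); // clear, then redraw from the stored indices
    for (int i = 0; i < (int)db.size(); i++)
        for (int j = 1; j <= last_object_annotation_idx[i]; j++)
            cv::line(overlay, db[i][j-1].pnt, db[i][j].pnt, cv::Scalar(255, 0, 0), 2);
}
frame += overlay; // black overlay pixels leave the frame unchanged
imshow("Video", frame);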
Also, in the if statement
if (frmNum == db[i][j-1].currFrame) {
    cv::line(frame, db[i][j-1].pnt, db[i][j].pnt, Scalar(255,0,0), 2);
}
else {
    break;
}
isn't the "break" kind of wrong? It means you will break out of the loop if the first check fails, so you will not see anything past the first index in that case.
The problem is the following:
On the one hand, I populate a vector with all the frames of a video. I check that the frames are actually those of the video, in the right order: YES.
On the other hand, when I exit the while loop and read the vector's contents with a for loop, it behaves as if every single frame were equal to the last frame of the video (a vector of repeated images).
I narrowed the problem down and found these facts:
Every image is correctly read and stored in frame_temp (each frame is retrieved, from the beginning of the video until the end).
imshow(frames[i]) shows the correct image inside the while loop.
imshow(frames[i]) shows only the last frame of the video inside the for loop, whatever the value of "i" is.
I have tried doing it with vector.push_back, but the result is the same.
void GetFramesFromVideo(String filepath, vector<Mat>& frames)
{
    Mat frame_temp;
    VideoCapture cap = VideoCapture(filepath);
    int videosize = cap.get(7); // 7 == CV_CAP_PROP_FRAME_COUNT
    frames = vector<Mat>(videosize);
    bool success = cap.read(frame_temp);
    frames[0] = frame_temp;
    namedWindow("he");
    int i = 1;
    while (success)
    {
        success = cap.read(frame_temp);
        if (success)
        {
            frames[i] = frame_temp;
            i++;
        }
        imshow("he", frames[i-1]);
        waitKey(10);
        cout << "Read a new frame: " << success;
    }
    for (int i = 0; i < frames.size(); i++)
    {
        imshow("he", frames[i]);
        waitKey(10);
    }
}
The imshow("he") inside the while-loop reproduces the video - each frames[i] is an actual frame of the video.
The imshow("he") inside the for-loop repeats the last frame of the video again, and again. The frames vector seems to be populated with duplicates of the last "push_backed" frame.
cv::Mat is a type which has reference semantics, not value semantics. In other words, you can think of it as a smart pointer to a matrix.
The documentation for operator=(const Mat&) says:
Matrix assignment is an O(1) operation. This means that no data is copied but the data is shared and the reference counter, if any, is incremented.
So your problem is simple: you are always writing to frame_temp, which is the only actual data you have, and then you are storing references to that data repeatedly in your vector<Mat>. You need to create a new Mat each time.
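In the function above, that means cloning each frame before storing it, so that every element owns its own pixel data (clone() performs a deep copy):

frames[i] = frame_temp.clone();        // deep copy instead of a shared header
// or, with push_back:
frames.push_back(frame_temp.clone());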
I want to count the vehicles in a video. After frame differencing, I get a grayscale, almost binary image. I have defined a Region of Interest to work on a specific area of each frame; the pixels of vehicles passing through the Region of Interest have values above 0, often above 40 or 50, because they are white.
My idea is that when a certain number of pixels are white within a specific interval of time (say 1-2 seconds), a vehicle must be passing, so I will increment the counter.
What I want is to check whether white pixels are still arriving after 1-2 seconds. If no more white pixels arrive, it means the vehicle has passed and the next vehicle is about to come; at that point the counter must be incremented.
One method that came to my mind is to count the frames of the video and store the count in a variable called No_of_frames. Using that variable, I think I can estimate the time passed: if the value of No_of_frames is greater than, let's say, 20, nearly 1 second has passed, given that my videos' frame rate is 25-30 fps.
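In sketch form, that timing idea is just the following (assuming a cv::VideoCapture named capture, as in the first question; No_of_frames is the counter from the paragraph above):

double fps = capture.get(CV_CAP_PROP_FPS);    // e.g. 25-30 for these videos
double elapsed_seconds = No_of_frames / fps;  // frames counted so far -> seconds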
I am using Qt Creator on Windows 7 with OpenCV 2.3.1.
My code is something like:
for (int i = 0; i < matFrame.rows; i++)
{
    for (int j = 0; j < matFrame.cols; j++)
        if (matFrame.at<uchar>(i,j) > 100) // values of pixels greater than 100
                                           // will be considered as white
        {
            whitePixels++;
        }
    if () // here I want to use time. The 'if' statement must be like:
          // if (total_no._of_whitepixels > 100 && no_white_pixel_came_after 2secs)
          // which means that a vehicle has just passed, so increment the counter.
    {
        counter++;
    }
}
Any other idea for counting the vehicles, better than mine, will be most welcome. Thanks in advance.
For background segmentation I am using the following algorithm, but it is very slow and I don't know why. The whole code is as follows:
// opencv2/video/background_segm.hpp OpenCV header file must be included.
IplImage* tmp_frame = NULL;
CvCapture* cap = NULL;
bool update_bg_model = true;

Mat element = getStructuringElement(0, Size(2, 2), Point());
Mat eroded_frame;
Mat before_erode;

if (argc > 2)
    cap = cvCaptureFromCAM(0);
else
    // cap = cvCreateFileCapture( "C:\\4.avi" );
    cap = cvCreateFileCapture("C:\\traffic2.mp4");

if (!cap)
{
    printf("can not open camera or video file\n");
    return -1;
}

tmp_frame = cvQueryFrame(cap);
if (!tmp_frame)
{
    printf("can not read data from the video source\n");
    return -1;
}

cvNamedWindow("BackGround", 1);
cvNamedWindow("ForeGround", 1);

CvBGStatModel* bg_model = 0;
for (int fr = 1; tmp_frame; tmp_frame = cvQueryFrame(cap), fr++)
{
    if (!bg_model)
    {
        // create the BG model
        bg_model = cvCreateGaussianBGModel(tmp_frame);
        // bg_model = cvCreateFGDStatModel( temp );
        continue;
    }
    double t = (double)cvGetTickCount();
    cvUpdateBGStatModel(tmp_frame, bg_model, update_bg_model ? -1 : 0);
    t = (double)cvGetTickCount() - t;
    printf("%d. %.1f\n", fr, t / (cvGetTickFrequency() * 1000.));

    before_erode = bg_model->foreground;
    cv::erode((Mat)bg_model->background, (Mat)bg_model->foreground, element);
    //eroded_frame = bg_model->foreground;
    //frame = (IplImage*)erode_frame.data;

    cvShowImage("BackGround", bg_model->background);
    cvShowImage("ForeGround", bg_model->foreground);

    char k = cvWaitKey(5);
    if (k == 27) break;
    if (k == ' ')
    {
        update_bg_model = !update_bg_model;
        if (update_bg_model)
            printf("Background update is on\n");
        else
            printf("Background update is off\n");
    }
}

cvReleaseBGStatModel(&bg_model);
cvReleaseCapture(&cap);
return 0;
A great deal of research has been done on vehicle tracking and counting. The approach you describe appears to be quite fragile, and is unlikely to be robust or accurate. The main issue is using a count of pixels above a certain threshold, without regard for their spatial connectivity or temporal relation.
Frame differencing can be useful for separating a moving object from its background, provided the object of interest is the only (or largest) moving object.
What you really need is to first identify the object of interest, segment it from the background, and track it over time using an adaptive filter (such as a Kalman filter). Have a look at the OpenCV video reference. OpenCV provides background subtraction and object segmentation to do all the required steps.
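As a hedged sketch of that pipeline (BackgroundSubtractorMOG2 is one of the background subtractors in OpenCV's 2.x C++ API; the area threshold below is an arbitrary assumption, and the tracking/Kalman filtering step is left out):

cv::VideoCapture cap("traffic2.mp4");
cv::BackgroundSubtractorMOG2 bg;              // Gaussian-mixture background model
cv::Mat frame, fgmask;
while (cap.read(frame)) {
    bg(frame, fgmask);                        // update the model, get the foreground mask
    cv::erode(fgmask, fgmask, cv::Mat());     // clean up isolated noise pixels
    cv::dilate(fgmask, fgmask, cv::Mat());
    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(fgmask, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
    int blobs = 0;
    for (size_t c = 0; c < contours.size(); c++)
        if (cv::contourArea(contours[c]) > 500)  // assumed minimum vehicle area
            blobs++;
    // 'blobs' are candidate vehicles in this frame; tracking across frames
    // is still needed to turn them into a reliable count.
}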
I suggest you read up on OpenCV - Learning OpenCV is a great read. And also on more general computer vision algorithms and theory - http://homepages.inf.ed.ac.uk/rbf/CVonline/books.htm has a good list.
Normally they just put a small pneumatic pipe across the road (a soft pipe semi-filled with air), attached to a simple counter. Each vehicle passing over the pipe generates two pulses (first the front wheels, then the rear). The counter records the number of pulses in specified time intervals and divides by 2 to get the approximate vehicle count.
Using OpenCV (HighGUI.h), I have this code in main:
int frame_count = 0;
while ((frame = cvQueryFrame(capture)) != NULL && (frame_count <= max_frame)) {
    // If the current frame is within the bounds min_frame and max_frame,
    // write it to the new video; else do nothing.
    if (frame_count >= min_frame) {
        cvWriteFrame(writer, frame);
    }
    frame_count++;
}
This queries capture for each frame and only writes it if the frame count is within the bounds min_frame and max_frame, both integers. The code works great, but it is not efficient enough for very large videos, because it still has to query capture for every frame leading up to min_frame without writing them. I am wondering if there is a way to make cvQueryFrame start at min_frame instead of having to work its way up to it. It already stops when it reaches max_frame, since I moved
frame_count <= max_frame
into the while conditional. Any suggestions?
You can set the next frame to read by means of
cvSetCaptureProperty(capture, CV_CAP_PROP_POS_FRAMES, min_frame);
of course, this works only for videos and not for cameras. The next frame read will be min_frame, and reading continues normally from there (the frames after min_frame will be read) unless you set the frame position again. So it is a perfect fit: you only need to call it once at the start.
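Applied to the loop from the question, the whole thing could look like this (a sketch using the same capture, writer, min_frame, and max_frame):

cvSetCaptureProperty(capture, CV_CAP_PROP_POS_FRAMES, min_frame); // seek first
int frame_count = min_frame;
while ((frame = cvQueryFrame(capture)) != NULL && frame_count <= max_frame) {
    cvWriteFrame(writer, frame); // every queried frame is now in range
    frame_count++;
}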
Let's say I have 4 images and I want to use them to animate a character; the 4 images represent the character walking. I want the animation to repeat itself as long as I hold the movement key, and to stop as soon as I release it. It doesn't need to be SFML-specific if you don't know SFML; basic theory would really help me.
Thank you.
You may want some simple kind of state machine. When the key is down (see sf::Input's IsKeyDown method), have the character in the "animated" state. When the key is not down, have the character in "not animated" state. Of course, you could always skip having this "state" and just do what I mention below (depending on exactly what you're doing).
Then, if the character is in the "animated" state, get the next "image" (see the next paragraph for more details on that). For example, if you have your images stored in a simple 4 element array, the next image would be at (currentIndex + 1) % ARRAY_SIZE. Depending on what you are doing, you may want to store your image frames in a more sophisticated data structure. If the character is not in the "animated" state, then you wouldn't do any updating here.
If your "4 images" are within the same image file, you can use the sf::Sprite's SetSubRect method to change the portion of the image displayed. If you actually have 4 different images, then you probably would need to use the sf::Sprite's SetImage method to switch the images out.
How would you enforce a framerate so that the animation doesn't happen too quickly?
Please see my answer here:
https://stackoverflow.com/a/52656103/3624674
You need to supply a duration per frame and use the total elapsed progress to step through to the current frame.
In the Animation source file:
class Animation {
    std::vector<Frame> frames;
    double totalLength;
    double totalProgress;
    sf::Sprite* target;
public:
    Animation(sf::Sprite& target) {
        this->target = &target;
        totalLength = 0.0;
        totalProgress = 0.0;
    }
    void addFrame(Frame& frame) {
        totalLength += frame.duration;   // accumulate before moving the frame
        frames.push_back(std::move(frame));
    }
    void update(double elapsed) {
        // increase the total progress of the animation
        totalProgress += elapsed;
        // use the progress as a countdown: the frame where it reaches 0 is current
        double progress = totalProgress;
        for (const auto& frame : frames) {
            progress -= frame.duration;
            // when progress hits 0, or we are on the last frame in the list, stop
            if (progress <= 0.0 || &frame == &frames.back()) {
                target->setTextureRect(frame.rect);
                break; // we found our frame
            }
        }
    }
};
To stop when you release the key, simply animate only while the key is held:
if (isKeyPressed) {
    animation.update(elapsed);
}
To support multiple animations for different situations, have a boolean for each state:
bool isWalking, isJumping, isAttacking;
...
if (isJumping && !isWalking && !isAttacking) {
    jumpAnimation.update(elapsed);
} else if (isWalking && !isAttacking) {
    walkAnimation.update(elapsed);
} else if (isAttacking) {
    attackAnimation.update(elapsed);
}
...
// now check for keyboard presses
if (jumpkeyPressed) { isJumping = true; } else { isJumping = false; }
Maybe not really big, but a hundred frames or something. Is the only way to load them in to make an array and load each image individually?
load_image() is a function I made which loads the images and converts their BPP.
expl[0] = load_image( "explode1.gif" );
expl[1] = load_image( "explode2.gif" );
expl[2] = load_image( "explode3.gif" );
expl[3] = load_image( "explode4.gif" );
...
expl[99] = load_image( "explode100.gif" );
Seems like there should be a better way... at least I hope.
A common technique is spritesheets, in which a single, large image is divided into a grid of cells, with each cell containing one frame of an animation. Often, all animation frames for any game entity are placed on a single, sometimes huge, sprite sheet.
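As a sketch of the cell arithmetic (the column count and cell size are assumptions about your sheet's layout):

// locate frame i on a grid spritesheet with fixed-size cells
int col = i % sheetCols;   // which column the cell is in
int row = i / sheetCols;   // which row the cell is in
int x = col * frameW;      // left edge of the cell, in sheet pixels
int y = row * frameH;      // top edge of the cell
// draw the frameW x frameH region at (x, y) of the sheet as frame i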
Maybe simplify your loading with a utility function that builds the filename for each iteration of a loop:
void LoadAnimation(char* isFileBase, int numFrames)
{
    char szFileName[255];
    for (int i = 0; i < numFrames; i++)
    {
        // append the frame number and .gif to the file base to get the filename;
        // the files in the question start at 1 (explode1.gif), hence i + 1
        sprintf(szFileName, "%s%d.gif", isFileBase, i + 1);
        expl[i] = load_image(szFileName);
    }
}
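Called with the base name from the question:

LoadAnimation("explode", 100); // fills expl[0] .. expl[99] from explode1.gif .. explode100.gif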
Instead of loading the frames as a grid, stack them all in one vertical strip (in the same image). Then you only need to know how many rows each frame takes, and you can set a pointer to the frame's row offset. You still end up with contiguous scan lines that can be displayed directly or trivially chewed off into separate images.
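A sketch of the offset arithmetic, assuming a raw pixel buffer strip_pixels with a known pitch (bytes per scan line):

// frame i of the vertical strip: each frame is frameH rows tall
unsigned char* frame_ptr = strip_pixels + (size_t)i * frameH * pitch;
// frame_ptr now points at frameH contiguous scan lines of 'pitch' bytes each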