Masking a blob from a binary image - c++

I am doing motion recognition of walking using OpenCV and C++, and I would like to create a mask or copied image in order to achieve the effect seen in the picture provided. The following is an explanation of the images.
The resulting blob of the human walking is seen. Then, a mask image or copied image of the original frame is created, the binary human blob is now masked and the non-masked pixels are now set to zero. The result is the extracted human body with a black background. The diagram below shows how the human blob is extracted and then masked.
This is to be done for every 5th frame of a video sequence. My code so far consists of getting every 5th frame, grayscaling it, finding the areas of all the blobs, and applying a threshold value to get a binary image where more or less, only the human blob is white and the rest of the image is black. Now, I am trying to extract the human body but I have no clue how to proceed. Please help me.
#include "cv.h"
#include "highgui.h"
#include <iostream>
using namespace std;
int main( int argc, char** argv ) {
    CvCapture *capture = NULL;
    capture = cvCaptureFromAVI("C:\\walking\\lady walking.avi");
    if( !capture ) {
        return -1; // The video could not be opened
    }
    IplImage* color_frame = NULL;
    IplImage* gray_frame = NULL;
    int thresh_frame = 28;
    CvMoments moments;
    int frameCount = 0; // Counts frames so that only every 5th one is processed
    cvNamedWindow( "walking", CV_WINDOW_AUTOSIZE );
    while(1) {
        color_frame = cvQueryFrame( capture ); // Grabs the frame from a file
        if( !color_frame ) break; // If the frame does not exist, quit the loop
        frameCount++;
        if( frameCount % 5 != 0 ) continue; // Skip all but every 5th frame
        gray_frame = cvCreateImage(cvSize(color_frame->width, color_frame->height), color_frame->depth, 1);
        cvCvtColor(color_frame, gray_frame, CV_BGR2GRAY);
        cvThreshold(gray_frame, gray_frame, thresh_frame, 255, CV_THRESH_BINARY);
        cvErode(gray_frame, gray_frame, NULL, 1);
        cvDilate(gray_frame, gray_frame, NULL, 1);
        cvMoments(gray_frame, &moments, 1);
        double m00 = cvGetCentralMoment(&moments, 0, 0);
        cout << "Area - : " << m00 << endl;
        //area of lady walking = 39696. Therefore, using new threshold area as 30 for this video
        //area of walking man = 67929
        cvShowImage("walking", gray_frame);
        char c = cvWaitKey(33);
        cvReleaseImage( &gray_frame ); // Free the per-frame image to avoid a leak
        if( c == 27 ) break;
    }
    cvReleaseCapture( &capture );
    cvDestroyWindow( "walking" );
    return 0;
}
I would also like to upload the video that I am using in the code but I don't know how to upload it here, so if anyone can help me out with that too. I want to provide as much info as possible w.r.t. my question.

The easiest way is to look for the biggest blob in the image (cvFindContours is probably the function you need), then set all the other blobs to black (by scanning all the contours and using cvFloodFill).
Finally, scan the entire binary image: if the pixel is white, do nothing; if the pixel is black, set the corresponding pixel of the 5th frame to black.
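The two steps described above (keep only the biggest blob, then black out everything outside it) can be sketched without any library. This is only an illustration of the logic, not OpenCV code: the Gray type and the functions keepLargestBlob and applyMask are hypothetical names, the blob search uses a plain breadth-first flood fill with 4-connectivity, and the frame is assumed to be single-channel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical single-channel 8-bit image, stored row by row.
struct Gray {
    int width;
    int height;
    std::vector<std::uint8_t> px;
    std::uint8_t at(int x, int y) const {
        return px[static_cast<std::size_t>(y) * width + x];
    }
};

// Returns a mask in which only the largest 4-connected white region is 255.
Gray keepLargestBlob(const Gray& binary) {
    const std::size_t n = binary.px.size();
    std::vector<int> label(n, 0);         // 0 means "not labelled yet"
    std::vector<std::size_t> area(1, 0);  // area[k] is the pixel count of label k
    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    for (int y = 0; y < binary.height; ++y) {
        for (int x = 0; x < binary.width; ++x) {
            const std::size_t start = static_cast<std::size_t>(y) * binary.width + x;
            if (binary.px[start] == 0 || label[start] != 0) continue;
            const int id = static_cast<int>(area.size());
            area.push_back(0);
            std::queue<std::pair<int, int>> todo;  // breadth-first flood fill
            todo.push({x, y});
            label[start] = id;
            while (!todo.empty()) {
                const std::pair<int, int> p = todo.front();
                todo.pop();
                ++area[id];
                for (int k = 0; k < 4; ++k) {
                    const int nx = p.first + dx[k];
                    const int ny = p.second + dy[k];
                    if (nx < 0 || ny < 0 || nx >= binary.width || ny >= binary.height) continue;
                    const std::size_t i = static_cast<std::size_t>(ny) * binary.width + nx;
                    if (binary.px[i] != 0 && label[i] == 0) {
                        label[i] = id;
                        todo.push({nx, ny});
                    }
                }
            }
        }
    }
    int best = 0;  // stays 0 when the image has no white pixel at all
    for (int k = 1; k < static_cast<int>(area.size()); ++k) {
        if (area[k] > area[best]) best = k;
    }
    Gray mask{binary.width, binary.height, std::vector<std::uint8_t>(n, 0)};
    for (std::size_t i = 0; i < n; ++i) {
        if (best != 0 && label[i] == best) mask.px[i] = 255;
    }
    return mask;
}

// Sets every frame pixel to black where the mask is black.
void applyMask(Gray& frame, const Gray& mask) {
    assert(frame.px.size() == mask.px.size());
    for (std::size_t i = 0; i < frame.px.size(); ++i) {
        if (mask.px[i] == 0) frame.px[i] = 0;
    }
}
```

Calling keepLargestBlob on the thresholded frame and then applyMask on a copy of the grayscale frame leaves the walker on a black background, provided the walker really is the largest white region.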


Stitching Video with fast playback of frames

I'm attempting to stitch two videos together by matching their keypoints and finding the homography between the overlapping videos. I have successfully got this to work with two different images.
With the video I have loaded the two separate video files and looped the frames and copied them to the blank matrix cap1frame and cap2frame for each video.
Then I send each frame from each video to the stitching function which matches the keypoints based on the homography between the two frames and stitch them and display the resultant image. (matching based on openCV example)
The stitching is successful however, it results in a very slow playback of the video and some sort of graphical anomalies on the side of the frame. Seen in the photo.
I'm wondering how I can make this more efficient with fast video playback.
int main(int argc, char** argv){
    // Create a VideoCapture object and open the input file
    VideoCapture cap1("");
    VideoCapture cap2("");
    // Check if the videos opened successfully
    if(!cap1.isOpened() || !cap2.isOpened()){
        cout << "Error opening video stream or file" << endl;
        return -1;
    }
    //Trying to loop frames
    for (;;){
        Mat cap1frame;
        Mat cap2frame;
        cap1 >> cap1frame;
        cap2 >> cap2frame;
        // If the frame is empty, break immediately
        if (cap1frame.empty() || cap2frame.empty())
            break;
        //sending each frame from each video to the stitch function then displaying
        imshow( "Result", Stitching(cap1frame,cap2frame));
        if(waitKey(30) >= 0) break;
        // waitKey(0);
    }
    return 0;
}
I was able to resolve my issue by pre-calculating the homography with just the first frame of video. This is so the function was only called once.
I then looped through the rest of the video to apply the warping of the video frames so they could be stitched together based on the pre-calculated homography. This bit was initially within my stitching function.
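To illustrate why reusing one homography is cheap, here is a library-free sketch of what a 3x3 homography does to a single coordinate. Mat3, Point2 and applyHomography are hypothetical names used only for this example. Once the matrix is fixed, each frame needs only this kind of arithmetic per pixel, with no keypoint detection or matching.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical 3x3 matrix and 2D point types for the illustration.
using Mat3 = std::array<std::array<double, 3>, 3>;
struct Point2 {
    double x;
    double y;
};

// Maps p through the homography H using homogeneous coordinates.
Point2 applyHomography(const Mat3& H, Point2 p) {
    const double xh = H[0][0] * p.x + H[0][1] * p.y + H[0][2];
    const double yh = H[1][0] * p.x + H[1][1] * p.y + H[1][2];
    const double w  = H[2][0] * p.x + H[2][1] * p.y + H[2][2];
    assert(std::fabs(w) > 1e-12);  // a point mapped to infinity cannot be drawn
    return {xh / w, yh / w};       // perspective divide
}
```

A pure translation matrix, for example, moves the point (10, 10) by the offsets stored in the last column, and scaling the whole matrix by a constant leaves the result unchanged because of the final division.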
I still had an issue at this point with playback still being really slow when calling imshow. But I decided to export the resultant video and this worked when the correct fps was set in the VideoWriter object. I wonder if I just needed to adjust the fps playback of imshow but I'm not sure on that bit.
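On the playback question: imshow itself has no frame-rate setting, so the display speed is determined by the delay passed to waitKey plus the time spent processing each frame. A minimal sketch of choosing that delay follows; frameDelayMs is a hypothetical helper, and the processing time is assumed to have been measured by the caller.

```cpp
#include <algorithm>
#include <cmath>

// Milliseconds to wait after showing a frame so that playback runs at roughly
// `fps`, given that producing the frame already took `processingMs`.
// Never returns less than 1, so a waitKey-style call still handles window events.
int frameDelayMs(double fps, double processingMs) {
    if (fps <= 0.0) return 1;
    const double budget = 1000.0 / fps - processingMs;
    return std::max(1, static_cast<int>(std::lround(budget)));
}
```

If the warp alone takes longer than the frame budget, no delay value can restore real-time playback, which would be consistent with the exported video playing correctly while the live preview stays slow.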
I've got my full code below:
#include <stdio.h>
#include <iostream>
#include "opencv2/core.hpp"
#include "opencv2/features2d.hpp"
#include "opencv2/highgui.hpp"
#include "opencv2/calib3d.hpp"
#include "opencv2/xfeatures2d.hpp"
#include <opencv2/xfeatures2d/nonfree.hpp>
#include <opencv2/xfeatures2d/cuda.hpp>
#include <opencv2/opencv.hpp>
#include <vector>
using namespace cv;
using namespace std;
//To get homography from images passed in. Matching points in the images.
Mat Stitching(Mat image1,Mat image2){
    Mat I_1 = image1;
    Mat I_2 = image2;
    //based on
    cv::Ptr<Feature2D> f2d = xfeatures2d::SIFT::create();
    // Step 1: Detect the keypoints:
    std::vector<KeyPoint> keypoints_1, keypoints_2;
    f2d->detect( I_1, keypoints_1 );
    f2d->detect( I_2, keypoints_2 );
    // Step 2: Calculate descriptors (feature vectors)
    Mat descriptors_1, descriptors_2;
    f2d->compute( I_1, keypoints_1, descriptors_1 );
    f2d->compute( I_2, keypoints_2, descriptors_2 );
    // Step 3: Matching descriptor vectors using BFMatcher :
    BFMatcher matcher;
    std::vector< DMatch > matches;
    matcher.match( descriptors_1, descriptors_2, matches );
    // Keep best matches only to have a nice drawing.
    // We sort distance between descriptor matches
    Mat index;
    int nbMatch = int(matches.size());
    Mat tab(nbMatch, 1, CV_32F);
    for (int i = 0; i < nbMatch; i++)
        tab.at<float>(i, 0) = matches[i].distance;
    sortIdx(tab, index, SORT_EVERY_COLUMN + SORT_ASCENDING);
    vector<DMatch> bestMatches;
    // Keep at most the 200 closest matches
    for (int i = 0; i < 200 && i < nbMatch; i++)
        bestMatches.push_back(matches[index.at<int>(i, 0)]);
    // 1st image is the destination image and the 2nd image is the src image
    std::vector<Point2f> dst_pts; //1st
    std::vector<Point2f> source_pts; //2nd
    for (vector<DMatch>::iterator it = bestMatches.begin(); it != bestMatches.end(); ++it) {
        //cout << it->queryIdx << "\t" << it->trainIdx << "\t" << it->distance << "\n";
        //-- Get the keypoints from the good matches
        dst_pts.push_back( keypoints_1[ it->queryIdx ].pt );
        source_pts.push_back( keypoints_2[ it->trainIdx ].pt );
    }
    Mat H_12 = findHomography( source_pts, dst_pts, CV_RANSAC );
    return H_12;
}
int main(int argc, char** argv){
    //Mats to get the first frame of video and pass to Stitching function.
    Mat I1, h_I1;
    Mat I2, h_I2;
    // Create a VideoCapture object and open the input file
    VideoCapture cap1("");
    VideoCapture cap2("");
    //Check if the videos opened successfully
    if(!cap1.isOpened() || !cap2.isOpened()){
        cout << "Error opening video stream or file" << endl;
        return -1;
    }
    //passing first frame to Stitching function
    if (cap1.read(I1)){
        h_I1 = I1;
    }
    if (cap2.read(I2)){
        h_I2 = I2;
    }
    Mat homography;
    //passing here.
    homography = Stitching(h_I1,h_I2);
    std::cout << homography << '\n';
    //creating VideoWriter object with defined values.
    VideoWriter video("video/output.avi",CV_FOURCC('M','J','P','G'),30, Size(1280,720));
    //Looping through frames of both videos.
    for (;;){
        Mat cap1frame;
        Mat cap2frame;
        cap1 >> cap1frame;
        cap2 >> cap2frame;
        // If the frame is empty, break immediately
        if (cap1frame.empty() || cap2frame.empty())
            break;
        Mat warpImage2;
        //warping the second video cap2frame so it matches with the first one.
        //size is defined as the final video size
        warpPerspective(cap2frame, warpImage2, homography, Size(1280,720), INTER_CUBIC);
        //final is the final canvas where both videos will be warped onto.
        Mat final (Size(1280,720), CV_8UC3);
        //Mat final(Size(cap1frame.cols*2 + cap1frame.cols, cap1frame.rows*2),CV_8UC3);
        //Using roi getting the relevant areas of each video.
        Mat roi1(final, Rect(0, 0, cap1frame.cols, cap1frame.rows));
        Mat roi2(final, Rect(0, 0, warpImage2.cols, warpImage2.rows));
        //warping images on to the canvases which are linked with the final canvas.
        warpImage2.copyTo(roi2);
        cap1frame.copyTo(roi1);
        //writing to video.
        video.write(final);
        //imshow ("Result", final);
        if(waitKey(30) >= 0) break;
    }
    return 0;
}

OpenCV ROI on Real time camera

I am trying to set an ROI on a real-time camera feed and copy a picture into that ROI.
I have tried many methods from the Internet, but none of them has worked.
Part of my code is shown below:
libfreenect2::Frame *ir = frames[libfreenect2::Frame::Ir];
//! [loop start]
cv::Mat(ir->height, ir->width, CV_32FC1, ir->data).copyTo(irmat);
Mat img = imread("button.png");
cv::Rect r(1,1,100,200);
cv::Mat dstroi = img(Rect(0,0,r.width,r.height));
irmat(r).convertTo(dstroi, dstroi.type(), 1, 0);
cv::imshow("ir", irmat / 4500.0f);
int key = cv::waitKey(1);
protonect_shutdown = protonect_shutdown || (key > 0 && ((key & 0xFF) == 27));
My real-time camera shows the video normally, and the program runs without errors, but the picture is not shown in the ROI.
Does anyone have any ideas?
Any help is appreciated.
I hope I understood your question right and you want an output something like this:
I have created a rectangle of size 100x200 on the video feed and displaying an image in that rectangle.
Here is the code:
int main()
{
    Mat frame,overlayFrame;
    VideoCapture cap("video.avi");//use 0 for webcam
    if (!cap.isOpened())
    {
        cout << "Could not capture video";
        return -1;
    }
    overlayFrame = imread("button.png");//picture to display in the roi (file name taken from the question)
    Rect roi(1,1,100,200);//creating a rectangle of size 100x200 at point (1,1) on the videofeed
    while ((cap.get(CV_CAP_PROP_POS_FRAMES) + 1) < cap.get(CV_CAP_PROP_FRAME_COUNT))
    {
        cap.read(frame);//grabbing the next frame of the video
        resize(overlayFrame, overlayFrame, Size(roi.width, roi.height));//changing the size of the image to fit in the roi
        overlayFrame.copyTo(frame(roi));//copying the picture to the roi
        imshow("CameraFeed", frame);
        if (waitKey(27) >= 0)
            break;
    }
    return 0;
}
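The copy into the ROI only works when the picture fits inside the frame and both use the same pixel layout. Note that the question's irmat is CV_32FC1, whereas imread returns an 8-bit image by default, so a conversion would probably be needed there. As a rough, library-free sketch of the bounds check and the row-by-row copy (Image and pasteInto are hypothetical names, and both images are assumed to be single-channel 8-bit):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical single-channel 8-bit image, stored row by row.
struct Image {
    int width;
    int height;
    std::vector<std::uint8_t> px;
};

// Copies `patch` into `frame` with its top-left corner at (x0, y0).
// Returns false and leaves `frame` untouched when the patch does not fit.
bool pasteInto(Image& frame, const Image& patch, int x0, int y0) {
    assert(patch.px.size() == static_cast<std::size_t>(patch.width) * patch.height);
    if (x0 < 0 || y0 < 0 ||
        x0 + patch.width > frame.width || y0 + patch.height > frame.height) {
        return false;
    }
    for (int y = 0; y < patch.height; ++y) {
        for (int x = 0; x < patch.width; ++x) {
            const std::size_t dst = static_cast<std::size_t>(y0 + y) * frame.width + (x0 + x);
            const std::size_t src = static_cast<std::size_t>(y) * patch.width + x;
            frame.px[dst] = patch.px[src];
        }
    }
    return true;
}
```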

Writing video with openCV - no key frame set for track 0

I'm trying to modify and write some video using openCV using the following code:
cv::VideoCapture capture( video_filename );
// Check if the capture object successfully initialized
if ( !capture.isOpened() )
{
    printf( "Failed to load video, exiting.\n" );
    return -1;
}
cv::Mat frame, cropped_img;
int fourcc = static_cast<int>(capture.get(CV_CAP_PROP_FOURCC));
double fps = 30;
cv::Size frame_size( RADIUS, (int) 2*PI*RADIUS );
video_filename = "test.avi";
cv::VideoWriter writer( video_filename, fourcc, fps, frame_size );
if ( !writer.isOpened() && save )
{
    printf("Failed to initialize video writer, unable to save video!\n");
}
while ( true )
{
    if ( !capture.read(frame) )
    {
        printf("Failed to read next frame, exiting.\n");
        break;
    }
    // select the region of interest in the frame
    cropped_img = frame( ROI );
    // display the image and wait
    imshow("cropped", cropped_img);
    // if we are saving video, write the unwrapped image
    if (save)
    {
        writer.write( cropped_img );
    }
    char key = cv::waitKey(30);
}
When I try to run the output video 'test.avi' with VLC I get the following error: avidemux error: no key frame set for track 0. I'm using Ubuntu 13.04, and I've tried using videos encoded with MPEG-4 and libx264. I think the fix should be straightforward but can't find any guidance. The actual code is available at Thanks in advance!
[PYTHON] Apart from a resolution mismatch, there can also be a frames-per-second mismatch. In my case the resolution was set correctly, but the problem was the fps. The VideoCapture object reported that it was reading at 30.0 fps, but when I set the fps of the VideoWriter object to 30.0, VLC threw the same error. You can get around the error by setting it to 30 instead of 30.0.
P.S. You can check the resolution and the fps at which you are recording by using the cap.get(3) for width, cap.get(4) for height and cap.get(5) for fps inside the capturing while/for loop.
The full code is as follows:
import numpy as np
import cv2 as cv2
cap = cv2.VideoCapture(0)
#Define Codec and create a VideoWriter Object
fourcc = cv2.VideoWriter_fourcc('X','V','I','D')
#30.0 in the below line doesn't work while 30 does work.
out = cv2.VideoWriter('output.mp4', fourcc, 30, (640, 480))
while True:
    ret, frame = cap.read()
    if not ret:
        break
    colored_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2BGRA)
    print('Width = ', cap.get(3),' Height = ', cap.get(4),' fps = ', cap.get(5))
    out.write(colored_frame)
    cv2.imshow('frame', colored_frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
out.release()
cv2.destroyAllWindows()
The full documentation (C++) for what all properties can be checked is available here : propId OpenCV Documentation
This appears to be an issue of size mismatch between the frames written and the VideoWriter object opened. I was running into this issue when trying to write a series of resized images from my webcam into a video output. When I removed the resizing step and just grabbed the size from an initial test frame, everything worked perfectly.
To fix my resizing code, I essentially ran a single test frame through my processing and then pulled its size when creating the VideoWriter object:
#include <cassert>
#include <iostream>
#include <time.h>
#include "opencv2/opencv.hpp"
using namespace cv;
int main()
{
    VideoCapture cap(0);
    Mat testFrame;
    cap >> testFrame;
    Mat testDown;
    resize(testFrame, testDown, Size(), 0.5, 0.5, INTER_NEAREST);
    bool ret = imwrite("test.png", testDown);
    // Take the writer size from a frame that went through the same processing
    Size outSize = Size(testDown.cols, testDown.rows);
    VideoWriter outVid("test.avi", CV_FOURCC('M','P','4','2'),1,outSize,true);
    for (int i = 0; i < 10; ++i) {
        Mat frame;
        cap >> frame;
        std::cout << "Grabbed frame" << std::endl;
        Mat down;
        resize(frame, down, Size(), 0.5, 0.5, INTER_NEAREST);
        //bool ret = imwrite("test.png", down);
        outVid << down;
        std::cout << "Wrote frame" << std::endl;
        struct timespec tim, tim2;
        tim.tv_sec = 1;
        tim.tv_nsec = 0;
        nanosleep(&tim, &tim2);
    }
    return 0;
}
My guess is that your problem is in the size calculation:
cv::Size frame_size( RADIUS, (int) 2*PI*RADIUS );
I'm not sure where your frames are coming from (i.e. how the capture is set up), but likely in rounding or somewhere else your size gets messed up. I would suggest doing something similar to my solution above.
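One detail worth checking in that line: in `(int) 2*PI*RADIUS` the cast applies only to the literal 2, so the product is still floating-point (assuming PI is a floating-point constant) and is truncated, not rounded, when it is converted to the integer height. A small sketch of computing the size explicitly and comparing it with the frame actually written follows. FrameSize, unwrappedSize and sizesMatch are hypothetical names, and whether rounding or truncation matches the question's unwrapping code is an assumption that has to be checked against that code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical width/height pair for the illustration.
struct FrameSize {
    int width;
    int height;
};

// Size of the unwrapped strip for a circle of `radius` pixels:
// width = radius, height = circumference rounded to the nearest pixel.
FrameSize unwrappedSize(int radius) {
    assert(radius > 0);
    const double pi = std::acos(-1.0);
    const int height = static_cast<int>(std::lround(2.0 * pi * radius));
    return {radius, height};
}

// True when a frame has exactly the dimensions the writer was opened with.
bool sizesMatch(FrameSize frame, FrameSize writer) {
    return frame.width == writer.width && frame.height == writer.height;
}
```

Running sizesMatch on the first cropped frame before the loop starts would reveal a mismatch immediately, instead of producing a file that the player rejects.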

removing noise in a binary image using openCV

I read a video into Visual Studio using OpenCV, converted it to grayscale, and then used cvThreshold with CV_THRESH_BINARY to convert it into a binary image. However, there are holes and noise in the frames. What is a simple way to remove the noise or the holes? I have read up on the Erode and Dilate functions in OpenCV, but I am not too clear on how to use them. This is my code so far. If anyone can show me how to incorporate the noise removal into my code, it would be greatly appreciated.
#include "cv.h"
#include "highgui.h"
int main( int argc, char** argv ) {
    CvCapture *capture = NULL;
    capture = cvCaptureFromAVI("C:\\walking\\lady walking.avi");
    if( !capture ) {
        return -1; // The video could not be opened
    }
    IplImage* color_frame = NULL;
    IplImage* gray_frame = NULL;
    int thresh_frame = 70;
    int frameCount=0;//Counts every 5 frames
    cvNamedWindow( "Binary video", CV_WINDOW_AUTOSIZE );
    while(1) {
        color_frame = cvQueryFrame( capture );//Grabs the frame from a file
        if( !color_frame ) break;// If the frame does not exist, quit the loop
        gray_frame = cvCreateImage(cvSize(color_frame->width, color_frame->height), color_frame->depth, 1);
        cvCvtColor(color_frame, gray_frame, CV_BGR2GRAY);
        cvThreshold(gray_frame, gray_frame, thresh_frame, 255, CV_THRESH_BINARY);
        cvShowImage("Binary video", gray_frame);
        char c = cvWaitKey(33);
        cvReleaseImage( &gray_frame ); // Free the per-frame image to avoid a leak
        if( c == 27 ) break;
    }
    cvReleaseCapture( &capture );
    cvDestroyWindow( "Binary video" ); // Must match the name used in cvNamedWindow
    return 0;
}
DISCLAIMER: It is hard to give a good answer, because you provided very little info. If you posted your image before and after binarization, it would be much easier. However, I will try to give some hints.
If the holes are rather big, then the threshold value is probably wrong; try increasing or decreasing it and check the result. You can try
cv::threshold(gray_frame, gray_frame, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);
This will calculate threshold value automatically.
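For intuition about what an automatically chosen threshold means, here is a library-free sketch of Otsu's method, which picks the grey level that maximizes the between-class variance of the histogram. otsuThreshold is a hypothetical name; this is an illustration of the idea and is not claimed to reproduce OpenCV's implementation bit for bit.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the grey level t that maximizes the between-class variance when
// pixels <= t form the background and pixels > t form the foreground.
int otsuThreshold(const std::vector<std::uint8_t>& pixels) {
    std::array<std::size_t, 256> hist{};  // zero-initialized histogram
    for (std::uint8_t p : pixels) ++hist[p];

    const double total = static_cast<double>(pixels.size());
    double sumAll = 0.0;
    for (int v = 0; v < 256; ++v) sumAll += v * static_cast<double>(hist[v]);

    double sumBack = 0.0;     // sum of grey levels in the background class
    double weightBack = 0.0;  // number of pixels in the background class
    double bestVar = -1.0;
    int best = 0;
    for (int t = 0; t < 256; ++t) {
        weightBack += static_cast<double>(hist[t]);
        sumBack += t * static_cast<double>(hist[t]);
        if (weightBack == 0.0) continue;      // background still empty
        const double weightFore = total - weightBack;
        if (weightFore == 0.0) break;         // foreground became empty
        const double meanBack = sumBack / weightBack;
        const double meanFore = (sumAll - sumBack) / weightFore;
        const double diff = meanBack - meanFore;
        const double var = weightBack * weightFore * diff * diff;
        if (var > bestVar) {
            bestVar = var;
            best = t;
        }
    }
    return best;
}
```

For a frame whose pixels cluster around a dark and a bright value, the returned level falls between the two clusters, which is the behaviour the CV_THRESH_OTSU flag is meant to provide.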
If you cannot find a good thresholding value, then try some adaptive thresholding algorithms, opencv has adaptiveThreshold() function, but it's not so good.
If the holes and noise are rather small (few pixels each), you can try some of the following:
Use opening (erosion followed by dilation) to remove white noise, and closing (dilation followed by erosion) to remove small black noise. But remember that opening, while removing white noise, will also strengthen black noise, and vice versa.
Apply a median blur AFTER thresholding. It may remove small noise, both black and white, while preserving colors (the image will still be binary) and, with possibly small errors, shapes. Applying a median blur BEFORE binarization may also help reduce small noise.
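To make the opening and closing terminology concrete, here is a library-free sketch on a binary image with a 3x3 square neighbourhood. Binary, morph3x3, open3x3 and close3x3 are hypothetical names, and neighbours outside the image are simply ignored. With the C API used in the question, cvErode followed by cvDilate on the thresholded frame plays the role of open3x3.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical binary image: every pixel is 0 (black) or 255 (white).
struct Binary {
    int width;
    int height;
    std::vector<std::uint8_t> px;
};

// One 3x3 pass. Erosion keeps a pixel white only if all of its in-bounds
// neighbours are white; dilation makes it white if any neighbour is white.
Binary morph3x3(const Binary& in, bool erode) {
    assert(in.px.size() == static_cast<std::size_t>(in.width) * in.height);
    Binary out = in;
    for (int y = 0; y < in.height; ++y) {
        for (int x = 0; x < in.width; ++x) {
            bool allWhite = true;
            bool anyWhite = false;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int nx = x + dx;
                    const int ny = y + dy;
                    if (nx < 0 || ny < 0 || nx >= in.width || ny >= in.height) continue;
                    const bool white = in.px[static_cast<std::size_t>(ny) * in.width + nx] != 0;
                    allWhite = allWhite && white;
                    anyWhite = anyWhite || white;
                }
            }
            const bool keep = erode ? allWhite : anyWhite;
            out.px[static_cast<std::size_t>(y) * in.width + x] = keep ? 255 : 0;
        }
    }
    return out;
}

// Opening removes small white specks; closing fills small black holes.
Binary open3x3(const Binary& in)  { return morph3x3(morph3x3(in, true), false); }
Binary close3x3(const Binary& in) { return morph3x3(morph3x3(in, false), true); }
```

An isolated white pixel disappears under open3x3 while a solid 3x3 block survives, and a one-pixel hole inside such a block is filled by close3x3.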
You might try using a Smooth function with CV_MEDIAN before you do the thresholding.
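As an illustration of why a median filter suppresses isolated specks, here is a library-free 3x3 median sketch. median3x3 is a hypothetical name, the image is assumed to be single-channel 8-bit, and pixels near the border use only the neighbours that exist.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Replaces every pixel with the median of its in-bounds 3x3 neighbourhood.
std::vector<std::uint8_t> median3x3(const std::vector<std::uint8_t>& px,
                                    int width, int height) {
    assert(px.size() == static_cast<std::size_t>(width) * height);
    std::vector<std::uint8_t> out(px.size());
    std::vector<std::uint8_t> window;
    window.reserve(9);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            window.clear();
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int nx = x + dx;
                    const int ny = y + dy;
                    if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                    window.push_back(px[static_cast<std::size_t>(ny) * width + nx]);
                }
            }
            // Partially sort so that the middle element is the median.
            const auto mid = window.begin() + static_cast<std::ptrdiff_t>(window.size() / 2);
            std::nth_element(window.begin(), mid, window.end());
            out[static_cast<std::size_t>(y) * width + x] = *mid;
        }
    }
    return out;
}
```

A single white pixel on a black background is outvoted by its neighbours and vanishes, and the same holds for a single black pixel on a white background.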

Areas of objects using cvMoments

I am working on a motion recognition project of walking, involving OpenCV and C++. I have reached the stage in the algorithm where I am required to find the area of the human blob. I have loaded the video, converted it to grayscale and thresholded it to obtain a binary image with white regions showing the human walking in addition to other white regions. I need to find the area of each white region to determine the area of the human blob, since this region will have an area greater than that of the other white regions. Please look through my code and explain the output to me, because I am getting an area of 40872 and I do not know what this means. This is my code. I want to upload the video I used but I do not know how to. If someone can tell me how to upload the video I used, please do, because this is the only way I will be able to get help with this particular video. I really hope someone can help me.
#include "cv.h"
#include "highgui.h"
#include <iostream>
using namespace std;
int main( int argc, char** argv ) {
    CvCapture *capture = NULL;
    capture = cvCaptureFromAVI("C:\\walking\\lady walking.avi");
    if( !capture ) {
        return -1; // The video could not be opened
    }
    IplImage* color_frame = NULL;
    IplImage* gray_frame = NULL;
    int thresh_frame = 70;
    CvMoments moments;
    int frameCount=0;//Counts every 5 frames
    cvNamedWindow( "walking", CV_WINDOW_AUTOSIZE );
    while(1) {
        color_frame = cvQueryFrame( capture );//Grabs the frame from a file
        if( !color_frame ) break;// If the frame does not exist, quit the loop
        gray_frame = cvCreateImage(cvSize(color_frame->width, color_frame->height), color_frame->depth, 1);
        cvCvtColor(color_frame, gray_frame, CV_BGR2GRAY);
        cvThreshold(gray_frame, gray_frame, thresh_frame, 255, CV_THRESH_BINARY);
        cvErode(gray_frame, gray_frame, NULL, 1);
        cvDilate(gray_frame, gray_frame, NULL, 1);
        cvMoments(gray_frame, &moments, 1);
        double m00 = cvGetSpatialMoment(&moments, 0, 0);
        cout << "Area - : " << m00 << endl;
        cvShowImage("walking", gray_frame);
        char c = cvWaitKey(33);
        cvReleaseImage( &gray_frame ); // Free the per-frame image to avoid a leak
        if( c == 27 ) break;
    }
    cvReleaseCapture( &capture );
    cvDestroyWindow( "walking" );
    return 0;
}
The function cvGetSpatialMoment retrieves the spatial moment, which in case of image moments is defined as:
m_{pq} = \sum_{x} \sum_{y} x^{p} y^{q} I(x, y)
where I(x,y) is the intensity of the pixel (x, y).
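The spatial moment m00 is like the mass of an object. It contains no x, y information. The average x position is average(x) = sum(density(x_i)*x_i) / sum(density(x_i)) over all i's, which is m10 / m00; likewise average(y) = m01 / m00. I(x,y) is like the density function, but here it is the intensity of the pixel. If you don't want your result to change based on the lighting, you probably want to make the matrix a binary matrix. A pixel is either part of the object or not. Feeding in a greyscale image of the object will essentially convert the grey level to density as per the formula above.
For a binary image, or when the binary flag of cvMoments is set (as in your code, where the third argument is 1, so every non-zero pixel counts as 1):
Area = m00
so the number you are printing is the count of white pixels in the whole frame, summed over all white regions and not just the human blob. To get the area of each region separately, you have to compute the moments per region (for example per contour).
For a greyscale image, m00 is basically the sum of the grey levels over all the pixels in the image, with no spatial meaning of its own; it is the value you divide m10 and m01 by to "normalize" them into a centroid.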
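A small, library-free sketch of the raw moments may make the numbers easier to interpret. RawMoments and rawMoments are hypothetical names. With the binary option every non-zero pixel counts as 1, so m00 is the number of white pixels, and m10/m00 and m01/m00 give the centroid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Zeroth- and first-order raw moments of an image.
struct RawMoments {
    double m00;
    double m10;
    double m01;
};

// Computes m00, m10 and m01 of a row-major 8-bit image. When `binary` is
// true, every non-zero pixel has weight 1; otherwise the weight is the
// pixel intensity.
RawMoments rawMoments(const std::vector<std::uint8_t>& px,
                      int width, int height, bool binary) {
    assert(px.size() == static_cast<std::size_t>(width) * height);
    RawMoments m{0.0, 0.0, 0.0};
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::uint8_t v = px[static_cast<std::size_t>(y) * width + x];
            const double w = binary ? (v != 0 ? 1.0 : 0.0) : static_cast<double>(v);
            m.m00 += w;      // mass: area in pixels for a binary image
            m.m10 += w * x;  // first moment along x
            m.m01 += w * y;  // first moment along y
        }
    }
    return m;
}
```

For a 2x2 white square whose top-left corner is at (1, 1), this gives m00 = 4 and a centroid of (1.5, 1.5). Applied to a whole frame, it sums over every white region at once, which is why a single large number comes out.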
You can use MEI (motion energy image) and MHI (motion history image) images to recognize motion. With 50frame/1 you update the MHI image, get the segmented motion and create the motion with cvMotions; after that you need to use the Mahalanobis distance against the training data. I'm Vietnamese, and my English is very bad.