removing noise in a binary image using openCV - c++

I read a video into Visual Studio using OpenCV, converted it to grayscale, and then used cvThreshold with the CV_THRESH_BINARY flag to convert it into a binary image. However, there are holes and noise in the frames. What is a simple way to remove the noise or the holes? I have read up on the Erode and Dilate functions in OpenCV, but I am not too clear on how to use them. This is my code so far. If anyone can show me how to incorporate the noise removal into my code, it would be greatly appreciated.
#include "cv.h"
#include "highgui.h"

int main( int argc, char* argv[] ) {
    CvCapture *capture = NULL;
    capture = cvCaptureFromAVI("C:\\walking\\lady walking.avi");
    if( !capture ) {
        return -1; // The video file could not be opened
    }
    IplImage* color_frame = NULL;
    IplImage* gray_frame = NULL;
    int thresh_frame = 70;
    int frameCount = 0; // Counts every 5 frames
    cvNamedWindow( "Binary video", CV_WINDOW_AUTOSIZE );
    while(1) {
        color_frame = cvQueryFrame( capture ); // Grabs the frame from a file
        if( !color_frame ) break; // If the frame does not exist, quit the loop
        gray_frame = cvCreateImage(cvSize(color_frame->width, color_frame->height), color_frame->depth, 1);
        cvCvtColor(color_frame, gray_frame, CV_BGR2GRAY);
        cvThreshold(gray_frame, gray_frame, thresh_frame, 255, CV_THRESH_BINARY);
        cvShowImage("Binary video", gray_frame);
        char c = cvWaitKey(33);
        cvReleaseImage( &gray_frame ); // Allocated on every iteration, so released on every iteration
        if( c == 27 ) break; // Esc quits
    }
    cvReleaseCapture( &capture );
    cvDestroyWindow( "Binary video" );
    return 0;
}

DISCLAIMER: It is hard to give a good answer because you provided very little information. If you had posted your image before and after binarization, it would be much easier. However, I will try to give some hints.
If the holes are rather big, then the threshold value is probably wrong; try increasing or decreasing it and check the result. You can also try
cvThreshold(gray_frame, gray_frame, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);
This will calculate threshold value automatically.
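For reference, the idea behind the automatic choice can be sketched in plain C++ without OpenCV. This is only a minimal illustration of Otsu's method, not OpenCV's implementation; the function name otsuThreshold and the flat vector of 8-bit pixels are assumptions made for the example.

```cpp
#include <vector>

// Returns the threshold t (0..255) that maximizes the between-class variance
// when pixels with value <= t form one class and pixels > t form the other.
// (Hypothetical helper for illustration; not an OpenCV function.)
int otsuThreshold(const std::vector<unsigned char>& pixels) {
    double hist[256] = {0.0};
    for (unsigned char p : pixels) hist[p] += 1.0;

    const double total = static_cast<double>(pixels.size());
    double sumAll = 0.0;
    for (int v = 0; v < 256; ++v) sumAll += v * hist[v];

    double weightLow = 0.0, sumLow = 0.0, bestVariance = -1.0;
    int bestT = 0;
    for (int t = 0; t < 256; ++t) {
        weightLow += hist[t];
        sumLow += t * hist[t];
        if (weightLow == 0.0) continue;          // no pixels in the low class yet
        const double weightHigh = total - weightLow;
        if (weightHigh == 0.0) break;            // every pixel is already in the low class
        const double meanLow = sumLow / weightLow;
        const double meanHigh = (sumAll - sumLow) / weightHigh;
        const double variance = weightLow * weightHigh * (meanLow - meanHigh) * (meanLow - meanHigh);
        if (variance > bestVariance) {
            bestVariance = variance;
            bestT = t;
        }
    }
    return bestT;
}
```

For an image with two well-separated grey levels, the returned value lies between them, which is why it tends to work when a hand-picked constant does not.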
If you cannot find a good threshold value, try an adaptive thresholding algorithm. OpenCV has the adaptiveThreshold() function, but its results are not always good.
If the holes and noise are rather small (a few pixels each), you can try some of the following:
Use opening (erosion followed by dilation) to remove white noise, and closing (dilation followed by erosion) to remove small black noise. But remember that opening, while removing white noise, will also strengthen black noise, and vice versa.
Apply a median blur AFTER thresholding. It may remove small noise, both black and white, while preserving colors (the image will still be binary) and, with possible small errors, shapes. Applying a median blur BEFORE binarization may also help reduce small noise.
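To show what opening and closing actually do to the pixels, here is a small stand-alone sketch on a binary grid with a 3x3 square structuring element. It deliberately avoids OpenCV so it can be compiled on its own; the names Grid, morph3x3, opening and closing are made up for the example, and in the question's code the corresponding calls would be cvErode and cvDilate.

```cpp
#include <vector>

using Grid = std::vector<std::vector<int>>; // 1 = white, 0 = black

// Applies a 3x3 square structuring element. With wantAll = true a pixel stays
// white only if every in-bounds neighbour (and itself) is white (erosion);
// with wantAll = false it becomes white if any of them is white (dilation).
static Grid morph3x3(const Grid& in, bool wantAll) {
    const int rows = static_cast<int>(in.size());
    const int cols = rows ? static_cast<int>(in[0].size()) : 0;
    Grid out(rows, std::vector<int>(cols, 0));
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            bool all = true, any = false;
            for (int dr = -1; dr <= 1; ++dr) {
                for (int dc = -1; dc <= 1; ++dc) {
                    const int rr = r + dr, cc = c + dc;
                    if (rr < 0 || rr >= rows || cc < 0 || cc >= cols) continue; // ignore pixels outside the image
                    if (in[rr][cc]) any = true; else all = false;
                }
            }
            out[r][c] = (wantAll ? all : any) ? 1 : 0;
        }
    }
    return out;
}

Grid erode(const Grid& g)  { return morph3x3(g, true); }
Grid dilate(const Grid& g) { return morph3x3(g, false); }

// Opening removes small white specks; closing fills small black holes.
Grid opening(const Grid& g) { return dilate(erode(g)); }
Grid closing(const Grid& g) { return erode(dilate(g)); }
```

An isolated white pixel disappears under opening, while a one-pixel hole inside a white block is filled by closing; objects thinner than the structuring element are removed as well, so the element size is the knob to tune.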

You might try using cvSmooth with CV_MEDIAN before you do the thresholding.


dataset for emotion detection

I am working on code that uses the OpenCV library to track the user's face and the features on the face. I have managed to do live detection of the face and of features such as the eyes and lips from the webcam. I would now like to extract the emotion from the detected features. Is there an available dataset that I can use to compare emotions against, and how can this be done?
Here is the code for face detection:
CvRect detectFaceInImage(const IplImage *inputImg, const CvHaarClassifierCascade* cascade )
{
    const CvSize minFeatureSize = cvSize(20, 20);
    const int flags = CV_HAAR_FIND_BIGGEST_OBJECT | CV_HAAR_DO_ROUGH_SEARCH; // Only search for 1 face.
    const float search_scale_factor = 1.1f;
    IplImage *detectImg;
    IplImage *greyImg = 0;
    CvMemStorage* storage;
    CvRect rc;
    double t;
    CvSeq* rects;
    int i;

    storage = cvCreateMemStorage(0);
    cvClearMemStorage( storage );

    // If the image is color, use a greyscale copy of the image.
    detectImg = (IplImage*)inputImg; // Assume the input image is to be used.
    if (inputImg->nChannels > 1)
    {
        greyImg = cvCreateImage(cvSize(inputImg->width, inputImg->height), IPL_DEPTH_8U, 1 );
        cvCvtColor( inputImg, greyImg, CV_BGR2GRAY );
        detectImg = greyImg; // Use the greyscale version as the input.
    }

    // Detect all the faces.
    t = (double)cvGetTickCount();
    rects = cvHaarDetectObjects( detectImg, (CvHaarClassifierCascade*)cascade, storage,
                                 search_scale_factor, 3, flags, minFeatureSize );
    t = (double)cvGetTickCount() - t;
    printf("[Face Detection took %d ms and found %d objects]\n", cvRound( t/((double)cvGetTickFrequency()*1000.0) ), rects->total );

    // Get the first detected face (the biggest).
    if (rects->total > 0) {
        rc = *(CvRect*)cvGetSeqElem( rects, 0 );
    }
    else
        rc = cvRect(-1,-1,-1,-1); // Couldn't find the face.

    //cvReleaseHaarClassifierCascade( &cascade );
    //cvReleaseImage( &detectImg );
    if (greyImg)
        cvReleaseImage( &greyImg );
    cvReleaseMemStorage( &storage );

    return rc; // Return the biggest face found, or (-1,-1,-1,-1).
}
I am using the Karolinska Directed Emotional Faces (KDEF) photographs for an educational research project. Information regarding the data set is available at
Note that you will probably need to crop, resize, center, straighten, and normalize the images to use them with OpenCV. Once properly prepared, the images work quite well with all the OpenCV2 FaceRecognizer class functions.
As to how facial expression recognition can be done: no standard approach exists. Start by reading the FaceRecognizer documentation and working through the tutorials. For what it's worth: I have found that using Local Binary Pattern Histograms produces the most accurate results.
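To give a feel for what a Local Binary Pattern histogram is, here is a simplified stand-alone sketch of the basic 3x3 LBP operator and a single 256-bin histogram over an image. It is not the OpenCV FaceRecognizer implementation (which, among other things, works on a grid of cells); the names Image, lbpCode and lbpHistogram are assumptions for the example.

```cpp
#include <array>
#include <vector>

using Image = std::vector<std::vector<unsigned char>>; // greyscale, row-major

// Basic 3x3 LBP code of the interior pixel at (r, c): each of the 8 neighbours
// contributes one bit, set when the neighbour is >= the centre pixel.
unsigned char lbpCode(const Image& img, int r, int c) {
    static const int dr[8] = {-1, -1, -1, 0, 1, 1, 1, 0};
    static const int dc[8] = {-1, 0, 1, 1, 1, 0, -1, -1};
    unsigned char code = 0;
    for (int k = 0; k < 8; ++k) {
        if (img[r + dr[k]][c + dc[k]] >= img[r][c]) {
            code = static_cast<unsigned char>(code | (1u << k));
        }
    }
    return code;
}

// Histogram of LBP codes over all interior pixels (border pixels are skipped).
std::array<int, 256> lbpHistogram(const Image& img) {
    std::array<int, 256> hist{};
    const int rows = static_cast<int>(img.size());
    const int cols = rows ? static_cast<int>(img[0].size()) : 0;
    for (int r = 1; r + 1 < rows; ++r) {
        for (int c = 1; c + 1 < cols; ++c) {
            ++hist[lbpCode(img, r, c)];
        }
    }
    return hist;
}
```

Two face crops can then be compared by comparing their histograms, which is roughly why the preparation steps above (cropping, centering, normalizing) matter so much.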

Filter out only one contour in OpenCV C/C++

I'm trying to write a program that detects an object of any shape using a video camera/webcam, based on the Canny filter and the contour-finding function. Here is my program:
#include "cv.h"
#include "highgui.h"
#include <stdio.h>

int main( int argc, char** argv )
{
    CvCapture *cam;
    CvMoments moments;
    CvMemStorage* storage = cvCreateMemStorage(0);
    CvSeq* contours = NULL;
    CvSeq* contours2 = NULL;
    CvPoint2D32f center;
    int i;

    // Assumption: the capture call is missing from the post; device index 0 is a guess.
    cam = cvCaptureFromCAM(0);
    if(cam==NULL){
        fprintf(stderr,"Cannot find any camera. \n");
        return -1;
    }
    while(1){
        IplImage *img=cvQueryFrame(cam);
        if(img==NULL){return -1;}
        IplImage *src_gray= cvCreateImage( cvSize(img->width,img->height), 8, 1);
        cvCvtColor( img, src_gray, CV_BGR2GRAY );
        cvSmooth( src_gray, src_gray, CV_GAUSSIAN, 5, 11);
        cvCanny(src_gray, src_gray, 70, 200, 3);
        cvFindContours( src_gray, storage, &contours, sizeof(CvContour), CV_RETR_EXTERNAL, CV_CHAIN_APPROX_NONE, cvPoint(0,0));
        if(contours==NULL){ contours=contours2;}
        if(contours!=NULL){ // Assumption: guard restored to match the closing brace after printf
            cvMoments(contours, &moments, 1);
            double m_00 = cvGetSpatialMoment( &moments, 0, 0 );
            double m_10 = cvGetSpatialMoment( &moments, 1, 0 );
            double m_01 = cvGetSpatialMoment( &moments, 0, 1 );
            float gravityX = (m_10 / m_00)-150;
            float gravityY = (m_01 / m_00)-150;
            printf("center point=(%.f, %.f) \n",gravityX,gravityY); }
        for (; contours != 0; contours = contours->h_next){
            CvScalar color = CV_RGB(250,0,0);
            cvDrawContours(img,contours,color,color,-1,-1, 8, cvPoint(0,0));
        }
        cvShowImage( "Input", img );
        cvShowImage( "Contours", src_gray );
        cvReleaseImage( &src_gray ); // Allocated on every iteration
        if(cvWaitKey(33)>=0) break;
    }
    cvReleaseCapture( &cam );
    return 0;
}
This program will detect all contours captured by the camera and the average coordinate of the contours will be printed. My question is how to filter out only one object/contour so I can get more precise (x,y) position of the object? If possible, can anyone show me how to mark the center of the object by using (x,y) coordinates?
Thanks in advance. Cheers
p/s:Sorry I couldn't upload a screenshot yet but if anything helps, here's the link.
Edit: To make my question more clear:
For example, if I want to filter out only the square from my screenshot above, what should I do?
The object I want to filter out has the biggest contour area and, most importantly, has a shape (any shape); it is not a straight or curved line.
I'm still experimenting with the smooth and Canny values, so if anybody has a problem detecting the contours using my program, please alter the values.
I think it can be solved fairly easily. I would suggest some morphological operations before contour detection. I would also suggest filtering out the smaller elements and keeping the biggest element as the only one left in the image.
I suggest:
for filtering out lines (straight or curved): you have to decide what you yourself consider the border between a "line" and a "shape". Let's say you consider everything with a thickness of 5 pixels or more to be an object, and everything less than 5 pixels across to be a line. A morphological opening that uses a 5x5 square or a 3-pixel-sized diamond shape as the structuring element would take care of this.
for filtering out small objects in general: if the objects are of arbitrary shapes, a purely morphological opening won't do: you have to do an algebraic opening. A special type of algebraic opening is an area opening: an operation that removes all the connected components in the image that have a (pixel) area smaller than a given threshold. If you have an upper bound on the size of uninteresting objects, or a lower bound on the size of interesting ones, that value should be used as the threshold. You can probably get a similar effect with a larger morphological opening, but it will not be as flexible.
for filtering out all the objects except the largest: it sounds like removing connected components from the smallest one to the largest one should work. Try labeling the connected components. On a binary (black-and-white) image, this transformation works by creating a greyscale image, labeling the background as 0 (black) and each component with a different, increasing grey value. In the end, the pixels of each object are marked with a different value. You can now simply look at the grey-level histogram and find the non-zero grey value with the most pixels. Set all the other grey levels to 0 (black), and the only object left in the image is the biggest one.
The suggestions are written from the simplest to the most complex ones. Still, I think OpenCV can be of help with any of these. Morphological erosion, dilation, opening and closing are implemented in OpenCV. I think you might need to construct an algebraic opening operator on your own (or play with combining OpenCV basic morphology), but I'm sure OpenCV can help you with both labeling the connected components and examining the histogram of the resulting greyscale image.
In the end, when only pixels from one object are left, you do the Canny contour detection.
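The last suggestion (label the connected components, then keep only the label with the most pixels) can be sketched in plain C++ as follows. This is a stand-alone illustration with 4-connectivity and a breadth-first flood fill; the names Grid and keepLargestComponent are made up for the example, and it is not an OpenCV routine.

```cpp
#include <queue>
#include <utility>
#include <vector>

using Grid = std::vector<std::vector<int>>; // 1 = white, 0 = black

// Labels the 4-connected white components 1, 2, 3, ..., counts the pixels of
// each label, and returns an image in which only the largest component is white.
Grid keepLargestComponent(const Grid& in) {
    const int rows = static_cast<int>(in.size());
    const int cols = rows ? static_cast<int>(in[0].size()) : 0;
    const int dr[4] = {-1, 1, 0, 0};
    const int dc[4] = {0, 0, -1, 1};

    Grid labels(rows, std::vector<int>(cols, 0));
    std::vector<int> area(1, 0); // area[label]; label 0 is the background

    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            if (!in[r][c] || labels[r][c] != 0) continue;
            const int label = static_cast<int>(area.size());
            area.push_back(0);
            std::queue<std::pair<int, int>> pending;
            labels[r][c] = label;
            pending.push(std::make_pair(r, c));
            while (!pending.empty()) {
                const std::pair<int, int> p = pending.front();
                pending.pop();
                ++area[label];
                for (int k = 0; k < 4; ++k) {
                    const int rr = p.first + dr[k], cc = p.second + dc[k];
                    if (rr < 0 || rr >= rows || cc < 0 || cc >= cols) continue;
                    if (!in[rr][cc] || labels[rr][cc] != 0) continue;
                    labels[rr][cc] = label;
                    pending.push(std::make_pair(rr, cc));
                }
            }
        }
    }

    // The "histogram" step: find the label with the most pixels.
    int best = 0;
    for (int l = 1; l < static_cast<int>(area.size()); ++l) {
        if (best == 0 || area[l] > area[best]) best = l;
    }

    Grid out(rows, std::vector<int>(cols, 0));
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            out[r][c] = (best != 0 && labels[r][c] == best) ? 1 : 0;
    return out;
}
```

Replacing the final "largest only" test with a comparison of area[label] against a minimum size would turn the same code into the area opening described above.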
This is a blob processing problem that cannot be solved (easily) by OpenCV itself. Have a look at cvBlobsLib. This library extends OpenCV with functions/classes for connected component labeling.

Adjusting the threshold in Canny edge algorithm

I wanted to try my hand at text recognition, so I used OpenCV to trace out the edges and C++ to find slopes, curves, etc. The edge algorithm works well on big, uncluttered sets of characters, but when it comes up against small printed text or text with a lot of background noise, such as text embedded in a CAPTCHA, it struggles and the result looks incomplete. My guess was that I hadn't set the threshold values correctly, so I tried different values, with no success.
Here is my code :
#include "cv.h"
#include "highgui.h"
using namespace cv;

const int low_threshold = 50;
const int high_threshold = 150;

int main()
{
    IplImage* newImg;
    IplImage* grayImg;
    IplImage* cannyImg;

    newImg = cvLoadImage("ocv.bmp",1);
    if( !newImg ) return -1; // The image could not be loaded
    grayImg = cvCreateImage( cvSize(newImg->width, newImg->height), IPL_DEPTH_8U, 1 );
    cvCvtColor( newImg, grayImg, CV_BGR2GRAY );
    cannyImg = cvCreateImage(cvGetSize(newImg), IPL_DEPTH_8U, 1);
    cvCanny(grayImg, cannyImg, low_threshold, high_threshold, 3);

    cvNamedWindow ("Source", 1);
    cvNamedWindow ("Destination",1);
    cvShowImage ("Source", newImg );
    cvShowImage ("Destination", cannyImg );
    cvWaitKey(0); // Keep the windows open until a key is pressed

    cvDestroyWindow ("Source" );
    cvDestroyWindow ("Destination" );
    cvReleaseImage (&newImg );
    cvReleaseImage (&grayImg );
    cvReleaseImage (&cannyImg );
    return 0;
}
I've looked across the net and have seen some complicated thresholding conditions like in this code from this site :
% Set direction to either 0, 45, -45 or 90 depending on angle.
for i=1:x-1,
for j=1:y-1,
if ((gradAngle(i,j)>67.5 && gradAngle(i,j)<=90) || (gradAngle(i,j)>=-90 && gradAngle(i,j)<=-67.5))
elseif ((gradAngle(i,j)>22.5 && gradAngle(i,j)<=67.5))
elseif ((gradAngle(i,j)>-22.5 && gradAngle(i,j)<=22.5))
elseif ((gradAngle(i,j)>-67.5 && gradAngle(i,j)<=-22.5))
If this is the solution, can somebody provide me with the C++ equivalent of this algorithm? If it's not, what else can I do?
The Canny edge detector is a multi-step detector that uses hysteresis thresholding (two thresholds instead of one) and edge tracking. Your last snippet quantizes the gradient direction, which is part of the non-maximum suppression step rather than of the thresholding itself. I suggest reading the Wikipedia entry first. One possible solution is to choose the high threshold so that, e.g., 70% of the image pixels fall below it and are initially classified as non-edges (you can do this quickly using a histogram of the gradient magnitudes), and then choose the low threshold as, e.g., 40% of the high threshold. It might also be a good idea to perform edge detection on image blocks rather than on the whole image, so that your algorithm can calculate different thresholds for different areas.
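The percentile idea can be sketched in a few lines of stand-alone C++. The sketch assumes the gradient magnitudes have already been computed (for example with a Sobel filter) and interprets the fraction as the share of pixels whose magnitude lies below the high threshold; the function name pickThresholds and its parameters are hypothetical, not part of OpenCV.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Picks hysteresis thresholds from a list of gradient magnitudes.
// fractionBelow: share of pixels that should fall below the high threshold.
// lowRatio:      low threshold as a fraction of the high threshold.
// Returns {low, high}.
std::pair<double, double> pickThresholds(std::vector<double> magnitudes,
                                         double fractionBelow = 0.7,
                                         double lowRatio = 0.4) {
    if (magnitudes.empty()) return std::make_pair(0.0, 0.0);
    fractionBelow = std::min(1.0, std::max(0.0, fractionBelow));
    const std::size_t k =
        static_cast<std::size_t>(fractionBelow * (magnitudes.size() - 1));
    // Partial sort: afterwards magnitudes[k] is the k-th smallest value.
    std::nth_element(magnitudes.begin(), magnitudes.begin() + k, magnitudes.end());
    const double high = magnitudes[k];
    return std::make_pair(lowRatio * high, high);
}
```

The two returned values would then be passed as the low and high thresholds of cvCanny; running the function per image block gives the block-wise thresholds mentioned above.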
Note that CAPTCHAs are designed to be hard to segment, and adding noise that breaks edge detection is one technique to achieve this (you might need to smooth the image first).
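As for a C++ equivalent of the direction-quantization snippet in the question, a minimal sketch could look like this. The bin boundaries are copied from the MATLAB conditions; which of the values 0, 45, -45 and 90 is assigned in each branch is not shown in the post, so the mapping below is an assumption based on the snippet's comment, and quantizeDirection is a made-up name.

```cpp
// Quantizes a gradient angle in degrees (expected range [-90, 90]) to one of
// the four directions 0, 45, -45 or 90.
// Assumption: the value returned per bin is inferred, not taken from the post.
int quantizeDirection(double angleDeg) {
    if ((angleDeg > 67.5 && angleDeg <= 90.0) ||
        (angleDeg >= -90.0 && angleDeg <= -67.5)) {
        return 90;   // near-vertical gradient
    }
    if (angleDeg > 22.5 && angleDeg <= 67.5) {
        return 45;
    }
    if (angleDeg > -22.5 && angleDeg <= 22.5) {
        return 0;    // near-horizontal gradient
    }
    if (angleDeg > -67.5 && angleDeg <= -22.5) {
        return -45;
    }
    return 0;        // outside [-90, 90]; callers should normalize the angle first
}
```

Note that cvCanny already performs this step internally, so it is only needed when writing the detector by hand.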

Masking a blob from a binary image

I am doing motion recognition of walking using OpenCV and C++, and I would like to create a mask or copied image in order to achieve the effect seen in the picture provided. The following is an explanation of the images.
The resulting blob of the human walking is seen. Then, a mask image or copied image of the original frame is created, the binary human blob is now masked and the non-masked pixels are now set to zero. The result is the extracted human body with a black background. The diagram below shows how the human blob is extracted and then masked.
This is to be done for every 5th frame of a video sequence. My code so far consists of getting every 5th frame, grayscaling it, finding the areas of all the blobs, and applying a threshold value to get a binary image where more or less, only the human blob is white and the rest of the image is black. Now, I am trying to extract the human body but I have no clue how to proceed. Please help me.
#include "cv.h"
#include "highgui.h"
#include <iostream>
using namespace std;

int main( int argc, char* argv[] ) {
    CvCapture *capture = NULL;
    capture = cvCaptureFromAVI("C:\\walking\\lady walking.avi");
    if( !capture ) {
        return -1; // The video file could not be opened
    }
    IplImage* color_frame = NULL;
    IplImage* gray_frame = NULL;
    int thresh_frame = 28;
    CvMoments moments;
    int frameCount = 0; // Counts every 5 frames
    cvNamedWindow( "walking", CV_WINDOW_AUTOSIZE );
    while(1) {
        color_frame = cvQueryFrame( capture ); // Grabs the frame from a file
        if( !color_frame ) break; // If the frame does not exist, quit the loop
        gray_frame = cvCreateImage(cvSize(color_frame->width, color_frame->height), color_frame->depth, 1);
        cvCvtColor(color_frame, gray_frame, CV_BGR2GRAY);
        cvThreshold(gray_frame, gray_frame, thresh_frame, 255, CV_THRESH_BINARY);
        cvErode(gray_frame, gray_frame, NULL, 1);
        cvDilate(gray_frame, gray_frame, NULL, 1);
        cvMoments(gray_frame, &moments, 1);
        double m00;
        m00 = cvGetCentralMoment(&moments, 0,0);
        cvShowImage("walking", gray_frame);
        char c = cvWaitKey(33);
        cvReleaseImage( &gray_frame ); // Allocated on every iteration
        if( c == 27 ) break;
    }
    double m00 = (double)cvGetCentralMoment(&moments, 0,0);
    cout << "Area - : " << m00 << endl;
    //area of lady walking = 39696. Therefore, using new threshold area as 30 for this video
    //area of walking man = 67929
    cvReleaseCapture( &capture );
    cvDestroyWindow( "walking" );
    return 0;
}
I would also like to upload the video that I am using in the code but I don't know how to upload it here, so if anyone can help me out with that too. I want to provide as much info as possible w.r.t. my question.
The easiest way is to look for the biggest blob in the image (cvFindContours may be the function you need), then set all the other blobs to black (by scanning all the contours and using cvFloodFill).
Finally, scan the entire binary image: if the considered pixel is white, do nothing; if the pixel is black, set the corresponding pixel of the 5th frame to black.
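The final scan can be sketched as a stand-alone C++ function operating on plain 2D arrays (single-channel for brevity). The names Gray and applyMask are made up for the example; with IplImage data, OpenCV's cvCopy with its mask argument serves the same purpose.

```cpp
#include <cstddef>
#include <vector>

using Gray = std::vector<std::vector<unsigned char>>; // single-channel image

// Returns a copy of the frame in which every pixel whose mask value is 0
// (black) is set to 0; pixels under a white mask value are kept unchanged.
Gray applyMask(const Gray& frame, const Gray& mask) {
    Gray out = frame;
    for (std::size_t r = 0; r < out.size() && r < mask.size(); ++r) {
        for (std::size_t c = 0; c < out[r].size() && c < mask[r].size(); ++c) {
            if (mask[r][c] == 0) {
                out[r][c] = 0;
            }
        }
    }
    return out;
}
```

For a colour frame the same test is applied per pixel and all three channels are cleared together.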

Areas of objects using cvMoments

I am working on a motion recognition project of walking, involving openCV and C++. I have reached the stage in the algorithm where I am required to find the area of the human blob. I have loaded the video, converted it to grayscale and thresholded it to obtain a binary image with white regions showing the human walking in addition to other white regions. I need to find the area of each white region to determine the area of the human blob since this region will have an area greater than that of the other white regions. Please look through my code and explain the output to me because I am getting an area of 40872 and I do not know what this means. This is my code. I want to upload the video I used but I do not know how to:/ If someone can tell me how to upload the video I used, please do, because this is the only way I will be able to get help with this particular video. I really hope someone can help me.
#include "cv.h"
#include "highgui.h"
#include <iostream>
using namespace std;

int main( int argc, char* argv[] ) {
    CvCapture *capture = NULL;
    capture = cvCaptureFromAVI("C:\\walking\\lady walking.avi");
    if( !capture ) {
        return -1; // The video file could not be opened
    }
    IplImage* color_frame = NULL;
    IplImage* gray_frame = NULL;
    int thresh_frame = 70;
    CvMoments moments;
    int frameCount = 0; // Counts every 5 frames
    cvNamedWindow( "walking", CV_WINDOW_AUTOSIZE );
    while(1) {
        color_frame = cvQueryFrame( capture ); // Grabs the frame from a file
        if( !color_frame ) break; // If the frame does not exist, quit the loop
        gray_frame = cvCreateImage(cvSize(color_frame->width, color_frame->height), color_frame->depth, 1);
        cvCvtColor(color_frame, gray_frame, CV_BGR2GRAY);
        cvThreshold(gray_frame, gray_frame, thresh_frame, 255, CV_THRESH_BINARY);
        cvErode(gray_frame, gray_frame, NULL, 1);
        cvDilate(gray_frame, gray_frame, NULL, 1);
        cvMoments(gray_frame, &moments, 1);
        double m00;
        m00 = cvGetSpatialMoment(&moments, 0,0);
        cvShowImage("walking", gray_frame);
        char c = cvWaitKey(33);
        cvReleaseImage( &gray_frame ); // Allocated on every iteration
        if( c == 27 ) break;
    }
    double m00 = (double)cvGetSpatialMoment(&moments, 0,0);
    cout << "Area - : " << m00 << endl;
    cvReleaseCapture( &capture );
    cvDestroyWindow( "walking" );
    return 0;
}
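The function cvGetSpatialMoment retrieves the spatial moment, which in the case of image moments is defined as:
$$m_{pq} = \sum_{x}\sum_{y} I(x,y)\, x^{p}\, y^{q}$$
where I(x,y) is the intensity of the pixel (x, y), p is the order in x and q is the order in y (so m10 has p = 1, q = 0).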
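The spatial moment m00 is like the mass of an object. It contains no x, y information. The average x position is average(x) = sum(density(x_i) * x_i) / sum(density(x_i)) over all i, that is, m10 / m00; likewise, average(y) = m01 / m00. I(x,y) plays the role of the density function, but here it is the intensity of the pixel. If you don't want your result to change with the lighting, you probably want to make the matrix binary, so that a pixel is either part of the object or not. Feeding in a greyscale image of the object will essentially convert the grey level to density, as per the formula above.
For a binary image (or when the binary flag of cvMoments is set, as in your code, so that every non-zero pixel counts as 1), m00 is simply the number of white pixels, so
Area = m00
and the centre of mass is
(average(x), average(y)) = (m10 / m00, m01 / m00)
Your value of 40872 is therefore the total number of white pixels in the frame, summed over all the white regions, not the area of the human blob alone. To get the area of each region, compute the moments (or the contour area) per blob rather than for the whole image.
For a greyscale image, m00 basically sums the grey level over all the pixels in the image, so it has no spatial meaning on its own; dividing m10 and m01 by m00 "normalizes" them.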
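To make the moments concrete, here is a stand-alone sketch that computes m00, m10 and m01 for a binary image, treating every non-zero pixel as 1 (which is what the binary flag of cvMoments does). The struct and function names are assumptions for the example, not OpenCV types.

```cpp
#include <cstddef>
#include <vector>

struct RawMoments {
    double m00; // number of white pixels (area)
    double m10; // sum of x over white pixels
    double m01; // sum of y over white pixels
};

// Raw spatial moments up to first order of a binary image stored row by row;
// x is the column index and y is the row index.
RawMoments binaryMoments(const std::vector<std::vector<int>>& img) {
    RawMoments m = {0.0, 0.0, 0.0};
    for (std::size_t y = 0; y < img.size(); ++y) {
        for (std::size_t x = 0; x < img[y].size(); ++x) {
            if (img[y][x] != 0) {
                m.m00 += 1.0;
                m.m10 += static_cast<double>(x);
                m.m01 += static_cast<double>(y);
            }
        }
    }
    return m;
}
```

For a single blob, m00 is its pixel area and (m10 / m00, m01 / m00) is its centre of mass; when several blobs are present in the image, the same numbers describe all of them together.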
You can use MEI (motion energy image) and MHI (motion history image) representations to recognize motion. Update the MHI as the frames come in (cvUpdateMotionHistory), segment the motion (cvSegmentMotion), and then compare the result with your training data using the Mahalanobis distance.