I'm trying to implement static hand gesture recognition for only 6 gestures. Because the hand could turn a bit, I tried SURF + FLANN, as these are invariant.
The images are binary, and when I compare them I get bad results that I don't even understand. For example, for identical images I get 1 or 2 good keypoints, and for different images I get 5 or 6 good keypoints.
Do you have any suggestions for implementing gesture recognition in this case?
Result of a train and query image:
double max_dist = 0, min_dist = 100;
matcher.match(gestoDescriptors,t1descriptors,matches);
for (int i=0; i<gestoDescriptors.rows; i++){
double dist = matches[i].distance;
if (dist<min_dist)
min_dist = dist;
if (dist>max_dist)
max_dist=dist;
}
vector<DMatch>t1_good_matches;
for (int i=0; i<gestoDescriptors.rows; i++){
if (matches[i].distance<=max(2*min_dist,0.02)){
t1_good_matches.push_back(matches[i]);
}
}
//-- Draw only "good" matches
Mat img_matches;
drawMatches(gestoImage,gestoKeypoints,train1,t1keypoints,t1_good_matches,
img_matches, Scalar::all(-1), Scalar::all(-1),
vector<char>(), DrawMatchesFlags::NOT_DRAW_SINGLE_POINTS );
I haven't used SURF for feature extraction extensively, but from what I can gather, it's important that the image be reasonably "expressive" in the sense that there are various interest points with unique properties. I think the main issue you have is that images of hands (and particularly images that have been thresholded to yield binary images) don't have many unique points of interest. Especially due to the scale and rotation-invariant nature of SURF features, there is very little in your image that would qualify as unique to a specific gesture.
You might want to try using your binary image as a mask just to remove the background, and then try the SURF and FLANN logic on the full-color (or maybe grayscale) hand gestures.
In the long run, I think you'd be better off researching Convolutional Neural Networks if you're serious about building a high-quality gesture recognition system. CNNs require far less pre-processing of the image and are particularly well-suited for this kind of application.
Related
So, I'm taking over the work on an ortho-rectification algorithm that is intended to produce "accurate" results. I'm running into trouble trying to increase the accuracy and could use a little help.
Here is the basic approach.
Extract a calibration pattern from an image that was taken from a mobile phone.
Rectify the image based on a calibration pattern in the image
Scale the image to get the real world size of the scene around the pattern.
The calibration pattern is held against a flat surface, like a wall, counter, table, floor and the user takes a picture. With that picture, we want to measure artifacts on the same surface as the calibration pattern. We have tried this with calibration patterns ranging from the size of a credit card to a sheet of paper (8.5" x 11")
Here is an example input picture
With this resulting output image
Right now our measurements are usually within 1-2% of what we expect. This is sufficient for small areas (less than 25 cm away from the calibration pattern). However, we'd like the algorithm to scale so that we can accurately measure a 2x2 meter area, and at that size the current error is too large (2-4 cm).
Here is the algorithm we are following.
// convert original image to grayscale and perform morphological dilation to reduce false matches when finding circle grid
Mat imgGray;
cvtColor(imgOriginal, imgGray, CV_BGR2GRAY);
// find calibration pattern in original image
Size patternSize(4, 11);
vector <Point2f> circleCenters_OriginalImage;
if (!findCirclesGrid(imgGray, patternSize, circleCenters_OriginalImage, CALIB_CB_ASYMMETRIC_GRID))
{
return false;
}
Point2f inputQuad[4];
inputQuad[0] = Point2f(circleCenters_OriginalImage[0].x, circleCenters_OriginalImage[0].y);
inputQuad[1] = Point2f(circleCenters_OriginalImage[3].x, circleCenters_OriginalImage[3].y);
inputQuad[2] = Point2f(circleCenters_OriginalImage[43].x, circleCenters_OriginalImage[43].y);
inputQuad[3] = Point2f(circleCenters_OriginalImage[40].x, circleCenters_OriginalImage[40].y);
// create model points for calibration pattern
vector <Point2f> circleCenters_ObjectSpace = GeneratePatternPointsInObjectSpace(circleCenters_OriginalImage[0], Distance(circleCenters_OriginalImage[0], circleCenters_OriginalImage[1]) / 2.0f, ioData.marker_up);
Point2f outputQuad[4];
outputQuad[0] = Point2f(circleCenters_ObjectSpace[0].x, circleCenters_ObjectSpace[0].y);
outputQuad[1] = Point2f(circleCenters_ObjectSpace[3].x, circleCenters_ObjectSpace[3].y);
outputQuad[2] = Point2f(circleCenters_ObjectSpace[43].x, circleCenters_ObjectSpace[43].y);
outputQuad[3] = Point2f(circleCenters_ObjectSpace[40].x, circleCenters_ObjectSpace[40].y);
Mat lambda(2,4,CV_32FC1);
lambda = Mat::zeros(imgOriginal.rows, imgOriginal.cols, imgOriginal.type());
lambda = getPerspectiveTransform(inputQuad, outputQuad);
warpPerspective(imgOriginal, imgOrthorectified, lambda, imgOrthorectified.size());
...
My Questions:
Is it reasonable to shoot for error < 0.25%? Is there a different algorithm that would yield more accurate results? What are the most valuable sources of error to identify and resolve?
As I've worked on this, I've also looked at removing pincushion / barrel distortions, and trying homographies to find the perspective transform. The best approaches I have found so far remain in the 1-2% error.
Any suggestions of where to go next would be really helpful
I trained my PC with opencv_traincascade for a whole day to detect 2€ coins, using more than 6000 positive images similar to the following:
Now, I have just tried to run a simple OpenCV program to see the results and to check the file cascade.xml. The final result is very disappointing:
There are many points on the coin but there are also many other points on the background. Could it be a problem with my positive images used for training? Or maybe, am I using the detectMultiScale() with wrong parameters?
Here's my code:
#include "opencv2/opencv.hpp"
using namespace cv;
int main(int, char**) {
Mat src = imread("2c.jpg", CV_LOAD_IMAGE_COLOR);
Mat src_gray;
std::vector<cv::Rect> money;
CascadeClassifier euro2_cascade;
cvtColor(src, src_gray, CV_BGR2GRAY );
//equalizeHist(src_gray, src_gray);
if ( !euro2_cascade.load( "/Users/lory/Desktop/cascade.xml" ) ) {
printf("--(!)Error loading\n");
return -1;
}
euro2_cascade.detectMultiScale( src_gray, money, 1.1, 0, CV_HAAR_FIND_BIGGEST_OBJECT|CV_HAAR_SCALE_IMAGE, cv::Size(10, 10),cv::Size(2000, 2000) );
for( size_t i = 0; i < money.size(); i++ ) {
cv::Point center( money[i].x + money[i].width*0.5, money[i].y + money[i].height*0.5 );
ellipse( src, center, cv::Size( money[i].width*0.5, money[i].height*0.5), 0, 0, 360, Scalar( 255, 0, 255 ), 4, 8, 0 );
}
//namedWindow( "Display window", WINDOW_AUTOSIZE );
imwrite("result.jpg",src);
}
I have also tried to reduce the number of neighbours, but the effect is the same, just with far fewer points... Could the problem be that the positive images have those 4 corners of background around the coin? I generated the PNG images with Gimp from a video showing the coin, so I don't know why opencv_createsamples adds those 4 corners.
Those positive images are just plain wrong
The more "noise" you give the images in your training data, the more robust the detector will be, but yes, the longer it will take to train. This is where your negative samples come into play. If you have as many negative training samples as possible, covering as wide a range as possible, you will create more robust detectors. You need to make sure your positive images contain only your coins and your negative images contain everything but coins.
I've seen a couple of your questions up to now & I think you want to detect three different types of euro coins. You would be best training three classifiers all on those different coins and then running all three on your images.
I also think you are missing a key piece of knowledge about how HAAR (or LBP or whatever) works. Effectively, it creates a set of "features" from your positive images and then tries to find those features in the images you run the classifier over. It creates these features by working out what is different between your positive images and your negative images. You don't want anything in your positive images that isn't the thing you are trying to detect.
Edit 1 - An Example
Imagine creating a classifier for a road stop sign, which is a similar detection problem to coins. It's big, it's red and it's octagonal. Creating a classifier for this is relatively easy, as long as you don't confuse the training stage with erroneous data.
Edit 2 - Scaling of images:
You also have to remember that the detection stage takes your classifier, starts small and then scales up. Large, obvious features will get detected quicker: in my previous example, big red blobs and octagonal shapes. It would then move on to small features, i.e. text or numbers.
Edit 3 - a much better example
This example shows you really well how training a cascade object detector works. In fact it even has the same example as with a stop sign!
To detect an image of a euro coin you can use several methods:
1) Train an OpenCV cascade (HAAR or LBP). Don't forget to use a large number of negative images. Also extend the image of the coin (add a border).
2) Compute the absolute gradients of the original image, then use the Hough transform to detect circles (a coin has the shape of a circle).
For my college project I need to identify the species of a plant from its leaf shape by detecting the edges of a leaf (I use OpenCV 2.4.9 and C++), but the source image was taken in the plant's real environment and contains more than one leaf. See the example image below. So here I need to extract the edge pattern of just one leaf for further processing.
Using Canny Edge Detector I can identify edges of the whole image.
But I don't know how to proceed from here to extract the edge pattern of just one leaf, ideally the clearest and most complete one. I don't even know if this is possible. If it is, can anyone please tell me how to extract the edges of one leaf? I just want to know the image processing steps that I need to apply to the image. I don't want any code samples. I'm new to image processing and OpenCV and am learning by doing experiments.
Thanks in advance.
Edit
As Luis said, I have applied a morphological close to the image after Canny edge detection, but it still seems difficult to find the largest contour in the image.
Here are the steps I have taken to process the image
Apply Bilateral Filter to reduce noise
bilateralFilter(img_src, img_blur, 31, 31 * 2, 31 / 2);
Adjust contrast by histogram equalization
cvtColor(img_blur,img_equalized,CV_BGR2GRAY);
Apply Canny edge detector
Canny(img_equalized, img_edge_detected, 20, 60, 3);
Threshold binary image to remove some background data
threshold(img_edge_detected, img_threshold, 1, 255,THRESH_BINARY_INV);
Morphological close of the image
morphologyEx(img_threshold, img_closed, MORPH_CLOSE, getStructuringElement(MORPH_ELLIPSE, Size(2, 2)));
Following are the resulting images I'm getting.
This result I'm getting for the above original image
Source image and result for second image
Source :
Result :
Is there any way to detect the largest contour and extract it from the image ?
Note that my final target is to create a plant identification system using real environmental image, but here I cannot use template matching or masking kind of things because the user has to take an image and upload it so the system doesn't have any prior idea about the leaf.
Here is the full code
#include <opencv\cv.h>
#include <opencv\highgui.h>
using namespace cv;
int main()
{
Mat img_src, img_blur,img_gray,img_equalized,img_edge_detected,img_threshold,img_closed;
//Load original image
img_src = imread("E:\\IMAG0196.jpg");
//Apply Bilateral Filter to reduce noise
bilateralFilter(img_src, img_blur, 31, 31 * 2, 31 / 2);
//Adjust contrast by histogram equalization
cvtColor(img_blur,img_equalized,CV_BGR2GRAY);
//Apply Canny edge detector
Canny(img_equalized, img_edge_detected, 20, 60, 3);
//Threshold binary image to remove some background data
threshold(img_edge_detected, img_threshold, 15, 255,THRESH_BINARY_INV);
//Morphological close of the image
morphologyEx(img_threshold, img_closed, MORPH_CLOSE, getStructuringElement(MORPH_ELLIPSE, Size(2, 2)));
imshow("Result", img_closed);
waitKey(0);
return 0;
}
Thank you.
Well there is a similar question that was asked here:
opencv matching edge images
It seems that edge information is not a good descriptor for the image. Still, if you want to try it, I'd follow these steps:
Load image and convert it to grayscale
Detect edges - try Canny and Sobel and find what suits you best
Set a threshold value that eliminates most of the background - binarize the image
Close the image - a morphological close, not closing the window!
Count and identify objects in the image (Blobs, Watershed)
Check each object for shape features (assuming you have described the leaf shapes you expect beforehand, or use a standard shape like an ellipse), such as:
http://docs.opencv.org/modules/imgproc/doc/structural_analysis_and_shape_descriptors.html
http://www.math.uci.edu/icamp/summer/research_11/park/shape_descriptors_survey.pdf
If a given object has a shape that you described as a leaf, then you have detected the leaf!
I believe that, given the images are taken in the real world, this algorithm will perform poorly, but it's a start. Hope it helps :).
-- POST EDIT 06/07
Well since you have no prior information about the leaf, I think the best we could do is the following:
Load image
Bilateral filter
Canny
Extract contours
Assume: that the contour with the largest perimeter is the leaf
Convex hull the 3 or 2 largest contours (the blue line is the convex hull done)
Use this convex hull to do a graph cut on the image and segment it
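To make the "largest perimeter" assumption from the steps above concrete, here is a minimal sketch in plain C++. It is not the code from the GitHub repository mentioned in this answer, and the names Point2, Outline, closedPerimeter and indexOfLongestOutline are made up for illustration. It treats each contour as a closed polygon and picks the one with the longest outline; with OpenCV you would feed it the point lists returned by findContours.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for cv::Point and a contour (illustration only).
struct Point2 { double x; double y; };
using Outline = std::vector<Point2>;

// Length of the closed polygon through the points (last point joins the first).
double closedPerimeter(const Outline& c) {
    double len = 0.0;
    for (std::size_t i = 0; i < c.size(); ++i) {
        const Point2& p = c[i];
        const Point2& q = c[(i + 1) % c.size()];
        len += std::hypot(q.x - p.x, q.y - p.y);
    }
    return len;
}

// Index of the outline with the largest perimeter, or -1 if there are none.
int indexOfLongestOutline(const std::vector<Outline>& outlines) {
    int best = -1;
    double bestLen = -1.0;
    for (std::size_t i = 0; i < outlines.size(); ++i) {
        const double len = closedPerimeter(outlines[i]);
        if (len > bestLen) {
            bestLen = len;
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

Keep in mind that a long, noisy background edge can easily beat the leaf on perimeter, which is why the convex hull and graph cut steps are still needed afterwards.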
If you do those steps, you'll end up with images like these:
I won't post the code here, but you can check it out in my messy github. I hope you don't mind it was made in python.
Leaf - Github
Still, I have a couple of things to finish that could improve the result.. Roadmap would be:
Define the mask in the graph cut (as described in the docs)
Apply region growing, which may give a better convex hull
Remove all edges that touch the border of the image, which can help to identify larger edges
Well, again, I hope it helps
I have been using the code from the OpenCV website for object detection. I am a beginner to OpenCV and to image processing and have been trying to understand how SURF works. I have a few doubts.
1. I have been using color images for detection and the results have been good so far. Some people recommend using grayscale images; will that improve the performance of the algorithm?
2. In the code, what is the significance of keeping only the matches with a distance less than 3*min_dist?
for( int i = 0; i < descriptors_object.rows; i++ )
{ if( matches[i].distance < 3*min_dist )
{ good_matches.push_back( matches[i]); }
}
3. Though the detection is robust on brightly lit images (I used 900 as the Hessian value), the same image under low light conditions does not get detected with the same Hessian value. Is there a way to handle both with the same Hessian value and the same reference image? Will cv::equalizeHist() be useful? If it is, can somebody please suggest a way to integrate it with the SURF detection code?
4. The DMatch structure that holds the matches has a member called distance, which is the distance between descriptors. What does this mean? Is there a unit for the distance returned?
5. I would also like to know if there are better descriptors than SURF in terms of time complexity and scale and rotation invariance for object detection.
Thanks in advance for your time and replies.
SURF works with grayscale images.
There are a lot of false random matches (if you have 100 features in img1, you will always get 100 matches), so filtering them is a good idea. A better approach is to check the relative distance, as is done in the question "Use Euclidean distance in SURF".
Yes, it can be used. You just use modified images instead of original.
You detect a low number of features because of the low pixel intensity and low contrast in the dark regions, which decrease the detector response. When applied to the dark image, histogram equalization increases contrast, which increases the number of local maxima above the threshold.
cv::Mat img1, img1histEq;
cv::equalizeHist(img1,img1histEq);
A SURF descriptor can be viewed as a 64-dimensional vector (128-dimensional in extended mode). The distance is the distance between two such vectors in some space, usually Euclidean: the square root of the sum of squared differences between corresponding vector elements. Other metrics, e.g. L1, can also be used, but Euclidean is the most common for SURF.
SIFT performs better in terms of invariance, but 3 times slower. You can find comparison of the different descriptors here and here.
It is not clear what you mean by "object detection". What exactly do you need to do?
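To illustrate the points about the descriptor distance and about filtering by relative distance, here is a small sketch in plain C++ with no OpenCV dependency. It assumes that "relative distance" means comparing the best match against the second-best one (often called a ratio test). The names Descriptor, l2Distance and ratioTestMatch and the maxRatio parameter are illustrative and not part of any OpenCV API. In real code you would take the two nearest neighbours from the matcher's knnMatch with k = 2 instead of the brute-force loop.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for one row of a descriptor matrix.
using Descriptor = std::vector<float>;

// Euclidean (L2) distance: square root of the sum of squared differences.
double l2Distance(const Descriptor& a, const Descriptor& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size() && i < b.size(); ++i) {
        const double d = static_cast<double>(a[i]) - static_cast<double>(b[i]);
        sum += d * d;
    }
    return std::sqrt(sum);
}

// For each query descriptor, returns the index of its nearest train descriptor,
// or -1 when the nearest one is not clearly better than the second nearest.
std::vector<int> ratioTestMatch(const std::vector<Descriptor>& query,
                                const std::vector<Descriptor>& train,
                                double maxRatio) {
    std::vector<int> result(query.size(), -1);
    for (std::size_t q = 0; q < query.size(); ++q) {
        double best = std::numeric_limits<double>::max();
        double second = std::numeric_limits<double>::max();
        int bestIdx = -1;
        for (std::size_t t = 0; t < train.size(); ++t) {
            const double d = l2Distance(query[q], train[t]);
            if (d < best) {
                second = best;
                best = d;
                bestIdx = static_cast<int>(t);
            } else if (d < second) {
                second = d;
            }
        }
        // A ratio needs at least two candidates; ambiguous matches stay at -1.
        if (bestIdx >= 0 && train.size() >= 2 && best < maxRatio * second) {
            result[q] = bestIdx;
        }
    }
    return result;
}
```

Unlike a fixed cut-off such as 3*min_dist, this rejects a match whenever two train descriptors are almost equally close to the query, which is exactly the situation where a match is likely to be random.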
I am currently working on image processing project. I am using Opencv2.3.1 with VC++.
I have written the code so that the input image is filtered to only the blue color and converted to a binary image. The binary image has some small objects which I don't want. I wanted to eliminate those small objects, so I used OpenCV's cvFindContours() method to detect contours in the binary image. The problem is that I can't eliminate the small objects in the output image. I used the cvContourArea() function, but it didn't work properly, and the erode function didn't work properly either.
So please someone help me with this problem..
The binary image which I obtained :
The result/output image which I want to obtain :
Ok, I believe your problem could be solved with the bounding box demo recently introduced by OpenCV.
As you have probably noticed, the object you are interested in should be inside the largest rectangle drawn in the picture. Luckily, this code is not very complex and I'm sure you can figure it all out by investigating and experimenting with it.
Here is my solution to eliminate small contours.
The basic idea is to check the length/area of each contour, then delete the smaller ones from the vector container.
Normally you will get contours like this:
Mat canny_output; //example from OpenCV Tutorial
vector<vector<Point> > contours;
vector<Vec4i> hierarchy;
Canny(src_img, canny_output, thresh, thresh*2, 3);//with or without, explained later.
findContours(canny_output, contours, hierarchy, CV_RETR_TREE, CV_CHAIN_APPROX_SIMPLE, Point(0,0));
With Canny() pre-processing, you will get contour segments, however each segment is stored with boundary pixels as a closed ring. In this case, you can check the length and delete the small one like
for (vector<vector<Point> >::iterator it = contours.begin(); it!=contours.end(); )
{
if (it->size()<contour_length_threshold)
it=contours.erase(it);
else
++it;
}
Without Canny() preprocessing, you will get contours of objects.
Similarly, you can also use the area to define a threshold for eliminating small objects, as the OpenCV tutorial shows:
vector<Point> contour = contours[i];
double area0 = contourArea(contour);
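Note that contourArea() computes the area from the contour polygon (using Green's formula), so it can differ from the number of non-zero pixels inside the contour.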
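As a rough illustration of filtering by area, here is a plain C++ sketch that computes a polygon area with the shoelace formula (a discrete form of Green's formula) and keeps only the contours at or above a threshold. The names Pt, Contour, polygonArea and dropSmallContours are made up for this example; with OpenCV you would call contourArea() on each contour instead.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for cv::Point and a contour (illustration only).
struct Pt { int x; int y; };
using Contour = std::vector<Pt>;

// Area of the closed polygon through the points, via the shoelace formula.
double polygonArea(const Contour& c) {
    if (c.size() < 3) return 0.0;
    long long twiceArea = 0;
    for (std::size_t i = 0; i < c.size(); ++i) {
        const Pt& p = c[i];
        const Pt& q = c[(i + 1) % c.size()];
        twiceArea += static_cast<long long>(p.x) * q.y
                   - static_cast<long long>(q.x) * p.y;
    }
    return std::fabs(static_cast<double>(twiceArea)) / 2.0;
}

// Returns a copy of the contours whose area is at least minArea.
std::vector<Contour> dropSmallContours(const std::vector<Contour>& contours,
                                       double minArea) {
    std::vector<Contour> kept;
    for (const Contour& c : contours) {
        if (polygonArea(c) >= minArea) kept.push_back(c);
    }
    return kept;
}
```

For example, the square with corners (0,0), (4,0), (4,4), (0,4) has a polygon area of 16, although a filled blob with those corner pixels covers 5 x 5 = 25 pixels. Keep that difference in mind when choosing the threshold.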
Are you sure filtering by small contour area didn't work? It's always worked for me. Can we see your code?
Also, as sue-ling mentioned, it's a good idea to use both erode and dilate to approximately preserve area. To remove small noisy bits, use erode first, and to fill in holes, use dilate first.
And another aside, you may want to check out the new C++ versions of the cv* functions if you weren't aware of them already (documentation for findContours). They're much easier to use, in my opinion.
Judging by the before and after images, you need to determine the area of all the white areas or blobs, then apply a threshold area value. This would eliminate all areas less than the value and leave only the large white region which is seen in the 2nd image. After using the cvFindContours function, try using 0 order moments. This would return the area of the blobs in the image. This link might be helpful in implementing what I've just described.
http://www.aishack.in/2010/07/tracking-colored-objects-in-opencv/
I believe you can use morphological operators like erode and dilate (read more here)
You need to perform erosion with a kernel size close to the radius of the circle on the right (the one you want to eliminate), followed by dilation using the same kernel to fill the gaps created by the erosion step.
FYI, erosion followed by dilation using the same kernel is called opening.
The code will be something like this:
int erosion_size = 30; // adjust for your application
Mat erode_element = getStructuringElement( MORPH_ELLIPSE,
Size( 2*erosion_size + 1, 2*erosion_size+1 ),
Point( erosion_size, erosion_size ) );
erode( binary_img, binary_img, erode_element );
dilate( binary_img, binary_img, erode_element );
It is not a fast way, but it may be useful in some cases.
There is a new function in OpenCV 3.0: connectedComponentsWithStats. With it we can get the area of each connected component and eliminate the unnecessary ones. This makes it easy to remove a circle with holes, even when it has the same bounding box as a solid circle.
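To show the idea behind area-based component filtering without depending on OpenCV, here is a minimal plain C++ sketch. It labels connected foreground regions with a breadth-first search using 4-connectivity (a simplification chosen for this sketch) and keeps only regions with at least minArea pixels. Grid and removeSmallComponents are hypothetical names, and the grid is assumed to be rectangular. In OpenCV 3.0 the per-component areas would come from the stats output of connectedComponentsWithStats instead.

```cpp
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Binary image as rows of ints: 0 = background, non-zero = foreground.
using Grid = std::vector<std::vector<int>>;

// Returns a copy of img where components smaller than minArea pixels are cleared.
Grid removeSmallComponents(const Grid& img, std::size_t minArea) {
    const int rows = static_cast<int>(img.size());
    const int cols = rows > 0 ? static_cast<int>(img[0].size()) : 0;
    Grid out(rows, std::vector<int>(cols, 0));
    std::vector<std::vector<bool>> seen(rows, std::vector<bool>(cols, false));
    const int dr[4] = {1, -1, 0, 0};
    const int dc[4] = {0, 0, 1, -1};

    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            if (img[r][c] == 0 || seen[r][c]) continue;
            // Collect every pixel of this component with a breadth-first search.
            std::vector<std::pair<int, int>> pixels;
            std::queue<std::pair<int, int>> frontier;
            frontier.push(std::make_pair(r, c));
            seen[r][c] = true;
            while (!frontier.empty()) {
                const std::pair<int, int> cur = frontier.front();
                frontier.pop();
                pixels.push_back(cur);
                for (int k = 0; k < 4; ++k) {
                    const int nr = cur.first + dr[k];
                    const int nc = cur.second + dc[k];
                    if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) continue;
                    if (img[nr][nc] == 0 || seen[nr][nc]) continue;
                    seen[nr][nc] = true;
                    frontier.push(std::make_pair(nr, nc));
                }
            }
            // The component's area is its pixel count; keep it only if large enough.
            if (pixels.size() >= minArea) {
                for (const std::pair<int, int>& p : pixels) out[p.first][p.second] = 1;
            }
        }
    }
    return out;
}
```

Because the area here is a pixel count, a ring or a circle with holes has a smaller area than a solid disc of the same outer size, so the two can be told apart even though their bounding boxes match.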