How to train/use the HOGDescriptor class in OpenCV - c++

I have been looking into training/using OpenCV to attempt to detect human figures. I want to try training a HOG for my specific purposes and not use the provided getDefaultPeopleDetector function. I have been unable to find any usable documentation on the HOGDescriptor class.
How do I train my own classifier for my own purposes?

The HOG descriptor is easy to implement, so you can write your own code for it. See http://smsoftdev-solutions.blogspot.com/2009/08/integral-histogram-for-fast-calculation.html, which describes a fast implementation of HOG.
Once you have the HOG features of all the training images, you can train an SVM in OpenCV. Training with a Gaussian kernel has produced good results.
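As a rough illustration of how small the core of HOG is, here is a minimal pure-Python sketch of the orientation histogram for a single cell. It assumes a grayscale cell given as a list of rows, 9 unsigned orientation bins over 0 to 180 degrees, and hard (non-interpolated) voting. The name `cell_histogram` is made up for this example, and block normalization and the integral-histogram speed-up are left out.

```python
import math

def cell_histogram(cell, bins=9):
    """Orientation histogram of one grayscale cell (list of rows of numbers).

    Simplifications (assumptions of this sketch): central-difference
    gradients, border indices clamped, unsigned angles in [0, 180),
    and each pixel votes into a single bin with its gradient magnitude.
    """
    h, w = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(h):
        for x in range(w):
            # Central differences, clamping indices at the border.
            gx = cell[y][min(x + 1, w - 1)] - cell[y][max(x - 1, 0)]
            gy = cell[min(y + 1, h - 1)][x] - cell[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The trailing % bins guards against angle rounding up to 180.0.
            hist[int(angle // bin_width) % bins] += magnitude
    return hist

# An 8x8 cell with a vertical edge: left half dark, right half bright.
cell = [[0] * 4 + [255] * 4 for _ in range(8)]
hist = cell_histogram(cell)
```

For this vertical edge every gradient is horizontal, so all the votes land in the first bin. A full descriptor would concatenate such histograms over the cells of each block and normalize per block.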

Related

SIFT implementation that returns a single vector for the entire image

I'm trying to use SIFT (from OpenCV) to get a histogram that describes an image. The problem is that SIFT identifies a lot of points of interest in the image and gives me a 128-element vector for each of them. While this seems to be what SIFT is supposed to do, my lab's PI told me there is an implementation that gives a single 128-element vector for the whole image. Do you know of such an implementation?
If not, is there any other way of getting a good descriptor for an image
(for the purpose of machine learning classification)?
In SIFT feature extraction, each keypoint (interest point) yields one 128-dimensional descriptor. Since an image has multiple keypoints, you get a (number of keypoints) x 128 set of SIFT vectors for each image. In my experience, to use SIFT feature extraction in OpenCV you have to build the library from source: because SIFT is a patented algorithm, the OpenCV community removed SIFT and SURF from the default build.
You can also try other feature extraction techniques such as VLAD, Fisher vectors, RGB color histograms, or HOG (Histogram of Oriented Gradients) features.
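One way to turn the per-keypoint descriptors into a single fixed-length vector is an encoding such as VLAD, mentioned above. The sketch below is a minimal pure-Python version under stated assumptions: the descriptors are plain lists of floats (tiny 2-D toy vectors here rather than real 128-D SIFT output), the codebook centroids are given rather than learned with k-means, and the name `vlad_encode` is invented for this example. Note that the result has k x d elements, not 128.

```python
import math

def vlad_encode(descriptors, centroids):
    """Aggregate local descriptors into one vector of length k * d.

    For each descriptor, find the nearest centroid and accumulate the
    residual (descriptor - centroid); then concatenate the k residual
    sums and L2-normalize the result.
    """
    k, d = len(centroids), len(centroids[0])
    residuals = [[0.0] * d for _ in range(k)]
    for desc in descriptors:
        # Index of the nearest centroid (squared Euclidean distance).
        nearest = min(
            range(k),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(desc, centroids[j])),
        )
        for i in range(d):
            residuals[nearest][i] += desc[i] - centroids[nearest][i]
    flat = [v for row in residuals for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

# Toy data (hypothetical): 4 descriptors of dimension 2, codebook of 2 centroids.
descriptors = [[0.0, 1.0], [1.0, 0.0], [9.0, 10.0], [10.0, 12.0]]
centroids = [[0.0, 0.0], [10.0, 10.0]]
image_vector = vlad_encode(descriptors, centroids)
```

Whatever the number of keypoints, every image is mapped to a vector of the same length, which is what a classifier needs.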

What dataset has the getDefaultPeopleDetector() SVM been trained on?

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
I have seen these two lines of code in many online forums, but I don't understand where the SVM vector comes from, i.e. what training data was used to train this SVM, and can I find that data and the source code anywhere?
And also why does the SVM vector have a length of 3781 for a 64x128 image?
Some insight into this would be really helpful.
Thanks
Here you are using a pre-trained people detector as the SVM. You can read about it in the documentation. I don't know how it was trained (the algorithms and parameters), but according to this answer it was trained on the Daimler Pedestrian Detection Dataset.
cv2.HOGDescriptor_getDefaultPeopleDetector() returns an array of 3781 elements. Those are the coefficients the SVM uses to classify people. The length is set by the detector's 64x128 window and HOG parameters, not by the size of the input image you are using.
Most importantly, you can train your own SVM to detect another kind of object and use it as the SVM detector. Check this answer for more.
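The 3781 figure can be reproduced from the HOG layout. The sketch below assumes the parameters normally used with the 64x128 people detector (16x16 blocks, 8-pixel block stride, 8x8 cells, 9 bins), which as far as I know are the HOGDescriptor defaults; the helper name `hog_descriptor_length` is made up for this example.

```python
def hog_descriptor_length(win=(64, 128), block=(16, 16),
                          stride=(8, 8), cell=(8, 8), bins=9):
    """Number of HOG features for one detection window."""
    blocks_x = (win[0] - block[0]) // stride[0] + 1   # 7 for the defaults
    blocks_y = (win[1] - block[1]) // stride[1] + 1   # 15 for the defaults
    cells_per_block = (block[0] // cell[0]) * (block[1] // cell[1])  # 4
    return blocks_x * blocks_y * cells_per_block * bins

features = hog_descriptor_length()   # 7 * 15 * 4 * 9 = 3780
detector_length = features + 1       # one extra coefficient for the SVM bias
print(features, detector_length)     # prints: 3780 3781
```

So the vector holds 3780 weights, one per HOG feature of a 64x128 window, plus one bias term.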

How to extract LBP features from a hand contour using opencv c++

I am currently working on a hand recognition system. I have been able to detect the hand and draw a contour for it. Now I have to extract features from the hand region. What is the best feature extraction method that I can use?
I was thinking of using Local Binary Patterns, but since I am new to computer vision I don't know how to use them.
You may want to look at histograms of oriented gradients (HOG), which can be considered a more general version of LBP. Collect multiple images of hands; by extracting HOG features from each image and training an SVM or neural network classifier on them, you can learn a statistical model of hand poses. This will help in recognizing an unseen hand. Also look at the current literature on deep learning.
A HOG implementation usable from C++ is available in the VLFeat library [1] and can be combined with OpenCV. HOG can also be computed with OpenCV itself [2].
[1] http://www.vlfeat.org/overview/hog.html
[2] http://goo.gl/8jTetR

People Detection with CvSVM and HOG

I'm working on a project (using OpenCV) where I need to accomplish the following:
Train a classifier so that it can detect people in a thermal image.
I decided to use OpenCV and classify with HOG and SVM.
So far, I have gotten to the point where I can
Load several images, positive and negative samples (about 1000)
Extract the HOG features for each image
Store the features with their labels
Train the SVM
Get the SVM settings (alpha and bias) and set them as the HOG descriptor's SVM
Run testing
The testing is horrible, even worse than the original one with
hog.setSVMDetector(HOGDescriptor::getDefaultPeopleDetector());
I think I'm computing the HOG features wrong, because I compute them for the whole image, but I need them computed on the part of the image where the person is. So I guess that I have to crop the images to where the person is, resize the crops to some window size, train the SVM on classifying those windows, and THEN pass it to the HOG descriptor.
When I test the images directly on the trained SVM, I get almost 100% false positives. I guess this is caused by the problem I described above.
I'm open to any ideas.
Regards,
hh
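For the step "get the SVM settings (alpha and bias) and set them as the HOG descriptor's SVM", the usual idea for a linear SVM is to collapse the support vectors into a single weight vector and append the bias. The following is a hedged pure-Python sketch of that algebra only, with toy numbers. The function names are invented, and the sign convention (decision value = w . x + bias, with the bias appended as the last element) is an assumption that should be checked against the SVM library in use, since libraries differ in the sign of the offset term.

```python
def collapse_linear_svm(support_vectors, alphas, bias):
    """Fold a linear SVM into one vector [w_1, ..., w_d, bias].

    w = sum_i alphas[i] * support_vectors[i], where alphas[i] already
    includes the class-label sign (an assumption of this sketch).
    """
    dim = len(support_vectors[0])
    w = [0.0] * dim
    for alpha, sv in zip(alphas, support_vectors):
        for i in range(dim):
            w[i] += alpha * sv[i]
    return w + [bias]

def decision_value(detector, features):
    """Score one feature vector: w . x + bias (sign convention assumed)."""
    *w, bias = detector
    return sum(wi * xi for wi, xi in zip(w, features)) + bias

# Toy example (hypothetical numbers): 3-D features, two support vectors.
support_vectors = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
alphas = [0.5, -0.25]
bias = -0.1
detector = collapse_linear_svm(support_vectors, alphas, bias)
score = decision_value(detector, [2.0, 4.0, 1.0])
```

The collapsed vector has one more element than the feature vector of a single sample. A detector vector is applied to one detection window at a time, so the features used for training need to have the length of one window's HOG descriptor, which is consistent with the suspicion above that the SVM should be trained on window-sized crops rather than whole images.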

Can HOG feature detection be used for keypoint matching?

I see HOG is often used with an SVM for target detection. Can it also be used for matching keypoints between two images?
And by the way, where can I find an OpenCV sample that uses HOGDescriptor?
HOG can be used without an SVM for feature matching.
Just choose some points (edge points, for example) and compute the HOG feature inside an ROI centered on each point.
HOGDescriptor seemed to me to be only for GPU programming, although a CPU version (cv::HOGDescriptor in the objdetect module) exists as well.
I created the HOG descriptor as a Mat in OpenCV, and it also works with OpenCV's matching functions.
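Matching descriptors computed around chosen points can be done with a brute-force nearest-neighbour search, similar in spirit to what a brute-force matcher does. Here is a minimal pure-Python sketch, assuming each keypoint already has a HOG-style descriptor as a list of floats (toy 3-D vectors here) and using Euclidean distance; `match_descriptors` is a name made up for this example.

```python
import math

def match_descriptors(query, train):
    """For each query descriptor, return (query_idx, train_idx, distance)
    of its nearest neighbour in `train` by Euclidean distance."""
    matches = []
    for qi, q in enumerate(query):
        best_idx, best_dist = -1, math.inf
        for ti, t in enumerate(train):
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(q, t)))
            if dist < best_dist:
                best_idx, best_dist = ti, dist
        matches.append((qi, best_idx, best_dist))
    return matches

# Toy descriptors (hypothetical): two keypoints in image A, three in image B.
desc_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
desc_b = [[0.0, 0.9, 0.1], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
matches = match_descriptors(desc_a, desc_b)
```

In practice you would also reject ambiguous matches, for example by thresholding the distance, since a nearest neighbour always exists even when no true correspondence does.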
If you are working with images, you can use SIFT/SURF with an SVM. Nothing stops you from using HOG for keypoint matching, but bear in mind that the effectiveness depends on the discriminative power and robustness of the descriptor.
Edit: My mistake: I originally said HOG was for video only. I was thinking of the histogram of optical flow vectors, which is very effective for describing activity in video.
Edit 2 [Oct '12]: For those looking for license-friendly descriptors that are fast and quite effective for keypoint matching, I now suggest trying ORB or BRISK.