I'm working on a project (using OpenCV) where I need to accomplish the following:
Train a classifier so that it can detect people in a thermal image.
I decided to use OpenCV and classify with HOG and SVM.
So far, I have gotten to the point where I can:
Load several images, positive and negative samples (about 1000)
Extract the HOG features for each image
Store the features with their label
Train the SVM
Get the SVM settings (alpha and bias) and set them as the HOG descriptor's SVM detector
Run testing
The testing results are horrible, even worse than with the original
hog.setSVMDetector(HOGDescriptor::getDefaultPeopleDetector());
I think I'm computing the HOG features wrong, because I compute them for the whole image, but I need them computed on the image region where the person is. So I guess I have to crop the images to the part where the person is, resize those crops to some fixed window size, train the SVM on classifying those windows, and THEN pass it to the HOG descriptor.
When I test the images directly on the trained SVM, I get almost 100% false positives. I guess this is caused by the problem I described earlier.
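This is roughly the window preparation I have in mind (a rough sketch in Python; the annotation format and variable names are placeholders for my data):

import cv2
import numpy as np

hog = cv2.HOGDescriptor()   # default 64x128 detection window

samples, labels = [], []
for path, (x, y, w, h) in positive_annotations:       # person bounding boxes
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    window = cv2.resize(img[y:y + h, x:x + w], (64, 128))
    samples.append(hog.compute(window).ravel())        # 3780 values per window
    labels.append(1)
for path in negative_paths:                            # person-free crops
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    window = cv2.resize(img, (64, 128))
    samples.append(hog.compute(window).ravel())
    labels.append(-1)
samples, labels = np.float32(samples), np.int32(labels)
# ...then train a linear SVM on these windows and plug it into hog.setSVMDetector()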
I'm open to any ideas.
Regards,
hh
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
I have seen these two lines of code in many online forums, but I don't understand where the SVM vector comes from, i.e. what training data was used to train this SVM, and can I find that data and source code anywhere?
And also why does the SVM vector have a length of 3781 for a 64x128 image?
Some insight into this would be really helpful.
Thanks
Here you are using a pre-trained people detector as the SVM. You can read about it in the docs. I don't know exactly how it was trained (the algorithm, the parameters), but according to this answer it was trained with the Daimler Pedestrian Detection Dataset.
cv2.HOGDescriptor_getDefaultPeopleDetector() returns an array of size 3781. Those are the coefficients the linear SVM uses to classify people, plus the bias term. The size does not depend on the image you run detection on; it comes from the 64x128 detection window: 16x16 blocks with an 8-pixel stride give 7x15 = 105 block positions, each contributing 4 cells x 9 orientation bins = 36 values, so 3780 weights plus 1 bias value = 3781.
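You can check both numbers quickly (a quick sanity check, assuming a standard OpenCV Python install):

import cv2
hog = cv2.HOGDescriptor()                                   # default 64x128 window
print(hog.getDescriptorSize())                              # 3780
print(len(cv2.HOGDescriptor_getDefaultPeopleDetector()))    # 3781 = 3780 weights + 1 bias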
And most importantly, you can train an SVM yourself to detect another object and use it as the SVM detector. Check this answer for more.
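For reference, here is a rough sketch of that in Python with the cv2.ml module. The feature matrix X (float32, one 3780-dimensional HOG row per sample) and the label vector y (int32, +1/-1) are placeholders, and only a linear kernel can be converted into a detector this way:

import cv2
import numpy as np

svm = cv2.ml.SVM_create()
svm.setType(cv2.ml.SVM_C_SVC)
svm.setKernel(cv2.ml.SVM_LINEAR)     # must be linear for setSVMDetector
svm.setC(0.01)
svm.train(X, cv2.ml.ROW_SAMPLE, y)

# For a linear SVM, getSupportVectors() returns the compressed weight vector w;
# the detector that HOG expects is [w, -rho].
w = svm.getSupportVectors().ravel()
rho = svm.getDecisionFunction(0)[0]
detector = np.append(w, -rho).astype(np.float32)

hog = cv2.HOGDescriptor()            # 64x128 window, matching the 3780-dim features
hog.setSVMDetector(detector)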
I have written an object classification program using BoW clustering and SVM classification algorithms. The program runs successfully. Now that I can classify the objects, I want to track them in real time by drawing a bounding rectangle/circle around them. I have done some research and came up with the following ideas.
1) Use homography with the training images from the train data directory (a rough sketch of what I mean is below, after idea 2). The problem with this approach is that the training image needs to show essentially the same object as the test image. Since I'm not detecting specific object instances, the test images are closely related to the training images but not an exact match. In homography we find a known object in a test scene. Please correct me if I am wrong about homography.
2) Use feature tracking. I'm planning to extract the SIFT features in the test images that are similar to the training images and then track them by drawing a bounding rectangle/circle. But the issue here is: how do I know which features come from the object and which come from the background? Is there any member function in the SVM class that can return the keypoints or region of interest used to classify the object?
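For idea 1, this is roughly what I mean (a rough sketch; train_img and test_img are placeholder grayscale images, and SIFT may live in cv2.xfeatures2d on older OpenCV builds):

import cv2
import numpy as np

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(train_img, None)
kp2, des2 = sift.detectAndCompute(test_img, None)

matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]   # Lowe's ratio test

if len(good) >= 4:
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = train_img.shape[:2]
    corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
    box = cv2.perspectiveTransform(corners, H)                      # object outline in the test image
    cv2.polylines(test_img, [np.int32(box)], True, 255, 2)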
Thank you
First, I tried the default people detector in the OpenCV library.
HOGDescriptor hog;
hog.setSVMDetector(HOGDescriptor::getDefaultPeopleDetector());
// image, found rectangles, hit threshold, win stride, padding, scale, grouping threshold
hog.detectMultiScale(img, found, 0, Size(8,8), Size(0,0), 1.05, 2);
Although it returns positive matches in an indoor environment with a webcam, they are very rare. So I trained the detector with the INRIA dataset's positive and negative images, but this time there are far too many false positives. I'm not trying to reduce the false matches to zero; it would be enough to lower them to a reasonable level. What should I do?
Another issue is that I think the people in my sample videos are too far away to be easily distinguishable as human figures. I have tried reducing the cell size, but I'm not sure this is the right approach. What can be done about this?
Images would be helpful to you, but due to the reputation requirement I can't post them.
Thanks
Check the OpenCV doc (http://docs.opencv.org/modules/gpu/doc/object_detection.html#gpu-hogdescriptor-detectmultiscale); it seems you're not using the interface correctly.
Did you evaluate your trained SVM and observe a bad detection rate there as well? If so, you need to play a bit with the training parameters or the input data. As far as I remember, the INRIA set includes people and non-people images, but only the positive patches are exactly defined. When I trained a HOG classifier, the selection of negative samples had a lot of influence. Oh, and did you use boosting? IIRC boosting provided a large performance gain in the original paper.
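One concrete way to improve the negative selection is hard-negative mining (bootstrapping): run the detector you already trained over images that contain no people, treat every detection as a false positive, and add those patches as extra negatives before retraining. A rough sketch in Python, with placeholder names:

import cv2
import numpy as np

hog = cv2.HOGDescriptor()                     # 64x128 window
hog.setSVMDetector(first_round_detector)      # detector from the first training pass

hard_negatives = []
for img in negative_images:                   # person-free images, e.g. from INRIA neg/
    rects, weights = hog.detectMultiScale(img, winStride=(8, 8))
    for (x, y, w, h) in rects:                # every hit here is a false positive
        patch = cv2.resize(img[y:y + h, x:x + w], (64, 128))
        hard_negatives.append(hog.compute(patch).ravel())

# Append hard_negatives with label -1 to the original training set and retrain the SVM.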
I see HOG is often used with SVM for target detection; can it be used for matching keypoints in two images?
And by the way, where can I find an OpenCV sample of using HOGDescriptor?
HOG can be used without SVM for feature matching.
Just choose some points (edge points, for example) and compute the HOG feature inside an ROI centered on each of those points.
HOGDescriptor itself seems to be aimed at GPU programming.
I computed the HOG descriptors myself as a Mat in OpenCV, and they also work with OpenCV's matching functions.
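A rough sketch of that idea in Python; the window size, point lists, and image names are placeholders:

import cv2
import numpy as np

win = (64, 64)
hog = cv2.HOGDescriptor(win, (16, 16), (8, 8), (8, 8), 9)   # winSize, blockSize, blockStride, cellSize, nbins

def hog_at_points(gray, points):
    descs = []
    for (x, y) in points:                       # e.g. corners from cv2.goodFeaturesToTrack
        x0, y0 = int(x) - win[0] // 2, int(y) - win[1] // 2
        patch = gray[y0:y0 + win[1], x0:x0 + win[0]]
        if patch.shape == (win[1], win[0]):     # skip points too close to the border
            descs.append(hog.compute(patch).ravel())
    return np.float32(descs)

d1 = hog_at_points(img1_gray, pts1)
d2 = hog_at_points(img2_gray, pts2)
matches = cv2.BFMatcher(cv2.NORM_L2).match(d1, d2)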
If you are working with images, you can use SIFT/SURF with an SVM. There is nothing stopping you from using HOG for keypoint matching, but bear in mind that the effectiveness depends on the discriminative power and robustness of the descriptor.
Edit: My mistake when I originally said HOG was only for video; I was thinking of histograms of optical flow vectors, which are very effective for describing activity in video.
Edit 2 [Oct '12]: I now suggest ORB or BRISK to people looking for license-friendly descriptors that are fast and quite effective for keypoint matching.
I have been looking into training/using OpenCV to detect human figures. I want to try training a HOG-based detector for my specific purposes rather than using the provided getDefaultPeopleDetector function. I have been unable to find any usable documentation on the HOGDescriptor class.
How do I train my own classifier for my own purposes?
The HOG descriptor is very easy to implement, and you can write your own code to do it. Look at http://smsoftdev-solutions.blogspot.com/2009/08/integral-histogram-for-fast-calculation.html.
It is a fast implementation of HOG. Once you have the HOG features of all the training images, you can train an SVM in OpenCV. Training with a Gaussian kernel has produced good results.
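A small sketch of that last step, assuming X is a float32 array with one HOG feature row per training image and y holds the int32 labels (both placeholder names):

import cv2
import numpy as np

svm = cv2.ml.SVM_create()
svm.setType(cv2.ml.SVM_C_SVC)
svm.setKernel(cv2.ml.SVM_RBF)              # the Gaussian kernel mentioned above
svm.trainAuto(X, cv2.ml.ROW_SAMPLE, y)     # cross-validates C and gamma
svm.save("hog_svm.yml")

# Classify the HOG feature vector f of a new image:
label = svm.predict(np.float32([f]))[1]

Note that only a linear SVM can be plugged into HOGDescriptor's setSVMDetector; with a Gaussian kernel you classify each candidate window by calling predict on its features instead.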