Modifying an OpenCV RandomTree classifier - C++

My problem: the objective is to implement a computer vision paper that uses a random tree structure to regress pixels from an RGB-D image to 3D world coordinates.
I have already used OpenCV for AdaBoost and random forests, but I have never dived into the code.
Now that I would like to modify the error function of the split nodes, I don't know whether that is possible; I didn't see any clear declarations in the header file.
To add some information about what I want to do in the error function:
The input is a pixel (i, j). In the error function, depending on the parameter, a feature would be computed from the RGB-D image, and the best split over the features of all pixels in the subset would have to be found. The features clearly depend on the parameter and should be estimated during training.
My question:
Is it possible to create a class extending CvRTrees and modify the error function for each split node?
If yes, which member should be modified? If not, do you know of any library that could help me achieve this?

As no one answered, I will just post what I found out:
CvRTrees uses a fixed feature vector as input (e.g. a HOG descriptor).
If you want to use random features, you have to either feed all of these features in as input (which may be totally suboptimal or impossible),
or create your own implementation of the weak classifier, in which the type of feature used is a random variable, just as the threshold already is; see the sketch below.
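
Here is a minimal sketch of that second option. It is not OpenCV API: SplitCandidate, sampleRandomCandidate() and errorOf() are hypothetical names, the depth-difference feature is just one example of a parameterized feature, and bounds checks are omitted.

#include <opencv2/core/core.hpp>
#include <cfloat>
#include <vector>

// Hypothetical split-node candidate: the feature parameters (two pixel
// offsets into the depth image) are sampled at random per candidate,
// exactly like the threshold is in a standard random tree.
struct SplitCandidate {
    cv::Point off1, off2;  // feature parameters, sampled per candidate
    float     threshold;   // split threshold, sampled per candidate

    // Depth-difference feature response at pixel px (bounds checks omitted).
    float response(const cv::Mat& depth, cv::Point px) const {
        return depth.at<float>(px + off1) - depth.at<float>(px + off2);
    }
    bool goesLeft(const cv::Mat& depth, cv::Point px) const {
        return response(depth, px) < threshold;
    }
};

SplitCandidate sampleRandomCandidate();                     // hypothetical
float errorOf(const SplitCandidate&, const cv::Mat& depth,  // your custom
              const std::vector<cv::Point>& pixels);        // error function

// Train one split node: sample candidates and keep the one that minimizes
// the custom error (e.g. the 3D-coordinate variance of the two children).
SplitCandidate trainNode(const cv::Mat& depth,
                         const std::vector<cv::Point>& pixels,
                         int numCandidates)
{
    SplitCandidate best = SplitCandidate();
    float bestErr = FLT_MAX;
    for (int c = 0; c < numCandidates; ++c) {
        SplitCandidate cand = sampleRandomCandidate();
        float err = errorOf(cand, depth, pixels);
        if (err < bestErr) { bestErr = err; best = cand; }
    }
    return best;
}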

Related

Classification with SVM using a vocabulary built from Bag of Words

My intention is to build a classifier that correctly classifies an image ROI against the templates that I have manually extracted.
Here is what I have done.
My first step was to understand what should be done to achieve the above.
Through research on the net, I realized I would need to create representation vectors (of the templates), so I used Bag of Words to create the vocabulary.
I used and rewrote Roy's project for OpenCV 3.1 and also used his food database. On inspecting his database, I realized that some of the images contain multiple class types. I tried to clip the images so that each training image contains only one class of item, but the images are now of different sizes.
I have tried to run this code. The result is very disappointing: it always points to one class.
Questions I have:
Is my processing of the training images wrong? I have read around, and some posts suggest the image size must be constant, or at least the aspect ratio. I am confused by this. Are there tools available for resizing samples?
It does not matter what size the sample images are, since Roy's algorithm uses local descriptors extracted around points of interest.
An SVM is a binary classifier, and you need to train a different SVM for each class. For each class, it will say whether a sample belongs to that class or to the rest: the so-called one-vs-rest scheme.
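
As a rough sketch of one-vs-rest training with the OpenCV 3.1 ml module (assuming your BoW histograms are already stacked as rows of a CV_32F Mat called samples, with an integer class id per row; those names are mine, not Roy's):

#include <opencv2/ml.hpp>
#include <opencv2/core.hpp>
#include <vector>

// Train one binary SVM per class: label +1 for "this class", -1 for "rest".
std::vector<cv::Ptr<cv::ml::SVM> > trainOneVsRest(const cv::Mat& samples,
                                                  const cv::Mat& classIds,
                                                  int numClasses)
{
    std::vector<cv::Ptr<cv::ml::SVM> > models;
    for (int c = 0; c < numClasses; ++c) {
        cv::Mat labels(samples.rows, 1, CV_32S);
        for (int r = 0; r < samples.rows; ++r)
            labels.at<int>(r) = (classIds.at<int>(r) == c) ? 1 : -1;

        cv::Ptr<cv::ml::SVM> svm = cv::ml::SVM::create();
        svm->setType(cv::ml::SVM::C_SVC);
        svm->setKernel(cv::ml::SVM::LINEAR);
        svm->train(samples, cv::ml::ROW_SAMPLE, labels);
        models.push_back(svm);
    }
    return models;
}

At test time, run each model on the sample and pick the class with the strongest positive response.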

Extracting the descriptors of a bunch of sample images and training an SVM in OpenCV

I have 4 types of musical note symbols, all of the same color: whole note, half note, crotchet, and quaver. I need to classify an image and tell whether it contains one of these symbols (just one for now) and which one. For example, if I have an image with just the musical staff (but nothing else in it), it should tell me that the image is empty; but if I have an image with a half-note symbol in it, it should tell me something like "it is a half note".
Suppose I have 20 sample images for each possible symbol and 20 for the base case (nothing in it), and I want to train an SVM to classify any input image. I've read about how I could do it, but I still have certain doubts. I think the process is something like this (and please correct me if I'm wrong):
Extract the descriptors of all the sample images.
Put those descriptors inside different Mat objects (one for each symbol).
Feed those Mats to the SVM to train it.
Use the SVM to classify the images.
I have specific doubts about what I think the process is:
Is what I described the correct process for what I need to do?
Should I pre-process the sample images (say, extract the background and apply Canny edges) before I feed them to the descriptor extractor, or can I leave them as they are?
I have read about three methods of extracting descriptors: HOG, BoW (Bag of Words), and SIFT. I think they all do what I need, but I don't know which one to use. I see that HOG is mostly (if not always) used for face and pedestrian detection, and I don't know whether it can be used for my case. Any advice on which one to use?
How many sample images should I have for each case?
I don't need specific details of the implementation, but I do need answers to these questions. Thank you in advance.
I'm not an expert on SIFT and BoW, but I know something about HOG and SVM.
1. Is what I described the correct process for what I need to do?
If you are using OpenCV and HOG, no, that is not quite right. Have a look at the sample code for HOG in the OpenCV samples and you will find that, once extracted, the descriptors feed the SVM directly, without filling one Mat per symbol.
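
For a concrete picture, here is a minimal sketch of that flow, assuming OpenCV 3.x, fixed-size 8-bit grayscale training windows, and a hypothetical output file name; with the older CvSVM API the idea is the same:

#include <opencv2/objdetect.hpp>
#include <opencv2/ml.hpp>
#include <opencv2/core.hpp>
#include <vector>

// Compute a HOG descriptor per training window and stack them as rows of
// one training matrix; classIds holds the class label for each row.
void trainHogSvm(const std::vector<cv::Mat>& windows,  // all 64x128, 8-bit gray
                 const std::vector<int>& classIds)
{
    cv::HOGDescriptor hog(cv::Size(64, 128), cv::Size(16, 16),
                          cv::Size(8, 8), cv::Size(8, 8), 9);
    cv::Mat samples;
    for (size_t i = 0; i < windows.size(); ++i) {
        std::vector<float> desc;
        hog.compute(windows[i], desc);                  // one descriptor per window
        samples.push_back(cv::Mat(desc).reshape(1, 1)); // one row per sample
    }

    cv::Ptr<cv::ml::SVM> svm = cv::ml::SVM::create();
    svm->setKernel(cv::ml::SVM::LINEAR);
    svm->train(samples, cv::ml::ROW_SAMPLE, cv::Mat(classIds, true));
    svm->save("notes_svm.yml");                         // hypothetical file name
}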
2. Should I pre-process the sample images (say, extract the background and apply Canny edges) before I feed them to the descriptor extractor, or can I leave them as they are?
This is not mandatory. Preprocessing has proved to be very useful, but for your simple case you won't need it. On the other hand, if your background contains drawings, stickers, or anything else that can confuse the detector, then yes: it can be a good way to decrease the number of false positives.
3. I have read about three methods of extracting descriptors: HOG, BoW (Bag of Words), and SIFT. I think they all do what I need, but I don't know which one to use. Any advice on which one to use?
I have direct knowledge only of HOG. You can easily implement your own detector with HOG without any problem; I'm currently using it for traffic signs. Pay attention to the detection window that you want to use. You can leave all the other parameters as they are; it will work for simple cases.
4. How many sample images should I have for each case?
Once again, it depends on the situation. I would say that 200 images per class (try with fewer, too) will do the trick, but you can always increase the number by applying transformations to the positives: try flipping, saturating, or blurring the images, as in the sketch below.
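
For instance, a tiny augmentation sketch along those lines (the function name is made up; saturation tweaks would be done similarly after converting to HSV with cvtColor):

#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <vector>

// Generate extra positives from one sample: a horizontal flip and a blur.
std::vector<cv::Mat> augment(const cv::Mat& sample)
{
    std::vector<cv::Mat> out;
    cv::Mat flipped, blurred;
    cv::flip(sample, flipped, 1);                         // 1 = around the y-axis
    cv::GaussianBlur(sample, blurred, cv::Size(5, 5), 0); // mild smoothing
    out.push_back(flipped);
    out.push_back(blurred);
    return out;
}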
Some more considerations. I think you can work with grayscale images, since color is not important for distinguishing the notes (they are all the same color, right?). If you have problems with false positives, you can try using the HSV color space to filter out patches that you then use to detect the notes (it works really well with red!! See the quick sketch below). The easiest way to train your SVM is to use a linear kernel and train one model per class.
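
A quick sketch of that HSV filtering idea (the threshold values are placeholders to tune; the range below picks reddish hues):

#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>

// Keep only pixels within a hue/saturation/value range; the resulting mask
// can be used to pick candidate patches before running the detector.
cv::Mat hsvMask(const cv::Mat& bgr)
{
    cv::Mat hsv, mask;
    cv::cvtColor(bgr, hsv, cv::COLOR_BGR2HSV);
    cv::inRange(hsv, cv::Scalar(0, 80, 50), cv::Scalar(10, 255, 255), mask);
    return mask;
}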

Object Annotation in images with OpenCV

I am trying to develop an automatic (or semi-automatic) image annotator for my final-year project with OpenCV. I have been studying many OpenCV resources and have come across cascade classification for training and detection purposes. I understood that part, and also tried the face detection tutorial provided with OpenCV. So now I know how to train and detect objects.
However, I still cannot understand how I can annotate the objects present in an image.
For example, the system will show that this is an object, but I want the system to show that it is a ball. How can I accomplish that?
Thanks in advance.
One binary classifier (detector) can separate objects into two classes:
positive - the object type the classifier was trained for,
and negative - all others.
If you need to detect several distinct classes, you should use one detector per class, or you can train a multiclass classifier ("one vs. all" classifiers, for example), but that usually runs slower and with lower accuracy (because a detector is better at searching for similar objects). You can also take a look at convolutional networks (by Yann LeCun).
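
As a rough illustration of the one-detector-per-class route (the cascade file names and class labels here are made up), each detector's hits can be drawn with its class name, which is exactly the annotation you describe:

#include <opencv2/objdetect/objdetect.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/core/core.hpp>
#include <vector>

// Run one trained cascade per class and label each detection with the
// class name; the cascade files are hypothetical placeholders.
void annotate(cv::Mat& image)
{
    const char* names[]    = { "ball", "cup" };
    const char* cascades[] = { "ball_cascade.xml", "cup_cascade.xml" };

    for (int c = 0; c < 2; ++c) {
        cv::CascadeClassifier detector(cascades[c]);
        std::vector<cv::Rect> hits;
        detector.detectMultiScale(image, hits);
        for (size_t i = 0; i < hits.size(); ++i) {
            cv::rectangle(image, hits[i], cv::Scalar(0, 255, 0), 2);
            cv::putText(image, names[c], hits[i].tl(),
                        cv::FONT_HERSHEY_SIMPLEX, 0.8, cv::Scalar(0, 255, 0), 2);
        }
    }
}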
This is a very hard task. I suggest simplifying it by using the latent SVM detector and limiting yourself to the models it supplies:
http://docs.opencv.org/modules/objdetect/doc/latent_svm.html

Setting parameters for BRISK in OpenCV

I'm trying to use the BRISK implementation in OpenCV (for C++) to check whether an image (or part of an image) is contained in a photo. For example, I take a photo and try to match it against a set of images in a database, and I would like to select the best corresponding image (or get an error message if none of the images is good enough).
So I'm just testing OpenCV for the moment. I've simply taken the sample included in the framework (matching_to_many_images) and changed the detector and descriptor from SURF to BRISK.
However, I get weird results. These are the results of matching (brute-force Hamming):
In the first one, the scenes are entirely different, but there are a lot of matches!
In the second one, the scenes are pretty similar, but some matches are wrong.
I think this is a parameter issue, because in the demo videos of BRISK the results are impressive.
Have you seen the OpenCV documentation for BRISK? I'm not sure what parameters you're using now, but you can specify the threshold and octaves, as well as the pattern. Documentation at
http://docs.opencv.org/modules/features2d/doc/feature_detection_and_description.html#brisk
You could also try a different feature-matching algorithm, although it appears that in the BRISK paper they also used Hamming distance.
Lastly, it's not too unexpected to have erroneous feature matches; try different scenes as well as different feature parameters and see how your results change.
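
For reference, a minimal way to set those parameters with the 2.4-style constructor (the defaults are thresh=30, octaves=3, patternScale=1.0f; the value 60 below is just an arbitrary example):

#include <opencv2/features2d/features2d.hpp>
#include <vector>

// A higher detection threshold keeps only stronger corners, which tends to
// reduce spurious matches at the cost of fewer keypoints.
cv::BRISK brisk(60 /*thresh*/, 4 /*octaves*/, 1.0f /*patternScale*/);

void detectAndDescribe(const cv::Mat& img,
                       std::vector<cv::KeyPoint>& keypoints,
                       cv::Mat& descriptors)
{
    brisk(img, cv::noArray(), keypoints, descriptors); // detect + compute
}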
There are commonly many incorrect initial matches when doing feature-to-feature matching with SIFT, SURF, BRISK, or any other local descriptor.
Many of these initial matches will be incorrect due to ambiguous features or features that arise from background clutter. [From Distinctive Image Features from Scale-Invariant Keypoints]
The next step is to select only the subset of matches that all agree on a common transformation between the two images. This is explained in sections 7.3 and 7.4 of Distinctive Image Features from Scale-Invariant Keypoints.
The OpenCV tutorial gives an excellent example of how to extract features and calculate a homography (a transformation that tells you how to transform each point from one image to the other).
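
A rough sketch of that filtering step, assuming you already have the keypoints and binary descriptors of both images (Lowe's ratio test followed by RANSAC, roughly as in the tutorial; the 0.8 ratio is a common choice, not a fixed rule):

#include <opencv2/features2d/features2d.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <vector>

// Keep only matches that pass the ratio test, then let RANSAC find the
// subset consistent with a single homography; 'mask' marks the inliers.
cv::Mat filterMatches(const std::vector<cv::KeyPoint>& kp1,
                      const std::vector<cv::KeyPoint>& kp2,
                      const cv::Mat& desc1, const cv::Mat& desc2,
                      std::vector<uchar>& mask)
{
    cv::BFMatcher matcher(cv::NORM_HAMMING);
    std::vector<std::vector<cv::DMatch> > knn;
    matcher.knnMatch(desc1, desc2, knn, 2);

    std::vector<cv::Point2f> pts1, pts2;
    for (size_t i = 0; i < knn.size(); ++i)
        if (knn[i].size() == 2 &&
            knn[i][0].distance < 0.8f * knn[i][1].distance) {
            pts1.push_back(kp1[knn[i][0].queryIdx].pt);
            pts2.push_back(kp2[knn[i][0].trainIdx].pt);
        }

    if (pts1.size() < 4) return cv::Mat(); // too few matches for a homography
    return cv::findHomography(pts1, pts2, CV_RANSAC, 3.0, mask);
}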
You can replace the feature detector/descriptor with any other one, which will give you different robustness to transformations like rotation and scaling, or to degradations like blur and illumination change. The basic implementation of BRISK already comes with meaningful default parameters.
Last but not least, if you try to match two completely different images, what would you expect as a result? The algorithm will try to find similarities, and will therefore always compute a result, even if it is nonsense and the scores are very low. Just keep in mind: garbage in -> garbage out.

Finding convexity defects in OpenCV 2.3, C++ with MS Visual Studio 2010

I am currently attempting to use OpenCV 2.3 and C++ to detect a hand (wearing a green glove) and to distinguish between different hand gestures.
At this very moment, my next step is to acquire specific features of the hand (convexity defects).
So far I've used these functions in my process:
mixChannels(); // to subtract the non-green channels from green
threshold(); // to convert to binary
erode(); dilate(); // to get rid of any excess noise
findContours(); // to find the contours, of course
These have worked splendidly, and I have been able to visualize the result of findContours() through the use of drawContours().
The next step, and this is where I'm at, is using convexHull(), which also works in OpenCV 2.3. I have, however, yet to find out what the vector results of convexHull() actually look like (what they contain).
But this is where the tricky part comes.
I found that the older version of OpenCV (the C interface, which uses IplImage) has a neat little function called cvConvexityDefects(), which can give a set of deficiencies on the convex hull. These are what I need, but there seems to be no such function in OpenCV 2.3, and I don't see how I can use the old syntax to get these results.
Here's a link to the OpenCV documentation on cvConvexityDefects.
What I'm asking for is either a similar OpenCV 2.3 function, a self-written piece of code or an algorithm for finding these defects, or a method to use the old 2.1 syntax on a vector result, or something like that.
(I know that I can use other features, such as rectangular bounding boxes and fitted circles, but I'm sure that convexity defects yield the most distinguishable features.)
Solution: I ended up using a C++ wrapper from this post.
The only thing not working in this wrapper seems to be a leak of the defects vector, which should be easily solvable.
The next step is getting some usable data from these defects. (At first glance, the data seems to be single points on the convex hull or the contour, or a count of these. I had at first expected a pair of points, or a single point and a length, which it does not seem to be. If I hit a brick wall with this, I'll make another post.)
The new C++ interface does not (yet) support all the functions in C, and the opposite is also true (not everything in C++ is available in C). The reasons are various, but the good news is that you can easily use whatever function you want. For example, here you have to convert the contours to sequences (CvSeq) and send them to your function.
Moreover, the findContours method is a wrapper over cvFindContours. You can call the C version directly on an IplImage header of your Mat (note that casting a cv::Mat straight to IplImage* does not work; take a header first):
IplImage ipl = matImage;
cvFindContours(&ipl, ...);
and then use the result directly.
Another way would be to create a nice, clean C++ wrapper over cvConvexityDefects() and submit it to OpenCV. In the findContours source you will find some help for that (the opposite conversion).
I would like to attempt to convert my convexHull vector from the C++ syntax to the sequence you mentioned, but I don't know where to start. Could you perhaps shed some light on this?
Check out this question; I believe it goes into it:
Convexity defects C++ OpenCv
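
For completeness, here is a minimal sketch of such a wrapper, along the lines of the one discussed in that question (the helper name is mine and error handling is omitted). It wraps the C++ vectors in C-style CvMat headers, calls cvConvexityDefects(), and copies the results out:

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/imgproc/imgproc_c.h>
#include <vector>

// Convexity defects for a C++-style contour (vector of points).
std::vector<CvConvexityDefect> convexityDefects(const std::vector<cv::Point>& contour)
{
    // For this call, the hull must be given as *indices* into the contour.
    std::vector<int> hullIndices;
    cv::convexHull(contour, hullIndices, false, false);

    // CvMat headers over the existing vector data (nothing is copied).
    cv::Mat contourHeader(contour), hullHeader(hullIndices);
    CvMat contourMat = contourHeader, hullMat = hullHeader;

    CvMemStorage* storage = cvCreateMemStorage(0);
    CvSeq* defectSeq = cvConvexityDefects(&contourMat, &hullMat, storage);

    std::vector<CvConvexityDefect> result;
    for (int i = 0; i < defectSeq->total; ++i)
        result.push_back(*(CvConvexityDefect*)cvGetSeqElem(defectSeq, i));

    // The copied structs hold pointers into 'contour', not into the storage,
    // so releasing the storage here avoids the leak mentioned above.
    cvReleaseMemStorage(&storage);
    return result;
}

Each CvConvexityDefect gives you the start and end points of the hull edge, the deepest contour point between them (depth_point), and the depth itself, which is the "pair of points plus a length" expected above.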