How do I answer this Computer Vision Theoretical question? - computer-vision

Suppose there is an image containing multiple objects of different types. The objective of the problem is to recognize objects using primary features of objects (colour, texture, shape). Explain your own idea what concepts will you apply, and how will you apply them, to differentiate/classify the objects in the image by extracting primary features (or combination of features) of objects. Also, justify how your idea can produce the best accuracy.

Since this is a theoretical question, it can have many answers. The simplest approach is to use k-means or weighted k-means, using the features you have. If you have quite unique features then k-means would be able to classify decently accurately. You might still have to juggle around finding how you would input some of the more esoteric features to k-means though. Other more involved methods would use your own trained model using CNN for classification using the features you provide.
Since this is a theoretical question this is all the answer I can provide you with.

Related

Why opencv SVM predict same object result is different after 2 times training? [duplicate]

I am using a multi-dimensional SVM classifier (SVM.NET, a wrapper for libSVM) to classify a set of features.
Given an SVM model, is it possible to incorporate new training data without having to recalculate on all previous data? I guess another way of putting it would be: is an SVM mutable?
Actually, it's usually called incremental learning. The question has come up before and is pretty well answered here : A few implementation details for a Support-Vector Machine (SVM).
In brief, it's possible but not easy, you would have to change the library you are using or implement the training algorithm yourself.
I found two possible solutions, SVMHeavy and LaSVM, that supports incremental training. But I haven't used either and don't know anything about them.
Online and incremental although similar but differ slightly. In online, its generally a single pass(epoch=1) or number of epochs could be configured. Where as, incremental would mean that you already have a model; no matter how it is built, but then model can be mutable by new examples. Also, a combination of online and incremental is often what is required.
Here is a list of tools with some remarks on the online and/or incremental SVM : https://stats.stackexchange.com/questions/30834/is-it-possible-to-append-training-data-to-existing-svm-models/51989#51989

Best method to recognize objects?

I am trying to recognize traffic signs, I already detect them, but now I have to do something to know which one is each of them, I have been reading about the cascade classifier http://docs.opencv.org/doc/user_guide/ug_traincascade.html but i would need 1 for each sign right? Furthermore i am having some issues with the merging of vec files. What do you recommend me to use for this object recognition? i am not sure if the cascade classifer is the best method.....
Thanks a lot fot your help!!
I also posted it here: http://answers.opencv.org/question/65836/best-method-to-recognize-objects/
Try to use Statistical Invariants + Perceptron (or another classifier such SVM).
Statistical Invariants give abilities to calculate features which still constant under the translation, rotation, isotropic scaling. They are easy for understanding and don't require high computational cost.

dimension reduction in spam filtering

I'm performing an experiment in which I need to compare classification performance of several classification algorithms for spam filtering, viz. Naive Bayes, SVM, J48, k-NN, RandomForests, etc. I'm using the WEKA data mining tool. While going through the literature I came to know about various dimension reduction methods which can be broadly classified into two types-
Feature Reduction: Principal Component Analysis, Latent Semantic Analysis, etc.
Feature Selection: Chi-Square, InfoGain, GainRatio, etc.
I have also read this tutorial of WEKA by Jose Maria in his blog: http://jmgomezhidalgo.blogspot.com.es/2013/02/text-mining-in-weka-revisited-selecting.html
In this blog he writes, "A typical text classification problem in which dimensionality reduction can be a big mistake is spam filtering". So, now I'm confused whether dimensionality reduction is of any use in case of spam filtering or not?
Further, I have also read in the literature about Document Frequency and TF-IDF as being one of feature reduction techniques. But I'm not sure how does it work and come into play during classification.
I know how to use weka, chain filters and classifiers, etc. The problem I'm facing is since I don't have enough idea about feature selection/reduction (including TF-IDF) I am unable to decide how and what feature selection techniques and classification algorithms I should combine to make my study meaningful. I also have no idea about optimal threshold value that I should use with chi-square, info gain, etc.
In StringToWordVector class, I have an option of IDFTransform, so does it makes sence to set it to TRUE and also use a feature selection technique, say InfoGain?
Please guide me and if possible please provide links to resources where I can learn about dimension reduction in detail and can plan my experiment meaningfully!
Well, Naive Bayes seems to work best for spam filtering, and it doesn't play nicely with dimensionality reduction.
Many dimensionality reduction methods try to identify the features of the highest variance. This of course won't help a lot with spam detection, you want discriminative features.
Plus, there is not only one type of spam, but many. Which is likely why naive Bayes works better than many other methods that assume there is only one type of spam.

open source CRFs implementation for computer vision problems? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 7 years ago.
Improve this question
There are several open source implementations of conditional random fields (CRFs) in C++, such as CRF++, FlexCRF, etc. But from the manual, I can only understand how to use them for 1-D problems such as text tagging, it's not clear how to apply them in 2-D vision problems, suppose I have computed the association potentials at each node and the interaction potentials at each edge.
Did anyone use these packages for vision problems, e.g., segmentation? Or they simply cannot be used in this way?
All in all, is there any open source packages of CRFs for vision problems?
Thanks a lot!
The newest version of dlib has support for learning pairwise Markov random field models over arbitrary graph structures (including 2-D grids). It estimates the parameters in a max-margin sense (i.e. using a structural SVM) rather than in a maximum likelihood sense (i.e. CRF), but if all you want to do is predict a graph labeling then either method is just as good.
There is an example program that shows how to use this stuff on a simple example graph. The example puts feature vectors at each node and the structured SVM uses them to learn how to correctly label the nodes in the graph. Note that you can change the dimensionality of the feature vectors by modifying the typedefs at the top of the file. Also, if you already have a complete model and just want to find the most probable labeling then you can call the underlying min-cut based inference routine directly.
In general, I would say that the best way to approach these problems is to define the graphical model you want to use and then select a parameter learning method that works with it. So in this case I imagine you are interested in some kind of pairwise Markov random field model. In particular, the kind of model where the most probable assignment can be found with a min-cut/max-flow algorithm. Then in this case, it turns out that a structural SVM is a natural way to find the parameters of the model since a structural SVM only requires the ability to find maximum probability assignments. Finding the parameters via maximum likelihood (i.e. treating this as a CRF) would require you to additionally have some way to compute sums over the graph variables, but this is pretty hard with these kinds of models. For this kind of model, all the CRF methods I know about are approximations, while the SVM method in dlib uses an exact solver. By that I mean, one of the parameters of the algorithm is an epsilon value that says "run until you find the optimal parameters to within epsilon accuracy", and the algorithm can do this efficiently every time.
There was a good tutorial on this topic at this year's computer vision and pattern recognition conference. There is also a good book on Structured Prediction and Learning in Computer Vision written by the presenters.

How to write gaussian mixture model in c++ and Opencv

I want to track an object in a video. So i suppose that I could use "Gaussian Mixture Models" in Opencv and C++ . I want to know how to write Gaussian Mixture Models in C++ . Are there any better algorithms for this than GMM?
Sorry to not answer the question directly but:
Reading research papers is a great thing to do, but to be honest, you will get much more knowledge at this point by trying your own ideas on your specific data and getting a better understanding of the problem.
If you know the shapes, it's probably better to use a generalized Hough transform or matched filter for position estimates, combined with a Kalman filter for tracking. These will be relatively easy to implement. Or maybe you can find existing implementations.
Also, I'd prototype your idea in Matlab or Octave instead of C++ if you are not a very good C++ programmer as you'll wind up wasting most of your time with problems in C++ when the problem itself is what you really want to focus on.
As I said in the comment, I'd skip out on using GMM's for now until you get a better understanding of the problem and how you are going to use them. (Unless of course you already have a good idea of how you will use them.)