Least Median of Squares robust regression C++

I have a set of data z(1), z(2), ..., z(n) that I am currently fitting with a two-variable polynomial of the form p(x,y) = a(1)*x^2+a(2)*y^2+a(3)*x*y+a(4). For i=1,...,n I have coordinates (x(i),y(i)) at which I impose p(x(i),y(i))=z(i). This gives an overdetermined system that I can solve in the least-squares sense using the Eigen SVD. I am looking for a more sophisticated method that can handle outliers, such as Least Median of Squares robust regression (as described here), but I haven't found a C++ implementation for two variables. I looked in GSL, but it seems to have nothing for two-variable functions. The only other solution I can think of is using a TGraph2D in ROOT. Do you know of any other solution? Numerical Recipes, maybe? Since I am writing C++ code, I would prefer C or C++ implementations.

Since no answer has been given yet and I am still working on this problem, I will share my progress here.
The class TLinearFitter has a fit method that allows you to select Robust fitting - Least Trimmed Squares regression (LTS):
https://root.cern.ch/root/html532/TLinearFitter.html
Another possible solution, perhaps more time-consuming at first but possibly more efficient in the long run, is to write my own function to be minimized and then use Ipopt to minimize it:
https://projects.coin-or.org/Ipopt
This approach has a bigger initial "step", though: I don't know how to use the library, and I haven't (yet?) found a nice tutorial for it.
At https://wis.kuleuven.be/stat/robust/software there is a Fortran implementation of the LMedS algorithm called PROGRESS. So another possible solution could be to port this software to C/C++ and make a library out of it.
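For reference, the core of LMedS is small enough to sketch without any library. The following is only a minimal sketch and not a port of PROGRESS: it approximates the LMedS estimate by random sampling, fitting the polynomial from the question exactly through random 4-point subsets and keeping the candidate with the smallest median squared residual. The names Sample, Coeffs, solve4 and fitLMedS, as well as the pivot threshold, trial count and seed, are made up for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <limits>
#include <random>
#include <utility>
#include <vector>

struct Sample { double x, y, z; };
using Coeffs = std::array<double, 4>;  // a(1), a(2), a(3), a(4)

// p(x,y) = a(1)*x^2 + a(2)*y^2 + a(3)*x*y + a(4)
double evaluate(const Coeffs& a, double x, double y) {
    return a[0] * x * x + a[1] * y * y + a[2] * x * y + a[3];
}

// Solve a 4x4 system given as an augmented 4x5 matrix (Gaussian elimination
// with partial pivoting). Returns false if it is numerically singular.
bool solve4(std::array<std::array<double, 5>, 4> m, Coeffs& a) {
    for (int col = 0; col < 4; ++col) {
        int piv = col;
        for (int r = col + 1; r < 4; ++r)
            if (std::fabs(m[r][col]) > std::fabs(m[piv][col])) piv = r;
        if (std::fabs(m[piv][col]) < 1e-9) return false;  // assumed threshold
        std::swap(m[col], m[piv]);
        for (int r = col + 1; r < 4; ++r) {
            const double f = m[r][col] / m[col][col];
            for (int c = col; c < 5; ++c) m[r][c] -= f * m[col][c];
        }
    }
    for (int r = 3; r >= 0; --r) {
        double s = m[r][4];
        for (int c = r + 1; c < 4; ++c) s -= m[r][c] * a[c];
        a[r] = s / m[r][r];
    }
    return true;
}

// Median of the squared residuals z(i) - p(x(i), y(i)).
double medianSquaredResidual(const std::vector<Sample>& data, const Coeffs& a) {
    std::vector<double> r2;
    r2.reserve(data.size());
    for (const Sample& s : data) {
        const double r = s.z - evaluate(a, s.x, s.y);
        r2.push_back(r * r);
    }
    const auto mid = r2.begin() + static_cast<std::ptrdiff_t>(r2.size() / 2);
    std::nth_element(r2.begin(), mid, r2.end());
    return *mid;
}

// Approximate LMedS by random sampling: fit many 4-point subsets exactly and
// keep the candidate with the smallest median squared residual.
bool fitLMedS(const std::vector<Sample>& data, int trials, unsigned seed,
              Coeffs& best) {
    if (data.size() < 4) return false;
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, data.size() - 1);
    double bestMedian = std::numeric_limits<double>::infinity();
    for (int t = 0; t < trials; ++t) {
        std::array<std::array<double, 5>, 4> m{};
        for (auto& row : m) {
            const Sample& s = data[pick(rng)];
            row = {s.x * s.x, s.y * s.y, s.x * s.y, 1.0, s.z};
        }
        Coeffs a{};
        if (!solve4(m, a)) continue;  // repeated or degenerate points
        const double med = medianSquaredResidual(data, a);
        if (med < bestMedian) { bestMedian = med; best = a; }
    }
    return std::isfinite(bestMedian);
}
```

With a fraction e of outliers, a single 4-point draw is outlier-free with probability (1-e)^4, so the number of trials should grow with the contamination you expect. A common refinement, not shown here, is to finish with an ordinary least-squares fit on the points whose residuals are small under the LMedS fit.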

Related

What are the main rates and values we should figure to evaluate both feature detection, description and matching?

I work on palmprint recognition using feature2D with the OpenCV library, and I use algorithms such as SIFT, SURF, ORB... to detect features and extract/match descriptors. My tests include (1 vs 1) palmprint matching and also (1 vs database) palmprint matching.
Once I get the results, I need to evaluate the algorithm. I know that there are rates or scores (such as EER, rank-1 identification, recall, and accuracy) which estimate how successful the method was. Now I need to know whether any of those rates are implemented in OpenCV, and how to use them. If they aren't, what are the formulas used in the literature?
As far as I know there is little implemented in OpenCV. A common way is to store the results (e.g. in JSON) and process those with other programs such as Matlab or Python. This also allows you to change the evaluation without the need to recompute the algorithms.
There is no overall best method to show the results. It always depends on what you want to show. In my opinion ROC is the best way to express your output. It is also very widely used in research.
If you insist on doing it in C++, then you could use:
Roceasy or
DLIB
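If you only need the ROC operating points and the EER, they can also be computed by hand from stored scores. Below is a minimal sketch in plain C++, assuming you have one list of similarity scores for genuine (same palm) comparisons and one for impostor comparisons, and that a higher score means a better match (negate the values if you store descriptor distances). ErrorRates, sweepThresholds and equalErrorPoint are illustrative names, not OpenCV or dlib functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct ErrorRates {
    double threshold;    // accept a match when score >= threshold
    double falseAccept;  // FAR: fraction of impostor comparisons accepted
    double falseReject;  // FRR: fraction of genuine comparisons rejected
};

// Fraction of scores that would be accepted at the given threshold.
static double acceptedFraction(const std::vector<double>& scores,
                               double threshold) {
    const auto n = std::count_if(scores.begin(), scores.end(),
                                 [threshold](double s) { return s >= threshold; });
    return static_cast<double>(n) / static_cast<double>(scores.size());
}

// One (FAR, FRR) pair per distinct score value; these are the ROC points.
std::vector<ErrorRates> sweepThresholds(const std::vector<double>& genuine,
                                        const std::vector<double>& impostor) {
    assert(!genuine.empty() && !impostor.empty());
    std::vector<double> thresholds(genuine);
    thresholds.insert(thresholds.end(), impostor.begin(), impostor.end());
    std::sort(thresholds.begin(), thresholds.end());
    thresholds.erase(std::unique(thresholds.begin(), thresholds.end()),
                     thresholds.end());
    std::vector<ErrorRates> curve;
    for (double t : thresholds) {
        curve.push_back({t, acceptedFraction(impostor, t),
                         1.0 - acceptedFraction(genuine, t)});
    }
    return curve;
}

// Operating point where FAR and FRR are closest; its rates approximate the EER.
ErrorRates equalErrorPoint(const std::vector<ErrorRates>& curve) {
    assert(!curve.empty());
    return *std::min_element(
        curve.begin(), curve.end(),
        [](const ErrorRates& a, const ErrorRates& b) {
            return std::fabs(a.falseAccept - a.falseReject) <
                   std::fabs(b.falseAccept - b.falseReject);
        });
}
```

With few scores, FAR and FRR rarely coincide exactly, so reporting the average of the two rates at the returned point is a common convention. Plotting falseAccept against 1 - falseReject for every entry of the curve gives the ROC.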

C++ Newton-Raphson algo?

I have a huge problem. I need to solve a nonlinear system of 3 equations in 3 variables with a C++ function or class. I thought about using the Newton-Raphson method to find the solution. Unluckily, I didn't find source code that can do that for me. Does anyone know of a program like that? I'm close to deciding to build it myself. Thanks
A 3x3 system is not huge; it's actually a very small problem. People routinely solve nonlinear systems of equations with thousands (and more) of variables and constraints.
Given that your system is 3x3 and possibly nasty, a more appropriate choice of method would be a line search method. You get global convergence to a local minimum of the residual this way; it's very easy to make straight Newton's method diverge.
Steepest descent with backtracking line search is the simplest line search method possible. You might try implementing it first.
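To make the line search idea concrete, here is a minimal sketch for a 3x3 system. Note that it uses the Newton direction rather than steepest descent, combined with a backtracking line search on the residual norm, so the step is shortened whenever a full Newton step would not reduce ||F(x)||. The Jacobian is approximated by forward differences. All names (System, solveNonlinear, ...) and the constants (step size 1e-7, decrease factor 1e-4, 40 halvings) are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <functional>
#include <utility>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;
using System = std::function<Vec3(const Vec3&)>;

double residualNorm(const Vec3& v) {
    return std::sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]);
}

// Forward-difference approximation of the Jacobian, J[i][j] = dF_i/dx_j.
Mat3 jacobian(const System& F, const Vec3& x, const Vec3& fx) {
    Mat3 J{};
    for (int j = 0; j < 3; ++j) {
        const double h = 1e-7 * std::max(1.0, std::fabs(x[j]));
        Vec3 xh = x;
        xh[j] += h;
        const Vec3 fh = F(xh);
        for (int i = 0; i < 3; ++i) J[i][j] = (fh[i] - fx[i]) / h;
    }
    return J;
}

// Solve J d = b by Gaussian elimination with partial pivoting.
bool solve3(Mat3 J, Vec3 b, Vec3& d) {
    for (int col = 0; col < 3; ++col) {
        int piv = col;
        for (int r = col + 1; r < 3; ++r)
            if (std::fabs(J[r][col]) > std::fabs(J[piv][col])) piv = r;
        if (std::fabs(J[piv][col]) < 1e-14) return false;  // singular Jacobian
        std::swap(J[col], J[piv]);
        std::swap(b[col], b[piv]);
        for (int r = col + 1; r < 3; ++r) {
            const double f = J[r][col] / J[col][col];
            for (int c = col; c < 3; ++c) J[r][c] -= f * J[col][c];
            b[r] -= f * b[col];
        }
    }
    for (int r = 2; r >= 0; --r) {
        double s = b[r];
        for (int c = r + 1; c < 3; ++c) s -= J[r][c] * d[c];
        d[r] = s / J[r][r];
    }
    return true;
}

// Newton direction with a backtracking line search on ||F(x)||.
// Returns true when ||F(x)|| < tol; x is updated in place.
bool solveNonlinear(const System& F, Vec3& x, double tol = 1e-10,
                    int maxIter = 100) {
    for (int it = 0; it < maxIter; ++it) {
        const Vec3 fx = F(x);
        const double f0 = residualNorm(fx);
        if (f0 < tol) return true;
        Vec3 d{};
        if (!solve3(jacobian(F, x, fx), {-fx[0], -fx[1], -fx[2]}, d)) return false;
        double t = 1.0;
        Vec3 trial = x;
        bool accepted = false;
        for (int k = 0; k < 40 && !accepted; ++k) {
            for (int i = 0; i < 3; ++i) trial[i] = x[i] + t * d[i];
            // Accept the step only if it reduces the residual sufficiently.
            accepted = residualNorm(F(trial)) <= (1.0 - 1e-4 * t) * f0;
            t *= 0.5;
        }
        if (!accepted) return false;
        x = trial;
    }
    return residualNorm(F(x)) < tol;
}
```

As a usage example, the system x^2 + y = 3, y^2 + z = 7, z^2 + x = 10 has the solution (1, 2, 3); passing a lambda that returns the three residuals and a starting guess such as (1.5, 1.5, 2.5) should converge to it in a handful of iterations. If you can write down the analytic Jacobian, using it instead of the finite-difference one is usually more accurate.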
First, see the related questions What good libraries are there for solving a system of non-linear equations in C++? and https://stackoverflow.com/questions/4914967/could-you-explain-how-newton-raphson-for-a-set-of-equations-works-code-inside. Also, try to use Boost.
Consider this cozy C++ library

Support Vector Machine works in Matlab, doesn't work in C++

I'm writing an application that uses an SVM to do classification on some images (specifically these). My Matlab implementation works really well. Using a SIFT bag-of-words approach, I'm able to get near 100% accuracy with a linear kernel.
I need to implement this in C++ for speed/portability reasons, and so I've tried using both libsvm and dlib. I've tried multiple SVM types (c_svm, nu_svm, one_class) and multiple kernels (linear, polynomial, rbf). The best I've been able to achieve is around 50% accuracy - even on the same samples that I've trained on. I've confirmed that my feature generators are working, because when I export my c++-generated features to Matlab and train on those, I'm able to get near-perfect results again.
Is there something magical about Matlab's SVM implementation? Are there any common pitfalls or areas that I might look into that would explain the behavior I'm seeing? I know this is a little vague, but part of the problem is that I don't know where to go. Please let me know in the comments if there is other info I can provide that would be helpful.
There is nothing magical about the Matlab version of the libraries, other than that it runs in Matlab, which makes it harder to shoot yourself in the foot.
A checklist:
Are you normalizing your data, making all values lie between 0 and 1 (or between -1 and 1), either linearly or using the mean and the standard deviation?
Are you searching for a good value of C (or C and gamma in the case of an RBF kernel), using cross-validation or a hold-out set?
Are you sure that you're handling NaN and all other floating-point nastiness? Matlab is very good at hiding this from you, C++ not so much.
Could it be that you're loading your data incorrectly, reading a "%s" into a double or something else that adds noise to your input data?
Could it be that libsvm/dlib expects the data in row-major order and you're sending it in column-major order (or the other way around)? Again, Matlab makes this almost impossible, C++ not so much.
32/64-bit nastiness: is one version of the library linked into an executable compiled with the other?
Some other things:
Could it be that in Matlab you're somehow leaking the class (y) into the preprocessing? No one does this on purpose, but I've seen it happen. If you make almost any f(y) a feature, you'll get almost 100% every time.
Sometimes it helps to verify that everything is numerically identical by printing to a file before training, both in C++ and Matlab.
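The first and third checklist items are easy to get wrong in C++, so here is a minimal sketch of what they could look like. It assumes the features are stored as one std::vector<double> per sample; Row, Scaler, fitScaler, applyScaler and allFinite are illustrative names and not part of libsvm or dlib. The scaling range is learned on the training set only and then applied unchanged to test samples, mapping each feature linearly to [-1, 1].

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Row = std::vector<double>;  // one feature vector per sample

struct Scaler {
    Row lo, hi;  // per-feature minimum and maximum seen in the training set
};

// True if no feature value is NaN or infinite.
bool allFinite(const std::vector<Row>& rows) {
    for (const Row& r : rows)
        for (double v : r)
            if (!std::isfinite(v)) return false;
    return true;
}

// Learn the per-feature range from the training data only.
Scaler fitScaler(const std::vector<Row>& train) {
    assert(!train.empty());
    Scaler s{train.front(), train.front()};
    for (const Row& r : train) {
        assert(r.size() == s.lo.size());
        for (std::size_t j = 0; j < r.size(); ++j) {
            s.lo[j] = std::min(s.lo[j], r[j]);
            s.hi[j] = std::max(s.hi[j], r[j]);
        }
    }
    return s;
}

// Map each feature linearly so the training range becomes [-1, 1].
// Constant features (hi == lo) are mapped to 0.
Row applyScaler(const Scaler& s, const Row& row) {
    assert(row.size() == s.lo.size());
    Row out(row.size());
    for (std::size_t j = 0; j < row.size(); ++j) {
        const double range = s.hi[j] - s.lo[j];
        out[j] = range > 0.0 ? 2.0 * (row[j] - s.lo[j]) / range - 1.0 : 0.0;
    }
    return out;
}
```

Calling allFinite on the features before training and dumping the scaled rows to a file makes it straightforward to compare them numerically against what Matlab feeds its SVM.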
i'm very happy with libsvm using the rbf kernel. carlosdc pointed out the most common errors in the correct order :-). for libsvm - did you use the python tools shipped with libsvm? if not i recommend to do so. write your feature vectors to a file (from matlab and/or c++) and do a metatraining for the rbf kernel with easy.py. you get the parameters and a prediction for the generated model. if this prediction is ok continue with c++. from training you also get a scaled feature file (min/max transformed to -1.0/1.0 for every feature). compare these to your c++ implementation as well.
some libsvm issues: a nasty habit is (if i remember correctly) that values scaling to 0 (zero) are omitted in the scaled file. in grid.py there is a parameter "nr_local_worker" which defines the number of threads. you might wish to increase it.

How to write gaussian mixture model in c++ and Opencv

I want to track an object in a video, so I suppose that I could use Gaussian Mixture Models in OpenCV and C++. I want to know how to write Gaussian Mixture Models in C++. Are there any better algorithms for this than GMM?
Sorry to not answer the question directly but:
Reading research papers is a great thing to do, but to be honest, you will get much more knowledge at this point by trying your own ideas on your specific data and getting a better understanding of the problem.
If you know the shapes, it's probably better to use a generalized Hough transform or matched filter for position estimates, combined with a Kalman filter for tracking. These will be relatively easy to implement. Or maybe you can find existing implementations.
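To give an idea of how little code the tracking part needs, here is a minimal sketch of a Kalman filter for one coordinate, using a constant-velocity model and position-only measurements (for example, the x position reported by a Hough transform or matched filter in each frame). The struct name Kalman1D and the noise values q and r are assumptions for illustration; you would run one such filter per axis and tune the noise values to your data.

```cpp
// State: position p and velocity v along one axis, with 2x2 covariance P.
struct Kalman1D {
    double p = 0.0, v = 0.0;
    double P00 = 1e3, P01 = 0.0, P10 = 0.0, P11 = 1e3;  // large: uncertain start
    double q = 1e-4;  // process noise added to the diagonal (assumed value)
    double r = 1.0;   // measurement noise variance (assumed value)

    // Constant-velocity prediction over a time step dt:
    // x = F x and P = F P F^T + Q, with F = [1 dt; 0 1].
    void predict(double dt) {
        p += dt * v;
        const double n00 = P00 + dt * (P01 + P10) + dt * dt * P11 + q;
        const double n01 = P01 + dt * P11;
        const double n10 = P10 + dt * P11;
        P00 = n00;
        P01 = n01;
        P10 = n10;
        P11 += q;
    }

    // Correct the state with a position measurement z (H = [1 0]).
    void update(double z) {
        const double innovation = z - p;
        const double S = P00 + r;  // innovation variance
        const double K0 = P00 / S, K1 = P10 / S;  // Kalman gain
        p += K0 * innovation;
        v += K1 * innovation;
        // P = (I - K H) P
        const double n00 = (1.0 - K0) * P00;
        const double n01 = (1.0 - K0) * P01;
        const double n10 = P10 - K1 * P00;
        const double n11 = P11 - K1 * P01;
        P00 = n00;
        P01 = n01;
        P10 = n10;
        P11 = n11;
    }
};
```

Per video frame you would call predict(dt) and then update(z) with the detected position; when the detector finds nothing in a frame, calling only predict(dt) lets the track coast on the estimated velocity.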
Also, I'd prototype your idea in Matlab or Octave instead of C++ if you are not a very good C++ programmer as you'll wind up wasting most of your time with problems in C++ when the problem itself is what you really want to focus on.
As I said in the comment, I'd skip using GMMs for now, until you get a better understanding of the problem and how you are going to use them. (Unless of course you already have a good idea of how you will use them.)

Open source C++ library for vector mathematics

I would need some basic vector mathematics constructs in an application. Dot product, cross product. Finding the intersection of lines, that kind of stuff.
I can do this by myself (in fact, have already) but isn't there a "standard" to use so bugs and possible optimizations would not be on me?
Boost does not have it. Their mathematics part is about statistical functions, as far as I was able to see.
Addendum:
Boost 1.37 indeed seems to have this. They also gracefully introduce a number of other solutions in the field, and explain why they still went and did their own. I like that.
Re-check that good ol' friend of C++ programmers called Boost. It has a linear algebra package that may well suit your needs.
I've not tested it, but the C++ Eigen library is becoming increasingly popular these days. According to them, they are on par with the fastest libraries out there, and their API looks quite neat to me.
Armadillo
Armadillo employs a delayed evaluation approach to combine several operations into one and reduce (or eliminate) the need for temporaries. Where applicable, the order of operations is optimised. Delayed evaluation and optimisation are achieved through recursive templates and template meta-programming.
While chained operations such as addition, subtraction and multiplication (matrix and element-wise) are the primary targets for speed-up opportunities, other operations, such as manipulation of submatrices, can also be optimised. Care was taken to maintain efficiency for both "small" and "big" matrices.
I would stay away from using NRC code for anything other than learning the concepts.
I think what you are looking for is Blitz++
Check www.netlib.org, which is maintained by Oak Ridge National Lab and the University of Tennessee. You can search for numerical packages there. There's also Numerical Recipes in C++, which has code that goes with it, but the C++ version of the book is somewhat expensive and I've heard the code described as "terrible." The C and FORTRAN versions are free, and the associated code is quite good.
There is a nice Vector library for 3d graphics in the prophecy SDK:
Check out http://www.twilight3d.com/downloads.html
For linear algebra, try JAMA/TNT. That would cover dot products (plus matrix factoring and other stuff). As for vector cross products (really valid only for 3D, otherwise I think you get into tensors), I'm not sure.
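For comparison with whatever library you pick, the operations from the question fit in a few lines of plain C++. This is only a sketch with made-up names (Vec3, Vec2, dot, cross, cross2, intersectLines) and an assumed parallelism tolerance of 1e-12. The 2D line intersection uses the scalar 2D analogue of the cross product: for lines p + t*r and q + u*s, crossing both sides of p + t*r = q + u*s with s gives t = ((q - p) x s) / (r x s).

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };
struct Vec2 { double x, y; };

double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// 3D cross product: a vector perpendicular to both a and b.
Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

// 2D analogue: the z component of the 3D cross product of (a, 0) and (b, 0).
double cross2(const Vec2& a, const Vec2& b) {
    return a.x * b.y - a.y * b.x;
}

// Intersection of the infinite lines p + t*r and q + u*s.
// Returns false when the lines are (nearly) parallel.
bool intersectLines(const Vec2& p, const Vec2& r,
                    const Vec2& q, const Vec2& s, Vec2& out) {
    const double denom = cross2(r, s);
    if (std::fabs(denom) < 1e-12) return false;  // assumed tolerance
    const double t = cross2({q.x - p.x, q.y - p.y}, s) / denom;
    out = {p.x + t * r.x, p.y + t * r.y};
    return true;
}
```

For example, the lines through (0,0) with direction (1,1) and through (0,2) with direction (1,-1) intersect at (1,1). A library such as Eigen gives you the same operations with vectorised implementations, which matters once this code sits in a hot loop.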
For an extremely lightweight (single .h file) library, check out CImg. It's geared towards image processing, but has no problem handling vectors.