How do researchers calculate the derivatives of the energy function in conference/journal papers? - computer-vision

I am a computer vision/graphics guy. Although machine learning solutions dominate the mainstream methods nowadays, sometimes we still need classical solutions, for example SLAM (simultaneous localization and mapping) and accurate 3D reconstruction.
When implementing papers from conferences such as CVPR (Computer Vision and Pattern Recognition) and SIGGRAPH, my biggest difficulty is calculating the derivatives of the energy function when there may be thousands of variables.
I know analytical differentiation is a standard way to calculate the derivatives, but the energy functions can be too complex for programmers without a strong math background to differentiate analytically.
I have been using numerical differentiation to calculate the derivatives of the energy functions. Numerical differentiation is not accurate, but it is easy to implement.
Sometimes you have to build a custom system because the existing software doesn't meet the requirements. I wonder how other programmers and the paper authors implement the derivatives of energy functions. If they do the analytical differentiation by hand, that's amazing.
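For reference, the numerical differentiation described above can be sketched with central differences. This is only a minimal illustration under assumptions: the names numericalGradient and toyEnergy, the relative step size of 1e-6, and the toy energy itself are hypothetical and not taken from any paper or library. It also shows why the approach gets slow with thousands of variables: every gradient costs two energy evaluations per variable.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Approximates the gradient of f at x with central differences.
// Costs 2 * x.size() evaluations of f per gradient.
// relStep is an assumed default, not a universally good choice.
std::vector<double> numericalGradient(
    const std::function<double(const std::vector<double>&)>& f,
    std::vector<double> x,
    double relStep = 1e-6) {
    std::vector<double> grad(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double original = x[i];
        // Scale the step with the magnitude of the variable.
        const double h = relStep * std::max(1.0, std::abs(original));
        x[i] = original + h;
        const double fPlus = f(x);
        x[i] = original - h;
        const double fMinus = f(x);
        x[i] = original;  // restore before moving to the next variable
        grad[i] = (fPlus - fMinus) / (2.0 * h);
    }
    return grad;
}

// Hypothetical toy energy: E(x) = sum_i (x_i - 1)^2,
// whose analytical gradient is 2 * (x_i - 1).
double toyEnergy(const std::vector<double>& x) {
    double e = 0.0;
    for (double v : x) e += (v - 1.0) * (v - 1.0);
    return e;
}
```

Calling numericalGradient(toyEnergy, {3.0, -2.0}) should agree closely with the analytical gradient (4, -6), which makes a routine like this useful for checking a hand-derived gradient even when it is too slow for the optimizer itself.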

Related

Fast gradient-descent implementation in a C++ library? [duplicate]

I'm looking to run a gradient descent optimization to minimize the cost of an instantiation of variables. My program is very computationally expensive, so I'm looking for a popular library with a fast implementation of GD. What is the recommended library/reference?
GSL is a great (and free) library that already implements common functions of mathematical and scientific interest.
You can peruse through the entire reference manual online. Poking around, this starts to look interesting, but I think we'd need to know more about the problem.
It sounds like you're fairly new to minimization methods. Whenever I need to learn a new set of numeric methods, I usually look in Numerical Recipes. It's a book that provides a nice overview of the most common methods in the field, their tradeoffs, and (importantly) where to look in the literature for more information. It's usually not where I stop, but it's often a helpful starting point.
For example, if your function is costly, then your goal is to minimize the number of evaluations needed to converge. If you have analytical expressions for the gradient, then a gradient-based method will probably work to your advantage, assuming that the function and its gradient are well-behaved (lack singularities) in the domain of interest.
If you don't have analytical gradients, then you're almost always better off using an approach like downhill simplex that only evaluates the function (not its gradients). Numerical gradients are expensive.
Also note that all of these approaches will converge to local minima, so they're fairly sensitive to the point at which you initially start the optimizer. Global optimization is a totally different beast.
As a final thought, almost all of the code you can find for minimization will be reasonably efficient. The real cost of minimization is in the cost function. You should spend time profiling and optimizing your cost function, and select an algorithm that will minimize the number of times you need to call it (methods like downhill simplex, conjugate gradient, and BFGS all shine on different kinds of problems).
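To make the gradient-based case concrete, here is a minimal sketch of plain fixed-step gradient descent with an analytical gradient. It is not what GSL or any of the libraries above implement; the names gradientDescent and bowlGradient, the fixed step size, and the toy cost function are all assumptions made for illustration. A real library would normally choose the step with a line search.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;

// Fixed-step gradient descent. Stops when the gradient norm drops below
// tol or after maxIter iterations, and returns the final iterate.
Vec gradientDescent(const std::function<Vec(const Vec&)>& grad,
                    Vec x, double step, double tol, int maxIter) {
    for (int it = 0; it < maxIter; ++it) {
        const Vec g = grad(x);
        double norm2 = 0.0;
        for (double gi : g) norm2 += gi * gi;
        if (std::sqrt(norm2) < tol) break;
        // Move against the gradient by a fixed step.
        for (std::size_t i = 0; i < x.size(); ++i) x[i] -= step * g[i];
    }
    return x;
}

// Hypothetical cost f(x, y) = (x - 3)^2 + 10 * (y + 1)^2,
// given here only through its analytical gradient.
Vec bowlGradient(const Vec& p) {
    return {2.0 * (p[0] - 3.0), 20.0 * (p[1] + 1.0)};
}
```

With a start of {0.0, 0.0} and a step of 0.05, gradientDescent(bowlGradient, ...) moves toward the minimum at (3, -1). Note that the step has to stay below 0.1 for this particular cost or the y coordinate diverges, which is one reason the library routines are usually the better choice.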
In terms of actual code, you can find a lot of nice routines at NETLIB, in addition to the other libraries that have been mentioned. Most of the routines are in FORTRAN 77, but not all; to convert them to C, f2c is quite useful.
Among the most respected libraries for this kind of optimization work are the NAG libraries. These are used all over the world in universities and industry. They're available for C / FORTRAN. They're very non-free, and contain a lot more than just minimisation functions: a lot of general numerical mathematics is covered.
Anyway I suspect this library is overkill for what you need. But here are the parts pertaining to minimisation: Local Minimisation and Global Minimization.
Try CPLEX, which is available for free for students.

c++ numerical analysis Accurate data structure?

Using the double type, I implemented a cubic spline interpolation algorithm.
It seemed to work, but there was a relative error of around 6% when very small values were calculated.
Is the double data type accurate enough for scientific numerical analysis?
Double has plenty of precision for most applications. Of course it is finite, but it's always possible to squander any amount of precision by using a bad algorithm. In fact, that should be your first suspect. Look hard at your code and see if you're doing something that lets rounding errors accumulate quicker than necessary, or risky things like subtracting values that are very close to each other.
Scientific numerical analysis is difficult to get right, which is why I leave it to the professionals. Have you considered using a numeric library instead of writing your own? Eigen is my current favorite here: http://eigen.tuxfamily.org/index.php?title=Main_Page
I always have close at hand the latest copy of Numerical Recipes (nr.com) which does have an excellent chapter on interpolation. NR has a restrictive license but the writers know what they are doing and provide a succinct writeup on each numerical technique. Other libraries to look at include: ATLAS and GNU Scientific Library.
To answer your question, double should be more than enough for most scientific applications. I agree with the previous posters that it sounds like an algorithm problem. Have you considered posting the code for the algorithm you are using?
Whether double is enough for your needs depends on the type of numbers you are working with. As Henning suggests, it is probably best to take a look at the algorithms you are using and make sure they are numerically stable.
For starters, here's a good algorithm for addition: Kahan summation algorithm.
Double precision will be suitable for most problems, but the cubic spline will not work well if the polynomial or function oscillates quickly, repeats, or is of quite high dimension.
In this case it can be better to use Legendre polynomials, since they handle variants of exponentials.
By way of a simple example, if you use Euler, trapezoidal, or Simpson's rule to integrate a 3rd-order polynomial, you won't need a huge sample rate to get the integral (the area under the curve). However, if you apply these to an exponential function, the sample rate may need to increase greatly to avoid losing a lot of precision. Legendre polynomials can cater for this case much more readily.

How are FFTs different from DFTs and how would one go about implementing them in C++?

After some studying, I created a small app that calculates DFTs (Discrete Fourier Transformations) from some input. It works well enough, but it is quite slow.
I read that FFTs (Fast Fourier Transformations) allow quicker calculations, but how are they different? And more importantly, how would I go about implementing them in C++?
If you don't need to manually implement the algorithm, you could take a look at the Fastest Fourier Transform in the West
Even though it's developed in C, it officially works in C++ (from the FAQ):
Question 2.9. Can I call FFTW from C++?
Most definitely. FFTW should compile and/or link under any C++ compiler. Moreover, it is likely that the C++ <complex> template class is bit-compatible with FFTW's complex-number format (see the FFTW manual for more details).
FFT has n*log(n) complexity, compared to the DFT, which has n^2.
There is a lot of literature about this, and I strongly advise that you check it first, because such a wide topic cannot be fully explained here.
http://en.wikipedia.org/wiki/Fast_Fourier_transform (check external links )
If you need a library, I advise you to use an existing one, for instance:
http://www.fftw.org/
This library has an efficient implementation of the FFT and is also used in proprietary software (MATLAB, for instance).
Steven Smith's book The Scientist and Engineer's Guide to Digital Signal Processing, specifically Chapter 8 on the DFT and Chapter 12 on the FFT, does a much better job of explaining the two transforms than I ever could.
By the way, the whole book is available for free (link above) and it's a very good introduction to signal processing.
Regarding the C++ code request, I've only used the Fastest Fourier Transform in the West (already cited by superexsl) or DSP libraries such as those from TI or Analog Devices.
The results of a correctly implemented DFT are essentially identical to the results of a correctly implemented FFT (they differ only by rounding errors). As others have pointed out here, the major difference is performance: a direct DFT takes O(n^2) operations while the FFT takes O(n log n) operations.
The best, most readable publication I have ever found (the one I still refer to) is The Fast Fourier Transform and its Applications by E. Oran Brigham. The first few chapters provide a very thorough overview of the continuous and discrete forms of the Fourier Transform. He then uses that to develop the fast version of the DFT based on the Cooley-Tukey algorithm for the radix-2 (n is a power of 2) and mixed-radix cases (though the latter gets a somewhat shallower treatment than the former).
The basic approach in the radix-2 algorithm is to perform a linear-time operation on the input X, recursively split the result in half, and perform a similar linear-time operation on the two halves. The mixed-radix case is similar, though you need to divide X into equal portions each time, so it helps if n doesn't have any large prime factors.
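As a rough illustration of the difference, here is a sketch of a direct O(n^2) DFT next to a recursive radix-2 Cooley-Tukey FFT. It assumes the input length is a power of two and uses the decimation-in-time variant, which splits the input into even and odd samples first and then does the linear-time combine step. The names naiveDft and fftRadix2 are made up for this sketch. It favours readability over speed, so FFTW will be far faster.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;
using Signal = std::vector<Complex>;

constexpr double kPi = 3.14159265358979323846;

// Direct evaluation of the DFT definition: O(n^2) operations.
Signal naiveDft(const Signal& x) {
    const std::size_t n = x.size();
    Signal out(n);
    for (std::size_t k = 0; k < n; ++k) {
        Complex sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = -2.0 * kPi * static_cast<double>(k * t) /
                                 static_cast<double>(n);
            sum += x[t] * std::polar(1.0, angle);
        }
        out[k] = sum;
    }
    return out;
}

// Recursive radix-2 FFT: O(n log n). x.size() must be a power of two.
Signal fftRadix2(const Signal& x) {
    const std::size_t n = x.size();
    if (n <= 1) return x;
    // Split into even-indexed and odd-indexed samples.
    Signal even(n / 2), odd(n / 2);
    for (std::size_t i = 0; i < n / 2; ++i) {
        even[i] = x[2 * i];
        odd[i] = x[2 * i + 1];
    }
    const Signal e = fftRadix2(even);
    const Signal o = fftRadix2(odd);
    // Linear-time combine step using the twiddle factors.
    Signal out(n);
    for (std::size_t k = 0; k < n / 2; ++k) {
        const double angle = -2.0 * kPi * static_cast<double>(k) /
                             static_cast<double>(n);
        const Complex tw = std::polar(1.0, angle) * o[k];
        out[k] = e[k] + tw;
        out[k + n / 2] = e[k] - tw;
    }
    return out;
}
```

Running both on the same power-of-two input and comparing the outputs element by element is a cheap correctness check; they should agree up to rounding error.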
I've found this nice explanation with some algorithms described.
FastFourierTransform
About implementation:
first, I'd make sure your implementation returns correct results (compare with the output from MATLAB or Octave, which have built-in Fourier transforms)
optimize when necessary, and use profilers
don't use unnecessary for loops

Least Squares Regression in C/C++

How would one go about implementing least squares regression for factor analysis in C/C++?
The gold standard for this is LAPACK. You want, in particular, xGELS.
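To show what a linear least-squares fit computes in the simplest case, here is a sketch of the closed-form fit of a straight line y = slope * x + intercept. This is not how xGELS works (it handles general systems through a QR or LQ factorization); the LineFit struct and fitLine function are hypothetical names used only for this example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

struct LineFit {
    double slope;
    double intercept;
};

// Ordinary least-squares fit of y = slope * x + intercept.
// Uses the centered formulas, which are better conditioned than
// the raw sums of x*y and x*x.
LineFit fitLine(const std::vector<double>& x, const std::vector<double>& y) {
    if (x.size() != y.size() || x.size() < 2) {
        throw std::invalid_argument("need at least two (x, y) pairs");
    }
    const double n = static_cast<double>(x.size());
    double meanX = 0.0, meanY = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        meanX += x[i];
        meanY += y[i];
    }
    meanX /= n;
    meanY /= n;

    double sxx = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double dx = x[i] - meanX;
        sxx += dx * dx;
        sxy += dx * (y[i] - meanY);
    }
    if (sxx == 0.0) {
        throw std::invalid_argument("all x values are identical");
    }
    const double slope = sxy / sxx;
    return {slope, meanY - slope * meanX};
}
```

For data lying exactly on a line, such as x = {0, 1, 2, 3} and y = {1, 3, 5, 7}, the fit recovers slope 2 and intercept 1. For more than one predictor, a library routine is the safer route.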
When I've had to deal with large datasets and large parameter sets for non-linear parameter fitting I used a combination of RANSAC and Levenberg-Marquardt. I'm talking thousands of parameters with tens of thousands of data-points.
RANSAC is a robust algorithm for minimizing noise due to outliers by using a reduced data set. It's not strictly least squares, but it can be applied to many fitting methods.
Levenberg-Marquardt is an efficient way to solve non-linear least-squares numerically.
The convergence rate in most cases is between that of steepest-descent and Newton's method, without requiring the calculation of second derivatives. I've found it to be faster than Conjugate gradient in the cases I've examined.
The way I did this was to set up RANSAC as an outer loop around the LM method. This is very robust but slow. If you don't need the additional robustness, you can just use LM.
Get ROOT and use TGraph::Fit() (or TGraphErrors::Fit())?
Big, heavy piece of software to install just for the fitter, though. Works for me because I already have it installed.
Or use GSL.
If you want to implement an optimization algorithm yourself, Levenberg-Marquardt seems to be quite difficult to implement. If really fast convergence is not needed, take a look at the Nelder-Mead simplex optimization algorithm. It can be implemented from scratch in a few hours.
http://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method
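To give an idea of the size of such an implementation, here is a simplified sketch of Nelder-Mead using the usual textbook coefficients (reflection 1, expansion 2, contraction 0.5, shrink 0.5). It is an illustration under assumptions rather than a tuned implementation: it only does the inside contraction, stops on the spread of function values across the simplex, and assumes at least one variable. The names nelderMead and pointOnLine are made up for this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

using Point = std::vector<double>;
using Objective = std::function<double(const Point&)>;

// Returns a + t * (b - a), coordinate by coordinate.
Point pointOnLine(const Point& a, const Point& b, double t) {
    Point r(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) r[i] = a[i] + t * (b[i] - a[i]);
    return r;
}

// Simplified Nelder-Mead; start must have at least one coordinate.
Point nelderMead(const Objective& f, const Point& start,
                 double initStep, double tol, int maxIter) {
    const std::size_t n = start.size();
    // Initial simplex: the start point plus one offset point per coordinate.
    std::vector<Point> simplex(n + 1, start);
    for (std::size_t i = 0; i < n; ++i) simplex[i + 1][i] += initStep;
    std::vector<double> fval(n + 1);
    for (std::size_t i = 0; i <= n; ++i) fval[i] = f(simplex[i]);

    std::vector<std::size_t> idx(n + 1);
    for (int it = 0; it < maxIter; ++it) {
        // Order the vertices from best (lowest f) to worst.
        std::iota(idx.begin(), idx.end(), std::size_t{0});
        std::sort(idx.begin(), idx.end(),
                  [&fval](std::size_t a, std::size_t b) { return fval[a] < fval[b]; });
        const std::size_t best = idx[0], secondWorst = idx[n - 1], worst = idx[n];
        if (fval[worst] - fval[best] < tol) break;

        // Centroid of every vertex except the worst one.
        Point centroid(n, 0.0);
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < n; ++j)
                centroid[j] += simplex[idx[k]][j] / static_cast<double>(n);

        // Reflect the worst vertex through the centroid.
        const Point reflected = pointOnLine(simplex[worst], centroid, 2.0);
        const double fr = f(reflected);
        if (fr < fval[best]) {
            // Reflection is the new best: try going further (expansion).
            const Point expanded = pointOnLine(simplex[worst], centroid, 3.0);
            const double fe = f(expanded);
            if (fe < fr) { simplex[worst] = expanded; fval[worst] = fe; }
            else { simplex[worst] = reflected; fval[worst] = fr; }
        } else if (fr < fval[secondWorst]) {
            simplex[worst] = reflected; fval[worst] = fr;
        } else {
            // Contract the worst vertex halfway toward the centroid.
            const Point contracted = pointOnLine(simplex[worst], centroid, 0.5);
            const double fc = f(contracted);
            if (fc < fval[worst]) { simplex[worst] = contracted; fval[worst] = fc; }
            else {
                // Shrink every vertex halfway toward the best one.
                for (std::size_t i = 0; i <= n; ++i) {
                    if (i == best) continue;
                    simplex[i] = pointOnLine(simplex[best], simplex[i], 0.5);
                    fval[i] = f(simplex[i]);
                }
            }
        }
    }
    const auto bestIt = std::min_element(fval.begin(), fval.end());
    return simplex[static_cast<std::size_t>(bestIt - fval.begin())];
}
```

Only function values are needed, so a residual sum of squares can be passed in directly as the objective without working out any derivatives.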
Have a look at
http://www.alglib.net/optimization/
They have C++ implementations for L-BFGS and Levenberg-Marquardt.
You only need to work out the first derivative of your objective function to use these two algorithms.
I've used TNT/JAMA for linear least-squares estimation. It's not very sophisticated but is fairly quick + easy.
Let's talk first about factor analysis, since most of the discussion above is about regression. Most of my experience is with software like SAS, Minitab, or SPSS that solves the factor analysis equations, so I have limited experience in solving these directly. That said, the most common implementations do not use linear regression to solve the equations. According to this, the most common methods used are principal component analysis and principal factor analysis. In a text on Applied Multivariate Analysis (Dallas Johnson), no fewer than seven methods are documented, each with its own pros and cons. I would strongly recommend finding an implementation that gives you factor scores rather than programming a solution from scratch.
The reason there are different methods is that you can choose exactly what you're trying to minimize. There's a pretty comprehensive discussion of the breadth of methods here.