When does COLMAP fail in sparse reconstruction? - computer-vision

This CVPR 2022 paper suggests that COLMAP fails at sparse reconstruction in some cases (Section 4.1, Matterport3D paragraph). Are there any known cases where COLMAP can fail? In all my experiments so far I have never encountered such a situation, so I would like to know whether there are known failure cases.

Related

High-order Bessel function computation at large argument values

My work involves computing high-order Bessel functions at large argument values. Within MATLAB this has been done without problems. However, in order to scale up the problem, I have turned to writing C++ code with MPI. Of course, the step that generates the Bessel function values is done by invoking a library. To make the problem concrete, let me consider this very specific bug.
In MATLAB, suppose I wish to compute $J_{46341}(86840.0)$;
MATLAB gives me: besselj(46341,86840) = 0.001309896212292.
However, a simple test example that calls
gsl_sf_bessel_Jn_e returns "ERROR: NaN".
I have checked that at order 46340 both MATLAB and GSL return the same answer, 0.00292895, within acceptable accuracy. One more step in order in GSL results in the NaN error, while MATLAB still returns an accurate numerical answer.
I did try using recurrence relations to generate higher-order values, starting from a not-so-small order, say 20000 and up; however, this only delays the NaN error without completely solving the problem.
Switching my attention to other available software libraries, I tried NAG, but to my disappointment
nag_bessel_j_alpha (s18ekc) has the constraint abs(nl) <= 101;
in other words, it can only compute up to order 101, which is clearly insufficient for my study.
So, my question is fairly simple:
Is there a more reliable library approach to obtaining high-order Bessel function values for large x?
Asymptotically the Bessel function approaches 0, so I can surely set those values to zero once the tail approaches the underflow limit. However, the NaN problem seems to occur somewhere between the strongly oscillating regime and the asymptotically decaying tail.
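For reference, the failing GSL call described above boils down to something like the following minimal repro sketch (GSL's default error handler aborts on error, so it is disabled here in order to inspect the status code; the order and argument are the values quoted above):

```cpp
#include <cstdio>
#include <gsl/gsl_errno.h>
#include <gsl/gsl_sf_bessel.h>

int main()
{
    gsl_set_error_handler_off();  // report errors via status codes instead of aborting

    gsl_sf_result res;
    // Order 46340 succeeds, order 46341 returns an error/NaN as described above.
    for (int n = 46340; n <= 46341; ++n) {
        const int status = gsl_sf_bessel_Jn_e(n, 86840.0, &res);
        std::printf("J_%d(86840) -> status = %s, value = %.15g, err = %.3g\n",
                    n, gsl_strerror(status), res.val, res.err);
    }
    return 0;
}
```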
Problem solved. Thank you for the community's work; I am amazed by your knowledge and contributions!
Please see here:
How to call Fortran routines from C++?
https://mathoverflow.net/questions/225121/computation-of-high-order-bessel-function-at-large-variable-value
MATLAB, R, Python, and JuliaLang/openspecfun all build upon the original Fortran source code by Dr. Donald E. Amos (Sandia National Laboratories); cited papers:
D. E. Amos, "A subroutine package for Bessel functions of a complex argument and nonnegative order", Sandia National Laboratory Report SAND85-1018, May 1985.
D. E. Amos, "A portable package for Bessel functions of a complex argument and nonnegative order", ACM Trans. Math. Software, 1986.
This is now known as Amos's Algorithm 644, collected by the ACM.
http://dl.acm.org/citation.cfm?id=212078
http://dl.acm.org/citation.cfm?id=1268783
http://dl.acm.org/citation.cfm?id=98299
However, the source code hosted on Netlib is not bug-free and is probably not up to date:
http://netlib.sandia.gov/master/index.html
http://netlib.sandia.gov/amos/
The version adopted by openspecfun, on the other hand, works reliably:
https://github.com/JuliaLang/openspecfun
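For completeness, here is a minimal sketch of calling the Amos routine ZBESJ (from Algorithm 644, e.g. as built in openspecfun) directly from C++. It assumes the common Fortran name-mangling convention of a lowercase name with a trailing underscore, which may differ for your compiler/toolchain:

```cpp
#include <cstdio>

// Fortran: SUBROUTINE ZBESJ(ZR, ZI, FNU, KODE, N, CYR, CYI, NZ, IERR)
// Assumed C binding (lowercase + trailing underscore); adjust to your toolchain
// and link against the Fortran library (e.g. openspecfun).
extern "C" void zbesj_(const double* zr, const double* zi, const double* fnu,
                       const int* kode, const int* n,
                       double* cyr, double* cyi, int* nz, int* ierr);

int main()
{
    const double zr = 86840.0, zi = 0.0;  // argument z = 86840 + 0i
    const double fnu = 46341.0;           // starting order
    const int kode = 1;                   // 1 = unscaled result, 2 = exponentially scaled
    const int n = 1;                      // compute a single member of the sequence
    double cyr[1], cyi[1];
    int nz = 0, ierr = 0;                 // nz: underflow count, ierr: error flag

    zbesj_(&zr, &zi, &fnu, &kode, &n, cyr, cyi, &nz, &ierr);
    std::printf("J_%.0f(%.1f) = %.15g (nz = %d, ierr = %d)\n",
                fnu, zr, cyr[0], nz, ierr);
    return 0;
}
```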

Logistic regression with pymc3 - what's the prior for the built-in GLM?

I could not find a good explanation of what exactly happens when using glm with pymc3 in the case of logistic regression, so I compared the GLM version to an explicit pymc3 model. I started writing an IPython notebook for documentation; see:
http://christianherta.de/lehre/dataScience/machineLearning/mcmc/logisticRegressionPymc3.slides.php
What I don't understand is:
What prior is used for the parameters in GLM? I assume they are also normally distributed. I got different results with my explicit model compared to the built-in GLM (see link above).
With less data the sampling gets stuck and/or I get really poor results. With more training data I could not observe this behaviour. Is this normal for MCMC?
There are more issues in the notebook.
Thanks for your answer.
What prior is used for the Parameters in GLM
GLM is a name for a family of methods. Two popular priors are Gaussian (corresponding to L2 regularization) and Laplacian (corresponding to L1); usually the first one is used.
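For the record, the correspondence comes from viewing the MAP estimate as penalized maximum likelihood: the negative log-prior acts as the regularization term,
$$\hat{\beta}_{\mathrm{MAP}} = \arg\max_{\beta}\Big[\log p(y \mid X, \beta) + \log p(\beta)\Big],$$
so a Gaussian prior $\beta_j \sim \mathcal{N}(0, \tau^2)$ contributes a penalty $\frac{1}{2\tau^2}\lVert\beta\rVert_2^2$ (L2/ridge), while a Laplace prior $\beta_j \sim \mathrm{Laplace}(0, b)$ contributes $\frac{1}{b}\lVert\beta\rVert_1$ (L1/lasso).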
With less data the sampling gets stuck and/or I get really poor results. With more training data I could not observe this behaviour. Is this normal for MCMC?
Did you play with the prior parameters? If the model behaves badly with a small amount of data, this may be due to a strong prior (i.e. too much regularization), which becomes the dominant term in the optimization.

Fixed-size SVD and solver in CUDA (on the device)

I implemented a program on the GPU (CUDA) which only uses the host (in C++) to launch new kernels. During the calculation on the device I need to compute the SVD of, and solve systems with, dense 3x3 matrices of fixed size.
I've got my own SVD and solver implementation, but it is not numerically stable (and thus not usable). Since I am rather new to C++ and CUDA, I would prefer to use a library instead (numerical code is very tricky).
Now I have trouble finding such a library:
cuSOLVER is not callable from the device
CULA is not callable from the device (and appears to be abandoned)
Eigen looks promising (it should be callable from the device?), but it is unclear what the status of its CUDA support is (it says experimental). I find some people saying it works, while others get compile errors.
Preferably I would also like to be able to do general matrix operations with the library (transpose, inversion, sum, multiply, ...), as my own implementations will likely be less efficient and less numerically stable.
Any ideas on how to achieve this?
UPDATE:
It seems that Eigen supports basic operations like *, +, transpose, and even eigenvalues on the device, but SVD, inverse, etc. are not yet supported at the time of writing.
According to the website, a subset of features works for fixed-size matrices (3x3 in your case) from Eigen 3.3 onwards. The current stable release is 3.2.6, while 3.3 is in alpha. I don't know whether SVD specifically is supported in CUDA. I would recommend trying a small MCVE to see if it works (as well as the other functions you require), and if so, implementing it in your project.
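Something along the following lines might serve as that MCVE. This is only a sketch, assuming Eigen 3.3+ compiled through nvcc; whether each operation actually compiles for device code has to be verified against your Eigen/CUDA versions:

```cpp
#include <Eigen/Dense>  // Eigen 3.3+; CUDA/device support is marked experimental

// Each thread processes one fixed-size 3x3 matrix stored contiguously in 'in'.
__global__ void eigen_mcve_kernel(const float* in, float* out, int n_matrices)
{
    const int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n_matrices) return;

    // Map raw device memory as fixed-size Eigen matrices (no dynamic allocation).
    Eigen::Map<const Eigen::Matrix3f> A(in + 9 * i);
    Eigen::Map<Eigen::Matrix3f>       R(out + 9 * i);

    // Basic fixed-size operations (product, transpose, sum) -- the ones reported
    // to work on the device; SVD/inverse would need separate verification.
    R = A.transpose() * A + Eigen::Matrix3f::Identity();
}
```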
I'm having a similar problem: I want to generate random vectors inside a kernel function, which requires performing Cholesky/eigenvalue decompositions of NxN (N <= 5) covariance matrices. Since, as you noted, the MAGMA and CULA libraries are not available from the device, and there seems to be no cuSOLVER device API yet, I've resorted to implementing these myself, following algorithms outlined in, for example, Numerical Recipes in C. As for solving linear systems, I'd suggest checking out the cuBLAS level-2 functions, as they provide some basic functionality. If you want to invert matrices, I'd suggest cublas&lt;t&gt;matinvBatched(). I haven't used it myself; I will give it a try over the weekend, but from the description it sounds promising. I hope others will chime in on this thread with better solutions.
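Along that do-it-yourself route, for matrices this small a Gaussian elimination with partial pivoting is short enough to write by hand and call from device code. A minimal sketch (fixed at 3x3 here, with no handling of near-singular matrices beyond the pivot check):

```cpp
#include <math.h>  // fabs, available in both host and device code

// Solve A x = b for a 3x3 system in place, using Gaussian elimination with
// partial pivoting. Returns false if a pivot is (numerically) zero.
__host__ __device__ bool solve3x3(double A[3][3], double b[3], double x[3])
{
    for (int k = 0; k < 3; ++k) {
        // Partial pivoting: pick the row with the largest entry in column k.
        int p = k;
        for (int i = k + 1; i < 3; ++i)
            if (fabs(A[i][k]) > fabs(A[p][k])) p = i;
        if (fabs(A[p][k]) == 0.0) return false;  // singular to working precision
        if (p != k) {
            for (int j = 0; j < 3; ++j) { double t = A[k][j]; A[k][j] = A[p][j]; A[p][j] = t; }
            double t = b[k]; b[k] = b[p]; b[p] = t;
        }
        // Eliminate entries below the pivot.
        for (int i = k + 1; i < 3; ++i) {
            const double m = A[i][k] / A[k][k];
            for (int j = k; j < 3; ++j) A[i][j] -= m * A[k][j];
            b[i] -= m * b[k];
        }
    }
    // Back substitution.
    for (int i = 2; i >= 0; --i) {
        double s = b[i];
        for (int j = i + 1; j < 3; ++j) s -= A[i][j] * x[j];
        x[i] = s / A[i][i];
    }
    return true;
}
```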

Sequential nonlinear optimization libraries in C++ WITH constraints

Are there any good libraries in C++ for sequential nonlinear optimization with constraints?
I am looking for inequality constraints and/or upper and lower bounds.
There is already a Stack Overflow question on this, but not all of the libraries listed there support constraints.
I know of NLopt, but it doesn't work well for my specific problem. Are there any others?
I finally found the solution that I was looking for, in case anyone else is interested: lpOpt.
One SQP algorithm that you could try is DONLP2. It was originally written in Fortran 77, but there is an ANSI C version as well. It uses dense linear algebra, so it is primarily suitable for small to medium-sized problems. It is free for academic use; you need to request the code directly from the author, following the instructions in the link.
UPDATE: Sequential quadratic programming is only one approach to minimizing nonlinear objective functions subject to constraints; there are also, for example, interior-point methods. One very good large-scale open-source C++ alternative that applies the interior-point approach is Ipopt (already mentioned in another answer). There is also, for example, the commercial package KNITRO. If you cannot or do not want to provide objective-function and constraint gradients, you could also have a look at COBYLA2, of which a C version can be downloaded here.
For further inspiration, you could also consult the Decision Tree For Optimization Software, which lists different optimization codes suitable for a wide range of different problems.
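To illustrate how inequality constraints and simple bounds are typically expressed in such libraries, here is a minimal sketch using NLopt's C API with its derivative-free COBYLA implementation. This is purely illustrative of the interface (the asker found NLopt unsuitable for their particular problem), and the objective and constraint below are made up:

```cpp
#include <cstdio>
#include <nlopt.h>

// Made-up objective: f(x) = (x0 - 1)^2 + (x1 - 2)^2 (gradient unused by COBYLA).
static double objective(unsigned n, const double* x, double* grad, void* data)
{
    (void)n; (void)grad; (void)data;
    return (x[0] - 1.0) * (x[0] - 1.0) + (x[1] - 2.0) * (x[1] - 2.0);
}

// Inequality constraint in NLopt convention: c(x) <= 0. Here: x0 + x1 <= 2.
static double constraint(unsigned n, const double* x, double* grad, void* data)
{
    (void)n; (void)grad; (void)data;
    return x[0] + x[1] - 2.0;
}

int main()
{
    nlopt_opt opt = nlopt_create(NLOPT_LN_COBYLA, 2);  // derivative-free, supports constraints

    const double lb[2] = {0.0, 0.0};                   // lower bounds
    const double ub[2] = {5.0, 5.0};                   // upper bounds
    nlopt_set_lower_bounds(opt, lb);
    nlopt_set_upper_bounds(opt, ub);

    nlopt_set_min_objective(opt, objective, nullptr);
    nlopt_add_inequality_constraint(opt, constraint, nullptr, 1e-8);
    nlopt_set_xtol_rel(opt, 1e-6);

    double x[2] = {0.5, 0.5};                          // starting point
    double minf = 0.0;
    if (nlopt_optimize(opt, x, &minf) < 0)
        std::printf("nlopt failed\n");
    else
        std::printf("minimum %.6f at (%.4f, %.4f)\n", minf, x[0], x[1]);

    nlopt_destroy(opt);
    return 0;
}
```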

How to test scientific software?

I'm convinced that software testing is indeed very important, especially in science. However, over the last 6 years I have never come across a scientific software project that was under regular testing (and most of them were not even version controlled).
Now I'm wondering how you deal with software tests for scientific codes (numerical computations).
From my point of view, standard unit tests often miss the point: since there is no exact result, using assert(a == b) can prove difficult due to "normal" numerical errors.
So I'm looking forward to reading your thoughts about this.
I am also in academia, and I have written quantum-mechanical simulation programs to be executed on our cluster. I made the same observation regarding testing, or even version control. My situation was even worse: in my case I am using a C++ library for my simulations, and the code I got from others was pure spaghetti code, with no inheritance and not even functions.
I rewrote it and also implemented some unit testing. You are correct that you have to deal with the numerical precision, which can differ depending on the architecture you are running on. Nevertheless, unit testing is possible, as long as you take these numerical rounding errors into account. Your results should not depend on the rounding of the numerical values; otherwise you would have a different problem, namely the robustness of your algorithm.
So, to conclude, I use unit testing for my scientific programs, and it really makes one more confident about the results, especially with regards to publishing the data in the end.
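To make the "take rounding errors into account" point concrete, a common pattern is to replace exact equality with a combined absolute/relative tolerance check. A minimal sketch in plain C++ (the tolerance values are illustrative only; pick them from the conditioning of your problem):

```cpp
#include <cassert>
#include <cmath>
#include <algorithm>

// Returns true when a and b agree to within an absolute tolerance (for values
// near zero) or a relative tolerance (for large values), whichever is looser.
bool almost_equal(double a, double b,
                  double rel_tol = 1e-12, double abs_tol = 1e-15)
{
    const double diff  = std::fabs(a - b);
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return diff <= std::max(abs_tol, rel_tol * scale);
}

int main()
{
    // Instead of assert(a == b), compare against a reference value with a tolerance
    // that reflects the conditioning of the computation, not machine epsilon alone.
    const double computed  = 0.1 + 0.2;  // accumulates rounding error
    const double reference = 0.3;
    assert(almost_equal(computed, reference));
    return 0;
}
```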
I've just been looking at a similar issue (Google: "testing scientific software") and came up with a few papers that may be of interest. These cover both mundane coding errors and the bigger issue of knowing whether the result is even right (depth of the Earth's mantle?):
http://http.icsi.berkeley.edu/ftp/pub/speech/papers/wikipapers/cox_harris_testing_numerical_software.pdf
http://www.cs.ua.edu/~SECSE09/Presentations/09_Hook.pdf (broken link; new link is http://www.se4science.org/workshops/secse09/Presentations/09_Hook.pdf)
http://www.associationforsoftwaretesting.org/?dl_name=DianeKellyRebeccaSanders_TheChallengeOfTestingScientificSoftware_paper.pdf
I thought the idea of mutation testing described in 09_Hook.pdf (see also matmute.sourceforge.net) was particularly interesting, as it mimics the simple mistakes we all make. The hardest part is learning to use statistical analysis for confidence levels, rather than single-pass code reviews (by man or machine).
The problem is not new. I'm sure I have an original copy of "How accurate is scientific software?" by Hatton et al., Oct 1994, which even then showed how different implementations of the same theories (as algorithms) diverged rather rapidly. (It's also ref. 8 in the Kelly & Sanders paper.)
--- (Oct 2019)
More recently, see Testing Scientific Software: A Systematic Literature Review.
I'm also using cpptest for its TEST_ASSERT_DELTA. I write high-performance numerical programs in computational electromagnetics, and I've been happily using it in my C++ code.
I typically go about testing scientific code the same way as I do with any other kind of code, with only a few retouches, namely:
I always test my numerical codes for cases that make no physical sense and make sure the computation actually stops before producing a result. I learned this the hard way: I had a function that was computing some frequency responses, then supplied a matrix built with them to another function as arguments, which eventually gave its answer as a single vector. The matrix could have been any size depending on how many terminals the signal was applied to, but my function was not checking whether the matrix size was consistent with the number of terminals (2 terminals should have meant a 2 x 2 x n matrix); however, the code itself was wrapped so as not to depend on that: it didn't care what size the matrices were, since it just had to do some basic matrix operations on them. Eventually, the results were perfectly plausible, well within the expected range and, in fact, partially correct -- only half of the solution vector was garbled. It took me a while to figure it out. If your data looks correct, is assembled in a valid data structure, and the numerical values are good (e.g. no NaNs or negative numbers of particles) but it doesn't make physical sense, the function has to fail gracefully.
I always test the I/O routines even if they are just reading a bunch of comma-separated numbers from a test file. When you're writing code that does twisted math, it's always tempting to jump into debugging the part of the code that is so math-heavy that you need a caffeine jolt just to understand the symbols. Days later, you realize you are also adding the ASCII value of \n to your list of points.
When testing for a mathematical relation, I always test it "by the book", and I also learned this by example. I've seen code that was supposed to compare two vectors but only checked for equality of elements and did not check for equality of length.
Please take a look at the answers to the SO question How to use TDD correctly to implement a numerical method?