SLAM system that uses deep learned features? - computer-vision

Has anybody tried developing a SLAM system that uses deep learned features instead of the classical AKAZE/ORB/SURF features?
Scanning recent Computer Vision conferences, there seem to be quite a few reports of successful usage of neural nets to extract features and descriptors, and benchmarks indicate that they may be more robust than their classical computer vision equivalents. I suspect that extraction speed is an issue, but assuming one has a decent GPU (e.g. NVIDIA 1050), is it even feasible to build a real-time SLAM system running at, say, 30 FPS on 640x480 grayscale images with deep-learned features?

This was a bit too long for a comment, so that's why I'm posting it as an answer.
I think it is feasible, but I don't see how this would be useful. Here is why (please correct me if I'm wrong):
In most SLAM pipelines, precision is more important than long-term robustness. You obviously need your feature detections/matchings to be precise to get reliable triangulation/bundle adjustment (or whatever equivalent scheme you might use). However, the high level of robustness that neural networks provide is only required in systems that do relocalization/loop closure over long time intervals (e.g. relocalizing across different seasons). Even in such scenarios, since you already have a GPU, I think it would be better to use a photometric (or even just geometric) model of the scene for localization.
We don't have any reliable noise models for the features that are detected by the neural networks. I know there have been a few interesting works (Gal, Kendall, etc.) on propagating uncertainties in deep networks, but these methods seem a bit immature for deployment in SLAM systems.
Deep learning methods are usually good for initializing a system, and the solution they provide needs to be refined. Their results depend too much on the training dataset, and tend to be "hit and miss" in practice. So I think that you could trust them to get an initial guess or some constraints. For example, in pose estimation, if you have a geometric algorithm that drifts over time, you can use the results of a neural network to constrain it. But I think that the absence of a noise model, as mentioned previously, will make the fusion a bit difficult here.
So yes, I think that it is feasible and that you can probably, with careful engineering and tuning, produce a few interesting demos, but I wouldn't trust it in real life.
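To make the noise-model point concrete, here is a minimal sketch, assuming two independent scalar estimates of the same quantity with Gaussian noise (say, a heading angle from a geometric pipeline and one from a network). The function name `fuse` and all numbers are hypothetical and only illustrate inverse-variance weighting; they do not come from any particular SLAM system.

```python
def fuse(x_geo, var_geo, x_net, var_net):
    """Inverse-variance fusion of two independent Gaussian estimates
    of the same scalar. Returns (fused estimate, fused variance)."""
    w_geo = 1.0 / var_geo
    w_net = 1.0 / var_net
    fused = (w_geo * x_geo + w_net * x_net) / (w_geo + w_net)
    fused_var = 1.0 / (w_geo + w_net)
    return fused, fused_var


# Hypothetical numbers: the geometric estimate is 10.0 with variance 1.0,
# the network estimate is 12.0 and its true variance is 4.0.
honest = fuse(10.0, 1.0, 12.0, 4.0)

# Same measurements, but the network's variance is wrongly assumed
# to be 0.25, i.e. the network is treated as far too trustworthy.
overconfident = fuse(10.0, 1.0, 12.0, 0.25)

print(honest)         # fused value stays close to the geometric estimate
print(overconfident)  # fused value is pulled toward the network output
```

The measurements are identical in both calls; only the assumed variance of the network changes, and that alone moves the fused result and makes the reported uncertainty look smaller than it should be. This is why fusing network outputs without a trustworthy noise model is awkward.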

Related

Problems in computer vision that use optimization method in graph theory?

I am supposed to give a presentation on optimization algorithms on graphs. On the other hand, I am also very interested in computer vision. And I hope to combine these two in my presentation. Can you suggest some topics in computer vision which are solved by optimization methods in graph theory (e.g. shortest-path, maximum flow, matching, etc.)? The newer the better.
There was an enormous amount of work done in the late '90s and early '00s using graph-cut methods in Computer Vision. This is a good starting point: https://en.wikipedia.org/wiki/Graph_cuts_in_computer_vision
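As a rough illustration of the idea, here is a minimal sketch of binary segmentation as a minimum cut, assuming a toy one-dimensional "image" of four pixels. The capacities, the node names (`src`, `sink`, `p0`..`p3`) and the function `max_flow_min_cut` are all hypothetical; the solver is a plain Edmonds-Karp max-flow, not the specialised algorithms used in the graph-cut literature.

```python
from collections import deque


def max_flow_min_cut(capacity, source, sink):
    """Edmonds-Karp max-flow on a dict-of-dicts capacity graph.
    Returns (flow value, set of nodes on the source side of a min cut)."""
    # Residual graph: copy the capacities and add zero-capacity reverse edges.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    def bfs_parents():
        # Breadth-first search over edges with remaining capacity.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = bfs_parents()
        if sink not in parent:
            # No augmenting path left: reachable nodes form the source side.
            return flow, set(parent)
        # Find the bottleneck along the path, then push flow along it.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


# Toy problem: p0 and p1 are bright, p2 and p3 are dark.
# src->p edges: cost of labelling p as background.
# p->sink edges: cost of labelling p as foreground.
# p<->neighbour edges: smoothness penalty for cutting between neighbours.
capacity = {
    "src": {"p0": 9, "p1": 8, "p2": 2, "p3": 1},
    "p0": {"sink": 1, "p1": 3},
    "p1": {"sink": 2, "p0": 3, "p2": 3},
    "p2": {"sink": 8, "p1": 3, "p3": 3},
    "p3": {"sink": 9, "p2": 3},
}

cut_cost, source_side = max_flow_min_cut(capacity, "src", "sink")
foreground = sorted(source_side - {"src"})
print(cut_cost, foreground)
```

The pixels that stay connected to the source after the cut are labelled foreground; the cut cost is the energy of that labelling under the made-up costs above.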

Choosing kernel for digit recognition in C++

I'm trying to classify digits read from images at known positions in C++, using an SVM.
For that, I sample over a rectangle at the known position of the digit and train against the ground truth.
I wonder how to choose the kernel of the SVM. I use the default linear kernel, but my intuition tells me that it might not be the best choice.
How could I choose the kernel?
You will need to tune the kernel (if you use a nonlinear one). This guide may be useful for you: A practical guide to SVM classification
Unfortunately there is no magic bullet for this, so experimentation is your best friend.
I would probably start with RBF, which tends to work decently in most cases, and I agree with your intuition that linear is probably not the best, although sometimes (especially when you have tons of data) it can give you good surprises :)
The problem I have found with RBF is that it tends to overfit the training set. This stops being an issue if you have a lot of data, but then a new problem arises: it tends to scale poorly and training becomes slow on big data.
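To show what is actually being tuned, here is a minimal sketch of the two kernels in plain Python. The function names, the sample vectors and the candidate grids for `C` and `gamma` are assumptions for illustration only; in practice an SVM library computes the kernels for you and you would score each grid point with cross-validation.

```python
import math


def linear_kernel(x, y):
    """Plain dot product: similarity grows with aligned feature values."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


x = (0.0, 0.0)
y = (1.0, 1.0)

# Small gamma: distant samples still look similar (smooth decision boundary).
wide = rbf_kernel(x, y, gamma=0.01)
# Large gamma: every sample is similar only to itself, which is where the
# tendency to overfit the training set comes from.
narrow = rbf_kernel(x, y, gamma=10.0)
print(wide, narrow)

# Hypothetical search grid: exponentially spaced candidates for the
# regularisation strength C and the kernel width gamma.
c_grid = [2.0 ** k for k in range(-5, 16, 4)]
gamma_grid = [2.0 ** k for k in range(-15, 4, 4)]
candidates = [(c, g) for c in c_grid for g in gamma_grid]
print(len(candidates), "parameter pairs to evaluate")
```

The linear kernel has no `gamma` at all, which is one reason it is cheaper to tune and train when there is a lot of data.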

Offline embedded realtime routing

I am currently working on a senior design project for school and have come across a design issue that I do not know how to solve. I need to have realtime, offline routing for an embedded walking application.
I have not been able to find any libraries that suit my needs. I understand I might have to make either my own vectorized map of my local town or my own routing algorithm. I will not go into much detail about what my project entails, but it does not require a large map, maybe a 5x5 mile grid. The maps can be loaded from SD if they need to be changed.
I see there are GpsMid, YOURs, and others, all using OpenStreetMap data.
We will have a TI micro-controller for processing and a GPS card for real-time lat/lon. I just do not know how to take the real-time info and route using a static map.
Thanks,
Matt
I'm not well versed in what is typically used for real-time routing with GPS and vectorized maps, but I can recommend some general algorithms that can be used as tools to help you get your project done.
A* search is a pretty typical path finding algorithm. http://en.wikipedia.org/wiki/A_star
Depending on how you organize your data, you may also find Dijkstra's algorithm helpful. http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm
These algorithms are popular enough that you should be able to find example code in whatever language you want, although I'd be very skeptical of the quality. I'd recommend writing your own, since you are in school, as it'd be beneficial for you to have written and debugged them on your own at least once in your career. When you are done, you'll have a tried and true implementation to call your own.
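For orientation, here is a minimal A* sketch, assuming the map has already been reduced to a small graph of path junctions with planar coordinates in metres. The node names, coordinates, edge lengths and the function name `a_star` are hypothetical. It is written in Python for readability; on a micro-controller the same structure would be ported to C with fixed-size arrays.

```python
import heapq
import math


def a_star(nodes, edges, start, goal):
    """A* over a graph of junctions.
    nodes: id -> (x, y) position in metres.
    edges: id -> list of (neighbour id, path length in metres).
    Returns (list of node ids, total length) or (None, inf)."""

    def heuristic(n):
        # Straight-line distance never overestimates the walking distance.
        (x1, y1), (x2, y2) = nodes[n], nodes[goal]
        return math.hypot(x2 - x1, y2 - y1)

    best = {start: 0.0}      # cheapest known cost to reach each node
    parent = {start: None}   # back-pointers for path reconstruction
    frontier = [(heuristic(start), start)]

    while frontier:
        _, current = heapq.heappop(frontier)
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1], best[goal]
        for neighbour, length in edges.get(current, []):
            cost = best[current] + length
            if cost < best.get(neighbour, math.inf):
                best[neighbour] = cost
                parent[neighbour] = current
                heapq.heappush(frontier, (cost + heuristic(neighbour), neighbour))
    return None, math.inf


# Hypothetical four-junction map; the D-C path winds and is 150 m long.
nodes = {"A": (0, 0), "B": (100, 0), "C": (100, 100), "D": (0, 100)}
edges = {
    "A": [("B", 100.0), ("D", 100.0)],
    "B": [("A", 100.0), ("C", 100.0)],
    "C": [("B", 100.0), ("D", 150.0)],
    "D": [("A", 100.0), ("C", 150.0)],
}

path, cost = a_star(nodes, edges, "A", "C")
print(path, cost)
```

Setting the heuristic to zero turns this into Dijkstra's algorithm. To route from a live GPS fix, one would first snap the current position to the nearest node or edge and use that as `start`.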
Seems to me there are two parts to this:
1 - Identifying map data that tells you what's a road/path (potential route), I would expect this is already in the data in some way. It could be as simple as which colour any given line is.
2 - Calculating a route over those paths. This is well documented/discussed and there are plenty of algorithms etc. out there on the problem. These days it's hardly worth trying very hard for elegance/efficiency, you can just throw CPU cycles at it until an answer pops out.
Also, should this be tagged [homework] ?

processing an image using CUDA implementation, python (pycuda) or C++?

I am in a project to process an image using CUDA. The project is simply an addition or subtraction of the image.
May I ask your professional opinion, which is best and what would be the advantages and disadvantages of those two?
I appreciate everyone's opinions and/or suggestions since this project is very important to me.
General answer: It doesn't matter. Use the language you're more comfortable with.
Keep in mind, however, that PyCUDA is only a wrapper around the CUDA C interface, so it may not always be up-to-date. It also adds another potential source of bugs, …
Python is great at rapid prototyping, so I'd personally go for Python. You can always switch to C++ later if you need to.
If the rest of your pipeline is in Python, and you're already using NumPy to speed things up, PyCUDA is a good complement to accelerate expensive operations. However, depending on the size of your images and your program flow, you might not get much of a speedup using PyCUDA. There is latency involved in passing the data back and forth across the PCI bus that is only made up for with large data sizes.
In your case (addition and subtraction), there are built-in operations in PyCUDA that you can use to your advantage. However, in my experience, using PyCUDA for something non-trivial requires knowing a lot about how CUDA works in the first place. For someone starting from no CUDA knowledge, PyCUDA might be a steep learning curve.
Take a look at OpenCV. It contains a lot of image processing functions and all the helpers to load/save/display images and operate cameras.
It also now supports CUDA: some of the image processing functions have been reimplemented in CUDA, and it gives you a good framework to write your own.
Alex's answer is right. The amount of time consumed in the wrapper is minimal. Note that PyCUDA has some nice metaprogramming constructs for generating kernels which might be useful.
If all you're doing is adding or subtracting elements of an image, you probably shouldn't use CUDA for this at all. The amount of time it takes to transfer back and forth across the PCI-E bus will dwarf the amount of savings you get from parallelism.
Any time you deal with CUDA, it's useful to think about the CGMA ratio (computation to global memory access ratio). Your addition/subtraction is only 1 floating-point operation for 2 memory accesses (1 read and 1 write). This ends up being very lousy from a CUDA perspective.
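A back-of-the-envelope version of that argument, as a minimal sketch. The image size, the float32 pixel format and the bus bandwidth are assumptions picked for illustration; measure your own hardware before drawing conclusions.

```python
def cgma_ratio(flops_per_element, global_accesses_per_element):
    """Computation to global memory access ratio for one output element."""
    return flops_per_element / global_accesses_per_element


def transfer_seconds(num_bytes, bandwidth_bytes_per_s):
    """Idealised time to move num_bytes over a bus, ignoring latency."""
    return num_bytes / bandwidth_bytes_per_s


# Counting accesses as above: 1 operation, 1 read and 1 write.
ratio_one_input = cgma_ratio(1, 2)
# Adding two images element-wise needs 2 reads and 1 write instead.
ratio_two_inputs = cgma_ratio(1, 3)

# Assumed workload: 640x480 image stored as 4-byte floats.
image_bytes = 640 * 480 * 4
# Assumed effective host<->device bandwidth (hypothetical figure).
assumed_bandwidth = 8e9  # bytes per second

# Upload two input images and download one result.
bus_time = transfer_seconds(3 * image_bytes, assumed_bandwidth)

print(ratio_one_input, ratio_two_inputs)
print(image_bytes, "bytes per image,", bus_time, "s on the bus")
```

Whatever the real numbers are on a given machine, the structure stays the same: every pixel crosses the bus and is touched in global memory, while only a single arithmetic operation is done with it.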

What are the most widely used C++ vector/matrix math/linear algebra libraries, and their cost and benefit tradeoffs? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 5 years ago.
It seems that many projects slowly come upon a need to do matrix math, and fall into the trap of first building some vector classes and slowly adding in functionality until they get caught building a half-assed custom linear algebra library, and depending on it.
I'd like to avoid that while not building in a dependence on some tangentially related library (e.g. OpenCV, OpenSceneGraph).
What are the commonly used matrix math/linear algebra libraries out there, and why would one decide to use one over another? Are there any that would be advised against using for some reason? I am specifically using this in a geometric/time context (2, 3, 4 dimensions) but may be using higher dimensional data in the future.
I'm looking for differences with respect to any of: API, speed, memory use, breadth/completeness, narrowness/specificness, extensibility, and/or maturity/stability.
Update
I ended up using Eigen3 which I am extremely happy with.
There are quite a few projects that have settled on the Generic Graphics Toolkit for this. The GMTL in there is nice - it's quite small, very functional, and has been used widely enough to be very reliable. OpenSG, VRJuggler, and other projects have all switched to using this instead of their own hand-rolled vector/matrix math.
I've found it quite nice - it does everything via templates, so it's very flexible, and very fast.
Edit:
After the comments discussion, and edits, I thought I'd throw out some more information about the benefits and downsides to specific implementations, and why you might choose one over the other, given your situation.
GMTL -
Benefits: Simple API, specifically designed for graphics engines. Includes many primitive types geared towards rendering (such as planes, AABB, quaternions with multiple interpolation, etc) that aren't in any other packages. Very low memory overhead, quite fast, easy to use.
Downsides: API is very focused specifically on rendering and graphics. Doesn't include general purpose (NxM) matrices, matrix decomposition and solving, etc, since these are outside the realm of traditional graphics/geometry applications.
Eigen -
Benefits: Clean API, fairly easy to use. Includes a Geometry module with quaternions and geometric transforms. Low memory overhead. Full, highly performant solving of large NxN matrices and other general purpose mathematical routines.
Downsides: May be a bit larger scope than you are wanting (?). Fewer geometric/rendering specific routines when compared to GMTL (ie: Euler angle definitions, etc).
IMSL -
Benefits: Very complete numeric library. Very, very fast (supposedly the fastest solver). By far the largest, most complete mathematical API. Commercially supported, mature, and stable.
Downsides: Cost - not inexpensive. Very few geometric/rendering specific methods, so you'll need to roll your own on top of their linear algebra classes.
NT2 -
Benefits: Provides syntax that is more familiar if you're used to MATLAB. Provides full decomposition and solving for large matrices, etc.
Downsides: Mathematical, not rendering focused. Probably not as performant as Eigen.
LAPACK -
Benefits: Very stable, proven algorithms. Been around for a long time. Complete matrix solving, etc. Many options for obscure mathematics.
Downsides: Not as highly performant in some cases. Ported from Fortran, with odd API for usage.
Personally, for me, it comes down to a single question - how are you planning to use this? If your focus is just on rendering and graphics, I like Generic Graphics Toolkit, since it performs well, and supports many useful rendering operations out of the box without having to implement your own. If you need general purpose matrix solving (i.e. SVD or LU decomposition of large matrices), I'd go with Eigen, since it handles that, provides some geometric operations, and is very performant with large matrix solutions. You may need to write more of your own graphics/geometric operations (on top of their matrices/vectors), but that's not horrible.
So I'm a pretty critical person, and figure if I'm going to invest in a library, I'd better know what I'm getting myself into. I figure it's better to go heavy on the criticism and light on the flattery when scrutinizing; what's wrong with it has many more implications for the future than what's right. So I'm going to go overboard here a little bit to provide the kind of answer that would have helped me and I hope will help others who may journey down this path. Keep in mind that this is based on what little reviewing/testing I've done with these libs. Oh and I stole some of the positive description from Reed.
I'll mention up top that I went with GMTL despite its idiosyncrasies because the Eigen2 unsafeness was too big of a downside. But I've recently learned that the next release of Eigen2 will contain defines that will shut off the alignment code, and make it safe. So I may switch over.
Update: I've switched to Eigen3. Despite its idiosyncrasies, its scope and elegance are too hard to ignore, and the optimizations which make it unsafe can be turned off with a define.
Eigen2/Eigen3
Benefits: MPL2 (formerly LGPL). Clean, well designed API, fairly easy to use. Seems to be well maintained with a vibrant community. Low memory overhead. High performance. Made for general linear algebra, but good geometric functionality available as well. All header lib, no linking required.
Idiosyncrasies/downsides: (Some/all of these can be avoided by some defines that are available in the current development branch Eigen3)
Unsafe performance optimizations result in needing careful following of rules. Failure to follow rules causes crashes.
you simply cannot safely pass-by-value
use of Eigen types as members requires special allocator customization (or you crash)
use with STL container types and possibly other templates requires special allocation customization (or you will crash)
certain compilers need special care to prevent crashes on function calls (GCC windows)
GMTL
Benefits: LGPL, Fairly Simple API, specifically designed for graphics engines.
Includes many primitive types geared towards rendering (such as
planes, AABB, quaternions with multiple interpolation, etc) that
aren't in any other packages. Very low memory overhead, quite fast,
easy to use. All header based, no linking necessary.
Idiosyncrasies/downsides:
API is quirky
what might be myVec.x() in another lib is only available via myVec[0] (Readability problem)
an array or std::vector of points may cause you to do something like pointsList[0][0] to access the x component of the first point
in a naive attempt at optimization, removed cross(vec,vec) and
replaced with makeCross(vec,vec,vec) when compiler eliminates
unnecessary temps anyway
normal math operations don't return normal types unless you shut
off some optimization features e.g.: vec1 - vec2 does not return a
normal vector so length( vecA - vecB ) fails even though vecC = vecA -
vecB works. You must wrap like: length( Vec( vecA - vecB ) )
operations on vectors are provided by external functions rather than
members. This may require you to use the scope resolution everywhere
since common symbol names may collide
you have to do
length( makeCross( vecA, vecB ) )
or
gmtl::length( gmtl::makeCross( vecA, vecB ) )
where otherwise you might try
vecA.cross( vecB ).length()
not well maintained
still claimed as "beta"
documentation missing basic info like which headers are needed to
use normal functionality
Vec.h does not contain operations for Vectors, VecOps.h contains
some, others are in Generate.h for example. cross(vec&,vec&,vec&) in
VecOps.h, [make]cross(vec&,vec&) in Generate.h
immature/unstable API; still changing.
For example "cross" has moved from "VecOps.h" to "Generate.h", and
then the name was changed to "makeCross". Documentation examples fail
because they still refer to old versions of functions that no longer exist.
NT2
Can't tell because they seem to be more interested in the fractal image header of their web page than the content. Looks more like an academic project than a serious software project.
Latest release over 2 years ago.
Apparently no documentation in English though supposedly there is something in French somewhere.
Can't find a trace of a community around the project.
LAPACK & BLAS
Benefits: Old and mature.
Downsides:
old as dinosaurs with really crappy APIs
For what it's worth, I've tried both Eigen and Armadillo. Below is a brief evaluation.
Eigen
Advantages:
1. Completely self-contained -- no dependence on external BLAS or LAPACK.
2. Documentation decent.
3. Purportedly fast, although I haven't put it to the test.
Disadvantage:
The QR algorithm returns just a single matrix, with the R matrix embedded in the upper triangle. No idea where the rest of the matrix comes from, and no Q matrix can be accessed.
Armadillo
Advantages:
1. Wide range of decompositions and other functions (including QR).
2. Reasonably fast (uses expression templates), but again, I haven't really pushed it to high dimensions.
Disadvantages:
1. Depends on external BLAS and/or LAPACK for matrix decompositions.
2. Documentation is lacking IMHO (including the specifics wrt LAPACK, other than changing a #define statement).
Would be nice if an open source library were available that is self-contained and straightforward to use. I have run into this same issue for 10 years, and it gets frustrating. At one point, I used GSL for C and wrote C++ wrappers around it, but with modern C++ -- especially using the advantages of expression templates -- we shouldn't have to mess with C in the 21st century. Just my tuppence ha'penny.
If you are looking for high performance matrix/linear algebra/optimization on Intel processors, I'd look at Intel's MKL library.
MKL is carefully optimized for fast run-time performance - much of it based on the very mature BLAS/LAPACK Fortran standards. And its performance scales with the number of cores available. Hands-free scalability with available cores is the future of computing, and I wouldn't use any math library for a new project that doesn't support multi-core processors.
Very briefly, it includes:
Basic vector-vector, vector-matrix, and matrix-matrix operations
Matrix factorization (LU decomp, Hermitian, sparse)
Least squares fitting and eigenvalue problems
Sparse linear system solvers
Non-linear least squares solver (trust regions)
Plus signal processing routines such as FFT and convolution
Very fast random number generators (Mersenne Twister)
Much more.
A downside is that the MKL API can be quite complex depending on the routines that you need. You could also take a look at their IPP (Integrated Performance Primitives) library which is geared toward high performance image processing operations, but is nevertheless quite broad.
Paul
CenterSpace Software, .NET Math libraries, centerspace.net
What about GLM?
It's based on the OpenGL Shading Language (GLSL) specification and released under the MIT license.
Clearly aimed at graphics programmers
I've heard good things about Eigen and NT2, but haven't personally used either. There's also Boost.UBLAS, which I believe is getting a bit long in the tooth. The developers of NT2 are building the next version with the intention of getting it into Boost, so that might count for something.
My lin. alg. needs don't extend beyond the 4x4 matrix case, so I can't comment on advanced functionality; I'm just pointing out some options.
I'm new to this topic, so I can't say a whole lot, but BLAS is pretty much the standard in scientific computing. BLAS is actually an API standard, which has many implementations. I'm honestly not sure which implementations are most popular or why.
If you want to also be able to do common linear algebra operations (solving systems, least squares regression, decomposition, etc.) look into LAPACK.
I'll add vote for Eigen: I ported a lot of code (3D geometry, linear algebra and differential equations) from different libraries to this one - improving both performance and code readability in almost all cases.
One advantage that wasn't mentioned: it's very easy to use SSE with Eigen, which significantly improves performance of 2D-3D operations (where everything can be padded to 128 bits).
Okay, I think I know what you're looking for. It appears that GGT is a pretty good solution, as Reed Copsey suggested.
Personally, we rolled our own little library, because we deal with rational points a lot - lots of rational NURBS and Beziers.
It turns out that most 3D graphics libraries do computations with projective points that have no basis in projective math, because that's what gets you the answer you want. We ended up using Grassmann points, which have a solid theoretical underpinning and decreased the number of point types. Grassmann points are basically the same computations people are using now, with the benefit of a robust theory. Most importantly, it makes things clearer in our minds, so we have fewer bugs. Ron Goldman wrote a paper on Grassmann points in computer graphics called "On the Algebraic and Geometric Foundations of Computer Graphics".
Not directly related to your question, but an interesting read.
FLENS
http://flens.sf.net
It also implements a lot of LAPACK functions.
I found this library quite simple and functional (http://kirillsprograms.com/top_Vectors.php). These are bare-bones vectors implemented via C++ templates. No fancy stuff - just what you need to do with vectors (add, subtract, multiply, dot, etc).