How to do numeric differentiation using Boost Units? (C++)

I would like to perform a numeric differentiation in C++. For type safety, I'd like to use boost::units to avoid mixing units but also boost::units::absolute to avoid mixing relative and absolute units.
A minimal example is to calculate the velocity as a function of the change in position divided by the change in time: v = dx/dt, which can be approximated as (x1 - x0)/(t1 - t0).
In this example v has an absolute unit (velocity), dx and dt relative ones (distance / duration).
While boost::units derives the correct unit if we simply take relative units everywhere,
static_assert(std::is_same<
    boost::units::divide_typeof_helper<
        boost::units::si::length,
        boost::units::si::time>::type,
    boost::units::si::velocity>::value);
the static_assert fails if we want the result of our division being an absolute velocity:
static_assert(std::is_same<
    boost::units::divide_typeof_helper<
        boost::units::si::length,
        boost::units::si::time>::type,
    boost::units::absolute<boost::units::si::velocity>>::value);
Am I making a wrong assumption that the result of dividing two relative units should always yield an absolute one? Or is this an error in the implementation of boost::units?

From the docs on boost::units::absolute,
Description
A wrapper to represent absolute units (points rather than vectors).
Spacetime events are points (if not viewed as radius vectors); their differences are vectors. Velocity, being a displacement vector divided by a scalar duration, is also a vector. So your assumption does indeed appear to be wrong: dividing two relative quantities yields another relative quantity, not an absolute one.
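A minimal sketch of the finite difference using plain (relative) quantity types, assuming Boost.Units is available; the compiler derives the velocity type on its own:

#include <boost/units/quantity.hpp>
#include <boost/units/systems/si.hpp>

namespace si = boost::units::si;
using boost::units::quantity;

int main() {
    quantity<si::length> x0 = 1.0 * si::meters, x1 = 3.0 * si::meters;
    quantity<si::time>   t0 = 0.0 * si::seconds, t1 = 2.0 * si::seconds;

    // The quotient of two relative quantities is again relative: m/s here.
    quantity<si::velocity> v = (x1 - x0) / (t1 - t0);  // 1 m/s
}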

Related

How to update the covariance of a multi camera system when a rigid motion is applied to all of them?

For example, for 6-DoF camera states, two cameras have 12 state parameters and a 12x12 covariance matrix (assume a Gaussian distribution). How does this covariance change when a 6-DoF rigid motion is applied to the cameras?
What if the 6-DoF motion is Gaussian-distributed too?
You can use the "forward propagation" theorem (you can find it in Hartley and Zisserman's multiple view geometry book, chapter 5, page 139).
Basically, if you have a random variable x with mean x_m and covariance C, and a differentiable function f that you apply to x, then the mean of f(x) will be approximately f(x_m) and its covariance C_f will be approximately J*C*J^t, where ^t denotes the transpose and J is the Jacobian matrix of f evaluated at x_m.
Let's now consider the problems of the covariance propagation separately for camera positions and camera orientations.
First, let's see what happens to the translation parameters of the camera; denote them by x_t. In your case, f is a rigid transformation, which means that
f(x_t) = R*x_t + T  // R is a rotation, T a translation, x_t the camera position
Now the Jacobian of f with respect to x_t is simply R, so the covariance is given by
C_f=RCR^T
which is an interesting result: it indicates that the change in covariance only depends on the rotation. This makes sense, since intuitively, translating (positional) data doesn't change the axes along which it varies (think of principal component analysis).
Also note that if C is isotropic, i.e. a diagonal matrix lambda*Identity, then C_f = lambda*Identity too, which also makes sense, since intuitively we don't expect an isotropic covariance to change under a rotation.
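A minimal Eigen sketch of this propagation step (the function name is illustrative):

#include <Eigen/Dense>

// Propagate a positional covariance through a rigid motion: C_f = R * C * R^T.
// The translation T drops out of the Jacobian entirely.
Eigen::Matrix3d propagatePositionCov(const Eigen::Matrix3d& R,
                                     const Eigen::Matrix3d& C) {
    return R * C * R.transpose();
}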
Now consider the orientation parameters. Let's use the Lie algebra of the group SO(3). In that case, the yaw, pitch and roll will be parametrized as v = [alpha_1, alpha_2, alpha_3]^t (they are basically Lie algebra coefficients). In the following, we will use the exponential and logarithm maps between the Lie algebra so(3) and the group SO(3). We can write our function as
f(v)=log(R*exp(v))
In the above, exp(v) is the rotation matrix of your camera, and R is the rotation from your rigid transformation.
Note that the translation doesn't affect the orientation parameters. Computing the Jacobian of f with respect to v is mathematically involved. I suspect that you can do it using the adjoint representation, or the Baker-Campbell-Hausdorff formula; however, you will have to limit the precision. Here, we'll take a shortcut and use the result given in this question.
jacobian_f_with_respect_to_v = R * inverse(R*exp(v))
                             = R * exp(v)^t * R^t
So, our covariance will be
R*exp(v)^t*R^t * Cov(v) * (R*exp(v)^t*R^t)^t
  = R*exp(v)^t*R^t * Cov(v) * R*exp(v) * R^t
Again, we observe the same thing: if Cov(v) is isotropic then so is the covariance of f.
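A sketch of that propagation with Eigen, using Rodrigues' formula for the exponential map (function names are illustrative; the Jacobian is the one quoted above):

#include <Eigen/Dense>
#include <cmath>

// Exponential map so(3) -> SO(3) via Rodrigues' formula.
Eigen::Matrix3d expSO3(const Eigen::Vector3d& v) {
    const double theta = v.norm();
    Eigen::Matrix3d K;
    K <<      0, -v.z(),  v.y(),
          v.z(),      0, -v.x(),
         -v.y(),  v.x(),      0;
    if (theta < 1e-10) return Eigen::Matrix3d::Identity() + K;  // small angle
    return Eigen::Matrix3d::Identity()
         + std::sin(theta) / theta * K
         + (1.0 - std::cos(theta)) / (theta * theta) * K * K;
}

// Propagate the orientation covariance: J = R*exp(v)^t*R^t, C_f = J*Cov*J^t.
Eigen::Matrix3d propagateOrientationCov(const Eigen::Matrix3d& R,
                                        const Eigen::Vector3d& v,
                                        const Eigen::Matrix3d& Cov) {
    const Eigen::Matrix3d J = R * expSO3(v).transpose() * R.transpose();
    return J * Cov * J.transpose();
}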
Edit: Answers to the questions you asked in the comments
Why did you assume conditional independence between translation/rotation?
Conditional independence between translation/orientation parameters is often assumed in many works (especially in the pose-graph literature, e.g. see Hauke Strasdat's thesis), and I've always found that in practice this works a lot better (not a very convincing argument, I know). However, I admit that I didn't put much thought (if any) into this when writing this answer, because my main point was "use the forward propagation theorem". You can apply it jointly to orientation/position, and all this changes is that your Jacobian will look like
J = [J_R  J_T]  // J_R: Jacobian w.r.t. orientation, J_T: Jacobian w.r.t. position
and then the "densification" of the covariance matrix will happen as a result of the propagation J*C*J^t.
Why did you use SO(3) instead of SE(3)?
You said it yourself: I separated the translation parameters from the orientation. SE(3) is the space of rigid transformations, which includes translations. It wouldn't have made sense for me to use it, since I had already taken care of the position parameters.
What about the covariance between two cameras?
I think we can still apply the same theorem. The difference is that now your rigid transformation will be a function M(x_1, x_2) of 12 parameters, and your Jacobian will look like [J_R_1 J_R_2 J_T_1 J_T_2]. These can be tedious to compute, as you know, so if you can, just use numeric or automatic differentiation (as in the sketch below).
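A generic forward-difference Jacobian sketch with Eigen (names and step size are illustrative):

#include <Eigen/Dense>
#include <functional>

// Numeric Jacobian of f: R^n -> R^m at x, one forward difference per column.
Eigen::MatrixXd numericJacobian(
    const std::function<Eigen::VectorXd(const Eigen::VectorXd&)>& f,
    const Eigen::VectorXd& x, double eps = 1e-6) {
    const Eigen::VectorXd f0 = f(x);
    Eigen::MatrixXd J(f0.size(), x.size());
    for (Eigen::Index i = 0; i < x.size(); ++i) {
        Eigen::VectorXd x1 = x;
        x1[i] += eps;                   // perturb one coordinate at a time
        J.col(i) = (f(x1) - f0) / eps;
    }
    return J;
}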

How to sample from a normal distribution restricted to a certain interval, C++ implementation?

With this function I can sample from a normal distribution. I was wondering how I could sample efficiently from a normal distribution restricted to a certain interval [a,b]. My trivial approach would be to sample from the normal distribution and then keep the value if it belongs to the interval, otherwise re-sample. However, this would probably discard many values before I get a suitable one.
I could also approximate the normal distribution using a triangular distribution, however I don't think this would be accurate enough.
I could also try to work on the cumulative function, but probably this would be slow as well. Is there any efficient approach to the problem?
Thanks
I'm assuming you know how to transform to and from standard normal with shifting by μ and scaling by σ.
Option 1, as you said, is acceptance/rejection. Generate normals as usual, reject them if they're outside the range [a, b]. It's not as inefficient as you might think. If p = P{a < Z < b}, then the number of trials required follows a geometric distribution with parameter p and the expected number of attempts before accepting a value is 1/p.
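A minimal sketch of that rejection loop with <random> (names are illustrative; note it gets slow if [a, b] sits far out in the tail):

#include <random>

// Acceptance/rejection: draw standard normals until one lands in [a, b].
// Expected number of draws is 1/p with p = P{a < Z < b}.
double truncatedNormalReject(double a, double b, std::mt19937& gen) {
    std::normal_distribution<double> normal(0.0, 1.0);
    double z;
    do { z = normal(gen); } while (z < a || z > b);
    return z;
}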
Option 2 is to use the inverse Gaussian CDF (quantile function), such as the one in Boost. Calculate lo = Φ(a) and hi = Φ(b), the probabilities of your normal being below a and b, respectively. Then generate U distributed uniformly between lo and hi, and crank the resulting U's through the quantile function and rescale to get outcomes with the desired truncated distribution.
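A sketch of this inverse-CDF approach using Boost.Math (function names from std and boost::math; the wrapper name is illustrative). Every draw is accepted, so the cost is constant regardless of [a, b]:

#include <random>
#include <boost/math/distributions/normal.hpp>

// Map a uniform draw from [Phi(a), Phi(b)] back through the quantile function.
double truncatedNormalInverse(double a, double b, std::mt19937& gen) {
    boost::math::normal_distribution<double> n(0.0, 1.0);
    const double lo = boost::math::cdf(n, a);   // Phi(a)
    const double hi = boost::math::cdf(n, b);   // Phi(b)
    std::uniform_real_distribution<double> uni(lo, hi);
    return boost::math::quantile(n, uni(gen));  // Phi^{-1}(U)
}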
The normal distribution probability is an integral: P{a < Z < b} = 1/sqrt(2*PI) * the integral of exp(-x*x/2) from a to b. You can approximate it with a Riemann midpoint sum:

double fctn(double x) {   // the function inside the integral
    return exp(-(x*x)/2);
}

std::cout << "riemann_midpnt_sum = "
          << 1 / (sqrt(2*PI)) * riemann_mid_point_sum(fctn, -1.0, 1.0, 100) << '\n';

output: "riemann_midpnt_sum = 0.682698"
This calculates the standard normal probability from -1 to 1, using a Riemann sum to approximate the integral. You can take the Riemann sum from here.
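The midpoint-sum helper itself isn't shown in the answer; a minimal sketch matching the call above could be:

#include <cmath>

// Midpoint-rule approximation of the integral of f over [a, b] with n slices.
double riemann_mid_point_sum(double (*f)(double), double a, double b, int n) {
    const double dx = (b - a) / n;
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += f(a + (i + 0.5) * dx);  // sample at each slice midpoint
    return sum * dx;
}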
You could have a look at the implementation of the normal distribution in your standard library (e.g., https://gcc.gnu.org/onlinedocs/gcc-4.6.3/libstdc++/api/a00277.html) and figure out a way to re-implement it with your constraint.
It might be tricky to understand the template-heavy library code, but if you really need speed then the trivial approach is not well suited, particularly if your interval is quite small.

Which method should I use to determine the similarity of 2D, 3D and 4D (quaternions) vectors?

I am writing some simple Unit Tests for math library.
To decide whether the library generates good results I have to compare them with expected ones. Because of rounding etc., even a good result will differ a bit from the expected one (e.g. 0.701 when 0.700 was expected).
The problem is, I have to decide how similar two vectors are. I want to describe that similarity as an error proportion (for a single number it would be e.g. errorScale(3.0f /* generated */, 1.5f /* expected */) = 3.0f/1.5f = 2.0f == 200%).
Which method should I use to determine the similarity of 2D, 3D and 4D (quaternions) vectors?
There's no universally good measure. In particular, for addition the absolute error is more meaningful, while for multiplication the relative error is.
For vectors, the "relative error" can also be considered in terms of length and direction. If you think about it, the "acceptable outcomes" form a small region around the exact result. But what's the shape of this region? Is it an axis-aligned square (absolute errors in the x and y directions)? That privileges a specific vector basis. A circle might be a better shape.
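One possible concrete test along those lines, as a sketch (tolerances and names are illustrative, and it assumes non-zero vectors): compare the lengths relatively and the directions by angle:

#include <Eigen/Dense>
#include <algorithm>
#include <cmath>

// Similar if the lengths agree to a relative tolerance and the angle between
// the directions is below angleTol (radians).
bool almostEqual(const Eigen::Vector3d& a, const Eigen::Vector3d& b,
                 double relLenTol = 1e-3, double angleTol = 1e-3) {
    const double la = a.norm(), lb = b.norm();
    if (std::abs(la - lb) > relLenTol * std::max(la, lb)) return false;
    // Clamp against rounding before acos.
    const double c = std::min(1.0, std::max(-1.0, a.dot(b) / (la * lb)));
    return std::acos(c) <= angleTol;
}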

Minimization of anonymous function in C++

I have a cyclic program in C++ which includes composing a function (every time a different one) and then minimizing it. Composing the function is implemented with the GiNaC package (symbolic expressions).
I tried to minimize functions using Matlab's fmincon function, but it ate all the memory while converting a string to a lambda function (the functions are rather complicated). And I couldn't manage to export a function from C++ to Matlab in any way other than as a string.
Is there any way to compose a complicated function (3 variables, sin-cos-square-root etc.) and minimize it without determining the gradient myself, given that I don't know what the functions look like before running the program?
I also looked at NLopt, and as I understood it, it requires gradients to be written by the programmer.
Most optimization algorithms do require the gradient. However, if it's impossible to 'know' it directly, you may evaluate it by considering a small increment of every coordinate. If your function F depends on the coordinates of the vector x, you may approximate the i-th component of your gradient vector G as
x1 = x;                     // copy the current point
x1[i] += dx;                // perturb the i-th coordinate
G[i] = (F(x1) - F(x))/dx;   // forward-difference quotient
where dx is some small increment. Although such a calculation is approximate, it's usually perfectly good for minimum finding, provided that dx is small enough.
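Wrapped into a full gradient routine, a sketch might look like this (names and the step size are illustrative):

#include <vector>

// Forward-difference gradient: approximates every component of G at x.
std::vector<double> gradient(double (*F)(const std::vector<double>&),
                             std::vector<double> x, double dx = 1e-6) {
    const double f0 = F(x);
    std::vector<double> G(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        x[i] += dx;
        G[i] = (F(x) - f0) / dx;
        x[i] -= dx;  // restore the coordinate for the next component
    }
    return G;
}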

Least Squares Solution of Overdetermined Linear Algebraic Equation Ax = By

I have a linear algebraic equation of the form Ax = By, where A is a 6x5 matrix, x a vector of size 5, B a 6x6 matrix and y a vector of size 6. A, B and y are known, their values arriving in real time from sensors; x is unknown and has to be found. One solution is the least-squares estimate, x = [(A^T*A)^-1]*(A^T)*B*y, the conventional solution of such linear algebraic equations. I used Eigen's QR decomposition to solve it as below:
matrixA = getMatrixA();
matrixB = getMatrixB();
vectorY = getVectorY();
//LSE Solution
Eigen::ColPivHouseholderQR<Eigen::MatrixXd> dec1(matrixA);
vectorX = dec1.solve(matrixB * vectorY);
Everything is fine until now. But when I check the error e = Ax - By, it's not always zero. The error is not very big, but not ignorable either. Is there any other type of decomposition that is more reliable? I have gone through one page but could not understand the meaning or how to implement it. Below are lines from the reference on how to solve the problem. Could anybody suggest how to implement this?
The solution of such equations Ax = By is obtained by forming the error vector e = Ax - By and then finding the unknown vector x that minimizes the weighted error e^T*W*e, where W is a weighting matrix. For simplicity, this weighting matrix is chosen to be of the form W = K*S, where S is a constant diagonal scaling matrix and K is a scalar weight. Hence the solution to the equation becomes
x = [(A^T*W*A)^-1]*(A^T)*W*B*y
I did not understand how to form the matrix W.
Your statement " But when I check the error e = Ax-By, its not zero always. " almost always will be true, regardless of your technique, or what weighting you choose. When you have an over-described system, you are basically trying to fit a straight line to a slew of points. Unless, by chance, all the points can be placed exactly on a single perfectly straight line, there will be some error. So no matter what technique you use to choose the line, (weights and so on) you will always have some error if the points are not colinear. The alternative would be to use some kind of spline, or in higher dimensions to allow for warping. In those cases, you can choose to fit all the points exactly to a more complicated shape, and hence result with 0 error.
So the choice of a weight matrix simply changes which straight line you will use by giving each point a slightly different weight. So it will not ever completely remove the error. But if you had a few particular points that you care more about than the others, you can give the error on those points higher weight when choosing the least square error fit.
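A sketch of the weighted formula from the quote with Eigen (the diagonal of S and the scalar K are whatever you choose; the function name is illustrative):

#include <Eigen/Dense>

// Weighted least squares for A*x = B*y: minimizes (Ax-By)^T * W * (Ax-By)
// by solving the normal equations (A^T*W*A) * x = A^T*W*B*y, with W = K*S.
Eigen::VectorXd weightedLSE(const Eigen::MatrixXd& A,
                            const Eigen::MatrixXd& B,
                            const Eigen::VectorXd& y,
                            const Eigen::VectorXd& s,  // diagonal of S
                            double K) {
    Eigen::MatrixXd W = Eigen::MatrixXd::Zero(s.size(), s.size());
    W.diagonal() = K * s;  // W = K*S, diagonal weighting
    return (A.transpose() * W * A).ldlt().solve(A.transpose() * W * B * y);
}

With W = Identity (all s_i = 1, K = 1) this reduces to the ordinary least-squares solution above; larger s_i simply make the fit care more about the corresponding rows.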
For spline fitting see:
http://en.wikipedia.org/wiki/Spline_interpolation
For the nicest spline curve interpolation you can use centripetal Catmull-Rom, which in addition to finding a curve that fits all the points will prevent the unnecessary loops and self-intersections that can sometimes come up during abrupt changes in the data direction.
Catmull-rom curve with no cusps and no self-intersections