I'm implementing the Mahalanobis distance to measure the distance between two vectors from the same pool, and I noticed that the result is correct most of the time but sometimes not, possibly because of negative values.
I realized that sometimes the product under the square root, (x - y)^T S^-1 (x - y), comes out negative. The distance is then undefined, or rather the code throws an error because it takes the root of a negative value.
I wonder about the problem. The data is (a row represents an object):
A: 376.498943729227 2.75082585760394 376.688899264061 2.75084113940164
B: 373.287817831307 2.75074375125675 373.392663499518 2.75092754974534
C: 377.091938091279 2.75082292557743 377.466035993347 2.75077191984784
D: 374.799551607287 2.75094834987157 374.209110037364 2.75091796001419
The covariance matrix S is then
7.13457e-09 3.13933e-05 5.45925e-10 3.80508e-06
3.13933e-05 2.96355 -0.000115865 3.28797
5.45925e-10 -0.000115865 5.31665e-09 -0.000137211
3.80508e-06 3.28797 -0.000137211 3.79042
and the inverse of it is
3.24779e+22 -8.58499e+18 1.40166e+22 7.92177e+18
-8.58499e+18 2.2693e+15 -3.70505e+18 -2.09399e+15
1.40166e+22 -3.70505e+18 6.04917e+21 3.41882e+18
7.92177e+18 -2.09399e+15 3.41882e+18 1.93222e+15
Now I wonder why I get negative results out of the highlighted product (in case of B and D)?
I'm not sure if it's a programming problem (that's why I didn't include code lines yet) or rather a theoretical one, but I appreciate any help a lot!
I use the Eigen library.
edit:
I calculated the eigenvalues of the covariance matrix S via R and get:
7.593311e+02 1.243531e-01 1.156646e-02 -3.920936e-04
Why do I have different ones?
I used
M<- matrix(c(376.498943729227, 2.75082585760394, 376.688899264061, 2.75084113940164,
373.287817831307, 2.75074375125675, 373.392663499518, 2.75092754974534,
377.091938091279, 2.75082292557743, 377.466035993347, 2.75077191984784,
374.799551607287, 2.75094834987157, 374.209110037364, 2.75091796001419
), 4, 4)
> M
[,1] [,2] [,3] [,4]
[1,] 376.498944 373.287818 377.091938 374.799552
[2,] 2.750826 2.750744 2.750823 2.750948
[3,] 376.688899 373.392663 377.466036 374.209110
[4,] 2.750841 2.750928 2.750772 2.750918
ev<- eigen(M)
values<- ev$values
values
[1] 7.593311e+02 1.243531e-01 1.156646e-02 -3.920936e-04
Your covariance matrix has two eigenvalues that are almost zero (10^-10 and 10^-18). Therefore, the matrix cannot be easily inverted, it might even be considered as non-invertible.
The reason for the two small eigenvalues is that your data points do not fill the entire 4D space but only a 2D subspace (a plane embedded in 4D).
To calculate a reasonable distance, you need to project your points onto a 2D space (or whatever dimensionality your real data have). You can do this with PCA. After this, you can calculate the distance in 2D.
I copied your matrix into Matlab and computed the eigenvalues; the smallest of them is -4.0819e-13.
That doesn't seem that bad, but it shows a problem. A covariance matrix should be positive semidefinite, and therefore no eigenvalue should be smaller than 0. Likely due to rounding issues in your code, the matrix has a (slightly) negative eigenvalue, which can result in a problem like the one you are having.
Also, since two of the eigenvalues are practically zero, computing the inverse is a very brave move indeed. Meaning: you shouldn't, since you are essentially computing the inverse of a singular matrix.
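One way to see that S has to be singular here, independent of any rounding: a sample covariance built from n objects has rank at most n - 1, because the mean-centered rows always sum to zero. The sketch below assumes the four rows A to D shown in the question are the whole sample; the variable names are mine, and the columns are taken in the order the data is listed, which may differ from the ordering used for the S printed above.

```python
# Rows A-D from the question: four objects, four features each.
data = [
    [376.498943729227, 2.75082585760394, 376.688899264061, 2.75084113940164],
    [373.287817831307, 2.75074375125675, 373.392663499518, 2.75092754974534],
    [377.091938091279, 2.75082292557743, 377.466035993347, 2.75077191984784],
    [374.799551607287, 2.75094834987157, 374.209110037364, 2.75091796001419],
]

n = len(data)       # number of objects (rows)
d = len(data[0])    # number of features (columns)

# Column means, then mean-centered rows.
means = [sum(row[j] for row in data) / n for j in range(d)]
centered = [[row[j] - means[j] for j in range(d)] for row in data]

# The centered rows sum to (numerically) zero, so they are linearly
# dependent: at most n - 1 = 3 of them can be independent.
row_sum = [sum(row[j] for row in centered) for j in range(d)]

# Sample covariance S = Xc^T Xc / (n - 1); its rank cannot exceed rank(Xc).
cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
       for i in range(d)]

print("sum of centered rows:", row_sum)
print("diagonal of S:", [cov[i][i] for i in range(d)])
```

So with four objects in four dimensions at least one eigenvalue of S is zero in exact arithmetic, and a plain inverse is not meaningful; more objects, or a projection onto fewer dimensions as suggested above, would be needed.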
I'm struggling to make sense of the spectral clustering documentation here.
Specifically.
If you have an affinity matrix, such as a distance matrix, for which 0 means identical elements, and high values means very dissimilar elements, it can be transformed in a similarity matrix that is well suited for the algorithm by applying the Gaussian (RBF, heat) kernel:
np.exp(- X ** 2 / (2. * delta ** 2))
For my data, I have a complete distance matrix of size (n_samples, n_samples) where large entries represent dissimilar pairs, small values represent similar pairs and zero represents identical entries. (I.e. the only zeros are along the diagonal).
So all I need to do is build the SpectralClustering object with affinity = "precomputed" and then pass the transformed distance matrix to fit_predict.
I'm stuck on the suggested transformation equation. np.exp(- X ** 2 / (2. * delta ** 2)).
What is X here? The (n_samples, n_samples) distance matrix?
If so, what is delta? Is it just X.max()-X.min()?
Calling np.exp(- X ** 2 / (2. * (X.max()-X.min()) ** 2)) seems to do the right thing. I.e. big entries become relatively small, and small entries relatively big, with all the entries between 0 and 1. The diagonal is all 1's, which makes sense, since each point is most affine with itself.
But I'm worried. I think if the author had wanted me to use np.exp(- X ** 2 / (2. * (X.max()-X.min()) ** 2)) he would have told me to use just that, instead of throwing delta in there.
So I guess my question is just this. What's delta?
Yes, X in this case is the matrix of distances. delta is a scale parameter that you can tune as you wish. It controls the "tightness", so to speak, of the distance/similarity relation, in the sense that a small delta increases the relative dissimilarity of faraway points.
Notice that delta plays the inverse role of the gamma parameter of the RBF kernel, mentioned earlier in the doc link you give (gamma corresponds to 1 / (2 * delta ** 2)): both are free parameters which can be used to tune the clustering results.
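To make the effect of delta concrete, here is a minimal sketch in plain Python. The function name `distances_to_affinity` and the 3 x 3 distance matrix are made up for illustration; only the formula itself comes from the documentation quoted in the question.

```python
import math

def distances_to_affinity(dist, delta):
    """Apply exp(-d**2 / (2 * delta**2)) to every entry of a distance matrix."""
    return [[math.exp(-(d ** 2) / (2.0 * delta ** 2)) for d in row]
            for row in dist]

# Hypothetical distance matrix: points 0 and 1 are close, point 2 is far away.
dist = [
    [0.0, 1.0, 4.0],
    [1.0, 0.0, 3.0],
    [4.0, 3.0, 0.0],
]

wide = distances_to_affinity(dist, delta=4.0)    # gentle fall-off with distance
narrow = distances_to_affinity(dist, delta=1.0)  # sharp fall-off with distance

# Same result written with the RBF parameter gamma = 1 / (2 * delta**2).
gamma = 1.0 / (2.0 * 1.0 ** 2)
narrow_via_gamma = [[math.exp(-gamma * d ** 2) for d in row] for row in dist]

print("far pair, delta=4:", wide[0][2])
print("far pair, delta=1:", narrow[0][2])
```

The diagonal stays at 1 for any delta, while the far pair drops towards 0 much faster with the smaller delta. The transformed matrix is what would then be passed to fit_predict with affinity = "precomputed"; which delta works best depends on the data, so it is usually tried at a few scales.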
I am having a strange issue using Eigen (Tuxfamily) in my software (in C++).
I am analyzing a 3D volume image by calculating a Hessian matrix for each pixel.
The volume (approx. 800x800x600) is divided into subvolumes; for each subvolume I sum up all the obtained matrices and divide by their count to obtain the average (and then I do the same with the averages, summing them and dividing by the number of subvolumes, to obtain the average for the full volume).
The matrices are of type Matrix3d.
The problem is that for most of the sums (and obviously for the averages as well) I obtain something like:
Elements analyzed : 28215
Elements summed : 28215
Subvolume sum :
5143.76 | nan | -2778.05
5402.07 | 16011.9 | -inf
-2778.05 | -8716.86 | 7059.32
I sum them this way :
for(int i = 0;i<(int)OuterVector.size();i++){
AverageProduct+=OuterVector[i];
}
Due to the nature of the matrices I know that they should be symmetric about the diagonal, so for some of the entries the correct value is computed. Any idea why the others might be failing? (Consider that it's always the same two positions of the matrix giving me nan and -inf.)
OK, using a mix of the suggestions you gave me in the comments, I tried a couple of fixes and solved the problem.
When I was creating the Eigen::Matrix3d object, I was not initializing the values, so as soon as I added the first OuterVector[i] those two values went wild ((0,1) went to nan and (1,2) went to inf). Strange that it happened only for those two specific values and in exactly the same way every time.
So zero-initializing the matrix when declaring it
Matrix3d AverageProduct = Matrix3d::Zero();
was enough to fix it.
I am using PCA on binary attributes to reduce the dimensions (attributes) of my problem. The initial dimensions were 592 and after PCA the dimensions are 497. I used PCA before on numeric attributes in another problem, and it managed to reduce the dimensions to a greater extent (to half of the initial dimensions). I believe that binary attributes decrease the power of PCA, but I do not know why. Could you please explain why PCA does not work as well as on numeric data?
Thank you.
The principal components of 0/1 data can fall off slowly or rapidly,
and the PCs of continuous data too —
it depends on the data. Can you describe your data ?
The following picture is intended to compare the PCs of continuous image data
vs. the PCs of the same data quantized to 0/1: in this case, inconclusive.
Look at PCA as a way of getting an approximation to a big matrix,
first with one term: approximate A ~ c U V^T, i.e. A[i][j] ~ c U[i] V[j].
Consider this a bit, with A say 10k x 500: U 10k long, V 500 long.
The top row is c U[1] V, the second row is c U[2] V ...
all the rows are proportional to V.
Similarly the leftmost column is c U V[1] ...
all the columns are proportional to U.
But if all rows are similar (proportional to each other),
they can't get near an A matrix with rows or columns 0100010101 ...
With more terms, A ~ c1 U1 V1^T + c2 U2 V2^T + ...,
we can get nearer to A: the smaller the higher-order ci are, the faster the approximation converges.
(Of course, all 500 terms recreate A exactly, to within roundoff error.)
The top row is "lena", a well-known 512 x 512 matrix,
with 1-term and 10-term SVD approximations.
The bottom row is lena discretized to 0/1, again with 1 term and 10 terms.
I thought that the 0/1 lena would be much worse -- comments, anyone ?
(U V^T is also written U ⊗ V, called a "dyad" or "outer product".)
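A tiny sketch of the one-term versus two-term point, in plain Python. The vectors and weights below are made up purely for illustration, and `outer` and `add` are small helpers I define here, not library calls.

```python
def outer(c, u, v):
    """Return the rank-1 matrix with entries c * u[i] * v[j]."""
    return [[c * ui * vj for vj in v] for ui in u]

def add(a, b):
    """Entrywise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Hypothetical small vectors, chosen only for illustration.
u1, v1 = [1.0, 2.0, 3.0], [1.0, 0.0, 2.0, 1.0]
u2, v2 = [1.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0]

one_term = outer(2.0, u1, v1)
two_terms = add(one_term, outer(0.5, u2, v2))

# In the one-term matrix, row i is (u1[i] / u1[0]) times row 0
# (column 1 is skipped because it is all zeros there).
ratios = [[one_term[i][j] / one_term[0][j] for j in (0, 2, 3)]
          for i in range(3)]

print(ratios)         # each row has a single constant ratio
print(two_terms[0])
print(two_terms[1])   # no longer a multiple of the row above
```

With one term every row is locked to the same pattern V, which is why a single dyad cannot follow rows that switch between 0 and 1 in different places; each extra term loosens that constraint a little.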
(The Wikipedia articles Singular value decomposition and Low-rank approximation
are a bit math-heavy.
An AMS column by David Austin, We Recommend a Singular Value Decomposition,
gives some intuition on SVD / PCA -- highly recommended.)
I am trying to implement the ideas in this paper for modeling fracture:
http://graphics.berkeley.edu/papers/Obrien-GMA-1999-08/index.html
I am stuck at a point (essentially page 4...) and would really appreciate any help. The part I am stuck on involves the deformation of tetrahedron (using FEM).
I have a single tetrahedron defined by four nodes (each node has an x, y, z position), from which I calculate the following matrices:
u: each column is a vector containing material coordinates (x, y, z, 1) for each node (so 4 columns in total), a 4x4 matrix
B: inverse(u), he calls this the basis matrix, a 4x4 matrix
P: each column is a vector containing real world coordinates (x, y, z) for each node, a 3x4 matrix; I set P initially equal to u since the object is not deformed at the rest state
V: some initial velocities for (x, y, z) in each node, so a 3x4 matrix
delta: basically an identity matrix, {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}, {0, 0, 0}}
I get x(u) = P*B*u and v(u) = V*B*u, but not sure where to use these...
Also, I get dx = P*B*delta and dv = V*B*delta
I then get strain by Green's strain tensor, epsilon = 1/2(dx+transpose(dx)) - Identity_3x3
And then stress, sigma = lambda*trace(epsilon)*Identity_3x3 + 2*mu*epsilon
I get the elastic force by equation (24) on page 4 of the paper. It's just a big summation.
I then use explicit integration to update the real world coordinates P. The idea is that the velocity update involves the force on the node of the tetrahedron and therefore affects the real-world coordinate position, making the object deform.
The problem, however, is that the force is incredibly small, something x 10^-19, etc., so C++ usually rounds it to 0. I've stepped through the calculations and can't figure out why.
I know I'm missing something here, just can't figure out what. What update am I not doing correctly?
A common reason why the force is small is that your Young's modulus (which sets lambda and mu) is too small. If you are using a scale of meters, a macro-scale object might have a Young's modulus of 10^5 and a Poisson's ratio of .3 to .4.
It sounds like what might be happening is that your tet is still in the rest configuration. In the presence of no deformation, the strain will be zero and so in-turn the stress and force will also be about zero. You can perturb the vertices in various ways and make sure your strain (epsilon) is being computed correctly. One simple test is to scale by 2 about the centroid which should give you a positive strain. If you scale by .5 about the centroid you will get a negative strain. If you translate the vertices uniformly you will get no change in strain (a common FEM invariant). If you rotate them you probably will get a change, but a co-rotational constitutive model wouldn't.
Note you might think that gravity would cause deformation, but unless one of the vertices is constrained, the uniform force on all vertices will cause a uniform translation which will not change the strain from being zero.
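The perturbation tests described above can be sketched in a few lines of plain Python. Two assumptions to flag loudly: the strain here is the usual quadratic Green strain E = 1/2 (F^T F - I), which is not the expression written in the question, and the deformation gradient F is built from edge vectors instead of the paper's 4x4 basis matrix (for a linear tetrahedron the two should coincide). All function names are hypothetical.

```python
def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def inverse3(m):
    """Inverse of a 3x3 matrix via the adjugate (fine for a well-shaped tet)."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]

def edge_matrix(nodes):
    """Columns are the edge vectors from node 0 to nodes 1, 2 and 3."""
    x0 = nodes[0]
    return [[nodes[k][r] - x0[r] for k in (1, 2, 3)] for r in range(3)]

def green_strain(rest, deformed):
    """E = 1/2 (F^T F - I), with F = Ds * Dm^-1 the deformation gradient."""
    f = mat_mul(edge_matrix(deformed), inverse3(edge_matrix(rest)))
    ftf = mat_mul(transpose(f), f)
    return [[0.5 * (ftf[i][j] - (1.0 if i == j else 0.0)) for j in range(3)]
            for i in range(3)]

def scale_about_centroid(nodes, s):
    c = [sum(p[r] for p in nodes) / 4.0 for r in range(3)]
    return [[c[r] + s * (p[r] - c[r]) for r in range(3)] for p in nodes]

# Hypothetical rest shape: a unit corner tetrahedron.
rest = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

stretched = green_strain(rest, scale_about_centroid(rest, 2.0))
squeezed = green_strain(rest, scale_about_centroid(rest, 0.5))
shifted = green_strain(rest, [[p[0] + 3.0, p[1], p[2]] for p in rest])
rotated = green_strain(rest, [[-p[1], p[0], p[2]] for p in rest])  # 90 deg about z

print(stretched[0][0], squeezed[0][0], shifted[0][0], rotated[0][0])
```

Scaling by 2 gives a positive diagonal, scaling by .5 a negative one, and translation gives zero. With this quadratic form a pure rotation also gives zero strain; the linearized expression 1/2 (dx + transpose(dx)) - I would not, which is one thing worth checking in the asker's code.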
You definitely should not need to use arbitrary precision arithmetic for the examples in the paper. In fact, floats typically are sufficient for these types of simulation.
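As a quick sanity check on the precision question, here is a small sketch in Python, whose float is an IEEE 754 double on common platforms, the same format as a C++ double. The value 3.0e-19 is a made-up stand-in for the "something x 10^-19" force from the question.

```python
import sys

force = 3.0e-19                     # hypothetical magnitude, like the one reported

digits = sys.float_info.dig         # significant decimal digits a double carries
smallest = sys.float_info.min       # smallest positive normal double

survives = force != 0.0             # a tiny value on its own is not flushed to zero
absorbed = (1.0 + force) == 1.0     # but it is lost when added to something near 1

print(digits, smallest, survives, absorbed)
```

The 15 digits refer to significant digits, not to how small a number can get, so 10^-19 by itself is well within range. If such a value shows up as 0, output formatting or adding it to a much larger quantity might be the likelier culprit.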
I might be mistaken, but c++ doubles only go to 15 decimal places, (at least that's what my std::numeric_limits says). So you're way out of precision.
So you might end up needing a library for arbitrary precision arithmetics, e.g., http://gmplib.org/
I am trying to do a 2D Real To Complex FFT using CUFFT.
I realize that I will do this and get W/2+1 complex values back (W being the "width" of my H*W matrix).
The question is: what if I want to build out a full H*W version of this matrix after the transform? How do I go about copying values from the H*(W/2+1) result matrix back to a full-size matrix to get both parts and the DC value in the right place?
Thanks
I'm not familiar with CUDA, so take that into consideration when reading my response. I am familiar with FFTs and signal processing in general, though.
It sounds like you start out with an H (rows) x W (cols) matrix, and that you are doing a 2D FFT that essentially does an FFT on each row, and you end up with an H x (W/2+1) matrix. A W-wide FFT returns W values, but the CUDA function only returns W/2+1 because the spectrum of real data is conjugate-symmetric, so the negative frequency data is redundant.
So, if you want to reproduce the missing W/2-1 points, simply mirror the positive frequencies and take the complex conjugate. For instance, if one of the rows is as follows:
Index Data
0 12 + i
1 5 + 2i
2 6
3 2 - 3i
...
The 0 index is your DC power, the 1 index is the lowest positive frequency bin, and so forth. You would thus make your closest-to-DC negative frequency bin 5 - 2i (the conjugate of bin 1), the next closest 6, and so on. Where you put those values in the array is up to you. I would do it the way Matlab does it, with the negative frequency data after the positive frequency data.
I hope that makes sense.
There are two ways this can be achieved. You will have to write your own kernel for either of them.
1) You will need to perform conjugate on the (half) data you get to find the other half.
2) Since you want full results anyway, it would be best if you convert the input data from real to complex (by padding with 0 imaginary) and performing the complex to complex transform.
From practice I have noticed that there is not much of a difference in speed either way.
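For reference, the index mapping for option 1 can be sketched on the CPU in plain Python before porting it to a kernel. This assumes the half spectrum is stored row by row with the width dimension cut to W/2+1; `expand_half_spectrum` and `naive_dft2` are hypothetical helpers, the latter being a slow reference transform used only to check the result. Note that for a genuinely 2D transform the mirrored entry sits in row (H - h) mod H, not in the same row.

```python
import cmath

def expand_half_spectrum(half, width):
    """Rebuild an H x W spectrum from the H x (W//2 + 1) real-to-complex output."""
    height = len(half)
    full = [[0j] * width for _ in range(height)]
    for h in range(height):
        for w in range(width):
            if w <= width // 2:
                full[h][w] = half[h][w]   # stored half, copied as-is
            else:
                # Real input => X[h][w] == conj(X[(H - h) % H][(W - w) % W])
                full[h][w] = half[(height - h) % height][width - w].conjugate()
    return full

def naive_dft2(x):
    """Slow reference 2D DFT, used here only to check the reconstruction."""
    height, width = len(x), len(x[0])
    return [[sum(x[r][c] * cmath.exp(-2j * cmath.pi * (k * r / height + l * c / width))
                 for r in range(height) for c in range(width))
             for l in range(width)]
            for k in range(height)]

# Hypothetical 3 x 4 real input.
real_input = [[1.0, 2.0, 3.0, 4.0],
              [0.0, -1.0, 5.0, 2.0],
              [7.0, 1.0, 0.0, 3.0]]

reference = naive_dft2(real_input)
half = [row[:4 // 2 + 1] for row in reference]   # what an R2C transform would keep
rebuilt = expand_half_spectrum(half, 4)

max_error = max(abs(rebuilt[k][l] - reference[k][l])
                for k in range(3) for l in range(4))
print("max reconstruction error:", max_error)
```

In a CUDA kernel each thread would handle one (h, w) pair of the full matrix with the same two branches.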
I actually searched the NVIDIA forums and found a kernel that someone had written that did just what I was asking. That is what I used. If you search the CUDA forum for "redundant results fft" or similar you will find it.