Calculate a 6x6 Covariance Matrix from two 1x3 arrays - C++

As the title says, I am attempting to calculate the covariance matrix for two 1x3 arrays and end up with one 6x6 std::array in C++. I need some guidance with my understanding; I have looked around and have not found a clear answer to my question.
I have two arrays, each with 3 elements.
Array1 holds location data (x, y, z) and Array2 holds velocity data, which we will call (A, B, C):
Array1 = {x,y,z}
Array2 = {A,B,C}
I need to compute a covariance matrix from these and store it in a 2D array[6][6].
I don't understand how I would get this.
I think my covariance formula is correct, but it would only give me an array[3][3]:
cov = ( (Array1[n] - mean(Array1)) * (Array2[n] - mean(Array2)) ) / 3
// dividing by 3 because that is the number of values in each array
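For reference, here is a minimal sketch of one common reading of this problem: stack position and velocity into a single 6-element state vector and estimate the 6x6 covariance from several samples (a single pair of 1x3 arrays only gives a rank-1 outer product, not a full-rank covariance). The names Vec3, Cov6, and covariance6x6 are illustrative, not from the original post.

#include <array>
#include <vector>
#include <cstddef>

using Vec3 = std::array<double, 3>;
using Cov6 = std::array<std::array<double, 6>, 6>;

// Estimate cov[i][j] = E[(v_i - mean_i) * (v_j - mean_j)] for the combined
// state v = (x, y, z, A, B, C), averaged over all samples.
Cov6 covariance6x6(const std::vector<Vec3>& pos, const std::vector<Vec3>& vel)
{
    const std::size_t n = pos.size();           // number of samples (== vel.size())
    std::array<double, 6> mean{};
    for (std::size_t s = 0; s < n; ++s)
        for (std::size_t i = 0; i < 3; ++i) {
            mean[i]     += pos[s][i] / n;
            mean[i + 3] += vel[s][i] / n;
        }

    Cov6 cov{};
    for (std::size_t s = 0; s < n; ++s) {
        std::array<double, 6> d;                // deviation of this sample from the mean
        for (std::size_t i = 0; i < 3; ++i) {
            d[i]     = pos[s][i] - mean[i];
            d[i + 3] = vel[s][i] - mean[i + 3];
        }
        for (std::size_t i = 0; i < 6; ++i)
            for (std::size_t j = 0; j < 6; ++j)
                cov[i][j] += d[i] * d[j] / n;   // divide by n (or n-1 for the sample covariance)
    }
    return cov;
}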

Related

C++ Armadillo reshape a matrix with only one dimension size

Using Armadillo, how do I reshape a matrix when I only specify one dimension size?
In Matlab documentation, there is this example of such functionality:
Reshape a 6-by-6 magic square matrix into a matrix that has only 3
columns. Specify [] for the first dimension size to let reshape
automatically calculate the appropriate number of rows.
A = magic(6);
B = reshape(A,[],3);
The result is a 12-by-3 matrix, which maintains the same number of
elements (36) as the original 6-by-6 matrix. The elements in B also
maintain their columnwise order from A.
How can that be accomplished with Armadillo?
You can use .size() to get the total number of elements of your matrix and calculate the dimensions yourself.
Example:
B = reshape(A, A.size()/3, 3);
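For illustration, a minimal self-contained sketch of that call (the matrix here is filled with random values, since Armadillo has no magic() generator; only the shape matters):

#include <armadillo>
#include <iostream>

int main()
{
    arma::mat A(6, 6, arma::fill::randu);               // stand-in for Matlab's magic(6)

    // Equivalent of Matlab's reshape(A, [], 3): derive the row count from the
    // total element count; Armadillo keeps the column-wise element order.
    arma::mat B = arma::reshape(A, A.size() / 3, 3);

    std::cout << B.n_rows << " x " << B.n_cols << std::endl;   // prints: 12 x 3
    return 0;
}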

A better way to access an n-d array element with a 1-d index array in C++?

Recently I have been working with C++ pointers, and I ran into this question when I wanted to access elements of a multi-dimensional array using a 1-dimensional array that contains the indices.
Say I have an array arr, a 4-dimensional array with all elements set to 0 except that arr[1][2][3][4] is 1, and an array idx which contains the index for every dimension of arr. I can access this element by using arr[idx[0]][idx[1]][idx[2]][idx[3]], or by using *(*(*(*(arr + idx[0]) + idx[1]) + idx[2]) + idx[3]).
The problem is that when n is large this gets unwieldy, so I wonder whether there is a better way to handle multi-dimensional access.
#include <bits/stdc++.h>
using namespace std;
#define N 10

int main()
{
    int arr[N][N][N][N] = {0};
    int idx[4] = {1, 2, 3, 4};
    arr[1][2][3][4] = 1;
    cout << "Expected: " << arr[1][2][3][4] << " at " << &arr[1][2][3][4] << endl;
    cout << "Got with ****: ";
    cout << *(*(*(*(arr + idx[0]) + idx[1]) + idx[2]) + idx[3]) << endl;
    return 0;
}
Output
Expected: 1 at 0x7fff54c61f28
Got with ****: 1
The way you construct your algorithm for indexing a multi-dimensional array will vary depending on the language of choice; you have tagged this question with both C and C++. I will stick with the latter, since my answer pertains to C++. For a while now I've been working on something similar but different, so this becomes an interesting question, as I was building a multipurpose multidimensional matrix class template.
What I have discovered about higher levels of multi-dimensional vectors and matrices is that the order of 3 repeatedly works miracles in understanding the nature of higher dimensions. Think of this from the geometrical perspective before considering the algorithmic, software-implementation side of it.
Mathematically speaking, let's start at the lowest dimension, 0, whose first shape is a 0-dimensional object. This happens to be any arbitrary point, and such a point can have any number of coordinate properties: points such as p0(0), p1(1), p2(2,2), p3(3,3,3), ... pn(n,n,...,n), where each of these objects points to a specific locale with the defined number of dimensional attributes. This means there is no linear distance such as length, width, or height, and consequently no magnitude in any direction or dimension, so this shape or object defines no area, volume, or higher order of volume. With these 0-dimensional points there is also no notion of direction, which implies there is no angle of rotation that defines magnitude. Another thing to consider is that any arbitrary point is also the zero vector. Algebraic polynomials help in understanding this: f(x) = mx + b, which is linear, is a one-dimensional equation, shape (in this case a line), or graph; f(x) = x^2 is two-dimensional, f(x) = x^3 is three-dimensional, f(x) = x^4 is four-dimensional, and so on up to f(x) = x^n, which is n-dimensional. Length or magnitude, direction or angle of rotation, area, volume, and the rest cannot be defined until you relate two distinct points to give you at least one line segment or vector with a specified direction. Once you have an implied direction, you have slope.
Looking at operations in mathematics, the simplest is addition, which is nothing more than a linear translation. Once you introduce addition you also introduce all the other operations, such as subtraction, multiplication, division, powers, and radicals; once you have multiplication and division you can define rotation, angles of rotation, area, volume, rates of change, and slope (also the tangent function), which defines geometry and trigonometry and then leads into integrals and derivatives. Yes, we have all had our math lessons, but I think this is important for understanding how to construct the relationship of one order of magnitude to another, which in turn helps us work through higher dimensional orders with ease once we know how to construct them. Once you understand that even the higher-order operations are nothing more than expansions of addition and subtraction, you begin to see that repeating them is still linear in nature; they just expand into multiple dimensions.
Earlier I stated that the order of 3 repeatedly works miracles, so let me explain my meaning. Since we perceive things on a daily basis from a 3D perspective, we can only visualize 3 distinct vectors that are orthogonal to each other, giving our natural 3 dimensions of space: left & right and forward & backward give the horizontal axes and planes, and up & down gives the vertical axis and planes. We cannot visualize anything higher, so dimensions of the order of x^4, x^5, x^6, etc. cannot be visualized, yet they do exist. If we look at the graphs of the polynomials we can see a pattern between odd and even functions, where x^4, x^6, and x^8 are nothing more than expansions of x^2, and x^5, x^7, and x^9 are nothing more than expansions of x^3. So I consider the first few dimensions the normal ones: zero - point, 1st - linear, 2nd - area, and 3rd - volume, and the 4th and higher dimensions I call volumetric.
So when I use volume it relates directly to the 3rd dimension, and when I refer to volumetric it relates to any dimension higher than the 3rd. Now let's consider a matrix as you have seen in regular algebra, where the common matrices are defined as MxN. This is a 2D flat matrix that has M * N elements, and it also has an area of M * N. Expanding to a higher-dimensional matrix such as MxNxO, this is a 3D matrix with M * N * O elements and M * N * O volume. When you visualize this, think of the MxN 2D part as a page of a book, and the O component as each page of the book, or slice of a box. The elements of these matrices can be anything from a simple value, to an applied operation, to an equation, a system of equations, sets, or just an arbitrary object as in a storage container. Now, when we have a matrix of the 4th order such as MxNxOxP, it has a 4th-dimensional aspect, but the easiest way to visualize it is as a 1-dimensional array or vector where each of its P elements is a 3D matrix with a volume of MxNxO. When you have a matrix of MxNxOxPxQ, you have a 2D area matrix of PxQ where each of those elements is an MxNxO volume matrix. And if you have MxNxOxPxQxR, you now have a 6th-dimensional matrix, and this time a 3D volume matrix where each of the PxQxR elements is in fact a 3D matrix of MxNxO. As you go higher and higher this pattern repeats and merges again. So the order in which arbitrary matrices behave is that these dimensionalities repeat: 1D are linear vectors or matrices, 2D are area or planar matrices, 3D are volume matrices, and anything higher repeats this process, compressing the previous step of volumes, hence the terminology of volumetric matrices. Take a look at this table:
// Order of Magnitude and Groupings
-----------------------------------
 Linear    Area     Volume
  x^1      x^2      x^3
  x^4      x^5      x^6
  x^7      x^8      x^9
  x^10     x^11     x^12
  ...      ...      ...
-----------------------------------
Now it is just a matter of using a little bit of calculus to know which order of magnitude indexes into which higher level of dimensionality. Once you know a specific dimension, it is simple to take multiple derivatives to give you a linear expression, traverse the space, and then integrate back to the same orders as the derivatives to give the results. This should eliminate a good amount of intermediate work by at first ignoring the least significant lower dimensions of a high-dimensional order. If you are working with something that has 12 dimensions, you can assume that the first 3 dimensions, which define the first volume, are packed tight as an element of another 3D volumetric matrix, and that this volumetric matrix is in turn an element of yet another 3D volumetric matrix. So we have a repeating pattern, and it is just a matter of applying this to construct an algorithm; once you have an algorithm, it should be quite easy to implement the methods in any programming language. You may need a 3-case switch to determine which algorithmic approach to use given the overall dimensionality of your matrix or n-d array: one handles linear orders, another handles area, and the last handles volumes; if they are 4th order or higher, the overall process becomes recursive in nature.
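For reference, here is a minimal sketch, not necessarily the scheme described above, of the standard row-major stride computation that collapses an index array into a single flat offset; the names flatIndex, dims, and idx are illustrative only:

#include <cstddef>
#include <vector>
#include <iostream>

// Row-major linear index: offset = ((i0*d1 + i1)*d2 + i2)*d3 + i3, and so on
// for any number of dimensions.
std::size_t flatIndex(const std::vector<std::size_t>& dims,
                      const std::vector<std::size_t>& idx)
{
    std::size_t offset = 0;
    for (std::size_t d = 0; d < dims.size(); ++d)
        offset = offset * dims[d] + idx[d];
    return offset;
}

int main()
{
    int arr[10][10][10][10] = {};
    arr[1][2][3][4] = 1;

    // Treat the 4-d array as one contiguous block of 10*10*10*10 ints.
    const int* flat = &arr[0][0][0][0];
    std::cout << flat[flatIndex({10, 10, 10, 10}, {1, 2, 3, 4})] << '\n';   // prints 1
    return 0;
}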
I figured out a way to solve this myself.
The idea is to use void * pointers: we know that every memory cell holds either a value or an address of another memory cell, so we can directly compute the offset of the target from the base address.
In this case, we use void *p = arr to get the base address of the n-d array, and then loop over the array idx to calculate the offset.
For arr[10][10][10][10], the offset between arr[0] and arr[1] is 10 * 10 * 10 * sizeof(int): since arr is 4-d, arr[0] and arr[1] are 3-d, so there are 10 * 10 * 10 = 1000 elements between arr[0] and arr[1]. We also have to remember that the step between two adjacent void * addresses is 1 byte, so we multiply by sizeof(int) to get the correct byte offset. With this, we finally get the exact address of the memory cell we want to access.
Finally, we cast the void * pointer to an int * pointer and dereference it to get the correct int value. That's it!
With void * (not so good)
#include <bits/stdc++.h>
using namespace std;
#define N 10

int main()
{
    int arr[N][N][N][N] = {0};
    int idx[4] = {1, 2, 3, 4};
    arr[1][2][3][4] = 1;
    cout << "Expected: " << arr[1][2][3][4] << " at " << &arr[1][2][3][4] << endl;
    cout << "Got with ****: ";
    cout << *(*(*(*(arr + idx[0]) + idx[1]) + idx[2]) + idx[3]) << endl;
    void *p = arr;
    for (int i = 0; i < 4; i++)
        p += idx[i] * int(pow(10, 3-i)) * sizeof(int);
    cout << "Got with void *:";
    cout << *((int*)p) << " at " << p << endl;
    return 0;
}
Output
Expected: 1 at 0x7fff5e3a3f18
Got with ****: 1
Got with void *:1 at 0x7fff5e3a3f18
Notice:
There is a warning when compiling it, but I choose to ignore it.
test.cpp: In function 'int main()':
test.cpp:23:53: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
p += idx[i] * int(pow(10, 3-i)) * sizeof(int);
Use char * instead of void * (better)
Since we want to manipulate the pointer byte by byte, it would be better to use char * instead of void *.
#include <bits/stdc++.h>
using namespace std;
#define N 10

int main()
{
    int arr[N][N][N][N] = {0};
    int idx[4] = {1, 2, 3, 4};
    arr[1][2][3][4] = 1;
    cout << "Expected: " << arr[1][2][3][4] << " at " << &arr[1][2][3][4] << endl;
    char *p = (char *)arr;
    for (int i = 0; i < 4; i++)
        p += idx[i] * int(pow(10, 3-i)) * sizeof(int);
    cout << "Got with char *:";
    cout << *((int*)p) << " at " << (void *)p << endl;
    return 0;
}
Output
Expected: 1 at 0x7fff4ffd7f18
Got with char *:1 at 0x7fff4ffd7f18
With int * (in this specific case)
I have been told that using void * in arithmetic is not good practice and that it would be better to use int *, so I cast arr to an int * pointer and also replaced pow.
#include <bits/stdc++.h>
using namespace std;
#define N 10

int main()
{
    int arr[N][N][N][N] = {0};
    int idx[4] = {1, 2, 3, 4};
    arr[1][2][3][4] = 1;
    cout << "Expected: " << arr[1][2][3][4] << " at " << &arr[1][2][3][4] << endl;
    cout << "Got with ****: ";
    cout << *(*(*(*(arr + idx[0]) + idx[1]) + idx[2]) + idx[3]) << endl;
    int *p = (int *)arr;
    int offset = 1e3;
    for (int i = 0; i < 4; i++)
    {
        p += idx[i] * offset;
        offset /= 10;
    }
    cout << "Got with int *:";
    cout << *p << " at " << p << endl;
    return 0;
}
Output
Expected: 1 at 0x7fff5eaf9f08
Got with ****: 1
Got with int *:1 at 0x7fff5eaf9f08

Add extra feature to a matrix, np.concatenate error: only length-1 arrays can be converted to Python scalars

I want to add an extra column to my matrix in order to predict some features with some machine learning algorithms.
My trainSet has 8899 rows and 11 dimensions.
All I want to do is add the extra dimension distance (see code).
But I got an error:
only length-1 arrays can be converted to Python scalars
temp_train_long/lat is (8899L,)
X_train = df_train.as_matrix()
temp_train_long=(X_train[:,3] - X_train[:,7])**2#long
temp_train_lat = (X_train[:,4] - X_train[:,8])**2#lat
distance = np.sqrt(temp_train_long + temp_train_lat)
np.concatenate(X_train, distance.T)
Review the concatenate docs
concatenate((a1, a2, ...), axis=0)
The function takes 2 arguments. The first is a list or tuple of the arrays that you want to join. The second is a number denoting the axis. It returns a new array; it does not operate in place.
X_train = df_train.as_matrix()
So this is 2d, (8899, n) with n larger than 9. According to the pandas documentation this is a numpy array, not a numpy matrix (that's important).
temp_train_long=(X_train[:,3] - X_train[:,7])**2#long
temp_train_lat = (X_train[:,4] - X_train[:,8])**2#lat
Two 1d arrays (8899,)
distance = np.sqrt(temp_train_long + temp_train_lat)
Also (8899,). distance.T does nothing; that is, there is no change in shape.
np.concatenate(X_train, distance.T)
You give it 2 arguments: one is the 2d array; the other, in the axis slot, is a 1d array.
You probably want
new_train = np.concatenate((X_train, distance[:,None]), axis=1)
Two arrays in one tuple, and axis is a scalar. The distance array has been turned into a 2d, one-column array.

Eigen use of diagonal matrix

Using Eigen, I have a Matrix3Xd (3 rows, n columns). I would like to get the squared norm of all columns.
To be clearer, let's say I have
Matrix3Xd a =
1 3 2 1
2 1 1 4
I would like to get the squared norm of each column
squaredNorms =
5 10 5 17
I wanted to take advantage of matrix computation instead of going through a for loop and doing the computation myself.
What I thought of was
squaredNorms = (A.transpose() * A).diagonal()
This works, but I am afraid of performance issues: A.transpose() * A will be an n x n matrix (potentially millions of elements), when I only need the diagonal.
Is Eigen clever enough to compute only the coefficients I need?
What would be the most efficient way to achieve squareNorm computation on each column?
The case of (A.transpose() * A).diagonal() is explicitly handled by Eigen to enforce lazy evaluation of the product expression nested in a diagonal-view. Therefore, only the n required diagonal coefficients will be computed.
That said, it's simpler to call A.colwise().squaredNorm(), as noted by Eric.
This will do what you want.
squaredNorms = A.colwise().squaredNorm();
https://eigen.tuxfamily.org/dox/group__QuickRefPage.html
Eigen provides several reduction methods such as: minCoeff(), maxCoeff(), sum(), prod(), trace()*, norm()*, squaredNorm()*, all(), and any(). All reduction operations can be done matrix-wise, column-wise or row-wise.
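For completeness, a minimal self-contained sketch of that call (the third row is zero-padded here so the 2-row example from the question fits into a Matrix3Xd):

#include <Eigen/Dense>
#include <iostream>

int main()
{
    Eigen::Matrix3Xd A(3, 4);          // 3 rows, dynamic number of columns
    A << 1, 3, 2, 1,
         2, 1, 1, 4,
         0, 0, 0, 0;

    // Row vector with the squared norm of each column; no n x n product is formed.
    Eigen::RowVectorXd squaredNorms = A.colwise().squaredNorm();
    std::cout << squaredNorms << std::endl;   // prints: 5 10 5 17
    return 0;
}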

Image reconstruction using SVD Decomposition

I have performed block SVD decomposition over an image and stored the results.
Now I need to reconstruct the image from these results. I found a few examples, all written in Matlab, which is a mystery for me.
I only need the formula from which I can reconstruct my picture, or an example written in C.
Matrix A is equal to U*S*V'. What would the formula look like, e.g. when using the first five singular values (a product of which rows and columns)? Please provide the formula with indexes in C-like style. U and V' are matrices and S is a vector (not a matrix).
Not sure if I get your question right, but if you just need to know the singular values, they are the diagonal values of the middle matrix S. S is in general a diagonal matrix, which is stored here as a vector; only the diagonal is stored, but you should imagine it as a matrix if you're thinking in matrix calculations.
Those diagonal values are your singular values; if you need the biggest ones, just take the 5 largest values of the vector S.
Quoting from Wikipedia:
The diagonal entries Σi,i of Σ are known as the singular values of M.
The m columns of U and the n columns of V are called the left-singular
vectors and right-singular vectors of M, respectively.
In the above quote, sigma is your S, and M is the original matrix.
You have asked for C code, yet my hope is that pseudocode will suffice (it's late, I'm tired). The target matrix A has m rows, n columns, and rank rho. The variable p = min(m, n).
One strategy is to first form the intermediate matrix product B = U*S. This is trivial due to the diagonal-like nature of the matrix of singular values. Assume you have rho (= 5) singular values. You must enforce rho <= p.
Replace column vector u_1 with s_1*u_1.
Replace column vector u_2 with s_2*u_2.
...
Replace column vector u_rho with s_rho*u_rho.
Replace column vector u_(rho+1) with a zero vector of length m.
Replace column vector u_(rho+2) with a zero vector of length m.
...
Replace column vector u_p with a zero vector of length m.
Next form the new image matrix A = B*V'. The matrix element in row r and column c is the dot product of the r-th row vector (length rho) of B with the c-th column vector (length rho) of V'.
Another strategy is to jump to the form where the matrix elements of A in row r and column c are
a[r][c] = sum( s[k] * u[r][k] * v[c][k], { k, 1, rho } )
The row counter r runs from 1 to m; the column counter c runs from 1 to n.
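As a sketch of that second strategy in C-like C++ (the container types and 0-based indexing are my own choices, not from the answer above; U is m x p, V is n x p, and S holds the singular values in descending order):

#include <vector>
#include <cstddef>

// Rank-rho reconstruction: A[r][c] = sum over k = 0..rho-1 of S[k] * U[r][k] * V[c][k].
std::vector<std::vector<double>> reconstruct(
    const std::vector<std::vector<double>>& U,   // left-singular vectors, m x p
    const std::vector<double>& S,                // singular values, length p
    const std::vector<std::vector<double>>& V,   // right-singular vectors, n x p
    std::size_t rho)                             // number of singular values to keep
{
    const std::size_t m = U.size();
    const std::size_t n = V.size();
    std::vector<std::vector<double>> A(m, std::vector<double>(n, 0.0));

    for (std::size_t r = 0; r < m; ++r)
        for (std::size_t c = 0; c < n; ++c)
            for (std::size_t k = 0; k < rho; ++k)
                A[r][c] += S[k] * U[r][k] * V[c][k];
    return A;
}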