I am fairly new to C++. I am trying to take a short* input from MATLAB, build its transpose, and multiply the two. The multiplication is where I run into problems. I get h_raw from MATLAB and do some computation to obtain sh_data, which looks correct. Next, I create the transpose dst, which is also correct. Computing the covariance matrix is what gives me trouble. It should be equivalent to MATLAB's cov_matrix = sh_data * sh_data'; Any help is appreciated!
for (int numEl = 0; numEl < elements; numEl++) {
int shift_idx = tof[1 + (1 * pix_x * numEl)];
for (int sdex = 0; sdex < shift_idx; sdex++) {
for (int p = 0; p < nrows; p++) {
sh_data[p + (numEl*nrows)] = h_raw[(p + sdex) + (numEl*nrows)];
}
}
}
// finding the transpose
for (int n = 0; n < nrows*ncols; n++) {
int i = n / ncols;
int j = n % ncols;
dst[n] = sh_data[nrows*j + i];
}
// calculating the covariance matrix
for (int nel = 0; nel < elements; nel++) {
for (int k = 0; k < nrows; k++) {
cov_matrix[(nel*elements)+nel] += dst[(k*elements)+nel] * sh_data[k+(nel*nrows)];
}
}
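One observation on the loop above: both factors are indexed with the same nel, and the result is always written to (nel*elements)+nel, so only diagonal entries are ever touched. A full product needs two independent output indices plus a summation index. The sketch below is a minimal illustration, not the poster's actual routine: it assumes the data is stored column-major as in MATLAB (element (r, c) at r + c*rows), and the function name timesOwnTranspose and its parameters are hypothetical. Because A' can be read straight out of A, no explicit transpose buffer is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes C = A * A^T for a column-major matrix A (rows x cols),
// i.e. element (r, c) of A lives at a[r + c * rows], as in MATLAB.
// The result C is rows x rows and is also stored column-major.
// Assumption: the input is a short buffer like sh_data in the question.
std::vector<double> timesOwnTranspose(const std::vector<short>& a,
                                      std::size_t rows, std::size_t cols) {
    assert(a.size() == rows * cols);
    std::vector<double> c(rows * rows, 0.0);
    for (std::size_t i = 0; i < rows; ++i) {        // output row
        for (std::size_t j = 0; j < rows; ++j) {    // output column
            double sum = 0.0;
            for (std::size_t k = 0; k < cols; ++k) {
                // A(i, k) * A^T(k, j) is the same as A(i, k) * A(j, k).
                sum += static_cast<double>(a[i + k * rows]) * a[j + k * rows];
            }
            c[i + j * rows] = sum;
        }
    }
    return c;
}
```

If the intended result is the other product (sh_data' * sh_data, which is cols x cols), the same pattern applies with the roles of the row and column indices exchanged.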
I'm trying to implement a quick program to solve a system of linear equations. The program reads the input from a file and then writes the upper-triangular system and solutions to a file. It is working with no pivoting, but when I try to implement the pivoting it produces incorrect results.
As example input, here is the following system of equations:
w+2x-3y+4z=12
2w+2x-2y+3z=10
x+y=-1
w-x+y-2z=-4
I expect the results to be w=1, x=0, y=-1 and z=2. When I don't pivot, I get this answer (with some rounding error on x). When I add in the pivoting, I get the same numbers but in the wrong order: w=2,x=1,y=-1 and z=0.
What do I need to do to get these in the correct order? Am I missing a step somewhere? I need to do column swapping instead of rows because I need to adapt this to a parallel algorithm later that requires that. Here is the code that does the elimination and back substitution:
void gaussian_elimination(double** A, double* b, double* x, int n)
{
int maxIndex;
double temp;
int i;
for (int k = 0; k < n; k++)
{
i = k;
for (int j = k+1; j < n; j++)
{
if (abs(A[k][j]) > abs(A[k][i]))
{
i = j;
}
}
if (i != k)
{
for (int j = 0; j < n; j++)
{
temp = A[j][k];
A[j][k] = A[j][i];
A[j][i] = temp;
}
}
for (int j = k + 1; j < n; j++)
{
A[k][j] = A[k][j] / A[k][k];
}
b[k] = b[k] / A[k][k];
A[k][k] = 1;
for (i = k + 1; i < n; i++)
{
for (int j = k + 1; j < n; j++)
{
A[i][j] = A[i][j] - A[i][k] * A[k][j];
}
b[i] = b[i] - A[i][k] * b[k];
A[i][k] = 0;
}
}
}
void back_substitution(double**U, double*x, double*y, int n)
{
for (int k = n - 1; k >= 0; k--)
{
x[k] = y[k];
for (int i = k - 1; i >= 0; i--)
{
y[i] = y[i] - x[k]*U[i][k];
}
}
}
I believe what you implemented is actually closer to complete pivoting, since you swap columns.
Whenever you swap columns, you also change the order of the unknowns, so you must keep track of the permutation of columns and apply the same permutation to your answer.
You can do this with an array {0, 1, ..., n-1}, where you swap the i'th and k'th values in the second loop (the one that swaps the columns). Then rearrange the solution using this array.
If what you were trying to do is partial pivoting, you need to look for the maximum in the respective column, and swap the rows and the values of 'b' accordingly.
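The bookkeeping described above could look roughly like the following. This is a sketch under assumptions, not the poster's code: it uses std::vector instead of raw pointers, folds elimination and back substitution into one hypothetical function solveWithColumnPivoting, and introduces a perm array that records which original unknown currently sits in each column.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <utility>
#include <vector>

// Solves A x = b by Gaussian elimination with column pivoting.
// A and b are taken by value because they are modified in place.
std::vector<double> solveWithColumnPivoting(std::vector<std::vector<double>> A,
                                            std::vector<double> b) {
    assert(A.size() == b.size());
    const int n = static_cast<int>(b.size());
    std::vector<int> perm(n);
    std::iota(perm.begin(), perm.end(), 0);  // {0, 1, ..., n-1}

    for (int k = 0; k < n; ++k) {
        // Find the largest entry in row k, from column k onwards.
        int p = k;
        for (int j = k + 1; j < n; ++j)
            if (std::fabs(A[k][j]) > std::fabs(A[k][p])) p = j;
        if (p != k) {
            for (int r = 0; r < n; ++r) std::swap(A[r][k], A[r][p]);
            std::swap(perm[k], perm[p]);  // remember the column swap
        }
        assert(A[k][k] != 0.0);  // a zero pivot here means A is singular
        // Eliminate column k below the pivot.
        for (int i = k + 1; i < n; ++i) {
            const double f = A[i][k] / A[k][k];
            for (int j = k; j < n; ++j) A[i][j] -= f * A[k][j];
            b[i] -= f * b[k];
        }
    }

    // Back substitution yields the unknowns in permuted order.
    std::vector<double> y(n);
    for (int k = n - 1; k >= 0; --k) {
        double s = b[k];
        for (int j = k + 1; j < n; ++j) s -= A[k][j] * y[j];
        y[k] = s / A[k][k];
    }

    // Undo the permutation: column k holds original unknown perm[k].
    std::vector<double> x(n);
    for (int k = 0; k < n; ++k) x[perm[k]] = y[k];
    return x;
}
```

The last loop is the step that was missing in the question: without it, the values come out correct but in the column order left behind by the swaps.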
I have to use a nested for-loop to compute the entries of an Eigen::MatrixXd matrix output column by column. Here input[0], input[1] and input[2] are defined as Eigen::ArrayXXd so that the element-wise operations can be used. This part seems to be the bottleneck of my code. Can anyone help me accelerate this loop? Thanks!
for (int i = 0; i < r; i++) {
for (int j = 0; j < r; j++) {
for (int k = 0; k < r; k++) {
output.col(i * (r * r) + j * r + k) =
input[0].col(i) * input[1].col(j) * input[2].col(k);
}
}
}
When thinking about optimizing code of a for loop, it helps to think, "Are there redundant calculations that I can eliminate?"
Notice how in the innermost loop, only k is changing. You should move all calculations that don't involve k out of that loop:
for (int i = 0; i < r; i++) {
int temp1 = i * (r * r);
for (int j = 0; j < r; j++) {
int temp2 = j * r;
for (int k = 0; k < r; k++) {
output.col(temp1 + temp2 + k) =
input[0].col(i) * input[1].col(j) * input[2].col(k);
}
}
}
Notice how i * (r * r) is being calculated over and over, but the answer is always the same! You only need to recalculate this when i increments. The same goes for j * r.
Hopefully this helps!
To reduce the number of flops, you should cache the result of input[0]*input[1]:
ArrayXd tmp(input[0].rows());
for (int i = 0; i < r; i++) {
for (int j = 0; j < r; j++) {
tmp = input[0].col(i) * input[1].col(j);
for (int k = 0; k < r; k++) {
output.col(i * (r * r) + j * r + k) = tmp * input[2].col(k);
}
}
}
Then, to fully use your CPU, enable AVX/FMA with -march=native and of course compiler optimizations (-O3).
Then, to get an idea of how much more you could gain, accurately measure the time taken by this part, count the number of multiplications (r^2*(n+r*n)), and compute the number of floating point operations per second you achieve. Then compare it to the peak capacity of your CPU. If you are already close to it, the only remaining option is to multithread one of the for loops using, e.g., OpenMP. Which loop to choose depends on the size of your inputs, but you can try the outer one, making sure each thread has its own tmp array.
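To make the caching idea and the per-thread tmp array concrete without depending on Eigen, here is a plain C++ sketch. Everything in it is an assumption made for illustration: matrices are flat column-major std::vector<double> buffers, n is the number of rows, and the name tripleColumnProducts is hypothetical. It performs the r^2*(n+r*n) multiplications counted above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Column-major matrix with n rows in a flat buffer: (row, col) -> row + col * n.
using ColMajor = std::vector<double>;

// Fills out (n x r^3) so that column i*r*r + j*r + k is the element-wise
// product of column i of a, column j of b and column k of c (each n x r).
ColMajor tripleColumnProducts(const ColMajor& a, const ColMajor& b,
                              const ColMajor& c, std::size_t n, std::size_t r) {
    assert(a.size() == n * r && b.size() == n * r && c.size() == n * r);
    ColMajor out(n * r * r * r);
    // To multithread, an OpenMP "parallel for" could be placed on this loop.
    // tmp is declared inside the loop body so each thread gets its own copy.
    for (std::size_t i = 0; i < r; ++i) {
        std::vector<double> tmp(n);
        for (std::size_t j = 0; j < r; ++j) {
            // Cache a_i * b_j once: n multiplications per (i, j) pair.
            for (std::size_t row = 0; row < n; ++row)
                tmp[row] = a[row + i * n] * b[row + j * n];
            for (std::size_t k = 0; k < r; ++k) {
                double* dst = &out[(i * r * r + j * r + k) * n];
                for (std::size_t row = 0; row < n; ++row)
                    dst[row] = tmp[row] * c[row + k * n];
            }
        }
    }
    return out;
}
```

Different values of i write to disjoint column ranges of out, which is why the outer loop is a natural candidate for parallelisation.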
Suppose we access each pixel through a pointer, using the step and data fields of a Mat image, as in the example below:
int step = srcimg.step;
for (int j = 0; j < srcimg.rows; j++) {
for (int i = 0; i < srcimg.cols; i++) {
//this is pointer to the pixel value.
uchar* ptr = srcimg.data + step* j + i;
}
}
Question:
How can we perform a 3x3 weighted average using the image step and a pointer?
Thanks
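You should avoid treating the data field in OpenCV as one flat array, because the memory is not always continuous. You can check this using the isContinuous() method. The ptr() method returns a pointer to the start of a given row instead.
Now you can do it like this (the image type is CV_8UC1):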
for (int i = 1; i < srcimg.rows-1; i++)
{
for (int j = 1; j < srcimg.cols-1; j++)
{
int x=0;
for (int k=-1;k<=1;k++)
{
uchar* ptr=srcimg.ptr(k+i)+j-1;
for (int l=-1;l<=1;l++,ptr++)
x +=*ptr;
}
}
}
The image borders are not processed. If you simply want to blur an image, use the blur method.
You can use this post too
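The loop above sums the 3x3 neighbourhood with equal weights. A weighted version only needs a kernel lookup inside the innermost loop. The sketch below deliberately avoids OpenCV so it stays self-contained: it works on a raw 8-bit single-channel buffer plus a row step, which is an assumption standing in for srcimg.data and srcimg.step. The function name weightedAverage3x3 and the integer kernel are hypothetical, and the weights are assumed to be non-negative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// 3x3 weighted average on an 8-bit single-channel image stored row by row.
// step is the number of bytes between the starts of consecutive rows
// (it may be larger than cols when rows are padded).
// Border pixels are not processed and keep whatever dst already contains.
void weightedAverage3x3(const unsigned char* src, unsigned char* dst,
                        int rows, int cols, std::size_t step,
                        const int weights[3][3]) {
    int weightSum = 0;
    for (int a = 0; a < 3; ++a)
        for (int b = 0; b < 3; ++b) weightSum += weights[a][b];
    assert(weightSum > 0);  // weights are assumed non-negative, not all zero

    for (int i = 1; i < rows - 1; ++i) {
        for (int j = 1; j < cols - 1; ++j) {
            int acc = 0;
            for (int k = -1; k <= 1; ++k) {
                // Pointer to the left neighbour of (i + k, j).
                const unsigned char* p = src + step * (i + k) + (j - 1);
                for (int l = 0; l < 3; ++l) acc += weights[k + 1][l] * p[l];
            }
            // Rounded integer division keeps the result in 0..255.
            dst[step * i + j] =
                static_cast<unsigned char>((acc + weightSum / 2) / weightSum);
        }
    }
}
```

With a kernel such as {1,2,1},{2,4,2},{1,2,1} this approximates a small Gaussian blur; a uniform kernel of ones reproduces the plain 3x3 mean.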
I am doing something like this:
int sr = 3;
for (int j = 0; j < srcimg.rows; j++) {
for (int i = 0; i < srcimg.cols; i++) {
uchar* cp_imptr = im.data;
uchar* tptr = im.data + imstep *(sr + j) + (sr + i);
int val_tptr = cp_imptr [imstep *(sr + j) + (sr + i)]; //pointer of image data amd step at 3x3
int val_cp_imptr = cp_imptr[imstep *j + i];
double s = 0;
uchar* t = tptr; // pointer into the template window, declared once so it survives both loops
for (int n = templeteWindowSize; n--;)
{
for (int m = templeteWindowSize; m--;)
{
// sum
s += *t;
t++;
}
t += cstep; // jump to the start of the next window row
}
}
cout << endl;
}
I want to raise a matrix to a power, but it doesn't work the way it should.
m is the matrix I want to raise to a power.
long double pro[100][100]; // product after each step
long double res[100][100]; // the Matrix with the exponent n
for (int n = 1; n < nVal; n++) // exponent
{
for (int i = 0; i < mVal; i++) // row
{
for (int j = 0; j < mVal; j++) // col
{
res[i][j] = 0;
for (int k = 0; k < mVal; k++) // inner
{
res[i][j] += pro[i][k] * m[k][j]; // multiply the product with the default matrix
}
}
}
}
// array Output - working
for (int i = 0; i<mVal; i++)
{
for (int j = 0; j<mVal; j++)
cout << res[i][j] << "\t";
cout << endl;
}
In the output I see some crazy numbers and I don't know why :(
Can anyone help me?
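You should:
- initialise the pro matrix to the identity before the loop on n;
- copy the res matrix into the pro matrix at the end of each iteration of the loop on n.
In pseudocode:
pro = Identity matrix
for (int n = 0; n < nVal; n++) {
res = pro * m // using the nested loops from the question
pro = res
}
The result, m to the power nVal, is in pro. Starting from the identity takes nVal multiplications, which is why the loop starts at 0 here; the question's bounds (n = 1 to nVal - 1) fit when pro starts as a copy of m instead.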
Note that there are much faster way to compute powers: http://en.wikipedia.org/wiki/Exponentiation_by_squaring
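For reference, exponentiation by squaring applied to matrices could be sketched as follows. This is an illustration under assumptions rather than a drop-in replacement for the code in the question: it uses std::vector instead of fixed 100x100 arrays, and the names SquareMatrix, multiplySquare and matrixPower are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using SquareMatrix = std::vector<std::vector<long double>>;

// Plain triple-loop product of two square matrices of the same size.
SquareMatrix multiplySquare(const SquareMatrix& a, const SquareMatrix& b) {
    const std::size_t n = a.size();
    assert(b.size() == n);
    SquareMatrix c(n, std::vector<long double>(n, 0.0L));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < n; ++j)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

// Returns m^exponent using O(log exponent) matrix products.
SquareMatrix matrixPower(SquareMatrix m, unsigned exponent) {
    const std::size_t n = m.size();
    // Start from the identity, so exponent 0 returns the identity.
    SquareMatrix result(n, std::vector<long double>(n, 0.0L));
    for (std::size_t i = 0; i < n; ++i) result[i][i] = 1.0L;

    while (exponent > 0) {
        if (exponent & 1u) result = multiplySquare(result, m);  // use this bit
        m = multiplySquare(m, m);                               // square the base
        exponent >>= 1;
    }
    return result;
}
```

As a quick check, raising {{1, 1}, {1, 0}} to the 10th power gives {{89, 55}, {55, 34}}, the familiar Fibonacci numbers.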
As Willll said you shouldn't forget to initialize.
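Another suggestion would be the pow() function from the math library, which is simpler and easier to visualize. Note, however, that pow() only works on scalars: it cannot raise a whole matrix to a power, so it can replace the exponent loop only in the 1x1 case.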
So what I am trying to do is multiply one 2d vector by another 2d vector.
I come from Java, Python and C#, so I am pretty much learning C++ as I go along.
I have the code down to generate the vector and display the vector but I can't seem to finish the multiplication part.
v1 is another matrix that is already generated.
vector<vector<int> > v2 = getVector();
int n1 = v1[0].size();
int n2 = v2.size();
vector<int> a1(n2, 0);
vector<vector<int> > ans(n1, a1);
for (int i = 0; i < n1; i++) {
for (int j = 0; j < n2; j++) {
for (int k = 0; k < 10; k++) {
// same as z[i][j] = z[i][j] + x[i][k] * y[k][j];
ans[i][j] += v1[i][k] * v2[k][j];
}
}
}
displayVector(ans);
My guess for where I am going wrong is in the inner-most loop. I can't figure out what to actually put in place of that 10 I have there now.
When you multiply matrices, the number of columns of the matrix on the left side must equal the number of rows of the matrix on the right side. You need to check that that is true, and use that common number for your size of the k variable:
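// v1 is (n1 x nCommon) and v2 is (nCommon x n2), so ans must be n1 x n2.
// These sizes replace the n1 and n2 computed in the question.
int n1 = v1.size();          // rows of the left matrix
int nCommon = v1[0].size();  // columns of the left matrix
int n2 = v2[0].size();       // columns of the right matrix
assert((int)v2.size() == nCommon);  // rows of the right matrix must match
vector<vector<int> > ans(n1, vector<int>(n2, 0));
for (int i = 0; i < n1; i++) {
    for (int j = 0; j < n2; j++) {
        for (int k = 0; k < nCommon; k++) {
            ans[i][j] += v1[i][k] * v2[k][j];
        }
    }
}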
For your inner loop, you should do something like this:
ans[i][j] = 0;
for (int k = 0; k < n2; k++) {
ans[i][j] += v1[i][k] * v2[k][j];
}
I don't know where the 10 comes from.
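Putting the dimension rule into a reusable function might look like the sketch below. The names IntMatrix and multiplyMatrices are hypothetical, and the code assumes non-empty, rectangular nested vectors indexed as m[row][col].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using IntMatrix = std::vector<std::vector<int>>;

// Returns left * right for nested vectors indexed as m[row][col].
IntMatrix multiplyMatrices(const IntMatrix& left, const IntMatrix& right) {
    assert(!left.empty() && !right.empty());
    const std::size_t rows = left.size();        // rows of the result
    const std::size_t common = left[0].size();   // columns of left
    const std::size_t cols = right[0].size();    // columns of the result
    assert(right.size() == common);              // rows of right must match

    IntMatrix result(rows, std::vector<int>(cols, 0));
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            for (std::size_t k = 0; k < common; ++k)
                result[i][j] += left[i][k] * right[k][j];
    return result;
}
```

For example, multiplying the 2x3 matrix {{1, 2, 3}, {4, 5, 6}} by the 3x2 matrix {{7, 8}, {9, 10}, {11, 12}} gives the 2x2 matrix {{58, 64}, {139, 154}}; the inner loop runs 3 times, the shared dimension, rather than a hard-coded 10.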