Best way to convert 2D memory allocation to 1D vector - c++

What is the fastest way to convert a 2D memory allocation to a 1D vector? Right now, I am using nested loops and copying one value at a time. Would it be possible to use memcpy or std::copy to gain speed? I have the following:
Ipp32f* datamem = ippiMalloc_32f_C1(size0, size1, &(steps));
Ipp32f* datamem2 = ippiMalloc_32f_C1(size0, size1, &(steps));
std::vector<double> result(2*size0*size1);
int count = 0;
for (int i = 0; i < 1; i++){
for (int j = 0; j < size0; j++){
for(int k = 0; k < size1; k++){
Ipp32f* pointer = (Ipp32f*)((Ipp8u*)datamem + steps * k + sizeof(Ipp32f) * j);
result[count] = *pointer;
count++;
}
}
}
for (int i = 1; i < 2; i++){
for(int j = 0; j < size0; j++){
for(int k = 0; k < size1; k++){
Ipp32f* pointer = (Ipp32f*)((Ipp8u*)datamem2 + steps * k + sizeof(Ipp32f) * j);
result[count] = *pointer;
count++;
}
}
}
This code fills a vector with the values stored in datamem and datamem2, one element at a time. Is there a better way to do this?
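One possible direction, as a minimal sketch rather than a tested IPP solution: because the source holds float values and the vector holds double, memcpy cannot be used directly, but each row can be appended with a range insert, which converts element by element. The helper name appendRows is hypothetical, and the sketch assumes Ipp32f is float and that the step is the row pitch in bytes. Note that it emits the data row by row, whereas the loops above walk down each column first, so the element order differs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: appends a pitched 2D float buffer to `out`, row by row.
// `base` points at the first row, `stepBytes` is the distance between rows in
// bytes, `width` is the number of elements per row, `height` the row count.
void appendRows(const float* base, int stepBytes, int width, int height,
                std::vector<double>& out)
{
    // A row can be padded, but it can never be shorter than its elements.
    assert(stepBytes >= static_cast<int>(width * sizeof(float)));

    const unsigned char* row = reinterpret_cast<const unsigned char*>(base);
    for (int y = 0; y < height; ++y) {
        const float* p = reinterpret_cast<const float*>(row);
        // Range insert converts each float to double while copying.
        out.insert(out.end(), p, p + width);
        row += stepBytes;  // skip any padding at the end of the row
    }
}
```

With the names from the question, this would be called once for datamem and once for datamem2 after result.reserve(2 * size0 * size1), assuming size0 is the width and size1 the height.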

Related

Partial Pivoting/Gaussian elimination- swapping columns instead of rows producing wrong output

I'm trying to implement a quick program to solve a system of linear equations. The program reads the input from a file and then writes the upper-triangular system and solutions to a file. It is working with no pivoting, but when I try to implement the pivoting it produces incorrect results.
As example input, here is the following system of equations:
w+2x-3y+4z=12
2w+2x-2y+3z=10
x+y=-1
w-x+y-2z=-4
I expect the results to be w=1, x=0, y=-1 and z=2. When I don't pivot, I get this answer (with some rounding error on x). When I add in the pivoting, I get the same numbers but in the wrong order: w=2,x=1,y=-1 and z=0.
What do I need to do to get these in the correct order? Am I missing a step somewhere? I need to do column swapping instead of rows because I need to adapt this to a parallel algorithm later that requires that. Here is the code that does the elimination and back substitution:
void gaussian_elimination(double** A, double* b, double* x, int n)
{
int maxIndex;
double temp;
int i;
for (int k = 0; k < n; k++)
{
i = k;
for (int j = k+1; j < n; j++)
{
if (abs(A[k][j]) > abs(A[k][i]))
{
i = j;
}
}
if (i != k)
{
for (int j = 0; j < n; j++)
{
temp = A[j][k];
A[j][k] = A[j][i];
A[j][i] = temp;
}
}
for (int j = k + 1; j < n; j++)
{
A[k][j] = A[k][j] / A[k][k];
}
b[k] = b[k] / A[k][k];
A[k][k] = 1;
for (i = k + 1; i < n; i++)
{
for (int j = k + 1; j < n; j++)
{
A[i][j] = A[i][j] - A[i][k] * A[k][j];
}
b[i] = b[i] - A[i][k] * b[k];
A[i][k] = 0;
}
}
}
void back_substitution(double**U, double*x, double*y, int n)
{
for (int k = n - 1; k >= 0; k--)
{
x[k] = y[k];
for (int i = k - 1; i >= 0; i--)
{
y[i] = y[i] - x[k]*U[i][k];
}
}
}
I believe what you implemented is actually complete pivoting.
With complete pivoting, you must keep track of the permutation of columns, and apply the same permutation to your answer.
You can do this with an array {0, 1, ..., n-1}, in which you swap the i'th and k'th values whenever you swap columns i and k (the second loop). Then rearrange the solution using this array.
If what you were trying to do is partial pivoting, you need to look for the maximum in the respective column, and swap the rows and the values of 'b' accordingly.
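The permutation bookkeeping described above can be sketched as follows. This is an illustrative rewrite using std::vector rather than the asker's double** interface; the function name solveColumnPivot and the perm array are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Solves A x = b by Gaussian elimination with column pivoting.
// perm[c] records which original unknown currently sits in column c.
std::vector<double> solveColumnPivot(Matrix A, std::vector<double> b)
{
    const std::size_t n = A.size();
    std::vector<std::size_t> perm(n);
    std::iota(perm.begin(), perm.end(), 0);  // 0, 1, ..., n-1

    for (std::size_t k = 0; k < n; ++k) {
        // Pick the largest entry in row k, at or to the right of the diagonal.
        std::size_t p = k;
        for (std::size_t j = k + 1; j < n; ++j)
            if (std::fabs(A[k][j]) > std::fabs(A[k][p])) p = j;
        assert(A[k][p] != 0.0);  // a zero pivot means the matrix is singular

        if (p != k) {
            // Swap the two columns in every row and record the swap.
            for (std::size_t r = 0; r < n; ++r) std::swap(A[r][k], A[r][p]);
            std::swap(perm[k], perm[p]);
        }
        // Eliminate column k below the diagonal.
        for (std::size_t i = k + 1; i < n; ++i) {
            const double f = A[i][k] / A[k][k];
            for (std::size_t j = k; j < n; ++j) A[i][j] -= f * A[k][j];
            b[i] -= f * b[k];
        }
    }

    // Back substitution yields the unknowns in permuted (column) order.
    std::vector<double> y(n);
    for (std::size_t k = n; k-- > 0;) {
        double s = b[k];
        for (std::size_t j = k + 1; j < n; ++j) s -= A[k][j] * y[j];
        y[k] = s / A[k][k];
    }

    // Undo the permutation: column k holds original unknown perm[k].
    std::vector<double> x(n);
    for (std::size_t k = 0; k < n; ++k) x[perm[k]] = y[k];
    return x;
}
```

The last loop is the step missing from the question's code: without it the values are correct but land in the slots of the swapped columns.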

how to improve performance of 2d array in C++

I have a low-level function that will be called millions of times, so it should be very efficient. When I profiled it with gprof on Linux, I found that one part of the code takes 60% of the function's total computation time (the rest solves for the roots of a cubic equation). Here Point is a data structure that has x and v, which are converted to matrices for later use. The idea is to subtract each row from the first row. The code looks like this:
double x[4][3] = {0}, v[4][3] = {0};
for (int i = 0; i < 4; ++i){
for (int j = 0; j < 3; ++j){
v[i][j] = Point[i]->v[j];
x[i][j] = Point[i]->x[j];
}
}
for (int i = 1; i < 4; ++i){
for (int j = 0; j < 3; ++j){
v[i][j] = v[0][j] - v[i][j];
x[i][j] = x[0][j] - x[i][j];
}
}
Can anyone show me the problem with this code? Why does it perform so badly?
You can do it all in one pass:
double x[4][3] = {
{ Point[0]->x[0], Point[0]->x[1], Point[0]->x[2] }
};
double v[4][3] = {
{ Point[0]->v[0], Point[0]->v[1], Point[0]->v[2] }
};
for (int i = 1; i < 4; ++i){
for (int j = 0; j < 3; ++j){
x[i][j] = x[0][j] - Point[i]->x[j];
v[i][j] = v[0][j] - Point[i]->v[j];
}
}
You could even take that to the next level and put the entire thing into the initializers for x and v.
Or, if x and v in Point are each contiguous arrays:
double x[4][3], v[4][3]; // no init
// fill every row with Point[0]'s values
for (int i = 0; i < 4; ++i){
memcpy(x[i], Point[0]->x, sizeof(x[i]));
memcpy(v[i], Point[0]->v, sizeof(v[i]));
}
for (int i = 1; i < 4; ++i){
for (int j = 0; j < 3; ++j){
x[i][j] -= Point[i]->x[j];
v[i][j] -= Point[i]->v[j];
}
}

How do I copy the elements of a 2D array onto a 1D vector?

I keep trying to transfer the elements, but I keep getting repeated elements; it fails to properly copy the 2D array into a 1D vector.
// This was one of my attempts
vector<int> rando(int rowsize, int columnsize)
{
int elements = rowsize*columnsize;
vector<int> x(elements);
int matrix[100][100];
for(int i = 0; i < rowsize; i++)
{
for(int j = 0; j < columnsize; j++)
{
srand((int)time(0));
matrix[i][j]= -10 + rand() % 21;
for(int n=0; n < elements; n++)
x[n]=matrix[i][j];
}
// Ive also tried this
for(int n=0; n < elements; n++)
{
for(int i = 0; i < rowsize; i++)
{
for(int j = 0; j < columnsize; j++)
{
x[n]=matrix[i][j];
}
}
}
}
return x;
}
Why do you want to store data into the matrix first and copy it into the vector afterwards? Use the vector from the start.
std::vector<int> rando(std::size_t rowsize, std::size_t columnsize)
{
std::vector<int> v(rowsize*columnsize);
std::mt19937 mt{std::random_device{}()};
std::uniform_int_distribution<int> rand_dist(-10, 10);
for (auto & e : v) e = rand_dist(mt);
return v;
}
If you want to transfer data from a matrix into a vector you must calculate the proper index or just increment a single variable as Thomas Matthews suggests.
constexpr std::size_t n = 100, m = 100;
int matrix[n][m];
// do stuff with matrix
std::vector<int> v(n*m);
for (std::size_t i=0; i<n; ++i)
{
for (std::size_t j=0; j<m; ++j)
{
v[i*m + j] = matrix[i][j];
}
}
The general copy should loop through the 2 dimensions, and just increment the target index at each iteration (no third nested loop):
int n=0;
for(int i = 0; i < rowsize; i++)
{
for(int j = 0; j < columnsize; j++)
{
...
x[n++]=matrix[i][j]; // not in an additional for loop !!
}
} // end of initialisation of matrix
If your matrix is a 2D array (i.e. contiguous elements) you can also take the following shortcut using <algorithm>:
copy (reinterpret_cast<int*>(matrix), reinterpret_cast<int*>(matrix)+elements, x.begin());
Try this:
unsigned int destination_index = 0;
for(int i = 0; i < rowsize; i++)
{
for(int j = 0; j < columnsize; j++)
{
x[destination_index++]=matrix[i][j];
}
}
The destination index is incremented after each assignment to a new slot.
No need for a 3rd loop.
It is enough to use two loops.
For example
srand((int)time(0));
for(int i = 0; i < rowsize; i++)
{
for(int j = 0; j < columnsize; j++)
{
matrix[i][j]= -10 + rand() % 21;
x[i * columnsize + j] = matrix[i][j];
}
}
In general, if you have a two-dimensional array and want to copy nCols elements from each of nRows rows into a vector, you can use the standard algorithm std::copy declared in the header <algorithm>.
For example
auto it = x.begin();
for ( int i = 0; i < nRows; i++ )
{
it = std::copy( matrix[i], matrix[i] + nCols, it );
}

Trying to multiply two dynamically created matrices(2d vector's) together in c++

So what I am trying to do is multiply one 2d vector by another 2d vector.
I come from Java,Python and C# so I am pretty much learning C++ as I go along.
I have the code down to generate the vector and display the vector but I can't seem to finish the multiplication part.
v1 is another matrix that is already generated.
vector<vector<int> > v2 = getVector();
int n1 = v1[0].size();
int n2 = v2.size();
vector<int> a1(n2, 0);
vector<vector<int> > ans(n1, a1);
for (int i = 0; i < n1; i++) {
for (int j = 0; j < n2; j++) {
for (int k = 0; k < 10; k++) {
// same as z[i][j] = z[i][j] + x[i][k] * y[k][j];
ans[i][j] += v1[i][k] * v2[k][j];
}
}
}
displayVector(ans);
My guess for where I am going wrong is in the inner-most loop. I can't figure out what to actually put in place of that 10 I have there now.
When you multiply matrices, the number of columns of the matrix on the left side must equal the number of rows of the matrix on the right side. You need to check that that is true, and use that common number as the bound for your k variable:
// n1 must be v1.size() (rows of v1) and n2 must be v2[0].size() (columns of v2)
int nCommon = v1[0].size(); // columns of v1
assert(static_cast<int>(v2.size()) == nCommon); // must equal the rows of v2
for (int i = 0; i < n1; i++) {
for (int j = 0; j < n2; j++) {
for (int k = 0; k < nCommon ; k++) {
ans[i][j] += v1[i][k] * v2[k][j];
}
}
}
For your inner loop, you should do something like this, where n2 is v2.size(), the number of rows of v2:
ans[i][j] = 0;
for (int k = 0; k < n2; k++) {
ans[i][j] += v1[i][k] * v2[k][j];
}
I don't know where the 10 comes from.
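Putting the dimension rules together, a minimal self-contained sketch could look like this. The function name multiply and the alias IntMatrix are illustrative, and the sketch assumes both matrices are non-empty and rectangular.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using IntMatrix = std::vector<std::vector<int>>;

// Returns a * b. The result has a.size() rows and b[0].size() columns.
IntMatrix multiply(const IntMatrix& a, const IntMatrix& b)
{
    assert(!a.empty() && !b.empty());
    const std::size_t rows = a.size();       // rows of the left matrix
    const std::size_t common = a[0].size();  // columns of a
    const std::size_t cols = b[0].size();    // columns of the right matrix
    assert(b.size() == common);              // columns of a == rows of b

    IntMatrix result(rows, std::vector<int>(cols, 0));
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            for (std::size_t k = 0; k < common; ++k)
                result[i][j] += a[i][k] * b[k][j];
    return result;
}
```

In the question's terms this would be called as multiply(v1, v2), and the hard-coded 10 becomes the common dimension.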

Position 2D array bug as parameter causes memory dumps

This is my program in C++, which accepts a 2D array a[m][n]. If an element a[i][j] is zero, then set all the ith row and jth column elements to zero.
This is code sample:
#include <iostream>
#include <cstdlib>
#include <ctime>
using namespace std;
class SetZero{
public:
static void setZero(int **, int , int);
};
void SetZero::setZero(int ** a, int m, int n){
int i, j, k;
int ** b = new int *[m]; //flags to identify whether set to zero or not.
for(i = 0; i < m; i++){
b[i] = new int[n];
for(j = 0; j < n; j++)
b[i][j] = 1;
}
for(i = 0; i < m; i++)
for(j = 0; j < n; j++)
if(a[i][j] == 0 && b[i][j]){//DUMP here. If I change it to (a+i)[j], then works.
for (k = 0; k < n; k++){
a[i][k] = 0;//but there is NO dump here. Weird!
b[i][k] = 0;
}
for(k = 0; k < m; k++){
a[k][j] = 0;
b[k][j] = 0;
}
j = n;//break. next row loop.
}
for(int i = 0; i < m; i++)
delete[] b[i];
delete[] b;
}
int main(){
int a[4][5];
srand(time(NULL));
for(int i = 0; i < 4; i++){//create an 2D array
for(int j = 0; j < 5; j++){
a[i][j] = rand() % 100;
cout << a[i][j] << " ";
}
cout << endl;
}
SetZero::setZero((int **)a, 4, 5);//type cast.
cout << endl;
for(int i = 0; i < 4; i++){//print result
for(int j = 0; j < 5; j++)
cout << a[i][j] << " ";
cout << endl;
}
return 0;
}
Environment: WIN8 Visual Studio 2012.
Edit:
The program can compile but cannot execute normally. It will stop when it reaches if(a[i][j] == 0 && b[i][j]){
The error message is:
Unhandled exception at 0x012875DD in CCLC.exe: 0xC0000005: Access
violation reading location 0x0000004B.
SetZero::setZero((int **)a, 4, 5)
a is not an array of pointers; it is simply a two-dimensional array.
Notice how the access violation is reading address 0x0000004B? That's 75, a number between 0 and 99 :) Because you are treating a two-dimensional array (which is just a one-dimensional array with a neat way of accessing it) as an array of pointers to arrays, it takes one of the values in your array (75) to be the address of a sub-array, then tries to read the non-existent array at address 75 (0x0000004B).
I suggest that you 'flatten' your arrays and work with them as one dimensional arrays, which I find simpler:
void SetZero::setZero(int * a, int m, int n){
int i, j, k;
int * b = new int [m*n]; //flags to identify whether set to zero or not.
for(i = 0; i < m; i++){
for(j = 0; j < n; j++)
b[i*n+j] = 1;
}
for(i = 0; i < m; i++)
for(j = 0; j < n; j++)
if(a[i*n+j] == 0 && b[i*n+j]){//flat index i*n+j replaces a[i][j]
for (k = 0; k < n; k++){
a[i*n+k] = 0;
b[i*n+k] = 0;
}
for(k = 0; k < m; k++){
a[k*n+j] = 0;
b[k*n+j] = 0;
}
j = n;//break. next row loop.
}
delete[] b;
}
int main(){
int a[4*5];
srand(time(NULL));
for(int i = 0; i < 4; i++){//create an 2D array
for(int j = 0; j < 5; j++){
a[i*5+j] = rand() % 100;
cout << a[i*5+j] << " ";
}
cout << endl;
}
SetZero::setZero(a, 4, 5);//no cast needed: a decays to int *
cout << endl;
for(int i = 0; i < 4; i++){//print result
for(int j = 0; j < 5; j++)
cout << a[i*5+j] << " ";
cout << endl;
}
return 0;
}
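If you would rather keep int a[4][5] in main, another option is to pass the array by reference so that no cast is needed at all. The sketch below is an illustration, not the asker's algorithm: setZeroInPlace is a hypothetical name, and it uses a simpler two-pass scheme with one flag per row and one per column instead of the m*n flag array.

```cpp
#include <cstddef>

// Hypothetical alternative to SetZero::setZero. The array is taken by
// reference, so M and N are deduced and no (int **) cast is involved.
template <std::size_t M, std::size_t N>
void setZeroInPlace(int (&a)[M][N])
{
    bool zeroRow[M] = {};  // zeroRow[i]: row i originally contained a zero
    bool zeroCol[N] = {};  // zeroCol[j]: column j originally contained a zero

    // Pass 1: record every row and column that holds a zero.
    for (std::size_t i = 0; i < M; ++i)
        for (std::size_t j = 0; j < N; ++j)
            if (a[i][j] == 0) {
                zeroRow[i] = true;
                zeroCol[j] = true;
            }

    // Pass 2: clear every element in a flagged row or column.
    for (std::size_t i = 0; i < M; ++i)
        for (std::size_t j = 0; j < N; ++j)
            if (zeroRow[i] || zeroCol[j]) a[i][j] = 0;
}
```

The call is then simply setZeroInPlace(a); the dimensions come from the array's type.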
One suggestion about setZero(): there is a function called memset() which sets all bytes in a range to a specific value, given a starting pointer and the number of bytes. It could make your setZero() function cleaner:
void * memset ( void * ptr, int value, size_t num );
Fill block of memory. Sets the first num bytes of the block of memory pointed by ptr to the specified value (interpreted as an unsigned char).
Parameters
ptr: Pointer to the block of memory to fill.
value: Value to be set. The value is passed as an int, but the function fills the block of memory using the unsigned char conversion of this value.
num: Number of bytes to be set to the value, size_t is an unsigned integral type.
For example, the following code block from your program:
for (k = 0; k < n; k++){
a[i][k] = 0;//but there is NO dump here. Weird!
b[i][k] = 0;
}
can be achieved by memset in a cleaner way:
memset(a[i], 0, n * sizeof(int));
memset(b[i], 0, n * sizeof(int));
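One caveat worth keeping in mind: memset writes bytes, so it is suitable for zeroing ints but not for setting them to a value such as 1 (with a 4-byte int every element would become 0x01010101). The sketch below contrasts it with std::fill_n, which assigns whole elements; the helper names zeroRow and fillRow are made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Zeroes a row of n ints. Works because an all-zero byte pattern is int 0.
void zeroRow(int* row, std::size_t n)
{
    assert(row != nullptr);
    std::memset(row, 0, n * sizeof(int));
}

// Sets a row of n ints to any value, element by element.
void fillRow(int* row, std::size_t n, int value)
{
    assert(row != nullptr);
    std::fill_n(row, n, value);
}
```

For the flattened layout suggested earlier, row i starts at a + i*n, so zeroing it would be zeroRow(a + i*n, n).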