Assigning value to pointer with multithreading generates wrong result


void assign(short *inPtr, int ext[4], int sz[2])
{
for (int j = ext[2]; j < ext[3]; j++)
for (int i = ext[0]; i < ext[1]; i++)
{
inPtr[i + j * sz[1]] = 100;
}
}
int main(int, char **)
{
int size[2] = {16, 16};
short ptr[256];
std::vector<std::thread> pool;
int block = 2;
int row = size[0] / block;
int col = size[1] / block;
for (size_t j = 0; j < row; j++)
for (size_t i = 0; i < col; i++)
{
int ext[4] = {i * block, (i + 1) * block, j * block, (j + 1) * block};
pool.push_back(std::thread(assign, ptr, ext, size));
}
for (size_t i = 0; i < row * col; i++)
{
pool[i].join();
}
for (size_t i = 0; i < 256; i++)
{
std::cout << (double)ptr[i] << ", ";
}
std::cout << std::endl;
}
I'm trying to assign values through the pointer ptr using multiple threads.
The code above is supposed to assign 100 to every element of ptr; however, there are many 0s in the output. Why does this happen, and how can I fix it?
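A likely cause, sketched here under assumptions: ext is a local array that is recreated on every loop iteration, and passing it to std::thread hands the thread only a pointer to it, because arrays decay to pointers. By the time a thread actually runs, the loop may already have overwritten or destroyed that array, so several threads can end up filling the same block while other blocks are never touched. ptr is also never initialized, so untouched elements print whatever happened to be in memory. The sketch below gives each thread its own copy of the extents by passing a std::array by value. The names Extent, fillBlock and fillAll are illustrative and not part of the original code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Extents of one block: {xBegin, xEnd, yBegin, yEnd}.
using Extent = std::array<int, 4>;

// Writes 100 into one block of a row-major image with `width` columns.
// `ext` is taken by value, so every thread owns its own copy.
void fillBlock(short *data, Extent ext, int width)
{
    for (int j = ext[2]; j < ext[3]; j++)
        for (int i = ext[0]; i < ext[1]; i++)
            data[i + j * width] = 100;
}

// Splits a width x height image into block x block tiles and fills each
// tile on its own thread. Assumes both sizes are multiples of `block`.
std::vector<short> fillAll(int width, int height, int block)
{
    assert(block > 0 && width % block == 0 && height % block == 0);
    std::vector<short> data(static_cast<std::size_t>(width) * height, 0);
    std::vector<std::thread> pool;
    for (int j = 0; j < height / block; j++)
        for (int i = 0; i < width / block; i++)
        {
            Extent ext = {i * block, (i + 1) * block, j * block, (j + 1) * block};
            // std::thread copies `ext` into storage owned by the new thread.
            pool.emplace_back(fillBlock, data.data(), ext, width);
        }
    for (std::thread &t : pool)
        t.join();
    return data;
}
```

Calling fillAll(16, 16, 2) mirrors the sizes in the question. One thread per 2 x 2 block is far more threads than a machine has cores; that is kept here only to stay close to the original structure.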

Related

My code is really slow and I need to optimize it

#include <iostream>
#include <chrono>
using namespace std;
int main()
{
const unsigned int m = 200;
const unsigned int n = 200;
srand(static_cast<unsigned int>(static_cast<std::chrono::duration<double>
>(std::chrono::high_resolution_clock::now().time_since_epoch()).count()));
double** matrixa;
double** matrixb;
double** matrixc;
matrixa = new double* [m];
matrixb = new double* [m];
matrixc = new double* [m];
unsigned int max = static_cast<unsigned int>(1u << 31);
for (unsigned int i = 0; i < m; i++)
matrixa[i] = new double[n];
for (unsigned int i = 0; i < m; i++)
matrixb[i] = new double[n];
for (unsigned int i = 0; i < m; i++)
matrixc[i] = new double[n];
for (unsigned int i = 0; i < m; i++)
for (unsigned int j = 0; j < n; j++)
matrixa[i][j] = static_cast<double>(static_cast<double>(rand()) / max * 10);
for (unsigned int i = 0; i < m; i++)
for (unsigned int j = 0; j < n; j++)
matrixb[i][j] = static_cast<double>(static_cast<double>(rand()) / max * 10);
auto start = std::chrono::high_resolution_clock::now();
for (unsigned int i = 0; i < m; i++)
for (unsigned int j = 0; j < n; j++)
for (unsigned int k = 0; k < m; k++)
for (unsigned int l = 0; l < m; l++)
matrixc[i][j] += matrixa[k][l] * matrixb[l][k];
auto stop = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> time_diff = stop - start;
cout << "Czas wykonania programu " << time_diff.count() << " sekund." <<
endl;
for (unsigned int i = 0; i < m; i++)
delete[] matrixa[i];
for (unsigned int i = 0; i < m; i++)
delete[] matrixb[i];
for (unsigned int i = 0; i < m; i++)
delete[] matrixc[i];
delete[] matrixa;
delete[] matrixb;
delete[] matrixc;
return 0;
}
I have this code and would like to optimize it, but I have no idea how to go about it. Can anyone help? I got to the point where the program takes 105 seconds for 400 arrays, which is still too slow. I found the OpenMP library and the thread class, but I don't know how to use them in my program.

First, your matrix multiplication algorithm is more complex than the standard one (or it is simply wrong): it uses four nested loops where three suffice. For reference, the typical algorithm from Wikipedia is:
Input: matrices A and B
Let C be a new matrix of the appropriate size
For i from 1 to n:
    For j from 1 to p:
        Let sum = 0
        For k from 1 to m:
            Set sum ← sum + A_ik × B_kj
        Set C_ij ← sum
Return C
Your code also has a critical bug: the result matrix is never initialized, so the += accumulates into indeterminate values.
The fixed code might look like this:
#include <algorithm>
#include <chrono>
#include <cstdlib>
#include <iostream>
using namespace std;
int main() {
const unsigned int m = 200;
const unsigned int n = 201;
const unsigned int p = 202;
srand(static_cast<unsigned int>(
static_cast<std::chrono::duration<double> >(
std::chrono::high_resolution_clock::now().time_since_epoch())
.count()));
double** matrixa;
double** matrixb;
double** matrixc;
matrixa = new double*[m];
matrixb = new double*[n];
matrixc = new double*[m];
unsigned int max = static_cast<unsigned int>(1u << 31);
for (unsigned int i = 0; i < m; i++) matrixa[i] = new double[n];
for (unsigned int i = 0; i < n; i++) matrixb[i] = new double[p];
for (unsigned int i = 0; i < m; i++) {
matrixc[i] = new double[p];
std::fill(matrixc[i], matrixc[i] + p, 0.0);
}
for (unsigned int i = 0; i < m; i++)
for (unsigned int j = 0; j < n; j++)
matrixa[i][j] =
static_cast<double>(static_cast<double>(rand()) / max * 10);
for (unsigned int i = 0; i < n; i++)
for (unsigned int j = 0; j < p; j++)
matrixb[i][j] =
static_cast<double>(static_cast<double>(rand()) / max * 10);
auto start = std::chrono::high_resolution_clock::now();
for (unsigned int i = 0; i < m; i++)
for (unsigned int j = 0; j < p; j++)
for (unsigned int k = 0; k < n; k++)
matrixc[i][j] += matrixa[i][k] * matrixb[k][j];
auto stop = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> time_diff = stop - start;
cout << "Czas wykonania programu " << time_diff.count() << " sekund." << endl;
for (unsigned int i = 0; i < m; i++) delete[] matrixa[i];
for (unsigned int i = 0; i < n; i++) delete[] matrixb[i];
for (unsigned int i = 0; i < m; i++) delete[] matrixc[i];
delete[] matrixa;
delete[] matrixb;
delete[] matrixc;
return 0;
}
This is now much faster than the version in the question.
It can be made faster still with a slight modification, swapping the order of the j and k loops:
#include <algorithm>
#include <chrono>
#include <cstdlib>
#include <iostream>
using namespace std;
int main() {
const unsigned int m = 200;
const unsigned int n = 201;
const unsigned int p = 202;
srand(static_cast<unsigned int>(
static_cast<std::chrono::duration<double> >(
std::chrono::high_resolution_clock::now().time_since_epoch())
.count()));
double** matrixa;
double** matrixb;
double** matrixc;
matrixa = new double*[m];
matrixb = new double*[n];
matrixc = new double*[m];
unsigned int max = static_cast<unsigned int>(1u << 31);
for (unsigned int i = 0; i < m; i++) matrixa[i] = new double[n];
for (unsigned int i = 0; i < n; i++) matrixb[i] = new double[p];
for (unsigned int i = 0; i < m; i++) {
matrixc[i] = new double[p];
std::fill(matrixc[i], matrixc[i] + p, 0.0);
}
for (unsigned int i = 0; i < m; i++)
for (unsigned int j = 0; j < n; j++)
matrixa[i][j] =
static_cast<double>(static_cast<double>(rand()) / max * 10);
for (unsigned int i = 0; i < n; i++)
for (unsigned int j = 0; j < p; j++)
matrixb[i][j] =
static_cast<double>(static_cast<double>(rand()) / max * 10);
auto start = std::chrono::high_resolution_clock::now();
for (unsigned int i = 0; i < m; i++)
for (unsigned int k = 0; k < n; k++)
for (unsigned int j = 0; j < p; j++)
matrixc[i][j] += matrixa[i][k] * matrixb[k][j];
auto stop = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> time_diff = stop - start;
cout << "Czas wykonania programu " << time_diff.count() << " sekund." << endl;
for (unsigned int i = 0; i < m; i++) delete[] matrixa[i];
for (unsigned int i = 0; i < n; i++) delete[] matrixb[i];
for (unsigned int i = 0; i < m; i++) delete[] matrixc[i];
delete[] matrixa;
delete[] matrixb;
delete[] matrixc;
return 0;
}
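This version is more cache-friendly: with j as the innermost loop, both matrixc[i][j] and matrixb[k][j] are accessed along contiguous rows instead of jumping to a different row on every iteration.
The code can be sped up further by running it in parallel. With OpenMP this takes a one-line change to the previous code (the #pragma before the outer loop).
You also need to add the build option -fopenmp to compile it.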
#include <algorithm>
#include <chrono>
#include <cstdlib>
#include <iostream>
using namespace std;
int main() {
const unsigned int m = 200;
const unsigned int n = 201;
const unsigned int p = 202;
srand(static_cast<unsigned int>(
static_cast<std::chrono::duration<double> >(
std::chrono::high_resolution_clock::now().time_since_epoch())
.count()));
double** matrixa;
double** matrixb;
double** matrixc;
matrixa = new double*[m];
matrixb = new double*[n];
matrixc = new double*[m];
unsigned int max = static_cast<unsigned int>(1u << 31);
for (unsigned int i = 0; i < m; i++) matrixa[i] = new double[n];
for (unsigned int i = 0; i < n; i++) matrixb[i] = new double[p];
for (unsigned int i = 0; i < m; i++) {
matrixc[i] = new double[p];
std::fill(matrixc[i], matrixc[i] + p, 0.0);
}
for (unsigned int i = 0; i < m; i++)
for (unsigned int j = 0; j < n; j++)
matrixa[i][j] =
static_cast<double>(static_cast<double>(rand()) / max * 10);
for (unsigned int i = 0; i < n; i++)
for (unsigned int j = 0; j < p; j++)
matrixb[i][j] =
static_cast<double>(static_cast<double>(rand()) / max * 10);
auto start = std::chrono::high_resolution_clock::now();
#pragma omp parallel for
for (unsigned int i = 0; i < m; i++)
for (unsigned int k = 0; k < n; k++)
for (unsigned int j = 0; j < p; j++)
matrixc[i][j] += matrixa[i][k] * matrixb[k][j];
auto stop = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> time_diff = stop - start;
cout << "Czas wykonania programu " << time_diff.count() << " sekund." << endl;
for (unsigned int i = 0; i < m; i++) delete[] matrixa[i];
for (unsigned int i = 0; i < n; i++) delete[] matrixb[i];
for (unsigned int i = 0; i < m; i++) delete[] matrixc[i];
delete[] matrixa;
delete[] matrixb;
delete[] matrixc;
return 0;
}
It would be better to use std::vector rather than dynamically allocated arrays; that change is left as an exercise for you.
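As a starting point for the std::vector version, here is a minimal sketch. It stores each matrix in a single contiguous std::vector<double> in row-major order, which is an assumption of this sketch rather than something prescribed above. The Matrix struct and the multiply function are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major matrix backed by one contiguous, zero-initialized vector.
struct Matrix
{
    std::size_t rows;
    std::size_t cols;
    std::vector<double> data;

    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}

    double &at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    double at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Computes C = A * B using the cache-friendly i-k-j loop order.
Matrix multiply(const Matrix &a, const Matrix &b)
{
    assert(a.cols == b.rows); // inner dimensions must agree
    Matrix c(a.rows, b.cols);
    for (std::size_t i = 0; i < a.rows; i++)
        for (std::size_t k = 0; k < a.cols; k++)
        {
            const double aik = a.at(i, k);
            for (std::size_t j = 0; j < b.cols; j++)
                c.at(i, j) += aik * b.at(k, j);
        }
    return c;
}
```

With this layout there is no manual delete[] to forget, the result starts at zero by construction, and the dimensions travel with the matrix instead of being tracked in separate variables.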

How can I find the size of a matrix passed to a function?

In this C++ code, sizeof(ar) doesn't help me find the cols and rows variables; it always gives me the same wrong values for both.
How can I find the size of the matrix without passing the sizeX and sizeY variables to the IsMagicSquare(arr) function? Can you help me understand this problem and find a way to solve it?
int main()
{
int sizeX, sizeY;
cout << "Size of Matrix:";
cin >> sizeX >> sizeY;
int** arr = new int* [sizeY];
for (int i = 0; i < sizeY; ++i)
{
arr[i] = new int[sizeX];
}
cout << "Elements of the Matrix:";
for (int i = 0; i < sizeX; i++)
for (int j = 0; j < sizeY; j++)
cin >> arr[i][j];
if (IsMagicSquare(arr))
{
for (int i = 0; i < sizeX; i++)
{
cout << "\n";
for (int j = 0; j < sizeY; j++)
cout << arr[i][j];
}
}
else
{
for (int i = 0; i < sizeY; i++)
{
delete[] arr[i];
}
delete[] arr;
}
return 0;
}
bool IsMagicSquare(int** ar)
{
int rows = sizeof (ar)/ sizeof (ar[0]);
int cols = sizeof (ar[0]) / sizeof(int);
cout << rows << cols;
if (rows == cols)
{
int* ver = new int[rows]();
int* hor = new int[cols]();
int cross0=0;
int cross1=0;
for (int i = 0; i < rows; i++)
{
for (int j = 0; j < cols; j++)
{
hor[i] += ar[i][j];
ver[j] += ar[i][j];
if (i == j)
cross0 += ar[i][j];
else if ((i+j) == cols)
cross1 += ar[i][j];
}
}
if (cross0 != cross1)
return false;
else
{
for (int i = 0; i < rows; i++)
if ((cross0 != ver[i]) || (cross1 != hor[i]))
return false;
}
}
else
return false;
return true;
}

Just sum the allocations you are making for a reasonable estimate:
The outer array: sizeY * sizeof(int*).
The inner arrays: sizeY * sizeX * sizeof(int)
Of course, the size calculations in IsMagicSquare() won’t work: sizeof operates on types rather than on the memory actually allocated, so sizeof(ar) is just the size of a pointer. The allocation sizes are lost and can’t be recovered from the pointers. You are best off using, e.g., a std::vector<std::vector<int>>, which handles memory allocation automatically and tracks its own sizes.
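A minimal sketch of that suggestion, assuming the matrix is held in a std::vector<std::vector<int>>. The alias Grid and the function name isMagicSquare are illustrative. The sizes are read from the vectors themselves, so nothing extra has to be passed in. Note that the anti-diagonal consists of the cells where i + j == n - 1.

```cpp
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<int>>;

// Returns true when every row, every column and both diagonals of a
// square grid have the same sum. Sizes are taken from the vectors.
bool isMagicSquare(const Grid &m)
{
    const std::size_t n = m.size();
    if (n == 0)
        return false;
    for (const std::vector<int> &row : m)
        if (row.size() != n)
            return false; // not square

    std::vector<long long> rowSum(n, 0), colSum(n, 0);
    long long diag = 0, antiDiag = 0;
    for (std::size_t i = 0; i < n; i++)
    {
        for (std::size_t j = 0; j < n; j++)
        {
            rowSum[i] += m[i][j];
            colSum[j] += m[i][j];
        }
        diag += m[i][i];
        antiDiag += m[i][n - 1 - i];
    }
    if (diag != antiDiag)
        return false;
    for (std::size_t i = 0; i < n; i++)
        if (rowSum[i] != diag || colSum[i] != diag)
            return false;
    return true;
}
```

In main, the grid could be created as Grid arr(sizeY, std::vector<int>(sizeX)) and filled with cin as before; no delete[] calls are needed afterwards.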

Pass an unsigned short pointer to a function as a reference for cropping an image

This program is crashing! As a newbie, can anyone tell me where the memory is getting corrupted and how to fix it?
I am trying to extract an ROI from part of the data and assign it back to the original data.
The modified code is below; it has no issue, and 'newdata' holds the cropped data from the original variable 'data'.
#include "stdafx.h"
#include <cstring>
#include <iostream>
using namespace std;
void ExtractROI(unsigned short *image, int nRows, int nCols, unsigned short *imageROI)
{
int indexROI = 0;
for (int i = 0; i < nRows; i++)
{
for (int j = 0; j < nCols; j++)
{
imageROI[indexROI] = image[i * nCols + j];
indexROI++;
}
}
}
int main()
{
const int nRows = 12;
const int nCols = 12;
unsigned short *data = new unsigned short[nRows * nCols];
for (int i = 0; i < nRows; i++)
{
for (int j = 0; j < nCols; j++)
{
data[i * nCols + j] = i * nCols + j;
}
}
unsigned short *newdata = new unsigned short[2 * 2];
memset(newdata, 0, sizeof(unsigned short) * 2 * 2);
ExtractROI(data, 2, 2, newdata);
for (int i = 0; i < 2; i++)
{
for (int j = 0; j < 2; j++)
{
cout << "(" << i << "," << j << ")" << " = " << newdata[i * 2 + j] << endl;
}
}
delete[] data;
delete[] newdata;
char x;
cin >> x;
return 0;
}
/* Old code below*/
#include "stdafx.h"
#include <iostream>
using namespace std;
void ExtractROI(unsigned short *image, int nRows, int nCols, unsigned short *imageROI)
{
int indexROI = 0;
for (int i = 0; i < nRows; i++)
{
for (int j = 0; j < nCols; j++)
{
imageROI[indexROI] = image[i * nCols + j];
indexROI++;
}
}
}
int main()
{
const int nRows = 12;
const int nCols = 12;
unsigned short *data = new unsigned short[nRows * nCols];
for (int i = 0; i < nRows; i++)
{
for (int j = 0; j < nCols; j++)
{
data[i * nCols + j] = i * nCols + j;
}
}
unsigned short *newdata = new unsigned short[2 * 2];
memset(newdata, 0, sizeof(unsigned short) * 2 * 2);
ExtractROI(data, nRows, nCols, newdata);
data = newdata;
for (int i = 0; i < nRows; i++)
{
for (int j = 0; j < nCols; j++)
{
cout << "(" << i << "," << j << ")" << " = " << data[i * nCols + j] << endl;
}
}
/*delete[] data;
delete[] newdata;*/
char x;
cin >> x;
return 0;
}

Inside the loops in ExtractROI the variable indexROI will be increased a total of nRows * nCols times. Since you pass 12 for each, indexROI will at the end be 12 * 12 (or 144). This is quite a lot more than the 2 * 2 (or 4) elements allocated for imageROI.
Going out of bounds of allocated memory leads to undefined behavior.
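After the call to ExtractROI, the printing loops have the same problem: they read 12 * 12 elements through data, which now points to the 4-element newdata buffer. There is also a memory leak, because you lose the pointer to what data originally pointed to.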
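A minimal sketch of a crop that stays in bounds, assuming the image is stored row-major and the ROI is described by a top-left corner plus a size. The function name cropROI and its parameters are illustrative and do not come from the question's code. The two points it tries to show are that the source is indexed with the full image width as its stride, and that the destination holds exactly roiRows * roiCols elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copies a roiRows x roiCols window starting at (top, left) out of a
// row-major image that is imageCols wide. Returns the cropped pixels.
std::vector<unsigned short> cropROI(const std::vector<unsigned short> &image,
                                    int imageRows, int imageCols,
                                    int top, int left,
                                    int roiRows, int roiCols)
{
    // The window must lie entirely inside the image.
    assert(top >= 0 && left >= 0 && roiRows >= 0 && roiCols >= 0);
    assert(top + roiRows <= imageRows && left + roiCols <= imageCols);
    assert(image.size() == static_cast<std::size_t>(imageRows) * imageCols);

    std::vector<unsigned short> roi;
    roi.reserve(static_cast<std::size_t>(roiRows) * roiCols);
    for (int i = 0; i < roiRows; i++)
        for (int j = 0; j < roiCols; j++)
            // Source stride is the full image width, not the ROI width.
            roi.push_back(image[static_cast<std::size_t>(top + i) * imageCols + (left + j)]);
    return roi;
}
```

Assigning the result back to the original variable replaces the old vector and releases its memory automatically, so nothing leaks. Any loop that runs afterwards has to use the ROI dimensions rather than the original 12 x 12.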

Adding two 2D arrays together in C++ - Why does this program crash?

Hey, I am a beginner at C++ programming. I have made a program that is meant to add two 2D arrays together. However, the program outputs values until it crashes. Can someone help me identify the problem?
#include <iostream>
using namespace std;
int main(int argc, char** argv)
{
int a[10][10], c[10][10], i, j;
for (i = 1; i <= 10; ++i)
{
for(j=0; j < 10; ++j)
{
a[i][j] = i * j;
}
}
// We are able to treat the individual columns as arrays
for (int i = 0; i < 10; ++i)
{
int *b = a[i];
for (int j = 0; j < 10; ++j)
{
cout << b[j] << " ";
}
cout << endl;
}
cout << "****" << endl;
// Declare a multidimensional array on the heap
int **b = new int*[10];
// need to allocate all members individually
for (int i = 0; i < 10; ++i)
{
b[i] = new int[10];
}
// Set the values of b
for (int i = 0; i < 10; ++i)
{
for (j = 0; j < 10; ++j)
{
b[i][j] = (i * 10) + j;
}
}
for (i = 0; i < 10; ++i)
{
for (j = 1; j <= 10; ++j)
{
c[i][j] = a[i][j] + b[i][j];
}
}
for (i = 0; i < 10; ++i)
{
for (j = 1; j <= 10; ++j)
{
cout << c[i][j] << endl;
}
}
// Delete the multidimensional array - have to delete each part
for (int i = 0; i < 10; ++i)
{
delete[] b[i];
}
delete[] b;
return 0;
}

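I corrected your code. The crash came from loops that ran from 1 to 10 inclusive (i in the first loop, j in the last two), which indexes past the end of the 10-element arrays; all loops now run from 0 to 9. It now works without crashing. You can try it out.
#include <iostream>
using namespace std;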
int main(int argc, char** argv)
{
int a[10][10], c[10][10], i, j;
for (i = 0; i <10; ++i)
{
for(j=0; j < 10; ++j)
{
a[i][j] = i * j;
}
}
//We are able to treat the individual columns as arrays;
for (i = 0; i < 10; ++i)
{
int *b = a[i];
for (int j = 0; j < 10; ++j)
{
cout << b[j] << " ";
}
cout << endl;
}
cout << "****" << endl;
//Declare a multidimensional array on the heap;
int **b = new int*[10];
//need to allocate all members individually
for (i = 0; i < 10; ++i)
{
b[i] = new int[10];
}
//Set the values of b
for ( i = 0; i < 10; ++i)
{
for (j = 0; j < 10; ++j)
{
b[i][j] = (i * 10) + j;
}
}
for (i = 0; i < 10; ++i)
{
for (j = 0; j <10; ++j)
{
c[i][j] = a[i][j] + b[i][j];
}
}
for (i = 0; i < 10; ++i)
{
for (j = 0;j < 10; ++j)
{
cout << c[i][j] << " ";
}
cout<<endl;
}
// Delete the multidimensional array - have to delete each part
for (i = 0; i < 10; ++i)
{
delete[] b[i];
}
delete[] b;
return 0;
}

Error in double pointer to array parameter

int main (void)
{
int** arr = new int*[4];
for (int i = 0; i < 4; i++) arr[i] = new int[4] {1, 0, 0, 1};
const int* p = &(arr[0][0]);
TFigure* test = new TFigure(arr, 4, 4);
test->resolve();
for (int i = 0; i < 4; i++) delete[] arr[i];
delete[] arr;
return 0;
}
where constructor declaration is
line 57:
TFigure(int **ia, int n, int m)
N = n;
M =m;
landscape = new int*[n];
puddles = new int*[n];
for (int i = 0; i < n; i++){
landscape[i] = new int[m];
puddles[i] = new int[n];
for (int j = 0; j < m; j++)
landscape[i][j] = *ia[i][j];
}
for (int i = 0; i < n; i++)
for (int j = 0; j < 0; j++)
if (i == 0 || i == N || j == 0 || j == M)
puddles[i][j] = 0;
else
puddles[i][j] = 1;
for (int i = 0; i < N; i++){
for (int j = 0; j < M; j++)
std::cout << puddles[i][j] << ' ';
std::cout << std::endl;
}
for (int i = 0; i < N; i++){
for (int j = 0; j < M; j++)
std::cout << landscape[i][j] << ' ';
std::cout << std::endl;
}
};
but I get this error:
57:43: error: invalid type argument of unary «*» (have «int»)
I don't understand what causes this.

The problem is with this line:
landscape[i][j] = *ia[i][j];
ia[i][j] gives you an int which you then try to dereference. It seems like you really just want:
landscape[i][j] = ia[i][j];
I'm not sure if this was a mistake when copy and pasting or not, but your constructor definition is missing an opening {.
TFigure(int **ia, int n, int m) {
// Here ^
