How to improve the performance of a 2D array in C++

I have a low-level function that will be called millions of times, so it should be very efficient. When I profiled it with gprof on Linux, I found that this part of the code takes 60% of the function's total computation (the rest solves the roots of a cubic equation). Here Point is a data structure that has x and v, which are converted to matrices for later use. The idea is to subtract each row from the first row. The code is shown below:
double x[4][3] = {0}, v[4][3] = {0};
for (int i = 0; i < 4; ++i){
    for (int j = 0; j < 3; ++j){
        v[i][j] = Point[i]->v[j];
        x[i][j] = Point[i]->x[j];
    }
}
for (int i = 1; i < 4; ++i){
    for (int j = 0; j < 3; ++j){
        v[i][j] = v[0][j] - v[i][j];
        x[i][j] = x[0][j] - x[i][j];
    }
}
Can anyone show me the problem with this code? Why does it perform so badly?

You can do it all in one pass:
double x[4][3] = {
    { Point[0]->x[0], Point[0]->x[1], Point[0]->x[2] }
};
double v[4][3] = {
    { Point[0]->v[0], Point[0]->v[1], Point[0]->v[2] }
};
for (int i = 1; i < 4; ++i){
    for (int j = 0; j < 3; ++j){
        x[i][j] = x[0][j] - Point[i]->x[j];
        v[i][j] = v[0][j] - Point[i]->v[j];
    }
}
You could even take that to the next level and put the entire thing into the initializers for x and v.
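For illustration, a sketch of that fully-initialized version, assuming Point[i]->x and Point[i]->v are plain arrays of three doubles (only x is shown; v follows the same pattern):
double x[4][3] = {
    { Point[0]->x[0], Point[0]->x[1], Point[0]->x[2] },
    { Point[0]->x[0] - Point[1]->x[0], Point[0]->x[1] - Point[1]->x[1], Point[0]->x[2] - Point[1]->x[2] },
    { Point[0]->x[0] - Point[2]->x[0], Point[0]->x[1] - Point[2]->x[1], Point[0]->x[2] - Point[2]->x[2] },
    { Point[0]->x[0] - Point[3]->x[0], Point[0]->x[1] - Point[3]->x[1], Point[0]->x[2] - Point[3]->x[2] }
};
// v is initialized the same way, with ->v in place of ->x
Whether this beats the loop version depends on what the compiler already does with the loops, so it is worth measuring.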
Or, if x and v in Point are each contiguous arrays:
double x[4][3], v[4][3]; // no init
// fill every row with the first point's values (memcpy is from <cstring>)
for (int i = 0; i < 4; ++i){
    memcpy(x[i], Point[0]->x, sizeof(x[i]));
    memcpy(v[i], Point[0]->v, sizeof(v[i]));
}
// subtract the remaining points in place
for (int i = 1; i < 4; ++i){
    for (int j = 0; j < 3; ++j){
        x[i][j] -= Point[i]->x[j];
        v[i][j] -= Point[i]->v[j];
    }
}

Related

Neural network does not learn XOR (converges to 0.5 output)

I wrote a multilayer perceptron that should be able to learn XOR. However, whatever I do it converges to an output of 0.5 for the inputs (1,1), (0,1) and (1,0), while for the input (0,0) it converges to zero. Does anyone have an idea where my mistake is?
Forward pass:
void MLP::feedforward() {
    for(int hidden = 0; hidden < nHidden; hidden++) {
        hiddenNeurons.at(hidden) = 0;
        for(int input = 0; input < nInput; input++) {
            hiddenNeurons.at(hidden) += inputNeurons.at(input) * weightItoH(input, hidden);
        }
    }
    // Propagate towards the output layer
    for(int i = 0; i < nOutput; i++) {
        outputNeurons.at(i) = 0;
        for(int j = 0; j < nHidden; j++) {
            outputNeurons.at(i) += hiddenNeurons.at(j) * weightHtoO(j, i);
        }
        outputNeurons.at(i) = sigmoid(outputNeurons.at(i));
    }
}
Backpropagation:
void MLP::backPropagation(int i) {
    float learningRate = 0.75;
    float error = desiredOutput[i] - outputNeurons[0];
    // Calculate delta for output layer
    for(int i = 0; i < nOutput; i++) {
        outputDelta.at(i) = error * dersigmoid(outputNeurons[i]);
    }
    // Calculate delta for hidden layer
    for(int i = 0; i < nHidden; i++) {
        hiddenDelta.at(i) = 0; // zero the values from the previous iteration
        // add to the delta for each connection with an output neuron
        for(int j = 0; j < nOutput; j++) {
            hiddenDelta.at(i) += outputDelta.at(j) * weightHtoO(i, j);
        }
    }
    // Adjust weights Input to Hidden
    for(int i = 0; i < nInput; i++) {
        for(int j = 0; j < nHidden; j++) {
            weightItoH(i, j) += learningRate * hiddenDelta.at(j);
        }
    }
    // Adjust weights Hidden to Output
    for(int i = 0; i < nOutput; i++) {
        for(int j = 0; j < nHidden; j++) {
            weightHtoO(j, i) += learningRate * outputDelta.at(i) * hiddenNeurons.at(j);
        }
    }
}
Input
nInputPatterns = 4;
inputPatterns.assign(nInputPatterns, vector<int>(2));
inputPatterns[0][0] = 1;
inputPatterns[0][1] = 1;
inputPatterns[1][0] = 0;
inputPatterns[1][1] = 1;
inputPatterns[2][0] = 1;
inputPatterns[2][1] = 0;
inputPatterns[3][0] = 0;
inputPatterns[3][1] = 0;
desiredOutput = {0,1,1,0};
Sigmoid function and macros
#define sigmoid(value) (1/(1+exp(-value)));
#define dersigmoid(value) (value*(1-value));
//Macro's
#define weightItoH(input,hidden) weightsIH.at(nInput*hidden+input)
#define weightHtoO(hidden,output) weightsHO.at(nHidden*output+hidden)
C++ file: http://pastebin.com/8URZAHSy
Header file: http://pastebin.com/YiMXpmZX
There's no random initialization. This is needed to break the symmetry; else all your neurons learn the exact same values. That's effectively the same as having one neuron, and one neuron is insufficient for XOR.
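As a minimal sketch of what that could look like, assuming weightsIH and weightsHO are the std::vector<float> members used by the weight macros above (randomizeWeights is a hypothetical member you would call once before training):
#include <random>

void MLP::randomizeWeights() {
    // Small random values break the symmetry between neurons
    std::mt19937 gen(std::random_device{}());
    std::uniform_real_distribution<float> dist(-0.5f, 0.5f);
    for (float &w : weightsIH) w = dist(gen);
    for (float &w : weightsHO) w = dist(gen);
}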

How do I create a 2d array pointer with my own class as type?

I am trying to create a 2d array pointer with my own class, Tile, as type. I have looked at the code example at How do I declare a 2d array in C++ using new?. The following code works perfectly:
int** ary = new int*[sizeX];
for(int i = 0; i < sizeX; ++i)
    ary[i] = new int[sizeY];
for(int i = 0; i < 8; i++)
    for(int j = 0; j < 8; j++)
        ary[i][j] = 5;
for(int i = 0; i < 8; i++)
    for(int j = 0; j < 8; j++)
        cout << ary[i][j];
However, when I try to change the type from int to my own class, Tile, I get a
No viable overloaded '='
error in Xcode, and I can't figure out what this means. I use the following code:
Tile** t;
t = new Tile*[8];
for(int i = 0; i < 8; ++i)
    t[i] = new Tile[8];
for(int i = 0; i < 8; i++) {
    for(int j = 0; j < 8; j++) {
        t[i][j] = new Tile(new NoPiece());
    }
}
for(int i = 0; i < 8; i++) {
    for(int j = 0; j < 8; j++) {
        cout << (t[i][j].get_piece()).to_string();
    }
}
Here is the code for Tile.cpp:
#include "Tile.h"
Tile::Tile() {
}
Tile::Tile(Piece p) {
piece = &p;
}
Piece Tile::get_piece() {
return *piece;
}
And the code for Tile.h:
#include <iostream>
#include "Piece.h"

class Tile {
    Piece * piece;
public:
    Tile();
    Tile(Piece p);
    Piece get_piece();
};
The difference between the two code snippets is that the one using int treats the array elements as values, i.e. it assigns
ary[i][j] = 5;
while the one using Tile treats the array elements as pointers:
t[i][j] = new Tile(new NoPiece()); // new yields a Tile*, not a Tile
Since new returns a pointer and there is no operator= that assigns a Tile* to a Tile, the compiler reports "No viable overloaded '='". Change the assignment to one without new to fix the problem:
t[i][j] = Tile(new NoPiece());
There is nothing wrong with making a 2D array of pointers either - all you need is to declare it as a "triple pointer" and add an extra level of indirection:
Tile*** t;
t = new Tile**[8];
for(int i = 0; i < 8; ++i)
    t[i] = new Tile*[8];
for(int i = 0; i < 8; i++) {
    for(int j = 0; j < 8; j++) {
        t[i][j] = new Tile(new NoPiece());
    }
}
for(int i = 0; i < 8; i++) {
    for(int j = 0; j < 8; j++) {
        cout << (t[i][j]->get_piece()).to_string();
    }
}
// Don't forget to free the tiles and the arrays
for(int i = 0; i < 8; i++) {
    for(int j = 0; j < 8; j++) {
        delete t[i][j];
    }
    delete[] t[i];
}
delete[] t;

Trying to multiply two dynamically created matrices (2D vectors) together in C++

So what I am trying to do is multiply one 2D vector by another 2D vector.
I come from Java, Python and C#, so I am pretty much learning C++ as I go along.
I have the code down to generate the vectors and display them, but I can't seem to finish the multiplication part.
v1 is another matrix that has already been generated.
vector<vector<int> > v2 = getVector();
int n1 = v1[0].size();
int n2 = v2.size();
vector<int> a1(n2, 0);
vector<vector<int> > ans(n1, a1);
for (int i = 0; i < n1; i++) {
    for (int j = 0; j < n2; j++) {
        for (int k = 0; k < 10; k++) {
            // same as z[i][j] = z[i][j] + x[i][k] * y[k][j];
            ans[i][j] += v1[i][k] * v2[k][j];
        }
    }
}
displayVector(ans);
My guess for where I am going wrong is in the inner-most loop. I can't figure out what to actually put in place of that 10 I have there now.
When you multiply matrices, the number of columns of the matrix on the left side must equal the number of rows of the matrix on the right side. You need to check that this holds, and use that common dimension as the bound for the k loop:
int nCommon = v1[0].size();   // columns of v1
assert(v2.size() == nCommon); // must equal the rows of v2 (assert comes from <cassert>)
for (int i = 0; i < n1; i++) {
    for (int j = 0; j < n2; j++) {
        for (int k = 0; k < nCommon; k++) {
            ans[i][j] += v1[i][k] * v2[k][j];
        }
    }
}
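For reference, here is a minimal self-contained sketch of the whole multiplication with the dimensions taken from the inputs instead of hard-coded (the function name multiply is just illustrative):
#include <cassert>
#include <vector>
using std::vector;

// Multiplies an (r x c) matrix by a (c x p) matrix, giving an (r x p) result.
vector<vector<int> > multiply(const vector<vector<int> >& a,
                              const vector<vector<int> >& b) {
    const size_t rows   = a.size();
    const size_t common = a[0].size();   // columns of a, must equal rows of b
    assert(b.size() == common);
    const size_t cols   = b[0].size();

    vector<vector<int> > result(rows, vector<int>(cols, 0));
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < cols; ++j)
            for (size_t k = 0; k < common; ++k)
                result[i][j] += a[i][k] * b[k][j];
    return result;
}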
For your inner loop, you should do something like this:
ans[i][j] = 0;
for (int k = 0; k < n2; k++) {
    ans[i][j] += v1[i][k] * v2[k][j];
}
I don't know where the 10 comes from.

Infinite Impulse Response (IIR) Function

I am trying to design a signal class which includes an IIR filter function. The following is my code:
void signal::IIRFilter(vector<double> coefA, vector<double> coefB){
    double ** temp;
    temp = new double*[_nchannels];
    for(int i = 0; i < _nchannels; i++){
        temp[i] = new double[_ninstances];
    }
    for(int i = 0; i < _nchannels; i++){
        for(int j = 0; j < _ninstances; j++){
            temp[i][j] = 0;
        }
    }
    for(int i = 0; i < _nchannels; i++){
        for (int j = 0; j < _ninstances; j++){
            int sum1 = 0;
            int sum2 = 0;
            for(int k = 0; k < coefA.size(); k++){
                if ((j-k) > 0){
                    sum1 += coefA.at(k)*temp[i][j-k-1];
                }
            }
            for (int m = 0; m < coefB.size(); m++){
                if(j >= m){
                    sum2 += coefB.at(m)*_data[i][j-m];
                }
            }
            temp[i][j] = sum2-sum1;
        }
    }
    for(int i = 0; i < _nchannels; i++){
        for(int j = 0; j < _ninstances; j++){
            _data[i][j] = temp[i][j];
        }
    }
}
_data contains my original signal, _ninstances is my number of samples, and _nchannels is the number of channels. The function compiles and works but the result I am getting is different from the result given by MATLAB. I even use the same coefficients given by MATLAB. Is there anything that I'm doing wrong in my function?
One issue that I can see is that you are declaring sum1 and sum2 as integers when they should be double. To avoid this kind of error in the future, you should try configuring your compiler to warn of implicit conversions. In g++, this is accomplished using the -Wconversion flag.
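For illustration, here is the relevant part of the per-sample loop with the accumulators changed to double; the rest of the function is unchanged:
double sum1 = 0.0;   // was: int sum1 = 0;
double sum2 = 0.0;   // was: int sum2 = 0;
for(int k = 0; k < coefA.size(); k++){
    if ((j-k) > 0){
        sum1 += coefA.at(k)*temp[i][j-k-1];   // no longer truncated to int on each addition
    }
}
for (int m = 0; m < coefB.size(); m++){
    if(j >= m){
        sum2 += coefB.at(m)*_data[i][j-m];
    }
}
temp[i][j] = sum2-sum1;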

c++ 3d arrays

I'm trying to use a 3D array, but the code just crashes on Windows when I run it. Here's my code:
#include <iostream>
using namespace std;

int main(){
    int myArray[10][10][10];
    for (int i = 0; i <= 9; ++i){
        for (int t = 0; t <= 9; ++t){
            for (int x = 0; x <= 9; ++t){
                myArray[i][t][x] = i+t+x;
            }
        }
    }
    for (int i = 0; i <= 9; ++i){
        for (int t = 0; t <= 9; ++t){
            for (int x = 0; x <= 9; ++t){
                cout << myArray[i][t][x] << endl;
            }
        }
    }
    system("pause");
}
Can someone throw me a quick fix / explanation?
You twice have the line
for (int x = 0; x <= 9; ++t){
when you mean
for (int x = 0; x <= 9; ++x){
Classic copy-and-paste error.
BTW, if you run this in a debugger and look at the values of the variables, it's pretty easy to see what's going on.
David's answer is correct.
Incidentally, the convention is to use i, j, and k for nested iterator indices, and also to use < array_length rather than <= array_length - 1 as the loop condition.
If you do that, then you can make the array size a constant and get rid of some magic numbers.
Also, an assertion at the point where you use the array indices might have pointed you to the error.
The result may look like:
const std::size_t ARRAY_SIZE = 10;
int myArray[ARRAY_SIZE][ARRAY_SIZE][ARRAY_SIZE];
for (std::size_t i = 0; i < ARRAY_SIZE; ++i)
{
    for (std::size_t j = 0; j < ARRAY_SIZE; ++j)
    {
        for (std::size_t k = 0; k < ARRAY_SIZE; ++k)
        {
            // assert is a macro from <cassert>, not std::assert
            assert(i < ARRAY_SIZE && j < ARRAY_SIZE && k < ARRAY_SIZE);
            // Do stuff
        }
    }
}