Neural network output in C++

I want to create a function that calculates the output of a neural network. My NN has a 19-dimensional input vector and a 19-dimensional output vector, and I chose one hidden layer with 50 neurons. My code is below, but I am not quite sure it works properly.
double *BuildPlanner::neural_tactics(){
norm(); //normalize input vector
ReadFromFile(); // load weights W1 W2 b1
double hiddenLayer [50][1];
for(int h=0; h<50; h++){
hiddenLayer[h][0] =0;
for(int f = 0; f < 19; f++){
hiddenLayer[h][0] = hiddenLayer[h][0] + W1[h][f]*input1[f][0];
}
}
double HiddenLayer[50][1];
for(int h=0; h<50; h++){
HiddenLayer[h][0] = tanh(hiddenLayer[h][0] + b1[h][0]);
}
double outputLayer[50][1];
for(int h=0; h<19; h++){
for(int k=0; k<50; k++){
outputLayer[h][0] = outputLayer[h][0] + W2[h][k]*HiddenLayer[k][0];
}
}
double Output[19];
for(int h=0; h<19; h++){
Output[h] = tanh(outputLayer[h][0]);
}
return Output;
}
In particular, I am not quite sure about the matrix multiplication: W1*input + b1, where the sizes are 50x19 * 19x1 + 50x1, and W2*outHiddenLayer, which is 19x50 * 50x1.

Your matrix multiplication looks OK to me, but there are other problems: outputLayer is 50x1, but a) you only iterate through the first 19 elements, and b) you have it on the right-hand side of your equation
outputLayer[h][0] = outputLayer[h][0] + W2[h][k]...
before that element has ever been initialized, so the sum starts from an indeterminate value. That could be causing all your problems. Also, although I assume you're making outputLayer two-dimensional to make it look matrix-like, it's completely gratuitous when the second dimension has size 1. Just declare it and the others as
double outputLayer[50];
since it's a vector, and vectors are one-dimensional; that will actually make your code clearer.
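A minimal sketch of the forward pass with those fixes applied: one-dimensional buffers, accumulators that start from a known value, and a result returned by value. The names `Vec`, `Mat`, and `forward` and the template parameters are hypothetical, not from the original code. Note also that the original function returns a pointer to the local array `Output`, which no longer exists once the function returns; returning a `std::array` by value avoids that.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper types: a fixed-size vector and a row-major matrix.
template <std::size_t N>
using Vec = std::array<double, N>;
template <std::size_t Rows, std::size_t Cols>
using Mat = std::array<std::array<double, Cols>, Rows>;

// Computes tanh(W2 * tanh(W1 * input + b1)) for one hidden layer.
template <std::size_t In, std::size_t Hidden, std::size_t Out>
Vec<Out> forward(const Mat<Hidden, In>& W1, const Vec<Hidden>& b1,
                 const Mat<Out, Hidden>& W2, const Vec<In>& input)
{
    Vec<Hidden> hidden{};                     // value-initialized to 0.0
    for (std::size_t h = 0; h < Hidden; ++h) {
        double sum = b1[h];                   // start from the bias
        for (std::size_t f = 0; f < In; ++f)
            sum += W1[h][f] * input[f];
        assert(std::isfinite(sum));           // catches bad weights loaded from file
        hidden[h] = std::tanh(sum);
    }

    Vec<Out> output{};
    for (std::size_t o = 0; o < Out; ++o) {
        double sum = 0.0;                     // explicit initialization
        for (std::size_t k = 0; k < Hidden; ++k)
            sum += W2[o][k] * hidden[k];
        output[o] = std::tanh(sum);
    }
    return output;                            // by value, so nothing dangles
}
```

For the network in the question the call would use `In = 19`, `Hidden = 50`, and `Out = 19`, all deduced from the argument types.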

Related

Less memory usage for simple calculations on big random arrays in C++

For a molecular dynamics simulation I wrote some simple C++ code to create a random distribution of particles in 3D space:
double ** X = new double* [N];
X[0] = new double[C];
for(int i=0; i < C; i++)
X[0][i] = R_max * (double)rand() / RAND_MAX;
double ** Y = new double* [N];
Y[0] = new double[C];
for(int i=0; i < C; i++)
Y[0][i] = R_max * (double)rand() / RAND_MAX;
double ** Z = new double* [N];
Z[0] = new double[C];
for(int i=0; i < C; i++)
Z[0][i] = R_max * (double)rand() / RAND_MAX;
After that step, some particles have to be deleted depending on additional properties (for example, the distance between particles) that also have to be calculated. For that I use simple loops:
for (int m=0; m<C; m++){
for(int l=0; l<m; l++){
r_x[l][m] = X[0][m]-X[0][l];
r_y[l][m] = Y[0][m]-Y[0][l];
r_z[l][m] = Z[0][m]-Z[0][l];
r[l][m]= sqrt(r_x[l][m]*r_x[l][m] +r_y[l][m]*r_y[l][m] + r_z[l][m]*r_z[l][m]);
if(r[l][m] <= ...)...;
}
}
In the last step the coordinates have to be stored in a file. I use fopen and, again, some simple loops:
FILE *file;
file=fopen("system.ini","w");
for(int i= 0; i< ...; i++){
for (int j=0; j<...; j++){
fprintf(file,"%lf %lf %lf\n",X[i][j],Y[i][j],Z[i][j]);
}
}
That worked fine, but now I have to scale up the system size by a factor of a few thousand, so my 16 GB of memory is no longer sufficient to calculate the additional properties. Of course, professional simulation software won't take more than a few hundred MB of memory and a fraction of the computing time for the same calculations. How can I make this simple code a little more efficient?
Do a google for "C++ Sparse Matrix" and you'll find a ton of implementations and discussions.
Geeks for Geeks talks about rolling your own. Boost has a class for it. Those were the most interesting links from the first page.
I think the key to this is understanding what a sparse matrix is. After that, the rest is pretty straightforward.
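One way to read the sparse-matrix advice for this code: the full C x C arrays r_x, r_y, r_z, and r are what exhaust memory, yet each distance is only needed for the test inside the loop. The sketch below computes each distance on the fly and stores only the pairs that pass the test, as a coordinate list. The names `Pair`, `closePairs`, and `cutoff` are hypothetical, and the `r <= cutoff` test is an assumption standing in for the condition elided in the question.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One stored entry of the sparse distance "matrix": indices l < m and r[l][m].
struct Pair {
    std::size_t l;
    std::size_t m;
    double r;
};

// Returns only the pairs whose distance is <= cutoff.
// Memory grows with the number of close pairs, not with C * C.
std::vector<Pair> closePairs(const std::vector<double>& X,
                             const std::vector<double>& Y,
                             const std::vector<double>& Z,
                             double cutoff)
{
    assert(X.size() == Y.size() && Y.size() == Z.size());
    std::vector<Pair> pairs;
    for (std::size_t m = 0; m < X.size(); ++m) {
        for (std::size_t l = 0; l < m; ++l) {
            const double dx = X[m] - X[l];    // temporaries, never stored
            const double dy = Y[m] - Y[l];
            const double dz = Z[m] - Z[l];
            const double r = std::sqrt(dx * dx + dy * dy + dz * dz);
            if (r <= cutoff)
                pairs.push_back({l, m, r});
        }
    }
    return pairs;
}
```

This removes the quadratic memory use but still visits every pair, so the running time remains quadratic in the number of particles.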

How to access a vector inside a vector?

I have a vector of vectors of type double. I need to compute 360 numbers into cosY and put those 360 numbers into cosineY[0], then compute another 360 numbers with a different a and put them into cosineY[1], and so on. So my vector is effectively cosineY[a]. I then need to be able to take out just the cosY for an a that I specify...
My code is saying this:
for (int a = 0; a < 8; a++)
{
for int n=0; n <= 360; n++
{
cosY[n] = cos(a*vectorOfY[n]);
}
cosineY.push_back(cosY);
}
which I hope is the correct way of actually setting it.
But then I need to take the cosY for an a that I specify and calculate another vector of 360 elements, which will again be stored in a vector of vectors.
Right now I've got:
for (int a = 0; a < 8; a++
{
for (int n = 0; n <= 360; n++)
{
cosProductPt[n] = (VectorOfY[n]*cosY[n]);
}
CosProductY.push_back(cosProductPt);
}
VectorOfY is basically the amplitude of an input wave. I am trying to create cosine waves with different frequencies (a), and I then calculate the product of the input and the cosine wave at each frequency. I need to be able to access these 360 points for each frequency later in the program, and right now I also need to calculate the sum of all elements in cosProductPt for every frequency (stored in CosProductY) and store it in a vector dotProductCos[a].
I've been trying to work it out but I don't know how to access all the elements in a vector of vectors to add them. I've been trying to do this for the whole day without any results. Right now I know so little that I don't even know how I would display or access a vector inside a vector, but I need to use that access point for the addition.
Thank you for your help.
for (int a = 0; a < 8; a++)
{
for (int n = 0; n < 360; n++) // note: traded in <= for <. I think you had an
// off-by-one error here.
{
cosY[n] = cos(a*vectorOfY[n]);
}
cosineY.push_back(cosY);
}
This is sound so long as cosY has been pre-allocated to contain at least 360 elements. You could write
std::vector<std::vector<double>> cosineY;
std::vector<double> cosY(360); // strongly consider replacing the 360 with a well-named
// constant
for (int a = 0; a < 8; a++) // same with that 8
{
for (int n = 0; n < 360; n++)
{
cosY[n] = cos(a*vectorOfY[n]);
}
cosineY.push_back(cosY);
}
for example, but this hangs on to cosY longer than you need to and could cause problems later, so I'd probably scope cosY by throwing the above code into a function.
std::vector<std::vector<double>> buildStageOne(std::vector<double> &vectorOfY)
{
std::vector<std::vector<double>> cosineY;
std::vector<double> cosY(NumDegrees);
for (int a = 0; a < NumVectors; a++)
{
for (int n = 0; n < NumDegrees; n++)
{
cosY[n] = cos(a*vectorOfY[n]); // take radians into account if needed.
}
cosineY.push_back(cosY);
}
return cosineY;
}
This looks horrible, returning the vector by value, but the vast majority of compilers will take advantage of Copy Elision or some other sneaky optimization to eliminate the copying.
Then I'd do almost the exact same thing for the second step.
std::vector<std::vector<double>> buildStageTwo(std::vector<double> &vectorOfY,
std::vector<std::vector<double>> &cosineY)
{
std::vector<std::vector<double>> CosProductY;
std::vector<double> cosProductPt(NumDegrees); // scratch buffer, reused for each a
for (int a = 0; a < NumVectors; a++)
{
for (int n = 0; n < NumDegrees; n++)
{
cosProductPt[n] = (vectorOfY[n]*cosineY[a][n]);
}
CosProductY.push_back(cosProductPt);
}
return CosProductY;
}
But we can make a couple optimizations
std::vector<std::vector<double>> buildStageTwo(std::vector<double> &vectorOfY,
std::vector<std::vector<double>> &cosineY)
{
std::vector<std::vector<double>> CosProductY;
std::vector<double> cosProductPt(NumDegrees); // scratch buffer, reused for each a
for (int a = 0; a < NumVectors; a++)
{
// why risk constantly looking up cosineY[a]? grab it once and cache it
std::vector<double> & cosY = cosineY[a]; // note the reference
for (int n = 0; n < NumDegrees; n++)
{
cosProductPt[n] = (vectorOfY[n]*cosY[n]);
}
CosProductY.push_back(cosProductPt);
}
return CosProductY;
}
And the next is kind of an extension of the first:
std::vector<std::vector<double>> buildStageTwo(std::vector<double> &vectorOfY,
std::vector<std::vector<double>> &cosineY)
{
std::vector<std::vector<double>> CosProductY;
std::vector<double> cosProductPt(NumDegrees);
for (std::vector<double> & cosY: cosineY) // range-based for: no index a or lookup needed
{
for (int n = 0; n < NumDegrees; n++)
{
cosProductPt[n] = (vectorOfY[n]*cosY[n]);
}
CosProductY.push_back(cosProductPt);
}
return CosProductY;
}
We could do the same range-based for trick for the for (int n = 0; n < NumDegrees; n++), but since we are iterating multiple arrays here it's not all that helpful.
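The question also asked how to reach the elements of a vector inside a vector and add them up into dotProductCos[a], which the code above does not cover. A minimal sketch, assuming `cosProductY` is the vector of vectors returned by the second stage; the function names `element` and `sumRows` are hypothetical. The first index selects the inner vector (the frequency), the second selects the sample within it.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Reads one element: frequency index a, sample index n.
double element(const std::vector<std::vector<double>>& v,
               std::size_t a, std::size_t n)
{
    assert(a < v.size() && n < v[a].size());
    return v[a][n];
}

// Sums every inner vector; result[a] is the sum of all cosProductY[a][n].
std::vector<double> sumRows(const std::vector<std::vector<double>>& cosProductY)
{
    std::vector<double> dotProductCos;
    dotProductCos.reserve(cosProductY.size());
    for (const std::vector<double>& row : cosProductY) {
        // 0.0 (not 0) keeps the accumulation in double rather than int
        dotProductCos.push_back(std::accumulate(row.begin(), row.end(), 0.0));
    }
    return dotProductCos;
}
```

The same double-index form, `cosProductY[a][n]`, also works for printing individual values.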

Fast access to Rcpp::List elements

I have a data set that I really want to work with as a 3D array. Rather than deal with an attempt to get an R array into a RcppArmadillo Cube, which I'm not sure would work (?), I'm sending in a list of matrices. My problem, however, is that the list is of large matrices and I want to be able to loop over the 3rd dimension in the middle of loops over rows or columns. With medium size matrices (list of 20 matrices of size 50,000x5), flattening the list into one long array gets me my result in less than a second.
I'd prefer to avoid copying the data in order to accommodate larger matrices. But using as< NumericMatrix >(list_obj[t]) inside a loop over the rows makes the function take several minutes at least. An example of my code that uses as<> and is incredibly slow is below. dat is the list passed into the function, and steps is an int passed into the function.
int T = dat.size();
int N = as<NumericMatrix>(dat[0]).nrow();
int M = as<NumericMatrix>(dat[0]).ncol();
// Temp vals
double top, bot;
// Output vector
NumericVector out(M);
// Loop through each signal
for (int j=0; j<M; j++) {
// Reset numerator and denominator
top = 0;
bot = 0;
// Loop through each time dimension
for (int tm = 0; tm < (T - steps); tm++) {
// Loop through each row
for (int i = 0; i < N; i++) {
// Check if entry is positive
if (as<NumericMatrix>(dat[tm])(i, j) > 0) {
// Increment denominator
bot += 1.0;
// Compute future product
top = 1.0;
for (int k = 1; k <= steps; k++) {
if (as<NumericMatrix>(dat[tm + k])(i, j) == 0) {
top = 0.0;
break;
}
}
}
}
out(j) = top / bot;
}
}
Is there a fast way to do this without flattening the matrix and requiring a full copy of the potentially large data?

Bucket sort and User input

Here's the problem I'm working on: a user gives me an unspecified number of points on a standard x,y coordinate plane, where 0 < x^2 + y^2 <= 1. (x squared plus y squared, just for clarity).
Here is an example of the input:
0.2 0.38
0.6516 -0.1
-0.3 0.41
-0.38 0.2
From there, I calculate the distance of those points from the origin, (0, 0). Here is the function I use to find the distance and push it into a vector of doubles, B.
void findDistance(double x = 0, double y = 0) {
double x2 = pow(x, 2);
double y2 = pow(y, 2);
double z = x2 + y2;
double final = sqrt(z);
B.push_back(final);
}
Then, I want to bucket sort vector B, where there are n buckets for n points. Here is my current build of the bucketSort:
void bucketSort(double arr[], int n)
{
vector<double> b[n];
for (int i=0; i<n; i++)
{
int bi = n*arr[i];
b[bi].push_back(arr[i]);
}
for (int i=0; i<n; i++)
sort(b[i].begin(), b[i].end());
int index = 0;
for (int i = 0; i < n; i++)
for (int j = 0; j < b[i].size(); j++)
arr[index++] = b[i][j];
}
My problem is that I can't get bucketSort to work without crashing. I get a Windows message saying the program has stopped working. I know the function works, but only when I initialize the array and fill it at the same time. This is an example of a call that works:
double arr[] = {0.707107, 0.565685, 0.989949, 0.848528 };
int n = sizeof(arr)/sizeof(arr[0]);
bucketSort(arr, n);
So far, I've yet to find any other way of calling and initializing the vector that the function will accept and run. I need to find a way to take the points, compute the distances, and sort the distances. This is the current main that I'm plugging in, and which fails:
int main(){
int number;
while (cin >> number){
A.push_back(number); }
int q = 0; double r = 0; double d = 0;
while (q < (A.size() - 1)){
findDistance(A[q], A[q+1]);
q += 2;
}
double arr[B.size()]; copy(B.begin(), B.end(), arr);
int n = (sizeof(B) + sizeof(B[0])) / sizeof(B[0]);
bucketSort(arr, n);
int w = 0;
while (w < y){ cout << arr[w] << endl; w++; }
The arr copy was created in some strange debugging attempt: sorry if unclear. Results of distance function stored in B, copied into arr, and arr is what's attempted to be sorted. The user inputs are given through the command prompt, using the syntax listed in the beginning. Output should be something like:
0.42941
0.49241
0.50804
0.65923
If anyone can suggest edits to either of the functions that would make this work, the assistance would be greatly appreciated.
Here are a few issues to work on:
Your input loop will stop when it reads a non-integer, because number is an int. Change number to double.
Your size calculation
int n = (sizeof(B) + sizeof(B[0])) / sizeof(B[0]);
I am not sure what you are trying to do here, but sizeof on a vector is not what you want. I think replacing this with:
int n = B.size();
is what you want.
I am not sure why you needed to convert the vector to an array to do the bucket sort - much easier to just pass the vector through to the bucket sort, then the size comes with the vector.
Change the bucketSort function to take a reference to a vector:
void bucketSort(vector<double> &arr)
{
int n = arr.size();
...
and just pass B into the function. The rest of the code should be the same.
Also a portability note: not every compiler supports variable-length arrays, so you are better off sticking with vector wherever possible.
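A sketch of the vector-based version described above, assuming every value lies in [0, 1] as in the question. One guard here is an addition that the points above do not mention: a distance of exactly 1.0 (allowed by x^2 + y^2 <= 1) gives bucket index n, one past the last bucket, so the index is clamped.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Bucket sort for values in [0, 1]; sorts arr in place.
void bucketSort(std::vector<double>& arr)
{
    const std::size_t n = arr.size();
    if (n == 0) return;

    std::vector<std::vector<double>> buckets(n);  // no variable-length array needed
    for (double value : arr) {
        assert(value >= 0.0 && value <= 1.0);
        // value == 1.0 would map to index n, so clamp to the last bucket
        const std::size_t bi =
            std::min(static_cast<std::size_t>(n * value), n - 1);
        buckets[bi].push_back(value);
    }

    std::size_t index = 0;
    for (std::vector<double>& bucket : buckets) {
        std::sort(bucket.begin(), bucket.end());
        for (double value : bucket)
            arr[index++] = value;                 // write back in bucket order
    }
}
```

With this signature the call in main becomes `bucketSort(B);`, and the copy into `arr` is no longer needed.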

Multiplication of matrices

I have two matrices, W2 and hiddenLayer, and I want to multiply them. W2 is 12x50 and hiddenLayer is 50x1. Is this the proper code for that calculation:
for(int h=0; h<50; h++){
for(int k=0; k<12; k++){
outputLayer += W2[k][h]*HiddenLayer[h];
}
}
or do I have to put the k loop first?
Matrix multiplication is defined as:
$C = AB \iff C_{i,j} = \sum_{k=1}^{n} A_{i,k} B_{k,j}$ for $i, j = 1, \dots, n$ (in the case of square matrices).
Thus outputLayer is a vector. Since HiddenLayer is a vector too, this isn't really a matrix multiplication but a matrix vector multiplication, which simplifies the formula above:
$b = Ax \iff b_i = \sum_{k=1}^{m} A_{i,k} x_k$ for $i = 1, \dots, n$ ($A$ is an $n \times m$ matrix).
So all in all your code should be something like
for(int row = 0; row < 12; row++){
outputLayer[row] = 0;
for(int column = 0; column < 50; column++){
outputLayer[row] += W2[row][column]*HiddenLayer[column];
}
}
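The same loop for arbitrary sizes, as a sketch: `matVec` is a hypothetical helper, and the matrix is assumed to be stored as a vector of equally sized rows. The row loop is outermost so that each output element is zeroed once and then accumulated over the columns.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes b = A * x, where A has A.size() rows and x.size() columns.
std::vector<double> matVec(const std::vector<std::vector<double>>& A,
                           const std::vector<double>& x)
{
    std::vector<double> b(A.size(), 0.0);         // one entry per row, zeroed
    for (std::size_t row = 0; row < A.size(); ++row) {
        assert(A[row].size() == x.size());        // inner dimensions must match
        for (std::size_t column = 0; column < x.size(); ++column)
            b[row] += A[row][column] * x[column];
    }
    return b;
}
```

For example, A = [[1, 2, 3], [4, 5, 6]] and x = [1, 0, -1] give b = [-2, -2].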