Multidimensional arrays from MATLAB to C++ - c++

I'm trying to port one of my MATLAB projects to C++, but I got stuck along the way.
Here's the portion of the MATLAB code I want to convert. It works in MATLAB but not in C++:
RelRough = [0, 1E-6, 5E-6, 1E-5, 5E-5, 0.0001, 0.0002, 0.0004, 0.0006, 0.0008, 0.001];
ReT = [4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000];
for i = 1:length(ReT)
for j = 1:length(RelRough)
FCT_guess = 1;
tolerance = 1;
while tolerance > 1e-14
FCT_cal = 1/(-2*log10((RelRough(j)/3.7) + (2.51/(ReT(i)*sqrt(FCT_guess)))))^2;
tolerance = abs(FCT_cal-FCT_guess);
FCT_guess = FCT_cal;
FCT(i,j) = FCT_cal*1000;
end
end
end
Here's my C++ version. I keep getting the error "expression must have integral or unscoped enum type" for the variable g:
double RelRough[] = { 0, 1E-6, 5E-6, 1E-5, 5E-5, 0.0001, 0.0002, 0.0004, 0.0006, 0.0008, 0.001 };
const int lengthRelRough = sizeof(RelRough) / sizeof(RelRough[0]);
double ReT[] = { 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000 };
const int lengthReT = sizeof(ReT) / sizeof(ReT[0]);
double fct[lengthReT][lengthRelRough] = { 0 };
double fct_guess = 1;
double tolerance = 1;
double fct_cal = 0;
for (int ii = 0; ii < lengthReT; ++ii) {
for (int jj = 0; jj < lengthRelRough; ++jj) {
while (tolerance > 1e-14) {
double h = (RelRough[jj] / 3.7), w = (2.51 / (ReT[ii] * sqrt(fct_guess)));
double g = (-2*log10(h+w))^2;
fct_cal = 1/g;
tolerance = abs(fct_cal - fct_guess);
fct_guess = fct_cal;
fct[ii][jj] = fct_cal;
std::cout << fct[ii][jj] << "\t";
}
}
}
return 0;
}
Can anyone help me see what's wrong? Thanks in advance!

Change this:
double g = (-2*log10(h+w))^2;
into:
double g = pow(-2*log10(h+w),2.0);
As @Eljay pointed out in a comment, the ^ operator performs a bitwise XOR in C++, not exponentiation. It is only defined for integral operands, which is why the compiler rejects it for a double. For more information:
Exponentiation (pow function)
Boolean Operations (including XOR)
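For reference, here is a minimal sketch of the corrected inner calculation. Besides using std::pow, it resets the guess and the tolerance for every (ReT, RelRough) pair, as the MATLAB code does. In the C++ version above they are initialized only once, so the while loop would stop running after the first cell. The function name frictionFactor and the iteration cap maxIter are assumptions added for illustration.

```cpp
#include <cassert>
#include <cmath>

// Fixed-point iteration from the question for one (Reynolds number, relative roughness) pair.
// frictionFactor is a hypothetical helper name; maxIter is an added safety cap.
double frictionFactor(double re, double relRough, int maxIter = 1000) {
    assert(re > 0.0 && relRough >= 0.0);
    double guess = 1.0;      // reset for every pair, as in the MATLAB loop
    double tolerance = 1.0;  // reset for every pair as well
    for (int n = 0; n < maxIter && tolerance > 1e-14; ++n) {
        double h = relRough / 3.7;
        double w = 2.51 / (re * std::sqrt(guess));
        double g = std::pow(-2.0 * std::log10(h + w), 2.0);  // pow, not ^
        double cal = 1.0 / g;
        tolerance = std::fabs(cal - guess);
        guess = cal;
    }
    return guess;
}
```

Inside the two for loops this would be used as fct[ii][jj] = frictionFactor(ReT[ii], RelRough[jj]); multiply by 1000 if you want to match the scaling of the MATLAB table.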

How do I prevent over-correcting in my autonomous driving solution?

I am working on an autonomous driving solution for Euro Truck Simulator 2 with OpenCV in C++.
Here is where we detect the curve of the road:
int bottom_center = 160;
int sum_centerline = 0;
int count_centerline = 0;
int first_centerline = 0;
int last_centerline = 0;
double avr_center_to_left = 0;
double avr_center_to_right = 0;
//#pragma omp parallel for
for (int i = 240; i > 30; i--){
double center_to_right = -1;
double center_to_left = -1;
for (int j = 0; j < 150; j++) {
if (contours.at<uchar>(i, bottom_center + j) == 112 && center_to_right == -1) {
center_to_right = j;
}
if (contours.at<uchar>(i, bottom_center - j) == 112 && center_to_left == -1) {
center_to_left = j;
}
}
if (center_to_left != -1 && center_to_right != -1){
int centerline = (center_to_right - center_to_left + 2 * bottom_center) / 2;
if (first_centerline == 0) {
first_centerline = centerline;
}
cv::circle(outputImg, Point(centerline, i), 1, Scalar(30, 255, 30), 3);
cv::circle(outputImg, Point(centerline + center_to_right+20, i), 1, Scalar(255, 30, 30) , 3);
cv::circle(outputImg, Point(centerline - center_to_left+10, i), 1, Scalar(255, 30, 30) , 3);
sum_centerline += centerline;
// running average; the parentheses avoid dividing by zero on the first sample
avr_center_to_left = (avr_center_to_left * count_centerline + center_to_left) / (count_centerline + 1);
avr_center_to_right = (avr_center_to_right * count_centerline + center_to_right) / (count_centerline + 1);
last_centerline = centerline;
count_centerline++;
}
else {}
}
And here is my current solution for steering:
int diff = 0;
if (count_centerline != 0) {
diff = sum_centerline / count_centerline - bottom_center;
int degree = atan2(last_centerline - first_centerline, count_centerline) * 180 / PI;
//diff = (90 - degree);
int move_mouse_pixel = diff;
cout << "Steer: " << move_mouse_pixel << "px ";
if (diff <= 20 && diff >= -20){ // && is needed here: with || the condition is always true
SetCursorPos(pt.x + (move_mouse_pixel / 10), height / 2);
}
else{
SetCursorPos(pt.x + (move_mouse_pixel / 25), height / 2);
}
}
Finally, here is a video of what my program currently does: https://www.youtube.com/watch?v=rqyvoFuGKKk&feature=youtu.be
The current problem I have is that the steering does not center fast enough, leading it to continually over-correct until it swerves off the lane. I have tried to increase steering sensitivity in-game, to allow for faster or slower turning, but this either makes the truck spin out of control or not turn enough when driving along a large curve.
My current method just divides slight movements (between -20px and 20px) by 10, and large movements by 20. I've also tried reversing this, but that did not fix the over-correcting problem.
There are two possible solutions that I have found so far:
I could gradually increase the divisor applied to move_mouse_pixel, reducing the steering force for small movements.
Or, I could somehow make the program center the steering wheel more quickly. I am not sure how I would implement this.
What do you guys think?
I believe a PID controller would be suitable for this task.
https://en.wikipedia.org/wiki/PID_controller
In your situation it would look similar to this:
diffOld = diff;
diff = sum_centerline / count_centerline - bottom_center;
SetCursorPos(width/2 + Kp* diff + Kd*(diff - diffOld) , height / 2);
Do not use an if statement in this controller. You need to keep steering even when there is no error to correct. I would suggest skipping the integral part, because the system already integrates: whenever you are not driving straight, the error accumulates. You need to choose the values of Kp and Kd experimentally, for example with the Ziegler–Nichols method: https://en.wikipedia.org/wiki/Ziegler%E2%80%93Nichols_method
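The proportional and derivative terms can be wrapped in a small helper so that the previous error is stored between frames. This is only a sketch: PDController and its member names are hypothetical, and the gains still have to be tuned for the game.

```cpp
#include <cassert>
#include <cmath>

// Minimal PD controller: output = Kp * error + Kd * (error - previous error).
// PDController and its member names are hypothetical, not part of the question's code.
struct PDController {
    double kp;
    double kd;
    double prevError;

    PDController(double kp_, double kd_) : kp(kp_), kd(kd_), prevError(0.0) {}

    double update(double error) {
        assert(std::isfinite(error));
        double output = kp * error + kd * (error - prevError);
        prevError = error;  // remember the error for the next frame
        return output;
    }
};
```

Per frame you would compute diff as before and then call something like SetCursorPos(width / 2 + static_cast<int>(pd.update(diff)), height / 2), where pd is a PDController created once outside the loop.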

SuiteSparse(4.5.1)'s SPQR - calling to cholmod_allocate_triplet always returns NULL

I am trying to use SuiteSparse SPQR to solve a linear system x = A\b. My A matrix is sparse and rectangular, so I chose SPQR.
I built SuiteSparse with MS Visual Studio 2012 on Windows 7 x64 using the build files provided by https://github.com/jlblancoc/suitesparse-metis-for-windows.
To test the function, I modified the spqr_example project to allocate triplets and convert them to a sparse matrix, instead of reading the matrix from stdin as the original does. I entered a small A and b for testing. The program compiled successfully. When I debugged it, I found that my call to cholmod_allocate_triplet() fails because the function's definition contains this line:
RETURN_IF_NULL_COMMON (NULL) ;
This check always triggers, so the function returns NULL (even though my common object starts successfully).
I don't want to change this line, as I may have made a mistake somewhere or forgotten a required step; I am new to the library.
Can anybody suggest how to make my program run properly? My code below is modified from the provided spqr_example. Thank you very much.
#include <iostream>
#include "SuiteSparseQR.hpp"
int main (int argc, char **argv)
{
cholmod_common Common, *cc ;
cholmod_sparse *A ;
cholmod_dense *X, *B, *Residual ;
double rnorm, one [2] = {1,0}, minusone [2] = {-1,0} ;
int mtype ;
// start CHOLMOD
cc = &Common ;
cholmod_l_start (cc) ;
// load A
//A = (cholmod_sparse *) cholmod_l_read_matrix (stdin, 1, &mtype, cc) ;
// A = [ 1 0 0 0;
// -1 1 0 0; ...
// 0 -1 1 0; ...
// 0 0 -1 1; ...
// 0 0 0 -1];
int row[] = {0, 1, 1, 2, 2, 3, 3, 4};
int col[] = {0, 0, 1, 1, 2, 2, 3, 3};
double val[] = {1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0};
int numEq = 5;
int numElement = 8;
int numSol = 4;
double b[] = {5.0, -5.0, 2.0, 1.0, 0.0};
cholmod_triplet* triplet = cholmod_allocate_triplet(5,4,5*4,0,CHOLMOD_REAL,cc);
int * triplet_i = (int *)(triplet->i);
int * triplet_j = (int *)(triplet->j);
double * triplet_x = (double *)(triplet->x);
for (int ne=0; ne<numElement; ne++)
{
triplet_i[triplet->nnz] = row[ne];
triplet_j[triplet->nnz] = col[ne];
triplet_x[triplet->nnz] = val[ne];
triplet->nnz++;
}
// Convert triplet to sparse matrix
A = cholmod_triplet_to_sparse(triplet, numElement, cc);
cholmod_free_triplet(&triplet, cc);
// B = ones (size (A,1),1)
//B = cholmod_l_ones (A->nrow, 1, A->xtype, cc) ;
B = cholmod_l_zeros(numEq, 1, CHOLMOD_REAL, cc);
for (int ne=0; ne<numEq; ne++)
{
((double *)(B->x))[ne] = b[ne]; // fill B from the right-hand side b, not from val
}
// X = A\B
X = SuiteSparseQR<double>(A,B,cc);
//X = SuiteSparseQR <double> (A, B, cc) ;
// Print out the result
// X->x already points at the solution values, so no extra allocation is needed
double *sol = (double *)(X->x);
for (int r=0; r<numSol; r++)
{
std::cout << "x[" << r << "] = " << sol[r] << std::endl;
}
///// END HERE
// rnorm = norm (B-A*X)
Residual = cholmod_l_copy_dense (B, cc) ;
cholmod_l_sdmult (A, 0, minusone, one, X, Residual, cc) ;
rnorm = cholmod_l_norm_dense (Residual, 2, cc) ;
printf ("2-norm of residual: %8.1e\n", rnorm) ;
printf ("rank %ld\n", cc->SPQR_istat [4]) ;
// free everything and finish CHOLMOD
cholmod_l_free_dense (&Residual, cc) ;
cholmod_l_free_sparse (&A, cc) ;
cholmod_l_free_dense (&X, cc) ;
cholmod_l_free_dense (&B, cc) ;
cholmod_l_finish (cc) ;
return (0) ;
}
I finally found out why my program broke after the line below:
cholmod_triplet* triplet = cholmod_allocate_triplet(5,4,5*4,0,CHOLMOD_REAL,cc);
cholmod_allocate_triplet() internally calls RETURN_IF_NULL_COMMON (NULL), which makes it return NULL.
The reason is that I start CHOLMOD by calling
cholmod_l_start (cc) ;
which is the long int version of cholmod_start().
To fix the problem, I have to call cholmod_l_allocate_triplet() instead of cholmod_allocate_triplet(), and likewise change every other cholmod_ call to its cholmod_l_ counterpart. Note that the cholmod_l_ interface stores indices as SuiteSparse_long rather than int, so triplet->i and triplet->j need to be cast accordingly.

iris to screen calculation for eye tracking

I'm currently experimenting with eye tracking. I've successfully built an iris tracking algorithm using OpenCV with contours and the Hough transform, but the next step is unclear to me. I want to know whether the calculations I'm doing are correct for translating the center of an eye to a screen position. The user's head is in a fixed position.
What I want is an algorithm that works for all eyes, of course. Is there some kind of angle calculation involved? For example, when the user looks further to the right, is the relationship linear?
What I do right now is:
First I let the user look at specific points and use RANSAC to detect the iris position that's closest to the position on the screen. I do that with four 2D points on the screen and on the iris. I use a homography to get the mapping:
void gaussian_elimination(float *input, int n){
// ported to c from pseudocode in
// http://en.wikipedia.org/wiki/Gaussian_elimination
float * A = input;
int i = 0;
int j = 0;
int m = n-1;
while (i < m && j < n){
// Find pivot in column j, starting in row i:
int maxi = i;
for(int k = i+1; k<m; k++){
if(fabs(A[k*n+j]) > fabs(A[maxi*n+j])){
maxi = k;
}
}
if (A[maxi*n+j] != 0){
//swap rows i and maxi, but do not change the value of i
if(i!=maxi)
for(int k=0;k<n;k++){
float aux = A[i*n+k];
A[i*n+k]=A[maxi*n+k];
A[maxi*n+k]=aux;
}
//Now A[i,j] will contain the old value of A[maxi,j].
//divide each entry in row i by A[i,j]
float A_ij=A[i*n+j];
for(int k=0;k<n;k++){
A[i*n+k]/=A_ij;
}
//Now A[i,j] will have the value 1.
for(int u = i+1; u< m; u++){
//subtract A[u,j] * row i from row u
float A_uj = A[u*n+j];
for(int k=0;k<n;k++){
A[u*n+k]-=A_uj*A[i*n+k];
}
//Now A[u,j] will be 0, since A[u,j] - A[i,j] * A[u,j] = A[u,j] - 1 * A[u,j] = 0.
}
i++;
}
j++;
}
//back substitution
for(int i=m-2;i>=0;i--){
for(int j=i+1;j<n-1;j++){
A[i*n+m]-=A[i*n+j]*A[j*n+m];
//A[i*n+j]=0;
}
}
}
ofMatrix4x4 findHomography(ofPoint src[4], ofPoint dst[4]){
ofMatrix4x4 matrix;
// create the equation system to be solved
//
// from: Multiple View Geometry in Computer Vision 2ed
// Hartley R. and Zisserman A.
//
// x' = xH
// where H is the homography: a 3 by 3 matrix
// that transformed to inhomogeneous coordinates for each point
// gives the following equations for each point:
//
// x' * (h31*x + h32*y + h33) = h11*x + h12*y + h13
// y' * (h31*x + h32*y + h33) = h21*x + h22*y + h23
//
// as the homography is scale independent we can let h33 be 1 (indeed any of the terms)
// so for 4 points we have 8 equations for 8 terms to solve: h11 - h32
// after ordering the terms it gives the following matrix
// that can be solved with gaussian elimination:
float P[8][9]={
{-src[0].x, -src[0].y, -1, 0, 0, 0, src[0].x*dst[0].x, src[0].y*dst[0].x, -dst[0].x }, // h11
{ 0, 0, 0, -src[0].x, -src[0].y, -1, src[0].x*dst[0].y, src[0].y*dst[0].y, -dst[0].y }, // h12
{-src[1].x, -src[1].y, -1, 0, 0, 0, src[1].x*dst[1].x, src[1].y*dst[1].x, -dst[1].x }, // h13
{ 0, 0, 0, -src[1].x, -src[1].y, -1, src[1].x*dst[1].y, src[1].y*dst[1].y, -dst[1].y }, // h21
{-src[2].x, -src[2].y, -1, 0, 0, 0, src[2].x*dst[2].x, src[2].y*dst[2].x, -dst[2].x }, // h22
{ 0, 0, 0, -src[2].x, -src[2].y, -1, src[2].x*dst[2].y, src[2].y*dst[2].y, -dst[2].y }, // h23
{-src[3].x, -src[3].y, -1, 0, 0, 0, src[3].x*dst[3].x, src[3].y*dst[3].x, -dst[3].x }, // h31
{ 0, 0, 0, -src[3].x, -src[3].y, -1, src[3].x*dst[3].y, src[3].y*dst[3].y, -dst[3].y }, // h32
};
gaussian_elimination(&P[0][0],9);
matrix(0,0)=P[0][8];
matrix(0,1)=P[1][8];
matrix(0,2)=0;
matrix(0,3)=P[2][8];
matrix(1,0)=P[3][8];
matrix(1,1)=P[4][8];
matrix(1,2)=0;
matrix(1,3)=P[5][8];
matrix(2,0)=0;
matrix(2,1)=0;
matrix(2,2)=0;
matrix(2,3)=0;
matrix(3,0)=P[6][8];
matrix(3,1)=P[7][8];
matrix(3,2)=0;
matrix(3,3)=1;
return matrix;
}
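Once the eight coefficients are known, mapping an iris position to the screen follows directly from the two equations in the comment above. The mapping is projective rather than purely linear, because of the division by h31*x + h32*y + h33. The sketch below assumes a row-major array h holding h11..h33; Point2 and applyHomography are hypothetical names, and whether a homography is accurate enough for a given setup would have to be checked against calibration data.

```cpp
#include <cassert>
#include <cmath>

struct Point2 { double x; double y; };

// Maps (x, y) through a row-major 3x3 homography h[0..8] = h11..h33,
// using the same equations as the comment in findHomography.
Point2 applyHomography(const double h[9], double x, double y) {
    double w = h[6] * x + h[7] * y + h[8];
    assert(std::fabs(w) > 1e-12);  // w == 0 means the point maps to infinity
    return { (h[0] * x + h[1] * y + h[2]) / w,
             (h[3] * x + h[4] * y + h[5]) / w };
}
```

With the identity matrix the point is returned unchanged; a non-zero h31 or h32 is what makes equal steps of the iris produce unequal steps on the screen.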
You should have a look at existing solutions for this:
Eye writer for painting with your eyes (I tested this to control the mouse only)
Eyewriter.org
Eyewriter walkthrough
Eyewriter on Github
EyeLike pupil tracking
EyeLike info page (an algorithm similar to what you want is discussed here)
EyeLike on Github
Good luck!
Maybe this link is helpful to you. Good luck!
cv::Mat computeMatXGradient(const cv::Mat &mat) {
cv::Mat out(mat.rows,mat.cols,CV_64F);
for (int y = 0; y < mat.rows; ++y) {
const uchar *Mr = mat.ptr<uchar>(y);
double *Or = out.ptr<double>(y);
Or[0] = Mr[1] - Mr[0];
for (int x = 1; x < mat.cols - 1; ++x) {
Or[x] = (Mr[x+1] - Mr[x-1])/2.0;
}
Or[mat.cols-1] = Mr[mat.cols-1] - Mr[mat.cols-2];
}
return out;
}

Anisotropic Diffusion

I have converted this Matlab Anisotropic Diffusion code to C++ but I am not getting the desired output. All I am getting is a black image. Can someone please check my code and give any suggestions? Below is my code:
const double lambda = 1 / 7;
const double k = 30;
const int iter = 1;
int ahN[3][3] = { {0, 1, 0}, {0, -1, 0}, {0, 0, 0} };
int ahS[3][3] = { {0, 0, 0}, {0, -1, 0}, {0, 1, 0} };
int ahE[3][3] = { {0, 0, 0}, {0, -1, 1}, {0, 0, 0} };
int ahW[3][3] = { {0, 0, 0}, {1, -1, 0}, {0, 0, 0} };
int ahNE[3][3] = { {0, 0, 1}, {0, -1, 0}, {0, 0, 0} };
int ahSE[3][3] = { {0, 0, 0}, {0, -1, 0}, {0, 0, 1} };
int ahSW[3][3] = { {0, 0, 0}, {0, -1, 0}, {1, 0, 0} };
int ahNW[3][3] = { {1, 0, 0}, {0, -1, 0}, {0, 0, 0} };
Mat hN = Mat(3, 3, CV_32FC1, &ahN);
Mat hS = Mat(3, 3, CV_32FC1, &ahS);
Mat hE = Mat(3, 3, CV_32FC1, &ahE);
Mat hW = Mat(3, 3, CV_32FC1, &ahW);
Mat hNE = Mat(3, 3, CV_32FC1, &ahNE);
Mat hSE = Mat(3, 3, CV_32FC1, &ahSE);
Mat hSW = Mat(3, 3, CV_32FC1, &ahSW);
Mat hNW = Mat(3, 3, CV_32FC1, &ahNW);
void anisotropicDiffusion(Mat &output, int width, int height) {
//mat initialisation
Mat nablaN, nablaS, nablaW, nablaE, nablaNE, nablaSE, nablaSW, nablaNW;
Mat cN, cS, cW, cE, cNE, cSE, cSW, cNW;
//depth of filters
int ddepth = -1;
//center pixel distance
double dx = 1, dy = 1, dd = sqrt(2);
double idxSqr = 1.0 / (dx * dx), idySqr = 1.0 / (dy * dy), iddSqr = 1 / (dd * dd);
for (int i = 0; i < iter; i++) {
//filters
filter2D(output, nablaN, ddepth, hN);
filter2D(output, nablaS, ddepth, hS);
filter2D(output, nablaW, ddepth, hW);
filter2D(output, nablaE, ddepth, hE);
filter2D(output, nablaNE, ddepth, hNE);
filter2D(output, nablaSE, ddepth, hSE);
filter2D(output, nablaSW, ddepth, hSW);
filter2D(output, nablaNW, ddepth, hNW);
//exponential flux
cN = nablaN / k;
cN.mul(cN);
cN = 1.0 / (1.0 + cN);
//exp(-cN, cN);
cS = nablaS / k;
cS.mul(cS);
cS = 1.0 / (1.0 + cS);
//exp(-cS, cS);
cW = nablaW / k;
cW.mul(cW);
cW = 1.0 / (1.0 + cW);
//exp(-cW, cW);
cE = nablaE / k;
cE.mul(cE);
cE = 1.0 / (1.0 + cE);
//exp(-cE, cE);
cNE = nablaNE / k;
cNE.mul(cNE);
cNE = 1.0 / (1.0 + cNE);
//exp(-cNE, cNE);
cSE = nablaSE / k;
cSE.mul(cSE);
cSE = 1.0 / (1.0 + cSE);
//exp(-cSE, cSE);
cSW = nablaSW / k;
cSW.mul(cSW);
cSW = 1.0 / (1.0 + cSW);
//exp(-cSW, cSW);
cNW = nablaNW / k;
cNW.mul(cNW);
cNW = 1.0 / (1.0 + cNW);
//exp(-cNW, cNW);
output = output + lambda * (idySqr * cN.mul(nablaN) + idySqr * cS.mul(nablaS) +
idxSqr * cW.mul(nablaW) + idxSqr * cE.mul(nablaE) +
iddSqr * cNE.mul(nablaNE) + iddSqr * cSE.mul(nablaSE) +
iddSqr * cNW.mul(nablaNW) + iddSqr * cSW.mul(nablaSW));
}
}
Solved in C#; it is easy to translate to C++.
You need these variables:
IMAGE[height, width] = integer array with stored Image
height = height of images in pixels
width = width of images in pixels
/// <summary>Perona & Malik anisotropic diffusion filter (squared formula).</summary>
/// <param name="data">Image data</param>
/// <param name="dt">Heat diffusion value. Higher = faster convergence.</param>
/// <param name="lambda">The shape of the diffusion coefficient g(), controlling the Perona-Malik diffusion g(delta) = 1 / (1 + delta^2 / lambda^2). Higher = more blurred image & more noise removed.</param>
/// <param name="interations">Maximum number of iteration steps of the filter. Higher = slower & more noise removed.</param>
private void PeronaMalik(int[,] image, double dt, int lambda, int interations)
{
try
{
//test parameters
if (dt < 0)
throw new Exception("Negative dt is not allowed");
if (lambda <= 0)
throw new Exception("lambda must be greater than 0");
if (interations <= 0)
throw new Exception("Iterations must be greater than 0");
//Make temp image
int[,] temp = new int[height, width];
Array.Copy(image, temp, image.Length);
//Precalculate tables (for speed up)
double[] precal = new double[512];
double lambda2 = lambda * lambda;
for (int f = 0; f < 512; f++)
{
int diff = f - 255;
precal[f] = -dt * diff * lambda2 / (lambda2 + diff * diff);
}
//Apply the filter
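for (int n = 0; n < interations; n++)
{
//Refresh the working copy so each pass diffuses the result of the previous pass
Array.Copy(image, temp, image.Length);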
for (int h = 0; h < height; h++)
for (int w = 0; w < width; w++)
{
int current = temp[h, w];
int px = w - 1;
int nx = w + 1;
int py = h - 1;
int ny = h + 1;
if (px < 0)
px = 0;
if (nx >= width)
nx = width - 1;
if (py < 0)
py = 0;
if (ny >= height)
ny = height - 1;
image[h, w] = (int)(precal[255 + current - temp[h, px]] +
precal[255 + current - temp[h, nx]] +
precal[255 + current - temp[py, w]] +
precal[255 + current - temp[ny, w]]) +
temp[h, w];
}
}
}
catch (Exception ex) { throw new Exception(ex.Message + "\r\nIn PeronaMalik"); }
}
The solution above uses equation 2. If you want equation 1 (exponential), change the equation in the precal table to:
precal[f] = -dt * diff * Math.Exp(-(diff * diff / lambda2));
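It looks like you need to assign the multiplication result, because mul() returns a new matrix instead of modifying cN in place:
Mat C = A.mul(B);
And
int ahN[3][3] ....
should be
float ahN[3][3] ....
because the kernels are wrapped as CV_32FC1. Also note that const double lambda = 1 / 7; is integer division and evaluates to 0, so output never changes; write 1.0 / 7 instead.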
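To check the arithmetic independently of OpenCV, one diffusion step can be written on a plain float buffer. This is a sketch under stated assumptions, not a port of the question's code: it uses the 4-neighbour scheme from the C# answer, the coefficient g(d) = 1 / (1 + (d / k)^2), and clamped borders. The name peronaMalikStep is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One explicit Perona-Malik step on a row-major grayscale image using the
// 4-neighbour scheme and g(d) = 1 / (1 + (d / k)^2).
// peronaMalikStep is a hypothetical name; borders are handled by clamping.
std::vector<float> peronaMalikStep(const std::vector<float>& img, int width, int height,
                                   float k, float lambda) {
    assert(width > 0 && height > 0 && k > 0.0f);
    assert(img.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    std::vector<float> out(img.size());
    // Read a pixel, clamping coordinates that fall outside the image.
    auto at = [&](int x, int y) {
        if (x < 0) x = 0;
        if (x >= width) x = width - 1;
        if (y < 0) y = 0;
        if (y >= height) y = height - 1;
        return img[static_cast<std::size_t>(y) * width + x];
    };
    // Flux g(d) * d for a neighbour difference d.
    auto flux = [k](float d) {
        float r = d / k;
        return d / (1.0f + r * r);
    };
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float c = at(x, y);
            float sum = flux(at(x, y - 1) - c) + flux(at(x, y + 1) - c) +
                        flux(at(x - 1, y) - c) + flux(at(x + 1, y) - c);
            out[static_cast<std::size_t>(y) * width + x] = c + lambda * sum;
        }
    }
    return out;
}
```

A constant image should come back unchanged, and an isolated bright pixel should lose a little intensity to its neighbours. Results that are all zero or NaN point to a type or scaling problem rather than to the diffusion itself.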

Neural Network not learning - MNIST data - Handwriting recognition

I have written a neural network program. It works for logic gates, but when I try to use it to recognize handwritten digits, it simply does not learn.
Please find the code below:
// This is a single neuron; this might be necessary in order to understand remaining code
typedef struct SingleNeuron
{
double outputValue;
std::vector<double> weight;
std::vector<double> deltaWeight;
double gradient;
double sum;
}SingleNeuron;
Then I initialize the net. I set the weights to random values between -0.5 and +0.5, sum to 0, and deltaWeight to 0.
Then comes the feed-forward pass:
for (unsigned i = 0; i < inputValues.size(); ++i)
{
neuralNet[0][i].outputValue = inputValues[i];
neuralNet[0][i].sum = 0.0;
// std::cout << "o/p Val = " << neuralNet[0][i].outputValue << std::endl;
}
for (unsigned i = 1; i < neuralNet.size(); ++i)
{
std::vector<SingleNeuron> prevLayerNeurons = neuralNet[i - 1];
unsigned j = 0;
double thisNeuronOPVal = 0;
// std::cout << std::endl;
for (j = 0; j < neuralNet[i].size() - 1; ++j)
{
double sum = 0;
for (unsigned k = 0; k < prevLayerNeurons.size(); ++k)
{
sum += prevLayerNeurons[k].outputValue * prevLayerNeurons[k].weight[j];
}
neuralNet[i][j].sum = sum;
neuralNet[i][j].outputValue = TransferFunction(sum);
// std::cout << neuralNet[i][j].outputValue << "\t";
}
// std::cout << std::endl;
}
My transfer function and its derivative are given at the end.
After this I try to back-propagate using:
// calculate output layer gradients
for (unsigned i = 0; i < outputLayer.size() - 1; ++i)
{
double delta = actualOutput[i] - outputLayer[i].outputValue;
outputLayer[i].gradient = delta * TransferFunctionDerivative(outputLayer[i].sum);
}
// std::cout << "Found Output gradients "<< std::endl;
// calculate hidden layer gradients
for (unsigned i = neuralNet.size() - 2; i > 0; --i)
{
std::vector<SingleNeuron>& hiddenLayer = neuralNet[i];
std::vector<SingleNeuron>& nextLayer = neuralNet[i + 1];
for (unsigned j = 0; j < hiddenLayer.size(); ++j)
{
double dow = 0.0;
for (unsigned k = 0; k < nextLayer.size() - 1; ++k)
{
dow += nextLayer[k].gradient * hiddenLayer[j].weight[k];
}
hiddenLayer[j].gradient = dow * TransferFunctionDerivative(hiddenLayer[j].sum);
}
}
// std::cout << "Found hidden layer gradients "<< std::endl;
// from output to 1st hidden layer, update all weights
for (unsigned i = neuralNet.size() - 1; i > 0; --i)
{
std::vector <SingleNeuron>& currentLayer = neuralNet[i];
std::vector <SingleNeuron>& prevLayer = neuralNet[i - 1];
for (unsigned j = 0; j < currentLayer.size() - 1; ++j)
{
for (unsigned k = 0; k < prevLayer.size(); ++k)
{
SingleNeuron& thisNeueon = prevLayer[k];
double oldDeltaWeight = thisNeueon.deltaWeight[j];
double newDeltaWeight = ETA * thisNeueon.outputValue * currentLayer[j].gradient + (ALPHA * oldDeltaWeight);
thisNeueon.deltaWeight[j] = newDeltaWeight;
thisNeueon.weight[j] += newDeltaWeight;
}
}
}
These are the transfer function and its derivative:
double TransferFunction(double x)
{
double val;
//val = tanh(x);
val = 1 / (1 + exp(x * -1));
return val;
}
double TransferFunctionDerivative(double x)
{
//return 1 - x * x;
double val = exp(x * -1) / pow((exp(x * -1) + 1), 2);
return val;
}
One thing I observed: if I use the standard sigmoid function as my transfer function and pass the neuron's output to the transfer function, the result is INFINITY. But tanh(x) works fine with this value.
So if I use 1/(1+e^(-x)) as the transfer function, I have to pass the sum of the net inputs, and with tanh as the transfer function I have to pass the output of the current neuron.
I do not completely understand why this is the case; maybe this calls for a separate question.
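The difference comes from which quantity each derivative formula expects. The sigmoid derivative in the code above is written in terms of the weighted sum, while the commented-out 1 - x * x is the tanh derivative written in terms of the neuron's output. Either derivative can be expressed both ways, as this small sketch shows; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

double sigmoid(double sum) { return 1.0 / (1.0 + std::exp(-sum)); }

// d(sigmoid)/d(sum), given the weighted sum (equivalent to the formula in the question).
double sigmoidDerivFromSum(double sum) {
    double out = sigmoid(sum);
    return out * (1.0 - out);
}

// The same derivative, given the neuron's output instead of the sum.
double sigmoidDerivFromOutput(double out) {
    assert(out >= 0.0 && out <= 1.0);
    return out * (1.0 - out);
}

// d(tanh)/d(sum), given the output; this is the 1 - x * x form.
double tanhDerivFromOutput(double out) {
    assert(out >= -1.0 && out <= 1.0);
    return 1.0 - out * out;
}
```

What matters is that the argument passed in matches the form of the formula: a sum for the "from sum" version, an activation for the "from output" version.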
But this question is really about something else: the network works for logic gates but not for character recognition.
I have tried many combinations of learning rate, acceleration, number of hidden layers and their sizes. Please find the results below:
AvgErr : 0.299399 #Pass799
AvgErr : 0.305071 #Pass809
AvgErr : 0.303046 #Pass819
AvgErr : 0.299569 #Pass829
AvgErr : 0.30413 #Pass839
AvgErr : 0.304165 #Pass849
AvgErr : 0.300529 #Pass859
AvgErr : 0.302973 #Pass869
AvgErr : 0.299238 #Pass879
AvgErr : 0.304708 #Pass889
AvgErr : 0.30068 #Pass899
AvgErr : 0.302582 #Pass909
AvgErr : 0.301767 #Pass919
AvgErr : 0.303167 #Pass929
AvgErr : 0.299551 #Pass939
AvgErr : 0.301295 #Pass949
AvgErr : 0.300651 #Pass959
AvgErr : 0.297867 #Pass969
AvgErr : 0.304221 #Pass979
AvgErr : 0.303702 #Pass989
After looking at the results you might think the network is simply stuck in a local minimum, but please read on:
Input = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
Output = 0.0910903, 0.105674, 0.064575, 0.0864824, 0.128682, 0.0878434, 0.0946296, 0.154405, 0.0678767, 0.0666924
Input = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Output = 0.0916106, 0.105958, 0.0655508, 0.086579, 0.126461, 0.0884082, 0.110953, 0.163343, 0.0689315, 0.0675822
Input = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
Output = 0.105344, 0.105021, 0.0659517, 0.0858077, 0.123104, 0.0884107, 0.116917, 0.161911, 0.0693426, 0.0675156
Input = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
Output = 0.107113, 0.101838, 0.0641632, 0.0967766, 0.117149, 0.085271, 0.11469, 0.153649, 0.0672772, 0.0652416
Above is the output of epochs #996, #997, #998 and #999.
So the network is simply not learning. For this example I used ALPHA = 0.4, ETA = 0.7, and 10 hidden layers of 100 neurons each; the average is over 10 epochs. If you are worried about the learning rate being 0.4 or about so many hidden layers, I have already tried variations. For example, with a learning rate of 0.1 and 4 hidden layers of 16 neurons each:
Input = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
Output = 0.0883238, 0.0983253, 0.0613749, 0.0809751, 0.124972, 0.0897194, 0.0911235, 0.179984, 0.0681346, 0.0660039
Input = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Output = 0.0868767, 0.0966924, 0.0612488, 0.0798343, 0.120353, 0.0882381, 0.111925, 0.169309, 0.0676711, 0.0656819
Input = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
Output = 0.105252, 0.0943837, 0.0604416, 0.0781779, 0.116231, 0.0858496, 0.108437, 0.1588, 0.0663156, 0.0645477
Input = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
Output = 0.102023, 0.0914957, 0.059178, 0.09339, 0.111851, 0.0842454, 0.104834, 0.149892, 0.0651799, 0.063558
I am sure that I have missed something, but I cannot figure out what. I have read Tom Mitchell's algorithm many times, but I don't know what is wrong. Every example I solve by hand works! (Please don't ask me to solve MNIST images by hand ;) ) I do not know where to change the code or what to do. Please help.
EDIT -- Uploading more data as per suggestions in comments
1 hidden layer of 32 neurons: still no learning.
Expected output: the inputs are images of the digits 0-9, so the target is a simple vector in which the bit for the current digit is 1 and all others are 0. I would want the output to be as close to 1 as possible for that bit and close to 0 for the others. For example, if the input is Input = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0], I would want the output to be something like Output = 0.002023, 0.0914957, 0.059178, 0.09339, 0.011851, 0.0842454, 0.924834, 0.049892, 0.0651799, 0.063558 (this is a rough, hand-made example).
Here are links to other researchers' work:
Stanford
SourceForge -- This is rather a library
Besides these two, many other sites show demos.
Things work quite well for them. If I set my network parameters (ALPHA, ETA) like theirs, I do not get results like theirs, which reassures me that something is wrong with my code.
EDIT 2
Adding more failure cases
Acceleration: 0.7, learning rate: 0.1
Acceleration: 0.7, learning rate: 0.6
In both of the above cases there were 3 hidden layers of 32 neurons each.
This answer is copied from the OP's comment on the question.
I solved the puzzle. I had made the worst possible mistake: I was feeding the network the wrong input. I used OpenCV to read the images, and instead of reshape I was using resize, so the input was a linear interpolation of the images. There was nothing wrong with the code. My network is 784 - 65 - 10 and gives 96.43% accuracy.
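The distinction matters because cv::Mat::reshape only changes the shape and keeps every pixel, while cv::resize resamples the image. What the network needs is a plain flattening, which can be illustrated without OpenCV. In the sketch below, flattenImage is a hypothetical helper, and scaling the pixels to [0, 1] is an assumption rather than something stated in the answer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Flattens a rows x cols grayscale image into one input vector, row by row.
// No pixel is interpolated or dropped: a 28 x 28 image yields exactly 784 values.
// Scaling to [0, 1] is an assumption made here, not something stated in the answer.
std::vector<double> flattenImage(const std::vector<std::vector<std::uint8_t>>& image) {
    assert(!image.empty());
    const std::size_t cols = image[0].size();
    std::vector<double> input;
    input.reserve(image.size() * cols);
    for (const auto& row : image) {
        assert(row.size() == cols);  // every row must have the same width
        for (std::uint8_t pixel : row) {
            input.push_back(pixel / 255.0);
        }
    }
    return input;
}
```

A quick sanity check before training is to confirm that the flattened vector has as many entries as the input layer has neurons (784 here) and that its values are the original pixels in row order.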