Neural Network not learning - MNIST data - Handwriting recognition - C++

I have written a neural network program. It works for logic gates, but when I try to use it for recognizing handwritten digits, it simply does not learn.
Please find the code below:
// This is a single neuron; this might be necessary in order to understand the remaining code
typedef struct SingleNeuron
{
    double outputValue;
    std::vector<double> weight;
    std::vector<double> deltaWeight;
    double gradient;
    double sum;
} SingleNeuron;
Then I initialize the net: weights are set to random values between -0.5 and +0.5, sum to 0, and deltaWeight to 0.
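A minimal sketch of that initialization step (the helper name and loop structure here are illustrative, not the original program's code):

#include <cstdlib>
#include <vector>

// Illustrative helper: give every neuron in a layer `numOutputs` outgoing
// weights in [-0.5, +0.5], with sum, gradient and deltaWeight zeroed.
void InitializeLayer(std::vector<SingleNeuron>& layer, unsigned numOutputs)
{
    for (SingleNeuron& n : layer) {
        n.outputValue = 0.0;
        n.sum = 0.0;
        n.gradient = 0.0;
        n.deltaWeight.assign(numOutputs, 0.0);
        n.weight.resize(numOutputs);
        for (double& w : n.weight)
            w = (double)rand() / RAND_MAX - 0.5; // uniform in [-0.5, +0.5]
    }
}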
Then comes the FeedForward:
for (unsigned i = 0; i < inputValues.size(); ++i)
{
    neuralNet[0][i].outputValue = inputValues[i];
    neuralNet[0][i].sum = 0.0;
    // std::cout << "o/p Val = " << neuralNet[0][i].outputValue << std::endl;
}

for (unsigned i = 1; i < neuralNet.size(); ++i)
{
    std::vector<SingleNeuron> prevLayerNeurons = neuralNet[i - 1];
    unsigned j = 0;
    double thisNeuronOPVal = 0;
    // std::cout << std::endl;
    for (j = 0; j < neuralNet[i].size() - 1; ++j)
    {
        double sum = 0;
        for (unsigned k = 0; k < prevLayerNeurons.size(); ++k)
        {
            sum += prevLayerNeurons[k].outputValue * prevLayerNeurons[k].weight[j];
        }
        neuralNet[i][j].sum = sum;
        neuralNet[i][j].outputValue = TransferFunction(sum);
        // std::cout << neuralNet[i][j].outputValue << "\t";
    }
    // std::cout << std::endl;
}
My transfer function and its derivative are given at the end.
After this I try to back-propagate using:
// calculate output layer gradients
for (unsigned i = 0; i < outputLayer.size() - 1; ++i)
{
    double delta = actualOutput[i] - outputLayer[i].outputValue;
    outputLayer[i].gradient = delta * TransferFunctionDerivative(outputLayer[i].sum);
}
// std::cout << "Found Output gradients "<< std::endl;

// calculate hidden layer gradients
for (unsigned i = neuralNet.size() - 2; i > 0; --i)
{
    std::vector<SingleNeuron>& hiddenLayer = neuralNet[i];
    std::vector<SingleNeuron>& nextLayer = neuralNet[i + 1];
    for (unsigned j = 0; j < hiddenLayer.size(); ++j)
    {
        double dow = 0.0;
        for (unsigned k = 0; k < nextLayer.size() - 1; ++k)
        {
            dow += nextLayer[k].gradient * hiddenLayer[j].weight[k];
        }
        hiddenLayer[j].gradient = dow * TransferFunctionDerivative(hiddenLayer[j].sum);
    }
}
// std::cout << "Found hidden layer gradients "<< std::endl;

// from output to 1st hidden layer, update all weights
for (unsigned i = neuralNet.size() - 1; i > 0; --i)
{
    std::vector<SingleNeuron>& currentLayer = neuralNet[i];
    std::vector<SingleNeuron>& prevLayer = neuralNet[i - 1];
    for (unsigned j = 0; j < currentLayer.size() - 1; ++j)
    {
        for (unsigned k = 0; k < prevLayer.size(); ++k)
        {
            SingleNeuron& thisNeuron = prevLayer[k];
            double oldDeltaWeight = thisNeuron.deltaWeight[j];
            double newDeltaWeight = ETA * thisNeuron.outputValue * currentLayer[j].gradient + (ALPHA * oldDeltaWeight);
            thisNeuron.deltaWeight[j] = newDeltaWeight;
            thisNeuron.weight[j] += newDeltaWeight;
        }
    }
}
These are the TransferFunction and its derivative:
double TransferFunction(double x)
{
    double val;
    //val = tanh(x);
    val = 1 / (1 + exp(x * -1));
    return val;
}

double TransferFunctionDerivative(double x)
{
    //return 1 - x * x;
    double val = exp(x * -1) / pow((exp(x * -1) + 1), 2);
    return val;
}
One thing I observed: if I use the standard sigmoid function as my transfer function AND pass the output of the neuron to the transfer function, the result is INFINITY. But tanh(x) works fine with that value.
So if I am using 1/(1+e^(-x)) as the transfer function I have to pass the sum of the net inputs, and with tanh as my transfer function I have to pass the output of the current neuron.
I do not completely understand why this is the way it is; maybe this calls for a different question.
But this question is really about something else: NETWORK IS WORKING FOR LOGIC GATES BUT NOT FOR CHARACTER RECOGNITION
I have tried many variations/combinations of learning rate, acceleration, and the number and sizes of hidden layers. Please find the results below:
AvgErr: 0.299399 #Pass799
AvgErr: 0.305071 #Pass809
AvgErr: 0.303046 #Pass819
AvgErr: 0.299569 #Pass829
AvgErr: 0.30413 #Pass839
AvgErr: 0.304165 #Pass849
AvgErr: 0.300529 #Pass859
AvgErr: 0.302973 #Pass869
AvgErr: 0.299238 #Pass879
AvgErr: 0.304708 #Pass889
AvgErr: 0.30068 #Pass899
AvgErr: 0.302582 #Pass909
AvgErr: 0.301767 #Pass919
AvgErr: 0.303167 #Pass929
AvgErr: 0.299551 #Pass939
AvgErr: 0.301295 #Pass949
AvgErr: 0.300651 #Pass959
AvgErr: 0.297867 #Pass969
AvgErr: 0.304221 #Pass979
AvgErr: 0.303702 #Pass989
After looking at the results you might feel this guy is simply stuck in a local minimum, but please wait and read through:
Input = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
Output = 0.0910903, 0.105674, 0.064575, 0.0864824, 0.128682, 0.0878434, 0.0946296, 0.154405, 0.0678767, 0.0666924
Input = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Output = 0.0916106, 0.105958, 0.0655508, 0.086579, 0.126461, 0.0884082, 0.110953, 0.163343, 0.0689315, 0.0675822
Input = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
Output = 0.105344, 0.105021, 0.0659517, 0.0858077, 0.123104, 0.0884107, 0.116917, 0.161911, 0.0693426, 0.0675156
Input = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
Output = 0.107113, 0.101838, 0.0641632, 0.0967766, 0.117149, 0.085271, 0.11469, 0.153649, 0.0672772, 0.0652416
Above are the outputs of epochs #996, #997, #998 and #999.
So the network is simply not learning. For this example I have used ALPHA = 0.4, ETA = 0.7, 10 hidden layers each of 100 neurons, and the average is taken over 10 epochs. If you are worried about the learning rate being 0.4, or about having so many hidden layers, I have already tried variations of them. For example, with a learning rate of 0.1 and 4 hidden layers, each of 16 neurons:
Input = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
Output = 0.0883238, 0.0983253, 0.0613749, 0.0809751, 0.124972, 0.0897194, 0.0911235, 0.179984, 0.0681346, 0.0660039
Input = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Output = 0.0868767, 0.0966924, 0.0612488, 0.0798343, 0.120353, 0.0882381, 0.111925, 0.169309, 0.0676711, 0.0656819
Input = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
Output = 0.105252, 0.0943837, 0.0604416, 0.0781779, 0.116231, 0.0858496, 0.108437, 0.1588, 0.0663156, 0.0645477
Input = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
Output = 0.102023, 0.0914957, 0.059178, 0.09339, 0.111851, 0.0842454, 0.104834, 0.149892, 0.0651799, 0.063558
I am so damn sure that I have missed something. I am not able to figure it out. I have read Tom Mitchell's algorithm so many times, but I don't know what is wrong. Whatever example I solve by hand works! (Please don't ask me to solve MNIST data images by hand ;) ) I do not know where to change the code or what to do. Please help out.
EDIT -- Uploading more data as per suggestions in comments
1 hidden layer of 32 neurons -- still no learning.
Expected output -- the input is an image of a digit between 0-9, so the target is a simple vector describing which digit the current image is: that bit is 1, all others are 0. So I would want the output for that particular bit to be as close to 1 as possible, with the others close to 0. For example, if the input is Input = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0], I would want the output to be something like Output = 0.002023, 0.0914957, 0.059178, 0.09339, 0.011851, 0.0842454, 0.924834, 0.049892, 0.0651799, 0.063558 (this is vague, hand-generated).
Here are links to other researchers' work:
Stanford
SourceForge -- This is rather a library
Not only these two; there are many more sites showing such demos.
Things are working quite fine for them. If I set my network parameters (ALPHA, ETA) like theirs I am not getting results like them, so this is reassurance that something is wrong with my code.
EDIT 2
Adding more failure cases
Acceleration - 0.7, Learning Rate 0.1
Acceleration - 0.7, Learning Rate 0.6
In both of the above cases there were 3 hidden layers, each of 32 neurons.

This answer is copied from the OP's comment on the question.
I solved the puzzle. I had made the worst possible mistake: I was giving the wrong input. I used OpenCV to scan the images, and instead of using reshape I was using resize, so the input was a linear interpolation of the images. So my input was wrong; there was nothing wrong with the code. My network is 784 - 65 - 10, giving 96.43% accuracy.
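In OpenCV terms the bug can be illustrated like this (a sketch, not the OP's actual code: cv::resize resamples the pixel values, while cv::Mat::reshape only reinterprets the existing ones):

#include <opencv2/opencv.hpp>

int main() {
    cv::Mat img = cv::imread("digit.png", cv::IMREAD_GRAYSCALE); // a 28x28 MNIST digit

    // Wrong for flattening: resize() interpolates the pixel values.
    cv::Mat wrong;
    cv::resize(img, wrong, cv::Size(784, 1)); // 1x784 of *resampled* pixels

    // Right: reshape() keeps the pixels and just changes the matrix shape.
    cv::Mat input = img.reshape(1, 1); // 1 channel, 1 row -> 1x784, same data
    return 0;
}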

Related

I'm trying to parallelize the rat maze problem using OpenMP but it is taking too much time. Can anyone help me?

The parallelized code below is taking much more time compared to the original one. I have used a BFS approach to solve the problem. I am getting the correct output, but it is taking too much time.
// (x, y) represents matrix cell coordinates, and
// dist represents their minimum distance from the source
struct Node
{
    int x, y, dist;
};

// Below arrays detail all four possible movements from a cell
int row[] = { -1, 0, 0, 1 };
int col[] = { 0, -1, 1, 0 };

// Function to check if it is possible to go to position (row, col)
// from the current position. The function returns false if (row, col)
// is not a valid position or has a value 0 or has already been visited.
bool isValid(vector<vector<int>> const &mat, vector<vector<bool>> &visited, int row, int col)
{
    return (row >= 0 && row < mat.size()) && (col >= 0 && col < mat[0].size())
        && mat[row][col] && !visited[row][col];
}
// Find the shortest possible route in a matrix mat from source
// cell (i, j) to destination cell (x, y)
int findShortestPathLength(vector<vector<int>> const &mat, pair<int, int> &src,
                           pair<int, int> &dest)
{
    if (mat.size() == 0 || mat[src.first][src.second] == 0 ||
        mat[dest.first][dest.second] == 0) {
        return -1;
    }
    // `M × N` matrix
    int M = mat.size();
    int N = mat[0].size();
    // construct an `M × N` matrix to keep track of visited cells
    vector<vector<bool>> visited;
    visited.resize(M, vector<bool>(N));
    // create an empty queue
    queue<Node> q;
    // get source cell (i, j)
    int i = src.first;
    int j = src.second;
    // mark the source cell as visited and enqueue the source node
    visited[i][j] = true;
    q.push({i, j, 0});
    // stores length of the shortest path from source to destination
    int min_dist = INT_MAX;
    // loop till queue is empty
    while (!q.empty())
    {
        // dequeue front node and process it
        Node node = q.front();
        q.pop();
        // (i, j) represents a current cell, and `dist` stores its
        // minimum distance from the source
        int i = node.x, j = node.y, dist = node.dist;
        // if the destination is found, update `min_dist` and stop
        if (i == dest.first && j == dest.second)
        {
            min_dist = dist;
            break;
        }
        // check for all four possible movements from the current cell
        // and enqueue each valid movement
        #pragma omp parallel for
        for (int k = 0; k < 4; k++)
        {
            // check if it is possible to go to position
            // (i + row[k], j + col[k]) from current position
            #pragma omp task shared(i,visited,j)
            {
                if (isValid(mat, visited, i + row[k], j + col[k]))
                {
                    // mark next cell as visited and enqueue it
                    visited[i + row[k]][j + col[k]] = true;
                    q.push({ i + row[k], j + col[k], dist + 1 });
                }
            }
        }
    }
    if (min_dist != INT_MAX) {
        return min_dist;
    }
    return -1;
}
The main part of the code only contains the matrix and the source and destination coordinates:
int main()
{
    vector<vector<int>> mat =
    {
        { 1, 1, 1, 1, 1, 0, 0, 1, 1, 1 },
        { 0, 1, 1, 1, 1, 1, 0, 1, 0, 1 },
        { 0, 0, 1, 0, 1, 1, 1, 0, 0, 1 },
        { 1, 0, 1, 1, 1, 0, 1, 1, 0, 1 },
        { 0, 0, 0, 1, 0, 0, 0, 1, 0, 1 },
        { 1, 0, 1, 1, 1, 0, 0, 1, 1, 0 },
        { 0, 0, 0, 0, 1, 0, 0, 1, 0, 1 },
        { 0, 1, 1, 1, 1, 1, 1, 1, 0, 0 },
        { 1, 1, 1, 1, 1, 0, 0, 1, 1, 1 },
        { 0, 0, 1, 0, 0, 1, 1, 0, 0, 1 },
    };
    pair<int, int> src = make_pair(0, 0);
    pair<int, int> dest = make_pair(7, 5);

    int min_dist = findShortestPathLength(mat, src, dest);
    if (min_dist != -1)
    {
        cout << min_dist << endl;
    }
    else {
        cout << "Destination cannot be reached from a given source" << endl;
    }
    return 0;
}
I have used a shared variable, but it is still taking too much time.
Can anyone help me?
As people have remarked, you only get parallelism over the 4 directions. A better approach is to keep a set of sets of points: first the starting point, then all points you can reach from there in one step, then the ones you can reach in two steps, and so on.
With this approach you get more parallelism: you're building what's known as a "wave front", and all the points in it can be tackled simultaneously.
auto last_level = reachable.back();
vector<int> newly_reachable;
for ( auto n : last_level ) {
    const auto& row = graph.row(n);
    for ( auto j : row ) {
        if ( not reachable.has(j)
             and not any_of
                 ( newly_reachable.begin(), newly_reachable.end(),
                   [j] (int i) { return i==j; } ) )
            newly_reachable.push_back(j);
    }
}
if (newly_reachable.size()>0)
    reachable.push_back(newly_reachable);
(I have written this for a general DAG; writing your maze as a DAG is an exercise for the reader.)
However, this approach still has big problems: if two points on the current wave front decide to add the same new point, you have to resolve that.
For a very parallel approach you need to abandon the "push" model of adding new points altogether and go to a "pull" model: in every "distance" iteration you loop over all points and ask: if I am not reachable yet, was one of my neighbors reachable? If so, mark me as reachable in one step more than that neighbor.
If you think about that last approach for a second (or two), you'll see that you are essentially doing a sequence of matrix-vector products with the adjacency matrix and the currently reachable set, except that you replace the scalar "+" operation by "min" and the scalar "*" by "+1". Read any tutorial about the interpretation of graph operations as linear algebra. Except that it's not really linear.
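A minimal sketch of this pull model on the maze grid (my illustration, not a drop-in replacement; it reuses the mat layout and the row/col direction arrays from the question):

#include <climits>
#include <utility>
#include <vector>
using std::pair; using std::vector;

// Pull-model BFS: returns the shortest distance, or INT_MAX if unreachable.
int wavefrontDistance(const vector<vector<int>>& mat,
                      pair<int,int> src, pair<int,int> dest)
{
    const int M = mat.size(), N = mat[0].size();
    const int row[] = { -1, 0, 0, 1 };
    const int col[] = { 0, -1, 1, 0 };

    // dist[i][j] = steps from the source, or INT_MAX if not reached yet
    vector<vector<int>> dist(M, vector<int>(N, INT_MAX));
    dist[src.first][src.second] = 0;

    bool changed = true;
    for (int level = 0; changed; ++level) {
        changed = false;
        auto next = dist; // double buffer: read dist, write next, so passes don't race
        #pragma omp parallel for collapse(2) reduction(||:changed)
        for (int i = 0; i < M; i++)
            for (int j = 0; j < N; j++) {
                if (dist[i][j] != INT_MAX || mat[i][j] == 0)
                    continue; // already reached, or a wall
                // "pull": was one of my four neighbours reached in the last level?
                for (int k = 0; k < 4; k++) {
                    int ni = i + row[k], nj = j + col[k];
                    if (ni >= 0 && ni < M && nj >= 0 && nj < N && dist[ni][nj] == level) {
                        next[i][j] = level + 1;
                        changed = true;
                    }
                }
            }
        dist.swap(next);
    }
    return dist[dest.first][dest.second];
}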

boost odeint gives very different values from Python3 scipy

I am trying to integrate a very simple ODE using boost odeint.
For some cases, the values are the same (or very similar) to Python's scipy odeint function.
But for other initial conditions, the values are vastly different.
The function is: d(uhat) / dt = - alpha^2 * kappa^2 * uhat
where alpha is 1.0, and kappa is a constant depending on the case (see values below).
I have tried several different ODE solvers from Boost, and none seemed to work.
Update: The code below is now working.
In the code below, the first case gives nearly identical results, the 2nd case is kind of trivial (but reassuring), and the 3rd case gives erroneous answers in the C++ version.
Here is the C++ version:
#include <boost/numeric/odeint.hpp>
#include <cstdlib>
#include <iostream>
#include <vector>

typedef double State_Type; // the state is a single scalar
typedef boost::numeric::odeint::runge_kutta_dopri5<double> Stepper_Type;

struct ResultsObserver
{
    std::ostream& m_out;
    ResultsObserver( std::ostream &out ) : m_out( out ) { }
    void operator()(const State_Type& x, double t) const
    {
        m_out << t << " : " << x << std::endl;
    }
};

// The rhs: d_uhat_dt = - alpha^2 * kappa^2 * uhat
class Eq {
public:
    Eq(double alpha, double kappa)
        : m_constant(-1.0 * alpha * alpha * kappa * kappa) {}

    void operator()(double uhat, double& d_uhat_dt, const double t) const
    {
        d_uhat_dt = m_constant * uhat;
    }

private:
    double m_constant;
};

void integrate(double kappa, double initValue)
{
    const unsigned numTimeIncrements = 100;
    const double dt = 0.1;
    const double startTime = 0.0;
    const double endTime = numTimeIncrements * dt; // 0..10, matching the Python version
    const double alpha = 1.0;

    double uhat = initValue;   // Init condition
    std::vector<double> uhats; // Results vector
    Eq rhs(alpha, kappa);      // The RHS of the ODE

    //This is what I was doing that did not work
    //
    //boost::numeric::odeint::runge_kutta_dopri5<double> stepper;
    //for(unsigned step = 0; step < numTimeIncrements; ++step) {
    //    uhats.push_back(uhat);
    //    stepper.do_step(rhs, uhat, step*dt, dt);
    //}

    //This works (uhats stays empty in this variant; the observer prints instead)
    integrate_const(
        boost::numeric::odeint::make_dense_output<Stepper_Type>( 1E-12, 1E-6 ),
        rhs, uhat, startTime, endTime, dt, ResultsObserver(std::cout)
    );

    std::cout << "kappa = " << kappa << ", initial value = " << initValue << std::endl;
    for(auto val : uhats)
        std::cout << val << std::endl;
    std::cout << "---" << std::endl << std::endl;
}

int main() {
    const double kappa1 = 0.062831853071796;
    const double initValue1 = -187.097241230045967;
    integrate(kappa1, initValue1);

    const double kappa2 = 28.274333882308138;
    const double initValue2 = 0.000000000000;
    integrate(kappa2, initValue2);

    const double kappa3 = 28.337165735379934;
    const double initValue3 = -0.091204068895190;
    integrate(kappa3, initValue3);

    return EXIT_SUCCESS;
}
and the corresponding Python3 version:
#!/usr/bin/env python3
import numpy as np
from scipy.integrate import odeint

def Eq(uhat, t, kappa, a):
    d_uhat = -a**2 * kappa**2 * uhat
    return d_uhat

def integrate(kappa, initValue):
    dt = 0.1
    t = np.arange(0, 10, dt)
    a = 1.0
    print("kappa = " + str(kappa))
    print("initValue = " + str(initValue))
    uhats = odeint(Eq, initValue, t, args=(kappa, a))
    print(uhats)
    print("---")
    print()

kappa1 = 0.062831853071796
initValue1 = -187.097241230045967
integrate(kappa1, initValue1)

kappa2 = 28.274333882308138
initValue2 = 0.000000000000
integrate(kappa2, initValue2)

kappa3 = 28.337165735379934
initValue3 = -0.091204068895190
integrate(kappa3, initValue3)
With boost::odeint you should use the integrate... interface functions.
What happens in your code using do_step is that you use the dopri5 method as a fixed-step method with your provided step size. In connection with the large coefficient L = kappa^2 of about 800, the product L*dt = 80 is far outside the stability region of the method, whose boundary lies between values of 2 and 3.5. Divergence or oscillating divergence is the expected result.
What should happen, and what is implemented in scipy's odeint, ode-dopri5 and solve_ivp-RK45 methods, is a method with adaptive step size. Internally, the optimal step size for the given error tolerances is chosen, and in each internal step an interpolation polynomial is determined. The output or observed values are computed with this interpolator, also called "dense output" if the interpolation function is returned from the integrator. One result of the error control is that the step size is always inside the stability region; for stiff problems with medium error tolerances it is on or close to the boundary of this region.
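For example, a minimal adaptive run with odeint could look like this (a sketch; make_controlled builds an error-controlled dopri5 stepper with the same tolerances the question passes to make_dense_output):

#include <boost/numeric/odeint.hpp>
#include <iostream>

int main() {
    using namespace boost::numeric::odeint;
    typedef runge_kutta_dopri5<double> Stepper;

    const double kappa = 28.337165735379934; // the failing case 3
    double uhat = -0.091204068895190;

    // rhs: d_uhat/dt = -kappa^2 * uhat (alpha = 1)
    auto rhs = [=](const double& u, double& dudt, double /*t*/) {
        dudt = -kappa * kappa * u;
    };

    // The controlled stepper shrinks the internal step until the local error
    // meets the tolerances, which keeps L*dt inside the stability region.
    integrate_adaptive(make_controlled<Stepper>(1e-12, 1e-6), rhs, uhat,
                       0.0, 10.0, 0.1,
                       [](const double& u, double t) {
                           std::cout << t << " : " << u << "\n";
                       });
}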
This has all the hallmarks of precision issues.
Simply replacing double with long double gives:
Live On Compiler Explorer
#include <boost/numeric/odeint.hpp>
#include <boost/multiprecision/cpp_bin_float.hpp>
#include <boost/multiprecision/cpp_dec_float.hpp>
#include <fmt/ranges.h>
#include <fmt/ostream.h>
#include <iostream>

using Value = long double;
//using Value = boost::multiprecision::number<
//    boost::multiprecision::backends::cpp_bin_float<100>,
//    boost::multiprecision::et_off>;

// The rhs: d_uhat_dt = - alpha^2 * kappa^2 * uhat
class Eq {
public:
    Eq(Value alpha, Value kappa)
        : m_constant(-1.0 * alpha * alpha * kappa * kappa)
    {
    }

    void operator()(Value uhat, Value& d_uhat_dt, const Value) const
    {
        d_uhat_dt = m_constant * uhat;
    }

private:
    Value m_constant;
};

void integrate(Value const kappa, Value const initValue)
{
    const unsigned numTimeIncrements = 100;
    const Value dt = 0.1;
    const Value alpha = 1.0;

    Value uhat = initValue;   // Init condition
    std::vector<Value> uhats; // Results vector
    Eq rhs(alpha, kappa);     // The RHS of the ODE

    boost::numeric::odeint::runge_kutta_dopri5<Value> stepper;
    for (unsigned step = 0; step < numTimeIncrements; ++step) {
        uhats.push_back(uhat);
        auto&& stepdt = step * dt;
        stepper.do_step(rhs, uhat, stepdt, dt);
    }

    fmt::print("kappa = {}, initial value = {}\n{}\n---\n", kappa, initValue,
               uhats);
}

int main() {
    for (auto [kappa, initValue] :
         {
             std::pair //
             { 0.062831853071796 , -187.097241230045967 },
             {28.274333882308138 , 0.000000000000 },
             {28.337165735379934 , -0.091204068895190 },
         }) //
    {
        integrate(kappa, initValue);
    }
}
Prints
kappa = 0.0628319, initial value = -187.097
[-187.097, -187.023, -186.95, -186.876, -186.802, -186.728, -186.655, -186.581, -186.507, -186.434, -186.36, -186.287, -186.213, -186.139, -186.066, -185.993, -185.919, -185.846, -185.772, -185.699, -185.626, -185.553, -185.479, -185.406, -185.333, -185.26, -185.187, -185.114, -185.04, -184.967, -184.894, -184.821, -184.748, -184.676, -184.603, -184.53, -184.457, -184.384, -184.311, -184.239, -184.166, -184.093, -184.021, -183.948, -183.875, -183.803, -183.73, -183.658, -183.585, -183.513, -183.44, -183.368, -183.296, -183.223, -183.151, -183.079, -183.006, -182.934, -182.862, -182.79, -182.718, -182.645, -182.573, -182.501, -182.429, -182.357, -182.285, -182.213, -182.141, -182.069, -181.998, -181.926, -181.854, -181.782, -181.71, -181.639, -181.567, -181.495, -181.424, -181.352, -181.281, -181.209, -181.137, -181.066, -180.994, -180.923, -180.852, -180.78, -180.709, -180.638, -180.566, -180.495, -180.424, -180.353, -180.281, -180.21, -180.139, -180.068, -179.997, -179.926]
---
kappa = 28.2743, initial value = 0
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
---
kappa = 28.3372, initial value = -0.0912041
[-0.0912041, -38364100, -1.61375e+16, -6.78809e+24, -2.85534e+33, -1.20107e+42, -5.0522e+50, -2.12516e+59, -8.93928e+67, -3.76022e+76, -1.5817e+85, -6.65327e+93, -2.79864e+102, -1.17722e+111, -4.95186e+119, -2.08295e+128, -8.76174e+136, -3.68554e+145, -1.55029e+154, -6.52114e+162, -2.74306e+171, -1.15384e+180, -4.85352e+188, -2.04159e+197, -8.58774e+205, -3.61235e+214, -1.5195e+223, -6.39163e+231, -2.68858e+240, -1.13092e+249, -4.75713e+257, -2.00104e+266, -8.41718e+274, -3.54061e+283, -1.48932e+292, -6.26469e+300, -2.63518e+309, -1.10846e+318, -4.66265e+326, -1.9613e+335, -8.25002e+343, -3.47029e+352, -1.45975e+361, -6.14028e+369, -2.58285e+378, -1.08645e+387, -4.57005e+395, -1.92235e+404, -8.08618e+412, -3.40137e+421, -1.43075e+430, -6.01833e+438, -2.53155e+447, -1.06487e+456, -4.47929e+464, -1.88417e+473, -7.92559e+481, -3.33382e+490, -1.40234e+499, -5.89881e+507, -2.48128e+516, -1.04373e+525, -4.39033e+533, -1.84675e+542, -7.76818e+550, -3.26761e+559, -1.37449e+568, -5.78166e+576, -2.432e+585, -1.023e+594, -4.30314e+602, -1.81008e+611, -7.61391e+619, -3.20272e+628, -1.34719e+637, -5.66684e+645, -2.3837e+654, -1.00268e+663, -4.21768e+671, -1.77413e+680, -7.4627e+688, -3.13911e+697, -1.32044e+706, -5.55429e+714, -2.33636e+723, -9.82768e+731, -4.13392e+740, -1.73889e+749, -7.31449e+757, -3.07677e+766, -1.29421e+775, -5.44399e+783, -2.28996e+792, -9.6325e+800, -4.05182e+809, -1.70436e+818, -7.16922e+826, -3.01567e+835, -1.26851e+844, -5.33587e+852]
---
As you can see, I made some simple attempts to get it to use Boost Multiprecision, but that didn't immediately work and may require someone with an actual understanding of the maths / ODEINT.

C++ Problem filling structs with values from another struct

I'm having an issue trying to figure out why I am not getting the correct functionality from a piece of code. I have looked around to try to find a solution, but I haven't been able to do so. Below is an example of my code:
//Structs
typedef struct
{
    int gene[60];
    int fitness;
} individual;

typedef struct
{
    int cond[5];
    int out;
} rule;

//Array of individuals
individual population[P];

int function(individual solution){
    int k = 0;
    //Array of rules
    rule rulebase[10];
    for (int i = 0; i < 10; i++){
        for (int j = 0; j < 5; j++){
            rulebase[i].cond[j] = solution.gene[k++];
        }
        rulebase[i].out = solution.gene[k++];
    }
    for (int i = 0; i < 5; i++){
        cout << rulebase[0].cond[i];
    }
    return 0; // placeholder; the original presumably returns a fitness value
}
The solution that is passed into the function is the first individual in 'population', and the gene array contains only binary numbers, for example:
gene = [0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1] //There will be 60 in total
The desired functionality is to fill the rule structures in the rulebase with the values found in the solution. For example, using the example above the first rule in the rulebase will have the values below in the 'cond' array:
[0, 0, 1, 0, 1]
and the 'out' will be the next integer in the solution:
[1]
Then the next rule will be filled with the next values in the solution the same way.
The problem I am having is that the code seems to be filling the 'cond' array of each rule with all of the values in the solution, as opposed to the desired way described above. For example, when I print the genes in 'rulebase[0]' I get:
[0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
As opposed to:
[0, 0, 1, 0, 1]
I can't seem to figure out why I am getting this problem, as the code looks to me like it should work. Any help would be greatly appreciated as I am seriously struggling!
A rule contains only 5 values in cond, not 12 as you show. It's just your code that prints the values of rulebase[0] that is wrong, i.e. it exceeds the array bounds and prints - in addition to the cond values of rulebase[0] - the value of out and the cond values of the next rule, which - in memory - come next.
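A quick way to convince yourself of that layout (an illustrative sketch, not the asker's code):

#include <cstddef>
#include <iostream>

struct rule {
    int cond[5];
    int out;
};

int main() {
    // out sits directly after cond[4]; in an array of rules, the next
    // rule's cond starts directly after out.
    std::cout << "offsetof(rule, out) = " << offsetof(rule, out) << '\n'; // typically 20
    std::cout << "sizeof(rule)        = " << sizeof(rule)        << '\n'; // typically 24
}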

iris to screen calculation for eye tracking

I'm currently experimenting with eye tracking. I've successfully built an iris tracking algorithm using OpenCV with contours and the Hough transform. But the next step is unclear to me. I want to know if the calculations I'm doing are correct for translating the center of an eye to the screen. The user's head has a fixed position.
What I want is an algorithm that works on all eyes, of course. Is there some kind of angle calculation, so that when the user looks further to the right, the position on the screen changes linearly?
What I do right now is:
First I let the user look at specific points and use RANSAC to detect the iris position that's closest to the position on the screen. I do that with four 2D points on the screen and on the iris. I'm using a homography for this to get the correct calculation.
void gaussian_elimination(float *input, int n){
    // ported to c from pseudocode in
    // http://en.wikipedia.org/wiki/Gaussian_elimination
    float * A = input;
    int i = 0;
    int j = 0;
    int m = n-1;
    while (i < m && j < n){
        // Find pivot in column j, starting in row i:
        int maxi = i;
        for(int k = i+1; k<m; k++){
            if(fabs(A[k*n+j]) > fabs(A[maxi*n+j])){
                maxi = k;
            }
        }
        if (A[maxi*n+j] != 0){
            //swap rows i and maxi, but do not change the value of i
            if(i!=maxi)
                for(int k=0;k<n;k++){
                    float aux = A[i*n+k];
                    A[i*n+k]=A[maxi*n+k];
                    A[maxi*n+k]=aux;
                }
            //Now A[i,j] will contain the old value of A[maxi,j].
            //divide each entry in row i by A[i,j]
            float A_ij=A[i*n+j];
            for(int k=0;k<n;k++){
                A[i*n+k]/=A_ij;
            }
            //Now A[i,j] will have the value 1.
            for(int u = i+1; u< m; u++){
                //subtract A[u,j] * row i from row u
                float A_uj = A[u*n+j];
                for(int k=0;k<n;k++){
                    A[u*n+k]-=A_uj*A[i*n+k];
                }
                //Now A[u,j] will be 0, since A[u,j] - A[i,j] * A[u,j] = A[u,j] - 1 * A[u,j] = 0.
            }
            i++;
        }
        j++;
    }

    //back substitution
    for(int i=m-2;i>=0;i--){
        for(int j=i+1;j<n-1;j++){
            A[i*n+m]-=A[i*n+j]*A[j*n+m];
            //A[i*n+j]=0;
        }
    }
}
ofMatrix4x4 findHomography(ofPoint src[4], ofPoint dst[4]){
    ofMatrix4x4 matrix;
    // create the equation system to be solved
    //
    // from: Multiple View Geometry in Computer Vision 2ed
    //       Hartley R. and Zisserman A.
    //
    // x' = xH
    // where H is the homography: a 3 by 3 matrix
    // that transformed to inhomogeneous coordinates for each point
    // gives the following equations for each point:
    //
    // x' * (h31*x + h32*y + h33) = h11*x + h12*y + h13
    // y' * (h31*x + h32*y + h33) = h21*x + h22*y + h23
    //
    // as the homography is scale independent we can let h33 be 1 (indeed any of the terms)
    // so for 4 points we have 8 equations for 8 terms to solve: h11 - h32
    // after ordering the terms it gives the following matrix
    // that can be solved with gaussian elimination:
    float P[8][9]={
        {-src[0].x, -src[0].y, -1, 0, 0, 0, src[0].x*dst[0].x, src[0].y*dst[0].x, -dst[0].x }, // h11
        { 0, 0, 0, -src[0].x, -src[0].y, -1, src[0].x*dst[0].y, src[0].y*dst[0].y, -dst[0].y }, // h12
        {-src[1].x, -src[1].y, -1, 0, 0, 0, src[1].x*dst[1].x, src[1].y*dst[1].x, -dst[1].x }, // h13
        { 0, 0, 0, -src[1].x, -src[1].y, -1, src[1].x*dst[1].y, src[1].y*dst[1].y, -dst[1].y }, // h21
        {-src[2].x, -src[2].y, -1, 0, 0, 0, src[2].x*dst[2].x, src[2].y*dst[2].x, -dst[2].x }, // h22
        { 0, 0, 0, -src[2].x, -src[2].y, -1, src[2].x*dst[2].y, src[2].y*dst[2].y, -dst[2].y }, // h23
        {-src[3].x, -src[3].y, -1, 0, 0, 0, src[3].x*dst[3].x, src[3].y*dst[3].x, -dst[3].x }, // h31
        { 0, 0, 0, -src[3].x, -src[3].y, -1, src[3].x*dst[3].y, src[3].y*dst[3].y, -dst[3].y }, // h32
    };
    gaussian_elimination(&P[0][0],9);
    matrix(0,0)=P[0][8];
    matrix(0,1)=P[1][8];
    matrix(0,2)=0;
    matrix(0,3)=P[2][8];
    matrix(1,0)=P[3][8];
    matrix(1,1)=P[4][8];
    matrix(1,2)=0;
    matrix(1,3)=P[5][8];
    matrix(2,0)=0;
    matrix(2,1)=0;
    matrix(2,2)=0;
    matrix(2,3)=0;
    matrix(3,0)=P[6][8];
    matrix(3,1)=P[7][8];
    matrix(3,2)=0;
    matrix(3,3)=1;
    return matrix;
}
You should have a look at existing solutions for this:
Eye writer for painting with your eyes (I tested this to control the mouse only)
Eyewriter.org
Eyewriter walkthrough
Eyewriter on Github
EyeLike pupil tracking
EyeLike info page (an algorithm similar to what you want is discussed here)
EyeLike on Github
Good luck!
Maybe this link will be helpful to you. Best of luck!
cv::Mat computeMatXGradient(const cv::Mat &mat) {
    cv::Mat out(mat.rows, mat.cols, CV_64F);
    for (int y = 0; y < mat.rows; ++y) {
        const uchar *Mr = mat.ptr<uchar>(y);
        double *Or = out.ptr<double>(y);
        Or[0] = Mr[1] - Mr[0];
        for (int x = 1; x < mat.cols - 1; ++x) {
            Or[x] = (Mr[x+1] - Mr[x-1])/2.0;
        }
        Or[mat.cols-1] = Mr[mat.cols-1] - Mr[mat.cols-2];
    }
    return out;
}
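For context, a typical call site might look like this (an assumed usage sketch; eyeGray is a hypothetical 8-bit grayscale eye region, and the transpose trick for the vertical gradient is one common way to reuse the x-gradient function):

cv::Mat eyeGray = cv::imread("eye.png", cv::IMREAD_GRAYSCALE);
cv::Mat gradX = computeMatXGradient(eyeGray);          // horizontal gradients
cv::Mat gradY = computeMatXGradient(eyeGray.t()).t();  // vertical gradients, via transpose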

inverse fft of fft not returning expected data

I'm trying to make sure FFTW does what I think it should do, but I am having problems. I'm using OpenCV's cv::Mat. I made a test program that, given a Mat f, computes ifft(fft(f)) and compares the result to f. I would expect the difference between the two to be negligible, but there's a strange pattern in the data.
In this case, f is initialized to be an 8x8 array of floats with positive values less than 1.
Here's my test program code:
Mat f = .. //populate f
if (f.type() != CV_32FC1)
    DLOG << "Bad f type";

const int y = f.rows;
const int x = f.cols;
double* input = fftw_alloc_real(y * 2*(x/2 + 1));

// forward fft
fftw_plan plan = fftw_plan_dft_r2c_2d(x, y, input, (fftw_complex*)input, FFTW_MEASURE);
// inverse fft
fftw_plan iplan = fftw_plan_dft_c2r_2d(x, y, (fftw_complex*)input, input, FFTW_MEASURE);

// populate fftw data from f
for (int yi = 0; yi < y; ++yi)
{
    const float* yptr = f.ptr<float>(yi);
    for (int xi = 0; xi < x; ++xi)
        input[yi*x + xi] = (double)yptr[xi];
}

fftw_execute(plan);
fftw_execute(iplan);

// put data into another cv::Mat for comparison
Mat check(y, x, f.type());
for (int yi = 0; yi < y; ++yi)
{
    float* yptr = check.ptr<float>(yi);
    for (int xi = 0; xi < x; ++xi)
        yptr[xi] = (float)input[yi*x + xi];
}

DLOG << Util::summary(f, "f");
DLOG << f;

DLOG << Util::summary(check, "check");
DLOG << check;

Mat diff = f*x*y - check;
DLOG << Util::summary(diff, "diff");
DLOG << diff;
Here DLOG is my logger, and Util::summary(cv::Mat m) just prints the passed string and the dimensions, channels, min, and max of the mat.
Here's what the data looks like (output):
f: rows:8 cols:8 chans:1 min:0.00257996 max:0.4
[0.050668437, 0.04509116, 0.033668514, 0.10986148, 0.12855141, 0.048241843, 0.12613985, 0.09731093;
0.028602425, 0.0092236707, 0.037089188, 0.118964, 0.075040311, 0.40000001, 0.11959606, 0.071930833;
0.0025799556, 0.051522054, 0.22233701, 0.052993439, 0.032000393, 0.12673819, 0.015244827, 0.044803992;
0.13946071, 0.019708242, 0.0112687, 0.047459811, 0.019342113, 0.030085485, 0.018739942, 0.0098618753;
0.041809395, 0.029681522, 0.026837418, 0.16038358, 0.29034778, 0.17247421, 0.1789207, 0.042179305;
0.025630442, 0.017192598, 0.060540862, 0.1854037, 0.21287154, 0.04813192, 0.042614728, 0.034764063;
0.0030835248, 0.018511582, 0.0071733585, 0.017076733, 0.064545207, 0.0026390438, 0.088922881, 0.045725599;
0.12798512, 0.23215951, 0.027465452, 0.03174505, 0.04352935, 0.025079668, 0.044403922, 0.035459157]
check: rows:8 cols:8 chans:1 min:-3.26489 max:25.6
[3.24278, 2.8858342, 2.1547849, 7.0311346, 8.2272902, 3.0874779, 8.0729504, 6.2278996;
0.30818239, 0, 2.373708, 7.6136961, 4.8025799, 25.6, 7.6541481, 4.6035733;
0.16511716, 3.2974114, -3.2648909, 0, 2.0480251, 8.1112442, 0.97566891, 2.8674555;
8.9254856, 1.2613275, 0.72119683, 3.0374279, -0.32588482, 0, 1.1993563, 0.63116002;
2.6758013, 1.8996174, 1.7175947, 10.264549, 18.582258, 11.038349, 0.042666838, 0;
1.6403483, 1.1003263, 3.8746152, 11.865837, 13.623778, 3.0804429, 2.7273426, 2.2249;
0.44932228, 0, 0.45909494, 1.0929109, 4.1308932, 0.16889881, 5.6910644, 2.9264383;
8.1910477, 14.858209, -0.071794562, 0, 2.7858784, 1.6050987, 2.841851, 2.2693861]
diff: rows:8 cols:8 chans:1 min:-0.251977 max:17.4945
[0, 0, 0, 0, 0, 0, 0, 0;
1.5223728, 0.59031492, 0, 0, 0, 0, 0, 0;
0, 0, 17.494459, 3.3915801, 0, 0, 0, 0;
0, 0, 0, 0, 1.5637801, 1.9254711, 0, 0;
0, 0, 0, 0, 0, 0, 11.408258, 2.6994755;
0, 0, 0, 0, 0, 0, 0, 0;
-0.2519767, 1.1847413, 0, 0, 0, 0, 0, 0;
0, 0, 1.8295834, 2.0316832, 0, 0, 0, 0]
The difficult part for me is the nonzero entries in the diff matrix. I've accounted for the scaling FFTW does on the values and for the padding needed to do an in-place FFT on real data; what am I missing?
I find it surprising that the data could be off by a value of 17 (which is 66% of the max value) when there are so many zeros. Also, the data irregularities seem to form a diagonal pattern.
As you may have noticed when writing fftw_alloc_real(y * 2*(x/2 + 1)), FFTW needs extra space in the x direction to store complex data. In your case, as x = 8, it needs 2*(x/2+1) = 10 reals per row.
http://www.fftw.org/doc/Real_002ddata-DFT-Array-Format.html#Real_002ddata-DFT-Array-Format
So... you should take care of this as you populate the input array or retrieve values from it.
You may change
input[yi*x + xi] = (double)yptr[xi];
to
int xfft = 2*(x/2 + 1);
...
input[yi*xfft + xi] = (double)yptr[xi];
and
yptr[xi] = (float)input[yi*x + xi];
to
yptr[xi] = (float)input[yi*xfft + xi];
It should solve your problem, since the non-null points in your diff correspond to the extra padding.
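Putting both pieces together, the corrected copy loops would look something like this (a sketch only; f, check, input, x and y are as in the question):

const int xfft = 2*(x/2 + 1); // padded row stride required by the in-place r2c transform

// populate fftw data from f, using the padded stride
for (int yi = 0; yi < y; ++yi)
{
    const float* yptr = f.ptr<float>(yi);
    for (int xi = 0; xi < x; ++xi)
        input[yi*xfft + xi] = (double)yptr[xi];
}

// ... fftw_execute(plan); fftw_execute(iplan); ...

// read the results back with the same stride
for (int yi = 0; yi < y; ++yi)
{
    float* yptr = check.ptr<float>(yi);
    for (int xi = 0; xi < x; ++xi)
        yptr[xi] = (float)input[yi*xfft + xi];
}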
Bye,