Low Accuracy of DNN - c++

I've recently been implementing a neural network based on http://neuralnetworksanddeeplearning.com/. I wrote the backprop and SGD algorithms almost the same way as the book's author. The problem is that while he gets around 90% accuracy after one epoch, I get 30% after 5 epochs, even though I use the same hyperparameters. Do you have any idea what might be the cause?
Here's my repository.
Here is the part of Network.cpp that implements backprop and SGD:
void Network::Train(MatrixD_Array& TrainingData, MatrixD_Array& TrainingLabels, int BatchSize,int epochs, double LearningRate)
assert(TrainingData.size() == TrainingLabels.size() && CostFunc != nullptr && CostFuncDer != nullptr && LearningRate > 0);
std::vector<long unsigned int > indexes;
for (int i = 0; i < TrainingData.size(); i++) indexes.push_back(i);
std::random_device rd;
std::mt19937 g(rd());
std::vector<Matrix<double>> NablaWeights;
std::vector<Matrix<double>> NablaBiases;
for (int i = 0; i < Layers.size(); i++)
NablaWeights[i] = Matrix<double>(Layers[i].GetInDim(), Layers[i].GetOutDim());
NablaBiases[i] = Matrix<double>(1, Layers[i].GetOutDim());
//---- Epoch iterating
for (int i = 0; i < epochs; i++)
cout << "Epoch number: " << i << endl;
shuffle(indexes.begin(), indexes.end(), g);
// Batch iterating
for (int batch = 0; batch < TrainingData.size(); batch = batch + BatchSize)
for (int i = 0; i < Layers.size(); i++)
int i = 0;
while( i < BatchSize && (i+batch)< TrainingData.size())
std::vector<Matrix<double>> ActivationOutput;
std::vector<Matrix<double>> Z_Output;
ActivationOutput.resize(Layers.size() + 1);
ActivationOutput[0] = TrainingData[indexes[i + batch]];
int index = 0;
// Pushing values through
for (auto layer : Layers)
Z_Output[index] = layer.Mul(ActivationOutput[index]);
ActivationOutput[index + 1] = layer.ApplyActivation(Z_Output[index]);
// ---- Calculating Nabla that will be later devided by batch size element wise
auto DeltaNabla = BackPropagation(ActivationOutput, Z_Output, TrainingLabels[indexes[i + batch]]);
for (int i = 0; i < Layers.size(); i++)
NablaWeights[i] = NablaWeights[i] + DeltaNabla.first[i];
NablaBiases[i] = NablaBiases[i] + DeltaNabla.second[i];
for (int g = 0; g < Layers.size(); g++)
Layers[g].Weights = Layers[g].Weights - NablaWeights[g] * LearningRate;
Layers[g].Biases = Layers[g].Biases - NablaBiases[g] * LearningRate;
// std::transform(NablaWeights.begin(), NablaWeights.end(), NablaWeights.begin(),[BatchSize, LearningRate](Matrix<double>& Weight) {return Weight * (LearningRate / BatchSize);});
//std::transform(NablaBiases.begin(), NablaBiases.end(), NablaBiases.begin(), [BatchSize, LearningRate](Matrix<double>& Bias) {return Bias * (LearningRate / BatchSize); });
std::pair<MatrixD_Array, MatrixD_Array> Network::BackPropagation( MatrixD_Array& ActivationOutput, MatrixD_Array& Z_Output,Matrix<double>& label)
MatrixD_Array NablaWeight;
MatrixD_Array NablaBias;
auto zs = Layers[Layers.size() - 1].ActivationPrime(Z_Output[Z_Output.size() - 1]);
Matrix<double> Delta_L = Hadamard(CostFuncDer(ActivationOutput[ActivationOutput.size() - 1],label), zs);
NablaWeight[Layers.size() - 1] = Delta_L * ActivationOutput[ActivationOutput.size() - 2].Transpose();
NablaBias[Layers.size() - 1] = Delta_L;
for (int j = 2; j <= Layers.size() ; j++)
auto sp = Layers[Layers.size() - j].ActivationPrime(Z_Output[Layers.size() -j]);
Delta_L = Hadamard(Layers[Layers.size() - j+1 ].Weights.Transpose() * Delta_L, sp);
NablaWeight[Layers.size() - j] = Delta_L * ActivationOutput[ActivationOutput.size() -j-1].Transpose();
NablaBias[Layers.size() - j] = Delta_L;
return make_pair(NablaWeight, NablaBias);

It turned out that the MNIST loader didn't work correctly.


Fixing Neural Net vanishing gradients problem?

This is going to be a long one. I am still very new to coding (I started 3 months ago), so I know my code is not perfect; any criticism beyond the question is more than welcome. I have specifically avoided using pointers because I do not fully understand them. I can use them, but I don't trust myself to use them correctly in a program like this.
First things first: I have a version of this with only 1 hidden layer, and that net works perfectly. I started running into problems when I tried to increase the number of hidden layers.
Some info on the net:
-I am using softmax output activation as I have 3 output neurons.
-I am using tanh as my activation function on the rest of the net.
-The file being read for the input has a format of
"input: 0.56 0.76 0.23 0.67"
"output: 0.0 0.0 1.0" (this is the target)
-The weights connecting a layer 1 neuron to a layer 2 neuron are stored in the layer 1 neuron.
-The biases for each neuron are stored in that neuron.
-The target is 1.0 0.0 0.0 if the sum of the input numbers is below 1, 0.0 1.0 0.0 if the sum is between 1 and 2, and 0.0 0.0 1.0 if the sum is above 2.
-I am using L1 regularization.
The problems, specifically:
The softmax output values do not move from a relatively equalised range. Positions 1 and 2 in the target vector each occur roughly 50% of the time, while position 3 occurs less than 3% of the time, so by "relatively equalised" I mean the softmax output generally looks something like
"0.56.... 0.48.... 0.02..." even after 500 epochs.
The weights at the hidden layer closest to the input layer don't change much at all, which is what I think vanishing gradients are. I might be wrong on this. But the weights at the hidden layer closest to the output end up between -50 and 50 (which I think is okay?).
Things I have tried:
I have tried using ReLU, parametric ReLU, and exponential ReLU, but with all of these the softmax output value for neuron 3 keeps rising while the other 2 neurons' values keep falling. These values continue on that trajectory until either 500 epochs have been reached or they turn into NaNs. (I think this has to do with the structure of my code rather than the ReLU function itself.)
If I set the number of hidden layers above 3 while using ReLU, it immediately spits out NaNs, within the first epoch.
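One possible source of NaNs like these (an assumption, not something the post confirms) is overflow inside softmax: with unbounded activations such as ReLU, the pre-activations can grow large enough that `exp` returns infinity, and infinity divided by infinity is NaN. A minimal sketch of the usual remedy, subtracting the largest logit before exponentiating, follows. The function name `stableSoftmax` is hypothetical, and it works on a plain vector rather than on the `Net` class from the question.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Softmax that shifts the logits by their maximum so that exp() never
// receives a positive argument and therefore cannot overflow.
std::vector<double> stableSoftmax(const std::vector<double>& logits) {
    std::vector<double> out(logits.size());
    if (logits.empty()) return out;
    const double maxLogit = *std::max_element(logits.begin(), logits.end());
    double sum = 0.0;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        out[i] = std::exp(logits[i] - maxLogit);
        sum += out[i];
    }
    for (double& v : out) v /= sum;  // sum >= 1 because exp(0) is included
    return out;
}
```

The shift does not change the result mathematically, because the common factor cancels between numerator and denominator.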
The backprop function is pretty long, but that is because I have deconstructed it many times over to try to figure out where I might be mismatching values. I do have a condensed version, but I feel I have a higher chance of being completely off the mark there than with the deconstructed one.
I have included the ReLU function code that I used. It is the first time I have used it, so I might be wrong on that as well, but I don't think so; I have double-checked multiple times. The ReLU in the code is specifically "ELU", or exponential ReLU.
here is the code for the net:
#include <iostream>
#include <fstream>
#include <cmath>
#include <vector>
#include <sstream>
#include <random>
#include <string>
#include <iomanip>
double randomt(double x, double y)
std::random_device rd;
std::mt19937 mt(rd());
std::uniform_real_distribution<double> dist(x, y);
return dist(mt);
class InputN
double val{};
std::vector <double> weights{};
class HiddenN
double preactval{};
double actval{};
double actvalPD{};
double preactvalpd{};
std::vector <double> weights{};
double bias{};
class OutputN
double preactval{};
double actval{};
double preactvalpd{};
double bias{};
class Net
std::vector <InputN> inneurons{};
std::vector <std::vector <HiddenN>> hiddenneurons{};
std::vector <OutputN> outputneurons{};
double lambda{ 0.015 };
double alpha{ 0.02 };
double tanhderiv(double val)
return 1 - tanh(val) * tanh(val);
double Relu(double val)
if (val < 0) return 0.01 *(exp(val) - 1);
else return val;
double Reluderiv(double val)
if (val < 0) return Relu(val) + 0.01;
else return 1;
double regularizer(double weight)
double absval{};
if (weight < 0) absval = weight - weight - weight;
else if (weight > 0 || weight == 0) absval = weight;
if (absval > 0) return 1;
else if (absval < 0) return -1;
else if (absval == 0) return 0;
else return 2;
void feedforward(Net& net)
double sum{};
int prevlayer{};
for (size_t Hsize = 0; Hsize < net.hiddenneurons.size(); Hsize++)
//std::cout << "in first loop" << '\n';
prevlayer = Hsize - 1;
for (size_t Hel = 0; Hel < net.hiddenneurons[Hsize].size(); Hel++)
//std::cout << "in second loop" << '\n';
if (Hsize == 0)
//std::cout << "in first if" << '\n';
for (size_t Isize = 0; Isize < net.inneurons.size(); Isize++)
//std::cout << "in fourth loop" << '\n';
sum += (net.inneurons[Isize].val * net.inneurons[Isize].weights[Hel]);
net.hiddenneurons[Hsize][Hel].preactval = net.hiddenneurons[Hsize][Hel].bias + sum;
net.hiddenneurons[Hsize][Hel].actval = tanh(sum);
sum = 0;
//std::cout << "first if done" << '\n';
//std::cout << "in else" << '\n';
for (size_t prs = 0; prs < net.hiddenneurons[prevlayer].size(); prs++)
//std::cout << "in fourth loop" << '\n';
sum += net.hiddenneurons[prevlayer][prs].actval * net.hiddenneurons[prevlayer][prs].weights[Hel];
//std::cout << "fourth loop done" << '\n';
net.hiddenneurons[Hsize][Hel].preactval = net.hiddenneurons[Hsize][Hel].bias + sum;
net.hiddenneurons[Hsize][Hel].actval = tanh(sum);
//std::cout << "else done" << '\n';
sum = 0;
//std::cout << "first loop done " << '\n';
int lasthid = net.hiddenneurons.size() - 1;
for (size_t Osize = 0; Osize < net.outputneurons.size(); Osize++)
for (size_t Hsize = 0; Hsize < net.hiddenneurons[lasthid].size(); Hsize++)
sum += (net.hiddenneurons[lasthid][Hsize].actval * net.hiddenneurons[lasthid][Hsize].weights[Osize]);
net.outputneurons[Osize].preactval = net.outputneurons[Osize].bias + sum;
void softmax(Net& net)
double sum{};
for (size_t Osize = 0; Osize < net.outputneurons.size(); Osize++)
sum += exp(net.outputneurons[Osize].preactval);
for (size_t Osize = 0; Osize < net.outputneurons.size(); Osize++)
net.outputneurons[Osize].actval = exp(net.outputneurons[Osize].preactval) / sum;
void lossfunc(Net& net, std::vector <double> target)
int pos{ -1 };
double val{};
for (size_t t = 0; t < target.size(); t++)
pos += 1;
if (target[t] > 0)
for (size_t s = 0; net.outputneurons.size(); s++)
val = -log(net.outputneurons[pos].actval);
void backprop(Net& net, std::vector<double>& target)
for (size_t outI = 0; outI < net.outputneurons.size(); outI++)
double PD = target[outI] - net.outputneurons[outI].actval;
net.outputneurons[outI].preactvalpd = PD * -1;
size_t lasthid = net.hiddenneurons.size() - 1;
for (size_t LH = 0; LH < net.hiddenneurons[lasthid].size(); LH++)
for (size_t LHW = 0; LHW < net.hiddenneurons[lasthid][LH].weights.size(); LHW++)
double weight = net.hiddenneurons[lasthid][LH].weights[LHW];
double PD = net.outputneurons[LHW].preactvalpd * net.hiddenneurons[lasthid][LH].actval;
PD = PD * -1;
double delta = PD - (net.lambda * regularizer(weight));
weight = weight + (net.alpha * delta);
net.hiddenneurons[lasthid][LH].weights[LHW] = weight;
for (size_t OB = 0; OB < net.outputneurons.size(); OB++)
double bias = net.outputneurons[OB].bias;
double BPD = net.outputneurons[OB].preactvalpd;
BPD = BPD * -1;
double Delta = BPD;
bias = bias + (net.alpha * Delta);
for (size_t HPD = 0; HPD < net.hiddenneurons[lasthid].size(); HPD++)
double PD{};
for (size_t HW = 0; HW < net.outputneurons.size(); HW++)
PD += net.hiddenneurons[lasthid][HPD].weights[HW] * net.outputneurons[HW].preactvalpd;
net.hiddenneurons[lasthid][HPD].actvalPD = PD;
PD = 0;
for (size_t HPD = 0; HPD < net.hiddenneurons[lasthid].size(); HPD++)
net.hiddenneurons[lasthid][HPD].preactvalpd = net.hiddenneurons[lasthid][HPD].actvalPD * tanhderiv(net.hiddenneurons[lasthid][HPD].preactval);
for (size_t AllHid = net.hiddenneurons.size() - 2; AllHid > -1; AllHid--)
size_t uplayer = AllHid + 1;
for (size_t cl = 0; cl < net.hiddenneurons[AllHid].size(); cl++)
for (size_t clw = 0; clw < net.hiddenneurons[AllHid][cl].weights.size(); clw++)
double weight = net.hiddenneurons[AllHid][cl].weights[clw];
double PD = net.hiddenneurons[uplayer][clw].preactvalpd * net.hiddenneurons[AllHid][cl].actval;
PD = PD * -1;
double delta = PD - (net.lambda * regularizer(weight));
weight = weight + (net.alpha * delta);
net.hiddenneurons[AllHid][cl].weights[clw] = weight;
for (size_t up = 0; up < net.hiddenneurons[uplayer].size(); up++)
double bias = net.hiddenneurons[uplayer][up].bias;
double PD = net.hiddenneurons[uplayer][up].preactvalpd;
PD = PD * -1;
double delta = PD;
bias = bias + (net.alpha * delta);
for (size_t APD = 0; APD < net.hiddenneurons[AllHid].size(); APD++)
double PD{};
for (size_t APDW = 0; APDW < net.hiddenneurons[AllHid][APD].weights.size(); APDW++)
PD += net.hiddenneurons[AllHid][APD].weights[APDW] * net.hiddenneurons[uplayer][APDW].preactvalpd;
net.hiddenneurons[AllHid][APD].actvalPD = PD;
PD = 0;
for (size_t PPD = 0; PPD < net.hiddenneurons[AllHid].size(); PPD++)
double PD = net.hiddenneurons[AllHid][PPD].actvalPD * tanhderiv(net.hiddenneurons[AllHid][PPD].preactval);
net.hiddenneurons[AllHid][PPD].preactvalpd = PD;
for (size_t IN = 0; IN < net.inneurons.size(); IN++)
for (size_t INW = 0; INW < net.inneurons[IN].weights.size(); INW++)
double weight = net.inneurons[IN].weights[INW];
double PD = net.hiddenneurons[0][INW].preactvalpd * net.inneurons[IN].val;
PD = PD * -1;
double delta = PD - (net.lambda * regularizer(weight));
weight = weight + (net.alpha * delta);
net.inneurons[IN].weights[INW] = weight;
for (size_t hidB = 0; hidB < net.hiddenneurons[0].size(); hidB++)
double bias = net.hiddenneurons[0][hidB].bias;
double PD = net.hiddenneurons[0][hidB].preactvalpd;
PD = PD * -1;
double delta = PD;
bias = bias + (net.alpha * delta);
net.hiddenneurons[0][hidB].bias = bias;
int main()
std::vector <double> invals{ };
std::vector <double> target{ };
Net net;
InputN Ineuron;
HiddenN Hneuron;
OutputN Oneuron;
int IN = 4;
int HIDLAYERS = 4;
int HID = 8;
int OUT = 3;
for (int i = 0; i < IN; i++)
for (int m = 0; m < HID; m++)
net.inneurons.back().weights.push_back(randomt(0.0, 0.5));
//std::cout << "first loop done" << '\n';
for (int s = 0; s < HIDLAYERS; s++)
net.hiddenneurons.push_back(std::vector <HiddenN>());
if (s == HIDLAYERS - 1)
for (int i = 0; i < HID; i++)
for (int m = 0; m < OUT; m++)
net.hiddenneurons[s].back().weights.push_back(randomt(0.0, 0.5));
net.hiddenneurons[s].back().bias = 1.0;
for (int i = 0; i < HID; i++)
for (int m = 0; m < HID; m++)
net.hiddenneurons[s].back().weights.push_back(randomt(0.0, 0.5));
net.hiddenneurons[s].back().bias = 1.0;
//std::cout << "second loop done" << '\n';
for (int i = 0; i < OUT; i++)
net.outputneurons.back().bias = randomt(0.0, 0.5);
//std::cout << "third loop done" << '\n';
int count{};
std::ifstream fileread("N.txt");
for (int epoch = 0; epoch < 500; epoch++)
count = 0;
if (epoch == 100 || epoch == 100 * 2 || epoch == 100 * 3 || epoch == 100 * 4 || epoch == 499)
printvals("no", net);
fileread.clear(); fileread.seekg(0, std::ios::beg);
while (fileread.is_open())
std::cout << '\n' << "epoch: " << epoch << '\n';
std::string fileline{};
fileread >> fileline;
if (fileline == "in:")
std::string input{};
double nums{};
std::getline(fileread, input);
std::stringstream ss(input);
while (ss >> nums)
if (fileline == "out:")
std::string output{};
double num{};
std::getline(fileread, output);
std::stringstream ss(output);
while (ss >> num)
count += 1;
if (count == 2)
for (size_t inv = 0; inv < invals.size(); inv++)
net.inneurons[inv].val = invals[inv];
//std::cout << "calling feedforward" << '\n';
//std::cout << "ff done" << '\n';
printvals("output", net);
std::cout << "target: " << '\n';
for (auto element : target) std::cout << element << " / ";
std::cout << '\n';
backprop(net, target);
count = 0;
if (fileread.eof()) break;
//std::cout << "fourth loop done" << '\n';
return 1;
Much appreciated to anyone who actually made it through all that! :)
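One thing that may be worth checking in the code above is the initialization: every weight is drawn from [0, 0.5] and every hidden bias starts at 1.0, so all pre-activations begin positive, which can push tanh toward its flat region as layers are added. Whether that is the cause here is an assumption. A commonly used alternative is Glorot (Xavier) uniform initialization, which draws weights symmetrically around zero with a range that depends on the layer sizes. The sketch below uses the hypothetical name `glorotUniform` and assumes `fanIn + fanOut > 0`. It takes the random engine by reference so that one engine is reused, rather than constructing a new `std::random_device` and `std::mt19937` on every call as `randomt` does.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Returns fanIn * fanOut weights drawn uniformly from [-limit, limit),
// where limit = sqrt(6 / (fanIn + fanOut)).
std::vector<double> glorotUniform(std::size_t fanIn, std::size_t fanOut,
                                  std::mt19937& rng) {
    const double limit = std::sqrt(6.0 / static_cast<double>(fanIn + fanOut));
    std::uniform_real_distribution<double> dist(-limit, limit);
    std::vector<double> weights(fanIn * fanOut);
    for (double& w : weights) w = dist(rng);
    return weights;
}
```

For the 8-to-8 hidden layers in the question this gives weights within roughly ±0.61, and biases are typically started at 0 alongside it.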

Manhattan distance works better than Manhattan distance + linear conflict

I am trying to implement the A* algorithm with both the Manhattan distance heuristic and the Manhattan distance + linear conflict heuristic.
But my Manhattan distance version works much better, and I can't understand why!
Manhattan distance + linear conflict expands many more nodes in my implementation, and I found that its answer is not even optimal.
#include <iostream>
#include <time.h>
#include <fstream>
#include <set>
#include <vector>
#include <queue>
#include <map>
#include "node.h"
using namespace std;
const int n = 4;
const int hash_base = 23;
const long long hash_mod = 9827870924701019;
set<long long> explored;
set<pair<int , node*> > frontier;
map<string,int> database[3];
void check(int* state){
for(int j = 0 ; j < n ; j++){
for(int i = 0 ; i < n ; i++)
cerr << state[j * n + i] << " ";
cerr << endl;
cerr << endl;
bool goal_test(int* state){
if(state[n * n - 1] != 0) return 0;
for(int i = 0 ; i < (n * n - 1) ; i++){
if(state[i] != i + 1)
return 0;
return 1;
vector<node> solution(node* v){
vector<node> ans;
while((v->parent)->hash != v->hash){
v = v->parent;
return ans;
//first heuristic
int manhattanDistance(int* state){
int md = 0;
for(int i = 0 ; i < (n * n) ; i++){
if(state[i] == 0) continue;
//what is the goal row and column of this tile
int gr = (state[i] - 1) / n , gc = (state[i] - 1) % n;
//what is the row and column of this tile
int r = i / n , c = i % n;
md += (max(gr - r , r - gr) + max(gc - c , c - gc));
return md;
//second heuristic
int linearConflict(int* state){
int lc = 0;
for(int i = 0 ; i < n ; i++){
for(int j = 0 ; j < n ; j++){
//jth tile in ith row = (i * n + j)th in state
int la = i * n + j;
if(state[la] == 0 || (state[la] / n) != i)
for(int k = j + 1 ; k < n ; k++){
//kth tile in ith row = (i * n + k)th in state
int lb = i * n + k;
if(state[lb] == 0 || (state[lb] / n) != i)
if(state[la] > state[lb])
for(int i = 0 ; i < n ; i++){
for(int j = 0 ; j < n ; j++){
//j the tile of i th column
int la = j * 4 + i;
if(state[la] == 0 || (state[la] % n) != i)
for(int k = j + 1 ; k < n ; k++){
int lb = k * 4 + i;
if(state[lb] == 0 || (state[lb] % n) != i)
if(state[la] > state[lb])
return lc * 2;
long long make_hash(int* v){
long long power = 1LL;
long long hash = 0 * 1LL;
for(int i = 0 ; i < (n * n) ; i++){
hash += (power * v[i] * 1LL) % hash_mod;
hash %= hash_mod;
power = (hash_base * power * 1LL) % hash_mod;
return hash;
vector<node> successor(node* parent){
vector<node> child;
int parent_empty = parent->empty_cell;
//the row and column of empty cell
int r = parent_empty / n , c = parent_empty % n;
//empty cell go down
if(r + 1 < n){
struct node down;
for(int i = 0 ; i < (n*n) ; i++)
down.state[i] = parent->state[i];
down.state[parent_empty] = parent->state[parent_empty + n];
down.state[parent_empty + n] = 0;
down.hash = make_hash(down.state);
down.empty_cell = parent_empty + n;
down.cost = parent->cost + 1;
down.parent = parent;
//first heuristic -> manhattan Distance
// down.heuristic = manhattanDistance(down.state);
//second heuristic -> manhattan distance + linear conflict
down.heuristic = linearConflict(down.state) + manhattanDistance(down.state);
//third heuristic -> disjoint pattern database
// down.heuristic = DisjointPatternDB(down.state);
//empty cell go up
if(r - 1 >= 0){
struct node up;
for(int i = 0 ; i < n * n ; i++)
up.state[i] = parent->state[i];
up.state[parent_empty] = parent->state[parent_empty - n];
up.state[parent_empty - n] = 0;
up.empty_cell = parent_empty - n;
up.hash = make_hash(up.state);
up.cost = parent->cost + 1;
up.parent = parent;
//first heuristic -> manhattan Distance
// up.heuristic = manhattanDistance(up.state);
//second heuristic -> manhattan distance + linear conflict
up.heuristic = linearConflict(up.state) + manhattanDistance(up.state);
//third heuristic -> disjoint pattern database
// up.heuristic = DisjointPatternDB(up.state);
//empty cell going right
if(c + 1 < n){
struct node right;
for(int i = 0 ; i < (n * n) ; i++)
right.state[i] = parent->state[i];
right.state[parent_empty] = parent->state[parent_empty + 1];
right.state[parent_empty + 1] = 0;
right.empty_cell = parent_empty + 1;
right.hash = make_hash(right.state);
right.cost = parent->cost + 1;
right.parent = parent;
//first heuristic -> manhattan Distance
// right.heuristic = manhattanDistance(right.state);
//second heuristic -> manhattan distance + linear conflict
right.heuristic = linearConflict(right.state) + manhattanDistance(right.state);
//third heuristic -> disjoint pattern database
// right.heuristic = DisjointPatternDB(right.state);
//empty cell going left
if(c - 1 >= 0){
struct node left;
for (int i = 0; i < (n * n) ; i++)
left.state[i] = parent->state[i];
left.state[parent_empty] = parent->state[parent_empty - 1];
left.state[parent_empty - 1] = 0;
left.empty_cell = parent_empty - 1;
left.hash = make_hash(left.state);
left.cost = parent->cost + 1;
left.parent = parent;
//first heuristic -> manhattan Distance
// left.heuristic = manhattanDistance(left.state);
//second heuristic -> manhattan distance + linear conflict
left.heuristic = linearConflict(left.state) + manhattanDistance(left.state);
//third heuristic -> disjoint pattern database
// left.heuristic = DisjointPatternDB(left.state);
return child;
node* nodeCopy(node child){
node* tmp = new node;
for(int i = 0 ; i < n * n; i++)
tmp->state[i] = child.state[i];
tmp->hash = child.hash;
tmp->empty_cell = child.empty_cell;
tmp->cost = child.cost;
tmp->parent = child.parent;
tmp->heuristic = child.heuristic;
return tmp;
vector<node> Astar(node* initNode){
if(goal_test(initNode->state)) return solution(initNode);
frontier.insert(make_pair(initNode->cost + initNode-> heuristic ,initNode));
node* v = (*frontier.begin()).second;
if(goal_test(v->state)) return solution(v);
vector<node> childs = successor(v);
for(node child: childs){
if(explored.find(child.hash) == explored.end()){
node* tmp = nodeCopy(child);
frontier.insert(make_pair((child.cost + child.heuristic) , tmp));
return solution(initNode);
int main(){
clock_t tStart = clock();
printf("Time taken: %.2fs\n", (double)(clock() - tStart)/CLOCKS_PER_SEC);
struct node init;// {{1 , 2 , 3 , 4 , 5, 6, 7, 8, 0} , 1 , 9 , 0 , &init , 0};
for(int i = 0 ; i < (n * n) ; i++){
cin >> init.state[i];
if(init.state[i] == 0)
init.empty_cell = i;
init.hash = make_hash(init.state);
init.cost = 0;
init.parent = &init;
init.heuristic = manhattanDistance(init.state) ;//+ linearConflict(init.state);
vector <node> ans = Astar(&init);
//cout << 1 << " ";
for(int j = 0 ; j < n * n ; j++){
if(j == n * n - 1) cout << init.state[j];
else cout << init.state[j] << ",";
cout << endl;
for(int i = (ans.size() - 1) ; i >= 0 ; i--){
//cout << (ans.size() - i + 1) << " ";
cerr << linearConflict(ans[i].state) << endl;
for(int j = 0 ; j < n * n ; j++){
if(j == n * n - 1) cout << ans[i].state[j];
else cout << ans[i].state[j] << ",";
cout << endl;
cout << "path size : " << ans.size() << endl;
cout << "number of node expanded : " << explored.size() << endl;
printf("Time taken: %.2fs\n", (double)(clock() - tStart)/CLOCKS_PER_SEC);
#include <vector>
using namespace std;
struct node{
int state[16];
long long hash;
int empty_cell;
long cost;
node* parent;
int heuristic;
If its answer is not optimal, it is because your heuristic is not admissible.
This means that it sometimes overestimates the cost of reaching the goal from a node in your graph.
The linear conflict heuristic should always be coupled with a distance-estimating heuristic like Manhattan distance, and it is not as simple as adding twice the number of linear conflicts in each row/column.
See "Linear Conflict violating admissibility and driving me insane".
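To make the point about "twice the number of conflicts" concrete, here is a hedged sketch of one way to keep the extra term admissible. For each row (and column) it considers only the tiles whose goal is in that line, and charges 2 moves per tile in the smallest set that must leave the line for the rest to be in order. That minimum is the line's tile count minus the length of its longest increasing subsequence. The names `tilesToRemove` and `linearConflictSketch` are hypothetical, and the goal layout (tile v belongs at index v - 1, blank last) is taken from `goal_test` in the question. Note that the goal row of tile v is `(v - 1) / N`, as in `manhattanDistance`, not `v / N`.

```cpp
#include <algorithm>
#include <array>
#include <vector>

constexpr int N = 4;  // board side length, matching n in the question

// Minimum number of elements to delete so that the rest is strictly
// increasing: size minus the longest increasing subsequence length.
int tilesToRemove(const std::vector<int>& seq) {
    std::vector<int> tails;  // tails[k] = smallest tail of an increasing run of length k + 1
    for (int v : seq) {
        auto it = std::lower_bound(tails.begin(), tails.end(), v);
        if (it == tails.end()) tails.push_back(v);
        else *it = v;
    }
    return static_cast<int>(seq.size()) - static_cast<int>(tails.size());
}

// Extra moves to add on top of Manhattan distance (a sketch, not the asker's code).
int linearConflictSketch(const std::array<int, N * N>& state) {
    int extra = 0;
    for (int r = 0; r < N; ++r) {
        std::vector<int> inGoalRow;  // tiles currently in row r whose goal row is r
        for (int c = 0; c < N; ++c) {
            const int t = state[r * N + c];
            if (t != 0 && (t - 1) / N == r) inGoalRow.push_back(t);
        }
        extra += 2 * tilesToRemove(inGoalRow);
    }
    for (int c = 0; c < N; ++c) {
        std::vector<int> inGoalCol;  // tiles currently in column c whose goal column is c
        for (int r = 0; r < N; ++r) {
            const int t = state[r * N + c];
            if (t != 0 && (t - 1) % N == c) inGoalCol.push_back(t);
        }
        extra += 2 * tilesToRemove(inGoalCol);
    }
    return extra;
}
```

For a first row of 3 2 1 4, counting pairs gives 3 conflicts and would add 6, whereas only two tiles have to step out of the row, so this sketch adds 4.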

Optimization of C++ code - std::vector operations

I have this function (RotateSlownessTop), and it's called about 800 times to compute the corresponding values. The calculation is slow; is there a way to make it faster?
The number of elements in X/Y is 7202 (a fairly large set).
I did a performance analysis, and the screenshot is attached.
void RotateSlownessTop(vector <double> &XR1, vector <double> &YR1, float theta = 0.0)
Matrix2d a;
a(0,0) = cos(theta);
a(0,1) = -sin(theta);
a(1, 0) = sin(theta);
a(1, 1) = cos(theta);
vector <double> XR2(7202), YR2(7202);
for (size_t i = 0; i < X.size(); ++i)
XR2[i] = (a(0, 0)*X[i] + a(0, 1)*Y[i]);
YR2[i] = (a(1, 0)*X[i] + a(1, 1)*Y[i]);
size_t i = 0;
size_t j = 0;
while (i < YR2.size())
if (i > 0)
if ((XR2[i]>0) && (XR2[i-1]<0))
j = i;
if (YR2[i] > (-1e-10) && YR2[i]<0.0)
YR2[i] = 0.0;
if (YR2[i] < (1e-10) && YR2[i]>0.0)
YR2[i] = -YR2[i];
if ( YR2[i]<0.0)
YR2.erase(YR2.begin() + i);
XR2.erase(XR2.begin() + i);
size_t k = 0;
while (j < YR2.size())
YR1[k] = (YR2[j]);
XR1[k] = (XR2[j]);
YR2.erase(YR2.begin() + j);
XR2.erase(XR2.begin() + j);
size_t l = 0;
for (; k < XR1.size(); ++k)
XR1[k] = XR2[l];
YR1[k] = YR2[l];
Edit1: I have updated the code by replacing all push_back() with operator[], since I read somewhere that this is much faster.
However the whole program is still slow. Any suggestions are appreciated.
If the size is large, you can improve the push_back by pre-allocating the space needed. Add this before the loop:
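A minimal sketch of pre-allocation, assuming `std::vector::reserve` is what the answer has in mind: `reserve` sets the capacity once so that later `push_back` calls do not reallocate. The second function illustrates a separate point that may matter more for the code in the question: calling `erase` inside a loop shifts all later elements each time, which is quadratic in the worst case, whereas a single compaction pass is linear. The names `rotatedX` and `keepNonNegativeY` are hypothetical and only mirror parts of `RotateSlownessTop`.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Builds the rotated x coordinates with one up-front allocation.
std::vector<double> rotatedX(const std::vector<double>& x,
                             const std::vector<double>& y, double theta) {
    std::vector<double> out;
    out.reserve(x.size());  // capacity set once; push_back never reallocates
    const double c = std::cos(theta);
    const double s = std::sin(theta);
    for (std::size_t i = 0; i < x.size(); ++i)
        out.push_back(c * x[i] - s * y[i]);
    return out;
}

// Drops every (x, y) pair with y < 0 in a single pass, instead of
// calling erase() once per removed element. Assumes xs.size() == ys.size().
void keepNonNegativeY(std::vector<double>& xs, std::vector<double>& ys) {
    std::size_t out = 0;
    for (std::size_t i = 0; i < ys.size(); ++i) {
        if (ys[i] >= 0.0) {
            xs[out] = xs[i];
            ys[out] = ys[i];
            ++out;
        }
    }
    xs.resize(out);
    ys.resize(out);
}
```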

How to select threshold values automatically using the peaks of the histogram?

Using the OpenCV library, I want to threshold an image like this:
threshold(image, thresh, 220, 255, THRESH_BINARY_INV)
But I want to find the threshold value (220) automatically.
I tried Otsu's method to estimate the threshold, but it doesn't work in my case.
Therefore, I want to use the histogram peak technique: find the two peaks in the histogram corresponding to the background and the object in the image, and set the threshold value automatically halfway between the two peaks.
I am following this book (pages 117 and 496-505): "Image Processing in C" by Dwayne Phillips (http://homepages.inf.ed.ac.uk/rbf/BOOKS/PHILLIPS/), and I use its source code to find the two peaks in the histogram. This is my image:
This is my C++ code:
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/opencv.hpp>
#include <iostream>
#include <stdio.h>
#include <fstream>
using namespace std;
using namespace cv;
int main()
Mat image0 = imread("C:/Users/Alireza/Desktop/contrast950318/2.bmp");
imshow("image0", image0);
Mat image, thresh, Tafrigh;
cvtColor(image0, image, CV_RGB2GRAY);
int N = image.rows*image.cols;
int histogram[256];
for (int i = 0; i < 256; i++) {
histogram[i] = 0;
//create histo
for (int i = 0; i < image.rows; i++){
for (int j = 0; j < image.cols; j++){
histogram[((int)image.at<uchar>(i, j))]++;
int peak1, peak2;
#define PEAKS 30
int distance[PEAKS], peaks[PEAKS][2];
int i, j = 0, max = 0, max_place = 0;
for (int i = 0; i<PEAKS; i++){
distance[i] = 0;
peaks[i][0] = -1;
peaks[i][1] = -1;
for (i = 0; i <= 255; i++){
max = histogram[i];
max_place = i;
//insert_into_peaks(peaks, max, max_place);
//int max, max_place, peaks[PEAKS][2];
//int i, j;
/* first case */
if (max > peaks[0][0]){
for (i = PEAKS - 1; i > 0; i--){
peaks[i][0] = peaks[i - 1][0];
peaks[i][1] = peaks[i - 1][1];
peaks[0][0] = max;
peaks[0][1] = max_place;
} /* ends if */
/* middle cases */
for (j = 0; j < PEAKS - 3; j++){
if (max < peaks[j][0] && max > peaks[j + 1][0]){
for (i = PEAKS - 1; i > j + 1; i--){
peaks[i][0] = peaks[i - 1][0];
peaks[i][1] = peaks[i - 1][1];
peaks[j + 1][0] = max;
peaks[j + 1][1] = max_place;
} /* ends if */
} /* ends loop over j */
/* last case */
if (max < peaks[PEAKS - 2][0] &&
max > peaks[PEAKS - 1][0]){
peaks[PEAKS - 1][0] = max;
peaks[PEAKS - 1][1] = max_place;
} /* ends if */
}/* ends loop over i */
for (int i = 1; i<PEAKS; i++){
distance[i] = peaks[0][1] - peaks[i][1];
if (distance[i] < 0)
distance[i] = distance[i] * (-1);
peak1 = peaks[0][1];
cout << " peak1= " << peak1;
for (int i = PEAKS - 1; i > 0; i--){
if (distance[i] > 1)
peak2 = peaks[i][1];
cout << " peak2= " << peak2;
int mid_point;
//int peak1, peak2;
short hi, low;
unsigned long sum1 = 0, sum2 = 0;
if (peak1 > peak2)
mid_point = ((peak1 - peak2) / 2) + peak2;
if (peak1 < peak2)
mid_point = ((peak2 - peak1) / 2) + peak1;
for (int i = 0; i<mid_point; i++)
sum1 = sum1 + histogram[i];
for (int i = mid_point; i <= 255; i++)
sum2 = sum2 + histogram[i];
if (sum1 >= sum2){
low = mid_point;
hi = 255;
low = 0;
hi = mid_point;
cout << " low= " << low << " hi= " << hi;
double threshnum = 0.5* (low + hi);
threshold(image, thresh, threshnum, hi, THRESH_BINARY_INV);
return 0;
But I don't know whether this code is correct. If it is correct, why is the threshold value 202?
What ideas would you suggest for solving this task? Or which online resources could help?
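The core of the histogram peak idea described in the question can be sketched in a few lines, which may help when checking the longer listing. This is a simplified illustration, not the book's algorithm: it takes the tallest bin as the first peak, takes the tallest bin at least `minSeparation` gray levels away as the second, and returns the midpoint. The function name `midpointBetweenPeaks` and the `minSeparation` parameter are hypothetical.

```cpp
#include <array>
#include <cstdlib>

// Returns the gray level halfway between the two dominant histogram peaks.
// The second peak must lie at least minSeparation levels from the first.
int midpointBetweenPeaks(const std::array<int, 256>& hist, int minSeparation) {
    int peak1 = 0;
    for (int i = 1; i < 256; ++i)
        if (hist[i] > hist[peak1]) peak1 = i;

    int peak2 = -1;
    for (int i = 0; i < 256; ++i) {
        if (std::abs(i - peak1) < minSeparation) continue;  // too close to peak1
        if (peak2 < 0 || hist[i] > hist[peak2]) peak2 = i;
    }
    if (peak2 < 0) return peak1;  // no bin is far enough away
    return (peak1 + peak2) / 2;
}
```

The choice of `minSeparation` matters: if it is too small, the second "peak" is just a neighbouring bin on the shoulder of the first, and the midpoint lands inside one mode instead of between the two.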
You can also use maximum entropy. In some cases, using only the high-frequency part of the entropy could be better.
int maxentropie(const cv::Mat1b& src)
{
    // Histogram: count how often each gray level occurs
    cv::Mat1d hist(1, 256, 0.0);
    for (int r = 0; r < src.rows; ++r)
        for (int c = 0; c < src.cols; ++c)
            hist(src(r, c))++;
    // Normalize
    hist /= double(src.rows * src.cols);
    // Cumulative histogram
    cv::Mat1d cumhist(1, 256, 0.0);
    double sum = 0;
    for (int i = 0; i < 256; ++i)
    {
        sum += hist(i);
        cumhist(i) = sum;
    }
    cv::Mat1d hl(1, 256, 0.0);
    cv::Mat1d hh(1, 256, 0.0);
    for (int t = 0; t < 256; ++t)
    {
        // low range entropy
        double cl = cumhist(t);
        if (cl > 0)
        {
            for (int i = 0; i <= t; ++i)
                if (hist(i) > 0)
                    hl(t) = hl(t) - (hist(i) / cl) * log(hist(i) / cl);
        }
        // high range entropy
        double ch = 1.0 - cl; // constraint cl + ch = 1
        if (ch > 0)
        {
            for (int i = t + 1; i < 256; ++i)
                if (hist(i) > 0)
                    hh(t) = hh(t) - (hist(i) / ch) * log(hist(i) / ch);
        }
    }
    // choose best threshold: the t that maximizes hl(t) + hh(t)
    cv::Mat1d entropie(1, 256, 0.0);
    double h_max = hl(0) + hh(0);
    int threshold = 0;
    entropie(0) = h_max;
    for (int t = 1; t < 256; ++t)
    {
        entropie(t) = hl(t) + hh(t);
        if (entropie(t) > h_max)
        {
            h_max = entropie(t);
            threshold = uchar(t);
        }
    }
    if (threshold == 0) threshold = 255;
    return threshold;
}

Laguerre interpolation algorithm, something's wrong with my implementation

This is a problem I have been struggling with for a week, coming back to it only to give up after wasted hours...
I am supposed to find the coefficients for the following Laguerre polynomials:
P0(x) = 1
P1(x) = 1 - x
Pn(x) = ((2n - 1 - x) / n) * P(n-1) - ((n - 1) / n) * P(n-2)
I believe there is an error in my implementation, because for some reason the coefficients I get seem way too big. This is the output the program generates:
a1 = -190.234
a2 = -295.833
a3 = 378.283
a4 = -939.537
a5 = 774.861
a6 = -400.612
Description of the code (given below):
If you scroll down to the part where I declare the array, you'll find the given x's and y's.
The function polynomial fills an array with the values of the polynomials at a given x. It is a recursive function. I believe it works correctly, because I have checked the output values.
The gauss function finds the coefficients by performing Gaussian elimination on the resulting array. I think this is where the problems begin. I am wondering whether there is a mistake in this code, or whether my method of verifying the results is wrong. I am trying to verify them like this:
-190.234 * 1.5 ^ 5 - 295.833 * 1.5 ^ 4 ... - 400.612 = -3017.817625 =/= 2
#include "stdafx.h"
#include <conio.h>
#include <iostream>
#include <iomanip>
#include <math.h>
using namespace std;
double polynomial(int i, int j, double **tab)
double n = i;
double **array = tab;
double x = array[j][0];
if (i == 0) {
return 1;
} else if (i == 1) {
return 1 - x;
} else {
double minusone = polynomial(i - 1, j, array);
double minustwo = polynomial(i - 2, j, array);
double result = (((2.0 * n) - 1 - x) / n) * minusone - ((n - 1.0) / n) * minustwo;
return result;
int gauss(int n, double tab[6][7], double results[7])
double multiplier, divider;
for (int m = 0; m <= n; m++)
for (int i = m + 1; i <= n; i++)
multiplier = tab[i][m];
divider = tab[m][m];
if (divider == 0) {
return 1;
for (int j = m; j <= n; j++)
if (i == n) {
tab[i][j] = (tab[m][j] * multiplier / divider) - tab[i][j];
for (int j = m; j <= n; j++) {
tab[i - 1][j] = tab[i - 1][j] / divider;
double s = 0;
results[n - 1] = tab[n - 1][n];
int y = 0;
for (int i = n-2; i >= 0; i--)
s = 0;
for (int x = 0; x < n; x++)
s = s + (tab[i][n - 1 - x] * results[n-(x + 1)]);
if (y == x + 1) {
results[i] = tab[i][n] - s;
int _tmain(int argc, _TCHAR* argv[])
int num;
double **array;
array = new double*[5];
for (int i = 0; i <= 5; i++)
array[i] = new double[2];
//i 0 1 2 3 4 5
array[0][0] = 1.5; //xi 1.5 2 2.5 3.5 3.8 4.1
array[0][1] = 2; //yi 2 5 -1 0.5 3 7
array[1][0] = 2;
array[1][1] = 5;
array[2][0] = 2.5;
array[2][1] = -1;
array[3][0] = 3.5;
array[3][1] = 0.5;
array[4][0] = 3.8;
array[4][1] = 3;
array[5][0] = 4.1;
array[5][1] = 7;
double W[6][7]; //n + 1
for (int i = 0; i <= 5; i++)
for (int j = 0; j <= 5; j++)
W[i][j] = polynomial(j, i, array);
W[i][6] = array[i][1];
for (int i = 0; i <= 5; i++)
for (int j = 0; j <= 6; j++)
cout << W[i][j] << "\t";
cout << endl;
double results[6];
gauss(6, W, results);
for (int i = 0; i < 6; i++) {
cout << "a" << i + 1 << " = " << results[i] << endl;
return 0;
I believe your interpretation of the recursive polynomial generation either needs revising or is a bit too clever for me.
Given P[0][5] = {1,0,0,0,0,...}; P[1][5]={1,-1,0,0,0,...};
then P[2] is a*P[0] + convolution(P[1], { c, d });
where a = -((n - 1) / n),
c = (2n - 1)/n and d = -1/n.
This can be generalized: P[n] == a*P[n-2] + conv(P[n-1], { c,d });
Every step involves a polynomial multiplication by (c + d*x), which increases the degree by one (just by one...), plus adding P[n-2] multiplied by the scalar a.
Then most likely the interpolation factor x is in the range [0..1].
(Convolution means that you should implement polynomial multiplication, which luckily is easy...)
       [a,b,c,d]
     * [e,f]
    af,bf,cf,df +
 ae,be,ce,de, 0 +
(= coefficients of the final polynomial)
The definition of P1(x) = x - 1 is not implemented as stated. You have 1 - x in the computation.
I did not look any further.