I've been working on a neural net class that I can later turn into a library of my own. I'm mainly doing this to get a good understanding of neural nets, and I've been reading all the formulas from pure maths lectures, so I might have a few small details wrong. (I had no idea how to do any of this before I started.)
In this net I have coded a normal SGD algorithm and then a momentum algorithm (or at least what I think is one).
When I run the net on my simple data set using SGD, it works perfectly, no problems at all. But if I try using SGD with momentum, the net does not learn at all; even after 10000 iterations the loss stays around 0.7.
I have been back and forth, referencing the formula from many places, and while I still doubt I completely understand it, I feel it is definitely something in my code, but I can't figure it out. I have tried many combinations of alpha and lambda values and many reasonable combinations of layers and neurons (specifically more than one hidden layer with the momentum formula, but it doesn't work with one layer either).
I am going to post the code for the full net, so if anyone is willing to scan through it quickly and see if anything seems obviously wrong, that would be much appreciated. I feel the fault might lie in the Updateweights() function, since that is where most of the calculation happens, but it could also be in the calcema() function.
I have tried changing the weight update formula from W = W - (alpha * partial derivative) to W = W + (alpha * PD) (keeping the PD positive instead of making it negative), and I also tried removing the regularizer from the momentum update formula, but none of it has made a difference.
I am still very new to this and trying my best, so any feedback is appreciated.
Here is a sample from the input file:
in: 0.6 0.34 0.32 0.78
out: 1.0 0.0 0.0
in: 0.36 0.52 0.75 0.67
out: 1.0 0.0 0.0
in: 0.29 0.034 0.79 0.5
out: 0.0 1.0 0.0
in: 0.21 0.29 0.47 0.62
out: 0.0 1.0 0.0
in: 0.67 0.57 0.42 0.19
out: 0.0 1.0 0.0
in: 0.48 0.22 0.79 0.0096
out: 0.0 1.0 0.0
in: 0.75 0.48 0.61 0.67
out: 1.0 0.0 0.0
in: 0.41 0.96 0.65 0.074
out: 1.0 0.0 0.0
in: 0.19 0.88 0.68 0.1
out: 0.0 1.0 0.0
in: 0.9 0.89 0.95 0.45
out: 1.0 0.0 0.0
in: 0.71 0.58 0.95 0.013
out: 1.0 0.0 0.0
in: 0.66 0.043 0.073 0.98
out: 0.0 1.0 0.0
in: 0.12 0.37 0.2 0.22
out: 0.0 0.0 1.0
in: 0.11 0.38 0.54 0.64
out: 0.0 1.0 0.0
in: 0.42 0.81 0.94 0.98
out: 1.0 0.0 0.0
If anyone would like the full input file, let me know. I just don't know how to post files on here, but I will find a way.
So my problem specifically is that when I use SGD with momentum (or what I think is SGD with momentum), my net does not learn at all and gets stuck at a loss of about 0.7, but if I use normal SGD it works perfectly.
The code:
#include <iostream>
#include <vector>
#include <iomanip>
#include <cmath>
#include <random>
#include <fstream>
#include <chrono>
#include <sstream>
#include <string>
#include <assert.h>
double Relu(double val)
{
if (val < 0) return 0.01 * (exp(val) - 1);
else return val;
}
double Reluderiv(double val)
{
if (val < 0) return Relu(val) + 0.01;
else return 1;
}
double randdist(double x, double y)
{
return sqrt(2.0 / (x + y));
}
int randomt(int x, int y)
{
std::random_device rd;
std::mt19937 mt(rd());
std::uniform_real_distribution<double> dist(x, y);
return round(dist(mt));
}
class INneuron
{
public:
double val{};
std::vector <double> weights{};
std::vector <double> weightderivs{};
std::vector <double> emavals{};
};
class HIDneuron
{
public:
double preactval{};
double actval{};
double actvalPD{};
double preactvalPD{};
std::vector <double> weights{};
std::vector <double> weightderivs{};
std::vector <double> emavals{};
double bias{};
double biasderiv{};
double biasema{};
};
class OUTneuron
{
public:
double preactval{};
double actval{};
double preactvalPD{};
double bias{};
double biasderiv{};
double biasema{};
};
class Net
{
public:
Net(int netdimensions, int hidlayers, int hidneurons, int outneurons, int inneurons, double lambda, double alpha)
{
NETDIMENSIONS = netdimensions; HIDLAYERS = hidlayers; HIDNEURONS = hidneurons; OUTNEURONS = outneurons; INNEURONS = inneurons; Lambda = lambda; Alpha = alpha;
}
void defineoptimizer(std::string optimizer);
void Feedforward(const std::vector <double>& invec);
void Backprop(const std::vector <double>& targets);
void Updateweights();
void printvalues(double totalloss);
void Initweights();
void softmax();
double regularize(double weight,std::string type);
double lossfunc(const std::vector <double>& target);
void calcema(int Layernum, int neuron, int weight, std::string layer, std::string BorW);
private:
INneuron Inn;
HIDneuron Hidn;
OUTneuron Outn;
std::vector <std::vector <HIDneuron>> Hidlayers{};
std::vector <INneuron> Inlayer{};
std::vector <OUTneuron> Outlayer{};
double NETDIMENSIONS{};
double HIDLAYERS{};
double HIDNEURONS{};
double OUTNEURONS{};
double INNEURONS{};
double Lambda{};
double Alpha{};
double loss{};
int optimizerformula{};
};
void Net::defineoptimizer(std::string optimizer)
{
if (optimizer == "ExpAvrg")
{
optimizerformula = 1;
}
else if (optimizer == "SGD")
{
optimizerformula = 2;
}
else if (optimizer == "Adam")
{
optimizerformula = 3;
}
else if (optimizer == "MinibatchSGD")
{
optimizerformula = 4;
}
else {
std::cout << "no optimizer matching description" << '\n';
abort();
}
}
double Net::regularize(double weight,std::string type)
{
if (type == "L1")
{
double absval{ weight };
/*if (weight < 0) absval = weight * -1;
else if (weight > 0 || weight == 0) absval = weight;
else;*/
if (absval > 0.0) return 1.0;
else if (absval < 0.0) return -1.0;
else if (absval == 0.0) return 0.0;
else return 2;
}
else if (type == "l2")
{
double absval{};
if (weight < 0.0) absval = weight * -1.0;
else absval = weight;
return (2.0 * absval);
}
else { std::cout << "no regularizer recognized" << '\n'; abort(); }
}
void Net::softmax()
{
double sum{};
for (size_t Osize = 0; Osize < Outlayer.size(); Osize++)
{
sum += exp(Outlayer[Osize].preactval);
}
for (size_t Osize = 0; Osize < Outlayer.size(); Osize++)
{
Outlayer[Osize].actval = exp(Outlayer[Osize].preactval) / sum;
}
}
void Net::Initweights()
{
unsigned seed = std::chrono::system_clock::now().time_since_epoch().count();
std::default_random_engine generator(seed);
std::normal_distribution<double> distribution(0.0, 1.0);
for (int WD = 0; WD < HIDLAYERS + 1; WD++)
{
if (WD == 0)
{
for (int WL = 0; WL < INNEURONS; WL++)
{
Inlayer.push_back(Inn);
for (int WK = 0; WK < HIDNEURONS; WK++)
{
double val = distribution(generator) * randdist(INNEURONS, HIDNEURONS);
Inlayer.back().weights.push_back(val);
Inlayer.back().weightderivs.push_back(0.0);
Inlayer.back().emavals.push_back(0.0);
}
}
}
else if (WD < HIDLAYERS && WD != 0)
{
Hidlayers.push_back(std::vector <HIDneuron>());
for (int WL = 0; WL < HIDNEURONS; WL++)
{
Hidlayers.back().push_back(Hidn);
for (int WK = 0; WK < HIDNEURONS; WK++)
{
double val = distribution(generator) * randdist(HIDNEURONS, HIDNEURONS);
Hidlayers.back().back().weights.push_back(val);
Hidlayers.back().back().weightderivs.push_back(0.0);
Hidlayers.back().back().emavals.push_back(0.0);
}
Hidlayers.back().back().bias = 0.0;
Hidlayers.back().back().biasderiv = 0.0;
Hidlayers.back().back().biasema = 0.0;
}
}
else if (WD == HIDLAYERS)
{
Hidlayers.push_back(std::vector <HIDneuron>());
for (int WL = 0; WL < HIDNEURONS; WL++)
{
Hidlayers.back().push_back(Hidn);
for (int WK = 0; WK < OUTNEURONS; WK++)
{
double val = distribution(generator) * randdist(HIDNEURONS, OUTNEURONS);
Hidlayers.back().back().weights.push_back(val);
Hidlayers.back().back().weightderivs.push_back(0.0);
Hidlayers.back().back().emavals.push_back(0.0);
}
Hidlayers.back().back().bias = 0.0;
Hidlayers.back().back().biasderiv = 0.0;
Hidlayers.back().back().biasema = 0.0;
}
}
}
for (int i = 0; i < OUTNEURONS; i++)
{
Outlayer.push_back(Outn);
Outlayer.back().bias = 0.0;
Outlayer.back().biasderiv = 0.0;
Outlayer.back().biasema = 0.0;
}
}
void Net::Feedforward(const std::vector <double>& invec)
{
for (size_t I = 0; I < Inlayer.size(); I++)
{
Inlayer[I].val = invec[I];
}
for (size_t h = 0; h < Hidlayers[0].size(); h++)
{
double preval = Hidlayers[0][h].bias;
for (size_t I = 0;I < Inlayer.size(); I++)
{
preval += Inlayer[I].val * Inlayer[I].weights[h];
}
Hidlayers[0][h].preactval = preval;
Hidlayers[0][h].actval = Relu(preval);
}
for (size_t H = 1; H < Hidlayers.size();H++)
{
size_t prevh = H - 1;
for (size_t h = 0; h < Hidlayers[H].size(); h++)
{
double preval = Hidlayers[H][h].bias;
for (size_t p = 0; p < Hidlayers[prevh].size(); p++)
{
preval += Hidlayers[prevh][p].actval * Hidlayers[prevh][p].weights[h];
}
Hidlayers[H][h].preactval = preval;
Hidlayers[H][h].actval = Relu(preval);
}
}
for (size_t O = 0; O < Outlayer.size(); O++)
{
size_t lhid = Hidlayers.size() - 1;
double preval = Outlayer[O].bias;
for (size_t h = 0; h < Hidlayers[lhid].size(); h++)
{
preval += Hidlayers[lhid][h].actval * Hidlayers[lhid][h].weights[O];
}
Outlayer[O].preactval = preval;
}
}
void Net::Backprop(const std::vector <double>& targets)
{
for (size_t O = 0; O < Outlayer.size(); O++)
{
double PDval{};
PDval = targets[O] - Outlayer[O].actval;
PDval = PDval * -1.0;
Outlayer[O].preactvalPD = PDval;
}
for (size_t H = Hidlayers.size(); H > 0; H--)
{
size_t Top = H;
size_t Current = H - 1;
for (size_t h = 0; h < Hidlayers[Current].size(); h++)
{
double actPD{};
double PreactPD{};
double biasPD{};
for (size_t hw = 0; hw < Hidlayers[Current][h].weights.size(); hw++)
{
double PDval{};
if (H == Hidlayers.size())
{
PDval = Outlayer[hw].preactvalPD * Hidlayers[Current][h].actval;
biasPD = Outlayer[hw].preactvalPD;
Outlayer[hw].biasderiv = biasPD;
actPD += Hidlayers[Current][h].weights[hw] * Outlayer[hw].preactvalPD;
calcema(0, hw, 0, "Outlayer", "Bias");
}
else
{
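// weights[hw] connects neuron h of this layer to neuron hw of the layer above, so the upper layer is indexed by hw
PDval = Hidlayers[Top][hw].preactvalPD * Hidlayers[Current][h].actval;
actPD += Hidlayers[Current][h].weights[hw] * Hidlayers[Top][hw].preactvalPD;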
}
Hidlayers[Current][h].weightderivs[hw] = PDval;
calcema(Current, h, hw, "Hidlayer", "Weight");
}
if (H != Hidlayers.size())
{
biasPD = Hidlayers[Top][h].preactvalPD;
Hidlayers[Top][h].biasderiv = biasPD;
calcema(Top, h, 0, "Hidlayer", "Bias");
}
Hidlayers[Current][h].actvalPD = actPD;
PreactPD = Hidlayers[Current][h].actvalPD * Reluderiv(Hidlayers[Current][h].preactval);
Hidlayers[Current][h].preactvalPD = PreactPD;
actPD = 0;
}
}
for (size_t I = 0; I < Inlayer.size(); I++)
{
double PDval{};
for (size_t hw = 0; hw < Inlayer[I].weights.size(); hw++)
{
PDval = Hidlayers[0][hw].preactvalPD * Inlayer[I].val;
Inlayer[I].weightderivs[hw] = PDval;
double biasPD = Hidlayers[0][hw].preactvalPD;
Hidlayers[0][hw].biasderiv = biasPD;
}
}
}
//PROBABLE CULPRIT
void Net::Updateweights()
{
for (size_t I = 0; I < Inlayer.size(); I++)
{
double PD{};
for (size_t iw = 0; iw < Inlayer[I].weights.size(); iw++)
{
if (optimizerformula == 2)
{
PD = (Inlayer[I].weightderivs[iw] * -1.0) - (Lambda * regularize(Inlayer[I].weights[iw], "L1"));
Inlayer[I].weights[iw] = Inlayer[I].weights[iw] + (Alpha * PD);
}
else if (optimizerformula == 1)
{
PD = (Inlayer[I].emavals[iw] * -1.0) - (Lambda * regularize(Inlayer[I].weights[iw], "L1"));
Inlayer[I].weights[iw] = Inlayer[I].weights[iw] + (Alpha * PD);
}
}
}
for (size_t H = 0; H < Hidlayers.size(); H++)
{
for (size_t h = 0; h < Hidlayers[H].size(); h++)
{
double PD{};
for (size_t hw = 0; hw < Hidlayers[H][h].weights.size(); hw++)
{
if (optimizerformula == 2)
{
PD = (Hidlayers[H][h].weightderivs[hw] * -1.0) - (Lambda * regularize(Hidlayers[H][h].weights[hw], "L1"));
Hidlayers[H][h].weights[hw] = Hidlayers[H][h].weights[hw] + (Alpha * PD);
}
else if (optimizerformula == 1)
{
PD = (Hidlayers[H][h].emavals[hw] * -1.0) - (Lambda * regularize(Hidlayers[H][h].weights[hw], "L1"));
Hidlayers[H][h].weights[hw] = Hidlayers[H][h].weights[hw] + (Alpha * PD);
}
}
if (optimizerformula == 1)
{
PD = Hidlayers[H][h].biasema * -1.0;
Hidlayers[H][h].bias = Hidlayers[H][h].bias + (Alpha * PD);
}
else if (optimizerformula == 2)
{
PD = Hidlayers[H][h].biasderiv * -1.0;
Hidlayers[H][h].bias = Hidlayers[H][h].bias + (Alpha * PD);
}
}
}
for (size_t biases = 0; biases < Outlayer.size(); biases++)
{
if (optimizerformula == 2)
{
double PD = Outlayer[biases].biasderiv * -1.0;
Outlayer[biases].bias = Outlayer[biases].bias + (Alpha * PD);
}
else if (optimizerformula == 1)
{
double PD = Outlayer[biases].biasema * -1.0;
Outlayer[biases].bias = Outlayer[biases].bias + (Alpha * PD);
}
}
}
void Net::printvalues(double totalloss)
{
for (size_t Res = 0; Res < Outlayer.size(); Res++)
{
std::cout << Outlayer[Res].actval << " / ";
}
std::cout << '\n' << "loss = " << totalloss << '\n';
}
double Net::lossfunc(const std::vector <double>& target)
{
int pos{ -1 };
double val{};
for (size_t t = 0; t < target.size(); t++)
{
pos += 1;
if (target[t] > 0)
{
break;
}
}
val = -log(Outlayer[pos].actval);
return val;
}
//OTHER PROBABLE CULPRIT
void Net::calcema(int Layernum, int neuron, int weight, std::string layer, std::string BorW )
{
static double Beta{ 0.9 };
if (BorW == "Weight")
{
if (layer == "Inlayer")
{
Inlayer[neuron].emavals[weight] = (Beta * Inlayer[neuron].emavals[weight]) + ((1.0 - Beta) * Inlayer[neuron].weightderivs[weight]);
}
else if (layer == "Hidlayers")
{
Hidlayers[Layernum][neuron].emavals[weight] = (Beta * Hidlayers[Layernum][neuron].emavals[weight]) + ((1.0 - Beta) * Hidlayers[Layernum][neuron].weightderivs[weight]);
}
}
else if (BorW == "Bias")
{
if (layer == "Hidlayers")
{
Hidlayers[Layernum][neuron].biasema = (Beta * Hidlayers[Layernum][neuron].biasema) + ((1.0 - Beta) * Hidlayers[Layernum][neuron].biasderiv);
}
else if (layer == "Outlayer")
{
Outlayer[neuron].biasema = (Beta * Outlayer[neuron].biasema) + ((1.0 - Beta) * Outlayer[neuron].biasderiv);
}
}
}
int main()
{
std::vector <double> innums{};
std::vector <double> outnums{};
std::vector <std::string> INstrings{};
std::vector <std::string> OUTstrings{};
std::string nums{};
std::string in{};
std::string out{};
double totalloss{};
double loss{};
double single{};
int batchcount{0};
Net net(0, 2, 4, 3, 4, 0.0001, 0.006);
net.Initweights();
net.defineoptimizer("ExpAvrg");
std::ifstream file("N.txt");
while (file.is_open())
{
int count{ 0 };
while (file >> nums)
{
if (nums == "in:")
{
count += 1;
std::getline(file, in);
INstrings.push_back(in);
}
else if (nums == "out:")
{
count += 1;
std::getline(file, out);
OUTstrings.push_back(out);
}
else;
}
break;
}
for (int epoch = 0; epoch < 50000; epoch++)
{
int random = randomt(0, 99);
std::string invals = INstrings[random];
std::string outvals = OUTstrings[random];
std::stringstream in(invals);
std::stringstream out(outvals);
std::cout << "fetching" << '\n';
while (in >> single)
{
innums.push_back(single);
}
while (out >> single)
{
outnums.push_back(single);
}
std::cout << "epoch " << epoch << '\n';
std::cout << "In nums: " << '\n';
for (auto element : innums) std::cout << element << " / ";
std::cout << '\n' << "targets: " << '\n';
for (auto element : outnums) std::cout << element << " / ";
std::cout << '\n';
batchcount += 1;
net.Feedforward(innums);
net.softmax();
loss += net.lossfunc(outnums);
totalloss = loss / batchcount;
net.printvalues(totalloss);
net.Backprop(outnums);
net.Updateweights();
innums.clear();
outnums.clear();
}
std::cout << "in size: "<< INstrings.size() << '\n';
std::cout << "out size: " << OUTstrings.size() << '\n';
}
So, for anyone who might be interested in this: I found the answer.
In my resource, the formula for SGD with momentum is:
Momentumgradient = partial derivative of weight + (beta * previous momentumgradient);
What I was doing wrong was assuming that calcema() already performed that calculation. I then took the value calculated in calcema() and plugged it into the normal SGD formula, replacing the weight derivative with the momentum gradient value.
The fix was simply to do exactly what the formula says (I feel stupid now),
which is:
In Updateweights():
//previous formula
else if (optimizerformula == 1)
{
PD = (Hidlayers[H][h].emavals[hw] * -1.0) - (Lambda * regularize(Hidlayers[H][h].weights[hw], "L1"));
Hidlayers[H][h].weights[hw] = Hidlayers[H][h].weights[hw] + (Alpha * PD);
}
//update formula
else if (optimizerformula == 1)
{
PD = ((Hidlayers[H][h].weightderivs[hw] + (0.9 *Hidlayers[H][h].emavals[hw])) * -1.0) - (Lambda * regularize(Hidlayers[H][h].weights[hw], "L1"));
Hidlayers[H][h].weights[hw] = Hidlayers[H][h].weights[hw] + (Alpha * PD);
}
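To make the difference between the two formulas concrete, here is a minimal sketch on a one-dimensional toy loss rather than on the net above. The names grad, run_momentum and run_ema, and the toy loss itself, are assumptions made up for this illustration. The last flag shows that an exponential average which is never refreshed stays at zero and leaves the weight untouched. That may be worth checking in the posted code, where calcema() is called with "Hidlayer" but compares the argument against "Hidlayers", and is never called for "Inlayer".

```cpp
#include <cmath>

// Toy loss L(w) = 0.5 * w * w, so dL/dw = w.
// Assumption: this stands in for a real weight derivative from backprop.
double grad(double w) { return w; }

// Momentum as in the formula above: v = grad + beta * v_prev, then w -= alpha * v.
double run_momentum(double w, double alpha, double beta, int steps)
{
    double v = 0.0;
    for (int i = 0; i < steps; ++i)
    {
        v = grad(w) + beta * v;
        w -= alpha * v;
    }
    return w;
}

// Exponential moving average: v = beta * v_prev + (1 - beta) * grad.
// With update_ema == false the average is never refreshed, stays 0,
// and the weight never moves.
double run_ema(double w, double alpha, double beta, int steps, bool update_ema)
{
    double v = 0.0;
    for (int i = 0; i < steps; ++i)
    {
        if (update_ema) v = beta * v + (1.0 - beta) * grad(w);
        w -= alpha * v;
    }
    return w;
}
```

Calling run_momentum(1.0, 0.01, 0.9, 2000) and run_ema(1.0, 0.01, 0.9, 2000, true) both drive the weight towards 0 on this toy loss, the averaged version more slowly because its step is scaled by (1 - beta). With update_ema set to false the weight stays at its starting value.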
This is going to be a long one. I am still very new to coding (I started 3 months ago), so I know my code is not perfect, and any criticism beyond the question is more than welcome. I have specifically avoided using pointers because I do not fully understand them. I can use them, but I don't trust that I will use them correctly in a program like this.
First things first: I have a version of this with only 1 hidden layer, and that net works perfectly. I started running into problems when I tried to expand the number of hidden layers.
Some info on the net:
-I am using softmax output activation as I have 3 output neurons.
-I am using tanh as my activation function on the rest of the net.
-The file being read for the input has a format of
"input: 0.56 0.76 0.23 0.67"
"output: 0.0 0.0 1.0" (this is the target)
-The weights connecting a layer 1 neuron to a layer 2 neuron are stored in the layer 1 neuron.
-The biases for each neuron are stored in that neuron.
-The target is 1.0 0.0 0.0 if the sum of the input numbers is below one, 0.0 1.0 0.0 if sum is between 1 and 2, 0.0 0.0 1.0 if sum is above 2.
-using L1 regularization.
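The target rule in the list above can be written down as a small function. This is only a sketch: the name make_target is made up for this example, and the handling of sums that are exactly 1 or 2 is an assumption, since the description does not say which class they belong to.

```cpp
#include <array>
#include <numeric>
#include <vector>

// One-hot target derived from the sum of the inputs.
// Assumption: sums exactly equal to 1 or 2 are placed in the middle class.
std::array<double, 3> make_target(const std::vector<double>& inputs)
{
    const double sum = std::accumulate(inputs.begin(), inputs.end(), 0.0);
    if (sum < 1.0) return {1.0, 0.0, 0.0};   // sum below one
    if (sum <= 2.0) return {0.0, 1.0, 0.0};  // sum between 1 and 2
    return {0.0, 0.0, 1.0};                  // sum above 2
}
```

For the sample line "input: 0.56 0.76 0.23 0.67" the sum is above 2, so the function returns the third class, which matches the "output: 0.0 0.0 1.0" line of the format example.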
Those problems specifically being:
The softmax output values do not move from a relatively equalised range, i.e.:
positions 1 and 2 in the target vector each occur roughly half of the time, while position 3 occurs less than 3% of the time. So by relatively equalised I mean the softmax output generally looks something like
"0.56.... 0.48.... 0.02..." even after 500 epochs.
The weights at the hidden layer closest to the input layer don't change much at all, which is what I think vanishing gradients are. I might be wrong on this. But the weights at the hidden layer closest to the output end up between -50 and 50 (which I think is okay?).
Things I have tried:
I have tried using Relu, parametric Relu and exponential Relu, but with all of these the softmax output value for neuron 3 keeps rising while the values of the other 2 neurons keep falling. These values continue on that trajectory until either 500 epochs have been reached or they turn into nans. (I think this has to do with the structure of my code rather than the Relu function itself.)
If I set the number of hidden layers above 3 while using relu, it immediately spits out nans, within the first epoch.
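One common way for a softmax to produce nans is overflow in exp() when the pre-activations grow large, which unbounded activations such as Relu make more likely. Whether that is what happens in this net is an assumption, not something confirmed here. A minimal sketch of the usual remedy, subtracting the largest pre-activation before exponentiating (the function name stable_softmax is made up for this example):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Softmax that subtracts the largest input first. The result is mathematically
// unchanged, but exp() never receives a positive argument, so it cannot overflow.
std::vector<double> stable_softmax(const std::vector<double>& z)
{
    std::vector<double> out(z.size());
    if (z.empty()) return out;
    const double zmax = *std::max_element(z.begin(), z.end());
    double sum = 0.0;
    for (std::size_t i = 0; i < z.size(); ++i)
    {
        out[i] = std::exp(z[i] - zmax);
        sum += out[i];
    }
    for (double& v : out) v /= sum;  // sum >= 1 because the largest term is exp(0)
    return out;
}
```

With inputs such as 1000, 1001 and 999 a plain exp() would overflow to infinity and the division would give nan, while this version still returns finite values that add up to 1.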
The backprop function is pretty long, but that is because I have deconstructed it many times over to try to figure out where I might be mismatching values. I do have a condensed version, but I feel I have a higher chance of being completely off the mark there than with the deconstructed one.
I have included the Relu function code that I used. It is the first time I have used it, so I might be wrong there as well, but I don't think so; I have double checked multiple times. The Relu in the code is specifically "Elu", or exponential Relu.
Here is the code for the net:
#include <iostream>
#include <fstream>
#include <cmath>
#include <vector>
#include <sstream>
#include <random>
#include <string>
#include <iomanip>
double randomt(double x, double y)
{
std::random_device rd;
std::mt19937 mt(rd());
std::uniform_real_distribution<double> dist(x, y);
return dist(mt);
}
class InputN
{
public:
double val{};
std::vector <double> weights{};
};
class HiddenN
{
public:
double preactval{};
double actval{};
double actvalPD{};
double preactvalpd{};
std::vector <double> weights{};
double bias{};
};
class OutputN
{
public:
double preactval{};
double actval{};
double preactvalpd{};
double bias{};
};
class Net
{
public:
std::vector <InputN> inneurons{};
std::vector <std::vector <HiddenN>> hiddenneurons{};
std::vector <OutputN> outputneurons{};
double lambda{ 0.015 };
double alpha{ 0.02 };
};
double tanhderiv(double val)
{
return 1 - tanh(val) * tanh(val);
}
double Relu(double val)
{
if (val < 0) return 0.01 *(exp(val) - 1);
else return val;
}
double Reluderiv(double val)
{
if (val < 0) return Relu(val) + 0.01;
else return 1;
}
double regularizer(double weight)
{
// derivative of |weight| for L1 regularization: the sign of the weight
if (weight > 0) return 1;
else if (weight < 0) return -1;
else return 0;
}
void feedforward(Net& net)
{
double sum{};
int prevlayer{};
for (size_t Hsize = 0; Hsize < net.hiddenneurons.size(); Hsize++)
{
//std::cout << "in first loop" << '\n';
prevlayer = Hsize - 1;
for (size_t Hel = 0; Hel < net.hiddenneurons[Hsize].size(); Hel++)
{
//std::cout << "in second loop" << '\n';
if (Hsize == 0)
{
//std::cout << "in first if" << '\n';
for (size_t Isize = 0; Isize < net.inneurons.size(); Isize++)
{
//std::cout << "in fourth loop" << '\n';
sum += (net.inneurons[Isize].val * net.inneurons[Isize].weights[Hel]);
}
net.hiddenneurons[Hsize][Hel].preactval = net.hiddenneurons[Hsize][Hel].bias + sum;
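// activate the biased pre-activation so it matches tanhderiv(preactval) in backprop
net.hiddenneurons[Hsize][Hel].actval = tanh(net.hiddenneurons[Hsize][Hel].preactval);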
sum = 0;
//std::cout << "first if done" << '\n';
}
else
{
//std::cout << "in else" << '\n';
for (size_t prs = 0; prs < net.hiddenneurons[prevlayer].size(); prs++)
{
//std::cout << "in fourth loop" << '\n';
sum += net.hiddenneurons[prevlayer][prs].actval * net.hiddenneurons[prevlayer][prs].weights[Hel];
}
//std::cout << "fourth loop done" << '\n';
net.hiddenneurons[Hsize][Hel].preactval = net.hiddenneurons[Hsize][Hel].bias + sum;
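// activate the biased pre-activation so it matches tanhderiv(preactval) in backprop
net.hiddenneurons[Hsize][Hel].actval = tanh(net.hiddenneurons[Hsize][Hel].preactval);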
//std::cout << "else done" << '\n';
sum = 0;
}
}
}
//std::cout << "first loop done " << '\n';
int lasthid = net.hiddenneurons.size() - 1;
for (size_t Osize = 0; Osize < net.outputneurons.size(); Osize++)
{
for (size_t Hsize = 0; Hsize < net.hiddenneurons[lasthid].size(); Hsize++)
{
sum += (net.hiddenneurons[lasthid][Hsize].actval * net.hiddenneurons[lasthid][Hsize].weights[Osize]);
}
net.outputneurons[Osize].preactval = net.outputneurons[Osize].bias + sum;
sum = 0; // reset so the next output neuron does not inherit this neuron's sum
}
}
void softmax(Net& net)
{
double sum{};
for (size_t Osize = 0; Osize < net.outputneurons.size(); Osize++)
{
sum += exp(net.outputneurons[Osize].preactval);
}
for (size_t Osize = 0; Osize < net.outputneurons.size(); Osize++)
{
net.outputneurons[Osize].actval = exp(net.outputneurons[Osize].preactval) / sum;
}
}
double lossfunc(Net& net, const std::vector <double>& target)
{
// cross-entropy loss for a one-hot target: find the index of the target class
size_t pos{ 0 };
for (size_t t = 0; t < target.size(); t++)
{
if (target[t] > 0)
{
pos = t;
break;
}
}
// only the target class contributes to the loss
return -log(net.outputneurons[pos].actval);
}
void backprop(Net& net, std::vector<double>& target)
{
for (size_t outI = 0; outI < net.outputneurons.size(); outI++)
{
double PD = target[outI] - net.outputneurons[outI].actval;
net.outputneurons[outI].preactvalpd = PD * -1;
}
size_t lasthid = net.hiddenneurons.size() - 1;
for (size_t LH = 0; LH < net.hiddenneurons[lasthid].size(); LH++)
{
for (size_t LHW = 0; LHW < net.hiddenneurons[lasthid][LH].weights.size(); LHW++)
{
double weight = net.hiddenneurons[lasthid][LH].weights[LHW];
double PD = net.outputneurons[LHW].preactvalpd * net.hiddenneurons[lasthid][LH].actval;
PD = PD * -1;
double delta = PD - (net.lambda * regularizer(weight));
weight = weight + (net.alpha * delta);
net.hiddenneurons[lasthid][LH].weights[LHW] = weight;
}
}
for (size_t OB = 0; OB < net.outputneurons.size(); OB++)
{
double bias = net.outputneurons[OB].bias;
double BPD = net.outputneurons[OB].preactvalpd;
BPD = BPD * -1;
double Delta = BPD;
bias = bias + (net.alpha * Delta);
net.outputneurons[OB].bias = bias; // store the updated bias back in the neuron
}
for (size_t HPD = 0; HPD < net.hiddenneurons[lasthid].size(); HPD++)
{
double PD{};
for (size_t HW = 0; HW < net.outputneurons.size(); HW++)
{
PD += net.hiddenneurons[lasthid][HPD].weights[HW] * net.outputneurons[HW].preactvalpd;
}
net.hiddenneurons[lasthid][HPD].actvalPD = PD;
PD = 0;
}
for (size_t HPD = 0; HPD < net.hiddenneurons[lasthid].size(); HPD++)
{
net.hiddenneurons[lasthid][HPD].preactvalpd = net.hiddenneurons[lasthid][HPD].actvalPD * tanhderiv(net.hiddenneurons[lasthid][HPD].preactval);
}
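// signed index: with size_t the test AllHid > -1 compares against the largest unsigned value and is never true
for (int AllHid = static_cast<int>(net.hiddenneurons.size()) - 2; AllHid > -1; AllHid--)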
{
size_t uplayer = AllHid + 1;
for (size_t cl = 0; cl < net.hiddenneurons[AllHid].size(); cl++)
{
for (size_t clw = 0; clw < net.hiddenneurons[AllHid][cl].weights.size(); clw++)
{
double weight = net.hiddenneurons[AllHid][cl].weights[clw];
double PD = net.hiddenneurons[uplayer][clw].preactvalpd * net.hiddenneurons[AllHid][cl].actval;
PD = PD * -1;
double delta = PD - (net.lambda * regularizer(weight));
weight = weight + (net.alpha * delta);
net.hiddenneurons[AllHid][cl].weights[clw] = weight;
}
}
for (size_t up = 0; up < net.hiddenneurons[uplayer].size(); up++)
{
double bias = net.hiddenneurons[uplayer][up].bias;
double PD = net.hiddenneurons[uplayer][up].preactvalpd;
PD = PD * -1;
double delta = PD;
bias = bias + (net.alpha * delta);
net.hiddenneurons[uplayer][up].bias = bias; // store the updated bias back in the neuron
}
for (size_t APD = 0; APD < net.hiddenneurons[AllHid].size(); APD++)
{
double PD{};
for (size_t APDW = 0; APDW < net.hiddenneurons[AllHid][APD].weights.size(); APDW++)
{
PD += net.hiddenneurons[AllHid][APD].weights[APDW] * net.hiddenneurons[uplayer][APDW].preactvalpd;
}
net.hiddenneurons[AllHid][APD].actvalPD = PD;
PD = 0;
}
for (size_t PPD = 0; PPD < net.hiddenneurons[AllHid].size(); PPD++)
{
double PD = net.hiddenneurons[AllHid][PPD].actvalPD * tanhderiv(net.hiddenneurons[AllHid][PPD].preactval);
net.hiddenneurons[AllHid][PPD].preactvalpd = PD;
}
}
for (size_t IN = 0; IN < net.inneurons.size(); IN++)
{
for (size_t INW = 0; INW < net.inneurons[IN].weights.size(); INW++)
{
double weight = net.inneurons[IN].weights[INW];
double PD = net.hiddenneurons[0][INW].preactvalpd * net.inneurons[IN].val;
PD = PD * -1;
double delta = PD - (net.lambda * regularizer(weight));
weight = weight + (net.alpha * delta);
net.inneurons[IN].weights[INW] = weight;
}
}
for (size_t hidB = 0; hidB < net.hiddenneurons[0].size(); hidB++)
{
double bias = net.hiddenneurons[0][hidB].bias;
double PD = net.hiddenneurons[0][hidB].preactvalpd;
PD = PD * -1;
double delta = PD;
bias = bias + (net.alpha * delta);
net.hiddenneurons[0][hidB].bias = bias;
}
}
int main()
{
std::vector <double> invals{ };
std::vector <double> target{ };
Net net;
InputN Ineuron;
HiddenN Hneuron;
OutputN Oneuron;
int IN = 4;
int HIDLAYERS = 4;
int HID = 8;
int OUT = 3;
for (int i = 0; i < IN; i++)
{
net.inneurons.push_back(Ineuron);
for (int m = 0; m < HID; m++)
{
net.inneurons.back().weights.push_back(randomt(0.0, 0.5));
}
}
//std::cout << "first loop done" << '\n';
for (int s = 0; s < HIDLAYERS; s++)
{
net.hiddenneurons.push_back(std::vector <HiddenN>());
if (s == HIDLAYERS - 1)
{
for (int i = 0; i < HID; i++)
{
net.hiddenneurons[s].push_back(Hneuron);
for (int m = 0; m < OUT; m++)
{
net.hiddenneurons[s].back().weights.push_back(randomt(0.0, 0.5));
}
net.hiddenneurons[s].back().bias = 1.0;
}
}
else
{
for (int i = 0; i < HID; i++)
{
net.hiddenneurons[s].push_back(Hneuron);
for (int m = 0; m < HID; m++)
{
net.hiddenneurons[s].back().weights.push_back(randomt(0.0, 0.5));
}
net.hiddenneurons[s].back().bias = 1.0;
}
}
}
//std::cout << "second loop done" << '\n';
for (int i = 0; i < OUT; i++)
{
net.outputneurons.push_back(Oneuron);
net.outputneurons.back().bias = randomt(0.0, 0.5);
}
//std::cout << "third loop done" << '\n';
int count{};
std::ifstream fileread("N.txt");
for (int epoch = 0; epoch < 500; epoch++)
{
count = 0;
if (epoch == 100 || epoch == 100 * 2 || epoch == 100 * 3 || epoch == 100 * 4 || epoch == 499)
{
printvals("no", net);
}
fileread.clear(); fileread.seekg(0, std::ios::beg);
while (fileread.is_open())
{
std::cout << '\n' << "epoch: " << epoch << '\n';
std::string fileline{};
fileread >> fileline;
if (fileline == "in:")
{
std::string input{};
double nums{};
std::getline(fileread, input);
std::stringstream ss(input);
while (ss >> nums)
{
invals.push_back(nums);
}
}
if (fileline == "out:")
{
std::string output{};
double num{};
std::getline(fileread, output);
std::stringstream ss(output);
while (ss >> num)
{
target.push_back(num);
}
}
count += 1;
if (count == 2)
{
for (size_t inv = 0; inv < invals.size(); inv++)
{
net.inneurons[inv].val = invals[inv];
}
//std::cout << "calling feedforward" << '\n';
feedforward(net);
//std::cout << "ff done" << '\n';
softmax(net);
printvals("output", net);
std::cout << "target: " << '\n';
for (auto element : target) std::cout << element << " / ";
std::cout << '\n';
backprop(net, target);
invals.clear();
target.clear();
count = 0;
}
if (fileread.eof()) break;
}
}
//std::cout << "fourth loop done" << '\n';
return 1;
}
Much appreciated to anyone who actually made it through all that! :)
I'm translating the Python 'page_dewarper' (https://mzucker.github.io/2016/08/15/page-dewarping.html) into C++. I'm going to use dlib, a fantastic tool that has helped me with a few optimization problems before. On line 748 of the GitHub repo (https://github.com/mzucker/page_dewarp/blob/master/page_dewarp.py), Matt uses the optimize function from SciPy to find the minimal distance between two vectors. I think my C++ equivalent should be solve_least_squares_lm() or solve_least_squares(). I'll give a concrete example to analyze.
My data:
a) dstpoints is a vector with OpenCV points - std::vector<cv::Point2f> (I have 162 points in this example, they are not changing),
b) ppts is also std::vector<cv::Point2f> and the same size as dstpoints.
std::vector<cv::Point2f> ppts = project_keypoints(params, input);
It is dependent on:
- dlib::column_vector 'input' is 2*162=324 long and is not changing,
- dlib::column_vector 'params' is 189 long and its values should be changed to get the minimal value of variable 'suma', something like this:
double suma = 0.0;
for (int i=0; i<dstpoints_size; i++)
{
suma += pow(dstpoints[i].x - ppts[i].x, 2);
suma += pow(dstpoints[i].y - ppts[i].y, 2);
}
I'm looking for the 'params' vector that gives the smallest value of the 'suma' variable. A least squares algorithm seems to be a good option for this: http://dlib.net/dlib/optimization/optimization_least_squares_abstract.h.html#solve_least_squares, but I don't know if it suits my case.
I think my problem is that every different 'params' vector gives me a different 'ppts' vector, not just a single value, and I don't know if the solve_least_squares function can handle that.
I must calculate a residual for every point. I think my 'list' from the aforementioned link should be something like this:
(ppts[i].x - dstpoints[i].x, ppts[i].y - dstpoints[i].y, ppts[i+1].x - dstpoints[i+1].x, ppts[i+1].y - dstpoints[i+1].y, etc.)
, where the 'ppts' vector depends on the 'params' vector, so that the problem can be solved with a least squares algorithm. I don't know how to create data_samples under these assumptions, because it requires a dlib::input_vector for every sample, as shown in the example: http://dlib.net/least_squares_ex.cpp.html.
Am I thinking right?
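The residual layout described above can be sketched without any optimizer. The Point2 struct and the functions residuals and sum_of_squares are stand-ins made up for this example; real code would use cv::Point2f and the output of project_keypoints, and how the residuals are handed to dlib is not shown here.

```cpp
#include <cstddef>
#include <vector>

struct Point2 { double x; double y; };  // hypothetical stand-in for cv::Point2f

// Flatten the per-point differences into (dx0, dy0, dx1, dy1, ...).
std::vector<double> residuals(const std::vector<Point2>& ppts,
                              const std::vector<Point2>& dstpoints)
{
    std::vector<double> r;
    r.reserve(2 * dstpoints.size());
    for (std::size_t i = 0; i < dstpoints.size() && i < ppts.size(); ++i)
    {
        r.push_back(ppts[i].x - dstpoints[i].x);
        r.push_back(ppts[i].y - dstpoints[i].y);
    }
    return r;
}

// Sum of squared residuals: the same quantity as 'suma' in the loop above.
double sum_of_squares(const std::vector<double>& r)
{
    double s = 0.0;
    for (double v : r) s += v * v;
    return s;
}
```

With 162 points the residual vector has 324 entries, and minimizing the sum of their squares is the same objective as minimizing 'suma'.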
I've been doing the same thing these days. My solution was to write a Powell class myself. It works, but really slowly: the program takes 2 minutes to dewarp linguistics_thesis.jpg.
I don't know what causes the program to run so slowly. Maybe it is the algorithm, or maybe the code has some extra loops. I'm a Chinese student and my school only has Java lessons, so it is normal if you find some unnecessary code in my code.
Here is my Powell class.
using namespace std;
using namespace cv;
class MyPowell
{
public:
vector<vector<double>> xi;
vector<double> pcom;
vector<double> xicom;
vector<Point2d> dstpoints;
vector<double> myparams;
vector<double> params;
vector<Point> keypoint_index;
Point2d dst_br;
Point2d dims;
int N;
int itmax;
int ncom;
int iter;
double fret, ftol;
int usingAorB;
MyPowell(vector<Point2d> &dstpoints, vector<double> &params, vector<Point> &keypoint_index);
MyPowell(Point2d &dst_br, vector<double> &params, Point2d & dims);
MyPowell();
double obj(vector<double> &params);
void powell(vector<double> &p, vector<vector<double>> &xi, double ftol, double &fret);
double sign(double a);// , double b);
double sqr(double a);
void linmin(vector<double> &p, vector<double> &xit, int n, double &fret);
void mnbrak(double & ax, double & bx, double & cx,
double & fa, double & fb, double & fc);
double f1dim(double x);
double brent(double ax, double bx, double cx, double & xmin, double tol);
vector<double> usePowell();
void erase(vector<double>& pbar, vector<double> &prr, vector<double> &pr);
};
#include"Powell.h"
MyPowell::MyPowell(vector<Point2d> &dstpoints, vector<double>& params, vector<Point> &keypoint_index)
{
this->dstpoints = dstpoints;
this->myparams = params;
this->keypoint_index = keypoint_index;
N = params.size();
itmax = N * N;
usingAorB = 1;
}
MyPowell::MyPowell(Point2d & dst_br, vector<double>& params, Point2d & dims)
{
this->dst_br = dst_br;
this->myparams.push_back(dims.x);
this->myparams.push_back(dims.y);
this->params = params;
this->dims = dims;
N = 2;
itmax = N * 1000;
usingAorB = 2;
}
MyPowell::MyPowell()
{
usingAorB = 3;
}
double MyPowell::obj(vector<double> &myparams)
{
if (1 == usingAorB)
{
vector<Point2d> ppts = Dewarp::projectKeypoints(keypoint_index, myparams);
double total = 0;
for (int i = 0; i < ppts.size(); i++)
{
double x = dstpoints[i].x - ppts[i].x;
double y = dstpoints[i].y - ppts[i].y;
total += (x * x + y * y);
}
return total;
}
else if(2 == usingAorB)
{
dims.x = myparams[0];
dims.y = myparams[1];
//cout << "dims.x " << dims.x << " dims.y " << dims.y << endl;
vector<Point2d> vdims = { dims };
vector<Point2d> proj_br = Dewarp::projectXY(vdims, params);
double total = 0;
double x = dst_br.x - proj_br[0].x;
double y = dst_br.y - proj_br[0].y;
total += (x * x + y * y);
return total;
}
return 0;
}
void MyPowell::powell(vector<double> &x, vector<vector<double>> &direc, double ftol, double &fval)
{
vector<double> x1;
vector<double> x2;
vector<double> direc1;
int myitmax = 20;
if(N>500)
myitmax = 10;
else if (N > 300)
{
myitmax = 15;
}
double fx2, t, fx, dum, delta;
fval = obj(x);
int bigind;
for (int j = 0; j < N; j++)
{
x1.push_back(x[j]);
}
int iter = 0;
while (true)
{
do
{
do
{
iter += 1;
fx = fval;
bigind = 0;
delta = 0.0;
for (int i = 0; i < N; i++)
{
direc1 = direc[i];
fx2 = fval;
linmin(x, direc1, N, fval);
if (fabs(fx2 - fval) > delta)
{
delta = fabs(fx2 - fval);
bigind = i;
}
}
if (2.0 * fabs(fx - fval) <= ftol * (fabs(fx) + fabs(fval)) + 1e-7)
{
erase(direc1, x2, x1);
return;
}
if (iter >= itmax)
{
cout << "powell exceeding maximum iterations" << endl;
return;
}
if (!x2.empty())
{
x2.clear();
}
for (int j = 0; j < N; j++)
{
x2.push_back(2.0*x[j] - x1[j]);
direc1[j] = x[j] - x1[j];
x1[j] = x[j];
}
myitmax--;
cout << fx2 << endl;
fx2 = obj(x2);
if (myitmax < 0)
return;
} while (fx2 >= fx);
dum = fx - 2 * fval + fx2;
t = 2.0*dum*pow((fx - fval - delta), 2) - delta * pow((fx - fx2), 2);
} while (t >= 0.0);
linmin(x, direc1, N, fval);
direc[bigind] = direc1;
}
}
double MyPowell::sign(double a)//, double b)
{
if (a > 0.0)
{
return 1;
}
else
{
if (a < 0.0)
{
return -1;
}
}
return 0;
}
double MyPowell::sqr(double a)
{
return a * a;
}
void MyPowell::linmin(vector<double>& p, vector<double>& xit, int n, double &fret)
{
double tol = 1e-2;
ncom = n;
pcom = p;
xicom = xit;
double ax = 0.0;
double xx = 1.0;
double bx = 0.0;
double fa, fb, fx, xmin;
mnbrak(ax, xx, bx, fa, fx, fb);
fret = brent(ax, xx, bx, xmin, tol);
for (int i = 0; i < n; i++)
{
xit[i] = (xmin * xit[i]);
p[i] += xit[i];
}
}
void MyPowell::mnbrak(double & ax, double & bx, double & cx,
double & fa, double & fb, double & fc)
{
const double GOLD = 1.618034, GLIMIT = 110.0, TINY = 1e-20;
double val, fw, tmp2, tmp1, w, wlim;
double denom;
fa = f1dim(ax);
fb = f1dim(bx);
if (fb > fa)
{
val = ax;
ax = bx;
bx = val;
val = fb;
fb = fa;
fa = val;
}
cx = bx + GOLD * (bx - ax);
fc = f1dim(cx);
int iter = 0;
while (fb >= fc)
{
tmp1 = (bx - ax) * (fb - fc);
tmp2 = (bx - cx) * (fb - fa);
val = tmp2 - tmp1;
if (fabs(val) < TINY)
{
denom = 2.0*TINY;
}
else
{
denom = 2.0*val;
}
w = bx - ((bx - cx)*tmp2 - (bx - ax)*tmp1) / (denom);
wlim = bx + GLIMIT * (cx - bx);
if ((bx - w) * (w - cx) > 0.0)
{
fw = f1dim(w);
if (fw < fc)
{
ax = bx;
fa = fb;
bx = w;
fb = fw;
return;
}
else if (fw > fb)
{
cx = w;
fc = fw;
return;
}
w = cx + GOLD * (cx - bx);
fw = f1dim(w);
}
else
{
if ((cx - w)*(w - wlim) >= 0.0)
{
fw = f1dim(w);
if (fw < fc)
{
bx = cx;
cx = w;
w = cx + GOLD * (cx - bx);
fb = fc;
fc = fw;
fw = f1dim(w);
}
}
else if ((w - wlim)*(wlim - cx) >= 0.0)
{
w = wlim;
fw = f1dim(w);
}
else
{
w = cx + GOLD * (cx - bx);
fw = f1dim(w);
}
}
ax = bx;
bx = cx;
cx = w;
fa = fb;
fb = fc;
fc = fw;
}
}
double MyPowell::f1dim(double x)
{
vector<double> xt;
for (int j = 0; j < ncom; j++)
{
xt.push_back(pcom[j] + x * xicom[j]);
}
return obj(xt);
}
double MyPowell::brent(double ax, double bx, double cx, double & xmin, double tol = 1.48e-8)
{
const double CGOLD = 0.3819660, ZEPS = 1.0e-4;
int itmax = 500;
double a = MIN(ax, cx);
double b = MAX(ax, cx);
double v = bx;
double w = v, x = v;
double deltax = 0.0;
double fx = f1dim(x);
double fv = fx;
double fw = fx;
double rat = 0, u = 0, fu;
int iter;
int done;
double dx_temp, xmid, tol1, tol2, tmp1, tmp2, p;
for (iter = 0; iter < 500; iter++)
{
xmid = 0.5 * (a + b);
tol1 = tol * fabs(x) + ZEPS;
tol2 = 2.0*tol1;
if (fabs(x - xmid) <= (tol2 - 0.5*(b - a)))
break;
done = -1;
if (fabs(deltax) > tol1)
{
tmp1 = (x - w) * (fx - fv);
tmp2 = (x - v) * (fx - fw);
p = (x - v) * tmp2 - (x - w) * tmp1;
tmp2 = 2.0 * (tmp2 - tmp1);
if (tmp2 > 0.0)
p = -p;
tmp2 = fabs(tmp2);
dx_temp = deltax;
deltax = rat;
if ((p > tmp2 * (a - x)) && (p < tmp2 * (b - x)) &&
fabs(p) < fabs(0.5 * tmp2 * dx_temp))
{
rat = p / tmp2;
u = x + rat;
if ((u - a) < tol2 || (b - u) < tol2)
{
rat = fabs(tol1) * sign(xmid - x);
}
done = 0;
}
}
if(done)
{
if (x >= xmid)
{
deltax = a - x;
}
else
{
deltax = b - x;
}
rat = CGOLD * deltax;
}
if (fabs(rat) >= tol1)
{
u = x + rat;
}
else
{
u = x + fabs(tol1) * sign(rat);
}
fu = f1dim(u);
if (fu > fx)
{
if (u < x)
{
a = u;
}
else
{
b = u;
}
if (fu <= fw || w == x)
{
v = w;
w = u;
fv = fw;
fw = fu;
}
else if (fu <= fv || v == x || v == w)
{
v = u;
fv = fu;
}
}
else
{
if (u >= x)
a = x;
else
b = x;
v = w;
w = x;
x = u;
fv = fw;
fw = fx;
fx = fu;
}
}
if(iter > itmax)
cout << "\n Brent exceed maximum iterations.\n\n";
xmin = x;
return fx;
}
vector<double> MyPowell::usePowell()
{
ftol = 1e-4;
vector<vector<double>> xi;
for (int i = 0; i < N; i++)
{
vector<double> xii;
for (int j = 0; j < N; j++)
{
xii.push_back(0);
}
xii[i]=(1.0);
xi.push_back(xii);
}
double fret = 0;
powell(myparams, xi, ftol, fret);
//for (int i = 0; i < xi.size(); i++)
//{
// double a = obj(xi[i]);
// if (fret > a)
// {
// fret = a;
// myparams = xi[i];
// }
//}
cout << "final result" << fret << endl;
return myparams;
}
void MyPowell::erase(vector<double>& pbar, vector<double>& prr, vector<double>& pr)
{
for (int i = 0; i < pbar.size(); i++)
{
pbar[i] = 0;
}
for (int i = 0; i < prr.size(); i++)
{
prr[i] = 0;
}
for (int i = 0; i < pr.size(); i++)
{
pr[i] = 0;
}
}
I used the PRAXIS library, because it doesn't need derivative information and is fast.
I modified the code a little for my needs, and now it is faster than the original version written in Python.
Firstly, I'm a complete amateur so I may mix up some terminology.
I've been working on a Neural Network to play Connect 4 / Four In A Row.
The current design of the network model is 170 input values, 417 hidden neurons and 1 output neuron. The network is fully connected, i.e. every input is connected to every hidden neuron and every hidden neuron is connected to the output node.
Every connection has an independent weight, and every hidden node, and the single output node, have an additional bias node with a weight.
The input representation of 170 values for the game-state of Connect 4 is:
42 pairs of values (84 input variables) which denote whether a space is occupied by player 1, player 2 or is vacant.
0,0 means it's free
1,0 means it's player 1's position
0,1 means it's player 2's position
1,1 is not possible
Another 42 pairs of values (84 input variables) which denote whether adding a piece here will give player 1 or player 2 a "Connect 4"/"Four In a Row". The combination of values means the same as above.
2 final input variables to denote whose turn it is:
1,0 player 1's turn
0,1 player 2's turn
1,1 and 0,0 are not possible
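As a quick check of the arithmetic behind the 170 inputs, here is a small sketch. `Cell`, `inputCount` and `encodeCell` are hypothetical helpers for illustration only, not part of the code in this post.

```cpp
#include <array>
#include <cstddef>

enum class Cell { Free, Player1, Player2 };

// 2 values per cell for occupancy, 2 per cell for "a piece here would
// complete four in a row", plus 2 values for whose turn it is.
constexpr std::size_t inputCount(std::size_t width, std::size_t height)
{
    return width * height * 2 + width * height * 2 + 2;
}

static_assert(inputCount(7, 6) == 170, "a 7x6 board should give 170 inputs");

// Encode one cell as a pair: Free -> (0,0), Player1 -> (1,0), Player2 -> (0,1).
inline std::array<double, 2> encodeCell(Cell c)
{
    switch (c)
    {
        case Cell::Player1: return {1.0, 0.0};
        case Cell::Player2: return {0.0, 1.0};
        default:            return {0.0, 0.0};
    }
}
```

This matches the `BoardWidth*BoardHeight*4+2` array size used for the game state further down.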
I measured the average Mean Square Error of 100 games over 10,000 total games of various configurations to arrive at:
417 hidden neurons
Alpha and Beta learning rate of 0.1 at the start and dropping to 0.01 linearly across the total number of epochs
A lambda value of 0.5
90 out of 100 moves are random at the start and drop down to 10 out of every 100 after the first 50% of epochs. So at the midway point 10 out of 100 moves are random
The first 50% of epochs start with a random move
Sigmoid Activation Function used in every node
This image shows the results of the various configurations plotted with a logarithmic scale. This is how I determined which configuration to use.
I calculate this Mean Square Error by comparing the output of a board in a win-state against either -1 for a player 2 win and 1 for a player 1 win. I add these up every 100 games and divide the total by 100 to get 1000 values to plot in the above graph. I.e. the code snippet is:
if(board.InARowConnected(4) == Board<7,6,4>::Player1)
{
totalLoss += NN->BackPropagateFinal({1},previousNN,alpha,beta,lambda);
winState = true;
}
else if(board.InARowConnected(4) == Board<7,6,4>::Player2)
{
totalLoss += NN->BackPropagateFinal({-1},previousNN,alpha,beta,lambda);
winState = true;
}
else if(!board.IsThereAvailableMove())
{
totalLoss += NN->BackPropagateFinal({0},previousNN,alpha,beta,lambda);
winState = true;
}
...
if(gameNumber % 100 == 0 && gameNumber != 0)
{
totalLoss = totalLoss / gamesToOutput;
matchFile << std::fixed << std::setprecision(51) << totalLoss << std::endl;
totalLoss = 0.0;
}
The way I'm training the network is by having it play against itself over and over again. It's a feed-forward network and I'm using TD-Lambda to train it for every move (every move that wasn't randomly chosen).
The Board State that is given to the Neural Network is done through this:
template<std::size_t BoardWidth, std::size_t BoardHeight, std::size_t InARow>
void create_board_state(std::array<double,BoardWidth*BoardHeight*4+2>& gameState, const Board<BoardWidth,BoardHeight,InARow>& board,
const typename Board<BoardWidth,BoardHeight,InARow>::Player player)
{
using BoardType = Board<BoardWidth,BoardHeight,InARow>;
auto bb = board.GetBoard();
std::size_t stateIndex = 0;
for(std::size_t boardIndex = 0; boardIndex < BoardWidth*BoardHeight; ++boardIndex, stateIndex += 2)
{
if(bb[boardIndex] == BoardType::Free)
{
gameState[stateIndex] = 0;
gameState[stateIndex+1] = 0;
}
else if(bb[boardIndex] == BoardType::Player1)
{
gameState[stateIndex] = 1;
gameState[stateIndex+1] = 0;
}
else
{
gameState[stateIndex] = 0;
gameState[stateIndex+1] = 1;
}
}
for(std::size_t x = 0; x < BoardWidth; ++x)
{
for(std::size_t y = 0; y < BoardHeight; ++y)
{
auto testBoard1 = board;
auto testBoard2 = board;
testBoard1.SetBoardChecker(x,y,Board<BoardWidth,BoardHeight,InARow>::Player1);
testBoard2.SetBoardChecker(x,y,Board<BoardWidth,BoardHeight,InARow>::Player2);
// player 1's set
if(testBoard1.InARowConnected(4) == Board<7,6,4>::Player1)
gameState[stateIndex] = 1;
else
gameState[stateIndex] = 0;
// player 2's set
if(testBoard2.InARowConnected(4) == Board<7,6,4>::Player2)
gameState[stateIndex+1] = 1;
else
gameState[stateIndex+1] = 0;
stateIndex += 2;
}
}
if(player == Board<BoardWidth,BoardHeight,InARow>::Player1)
{
gameState[stateIndex] = 1;
gameState[stateIndex+1] = 0;
}
else
{
gameState[stateIndex] = 0;
gameState[stateIndex+1] = 1;
}
}
It's templated to make changing things later easier. I don't believe there's anything wrong in the above.
My Sigmoid activation function:
inline double sigmoid(const double x)
{
// return 1.0 / (1.0 + std::exp(-x));
return x / (1.0 + std::abs(x));
}
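One detail that may be worth checking: the active line above is x / (1 + |x|) (often called softsign, output in (-1, 1)), while the commented-out line is the logistic function (output in (0, 1)). The two have different derivatives, and the update code further down multiplies by y * (1.0 - y), which is the logistic derivative. A minimal sketch of both derivatives written in terms of the output y (the function names here are my own, for illustration only):

```cpp
#include <cmath>

// The active function from the snippet above; output lies in (-1, 1).
inline double softsign(double x) { return x / (1.0 + std::abs(x)); }

// Its derivative expressed via the output y:
// d/dx [x / (1 + |x|)] = 1 / (1 + |x|)^2 = (1 - |y|)^2
inline double softsignDerivFromOutput(double y)
{
    const double t = 1.0 - std::abs(y);
    return t * t;
}

// The commented-out logistic function; output lies in (0, 1).
inline double logistic(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Its derivative expressed via the output y: y * (1 - y)
inline double logisticDerivFromOutput(double y) { return y * (1.0 - y); }
```

Whether this mismatch explains the behaviour described later is an open question, but the derivative used in back-propagation normally has to match the activation that produced the output.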
My Neuron Class
template<std::size_t NumInputs>
class Neuron
{
public:
Neuron()
{
for(auto& i : m_inputValues)
i = 9;
for(auto& e : m_eligibilityTraces)
e = 9;
for(auto& w : m_weights)
w = 9;
m_biasWeight = 9;
m_biasEligibilityTrace = 9;
m_outputValue = 9;
}
void SetInputValue(const std::size_t index, const double value)
{
m_inputValues[index] = value;
}
void SetWeight(const std::size_t index, const double weight)
{
if(std::isnan(weight))
throw std::runtime_error("Shit! this is a nan bread");
m_weights[index] = weight;
}
void SetBiasWeight(const double weight)
{
m_biasWeight = weight;
}
double GetInputValue(const std::size_t index) const
{
return m_inputValues[index];
}
double GetWeight(const std::size_t index) const
{
return m_weights[index];
}
double GetBiasWeight() const
{
return m_biasWeight;
}
double CalculateOutput()
{
m_outputValue = 0;
for(std::size_t i = 0; i < NumInputs; ++i)
{
m_outputValue += m_inputValues[i] * m_weights[i];
}
m_outputValue += 1.0 * m_biasWeight;
m_outputValue = sigmoid(m_outputValue);
return m_outputValue;
}
double GetOutput() const
{
return m_outputValue;
}
double GetEligibilityTrace(const std::size_t index) const
{
return m_eligibilityTraces[index];
}
void SetEligibilityTrace(const std::size_t index, const double eligibility)
{
m_eligibilityTraces[index] = eligibility;
}
void SetBiasEligibility(const double eligibility)
{
m_biasEligibilityTrace = eligibility;
}
double GetBiasEligibility() const
{
return m_biasEligibilityTrace;
}
void ResetEligibilityTraces()
{
for(auto& e : m_eligibilityTraces)
e = 0;
m_biasEligibilityTrace = 0;
}
private:
std::array<double,NumInputs> m_inputValues;
std::array<double,NumInputs> m_weights;
std::array<double,NumInputs> m_eligibilityTraces;
double m_biasWeight;
double m_biasEligibilityTrace;
double m_outputValue;
};
My Neural Network class
template<std::size_t NumInputs, std::size_t NumHidden, std::size_t NumOutputs>
class NeuralNetwork
{
public:
void RandomiseWeights()
{
double inputToHiddenRange = 4.0 * std::sqrt(6.0 / (NumInputs+1+NumOutputs));
RandomGenerator inputToHidden(-inputToHiddenRange,inputToHiddenRange);
double hiddenToOutputRange = 4.0 * std::sqrt(6.0 / (NumHidden+1+1));
RandomGenerator hiddenToOutput(-hiddenToOutputRange,hiddenToOutputRange);
for(auto& hiddenNeuron : m_hiddenNeurons)
{
for(std::size_t i = 0; i < NumInputs; ++i)
hiddenNeuron.SetWeight(i, inputToHidden());
hiddenNeuron.SetBiasWeight(inputToHidden());
}
for(auto& outputNeuron : m_outputNeurons)
{
for(std::size_t h = 0; h < NumHidden; ++h)
outputNeuron.SetWeight(h, hiddenToOutput());
outputNeuron.SetBiasWeight(hiddenToOutput());
}
}
double GetOutput(const std::size_t index) const
{
return m_outputNeurons[index].GetOutput();
}
std::array<double,NumOutputs> GetOutputs()
{
std::array<double, NumOutputs> returnValue;
for(std::size_t o = 0; o < NumOutputs; ++o)
returnValue[o] = m_outputNeurons[o].GetOutput();
return returnValue;
}
void SetInputValue(const std::size_t index, const double value)
{
for(auto& hiddenNeuron : m_hiddenNeurons)
hiddenNeuron.SetInputValue(index, value);
}
std::array<double,NumOutputs> Calculate()
{
for(auto& h : m_hiddenNeurons)
h.CalculateOutput();
for(auto& o : m_outputNeurons)
o.CalculateOutput();
return GetOutputs();
}
std::array<double,NumOutputs> FeedForward(const std::array<double,NumInputs>& inputValues)
{
for(std::size_t h = 0; h < NumHidden; ++h)//auto& hiddenNeuron : m_hiddenNeurons)
{
for(std::size_t i = 0; i < NumInputs; ++i)
m_hiddenNeurons[h].SetInputValue(i,inputValues[i]);
m_hiddenNeurons[h].CalculateOutput();
}
std::array<double, NumOutputs> returnValue;
for(std::size_t h = 0; h < NumHidden; ++h)
{
auto hiddenOutput = m_hiddenNeurons[h].GetOutput();
for(std::size_t o = 0; o < NumOutputs; ++o)
m_outputNeurons[o].SetInputValue(h, hiddenOutput);
}
for(std::size_t o = 0; o < NumOutputs; ++o)
{
returnValue[o] = m_outputNeurons[o].CalculateOutput();
}
return returnValue;
}
double BackPropagateFinal(const std::array<double,NumOutputs>& actualValues, const NeuralNetwork<NumInputs,NumHidden,NumOutputs>* NN, const double alpha, const double beta, const double lambda)
{
for(std::size_t iO = 0; iO < NumOutputs; ++iO)
{
auto y = NN->m_outputNeurons[iO].GetOutput();
auto y1 = actualValues[iO];
for(std::size_t iH = 0; iH < NumHidden; ++iH)
{
auto e = NN->m_outputNeurons[iO].GetEligibilityTrace(iH);
auto h = NN->m_hiddenNeurons[iH].GetOutput();
auto w = NN->m_outputNeurons[iO].GetWeight(iH);
double e1 = lambda * e + (y * (1.0 - y) * h);
double w1 = w + beta * (y1 - y) * e1;
m_outputNeurons[iO].SetEligibilityTrace(iH,e1);
m_outputNeurons[iO].SetWeight(iH,w1);
}
auto e = NN->m_outputNeurons[iO].GetBiasEligibility();
auto h = 1.0;
auto w = NN->m_outputNeurons[iO].GetBiasWeight();
double e1 = lambda * e + (y * (1.0 - y) * h);
double w1 = w + beta * (y1 - y) * e1;
m_outputNeurons[iO].SetBiasEligibility(e1);
m_outputNeurons[iO].SetBiasWeight(w1);
}
for(std::size_t iH = 0; iH < NumHidden; ++iH)
{
auto h = NN->m_hiddenNeurons[iH].GetOutput();
for(std::size_t iI = 0; iI < NumInputs; ++iI)
{
auto e = NN->m_hiddenNeurons[iH].GetEligibilityTrace(iI);
auto x = NN->m_hiddenNeurons[iH].GetInputValue(iI);
auto u = NN->m_hiddenNeurons[iH].GetWeight(iI);
double sumError = 0;
for(std::size_t iO = 0; iO < NumOutputs; ++iO)
{
auto w = NN->m_outputNeurons[iO].GetWeight(iH);
auto y = NN->m_outputNeurons[iO].GetOutput();
auto y1 = actualValues[iO];
auto grad = y1 - y;
double e1 = lambda * e + (y * (1.0 - y) * w * h * (1.0 - h) * x);
sumError += grad * e1;
}
double u1 = u + alpha * sumError;
m_hiddenNeurons[iH].SetEligibilityTrace(iI,sumError);
m_hiddenNeurons[iH].SetWeight(iI,u1);
}
auto e = NN->m_hiddenNeurons[iH].GetBiasEligibility();
auto x = 1.0;
auto u = NN->m_hiddenNeurons[iH].GetBiasWeight();
double sumError = 0;
for(std::size_t iO = 0; iO < NumOutputs; ++iO)
{
auto w = NN->m_outputNeurons[iO].GetWeight(iH);
auto y = NN->m_outputNeurons[iO].GetOutput();
auto y1 = actualValues[iO];
auto grad = y1 - y;
double e1 = lambda * e + (y * (1.0 - y) * w * h * (1.0 - h) * x);
sumError += grad * e1;
}
double u1 = u + alpha * sumError;
m_hiddenNeurons[iH].SetBiasEligibility(sumError);
m_hiddenNeurons[iH].SetBiasWeight(u1);
}
double retVal = 0;
for(std::size_t o = 0; o < NumOutputs; ++o)
{
retVal += 0.5 * alpha * std::pow((NN->GetOutput(o) - GetOutput(0)),2);
}
return retVal / NumOutputs;
}
double BackPropagate(const NeuralNetwork<NumInputs,NumHidden,NumOutputs>* NN, const double alpha, const double beta, const double lambda)
{
for(std::size_t iO = 0; iO < NumOutputs; ++iO)
{
auto y = NN->m_outputNeurons[iO].GetOutput();
auto y1 = m_outputNeurons[iO].GetOutput();
for(std::size_t iH = 0; iH < NumHidden; ++iH)
{
auto e = NN->m_outputNeurons[iO].GetEligibilityTrace(iH);
auto h = NN->m_hiddenNeurons[iH].GetOutput();
auto w = NN->m_outputNeurons[iO].GetWeight(iH);
double e1 = lambda * e + (y * (1.0 - y) * h);
double w1 = w + beta * (y1 - y) * e1;
m_outputNeurons[iO].SetEligibilityTrace(iH,e1);
m_outputNeurons[iO].SetWeight(iH,w1);
}
auto e = NN->m_outputNeurons[iO].GetBiasEligibility();
auto h = 1.0;
auto w = NN->m_outputNeurons[iO].GetBiasWeight();
double e1 = lambda * e + (y * (1.0 - y) * h);
double w1 = w + beta * (y1 - y) * e1;
m_outputNeurons[iO].SetBiasEligibility(e1);
m_outputNeurons[iO].SetBiasWeight(w1);
}
for(std::size_t iH = 0; iH < NumHidden; ++iH)
{
auto h = NN->m_hiddenNeurons[iH].GetOutput();
for(std::size_t iI = 0; iI < NumInputs; ++iI)
{
auto e = NN->m_hiddenNeurons[iH].GetEligibilityTrace(iI);
auto x = NN->m_hiddenNeurons[iH].GetInputValue(iI);
auto u = NN->m_hiddenNeurons[iH].GetWeight(iI);
double sumError = 0;
for(std::size_t iO = 0; iO < NumOutputs; ++iO)
{
auto w = NN->m_outputNeurons[iO].GetWeight(iH);
auto y = NN->m_outputNeurons[iO].GetOutput();
auto y1 = m_outputNeurons[iO].GetOutput();
auto grad = y1 - y;
double e1 = lambda * e + (y * (1.0 - y) * w * h * (1.0 - h) * x);
sumError += grad * e1;
}
double u1 = u + alpha * sumError;
m_hiddenNeurons[iH].SetEligibilityTrace(iI,sumError);
m_hiddenNeurons[iH].SetWeight(iI,u1);
}
auto e = NN->m_hiddenNeurons[iH].GetBiasEligibility();
auto x = 1.0;
auto u = NN->m_hiddenNeurons[iH].GetBiasWeight();
double sumError = 0;
for(std::size_t iO = 0; iO < NumOutputs; ++iO)
{
auto w = NN->m_outputNeurons[iO].GetWeight(iH);
auto y = NN->m_outputNeurons[iO].GetOutput();
auto y1 = m_outputNeurons[iO].GetOutput();
auto grad = y1 - y;
double e1 = lambda * e + (y * (1.0 - y) * w * h * (1.0 - h) * x);
sumError += grad * e1;
}
double u1 = u + alpha * sumError;
m_hiddenNeurons[iH].SetBiasEligibility(sumError);
m_hiddenNeurons[iH].SetBiasWeight(u1);
}
double retVal = 0;
for(std::size_t o = 0; o < NumOutputs; ++o)
{
retVal += 0.5 * alpha * std::pow((NN->GetOutput(o) - GetOutput(0)),2);
}
return retVal / NumOutputs;
}
std::array<double,NumInputs*NumHidden+NumHidden+NumHidden*NumOutputs+NumOutputs> GetNetworkWeights() const
{
std::array<double,NumInputs*NumHidden+NumHidden+NumHidden*NumOutputs+NumOutputs> returnVal;
std::size_t weightPos = 0;
for(std::size_t h = 0; h < NumHidden; ++h)
{
for(std::size_t i = 0; i < NumInputs; ++i)
returnVal[weightPos++] = m_hiddenNeurons[h].GetWeight(i);
returnVal[weightPos++] = m_hiddenNeurons[h].GetBiasWeight();
}
for(std::size_t o = 0; o < NumOutputs; ++o)
{
for(std::size_t h = 0; h < NumHidden; ++h)
returnVal[weightPos++] = m_outputNeurons[o].GetWeight(h);
returnVal[weightPos++] = m_outputNeurons[o].GetBiasWeight();
}
return returnVal;
}
static constexpr std::size_t NumWeights = NumInputs*NumHidden+NumHidden+NumHidden*NumOutputs+NumOutputs;
void SetNetworkWeights(const std::array<double,NumInputs*NumHidden+NumHidden+NumHidden*NumOutputs+NumOutputs>& weights)
{
std::size_t weightPos = 0;
for(std::size_t h = 0; h < NumHidden; ++h)
{
for(std::size_t i = 0; i < NumInputs; ++i)
m_hiddenNeurons[h].SetWeight(i, weights[weightPos++]);
m_hiddenNeurons[h].SetBiasWeight(weights[weightPos++]);
}
for(std::size_t o = 0; o < NumOutputs; ++o)
{
for(std::size_t h = 0; h < NumHidden; ++h)
m_outputNeurons[o].SetWeight(h, weights[weightPos++]);
m_outputNeurons[o].SetBiasWeight(weights[weightPos++]);
}
}
void ResetEligibilityTraces()
{
for(auto& h : m_hiddenNeurons)
h.ResetEligibilityTraces();
for(auto& o : m_outputNeurons)
o.ResetEligibilityTraces();
}
private:
std::array<Neuron<NumInputs>,NumHidden> m_hiddenNeurons;
std::array<Neuron<NumHidden>,NumOutputs> m_outputNeurons;
};
I believe one of the places I may have an issue is the BackPropagate and BackPropagateFinal methods in the Neural Network class.
Here's my main function that is training the network:
int main()
{
std::ofstream matchFile("match.txt");
RandomGenerator randomPlayerStart(0,1);
RandomGenerator randomMove(0,100);
Board<7,6,4> board;
auto NN = new NeuralNetwork<7*6*4+2,417,1>();
auto previousNN = new NeuralNetwork<7*6*4+2,417,1>();
NN->RandomiseWeights();
const int numGames = 3000000;
double alpha = 0.1;
double beta = 0.1;
double lambda = 0.5;
double learningRateFloor = 0.01;
double decayRateAlpha = (alpha - learningRateFloor) / numGames;
double decayRateBeta = (beta - learningRateFloor) / numGames;
double randomChance = 90; // out of 100
double randomChangeFloor = 10;
double percentToReduceRandomOver = 0.5;
double randomChangeDecay = (randomChance-randomChangeFloor) / (numGames*percentToReduceRandomOver);
double percentOfGamesToRandomiseStart = 0.5;
int numGamesWonP1 = 0;
int numGamesWonP2 = 0;
int gamesToOutput = 100;
matchFile << "Num Games: " << numGames << "\t\ta,b,l: " << alpha << ", " << beta << ", " << lambda << std::endl;
Board<7,6,4>::Player playerStart = randomPlayerStart() > 0.5 ? Board<7,6,4>::Player1 : Board<7,6,4>::Player2;
double totalLoss = 0.0;
for(int gameNumber = 0; gameNumber < numGames; ++gameNumber)
{
bool winState = false;
Board<7,6,4>::Player playerWhoTurnItIs = playerStart;
playerStart = playerStart == Board<7,6,4>::Player1 ? Board<7,6,4>::Player2 : Board<7,6,4>::Player1;
board.ClearBoard();
int turnNumber = 0;
while(!winState)
{
Board<7,6,4>::Player playerWhoTurnItIsNot = playerWhoTurnItIs == Board<7,6,4>::Player1 ? Board<7,6,4>::Player2 : Board<7,6,4>::Player1;
bool wasRandomMove = false;
std::size_t selectedMove;
bool moveFound = false;
if(board.IsThereAvailableMove())
{
std::vector<std::size_t> availableMoves;
if((gameNumber <= numGames * percentOfGamesToRandomiseStart && turnNumber == 0) || randomMove() > 100.0-randomChance)
wasRandomMove = true;
std::size_t bestMove = 8;
double bestWorstResponse = playerWhoTurnItIs == Board<7,6,4>::Player1 ? std::numeric_limits<double>::min() : std::numeric_limits<double>::max();
for(std::size_t m = 0; m < 7; ++m)
{
Board<7,6,4> testBoard = board; // make a copy of the current board to run our tests
if(testBoard.AvailableMoveInColumn(m))
{
if(wasRandomMove)
{
availableMoves.push_back(m);
}
testBoard.AddChecker(m, playerWhoTurnItIs);
double worstResponse = playerWhoTurnItIs == Board<7,6,4>::Player1 ? std::numeric_limits<double>::max() : std::numeric_limits<double>::min();
std::size_t worstMove = 8;
for(std::size_t m2 = 0; m2 < 7; ++m2)
{
Board<7,6,4> testBoard2 = testBoard;
if(testBoard2.AvailableMoveInColumn(m2))
{
testBoard2.AddChecker(m,playerWhoTurnItIsNot);
StateType state;
create_board_state(state, testBoard2, playerWhoTurnItIs);
auto outputs = NN->FeedForward(state);
if(playerWhoTurnItIs == Board<7,6,4>::Player1 && (outputs[0] < worstResponse || worstMove == 8))
{
worstResponse = outputs[0];
worstMove = m2;
}
else if(playerWhoTurnItIs == Board<7,6,4>::Player2 && (outputs[0] > worstResponse || worstMove == 8))
{
worstResponse = outputs[0];
worstMove = m2;
}
}
}
if(playerWhoTurnItIs == Board<7,6,4>::Player1 && (worstResponse > bestWorstResponse || bestMove == 8))
{
bestWorstResponse = worstResponse;
bestMove = m;
}
else if(playerWhoTurnItIs == Board<7,6,4>::Player2 && (worstResponse < bestWorstResponse || bestMove == 8))
{
bestWorstResponse = worstResponse;
bestMove = m;
}
}
}
if(bestMove == 8)
{
std::cerr << "wasn't able to determine the best move to make" << std::endl;
return 0;
}
if(gameNumber <= numGames * percentOfGamesToRandomiseStart && turnNumber == 0)
{
std::size_t rSelection = int(randomMove()) % (availableMoves.size());
selectedMove = availableMoves[rSelection];
moveFound = true;
}
else if(wasRandomMove)
{
std::remove(availableMoves.begin(),availableMoves.end(),bestMove);
std::size_t rSelection = int(randomMove()) % (availableMoves.size());
selectedMove = availableMoves[rSelection];
moveFound = true;
}
else
{
selectedMove = bestMove;
moveFound = true;
}
}
StateType prevState;
create_board_state(prevState,board,playerWhoTurnItIs);
NN->FeedForward(prevState);
*previousNN = *NN;
// now that we have the move, add it to the board
StateType state;
board.AddChecker(selectedMove,playerWhoTurnItIs);
create_board_state(state,board,playerWhoTurnItIsNot);
auto outputs = NN->FeedForward(state);
if(board.InARowConnected(4) == Board<7,6,4>::Player1)
{
totalLoss += NN->BackPropagateFinal({1},previousNN,alpha,beta,lambda);
winState = true;
++numGamesWonP1;
}
else if(board.InARowConnected(4) == Board<7,6,4>::Player2)
{
totalLoss += NN->BackPropagateFinal({-1},previousNN,alpha,beta,lambda);
winState = true;
++numGamesWonP2;
}
else if(!board.IsThereAvailableMove())
{
totalLoss += NN->BackPropagateFinal({0},previousNN,alpha,beta,lambda);
winState = true;
}
else if(turnNumber > 0 && !wasRandomMove)
{
NN->BackPropagate(previousNN,alpha,beta,lambda);
}
if(!wasRandomMove)
{
outputs = NN->FeedForward(state);
}
++turnNumber;
playerWhoTurnItIs = playerWhoTurnItIsNot;
}
alpha -= decayRateAlpha;
beta -= decayRateBeta;
NN->ResetEligibilityTraces();
if(gameNumber > 0 && randomChance > randomChangeFloor && gameNumber <= numGames * percentToReduceRandomOver)
{
randomChance -= randomChangeDecay;
if(randomChance < randomChangeFloor)
randomChance = randomChangeFloor;
}
if(gameNumber % gamesToOutput == 0 && gameNumber != 0)
{
totalLoss = totalLoss / gamesToOutput;
matchFile << std::fixed << std::setprecision(51) << totalLoss << std::endl;
totalLoss = 0.0;
}
}
matchFile << std::endl << "Games won: " << numGamesWonP1 << " . " << numGamesWonP2 << std::endl;
auto weights = NN->GetNetworkWeights();
matchFile << std::endl;
matchFile << std::endl;
for(const auto& w : weights)
matchFile << std::fixed << std::setprecision(51) << w << ", \n";
matchFile << std::endl;
return 0;
}
One place I think I may have an issue is the minimax that's choosing the best move to make.
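Two standard-library details that the move-selection loop above relies on may be worth double-checking; the sketch below illustrates them in isolation. `argmax` and `eraseValue` are hypothetical helpers, not part of the code in this post. First, `std::numeric_limits<double>::min()` is the smallest positive normal double, while `lowest()` is the most negative finite value. Second, `std::remove` only shifts elements and returns the new logical end; the container keeps its size unless `erase` is called.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Index of the largest score. The running best starts at lowest(), the most
// negative finite double, so that negative scores are still picked up.
std::size_t argmax(const std::vector<double>& scores)
{
    double best = std::numeric_limits<double>::lowest();
    std::size_t bestIndex = scores.size();   // "not found" marker
    for (std::size_t i = 0; i < scores.size(); ++i)
    {
        if (scores[i] > best)
        {
            best = scores[i];
            bestIndex = i;
        }
    }
    return bestIndex;
}

// Erase-remove idiom: std::remove compacts the kept elements to the front,
// erase() then actually shrinks the vector.
void eraseValue(std::vector<std::size_t>& moves, std::size_t value)
{
    moves.erase(std::remove(moves.begin(), moves.end(), value), moves.end());
}
```

In the loop above the `== 8` guards may already cover the first point for the initial comparison, so this is a thing to verify rather than a confirmed cause.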
There are a few additional pieces that I don't think are too pertinent to the issues I'm having.
The Problems
It doesn't seem to matter whether I train 1000 games or 3000000 games, either Player 1 or Player 2 will win the vast majority of games. To the point of like 90 out of 100 games won by one player. If I output the actual individual game moves and outputs I can see that the games won by the other player are almost always the result of a lucky random move.
At the same time, I notice that the prediction outputs sort of "favour" a player. I.e. the outputs seem to be on the negative side of 0, so Player 1 is always making the best moves it can for example, but they all seem to be predicted toward Player 2 winning.
Sometimes it's Player 1 who wins the majority, other times it's Player 2. I'm assuming that this is due to the random weight initialisation leaning slightly toward one player.
The first game or so doesn't favour one player over the other, but it very quickly starts to "lean" one way.
I've tried training now over 3000000 games, that took 3 days, but the network still doesn't seem to be able to make good decisions. I've tested the network by having it play other "bots" on riddles.io Connect 4 comp.
It fails to recognise that it needs to block the opponent's 4 in a row
It, even after 3000000 games, doesn't play the centre column as the first move, which we know is the only starting move you can make that will guarantee a win.
Any help and direction would be greatly appreciated. Specifically, is my implementation of TD-Lambda back-propagation correct?
I need some help here. Please excuse the complexity of the code. Basically, I am looking to use the bisection method to find a value "Theta" at each i increment.
I know that all the calculations work fine when Theta is known: I have had the code simply calculate all the values. But when I introduce a while loop and the bisection method so that the code approximates Theta, I can't get it to run correctly. I am assuming my while loop is set up incorrectly.
#include <math.h>
#include <iostream>
#include <vector>
#include <iomanip>
#include <algorithm> // std::max
using namespace std;
double FuncM(double theta, double r, double F, double G, double Gprime, double d_t, double sig);
double FuncM(double theta, double r, double F, double G, double Gprime, double d_t, double sig)
{
double eps = 0.0001;
return ((log(max((r + (theta + F - 0.5 * G * Gprime ) * d_t), eps))) / sig);
}
double FuncJSTAR(double m, double x_0, double d_x);
double FuncJSTAR(double m, double x_0, double d_x)
{
return (int(((m - x_0) / d_x)+ 0.5));
}
double FuncCN(double m, double x_0, double j, double d_x);
double FuncCN(double m, double x_0, double j, double d_x)
{
return (m - x_0 - j * d_x);
}
double FuncPup(double d_t, double cn, double d_x);
double FuncPup(double d_t, double cn, double d_x)
{
return (((d_t + pow(cn, 2.0)) / (2.0 * pow(d_x, 2.0))) + (cn / (2.0 * d_x)));
}
double FuncPdn(double d_t, double cn, double d_x);
double FuncPdn(double d_t, double cn, double d_x)
{
return (((d_t + pow(cn, 2.0)) / (2.0 * pow(d_x, 2.0))) - (cn / (2.0 * d_x)));
}
double FuncPmd(double pd, double pu);
double FuncPmd(double pd, double pu)
{
return (1 - pu - pd);
}
int main()
{
const int Maturities = 5;
const double EPS = 0.00001;
double TermStructure[Maturities][2] = {
{0.5 , 0.05},
{1.0 , 0.06},
{1.5 , 0.07},
{2.0 , 0.075},
{3.0 , 0.085} };
//--------------------------------------------------------------------------------------------------------
vector<double> Price(Maturities);
double Initial_Price = 1.00;
for (int i = 0; i < Maturities; i++)
{
Price[i] = Initial_Price * exp(-TermStructure[i][1] * TermStructure[i][0]);
}
//--------------------------------------------------------------------------------------------------------
int j_max = 8;
int j_range = ((j_max * 2) + 1);
//--------------------------------------------------------------------------------------------------------
// Set up vector of possible j values
vector<int> j_value(j_range);
for (int j = 0; j < j_range; j++)
{
j_value[j] = j_max - j;
}
//--------------------------------------------------------------------------------------------------------
double dt = 0.5;
double dx = sqrt(3 * dt);
double sigma = 0.15;
double mean_reversion = 0.2; // "a" value
//--------------------------------------------------------------------------------------------------------
double r0 = TermStructure[0][1]; // Initialise r(0) in case no corresponding dt rate in term structure
//--------------------------------------------------------------------------------------------------------
double x0 = log(r0) / sigma;
//--------------------------------------------------------------------------------------------------------
vector<double> r_j(j_range); // rate at each j
vector<double> F_r(j_range);
vector<double> G_r(j_range);
vector<double> G_prime_r(j_range);
for(int j = 0; j < j_range; j++)
{
if (j == j_max)
{
r_j[j] = r0;
}
else
{
r_j[j] = exp((x0 + j_value[j]*dx) * sigma);
}
F_r[j] = -mean_reversion * r_j[j];
G_r[j] = sigma * r_j[j];
G_prime_r[j] = sigma;
}
//--------------------------------------------------------------------------------------------------------
vector<vector<double>> m((j_range), vector<double>(Maturities));
vector<vector<int>> j_star((j_range), vector<int>(Maturities));
vector<vector<double>> Central_Node((j_range), vector<double>(Maturities));
vector<double> Theta(Maturities - 1);
vector<vector<double>> Pu((j_range), vector<double>(Maturities));
vector<vector<double>> Pd((j_range), vector<double>(Maturities));
vector<vector<double>> Pm((j_range), vector<double>(Maturities));
vector<vector<double>> Q((j_range), vector<double>(Maturities));// = {}; // Arrow Debreu Price. Initialised all array values to 0
vector<double> Q_dt_sum(Maturities);// = {}; // Sum of Arrow Debreu Price at each time step. Initialised all array values to 0
//--------------------------------------------------------------------------------------------------------
double Theta_A, Theta_B, Theta_C;
int JSTART;
int JEND;
int TempStart;
int TempEnd;
int max;
int min;
vector<vector<int>> Up((j_range), vector<int>(Maturities));
vector<vector<int>> Down((j_range), vector<int>(Maturities));
// Theta[0] = 0.0498039349327417;
// Theta[1] = 0.0538710670441647;
// Theta[2] = 0.0181648634139392;
// Theta[3] = 0.0381183886467521;
for(int i = 0; i < (Maturities-1); i++)
{
Theta_A = 0.00;
Theta_B = TermStructure[i][1];
Q_dt_sum[0] = Initial_Price;
Q_dt_sum[i+1] = 0.0;
while (fabs(Theta_A - Theta_B) >= 0.0000001)
{
max = 1;
min = 10;
if (i == 0)
{
JSTART = j_max;
JEND = j_max;
}
else
{
JSTART = TempStart;
JEND = TempEnd;
}
for(int j = JSTART; j >= JEND; j--)
{
Theta_C = (Theta_A + Theta_B) / 2.0; // If Theta C is too low, the associated Price will be higher than Price from initial term structure. (ie P(Theta C) > P(i+2) for Theta C < Theta)
// If P_C > P(i+2), set Theta_B = Theta_C, else if P_C < P(i+2), set Theta_A = Theta_C, Else if P_C = P(i+2), Theta_C = Theta[i]
//cout << Theta_A << " " << Theta_B << " " << Theta_C << endl;
m[j][i] = FuncM(Theta[i], r_j[j], F_r[j], G_r[j], G_prime_r[j], dt, sigma);
j_star[j][i] = FuncJSTAR(m[j][i], x0, dx);
Central_Node[j][i] = FuncCN(m[j][i], x0, j_star[j][i], dx);
Pu[j][i] = FuncPup(dt, Central_Node[j][i], dx);
Pd[j][i] = FuncPdn(dt, Central_Node[j][i], dx);
Pm[j][i] = FuncPmd(Pd[j][i], Pu[j][i]);
for (int p = 0; p < j_range; p++)
{
Q[p][i] = 0; // Clear Q array
}
Q[j_max][0] = Initial_Price;
Q[j_max -(j_star[j][i]+1)][i+1] = Q[j_max - (j_star[j][i]+1)][i+1] + Q[j][i] * Pu[j][i] * exp(-r_j[j] * dt);
Q[j_max -(j_star[j][i] )][i+1] = Q[j_max - (j_star[j][i] )][i+1] + Q[j][i] * Pm[j][i] * exp(-r_j[j] * dt);
Q[j_max -(j_star[j][i]-1)][i+1] = Q[j_max - (j_star[j][i]-1)][i+1] + Q[j][i] * Pd[j][i] * exp(-r_j[j] * dt);
}
for (int j = 0; j < j_range; j++)
{
Up[j][i] = j_star[j][i] + 1;
Down[j][i] = j_star[j][i] - 1;
if (Up[j][i] > max)
{
max = Up[j][i];
}
if ((Down[j][i] < min) && (Down[j][i] > 0))
{
min = Down[j][i];
}
}
TempEnd = j_max - (max);
TempStart = j_max - (min);
for (int j = 0; j < j_range; j++)
{
Q_dt_sum[i+1] = Q_dt_sum[i+1] + Q[j][i] * exp(-r_j[j] * dt);
cout << Q_dt_sum[i+1] << endl;
}
if (Q_dt_sum[i+1] == Price[i+2])
{
Theta[i] = Theta_C;
break;
}
if (Q_dt_sum[i+1] > Price[i+2])
{
Theta_B = Theta_C;
}
else if (Q_dt_sum[i+1] < Price[i+2])
{
Theta_A = Theta_C;
}
}
cout << Theta[i] << endl;
}
return 0;
}
Ok, my bad. One of the values was being passed in incorrectly.
All good.
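For anyone hitting the same symptom: the thread does not say which value was wrong, but note that in the loop above FuncM is called with Theta[i] while the trial midpoint is stored in Theta_C. Whatever the actual fix was, it can help to compare against the bare shape of a bisection loop. The following is only a minimal sketch: bisect and price_error are hypothetical names, and price_error is a made-up stand-in for the pricing calculation, not the tree from the question.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Generic bisection: returns x in [lo, hi] where f(x) is close to 0.
// Assumes f is continuous and f(lo), f(hi) have opposite signs.
double bisect(const std::function<double(double)>& f,
              double lo, double hi, double tol = 1e-7, int max_iter = 200)
{
    assert(lo < hi);
    double f_lo = f(lo);
    assert(f_lo * f(hi) <= 0.0); // the root must be bracketed
    for (int it = 0; it < max_iter && (hi - lo) >= tol; ++it)
    {
        const double mid = 0.5 * (lo + hi); // trial value, recomputed on every pass
        const double f_mid = f(mid);        // evaluated at the trial value itself
        if (f_mid == 0.0)
        {
            return mid;
        }
        if (f_lo * f_mid < 0.0)
        {
            hi = mid; // sign change in [lo, mid]
        }
        else
        {
            lo = mid; // sign change in [mid, hi]
            f_lo = f_mid;
        }
    }
    return 0.5 * (lo + hi);
}

// Hypothetical "model price minus target price" as a function of theta.
// Decreasing in theta, with its root at theta = 0.04 by construction.
double price_error(double theta)
{
    const double target = std::exp(-0.06); // illustrative target, not from the question
    return std::exp(-(0.02 + theta)) - target;
}
```

Calling bisect(price_error, 0.0, 0.1) narrows the bracket until it is shorter than tol. The point of the sketch is that the function under test is re-evaluated with the current midpoint on every pass, and only one end of the bracket moves per pass.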
I'm trying to implement a gradient descent algorithm in C++. Here's the code I have so far:
#include <iostream>
double X[] {163,169,158,158,161,172,156,161,154,145};
double Y[] {52, 68, 49, 73, 71, 99, 50, 82, 56, 46 };
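double m, p;
int n = sizeof(X)/sizeof(X[0]);
// Forward declarations so main() can call the functions defined below
double Loss_function(void);
void gradientStep(double alpha);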
int main(void) {
double alpha = 0.00004; // 0.00007;
m = (Y[1] - Y[0]) / (X[1] - X[0]);
p = Y[0] - m * X[0];
for (int i = 1; i <= 8; i++) {
gradientStep(alpha);
}
return 0;
}
double Loss_function(void) {
double res = 0;
double tmp;
for (int i = 0; i < n; i++) {
tmp = Y[i] - m * X[i] - p;
res += tmp * tmp;
}
return res / 2.0 / (double)n;
}
void gradientStep(double alpha) {
double pg = 0, mg = 0;
for (int i = 0; i < n; i++) {
pg += Y[i] - m * X[i] - p;
mg += X[i] * (Y[i] - m * X[i] - p);
}
p += alpha * pg / n;
m += alpha * mg / n;
}
This code converges towards m = 2.79822, p = -382.666, and an error of 102.88. But if I use my calculator to find out the correct linear regression model, I find that the correct values of m and p should respectively be 1.601 and -191.1.
I also noticed that the algorithm won't converge for alpha > 0.00007, which seems quite low, and the value of p barely changes during the 8 iterations (or even after 2000 iterations).
What's wrong with my code?
Here's a good overview of the algorithm I'm trying to implement. The values of theta0 and theta1 are called p and m in my program.
Other implementation in python
More about the algorithm
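One possible reading of the two symptoms above (the alpha limit and p barely moving), offered as an assumption rather than something this thread confirms: the update in gradientStep is the usual batch rule for the loss computed in Loss_function, but with x values around 160 the loss surface is extremely elongated. For a quadratic loss, gradient descent only converges when alpha is below 2 divided by the largest eigenvalue of the Hessian, and progress along the flattest direction is governed by the smallest eigenvalue. The sketch below computes both for a set of x values; StepInfo and step_size_info are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct StepInfo {
    double lambda_max; // largest curvature of the loss
    double lambda_min; // smallest curvature of the loss
    double alpha_max;  // plain gradient descent diverges above 2 / lambda_max
};

// For the loss (1/2n) * sum (y_i - m*x_i - p)^2, the Hessian with respect
// to (m, p) is [[mean(x^2), mean(x)], [mean(x), 1]]. It depends only on x.
StepInfo step_size_info(const std::vector<double>& xs)
{
    assert(!xs.empty());
    double sx = 0.0, sxx = 0.0;
    for (double x : xs)
    {
        sx += x;
        sxx += x * x;
    }
    const double n = static_cast<double>(xs.size());
    const double a = sxx / n;     // mean(x^2)
    const double b = sx / n;      // mean(x)
    const double tr = a + 1.0;    // trace of the 2x2 Hessian
    const double det = a - b * b; // determinant (the variance of x)
    const double disc = std::sqrt(tr * tr - 4.0 * det);
    const double lmax = 0.5 * (tr + disc);
    const double lmin = det / lmax; // the eigenvalue product equals det
    return {lmax, lmin, 2.0 / lmax};
}
```

For the ten X values in the question this gives an alpha_max of roughly 0.000078, in line with the observed limit near 0.00007, and a ratio lambda_max / lambda_min above ten million, which would explain why one direction (mostly p) hardly changes in 8 or even 2000 iterations.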
This link gives a comprehensive view of the algorithm; it turns out I was following a completely wrong approach.
The following code does not work properly (and I have no plans to work on it further), but it should put anyone facing the same problem on the right track:
#include <vector>
#include <iostream>
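typedef std::vector<double> vect;
std::vector<double> y, omega(2, 0), omega2(2, 0);
std::vector<std::vector<double>> X;
int n = 10;
// Forward declarations so main() can call the functions defined below
double f_function(const std::vector<double> &x);
void gradientStep(double alpha);
void display(void);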
int main(void) {
/* Initialize X so that each member contains (1, x_i) */
/* Initialize y so that each member contains y_i */
double alpha = 0.00001;
display();
for (int i = 1; i <= 8; i++) {
gradientStep(alpha);
display();
}
return 0;
}
double f_function(const std::vector<double> &x) {
double c = 0; // the accumulator must start at zero
for (unsigned int i = 0; i < omega.size(); i++) {
c += omega[i] * x[i];
}
return c;
}
void gradientStep(double alpha) {
for (int i = 0; i < n; i++) {
for (unsigned int j = 0; j < X[0].size(); j++) {
omega2[j] -= alpha/(double)n * (f_function(X[i]) - y[i]) * X[i][j];
}
}
omega = omega2;
}
void display(void) {
double res = 0, tmp = 0;
for (int i = 0; i < n; i++) {
tmp = y[i] - f_function(X[i]);
res += tmp * tmp; // Loss function
}
std::cout << "omega = ";
for (unsigned int i = 0; i < omega.size(); i++) {
std::cout << "[" << omega[i] << "] ";
}
std::cout << "\tError : " << res * .5/(double)n << std::endl;
}
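As a complement to the code above, here is a hedged sketch of batch gradient descent for a straight-line fit that rescales the inputs before iterating. The standardization step is my own addition, not something taken from the linked material, and Line and fit_line are hypothetical names. The idea is to run the same kind of update on z = (x - mean) / sd, where the loss surface is well rounded, and then map the slope and intercept back to the original scale.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Line { double m; double p; }; // model: y = m * x + p

// Batch gradient descent on standardized inputs, mapped back to x units.
Line fit_line(const std::vector<double>& xs, const std::vector<double>& ys,
              double alpha = 0.1, int iterations = 1000)
{
    assert(xs.size() == ys.size() && xs.size() >= 2);
    const double n = static_cast<double>(xs.size());

    double mean = 0.0;
    for (double x : xs) mean += x;
    mean /= n;
    double var = 0.0;
    for (double x : xs) var += (x - mean) * (x - mean);
    const double sd = std::sqrt(var / n);
    assert(sd > 0.0); // identical inputs cannot determine a slope

    double w = 0.0, b = 0.0; // slope and intercept in standardized coordinates
    for (int it = 0; it < iterations; ++it)
    {
        double gw = 0.0, gb = 0.0; // gradient sums, computed from the current w and b
        for (std::size_t i = 0; i < xs.size(); ++i)
        {
            const double z = (xs[i] - mean) / sd;
            const double err = w * z + b - ys[i];
            gw += err * z;
            gb += err;
        }
        w -= alpha * gw / n; // both parameters are updated only after the full pass
        b -= alpha * gb / n;
    }
    const double m = w / sd;  // undo the scaling of x
    return {m, b - m * mean}; // undo the shift of x
}
```

With the X and Y data from the question, fit_line(X, Y) should land close to the calculator values quoted there (m about 1.601, p about -191.1). The default alpha and iteration count are illustrative choices for standardized inputs, not tuned values.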