I made a simple perceptron in C++ to study AI, and even following a book (pt_br) I could not make my perceptron return the expected result. I tried to debug and find the error, but I didn't succeed.
My algorithm AND gate results (A and B = Y):
0 && 0 = 0
0 && 1 = 1
1 && 0 = 1
1 && 1 = 1
Basically it is working as an OR gate, or randomly.
I tried jumping to Peter Norvig and Stuart Russell's book, but it goes over this quickly and doesn't explain the training of a single perceptron in depth.
I really want to learn every inch of this content, so I don't want to jump to the multilayer perceptron without making the simple one work. Can you help?
The following is the minimal code needed to run it, with some explanations:
Step function:
int signal(float &sin){
if(sin < 0)
return 0;
if(sin > 1)
return 1;
return round(sin);
}
Perceptron Struct (W are Weights):
struct perceptron{
float w[3];
};
Perceptron training:
perceptron startTraining(){
//- Random factory generator
long int t = static_cast<long int>(time(NULL));
std::mt19937 gen;
gen.seed(std::random_device()() + t);
std::uniform_real_distribution<float> dist(0.0, 1.0);
//--
//-- Samples (-1 | x | y)
float t0[][3] = {{-1,0,0},
{-1,0,1},
{-1,1,0},
{-1,1,1}};
//-- Expected result
short d [] = {0,0,0,1};
perceptron per;
per.w[0] = dist(gen);
per.w[1] = dist(gen);
per.w[2] = dist(gen);
//-- print random numbers
cout <<"INIT "<< "W0: " << per.w[0] <<" W1: " << per.w[1] << " W2: " << per.w[2] << endl;
const float n = 0.1; // Learning rate N
int saida =0; // Output Y
long int epo = 0; // Simple counter
bool erro = true; // Loop control
while(erro){
erro = false;
for (int amost = 0; amost < 4; ++amost) { // Repeat for each sample (x0=-1, x1, x2)
float u=0; // Accumulator for the weighted sum
for (int entrad = 0; entrad < 3; ++entrad) { // Repeat for every synaptic weight W0=θ, W1, W2
u = u + (per.w[entrad] * t0[amost][entrad]);// U <- Weights * Inputs
}
// u=u-per.w[0]; // some references say to take θ and subtract it from U; I tried this but without success
saida = signal(u); // returns 1 or 0
cout << d[amost] << " <- esperado | encontrado -> "<< saida<< endl;
if(saida != d[amost]){ // if the output is not equal to the expected value
for (int ajust = 0; ajust < 3; ++ajust) {
per.w[ajust] = per.w[ajust] + n * (d[amost] - saida) * t0[amost][ajust]; // W <- W + ɳ * ((d - y) x) where
erro = true; // W: Weights, ɳ: Learning rate
} // d: Desired outputs, y: outputs
} // x: samples
epo++;
}
}
cout << "Epocas(Loops): " << epo << endl;
return per;
}
Main with testing part:
int main()
{
perceptron per = startTraining();
cout << "fim" << endl;
cout << "W0: " << per.w[0] <<" W1: " << per.w[1] << " W2: " << per.w[2] << endl;
while(true){
int x,y;
cin >> x >> y;
float u=0;
u = (per.w[1] * x);
u = u + (per.w[2] * y);
//u=u-per.w[0];
cout << signal(u) << endl;
}
return 0;
}
In your main(), re-enable the line you commented out. Alternatively, you could write it like this to make it more illuminating:
float u = 0.0f;
u += (per.w[0] * float (-1));
u += (per.w[1] * float (x));
u += (per.w[2] * float (y));
The thing is that you trained the perceptron with three inputs, the first being hard-wired to "-1" (making the first weight w[0] act like a constant "bias"). Accordingly, in your training function, u is the sum of all THREE of those weight-input products.
However, in the main() you posted, you omit w[0] completely, thus producing a wrong result.
I am trying to calculate distances between particles in a box. If the distance calculated is greater than a preset cut-off distance, then the potential energy is 0. Otherwise, it is 1.
There are some rounding issues I think and I am not familiar with variable types and passing variables through functions to know what to do next.
The error
When I calculate d0 by hand I get d0 = 0.070 - this is not what the computer gets! The computer gets a number on the order of e-310.
All of the calculated distances (dij) are no shorter than 1/14, which is much larger than e-310. According to my if statement, if dij>d0, then U=0, so I should get a total energy of 0, but this is what I get:
d0 is 6.95322e-310
i is 0 j is 1 dij is 0.0714286 d0 is 6.95322e-310 Uij is 1
.....
Energy of the system is 24976
Please let me know if I could provide any more information. I did not include the entirety of my code, but the other portion involves no manipulation of d0.
I copied the relevant pieces of code below
Part 1: relevant box data
class Vector {
public:
double x;
double y;
Vector() {
}
Vector (double x_, double y_) {
x = x_;
y = y_;
}
double len() {
return sqrt(x*x + y*y);
}
double lenSqr() {
return x*x + y*y;
}
};
class Atom
{
public:
Vector pos;
Vector vel;
Vector force;
Atom (double x_, double y_) {
pos = Vector(x_, y_);
}
};
class BoxData
{
public:
const double Len = 1.;
const double LenHalf = 0.5 * Len;
long double d = 1. / 14; // d is the distance between each atom in the initial trigonal lattice
int nu = 7; // auxillary parameter - will be varied
long double d0 = d * (1 - 2^(nu - 8)); // cutoff distance
double alpha = d - d0; // maximum allowed displacement
};
int main() {
// Initialize box
LoadBox();
// Institute a for loop here
SystemEnergy();
MonteCarloMove();
return 0;
}
//Putting atoms into box
void LoadBox()
{
ofstream myfile("init.dat", ios::out);
//Load atoms in box in triangular offset lattice
const double x_shift = 1. / 14;
const double y_shift = 1. / 16;
double x = 0;
double y = 0;
double x_offset = 0;
for (y = 0; y <= 1. - y_shift; y += y_shift) {
for (x = x_offset; x < 0.99; x += x_shift) {
// create atom in position (x, y)
// and store it in array of atoms
atoms.push_back(Atom(x, y));
}
// every new row flip offset 0 -> 1/28 -> 0 -> 1/28...
if (x_offset < x_shift / 4) {
x_offset = x_shift / 2;
} else {
x_offset = 0.0;
}
}
const int numAtoms = atoms.size();
//print the position of each atom in the file init.dat
for (int i = 0; i < numAtoms; i++) {
myfile << "x is " << atoms[i].pos.x << " y is " << atoms[i].pos.y << endl;
}
myfile.close();
}
Part 2 : Energy calculation
vector<Atom> atoms;
BoxData box_;
void SystemEnergy()
{
ofstream myfile("energy.dat", ios::out);
double box_Len, box_LenHalf, box_d0;
double dij; // distance between two atoms
double Uij; // energy between two particles
double UTotal = 0;
double pbcx, pbcy; // pbc -> periodic boundary condition
double dx, dy;
myfile << "d0 is " << box_d0 << endl;
// define the number of atoms as the size of the array of atoms
const int numAtoms = atoms.size();
//pick atoms
for (int i=0; i<numAtoms-1; i++) { // pick one atom -> "Atom a"
Atom &a = atoms[i];
for (int j=i+1; j<numAtoms; j++) { // pick another atom -> "Atom b"
Atom &b = atoms[j];
dx = a.pos.x - b.pos.x;
dy = a.pos.y - b.pos.y;
pbcx = 0.0;
pbcy = 0.0;
// enforce periodic boundary conditions
if(dx > box_LenHalf) pbcx =- box_Len;
if(dx < -box_LenHalf) pbcx =+ box_Len;
if(dy > box_LenHalf) pbcy =- box_Len;
if(dy < -box_LenHalf) pbcy =+ box_Len;
dx += pbcx;
dy += pbcy;
// calculate distance between atoms
dij = sqrt(dx*dx + dy*dy);
// compare dij to the cutoff distance to determine energy
if (dij > box_d0) {
Uij = 0;
} else {
Uij = 1;
}
myfile << "i is " << i << " j is " << j << " dij is " << dij << " d0 is " << box_d0 << " Uij is " << Uij << endl;
UTotal += Uij; // sum the energies
}
}
myfile << "Energy of the system is " << UTotal << endl;
myfile.close();
}
Sorry for the formatting issues - getting the hang of copy/pasting to the forum.
I'm trying to get a handle on how to work with splines in Eigen; specifically, I want to find the value of the spline interpolation and its first and second derivatives at some point. Finding the interpolated value is easy, but when I try to calculate the derivatives I get strange values.
I tried following the instructions for the derivatives command in the manual (http://eigen.tuxfamily.org/dox/unsupported/classEigen_1_1Spline.html#af3586ab1929959e0161bfe7da40155c6), and this is my attempt in code:
#include <iostream>
#include <Eigen/Core>
#include <unsupported/Eigen/Splines>
using namespace Eigen;
using namespace std;
double scaling(double x, double min, double max) // for scaling numbers
{
return (x - min)/(max - min);
}
VectorXd scale(VectorXd xvals) // for scaling vectors
{
const double min = xvals.minCoeff();
const double max = xvals.maxCoeff();
for (int k = 0; k < xvals.size(); k++)
xvals(k) = scaling(xvals(k),min,max);
return xvals;
}
int main()
{
typedef Spline<double,1,3> spline;
VectorXd xvals = (VectorXd(4) << 0,1,2,4).finished();
VectorXd yvals = xvals.array().square(); // x^2
spline testspline = SplineFitting<spline>::Interpolate(yvals.transpose(), 3,
scale(xvals).transpose());
cout << "derivative at x = 0: " << testspline.derivatives(0.00,2) << endl;
cout << "derivative at x = 1: " << testspline.derivatives(0.25,2) << endl;
cout << "derivative at x = 2: " << testspline.derivatives(0.50,2) << endl;
cout << "derivative at x = 3: " << testspline.derivatives(0.75,2) << endl;
cout << "derivative at x = 4: " << testspline.derivatives(1.00,2) << endl;
}
it outputs
derivative at x = 0: 0 0 32
derivative at x = 1: 1 8 32
derivative at x = 2: 4 16 32
derivative at x = 3: 9 24 32
derivative at x = 4: 16 32 32
That is, the interpolation is correct (c.f. x = 3), but the derivatives are not, and they are off in a systematic way, so I'm thinking I'm doing something wrong. Since these follow x^2, the derivatives should be 0,2,4,6,8 and the second order derivative should be 2.
Any ideas on how to solve this?
Edit 1
Changing x^2 to x^2 + 1 yields the same derivatives, so that checks out at least. But changing x^2 to x^3 is wrong, but wrong in a slightly different way, output would then be:
derivative at x = 2: 8 48 192
derivative at x = 3: 27 108 288
derivative at x = 4: 64 192 384
Which is wrong; the first derivatives should be 12, 27, 48 and the second derivatives 12, 18, 24.
Also, running the x^2 case but changing the input vector to 0,1,...9 yields the same derivative as using the original input vector, but the second order derivative becomes a steady 200, which is also wrong. I fail to see why the second order derivative should depend on the number of input points.
Solved it. You were very close. All you had to do was scale the derivatives
with
1 / (x_max - x_min) (first derivative)
1 / (x_max - x_min)^2 (second derivative).
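TLDR: You normalized the x values to lie between 0 and 1 while fitting the spline, so the spline is a function of the normalized parameter u rather than of x, and its derivatives are taken with respect to u.
Instead of y as a function of x, you actually fitted:
u = (x - x_min) / (x_max - x_min)
y(u) = (x_min + u * (x_max - x_min))**2
So by the chain rule dy/du = 2x * (x_max - x_min) and d2y/du2 = 2 * (x_max - x_min)**2, which is exactly what you see (8 and 32 at x = 1, with x_max - x_min = 4). Dividing by (x_max - x_min) and (x_max - x_min)**2 recovers the derivatives with respect to x.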
Full example code:
#include <iostream>
#include <Eigen/Core>
#include <unsupported/Eigen/Splines>
using namespace Eigen;
using namespace std;
VectorXd normalize(const VectorXd &x) {
VectorXd x_norm;
x_norm.resize(x.size());
const double min = x.minCoeff();
const double max = x.maxCoeff();
for (int k = 0; k < x.size(); k++) {
x_norm(k) = (x(k) - min)/(max - min);
}
return x_norm;
}
int main() {
typedef Spline<double, 1, 3> Spline1D;
typedef SplineFitting<Spline1D> Spline1DFitting;
const Vector4d x{0, 1, 2, 4};
const Vector4d y = (x.array().square()); // x^2
const auto knots = normalize(x); // Normalize x to be between 0 and 1
const double scale = 1 / (x.maxCoeff() - x.minCoeff());
const double scale_sq = scale * scale;
Spline1D spline = Spline1DFitting::Interpolate(y.transpose(), 3, knots);
cout << "1st deriv at x = 0: " << spline.derivatives(0.00, 1)(1) * scale << endl;
cout << "1st deriv at x = 1: " << spline.derivatives(0.25, 1)(1) * scale << endl;
cout << "1st deriv at x = 2: " << spline.derivatives(0.50, 1)(1) * scale << endl;
cout << "1st deriv at x = 3: " << spline.derivatives(0.75, 1)(1) * scale << endl;
cout << "1st deriv at x = 4: " << spline.derivatives(1.00, 1)(1) * scale << endl;
cout << endl;
/**
* IMPORTANT NOTE: Eigen's spline module is not documented well. Once you fit a spline
* to find the derivative of the fitted spline at any point u [0, 1] you call:
*
* spline.derivatives(u, 1)(1)
* ^ ^ ^
* | | |
* | | +------- Access the result
* | +---------- Derivative order
* +------------- Parameter u [0, 1]
*
* The last bit `(1)` is if the spline is 1D. And value of `1` for the first
* order. `2` for the second order. Do not forget to scale the result.
*
* For higher dimensions, treat the return as a matrix and grab the 1st or
* 2nd column for the first and second derivative.
*/
cout << "2nd deriv at x = 0: " << spline.derivatives(0.00, 2)(2) * scale_sq << endl;
cout << "2nd deriv at x = 1: " << spline.derivatives(0.25, 2)(2) * scale_sq << endl;
cout << "2nd deriv at x = 2: " << spline.derivatives(0.50, 2)(2) * scale_sq << endl;
cout << "2nd deriv at x = 3: " << spline.derivatives(0.75, 2)(2) * scale_sq << endl;
cout << "2nd deriv at x = 4: " << spline.derivatives(1.00, 2)(2) * scale_sq << endl;
return 0;
}
Example output:
1st deriv at x = 0: 4.52754e-16
1st deriv at x = 1: 2
1st deriv at x = 2: 4
1st deriv at x = 3: 6
1st deriv at x = 4: 8
2nd deriv at x = 0: 2
2nd deriv at x = 1: 2
2nd deriv at x = 2: 2
2nd deriv at x = 3: 2
2nd deriv at x = 4: 2
Edit: see working .h-file for calculating B-splines of any order at the bottom.
Disclaimer: this is not an answer to my question as it is actually stated in the title, but rather a work-around with some comments.
After deliberations with user @Paul H. (see comments) I realized that my limited understanding of splines might have caused some confusion on my part. After some scrutiny of the Eigen documentation it seems plausible that the derivatives() command does indeed work as intended, which makes my question badly phrased. derivatives() calculates the derivative of the spline rather than the derivative of the fitted function, as I intended it to. I have not figured out a way to get Eigen to output the function derivatives from the fit, and I don't think it is designed to do this. However, the derivatives can of course be readily calculated once the fitted points are obtained, using some standard algorithm for calculating derivatives.
I wrote the following .h-file for calculating splines and their derivatives in the process, which I thought worthwhile sharing. It's fairly well commented for convenience.
Note that this program uses 1-indexing rather than 0-indexing of the splines, hence for e.g. a cubic B-spline, order should be set to 4. This is a small quirk that is easily fixed by changing the calculations to match Wikipedia.
bsplines.h
//
// Header-file for calculating splines using standard libraries
//
// usage:
//
// x bsplines class constructs a set of splines up to some given order,
// on some given knot sequence, the splines are stored in a vector,
// such that splines[a][b] accesses the spline of order a and index b
// x get<some_member>() is an accessor that returns a pointer to some
// data member of the spline
// x calcsplines() calculates spline values as well as first and second
// order derivatives on some predefined grid
// x calcspline() returns the spline value as well as first and second
// derivatives in some point. This algorithm is slower than the grid
// one, due to unnecessary recalculations of intermediate results
// x writesplines() writes the splines and their derivatives to a file
// x for more details see the class declaration below
// TODO:
// x change to 0-indexation
// x introduce the possibility of calculating higher order derivatives
// recursively
//
// change log:
//
// 1.0 - initial release
// 1.1 - reworked grid such that the class now expects separate
// grid and knot files.
// - added the ability to calculate spline value in a point
// rather than calculate values on a grid
// - added a feature to change knots and grid
// 1.1.1 - reworked how returning single values works
// 1.1.2 - enabled swapping grid
//
// Note:
//
// This file uses 1-indexation rather than 0-indexation, hence a cubic spline
// would be k = 4. Someone should eventually fix this as this is non-standard.
//
// Also, while only standard libraries are used here, you might want to check out
// some linear algebra package (e.g. Armadillo or Eigen) if you're going to use the
// splines in a context where you need linear algebraic operations.
//
// Originally developed by David Andersson
//
#include <iomanip>
#include <sstream>
#include <iostream>
#include <vector>
#include <algorithm>
#include <fstream>
#include <functional>
#include <cmath>  // floating-point overloads of abs used below
#include <string>
using namespace std;
typedef unsigned int uint;
class bsplines // class for bsplines
{
// private section
uint order; // order of spline
uint gridpts; // number of grid points
uint knotpts; // number of knot points
double tolerance; // tolerance for float comparisons
vector<double> knots; // knot sequence
vector<double> grid; // grid points
class spline // a member spline in the set of splines
{
int index; // the spline index, or number
vector<double> vals; // spline values
vector<double> d1; // spline first derivatives
vector<double> d2; // spline second derivatives
double tval; // same, but in one point
double td1;
double td2;
friend bsplines; // for ease of access
public:
};
vector<vector <spline>> splines; // the set of splines
// public section
public:
void readknots(string); // read knots from file
void readknotsnorm(string); // read knots from file and normalize
void readgrid(string); // read grid from file
void swapgrid(string); // reads and swaps new grid from file
void writesplines(); // write spline vals and derivs to file
void buildsplines(); // build the set of splines
void calcsplines(); // calculate spline vals and derivs
void printknots(); // print knot sequence
void printgrid(); // print grid
void printgridsize(); // print gridsize
void printvals(uint,uint); // print values of a spline
vector <double> calcspline(uint,uint,double); // calculate spline in point
// accessors // returns pointer to member
vector <double>* getknots(){return &knots;}
vector <double>* getgrid(){return &grid;}
uint* getknotpts(){return &knotpts;}
uint* getgridpts(){return &gridpts;}
uint getnosplines(uint m){return splines[m].size();}
vector <spline>* getsplines(uint m){return &splines[m];}
vector <double>* getvals(uint m, uint n){return &splines[m][n].vals;}
vector <double>* getd1(uint m, uint n){return &splines[m][n].d1;}
vector <double>* getd2(uint m, uint n){return &splines[m][n].d2;}
// constructor // sets up the spline class
bsplines (string iknots, string igrid, uint iorder, double itol)
:order(iorder), tolerance(itol)
{
readknots(iknots);
readgrid(igrid);
buildsplines();
}
};
void bsplines::buildsplines()
{
{
for (uint l = 1; l <= order; l++)
{
vector <spline> splinevec;
for (uint k = 0; k < knotpts - l; k++)
{
spline tmp;
tmp.index = k;
tmp.vals.reserve(gridpts);
tmp.d1.reserve(gridpts);
tmp.d2.reserve(gridpts);
splinevec.push_back(tmp);
}
splines.push_back(splinevec);
}
}
}
vector <double> bsplines::calcspline(uint m, uint n, double x)
{
// first order splines // exceptions handles infinities
for (auto& sp : splines[0])
{
uint i = sp.index;
if (x > knots[i+1])
sp.tval = 0;
else if ((x >= knots[i] && x < knots[i+1]) || x == knots.back())
sp.tval = 1;
else
sp.tval = 0;
}
// higher order splines
for (uint o = 1; o < order; o++)
{
uint oo = o+1; // compensating for 1-indexation
for (auto& sp : splines[o])
{
uint i = sp.index;
double t1 = knots[i+oo-1] - knots[i];
double t2 = knots[i+oo] - knots[i+1];
double c = 0;
if (abs(t1) > tolerance)
c += (x - knots[i]) / t1 * splines[o-1][i].tval;
if (abs(t2) > tolerance)
c += (knots[i+oo] - x) / t2 * splines[o-1][i+1].tval;
sp.tval = c;
}
}
uint o = order - 1;
// first order derivatives
for (auto& sp : splines[o])
{
uint i = sp.index;
double t1 = knots[i+order-1] - knots[i];
double t2 = knots[i+order] - knots[i+1];
double c = 0;
if (abs(t1) > tolerance)
c += 1.0 / t1 * splines[o-1][i].tval;
if (abs(t2) > tolerance)
c -= 1.0 / t2 * splines[o-1][i+1].tval;
c *= (order-1);
sp.td1 = c;
}
// second order derivatives
for (auto& sp : splines[o])
{
uint i = sp.index;
double t1 = (knots[i+order-1] - knots[i+0]) * (knots[i+order-2] - knots[i+0]);
double t2 = (knots[i+order-1] - knots[i+0]) * (knots[i+order-1] - knots[i+1]);
double t3 = (knots[i+order-0] - knots[i+1]) * (knots[i+order-1] - knots[i+1]);
double t4 = (knots[i+order-0] - knots[i+1]) * (knots[i+order-0] - knots[i+2]);
double c = 0;
if (abs(t1) > tolerance)
c += 1.0 / t1 * splines[o-2][sp.index].tval;
if (abs(t2) > tolerance)
c -= 1.0 / t2 * splines[o-2][sp.index+1].tval;
if (abs(t3) > tolerance)
c -= 1.0 / t3 * splines[o-2][sp.index+1].tval;
if (abs(t4) > tolerance)
c += 1.0 / t4 * splines[o-2][sp.index+2].tval;
c *= (order-1)*(order-2);
sp.td2 = c;
}
vector <double> retvals = {splines[m][n].tval, splines[m][n].td1, splines[m][n].td2};
return retvals;
}
void bsplines::calcsplines()
{
// first order splines
for (auto& sp : splines[0])
{
uint i = sp.index;
for (auto& x : grid)
{
if (x > knots[i+1])
sp.vals.push_back(0);
else if ((x >= knots[i] && x < knots[i+1]) || x == knots.back())
sp.vals.push_back(1);
else
sp.vals.push_back(0);
}
}
// higher order splines
for (uint o = 1; o < order; o++)
{
uint oo = o+1; // compensating for 1-indexation
for (auto& sp : splines[o])
{
uint i = sp.index;
double t1 = knots[i+oo-1] - knots[i];
double t2 = knots[i+oo] - knots[i+1];
for (auto& x : grid)
{
uint k = &x - &grid[0];
double c = 0;
if (abs(t1) > tolerance)
c += (x - knots[i]) / t1 * splines[o-1][i].vals[k];
if (abs(t2) > tolerance)
c += (knots[i+oo] - x) / t2 * splines[o-1][i+1].vals[k];
sp.vals.push_back(c);
}
}
}
uint o = order - 1; // use this one when accessing splines;
// first order derivatives
for (auto& sp : splines[o])
{
uint i = sp.index;
double t1 = knots[i+order-1] - knots[i];
double t2 = knots[i+order] - knots[i+1];
for (auto& x : grid)
{
uint k = &x - &grid[0];
double c = 0;
if (abs(t1) > tolerance)
c += 1.0 / t1 * splines[o-1][i].vals[k];
if (abs(t2) > tolerance)
c -= 1.0 / t2 * splines[o-1][i+1].vals[k];
c *= (order-1);
sp.d1.push_back(c);
}
}
// second order derivatives
for (auto& sp : splines[o])
{
uint i = sp.index;
double t1 = (knots[i+order-1] - knots[i+0]) * (knots[i+order-2] - knots[i+0]);
double t2 = (knots[i+order-1] - knots[i+0]) * (knots[i+order-1] - knots[i+1]);
double t3 = (knots[i+order-0] - knots[i+1]) * (knots[i+order-1] - knots[i+1]);
double t4 = (knots[i+order-0] - knots[i+1]) * (knots[i+order-0] - knots[i+2]);
for (auto& x : grid)
{
uint k = &x - &grid[0];
double c = 0;
if (abs(t1) > tolerance)
c += 1.0 / t1 * splines[o-2][sp.index].vals[k];
if (abs(t2) > tolerance)
c -= 1.0 / t2 * splines[o-2][sp.index+1].vals[k];
if (abs(t3) > tolerance)
c -= 1.0 / t3 * splines[o-2][sp.index+1].vals[k];
if (abs(t4) > tolerance)
c += 1.0 / t4 * splines[o-2][sp.index+2].vals[k];
c *= (order-1)*(order-2);
sp.d2.push_back(c);
}
}
}
void bsplines::readknots(string knotfile)
{
double x;
ifstream readknots(knotfile);
while (readknots >> x)
knots.push_back(x);
for (uint k = 0; k < order - 1; k++)
{
knots.insert(knots.begin(),knots.front());
knots.insert(knots.end(),knots.back());
}
knotpts = knots.size();
}
void bsplines::readknotsnorm(string knotfile)
{
double x;
knots.reserve(knotpts + 2*(order - 1));
ifstream readknots(knotfile);
while (readknots >> x)
knots.push_back(x);
auto minmax = minmax_element(begin(knots), end(knots));
double min = *(minmax.first);
double max = *(minmax.second);
for (auto& el : knots)
el = (el - min) / (max-min);
}
void bsplines::readgrid(string gridfile)
{
double x;
ifstream readgrid(gridfile);
while (readgrid >> x)
grid.push_back(x);
gridpts = grid.size();
}
void bsplines::swapgrid(string gridfile)
{
grid = {};
double x;
ifstream readgrid(gridfile);
while (readgrid >> x)
grid.push_back(x);
gridpts = grid.size();
}
void bsplines::printknots()
{
cout << "content in knot vector: " << endl;
for (auto& el : knots)
cout << el << " ";
cout << endl;
}
void bsplines::printgrid()
{
cout << "content in grid vector: " << endl;
for (auto& el : grid)
cout << el << " ";
cout << endl;
}
void bsplines::printgridsize()
{
cout << "number of grid points: " << endl << grid.size() << endl;
}
void bsplines::printvals(uint m, uint n)
{
cout << "content in spline (B" << m << "," << n << ") vals vector: " << endl;
for (auto& el : splines[n][m].vals)
cout << el << " ";
cout << endl;
}
void bsplines::writesplines()
{
for (uint o = 0; o < order; o++)
for (auto& sp : splines[o])
{
uint i = sp.index;
ostringstream namestream;
namestream << "B(" << fixed << setprecision(1) << i << ","
<< fixed << setprecision(1) << o << ").csv";
string filename = namestream.str();
ofstream fs;
fs.open(filename);
if (o < order - 1)
{
for (uint k = 0; k < sp.vals.size(); k++)
fs << sp.vals[k] << "," << 0 << "," << 0 << endl;
fs.close();
}
else
{
for (uint k = 0; k < sp.vals.size(); k++)
fs << sp.vals[k] << "," << sp.d1[k] << "," << sp.d2[k] << endl;
fs.close();
}
cout << "write " << sp.vals.size() << " numbers to " << filename << endl;
}
}
Edit: updated .h-file.
I have a graph with N vertices and M edges (N is between 1 and 15 and M is between 1 and N^2). The graph is directed and weighted (each edge carries the probability of taking that exact edge). You are given a start vertex and a number of edges to traverse. The program should then calculate the probability of each vertex being the end vertex.
Example input:
3 3 //Number of vertices and number of edges
1 2 0.4 //Edge nr.1 from vertex 1 to 2 with a probability of 0.4
1 3 0.5 //Edge nr.2 from vertex 1 to 3 with a probability of 0.5
2 1 0.8 //Edge nr.3...
3 //Number of questions
2 1 //Start vertex, number of edges to visit
1 1
1 2
Output:
0.8 0.2 0.0 //The probability for vertex 1 being the last vertex is 0.8, for vertex 2 it is 0.2 and for vertex 3 it is 0.0
0.1 0.4 0.5
0.33 0.12 0.55
I have used a DFS in my solution, but when number of edges to visit can be up to 1 billion, this is way too slow... I have been looking at DP but I am not sure about how to implement it for this particular problem (if it is even the right way to solve it). So I was hoping that some of you could suggest an alternative to DFS and/or perhaps a way of using/implementing DP.
(I know it might be a bit messy, I have only been programming in C++ for a month)
#include <iostream>
#include <vector>
#include <stack>
using namespace std;
struct bird {
int colour;
float probability;
};
struct path {
int from;
int to;
};
vector <vector <bird>> birdChanges;
vector <int> layer;
vector <double> savedAnswers;
stack <path> nextBirds;
int fromBird;
//Self loop
void selfLoop(){
float totalOut = 0;
for (int i = 0; i < birdChanges.size(); i++) {
for (int j = 0; j < birdChanges[i].size(); j++) {
totalOut += birdChanges[i][j].probability;
}
if (totalOut < 1) {
bird a;
a.colour = i;
a.probability = 1 - totalOut;
birdChanges[i].push_back(a);
}
totalOut = 0;
}
}
double fillingUp(double momentarilyProbability, long long int numberOfBerries){
int layernumber=0;
while (layer[numberOfBerries - (1+layernumber)] == 0) {
layernumber++;
if (numberOfBerries == layernumber) {
break;
}
}
layernumber = layer.size() - layernumber;
path direction;
int b;
if (layernumber != 0) {
b= birdChanges[nextBirds.top().from][nextBirds.top().to].colour;//Usikker
}
else {
b = fromBird;
}
while (layer[numberOfBerries - 1] == 0) {
//int a = birdChanges[nextBirds.top().from][nextBirds.top().to].colour;
if (layernumber != 0) {
momentarilyProbability *= birdChanges[nextBirds.top().from][nextBirds.top().to].probability;
//b = birdChanges[nextBirds.top().from][nextBirds.top().to].colour;
}
for (int i = 0; i < birdChanges[b].size(); i++) {
direction.from = b;
direction.to = i;
//cout << endl << "Stacking " << b << " " << birdChanges[b][i].colour;
nextBirds.push(direction);
layer[layernumber]++;
}
layernumber++;
b = birdChanges[nextBirds.top().from][nextBirds.top().to].colour;
}
//cout << "Returning" << endl;
return momentarilyProbability *= birdChanges[nextBirds.top().from][nextBirds.top().to].probability;;
}
//DFS
void depthFirstSearch(int fromBird, long long int numberOfBerries) {
//Stack for next birds (stack)
path a;
double momentarilyProbability = 1;//Momentarily probability (float)
momentarilyProbability=fillingUp(1, numberOfBerries);
//cout << "Back " << momentarilyProbability << endl;
//Previous probabilities (stack)
while (layer[0] != 0) {
//cout << "Entering" << endl;
while (layer[numberOfBerries - 1] != 0) {
savedAnswers[birdChanges[nextBirds.top().from][nextBirds.top().to].colour] += momentarilyProbability;
//cout << "Probability for " << birdChanges[nextBirds.top().from][nextBirds.top().to].colour << " is " << momentarilyProbability << endl;
momentarilyProbability = momentarilyProbability / birdChanges[nextBirds.top().from][nextBirds.top().to].probability;
nextBirds.pop();
layer[numberOfBerries - 1]--;
if (layer[numberOfBerries - 1] != 0) {
momentarilyProbability *= birdChanges[nextBirds.top().from][nextBirds.top().to].probability;
}
}
if (layer[0] != 0) {
int k = 1;
while (layer[layer.size() - k]==0&&k+1<=layer.size()) {
//cout << "start" << endl;
momentarilyProbability = momentarilyProbability / birdChanges[nextBirds.top().from][nextBirds.top().to].probability;
//cout << "Popping " << nextBirds.top().from << birdChanges[nextBirds.top().from][nextBirds.top().to].colour << endl;
nextBirds.pop();
//cout << "k " << k << endl;
layer[numberOfBerries - 1 - k]--;
k++;
//cout << "end" << endl;
}
}
if (layer[0] != 0) {
//cout << 1 << endl;
//cout << "Filling up from " << nextBirds.top().from << birdChanges[nextBirds.top().from][nextBirds.top().to].colour << endl;
momentarilyProbability = fillingUp(momentarilyProbability, numberOfBerries);
}
}
//Printing out
for (int i = 1; i < savedAnswers.size(); i++) {
cout << savedAnswers[i] << " ";
}
cout << endl;
}
int main() {
int numberOfColours;
int possibleColourchanges;
cin >> numberOfColours >> possibleColourchanges;
birdChanges.resize(numberOfColours+1);
int from, to;
float probability;
for (int i = 0; i < possibleColourchanges; i++) {
cin >> from >> to >> probability;
bird a;
a.colour = to;
a.probability = probability;
birdChanges[from].push_back(a);
}
selfLoop();
int numberOfQuestions;
cin >> numberOfQuestions;
long long int numberOfBerries;
for (int i = 0; i < numberOfQuestions; i++) {
cin >> fromBird >> numberOfBerries;
savedAnswers.assign(numberOfColours + 1, 0);
layer.resize(numberOfBerries, 0);
//DFS
depthFirstSearch(fromBird, numberOfBerries);
}
system("pause");
}
Fast explanation of how to do this with the concept of a Markov Chain:
Basic algorithm:
Input: starting configuration vector b of probabilities of
being in a vertex after 0 steps,
Matrix A that stores the probability weights,
in the scheme of an adjacency matrix
precision threshold epsilon
Output:
an ending configuration b_inf of probabilities after infinite steps
Pseudocode:
b_old = b
b_new = A*b
while(difference(b_old, b_new) > epsilon){
b_old = b_new
b_new = A*b_old
}
return b_new
In this algorithm, we essentially compute powers of the probability matrix and look for when they become stable.
b are the probabilities of being at each vertex after no steps were taken
(so, in your case, every entry being zero except for the start vertex, which is one)
A*b are those after one step was taken (this assumes entry (i, j) of A is the probability of moving from vertex j to vertex i; if you store the edges the other way round, use the transpose of A)
A^2 * b are those after two steps were taken, A^n * b after n steps.
When A^n * b is nearly the same as A^(n-1) * b, we assume that nothing big will happen to it any more, i.e. that it is basically the same as A^infinity * b
One can defeat this algorithm with contrived examples, such as an edge with a very small probability that leads into a subgraph from which there is no way back: after infinitely many steps one is in that subgraph with probability 1, yet successive iterates change so slowly that the loop may stop early. For real-world examples, though, it will work.
For the difference, the Euclidean distance should work well, but essentially any norm does; you could also go with the maximum or Manhattan norm.
Note that I present a pragmatic point of view; a mathematician would go into far more detail about which properties of A guarantee convergence, and how fast, for which values of epsilon.
You might want to use a good library for matrices for that, like Eigen.
EDIT:
Reading the comment of Jarod42, I realize that your number of steps is given. In that case, simply go with A^steps * b for the exact solution. Use a good library for fast computation of the matrix power.
I am trying to perform a Tuckerman rounding test in order to determine the correctly rounded-to-nearest result.
I created a program in C++ to compare two solutions to the square root of a number and perform a Tuckerman test on them. However, the C++ math library solution fails the Tuckerman test, so I'm wondering what could be wrong?
Here is my output:
Square root program started
Input value is 62a83003
===Tuckerman Test with MATLAB result===
Square root result from MATLAB = 5112b968
g*(g-ulp) = 62a83001
b = 62a83003
g*(g+ulp) = 62a83003
=====>Passes Tuckerman test
===Tuckerman Test with correct result===
Correct square root result = 5112b969
g*(g-ulp) = 62a83003
b = 62a83003
g*(g+ulp) = 62a83005
=====>Fails Tuckerman test
Here is my code (C++):
#include <iostream>
#include <cmath>
#include <fstream>
using namespace std;
union newfloat{
float f;
unsigned int i;
};
int main ()
{
// Declare new floating point numbers
newfloat input;
newfloat result, resultm1, resultp1;
newfloat correct_result, correct_resultm1, correct_resultp1;
newfloat resultm1_times_result, resultp1_times_result;
newfloat correct_resultm1_times_result, correct_resultp1_times_result;
// Print message at start of program
cout << "Square root program started" << endl;
input.i = 0x62A83003; // Input we are trying to find the square root of
cout << "Input value is " << hex << input.i << "\n" << endl; // Print input value
result.i = 0x5112B968; // Result from MATLAB
resultm1.i = result.i - 1; // value minus 1 ulp
resultp1.i = result.i + 1; // value plus 1 ulp
correct_result.f = sqrt(input.f); // Compute correct square root
correct_resultm1.i = correct_result.i - 1; // correct value minus 1 ulp
correct_resultp1.i = correct_result.i + 1; // correct value plus 1 ulp
resultm1_times_result.f = result.f * resultm1.f; // Compute g(g-ulp) for matlab result
resultp1_times_result.f = result.f * resultp1.f; // Compute g(g+ulp) for matlab result
correct_resultm1_times_result.f = correct_result.f * correct_resultm1.f; // Compute g*(g-ulp) for correct result
correct_resultp1_times_result.f = correct_result.f * correct_resultp1.f; // Compute g*(g+ulp) for correct result
// Print output from MATLAB algorithm and perform tuckerman test
cout << "===Tuckerman Test with MATLAB result===" << endl;
cout << "Square root result from MATLAB = " << result.i << endl;
cout << "g*(g-ulp) = " << hex << resultm1_times_result.i << endl;
cout << "b = " << hex << input.i << endl;
cout << "g*(g+ulp) = " << hex << resultp1_times_result.i << endl;
if ((resultm1_times_result.f < input.f) && (input.f <= resultp1_times_result.f))
cout << "=====>Passes Tuckerman test" << endl;
else
cout << "=====>Fails Tuckerman test" << endl;
cout << "\n" << endl;
// Print output from C++ sqrt math library and perform tuckerman test
cout << "===Tuckerman Test with correct result===" << endl;
cout << "Correct square root result = " << hex << correct_result.i << endl;
cout << "g*(g-ulp) = " << hex << correct_resultm1_times_result.i << endl;
cout << "b = " << hex << input.i << endl;
cout << "g*(g+ulp) = " << hex << correct_resultp1_times_result.i << endl;
if ((correct_resultm1_times_result.f < input.f) && (input.f <= correct_resultp1_times_result.f))
cout << "=====>Passes Tuckerman test" << endl;
else
cout << "=====>Fails Tuckerman test" << endl;
return 0;
}
The original publication that introduced Tuckerman rounding for the square root was:
Ramesh C. Agarwal, James W. Cooley, Fred G. Gustavson, James B. Shearer, Gordon Slishman, Bryant Tuckerman,
"New scalar and vector elementary functions for the IBM System/370", IBM J. Res. Develop., Vol. 30, No. 2, March 1986, pp. 126-144.
This paper specifically points out that the multiplications used to compute the products g*(g-ulp) and g*(g+ulp) are truncating, not rounding multiplications:
"However, these inequalities can be shown to be equivalent to
y- * y < x <= y * y+ ,
where * denotes System/360/370 multiplication (which truncates the result), so that the tests are easily carried out
without the need for extra precision. (Note the asymmetry: one <, one <=.) If the left inequality fails, y is too large; if the
right inequality fails, y is too small."
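Applied to the values from the question, the truncating form of the test could be sketched in C++ roughly as follows. This assumes float is IEEE-754 binary32, that float arithmetic is evaluated in single precision (as with SSE on x86-64), and that the compiler honors run-time rounding-mode changes; the helper names are made up for this illustration.

```cpp
#include <cfenv>
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as a float (assumes IEEE-754 binary32).
float floatFromBits(std::uint32_t bits) {
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Multiply with the rounding mode temporarily set to round-toward-zero.
// The volatile variables keep the compiler from folding the product at
// compile time or moving it outside the fesetround calls.
float multiplyTruncating(float a, float b) {
    volatile float va = a;
    volatile float vb = b;
    const int savedMode = std::fegetround();
    std::fesetround(FE_TOWARDZERO);
    volatile float product = va * vb;
    std::fesetround(savedMode);
    return product;
}

// Tuckerman test: trunc(g * (g - ulp)) < b <= trunc(g * (g + ulp)).
bool passesTuckerman(float b, float g) {
    const float below = std::nextafter(g, 0.0f);      // g - ulp
    const float above = std::nextafter(g, INFINITY);  // g + ulp
    return multiplyTruncating(g, below) < b &&
           b <= multiplyTruncating(g, above);
}
```

With b = floatFromBits(0x62A83003), passesTuckerman(b, std::sqrt(b)) is expected to hold, whereas the round-to-nearest products used in the question make the left inequality fail. Depending on the compiler, an option such as GCC's -frounding-math may be needed for the rounding-mode switch to be respected.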
The following C99 code shows how Tuckerman rounding is successfully utilized to deliver correctly rounded results in a single-precision square root function.
#include <stdio.h>
#include <stdlib.h>
#include <fenv.h>
#include <math.h>
#pragma STDC FENV_ACCESS ON
float mul_fp32_rz (float a, float b)
{
float r;
int orig_rnd = fegetround();
fesetround (FE_TOWARDZERO);
r = a * b;
fesetround (orig_rnd);
return r;
}
float my_sqrtf (float a)
{
float b, r, v, w, p, s;
int e, t, f;
if ((a <= 0.0f) || isinf (a) || isnan (a)) { /* standard C99 macros instead of the non-standard isinff/isnanf */
if (a < 0.0f) {
r = 0.0f / 0.0f;
} else {
r = a + a;
}
} else {
/* compute exponent adjustments */
b = frexpf (a, &e);
t = e - 2*512;
f = t / 2;
t = t - 2 * f;
f = f + 512;
/* map argument into the primary approximation interval [0.25,1) */
b = ldexpf (b, t);
/* initial approximation to reciprocal square root */
r = -6.10005470e+0f;
r = r * b + 2.28990124e+1f;
r = r * b - 3.48110069e+1f;
r = r * b + 2.76135244e+1f;
r = r * b - 1.24472151e+1f;
r = r * b + 3.84509158e+0f;
/* round rsqrt approximation to 11 bits */
r = rintf (r * 2048.0f);
r = r * (1.0f / 2048.0f);
/* Use A. Schoenhage's coupled iteration for the square root */
v = 0.5f * r;
w = b * r;
w = (w * -w + b) * v + w;
v = (r * -w + 1.0f) * v + v;
w = (w * -w + b) * v + w;
/* Tuckerman rounding: mul_rz (w, w-ulp) < b <= mul_rz (w, w+ulp) */
p = nextafterf (w, 0.0f);
s = nextafterf (w, 2.0f);
if (b <= mul_fp32_rz (w, p)) {
w = p;
} else if (b > mul_fp32_rz (w, s)) {
w = s;
}
/* map back from primary approximation interval by jamming exponent */
r = ldexpf (w, f);
}
return r;
}
int main (void)
{
volatile union {
float f;
unsigned int i;
} arg, res, ref;
arg.i = 0;
do {
res.f = my_sqrtf (arg.f);
ref.f = sqrtf (arg.f);
if (res.i != ref.i) {
printf ("!!!! error # arg=%08x: res=%08x ref=%08x\n",
arg.i, res.i, ref.i);
break;
}
arg.i++;
} while (arg.i);
return EXIT_SUCCESS;
}
I'm doing a C++ approximation of Pi using a random number generator. The output works exactly as expected on my AMD64 machine running Ubuntu; however, on my school machine the second algorithm I've implemented is broken, and I would love some insight as to why. The code is as follows:
#ifndef RANDOMNUMBER_H_
#define RANDOMNUMBER_H_
class RandomNumber {
public:
RandomNumber() {
x = time(NULL);
m = pow(2, 19); //some constant value
M = 65915 * 7915; //multiply of some simple numbers p and q
method = 1;
}
RandomNumber(int seed) {
x = ((seed > 0) ? seed : time(NULL));
m = pow(2, 19); //some constant value
method = 1; //method number
M = 6543 * 7915; //multiply of some simple numbers p and q
}
void setSeed(long int seed) {
x = seed; //set start value
}
void chooseMethod(int method) {
this->method = ((method > 0 && method <= 2) ? method : 1); //choose one of two method
}
long int linearCongruential() { //first generator, that uses linear congruential method
long int c = 0; // some constant
long int a = 69069; //some constant
x = (a * x + c) % m; //solution next value
return x;
}
long int BBS() { //algorithm Blum - Blum - Shub
x = (long int) (pow(x, 2)) % M;
return x;
}
double nextPoint() { //return random number in range (-1;1)
double point;
if (method == 1) //use first method
point = linearCongruential() / double(m);
else
point = BBS() / double(M);
return point;
}
private:
long int x; //current value
long int m; // some range for first method
long int M; //some range for second method
int method; //method number
};
#endif /* RANDOMNUMBER_H_ */
and test class:
#include <iostream>
#include <stdlib.h>
#include <math.h>
#include <iomanip>
#include "RandomNumber.h"
using namespace std;
int main(int argc, char* argv[]) {
cout.setf(ios::fixed);
cout.precision(6);
RandomNumber random;
random.setSeed(argc);
srand((unsigned) time(NULL));
cout << "---------------------------------" << endl;
cout << " Monte Carlo Pi Approximation" << endl;
cout << "---------------------------------" << endl;
cout << " Enter number of points: ";
long int k1;
cin >> k1;
cout << "Select generator number: ";
int method;
cin >> method;
random.chooseMethod(method);
cout << "---------------------------------" << endl;
long int k2 = 0;
double sumX = 0;
double sumY = 0;
for (long int i = 0; i < k1; i++) {
double x = pow(-1, int(random.nextPoint() * 10) % 2)
* random.nextPoint();
double y = pow(-1, int(random.nextPoint() * 10) % 2)
* random.nextPoint();
sumX += x;
sumY += y;
if ((pow(x, 2) + pow(y, 2)) <= 1)
k2++;
}
double pi = 4 * (double(k2) / k1);
cout << "M(X) = " << setw(10) << sumX / k1 << endl; //mathematical expectation of x
cout << "M(Y) = " << setw(10) << sumY / k1 << endl; //mathematical expectation of y
cout << endl << "Pi = " << pi << endl << endl; //approximate Pi
return 0;
}
The second method returns 4.000 consistently on my lab machine, yet returns a rather close approximation on my personal machine.
For one thing, the BBS generator as you're using it will always return 1.
Since your program takes no arguments, presumably its argc will be 1. You pass argc as the seed (why?), so the initial value of x is 1.
BBS() has the following logic:
x = (long int) (pow(x, 2)) % M;
Clearly, 1 squared modulo M gives 1, so x never changes.
When you run the simulation with such a generator, your program will always output 4.
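The effect can be reproduced with a stripped-down sketch of the question's loop that uses only the BBS step. It is an illustration rather than the original class: the squaring is done in integer arithmetic instead of pow, M is assumed to be below 2^32 so that x*x fits in 64 bits, and the names bbsStep and estimatePi are invented here.

```cpp
#include <cstdint>

// One Blum Blum Shub step: x_next = x^2 mod M.
// Assumes M < 2^32 so that x * x cannot overflow 64 bits.
std::uint64_t bbsStep(std::uint64_t x, std::uint64_t M) {
    return (x * x) % M;
}

// Monte Carlo estimate of Pi in the style of the question's main loop,
// drawing every value from the BBS sequence started at `seed`.
double estimatePi(std::uint64_t seed, std::uint64_t M, long points) {
    std::uint64_t x = seed;
    // Next value scaled into [0, 1).
    auto nextPoint = [&]() {
        x = bbsStep(x, M);
        return double(x) / double(M);
    };
    // One draw picks the sign, the next one the magnitude.
    auto signedPoint = [&]() {
        const int sign = (int(nextPoint() * 10) % 2 == 0) ? 1 : -1;
        return sign * nextPoint();
    };
    long inside = 0;
    for (long i = 0; i < points; ++i) {
        const double px = signedPoint();
        const double py = signedPoint();
        if (px * px + py * py <= 1.0) ++inside;
    }
    return 4.0 * (double(inside) / double(points));
}
```

With seed 1 and M = 65915 * 7915, every draw is 1/M, so every point lands next to the origin, inside the circle, and the estimate comes out as 4.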
P.S. Wikipedia has the following to say about the initial value x0 for Blum Blum Shub:
The seed x0 should be an integer that's co-prime to M (i.e. p and q are not factors of x0) and not 1 or 0.
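One way to act on that requirement is to reject unusable seeds before starting the generator. The following is a rough sketch; the function names are hypothetical, and M is assumed to be at least 1.

```cpp
#include <cstdint>

// Euclid's algorithm.
std::uint64_t greatestCommonDivisor(std::uint64_t a, std::uint64_t b) {
    while (b != 0) {
        const std::uint64_t remainder = a % b;
        a = b;
        b = remainder;
    }
    return a;
}

// A seed is usable if it is neither 0 nor 1 and shares no factor with M.
bool isUsableBbsSeed(std::uint64_t seed, std::uint64_t M) {
    return seed > 1 && greatestCommonDivisor(seed, M) == 1;
}

// Smallest usable seed that is >= candidate (assumes M >= 1).
std::uint64_t firstUsableSeedFrom(std::uint64_t candidate, std::uint64_t M) {
    while (!isUsableBbsSeed(candidate, M)) ++candidate;
    return candidate;
}
```

For the question's M = 65915 * 7915, the seed 1 passed via argc is rejected, and so is 5, because 5 divides M.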