I implemented simulated annealing in C++ to minimize (x-2)^2+(y-1)^2 over some range.
The output varies from run to run by more than I would expect from this kind of heuristic. The search seems to converge but never quite closes in on the minimum.
My code:
#include <bits/stdc++.h>
using namespace std;
double func(double x, double y)
{
return (pow(x-2, 2)+pow(y-1, 2));
}
double accept(double z, double minim, double T,double d)
{
double p = -(z - minim) / (d * T);
return pow(exp(1), p);
}
double fRand(double fMin, double fMax)
{
double f = (double)rand() / RAND_MAX;
return fMin + f * (fMax - fMin);
}
int main()
{
srand (time(NULL));
double x = fRand(-30,30);
double y = fRand(-30,30);
double xm = x, ym=y;
double tI = 100000;
double tF = 0.000001;
double a = 0.99;
double d=(1.6*(pow(10,-23)));
double T = tI;
double minim = func(x, y);
double z;
double counter=0;
while (T>tF) {
int i=1;
while(i<=30) {
x=x+fRand(-0.5,0.5);
y=y+fRand(-0.5,0.5);
z=func(x,y);
if (z<minim || (accept(z,minim,T,d)>(fRand(0,1)))) {
minim=z;
xm=x;
ym=y;
}
i=i+1;
}
counter=counter+1;
T=T*a;
}
cout<<"min: "<<minim<<" x: "<<xm<<" y: "<<ym<<endl;
return 0;
}
How can I get it to reach the solution?
I think a couple of things are wrong in your implementation of the simulated annealing algorithm.
At every iteration you should look at a neighbour of the current solution, evaluate z = f(neighbour), and accept it if z < minimum. If z > minimum you can still accept the new point, but only with a probability given by an acceptance function.
The problem is that in your accept function the parameter d is far too small (1.6e-23): for any worse point the exponent is a huge negative number, so the function always returns 0.0 and the acceptance condition is never triggered. Try something like 1e-5; it doesn't have to be physically correct, it only has to make the acceptance probability shrink as the "temperature" is lowered.
After updating the temperature in the outer loop, you should set x=xm and y=ym before running the inner loop. Otherwise, instead of searching the neighbours of the current solution, you will basically wander around at random (you also aren't checking any bounds).
Doing so, I usually get some output like this:
min: 8.25518e-05 x: 2.0082 y: 0.996092
Hope it helped.
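For future readers, here is a minimal sketch of one common variant, not the exact code I ran above. It assumes the plain Metropolis rule exp(-(z - current) / T) with no constant d (which amounts to d = 1 and a smaller starting temperature), keeps the current point and the best point separately, and uses a fixed seed so runs are repeatable. The names anneal, objective and Result are made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <random>

struct Result { double x, y, value; };

// Function from the question: minimum 0 at (2, 1).
double objective(double x, double y) {
    return (x - 2.0) * (x - 2.0) + (y - 1.0) * (y - 1.0);
}

// Simulated annealing sketch. The walk always continues from the *current*
// accepted point; the best point seen so far is tracked separately.
Result anneal(unsigned seed, double tStart = 100.0, double tEnd = 1e-6,
              double cooling = 0.99, int stepsPerTemp = 30) {
    assert(tStart > tEnd && cooling > 0.0 && cooling < 1.0);
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> start(-30.0, 30.0);
    std::uniform_real_distribution<double> step(-0.5, 0.5);
    std::uniform_real_distribution<double> unit(0.0, 1.0);

    double cx = start(gen), cy = start(gen);
    double cz = objective(cx, cy);
    Result best{cx, cy, cz};

    for (double T = tStart; T > tEnd; T *= cooling) {
        for (int i = 0; i < stepsPerTemp; ++i) {
            // Propose a neighbour of the current point.
            const double nx = cx + step(gen);
            const double ny = cy + step(gen);
            const double nz = objective(nx, ny);
            // Always accept improvements; accept worse points with
            // probability exp(-(nz - cz) / T), which shrinks as T drops.
            if (nz < cz || std::exp(-(nz - cz) / T) > unit(gen)) {
                cx = nx; cy = ny; cz = nz;
                if (cz < best.value) best = {cx, cy, cz};
            }
        }
    }
    return best;
}
```

Calling anneal(42) and printing the x, y and value fields should give a point close to (2, 1). How close depends on the step size and the cooling schedule, so treat the defaults as starting values to tune.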
I want to calculate the p-value of a t-statistic for a two-tailed test at the 5% level of significance, and I want to do this with the standard library. Is this possible using the student_t_distribution from the <random> header?
My code currently is as follows:
#include <iostream>
int main(){
double t_stat = 0.0267; // t-statistic
double alpha_los = 0.05; // level of significance
double dof = 30; // degrees of freedom
// calculate P > |t| and compare with alpha_los
return 0;
}
Thank you
The <random> header just provides you with the ability to get random numbers from different distributions.
If you are able to use boost you can do the following:
#include <boost/math/distributions/students_t.hpp>
int main() {
double t_stat = 0.0267; // t-statistic
double alpha_los = 0.05; // level of significance
double dof = 30; // degrees of freedom
boost::math::students_t dist(dof);
double P_x_greater_t = 1.0 - boost::math::cdf(dist, t_stat);
double P_x_smaller_negative_t = boost::math::cdf(dist, -t_stat);
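if(P_x_greater_t + P_x_smaller_negative_t < alpha_los) {
// two-tailed p-value is below the significance level: reject the null hypothesis
} else {
// two-tailed p-value is at or above the significance level: fail to reject
}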
}
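If boost is not an option, one possible standard-library-only route is to integrate the t density numerically. The sketch below is an assumption on my part rather than anything <random> provides: it builds the density from std::lgamma and applies composite Simpson's rule on [0, |t|]. The names studentTPdf and twoTailedPValue are made up for this example.

```cpp
#include <cassert>
#include <cmath>

// Density of Student's t distribution with `dof` degrees of freedom,
// evaluated in log space to avoid overflow in the gamma functions.
double studentTPdf(double t, double dof) {
    const double pi = std::acos(-1.0);
    const double logNorm = std::lgamma((dof + 1.0) / 2.0)
                         - std::lgamma(dof / 2.0)
                         - 0.5 * std::log(dof * pi);
    return std::exp(logNorm - (dof + 1.0) / 2.0 * std::log1p(t * t / dof));
}

// Two-tailed p-value P(|T| > |t|) = 1 - 2 * integral of the density over
// [0, |t|], with the integral approximated by composite Simpson's rule.
double twoTailedPValue(double tStat, double dof, int intervals = 2000) {
    assert(dof > 0.0 && intervals > 0 && intervals % 2 == 0);
    const double upper = std::fabs(tStat);
    const double h = upper / intervals;
    double sum = studentTPdf(0.0, dof) + studentTPdf(upper, dof);
    for (int i = 1; i < intervals; ++i) {
        sum += (i % 2 == 1 ? 4.0 : 2.0) * studentTPdf(i * h, dof);
    }
    const double central = sum * h / 3.0;  // approximates P(0 < T < |t|)
    return 1.0 - 2.0 * central;
}
```

With the values from the question you would compare twoTailedPValue(t_stat, dof) against alpha_los. For very large |t| the fixed number of intervals becomes coarse, so a library CDF such as the boost one is still the safer choice when it is available.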
So I have this Rcpp function in a .cpp file. You'll see that it is calling other custom functions that I don't show for simplicity, but those don't show any problem whatsoever.
// [[Rcpp::export]]
int sim_probability(float present_wealth , int time_left, int n, float mu, float sigma, float r, float gamma, float gu, float gl){
int i;
int count = 0;
float final_wealth;
NumericVector y(time_left);
NumericVector rw(time_left);
for(i=0;i<n;i++){
rw = random_walk(time_left, 0);
y = Y(rw, mu, sigma, r, gamma);
final_wealth = y[time_left-1] - y[0] + present_wealth;
if(final_wealth <= gu && final_wealth >= gl){
count = count + 1;
}
}
return count;
}
Then I can call this function from an .R script seamlessly:
library(Rcpp)
sourceCpp("functions.cpp")
sim_probability(present_wealth = 100, time_left = 10, n = 1e3, mu = 0.05, sigma = 0.20, r = 0, gamma = 2, gu = 200, gl = 90)
But if I call it inside a for loop, no matter how small the loop is, R crashes without showing any error. The chunk below makes R crash.
for(l in 1:1){
sim_probability(present_wealth = 100, time_left = 10, n = 1e3, mu = 0.05, sigma = 0.20, r = 0, gamma = 2, gu = 200, gl = 90)
}
I've also tried executing it manually (Ctrl + Enter) many times as fast as I could, and if I'm fast enough it also crashes.
I have tried smaller and bigger loops, both outside and inside the function. It also crashes if it's called from another Rcpp function. I know I shouldn't call Rcpp functions in an R loop. Eventually I intend to call it from another Rcpp function (to generate a matrix of data), but it crashes all the same.
I have followed other cases that I found by googling and tried a few things, such as changing to [] brackets for the array indices (this question) and playing with the gc() garbage collector (as suggested here).
I suspected that something was wrong with the NumericVector definitions, but as far as I can tell they are declared properly.
It has been fairly pointed out in the comments that this is not a reproducible example. I'll add the missing functions Y() and random_walk() below:
// [[Rcpp::export]]
NumericVector Y(NumericVector path, float mu, float sigma, float r, float gamma){
int time_step, n, i;
time_step = 1;
float theta, y0, prev, inc_W;
theta = (mu - r) / sigma;
y0 = theta / (sigma*gamma);
n = path.size();
NumericVector output(n);
for(i=0;i<n;i++){
if(i == 0){
prev = y0;
inc_W = path[0];
}else{
prev = output[i-1];
inc_W = path[i] - path[i-1];
}
output[i] = prev + (theta / gamma) * (theta * time_step + inc_W);
}
return output;
}
// [[Rcpp::export]]
NumericVector random_walk(int length, float starting_point){
if(length == 1){return starting_point;}
NumericVector output(length);
output[1] = starting_point;
int i;
for(i=0; i<length; i++){output[i+1] = output[i] + R::rnorm(0,1);}
return output;
}
Edit1: Added more code so it is reproducible.
Edit2: I was assigning local variables when calling the functions. That was dumb on my part, but harmless. I've fixed that, and the same error still persists.
Edit3: As it's been pointed out by Dirk in the comments, I was doing a pointless exercise redefining the rnorm(). Now it's removed and fixed.
The question was solved in the comments by @coatless. I am posting the answer here for future readers. The problem is that the random_walk() function wasn't set up correctly.
The loop inside the function let the index i+1 run past the end of the vector output (and the starting point was written to output[1] instead of output[0]). Writing outside a vector's bounds is undefined behaviour: it may appear to work when the function is called once, but it corrupts memory and blows up when the function is called many times in quick succession.
So, to avoid this error, the function should have been defined as
// [[Rcpp::export]]
NumericVector random_walk(int length, float starting_point){
if(length == 0){return starting_point;}
NumericVector output(length);
output[0] = starting_point;
int i;
for(i=0; i<length-1; i++){output[i+1] = output[i] + R::rnorm(0,1);}
return output;
}
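To make the index ranges easier to check outside of R, here is a rough plain C++ analogue of the corrected function. It is only a sketch: std::vector stands in for NumericVector and std::normal_distribution for R::rnorm, and the name randomWalk is made up. Unlike the Rcpp version above, it returns an empty vector for length 0. The .at() calls throw std::out_of_range on a bad index instead of silently corrupting memory.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Random walk of `length` points starting at `start`, with N(0, 1) steps.
std::vector<double> randomWalk(int length, double start, std::mt19937& gen) {
    assert(length >= 0);
    std::vector<double> out(static_cast<std::size_t>(length));
    if (out.empty()) return out;
    std::normal_distribution<double> stepDist(0.0, 1.0);
    out.at(0) = start;  // valid indices are 0 .. length-1
    // Stop one short of the end because the body writes to index i + 1.
    for (std::size_t i = 0; i + 1 < out.size(); ++i) {
        out.at(i + 1) = out.at(i) + stepDist(gen);
    }
    return out;
}
```

Running the original loop bounds (i < length with a write to i + 1) through .at() would throw on the last iteration, which is a quick way to catch this kind of off-by-one before it reaches R.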
I am reading in a temperature value every second or every minute (the exact rate is not crucial). I want to monitor this temperature so that I can perform an action if it starts rising faster than a certain threshold.
If the temperature rises above 30 degrees (at any rate) I increase the fan speed.
I think I must store the old temperature and then, each time through the loop, set the old temperature to the current temperature of the engine. But I am not sure whether I need to use arrays for the engine temperature or not.
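You can of course store just one old sample and then check the difference, like this:
const int threshold = 5; // assumed maximum allowed rise per sample; tune for your system
bool isHot(int sample) {
    static int oldSample = sample;       // initialised on the first call only
    const int rise = sample - oldSample;
    oldSample = sample;                  // remember this reading for the next call
    return (sample > 30) || (rise > threshold);
}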
It's OK from a C point of view, but very bad from a metrology point of view. You should consider some conditioning of your signal (in this case the temperature) to smooth out any spikes.
Of course you can add signal conditioning later on. For an easy example, look at the Simple Moving Average: https://en.wikipedia.org/wiki/Moving_average
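A simple moving average could look roughly like the sketch below. The class name MovingAverage and the window size are my own choices for illustration; the idea is to feed each raw reading through update() and run the threshold checks on the returned average instead of the raw sample.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

class MovingAverage {
public:
    explicit MovingAverage(std::size_t window) : m_window(window) {
        assert(window > 0);
    }

    // Adds a sample and returns the mean of the most recent samples
    // (at most `window` of them; fewer until the window has filled up).
    double update(double sample) {
        m_samples.push_back(sample);
        m_sum += sample;
        if (m_samples.size() > m_window) {
            m_sum -= m_samples.front();  // drop the oldest sample
            m_samples.pop_front();
        }
        return m_sum / static_cast<double>(m_samples.size());
    }

private:
    std::size_t m_window;
    std::deque<double> m_samples;
    double m_sum = 0.0;
};
```

A longer window smooths spikes more strongly but also delays the reaction to a real temperature rise, so the window length is a trade-off you have to pick for your sensor.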
If you want to control the fan speed the "right way", you should consider learning a bit about PID controllers: https://en.wikipedia.org/wiki/PID_controller
Simple discrete PID:
PidController.h:
class PidController
{
public:
PidController();
double sim(double y);
void UpdateParams(double kp, double ki, double kd);
void setSP(double setPoint) { m_setPoint = setPoint; } //set current value of r(t)
private:
double m_setPoint; //current value of r(t)
double m_kp;
double m_ki;
double m_kd;
double m_outPrev;
double m_errPrev[2];
};
PidController.cpp
#include "PidController.h"
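// The header declares no base class, so there is no base constructor to call;
// all members are zero-initialised until UpdateParams() and setSP() are used.
PidController::PidController()
: m_setPoint(0), m_kp(0), m_ki(0), m_kd(0), m_outPrev(0)
{
m_errPrev[0] = 0;
m_errPrev[1] = 0;
}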
void PidController::UpdateParams(double kp, double ki, double kd)
{
m_kp = kp;
m_ki = ki;
m_kd = kd;
}
//calculates PID output
//y - sample of y(t)
//returns sample of u(t)
double PidController::sim(double y)
{
double out; //u(t) sample
double e = m_setPoint - y; //error
out = m_outPrev + m_kp * (e - m_errPrev[0] + m_kd * (e - 2 * m_errPrev[0] + m_errPrev[1]) + m_ki * e);
m_outPrev = out; //store previous output
//store previous errors
m_errPrev[1] = m_errPrev[0];
m_errPrev[0] = e;
return out;
}
I'm trying to implement a very simple 1-dimensional gradient descent algorithm. The code I have does not work at all. Basically depending on my alpha value, the end parameters will either be wildly huge (like ~70 digits), or basically zero (~ 0.000). I feel like a gradient descent should not be nearly this sensitive in alpha (I'm generating small data in [0.0,1.0], but I think the gradient itself should account for the scale of the data, no?).
Here's the code:
#include <cstdio>
#include <cstdlib>
#include <ctime>
#include <vector>
using namespace std;
double a, b;
double theta0 = 0.0, theta1 = 0.0;
double myrand() {
return double(rand()) / RAND_MAX;
}
double f(double x) {
double y = a * x + b;
y *= 0.1 * (myrand() - 0.5); // +/- 5% noise
return y;
}
double h(double x) {
return theta1 * x + theta0;
}
int main() {
srand(time(NULL));
a = myrand();
b = myrand();
printf("set parameters: a = %lf, b = %lf\n", a, b);
int N = 100;
vector<double> xs(N);
vector<double> ys(N);
for (int i = 0; i < N; ++i) {
xs[i] = myrand();
ys[i] = f(xs[i]);
}
double sensitivity = 0.008;
double d0, d1;
for (int n = 0; n < 100; ++n) {
d0 = d1 = 0.0;
for (int i = 0; i < N; ++i) {
d0 += h(xs[i]) - ys[i];
d1 += (h(xs[i]) - ys[i]) * xs[i];
}
theta0 -= sensitivity * d0;
theta1 -= sensitivity * d1;
printf("theta0: %lf, theta1: %lf\n", theta0, theta1);
}
return 0;
}
Changing the value of alpha can make the algorithm diverge, so that may be one of the causes of what you are seeing. You can check by computing the error at each iteration and seeing whether it is increasing or decreasing.
In addition, it is often recommended to initialize the values of theta randomly instead of setting them to zero.
Apart from that, you should divide by N when you update the value of theta, as follows:
theta0 -= sensitivity * d0/N;
theta1 -= sensitivity * d1/N;
I had a quick look at your implementation and it looks fine to me.
"The code I have does not work at all."
I wouldn't say that. It seems to behave correctly for small enough values of sensitivity, which is a value that you just have to "guess", and that is how the gradient descent is supposed to work.
"I feel like a gradient descent should not be nearly this sensitive in alpha"
If you struggle to visualize that, remember that you are using gradient descent to find the minimum of the cost function of linear regression, which is a quadratic function. If you plot the cost function you will see why the learning rate is so sensitive in these cases: intuitively, if the parabola is narrow, the algorithm will converge more quickly, which is good, but then the learning rate is more "sensitive" and the algorithm can easily diverge if you are not careful.
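To see the sensitivity for yourself, a small experiment along these lines may help. It is only a sketch under my own assumptions: the data are noise-free samples of y = a*x + b on an even grid in [0, 1] rather than the output of f from the question, the gradient is divided by N, and the names Fit, cost, gradientDescent and makeLine are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Fit { double theta0, theta1, cost; };

// Cost J = (1 / 2N) * sum (h(x_i) - y_i)^2 for h(x) = theta1 * x + theta0.
double cost(const std::vector<double>& xs, const std::vector<double>& ys,
            double theta0, double theta1) {
    double sum = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        const double r = theta1 * xs[i] + theta0 - ys[i];
        sum += r * r;
    }
    return sum / (2.0 * static_cast<double>(xs.size()));
}

// Batch gradient descent starting from theta = (0, 0), using the mean gradient.
Fit gradientDescent(const std::vector<double>& xs, const std::vector<double>& ys,
                    double rate, int iterations) {
    assert(!xs.empty() && xs.size() == ys.size());
    const double n = static_cast<double>(xs.size());
    double theta0 = 0.0, theta1 = 0.0;
    for (int it = 0; it < iterations; ++it) {
        double d0 = 0.0, d1 = 0.0;
        for (std::size_t i = 0; i < xs.size(); ++i) {
            const double r = theta1 * xs[i] + theta0 - ys[i];
            d0 += r;
            d1 += r * xs[i];
        }
        theta0 -= rate * d0 / n;  // dividing by N keeps the step size
        theta1 -= rate * d1 / n;  // independent of the number of samples
    }
    return {theta0, theta1, cost(xs, ys, theta0, theta1)};
}

// Hypothetical test data: n noise-free points of y = a*x + b on [0, 1].
void makeLine(double a, double b, int n,
              std::vector<double>& xs, std::vector<double>& ys) {
    assert(n >= 2);
    xs.clear();
    ys.clear();
    for (int i = 0; i < n; ++i) {
        const double x = static_cast<double>(i) / (n - 1);
        xs.push_back(x);
        ys.push_back(a * x + b);
    }
}
```

On this data a rate of 0.5 recovers a and b after a couple of thousand iterations, while a rate of 2.0 makes the cost grow with every step. Separately, note that f in the question multiplies y by a factor in [-0.05, 0.05] instead of by a factor close to 1, so the targets it generates are all near zero; that may be why the fitted parameters come out as basically zero when the run does converge.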
The problem to solve is finding the floating status of a floating body, given its weight and its center of gravity.
The function I use calculates the displaced volume and the center of buoyancy of the body given sinkage, heel and trim.
Sinkage is a length, and heel and trim are angles limited to values from -90 to 90.
The floating status is found when the displaced volume is equal to the weight and the center of gravity is on a vertical line with the center of buoyancy.
I have this implemented as a non-linear Newton-Raphson root finding problem with 3 variables (sinkage, trim, heel) and 3 equations.
This method works, but needs good initial guesses. So I am hoping to find either a better approach for this, or a good method to find the initial values.
Below is the code for the newton and jacobian functions used for the Newton-Raphson iteration. The function volume takes the parameters sinkage, heel and trim, and returns the volume and the coordinates of the center of buoyancy.
I also included the absmax and GSolve2 functions; I believe these are taken from Numerical Recipes.
void jacobian(float x[], float weight, float vcg, float tcg, float lcg, float jac[][3], float f0[]) {
float h = 0.0001f;
float temp;
float j_volume, j_vcb, j_lcb, j_tcb;
float f1[3];
volume(x[0], x[1], x[2], j_volume, j_lcb, j_vcb, j_tcb);
f0[0] = j_volume-weight;
f0[1] = j_tcb-tcg;
f0[2] = j_lcb-lcg;
for (int i=0;i<3;i++) {
temp = x[i];
x[i] = temp + h;
volume(x[0], x[1], x[2], j_volume, j_lcb, j_vcb, j_tcb);
f1[0] = j_volume-weight;
f1[1] = j_tcb-tcg;
f1[2] = j_lcb-lcg;
x[i] = temp;
jac[0][i] = (f1[0]-f0[0])/h;
jac[1][i] = (f1[1]-f0[1])/h;
jac[2][i] = (f1[2]-f0[2])/h;
}
}
void newton(float weight, float vcg, float tcg, float lcg, float &sinkage, float &heel, float &trim) {
float x[3] = {10,1,1};
float accuracy = 0.000001f;
int ntryes = 30;
int i = 0;
float jac[3][3];
float max;
float f0[3];
float gauss_f0[3];
while (i < ntryes) {
jacobian(x, weight, vcg, tcg, lcg, jac, f0);
if (sqrt((f0[0]*f0[0]+f0[1]*f0[1]+f0[2]*f0[2])/2) < accuracy) {
break;
}
gauss_f0[0] = -f0[0];
gauss_f0[1] = -f0[1];
gauss_f0[2] = -f0[2];
GSolve2(jac, 3, gauss_f0);
x[0] = x[0]+gauss_f0[0];
x[1] = x[1]+gauss_f0[1];
x[2] = x[2]+gauss_f0[2];
// absmax(x) - Return absolute max value from an array
max = absmax(x);
if (max < 1) max = 1;
if (sqrt((gauss_f0[0]*gauss_f0[0]+gauss_f0[1]*gauss_f0[1]+gauss_f0[2]*gauss_f0[2])) < accuracy*max) {
// the step is below tolerance; x already holds the updated solution
break;
}
i++;
}
sinkage = x[0];
heel = x[1];
trim = x[2];
}
int GSolve2(float a[][3],int n,float b[]) {
float x,sum,max,temp;
int i,j,k,p,m,pos;
int nn = n-1;
for (k=0;k<=n-1;k++)
{
/* pivot*/
max=fabs(a[k][k]);
pos=k;
for (p=k;p<n;p++){
if (max < fabs(a[p][k])){
max=fabs(a[p][k]);
pos=p;
}
}
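/* the pivot candidate is the element in row pos of column k */
if (ABS(a[pos][k]) < EPS) {
writeLog("Matrix is singular");
break;
}
if (pos != k) {
for(m=k;m<n;m++){
temp=a[pos][m];
a[pos][m]=a[k][m];
a[k][m]=temp;
}
/* the right-hand side must be swapped together with the rows */
temp=b[pos];
b[pos]=b[k];
b[k]=temp;
}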
/* convert to upper triangular form */
if ( fabs(a[k][k])>=1.e-6)
{
for (i=k+1;i<n;i++)
{
x = a[i][k]/a[k][k];
for (j=k+1;j<n;j++) a[i][j] = a[i][j] -a[k][j]*x;
b[i] = b[i] - b[k]*x;
}
}
else
{
writeLog("zero pivot found in line:%d",k);
return 0;
}
}
/* back substitution */
b[nn] = b[nn] / a[nn][nn];
for (i=n-2;i>=0;i--)
{
sum = b[i];
for (j=i+1;j<n;j++)
sum = sum - a[i][j]*b[j];
b[i] = sum/a[i][i];
}
return 0;
}
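float absmax(float x[]) {
    // x decays to a pointer here, so sizeof(x) would give the pointer size,
    // not the element count; the solver always passes 3 unknowns.
    const int n = 3;
    float max = fabs(x[0]);
    for (int i = 1; i < n; i++) {
        if (max < fabs(x[i])) {
            max = fabs(x[i]);
        }
    }
    return max;
}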
Have you considered some stochastic search methods to find the initial value and then fine-tuning with Newton Raphson? One possibility is evolutionary computation, you can use the Inspyred package. For a physical problem similar in many ways to the one you describe, look at this example: http://inspyred.github.com/tutorial.html#lunar-explorer
What about using a damped version of Newton's method? You could quite easily modify your implementation to make it. Think about Newton's method as finding a direction
d_k = f(x_k) / f'(x_k)
and updating the variable
x_{k+1} = x_k - L_k d_k
In the usual Newton's method, L_k is always 1, but this might create overshoots or undershoots. So, let your method choose L_k. Suppose that your method usually overshoots. A possible strategy consists in taking the largest L_k in the set {1, 1/2, 1/4, 1/8, ..., L_min} such that the condition
|f(x_{k+1})| <= (1 - L_k/2) |f(x_k)|
is satisfied (or L_min if none of the values satisfies this criterion).
With the same criterion, another possible strategy is to start with L_0 = 1 and, if the criterion is not met, try L_0/2 and so on until it works (or until L_0 = L_min). Then for L_1, start with min(1, 2 L_0) and do the same. Then start with L_2 = min(1, 2 L_1), and so on.
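As a rough illustration of the first strategy (start at L = 1 and halve until the condition holds), here is a one-dimensional sketch. It is not your three-variable solver: the function name dampedNewton, the default L_min of 1/1024 and the tolerance are assumptions, and it assumes the derivative is never zero at an iterate. For your system, f becomes the residual vector, |.| a vector norm, and d the solution returned by GSolve2.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Damped Newton iteration in one dimension.
// Each step takes x <- x - L * d with d = f(x) / f'(x), where L is the
// largest value in {1, 1/2, 1/4, ..., lMin} satisfying
// |f(x - L d)| <= (1 - L/2) |f(x)| (or lMin if none does).
double dampedNewton(const std::function<double(double)>& f,
                    const std::function<double(double)>& df,
                    double x, double lMin = 1.0 / 1024.0,
                    double tol = 1e-12, int maxIter = 100) {
    assert(lMin > 0.0 && lMin <= 1.0);
    for (int k = 0; k < maxIter && std::fabs(f(x)) > tol; ++k) {
        const double d = f(x) / df(x);  // full Newton step
        double L = 1.0;
        while (L > lMin &&
               std::fabs(f(x - L * d)) > (1.0 - L / 2.0) * std::fabs(f(x))) {
            L /= 2.0;  // step too long: halve it and test again
        }
        x -= L * d;
    }
    return x;
}
```

A quick check is f(x) = atan(x) with f'(x) = 1 / (1 + x^2) and a starting value of 3: the undamped step jumps from 3 to about -9.5 and keeps moving away from the root, while the damped version shortens the first step and then converges to the root at 0.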
By the way: are you sure that your problem has a unique solution? I guess that the answer to this question depends on the shape of your object. If you have a rugby ball, there's one angle that you cannot fix. So if your shape is close to such an object, I would not be surprised that the problem is difficult to solve for that angle.