Batch gradient descent algorithm does not converge

I'm trying to implement the batch gradient descent algorithm for my machine learning homework. I have a training set whose x values are around 10^3 and whose y values are around 10^6. I'm trying to find the values of [theta0, theta1] that make y = theta0 + theta1 * x fit the data. I set the learning rate to 0.0001 and the maximum number of iterations to 10. Here's my code in Qt.
QVector<double> gradient_descent_batch(QVector<double> x, QVector<double>y)
{
QVector<double> theta(0);
theta.resize(2);
int size = x.size();
theta[1] = 0.1;
theta[0] = 0.1;
for (int j=0;j<MAX_ITERATION;j++)
{
double dJ0 = 0.0;
double dJ1 = 0.0;
for (int i=0;i<size;i++)
{
dJ0 += (theta[0] + theta[1] * x[i] - y[i]);
dJ1 += (theta[0] + theta[1] * x[i] - y[i]) * x[i];
}
double theta0 = theta[0];
double theta1 = theta[1];
theta[0] = theta0 - LRATE * dJ0;
theta[1] = theta1 - LRATE * dJ1;
if (qAbs(theta0 - theta[0]) < THRESHOLD && qAbs(theta1 - theta[1]) < THRESHOLD)
return theta;
}
return theta;
}
I print the value of theta on every iteration, and here's the result.
QVector(921495, 2.29367e+09)
QVector(-8.14503e+12, -1.99708e+16)
QVector(7.09179e+19, 1.73884e+23)
QVector(-6.17475e+26, -1.51399e+30)
QVector(5.3763e+33, 1.31821e+37)
QVector(-4.68109e+40, -1.14775e+44)
QVector(4.07577e+47, 9.99338e+50)
QVector(-3.54873e+54, -8.70114e+57)
QVector(3.08985e+61, 7.57599e+64)
QVector(-2.6903e+68, -6.59634e+71)
It seems that theta will never converge.
I followed the solution here and set the learning rate to 0.00000000000001 and the maximum number of iterations to 20, but it still does not seem to converge. Here's the result.
QVector(0.100092, 0.329367)
QVector(0.100184, 0.558535)
QVector(0.100276, 0.787503)
QVector(0.100368, 1.01627)
QVector(0.10046, 1.24484)
QVector(0.100552, 1.47321)
QVector(0.100643, 1.70138)
QVector(0.100735, 1.92936)
QVector(0.100826, 2.15713)
QVector(0.100918, 2.38471)
QVector(0.101009, 2.61209)
QVector(0.1011, 2.83927)
QVector(0.101192, 3.06625)
QVector(0.101283, 3.29303)
QVector(0.101374, 3.51962)
QVector(0.101465, 3.74601)
QVector(0.101556, 3.9722)
QVector(0.101646, 4.1982)
QVector(0.101737, 4.424)
QVector(0.101828, 4.6496)
What's wrong?

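First, your algorithm looks fine, except that you should divide the summed gradient by size so that each step uses the mean gradient:
theta[0] = theta0 - LRATE * dJ0 / size;
theta[1] = theta1 - LRATE * dJ1 / size;
I would also suggest calculating the cost function and monitoring it. The usual squared-error cost for this model is
J(theta0, theta1) = 1 / (2 * size) * sum over i of (theta0 + theta1 * x[i] - y[i])^2
Your cost should decrease on every iteration. If it bounces back and forth or grows, your learning rate is too large. I would suggest using 0.01 and 400 iterations. Note that with x around 10^3 a rate of 0.01 is only stable if you scale x first (for example, subtract the mean and divide by the standard deviation); on the raw data the rate has to be on the order of 10^-6 or smaller.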
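To make the scaling and cost-monitoring advice concrete, here is a minimal sketch in plain C++ (std::vector instead of QVector). The names FitResult and fitLineScaled are hypothetical, and the rate of 0.1 used in the usage note is my own choice for standardized data, not a value from the question. It assumes x and y have the same length and that neither is constant.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct FitResult {
    double theta0;                    // intercept, in the original units
    double theta1;                    // slope, in the original units
    std::vector<double> costHistory;  // cost before each update, in scaled units
};

// Hypothetical helper: batch gradient descent on standardized x and y.
// Assumes x.size() == y.size() and that neither vector is constant.
FitResult fitLineScaled(const std::vector<double>& x, const std::vector<double>& y,
                        double rate, int iterations) {
    const double m = static_cast<double>(x.size());

    // Mean and standard deviation of both variables.
    double meanX = 0.0, meanY = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        meanX += x[i] / m;
        meanY += y[i] / m;
    }
    double varX = 0.0, varY = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        varX += (x[i] - meanX) * (x[i] - meanX) / m;
        varY += (y[i] - meanY) * (y[i] - meanY) / m;
    }
    const double sdX = std::sqrt(varX);
    const double sdY = std::sqrt(varY);

    FitResult result{0.0, 0.0, {}};
    double t0 = 0.0, t1 = 0.0;  // parameters in the scaled space
    for (int it = 0; it < iterations; ++it) {
        double g0 = 0.0, g1 = 0.0, cost = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i) {
            const double xs = (x[i] - meanX) / sdX;
            const double ys = (y[i] - meanY) / sdY;
            const double err = t0 + t1 * xs - ys;
            g0 += err;
            g1 += err * xs;
            cost += err * err;
        }
        result.costHistory.push_back(cost / (2.0 * m));
        t0 -= rate * g0 / m;  // mean gradient, not the sum
        t1 -= rate * g1 / m;
    }

    // Map the scaled parameters back to the original units.
    result.theta1 = t1 * sdY / sdX;
    result.theta0 = meanY + t0 * sdY - result.theta1 * meanX;
    return result;
}
```

Calling fitLineScaled(x, y, 0.1, 400) and printing costHistory lets you check that the cost really does fall on every iteration before trusting theta0 and theta1.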

Related

Find a new angle of the ball when it bounces (using coordinates)

I have a game where a ball is bouncing off walls. It's on a coordinate plane. I want there to be some small amount of randomness when it bounces to keep the game more interesting. How would I do this while keeping the ball at a constant speed the whole time? Right now my code means it only bounces at right angles.
The top left corner of the window is 0,0 and the bottom right is winW,winH (set at 800,800 right now).
ball.cpp snippet
pos.x = start.x;
pos.y = start.y;
speed.x = .4f; // the f suffix makes this a float literal; the speed is applied per frame.
speed.y = .4f;
void Ball::hitLeftRight() {
speed.x = -speed.x;
}
void Ball::hitTopBottom() {
speed.y = -speed.y;
}
void Ball::reset() {
// for a new level in game
pos.x = start.x;
pos.y = start.y;
}
void Ball::update() {
// called every frame
pos.y += speed.y;
pos.x += speed.x;
ballShape.setPosition(pos);
}

You could add a random angle to the ball in addition to just flipping the velocities. By using the <random> header and generating values between 0 and 2 * pi, you can add a random velocity in any direction. Alternatively, as in the example below, you can limit the angle to between -22.5 and 22.5 degrees, that is, between -pi / 8.0 and pi / 8.0 radians.
Random Angle
You could of course tweak those values based on the angle of impact, but that is implementation specific. Below is an example of how you could generate such numbers:
#include <random>
#include <cmath>
#include <iostream>
int main() {
constexpr double pi = 3.14159;
constexpr double bounceSpeed = 5.0;
std::random_device seed;
std::mt19937 generator(seed());
// Numbers between -pi / 8 and pi / 8 give angles between -22.5 and 22.5 degrees
std::uniform_real_distribution<double> random(-pi / 8.0, pi / 8.0);
// Draw the angle once so that both components describe the same direction
double angle = random(generator);
double deltaX = cos(angle) * bounceSpeed;
double deltaY = sin(angle) * bounceSpeed;
std::cout << deltaX << "\n" << deltaY << "\n";
return 0;
}
Afterwards, you could add deltaX and deltaY to your respective x and y velocities.
Plain Random
Or if you're satisfied with just any purely random velocity:
// Generate random double in range [min, max]
double uniform(double min, double max) {
static std::random_device seed;
static std::mt19937 generator(seed());
std::uniform_real_distribution<double> random(min, max);
return random(generator);
}
Call that function twice with the velocity range you desire, and add that to the x and y of your ball.
Keeping Speed
To keep the speed of the ball constant, you could normalize the velocity vector after adding the random velocity and then multiply it by the desired speed of your ball.
To normalize, divide by length:
#include <cmath>
#include <iostream>
int main() {
constexpr double speed = 4.0;
double x = 3.0;
double y = 4.0;
double length = sqrt(x * x + y * y);
x /= length;
y /= length;
x *= speed;
y *= speed;
std::cout << x << "\n" << y << "\n";
return 0;
}
Then multiply by speed to keep it consistent.

To start, have a set speed for the ball as a float. Then have a velocity which you move by every update with x and y members like what you have now. Calculate how far you move x and y using your speed, your angle of movement, and some trigonometry. (Remember to convert degrees to radians)
float Speed = 10;
float Angle = 45;
Velocity.x = cos(Angle * 3.14159 / 180) * Speed;
Velocity.y = sin(Angle * 3.14159 / 180) * Speed;
Whenever you encounter a collision, you can recalculate your angle, add your new random angle, and recalculate your velocity. (atan2 returns radians, so convert the result to degrees.)
//if (collision)
Angle = atan2(Velocity.y, Velocity.x) * 180 / 3.14159;
Angle += rand() % 90 + 1; // Could also be subtracting here
Velocity.x = cos(Angle * 3.14159 / 180) * Speed;
Velocity.y = sin(Angle * 3.14159 / 180) * Speed;
Add or subtract from the angle based on what side you have struck and from what angle.
void HitTop()
{
if (Velocity.x > 0)
//Subtract random angle
else
//Add random angle
}
Do this for all sides.

Let your border have an arbitrary shape (not only a rectangle). If the border's unit normal vector at the bounce point is
n = (n.x, n.y)
then the speed vector after bouncing becomes:
dot = speed.x * n.x + speed.y * n.y
//after reflection
newspeed.x = speed.x - 2 * dot * n.x
newspeed.y = speed.y - 2 * dot * n.y
To add some randomness, just rotate the normal by a small random angle:
df = GaussianRandom(Mean = 0, Sigma = Pi / 30) //arbitrary parameter
//in C++ this is std::normal_distribution from <random>
n'.x = n.x * Cos(df) - n.y * Sin(df)
n'.y = n.x * Sin(df) + n.y * Cos(df)
and use n' to calculate the reflection. Note that this approach preserves the speed magnitude.
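For reference, the reflection with a jittered normal could be sketched in C++ like this. Vec2, rotate, reflect and bounce are hypothetical names, not part of the question's Ball class, and sigma must be greater than zero because std::normal_distribution requires a positive standard deviation.

```cpp
#include <cmath>
#include <random>

struct Vec2 {
    double x;
    double y;
};

// Rotate v counterclockwise by the given angle in radians.
Vec2 rotate(Vec2 v, double angle) {
    const double c = std::cos(angle);
    const double s = std::sin(angle);
    return Vec2{v.x * c - v.y * s, v.x * s + v.y * c};
}

// Mirror the speed vector about a surface with the given unit normal.
Vec2 reflect(Vec2 speed, Vec2 unitNormal) {
    const double dot = speed.x * unitNormal.x + speed.y * unitNormal.y;
    return Vec2{speed.x - 2.0 * dot * unitNormal.x,
                speed.y - 2.0 * dot * unitNormal.y};
}

// Reflect about a normal that has been rotated by a small Gaussian angle.
// The result has the same length as the incoming speed vector.
template <class Rng>
Vec2 bounce(Vec2 speed, Vec2 unitNormal, double sigma, Rng& rng) {
    std::normal_distribution<double> jitter(0.0, sigma);
    return reflect(speed, rotate(unitNormal, jitter(rng)));
}
```

For the top wall of the question's window (y grows downward), the normal would be (0, 1). One caveat: at very shallow impact angles a large random rotation can leave the ball still heading into the wall, so you may want to re-draw the angle in that case.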

interior angles of irregular polygon with angles > 180

I'm trying to calculate the values shown in the picture in red i.e. the interior angles.
I've got an array of the points where lines intersect and have tried using the dot-product but it only returns the smallest angles. I need the full range of internal angles (0-359) but can't seem to find much that meets this criteria.

Assuming your points are listed in standard counterclockwise order, the following should work:
void angles(double points[][2], double angles[], int npoints){
for(int i = 0; i < npoints; i++){
int last = (i - 1 + npoints) % npoints;
int next = (i + 1) % npoints;
double x1 = points[i][0] - points[last][0];
double y1 = points[i][1] - points[last][1];
double x2 = points[next][0] - points[i][0];
double y2 = points[next][1] - points[i][1];
double theta1 = atan2(y1, x1)*180/3.14159265358979323;
double theta2 = atan2(y2, x2)*180/3.14159265358979323;
angles[i] = (180 + theta1 - theta2 + 360);
while(angles[i]>=360)angles[i]-=360; // keep the result in [0, 360)
}
}
Obviously, if you are using some sort of data structure for your points, you will want to replace double points[][2] and references to it with references to your data structure.

You can obtain the full angle range (-Pi..Pi) with the atan2 function:
atan2(crossproduct, dotproduct)
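A minimal C++ sketch of the cross/dot approach, assuming the vertices are listed counterclockwise. Point and interiorAngles are hypothetical names. The angle is measured from the edge toward the next vertex to the edge toward the previous vertex, and negative results are shifted by 360 so that reflex angles come out above 180.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Point = std::pair<double, double>;  // (x, y)

// Interior angles in degrees, in [0, 360), for a counterclockwise polygon.
std::vector<double> interiorAngles(const std::vector<Point>& pts) {
    const double pi = std::acos(-1.0);
    const std::size_t n = pts.size();
    std::vector<double> angles(n);
    for (std::size_t i = 0; i < n; ++i) {
        const Point& prev = pts[(i + n - 1) % n];
        const Point& cur = pts[i];
        const Point& next = pts[(i + 1) % n];
        // a points to the previous vertex, b points to the next vertex.
        const double ax = prev.first - cur.first;
        const double ay = prev.second - cur.second;
        const double bx = next.first - cur.first;
        const double by = next.second - cur.second;
        const double cross = bx * ay - by * ax;
        const double dot = bx * ax + by * ay;
        double degrees = std::atan2(cross, dot) * 180.0 / pi;
        if (degrees < 0.0) {
            degrees += 360.0;  // reflex angle
        }
        angles[i] = degrees;
    }
    return angles;
}
```

If your points are in clockwise order instead, swapping the roles of a and b (or reversing the list) gives the interior rather than the exterior angles.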

bandpass FIR filter

I need to make a simple bandpass audio filter.
So far I've used this simple C++ class: http://www.cardinalpeak.com/blog/a-c-class-to-implement-low-pass-high-pass-and-band-pass-filters
It works well and cuts off the desired bands. But when I change the upper or lower limit in small steps, at some limit values I hear the wrong result: the sound is attenuated or shifted in frequency (not corresponding to the current limits).
Function for calculating impulse response:
void Filter::designBPF()
{
int n;
float mm;
for(n = 0; n < m_num_taps; n++){
mm = n - (m_num_taps - 1.0) / 2.0;
if( mm == 0.0 ) m_taps[n] = (m_phi - m_lambda) / M_PI;
else m_taps[n] = ( sin( mm * m_phi ) -
sin( mm * m_lambda ) ) / (mm * M_PI);
}
return;
}
where
m_lambda = M_PI * Fl / (Fs/2);
m_phi = M_PI * Fu / (Fs/2);
Fs - sample rate (44100 Hz)
Fl - lower limit
Fu - upper limit
And simple filtering function:
float Filter::do_sample(float data_sample)
{
int i;
float result;
if( m_error_flag != 0 ) return(0);
for(i = m_num_taps - 1; i >= 1; i--){
m_sr[i] = m_sr[i-1];
}
m_sr[0] = data_sample;
result = 0;
for(i = 0; i < m_num_taps; i++) result += m_sr[i] * m_taps[i];
return result;
}
Do I need to use any window function (Blackman, etc.)? If yes, how do I do this?
I have tried multiplying my impulse response by a Blackman window:
m_taps[n] *= 0.42 - 0.5 * cos(2.0 * M_PI * n / double(N - 1)) +
0.08 * cos(4.0 * M_PI * n / double(N - 1));
but the result was wrong.
And do I need to normalize taps?

I found a good free implementation of FIR filter:
http://www.iowahills.com/A7ExampleCodePage.html
...This Windowed FIR Filter C Code has two parts, the first is the
calculation of the impulse response for a rectangular window (low
pass, high pass, band pass, or notch). Then a window (Kaiser, Hanning,
etc) is applied to the impulse response. There are several windows to
choose from...

y[i] = waveform[i] × (0.42659071 - 0.49656062 cos(w) + 0.07684867 cos(2w))
where w = 2πi/n and n is the number of elements in the waveform.
Try this. I got the formula from:
http://zone.ni.com/reference/en-XX/help/370592P-01/digitizers/blackman_window/
I hope this helps.
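As a rough sketch of how the window is applied, here is a standalone C++ version of the band-pass design from the question with each ideal tap multiplied by a Blackman window of the same length. designBandpassBlackman and gainAt are hypothetical free functions, not members of the Filter class, and the window uses the 0.42 / 0.5 / 0.08 coefficients from the question rather than the "exact" ones quoted above. It assumes numTaps is at least 2.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Windowed-sinc band-pass taps: ideal response times a Blackman window.
// fl and fu are the band edges in Hz, fs is the sample rate in Hz.
std::vector<double> designBandpassBlackman(std::size_t numTaps, double fl,
                                           double fu, double fs) {
    const double pi = std::acos(-1.0);
    const double lambda = 2.0 * pi * fl / fs;  // lower edge, radians/sample
    const double phi = 2.0 * pi * fu / fs;     // upper edge, radians/sample
    const double last = static_cast<double>(numTaps) - 1.0;
    std::vector<double> taps(numTaps);
    for (std::size_t n = 0; n < numTaps; ++n) {
        const double mm = static_cast<double>(n) - last / 2.0;
        const double ideal =
            (mm == 0.0) ? (phi - lambda) / pi
                        : (std::sin(mm * phi) - std::sin(mm * lambda)) / (mm * pi);
        // The window length must match the number of taps.
        const double window = 0.42 - 0.5 * std::cos(2.0 * pi * n / last) +
                              0.08 * std::cos(4.0 * pi * n / last);
        taps[n] = ideal * window;
    }
    return taps;
}

// Magnitude of the filter's frequency response at frequency f (Hz).
double gainAt(const std::vector<double>& taps, double f, double fs) {
    const double pi = std::acos(-1.0);
    double re = 0.0, im = 0.0;
    for (std::size_t n = 0; n < taps.size(); ++n) {
        const double angle = 2.0 * pi * f * static_cast<double>(n) / fs;
        re += taps[n] * std::cos(angle);
        im -= taps[n] * std::sin(angle);
    }
    return std::hypot(re, im);
}
```

With this design the gain in the middle of the pass band is already close to 1, so normalizing is optional. If you do want to normalize, one option is to divide every tap by gainAt(taps, (fl + fu) / 2, fs). The window also widens the transition bands, so a very narrow band needs more taps.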

Complex Numbers and Naive Fourier Transform (C++)

I'm trying to get Fourier transforms to work. I have to do it for an assignment, and I think I have it to where it should be working, but I'm not sure why it's not. I think it has something to do with the complex numbers, since 'i' is involved. I've looked at many references and I understand the formula, but I'm having trouble programming it. This is what I have so far:
void NaiveDFT::Apply( Image & img )
{
//make the fourier transform using the naive method and set that to the image.
Image dft(img);
Pixel ** dftData = dft.GetImageData();
Pixel ** imgData = img.GetImageData();
for(unsigned u = 0; u < img.GetWidth(); ++u)
{
for(unsigned v = 0; v < img.GetHeight(); ++v)
{
std::complex<double> sum = 0;
for(unsigned x = 0; x < img.GetWidth(); ++x)
{
for(unsigned y = 0; y < img.GetHeight(); ++y)
{
std::complex<double> i = sqrt(std::complex<double>(-1));
std::complex<double> theta = 2 * M_PI * (((u * x) / img.GetWidth()) + ((v * y) / img.GetHeight()));
sum += std::complex<double>(imgData[x][y]._red) * cos(theta) + (-i * sin(theta));
//sum += std::complex<double>(std::complex<double>(imgData[x][y]._red) * pow(EULER, -i * theta));
}
}
dftData[u][v] = (sum.imag() / (img.GetWidth() * img.GetHeight()));
}
}
img = dft;
}
I have a few test images that I'm testing this with, and I'm getting either an all-black image or an all-gray image.
I've also tried the sum of e^(-i*2*PI*(x*u*width + y*v*height) * 1/width * height, which gets the same result as expected, although it's still not the desired output.
I've also tried using sum.real(), and that doesn't look right either.
If anyone has any tips or can point me in the right direction, that'd be great. At this point, I just keep trying different things and checking the output until I get what I should be getting.
Thanks.

I think the problem is in the multiplication with the complex term: the pixel value has to multiply the whole exponential, not only the cosine. The line:
sum += std::complex<double>(imgData[x][y]._red) * cos(theta) + (-i * sin(theta));
should be:
sum += std::complex<double>(imgData[x][y]._red) * ( cos(theta) + -i * sin(theta));
Moreover, while calculating theta you need to use double precision:
std::complex<double> theta = 2 * M_PI * ((((double)u * x) / (double)(img.GetWidth())) + (((double)v * y) / (double)(img.GetHeight())));
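Putting both fixes together, a self-contained sketch of the naive 2D DFT might look like the following. It works on a plain grid of doubles rather than the Image and Pixel classes from the question (Grid, ComplexGrid and naiveDft2d are hypothetical names), builds e^(-i*theta) with std::polar, and assumes a rectangular, non-empty grid.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<double>>;
using ComplexGrid = std::vector<std::vector<std::complex<double>>>;

// Naive O(N^4) 2D DFT, normalized by width * height as in the question.
// img is indexed as img[x][y]; all columns must have the same height.
ComplexGrid naiveDft2d(const Grid& img) {
    const double pi = std::acos(-1.0);
    const std::size_t width = img.size();
    const std::size_t height = width ? img[0].size() : 0;
    ComplexGrid out(width, std::vector<std::complex<double>>(height));
    for (std::size_t u = 0; u < width; ++u) {
        for (std::size_t v = 0; v < height; ++v) {
            std::complex<double> sum(0.0, 0.0);
            for (std::size_t x = 0; x < width; ++x) {
                for (std::size_t y = 0; y < height; ++y) {
                    // Convert to double before dividing to avoid integer division.
                    const double theta =
                        2.0 * pi * (static_cast<double>(u * x) / width +
                                    static_cast<double>(v * y) / height);
                    // std::polar(1, -theta) is cos(theta) - i * sin(theta).
                    sum += img[x][y] * std::polar(1.0, -theta);
                }
            }
            out[u][v] = sum / static_cast<double>(width * height);
        }
    }
    return out;
}
```

For display, the magnitude std::abs(out[u][v]) is usually what gets plotted, often on a log scale, rather than the real or imaginary part alone.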

Recursively create a sine wave given a single sine wave value and the period

I am trying to write a .oct function for Octave that, given a single sine wave value, between -1 and 1, and sine wave period, returns a sine wave vector of period length with the last value in the vector being the given sine wave value. My code so far is:
#include <octave/oct.h>
#include <octave/dColVector.h>
#include <math.h>
#define PI 3.14159265
DEFUN_DLD (sinewave_recreate, args, , "args(0) sinewave value, args(1) is period")
{
octave_value_list retval;
double sinewave_value = args(0).double_value ();
double period = args(1).double_value ();
ColumnVector output_sinewave(period);
double degrees_inc = 360 / period;
double output_sinewave_degrees;
output_sinewave_degrees = asin( sinewave_value ) * 180 / PI;
output_sinewave(period-1) = sin( output_sinewave_degrees * PI / 180 );
for (octave_idx_type ii (1); ii < period; ii++) // Start the loop
{
output_sinewave_degrees = output_sinewave_degrees - degrees_inc;
if ( output_sinewave_degrees < 0 )
{
output_sinewave_degrees += 360 ;
}
output_sinewave( period-1-ii ) = sin( output_sinewave_degrees * PI / 180 );
}
retval(0) = output_sinewave;
return retval;
}
but is giving patchy results. By this I mean that it sometimes recreates the sine wave quite accurately and other times it is way off. I have determined this simply by creating a given sine wave, taking the last value in time and plugging this into the function to recreate the sine wave backwards through time and then comparing plots of the two. Obviously I am doing something wrong, but I can't seem to identify what.

Let's start with some trigonometric identities:
sin(x)^2 + cos(x)^2 == 1
sin(x+y) == sin(x)*cos(y) + sin(y)*cos(x)
cos(x+y) == cos(x)*cos(y) - sin(x)*sin(y)
Given the sine and cosine at a point x, we can exactly calculate the values after a step of size d, after precalculating sd = sin(d) and cd = cos(d):
sin(x+d) = sin(x)*cd + cos(x)*sd
cos(x+d) = cos(x)*cd - sin(x)*sd
Given the initial sine value, you can calculate the initial cosine value:
cos(x) = sqrt(1 - sin(x)^2)
Note that there are two possible solutions, corresponding to the two possible square-root values. Also note that all the angles in these identities are in radians, and d needs to be negative if you're going back through the wave.
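The identities above could be turned into a backwards recurrence along these lines. This is a sketch in plain C++ rather than an .oct function; sineBackwards is a hypothetical name, and the rising flag is my assumption for how the caller resolves the sign of the square root (true if the wave is increasing at the last sample).

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Rebuild `count` samples ending at lastValue, stepping back `step` radians
// per sample. `rising` selects the positive or negative cosine root.
std::vector<double> sineBackwards(double lastValue, bool rising,
                                  std::size_t count, double step) {
    std::vector<double> wave(count);
    double s = lastValue;
    double c = std::sqrt(std::max(0.0, 1.0 - s * s));
    if (!rising) {
        c = -c;
    }
    // Precompute sin(d) and cos(d) for the negative step d = -step.
    const double sd = std::sin(-step);
    const double cd = std::cos(-step);
    for (std::size_t k = count; k-- > 0;) {
        wave[k] = s;
        const double nextS = s * cd + c * sd;  // sin(x + d)
        const double nextC = c * cd - s * sd;  // cos(x + d)
        s = nextS;
        c = nextC;
    }
    return wave;
}
```

For a wave with a period of N samples, step would be 2 * pi / N.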

Mike's note that there are two possible solutions for cos(x) made me realise that I would need to resolve the phase ambiguity of the sine wave. My second, successful attempt at this function is:
#include <octave/oct.h>
#include <octave/dColVector.h>
#include <math.h>
#define PI 3.14159265
DEFUN_DLD (sinewave_recreate_3, args, , "args(0) sinewave value, args(1) is period, args(2) is the phase")
{
octave_value_list retval;
double sinewave_value = args(0).double_value ();
double period = args(1).double_value ();
double phase = args(2).double_value ();
ColumnVector output_sinewave(period);
double X0 = asin(sinewave_value);
if (sinewave_value < 0 && phase > 180 && phase < 270)
{
X0 = PI + (0 - X0);
}
if (sinewave_value < 0 && phase >= 270)
{
X0 = X0 + 2 * PI;
}
if (sinewave_value > 0 && phase > 90)
{
X0 = PI - X0;
}
if (sinewave_value > 0 && phase < 0)
{
X0 = X0 + PI / 2;
}
double dx = PI / 180 * (360/period);
for (octave_idx_type ii (0); ii < period; ii++) // Start the loop
{
output_sinewave(period-1-ii) = sin(X0 - dx * ii);
}
retval(0) = output_sinewave;
return retval;
}
Thanks are also due to Keynslug.

There is a simple formula. Here is an example in Python:
import math
import numpy as np
# We assume the step is 1 degree, expressed in radians
T = math.radians(1.0)
PrevBeforePrevValue = np.sin(math.radians(49.0)) # y(t-2)
PrevValue = np.sin(math.radians(50.0)) # y(t-1)
ValueNowRecursiveFormula = ((2.0*(4.0-T*T))/(4.0+T*T))*PrevValue - PrevBeforePrevValue
print("From RECURSIVE formula - " + str(ValueNowRecursiveFormula))
The details can be found here:
http://howtodoit.com.ua/en/on-the-way-of-developing-recursive-sinewave-generator/
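For comparison, a two-term recurrence of this shape can be made exact with the identity sin(x + d) + sin(x - d) = 2 * sin(x) * cos(d), so the coefficient becomes 2 * cos(d). A minimal C++ sketch, with nextSine as a hypothetical name:

```cpp
#include <cmath>

// Next sample of a sine wave from the two previous samples.
// stepRadians is the phase advance per sample.
double nextSine(double prev, double prevPrev, double stepRadians) {
    return 2.0 * std::cos(stepRadians) * prev - prevPrev;
}
```

Swapping the roles of the two samples runs the same recurrence backwards in time, which is what the question needs.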

You might try an easier way.
Just recall that if
y = sin(x)
then the first derivative of y is
dy/dx = cos(x)
So at every step of the computation you add to the current value of y a delta equal to
dy = cos(x) * dx
This may reduce your accuracy as a side effect, so check the results against your requirements. HTH.
A slightly improved equation tends to be more accurate:
dy = cos(x + dx/2) * dx
Take a look at this.
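A small C++ sketch of the midpoint variant, in case it helps to experiment with the accuracy. sineByMidpointSteps is a hypothetical name; x0 and dx are in radians.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Approximate sin(x0 + k * dx) for k = 0 .. count - 1 by accumulating
// dy = cos(x + dx / 2) * dx at every step.
std::vector<double> sineByMidpointSteps(double x0, double dx, std::size_t count) {
    std::vector<double> wave(count);
    double x = x0;
    double y = std::sin(x0);
    for (std::size_t k = 0; k < count; ++k) {
        wave[k] = y;
        y += std::cos(x + dx / 2.0) * dx;
        x += dx;
    }
    return wave;
}
```

Because sin(x + dx) - sin(x) equals 2 * sin(dx / 2) * cos(x + dx / 2) exactly, the midpoint form only overstates each increment by the constant factor dx / (2 * sin(dx / 2)), so the error stays bounded instead of drifting. It still calls cos at every step, though, so it does not save any work compared with calling sin directly.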
