I am trying to write a simple gradient descent algorithm in C++ (for 10,000 iterations). Here is my program:
#include<iostream>
#include<cmath>
using namespace std;
int main(){
double learnrate=10;
double x=10.0; //initial start value
for(int h=1; h<=10000; h++){
x=x-learnrate*(2*x + 100*cos(100*x));
}
cout<<"The minimum is at y = "<<x*x + sin(100*x)<<" and at x = "<<x;
return 0;
}
The output ends up being: y=nan and x=nan. I tried looking at the values of x and y by putting them into a file, and after a certain number of iterations, I am getting all NaNs (for x and y). Edit: I picked the learning rate (or step size) to be 10 as an experiment; I will use much smaller values afterwards.
There must be something wrong with your formula. Already the first 10 values of x are increasing like hell:
-752.379
15290.7
-290852
5.52555e+06
-1.04984e+08
1.9947e+09
-3.78994e+10
7.20088e+11
-1.36817e+13
2.59952e+14
For almost any starting value, the absolute value of the next x will be bigger, because with learnrate = 10 the update is
|next_x| = | x - 20 * x - 1000 * cos(100*x) |
For example consider what happens when you choose a very small starting value (|x|->0), then
|next_x| = | 0 - 20 * 0 - 1000 * cos ( 0 ) | = 1000
You get NaN because at around h=240 the variable "x" exceeds the limits of the double type (1.79769e+308). This is a diverging geometric progression: each step multiplies |x| by roughly 19. You need to reduce your learn rate.
A couple more things:
1- Do not use "using namespace std;"; it is bad practice.
2- You can use the std::isnan() function to identify this situation.
Here is an example:
#include <cmath>
#include <iomanip>
#include <iostream>
#include <limits>
int main()
{
double learnrate = 10.0;
double x = 10.0; //initial start value
std::cout<<"double type maximum=" << std::numeric_limits<double>::max()<<std::endl;
bool failed = false;
for (int h = 1; h <= 10000; h++)
{
x = x - learnrate*(2.0*x + 100.0 * std::cos(100.0 * x));
if (std::isnan(x))
{
failed = true;
std::cout << " Nan detected at h=" << h << std::endl;
break;
}
}
if(!failed)
std::cout << "The minimum is at y = " << x*x + std::sin(100.0*x) << " and at x = " << x;
return 0;
}
Print x before the call to the cosine function and you will see that the last number printed before NaN (at h = 240) is:
-1.7761e+307
This means that the value overflows the double type: the next update produces infinity, and the cosine of infinity is NaN (Not a Number), which then propagates into x.
If you use long double, you will succeed in 1000 iterations (on platforms where long double has a larger range than double), but you will still overflow the type with 10000 iterations.
So the problem is that the parameter learnrate is just too big. You should take smaller steps, optionally while using a data type with larger range, as I suggested above.
The "learn rate" is far too high. Change it to 1e-4, for example, and the program works, for an initial value of 10 at least. When the learnrate is 10, the iterations jump too far past the solution.
At its best, gradient descent is not a good algorithm. For serious applications you want to use something better. Much better. Search for Brent optimizer and BFGS.
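As a rough illustration of the smaller learning rate suggested above, here is a minimal sketch that wraps the loop from the question in a function and stops as soon as x is no longer finite. The names gradient and gradient_descent are made up for this example and are not from the original program.

```cpp
#include <cmath>

// Gradient of f(x) = x*x + sin(100*x), the function from the question.
double gradient(double x) {
    return 2.0 * x + 100.0 * std::cos(100.0 * x);
}

// Plain gradient descent; stops early if x stops being finite.
// Name and signature are illustrative, not from the original post.
double gradient_descent(double x, double learnrate, int iterations) {
    for (int h = 1; h <= iterations; ++h) {
        x -= learnrate * gradient(x);
        if (!std::isfinite(x)) {
            break;  // diverged: the learning rate is too large
        }
    }
    return x;
}
```

Calling gradient_descent(10.0, 1e-4, 10000) should settle in a local minimum near the starting point, while gradient_descent(10.0, 10.0, 10000) returns a non-finite value. Note that this only finds a local minimum; the sin(100*x) term gives the function many of them.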
#include<iostream>
using namespace std;
double log(double x,int n)
{
static double p = x ;
double s;
if(n==1)
return x;
else
{
s=log(x,n-1);
p*=x;
if(n%2==0)
return s - (p/n);
else
return s + (p/n);
}
}
int main()
{
double r = log(1,15);
cout << r;
return 0;
}
I tried writing the above function for evaluating the log(1+x) function using its Taylor series with recursion. But it didn't give the result I expected.
Eg :
ln(2) = 0.693 whereas my code gave 0.725. In the above code, n represents the number of terms.
Also I am new to this platform, so can I say that the above question is complete or does it need some additional information for further explanation?
There is nothing wrong with that piece of code: this has obviously got to do with the rate of convergence of the Taylor series.
If you take n = 200 instead of n = 15 in your code, the approximation error will be low enough that the first two decimals of the exact solution ln(2) = 0.693147... will be the correct ones.
The more you increase the n parameter, the better approximation you will get of ln(2).
Your program does converge to the right number, just very slowly...
log(1,15) returns 0.725, as you noticed, log(1,50) is 0.683, log(1,100) is 0.688, and log(1,200) is 0.691. That's getting close to the number you expected, but still a long way to go...
So there is no C++ or recursion bug in your code - you just need to find a better Taylor series to calculate log(X). Don't look for a Taylor series for log(1+x) - these will typically assume x is small, and converge quickly for small x, not for x=1.
To get some more practice in C++, I decided to do some basic math functions without the aid of the math library. I've made a power and factorial function and they seem to work well. However, I'm having lots of problems regarding my Taylor Series cosine function.
Wikipedia Cosine Taylor Series
It outputs a good approximation at cos(1), cos(2), and begins losing precision at cos(3) and cos(4). Beyond that, its answer becomes completely wrong. The following are results from ./a.out
Input an angle in radians, output will be its cosine
1
Output is: 0.540302
Input an angle in radians, output will be its cosine
2
Output is: -0.415873
Input an angle in radians, output will be its cosine
3
Output is: -0.974777
Input an angle in radians, output will be its cosine
4
Output is: -0.396825 <-------------Should be approx. -0.654
Input an angle in radians, output will be its cosine
5
Output is: 2.5284 <-------------Should be approx. 0.284
Here is the complete source code:
#include <iostream>
#include <iomanip>
using std::cout;
using std::cin;
using std::endl;
int factorial(int factorial_input) {
int original_input = factorial_input;
int loop_length = factorial_input - 1;
if(factorial_input == 1 || factorial_input == 0) {
return 1;
}
for(int i=1; i != loop_length; i++) {
factorial_input = factorial_input - 1;
original_input = original_input * factorial_input;
}
return original_input;
}
double power(double base_input, double exponent_input) {
double power_output = base_input;
if(exponent_input == 0) {
return 1;
}
if(base_input == 0) {
return 0;
}
for(int i=0; i < exponent_input -1; i++){
power_output = power_output * base_input;
}
return power_output;
}
double cos(double user_input) {
double sequence[5] = { 0 }; //The container for each generated elemement.
double cos_value = 0; //The final output.
double variable_x = 0; //The user input x, being raised to the power 2n
int alternating_one = 0; //The (-1) that is being raised to the nth power,so switches back and forth from -1 to 1
int factorial_denom = 0; //Factorial denominator (2n)!
int loop_lim = sizeof(sequence)/sizeof(double); //The upper limit of the series (where to stop), depends on size of sequence. Bigger is more precision.
for(int n=0; n < loop_lim; n++) {
alternating_one = power(-1, n);
variable_x = power(user_input, (n*2));
factorial_denom = factorial((n*2));
sequence[n] = alternating_one * variable_x / factorial_denom;
cout << "Element[" << n << "] is: " << sequence[n] << endl; //Prints out the value of each element for debugging.
}
//This loop sums together all the elements of the sequence.
for(int i=0; i < loop_lim; i++) {
cos_value = cos_value + sequence[i];
}
return cos_value;
}
int main() {
double user_input = 0;
double cos_output;
cout << "Input an angle in radians, output will be its cosine" << endl;
cin >> user_input;
cos_output = cos(user_input);
cout << "Output is: " << cos_output << endl;
}
At five iterations, my function should maintain accuracy until after around x > 4.2 according to this graph on Desmos:
Desmos Graph
Also, when I set the series up to use 20 iterations or more (it generates smaller and smaller numbers which should make the answer more precise), the elements start acting very unpredictable. This is the ./a.out with the sequence debugger on so that we may see what each element contains. The input is 1.
Input an angle in radians, output will be its cosine
1
Element[0] is: 1
Element[1] is: -0.5
Element[2] is: 0.0416667
Element[3] is: -0.00138889
Element[4] is: 2.48016e-05
Element[5] is: -2.75573e-07
Element[6] is: 2.08768e-09
Element[7] is: -7.81894e-10
Element[8] is: 4.98955e-10
Element[9] is: 1.11305e-09
Element[10] is: -4.75707e-10
Element[11] is: 1.91309e-09
Element[12] is: -1.28875e-09
Element[13] is: 5.39409e-10
Element[14] is: -7.26886e-10
Element[15] is: -7.09579e-10
Element[16] is: -4.65661e-10
Element[17] is: -inf
Element[18] is: inf
Element[19] is: -inf
Output is: -nan
Can anyone point out what things I'm doing wrong and what I should be doing better? I'm new to C++ so I still have a lot of misconceptions. Thank you so much for taking the time to read this!
You have the following problems:
In the graph you are showing in the picture k is included in the sum, while you are excluding it in your code. Therefore k=5 in the Desmos graph is equal to double sequence[6] = { 0 } in your code.
This fixes the output for user_input = 4.
For user_input = 5 you can then compare to the graph to see that it gives a similar result as well (which is already far off of the true value)
Then you will have bugs for larger number of terms, because the factorial function outputs int, but the factorial grows so quickly that it will go out-of-range of the values int can hold quickly and also quickly out-of-range of any integer type. You should return double and let original_input be double as well, if you want to support a somewhat (though not much) larger input range.
In power you take the exponent as double, but work with it as if it was an integer. In particular you use it for the limit of loop iterations. That will only work correctly as long as the values are small enough to be exactly representable by double. As soon as the values become larger, the number of loop iterations will become inexact.
Use int as second parameter to power instead.
If one were to implement cos with this approach, one would normally use cos symmetry first, to reduce the range to something smaller, e.g. [0,pi/2] first, by using e.g. that cos(x + 2pi) = cos(x) and cos(x+pi) = - cos(x) and cos(-x) = cos(x), etc.
The problem comes from the factorial function you implemented.
I made minimal changes to your code and it runs fine for your example calculation of cos(1). Just #include <cmath> and replace factorial((n*2)) by tgamma(2*n+1). The output then reads
Input an angle in radians, output will be its cosine
Element[0] is: 1
Element[1] is: -0.5
Element[2] is: 0.0416667
Element[3] is: -0.00139082
Element[4] is: 2.48022e-05
Element[5] is: -2.75573e-07
Element[6] is: 2.08768e-09
Element[7] is: 4.65661e-10
Element[8] is: -4.65661e-10
Element[9] is: 4.65661e-10
Element[10] is: -4.65661e-10
Element[11] is: 4.65661e-10
Element[12] is: -4.65661e-10
Element[13] is: 4.65661e-10
Element[14] is: -4.65661e-10
Element[15] is: 4.65661e-10
Element[16] is: -4.65661e-10
Element[17] is: 4.65661e-10
Element[18] is: -4.65661e-10
Element[19] is: 4.65661e-10
Output is: 0.5403
This is the expected output for cos(1). For cos(n) with n>1 the problem is that the values for factorial_denom are getting too big for an integer. You should change the type to double: double factorial_denom. With your modified code I am getting the following results:
cos(1): Output is: 0.5403
cos(2): Output is: -0.416147
cos(3): Output is: -0.989992
cos(4): Output is: -0.653644
cos(5): Output is: 0.283662
In addition to the changes already suggested, consider limiting the use of the series to a relatively narrow range of inputs. There are numerical problems you can encounter for very large angles, and they increase the amount of testing you need to do.
The cosine function has several identities, such as cos(x) = cos(-x) and cos(x) = cos(n*2*pi+x) for any integer n. Use these to reduce the angle to a limited range before running your series solution.
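The suggestions in the answers above (reduce the angle first, keep everything in double, avoid a separate integer factorial) can be sketched as follows. This is an illustration rather than a definitive implementation: the name taylor_cos is made up, and std::fmod from <cmath> is used for the range reduction purely for brevity, even though the question aimed to avoid the math library.

```cpp
#include <cmath>

// Taylor-series cosine with range reduction to [0, pi].
// The name taylor_cos and the default of 20 terms are assumptions.
double taylor_cos(double x, int terms = 20) {
    const double pi = 3.14159265358979323846;
    x = std::fmod(x, 2.0 * pi);    // cos(x + 2*pi*n) = cos(x)
    if (x < 0.0) x = -x;           // cos(-x) = cos(x)
    if (x > pi) x = 2.0 * pi - x;  // cos(2*pi - x) = cos(x)

    double term = 1.0;  // n = 0 term: x^0 / 0!
    double sum = term;
    for (int n = 1; n < terms; ++n) {
        // term_n = term_(n-1) * (-x^2) / ((2n-1) * (2n)),
        // so no factorial or power function is needed.
        term *= -x * x / ((2.0 * n - 1.0) * (2.0 * n));
        sum += term;
    }
    return sum;
}
```

Building each term from the previous one keeps the intermediate values small, which sidesteps the integer overflow in factorial that produced the inf and nan elements.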
I'm very new to C++ programming, and have written a simple program to calculate the factorial of an integer provided by the user. I am attempting to account for inputs which would cause an error, or do not make sense (e.g. I have accounted for input of a negative number/-1 already). I want to print out an error if the user enters a number whose factorial would be larger than the maximum integer size.
I started with:
if(factorial(n) > INT_MAX)
std::cout << "nope";
continue
I tested this with n = ~25 or 26 but it doesn't prevent the result from overflowing and printing out a large negative number instead.
Second, I tried assigning this to a variable using a function from the 'limits.h' header and then comparing the result of factorial(n) against this. Still no luck (you can see this solution in the code sample below).
I could of course assign the result to a long and test against that but you wouldn't have to go very far until you started to wrap around that value, either. I'd prefer to find a way to simply prevent the value from being printed if this happens.
#include <iostream>
#include <cstdlib>
#include <limits>
int factorial(int n)
{
auto total = 1;
for(auto i = 1; i <= n; i++)
{
total = total * i; //Product of all numbers up to n
}
return total;
}
int main()
{
auto input_toggle = true;
auto n = 0;
auto int_max_size = std::numeric_limits<int>::max();
while(input_toggle = true)
{
/* get user input, check it is an integer */
if (factorial(n) > int_max_size)
{
std::cout << "Error - Sorry, factorial of " << n << " is larger than \nthe maximum integer size supported by this system. " << std::endl;
continue;
}
/* else std::cout << factorial(n) << std::endl; */
As with my other condition(s), I expect it to simply print out that small error message and then continue asking the user for input to calculate. The code does work, it just continues to print values that have wrapped around if I request the factorial of a value >25 or so. I feel this kind of error-checking will be quite useful.
Thanks!
You are trying to do things backwards.
First, no integer can actually be bigger than INT_MAX, by definition - this is a maximum value integer can be! So your condition factorial(n) > int_max_size is always going to be false.
Moreover, there is a logical flaw in your approach. You calculate the value first and then check if it is less than the maximum value allowed. By that time it is too late! You have already calculated the value and gone through any overflows you might have encountered. Any check you might be performing should be performed while you are still doing your calculations.
In essence, you need to check if multiplying X by Y will be within the allowed range without actually doing the multiplication (unfortunately, C++ is very strict in leaving signed integer overflow undefined behavior, so you can't try and see).
So how do you check if X * Y will be no greater than Z? One approach would be to divide Z by Y before engaging in calculation. If you end up with a number which is less than X, you know that multiplying X by Y will result in overflow.
I believe you now have enough information to code the solution yourself.
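For readers who want to compare against their own attempt, the divide-before-multiply check described above might look like the sketch below. The function name checked_factorial and the bool-plus-output-parameter interface are choices made for this example only.

```cpp
#include <limits>

// Computes n! into result. Returns false if n is negative or if the
// product would not fit in an int; result is left untouched in that case.
// Name and interface are illustrative, not from the original post.
bool checked_factorial(int n, int& result) {
    if (n < 0) return false;
    const int int_max = std::numeric_limits<int>::max();
    int total = 1;
    for (int i = 2; i <= n; ++i) {
        // total * i would exceed int_max exactly when total > int_max / i.
        if (total > int_max / i) return false;
        total *= i;
    }
    result = total;
    return true;
}
```

With a 32-bit int this accepts inputs up to 12 and rejects 13 and above, so the caller can print the error message and continue instead of printing a wrapped-around value.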
I'm currently trying to solve a programming problem that involves different ranges of values that overlap. The task is to accept input, in E-notation, and that is where the overlap of range inevitably occurs.
I have 2 ranges that overlap at 1E-11. 1E-11 and lower and 1E-11 and higher
The output would be 1E-11 is either x or it is y. Programmatically i would solve it like this:
(X_MIN would be 1E-11 and X_MAX 1E-8)
(Y_MAX would be 1E-11 and Y_MIN 1E-13)
(lengthOfRange <= X_MIN) && (lengthOfRange >= Y_MAX) ?
cout << "This value entered indicates that it is x\n" :
cout << "It is y";
Expressed this way if i input 1E-11 it shows me "This value entered indicates ..." but will never show me it is y (understandably - overlap!)
The other way around would be expressing it this way:
(lengthOfRange <= X_MIN) && (lengthOfRange != Y_MAX) ?
cout << "This value entered indicates that it is x\n" :
cout << "It is y";
The output would always be "... It is y ..." (Same difference - overlap!) There is no other determining factor that would tell range is x or y coming in to play there as of right now.
...
if ((lengthOfRange <= X_MIN) && (lengthOfRange == Y_MAX))
{
cout << "The input indicates that it could be either x or y\n";
}
...
Even if i were to solve the problem in a way such as defining the range with different values, it would in the end lead to the very same problem. I COULD define MIN and MAX as constants in lengthOfFrequency, which is totally different, but then i would have to say: lengthOfFrequency = 1E-11; and voila same problem once again. 1 input 2 ranges that are technically different, getting the same one and only correct value in E-notation.
Is there a way around this without involving to simply say input is either x || y? Which it is technically of course, and if it were to be solved physically there are ways of telling it apart that 1E-11 is not 1E-11 though it is. (I hope i make sense here). But, again, ... is there such way, and how would i go about writing it? (Not asking for code specifically though it would be highly welcome, just a pointer in the right direction.) Or should i rather go with saying input is either x || y?
Thanks in advance for any answer!
Minimum Complete Code:
#include <iostream>
using std::cout;
using std::cin;
int main()
{
/* Constants for ranges, min and max */
const double X_RAYS_MIN = 1E-13,
X_RAYS_MAX = 1E-11,
Y_RAYS_MIN = 1E-11,
Y_RAYS_MAX = 1E-8,
Z_RAYS_MIN = 1E-7,
Z_RAYS_MAX = 3.8E-7;
double lengthOfRange;
/* Test output and validation */
cout << "Enter value in scientific notation: ";
cin >> lengthOfRange;
/* X_RAYS_MIN < is 1E-14, 1E-15, 1E-16 etc. > 1E-12, 1E-11 etc.. */
if (lengthOfRange >= X_RAYS_MIN && lengthOfRange <= X_RAYS_MAX)
{
cout << "X_RAYS\n";
}
else if (lengthOfRange >= Y_RAYS_MIN && lengthOfRange <= Y_RAYS_MAX)
{
cout << "Y_RAYS\n";
}
system("pause");
return 0;
}
Output is: 1E-10 is Y_RAYS, 1E-9 is Y_RAYS, 1E-11 X_RAYS, 1E-12 X_RAYS
Somehow i found the solution for my problem myself without going any roundabout ways ... By hovering over the 1E-13:
X_RAYS_MIN = 1E-13
VS showed me 1.(numberofzeros)3E-13, and guess what ... if instead the input for 1E-11 is 2E-11, the output for X_RAYS becomes Y_RAYS ... so the problem "magically" solved itself ... lucky me i guess ... :)
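One common way to handle a shared boundary like this is to make the ranges half-open, so that the boundary value belongs to exactly one of them. The sketch below assumes, purely for illustration, that 1E-11 is assigned to Y_RAYS; which side owns the boundary is a decision the problem statement has to make. The function name classify and the constant BOUNDARY are made up for this example.

```cpp
#include <string>

// Classifies a value using half-open ranges:
//   X_RAYS: [1E-13, 1E-11)   Y_RAYS: [1E-11, 1E-8]
// Assigning the boundary 1E-11 to Y_RAYS is an assumption.
std::string classify(double lengthOfRange) {
    const double X_RAYS_MIN = 1E-13;
    const double BOUNDARY   = 1E-11;  // X_RAYS_MAX == Y_RAYS_MIN
    const double Y_RAYS_MAX = 1E-8;

    if (lengthOfRange >= X_RAYS_MIN && lengthOfRange < BOUNDARY) {
        return "X_RAYS";
    }
    if (lengthOfRange >= BOUNDARY && lengthOfRange <= Y_RAYS_MAX) {
        return "Y_RAYS";
    }
    return "UNKNOWN";
}
```

With this layout every input falls into at most one range, so no "either x or y" case is needed.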
I know how to obtain the square root of a number using the sqrt function.
How can I obtain the cube root of a number?
sqrt stands for "square root", and "square root" means raising to the power of 1/2. There is no such thing as "square root with root 2", or "square root with root 3". For other roots, you change the first word; in your case, you are seeking how to perform cube rooting.
Before C++11, there was no specific function for this, but you can go back to first principles:
Square root: std::pow(n, 1/2.) (or std::sqrt(n))
Cube root: std::pow(n, 1/3.) (or std::cbrt(n) since C++11)
Fourth root: std::pow(n, 1/4.)
etc.
If you're expecting to pass negative values for n, avoid the std::pow solution — it doesn't support negative inputs with fractional exponents, and this is why std::cbrt was added:
std::cout << std::pow(-8, 1/3.) << '\n'; // Output: -nan
std::cout << std::cbrt(-8) << '\n'; // Output: -2
N.B. That . is really important, because otherwise 1/3 uses integer division and results in 0.
In C++11, std::cbrt was introduced as part of the math library.
#include <cmath>
std::pow(n, 1./3.)
Also, in C++11 there is cbrt in the same header.
Math for Dummies.
The nth root of x is equal to x^(1/n), so use std::pow. But I don't see what this has to do with operator overloading.
Just to point this out, though we can use both ways but
long long res = pow(1e9, 1.0/3);
long long res2 = cbrt(1e9);
cout<<res<<endl;
cout<<res2<<endl;
returns
999
1000
So, in order to get the correct results with the pow function we need to add an offset of 0.5 to the actual number or use a double data type, i.e.
long long res = pow(1e9+0.5, 1.0/3);
double res = pow(1e9, 1.0/3);
more detailed explanation here C++ pow unusual type conversion
Actually, the result must be rounded for the above solutions to work.
The Correct solution would be
ans = round(pow(n, 1./3.));
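When an exact integer cube root is needed, rounding alone still depends on the floating-point result being close enough. A slightly more defensive sketch rounds first and then corrects the candidate with integer arithmetic. The function name integer_cbrt is hypothetical, and the code assumes n is small enough that (r + 1) cubed does not overflow long long.

```cpp
#include <cmath>

// Returns the largest r with r*r*r <= n (the floor of the cube root).
// Assumes |n| is small enough that (r + 1)^3 fits in long long.
long long integer_cbrt(long long n) {
    // Start from the rounded floating-point estimate...
    long long r = std::llround(std::cbrt(static_cast<double>(n)));
    // ...then fix any off-by-one error using exact integer arithmetic.
    while (r * r * r > n) --r;
    while ((r + 1) * (r + 1) * (r + 1) <= n) ++r;
    return r;
}
```

For example, integer_cbrt(1000000000) returns 1000 and integer_cbrt(63) returns 3, regardless of whether the floating-point estimate lands slightly above or below the true root.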
The solution for this problem is
cube_root = pow(n,(float)1/3);
and you should include the <math.h> header (<cmath> in C++).
Older standards of C/C++ don't support the cbrt() function.
When we write code like cube_root = pow(n,1/3); the compiler evaluates 1/3 as 0 (integer division in C/C++), so you need to cast, e.g. (float)1/3, in order to get the correct answer.
#include <cmath>
#include <iostream>
int main(){
float n = 64 , cube_root ;
// (float)1/3 forces floating-point division; plain 1/3 would be 0
cube_root = std::pow(n , (float)1/3);
std::cout<<"cube root = "<<cube_root<<std::endl;
return 0;
}
cube root = 4
You can try this C algorithm :
#include <limits.h> // for CHAR_BIT
// return the integer part of the cube root of n, i.e. the largest a with a*a*a <= n.
unsigned cube_root(unsigned n){
unsigned a = 0, b;
for (int c = sizeof(unsigned) * CHAR_BIT / 3 * 3 ; c >= 0; c -= 3) {
a <<= 1;
b = 3 * a * (a + 1) + 1;
if (n >> c >= b)
n -= b << c, ++a;
}
return a;
}
I would discourage any of the above methods as they didn't work for me. I did pow(64, 1/3.) along with pow(64, 1./3.) but the answer I got was 3
Here's my logic.
ans = pow(n, 1/3.);
if (pow(ans, 3) != n){
ans++;
}