Output is NaN , how? - c++

I am trying to code Taylor series but I am getting 'nan' as output in case of large value of n(=100).
Where am I doing things wrong?
#include<iostream>
#include<cmath>
using namespace std;
int main(){
int n;
double x;
cin >> n;
cin >> x;
long double temp_val = 1;
int sign = 1;
int power = 1;
long long int factorial = 1;
for(int i = 1 ; i < n ; i++){
sign = sign* -1 ;
power = 2*i;
factorial = factorial*(2*i)*(2*i-1);
temp_val += sign*pow(x,power)/factorial;
}
cout<<temp_val;
}

For large n your program has undefined behavior.
You are calculating the factorial of 2n (so 200) in factorial. 200! is, according to Wolfram Alpha:
788657867364790503552363213932185062295135977687173263294742533244359449963403342920304284011984623904177212138919638830257642790242637105061926624952829931113462857270763317237396988943922445621451664240254033291864131227428294853277524242407573903240321257405579568660226031904170324062351700858796178922222789623703897374720000000000000000000000000000000000000000000000000
For comparison, the typical largest value that a long long int can hold is
9223372036854775807
(which is assuming it is 64-bit)
Clearly you will not be able to fit 200! into that. When you overflow a signed integer variable your program will have undefined behavior. That means that there will be no guarantee how it will behave.
But even if you change the variable type to be unsigned, not much will change. The program won't have undefined behavior anymore, but the factorial will not actually hold the correct value. Instead it will keep wrapping around back to zero.
Even if you change factorial to be type double, this will probably not be enough with at typical double implementation to hold this value. Your platform might have a long double type that is larger than double and able to hold this value.
You will have similar problems with pow(x, power) if x is not close to 1.
As mentioned in the answer by #idclev463035818 the Taylor series, if evaluated straightforwardly, is numerically very ill-behaved and can not really be used practically in this form for large n.

Calculating the taylor series has a trap that also occurs in other situations: Both the numerator and denominator of the terms to add grow rather fast and overflow easily, but their quotient converges to zero (otherwise adding them up till infinity would not converge to a finite number).
Instead of keeping track of both terms individually you need to update the result and the total increment. I wont provide you a full solution. In pseudo-code
double res = 0;
double delta = x;
int n = 1;
double sign = -1;
while ( ! stop_condition ) {
delta *= (x / n);
res += sign*delta;
++n;
sign *= -1;
}

Related

Casting float to int in C++

int uniquePaths(int m, int n) {
int num = m+n-2;
int den=1;
double ans = 1;
while(den<=m-1) {
ans = ans*(num--)/(den++);
}
cout<<ans;
return (int)ans;
}
The expected answer for m=53, n=4 as input to the above piece of code is 26235 but the code returns 26234. However, the stdout shows 26235.
Could you please help me understand this behavior?
Due to floating-point rounding, your code computes ans to be 26,234.999999999985448084771633148193359375. When it is printed with cout<<ans, the default formatting does not show the full value and rounds it to “26235”. However, when the actual value is converted to int, the result is 26,234.
After setting num to m+n-2, your code is computing num! / ((m-1)!(num-m+1)!), which of course equals num! / ((num-m+1)!(m-1)!). Thus, you can use either m-1 or num-m+1 as the limit. So you can change the while line to these two lines:
int limit = m-1 < num-m+1 ? m-1 : num-m+1;
while(den<=limit) {
and then your code will run to the lower limit, which will avoid dividing ans by factors that are not yet in it. All results will be exact integer results, with no rounding errors, unless you try to calculate a result that exceeds the range of your double format where it is able to represent all integers (up to 253 in the ubiquitous IEEE-754 binary64 format used for double).

Strange behaviour of integer overflow

I created the following code for finding the answer to the coin problem. This involves finding minimum number of coins of given k denominations (where each such coins are in infinite supply) to form a target sum n. In particular I have investigated the case where k=5, denominations = {2,3,4,5,6} and target sum n=100.
Code:
#include<iostream>
#include<algorithm>
using namespace std;
int coins[5] = {2,3,4,5,6};
int values[101];
int n=100;
int k=5;
int INF = INT_MAX;
int main()
{
for (int x=1;x<=n;x++)
{
values[x] = INF;
for (int j=1;j<=k;j++)
{
if (x-coins[j-1]>=0)
{
values[x] = min(values[x],values[x-coins[j-1]]+1);
}
}
}
cout<<values[100];
return 0;
}
The output to this code that I received is -2147483632. Clearly some kind of overflow must be occurring so I decided to output INF+1. And I got INT_MIN as the answer. But I had also remembered that often while outputting some numbers beyond the int range there was no such problem.
I decided to output 1e11 and to my surprise the answer was still 1e11. Why is this happening, please help.
Here:
values[x] = min(values[x],values[x-coins[j-1]]+1);
For example, for x=3 and coins[0]=2, you add values[1] + 1.
However, values[1] = INT_MAX. Then, you get an undefined behavior when performing this calculation.
You can solve the issue with INF = INT_MAX - 1;
If your program performs arithmetic on signed integers that produces result that is outside the range of representable values - i.e. if such operation overflows - such as in the case of INT_MAX + 1 then the behaviour of the program is undefined.
I decided to output 1e11 and to my surprise the answer was still 1e11.
1e11 is a floating point literal. Floating point types have different range of representable values from int, and different requirements regarding overflow.

Modulo division returning negative number

I am carrying out the following modulo division operations from within a C program:
(5^6) mod 23 = 8
(5^15) mod 23 = 19
I am using the following function, for convenience:
int mod_func(int p, int g, int x) {
return ((int)pow((double)g, (double)x)) % p;
}
But the result of the operations when calling the function is incorrect:
mod_func(23, 5, 6) //returns 8
mod_func(23, 5, 15) //returns -6
Does the modulo operator have some limit on the size of the operands?
5 to the power 15 is 30,517,578,125
The largest value you can store in an int is 2,147,483,647
You could use 64-bit integers, but beware you'll have precision issues when converting from double eventually.
From memory, there is a rule from number theory about the calculation you are doing that means you don't need to compute the full power expansion in order to determine the modulo result. But I could be wrong. Been too many years since I learned that stuff.
Ahh, here it is: Modular Exponentiation
Read that, and stop using double and pow =)
int mod_func(int p, int g, int x)
{
int r = g;
for( int i = 1; i < x; i++ ) {
r = (r * g) % p;
}
return r;
}
The integral part of pow(5, 15) is not representable in an int (assuming the width of int is 32-bit). The conversion (from double to int in the cast expression) is undefined behavior in C and in C++.
To avoid undefined behavior, you should use fmod function to perform the floating point remainder operation.
My guess is the problem is 5 ^ 15 = 30517578125 which is greater than INT_MAX (2147483647). You are currently casting it to an int, which is what's failing.
As has been said, your first problem in
int mod_func(int p, int g, int x) {
return ((int)pow((double)g, (double)x)) % p;
}
is that pow(g,x) often exceeds the int range, and then you have undefined behaviour converting that result to int, and whatever the resulting int is, there is no reason to believe it has anything to do with the desired modulus.
The next problem is that the result of pow(g,x) as a double may not be exact. Unless g is a power of 2, the mathematical result cannot be exactly represented as a double for large enough exponents even if it is in range, but it could also happen if the mathematical result is exactly representable (depends on the implementation of pow).
If you do number-theoretic computations - and computing the residue of a power modulo an integer is one - you should only use integer types.
For the case at hand, you can use exponentiation by repeated squaring, computing the residue of all intermediate results. If the modulus p is small enough that (p-1)*(p-1) never overflows,
int mod_func(int p, int g, int x) {
int aux = 1;
g %= p;
while(x > 0) {
if (x % 2 == 1) {
aux = (aux * g) % p;
}
g = (g * g) % p;
x /= 2;
}
return aux;
}
does it. If p can be larger, you need to use a wider type for the calculations.

single-word division algorithm

I develop software for embedded platform and need a single-word division algorithm.
The problem is as follows:
given a large integer represented by a sequence of 32-bit words (can be many),
we need to divide it by another 32-bit word, i.e. compute the quotient (also large integer)
and the remainder (32-bits).
Certainly, If I were developing this algorithm on x86, I could simply take GNU MP
but this library is way too large for embdedde platform. Furthermore, our processor
does not have hardware integer divider (integer division is performed in the software).
However the processor has quite fast FPU, so the trick is to use floating-point arithmetic wherever possible.
Any ideas how to implement this ?
Sounds like a classic optimization. Instead of dividing by D, multiply by 0x100000000/D and then divide by 0x100000000. The latter is just a wordshift, i.e. trivial. Calculating the multiplier is a bit harder, but not a lot.
See also this detailed article for a far more detailed background.
Take a look at this one: the algorithm divides an integer a[0..n-1] by a single word 'c'
using floating-point for 64x32->32 division. The limbs of the quotient 'q' are just printed in a loop, you can save then in an array if you like. Note that you don't need GMP to run the algorithm - I use it just to compare the results.
#include <gmp.h>
// divides a multi-precision integer a[0..n-1] by a single word c
void div_by_limb(const unsigned *a, unsigned n, unsigned c) {
typedef unsigned long long uint64;
unsigned c_norm = c, sh = 0;
while((c_norm & 0xC0000000) == 0) { // make sure the 2 MSB are set
c_norm <<= 1; sh++;
}
// precompute the inverse of 'c'
double inv1 = 1.0 / (double)c_norm, inv2 = 1.0 / (double)c;
unsigned i, r = 0;
printf("\nquotient: "); // quotient is printed in a loop
for(i = n - 1; (int)i >= 0; i--) { // start from the most significant digit
unsigned u1 = r, u0 = a[i];
union {
struct { unsigned u0, u1; };
uint64 x;
} s = {u0, u1}; // treat [u1, u0] as 64-bit int
// divide a 2-word number [u1, u0] by 'c_norm' using floating-point
unsigned q = floor((double)s.x * inv1), q2;
r = u0 - q * c_norm;
// divide again: this time by 'c'
q2 = floor((double)r * inv2);
q = (q << sh) + q2; // reconstruct the quotient
printf("%x", q);
}
r %= c; // adjust the residue after normalization
printf("; residue: %x\n", r);
}
int main() {
mpz_t z, quo, rem;
mpz_init(z); // this is a dividend
mpz_set_str(z, "9999999999999999999999999999999", 10);
unsigned div = 9; // this is a divisor
div_by_limb((unsigned *)z->_mp_d, mpz_size(z), div);
mpz_init(quo); mpz_init(rem);
mpz_tdiv_qr_ui(quo, rem, z, div); // divide 'z' by 'div'
gmp_printf("compare: Quo: %Zx; Rem %Zx\n", quo, rem);
mpz_clear(quo);
mpz_clear(rem);
mpz_clear(z);
return 1;
}
I believe that a look-up table and Newton Raphson successive approximation is the canonical choice used by hardware designers (who generally can't afford the gates for a full hardware divide). You get to choose the trade off the between accuracy and execution time.

Can I rely on this to judge a square number in C++?

Can I rely on
sqrt((float)a)*sqrt((float)a)==a
or
(int)sqrt((float)a)*(int)sqrt((float)a)==a
to check whether a number is a perfect square? Why or why not?
int a is the number to be judged. I'm using Visual Studio 2005.
Edit: Thanks for all these rapid answers. I see that I can't rely on float type comparison. (If I wrote as above, will the last a be cast to float implicitly?) If I do it like
(int)sqrt((float)a)*(int)sqrt((float)a) - a < e
How small should I take that e value?
Edit2: Hey, why don't we leave the comparison part aside, and decide whether the (int) is necessary? As I see, with it, the difference might be great for squares; but without it, the difference might be small for non-squares. Perhaps neither will do. :-(
Actually, this is not a C++, but a math question.
With floating point numbers, you should never rely on equality. Where you would test a == b, just test against abs(a - b) < eps, where eps is a small number (e.g. 1E-6) that you would treat as a good enough approximation.
If the number you are testing is an integer, you might be interested in the Wikipedia article about Integer square root
EDIT:
As Krugar said, the article I linked does not answer anything. Sure, there is no direct answer to your question there, phoenie. I just thought that the underlying problem you have is floating point precision and maybe you wanted some math background to your problem.
For the impatient, there is a link in the article to a lengthy discussion about implementing isqrt. It boils down to the code karx11erx posted in his answer.
If you have integers which do not fit into an unsigned long, you can modify the algorithm yourself.
If you don't want to rely on float precision then you can use the following code that uses integer math.
The Isqrt is taken from here and is O(log n)
// Finds the integer square root of a positive number
static int Isqrt(int num)
{
if (0 == num) { return 0; } // Avoid zero divide
int n = (num / 2) + 1; // Initial estimate, never low
int n1 = (n + (num / n)) / 2;
while (n1 < n)
{
n = n1;
n1 = (n + (num / n)) / 2;
} // end while
return n;
} // end Isqrt()
static bool IsPerfectSquare(int num)
{
return Isqrt(num) * Isqrt(num) == num;
}
Not to do the same calculation twice I would do it with a temporary number:
int b = (int)sqrt((float)a);
if((b*b) == a)
{
//perfect square
}
edit:
dav made a good point. instead of relying on the cast you'll need to round off the float first
so it should be:
int b = (int) (sqrt((float)a) + 0.5f);
if((b*b) == a)
{
//perfect square
}
Your question has already been answered, but here is a working solution.
Your 'perfect squares' are implicitly integer values, so you could easily solve floating point format related accuracy problems by using some integer square root function to determine the integer square root of the value you want to test. That function will return the biggest number r for a value v where r * r <= v. Once you have r, you simply need to test whether r * r == v.
unsigned short isqrt (unsigned long a)
{
unsigned long rem = 0;
unsigned long root = 0;
for (int i = 16; i; i--) {
root <<= 1;
rem = ((rem << 2) + (a >> 30));
a <<= 2;
if (root < rem)
rem -= ++root;
}
return (unsigned short) (root >> 1);
}
bool PerfectSquare (unsigned long a)
{
unsigned short r = isqrt (a);
return r * r == a;
}
I didn't follow the formula, I apologize.
But you can easily check if a floating point number is an integer by casting it to an integer type and compare the result against the floating point number. So,
bool isSquare(long val) {
double root = sqrt(val);
if (root == (long) root)
return true;
else return false;
}
Naturally this is only doable if you are working with values that you know will fit within the integer type range. But being that the case, you can solve the problem this way, saving you the inherent complexity of a mathematical formula.
As reinier says, you need to add 0.5 to make sure it rounds to the nearest integer, so you get
int b = (int) (sqrt((float)a) + 0.5f);
if((b*b) == a) /* perfect square */
For this to work, b has to be (exactly) equal to the square root of a if a is a perfect square. However, I don't think you can guarantee this. Suppose that int is 64 bits and float is 32 bits (I think that's allowed). Then a can be of the order 2^60, so its square root is of order 2^30. However, a float only stores 24 bits in the significand, so the rounding error is of order 2^(30-24) = 2^6. This is larger to 1, so b may contain the wrong integer. For instance, I think that the above code does not identify a = (2^30+1)^2 as a perfect square.
I would do.
// sqrt always returns positive value. So casting to int is equivalent to floor()
int down = static_cast<int>(sqrt(value));
int up = down+1; // This is the ceil(sqrt(value))
// Because of rounding problems I would test the floor() and ceil()
// of the value returned from sqrt().
if (((down*down) == value) || ((up*up) == value))
{
// We have a winner.
}
The more obvious, if slower -- O(sqrt(n)) -- way:
bool is_perfect_square(int i) {
int d = 1;
for (int x = 0; x <= i; x += d, d += 2) {
if (x == i) return true;
}
return false;
}
While others have noted that you should not test for equality with floats, I think you are missing out on chances to take advantage of the properties of perfect squares. First there is no point in re-squaring the calculated root. If a is a perfect square then sqrt(a) is an integer and you should check:
b = sqrt((float)a)
b - floor(b) < e
where e is set sufficiently small. There are also a number of integers that you can cross of as non-square before taking the square root. Checking Wikipedia you can see some necessary conditions for a to be square:
A square number can only end with
digits 00,1,4,6,9, or 25 in base 10
Another simple check would be to see that a % 4 == 1 or 0 before taking the root since:
Squares of even numbers are even,
since (2n)^2 = 4n^2.
Squares of odd
numbers are odd, since (2n + 1)^2 =
4(n^2 + n) + 1.
These would essentially eliminate half of the integers before taking any roots.
The cleanest solution is to use an integer sqrt routine, then do:
bool isSquare( unsigned int a ) {
unsigned int s = isqrt( a );
return s * s == a;
}
This will work in the full int range and with perfect precision. A few cases:
a = 0, s = 0, s * s = 0 (add an exception if you don't want to treat 0 as square)
a = 1, s = 1, s * s = 1
a = 2, s = 1, s * s = 1
a = 3, s = 1, s * s = 1
a = 4, s = 2, s * s = 4
a = 5, s = 2, s * s = 4
Won't fail either as you approach the maximum value for your int size. E.g. for 32-bit ints:
a = 0x40000000, s = 0x00008000, s * s = 0x40000000
a = 0xFFFFFFFF, s = 0x0000FFFF, s * s = 0xFFFE0001
Using floats you run into a number of issues. You may find that sqrt( 4 ) = 1.999999..., and similar problems, although you can round-to-nearest instead of using floor().
Worse though, a float has only 24 significant bits which means you can't cast any int larger than 2^24-1 to a float without losing precision, which introduces false positives/negatives. Using doubles for testing 32-bit ints, you should be fine, though.
But remember to cast the result of the floating-point sqrt back to an int and compare the result to the original int. Comparisons between floats are never a good idea; even for square values of x in a limited range, there is no guarantee that sqrt( x ) * sqrt( x ) == x, or that sqrt( x * x) = x.
basics first:
if you (int) a number in a calculation it will remove ALL post-comma data. If I remember my C correctly, if you have an (int) in any calculation (+/-*) it will automatically presume int for all other numbers.
So in your case you want float on every number involved, otherwise you will loose data:
sqrt((float)a)*sqrt((float)a)==(float)a
is the way you want to go
Floating point math is inaccurate by nature.
So consider this code:
int a=35;
float conv = (float)a;
float sqrt_a = sqrt(conv);
if( sqrt_a*sqrt_a == conv )
printf("perfect square");
this is what will happen:
a = 35
conv = 35.000000
sqrt_a = 5.916079
sqrt_a*sqrt_a = 34.999990734
this is amply clear that sqrt_a^2 is not equal to a.