Floating point difficulty in area of triangle - c++

The output created by my program is accurate at first, but becomes 0 for every answer after 500000. I would like to know why this is the case when I use the function I have called heron_area.
#include "stdafx.h"
#include "stdlib.h"
#include <iostream>
#include <math.h>
#include <stdio.h>
float heron_area(float a, float c) {
float s = (a + a + c) / 2.0f;
return (s - a)*sqrtf(s*(s - c));
}
int main(void) {
int j = 18;
float i = 10;
for (int k = 0; k < j; k++){
float g = i * 10;
std::cout << heron_area(g, 1) << std::endl;
i = g;
}
return 0;
}
It is potentially related to the use of floating point numbers. Why am I getting an output of 0 after the last output of 500000?

It is the issue with floating point numbers as you suspect.
If you print a and s in heron_area, you'll note that they very quickly become identical, making s - a zero.
This happens when c is much smaller than a (that is, when you have a very "pointy" triangle; your zeros appear when two sides are 10,000,000 and the third is 1).
Changing the type to double makes the problem appear later, but it won't go away.
You'll need to rearrange your computations if you want to handle very large differences in magnitude.
There's a solution on Wikipedia (linked by @harold in the comments) that gives
Area = 0.25 * sqrt((a+(b+c)) * (c-(a-b)) * (c+(a-b)) * (a+(b-c)))
where a >= b and b >= c, and the brackets are necessary.
Yes, you need to worry about the order of operations.
(And there's a very detailed article here with an analysis of this solution.)
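For illustration, here is a minimal sketch of that rearranged formula (the function name is mine; it assumes the caller passes the sides already sorted so that a >= b >= c):

#include <cmath>

// Kahan's rearrangement of Heron's formula; requires a >= b >= c.
// The parentheses are deliberate and must not be "simplified" away.
float stable_heron_area(float a, float b, float c) {
    return 0.25f * std::sqrt((a + (b + c)) * (c - (a - b))
                           * (c + (a - b)) * (a + (b - c)));
}

For the question's triangle (a = b = 10000000, c = 1) this should give roughly 5000000, the correct area to float precision, where the naive formula returns 0.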

As the variable a in heron_area grows exponentially larger, the variable c, which is constant at 1.0f, becomes less and less relevant.
Due to the limited precision of floating point, the expression:
float s = (a + a + c) / 2.0f;
then simplifies to:
float s = (a + a) / 2.0f;
which is the same as:
float s = a;
Thus the variables s and a have the same value, so the expression:
return (s - a)*sqrtf(s*(s - c));
always yields 0.0f, as the result of subtracting s - a is 0.0f, and multiplying zero by anything is always zero.
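The absorption is easy to reproduce in isolation; a minimal sketch with the question's numbers:

#include <iostream>

int main() {
    float a = 10000000.0f;           // one of the two long sides
    float s = (a + a + 1.0f) / 2.0f; // the 1.0f is absorbed: 20000001 rounds to 20000000
    std::cout << (s == a) << '\n';   // prints 1: s and a are bit-identical
    return 0;
}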

Related

Simpson's Composite Rule giving too large values when n is very large

I'm using Simpson's composite rule to calculate the integral from 2 to 1,000 of 1/ln(x); however, when using a large n (usually around 500,000), I start to get results that vary from the value my calculator and other sources give me (176.5644). For example, when n = 10,000,000, it gives me a value of 184.1495. I'm wondering why this is, since as n gets larger, the accuracy is supposed to increase, not decrease.
#include <iostream>
#include <cmath>

// the function f(x)
float f(float x)
{
    return (float) 1 / std::log(x);
}

float my_simpson(float a, float b, long int n)
{
    if (n % 2 == 1) n += 1; // since n has to be even
    float area, h = (b-a)/n;
    float x, y, z;
    for (int i = 1; i <= n/2; i++)
    {
        x = a + (2*i - 2)*h;
        y = a + (2*i - 1)*h;
        z = a + 2*i*h;
        area += f(x) + 4*f(y) + f(z);
    }
    return area*h/3;
}

int main()
{
    std::cout.precision(20);
    int upperBound = 1'000;
    int subsplits = 1'000'000;
    float approx = my_simpson(2, upperBound, subsplits);
    std::cout << "Output: " << approx << std::endl;
    return 0;
}
Update: Switched from floats to doubles and works much better now! Thank you!
Unlike a real (in the mathematical sense) number, a float has limited precision.
A typical IEEE 754 32-bit (single precision) floating-point number dedicates only 24 bits (one of which is implicit) to the mantissa, and that translates to roughly 7 significant decimal digits (please take this as a gross simplification).
A double, on the other hand, has 53 significand bits, making it more accurate and (usually) the first choice for numerical computations these days.
since as n gets larger, the accuracy is supposed to increase and not decrease.
Unfortunately, that's not how it works. There's a sweet spot, but after that the accumulation of rounding errors prevails and the results diverge from their expected values.
In OP's case, this calculation
area += f(x) + 4*f(y) + f(z);
introduces (and accumulates) rounding errors, due to the fact that area becomes much greater than f(x) + 4*f(y) + f(z) (e.g. 224678.937 vs. 0.3606823). The bigger n is, the sooner this becomes relevant, making the result diverge from the real one.
As mentioned in the comments, another issue (undefined behavior) is that area isn't initialized (to zero).
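You can reproduce the absorption with exactly the two numbers quoted above; a minimal sketch:

#include <iostream>

int main() {
    float area = 224678.937f; // the running sum
    float term = 0.3606823f;  // one strip's contribution
    // float carries ~7 significant decimal digits, so most digits of term
    // fall below area's last representable digit and are rounded away
    std::cout << (area + term) - area << '\n'; // prints 0.359375, not 0.3606823
    return 0;
}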

Quadratic Algebra advice for Array like return function

I have a problem. I want to write a method which uses the PQ formula to calculate the zeros of a quadratic equation.
As far as I can see, C++ doesn't support arrays, unlike C#, which I normally use.
How do I get either zero, 1 or 2 results returned?
Is there any other way without an array?
I'm not really familiar with pointers, so my current code is broken.
I'd be glad if someone could help me.
float* calculateZeros(float p, float q)
{
    float *x1, *x2;
    if (((p) / 2)*((p) / 2) - (q) < 0)
        throw std::exception("No Zeros!");
    x1 *= -((p) / 2) + sqrt(static_cast<double>(((p) / 2)*((p) / 2) - (q)));
    x2 *= -((p) / 2) - sqrt(static_cast<double>(((p) / 2)*((p) / 2) - (q)));
    float returnValue[1];
    returnValue[0] = x1;
    returnValue[1] = x2;
    return x1 != x2 ? returnValue[0] : x1;
}
Actually this code does not compile, but it is what I've done so far.
There are quite a few issues with it. At the very first, I'll drop all those totally needless parentheses; they just make the code (much) harder to read:
float* calculateZeros(float p, float q)
{
    float *x1, *x2; // pointers are never initialized!!!
    if ((p / 2)*(p / 2) - q < 0)
        throw std::exception("No Zeros!"); // zeros? q just needs to be large enough!
    x1 *= -(p / 2) + sqrt(static_cast<double>((p / 2)*(p / 2) - q));
    x2 *= -(p / 2) - sqrt(static_cast<double>((p / 2)*(p / 2) - q));
    // ^ this would multiply the pointer values! but these are not initialized -> UB!!!
    float returnValue[1];  // an array of size ONE...
    returnValue[0] = x1;   // ...you are assigning a pointer to a value here...
    returnValue[1] = x2;   // ...and this writes past the end of the array!
    return x1 != x2 ? returnValue[0] : x1;
    //     ^ value!                    ^ pointer!
    // apart from that: if you returned a pointer to the returnValue array, you
    // would return a pointer to data with scope local to the function, i.e. the
    // array is destroyed upon leaving the function, so the returned pointer
    // becomes INVALID as soon as the function exits; using it is again UB!
}
As is, your code wouldn't even compile...
As far as I can see, C++ doesn't support arrays
Well... I assume you meant: "arrays as return values or function parameters". That's true for raw arrays; these can only be passed as pointers. But you can accept structs and classes as parameters or use them as return values. You want to return both calculated values? Then you could use e.g. std::array<float, 2>; std::array is a wrapper around raw arrays that avoids all the hassle you have with the latter... As there are exactly two values, you could use std::pair<float, float>, too, or std::tuple<float, float>.
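For the fixed two-value case, a minimal sketch (the function name is mine; it assumes the discriminant has already been checked to be non-negative):

#include <cmath>
#include <utility>

std::pair<double, double> calculateZerosPair(double p, double q) {
    double h = -p / 2;
    double r = std::sqrt((p / 2) * (p / 2) - q); // assumed checked: (p/2)^2 >= q
    return { h + r, h - r };
}

A pair cannot express "no zeros" or "one zero", though, which brings us to the next option.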
Want to be able to return either 2, 1 or 0 values? std::vector<float> might be your choice...
#include <cmath>
#include <vector>

std::vector<float> calculateZeros(float p, float q)
{
    std::vector<float> results;
    // don't repeat the code all the time...
    double h = static_cast<double>(p) / 2; // "half"
    double s = h * h;                      // "square" (of half)
    if (s >= q) // exact comparison for now; see the epsilon discussion below
    {
        // only enter if we CAN have a result; otherwise the vector remains
        // empty, which is far better behaviour than the exception
        double r = sqrt(s - q); // "root"
        h = -h;
        if (r == 0) // exact comparison again; see below
        {
            results.push_back(h);
        }
        else
        {
            results.reserve(2); // prevents re-allocations; admitted, for just
                                // two values, we could live without it...
            results.push_back(h + r);
            results.push_back(h - r);
        }
    }
    return results;
}
Now there's one final issue left: as even the precision of double is limited, rounding errors can occur (and the matter is even worse when using float; I would recommend turning all the floats into doubles, parameters and return values as well!). You shouldn't ever compare for exact equality (someValue == 0.0), but allow some epsilon to cover badly rounded values:
-epsilon < someValue && someValue < +epsilon
OK, in the given case there are two originally exact comparisons involved, and we might want to do as few epsilon comparisons as possible. So:
double d = s - q;
if (d > -epsilon)
{
    // considered 0 or greater
    h = -h;
    if (d < +epsilon)
    {
        // considered 0 (and then no need to calculate the root at all...)
        results.push_back(h);
    }
    else
    {
        // considered greater than 0
        double r = sqrt(d);
        results.push_back(h - r);
        results.push_back(h + r);
    }
}
Value of epsilon? Well, either use a fixed, small enough value, or calculate it dynamically based on the smaller of the two values being compared (multiplied by some small factor), and be sure to have it positive... You might be interested in a bit more information on the matter. You don't have to care about it not being C++; the issue is the same for all languages using IEEE 754 representation for doubles.
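To round things off, a possible call site for the vector version sketched above (assuming the epsilon logic has been folded into calculateZeros):

#include <iostream>
#include <vector>

int main() {
    // x^2 + p*x + q = 0 with p = -3, q = 2 has the zeros 2 and 1
    for (float x : calculateZeros(-3.0f, 2.0f))
        std::cout << x << '\n';
    return 0;
}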

Is there any "standard" way to calculate the numerical gradient?

I am trying to calculate the numerical gradient of a smooth function in C++. And the parameter value could vary from zero to a very large number (maybe 1e10 to 1e20?).
I used the function f(x,y) = 10*x^3 + y^3 as a testbench, but I found that if x or y is too large, I can't get the correct gradient.
Here is my code to calculate the gradient:
#include <iostream>
#include <cmath>
#include <cassert>

using namespace std;

double f(double x, double y)
{
    // black box expensive function
    return 10 * pow(x, 3) + pow(y, 3);
}

int main()
{
    // double x = -5897182590.8347721;
    // double y = 269857217.0017581;
    double x = 1.13041e+19;
    double y = -5.49756e+14;
    const double epsi = 1e-4;
    double f1 = f(x, y);
    double f2 = f(x, y+epsi);
    double f3 = f(x, y-epsi);
    cout << f1 << endl;
    cout << f2 << endl;
    cout << f3 << endl;
    cout << f1 - f2 << endl; // 0
    cout << f2 - f3 << endl; // 0
    return 0;
}
If I use the above code to calculate the gradient, the gradient would be zero!
The testbench function, 10*x^3 + y^3, is just a demo, the real problem I need to solve is actually a black box function.
So, is there any "standard" way to calculate the numerical gradient?
In the first place, you should use the central difference scheme, which is more accurate (by cancellation of one more term of the Taylor expansion):
(f(x + h) - f(x - h)) / 2h
rather than
(f(x + h) - f(x)) / h
Then the choice of h is critical, and using a fixed constant is the worst thing you can do. For small x, h will be too large, so that the approximation formula no longer works; for large x, h will be too small, resulting in severe cancellation error.
A much better choice is to take a relative value, h = x√ε, where ε is the machine epsilon (1 ulp), which gives a good tradeoff.
(f(x(1 + √ε)) - f(x(1 - √ε))) / 2x√ε
Beware that when x = 0, a relative value cannot work and you need to fall back to a constant. But then, nothing tells you which one to use!
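A minimal sketch of that recipe (the signature and the fallback constant at x = 0 are my choices):

#include <cmath>
#include <limits>

// central difference with a step relative to x: h = |x| * sqrt(machine epsilon)
double derivative(double (*f)(double), double x) {
    const double sqrt_eps = std::sqrt(std::numeric_limits<double>::epsilon());
    double h = (x != 0.0) ? std::fabs(x) * sqrt_eps : sqrt_eps; // arbitrary fallback at 0
    return (f(x + h) - f(x - h)) / (2 * h);
}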
You need to consider the precision needed.
At first glance, since |y| = 5.49756e14 and epsi = 1e-4, you need at least ⌈log2(5.49756e14)-log2(1e-4)⌉ = 63 bits of significand precision (that is the number of bits used to encode the digits of your number, also known as mantissa) for y and y+epsi to be considered different.
The double-precision floating-point format only has 53 bits of significand precision (assuming it is 8 bytes). So, currently, f1, f2 and f3 are exactly the same because y, y+epsi and y-epsi are equal.
Now, let's consider the limit: y = 1e20, and the result of your function, 10x^3 + y^3. Let's ignore x for now, so let's take f = y^3. Now we can calculate the precision needed for f(y) and f(y+epsi) to be different: f(y) = 1e60 and f(epsi) = 1e-12. This gives a minimum significand precision of ⌈log2(1e60)-log2(1e-12)⌉ = 240 bits.
Even if you were to use the long double type, assuming it is 16 bytes, your results would not differ : f1, f2 and f3 would still be equal, even though y and y+epsi would not.
If we take x into account, the maximum value of f would be 11e60 (with x = y = 1e20). So the upper limit on precision is ⌈log2(11e60)-log2(1e-12)⌉ = 243 bits, or at least 31 bytes.
One way to solve your problem is to use another type, maybe a bignum used as fixed-point.
Another way is to rethink your problem and deal with it differently. Ultimately, what you want is f1 - f2. You can try to decompose f(y+epsi). Again, if you ignore x, f(y+epsi) = (y+epsi)^3 = y^3 + 3*y^2*epsi + 3*y*epsi^2 + epsi^3. So f(y+epsi) - f(y) = 3*y^2*epsi + 3*y*epsi^2 + epsi^3.
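A quick sketch of what that decomposition buys you, with the question's numbers:

#include <cmath>
#include <iostream>

int main() {
    double y = -5.49756e+14, epsi = 1e-4;
    // naive: epsi is absorbed into y, so the difference is exactly 0
    double naive = std::pow(y + epsi, 3) - std::pow(y, 3);
    // expanded: the cancelling y^3 terms never appear, so nothing is lost
    double expanded = 3*y*y*epsi + 3*y*epsi*epsi + epsi*epsi*epsi;
    std::cout << naive << ' ' << expanded << '\n'; // 0 vs. roughly 9.07e+25
    return 0;
}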
The only way to calculate gradient is calculus.
Gradient is a vector:
g(x, y) = Df/Dx i + Df/Dy j
where (i, j) are unit vectors in x and y directions, respectively.
One way to approximate derivatives is first order differences:
Df/Dx ~ (f(x2, y)-f(x1, y))/(x2-x1)
and
Df/Dy ~ (f(x, y2)-f(x, y1))/(y2-y1)
That doesn't look like what you're doing.
You have a closed form expression:
g(x, y) = 30*x^2 i + 3*y^2 j
You can plug in values for (x, y) and calculate the gradient exactly at any point. Compare that to your differences and see how well your approximation is doing.
How you implement it numerically is your responsibility. (10^19)^3 = 10^57, right?
What is the size of double on your machine? Is it a 64 bit IEEE double precision floating point number?
Use
dx = (1+abs(x))*eps, dfdx = (f(x+dx,y) - f(x,y)) / dx
dy = (1+abs(y))*eps, dfdy = (f(x,y+dy) - f(x,y)) / dy
to get meaningful step sizes for large arguments.
Use eps = 1e-8 for one-sided difference formulas, eps = 1e-5 for central difference quotients.
Explore automatic differentiation (see autodiff.org) for derivatives without difference quotients and thus much smaller numerical errors.
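A minimal sketch of those formulas (the function-pointer signature is my choice):

#include <cmath>

// one-sided difference quotients with argument-relative step sizes;
// eps = 1e-8 per the recommendation above for one-sided formulas
void gradient(double (*f)(double, double), double x, double y,
              double& dfdx, double& dfdy, double eps = 1e-8) {
    double dx = (1 + std::fabs(x)) * eps;
    double dy = (1 + std::fabs(y)) * eps;
    dfdx = (f(x + dx, y) - f(x, y)) / dx;
    dfdy = (f(x, y + dy) - f(x, y)) / dy;
}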
We can examine the behaviour of the error in the derivative using the following program; it calculates the one-sided derivative and the central-difference derivative using a varying step size. Here I'm using x and y ~ 10^10, which is smaller than what you were using, but it should illustrate the same point.
#include <iostream>
#include <cmath>
#include <cassert>

using namespace std;

double f(double x, double y) {
    return 10 * pow(x, 3) + pow(y, 3);
}

double f_x(double x, double y) {
    return 3 * 10 * pow(x, 2);
}

double f_y(double x, double y) {
    return 3 * pow(y, 2);
}

int main()
{
    // double x = -5897182590.8347721;
    // double y = 269857217.0017581;
    double x = 1.13041e+10;
    double y = -5.49756e+10;
    //double x = 10.1;
    //double y = -5.2;
    double epsi = 1e8;
    for (int i = 0; i < 60; ++i) {
        double dfx_n  = (f(x+epsi, y) - f(x, y)) / epsi;
        double dfx_cd = (f(x+epsi, y) - f(x-epsi, y)) / (2*epsi);
        double dfx    = f_x(x, y);
        cout << epsi << " " << fabs(dfx - dfx_n) << " " << fabs(dfx - dfx_cd) << std::endl;
        epsi /= 1.5;
    }
    return 0;
}
The output shows that a one-sided difference gets us an optimal error of about 1.37034e+13 at a step length of about 100.0. Note that while this error looks large, as a relative error it is 3.5746632302764072e-09 (since the exact value is 3.833e+21).
In comparison, the two-sided difference gets an optimal error of about 1.89493e+10 with a step size of about 45109.3. This is three orders of magnitude better (with a much larger step size).
How can we work out the step size? The link in the comments of Yves Daoust's answer gives us a ballpark value:
h = x_c * sqrt(eps) for one-sided, and h = x_c * cbrt(eps) for two-sided differences.
But either way, if the required step size for decent accuracy at x ~ 10^10 is 100.0, the required step size at x ~ 10^20 is going to be about 10^10 times larger too. So the problem is simply that your step size is way too small.
This can be verified by increasing the starting step size in the above code and resetting the x/y values to the original ones.
Then the expected derivative is O(1e39); the best one-sided error of about O(1e31) occurs near a step length of 5.9e10, and the best two-sided error of about O(1e29) occurs near a step length of 6.1e13.
As numerical differentiation is ill-conditioned (meaning a small error can alter your result significantly), you should consider using Cauchy's integral formula. This way you can calculate the n-th derivative with an integral, which leads to fewer problems with accuracy and stability.

C++ How do I set the fractional part of a float?

I know how to get the fractional part of a float, but I don't know how to set it. I have two integers returned by a function: one holds the integer part and the other holds the fractional part.
For example:
int a = 12;
int b = 2; // This can never be 02, 03 etc
float c;
How do I get c to become 12.2? I know I could add something like (float)b / 10, but then what if b >= 10? Then I would have to divide by 100, and so on. Is there a function or something where I can do setfractional(c, b)?
Thanks
edit: The more I think about this problem, the more I realize how ill-defined it is: if b == 1 it would be 12.1, but if b == 10 it would also be 12.1, so I don't know how I'm going to handle this. I'm guessing the function never returns a number >= 10 for the fractional part, but I don't know.
Something like:
#include <cmath>
#include <cstdlib>

float IntFrac(int integer, int frac)
{
    float integer2 = integer;
    float frac2 = frac;
    float log10 = log10f(frac2 + 1.0f); // number of digits in frac (+1 handles powers of ten)
    float ceil = ceilf(log10);
    float pow = powf(10.0f, -ceil);     // shift frac behind the decimal point
    float res = abs(integer);
    res += frac2 * pow;
    if (integer < 0)
    {
        res = -res;
    }
    return res;
}
Ideone: http://ideone.com/iwG8UO
It's like saying: log10(99 + 1) = log10(100) = 2, ceilf(2) = 2, powf(10, -2) = 0.01, 99 * 0.01 = 0.99, and then 12 + 0.99 = 12.99; finally we check for the sign.
And let's hope the vagaries of IEEE 754 float math won't hit too hard :-)
I'll add that it would probably be better to use double instead of float. Other than 3D graphics, there are very few fields where using float is a good idea nowadays.
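Usage would look something like this (the results are, of course, only the nearest representable floats):

float c = IntFrac(12, 2);   // roughly 12.2
float d = IntFrac(12, 34);  // roughly 12.34
float e = IntFrac(-12, 2);  // roughly -12.2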
The most trivial method would be counting the digits of b and then dividing accordingly:

int i = 10;
while (b >= i) // rather slow, there are faster ways
    i *= 10;
c = a + static_cast<float>(b) / i;

Note that due to the nature of float, the result might not be exactly what you expect. Also, if you want something like 3.004 you can set the initial value of i to a higher power of ten.
Kindly try the code below after including math.h and stdlib.h:

int a = 12;
int b = 22;
int d = b;
int i = 0;
float c;
while (d > 0) // count the digits of b
{
    d /= 10;
    i++;
}
c = a + (float)b / pow(10, i);

finding cube root in C++?

Strange things happen when I try to find the cube root of a number.
The following code returns undefined for me. In cmd: -1.#IND
cout << pow((double)(20.0*(-3.2) + 30.0), (double)1/3);
While this one works perfectly fine. In cmd: 4.93242414866094
cout << pow((double)(20.0*4.5 + 30.0), (double)1/3);
Mathematically it should work, since we can take the cube root of a negative number.
pow is from the Visual C++ 2010 math.h library. Any ideas?
pow(x, y) from <cmath> does NOT work if x is negative and y is non-integral.
This is a limitation of std::pow, as documented in the C standard and on cppreference:
Error handling
Errors are reported as specified in math_errhandling
If base is finite and negative and exp is finite and non-integer, a domain error occurs and a range error may occur.
If base is zero and exp is zero, a domain error may occur.
If base is zero and exp is negative, a domain error or a pole error may occur.
There are a couple ways around this limitation:
Cube-rooting is the same as taking something to the 1/3 power, so you could do std::pow(x, 1/3.).
In C++11, you can use std::cbrt. C++11 introduced both square-root and cube-root functions, but no generic n-th root function that overcomes the limitations of std::pow.
The power 1/3 is a special case. In general, non-integral powers of negative numbers are complex. It wouldn't be practical for pow to check for special cases like integer roots, and besides, 1/3 as a double is not exactly 1/3!
I don't know about the visual C++ pow, but my man page says under errors:
EDOM The argument x is negative and y is not an integral value. This would result in a complex number.
You'll have to use a more specialized cube root function if you want cube roots of negative numbers - or cut corners and take absolute value, then take cube root, then multiply the sign back on.
Note that depending on context, a negative number x to the 1/3 power is not necessarily the negative cube root you're expecting. It could just as easily be the first complex root, x^(1/3) * e^(pi*i/3). This is the convention Mathematica uses; it's also reasonable to just say it's undefined.
While (-1)^3 = -1, you can't simply take a rational power of a negative number and expect a real result, because there are other solutions to this rational exponent that are imaginary in nature.
http://www.wolframalpha.com/input/?i=x^(1/3),+x+from+-5+to+0
Similarly, plot x^x. For x = -1/3 this should have a solution, yet the function is deemed undefined in R for x < 0.
Therefore, don't expect math.h to do magic that would make it inefficient; just change the signs yourself.
Guess you've got to take the negative out and put it back in afterwards. You can have a wrapper do this for you if you really want to.

double yourPow(double x, double y)
{
    if (x < 0)
        return -1.0 * pow(-1.0 * x, y);
    else
        return pow(x, y);
}
Don't cast to double by using (double), use a double numeric constant instead:
double thingToCubeRoot = -20.*3.2 + 30;
cout << thingToCubeRoot / fabs(thingToCubeRoot) * pow(fabs(thingToCubeRoot), 1./3.);
Should do the trick!
Also: don't include <math.h> in C++ projects, but use <cmath> instead.
Alternatively, use pow from the <complex> header for the reasons stated by buddhabrot
pow(x, y) is the same as (i.e. equivalent to) exp(y * log(x)),
so if log(x) is invalid then pow(x, y) is too.
Similarly, you cannot raise 0 to any power via this route; although mathematically 0^y should be 0 for y > 0, log(0) is undefined.
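You can see both the identity and its failure mode directly; a small sketch:

#include <cmath>
#include <iostream>

int main() {
    std::cout << std::pow(8.0, 1.0 / 3.0) << '\n';              // 2 (approximately)
    std::cout << std::exp((1.0 / 3.0) * std::log(8.0)) << '\n'; // same route, same result
    std::cout << std::log(-8.0) << '\n';                        // nan: no real logarithm
    return 0;
}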
C++11 has the cbrt function (see for example http://en.cppreference.com/w/cpp/numeric/math/cbrt) so you can write something like
#include <iostream>
#include <cmath>

int main(int argc, char* argv[])
{
    const double arg = 20.0 * (-3.2) + 30.0;
    std::cout << cbrt(arg) << "\n";
    std::cout << cbrt(-arg) << "\n";
    return 0;
}
I do not have access to the C++ standard, so I do not know how the negative argument is handled... a test on ideone http://ideone.com/bFlXYs seems to confirm that C++ (gcc-4.8.1) extends the cube root with the rule cbrt(x) = -cbrt(-x) when x < 0; for this extension see http://mathworld.wolfram.com/CubeRoot.html
I was looking for a cube root and found this thread, and it occurs to me that the following code might work:

#include <cmath>
#include <stdexcept>
using namespace std;

double nth_root(double x, int n)
{
    if (n % 2 == 0 && x < 0) // even root of a negative number is a fail
        throw std::domain_error("No real root!");
    bool sign = (x >= 0);
    x = exp(log(fabs(x)) / n);
    return sign ? x : -x;
}
I think you should not confuse exponentiation with the n-th root of a number. See the good old Wikipedia.
Because the 1/3 will always return 0, as it will be considered integer division...
try with 1.0/3.0...
That is what I think; try it and implement it...
and do not forget to declare the variables containing 1.0 and 3.0 as double...
Here's a little function I knocked up.
#include <cfloat>
#include <cmath>
#include <cstdlib>

#define uniform() (rand() / (1.0 + RAND_MAX))

double CBRT(double Z)
{
    double guess = Z;
    double x, dx;
    int loopbreaker;

retry:
    x = guess * guess * guess;
    loopbreaker = 0;
    while (fabs(x - Z) > FLT_EPSILON)
    {
        dx = 3 * guess * guess;
        loopbreaker++;
        if (fabs(dx) < DBL_EPSILON || loopbreaker > 53)
        {
            guess += uniform() * 2 - 1.0; // kick to a random nearby point and restart
            goto retry;
        }
        guess -= (x - Z) / dx;
        x = guess * guess * guess;
    }
    return guess;
}
It uses Newton-Raphson to find a cube root.
Sometimes Newton-Raphson gets stuck: if the root is very close to 0, the derivative can get small and the iteration can oscillate. So I've clamped it and forced it to restart if that happens.
If you need more accuracy you can change the FLT_EPSILONs.
If you ever have no math library, you can use the following way to compute the cube root:
double curt(double x) {
    if (x == 0) {
        // would otherwise return something like 4.257959840008151e-109
        return 0;
    }
    double b = 1; // use any value except 0
    double last_b_1 = 0;
    double last_b_2 = 0;
    while (last_b_1 != b && last_b_2 != b) {
        last_b_1 = b;
        // use (2 * b + x / b / b) / 3 for small numbers, as suggested by willywonka_dailyblah
        b = (b + x / b / b) / 2;
        last_b_2 = b;
        // use (2 * b + x / b / b) / 3 for small numbers, as suggested by willywonka_dailyblah
        b = (b + x / b / b) / 2;
    }
    return b;
}
It is derived from the sqrt algorithm below. The idea is that b and x / b / b lie on opposite sides of the cube root of x (one bigger, one smaller), so their average lies closer to the cube root of x.
Square Root And Cubic Root (in Python)
def sqrt_2(a):
    if a == 0:
        return 0
    b = 1
    last_b = 0
    while last_b != b:
        last_b = b
        b = (b + a / b) / 2
    return b

def curt_2(a):
    if a == 0:
        return 0
    b = a
    last_b_1 = 0
    last_b_2 = 0
    while last_b_1 != b and last_b_2 != b:
        last_b_1 = b
        b = (b + a / b / b) / 2
        last_b_2 = b
        b = (b + a / b / b) / 2
    return b
In contrast to the square root, last_b_1 and last_b_2 are both required in the cube root because b flickers between two values. You can modify these algorithms to compute the fourth root, fifth root and so on; see the sketch below.
Thanks to my math teacher, Herr Brenner, who told me this algorithm for sqrt in 11th grade.
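For the promised generalization, a common route is a full Newton step for f(b) = b^n - x, i.e. b <- ((n-1)*b + x/b^(n-1)) / n; here is a sketch for non-negative x (my own, not part of the answer above):

#include <cmath>

// Newton's method for the n-th root of a non-negative x
double newton_nth_root(double x, int n) {
    if (x == 0)
        return 0; // avoid dividing by zero below
    double b = x, prev1 = 0, prev2 = 0;
    while (b != prev1 && b != prev2) { // same flicker guard as in curt above
        prev2 = prev1;
        prev1 = b;
        b = ((n - 1) * b + x / std::pow(b, n - 1)) / n;
    }
    return b;
}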
Performance

I tested these on an Arduino with a 16 MHz clock frequency:

0.3525 ms for yourPow
0.3853 ms for nth_root
2.3426 ms for curt