Calculating distances but the result is -2147483648 - C++

Below is the code to calculate the distance
// creating array of cities
double x[] = {21.0,12.0,15.0,3.0,7.0,30.0};
double y[] = {17.0,10.0,4.0,2.0,3.0,1.0};
// distance function - C = sqrt of A squared + B squared

One issue is that the order of operations is messing you up (multiplication is done before subtraction).
Change
(x[c1] - x[c2] * x[c1] - x[c2]) + (y[c1] - y[c2] * y[c1] - y[c2])
to
((x[c1] - x[c2]) * (x[c1] - x[c2])) + ((y[c1] - y[c2]) * (y[c1] - y[c2]))
I would also recommend, just for clarity, doing some of those calculations on separate lines (clearly that's a style choice I prefer, and I'm sure some would disagree). It should make no difference to the compiler, though:
double deltaX = x[c1] - x[c2];
double deltaY = y[c1] - y[c2];
double distance = sqrt(deltaX * deltaX + deltaY * deltaY);
In my opinion that makes for more maintainable (and less error prone, as in this instance) code. Note that, as rewritten, the order of operations does not require extra parentheses.

Remember operator precedence: a - b * c - d means a - (b * c) - d.

Do you want
(x[c1] - (x[c2] * x[c1]) - x[c2])
or
((x[c1] - x[c2]) * (x[c1] - x[c2]))
(x[c1] - x[c2] * x[c1] - x[c2]) parses as (x[c1] - (x[c2] * x[c1]) - x[c2]) because * has higher precedence than -.
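To see the failure concretely, here is a minimal standalone check (a hypothetical program of mine, using the first two cities from the arrays above):
#include <stdio.h>
#include <math.h>
int main() {
    double x[] = {21.0, 12.0};
    double y[] = {17.0, 10.0};
    // Mis-parenthesized: * binds tighter than -, so these are not squared differences
    double wrong = (x[0] - x[1] * x[0] - x[1]) + (y[0] - y[1] * y[0] - y[1]);
    // Correctly parenthesized squared distance
    double right = ((x[0] - x[1]) * (x[0] - x[1])) + ((y[0] - y[1]) * (y[0] - y[1]));
    printf("wrong sum = %f, sqrt = %f\n", wrong, sqrt(wrong)); // -406, then NaN
    printf("right sum = %f, sqrt = %f\n", right, sqrt(right)); // 130, then ~11.4
    return 0;
}
The wrong version sums to -406; sqrt(-406) is NaN, and converting that NaN to an int yields -2147483648 (INT_MIN) on many platforms, which is exactly the value in the title.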

I am going to go ahead and fix a couple of issues:
#include <cstdio>
#include <cmath>
#include <iostream>
using std::cout;

// creating array of cities
double x[] = {21.0, 12.0, 15.0, 3.0, 7.0, 30.0};
double y[] = {17.0, 10.0, 4.0, 2.0, 3.0, 1.0};

// distance function - C = sqrt of A squared + B squared
double dist(int c1, int c2) {
    double z = sqrt(((x[c1] - x[c2]) * (x[c1] - x[c2])) +
                    ((y[c1] - y[c2]) * (y[c1] - y[c2])));
    return z;
}

int main()
{
    int a[] = {1, 2, 3, 4, 5, 6};
    execute(a, 0, sizeof(a) / sizeof(int)); // from the original code, defined elsewhere
    int c1;
    printf("Type in a number \n");
    scanf("%d", &c1);
    int c2;
    printf("Type in a number \n");
    scanf("%d", &c2);
    double z = dist(c1, c2);
    cout << "The result is " << z;
    return 0;
}
This fixes the unused return value, the order of operations, the incorrect int variable type, and the non-standard void main (the user-supplied indices are also renamed c1/c2 so they no longer shadow the global arrays x and y).


Efficient floating point scaling in C++

I'm working on my fast (and accurate) sin implementation in C++, and I have a problem with efficiently scaling the angle into the ±pi/2 range.
My sin function for +-pi/2 using Taylor series is the following
(Note: FLOAT is a macro expanded to float or double just for the benchmark)
/**
 * Sin for 'small' angles, accurate on [-pi/2, pi/2], fairly accurate on [-pi, pi]
 */
// To switch between float and double
#define FLOAT float

FLOAT
my_sin_small(FLOAT x)
{
    constexpr FLOAT C1 = 1. / (7. * 6. * 5. * 4. * 3. * 2.);
    constexpr FLOAT C2 = -1. / (5. * 4. * 3. * 2.);
    constexpr FLOAT C3 = 1. / (3. * 2.);
    constexpr FLOAT C4 = -1.;
    // Correction for sin(pi/2) = 1, due to the ignored Taylor terms
    constexpr FLOAT corr = -1. / 0.9998431013994987;
    const FLOAT x2 = x * x;
    return corr * x * (x2 * (x2 * (x2 * C1 + C2) + C3) + C4);
}
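A quick sanity check for the snippet above, in the same style as the test program further down (my addition, not part of the original benchmark; M_PI is POSIX, not standard C++):
#include <math.h>
#include <stdio.h>
int main(void) {
    for (int d = -90; d <= 90; d += 30) {
        FLOAT x = d / 180.0 * M_PI;
        // compare against the standard library
        printf("%8.4f %11.8f %11.8f\n", x, sin(x), my_sin_small(x));
    }
}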
So far so good... The problem comes when I try to scale an arbitrary angle into the ±pi/2 range. My current solution is:
FLOAT
my_sin(FLOAT x)
{
    constexpr FLOAT pi = 3.141592653589793238462;
    constexpr FLOAT rpi = 1 / pi;
    // convert to +-pi/2 range
    int n = std::nearbyint(x * rpi);
    FLOAT xbar = (n * pi - x) * (2 * (n & 1) - 1);
    // (2 * (n & 1) - 1) is a sign correction (see below)
    return my_sin_small(xbar);
}
I made a benchmark, and I'm losing a lot of time on the ±pi/2 scaling.
Tricks like int(angle/pi + 0.5) are a no-go, since that is limited to int precision and also requires branching on the sign, and I try to avoid branches...
What should I try to improve the performance of this scaling? I'm out of ideas.
Benchmark results for float (in the benchmark the angle can be outside the validity range of my_sin_small, but I don't care about that for the bench): [plot omitted]
Benchmark results for double: [plot omitted]
Sign correction for xbar in my_sin(): [figure omitted]
Algorithm accuracy compared to Python's sin() function: [plot omitted]
Candidate improvements
Convert the radians x to rotations by dividing by 2*pi.
Retain only the fraction so we have an angle in (-1.0 ... 1.0). This simplifies the OP's modulo step to a simple "drop the whole number" step. Going forward, using different angle units simply involves a change of coefficient set; there is no need to scale back to radians.
For positive values, subtract 0.5 so we have (-0.5 ... 0.5) and then flip the sign. This centers the possible values about 0.0 and makes for better convergence of the approximating polynomial as compared to the math sine function. For negative values, see below.
Call my_sin_small1() that uses this (-0.5 ... 0.5) rotations range rather than [-pi ... +pi] radians.
In my_sin_small1(), fold constants together to drop the corr * step.
Rather than using the truncated Taylor series, use a more optimal coefficient set. IMO, this will provide better answers, especially near ±pi.
Notes: No int to/from float code. With more analysis, it is possible to get a better set of coefficients that brings my_sin(±pi) closer to 0.0. This is just a quick set of code to demonstrate fewer FP steps and good potential results.
C-like code for the OP to port to C++:
FLOAT my_sin_small1(FLOAT x) {
    static const FLOAT A1 = -5.64744881E+01;
    static const FLOAT A2 = +7.81017968E+01;
    static const FLOAT A3 = -4.11145353E+01;
    static const FLOAT A4 = +6.27923581E+00;
    const FLOAT x2 = x * x;
    return x * (x2 * (x2 * (x2 * A1 + A2) + A3) + A4);
}
FLOAT my_sin1(FLOAT x) {
    static const FLOAT pi = 3.141592653589793238462;
    static const FLOAT pi2i = 1 / (pi * 2);
    x *= pi2i;
    FLOAT xfraction = 0.5f - (x - truncf(x));
    return my_sin_small1(xfraction);
}
For negative values, use -my_sin1(-x) or similar code to flip the sign, or add 0.5 rather than subtracting it in the xfraction step above.
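A minimal wrapper along those lines (hypothetical name, assuming the two functions above) could be:
FLOAT my_sin1_signed(FLOAT x) {
    // sin is odd: sin(-x) == -sin(x), so fold negative inputs into the positive case
    return (x < 0) ? -my_sin1(-x) : my_sin1(x);
}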
Test
#include <math.h>
#include <stdio.h>

int main(void) {
    for (int d = 0; d <= 360; d += 20) {
        FLOAT x = d / 180.0 * M_PI;
        FLOAT y = my_sin1(x);
        printf("%12.6f %11.8f %11.8f\n", x, sin(x), y);
    }
}
Output
0.000000 0.00000000 -0.00022483
0.349066 0.34202013 0.34221691
0.698132 0.64278759 0.64255589
1.047198 0.86602542 0.86590189
1.396263 0.98480775 0.98496443
1.745329 0.98480775 0.98501128
2.094395 0.86602537 0.86603642
2.443461 0.64278762 0.64260530
2.792527 0.34202022 0.34183803
3.141593 -0.00000009 0.00000000
3.490659 -0.34202016 -0.34183764
3.839724 -0.64278757 -0.64260519
4.188790 -0.86602546 -0.86603653
4.537856 -0.98480776 -0.98501128
4.886922 -0.98480776 -0.98496443
5.235988 -0.86602545 -0.86590189
5.585053 -0.64278773 -0.64255613
5.934119 -0.34202036 -0.34221727
6.283185 0.00000017 -0.00022483
Alternate code below makes for better results near 0.0, yet might cost a tad more time. OP seems more inclined to speed.
FLOAT xfraction = 0.5f - (x - truncf(x));
// vs.
FLOAT xfraction = x - truncf(x);
if (xfraction >= 0.5f) xfraction -= 1.0f;
[Edit]
Below is a better set with about 10% reduced error.
static const FLOAT A1 = -56.0833765f;
static const FLOAT A2 = +77.92947047f;
static const FLOAT A3 = -41.0936875f;
static const FLOAT A4 = +6.278635918f;
Yet another approach:
Spend more time (code) to reduce the range to ±pi/4 (±45 degrees); then it is possible to use only 3 or 2 terms of a polynomial that is like the usual Taylor series.
float sin_quick_small(float x) {
    const float x2 = x * x;
#if 0
    // max error about 7e-7
    static const FLOAT A2 = +0.00811656036940792f;
    static const FLOAT A3 = -0.166597759850666f;
    static const FLOAT A4 = +0.999994132743861f;
    return x * (x2 * (x2 * A2 + A3) + A4);
#else
    // max error about 0.00016
    static const FLOAT A3 = -0.160343346851626f;
    static const FLOAT A4 = +0.999031566686144f;
    return x * (x2 * A3 + A4);
#endif
}

float cos_quick_small(float x) {
    return cosf(x); // TBD code.
}

float sin_quick(float x) {
    if (x < 0.0) {
        return -sin_quick(-x);
    }
    int quo;
    float x90 = remquof(fabsf(x), 3.141592653589793238462f / 2, &quo);
    switch (quo % 4) {
        case 0:
            return sin_quick_small(x90);
        case 1:
            return cos_quick_small(x90);
        case 2:
            return sin_quick_small(-x90);
        case 3:
            return -cos_quick_small(x90);
    }
    return 0.0;
}
int main() {
    float max_x = 0.0;
    float max_error = 0.0;
    for (int d = -45; d <= 45; d += 1) {
        FLOAT x = d / 180.0 * M_PI;
        FLOAT y = sin_quick(x);
        double err = fabs(y - sin(x));
        if (err > max_error) {
            max_x = x;
            max_error = err;
        }
        printf("%12.6f %11.8f %11.8f err:%11.8f\n", x, sin(x), y, err);
    }
    printf("x:%.6f err:%.6f\n", max_x, max_error);
    return 0;
}

C++/SDL2 -- Rendering A Circle [closed]

Am I on the right track here, drawing a filled circle with SDL2? I figured using parametric equations and a radius tending to zero would work, but it seems really inefficient in terms of processor use. Any other ideas are much appreciated. Thanks in advance.
//Circle test
int circle_x = WINDOW_WIDTH / 2;
int circle_y = WINDOW_HEIGHT / 2;
int circle_radius = 100;
SDL_SetRenderDrawColor(window.renderer, 100, 100, 255, 255);
int point_x;
int point_y;
while (circle_radius > 0)
{
    for (int t = 0; t < 360; t++)
    {
        point_x = circle_x + circle_radius * cos(t);
        point_y = circle_y + circle_radius * sin(t);
        SDL_RenderDrawPoint(window.renderer, point_x, point_y);
    }
    circle_radius--;
}
First of all, there's a mistake in your code: the sin() and cos() functions take radians, not degrees. This makes the points appear at pseudorandomly scattered angles, since each step draws a point about 57 degrees away from the previous one. It has no visible impact only because modern graphics are buffered, so you see the final result, not the order in which it was drawn.
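For reference, a minimal fix of the loop above is to convert the degree counter to radians (this repairs the math, though it is still not the recommended way to draw a circle, as explained next; M_PI comes from math.h):
point_x = circle_x + circle_radius * cos(t * M_PI / 180.0);
point_y = circle_y + circle_radius * sin(t * M_PI / 180.0);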
That said, nobody draws a circle today using the algorithm you show. Have a look at Bresenham's midpoint circle algorithm, which draws the circle by octants and is several times faster.
The idea behind these algorithms is to take the R^2 = x^2 + y^2 formula and draw, pixel by pixel, along one of the axes, deciding at each step whether the other axis must follow (you draw by octants precisely so you never deal with derivatives greater than one, and only have to decide whether to move up or not). The routine also exploits the circle's symmetry to calculate only one octant and then draw the eight mirrored points at each pass.
As I developed this algorithm from scratch when I was young (without having seen Bresenham's before), my reasoning on the way to the solution will probably be of help to you.
The first thing to take into account is resolution: the granularity is not angle dependent, so for small circles you have to paint fewer pixels than for big ones, and the one-degree approach you followed has to be redesigned to adapt to finer or coarser resolutions. The idea is to go pixel by pixel, instead of degree by degree, until you draw the complete figure. We are going to paint only the first octant and will draw the rest via the symmetry properties of the figure. We start from the (0, -R) point and go, pixel by pixel, until we reach the 45-degree point where |x| = |y| = R/sqrt(2).
The first thing we are going to do is save as many operations as we can, starting with the calculation of squares... We are going to use the R^2 = x^2 + y^2 equation, and in it R only ever appears as R^2, so, say we want to draw a circle with a ten-pixel radius: we work with 100 (the 10-pixel radius, squared) throughout.
Next, we are going to use one property of squares: they grow by consecutive odd numbers from one square to the next (0 -> 1 (delta is 1) -> 4 (delta is 3) -> 9 (delta is 5) -> 16 (delta is 7) ...), so if we arrange for x to grow by 1, we can compute x^2 cheaply by adding two to an odd delta and adding that delta to the previous square. So we keep two numbers, x and x2: we initialize both to 0, the first grows by x += 1;, and the second by x2 += dx2; dx2 += 2; (with dx2 = 1; initially). This lets x and x2 grow using only additions, with no multiplications at all.
If you think we will need y2 = 100 - x2 and are then forced to compute y = sqrt(y2), you are almost right, but the trick is that the y and y2 sequences can be managed the same way as their x counterparts, only backwards: decreasing odd numbers from 2R - 1 down to 1, with dy2 -= 2; y2 -= dy2; until y2 finally reaches 0. For this to work, note that the difference between two consecutive squares is precisely the sum of the two roots; for example, the difference between 13^2 = 169 and 14^2 = 196 is 13 + 14 = 27, and that is the odd number to start from when going back from R = 14 toward 0 in y.
The reason for complicating things so much is that this way we only do integer additions and need no multiplications (multiplications are not so expensive nowadays, but there was a time when they were). We do multiply once, at the very beginning, to square the radius R and get R^2.
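As a standalone illustration of the odd-delta trick (my sketch, not part of the final routine):
#include <stdio.h>

int main(void) {
    int x2 = 0, dx2 = 1;
    for (int x = 0; x <= 10; x++) {
        printf("%2d^2 = %3d\n", x, x2); /* x2 tracks x*x using additions only */
        x2 += dx2;  /* next square = current square + odd delta */
        dx2 += 2;   /* odd deltas: 1, 3, 5, 7, ... */
    }
    return 0;
}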
The idea now is to set the origin at the departure point (0, -R) and go right, pixel by pixel, updating x and x2 and subtracting from sum all the time, until we cross the next square in y; at that moment we update all the y-axis values y, y2, dy2 (we decrement y, since we have to move a pixel up at that point) and draw the pixel (or draw it before, as we do in the routine). Until when? Until we meet the 45-degree point where the octant is complete and the x and y coordinates are equal. It is important to stop there, because beyond it a single step can move the y coordinate by more than one pixel (the derivative is greater than one), which would complicate the overall algorithm; besides, we are drawing the eight symmetrical points at each pass anyway, so the rest of the figure is already covered.
So, suppose we have a radius of 10 (R^2 = 100), and begin with:
x=0, x2= 0, dx2= 1, y=10, y2=100, dy2=19, sum=100 *
x=1, x2= 1, dx2= 3, y= 9, y2= 81, dy2=17, sum= 99
x=2, x2= 4, dx2= 5, y= 9, y2= 81, dy2=17, sum= 96
x=3, x2= 9, dx2= 7, y= 9, y2= 81, dy2=17, sum= 91
x=4, x2=16, dx2= 9, y= 9, y2= 81, dy2=17, sum= 84 *
x=5, x2=25, dx2=11, y= 8, y2= 64, dy2=15, sum= 75 *
x=6, x2=36, dx2=13, y= 7, y2= 49, dy2=13, sum= 64 *
x=7, x2=49, dx2=15, y= 7, y2= 49, dy2=13, sum= 51
The rows marked with an asterisk are the ones where sum crosses the next y2 value, causing y to be decremented and shifting the row of pixels we paint on. The final routine is:
int bh_onlyonepoint(int r, int cx, int cy)
{
    int r2 = r * r;
    int x = 0, x2 = 0, dx2 = 1;
    int y = r, y2 = y * y, dy2 = 2 * y - 1;
    int sum = r2;
    while (x <= y) {
        draw(cx + x, cy + y); /* draw the point, see below */
        sum -= dx2;
        x2 += dx2;
        x++;
        dx2 += 2;
        if (sum <= y2) {
            y--; y2 -= dy2; dy2 -= 2;
        }
    } /* while */
    return x; /* number of points drawn */
}
If you want, I have written a simple example that draws a circle in ASCII on the screen, taking the radius as a command-line parameter. It uses ANSI escape sequences to position the cursor before drawing each asterisk. The scale is doubled in the X direction to compensate for the character height ("pixels" in ASCII are not square). I have added a drawing function pointer parameter as a callback for the point drawing, plus a main routine that reads the parameters from the command line:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

void draw(int x, int y)
{
    /* move to position (2*x, y) and plot an asterisk */
    printf("\033[%d;%dH*", y, x << 1);
}

int bh(int r, int cx, int cy, void (*draw)(int, int))
{
    /* the variables mentioned in the text */
    int r2 = r * r;
    int x = 0, x2 = 0, dx2 = 1;
    int y = r, y2 = y * y, dy2 = 2 * y - 1;
    int sum = r2;
    while (x <= y) {
        /* draw the eight points */
        draw(cx + x, cy + y);
        draw(cx + x, cy - y);
        draw(cx - x, cy + y);
        draw(cx - x, cy - y);
        draw(cx + y, cy + x);
        draw(cx + y, cy - x);
        draw(cx - y, cy + x);
        draw(cx - y, cy - x);
        sum -= dx2;
        x2 += dx2;
        x++;
        dx2 += 2;
        if (sum <= y2) {
            y--; y2 -= dy2; dy2 -= 2;
        }
    } /* while */
    return x; /* number of points drawn */
}

int main(int argc, char **argv)
{
    int i;
    char *cols = getenv("COLUMNS");
    char *lines = getenv("LINES");
    int cx, cy;
    if (!cols) cols = "80";
    if (!lines) lines = "24";
    cx = atoi(cols) / 4;
    cy = atoi(lines) / 2;
    printf("\033[2J"); /* erase screen */
    for (i = 1; i < argc; i++) {
        bh(atoi(argv[i]), cx, cy, draw);
    }
    fflush(stdout);
    sleep(10);
    puts(""); /* force a new line */
}
And the final result is the circles drawn in asterisks on the terminal. [ASCII-art output omitted: the original alignment did not survive transcription.]
Finally, if you want better results (avoiding the peaks that arise when an exact radius value makes the curve merely touch a point where x or y is zero), you can pass the routine the squared radius directly, which allows integer calculations with fractional radii.
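A sketch of that variant (hypothetical; derived from bh_onlyonepoint above by taking r2 as the argument, so r2 = 110 stands for a radius of about 10.49 pixels):
#include <math.h>

int bh_fractional(int r2, int cx, int cy)
{
    int x = 0, x2 = 0, dx2 = 1;
    int y = (int)sqrt((double)r2); /* only the integer part is needed to start the octant */
    int y2 = y * y, dy2 = 2 * y - 1;
    int sum = r2;
    while (x <= y) {
        draw(cx + x, cy + y);
        sum -= dx2;
        x2 += dx2;
        x++;
        dx2 += 2;
        if (sum <= y2) {
            y--; y2 -= dy2; dy2 -= 2;
        }
    }
    return x;
}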
Filling a circle
If you want to fill the circle, just paint all the points between each pair of calculated points, as in:
lineFromTo(cx - x, cy - y, cx + x, cy - y);
lineFromTo(cx - y, cy + x, cx + y, cy + x);
lineFromTo(cx - y, cy - x, cx + y, cy - x);
lineFromTo(cx - x, cy + y, cx + x, cy + y);
These are all horizontal lines, so perhaps you can get an improvement with something like:
/* X1 X2 Y */
HorizLineX1X2Y(cx - x, cx + x, cy - y);
HorizLineX1X2Y(cx - y, cx + y, cy + x);
HorizLineX1X2Y(cx - y, cx + y, cy - x);
HorizLineX1X2Y(cx - x, cx + x, cy + y);
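Tying this back to the SDL2 question: SDL2 already provides SDL_RenderDrawLine(), so a filled-circle sketch along these lines (my adaptation, assuming a valid renderer) might look like:
#include <SDL2/SDL.h>

void fill_circle(SDL_Renderer *ren, int cx, int cy, int r)
{
    int x = 0, dx2 = 1;
    int y = r, y2 = y * y, dy2 = 2 * y - 1;
    int sum = r * r;
    while (x <= y) {
        /* one horizontal span per mirrored row */
        SDL_RenderDrawLine(ren, cx - x, cy - y, cx + x, cy - y);
        SDL_RenderDrawLine(ren, cx - x, cy + y, cx + x, cy + y);
        SDL_RenderDrawLine(ren, cx - y, cy - x, cx + y, cy - x);
        SDL_RenderDrawLine(ren, cx - y, cy + x, cx + y, cy + x);
        sum -= dx2;
        x++;
        dx2 += 2;
        if (sum <= y2) {
            y--; y2 -= dy2; dy2 -= 2;
        }
    }
}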
A new git repository with the final program, which allows filling, drawing, or tracing the run of the algorithm, has been created on GitHub.

Fast approximate float division

On modern processors, float division is a good order of magnitude slower than float multiplication (when measured by reciprocal throughput).
I'm wondering if there are any algorithms out there for computing a fast approximation to x/y, given certain assumptions and tolerance levels. For example, if you assume that 0<x<y, and are willing to accept any output that is within 10% of the true value, are there algorithms faster than the built-in FDIV operation?
I hope this helps, because it is probably as close as you're going to get to what you are looking for.
__inline__ double __attribute__((const)) divide( double y, double x ) {
    // calculates y/x
    union {
        double dbl;
        unsigned long long ull;
    } u;
    u.dbl = x;
    u.ull = (0xbfcdd6a18f6a6f52ULL - u.ull) >> 1;  // u.dbl ~= pow(x, -0.5)
    u.dbl *= u.dbl;      // pow(pow(x, -0.5), 2) = pow(x, -1) = 1.0/x
    return u.dbl * y;    // (1.0/x) * y = y/x
}
See also:
Another post about reciprocal approximation.
The Wikipedia page.
FDIV is usually much slower than FMUL simply because it cannot be pipelined the way multiplication can, and it needs multiple clock cycles for the hardware's iterative convergence process.
The easiest way is to recognize that division is nothing more than multiplying the dividend y by the inverse of the divisor x. The not-so-straightforward part is remembering that a float value is x = m * 2^e, and its inverse is x^-1 = (1/m) * 2^(-e) = (2/m) * 2^(-e-1) = p * 2^q, approximating the new mantissa as p = 2/m ≈ 3 - m for 1 <= m < 2. This gives a rough piecewise-linear approximation of the inverse function; we can do a lot better by using an iterative Newton root-finding method to improve that approximation.
Let w = f(x) = 1/x; the inverse of this function is found by solving for x in terms of w, i.e. x = f^(-1)(w) = 1/w. To improve an estimate with the root-finding method, we first need a function whose zero is the desired output: g(w) = 1/w - x, with g'(w) = -1/w^2. The Newton step is
w[n+1] = w[n] - g(w[n])/g'(w[n]) = w[n] + w[n]^2 * (1/w[n] - x) = w[n] * (2 - x*w[n])
and the fixed point checks out: when w[n] = 1/x, w[n+1] = 1/x * (2 - x * (1/x)) = 1/x.
These components then combine into the final piece of code:
#include <stdint.h>

float inv_fast(float x) {
    union { float f; int i; } v;
    float w, sx;

    sx = (x < 0) ? -1 : 1;
    x = sx * x;                                  // work with |x|, restore the sign at the end

    v.i = (int)(0x7EF127EA - *(uint32_t *)&x);   // bit-trick initial guess for 1/x
    w = x * v.f;                                 // w ~= 1; its deviation is the error to reduce

    // Efficient iterative approximation improvement in Horner polynomial form.
    v.f = v.f * (2 - w);  // Single iteration, Err = -3.36e-3 * 2^(-flr(log2(x)))
    // v.f = v.f * (4 + w * (-6 + w * (4 - w)));  // Second iteration, Err = -1.13e-5 * 2^(-flr(log2(x)))
    // v.f = v.f * (8 + w * (-28 + w * (56 + w * (-70 + w * (56 + w * (-28 + w * (8 - w))))))); // Third iteration, Err = +-6.8e-8 * 2^(-flr(log2(x)))

    return v.f * sx;
}
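A quick harness to gauge the accuracy (my addition, not from the original answer):
#include <stdio.h>

int main(void) {
    float xs[] = {0.5f, 1.0f, 3.0f, 10.0f, 100.0f};
    for (int i = 0; i < 5; i++) {
        float approx = inv_fast(xs[i]);
        float exact = 1.0f / xs[i];
        printf("x=%8.3f approx=%.6f exact=%.6f rel.err=%+.2e\n",
               xs[i], approx, exact, (approx - exact) / exact);
    }
    return 0;
}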

Levenberg–Marquardt not converging

I'm trying to fit a model using the Levenberg–Marquardt method as described in Numerical Recipes.
The problem is: it does not converge, or when it does, it's not precise... or at least the covariance matrix is strange.
int i = 0;
for (i = 0; i < 3e4; i++) {
    mrqmin(x, y, sig, NPCalib, a, ia, 3, covar, alpha, &chisk, afunc, &alamda);
    if (chisk < 1e-8)
        sumchisk++;
    if (sumchisk > 5)
        break;
    if (alamda > 1e8)
        alamda = 1e8;
}
(x, y) are 3 points (double) that fit the form y = a(x - x0)^2 pretty well.
Using sumchisk like this is what Numerical Recipes recommends for this function.
alamda is capped at the top here because otherwise there might be an overflow.
Other definitions and data-points:
double a[4] = {0.0, 0.0001, 100.0, -1};
int ia[4] = {0, 1, 1, 0};
double x[] = {0.0, 799.157549545577, 799.92196995454, 800.683769692575};
double y[] = {0.0, 524.26491, 525.26768, 526.26586};
double sig[] = {0.0, 0.1 * y[1], 0.1 * y[2], 0.1 * y[3]};
double **covar = new double*[4];
covar[1] = new double[4];
covar[2] = new double[4];
covar[3] = new double[4];
double **alpha = new double*[4];
alpha[1] = new double[4];
alpha[2] = new double[4];
alpha[3] = new double[4];
double chisk = 0;
double alamda = -1;
void afunc(int i, double x[], double a[], double *y, double dyda[], int ma)
{
    *y = a[1] * pow(x[i] + a[2], 2) / pow(1 + a[3] * CT[i - 1], 2);
    dyda[1] = pow(x[i] + a[2], 2) / pow(1 + a[3] * CT[i - 1], 2);
    dyda[2] = (2 * a[1] * (x[i] + a[2])) / pow(1 + a[3] * CT[i - 1], 2);
    dyda[3] = (-2 * a[1] * CT[i - 1] * pow(x[i] + a[2], 2)) / pow(1 + a[3] * CT[i - 1], 3);
}
I changed the NR source code to use double instead of float. The first array element is not used because the code comes from Fortran and I didn't feel like changing such a small detail.
The model also contains a third parameter, which isn't used in this fit and thus stays at a[3] = -1, because ia[3] = 0 (ia[i] = 1 means the parameter gets fitted).
However, now I have the problem that sometimes this doesn't converge. It finishes with alamda = 1e8 and i = 3e4, especially when I set the threshold for chisk lower.
The resulting parameter sets seem fine, though... chisk ends up around 1e-6, say, and the parameters look reasonable, but the diagonal of the covariance matrix (which should give the squared standard deviation of each parameter) contains rubbish like ~800000 for a parameter of 0.0001.
Does anyone know what I did wrong when using this algorithm?
Anything specific I need to write into covar/alpha when I start? Can the sig be set like this?

Variable grouping providing different answers in optimized code

I've been attempting to unit test a C++ class I've written for Geodetic transforms.
I've noticed that a trivial grouping change of three variables greatly influences the error in the function.
EDIT: Here is the entire function for a compilable example.
Assume latitude, longitude, and altitude are zero. Earth::a = 6378137 and Earth::b = 6356752.3. I'm working on getting benchmark numbers; something came up at work today and I had to do that instead.
void Geodesy::Geocentric2EFG(double latitude, double longitude, double altitude, double *E, double *F, double *G) {
    double a2 = pow<double>(Earth::a, 2);
    double b2 = pow<double>(Earth::b, 2);
    double radius = sqrt((a2 * b2) / (a2 * pow<double>(sin(latitude), 2) + b2 * pow<double>(cos(longitude), 2)));
    radius += altitude;
    *E = radius * (cos(latitude) * cos(longitude));
    *F = radius * (cos(latitude) * sin(longitude));
    *G = radius * sin(latitude);
    return;
}
Where all values are defined as double, including those in Earth. The pow<T>() function is a recursive template function defined by:
template <typename T>
static inline T pow(const T &base, unsigned const exponent) {
    return (exponent == 0) ? 1 : (base * pow(base, exponent - 1));
}
The code in question:
*E = radius * cos(latitude) * cos(longitude);
*F = radius * cos(latitude) * sin(longitude);
produces different results than:
*E = radius * (cos(latitude) * cos(longitude));
*F = radius * (cos(latitude) * sin(longitude));
What is the compiler doing in gcc with optimization level 3 to make these results 1e-2 different?
You get different rounding because floating point cannot represent all numbers:
a * b * c; is (a * b) * c, which may differ from a * (b * c).
You may have similar issues with addition too.
Example with addition:
10e10f + 1.f == 10e10f
so (1.f + 10e10f) - 10e10f == 10e10f - 10e10f == 0.f,
whereas 1.f + (10e10f - 10e10f) == 1.f + 0.f == 1.f.
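A minimal program showing the effect (note that optimization flags such as -ffast-math allow the compiler to reassociate, which can change the output):
#include <cstdio>

int main() {
    float big = 10e10f, one = 1.f;
    std::printf("%g\n", (one + big) - big);  // prints 0: the 1 is absorbed by rounding
    std::printf("%g\n", one + (big - big));  // prints 1: the big values cancel exactly first
    return 0;
}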