Efficient floating point scaling in C++

I'm working on my fast (and accurate) sin implementation in C++, and I have a problem with efficiently scaling an arbitrary angle into the ±pi/2 range.
My sin function for +-pi/2 using Taylor series is the following
(Note: FLOAT is a macro expanded to float or double just for the benchmark)
/**
* Sin for 'small' angles, accurate on [-pi/2, pi/2], fairly accurate on [-pi, pi]
*/
// To switch between float and double
#define FLOAT float
FLOAT
my_sin_small(FLOAT x)
{
    constexpr FLOAT C1 = 1. / (7. * 6. * 5. * 4. * 3. * 2.);
    constexpr FLOAT C2 = -1. / (5. * 4. * 3. * 2.);
    constexpr FLOAT C3 = 1. / (3. * 2.);
    constexpr FLOAT C4 = -1.;
    // Correction for sin(pi/2) = 1, due to the ignored taylor terms
    constexpr FLOAT corr = -1. / 0.9998431013994987;
    const FLOAT x2 = x * x;
    return corr * x * (x2 * (x2 * (x2 * C1 + C2) + C3) + C4);
}
So far so good... The problem comes when I try to scale an arbitrary angle into the +-pi/2 range. My current solution is:
FLOAT
my_sin(FLOAT x)
{
    constexpr FLOAT pi = 3.141592653589793238462;
    constexpr FLOAT rpi = 1 / pi;
    // convert to +-pi/2 range
    int n = std::nearbyint(x * rpi);
    FLOAT xbar = (n * pi - x) * (2 * (n & 1) - 1);
    // (2 * (n & 1) - 1) is a sign correction (see below)
    return my_sin_small(xbar);
}
I made a benchmark, and I'm losing a lot for the +-pi/2 scaling.
Using the int(angle/pi + 0.5) trick is not an option, since it is limited to int precision, requires branching on the sign, and I try to avoid branches...
What should I try to improve the performance for this scaling? I'm out of ideas.
Benchmark results for float (in the benchmark the angle may be outside the validity range of my_sin_small, but for the benchmark I don't care about that):
Benchmark results for double:
Sign correction for xbar in my_sin():
Algorithm accuracy compared to Python's sin() function:

Candidate improvements
Convert the radians x to rotations by dividing by 2*pi.
Retain only the fraction so we have an angle in (-1.0 ... 1.0). This simplifies the OP's modulo step to a simple "drop the whole number" step instead. Going forward with different angle units simply involves a coefficient set change; no need to scale back to radians.
For positive values, subtract 0.5 so we have (-0.5 ... 0.5) and then flip the sign. This centers the possible values about 0.0 and makes for better convergence of the approximating polynomial as compared to the math sine function. For negative values - see below.
Call my_sin_small1() that uses this (-0.5 ... 0.5) rotations range rather than [-pi ... +pi] radians.
In my_sin_small1(), fold constants together to drop the corr * step.
Rather than use the truncated Taylor's series, use a more optimal set. IMO, this will provide better answers, especially near +/-pi.
Notes: No int to/from float code. With more analysis, possible to get a better set of coefficients that fix my_sin(+/-pi) closer to 0.0. This is just a quick set of code to demo less FP steps and good potential results.
C like code for OP to port to C++
FLOAT my_sin_small1(FLOAT x) {
    static const FLOAT A1 = -5.64744881E+01;
    static const FLOAT A2 = +7.81017968E+01;
    static const FLOAT A3 = -4.11145353E+01;
    static const FLOAT A4 = +6.27923581E+00;
    const FLOAT x2 = x * x;
    return x * (x2 * (x2 * (x2 * A1 + A2) + A3) + A4);
}

FLOAT my_sin1(FLOAT x) {
    static const FLOAT pi = 3.141592653589793238462;
    static const FLOAT pi2i = 1 / (pi * 2);
    x *= pi2i;
    FLOAT xfraction = 0.5f - (x - truncf(x));
    return my_sin_small1(xfraction);
}
For negative values, use -my_sin1(-x) or like code to flip the sign - or add 0.5 in the above minus 0.5 step.
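For instance, a thin sign-handling wrapper along those lines might look like this (just a sketch; it assumes my_sin1() as defined above and the FLOAT macro from the question):

// Sketch only: forwards negative inputs through -my_sin1(-x) as suggested above.
FLOAT my_sin1_signed(FLOAT x) {
    return (x < 0) ? -my_sin1(-x) : my_sin1(x);
}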
Test
#include <math.h>
#include <stdio.h>

int main(void) {
    for (int d = 0; d <= 360; d += 20) {
        FLOAT x = d / 180.0 * M_PI;
        FLOAT y = my_sin1(x);
        printf("%12.6f %11.8f %11.8f\n", x, sin(x), y);
    }
}
Output
0.000000 0.00000000 -0.00022483
0.349066 0.34202013 0.34221691
0.698132 0.64278759 0.64255589
1.047198 0.86602542 0.86590189
1.396263 0.98480775 0.98496443
1.745329 0.98480775 0.98501128
2.094395 0.86602537 0.86603642
2.443461 0.64278762 0.64260530
2.792527 0.34202022 0.34183803
3.141593 -0.00000009 0.00000000
3.490659 -0.34202016 -0.34183764
3.839724 -0.64278757 -0.64260519
4.188790 -0.86602546 -0.86603653
4.537856 -0.98480776 -0.98501128
4.886922 -0.98480776 -0.98496443
5.235988 -0.86602545 -0.86590189
5.585053 -0.64278773 -0.64255613
5.934119 -0.34202036 -0.34221727
6.283185 0.00000017 -0.00022483
Alternate code below makes for better results near 0.0, yet might cost a tad more time. OP seems more inclined to speed.
FLOAT xfraction = 0.5f - (x - truncf(x));
// vs.
FLOAT xfraction = x - truncf(x);
if (xfraction >= 0.5f) xfraction -= 1.0f;
[Edit]
Below is a better set with about 10% reduced error.
-56.0833765f
77.92947047f
-41.0936875f
6.278635918f
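Presumably these map onto A1..A4 in the same order as before (an assumption on my part); a drop-in variant of my_sin_small1() using the refined set would then look like this sketch:

FLOAT my_sin_small1_v2(FLOAT x) {
    // Refined coefficient set from the edit above, in A1..A4 order (assumed).
    static const FLOAT A1 = -56.0833765f;
    static const FLOAT A2 = +77.92947047f;
    static const FLOAT A3 = -41.0936875f;
    static const FLOAT A4 = +6.278635918f;
    const FLOAT x2 = x * x;
    return x * (x2 * (x2 * (x2 * A1 + A2) + A3) + A4);
}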

Yet another approach:
Spend more time (code) to reduce the range to ±pi/4 (±45 degrees); then it is possible to use only 2 or 3 terms of a polynomial similar to the usual Taylor series.
float sin_quick_small(float x) {
    const float x2 = x * x;
#if 0
    // max error about 7e-7
    static const FLOAT A2 = +0.00811656036940792f;
    static const FLOAT A3 = -0.166597759850666f;
    static const FLOAT A4 = +0.999994132743861f;
    return x * (x2 * (x2 * A2 + A3) + A4);
#else
    // max error about 0.00016
    static const FLOAT A3 = -0.160343346851626f;
    static const FLOAT A4 = +0.999031566686144f;
    return x * (x2 * A3 + A4);
#endif
}
float cos_quick_small(float x) {
    return cosf(x); // TBD code.
}
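// The cosine helper above is left as TBD; one simple placeholder (a sketch of my
// own, not the answer's tuned coefficients) is a truncated Taylor polynomial for
// |x| <= pi/4, with error on the order of a few 1e-6:
float cos_quick_small_sketch(float x) {
    // cos(x) ~= 1 - x^2/2 + x^4/24 - x^6/720
    const float x2 = x * x;
    return 1.0f + x2 * (-0.5f + x2 * (1.0f / 24 - x2 * (1.0f / 720)));
}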
float sin_quick(float x) {
    if (x < 0.0) {
        return -sin_quick(-x);
    }
    int quo;
    float x90 = remquof(fabsf(x), 3.141592653589793238462f / 2, &quo);
    switch (quo % 4) {
        case 0:
            return sin_quick_small(x90);
        case 1:
            return cos_quick_small(x90);
        case 2:
            return sin_quick_small(-x90);
        case 3:
            return -cos_quick_small(x90);
    }
    return 0.0;
}
int main() {
    float max_x = 0.0;
    float max_error = 0.0;
    for (int d = -45; d <= 45; d += 1) {
        FLOAT x = d / 180.0 * M_PI;
        FLOAT y = sin_quick(x);
        double err = fabs(y - sin(x));
        if (err > max_error) {
            max_x = x;
            max_error = err;
        }
        printf("%12.6f %11.8f %11.8f err:%11.8f\n", x, sin(x), y, err);
    }
    printf("x:%.6f err:%.6f\n", max_x, max_error);
    return 0;
}

Related

Code two required condition in finding root with Fixed point method

I'm trying to find a root with the simple fixed-point method in C++, but the catch is that xr is a root of f(x) and an inflection point as well. In addition, the equation is a little bit more complex than in the normal fixed-point method.
A constant c is added to the equation to check how quickly it converges to the root xr.
I was going to find the root and then check whether it is an inflection point or not, but it is not working and I can't find the problem in my code.
I need your help.
The real Problem is
Consider the root finding problem f(x)=0 with root xr, with f'(x)=0.
Convert it to the simple fixed-point problem.
x=x+c*f(x)=g(x)
with c a nonzero constant. How should c be chosen to ensure rapid convergence of
x(n+1) = x(n) + c*f(x(n))   (x(n+1) means the value of x at the (n+1)th iteration)
to xr (provided that x0 is chosen sufficiently close to xr). Apply your way of choosing c to the root-finding problem x*x*x - 5 = 0. Start your program with x0 = 1.0, run it with several values of c, and discuss the observed trend in your results (in other words, the effect of the value of c on convergence behavior).
#include <stdio.h>
#include <conio.h>
#include <math.h>
#include <stdlib.h>
double gx(double x, double c)
{
return(x + c*(x*x*x - 5));
}
double gxpr(double x, double c)
{
return(x + c*(3 * x*x));
}
void Simple_Fixed_Point(double x, double c)
{
int i = 1;
long double x2=0.0;
long double x3=0.0;
long double ea=0.0;
long double ea2 = 0.0;
long double es = pow(10, -6);
printf("Simple Fixed Point Method\n");
Lbl:
x2 = gx(x,c);
printf("iteration=%d Root=%.5f Approximate error=%.15f\n", i++,
x2, ea);
if (ea=fabs((x2 - x)/x2*100) <es)
{
goto Lbm;
}
else
{
x = x2;
goto Lbl;
}
Lbm:
x3 = gxpr(x2, c);
if (ea2 = fabs((x3 - x2) / x3 * 100) < es)
{
goto End;
}
else
{
x2 = x3;
goto Lbm;
}
End:
getch();
}
int main(void)
{
Simple_Fixed_Point(1.0, 1.0);
return(0);
}
Hope this helps you:
//f(x+dx) = f(x) + (dfdx) * dx;
eps = 1.0;
dx = 1e-7; //something small
x = x0;
while (eps > mineps) {
    f1 = f(x);
    f2 = f(x + dx);
    f3 = f(x + dx + dx);
    d2fdx2 = (f3 - f2 - f2 + f1) / dx / dx;
    dfdx = (f2 - f1) / dx;
    x -= (relax1 * f1 / dfdx + relax2 * dfdx / d2fdx2); //relax - something less than 1
    eps = max(abs(dfdx), abs(f1));
}
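Regarding the original question of how c should be chosen: with g(x) = x + c*f(x), the iteration converges fastest when |g'(xr)| = |1 + c*f'(xr)| is small, i.e. c close to -1/f'(xr). For f(x) = x^3 - 5 and xr = 5^(1/3) that is roughly c = -0.114. A minimal sketch of the plain fixed-point loop with that choice (my own illustration, not the OP's code):

#include <math.h>
#include <stdio.h>

int main(void) {
    double c = -0.114;   // ~ -1/(3*xr*xr); try other values to see the effect on convergence
    double x = 1.0;      // x0
    for (int i = 1; i <= 50; i++) {
        double xn = x + c * (x * x * x - 5.0);   // x(n+1) = x(n) + c*f(x(n))
        double ea = fabs((xn - x) / xn) * 100;   // approximate relative error in %
        printf("iteration=%d Root=%.9f Approximate error=%.9f\n", i, xn, ea);
        x = xn;
        if (ea < 1e-6) break;
    }
    printf("true root = %.9f\n", cbrt(5.0));
    return 0;
}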

Fast approximate float division

On modern processors, float division is a good order of magnitude slower than float multiplication (when measured by reciprocal throughput).
I'm wondering if there are any algorithms out there for computing a fast approximation to x/y, given certain assumptions and tolerance levels. For example, if you assume that 0<x<y, and are willing to accept any output that is within 10% of the true value, are there algorithms faster than the built-in FDIV operation?
I hope this helps, because this is probably as close as you're going to get to what you are looking for.
__inline__ double __attribute__((const)) divide( double y, double x ) {
    // calculates y/x
    union {
        double dbl;
        unsigned long long ull;
    } u;
    u.dbl = x;                                                      // x = x
    u.ull = ( 0xbfcdd6a18f6a6f52ULL - u.ull ) >> (unsigned char)1;  // pow( x, -0.5 )
    u.dbl *= u.dbl;     // pow( pow(x,-0.5), 2 ) = pow( x, -1 ) = 1.0/x
    return u.dbl * y;   // (1.0/x) * y = y/x
}
See also:
Another post about reciprocal approximation.
The Wikipedia page.
FDIV is usually much slower than FMUL simply because it cannot be pipelined like multiplication and requires multiple clock cycles for the iterative convergence process in hardware.
The easiest way is to recognize that division is nothing more than multiplying the dividend y by the inverse of the divisor x. The not-so-straightforward part is remembering that a float value x = m * 2^e has the inverse x^-1 = (1/m)*2^(-e) = (2/m)*2^(-e-1) = p * 2^q, and approximating the new mantissa p = 2/m by 3 - m for 1 <= m < 2. This gives a rough piecewise-linear approximation of the inverse function; however, we can do a lot better by using an iterative Newton root-finding method to improve that approximation.
Let w = f(x) = 1/x. The inverse of this function is found by solving for x in terms of w, i.e. x = f^(-1)(w) = 1/w. To improve the output with the root-finding method we must first create a function whose zero reflects the desired output, i.e. g(w) = 1/w - x, with d/dw(g(w)) = -1/w^2.
w[n+1]= w[n] - g(w[n])/g'(w[n]) = w[n] + w[n]^2 * (1/w[n] - x) = w[n] * (2 - x*w[n])
w[n+1] = w[n] * (2 - x*w[n]), when w[n]=1/x, w[n+1]=1/x*(2-x*1/x)=1/x
These components then add to get the final piece of code:
float inv_fast(float x) {
    union { float f; int i; } v;
    float w, sx;
    int m;
    sx = (x < 0) ? -1 : 1;
    x = sx * x;
    v.i = (int)(0x7EF127EA - *(uint32_t *)&x);
    w = x * v.f;
    // Efficient Iterative Approximation Improvement in horner polynomial form.
    v.f = v.f * (2 - w);  // Single iteration, Err = -3.36e-3 * 2^(-flr(log2(x)))
    // v.f = v.f * (4 + w * (-6 + w * (4 - w)));  // Second iteration, Err = -1.13e-5 * 2^(-flr(log2(x)))
    // v.f = v.f * (8 + w * (-28 + w * (56 + w * (-70 + w * (56 + w * (-28 + w * (8 - w)))))));  // Third Iteration, Err = +-6.8e-8 * 2^(-flr(log2(x)))
    return v.f * sx;
}
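A quick way to sanity-check the approximation (my own harness, not part of the answer) is to sweep a range of inputs and compare against the exact reciprocal; with the single-iteration version the relative error should stay small, on the order of the error quoted in the comments above:

#include <stdint.h>
#include <stdio.h>
#include <math.h>

// inv_fast() from above is assumed to be defined before this point.
int main(void) {
    float max_rel = 0.0f;
    for (float x = 0.001f; x < 1000.0f; x *= 1.01f) {
        float rel = fabsf(inv_fast(x) - 1.0f / x) * x;   // |approx - 1/x| / (1/x)
        if (rel > max_rel) max_rel = rel;
    }
    printf("max relative error: %g\n", max_rel);
    return 0;
}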

distance from given point to given ellipse

I have an ellipse, defined by Center Point, radiusX and radiusY, and I have a Point. I want to find the point on the ellipse that is closest to the given point. In the illustration below, that would be S1.
Now I already have code, but there is a logical error somewhere in it, and I seem to be unable to find it. I broke the problem down to the following code example:
#include <vector>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <math.h>
using namespace std;
void dostuff();
int main()
{
dostuff();
return 0;
}
typedef std::vector<cv::Point> vectorOfCvPoints;
void dostuff()
{
const double ellipseCenterX = 250;
const double ellipseCenterY = 250;
const double ellipseRadiusX = 150;
const double ellipseRadiusY = 100;
vectorOfCvPoints datapoints;
for (int i = 0; i < 360; i+=5)
{
double angle = i / 180.0 * CV_PI;
double x = ellipseRadiusX * cos(angle);
double y = ellipseRadiusY * sin(angle);
x *= 1.4;
y *= 1.4;
x += ellipseCenterX;
y += ellipseCenterY;
datapoints.push_back(cv::Point(x,y));
}
cv::Mat drawing = cv::Mat::zeros( 500, 500, CV_8UC1 );
for (int i = 0; i < datapoints.size(); i++)
{
const cv::Point & curPoint = datapoints[i];
const double curPointX = curPoint.x;
const double curPointY = curPoint.y * -1; //transform from image coordinates to geometric coordinates
double angleToEllipseCenter = atan2(curPointY - ellipseCenterY * -1, curPointX - ellipseCenterX); //ellipseCenterY * -1 for transformation to geometric coords (from image coords)
double nearestEllipseX = ellipseCenterX + ellipseRadiusX * cos(angleToEllipseCenter);
double nearestEllipseY = ellipseCenterY * -1 + ellipseRadiusY * sin(angleToEllipseCenter); //ellipseCenterY * -1 for transformation to geometric coords (from image coords)
cv::Point center(ellipseCenterX, ellipseCenterY);
cv::Size axes(ellipseRadiusX, ellipseRadiusY);
cv::ellipse(drawing, center, axes, 0, 0, 360, cv::Scalar(255));
cv::line(drawing, curPoint, cv::Point(nearestEllipseX,nearestEllipseY*-1), cv::Scalar(180));
}
cv::namedWindow( "ellipse", CV_WINDOW_AUTOSIZE );
cv::imshow( "ellipse", drawing );
cv::waitKey(0);
}
It produces the following image:
You can see that it does find "near" points on the ellipse, but they are not the nearest points. What I actually want is this: (excuse my poor drawing)
If you were to extend the lines in the last image, they would cross the center of the ellipse, but this is not the case for the lines in the previous image.
I hope you get the picture. Can anyone tell me what I am doing wrong?
Consider a bounding circle around the given point (c, d), which passes through the nearest point on the ellipse. From the diagram it is clear that the closest point is such that a line drawn from it to the given point must be perpendicular to the shared tangent of the ellipse and circle. Any other points would be outside the circle and so must be further away from the given point.
So the point you are looking for is not the intersection between the line and the ellipse, but the point (x, y) in the diagram.
Gradient of the tangent to the ellipse x^2/a^2 + y^2/b^2 = 1 at (x, y): dy/dx = -(b^2 * x) / (a^2 * y)
Gradient of the line joining (x, y) and the given point (c, d): (y - d) / (x - c)
Condition for perpendicular lines - product of gradients = -1: (-(b^2 * x) / (a^2 * y)) * ((y - d) / (x - c)) = -1
When rearranged and substituted into the equation of your ellipse...
...this will give two nasty quartic (4th-degree polynomial) equations in terms of either x or y. AFAIK there are no general analytical (exact algebraic) methods to solve them. You could try an iterative method - look up the Newton-Raphson iterative root-finding algorithm.
Take a look at this very good paper on the subject:
http://www.spaceroots.org/documents/distance/distance-to-ellipse.pdf
Sorry for the incomplete answer - I totally blame the laws of mathematics and nature...
EDIT: oops, I seem to have a and b the wrong way round in the diagram xD
There is a relatively simple numerical method with better convergence than Newton's method. I have a blog post about why it works: http://wet-robots.ghost.io/simple-method-for-distance-to-ellipse/
This implementation works without any trig functions:
import math

def solve(semi_major, semi_minor, p):
    px = abs(p[0])
    py = abs(p[1])
    tx = 0.707
    ty = 0.707
    a = semi_major
    b = semi_minor
    for x in range(0, 3):
        x = a * tx
        y = b * ty
        ex = (a*a - b*b) * tx**3 / a
        ey = (b*b - a*a) * ty**3 / b
        rx = x - ex
        ry = y - ey
        qx = px - ex
        qy = py - ey
        r = math.hypot(ry, rx)
        q = math.hypot(qy, qx)
        tx = min(1, max(0, (qx * r / q + ex) / a))
        ty = min(1, max(0, (qy * r / q + ey) / b))
        t = math.hypot(ty, tx)
        tx /= t
        ty /= t
    return (math.copysign(a * tx, p[0]), math.copysign(b * ty, p[1]))
Credit to Adrian Stephens for the Trig-Free Optimization.
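Since the question is about C++, a direct port of the routine above might look roughly like this (a sketch under the same assumptions: ellipse centered at the origin, axes along x and y, both semi-axes positive, three fixed iterations):

#include <cmath>
#include <algorithm>
#include <utility>

std::pair<double, double> closest_ellipse_point(double semi_major, double semi_minor,
                                                double px_in, double py_in) {
    double px = std::fabs(px_in), py = std::fabs(py_in);
    double tx = 0.707, ty = 0.707;
    const double a = semi_major, b = semi_minor;
    for (int i = 0; i < 3; ++i) {
        const double x = a * tx, y = b * ty;
        const double ex = (a * a - b * b) * tx * tx * tx / a;   // evolute x
        const double ey = (b * b - a * a) * ty * ty * ty / b;   // evolute y
        const double rx = x - ex, ry = y - ey;
        const double qx = px - ex, qy = py - ey;
        const double r = std::hypot(ry, rx), q = std::hypot(qy, qx);
        tx = std::min(1.0, std::max(0.0, (qx * r / q + ex) / a));
        ty = std::min(1.0, std::max(0.0, (qy * r / q + ey) / b));
        const double t = std::hypot(ty, tx);
        tx /= t;
        ty /= t;
    }
    return { std::copysign(a * tx, px_in), std::copysign(b * ty, py_in) };
}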
Here is the code, translated to C#, implementing the algorithm from this paper to solve for the closest point on the ellipse:
http://www.geometrictools.com/Documentation/DistancePointEllipseEllipsoid.pdf
Note that this code is untested - if you find any errors let me know.
//Pseudocode for robustly computing the closest ellipse point and distance to a query point. It
//is required that e0 >= e1 > 0, y0 >= 0, and y1 >= 0.
//e0,e1 = ellipse dimension 0 and 1, where 0 is greater and both are positive.
//y0,y1 = initial point on ellipse axis (center of ellipse is 0,0)
//x0,x1 = intersection point
double GetRoot ( double r0 , double z0 , double z1 , double g )
{
double n0 = r0*z0;
double s0 = z1 - 1;
double s1 = ( g < 0 ? 0 : Math.Sqrt(n0*n0+z1*z1) - 1 ) ;
double s = 0;
for ( int i = 0; i < maxIter; ++i ){
s = ( s0 + s1 ) / 2 ;
if ( s == s0 || s == s1 ) {break; }
double ratio0 = n0 /( s + r0 );
double ratio1 = z1 /( s + 1 );
g = ratio0*ratio0 + ratio1*ratio1 - 1 ;
if (g > 0) {s0 = s;} else if (g < 0) {s1 = s ;} else {break ;}
}
return s;
}
double DistancePointEllipse( double e0 , double e1 , double y0 , double y1 , out double x0 , out double x1)
{
double distance;
if ( y1 > 0){
if ( y0 > 0){
double z0 = y0 / e0;
double z1 = y1 / e1;
double g = z0*z0+z1*z1 - 1;
if ( g != 0){
double r0 = (e0/e1)*(e0/e1);
double sbar = GetRoot(r0 , z0 , z1 , g);
x0 = r0 * y0 /( sbar + r0 );
x1 = y1 /( sbar + 1 );
distance = Math.Sqrt( (x0-y0)*(x0-y0) + (x1-y1)*(x1-y1) );
}else{
x0 = y0;
x1 = y1;
distance = 0;
}
}
else { // y0 == 0
    x0 = 0; x1 = e1; distance = Math.Abs( y1 - e1 );
}
}else{ // y1 == 0
double numer0 = e0*y0 , denom0 = e0*e0 - e1*e1;
if ( numer0 < denom0 ){
double xde0 = numer0/denom0;
x0 = e0*xde0 ; x1 = e1*Math.Sqrt(1 - xde0*xde0 );
distance = Math.Sqrt( (x0-y0)*(x0-y0) + x1*x1 );
}else{
x0 = e0;
x1 = 0;
distance = Math.Abs( y0 - e0 );
}
}
return distance;
}
The following Python code implements the equations described at "Distance from a Point to an Ellipse" and uses Newton's method to find the roots, and from that the closest point on the ellipse to the point.
Unfortunately, as can be seen from the example, it seems to only be accurate outside the ellipse; within the ellipse weird things happen.
from math import sin, cos, atan2, pi, fabs


def ellipe_tan_dot(rx, ry, px, py, theta):
    '''Dot product of the equation of the line formed by the point
    with another point on the ellipse's boundary and the tangent of the ellipse
    at that point on the boundary.
    '''
    return ((rx ** 2 - ry ** 2) * cos(theta) * sin(theta) -
            px * rx * sin(theta) + py * ry * cos(theta))


def ellipe_tan_dot_derivative(rx, ry, px, py, theta):
    '''The derivative of ellipe_tan_dot.
    '''
    return ((rx ** 2 - ry ** 2) * (cos(theta) ** 2 - sin(theta) ** 2) -
            px * rx * cos(theta) - py * ry * sin(theta))


def estimate_distance(x, y, rx, ry, x0=0, y0=0, angle=0, error=1e-5):
    '''Given a point (x, y), and an ellipse with major - minor axis (rx, ry),
    its center at (x0, y0), and with a counter clockwise rotation of
    `angle` degrees, will return the distance between the ellipse and the
    closest point on the ellipses boundary.
    '''
    x -= x0
    y -= y0
    if angle:
        # rotate the points onto an ellipse whose rx, and ry lay on the x, y
        # axis
        angle = -pi / 180. * angle
        x, y = x * cos(angle) - y * sin(angle), x * sin(angle) + y * cos(angle)

    theta = atan2(rx * y, ry * x)
    while fabs(ellipe_tan_dot(rx, ry, x, y, theta)) > error:
        theta -= ellipe_tan_dot(
            rx, ry, x, y, theta) / \
            ellipe_tan_dot_derivative(rx, ry, x, y, theta)

    px, py = rx * cos(theta), ry * sin(theta)
    return ((x - px) ** 2 + (y - py) ** 2) ** .5
Here's an example:
# imports needed for the example (numpy/matplotlib, as implied by the calls below)
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse

rx, ry = 12, 35  # major, minor ellipse axis
x0 = y0 = 50  # center point of the ellipse
angle = 45  # ellipse's rotation counter clockwise
sx, sy = s = 100, 100  # size of the canvas background

dist = np.zeros(s)
for x in range(sx):
    for y in range(sy):
        dist[x, y] = estimate_distance(x, y, rx, ry, x0, y0, angle)

plt.imshow(dist.T, extent=(0, sx, 0, sy), origin="lower")
plt.colorbar()
ax = plt.gca()
ellipse = Ellipse(xy=(x0, y0), width=2 * rx, height=2 * ry, angle=angle,
                  edgecolor='r', fc='None', linestyle='dashed')
ax.add_patch(ellipse)
plt.show()
Which generates an ellipse and the distance from the boundary of the ellipse as a heat map. As can be seen, at the boundary the distance is zero (deep blue).
Given an ellipse E in parametric form, E(t) = (a*cos t, b*sin t), and a point P = (x, y),
the square of the distance between P and E(t) is
d^2(t) = (a*cos t - x)^2 + (b*sin t - y)^2.
The minimum must satisfy d/dt d^2(t) = 0, i.e.
(b^2 - a^2)*cos t*sin t + x*a*sin t - y*b*cos t = 0.
Using the trigonometric identities cos t = (1 - u^2)/(1 + u^2) and sin t = 2u/(1 + u^2) and substituting u = tan(t/2) yields the following quartic equation:
b*y*u^4 + 2*(a*x - (b^2 - a^2))*u^3 + 2*(a*x + (b^2 - a^2))*u - b*y = 0
Here's an example C function that solves the quartic directly and computes sin(t) and cos(t) for the nearest point on the ellipse:
void nearest(double a, double b, double x, double y, double *ecos_ret, double *esin_ret) {
double ax = fabs(a*x);
double by = fabs(b*y);
double r = b*b - a*a;
double c, d;
int switched = 0;
if (ax <= by) {
if (by == 0) {
if (r >= 0) { *ecos_ret = 1; *esin_ret = 0; }
else { *ecos_ret = 0; *esin_ret = 1; }
return;
}
c = (ax - r) / by;
d = (ax + r) / by;
} else {
c = (by + r) / ax;
d = (by - r) / ax;
switched = 1;
}
double cc = c*c;
double D0 = 12*(c*d + 1); // *-4
double D1 = 54*(d*d - cc); // *4
double D = D1*D1 + D0*D0*D0; // *16
double St;
if (D < 0) {
double t = sqrt(-D0); // *2
double phi = acos(D1 / (t*t*t));
St = 2*t*cos((1.0/3)*phi); // *2
} else {
double Q = cbrt(D1 + sqrt(D)); // *2
St = Q - D0 / Q; // *2
}
double p = 3*cc; // *-2
double SS = (1.0/3)*(p + St); // *4
double S = sqrt(SS); // *2
double q = 2*cc*c + 4*d; // *2
double l = sqrt(p - SS + q / S) - S - c; // *2
double ll = l*l; // *4
double ll4 = ll + 4; // *4
double esin = (4*l) / ll4;
double ecos = (4 - ll) / ll4;
if (switched) {
double t = esin;
esin = ecos;
ecos = t;
}
*ecos_ret = copysign(ecos, a*x);
*esin_ret = copysign(esin, b*y);
}
Try it online!
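A small usage sketch (my addition) showing how to turn the returned ecos/esin pair into the closest point and the distance, for an ellipse centered at the origin:

#include <math.h>
#include <stdio.h>

// nearest() from above is assumed to be defined before this point.
int main(void) {
    double a = 150, b = 100;      // semi-axes
    double px = 210, py = 140;    // query point, in the ellipse's own frame
    double ec, es;
    nearest(a, b, px, py, &ec, &es);
    double qx = a * ec, qy = b * es;   // closest point on the ellipse
    printf("closest point (%.3f, %.3f), distance %.3f\n",
           qx, qy, hypot(px - qx, py - qy));
    return 0;
}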
You just need to calculate the intersection of the line [P1, P0] with your ellipse, which is S1.
If the line equation is:
and the ellipse equation is:
then the values of S1 will be:
Now you just need to calculate the distance between S1 and P1; the formula (for points A, B) is:
I've solved the distance issue via focal points.
For every point on the ellipse
r1 + r2 = 2*a0
where
r1 - Euclidean distance from the given point to focal point 1
r2 - Euclidean distance from the given point to focal point 2
a0 - semimajor axis length
I can also calculate r1 and r2 for any given point, which gives me another ellipse that this point lies on, concentric with the given ellipse. So the distance is
d = Abs((r1 + r2) / 2 - a0)
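A minimal sketch of that estimate (note this is an approximation of the true distance, not the exact point-to-ellipse distance), assuming the ellipse is centered at the origin with semi-major axis a0 along x and semi-minor axis b0 along y:

#include <math.h>

double focal_distance_estimate(double a0, double b0, double px, double py) {
    double c = sqrt(a0 * a0 - b0 * b0);   // distance from center to each focus
    double r1 = hypot(px - c, py);        // distance to focus 1
    double r2 = hypot(px + c, py);        // distance to focus 2
    return fabs((r1 + r2) / 2 - a0);      // |semi-major of the ellipse through the point - a0|
}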
As proposed by user3235832,
you can solve the quartic equation to find the normal to the ellipse (https://www.mathpages.com/home/kmath505/kmath505.htm). With a good initial value only a few iterations are needed (I use it myself). As an initial value I use S1 from your picture.
The fastest method, I guess, is
http://wwwf.imperial.ac.uk/~rn/distance2ellipse.pdf
which has also been mentioned by Matt, but as he found out, the method doesn't work very well inside the ellipse.
The problem is the theta initialization.
I propose a stable initialization:
Find the intersection of ellipse and horizontal line passing the point.
Find the other intersection using vertical line.
Choose the one that is closer the point.
Calculate the initial angle based on that point.
I got good results with no issue inside and outside:
As you can see in the following image, it only iterated about 3 times to reach 1e-8. Close to the axes it takes 1 iteration.
The C++ code is here:
double initialAngle(double a, double b, double x, double y) {
auto abs_x = fabs(x);
auto abs_y = fabs(y);
bool isOutside = false;
if (abs_x > a || abs_y > b) isOutside = true;
double xd, yd;
if (!isOutside) {
xd = sqrt((1.0 - y * y / (b * b)) * (a * a));
if (abs_x > xd)
isOutside = true;
else {
yd = sqrt((1.0 - x * x / (a * a)) * (b * b));
if (abs_y > yd)
isOutside = true;
}
}
double t;
if (isOutside)
t = atan2(a * y, b * x); //The point is outside of ellipse
else {
//The point is inside
if (xd < yd) {
if (x < 0) xd = -xd;
t = atan2(y, xd);
}
else {
if (y < 0) yd = -yd;
t = atan2(yd, x);
}
}
return t;
}
double distanceToElipse(double a, double b, double x, double y, int maxIter = 10, double maxError = 1e-5) {
//std::cout <<"p="<< x << "," << y << std::endl;
auto a2mb2 = a * a - b * b;
double t = initialAngle(a, b, x, y);
auto ct = cos(t);
auto st = sin(t);
int i;
double err;
for (i = 0; i < maxIter; i++) {
auto f = a2mb2 * ct * st - x * a * st + y * b * ct;
auto fp = a2mb2 * (ct * ct - st * st) - x * a * ct - y * b * st;
auto t2 = t - f / fp;
err = fabs(t2 - t);
//std::cout << i + 1 << " " << err << std::endl;
t = t2;
ct = cos(t);
st = sin(t);
if (err < maxError) break;
}
auto dx = a * ct - x;
auto dy = b * st - y;
//std::cout << a * ct << "," << b * st << std::endl;
return sqrt(dx * dx + dy * dy);
}
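Usage is straightforward (a sketch of my own; the query point must first be translated into the ellipse's own frame, i.e. relative to its center and axis-aligned):

#include <cstdio>

// distanceToElipse() from above is assumed to be declared before this point.
int main() {
    double a = 150, b = 100;                 // the OP's ellipse radii
    double px = 400 - 250, py = 300 - 250;   // a query point, relative to the center (250, 250)
    std::printf("distance = %f\n", distanceToElipse(a, b, px, py));
    return 0;
}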

Variable grouping providing different answers in optimized code

I've been attempting to unit test a C++ class I've written for Geodetic transforms.
I've noticed that a trivial change in the grouping of three variables greatly influences the error in the function.
EDIT: Here is the entire function for a compilable example.
Assume latitude, longitude and altitude are zero. Earth::a = 6378137 and Earth::b = 6356752.3. I'm working on getting benchmark numbers; something came up at work today and I had to do that instead.
void Geodesy::Geocentric2EFG(double latitude, double longitude, double altitude, double *E, double *F, double *G) {
    double a2 = pow<double>(Earth::a, 2);
    double b2 = pow<double>(Earth::b, 2);
    double radius = sqrt((a2 * b2) / (a2 * pow<double>(sin(latitude), 2) + b2 * pow<double>(cos(longitude), 2)));
    radius += altitude;
    *E = radius * (cos(latitude) * cos(longitude));
    *F = radius * (cos(latitude) * sin(longitude));
    *G = radius * sin(latitude);
    return;
}
Where all values are defined as double including those in Earth. The pow<T>() function is a recursive template function defined by:
template <typename T>
static inline T pow(const T &base, unsigned const exponent) {
    return (exponent == 0) ? 1 : (base * pow(base, exponent - 1));
}
The code in question:
*E = radius * cos(latitude) * cos(longitude);
*F = radius * cos(latitude) * sin(longitude);
produces different results than:
*E = radius * (cos(latitude) * cos(longitude));
*F = radius * (cos(latitude) * sin(longitude));
What is the compiler doing in gcc with optimization level 3 to make these results 1e-2 different?
You have different rounding because floating point cannot represent all numbers:
a * b * c is (a * b) * c, which may differ from a * (b * c).
You may have similar issues with addition too.
example with addition:
10e10f + 1.f == 10e10f
so (1.f + 10e10f) - 10e10f == 10e10f - 10e10f == 0.f
whereas 1.f + (10e10f - 10e10f) == 1.f + 0.f == 1.f.
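A tiny self-contained demonstration of that non-associativity (my addition, not from the answer):

#include <cstdio>

int main() {
    float big = 10e10f, one = 1.0f;
    std::printf("%g\n", (one + big) - big);   // prints 0: the 1.0f is lost when added to 1e11
    std::printf("%g\n", one + (big - big));   // prints 1: this grouping keeps the 1.0f
    return 0;
}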

Fast Inverse Square Root on x64

I found Fast Inverse Square Root on the net at http://en.wikipedia.org/wiki/Fast_inverse_square_root. Does it work properly on x64?
Has anyone used it and tested it seriously?
Originally, Fast Inverse Square Root was written for a 32-bit float, so as long as you operate on the IEEE-754 floating point representation, there is no way the x64 architecture will affect the result.
Note that for "double" precision floating point (64-bit) you should use another constant:
...the "magic number" for 64 bit IEEE754 size type double ... was shown to be exactly 0x5fe6eb50c7b537a9
Here is an implementation for double precision floats:
#include <cstdint>

double invsqrtQuake( double number )
{
    double y = number;
    double x2 = y * 0.5;
    std::int64_t i = *(std::int64_t *) &y;
    // The magic number for doubles is from https://cs.uwaterloo.ca/~m32rober/rsqrt.pdf
    i = 0x5fe6eb50c7b537a9 - (i >> 1);
    y = *(double *) &i;
    y = y * (1.5 - (x2 * y * y));   // 1st iteration
    // y = y * ( 1.5 - ( x2 * y * y ) );   // 2nd iteration, this can be removed
    return y;
}
I did a few tests and it seems to work fine
Yes, it works if using the correct magic number and corresponding integer type. In addition to the answers above, here's a C++11 implementation that works for both double and float. Conditionals should optimise out at compile time.
#include <cstdint>
#include <type_traits>

template <typename T, char iterations = 2> inline T inv_sqrt(T x) {
    static_assert(std::is_floating_point<T>::value, "T must be floating point");
    static_assert(iterations == 1 or iterations == 2, "iterations must equal 1 or 2");
    typedef typename std::conditional<sizeof(T) == 8, std::int64_t, std::int32_t>::type Tint;
    T y = x;
    T x2 = y * 0.5;
    Tint i = *(Tint *)&y;
    i = (sizeof(T) == 8 ? 0x5fe6eb50c7b537a9 : 0x5f3759df) - (i >> 1);
    y = *(T *)&i;
    y = y * (1.5 - (x2 * y * y));
    if (iterations == 2)
        y = y * (1.5 - (x2 * y * y));
    return y;
}
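One caveat (my note, not from the answer): the pointer casts above are formally undefined behaviour in C++ because of strict aliasing; the same bit reinterpretation can be done with std::memcpy, which compilers typically optimize to the same code. A float-only sketch:

#include <cstdint>
#include <cstring>

float inv_sqrt_memcpy(float x) {
    float y = x, x2 = x * 0.5f;
    std::int32_t i;
    std::memcpy(&i, &y, sizeof i);   // reinterpret the float bits as an integer
    i = 0x5f3759df - (i >> 1);       // same 32-bit magic constant as above
    std::memcpy(&y, &i, sizeof y);
    return y * (1.5f - x2 * y * y);  // one Newton iteration
}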
As for testing, I use the following doctest in my project:
#ifdef DOCTEST_LIBRARY_INCLUDED
TEST_CASE_TEMPLATE("inv_sqrt", T, double, float) {
    std::vector<T> vals = {0.23, 3.3, 10.2, 100.45, 512.06};
    for (auto x : vals)
        CHECK(inv_sqrt<T>(x) == doctest::Approx(1.0 / std::sqrt(x)));
}
#endif