Coefficients in numerical calculations of the exp() function - C++

I am trying to understand the implementation of exp_ps() from http://gruntthepeon.free.fr/ssemath/sse_mathfun.h or exp256_ps() from http://software-lisc.fbk.eu/avx_mathfun/avx_mathfun.h.
I understand almost everything in the calculation, except for how the constant cephes_exp_C2 is determined. It seems that it increases the accuracy of the calculation. If it is removed from the calculation, the resulting function is significantly faster and slightly less precise (the relative error is still under 1% for values around +/- 10). I have found such coefficients in other numerical libraries, but without closer explanation.

After a bit of searching through the Cephes source, I think it's an error in Pommier's translation. This is not the first time I have seen errors in Pommier's code. I recommend using the math library in Gromacs.
From exp.c in Cephes:
static double C1 = 6.93145751953125E-1;
static double C2 = 1.42860682030941723212E-6;
....
px = floor( LOG2E * x + 0.5 );
n = px;
x -= px * C1;
x -= px * C2;
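For context on why there are two constants: C1 and C2 form a Cody-Waite split of ln(2). C1 is ln(2) rounded to a value with only a few significant mantissa bits, so px * C1 is computed exactly, while C2 = ln(2) - C1 carries the remaining digits; dropping C2 therefore subtracts a slightly wrong multiple of ln(2), which is why accuracy degrades. A minimal scalar sketch of the same reduction (my illustration, using the double-precision constants from exp.c):
double n = floor(1.4426950408889634 * x + 0.5); // n = round(x / ln(2)); LOG2E = 1/ln(2)
double r = x - n * 6.93145751953125E-1;         // exact: C1 has few mantissa bits
r -= n * 1.42860682030941723212E-6;             // C2 restores the full precision of ln(2)
// exp(x) == 2^n * exp(r), with |r| <= ln(2)/2, where the polynomial is accurate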
From Pommier,
_PS_CONST(cephes_exp_C1, 0.693359375);
_PS_CONST(cephes_exp_C2, -2.12194440e-4); <-- Wrong value
....
//
// fx = LOG2E * x + 0.5
//
fx = _mm_mul_ps(x, *(v4sf*)_ps_cephes_LOG2EF);
fx = _mm_add_ps(fx, *(v4sf*)_ps_0p5);
//
// fx = floor(fx)
//
emm0 = _mm_cvttps_epi32(fx);       // truncate toward zero
tmp = _mm_cvtepi32_ps(emm0);
v4sf mask = _mm_cmpgt_ps(tmp, fx); // true where truncation rounded up (negative fx)
mask = _mm_and_ps(mask, one);      // turn all-ones mask lanes into 1.0f
fx = _mm_sub_ps(tmp, mask);        // subtract 1.0f in those lanes to get floor(fx)
//
// x -= fx * C1;
// x -= fx * C2; (Using z allows for better ILP in this step)
//
tmp = _mm_mul_ps(fx, *(v4sf*)_ps_cephes_exp_C1);
v4sf z = _mm_mul_ps(fx, *(v4sf*)_ps_cephes_exp_C2);
x = _mm_sub_ps(x, tmp);
x = _mm_sub_ps(x, z);

Related

Efficient floating point scaling in C++

I'm working on a fast (and accurate) sin implementation in C++, and I have a problem regarding the efficient scaling of an arbitrary angle into the +-pi/2 range.
My sin function for +-pi/2 using a Taylor series is the following
(Note: FLOAT is a macro expanded to float or double just for the benchmark)
/**
 * Sin for 'small' angles, accurate on [-pi/2, pi/2], fairly accurate on [-pi, pi]
 */
// To switch between float and double
#define FLOAT float

FLOAT
my_sin_small(FLOAT x)
{
    constexpr FLOAT C1 = 1. / (7. * 6. * 5. * 4. * 3. * 2.);
    constexpr FLOAT C2 = -1. / (5. * 4. * 3. * 2.);
    constexpr FLOAT C3 = 1. / (3. * 2.);
    constexpr FLOAT C4 = -1.;
    // Correction for sin(pi/2) = 1, due to the ignored taylor terms
    constexpr FLOAT corr = -1. / 0.9998431013994987;
    const FLOAT x2 = x * x;
    return corr * x * (x2 * (x2 * (x2 * C1 + C2) + C3) + C4);
}
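(For reference: the truncated series x - x^3/6 + x^5/120 - x^7/5040 evaluates to 0.9998431013994987 at x = pi/2, so dividing by that value pins my_sin_small(pi/2) back to exactly 1.)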
So far so good... The problem comes when I try to scale an arbitrary angle into the +-pi/2 range. My current solution is:
FLOAT
my_sin(FLOAT x)
{
    constexpr FLOAT pi = 3.141592653589793238462;
    constexpr FLOAT rpi = 1 / pi;
    // convert to +-pi/2 range
    int n = std::nearbyint(x * rpi);
    FLOAT xbar = (n * pi - x) * (2 * (n & 1) - 1);
    // (2 * (n & 1) - 1) is a sign correction (see below)
    return my_sin_small(xbar);
}
I made a benchmark, and I'm losing a lot for the +-pi/2 scaling.
Tricking with int(angle/pi + 0.5) is a no-go since it is limited to int precision; it also requires +/- branching, and I try to avoid branches...
What should I try to improve the performance for this scaling? I'm out of ideas.
Benchmark results for float and for double, the sign correction for xbar in my_sin(), and the algorithm's accuracy compared to Python's sin() function were shown as plots (not reproduced here). In the benchmark the angle could be out of the validity range for my_sin_small, but for the bench I don't care about that...
Candidate improvements
Convert the radians x to rotations by dividing by 2*pi.
Retain only the fraction so we have an angle in (-1.0 ... 1.0). This simplifies the OP's modulo step to a simple "drop the whole number" step instead. Going forward, using different angle units simply involves a coefficient set change. There is no need to scale back to radians.
For positive values, subtract 0.5 so we have (-0.5 ... 0.5) and then flip the sign. This centers the possible values about 0.0 and makes for better convergence of the approximating polynomial as compared to the math sine function. For negative values - see below.
Call my_sin_small1() that uses this (-0.5 ... 0.5) rotations range rather than [-pi ... +pi] radians.
In my_sin_small1(), fold constants together to drop the corr * step.
Rather than use the truncated Taylor series, use a more optimal coefficient set. IMO, this will provide better answers, especially near +/-pi.
Notes: no int to/from float conversion code is needed. With more analysis it is possible to get a better set of coefficients that fixes my_sin(+/-pi) closer to 0.0. This is just a quick set of code to demo fewer FP steps and good potential results.
C-like code for the OP to port to C++:
FLOAT my_sin_small1(FLOAT x) {
    static const FLOAT A1 = -5.64744881E+01;
    static const FLOAT A2 = +7.81017968E+01;
    static const FLOAT A3 = -4.11145353E+01;
    static const FLOAT A4 = +6.27923581E+00;
    const FLOAT x2 = x * x;
    return x * (x2 * (x2 * (x2 * A1 + A2) + A3) + A4);
}

FLOAT my_sin1(FLOAT x) {
    static const FLOAT pi = 3.141592653589793238462;
    static const FLOAT pi2i = 1 / (pi * 2);
    x *= pi2i;                                // radians -> rotations
    FLOAT xfraction = 0.5f - (x - truncf(x)); // keep the fraction, centered on 0
    return my_sin_small1(xfraction);
}
For negative values, use -my_sin1(-x) or similar code to flip the sign (a minimal wrapper is sketched below), or add 0.5 instead of subtracting it in the step above.
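A minimal sign-handling wrapper along those lines (the wrapper name is my own):
FLOAT my_sin1_signed(FLOAT x) {
    return x < 0 ? -my_sin1(-x) : my_sin1(x);   // sin is odd: sin(-x) = -sin(x)
}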
Test
#include <math.h>
#include <stdio.h>

// FLOAT as in the question: float or double
#define FLOAT float

int main(void) {
    for (int d = 0; d <= 360; d += 20) {
        FLOAT x = d / 180.0 * M_PI;
        FLOAT y = my_sin1(x);
        printf("%12.6f %11.8f %11.8f\n", x, sin(x), y);
    }
}
Output
0.000000 0.00000000 -0.00022483
0.349066 0.34202013 0.34221691
0.698132 0.64278759 0.64255589
1.047198 0.86602542 0.86590189
1.396263 0.98480775 0.98496443
1.745329 0.98480775 0.98501128
2.094395 0.86602537 0.86603642
2.443461 0.64278762 0.64260530
2.792527 0.34202022 0.34183803
3.141593 -0.00000009 0.00000000
3.490659 -0.34202016 -0.34183764
3.839724 -0.64278757 -0.64260519
4.188790 -0.86602546 -0.86603653
4.537856 -0.98480776 -0.98501128
4.886922 -0.98480776 -0.98496443
5.235988 -0.86602545 -0.86590189
5.585053 -0.64278773 -0.64255613
5.934119 -0.34202036 -0.34221727
6.283185 0.00000017 -0.00022483
The alternate code below gives better results near 0.0, yet might cost a tad more time. The OP seems more inclined toward speed.
FLOAT xfraction = 0.5f - (x - truncf(x));
// vs.
FLOAT xfraction = x - truncf(x);
if (xfraction >= 0.5f) xfraction -= 1.0f;
[Edit]
Below is a better set with about 10% reduced error.
-56.0833765f
77.92947047f
-41.0936875f
6.278635918f
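Dropped into my_sin_small1() in place of A1 through A4 (my transcription, assuming the values map in the same order):
static const FLOAT A1 = -56.0833765f;
static const FLOAT A2 = +77.92947047f;
static const FLOAT A3 = -41.0936875f;
static const FLOAT A4 = +6.278635918f;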
Yet another approach:
Spend more time (code) to reduce the range to ±pi/4 (±45 degrees); then it is possible to use only 3 or 2 terms of a polynomial that is like the usual Taylor series.
float sin_quick_small(float x) {
    const float x2 = x * x;
#if 0
    // max error about 7e-7
    static const float A2 = +0.00811656036940792f;
    static const float A3 = -0.166597759850666f;
    static const float A4 = +0.999994132743861f;
    return x * (x2 * (x2 * A2 + A3) + A4);
#else
    // max error about 0.00016
    static const float A3 = -0.160343346851626f;
    static const float A4 = +0.999031566686144f;
    return x * (x2 * A3 + A4);
#endif
}
float cos_quick_small(float x) {
    return cosf(x); // TBD code.
}
float sin_quick(float x) {
    if (x < 0.0) {
        return -sin_quick(-x);        // sin is odd
    }
    int quo;
    // remainder of x / (pi/2) in [-pi/4, pi/4]; quo receives the low bits of the quotient
    float x90 = remquof(fabsf(x), 3.141592653589793238462f / 2, &quo);
    switch (quo % 4) {                // which quadrant of the circle we are in
        case 0:
            return sin_quick_small(x90);
        case 1:
            return cos_quick_small(x90);
        case 2:
            return sin_quick_small(-x90);
        case 3:
            return -cos_quick_small(x90);
    }
    return 0.0;
}
int main() {
    float max_x = 0.0;
    float max_error = 0.0;
    for (int d = -45; d <= 45; d += 1) {
        FLOAT x = d / 180.0 * M_PI;
        FLOAT y = sin_quick(x);
        double err = fabs(y - sin(x));
        if (err > max_error) {
            max_x = x;
            max_error = err;
        }
        printf("%12.6f %11.8f %11.8f err:%11.8f\n", x, sin(x), y, err);
    }
    printf("x:%.6f err:%.6f\n", max_x, max_error);
    return 0;
}

Integrate function

I have this function to reach a certain one-dimensional value, accelerated and damped with overshoot. That is: given an initial value, a velocity and an acceleration (force/mass), the target value is attained by accelerating towards it, with increasing damping while getting closer to the target value.
This all works fine, however if I want to know what TotalAngle is after time 't', I have to run this function for, say, N steps with a 'small' dt to find the 'limit'.
I was wondering if (and how) I can integrate over dt, so that TotalAngle can be determined directly, given a time 't'.
Thanks for any help.
dt = delta time step per frame
input = 1
TotalAngle = 0 at t=0
Velocity = 0 at t=0
void FAccelDampedWithOvershoot::Update(float dt, float input, float& Velocity, float& TotalAngle)
{
    const float Force = 500000.f;
    const float DampForce = 5000.f;
    const float MaxAngle = 45.f;
    const float InvMass = 1.f / 162400.f;

    float target = MaxAngle * input;
    float ratio = (target - TotalAngle) / MaxAngle;
    float fMove = Force * ratio;
    float fDamp = -Velocity * DampForce;
    Velocity += (fMove + fDamp) * InvMass * dt;
    TotalAngle += Velocity * dt;
}
Updated with fixed bugs in the math
Originally I lost mass and MaxAngle a few times. This is why you should first solve it on paper and then post to SO, rather than trying to solve it in the text editor. Anyway, I've fixed the math and now it seems to work reasonably well; I put the fixed solution right over the previous one.
Well, this looks like Newtonian mechanics, which means differential equations. Let's try to solve them.
SO is not very friendly to math formulas, and I'm a bit bored of typing characters, so here is the notation I use:
F = Force
Fd = DampForce
MA = MaxAngle
A = TotalAngle
v = Velocity
m = 1 / InvMass
' denotes a derivative, i.e. something' is the 1st derivative of something with respect to t, and something'' is the 2nd derivative.
If I divide your last two lines of code by dt and merge in all the other lines, I get (assuming input = 1, as the other case is obviously symmetrical):
v' = ([F * (1 - A / MA)] - v * Fd) / m
and applying A' = v we get
m * A'' = F(1 - A/MA) - Fd * A'
or, moving everything to one side, we get a simple 2nd-order differential equation
m * A'' + Fd * A' + F/MA * A = F
IIRC, the way to solve it is to first solve the characteristic equation, which here is
m * x^2 + Fd * x + F/MA = 0
x[1,2] = (-Fd +/- sqrt(Fd^2 - 4*F*m/MA))/ (2*m)
I expect that the part under the sqrt, i.e. (Fd^2 - 4*F*m/MA), is negative, thus the solution should be of the following form. Let
Dm = Fd/(2*m)
K = sqrt(F/MA/m - Dm^2)
(note the negated value under sqrt so it works now) then
A(t) = e^(-Dm*t) * [P * sin(K*t) + Q * cos(K*t)] + C
where P, Q and C are some constants.
The solution is easier to find as a sum of two solutions: a specific solution for
m * A'' + Fd * A' + F/MA * A = F
and a general solution for the homogeneous equation
m * A'' + Fd * A' + F/MA * A = 0
chosen so that the original conditions fit. Obviously the specific solution A(t) = MA works, and thus C = MA. So now we need to fit P and Q of the general solution to match the starting conditions. To find them we need (for the homogeneous part, so that the full A(0) = 0):
A(0) = - MA
A'(0) = V(0) = 0
Given that e^0 = 1, sin(0) = 0 and cos(0) = 1, you get something like
P = 0
Q = -MA
C = MA
thus
A(t) = MA * [1 - e^(-Dm*t) * cos(K*t)]
where
Dm = Fd/(2*m)
K = sqrt(F/MA/m - Dm^2)
which kind of makes sense given your task.
Note also that this equation assumes that everything happens in radians rather than degrees (i.e. the derivative [sin(t)]' is just cos(t)), so you should transform all your constants accordingly, or transform the solution.
const float Force = 500000.f * M_PI / 180;
const float DampForce = 5000.f * M_PI / 180;
const float MaxAngle = M_PI_4;
which on my machine produces
Dm = 0.000268677541
K = 0.261568546
This seems to be similar to the original function if I step with dt = 0.01f; the main obstacle seems to be precision loss because of float.
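For reference, a direct closed-form evaluation along these lines could look like the sketch below (my code; it uses the P = 0 approximation from above and returns the angle in radians):
#include <math.h>

float TotalAngleAt(float t, float input) {
    const float Force = 500000.f * M_PI / 180;
    const float DampForce = 5000.f * M_PI / 180;
    const float MaxAngle = M_PI_4;
    const float InvMass = 1.f / 162400.f;

    const float m = 1.f / InvMass;
    const float Dm = DampForce / (2 * m);
    const float K = sqrtf(Force / MaxAngle / m - Dm * Dm);
    // A(t) = MA * [1 - e^(-Dm*t) * cos(K*t)], scaled by input for symmetry
    return input * MaxAngle * (1 - expf(-Dm * t) * cosf(K * t));
}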
Hope this helps!
This is not a full answer and I am sure someone else can work it out, but there is no room in the comments and it may help you find a better solution.
The image (not reproduced here) showed the velocity (blue) as your function integrates at time steps of 1. The red curve shows the function below, which calculates the value for time t.
The function F(t)
F(t) = sin((t / f) * pi * 2) * (1 / (((t / f) + a) ^ c)) * b
With f = 23.7, a = 1.4, c = 2, and b = 50, that gives the red plot in the image above.
All the values are just approximations: f determines the frequency and is close to a match; a, b, c control the falloff in amplitude and are a by-eye guesstimate.
If a perfect match does not matter to you, then this will work. totalAngle uses the same function, but with 0.25 added to t. Unfortunately I did not get any values of a, b, c for totalAngle, and I noticed that it was offset, so you will have to add an offset value d (I normalised everything, so I have no idea what the range of totalAngle was).
Function F(t) for totalAngle
F(t) = sin(((t+0.25) / f) * pi * 2) * (1 / ((((t+0.25) / f) + a) ^ c)) * b + d
Sorry, I only have f = 23.7, c = 2, a ≈ 1.4, and nothing for b or d.
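Transcribed into C (my sketch; b and d are left as parameters since no values were found for them):
#include <math.h>

// F(t) = sin(((t + t0) / f) * pi * 2) * (1 / (((t + t0) / f) + a) ^ c) * b + d
// t0 = 0 for the velocity plot, t0 = 0.25 for totalAngle
float F(float t, float t0, float f, float a, float b, float c, float d) {
    float u = (t + t0) / f;
    return sinf(u * 2.0f * (float)M_PI) * b / powf(u + a, c) + d;
}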

Fast approximate float division

On modern processors, float division is a good order of magnitude slower than float multiplication (when measured by reciprocal throughput).
I'm wondering if there are any algorithms out there for computing a fast approximation to x/y, given certain assumptions and tolerance levels. For example, if you assume that 0 < x < y, and are willing to accept any output that is within 10% of the true value, are there algorithms faster than the built-in FDIV operation?
I hope that this helps, because this is probably as close as you're going to get to what you are looking for.
__inline__ double __attribute__((const)) divide( double y, double x ) {
    // calculates y/x
    union {
        double dbl;
        unsigned long long ull;
    } u;
    u.dbl = x;                                              // x = x
    u.ull = ( 0xbfcdd6a18f6a6f52ULL - u.ull ) >> (unsigned char)1;
                                                            // pow( x, -0.5 )
    u.dbl *= u.dbl;        // pow( pow(x,-0.5), 2 ) = pow( x, -1 ) = 1.0/x
    return u.dbl * y;      // (1.0/x) * y = y/x
}
See also:
Another post about reciprocal approximation.
The Wikipedia page.
FDIV is usually much slower than FMUL simply because it cannot be pipelined like multiplication and requires multiple clock cycles for the hardware's iterative convergence process.
The easiest way is to simply recognize that division is nothing more than multiplication of the dividend y by the inverse of the divisor x. The not-so-straightforward part is remembering that a float value is x = m * 2^e, and its inverse is x^-1 = (1/m) * 2^(-e) = (2/m) * 2^(-e-1) = p * 2^q, approximating the new mantissa as p = 2/m ≈ 3 - m, for 1 <= m < 2. This gives a rough piecewise-linear approximation of the inverse function; we can then do a lot better by using the iterative Newton root-finding method to improve that approximation.
Let w = f(x) = 1/x. The inverse of this function f(x) is found by solving for x in terms of w, i.e. x = f^(-1)(w) = 1/w. To improve the output with the root-finding method, we must first create a function whose zero reflects the desired output, i.e. g(w) = 1/w - x, with d/dw(g(w)) = -1/w^2.
w[n+1]= w[n] - g(w[n])/g'(w[n]) = w[n] + w[n]^2 * (1/w[n] - x) = w[n] * (2 - x*w[n])
w[n+1] = w[n] * (2 - x*w[n]), when w[n]=1/x, w[n+1]=1/x*(2-x*1/x)=1/x
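As a plain-C illustration of that iteration (my sketch, no bit tricks; the seed uses the linearization above, halved so that it approximates 1/x directly on [1, 2)):
float recip_newton(float x) {       // assumes 1.0f <= x < 2.0f
    float w = (3.0f - x) * 0.5f;    // linear seed: exact at x = 1 and x = 2
    w = w * (2.0f - x * w);         // one Newton step; the error roughly squares
    w = w * (2.0f - x * w);         // second step
    return w;
}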
These components then combine to give the final piece of code:
#include <stdint.h>

float inv_fast(float x) {
    union { float f; uint32_t u; } v;
    float w, sx;

    sx = (x < 0) ? -1.0f : 1.0f;
    x = sx * x;                       // work with |x|

    v.f = x;
    v.u = 0x7EF127EA - v.u;           // magic-constant first guess at 1/x
    w = x * v.f;                      // w = x * guess, close to 1

    // Efficient iterative approximation improvement in Horner polynomial form.
    v.f = v.f * (2 - w);              // Single iteration, Err = -3.36e-3 * 2^(-flr(log2(x)))
    // v.f = v.f * (4 + w * (-6 + w * (4 - w)));   // Second iteration, Err = -1.13e-5 * 2^(-flr(log2(x)))
    // v.f = v.f * (8 + w * (-28 + w * (56 + w * (-70 + w * (56 + w * (-28 + w * (8 - w))))))); // Third iteration, Err = +-6.8e-8 * 2^(-flr(log2(x)))

    return v.f * sx;                  // restore the sign
}
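Usage is then just a multiplication by the approximate reciprocal, e.g. float q = y * inv_fast(x); approximates y/x with the error bounds given in the comments.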

SSE optimization of Gaussian blur

I'm working on a school project where I have to optimize part of the code with SSE, but I've been stuck on one part for a few days now.
I don't see any smart way of using vector SSE instructions (inline assembler / intrinsic functions) in this code (it's part of a Gaussian blur algorithm). I would be glad if somebody could give me just a small hint.
for (int x = x_start; x < x_end; ++x) // vertical blur...
{
    float sum = image[x + (y_start - radius - 1)*image_w];
    float dif = -sum;

    for (int y = y_start - 2*radius - 1; y < y_end; ++y)
    {   // inner vertical Radius loop
        float p = (float)image[x + (y + radius)*image_w]; // next pixel
        buffer[y + radius] = p;                           // buffer pixel
        sum += dif + fRadius*p;
        dif += p;                                         // accumulate pixel blur

        if (y >= y_start)
        {
            float s = 0, w = 0;                           // border blur correction
            sum -= buffer[y - radius - 1]*fRadius;        // addition for fraction blur
            dif += buffer[y - radius] - 2*buffer[y];      // sum up differences: +1, -2, +1

            // cut off accumulated blur area of pixel beyond the border
            // assume: added pixel values beyond border = value at border
            p = (float)(radius - y);                      // top part to cut off
            if (p > 0)
            {
                p = p*(p-1)/2 + fRadius*p;
                s += buffer[0]*p;
                w += p;
            }

            p = (float)(y + radius - image_h + 1);        // bottom part to cut off
            if (p > 0)
            {
                p = p*(p-1)/2 + fRadius*p;
                s += buffer[image_h - 1]*p;
                w += p;
            }

            new_image[x + y*image_w] = (unsigned char)((sum - s)/(weight - w)); // set blurred pixel
        }
        else if (y + radius >= y_start)
        {
            dif -= 2*buffer[y];
        }
    } // for y
} // for x
One more feature you can use is logical operations and masks:
for example instead of:
// process only 1
if (p > 0)
p = p*(p-1)/2 + fRadius*p;
you can write
// processes 4 floats
const __m128 mask  = _mm_cmpgt_ps(p, _mm_setzero_ps());          // lanes where p > 0, matching the scalar test
const __m128 p_tmp = p*(p-1)/2 + fRadius*p;                      // assumes overloaded operators on __m128
p = _mm_or_ps(_mm_and_ps(p_tmp, mask), _mm_andnot_ps(mask, p));  // p = (p > 0) ? p_tmp : p
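(On SSE4.1 and later the same select can be written directly as p = _mm_blendv_ps(p, p_tmp, mask);.)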
I can also recommend using a special library which overloads instructions for vector types, for example: http://code.compeng.uni-frankfurt.de/projects/vc
The dif variable makes the iterations of the inner loop dependent on each other, so you should try to parallelize the outer loop instead; a sketch follows below. But without instruction overloading the code will become unmanageable.
Also consider rethinking the whole algorithm; the current one doesn't look parallel. Maybe you can sacrifice some precision, or increase the scalar time a bit?
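A minimal sketch of that outer-loop approach, processing four adjacent x columns per iteration (my code, under the assumptions that the image has already been converted to a float array, x_end - x_start is a multiple of 4, and with the border correction from the scalar code omitted for brevity):
#include <emmintrin.h>

void blur_columns_sse(const float* image, float* buffer4 /* 4 floats per y */,
                      int image_w, int x_start, int x_end,
                      int y_start, int y_end, int radius, float fRadius)
{
    const __m128 fr = _mm_set1_ps(fRadius);
    for (int x = x_start; x < x_end; x += 4)              // 4 independent columns
    {
        __m128 sum = _mm_loadu_ps(&image[x + (y_start - radius - 1)*image_w]);
        __m128 dif = _mm_sub_ps(_mm_setzero_ps(), sum);   // dif = -sum
        for (int y = y_start - 2*radius - 1; y < y_end; ++y)
        {
            __m128 p = _mm_loadu_ps(&image[x + (y + radius)*image_w]);
            _mm_storeu_ps(&buffer4[4*(y + radius)], p);   // buffer pixels
            // sum += dif + fRadius*p;  dif += p;  -- one dependency chain per column
            sum = _mm_add_ps(sum, _mm_add_ps(dif, _mm_mul_ps(fr, p)));
            dif = _mm_add_ps(dif, p);
            // the border correction and the final store go here,
            // using compare/mask selects as shown above
        }
    }
}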

Sinusoid and derivative function

I have an object which moves along a sinusoid. I have to animate it each time it reaches the top (or the bottom) of the "wave". I want to do this with the derivative: as we know, it changes sign (from positive to negative or vice versa) at those points. So the code is:
// Start value
int functionValue = +1;
// Function
float y = k1 * sinf(k2 * Deg2Rad(x)) + y_base;
// Derivative function
float tempValue = -cosf(y);
// Check whether value is changed
if (tempValue * functionValue < 0)
{
    animation = true;
}
functionValue = tempValue;
If I output tempValue, it shows strange numbers:
0.851513
0.997643
0.0242145
0.690432
0.326303
-0.614262
0.892036
0.1348
0.709843
0.968676
0.0454846
0.920602
-0.423125
0.692132
-0.960107
0.0799654
-0.747722
-0.635241
0.148477
-0.98611
0.900912
-0.877801
0.811632
-0.362743
-0.233856
0.35512
-0.994107
0.885184
-0.468005
0.982489
0.675337
0.661048
0.870765
0.0312914
-0.319066
-0.602956
-0.996169
-0.95627
And the animation is strange too, not only at the top of the wave. What's wrong here?
You have
y = a * sin(b * x) + c
the derivative of which is
y' = a * b * cos(b * x)
not
y' = -cos(y)
You're doing your math wrong. Derivative of sin(x) is cos(x), not cos(sin(x)).
should be
float tempValue = cosf(k2 * Deg2Rad(x));
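Putting the fixes together, a corrected version of the check might look like this (my sketch; the positive constant factor k1 * k2 * pi/180 in the true derivative does not affect the sign, so it can be dropped):
// The derivative of k1 * sinf(k2 * Deg2Rad(x)) with respect to x is
// k1 * k2 * (pi/180) * cosf(k2 * Deg2Rad(x)); only its sign matters here.
float functionValue = 1.0f;                // must be float, not int: assigning the
                                           // cosine to an int truncates it to 0
// ... per frame:
float tempValue = cosf(k2 * Deg2Rad(x));   // sign of dy/dx
if (tempValue * functionValue < 0)         // sign change => top or bottom reached
{
    animation = true;
}
functionValue = tempValue;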