I am trying to get a value of a variable "pesoil" before its converted to to mm month^-1
pesoil = beta * rnsoil + CPAIR * RHOAIR * exp_h * vpdo / r_as;
pesoil /= (beta + PSY * exp_h * (1.0 + r_ss / r_as))); // W m^-2
pesoil *= (etimes * ndays[dm]) / LAMBDA; // convert to mm month^-1
can I do it like this? or will this mess up value stored in "pesoil"
epotential_W = (pesoil /= (beta + PSY * exp_h * (1.0 + r_ss / r_as))); // W m^-2
I'm working on my fast (and accurate) sin implementation in C++, and I have a problem regarding the efficient angle scaling into the +- pi/2 range.
My sin function for +-pi/2 using Taylor series is the following
(Note: FLOAT is a macro expanded to float or double just for the benchmark)
/**
* Sin for 'small' angles, accurate on [-pi/2, pi/2], fairly accurate on [-pi, pi]
*/
// To switch between float and double
#define FLOAT float
FLOAT
my_sin_small(FLOAT x)
{
constexpr FLOAT C1 = 1. / (7. * 6. * 5. * 4. * 3. * 2.);
constexpr FLOAT C2 = -1. / (5. * 4. * 3. * 2.);
constexpr FLOAT C3 = 1. / (3. * 2.);
constexpr FLOAT C4 = -1.;
// Correction for sin(pi/2) = 1, due to the ignored taylor terms
constexpr FLOAT corr = -1. / 0.9998431013994987;
const FLOAT x2 = x * x;
return corr * x * (x2 * (x2 * (x2 * C1 + C2) + C3) + C4);
}
So far so good... The problem comes when I try to scale an arbitrary angle into the +-pi/2 range. My current solution is:
FLOAT
my_sin(FLOAT x)
{
constexpr FLOAT pi = 3.141592653589793238462;
constexpr FLOAT rpi = 1 / pi;
// convert to +-pi/2 range
int n = std::nearbyint(x * rpi);
FLOAT xbar = (n * pi - x) * (2 * (n & 1) - 1);
// (2 * (n % 2) - 1) is a sign correction (see below)
return my_sin_small(xbar);
};
I made a benchmark, and I'm losing a lot for the +-pi/2 scaling.
Tricking with int(angle/pi + 0.5) is a nope since it is limited to the int precision, also requires +- branching, and i try to avoid branches...
What should I try to improve the performance for this scaling? I'm out of ideas.
Benchmark results for float. (In the benchmark the angle could be out of the validity range for my_sin_small, but for the bench I don't care about that...):
Benchmark results for double.
Sign correction for xbar in my_sin():
Algo accuracy compared to python sin() function:
Candidate improvements
Convert the radians x to rotations by dividing by 2*pi.
Retain only the fraction so we have an angle (-1.0 ... 1.0). This simplifies the OP's modulo step to a simple "drop the whole number" step instead. Going forward with different angle units simply involves a co-efficient set change. No need to scale back to radians.
For positive values, subtract 0.5 so we have (-0.5 ... 0.5) and then flip the sign. This centers the possible values about 0.0 and makes for better convergence of the approximating polynomial as compared to the math sine function. For negative values - see below.
Call my_sin_small1() that uses this (-0.5 ... 0.5) rotations range rather than [-pi ... +pi] radians.
In my_sin_small1(), fold constants together to drop the corr * step.
Rather than use the truncated Taylor's series, use a more optimal set. IMO, this will provide better answers, especially near +/-pi.
Notes: No int to/from float code. With more analysis, possible to get a better set of coefficients that fix my_sin(+/-pi) closer to 0.0. This is just a quick set of code to demo less FP steps and good potential results.
C like code for OP to port to C++
FLOAT my_sin_small1(FLOAT x) {
static const FLOAT A1 = -5.64744881E+01;
static const FLOAT A2 = +7.81017968E+01;
static const FLOAT A3 = -4.11145353E+01;
static const FLOAT A4 = +6.27923581E+00;
const FLOAT x2 = x * x;
return x * (x2 * (x2 * (x2 * A1 + A2) + A3) + A4);
}
FLOAT my_sin1(FLOAT x) {
static const FLOAT pi = 3.141592653589793238462;
static const FLOAT pi2i = 1/(pi * 2);
x *= pi2i;
FLOAT xfraction = 0.5f - (x - truncf(x));
return my_sin_small1(xfraction);
}
For negative values, use -my_sin1(-x) or like code to flip the sign - or add 0.5 in the above minus 0.5 step.
Test
#include <math.h>
#include <stdio.h>
int main(void) {
for (int d = 0; d <= 360; d += 20) {
FLOAT x = d / 180.0 * M_PI;
FLOAT y = my_sin1(x);
printf("%12.6f %11.8f %11.8f\n", x, sin(x), y);
}
}
Output
0.000000 0.00000000 -0.00022483
0.349066 0.34202013 0.34221691
0.698132 0.64278759 0.64255589
1.047198 0.86602542 0.86590189
1.396263 0.98480775 0.98496443
1.745329 0.98480775 0.98501128
2.094395 0.86602537 0.86603642
2.443461 0.64278762 0.64260530
2.792527 0.34202022 0.34183803
3.141593 -0.00000009 0.00000000
3.490659 -0.34202016 -0.34183764
3.839724 -0.64278757 -0.64260519
4.188790 -0.86602546 -0.86603653
4.537856 -0.98480776 -0.98501128
4.886922 -0.98480776 -0.98496443
5.235988 -0.86602545 -0.86590189
5.585053 -0.64278773 -0.64255613
5.934119 -0.34202036 -0.34221727
6.283185 0.00000017 -0.00022483
Alternate code below makes for better results near 0.0, yet might cost a tad more time. OP seems more inclined to speed.
FLOAT xfraction = 0.5f - (x - truncf(x));
// vs.
FLOAT xfraction = x - truncf(x);
if (x >= 0.5f) x -= 1.0f;
[Edit]
Below is a better set with about 10% reduced error.
-56.0833765f
77.92947047f
-41.0936875f
6.278635918f
Yet another approach:
Spend more time (code) to reduce the range to ±pi/4 (±45 degrees), then possible to use only 3 or 2 terms of a polynomial that is like the usually Taylors series.
float sin_quick_small(float x) {
const float x2 = x * x;
#if 0
// max error about 7e-7
static const FLOAT A2 = +0.00811656036940792f;
static const FLOAT A3 = -0.166597759850666f;
static const FLOAT A4 = +0.999994132743861f;
return x * (x2 * (x2 * A2 + A3) + A4);
#else
// max error about 0.00016
static const FLOAT A3 = -0.160343346851626f;
static const FLOAT A4 = +0.999031566686144f;
return x * (x2 * A3 + A4);
#endif
}
float cos_quick_small(float x) {
return cosf(x); // TBD code.
}
float sin_quick(float x) {
if (x < 0.0) {
return -sin_quick(-x);
}
int quo;
float x90 = remquof(fabsf(x), 3.141592653589793238462f / 2, &quo);
switch (quo % 4) {
case 0:
return sin_quick_small(x90);
case 1:
return cos_quick_small(x90);
case 2:
return sin_quick_small(-x90);
case 3:
return -cos_quick_small(x90);
}
return 0.0;
}
int main() {
float max_x = 0.0;
float max_error = 0.0;
for (int d = -45; d <= 45; d += 1) {
FLOAT x = d / 180.0 * M_PI;
FLOAT y = sin_quick(x);
double err = fabs(y - sin(x));
if (err > max_error) {
max_x = x;
max_error = err;
}
printf("%12.6f %11.8f %11.8f err:%11.8f\n", x, sin(x), y, err);
}
printf("x:%.6f err:%.6f\n", max_x, max_error);
return 0;
}
I'm trying to overload the operator to divide two complex numbers
Testing with 3+2i / 4 - 3i
complex g(3, 2);
complex f(4,-3);
cout << g / f << endl;
I added * -1.0 since we go
(4*4) + (3 * -3)i^2 in math which is 25
((3*4) + (3 * -3)* -1) is my intent
Testing I get -0.545455 - 1.72727i
While before I added the * 1.0 I got
0.24 +0.76i
Which is was very close to
0.24 + 0.68i
the answer
complex complex :: operator/ (complex& x) {
complex conjugate = x.conj();
double j = (real * conjugate.real) + (imag * conjugate.imag); // real
double u = (real * conjugate.imag) + (imag * conjugate.real); // imag
double h = (((conjugate.imag * imag)* -1.0) + (real * conjugate.real)) + ( (real*conjugate.imag) + (imag * conjugate.real));
return complex(j/h,u/h);
}
This is wrong:
double j = (real * conjugate.real) + (imag * conjugate.imag); // real
should be -, not +.
This is right:
double u = (real * conjugate.imag) + (imag * conjugate.real); // imag
Although both j and u are meaninglessly named.
What's going on here?
double h = (((conjugate.imag * imag)* -1.0) + (real * conjugate.real)) + ( (real*conjugate.imag) + (imag * conjugate.real));
The denominator is just x*conjugate which is:
double denom = x.real * x.real + x.imag * x.imag;
Side-note, you want to take your argument by reference to const.
I've been attempting to unit test a C++ class I've written for Geodetic transforms.
I've noticed that a trivial grouping change of three variables greatly influences the error in the function.
EDIT : Here is the entire function for a compilable example:
Assume latitude, longitude and altitude are zero. Earth::a = 6378137 and Earth::b = 6356752.3 I'm working on getting benchmark numbers, something came up at work today and I had to do that instead.
void Geodesy::Geocentric2EFG(double latitude, double longitude, double altitude, double *E, double *F, double *G) {
double a2 = pow<double>(Earth::a, 2);
double b2 = pow<double>(Earth::b, 2);
double radius = sqrt((a2 * b2)/(a2 * pow<double>(sin(latitude), 2) + b2 * pow<double>(cos(longitude), 2)));
radius += altitude;
*E = radius * (cos(latitude) * cos(longitude));
*F = radius * (cos(latitude) * sin(longitude));
*G = radius * sin(latitude);
return;
}
Where all values are defined as double including those in Earth. The pow<T>() function is a recursive template function defined by:
template <typename T>
static inline T pow(const T &base, unsigned const exponent) {
return (exponent == 0) ? 1 : (base * pow(base, exponent - 1));
}
The code in question:
*E = radius * cos(latitude) * cos(longitude);
*F = radius * cos(latitude) * sin(longitude);
produces different results than:
*E = radius * (cos(latitude) * cos(longitude));
*F = radius * (cos(latitude) * sin(longitude));
What is the compiler doing in gcc with optimization level 3 to make these results 1e-2 different?
You have different rounding as floating point cannot represent all numbers:
a * b * c; is (a * b) * c which may differ than a * (b * c).
You may have similar issues with addition too.
example with addition:
10e10f + 1.f == 10e10f
so (1.f + 10e10f) - 10e10f == 10e10f - 10e10f == 0.f
whereas 1.f + (10e10f - 10e10f) == 1.f - 0.f == 1.f.
I think I understand why calling glRotate(#, 0, 0, 0) results in a divide-by-zero. The rotation vector, a, is normalized: a' = a/|a| = a/0
Is that the only situation glRotate could result in a divide-by-zero? Yes, I know glRotate is deprecated. Yes, I know the matrix is on the OpenGL manual. No, I don't know linear algebra enough to confidently answer the question from the matrix. Yes, I think it would help. Yes, I asked this already in #opengl (can you tell?). And no, I didn't get an answer.
I would say yes. And I would say that you are right about the normalization step as well. The matrix shown in the OpenGL manual only consists of multiplications. And multiplying a vector would result into the same. Of course, it would do strange things if you result in a vector of (0,0,0). OpenGL states in the same manual that |x,y,z|=1 (or OpenGL will normalize).
So IF it wouldn't normalize, you would end up with a very empty matrix of:
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 1
Which will implode your object in the strangest ways. So DON'T call this function with a zero-vector. If you would like to, tell me why.
And I recommend using a library like GLM to do your matrix calculations if it gets too complicated for some simple glRotates.
Why should it divide by zero when you can check for that?:
/**
* Generate a 4x4 transformation matrix from glRotate parameters, and
* post-multiply the input matrix by it.
*
* \author
* This function was contributed by Erich Boleyn (erich#uruk.org).
* Optimizations contributed by Rudolf Opalla (rudi#khm.de).
*/
void
_math_matrix_rotate( GLmatrix *mat,
GLfloat angle, GLfloat x, GLfloat y, GLfloat z )
{
GLfloat xx, yy, zz, xy, yz, zx, xs, ys, zs, one_c, s, c;
GLfloat m[16];
GLboolean optimized;
s = (GLfloat) sin( angle * DEG2RAD );
c = (GLfloat) cos( angle * DEG2RAD );
memcpy(m, Identity, sizeof(GLfloat)*16);
optimized = GL_FALSE;
#define M(row,col) m[col*4+row]
if (x == 0.0F) {
if (y == 0.0F) {
if (z != 0.0F) {
optimized = GL_TRUE;
/* rotate only around z-axis */
M(0,0) = c;
M(1,1) = c;
if (z < 0.0F) {
M(0,1) = s;
M(1,0) = -s;
}
else {
M(0,1) = -s;
M(1,0) = s;
}
}
}
else if (z == 0.0F) {
optimized = GL_TRUE;
/* rotate only around y-axis */
M(0,0) = c;
M(2,2) = c;
if (y < 0.0F) {
M(0,2) = -s;
M(2,0) = s;
}
else {
M(0,2) = s;
M(2,0) = -s;
}
}
}
else if (y == 0.0F) {
if (z == 0.0F) {
optimized = GL_TRUE;
/* rotate only around x-axis */
M(1,1) = c;
M(2,2) = c;
if (x < 0.0F) {
M(1,2) = s;
M(2,1) = -s;
}
else {
M(1,2) = -s;
M(2,1) = s;
}
}
}
if (!optimized) {
const GLfloat mag = SQRTF(x * x + y * y + z * z);
if (mag <= 1.0e-4) {
/* no rotation, leave mat as-is */
return;
}
x /= mag;
y /= mag;
z /= mag;
/*
* Arbitrary axis rotation matrix.
*
* This is composed of 5 matrices, Rz, Ry, T, Ry', Rz', multiplied
* like so: Rz * Ry * T * Ry' * Rz'. T is the final rotation
* (which is about the X-axis), and the two composite transforms
* Ry' * Rz' and Rz * Ry are (respectively) the rotations necessary
* from the arbitrary axis to the X-axis then back. They are
* all elementary rotations.
*
* Rz' is a rotation about the Z-axis, to bring the axis vector
* into the x-z plane. Then Ry' is applied, rotating about the
* Y-axis to bring the axis vector parallel with the X-axis. The
* rotation about the X-axis is then performed. Ry and Rz are
* simply the respective inverse transforms to bring the arbitrary
* axis back to its original orientation. The first transforms
* Rz' and Ry' are considered inverses, since the data from the
* arbitrary axis gives you info on how to get to it, not how
* to get away from it, and an inverse must be applied.
*
* The basic calculation used is to recognize that the arbitrary
* axis vector (x, y, z), since it is of unit length, actually
* represents the sines and cosines of the angles to rotate the
* X-axis to the same orientation, with theta being the angle about
* Z and phi the angle about Y (in the order described above)
* as follows:
*
* cos ( theta ) = x / sqrt ( 1 - z^2 )
* sin ( theta ) = y / sqrt ( 1 - z^2 )
*
* cos ( phi ) = sqrt ( 1 - z^2 )
* sin ( phi ) = z
*
* Note that cos ( phi ) can further be inserted to the above
* formulas:
*
* cos ( theta ) = x / cos ( phi )
* sin ( theta ) = y / sin ( phi )
*
* ...etc. Because of those relations and the standard trigonometric
* relations, it is pssible to reduce the transforms down to what
* is used below. It may be that any primary axis chosen will give the
* same results (modulo a sign convention) using thie method.
*
* Particularly nice is to notice that all divisions that might
* have caused trouble when parallel to certain planes or
* axis go away with care paid to reducing the expressions.
* After checking, it does perform correctly under all cases, since
* in all the cases of division where the denominator would have
* been zero, the numerator would have been zero as well, giving
* the expected result.
*/
xx = x * x;
yy = y * y;
zz = z * z;
xy = x * y;
yz = y * z;
zx = z * x;
xs = x * s;
ys = y * s;
zs = z * s;
one_c = 1.0F - c;
/* We already hold the identity-matrix so we can skip some statements */
M(0,0) = (one_c * xx) + c;
M(0,1) = (one_c * xy) - zs;
M(0,2) = (one_c * zx) + ys;
/* M(0,3) = 0.0F; */
M(1,0) = (one_c * xy) + zs;
M(1,1) = (one_c * yy) + c;
M(1,2) = (one_c * yz) - xs;
/* M(1,3) = 0.0F; */
M(2,0) = (one_c * zx) - ys;
M(2,1) = (one_c * yz) + xs;
M(2,2) = (one_c * zz) + c;
/* M(2,3) = 0.0F; */
/*
M(3,0) = 0.0F;
M(3,1) = 0.0F;
M(3,2) = 0.0F;
M(3,3) = 1.0F;
*/
}
#undef M
matrix_multf( mat, m, MAT_FLAG_ROTATION );
}