I would like to compute the angle between two 3d vectors. I'm using the following equation (source of equation) to achieve this:
diffangle = atan2(norm(cross(v1,v2)),dot(v1,v2))
The components of v1 and v2 are given as float, but since I have very small angles I would like the difference angle to be a double. My current implementation is as follows:
double angle(float x1, float y1, float z1, float x2, float y2, float z2)
{
double dot = x1*x2 + y1*y2 + z1*z2;
double crossX = y1*z2-z1*y2;
double crossY = z1*x2-x1*z2;
double crossZ = x1*y2-y1*x2;
double norm = sqrt(crossX*crossX+crossY*crossY+crossZ*crossZ);
return (atan2(norm,dot)/M_PI*180);
}
Does my implementation do what it should, or do I have to cast something or take other things into account?
Thanks for your help.
Regarding the issue of accuracy: In C (and C++), floats will be promoted to double when a double is involved in the calculation (the same happens when an int is combined with a long int).
So the expression
double z = x*y;
first computes x*y in single precision (float) when x and y are both float, and only then converts the result to double. To actually perform the calculation in double precision, you need to cast one of the operands:
double z = (double)x*y;
The simplest solution, however, would be to change the function declaration to accept double. This way, the float values would be promoted when calling the function and all calculations will use double precision.
The picture contains what I've tried...I assume something might be wrong in the normalize function but I'm not sure what.
Two vectors are perpendicular (orthogonal) if their dot product is zero. You don't need to normalize anything. Here's a function that checks whether two vectors are perpendicular:
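#include <math.h> /* fabs */

int arePerpendicular(double x1, double y1, double x2, double y2)
{
    double epsilon = 0.1; /* threshold below which the dot product counts as 0 */
    double dot = x1 * x2 + y1 * y2;
    /* fabs, not abs: abs takes an int and would truncate dot */
    if (fabs(dot) < epsilon)
    {
        return 1;
    }
    return 0;
}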
x1 and y1 are the components of the first vector, and x2 and y2 are the components of the second. epsilon is a threshold value: absolute values less than it are considered 0. This is necessary because an explicit == 0 check using floating point arithmetic is likely to fail due to the very nature of float and double. In fact, when comparing two floating point variables for equality you should take the absolute value of their difference and check whether it is below a certain threshold.
I've written a simple program to calculate the first and second derivative of a function, using function pointers. My program computes the correct answers (more or less), but for some functions, the accuracy is less than I would like.
This is the function I am differentiating:
float f1(float x) {
return (x * x);
}
These are the derivative functions, using the central finite difference method:
// Function for calculating the first derivative.
float first_dx(float (*fx)(float), float x) {
float h = 0.001;
float dfdx;
dfdx = (fx(x + h) - fx(x - h)) / (2 * h);
return dfdx;
}
// Function for calculating the second derivative.
float second_dx(float (*fx)(float), float x) {
float h = 0.001;
float d2fdx2;
d2fdx2 = (fx(x - h) - 2 * fx(x) + fx(x + h)) / (h * h);
return d2fdx2;
}
Main function:
int main() {
pc.baud(9600);
float x = 2.0;
pc.printf("**** Function Pointers ****\r\n");
pc.printf("Value of f(%f): %f\r\n", x, f1(x));
pc.printf("First derivative: %f\r\n", first_dx(f1, x));
pc.printf("Second derivative: %f\r\n\r\n", second_dx(f1, x));
}
This is the output from the program:
**** Function Pointers ****
Value of f(2.000000): 4.000000
First derivative: 3.999948
Second derivative: 1.430511
I'm happy with the accuracy of the first derivative, but I believe the second derivative is too far off (it should be equal to ~2.0).
I have a basic understanding of how floating point numbers are represented and why they are sometimes inaccurate, but how can I make this second derivative result more accurate? Could I be using something better than the central finite difference method, or is there a way I can get better results with the current method?
The accuracy can be increased by choosing a type which has more precision. float is currently defined as an IEEE-754 32-bit number, giving you a precision of ~7.225 decimal places.
What you want is the 64-bit counterpart: double with ~15.955 decimal places accuracy.
That should be sufficient for your calculation. It is also worth mentioning Boost's implementation, which offers a quadruple-precision (128-bit) floating point number.
Finally, the GNU Multiple Precision Arithmetic Library offers types with an arbitrary number of decimal places of precision.
Go analytical. ;-) Probably not an option given "with the current method".
Use double instead of float.
Vary the epsilon (h), and combine the results in some way. For example you could try 0.00001, 0.000001, 0.0000001 and average them. In fact, you'd want the result with the smallest h that doesn't overflow/underflow. But it's not clear how to detect overflow and underflow.
How can I work with complex numbers in C? I see there is a complex.h header file, but it doesn't give me much information about how to use it. How can I access the real and imaginary parts efficiently? Are there native functions to get the modulus and phase?
This code will help you, and it's fairly self-explanatory:
#include <stdio.h> /* Standard Library of Input and Output */
#include <complex.h> /* Standard Library of Complex Numbers */
int main() {
double complex z1 = 1.0 + 3.0 * I;
double complex z2 = 1.0 - 4.0 * I;
printf("Working with complex numbers:\n\v");
printf("Starting values: Z1 = %.2f + %.2fi\tZ2 = %.2f %+.2fi\n", creal(z1), cimag(z1), creal(z2), cimag(z2));
double complex sum = z1 + z2;
printf("The sum: Z1 + Z2 = %.2f %+.2fi\n", creal(sum), cimag(sum));
double complex difference = z1 - z2;
printf("The difference: Z1 - Z2 = %.2f %+.2fi\n", creal(difference), cimag(difference));
double complex product = z1 * z2;
printf("The product: Z1 x Z2 = %.2f %+.2fi\n", creal(product), cimag(product));
double complex quotient = z1 / z2;
printf("The quotient: Z1 / Z2 = %.2f %+.2fi\n", creal(quotient), cimag(quotient));
double complex conjugate = conj(z1);
printf("The conjugate of Z1 = %.2f %+.2fi\n", creal(conjugate), cimag(conjugate));
return 0;
}
with:
creal(z1): get the real part (for float crealf(z1), for long double creall(z1))
cimag(z1): get the imaginary part (for float cimagf(z1), for long double cimagl(z1))
Another important point to remember when working with complex numbers is that functions like cos(), exp() and sqrt() must be replaced with their complex forms, e.g. ccos(), cexp(), csqrt().
Complex types have been part of the C language since the C99 standard (-std=c99 option of GCC). Some compilers may implement complex types even in earlier modes, but this is a non-standard and non-portable extension (e.g. IBM XL, GCC, maybe Intel, ...).
You can start from http://en.wikipedia.org/wiki/Complex.h - it gives a description of functions from complex.h
This manual http://pubs.opengroup.org/onlinepubs/009604499/basedefs/complex.h.html also gives some info about macros.
To declare a complex variable, use
double _Complex a; // use c* functions without suffix
or
float _Complex b; // use c*f functions - with f suffix
long double _Complex c; // use c*l functions - with l suffix
To assign a value to a complex variable, use the _Complex_I macro from complex.h:
float _Complex d = 2.0f + 2.0f*_Complex_I;
(In practice this form can cause problems for values with a negative zero imaginary part, such as (0, -0i), and for values where only one of the two parts is NaN.)
The modulus is cabs(a)/cabsl(c)/cabsf(b); the real part is creal(a) and the imaginary part is cimag(a). carg(a) returns the complex argument (phase).
To directly access (read/write) the real and imaginary parts you may use this non-portable GCC extension:
__real__ a = 1.4;
__imag__ a = 2.0;
float b = __real__ a;
For convenience, one may include the tgmath.h header for its type-generic macros. It provides a single name, the same as the double version of the function, for all argument types. For example, it defines a sqrt() macro that expands to the sqrtf(), sqrt(), or sqrtl() function, depending on the type of the argument provided (and to the complex versions such as csqrt() for complex arguments).
So one doesn't need to remember the corresponding function name for each type of variable!
#include <stdio.h>
#include <tgmath.h>  /* for the type-generic macros */
#include <complex.h> /* for easier declaration of complex variables and the imaginary unit I */
int main(void)
{
double complex z1=1./4.*M_PI+1./4.*M_PI*I;//M_PI is just pi=3.1415...
double complex z2, z3, z4, z5;
z2=exp(z1);
z3=sin(z1);
z4=sqrt(z1);
z5=log(z1);
printf("exp(z1)=%lf + %lf I\n", creal(z2),cimag(z2));
printf("sin(z1)=%lf + %lf I\n", creal(z3),cimag(z3));
printf("sqrt(z1)=%lf + %lf I\n", creal(z4),cimag(z4));
printf("log(z1)=%lf + %lf I\n", creal(z5),cimag(z5));
return 0;
}
The notion of complex numbers was introduced in mathematics out of the need to calculate square roots of negative numbers. The concept was later adopted by a variety of engineering fields.
Today complex numbers are widely used in advanced domains such as physics, electronics, mechanics, astronomy, etc.
Example: the real and imaginary parts of the square root of a negative number:
#include <stdio.h>
#include <complex.h>
int main()
{
int negNum;
printf("Calculate negative square roots:\n"
"Enter negative number:");
scanf("%d", &negNum);
double complex negSqrt = csqrt(negNum);
double pReal = creal(negSqrt);
double pImag = cimag(negSqrt);
printf("\nReal part %f, imaginary part %f"
", for negative square root.(%d)",
pReal, pImag, negNum);
return 0;
}
To extract the real part of a complex-valued expression z, GCC provides the notation __real__ z as an extension.
Similarly, use __imag__ z to extract the imaginary part.
For example:
__complex__ float z;
float r;
float i;
r = __real__ z;
i = __imag__ z;
r is the real part of the complex number "z"
i is the imaginary part of the complex number "z"
I am writing a program for class that simply calculates distance between two coordinate points (x,y).
differenceofx1 = x1 - x2;
differenceofy1 = y1 - y2;
squareofx1 = differenceofx1 * differenceofx1;
squareofy1 = differenceofy1 * differenceofy1;
distance1 = sqrt(squareofx1 - squareofy1);
When I calculate the distance, it works. However there are some situations such as the result being a square root of a non-square number, or the difference of x1 and x2 / y1 and y2 being negative due to the input order, that it just gives a distance of 0.00000 when the distance is clearly more than 0. I am using double for all the variables, should I use float instead for the negative possibility or does double do the same job? I set the precision to 8 as well but I don't understand why it wouldn't calculate properly?
I am sorry for the simplicity of the question, I am a bit more than a beginner.
You are using the distance formula incorrectly.
It should be
distance1 = sqrt(squareofx1 + squareofy1);
instead of
distance1 = sqrt(squareofx1 - squareofy1);
With the wrong formula, whenever squareofx1 is less than squareofy1 you take the square root of a negative number, which has no real result.
Firstly, your formula is incorrect; change it to distance1 = sqrt(squareofx1 + squareofy1) as @fefe mentioned. By the way, the whole calculation can be written in one line of code:
distance1 = sqrt((x1-x2)*(x1-x2) + (y1-y2)*(y1-y2));
No need for variables like differenceofx1, differenceofy1, squareofx1, squareofy1 unless you are using the results stored in these variables again in your program.
Secondly, double gives you more precision than float. If you need more than 6-7 significant decimal digits, use double; otherwise float works too. Squaring a difference always gives a non-negative result, so negative differences are not a problem for either type.
I'm computing the ordinate y of a point on a line at a given abscissa x. The line is defined by its two end points coordinates (x0,y0)(x1,y1). End points coordinates are floats and the computation must be done in float precision for use in GPU.
The maths, and thus the naive implementation, are trivial.
Let t = (x - x0)/(x1 - x0), then y = (1 - t) * y0 + t * y1 = y0 + t * (y1 - y0).
The problem arises when x1 - x0 is small: the subtraction suffers from cancellation error. Combined with the cancellation error in x - x0, I expect the division to produce a significant error in t.
The question is whether there is another way to determine y with better accuracy.
For example, should I compute (x - x0)*(y1 - y0) first and divide by (x1 - x0) afterwards?
The difference y1 - y0 will always be big.
To a large degree, your underlying problem is fundamental. When (x1-x0) is small, it means there are only a few bits in the mantissa of x1 and x0 which differ. By extension, there are only a limited number of floats between x0 and x1. E.g. if only the lower 4 bits of the mantissa differ, there are at most 14 values between them.
In your best algorithm, the t term represents these lower bits. To continue our example, if x0 and x1 differ only in the lower 4 bits, then t can take on only 16 values as well. The calculation of these possible values is fairly robust. Whether you're calculating 3E0/14E0 or 3E-12/14E-12, the result is going to be close to the mathematical value of 3/14.
Your formula has the additional advantage of having y0 <= y <= y1, since 0 <= t <= 1
(I'm assuming that you know enough about float representations, and therefore "(x1-x0) is small" really means "small, relative to the values of x1 and x0 themselves". A difference of 1E-1 is small when x0=1E3 but large if x0=1E-6 )
You may have a look at the sources of Qt's "QLine" (if I remember it right); they have implemented an intersection determination algorithm taken from one of the "Graphics Gems" books (the reference should be in the code comments), which, in turn, has some guarantees on applicability for a given screen resolution when calculations are performed with a given bit width (they use fixed-point arithmetic, if I'm not wrong).
If you have the possibility to do it, you can introduce two cases in your computation, depending on abs(x1-x0) < abs(y1-y0). In the vertical case abs(x1-x0) < abs(y1-y0), compute x from y instead of y from x.
EDIT. Another possibility would be to obtain the result bit by bit using a variant of dichotomic search. This will be slower, but may improve the result in extreme cases.
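// Bisection sketch: halve the x interval and its matching y interval together.
// Input is x, assumed to lie between x0 and x1.
float interpolateBisect(float x, float x0, float y0, float x1, float y1)
{
    // Order the end points by x, keeping each y paired with its own x
    // (taking min/max of y separately would break lines with negative slope).
    float xlo = x0, ylo = y0, xhi = x1, yhi = y1;
    if (x0 > x1) { xlo = x1; ylo = y1; xhi = x0; yhi = y0; }
    for (int i = 0; i < 20; i++) // get 20 bits in result
    {
        float xmid = (xlo + xhi) * 0.5f;
        float ymid = (ylo + yhi) * 0.5f;
        if (x < xmid) { xhi = xmid; yhi = ymid; } // first half
        else          { xlo = xmid; ylo = ymid; } // second half
    }
    // Output is some value between ylo and yhi
    return ylo;
}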
I have implemented a benchmark program to compare the effect of the different expression.
I computed y using double precision and then computed y using single precision with different expressions.
Here are the expressions tested:
inline double getYDbl( double x, double x0, double y0, double x1, double y1 )
{
double const t = (x - x0)/(x1 - x0);
return y0 + t*(y1 - y0);
}
inline float getYFlt1( float x, float x0, float y0, float x1, float y1 )
{
double const t = (x - x0)/(x1 - x0);
return y0 + t*(y1 - y0);
}
inline float getYFlt2( float x, float x0, float y0, float x1, float y1 )
{
double const t = (x - x0)*(y1 - y0);
return y0 + t/(x1 - x0);
}
inline float getYFlt3( float x, float x0, float y0, float x1, float y1 )
{
double const t = (y1 - y0)/(x1 - x0);
return y0 + t*(x - x0);
}
inline float getYFlt4( float x, float x0, float y0, float x1, float y1 )
{
double const t = (x1 - x0)/(y1 - y0);
return y0 + (x - x0)/t;
}
I computed the average and stdDev of the difference between the double precision result and single precision result.
The result is that there is no difference on average over 1000 and 10K random value sets. I used the icc compiler with and without optimization, as well as g++.
Note that I had to use the isnan() function to filter out bogus values. I suspect these result from underflow in the difference or division.
I don't know if the compilers rearrange the expression.
Anyway, the conclusion from this test is that the above rearrangements of the expression have no effect on the computation precision. The error remains the same (on average).
Check whether the distance between x0 and x1 is small, i.e. fabs(x1 - x0) < eps. If so, the line is (nearly) parallel to the y axis of the coordinate system, i.e. you can't calculate the y values of that line as a function of x. There are infinitely many y values, so you have to treat this case differently.
If your source data is already a float then you already have fundamental inaccuracy.
To explain further, imagine you were doing this graphically. You have a 2D sheet of graph paper with two points marked.
Case 1: The points are very accurate and have been marked with a very sharp pencil. It's easy to draw the line joining them, and easy to then get y given x (or vice versa).
Case 2: The points have been marked with a big fat felt tip pen, like a bingo marker. Clearly the line you draw will be less accurate. Do you go through the centre of the spots? The top edge? The bottom edge? Top of one, bottom of the other? Clearly there are many different options. If the two dots are close to each other then the variation will be even greater.
Floats have a certain level of inaccuracy inherent in them, due to the way they represent numbers, ergo they correspond more to case 2 than case 1 (which one could suggest is the equivalent of using an arbitrary precision library). No algorithm in the world can compensate for that. Imprecise data in, imprecise data out.
How about computing something like:
t = sign * power2 ( sqrt (abs(x - x0))/ sqrt (abs(x1 - x0)))
The idea is to use a mathematically equivalent formula in which a small (x1-x0) has less effect.
(I'm not sure whether the one I wrote meets this criterion.)
As MSalters said, the problem is already in the original data.
Interpolation / extrapolation requires the slope, which already has low accuracy in the given conditions (worst for very short line segments far away from the origin).
The choice of algorithm cannot regain this accuracy loss. My gut feeling is that a different evaluation order will not change things, as the error is introduced by the subtractions, not the division.
Idea:
If you have more accurate data when the lines are generated, you can change the representation from ((x0, y0), (x1, y1)) to (x0,y0, angle, length). You could store angle or slope, slope has a pole, but angle requires trig functions... ugly.
Of course that won't work if you need the end point frequently, and you have so many lines that you can't store additional data, I have no idea. But maybe there is another representation that works well for your needs.
doubles have enough resolution in most situations, but that would double the working set too.