Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I know this may seem like a duplicate question, but I could not find my answer in previous questions.
How do I write a log base 10 function using simple loops, without using the built-in log function in C++?
The easiest way is to calculate the natural logarithm (ln) with a Taylor series. Once you have found the natural logarithm, just divide it by ln(10) and you get the base-10 log.
The Taylor series is quite simple to implement in C++. If z is the number whose log you are seeking, you just loop for a few iterations, multiplying an accumulator by (z-1) each time. Up to a limit, the more iterations you run, the more accurate your result will be. Check it a few times against the libc log10() version until you are happy with the precision.
This is a "numeric approach". There are other numeric solutions to finding the logarithm of a number which can give more accurate results. Some of them can be found in that Wikipedia link I gave you.
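A minimal sketch of this approach in C++ (the names taylor_ln and taylor_log10 are mine, and the square-root range reduction is one common way to keep the series convergent, not necessarily how a library would do it):

```cpp
#include <cmath>

// Taylor series ln(1+u) = u - u^2/2 + u^3/3 - ...  (converges for |u| < 1).
// Square-rooting z until it is near 1 keeps u small; each sqrt halves the
// logarithm, so the result is scaled back up afterwards.
double taylor_ln(double z) {
    int halvings = 0;
    while (z > 1.1 || z < 0.9) {
        z = std::sqrt(z);
        ++halvings;
    }
    double u = z - 1.0, term = u, sum = 0.0;
    for (int n = 1; n <= 40; ++n) {   // more iterations -> more accuracy
        sum += term / n;
        term *= -u;
    }
    return sum * (1 << halvings);
}

// As described above: divide the natural log by ln(10) to get log10.
double taylor_log10(double z) {
    return taylor_ln(z) / taylor_ln(10.0);
}
```

Comparing taylor_log10 against the library log10() for a few inputs is a quick sanity check; raise the iteration count or tighten the range-reduction bounds for more precision.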
Assuming by "log base 10" you mean "the number of times n can be divided by 10 before resulting in a value < 10":
log = 0;
// Assume n has initial value N
while (n >= 10) {
    // Invariant: N = n * 10^log
    n /= 10;
    log += 1;
}
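Wrapped up as a complete function (the name ilog10 is just illustrative), the loop above looks like:

```cpp
// Integer log base 10: how many times n can be divided by 10
// before the result drops below 10.
int ilog10(unsigned n) {
    int log = 0;
    while (n >= 10) {
        n /= 10;      // invariant: original n == current n * 10^log
        log += 1;
    }
    return log;
}
```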
You'll get faster convergence with Newton's method. Use something like this (hand-written, not compiled or tested; it uses f(r) = 2^r - x to compute log2(x)):
#include <cmath>

double next(double r, double x) {
    static const double one_over_ln2 = 1.4426950408889634;
    return r - one_over_ln2 * (1 - x / std::exp2(r));  // exp2(r) == 2^r
}

double log2(double x) {
    static const double epsilon = 0.000000001; // change this to change accuracy
    double r = x / 2;  // better first guesses converge faster
    double r2 = next(r, x);
    double delta = r - r2;
    while (delta * delta > epsilon) {
        r = r2;
        r2 = next(r, x);
        delta = r - r2;
    }
    return r2;
}

double log10(double x) {
    static const double log2_10 = log2(10);
    return log2(x) / log2_10;
}
This question already has answers here: How do I print a double value with full precision using cout? (17 answers)
Closed 1 year ago.
I have the following calculation:
double a = 141150, b = 141270, c = 141410;
double d = (a + b + c) / 3;
cout << d << endl;
The output shows d = 141277, whereas d should be 141276.666667. The calculation consists of double additions and a double division. Why am I getting a result that is rounded up? By the way, d = (a + b + c) / 3.0 doesn't help.
However in another similar calculation, the result is correct:
double u = 1, v = 2, x = 3, y = 4;
double z = (u + v + x + y) / 4;
z results in 2.5 as expected. These two calculations are essentially the same, so why the different behavior?
Lastly, I know C++ automatically truncates numbers cast to lower precision, but I've never heard of automatic rounding. Can someone shed some light?
Here you can find multiple answers to the same problem, with code snippets as examples. They cover what was said in the comments.
The statements in each case are mathematically equivalent. My question is which one is better to choose while coding: which form may overflow for some ranges of the variables while the other doesn't, and which form is more precise, and why?
double x, y, z;
//case 1
x = (x * y) * z;
x *= y * z;
//case 2
z = x + x*y;
z = x * ( 1.0 + y);
//case 3
y = x/5.0;
y = x*0.2;
Case 1: x *= y * z; is like x = x * (y * z); so this case stresses the evaluation order. Should either sub-product exceed the computation range and convert to INF, 0.0 or a subnormal, the final product would be significantly affected, depending on the order. OTOH, intermediate math may be performed in a wider FP type; search for FLT_EVAL_METHOD. In that case the order could be irrelevant, if all computation were done as long double.
Case 2: The 2 forms are slightly different. The 2nd is numerically more stable, as its addition uses exact values: 1 and y, versus the first using x and x*y, with x*y potentially being a rounded answer. Addition/subtraction is prone to draconian precision loss, in this case when y is near -1.0. As in case 1, wider intermediate math helps, but the 2nd form is still better.
C99 offers fma(double x, double y, double z), and using fma(x, y, x) would be another good alternative.
The fma functions compute (x × y) + z, rounded as one ternary operation: they compute the value (as if) to infinite precision and round once to the result format, according to the current rounding mode. A range error may occur.
Case 3: The "trick" here is: is the double constant 0.2 the same as the mathematical 0.2? Typically it is not, yet they are close. An optimizing compiler could 1) treat them as the same, or 2) as in case 1, use wider math; then the result is the same for both lines of code.
Otherwise, depending on the rounding mode, the two forms may differ in the last bit (ULP). With a weak compiler, I recommend /5.0: division by 5.0 is more accurate than multiplication by an approximate 0.2. But coded either way, a smart compiler may do a wide multiplication for both.
So, I have radian angles without any range (-inf to +inf, basically) and I need to interpolate these as quickly as possible. Is there any cookie-cutter way to do this?
PS: I only need to interpolate 2 values at a time, so a+f*(b-a) basically
PPS: the output does NOT have to be in any specific range (-PI to PI or 0 to 2PI)
PPPS: The specific problem is how to do the wrapping of the values around -PI/+PI and their multiples most efficiently
BETTER: Actually, forget what I wrote first. You can simply do:
template<typename K>
K lerpAngle(K u, K v, K p) {
    return u + p*wrapMP(v - u);
}
where wrapMP moves an angle to the interval [-pi, +pi]. Both inputs u and v can be any radian angle, and the result will also not be in a specific interval.
The idea is actually quite simple: move your point of view to the point u, such that u = 0. Now v can be arbitrary, as angles are not normalized, but we just wrap the distance v - u = v to [-pi, +pi] and walk by the given percentage p in that direction.
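The answer leaves wrapMP undefined; one common way to write it (a sketch, not the author's code; it wraps into [-pi, +pi)) is:

```cpp
#include <cmath>

const double PI = 3.14159265358979323846;

// Wrap an arbitrary angle into [-pi, +pi).
template<typename K>
K wrapMP(K a) {
    a = std::fmod(a + PI, 2 * PI);  // now in (-2*pi, 2*pi)
    if (a < 0) a += 2 * PI;         // now in [0, 2*pi)
    return a - PI;                  // now in [-pi, +pi)
}

template<typename K>
K lerpAngle(K u, K v, K p) {
    return u + p * wrapMP(v - u);
}
```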
OLD (and inefficient) I wrote this code once:
template<typename K>
K lerpAngle(K u, K v, K p) {
    K a = wrap2Pi(u);
    K b = wrap2Pi(v);
    K d = b - a;
    if (d < -PI) {
        b += 2*PI;
    }
    if (d > +PI) {
        b -= 2*PI;
    }
    return wrap2Pi(a + p*(b - a));
}
where wrap2Pi moves an angle to the interval [0, 2*pi[.
First, "normalize" your input angles so they're in a specific range, like (-π, π]. If you do this already, skip this part.
Then, take their difference. If the absolute value of this difference is greater than π then adjust one of the inputs by adding / subtracting 2π such that the absolute value of the difference becomes less or equal to π.
Then simply interpolate between those two values.
If the output of this interpolation should always be in the range (-π, π] again (note: it probably doesn't have to, depending on where you need this angle), then add / subtract 2π again if it's not.
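The steps above can be sketched as follows (the helper name normalizeAngle is mine; it maps into (-pi, pi] as the answer describes):

```cpp
#include <cmath>

// Step 1: "normalize" an angle into (-pi, pi].
double normalizeAngle(double a) {
    const double PI = 3.14159265358979323846;
    a = std::fmod(a, 2*PI);        // (-2*pi, 2*pi)
    if (a <= -PI) a += 2*PI;
    else if (a > PI) a -= 2*PI;
    return a;
}

double interpAngle(double a, double b, double t) {
    const double PI = 3.14159265358979323846;
    a = normalizeAngle(a);
    b = normalizeAngle(b);
    // Step 2: if the difference exceeds pi in magnitude, shift b by 2*pi
    // so the interpolation takes the short way around.
    double d = b - a;
    if (d > PI)  b -= 2*PI;
    if (d < -PI) b += 2*PI;
    // Step 3: plain linear interpolation (normalize again if needed).
    return a + t*(b - a);
}
```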
I would like to reduce numerical floating-point errors in the following computation.
I have an equation of the following form:
b_3+w_3*(b_2+w_2*(b_1+w_1*(b_0+w_0)))
where the variable w represents some floating-point number in the range [0,1] and b represents a floating-point constant in the range [1,~1000000]. b increases monotonically with subscript (though this may not be important). Naturally, this could be extended to any number of terms:
b_4+w_4*(b_3+w_3*(b_2+w_2*(b_1+w_1*(b_0+w_0))))
This can be defined recursively as:
func(x, n):
    if (n == MAX)
        return x
    else
        return func(b[n] + x*w[n], n+1)

func(1, 0)
If I were doing an online summation, I could use the Kahan summation algorithm (Kahan 1965), or one of several other methods à la Higham 1993 or McNamee 2004, to bound the size of my errors. If I were doing online repeated products, I could use some sort of conversion technique to reduce the problem to summation.
As it is, I'm not sure how to approach this particular problem. Does anyone have thoughts (and citations to go with them)?
Thanks!
Higham 1993. "The accuracy of floating point summation". SIAM Journal on Scientific Computing.
Kahan 1965. "Pracniques: further remarks on reducing truncation errors". CACM. doi:10.1145/363707.363723.
McNamee 2004. "A comparison of methods for accurate summation". SIGSAM Bull. doi:10.1145/980175.980177.
Your computation looks similar to a Horner scheme, except that instead of a single variable x, there are different weights w[i] being used at every stage.
There are algorithms for compensated Horner schemes which I think you could adapt for your purposes. See for example theorem 3 and algorithm 2 in the following paper.
P. Langlois, "How to Ensure a Faithful Polynomial Evaluation with the Compensated Horner Algorithm". 18th IEEE Symposium on Computer Arithmetic (ARITH '07), 25-27 June 2007, pp. 141-149.
http://www.acsel-lab.com/arithmetic/papers/ARITH18/ARITH18_Langlois.pdf
If in algorithm 2 you replace TwoProd (s[i+1], x) with TwoProd (s[i+1], w[i+1]) it seems you would get the desired result, but I have not tried it.
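A sketch of what such an adaptation could look like, built from the standard error-free transformations (TwoSum, and TwoProd via fma). This follows the shape of the compensated Horner scheme with w[i] in place of the single variable x; it is my adaptation, not code from the paper:

```cpp
#include <cmath>

struct TwoResult { double value, error; };

// Knuth's TwoSum: value + error == a + b exactly.
TwoResult twoSum(double a, double b) {
    double s = a + b;
    double bp = s - a;
    double e = (a - (s - bp)) + (b - bp);
    return {s, e};
}

// TwoProd via fma: value + error == a * b exactly.
TwoResult twoProd(double a, double b) {
    double p = a * b;
    double e = std::fma(a, b, -p);   // exact rounding error of the product
    return {p, e};
}

// Compensated evaluation of the recursion s <- b[i] + s*w[i], starting at x.
double compensatedEval(const double* b, const double* w, int n, double x) {
    double s = x, c = 0.0;           // main value and running compensation
    for (int i = 0; i < n; ++i) {
        TwoResult p = twoProd(s, w[i]);
        TwoResult t = twoSum(p.value, b[i]);
        s = t.value;
        c = c * w[i] + (p.error + t.error);  // propagate accumulated error
    }
    return s + c;
}
```

The final s + c folds the accumulated rounding errors back into the result, which is the same idea as Langlois' corrected result in Algorithm 2.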
The way you have defined func, it evaluates to the following expression:
For MAX = n+1:

    func(1,0) == 1 + Σ_{i=0..n} Π_{j=n-i..n} w[j]
So, the way I would resolve the sum would be:
double s = 0.0;
double a = 1.0;
for (int i = 1; i <= MAX; ++i) {
    a *= w[MAX-i];
    s += a;
}
return 1.0 + s;
Even if we treat the x input value to func as a variable, it only affects the final term. But because of its range, you should take care in calculating it.
double s = 0.0;
double a = 1.0;
double ax = x;
for (int i = 1; i < MAX; ++i) {
    a *= w[MAX-i];
    ax *= w[MAX-i];
    s += a;
}
ax *= w[0];
s += ax;
return 1.0 + s;
I have a problem with precision: I have to make my C++ code have the same precision as Matlab. In Matlab I have a script which does some work with numbers, and C++ code which does the same as that script, yet the output differs for the same input. I found that in my script, 104 >= 104 returns false. I tried format long, but it did not help me find out why it is false. Both numbers are of type double. I thought that maybe Matlab stores the real value of 104 somewhere, and it is really something like 103.9999..., so I raised the precision in my C++ code. That did not help either: where Matlab returns a value of 50.000, in C++ I get 50.050 at high precision. Those two values come from a few calculations like + or *. Is there any way to make my C++ and Matlab scripts have the same precision?
for i = 1:neighbors
    y = spoints(i,1) + origy;
    x = spoints(i,2) + origx;
    % Calculate floors, ceils and rounds for the x and y.
    fy = floor(y); cy = ceil(y); ry = round(y);
    fx = floor(x); cx = ceil(x); rx = round(x);
    % Check if interpolation is needed.
    if (abs(x - rx) < 1e-6) && (abs(y - ry) < 1e-6)
        % Interpolation is not needed, use original datatypes
        N = image(ry:ry+dy, rx:rx+dx);
        D = N >= C;
    else
        % Interpolation needed, use double type images
        ty = y - fy;
        tx = x - fx;
        % Calculate the interpolation weights.
        w1 = (1 - tx) * (1 - ty);
        w2 = tx * (1 - ty);
        w3 = (1 - tx) * ty;
        w4 = tx * ty;
        % Compute interpolated pixel values
        N = w1*d_image(fy:fy+dy,fx:fx+dx) + w2*d_image(fy:fy+dy,cx:cx+dx) + ...
            w3*d_image(cy:cy+dy,fx:fx+dx) + w4*d_image(cy:cy+dy,cx:cx+dx);
        D = N >= d_C;
    end
I got problems in the else branch, which is in line 12. tx and ty equal 0.707106781186547 or 1 - 0.707106781186547. Values from d_image are in the range 0 to 255. N is the 0..255 value of interpolating 4 pixels from the image. d_C is a value in 0..255. I still don't know why Matlab shows that, when N contains values like x x x 140.0000 140.0000 and d_C contains x x x 140 x, D gives me 0 in the 4th position, i.e. 140.0000 != 140. I debugged it with more precision, and it still says it is 140.00000000000000 and still not equal to 140.
int Codes::Interpolation(Point_<int> point, Point_<int> center, Mat *mat)
{
    int x = center.x - point.x;
    int y = center.y - point.y;
    Point_<double> my;
    if (x < 0)
    {
        if (y < 0)
        {
            my.x = center.x + LEN;
            my.y = center.y + LEN;
        }
        else
        {
            my.x = center.x + LEN;
            my.y = center.y - LEN;
        }
    }
    else
    {
        if (y < 0)
        {
            my.x = center.x - LEN;
            my.y = center.y + LEN;
        }
        else
        {
            my.x = center.x - LEN;
            my.y = center.y - LEN;
        }
    }
    int a = my.x;
    int b = my.y;
    double tx = my.x - a;
    double ty = my.y - b;
    double wage[4];
    wage[0] = (1 - tx) * (1 - ty);
    wage[1] = tx * (1 - ty);
    wage[2] = (1 - tx) * ty;
    wage[3] = tx * ty;
    int values[4];
    // write the 4 pixels that take part in the interpolation into the array
    for (int i = 0; i < 4; i++)
    {
        int val = mat->at<uchar>(Point_<int>(a + help[i].x, a + help[i].y));
        values[i] = val;
    }
    double moze = wage[0]*values[0] + wage[1]*values[1] + wage[2]*values[2] + wage[3]*values[3];
    return moze;
}
LEN = 0.707106781186547. The values in the values array are 100% the same as the Matlab values.
Matlab uses double precision. You can use C++'s double type. That should make most things similar, but not 100%.
As someone else noted, this is probably not the source of your problem. Either there is a difference in the algorithms, or it might be something like a library function defined differently in Matlab and in C++. For example, Matlab's std() divides by (n-1) and your code may divide by n.
First, as a rule of thumb, it is never a good idea to compare floating point variables directly. For example, instead of if (nr >= 104) you should use if (nr >= 104 - e), where e is a small number, like 0.00001.
However, there must be some serious undersampling or rounding error somewhere in your script, because getting 50050 instead of 50000 is not within the limits of common floating point imprecision. Matlab, for example, works with roughly 15 significant digits!
I guess there are some casting problems in your code, for example

int i;
double d;
// ...
d = i/3 * d;

will give a very inaccurate result, because i/3 is an integer division. d = (double)i/3 * d or d = i/3. * d would give a much more accurate result.
The above example would NOT cause any problems in Matlab, because there everything is a floating-point number by default, so a similar issue might be behind the differences between the C++ and Matlab results.
Seeing your calculations would help a lot in finding what went wrong.
EDIT:
In C and C++, if you compare a double with an integer of the same value, there is a very high chance that they will not be equal. It's the same with two doubles, though you might get lucky if you perform the exact same computations on them. It is risky even in Matlab, and maybe you were just lucky that, as both are doubles, both got truncated the same way.
From your recent edit it seems that the problem is where you evaluate your array. You should never use == or != when comparing floats or doubles in C++ (or in any language, when using floating-point variables). The proper way to do a comparison is to check whether they are within a small distance of each other.
An example: using == or != to compare two doubles is like comparing the weight of two objects by counting the number of atoms in them, and deciding that they are not equal even if there is one single atom difference between them.
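A sketch of that comparison (the helper name and the tolerance are illustrative; pick a tolerance that matches the scale of your values):

```cpp
#include <cmath>

// Treat two doubles as equal when they are within a small tolerance.
bool nearlyEqual(double a, double b, double eps = 1e-9) {
    return std::fabs(a - b) < eps;
}
```

For the 140.0000 vs 140 case in the question, nearlyEqual would report a match even when one value is off by a few ULPs.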
MATLAB uses double precision unless you say otherwise. Any differences you see with an identical implementation in C++ will be due to floating-point errors.