I've been making a calculator in Swift 3, but have run into some problems.
I have been using NSExpression to evaluate the user's equation, but the answer is always rounded.
To check that the answer was rounded, I calculated 3 / 2.
let expression = NSExpression(format: "3 / 2");
let answer: Double = expression.expressionValue(with: nil, context: nil) as! Double;
Swift.print(String(answer));
The above code outputs 1.0, instead of 1.5.
Does anyone know how to stop NSExpression from rounding? Thanks.
The expression is using integer division since your operands are integers. Try this instead:
let expression = NSExpression(format: "3.0 / 2.0");
let answer: Double = expression.expressionValue(with: nil, context: nil) as! Double;
Swift.print(String(answer));
Consider the following code:
let answer = Double(3 / 2)
answer in this case would still be 1.0 since 3 / 2 is evaluated to 1 before being inserted in the Double initializer.
However, this code would give answer the value of 1.5:
let answer = Double(3.0 / 2.0)
This is because 3.0 / 2.0 is evaluated with the division operation for Double instead of the division operation for Int.
Could someone please explain how the functions std::fmod and std::remainder work? In the case of std::fmod, can someone explain the steps to show how:
std::fmod(+5.1, +3.0) = 2.1
Same thing goes for std::remainder which can produce negative results.
std::remainder(+5.1, +3.0) = -0.9
std::remainder(-5.1, +3.0) = 0.9
As the reference states for std::fmod:
The floating-point remainder of the division operation x/y calculated by this function is exactly the value x - n*y, where n is x/y with its fractional part truncated.
The returned value has the same sign as x and is less than y in magnitude.
So to take the example in the question, when x = +5.1 and y = +3.0,
x/y (5.1/3.0 = 1.7) with its fractional part truncated is 1. So n is 1. So the fmod will yield x - 1*y which is 5.1 - 1 * 3.0 which is 5.1 - 3.0 which is 2.1.
And the reference states for std::remainder:
The IEEE floating-point remainder of the division operation x/y calculated by this function is exactly the value x - n*y, where the value n is the integral value nearest the exact value x/y. When |n-x/y| = ½, the value n is chosen to be even.
So to take the example in the question, when x = +5.1 and y = +3.0
The nearest integral value to x/y (1.7) is 2. So n is 2. So the remainder will yield x - 2y which is 5.1 - 2 * 3.0 which is 5.1 - 6.0 which is -0.9.
But when x = -5.1 and y = +3.0
The nearest integral value to x/y (-1.7) is -2. So n is -2. So the remainder will yield x - (-2)y which is -5.1 - (-2) * 3.0 which is -5.1 + 6.0 which is +0.9.
The reference also states that: In contrast to std::fmod(), the returned value is not guaranteed to have the same sign as x.
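The two definitions quoted above can be checked with a small C++ sketch. The helper names `fmod_by_definition` and `remainder_by_definition` are made up for illustration. They compute the quotient x/y in floating point first, so unlike the library functions they are not guaranteed to be exact; they should only agree closely for inputs like the ones in the question.

```cpp
#include <cmath>

// Hypothetical helper: x - n*y with n = x/y truncated toward zero.
double fmod_by_definition(double x, double y) {
    double n = std::trunc(x / y);
    return x - n * y;
}

// Hypothetical helper: x - n*y with n = integer nearest to x/y.
// std::nearbyint uses the current rounding mode, which by default
// rounds to nearest with ties going to the even integer.
double remainder_by_definition(double x, double y) {
    double n = std::nearbyint(x / y);
    return x - n * y;
}
```

With x = 5.1 and y = 3.0 the first helper picks n = 1 and the second picks n = 2, which is where the results of about 2.1 and -0.9 come from.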
For those who may have a small difficulty understanding the good example by P.W., here is a slightly less mathematical approach.
The fmod() function tells you how much remains after dividing your numerator evenly by your denominator.
The remainder() function tells you how far your numerator is from the nearest multiple of the denominator.
Examples:
fmod(10,3.5) = 3.
3.5 can fit twice into 10 (2*3.5 = 7), leaving a remainder of 3.
remainder(10,3.5) = -0.5.
3.5 cannot fit evenly into 10, but it can fit evenly into 7 (2*3.5) and 10.5 (3*3.5).
10.5 is closer to 10 than 7.
How far away is 10 from 10.5?
It is -0.5 away from 10.5.
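To try the two examples above directly, here is a minimal C++ sketch that calls both library functions on the same inputs. The struct `Remainders` and the function `both_remainders` are names invented for this illustration.

```cpp
#include <cmath>

// Holds the two kinds of remainder for one pair of inputs.
struct Remainders {
    double fmod_result;       // what is left after whole divisions
    double remainder_result;  // signed distance to the nearest multiple
};

// Hypothetical helper that evaluates std::fmod and std::remainder together.
Remainders both_remainders(double x, double y) {
    return {std::fmod(x, y), std::remainder(x, y)};
}
```

Calling `both_remainders(10.0, 3.5)` gives 3 for `fmod_result` and -0.5 for `remainder_result`. All values involved are exactly representable in binary, so no rounding noise shows up here.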
Let's say I've set a step of 0.1 in my application.
So, whatever fp value I get, I just need 1 digit after the comma.
So, 47.93434 must be 47.9 (or at least, the nearest fp representable value).
If I write this:
double value = 47.9;
It correctly "snaps" to the nearest fp value it can get, which is:
47.89999999999999857891452847979962825775146484375 // fp value
101111.11100110011001100110011001100110011001100110011 // binary
Now, suppose I don't write those values myself, but get them from some other software.
Then I need to snap them. I wrote this function:
inline double SnapValue(double value, double step) {
    return round(value / step) * step;
}
But it returns these values:
47.900000000000005684341886080801486968994140625 // fp value
101111.11100110011001100110011001100110011001100110100 // binary
which is slightly farther from 47.9 than the first example (it's the "next" fp value, 011 + 1).
How would you get the first value (which is "more correct") for each input value?
Here's the testing code.
NOTE: the step can be different, e.g. step = 0.25 needs to snap the value to the nearest multiple of 0.25. A step of 0.25 will return values such as 0, 0.25, 0.50, 0.75, 1.0, 1.25 and so on. Thus, given an input of 1.30, it needs to snap to the nearest such value, i.e. 1.25.
You could try to use rational values instead of floating point. The latter are often inaccurate already, so not really an ideal match for a step.
inline double snap(double original, int numerator, int denominator)
{
    return round(original * denominator / numerator) * numerator / denominator;
}
Say you want steps of 0.4, then use 2 / 5:
snap(1.7435, 2, 5) = round(4.35875) * 2 / 5 = 4 * 2 / 5 = 1.6 (or what comes closest to it)
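A sketch comparing the two approaches on the value from the question follows. The names `snap_by_step` and `snap_rational` are chosen for this illustration, and the reasoning assumes IEEE 754 double arithmetic without extended intermediate precision. In the rational version the last operation is a single division of two exactly representable integers (479 / 10), and a single division is correctly rounded, so the result should be the double closest to 47.9. Multiplying by the inexact 0.1 does not have that property.

```cpp
#include <cmath>

// The question's approach: multiplies back by a step that is itself inexact.
double snap_by_step(double value, double step) {
    return std::round(value / step) * step;
}

// The rational approach: the step is numerator / denominator.
double snap_rational(double value, int numerator, int denominator) {
    // Number of whole steps nearest to the input.
    double steps = std::round(value * denominator / numerator);
    // steps * numerator is an exact integer for moderate magnitudes,
    // so only the final division rounds.
    return steps * numerator / denominator;
}
```

For example, `snap_rational(47.93434, 1, 10)` should compare equal to the literal `47.9`, while `snap_by_step(47.93434, 0.1)` lands one representable value above it, as described in the question. A step of 0.25 is written as `snap_rational(1.30, 1, 4)`.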
I have the following values:
i->fitness = 160
sum_fitness = 826135
I do the operation:
i->roulette = (int)(((i->fitness / sum_fitness)*100000) + 0.5);
But I keep getting 0 in i->roulette.
I also tried saving i->fitness / sum_fitness in a double variable and only then applying the other operations, but this also gives 0.
I'm thinking that's because 160/826135 is such a small number, then it rounds it down to 0.
How can I overcome this?
Thank you
edit:
Thanks everyone, I eventually did this:
double temp = (double)(i->fitness);
i->roulette = (int)(((temp / sum_fitness)*100000) + 0.5);
And it worked.
All the answers are similar so it's hard to choose one.
Your line
i->roulette = (int)(((i->fitness / sum_fitness)*100000) + 0.5);
casts the result to int, which truncates any fractional part
try
i->roulette = (((i->fitness / sum_fitness)*100000) + 0.5);
and make sure that either 'sum_fitness' or 'i->fitness' is of a float or double type so that the division is a floating-point division. If they are not, you will need to cast one of them before dividing, like this
i->roulette = (((i->fitness / (double)sum_fitness)*100000) + 0.5);
If you want to do this as an integer calculation you could also change the order of the division and multiplication, like
i->roulette = ( i->fitness *100000) / sum_fitness;
which would work as long as you don't get any integer overflow, which with a 32-bit int would occur once fitness exceeds 21474 (INT_MAX / 100000).
I'm thinking that's because 160/826135 is such a small number, then it rounds it down to 0.
It is integer division, and it is truncated to the integral part. So yes, it is 0, but there is no rounding. 99/100 would also be 0.
You could fix it by casting the numerator or the denominator to double:
i->roulette = ((i->fitness / static_cast<double>(sum_fitness))*100000) + 0.5;
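The variants discussed in these answers can be put side by side in a short sketch. The function names are invented here, and the parameters stand in for `i->fitness` and `sum_fitness`, which are assumed to be non-negative `int` values as in the question.

```cpp
// Original form: fitness / sum_fitness is an integer division,
// so 160 / 826135 is already 0 before anything else happens.
int roulette_truncating(int fitness, int sum_fitness) {
    return (int)(((fitness / sum_fitness) * 100000) + 0.5);
}

// Cast one operand to double so the division is done in floating point.
int roulette_floating(int fitness, int sum_fitness) {
    return static_cast<int>((static_cast<double>(fitness) / sum_fitness) * 100000 + 0.5);
}

// Integer-only variant: multiply first, in a wider type to avoid overflow,
// and add half the divisor so the result is rounded rather than truncated.
int roulette_integer(int fitness, int sum_fitness) {
    long long scaled = static_cast<long long>(fitness) * 100000;
    return static_cast<int>((scaled + sum_fitness / 2) / sum_fitness);
}
```

With fitness = 160 and sum_fitness = 826135 the first function returns 0, while the other two return 19.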
I wrote this code:
def frange(start, end, increase):
    x = start
    while x < end:
        yield x
        x = x + increase

print(list(frange(1, 2, 0.3)))
the output is:
[1, 1.3, 1.6, 1.9000000000000001]
but I don't know why the last element is 1.9000000000000001 rather than 1.9.
Could you tell me the reason?
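Read the Python documentation page "Floating Point Arithmetic: Issues and Limitations" to understand the reason. In short, 0.3 has no exact binary representation, and the small error grows each time it is added to x.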
I understand why floating point numbers can't be compared, and know about the mantissa and exponent binary representation, but I'm no expert and today I came across something I don't get:
Namely, let's say you have something like:
float denominator, numerator, resultone, resulttwo;
resultone = numerator / denominator;
float buff = 1 / denominator;
resulttwo = numerator * buff;
To my knowledge different flops can yield different results and this is not unusual. But in some edge cases these two results seem to be vastly different. To be more specific, in my GLSL code calculating the Beckmann facet slope distribution for the Cook-Torrance lighting model:
float a = 1 / (facetSlopeRMS * facetSlopeRMS * pow(clampedCosHalfNormal, 4));
float b = clampedCosHalfNormal * clampedCosHalfNormal - 1.0;
float c = facetSlopeRMS * facetSlopeRMS * clampedCosHalfNormal * clampedCosHalfNormal;
facetSlopeDistribution = a * exp(b/c);
yields very very different results to
float a = (facetSlopeRMS * facetSlopeRMS * pow(clampedCosHalfNormal, 4));
facetSlopeDistribution = exp(b/c) / a;
Why does it? The second form of the expression is problematic.
If I try, say, to add the second form of the expression to a color I get blacks, even though the expression should always evaluate to a positive number. Am I getting an infinity? A NaN? If so, why?
I didn't go through your mathematics in detail, but you must be aware that small errors get pumped up easily by all these powers and exponentials. You should try and substitute all variables var with var + e(var) (on paper, yes) and derive an expression for the total error - without simplifying in between steps, because that's where the error comes from!
This is also a very common problem in computational fluid dynamics, where you can observe things like 'numeric diffusion' if your grid isn't properly aligned with the simulated flow.
So get a clear grip on where the biggest errors come from, and rewrite equations where possible to minimize the numeric error.
edit: to clarify, an example
Say you have some variable x and an expression y=exp(x). The error in x is denoted e(x) and is small compared to x (say e(x)/x < 0.0001, but note that this depends on the type you are using). Then you could say that
e(y) = y(x+e(x)) - y(x)
e(y) ~ dy/dx * e(x) (for small e(x))
e(y) = exp(x) * e(x)
So there's a magnification of the absolute error of exp(x), meaning that around x=0 there's really no issue (not a surprise, since at that point the slope of exp(x) equals that of x) , but for big x you will notice this.
The relative error would then be
e(y)/y = e(y)/exp(x) = e(x)
whilst the relative error in x was
e(x)/x
so you added a factor of x to the relative error.
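The estimate above can be checked numerically with a small C++ sketch. The function names are made up for this illustration, and `ex` plays the role of e(x), a small absolute error injected into x on purpose.

```cpp
#include <cmath>

// Relative error of y = exp(x) when x carries a small absolute error ex.
// To first order this equals ex itself.
double relative_error_of_exp(double x, double ex) {
    double y = std::exp(x);
    return (std::exp(x + ex) - y) / y;
}

// Ratio between the relative error in y and the relative error in x (ex / x).
// To first order this ratio is x, the factor mentioned above.
double relative_error_growth(double x, double ex) {
    return relative_error_of_exp(x, ex) / (ex / x);
}
```

For x = 10 and ex = 1e-6 the relative error in y comes out close to 1e-6, while the relative error in x was 1e-7, so the growth factor is close to 10.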