The documentation of the Num module says:
Numbers (type num) are arbitrary-precision rational numbers, plus the special elements 1/0 (infinity) and 0/0 (undefined).
I expected to find this infinity but can't find it. I guessed, then, that I could create it by hand :
let infinity = let one = Int 1 and zero = Int 0 in one // zero
But boom:
Exception: Failure "create_ratio infinite or undefined rational number".
So, ok, there is this val infinity : float in Pervasives, let's find a num_from_float. Oh, there's no such function...
Well, does anyone know how to represent positive and negative infinity with Num ?
By default, special numbers are disabled. This behavior can be controlled with the Arith_status module. For example, to allow zero denominators, use the following:
Arith_status.set_error_when_null_denominator false
Once the flag is set, your infinity definition works fine:
let infinity = let one = Int 1 and zero = Int 0 in one // zero;;
val infinity : Num.num = <num 1/0>
float_of_num infinity;;
- : float = infinity
In C++, how do I print the digits after the decimal point?
For example, I have the float 12.54 and I want to print it as 0.54.
Thank you all.
You can use modf function.
double integral_part;
double fractional = modf(some_double, &integral_part);
You can also cast it to an integer, but be warned you may overflow the integer. The result is not predictable then.
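A minimal sketch of the modf approach wrapped in a function. The name fractional_part is hypothetical and not part of the answer. Note that std::modf keeps the sign of its argument, so a negative input gives a negative fraction.

```cpp
#include <cmath>

// Hypothetical helper: returns the part of x after the decimal point.
// std::modf splits x into an integral and a fractional part, both with
// the sign of x. No integer type is involved, so large inputs cannot
// overflow an int.
double fractional_part(double x) {
    double integral_part = 0.0;
    return std::modf(x, &integral_part);
}
```

For 12.54 this returns a value very close to 0.54 (subject to the usual binary rounding), and for -1.25 it returns -0.25.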
The simplest way
float f = 10.123;
float fract = f - (int)f;
std::cout << fract;
But for large input you can obtain integer overflow. In this case use
float fract = f - truncf(f);
Output
0.123
If you want to get the fractional part of a floating-point number, you have a choice of std::floor or std::trunc. Non-negative numbers are treated the same by either, but negative numbers are not.
std::floor returns the largest integral value not greater than its argument, while std::trunc returns the integral value nearest to its argument in the direction of 0.
double f=1.23;
floor(f); // yields 1
trunc(f); // also yields 1
However
double f=-1.23;
floor(f); // yields -2
trunc(f); // but yields -1
So use trunc to get the fractional part for both positive and negative f's:
double f=-1.23;
f - floor(f); // yields .77
f - trunc(f); // but yields -.23
In the following part of code:
I want to generate a random number "U" from the range 0 to 1,
then I calculate an equation having log
The error is: some values of U make the log in the equation give "not a number".
I tried casting U to float or double, and even rounding it to 2 decimal places, but I get the same error.
vector <double>Xs;//random Xs
double x;
double U;
while (check_arr < 360)
{
U = ((rand() / RAND_MAX) * 100) / 100;
x = (log10(1 - U)) / (-1 / a);
Xs.push_back(x);
}
There are multiple problems with your code.
rand() returns an integer, and RAND_MAX is an integer, so when you divide them you get an integer which will almost always be zero (since rand() can produce the value RAND_MAX - one time in 2^31 on my computer - and that division will produce 1).
Next, multiplying then dividing by 100 is doing nothing. The result will be the same: an integer that's almost always 0, sometimes 1.
Finally, you must avoid taking the log10 of zero. This value is disallowed and will raise the divide-by-zero exception (also, negative values would raise the invalid floating point exception).
Perhaps you could use the following expression instead:
U = (rand() % 100)/100.0;
This will give you a value of U with a distribution from 0.00 up to 0.99 inclusive. When you then take log10(1-U) you won't get an exception.
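As a sketch of an alternative with finer granularity than steps of 0.01, dividing by RAND_MAX + 1.0 forces floating-point division and keeps U strictly below 1. The helper names uniform01 and sample_x are hypothetical, and a is assumed to be the nonzero parameter from the question.

```cpp
#include <cmath>
#include <cstdlib>

// Hypothetical helper: uniform value in [0, 1).
// RAND_MAX + 1.0 is a double, so the division is floating point, and the
// largest possible result, RAND_MAX / (RAND_MAX + 1.0), is below 1.
double uniform01() {
    return std::rand() / (RAND_MAX + 1.0);
}

// The expression from the question with U drawn from [0, 1).
// 1 - U is then in (0, 1], so log10 never sees 0 or a negative value.
// 'a' is assumed to be nonzero.
double sample_x(double a) {
    const double U = uniform01();
    return std::log10(1.0 - U) / (-1.0 / a);
}
```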
log10() does not return a finite number when the argument passed to it is 0: it returns negative infinity (and "not a number" for a negative argument). When I ran the code on my machine, the result I got was "-1.#INF000000000000". log(0) is undefined. You can verify this by opening the calculator on your PC (if you are using Windows), switching to scientific mode, and trying log 0.
Mathematical explanation:
The log base 10 function is used to help find the exponent y in 10^y=x. So when you are trying to plug in 0 in the function you are trying to find a solution to the following:
10^y=0
But there is no solution to this, so the function cannot return a finite number. It would be better to restrict the argument x of the log to the range 0 < x <= 1 so that you do not run into this issue.
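Since the rand function returns a value between 0 and RAND_MAX, you can use the following to ensure that U is never 0:
U = (rand() % 100 + 1)/100.0;
This gives values from 0.01 to 1.00. Dividing by 100.0 rather than 100 is needed to avoid integer division. Note that the equation in the question takes log10(1 - U), so there it is U = 1 that must be excluded; in that case drop the + 1 so that U ranges from 0.00 to 0.99. You can adjust the numbers to increase or decrease the range.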
Let's say I have an input 1.251564.
How can I find how many elements are after "." to have an output as follows:
int numFloating;
// code to go here that leads to
// numFloating == 6
p.s. Sorry for not providing any code, I just have no idea how that should be implemented :(
Thanks for your answers!
Let us consider your number, 1.251564. When you store this in a double, it is stored in the binary IEEE754 format. And you might find that the number is not representable. So, let us check for this number. The closest representable double is:
1.25156 39999 99999 89880 45035 73046 53152 82344 81811 52343 75
This probably comes as something of a surprise to you. There are 52 decimal digits following the decimal point.
The lesson that you need to take away from this is that if you want to ask questions about decimal representations, you need to use a decimal data type rather than double. Once you can actually represent the value exactly, then you will be able to reason about it in a manner that matches your expectations.
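To see the stored value for yourself, one option is to print more digits than the default. This is only a sketch: the helper name fixed_digits is hypothetical, and how many of the printed digits are exact depends on the C library (glibc prints the exact binary value).

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats x with the requested number of digits
// after the decimal point, which makes the stored binary value visible.
std::string fixed_digits(double x, int digits) {
    char buffer[512];
    // snprintf never writes past the buffer; very long results are truncated.
    std::snprintf(buffer, sizeof buffer, "%.*f", digits, x);
    return std::string(buffer);
}
```

Calling fixed_digits(1.251564, 6) gives back "1.251564", while asking for 20 or more digits reveals the run of 9s from the expansion quoted above.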
Simplest way would be to store it in string.
std::string str("1.1234");
size_t length = str.length();
size_t found = str.find('.', 0 );
size_t count = length-found-1;
int finallyGotTheCount = static_cast<int>(count);
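The snippet above assumes the string contains a '.'. If it does not, find returns std::string::npos and the subtraction wraps around. A sketch of the same idea as a function that handles that case (the name count_decimals is hypothetical, and plain notation without an exponent is assumed):

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts the characters after the first '.' in a
// decimal string. Assumes plain notation (no exponent, no trailing text).
int count_decimals(const std::string& text) {
    const std::size_t dot = text.find('.');
    if (dot == std::string::npos) {
        return 0;  // no decimal point, so no decimals
    }
    return static_cast<int>(text.length() - dot - 1);
}
```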
This won't end up well. The problem is that sometimes there are float errors when representing numbers in binary (which is what your computer does).
For example, when adding 1 / 3 + 1 / 3 + 1 / 3 you might get 0.999999... and the number of decimal places varies greatly.
ravi already provided a good way to calculate it, so I'll provide a different one:
double number = 0; // should be equal to the number you want to check
int numFloating = 0;
while ((double)(int)number != number){
number *= 10;
numFloating++;
}
number is a double variable that holds the number you want to check for decimal places.
Suppose you have a fractional number, say .1234.
Repeatedly multiply by 10 and throw away the integer portion of the number until you get zero. The number of steps is the number of decimals, e.g.:
.1234 * 10 = 1.234
.234 * 10 = 2.34
.34 * 10 = 3.4
.4 * 10 = 4.0
Problems will however occur when you have a number that is "floating" like 1.199999999.
int numFloating = 0;
double orgin = 1.251564;
double value = orgin - floor(orgin);
while(value != 0)
{
value *= 10;
value = value - floor(value);
numFloating ++;
}
This code sometimes gives the wrong answer, e.g. when the value stored in binary is not exactly the decimal value that was typed.
The output obviously depends on how the number is really stored.
I have an algorithm which uses floats or doubles to perform some calculations.
Example:
double a;
double b;
double c;
...
double result = c / (b - a);
if ((result > 0) && (result < small_number))
{
// result is relevant...
} else {
// result not required...
}
Now, I am worried that (b - a) might be zero. If it is close to zero but not zero, it does not matter, because the result will be out of the useful range, and I already detect that (as (b - a) approaches zero, result will approach +/- inf, which is not in the range 0-small_number...)
But if the result of (b - a) is exactly zero, I expect that something platform dependent will happen due to the divide by zero. I could change the if statement to:
if ((!((b-a) == 0.0)) && ((result = c/(b-a)) > 0) && (result < small_number)) {
but I don't know if (b-a) == 0.0 will always detect equality with zero. I have seen there are multiple representations for exact zero in floating point? How can you test for them all without doing some epsilon check, which I don't need (a small epsilon will be ignored in my algorithm)?
What is the platform-independent way to check?
EDIT:
Not sure if it was clear enough to people. Basically I want to know how to find if an expression like:
double result = numerator / denominator;
will result in a floating point exception, a CPU exception, a signal from the operating system or something else... without actually performing the operation and seeing if it will "throw", because detecting a "throw" of this nature seems to be complicated and platform specific.
Is ( (denominator==0.0) || (denominator==-0.0) ) ? "Will 'throw'" : "Won't 'throw'"; enough?
It depends on how b and a got their values. Zero has an exact representation in floating point format, but the bigger problem would be almost-but-not-quite zero values. It would always be safe to check:
if (fabs(b-a) > 0.00000001 && ...
where 0.00000001 is whatever value makes sense. (fabs is used rather than abs because in C abs takes an int.)
Here's how you do it: instead of checking for (result < small_number), you check for
(fabs(c) < fabs(b - a) * small_number)
Then all your troubles disappear! The computation of c/(b-a) will never overflow if this test is passed.
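A sketch of how the whole condition from the question could be evaluated without dividing at all. The function name result_in_range is hypothetical. The sign check stands in for the question's result > 0, which the magnitude comparison alone does not cover. The product on the right can still round, underflow or overflow, so borderline cases may differ from the division.

```cpp
#include <cmath>

// Hypothetical helper: decides whether c / (b - a) would lie strictly
// between 0 and small_number, without performing the division.
bool result_in_range(double a, double b, double c, double small_number) {
    const double d = b - a;
    if (d == 0.0 || c == 0.0) {
        return false;  // d == 0.0 is true for both +0.0 and -0.0
    }
    if ((c > 0.0) != (d > 0.0)) {
        return false;  // opposite signs: the quotient would be negative
    }
    // Same sign, so the quotient is positive; compare magnitudes instead.
    return std::fabs(c) < std::fabs(d) * small_number;
}
```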
I guess you can use fpclassify(-0.0) == FP_ZERO . But this is only useful if you want to check if someone did put some kind of zero into float-type variable. As many already said if you want to check result of calculation you may get values very close to zero due to nature of representation.
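A minimal sketch of that check (the name is_exact_zero is hypothetical). On IEEE 754 systems a plain comparison x == 0.0 gives the same answer, since -0.0 compares equal to 0.0.

```cpp
#include <cmath>

// Hypothetical helper: true only for +0.0 and -0.0.
// Tiny nonzero values, including subnormals, are not classified as zero.
bool is_exact_zero(double x) {
    return std::fpclassify(x) == FP_ZERO;
}
```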
In brief, we can tell whether a floating-point number is exactly zero if we know its representation format.
In practice, we compare x with a small number, and if the magnitude of x is less than that number, we treat x as functionally zero (even though most of the time our small number is still larger than zero). This method is easy, efficient, and portable across platforms.
The float and double types are stored in a specific format; the most widely used one on current hardware is IEEE 754, which divides the number into sign, exponent and mantissa (significand) bits.
So, if we want to check whether a float is exactly zero, we can check whether both the exponent and the mantissa are zero:
In IEEE 754 binary floating point numbers, zero values are represented
by the biased exponent and significand both being zero. Negative zero
has the sign bit set to one.
Taking float as an example, we can write simple code to extract the exponent and mantissa bits and then check them.
#include <stdio.h>
typedef union {
float f;
struct {
unsigned int mantissa : 23;
unsigned int exponent : 8;
unsigned int sign : 1;
} parts;
} float_cast;
int isZero(float num) {
int flag = 0;
float_cast data;
data.f = num;
// Check both exponent and mantissa parts
if(data.parts.exponent == 0u && data.parts.mantissa == 0u) {
flag = 1;
} else {
flag = 0;
}
return(flag);
}
int main() {
float num1 = 0.f, num2 = -0.f, num3 = 1.2f;
printf("\n is zero of %f -> %d", num1, isZero(num1));
printf("\n is zero of %f -> %d", num2, isZero(num2));
printf("\n is zero of %f -> %d", num3, isZero(num3));
return(0);
}
Test results:
# is zero of 0.000000 -> 1
# is zero of -0.000000 -> 1
# is zero of 1.200000 -> 0
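The layout of bit-fields inside the union is implementation-defined, and in C++ (unlike C) reading a union member other than the one last written is undefined behaviour. A sketch of the same check in C++ using memcpy, assuming a 32-bit IEEE 754 float (the name is_zero_bits is hypothetical):

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 && sizeof(float) == 4,
              "this sketch assumes a 32-bit IEEE 754 float");

// Hypothetical helper: true when every bit except the sign bit is 0,
// i.e. the biased exponent and the significand are both zero.
bool is_zero_bits(float x) {
    std::uint32_t bits = 0;
    std::memcpy(&bits, &x, sizeof bits);  // copy the object representation
    return (bits & 0x7FFFFFFFu) == 0u;    // mask off the sign bit
}
```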
More examples:
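Let's check with code when the computed float becomes exactly zero. Note that in this loop e overflows to infinity once it passes FLT_MAX (about 3.4e38), so 1.f/e is exactly 0 from then on; values such as 1e-39 are themselves still representable as subnormal floats.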
void test() {
int i =0;
float e = 1.f, small = 1.f;
for(i = 0; i < 40; i++) {
e *= 10.f;
small = 1.f/e;
printf("\nis %.4e zero? : %s", small, isZero(small) ? "YES" : "NO");
}
return;
}
is 1.0000e-01 zero? : NO
is 1.0000e-02 zero? : NO
is 1.0000e-03 zero? : NO
is 1.0000e-04 zero? : NO
is 1.0000e-05 zero? : NO
is 1.0000e-06 zero? : NO
is 1.0000e-07 zero? : NO
is 1.0000e-08 zero? : NO
is 1.0000e-09 zero? : NO
is 1.0000e-10 zero? : NO
is 1.0000e-11 zero? : NO
is 1.0000e-12 zero? : NO
is 1.0000e-13 zero? : NO
is 1.0000e-14 zero? : NO
is 1.0000e-15 zero? : NO
is 1.0000e-16 zero? : NO
is 1.0000e-17 zero? : NO
is 1.0000e-18 zero? : NO
is 1.0000e-19 zero? : NO
is 1.0000e-20 zero? : NO
is 1.0000e-21 zero? : NO
is 1.0000e-22 zero? : NO
is 1.0000e-23 zero? : NO
is 1.0000e-24 zero? : NO
is 1.0000e-25 zero? : NO
is 1.0000e-26 zero? : NO
is 1.0000e-27 zero? : NO
is 1.0000e-28 zero? : NO
is 1.0000e-29 zero? : NO
is 1.0000e-30 zero? : NO
is 1.0000e-31 zero? : NO
is 1.0000e-32 zero? : NO
is 1.0000e-33 zero? : NO
is 1.0000e-34 zero? : NO
is 1.0000e-35 zero? : NO
is 1.0000e-36 zero? : NO
is 1.0000e-37 zero? : NO
is 1.0000e-38 zero? : NO
is 0.0000e+00 zero? : YES <-- 1e-39
is 0.0000e+00 zero? : YES <-- 1e-40
UPDATE (2016-01-04)
I've received some downvotes on this answer, and I wondered if I should just delete it. It seems the consensus (https://meta.stackexchange.com/questions/146403/should-i-delete-my-answers) is that deleting answers should only be done in extreme cases.
So, my answer is wrong. But I guess I'm leaving it up because it provides for an interesting "think out of the box" kind of thought experiment.
===============
Bingo,
You say you want to know if b-a == 0.
Another way of looking at this is to determine whether a == b. If a equals b, then b-a will be equal 0.
Another interesting idea I found:
http://www.cygnus-software.com/papers/comparingfloats/Comparing%20floating%20point%20numbers.htm
Essentially, you take the floating point variables you have and tell the compiler to reinterpret them (bit for bit) as signed integers, as in the following:
if (*(int*)&b == *(int*)&a)
Then you are comparing integers, and not floating points. Maybe that will help? Maybe not. Good luck!
I believe that (b-a)==0 will be true exactly in those cases when c/(b-a) would fail because of (b-a) being zero. Floating-point math is tricky, but questioning this is going too far, in my opinion. I also believe that (b-a)==0 is equivalent to b==a.
Distinguishing positive and negative 0 is also not necessary. See e.g. here Does float have a negative zero? (-0f)
For epsilon, <limits> provides the standard template definition std::numeric_limits<T>::epsilon(). I guess checking that the difference is bigger than std::numeric_limits<T>::epsilon() should be safe enough to protect against division by zero. No platform dependency here, I guess.
You could try
if ((b-a)!=(a-b) && ((result = c/(b-a)) > 0) && (result < small_number)) {
...
As part of a numerical library test I need to choose base 10 decimal numbers that can be represented exactly in base 2. How do you detect in C++ if a base 10 decimal number can be represented exactly in base 2?
My first guess is as follows:
bool canBeRepresentedInBase2(const double &pNumberInBase10)
{
//check if a number in base 10 can be represented exactly in base 2
//reference: http://en.wikipedia.org/wiki/Binary_numeral_system
bool funcResult = false;
int nbOfDoublings = 16*3;
double doubledNumber = pNumberInBase10;
for (int i = 0; i < nbOfDoublings ; i++)
{
doubledNumber = 2*doubledNumber;
double intPart;
double fracPart = modf(doubledNumber/2, &intPart);
if (fracPart == 0) //number can be represented exactly in base 2
{
funcResult = true;
break;
}
}
return funcResult;
}
I tested this function with the following values: -1.0/4.0, 0.0, 0.1, 0.2, 0.205, 1.0/3.0, 7.0/8.0, 1.0, 256.0/255.0, 1.02, 99.005. It returns true for -1.0/4.0, 0.0, 7.0/8.0, 1.0, 99.005 which is correct.
Any better ideas?
I think what you are looking for is a number which has a fractional portion which is the sum of a sequence of negative powers of 2 (aka: 1 over a power of 2). I believe this should always be able to be represented exactly in IEEE floats/doubles.
For example:
0.375 = (1/4 + 1/8) which should have an exact representation.
If you want to generate these. You could try do something like this:
#include <iostream>
#include <cstdlib>
#include <ctime>
int main() {
srand(time(0));
double value = 0.0;
for(int i = 1; i < 256; i *= 2) {
// doesn't matter, some random probability of including this
// fraction in our sequence..
if((rand() % 3) == 0) {
value += (1.0 / static_cast<double>(i));
}
}
std::cout << value << std::endl;
}
EDIT: I believe your function has a broken interface. It would be better if you had this:
bool canBeRepresentedExactly(int numerator, int denominator);
because not all fractions have exact representations, but the moment you shove it into a double, you've chosen a representation in binary... defeating the purpose of the test.
If you're checking to see if it's binary, it will always return true. If your method takes a double as the parameter, the number is already represented in binary (double is a binary type, usually 64 bits). Looking at your code, I think you're actually trying to see if it can be represented exactly as an integer, in which case why can't you just cast to int, then back to double and compare to the original. Any integer stored in a double that's within the range representable by an int should be exact, IIRC, because a 64 bit double has 53 bits of mantissa (and I'm assuming a 32 bit int). That means if they're equal, it's an integer.
If you're passing in a double, then by definition, it has already been represented in binary and if not, then you've already lost accuracy.
Maybe try passing in numerator and denominator of the fraction to the function. Then you have not lost accuracy and can check to see if you can come up with a binary representation of the answer that is the same as the fraction you've passed in.
As rmeador has pointed out, it might not be a good idea to accept a double, because the number has then already been converted to a double, a possible approximation of the number that you're trying to check.
So, in a very abstract way, you should split your check into the integer part and the decimal part. The integer part must not be so large that the mantissa cannot express it exactly (e.g. 9007199254740993, which is 2^53 + 1, cannot be represented exactly by a 64-bit floating-point number).
The decimal part may be a bit easier, mentally: if the part after the decimal point (e.g. yyy in xxx.yyy), written as a fraction in lowest terms, has a denominator with any prime factor other than 2, the binary expansion repeats forever in trying to represent it. It's the same reason why 1/3 cannot be represented with finite digits in base 10 = base (2*5)... See Recurring Decimal
EDIT: As the comments pointed out, if the decimal number has a factor of anything other than 1/2, that would be the mathematically correct way to say it...
As others have mentioned, your method doesn't do what you mean, since you pass a number represented as a (binary) double. The method actually detects, if the number you passed is in the form integer/2^48. This should fail for numbers like (1+2^-50), which is binary, and 259/255, which isn't.
If you really want to test a number for being exactly representable by finite binary string, you have to pass a number in an exact form.
You can't pass IN a Double because it's already lost precision. You should be able to use the toString() method of Double to check for this. (example in Java)
public static Boolean canBeRepresentedInBase2(String thenumber)
{
// Returns true if the parsed double prints back as the same string.
// Only works for numbers that are not converted into scientific notation by toString.
// Caveat: Double.toString prints the shortest decimal that round-trips,
// so inputs such as "0.1" also return true.
return thenumber.equals(Double.toString(Double.parseDouble(thenumber)));
}
You asked for C++ but maybe this algorithm will help. I use "EE" to mean "exactly expressible as a float."
Start with a decimal representation of the number you want to test. Remove any trailing zeroes (that is, 0.123450000 becomes 0.12345).
1) If the number is not an integer, check to see if the rightmost digit is 5. If it's not, then stop -- the number is not EE.
2) Multiply the number by 2. If the result is an integer, then stop -- the number is EE. Otherwise, go back to step 1.
I don't have rigorous proof for this but a "warm fuzzy." Fire up Calculator and enter your favorite fractional power of 2, like 0.0000152587890625. Add it to itself a few dozen times (I just hit "+" once then "=" a bunch of times). If there are any non-zero digits to the right of the decimal point, the last digit is always 5.
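The two steps above can be sketched in C++ on a decimal string, so that no precision is lost before the test. The function name decimal_string_is_ee is hypothetical, and plain notation (digits with at most one '.', no exponent) is assumed. Strictly, this tests for a finite binary expansion; whether the value also fits in the significand of a float or double is a separate question.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper following the two steps above. Only the digits after
// the '.' matter, because the integer part never affects the outcome.
bool decimal_string_is_ee(const std::string& decimal) {
    const std::size_t dot = decimal.find('.');
    std::string frac = (dot == std::string::npos) ? "" : decimal.substr(dot + 1);
    for (;;) {
        // Remove trailing zeroes; nothing left means the value is an integer.
        while (!frac.empty() && frac.back() == '0') frac.pop_back();
        if (frac.empty()) return true;
        // Step 1: the rightmost digit must be 5.
        if (frac.back() != '5') return false;
        // Step 2: multiply by 2, digit by digit from the right. The carry
        // out of the leftmost digit belongs to the integer part and is dropped.
        int carry = 0;
        for (std::size_t i = frac.size(); i-- > 0;) {
            const int doubled = (frac[i] - '0') * 2 + carry;
            frac[i] = static_cast<char>('0' + doubled % 10);
            carry = doubled / 10;
        }
    }
}
```

The loop always terminates: a trailing 5 becomes a trailing 0 after doubling, so the fraction gets at least one digit shorter on every pass.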
Here is the code in C#, and it works. Because it works with the decimal type, it avoids the inherent rounding errors that show up in the original code, which uses double (decimal in C# stores values in base 10, whereas double uses base 2).
static bool canBeRepresentedInBase2(decimal pNumberInBase10)
{
//check if a number in base 10 can be represented exactly in base 2
//reference: http://en.wikipedia.org/wiki/Binary_numeral_system
bool funcResult = false;
int nbOfDoublings = 16*3;
decimal doubledNumber = pNumberInBase10;
for (int i = 0; i < nbOfDoublings ; i++)
{
doubledNumber = 2*doubledNumber;
decimal intPart;
decimal fracPart = ModF(doubledNumber/2, out intPart);
if (fracPart == 0) //number can be represented exactly in base 2
{
funcResult = true;
break;
}
}
return funcResult;
}
static decimal ModF(decimal number, out decimal intPart)
{
intPart = Math.Floor(number);
decimal fractional = number - (intPart);
return fractional;
}
Tested with the following code (where WL does a Console.WriteLine - SnippetCompiler)
WL(canBeRepresentedInBase2(-1.0M/4.0M)); //true
WL(canBeRepresentedInBase2(0.0M)); //true
WL(canBeRepresentedInBase2(0.1M)); //false
WL(canBeRepresentedInBase2(0.2M)); //false
WL(canBeRepresentedInBase2(0.205M)); //false
WL(canBeRepresentedInBase2(1.0M/3.0M)); //false
WL(canBeRepresentedInBase2(7.0M/8.0M)); //true
WL(canBeRepresentedInBase2(1.0M)); //true
WL(canBeRepresentedInBase2(256.0M/255.0M)); //false
WL(canBeRepresentedInBase2(1.02M)); //false
WL(canBeRepresentedInBase2(99.005M)); //false
WL(canBeRepresentedInBase2(2.53M)); //false
Or even easier:
return pNumber == floor(pNumber);
On the other hand, if you have some weird fractional representation (numerator denominator pair, or string with a decimal in it, or something), and you really do want to know if the value can be exactly represented as a double, it's a bit harder.
But you would need a different parameter(s) for that...
Given a number r it can be represented exactly with finite precision in base 2 iff r can be written as r = m/2^n, where m, n are integers, and n >= 0.
For example 1/7 doesn't have a finite binary expression, also 1/6 and 1/10 can't be written with a finite expression in base 2.
But 1/4+1/32+1/1024 has a finite expression in base 2.
PS: In general you can express a number r with finite digits in a base b iff r=m/b^n where m, n are integers and n >= 0.
PPS: As almost everybody has stated previously, using a double as input is a bad idea, because you are losing precision, and you will end up with a different number.
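The r = m/2^n criterion can be sketched for a fraction given as two integers, which avoids the precision loss of a double input. The function name is hypothetical, and the denominator is assumed not to be LLONG_MIN, whose absolute value does not fit in a long long.

```cpp
#include <cstdlib>

// Hypothetical helper: true when numerator/denominator equals m / 2^n for
// integers m and n >= 0. Stripping every factor of 2 from the denominator
// leaves its odd part, which must divide the numerator.
bool fraction_has_finite_binary_form(long long numerator, long long denominator) {
    if (denominator == 0) {
        return false;  // not a number at all
    }
    long long odd_part = std::llabs(denominator);
    while (odd_part % 2 == 0) {
        odd_part /= 2;
    }
    return numerator % odd_part == 0;
}
```

For example, 3/6 passes because it reduces to 1/2, while 1/10 fails because the odd factor 5 remains in the denominator.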
I don't think this is what he's asking... I think he's looking for a solution that will tell him if a number can be represented EXACTLY in binary form. For example, 33.3... is a number that cannot be represented exactly in binary, because its expansion goes on forever, so depending on your FPU settings, it will be represented as something like "33.333333333333336". So, it looks like his method will do the job. I don't know of a better way off the top of my head.
Ignoring the general criticism of using a double...
For a general finite decimal, you can determine if it has a finite representation in binary with the following algorithm:
Extract the fraction part f of the decimal.
Determine b and c such that f x 10^b = c, where b and c are integers.
Determine d such that 2^d >= 10^b, where d is an integer.
If c x 2^d / 10^b is an integer, then the decimal has a finite representation in binary. Otherwise, it doesn't.
You can generalize this to any two bases.