C++: how can I test if a number is power of ten? - c++

I want to test if a number double x is an integer power of 10. I could perhaps use cmath's log10 and then test if x == (int) x?
edit: Actually, my solution does not work because doubles can be very big, much bigger than int, and also very small, like fractions.

A lookup table will be by far the fastest and most precise way to do this; only about 600 powers of 10 are representable as doubles. You can use a hash table, or if the table is ordered from smallest to largest, you can rapidly search it with binary chop.
This has the advantage that you will get a "hit" if and only if your number is exactly the closest possible IEEE double to some power of 10. If this isn't what you want, you need to be more precise about exactly how you would like your solution to handle the fact that many powers of 10 can't be exactly represented as doubles.
The best way to construct the table is probably to use string -> float conversion; that way hopefully your library authors will already have solved the problem of how to do the conversion in a way that gives the most precise answer possible.
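As a rough sketch of the lookup-table idea, the table can be built once from strings and searched with a binary search. The names `pow10_table` and `is_pow10_lookup` are made up for illustration, and the exponent range -323..308 assumes a 64-bit IEEE 754 double with subnormals enabled.

```cpp
#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Builds the sorted table once by parsing "1e<n>" strings, so the library's
// string -> double conversion picks the nearest double for each power of 10.
static const std::vector<double>& pow10_table() {
    static const std::vector<double> table = [] {
        std::vector<double> t;
        for (int n = -323; n <= 308; ++n) {  // assumed range of an IEEE 754 double
            char buf[16];
            std::snprintf(buf, sizeof buf, "1e%d", n);
            t.push_back(std::strtod(buf, nullptr));
        }
        return t;
    }();
    return table;
}

// Hypothetical helper: true only if x compares equal to one of the table entries.
bool is_pow10_lookup(double x) {
    const std::vector<double>& t = pow10_table();
    return std::binary_search(t.begin(), t.end(), x);
}
```

The table is ordered from smallest to largest by construction, which is what the binary search relies on.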

Your solution sounds good but I would replace the exact comparison with a tolerance one.
double exponent = log10(value);
double rounded = floor(exponent + 0.5);
if (fabs(exponent - rounded) < some_tolerance) {
//Power of ten
}

I am afraid you're in for a world of hurt. There is no way to cast down a very large or very small floating point number to a BigInt class because you lost precision when using the small floating point number.
For example, float only has about 6 digits of precision. So if you represent 10^9 as a float chances are it will be converted back as 1 000 000 145 or something like that: nothing guarantees what the last digits will be, they are beyond the precision.
You can of course use a much more precise representation, like double, which has 15 digits of precision. So normally you should be able to represent integers from 0 to 10^14 faithfully.
Finally some platforms may have a long long type with an even greater precision.
But anyway, as soon as your value exceeds the number of digits available to be converted back to an integer without loss... you can't test it for being a power of ten.
If you really need this precision, my suggestion is not to use a floating point number. There are mathematical libraries available with BigInt implementations or you can roll your own (though efficiency is difficult to achieve).

bool power_of_ten(double x) {
if(x < 1.0 || x > 10E15) {
warning("IEEE754 doubles can only precisely represent powers "
"of ten between 1 and 10E15, answer will be approximate.");
}
double exponent;
// power of ten if log10 of absolute value has no fractional part
return !modf(log10(fabs(x)), &exponent);
}

Depending on the platform your code needs to run on the log might be very expensive.
Since the amount of numbers that are 10^n (where n is natural) is very small,
it might be faster to just use a hardcoded lookup table.
(Ugly pseudo code follows:)
bool isPowerOfTen( int16 x )
{
if( x == 10 // n=1
|| x == 100 // n=2
|| x == 1000 // n=3
|| x == 10000 ) // n=4
return true;
return false;
}
This covers the whole int16 range and if that is all you need might be a lot faster.
(Depending on the platform.)

How about a code like this:
#include <stdio.h>
#define MAX 20
bool check_pow10(double num)
{
char arr[MAX];
sprintf(arr,"%lf",num);
char* ptr = arr;
bool isFirstOne = true;
while (*ptr)
{
switch (*ptr++)
{
case '1':
if (isFirstOne)
isFirstOne = false;
else
return false;
break;
case '0':
break;
case '.':
break;
default:
return false;
}
}
return true;
}
int main()
{
double number;
scanf("%lf",&number);
printf("isPower10: %s\n",check_pow10(number)?"yes":"no");
}
That would not work for negative powers of 10 though.
EDIT: works for negative powers also.

If you don't need it to be fast, use recursion. Pseudocode:
bool checkifpoweroften(double Candidate)
if Candidate >= 10
return checkifpoweroften(Candidate / 10)
elsif Candidate <= 0.1
return checkifpoweroften(Candidate * 10)
elsif Candidate == 1
return true
else
return false
You still need to choose between false positives and false negatives and add tolerances accordingly, as other answers pointed out. The tolerances should apply to all comparisons, or else, for example, 9.99999999 would fail the >=10 comparison.
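A possible runnable version of the same idea, written as loops instead of recursion and with the tolerance applied to every comparison. The function name `near_pow10` and the default relative tolerance of 1e-9 are assumptions made for this sketch, not part of the original answer.

```cpp
#include <cmath>

// Scales x toward 1 by repeated division or multiplication by 10, then checks
// whether the result is within a relative tolerance of 1.
bool near_pow10(double x, double tol = 1e-9) {
    if (!(x > 0.0) || !std::isfinite(x)) return false;  // rejects <= 0, NaN, infinity
    while (x >= 10.0 * (1.0 - tol)) x /= 10.0;          // tolerance on the >= 10 test
    while (x <= 0.1 * (1.0 + tol)) x *= 10.0;           // tolerance on the <= 0.1 test
    return std::fabs(x - 1.0) <= tol;                   // tolerance on the == 1 test
}
```

With these tolerances a value such as 9.9999999999 is accepted as "close enough to 10", which is a false positive if exact powers are required; tightening `tol` trades that for false negatives caused by rounding in the repeated scaling.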

how about that:
bool isPow10(double number, double epsilon)
{
if (number > 0)
{
for (int i=1; i <16; i++)
{
if ( (number >= (pow((double)10,i) - epsilon)) &&
(number <= (pow((double)10,i) + epsilon)))
{
return true;
}
}
}
return false;
}
I guess if performance is an issue the few values could be precomputed, with or without the epsilon according to the needs.

A variant of this one:
double log10_value= log10(value);
double integer_value;
double fractional_value= modf(log10_value, &integer_value);
return fractional_value==0.0;
Note that the comparison to 0.0 is exact rather than within a particular epsilon since you want to ensure that log10_value is an integer.
EDIT: Since this sparked a bit of controversy due to log10 possibly being imprecise and the generic understanding that you shouldn't compare doubles without an epsilon, here's a more precise way of determining if a double is a power of 10 using only properties of powers of 10 and IEEE 754 doubles.
First, a clarification: a double can represent powers of 10 exactly only up to 1E22, since 1e22 = 5^22 * 2^22 and 5^22 has only 52 significant bits, which fits in the 53-bit significand (5^23 no longer fits). So we can determine if a double is (2*5)^n for n = [0, 22]:
bool is_pow10(double value)
{
int exponent;
double mantissa= frexp(value, &exponent);
int exponent_adjustment= exponent/10;
int possible_10_exponent= (exponent - exponent_adjustment)/3;
if (possible_10_exponent>=0 &&
possible_10_exponent<=22)
{
mantissa*= pow(2.0, exponent - possible_10_exponent);
return mantissa==pow(5.0, possible_10_exponent);
}
else
{
return false;
}
}
Since 2^10==1024, that adds an extra bit of significance that we have to remove from the possible power of 5.

Related

Display of Double Precision Floating Points vs. Their Comparison

Preamble
I am looking into a system developed to be used by people who don't understand floating point arithmetic. For this reason the implementation of comparison for floating point numbers is not exposed to the people using the system. Currently comparisons of floating point numbers occur like this (And this cannot change due to legacy reasons):
// If either number is not finite, do default comparison
if (!IsFinite(num1) || !IsFinite(num2)) {
output = (num1 == num2);
} else {
// Get exponents of both numbers to determine epsilon for comparison
tmp = (OSINT32*)&num1+1;
exp1 = (((*tmp)>>20)& 0x07ff) - 1023;
tmp = (OSINT32*)&num2+1;
exp2 = (((*tmp)>>20)& 0x07ff) - 1023;
// Check if exponent is the same
if (exp1 != exp2) {
output = false;
} else {
// Calculate epsilon based on the magic number 47 (presumably calculated experimentally)?
epsilon = pow(2.0,exp1-47);
output = (fabs(num2-num1) <= epsilon);
}
}
The crux of it is, we calculate the epsilon based on the exponent of the number to stop users of the interface from making floating point comparison mistakes. A BIG NOTE: This is for people who are not software programmers so when they do pow(sqrt(2), 2) == 2 they don't get a big surprise. Maybe this is not the best idea, but like i said, it cannot be changed.
The Problem
We are having trouble figuring out how to display numbers to the user. In the past they simply displayed the number to 15 significant digits. But this results in problems of the following type:
>> SHOW 4.1 MOD 1
>> 0.099999999999999996
>> SHOW (4.1 MOD 1) == 0.1
>> TRUE
The comparison calls this correct because of the generated epsilon. But the printing of the number is confusing for people, how is 0.099999999999999996 = 0.1?. We need a way to show the number such that it represents the shortest number of significant bits to which a number compared to it would be TRUE. So for 0.099999999999999996 this would be 0.1, for 0.569999999992724327 it would be 0.569999999992725.
Is this possible?
You could calculate (num - pow(2.0, exp - 47)) and (num + pow(2.0, exp - 47)), convert both to string and search the smallest decimal between the range.
The exact value of a normal double is mantissa * pow(2.0, exp - 52), where mantissa is an integer in [2^52, 2^53), so adding or subtracting pow(2.0, exp - 47) changes the mantissa by 2^5 = 32, which is exactly representable without rounding errors (except in corner cases where the mantissa under- or overflows that range, i.e. if it lies within 2^5 of either end; you might want to check for these*).
Then you have two strings; search for the first position where the digits differ and cut off there. There are a lot of rounding cases to consider, though, especially when you want not just a correct number in the range but the number closest to the input (that might not be needed). For example, if you get "1.23" and "1.24", you might even want to output "1.235".
This also shows that your example is wrong. epsilon for 0.569999999992724327 is (to maximal precision) 0.000000000000003552713678800500929355621337890625. The ranges are 0.569999999992720773889232077635824680328369140625 to 0.569999999992727879316589678637683391571044921875 and would be cut off at 0.569999999992725 (or 0.569999999992723 if you prefer that rounding)
An easier to implement sledgehammer method would be to output it to the maximal precision, cut one digit off, convert it back to double, check if it compares correctly. Then continue cutting, till the comparison fails. (could be improved with a binary search)
* They should still be exactly representable, but your comparison method will behave very oddly. Consider num1 == 1 and num2 == 1 - pow(2.0, -53) = 0.99999999999999988897769753748434595763683319091796875. Their difference 0.00000000000000011102230246251565404236316680908203125 is below your epsilon 0.000000000000003552713678800500929355621337890625, but the comparison will say they differ, because they have different exponents.
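The "sledgehammer" approach could look roughly like the sketch below, which grows the number of significant digits instead of cutting them off and stops at the first string that the comparison accepts. `legacy_equal` is only a stand-in for the legacy comparison in the question (it uses `std::ilogb` rather than raw bit access and treats zero and subnormals more simply), and `shortest_display` is a hypothetical name.

```cpp
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <string>

// Approximation of the question's comparison: same binary exponent and
// |a - b| <= 2^(exponent - 47). Zero is compared exactly (an assumption).
bool legacy_equal(double a, double b) {
    if (!std::isfinite(a) || !std::isfinite(b)) return a == b;
    if (a == 0.0 || b == 0.0) return a == b;
    const int ea = std::ilogb(a);
    const int eb = std::ilogb(b);
    if (ea != eb) return false;
    return std::fabs(a - b) <= std::ldexp(1.0, ea - 47);
}

// Returns the shortest "%g" rendering (1 to 17 significant digits) whose
// parsed value still compares equal to num under legacy_equal.
std::string shortest_display(double num) {
    char buf[32];
    for (int digits = 1; digits <= 17; ++digits) {
        std::snprintf(buf, sizeof buf, "%.*g", digits, num);
        if (legacy_equal(std::strtod(buf, nullptr), num)) return buf;
    }
    return buf;  // 17 significant digits always round-trip a double
}
```

For the value of `fmod(4.1, 1)` this yields "0.1", because parsing "0.1" gives a double with the same exponent and a difference below the generated epsilon.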
Yes, it's possible.
double a=fmod(4.1,1);
cerr<<std::setprecision(0)<<a<<"\n";
cerr<<std::setprecision(10)<<a<<"\n";
cerr<<std::setprecision(20)<<a<<"\n";
produces:
0.1
0.1
0.099999999999999644729
I think you just need to determine what level of display precision corresponds to your epsilon value.
We need a way to show the number such that it represents the shortest
number of significant bits to which a number compared to it would be
TRUE.
Can't you just do it the brute-force-ish way?
float num = 0.09999999;
for (int precision = 0; precision < MAX_PRECISION; ++precision) {
std::stringstream str;
float tmp = 0;
str << std::fixed << std::setprecision(precision) << num;
str >> tmp;
if (num == tmp) {
std::cout << std::fixed << std::setprecision(precision) << num;
break;
}
}
It is not possible to avoid confusing users given the constraints you've specified. For one thing, 0.0999999999999996447 compares equal to 0.1, and 0.1000000000000003664 compares equal to 0.1, but 0.0999999999999996447 does not compare equal to 0.1000000000000003664. For another, 2.00000000000001421 compares equal to 2.0, but 1.999999999999999778 does not compare equal to 2.0 even though it's much closer to 2.0 than 2.00000000000001421 is.
Enjoy.

Do multiples of Pi to the thousandths have a value that may change how a loop executes?

Recently I decided to get into c++, and after going through the basics I decided to build a calculator using only iostream (just to challenge myself). After most of it was complete, I came across an issue with my loop for exponents. Whenever a multiple of Pi was used as the exponent, it looped way too many times. I fixed it in a somewhat redundant way and now I'm hoping someone might be able to tell me what happened. My unfixed code snippet is below. Ignore everything above and just look at the last bit of fully functioning code. All I was wondering was why values of pi would throw off the loop so much. Thanks.
bool TestForDecimal(double Num) /* Checks if the number given is whole or not */ {
if (Num > -INT_MAX && Num < INT_MAX && Num == (int)Num) {
return 0;
}
else {
return 1;
}
}
And then here's where it all goes wrong (Denominator is set to a value of 1):
if (TestForDecimal(Power) == 1) /* Checks if its decimal or not */ {
while (TestForDecimal(Power) == 1) {
Power = Power * 10;
Denominator = Denominator * 10;
}
}
If anyone could give me an explanation that would be great!
To clarify further, the while loop kept looping even after Power became a whole number (This only happened when Power was equal to a multiple of pi such as 3.1415 or 6.2830 etc.)
Here's a complete example you can try:
#include <climits> // INT_MAX
#include <cstdlib> // system
#include <iostream>
bool TestForDecimal(double Num) /* Checks if the number given is whole or not */ {
if (Num > -INT_MAX && Num < INT_MAX && Num == (int)Num) {
return 0;
}
else {
return 1;
}
}
void foo(double Power) {
double x = Power;
if (TestForDecimal(x) == 1) /* Checks if its decimal or not */ {
while (TestForDecimal(x) == 1) {
x = x * 10;
std::cout << x << std::endl;
}
}
}
int main() {
foo(3.145); // Substitute this with 3.1415 and it doesn't work (this was my problem)
system("Pause");
return 0;
}
What's wrong with doing something like this?
#include <cmath> // std::fabs and std::round
#include <cfloat> // DBL_EPSILON
bool TestForDecimal(double Num) {
double diff = std::fabs(std::round(Num) - Num);
// true if not a whole number
return diff > DBL_EPSILON;
}
The loop is quite inefficient... what if Num is large...
A faster way could be something like
if (Num == static_cast<int>(Num))
or
if (Num == (int)Num)
if you prefer a C-style syntax.
Then a range check may be useful: it does not make sense to ask whether Num is an integer this way when it is larger than 2^31 (about 2 billion), the limit of a 32-bit signed int.
Finally, do not think of these numbers as decimals. They are stored as binary numbers, so instead of multiplying Power and Denominator by 10 you are better off multiplying them by 2.
Most decimal fractions can't be represented exactly in a binary floating-point format, so what you're trying to do can't work in general. For example, with a standard 64-bit double format, the closest representable value to 3.1415 is more like 3.1415000000000001812.
If you need to represent decimal fractions exactly, then you'll need a non-standard type. Boost.Multiprecision has some decimal types, and there's a proposal to add decimal types to the standard library; some implementations may have experimental support for this.
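To see what the double actually stores, one option is to print it with more digits than the default. This is a small sketch; `stored_value` is a hypothetical helper, and it assumes the C library prints the exact decimal expansion of the binary value (glibc does).

```cpp
#include <cstdio>
#include <string>

// Formats x with 20 significant digits so the binary rounding becomes visible.
std::string stored_value(double x) {
    char buf[48];
    std::snprintf(buf, sizeof buf, "%.20g", x);
    return buf;
}
```

Under that assumption `stored_value(3.1415)` gives "3.1415000000000001812", so multiplying by 10 four times does not land exactly on 31415 in general, while a value like 0.75, which is a sum of powers of two, prints as "0.75".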
Beware. A double is (generally, and I assume you use a standard architecture) represented in IEEE 754 format, that is mantissa * 2^exponent. For a double, you have 53 bits of precision for the mantissa (52 stored plus one implicit bit), one bit for the sign and 11 for the exponent. When you multiply it by 10 it will grow, and it will hold an integer value as soon as the exponent is greater than 53.
Unfortunately, unless int is 64 bits wide on your system, a 53-bit integer cannot be represented as a 32-bit int, and your test will fail again.
So if you have a 32-bit int, you will never reach an integer value. You will more likely reach an infinity representation and stay there ...
The only use case where it could work, would be if you started with a number that can be represented with a small number of negative power of 2, for example 0.5 (1/2), 0.25(1/4), 0.75(1/2 + 1/4), giving almost all digits of mantissa part being 0.
After studying your "unfixed" function, from what I can tell, here's your basic algorithm:
double TestForDecimal(double Num) { ...
A function that accepts a double and returns a double. This would make sense if the returned value was the decimal value, but since that's not the case, perhaps you meant to use bool?
while (Num > 1) { make it less }
While there is nothing inherently wrong with this, it doesn't really address negative numbers with large magnitudes, so you'll run into problems there.
if (Num > -INT_MAX && Num < INT_MAX && Num == (int)Num) { return 0; }
This means that if Num is within the signed integer range and its integer typecast is equal to itself, return a 0 typecasted to a double. This means you don't care whether numbers outside the integer range are whole numbers or not. To fix this, change the condition to if (Num == (long)Num), which widens the range on platforms where sizeof(long) == sizeof(double).
Perhaps the algorithm your function follows that I've just explained might shed some light on your problem.

Determining the number of decimal digits in a double - C++

I am trying to get the number of digits after a decimal point in a double. Currently, my code looks like this:
int num_of_decimal_digits = 0;
while (someDouble - someInt != 0)
{
someDouble = someDouble*10;
someInt = someDouble;
num_of_decimal_digits++;
}
Whenever I enter a decimal in for someDouble that is less than one, the loop gets stuck and repeats infinitely. Should I use static_cast? Any advice?
Due to floating-point rounding error, multiplying by 10 is not necessarily an exact decimal shift. You can test the absolute error of the difference rather than comparing it for exact equality with 0.
while (abs(someDouble - someInt) > epsilon)
Or you can acknowledge that a double with a 53-bit mantissa can only represent log10(2^53) ≈ 15.9 decimal digits, and limit the loop to 16 iterations.
while (someDouble - someInt != 0 && num_of_decimal_digits < 16)
Or both.
while (abs(someDouble - someInt) > epsilon && num_of_decimal_digits < 16)
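Put together as a complete function, the "both" variant might look like the sketch below. Two things here are assumptions rather than part of the answer: the name `count_decimal_digits` with its default epsilon of 1e-9, and the use of `std::round` in place of the truncating `someInt` assignment, so that a value like 34.999999999 counts as close to 35.

```cpp
#include <cmath>

// Counts how many times value must be multiplied by 10 before it is within
// epsilon of a whole number, giving up after 16 digits.
int count_decimal_digits(double value, double epsilon = 1e-9) {
    value = std::fabs(value);
    int digits = 0;
    while (std::fabs(value - std::round(value)) > epsilon && digits < 16) {
        value *= 10.0;
        ++digits;
    }
    return digits;
}
```

A fixed absolute epsilon is a simplification: for large inputs the rounding error after several multiplications can exceed it, in which case the 16-iteration cap is what ends the loop.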
The naive answer would be:
int num_of_decimal_digits = 0;
double absDouble = someDouble > 0 ? someDouble : someDouble * -1;
while (absDouble - someInt != 0)
{
absDouble = absDouble*10;
someInt = absDouble;
num_of_decimal_digits++;
}
This solves your problem of negative numbers.
However, this solution is likely not going to give you the output you desire in a lot of cases because of the way that floating point numbers are represented. For example 0.35 might really be represented as 0.3499999999998 the way floating point numbers are stored in binary. I would suggest that you share more background information about what you are hoping to accomplish with this code (your input and your desired output). There is likely a much better solution for what you are attempting to accomplish.

C++ Should this be easier?

long-time listener, first-time caller. I am relatively new to programming and was looking back at some of the code I wrote for an old lab. Is there an easier way to tell if a double is evenly divisible by an integer?
double num (//whatever);
int divisor (//an integer);
bool bananas = true;
if(floor(num)!= num || static_cast<int>(num)%divisor != 0) {
bananas=false;
}
if(bananas==true) {
//do stuff;
}
The question is strange, and the checks are as well. The problem is that it makes little sense to speak about divisibility of a floating point number because floating point numbers are represented imprecisely in binary, and divisibility is about exactitude.
I encourage you to read this article, by David Goldberg: What Every Computer Scientist Should Know About Floating Point Arithmetic. It is a bit long-winded, so you may appreciate this website, instead: The Floating-Point Guide.
The truth is that floor(num) == num is a strange piece of code.
num is a double
floor(num) returns a double whose value is a whole number
The trouble is that this does not check what you really wanted. For example, suppose (for the sake of example) that 5 cannot be represented exactly as a double, therefore, instead of storing 5, the computer will store 4.999999999999.
double num = 5; // 4.999999999999999
double floored = floor(num); // 4.0
assert(num != floored);
In general exact comparisons are meaningless for floating point numbers, because of rounding errors.
If you insist on using floor, I suggest using floor(num + 0.5), which is better, though slightly biased. A better rounding method is banker's rounding because it is unbiased, and the article references others if you wish. Note that std::round is not banker's rounding: it rounds halfway cases away from zero, whereas std::nearbyint and std::rint round halfway cases to even under the default rounding mode.
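For reference, the three rounding behaviours differ only on halfway cases. The wrapper names below are made up for this sketch, and the last one assumes the floating-point environment is left in its default round-to-nearest mode.

```cpp
#include <cmath>

// floor(x + 0.5): halfway cases always go toward +infinity (the slight bias).
double round_floor_half(double x) { return std::floor(x + 0.5); }

// std::round: halfway cases go away from zero.
double round_half_away(double x) { return std::round(x); }

// std::nearbyint: halfway cases go to the even neighbour (banker's rounding)
// as long as the rounding mode is the default FE_TONEAREST.
double round_half_even(double x) { return std::nearbyint(x); }
```

For example, 2.5 becomes 3 with the first two and 2 with the third, while -2.5 becomes -2, -3 and -2 respectively.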
As for your question, first you need a double aware modulo: fmod, then you need to remember the avoid exact comparisons bit.
A first (naive) attempt:
// divisor is deemed non-zero
// epsilon is a constant
double mod = fmod(num, divisor); // divisor will be converted to a double
if (mod <= epsilon) { }
Unfortunately it fails one important test: the magnitude of mod depends on the magnitude of divisor, thus if divisor is smaller than epsilon to begin with, it will always be true.
A second attempt:
// divisor is deemed non-zero
double const epsilon = divisor / 1000.0;
double mod = fmod(num, divisor);
if (mod <= epsilon) { }
Better, but not quite there: mod and epsilon are signed! Yes, it's a bizarre modulo: the sign of mod is the sign of num.
A third attempt:
// divisor is deemed non-zero
double const eps = fabs(divisor / 1000.0);
double mod = fabs(fmod(num, divisor));
if (mod <= eps) { }
Much better.
Should work fairly well too if divisor comes from an integer, as there won't be precision issues... or at least not too much.
EDIT: fourth attempt, by @ybungalobill
The previous attempt does not deal well with situations where num/divisor errs on the wrong side. Like 1.999/1.000 --> 0.999, it's nearly divisor so we should indicate equality, yet it failed.
// divisor is deemed non-zero
mod = fabs(fmod(num/divisor, 1));
if (mod <= 0.001 || fabs(1 - mod) <= 0.001) { }
Looks like a never ending task eh ?
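Wrapped up as a function, the fourth attempt might read as follows. The name `nearly_divisible`, the explicit zero check, and the tolerance parameter (defaulting to the 0.001 used above) are assumptions added for the sketch.

```cpp
#include <cmath>

// True if num / divisor is within tol of a whole number, whichever side of
// that whole number the rounding error happens to fall on.
bool nearly_divisible(double num, double divisor, double tol = 0.001) {
    if (divisor == 0.0) return false;  // the snippets above assume non-zero
    const double frac = std::fabs(std::fmod(num / divisor, 1.0));
    return frac <= tol || 1.0 - frac <= tol;
}
```

Because the tolerance is applied to the quotient, it is relative to one whole multiple of the divisor rather than to the magnitude of num.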
There is still cause for troubles though.
double has a limited precision, that is a limited number of digits that is representable (16 I think ?). This precision might be insufficient to represent an integer:
Integer n = 12345678901234567890;
double d = n; // 1.234567890123457 * 10^20
This truncation means it is impossible to map it back to its original value. This should not cause any issue with double and int, for example on my platform double is 8 bytes and int is 4 bytes, so it would work, but changing double to float or int to long could violate this assumption, oh hell!
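One way to check whether a particular integer survives the trip through a double is to convert it and convert it back. This is a sketch with a hypothetical name, `survives_double_round_trip`, using `std::int64_t` as the integer type.

```cpp
#include <cstdint>

// True if n is unchanged after int64 -> double -> int64.
bool survives_double_round_trip(std::int64_t n) {
    const double d = static_cast<double>(n);
    // 2^63 does not fit in int64_t; converting it back would be undefined
    // behaviour, and it can only arise when n was rounded up, so n is lost.
    if (d >= 9223372036854775808.0) return false;
    return static_cast<std::int64_t>(d) == n;
}
```

Every 32-bit int passes this check, since a double has 53 bits of precision; 64-bit values start failing just above 2^53.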
Are you sure you really need floating point, by the way ?
Based on the above comments, I believe you can do this...
double num (//whatever);
int divisor (//an integer);
if(fmod(num, divisor) == 0) {
//do stuff;
}
I haven't checked it but why not do this?
if (floor(num) == num && !(static_cast<int>(num) % divisor)) {
// do stuff...
}

Heuristic to identify if a series of 4 bytes chunks of data are integers or floats

What's the best heuristic I can use to identify whether a chunk of X 4-bytes are integers or floats? A human can do this easily, but I wanted to do it programmatically.
I realize that since every combination of bits will result in a valid integer and (almost?) all of them will also result in a valid float, there is no way to know for sure. But I still would like to identify the most likely candidate (which will virtually always be correct; or at least, a human can do it).
For example, let's take a series of 4-bytes raw data and print them as integers first and then as floats:
1 1.4013e-45
10 1.4013e-44
44 6.16571e-44
5000 7.00649e-42
1024 1.43493e-42
0 0
0 0
-5 -nan
11 1.54143e-44
Obviously they will be integers.
Now, another example:
1065353216 1
1084227584 5
1085276160 5.5
1068149391 1.33333
1083179008 4.5
1120403456 100
0 0
-1110651699 -0.1
1195593728 50000
These will obviously be floats.
PS: I'm using C++ but you can answer in any language, pseudo code or just in english.
The "common sense" heuristic from your example seems to basically amount to a range check. If one interpretation is very large (or a tiny fraction, close to zero), that is probably wrong. Check the exponent of the float interpretation and compare it to the exponent that results from a proper static cast of the integer interpretation to a float.
Looks like a Kolmogorov complexity issue. Basically, from what you show as an example, the shorter number (when printed as a string to be read by a human), be it integer or float, is the right answer for your heuristic.
Also, obviously if the value is an incorrect float, it is an integer :-)
Seems direct enough to implement.
You can probably "detect" it by looking at the high bits: with floats they'd generally be non-zero; with integers they'd be zero unless you're dealing with a very large number. So you could check whether (1 << 30) & number is 0 or not.
If both numbers are positive, your floats are reasonably large (at least about 1.2*10^-38, the smallest normalized float), and your ints are reasonably small (less than 8*10^6), then the check is pretty simple. Treat the data as a float and compare to the least normalized float.
#include <cstdint> // std::int32_t
#include <limits> // std::numeric_limits
union float_or_int {
float f;
std::int32_t i;
};
bool is_positive_normalized_float( const float_or_int &u ) {
return u.f >= std::numeric_limits<float>::min();
}
This assumes IEEE float and the same endianness between the CPU and the FPU.
A human can do this easily
A human can't do it at all. Ergo neither can a computer. There are 2^32 valid int values. A large number of them are also valid float values. There is no way of distinguishing the intent of the data other than by tagging it or by not getting into such a mess in the first place.
Don't attempt this.
You are going to be looking at the upper 8 or 9 bits. That's where the sign and exponent of a floating point value are. Values of 0x00, 0x80 and 0xFF here are pretty uncommon for valid float data.
In particular if the upper 9 bits are all 0 then this likely to be a valid floating point value only if all 32 bits are 0. Another way to say this is that if the exponent is 0, the mantissa should also be zero. If the upper bit is 1 and the next 8 bits are 0, this is legal, but also not likely to be valid. It represents -0.0 which is a legal floating point value, but a meaningless one.
To put this into numerical terms: if the upper byte is 0x00 (or 0x80), then the value has a magnitude of at most 2.35e-38. Planck's constant is 6.62e-34 m^2 kg/s; that's 4 orders of magnitude larger. The estimated diameter of a proton is much much larger than that (estimated at 1.6e-15 meters). The smallest non-zero value for audio data is about 2.3e-10. You aren't likely to see floating point values that are legitimate measurements of anything real that are smaller than 2.35e-38 but not zero.
Going the other direction, if the upper byte is 0xFF then this value is either Infinite, a NaN or at least 1.7e+38 in magnitude. The age of the universe is estimated to be 1.3e+10 years (about 4e+32 femtoseconds). The observable universe has roughly e+23 stars, Avogadro's number is 6.02e+23. Once again float values larger than e+38 rarely show up in legitimate measurements.
This is not to say that the FPU can't load or produce such values, and you will certainly see them in intermediate values of calculations if you are working with modern FPUs. A modern FPU will load a floating point value that has an exponent of 0 but the other bits are not 0. These are called denormalized values. This is why you are seeing small positive integers show up as float values in the range of e-42 even though the normal range of a float only goes down to e-38.
An exponent of all 1s represents Infinity. You probably won't find infinities in your data, but you would know better than I. -Infinity is 0xFF800000, +Infinity is 0x7F800000, any value other than 0 in the mantissa of Infinity is malformed. malformed infinities are used as NaNs.
Loading a NaN into a float register can cause it to throw an exception, so you want to use integer math to do your guessing about whether your data is float or int until you are fairly certain it is int.
If you know that your floats are all going to be actual values (no NaNs, INFs, denormals or other aberrant values) then you can use this as a criterion. In general an array of ints will have a high probability of containing "bad" float values.
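A minimal sketch of that criterion, using only integer operations so that no NaN is ever loaded into a float register. The function name `plausible_float_bits` is hypothetical, and the layout assumed is IEEE 754 single precision (1 sign bit, 8 exponent bits, 23 mantissa bits).

```cpp
#include <cstdint>

// Rejects bit patterns whose float interpretation would be a denormal,
// negative zero, an infinity or a NaN; accepts everything else.
bool plausible_float_bits(std::uint32_t bits) {
    const std::uint32_t exponent = (bits >> 23) & 0xFFu;  // 8 bits below the sign bit
    if (exponent == 0) return bits == 0;    // only an all-zero word counts as a float
    if (exponent == 0xFFu) return false;    // infinities and NaNs
    return true;
}
```

Applied to each 4-byte word, a block of small integers fails almost everywhere (exponent field 0), while a block of ordinary float values passes; a block that passes may of course still be large integers.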
I assume the following:
that you mean IEEE 754 single precision floating point numbers.
that the sign bit of the float is saved in the MSB of an int.
So here we go:
#include <cassert>
#include <cstdint>
static bool probablyFloat(uint32_t bits) {
bool sign = (bits & 0x80000000U) != 0;
int exp = ((bits & 0x7f800000U) >> 23) - 127;
uint32_t mant = bits & 0x007fffff;
// +- 0.0
if (exp == -127 && mant == 0)
return true;
// +- 1 billionth to 1 billion
if (-30 <= exp && exp <= 30)
return true;
// some value with only a few binary digits
if ((mant & 0x0000ffff) == 0)
return true;
return false;
}
int main() {
assert(probablyFloat(1065353216));
assert(probablyFloat(1084227584));
assert(probablyFloat(1085276160));
assert(probablyFloat(1068149391));
assert(probablyFloat(1083179008));
assert(probablyFloat(1120403456));
assert(probablyFloat(0));
assert(probablyFloat(-1110651699));
assert(probablyFloat(1195593728));
return 0;
}
simplifying what Alan said, I'd ONLY look at the integer form. and say, if the number is bigger than 99999999 then it's almost definitely a float.
This has the advantage that it's fast, easy, and avoids nan issues.
It has the disadvantage that it pretty much full of crap... i didn't actually look at what floats these will represent or anything, but it looks reasonable from your examples...
In any case, this is a heuristic, so it's GONNA be full of crap, and not always work anyway...
Measure with a micrometer, mark with chalk, cut with an axe.
Here is a heuristic I came up with, based on @kriss' idea. After a brief look at some of my data, it seems to work fairly well.
I am using it in a disassembler to detect if a 32-bit value was likely originally an integer or float literal.
public class FloatUtil {
private static final int canonicalFloatNaN = Float.floatToRawIntBits(Float.NaN);
private static final int maxFloat = Float.floatToRawIntBits(Float.MAX_VALUE);
private static final int piFloat = Float.floatToRawIntBits((float)Math.PI);
private static final int eFloat = Float.floatToRawIntBits((float)Math.E);
private static final DecimalFormat format = new DecimalFormat("0.####################E0");
public static boolean isLikelyFloat(int value) {
// Check for some common named float values
if (value == canonicalFloatNaN ||
value == maxFloat ||
value == piFloat ||
value == eFloat) {
return true;
}
// Check for some named integer values
if (value == Integer.MAX_VALUE || value == Integer.MIN_VALUE) {
return false;
}
// a non-canonical NaN is more likely to be an integer
float floatValue = Float.intBitsToFloat(value);
if (Float.isNaN(floatValue)) {
return false;
}
// Otherwise, whichever has a shorter scientific notation representation is more likely.
// Integer wins the tie
String asInt = format.format(value);
String asFloat = format.format(floatValue);
// try to strip off any small imprecision near the end of the mantissa
int decimalPoint = asFloat.indexOf('.');
int exponent = asFloat.indexOf("E");
int zeros = asFloat.indexOf("000");
if (zeros > decimalPoint && zeros < exponent) {
asFloat = asFloat.substring(0, zeros) + asFloat.substring(exponent);
} else {
int nines = asFloat.indexOf("999");
if (nines > decimalPoint && nines < exponent) {
asFloat = asFloat.substring(0, nines) + asFloat.substring(exponent);
}
}
return asFloat.length() < asInt.length();
}
}
And here are some of the values it works for (and a couple it doesn't)
@Test
public void isLikelyFloatTest() {
Assert.assertTrue(FloatUtil.isLikelyFloat(Float.floatToRawIntBits(1.23f)));
Assert.assertTrue(FloatUtil.isLikelyFloat(Float.floatToRawIntBits(1.0f)));
Assert.assertTrue(FloatUtil.isLikelyFloat(Float.floatToRawIntBits(Float.NaN)));
Assert.assertTrue(FloatUtil.isLikelyFloat(Float.floatToRawIntBits(Float.NEGATIVE_INFINITY)));
Assert.assertTrue(FloatUtil.isLikelyFloat(Float.floatToRawIntBits(Float.POSITIVE_INFINITY)));
Assert.assertTrue(FloatUtil.isLikelyFloat(Float.floatToRawIntBits(1e-30f)));
Assert.assertTrue(FloatUtil.isLikelyFloat(Float.floatToRawIntBits(1000f)));
Assert.assertTrue(FloatUtil.isLikelyFloat(Float.floatToRawIntBits(1f)));
Assert.assertTrue(FloatUtil.isLikelyFloat(Float.floatToRawIntBits(-1f)));
Assert.assertTrue(FloatUtil.isLikelyFloat(Float.floatToRawIntBits(-5f)));
Assert.assertTrue(FloatUtil.isLikelyFloat(Float.floatToRawIntBits(1.3333f)));
Assert.assertTrue(FloatUtil.isLikelyFloat(Float.floatToRawIntBits(4.5f)));
Assert.assertTrue(FloatUtil.isLikelyFloat(Float.floatToRawIntBits(.1f)));
Assert.assertTrue(FloatUtil.isLikelyFloat(Float.floatToRawIntBits(50000f)));
Assert.assertTrue(FloatUtil.isLikelyFloat(Float.floatToRawIntBits(Float.MAX_VALUE)));
Assert.assertTrue(FloatUtil.isLikelyFloat(Float.floatToRawIntBits((float)Math.PI)));
Assert.assertTrue(FloatUtil.isLikelyFloat(Float.floatToRawIntBits((float)Math.E)));
// Float.MIN_VALUE is equivalent to integer value 1. this should be detected as an integer
// Assert.assertTrue(FloatUtil.isLikelyFloat(Float.floatToRawIntBits(Float.MIN_VALUE)));
// This one doesn't quite work. It has a series of 2 0's, but we only strip 3 0's or more
// Assert.assertTrue(FloatUtil.isLikelyFloat(Float.floatToRawIntBits(1.33333f)));
Assert.assertFalse(FloatUtil.isLikelyFloat(0));
Assert.assertFalse(FloatUtil.isLikelyFloat(1));
Assert.assertFalse(FloatUtil.isLikelyFloat(10));
Assert.assertFalse(FloatUtil.isLikelyFloat(100));
Assert.assertFalse(FloatUtil.isLikelyFloat(1000));
Assert.assertFalse(FloatUtil.isLikelyFloat(1024));
Assert.assertFalse(FloatUtil.isLikelyFloat(1234));
Assert.assertFalse(FloatUtil.isLikelyFloat(-5));
Assert.assertFalse(FloatUtil.isLikelyFloat(-13));
Assert.assertFalse(FloatUtil.isLikelyFloat(-123));
Assert.assertFalse(FloatUtil.isLikelyFloat(20000000));
Assert.assertFalse(FloatUtil.isLikelyFloat(2000000000));
Assert.assertFalse(FloatUtil.isLikelyFloat(-2000000000));
Assert.assertFalse(FloatUtil.isLikelyFloat(Integer.MAX_VALUE));
Assert.assertFalse(FloatUtil.isLikelyFloat(Integer.MIN_VALUE));
Assert.assertFalse(FloatUtil.isLikelyFloat(Short.MIN_VALUE));
Assert.assertFalse(FloatUtil.isLikelyFloat(Short.MAX_VALUE));
}