How to get rid of minus sign from signed zero - c++

I am using asin to calculate the angle. The code is as below :
double FindAngle(const double theValue)
{
return asin(theValue);
}
FindAngle returns -0.0 (signed zero) when the argument theValue is -0.0. How do I get rid of the minus sign in the return value?

You can do the following:
double FindAngle(const double theValue)
{
return (asin(theValue) + 0.0);
}
I had the same problem and that worked for me.
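A minimal sketch of why adding 0.0 works, assuming IEEE 754 arithmetic in the default round-to-nearest mode and no value-changing optimisations (such as -ffast-math). The names strip_negative_zero and find_angle are hypothetical; std::signbit is the standard way to observe the sign of a zero.

```cpp
#include <cmath>

// Hypothetical helper: maps -0.0 to +0.0 and leaves every other value alone.
// Under IEEE 754 round-to-nearest, (-0.0) + (+0.0) is +0.0, while x + 0.0 == x
// for every other x (NaN stays NaN).
double strip_negative_zero(double x) {
    return x + 0.0;
}

// Same shape as FindAngle in the question, with the sign of zero cleared.
double find_angle(double value) {
    return strip_negative_zero(std::asin(value));
}
```

Checking std::signbit(find_angle(-0.0)) should then give false, while std::signbit(-0.0) itself is true.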

If you just want to convert -0 to 0 and leave other values untouched, just do a comparison.
double FindAngle(double value) {
double res = asin(value);
if (res == 0.0) res = 0.0;
return res;
}

Include <cmath> and call std::fabs on your return value if you want all results to be positive, or check whether the return value is equal to -0.0 and take the absolute value for just that case.
double FindAngle(const double theValue)
{
return std::fabs(asin(theValue));
}

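You can use the following method (note that this snippet is Java, not C++; in C++ you can test for a negative zero with std::signbit from <cmath>).
value = Float.compare(value, -0.0f) == 0 ? 0.0f : value ;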

Are you sure that signed zero is your problem? (In this case, adding 0.0 to it—as proposed by FacundoJ above—would in fact solve it. Provided your arithmetic conforms to IEEE 754, that is.)
If, on the other hand, your problem is that printf("%f", x) produces -0.000000 (or similar for a similar format specifier), then just adding 0.0 is not enough: you will get the same output for any small, but negative, value.
In this case some actual programming is needed (at least I know of no better solution). I used something like this the other day:
#include <float.h> /* DBL_MIN */
#include <stdio.h>
#include <string.h>
int snrfmt(char *s, const char *format, double x)
{
int n, m;
char z[32];
n = sprintf(s, format, x);
m = sprintf(z, format, -DBL_MIN);
if (n == m && strcmp(s, z) == 0)
n = sprintf(s, format, 0.0);
return n;
}
as a kind-of replacement for sprintf():
double x = -1.23E-45;
char nr[80];
(void)snrfmt(nr, "%#+010.4f", x);
puts(nr);
This produces "+0000.0000" as desired (but of course "-0000.0001" for x = -0.50001E-4).

Related

Why is my double or int value always 0 after division?

I'm fairly new to C++ and I'm experiencing some strange behaviour from a percentage increase method I am writing for some image editing software.
What I want to do is give the R G or B value of the current pixel and divide it by some modifier, then multiply it by the new value to return the percentage increase, fairly easy concept.
However, whenever I run my debugger, the return value is always 0. I thought this might be because I was doing operations on an integer that give negative numbers (or maybe a divide by zero could occur?), so I tried to use a double to store the output of the computation. However, I've had no luck.
The code I'm struggling with is below:
int Sliders::getPercentageIncrease(int currPixel, int newValue, int modifier)
{
// calculate return value
double returnVal = (currPixel / modifier) * newValue;
// Check we are returning a positive integer
if(returnVal >= 0)
return (int)returnVal;
// Return a negative integer value
return (int)(0 - returnVal);
}
What am I doing wrong here?
NOTE: I have checked values, of inputs in my debugger and I get stuff like:
currPixel = 30
newValue = 119
modifier = 200
From this I would expect an output of 18 (I am not concerned with returning decimal figures)
Your current calculation only involves integers, so it is affected by integer division (which discards the fractional part, truncating towards zero).
(currPixel / modifier) * newValue
|                    |
---------------------- integer division, e.g. 10/3 = 3, not 3.333
The result is then cast to double, but the accuracy is lost before this point.
Consider the following:
#include <iostream>
using namespace std;
int main() {
int val1 = 10;
int val2 = 7;
int val3 = 9;
double outval1 = (val1 / val2) * val3;
double outval2 = ((double)val1 / val2) * val3;
cout << "without cast: " << outval1 << "\nwith cast: "<< outval2 << std::endl;
return 0;
}
The output of this is:
without cast: 9
with cast: 12.8571
Note that the cast has to be applied in the right place:
(double)(val1 / val2) * val3 == 9.0 //casts result of (val1/val2) after integer division
(val1 / val2) * (double)val3 == 9.0 //promotes result of (val1/val2) after integer division
((double)val1 / val2) * val3 == 12.8571 //promotes val2 before division
(val1 / (double)val2) * val3 == 12.8571 //promotes val1 before division
Due to promotion of the other operands, if in doubt you can just cast everything and the resulting code will be the same:
((double)val1 / (double)val2) * (double)val3 == 12.8571
It is a little more verbose though.
Since all three parameters are integer the result of the calculation
double returnVal = (currPixel / modifier) * newValue;
will always be truncated. Add cast to (double) and the result should be fine. Simply:
double returnVal = ((double)currPixel / modifier) * newValue;
If you only set a cast before the bracket the result of the division stays an integer.
As long as all values are in a limited range, say 0 <= value < 1000, which is common for colour values, do something like
int returnVal = (currPixel * newValue) / modifier;
No need for doubles; it will even speed up the code.
Needless to say, modifier should not be zero.
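A small sketch of the multiply-first idea using the values from the question. The function names are hypothetical, and the code assumes non-negative inputs small enough that the product fits in an int, with modifier greater than zero. The rounded variant is an addition of mine, since the question expected 18 rather than the truncated 17.

```cpp
// Original order of operations: 30 / 200 is 0 in integer arithmetic.
int scale_divide_first(int currPixel, int newValue, int modifier) {
    return (currPixel / modifier) * newValue;
}

// Multiply first, then divide: only the final quotient is truncated.
int scale_truncated(int currPixel, int newValue, int modifier) {
    return (currPixel * newValue) / modifier;
}

// Same, but rounded to nearest by adding half the divisor before dividing.
int scale_rounded(int currPixel, int newValue, int modifier) {
    return (currPixel * newValue + modifier / 2) / modifier;
}
```

With currPixel = 30, newValue = 119 and modifier = 200, the exact value is 17.85: the divide-first version gives 0, the truncated version 17 and the rounded version 18.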
Do this:
// calculate return value
double returnVal = (static_cast<double>(currPixel) / modifier) * newValue;
Or this:
double returnVal = (currPixel / static_cast<double>(modifier)) * newValue;
As you know, operator / is performed first, and then operator *. I have cast one of the operands of / to double, hence the division is performed in double. The left operand of * is then a double (since / produced a double), so the multiplication is performed in double too. For clarity and correctness, you may write:
double returnVal = (static_cast<double>(currPixel) / static_cast<double>(modifier)) * static_cast<double>(newValue);
Or simply:
double returnVal = (double(currPixel) / (double)modifier) * (double)newValue;
But the following is WRONG:
double returnVal = (double)(currPixel / modifier) * /*(double)*/ newValue;
because the division would still be performed on ints! It is like:
double x = 10/3;
Where you need (either):
double x = 10.0/3;
double x = 10/3.0;
double x = (double)10/3;
Casting to double should fix the error.
double returnVal = (double ) (currPixel) / (modifier) * newValue;
See the type casting rules in C.

How to code a modulo (%) operator in C/C++/Obj-C that handles negative numbers

One of my pet hates of C-derived languages (as a mathematician) is that
(-1) % 8 // comes out as -1, and not 7
fmodf(-1,8) // fails similarly
What's the best solution?
C++ allows the possibility of templates and operator overloading, but both of these are murky waters for me. Examples gratefully received.
First of all, I'd like to note that (at least before C++11) you cannot even rely on the fact that (-1) % 8 == -1. The only thing you can rely on is that (x / y) * y + (x % y) == x. Whether or not the remainder is negative is implementation-defined.
Reference: C++03 paragraph 5.6 clause 4:
The binary / operator yields the quotient, and the binary % operator yields the remainder from the division of the first expression by the second. If the second operand of / or % is zero the behavior is undefined; otherwise (a/b)*b + a%b is equal to a. If both operands are nonnegative then the remainder is nonnegative; if not, the sign of the remainder is implementation-defined.
Here is a version that handles negative operands on either side. It returns the remainder of floored division, i.e. the remainder left when the quotient is rounded towards negative infinity, so the result is zero or has the sign of the divisor. mod(-1,8) results in 7, while mod(13, -8) is -3.
int mod(int a, int b)
{
if(b < 0) //you can check for b == 0 separately and do what you want
return -mod(-a, -b);
int ret = a % b;
if(ret < 0)
ret+=b;
return ret;
}
Here is a C function that handles positive OR negative integer OR fractional values for BOTH OPERANDS
#include <math.h>
float mod(float a, float N) {return a - N*floor(a/N);} //return in range [0, N)
This is surely the most elegant solution from a mathematical standpoint. However, I'm not sure if it is robust in handling integers. Sometimes floating point errors creep in when converting int -> fp -> int.
I am using this code for non-int s, and a separate function for int.
NOTE: need to trap N = 0!
Tester code:
#include <math.h>
#include <stdio.h>
float mod(float a, float N)
{
float ret = a - N * floor (a / N);
printf("%.1f mod %.1f = %.1f \n", a, N, ret);
return ret;
}
int main (int argc, char** argv)
{
printf ("fmodf(-10.2, 2.0) = %.2f == FAIL! \n\n", fmodf(-10.2, 2.0));
float x;
x = mod(10.2f, 2.0f);
x = mod(10.2f, -2.0f);
x = mod(-10.2f, 2.0f);
x = mod(-10.2f, -2.0f);
return 0;
}
(Note: You can compile and run it straight out of CodePad: http://codepad.org/UOgEqAMA)
Output:
fmodf(-10.2, 2.0) = -0.20 == FAIL!
10.2 mod 2.0 = 0.2
10.2 mod -2.0 = -1.8
-10.2 mod 2.0 = 1.8
-10.2 mod -2.0 = -0.2
I have just noticed that Bjarne Stroustrup labels % as the remainder operator, not the modulo operator.
I would bet that this is its formal name in the ANSI C & C++ specifications, and that abuse of terminology has crept in. Does anyone know this for a fact?
But if this is the case then C's fmodf() function (and probably others) is very misleading. They should be labelled fremf(), etc.
The simplest general function to find the positive modulo would be this.
It works for both positive and negative values of x (N must be positive, and x % N + N must not overflow).
int modulo(int x,int N){
return (x % N + N) %N;
}
For integers this is simple. Just do
(((x < 0) ? ((x % N) + N) : x) % N)
where I am supposing that N is positive and representable in the type of x. Your favorite compiler should be able to optimize this out, such that it ends up in just one mod operation in assembler.
The best solution¹ for a mathematician is to use Python.
C++ operator overloading has little to do with it. You can't overload operators for built-in types. What you want is simply a function. Of course you can use C++ templating to implement that function for all relevant types with just 1 piece of code.
The standard C library provides fmod, if I recall the name correctly, for floating point types.
For integers you can define a C++ function template that always returns a non-negative remainder (corresponding to Euclidean division) as ...
#include <stdlib.h> // abs
template< class Integer >
auto mod( Integer a, Integer b )
-> Integer
{
Integer const r = a%b;
return (r < 0? r + abs( b ) : r);
}
... and just write mod(a, b) instead of a%b.
Here the type Integer needs to be a signed integer type.
If you want the common math behavior where the sign of the remainder is the same as the sign of the divisor, then you can do e.g.
template< class Integer >
auto floor_div( Integer const a, Integer const b )
-> Integer
{
bool const a_is_negative = (a < 0);
bool const b_is_negative = (b < 0);
bool const change_sign = (a_is_negative != b_is_negative);
Integer const abs_b = abs( b );
Integer const abs_a_plus = abs( a ) + (change_sign? abs_b - 1 : 0);
Integer const quot = abs_a_plus / abs_b;
return (change_sign? -quot : quot);
}
template< class Integer >
auto floor_mod( Integer const a, Integer const b )
-> Integer
{ return a - b*floor_div( a, b ); }
… with the same constraint on Integer, that it's a signed type.
¹ Because Python's integer division rounds towards negative infinity.
Here's a new answer to an old question, based on this Microsoft Research paper and references therein.
Note that from C99 and C++11 onwards, the semantics of integer division is truncation towards zero (see [expr.mul]/4). Furthermore, for D divided by d, C++11 guarantees the following about the quotient qT and remainder rT
auto const qT = D / d;
auto const rT = D % d;
assert(D == d * qT + rT);
assert(abs(rT) < abs(d));
assert(signum(rT) == signum(D) || rT == 0);
where signum maps to -1, 0, +1, depending on whether its argument is <, ==, > than 0 (see this Q&A for source code).
With truncated division, the sign of the remainder is equal to the sign of the dividend D, i.e. -1 % 8 == -1. C++11 also provides a std::div function that returns a struct with members quot and rem according to truncated division.
There are other definitions possible, e.g. so-called floored division can be defined in terms of the builtin truncated division
auto const I = signum(rT) == -signum(d) ? 1 : 0;
auto const qF = qT - I;
auto const rF = rT + I * d;
assert(D == d * qF + rF);
assert(abs(rF) < abs(d));
assert(signum(rF) == signum(d) || rF == 0);
With floored division, the sign of the remainder is equal to the sign of the divisor d. In languages such as Haskell and Oberon, there are builtin operators for floored division. In C++, you'd need to write a function using the above definitions.
Yet another way is Euclidean division, which can also be defined in terms of the builtin truncated division
auto const I = rT >= 0 ? 0 : (d > 0 ? 1 : -1);
auto const qE = qT - I;
auto const rE = rT + I * d;
assert(D == d * qE + rE);
assert(abs(rE) < abs(d));
assert(signum(rE) >= 0);
With Euclidean division, the remainder is always non-negative.
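The definitions above can be wrapped into functions roughly as follows. This is a sketch, not a library API: QuotRem, signum, floored_div and euclidean_div are hypothetical names, the code assumes C++11 truncating division, and it does not guard against d == 0 or the INT_MIN / -1 overflow.

```cpp
struct QuotRem {
    int quot;
    int rem;
};

// Maps negative, zero and positive values to -1, 0 and +1.
int signum(int x) { return (x > 0) - (x < 0); }

// Floored division: the remainder is zero or has the sign of the divisor d.
QuotRem floored_div(int D, int d) {
    int const qT = D / d;   // truncated quotient
    int const rT = D % d;   // truncated remainder
    int const I = (signum(rT) == -signum(d)) ? 1 : 0;
    return { qT - I, rT + I * d };
}

// Euclidean division: the remainder is always in [0, |d|).
QuotRem euclidean_div(int D, int d) {
    int const qT = D / d;
    int const rT = D % d;
    int const I = rT >= 0 ? 0 : (d > 0 ? 1 : -1);
    return { qT - I, rT + I * d };
}
```

For example, floored_div(-1, 8) yields quotient -1 and remainder 7, floored_div(13, -8) yields -2 and -3, and euclidean_div(-13, -8) yields 2 and 3.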
Oh, I hate the design of % for this too...
You may convert the dividend to unsigned like this:
unsigned int offset = (-INT_MIN) - (-INT_MIN)%divider
result = (offset + dividend) % divider
where offset is the multiple of the divider closest to (-INT_MIN), so adding it does not change the modulo. Here (-INT_MIN) is meant mathematically (2^31 for a 32-bit int); evaluated as an int it overflows, so compute it in unsigned arithmetic. Note that offset has unsigned type and the result will be an integer. Unfortunately it cannot correctly convert values in INT_MIN...(-offset-1), as they cause arithmetic overflow. But this method has the advantage of only a single additional arithmetic operation (and no conditionals) when working with a constant divider, so it is usable in DSP-like applications.
There's a special case, where the divider is 2^N (an integer power of two), for which the modulo can be calculated using simple arithmetic and bitwise logic as
dividend&(divider-1)
for example
x mod 2 = x & 1
x mod 4 = x & 3
x mod 8 = x & 7
x mod 16 = x & 15
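A minimal sketch of the mask trick for negative inputs. The name mod_pow2 is hypothetical, and the code assumes n is a power of two. Converting the int to unsigned first is well defined (it reduces modulo 2^W for a W-bit unsigned), and because n divides 2^W the masked result is the non-negative modulo.

```cpp
// Hypothetical helper; n must be a power of two (1, 2, 4, 8, ...).
unsigned mod_pow2(int x, unsigned n) {
    // Reinterpret x modulo 2^W, then keep only the low bits selected by n - 1.
    return static_cast<unsigned>(x) & (n - 1u);
}
```

So mod_pow2(-1, 8) gives 7 and mod_pow2(-9, 4) gives 3, whereas the built-in % would give -1 in both cases.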
A more common and less tricky way is to get the modulo using this function (works only with a positive divider):
int mod(int x, int y) {
int r = x%y;
return r<0?r+y:r;
}
This just corrects the result if it is negative.
You may also use this trick:
(p%q + q)%q
It is very short but uses two % operations, which are commonly slow.
I believe another solution to this problem would be to use variables of type long instead of int.
I was just working on some code where the % operator was returning a negative value, which caused some issues (for generating uniform random variables on [0,1] you don't really want negative numbers :) ). After switching the variables to type long, everything ran smoothly and the results matched the ones I was getting when running the same code in Python (important for me, as I wanted to be able to generate the same "random" numbers across several platforms).
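For a solution that uses no branches and only 1 mod, you can do the following
// Works for other sizes too,
// assuming you change 63 to the appropriate value.
// Relies on >> being an arithmetic shift for negative values.
int64_t mod(int64_t x, int64_t div) {
int64_t r = x % div;
// All-ones mask only when r is nonzero and its sign differs from div's.
int64_t fix = ((r ^ div) >> 63) & -(int64_t)(r != 0);
return r + (fix & div);
}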
/* Warning: macro mod evaluates its arguments' side effects multiple times. */
#define mod(r,m) (((r) % (m)) + ((((r) % (m)) < 0) ? (m) : 0))
... or just get used to getting any representative for the equivalence class.
Example template for C++
template< class T >
T mod( T a, T b )
{
T const r = a%b;
return ((r!=0)&&((r^b)<0) ? r + b : r);
}
With this template, the returned remainder will be zero or have the same sign as the divisor (denominator) (the equivalent of rounding towards negative infinity), instead of the C++ behavior of the remainder being zero or having the same sign as the dividend (numerator) (the equivalent of rounding towards zero).
#define MOD(a, b) ((((a)%(b))+(b))%(b))
unsigned mod(int a, unsigned b) {
return (a >= 0 ? a % b : (b - (-a) % b) % b);
}
This solution (for use when mod is positive) avoids division and remainder operations on negative values altogether:
int core_modulus(int val, int mod)
{
if(val>=0)
return val % mod;
else
return val + mod * ((mod - val - 1)/mod);
}
I would do:
((-1)+8) % 8
This adds the latter number to the first before doing the modulo giving 7 as desired. This should work for any number down to -8. For -9 add 2*8.

Can I rely on this to judge a square number in C++?

Can I rely on
sqrt((float)a)*sqrt((float)a)==a
or
(int)sqrt((float)a)*(int)sqrt((float)a)==a
to check whether a number is a perfect square? Why or why not?
int a is the number to be judged. I'm using Visual Studio 2005.
Edit: Thanks for all these rapid answers. I see that I can't rely on float type comparison. (If I wrote as above, will the last a be cast to float implicitly?) If I do it like
(int)sqrt((float)a)*(int)sqrt((float)a) - a < e
How small should I take that e value?
Edit2: Hey, why don't we leave the comparison part aside, and decide whether the (int) is necessary? As I see, with it, the difference might be great for squares; but without it, the difference might be small for non-squares. Perhaps neither will do. :-(
Actually, this is not a C++, but a math question.
With floating point numbers, you should never rely on equality. Where you would test a == b, just test against abs(a - b) < eps, where eps is a small number (e.g. 1E-6) that you would treat as a good enough approximation.
If the number you are testing is an integer, you might be interested in the Wikipedia article about Integer square root
EDIT:
As Krugar said, the article I linked does not answer anything. Sure, there is no direct answer to your question there, phoenie. I just thought that the underlying problem you have is floating point precision and maybe you wanted some math background to your problem.
For the impatient, there is a link in the article to a lengthy discussion about implementing isqrt. It boils down to the code karx11erx posted in his answer.
If you have integers which do not fit into an unsigned long, you can modify the algorithm yourself.
If you don't want to rely on float precision then you can use the following code that uses integer math.
The Isqrt is taken from here and is O(log n)
// Finds the integer square root of a positive number
static int Isqrt(int num)
{
if (0 == num) { return 0; } // Avoid zero divide
int n = (num / 2) + 1; // Initial estimate, never low
int n1 = (n + (num / n)) / 2;
while (n1 < n)
{
n = n1;
n1 = (n + (num / n)) / 2;
} // end while
return n;
} // end Isqrt()
static bool IsPerfectSquare(int num)
{
return Isqrt(num) * Isqrt(num) == num;
}
To avoid doing the same calculation twice, I would use a temporary variable:
int b = (int)sqrt((float)a);
if((b*b) == a)
{
//perfect square
}
Edit:
dav made a good point. Instead of relying on the cast, you need to round off the float first,
so it should be:
int b = (int) (sqrt((float)a) + 0.5f);
if((b*b) == a)
{
//perfect square
}
Your question has already been answered, but here is a working solution.
Your 'perfect squares' are implicitly integer values, so you could easily solve floating point format related accuracy problems by using some integer square root function to determine the integer square root of the value you want to test. That function will return the biggest number r for a value v where r * r <= v. Once you have r, you simply need to test whether r * r == v.
unsigned short isqrt (unsigned long a)
{
unsigned long rem = 0;
unsigned long root = 0;
for (int i = 16; i; i--) {
root <<= 1;
rem = ((rem << 2) + (a >> 30));
a <<= 2;
if (root < rem)
rem -= ++root;
}
return (unsigned short) (root >> 1);
}
bool PerfectSquare (unsigned long a)
{
unsigned long r = isqrt (a); // widen first so that r * r cannot overflow int
return r * r == a;
}
I didn't follow the formula, I apologize.
But you can easily check if a floating point number is an integer by casting it to an integer type and compare the result against the floating point number. So,
bool isSquare(long val) {
double root = sqrt(val);
if (root == (long) root)
return true;
else return false;
}
Naturally this is only doable if you are working with values that you know will fit within the integer type's range. If that is the case, you can solve the problem this way, saving you the inherent complexity of a mathematical formula.
As reinier says, you need to add 0.5 to make sure it rounds to the nearest integer, so you get
int b = (int) (sqrt((float)a) + 0.5f);
if((b*b) == a) /* perfect square */
For this to work, b has to be (exactly) equal to the square root of a if a is a perfect square. However, I don't think you can guarantee this. Suppose that int is 64 bits and float is 32 bits (I think that's allowed). Then a can be of the order 2^60, so its square root is of order 2^30. However, a float only stores 24 bits in the significand, so the rounding error is of order 2^(30-24) = 2^6. This is larger than 1, so b may contain the wrong integer. For instance, I think that the above code does not identify a = (2^30+1)^2 as a perfect square.
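The suspected failure can be checked directly. A minimal sketch, assuming std::int64_t stands in for the hypothetical 64-bit int and float is an IEEE 754 single with a 24-bit significand; float_square_test is an illustrative name that mirrors the rounded float test above.

```cpp
#include <cmath>
#include <cstdint>

// Same test as above, but with a 64-bit integer as the input type.
bool float_square_test(std::int64_t a) {
    // The conversion to float keeps only 24 significant bits of a.
    float root = std::sqrt(static_cast<float>(a));
    std::int64_t b = static_cast<std::int64_t>(root + 0.5f);
    return b * b == a;
}

// (2^30 + 1)^2 = 2^60 + 2^31 + 1, a genuine perfect square.
std::int64_t example_square() {
    std::int64_t const r = (std::int64_t{1} << 30) + 1;
    return r * r;
}
```

Here the conversion to float rounds the input down to 2^60, the computed root is exactly 2^30, and b * b no longer equals a, so float_square_test(example_square()) reports false even though the value is a perfect square.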
I would do.
// sqrt always returns positive value. So casting to int is equivalent to floor()
int down = static_cast<int>(sqrt(value));
int up = down+1; // This is the ceil(sqrt(value))
// Because of rounding problems I would test the floor() and ceil()
// of the value returned from sqrt().
if (((down*down) == value) || ((up*up) == value))
{
// We have a winner.
}
The more obvious, if slower -- O(sqrt(n)) -- way:
bool is_perfect_square(int i) {
int d = 1;
for (int x = 0; x <= i; x += d, d += 2) {
if (x == i) return true;
}
return false;
}
While others have noted that you should not test for equality with floats, I think you are missing out on chances to take advantage of the properties of perfect squares. First there is no point in re-squaring the calculated root. If a is a perfect square then sqrt(a) is an integer and you should check:
b = sqrt((float)a)
b - floor(b) < e
where e is set sufficiently small. There are also a number of integers that you can cross off as non-square before taking the square root. Checking Wikipedia, you can see some necessary conditions for a to be square:
A square number can only end with digits 00, 1, 4, 6, 9, or 25 in base 10.
Another simple check would be to see that a % 4 is 0 or 1 before taking the root, since:
Squares of even numbers are even, since (2n)^2 = 4n^2.
Squares of odd numbers are odd, since (2n + 1)^2 = 4(n^2 + n) + 1.
These would essentially eliminate half of the integers before taking any roots.
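As a sketch, the mod-4 condition can be used as a cheap pre-filter before any square root is taken. The name may_be_square is hypothetical; a true result only means the number has not been ruled out, not that it is a square.

```cpp
// Necessary (not sufficient) condition: squares are congruent to 0 or 1 mod 4.
bool may_be_square(unsigned a) {
    unsigned const r = a % 4u;
    return r == 0u || r == 1u;
}
```

For instance 18 and 7 are rejected immediately, 16 and 25 pass, and 5 also passes even though it is not a square, so a full test is still needed afterwards.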
The cleanest solution is to use an integer sqrt routine, then do:
bool isSquare( unsigned int a ) {
unsigned int s = isqrt( a );
return s * s == a;
}
This will work in the full int range and with perfect precision. A few cases:
a = 0, s = 0, s * s = 0 (add an exception if you don't want to treat 0 as square)
a = 1, s = 1, s * s = 1
a = 2, s = 1, s * s = 1
a = 3, s = 1, s * s = 1
a = 4, s = 2, s * s = 4
a = 5, s = 2, s * s = 4
Won't fail either as you approach the maximum value for your int size. E.g. for 32-bit ints:
a = 0x40000000, s = 0x00008000, s * s = 0x40000000
a = 0xFFFFFFFF, s = 0x0000FFFF, s * s = 0xFFFE0001
Using floats you run into a number of issues. You may find that sqrt( 4 ) = 1.999999..., and similar problems, although you can round-to-nearest instead of using floor().
Worse though, a float has only 24 significant bits, which means that not every int larger than 2^24 can be cast to a float without losing precision, which introduces false positives/negatives. Using doubles for testing 32-bit ints, you should be fine, though.
But remember to cast the result of the floating-point sqrt back to an int and compare the result to the original int. Comparisons between floats are never a good idea; even for square values of x in a limited range, there is no guarantee that sqrt( x ) * sqrt( x ) == x, or that sqrt( x * x ) == x.
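A sketch of the double-based variant for 32-bit values, under the assumption that double has a 53-bit significand so every 32-bit integer converts exactly. The name is_square_u32 is hypothetical, and the two correction loops are a defensive addition of mine in case the library sqrt is off by one; the final comparison is done on integers only.

```cpp
#include <cmath>
#include <cstdint>

bool is_square_u32(std::uint32_t a) {
    // Truncate the floating-point root to get a candidate integer root.
    std::uint64_t r = static_cast<std::uint64_t>(std::sqrt(static_cast<double>(a)));
    // Nudge the candidate until r is the largest integer with r * r <= a.
    while (r * r > a) --r;
    while ((r + 1) * (r + 1) <= a) ++r;
    // Compare integers, never floats.
    return r * r == a;
}
```

This agrees with the table above: 0x40000000 is reported as a square and 0xFFFFFFFF is not.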
Basics first:
if you cast a number to (int) in a calculation, it will remove ALL data after the decimal point. Once the root has been truncated with (int), its fractional part is gone for the rest of the calculation.
So in your case you want float on every number involved, otherwise you will lose data:
sqrt((float)a)*sqrt((float)a)==(float)a
is the way you want to go.
Floating point math is inaccurate by nature.
So consider this code:
int a=35;
float conv = (float)a;
float sqrt_a = sqrt(conv);
if( sqrt_a*sqrt_a == conv )
printf("perfect square");
this is what will happen:
a = 35
conv = 35.000000
sqrt_a = 5.916079
sqrt_a*sqrt_a = 34.999990734
this is amply clear that sqrt_a^2 is not equal to a.

c++ rounding of numbers away from zero

Hi, I want to round double numbers like this (away from zero) in C++:
4.2 ----> 5
5.7 ----> 6
-7.8 ----> -8
-34.2 ----> -35
What is the efficient way to do this?
inline double myround(double x)
{
return x < 0 ? floor(x) : ceil(x);
}
As mentioned in the article Huppie cites, this is best expressed as a template that works across all float types
See http://en.cppreference.com/w/cpp/numeric/math/floor and http://en.cppreference.com/w/cpp/numeric/math/ceil
or, thanks to Pax, a non-function version:
x = (x < 0) ? floor(x) : ceil(x);
There is a nice article about a similar problem on CPlusPlus.com. The easy solution to your problem should be something like this:
double customRound( double value ) {
return value < 0 ? floor( value ) : ceil( value );
}
A better solution is the one mentioned in the article, which uses a template:
//--------------------------------------------------------------------------
// symmetric round up
// Bias: away from zero
template <typename FloatType>
FloatType ceil0( const FloatType& value )
{
FloatType result = std::ceil( std::fabs( value ) );
return (value < 0.0) ? -result : result;
}
The x < 0 ? floor(x) : ceil(x); approach of Ruben Bartelink is good. Yet consider what happens with special cases of x = -0.0, x = NaN.
Rather than have myround(-0.0) potentially return +0.0¹ and have myround(NaN) return with a changed payload of the NaN, consider the below.
myround_alt(-0.0) returns -0.0.
myround_alt(NaN) more likely returns an unchanged payload NaN. Not-a-number stuff is tricky and not well defined. In any case, it is the myround_alt(-0.0) --> -0.0 that I am seeking.
inline double myround_alt(double x) {
if (x > 0) return ceil(x);
if (x < 0) return floor(x);
return x;
}
¹ IEC 60559 floating-point arithmetic specifies that ceil(±0) returns ±0, so this approach is not needed with implementations that strictly follow that spec. Yet many C floating-point implementations do not follow it (C does not require it) or fail in corner cases like this.
Try
double rounded = std::copysign(std::ceil(std::fabs(x)), x); // <cmath>, C++11

How can I write a C++ function returning true if a real number is exactly representable with a double?

How can I write a C++ function returning true if a real number is exactly representable with a double?
bool isRepresentable( const char* realNumber )
{
bool answer = false;
// what goes here?
return answer;
}
Simple tests:
assert( true==isRepresentable( "0.5" ) );
assert( false==isRepresentable( "0.1" ) );
Parse the number into the form a + N / (10^k), where a and N are integers, and k is the number of decimal places you have.
Example: 12.0345 -> 12 + 345 / 10^4, a = 12, N = 345, k = 4
Now, 10^k = (2 * 5) ^ k = 2^k * 5^k
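You can represent your number as an exact binary fraction if and only if you can get rid of the 5^k term in the denominator, i.e. if 5^k divides N.
So the check is (N mod 5^k) == 0 (provided the resulting significant bits also fit into the 53-bit significand of a double).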
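The check above can be sketched as follows. This is an illustration under stated assumptions, not a complete parser: is_representable is a hypothetical name, the input must be a plain decimal (optional sign, digits, optional point, no exponent) with at most 18 digits so that everything fits in 64 bits, and all digits are read as one integer m so that the value is m / 10^k. The 53-bit significand test at the end is an addition of mine to the divisibility check.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

bool is_representable(const std::string& s) {
    std::uint64_t m = 0;      // all digits read as one integer: value = m / 10^k
    int k = 0;                // number of digits after the decimal point
    int digits = 0;
    bool seen_point = false;
    std::size_t i = 0;
    if (i < s.size() && (s[i] == '+' || s[i] == '-')) ++i;
    for (; i < s.size(); ++i) {
        if (s[i] == '.' && !seen_point) { seen_point = true; continue; }
        if (s[i] < '0' || s[i] > '9') return false;   // syntax not handled here
        if (++digits > 18) return false;              // outside this sketch's range
        m = m * 10 + static_cast<std::uint64_t>(s[i] - '0');
        if (seen_point) ++k;
    }
    if (digits == 0) return false;
    // value = m / (2^k * 5^k): the factor 5^k must divide m exactly.
    for (int j = 0; j < k; ++j) {
        if (m % 5 != 0) return false;
        m /= 5;
    }
    // Now value = m / 2^k. Drop trailing zero bits, then require at most 53 bits.
    while (m != 0 && (m & 1) == 0) m >>= 1;
    return m < (std::uint64_t{1} << 53);
}
```

With this sketch "0.5" and "0.75" are accepted, while "0.1" and "12.0345" are rejected because a factor of 5 remains in the denominator.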
Holy homework, batman! :)
What makes this interesting is that you can't simply do an (atof|strtod|sscanf) -> sprintf loop and check whether you got the original string back. sprintf on many platforms detects the "as close as you can get to 0.1" double and prints it as 0.1, for example, even though 0.1 isn't precisely representable.
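#include <stdio.h>
#include <string.h>
int main() {
double d = 0.1;
unsigned long long bits;
memcpy(&bits, &d, sizeof bits); /* copy the raw bytes; passing a double to %llx is undefined */
printf("%llx = %f\n", bits, d);
}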
prints:
3fb999999999999a = 0.100000
on my system.
The real answer probably would require parsing out the double to convert it to an exact fractional representation (0.1 = 1/10) and then making sure that the atof conversion times the denominator equals the numerator.
I think.
Here is my version. sprintf converts 0.5 to 0.500000, so the zeros at the end have to be removed.
EDIT: Has to be rewritten to handle numbers without decimal point that end with 0 correctly (like 12300).
bool isRepresentable( const char* realNumber )
{
bool answer = false;
double dVar = atof(realNumber);
char check[20];
sprintf(check, "%f", dVar);
// Remove zeros at end - TODO: Only do if decimal point in string
for (int i = strlen(check) - 1; i >= 0; i--) {
if (check[i] != '0') break;
check[i] = 0;
}
answer = (strcmp(realNumber, check) == 0);
return answer;
}
This should do the trick:
bool isRepresentable(const char *realNumber)
{
double value = strtod(realNumber, NULL);
char test[20];
sprintf(test, "%f", value);
return strcmp(realNumber, test) == 0;
}
Probably best to use the 'safe' version of sprintf to prevent a potential buffer overrun (is it even possible in this case?)
I'd convert the string to its numeric bit representation, (a bit array or a long), then convert the string to a double and see if they match.
Convert the string into a float with a larger scope than a double. Cast that to a double and see if they match.