Fixed Length Float in C/C++?


I was wondering whether it is possible to limit the number of characters we enter in a float.
I couldn't seem to find any method. I have to read in data from an external interface which sends float data of the form xx.xx. As of now I am using conversion to char and vice-versa, which is a messy work-around. Can someone suggest inputs to improve the solution?

If you always have/want only 2 decimal places for your numbers, and absolute size is not such a big issue, why not work internally with integers instead, with their meaning being "100th of the target unit"? At the end you just need to convert them back to a float and divide by 100.0, and you're back to what you want.
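A minimal sketch of the scaled-integer idea, assuming the interface delivers text of the form xx.xx. The helper names parseHundredths and formatHundredths are made up for illustration and are not part of any library.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <string>

// Parse text such as "12.34" into an integer count of hundredths (1234).
// llround absorbs the tiny binary error of the intermediate double.
long long parseHundredths(const std::string& text) {
    assert(!text.empty());
    double value = std::strtod(text.c_str(), nullptr);
    return std::llround(value * 100.0);
}

// Format a count of hundredths back into "xx.xx" text using integer math only.
std::string formatHundredths(long long hundredths) {
    long long magnitude = std::llabs(hundredths);
    char buffer[32];
    std::snprintf(buffer, sizeof buffer, "%s%lld.%02lld",
                  hundredths < 0 ? "-" : "", magnitude / 100, magnitude % 100);
    return buffer;
}
```

All arithmetic between parsing and formatting is then exact integer arithmetic, e.g. formatHundredths(parseHundredths("12.34") + parseHundredths("0.66")) yields "13.00".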

This is a slight misunderstanding. You cannot think of a float or double as being a decimal number.
Almost any attempt to use it as a fixed decimal number of precision, say, 2, will incur problems, as some values will not be precisely equal to xxx.xx but only approximately so.
One solution that many apps use is to ensure that:
1) display of floating point numbers is well controlled, using printf/sprintf to a certain number of significant digits,
2) one does not do exact comparison between floating point numbers. To compare two numbers a, b to the 2nd decimal place, abs(a-b) <= epsilon should generally be used. Outright equality is dangerous because arithmetic may yield several slightly different floating point values for what is logically 0.01, e.g. 0.0101 and 0.0103; these are indistinguishable to the user once truncated to 2 dp, and logically equivalent for an application that assumes 2 dp precision.
Lastly, I would suggest you use double instead of float. These days there is no real overhead, as we no longer do floating point without a maths coprocessor. A 32-bit float has about 7 significant decimal digits of precision and a double about 15, which is enough to matter in many cases.
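Point 2 can be sketched as follows. The function name nearlyEqual and the choice of half a unit in the last kept decimal place as epsilon are assumptions for illustration; pick the tolerance that suits your application.

```cpp
#include <cassert>
#include <cmath>

// True when a and b differ by less than half a unit in the last kept
// decimal place, i.e. they count as equal at that working precision.
bool nearlyEqual(double a, double b, int decimalPlaces = 2) {
    assert(decimalPlaces >= 0);
    double epsilon = 0.5 * std::pow(10.0, -decimalPlaces);
    return std::fabs(a - b) < epsilon;
}
```

With IEEE 754 doubles, 0.1 + 0.2 == 0.3 is false, while nearlyEqual(0.1 + 0.2, 0.3) is true.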

Rounding a float (that is, a binary floating-point number) to 2 decimal digits doesn't make much sense, because you won't be able to round it exactly in some cases anyway, so you'll still get a small delta which will affect subsequent calculations. If you really need it to be precisely 2 places, then you need to use decimal arithmetic, for example IBM's decNumber++ library, which implements the ISO/IEC TR 24733 draft (decimal floating point for C++).

You can limit the number of significant numbers to output:
http://www.cplusplus.com/reference/iostream/manipulators/setprecision/
but I don't think there is a function to actually lop off a certain number of digits. You could write a function that converts the value to text with sprintf() (or a stringstream; ftoa() is not a standard function), lops off a certain number of digits, and converts the result back with atof() (or a stringstream).

You should check the string rather than the converted float. It will be easier to check the number of digits.

Why don't you just round the floats to the desired precision?
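#include <cmath>
#include <iomanip>
#include <iostream>
// Named roundTo rather than round to avoid clashing with std::round from <cmath>.
double roundTo(double val, int decimalPlaces)
{
    double power_of_10 = std::pow(10.0, static_cast<double>(decimalPlaces));
    return std::floor(val * power_of_10 + 0.5) / power_of_10;
}
int main()
{
    double d;
    std::cin >> d;
    // round d to 3 decimal places...
    d = roundTo(d, 3);
    // do something with d
    d *= 1.75;
    // std::fixed makes setprecision count digits after the decimal point
    std::cout << std::fixed << std::setprecision(3) << d; // now output to 3 decimal places
}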

There exists no fixed-point decimal datatype in C, but you can mimic Pascal's decimal with a struct of two ints.

If the need is to take 5 digits [ including or excluding the decimal point ], you could simply write like below.
scanf( "%5f", &a );
where a is declared as float.
For example:
If you enter 123.45, scanf will consume the first 5 characters, i.e. 4 digits and the decimal point, and will store 123.4.
If you enter 123456, the value of a will be 12345 [ ~ 12345.00 ].
With printf, we can also control how many digits are printed after the decimal point.
printf( "%5.2f \n", a );
The value 123.4 will be printed as 123.40 [ 2 digits after the decimal point; the 5 is only a minimum total width, including the decimal point ].
This has a limitation: because the width is a minimum, a value that needs more than 5 characters is displayed in full.
E.g. the value 123456.7 will be displayed as 123456.70.
The precision part of the format [ the number of digits after the decimal point, as shown for printf ] has no counterpart in scanf: a scanf conversion accepts only a maximum field width.
Now, when it comes to taking data from an external interface, are you talking about serialization, i.e. transmission of data over a network?
If so, to my knowledge your approach is fine.
We generally read such data as char first, to make sure the application works for any format of data.

You can print a float with printf("%.2f", value), or something similar.

Related

How to assign exact content of a string to a double in c++

I don't have much experience with C++, and I have the following problem.
With the following code :
double d = 0.0000000;
stringstream ss;
ss << std::fixed << std::setprecision( 2 ) << d;
ss >> d;
or
std::string content = ss.str();
d = atof( content.c_str() );
Either way, while debugging in MS Visual Studio, I see that the value of d is 0.0000000, not 0.00 as in the string content.
How do I get the exact content of the string assigned to double d?
Maybe I should ask a broader question:
I am writing a method that returns a double with precision as needed. For example, if I have 2.446343434 as the value of d and the precision is 2, how can I get my method to return d as 2.45?
After reading the answers below, I came to know that it is not possible to do such a thing. So the next question is:
Even if my code above tries to put 2.45 into a double, the C++ runtime will append zeros (how many?) to 2.45 and return it, right? Is there a way to control the appending of zeros to the double?

I see the value of d is 0.0000000 not 0.00 as in the string
But both of those numbers have exactly the same value. So, a number can have the value 0.0000000 if and only if it also has the value 0.00.
How do I get exact content of string assigned to double d?
You cannot. double represents a numeric value. It does not represent a character string.
Also, more generally, that is not possible because floating point numbers cannot represent all the numbers that can be represented by a character string. But that is not a problem with 0, which is indeed representable.
I am writing a method that returns a double with precision as needed. For example if I have 2.446343434 as value of d and precision is 2, how can I get my method return d as 2.45
You cannot get your method to return a double with the value 2.45 unless the format of the double can represent 2.45. The binary64 format specified by IEEE 754 can not represent 2.45. In such case, the best that you can do, is to return the representable number closest to the number with 2 significant fractional digits, which in the case of IEEE 754 would be 2.45000000000000017763568394003. The program in your question achieves that.
If that's not what you want, then floating point is not appropriate for your use case.

The human-readable value for d that you're seeing in the debugger, "0.0000000", is just a representation, and a fairly arbitrary one at that. The actual double object does not store this string, nor anything with a fixed number of decimal places.
Its actual identity, at the lowest level, is (a) in binary, (b) encoded according to the floating-point specification, and (c) irrelevant for your purposes. The value is zero; period.
The debugger has simply chosen to use seven decimal places when converting the number into something you can read with your eyes and brain. When using printf or std::cout to similarly output the number for reading, you can pick some other format if you like, including a format with two decimal places to match your original string input. That's just different ways of saying the same thing.
Do not confuse value with representation.
Also, your insistence on specifically two decimal places makes me suspicious: if you're planning on using double to store currency, just don't. Floating-point types are not appropriate for that.
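To make the value-versus-representation point concrete, here is a small sketch. The helper name toFixed is an assumption for illustration; it only wraps snprintf with a run-time precision.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Format a double with a chosen number of digits after the decimal point.
// Only the text changes; the double itself carries no digit count.
// Output longer than the buffer would be truncated by snprintf.
std::string toFixed(double value, int decimalPlaces) {
    assert(decimalPlaces >= 0);
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%.*f", decimalPlaces, value);
    return buffer;
}
```

toFixed(0.0, 2) and toFixed(0.0, 7) produce "0.00" and "0.0000000" from the very same value, and toFixed(2.446343434, 2) produces "2.45" without changing the stored number.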

How to shift a floating-point value to the nearest one that can be represented exactly in a specific number of decimal places?

Is there an algorithm in C++ that, given a floating-point value V of type T (e.g. double or float), returns the closest value to V in a given direction (up or down) that can be represented exactly in no more than a specified number of decimal places D?
For example, given
T = double
V = 670000.08267799998
D = 6
For direction = towards +inf I would like the result to be 670000.082678, and for direction = towards -inf I would like the result to be 670000.082677
This is somewhat similar to std::nexttoward(), but with the restriction that the 'next' value needs to be exactly representable using at most D decimal places.
I've considered a naive solution involving separating out the fractional portion and scaling it by 10^D, truncating it, and scaling it again by 10^-D and tacking it back onto the whole number portion, but I don't believe that guarantees that the resulting value will be exactly representable in the underlying type.
I'm hopeful that there's a way to do this properly, but so far I've been unable to find one.
Edit: I think my original explanation didn't properly convey my requirements. At the suggestion of @patricia-shanahan I'll try to describe my higher-level goal and then reformulate the problem a little differently in that context.
At the highest level, the reason I need this routine is due to some business logic wherein I must take in a double value K and a percentage P, split it into two double components V1 and V2 where V1 ~= P percent of K and V1 + V2 ~= K. The catch is that V1 is used in further calculations before being sent to a 3rd party over a wire protocol that accepts floating-point values in string format with a max of D decimal places. Because the value sent to the 3rd party (in string format) needs to be reconcilable with the results of the calculations made using V1 (in double format) , I need to "adjust" V1 using some function F() so that it is as close as possible to being P percent of K while still being exactly representable in string format using at most D decimal places. V2 has none of the restrictions of V1, and can be calculated as V2 = K - F(V1) (it is understood and acceptable that this may result in V2 such that V1 + V2 is very close to but not exactly equal to K).
At the lower level, I'm looking to write that routine to 'adjust' V1 as something with the following signature:
double F(double V, unsigned int D, bool roundUpIfTrueElseDown);
where the output is computed by taking V and (if necessary, and in the direction specified by the bool param) rounding it to the Dth decimal place.
My expectation would be that when V is serialized out as follows
const auto maxD = std::numeric_limits<double>::digits10;
assert(D <= maxD); // D will be less than maxD... e.g. typically 1-6, definitely <= 13
std::cout << std::fixed
<< std::setprecision(maxD)
<< F(V, D, true);
then the output contains only zeros beyond the Dth decimal place.
It's important to note that, for performance reasons, I am looking for an implementation of F() that does not involve conversion back and forth between double and string format. Though the output may eventually be converted to a string format, in many cases the logic will early-out before this is necessary and I would like to avoid the overhead in that case.

This is a sketch of a program that does what is requested. It is presented mainly to find out whether that is really what is wanted. I wrote it in Java, because that language has some guarantees about floating point arithmetic on which I wanted to depend. I only use BigDecimal to get exact display of doubles, to show that the answers are exactly representable with no more than D digits after the decimal point.
Specifically, I depended on double behaving according to IEEE 754 64-bit binary arithmetic. That is likely, but not guaranteed by the standard, for C++. I also depended on Math.pow being exact for simple exact cases, on exactness of division by a power of two, and on being able to get exact output using BigDecimal.
I have not handled edge cases. The big missing piece is dealing with large magnitude numbers with large D. I am assuming that the bracketing binary fractions are exactly representable as doubles. If they have more than 53 significant bits that will not be the case. It also needs code to deal with infinities and NaNs. The assumption of exactness of division by a power of two is incorrect for subnormal numbers. If you need your code to handle them, you will have to put in corrections.
It is based on the concept that a number that is both exactly representable as a decimal with no more than D digits after the decimal point and is exactly representable as a binary fraction must be representable as a fraction with denominator 2 raised to the D power. If it needs a higher power of 2 in the denominator, it will need more than D digits after the decimal point in its decimal form. If it cannot be represented at all as a fraction with a power-of-two denominator, it cannot be represented exactly as a double.
Although I ran some other cases for illustration, the key output is:
670000.082678 to 6 digits Up: 670000.09375 Down: 670000.078125
Here is the program:
import java.math.BigDecimal;
public class Test {
public static void main(String args[]) {
testIt(2, 0.000001);
testIt(10, 0.000001);
testIt(6, 670000.08267799998);
}
private static void testIt(int d, double in) {
System.out.print(in + " to " + d + " digits");
System.out.print(" Up: " + new BigDecimal(roundUpExact(d, in)).toString());
System.out.println(" Down: "
+ new BigDecimal(roundDownExact(d, in)).toString());
}
public static double roundUpExact(int d, double in) {
double factor = Math.pow(2, d);
double roundee = factor * in;
roundee = Math.ceil(roundee);
return roundee / factor;
}
public static double roundDownExact(int d, double in) {
double factor = Math.pow(2, d);
double roundee = factor * in;
roundee = Math.floor(roundee);
return roundee / factor;
}
}

In general, decimal fractions are not precisely representable as binary fractions. There are some exceptions, like 0.5 (½) and 16.375 (16⅜), because all binary fractions are precisely representable as decimal fractions. (That's because 2 is a factor of 10, but 10 is not a factor of 2, or any power of two.) But if a number is not an integer multiple of some negative power of 2 (that is, of 1/2^k for some k), its binary representation will be an infinitely long repeating sequence, like the representation of ⅓ in decimal (.333....).
The standard C library provides the macro DBL_DIG (normally 15); any decimal number with that many decimal digits of precision can be converted to a double (for example, with scanf) and then converted back to a decimal representation (for example, with printf). To go in the opposite direction without losing information -- start with a double, convert it to decimal and then convert it back -- you need 17 decimal digits (DBL_DECIMAL_DIG). (The values I quote are based on IEEE-754 64-bit doubles).
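The two digit counts can be checked with a short sketch, assuming IEEE 754 64-bit doubles. The helper name roundTrips is made up for illustration; std::numeric_limits<double>::max_digits10 is the C++ counterpart of DBL_DECIMAL_DIG.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <limits>

// Print value with the given number of significant digits, parse the text
// back, and report whether exactly the same double was recovered.
bool roundTrips(double value, int significantDigits) {
    assert(significantDigits > 0);
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%.*g", significantDigits, value);
    return std::strtod(buffer, nullptr) == value;
}
```

For the double produced by 0.1 + 0.2, 15 digits print as 0.3, which parses to a different double, while 17 digits recover the original. A decimal that starts with at most 15 digits, such as 0.3, survives the trip through double and back.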
One way to provide something close to the question would be to consider a decimal number with no more than DBL_DIG digits of precision to be an "exact-but-not-really-exact" representation of a floating point number if that floating point number is the floating point number which comes closest to the value of the decimal number. One way to find that floating point number would be to use scanf or strtod to convert the decimal number to a floating point number, and then try the floating point numbers in the vicinity (using nextafter to explore) to find which ones convert to the same representation with DBL_DIG digits of precision.
If you trust the standard library implementation to not be too far off, you could convert your double to a decimal number using sprintf, increment the decimal string at the desired digit position (which is just a string operation), and then convert it back to a double with strtod.

Total re-write.
Based on OP's new requirement and using power-of-2 as suggested by @Patricia Shanahan, simple C solution:
double roundedV = ldexp(round(ldexp(V, D)),-D); // for nearest
double roundedV = ldexp(ceil (ldexp(V, D)),-D); // at or just greater
double roundedV = ldexp(floor(ldexp(V, D)),-D); // at or just less
The only thing added here beyond @Patricia Shanahan's fine solution is C code to match OP's tag.
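For reference, the same power-of-two rounding wrapped into C++ functions. The names roundUpBinary and roundDownBinary are assumptions for illustration. Note that the parameter counts binary places: the result needs at most that many decimal places, but it is not the closest D-decimal value, and edge cases (huge magnitudes, subnormals, NaN, infinities) are not handled.

```cpp
#include <cassert>
#include <cmath>

// Round v up to the nearest multiple of 2^-bits.
double roundUpBinary(double v, int bits) {
    assert(bits >= 0);
    return std::ldexp(std::ceil(std::ldexp(v, bits)), -bits);
}

// Round v down to the nearest multiple of 2^-bits.
double roundDownBinary(double v, int bits) {
    assert(bits >= 0);
    return std::ldexp(std::floor(std::ldexp(v, bits)), -bits);
}
```

With the value from the question and 6 places, these give 670000.09375 and 670000.078125, matching the output of the Java sketch.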

In C++ integers must be represented in binary, but floating point types can have a decimal representation.
If FLT_RADIX from <float.h> (<cfloat> in C++) is 10, or some multiple of 10, then your goal of exact representation of decimal values is attainable.
Otherwise, in general, it's not attainable.
So, as a first step, try to find a C++ implementation where FLT_RADIX is 10.
I wouldn't worry about algorithm or efficiency thereof until the C++ implementation is installed and proved to be working on your system. But as a hint, your goal seems to be suspiciously similar to the operation known as “rounding”. I think, after obtaining my decimal floating point C++ implementation, I’d start by investigating techniques for rounding, e.g., googling that, maybe Wikipedia, …
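Checking the radix takes only a few lines. This is a sketch; the function name radixIsMultipleOfTen is hypothetical, and on most current implementations FLT_RADIX is 2, so the check returns false.

```cpp
#include <cassert>
#include <cfloat>
#include <limits>

// True when the implementation's floating point radix is 10 or a multiple
// of 10, which is the condition named above for exact decimal values.
constexpr bool radixIsMultipleOfTen() {
    return FLT_RADIX % 10 == 0;
}
```

std::numeric_limits<double>::radix reports the same information per type.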

C++ determining if a number is an integer

I have a program in C++ where I divide two numbers, and I need to know if the answer is an integer or not. What I am using is:
if(fmod(answer,1) == 0)
I also tried this:
if(floor(answer)==answer)
The problem is that answer usually is a 5 digit number, but with many decimals. For example, answer can be: 58696.000000000000000025658 and the program considers that an integer.
Is there any way I can make this work?
I am dividing double a/double b= double answer
(sometimes there are more than 30 decimals)
Thanks!
EDIT:
a and b are numbers in the thousands (about 100,000) which are then raised to powers of 2 and 3, added together and divided (according to a complicated formula). So I am plugging in various a and b values and looking at the answer. I will only keep the a and b values that make the answer an integer. An example of what I got for one of the answers was: 218624 which my program above considered to be an integer, but it really was: 218624.00000000000000000056982 So I need a code that can distinguish integers with more than 20-30 decimals.

You can use std::modf from <cmath>:
double integral;
if(std::modf(answer, &integral) == 0.0)
The integral part of answer is stored in integral, and the return value of std::modf is the fractional part of answer with the same sign as answer.

The usual solution is to check if the number is within a very short distance of an integer, like this:
bool isInteger(double a){
double b=round(a),epsilon=1e-9; //some small range of error
return (a<=b+epsilon && a>=b-epsilon);
}
This is needed because floating point numbers have limited precision, and numbers that indeed are integers may not be represented perfectly. For example, the following would fail if we do a direct comparison:
double d=sqrt(2); //square root of 2
double answer=2.0/(d*d); //2 divided by 2
Here, answer actually holds the value 0.99999..., so we cannot compare that to an integer, and we cannot check if the fractional part is close to 0.
In general, since the floating point representation of a number can be either a bit smaller or a bit bigger than the actual number, it is not good to check if the fractional part is close to 0. It may be a number like 0.99999999 or 0.000001 (or even their negatives), these are all possible results of a precision loss. That's also why I'm checking both sides (+epsilon and -epsilon). You should adjust that epsilon variable to fit your needs.
Also, keep in mind that the precision of a double is close to 15 digits. You may also use a long double, which may give you some extra digits of precision (or not, it is up to the compiler), but even that only gets you around 18 digits. If you need more precision than that, you will need to use an external library, like GMP.
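A short sketch of both points, assuming IEEE 754 doubles. The name isNearInteger is made up here; it is the same epsilon test written with std::fabs.

```cpp
#include <cassert>
#include <cmath>

// True when a lies within epsilon of the nearest integer.
bool isNearInteger(double a, double epsilon = 1e-9) {
    assert(epsilon >= 0.0);
    return std::fabs(a - std::round(a)) <= epsilon;
}
```

isNearInteger accepts 2.0 / (sqrt(2.0) * sqrt(2.0)), which is stored as slightly less than 1. It cannot help with the value from the question, though: the literal 58696.000000000000000025658 compares equal to 58696.0, because the extra digits are lost as soon as the number is stored in a double.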

Floating point numbers are stored in memory using a very different bit format than integers. Because of this, comparing them for equality is not likely to work effectively. Instead, you need to test if the difference is smaller than some epsilon:
const double EPSILON = 0.00000000000000000001; // adjust for whatever precision is useful for you; a value this tiny is below the spacing of most doubles, so it behaves like an exact test
double remainder = std::fmod(numer, denom);
if(std::fabs(0.0 - remainder) < EPSILON)
{
//...
}
Alternatively, if you want to include values that are close to integers (based on your desired precision), you can modify the if condition slightly (the remainder returned by std::fmod(d, 1.0) for a positive d lies in the range [0, 1), so a value just below an integer gives a remainder near 1 rather than near 0):
if (std::fabs(std::round(d) - d) < EPSILON)
{
// ...
}
Floating point numbers are precise to about 15-17 significant decimal digits (as a double), but because they are stored as a binary mantissa (fraction) and an exponent, many decimal fractions cannot be stored exactly, and a result that is mathematically an integer may come out slightly off. (Small integers such as 2.0 are themselves stored exactly.) For example,
double d = std::sqrt(2.0) * std::sqrt(2.0); // mathematically 2, but d is typically 2.0000000000000004
Because of this, you need to compare the difference of what you expect to some very small number that encompasses the precision you desire (we will call this value epsilon):
double d = std::sqrt(2.0) * std::sqrt(2.0);
bool test = std::fabs(2 - d) < epsilon; // true for any epsilon larger than the rounding error
So when you are trying to compare the remainder from std::fmod, you need to check it against the difference from 0.0 (not for actual equality to 0.0), which is what is done above.
Also, the std::fabs call prevents you from having to do 2 checks by asserting that the value will always be positive.
If you desire a precision that is greater than 15-18 decimal places, you cannot use double or long double; you will need to use a high precision floating point library.

How to convert the double to become integer

It is hard to explain the question: I would like to convert a double to an integer without rounding away the digits after the decimal point.
For example
double a = 123.456
I want to convert it to
int b = 123456
I want to know how many digits there are, so that after the calculation I can move the decimal point back and get 123.456 again.
PS: I just want a purely mathematical method to solve this, without processing the characters of the number.

Sorry, there's no solution to your problem because the number 123.456 does not exist as a double. It's rounded to 123.4560000000000030695446184836328029632568359375, and this number obviously does not fit into any integer type after you remove the decimal point.
If you want 123.456 to be treated as the exact number 123.456, then the only remotely simple way to do this is to convert it to a string and remove the decimal point from the string. This can be achieved with something like
snprintf(buf, sizeof buf, "%.13f", 123.456);
Actually figuring out the number of places you want to print it to, however, is rather difficult. If you use too many, you'll end up picking up part of the exact value I showed above. If you use too few, then obviously you'll drop places you wanted to keep.

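try this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
    double a = 123.456;
    size_t i, j = 0;
    char str[32];
    char str2[32];
    snprintf(str, sizeof str, "%.3f", a); /* %f, not %d, for a double; 3 decimals assumed */
    for (i = 0; i < strlen(str); i++)
    {
        if (str[i] != '.')
        {
            str2[j++] = str[i]; /* copy every character except the decimal point */
        }
    }
    str2[j] = '\0';
    int b = atoi(str2); /* b is now 123456 */
    printf("%d\n", b);
    return 0;
}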

I believe the canonical way to do this would be
#include <math.h>
#include <stdio.h>
int main()
{
double d = 123.456;
double int_part;
double fract_part;
fract_part = modf(d, &int_part);
int i = (int)int_part*1000 + (int)lround(fract_part*1000); /* lround avoids truncating a product such as 455.999... down to 455 */
printf("%d", i);
}
where the literal 1000 is a constant determining the number of desired decimals.

If you have the text "123.456" you can simply remove the decimal point and convert the resulting text representation to an integer value. If you have already converted the text to a floating-point value (double a = 123.456;) then all bets are off: the floating-point value does not have a pre-set number of decimal digits, because it is represented as a binary fraction. It's sort of like 1/3 versus .3333 in ordinary usage: they do not have the same value, even though we usually pretend that .3333 means 1/3.

Multiply the original value by 10^i, increasing i each time, until abs(value' - round(value')) < epsilon for a very small epsilon. value' should be computed from the original each time, e.g.
value' = value * pow(10, i)
if ( abs(value' - round(value')) < epsilon ) then stop
Originally I suggested that you should simply multiply by ten repeatedly, but as R.. pointed out, the numerical error then accumulates with each step. As a result you might get e.g. 123.456999 instead of 123.456000 for an epsilon = .0000001 due to floating point math.
Please note that you might exceed int type boundaries this way and might want to handle infinity values as well.
As Ignacio Vazquez-Abrams noted this might lead to problems with scenarios where you want to convert 123.500 to 123500. You might solve it by adding a very small value first (and it should be smaller than epsilon). Adding such a value could lead to a numeric error though.
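A minimal sketch of this search, reading the stopping test as "the scaled value is within epsilon of an integer". The names decimalPlaces and toScaledInteger, the digit limit of 9 and the default epsilon are all assumptions for illustration.

```cpp
#include <cassert>
#include <cmath>

// Smallest i in [0, maxDigits] for which value * 10^i is within epsilon of
// an integer, or -1 if there is none. The product is always computed from
// the original value so that rounding errors do not accumulate.
int decimalPlaces(double value, int maxDigits = 9, double epsilon = 1e-7) {
    assert(maxDigits >= 0);
    for (int i = 0; i <= maxDigits; ++i) {
        double scaled = value * std::pow(10.0, i);
        if (std::fabs(scaled - std::round(scaled)) < epsilon) {
            return i;
        }
    }
    return -1;
}

// Scale by 10^places and round to the nearest integer.
long long toScaledInteger(double value, int places) {
    return std::llround(value * std::pow(10.0, places));
}
```

For 123.456 this finds 3 places, and toScaledInteger(123.456, 3) gives 123456; dividing by 10^3 afterwards moves the decimal point back. The result must still fit the integer type.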

What is the workaround for floating point inaccuracy?

Here's the code snippet:
float pNum = 9.2;
char* l_tmpCh = new char[255];
sprintf_s(l_tmpCh, 255, "%.9f", pNum);
cout << l_tmpCh << endl;
delete[] l_tmpCh; // memory from new[] must be released with delete[]
the output is: 9.199999809
What to do in order for the result to be 9.200000000
Note: I need every float number printed with 9 decimals precision, so I don't want to have 9.2

The workaround is to not use floating point numbers.
Not every number can be represented accurately in the floating point format, such as, for example, 9.2. Or 0.1.
If you want all the decimals shown, then you get 9.199999809, because that's the float value closest to 9.2.
If you use floating point numbers you have to accept this inaccuracy. Otherwise, your only option is to store the number in another format.
Required reading

There is no way a 32-bit binary float can have 9 digits of precision (it has only about 7). You could fake it by appending 3 zeroes.
sprintf_s(l_tmpCh, 255, "%.6f000", pNum);
This won't work if the integer part exhausted a lot of precision already, e.g. 9222.2f will give 9222.200195000.

What you're asking for is not possible in the general case since floating point numbers by definition are approximations, which might or might not have an exact representation in decimal. Read the famous Goldberg paper: http://docs.sun.com/source/806-3568/ncg_goldberg.html

Use a double literal rather than a float literal.
double pNum = 9.2;
char* l_tmpCh = new char[255];
sprintf_s(l_tmpCh, 255, "%.9f", pNum);
cout << l_tmpCh << endl;
delete[] l_tmpCh; // memory from new[] must be released with delete[]
That f was making the literal a float; without it, the value is a double literal (more precise at about 15 decimal digits).
Of course, if 15 digits isn't enough, you're welcome to create your own class to represent values.
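The difference between the two types can be reproduced with a small sketch, assuming IEEE 754 float and double. The helper name nineDecimals is made up for illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Format a value with 9 digits after the decimal point, as in the question.
// A float argument is converted to double without changing its value.
std::string nineDecimals(double value) {
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%.9f", value);
    return buffer;
}
```

nineDecimals(9.2f) gives "9.199999809", the output from the question, while nineDecimals(9.2) gives "9.200000000", because the double nearest to 9.2 is accurate well beyond the ninth decimal place.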

It's important to understand that native floating point numbers are seldom "accurate" because of the way they are represented in the computer. Thus most of the time you only get an approximation. And with printf, you also specify the precision with which to round that approximation to an output. E.g. "%.20f" will give you a representation that is rounded to 20 digits after the "."

This should do it:
double pNum = 9.2;
The f suffix makes it a float literal, which has only about 7 decimal digits of precision and of course suffers from representation errors. Assigning it to a double variable does not fix this. Of course, this assumes that float and double correspond to IEEE 754 single and double precision types.
EDIT: If you want to use float, then this problem cannot be solved at all. Read The Floating-Point Guide to understand why.
