C++: How to truncate a double efficiently? - c++

I would like to truncate a floating-point value to 4 decimal digits.
Is there an efficient way to do that?
My current solution is:
double roundDBL(double d,unsigned int p=4)
{
unsigned int fac=pow(10,p);
double facinv=1.0/static_cast<double>(fac);
double x=static_cast<unsigned int>(d*fac)*facinv;
return x;
}
but using pow and a division does not seem very efficient to me.
kind regards
Arman.

round(d*10000.0)/10000.0;
or, if p must be variable:
double f = pow(10,p);
round(d*f)/f;
round will usually be compiled as a single instruction that is faster than converting to an integer and back. Profile to verify.
Note that a double may not have an accurate representation to 4 decimal places. You will not truly be able to truncate an arbitrary double, just find the nearest approximation.
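Since the question asks for truncation rather than rounding, here is a minimal sketch of the same idea with std::trunc in place of round. The name truncate_decimals and the limit of 15 places are my own assumptions, not part of the answer above.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: drop everything past `places` decimal digits,
// rounding toward zero. The result is the double nearest to the truncated
// decimal value, not an exact decimal.
double truncate_decimals(double d, int places = 4) {
    assert(places >= 0 && places <= 15);  // assumed limit; a double holds about 15-16 digits
    const double factor = std::pow(10.0, places);
    return std::trunc(d * factor) / factor;
}
```

For example, truncate_decimals(3.14159265) gives a double very close to 3.1415. Because d * factor is itself rounded, an input sitting right on a 4-digit boundary can still come out one step low, so profile and test on your own data.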

Efficiency depends on your platform.
Whatever method you try, you should profile to make sure that:
the efficiency is required (i.e. a straightforward implementation is not fast enough for you);
the method you're trying is faster than the others for your application on real data.
You could multiply by 10000, truncate as an integer, and divide again. Converting between double and int might be faster or slower for you.
You could truncate on output, e.g. a printf format string of "%.4f"
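A rough sketch of the format-on-output option, using std::snprintf. The helper name format_4dp and the buffer size are assumptions of mine. Keep in mind that "%.4f" rounds to nearest; it does not truncate.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format `d` with four digits after the decimal point.
// The stored double is left untouched; only the text is shortened.
std::string format_4dp(double d) {
    char buf[512];  // assumed large enough for any finite double in %f notation
    const int n = std::snprintf(buf, sizeof buf, "%.4f", d);
    if (n < 0 || n >= static_cast<int>(sizeof buf)) {
        return std::string();  // encoding error or output did not fit
    }
    return std::string(buf, static_cast<std::size_t>(n));
}
```

With this, format_4dp(3.14159265) yields "3.1416" (rounded, not "3.1415").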

You could replace pow with a more efficient integer-based variant instead. There's one here on Stack Overflow: The most efficient way to implement an integer based power function pow(int, int)
Also, if you can accept some inaccuracy, replace the divide with a multiply. Divisions are one of the slowest common math operations.
Other than that, I'll echo what others have said and simply truncate on output, unless you actually need to use the truncated double in calculations.
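To make the two suggestions above concrete, here is a sketch that combines an integer power function (exponentiation by squaring) with a multiply by a precomputed reciprocal. The names ipow and truncate_fast are hypothetical, and the code assumes d * 10^places fits in a 64-bit signed integer.

```cpp
#include <cassert>
#include <cstdint>

// Exponentiation by squaring for small non-negative integer exponents.
// No overflow check: the caller must keep base^exp within 64 bits.
std::uint64_t ipow(std::uint64_t base, unsigned exp) {
    std::uint64_t result = 1;
    while (exp != 0) {
        if (exp & 1u) result *= base;
        exp >>= 1;
        if (exp != 0) base *= base;
    }
    return result;
}

// Truncate toward zero via an integer conversion, then scale back with a
// multiply instead of a divide. The reciprocal is inexact, so the result
// may differ from the divide-based version in the last bit.
double truncate_fast(double d, unsigned places = 4) {
    assert(places <= 15);  // assumed limit so that 10^places is exact as a double
    const double factor = static_cast<double>(ipow(10, places));
    const double inverse = 1.0 / factor;
    // Assumption: d * factor lies within the range of std::int64_t.
    return static_cast<double>(static_cast<std::int64_t>(d * factor)) * inverse;
}
```

Whether this beats pow plus a divide depends on your compiler and CPU, so measure it.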

If you need to perform exact calculations that involve decimal digits, then stop using double right now! It's not the right data type for your purpose. You will not get actually rounded decimal values. Almost all values will (after truncation, no matter what method you use) in fact be something like 1.000099999999999989, not 1.0001.
That's because double is implemented using binary fractions, not decimal ones. There are decimal types you can use instead that will work correctly. They will be a lot slower, but then, if the result does not need to be correct, I know a method to make it infinitely fast...

Related

Why is there a loss in precision when converting char * to float using sscanf_s or atof?

I am trying to convert a char * containing just a floating point value to a type float, but sscanf_s and atof both produce the same invalid result.
char t[] = "2.10";
float aFloat( 0.0f ), bFloat( 0.0f );
sscanf_s( t, "%f", &aFloat );
bFloat = atof( t );
Output:
aFloat: 2.09999990
bFloat: 2.09999990
When I looked at similar questions in an attempt to ascertain the answer I attempted their solutions to no avail.
Converting char* to float or double
The solution given here was to include 'stdlib.h', and after doing so I changed the call to atof to an explicit call 'std::atof', but still no luck.
Unfortunately, not all floating point values can be exactly represented in binary form. You will get the same result if you say
float myValue = 2.10;
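A small sketch to show that the stored float is simply the nearest representable value, and that the "error" only becomes visible when you print more digits than a float can hold. The helper name parse_and_format is made up for this illustration; std::strtof stands in for atof/sscanf_s.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: parse text as a float, then print it with `digits`
// digits after the decimal point.
std::string parse_and_format(const char* text, int digits) {
    assert(digits >= 0 && digits <= 20);  // assumed limit so the buffer is large enough
    const float value = std::strtof(text, nullptr);  // nearest float to the decimal text
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.*f", digits, static_cast<double>(value));
    return buf;
}
```

parse_and_format("2.10", 8) gives "2.09999990", matching the output in the question, while parse_and_format("2.10", 2) gives "2.10".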
The excellent answer in the comments is missing (or I couldn't easily find it there) one other option for dealing with this.
You should have written why you need a floating point number. If you happen to be working with monetary amounts (and not too huge ones), you can write a custom parser for input values and a custom formatter for output, read each amount as a 64-bit integer (amount * 100), and work with these 100*amount values throughout your application. If you are working with really huge amounts, use a big-number library, or create your own that works with char* numbers.
It's a special case of fixed-point arithmetic.
If you just want this solved without writing too much code, head for a big-number library anyway: even the *100 fixed-point variant is easy to get wrong if it's your first time and you don't have the resources to do it properly (TDD advised).
But definitely learn how numbers are stored in a computer, and why float/double can't represent all numbers. For a computer (which uses base 2 internally), the float 2.1 is like 1/3 for a human: it can't be represented in base 10 without an infinite number of decimal places (similarly, 1.0 == 0.99999... in base 10). (thanks #tobi303)
After reading your new comment asking "Does this not have a big impact on financial applications?":
Answer: nope, zero impact; nobody sane (and professional) would create a financial application with floats or doubles.
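A minimal sketch of the *100 fixed-point idea described above, assuming amounts with at most two decimals that fit in a 64-bit integer. The function names parse_cents and format_cents are hypothetical, and there is no overflow checking, so treat it as an illustration rather than production code.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical parser: "2.10" -> 210. Accepts an optional leading '-',
// at least one digit, and optionally '.' followed by one or two digits.
// Returns false for anything else.
bool parse_cents(const std::string& text, std::int64_t& cents_out) {
    std::size_t i = 0;
    bool negative = false;
    if (i < text.size() && text[i] == '-') { negative = true; ++i; }

    std::int64_t whole = 0;
    std::size_t whole_digits = 0;
    while (i < text.size() && text[i] >= '0' && text[i] <= '9') {
        whole = whole * 10 + (text[i] - '0');
        ++i;
        ++whole_digits;
    }
    if (whole_digits == 0) return false;

    std::int64_t frac = 0;
    std::size_t frac_digits = 0;
    if (i < text.size() && text[i] == '.') {
        ++i;
        while (i < text.size() && text[i] >= '0' && text[i] <= '9' && frac_digits < 2) {
            frac = frac * 10 + (text[i] - '0');
            ++i;
            ++frac_digits;
        }
        if (frac_digits == 0) return false;  // "2." is rejected
    }
    if (i != text.size()) return false;      // trailing characters or a third decimal
    if (frac_digits == 1) frac *= 10;        // "2.1" means 10 cents

    const std::int64_t total = whole * 100 + frac;
    cents_out = negative ? -total : total;
    return true;
}

// Hypothetical formatter: 210 -> "2.10", -5 -> "-0.05".
std::string format_cents(std::int64_t cents) {
    const bool negative = cents < 0;
    const std::uint64_t magnitude = negative
        ? 0 - static_cast<std::uint64_t>(cents)
        : static_cast<std::uint64_t>(cents);
    std::string frac = std::to_string(magnitude % 100);
    if (frac.size() < 2) frac.insert(0, "0");
    return (negative ? "-" : "") + std::to_string(magnitude / 100) + "." + frac;
}
```

All arithmetic in between then happens on exact integers, so "2.10" round-trips as "2.10".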

Adding double to double in a more fixed way?

I'm using double instead of float in my code, but unfortunately I ran into the following problem.
When I try to add:
1.000000000000020206059048177849 + 0.000000000000020206059048177849
I get this result:
1.000000000000040400000000000000
which loses the last 14 digits. I want the result to be more accurate.
I know this might look silly, but it really is important to me. Can anyone help?
Here's a simple code example:
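#include <cstdlib>   // system
#include <iomanip>   // setprecision
#include <iostream>
using namespace std;
int main()
{
    // Both literals are rounded to the nearest double before they are added.
    double a=1.000000000000020206059048177849 + 0.000000000000020206059048177849;
    cout<<fixed<<setprecision(30)<<a;
    system("pause");  // Windows-only: keeps the console window open
    return 0;
}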
Update: The answer below assumes that the expression is evaluated during run-time, i.e. you are not adding compile-time constants. This is not necessarily true, your compiler may evaluate the expression during compile time. It may use higher precision for this. As suggested in the comments, the way you print out the number might be the root cause for your problem.
If you absolutely need more precision and can't work around it any other way, your only option is to increase the precision. double values provide about 16 decimal digits of precision. You have the following options:
Use a library that provides higher precision by implementing floating point operations in software. This is slow, but you can get as precise as you want to, see e.g. GMP, the GNU Multiple Precision Library.
The other option is to use a long double, which is at least as precise as double. On some platforms, long double may even provide more precision than a double, but in general it does not. On your typical desktop PC it may be 80 bits long (compared to 64 bits), but this is not necessarily true and depends on your platform and your compiler. It is not portable.
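Because the gain from long double is platform-dependent, it can help to ask the compiler what you actually get. A small sketch using std::numeric_limits; the constant and function names are my own.

```cpp
#include <limits>

// Decimal digits guaranteed to survive a text -> value -> text round trip
// on this platform and compiler.
constexpr int double_digits = std::numeric_limits<double>::digits10;
constexpr int long_double_digits = std::numeric_limits<long double>::digits10;

// True only if long double really has a wider mantissa than double here.
constexpr bool long_double_is_wider() {
    return std::numeric_limits<long double>::digits >
           std::numeric_limits<double>::digits;
}
```

If long_double_is_wider() is false on your target, switching types buys nothing and a software library is the remaining option.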
Maybe, you can avoid the hassle and tune your implementation a bit in order to avoid floating point errors. Can you reorder operations? Your intermediate results are of the form 1+x. Is there a way to compute x instead of 1+x? Subtracting 1 is not an option here, of course, because then precision of x is already lost.
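One way to read the "compute x instead of 1+x" suggestion is to store only the small offset and add the 1 at the very end. A sketch under that assumption; the OnePlus type and its functions are hypothetical.

```cpp
// Hypothetical representation: the value is 1 + offset, but only the small
// offset is stored, so it keeps its full ~16 significant digits.
struct OnePlus {
    double offset;
};

// Adding a small quantity only touches the offset.
OnePlus add_small(OnePlus a, double x) {
    return OnePlus{a.offset + x};
}

// Precision is lost only here, when the final value is formed.
double to_double(OnePlus a) {
    return 1.0 + a.offset;
}
```

For instance, adding 1e-17 twice to a plain double holding 1.0 leaves it at 1.0, while the offset in OnePlus ends up as 2e-17.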

how to take the root of a very large number?

given x=4 and y=1296;
we need to solve for z in z^x=y;
we can calculate z=6 in various ways;
Question is how do I find z if y is a very large number greater than 10^100? I obviously can't store that number as int, so how would I go about calculating z?
C++ implementation would be nice, if not, any solution will work.
It depends on the accuracy required. Since 1e100 cannot be exactly represented by a double, you have a problem.
Computing z = exp(log(y)/x) works, if you are willing to accept that it does not yield an exact solution. But then, I just said that 1e100 is not represented exactly as a double anyway. Thus, in MATLAB,
exp(log(1e100)/4)
ans =
1e+25
Ok, so it looks like 1e25 is the answer, but is it really? In fact, the number we really get, in terms of a double, is: 10000000000000026675773440.
One problem is the original number was not represented exactly anyway. So 1e100, when stored in the IEEE format, is more accurately stored as something like this:
1.00000000000000001590289110975991804683608085639452813897813e100
To solve this exactly, you would best be served by a big integer form, but a big decimal form would do reasonably well too.
Thus, in MATLAB, using my big decimal (HPF) form we see that 1e100 is exactly represented in 100 digits of precision.
x = hpf('1e100',100)
x =
1.e100
And, to 100 digits of precision, the root is correct.
exp(log(x)/4)
ans =
10000000000000000000000000
Actually though, be careful, as any floating point form cannot represent real numbers exactly. To more precision, we see that the number computed was actually slightly in error:
9999999999999999999999999.9999999999999999999999999999999999999999999999999999999999999999999999999999999999800
A big integer form will yield an exact result, if one exists. Thus, using a big integer form, we see the expected result:
vpi(10)^100
ans =
10000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000
nthroot(vpi(10)^100,4)
ans =
10000000000000000000000000
The point is, to do the computation you desire, you need to use tools that can do the computation. There are many such big decimal or big integer tools to be had. For example, Java has a BigDecimal and a BigInteger form that I have used on occasion (though I've written my own tools anyway, thus in MATLAB, HPF and VPI.)
Maybe you can do something evil with logarithms.
Maybe there is a library you can find that lets you deal with big integers.
You can try to use Newton's method. In this case you need to use arbitrary-precision arithmetic.
That is, you need to write a class for arbitrary-precision numbers. It would be composed of a mantissa, represented by an array of digits, and an exponent, represented by an integer. You should implement the basic operations on these numbers, similar to the pencil-and-paper methods. Then you should implement Newton's algorithm as described on the wiki.
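Here is a sketch of the Newton iteration for an integer n-th root, written as a template so the integer type can be swapped. It is only demonstrated with std::uint64_t; for y greater than 10^100 you would have to plug in an arbitrary-precision integer type (your own class, or something like Boost.Multiprecision's cpp_int, which is an assumption on my part and not tested here). The name integer_root is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Returns floor(y^(1/n)) for y >= 0 and n >= 1 using Newton's iteration
//   x_next = ((n - 1) * x + y / x^(n-1)) / n
// carried out entirely in integer arithmetic.
template <typename Int>
Int integer_root(Int y, unsigned n) {
    assert(n >= 1);
    if (n == 1 || y < 2) return y;

    // Initial guess: a power of two that is guaranteed to be >= the true root.
    unsigned bits = 0;
    for (Int t = y; t != 0; t >>= 1) ++bits;
    Int x = Int(1) << ((bits + n - 1) / n);

    for (;;) {
        // y / x^(n-1), computed by repeated division to avoid overflow.
        Int q = y;
        for (unsigned i = 0; i + 1 < n; ++i) q /= x;

        const Int next = ((n - 1) * x + q) / n;
        if (next >= x) return x;  // the sequence stopped decreasing: x is the floor root
        x = next;
    }
}
```

With the numbers from the question, integer_root<std::uint64_t>(1296, 4) returns 6. When y is not a perfect power the result is rounded down, e.g. the cube root of 999 comes out as 9.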

Two ints to one double C++

I am having a bit of a problem here.
I have two int values, one for dollars and one for cents. My job is to combine them into one double value and I am having some trouble.
Here's an example of what I want to be able to do:
int dollars = 10
int cents = 50
<some code which I haven't figured out yet>
double total = 10.50
I want to think it is relatively simple, but I'm having a hard time figuring it out.
Thanks for the help!
Start by thinking how you would solve this as a simple arithmetic problem, with pencil and paper (nothing to do with C). Once you find a way to do it manually, I'm sure the way to program it will seem trivial.
How about double total = double(dollars) + double(cents) / 100.0;?
Note that double is not a good data type to represent 10-based currencies, due to its inability to represent 1/100 precisely. Consider a fixed-point solution instead, or perhaps a decimal float (those are rare).
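Both variants side by side, as a minimal sketch: the double conversion the question asks for, and the fixed-point alternative that keeps everything in whole cents. The function names are made up for illustration.

```cpp
#include <cstdint>

// What the question asks for: combine dollars and cents into one double.
// Dividing by 100.0 (a double) avoids integer division.
double to_double_total(int dollars, int cents) {
    return static_cast<double>(dollars) + static_cast<double>(cents) / 100.0;
}

// Fixed-point alternative: represent the amount as a whole number of cents,
// which is exact for every dollars/cents pair.
std::int64_t to_total_cents(int dollars, int cents) {
    return static_cast<std::int64_t>(dollars) * 100 + cents;
}
```

to_double_total(10, 50) gives 10.5, which happens to be exactly representable; most other cent values (such as 10.10) are not, whereas to_total_cents(10, 50) is simply 1050.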
That's not difficult... you have to convert dollars to a double [1] and add cents multiplied by 0.01 (or divided by 100. - notice the trailing dot, which indicates that 100. is a double constant, so / will perform a floating-point division instead of an integer division).
... but be aware of the fact that storing monetary values in binary floating-point variables is not a good idea at all, because binary doesn't have a finite representation of many "exact" decimal amounts (e.g. 0.1), which will be stored in an approximate representation. Working with such values may yield "strange" results when you start to do some arithmetic with them.
[1] Actually, depending on your expression, the explicit conversion is probably not necessary due to implicit conversions.
If you're interested in 'the whole idea' of programming and not only in getting your homework right, I suggest you think about this: "Is there any way I can represent a whole dollar as a certain amount of cents?" Why should you ask this? Because if you want to represent two different 'types' of certain values as one value, you need to 'normalize' them or 'standardize' them in a way so that there is not any data loss or corruption (or at least for the smaller problems).
Also I agree with Kerrek SB, representing money as double might not be the best solution.
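Isn't it just as easy as: total = dollars + (cents/100.0); ? (Note the 100.0: with two ints, cents/100 would be an integer division and the cents would be lost.)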
No reason to complicate this.

Preventing Rounding Errors

I was just reading about rounding errors in C++. So, if I'm writing a math-intensive program (or doing any important calculations), should I just drop floats altogether and use only doubles, or is there an easier way to prevent rounding errors?
Obligatory lecture: What Every Programmer Should Know About Floating-Point Arithmetic.
Also, try reading IEEE Floating Point standard.
You'll always get rounding errors, unless you use an arbitrary-precision library such as gmplib. You have to decide if your application really needs this kind of effort.
Or, you could use integer arithmetic, converting to floats only when needed. This is still hard to do, you have to decide if it's worth it.
Lastly, you can use float or double, taking care not to make assumptions about values at the limit of the representation's precision. I wish this Valgrind plugin were implemented (grep for float)...
The rounding errors are normally very insignificant, even using floats. Mathematically-intense programs like games, which do very large numbers of floating-point computations, often still use single-precision.
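To get a feel for how the two precisions compare when errors accumulate, here is a small sketch that adds the same term many times. The helper names repeated_sum and abs_error are hypothetical, and the exact figures you see will depend on your platform and compiler flags.

```cpp
#include <cmath>

// Adds `term` to an accumulator `count` times, entirely in type Real.
template <typename Real>
Real repeated_sum(Real term, long count) {
    Real sum = 0;
    for (long i = 0; i < count; ++i) {
        sum += term;  // each addition rounds to the precision of Real
    }
    return sum;
}

// Absolute distance between a computed value and the exact expected one.
double abs_error(double computed, double exact) {
    return std::fabs(computed - exact);
}
```

Comparing abs_error(repeated_sum<float>(0.1f, 1000000), 100000.0) with the same call for double shows whether single precision is good enough for your workload: for a handful of operations the difference is negligible, but over a million additions the float total drifts visibly while the double total stays close.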
This might work if your highest number is less than 10 billion and you're using C++ double precision.
if ( ceil(10000*(x + 0.00001)) > ceil(10000*(x - 0.00001))) {
x = ceil(10000*(x + 0.00004)) / 10000;
}
This should allow at least the last digit to be off +/- 9. I'm assuming dividing by 10000 will always just move the decimal point. If not, then maybe it could be done in binary.
You would have to apply it after every operation that is not +, -, *, or a comparison. For example, you can't do two divisions in the same formula because you'd have to apply it to each division.
If that doesn't work, you could work in integers by scaling the numbers up and always using integer division. If you need advanced functions, maybe there is a package that does deterministic integer math. Integer division is required in a lot of financial settings because round-off error can be exploited, as in the movie "Office Space".