C/C++ rounding up decimals with a certain precision, efficiently

I'm trying to optimize the following. The code below does this:
If a = 0.775 and I need precision 2 dp, then a => 0.78.
Basically, if the last digit is 5, it rounds the next digit upwards; otherwise it doesn't.
My problem was that 0.45 doesn't round to 0.5 with 1 decimal point, as the value is saved as 0.44999999343... and setprecision rounds it to 0.4.
That's why setprecision is forced to be higher, setprecision(p+10), and then, if the number really ends in a 5, a small amount is added in order to round up correctly.
Once done, it compares a with the string b and returns the result. The problem is that this function is called a few billion times, making the program crawl. Any better ideas on how to rewrite/optimize this, and which functions in the code are so heavy on the machine?
#include <string>
#include <sstream>
#include <iomanip>
using namespace std;

bool match(double a, string b, int p) { // p = precision, no greater than 7 dp
    double t[] = {0.2, 0.02, 0.002, 0.0002, 0.00002, 0.000002, 0.0000002, 0.00000002};
    stringstream buff;
    string temp;
    buff << setprecision(p + 10) << setiosflags(ios_base::fixed) << a; // p + 10 digits of precision
    buff >> temp;
    if (temp[temp.size() - 10] == '5') a += t[p]; // help to round upwards
    ostringstream test;
    test << setprecision(p) << setiosflags(ios_base::fixed) << a;
    temp = test.str();
    if (b.compare(temp) == 0) return true;
    return false;
}

I wrote an integer square root subroutine with nothing more than a couple dozen lines of ASM, with no API calls whatsoever - and it still could only do about 50 million SqRoots/second (this was about five years ago ...).
The point I'm making is that if you're going for billions of calls, even today's technology is going to choke.
But if you really want to make an effort to speed it up, remove as many API usages as humanly possible. This may require you to perform API tasks manually, instead of letting the libraries do it for you. Specifically, remove any type of stream operation. Those are slower than dirt in this context. You may really have to improvise there.
The only thing left to do after that is to replace as many lines of C++ as you can with custom ASM - but you'll have to be a perfectionist about it. Make sure you are taking full advantage of every CPU cycle and register - as well as every byte of CPU cache and stack space.
You may consider using integer values instead of floating point, as these are far more ASM-friendly and much more efficient. You'd have to multiply the number by 10^7 (or 10^p, depending on how you decide to form your logic) to move the decimal point all the way to the right. Then you could safely convert the floating-point value into a basic integer.
You'll have to rely on the computer hardware to do the rest.
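A minimal sketch of that integer idea, assuming p <= 7, that the scaled values fit in a 64-bit integer, and that the string comparand has already been parsed into a double once up front (match_int and the scale table are illustrative names, not from the original code):

#include <cmath>
#include <cstdint>

// Sketch: shift the decimal point right by multiplying by 10^p, then round
// to a plain integer and compare integers. std::llround rounds halfway cases
// away from zero, so an exact .5 in the double rounds upward; values stored
// just below .5 (like the 0.44999... case) still round down.
bool match_int(double a, double b, int p) {
    static const double scale[] = {1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7};
    const std::int64_t ia = std::llround(a * scale[p]);
    const std::int64_t ib = std::llround(b * scale[p]);
    return ia == ib;
}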
<--Microsoft Specific-->
I'll also add that C++ identifiers (including static ones, as Donnie DeBoer mentioned) are directly accessible from ASM blocks nested into your C++ code. This makes inline ASM a breeze.
<--End Microsoft Specific-->

Depending on what you want the numbers for, you might want to use fixed point numbers instead of floating point. A quick search turns up this.

I think you can just add 0.005 for precision to hundredths, 0.0005 for thousandths, etc., snprintf the result with something like "%1.2f" (hundredths; "%1.3f" for thousandths, etc.) and compare the strings. You should be able to table-ize or parameterize this logic.
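A sketch of that suggestion with the nudge table parameterized by precision (match_snprintf and the half table are illustrative names; this implements the add-half-then-print idea exactly as described above):

#include <cstdio>
#include <cstring>

// Add half a unit in the last requested place, print with a run-time
// precision ("%.*f" takes the digit count as an argument), and compare.
bool match_snprintf(double a, const char* b, int p) {
    static const double half[] = {0.5, 0.05, 0.005, 0.0005,
                                  0.00005, 0.000005, 0.0000005, 0.00000005};
    char buf[64];
    snprintf(buf, sizeof buf, "%.*f", p, a + half[p]);
    return strcmp(buf, b) == 0;
}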

You could save some major cycles in your posted code just by making that double t[] static, so that it's not initialized over and over.
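In the posted function, that's a one-line change (const added since the table never changes):

static const double t[] = {0.2, 0.02, 0.002, 0.0002,
                           0.00002, 0.000002, 0.0000002, 0.00000002};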

Try this instead:
#include <cmath>
double setprecision(double x, int prec) {
    return ceil(x * pow(10, (double)prec) - .4999999999999)
           / pow(10, (double)prec);
}
It's probably faster. Maybe try inlining it as well, but that might hurt if it doesn't help.
Example of how it works:
2.345 * 100 (10 to the 2nd power) = 234.5
234.5 - .4999999999999 = 234.0000000000001
ceil( 234.0000000000001 ) = 235
235 / 100 (10 to the 2nd power) = 2.35
The .4999999999999 was chosen because of the precision of a C++ double on a 32-bit system. If you're on a 64-bit platform you'll probably need more nines. If you increase the nines further on a 32-bit system, the subtraction is absorbed by rounding and it rounds down instead of up, i.e. 234.00000000000001 gets truncated to 234 in a double in (my) 32-bit environment.

Using floating point (an inexact representation) means you've lost some information about the true number. You can't simply "fix" the value stored in the double by adding a fudge value. That might fix certain cases (like .45), but it will break other cases. You'll end up rounding up numbers that should have been rounded down.
Here's a related article:
http://www.theregister.co.uk/2006/08/12/floating_point_approximation/

I'm taking a guess at what you really mean to do. I suspect you're trying to see if a string contains a decimal representation of a double to some precision. Perhaps it's an arithmetic quiz program, and you're trying to see if the user's response is "close enough" to the real answer. If that's the case, it may be simpler to convert the string to a double and see if the absolute value of the difference between the two doubles is within some tolerance.
#include <sstream>
#include <string>

double string_to_double(const std::string &s)
{
    std::stringstream buffer(s);
    double d = 0.0;
    buffer >> d;
    return d;
}

bool match(const std::string &guess, double answer, int precision)
{
    static const double thresh[] = { 0.5, 0.05, 0.005, 0.0005, /* etc. */ };
    const double g = string_to_double(guess);
    const double delta = g - answer;
    return -thresh[precision] < delta && delta <= thresh[precision];
}
Another possibility is to round the answer first (while it's still numeric) BEFORE converting it to a string.
#include <iomanip>

bool match2(const std::string &guess, double answer, int precision)
{
    static const double thresh[] = { 0.5, 0.05, 0.005, 0.0005, /* etc. */ };
    const double rounded = answer + thresh[precision];
    std::stringstream buffer;
    // std::fixed so that setprecision counts decimal places rather than
    // significant digits, matching the format of the guess string
    buffer << std::fixed << std::setprecision(precision) << rounded;
    return guess == buffer.str();
}
Both of these solutions should be faster than your sample code, but I'm not sure if they do what you really want.

As far as I can see, you are checking whether a, rounded to p decimal places, equals b.
Instead of converting a to a string, go the other way and convert the string to a double (just multiplications and additions, or only additions using a small table), then subtract the two numbers and check whether the difference is in the proper range (if p == 1, then abs(b - a) < 0.05).

Old-time developer's trick from the dark ages of pounds, shillings and pence in the old country.
The trick was to store the value as a whole number of half-pennies (or whatever your smallest unit is). Then all your subsequent arithmetic is straightforward integer arithmetic, and rounding etc. will take care of itself.
So in your case you store your data in units of 200ths of whatever you are counting,
do simple integer calculations on these values, and divide by 200 into a float variable whenever you want to display the result.
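A tiny sketch of the trick, assuming the smallest unit is a half-penny, so one pound is 200 units (the names are illustrative):

#include <cstdio>

typedef long long halfpennies; // whole number of 200ths of a pound

int main() {
    halfpennies price = 398;        // 1.99 pounds, stored exactly
    halfpennies total = price * 3;  // all arithmetic is plain integer math
    double display = total / 200.0; // convert to float only for display
    printf("%.2f\n", display);      // 5.97
}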
I believe Boost has a "BigDecimal" library these days, but your requirement for run-time speed would probably exclude this otherwise excellent solution.

Errors in Casting Doubles to Integers

I am calculating the number of significant numbers past the decimal point. My program discards any numbers that are spaced more than 7 orders of magnitude apart after the decimal point. Expecting some error with doubles, I accounted for very small numbers popping up when subtracting ints from doubles, even when it looked like it should equal zero (To my knowledge this is due to how computers store and compute their numbers). My confusion is why my program does not handle this unexpected number given this random test value.
Having put in many cout statements, it would seem that it messes up when it tries to cast the final 2. Whenever it casts, it casts to 1 instead.
bool flag = true;
long double test = 2029.00012;
int count = 0;

while (flag)
{
    test = test - static_cast<int>(test);
    if (test <= 0.00001)
    {
        flag = false;
    }
    test *= 10;
    count++;
}
The solution I found was to cast only once at the beginning, as rounding may produce a negative and terminate prematurely, and to round thenceforth. The interesting thing is that both trunc and floor also had this issue, seemingly turning what should be a 2 into a 1.
My Professor and I were both quite stumped as I fully expected small numbers to appear (most were in the 10^-10 range), but was not expecting that casting, truncing, and flooring would all also fail.
It is important to understand that not all rational numbers are representable in finite precision. Also, it is important to understand that set of numbers which are representable in finite precision in decimal base, is different from the set of numbers that are representable in finite precision in binary base. Finally, it is important to understand that your CPU probably represents floating point numbers in binary.
2029.00012 in particular happens to be a number that is not representable in a double precision IEEE 754 floating point (and it indeed is a double precision literal; you may have intended to use long double instead). It so happens that the closest number that is representable is 2029.000119999999924402800388634204864501953125. So, you're counting the significant digits of that number, not the digits of the literal that you used.
If the intention of 0.00001 was to stop counting digits when the number is close to a whole number, it is not sufficient to check whether the value is less than the threshold, but also whether it is greater than 1 - threshold, as the representation error can go either way:
if(test <= 0.00001 || test >= 1 - 0.00001)
After all, you can multiply 0.99999999999999999999999999 by 10 many times before the result becomes close to zero, even though that number is very close to a whole number.
As multiple people have already commented, that won't work because of limitations of floating-point numbers. You had a somewhat correct intuition when you said that you expected "some error" with doubles, but that is ultimately not enough. Running your specific program on my machine, the closest representable double to 2029.00012 is 2029.0001199999999244 (this is actually a truncated value, but it shows the series of 9's well enough). For that reason, when you multiply by 10, you keep finding new significant digits.
Ultimately, the issue is that you are manipulating a base-2 real number like it's a base-10 number. This is actually quite difficult. The most notorious use cases for this are printing and parsing floating-point numbers, and a lot of sweat and blood went into that. For example, it wasn't that long ago that you could trick the official Java implementation into looping endlessly trying to convert a String to a double.
Your best shot might be to just reuse all that hard work. Print to 7 digits of precision, and subtract the number of trailing zeroes from the result:
#include <iostream>
#include <sstream>
#include <iomanip>
#include <string>

int main() {
    long double d = 2029.00012;
    std::stringstream stream;
    stream << std::fixed << std::setprecision(7) << d;
    auto double_string = stream.str();
    auto first_decimal_index = double_string.find('.') + 1;
    auto last_nonzero_index = double_string.find_last_not_of('0');
    if (last_nonzero_index == std::string::npos) {
        std::cout << "7 significant digits\n";
    } else if (last_nonzero_index < first_decimal_index) {
        std::cout << -static_cast<long>(first_decimal_index - last_nonzero_index + 1) << " significant digits\n";
    } else {
        // +1 makes the count inclusive of both endpoints (prints 5 here)
        std::cout << (last_nonzero_index - first_decimal_index + 1) << " significant digits\n";
    }
}
It feels unsatisfactory, but:
it correctly prints 5;
the "satisfactory" alternative is possibly significantly harder to implement.
It seems to me that your second-best alternative is to read up on floating-point printing algorithms and implement just enough of them to get the length of the value that you're going to print, and that's not exactly an introductory-level task. If you decide to go this route, the current state of the art is the Grisu2 algorithm. Grisu2 has the notable benefit that it will always print the shortest base-10 string that will produce the given floating-point value, which is what you seem to be after.
If you want sane results, you can't just truncate the digits, because sometimes the floating point number will be a hair less than the rounded number. If you want to fix this via a fluke, change your initialization to be
long double test = 2029.00012L;
If you want to fix it for real,
bool flag = true;
long double test = 2029.00012;
int count = 0;

while (flag)
{
    test = test - static_cast<int>(test + 0.000005);
    if (test <= 0.00001)
    {
        flag = false;
    }
    test *= 10;
    count++;
}
My apologies for butchering your haphazard indentation; I can't abide it. According to one of my CS professors, "ideally, a computer scientist never has to worry about the underlying hardware." I'd guess your CS professor might have similar thoughts.

What can std::numeric_limits<double>::epsilon() be used for?

#include <cmath>  // log, ceil
#include <limits> // std::numeric_limits

unsigned int updateStandardStopping(unsigned int numInliers, unsigned int totPoints, unsigned int sampleSize)
{
    double max_hypotheses_ = 85000;
    double n_inliers = 1.0;
    double n_pts = 1.0;
    double conf_threshold_ = 0.95;

    for (unsigned int i = 0; i < sampleSize; ++i)
    {
        n_inliers *= numInliers - i; // n_inliers = I(I-1)...(I-m+1)
        n_pts *= totPoints - i;      // n_pts = N(N-1)(N-2)...(N-m+1)
    }

    double prob_good_model = n_inliers / n_pts;

    if ( prob_good_model < std::numeric_limits<double>::epsilon() )
    {
        return max_hypotheses_;
    }
    else if ( 1 - prob_good_model < std::numeric_limits<double>::epsilon() )
    {
        return 1;
    }
    else
    {
        double nusample_s = log(1 - conf_threshold_) / log(1 - prob_good_model);
        return (unsigned int) ceil(nusample_s);
    }
}
Here is a selection statement:
if ( prob_good_model < std::numeric_limits<double>::epsilon() )
{...}
To my understanding, this condition is the same as (or an approximation to)
prob_good_model < 0
So, am I right or not, and where else can std::numeric_limits<double>::epsilon() be used besides that?
The point of epsilon is to make it (fairly) easy for you to figure out the smallest difference you could see between two numbers.
You don't normally use it exactly as-is though. You need to scale it based on the magnitudes of the numbers you're comparing. If you have two numbers around 1e-100, then you'd use something on the order of: std::numeric_limits<double>::epsilon() * 1.0e-100 as your standard of comparison. Likewise, if your numbers are around 1e+100, your standard would be std::numeric_limits<double>::epsilon() * 1e+100.
If you try to use it without scaling, you can get drastically incorrect (utterly meaningless) results. For example:
if (std::abs(1e-100 - 1e-200) < std::numeric_limits<double>::epsilon())
Yup, that's going to show up as "true" (i.e., saying the two are equal) even though they differ by 100 orders of magnitude. In the other direction, if the numbers are much larger than 1, comparing to an (unscaled) epsilon is equivalent to saying if (x != y)--it leaves no room for rounding errors at all.
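A minimal sketch of the scaling idea (nearly_equal is an illustrative name; real code often multiplies in an extra factor of a few ULPs, for the reasons discussed in the next paragraphs):

#include <cmath>
#include <limits>
#include <algorithm>

// Scale epsilon by the magnitude of the operands instead of using it raw.
bool nearly_equal(double a, double b) {
    const double magnitude = std::max(std::abs(a), std::abs(b));
    return std::abs(a - b) <= std::numeric_limits<double>::epsilon() * magnitude;
}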
At least in my experience, the epsilon specified for the floating point type isn't often of a great deal of use though. With proper scaling, it tells you the smallest difference there could possibly be between two numbers of a given magnitude (for a particular floating point implementation).
In practice, however, that's of relatively little use. A more realistic number will typically be based on the precision of the inputs, and an estimate of the amount of precision you're likely to have lost due to rounding (and such).
For example, let's assume you started with values measured to a precision of 1 part per million, and you did only a few calculations, so you believe you may have lost as much as 2 digits of precision due to rounding errors. In this case, the "epsilon" you care about is roughly 1e-4, scaled to the magnitude of the numbers you're dealing with. That is to say, under those circumstances, you can expect on the order of 4 digits of precision to be meaningful, so if you see a difference in the first four digits, it probably means the values aren't equal, but if they differ only in the fifth (or later) digits, you should probably treat them as equal.
The fact that the floating point type you're using can represent (for example) 16 digits of precision doesn't mean that every measurement you use is going to be nearly that precise; in fact, it's relatively rare that anything based on physical measurements has any hope of being even close to that precise. It does, however, give you a limit on what you can hope for from a calculation: even if you start with a value that's precise to, say, 30 digits, the most you can hope for after calculation is going to be defined by std::numeric_limits<T>::epsilon.
It can be used in situations where a function is undefined, but you still need a value at that point. You lose a bit of accuracy, especially in extreme cases, but sometimes that's all right.
Say you're using 1/x somewhere, but your range of x is [0, n[. You can use 1/(x + std::numeric_limits<double>::epsilon()) instead, so that x == 0 is still defined. That being said, you have to be careful with how the value is used; it might not work for every case.
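A hedged sketch of that 1/x case (safe_reciprocal is an illustrative name; the added epsilon only changes anything measurable when x is tiny, which is exactly the caveat above):

#include <limits>

// Keeps 1/x defined at x == 0, at the cost of a little accuracy near zero.
double safe_reciprocal(double x) {
    return 1.0 / (x + std::numeric_limits<double>::epsilon());
}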

C++ decimal checking if statement errors?

I'm doing some homework for a C++ class, and I'm pretty new to C++. I've run into some issues with my if statement. What I'm doing is I have the user input a time between 0.00 and 23.59 (the : is replaced by a period, btw; that part works). I then separate the hour and the minute and check them to make sure that they are within valid constraints. Checking the hour works, but not the minute. Here's my code:
minute = startTime - static_cast<int>(startTime);
hour = static_cast<int>(startTime);

// check validity
if (minute > 0.59) {
    cout << "ERROR! ENTERED INVALID TIME! SHUTTING DOWN..." << endl;
    return(0);
}
if (hour > 23) {
    cout << "ERROR! ENTERED INVALID TIME! SHUTTING DOWN..." << endl;
    return(0);
}
Again, the hour works if I enter 23, but if I enter 23.59 I get the error, yet if I enter 23.01 I do not. Also, if I enter 0.59 it gives the error, but not 0.58. I tried switching the if (minute > 0.59) to if (minute > 0.6), but for some reason that caused problems elsewhere. I am at a complete loss as to what to do, so any help would be great! Thanks a million!
EDIT: I just entered 0.58 and it didn't give me the error... but if I make it 1.59 it gives an error again. Also, upvotes would be nice :D
Floating-point arithmetic (float and double) is inherently fuzzy. There are always some digits behind the decimal point that you don't see in the rounded representation that is sent to your stream, and comparisons can also be fuzzy because the representation you are used to (decimal) is not the one the computer uses (binary).
Represent a time as int hours and int minutes, and your problems will fade away. Most libraries measure time in ticks (usually seconds or microseconds) and do not offer sub-tick resolution. You do well to emulate them.
Comparison of floating point numbers is prone to failure, because very few of them can be represented exactly in base 2. There's always going to be some possibility that two different numbers are going to round in different directions.
The simplest fix is to add a very tiny fudge factor to the number you're comparing:
if (minute > 0.59 + epsilon)
See What Every Computer Scientist Should Know About Floating-Point Arithmetic.
Don't, ever, use double (nor float) to store two integer values using the integer and decimal parts as a separator. Floating-point types are not precise enough, so you may get wrong results when a value rounds.
A good solution in your case is either to use a struct (or a pair) or even an integer.
With an integer, you can use mod and div with multiples of ten to extract the real values. Example:
int startTime = 2359;
int hour = startTime / 100;
int minute = startTime % 100;
Although, with a struct the code would look simpler and easier to understand (and to maintain).
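A minimal sketch of the struct version (Time and valid are illustrative names):

struct Time {
    int hour;   // 0..23
    int minute; // 0..59
};

bool valid(const Time& t) {
    return t.hour >= 0 && t.hour <= 23 && t.minute >= 0 && t.minute <= 59;
}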
There's no way you can compare to exactly 0.59; this value cannot be represented in the machine. You're comparing something which is very close to 0.59 with something else that is very close to 0.59.
Personally, I wouldn't input time this way at all. A much better solution would be to learn how to use std::istream to read the value. But if you insist, and you have startTime as your double, you need to multiply it by 100.0 (so that all of the values that interest you are integers, and can be represented exactly), then round it, then convert it to an integer, and use the modulo and division operators on the integer to extract your values. Something along the lines of:
assert( startTime >= 0.0 && startTime <= 24.0 );
int tmpTime = 100.0 * startTime + 0.5;
int minute = tmpTime % 100;
int hour = tmpTime / 100;
When dealing with integral values, it's much simpler to use integral types.

Extracting individual digits from a float

I have been banging my head on this one all day. The C++ project I am currently working on has a requirement to display an editable value. The currently selected digit displays the incremented value above and decremented value below for said digit. It is useful to be able to reference the editable value as both a number and collection of digits. What would be awesome is if there was some indexable form of a floating point number, but I have been unable to find such a solution. I am throwing this question out there to see if there is something obvious I am missing or if I should just roll my own.
Thanks for the advice! I was hoping for a solution that wouldn't convert from float -> string -> int, but I think that is the best way to get away from floating point quantization issues. I ended up going with boost::format and just referencing the individual characters of the string. I can't see that being a huge performance difference compared to using combinations of modf and fmod to attempt to get a digit out of a float (It probably does just that behind the scenes, only more robustly than my implementation).
The internal representation of floating-point numbers isn't what you see. You can only convert to a string.
To convert, do this:
char string[99];
sprintf(string, "%f", floatValue);
Or see this : http://www.parashift.com/c++-faq-lite/misc-technical-issues.html#faq-39.1
The wikipedia article can explain more on the representation: http://en.wikipedia.org/wiki/Floating_point
Oh, there are many ways to convert to a string. (Though I do prefer snprintf() myself.)
Or you could convert to an int and pull the digits out with modulus and integer-division. You can count the number of digits with log{base10}.
(Remember: log{baseA}_X / log{baseA}_B = log{baseB}_X.)
Example:
#include <iostream>
#include <cmath>
using namespace std;

#define SHOW(X) cout << # X " = " << (X) << endl

int main()
{
    double d = 1234.567;
    SHOW( (int(d) % 10000) / 1000 );
    SHOW( (int(d) % 1000) / 100 );
    SHOW( (int(d) % 100) / 10 );
    SHOW( (int(d) % 10) );
    SHOW( (int(d * 10) % 10) );
    SHOW( (int(d * 100) % 10) );
    SHOW( (int(d * 1000) % 10) );
    SHOW( log(d) / log(10) );
}
Though you ought to use static_cast...
Watch out for exponential notation. With a very big or very small numbers, you may have a problem.
Floating point numbers also have round-off issues that may cause you some grief. (It's the same reason why we don't use operator== between two doubles, and why you can't rely on (a*b)*c == a*(b*c); depending on the exact values involved, the results may differ very slightly in the last digits.)
You can convert between string and float conveniently using boost::lexical_cast. However, you can't directly index the float form; it's internally not stored as decimal digits. This is probably not that much of an issue: for your UI, you will most likely keep a string form of the number anyway, with conversions to and from float in the getter/setter.
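For example, a small sketch of that getter/setter round trip with boost::lexical_cast (the literal value is arbitrary):

#include <boost/lexical_cast.hpp>
#include <iostream>
#include <string>

int main() {
    // string kept for the UI -> numeric value, and back again
    float f = boost::lexical_cast<float>("1234.567");
    std::string s = boost::lexical_cast<std::string>(f);
    std::cout << s << '\n';
}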

Where can I find the world's fastest atof implementation?

I'm looking for an extremely fast atof() implementation on IA32 optimized for US-en locale, ASCII, and non-scientific notation. The windows multithreaded CRT falls down miserably here as it checks for locale changes on every call to isdigit(). Our current best is derived from the best of perl + tcl's atof implementation, and outperforms msvcrt.dll's atof by an order of magnitude. I want to do better, but am out of ideas. The BCD related x86 instructions seemed promising, but I couldn't get it to outperform the perl/tcl C code. Can any SO'ers dig up a link to the best out there? Non x86 assembly based solutions are also welcome.
Clarifications based upon initial answers:
Inaccuracies of ~2 ulp are fine for this application.
The numbers to be converted will arrive in ascii messages over the network in small batches and our application needs to convert them in the lowest latency possible.
What is your accuracy requirement? If you truly need it "correct" (always getting the nearest floating-point value to the decimal specified), it will probably be hard to beat the standard library versions (other than removing locale support, which you've already done), since this requires doing arbitrary-precision arithmetic. If you're willing to tolerate an ulp or two of error (and more than that for subnormals), the sort of approach cruzer proposed can work and may be faster, but it definitely will not produce <0.5ulp output. You will do better accuracy-wise to compute the integer and fractional parts separately and compute the fraction at the end (e.g. for 12345.6789, compute it as 12345 + 6789 / 10000.0, rather than 6*.1 + 7*.01 + 8*.001 + 9*0.0001), since 0.1 is a repeating fraction in binary and error accumulates rapidly as you compute 0.1^n. This also lets you do most of the math with integers instead of floats.
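A toy illustration of that last point, using the numbers from the example above (depending on the platform the two results may agree to the last bit for this particular input, but the digit-by-digit sum is the one that drifts as more digits are added):

#include <cstdio>

int main() {
    // fraction built digit by digit from inexact powers of 0.1
    double accumulated = 6 * 0.1 + 7 * 0.01 + 8 * 0.001 + 9 * 0.0001;
    // fraction formed as one integer divided once
    double divided = 6789 / 10000.0;
    printf("%.20f\n", 12345 + accumulated);
    printf("%.20f\n", 12345 + divided);
}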
The BCD instructions haven't been implemented in hardware since (IIRC) the 286, and are simply microcoded nowadays. They are unlikely to be particularly high-performance.
This implementation I just finished coding runs twice as fast as the built-in atof on my desktop. It converts 1024*1024*39 numeric inputs in 2 seconds, compared to 4 seconds with my system's standard GNU atof (including the setup time, getting memory, and all that).
UPDATE:
Sorry, I have to revoke my twice-as-fast claim. It's faster if the thing you're converting is already in a string, but if you're passing hard-coded string literals, it's about the same as atof. However, I'm going to leave it here, as with some tweaking of the ragel file and state machine, you may be able to generate faster code for specific purposes.
https://github.com/matiu2/yajp
The interesting files for you are:
https://github.com/matiu2/yajp/blob/master/tests/test_number.cpp
https://github.com/matiu2/yajp/blob/master/number.hpp
You may also be interested in the state machine that does the conversion.
It seems to me you want to build (by hand) what amounts to a state machine where each state handles the Nth input digit or exponent digits; this state machine would be shaped like a tree (no loops!). The goal is to do integer arithmetic wherever possible, and (obviously) to remember state variables ("leading minus", "decimal point at position 3") in the states implicitly, to avoid assignments, stores and later fetch/tests of such values. Implement the state machine with plain old "if" statements on the input characters only (so your tree gets to be a set of nested ifs). Inline accesses to buffer characters; you don't want a function call to getchar to slow you down.
Leading zeros can simply be suppressed; you might need a loop here to handle ridiculously long leading-zero sequences. The first nonzero digit can be collected without zeroing an accumulator or multiplying by ten. The first 4-9 nonzero digits (for 16-bit or 32-bit integers) can be collected with integer multiplies by the constant ten (turned by most compilers into a few shifts and adds). [Over the top: zero digits don't require any work until a nonzero digit is found, and then a multiply by 10^N for N sequential zeros is required; you can wire all this into the state machine.]
Digits following the first 4-9 may be collected using 32- or 64-bit multiplies, depending on the word size of your machine. Since you don't care about accuracy, you can simply ignore digits after you've collected 32 or 64 bits' worth; I'd guess you can actually stop when you have some fixed number of nonzero digits, based on what your application actually does with these numbers.
A decimal point found in the digit string simply causes a branch in the state machine tree. That branch knows the implicit location of the point and therefore how to scale by a power of ten later. With effort, you may be able to combine some state machine subtrees if you don't like the size of this code.
[Over the top: keep the integer and fractional parts as separate (small) integers. This will require an additional floating point operation at the end to combine the integer and fraction parts, probably not worth it].
[Over the top: collect pairs of characters into a 16-bit value and look the 16-bit value up in a table. This avoids a multiply in the registers in trade for a memory access; probably not a win on modern machines.]
On encountering "E", collect the exponent as an integer as above; look up accurately precomputed/scaled powers of ten up in a table of precomputed multiplier (reciprocals if "-" sign present in exponent) and multiply the collected mantissa. (don't ever do a float divide). Since each exponent collection routine is in a different branch (leaf) of the tree, it has to adjust for the apparent or actual location of the decimal point by offsetting the power of ten index.
[Over the top: you can avoid the cost of ptr++ if you know the characters of the number are stored linearly in a buffer and do not cross the buffer boundary. In the kth state along a tree branch, you can access the kth character as *(start+k). A good compiler can usually hide the "...+k" in an indexed offset in the addressing mode.]
Done right, this scheme does roughly one cheap multiply-add per nonzero digit, one cast-to-float of the mantissa, and one floating multiply to scale the result by exponent and location of decimal point.
I have not implemented the above. I have implemented versions of it with loops, and they're pretty fast.
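A sketch of just the precomputed power-of-ten piece described a few paragraphs above (table size and names are illustrative; the negative-exponent entries already hold reciprocals, so scaling is always a multiply, never a divide):

#include <cmath>

static double pow10_table[617]; // covers 10^-308 .. 10^308

void init_pow10_table() {
    for (int e = -308; e <= 308; ++e)
        pow10_table[e + 308] = pow(10.0, (double)e); // filled once, at startup
}

double scale_mantissa(double mantissa, int dec_exp) {
    return mantissa * pow10_table[dec_exp + 308]; // one multiply per number
}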
I've implemented something you may find useful.
In comparison with atof it's about 5x faster, and if used with __forceinline, about 10x faster.
Another nice thing is that it seems to have exactly the same arithmetic as the CRT implementation.
Of course it has some cons too:
it supports only single-precision float,
and it doesn't scan any special values like #INF, etc.
#include <cmath> // pow, frexp, floor

__forceinline bool float_scan(const wchar_t* wcs, float* val)
{
    int hdr = 0;
    while (wcs[hdr] == L' ')
        hdr++;

    int cur = hdr;

    bool negative = false;
    bool has_sign = false;

    if (wcs[cur] == L'+' || wcs[cur] == L'-')
    {
        if (wcs[cur] == L'-')
            negative = true;
        has_sign = true;
        cur++;
    }
    else
        has_sign = false;

    int quot_digs = 0;
    int frac_digs = 0;

    bool full = false;

    wchar_t period = 0;
    int binexp = 0;
    int decexp = 0;
    unsigned long value = 0;

    while (wcs[cur] >= L'0' && wcs[cur] <= L'9')
    {
        if (!full)
        {
            if ((value >= 0x19999999 && wcs[cur] - L'0' > 5) || value > 0x19999999)
            {
                full = true;
                decexp++;
            }
            else
                value = value * 10 + wcs[cur] - L'0';
        }
        else
            decexp++;

        quot_digs++;
        cur++;
    }

    if (wcs[cur] == L'.' || wcs[cur] == L',')
    {
        period = wcs[cur];
        cur++;

        while (wcs[cur] >= L'0' && wcs[cur] <= L'9')
        {
            if (!full)
            {
                if ((value >= 0x19999999 && wcs[cur] - L'0' > 5) || value > 0x19999999)
                    full = true;
                else
                {
                    decexp--;
                    value = value * 10 + wcs[cur] - L'0';
                }
            }

            frac_digs++;
            cur++;
        }
    }

    if (!quot_digs && !frac_digs)
        return false;

    wchar_t exp_char = 0;

    int decexp2 = 0; // explicit exponent
    bool exp_negative = false;
    bool has_expsign = false;
    int exp_digs = 0;

    // even if value is 0, we still need to eat exponent chars
    if (wcs[cur] == L'e' || wcs[cur] == L'E')
    {
        exp_char = wcs[cur];
        cur++;

        if (wcs[cur] == L'+' || wcs[cur] == L'-')
        {
            has_expsign = true;
            if (wcs[cur] == L'-')
                exp_negative = true;
            cur++;
        }

        while (wcs[cur] >= L'0' && wcs[cur] <= L'9')
        {
            if (decexp2 >= 0x19999999)
                return false;
            decexp2 = 10 * decexp2 + wcs[cur] - L'0';
            exp_digs++;
            cur++;
        }

        if (exp_negative)
            decexp -= decexp2;
        else
            decexp += decexp2;
    }

    // end of wcs scan, cur contains value's tail

    if (value)
    {
        while (value <= 0x19999999)
        {
            decexp--;
            value = value * 10;
        }

        if (decexp)
        {
            // ensure 1 bit of space for multiplying by something lower than 2.0
            if (value & 0x80000000)
            {
                value >>= 1;
                binexp++;
            }

            if (decexp > 308 || decexp < -307)
                return false;

            // convert the exponent from base 10 to base 2 (using the FPU)
            int E;
            double v = pow(10.0, decexp);
            double m = frexp(v, &E);
            m = 2.0 * m;
            E--;
            value = (unsigned long)floor(value * m);

            binexp += E;
        }

        binexp += 23; // rebase the exponent to the 23 bits of mantissa

        // so the value is: +/- VALUE * pow(2, BINEXP);
        // (normalize the mantissa to 24 bits, update the exponent)
        while (value & 0xFE000000)
        {
            value >>= 1;
            binexp++;
        }
        if (value & 0x01000000)
        {
            if (value & 1)
                value++;
            value >>= 1;
            binexp++;
            if (value & 0x01000000)
            {
                value >>= 1;
                binexp++;
            }
        }

        while (!(value & 0x00800000))
        {
            value <<= 1;
            binexp--;
        }

        if (binexp < -127)
        {
            // underflow
            value = 0;
            binexp = -127;
        }
        else if (binexp > 128)
            return false;

        // exclude the "implicit 1"
        value &= 0x007FFFFF;

        // encode the exponent
        unsigned long exponent = (binexp + 127) << 23;
        value |= exponent;
    }

    // encode the sign
    unsigned long sign = (unsigned long)negative << 31;
    value |= sign;

    if (val)
    {
        *(unsigned long*)val = value;
    }

    return true;
}
I remember we had a WinForms application that performed very slowly while parsing some data interchange files. We all thought it was the DB server thrashing, but our smart boss actually found out that the bottleneck was in the call that converted the parsed strings into decimals!
The simplest approach is to loop over each digit (character) in the string, keeping a running total: multiply the total by 10, then add the value of the next digit. Keep doing this until you reach the end of the string or you encounter a dot. If you encounter a dot, separate the whole-number part from the fractional part, and use a multiplier that divides itself by 10 for each digit. Keep adding the digits up as you go.
Example: 123.456
running total = 0, add 1 (now it's 1)
running total = 1 * 10 = 10, add 2 (now it's 12)
running total = 12 * 10 = 120, add 3 (now it's 123)
encountered a dot, prepare for fractional part
multiplier = 0.1, multiply by 4, get 0.4, add to running total, makes 123.4
multiplier = 0.1 / 10 = 0.01, multiply by 5, get 0.05, add to running total, makes 123.45
multiplier = 0.01 / 10 = 0.001, multiply by 6, get 0.006, add to running total, makes 123.456
Of course, testing the number for correctness, as well as handling negative numbers, will make it more complicated. But if you can "assume" that the input is correct, you can make the code much simpler and faster.
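A direct transcription of that walkthrough into code (naive_atof is an illustrative name; it assumes a well-formed, non-negative "digits[.digits]" input with no sign, exponent, or error checking, as described):

#include <cstdio>

double naive_atof(const char* s) {
    double total = 0.0;
    for (; *s >= '0' && *s <= '9'; ++s)
        total = total * 10 + (*s - '0');      // whole-number part
    if (*s == '.') {
        double multiplier = 0.1;
        for (++s; *s >= '0' && *s <= '9'; ++s) {
            total += (*s - '0') * multiplier; // fractional part
            multiplier /= 10;
        }
    }
    return total;
}

int main() {
    printf("%f\n", naive_atof("123.456")); // prints 123.456000
}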
Have you considered looking into having the GPU do this work? If you can load the strings into GPU memory and have it process them all you may find a good algorithm that will run significantly faster than your processor.
Alternately, do it in an FPGA - There are FPGA PCI-E boards that you can use to make arbitrary coprocessors. Use DMA to point the FPGA at the part of memory containing the array of strings you want to convert and let it whizz through them leaving the converted values behind.
Have you looked at a quad core processor? The real bottleneck in most of these cases is memory access anyway...
-Adam