Parse and convert denorm numbers? - c++

In C++, we can store denorm numbers into variables without problems:
double x = std::numeric_limits<double>::denorm_min();
Then, we can print this variable without problems:
std::cout<<std::setprecision(std::numeric_limits<double>::max_digits10);
std::cout<<std::scientific;
std::cout<<x;
std::cout<<std::endl;
And it will print:
4.94065645841246544e-324
But a problem occurs when one tries to parse this number. Imagine that this number is stored inside a file, and read as a string. The problem is that:
std::string str = "4.94065645841246544e-324";
double x = std::stod(str);
will throw an std::out_of_range exception.
So my question is: how to convert a denorm value stored in a string?

I'm not sure I have understood the problem, but using std::istringstream like this:
std::string str = "4.94065645841246544e-324";
double x;
std::istringstream iss(str);
iss >> x;
std::cout << std::setprecision(std::numeric_limits<double>::max_digits10);
std::cout << std::scientific;
std::cout << x << std::endl;
...gives me:
4.94065645841246544e-324

Apparently, you can use the strtod (or the older atof) interface from cstdlib. I doubt whether this is guaranteed or portable.

I'm not sure if it will make a difference, but you are actually printing:
(std::numeric_limits<double>::max_digits10 + 1) = 18 decimal digits.
With std::scientific, the precision counts only the digits after the decimal point, so round-trip precision for an IEEE-754 64-bit double is one digit before the point and 16 after it (17 significant digits in total). Perhaps the extra digit introduces some ULP-level rounding that interferes with the conversion?
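To make the digit count concrete, here is a minimal sketch. The helper names to_scientific and round_trip_text are hypothetical and not taken from the posts above. It relies only on the fact that, with std::scientific, the precision is the number of digits after the decimal point.

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: format `value` in scientific notation with the given
// number of digits after the decimal point.
std::string to_scientific(double value, int digits_after_point)
{
    std::ostringstream out;
    out << std::scientific << std::setprecision(digits_after_point) << value;
    return out.str();
}

// One digit before the point plus (max_digits10 - 1) after it gives exactly
// max_digits10 significant digits, which is what a round trip needs.
std::string round_trip_text(double value)
{
    return to_scientific(value, std::numeric_limits<double>::max_digits10 - 1);
}
```

Passing max_digits10 itself as the precision, as the question does, yields one more significant digit than a round trip requires.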

The problem with denormals and std::stod is that the latter is defined in terms of std::strtod, which may set errno=ERANGE on underflow (whether it does is implementation-defined; glibc does). As the gcc developers have pointed out, the standard requires std::stod to throw std::out_of_range in that case.
So the proper workaround is to call std::strtod directly and ignore ERANGE when the value it returns is finite and nonzero, like this:
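#include <cerrno>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
double stringToDouble(const char* str, std::size_t* pos=nullptr)
{
    errno=0;
    char* end;
    const auto x=std::strtod(str, &end);
    // strtod signals "nothing parsed" by leaving end equal to str
    if(end==str)
        throw std::invalid_argument("strtod: no conversion");
    // ERANGE with a finite, nonzero result means a denormal: accept it.
    // ERANGE with zero or +-HUGE_VAL is a real underflow or overflow.
    if(errno==ERANGE && !(x!=0 && x>-HUGE_VAL && x<HUGE_VAL))
        throw std::out_of_range("strtod: ERANGE");
    // Report the number of characters consumed, also for denormals
    if(pos)
        *pos=end-str;
    return x;
}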
Note that, unlike std::istringstream approach suggested in another answer, this will work for hexfloats too.
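As a rough illustration of the hexfloat point, here is a small sketch. The helper name parse_with_strtod is hypothetical. It deliberately never inspects errno, so it ignores ERANGE for denormals but also cannot detect overflow; treat it as a demonstration, not a replacement for the function above.

```cpp
#include <cstdlib>
#include <optional>

// Hypothetical helper: parse the whole of `text` with strtod, accepting both
// decimal and hexadecimal floating-point spellings. errno is not inspected,
// so an ERANGE raised for a denormal result is simply ignored.
std::optional<double> parse_with_strtod(const char* text)
{
    char* end = nullptr;
    const double value = std::strtod(text, &end);
    if (end == text || *end != '\0')
        return std::nullopt;   // nothing parsed, or trailing characters
    return value;
}
```

The smallest denormal double can be written as the hexfloat 0x1p-1074, so both that spelling and the decimal string from the question should come back as the same positive value.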

Related

bulletproof use of from_chars()

I have some literal strings which I want to convert to integer and even double. The base is 16, 10, 8, and 2.
At the moment I am puzzled by the behavior of std::from_chars(): I try a conversion, and the error code in the returned from_chars_result reports success even when the input is not fully valid, as shown here:
#include <iostream>
#include <iterator>
#include <string>
#include <string_view>
#include <system_error>
#include <charconv>
using namespace std::literals::string_view_literals;
int main()
{
    auto const buf = "01234567890ABCDEFG.FFp1024"sv;
    double d;
    auto const out = std::from_chars(buf.begin(), buf.end(), d, std::chars_format::hex);
    if(out.ec != std::errc{} || out.ptr != buf.end())
    {
        // Print the input and mark the position where parsing stopped
        std::cerr << buf << '\n'
                  << std::string(std::distance(buf.begin(), out.ptr), ' ') << "^- here\n";
        auto const ec = std::make_error_code(out.ec);
        std::cerr << "err: " << ec.message() << '\n';
        return 1;
    }
    std::cout << d << '\n';
}
gives:
01234567890ABCDEFG.FFp1024
^- here
err: Success
For convenience also at coliru.
In my use case, I'll check the character set before but, I'm not sure about the checks to make it bulletproof. Is this behavior expected (maybe my English isn't sufficient, or I didn't read carefully enough)? I've never seen such checks on iterators on blogs etc.
The other question is related to different base like 2 and 8. Base of 10 and 16 seems to be supported - what would be the way for the other two bases?
Addendum/Edit:
Bulletproof here means that I can have nasty things in the string. The obvious thing for me is that 'G' is not a hex character. But I would have expected an appropriate error code in some way! The comparison out.ptr != buf.end() I've never seen in blogs (or I didn't read the right ones :)
If I enter a crazy long hex float, at least a numerical result out of range comes up.
By bulletproof I also mean that I can find such impossible strings by length, for example, so that I can save myself the call to from_chars() - for float/doubles and integers (here I would 'strlen' compare digits10 from std::numeric_limits).
The from_chars utility is designed to convert the first number it finds in the string and to return a pointer to the point where it stopped. This allows you to parse strings like "42 centimeters" by first converting the number and then parsing the rest of the string yourself for what comes after it.
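The "42 centimeters" case can be sketched roughly as follows. The struct NumberAndRest and the function split_leading_int are hypothetical names introduced only for this illustration; the integer overload of std::from_chars is used to keep the example small.

```cpp
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Hypothetical result type: the parsed number plus whatever follows it.
struct NumberAndRest
{
    int value;
    std::string_view rest;
    bool ok;
};

// Convert the leading integer and hand back the unparsed tail.
NumberAndRest split_leading_int(std::string_view text)
{
    int value = 0;
    const char* first = text.data();
    const char* last = first + text.size();
    const auto result = std::from_chars(first, last, value);
    if (result.ec != std::errc{})
        return {0, text, false};   // no number at the start, or out of range
    // result.ptr points at the first character that was not consumed
    return {value, std::string_view(result.ptr, static_cast<std::size_t>(last - result.ptr)), true};
}
```

Note that from_chars does not skip leading whitespace, so the number has to start at the first character.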
The comparison out.ptr != buf.end() I've never seen in blogs (or I didn't read the right ones :)
If you know that the entire string should be a number, then checking that the pointer in the result points to the end of the string is the normal way to ensure that from_chars read the entire string.
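On the question about bases 2 and 8: the integer overloads of std::from_chars take a base argument from 2 to 36, while the floating-point overloads only take a std::chars_format, so binary and octal apply to integers only. A minimal sketch combining that with the whole-string check (parse_whole_integer is a hypothetical name):

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: succeed only if the entire string is one integer in
// the given base (2..36). Prefixes such as "0x" or "0b" are not recognized
// by from_chars, so they count as trailing junk here.
std::optional<long long> parse_whole_integer(std::string_view text, int base)
{
    long long value = 0;
    const char* first = text.data();
    const char* last = first + text.size();
    const auto [ptr, ec] = std::from_chars(first, last, value, base);
    if (ec != std::errc{} || ptr != last)
        return std::nullopt;   // invalid, out of range, or not fully consumed
    return value;
}
```

With this shape, a stray 'G' in a hex string is rejected by the ptr != last comparison even though ec itself reports success.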

Converting string to double with no manipulation [duplicate]

I am reading in a value with a string, then converting it to a double. I expect an input such as 2.d to fail with std::stod, but it returns 2. Is there a way to ensure that there is no character in the input string at all with std::stod?
Example code:
string exampleS = "2.d";
double exampleD = 0;
try {
    exampleD = stod(exampleS); // this should fail
} catch (exception &e) {
    // failure condition
}
cerr << exampleD << endl;
This code should print 0 but it prints 2. If the character is before the decimal place, stod throws an exception.
Is there a way to make std::stod (and I'm assuming the same behavior also occurs with std::stof) fail on inputs such as these?
You can pass a second argument to std::stod to get the number of characters converted. This can be used to write a wrapper:
double strict_stod(const std::string& s) {
    std::size_t pos;
    const auto result = std::stod(s, &pos);
    if (pos != s.size()) throw std::invalid_argument("trailing characters blah blah");
    return result;
}
This code should print 0 but it prints 2.
No, that is not how std::stod is specified. The function will discard whitespace (which you don't have), then parse your 2. substring (which is a valid decimal floating-point expression) and finally stop at the d character.
If you pass a non-nullptr to the second argument pos, the function will give you the number of characters processed, which maybe you can use to fulfill your requirement (it is not clear to me exactly what you need to fail on).

testing for double in visual c++

I am designing a gui in visual c++ and there is a textbox where user inputs values so a calculation can be performed. How do I validate the input to ensure it can be cast to a double value?
In any C++ environment where you have a std::string field and wish to check if it contains a double, you can simply do something like:
#include <sstream>
std::istringstream iss(string_value);
double double_value;
char trailing_junk;
if (iss >> double_value && !(iss >> trailing_junk))
{
// can use the double...
}
As presented, this will reject things like "1.234q" or "-13 what?" but accept surrounding whitespace e.g. " 3.9E2 ". If you want to reject whitespace, try #include <iomanip> then if (iss >> std::noskipws >> double_value && iss.peek() == EOF) ....
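The whitespace-rejecting variant can be wrapped up as a small function. This is only a sketch: parse_strict is a hypothetical name, and std::optional (C++17) is used here to return the value instead of the original's in-place check.

```cpp
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper: accept the string only if it is exactly one double,
// with no leading or trailing whitespace and no trailing junk.
std::optional<double> parse_strict(const std::string& text)
{
    std::istringstream iss(text);
    double value = 0.0;
    // noskipws makes leading whitespace a parse failure instead of skipping it
    if (!(iss >> std::noskipws >> value))
        return std::nullopt;
    // anything left in the stream means there were trailing characters
    if (iss.peek() != std::istringstream::traits_type::eof())
        return std::nullopt;
    return value;
}
```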
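You could also do this using old-style C APIs. Note that a suppressed %*c would not count towards the return value, so the trailing character has to be stored for the check to work:
double double_value;
char trailing_junk;
if (sscanf(string_value.c_str(), "%lf%c", &double_value, &trailing_junk) == 1)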
You cannot "cast" a string to a double, you can only convert it. The strtod function reports, through its second argument, a pointer to the character within the string where the conversion stopped, so you can decide what to do next. So you can use this function for conversion AND checking.
I'd recommend Boost's lexical_cast, which will throw an exception if the conversion fails.
Since this seems to be a C++ CLI related question and your string from the textbox might be a .NET string, you might want to check the static Double::Parse method. For more portable solutions see the other answers...
As stated already, strtod(3) is the answer.
bool is_double(const char* str) {
    char *end = 0;
    strtod(str, &end);
    // Was anything converted, and is the end point of the double the end of string?
    return end != str && end == str + strlen(str);
}
To address @Ian Goldby's concern, if white space at the end of the string is a concern, then:
bool is_double(const char* str) {
    char *end = 0;
    strtod(str, &end);
    // Was anything converted, and is the end point of the double plus white space the end of string?
    return end != str && end + strspn(end, " \t\n\r") == str + strlen(str);
}
Simply convert it to a double value. If it succeeds, the input is valid.
Really, you shouldn't be writing your own rules for deciding what is valid. You'll never get exactly the same rules as the library function that will do the actual conversion.
My favourite recipe is to use sscanf(), and check the return value to ensure exactly one field was converted. For extra credit, use a %n parameter to check that no non-whitespace characters were left over.
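A minimal sketch of the sscanf() recipe with %n, assuming the caller passes a NUL-terminated string. The name parse_with_sscanf is hypothetical, and std::optional is used only to make the result easy to check.

```cpp
#include <cstdio>
#include <optional>

// Hypothetical helper: convert exactly one field, then use %n to learn how
// many characters were consumed (including any trailing whitespace).
std::optional<double> parse_with_sscanf(const char* text)
{
    double value = 0.0;
    int consumed = 0;
    // %n stores a count and is not included in sscanf's return value
    if (std::sscanf(text, "%lf %n", &value, &consumed) != 1)
        return std::nullopt;
    // anything other than the terminator here is left-over non-whitespace
    if (text[consumed] != '\0')
        return std::nullopt;
    return value;
}
```

Because %lf skips leading whitespace and the space before %n skips trailing whitespace, this version accepts surrounding blanks but rejects input such as "1.234q".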

Precise floating-point<->string conversion

I am looking for a library function to convert floating point numbers to strings, and back again, in C++. The properties I want are that str2num(num2str(x)) == x and that num2str(str2num(x)) == x (as far as possible). The general property is that num2str should represent the simplest rational number that, when rounded to the nearest representable floating-point number, gives you back the original number.
So far I've tried boost::lexical_cast:
double d = 1.34;
string_t s = boost::lexical_cast<string_t>(d);
printf("%s\n", s.c_str());
// outputs 1.3400000000000001
And I've tried std::ostringstream, which seems to work for most values if I do stream.precision(16). However, at precision 15 or 17 it either truncates or gives ugly output for things like 1.34. I don't think that precision 16 is guaranteed to have any particular properties I require, and suspect it breaks down for many numbers.
Is there a C++ library that has such a conversion? Or is such a conversion function already buried somewhere in the standard libraries/boost.
The reason for wanting these functions is to save floating point values to CSV files, and then read them correctly. In addition, I'd like the CSV files to contain simple numbers as far as possible so they can be consumed by humans.
I know that the Haskell read/show functions already have the properties I am after, as do the BSD C libraries. The standard references for string<->double conversions are a pair of papers from PLDI 1990:
How to Read Floating Point Numbers Accurately, William D. Clinger
How to Print Floating-Point Numbers Accurately, Guy L. Steele Jr. and Jon L. White
Any C++ library/function based on these would be suitable.
EDIT: I am fully aware that floating point numbers are inexact representations of decimal numbers, and that 1.34==1.3400000000000001. However, as the papers referenced above point out, that's no excuse for choosing to display as "1.3400000000000001"
EDIT2: This paper explains exactly what I'm looking for: http://drj11.wordpress.com/2007/07/03/python-poor-printing-of-floating-point/
I am still unable to find a library that supplies the necessary code, but I did find some code that does work:
http://svn.python.org/view/python/branches/py3k/Python/dtoa.c?view=markup
By supplying a fairly small number of defines it's easy to abstract away the Python integration. This code does indeed meet all the properties I outline.
I think this does what you want, in combination with the standard library's strtod():
#include <stdio.h>
#include <stdlib.h>
int dtostr(char* buf, size_t size, double n)
{
    int prec = 15;
    while(1)
    {
        int ret = snprintf(buf, size, "%.*g", prec, n);
        if(prec++ == 18 || n == strtod(buf, 0)) return ret;
    }
}
A simple demo, which doesn't bother to check input words for trailing garbage:
int main(int argc, char** argv)
{
    int i;
    for(i = 1; i < argc; i++)
    {
        char buf[32];
        dtostr(buf, sizeof(buf), strtod(argv[i], 0));
        printf("%s\n", buf);
    }
    return 0;
}
Some example inputs:
% ./a.out 0.1 1234567890.1234567890 17 1e99 1.34 0.000001 0 -0 +INF NaN
0.1
1234567890.1234567
17
1e+99
1.34
1e-06
0
-0
inf
nan
I imagine your C library needs to conform to some sufficiently recent version of the standard in order to guarantee correct rounding.
I'm not sure I chose the ideal bounds on prec, but I imagine they must be close. Maybe they could be tighter? Similarly I think 32 characters for buf are always sufficient but never necessary. Obviously this all assumes 64-bit IEEE doubles. Might be worth checking that assumption with some kind of clever preprocessor directive -- sizeof(double) == 8 would be a good start.
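Since sizeof cannot be evaluated in a preprocessor #if, a compile-time check in C++ is more naturally written with static_assert. The following is one possible sketch; the function name double_is_binary64 is hypothetical.

```cpp
#include <climits>
#include <limits>

// Hypothetical compile-time check that double is a 64-bit IEEE-754 type
// with a 53-bit significand (52 stored bits plus the implicit one).
constexpr bool double_is_binary64()
{
    return std::numeric_limits<double>::is_iec559
        && sizeof(double) * CHAR_BIT == 64
        && std::numeric_limits<double>::digits == 53;
}

// Refuse to compile on platforms where the assumption does not hold
static_assert(double_is_binary64(), "dtostr assumes 64-bit IEEE-754 doubles");
```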
The exponent is a bit messy, but it wouldn't be difficult to fix after breaking out of the loop but before returning, perhaps using memmove() or suchlike to shift things leftwards. I'm pretty sure there's guaranteed to be at most one + and at most one leading 0, and I don't think they can even both occur at the same time for prec >= 10 or so.
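The exponent cleanup could look roughly like this. Note that this sketch swaps the suggested memmove() on a char buffer for std::string::erase, and tidy_exponent is a hypothetical name; it assumes decimal %g-style output such as "1e+99" or "1e-06".

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: drop a '+' and any leading zeros from the exponent,
// keeping a '-' sign and at least one exponent digit.
std::string tidy_exponent(std::string text)
{
    const std::size_t e = text.find('e');
    if (e == std::string::npos)
        return text;                       // no exponent (also "inf", "nan")
    std::size_t i = e + 1;
    if (i < text.size() && text[i] == '+')
        text.erase(i, 1);                  // "1e+99" -> "1e99"
    else if (i < text.size() && text[i] == '-')
        ++i;                               // keep the minus sign
    while (i + 1 < text.size() && text[i] == '0')
        text.erase(i, 1);                  // "1e-06" -> "1e-6"
    return text;
}
```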
Likewise if you'd rather ignore signed zero, as Javascript does, you can easily handle it up front, e.g.:
if(n == 0) return snprintf(buf, size, "0");
I'd be curious to see a detailed comparison with that 3000-line monstrosity you dug up in the Python codebase. Presumably the short version is slower, or less correct, or something? It would be disappointing if it were neither....
The reason for wanting these functions is to save floating point values to CSV files, and then read them correctly. In addition, I'd like the CSV files to contain simple numbers as far as possible so they can be consumed by humans.
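You cannot have an exact double → string → double conversion and at the same time a string that is human readable.
You need to choose between an exact conversion and a human readable string. This is the difference between max_digits10 and digits10:
digits10 is the number of decimal digits that survive a string → double → string conversion unchanged (15 for an IEEE-754 double).
max_digits10 is the number of decimal digits needed so that a double → string → double conversion gives back the same value (17 for an IEEE-754 double).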
Here is an implementation of num2str and str2num with two different contexts from_double (conversion double → string → double) and from_string (conversion string → double → string):
#include <iostream>
#include <limits>
#include <iomanip>
#include <sstream>
#include <string>
namespace from_double
{
    // Enough digits to get the same double back
    std::string num2str(double d)
    {
        std::stringstream ss;
        ss << std::setprecision(std::numeric_limits<double>::max_digits10) << d;
        return ss.str();
    }
    double str2num(const std::string& s)
    {
        double d;
        std::stringstream ss(s);
        ss >> std::setprecision(std::numeric_limits<double>::max_digits10) >> d;
        return d;
    }
}
namespace from_string
{
    // Few enough digits to get the same string back
    std::string num2str(double d)
    {
        std::stringstream ss;
        ss << std::setprecision(std::numeric_limits<double>::digits10) << d;
        return ss.str();
    }
    double str2num(const std::string& s)
    {
        double d;
        std::stringstream ss(s);
        ss >> std::setprecision(std::numeric_limits<double>::digits10) >> d;
        return d;
    }
}
int main()
{
    double d = 1.34;
    if (from_double::str2num(from_double::num2str(d)) == d)
        std::cout << "Good for double -> string -> double" << std::endl;
    else
        std::cout << "Bad for double -> string -> double" << std::endl;
    std::string s = "1.34";
    if (from_string::num2str(from_string::str2num(s)) == s)
        std::cout << "Good for string -> double -> string" << std::endl;
    else
        std::cout << "Bad for string -> double -> string" << std::endl;
    return 0;
}
Actually I think you'll find that 1.34 IS 1.3400000000000001. Floating point numbers are not precise. You can't get around this. 1.34f is 1.34000003337860107 for example.
As stated by others, floating-point numbers are not that accurate; it's an artifact of how they store the value.
What you are really looking for is a Decimal number representation.
Basically this uses an integer to store the number and has a specific accuracy after the decimal point.
A quick Google got this:
http://www.codeproject.com/KB/mcpp/decimalclass.aspx
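To illustrate the idea of an integer-backed decimal (this is not the linked class, just a hypothetical sketch with two fixed decimal places), the value can be stored as a count of hundredths and formatted without any floating-point arithmetic:

```cpp
#include <cstdint>
#include <string>

// Hypothetical fixed-point type: the stored integer is the value times 100,
// so 1.34 is held exactly as 134.
struct Hundredths
{
    std::int64_t units;
};

// Format with exactly two digits after the decimal point.
std::string format_hundredths(Hundredths amount)
{
    const bool negative = amount.units < 0;
    // Work on the magnitude in unsigned arithmetic to avoid overflow on negation
    const std::uint64_t magnitude = negative
        ? 0 - static_cast<std::uint64_t>(amount.units)
        : static_cast<std::uint64_t>(amount.units);
    std::string text = negative ? "-" : "";
    text += std::to_string(magnitude / 100);
    text += '.';
    if (magnitude % 100 < 10)
        text += '0';                       // zero-pad the fractional part
    text += std::to_string(magnitude % 100);
    return text;
}
```

The trade-off is a fixed scale: values that need more than two decimal places cannot be represented by this particular sketch.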