Jsoncpp writing float values incorrectly - c++

I am reading from a JSON file using jsoncpp. When I write back to the file, my float values are slightly off. For the sake of testing, I decided to parse the file to a Json::Value and then write that value back to the file. I would expect it to look the same, but instead the float values are different.
Example:
"Parameters":
{
"MinXValue": 0.1,
"MaxXValue": 0.15,
"MinYValue": 0.25,
"MaxYValue": 1.1,
"MinObjectSizeValue": 1
}
writes as:
"Parameters":
{
"MinXValue": 0.10000000000000001,
"MaxXValue": 0.14999999999999999,
"MinYValue": 0.25,
"MaxYValue": 1.1000000238418579,
"MinObjectSizeValue": 1
}
You may notice that 0.25 did not change, even though all of the other floats did. Any idea what's going on here?

It is actually an issue of the floating-point parsing/printing implementation. Binary floating point can represent only some decimal numbers exactly (0.25 is one of them, because its denominator is a power of two); for all others, a parser must map the decimal string to the nearest binary representation. When printing, the converse is needed: producing a decimal string (preferably the shortest one) that converts back to exactly the same binary representation.
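A quick way to see this (a minimal sketch, assuming IEEE-754 doubles) is to print a few constants with 17 significant digits, which is enough to uniquely identify any double:

#include <cstdio>

int main() {
    // 0.25 = 1/4 has a power-of-two denominator, so it is stored exactly.
    std::printf("%.17g\n", 0.25); // 0.25
    // 0.1 and 0.15 do not, so their nearest doubles differ slightly.
    std::printf("%.17g\n", 0.1);  // 0.10000000000000001
    std::printf("%.17g\n", 0.15); // 0.14999999999999999
}

These are exactly the strings in the question's output, which suggests jsoncpp was printing with 17 significant digits.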
I admit that I have not investigated JsonCpp to see if there is a solution for this. But as I am the author of RapidJSON, I tried to see how RapidJSON performs here:
#include "rapidjson/document.h"
#include "rapidjson/prettywriter.h"
#include "rapidjson/stringbuffer.h"
#include <iostream>

const char json[] =
    "{"
    "\"MinXValue\": 0.1,"
    "\"MaxXValue\": 0.15,"
    "\"MinYValue\": 0.25,"
    "\"MaxYValue\": 1.1,"
    "\"MinObjectSizeValue\": 1"
    "}";

using namespace rapidjson;
Document d;
d.Parse(json);

StringBuffer sb;
PrettyWriter<StringBuffer> writer(sb);
d.Accept(writer);
std::cout << sb.GetString();
And the result:
{
    "MinXValue": 0.1,
    "MaxXValue": 0.15,
    "MinYValue": 0.25,
    "MaxYValue": 1.1,
    "MinObjectSizeValue": 1
}
RapidJSON implements both the parsing and printing algorithms internally. Normal-precision parsing has a maximum error of 3 ULP, but with the full-precision parsing flag (kParseFullPrecisionFlag) it always parses to the nearest representation. The printing side implements the Grisu2 algorithm: it always generates an exact result, and more than 99% of the time the output is the shortest (optimal) one.
Actually, using strtod() and sprintf(..., "%.17g", ...) can solve this problem too, but both are much slower in current C/C++ standard libraries; for example, I have done a benchmark of printing double. So in RapidJSON we implemented our own optimized, header-only solutions.
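For reference, a minimal sketch of that strtod()/"%.17g" round trip (17 being max_digits10 for IEEE doubles):

#include <cassert>
#include <cstdio>
#include <cstdlib>

int main() {
    double original = 0.1;
    char buf[32];
    // 17 significant digits always suffice to round-trip an IEEE double...
    std::snprintf(buf, sizeof buf, "%.17g", original);
    double restored = std::strtod(buf, nullptr);
    assert(restored == original); // exact round trip
    // ...but the string is the ugly "0.10000000000000001", not "0.1".
}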

This feature is already supported, for those who are still looking into this problem: https://github.com/open-source-parsers/jsoncpp/commit/772f634548f0cec058a0f16a2e641d9f7b78343d
#include <fstream>
#include <memory>
#include <json/json.h>

std::ofstream ofs("example.json");
Json::Value root;
// ... Build your json object....
Json::StreamWriterBuilder wbuilder;
wbuilder["indentation"] = "";
// "precision" controls how many significant digits are written for doubles.
wbuilder.settings_["precision"] = 6;
std::unique_ptr<Json::StreamWriter> writer(wbuilder.newStreamWriter());
// Write to file.
writer->write(root, &ofs);
ofs.close();
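For completeness, reading the file back can be done with the matching CharReaderBuilder; a sketch (check the API of your jsoncpp version):

#include <fstream>
#include <iostream>
#include <json/json.h>

int main() {
    std::ifstream ifs("example.json");
    Json::CharReaderBuilder rbuilder;
    Json::Value root;
    std::string errs;
    // parseFromStream returns false and fills errs on malformed input.
    if (!Json::parseFromStream(rbuilder, ifs, &root, &errs)) {
        std::cerr << errs << '\n';
        return 1;
    }
    std::cout << root["Parameters"]["MinXValue"].asDouble() << '\n';
}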

One solution is to make a small change to the jsoncpp source file.
Replace the 17 with a 15 on the following line such that it reads (line 4135 in my copy):
std::string valueToString(double value) { return valueToString(value, false, 15); }
Basically it reduces the maximum number of printed digits from 17 to 15, but if you're OK with that it seems to fix all the undesirable printing artefacts you mention. I think one could also argue you shouldn't be using JSON to pass around more than 15 significant digits anyway (that is near the limit of double precision), but that's another story...
E.g. what used to print for me as:
"yo" : 1726.6969999999999,
now prints as:
"yo" : 1726.697,

Related

Parsing a float number from JSON using nlohmann::json

I want to read a decimal number from a JSON file into a float variable using the nlohmann::json library, and I always get the output 86700.2031 instead of 86700.2. First I used a numeric JSON type:
{
    "test": 86700.2
}
and tried this way:
#include <fstream>
#include <nlohmann/json.hpp>

using json = nlohmann::json;
std::string config_path = "C:/config/default.json";
std::ifstream config_file(config_path);
json config = json::parse(config_file);
float test = config["test"]; // Output: 86700.2031
After that I changed the json type to string and tried this with the same outcome:
float test = std::stof(config["test"].get<std::string>()); //Output: 86700.2031
Reading integers or strings works fine. How do I read a float value properly?
"float" has very low precision. Never use float but use double instead. (One day you will know where you can go against that advice. Before that, use double). The result will not be exactly 86700.2 but very close to it.
Weird though that your library returns a string when the value clearly isn't a string. I'd check the documentation for what function you should call. You should never have to convert data yourself like that.
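In code, the fix is simply to read the value as double; a minimal sketch against the asker's file (86700.203125 is exactly the nearest float to 86700.2, which is where the 86700.2031 output comes from):

#include <fstream>
#include <iomanip>
#include <iostream>
#include <nlohmann/json.hpp>

using json = nlohmann::json;

int main() {
    std::ifstream config_file("C:/config/default.json");
    json config = json::parse(config_file);
    // double keeps ~15-17 significant digits; float only ~7.
    double test = config["test"].get<double>();
    std::cout << std::setprecision(10) << test << '\n'; // 86700.2
}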

Issues saving double as binary in c++

In my simulation code for a particle system, I have a class for particles, and each particle has a pos property holding its position, declared as double pos[3]; since there are 3 coordinate components per particle. The particle objects are created with particles = new Particle[npart]; (as we have npart particles), so e.g. the y-component of the 2nd particle is accessed with double dummycomp = particles[1].pos[1];
To save the particles to a file, before using binary I would use this (saving as text, with a floating-point precision of 10 and one particle per line):
#include <iostream>
#include <fstream>

ofstream outfile("testConfig.txt", ios::out);
outfile.precision(10);
for (int i = 0; i < npart; i++) {
    outfile << particles[i].pos[0] << " " << particles[i].pos[1] << " " << particles[i].pos[2] << endl;
}
outfile.close();
But now, to save space, I am trying to save the configuration as a binary file, and my attempt, inspired by what I found here, has been as follows:
ofstream outfile("test.bin", ios::binary | ios::out);
for (int i=0; i<npart; i++){
outfile.write(reinterpret_cast<const char*>(particle[i].pos),streamsize(3*sizeof(double)));
}
outfile.close();
but I am facing a segmentation fault when trying to run it. My questions are:
Am I doing something wrong with reinterpret_cast or rather in the argument of streamsize()?
Ideally, it would be great if the saved binary format could also be read within Python, is my approach (once fixed) allowing for that?
working example for the old saving approach (non-binary):
#include <iostream>
#include <fstream>
using namespace std;

class Particle {
public:
    double pos[3];
};

int main() {
    const int npart = 2;
    Particle particles[npart];
    // initializing the positions:
    particles[0].pos[0] = -74.04119568;
    particles[0].pos[1] = -44.33692582;
    particles[0].pos[2] = 17.36278231;
    particles[1].pos[0] = 48.16310086;
    particles[1].pos[1] = -65.02325252;
    particles[1].pos[2] = -37.2053818;

    ofstream outfile("testConfig.txt", ios::out);
    outfile.precision(10);
    for (int i = 0; i < npart; i++) {
        outfile << particles[i].pos[0] << " " << particles[i].pos[1] << " " << particles[i].pos[2] << endl;
    }
    outfile.close();
    return 0;
}
And in order to save the particle positions as binary, substitute the saving portion of the above sample with
ofstream outfile("test.bin", ios::binary | ios::out);
for (int i=0; i<npart; i++){
outfile.write(reinterpret_cast<const char*>(particles[i].pos),streamsize(3*sizeof(double)));
}
outfile.close();
2nd addendum: reading the binary in Python
I managed to read the saved binary in python as follows using numpy:
data = np.fromfile('test.bin', dtype=np.float64)
data
array([-74.04119568, -44.33692582, 17.36278231, 48.16310086,
-65.02325252, -37.2053818 ])
But given the doubts cast in the comments regarding the non-portability of the binary format, I am not confident this way of reading it in Python will always work! It would be really neat if someone could elucidate the reliability of such an approach.
The trouble is that a base-10 ASCII representation of a double is lossy unless you keep enough digits (and with only 10 digits you are not keeping enough). To guarantee that the text converts back to the same double you need all std::numeric_limits<double>::max_digits10 digits, plus correctly rounded conversions in both directions.
The other issue you have is that the binary representation of a double is not standardized, so using it directly is very fragile and can lead to code breaking very easily. Simply changing the compiler or compiler settings can result in a different double format, and when changing architectures you have absolutely no guarantees.
You can serialize it to text in a non-lossy representation by using the hex format for doubles.
stream << std::fixed << std::scientific << particles[i].pos[0];
// In C++11 this was simplified to:
stream << std::hexfloat << particles[i].pos[0];
This has the effect of printing the value the same way as "%a" in printf() in C, which prints "hexadecimal floating point, lowercase": the mantissa is written in hex and the exponent as a decimal power of two, in a very specific format. Since the underlying representation is binary, these values can be represented exactly in hex, providing a non-lossy way of transferring data between systems. It also omits leading and trailing zeros, so for a lot of numbers it is relatively compact.
On the python side. This format is also supported. You should be able to read the value as a string then convert it to a float using float.fromhex()
see: https://docs.python.org/3/library/stdtypes.html#float.fromhex
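A minimal round-trip sketch (C++11, IEEE doubles assumed); the very same string can be consumed in Python by float.fromhex():

#include <cstdlib>
#include <iostream>
#include <sstream>
#include <string>

int main() {
    double original = -74.04119568;
    std::ostringstream out;
    out << std::hexfloat << original;   // same format as printf("%a", ...)
    std::string text = out.str();       // something like "-0x1.28...p+6"
    // strtod understands hex-float strings (C99/C++11).
    double restored = std::strtod(text.c_str(), nullptr);
    std::cout << text << '\n'
              << (restored == original ? "exact" : "lossy") << '\n'; // exact
}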
But your goal is to save space:
But now, to save space, I am trying to save the configuration as a binary file.
I would ask the question: do you really need to save space? Are you running in a low-powered, low-resource environment? If so, then space saving can definitely be a thing (such environments are rare nowadays, but they do exist).
But it seems like you are running some form of particle simulation, which does not scream low-resource use case. Even if you have terabytes of data I would still go with a portable, easy-to-read format over binary, preferably one that is not lossy. Storage space is cheap.
I suggest using a library instead of writing a serialization/deserialization routine from scratch. I find cereal really easy to use, maybe even easier than boost::serialization. It reduces the opportunity for bugs in your own code.
In your case I'd go about serializing doubles like this using cereal:
#include <cereal/archives/binary.hpp>
#include <fstream>

int main() {
    std::ofstream outfile("test.bin", std::ios::binary);
    cereal::BinaryOutputArchive out(outfile);
    double x, y, z;
    x = y = z = 42.0;
    out(x, y, z);
}
To deserialize them you'd use:
#include <cereal/archives/binary.hpp>
#include <fstream>

int main() {
    std::ifstream infile("test.bin", std::ios::binary);
    cereal::BinaryInputArchive in(infile);
    double x, y, z;
    in(x, y, z);
}
You can also serialize/deserialize whole std::vector<double>s in the same fashion. Just add #include <cereal/types/vector.hpp> and use in / out like in the given example on a single std::vector<double> instead of multiple doubles; a sketch follows.
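A minimal sketch of the vector variant (assuming cereal is on the include path):

#include <cereal/archives/binary.hpp>
#include <cereal/types/vector.hpp>
#include <fstream>
#include <vector>

int main() {
    std::vector<double> positions = { -74.04119568, -44.33692582, 17.36278231 };
    {
        std::ofstream outfile("test.bin", std::ios::binary);
        cereal::BinaryOutputArchive out(outfile);
        out(positions); // stores the whole vector, including its size
    }
    std::vector<double> restored;
    std::ifstream infile("test.bin", std::ios::binary);
    cereal::BinaryInputArchive in(infile);
    in(restored); // restored now equals positions
}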
Ain't that swell.
Edit
In a comment you asked whether it'd be possible to read a binary file created like that with Python.
Answer:
Serialized binary files aren't really meant to be very portable (things like endianness could play a role here). You could easily adapt the example code I gave you to write a JSON file (another advantage of using a library) and read that format in Python.
Oh and cereal::JSONOutputArchive has an option for setting precision.
Just curious if you ever investigated the idea of converting your data to vectored coordinates instead of Cartesian X,Y,Z? It would seem that this would potentially reduce the size of your data by about 30%: Two coordinates instead of three, but perhaps needing slightly higher precision in order to convert back to your X,Y,Z.
The vectored coordinates could still be further optimized by using the various compression techniques above (text compression or binary conversion).

removing trailing zeroes for a float value c++

I am trying to set up a NodeMCU module to collect data from a temperature sensor and send it to my MQTT broker using the PubSubClient MQTT library, but that is not the problem.
I am trying to send the temperature in a format that has only one decimal, and at this point I've successfully made it round up or down, but the format is not right. As of now it rounds the temp to 24.50, 27.80, 23.10 etc. I want to remove the trailing zeroes, so it becomes 24.5, 27.8, 23.1 etc.
I have this code set up so far:
#include <math.h>
#include <PubSubClient.h>
#include <ESP8266WiFi.h>

float temp = 0;

void loop() {
    float newTemp = sensors.getTempCByIndex(0);
    temp = roundf(newTemp * 10) / 10;
    Serial.println(String(temp).c_str());
    client.publish("/test/temperature", String(temp).c_str(), true);
}
I'm fairly new to c++, so any help would be appreciated.
It's unclear what your API is. Seems like you want to pass in the C string. In that case just use sprintf:
#include <stdio.h>
float temp = sensors.getTempCByIndex(0);
char s[30];
sprintf(s, "%.1f", temp);
client.publish("/test/temperature", s, true);
Regardless of what you do to them, floating-point values always have the same precision. To control the number of digits in a text string, change the way you convert the value to text. In normal C++ (i.e., where there is no String type <g>), you do that with a stream:
std::ostringstream out;
out << std::fixed << std::setprecision(3) << value;
std::string text = out.str();
In the environment you're using, you'll have to either use standard streams or figure out what that environment provides for controlling floating-point to text conversions.
The library you are using is not part of standard C++. The String you are using is non-standard.
As Pete Becker noted in his answer, you won't be able to control the trailing zeros by changing the value of temp. You need to either control the precision when converting it to String, or do the conversion and then tweak the resultant string.
If you read the documentation for the String type you are using, there may be options to do one or both of:
control the precision when writing a float to a String; or
examine the characters in a String and manually remove trailing zeros.
Or you could use a std::ostringstream to produce the value in a std::string and work with that instead; a sketch follows.
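A minimal sketch of that last suggestion in standard C++ (assuming std::string is available on your platform): format with fixed precision, then strip trailing zeros.

#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

std::string trim_zeros(float value, int max_decimals) {
    std::ostringstream out;
    out << std::fixed << std::setprecision(max_decimals) << value;
    std::string text = out.str();
    text.erase(text.find_last_not_of('0') + 1); // "24.50" -> "24.5"
    if (!text.empty() && text.back() == '.')    // "24.00" -> "24." -> "24"
        text.pop_back();
    return text;
}

int main() {
    std::cout << trim_zeros(24.50f, 2) << '\n'; // 24.5
    std::cout << trim_zeros(27.80f, 2) << '\n'; // 27.8
}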

How to work with large numbers when writing and reading a file?

I have written code to copy my data from one input file to another output file. To read all lines of my input file I used
while (!inputfile.eof())
but in my output file, the last line is missing. So I would like to know: how do I prevent this error?
My second question is: for writing data into the file, I used
Outputfile.write((char*)&a,sizeof(double));
Outputfile.write((char*)&b,sizeof(double));
here a = 289814.150 and b = 4320978.613, but in the output file they show up like
289814 4.32098e+006
(the value of a is rounded and b is shown in exponential notation), so what is the reason for this and how do I fix it?
I tried using cout.setf(ios::fixed);, but even if this works for data written to the screen, I don't know how to apply it to the double data written into my file.
I want to write real values with 3 decimals only in my output file. Can anyone help? Thanks.
Okay, based on comments, the intent here has (at least I hope) become reasonably clear: to convert pairs of numbers in text format to binary format, and be able to verify that the converted numbers accurately represent the originals.
There are a number of ways to do that, but the first thing to keep in mind is that no matter what else you do, converting floating point numbers to/from text (decimal) format can and normally will lead to some degree of inaccuracy. The problem is fairly simple: floating point is (normally) done in binary. This means it can only represent fractions in which the denominator is a power of 2 (or a sum of powers of 2). Decimal, obviously enough, uses base 10, so its fractions can have denominators composed of products of powers of 2 and powers of 5. Any of those that involves a power of 5 (e.g., 0.2 = 1/5) can only be approximated in binary -- pretty much like trying to represent 1/3rd in decimal.
This means your only reasonable choice is to allow some discrepancy between the decimal and binary versions. The best you can hope for is to keep the errors to a minimum. To test for that, what you probably need/want to do is convert the binary floating point back to decimal in the original format, and check whether it's close to the original (e.g., ignore errors in the final digit, at least errors of +/- 1).
The conversion itself should be pretty trivial:
#include <fstream>

int main(int argc, char **argv) {
    // checking argc omitted for clarity.
    std::ifstream infile(argv[1]);
    std::ofstream outfile(argv[2], std::ios::binary);
    double a, b;
    while (infile >> a && infile >> b) {
        outfile.write((char const *)&a, sizeof(a));
        outfile.write((char const *)&b, sizeof(b));
    }
    return 0;
}
Verifying the data isn't nearly so easy. One possibility would be something like this (starting from the two files, one binary and one text):
#include <iostream>
#include <fstream>
#include <sstream>
#include <iomanip>
int main(int argc, char **argv) {
    std::string text;
    std::ifstream text_file(argv[1]);
    std::ifstream bin_file(argv[2], std::ios::binary);
    double bin_value;
    while (text_file >> text) {
        bin_file.read((char *)&bin_value, sizeof(bin_value));
        // A fresh stream each iteration; the manipulators will probably
        // need tweaking to match the original format.
        std::ostringstream converter;
        converter << std::fixed << std::setprecision(3) << bin_value;
        if (converter.str() == text)
            ; // they're identical
        else if (converter.str().substr(0, 3) == text.substr(0, 3))
            ; // the first three characters are equal
        else
            ; // bigger error
    }
    return 0;
}
That's much more likely to need some tweaking to work the way you want, but the general idea should be in the ballpark as long as you're sure the original numbers are all formatted consistently.

Precise floating-point<->string conversion

I am looking for a library function to convert floating point numbers to strings and back again in C++. The properties I want are that str2num(num2str(x)) == x and that num2str(str2num(x)) == x (as far as possible). The general property is that num2str should represent the simplest rational number that, when rounded to the nearest representable floating-point number, gives you back the original number.
So far I've tried boost::lexical_cast:
double d = 1.34;
string_t s = boost::lexical_cast<string_t>(d);
printf("%s\n", s.c_str());
// outputs 1.3400000000000001
And I've tried std::ostringstream, which seems to work for most values if I set stream.precision(16). However, at precision 15 or 17 it either truncates or gives ugly output for things like 1.34. I don't think precision 16 is guaranteed to have any particular properties I require, and I suspect it breaks down for many numbers.
Is there a C++ library that has such a conversion? Or is such a conversion function already buried somewhere in the standard libraries/boost.
The reason for wanting these functions is to save floating point values to CSV files, and then read them correctly. In addition, I'd like the CSV files to contain simple numbers as far as possible so they can be consumed by humans.
I know that the Haskell read/show functions already have the properties I am after, as do the BSD C libraries. The standard references for string<->double conversions are a pair of papers from PLDI 1990:
How to Read Floating Point Numbers Accurately, William Clinger
How to Print Floating-Point Numbers Accurately, Guy Steele et al
Any C++ library/function based on these would be suitable.
EDIT: I am fully aware that floating point numbers are inexact representations of decimal numbers, and that 1.34==1.3400000000000001. However, as the papers referenced above point out, that's no excuse for choosing to display as "1.3400000000000001"
EDIT2: This paper explains exactly what I'm looking for: http://drj11.wordpress.com/2007/07/03/python-poor-printing-of-floating-point/
I am still unable to find a library that supplies the necessary code, but I did find some code that does work:
http://svn.python.org/view/python/branches/py3k/Python/dtoa.c?view=markup
By supplying a fairly small number of defines it's easy to abstract away the Python integration. This code does indeed meet all the properties I outline.
I think this does what you want, in combination with the standard library's strtod():
#include <stdio.h>
#include <stdlib.h>

int dtostr(char* buf, size_t size, double n)
{
    int prec = 15;
    while(1)
    {
        int ret = snprintf(buf, size, "%.*g", prec, n);
        if(prec++ == 18 || n == strtod(buf, 0)) return ret;
    }
}
A simple demo, which doesn't bother to check input words for trailing garbage:
int main(int argc, char** argv)
{
    int i;
    for(i = 1; i < argc; i++)
    {
        char buf[32];
        dtostr(buf, sizeof(buf), strtod(argv[i], 0));
        printf("%s\n", buf);
    }
    return 0;
}
Some example inputs:
% ./a.out 0.1 1234567890.1234567890 17 1e99 1.34 0.000001 0 -0 +INF NaN
0.1
1234567890.1234567
17
1e+99
1.34
1e-06
0
-0
inf
nan
I imagine your C library needs to conform to some sufficiently recent version of the standard in order to guarantee correct rounding.
I'm not sure I chose the ideal bounds on prec, but I imagine they must be close. Maybe they could be tighter? Similarly I think 32 characters for buf are always sufficient but never necessary. Obviously this all assumes 64-bit IEEE doubles. Might be worth checking that assumption with some kind of clever preprocessor directive -- sizeof(double) == 8 would be a good start.
The exponent is a bit messy, but it wouldn't be difficult to fix after breaking out of the loop but before returning, perhaps using memmove() or suchlike to shift things leftwards. I'm pretty sure there's guaranteed to be at most one + and at most one leading 0, and I don't think they can even both occur at the same time for prec >= 10 or so.
Likewise if you'd rather ignore signed zero, as Javascript does, you can easily handle it up front, e.g.:
if(n == 0) return snprintf(buf, size, "0");
I'd be curious to see a detailed comparison with that 3000-line monstrosity you dug up in the Python codebase. Presumably the short version is slower, or less correct, or something? It would be disappointing if it were neither....
The reason for wanting these functions is to save floating point values to CSV files, and then read them correctly. In addition, I'd like the CSV files to contain simple numbers as far as possible so they can be consumed by humans.
You cannot have a double → string → double conversion and at the same time have the string be human readable.
You need to choose between an exact conversion and a human-readable string. This is the difference between max_digits10 and digits10:
the difference between them is explained on Stack Overflow; see the documentation for digits10 and max_digits10.
Here is an implementation of num2str and str2num with two different contexts: from_double (conversion double → string → double) and from_string (conversion string → double → string):
#include <iostream>
#include <limits>
#include <iomanip>
#include <sstream>

namespace from_double
{
    std::string num2str(double d)
    {
        std::stringstream ss;
        ss << std::setprecision(std::numeric_limits<double>::max_digits10) << d;
        return ss.str();
    }

    double str2num(const std::string& s)
    {
        double d;
        std::stringstream ss(s);
        ss >> std::setprecision(std::numeric_limits<double>::max_digits10) >> d;
        return d;
    }
}

namespace from_string
{
    std::string num2str(double d)
    {
        std::stringstream ss;
        ss << std::setprecision(std::numeric_limits<double>::digits10) << d;
        return ss.str();
    }

    double str2num(const std::string& s)
    {
        double d;
        std::stringstream ss(s);
        ss >> std::setprecision(std::numeric_limits<double>::digits10) >> d;
        return d;
    }
}

int main()
{
    double d = 1.34;
    if (from_double::str2num(from_double::num2str(d)) == d)
        std::cout << "Good for double -> string -> double" << std::endl;
    else
        std::cout << "Bad for double -> string -> double" << std::endl;

    std::string s = "1.34";
    if (from_string::num2str(from_string::str2num(s)) == s)
        std::cout << "Good for string -> double -> string" << std::endl;
    else
        std::cout << "Bad for string -> double -> string" << std::endl;
    return 0;
}
Actually, I think you'll find that 1.34 IS 1.3400000000000001. Floating-point numbers are not precise; you can't get around this. 1.34f is 1.3400000333786011, for example.
As stated by others, floating-point numbers are not that accurate; it's an artifact of how they store the value.
What you are really looking for is a decimal number representation.
Basically this uses an integer to store the number, with a fixed accuracy after the decimal point.
A quick Google got this:
http://www.codeproject.com/KB/mcpp/decimalclass.aspx
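For illustration, a minimal fixed-point sketch of the idea (hypothetical Decimal3 type, positive values only; a real decimal class like the one linked above also handles signs, arithmetic and rounding):

#include <cstdint>
#include <cstdio>

// Store thousandths in an integer, so e.g. 86700.200 is represented exactly.
struct Decimal3 {
    std::int64_t thousandths;
    static Decimal3 from_parts(std::int64_t whole, std::int64_t frac) {
        return Decimal3{ whole * 1000 + frac }; // frac in [0, 999]
    }
    void print() const {
        std::printf("%lld.%03lld\n",
                    static_cast<long long>(thousandths / 1000),
                    static_cast<long long>(thousandths % 1000));
    }
};

int main() {
    Decimal3 d = Decimal3::from_parts(86700, 200);
    d.print(); // 86700.200
}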