Where does this precision loss happen and how to prevent it? - c++

I'm writing a simple tool in Qt which reads data from two GPX (XML) files and combines them in a certain way. I tested my tool with track logs that contain waypoints having 6 decimal digits precision. When I read them from the GPX file, the precision gets reduced to 4 decimal digits (rounded properly). So for example this original tag:
<trkpt lat="61.510656" lon="23.777735">
turns into this when my tool writes it again:
<trkpt lat="61.5107" lon="23.7777">
Debug output shows the precision loss happens on this line:
double lat = in.attributes().value("", "lat").toString().toDouble();
but I can't see why. in is a QXmlStreamReader reading from a text file handle.

It probably happens when you are writing the value back to the XML. Please post that code in your question.
If I had to guess before seeing the code: you are using QString::number to convert the double back into a string. Its default format is 'g' with a precision of 6 significant digits, which matches what you are seeing (61.5107 is six significant digits). Pass a higher precision, for example QString::number(lat, 'g', 9), to keep all the decimals.

Related

powerquery: extra digits added to number when importing table

Glad to ask a question here again after more than 10 years (the last one was about BASH scripting; now that I'm in corporate, guess what... it's about Excel ;) )
Here is my question/issue:
I am importing data with powerquery for further analysis
What I have discovered is that the imported values contain extra digits not present in the original table.
I have googled this problem but have not been able to find an explanation or a solution (a similar issue is this one, more than a year old, but with no feedback from Microsoft).
(columns are formatted as text in the screenshot but the issue is still present even if formatted as number)
The workaround I am using now, though I am not happy with it, is the following:
- "increased decimal" to make sure all my digits are captured (in my source the entries do not all have the same number of significant digits),
- saved as CSV,
- imported the impacted columns as number,
- converted the columns to text (for future text matching).
I am really annoyed by this unwanted and unpredictable behaviour of Excel.
I see a serious data-integrity issue: if we cannot rely on the Power Query/Power BI platform to maintain accurate queries, I wonder why we would use it.
adding another screenshot to clarify that changing the source format to text does not solve the problem
another screenshot added following @David Bacci's comments:
I think I wrongfully assumed my data was stored as text in the source, can you confirm?
If you are exporting and importing as text, then this will not happen. If you convert to number, you will lose precision. From the docs (my bold):
Represents a 64-bit (eight-byte) floating-point number. It's the most
common number type, and corresponds to numbers as you usually think of
them. Although designed to handle numbers with fractional values, it
also handles whole numbers. The Decimal Number type can handle
negative values from –1.79E +308 through –2.23E –308, 0, and positive
values from 2.23E –308 through 1.79E + 308. For example, numbers like
34, 34.01, and 34.000367063 are valid decimal numbers. The largest
precision that can be represented in a Decimal Number type is 15
digits long. The decimal separator can occur anywhere in the number.
The Decimal Number type corresponds to how Excel stores its numbers.
Note that a binary floating-point number can't represent all numbers
within its supported range with 100% accuracy. Thus, minor differences
in precision might occur when representing certain decimal numbers.
BTW, you should probably accept some of the good answers from your previous questions from 10 years ago.

One decimal field taking up 75% file size of power bi file

I have a Power BI file which is over 2 GB in size, and I found that one field is taking up 1.5 GB of it. When I change that field to a whole number or decimal, the file is reduced to 350 MB.
I want to change it to a decimal, but I feel that changing it to a decimal shouldn't increase the file size so dramatically. Is this correct? I wanted to check whether this is expected behaviour.
Thanks for any help
Here is a screenshot of the settings:
If you are ok with only preserving 4 decimals then you can switch to a “fixed decimal number” data type and it should compress the same as a whole number. Fixed decimal is stored as an integer and the last 4 digits are interpreted to be right of the decimal as explained here.

The expression "binary=True" in embedding using word2vec

What does the expression "binary=True" mean, and what is it used for, in the following line of code:
w2vmodel = gensim.models.KeyedVectors.load_word2vec_format(
    'models/GoogleNews-vectors-negative300.bin.gz',
    binary=True  # <-- this
)
The format written by Google's original word2vec.c program had an option to write in plain-text or binary. (Essentially, one wrote floating-point values as human-readable decimal strings, and the other as packed 4-byte binary representations which look like line-noise/strange-characters if viewed as text/characters.)
If you want to read such a file that was written in binary mode, you need to specify binary=True, or else the file format will be misinterpreted, likely failing with errors. There are no other differences in later behavior once the data has been successfully read.

Fortran output exponentials to file

I'm having trouble writing exponential numbers to a file. If I set the output to be in the form E20.8 and have numbers in the range e-99 to e+99, I'm fine. When I try to output a number smaller than e-99, such as 1.23456e-100, I get 1.23456000-100 instead (the E is dropped; the zeros come from the .8 in E20.8). This is problematic for post-processing.
Any suggestions for a fix? Is there another parameter for the Ew.d format for the size of the exponential?
I wasn't persistent enough in my searching: the full output format is Ew.dEe, i.e. Ew.d followed by "E" and the number of digits to reserve for the exponent. In my case, E20.8E3 worked great. The answer, for future reference, is here:
http://www.hicest.com/Format.htm

Error when reading in float in Fortran

This should be quite simple, but I can't manage to read in a floating point number in Fortran. My program test.f looks like this:
PROGRAM TEST
open(UNIT=1,FILE='test.inp')
read(1,'(f3.0)')line
STOP
END
The input file test.inp simply contains a single float: 1.2
Now the compiling of my testfile goes fine, but when I run it I get an error:
At line 4 of file test.f (unit = 1, file = 'test.inp')
Fortran runtime error: Expected REAL for item 1 in formatted transfer, got INTEGER
(f3.0)
^
I've tried different modifications of the code and also googling for the error message, but with no result. Any help would be greatly appreciated!
Regards,
Frank
Your variable line is implicitly typed as integer (names beginning with I-N are integer by default). That doesn't work with the f edit descriptor. If you want to read an integer, use the i edit descriptor (i3, for example); otherwise, declare line as real to match the f descriptor.
Note besides: the .0 is not a problem, because if Fortran reads a number that contains a decimal point, the .d part of the descriptor is ignored. It is only used when a number without a decimal point is entered; then the digit count after the dot in the descriptor determines where a decimal point is inserted. For example, with F8.5, the input 12345678 is read as 123.45678. More on this here http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/fortran/lin/compiler_f/lref_for/source_files/pghredf.htm .
In your read statement
read(1,'(f3.0)')line
the f3.0 tells your program to read a 3-character field as a real number with 0 implied decimal digits (this is what the w.d syntax means). Since your input 1.2 contains an explicit decimal point, that point takes precedence over the .0, so the value itself would be read correctly; the runtime error complains about the variable line, which is implicitly typed as integer and cannot be the target of an f edit descriptor. Declare it real and the read works:
real :: line
read(1,'(f3.1)')line
although, if the number in your file is likely to change and be larger than 9.9 or have more than one decimal place you should increase the field width to something larger than 3.
See the documentation of the read statement and of data edit descriptors for more information on reading and writing in Fortran.
Edit: the format specifier, the second argument in quotes in your read statement, has the form fw.d, where f indicates that the data to read is a floating-point number, w is the width of the field including all blanks and decimal points, and d specifies the number of digits to the right of the decimal point.
I would suggest reading/writing list-formatted data unless you have a very strong reason to do otherwise. Assuming that you're reading in from a file with just a single float or integer on each line, like this
123.45
11
42
then this should do the reading
real*8 :: x,y,z
open(1,file=filename)
read(1,*)x
read(1,*)y
read(1,*)z
close(1)