floating point instruction anomaly -- FLDZ malfunctioning?

I am trying to debug the problem I posted earlier here:
C++ and pin tool -- very weird DOUBLE variable issue with IF statement.
I tracked down the moment when the weird behavior occurs using gdb. The figure below is a gdb screenshot showing the disassembled code and the floating-point register values.
The left-hand image shows the state before the highlighted FLDZ instruction is executed, and the right-hand image shows the state after it is executed. I looked up the x86 ISA, and FLDZ loads +0.0 into ST(0). However, what I get is -nan instead of +0.0.
Does anybody know why this happens?
The system I am using is an Intel Xeon 5645 running 64-bit CentOS, but the target program I am trying to debug is a 32-bit application. Also, as I mentioned in the earlier post, I tried two versions of gcc, 4.2.4 and 4.1.2, and observed the same problem with both.
Thanks.
--added--
By the way, below is the source code.
void Router::Evaluate( )
{
if (_id == 0) aaa++;
if ( _partial_internal_cycles != 0 )
{
aaa += 12345;
cout << "this is not a zero : " << endl;
on = true;
}
_partial_internal_cycles += (double) 1.0;
if ( _partial_internal_cycles >= (double)1.0 ) {
_InternalStep( );
_partial_internal_cycles -= (double)1.0;
}
if (GetSimTime() > 8646000 && _id == 0) cout << "aaa = " << aaa << endl;
if ( on)
{
cout << "break. id = " << _id << endl;
assert(false);
}
}

An exception was generated (notice the I bit is set in the stat field). As the documentation says:
If the ST(7) data register which would become the new ST(0) is not empty, both a Stack Fault and an Invalid operation exceptions are detected, setting both flags in the Status Word. The TOP register pointer in the Status Word would still be decremented and the new value in ST(0) would be the INDEFINITE NAN.
By the way, your underlying issue is simply the nature of floating point: it's not exact. See, for example, this gcc bug report -- and this one.
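The stack-overflow rule quoted above can be illustrated with a toy model. This is only a sketch: `X87StackModel` and its members are hypothetical names invented for illustration, it assumes the invalid-operation exception is masked, and it models just the tag check and the TOP decrement, not the real hardware.

```cpp
#include <array>
#include <cmath>
#include <limits>

// Toy model of the x87 register stack (hypothetical, for illustration only).
struct X87StackModel {
    std::array<double, 8> regs{};    // physical registers R0..R7
    std::array<bool, 8> occupied{};  // simplified tag word: true = not empty
    unsigned top = 0;                // TOP field of the status word
    bool invalid_flag = false;       // I bit of the status word
    bool stack_fault = false;        // SF bit of the status word

    // Models FLDZ: decrement TOP, then load +0.0 into the new ST(0).
    void fldz() {
        top = (top + 7) % 8;         // TOP - 1, modulo 8
        if (occupied[top]) {
            // The register that becomes ST(0) is not empty: stack overflow.
            invalid_flag = true;
            stack_fault = true;
            // Real hardware stores the "indefinite" QNaN, which gdb prints as -nan.
            regs[top] = std::numeric_limits<double>::quiet_NaN();
        } else {
            regs[top] = 0.0;
            occupied[top] = true;
        }
    }

    double st0() const { return regs[top]; }
};
```

In this model eight consecutive `fldz()` calls succeed, and the ninth finds the target register still occupied, so ST(0) becomes a NaN and both flags are set. That matches the symptom in the screenshot: some earlier code left values on the FPU stack without popping them.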

Related

Did Visual Studio 2022 17.4.3 break std::round?

(Note: This problem occurs for me only when the compiler switch /arch:AVX is set. More at the bottom)
My gtest unit tests have done this for 7 years
ASSERT_EQ(-3.0, std::round(-2.5f)); // (Note the 'f' suffix)
According to cppreference, std::round is supposed to round halfway cases AWAY from zero, right? Yet with the current release, this test just started failing. Am I missing something? All I did was update my Visual Studio 2022 to 17.4.3. My co-worker with 17.3.3 does not have this problem.
EDIT: I don't know if the problem is GTEST and its macros or assumptions my unit test makes about equality. I put the following two lines of code into my test
std::cerr << "std::round(-2.5) = " << std::round(-2.5) << std::endl;
std::cerr << "std::round(-2.5f) = " << std::round(-2.5f) << std::endl;
They produce the following output. The second one is wrong, is it not?
std::round(-2.5) = -3
std::round(-2.5f) = -2
EDIT #2: As I note above, this only occurs when I set the compiler flag /arch:AVX. If I just create a console app and do not set the flag, or if I explicitly set it to /arch:IA32, the problem goes away. But the question then becomes: is this a bug, or am I just not supposed to use that option?
This is a known bug, see the bug report on developercommunity, which is already in the "pending release" state.
For completeness/standalone sake, the minimal example from there is (godbolt):
#include <cmath>
#include <iostream>
int main()
{
    std::cout << "MSVC version: " << _MSC_FULL_VER << '\n';
    std::cout << "Round 0.5f: " << std::round(0.5f) << '\n';
    std::cout << "Round 0.5: " << std::round(0.5) << '\n';
}
compiled with AVX or AVX2.
The correct output e.g. with MSVC 19.33 is
MSVC version: 193331631
Round 0.5f: 1
Round 0.5: 1
while the latest MSVC 19.34 outputs
MSVC version: 193431931
Round 0.5f: 0
Round 0.5: 1
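A small self-check along these lines could be added to a test suite to detect an affected toolchain. It is only a sketch: the function name `round_matches_standard` is made up here, and the `volatile` variables are an assumption-driven precaution so that the calls are evaluated at run time rather than folded by the compiler.

```cpp
#include <cmath>

// Returns true when std::round rounds halfway cases away from zero
// for both the float and the double overloads.
bool round_matches_standard() {
    // volatile keeps the compiler from evaluating std::round at compile time
    volatile float half_f = 0.5f;
    volatile float neg_f = -2.5f;
    volatile double half_d = 0.5;
    volatile double neg_d = -2.5;

    const float hf = half_f;
    const float nf = neg_f;
    const double hd = half_d;
    const double nd = neg_d;

    return std::round(hf) == 1.0f && std::round(nf) == -3.0f &&
           std::round(hd) == 1.0 && std::round(nd) == -3.0;
}
```

On a conforming library this returns true; with the behavior reported above for the float overload under /arch:AVX it would return false.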

Wrong result for division of two doubles in release build

When I compile my application in Release mode I get incorrect division result of 40.0 / 5 = 7.
In debug compilation it is correct, and result is 8
I tried to cast to double, from double, to int, without abs() etc, but no luck. I know this must be related to weirdness of floating point math on computers, but I have no idea what exactly. I also logged the values on console, via the qDebugs() below the code - everything looks okay, except initial steps.
//somewhere in code
double tonnageToRecover = 0.5;//actually, it's QDoubleSpinBox->value(), with a step of 0.5 set. Anyway, the value finally reduces to 0.5 every time
double tonnagePerArmorPoint = 0.0125;//taken from .json
int minimumArmorDelta = 5;//taken from .json
...
//place where the calculations are performed
double armorPointsPerHalfTon = tonnageToRecover / tonnagePerArmorPoint;
int steps = abs(static_cast<int>(armorPointsPerHalfTon / minimumArmorDelta));
qDebug() << "armorPointsPerHalfTon = " << armorPointsPerHalfTon;
qDebug() << "tonnagePerArmorPoint = " << tonnagePerArmorPoint;
qDebug() << "steps initial = " << steps;
qDebug() << "minimumArmorDelta = " << minimumArmorDelta;
both 1st division parts are type double, tonnageToRecover = 0.5, tonnagePerArmorPoint = 0.0125, result is 40 which is OK
minimumArmorDelta is int = 5
So why 40/5 isn't 8??
Compiler - MinGW 32 5.3.0, from Qt 5.11 pack
Screenshots:
Release
Debug
@Julian
I suspect that too, but how can I overcome this obstacle? I will try changing steps to double and then casting to int again.
RESULT: still does not work :/
I found a solution, but I have no idea exactly why it works now. The current code is:
double armorPointsPerHalfTon = tonnageToRecover / tonnagePerArmorPoint;
// int aPHT = (int)armorPointsPerHalfTon;
// double minDelta = 5.0;//static_cast<double>(minimumArmorDelta);
QString s(QString::number(abs(armorPointsPerHalfTon / minimumArmorDelta)));
int steps = abs(armorPointsPerHalfTon / minimumArmorDelta);
#define myqDebug() qDebug() << fixed << qSetRealNumberPrecision(10)
myqDebug() << "tonnageToRecover = " << tonnageToRecover;
myqDebug() << "tonnagePerArmorPoint = " << tonnagePerArmorPoint;
myqDebug() << "armorPointsPerHalfTon = " << armorPointsPerHalfTon;
//myqDebug() << "aPHT = " << aPHT;//this was 39 in Release, 40 in Debug
myqDebug() << "steps initial = " << steps;
myqDebug() << "string version = " << s;
myqDebug() << "minimumArmorDelta = " << minimumArmorDelta;// << ", minDelta = " << minDelta;
#undef myqDebug
I suppose that creation of that QString s flushes something, and that's why calculation of steps is correct now. String has incorrect value "7", though.
Your basic problem is that you are truncating.
Suppose real number arithmetic would give an answer of exactly 8. Floating point arithmetic will give an answer that is very close to 8, but can differ from it in either direction due to rounding error. If the floating point answer is slightly greater than 8, truncating will change it to 8. If it is even slightly less than 8, truncating will change it to 7.
I suggest writing a new question on how to avoid the truncation, with discussion of why you are doing it.
My guess is that armorPointsPerHalfTon / minimumArmorDelta is not exactly 8 in the Release version but something like 7.99999999. The cast to int then truncates this value to 7.
So, if the Debug version calculates armorPointsPerHalfTon / minimumArmorDelta = 8.0000001, the result is static_cast<int>(armorPointsPerHalfTon / minimumArmorDelta) = 8.
It's not surprising that Debug / Release yield different results (on the order of machine precision), as several optimizations occur in the Release version.
EDIT: If it suits your requirements, you could just use std::round to round your double to the nearest integer, rather than truncating the decimals.
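The difference between truncating and rounding can be sketched as follows. The helper names `steps_truncated` and `steps_rounded` are hypothetical and not part of the original code; `std::lround` is the standard function that rounds to the nearest integer.

```cpp
#include <cmath>

// Truncation: anything even slightly below 8.0 becomes 7.
int steps_truncated(double quotient) {
    return static_cast<int>(quotient);
}

// Rounding to nearest: a tiny error on either side of 8.0 still gives 8.
int steps_rounded(double quotient) {
    return static_cast<int>(std::lround(quotient));
}
```

For example, with `std::nextafter(8.0, 0.0)` (the largest double below 8.0) as input, `steps_truncated` returns 7 while `steps_rounded` returns 8. Whether rounding is the right choice depends on what the step count is supposed to mean when the quotient really is fractional.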

inline asm, when to use r and when to use m? why this behavior?

I am trying to delve into some inline assembly. It is interesting stuff but the documentation is scarce and newb unfriendly.
This code works as expected; it correctly multiplies 3 by 4:
{
int other_var=3;
asm volatile
(
"mov $3,%0\n\t"
"roll $2,%0;"
:"=r"(other_var)
:"r"(other_var)
);
cout << "other_var equals " << other_var <<endl;
return 0;
}
but this does not:
{
int other_var=3;
cout << "other_var equals " << other_var <<endl;
asm volatile
(
"roll $2,%0;"
:"=r"(other_var)
:"r"(other_var)
);
cout << "other_var equals " <<hex<< other_var <<endl;
return 0;
}
When I remove the seemingly arbitrary mov, the code behaves as if undefined and outputs garbage. Suddenly the program does not load other_var from memory to register and the "=m" and "m" option is needed. Why is that? What is the piece of information I am missing here?
You should probably find yourself a couple of reference books, PDFs, or websites: one that documents the very compiler-specific nature of inline assembly, and one that documents the assembly language itself. Then hope nobody ever tries to run your code on different hardware.
In the first chunk of code you assign the constant value 3, "$3", to the output-bound register, "%0".
Then you perform a roll (rotate left) on the output-bound register, "%0", by the constant 2, "$2", bits.
That effectively multiplies 3 by 4.
Neither block of code actually reads the original value from the variable other_var.
m is for memory, r is for register. = marks an output; an operand without = is an input.
mov %1, %0; loads the register used for output with the value of the register used for input.
roll $2, %0; then rotates the output register.
When you just grab a register and start using whatever bit pattern happens to be there, you are likely to see something that resembles garbage.
http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html#s5
http://www.delorie.com/djgpp/doc/brennan/brennan_att_inline_djgpp.html
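As a sketch of an alternative to the explicit mov, GCC's "+r" constraint marks a single operand as both read and written, so the compiler loads the variable into the register before the asm runs. The function name `rotate_left_2` is made up for illustration, and the asm path assumes GCC or Clang on x86 with the default AT&T syntax; other targets fall back to plain C++.

```cpp
#include <cstdint>

// Rotates a 32-bit value left by 2 bits.
std::uint32_t rotate_left_2(std::uint32_t value) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    // "+r": value is an input AND an output, held in a register.
    // "cc": the rotate modifies the flags register.
    asm("roll $2, %0" : "+r"(value) : : "cc");
    return value;
#else
    // Portable fallback for compilers or targets without this inline asm.
    return (value << 2) | (value >> 30);
#endif
}
```

With an input of 3 this returns 12, and the original value is actually read, unlike in both snippets from the question.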

C++ and pin tool -- very weird DOUBLE variable issue with IF statement

I am working with pin tool that simulates a processor and having a very strange problem.
In the code snippet below, Router::Evaluate() is called repeatedly many times. After it is called several million times, strange behavior occurs intermittently where "_cycles != 0" is evaluated to be true in the first IF statement and to be false in the immediately following IF statement, falling into ELSE block.
void Router::Evaluate( )
{
//---------debug print code---------
if (_cycles != 0) {
cout << "not a zero" << endl;
if (_cycles != 0) cout << "---not a zero" << endl;
else cout << "---zero" << endl;
}
//----------------------------------
_cycles += _speedup;
while ( _cycles >= 1.0 ) {
_Step();
_cycles -= 1.0;
}
}
//class definition
class Router : public TimedModule {
protected:
double _speedup; //initialized to 1.0
double _cycles; //initialized to 0.0
...
};
Below is the output of the code where "not a zero" followed by "---zero" is printed out from time to time seemingly randomly.
not a zero
---zero
(...some other output...)
not a zero
---zero
(...some other output...)
How could this possibly happen? This is not a multi-threaded program, so synchronization is not an issue. The program is compiled with gcc 4.2.4 and executed on 32-bit CentOS. Does anybody have a clue?
Thanks.
--added---
I should have mentioned this, too. I did try printing the value of _cycles each time, and it is always 0.0, which should not be possible...
I also used the following g++ options: "-MM -MG -march=i686 -g -ggdb -g1 -finline-functions -O3 -fPIC"
Unless you have a horrible compiler bug, I would guess something like this is happening:
_cycles has some small fraction remaining after the subtractions. As long as the compiler knows nothing else is changing its contents, it keeps the value in a higher-precision floating-point register. When it sees the I/O operation, it cannot be certain that the value of _cycles is not needed elsewhere, so it stores the contents back to the double-precision memory location, rounding off the extra bits that were in the register. The next check pessimistically assumes the value might have changed during the I/O operation and loads it back from memory, now without the extra bits that made it non-zero in the previous test.
As Daniel Fischer mentioned in a comment, using -ffloat-store inhibits the use of high-precision registers. If the problem goes away when using this option then the scenario I described is very likely. Check the assembly output of Router::Evaluate to be sure.

inconsistent output fixed by debug statement placement

EDIT: it was an uninitialized variable... :(
Explanation:
The PointLLA constructor I used only passed through Latitude and Longitude, but I never explicitly set the internal Altitude member variable to 0. Rookie mistake...
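A minimal sketch of the fix, assuming PointLLA looks roughly like this (the real class is not shown in the question; only the member names lat, lon and alt and the two-argument constructor call are taken from the code below):

```cpp
// Hypothetical reconstruction of PointLLA, for illustration only.
struct PointLLA {
    double lat;
    double lon;
    double alt;

    // Two-argument constructor: altitude is explicitly set to zero
    // instead of being left uninitialized.
    PointLLA(double latitude, double longitude)
        : lat(latitude), lon(longitude), alt(0.0) {}

    PointLLA(double latitude, double longitude, double altitude)
        : lat(latitude), lon(longitude), alt(altitude) {}
};
```

With alt left uninitialized, convLLAToECEF adds whatever happens to be in that memory to the radius, which would explain why one distance varied from run to run and why unrelated debug statements changed the result.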
Original Question:
I'm having a pretty horrible time with a bug in my code. I'm calculating distances between a single point and the corners of a rectangle. In this case, the point is centered over the rectangle so I should get four equal distances. I get three equal distances, and one almost equal distance value that's inconsistent (different every time it runs).
If I have a few key debug statements (pretty much just a std::cout call) that explicitly print out the location of each rectangle corner, I get the expected value for the distance and the inconsistency disappears. Here's the relevant code:
// calculate the minimum and maximum distance to
// camEye within the available lat/lon bounds
Vec3 viewBoundsNE; convLLAToECEF(PointLLA(maxLat,maxLon),viewBoundsNE);
Vec3 viewBoundsNW; convLLAToECEF(PointLLA(maxLat,minLon),viewBoundsNW);
Vec3 viewBoundsSW; convLLAToECEF(PointLLA(minLat,minLon),viewBoundsSW);
Vec3 viewBoundsSE; convLLAToECEF(PointLLA(minLat,maxLon),viewBoundsSE);
// begin comment this block out, and buggy output
OSRDEBUG << "INFO: NE (" << viewBoundsNE.x
<< " " << viewBoundsNE.y
<< " " << viewBoundsNE.z << ")";
OSRDEBUG << "INFO: NW (" << viewBoundsNW.x
<< " " << viewBoundsNW.y
<< " " << viewBoundsNW.z << ")";
OSRDEBUG << "INFO: SE (" << viewBoundsSW.x
<< " " << viewBoundsSW.y
<< " " << viewBoundsSW.z << ")";
OSRDEBUG << "INFO: SW (" << viewBoundsSE.x
<< " " << viewBoundsSE.y
<< " " << viewBoundsSE.z << ")";
// --------------- end
// to get the maximum distance, find the maxima
// of the distances to each corner of the bounding box
double distToNE = camEye.DistanceTo(viewBoundsNE);
double distToNW = camEye.DistanceTo(viewBoundsNW); // <-- inconsistent
double distToSE = camEye.DistanceTo(viewBoundsSE);
double distToSW = camEye.DistanceTo(viewBoundsSW);
std::cout << "INFO: distToNE: " << distToNE << std::endl;
std::cout << "INFO: distToNW: " << distToNW << std::endl; // <-- inconsistent
std::cout << "INFO: distToSE: " << distToSE << std::endl;
std::cout << "INFO: distToSW: " << distToSW << std::endl;
double maxDistToViewBounds = distToNE;
if(distToNW > maxDistToViewBounds)
{ maxDistToViewBounds = distToNW; }
if(distToSE > maxDistToViewBounds)
{ maxDistToViewBounds = distToSE; }
if(distToSW > maxDistToViewBounds)
{ maxDistToViewBounds = distToSW; }
OSRDEBUG << "INFO: maxDistToViewBounds: " << maxDistToViewBounds;
So if I run the code shown above, I'll get output as follows:
INFO: NE (6378137 104.12492 78.289415)
INFO: NW (6378137 -104.12492 78.289415)
INFO: SE (6378137 -104.12492 -78.289415)
INFO: SW (6378137 104.12492 -78.289415)
INFO: distToNE: 462.71851
INFO: distToNW: 462.71851
INFO: distToSE: 462.71851
INFO: distToSW: 462.71851
INFO: maxDistToViewBounds: 462.71851
Exactly as expected. Note that all the distTo* values are the same. I can run the program over and over again and I'll get exactly the same output. But now, if I comment out the block that I noted in the code above, I get something like this:
INFO: distToNE: 462.71851
INFO: distToNW: 463.85601
INFO: distToSE: 462.71851
INFO: distToSW: 462.71851
INFO: maxDistToViewBounds: 463.85601
Every run will slightly vary distToNW. Why distToNW and not the other values? I don't know. A few more runs:
463.06218
462.79352
462.76194
462.74772
463.09787
464.04648
So... what's going on here? I tried cleaning/rebuilding my project to see if there was something strange going on but it didn't help. I'm using GCC 4.6.3 with an x86 target.
EDIT: Adding the definitions of relevant functions.
void MapRenderer::convLLAToECEF(const PointLLA &pointLLA, Vec3 &pointECEF)
{
// conversion formula from...
// hxxp://www.microem.ru/pages/u_blox/tech/dataconvert/GPS.G1-X-00006.pdf
// remember to convert deg->rad
double sinLat = sin(pointLLA.lat * K_PI/180.0f);
double sinLon = sin(pointLLA.lon * K_PI/180.0f);
double cosLat = cos(pointLLA.lat * K_PI/180.0f);
double cosLon = cos(pointLLA.lon * K_PI/180.0f);
// v = radius of curvature (meters)
double v = ELL_SEMI_MAJOR / (sqrt(1-(ELL_ECC_EXP2*sinLat*sinLat)));
pointECEF.x = (v + pointLLA.alt) * cosLat * cosLon;
pointECEF.y = (v + pointLLA.alt) * cosLat * sinLon;
pointECEF.z = ((1-ELL_ECC_EXP2)*v + pointLLA.alt)*sinLat;
}
// and from the Vec3 class defn
inline double DistanceTo(Vec3 const &otherVec) const
{
return sqrt((x-otherVec.x)*(x-otherVec.x) +
(y-otherVec.y)*(y-otherVec.y) +
(z-otherVec.z)*(z-otherVec.z));
}
The inconsistent output suggests that either you're making use of an uninitialized variable somewhere in your code, or you have some memory error (accessing memory that's been deleted, double-deleting memory, etc). I don't see either of those things happening in the code you pasted, but there's lots of other code that gets called.
Does the Vec3 constructor initialize all member variables to zero (or some known state)? If not, then do so and see if that helps. If they're already initialized, take a closer look at convLLAToECEF and PointLLA to see if any variables are not getting initialized or if you have any memory errors there.
Seems to me like the DistanceTo function is bugged in some way. If you cannot work out what is up, experiment a bit, and report back.
Try reordering the outputs to see if it's still NW that is wrong.
Try redoing the NW point 2-3 times into different vars to see if they are even consistent in one run.
Try using a different camEye for each point to rule out statefulness in that class.
As much as I hate it, have you stepped through it in a debugger? I usually bias towards stdout based debugging, but it seems like it'd help. That aside, you've got side effects from something nasty kicking around.
My guess is that the fact that you expect (rightfully, of course) all four values to be the same is masking a "NW/SW/NE/SE" typo someplace. The first thing I'd do is isolate the block you've got here into its own function (one that takes the box and point coordinates), then run it with the point in several different locations. I think the error should expose itself quickly at that point.
See if the problem reproduces if you have the debug statements there, but move them after the output. Then the debug statements could help determine whether it was the Vec3 object that was corrupted, or the calculation from it.
Other ideas: run the code under valgrind.
Pore over the disassembly output.