Visual Studio C++ 2008 / 2010 - break on float NaN - c++

Is there any way to set up Visual Studio (just upgraded from 2008 to 2010) to break, as if an assertion failed, whenever any floating point number becomes NaN, QNAN, INF, etc?
Up until now I have just been using the assert(x == x) trick, but I would rather have something implicit, so that I don't have to add assertions everywhere.
Quite surprised I can't find an answer to this via google. Some stuff about 'floating point exceptions', but I'm not sure if they are the same thing, and I've tried enabling them in Visual Studio, but the program doesn't break until something catastrophic happens because of the NaN later on in execution.

1) Go to the project options and enable /fp:strict (C/C++ -> Code Generation -> Floating Point Model).
2) Use _controlfp to set the floating-point control word (see code below).
#include <float.h>
unsigned int fp_control_state = _controlfp(_EM_INEXACT, _MCW_EM);
#include <math.h>
int main () {
sqrtf(-1.0); // floating point exception
double x = 0.0;
double y = 1.0/x; // floating point exception
return 0;
}

Try enabling fp exceptions

At least on x86, when you generate a NaN etc., one of the FPU status register bits is set. There's a way to configure the FPU so that it raises a hardware exception when the next FP operation executes, but that's not quite as soon as you hoped for. I can't recall the reference though.

I am not sure if this is possible the way you want it, but you could create a macro which wraps the code on the marked line in an assert, or which sets a breakpoint for it.
Hope this helps
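The macro idea could look roughly like the sketch below. The names is_bad_float, checked_finite and ASSERT_FINITE are made up for illustration and are not part of any library. The check relies on std::isfinite from <cmath>, which is false for NaN and for both infinities, so it also catches the INF case that assert(x == x) lets through.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: true when v is NaN, +INF or -INF.
inline bool is_bad_float(double v)
{
    return !std::isfinite(v);
}

// Hypothetical helper: asserts on a bad value and hands a good value back,
// so the check can sit inside an expression.
inline double checked_finite(double v)
{
    assert(!is_bad_float(v) && "floating point value is NaN or INF");
    return v;
}

// Hypothetical macro wrapper: y = ASSERT_FINITE(a / b);
// With NDEBUG defined the assert compiles away and only the value remains.
#define ASSERT_FINITE(expr) (checked_finite((expr)))
```

This still has to be written at every place you want checked, so it is a fallback for cases where unmasking the hardware exceptions is not an option.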

Related

C++ Floating point precision error compiler flag

In c++ is there a compiler flag or an option somewhere that makes it so that if 2 floats are within the error of the floating point arithmetic that they evaluate as equal?
It's annoying having to track down floating point errors.
For example a long time ago when testing something where I knew what the value was I even overwrote the value right before the line and it still failed.
This is a very simplified version of what it looked like
double x = 3;
if(x == 3)
printf("x is 3");
else
printf("x is not 3");
And that went into the else case and printed "x is not 3"
There has to be a way to handle this that doesn't mean I have to add handling to each floating point comparison.
If you use GCC and glibc you can include something like
#define _GNU_SOURCE 1
#include <fenv.h>
static void __attribute__((constructor)) trapfpe ()
{
/* Enable some exceptions. At startup all exceptions are masked. */
feenableexcept (FE_INEXACT);
}
in your project which will abort the program (with a core dump, if you have such enabled in your environment) when it hits one of the above FP exceptions.
That being said, I don't think FE_INEXACT is particularly useful in reality. A somewhat useful combination might be FE_INVALID|FE_DIVBYZERO|FE_OVERFLOW (but that's beside the question being asked).
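On the comparison itself: as far as I know there is no GCC or Visual C++ switch that makes == tolerant, so the usual route is a small helper that is called instead of ==. The sketch below is one common shape, a combined relative and absolute tolerance. The name nearly_equal and the default tolerances are assumptions picked for illustration, not values from any standard.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper; the default tolerances are arbitrary placeholders
// and should be chosen for the magnitudes the application works with.
inline bool nearly_equal(double a, double b,
                         double rel_tol = 1e-9, double abs_tol = 1e-12)
{
    assert(rel_tol >= 0.0 && abs_tol >= 0.0);
    if (a == b)
        return true;   // exact match, including infinities of the same sign
    if (!std::isfinite(a) || !std::isfinite(b))
        return false;  // a NaN or a lone infinity never matches
    const double diff = std::fabs(a - b);
    const double scale = std::max(std::fabs(a), std::fabs(b));
    // The absolute tolerance covers values near zero, where a purely
    // relative test would be too strict.
    return diff <= std::max(abs_tol, rel_tol * scale);
}
```

With this, nearly_equal(0.1 + 0.2, 0.3) is true even though the two values differ in the last bit.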

c++ exp function different results under x64 on i7-3770 and i7-4790

When I execute a simple x64 application with the following code, I get different results on Windows PCs with an i7-3770 and an i7-4790 CPU.
#include <cmath>
#include <iostream>
#include <limits>
void main()
{
double val = exp(-10.240990982718174);
std::cout.precision(std::numeric_limits<double>::max_digits10);
std::cout << val;
}
Result on i7-3770:
3.5677476354876406e-05
Result on i7-4790:
3.5677476354876413e-05
When I modify the code to call
unsigned int control_word;
_controlfp_s(&control_word, _RC_UP, MCW_RC);
before the exp function call, both CPUs deliver the same results.
My questions:
Does anyone have an idea for the reason of the differences between the i7-3770 and i7-4790?
Is there a way to set the floating point precision or consistency in a Visual Studio 2015/2017 C++ project for the whole project and not only for the following function call? The "Floating Point Model" setting (/fp) does not have any influence on the results here.
Assuming that double is coded using IEEE-754, and using this decimal to binary converter, you can see that:
3.5677476354876406e-05 is represented in hexa as 0x3F02B48CC0D0ABA8
3.5677476354876413e-05 is represented in hexa as 0x3F02B48CC0D0ABA9
which differ only in the last bit, probably due to rounding error.
I did some further investigations and I found out the following facts:
the problem does also occur on Windows with a different compiler (Intel)
on a linux system both values are equal
I also posted this question to the Visual Studio Community. I was told that on Haswell and newer CPUs the math library uses FMA3 instructions. You can disable this feature with _set_FMA3_enable(0) at the beginning of the program. When I do this, the results are the same.
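To check a "differs only in the last bit" claim without an online converter, the bit patterns can be compared directly. This is a minimal sketch that assumes double is the 64-bit IEEE-754 format; bits_of and ulp_distance are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Raw bit pattern of a double (assumes a 64-bit IEEE-754 double).
inline std::uint64_t bits_of(double v)
{
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch expects a 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &v, sizeof bits);  // well-defined way to read the bits
    return bits;
}

// Hypothetical helper: how many representable steps apart a and b are.
// Only meaningful for finite values of the same sign, where the bit
// patterns are ordered the same way as the values.
inline std::uint64_t ulp_distance(double a, double b)
{
    assert(std::isfinite(a) && std::isfinite(b));
    assert(std::signbit(a) == std::signbit(b));
    const std::uint64_t x = bits_of(a);
    const std::uint64_t y = bits_of(b);
    return x > y ? x - y : y - x;
}
```

For the two results in the question, ulp_distance(3.5677476354876406e-05, 3.5677476354876413e-05) comes out as 1, that is, they are neighbouring doubles.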

Different optimization in VS2015 vs VS2013 causes floating point exception

I have a small example of an issue which came up during the transition from VS2013 to VS2015. In VS2015 the code example below causes a floating-point invalid operation exception.
#include <float.h> // _clearfp, _controlfp_s, _EM_*
#include <math.h> // pow
int main()
{
unsigned int enableBits = _EM_OVERFLOW | _EM_ZERODIVIDE | _EM_INVALID;
_clearfp();
_controlfp_s(0, ~enableBits, enableBits);
int count = 100;
float array[100];
for (int i = 0; i < count; ++i)
{
array[i] = (float)pow((float)(count - 1 - i) / count, 4); //this causes exception in VS2015
}
return 0;
}
This happens only in release mode, so it's probably caused by different optimization. Is there something wrong with this code or is this a bug in VS 2015?
It's hard to find issues like these across the whole code base, so I am looking for a systematic fix, not a workaround (e.g. using a different variable instead of i, which works).
I also checked the generated assembly code, and it seems VS2013 uses the whole 128-bit register to perform 4 float operations in one division. VS2015 seems to do only 2 float operations, and the rest of the register is zero (or some garbage), which probably introduces this exception.
The instruction which causes the exception was marked in two disassembly screenshots, one for VS2013 and one for VS2015.
Any help will be appreciated.
Thanks.
This looks like an interaction between your use of floating-point exceptions and the compiler's floating-point optimizations.
The generated code does 2 iterations at once (loop unrolling) but uses divps, which does 4 divides at once (on the 4 floats in an XMM register). The upper 2 floats in the XMM register are not used and are zero. As the results of the divides in those slots aren't used, it doesn't normally matter. However, because you unmasked the exceptions, the 0/0 in the unused slots raises the invalid operation exception that you see, even though it's generating values which won't be used.
Your choices, as I see them, are to set /fp:strict, which disables these optimisations and makes this work (but will obviously make the code slower), or to remove the _controlfp_s call.

Handling Floating-Point exceptions in C++

I'm finding the floating-point model/error issues quite confusing. It's an area I'm not familiar with and I'm not a low level C/asm programmer, so I would appreciate a bit of advice.
I have a largish C++ application built with VS2012 (VC11) that I have configured to throw floating-point exceptions (or more precisely, to allow the C++ runtime and/or hardware to throw fp-exceptions) - and it is throwing quite a lot of them in the release (optimized) build, but not in the debug build. I assume this is due to the optimizations and perhaps the floating-point model (although the compiler /fp:precise switch is set for both the release and debug builds).
My first question relates to managing the debugging of the app. I want to control where fp-exceptions are thrown and where they are "masked". This is needed because I am debugging the (optimized) release build (which is where the fp-exceptions occur) - and I want to disable fp-exceptions in certain functions where I have detected problems, so I can then locate new FP problems. But I am confused by the difference between using _controlfp_s to do this (which works fine) and the compiler (and #pragma float_control) switch "/fp:except" (which seems to have no effect). What is the difference between these two mechanisms? Are they supposed to have the same effect on fp exceptions?
Secondly, I am getting a number of "Floating-point stack check" exceptions - including one that seems to be thrown in a call to the GDI+ dll. Searching around the web, the few mentions of this exception seem to indicate it is due to compiler bugs. Is this generally the case? If so, how should I work round this? Is it best to disable compiler optimizations for the problem functions, or to disable fp-exceptions just for the problematic areas of code if there don't appear to be any bad floating-point values returned? For example, in the GDI+ call (to GraphicsPath::GetPointCount) that throws this exception, the actual returned integer value seems correct. Currently I'm using _controlfp_s to disable fp-exceptions immediately prior to the GDI+ call – and then use it again to re-enable exceptions directly after the call.
Finally, my application does make a lot of floating-point calculations and needs to be robust and reliable, but not necessarily hugely accurate. The nature of the application is that the floating-point values generally indicate probabilities, so are inherently somewhat imprecise. However, I want to trap any pure logic errors, such as divide by zero. What is the best fp model for this? Currently I am:
trapping all fp exceptions (i.e. EM_OVERFLOW | EM_UNDERFLOW | EM_ZERODIVIDE | EM_DENORMAL | EM_INVALID) using _controlfp_s and a SIGFPE Signal handler,
have enabled the denormals-are-zero (DAZ) and flush-to-zero (FTZ) (i.e. _MM_SET_FLUSH_ZERO_MODE(_MM_DENORMALS_ZERO_ON)), and
I am using the default VC11 compiler settings /fp:precise with /fp:except not specified.
Is this the best model?
Thanks and regards!
Most of the following information comes from Bruce Dawson's blog post on the subject.
Since you're working with C++, you can create a RAII class that enables or disables floating point exceptions in a scoped manner. This lets you have greater control so that you're only exposing the exception state to your code, rather than manually managing calling _controlfp_s() yourself. In addition, floating point exception state that is set this way applies to all code that runs on the thread, including libraries you call, so it's really advisable to remember the previous state of the control word and restore it when needed. RAII can take care of this for you and is a good solution for the issues with GDI+ that you're describing.
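A minimal sketch of the save-and-restore half of such a class, written against standard <cfenv> so it is not tied to one compiler. ScopedFpEnv is a hypothetical name. What exactly fenv_t holds is implementation-defined; the assumption here is that it covers the exception flags, the rounding mode and the exception masks, which I believe is the case on the common x86 runtimes. With Visual C++ the same shape would presumably call _controlfp_s in the constructor and destructor instead.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical RAII guard: remembers the floating-point environment when it
// is created and puts it back when the scope ends, whatever the code in
// between (or a library it calls) did to it.
class ScopedFpEnv
{
public:
    ScopedFpEnv()
    {
        const int rc = std::fegetenv(&saved_);
        assert(rc == 0 && "fegetenv failed");
        (void)rc;  // silence the unused warning when NDEBUG is defined
    }

    ~ScopedFpEnv() { std::fesetenv(&saved_); }

    // Copying a guard would restore the same state twice; forbid it.
    ScopedFpEnv(const ScopedFpEnv&) = delete;
    ScopedFpEnv& operator=(const ScopedFpEnv&) = delete;

private:
    std::fenv_t saved_;
};
```

Usage is one line at the top of the block that needs different settings, for example around the GDI+ call: declare a ScopedFpEnv, change the masks, make the call, and let the destructor undo the change.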
The exception flags _EM_OVERFLOW, _EM_ZERODIVIDE, and _EM_INVALID are the most important to account for. _EM_OVERFLOW is raised when a result is too large to represent and becomes positive or negative infinity, _EM_ZERODIVIDE when a finite non-zero value is divided by zero, and _EM_INVALID when an operation has no defined result and produces a NaN (0.0/0.0 or sqrt(-1.0), for example) or when an operand is a signaling NaN. _EM_UNDERFLOW is safe to ignore; it signals when your computation result is non-zero and between -FLT_MIN and FLT_MIN (in other words, when you generate a denormal). _EM_INEXACT is raised too frequently to be of any practical use due to the nature of floating point arithmetic, although it can be informative if trying to track down imprecise results in some situations.
SIMD code adds more wrinkles to the mix; since you don't indicate using SIMD explicitly I'll leave out a discussion of that except to note that specifying anything other than /fp:fast can disable automatic vectorization of your code in VS 2012; see this answer for details on this.
I can't help much with the first two questions, but I have experience and a suggestion for the question about masking FPU exceptions.
I've found the functions
_statusfp() (x64 and Win32)
_statusfp2() (Win32 only)
_fpreset()
_controlfp_s()
_clearfp()
_matherr()
useful when debugging FPU exceptions and in delivering a stable and fast product.
When debugging, I selectively unmask exceptions to help isolate the line of code where an fpu exception is generated in a calculation where I cannot avoid calling other code that unpredictably generates fpu exceptions (like the .NET JIT's divide by zeros).
In released product I use them to deliver a stable program that can tolerate serious floating point exceptions, detect when they occur, and recover gracefully.
I mask all FPU exceptions when I have to call code that cannot be changed, does not have reliable exception handling, and occasionally generates FPU exceptions.
Example:
#define BAD_FPU_EX (_EM_OVERFLOW | _EM_ZERODIVIDE | _EM_INVALID)
#define COMMON_FPU_EX (_EM_INEXACT | _EM_UNDERFLOW | _EM_DENORMAL)
#define ALL_FPU_EX (BAD_FPU_EX | COMMON_FPU_EX)
Release code:
_fpreset();
Use _controlfp_s() to mask ALL_FPU_EX
_clearfp();
... calculation
unsigned int bad_fpu_ex = (BAD_FPU_EX & _statusfp());
_clearfp(); // to prevent reacting to existing status flags again
if ( 0 != bad_fpu_ex )
{
... use fallback calculation
... discard result and return error code
... throw exception with useful information
}
Debug code:
_fpreset();
_clearfp();
Use _controlfp_s() to mask COMMON_FPU_EX and unmask BAD_FPU_EX
... calculation
"crash" in debugger on the line of code that is generating the "bad" exception.
Depending on your compiler options, release builds may be using intrinsic calls to FPU ops and debug builds may call math library functions. These two methods can have significantly different error handling behavior for invalid operations like sqrt(-1.0).
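The release-code pattern above (clear the status flags, calculate, then test the flags) can also be written with the standard <cfenv> calls. The following is only a sketch of that idea, not the code I ship: run_and_collect_bad_flags and kBadFpExceptions are hypothetical names, and it assumes the exceptions are masked (the default state), so nothing traps while fn runs.

```cpp
#include <cassert>
#include <cfenv>

// The "bad" set from the example, spelled with the standard <cfenv> names.
constexpr int kBadFpExceptions = FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW;

// Hypothetical helper: runs fn(), stores its result in out, and returns the
// subset of kBadFpExceptions whose status flags were raised while it ran.
template <typename Fn>
int run_and_collect_bad_flags(Fn&& fn, double& out)
{
    const int cleared = std::feclearexcept(FE_ALL_EXCEPT);  // start clean
    assert(cleared == 0 && "feclearexcept failed");
    (void)cleared;  // silence the unused warning when NDEBUG is defined

    out = fn();

    const int bad = std::fetestexcept(kBadFpExceptions);
    std::feclearexcept(FE_ALL_EXCEPT);  // do not react to the same flags twice
    return bad;
}
```

A non-zero return value is the cue to fall back, discard the result, or throw with useful information, as in the outline above. Optimizing compilers may fold constant expressions at compile time, in which case no flag is raised at run time, so test such a helper with values the compiler cannot see through.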
Using executables built with VS2010 on 64-bit Windows 7, I have generated slightly different double precision arithmetic values when using identical code on the Win32 and x64 platforms, even with non-optimized debug builds with /fp:precise, the FPU precision control explicitly set to _PC_53, and the FPU rounding control explicitly set to _RC_NEAR. I had to adjust some regression tests that compare double precision values to take the platform into account. I don't know if this is still an issue with VS2012, but heads up.
I've been struggling to find information about handling floating point exceptions on Linux, and I can tell you what I learned:
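There are a few ways of enabling the exception mechanism:
1. fesetenv(FE_NOMASK_ENV); // enables all exceptions (glibc extension)
2. feenableexcept(FE_ALL_EXCEPT); // glibc extension
3. Through the x87 control word. A set mask bit disables the exception, so the bits have to be cleared:
fpu_control_t fw; // from <fpu_control.h>
_FPU_GETCW(fw);
fw &= ~FE_ALL_EXCEPT;
_FPU_SETCW(fw);
4. Through the saved environment:
fenv_t envp; // from <fenv.h> (bits/fenv.h)
fegetenv(&envp);
envp.__control_word &= ~_FPU_MASK_OM; // unmask overflow only
fesetenv(&envp);
5. With inline assembly:
fpu_control_t cw;
__asm__ ("fnstcw %0" : "=m" (*&cw)); // get control word
cw &= ~FE_UNDERFLOW; // unmask underflow only
__asm__ ("fldcw %0" : : "m" (*&cw)); // write control word
6. C++ mode: std::feclearexcept(FE_ALL_EXCEPT); // clears pending flags; it does not unmask anything by itself
Items 3 to 5 change only the x87 control word. On x86-64, where SSE instructions do the arithmetic, the MXCSR register holds its own mask bits.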
There are some useful links :
http://frs.web.cern.ch/frs/Source/MAC_headers/fpu_control.h
http://en.cppreference.com/w/cpp/numeric/fenv/fetestexcept
http://technopark02.blogspot.ro/2005/10/handling-sigfpe.html

How to trace a NaN in C++

I am going to do some math calculations using C++ . The input floating point number is a valid number, but after the calculations, the resulting value is NaN. I would like to trace the point where NaN value appears (possibly using GDB), instead of inserting a lot of isNan() into the code. But I found that even code like this will not trigger an exception when a NaN value appears.
double dirty = 0.0;
double nanvalue = 0.0/dirty;
Could anyone suggest a method for tracing the NaN or turning a NaN into an exception?
Since you mention using gdb, here's a solution that works with gcc -- you want the
functions defined in fenv.h :
#define _GNU_SOURCE
#include <fenv.h>
#include <stdio.h>
int main(int argc, char **argv)
{
double dirty = 0.0;
feenableexcept(FE_ALL_EXCEPT & ~FE_INEXACT); // Enable all floating point exceptions but FE_INEXACT
double nanval=0.0/dirty;
printf("Succeeded! dirty=%lf, nanval=%lf\n",dirty,nanval);
}
Running the above program produces the output "Floating point exception". Without
the call to feenableexcept, the "Succeeded!" message is printed.
If you were to write a signal handler for SIGFPE, that might be a good place to
set a breakpoint and get the traceback you want. (Disclaimer: haven't tried it!)
In Visual Studio you can use the _controlfp function to set the behavior of floating-point calculations (see http://msdn.microsoft.com/en-us/library/e9b52ceh(VS.80).aspx). Maybe there is a similar variant for your platform.
Some notes on floating point programming can be found on http://ds9a.nl/fp/ including the difference between 1/0 and 1.0/0 etc, and what a NaN is and how it acts.
One can enable so-called "signaling NaN". That should make it easily possible to make the debugger find the correct position.
Via google, I found this for enabling signaling NaNs in C++, no idea if it works:
std::numeric_limits<double>::signaling_NaN();
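A small sketch of what the signaling NaN buys you, assuming a platform that does its arithmetic in SSE or similar registers (on x86-64, for instance); addition_raises_invalid is a hypothetical name. The first arithmetic use of a signaling NaN raises the invalid-operation flag, whereas a quiet NaN passes through silently. If FE_INVALID is also unmasked (feenableexcept on glibc, _controlfp on Visual Studio), that first use should trap instead of just setting a flag, which is what lets the debugger stop there.

```cpp
#include <cassert>
#include <cfenv>
#include <limits>

static_assert(std::numeric_limits<double>::has_signaling_NaN,
              "this sketch needs a platform with signaling NaNs");

// Hypothetical probe: does "operand + 1.0" raise the invalid-operation flag?
// volatile keeps the compiler from folding the addition at compile time.
inline bool addition_raises_invalid(double operand)
{
    volatile double v = operand;
    const int cleared = std::feclearexcept(FE_ALL_EXCEPT);
    assert(cleared == 0 && "feclearexcept failed");
    (void)cleared;  // silence the unused warning when NDEBUG is defined

    volatile double sum = v + 1.0;  // first arithmetic use of the value
    (void)sum;
    return std::fetestexcept(FE_INVALID) != 0;
}
```

The practical use is to initialise variables with std::numeric_limits<double>::signaling_NaN() so that reading one before it has been given a real value shows up at the first calculation, not far downstream.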
Usefulness of signaling NaN?