I'm trying to port _controlfp( _CW_DEFAULT, 0xffffffff ); from WIN32 to Mac OS X / Intel. I have absolutely no idea how to port this call... Does anyone? Thanks!
Try section 8.6 of Gough's Introduction to GCC, which demonstrates the x86 FLDCW instruction. But it helps if you tell us why you need it — if you want your doubles to be IEEE-754 64-bit doubles, the easiest way is to compile with -msse -mfpmath=sse.
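For reference, here is a minimal sketch of the FLDCW approach Gough describes, using GCC inline assembly to put the x87 unit into 53-bit (double) precision; the function name is made up, and the bit layout is the x87 precision-control field (bits 8-9) from Intel's manuals:

void set_x87_double_precision()
{
    unsigned short cw;
    __asm__ __volatile__("fnstcw %0" : "=m"(cw));  // store the current x87 control word
    cw = (cw & ~0x0300) | 0x0200;                  // PC field: 10 = 53-bit precision
    __asm__ __volatile__("fldcw %0" : : "m"(cw));  // load the modified control word
}

Again, if SSE math is an option, the x87 control word stops mattering for most of your arithmetic.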
What precision elements are you controlling?
According to Microsoft's website:
The _control87 function gets and sets the floating-point control word. The floating-point control word allows the program to change the precision, rounding, and infinity modes in the floating-point math package. You can also mask or unmask floating-point exceptions using _control87. If the value for mask is equal to 0, _control87 gets the floating-point control word. If mask is nonzero, a new value for the control word is set: For any bit that is on (equal to 1) in mask, the corresponding bit in new is used to update the control word. In other words, fpcntrl = ((fpcntrl & ~mask) | (new & mask)) where fpcntrl is the floating-point control word.
Note the phrase "floating-point math package". This function manipulates Microsoft's runtime library, which does not exist on the Mac.
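In MSVC terms, the get/set distinction in that formula looks like this (a minimal sketch):

#include <float.h>

unsigned int cw = _control87(0, 0);  // mask == 0: just read the control word
_control87(_RC_DOWN, _MCW_RC);       // nonzero mask: update only the rounding-control bits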
I suggest the following:
- Control the floating-point chip on the Mac yourself
- Use a Microsoft compiler for the Mac
- Find a method to move the floating-point controls to a more generic spot in your program
The best advice I can give is to keep as much code as possible in line with the standard, and keep platform- or library-specific issues to a minimum. For functions involving platform-specific features, move them into their own source files / translation units, create copies of these functions, one for each platform, and let the linker decide which ones to use.
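A sketch of that layout, with hypothetical file and function names:

// fp_control.h - the generic interface the rest of the program uses
void set_default_fp_control();

// fp_control_win32.cpp - linked into the Windows build only
#include <float.h>
void set_default_fp_control()
{
    _controlfp(_CW_DEFAULT, 0xffffffff);  // the original WIN32 call
}

// fp_control_macosx.cpp - linked into the Mac OS X build only
#include <fenv.h>
void set_default_fp_control()
{
    fesetenv(FE_DFL_ENV);  // C99: restore the default floating-point environment
}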
Good Luck.
I'm currently testing some inline assembly in C++ on an old compiler (GCC circa 2004), and I wanted to compute the square root of a floating-point number. After trying and searching for a working method, I came across the following code
float r3(float n)
{
    __asm__("fsqrt" : "+t"(n));
    return n;
}
which worked. The issue is, even though I understand the assembly instructions used, I'm unable to find any documentation on what the "+t" constraint on the n variable means. My impression is that it treats n as both the input and the output operand, but I was unable to find any information confirming that. So, what exactly is the "t" constraint and how does it work here?
+
Means that this operand is both read and written by the instruction.
(from the "Constraint Modifier Characters" section of the GCC manual)
t
Top of 80387 floating-point stack (%st(0)).
(from the "Machine Constraints" section of the GCC manual)
+ means you are reading and writing the register.
t means the value is on the top of the 80387 floating point stack.
References:
GCC manual, Extended Asm has general information about constraints - search for "constraints"
GCC manual, Machine Constraints has information about the specific constraints supported on each architecture - search for "x86 family"
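For illustration, the same read-write behaviour can be spelled with a separate output constraint and a matching input constraint; a minimal sketch (r3_alt is a made-up name):

float r3_alt(float n)
{
    float result;
    // "=t": output goes to the top of the x87 stack; "0": the input must
    // occupy the same location as operand 0
    __asm__("fsqrt" : "=t"(result) : "0"(n));
    return result;
}

"+t" is simply shorthand for this read-write pairing on the x87 stack top.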
We have numerical code written in C++. Rarely, but under certain specific inputs, some of the computations result in a 'nan' value.
Is there a standard or recommended method by which we can stop and alert the user when a certain numerical calculation results in a 'nan' being generated (under debug mode)? Checking each result for 'nan' seems impractical given the huge sizes of the matrices and vectors.
How do standard numerical libraries handle this situation? Could you throw some light on this?
NaN is propagated through numeric operations, so it is enough to check the final result for being a NaN. As for how to do it: if building for C++11 or later, there is std::isnan, as Goz noticed. For pre-C++11, if you want to be bulletproof, I would personally do bit-checking (especially if there may be optimization involved). The pattern for NaN is
?          11.......1   xx.......x
sign bit   ^exponent^   ^fraction^
where ? may be anything, and at least one x must be 1.
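A minimal sketch of that bit check for 64-bit IEEE-754 doubles (using unsigned long long so it also compiles pre-C++11):

#include <cstring>

bool is_nan_bits(double d)
{
    unsigned long long bits;
    std::memcpy(&bits, &d, sizeof bits);                      // avoids strict-aliasing trouble
    unsigned long long exponent = (bits >> 52) & 0x7FFull;    // the 11 exponent bits
    unsigned long long fraction = bits & 0xFFFFFFFFFFFFFull;  // the 52 fraction bits
    return exponent == 0x7FFull && fraction != 0;             // all-ones exponent, nonzero fraction
}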
For a platform-dependent solution, there seems to be yet another possibility. There is the function feenableexcept in glibc (probably combined with the signal function and the compiler option -fnon-call-exceptions), which turns on generation of SIGFPE signals when an invalid floating-point operation occurs. And there is the function _control87 (probably combined with the _set_se_translator function and the compiler option /EHa), which allows pretty much the same in VC.
Although this is a nonstandard extension originally from glibc, on many systems you can use the feenableexcept routine declared in <fenv.h> to request that the machine trap particular floating-point exceptions and deliver SIGFPE to your process. You can use fedisableexcept to mask trapping, and fegetexcept to query the set of exceptions that are unmasked. By default they are all masked.
On older BSD systems without these routines, you can use fpsetmask and fpgetmask from <ieeefp.h> instead, but the world seems to be converging on the glibc API.
Warning: glibc currently has a bug whereby (the C99 standard routine) fegetenv has the unintended side effect of masking all exception traps on x86, so you have to call fesetenv to restore them afterward. (Shows you how heavily anyone relies on this stuff...)
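A minimal glibc-specific sketch of the trap-and-signal approach (g++ on glibc defines _GNU_SOURCE by default, which makes feenableexcept visible; the handler just reports and aborts, since resuming after SIGFPE is undefined):

#include <cfenv>
#include <csignal>
#include <cstdio>
#include <cstdlib>

extern "C" void fpe_handler(int)
{
    std::fprintf(stderr, "SIGFPE: invalid floating-point operation\n");
    std::abort();
}

int main()
{
    std::signal(SIGFPE, fpe_handler);
    feenableexcept(FE_INVALID);   // unmask the invalid-operation trap
    volatile double zero = 0.0;
    double nan = zero / zero;     // raises FE_INVALID, which now delivers SIGFPE
    std::printf("%f\n", nan);     // never reached
}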
On many architectures, you can unmask the invalid exception, which will cause an interrupt when a NaN would ordinarily be generated by a computation such as 0*infinity. Running in the debugger, you will break on this interrupt and can examine the computation that led to that point. Outside of a debugger, you can install a trap handler to log information about the state of the computation that produced the invalid operation.
On x86, for example, you would clear the Invalid Operation Mask bit in the x87 control word (bit 0) and in MXCSR (bit 7) to enable trapping for invalid operations from x87 and SSE operations, respectively.
Some individual platforms provide a means to write to these control registers from C, but there's no portable interface that works cross-platform.
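That said, for the SSE side most x86 compilers expose MXCSR through intrinsics; a minimal sketch using the common _mm_getcsr/_mm_setcsr pair:

#include <xmmintrin.h>

void unmask_sse_invalid()
{
    unsigned int csr = _mm_getcsr();
    csr &= ~_MM_MASK_INVALID;  // clear bit 7, the SSE Invalid Operation Mask
    _mm_setcsr(csr);           // invalid SSE operations now trap
}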
Testing f!=f might give problems using g++ with -ffast-math optimization enabled: Checking if a double (or float) is NaN in C++
The only foolproof way is to check the bit pattern.
As to where to implement the checks, this really depends on the specifics of your calculation and on how frequent NaN errors are, i.e. the performance penalty of continuing tainted calculations versus checking at certain stages.
I'm finding the floating-point model/error issues quite confusing. It's an area I'm not familiar with and I'm not a low level C/asm programmer, so I would appreciate a bit of advice.
I have a largish C++ application built with VS2012 (VC11) that I have configured to throw floating-point exceptions (or more precisely, to allow the C++ runtime and/or hardware to throw fp-exceptions) - and it is throwing quite a lot of them in the release (optimized) build, but not in the debug build. I assume this is due to the optimizations and perhaps the floating-point model (although the compiler /fp:precise switch is set for both the release and debug builds).
My first question relates to managing the debugging of the app. I want to control where fp-exceptions are thrown and where they are "masked". This is needed because I am debugging the (optimized) release build (which is where the fp-exceptions occur) - and I want to disable fp-exceptions in certain functions where I have detected problems, so I can then locate new FP problems. But I am confused by the difference between using _controlfp_s to do this (which works fine) and the compiler (and #pragma float_control) switch "/fp:except" (which seems to have no effect). What is the difference between these two mechanisms? Are they supposed to have the same effect on fp exceptions?
Secondly, I am getting a number of "Floating-point stack check" exceptions - including one that seems to be thrown in a call to the GDI+ dll. Searching around the web, the few mentions of this exception seem to indicate it is due to compiler bugs. Is this generally the case? If so, how should I work round this? Is it best to disable compiler optimizations for the problem functions, or to disable fp-exceptions just for the problematic areas of code if there don't appear to be any bad floating-point values returned? For example, in the GDI+ call (to GraphicsPath::GetPointCount) that throws this exception, the actual returned integer value seems correct. Currently I'm using _controlfp_s to disable fp-exceptions immediately prior to the GDI+ call – and then use it again to re-enable exceptions directly after the call.
Finally, my application does make a lot of floating-point calculations and needs to be robust and reliable, but not necessarily hugely accurate. The nature of the application is that the floating-point values generally indicate probabilities, so are inherently somewhat imprecise. However, I want to trap any pure logic errors, such as divide by zero. What is the best fp model for this? Currently I am:
trapping all fp exceptions (i.e. _EM_OVERFLOW | _EM_UNDERFLOW | _EM_ZERODIVIDE | _EM_DENORMAL | _EM_INVALID) using _controlfp_s and a SIGFPE signal handler,
have enabled flush-to-zero (FTZ) and denormals-are-zero (DAZ) (i.e. _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON) and _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON)), and
I am using the default VC11 compiler settings /fp:precise with /fp:except not specified.
Is this the best model?
Thanks and regards!
Most of the following information comes from Bruce Dawson's blog post on the subject (link).
Since you're working with C++, you can create a RAII class that enables or disables floating point exceptions in a scoped manner. This lets you have greater control so that you're only exposing the exception state to your code, rather than manually managing calling _controlfp_s() yourself. In addition, floating point exception state that is set this way is system wide, so it's really advisable to remember the previous state of the control word and restore it when needed. RAII can take care of this for you and is a good solution for the issues with GDI+ that you're describing.
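A minimal sketch of such a guard (MSVC-specific; the class name is made up):

#include <float.h>

class ScopedFpExceptions
{
    unsigned int saved_;
public:
    explicit ScopedFpExceptions(unsigned int enable =
        _EM_OVERFLOW | _EM_ZERODIVIDE | _EM_INVALID)
    {
        _controlfp_s(&saved_, 0, 0);  // read the current control word
        _clearfp();                   // discard stale status flags
        unsigned int unused;
        _controlfp_s(&unused, ~enable, _MCW_EM);  // cleared mask bits = unmasked exceptions
    }
    ~ScopedFpExceptions()
    {
        _clearfp();
        unsigned int unused;
        _controlfp_s(&unused, saved_, _MCW_EM);   // restore the previous exception masks
    }
};

A guard constructed with enable = 0 masks everything, which covers the GDI+ bracketing you are doing by hand; a default-constructed guard unmasks the serious exceptions around your own calculations.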
The exception flags _EM_OVERFLOW, _EM_ZERODIVIDE, and _EM_INVALID are the most important to account for. _EM_OVERFLOW is raised when positive or negative infinity is the result of a calculation, whereas _EM_INVALID is raised when an invalid operation would produce a NaN (0/0, the square root of a negative number, and so on). _EM_UNDERFLOW is safe to ignore; it signals when your computation result is non-zero and between -FLT_MIN and FLT_MIN (in other words, when you generate a denormal). _EM_INEXACT is raised too frequently to be of any practical use due to the nature of floating point arithmetic, although it can be informative when trying to track down imprecise results in some situations.
SIMD code adds more wrinkles to the mix; since you don't indicate using SIMD explicitly I'll leave out a discussion of that except to note that specifying anything other than /fp:fast can disable automatic vectorization of your code in VS 2012; see this answer for details on this.
I can't help much with the first two questions, but I have experience and a suggestion for the question about masking FPU exceptions.
I've found the functions
_statusfp() (x64 and Win32)
_statusfp2() (Win32 only)
_fpreset()
_controlfp_s()
_clearfp()
_matherr()
useful when debugging FPU exceptions and in delivering a stable and fast product.
When debugging, I selectively unmask exceptions to help isolate the line of code where an fpu exception is generated in a calculation where I cannot avoid calling other code that unpredictably generates fpu exceptions (like the .NET JIT's divide by zeros).
In released product I use them to deliver a stable program that can tolerate serious floating point exceptions, detect when they occur, and recover gracefully.
I mask all FPU exceptions when I have to call code that cannot be changed, does not have reliable exception handling, and occasionally generates FPU exceptions.
Example:
#define BAD_FPU_EX (_EM_OVERFLOW | _EM_ZERODIVIDE | _EM_INVALID)
#define COMMON_FPU_EX (_EM_INEXACT | _EM_UNDERFLOW | _EM_DENORMAL)
#define ALL_FPU_EX (BAD_FPU_EX | COMMON_FPU_EX)
Release code:
_fpreset();
unsigned int cw;
_controlfp_s(&cw, ALL_FPU_EX, _MCW_EM);  // mask ALL_FPU_EX (set mask bits = exceptions masked)
_clearfp();
// ... calculation ...
unsigned int bad_fpu_ex = (BAD_FPU_EX & _statusfp());
_clearfp();  // to prevent reacting to existing status flags again
if (0 != bad_fpu_ex)
{
    // ... use fallback calculation, or
    // ... discard result and return error code, or
    // ... throw exception with useful information
}
Debug code:
_fpreset();
_clearfp();
unsigned int cw;
_controlfp_s(&cw, COMMON_FPU_EX, _MCW_EM);  // mask COMMON_FPU_EX, leave BAD_FPU_EX unmasked
// ... calculation ...
You then "crash" in the debugger on the line of code that is generating the "bad" exception.
Depending on your compiler options, release builds may be using intrinsic calls to FPU ops and debug builds may call math library functions. These two methods can have significantly different error handling behavior for invalid operations like sqrt(-1.0).
Using executables built with VS2010 on 64-bit Windows 7, I have generated slightly different double precision arithmetic values when using identical code on the Win32 and x64 platforms, even in non-optimized debug builds with /fp:precise, the FPU precision control explicitly set to _PC_53, and the FPU rounding control explicitly set to _RC_NEAR. I had to adjust some regression tests that compare double precision values to take the platform into account. I don't know if this is still an issue with VS2012, but heads up.
I've been struggling to find information about handling floating point exceptions on Linux, and I can tell you what I learned:
There are a few ways of enabling the exception mechanism:

1. fesetenv(FE_NOMASK_ENV); enables all exceptions

2. feenableexcept(FE_ALL_EXCEPT);

3. Using the macros from fpu_control.h (in the x87 control word a set bit masks an exception, so enabling means clearing bits):

fpu_control_t fw;
_FPU_GETCW(fw);
fw &= ~FE_ALL_EXCEPT;  // clear the mask bits to enable all exceptions
_FPU_SETCW(fw);

4. Editing the control word of the environment directly (include bits/fenv.h; __control_word is a glibc-internal member of fenv_t):

fenv_t envp;
fegetenv(&envp);
envp.__control_word &= ~_FPU_MASK_OM;  // unmask the overflow exception
fesetenv(&envp);

5. Inline assembly:

fpu_control_t cw;
__asm__("fnstcw %0" : "=m"(cw));  // read the config word
cw &= ~FE_UNDERFLOW;              // unmask the underflow exception
__asm__("fldcw %0" : : "m"(cw));  // write the config word back

6. In C++: std::feclearexcept(FE_ALL_EXCEPT); clears the pending exception status flags (useful before unmasking, so stale flags don't trap immediately).
There are some useful links:
http://frs.web.cern.ch/frs/Source/MAC_headers/fpu_control.h
http://en.cppreference.com/w/cpp/numeric/fenv/fetestexcept
http://technopark02.blogspot.ro/2005/10/handling-sigfpe.html
I am new to Fortran 2008 and am trying to implement a Sieve of Atkin. In C++ I implemented this using a std::bitset but was unable to find anything in Fortran 2008 that serves this purpose.
Can anyone point me at any example code or explain an implementation strategy for one?
Standard Fortran doesn't have a precise analogue of what I understand std::bitset to be, though I grant you my understanding may be defective. Generally, if you want to stick to standard Fortran, you would use integers as sets of bits. If one integer doesn't have enough bits for your purposes, use arrays of integers. This does mean, though, that the responsibility for tracking where, say, the 307th bit of your bitset is falls on you.
Prior to the 2008 standard you have functions such as bit_size, iand, ibset, btest and others (see your compiler documentation or Google for language references, or try the Intel Fortran documentation) for bit manipulation.
If you are unfamiliar with Fortran's BOZ literals then familiarise yourself with them. You can, for example, set the bits of an integer using a statement such as this:
integer :: mybits
...
mybits = b'00000011000000100000000000001111'
With the b edit descriptor you can read and write binary literals too. For example the statements
write(*,*) mybits
write(*,'(b32.32)') mybits
will produce the output
50462735
00000011000000100000000000001111
If you can lay your hands on a modern-enough compiler then you will find that the 2008 standard added new bit-twiddling functions such as bge, bgt, dshiftl, iall and a whole lot more. These are defined for input arguments which are integer arrays or integers, but I don't have any experience of using them that I can pass on.
This should be enough to get you started.
Fortran has bit intrinsics for manipulating the bits of default integers. Bit arrays are straightforward to build off that...
Determine how many bits you need, divide by the number of bits in a default integer, allocate an integer array of default kind of the size you computed (plus one if the remainder of the division was non-zero), and you're essentially done. The bit intrinsics are well covered in Metcalf and Reid.
What you may want could look like:
program test
    logical, allocatable :: flips(:)
    ...
    allocate(flips(ntris), stat=err)  ! note: "stat=", not "status="
    call tris(ntris, ..., flips)
    ...
end

subroutine tris(nnewtris, ..., flips)
    logical flips(nnewtris)
    ...
    if (flips(i)) then
        ...
    end if
    return
end
I just wonder if there is some convenient way to detect, at runtime, whether overflow happens to any variable of any built-in data type used in a C++ program. By convenient, I mean without writing code to check every variable against the range of its data type each time its value changes. Or, if that is impossible to achieve, how would you do it?
For example,
float f1 = FLT_MAX + 1;  // FLT_MAX from <cfloat>
cout << f1 << endl;
doesn't give any error or warning, either when compiled with "gcc -W -Wall" or when run.
Thanks and regards!
Consider using Boost's Numeric Conversion library, which gives you negative_overflow and positive_overflow exceptions (examples).
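A minimal sketch of the kind of check it gives you (boost::numeric_cast throws when the target type cannot represent the value):

#include <boost/numeric/conversion/cast.hpp>
#include <iostream>

int main()
{
    try
    {
        double big = 1e300;
        int n = boost::numeric_cast<int>(big);  // far out of range for int
        std::cout << n << '\n';
    }
    catch (const boost::numeric::positive_overflow& e)
    {
        std::cout << "overflow: " << e.what() << '\n';  // the failed conversion is reported here
    }
}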
Your example doesn't actually overflow in the default floating-point environment on an IEEE-754 compliant system.
On such a system, where float is 32-bit binary floating point, FLT_MAX is 0x1.fffffep127 in C99 hexadecimal floating-point notation. Writing it out as an integer in hex, it looks like this:
0xffffff00000000000000000000000000
Adding one (without rounding, as though the values were arbitrary precision integers), gives:
0xffffff00000000000000000000000001
But in the default floating-point environment on an IEEE-754 compliant system, any value between
0xfffffe80000000000000000000000000
and
0xffffff80000000000000000000000000
(which includes the value you have specified) is rounded to FLT_MAX. No overflow occurs.
Compounding the matter, your expression (FLT_MAX + 1) is likely to be evaluated at compile time, not runtime, since it has no side effects visible to your program.
In situations where I need to detect overflow, I use SafeInt<T>. It's a cross-platform solution which throws an exception in overflow situations. Note that it guards integer types, not floating point, so the equivalent of your example would be:
SafeInt<int> n = INT_MAX;
n += 1; // throws
It is available on CodePlex:
http://www.codeplex.com/SafeInt/
Back in the old days when I was developing C++ (199x) we used a tool called Purify. Back then it was a tool that instrumented the object code and logged everything 'bad' during a test run.
I did a quick google and I'm not quite sure if it still exists.
As far as I know nowadays several open source tools exist that do more or less the same.
Check out Electric Fence and Valgrind.
Clang provides -fsanitize=signed-integer-overflow and -fsanitize=unsigned-integer-overflow.
http://clang.llvm.org/docs/UsersManual.html#controlling-code-generation
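A minimal demonstration of the signed-integer check (overflow.cpp is a made-up file name):

// overflow.cpp
#include <climits>

int main()
{
    int n = INT_MAX;
    return n + 1;  // signed overflow: reported at runtime by the sanitizer
}

Compile with clang++ -fsanitize=signed-integer-overflow overflow.cpp and run it; the sanitizer runtime prints a diagnostic naming the overflowing expression and its operands.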