C++ gcc: does the -fassociative-math flag disable float NaN values?

I'm working with statistical functions over a lot of float data. I want them to run faster, but -Ofast disables NaN handling (it turns on -ffinite-math-only), which is not acceptable in my case.
In this case, is it safe to turn on only -fassociative-math? I think this flag allows things like a vectorized sum over an array, even if the array contains NaN.

From the docs:
NOTE: re-ordering may change the sign of zero as well as ignore NaNs
So if you want correct handling of NaNs, you should not use -fassociative-math.

Related

Making a NaN on purpose in WebGL

I have a GLSL shader that's supposed to output NaNs when a condition is met. I'm having trouble actually making that happen.
Basically I want to do this:
float result = condition ? NaN : whatever;
But GLSL doesn't seem to have a constant for NaN, so that doesn't compile. How do I make a NaN?
I tried making the constant myself:
float NaN = 0.0/0.0; // doesn't work
That works on one of the machines I tested, but not on another. Also it causes warnings when compiling the shader.
Given that the obvious computation didn't work on one of the machines I tried, I get the feeling that doing this correctly is quite tricky and involves knowing a lot of real-world facts about the inconsistencies between various types of GPUs.
Don't use NaNs here.
Section 2.3.4.1 from the OpenGL ES 3.2 Spec states that
The special values Inf and −Inf encode values with magnitudes too large to be represented; the special value NaN encodes “Not A Number” values resulting from undefined arithmetic operations such as 0/0. Implementations are permitted, but not required, to support Inf's and NaN's in their floating-point computations.
So it really does seem to depend on the implementation. You should output some other value instead of NaN.
Pass it in as a uniform
Instead of trying to make the NaN in glsl, make it in javascript then pass it in:
shader = ...
uniform float u_NaN
...
call shader with "u_NaN" set to NaN
Fool the Optimizer
It seems like the issue is the shader compiler performing an incorrect optimization. Basically, it replaces a NaN expression with 0.0. I have no idea why it would do that... but it does. Maybe the spec allows for undefined behavior?
Based on that assumption, I tried making an obfuscated method that produces a NaN:
float makeNaN(float nonneg) {
return sqrt(-nonneg-1.0);
}
...
float NaN = makeNaN(some_variable_I_know_isnt_negative);
The idea is that the optimizer isn't clever enough to see through this.
And, on the test machine that was failing, this works! I also tried simplifying the function to just return sqrt(-1.0), but that brought back the failure (further reinforcing my belief that the optimizer is at fault).
This is a workaround, not a solution.
A sufficiently clever optimizer could see through the obfuscation and start breaking things again.
I only tested it on a couple of machines, and this is clearly something that varies a lot.
The Unity GLSL compiler will convert 0.0f/0.0f to intBitsToFloat(int(0xFFC00000u)). Since intBitsToFloat is only supported from OpenGL ES 3.0 onwards, this is a solution that works in WebGL2 but not in WebGL1.

In Fortran: is CONJG(Z) equivalent to DCONJG(Z) when compiling with -fdefault-real-8?

The existing code contains calls to DCONJG(Z) where Z is declared as COMPLEX*16. Can the DCONJG call be replaced with CONJG when the -fdefault-real-8 flag is added?
If Z is defined as double complex, does this still apply?
In the existing code, double complex and complex*16 have both been used to increase precision (and should be equivalent). With the -fdefault-real-8 flag applied, does double complex map to complex*32?
Can the DCONJG call be replaced with CONJG when the -fdefault-real-8
flag is added?
Yes, the standard conjg will return a value of the same kind as its argument, irrespective of the compilation settings. Kind-specific variants of intrinsic functions, such as dconjg, are generally deprecated precisely because they are not kind-indifferent.
If Z is defined as double complex does this still apply?
Yes.
And is double complex equivalent to complex with the flag applied
(same for double precision and real)?
If you mean does that compilation flag also affect the size of the real and imaginary components of a complex value then yes, it does.
EDIT
I don't know what gfortran means by the non-standard (never was, isn't, and probably never will be) kind specification complex*32. But the compiler is reasonably well documented so have a scout yourself. Personally I'd stick to one of the standard ways of specifying a complex number's kind, in which case the standard assures you that the kind specified, e.g. complex(real64), means the kind of each component of the complex number.

Set all floating point literals to floats MSVC++

I am writing some numeric code in C++ and I want to be able to swap between using double and float. I have therefore added a #define MYFLT which I can make either a float or a double as needed. However, how do I deal with the various numeric literals?
For example
MYFLT someNumber = 1.2;
MYFLT someOtherNumber = 1.5f;
gives compiler warnings for the first line when MYFLT is a float and for the second line when MYFLT is a double. I know this is a trivial example, but there are other cases where I have longer expressions containing literals, and floats can end up being converted to doubles and the result converted back to float, which I think is costing me significant performance. How should I deal with this?
I could do things like
MYFLT someNumber = MYFLT(1.2);
MYFLT someOtherNumber = MYFLT(1.5);
but this is quite tedious. I'm assuming that if I do this, the compiler is clever enough to just use a float when needed (can anyone confirm that?). It would be better if there were an MSVC++ compiler switch or #define that tells the compiler to treat all floating point literals as floats instead of doubles. Does such a switch exist?
Even when I wrap all my literals as above my code runs 50% slower when I use float rather than double. I was expecting a performance boost through simd type operations, not a penalty!
Phil
What you'd want is #define MYFLTCONST(x) x##f or #define MYFLTCONST(x) x depending on whether you want a f suffix for float appended.
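A minimal sketch of how that macro could be wired up. MYFLT_IS_FLOAT is a hypothetical build switch and scale is a made-up helper, neither comes from the question; the point is only that the literal's type follows MYFLT, so no float/double conversions sneak into the expression.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical switch: define MYFLT_IS_FLOAT to build in single precision.
#ifdef MYFLT_IS_FLOAT
  #define MYFLT float
  #define MYFLTCONST(x) x##f   // pastes an f suffix: 1.2 -> 1.2f
#else
  #define MYFLT double
  #define MYFLTCONST(x) x      // 1.2 stays a double literal
#endif

// The wrapped literal always has the same type as MYFLT.
static_assert(std::is_same<decltype(MYFLTCONST(1.2)), MYFLT>::value,
              "literal type must match MYFLT");

// Illustrative function: every operand is a MYFLT, so nothing is promoted.
inline MYFLT scale(MYFLT v) {
    return v * MYFLTCONST(1.5) + MYFLTCONST(1.2);
}

// Small self-check: 0 * 1.5 + 1.2 is exactly the literal 1.2 in either precision.
inline void check_scale() {
    assert(scale(MYFLT(0)) == MYFLTCONST(1.2));
}
```

Note that the argument has to be written with a decimal point or exponent: MYFLTCONST(1) would paste to 1f in the float build, which is not a valid literal.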
This is a (not quite complete) answer to my own question.
I found that a small function that was called many times (a fast approximation to sin) didn't have its literals cast as MYFLT. The extra computational hit of this also meant that the compiler wasn't inlining it. This function accounted for most of the difference. Some further profiling seemed to indicate that accessing std::vector<float> was slower than std::vector<double> ( I am using [] to do the access if it matters ). Replacing std::vectors with raw fixed sized arrays sped up the double implementation a little and closed the gap significantly for the float implementation. The float version is now only about 10% slower than the double version. But definitely no speed increase due to either RAM access nor vectorization. I guess I need to think more carefully about my loops to get any benefit there.
I guess the conclusion here (yet again) is that the compiler is pretty good at optimising code - it's much better to work with it and do careful profiling than it is to try and do your own blind "optimisations" which might actually have negative effects, like stopping the compiler performing good inlining.

Signaling or catching 'nan' as they occur in computations in numerical code base in c++

We have numerical code written in C++. Rarely but under certain specific inputs, some of the computations result in an 'nan' value.
Is there a standard or recommended method by which we can stop and alert the user when a certain numerical calculation results in an 'nan' being generated (in debug mode)? Checking each result for equality with 'nan' seems impractical given the huge sizes of the matrices and vectors.
How do standard numerical libraries handle this situation? Could you throw some light on this?
NaN propagates through numeric operations, so it is generally enough to check the final result for NaN. As for how to do it: if building with C++11 or later, there is std::isnan, as Goz noted. Before C++11, if you want to be bulletproof, I would personally check the bits (especially if optimization may be involved). The bit pattern for a NaN is
         ?   11.......1   xx.......x
sign bit ^   ^exponent^   ^fraction^
where ? may be anything, and at least one x must be 1.
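The bit pattern above can be tested directly. This is a minimal sketch for a 64-bit IEEE 754 double (an assumption that holds on mainstream platforms but is not guaranteed by the C++ standard); is_nan_bits is an illustrative name, not a library function.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// True if the bit pattern of d encodes a NaN in IEEE 754 binary64:
// all 11 exponent bits set and a non-zero 52-bit fraction. The sign bit is ignored.
inline bool is_nan_bits(double d) {
    std::uint64_t bits;
    static_assert(sizeof(bits) == sizeof(d), "double must be 64 bits wide");
    std::memcpy(&bits, &d, sizeof(d));   // well-defined way to read the representation
    const std::uint64_t exponent_mask = 0x7FF0000000000000ULL;
    const std::uint64_t fraction_mask = 0x000FFFFFFFFFFFFFULL;
    return (bits & exponent_mask) == exponent_mask && (bits & fraction_mask) != 0;
}

// Small self-check: NaN matches, infinity (zero fraction) and finite values do not.
inline void check_is_nan_bits() {
    assert(is_nan_bits(std::numeric_limits<double>::quiet_NaN()));
    assert(!is_nan_bits(std::numeric_limits<double>::infinity()));
    assert(!is_nan_bits(1.0));
}
```

Because the test is done on integers, it does not depend on how the compiler treats floating-point comparisons.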
For a platform-dependent solution, there seems to be yet another possibility. glibc provides the function feenableexcept (probably to be combined with the signal function and the compiler option -fnon-call-exceptions), which turns on generation of SIGFPE signals when an invalid floating-point operation occurs. The function _control87 (probably with the _set_se_translator function and the compiler option /EHa) allows pretty much the same in VC.
Although this is a nonstandard extension originally from glibc, on many systems you can use the feenableexcept routine declared in <fenv.h> to request that the machine trap particular floating-point exceptions and deliver SIGFPE to your process. You can use fedisableexcept to mask trapping, and fegetexcept to query the set of exceptions that are unmasked. By default they are all masked.
On older BSD systems without these routines, you can use fpsetmask and fpgetmask from <ieeefp.h> instead, but the world seems to be converging on the glibc API.
Warning: glibc currently has a bug whereby (the C99 standard routine) fegetenv has the unintended side effect of masking all exception traps on x86, so you have to call fesetenv to restore them afterward. (Shows you how heavily anyone relies on this stuff...)
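A minimal sketch of the glibc routines described above, assuming a glibc-based system and a compiler (such as g++) that makes the GNU extensions in <fenv.h> visible. The wrapper names are illustrative. The sketch only toggles and queries the trap mask; it does not install a SIGFPE handler.

```cpp
#include <cassert>
#include <fenv.h>   // FE_INVALID, plus the glibc extensions feenableexcept,
                    // fedisableexcept and fegetexcept

// Unmask the "invalid operation" exception so that an operation producing a
// NaN (for example 0.0 / 0.0) delivers SIGFPE. Returns false if the request
// failed, which feenableexcept reports by returning -1.
inline bool enable_nan_trap() {
    return feenableexcept(FE_INVALID) != -1;
}

// Mask the exception again (the default state).
inline void disable_nan_trap() {
    fedisableexcept(FE_INVALID);
}

// Query whether the invalid-operation trap is currently unmasked.
inline bool nan_trap_enabled() {
    return (fegetexcept() & FE_INVALID) != 0;
}

// Small self-check of the toggle logic; no NaN is generated here.
inline void check_trap_toggle() {
    if (enable_nan_trap()) {
        assert(nan_trap_enabled());
    }
    disable_nan_trap();
    assert(!nan_trap_enabled());
}
```

In a debug build you might call enable_nan_trap() once at startup and run under a debugger, which will then stop at the instruction that produced the NaN.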
On many architectures, you can unmask the invalid exception, which will cause an interrupt when a NaN would ordinarily be generated by a computation such as 0*infinity. Running in the debugger, you will break on this interrupt and can examine the computation that led to that point. Outside of a debugger, you can install a trap handler to log information about the state of the computation that produced the invalid operation.
On x86, for example, you would clear the Invalid Operation Mask bit in the x87 FPU control word (bit 0) and in MXCSR (bit 7) to enable trapping for invalid operations from x87 and SSE operations, respectively.
Some individual platforms provide a means to write to these control registers from C, but there's no portable interface that works cross-platform.
Testing f!=f might give problems using g++ with -ffast-math optimization enabled: Checking if a double (or float) is NaN in C++
The only foolproof way is to check the bitpattern.
As to where to implement the checks, this really depends on the specifics of your calculation and on how frequent NaN errors are, i.e. the performance penalty of continuing a tainted calculation versus the cost of checking at certain stages.

How to store doubles in memory

Recently I changed some code
double d0, d1;
// ... assign things to d0/d1 ...
double result = f(d0, d1);
to
double d[2];
// ... assign things to d[0]/d[1]
double result = f(d[0], d[1]);
I did not change any of the assignments to d, nor the calculations in f, nor anything else apart from the fact that the doubles are now stored in a fixed-length array.
However when compiling in release mode, with optimizations on, result changed.
My question is, why, and what should I know about how I should store doubles? Is one way more efficient, or better, than the other? Are there memory alignment issues? I'm looking for any information that would help me understand what's going on.
EDIT: I will try to get some code demonstrating the problem, however this is quite hard as the process that these numbers go through is huge (a lot of maths, numerical solvers, etc.).
However there is no change when compiled in Debug. I will double check this again to make sure but this is almost certain, i.e. the double values are identical in Debug between version 1 and version 2.
Comparing Debug to Release, results have never ever been the same between the two compilation modes, for various optimization reasons.
You probably have a 'fast math' compiler switch turned on, or are doing something in the "assign things" (which we can't see) which allows the compiler to legally reorder calculations. Even though the sequences are equivalent, it's likely the optimizer is treating them differently, so you end up with slightly different code generation. If it's reordered, you end up with slight differences in the least significant bits. Such is life with floating point.
You can prevent this by not using 'fast math' (if that's turned on), or forcing ordering thru the way you construct the formulas and intermediate values. Even that's hard (impossible?) to guarantee. The question is really "Why is the compiler generating different code for arrays vs numbered variables?", but that's basically an analysis of the code generator.
No, these are equivalent; you have something else wrong.
Check the /fp:precise flag (or its equivalent). The processor's floating point hardware can run in a more accurate or a faster mode, and it may have a different default in an optimized build.
With regard to floating-point semantics, these are equivalent. However, it is conceivable that the compiler might decide to generate slightly different code sequences for the two, and that could result in differences in the result.
Can you post a complete code example that illustrates the difference? Without that to go on, anything anyone posts as an answer is just speculation.
To your concerns: memory alignment cannot affect the value of a double, and a compiler should be able to generate equivalent code for either example, so you don't need to worry that you're doing something wrong (at least, not in the limited example you posted).
The first way is more efficient, in a very theoretical way. It gives the compiler slightly more leeway in assigning stack slots and registers. In the second example, the compiler has to pick 2 consecutive slots - except of course if the compiler is smart enough to realize that you'd never notice.
It's quite possible that the double[2] causes the array to be allocated as two adjacent stack slots where it wasn't before, and that in turn can cause code reordering to improve memory access efficiency. IEEE 754 floating point math doesn't obey the regular math rules: addition is not associative, so a+b+c evaluated left to right can differ from c+b+a.
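The effect of evaluation order can be seen with a small sketch. The function names and input values are chosen for illustration, and it assumes the code is compiled without fast-math style options, so the compiler keeps the order as written.

```cpp
#include <cassert>

// Same three operands, added in two different orders.
inline double sum_abc(double a, double b, double c) { return (a + b) + c; }
inline double sum_cba(double a, double b, double c) { return (c + b) + a; }

// With a large a, its exact negation b, and a small c:
//   (a + b) + c : the large terms cancel first, so c survives.
//   (c + b) + a : c is absorbed when added to b, so the result is zero.
inline void check_order_matters() {
    const double a = 1e20, b = -1e20, c = 1.0;
    assert(sum_abc(a, b, c) == 1.0);
    assert(sum_cba(a, b, c) == 0.0);
}
```

If an optimizer is allowed to reorder such sums, which of the two results you get can change with unrelated edits to the code.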