Trapping floating point exceptions with LAPACK - fortran

I have a program that uses LAPACK, or optionally compiles a subset of LAPACK. I would like to enable gfortran's -ffpe-trap=... (or similar with other compilers) to trap floating point exceptions and help me catch bugs and errors. However, LAPACK does not like this (https://github.com/Reference-LAPACK/lapack-release/blob/lapack-3.7.1/INSTALL/make.inc.gfortran):
# Note: During a regular execution, LAPACK might create NaN and Inf
# and handle these quantities appropriately. As a consequence, one
# should not compile LAPACK with flags such as -ffpe-trap=overflow.
I thought I could compile LAPACK without the flag and then my program with it, but it seems the flags used for the main program are what count (the trap mask is process-wide state, enabled at program startup), and I still get exceptions from LAPACK code. I've tried both static and dynamic linking.
Is there some way I can use -ffpe-trap=... in my program but "disable" it for code inside LAPACK calls?

Related

C/FORTRAN set double underflow to zero

I have a legacy FORTRAN project with some very intense computations. I want this math code to be accessed by C/C++ code, so I built a FORTRAN dll, imported it in C/C++ and started to receive floating-point underflows from my FORTRAN dll.
At the same time, the FORTRAN dll code executes fine if I call it from a FORTRAN application.
Finally, I found out that the compiler I use (FTN95 integrated into VS2013) has an option (/UNDERFLOW). If this flag is not specified, all underflows are converted to zeroes by default; that is what happens in the FORTRAN app. When I use C code to execute methods from this dll, I receive underflows.
So, the question is: is there any way to force VC++ compiler to convert underflows to zeroes?
P.S.: yes, I understand that it is stupid to rely on a code that throws floating-point exceptions all the way. However, this code is old and at this point it is impossible to completely rewrite it using up-to-date techniques.
So, the problem was with the FTN95 compiler. The above-mentioned flag (/UNDERFLOW) seems to be useful only when one builds an application; its effect is ignored when the target output is a DLL. Instead, I found a compiler directive that is accessed through a call to the MASK_UNDERFLOW#() subroutine. After inserting an explicit call to this subroutine in the FORTRAN function that was throwing underflows and recompiling the DLL, I managed to successfully launch a C program and perform the necessary computations using functions from the FORTRAN DLLs. Also, the /fp:except- VC++ compiler flag was used to ensure that no other underflows would affect execution of the C program.
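On x86 with SSE, the same flush-underflows-to-zero behavior can also be forced from the C side of the boundary by setting the FTZ bit in the MXCSR register. A sketch using the SSE intrinsics (x86-specific, and an alternative to the compiler options above, not what the original code used):

```c
#include <xmmintrin.h>   /* SSE control/status intrinsics (x86 only) */

/* Flush subnormal results to zero for all subsequent SSE FP
   operations in this thread -- roughly the behavior FTN95's
   default underflow handling provides. */
void enable_flush_to_zero(void) {
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
}
```

Note that this is per-thread state, so it must be set on whichever thread ends up calling into the DLL.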

What happens to floating point numbers in the absence of an FPU?

If you are programming with the C language for a microprocessor that does not have an FPU, does the compiler signal errors when floating point literals and keywords are encountered (0.75, float, double, etc)?
Also, what happens if the result of an expression is fractional?
I understand that there are software libraries that are used so you can do floating-point math, but I am specifically wondering what the results will be if you did not use one.
Thanks.
A C implementation is required to implement the types float and double, and arithmetic expressions involving them. So if the compiler knows that the target architecture doesn't have floating-point ops then it must bring in a software library to do it. The compiler is allowed to link against an external library, it's also allowed to implement floating point ops in software by itself as intrinsics, but it must somehow generate code to get it done.
If it doesn't do so [*] then it is not a conforming C implementation, so strictly speaking you're not "programming with the C language". You're programming with whatever your compiler docs tell you is available instead.
You'd hope that code involving float or double types will either fail to compile (because the compiler knows you're in a non-conforming mode and tells you) or else fail to link (because the compiler emits calls to emulation routines in the library, but the library is missing). But you're on your own as far as C is concerned, if you use something that isn't C.
I don't know the exact details (how old do I look?), but I imagine that back in the day if you took some code compiled for x87 then you might be able to link and load it on a system using an x86 with no FPU. Then the CPU would complain about an illegal instruction when you tried to execute it -- quite possibly the system would hang depending what OS you were running. So the worst possible case is pretty bad.
what happens if the result of an expression is fractional?
The actual result of an expression won't matter, because the expression itself was either performed with integer operations (in which case the result is not fractional) or else with floating-point operations (in which case the problem arises before you even find out the result).
[*] or if you fail to specify the options to make it do so ;-)
Floating-point is a required part of the C language, according to the C standard. If the target hardware does not have floating-point instructions, then a C implementation must provide floating-point operations in some other way, such as by emulating them in software. (All calculations are just functions of bits. If you have elementary operations for manipulating bits and performing tests and branches, then you can compute any function that a general computer can.)
A compiler could provide a subset of C without floating-point, but then it would not be a standard-compliant C compiler.
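To make the "just functions of bits" point concrete, here is a small illustrative sketch that decodes an IEEE-754 single-precision number using nothing but integer operations -- the starting point of any software floating-point implementation (the function name is made up for illustration):

```c
#include <stdint.h>
#include <string.h>

/* Split an IEEE-754 single into its sign bit, 8-bit biased
   exponent, and 23-bit fraction, using only integer operations. */
void decode_f32(float f, uint32_t *sign, uint32_t *exp, uint32_t *frac) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* reinterpret without UB */
    *sign = bits >> 31;
    *exp  = (bits >> 23) & 0xFF;      /* biased exponent */
    *frac = bits & 0x7FFFFF;          /* 23-bit fraction */
}
```

A soft-float library builds its add/multiply routines on exactly this kind of field extraction, plus integer shifts and adds.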
Software floating point can take two forms:
a compiler may generate calls to built-in floating point functions directly - for example the operation 1.2 * 2.5 may invoke (for example) fmul( 1.2, 2.5 ),
alternatively for architectures that support an FPU, but for which some device variants may omit it, it is common to use FPU emulation. When an FP instruction is encountered an invalid instruction exception will occur and the exception handler will vector to code that emulates the instruction.
FPU emulation has the advantage that when the same code is executed on a device with a real FPU, the FPU will be used automatically and accelerate execution. However, without an FPU, emulation usually carries a small overhead compared with a direct software implementation, so if the application is never expected to run on an FPU, emulation is best avoided if the compiler provides the option.
Software floating point is very much slower than hardware-supported floating point. Use of fixed-point techniques can improve performance with acceptable precision in many cases.
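As an illustration of the fixed-point alternative, here is a minimal Q16.16 sketch in C (the format choice and helper names are assumptions for illustration, not from the original answer):

```c
#include <stdint.h>

typedef int32_t fix16;                 /* Q16.16: 16 integer, 16 fractional bits */
#define FIX_ONE ((fix16)1 << 16)

static fix16 fix_from_int(int32_t x) { return (fix16)(x << 16); }

/* Multiply in a 64-bit intermediate so the product doesn't
   overflow, then shift back down to Q16.16. */
static fix16 fix_mul(fix16 a, fix16 b) {
    return (fix16)(((int64_t)a * b) >> 16);
}
```

Everything here is plain integer arithmetic, so it runs at full speed on an FPU-less core; the trade-off is a fixed range and precision instead of floating point's dynamic range.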
Typically, such a microprocessor comes along either with a driver package or even with a complete BSP (board support package, consisting of drivers and an OS linked together), both of which contain FP library routines.
The compiler replaces every floating-point operation with an equivalent function call. This should be taken into consideration, especially when invoking such operations iteratively (inside a for/while loop), since the function-call overhead adds up and can prevent optimizations such as loop unrolling.
The result of not including the required libraries within the project would be linkage errors.

What is the best way to use openmp with multiple subroutines in Fortran

I have a program written in Fortran with more than 100 subroutines, around 30 of which contain OpenMP code. I was wondering what the best procedure is to compile these subroutines. When I compiled all the files at once, I found that the OpenMP-compiled code runs even slower than the one without OpenMP. Should I compile the subroutines with OpenMP directives separately? What is the best practice under these conditions?
Thank you so much.
Best Regards,
Jdbaba
OpenMP-aware compilers look for the OpenMP sentinels (in Fortran, the $omp sign after a comment symbol at the beginning of the line). Therefore, sources without OpenMP code compiled with an OpenMP-aware compiler should result in the exact same, or very close, object files (and executable).
Edit: One should note that as stated by Hristo Iliev below, enabling OpenMP could affect the serial code, for example by using OpenMP versions of libraries that may differ in algorithm (to be more effective in parallel) and optimizations.
Most likely, the problem here is more related to your code algorithms.
Or perhaps you did not compile with the same optimization flags when comparing OpenMP and non-OpenMP versions.

Intermediate Code as a result of OpenMP pragmas

Is there a way to get my hands on the intermediate source code produced by the OpenMP pragmas?
I would like to see how each kind of pragma is translated.
Cheers.
OpenMP pragmas are part of a C/C++ compiler's implementation. Therefore, before using them, you need to ensure that your compiler supports them! If they are not supported, they are silently ignored, so you may get no errors at compilation, but multi-threading won't work. In any case, as mentioned above, since they are part of the compiler's implementation, the best intermediate result you can get is lower-level code (GCC, for instance, can dump its internal representation after OpenMP expansion with -fdump-tree-ompexp). OpenMP is a language extension plus libraries, macros, etc., as opposed to Pthreads, which arms you purely with libraries!
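As a rough picture of what the lowering does, here is a hand-written C sketch of how a compiler might outline the body of a "#pragma omp parallel for" into a function run by a team of threads. The names and the static chunking are purely illustrative; real compilers emit calls into their own runtime (e.g. GCC's libgomp) rather than raw pthreads.

```c
#include <pthread.h>

#define N 1000
#define NTHREADS 4

static double a[N];

struct chunk { int lo, hi; };

/* The loop body, outlined into a function each thread runs
   on its own contiguous chunk of the iteration space. */
static void *outlined_body(void *arg) {
    struct chunk *c = arg;
    for (int i = c->lo; i < c->hi; i++)
        a[i] = 2.0 * i;               /* original loop body */
    return 0;
}

/* What the parallel-for construct roughly expands to:
   fork a team, hand each thread a chunk, join at the end. */
static void parallel_for(void) {
    pthread_t th[NTHREADS];
    struct chunk c[NTHREADS];
    for (int t = 0; t < NTHREADS; t++) {
        c[t].lo = t * N / NTHREADS;
        c[t].hi = (t + 1) * N / NTHREADS;
        pthread_create(&th[t], 0, outlined_body, &c[t]);
    }
    for (int t = 0; t < NTHREADS; t++)
        pthread_join(th[t], 0);       /* implicit barrier at region end */
}
```

The join at the end corresponds to the implicit barrier OpenMP places at the close of a parallel region.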

Fortran: differences between generated code compiled using two different compilers

I have to work on a fortran program, which used to be compiled using Microsoft Compaq Visual Fortran 6.6. I would prefer to work with gfortran but I have met lots of problems.
The main problem is that the generated binaries have different behaviours. My program takes an input file and then has to generate an output file. But sometimes, when using the binary compiled by gfortran, it crashes before its end, or gives different numerical results.
This is a program written by researchers which uses a lot of floating-point numbers.
So my question is: what are the differences between these two compilers which could lead to this kind of problem?
edit:
My program computes the values of some parameters and there are numerous iterations. At the beginning, everything goes well. After several iterations, some NaN values appear (only when compiled by gfortran).
edit:
Thank you, everybody, for your answers.
So I used the Intel compiler, which helped me by giving some useful error messages.
The origin of my problems is that some variables were not initialized properly. It looks like, when compiling with Compaq Visual Fortran, these variables automatically take 0 as a value, whereas with gfortran (and Intel) they take random values, which explains some numerical differences that add up over the following iterations.
So now the solution is a better understanding of the program to correct these missing initializations.
There can be several reasons for such behaviour.
What I would do is:
Switch off any optimization
Switch on all debug options. If you have access to e.g. intel compiler, use ifort -CB -CU -debug -traceback. If you have to stick to gfortran, use valgrind, its output is somewhat less human-readable, but it's often better than nothing.
Make sure there are no implicitly typed variables; use implicit none in all the modules and all the code blocks.
Use consistent float types. I personally always use real*8 as the only float type in my codes. If you are using external libraries, you might need to change call signatures for some routines (e.g., BLAS has different routine names for single and double precision variables).
If you are lucky, it's just some variable that doesn't get initialized properly, and you'll catch it with one of these techniques. Otherwise, as M.S.B. was suggesting, a deeper understanding of what the program really does is necessary. And, yes, it might be needed to just check the algorithm manually starting from the point where you say 'some NaN values appear'.
Different compilers can emit different instructions for the same source code. If a numerical calculation is on the boundary of working, one set of instructions might work, and another not. Most compilers have options to use more conservative floating point arithmetic, versus optimizations for speed -- I suggest checking the compiler options that you are using for the available options. More fundamentally this problem -- particularly that the compilers agree for several iterations but then diverge -- may be a sign that the numerical approach of the program is borderline. A simplistic solution is to increase the precision of the calculations, e.g., from single to double. Perhaps also tweak parameters, such as a step size or similar parameter. Better would be to gain a deeper understanding of the algorithm and possibly make a more fundamental change.
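The single-versus-double point can be demonstrated with a small C sketch (illustrative; the term value and iteration count are arbitrary): naively accumulating the same small term drifts far from the exact answer in single precision while staying close in double precision, the kind of borderline numerics where two compilers' instruction choices can make results diverge.

```c
#include <math.h>

/* Accumulate 'term' n times in single precision;
   rounding error compounds as the sum grows. */
float sum_single(int n, float term) {
    float s = 0.0f;
    for (int i = 0; i < n; i++) s += term;
    return s;
}

/* The same naive accumulation in double precision. */
double sum_double(int n, double term) {
    double s = 0.0;
    for (int i = 0; i < n; i++) s += term;
    return s;
}
```

Adding 0.1 ten million times should give exactly 1e6; the single-precision sum misses by a large margin, the double-precision sum by a tiny one.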
I don't know about the crash, but some differences in the results of numerical code on an Intel machine can be due to one compiler using 80-bit doubles and the other 64-bit doubles, if not for variables then perhaps for temporary values. Moreover, floating-point computation is sensitive to the order in which elementary operations are performed, and different compilers may generate different sequences of operations.
Differences in different type implementations, differences in various non-Standard vendor extensions, could be a lot of things.
Here are just some of the language features that differ (look at gfortran and Intel). Programs written to the Fortran standard work the same on every compiler, but a lot of people don't know which are standard language features and which are vendor extensions, and so use them ... and when compiled with a different compiler, troubles arise.
If you post the code somewhere I could take a quick look at it; otherwise, like this, 'tis hard to say for certain.