comparing derived types in fortran - fortran

I was trying to compile a project which solves the Navier-Stokes on a sphere available here:
https://fms.gfdl.noaa.gov/gf/
the default compiler used is ifort, and I wanted to use gfortran, since I want to make it finally available to whoever wishes to use it.
at some points in the code, there are statements like
if (x == y)
,where x and y are both derived types (called domain1d/2d) containing integers, reals and pointers. gfortran complains saying that the comparison is between non numbers and quits.
I then downloaded a trial version of ifort and it compiled without issues.
So, I wanted to know whether this is some kind of ifort shorthand to actual comparison of each member of the type/structure (im more comfortable with the C terminology!) or whether im missing something fundamental, being new to fortran?
I understand comparing derived types makes little sense sometimes, but here it simply seems to be checking whether both carry the same information.
Thanks,
Joy

Richard W is correct. I encountered a similar problem with an atmospheric code from NOAA. This is a bug in older versions of GCC (it affected me in 4.47 and not in 4.8). For some reason, overloading .EQ. does not overload == and vice versa (if you look at the implementation of domain1D, it definitely overloads .EQ. and then == shows up somewhere else in the code). I solved the problem by ensuring that either .EQ. or == was used throughout.
To the best of my knowledge, .EQ. and == should be equivalent (I haven't looked that hard at the standard) which is why ifort (and in my case, the SGI Fortran compiler) did not encounter this problem.

Related

Error: Old-style type declaration REAL*16 not supported

I was given some legacy code to compile. Unfortunately I only have access to a f95 compiler and have 0 knowledge of Fortran. Some modules compiled but others I was getting this error:
Error: Old-style type declaration REAL*16 not supported at (1)
My plan is to at least try to fix this error and see what else happens. So here are my 2 questions.
How likely will it be that my code written for Fortran 75 is compatible in the Fortran 95 compiler? (In /usr/bin my compiler is f95 - so I assume it is Fortran 95)
How do I fix this error that I am getting? I tried googling it but cannot see to find a clear crisp answer.
The error you are seeing is due to an old declaration style that was frequent before Fortran 90, but never became standard. Thus, the compiler does not accept the (formally incorrect) code.
In the olden days before Fortran 90, you had only two types of real numbers: REAL and DOUBLE PRECISION. These types were platform dependent, but most compilers nowadays map them to the IEEE754 formats binary32 and binary64.
However, some machines offered different formats, often with extra precision. In order to make them accessible to Fortran code, the type REAL*n was invented, with n an integer literal from a set of compiler-dependent values. This syntax was never standard, so you cannot be sure of what it will mean to a given compiler without reading its documentation.
In the real world, most compilers that have not been asked to be strictly standards-compliant (with some option like -std=f95) will recognize at least REAL*4 and REAL*8, mapping them to the binary32/64 formats mentioned before, but everything else is completely platform dependent. Your compiler may have a REAL*10 type for the 80-bit arithmetic used by the x86 387 FPU, or a REAL*16 type for some 128-bit floating point math. However, it is important to stress that since the syntax is not standard, the meaning of that type could change from compiler to compiler.
Finally, in Fortran 90 a way to refer to different kinds of real and integer types was made standard. The new syntax is REAL(n) or REAL(kind=n) and is supported by all standard-compliant compilers. However, The values of n are still compiler-dependent, but the standard provides three ways to obtain specific, repeatable results:
The SELECTED_REAL_KIND function, which allows you to query the system for the value of n to specify if you want a real type with certain precision and range requirements. Usually, what you do is ask for it once and store the result in an INTEGER, PARAMETER variable that you use when declaring the real variables in question. For example, you would declare a type with at least 15 digits of precision (decimal) and exponent range of at least 100 like this:
INTEGER, PARAMETER :: rk = SELECTED_REAL_KIND(15, 100)
REAL(rk) :: v
In Fortran 2003 and onwards, the ISO_C_BINDING module contains a series of constants that are meant to give you types guaranteed to be equivalent to the C types of the same compiler family (e.g. gcc for gfortran, icc for ifort, etc.). They are called C_FLOAT, C_DOUBLE and C_LONG_DOUBLE. Thus, you could declare a variable equivalent to a C double as REAL(C_DOUBLE) :: d.
In Fortran 2008 and onwards, the ISO_FORTRAN_ENV module contains a different series of constants REAL32, REAL64 and REAL128 that will give you a floating point type of the appropriate width - if some platform does not support one of those types, the constant will be a negative number. Thus, you can declare a 128-bit float as REAL(real128) :: q.
Javier has given an excellent answer to your immediate problem. However I'd just briefly like to address "How likely will it be that my code written for Fortran 77 is compatible in the Fortran 95 compiler?", with the obvious typo corrected.
Fortran, if the programmer adheres to the standard, is amazingly backward compatible. With very, very few exceptions standard conforming Fortran 77 is Fortran 2008 conforming. The problem is that it appears in your case the original programmer has not adhered to the international standard: real*8 and similar is not, and has never been part of any such standard and the problems you are seeing are precisely why such forms should never be, and should never have been used. That said if the original programmer only made this one mistake it may well be that the rest of the code will be OK, however without the detail it is impossible to tell
TL;DR: International standards are important, stick to them!
When we are at guessing instead of requesting proper code and full details I will venture to say that the other two answers are not correct.
The error message
Error: Old-style type declaration REAL*16 not supported at (1)
DOES NOT mean that the REAL*n syntax is not supported.
The error message is misleading. It actually means that the 16-byte reals are no supported. Had the OP requested the same real by the kind notation (in any of the many ways which return the gfortran's kind 16) the error message would be:
Error: Kind 16 not supported for type REAL at (1)
That can happen in some gfortran versions. Especially in MS Windows.
This explanation can be found with just a very quick google search for the error message: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56850#c1
It does not not complain that the old syntax is the problem, it just mentions it, because mentioning kinds might be confusing even more (especially for COMPLEX*32 which is kind 16).
The main message is: We really should be closing these kinds of questions and wait for proper details instead of guessing and upvoting the question where the OP can't even copy the full message the compiler has spit.
If you are using GNU gfortran, try "real(kind=10)", which will give you 10 bytes (80-bits) extended precision. I'm converting some older F77 code to run floating point math tests - it has a quad-precision ( the real*16 you mention) defined, but not used, as the other answers provided correctly point out (extended precision formats tend to be machine/compiler specific). The gfortran 5.2 I am using does not support real(kind=16), surprisingly. But gfortran (available for 64-bit MacOS and 64-bit Linux machines), does have the "real(kind=10)" which will give you precision beyond the typical real*8 or "double precision" as it was called in some Fortran compilers.
Be careful if your old Fortran code is calling C programs and/or maybe making assumptions about how precision is handled and floating point numbers are represented. You may have to get deep into exactly what is happening, and check the code carefully, to ensure things are working as expected, especially if fortran and C routines are calling each other. Here is url for the GNU gfortran info on quad-precision: https://people.sc.fsu.edu/~jburkardt/f77_src/gfortran_quadmath/gfortran_quadmath.html
Hope this helps.

Fortran : Initialize all variables to a specific default value

I am working on a ~40 years old Fortran spaghetti code with lots of variables that are implicitly declared. So there is not a simple way to even know what variables exist in the code in order to initialize their values. Now, is there a way to tell the compiler (for example Intel Fortran) to initialize all variables in the code to a specific default value (e.g., -999) other than zero or a very large number, as provided by Intel compiler?
gfortran provides some options for this. Integers can be intialized with -finit-integer=n where n is an integer. For real numbers you can use -finit-real=<zero|inf|-inf|nan|snan>. Together with -ffpe-trap=denormal this can be very helpful, to get uninitialized reals.
You probably want :
ifort -check uninit
Note per the man page this only checks scalars
Also, based on some quick testing it is a pretty weak check. It doesn't catch this simple thing for example:
program test
call f(i)
end
subroutine f(j)
write(*,*)j
end
returns 0 ..
I suppose its better than nothing though.

Convert FORTRAN DEC UNION/MAP extensions to anything else

Edit: Gfortran 6 now supports these extensions :)
I have some old f77 code that extensively uses UNIONs and MAPs. I need to compile this using gfortran, which does not support these extensions. I have figured out how to convert all non-supported extensions except for these and I am at a loss. I have had several thoughts on possible approaches, but haven't been able to successfully implement anything. I need for the existing UDTs to be accessed in the same way that they currently are; I can reimplement the UDTs but their interfaces must not change.
Example of what I have:
TYPE TEST
UNION
MAP
INTEGER*4 test1
INTEGER*4 test2
END MAP
MAP
INTEGER*8 test3
END MAP
END UNION
END TYPE
Access to the elements has to be available in the following manners: TEST%test1, TEST%test2, TEST%test3
My thoughts thusfar:
Replace somehow with fortran EQUIVALENCE.
Define the structs in C/C++ and somehow make them visible to the FORTRAN code (doubt that this is possible)
I imagine that there must have been lots of refactoring of f77 to f90/95 when the UNION and MAP were excluded from the standard. How if at all was/is this handled?
EDIT: The accepted answer has a workaround to allow memory overlap, but as far as preserving the API, it is not possible.
UNION and MAP were never part of any FORTRAN standard, they are vendor extensions. (See, e.g., http://fortranwiki.org/fortran/show/Modernizing+Old+Fortran). So they weren't really excluded from the Fortran 90/95 standard. They cause variables to overlap in memory. If the code actually uses this feature, then you will need to use equivalence. The preferred way to move data between variables of different types without conversion is the transfer intrinsic, but to you that you would have to identify every place where a conversion is necessary, while with equivalence it is taking place implicitly. Of course, that makes the code less understandable. If the memory overlays are just to save space and the equivalence of the variables is not used, then you could get rid of this "feature". If the code is like your example, with small integers, then I'd guess that the memory overlay is being used. If the overlays are large arrays, it might have been done to conserve memory. If these declarations were also creating new types, you could use user defined types, which are definitely part of Fortran >=90.
If the code is using memory equivalence of variables of different types, this might not be portable, e.g., the internal representation of integers and reals are probably different between the machine on which this code originally ran and the current machine. Or perhaps the variables are just being used to store bits. There is a lot to figure out.
P.S. In response to the question in the comment, here is a code sample. But .... to be clear ... I do not think that using equivalence is good coding pratice. With the compiler options that I normally use with gfortran to debug code, gfortran rejects this code. With looser options, gfortran will compile it. So will ifort.
module my_types
use ISO_FORTRAN_ENV
type test_p1_type
sequence
integer (int32) :: int1
integer (int32) :: int2
end type test_p1_type
type test_p2_type
sequence
integer (int64) :: int3
end type test_p2_type
end module my_types
program test
use my_types
type (test_p1_type) :: test_p1
type (test_p2_type) :: test_p2
equivalence (test_p1, test_p2)
test_p1 % int1 = 2
test_p1 % int1 = 4
write (*, *) test_p1 % int1, test_p1 % int2, test_p2 % int3
end program test
The question is whether the union was used to save space or to have alternative representations of the same data. If you are porting, see how it is used. Maybe, because the space was limited, it was written in a way where the variables had to be shared. Nowadays with larger amounts of memory, maybe this is not necessary and the union may not be required. In which case, it is just two separate types
For those just wanting to compile the code with these extensions: Gfortran now supports UNION, MAP and STRUCTURE in version 6. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56226

What do I do about a FORTRAN intrinsic that was not part of the standard?

I'm trying to get a legacy FORTRAN code working by building it from source using gfortran. I have finally been able to build it successfully, but now I'm getting an out-of-bounds error when it runs. I used gdb and traced the error to a function that uses the loc() intrinsic. When I try to print the value of loc(ae), with ae being my integer value being passed, I get the error "No symbol "loc" in current context." I tried compiling with ifort 11.x and debugged with DDT and got the same error. To me, this means that the compiler knows nothing of the intrinsic.
A little reading revealed that the loc intrinsic wasn't part of the F77 standard, so maybe that's part of the problem. I posted the definition of the intrinsic below, but I don't know how I can implement that into my code so loc() can be used.
Any advice or am I misinterpreting my problem? Because both gfortran and ifort crash in the same place due to an out of bounds error, but the function utilizing loc() returns the same large number between both compilers. It seems a bit strange that loc() wouldn't be working if both compilers shoot back the same value for loc.
Usage:
iaddr = loc(obj)
Where:
obj
is a variable, array, function or subroutine whose address is wanted.
iaddr
is an integer with the address of "obj". The address is in the same
format as stored by an LARn
instruction.
Description:
LOC is used to obtain the address of
something. The value returned is not
really useful within Fortran, but may
be needed for GMAP subroutines, or
very special debugging.
Well, no, the fact that it compiles means that loc is known by the compiler; the fact that gdb doesn't know about it just means it's just not known by the debugger (which probably doesn't know the matmult intrinsic, either).
loc is a widely-available non-standard extension. I hate those. If you want something standard that should work everywhere, c_loc, which is part of the C<->Fortran interoperability standard in Fortran2003, is something you could use. It returns a pointer that can be passed to C routines.
How is the value from the loc call being used?
Gfortran loc seems to work a bit differently with arrays to that of some other compilers. If you are using it to eg check for array copies or such then it can be better to do loc of the first element loc(obj(1,1)) or similar. This is equivalent to what loc does I think with intel, but in gfortran it gives instead some other address (so two arrays which share exactly the same memory layout have different loc results).

GCC: program doesn't work with compilation option -O3

I'm writing a C++ program that doesn't work (I get a segmentation fault) when I compile it with optimizations (options -O1, -O2, -O3, etc.), but it works just fine when I compile it without optimizations.
Is there any chance that the error is in my code? or should I assume that this is a bug in GCC?
My GCC version is 3.4.6.
Is there any known workaround for this kind of problem?
There is a big difference in speed between the optimized and unoptimized version of my program, so I really need to use optimizations.
This is my original functor. The one that works fine with no levels of optimizations and throws a segmentation fault with any level of optimization:
struct distanceToPointSort{
indexedDocument* point ;
distanceToPointSort(indexedDocument* p): point(p) {}
bool operator() (indexedDocument* p1,indexedDocument* p2){
return distance(point,p1) < distance(point,p2) ;
}
} ;
And this one works flawlessly with any level of optimization:
struct distanceToPointSort{
indexedDocument* point ;
distanceToPointSort(indexedDocument* p): point(p) {}
bool operator() (indexedDocument* p1,indexedDocument* p2){
float d1=distance(point,p1) ;
float d2=distance(point,p2) ;
std::cout << "" ; //without this line, I get a segmentation fault anyways
return d1 < d2 ;
}
} ;
Unfortunately, this problem is hard to reproduce because it happens with some specific values. I get the segmentation fault upon sorting just one out of more than a thousand vectors, so it really depends on the specific combination of values each vector has.
Now that you posted the code fragment and a working workaround was found (#Windows programmer's answer), I can say that perhaps what you are looking for is -ffloat-store.
-ffloat-store
Do not store floating point variables in registers, and inhibit other options that might change whether a floating point value is taken from a register or memory.
This option prevents undesirable excess precision on machines such as the 68000 where the floating registers (of the 68881) keep more precision than a double is supposed to have. Similarly for the x86 architecture. For most programs, the excess precision does only good, but a few programs rely on the precise definition of IEEE floating point. Use -ffloat-store for such programs, after modifying them to store all pertinent intermediate computations into variables.
Source: http://gcc.gnu.org/onlinedocs/gcc-3.4.6/gcc/Optimize-Options.html
I would assume your code is wrong first.
Though it is hard to tell.
Does your code compile with 0 warnings?
g++ -Wall -Wextra -pedantic -ansi
Here's some code that seems to work, until you hit -O3...
#include <stdio.h>
int main()
{
int i = 0, j = 1, k = 2;
printf("%d %d %d\n", *(&j-1), *(&j), *(&j+1));
return 0;
}
Without optimisations, I get "2 1 0"; with optimisations I get "40 1 2293680". Why? Because i and k got optimised out!
But I was taking the address of j and going out of the memory region allocated to j. That's not allowed by the standard. It's most likely that your problem is caused by a similar deviation from the standard.
I find valgrind is often helpful at times like these.
EDIT: Some commenters are under the impression that the standard allows arbitrary pointer arithmetic. It does not. Remember that some architectures have funny addressing schemes, alignment may be important, and you may get problems if you overflow certain registers!
The words of the [draft] standard, on adding/subtracting an integer to/from a pointer (emphasis added):
"If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined."
Seeing as &j doesn't even point to an array object, &j-1 and &j+1 can hardly point to part of the same array object. So simply evaluating &j+1 (let alone dereferencing it) is undefined behaviour.
On x86 we can be pretty confident that adding one to a pointer is fairly safe and just takes us to the next memory location. In the code above, the problem occurs when we make assumptions about what that memory contains, which of course the standard doesn't go near.
As an experiment, try to see if this will force the compiler to round everything consistently.
volatile float d1=distance(point,p1) ;
volatile float d2=distance(point,p2) ;
return d1 < d2 ;
The error is in your code. It's likely you're doing something that invokes undefined behavior according to the C standard which just happens to work with no optimizations, but when GCC makes certain assumptions for performing its optimizations, the code breaks when those assumptions aren't true. Make sure to compile with the -Wall option, and the -Wextra might also be a good idea, and see if you get any warnings. You could also try -ansi or -pedantic, but those are likely to result in false positives.
You may be running into an aliasing problem (or it could be a million other things). Look up the -fstrict-aliasing option.
This kind of question is impossible to answer properly without more information.
It is very seldom the compiler fault, but compiler do have bugs in them, and them often manifest themselves at different optimization levels (if there is a bug in an optimization pass, for example).
In general when reporting programming problems: provide a minimal code sample to demonstrate the issue, such that people can just save the code to a file, compile and run it. Make it as easy as possible to reproduce your problem.
Also, try different versions of GCC (compiling your own GCC is very easy, especially on Linux). If possible, try with another compiler. Intel C has a compiler which is more or less GCC compatible (and free for non-commercial use, I think). This will help pinpointing the problem.
It's almost (almost) never the compiler.
First, make sure you're compiling warning-free, with -Wall.
If that didn't give you a "eureka" moment, attach a debugger to the least optimized version of your executable that crashes and see what it's doing and where it goes.
5 will get you 10 that you've fixed the problem by this point.
Ran into the same problem a few days ago, in my case it was aliasing. And GCC does it differently, but not wrongly, when compared to other compilers. GCC has become what some might call a rules-lawyer of the C++ standard, and their implementation is correct, but you also have to be really correct in you C++, or it'll over optimize somethings, which is a pain. But you get speed, so can't complain.
I expect to get some downvotes here after reading some of the comments, but in the console game programming world, it's rather common knowledge that the higher optimization levels can sometimes generate incorrect code in weird edge cases. It might very well be that edge cases can be fixed with subtle changes to the code, though.
Alright...
This is one of the weirdest problems I've ever had.
I dont think I have enough proof to state it's a GCC bug, but honestly... It really looks like one.
This is my original functor. The one that works fine with no levels of optimizations and throws a segmentation fault with any level of optimization:
struct distanceToPointSort{
indexedDocument* point ;
distanceToPointSort(indexedDocument* p): point(p) {}
bool operator() (indexedDocument* p1,indexedDocument* p2){
return distance(point,p1) < distance(point,p2) ;
}
} ;
And this one works flawlessly with any level of optimization:
struct distanceToPointSort{
indexedDocument* point ;
distanceToPointSort(indexedDocument* p): point(p) {}
bool operator() (indexedDocument* p1,indexedDocument* p2){
float d1=distance(point,p1) ;
float d2=distance(point,p2) ;
std::cout << "" ; //without this line, I get a segmentation fault anyways
return d1 < d2 ;
}
} ;
Unfortunately, this problem is hard to reproduce because it happens with some specific values. I get the segmentation fault upon sorting just one out of more than a thousand vectors, so it really depends on the specific combination of values each vector has.
Wow, I didn't expect answers so quicly, and so many...
The error occurs upon sorting a std::vector of pointers using std::sort()
I provide the strict-weak-ordering functor.
But I know the functor I provide is correct because I've used it a lot and it works fine.
Plus, the error cannot be some invalid pointer in the vector becasue the error occurs just when I sort the vector. If I iterate through the vector without applying std::sort first, the program works fine.
I just used GDB to try to find out what's going on. The error occurs when std::sort invoke my functor. Aparently std::sort is passing an invalid pointer to my functor. (of course this happens with the optimized version only, any level of optimization -O, -O2, -O3)
as other have pointed out, probably strict aliasing.
turn it of in o3 and try again. My guess is that you are doing some pointer tricks in your functor (fast float as int compare? object type in lower 2 bits?) that fail across inlining template functions.
warnings do not help to catch this case. "if the compiler could detect all strict aliasing problems it could just as well avoid them" just changing an unrelated line of code may make the problem appear or go away as it changes register allocation.
As the updated question will show ;) , the problem exists with a std::vector<T*>. One common error with vectors is reserve()ing what should have been resize()d. As a result, you'd be writing outside array bounds. An optimizer may discard those writes.
post the code in distance! it probably does some pointer magic, see my previous post. doing an intermediate assignment just hides the bug in your code by changing register allocation. even more telling of this is the output changing things!
The true answer is hidden somewhere inside all the comments in this thread. First of all: it is not a bug in the compiler.
The problem has to do with floating point precision. distanceToPointSort should be a function that should never return true for both the arguments (a,b) and (b,a), but that is exactly what can happen when the compiler decides to use higher precision for some data paths. The problem is especially likely on, but by no means limited to, x86 without -mfpmath=sse. If the comparator behaves that way, the sort function can become confused, and the segmentation fault is not surprising.
I consider -ffloat-store the best solution here (already suggested by CesarB).