I just started to learn Fortran and I found that some intrinsic functions look mysterious to me.
One of them is ALLOCATE: ALLOCATE(array(-5:4, 1:10)). If I wanted to write a similar function, what would it look like? What would its argument be, and what type would it have? It's not obvious to me what array(-5:4, 1:10) even is: array is not yet allocated, so what does this expression mean, and what is its type?! What would the difference be between array(10) and array(-5:4, 1:10) as types? Is it some hidden, preallocated "meta-object" with an internal attribute like "dimension"? At least it does not look like an array pointer in C.
The next example of a mysterious function is PACK: pack(m, m /= 0). At first I thought that m /= 0 is like a function pointer, i.e. a lambda, as in Python pack(m, lambda el: el != 0) or in Haskell pack m (\el -> el /= 0). But then I read somewhere on the Web that it's not a lambda but a list of booleans, one per element of m. That would make it very inefficient code: it eats a lot of memory if m is big! So I cannot understand how these intrinsic functions work; moreover, I have the feeling that a user cannot write such functions, that they are coded in C and not in Fortran itself. Is that true? How were they written?!
Allocate is not a library function, as pointed out by @dave_thompson_085. Not only that, there are also some actual intrinsic functions that cannot be written by the user, like min(), max(), and transfer(). They are not just "library" functions, they are "intrinsic", part of the core language, and hence can do things that user code cannot. They are written in whatever language was used to write the compiler, mostly C, but they could probably also be implemented in Fortran - just not as a normal Fortran function, but as a feature the compiler inserts.
When it comes to functions that accept a mask, like PACK (and there are many others that accept one as an optional argument), the mask is a logical array. The compiler is free to apply optimizations that avoid actually allocating such an array, but those optimizations are not guaranteed. A call like this is not just a library function called in a straightforward way; the compiler can insert any code that does what the function is supposed to do.
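Nothing about the mask part needs compiler magic, though. Here is a minimal sketch of a user-written PACK-like function (the module and function names, and the integer element type, are my own choices for illustration). It would be called as my_pack(m, m /= 0), where m /= 0 evaluates to a logical array of the same shape as m.

module pack_sketch
   implicit none
contains
   ! Collect the elements of m for which the corresponding mask element
   ! is .true., preserving their order.
   pure function my_pack(m, mask) result(res)
      integer, intent(in) :: m(:)
      logical, intent(in) :: mask(:)      ! same shape as m, e.g. m /= 0
      integer, allocatable :: res(:)
      integer :: i, n
      allocate(res(count(mask)))
      n = 0
      do i = 1, size(m)
         if (mask(i)) then
            n = n + 1
            res(n) = m(i)
         end if
      end do
   end function my_pack
end module pack_sketch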
I write code in modern Fortran. For some reason, I want to modify it so that it is compatible with an older version of the standard; converting from the latest version down to Fortran 95 is what I am after. I have trouble with two intrinsic procedures, move_alloc and norm2.
I want to know: are there equivalent intrinsics for them in Fortran 95? Or are there external functions that do exactly the same job?
You can easily implement norm2() yourself based on the definition. Some care must be taken if your numbers are so large that overflow is an issue. But the simplest version is as simple as:
norm2 = sqrt(sum(A**2))
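A slightly fuller Fortran 95-style sketch (the name my_norm2 is mine; it ignores the overflow concern mentioned above) might look like this; put it in a module or give it an explicit interface because of the assumed-shape dummy argument:

! Euclidean norm of a 1-D real array.  No scaling is done, so it can
! overflow if the elements are very large.
function my_norm2(A) result(r)
   implicit none
   real, intent(in) :: A(:)
   real :: r
   r = sqrt(sum(A**2))
end function my_norm2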
There is no equivalent of move_alloc() in Fortran 95. You may need to use pointers instead of allocatable variables. You could implement your own version in C, but that would require many features from Fortran 2003-2018, so it makes little sense for you.
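For illustration, with pointers a "move" can be emulated by pointer assignment followed by nullifying the source (a minimal sketch; the variable names are mine):

real, pointer :: from(:), to(:)

allocate(from(1000))
! ... fill from ...
to => from      ! to now refers to the same storage; nothing is copied
nullify(from)   ! emulates the "move": from no longer refers to that storage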
You can consider reallocating your arrays yourself and copying the data instead of doing move_alloc():
if (allocated(B)) deallocate(B)
allocate(B(lbound(A,1):ubound(A,1)))   ! give B the same bounds as A (rank-1 case)
B(:) = A                               ! copy the data
deallocate(A)
However, it is not the same as move_alloc(): here the data is physically copied.
I am working on a ~40-year-old Fortran spaghetti code with lots of variables that are implicitly declared. So there is no simple way to even know what variables exist in the code in order to initialize their values. Now, is there a way to tell the compiler (for example Intel Fortran) to initialize all variables in the code to a specific default value (e.g., -999), other than the zero or very large number provided by the Intel compiler?
gfortran provides some options for this. Integers can be initialized with -finit-integer=n, where n is an integer. For real numbers you can use -finit-real=<zero|inf|-inf|nan|snan>. Together with -ffpe-trap=denormal this can be very helpful for catching uninitialized reals.
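For example, a hypothetical compile line (old_code.f is a placeholder; adding invalid to the trap list is my addition, so that any use of the signalling NaN actually traps):

gfortran -finit-integer=-999 -finit-real=snan -ffpe-trap=invalid,denormal old_code.f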
You probably want:
ifort -check uninit
Note that, per the man page, this only checks scalars.
Also, based on some quick testing, it is a pretty weak check. It doesn't even catch this simple case, for example:
program test
   call f(i)        ! i is implicitly typed and never initialized
end

subroutine f(j)
   write(*,*) j
end
It just prints 0, with no diagnostic about the uninitialized i.
I suppose it's better than nothing though.
I am asking this question with reference to this SO question.
Accepted answer by Don Stewart: the first line says "Your code is highly polymorphic change all float vars to Double .." and that alone gives a 4X performance improvement.
I am interested in doing matrix computations in Haskell. Should I make a habit of writing highly monomorphic code?
But some languages (read: C++ or D) make good use of ad-hoc polymorphism to generate fast code, so why won't or can't GHC?
Why can't we have something like blitz++ or Eigen for Haskell? I don't understand how typeclasses and (ad-hoc) polymorphism work in GHC.
With polymorphic code, there is usually a tradeoff between code size and code speed. Either you produce a separate version of the same code for each type that it will operate on, which results in larger code, or you produce a single version that can operate on multiple types, which will be slower.
C++ implementations of templates choose in favor of increasing code speed at the cost of increasing code size. By default, GHC takes the opposite tradeoff. However, it is possible to get GHC to produce separate versions for different types using the SPECIALIZE and INLINABLE pragmas. This will result in polymorphic code that has speed similar to monomorphic code.
I want to supplement Dirk's answer by saying that INLINABLE is usually recommended over SPECIALIZE. An INLINABLE annotation on a function guarantees that the module exports the original source code of the function so that it can be specialized at the point of usage. This usually removes the need to provide separate SPECIALIZE pragmas for every single use case.
Unlike INLINE, INLINABLE does not change GHC's optimization heuristics. It just says "Please export the source code".
I don't understand how typeclasses work in GHC.
OK, consider this function:
linear :: Num x => x -> x -> x -> x
linear a b x = a*x + b
This takes three numbers as input, and returns a number as output. This function accepts any number type; it is polymorphic. How does GHC implement that? Well, essentially the compiler creates a "class dictionary" which holds all the class methods inside it (in this case +, -, *, etc.). This dictionary becomes an extra, hidden argument to the function. Something like this:
data NumDict x =
  NumDict
    { method_add      :: x -> x -> x,
      method_subtract :: x -> x -> x,
      method_multiply :: x -> x -> x,
      ...
    }

linear :: NumDict x -> x -> x -> x -> x
linear dict a b x = method_add dict (method_multiply dict a x) b
Whenever you call the function, the compiler automatically inserts the correct dictionary - unless the calling function is also polymorphic, in which case it will have received a dictionary itself, so just pass that along.
In truth, functions that lack polymorphism are typically faster not so much because of a lack of function look-ups, but because knowing the types allows additional optimisations to be done. For example, our polymorphic linear function will work on numbers, vectors, matrices, ratios, complex numbers, anything. If the compiler knows that we want to use it on, say, Double, all the operations become single machine-code instructions, all the operands can be passed in processor registers, and so on. All of which results in fantastically efficient code. Even if it's complex numbers with Double components, we can make it nice and efficient. If we have no idea what type we'll get, we can't do any of those optimisations... That's where most of the speed difference typically comes from.
For a tiny function like linear, it's highly likely it will be inlined every time it's called, resulting in no polymorphism overhead and a small amount of code duplication - rather like a C++ template. For a larger, more complex polymorphic function, there may be some cost. In general, the compiler decides this, not you - unless you want to start sprinkling pragmas around the place. ;-) Or, if you don't actually use any polymorphism, you can just give everything monomorphic type signatures...
Edit: Gfortran 6 now supports these extensions :)
I have some old f77 code that extensively uses UNIONs and MAPs. I need to compile this using gfortran, which does not support these extensions. I have figured out how to convert all non-supported extensions except for these and I am at a loss. I have had several thoughts on possible approaches, but haven't been able to successfully implement anything. I need for the existing UDTs to be accessed in the same way that they currently are; I can reimplement the UDTs but their interfaces must not change.
Example of what I have:
TYPE TEST
  UNION
    MAP
      INTEGER*4 test1
      INTEGER*4 test2
    END MAP
    MAP
      INTEGER*8 test3
    END MAP
  END UNION
END TYPE
Access to the elements has to be available in the following manners: TEST%test1, TEST%test2, TEST%test3
My thoughts thus far:
Replace them somehow with Fortran EQUIVALENCE.
Define the structs in C/C++ and somehow make them visible to the Fortran code (I doubt that this is possible).
I imagine that there must have been lots of refactoring of f77 to f90/95 code when UNION and MAP were excluded from the standard. How, if at all, was/is this handled?
EDIT: The accepted answer has a workaround that allows the memory overlap, but preserving the API exactly is not possible.
UNION and MAP were never part of any FORTRAN standard; they are vendor extensions (see, e.g., http://fortranwiki.org/fortran/show/Modernizing+Old+Fortran). So they weren't really excluded from the Fortran 90/95 standard. They cause variables to overlap in memory.

If the code actually uses this overlap, then you will need to use equivalence. The preferred way to move data between variables of different types without conversion is the transfer intrinsic, but to use that you would have to identify every place where a conversion is necessary, while with equivalence it takes place implicitly. Of course, that also makes the code less understandable.

If the memory overlays are just there to save space and the equivalence of the variables is not actually used, then you could get rid of this "feature". If the code is like your example, with small integers, then I'd guess that the memory overlay is being used; if the overlays are large arrays, it might have been done to conserve memory. If these declarations were also creating new types, you could use user-defined types, which are definitely part of Fortran >= 90.
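For reference, a minimal sketch of the transfer intrinsic (the program name and the values are made up for illustration): it copies a bit pattern between objects of different types without any numeric conversion.

program demo_transfer
   use ISO_FORTRAN_ENV, only: int32, int64
   implicit none
   integer(int32) :: pair(2)
   integer(int64) :: whole
   pair = (/ 1, 2 /)
   whole = transfer(pair, whole)   ! copies the bit pattern of pair into whole
   print *, whole
end program demo_transfer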
If the code is using memory equivalence of variables of different types, this might not be portable; e.g., the internal representations of integers and reals probably differ between the machine on which this code originally ran and the current machine. Or perhaps the variables are just being used to store bits. There is a lot to figure out.
P.S. In response to the question in the comment, here is a code sample. But .... to be clear ... I do not think that using equivalence is good coding practice. With the compiler options that I normally use with gfortran to debug code, gfortran rejects this code; with looser options, gfortran will compile it. So will ifort.
module my_types
   use ISO_FORTRAN_ENV
   implicit none

   type test_p1_type
      sequence
      integer (int32) :: int1
      integer (int32) :: int2
   end type test_p1_type

   type test_p2_type
      sequence
      integer (int64) :: int3
   end type test_p2_type

end module my_types

program test
   use my_types
   implicit none
   type (test_p1_type) :: test_p1
   type (test_p2_type) :: test_p2
   equivalence (test_p1, test_p2)   ! the two objects share the same 8 bytes of storage
   test_p1 % int1 = 2
   test_p1 % int2 = 4
   write (*, *) test_p1 % int1, test_p1 % int2, test_p2 % int3
end program test
The question is whether the union was used to save space or to provide alternative representations of the same data. If you are porting, see how it is used. Maybe, because memory was limited, it was written in a way where the variables had to share storage. Nowadays, with larger amounts of memory, maybe this is no longer necessary and the union may not be required at all, in which case it is just two separate types.
For those just wanting to compile the code with these extensions: Gfortran now supports UNION, MAP and STRUCTURE in version 6. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56226
I'm new to Fortran and just doing some simple things for work. As a new programmer in general, I'm not sure exactly how this works, so excuse me if my explanation or notation is not the best. At the top of the .F file there are COMMON declarations. The person explaining it to me said to think of them like a struct in C, and that they are global. Also, in that same .F file, the variable is declared with a type. So it's something like:
COMMON SOMEVAR
INTEGER*2 SOMEVAR
And then when I actually see it being used in some other file, they declare local variables, (e.g. SOMEVAR_LOCAL) and depending on the condition, they set SOMEVAR_LOCAL = 1 or 0.
Then there is another conditional later down the line that will say something like
IF (SOMEVAR_LOCAL .EQ. 1) THEN
   SOMEVAR(PARAM) = 1
END IF
(Again, I apologize if this is not proper Fortran, but I don't have access to the code right now.) So it seems to me that there is a "struct"-like variable called SOMEVAR that is of some length (2 bytes of data?), and there is a local variable used as a flag so that, later down the line, the global "struct" SOMEVAR can be set to that value. But because there is (PARAM), it's like an array in that particular instance? Thanks. Sorry for my bad explanation, but hopefully you will understand what I am asking.
Just to amplify something @MSB already mentioned: COMMON blocks tell the compiler how to lay variables out in memory. There is almost no reason to use them with modern Fortran, i.e. with any compiler that can cope with Fortran 90 or later, and there are good reasons to avoid them.
And to add one thing: in modern Fortran you can do approximately what C structs do with user defined types. Check your documentation for TYPE.
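As a minimal, purely illustrative sketch (the type and component names are made up):

type :: point
   real :: x, y
   integer :: id
end type point

type(point) :: p

p = point(1.0, 2.0, 7)   ! built with the structure constructor
p%id = 8                 ! components are accessed with %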
The first declaration has SOMEVAR as a scalar integer of two bytes. The usage you show has SOMEVAR as an array, based on it being indexed. This is possible to do in Fortran via "sequence association", but it is poor practice. In one file you could declare SOMEVAR as INTEGER*2, and two bytes are allocated to this scalar. In another file you could declare it as INTEGER*1 SOMEVAR(2), and two bytes are again reserved, this time for an array of two elements of one byte each. Using the same common block in both files causes these two variables to overlap, byte by byte; that is sequence association (see the sketch below). Many years ago, when memory was very small, programmers did this to reduce memory usage, knowing that different subroutines were using the variables at different times. The reasons to do this today are very, very limited. Mostly one shouldn't, because it is liable to be confusing.
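A minimal sketch of that kind of overlap (the subroutine names and the named common block /blk/ are made up; this only illustrates the mechanism and is not recommended practice):

subroutine sub_a()
   integer*2 somevar
   common /blk/ somevar          ! two bytes, seen as one scalar
   somevar = 258                 ! bit pattern Z'0102'
end subroutine sub_a

subroutine sub_b()
   integer*1 bytes(2)
   common /blk/ bytes            ! the same two bytes, seen as two one-byte elements
   print *, bytes(1), bytes(2)   ! prints the individual bytes; order depends on endianness
end subroutine sub_b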
You can also set up sequence association with the EQUIVALENCE statement. Again, best avoided. The modern replacement, for the times when one must do the "tricky" things that used to need EQUIVALENCE, is the TRANSFER function.