Maximum number of gfortran continuation lines

I need a little help: according to this site, there is a limit on the maximum number of continuation lines.
So I decided to test this. I wrote a jumbo FUNCTION that computes a colossal algebraic formula in a single statement split into 17,146 continuation lines.
!test.f90 1.6 MB file
DOUBLE COMPLEX FUNCTION myfunction(a, b)
DOUBLE COMPLEX, INTENT(IN) :: a
DOUBLE COMPLEX, INTENT(IN) :: b
myfunction = (gd0/16.d0)*a*b*((a+b)**2)*((32.d0*DCONJG(f(4)))+(12&
&8.d0*DCONJG(f(11)))+(160.d0*DCONJG(f(24)))+(64.d0*DCONJG(f(46)))+(32.d0*DCONJG(f(3)))+(256.d0*DCON&
&JG(f(10)))+(480.d0*DCONJG(f(23)))+(256.d0*DCONJG(f(45)))+(32.d0*DCONJG(g(9)))+(128.d0*DCONJG(f(&
&9)))+(480.d0*DCONJG(f(22)))+(384.d0*DCONJG(f(44)))+(96.d0*DCONJG(g(21)))+(160.d0*DCONJG(f(21)))+(2&
&56.d0*DCONJG(f(43)))+(64.d0*DCONJG(g(42)))+(64.d0*DCONJG(f(42)))+(64.d0*DCONJG(f(8)))+(192.d0*DCON&
& (64.d0*DCONJG(g(42)) ! and so on and so forth...
END FUNCTION
I compiled this abomination with gfortran -c test.f90, and it produced an 11.4 MB test.o file after 5 minutes, without any errors or warnings. I ran it, and it returned correct results.
Why isn't gfortran observing the maximum continuation lines rule?

Why isn't gfortran observing the maximum continuation lines rule?
This rule is not a constraint: the standard only sets a minimum for the maximum number of continuation lines a compiler must accept, so the compiler is allowed to provide a larger limit as an extension. A compiler may report code that exceeds the standard's minimum, but it is not required to, and the diagnostic may need to be enabled with a flag.
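For example, one would expect a strict-conformance compile to flag the excess, since Fortran 95 only guarantees 39 continuation lines and Fortran 2003/2008 guarantee 255 (the exact diagnostic depends on the gfortran version):
gfortran -std=f95 -pedantic -c test.f90   # should warn that the F95 continuation-line limit is exceeded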

Is a matrix automatically deallocated at the end? [duplicate]

I am interested in the difference between alloc_array and automatic_array in the following extract:
subroutine mysub(n)
integer, intent(in) :: n
integer :: automatic_array(n)
integer, allocatable :: alloc_array(:)
allocate(alloc_array(n))
...[code]...
I am familiar enough with the basics of allocation (not so much with advanced techniques) to know that allocation allows you to change the size of the array in the middle of the code (as pointed out in this question), but I'm interested in the case where you don't need to change the size of the array: the arrays might be passed to other subroutines for operation, but the only purpose of both variables, in this code and in any subroutine, is to hold the data of an array of dimension n (and maybe change the data, but not the size).
(1) Is there any difference in memory usage? I am not an expert in low-level procedures, but I have a slight knowledge of how they matter and how they can impact higher-level programming (the kind of experience I'm talking about: once, trying to run a big code in Fortran, I was getting an error I didn't understand, and the sysadmin told me "oh, yeah, you are probably saturating the stack; try adding this line in your running script"; anything that gives me insight into how to consider these things when actually coding, instead of having to patch them later, is welcome). I've been told it might depend on many other things, like the compiler or the architecture, but I interpreted from those responses that the people answering were not completely sure exactly how. Is it so absolutely dependent on a multitude of factors, or is there a default/intended behavior that may then be overridden by optional compiler flags or system preferences?
(2) Would the subroutines have different interface needs? Again, not an expert, but it has happened to me before that, because of the way I declared a subroutine's variables, I ended up having to put the subroutine in a module. I've been given to understand this may vary depending on whether I use features that are specific to allocatable variables. I am thinking of the case in which everything I do with the variables can be done with both allocatables and automatics, not intentionally using anything specific to allocatables (other than allocation before use, that is).
Finally, in case this is of use: the reason I am asking is that we are developing in a group, and we have recently noticed that different people use these two declarations in different ways; we needed to determine whether this can be left to personal preference or whether there might be reasons to set a clear criterion (and how to set it). I don't need extremely detailed answers; I am trying to determine whether this is something I should research carefully in order to be careful about how we use it, and in what direction that research should go.
Though I would be interested to know of "interesting tricks" that can be done with allocation but are not directly related to the need for size variability, I am leaving those for a possible future follow-up question and focusing here on the strictly functional differences (meaning: what I am explicitly telling compilers to do with my code). The two items above are the ones I could come up with from previous experience, but if there is any other important one that I am missing and should consider, please do mention it.
Because gfortran or ifort + Linux (x86_64) are among the most popular combinations used for HPC, I made some performance comparisons between local allocatable and automatic arrays for these combinations. The CPU used is a Xeon E5-2650 v2 @ 2.60 GHz, and the compilers are gfortran 4.8.2 and ifort 14.0. The test program is the following.
In test.f90:
!------------------------------------------------------------------------
subroutine use_automatic( n )
integer :: n
integer :: a( n ) !! local automatic array (with unknown size at compile-time)
integer :: i
do i = 1, n
a( i ) = i
enddo
call sub( a )
end
!------------------------------------------------------------------------
subroutine use_alloc( n )
integer :: n
integer, allocatable :: a( : ) !! local allocatable array
integer :: i
allocate( a( n ) )
do i = 1, n
a( i ) = i
enddo
call sub( a )
deallocate( a ) !! not necessary for modern Fortran but for clarity
end
!------------------------------------------------------------------------
program main
implicit none
integer :: i, nsizemax, nsize, nloop, foo
common /dummy/ foo
nloop = 10**7
nsizemax = 10
do i = 1, nloop
nsize = mod( i, nsizemax ) + 1
call use_automatic( nsize )
! call use_alloc( nsize )
enddo
print *, "foo = ", foo !! to check if sub() is really called
end
In sub.f90:
!------------------------------------------------------------------------
subroutine sub( a )
integer a( * )
integer foo
common /dummy/ foo
foo = a( 1 )
end
In the above program, I tried to prevent compiler optimization from eliminating a(:) itself (i.e., reducing the loops to a no-op) by placing sub() in a different file and keeping its interface implicit. First, I compiled the program using gfortran as
gfortran -O3 test.f90 sub.f90
and tested different values of nsizemax while keeping nloop = 10^7. The result is in the following table (time is in sec, measured several times by the time command).
nsizemax use_automatic() use_alloc()
10 0.30 0.31 # average result
50 0.48 0.47
500 1.0 0.90
5000 4.3 4.2
100000 75.6 75.7
So the overall timing seems almost the same for the two calls when -O3 is used (but see the Edit below for different options). Next, I compiled with ifort as
[O3] ifort -O3 test.f90 sub.f90
or
[O3h] ifort -O3 -heap-arrays test.f90 sub.f90
In the former case the automatic array is stored on the stack, while with -heap-arrays attached it is stored on the heap. The obtained result is
nsizemax use_automatic()   use_alloc()
         [O3]    [O3h]     [O3]    [O3h]
10 0.064 0.39 0.48 0.48
50 0.094 0.56 0.65 0.66
500 0.45 1.03 1.12 1.12
5000 3.8 4.4 4.4 4.4
100000 74.5 75.3 76.5 75.5
So for ifort, the use of automatic arrays seems beneficial when relatively small arrays are mainly used. On the other hand, gfortran -O3 shows no difference because both arrays are treated the same way (see the Edit below for more details).
Additional comparison:
Below is the result for the Oracle Fortran compiler 12.4 for Linux (used with f90 -O3). The overall trend seems similar: automatic arrays are faster for small n, indicating internal use of the stack.
nsizemax use_automatic() use_alloc()
10 0.16 0.45
50 0.17 0.62
500 0.37 0.97
5000 2.04 2.67
100000 65.6 65.7
Edit
Thanks to Vladimir's comment, it has turned out that gfortran -O3 puts automatic arrays (whose size is unknown at compile time) on the heap. This explains why use_automatic() and use_alloc() made no difference above. So I made another comparison between different options below:
[O3] gfortran -O3
[O5] gfortran -O5
[O3s] gfortran -O3 -fstack-arrays
[Of] gfortran -Ofast # this includes -fstack-arrays
Here, -fstack-arrays means that the compiler puts all local arrays with unknown size on the stack. Note that this flag is enabled by default with -Ofast. The obtained result is
nsizemax use_automatic() use_alloc()
[Of] [O3s] [O5] [O3] [Of] [O3s] [O5] [O3]
10 0.087 0.087 0.29 0.29 0.29 0.29 0.29 0.29
50 0.15 0.15 0.43 0.43 0.45 0.44 0.44 0.45
500 0.57 0.56 0.84 0.84 0.92 0.92 0.92 0.92
5000 3.9 3.9 4.1 4.1 4.2 4.2 4.2 4.2
100000 75.1 75.0 75.6 75.6 75.6 75.3 75.7 76.0
where the average of ten measurements is shown. This table demonstrates that if -fstack-arrays is included, the execution time for small n becomes shorter. This trend is consistent with the results obtained for ifort above.
It should be mentioned, however, that the above comparison probably corresponds to a "best-case" scenario that highlights the difference between them, so the timing difference can be much smaller in practice. For example, I have compared the timing for the above options using some other program (involving both small and large arrays), and the results were not much affected by the stack options. The result should also depend on machine architecture as well as compilers, of course. So your mileage may vary.
For the sake of clarity, I'll briefly mention terminology. The two arrays are both local variables and arrays of rank 1.
alloc_array is an allocatable array;
automatic_array is an explicit-shape automatic object.
Being local variables, their scope is that of the procedure. Automatic arrays and unsaved allocatable arrays come to an end when execution of the procedure completes (with the allocatable array being deallocated); automatic objects cannot be saved, and saved allocatable objects are not deallocated on completion of execution.
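A small illustration of that distinction (a sketch; the names are made up):
subroutine demo(n)
integer, intent(in) :: n
integer :: automatic_array(n)                 ! ceases to exist when the procedure returns
integer, allocatable, save :: saved_array(:)  ! persists between calls; never auto-deallocated
if (.not. allocated(saved_array)) allocate(saved_array(n))
end subroutine demo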
Again, as in the linked question, after the allocation statement both arrays are of size n. These are still two very different things. Of course, the allocatable array can have its allocation status changed and its allocation moved. I'll leave both of those (mostly) out of the scope of this answer. An allocatable array, of course, doesn't have to have these things changed once it's been allocated.
Memory usage
What was partly contentious about a previous revision of the question is how ill-defined the concept of memory usage is. Fortran, as a language definition, tells us that both arrays come to be the same size, that they have the same storage layout, and that both are contiguous. Beyond that, much falls under two terms you'll hear a lot: implementation specific and processor dependent.
In a comment you expressed interest in ifort. So that I don't wander too far, I'll stick to that one compiler. Other compilers have similar concepts, albeit with different names and options.
Often, ifort will place automatic objects and array temporaries on the stack. There is a (default) compiler option -no-heap-arrays, described as having the effect:
The compiler puts automatic arrays and temporary arrays in the stack storage area.
Using the alternative option -heap-arrays allows one to control that slightly:
This option puts automatic arrays and arrays created for temporary computations on the heap instead of the stack.
There is a possibility to control the size threshold for which heap or stack is chosen (when the size is known at compile time):
If the compiler cannot determine the size at compile time, it always puts the automatic array on the heap.
As n isn't a constant, one would expect automatic_array to be on the heap with this option, regardless of the size specified. To determine the size, n, of the array at compile time, the compiler would potentially need to do quite a bit of code analysis, even if it is possible.
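For instance (the threshold is in kilobytes, per the option's documentation), something like
ifort -O3 -heap-arrays 10 test.f90 sub.f90
would put automatic arrays of 10 KB or larger on the heap when their size is known at compile time; as quoted above, arrays whose size is unknown at compile time always go on the heap with this option.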
There's probably more to be said, but this answer would be far too long if I tried. One thing to note, however, is that automatic local objects and (post-Fortran 90) allocatable local objects can be expected not to leak memory.
Interface needs
There is nothing special about the interface requirements of the subroutine mysub: local variables have no impact on that. Any program unit calling that would be happy with an implicit interface. What you are asking about is how the two local arrays can be used.
This largely comes down to what uses the two arrays can be put to.
If the dummy argument of a second procedure has the allocatable attribute then only the allocatable array here can be passed to that procedure. It will also need to have an explicit interface. This is true whether or not the procedure changes the allocation.
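A hedged sketch (the names are made up) of such a procedure, placed in a module so that the explicit interface comes for free; only alloc_array, not automatic_array, could be passed as arr:
module alloc_demo_mod
contains
subroutine maybe_grow(arr, n)
integer, allocatable, intent(inout) :: arr(:)  ! allocatable dummy: explicit interface required
integer, intent(in) :: n
if (.not. allocated(arr)) allocate(arr(n))     ! a code path that may change the allocation
end subroutine maybe_grow
end module alloc_demo_mod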
Of course, both arrays could be passed as arguments to a dummy argument without the allocatable attribute and then we don't have different interface requirements.
Anyway, why would one want to pass an argument to an allocatable dummy when there will be no change in allocation status, etc.? There are good reasons:
there may be a code path in the procedure which does have an allocation change (controlled by a switch, say);
allocatable dummy arguments also pass their bounds;
and so on.
The second reason is more obvious if the subroutine had the specification
subroutine mysub(n)
integer, intent(in) :: n
integer :: automatic_array(2:n+1)
integer, allocatable :: alloc_array(:)
allocate(alloc_array(2:n+1))
Finally, an automatic object has quite strict conditions on its size: n here is clearly allowed, but things don't have to get much more complicated before allocation is the only plausible way, depending on how much one wants to play with block constructs.
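For reference, a block construct (Fortran 2008) lets one create an automatic object partway through a procedure, once a suitable size is in hand; a minimal sketch:
block
integer :: scratch(2*n+1)   ! automatic object that exists only until the matching end block
scratch = 0
! ... work with scratch ...
end block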
Taking also a comment from IanH: if we have a very large n, the automatic object is likely to lead to crash-and-burn. With the allocatable, one could use the stat= option to come to some amicable agreement with the compiler's runtime.
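A minimal sketch of that stat= approach (the recovery itself is illustrative):
integer :: ierr
allocate(alloc_array(n), stat=ierr)
if (ierr /= 0) then
print *, 'allocation of', n, 'elements failed, stat =', ierr
! recover gracefully (smaller n, a different algorithm, ...) instead of crashing
end if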

When to use iso_fortran_env, selected_int_kind, real(8), or -fdefault-real-8 for writing or compiling Fortran code?

I have this simple code which uses DGEMM routine for matrix multiplication
program check
implicit none
real(8),dimension(2,2)::A,B,C
A(1,1)=4.5
A(1,2)=4.5
A(2,1)=4.5
A(2,2)=4.5
B(1,1)=2.5
B(1,2)=2.5
B(2,1)=2.5
B(2,2)=2.5
c=0.0
call DGEMM('n','n',2,2,2,1.00,A,2,B,2,0.00,C,2)
print *,C(1,1)
print *,C(1,2)
print *,C(2,1)
print *,C(2,2)
end program check
Now when I compile this code with the command
gfortran -o check check.f90 -lblas
I get some random garbage values. But when I add
-fdefault-real-8
to the compiling options I get correct values.
But that is not a good way of declaring variables in Fortran, so I used the iso_fortran_env intrinsic module and added two lines to the code:
use iso_fortran_env
real(kind=real32),dimension(2,2)::A,B,C
and compiled with
gfortran -o check check.f90 -lblas
Again I got wrong output.
Where am I erring in this code?
I'm on 32-bit Linux and using GCC.
DGEMM expects double precision values for ALPHA and BETA.
Without further options, you are feeding single precision floats to the BLAS routine - hence the garbage.
Using -fdefault-real-8 you force every default real to be double precision, and DGEMM is fed correctly.
In your case, the call should be:
call DGEMM('n','n',2,2,2,1.00_8,A,2,B,2,0.00_8,C,2)
which specifies the value of alpha to be 1 as a real of kind 8, and that of beta to be zero of kind 8.
If you want to perform the matrix-matrix product in single precision, use SGEMM.
Note that kind=8 is highly compiler-specific; you should consider using REAL32/REAL64 from the ISO_Fortran_env module instead (also for the declaration of A, B, and C).
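A hedged rewrite of the example along those lines (keeping the original layout; only the kinds and the DGEMM literals change):
program check
use, intrinsic :: iso_fortran_env, only: real64
implicit none
real(real64), dimension(2,2) :: A, B, C
A = 4.5_real64   ! sets all four elements
B = 2.5_real64
C = 0.0_real64
! alpha and beta are now genuine double precision literals
call DGEMM('n','n',2,2,2,1.0_real64,A,2,B,2,0.0_real64,C,2)
print *, C   ! every element should be 22.5
end program check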

fortran77 to fortran90 differences in output

I have downloaded the following fortran program dragon.f at http://www.iamg.org/documents/oldftp/VOL32/v32-10-11.zip
I need to make a minor modification to the program, which requires the program to be translated to Fortran 90 (see below to confirm whether this is truly needed).
I have managed to do this (translation only) by three different methods:
1. replacing comment indicators (c for !) and line continuation indicators (* in column 6 for & at the end of the continued line)
2. using convert.f90 (see https://wwwasdoc.web.cern.ch/wwwasdoc/WWW/f90/convert.f90)
3. using f2f.pl (see https://bitbucket.org/lemonlab/f2f/downloads)
Methods 1) and 3) worked (i.e., the program compiled), while 2) didn't work straight away.
However, after testing the program I found that the results are different.
With the Fortran 77 program, I get the "expected" results for the example provided with the program (the program comes with example data "grdata.txt" and its example output "flm.txt" and "check.txt"). However, after running the translated (Fortran 90) program, the results I get are different.
I suspect there are some issues with the way some variables are declared.
Can you give me recommendations on how to properly translate this program so that I get exactly the same results?
The reason I need to do this in Fortran 90 is that I need to input the parameters via a text file instead of modifying the program. This shouldn't be an issue for most of the parameters involved, except for the declarations on the last line, whose sizes are determined from parameters that the program does not know a priori (see below):
implicit double precision(a-h,o-z)
parameter(lmax=90,imax=45,jmax=30)
parameter(dcta=4.0d0,dfai=4.0d0)
parameter(thetaa=0.d0,thetab=180.d0,phaia=0.d0,phaib=120.d0)
dimension f(0:imax,0:jmax),coe(imax,jmax,4),coew(4),fw(4)
So, for example, I will read lmax, imax, jmax, dcta, dfai, thetaa, thetab, phaia, and phaib, and the program needs to declare f and coe; but as far as I could find after googling this issue, they cannot be declared with an unknown size in Fortran 77.
Edit: This was my attempt to do this modification:
character fname1*100
call getarg(1,fname1)
open(10,file=fname1)
read(10,*)lmax,imax,jmax,dcta,dfai,thetaa,thetab,phaia,phaib
close(10)
So the program will read these constants from a file (e.g. params.txt), whose name is supplied as an argument when invoking the program. The problem when I do this is that I do not know how to modify the line
dimension f(0:imax,0:jmax)...
in order to declare this array when the values of imax and jmax are not known when compiling the program (they depend on the size of the data that the user will use).
As has been pointed out in the comments above, parameters cannot be read from a file, since they are fixed at compile time. Read the values into ordinary integer variables, declare the arrays as allocatable, and then allocate:
integer imax,jmax
real(8), allocatable :: f(:,:),coe(:,:,:)
read(10,*) imax,jmax
allocate(f(0:imax,0:jmax),coe(imax,jmax,4))
I found out that the differences in the results were due to using different compilers.
PS: I ended up adding a lot more code than I intended at the beginning, to allow reading data from netCDF files. This program in particular is really helpful for spherical harmonic expansion.

Taylor Series Expansion for exp(x)/sin(x) in Fortran

I tried to write a Taylor series expansion for exp(x)/sin(x) using Fortran, but when I tested my implementation for small numbers (N=3 and X=1.0) and added the terms manually, the results did not match what I expected. By hand I calculated 4.444..., but with the program I found 7.54113. Could you please check my code and tell me if I got anything wrong?
Here is the expansion formula for e^x/sin(x) in wolframalpha: http://www.wolframalpha.com/input/?i=e%5Ex%2Fsin%28x%29
PROGRAM Taylor
IMPLICIT NONE
INTEGER ::Count1,Count2,N=3
REAL:: X=1.0,Sum=0.0
COMPLEX ::i=(0.0,0.1)
INTEGER:: FACT
DO Count1=1,N,1
DO Count2=0,N,1
Sum=Sum+EXP(i*X*(-1+2*Count1))*(X**Count2)/FACT(Count2)
END DO
END DO
PRINT*,Sum
END PROGRAM Taylor
INTEGER FUNCTION FACT(n)
IMPLICIT NONE
INTEGER, INTENT(IN) :: n
INTEGER :: i, Ans
Ans = 1
DO i = 1, n
Ans = Ans * i
END DO
FACT = Ans
END FUNCTION FACT
I don't see any complex terms in that expansion by Wolfram, so I'd wonder why you think you need the complex number in the exponential term. And you can't get that 1/x term the way you've programmed it. You need an x**(-1.0) term somewhere.
Your factorial implementation is rather naive, too.
I'd recommend that you forget about the loop and factorials and start with a polynomial, its coefficients, and Horner's method for evaluation. Get that working, then see if you can sort out the loop.
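A minimal sketch of Horner's method in Fortran (the coefficients here are placeholders, not the actual series coefficients):
! Evaluates p(x) = c(0) + c(1)*x + ... + c(n)*x**n by Horner's method
REAL FUNCTION Horner(n, c, x)
IMPLICIT NONE
INTEGER, INTENT(IN) :: n
REAL, INTENT(IN) :: c(0:n)   ! coefficients, lowest order first
REAL, INTENT(IN) :: x
INTEGER :: k
Horner = c(n)
DO k = n-1, 0, -1
Horner = Horner*x + c(k)
END DO
END FUNCTION Horner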
The Wolfram article states the expansion formula using q = e**(ix), so there is a complex term. Therefore Sum should be declared complex.
As already stated, the factorial function is simplistic. Be careful of overflow.
It is best to place your procedures into a module and "use" that module from the main program. Use as many compiler debugging options as possible. For example, gfortran, when the appropriate warning option is used, warns about the type of "sum": "Warning: Possible change of value in conversion from COMPLEX(4) to REAL(4)". If you are using gfortran try: -O2 -fimplicit-none -Wall -Wline-truncation -Wcharacter-truncation -Wsurprising -Waliasing -Wimplicit-interface -Wunused-parameter -fwhole-file -fcheck=all -std=f2008 -pedantic -fbacktrace
Since you can do this problem by hand, try outputting each step with a write statement and comparing to your hand calculation. You will probably quickly see where the calculation diverges. If it isn't clear why the calculation is different, break it down into pieces.

FFTW: Trouble with real-to-complex and complex-to-real 2D transforms

As the title states, I'm using FFTW (version 3.2.2) with Fortran 90/95 to perform a 2D FFT of real data (actually a field of random numbers). I think the forward step is working (at least I am getting some output). However, I wanted to check everything by doing the IFFT to see if I can reconstruct the original input. Unfortunately, when I call the complex-to-real routine, nothing happens and I obtain no error output, so I'm a bit confused. Here are some code snippets:
implicit none
include "fftw3.f"
! - im=501, jm=401, and lm=60
real*8 :: u(im,jm,lm),recov(im,jm,lm)
complex*8 :: cu(1+im/2,jm)
integer*8 :: planf,planb
real*8 :: dv
! - Generate array of random numbers
dv=4.0
call random_number(u)
u=u*dv
recov=0.0
k=30
! - Forward step (FFT)
call dfftw_plan_dft_r2c_2d(planf,im,jm,u(:,:,k),cu,FFTW_ESTIMATE)
call dfftw_execute_dft_r2c(planf,u(:,:,k),cu)
call dfftw_destroy_plan(planf)
! - Backward step (IFFT)
call dfftw_plan_dft_c2r_2d(planb,im,jm,cu,recov(:,:,k),FFTW_ESTIMATE)
call dfftw_execute_dft_c2r(planb,cu,recov(:,:,k))
call dfftw_destroy_plan(planb)
The forward (r2c) step above seems to work, but the backward step does not. I checked this by differencing the u and recov arrays - the difference was not zero. Additionally, the max and min values of the recov array were both zero, which seems to indicate that nothing was changed.
I've looked around the FFTW documentation and based my implementation on the following page: http://www.fftw.org/fftw3_doc/Fortran-Examples.html#Fortran-Examples. I am wondering if the problem is related to indexing; at least, that's the direction I am leaning. Anyway, if anyone could offer some help, that would be wonderful!
Thanks!
Not sure if this is the root of all troubles here, but the way you declare variables may be the culprit.
For most compilers (this is not even standard Fortran), complex*8 is an old syntax for single precision: the complex variable occupies a total of 8 bytes, shared between the real and the imaginary parts (4+4 bytes).
[Edit 1, following Vladimir F's comment on my answer; see his link for details:] In my experience (i.e., the systems/compilers I have used), complex(kind=8) corresponds to the declaration of a double precision complex number (a real and an imaginary part, each occupying 8 bytes).
On any system/compiler, Complex(Kind=Kind(0.d0)) should declare a double precision complex.
In short, your complex array does not have the right size. Replace occurrences of real*8 and complex*8 with real(kind=8) and complex(kind=8) (or complex(kind=kind(0.d0)) for better portability), respectively.
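A hedged sketch of the corrected declarations (keeping the question's array shapes):
integer, parameter :: dp = kind(0.d0)   ! double precision kind, portable
real(dp) :: u(im,jm,lm), recov(im,jm,lm)
complex(dp) :: cu(1+im/2,jm)   ! now 16 bytes per element, matching what the r2c/c2r plans expect
integer*8 :: planf, planb      ! FFTW plans stay 8-byte integers, as before
real(dp) :: dv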