Apparently, Fortran compilers limit the number of continuation lines allowed in a statement. I have a temporary pathological case (made for quick testing purposes) where I need to initialize a huge array without opening files or resorting to any trickery: the data just has to go in as literals. The array is quite large (360000 entries).
How can I set the compiler's limit to unlimited, or what alternative strategy can I use to host this array initialization?
You could assign them in batches using implicit DO loops, up to the continuation limit imposed by your compiler:
REAL :: xarray(360000)
DATA (xarray(i), i=1,100) /1.0, 2.0, 3.0, 4.0, 5.0, 6.0, &
7.0, 8.0, &
...
98.0, 99.0, 100.0 /
DATA (xarray(i), i=101,200) /101.0, 102.0, 103.0, 104.0, 105.0, 106.0, &
107.0, 108.0, &
...
198.0, 199.0, 200.0 /
I've seen this in a lot of scientific Fortran code.
I don't know about any compiler settings for unlimited continuation lines, but I would suggest these alternatives:
assign each value on a single line
put the values in a file and read it :)
call a C function to fill your fortran array
Write some code to create your source files with data from a text file. Split the assignments by row or something similar to avoid creating one huge statement that initializes the array in one fell swoop. Remember that code that generates code can be quite flexible.
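As a rough illustration of the code-generation idea, here is a minimal sketch of a generator that prints batched DATA statements to standard output. The names gen_data, n, batch and values are hypothetical, and the values are produced in the program as a stand-in for data that would really be read from a text file.

```fortran
program gen_data
  implicit none
  integer, parameter :: n = 10      ! hypothetical total number of values
  integer, parameter :: batch = 4   ! hypothetical number of values per DATA statement
  real :: values(n)
  integer :: i, lo, hi

  ! Stand-in for data that would normally be read from a text file.
  values = [(real(i), i = 1, n)]

  ! Emit the declarations needed by the generated DATA statements.
  print '(a,i0,a)', 'real :: xarray(', n, ')'
  print '(a)', 'integer :: i'

  ! Emit one DATA statement per batch so that no single statement grows too long.
  do lo = 1, n, batch
     hi = min(lo + batch - 1, n)
     write (*, '(a,i0,a,i0,a)', advance='no') 'data (xarray(i), i=', lo, ',', hi, ') /'
     do i = lo, hi
        if (i > lo) write (*, '(a)', advance='no') ', '
        write (*, '(es14.7)', advance='no') values(i)
     end do
     write (*, '(a)') ' /'
  end do
end program gen_data
```

The output can be redirected into an include file. With a larger batch size the generator would also need to break each statement with & continuations to stay within the source line length limit.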
Related
I am interested in the difference between alloc_array and automatic_array in the following extract:
subroutine mysub(n)
integer, intent(in) :: n
integer :: automatic_array(n)
integer, allocatable :: alloc_array(:)
allocate(alloc_array(n))
...[code]...
I am familiar enough with the basics of allocation (not so much on advanced techniques) to know that allocation allows you to change the size of the array in the middle of the code (as pointed out in this question), but I'm interested in considering the case where you don't need to change the size of the array; they might be passed onto other subroutines for operation, but the only purpose of both variables in the code and any subroutine is to hold the data of an array of dimension n (and maybe change the data, but not the size).
(1) Is there any difference in memory usage? I am not an expert in low-level procedures, but I have a slight knowledge of how they matter and how they can affect higher-level programming (the kind of experience I'm talking about: once, trying to run a big Fortran code, I was getting an error I didn't understand, and the sysadmin told me "oh, yeah, you are probably saturating the stack; try adding this line in your running script"; anything that gives me insight into how to consider these things when actually coding, rather than having to patch them later, is welcome). I've been told that it might depend on many other things like compiler or architecture, but I gathered from those responses that people were not completely sure exactly how. Is it really so dependent on a multitude of factors, or is there a default/intended behavior in the code that may then be overridden by optional compiler keywords or system preferences?
(2) Would the subroutines have different interface needs? Again, not an expert, but it had happened to me before that because of the way I declare variables of subroutine, I end up having to put the subroutines in a module. I've been given to understand this may vary depending on whether I use things that are special for allocatable variables. I am thinking about the case in which everything I do with the variables can be done both by allocatables and automatics, not intentionally using anything specific of allocatables (other than allocation before usage, that is).
Finally, in case this is of use: the reason I am asking is that we are developing in a group and have recently noticed that different people use those two declarations in different ways, and we need to determine whether this can be left to personal preference or whether there are reasons to set clear criteria (and how to set those criteria). I don't need extremely detailed answers; I am trying to determine whether this is something I should research so that we use it carefully, and at which aspects that research should be directed.
Though I would be interested to know of "interesting tricks" that can be done with allocation but are not directly related to the need for size variability, I am leaving those for a possible future follow-up question and focusing here on the strictly functional differences (meaning: what I am explicitly telling compilers to do with my code). The two items I mentioned are the ones I could come up with from previous experience, but if there is any other important one that I am missing and should consider, please do mention it.
Because gfortran or ifort + Linux (x86_64) are among the most popular combinations used for HPC, I made some performance comparisons between local allocatable and automatic arrays for these combinations. The CPU used is a Xeon E5-2650 v2 @ 2.60GHz, and the compilers are gfortran 4.8.2 and ifort 14.0. The test program is as follows.
In test.f90:
!------------------------------------------------------------------------
subroutine use_automatic( n )
integer :: n
integer :: a( n ) !! local automatic array (with unknown size at compile-time)
integer :: i
do i = 1, n
a( i ) = i
enddo
call sub( a )
end
!------------------------------------------------------------------------
subroutine use_alloc( n )
integer :: n
integer, allocatable :: a( : ) !! local allocatable array
integer :: i
allocate( a( n ) )
do i = 1, n
a( i ) = i
enddo
call sub( a )
deallocate( a ) !! not necessary for modern Fortran but for clarity
end
!------------------------------------------------------------------------
program main
implicit none
integer :: i, nsizemax, nsize, nloop, foo
common /dummy/ foo
nloop = 10**7
nsizemax = 10
do i = 1, nloop
nsize = mod( i, nsizemax ) + 1
call use_automatic( nsize )
! call use_alloc( nsize )
enddo
print *, "foo = ", foo !! to check if sub() is really called
end
In sub.f90:
!------------------------------------------------------------------------
subroutine sub( a )
integer a( * )
integer foo
common /dummy/ foo
foo = a( 1 )
end
In the above program, I tried avoiding compiler optimization that eliminates a(:) itself (i.e., no operation) by placing sub() in a different file and making the interface implicit. First, I compiled the program using gfortran as
gfortran -O3 test.f90 sub.f90
and tested different values of nsizemax while keeping nloop = 10^7. The result is in the following table (time is in sec, measured several times by the time command).
nsizemax   use_automatic()   use_alloc()
10         0.30              0.31          # average result
50         0.48              0.47
500        1.0               0.90
5000       4.3               4.2
100000     75.6              75.7
So the overall timing seems almost the same for two calls when -O3 is used (but see Edit for different options). Next, I compiled with ifort as
[O3] ifort -O3 test.f90 sub.f90
or
[O3h] ifort -O3 -heap-arrays test.f90 sub.f90
In the former case the automatic array is stored on the stack, while when -heap-arrays is attached the array is stored on the heap. The obtained result is
nsizemax   use_automatic()        use_alloc()
           [O3]      [O3h]        [O3]      [O3h]
10         0.064     0.39         0.48      0.48
50         0.094     0.56         0.65      0.66
500        0.45      1.03         1.12      1.12
5000       3.8       4.4          4.4       4.4
100000     74.5      75.3         76.5      75.5
So for ifort, the use of automatic arrays seems beneficial when relatively small arrays are mainly used. On the other hand, gfortran -O3 shows no difference because both arrays are treated the same way (see Edit for more details).
Additional comparison:
Below is the result for Oracle Fortran compiler 12.4 for Linux (used with f90 -O3). The overall trend seems similar; automatic arrays are faster for small n, indicating the internal use of stack.
nsizemax   use_automatic()   use_alloc()
10         0.16              0.45
50         0.17              0.62
500        0.37              0.97
5000       2.04              2.67
100000     65.6              65.7
Edit
Thanks to Vladimir's comment, it has turned out that gfortran -O3 put automatic arrays (with unknown size at compile-time) on the heap. This explains why use_automatic() and use_alloc() did not make any difference above. So I made another comparison between different options below:
[O3] gfortran -O3
[O5] gfortran -O5
[O3s] gfortran -O3 -fstack-arrays
[Of] gfortran -Ofast # this includes -fstack-arrays
Here, -fstack-arrays means that the compiler puts all local arrays with unknown size on the stack. Note that this flag is enabled by default with -Ofast. The obtained result is
nsizemax   use_automatic()                   use_alloc()
           [Of]    [O3s]   [O5]    [O3]      [Of]    [O3s]   [O5]    [O3]
10         0.087   0.087   0.29    0.29      0.29    0.29    0.29    0.29
50         0.15    0.15    0.43    0.43      0.45    0.44    0.44    0.45
500        0.57    0.56    0.84    0.84      0.92    0.92    0.92    0.92
5000       3.9     3.9     4.1     4.1       4.2     4.2     4.2     4.2
100000     75.1    75.0    75.6    75.6      75.6    75.3    75.7    76.0
where the average of ten measurements is shown. This table demonstrates that if -fstack-arrays is included, the execution time for small n becomes shorter. This trend is consistent with the results obtained for ifort above.
It should be mentioned, however, that the above comparison probably corresponds to the "best-case" scenario that highlights the difference between them, so the timing difference can be much smaller in practice. For example, I have compared the timing for the above options by using some other program (involving both small and large arrays), and the results were not much affected by the stack options. Also the result should depend on machine architecture as well as compilers, of course. So your mileage may vary.
For the sake of clarity, I'll briefly mention terminology. The two arrays are both local variables and arrays of rank 1.
alloc_array is an allocatable array;
automatic_array is an explicit-shape automatic object.
Being local variables their scope is that of the procedure. Automatic arrays and unsaved allocatable arrays come to an end when execution of the procedure completes (with the allocatable array being deallocated); automatic objects cannot be saved and saved allocatable objects are not deallocated on completion of execution.
Again, as in the linked question, after the allocation statement both arrays are of size n. These are still two very different things. Of course, the allocatable array can have its allocation status changed and its allocation moved. I'll leave both of those (mostly) out of the scope of this answer. An allocatable array, of course, doesn't have to have these things changed once it's been allocated.
Memory usage
What was partly contentious about a previous revision of the question is how ill-defined the concept of memory usage is. Fortran, as a language definition, tells us that both arrays come to be the same size and they'll have the same storage layout, and are both contiguous. Beyond that, much follows terms you'll hear a lot: implementation specific and processor dependent.
In a comment you expressed interest in ifort. So that I don't wander too far, I'll stick to that one compiler. Other compilers have similar concepts, albeit with different names and options.
Often, ifort will place automatic objects and array temporaries onto stack. There is a (default) compiler option -no-heap-arrays described as having effect
The compiler puts automatic arrays and temporary arrays in the stack storage area.
Using the alternative option -heap-arrays allows one to control that slightly:
This option puts automatic arrays and arrays created for temporary computations on the heap instead of the stack.
There is a possibility to control size thresholds for which heap/stack would be chosen (when that is known at compile-time):
If the compiler cannot determine the size at compile time, it always puts the automatic array on the heap.
As n isn't a constant, one would expect automatic_array to be on the heap with this option, regardless of the size specified. To determine the size, n, of the array at compile time, the compiler would potentially need to do quite a bit of code analysis, even if it is possible.
There's probably more to be said, but this answer would be far too long if I tried. One thing to note, however, is that automatic local objects and (post-Fortran 90) allocatable local objects can be expected not to leak memory.
Interface needs
There is nothing special about the interface requirements of the subroutine mysub: local variables have no impact on that. Any program unit calling that would be happy with an implicit interface. What you are asking about is how the two local arrays can be used.
This largely comes down to what uses the two arrays can be put to.
If the dummy argument of a second procedure has the allocatable attribute then only the allocatable array here can be passed to that procedure. It will also need to have an explicit interface. This is true whether or not the procedure changes the allocation.
Of course, both arrays could be passed as arguments to a dummy argument without the allocatable attribute and then we don't have different interface requirements.
Anyway, why would one want to pass an argument to an allocatable dummy when there will be no change in allocation status, etc.? There are good reasons:
there may be a code path in the procedure which does have an allocation change (controlled by a switch, say);
allocatable dummy arguments also pass bounds;
etc.,
This second one is more obvious if the subroutine had specification
subroutine mysub(n)
integer, intent(in) :: n
integer :: automatic_array(2:n+1)
integer, allocatable :: alloc_array(:)
allocate(alloc_array(2:n+1))
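The point about bounds can be made concrete with a small sketch. The module bounds_demo and the procedures show_assumed and show_alloc are hypothetical names introduced only for this illustration; both procedures live in a module so that their interfaces are explicit.

```fortran
module bounds_demo
  implicit none
contains
  subroutine show_assumed(a)
    ! Assumed-shape dummy: only the shape is passed, the lower bound restarts at 1.
    integer, intent(in) :: a(:)
    print '(a,i0,a,i0)', 'assumed-shape dummy bounds: ', lbound(a, 1), ':', ubound(a, 1)
  end subroutine show_assumed

  subroutine show_alloc(a)
    ! Allocatable dummy: the bounds of the actual argument are passed along.
    integer, allocatable, intent(in) :: a(:)
    print '(a,i0,a,i0)', 'allocatable dummy bounds: ', lbound(a, 1), ':', ubound(a, 1)
  end subroutine show_alloc
end module bounds_demo

program bounds_main
  use bounds_demo
  implicit none
  integer, allocatable :: alloc_array(:)
  integer :: n

  n = 3
  allocate(alloc_array(2:n+1))
  alloc_array = 0

  call show_assumed(alloc_array)   ! prints 1:3
  call show_alloc(alloc_array)     ! prints 2:4
end program bounds_main
```

An automatic array declared as automatic_array(2:n+1) could be passed to show_assumed but not to show_alloc, since it lacks the allocatable attribute.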
Finally, an automatic object has quite strict conditions on its size. n here is clearly allowed, but things don't have to be much more complicated before allocation is the only plausible way, depending on how much one wants to play with block constructs.
Taking also a comment from IanH: if we have a very large n the automatic object is likely to lead to crash-and-burn. With the allocatable, one could use the stat= option to come to some amicable agreement with the compiler run-time.
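A minimal sketch of the stat= approach follows; the names work, ierr and msg are illustrative, and errmsg= is a Fortran 2003 addition that this sketch assumes the compiler supports.

```fortran
program alloc_stat_demo
  implicit none
  integer, allocatable :: work(:)
  integer :: n, ierr
  character(len=256) :: msg

  n = 1000

  ! With stat= present, a failed allocation sets ierr instead of aborting the program.
  allocate(work(n), stat=ierr, errmsg=msg)
  if (ierr /= 0) then
     print '(2a)', 'allocation failed: ', trim(msg)
     error stop 1
  end if

  work = 0
  print '(a,i0)', 'allocated elements: ', size(work)
end program alloc_stat_demo
```

There is no comparable hook for an automatic array: if the stack cannot hold it, the program typically just crashes on entry to the procedure.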
I am trying to debug a huge program not written by me by writing out a large selection of the variables into text files. Some are arrays and some are single values.
The arrays were declared with huge initial sizes due to the code being incomplete and people didn't want to use the allocation method as no one knew how many more things would be added to the code. As a result, if I just straight up print out the entire variable, it would also print out the millions of zeros which I don't need and make the file much larger than necessary.
I searched for a way to write out non-zero elements and another post here had answers pointing to the pack() function.
However, pack() seems to have a size limit: Visual Studio never even reaches the lines that actually call pack. It enters chkstk.asm upon entering the subroutine that writes the variables and returns a stack overflow error before executing any of the lines inside the subroutine (the first few lines in the subroutine just open the file and write non-array variables).
So, what else can I do to write out all the non-zero elements inside these huge arrays?
The beginning of the subroutine is shown below:
subroutine write_everything(fileIDa,fileNamea,fileIDb,fileNameb)
use flags
use const
use mphase_props_v
use sample_props_v
use grain_props_v
use mphase_state_v
use grain_state_v
use mphase_rate_v
use grain_rate_v
use sample_state_v
use sample_rate_v
use twinning_v
use hard_law1_v
use back_stress_v
use phase_transf_v
use bc_v
use diffract_v
use output_v
use YS_v
use epsc_var
integer, intent(in) :: fileIDa,fileIDb
character(len=40), intent(in) :: fileNamea,fileNameb
1 format(1h,78('*'))
open(unit=fileIDa,file=fileNamea,status='unknown')
write(fileIDa,'(''flags'')')
write(fileIDa,1)
write(fileIDa,*) ishape,irot,ipileup,kSM,iPoleFigFlag,i_diff_dir
# ,iDiag,kCL,iSingleCry,iTwinLaw,i_prev_proc,iDetwOpt,iDtwMfp
# ,ilatBS,iBackStress,iPhTr,itwinning,iOutput,itexskip,nCoatedPh
# ,nCoatingPh,ivarBC,inonSch
write(fileIDa,'(''mphase_props_v'')')
write(fileIDa,1)
write(fileIDa,*) pack(nsm,nsm.ne.0),pack(itw,itw.ne.0)
# ,pack(nmodes,nmodes.ne.0),pack(nsys,nsys.ne.0)
# ,pack(nslmod,nslmod.ne.0),pack(nslsys,nslsys.ne.0)
# ,pack(ntwmod,ntwmod.ne.0),pack(ntwsys,ntwsys.ne.0)
# ,pack(nphngr,nphngr.ne.0),pack(icrysym,icrysym.ne.0)
# ,pack(ISECTW,ISECTW.ne.0),pack(ngrnph,ngrnph.ne.0)
Some of the arrays are of size 10, but others are of size 10000 and even 50 by 10000.
Note that before I used pack the program wrote the variables just fine, except that the file was so large (780 MB) that neither Microsoft Word nor Notepad++ would open it, and I need the compare functions from these programs, so I can't just open it with regular Notepad. I stopped short of splitting it into two files and decided to try to remove all the zeros.
Following the suggestions from the comments, I set the heap arrays option to 0, and although Visual Studio still goes into chkstk.asm it no longer returns an error and pack() writes out the non-zero elements just fine.
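An alternative that avoids array temporaries altogether is to loop over the elements and write only the non-zero ones, together with their indices. The sketch below uses a made-up array big as a stand-in for one of the module arrays; it is an illustration of the idea, not the code from the program in question.

```fortran
program write_nonzero
  implicit none
  integer, allocatable :: big(:,:)   ! hypothetical stand-in for a large module array
  integer :: i, j

  allocate(big(50, 10000))
  big = 0
  big(3, 7) = 42
  big(50, 10000) = -1

  ! Visit the elements in column-major order and print only the non-zero ones.
  ! No temporary array is created, so nothing large lands on the stack.
  do j = 1, size(big, 2)
     do i = 1, size(big, 1)
        if (big(i, j) /= 0) then
           print '(a,i0,a,i0,a,i0)', 'big(', i, ',', j, ') = ', big(i, j)
        end if
     end do
  end do
end program write_nonzero
```

Printing the indices also makes the two output files easier to compare, because a value keeps its label even when the number of non-zero entries differs between runs.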
It's quite common in C-code to see stuff like:
malloc(sizeof(int)*100);
which will return a pointer to a block of memory big enough to hold 100 ints. Is there any equivalent in Fortran?
Use case:
I have a binary file which is opened as:
open(unit=10,file='foo.dat',access='stream',form='unformatted',status='old')
I know that the file contains "records" which consist of a header with 20 integers, 20 real numbers and 80 characters, then another N real numbers. Each file can have hundreds of records. Basically, I'd like to read or write to a particular record in this file (assuming N is a fixed constant for simplicity).
I can easily calculate the position in the file I want to write if I know the size of each data-type:
header_size = SIZEOF_INT*20 + SIZEOF_FLOAT*20 + SIZEOF_CHAR*80
data_size = N*SIZEOF_FLOAT
position = (record_num-1)*(header_size+data_size)+1
Currently I have
!Hardcoded :-(
SIZEOF_INT = 4
SIZEOF_FLOAT = 4
SIZEOF_DOUBLE = 8
SIZEOF_CHAR = 1
Is there any way to do better?
constraints:
The code is meant to be run on a variety of platforms with a variety of compilers. A standard compliant solution is definitely preferred.
In your use case I think you could use
inquire(iolength=...) io-list
That will give you how many "file storage units" are required for the io-list. A caveat with calculating offsets in files with Fortran is that "file storage unit" need not be in bytes, and indeed I recall one quite popular compiler by default using a word (4 bytes) as the file storage unit. However, by using the iolength thing you don't need to worry about this issue.
@janneb's answer will address the OP's question, but it doesn't answer the "sizeof" question for Fortran.
A combination of inquire and file_storage_size will give the size of a type. Try this code:
program sizeof
use iso_fortran_env
integer :: num_file_storage_units
integer :: num_bytes
inquire(iolength=num_file_storage_units) 1.0D0
num_bytes = num_file_storage_units*FILE_STORAGE_SIZE/8
write(*,*) "double has size: ", num_bytes
end program sizeof
See:
http://gcc.gnu.org/onlinedocs/gfortran/ISO_005fFORTRAN_005fENV.html
http://h21007.www2.hp.com/portal/download/files/unprot/fortran/docs/lrm/lrm0514.htm
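Another option, on compilers that support Fortran 2008, is the storage_size intrinsic, which returns the size of its argument in bits. The sketch below assumes 8-bit bytes when converting to bytes; the variable names are illustrative.

```fortran
program storage_size_demo
  implicit none
  integer :: i
  real :: r
  double precision :: d
  character(len=1) :: c

  ! storage_size returns bits; dividing by 8 assumes 8-bit bytes.
  print '(a,i0)', 'bytes per default integer:   ', storage_size(i) / 8
  print '(a,i0)', 'bytes per default real:      ', storage_size(r) / 8
  print '(a,i0)', 'bytes per double precision:  ', storage_size(d) / 8
  print '(a,i0)', 'bytes per default character: ', storage_size(c) / 8
end program storage_size_demo
```

The printed values depend on the compiler and its options, which is the reason to query them rather than hardcode them.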
If all the records are the same, this would seem to be a case to use direct access rather than stream access. Then don't calculate the position in the file, you tell the compiler the record that you want, and it accesses it. Unless you want these files to be portable across platforms or the records are not all the same ... then you have to have more control or calculate the length of the records. While the original Fortran 90 concept was to declare variables according to the required precision, there are now portable ways to declare variables by size. Either with types provided by the already mentioned iso_c_binding module, or from the iso_fortran_env module.
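A minimal sketch of the direct-access approach, using inquire(iolength=...) to obtain the record length, is shown below. It writes to a scratch file rather than foo.dat, and the value of n and all variable names are assumptions made for the illustration.

```fortran
program direct_access_demo
  implicit none
  integer, parameter :: n = 5          ! hypothetical number of trailing reals per record
  integer :: ihdr(20), ihdr_in(20)
  real :: rhdr(20), rhdr_in(20)
  real :: rdata(n), rdata_in(n)
  character(len=80) :: title, title_in
  integer :: reclen, u

  ihdr = 7
  rhdr = 1.5
  title = 'record three'
  rdata = 2.5

  ! Ask the compiler how long one record is, in file storage units.
  inquire(iolength=reclen) ihdr, rhdr, title, rdata

  ! A scratch file stands in for the real data file in this sketch.
  open(newunit=u, status='scratch', access='direct', form='unformatted', recl=reclen)

  ! Records are addressed by number; no byte offsets are computed by hand.
  write(u, rec=3) ihdr, rhdr, title, rdata
  read(u, rec=3) ihdr_in, rhdr_in, title_in, rdata_in
  close(u)

  print '(a,i0)', 'record length in file storage units: ', reclen
  print '(2a)', 'title read back: ', trim(title_in)
end program direct_access_demo
```

Because the same unit is used for iolength and recl=, the code does not need to know whether the file storage unit is a byte or a word.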
I have 2 dimensional table in file, which look like this:
11, 12, 13, 14, 15
21, 22, 23, 24, 25
I want it to be imported in 2 dimensional array. I wrote this code:
INTEGER :: SMALL(10)
DO I = 1, 3
READ(UNIT=10, FMT='(5I4)') SMALL
WRITE(UNIT=*, FMT='(6X,5I4)') SMALL
ENDDO
But it imports everything in one dimensional array.
EDIT:
I've updated code:
program filet
integer :: reason
integer, dimension(2,5) :: small
open(10, file='boundary.inp', access='sequential', status='old', FORM='FORMATTED')
rewind(10)
DO
READ(UNIT=10, FMT='(5I4)', iostat=reason) SMALL
if (reason /= 0) exit
WRITE(UNIT=*, FMT='(6X,5I4)') SMALL
ENDDO
write (*,*) small(2,1)
end program
Here is output:
11 12 13 14 15
21 22 23 24 25
12
Well, you have defined SMALL to be a 1-D array, and Fortran is just trying to be helpful. You should perhaps have defined SMALL like this:
integer, dimension(2,5) :: small
What happened when the read statement was executed was that the system ran out of edit descriptor (you specified 5 integers) before either SMALL was full or the end of the file was encountered. If I remember rightly Fortran will re-use the edit descriptor until either SMALL is full or the end-of-file is encountered. But this behaviour has been changed over the years, according to Fortran standards, and various compilers have implemented various non-standard features in this part of the language, so you may need to check your compiler's documentation or do some more experiments to figure out exactly what happens.
I think your code is also a bit peculiar in that you read into SMALL 3 times. Why?
EDIT: OK, we're getting there. You have just discovered that Fortran stores arrays in column-major order. I believe that most other programming languages store them in row-major order. In other words, the first element of your array is small(1,1), the second (in memory) is small(2,1), the third is small(1,2) and so forth. I think that your read (and write) statements are not standard but widely implemented (which is not unusual in Fortran compilers). I may be wrong, it may be standard. Either way, the read statement is being interpreted to read the elements of small in column-major order. The first number read is put in small(1,1), the second in small(2,1), the third in small(1,2) and so on.
Your write statement makes use of the same feature; you might have discovered this for yourself if you had written out the elements in loops with the indices printed too.
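Here is a small sketch of that experiment. The array is filled with reshape, which stores the values in the same array element order that a whole-array read uses, so no input file is needed.

```fortran
program column_major_demo
  implicit none
  integer :: small(2, 5)
  integer :: i, j

  ! reshape fills small in array element (column-major) order,
  ! just as READ does when the whole array appears in the input list.
  small = reshape([11, 12, 13, 14, 15, 21, 22, 23, 24, 25], shape(small))

  ! Print every element with its indices, first index varying fastest.
  do j = 1, size(small, 2)
     do i = 1, size(small, 1)
        print '(a,i0,a,i0,a,i0)', 'small(', i, ',', j, ') = ', small(i, j)
     end do
  end do
end program column_major_demo
```

The second line printed is small(2,1) = 12, which matches the value reported in the question.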
The idiomatic Fortran way of reading an array and controlling the order in which elements are placed into the array, is to include an implied-do loop in the read statement, like this:
READ(UNIT=10, FMT='(5I4)', iostat=reason) ((SMALL(row,col), col = 1,numCol), row=1,numRow)
You can also use this approach in write statements.
You should also study your compiler documentation carefully and determine how to switch on warnings for all non-standard features.
Adding to what High Performance Mark wrote...
If you want to use commas to separate the numbers, then you should use list-directed IO rather than formatted IO. (Sometimes this is called format-free IO, but that non-standard term is easy to confuse with binary IO). This is easier to use since you don't have to arrange the numbers precisely in columns and can separate them with spaces or commas. The read is simply "read (10, *) variables"
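A minimal sketch of list-directed input for the comma-separated table follows. To keep it self-contained it reads from character variables (internal files) instead of unit 10; the name lines is introduced only for this illustration.

```fortran
program list_directed_demo
  implicit none
  character(len=32) :: lines(2)
  integer :: small(2, 5)
  integer :: irow

  ! Stand-ins for the two lines of the data file.
  lines(1) = '11, 12, 13, 14, 15'
  lines(2) = '21, 22, 23, 24, 25'

  ! List-directed input accepts commas or blanks as separators,
  ! so the numbers do not have to sit in fixed columns.
  do irow = 1, size(small, 1)
     read (lines(irow), *) small(irow, :)
  end do

  print '(a,i0,a,i0)', 'small(1,2) = ', small(1, 2), ', small(2,1) = ', small(2, 1)
end program list_directed_demo
```

This prints small(1,2) = 12, small(2,1) = 21. Reading from a file works the same way with the unit number in place of lines(irow).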
But sticking to formatted IO, here is some sample code:
program demo1
implicit none
integer, dimension (2,5) :: small
integer :: irow, jcol
open ( unit=10, file='boundary.txt', access='sequential', form='formatted' )
do irow=1, ubound (small, 1)
read (10, '(5I4)') (small (irow, jcol), jcol=1, ubound (small, 2))
end do
write (*, '( / "small (1,2) =", I2, " and small (2,1)=", I2 )' ) small (1,2), small (2,1)
end program demo1
Using the I4 formatted read, the data need to be in columns:
12341234123412341234
11 12 13 14 15
21 22 23 24 25
The data file shouldn't contain the first row "1234..." -- that is in the example to make the alignment required for the format 5I4 clear.
With my example program, there is an outer do loop for irow and an "implied do loop" as part of the read statement. You could also eliminate the outer do loop and use two implied do loops on the read statement, as High Performance Mark showed. In this case, if you kept the format specification (5I4), it would get reused to read the second line -- this is called format reversion. (On a more complicated format, one needs to read the rules to understand which part of the format is reused in format reversion.) This is standard, and has been so at least since FORTRAN 77 and probably FORTRAN IV. (Of course, the declarations and style of my example are Fortran 90).
I used "ubound" so that you neither have to carry around variables storing the dimensions of the array, nor use specific numeric values. The latter method can cause problems if you later decide to change the dimension of the array -- then you have to hunt down all of the specific values (here 2 and 5) and change them.
There is no need for a rewind after an open statement.
I have a piece of Fortran code, and I am not sure which standard it follows: '77, '90 or '95. Is there a standard tool to identify which standard it conforms to?
There probably are automated tools, but my methods are largely heuristic:
Do comments use a ! anywhere on the line (F90+) or a C in the first column (F77)?
Do loops use do..end do (F90+) or do..continue (F77)?
Are lines continued using & at the end of the line (F90+) or a character in column 6 (F77)?
Does the code use module or type structures (F90)?
If the code uses arrays, does it operate on them as a single structure (F90) or always using loops (F77)?
Is dynamic memory (either using allocatable or pointer methods) used (F90)?
Generally these are enough to discriminate between F90 and F77. The differences between Fortran 90 and FORTRAN 77 are much, much larger than the differences between Fortran 90 and Fortran 95 so I usually stop there.
If you have access to GNU Fortran (gfortran) you can try compiling it with the different values of the -std= option (for example -std=f95) and see which one works. You can find details on the dialect options here.
I'm adding features from Fortran 2003 and 2008 (just the ones that come to mind):
if the program has parameterized derived types (Fortran 2003);
if the array constructor uses square brackets [ ] instead of (/ /) (Fortran 2003);
if the code uses coarrays (Fortran 2008);
if the code uses special functions such as the Bessel functions as intrinsics: many compilers offered them as extensions, but they are a bona fide Fortran 2008 feature.
(If there are any discrepancies, let me know and I'll edit.)
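For reference, a tiny sketch that uses two of these newer features is given below. A compiler restricted to Fortran 95 would be expected to reject it, which makes a file like this a quick probe when combined with a standard-selection option.

```fortran
program newer_features
  implicit none
  real :: x(3)

  ! Square-bracket array constructor (Fortran 2003).
  x = [0.0, 1.0, 2.0]

  ! bessel_j0 is an elemental intrinsic as of Fortran 2008.
  print '(3f10.6)', bessel_j0(x)
end program newer_features
```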