sizeof in fortran - fortran

It's quite common in C-code to see stuff like:
malloc(sizeof(int)*100);
which will return a pointer to a block of memory big enough to hold 100 ints. Is there any equivalent in fortran?
Use case:
I have a binary file which is opened as:
open(unit=10,file='foo.dat',access='stream',form='unformatted',status='old')
I know that the file contains "records" which consist of a header with 20 integers, 20 real numbers and 80 characters, then another N real numbers. Each file can have hundreds of records. Basically, I'd like to read or write to a particular record in this file (assuming N is a fixed constant for simplicity).
I can easily calculate the position in the file I want to write if I know the size of each data-type:
header_size = SIZEOF_INT*20 + SIZEOF_FLOAT*20 + SIZEOF_CHAR*80
data_size = N*SIZEOF_FLOAT
position = (record_num-1)*(header_size+data_size)+1
Currently I have
!Hardcoded :-(
SIZEOF_INT = 4
SIZEOF_FLOAT = 4
SIZEOF_DOUBLE = 8
SIZEOF_CHAR = 1
Is there any way to do better?
constraints:
The code is meant to be run on a variety of platforms with a variety of compilers. A standard compliant solution is definitely preferred.

In your use case I think you could use
inquire(iolength=...) io-list
That will give you how many "file storage units" are required for the io-list. A caveat with calculating offsets in files with Fortran is that "file storage unit" need not be in bytes, and indeed I recall one quite popular compiler by default using a word (4 bytes) as the file storage unit. However, by using the iolength thing you don't need to worry about this issue.

#janneb's answer will address the OP's question, but it doesn't answer the "sizeof" question for Fortran.
A combination of inquire and file_storage_size will give the size of a type. Try this code:
program sizeof
use iso_fortran_env
integer :: num_file_storage_units
integer :: num_bytes
inquire(iolength=num_file_storage_units) 1.0D0
num_bytes = num_file_storage_units*FILE_STORAGE_SIZE/8
write(*,*) "double has size: ", num_bytes
end program sizeof
See:
http://gcc.gnu.org/onlinedocs/gfortran/ISO_005fFORTRAN_005fENV.html
http://h21007.www2.hp.com/portal/download/files/unprot/fortran/docs/lrm/lrm0514.htm

If all the records are the same, this would seem to be a case to use direct access rather than stream access. Then don't calculate the position in the file, you tell the compiler the record that you want, and it accesses it. Unless you want these files to be portable across platforms or the records are not all the same ... then you have to have more control or calculate the length of the records. While the original Fortran 90 concept was to declare variables according to the required precision, there are now portable ways to declare variables by size. Either with types provided by the already mentioned iso_c_binding module, or from the iso_fortran_env module.

Related

Intel Fortran error "allocatable array or pointer is not allocated"

When I tried to run a huge Fortran code (the code is compiled using Intel compiler version 13.1.3.192), it gave me error message like this:
...
Info[FDFI_Setup]: HPDF code version number is 1.00246
forrtl: severe (153): allocatable array or pointer is not allocated
Image PC Routine Line Source
arts 0000000002AD96BE Unknown Unknown Unknown
arts 0000000002AD8156 Unknown Unknown Unknown
arts 0000000002A87532 Unknown Unknown Unknown
...
Nonetheless, if I insert a small write statement (which is just to check the code, not to disturb the original purpose of the code) in one of the subroutines as the following (I couldn't put all the codes since they are too huge):
...
endif
call GetInputLine(Unit,line,eof,err)
enddo
if(err) return
! - [elfsummer] 20140815 Checkpoint 23
open(unit = 1, file = '/bin/monitor/log_checkpoint',status='old',position='append')
write(1,*) "BEFORE checking required keys: so far so good!"
close(1)
! check required keys
! for modes = 2,3, P and T are the required keys
if(StrmDat%ModeCI==2.or.StrmDat%ModeCI==3) then
...
then suddenly, the error message shown above disappears and the code can run correctly! I also tried to insert such write statements in other locations in the source code but the above error message still exists.
According to Intel's documentation:
severe (153): Allocatable array or pointer is not allocated
FOR$IOS_INVDEALLOC. A Fortran 90 allocatable array or pointer must already be allocated when you attempt to deallocate it. You must allocate the array or pointer before it can again be deallocated.
Note: This error can be returned by STAT in a DEALLOCATE statement.
However, I couldn't see any relations between the error and the "write statements" I added to the code. There is no such "allocate" command in the location I add the write statements.
So I am quite confused. Does anybody know the reasons? Any help is greatly appreciated!!
With traceback option, I could locate the error source directly:
subroutine StringRead(Str,delimiter,StrArray,ns) ! [private] read strings separated by delimiter
implicit none
character*(*),intent(in) :: Str
character*(*),intent(in) :: delimiter
character*(*),pointer :: StrArray(:)
integer,intent(out) :: ns
! - local variables
character(len=len(Str)) :: tline
integer :: nvalue,nvalue_max
character(len=len(StrArray)),pointer:: sarray(:),sarray_bak(:)
integer :: len_a,len_d,i
! deallocate StrArray
if(associated(StrArray)) deallocate(StrArray)
The error, according to the information the traceback gave me, lies in the last statement shown above. If I comment out this statement, then the "forrtl: severe (153)" error would disappear while new errors being generated... But still, I don't think this statement itself could go wrong...It acts as if it just ignores the if... condition and directly reads the deallocate commend, which seems weird to me.
You could have a bug in which you are illegally writing to memory and damaging the structure that stores the allocation information. Changing the code might cause the memory damage to occur elsewhere and that specific error to disappear. Generally, illegal memory accesses typically occur two ways in Fortran. 1) illegal subscripts, 2) mismatch between actual and dummy arguments, i.e., between variables in call and variables as declared in procedures. You can search for the first type of error by using your compiler's option for run-time subscript checking. You can guard against the second by placing all of your procedures in modules and useing those modules so that the compiler can check for argument consistency.
Sounds like some of the earlier comments give the general explanation. However,
1) Is StrArray(:) an Intent(out)? That is, are you reading the file's lines into StrArray() in the s/r, with the hope of returning that as the file's content? If so, declare it as an (Out), or whatever it should be.
2) Why is StrArray() a Pointer? Does it need to be a Pointer? If all you want is file content, you may be better off using a non-Pointer.
You may still need an Allocatable, or Automatic or something, but non-Pointers are easier in many cases.
3) If you must have StrArray(:) as a Pointer, then its size/shape etc must be created prior to use. If the calling sequence ACTUAL Arg is correctly defined (and if StrArray() is Intent(In) or Intent(InOUT), then that might do it.
By contrast, if it is an (Out), then, as with all Pointer arrays, it must be FIRST Allcoated() in the s/r.
If it is not Allocated somewhere early on, then it is undefined, and so the DeAllocate() fails, since it has nothing to DeAlloc, hence Stat = 153.
4) It is possible that you may wish to use this to read files without first knowing the number of lines to read. In that case, you cannot (at least not easily), Allocate StrArray() in advance, since you don't know the Size. In this case, alternate strategies are required.
One possible solution is a loop that simple reads the first char, or advances somehow, for each line in the file. Have the loop track the "sum" of each line read, until EOF. Then, you will know the size of the file (in terms of num lines), and you then allocate StrArray(SumLines) or something. Something like
SumLines = 0
Do i=1, ?? (or use a While)
... test to see if "line i" exists, or EOF, if so, Exit
SumLines = SumLines + 1
End Do
It may be best to do this in a separate s/r, so that the Size etc are known prior to calling the FileRead bits (i.e. that the file size is set prior to the FileRead s/r call).
However, that still leaves you with the problem of what Character(Len) to use. There are many possible solutions to this. Three of which are:
a) Use max length, like Character(Len = 2048), Intent(Out), or better yet, some compile time constant Parameter, call it MaxLineWidth
This has the obvious limitation to lines that <= MaxLineWidth, and that the memory usage may be excessively large when there many "short lines", etc.
b) Use a single char array, like Character(Len = 1), Intent(Out) :: StrArrayChar(:,:)
This is 2-D, since you need 1 D for the chars in each line, and the 2nd D for the lines.
This is a bit better compared to a) since it gives control over line width.
c) A more general approach might rely on a User Defined Type such as:
Type MyFileType
Character(Len=1), Allocatable :: FileLine(:) ! this give variable length lines, but each "line" must be allocated to the length of the line
End Type MyFileType
Then, create an array of this Type, such as:
Type(MyFileType), Allocatable :: MyFile(:) ! or, instead of Allocatable, can use Automatic etc etc
Then, Allocate MyFile to Size = num lines
... anyway, there are various choices, each with its own suitability for varying circumstances (and I have omitted much "housekeeping" re DeAllocs etc, which you will need to implement).
Incidentally, c) is also one possible prototype for "variable length strings" for many Fortran compilers that don't support such explicitly.

MPI_File_write_all with different count value for each processor

This is maybe a dumb question but,
I would like that N processors write all a different byte count in the same file with a different offset to make the data contignously.
I would like to use MPI_File_write_all(file,data,count,type,status) (individual file pointers, collective, blocking) function.
The first question can each processor specify a different value for the count parameter?
I could not find anything mentioned in the MPI 3.0 reference. (My intention is that it is not possible?)
What I found out so far is the following two problems:
When I want to write a large amount of MPI_BYTES the integer (32bit) count in the MPI_File_write... functions is to little and gives overflow of course!
I do not (cannot)/want to use a derived data type in MPI because as mentioned above all processor write a different byte count and the type is MPI_BYTES
Thanks for any help on this topic!
You've rolled up a few questions here
Absolutely, proceses can specify different or even zero amounts of data to the collective MPI_File_write_all routine. Not only can the count parameter differ, there's no reason the datatype parameter needs to be the same either.
Problem #1: If you want to write more than an int worth of MPI_BYTE data, you'll have to create a new datatpye. For example, let's say you wanted to write 9 billion bytes. Create a contig type of size 1 billion, then write 9 of those. (if the amount of data you want to write is not evenly divisible, you might need an hindexed or struct type).
problem #2: It's not at all a problem to have every MPI process create its own datatype or count of datatype.

performance of allocatable vs. statically sized arrays

I have a fortran code I had to modify to include a new library. Initially in the code the size of an array was passed in the Makefile, which meant every time I wanted to change the size of array I had to recompile the code. I changed this to read the size of the input array from an "input parameters file" so that it avoids the need to recompile every time. However, due to various reasons, my code is much slower than before.
Talking to my boss, he was of the opinion it might be possible that because we are not passing the size of the array during compile time, the code is not well optimized. Is it possibly true?
Thanks
---------------Edit---------------------
Initially there were these line in the makefile
NL = 8
#echo Making $(SIZE_FILE) .....
echo " integer, parameter( nl = " $(NL) " )" > $(SIZE_FILE)
This created a "sizefile" with value of "NL". This file was "include"d in the main program at as the header and then arrays were declared like this in the fortran file:
include "sizefile"
real*8, dimension ur(nl)
Now I have declared a subroutine called "read_input_parameters" which is called by the program which reads a text file with the value of "Nl". And then I allocate the array like this:
program test
integer n
allocatable :: ur(:)
call read_input_parameters(n)
allocate(ur(n))
*operations*
deallocate(ur)
stop
end
You should use a profiler and find the operations that are slow and post their code. The code you showed is useless. Are the results correct, at least?
The slowness can be caused by many factors. One of them is bad argument passing, which makes copy-in / copy-out necessary. Also, the fact that the subroutine does not know if the array is contiguous can do some harm, but not much.

Convert FORTRAN DEC UNION/MAP extensions to anything else

Edit: Gfortran 6 now supports these extensions :)
I have some old f77 code that extensively uses UNIONs and MAPs. I need to compile this using gfortran, which does not support these extensions. I have figured out how to convert all non-supported extensions except for these and I am at a loss. I have had several thoughts on possible approaches, but haven't been able to successfully implement anything. I need for the existing UDTs to be accessed in the same way that they currently are; I can reimplement the UDTs but their interfaces must not change.
Example of what I have:
TYPE TEST
UNION
MAP
INTEGER*4 test1
INTEGER*4 test2
END MAP
MAP
INTEGER*8 test3
END MAP
END UNION
END TYPE
Access to the elements has to be available in the following manners: TEST%test1, TEST%test2, TEST%test3
My thoughts thusfar:
Replace somehow with fortran EQUIVALENCE.
Define the structs in C/C++ and somehow make them visible to the FORTRAN code (doubt that this is possible)
I imagine that there must have been lots of refactoring of f77 to f90/95 when the UNION and MAP were excluded from the standard. How if at all was/is this handled?
EDIT: The accepted answer has a workaround to allow memory overlap, but as far as preserving the API, it is not possible.
UNION and MAP were never part of any FORTRAN standard, they are vendor extensions. (See, e.g., http://fortranwiki.org/fortran/show/Modernizing+Old+Fortran). So they weren't really excluded from the Fortran 90/95 standard. They cause variables to overlap in memory. If the code actually uses this feature, then you will need to use equivalence. The preferred way to move data between variables of different types without conversion is the transfer intrinsic, but to you that you would have to identify every place where a conversion is necessary, while with equivalence it is taking place implicitly. Of course, that makes the code less understandable. If the memory overlays are just to save space and the equivalence of the variables is not used, then you could get rid of this "feature". If the code is like your example, with small integers, then I'd guess that the memory overlay is being used. If the overlays are large arrays, it might have been done to conserve memory. If these declarations were also creating new types, you could use user defined types, which are definitely part of Fortran >=90.
If the code is using memory equivalence of variables of different types, this might not be portable, e.g., the internal representation of integers and reals are probably different between the machine on which this code originally ran and the current machine. Or perhaps the variables are just being used to store bits. There is a lot to figure out.
P.S. In response to the question in the comment, here is a code sample. But .... to be clear ... I do not think that using equivalence is good coding pratice. With the compiler options that I normally use with gfortran to debug code, gfortran rejects this code. With looser options, gfortran will compile it. So will ifort.
module my_types
use ISO_FORTRAN_ENV
type test_p1_type
sequence
integer (int32) :: int1
integer (int32) :: int2
end type test_p1_type
type test_p2_type
sequence
integer (int64) :: int3
end type test_p2_type
end module my_types
program test
use my_types
type (test_p1_type) :: test_p1
type (test_p2_type) :: test_p2
equivalence (test_p1, test_p2)
test_p1 % int1 = 2
test_p1 % int1 = 4
write (*, *) test_p1 % int1, test_p1 % int2, test_p2 % int3
end program test
The question is whether the union was used to save space or to have alternative representations of the same data. If you are porting, see how it is used. Maybe, because the space was limited, it was written in a way where the variables had to be shared. Nowadays with larger amounts of memory, maybe this is not necessary and the union may not be required. In which case, it is just two separate types
For those just wanting to compile the code with these extensions: Gfortran now supports UNION, MAP and STRUCTURE in version 6. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56226

Importing data from file to array

I have 2 dimensional table in file, which look like this:
11, 12, 13, 14, 15
21, 22, 23, 24, 25
I want it to be imported in 2 dimensional array. I wrote this code:
INTEGER :: SMALL(10)
DO I = 1, 3
READ(UNIT=10, FMT='(5I4)') SMALL
WRITE(UNIT=*, FMT='(6X,5I4)') SMALL
ENDDO
But it imports everything in one dimensional array.
EDIT:
I've updated code:
program filet
integer :: reason
integer, dimension(2,5) :: small
open(10, file='boundary.inp', access='sequential', status='old', FORM='FORMATTED')
rewind(10)
DO
READ(UNIT=10, FMT='(5I4)', iostat=reason) SMALL
if (reason /= 0) exit
WRITE(UNIT=*, FMT='(6X,5I4)') SMALL
ENDDO
write (*,*) small(2,1)
end program
Here is output:
11 12 13 14 15
21 22 23 24 25
12
Well, you have defined SMALL to be a 1-D array, and Fortran is just trying to be helpful. You should perhaps have defined SMALL like this;
integer, dimension(2,5) :: small
What happened when the read statement was executed was that the system ran out of edit descriptor (you specified 5 integers) before either SMALL was full or the end of the file was encountered. If I remember rightly Fortran will re-use the edit descriptor until either SMALL is full or the end-of-file is encountered. But this behaviour has been changed over the years, according to Fortran standards, and various compilers have implemented various non-standard features in this part of the language, so you may need to check your compiler's documentation or do some more experiments to figure out exactly what happens.
I think your code is also a bit peculiar in that you read from SMALL 3 times. Why ?
EDIT: OK, we're getting there. You have just discovered that Fortran stores arrays in column-major order. I believe that most other programming languages store them in row-major order. In other words, the first element of your array is small(1,1), the second (in memory) is small(2,1), the third is small(1,2) and so forth. I think that your read (and write) statements are not standard but widely implemented (which is not unusual in Fortran compilers). I may be wrong, it may be standard. Either way, the read statement is being interpreted to read the elements of small in column-major order. The first number read is put in small(1,1), the second in small(2,1), the third in small(1,2) and so on.
Your write statement makes use of the same feature; you might have discovered this for yourself if you had written out the elements in loops with the indices printed too.
The idiomatic Fortran way of reading an array and controlling the order in which elements are placed into the array, is to include an implied-do loop in the read statement, like this:
READ(UNIT=10, FMT='(5I4)', iostat=reason) ((SMALL(row,col), col = 1,numCol), row=1,numRow)
You can also use this approach in write statements.
You should also study your compiler documentation carefully and determine how to switch on warnings for all non-standard features.
Adding to what High Performance Mark wrote...
If you want to use commas to separate the numbers, then you should use list-directed IO rather than formatted IO. (Sometimes this is called format-free IO, but that non-standard term is easy to confuse with binary IO). This is easier to use since you don't have to arrange the numbers precisely in columns and can separate them with spaces or commas. The read is simply "read (10, *) variables"
But sticking to formatted IO, here is some sample code:
program demo1
implicit none
integer, dimension (2,5) :: small
integer :: irow, jcol
open ( unit=10, file='boundary.txt', access='sequential', form='formatted' )
do irow=1, ubound (small, 1)
read (10, '(5I4)') (small (irow, jcol), jcol=1, ubound (small, 2))
end do
write (*, '( / "small (1,2) =", I2, " and small (2,1)=", I2 )' ) small (1,2), small (2,1)
end program demo1
Using the I4 formatted read, the data need to be in columns:
12341234123412341234
11 12 13 14 15
21 22 23 24 25
The data file shouldn't contain the first row "1234..." -- that is in the example to make the alignment required for the format 5I4 clear.
With my example program, there is an outer do loop for irow and an "implied do loop" as part of the read statement. You could also eliminate the outer do loop and use two implied do loops on the read statement, as High Performance Mark showed. In this case, if you kept the format specification (5I4), it would get reused to read the second line -- this is called format reversion. (On a more complicated format, one needs to read the rules to understand which part of the format is reused in format reversion.) This is standard, and has been so at least since FORTRAN 77 and probably FORTRAN IV. (Of course, the declarations and style of my example are Fortran 90).
I used "ubound" so that you neither have to carry around variables storing the dimensions of the array, nor use specific numeric values. The later method can cause problems if you later decide to change the dimension of the array -- then you have to hunt down all of the specific values (here 2 and 5) and change them.
There is no need for a rewind after an open statement.