Importing data from file to array - fortran

I have a 2-dimensional table in a file, which looks like this:
11, 12, 13, 14, 15
21, 22, 23, 24, 25
I want to import it into a 2-dimensional array. I wrote this code:
INTEGER :: SMALL(10)
DO I = 1, 3
READ(UNIT=10, FMT='(5I4)') SMALL
WRITE(UNIT=*, FMT='(6X,5I4)') SMALL
ENDDO
But it imports everything into a one-dimensional array.
EDIT:
I've updated the code:
program filet
integer :: reason
integer, dimension(2,5) :: small
open(10, file='boundary.inp', access='sequential', status='old', FORM='FORMATTED')
rewind(10)
DO
READ(UNIT=10, FMT='(5I4)', iostat=reason) SMALL
if (reason /= 0) exit
WRITE(UNIT=*, FMT='(6X,5I4)') SMALL
ENDDO
write (*,*) small(2,1)
end program
Here is output:
11 12 13 14 15
21 22 23 24 25
12

Well, you have defined SMALL to be a 1-D array, and Fortran is just trying to be helpful. You should perhaps have defined SMALL like this:
integer, dimension(2,5) :: small
What happened when the read statement was executed was that the system ran out of edit descriptor (you specified 5 integers) before either SMALL was full or the end of the file was encountered. If I remember rightly Fortran will re-use the edit descriptor until either SMALL is full or the end-of-file is encountered. But this behaviour has been changed over the years, according to Fortran standards, and various compilers have implemented various non-standard features in this part of the language, so you may need to check your compiler's documentation or do some more experiments to figure out exactly what happens.
I think your code is also a bit peculiar in that you read into SMALL 3 times. Why?
EDIT: OK, we're getting there. You have just discovered that Fortran stores arrays in column-major order. I believe that most other programming languages store them in row-major order. In other words, the first element of your array is small(1,1), the second (in memory) is small(2,1), the third is small(1,2) and so forth. I think that your read (and write) statements are not standard but widely implemented (which is not unusual in Fortran compilers). I may be wrong, it may be standard. Either way, the read statement is being interpreted to read the elements of small in column-major order. The first number read is put in small(1,1), the second in small(2,1), the third in small(1,2) and so on.
Your write statement makes use of the same feature; you might have discovered this for yourself if you had written out the elements in loops with the indices printed too.
The idiomatic Fortran way of reading an array and controlling the order in which elements are placed into the array, is to include an implied-do loop in the read statement, like this:
READ(UNIT=10, FMT='(5I4)', iostat=reason) ((SMALL(row,col), col = 1,numCol), row=1,numRow)
You can also use this approach in write statements.
You should also study your compiler documentation carefully and determine how to switch on warnings for all non-standard features.
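To see the difference between the two read orders without touching a file, here is a minimal sketch that reads the same two records twice from an internal file (a character array, where each element acts as one record). The program name and the arrays a and b are made up for illustration; the format (5I4) is the one from the question.

```fortran
program read_order_demo
   implicit none
   character(len=20) :: lines(2)
   integer :: a(2,5), b(2,5)
   integer :: row, col

   ! Each element of the character array is one record of the internal file.
   lines(1) = '  11  12  13  14  15'
   lines(2) = '  21  22  23  24  25'

   ! Whole-array read: the values fill a in array element (column-major) order.
   read (lines, '(5I4)') a

   ! Implied-do read: col varies fastest, so each record fills one row of b.
   read (lines, '(5I4)') ((b(row, col), col = 1, 5), row = 1, 2)

   write (*, '(A,I0,A,I0)') 'whole-array: a(2,1) = ', a(2,1), ', a(1,2) = ', a(1,2)
   write (*, '(A,I0,A,I0)') 'implied-do:  b(2,1) = ', b(2,1), ', b(1,2) = ', b(1,2)
end program read_order_demo
```

With the whole-array read, a(2,1) comes out as 12 (the second number in the file), which matches the output in the question. With the implied-do read, b(2,1) is 21 and b(1,2) is 12.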

Adding to what High Performance Mark wrote...
If you want to use commas to separate the numbers, then you should use list-directed IO rather than formatted IO. (Sometimes this is called format-free IO, but that non-standard term is easy to confuse with binary IO). This is easier to use since you don't have to arrange the numbers precisely in columns and can separate them with spaces or commas. The read is simply "read (10, *) variables"
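As a rough sketch of the list-directed approach, assuming the comma-separated layout from the question: the program below first writes a small sample file (the name boundary_demo.txt is made up so the example is self-contained) and then reads it back one row at a time with "read (u, *)".

```fortran
program demo_list_directed
   implicit none
   integer, dimension(2,5) :: small
   integer :: irow, jcol, ios, u

   ! Write the sample data first so the sketch does not depend on an existing file.
   open (newunit=u, file='boundary_demo.txt', status='replace', action='write')
   write (u, '(A)') '11, 12, 13, 14, 15'
   write (u, '(A)') '21, 22, 23, 24, 25'
   close (u)

   ! List-directed read: commas or blanks separate the values, no column alignment needed.
   open (newunit=u, file='boundary_demo.txt', status='old', action='read')
   do irow = 1, ubound(small, 1)
      read (u, *, iostat=ios) (small(irow, jcol), jcol = 1, ubound(small, 2))
      if (ios /= 0) stop 'read failed'
   end do
   close (u)

   write (*, '(A,I0,A,I0)') 'small(1,2) = ', small(1,2), ', small(2,1) = ', small(2,1)
end program demo_list_directed
```

This should print small(1,2) = 12, small(2,1) = 21. The newunit= specifier needs a Fortran 2008 compiler; with an older one, a fixed unit number such as 10 works the same way.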
But sticking to formatted IO, here is some sample code:
program demo1
implicit none
integer, dimension (2,5) :: small
integer :: irow, jcol
open ( unit=10, file='boundary.txt', access='sequential', form='formatted' )
do irow=1, ubound (small, 1)
read (10, '(5I4)') (small (irow, jcol), jcol=1, ubound (small, 2))
end do
write (*, '( / "small (1,2) =", I2, " and small (2,1)=", I2 )' ) small (1,2), small (2,1)
end program demo1
Using the I4 formatted read, the data need to be in columns:
12341234123412341234
11 12 13 14 15
21 22 23 24 25
The data file shouldn't contain the first row "1234..." -- that is in the example to make the alignment required for the format 5I4 clear.
With my example program, there is an outer do loop for irow and an "implied do loop" as part of the read statement. You could also eliminate the outer do loop and use two implied do loops on the read statement, as High Performance Mark showed. In this case, if you kept the format specification (5I4), it would get reused to read the second line -- this is called format reversion. (On a more complicated format, one needs to read the rules to understand which part of the format is reused in format reversion.) This is standard, and has been so at least since FORTRAN 77 and probably FORTRAN IV. (Of course, the declarations and style of my example are Fortran 90).
I used "ubound" so that you neither have to carry around variables storing the dimensions of the array, nor use specific numeric values. The latter method can cause problems if you later decide to change the dimensions of the array -- then you have to hunt down all of the specific values (here 2 and 5) and change them.
There is no need for a rewind after an open statement.

Related

Can overindexing in FORTRAN 77 modify the program itself?

Here is a little program in FORTRAN 77
dimension totlev(20)
do 100 i=1,25
totlev(i)=0.0
write(0,*) 'totlev i=',i, totlev(i)
100 continue
end
I compile it using MinGW by typing gfortran test.f and I do get a warning (not an error):
test.f:4:14:
do 100 i=1,25
2
totlev(i)=0.0
1
Warning: Array reference at (1) out of bounds (25 > 20) in loop beginning at (2)
test.f:5:40:
test.f:3:72:
do 100 i=1,25
2
test.f:5:40:
write(0,*) 'totlev i=',i, totlev(i)
1
Warning: Array reference at (1) out of bounds (25 > 20) in loop beginning at (2)
However, such a warning would not always be produced in a longer program. An executable is created. When I run it, it behaves like an infinite loop.
And this is my problem: How is an infinite loop even possible with the DO iteration? Isn't it a logical impossibility? My only explanation is that overindexing in this case reaches to the program code itself and changes it. Is that possible?
I use Windows 7 OS if that's relevant.
It's not changing the code, it's changing the variable i. Both the array totlev(20) and the scalar i are local variables, and thus typically stored in the program's stack frame (though the standard leaves this choice to the 'processor', Fortran-speak for implementation). In this case the compiler apparently put i 4 'real's (probably 16 bytes) after the end of totlev, so assigning to totlev(24) actually changes i. Fortran basically requires that an integer and single/default-precision real variable be the same size, and while it doesn't require any particular relationship between the representations for integers and reals, most machines today use 'IEEE 754' floating-point and in that system a real 0.0 has the same representation as an integer 0.
On many though not all computer architectures it is possible to address code by indexing an array out of range, but this almost always requires indexes far out of range: millions or billions or more, not one or two. On older architectures it was often possible both to read and write code this way, but most systems since about 1980 have memory protection so that you can't write to code. In particular all Windows NT-series systems do this, which includes Windows 7.
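A minimal sketch of how to avoid the overrun in the first place, assuming the array only needs to be zeroed: take the loop limit from the array itself with size() rather than from a hard-coded number. The program name bounds_demo is made up for illustration.

```fortran
program bounds_demo
   implicit none
   real :: totlev(20)
   integer :: i

   ! The loop limit comes from the array, so it cannot run past the end.
   do i = 1, size(totlev)
      totlev(i) = 0.0
   end do

   write (*, '(A,I0,A)') 'zeroed ', size(totlev), ' elements'
end program bounds_demo
```

With gfortran, compiling with -fcheck=bounds adds run-time subscript checks, so an out-of-range index stops the program with an error message instead of silently overwriting a neighbouring variable such as i.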

"Insufficient virtual memory" error for allocating small arrays

I have been using Fortran for a few months now, but I am self-taught and have only been learning it by reading someone else's codes so my knowledge of Fortran is very limited. I wrote this function which is meant to read a text file containing data and save these data in an array. Since I don't know the size of the data, I choose to allocate the array within the function.
FUNCTION RSEBIN(NAMEIN,NZNSEB)
IMPLICIT DOUBLE PRECISION (A-H, O-Z)
INTEGER DSEBTP, IIND, NZNSEB
CHARACTER(LEN=75) :: FILNAM
CHARACTER NAMEIN*(*)
REAL, ALLOCATABLE :: RSEBIN(:,:)
WRITE (FILNAM,1500) 'Extra_InputFiles/SEB_inputs/SEB_', NAMEIN,
2 '.txt' !Define the path and name of the input data text file
1500 FORMAT (A32,A,A4)
OPEN (UNIT=101, FILE=FILNAM, STATUS='OLD')
READ(101,*) !Skip the header
DSEBTP = 0
DO
READ(101,*,IOSTAT=IO) TRASH
IF (IO.NE.0) EXIT !Exit the loop when last line has been reached
DSEBTP = DSEBTP + 1 !Counts how many time periods inputs are set for the input data type
END DO
REWIND(101) !Rewind text file to read the inputs
ALLOCATE(RSEBIN(DSEBTP,NZNSEB+1)) !Allocate the input data array
READ(101,*) !Skip the header
DO 1510 ISEBTP=1,DSEBTP
READ(101,*) (RSEBIN(ISEBTP,IIND), IIND=1, NZNSEB+1) !Save the data in the main array
1510 CONTINUE
CLOSE (UNIT=101)
RETURN
END FUNCTION
I then use this function in another subroutine with this following statement:
ASEBAT = RSEBIN('AirTemperature',NZNSEB) !Allocate the air temperature array (first column is time)
When I try to run the program, I get an "Insufficient virtual memory" error. After a quick search, I discovered that this error usually occurs when one is allocating huge arrays. However, during my tests, I was only using a 3 X 5 array. After a few more tests, I realized that the function works fine if I declare the dimensions of my array RSEBIN rather than making it allocatable and allocating it in the function. However, this solution is not sustainable for me as I want this function to be able to read text files of various dimensions.
Does anyone have an idea why I have such error? Should I avoid allocating arrays in a function? As I said previously, I am fairly new to Fortran and I am pretty sure my code has many issues, so I apologize for my primitive code writing and would be grateful for any tip.
Also, I should note that I'm using the Intel Fortran Compiler from oneAPI for Windows. I recently switched from the fortran compiler in Intel XE, with which, if I can recall, I was using a similar function without any issue.
Thanks!

Intel Fortran error "allocatable array or pointer is not allocated"

When I tried to run a huge Fortran code (compiled using Intel compiler version 13.1.3.192), it gave me an error message like this:
...
Info[FDFI_Setup]: HPDF code version number is 1.00246
forrtl: severe (153): allocatable array or pointer is not allocated
Image PC Routine Line Source
arts 0000000002AD96BE Unknown Unknown Unknown
arts 0000000002AD8156 Unknown Unknown Unknown
arts 0000000002A87532 Unknown Unknown Unknown
...
Nonetheless, if I insert a small write statement (which is just to check the code, not to disturb its original purpose) in one of the subroutines as follows (I couldn't post all of the code since it is too large):
...
endif
call GetInputLine(Unit,line,eof,err)
enddo
if(err) return
! - [elfsummer] 20140815 Checkpoint 23
open(unit = 1, file = '/bin/monitor/log_checkpoint',status='old',position='append')
write(1,*) "BEFORE checking required keys: so far so good!"
close(1)
! check required keys
! for modes = 2,3, P and T are the required keys
if(StrmDat%ModeCI==2.or.StrmDat%ModeCI==3) then
...
then suddenly, the error message shown above disappears and the code can run correctly! I also tried to insert such write statements in other locations in the source code but the above error message still exists.
According to Intel's documentation:
severe (153): Allocatable array or pointer is not allocated
FOR$IOS_INVDEALLOC. A Fortran 90 allocatable array or pointer must already be allocated when you attempt to deallocate it. You must allocate the array or pointer before it can again be deallocated.
Note: This error can be returned by STAT in a DEALLOCATE statement.
However, I couldn't see any relation between the error and the write statements I added to the code. There is no "allocate" statement near the place where I added the write statements.
So I am quite confused. Does anybody know the reasons? Any help is greatly appreciated!!
With traceback option, I could locate the error source directly:
subroutine StringRead(Str,delimiter,StrArray,ns) ! [private] read strings separated by delimiter
implicit none
character*(*),intent(in) :: Str
character*(*),intent(in) :: delimiter
character*(*),pointer :: StrArray(:)
integer,intent(out) :: ns
! - local variables
character(len=len(Str)) :: tline
integer :: nvalue,nvalue_max
character(len=len(StrArray)),pointer:: sarray(:),sarray_bak(:)
integer :: len_a,len_d,i
! deallocate StrArray
if(associated(StrArray)) deallocate(StrArray)
The error, according to the information the traceback gave me, lies in the last statement shown above. If I comment out this statement, then the "forrtl: severe (153)" error disappears, but new errors are generated... But still, I don't think this statement itself could go wrong... It acts as if it just ignores the if condition and directly executes the deallocate command, which seems weird to me.
You could have a bug in which you are illegally writing to memory and damaging the structure that stores the allocation information. Changing the code might cause the memory damage to occur elsewhere and that specific error to disappear. Illegal memory accesses typically occur in two ways in Fortran: 1) illegal subscripts, and 2) a mismatch between actual and dummy arguments, i.e., between the variables in the call and the variables as declared in the procedure. You can search for the first type of error by using your compiler's option for run-time subscript checking. You can guard against the second by placing all of your procedures in modules and use-ing those modules, so that the compiler can check for argument consistency.
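A small sketch of the module approach, with made-up names (reader_mod, first_squares): because the function lives in a module and the caller has a use statement, the compiler sees the full interface, including the allocatable result, and can check the arguments at compile time.

```fortran
module reader_mod
   implicit none
contains
   ! Hypothetical example routine: returns an allocatable array of n squares.
   function first_squares(n) result(vals)
      integer, intent(in) :: n
      integer, allocatable :: vals(:)
      integer :: i
      allocate (vals(n))
      do i = 1, n
         vals(i) = i*i
      end do
   end function first_squares
end module reader_mod

program use_module_demo
   use reader_mod, only: first_squares
   implicit none
   integer, allocatable :: v(:)

   ! The explicit interface from the module tells the compiler the result is allocatable.
   v = first_squares(4)
   write (*, '(4(I0,1X))') v
end program use_module_demo
```

Calling first_squares with, say, a real argument would now be rejected at compile time rather than corrupting memory at run time.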
Sounds like some of the earlier comments give the general explanation. However,
1) Is StrArray(:) an Intent(out)? That is, are you reading the file's lines into StrArray() in the s/r, with the hope of returning that as the file's content? If so, declare it as an (Out), or whatever it should be.
2) Why is StrArray() a Pointer? Does it need to be a Pointer? If all you want is file content, you may be better off using a non-Pointer.
You may still need an Allocatable, or Automatic or something, but non-Pointers are easier in many cases.
3) If you must have StrArray(:) as a Pointer, then its size/shape etc. must be created prior to use. If the calling sequence's ACTUAL Arg is correctly defined (and if StrArray() is Intent(In) or Intent(InOut)), then that might do it.
By contrast, if it is an (Out), then, as with all Pointer arrays, it must FIRST be Allocated in the s/r.
If it is not Allocated or nullified somewhere early on, then its association status is undefined, Associated() may return .true. for it, and so the DeAllocate() fails, since it has nothing to DeAlloc, hence error 153.
4) It is possible that you may wish to use this to read files without first knowing the number of lines to read. In that case, you cannot (at least not easily), Allocate StrArray() in advance, since you don't know the Size. In this case, alternate strategies are required.
One possible solution is a loop that simply reads the first char, or advances somehow, for each line in the file. Have the loop track the "sum" of the lines read, until EOF. Then you will know the size of the file (in terms of number of lines), and you can then allocate StrArray(SumLines) or something. Something like
SumLines = 0
Do
Read (Unit, '(A)', IOStat=ios) ! skip one line; Unit and ios are assumed to be declared integers
If (ios /= 0) Exit ! EOF (or a read error): stop counting
SumLines = SumLines + 1
End Do
It may be best to do this in a separate s/r, so that the Size etc are known prior to calling the FileRead bits (i.e. that the file size is set prior to the FileRead s/r call).
However, that still leaves you with the problem of what Character(Len) to use. There are many possible solutions to this. Three of which are:
a) Use max length, like Character(Len = 2048), Intent(Out), or better yet, some compile time constant Parameter, call it MaxLineWidth
This has the obvious limitation to lines that are <= MaxLineWidth, and the memory usage may be excessively large when there are many "short lines", etc.
b) Use a single char array, like Character(Len = 1), Intent(Out) :: StrArrayChar(:,:)
This is 2-D, since you need 1 D for the chars in each line, and the 2nd D for the lines.
This is a bit better compared to a) since it gives control over line width.
c) A more general approach might rely on a User Defined Type such as:
Type MyFileType
Character(Len=1), Allocatable :: FileLine(:) ! this give variable length lines, but each "line" must be allocated to the length of the line
End Type MyFileType
Then, create an array of this Type, such as:
Type(MyFileType), Allocatable :: MyFile(:) ! or, instead of Allocatable, can use Automatic etc etc
Then, Allocate MyFile to Size = num lines
... anyway, there are various choices, each with its own suitability for varying circumstances (and I have omitted much "housekeeping" re DeAllocs etc, which you will need to implement).
Incidentally, c) is also one possible prototype for "variable length strings" for many Fortran compilers that don't support such explicitly.
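As a rough sketch of option c) combined with the two-pass counting idea, under stated assumptions: the file name varlines_demo.txt, the buffer size MaxLineWidth = 2048, and the program name are all made up, and the sample file is written by the program itself so the example is self-contained. Trailing blanks in a line are dropped, because the line length is taken with len_trim.

```fortran
program varlines_demo
   implicit none
   integer, parameter :: MaxLineWidth = 2048   ! assumed upper limit on line length
   type MyFileType
      character(len=1), allocatable :: FileLine(:)
   end type MyFileType
   type(MyFileType), allocatable :: MyFile(:)
   character(len=MaxLineWidth) :: buffer
   integer :: u, ios, SumLines, i, j, n

   ! Create a small sample file.
   open (newunit=u, file='varlines_demo.txt', status='replace', action='write')
   write (u, '(A)') 'short'
   write (u, '(A)') 'a somewhat longer line'
   close (u)

   ! First pass: count the lines.
   open (newunit=u, file='varlines_demo.txt', status='old', action='read')
   SumLines = 0
   do
      read (u, '(A)', iostat=ios) buffer
      if (ios /= 0) exit
      SumLines = SumLines + 1
   end do

   ! Second pass: store each line at its own (trimmed) length.
   allocate (MyFile(SumLines))
   rewind (u)
   do i = 1, SumLines
      read (u, '(A)') buffer
      n = len_trim(buffer)
      allocate (MyFile(i)%FileLine(n))
      do j = 1, n
         MyFile(i)%FileLine(j) = buffer(j:j)
      end do
   end do
   close (u)

   do i = 1, SumLines
      write (*, '(I0,A,I0,A)') i, ': ', size(MyFile(i)%FileLine), ' characters'
   end do
end program varlines_demo
```

Because MyFile and its components are allocatables rather than pointers, they start out unallocated and are freed automatically, which avoids the undefined-status problem discussed above.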

sizeof in fortran

It's quite common in C-code to see stuff like:
malloc(sizeof(int)*100);
which will return a pointer to a block of memory big enough to hold 100 ints. Is there any equivalent in fortran?
Use case:
I have a binary file which is opened as:
open(unit=10,file='foo.dat',access='stream',form='unformatted',status='old')
I know that the file contains "records" which consist of a header with 20 integers, 20 real numbers and 80 characters, then another N real numbers. Each file can have hundreds of records. Basically, I'd like to read or write to a particular record in this file (assuming N is a fixed constant for simplicity).
I can easily calculate the position in the file I want to write if I know the size of each data-type:
header_size = SIZEOF_INT*20 + SIZEOF_FLOAT*20 + SIZEOF_CHAR*80
data_size = N*SIZEOF_FLOAT
position = (record_num-1)*(header_size+data_size)+1
Currently I have
!Hardcoded :-(
SIZEOF_INT = 4
SIZEOF_FLOAT = 4
SIZEOF_DOUBLE = 8
SIZEOF_CHAR = 1
Is there any way to do better?
constraints:
The code is meant to be run on a variety of platforms with a variety of compilers. A standard compliant solution is definitely preferred.
In your use case I think you could use
inquire(iolength=...) io-list
That will give you how many "file storage units" are required for the io-list. A caveat with calculating offsets in files with Fortran is that "file storage unit" need not be in bytes, and indeed I recall one quite popular compiler by default using a word (4 bytes) as the file storage unit. However, by using the iolength thing you don't need to worry about this issue.
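A minimal sketch of this for the record layout in the question, assuming N = 100 and made-up variable names (header_ints, header_reals, title, values). The result is in file storage units, which is also the unit used for stream positions, but it is worth checking on each target compiler that the length reported for a full io-list matches how the records were actually written.

```fortran
program record_length_demo
   use iso_fortran_env, only: file_storage_size
   implicit none
   integer, parameter :: n = 100          ! assumed number of data values per record
   integer :: header_ints(20)
   real :: header_reals(20), values(n)
   character(len=80) :: title
   integer :: record_units, record_num, position

   ! Ask the processor how many file storage units one full record occupies.
   inquire (iolength=record_units) header_ints, header_reals, title, values

   ! Same position formula as in the question, without hard-coded type sizes.
   record_num = 3
   position = (record_num - 1)*record_units + 1

   write (*, '(A,I0)') 'file storage unit size in bits: ', file_storage_size
   write (*, '(A,I0)') 'units per record: ', record_units
   write (*, '(A,I0)') 'start of record 3: ', position
end program record_length_demo
```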
@janneb's answer will address the OP's question, but it doesn't answer the "sizeof" question for Fortran.
A combination of inquire and file_storage_size will give the size of a type. Try this code:
program sizeof
use iso_fortran_env
integer :: num_file_storage_units
integer :: num_bytes
inquire(iolength=num_file_storage_units) 1.0D0
num_bytes = num_file_storage_units*FILE_STORAGE_SIZE/8
write(*,*) "double has size: ", num_bytes
end program sizeof
See:
http://gcc.gnu.org/onlinedocs/gfortran/ISO_005fFORTRAN_005fENV.html
http://h21007.www2.hp.com/portal/download/files/unprot/fortran/docs/lrm/lrm0514.htm
If all the records are the same, this would seem to be a case to use direct access rather than stream access. Then don't calculate the position in the file, you tell the compiler the record that you want, and it accesses it. Unless you want these files to be portable across platforms or the records are not all the same ... then you have to have more control or calculate the length of the records. While the original Fortran 90 concept was to declare variables according to the required precision, there are now portable ways to declare variables by size. Either with types provided by the already mentioned iso_c_binding module, or from the iso_fortran_env module.

Unexpected "padding" in a Fortran unformatted file

I don't understand the format of unformatted files in Fortran.
For example:
open (3,file=filename,form="unformatted",access="sequential")
write(3) matrix(i,:)
outputs a row of a matrix into a file. I've discovered that it pads the file with 4 bytes on either end; however, I don't really understand why, or how to control this behavior. Is there a way to remove the padding?
For unformatted sequential IO, Fortran compilers typically write the length of the record at the beginning and end of the record. Most but not all compilers use four bytes. This aids in reading records, e.g., the length at the end assists with a backspace operation. You can suppress this with the Stream IO mode of Fortran 2003, which was added for compatibility with other languages. Use access='stream' in your open statement.
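A small sketch of the stream approach, with a made-up file name (stream_demo.bin) and a four-element row standing in for matrix(i,:). The inquire with size= reports the file size in file storage units, so on a compiler with 4-byte default reals and byte-sized storage units one would expect 16, with no record markers added.

```fortran
program stream_demo
   implicit none
   real :: row(4)
   integer :: u, nunits

   row = [1.0, 2.0, 3.0, 4.0]

   ! Stream access writes only the data, without record length markers.
   open (newunit=u, file='stream_demo.bin', form='unformatted', &
         access='stream', status='replace')
   write (u) row
   close (u)

   inquire (file='stream_demo.bin', size=nunits)
   write (*, '(A,I0)') 'file size in file storage units: ', nunits
end program stream_demo
```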
I never used sequential access with unformatted output for this exact reason. However, it depends on the application, and sometimes it is convenient to have a record length indicator (especially for unstructured data). As suggested by steabert in "Looking at binary output from fortran on gnuplot", you can avoid this by using the keyword argument ACCESS = 'DIRECT', in which case you need to specify the record length. This method is convenient for efficient storage of large multi-dimensional structured data (constant record length). The following example writes an unformatted file whose size equals the size of the array:
REAL(KIND=4),DIMENSION(10) :: a = 3.141
INTEGER :: reclen
INQUIRE(iolength=reclen)a
OPEN(UNIT=10,FILE='direct.out',FORM='UNFORMATTED',&
ACCESS='DIRECT',RECL=reclen)
WRITE(UNIT=10,REC=1)a
CLOSE(UNIT=10)
END
Note that this is not the ideal approach in terms of portability. In an unformatted file written with direct access, there is no information about the size of each element. A readme text file that describes the data size does the job fine for me, and I prefer this method to the padding in sequential mode.
Fortran IO is record based by default, not stream based. Every time you write something through write(), you write not only the data but also beginning and end markers for that record. Both record markers hold the length of that record. This is why writing a bunch of reals in a single write (one record: one begin marker, the bunch of reals, one end marker) produces a different file size than writing each real in a separate write (multiple records, each consisting of one begin marker, one real, and one end marker). This is extremely important if you are writing large matrices, as the file size can balloon if they are written improperly.
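The size difference can be checked with a short sketch like the one below, which writes the same 100 reals once as a single record and once as 100 records, then compares the file sizes. The file names are made up, and the exact numbers depend on the compiler's record marker size; with 4-byte reals and 4-byte markers one would expect roughly 408 versus 1200 bytes.

```fortran
program record_overhead_demo
   implicit none
   real :: a(100)
   integer :: u, i, size_one, size_many

   a = 1.0

   ! One write statement: one record, so one pair of record markers.
   open (newunit=u, file='one_record.bin', form='unformatted', &
         access='sequential', status='replace')
   write (u) a
   close (u)

   ! One write per element: 100 records, each with its own pair of markers.
   open (newunit=u, file='many_records.bin', form='unformatted', &
         access='sequential', status='replace')
   do i = 1, size(a)
      write (u) a(i)
   end do
   close (u)

   inquire (file='one_record.bin', size=size_one)
   inquire (file='many_records.bin', size=size_many)
   write (*, '(A,I0)') 'one record:   ', size_one
   write (*, '(A,I0)') 'many records: ', size_many
end program record_overhead_demo
```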
I am quite familiar with the differing unformatted IO outputs of the Intel and Gnu compilers. Fortunately, my vast experience dating back to 1970s IBMs allowed me to decode things. Gnu pads records with 4-byte integer counters giving the record length. Intel uses a 1-byte counter and a number of embedded coding values to signify a continuation record or the end of a count. One can still have very long record lengths even though only 1 byte is used.
I have software compiled by the Gnu compiler that I had to modify so it could read an unformatted file generated by either compiler, so it has to detect which format it finds. Reading an unformatted file generated by the Intel compiler (which follows the "old" IBM days) takes "forever" using Gnu's fgetc or opening the file in stream mode. Converting the file to what Gnu expects makes reading up to 100 times faster. Whether detection and conversion are worth the bother depends on your file size. I reduced my program startup time (which opens a large unformatted file) from 5 minutes down to 10 seconds. I had to add options to convert back again if the user wants to take the file back to an Intel-compiled program. It's all a pain, but there you go.