In Fortran, I'm trying to read a file with data in 8-bit (hexadecimal) bytes, on Linux.
In 'hexedit' the first line looks as it should, for the tiff-file it is.
49 49 2A 00 08 00 20 00 00 00 0B 02 00 00 00 00 II*... .........
I declare a two-byte character variable (character(len=2) :: tifhead(8))
and read like this:
open(1,file=filename,access='stream')
read(1) tifhead,greyvalue
I get the first two (49 49), which print out as II in a formatted write
(format (2Z2)), but not the others.
How can I get all these hex values out? I should see 49 49 2A 00 08 .......
Your read statement will simply read 2 characters for tifhead(1), the next 2 characters for tifhead(2), etc., including spaces. Therefore you end up with tifhead(1)="49", tifhead(2)=" 4", tifhead(3)="9 ", and so on. You think you read the first 2 bytes correctly only because you print the strings "49", " 4", "9 ",... one after the other, so it looks like "49 49 " in the output. The compiler has no way to know that a single blank separates the strings, with 2 blanks after every fourth value.
To read your data properly you must use formatted reading which implies you must also declare your stream as 'formatted' in the open statement. The following example shows how this can be done:
program example
implicit none
character(len=2) :: tifhead(8), greyscale(8)
open(1, file="example.txt", access='stream', form='formatted')
read(1, "(4(a2,tr1),tr1,3(a2,tr1),a2)", advance='no') tifhead
read(1, "(tr2,4(a2,tr1),tr1,3(a2,tr1),a2)", advance='no') greyscale
close(1)
print "(a,7(a2,tr1),a2,a)", " tifhead = (", tifhead, ")"
print "(a,7(a2,tr1),a2,a)", "greyscale = (", greyscale, ")"
end program example
Perhaps some explanation is needed: a2,tr1 means read a string of 2 characters, then advance the reading pointer once (this skips the space between your hexadecimal "numbers" - actually, they are treated as just strings). 4(a2,tr1) means do that 4 times. This reads the first 4 bytes plus one space. Now, there is one more space before the next data to be read so we add tr1 to skip it, and our format is 4(a2,tr1),tr1 so far; then we read 3 more bytes with 3(a2,tr1), then the last byte alone with just a2 (not skipping the space after it). So the format string is (4(a2,tr1),tr1,3(a2,tr1),a2), which will read the first 8 bytes correctly, leaving the reading pointer right after the 8th byte. Note that advance='no' is necessary, otherwise Fortran will assume carriage return and will skip the rest of the data in the same record (line).
Now, to read the next 8 bytes we use the same format, except we add tr2 in the beginning to skip the two blank spaces. I added formatted printing in the program to check if data were read correctly. Running the program gives:
tifhead = (49 49 2A 00 08 00 20 00)
greyscale = (00 00 0B 02 00 00 00 00)
which verifies data were read correctly.
Last but not least, I would recommend avoiding the old-fashioned Fortran used in your code and in the example above. This means: use newunit to let the program pick a free unit number instead of hard-coding one, check whether the file you are trying to open actually exists and whether you have reached the end of the file, avoid unnamed arguments, use the dimension attribute to declare arrays, etc. None of this is strictly necessary, and it might look like unnecessary verbosity at first. But in the long run, being strict (as modern Fortran encourages) will save you a lot of time when debugging larger programs. So the above example could (arguably should) be written as follows.
program example2
implicit none
integer :: unt, status
character(len=2), dimension(8) :: tifhead, greyscale
open(newunit=unt, file="example.txt", access='stream', form='formatted',&
action='read', status='old', iostat=status)
if (status /= 0) then
print "(a)","Error reading file."; stop
end if
! More sophisticated reading is probably needed to check for end of file.
read(unit=unt, fmt="(4(a2,tr1),tr1,3(a2,tr1),a2)", advance='no') tifhead
read(unit=unt, fmt="(tr2,4(a2,tr1),tr1,3(a2,tr1),a2)") greyscale
close(unit=unt)
print "(a,7(a2,tr1),a2,a)", " tifhead = (", tifhead, ")"
print "(a,7(a2,tr1),a2,a)", "greyscale = (", greyscale, ")"
end program example2
I wasn't sure if I had to massively modify my previous answers (since I believe they still serve a purpose), so I decided to just add yet another answer, hopefully the last one. I apologize for the verbosity.
The following Fortran module provides a subroutine named tiff_reader_16bit, which reads a TIFF file (or any other binary file) and returns its content as an array of 16-bit values stored in default integers. It needs a Fortran 2008 compiler because of newunit:
module tiff_reader
implicit none
private
public :: tiff_reader_16bit
contains
subroutine tiff_reader_16bit(filename, tifdata, ndata)
character(len=*), intent(in) :: filename
integer, allocatable, intent(out) :: tifdata(:)
integer, intent(out) :: ndata
integer, parameter :: max_integers=10000000
integer :: unt, status, record_length, i, records, lsb, msb
character ch;
integer, dimension(max_integers) :: temp
ndata=0
inquire(iolength=record_length) ch
open(newunit=unt, file=filename, access='direct', form='unformatted',&
action='read', status='old', iostat=status, recl=record_length)
if (status /= 0) then
print "(3a)","Error reading file """,filename,""": File not found."; return
end if
records=1
do i=1,max_integers
read(unit=unt, rec=records, iostat=status) ch; msb=ichar(ch)
if (status /= 0) then; records=records-1; ndata=i-1; exit; end if
read(unit=unt, rec=records+1, iostat=status) ch; lsb=ichar(ch)
if (status /= 0) then; ndata=i; temp(ndata)=msb; exit; end if
temp(i)=lsb+256*msb; records=records+2
end do
close(unit=unt)
if (ndata==0) then
print "(a)","File partially read."; records=records-1; ndata=max_integers
end if
allocate(tifdata(ndata), stat=status); tifdata=temp(:ndata)
print "(2(i0,a),/)",records," records read, ",ndata," 16-bit integers returned."
end subroutine tiff_reader_16bit
end module tiff_reader
The subroutine takes the TIFF file name and returns an array of integers, together with the total number of integers read. Internally, it uses a fixed-size array temp to store the data temporarily. To save memory, it returns an allocatable array tifdata that holds only the part of temp that was actually filled. The maximum number of values read is set by the parameter max_integers to 10 million, but it can go up to huge(0) if necessary and if memory allows (on my system that is about 2.14 billion integers); it can go even further with a larger integer kind. There are other ways to do this that avoid the temporary fixed-size array, but they usually cost additional computation time, so I did not go that way. More sophisticated implementations are also possible, but they would add complexity that I don't think fits here.
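Since you need the results as 16-bit data, two consecutive bytes are read from the file and combined, treating the first as the most significant byte and the second as the least significant byte. This is why the first byte read in each iteration is multiplied by 256. The values therefore print in the same order as they appear in a hex dump. Note that this is NOT the numeric byte order of every binary file. A TIFF file declares its byte order in its first two bytes: II (49 49) means least significant byte first, MM (4D 4D) means most significant byte first. For an II file such as the one in the question, the numeric value of each pair is the first byte plus 256 times the second byte.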
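As a hedged illustration of the byte-order point, the sketch below writes the eight header bytes from the question to a scratch file, reads them back with stream access, and assembles the multi-byte values according to the II/MM marker. The file name header_demo.bin and the helper function combine are made up for this example; they are not part of the module above.

```fortran
program tiff_byte_order
   use, intrinsic :: iso_fortran_env, only: int32
   implicit none
   ! Hypothetical scratch file name, used only for this illustration.
   character(len=*), parameter :: fname = "header_demo.bin"
   character(len=1) :: b(8)
   integer :: unt
   logical :: little
   integer(int32) :: magic, ifd_offset

   ! Write the 8 header bytes from the question: 49 49 2A 00 08 00 20 00
   open(newunit=unt, file=fname, access='stream', form='unformatted', &
        status='replace', action='write')
   write(unt) achar(73), achar(73), achar(42), achar(0), &
              achar(8), achar(0), achar(32), achar(0)
   close(unt)

   ! Read them back, one byte per character.
   open(newunit=unt, file=fname, access='stream', form='unformatted', &
        status='old', action='read')
   read(unt) b
   close(unt, status='delete')

   ! 'II' = least significant byte first, 'MM' = most significant byte first.
   little = (b(1) == 'I' .and. b(2) == 'I')

   magic = combine(b(3:4), little)
   ifd_offset = combine(b(5:8), little)
   print "(a,i0)", "magic number = ", magic
   print "(a,i0,a,z8.8)", "IFD offset   = ", ifd_offset, " = 0x", ifd_offset

contains

   ! Assemble a value from raw bytes in the given byte order.
   ! Only safe for values below 2**31, which is enough for this header.
   pure function combine(bytes, little_endian) result(val)
      character(len=1), intent(in) :: bytes(:)
      logical, intent(in) :: little_endian
      integer(int32) :: val
      integer :: k
      val = 0
      do k = 1, size(bytes)
         if (little_endian) then
            val = val + ichar(bytes(k)) * 256**(k-1)
         else
            val = val * 256 + ichar(bytes(k))
         end if
      end do
   end function combine
end program tiff_byte_order
```

For the header in the question this should report a magic number of 42 and an IFD offset of 2097160 (0x00200008).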
The subroutine is lengthier than the previous examples I posted, but that's because I added error checking, which is actually necessary. You should always check if the file exists and if end of file has been reached while reading it. Special care must also be taken for TIFF images with an "orphan" last byte (this is indeed the case for the sample file "FLAG_T24.TIF" which I found here - but not the case for the sample image "MARBLES.TIF" found at the same webpage).
An example driver program using the module above would be:
program tiff_reader_example
use tiff_reader
implicit none
integer :: n
integer, allocatable :: tifdata(:)
call tiff_reader_16bit("FLAG_T24.TIF", tifdata, n);
if (n > 0) then
print "(a,7(z4.4,tr1),z4.4,a)", "First 8 integers read: (", tifdata(:8), ")"
print "(a,7(z4.4,tr1),z4.4,a)", " Last 8 integers read: (", tifdata(n-7:), ")"
deallocate(tifdata)
end if
end program tiff_reader_example
Running the program gives:
46371 records read, 23186 16-bit integers returned.
First 8 integers read: (4949 2A00 0800 0000 0E00 FE00 0400 0100)
Last 8 integers read: (F800 F8F8 00F8 F800 F8F8 00F8 F800 00F8)
which is correct. Note that in this case the number of records (= bytes, since each record is one character long) is not exactly double the number of integers returned. That's because this particular sample image has the "orphaned" last byte I mentioned earlier. Also note that I used another format to print 16-bit hexadecimals, including leading zeroes if needed.
There are more detailed explanations that can be given but this thread is already quite long. Feel free to ask in the comments if something is not clear.
EDIT: By default, Intel Fortran measures the record length (recl) of unformatted direct-access files in 4-byte words rather than bytes, which doesn't seem quite right to me. This unusual behavior can be changed with a compiler flag (-assume byterecl), but to avoid the lack of portability in case someone uses that compiler without the flag, I slightly modified the module tiff_reader to take care of this.
Assuming your data are actually stored in binary format (in fact it seems to be a tiff image data file), my first answer is valid only if you convert data to plain text. If you prefer to read the binary file directly, the simplest way I can think of is to open the file with access='direct', and read data byte-by-byte. Each byte is read as a character, then it is converted to an integer, which I guess is more useful than a string supposed to represent a hexadecimal number.
As an example, the following program will read the header (first 8 bytes) from a tiff data file. The example reads data from a sample tiff image I found here, but it works for any binary file.
program read_tiff_data
implicit none
integer :: unt, status, i
character :: ch
integer, dimension(8) :: tifhead
open(newunit=unt, file="flag_t24.tif", access='direct', form='unformatted',&
action='read', status='old', iostat=status, recl=1)
if (status /= 0) then
print "(a)","Error reading file."; stop
end if
do i=1,8
read(unit=unt, rec=i) ch; tifhead(i)=ichar(ch)
end do
close(unit=unt)
print "(a,7(i0,tr1),i0,a)", "tifhead = (", tifhead, ")"
end program read_tiff_data
The program gives this output:
tifhead = (73 73 42 0 8 0 0 0)
which is correct. You can easily expand the program to read more data from the file.
If you still need the hexadecimal representation, just replace i0 with z0 in the print statement so that it reads
print "(a,7(z0,tr1),z0,a)", "tifhead = (", tifhead, ")"
This will print the result in hexadecimals, in this case:
tifhead = (49 49 2A 0 8 0 0 0)
Here is the code that works for me. Most of this is comments. Any remarks you may have on the fortran style are most welcome. Please note that I've been familiar with fortran 77 in the past, and learned a little more modern fortran in the process of writing this piece of code
      program putiff
c This program is solely intended to read the data from the .tif files made by the CCD camera
c PIXIS 1024F at beamline 1-BM at the Advanced Photon Source, so that they can be manipulated
c in fortran. It is not a general .tif reader.
c A little extra work may make this a reader for baseline .tif files; some of the
c information below may help with such an implementation.
c
c The PIXIS .tif file is written in hex with the little-endian convention.
c The hex numbers have two 8-bit bytes. They are read with an integer(kind=2) declaration.
c When describing an unsigned integer these cover numbers from 0 to 65535 (or 2**16-1).
c For the PIXIS files the first two bytes are the decimal number 18761. The TIFF6 specification
c gives them as a hexadecimal number (0x4949 for the little-endian convention, 0x4D4D for the
c big-endian convention). The PIXIS files are little-endian.
c
c The next two bytes should be 42 decimal, and 0x2A.
c
c The next 4 bytes give the byte offset for the first image file directory (IFD) that contains
c all the other information needed to understand how the .tif files are put together.
c This number should be read together as a 4 byte integer (kind=4). These (unsigned) integers
c go from 0 to 2**32-1, or 4294967295: this is the maximum file length for a .tif file.
c For the PIXIS this number is 2097160, or 0x200008: in between are the image data for the
c PIXIS's 1024x1024 pixels, each with a two-byte gray range from 0 to 2**16-1 (or 65535 decimal).
c Therefore the PIXIS image can be read without understanding the IFD.
c
c In the listing further down, the line right below the hex representation gives the
c byte order for the little-endian convention indicated by the first two bytes. These are
c 0x4949 for little-endian (0x49 in both the first and the second byte) and 0x4D4D for
c big-endian. In little-endian order the least important byte comes first; within a
c multi-byte number the order is reversed byte by byte.
c
c One way to confirm all this information is to look at the files
c with a hex dump tool (linux has xxd) or a binary editor (linux has hexedit).
c For the PIXIS image .tif file, the first 8 bytes in hexedit are indeed:
c 49 49 2A 00 08 00 20 00
c For a little-endian file, the bytes are read from the least important to the
c most important within the two-byte number, like this:
c 49 49 2A 00 08 00 20 00
c (34 12) (34 12) (78 56 34 12)
c Here the byte order is indicated below the numbers. The second two-byte number is
c therefore 0x002A = 10+2*16+0*256+0*4096, or 42. Likewise, the last 4-byte number is 0x00200008.
c
c (When the individual bytes are read in binary (with 'xxd -b -l 100') this gives
c for the hexadecimals 49 49 2A 00 08 00 20 00
c binary 01001001 01001001 00101010 00000000 00001000 00000000 00100000 00000000
c in ASCII I I * . . . . . )
c After the PIXIS data comes the so-called IFD (Image File Directory).
c These contain 209 bytes. They mean something, but what I do not know. I printed them
c out one by one at the end of the program. Perhaps they are better read in two-byte units
c (right now they are read as integer(kind=1); integer(kind=2) may be better). But, then
c there's an odd number so you have to read one separately.
c I want to know these only because I want to use the same .tif format to
c write the results of rctopo (the max, the COM, the FWHM, and the spread).
c I know what's in the first 8 bytes, and what the data are, so I can just
c copy the ifd at the end and count on getting a good .tif file back.
c It's sort of stupid, but it should work.
      use iso_fortran_env
      implicit logical (A-Z)
      integer :: j,jmin,jmax
      integer :: k,kmin,kmax
      integer :: ifdlength
      integer :: unt,ios
      data jmin,kmin/1,1/
      parameter(jmax=1024,kmax=1024)
      parameter(ifdlength=209)
c 8-byte header that starts the PIXIS data file
      integer (kind=2) :: tifh12,tifh34 ! each two (8-bit) bytes
      integer (kind=4) :: tifh5678 ! 4 bytes
c image data and trailing IFD bytes; these declarations were missing from the
c posted excerpt and are inferred from the comments above
      integer (kind=2) :: greyread(jmax,kmax)
      integer (kind=1) :: ifd(ifdlength)
c open and read the file now that you have the correct file name in the sequence
      open(newunit=unt,file='tiff_file',access='stream',iostat=ios)
      if (ios /= 0) then ; call problem(ios,'read_in_samples'); end if
      read (unt) tifh12,tifh34,tifh5678,greyread,ifd
      close (unt)
      stop
      end
I've been trying debugging in code::blocks. I'm a student.
I understand value written on left is the address of the variable x in memory written in hex. What about these weird numbers on its right?
Also, what is the 'bytes' menu? If I select 16 bytes (from drop down arrow menu) I get one row only (in the image there are 8, as 16*8=256) . Does this mean this variable x is using 16 bytes in memory (but if I issue a sizeof(x) command, it gives me 4). So what's happening here?
Thanks.
Image
code:
#include <iostream>
using namespace std;
int main()
{
int x=10;
x=6;
x=13;
int y=12;
cout<<&y<<endl<<sizeof(x);
int z=19;
return 0;
}
Basics
The smallest memory size that we can work with is 1 byte. To represent a byte in hex, you need two hex digits (e.g. A0). This is the basis of any hex editor/viewer.
What you have when you debug memory is the memory portion which the application uses in the RAM. A typical hex editor/ viewer will look something like this,
The address (A) | The values (in hex) (B)    | ASCII representation (C)
       |        |            |               |            |
       V        |            V               |            V
0x000000        | 00 00 00 00 00 00 00 00 00 | . . . . . . . . .
The A column represents the base (beginning) address of the row.
The B column shows the actual values stored at a given address. The first hex value (ie: 00 in this depiction) has the address of 0x000000. The second one has the address of 0x000001 and so on.
The C column contains the ASCII representation of the values shown in the B column, i.e. what the bytes of that row look like when read as ASCII text.
On your platform an int is 4 bytes in size (32 bits), which is what sizeof(x) reported. In your example, the address of x is 0x61ff1c. Right next to it, in the B column, you get the value stored in little endian (https://en.wikipedia.org/wiki/Endianness).
The address of y is 0x61ff14 in the memory. Your image does not show it as the variable is stored at a memory location before the address of x.
The Bytes menu just lets you decide how many bytes to display, starting from the address that you specified (in your case it's the address of x). It says nothing about the size of the variable.
What about these weird numbers on its right?
The number on the extreme left is the memory address of the first byte (8 bits). In your image 0x61ff1c is not the address of the entire row, just the first byte (0d). The address of the second byte is 0x61ff1d and so on. If you check, the address to the left of the second row reads 0x61ff2c, exactly a difference of 16 bytes.
Now, about the contents. Let's look at 0x61ff1c: it contains 0x0d, which is 13 in decimal, the last value you assigned to x. The next three bytes (0x61ff1d to 0x61ff1f) are 00 00 00, because an int occupies 4 bytes here and x86 stores the least significant byte first. This region is the stack, so it holds data, not instructions. The same byte could mean something else elsewhere: in the code section of an x86 executable, 0x0d is the opcode of an OR instruction. The CPU does not understand C, it only understands binary. When you compile a program, the source file is converted into an executable that contains only opcodes for instructions and data. The instruction set can be completely different for a CPU with another architecture; there 0x0d can mean something else. Your compiler takes care of generating the right instructions for your CPU.
What you are seeing is directly the binary contents of memory(in hex to make it simpler), which will be read by the CPU.
ASCII
To the extreme right is the textual (ASCII) representation of the same bytes.
Ever tried opening an executable file in Notepad or another text editor? Basically, all computer files contain data in the form of 0s and 1s. It is the file type that tells the computer how to interpret it. When you create a text file in Notepad, it is interpreted as text/ASCII. For instance, the ASCII value of A is 65 or 0x41. When the computer sees 0x41 in a text file, it knows it is an A. But if your file type is not text but rather an executable, that same 0x41 could be an opcode for an instruction. When you open an executable with Notepad, you are interpreting CPU instructions as text. The memory view does the same with your data: each byte of the row is shown as the character with that code. For example, a byte 0x40 would be displayed as @ and a byte 0x23 as #, which is why a symbol shows up in the 7th column on the extreme right, while bytes without a printable character (such as 0x0d) typically get a placeholder.
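The 4-byte, least-significant-byte-first layout described above can also be observed without a debugger. A minimal sketch, written in Fortran rather than C++ and assuming a little-endian machine such as x86: it reinterprets the four bytes of a 32-bit integer holding 13 as four 1-byte integers and prints them in memory order.

```fortran
program int_bytes
   use, intrinsic :: iso_fortran_env, only: int8, int32
   implicit none
   integer(int32) :: x
   integer(int8) :: bytes(4)

   x = 13
   ! Reinterpret the 4 bytes of x as four 1-byte integers, in memory order.
   bytes = transfer(x, bytes)
   print "(a,4(1x,z2.2))", "bytes of x in memory order:", bytes
end program int_bytes
```

On a little-endian machine the first byte printed should be 0D followed by three 00 bytes; a big-endian machine would show the 0D last.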
It is my understanding that Fortran, when reading data from a file, will skip lines starting with an asterisk (*), assuming that they are comments. Well, I seem to be having a problem achieving this behavior with a very simple program I created. This is my simple Fortran program:
1 program test
2
3 integer dat1
4
5 open(unit=1,file="file.inp")
6
7 read(1,*) dat1
8
9
10 end program test
This is "file.inp":
1 *Hello
2 1
I built my simple program with
gfortran -g -o test test.f90
When I run, I get the error:
At line 7 of file test.f90 (unit = 1, file = 'file.inp')
Fortran runtime error: Bad integer for item 1 in list input
When I run the input file with the comment line deleted, i.e.:
1 1
The code runs fine. So it seems to be a problem with Fortran correctly interpreting that comment line. It must be something exceedingly simple I'm missing here, but I can't turn up anything on google.
Fortran doesn't automatically skip comment lines in input files. You can do this easily enough by first reading the line into a string, checking the first character for your comment symbol (or searching the string for that symbol), and then, if the line is not a comment, doing an "internal read" of the string to obtain the numeric value.
Something like:
use, intrinsic :: iso_fortran_env
character (len=200) :: line
integer :: dat1, RetCode
read_loop: do
read (1, '(A)', iostat=RetCode) line
if ( RetCode == iostat_end) exit read_loop
if ( RetCode /= 0 ) then
! handle the read error here
exit read_loop
end if
if ( index (line, "*") /= 0 ) cycle read_loop
read (line, *) dat1
end do read_loop
Fortran does not ignore anything by default, unless you are using namelists and in that case comments start with an exclamation mark.
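For completeness, here is a self-contained sketch of the read-into-a-string approach. It recreates the two-line input from the question under a made-up name (file_demo.inp, so that an existing file.inp is not overwritten) and, as an assumption, treats only lines whose first non-blank character is an asterisk as comments.

```fortran
program skip_comment_demo
   use, intrinsic :: iso_fortran_env, only: iostat_end
   implicit none
   ! Hypothetical file name for this illustration.
   character(len=*), parameter :: fname = "file_demo.inp"
   character(len=200) :: line
   integer :: unt, ios, dat1
   logical :: found

   ! Recreate the two-line input file from the question.
   open(newunit=unt, file=fname, status='replace', action='write')
   write(unt, '(a)') "*Hello"
   write(unt, '(a)') "1"
   close(unt)

   found = .false.
   open(newunit=unt, file=fname, status='old', action='read')
   do
      read(unt, '(a)', iostat=ios) line
      if (ios == iostat_end) exit             ! no more lines
      if (ios /= 0) error stop "read error"
      line = adjustl(line)
      if (len_trim(line) == 0) cycle          ! blank line
      if (line(1:1) == '*') cycle             ! comment line
      read(line, *, iostat=ios) dat1          ! internal read of the data line
      if (ios /= 0) error stop "bad integer"
      found = .true.
      exit
   end do
   close(unt, status='delete')

   if (found) print "(a,i0)", "dat1 = ", dat1
end program skip_comment_demo
```

With the input shown, the program should print dat1 = 1.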
I found the use of the backspace statement to be a lot more intuitive than the proposed solutions. The following subroutine skips the line when a comment character, "#" is encountered at the beginning of the line.
subroutine skip_comments(fileUnit)
integer, intent(in) :: fileUnit
character(len=1) :: firstChar
firstChar = '#'
do while (firstChar .eq. '#')
read(fileUnit, '(A)') firstChar
enddo
backspace(fileUnit)
end subroutine skip_comments
This subroutine may be used in programs before the read statement like so:
open(unit=10, file=filename)
call skip_comments(10)
read(10, *) a, b, c
call skip_comments(10)
read(10, *) d, e
close(10)
Limitations for the above implementation:
It will not work if the comment is placed between the values of a variable spanning multiple lines, say an array.
It can be inefficient for large input files, since some implementations handle the backspace statement by re-reading the file from the beginning up to the previous record.
Can only be used for sequential access files, i.e. typical ASCII text files. It will not work on files opened with direct access.
However, I find it a perfect fit for short files used for providing user-parameters.
I have an input file that I cannot alter the format. One of the lines in particular can contain either 6 or 7 reals and I don't have any way of knowing ahead of time.
After some reading, my understanding of the list-formatted read statement is that if I attempt to read 7 reals on a line containing 6, it will attempt to read from the next line. The author of the code says that when it was written, it would read the 6 reals and then default the 7th to 0. I am assuming he relied on some compiler specific behavior, because I cannot find a mention of this behavior anywhere.
I am using gfortran as my compiler. Is there a way to specify this behavior? Or is there a good way to count the number of inputs on a line and rewind, so that I can then choose to read the correct number?
Here is a little trick to accomplish that:
character*100 line
real array(7)
read(unit,'(a)')line !read whole line as string
line=trim(line)//' 0' !add a zero to the string
read(line,*)array !list read
If the input line only had 6 values, the zero is now the seventh.
If there were seven to begin with, the extra zero is simply never read. (The line must be shorter than 99 characters, otherwise the appended zero is truncated.)
I try to avoid using format specifiers on input as much as possible.
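A small self-contained check of the append-a-zero trick, using two in-memory lines instead of a file. The subroutine name parse and the sample values are made up for this sketch; the padded copy is two characters longer than the input so the appended zero cannot be cut off.

```fortran
program six_or_seven
   implicit none
   character(len=100) :: line
   real :: array(7)

   line = "1.0 2.0 3.0 4.0 5.0 6.0"
   call parse(line, array)
   print "(7(f5.1))", array

   line = "1.0 2.0 3.0 4.0 5.0 6.0 7.0"
   call parse(line, array)
   print "(7(f5.1))", array

contains

   ! Read up to 7 reals from text; a missing 7th value becomes 0.
   subroutine parse(text, values)
      character(len=*), intent(in) :: text
      real, intent(out) :: values(7)
      character(len=len(text)+2) :: padded
      padded = trim(text) // ' 0'
      read(padded, *) values      ! list-directed internal read
   end subroutine parse
end program six_or_seven
```

The first line printed should end in 0.0 and the second in 7.0.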
Maybe you should use the IOSTAT specifier to detect that there are only 6 values when you attempt to read 7, together with a non-advancing read (ADVANCE='NO'). Note that a non-advancing read that runs past the end of the line leaves the file positioned after that line, so a BACKSPACE is needed before retrying.
READ(LU,'(7F10.3)', IOSTAT=iError, ADVANCE='NO') MyArray(1:7)
IF(iError /= 0) THEN
! Fewer than 7 values on the line => go back and read 6 !
BACKSPACE(LU)
READ(LU, '(6F10.3)') MyArray(1:6)
ELSE
READ(LU, *) ! For skipping the line read with success with 7 values
ENDIF
IOSTAT takes a negative value when you reach the end of the file or, in a non-advancing read, the end of the record; a positive value when there is a problem with the read (typically a formatting error); and 0 when the read succeeds. See this link for a complete definition of gfortran error codes: http://www.hep.manchester.ac.uk/u/samt/misc/gfortran_errors.html
Another way to do it could be to read the line as a string and manipulate the string in order to get the vector values:
CHARACTER(LEN=1000) :: sLine
...
READ(LU, '(A)') sLine
READ(sLine,'(7F10.3)', IOSTAT=iError) MyArray(1:7)
IF(iError > 0) THEN
! Error when trying to read 7 values => try to read 6 !
READ(sLine, '(6F10.3)') MyArray(1:6)
ENDIF
If the values are written in fixed format, you can determine the length of the vector by testing the length of the line:
CHARACTER(LEN=1000) :: sLine
INTEGER :: nbValues
CHARACTER(LEN=2) :: sNbValues
...
READ(LU, '(A)') sLine
nbValues = LEN_TRIM(sLine) / 10 ! If format is like '(F10.x)'
WRITE(sNbValues, '(I2)') nbValues
READ(sLine, '('//TRIM(sNbValues)//'(F10.3))') MyArray(1:nbValues)
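One detail worth checking before relying on IOSTAT with fixed-width fields: in a formatted read, a field that contains only blanks is read as zero, so reading 7 fields from a long, blank-padded string that holds 6 does not by itself raise an error. The sketch below (sample values made up) builds such a line in memory, counts the fields from the line length, and shows the IOSTAT value and the 7th value.

```fortran
program fixed_width_demo
   implicit none
   character(len=1000) :: sLine
   real :: MyArray(7)
   integer :: iError, nbValues

   ! A line with six F10.3 fields (60 characters); the rest is blank.
   write(sLine, '(6F10.3)') 1.0, 2.0, 3.0, 4.0, 5.0, 6.0

   ! Field count deduced from the line length, as in the answer above.
   nbValues = len_trim(sLine) / 10

   ! The 7th field is all blanks, which a formatted read treats as zero.
   read(sLine, '(7F10.3)', iostat=iError) MyArray

   print "(a,i0)", "fields present: ", nbValues
   print "(a,i0)", "iostat        : ", iError
   print "(a,f6.3)", "7th value     : ", MyArray(7)
end program fixed_width_demo
```

This is effectively the "7th value defaults to 0" behavior the question describes, and it is why counting fields from the line length is the more reliable way to tell 6 values from 7.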
In a Fortran program, I need to write an array into a file with a specific format.
It works perfectly for smaller arrays (e.g. alen=10 in the example below), but not for bigger ones: each line is then split into two, as if a maximum number of characters per line had been exceeded.
Example (very similar to the structure in my program):
PROGRAM output_probl
IMPLICIT NONE
INTEGER, PARAMETER :: alen=110
DOUBLE PRECISION, DIMENSION(alen)::a
INTEGER :: i,j
OPEN(20,file='output.dat')
30 format(I5,1x,110(e14.6e3,1x))
DO i=1,15
DO j=1,alen
a(j)=(i*j**2)*0.0123456789
ENDDO
write(20,30)i,(a(j),j=1,alen)
ENDDO
END PROGRAM output_probl
It compiles and runs properly (with Compaq Visual Fortran). Just the output file is wrong. If I for example change the field width per array item from 14 to 8, it'll work fine (this is of course not a satisfactory solution).
I thought about an unsuitable default maximum record length, but can't find how to change it (even with RECL which doesn't seem to work - if you think it should, a concrete example with RECL is welcome).
This might be basic, but I've been stuck with it for some time... Any help is welcome, thanks a lot!
Why not stream access? With sequential access there is always some processor-dependent record length limit.
PROGRAM output_probl
IMPLICIT NONE
INTEGER, PARAMETER :: alen=110
DOUBLE PRECISION, DIMENSION(alen)::a
INTEGER :: i,j
OPEN(20,file='output.dat',access='stream', form='formatted',status='replace')
30 format(I5,1x,110(e14.6e3,1x))
DO i=1,15
DO j=1,alen
a(j)=(i*j**2)*0.0123456789
ENDDO
write(20,30)i,(a(j),j=1,alen)
ENDDO
END PROGRAM output_probl
As a note, I would use a character variable for the format string, or place it directly in the write statement, instead of the FORMAT statement with a label.
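A hedged sketch of both ideas, since the question asked for a concrete RECL example: the format is built in a character variable so that the repeat count follows alen, and the file is opened with an explicit recl. Whether recl lifts the limit depends on the compiler; the file name output_demo.dat is made up, and the program reads the first record back to report how long it actually is.

```fortran
program wide_records
   implicit none
   integer, parameter :: alen = 110
   double precision :: a(alen)
   character(len=64) :: fmt
   character(len=4000) :: line
   integer :: unt, i, j, width

   ! i5 plus alen fields of (1x + e14.6e3); the trailing 1x writes nothing.
   width = 5 + 15*alen

   ! Build the format at run time so that the repeat count follows alen.
   write(fmt, '(a,i0,a)') '(i5,1x,', alen, '(e14.6e3,1x))'

   ! recl requests a maximum record length; how it is honoured, and the
   ! default limit without it, are compiler dependent.
   open(newunit=unt, file='output_demo.dat', status='replace', &
        action='readwrite', recl=width+1)
   do i = 1, 15
      do j = 1, alen
         a(j) = (i*j**2)*0.0123456789d0
      end do
      write(unt, fmt) i, a
   end do

   ! Check that the first record came out as a single line of full width.
   rewind(unt)
   read(unt, '(a)') line
   close(unt, status='delete')
   print "(a,i0,a,i0)", "expected ", width, " characters, got ", len_trim(line)
end program wide_records
```

If the two numbers printed agree, the record was written as one line; if the second is smaller, the compiler split the record despite recl and stream access is the safer route.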
Fortran 95 version:
PROGRAM output_probl
IMPLICIT NONE
INTEGER, PARAMETER :: alen=110
DOUBLE PRECISION, DIMENSION(alen)::a
INTEGER :: i,j,rl
character(2000) :: ch
inquire(iolength=rl) ch
OPEN(20,file='output.dat',access='direct', form='unformatted',status='replace',recl=rl)
30 format(I5,1x,110(e14.6e3,1x))
DO i=1,15
DO j=1,alen
a(j)=(i*j**2)*0.0123456789
ENDDO
write(ch,30)i,(a(j),j=1,alen)
ch(2000:2000) = achar(10)
write(20,rec=i) ch
ENDDO
END PROGRAM output_probl
The program below tests how long a record your compiler can handle. With the Absoft compiler it works fine for n=10000 ten-character words, that is, a line 100000 characters wide (plus a couple) in all. With G95 I get the message "Not enough storage is available to process this command" for n=5000 (n=4000 works).
character*10,dimension(:),allocatable:: test
integer,dimension(:),allocatable::itest
1 write(*,*)'Enter n > 0'
read *, n
if(n.le.0) then
write(*,*)'requires value n > 0'
go to 1
endif
write(*,*)'n=',n
allocate(test(n),itest(n))
write(test,'((i10))')(i,i=1,n)
write(*,*)test
open(10,file='test.txt')
write(10,*)test
write(*,*)'file test.txt written'
close(10)
open(11,file='test.txt')
read(11,*)itest
write(*,*)itest
end
I have a shell script from which I pass a binary file to a fortran program such that
Mth=$1
loop=1
it=1
while test $it -le 12
do
Mth=`expr $Mth + $loop`
file="DataFile"$Mth".bin"
./fort_exe ${Yr} ${nt} ${it}
# Increment loop
it=`expr $it + 1`
done
This script is used to pass 12 files within a do loop to the fortran program. In the fortran program, I read the binary file passed from the shell script and I am trying to write a 2nd file which would compile in a single file all the data that was read from the consecutive files e.g.
!Open binary file passed from shell script
open(1,file='Datafile'//TRIM(Mth)//'.bin',action='read',form='unformatted',access='direct', &
recl=4*x*y, status='old')
! Open write file for t = 1. The status is different for t = 1 and t > 1, so I open it twice: I guess there is a more elegant way to do this...
open(2,file='Newfile.bin',action='write',form='unformatted', &
access='stream', position='append', status='replace')
irec = 0
do t = 1, nt
! Read input file
irec = irec + 1
read(1,rec=irec) val(:,:)
! write output file
irecW= irec + (imonth-1)*nt
if ( t .eq. 1) write(2,pos=irecW) val(:,:)
! Close file after t = 1, update the status to old and reopen.
if ( t .eq. 2) then
close (2)
open(2,file='Newfile.bin',action='write',form='unformatted', &
access='stream', position='append',status='old')
endif
if ( t .ge. 2) write(2,pos=irecW) val(:,:)
enddo
I can read the binary data from the first file no problem but when I try and read from another program the binary data from the file that I wrote in the first program such that
open(1,file='Newfile.bin',action='read',form='unformatted', &
access='stream', status='old')
irec=0
do t = 1, nt
! Read input file
irec = irec + 1
read(1,pos=irec) val(:,:)
write(*,*) val(:,:)
enddo
val(:,:) is nothing but a list of zeros. This is the first time I use access=stream which I believe is the only way I can use position='append'. I have tried compiling with gfortran and ifort but I do not get any error messages.
Does anyone have any idea why this is happening?
Firstly, I do not think you need to close and reopen your output file as you are doing. The status specifier is only relevant to the open statement in which it appears: replace will delete Newfile.bin if it exists at that time, before opening a new file with the same name. The status is implicitly changed to old, but this does not affect any operations done to the file.
However, since your Fortran code does not know you run it 12 times, you should have a way of making sure the file is only replaced the first time and opened as old afterwards; otherwise, Newfile.bin will only contain the information from the last file processed.
As for reading in the wrong values, this most likely occurs because of the difference between direct access (where you can choose a record length) and stream access (where you cannot). With stream access, data is stored as a sequence of "file storage units". Their size is in general compiler-dependent, but is available through the module iso_fortran_env as file_storage_size; it is usually 8 bits. This means that each entry will usually occupy multiple storage units, so you have to take care that a read or write with the pos = specifier does not access the wrong storage units.
Edit:
Some example code writing and reading with stream access:
program stream
use, intrinsic :: iso_fortran_env
implicit none
integer :: i, offset
real(real32), dimension(4,6) :: val, nval
open(unit=2, file='Newfile.bin', action='readwrite', form='unformatted', &
access='stream', status='replace')
do i = 1,2
call random_number(val)
write(2) val
enddo
! The file now contains two sequences of 24 reals, each element of which
! occupies the following number of storage units:
offset = storage_size(val) / file_storage_size
! Retrieve the second sequence and compare:
read(2, pos = 1 + offset*size(val)) nval
print*, all(nval == val)
close(2)
end program
The value T (true) should be printed to the screen.
Note also that it's not strictly necessary to specify a pos while writing your data to the file, because the file will automatically be positioned beyond the last record read or written.
That said, direct or stream access is most beneficial if you need to access the data in a non-sequential manner. If you only need to combine input files into one, it could be easier to write the output file with sequential access, for which you can also specify recl and position = 'append'.
You can check for the existence of a file in standard Fortran, by using the inquire statement:
logical :: exist
inquire(file="test.dat", exist=exist)
if (exist) then
print *, "File test.dat exists"
else
print *, "File test.dat does not exist"
end if
Alternatively you can have a look at the modFileSys library which provides libc like file manipulation routines.
As for appending and streams: Appending files is also possible when you use "classical" record based fortran files, you do not have to use streams for that.
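The two points above (check for existence with inquire, then append with record-based access) can be combined so that the output file is created on the first run and extended on later ones. A minimal sketch, assuming unformatted sequential access; the file name Newfile_demo.bin is made up, and the loop over run stands in for the repeated invocations from the shell script.

```fortran
program append_demo
   use, intrinsic :: iso_fortran_env, only: real32
   implicit none
   ! Hypothetical output name for this illustration.
   character(len=*), parameter :: fname = 'Newfile_demo.bin'
   real(real32) :: val(4,6), back(4,6), last
   integer :: unt, run, nrec, ios
   logical :: exists

   ! Start from a clean slate so the demo is repeatable.
   inquire(file=fname, exist=exists)
   if (exists) then
      open(newunit=unt, file=fname, status='old')
      close(unt, status='delete')
   end if

   ! Each pass stands in for one invocation from the shell loop.
   do run = 1, 3
      val = real(run, real32)
      inquire(file=fname, exist=exists)
      if (exists) then
         ! Later runs: keep the existing records and write after them.
         open(newunit=unt, file=fname, form='unformatted', &
              status='old', position='append')
      else
         ! First run: create the file.
         open(newunit=unt, file=fname, form='unformatted', status='new')
      end if
      write(unt) val
      close(unt)
   end do

   ! Read everything back sequentially and count the records.
   open(newunit=unt, file=fname, form='unformatted', action='read', &
        status='old')
   nrec = 0
   last = 0.0
   do
      read(unt, iostat=ios) back
      if (ios /= 0) exit
      nrec = nrec + 1
      last = back(1,1)
   end do
   close(unt, status='delete')
   print "(i0,a,f3.1)", nrec, " records read, last one filled with ", last
end program append_demo
```

The program should report 3 records, the last one filled with 3.0, which shows that nothing was overwritten between runs.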