Avoiding the GOTO paradigm in Fortran FORMAT when reading/writing - fortran

At university, the professor teaching us Fortran gave us the following code:
program example
integer year, month, day, inst, kind, ozone
real time
open(unit=1,file='C:\261.dat')
read(1,1000) year, month, day, inst, kind, ozone, time
close(1)
1000 format(i4,1x,i2,1x,i2,1x,i1,1x,i1,1x,i3,1x,f8.3)
end
In this code, the line labelled 1000 specifies the input format. Isn't that something like GOTO logic? If so, what is the most appropriate way to avoid it in Fortran?

In a formatted data transfer statement there are three ways to specify the format to use (Fortran 2018 R1215):
referencing a labelled FORMAT statement
using a (default) character expression
with a * (for list-directed formatting)
For example (using a PRINT statement for clarity):
1000 FORMAT (I0)
print 1000, 1 ! Pointing to a FORMAT statement
print '(I0)', 1 ! Literal constant: one form of a character expression
print *, 1 ! List-directed formatting
end
In none of these is the format specification functionally like a GO TO statement.
A GO TO statement changes the flow of execution whereas in the format specification execution remains at the data transfer statement, and then continues to the next statement.
Specifying a label for the format doesn't transfer execution control to that statement; it simply says "use the format given by the statement 1000". This is conceptually like how
character(*), parameter :: CHAR_FMT='(I0)'
print CHAR_FMT, 1
end
says "use the character object CHAR_FMT (which is declared/defined elsewhere)" as the format.
You'll find many objections to FORMAT statements and (reasonable) suggestions for alternatives, but no objection to using a FORMAT statement is based on "like a GO TO". (And, of course, GO TO statements are not inherently evil.)
Format specifications can be contrasted with the err= and end= and eor= specifiers: these functionally are like GO TO statements:
1 read(unit, fmt, err=10, end=20, eor=20) x
...
! COME FROM 1
20 continue
...
return
! COME FROM 1
10 ERROR STOP "Error in the read"
Such jump-like flow can be alternatively managed with IOSTAT control:
read(unit, fmt, iostat=iostat) x
if (iostat...) ...
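To make the IOSTAT alternative concrete, here is a minimal sketch. It assumes the data come from an internal file (a character variable) so that it runs without any external file; the names records, ios and year are made up for the illustration.

```fortran
program iostat_demo
  implicit none
  ! Hypothetical records: the first is a valid integer, the second is not.
  character(len=4) :: records(2) = ['2001', 'abcd']
  integer :: ios, year, k

  do k = 1, 2
    ! No err=/end= labels: the outcome is reported through ios instead.
    read(records(k), '(i4)', iostat=ios) year
    if (ios == 0) then
      print '(a,i0)', 'read year = ', year
    else if (is_iostat_end(ios)) then
      print '(a)', 'end of file reached'
    else
      print '(a,i0)', 'read failed, iostat = ', ios
    end if
  end do
end program iostat_demo
```

Execution always continues at the statement after the read, and the branching is ordinary structured code rather than a jump to a label.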

You can avoid using format labels by instead using format strings, as
program example
integer year, month, day, inst, kind, ozone
real time
open(unit=1,file='C:\261.dat')
read(1,'(i4,1x,i2,1x,i2,1x,i1,1x,i1,1x,i3,1x,f8.3)') year, month, day, inst, kind, ozone, time
close(1)
end
You can also treat the format string as a regular character variable, as
program example
integer year, month, day, inst, kind, ozone
real time
character(42) :: format
open(unit=1,file='C:\261.dat')
format = '(i4,1x,i2,1x,i2,1x,i1,1x,i1,1x,i3,1x,f8.3)'
read(1,format) year, month, day, inst, kind, ozone, time
close(1)
end
This allows you to pass formats around your code just like any other variable, and also allows you to generate and modify formats at runtime.
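As a rough illustration of generating a format at runtime, the sketch below builds the repeat count into the format string with an internal write. The names fmt, ncols and row are invented for the example, and the f5.1 field width is an arbitrary choice.

```fortran
program runtime_format
  implicit none
  character(len=32) :: fmt
  integer :: ncols
  real :: row(3) = [1.1, 1.2, 1.3]

  ncols = size(row)
  ! Internal write: fmt becomes "(3(f5.1,1x))" for a three-element row.
  write(fmt, '(a,i0,a)') '(', ncols, '(f5.1,1x))'
  print fmt, row
end program runtime_format
```

Trailing blanks in the character variable are harmless, because blanks after the closing parenthesis of a format are ignored.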

Related

Variable format statement when porting from Intel to GNU gfortran

Suppose I'm trying to write out a CSV file header that looks like this:
STRING1 2001, 2002, 2003, 2004,
And some variable-format Fortran90 code that does this is
INTEGER X,Y
X=2001
Y=2004
WRITE(6,'(A,(999(5X,I4,",")))') ' STRING1',(y,y=X,Y)
The "999" repeats the (5X,I4,",") format structure as many times as it needs to (up to 999 times, at least) to complete. Assume X and Y are subject to change and therefore the number of loop iterations may also change.
But if I want the header to look like this, with an additional string on the end of the sequence, like
STRING1 2001, 2002, 2003, 2004, STRING2
...I have tried adding another A toward the end of the format string, but the repeated group apparently doesn't know that it needs to "escape" once the integers are exhausted, so it errors out.
I can work around this by adding ADVANCE="no" to the WRITE statement and printing the second string with a second WRITE statement, which gets me what I fundamentally want, but is there a way to do it all with a single format?
[NOTE: no angle-bracket answers please; this is for GNU gfortran, which doesn't support that extension]
C'mon folks, get with the program!
This is standard Fortran 2008:
WRITE(6,'(A,*(5X,G0,:,","))') ' STRING1',(y,y=X,Y), ' STRING2'
I am fairly sure that gfortran supports the "indefinite group repeat count". G format was extended in Fortran 2008 to support any intrinsic data type, and a width of zero means "minimum number of characters." The colon is an F77 feature that stops the trailing comma from being emitted.
With this, ifort gives me:
STRING1 2001, 2002, 2003, 2004, STRING2
FWIW, I am not happy with your reuse of y as the loop control variable, since this is NOT a statement entity and will get set to 2005 at the end of the loop. Use a separate variable, please!
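For what it's worth, a version with a separate loop variable might look like the sketch below. The names first_year, last_year and yr are mine, not the asker's, and it assumes a compiler with Fortran 2008 support for the unlimited repeat count and G0.

```fortran
program csv_header
  implicit none
  integer :: first_year, last_year, yr

  first_year = 2001
  last_year = 2004
  ! yr is only the implied-do index, so the bounds are left untouched.
  write(*, '(A,*(5X,G0,:,","))') ' STRING1', (yr, yr = first_year, last_year), ' STRING2'
end program csv_header
```

After the write, last_year still holds 2004; only yr has been advanced past the end of the range.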
program test
character(len=20) :: N_number
integer :: X,Y
X=2001
Y=2004
write(N_number,*) Y-X+1
write(6,'(A,('//TRIM(N_number)//'(5X,I4,","))A)') ' STRING1',(y,y=X,Y),' STRING2'
end program test
It's a shame that the variable-format extension isn't standard. Since it isn't, most people recommend the approach shown by #anonymous. That is, instead of using <N>, you first convert the integer into a string using an internal-write statement. This string representation of N is then inserted within the format expression to be used in the write or print statements. [1]
Alternatively, you could convert the numerical values from the array into a string. [2] It's also pretty straightforward. In the example below, I've shown both of these approaches.
program writeheader
implicit none
character(len=80) :: string1, string2, string3, fmt, num
integer, dimension(10) :: array
integer :: x,y,len
continue
string1 = "begin"
string3 = "end"
array = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] ! [1:10] is a non-standard extension
x = 3
y = 7
!! Method 1: Convert the variable number of values into a string, then use it
!! to create the format expression needed to write the line.
write(num, "(i0)") y - x + 1 ! a bare "(i)" without a width is a non-standard extension
fmt = "(a,', ',(" // trim(adjustl(num)) // "(i0:', ')), a)"
print fmt, trim(string1), array(x:y), trim(string3)
!! Method 2: Convert the desired range of array values into a character string.
!! Then concat, and write the entire line as a string.
write(string2, "(*(', ',i0))" ) array(x:y)
len = len_trim(string2) + 1
print "(a)", trim(string1) // string2(1:len) // trim(string3)
end program writeheader
In either case shown in the example, the output looks like: begin, 3, 4, 5, 6, 7, end
[1] If I can find it, I'll add a link to a nice solution here on SO that created a function to generate the format expression.
[2] I've used the array bounds directly here, as an alternative to implied do-loops.

Write to file using an implicit do loop

I need help with an implied do loop in Fortran.
This is my simple code:
Program Simple
Implicit none
Integer::i,j
Integer,parameter::N=2,M=3
Real,dimension(N,M)::Pot
Open(1,File='First.txt',Status='old')
Read(1,'(M(f3.1,1x))') ((Pot(i,j),j=1,M),i=1,N)
Close(1)
Open(2,File='Second.txt',Status='Unknown')
Write(2,'(M(i0,1x,i0,1x,f3.1,1x))') ((i,j,Pot(i,j),j=1,M),i=1,N)
Close(2)
Stop
End program Simple
This is the file First.txt:
1.1 1.2 1.3
2.1 2.2 2.3
When I try to execute this program I get this message:
Unexpected element 'N' in format string
Unexpected element 'M' in format string
I want to keep the name of integer variables N and M in write statement.
Is there any way to also keep their values from declaration part?
You are using M and N in the string (as characters), not as variables. In order to use the variables you need to write their values into the format string:
character(len=128) :: fmtString
!...
write(fmtString,*) M
fmtString = '('//trim(adjustl(fmtString))//'(f3.1,1x))'
Read(1,fmtString) ((Pot(i,j),j=1,M),i=1,N)
And similarly for the write statement.
However, you can probably use list-directed input (Read(1,*)) for the input, and let Fortran figure out the exact format.
Instead of this string manipulation, you can use (*(f3.1,1x)) with modern compilers, or, if you have an old one, just specify a very large repeat count, e.g. (99999(f3.1,1x)). In both cases, the correct number of values will be transferred. However, on output this will write all M*N values on a single line [thanks #agentp for pointing this out].
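One way to keep one row per line while still using the unlimited repeat count is to issue one WRITE per row. This is only a sketch: it reuses the names N, M and Pot from the question, but hard-codes the values instead of reading First.txt and writes to standard output instead of Second.txt.

```fortran
program rows_per_line
  implicit none
  integer, parameter :: n = 2, m = 3
  real :: pot(n, m)
  integer :: i, j

  ! Column-major fill, so pot(1,:) = 1.1 1.2 1.3 and pot(2,:) = 2.1 2.2 2.3.
  pot = reshape([1.1, 2.1, 1.2, 2.2, 1.3, 2.3], [n, m])

  ! Each WRITE statement starts a new record, so each row gets its own line.
  do i = 1, n
    write(*, '(*(i0,1x,i0,1x,f3.1,1x))') (i, j, pot(i, j), j = 1, m)
  end do
end program rows_per_line
```

The format no longer needs to know M at all; the length of the output list decides how many groups are used.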

Destring a time variable using Stata

How to destring a time variable (7:00) using Stata?
I have tried destring: however, the : prevents the destring. I then tried destring, ignore(:) but was unable to then make a double and/or format %tc. encode does not work; recast does not do the job.
I also have a separate string date that I was able to destring and convert to a double.
Am I missing that I could be combining these two string variables (one date, one time) into a date/time variable or is it correct to destring them individually and then combine them into a date/time variable?
Short answer
To give the bottom line first: two string variables that hold date and time information can be converted to a single numeric date-time variable using some operation like
generate double datetime = clock(date + time, "DMY hm")
format datetime %tc
except that the exact details will depend on exactly how your dates are held.
For understanding dates and times in Stata there is no substitute for
help dates and times
Everything else tried is likely to be wrong or irrelevant or both, as your experience shows.
Longer answer, addressing misconceptions
destring, encode and recast are all (almost always) completely wrong in Stata for converting string dates and/or times to numeric dates and/or times. (I can think of one exception: if somehow a date in years had been imported as string with values "1960", "1961", etc. then destring would be quite all right.)
In reverse order,
recast is not for any kind of numeric to string or string to numeric conversion. It only recasts among numeric or among string types.
encode is essentially for mapping obvious strings to numeric and (unless you specify otherwise) will produce integer values 1, 2, 3, and so forth which will be quite wrong for times or dates in general.
destring as you applied it implies that the string times "7:00", "7:59", "8:00" should be numeric, except that someone stupidly added irrelevant punctuation. But if you strip the colons :, you get times 700, 759, 800, etc. which will not match the standard properties of times. For example, the difference between "8:00" and "7:59" is clearly one minute, but removing the informative punctuation would just yield numbers 800 and 759, which differ by 41, which makes no sense.
For a pure time, you can set up your own system, or use Stata's date-time functions.
For a time between "00:00" and "23:59" you can use Stata's date-times:
. di %tc clock("7:00", "hm")
01jan1960 07:00:00
. di %tc_HH:MM clock("7:00", "hm")
07:00
With variables you would need to generate a new variable and make sure that it is created as double.
A pure time less than 24 hours is (notionally) a time on 1 January 1960, but you can ignore that. But you need to hold in mind (constantly!) that the underlying numeric units are milliseconds. Only the format gives you a time in conventional terms.
If you have times of more than 24 hours, treating them as date-times in this way is probably not a good idea.
Your own system could just be to convert string times in the form "hh:mm" to minutes and do calculations in those terms. For times held as variables, the easiest way forward would be to use split, destring to produce numeric variables holding hours and minutes and then use 60 * hours + minutes.
However, despite your title, the real problem here seems to be dealing jointly with date and time information, not just time information, so at this point, you might like to read the short answer again.

How does the reverse function in SAS work?

I have a time data field, say, 10/1/2014.
I want to extract the month and the year information dynamically in SAS, given any date.
I wrote the following code in SAS to extract the month info:
month = substr(time_field, 1, index(time_field, '/')-1);
This worked fine.
I wrote the following snippet to extract the year info:
year = substr(reverse(time_field), 1, 4);
This doesn't work; it throws a blank. Have I missed something? Please help.
SAS will return the year for you. No need to write any custom function for this purpose. Look:
data _null_;
length year 4.;
year=year(today());
put "we are on the year of " year;
run;
Your variable has trailing spaces most likely. So when you reverse it, the trailing spaces become leading spaces and then you take the first four characters which are blanks.
You can verify this by running the reverse function alone on the variable and see the results.
Try adding the compress function.
year = reverse(substr(reverse(compress(time_field)), 1, 4));
Though this may solve your problem, you should really convert your date to a SAS date and then use the Month/Day/Year functions.
data have;
length time_field $20.;
time_field="10/1/2014";
year_bad = substr(reverse(time_field),1, 4);
year_good = reverse(substr(reverse(compress(time_field)),1, 4));
year_better = year(input(time_field, mmddyy10.));
put "year_bad:" year_bad;
put "year_good:" year_good;
put "year_better:" year_better;
run;
Your data is either a date in a character field, or it is a numeric value formatted as a date. While you can use text expressions on numerics, you shouldn't; you should explicitly convert them.
When you don't, then you end up with things like this - ie, improper lengths of fields, because the automatic conversion is very loose. It tends to allow a huge amount of extra space where it's not required to.
If your data is numeric, use MONTH() or YEAR() and be done with it; there's no reason to play in text here. Look at the field in the data explorer; it will tell you if it's numeric or not. (Numeric with a format can still look like text, so actually look at it!)
If your data is text, then you have some better options than REVERSE.
First is SCAN. SCAN splits by word, similar to many other languages; often strsplit (R) or similar.
month=scan(mdy_var,1,'/');
day =scan(mdy_var,2,'/');
year =scan(mdy_var,3,'/');
Second, you could still use SUBSTR, along with LENGTH.
year = substr(mdy_var,length(mdy_var)-3,4);
LENGTH tells you how long the string really is (minus trailing spaces), so '10/1/2014' is 9 long; 6th character (9-3) is the 2, and then 4 characters after that [which should be unnecessary]. This method wouldn't really work with Day, of course, only with year (and only with 4 digit year). Scan is better really, but this is a good example of how this works.
Going along the same lines, you can use FIND and look backwards, also, using a negative start position.
year = substr(mdy_var,find(mdy_var,'/',-99)+1,4);
That starts the search at the 99th character (which is realistically your maximum, right?), goes left, and returns the position of the first '/' it finds.

Save data in another external file name output.txt?

The program runs, but I am not sure how to use open() to save the data in an external file named output.txt. My questions are marked in the code below; please have a look and help.
program start
implicit none
integer ::n
real(kind=8)::x,h,k
real(kind=8),external:: taylorq
x=1.0
n=20
h=exp(x)
k=taylorq(x,n)
open(10,file='output.txt') ! question 1: where should I put this open statement?
write(*,*)"The exact value=",h
write(*,*)"The approximate value=",k
write(*,*)"The error=",h-k
end program start
function taylorq(x,n)
implicit none
integer::n,i
real(kind=8):: x,taylor,taylor2,taylorq,h
h=exp(x)
taylor=1.
taylor2=taylor
write(*,*)"i exact appro error" ! question 2: I actually want a table with the column headings i, exact, appro, error; is there a nice way to lay them out, e.g. something like %5s?
do i=1,n
taylor=taylor*x/i
taylor2=taylor2+taylor
write(10,*)i,h,taylor2,taylor2-h ! question 3: I want to save the data written here to the file output.txt
end do
close(10)
taylorq=taylor2
end function taylorq
1. where to open
You should put open(10,...) so it executes before any write(10,...) -- or read(10,...) if this was input.
Since your writes occur in the function taylorq, you should open() before the statement that calls taylorq.
For programs that do very large computations, which Fortran is suited to and famous for, it is often best to do
all file opens very near the beginning of the program, so that if there is a problem opening any file,
it is caught and fixed without wasting hours or days of work. But your program is much simpler than that.
2. formatting
Yes, Fortran can do formatted output -- and also formatted input. Instead of a text string with
interpolated specifiers (like C and the C part of C++, and Java, and awk and perl and shell) it uses specifiers
with optionally interpolated text values, and the specifiers are written with the format letter on
the left followed by the width (almost always) and other parameters (sometimes).
You can either put the format directly in the WRITE (or READ) statement, or in a separate FORMAT
statement referred to by its label in the I/O statement.
write (10, '(I4,F10.2,F10.2,F10.2)' ) i,h,taylor2,taylor2-h
or
write (10, 900) i,h,taylor2,taylor2-h
! this next line can be anywhere in the same program-unit
900 format (I4,F10.2,F10.2,F10.2)
Unlike C-family languages, Fortran will always output the specified width; if the value doesn't fit,
it prints asterisks ***** instead of forcing the field wider (and thus misaligned) (or truncating as
COBOL does!). Your series grows fast enough you might want to use scientific notation like E10.3.
(The format letters can be in either case, but I find them easier to read in upper. YMMV.)
There are many, MANY, more options. Any textbook or your compiler manual should cover this.
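Putting both points together, a minimal sketch of the whole thing might look like this. It folds the Taylor sum into the main program for brevity, and the names dp, u, term and approx, the column widths, and the use of newunit= (Fortran 2008) are my choices rather than anything from the question.

```fortran
program taylor_table
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer :: i, n, u
  real(dp) :: x, exact, term, approx

  x = 1.0_dp
  n = 20
  exact = exp(x)

  ! Open before the first write to the unit; newunit= picks a free unit number.
  open(newunit=u, file='output.txt', status='replace', action='write')

  ! Header: A edit descriptors with the same widths as the data columns below,
  ! so the titles are right-justified over their columns.
  write(u, '(A4,3A16)') 'i', 'exact', 'appro', 'error'

  term = 1.0_dp
  approx = term
  do i = 1, n
    term = term * x / i          ! x**i / i!
    approx = approx + term       ! partial sum of the Taylor series of exp(x)
    write(u, '(I4,2F16.10,ES16.3)') i, exact, approx, approx - exact
  end do

  close(u)
end program taylor_table
```

The error column uses ES16.3 because the error shrinks quickly and a fixed F field would soon show only zeros.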