We are having to convert some old Fortran 77 code to VB.net. With none of us knowing any Fortran, we have made significant progress.
However, we have come across the following write statement which has a couple of nested implied do loops. We are familiar with implied do loops but do not know what the significance of the colon in MN:MN is. We've only ever seen implied do loops using commas such as the latter one in this statement (NREC,MN).
Logical*1 DECLN(492)
WRITE(6,9238)NPERMN(NREC),CUSIPS(NREC),TICKRS(NREC),NAMES(NREC),(DECLN(MN:MN),MN=1,30),(SCORES(NREC,MN),MN=1,30)
9238 format(I7, 1X, A8, 1X, A8, 1X, A20, 1X, 12A1, 1X, 12A1, 1X, 6A1/(12F10.5))
DECLN(MN:MN)
looks like a 1-character extract from a character variable called DECLN. The expression
(DECLN(MN:MN),MN=1,30)
(which is an io-implied-do expression) causes the program to write the first 30 characters of DECLN as 30 separate characters. The form
(DECLN(1:30))
writes the same characters as a single 30-character substring.
It might be that DECLN(MN:MN) is a 1-element section of the rank-1 array DECLN, in which case it's an odd way to write DECLN(MN).
DECLN(MN:MN) is used to extract a single character from a string.
As noted in the update, DECLN is not character; it is a 1-byte LOGICAL array (containing info about securities). (DECLN(MN:MN),MN=1,30) identifies the first 30 elements, but I think it could more easily be written (DECLN(MN),MN=1,30). The CUSIPs (which are 9-character codes) are being written with A8, which truncates the check digit (a typical enough procedure, which I'm sure Brian knows, because most clearing bodies ignore it).
A line of code like this:
write(*,*) a,b
will produce as output a and b separated by one tab. How can I write output with a and b separated by two tabs?
Your line will typically not produce any tab character, but some number of blank (space) characters. If you want two tab characters, you have to use their ASCII code 9.
write(*,'(4g0)') a, achar(9), achar(9), b
Note the explicit format '(4g0)' used to avoid unwanted blank characters. With the g0 descriptor it will work for any type of a and b.
The <tab>-character is not a part of the Fortran character-set. So when you add it to your source code, most compilers should
complain [Cfr. Section 3 Fortran 2008 Standard].
If you want to add it to your output, you have to create a
character of the requested kind that represents that particular
character. To do this, you make use of ACHAR(I [, KIND]) which
converts the ASCII code I into that particular character of kind
KIND or the default kind if KIND is not specified. For the
<tab>-character this would read:
ACHAR(9)
Another way, but less preferred, would be to make use of the
ISO_C_BINDING module, which defines the constant
C_HORIZONTAL_TAB. This represents \t, a character of the
C character kind C_CHAR. If C_CHAR=-1, the constant has the value
ACHAR(9) [Cfr. Section 15.2 Fortran 2008 Standard].
See Vladimir's answer for the howto.
I'm attempting to modernize an old code (or at least make it a bit more understandable) but I've run into an odd format for a, uh, FORMAT statement.
Specifically, it's a FORMAT statement with Hollerith constants in it (the nH where n is a number):
FORMAT(15H  ((C(I,J),J=1,I3,12H),(D(J),J=1,I3, 6H),I=1,I3,') te'
1,'xt' )
This messes with the syntax highlighting as it appears this has unclosed parenthesis. It compiles fine with this format statement as is, but closing the parenthesis causes a compiling error (using either the intel or gfortran compiler).
As I understand it, Hollerith constants were a creature of Fortran 66 and were replaced with the advent of the CHARACTER type in Fortran 77. I generally understand them when used as something like a character, but their use in a FORMAT confuses me.
Further, if I change 15H  ((... to 15H ((... (i.e. I remove one space) it won't compile. In fact, it won't compile even if I change the code to this:
FORMAT(15H  ((C(I,J),J=1,I3,12H),(D(J),J=1,I3, 6H),I=1,I3,') text' )
I would like this to instead be in a more normal (F77+) format. Any help is appreciated.
What you have are actually Hollerith edit descriptors, not constants (which would occur in a DATA or CALL statement), although they use the same syntax. F77 replaced Hollerith constants outright; it added the char-literal edit descriptor as a (much!) better alternative, but the H edit descriptor remained in the standard until F95 (and even then some compilers still accepted it as a compatibility feature).
In any case, the number before the H takes that number of characters after the H, without any other delimiter; that's why deleting (or adding) a character after the H screws it up. Parsing your format breaks it into these pieces:
15H  ((C(I,J),J=1,
I3,
12H),(D(J),J=1,
I3,
6H),I=1,
I3,
') te'
'xt'
and thus a modern equivalent (with optional spaces for clarity) is
nn FORMAT( '  ((C(I,J),J=1,', I3, '),(D(J),J=1,', I3, '),I=1,', I3
1,') text' )
or if you prefer you can put that text after continuation (including the parens) in a CHARACTER value, variable or parameter, used in the I/O statement instead of a FORMAT label, but since you must double all the quote characters to get them in a CHARACTER value that's less convenient.
Your all-on-one-line version probably didn't compile because you were using fixed-form source, perhaps by default. In fixed form only the first 72 characters of each source line are accepted, and the first 6 of those are reserved for the statement number and continuation indicator, which leaves only 66; that statement is 71 by my count. Practically any compiler you will find today also accepts free-form, which allows longer lines and has other advantages for new code, but may require changes to existing code, sometimes extensive ones.
Ok, so here's the deal. This is a project for school, and we can't use #include <string>.
Basically, for any strings we'll be dealing with, we have to use cstrings, i.e. char arrays that end with a null terminator. Basically the same thing, right? Well, I'm having a little bit of trouble. I have to read in a first name, a last name, a student id, and a minimum of 5 but a maximum of 6 grades from an input file. An example of the input is below, but there is a catch: there can be an arbitrary number of spaces between each of those details, with the maximum length of a line being 250.
Adam Zeller 452231 78 86 91 64 90 76
Barbara Young 274253 88 77 91 66 82
Carl Wilson 112231 87 77 76 78 77 82
Notice how there are random amounts of spaces in between the details. Basically, I need to get the names (both first name and last name can vary in length), read the student id into an int, and then read all the rest of their grades (preferably into an int array). Also, they can have either 5 or 6 grades, and the program should be able to handle either. How in the world do I go about sorting this data? I thought maybe I could getline() the whole line into a cstring char array and then separate each bit into its own array, but I just don't know how to go about this. I definitely don't want anyone to give me any code, but maybe point me in the right direction. I need to split a line of data into different variables, while also accounting for either 5 or 6 grades without affecting the data, and the list could be up to 60 lines long (meaning up to 60 students on it, but no more than that). This is only a portion of the project, but it seems to be the one part I can't get past. Again, I don't want any code or direct answers, just a pointer towards a way I could go about this. Thanks so much!
I'm not going to post any code (as requested), but consider filtering each line through strtok. strtok splits the string into tokens, where they can be arranged or stored however you like. See here for more: http://www.cplusplus.com/reference/cstring/strtok/
I guess you may follow the below steps:
read a line
get the words from the line
first two words are first and last names
rest of the words are numbers, use atoi to get the numbers from the strings
continue the above till EOF
How do you get the words? Maybe the "isspace" C library function will help.
Technically, this is a trick project/question.
The hardest part of parsing the lines in the file is as you said: dealing with the random amount of spaces.
Since you requested no code, I'll give you a few hints based on what you suggested:
You know that the size of each line is a maximum of 250 characters (including the newline character?) - this is the length of each line, and as such, since it occurs with such regularity, you can read this many characters at a time with regular file functions, or, using fstreams (as deduced from your tag).
The only issue, really, is storing these tokens. If you know ahead of time how many you will store, you can define an array of c-strings (as you can't use <string>) sized for the maximum number that you think will occur. However, as that is both unreliable and a bit inefficient (it wastes memory if you reserve room for too many lines), you can make it dynamic. In that regard you have the option of using a C++ container like <vector>, for ease of access and storage.
After that, figuring out the values of the data on each line is relatively easy:
First, look at your data, what do you observe?
Each individual piece of data (a token in parsing nomenclature) is delimited by at least one space character.
Also, the names are the first two tokens of any line, and do not contain digits in them.
Hence anything that is neither a space nor a digit belongs to a string, in this case part of a name.
All you have to do is then iterate over the container/data structure you've used to store the c-strings and parse them using the criteria described above.
Is it possible to read a line with numerous numbers (integers) using Fortran?
Let's say I have a file with only one line
1 2 3
The following program reads 3 integers from a line
program reading
implicit none
integer:: dump1,dump2,dump3
read(21,*) dump1,dump2,dump3
end
so dump1=1 dump2=2 dump3=3
If I have a file with only one line but with numerous integers like
1 2 3 4 5 6 7 8 ... 10000
is it possible for the above program to work without defining 10000 variables?
EDIT The first paragraph of this answer might seem rather strange as OP has modified the question.
Your use of the term string initially confused me, and I suspect that it may have confused you too. It's not incorrect to think of any characters in a file, or typed at a command-line, as a string, but when all those characters are digits (interspersed with spaces) it is more useful to think of them as integers. The Fortran run-time system will take care of translating a string of digit characters into an integer.
In that light I think your question might be better expressed as "How to read a list of integers from an input line?" Here's one way:
Define an array. Here I define an array of fixed size:
integer, dimension(10**4) :: dump
(I often use expressions such as 10**4 to avoid having to count 0s carefully). This step, defining an array to capture all the values, seems to be the one you are missing.
To read those values from the terminal, at run-time, you might write
write(*,*) 'Enter ', 10**4, 'numbers now'
read(*,*) dump
and this will set dump(1) to the first number you type, dump(2) to the second, all the way to the 10**4-th. Needless to say, typing that number of numbers at the terminal is not recommended and a better approach would be to read them from a file. Which takes you back to your
read(21,*) dump
It wouldn't surprise me to find that your system imposes some limit on the length of a single line, so you might have to be more sophisticated when trying to read as many as 10**4 integers, such as reading them in lines of 100 at a time. That's easy:
read(*,*) dump(1:100)
will read 100 integers into the first 100 elements of the array. Write a loop to read 100 lines of 100 integers each.
I've got a string value of the form 10123X123456 where 10 is the year, 123 is the day number within the year, and the rest is unique system-generated stuff. Under certain circumstances, I need to add 400 to the day number, so that the number above, for example, would become 10523X123456.
My first idea was to substring those three characters, convert them to an integer, add 400 to it, convert them back to a string and then call replace on the original string. That works.
But then it occurred to me that the only character I actually need to change is the third one, and that the original value would always be 0-3, so there would never be any "carrying" problems. It further occurred to me that the ASCII code points for the numbers are consecutive, so adding the number 4 to the character "0", for example, would result in "4", and so forth. So that's what I ended up doing.
My question is, is there any reason that won't always work? I generally avoid "ASCII arithmetic" on the grounds that it's not cross-platform or internationalization friendly. But it seems reasonable to assume that the code points for numbers will always be sequential, i.e., "4" will always be 1 more than "3". Anybody see any problem with this reasoning?
Here's the code.
string input = "10123X123456";
input[2] += 4;
//Output should be 10523X123456
From the C++ standard, section 2.2.3:
In both the source and execution basic character sets, the value of each character after 0 in the
above list of decimal digits shall be one greater than the value of the previous.
So yes, if you're guaranteed to never need a carry, you're good to go.
The C++ language definition requires that the code-point values of the numerals be consecutive. Therefore, ASCII arithmetic on digits is perfectly acceptable.
Always keep in mind that if this is generated by something that you do not entirely control (such as users and third-party system), that something can and will go wrong with it. (Check out Murphy's laws)
So I think you should at least add some validation before doing so.
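For what it's worth, here is a minimal sketch of what such validation might look like. The function name addFourHundredDays and the exact checks are my own assumptions, not something from the question; it only assumes the layout described there (2-digit year followed by a 3-digit day number). It refuses to touch the string unless the first five characters are digits and the hundreds digit of the day number is 0-3, so no carry can occur.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the question): adds 400 to the day number
// by bumping the third character, but only if the input looks as expected.
// Returns false and leaves the string untouched otherwise.
bool addFourHundredDays(std::string& id)
{
    // Need at least the 2-digit year plus the 3-digit day number.
    if (id.size() < 5) return false;

    // The five leading characters must all be decimal digits.
    for (std::size_t i = 0; i < 5; ++i) {
        if (id[i] < '0' || id[i] > '9') return false;
    }

    // The hundreds digit of a day number (001-366) is 0..3; anything
    // larger means the adjustment was already applied or the data is bad.
    if (id[2] > '3') return false;

    id[2] += 4;  // no carry possible; '0'..'9' are guaranteed consecutive
    return true;
}
```

Called on "10123X123456" this returns true and leaves "10523X123456" in the string; calling it a second time on the result returns false and changes nothing.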
It sounds like altering the string as you describe is easier than parsing the number out in the first place. So if your algorithm works (and it certainly does what you describe), I wouldn't consider it premature optimization.
Of course, after you add 400, it's no longer a day number, so you couldn't apply this process recursively.
And, <obligatory Year 2100 warning>.
A very long time ago I saw some x86 processor instructions for ASCII and BCD arithmetic.
Those are AAA (ASCII Adjust after Addition), AAS (subtraction), AAM (multiplication), AAD (division).
But even if you are not sure about the target platform, you can refer to the specification of the character set you are using, and I guess you'll find that the first 128 characters (the ASCII range) have the same meaning in most character sets (in Unicode they form the first block).