Char array dimensioning confusion in Fortran 77 - fortran

I have the following piece of code in my subroutine:
character x*256 ,y*80
common /foo/ x ,y(999)
Well, I did not actually write this, so I don't understand the dimensions here. Is y a 999-element array of 80-character strings?
If so, how can I define this properly in Fortran 90, without the common block?

I will first say that the code you have is "proper" Fortran 90, but I agree with wanting to move away from common blocks.
There is, essentially, nothing specific to the character nature of the declaration. Whenever
<type> A
common /foo/ A(<size>)
is used there are two parts to the declaration of A, as well as the common association: the type and the dimension. Ignoring the association, declaration of the dimension in the common statement is allowed and the above is like
<type> A
dimension A(<size>)
This is in turn the same as
<type>, dimension(<size>) :: A
Coming to the specific example, the type is a character of length 80. Your non-common declaration would simply be
character(len=80), dimension(999) :: y
Indeed, then, y is a rank-1 array of size 999 of length-80 characters. y(10) is a scalar length-80 character (the 10th element of the array y).
x(10) isn't correct syntax, as the (10) is array indexing, and x is a scalar. For substrings a different indexing is required. x(10:10) is the 10th character of the character variable x; y(10)(10:10) is the 10th character of the 10th element of the character array y.

Related

Indexing of integer array through characters of a string

I was working on a dynamic programming problem: printing the distinct subsequences of a given string. While doing so I encountered something that was new to me. In that code, elements of an integer array (actually a vector of int) were accessed via the characters of a string. So I tried to do the same thing in new code. It gave me some output, but I didn't understand it.
I have tried this code on my PC but couldn't understand the output. I want to know the logic behind the output and want to know whether indexing is possible through characters of a string.
#include<bits/stdc++.h>
using namespace std;
int main(){
    string s;
    cin>>s;
    int* last = new int[1000];
    for(int i=0;i<s.length();i++){
        cout<<last[s[i]];
    }
}
When I input something, let's say "abcdefgh", it gives me "00000000".
Why, and what is this? I don't know what the expected output is.
Let me explain this to you.
But first, a recommendation. If you write real code, and not code for competitive programming, please never use
#include<bits/stdc++.h>
using namespace std;
So, now for a basic understanding. To your eyes, a string consists of characters or letters. The computer cannot store letters in its memory. It only knows bits and bytes, so, numbers.
There are codes that define which number is associated with which character. One of them is ASCII.
So, when the computer sends those numbers to an output or printing device, they are converted to readable letters.
In reality a string is an array of numbers. Just nicely wrapped for you. And, a string has an index operator []. If you say s[0], then it will give the first character to you, a number. You can check this by casting the character value to an integer. Simply try std::cout << static_cast<int>(s[0]);. And you will see a number.
Now you know that s[i] will give you a number. Then you have an array of int's: "last". And you use the index operator [k] to get the k'th element.
If you write last[s[i]], then the inner value is evaluated first: s[i]. Let us assume that this was the character 'A', which is equal to the number 65. This results in last[65], and you will read the 66th element of last.
That is already important to understand.
Now to the array last. (By the way, never use raw C-style "arrays" or "new"; a std::vector is the usual replacement.)
int* last = new int[1000];
"new" allocates a contiguous memory area of 1000 ints (4000 bytes on most current systems, where an int is 4 bytes) on the heap, so somewhere in memory. Where exactly is out of our control. Those values have indeterminate (random) content, because the memory area is not initialized, and reading them before writing them is formally undefined behaviour. (In my opinion this is wrong; we should initialize everything.)
In your case, there happen to be many 0's in it, but sometimes there will be other values.
And if you now enter the string "Hello", this is equivalent to the numbers 72, 101, 108, 108, 111. With that, you will display the integer values at indices 72, 101, 108, 108 and 111 of the integer array last.
Hope this makes things clear.

Assumed string length input into a Fortran function

I am writing the following simple routine:
program scratch
    character*4 :: word
    word = 'hell'
    print *, concat(word)
end program scratch
function concat(x)
    character*(*) x
    concat = x // 'plus stuff'
end function concat
The program should be taking the string 'hell' and concatenating to it the string 'plus stuff'. I would like the function to be able to take in any length string (I am planning to use the word 'heaven' as well) and concatenate to it the string 'plus stuff'.
Currently, when I run this on Visual Studio 2012 I get the following error:
Error 1 error #6303: The assignment operation or the binary
expression operation is invalid for the data types of the two
operands. D:\aboufira\Desktop\TEMP\Visual
Studio\test\logicalfunction\scratch.f90 9
This error is for the following line:
concat = x // 'plus stuff'
It is not apparent to me why the two operands are not compatible. I have set them both to be strings. Why will they not concatenate?
High Performance Mark's comment tells you about why the compiler complains: implicit typing.
The result of the function concat is implicitly typed because you haven't declared its type otherwise. Although x // 'plus stuff' is the correct way to concatenate character variables, you're attempting to assign that new character object to an (implicitly) real function result.
Which leads to the question: "just how do I declare the function result to be a character?". Answer: much as you would any other character variable:
character(len=length) concat
[note that I use character(len=...) rather than character*.... I'll come on to exactly why later, but I'll also point out that the form character*4 is obsolescent according to current Fortran, and may eventually be deleted entirely.]
The tricky part is: what is the length it should be declared as?
When declaring the length of a character function result which we don't know ahead of time there are two [1] approaches:
an automatic character object;
a deferred length character object.
In the case of this function, we know that the length of the result is 10 longer than the input. We can declare
character(len=LEN(x)+10) concat
To do this we cannot use the form character*(LEN(x)+10).
In a more general case, deferred length:
character(len=:), allocatable :: concat ! Deferred length, will be defined on allocation
where later
concat = x//'plus stuff' ! Using automatic allocation on intrinsic assignment
Using these forms adds the requirement that the function concat has an explicit interface in the main program. You'll find much about that in other questions and resources. Providing an explicit interface will also remove the problem that, in the main program, concat also implicitly has a real result.
To stress:
program
implicit none
character(len=[something]) concat
print *, concat('hell')
end program
will not work for concat having result of the "length unknown at compile time" forms. Ideally the function will be an internal one, or one accessed from a module.
[1] There is a third: assumed length function result. Anyone who wants to know about this could read this separate question. Everyone else should pretend this doesn't exist. Just like the writers of the Fortran standard.

Converting integer to character in Fortran90

I am trying to convert an integer to character in my program in Fortran 90.
Here is my code:
Write(Array(i,j),'(I5)') Myarray(i,j)
Array is an integer array and Myarray is a character array, and '(I5)', I don't know what it is, just worked for me before!
Error is:
"Unit has neither been opened not preconnected"
and sometimes
"Format/data mismatch"!
'(I5)' is the format specifier for the write statement: write the value as an integer with five characters in total.
Several things could go wrong:
Make sure that Myarray really is an integer (and not e.g. a real)
Make sure array is a character array with a length of at least five characters for each element
Take care of the array shapes
Ensure that i and j hold valid values
Here is a working example:
program test
  implicit none
  character(len=5) :: array(2,2)
  integer, parameter :: myArray(2,2) = reshape([1, 2, 3, 4], [2, 2])
  integer :: i, j
  do j = 1, size(myArray, 2)
    do i = 1, size(myArray, 1)
      write(array(i,j), '(I5)') myArray(i,j)
    enddo !i
  enddo !j
  print *, myArray(1,:)
  print *, myArray(2,:)
  print *, '--'
  print *, array(1,:)
  print *, array(2,:)
end program
Alexander Vogt explains the meaning of the (I5) part. That answer also points out some other issues and fixes the main problem. It doesn't quite explicitly state the solution, so I'll write that here.
You have two errors, but both have the same cause. I'll re-state your write statement explicitly stating something which is implicit.
Write(unit=Array(i,j),'(I5)') Myarray(i,j)
That implicit thing is unit=. You are, then, asking to write the character variable Myarray(i,j) to the file connected to unit given by the integer variable Array(i,j).
For some values of the unit integer the file is not pre-connected. You may want to read about that. When it isn't you get the first error:
Unit has neither been opened not preconnected
For some values of Array(i,j), say 5, 6 or some other value depending on the compiler, the unit would be pre-connected. Then that first error doesn't come about and you get to
Format/data mismatch
because you are trying to write out a character variable with an integer edit descriptor.
This answer, then, is a long way of saying that you want to do
Write(Myarray(i,j),'(I5)') array(i,j)
You want to write the integer value to a character variable.
Finally, note that if you made the same mistake with a real variable array instead of integer, you would have got a different error message. In one way you just got unlucky that your syntax was correct but the intention was wrong.

What does this Fortran code do?

I was given some Fortran code (90, I believe) and I'm trying to figure out what it does. I know no Fortran, but do know Perl.
Here is a snippet that I've not been able to figure out:
fmly='I:\CEX\Fmly'
fmlyfile=fmly(1:23)//yearqtr(qtrcnt)
open(unit=13,file=fmlyfile)
I know that // is a concatenation operator, but I'm confused about what the fmly(1:23) part is doing.
fmly(1:23) is slicing a character string fmly from position 1 to position 23. Note that in Fortran, string indexing begins from 1 and not from 0. fmly(1:23) is equivalent to fmly(:23).
string(A:B) is a substring, selecting characters A to B of string string. fmly is initialized with fewer than 23 characters, so the trailing characters will be blanks. After that it will be concatenated with an element of the string array yearqtr (or possibly a string-valued function yearqtr).

What does it mean to be "terminated by a zero"?

I am getting into C/C++ and a lot of terms are popping up unfamiliar to me. One of them is a variable or pointer that is terminated by a zero. What does it mean for a space in memory to be terminated by a zero?
Take the string Hi in ASCII. Its simplest representation in memory is two bytes:
0x48
0x69
But where does that piece of memory end? Unless you're also prepared to pass around the number of bytes in the string, you don't know - pieces of memory don't intrinsically have a length.
So C has a standard that strings end with a zero byte, also known as a NUL character:
0x48
0x69
0x00
The string is now unambiguously two characters long, because there are two characters before the NUL.
It's a reserved value to indicate the end of a sequence of (for example) characters in a string.
More correctly known as null (or NUL) terminated. This is because the value used is zero, rather than being the character code for '0'. To clarify the distinction check out a table of the ASCII character set.
This is necessary because languages like C have a char data type, but no string data type. Therefore it is left to the developer to decide how to manage strings in their application. The usual way of doing this is to have an array of chars with a null value used to terminate (i.e. signify the end of) the string.
Note that there is a distinction between the length of the string, and the length of the char array that was originally declared.
char name[50];
This declares an array of 50 characters. However, these values will be uninitialised. So if I want to store the string "Hello" (5 characters long) I really don't want to bother setting the remaining 45 characters to spaces (or some other value). Instead I store a NUL value after the last character in my string.
Other languages such as Pascal, Java and C# have a dedicated string type. These store the number of characters in the string alongside the data (a header value). This has a couple of benefits: firstly, you don't need to walk to the end of the string to find out its length; secondly, your string can contain null characters.
Wikipedia has further information in the String (computer science) entry.
Arrays and strings in C are handled through a pointer to a memory location. With the pointer you can find the start of the array, but the end of the array is not recorded anywhere. The end of a character array that holds a string is marked by a zero byte.
So, in memory string hello is written as:
68 65 6c 6c 6f 00 |hello|
It refers to how C strings are stored in memory. The NUL character, represented by \0 in string literals, is present at the end of a C string in memory. There is no other metadata associated with a C string, such as its length. Note the different spelling of the NUL character and the NULL pointer.
There are two common ways to handle arrays that can have varying-length contents (like Strings). The first is to separately keep the length of the data stored in the array. Languages like Fortran and Ada and C++'s std::string do this. The disadvantage to doing this is that you somehow have to pass that extra information to everything that is dealing with your array.
The other way is to reserve an extra non-data element at the end of the array to serve as a sentinel. For the sentinel you use a value that should never appear in the actual data. For strings, 0 (or "NUL") is a good choice, as that is unprintable and serves no other purpose in ASCII. So what C (and many languages derived from C) does is assume that all strings end with (or "are terminated by") a 0.
There are several drawbacks to this. For one thing, it is slow. Any time a routine needs to know the length of the string, it is an O(n) operation (searching through the entire string looking for the 0). Another problem is that you may one day want to put a 0 in your string for some reason, so now you need a whole second set of routines that ignore the null and take a separate length anyway (e.g. memcpy() and memchr() rather than strcpy() and strchr()). The third big problem is that if someone forgets to put that 0 at the end (or it gets wiped out somehow), the next string operation to do a length check will go merrily marching through memory until it either happens to find another 0, crashes, or the user loses patience and kills it. Such bugs can be a serious PITA to track down.
For all these reasons, the C approach is generally viewed with disfavor.
C-style strings are terminated by a NUL character ('\0'). This provides a marker for functions that operate on strings (e.g. strlen, strcpy) to use to identify the end of the string.
While the classic example of "terminated by a zero" is that of strings in C, the concept is more general. It can be applied to any list of things stored in an array, the size of which is not known explicitly.
The trick is simply to avoid passing around an array size by appending a sentinel value to the end of the array. Typically, some form of a zero is used, but it can be anything else (like a NAN if the array contains floating point values).
Here are three examples of this concept:
C strings, of course. A single zero character is appended to the string: "Hello" is encoded as 48 65 6c 6c 6f 00.
Arrays of pointers naturally allow zero termination, because the null pointer (the one that points to address zero) is defined to never point to a valid object. As such, you might find code like this:
Foo *list[] = { somePointer, anotherPointer, NULL };
bar(list);
instead of
Foo *list[] = { somePointer, anotherPointer };
bar(sizeof(list)/sizeof(*list), list);
This is why execvpe() only needs three arguments, two of which pass arrays of user-defined length. Since all that's passed to execvpe() are (possibly lots of) strings, this little function actually sports two levels of zero termination: null pointers terminating the string lists, and null characters terminating the strings themselves.
Even when the element type of the array is a more complex struct, it may still be zero terminated. In many cases, one of the struct members is defined to be the one that signals the end of the list. I have seen such function definitions, but I can't unearth a good example of this right now, sorry. Anyway, the calling code would look something like this:
Foo list[] = {
    { someValue, somePointer },
    { anotherValue, anotherPointer },
    { 0, NULL }
};
bar(list);
or even
Foo list[] = {
    { someValue, somePointer },
    { anotherValue, anotherPointer },
    {} // An empty initializer list zeroes the object (valid in C++ and C23; older C needs { 0 }).
};
bar(list);