I have a question about mixing C strings and Fortran strings in one file.
Suppose I am working with a name string of fixed length 9. I define a length macro like
#define NAME_LEN 9
in a .c file.
There is an existing Fortran function, let's name it fortran_function(char* name)
Now I have to call this Fortran function from a C function, let's name it
c_function(char name[]) {
fortran_function(name);
}
Now the problem is, how should I declare the c_function signature?
c_function(char name[])
c_function(char name[NAME_LEN +1])
or
c_function(char name[NAME_LEN])
In what situations should I use 9 as the name length, and in what situations 10?
My understanding is that, as long as you pass a null-terminated string with 9 characters to c_function, all the declarations are correct. Is that right?
Are there any other concerns I should be aware of? Any potential bugs?
There's one more gotcha here, if I remember correctly. Fortran does not use null-terminated strings; instead, it pads the right end of the buffer with 0x20 (space). So, if you have access to the Fortran source, I would modify the function signature to take the length of the passed-in string as an argument. Otherwise, you will probably crash the Fortran side of the code.
Dave is correct, the standard Fortran concept for strings is fixed-length, padded with blanks on the right. (Fortran now also has variable length strings, but these are not yet common and would be very tricky to inter-operate with C.) If you want the lengths fixed, then have the same parameter NAME_LEN in your Fortran code, with the same value. Dave's suggestion of an additional length argument is probably better.
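To make the blank-padding point concrete, here is a minimal C sketch of preparing a buffer before handing it to Fortran. The helper name c_to_fortran_string is made up for illustration (it is not part of any library), and it assumes the Fortran side expects exactly NAME_LEN characters with no terminating null.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_LEN 9  /* same value as the macro in the question */

/* Hypothetical helper: copy a null-terminated C string into a fixed-length,
   blank-padded buffer of the kind a Fortran CHARACTER(len=n) dummy expects.
   dst must have room for n bytes; no terminating null is written.
   Input longer than n is truncated. Returns the number of characters copied. */
size_t c_to_fortran_string(char *dst, size_t n, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t len = strlen(src);
    if (len > n)
        len = n;                      /* truncate over-long input */
    memcpy(dst, src, len);            /* the actual characters */
    memset(dst + len, ' ', n - len);  /* pad the rest with blanks (0x20) */
    return len;
}
```

With this, a caller would declare char buf[NAME_LEN], fill it with c_to_fortran_string(buf, NAME_LEN, name), and pass buf (plus NAME_LEN, if the Fortran routine takes a length argument) to the Fortran side. The returned count is there so the caller can detect truncation.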
An additional refinement is to use the ISO C Binding facility on the Fortran side (corrected per the comment!)
subroutine Fort_String_code (my_string) bind (C, name="Fort_String_code")
use iso_c_binding
integer, parameter :: NAME_LEN = 9
character (kind=c_char, len=1), dimension (NAME_LEN), intent (inout) :: my_string
etc. The "bind" name is the name by which C can call the Fortran routine -- it can be different from the Fortran name. Also provided as part of the iso_c_binding is the symbol C_NULL_CHAR which you can use in the Fortran code to provide the terminating null character that C expects, etc.
There's no difference between those declarations. The C compiler will treat them all as char*. You just have to make sure you null-terminate the string before you use it in C. If you're only using the string on the Fortran side and the C functions are just holding on to it, then you don't need to do anything.
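As a rough sketch of the "null-terminate it before you use it in C" step, the following helper strips the trailing blanks from a Fortran-style buffer and adds the terminator. The name fortran_to_c_string is hypothetical, and the code assumes the Fortran side returned exactly n blank-padded characters. It also shows where the 9 versus 10 question actually matters: the destination needs n + 1 bytes, that is NAME_LEN + 1, regardless of how the parameter is declared.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: turn a blank-padded Fortran buffer of n characters
   into a null-terminated C string. dst must hold at least n + 1 bytes.
   Returns the length of the resulting C string. */
size_t fortran_to_c_string(char *dst, const char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    while (n > 0 && src[n - 1] == ' ')
        n--;                 /* drop trailing blank padding */
    memcpy(dst, src, n);
    dst[n] = '\0';           /* the terminator C expects */
    return n;
}
```

Note that trailing blanks that were genuinely part of the name are lost by this conversion; if they matter, pass the length explicitly instead of trimming.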
Due to some restrictions on my assignment, F77 is used.
I am learning to use subroutines, but I encounter an error when trying to write a string out.
PROGRAM test
IMPLICIT NONE
INTEGER a
CHARACTER*20 STR,str1
STR = 'Hello world'
a = 1
WRITE (*,*) a
WRITE (*,10) STR
CALL TEST(str1)
STOP
END
SUBROUTINE test(str2)
CHARACTER*20 str2
str2 = 'testing'
WRITE (*,10) STR2
RETURN
END
When trying to compile this code, it returns that 'Error: missing statement number 10'
Also, I have some other questions:
What does the *20 mean in CHARACTER*20 STR?
Is this the size of the string?
How about 10 in WRITE (*,10) STR? Is this the length of string to be written?
What does (*,*) mean in WRITE (*,*) a?
As you can read for example here:
https://www.obliquity.com/computer/fortran/io.html
the second value given to write is an argument for the implicit format keyword, which is the label of a statement within the same program unit, a character expression or array containing the complete format specification, or an asterisk * for list-directed formatting.
Thus if you provide the data directly, you may want to use * there instead.
Otherwise, your program needs to have the label 10 at some line with formatting statement.
And yes, CHARACTER*20 STR means that the variable STR is of length 20, as you can read for instance here: https://www.obliquity.com/computer/fortran/datatype.html
The *20 after CHARACTER specifies the size of the CHARACTER variable (in this case 20 characters). FORTRAN doesn't use null-terminated strings like other languages, instead you have to reserve a specific number of characters. Your actual string can be shorter than the variable, but never longer.
The comma ( , ) in the write statement is used to separate the various arguments. Some versions of FORTRAN allow you to supply 'named' arguments but the default is the first argument is the file code to write to (a '*' implies the standard output). The second argument would be the label of a FORMAT statement. There can be more arguments; you'd have to look up the specifics for the WRITE statement in your version of FORTRAN.
Some of your WRITE() statements are specifying to use the FORMAT statement found at label '10'. But your sample doesn't provide any FORMAT statement, so this would be an error.
If you don't want to deal with a FORMAT statement, you can use an asterisk ( * ) as the second argument and then FORTRAN will use a general default format. That is what your first WRITE(*,*) is doing. It writes to 'stdout' using a general format.
I am writing the following simple routine:
program scratch
character*4 :: word
word = 'hell'
print *, concat(word)
end program scratch
function concat(x)
character*(*) x
concat = x // 'plus stuff'
end function concat
The program should be taking the string 'hell' and concatenating to it the string 'plus stuff'. I would like the function to be able to take in any length string (I am planning to use the word 'heaven' as well) and concatenate to it the string 'plus stuff'.
Currently, when I run this on Visual Studio 2012 I get the following error:
Error 1 error #6303: The assignment operation or the binary
expression operation is invalid for the data types of the two
operands. D:\aboufira\Desktop\TEMP\Visual
Studio\test\logicalfunction\scratch.f90 9
This error is for the following line:
concat = x // 'plus stuff'
It is not apparent to me why the two operands are not compatible. I have set them both to be strings. Why will they not concatenate?
High Performance Mark's comment tells you about why the compiler complains: implicit typing.
The result of the function concat is implicitly typed because you haven't declared its type otherwise. Although x // 'plus stuff' is the correct way to concatenate character variables, you're attempting to assign that new character object to a (implicitly) real function result.
Which leads to the question: "just how do I declare the function result to be a character?". Answer: much as you would any other character variable:
character(len=length) concat
[note that I use character(len=...) rather than character*.... I'll come on to exactly why later, but I'll also point out that the form character*4 is obsolete according to current Fortran, and may eventually be deleted entirely.]
The tricky part is: what is the length it should be declared as?
When declaring the length of a character function result which we don't know ahead of time there are two [1] approaches:
an automatic character object;
a deferred length character object.
In the case of this function, we know that the length of the result is 10 longer than the input. We can declare
character(len=LEN(x)+10) concat
To do this we cannot use the form character*(LEN(x)+10).
In a more general case, deferred length:
character(len=:), allocatable :: concat ! Deferred length, will be defined on allocation
where later
concat = x//'plus stuff' ! Using automatic allocation on intrinsic assignment
Using these forms adds the requirement that the function concat has an explicit interface in the main program. You'll find much about that in other questions and resources. Providing an explicit interface will also remove the problem that, in the main program, concat also implicitly has a real result.
To stress:
program
implicit none
character(len=[something]) concat
print *, concat('hell')
end program
will not work for concat having result of the "length unknown at compile time" forms. Ideally the function will be an internal one, or one accessed from a module.
[1] There is a third: assumed length function result. Anyone who wants to know about this could read this separate question. Everyone else should pretend this doesn't exist. Just like the writers of the Fortran standard.
Say the following module is given to me, and I am not allowed to edit it:
module somemod
type somestruct
character(40) somestr
end type
end module
And I use it in this code:
program myprog
use somemod
implicit none
character(size(somestruct%somestr)) localstr !Is this possible?
end program
Is there syntax to accomplish what the marked line is trying to do? That is, can I get the size of an array in a user-defined data structure without instantiating the data structure?
First,
character(40) somestr
is not an array, it is a character string of length 40.
The difference is substantial, it is not just nitpicking. You use arrays and strings differently. See Difference between "character*10 :: a" and "character :: a(10)" for more.
The length of a string is inquired by the intrinsic function len().
But unfortunately, you cannot call it on a component of a derived type, without first having a variable (instance) of that type.
So you need
program myprog
use somemod
implicit none
type(somestruct) :: o
character(len(o%somestr)) localstr !This is possible.
end program
If you needed the size of an array component, it would be the same, but with the size() intrinsic function.
I would like to use deferred-length character strings in a "simple" manner to read user input. The reason that I want to do this is that I do not want to have to declare the size of a character string before knowing how large the user input will be. I know that there are "complicated" ways to do this. For example, the iso_varying_string module can be used: https://www.fortran.com/iso_varying_string.f95. Also, there is a solution here: Fortran Character Input at Undefined Length. However, I was hoping for something as simple, or almost as simple, as the following:
program main
character(len = :), allocatable :: my_string
read(*, '(a)') my_string
write(*,'(a)') my_string
print *, allocated(my_string), len(my_string)
end program
When I run this program, the output is:
./a.out
here is the user input
F 32765
Notice that there is no output from write(*,'(a)') my_string. Why?
Also, my_string has not been allocated. Why?
Why isn't this a simple feature of Fortran? Do other languages have this simple feature? Am I lacking some basic understanding about this issue in general?
vincentjs's answer isn't quite right.
Modern (2003+) Fortran does allow automatic allocation and re-allocation of strings on assignment, so a sequence of statements such as this
character(len=:), allocatable :: string
...
string = 'Hello'
write(*,*) string
string = 'my friend'
write(*,*) string
string = 'Hello '//string
write(*,*) string
is correct and will work as expected and write out 3 strings of different lengths. At least one compiler in widespread use, the Intel Fortran compiler, does not engage 2003 semantics by default so may raise an error on trying to compile this. Refer to the documentation for the setting to use Fortran 2003.
However, this feature is not available when reading a string so you have to resort to the tried and tested (aka old-fashioned if you prefer) approach of declaring a buffer of sufficient size for any input and of then assigning the allocatable variable. Like this:
character(len=long) :: buffer
character(len=:), allocatable :: string
...
read(*,'(a)') buffer   ! '(a)' reads the whole line; list-directed read(*,*) would stop at the first blank
string = trim(buffer)
No, I don't know why the language standard forbids automatic allocation on read, just that it does.
Deferred length character is a Fortran 2003 feature. Note that many of the complicated methods linked to are written against earlier language versions.
With Fortran 2003 support, reading a complete record into a character variable is relatively straightforward. A simple example with very minimal error handling is below. Such a procedure only needs to be written once, and can be customized to suit a user's particular requirements.
PROGRAM main
USE, INTRINSIC :: ISO_FORTRAN_ENV, ONLY: INPUT_UNIT
IMPLICIT NONE
CHARACTER(:), ALLOCATABLE :: my_string
CALL read_line(input_unit, my_string)
WRITE (*, "(A)") my_string
PRINT *, ALLOCATED(my_string), LEN(my_string)
CONTAINS
SUBROUTINE read_line(unit, line)
! The unit, connected for formatted input, to read the record from.
INTEGER, INTENT(IN) :: unit
! The contents of the record.
CHARACTER(:), INTENT(OUT), ALLOCATABLE :: line
INTEGER :: stat ! IO statement IOSTAT result.
CHARACTER(256) :: buffer ! Buffer to read a piece of the record.
INTEGER :: size ! Number of characters read from the file.
!***
line = ''
DO
READ (unit, "(A)", ADVANCE='NO', IOSTAT=stat, SIZE=size) buffer
IF (stat > 0) STOP 'Error reading file.'
line = line // buffer(:size)
! An end of record condition or end of file condition stops the loop.
IF (stat < 0) RETURN
END DO
END SUBROUTINE read_line
END PROGRAM main
Deferred length arrays are just that: deferred length. You still need to allocate the size of the array using the allocate statement before you can assign values to it. Once you allocate it, you can't change the size of the array unless you deallocate and then reallocate with a new size. That's why you're getting a debug error.
Fortran does not provide a way to dynamically resize character arrays like the std::string class does in C++, for example. In C++, you could initialize std::string var = "temp", then redefine it to var = "temporary" without any extra work, and this would be valid. This is only possible because the resizing is done behind the scenes by the functions in the std::string class (it doubles the size if the buffer limit is exceeded, which is functionally equivalent to reallocating with a 2x bigger array).
Practically speaking, the easiest way I've found when dealing with strings in Fortran is to allocate a reasonably large character array that will fit most expected inputs. If the size of the input exceeds the buffer, then simply increase the size of your array by reallocating with a larger size. Removing trailing white space can be done using trim.
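For what it's worth, the "grow the buffer when it fills up" idea looks roughly like this when written out by hand in C. This is only a sketch: read_line_dynamic is a made-up name, the starting capacity of 16 is arbitrary, and doubling is just one common growth strategy (what a given std::string implementation actually does is up to that implementation).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line of any length from fp, doubling the
   buffer whenever it fills up. Returns a malloc'd, null-terminated string
   without the newline, or NULL at end of file with nothing read or on
   allocation failure. The caller frees the result. */
char *read_line_dynamic(FILE *fp)
{
    assert(fp != NULL);
    size_t cap = 16, len = 0;        /* arbitrary starting capacity */
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;
    int ch;
    while ((ch = fgetc(fp)) != EOF && ch != '\n') {
        if (len + 1 == cap) {        /* keep one byte free for the terminator */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
        buf[len++] = (char)ch;
    }
    if (ch == EOF && len == 0) {     /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

A caller would loop on read_line_dynamic(stdin) until it returns NULL, freeing each line after use.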
You know that there are "complicated" ways of doing what you want. Rather than address those, I'll answer your first two "why?"s.
Unlike intrinsic assignment, a read statement does not have the target variable first allocated to the correct size and type parameters for the thing coming in (if it isn't already like that). Indeed, it is a requirement that the items in an input list be allocated. Fortran 2008, 9.6.3, clearly states:
If an input item or an output item is allocatable, it shall be allocated.
This is the case whether the allocatable variable is a character with deferred length, a variable with other deferred length-type parameters, or an array.
There is another way to declare a character with deferred length: giving it the pointer attribute. This doesn't help you, though, as we also see
If an input item is a pointer, it shall be associated with a definable target ...
Why you have no output from your write statement is related to why you see that the character variable isn't allocated: you haven't followed the requirements of Fortran and so you can't expect the behaviour that isn't specified.
I'll speculate as to why this restriction is here. I see two obvious ways to relax the restriction
allow automatic allocation generally;
allow allocation of a deferred length character.
The second case would be easy:
If an input item or an output item is allocatable, it shall be allocated unless it is a scalar character variable with deferred length.
This, though, is clumsy and such special cases seem against the ethos of the standard as a whole. We'd also need a carefully thought out rule about allocation for this special case.
If we go for the general case for allocation, we'd presumably require that the unallocated effective item is the final effective item in the list:
integer, allocatable :: a(:), b(:)
character(7) :: ifile = '1 2 3 4'
read(ifile,*) a, b
and then we have to worry about
type aaargh(len)
integer, len :: len
integer, dimension(len) :: a, b
end type
type(aaargh), allocatable :: a(:)
character(9) :: ifile = '1 2 3 4 5'
read(ifile,*) a
It gets quite messy very quickly. Which seems like a lot of problems to resolve where there are ways, of varying difficulty, of solving the read problem.
Finally, I'll also note that allocation is possible during a data transfer statement. Although a variable must be allocated (as the rules are now) when appearing in an input list, components of an allocated variable of derived type needn't be if that effective item is processed by defined input.
I am converting an integer to a string using the following method
Character (len=10) :: s
Integer :: i
i = 7
Write (s, *, iostat=ios) i
However this leaves the contents of s empty. When I introduce an explicit format, s is populated as intended.
Character (len=10) :: s
Integer :: i
i = 7
Write (s, "(i3)", iostat=ios) i
It looks like the problem is related to the length of the string s because
when I use a longer length, I do get the correct number.
Character (len=25) :: s
The first lesson to take from this is: if you use the iostat= specifier, check the result. However, the behaviour of your code is seemingly not guaranteed. I'll base this part of the answer on interpreting how your compiler is taking things.
In addition to the iostat= specifier, in general you can use, as Vladimir F mentions in a comment, the iomsg= specifier to get a friendly message. [As IanH notes, the nominated variable is updated, and in particular may otherwise remain undefined, only in situations where the variable for the iostat= specifier is (or would be) set to non-zero.]
character (len=10) s
character (len=58) mesg
integer ios
write(s,*, iostat=ios, iomsg=mesg) 7
if (ios/=0) print*, ios, TRIM(mesg)
You want to check this, because you are using list-directed output. Here, the compiler is free to choose "reasonable" values for the integer edit format. It's more than likely that, for default integer kind, the field would be longer than 9 (don't forget the leading blank). Thus, it doesn't "fit" into the length-10 character record: "End of record" would be reasonable for mesg in this case.
With the explicit format I3 it fits with much to spare.
The reason you see LEN_TRIM(s) as 10 is that s most likely ends up containing junk.
Now, coming to the final part. It appears that your code with list-directed output is not valid. Fortran 2008 (and I presume many others) explicitly states:
On output, the output list and format specification shall not specify more characters for a record than ... the record length of an internal file.
The record length of your internal file being 10.
The usual caveats of relying on any particular behaviour hold. I'd be disappointed, though, if something dramatic happened.