I have some functions written in Fortran that take a structure as an argument, but the caller has the data stored in an INTEGER*4(2) array. In order to avoid the copy between the two data structures, I'm wondering if the following implementation of a C++-like reinterpret_cast is valid according to the specification:
STRUCTURE /TimeStamp/
INTEGER*4 secondsSinceEpoch
INTEGER*4 nanos
END STRUCTURE
STRUCTURE /reinterpret_cast/
UNION
MAP
INTEGER*4, POINTER :: array(:)
END MAP
MAP
TYPE (TimeStamp), POINTER :: tstamp
END MAP
END UNION
END STRUCTURE
SUBROUTINE set_time(timeArg)
INTEGER*4, TARGET :: timeArg(2)
RECORD /reinterpret_cast/ time
time % array => timeArg
time % tstamp % secondsSinceEpoch = 12
time % tstamp % nanos = 0
END
Is this implementation of the set_time method guaranteed to work (e.g., set the values of timeArg(1) and timeArg(2))?
No, your subroutine is not guaranteed to work by the Fortran standard, and many compilers will refuse the syntax altogether. I am not sure whether Fortran pointers are allowed in DEC structures and, if so, whether you can put them in a union. These extensions (STRUCTURE, UNION and RECORD) were designed before Fortran pointers were put into the standard and are strongly discouraged for new code, but it is quite possible that Intel allowed Fortran pointers in them.
A much easier way (at least for me) is to use the standard Fortran type(c_ptr), which is basically the C void * pointer, together with c_loc and c_f_pointer.
SUBROUTINE set_time(timeArg)
USE, INTRINSIC :: ISO_C_BINDING
INTEGER(c_int32_t), TARGET :: timeArg(2)
type(TimeStamp), POINTER :: tstamp
CALL c_f_pointer(c_loc(timeArg), tstamp)
tstamp % secondsSinceEpoch = 12
tstamp % nanos = 0
END
I also changed the INTEGER*4 because it is also not standard conforming and not guaranteed to be C-interoperable.
Do note that the address of the target dummy argument is valid only inside the subroutine unless the actual argument is itself a pointer or a target.
What you are looking for is the F90-standard function TRANSFER. It interprets the bit representation of the operand as if it were of the same type as another variable (the "mold"). Thus, this:
USE ISO_FORTRAN_ENV ! For the REALnn and INTnn constants
REAL(REAL32) r
INTEGER(INT32) i
r = 1.0
i = TRANSFER(r, i) ! The second "i" here is unevaluated, just gives the type
Is equivalent to this:
float r = 1.0;
int32_t i;
i = *reinterpret_cast<int32_t*>(&r);
Note that the REALnn and INTnn constants are from Fortran 2008, so your compiler might not have them. I just used them as examples to make sure that the types were compatible, since just like in C, the standard does not say precisely how big a "default real" or "default integer" are.
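On the C++ side, dereferencing a pointer obtained by reinterpret_cast to an unrelated type is formally undefined behaviour under the aliasing rules, so a byte copy is arguably the closer analogue of TRANSFER. A minimal sketch, assuming float is a 32-bit IEEE 754 type; the name bits_of is made up for illustration:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy the object representation of a float into a
// 32-bit integer, which is what TRANSFER(r, i) does on the Fortran side.
std::int32_t bits_of(float r) {
    static_assert(sizeof(float) == sizeof(std::int32_t),
                  "this sketch assumes a 32-bit float");
    std::int32_t i;
    std::memcpy(&i, &r, sizeof i);  // byte copy, no aliasing violation
    return i;
}
```

Under that assumption, bits_of(1.0f) yields 0x3F800000.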
As an example, I frequently use this function when creating Fortran-based MEX functions in Matlab, since the Matlab interface with Fortran is based on F77 and does not allow you to use pointers to Matlab memory directly, unlike the C interface. I use the TRANSFER function and the ISO_C_BINDING module (F2003) to cast the "integer" (actually a C pointer) that Matlab gives me to the Fortran type C_PTR, and then to a Fortran pointer. Like this:
USE ISO_C_BINDING ! For C_PTR and related functions
USE ISO_FORTRAN_ENV, ONLY: INT32 ! For the INT32 kind constant
INTEGER(INT32), POINTER :: arrayPtr(:)
mwSize n ! This is a type defined in the Matlab-Fortran interface
mwPointer myMatlabArray = ... ! So is this
TYPE(C_PTR) cPtrToData
! Cast the returned C pointer to the data (Matlab interface returns an integer type)
cPtrToData = TRANSFER(mxGetData(myMatlabArray), cPtrToData)
! Since Fortran arrays/pointers have size information, get the length
n = mxGetNumberOfElements(myMatlabArray)
CALL C_F_POINTER(cPtrToData, arrayPtr, [n]) ! Associate the Fortran ptr
arrayPtr(3:7) = ... ! Do whatever, no need to copy
Which is the rough equivalent to the C version:
mxArray* myMatlabArray = ...; //
mwSize n = mxGetNumberOfElements(myMatlabArray);
int* arrayPtr = (int*)mxGetData(myMatlabArray);
arrayPtr[3] = ... // Do whatever, no need to copy
So in both cases these MEX functions could be called with a Matlab array of type int32.
I am working with a legacy Fortran library that requires a character scalar PATH as an argument to the subroutine. The original interface was:
SUBROUTINE MINIMAL(VAR1, ..., PATH)
CHARACTER (LEN=4096) PATH
...
I need to be able to call this from C++ so I have made the following changes:
SUBROUTINE MINIMAL(VAR1, ..., PATH) &
BIND (C, NAME="minimal_f")
USE ISO_C_BINDING, ONLY: C_CHAR, C_NULL_CHAR
CHARACTER (KIND=C_CHAR, LEN=1), DIMENSION(4096), INTENT(IN) :: PATH
CHARACTER (LEN=4096):: new_path
! Converting C char array to Fortran CHARACTER.
new_path = " "
loop_string: do i=1, 4096
if ( PATH (i) == c_null_char ) then
exit loop_string
else
new_path (i:i) = PATH (i)
end if
end do loop_string
as per this answer. This works to convert the C-style char array to its Fortran scalar equivalent, with two problems:
This code is on the critical path so doing the same conversion every time when the answer is the same is inefficient
I would strongly prefer to not have to edit legacy code
I have tried:
Just accepting a CHARACTER (LENGTH=4096) :: new_path directly with the ISO C binding, but I get the following compiler error:
Error: Character argument 'new_path' at (1) must be length 1 because procedure 'minimal' is BIND(C)
This answer and others that I have read suggest that the ISO C binding seems to restrict what I can pass as parameters to the function, although I haven't found any official documentation yet.
This answer, which gives another algorithm to turn a C-style string into a Fortran-style equivalent in the C code and passing it to the Fortran subroutine without using the ISO C binding. (This function suggests a similar algorithm). This seems like exactly what I want but I have a linker error without the binding:
Undefined symbols for architecture x86_64:
"_minimal", referenced from:
C++-side function declaration:
extern "C" {
double minimal(int* var1, ..., const char* path);
}
This suggests that my compiler (gcc) prepends the function name with an underscore when in an extern block. gfortran, however, does not let me name the subroutine _minimal so the linker can't find the symbol _minimal. (The aforementioned link suggests adding an underscore to the end of the C-side function name but this doesn't work either because of the leading underscore.)
I want to process a C-style string into a Fortran-style character scalar once in my C++ code and be able to pass it into the original interface. Any ideas?
Fortran 2018 allows interoperable procedures to have character dummy arguments of assumed length, relaxing the restriction that such dummy arguments must be of length one.
So we can write a Fortran procedure as
subroutine minimal(path) bind(c)
use, intrinsic :: iso_c_binding, only : c_char
character(*,c_char), intent(in) :: path
...
end subroutine minimal
and continue our life knowing that we've also improved our Fortran code by using an assumed length scalar instead of an explicit length one. No "Fortran side" copy of this character dummy is required.
The sad part of this story is that the dummy argument path is not interoperable with a char. So instead of the formal parameter of the C (or C++) function being char *, it must be CFI_cdesc_t *. For (C) example:
#include "ISO_Fortran_binding.h"
#include "string.h"
void minimal(CFI_cdesc_t *);
int main(int argc, char *argv[]) {
/* Fortran argument will be a scalar (rank 0) */
CFI_CDESC_T(0) fpath;
CFI_rank_t rank = 0;
char path[46] = "afile.txt";
CFI_establish((CFI_cdesc_t *)&fpath, path, CFI_attribute_other,
CFI_type_char, strlen(path)*sizeof(char), rank, NULL);
minimal((CFI_cdesc_t *)&fpath);
return 0;
}
A C++ example will be similar.
A notable part of the story is that you'll need a Fortran compiler which implements this part of Fortran 2018. GCC 11 does not.
IanH's answer draws attention to an approach which avoids modifying the original Fortran subroutine at all. There certainly are times when avoiding any change there is good (repeating slightly what IanH said):
using bind(c) means an explicit interface will now always be required when calling the modified subroutine through Fortran itself. Perhaps some parts of your code used it with an implicit interface
the original was tested (or wasn't) and you don't want to break anything
you don't want to potentially change the argument from default kind to interoperable kind (if these do differ)
the explicit length dummy argument really is wanted
you just don't want to modify it if not required
Any one of those would make a good argument, so in that spirit I'll add to the C example with the thin wrapper.
Fortran:
subroutine minimal_wrap(path) bind(c, name='minimal')
use, intrinsic :: iso_c_binding, only : c_char
character(*,c_char), intent(in) :: path
call minimal(path)
end subroutine minimal_wrap
subroutine minimal(path)
character(4096) path
print*, trim(path)
end subroutine minimal
C:
#include "ISO_Fortran_binding.h"
#include "string.h"
void minimal(CFI_cdesc_t *);
static const int pathlength=4096;
int main(int argc, char *argv[]) {
/* Fortran argument will be a scalar (rank 0) */
CFI_CDESC_T(0) fpath;
CFI_rank_t rank = 0;
char path[pathlength];
/* Set path as desired. Recall that it shouldn't be null-terminated
for Fortran */
CFI_establish((CFI_cdesc_t *)&fpath, path, CFI_attribute_other,
CFI_type_char, pathlength*sizeof(char), rank, NULL);
minimal((CFI_cdesc_t *)&fpath);
return 0;
}
C++ using containers will arguably be nicer.
Recall that this puts responsibility on the C side to ensure the array is long enough (as you have in pure Fortran calls).
Equally if you need to be robust to differences in default character and interoperable character with that copy (as in IanH's answer) you can apply those same tricks to copy as required (or you can do this with conditional compilation and configure-time checks). By this point however, you may as well just assume always copy or use the array argument.
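The "set path as desired" step in the example above is where the blank padding has to happen. A minimal sketch of that step in C++; the helper name fill_fortran_chars is hypothetical, and it assumes the Fortran side wants a blank-padded buffer with no terminating null:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy a null-terminated C string into a fixed-size
// buffer the way a Fortran CHARACTER(len) variable expects it: blank padded,
// with no terminating null. Returns false if src does not fit.
bool fill_fortran_chars(char* buf, std::size_t buflen, const char* src) {
    const std::size_t n = std::strlen(src);
    if (n > buflen) return false;           // caller must handle overflow
    std::memcpy(buf, src, n);               // the text itself, without '\0'
    std::fill(buf + n, buf + buflen, ' ');  // pad the remainder with blanks
    return true;
}
```

Because the padding depends only on the path, this can be done once on the C++ side and the same buffer reused for every call.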
The answer to the question title is typically a std::string object, padded to the relevant fixed Fortran CHARACTER scalar length with spaces. Alternative storage objects (std::vector<char>, or a C-style char array) could be used on the C++ side, but the approach is similar.
(If the Fortran code used an assumed length character argument, rather than fixed length, then the padding would not be required. Whether this change is possible depends on the details of the MINIMAL subroutine. Fixed length character variables are typically an anachronism - this answer is not advocating their use in new code.)
On the Fortran side, you can write a thin wrapper that the C++ can call, that uses sequence and pointer association to avoid the need to copy the string data, for typical C++/Fortran platforms of today. A copy (or modification of the legacy Fortran code) is unavoidable if the interoperable character kind is not the same as the character kind of the legacy Fortran procedure. The example code below is robust to this situation, but I expect platforms that require that code path to be rare.
For default character and C_CHAR interoperable character arguments, sequence association permits an array dummy argument to be associated with the sequence of characters designated by the actual argument. This effectively permits association between character scalars and arrays with different lengths.
(Do not confuse the ISO_C_BINDING intrinsic module with the BIND(C) procedure suffix. BIND(C) fundamentally changes the interface of a procedure to enable calls between C and Fortran - ISO_C_BINDING is just a module with some handy types, constants and procedures for such calls.)
Example C++:
#include <string>
#include <cassert>
const int path_length = 4096;
extern "C" int legacy_cintf(char* array);
int main()
{
std::string some_long_text
= "It was the best of times, it was the worst of times, it was "
"the age of wisdom, it was the age of foolishness, it was the "
"epoch of belief, it was the epoch of incredulity, it was the "
"season of light, it was the season of darkness, it was the "
"spring of hope, it was the winter of despair.";
assert(some_long_text.size() < path_length);
std::string path = std::string(path_length, ' ');
path.replace(0, some_long_text.size(), some_long_text);
legacy_cintf(&path[0]);
return 0;
}
Example Fortran:
MODULE m
IMPLICIT NONE
CONTAINS
SUBROUTINE legacy_cintf(array) BIND(C, NAME='legacy_cintf')
USE, INTRINSIC :: ISO_C_BINDING, ONLY: C_INT, C_CHAR
CHARACTER(LEN=1,KIND=C_CHAR), TARGET :: array(4096)
CHARACTER(LEN=SIZE(array)), POINTER :: scalar
LOGICAL :: copy_required
copy_required = C_CHAR /= KIND(scalar)
IF (copy_required) THEN
ALLOCATE(scalar)
CALL do_copy(array, scalar)
ELSE
CALL do_associate(array, scalar)
END IF
CALL LEGACY(scalar)
IF (copy_required) DEALLOCATE(scalar)
END SUBROUTINE legacy_cintf
SUBROUTINE do_associate(arg, scalar)
CHARACTER(*), INTENT(OUT), POINTER :: scalar
CHARACTER(LEN=LEN(scalar)), INTENT(IN), TARGET :: arg(1)
scalar => arg(1)
END SUBROUTINE
SUBROUTINE do_copy(arg, scalar)
USE, INTRINSIC :: ISO_C_BINDING, ONLY: C_CHAR
CHARACTER(*), INTENT(OUT) :: scalar
CHARACTER(LEN=LEN(scalar), KIND=C_CHAR), INTENT(IN) :: arg(1)
scalar = arg(1)
END SUBROUTINE do_copy
END MODULE m
SUBROUTINE LEGACY(PATH)
CHARACTER(4096) :: PATH
PRINT *, TRIM(PATH)
END SUBROUTINE LEGACY
I'm modernizing some old Fortran code and I cannot get rid of an equivalence statement somewhere (long story short: its mixed use is so convoluted it'd take too much work to convert everything).
I need the length of the EQUIVALENCEd arrays to depend on some input, like the following code:
program test_equivalence
implicit none
type :: t1
integer :: len = 10
end type t1
type(t1) :: o1
call eqv_int(o1%len)
call eqv(o1)
return
contains
subroutine eqv_int(len)
integer, intent(in) :: len
integer :: iwork(len*2)
real(8) :: rwork(len)
equivalence(iwork,rwork)
print *, 'LEN = ',len
print *, 'SIZE(IWORK) = ',size(iwork)
print *, 'SIZE(RWORK) = ',size(rwork)
end subroutine eqv_int
subroutine eqv(o1)
type(t1), intent(in) :: o1
integer :: iwork(o1%len*2)
real(8) :: rwork(o1%len)
equivalence(iwork,rwork)
print *, 'LEN = ',o1%len
print *, 'SIZE(IWORK) = ',size(iwork)
print *, 'SIZE(RWORK) = ',size(rwork)
end subroutine eqv
end program test_equivalence
This program will create 0-length arrays with gfortran 9.2.0. This is the output:
LEN = 10
SIZE(IWORK) = 0
SIZE(RWORK) = 0
LEN = 10
SIZE(IWORK) = 0
SIZE(RWORK) = 0
The same code fails with Array 'rwork' at (1) with non-constant bounds cannot be an EQUIVALENCE object when compiled with gfortran 5.3.0. The message disappears from gfortran 6.2.0 onward, but the size of the arrays is always 0. So maybe a compiler bug?
The source code is indeed not a valid Fortran program. To be specific, it violates the numbered constraint C8106 of Fortran 2018:
An equivalence-object shall not be a designator with a base object that is .. an automatic data object ..
Being a numbered constraint, the compiler must be capable of detecting this violation. If it lacks such a capability, this is a deficiency in the compiler (a bug). Being "capable" doesn't mean doing so by default, so please look carefully to see whether there are options which do lead to this detection. Someone familiar with the internals of GCC can give further detail here.
As the source isn't a valid Fortran program, the compiler is allowed to consider the arrays of size zero if it has skipped the violation detection.
We are trying to take over the memory allocation of a legacy Fortran code (+100,000 lines of code) in C++, because we are using a C library for partitioning and allocating distributed memory on a cluster. The allocatable variables are defined in modules. When we call subroutines that use these modules the index seems to be wrong (shifted by one). However, if we pass the same argument to another subroutine we get what we expect. The following simple example illustrates the issue:
hello.f95:
MODULE MYMOD
IMPLICIT NONE
INTEGER, ALLOCATABLE, DIMENSION(:) :: A
SAVE
END MODULE
SUBROUTINE TEST(A)
IMPLICIT NONE
INTEGER A(*)
PRINT *,"A(1): ",A(1)
PRINT *,"A(2): ",A(2)
END
SUBROUTINE HELLO()
USE MYMOD
IMPLICIT NONE
PRINT *,"A(1): ",A(1)
PRINT *,"A(2): ",A(2)
CALL TEST(A)
end SUBROUTINE HELLO
main.cpp
extern "C" int* __mymod_MOD_a; // Name depends on compiler
extern "C" void hello_(); // Name depends on compiler
int main(int args, char** argv)
{
__mymod_MOD_a = new int[10];
for(int i=0; i<10; ++i) __mymod_MOD_a[i] = i;
hello_();
return 0;
}
We are compiling with:
gfortran -c hello.f95; c++ -c main.cpp; c++ main.o hello.o -o main -lgfortran;
Output from running ./main is
A(1): 1
A(2): 2
A(1): 0
A(2): 1
As you can see, the output for A is different, though both subroutines printed A(1) and A(2). Thus, it seems that HELLO starts from A(0) and not A(1). This is probably because ALLOCATE has never been called directly in Fortran, so Fortran is not aware of the bounds of A. Any workarounds?
The ISO_C_BINDING "equivalent" code:
c++ code:
extern "C" int size;
extern "C" int* c_a;
extern "C" void hello();
int main(int args, char** argv)
{
size = 10;
c_a = new int[size];
for(int i=0; i<size; ++i) c_a[i] = i;
hello();
return 0;
}
fortran code:
MODULE MYMOD
USE, INTRINSIC :: ISO_C_BINDING
IMPLICIT NONE
INTEGER(C_INT), BIND(C) :: SIZE
TYPE (C_PTR), BIND(C) :: C_A
INTEGER(C_INT), POINTER :: A(:)
SAVE
END MODULE
SUBROUTINE TEST(A)
IMPLICIT NONE
INTEGER A(*)
PRINT *,"A(1): ",A(1)
PRINT *,"A(2): ",A(2)
END
SUBROUTINE HELLO() BIND(C)
USE, INTRINSIC :: ISO_C_BINDING
USE MYMOD
IMPLICIT NONE
CALL C_F_POINTER(C_A,A,(/SIZE/))
PRINT *,"A(1): ",A(1)
PRINT *,"A(2): ",A(2)
CALL TEST(A)
END SUBROUTINE
Output:
A(1): 0
A(2): 1
A(1): 0
A(2): 1
Fortran array dummy arguments always start at the lower bound defined in the subroutine. Their lower bound is not retained during the call. Therefore the argument A in TEST() will always start at one. If you wish it to start from 42, you must do:
INTEGER A(42:*)
Regarding the allocation, you are playing with fire. It is much better to use Fortran pointers for this.
integer, pointer :: A(:)
You can then set the array to point to a C buffer by
use iso_c_binding
call c_f_pointer(c_ptr, a, [the dimensions of the array])
where c_ptr is of type(c_ptr), interoperable with void *, which also comes from iso_c_binding.
---Edit---
Now that I see that @Max la Cour Christensen implemented what I sketched above, I realize I misunderstood the output of your code. The descriptor was indeed wrong, though I didn't write anything plainly wrong. The solution above still applies.
The internal representation of Fortran arrays is very different from the one used in C/C++.
Fortran uses descriptors that start with a pointer to the array data, followed by the element type size, the number of dimensions, some padding bytes, and an internal 32/64 bit byte sequence indicating various flags such as pointer, target, allocatable, can be deallocated, etc. Most of these flags are not documented (at least in ifort, which I have worked with). At the end is a sequence of records, each describing the number of elements in the corresponding dimension, the distance between elements, etc.
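To make the role of the bounds concrete, here is a deliberately simplified sketch in C++. ToyDescriptor and element_at are hypothetical names, and the layout does not match ifort, gfortran or any other real compiler; it only shows how an index is resolved through a descriptor rather than through the raw pointer.

```cpp
#include <cstddef>

// Hypothetical, heavily simplified one-dimensional array descriptor.
// Real compilers use different, compiler-specific layouts.
struct ToyDescriptor {
    int* base;              // address of the first element
    std::ptrdiff_t lower;   // lower bound of the Fortran index
    std::ptrdiff_t stride;  // distance between elements, in elements
};

// Resolve a Fortran-style index i through the descriptor.
int& element_at(const ToyDescriptor& d, std::ptrdiff_t i) {
    return d.base[(i - d.lower) * d.stride];
}
```

With lower set to 1, element_at(d, 1) is the first element of the buffer. If the bound field holds something else because nothing on the Fortran side ever filled it in (say 0), the same index lands one element further along, which resembles the off-by-one seen in the question.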
To 'see' an externally created array from Fortran, you'd need to create such descriptors in C/C++. It does not end there, though, because Fortran also makes copies of them in the startup code of each subroutine before it gets to the first of your statements, depending on indicators like 'in', 'out', 'inout', and other attributes used in the Fortran array declaration.
Arrays within a type declared with specific sizes map well (again in ifort) to corresponding C struct members of the same type and number of elements, but pointer and allocatable type members are really descriptors in the type that need to be initialized to the correct values in all their fields so Fortran can 'see' the allocatable value. This is at best tricky and dangerous, since the Fortran compiler may generate copy code for arrays in undocumented ways for optimization purposes, but it needs to 'see' all the involved Fortran code to do so. Anything coming from outside the Fortran domain is not known to it and can result in unexpected behavior.
Your best bet is to see if gfortran supports something like iso_c_binding and define such interfaces for your fortran code, and then use iso_c_binding intrinsics to map the C_PTR pointers to fortran pointers to types, arrays, etc.
You can also pass a pointer to a one-dimensional array of char, and its size, and this works for strings mostly as long as the size is passed by value as last argument (again, compiler and compiler-flag dependent).
Hope this helps.
EDIT: changed 'ifort's iso_c_binding' to 'iso_c_binding' after Vladimir's comment - thanks!
I've recently inherited Fortran code that used to be built with an older version of the Intel Visual Fortran compiler. There's a section of code that used to compile, but now throws an error #6633 'The type of the actual argument differs from the type of the dummy argument.'
The problem arises when the subroutine READ_AND_CONVERT is called with REAL*4 DATA_ARRAY(*), while inside READ_AND_CONVERT that parameter is declared as INTEGER*2. I think it really just wants the address of DATA_ARRAY.
Is there a way to pass the address of the DATA_ARRAY, even though they're of different types?
Here is READ_AND_CONVERT:
SUBROUTINE READ_AND_CONVERT (MX, N)
C=======================================================================
C Reads Integer*2 Data Array and Converts it to Real*4.
C
C This is a service routine called by subroutines
C READ_XYZ_2, READ_XYZ_4, READ_XYZ_ALL and READ_XYZ_FULL
C=======================================================================
C
IMPLICIT NONE
C
INCLUDE 'XYZ.FOR'
INCLUDE 'COMMON_XYZIO.FOR'
INCLUDE 'COMMON_HDR.FOR'
C
C-----------------------------------------------------------------------
C Local Parameters
C-----------------------------------------------------------------------
C
LOGICAL BB_FOUND
INTEGER*2 MX, MY
INTEGER*4 N, J
REAL*4 YJ, BB
C
DIMENSION MX(*), MY(2)
EQUIVALENCE (YJ, MY(1))
C
C-----------------------------------------------------------------------
C
CALL GET_REAL_PARAMETER ('XYZ$_OFFSET', BB, BB_FOUND)
C
READ (LUGIN) (MX(J), J = 1,N)
C
IF (BB_FOUND) THEN
DO J = N, 1, -1
YJ = (SCALE_FACTOR * MX(J)) + BB
MX(2*J) = MY(2)
MX(2*J-1) = MY(1)
END DO
ELSE
DO J = N, 1, -1
YJ = SCALE_FACTOR * MX(J)
MX(2*J) = MY(2)
MX(2*J-1) = MY(1)
END DO
END IF
C
RETURN
END
Found a solution here:
Basically disable the warning... by setting Properties | Fortran | Diagnostics | Check Routine Interfaces [change from Yes to No]
The article also shows how to do casting, in their example of a complex array to a real array:
use ISO_C_BINDING
complex(8), allocatable :: c(:)
real(8), pointer:: p(:)
allocate(c(N))
call C_F_POINTER(C_LOC(c), p, [2*N])
call donothing(N, p)
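For readers coming from the C++ side, the same kind of no-copy view exists there. This is a minimal sketch with a made-up helper name, as_reals. It relies on the layout the C++ standard specifies for arrays of std::complex, where element i occupies positions 2*i (real part) and 2*i+1 (imaginary part) of the underlying real array.

```cpp
#include <complex>

// Hypothetical helper: view an array of n complex values as 2*n reals
// without copying. The C++ standard permits this particular cast for
// arrays of std::complex<T>.
double* as_reals(std::complex<double>* c) {
    return reinterpret_cast<double*>(c);
}
```

Writes through the returned pointer are visible in the complex array, just as writes through p are visible in c in the Fortran version.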
There are directives in Intel Fortran which disable the argument type check for a given routine and for a given argument. To disable the checks for all your code is dangerous!
!DEC$ ATTRIBUTES NO_ARG_CHECK :: ARGUMENT_NAME
source: https://software.intel.com/en-us/forums/intel-visual-fortran-compiler-for-windows/topic/288896
The Fortran intrinsic function transfer can be used to convert a derived type into a real or integer array. This is potentially very useful when working in legacy systems which rely on arrays of primitive types (integer, real etc.) for persistence.
The code below runs at least on ifort and gfortran and converts a simple derived type example to an integer array (updated with solution):
program main
implicit none
integer, parameter :: int_mem_size = storage_size(1)
type subtype
integer a
double precision b
end type subtype
type :: mytype
integer :: foo
double precision :: bar
type(subtype) :: some_type
end type
type(mytype) :: my_var
type(subtype) :: my_subtype
! Old version: integer :: x(30)
integer, allocatable :: x(:)
integer :: mem_size
!Allocate array with required size
mem_size = storage_size(my_var)
allocate(x(mem_size/int_mem_size))
my_subtype%a = 1
my_subtype%b = 2.7
my_var%foo = 42
my_var%bar = 3.14
my_var%some_type = my_subtype
write(*,*) "transfering..."
x = transfer(my_var, x)
write(*,*) "Integer transformation:", x
end program main
On my PC, this is the output (this result is at least platform dependent):
transfering...
Integer transformation: 42 0 1610612736 1074339512
999 0 -1610612736 1074108825
My problem is that I have "guessed" that a 30 element long integer array is large enough to store this data structure. Is there a way I can determine how large the array needs to be to store the whole data structure?
If you have a Fortran 2008 compliant compiler, or one that is compliant enough, you will find the intrinsic function storage_size, which returns the number of bits used to store its argument. Failing that, most compilers that I am familiar with implement a non-standard function to do this; the Intel Fortran compiler has a function called sizeof which returns the number of bytes required to store its argument.
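Whichever function reports the size, the element count of the integer array is the object's size divided by the size of one integer, rounded up. Plain integer division, as in mem_size/int_mem_size, would come up one element short if the sizes did not divide evenly. The arithmetic, sketched in C++ with a hypothetical helper name:

```cpp
#include <cstddef>

// Number of whole words of word_bits bits needed to hold total_bits bits,
// rounding up so that a partial trailing word is still counted.
constexpr std::size_t words_needed(std::size_t total_bits,
                                   std::size_t word_bits) {
    return (total_bits + word_bits - 1) / word_bits;
}
```

For a 256-bit object and 32-bit integers this gives 8; for a 40-bit object it gives 2.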