I'm trying to call some Fortran 90 code from a C++ main program. The Fortran subroutine takes an array of doubles (call it X) as a parameter, then uses size(X) in many places in the code. I call the routine with a C array created via
double *x = new double[21];
but when I print the result of size(X) in the Fortran code I get 837511505, or some other large number.
Right now I can modify the Fortran code, so the worst case is to rewrite the function, passing the size as a parameter. But I'd rather not do it.
Does anyone know if there's a way I can create the C array in such a way that the Fortran routine can figure out its size?
This is an implementation-specific feature. Some implementations (on RSX and OpenVMS, for example) pass a descriptor structure containing a pointer to the data as well as a description of its dimensions, types, etc. Other implementations pass no such thing unless the external declaration explicitly invokes a mechanism that generates a descriptor, and most others provide no such mechanism at all.
Without knowing which implementation is in use:
a) read the compiler's documentation
b) have the compiler generate assembly, and inspect it to see what it expects
Intel F95 uses an array descriptor structure which, apart from the array pointer, also stores the bounds and dimension information; size() gets its information from the descriptor.
Since you're passing only a bare pointer from C, no descriptor is available, so size() returns gibberish.
Generally, you're in the rough territory of mixed-language programming, where arrays and structures are often a programmer's pain. The Intel compiler's user documentation has a separate section about mixed C<=>F95 calling.
In particular, read up on interface blocks and C binding (BIND(C), standardized in Fortran 2003) -- nice features that help with inter-language calls.
The good news: C<=>F95 calling works very well once you stick to the conventions.
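As a taste of what that looks like, here is a minimal sketch (the subroutine name and body are made up for illustration): an explicitly bound Fortran routine that a C++ caller can declare with a plain extern "C" prototype.

! Sketch: a Fortran routine made directly callable from C/C++.
! The C++ side declares it as:  extern "C" void scale_array(double* x, int n);
subroutine scale_array(x, n) bind(c, name="scale_array")
    use iso_c_binding, only: c_double, c_int
    integer(c_int), value, intent(in) :: n    ! passed by value, C-style
    real(c_double), intent(inout) :: x(n)     ! explicit-shape dummy
    x = 2.0_c_double * x                      ! size(x) == n works here
end subroutine scale_array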
I personally use a ton of mixed coding from C++ to Fortran 90/95/2003. I typically use gfortran as my compiler, but to avoid this issue it is common practice to always send the size of the arrays. This even allows you to change the shape. Consider a 2-dimensional array containing x,y points:
double* x = new double[2*21];               // C++ side: flat buffer of 2*21 doubles

integer, intent(in) :: n                    ! Fortran side: pass n = 21 explicitly
real(8), intent(in), dimension(2,n) :: x    ! the same memory, reshaped as 2 x n
This is a very handy feature and then lets you use the size command. The answers about compiler specifics are correct; to make your code usable on most compilers, you should pass lengths explicitly across multi-language interfaces.
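The calling side then looks something like this (a sketch; fill_points is a hypothetical routine name, and the trailing underscore reflects gfortran's default name mangling for a plain Fortran subroutine):

extern "C" void fill_points_(double* x, int* n);  // Fortran: fill_points(x, n), x declared x(2,n)

int main() {
    int n = 21;
    double* x = new double[2 * n];   // flat C++ buffer, reshaped to 2 x n by Fortran
    fill_points_(x, &n);             // gfortran passes arguments by reference
    // Fortran is column-major, so x(1,i) and x(2,i) map to x[2*(i-1)] and x[2*(i-1)+1].
    delete[] x;
}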
I am creating a C shared library. I provide a function to the user that has the declaration below:
int getResults(Elements** el)
where Elements is an array of structs provided by the user, which the function then fills with values. The final number of calculated elements differs each time, depending on parameters set in other functions, so there must be a way to inform the user of that number.
Instead of having a separate function return the number of elements, the way I have implemented this is that the user can call this same function with a NULL argument to get the number of existing elements:
int n = getResults(NULL);
allocate the required memory and then pass the array pointer. So, inside the function, I check:
if (el == NULL)
{
    return numEl;
}
else
{
    // Proceed to fill the structs.
    // If all goes well, return 0.
    return 0;
}
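So the intended call sequence on the user's side is this (a sketch, error handling elided; "mylib.h" is a hypothetical header declaring Elements and getResults):

#include <stdlib.h>
#include "mylib.h"

void use_results(void)
{
    int n = getResults(NULL);               /* phase 1: query the element count */
    Elements* el = malloc(n * sizeof *el);  /* user allocates the array */
    if (el != NULL && getResults(&el) == 0) {
        /* ... read el[0] .. el[n-1] ... */
    }
    free(el);
}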
Now, my concern is, could this approach fail?
I have read that NULL does not necessarily expand to a specific number, namely 0. So if, for example, a user links with the library using another compiler or standard, or integrates it with C++, is it guaranteed that this comparison will always be true?
could this approach fail?
It could be misunderstood.
Other than that, I don't see a problem with the API.
I have read that NULL does not necessarily expand to a specific number, namely 0. So if, for example, a user links with the library using another compiler or standard, or integrates it with C++, is it guaranteed that this comparison will always be true?
Null is null, regardless of what number represents it. There aren't many direct guarantees in the language standards about compatibility across language/compiler barriers; this is not limited to the representation of null but extends to many aspects of the language implementation. Generally, compilers strive to be compatible with other compilers on the same system. If a compiler is compatible with another, then there is no problem; if it is not, then changing the API is unlikely to fix the incompatibility.
To use a shared library is to rely on the compatibility of the compilers used to produce its components. If you cannot rely on the compilers being compatible with one another, then you cannot make function calls across their boundary at all. Instead, you would have to rely on serialised communication over, for example, a socket.
I would consider a case where the compilers are otherwise compatible except for their definition of null to be highly theoretical. But there is a way to design the API so that it doesn't rely on the definition of null, to avoid problems in such an imaginary case: let the user of the API supply the pointer value that the library should accept as "null".
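A sketch of that idea (setResultsSentinel is a hypothetical addition to the question's API; Elements and numEl stand in for the library's real definitions):

#include <stddef.h>

typedef struct Elements Elements;       /* the user's struct; details irrelevant here */

static int numEl;                       /* computed elsewhere in the library */
static Elements** countSentinel = NULL; /* defaults to the library's own NULL */

void setResultsSentinel(Elements** sentinel)
{
    countSentinel = sentinel;           /* user tells the library what "null" looks like */
}

int getResults(Elements** el)
{
    if (el == countSentinel)
        return numEl;                   /* count query */
    /* ... fill the structs ... */
    return 0;                           /* success */
}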
I have been working in C and C++ and when it comes to file handling I get confused. Let me state the things I know.
In C, we use functions:
fopen, fclose, fwrite, fread, ftell, fseek, fprintf, fscanf, feof, fileno, fgets, fputs, fgetc, fputc.
FILE *fp for file pointer.
Modes like r, w, a
I know when to use these functions (I hope I didn't miss anything important).
In C++, we use functions / operators:
fstream f
f.open, f.close, f>>, f<<, f.seekg, f.seekp, f.tellg, f.tellp, f.read, f.write, f.eof.
Modes like ios::in, ios::out, ios::binary, etc...
So is it possible (recommended) to use C compatible file operations in C++?
Which is more widely used and why?
Is there anything other than these that I should be aware of?
Sometimes there's existing code expecting one or the other that you need to interact with, which can affect your choice, but in general the C++ versions wouldn't have been introduced if there weren't issues with the C versions that they could fix. Improvements include:
RAII semantics, which means e.g. fstreams close the files they manage when they leave scope
modal ability to throw exceptions when errors occur, which can make for cleaner code focused on the typical/successful processing (see http://en.cppreference.com/w/cpp/io/basic_ios/exceptions for API function and example)
type safety, such that how input and output is performed is implicitly selected using the variable type involved
C-style I/O has potential for crashes: e.g. int my_int = 32; printf("%s", my_int);, where %s tells printf to expect a pointer to an ASCIIZ character buffer but my_int is supplied instead. Firstly, the argument-passing convention may mean ints are passed differently from const char*s; secondly, sizeof(int) may not equal sizeof(const char*); and finally, even if printf extracts 32 as a const char*, at best it will print random garbage from memory address 32 onwards until it coincidentally hits a NUL character - far more likely, the process will lack permission to read some of that memory and the program will crash. Modern C compilers can sometimes validate the format string against the provided arguments, reducing this risk.
extensibility for user-defined types (i.e. you can teach streams how to handle your own classes)
support for dynamically sizing receiving strings based on the actual input, whereas the C functions tend to need hard-coded maximum buffer sizes and loops in user code to assemble arbitrary sized input
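To illustrate a few of these points together (RAII, opt-in exceptions, type-driven extraction) -- a minimal sketch, with "data.txt" as a placeholder file name:

#include <fstream>
#include <iostream>
#include <string>

int main()
{
    std::ifstream in("data.txt");            // RAII: closed automatically at scope exit
    in.exceptions(std::ifstream::badbit);    // opt in to exceptions for hard I/O errors

    std::string word;
    int count = 0;
    while (in >> word)                       // extraction driven by the variable's type
        ++count;

    std::cout << count << " words\n";
}                                            // 'in' is closed here, even if an exception propagates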
Streams are also sometimes criticised for:
verbosity of formatting, particularly "io manipulators" setting width, precision, base, padding, compared to the printf-style format strings
a sometimes confusing mix of manipulators that persist their settings across multiple I/O operations and others that are reset after each operation
lack of a convenience class for RAII pushing/saving and later popping/restoring of the manipulator state
being slow, as Ben Voigt comments and documents here
The performance differences between printf()/fwrite style I/O and C++ IO streams formatting are very much implementation dependent.
Some implementations (Visual C++, for instance) build their IO streams on top of FILE* objects, and this tends to increase the run-time complexity of their implementation. Note, however, that there was no particular constraint to implement the library in this fashion.
In my own opinion, the benefits of C++ I/O are as follows:
Type safety.
Flexibility of implementation. Code can be written to do specific formatting or input to or from a generic ostream or istream object. The application can then invoke this code with any kind of derived stream object. If the code that I have written and tested against a file now needs to be applied to a socket, a serial port, or some other kind of internal stream, you can create a stream implementation specific to that kind of I/O. Extending the C style I/O in this fashion is not even close to possible.
Flexibility in locale settings: the C approach of using a single global locale is, in my opinion, seriously flawed. I have experienced cases where I invoked library code (a DLL) that changed the global locale settings underneath my code and completely messed up my output. A C++ stream allows you to imbue() any locale to a stream object.
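For example (a sketch; the locale name "de_DE.UTF-8" must be installed on the system, so the constructor may throw):

#include <iostream>
#include <locale>
#include <sstream>

int main()
{
    std::ostringstream os;
    try {
        os.imbue(std::locale("de_DE.UTF-8")); // affects this stream only, not the global C locale
        os << 1234567;                        // grouped per the imbued locale, e.g. "1.234.567"
        std::cout << os.str() << '\n';
    } catch (const std::runtime_error&) {
        std::cout << "locale not available on this system\n";
    }
}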
An interesting critical comparison can be found here.
C++ FQA io
Not exactly polite, but it makes you think...
Disclaimer
The C++ FQA (a critical response to the C++ FAQ) is often considered by the C++ community a "stupid joke issued by a silly guy who doesn't even understand what C++ is or wants to be" (paraphrased from the FQA itself).
This kind of argument is often used to fuel (or escape from) religious battles between C++ believers, believers in other languages, and language atheists, each in his own humble opinion convinced of being somehow superior to the others.
I'm not interested in such battles; I just like to stimulate critical reasoning about the pro and con arguments. The C++ FQA -- in this sense -- has the advantage of placing the FQA and the FAQ one over the other, allowing an immediate comparison. And that is the only reason I referenced it.
Following TonyD's comments below (thanks for them -- they made me realize my intention needed clarification), it must be noted that the OP is not discussing just << and >> (I mention only those in my comments for brevity) but the entire function set that makes up the I/O model of C and C++.
With this idea in mind, think also of other "imperative" languages (Java, Python, D ...) and you'll see they all conform more closely to the C model than to the C++ one, sometimes even making it type-safe (which the C model is not, and that is its major drawback).
What my point is all about
At the time C++ went mainstream (1996 or so), the <iostream.h> library (note the ".h": pre-ISO) existed in a language where templates were not yet fully available and there was essentially no type-safe support for variadic functions (we had to wait until C++11 for those), but there were type-safe overloaded functions.
The idea of overloading <<, returning its first parameter over and over, is in fact a way to chain a variable set of arguments using only a binary function that can be overloaded in a type-safe manner. Extending that idea to the various "state management" functions (like width() or precision()) through manipulators (like setw) then appears as a natural consequence. These points -- whatever you may think of the FQA author -- are real facts. And it is also a matter of fact that the FQA is the only site I found that talks about them.
That said, years later, when the D language was designed offering variadic templates from the start, the writef function was added to the D standard library, providing a printf-like syntax while being perfectly type-safe (see here).
Nowadays C++11 also has variadic templates, so the same approach can be put in place in just the same way.
Moral of the story
Both the C++ and C I/O models appear "outdated" with respect to a modern programming style.
C retains speed; C++ retains type safety and a "more flexible abstraction for localization" (though I wonder how many C++ programmers in the world are aware of locales and facets...) at a runtime cost (just trace with a debugger the << of a number, going through stream, buffer, locale and facet... and all the related virtual functions!).
The C model is also easily extensible to parametric messages (those where the order of the parameters depends on the localization of the text they appear in) with positional format strings like
#1%d #2%i, allowing scripting like "text #2%i text #1%d ..."
The C++ model has no concept of a "format string": the parameter order is fixed and intermixed with the text.
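A concrete illustration of that C-side mechanism, using POSIX positional specifiers (not ISO C, but supported by glibc and most Unix C libraries; Visual C++ exposes the same idea through its separate _printf_p family):

#include <stdio.h>

int main(void)
{
    /* the same arguments, reordered purely by the (translatable) format string */
    printf("%1$s has %2$d items\n", "the cart", 3);
    printf("%2$d items are in %1$s\n", "the cart", 3);
    return 0;
}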
But C++11 variadic templates can be used to provide support that:
can offer both compile-time and run-time locale selection
can offer both compile-time and run-time parametric order
can offer compile-time parameter type safety
... all using a simple format string methodology.
Is it time to standardize a new C++ I/O model?
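To make the suggestion concrete, here is a minimal sketch of such a type-safe, printf-like front end built on C++11 variadic templates (illustrative only, not a proposed standard interface; '%' marks an insertion point and "%%" escapes it):

#include <iostream>
#include <stdexcept>
#include <string>

void tformat(std::ostream& os, const char* fmt)          // base case: no arguments left
{
    for (; *fmt; ++fmt) {
        if (*fmt == '%' && *++fmt != '%')                // lone '%' with no argument left
            throw std::runtime_error("too few arguments");
        os << *fmt;
    }
}

template <typename T, typename... Rest>
void tformat(std::ostream& os, const char* fmt, const T& value, const Rest&... rest)
{
    for (; *fmt; ++fmt) {
        if (*fmt == '%') {
            if (fmt[1] == '%') { os << '%'; ++fmt; continue; }
            os << value;                                 // operator<< chosen by T: type-safe
            return tformat(os, fmt + 1, rest...);
        }
        os << *fmt;
    }
    throw std::runtime_error("too many arguments");
}

int main()
{
    tformat(std::cout, "x = %, name = %\n", 42, std::string("abc"));
}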
From the source code published in the programmer's manual of a commercial program, I have isolated a code snippet which puzzles me quite a lot.
The function below is expected to be called multiple times by a kernel and is supposed to implement the temporal behaviour of a component in a system consisting of many interconnected components (I have removed the input/output parameters from the function prototype because they are irrelevant to the point I intend to raise).
To distinguish between different instances of the same block type, the kernel passes an instance number in the INFO(1) element.
As far as I understand, the designer of this program went to a great deal of effort to avoid copying the values of the parameters from the PAR vector to local variables with meaningful names on each call (as if they were not aware of the optimizations a compiler can do). It seems to me that they wanted to assign them to the local variables only on the first call, or when the caller switches to a different instance of the same type.
However, I can't understand how this could work if the local variables are not declared static with the "save" keyword. Does FORTRAN store local variables statically, i.e. not on a stack? (I am sorry if the question sounds stupid; I am used to the C/C++ languages.)
Thank you.
SUBROUTINE TYPE151(PAR, INFO, *)
IMPLICIT NONE
INTEGER*4 INFO(15), IUNIT, NP     ! NP must be declared under IMPLICIT NONE
DOUBLE PRECISION PAR, QMAX
PARAMETER (NP=1)
DIMENSION PAR(NP)

! First call
IF (INFO(7).EQ.-1) THEN
    IUNIT = INFO(1)
    QMAX = PAR(1)
    RETURN 1
ENDIF

! Later calls
IF (INFO(1).NE.IUNIT) THEN
    IUNIT = INFO(1)
    QMAX = PAR(1)
ENDIF

! Making use of QMAX in some ways...
RETURN 1
END SUBROUTINE TYPE151
Storage methods are not part of the language standard. Old FORTRAN compilers (FORTRAN 77 and earlier) frequently stored all variables statically. The language requires that you use "SAVE" for variables for which the values should be retained across calls to the procedure. But many programmers ignored this requirement and relied on the behavior that all variables retained their values because of the typical design of compilers in the FORTRAN 77 era.
Modern Fortran compilers typically use memory differently, and local variables of procedures do not always retain their values if SAVE is omitted. This frequently causes bugs when old programs are compiled with current compilers. Compilers typically provide an option to restore the old behavior; otherwise it can be a great deal of work to identify all the variables in a large legacy program that need the SAVE attribute added to their declarations.
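As an illustration, here is a sketch of the modern fix (modelled on the snippet above, not the original code): mark the variables that must persist with SAVE. With gfortran, for instance, -fno-automatic is the option that restores the old all-static behavior.

subroutine type151(par, info)
    implicit none
    integer, intent(in) :: info(15)
    double precision, intent(in) :: par(1)
    integer, save :: iunit = -1        ! SAVE: retains its value between calls
    double precision, save :: qmax     ! without SAVE, undefined on re-entry

    if (info(1) /= iunit) then         ! first call, or a different instance
        iunit = info(1)
        qmax = par(1)
    end if
    ! ... make use of qmax ...
end subroutine type151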
I would like to have the following method as a generic method for any array,
int arrayLength(`anyType` array[])
{
    return sizeof(array) / sizeof(array[0]);
}
However, it appears C++ doesn't allow any ambiguity of types at all.
Why is this, and how should I go about getting around it?
Because arguments have to be pushed onto the stack and then popped back off, and the sizeof one type is not equal to the sizeof another.
If the size of the values being passed on the stack between functions is not fixed or known in advance, how can the compiler compile the function?
The solution to this problem -- as others have noted -- is templates and macros, both of which generate code (which is then, in turn, compiled) at compile time, appearing to "solve" the problem but really only obviating it, or distracting you from it, by offloading the work onto the compiler.
In Visual C++ there's a _countof() construct that does the same. It's implemented as a template for C++ compilation and as a macro for C. The C++ version errors out if used on a pointer (as opposed to a true array); the C version does not.
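For reference, a minimal sketch of that template technique (since C++17 the standard library provides the same thing as std::size):

#include <cstddef>
#include <iostream>

// N is deduced from the array's declared type at compile time.
// A pointer does not match this signature, so misuse fails to compile.
template <typename T, std::size_t N>
constexpr std::size_t arrayLength(T (&)[N]) { return N; }

int main()
{
    int a[21];
    std::cout << arrayLength(a) << '\n';   // prints 21
    // int* p = a;
    // arrayLength(p);                     // compile error: no matching function
}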
I think what you're really asking is "Why does C++ insist on static typing?"
The answer: because it's easier to write a compiler that generates small, fast programs if the language uses static typing. And that's the purpose of C++: creating small, fast programs whose complexity would be relatively unmanageable if written in C.
When I say "small", I'm including the size of any required runtime libraries.
In my Fortran code I made the following call to the dnrm2 routine:
d = dnrm2(n, ax, 1)
Just a simple call that should return a double-precision result.
The question is: should I declare the function at the start of my program? I found that if I don't declare it and compile the code on 32-bit Windows, the result is correct.
But if I compile the code on 64-bit Windows, the result isn't correct.
Why is this so? Must an external routine always be declared in Fortran?
If you don't correctly describe your subprograms (subroutines and functions) to a calling program, the compiler may not correctly call them. Fortran compiles each unit separately, so the compiler doesn't "know", by default, about the characteristics of other subprograms. There are several ways that you can describe/declare a subprogram in Fortran 90/95/2003.
The easiest and best method is to place your subprograms into a module and then "use" that module in the calling program. This automatically makes the interface known to the compiler and will enable the compiler to check the consistency of actual arguments (in the call) and dummy arguments in the subprogram. It will also make known the return type of a function. The various subprograms in a module have their interfaces known to each other.
You can also write an "interface" containing a subprogram declaration that matches the declarations of the actual subprogram. (This method can be very similar to the style of including header files in C.) This method is more work and prone to error because you have to manually maintain consistency between the actual subprogram and interface whenever changes are made. The interface method is useful when you don't have the code to the subprogram or the subprogram is written in a language other than Fortran.
Or you can simply declare the function name to specify the function's return type, but this won't give you any checking of the arguments. In my opinion this method is weaker, since having the compiler check argument consistency eliminates a major class of programming mistakes.
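In the dnrm2 case, the minimal fix is that last option: declare the function's return type (a sketch, assuming a BLAS library is linked; without the declaration, implicit typing makes the compiler treat dnrm2 as returning a default real, which happens to "work" on some 32-bit ABIs but not on 64-bit ones):

program norm_demo
    implicit none
    double precision, external :: dnrm2   ! declares the BLAS function's return type
    double precision :: ax(3), d
    integer :: n
    ax = (/ 3.0d0, 4.0d0, 12.0d0 /)
    n = 3
    d = dnrm2(n, ax, 1)                   ! stride 1: use every element
    print *, d                            ! prints 13.0
end program norm_demo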
I don't do Fortran, but in C the size of a pointer and the size of a long int vary between 32- and 64-bit OSes, while the size of an int does not. Perhaps the program is using ints to do pointer arithmetic?