I have a FORTRAN code with a routine:
SUBROUTINE READ_NC_VALS(NCID, RECID, VARNAME, VARDATA)
integer ncid, recid
character*(*) varname
real*8 vardata
dimension vardata(15,45,75)
etc.
I want to add some flexibility to this code, and I thought I would do it by first adding an optional flag argument:
SUBROUTINE READ_NC_VALS(NCID, RECID, VARNAME, VARDATA, how_to_calculate)
! everything the same and then ...
logical, optional :: how_to_calculate
Now, at this point, I am not even using "how_to_calculate". I am just putting it into the code for testing. So I compile the code without a hitch. Then I run it, and I get an error in the subroutine. Specifically, some of the values in the code later on are "magically" altered from what they were without that optional argument. The new values don't make sense to the logic of the code, so it exits politely with an error message. Let me stress again that at this point, I am not even using this optional argument. So then, on a lark, I go back to all the places in the source that call this routine and, even though my new argument in optional, I put in values for it in all the calls. When I do that, the code runs fine. So, what's up? How can the mere presence of an unused optional argument in a subroutine result in other data being corrupted? And how can adding input parameters for this optional argument fix things again? This is being compiled with PGI, by the way.
Any ideas? Thanks.
BTW, sorry for not providing more code. My boss might not be too happy with me if I did that. I don't make the rules; I just work here.
Optional arguments in Fortran are implemented by passing 0 (a null pointer) for each optional argument that has no value provided by the calling subroutine. Because of this subroutines that take optional arguments have to:
either have an explicit INTERFACE definition inside the calling subroutine
or be a module-level subroutine (for them interfaces are generated automatically)
If you add an optional argument to a subroutine but it neither has an interface in the caller or is not a module-level subroutine, then the compiler will not generate the right calling sequence - it will pass less arguments than expected. This could pose a problem on Unix systems, where PGI passes the length of all CHARACTER*(*) arguments at the end of the argument list (on Windows it passes the length as the next argument after the address of the string). One missing argument would shift the length arguments placement in the stack (or put them in the wrong registers on x64) leading to the incorrect length of the VARNAME string being received by READ_NC_VALS. This could lead to all sorts of ill behaviour, including overwritting memory and "magically" changing values that should not change according to the program logic.
Obviously, just adding a dummy argument to a subroutine, with or without the optional attribute, shouldn't cause any problems. What might happen, though, is that the change exposes some other problem with the code, which was already there, but didn't cause any visible bad effects.
Without any more code, all we can do is guess, which usually isn't really useful.
One thing that pops into mind is the necessity of an explicit interface, when using optional arguments. Is the code organized into modules?
Related
I'm working on a Fortran program and running into a strange bug with some Heisenbug-type characteristics, and looking for some insight into what might be going on. The code is too large to post in full but I hopefully I can the general idea.
What's basically going on is I have a subroutine that reads a list of numerical parameters from a text file,
call read_parameters(filename, parameter_array)
and then this list of parameters is sent into another subroutine that runs a program using those parameter values.
call run_program(parameter_array)
These calls are part of a loop that calls run_program with slightly different parameters each time through the loop---the intention is to find better parameter sets.
I've found that on the first pass through this loop, run_program gives bizarre results, which seems to indicate that something is going wrong with the first call to read_parameters. But all the subsequent passes behave normally and I haven't been able to understand what's going wrong with that first pass despite a lot of investigating, including for example printing the values of the parameters themselves within the actual run_program code.
While testing, I realized that if I put another call to read_parameters right above the call to run_program, then the first pass of the program runs normally, but here's the thing: this new call to read_parameters is just a dummy call, with an output array parameter_array2 that doesn't even get used! As in,
call read_parameters(parameter_array)
call read_parameters(parameter_array2)
call run_program(parameter_array)
If the second line is present, the program runs just fine, even though parameter_array2 isn't used anywhere, while if it's absent the program gives erroneous results for the first pass through the loop.
Does anyone have any ideas about what might be going on?
Thanks.
I use the Intel Visual Fortran. According to Chapmann's book, declaration of function type in the routine that calls it, is NECESSARY. But look at this piece of code,
module mod
implicit none
contains
function fcn ( i )
implicit none
integer :: fcn
integer, intent (in) :: i
fcn = i + 1
end function
end module
program prog
use mod
implicit none
print *, fcn ( 3 )
end program
It runs without that declaration in the calling routine (here prog) and actually when I define its type (I mean function type) in the program prog or any other unit, it bears this error,
error #6401: The attributes of this name conflict with those made accessible by a USE statement. [FCN] Source1.f90 15
What is my fault? or if I am right, How can it be justified?
You must be working with a very old copy of Chapman's book, or possibly misinterpreting what it says. Certainly a calling routine must know the type of a called function, and in Fortran-before-90 it was the programmer's responsibility to ensure that the calling function had that information.
However, since the 90 standard and the introduction of modules there are other, and better, ways to provide information about the called function to the calling routine. One of those ways is to put the called functions into a module and to use-associate the module. When your program follows this approach the compiler takes care of matters. This is precisely what your code has done and it is not only correct, it is a good approach, in line with modern Fortran practice.
association is Fortran-standard-speak for the way(s) in which names (such as fcn) become associated with entities, such as the function called fcn. use-association is the way implemented by writing use module in a program unit, thereby making all the names in module available to the unit which uses module. A simple use statement makes all the entities in the module known under their module-defined names. The use statement can be modified by an only clause, which means that only some module entities are made available. Individual module entities can be renamed in a use statement, thereby associating a different name with the module entity.
The error message you get if you include a (re-)declaration of the called function's type in the calling routine arises because the compiler will only permit one declaration of the called function's type.
I want to solve a differential equation lots of times for different parameters. It is more complicated than this, but for the sake of clarity let's say the ODE is y'(x) = (y+a)*x with y(0) = 0 and I want y(1). I picked the dverk algorithm from netlib for solving the ODE and it expects the function on the right hand side to be of a certain form. Now what I did with the Intel Fortran compiler is the following (simplified):
subroutine f(x,a,ans)
implicite none
double precision f,a,ans,y,tol,c(24),w(9)
...
call dverk(1,faux,x,y,1.d0,tol,ind,c,1,w)
...
contains
subroutine faux(n,xx,yy,yprime)
implicite none
integer n
double precision xx,yy(n),yprime(n)
yprime(1) = (yy(1)+a)*xx
end subroutine faux
end subroutine f
This works just fine with ifort, the sub-subroutine faux sees the parameter a and everything works as expected. But I'd like the code to be compatible with gfortran, and with this compiler I get the following error message:
Error: Internal procedure 'faux' is not allowed as an actual argument at (1)
I need to have the faux routine inside f, or else I don't know how to tell it the value of a, because I can't change the list of parameters, since this is what the dverk routine expects.
I would like to keep the dverk routine and understand how to solve this specific problem without a workaround, since I feel it will become important again when I need to integrate a parameterized function with different integrators.
You could put this all in a module, and make a a global module variable. Make faux a module procedure. That way, it has access to a.
module ode_module
double precision::a
contains
subroutine f(x,a,ans)
implicit none
double precision f,ans,y,tol,c(24),w(9)
call dverk(1,faux,x,y,1.d0,tol,ind,c,1,w)
end subroutine
subroutine faux(n,xx,yy,yprime)
implicite none
integer n
double precision xx,yy(n),yprime(n)
yprime(1) = (yy(1)+a)*xx
end subroutine faux
end module
You can check here which compilers support this F2008 feature:
http://fortranwiki.org/fortran/show/Fortran+2008+status
On that page, search for "Internal procedure as an actual argument" .
Passing an internal procedure is allowed by Fortran 2008. That means that your code is correct Fortran and just a more recent compiler is needed.
At the time when this question was posted the support for that in existing compilers was patchy. Some did support it, some did not. This changed considerably and at the time of writing this revision of the answer one can normally use it without needing to think about compiler support any more.
You are not limited to sharing the data in a module. You just have to make sure, that the containing procedure is still running. It is not possible to save the pointer and make a so called 'closure' known from LISP and many modern languages. In languages like Fortran or C the enclosing environment for future calls is not saved and pointers to internal functions become invalid.
The advantage of internal procedure over module procedure is that you can share the data without sharing it in the whole module.
It works well will reasonably old versions of GCC (gfortran), Intel Fortran and Cray Fortran (personally tested) and possibly others. I remember not being able to compile by PGI and Oracle Fortran. As shown by Daniel, it can be checked at http://fortranwiki.org/fortran/show/Fortran+2008+status (search Internal procedure as an actual argument). The most recent version of this table is periodically published in ACM SIGPLAN Fortran Forum.
There is some interesting info in this article https://stevelionel.com/drfortran/2009/09/02/doctor-fortran-in-think-thank-thunk/
Be aware that the implementation of this feature normally requires executable stack due to the use of trampolines. One does not normally think about that but if the OS forbids executable stack, the executable might fail (there were issues in initial versions of Windows Subsystem for Linux, for example).
I am updating legacy code and I need to use a simple mathematical function inside a subroutine. I cannot figure out how to do this. I have a function that works when called from a test program. What do I need to do differently for a subroutine?
example:
subroutine foo(i,j,k)
i = bar(j,k)
stuff = otherstuff
return
end
other info:
bar is an erf approximation.
I am using the PGF90 compiler.
I am new to FORTRAN from C.
thanks!
Basically, calling from a program or a subroutine shouldn't differ. Does the code really look like this, without any declarations? This means all variables will have implicit types: variables with names starting with the letters i-n will be integer, all others real; this also holds for function return values. The code you show, tries to assign a real (bar()) to an integer (i).
If you're writing new code, always start programs and procedures with IMPLICIT NONE. This forces you to explicitly include type declarations for all variables and function return values, greatly reducing errors.
I came across the following weird chunk of code.Imagine you have the following typedef:
typedef int (*MyFunctionPointer)(int param_1, int param_2);
And then , in a function , we are trying to run a function from a DLL in the following way:
LPCWSTR DllFileName; //Path to the dll stored here
LPCSTR _FunctionName; // (mangled) name of the function I want to test
MyFunctionPointer functionPointer;
HINSTANCE hInstLibrary = LoadLibrary( DllFileName );
FARPROC functionAddress = GetProcAddress( hInstLibrary, _FunctionName );
functionPointer = (MyFunctionPointer) functionAddress;
//The values are arbitrary
int a = 5;
int b = 10;
int result = 0;
result = functionPointer( a, b ); //Possible error?
The problem is, that there isn't any way of knowing if the functon whose address we got with LoadLibrary takes two integer arguments.The dll name is provided by the user at runtime, then the names of the exported functions are listed and the user selects the one to test ( again, at runtime :S:S ).
So, by doing the function call in the last line, aren't we opening the door to possible stack corruption? I know that this compiles, but what sort of run-time error is going to occur in the case that we are passing wrong arguments to the function we are pointing to?
There are three errors I can think of if the expected and used number or type of parameters and calling convention differ:
if the calling convention is different, wrong parameter values will be read
if the function actually expects more parameters than given, random values will be used as parameters (I'll let you imagine the consequences if pointers are involved)
in any case, the return address will be complete garbage, so random code with random data will be run as soon as the function returns.
In two words: Undefined behavior
I'm afraid there is no way to know - the programmer is required to know the prototype beforehand when getting the function pointer and using it.
If you don't know the prototype beforehand then I guess you need to implement some sort of protocol with the DLL where you can enumerate any function names and their parameters by calling known functions in the DLL. Of course, the DLL needs to be written to comply with this protocol.
If it's a __stdcall function and they've left the name mangling intact (both big ifs, but certainly possible nonetheless) the name will have #nn at the end, where nn is a number. That number is the number of bytes the function expects as arguments, and will clear off the stack before it returns.
So, if it's a major concern, you can look at the raw name of the function and check that the amount of data you're putting onto the stack matches the amount of data it's going to clear off the stack.
Note that this is still only a protection against Murphy, not Machiavelli. When you're creating a DLL, you can use an export file to change the names of functions. This is frequently used to strip off the name mangling -- but I'm pretty sure it would also let you rename a function from xxx#12 to xxx#16 (or whatever) to mislead the reader about the parameters it expects.
Edit: (primarily in reply to msalters's comment): it's true that you can't apply __stdcall to something like a member function, but you can certainly use it on things like global functions, whether they're written in C or C++.
For things like member functions, the exported name of the function will be mangled. In that case, you can use UndecorateSymbolName to get its full signature. Using that is somewhat nontrivial, but not outrageously complex either.
I do not think so, it is a good question, the only provision is that you MUST know what the parameters are for the function pointer to work, if you don't and blindly stuff the parameters and call it, it will crash or jump off into the woods never to be seen again... It is up to the programmer to convey the message on what the function expects and the type of parameters, luckily you could disassemble it and find out from looking at the stack pointer and expected address by way of the 'stack pointer' (sp) to find out the type of parameters.
Using PE Explorer for instance, you can find out what functions are used and examine the disassembly dump...
Hope this helps,
Best regards,
Tom.
It will either crash in the DLL code (since it got passed corrupt data), or: I think Visual C++ adds code in debug builds to detect this type of problem. It will say something like: "The value of ESP was not saved across a function call", and will point to code near the call. It helps but isn't totally robust - I don't think it'll stop you passing in the wrong but same-sized argument (eg. int instead of a char* parameter on x86). As other answers say, you just have to know, really.
There is no general answer. The Standard mandates that certain exceptions be thrown in certain circumstances, but aside from that describes how a conforming program will be executed, and sometimes says that certain violations must result in a diagnostic. (There may be something more specific here or there, but I certainly don't remember one.)
What the code is doing there isn't according to the Standard, and since there is a cast the compiler is entitled to go ahead and do whatever stupid thing the programmer wants without complaint. This would therefore be an implementation issue.
You could check your implementation documentation, but it's probably not there either. You could experiment, or study how function calls are done on your implementation.
Unfortunately, the answer is very likely to be that it'll screw something up without being immediately obvious.
Generally if you are calling LoadLibrary and GetProcByAddrees you have documentation that tells you the prototype. Even more commonly like with all of the windows.dll you are provided a header file. While this will cause an error if wrong its usually very easy to observe and not the kind of error that will sneak into production.
Most C/C++ compilers have the caller set up the stack before the call, and readjust the stack pointer afterwards. If the called function does not use pointer or reference arguments, there will be no memory corruption, although the results will be worthless. And as rerun says, pointer/reference mistakes almost always show up with a modicum of testing.