I'm dealing with some legacy code that uses COMMON blocks extensively and sometimes uses the SAVE statement. After consulting the Fortran standard, it says:
The appearance of a common block name preceded and followed by a slash in a SAVE statement has the effect of specifying all of the entities in that common block.
Under what circumstances does placing a variable in a common block not imply SAVE? Since the variable must be accessible in any other program unit that includes that common block, how could it not be SAVEed?
I had to look it up, because I was under the same impression as you are.
It seems that only variables in an unnamed, so-called blank, common block retain their definition status across the entire program. Unsaved variables in a named common block become undefined on return from a subprogram, unless another currently active program unit includes a common statement for the same common block.
From the standard (Fortran 77, but the latest one contains similar wording):
17.3 Events That Cause Entities to Become Undefined
[...]
6. The execution of a RETURN statement or an END statement within a subprogram causes all entities within the subprogram to become undefined except for the following:
[...]
d. Entities in a named common block that appears in the subprogram and appears in at least one other program unit that is either directly or indirectly referencing the subprogram
Many compilers of the Fortran 77 era "saved" all local procedure variables, whether or not "SAVE" was specified. This is a common reason for legacy programs to fail with modern compilers, which will undefine variables when they go out of scope, as allowed by the language standard. Probably those older compilers would also maintain the values of all common variables for the duration of the program run, even though that wasn't required by the language standard.
Related
I've taken a good time studying TOC and Compiler design, not done yet but I feel comfortable with the conceptions. On the other hand I have a very shallow knowledge of assembly and machine code, and I have always the desire/need to connect the two sides( HLL and LLL representation of the code ), as I'm learning C++ with paying great attention to performance and optimization discussions.
C++ is a statically typed language:
My question is: Our variables when written as expressions in the statements of the code, do all these variables ( and other entities with identifiers ) become at runtime, mere instructions of addressing to positions of the virtual memory ( for static and for globals ) and addressing relevant to stack address for local variables?
I mean, after a successful compilation including semantic and syntactic verification, isn't wise to deal with data at runtime as guaranteed entities of target memory bytes without any thinking of any identifier or any checking, with the symbol table no more needed?
If my question appeared to be the type of questions that are due to lacking of learning effort ( which I hope it doesn't ), please just inform me about that, and tell me where to read. If that was the case, then it's honestly because I'm concentrating on C++ nowadays and haven't got the chance yet to have a sound knowledge of low level languages, I apologize for that in advance.
You're spot on. Once compiled to machine code, there is no longer any notion of a variable identifier (or variable type, for that matter). It's just bytes at a certain location. Which location was determined by the compiler (when compiling) based on the variable name, or by the linker (when linking) in the case of global variables.
Of course, it can be useful to retain information such as identifiers, for debugging purposes. This is precisely what "compilation with debug information" means: when you do that, the compiler will somehow embed the (redundant) identifiers into the generated code such that a debugger can access them. Or put them in a separate file alongside; the details of that depend on the format of the debugging information.
Yes, mostly. There are a few details that will make identifiers remain more than just addresses or stack offsets.
First we have in RTTI in C++ which means that during runtime the name of at least types may still be available. For example:
const std::type_info &info = typeid(*ptr_interface);
std::cout << info.name() << std::endl;
would print the name of whatever type *ptr_interface is of.
Second, due to the way a program is linked the symbols from the object files may still be present in the executing image. You have for example the linux kernel making use of this as it can produce a backtrace of the stack including the function names. Also it uses knowledge of function names in order to be able to load and link modules. Similar functionality exists in Gnu C library, than when linked for it is able to retrieve function names in stack traces.
In normal cases though the code will not be affected by the original names of the variables (but the compiler will of course emit code suitable for the type the variable have).
The Gnu C Compiler (gcc) allows labels as values as a language-extension:
http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html
But the documentation says:
if we use this mechanism to jump to code in a different function then totally
unpredictable things will happen.
what restrictions do we have for c like language?
The restriction is not (only) in GCC, but in the C standard itself.
A label name is the only kind of identifier that has function scope.
It can be used (in a goto statement) anywhere in the function in which
it appears, and is declared implicitly by its syntactic appearance
(followed by a : and a statement).
(from N1548, §6.2.1.3).
Having "label variables" doesn't change the fact that the environment in different functions (eg. stack) is completely different (and unlike inside of a single function, predicting what it would be is impossible); jumping around would break pretty much everything.
The stack problem more precisely: The values of the local function variables in the target function are unknown, the function parameters are unknown, as soon as the target functions ends it is unknown where the program should continue etc.etc. (And the stack is not the only problem)
This question already has answers here:
Why do the older C language specs require function-local variables to be declared up-front?
(3 answers)
Closed 7 years ago.
I have been going through a bit of history of C, and I find that in earlier versions of C, like in C89 standard, it is mandatory to declare variables at the beginning of a block.
But I also find there are some relaxations from C99 standard specification, where variable can be declared anywhere before it is used.
My question is why the earlier versions made it mandatory? my emphasis is to know if there was any technical difficulties in designing the compiler at those days, that prevented them identifying declarations at any point.
Also, with a compiler design perspective I understand, with such a restriction in C89, it is easy to handle variable declarations and usage with the help of an intermediate file to store the mappings. But are there methods that can handle the case without using an intermediary file, say some memory based storage?.
If the compiler sees a consolidated list of all the local/automatic variables up front, it can immediately work out the total amount by which to move the stack pointer to reserve stack memory for them - just one operation on the stack pointer. If it handles the variables in dribs and drabs as they're encountered in the function, moving the stack pointer incrementally, then there ends up being more opcodes dedicated to stack setup and stack pointer updates. It's important that the stack pointer be up to date whenever a further function call's performed. Newer compilers do a tiny bit of extra work to patch back in the amount by which to move the stack pointer after all the function's been considered. (I'd hazard that the effort's so minimal that the early Standard was shaped more by the conceptual appeal of knowing what to do up front than the effort of being more flexible, but if you just want to get something working - why make extra efforts?)
C99 Rationale didn't directly explain why it was not permited in C89, but did say it was added in C99 because it was permited in other languages and the it has been found useful.
Rationale for International Standard — Programming Languages — C
§6.2.4 Storage durations of objects
A new feature of C99: C89 requires all declarations in a block to occur before any statements. On the other hand, many languages similar to C (such as Algol 68 and C++) permit declarations and statements to be mixed in an arbitrary manner. This feature has been found to be useful and has been added to C99.
I am a newbie to Fortran. Please look at the code below:
c main program
call foo(2)
print*, 2
stop
end
subroutine foo(x)
x = x + 1
return
end
In some implementations of Fortran IV, the above code would print a 3. Why is that? Can you suggest an explanation?
How do you suppose more recent Fortran implementations get around the problem?
Help is very much appreciated. Thank You.
The program breaks the language rules - the dummy argument x in the subroutine is modified via the line x = x + 1, but it is associated with something that is an expression (a simple constant). In general, values that result from expressions cannot be modified.
That specific code is still syntactically valid Fortran 2008. It remains a programming error in Fortran 2008 - as it was in Fortran IV/66. This isn't something that compilers are required to diagnose. Some may, perhaps with additional debugging options, and perhaps not till runtime.
Because the program breaks the language rules anything could happen when you run the program. Exactly what depends on the code generated by the compiler. Compilers may have set aside modifiable storage for the value that results from the expression such that it internally looks like a variable (the program might print three and the program carries on), that modifiable storage might be shared across the program for other instances of the constant 2 (suddenly the value of 2 becomes three everywhere!), the storage for the value of the constant might in non-modifiable memory (the program may crash), the compiler may issue an error message, the program may get upset and sulk in its bedroom, the program might declare war on a neighbouring nation - it is a programming error - what happens is unspecified.
As of Fortran 90, facilities were introduced into the language to allow programmers to write new code that is practical for compilers to check for errors such as these (and in some cases compilers are required to check for errors if they are to be regarded as standard conforming).
For the code as presented, the main program and the subroutine are to be regarded as separately compiled - the main program is unaware of the details of the subroutine and vice versa (it is possible that the subroutine could be compiled long after the main program, on a different machine, with the outputs of the two being linked together at some later stage - without fancy link time behaviour or static analysis it is therefore not possible to resolve errors such as this). Language rules are such that when compiling the main program the compiler must implicitly assume the details of the interface of the subroutine based only on the way the subroutine is referenced - inside the main program the subroutine has an implicit interface.
Fortran 90 introduced the concept of an explicit interface, where the compiler is explicitly told what the interface of the subroutine in various ways, and can then check that any reference to the subroutine is consistent with that interface. If a procedure is a module procedure, internal procedure or intrinsic procedure - that interface is automatically realized, alternatively for external subprograms, procedure pointers, etc, the programmer can explicitly describe the interface using an interface block.
In addition, Fortran 90 introduced the intent attribute - a characteristic of a dummy argument of a procedure that is also then a characteristic of the interface for a procedure. The intent of the argument indicates to the compiler whether the procedure may define the argument (it also may implications for default initialization and component allocation status) and hence whether an expression could be a valid actual argument. x in subroutine foo would typically be declared INTENT(INOUT).
Collectively these new language features provide a robust defence against this sort of programming error when using compilers with a basic level of implementation quality. If you are starting with the language then it is recommended that these new features become part of your standard approach - i.e. use implicit none, all procedures should generally be module procedures or internal procedures, use external procedures only when absolutely required, always specify dummy argument intent, use free form source.
from the source code published in the programmer's manual of a commercial program, I have isolated a code snippet which puzzles me quite a lot.
The function below is expected to be called multiple times by a kernel and is supposed to implement the temporal behaviour of a component in a system consisting of many interconnected components (I have removed the Input/Output parameters from the function prototype because they are irrelevant to the point I intend to rise).
To distinguish between different instances of the same block type the kernel pass an instance number in the INFO(1) element.
As far as I have understood, the designer of this program took a great deal of effort trying to save the time spent in copying the values of the parameters from the PAR vector to local variables with meaningful names at each call (as if they were not aware of the optimizations a compiler can do). It seems to me that they wanted to assign them to the local variables only in the first call, or when the caller switch to a different instance of the same type.
However I can't understand how this could work if the local variables are not declared static with the "save" keyword. Does FORTRAN store local variables statically i.e. not on a stack? (I am sorry if the question sounds stupid, I am used to the C/C++ languages)
Thank you.
SUBROUTINE TYPE151(PAR, INFO, *)
IMPLICIT NONE
INTEGER*4 INFO(15), IUNIT
DOUBLE PRECISION PAR, QMAX
PARAMETER (NP=1)
DIMENSION PAR(NP)
! First call
IF (INFO(7).EQ.-1) THEN
IUNIT = INFO(1)
QMAX = PAR(1)
RETURN 1
ENDIF
! later calls
IF(INFO(1).NE.IUNIT) THEN
IUNIT = INFO(1)
QMAX = PAR(1)
ENDIF
! Making use of QMAX in some ways...
RETURN 1
END SUBROUTINE TYPE151
Storage methods are not part of the language standard. Old FORTRAN compilers (FORTRAN 77 and earlier) frequently stored all variables statically. The language requires that you use "SAVE" for variables for which the values should be retained across calls to the procedure. But many programmers ignored this requirement and relied on the behavior that all variables retained their values because of the typical design of compilers in the FORTRAN 77 era.
Modern Fortran compilers typically use memory differently and local variables of procedures do not always retain their values if SAVE is omitted. This frequently causes bugs when old programs are compiled with current compilers. Compilers typically provide an option to restore the old behavior. Otherwise it could be a great deal of work to identify all variables in a large legacy program that needed to have the SAVE attribute added to their declaration.