I would like to clarify my understanding of evaluation order in Fortran.
Let's say I have a Stack type with methods pop and push_back.
If I execute the following code:
call stack%push_back(1)
call stack%push_back(3)
call stack%push_back(stack%pop() - stack%pop())
write(*, *) stack%pop() ! could print 2, or -2
the last element on the Stack depends on the evaluation order and as this answer explains, the Compiler is free to change the evaluation order.
But even if I have a commutative operation, there are still problems.
The Fortran 2008 standard says (7.1.4):
The evaluation of a function reference shall neither affect nor be affected by the evaluation of any other entity within the statement.
So even this code:
call stack%push_back(1)
call stack%push_back(3)
call stack%push_back(stack%pop() + stack%pop())
is not standard conforming?
This means that I always have to write it like this:
call stack%push_back(1)
call stack%push_back(3)
block
integer :: A, B
A = stack%pop()
B = stack%pop()
call stack%push_back(A - B)
end block
write(*, *) stack%pop() ! is guaranteed to print 2
Is this true?
(Here we assume reasonable and intuitive definitions of things in the question and imply similar in the answer.)
The statement
call stack%push_back(stack%pop() - stack%pop())
would indeed be invalid, for the reason stated in the question. However, it may be easier to see this in terms of the other restriction in the pair, not quoted (Fortran 2018 10.1.4, but similar in F2008):
if a function reference causes definition or undefinition of an actual argument of the function, that argument or any associated entities shall not appear elsewhere in the same statement.
The statement of concern is the same as
call stack%push_back(pop(stack) - pop(stack))
where it's quite clear that stack is an actual argument and it appears more than once in the statement; stack is defined by the function pop.
Yes you will need to use different statements to have the desired effect. The approach given in the question is reasonable.
Related
This might be a stupid question, but I am confused. I had a feeling that an immediate (consteval) function has to be executed during compile time and we simply cannot see its body in the binary.
This article clearly supports my feeling:
This has the implication that the [immediate] function is only seen at compile time. Symbols are not emitted for the function, you cannot take the address of such a function, and tools such as debuggers will not be able to show them. In this matter, immediate functions are similar to macros.
The similar strong claim might be found in Herb Sutter's publication:
Note that draft C++20 already contains part of the first round of reflection-related work to land in the standard: consteval functions that are guaranteed to run at compile time, which came from the reflection work and are designed specifically to be used to manipulate reflection information.
However, there is a number of evidences that are not so clear about this fact.
From cppreference:
consteval - specifies that a function is an immediate function, that is, every call to the function must produce a compile-time constant.
It does not mean it has to be called during compile time only.
From the P1073R3 proposal:
There is now general agreement that future language support for reflection should use constexpr functions, but since "reflection functions" typically have to be evaluated at compile time, they will in fact likely be immediate functions.
Seems like this means what I think, but still it is not clearly said. From the same proposal:
Sometimes, however, we want to express that a function should always produce a constant when called (directly or indirectly), and a non-constant result should produce an error.
Again, this does not mean the function has to be evaluated during compile time only.
From this answer:
your code must produce a compile time constant expression. But a compile time constant expression is not an observable property in the context where you used it, and there are no side effects to doing it at link or even run time! And under as-if there is nothing preventing that
Finally, there is a live demo, where consteval function is clearly called during runtime. However, I hope this is due to the fact consteval is not yet properly supported in clang and the behavior is actually incorrect, just like in Why does a consteval function allow undefined behavior?
To be more precise, I'd like to hear which of the following statements of the cited article are correct:
An immediate function is only seen at compile time (and cannot be evaluated at run time)
Symbols are not emitted for an immediate function
Tools such as debuggers will not be able to show an immediate function
To be more precise, I'd like to hear which of the following statements of the cited article are correct:
An immediate function is only seen at compile time (and cannot be evaluated at run time)
Symbols are not emitted for an immediate function
Tools such as debuggers will not be able to show an immediate function
Almost none of these are answers which the C++ standard can give. The standard doesn't define "symbols" or what tools can show. Almost all of these are dealer's choice as far as the standard is concerned.
Indeed, even the question of "compile time" vs. "run time" is something the standard doesn't deal with. The only question that concerns the standard is whether something is a constant expression. Invoking a constexpr function may produce a constant expression, depending on its parameters. Invoking a consteval function in a way which does not produce a constant expression is il-formed.
The one thing the standard does define is what gets "seen". Though it's not really about "compile time". There are a number of statements in C++20 that forbid most functions from dealing in pointers/references to immediate functions. For example, C++20 states in [expr.prim.id]/3:
An id-expression that denotes an immediate function shall appear only
as a subexpression of an immediate invocation, or
in an immediate function context.
So if you're not in an immediate function, or you're not using the name of an immediate function to call another immediate function (passing a pointer/reference to the function), then you cannot name an immediate function. And you can't get a pointer/reference to a function without naming it.
This and other statements in the spec (like pointers to immediate function not being valid results of constant expressions) essentially make it impossible for a pointer/reference to an immediate function to leak outside of constant expressions.
So statements about the visibility of immediate functions are correct, to some degree. Symbols can be emitted for immediate functions, but you cannot use immediate functions in a way that would prevent an implementation from discarding said symbols.
And that's basically the thing with consteval. It doesn't use standard language to enforce what must happen. It uses standard language to make it impossible to use the function in a way that will prevent these things from happening. So it's more reasonable to say:
You cannot use an immediate function in a way that would prevent the compiler from executing it at compile time.
You cannot use an immediate function in a way that would prevent the compiler from discarding symbols for it.
You cannot use an immediate function in a way that would force debuggers to be able to see them.
Quality of implementation is expected to take things from there.
It should also be noted that debugging builds are for... debugging. It would be entirely reasonable for advanced compiler tools to be able to debug code that generates constant expressions. So a debugger which could see immediate functions execute is an entirely desirable technology. This becomes moreso as compile-time code grows more complex.
The proposal mentions:
One consequence of this specification is that an immediate function never needs to be seen by a back end.
So it is definitely the intention of the proposal that calls are replaced by the constant. In other words, that the constant expression is evaluated during translation.
However, it does not say it is required that it is not seen by the backend. In fact, in another sentence of the proposal, it just says it is unlikely:
It also means that, unlike plain constexpr functions, consteval functions are unlikely to show up in symbolic debuggers.
More generally, we can re-state the question as:
Are compilers forced to evaluate constant expressions (everywhere; not just when they definitely need it)?
For instance, a compiler needs to evaluate a constant expression if it is the number of elements of an array, because it needs to statically determine the total size of the array.
However, a compiler may not need to evaluate other uses, and while any decent optimizing compiler will try to do so anyway, it does not mean it needs to.
Another interesting case to think about is an interpreter: while an interpreter still needs to evaluate some constant expressions, it may just do it lazily all the time, without performing any constant folding.
So, as far as I know, they aren't required, but I don't know the exact quotes we need from the standard to prove it (or otherwise). Perhaps it is a good follow-up question on its own, which would answer this one too.
For instance, in [expr.const]p1 there is a note that says they can, not that they are:
[Note: Constant expressions can be evaluated during translation. — end note]
The Code
The following MWE describes what I want to use (note that I did not design this, I am merely trying to use someones code, I would not usually use global variables).
PROGRAM MAIN
IMPLICIT NONE
integer :: N
real(8), allocatable :: a(:,:)
N=3
allocate(a(N,3))
a=initialize_array()
CONTAINS
function initialize_array() result(a)
IMPLICIT NONE
real(8) :: a(N,3)
a=1
end function initialize_array
END PROGRAM MAIN
The Problem
gfortran gives an error which reads Error: Variable 'n' cannot appear in the expression at (1), pointing to to the real(8) :: a(N,3) inside the function.
In a subroutine it would work, so what might be the problem here?
The question
Why does ifort (v. 15.0.3) compile this, while gfortran (v. 4.8.4) does not?
As has been commented by others, it is hard to say whether something is clearly allowed: the language is mostly based on rules and constraints giving restrictions.
So, I won't prove that the code is not erroneous (and that gfortran is not allowed to reject it), but let's look at what's going on.
First, I'll object to one thing given by High Performance Mark as this is slightly relevant:
The declaration of an array with a dimension dependent on the value of a variable, such as a(N,3), requires that the value of the variable be known (or at least knowable) at compile time.
The bounds of an explicit shape array need not always be given by constant expressions (what we loosely define as "known/knowable at compile time"): in some circumstances an explicit shape array can have bounds given by variables. These are known as automatic objects (and the bounds given by specification expressions).
A function result is one such place where an automatic object is allowed. In the question's example for the declaration of the function result, N is host associated and forms a specification expression.
Rather than exhausting all other constraints to see that the declaration of a truly is allowed, let's look at how gfortran responds to small modifications of the program.
First, a trimmed down version of the question's code to which gfortran objects.
integer n
contains
function f() result(g)
real g(n)
end function f
end program
The function result for f has the name g. It doesn't matter what we call the function result, so what happens when we call it f?
integer n
contains
function f()
real f(n)
end function f
end program
This compiles happily for me.
What if we frame this first lump in a module instead of a main program?
module mod
integer n
contains
function f() result(g)
real g(n)
end function f
end module
That also compiles.
The natural conclusion: even if gfortran is correct (we've missed some well-hidden constraint) to reject the first code it's either horribly inconsistent in not rejecting the others, or the constraint is really quite strange.
I think this might be an explanation, though like #VladimirF I can't actually either recall or find the relevant section (if there is one) of the standard.
This line
real(8) :: a(N,3)
declares that the result of the function is an array called a. This masks the possibility of the same name referring to the array a by host association. The a inside the function scope is not the a in the program scope.
The declaration of an array with a dimension dependent on the value of a variable, such as a(N,3), requires that the value of the variable be known (or at least knowable) at compile time. In this case giving n in the host scope the attribute parameter fixes the problem. Though it doesn't fix the poor design -- but OP's hands seem tied on that point.
It doesn't surprise me that the Intel compiler compiles this, it compiles all sorts of oddities that its ancestors have compiled over the years for the sake of backwards compatibility.
I only offer this half-baked explanation because experience has taught me that as soon as I do one of the real Fortran experts (IanH, francescalus, (usually) VladimirF) will get so outraged as to post a correction and we'll all learn something.
Am working with some legacy FORTRAN code. The author defined a function (not a subroutine, but a function -- that's going to be important) called REDUCE_VEC(). It accepts a 1D array and returns a scalar real*8. So, if you want to "reduce" your vector, you make a call to the function
RV = REDUCE_VEC(V1)
and everything is fine. But occasionally, he has lines that look like
CALL REDUCE_VEC(V2)
So, two questions: 1) What the heck would this second form of the call do? (Note that there is no way to return data.) 2) This won't even compile under gfortran, even though it did with PGI, so is this even legal FORTRAN?
Thanks.
This will compile with many processors if the interface is implicit, because the compiler cannot check that, it just calls some symbol. Consider the following:
function f(a)
dimension a(*)
f = 0
do i=1,10
f = f + a(i)
end do
end function
program p
call f([1.,2.,3.,4.,5.,6.,7.,8.,9.,10.])
end program
Compiles and does not even crash immediately with ifort, whereas sunf90 and gfortran will compile that only if it is in separate source files and then the result also does not crash on my machine. If the return value is placed in register, it may cause no harm to the rest of the program, but a stack corruption is quite likely otherwise.
It is not legal Fortran. As presented it is more than likely a programming error (it is possible for the same name in different scopes to refer to different things, but that's not what is implied by the question). If the Fortran processor happens to support an extension to the language that allows this, then what happens is up to the Fortran processor. Otherwise, "anything" could happen, where "anything" could include (but is not limited to) "nothing", or "very, very bad things".
I know the order that function parameters are evaluated is unspecified in C++, see below,
// The simple obvious one.
callFunc(getA(),getB());
Can be equivalent to this:
int a = getA();
int b = getB();
callFunc(a,b);
Or this:
int b = getB();
int a = getA();
callFunc(a,b);
This is perfect fine & I think most ppl know this.
But I have tried VC10, gcc 4.72 and they all evaluate b first (from right to left), meaning b got pushed into stack frame first then a.
I am just wondering which c++ compiler should I try to make the code above to evalute a first ?
So a got pushed to stack before b.
Thanks
The parameter evaluation order substantially depends from the calling convention used for calling the given function - if parameters are pushed on the stack RTL it's usually more convenient to elaborate the rightmost parameters first.
According to this table, on x86 the only calling convention available on IA32 with LTR parameter order on the stack is fastcall on Borland, that however passes the first three integer/pointer parameters in registers. So you should write a function that takes more than three integers, mark it as fastcall and compile it with a Borland compiler; in that case probably the other parameters besides the first three should be evaluated in LTR order.
Going on other platforms probably you'll find other calling conventions with LTR parameter passing (and so probably LTR parameters evaluation).
Notice that the parameter passing order <=> parameter evaluation order are logically bound, but if for some reason the compiler finds that it's better to evaluate some parameter before the others there's nothing in the standard preventing it to do so.
I know that C++ doesn't specify the order in which parameters are passed to a function. But if we write the following code:
void __cdecl func(int a, int b, int c)
{
printf("%d,%d,%d", a,b,c);
}
int main()
{
int i=10;
func(++i, i, ++i);
}
Can we reliably say the output would be 12,11,11 since the __cdecl ensures that argument-passing order is right to left?
As per the Standard, there are two things you need to understand and differentiate:
C++ doesn't specify the order in which parameters are passed to a
function (as you said yourself, that is true!)
C++ doesn't specify the order in which the function arguments are evaluated [expr.call].
Now, please note, __cdecl ensures only the first, not the second. Calling conventions decide only how the functions arguments will be passed, left-to-right or right-to-left; they can still be evaluated in ANY order!
Hope this clarifies your doubts regarding the calling conventions.
However, since these conventions are Microsoft compiler extension to C++, so your code is non-portable. In that case, you can see how MSVC++ compiler evaluates function arguments and be relax IF you don't want to run the same code on other platform!
func(++i, i, ++i);
Note that this particular code invokes undefined behavior, because i is incremented more than once without any intervening any sequence point.
No, you cannot assume that.
An optimizing compiler will inline short functions. All that __stdcall will garantee is that the __stdcall version of function will be generated, but this does not mean that compiler cannot also inline it in the same time.
If you really want to be sure it's not inlined, you have to declare it in another compilation unit, alough linker optimizations could inline even in this case.
Also, the order of parameters on the stack have nothing to do with the order they are evaluated. For example, for a function call fn(a, b, c) GCC usually won't do
push c
push b
push a
call fn
but rather
sub esp, 0xC
mov [esp+8], c
mov [esp+4], b
mov [esp], a
call fn
Note that in the second case it has no restrictions on the order.
It is possible for a particular C implementation to define what a compiler will do in certain cases which would, per the standard, be "undefined behavior". For example, setting an int variable to ~0U would constitute undefined behavior, but there's nothing in the C standard that wouldn't allow a compiler to evaluate the int as -1 (or -493, for that matter). Nor is there anything that would forbid a particular compiler vendor from stating that their particular compiler will in fact set the variable to -1. Since __cdecl isn't defined in the C standard, and is only applicable to certain compilers, the question of how its semantics are defined is up to those vendors; since the C standard lists it as undocumented behavior, it will only be documented to the extent particular vendors document it.
You are changing the same variable more than once between sequence points (function argument evaluation is one sequence point), which cause undefined behaviour regardless of calling convention.