gfortran optimizer prevents an access violation - fortran

The following program makes a common mistake: it modifies a function argument
that was originally passed as a constant. Usually the constant is stored
in a read-only section of the object code, so at run time one gets an access violation.
It's exactly what happens with gfortran, with optimization -O0 or -O1 (gfortran 4.8.1 on Windows).
But it disappears with -O2, and the second PRINT shows the value 100, like the first.
By inspection of the assembly output, I can see that in the -O1 case, the function F is optimized out, but the computations are still done in the code of A, and storing 117 causes a crash. With -O2, no computation is done, the result (201) is included in the assembly output as a constant, and the value 117 is never stored.
program bob
implicit none
call a(100)
contains
subroutine a(n)
integer :: n
print *, "In A:", f(n), n
print *, n
end subroutine
function f(n)
integer :: n, f
f = 2*n + 1
n = 117
end function
end program
Is this behaviour accepted by the standard? Is this a bug?
My first thought was that maybe it's a bug of the optimizer (it does not do something that would have indeed an effect, since the modified value is printed afterwards). But I'm aware that usually, an undefined behaviour in the standard can have any consequence when actually run.
If I replace the constant 100 in the call, with a variable previously initialized to 100, the compiler produces the expected result (the second PRINT gives me 117, with any optimization level).
So maybe the optimizer is being very clever in the "constant" case: since the code would crash, the print would not happen, so the value is not needed and is optimized out, and in the end the program doesn't crash. But I still find it a bit puzzling.

The behaviour of the erroneous program is consistent with what the standard requires.
The standard doesn't require the compiler to diagnose this particular error (it is not a violation of the numbered syntax rules or numbered constraints). Beyond that, if a program is in error in this way, then the standard doesn't impose any requirements on the Fortran processor.
It does not reveal a bug in the compiler. Any behaviour is valid, including things like the compiler beating you over the head with a stick.
Perhaps you should have stated your INTENT.

This is probably a bug in the constant propagation module of the GCC optimiser. That pass is enabled by default at optimisation levels above -O1 and can be disabled by passing -fno-ipa-cp.
This example only serves to illustrate the importance of giving each dummy argument the correct INTENT attribute. When n is marked as INTENT(INOUT) in a, the compiler gives an error, no matter what the optimisation level.

Related

Why is fprintf causing memory leak and behaving unpredictably when width argument is missing

The following simple program is behaving unpredictably. Sometimes it prints "0.00000", sometimes it prints more "0"s than I can count. Sometimes it uses up all memory on the system, before the system either kills some process, or it fails with bad_alloc.
#include "stdio.h"
int main() {
fprintf(stdout, "%.*f", 0.0);
}
I'm aware that this is incorrect usage of fprintf. There should be another argument specifying the width of the formatting. It's just surprising that the behavior is so unpredictable. Sometimes it seems to use a default width, while sometimes it fails very badly. Could this not be made to always fail or always use some default behaviour?
I came over similar usage in some code at work, and spent a lot of time figuring out what was happening. It only seemed to happen with debug builds, but would not happen while debugging with gdb. Another curiosity is that running it through valgrind would consistently bring about the printing of many "0"s case, which otherwise happens quite seldom, but the memory usage issue would never occur then either.
I am running Red Hat Enterprise Linux 7, and compiled with gcc 4.8.5.
Formally this is undefined behavior.
As for what you're observing in practice:
My guess is that fprintf ends up using an uninitialized integer as the number of decimal places to output. That's because it'll try to read a number from a location where the caller didn't write any particular value, so you'll just get whatever bits happen to be stored there. If that happens to be a huge number, fprintf will try to allocate a lot of memory to store the result string internally. That would explain the "running out of memory" part.
If the uninitialized value isn't quite that big, the allocation will succeed and you'll end up with a lot of zeroes.
And finally, if the random integer value happens to be just 5, you'll get 0.00000.
Valgrind probably consistently initializes the memory your program sees, so the behavior becomes deterministic.
Could this not be made to always fail
I'm pretty sure it won't even compile if you use gcc -pedantic -Wall -Wextra -Werror.
The format string does not match the arguments, therefore the behaviour of fprintf is undefined. Google "undefined behaviour C" for more information about undefined behaviour.
This would be correct:
// printf 0.0 with 7 decimals
fprintf(stdout, "%.*f", 7, 0.0);
Or maybe you just want this:
// printf 0.0 with the default format
fprintf(stdout, "%f", 0.0);
About this part of your question: Sometimes it seems to use a default width, while sometimes it fails very badly. Could this not be made to always fail or always use some default behaviour?
There cannot be any default behaviour: fprintf reads its arguments according to the format string. If the arguments don't match, fprintf ends up with seemingly random values.
About this part of your question: Another curiosity is that running it through valgrind would consistently bring about the printing of many "0"s case, which otherwise happens quite seldom, but the memory usage issue would never occur then either.
This is just another manifestation of undefined behaviour: under valgrind the conditions are quite different, so the actual outcome of the undefined behaviour can be different.
Undefined behaviour is undefined.
However, on x86-64 System-V ABI it is well-known that arguments are not passed on stack but in registers. Floating point variables are passed in floating-point registers, and integers are passed in general-purpose registers. There is no parameter store on stack, so the width of the arguments does not matter. Since you never passed any integer in the variable argument part, the general purpose register corresponding to the first argument will contain whatever garbage it had from before.
This program will show how the floating point values and integers are passed separately:
#include <stdio.h>
int main() {
fprintf(stdout, "%.*f\n", 42, 0.0);
fprintf(stdout, "%.*f\n", 0.0, 42);
}
Compiled on x86-64, GCC + Glibc, both printfs will produce the same output:
0.000000000000000000000000000000000000000000
0.000000000000000000000000000000000000000000
This is undefined behaviour in the standard. It means "anything is fair game" because you're doing wrong things.
The worst part is that almost any compiler will warn you, but you have ignored the warning. Adding validation beyond what the compiler does would incur a cost that everybody pays just so you can do the wrong thing.
That's the opposite of what C and C++ stand for: you pay for what you use. If you want to pay the cost, it's up to you to do the checking.
What's really happening depends on the ABI, compiler and architecture. It's undefined behaviour because the language gives the implementer the freedom to do what's better on every machine (meaning, sometimes faster code, sometimes shorter code).
As an example, when you call a function on the machine, it just means that you're instructing the microprocessor to go to a certain code location.
In some made up assembly and ABI, then, printf("%.*f", 5, 1); will translate into something like
mov A, STR_F ; // load into register A the 32 bit address of the string "%.*f"
mov B, 5 ; // load second 32 bit parameter into B
mov F0, 1.0 ; // load first floating point parameter into register F0
call printf ; // call the function
Now, if you miss some parameter, in this case B, it will take any value that was there before.
The thing with functions like printf is that they allow anything in their parameter list (it's printf(const char*, ...), so anything is valid). That's why you shouldn't use printf in C++: you have better alternatives, like streams. printf bypasses the compiler's type checking, whereas streams are aware of types and are extensible to your own types. It is also why your code should compile without warnings.

If i call a function in fortran, without defining a variable, what happens?

So i have a program, which has something like this in it:
integer :: mgvn, stot, gutot, iprint, iwrit, ifail, iprnt
...
call readbh(lubnd,nbset,nchan,mgvn,stot,gutot,nstat,nbound,rr,bform,iprnt,iwrit,ifail)
And then inside readbh:
CALL GETSET(LUBND,NSET,KEYBC,BFORM,IFAIL)
IF(IFAIL.NE.0) GO TO 99
...
99 WRITE(IWRITE,98) NBSET,LUBND
IFAIL = 1
RETURN
Where all the other variables are defined, but ifail is not. If I add write(*,*) ifail before the function call, I get the undefined variable error, but if I leave it out, it doesn't complain, and just runs away with the function, and always fails, with IFAIL=1.
Is this because it's just getting to the end of the arguments in the readbh function, reading in uninitialised memory (which is just random gibberish) and then casting those bits to an int, which is not going to be zero unless I'm very (un)lucky, and so nearly always making ifail.ne.0 true?
I'll choose to interpret what you call undefined variable as uninitialised variable. Generally speaking Fortran, and many other compiled programming languages, will quite happily carry on computing with uninitialised variables. They are programming languages for grown-ups; it's on your own head if you program this sort of behaviour. It is not syntactically incorrect to write a Fortran program which uses uninitialised variables, so a compiler is not bound by the language standard to raise a warning or error.
Fortran does, though, have a facility that helps you ensure that output arguments of functions and subroutines are given values. If you use the intent(out) attribute on arguments which ought to have values assigned to them inside a procedure, then the compiler is in a position to check that an assignment is made; many compilers will issue a diagnostic if one is not, although the standard does not require it.
Most compilers have an option to implement run-time checking for use of uninitialised variables. Intel Fortran, for example, has the flag -check:uninit. Without this check, yes, your program will interpret whatever pattern of bits it finds in the region of memory labelled ifail as an integer and carry on.
You write that your function always fails with ifail == 1. From what you've shown us ifail is, just prior to the return at (presumably) the end of the call to readbh, unconditionally set to 1.
From what you've revealed of your code it looks to me as if ifail is intended as an error return code from getset so it's not necessarily wrong that it is uninitialised on entry to that subroutine. But it is a little puzzling that readbh then sets it to 1 before returning.

Why does this code compile without warnings?

I have no idea why this code compiles:
int array[100];
array[-50] = 100; // Crash!!
...the compiler still compiles it properly, without errors or warnings.
So why does it compile at all?
array[-50] = 100;
Actually means here:
*(array - 50) = 100;
Take into consideration this code:
int array[100];
int *b = &(array[50]);
b[-20] = 5;
This code is valid and won't crash. Compiler has no way of knowing, whether the code will crash or not and what programmer wanted to do with the array. So it does not complain.
Finally, keep in mind that you should not rely on compiler warnings to find bugs in your code. Compilers will not find most of your bugs; at best they offer hints that ease the bug-fixing process (and sometimes they are mistaken and flag valid code as buggy). Also, the standard never actually requires the compiler to emit a warning, so warnings are only an act of good will on the part of compiler implementers.
It compiles because the expression array[-50] is transformed to the equivalent
*(&array[0] + (-50))
which is another way of saying "take the memory address &array[0] and add to it -50 times sizeof(array[0]), then interpret the contents of the resulting memory address and those following it as an int", as per the usual pointer arithmetic rules. This is a perfectly valid expression where -50 might really be any integer (and of course it doesn't need to be a compile-time constant).
Now it's definitely true that since here -50 is a compile-time constant, and since accessing the minus 50th element of an array is almost always an error, the compiler could (and perhaps should) produce a warning for this.
However, we should also consider that detecting this specific condition (statically indexing into an array with an apparently invalid index) is something that you don't expect to see in real code. Therefore the compiler team's resources will be probably put to better use doing something else.
Contrast this with other constructs like if (answer = 42) which you do expect to see in real code (if only because it's so easy to make that typo) and which are hard to debug (the eye can easily read = as ==, whereas that -50 immediately sticks out). In these cases a compiler warning is much more productive.
The compiler is not required to catch all potential problems at compile time. The C standard allows for undefined behavior at run time (which is what happens when this program is executed). You may treat it as a legal excuse not to catch this kind of bugs.
There are compilers and static program analyzers that can catch trivial bugs like this, though.
True compilers do (note: need to switch the compiler to clang 3.2, gcc is not user-friendly)
Compilation finished with warnings:
source.cpp:3:4: warning: array index -50 is before the beginning of the array [-Warray-bounds]
array[-50] = 100;
^ ~~~
source.cpp:2:4: note: array 'array' declared here
int array[100];
^
1 warning generated.
If you have a lesser (*) compiler, you may have to set up the warning manually though.
(*) ie, less user-friendly
The number inside the brackets is just an index. It tells you how many steps in memory to take to find the number you're requesting. array[2] means start at the beginning of array, and jump forwards two times.
You just told it to jump backwards 50 times, which is a valid statement. However, I can't imagine there being a good reason for doing this...

What do I do about a FORTRAN intrinsic that was not part of the standard?

I'm trying to get a legacy FORTRAN code working by building it from source using gfortran. I have finally been able to build it successfully, but now I'm getting an out-of-bounds error when it runs. I used gdb and traced the error to a function that uses the loc() intrinsic. When I try to print the value of loc(ae), with ae being my integer value being passed, I get the error "No symbol "loc" in current context." I tried compiling with ifort 11.x and debugged with DDT and got the same error. To me, this means that the compiler knows nothing of the intrinsic.
A little reading revealed that the loc intrinsic wasn't part of the F77 standard, so maybe that's part of the problem. I posted the definition of the intrinsic below, but I don't know how I can implement that into my code so loc() can be used.
Any advice or am I misinterpreting my problem? Because both gfortran and ifort crash in the same place due to an out of bounds error, but the function utilizing loc() returns the same large number between both compilers. It seems a bit strange that loc() wouldn't be working if both compilers shoot back the same value for loc.
Usage:
iaddr = loc(obj)
Where:
obj
is a variable, array, function or subroutine whose address is wanted.
iaddr
is an integer with the address of "obj". The address is in the same
format as stored by an LARn
instruction.
Description:
LOC is used to obtain the address of
something. The value returned is not
really useful within Fortran, but may
be needed for GMAP subroutines, or
very special debugging.
Well, no, the fact that it compiles means that loc is known by the compiler; the fact that gdb doesn't know about it just means that it's not known by the debugger (which probably doesn't know the matmul intrinsic, either).
loc is a widely-available non-standard extension. I hate those. If you want something standard that should work everywhere, c_loc, which is part of the C<->Fortran interoperability standard in Fortran2003, is something you could use. It returns a pointer that can be passed to C routines.
How is the value from the loc call being used?
Gfortran's loc seems to treat arrays a bit differently from that of some other compilers. If you are using it to, e.g., check for array copies, it can be better to take loc of the first element, loc(obj(1,1)) or similar. I think this is equivalent to what loc does with Intel, but in gfortran loc of the array itself gives some other address (so two arrays which share exactly the same memory layout have different loc results).

GCC: program doesn't work with compilation option -O3

I'm writing a C++ program that doesn't work (I get a segmentation fault) when I compile it with optimizations (options -O1, -O2, -O3, etc.), but it works just fine when I compile it without optimizations.
Is there any chance that the error is in my code? or should I assume that this is a bug in GCC?
My GCC version is 3.4.6.
Is there any known workaround for this kind of problem?
There is a big difference in speed between the optimized and unoptimized version of my program, so I really need to use optimizations.
This is my original functor. The one that works fine with no levels of optimizations and throws a segmentation fault with any level of optimization:
struct distanceToPointSort{
indexedDocument* point ;
distanceToPointSort(indexedDocument* p): point(p) {}
bool operator() (indexedDocument* p1,indexedDocument* p2){
return distance(point,p1) < distance(point,p2) ;
}
} ;
And this one works flawlessly with any level of optimization:
struct distanceToPointSort{
indexedDocument* point ;
distanceToPointSort(indexedDocument* p): point(p) {}
bool operator() (indexedDocument* p1,indexedDocument* p2){
float d1=distance(point,p1) ;
float d2=distance(point,p2) ;
std::cout << "" ; //without this line, I get a segmentation fault anyways
return d1 < d2 ;
}
} ;
Unfortunately, this problem is hard to reproduce because it happens with some specific values. I get the segmentation fault upon sorting just one out of more than a thousand vectors, so it really depends on the specific combination of values each vector has.
Now that you posted the code fragment and a working workaround was found (#Windows programmer's answer), I can say that perhaps what you are looking for is -ffloat-store.
-ffloat-store
Do not store floating point variables in registers, and inhibit other options that might change whether a floating point value is taken from a register or memory.
This option prevents undesirable excess precision on machines such as the 68000 where the floating registers (of the 68881) keep more precision than a double is supposed to have. Similarly for the x86 architecture. For most programs, the excess precision does only good, but a few programs rely on the precise definition of IEEE floating point. Use -ffloat-store for such programs, after modifying them to store all pertinent intermediate computations into variables.
Source: http://gcc.gnu.org/onlinedocs/gcc-3.4.6/gcc/Optimize-Options.html
I would assume your code is wrong first.
Though it is hard to tell.
Does your code compile with 0 warnings?
g++ -Wall -Wextra -pedantic -ansi
Here's some code that seems to work, until you hit -O3...
#include <stdio.h>
int main()
{
int i = 0, j = 1, k = 2;
printf("%d %d %d\n", *(&j-1), *(&j), *(&j+1));
return 0;
}
Without optimisations, I get "2 1 0"; with optimisations I get "40 1 2293680". Why? Because i and k got optimised out!
But I was taking the address of j and going out of the memory region allocated to j. That's not allowed by the standard. It's most likely that your problem is caused by a similar deviation from the standard.
I find valgrind is often helpful at times like these.
EDIT: Some commenters are under the impression that the standard allows arbitrary pointer arithmetic. It does not. Remember that some architectures have funny addressing schemes, alignment may be important, and you may get problems if you overflow certain registers!
The words of the [draft] standard, on adding/subtracting an integer to/from a pointer (emphasis added):
"If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined."
Seeing as &j doesn't even point to an array object, &j-1 and &j+1 can hardly point to part of the same array object. So simply evaluating &j+1 (let alone dereferencing it) is undefined behaviour.
On x86 we can be pretty confident that adding one to a pointer is fairly safe and just takes us to the next memory location. In the code above, the problem occurs when we make assumptions about what that memory contains, which of course the standard doesn't go near.
As an experiment, try to see if this will force the compiler to round everything consistently.
volatile float d1=distance(point,p1) ;
volatile float d2=distance(point,p2) ;
return d1 < d2 ;
The error is in your code. It's likely you're doing something that invokes undefined behavior according to the C standard which just happens to work with no optimizations, but when GCC makes certain assumptions for performing its optimizations, the code breaks when those assumptions aren't true. Make sure to compile with the -Wall option, and the -Wextra might also be a good idea, and see if you get any warnings. You could also try -ansi or -pedantic, but those are likely to result in false positives.
You may be running into an aliasing problem (or it could be a million other things). Look up the -fstrict-aliasing option.
This kind of question is impossible to answer properly without more information.
It is very seldom the compiler's fault, but compilers do have bugs in them, and they often manifest themselves at different optimization levels (if there is a bug in an optimization pass, for example).
In general when reporting programming problems: provide a minimal code sample to demonstrate the issue, such that people can just save the code to a file, compile and run it. Make it as easy as possible to reproduce your problem.
Also, try different versions of GCC (compiling your own GCC is very easy, especially on Linux). If possible, try with another compiler. Intel has a C compiler which is more or less GCC compatible (and free for non-commercial use, I think). This will help pinpoint the problem.
It's almost (almost) never the compiler.
First, make sure you're compiling warning-free, with -Wall.
If that didn't give you a "eureka" moment, attach a debugger to the least optimized version of your executable that crashes and see what it's doing and where it goes.
5 will get you 10 that you've fixed the problem by this point.
I ran into the same problem a few days ago; in my case it was aliasing. GCC does it differently, but not wrongly, when compared to other compilers. GCC has become what some might call a rules-lawyer of the C++ standard, and its implementation is correct, but you also have to be really correct in your C++, or it'll over-optimize some things, which is a pain. But you get speed, so can't complain.
I expect to get some downvotes here after reading some of the comments, but in the console game programming world, it's rather common knowledge that the higher optimization levels can sometimes generate incorrect code in weird edge cases. It might very well be that edge cases can be fixed with subtle changes to the code, though.
Alright...
This is one of the weirdest problems I've ever had.
I don't think I have enough proof to state it's a GCC bug, but honestly... It really looks like one.
Wow, I didn't expect answers so quickly, and so many...
The error occurs upon sorting a std::vector of pointers using std::sort()
I provide the strict-weak-ordering functor.
But I know the functor I provide is correct because I've used it a lot and it works fine.
Plus, the error cannot be some invalid pointer in the vector because the error occurs just when I sort the vector. If I iterate through the vector without applying std::sort first, the program works fine.
I just used GDB to try to find out what's going on. The error occurs when std::sort invokes my functor. Apparently std::sort is passing an invalid pointer to my functor. (Of course this happens with the optimized version only, at any level of optimization: -O, -O2, -O3.)
As others have pointed out, it's probably strict aliasing.
Turn it off at -O3 (with -fno-strict-aliasing) and try again. My guess is that you are doing some pointer tricks in your functor (fast float-as-int compare? object type in the lower 2 bits?) that fail once template functions are inlined.
Warnings do not help to catch this case: "if the compiler could detect all strict aliasing problems it could just as well avoid them". Just changing an unrelated line of code may make the problem appear or go away, as it changes register allocation.
As the updated question will show ;) , the problem exists with a std::vector<T*>. One common error with vectors is reserve()ing what should have been resize()d. As a result, you'd be writing outside array bounds. An optimizer may discard those writes.
Post the code in distance! It probably does some pointer magic, see my previous post. Doing an intermediate assignment just hides the bug in your code by changing register allocation. Even more telling is that adding the output statement changes things!
The true answer is hidden somewhere inside all the comments in this thread. First of all: it is not a bug in the compiler.
The problem has to do with floating point precision. distanceToPointSort should never return true for both argument orders, (a,b) and (b,a), but that is exactly what can happen when the compiler decides to use higher precision for some data paths. The problem is especially likely on, but by no means limited to, x86 without -mfpmath=sse. If the comparator behaves that way, the sort function can become confused, and the segmentation fault is not surprising.
I consider -ffloat-store the best solution here (already suggested by CesarB).