I see the -O5 flag in the documentation on how to compile some f90 code I have acquired (specifically, mpfi90 -O5 file.f90), but researching it turned up nothing in the gfortran docs, mpfi docs, or anywhere else. I assume it is an optimization flag like -O1, etc., but I'm not sure.
Thanks!
Source: http://publib.boulder.ibm.com/infocenter/comphelp/v7v91/index.jsp?topic=%2Fcom.ibm.xlf91a.doc%2Fxlfug%2Fhu00509.htm
The -O5 flag is an optimization level, like -O2 and -O3. The linked source says:
-qnoopt/-O0: Fast compilation, debuggable code, conserved program semantics.
-O2 (same as -O): Comprehensive low-level optimization; partial debugging support.
-O3: More extensive optimization; some precision trade-offs.
-O4 and -O5: Interprocedural optimization; loop optimization; automatic machine tuning.
Each higher level includes all the optimizations of the lower levels.
The Intel Fortran compiler/linker has the optional flag -ipo-c (or /Qipo-c), which enables the generation of a single interprocedurally optimized object file from all files, which can later be used for linking. Is there an equivalent of Intel's -ipo-c in gfortran?
GCC has -fwhole-program, does that work for gfortran?
Or if you don't want to pass all the Fortran source files on one giant command line, there's -flto link-time optimization which uses a linker "plugin" to run the optimizer on GIMPLE stored in .o files (instead of or as well as machine code).
LTO means you should pass all your optimization options to the invocation of gfortran that does the linking, as well as the gfortran -c that compiles to .o.
So you might use gfortran -ffast-math -O3 -march=native -flto to compile and link, assuming gfortran supports the same options as gcc. (And that -march=native is what you want: make an executable optimized for the computer you compiled on, which might SIGILL on other computers without all the ISA extensions this one supports.)
At the moment I am doing some experiments with the GNU C++ compiler and the -Os optimization option for minimal code size. I checked which compiler flags are enabled at -Os with the following command:
g++ -c -Q -Os --help=optimizers | grep "enabled"
I got this list of enabled options:
-faggressive-loop-optimizations [enabled]
-falign-functions [enabled]
-falign-jumps [enabled]
-falign-labels [enabled]
-falign-loops [enabled]
-fasynchronous-unwind-tables [enabled]
...
This seems a bit strange, because I also looked up which flags should be enabled at -Os, and the -Os section of the documentation says that all the -falign-* options should be disabled to minimize code size.
Q: So is this a bug, or am I doing something wrong here? After reading what the -falign-* flags do, I really think they should be disabled at -Os!
My GCC version is 4.9.2 and I am working on Arch Linux.
Thanks in advance for helping :)
Q: So is this a bug or am I doing something wrong here? Cause after reading what the falign- flags do I really think they should be disabled in -Os
I think Hans did a good job of finding part of the problem. It's definitely a documentation bug. But no one from GCC commented on why -Os enabled them, so you might not have all of the information.
Older ARM devices were very intolerant of unaligned accesses. Older ARM devices included ARMv4 and, I think, ARMv5. If you performed an unaligned access, you would get a SIGBUS (been there, done that, got the tee shirt).
Modern ARM devices fix up unaligned accesses like x86 processors do, so you no longer get a SIGBUS. Instead, you just take the performance penalty.
You should try to specify an architecture in case those options are an artifact from older ARM device support. For example, -march=armv7. If you find it on ARMv6 and ARMv7, then that could still be a bug. It depends if the GCC team decided the tradeoff was sufficient for ARM (code size vs performance penalty).
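As a rough illustration of the unaligned-access problem described above (a sketch, not something taken from GCC): dereferencing a cast pointer at an odd address is the kind of load that could raise SIGBUS on older ARM cores, whereas copying the bytes with memcpy is valid at any alignment. The helper name read_u32_unaligned is hypothetical. Note that this is about data accesses; the -falign-* options in the question control where code (functions, jump targets, labels, loops) is placed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: load a 32-bit value from an address that may not be
   4-byte aligned. memcpy has no alignment requirement, so this does not rely
   on the hardware fixing up unaligned loads. */
uint32_t read_u32_unaligned(const unsigned char *p)
{
    uint32_t value;

    assert(p != NULL);
    memcpy(&value, p, sizeof value);   /* byte-wise copy, valid at any address */
    return value;
}
```

Calling read_u32_unaligned(buf + 1) on a byte buffer returns the four bytes starting at offset 1 in the machine's native byte order.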
Is there a way to enable vectorization only for some part of the code, like a pragma directive? Basically, behaving as if -ftree-vectorize were enabled only while compiling some part of the code? Pragma simd for example is not available with gcc...
The reason is that from benchmarking we saw that with -O3 (which enables vectorization) the timings were worse than with -O2. But there are some parts of the code for which we would like the compiler to try vectorizing loops.
One solution I could use would be to restrict the compiler directive to one file.
Yes, this is possible. You can either disable it for the whole module or individual functions. You can't however do this for particular loops.
For individual functions use
__attribute__((optimize("no-tree-vectorize"))).
For whole modules, -O3 automatically enables -ftree-vectorize. I'm not sure how to disable it once it's enabled, but you can use -O2 instead. If you want to use all of -O3 except -ftree-vectorize, then do this
gcc -c -Q -O3 --help=optimizers > /tmp/O3-opts
gcc -c -Q -O2 --help=optimizers > /tmp/O2-opts
diff /tmp/O2-opts /tmp/O3-opts | grep enabled
And then include all the options except for -ftree-vectorize.
Edit: I don't see -fno-tree-vectorize in the man pages but it works anyway so you can do -O3 -fno-tree-vectorize.
Edit: The OP actually wants to enable vectorization for particular functions or whole modules. In that case for individual functions __attribute__((optimize("tree-vectorize"))) can be used and for whole modules -O2 -ftree-vectorize.
Edit (from Antonio): In theory there is a pragma directive to enable tree-vectorizing all functions that follow
#pragma GCC optimize("tree-vectorize")
But it seems not to work with my g++ compiler, maybe because of the bug mentioned here:
How to enable optimization in G++ with #pragma. On the other hand, the function attribute works.
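The per-function approach from this answer could look roughly like the sketch below. The macro names VECTORIZE and NO_VECTORIZE and both functions are made up for illustration; only the optimize attribute itself comes from the answer. The attribute is GCC-specific, so it is hidden behind a macro that expands to nothing on other compilers. Keep in mind that the GCC manual describes the optimize attribute as intended for debugging rather than production code, so treat this as an experiment, not a guarantee that the loop gets vectorized.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: only real GCC understands optimize("..."); expand to nothing
   elsewhere so the file still compiles. */
#if defined(__GNUC__) && !defined(__clang__)
#define VECTORIZE    __attribute__((optimize("tree-vectorize")))
#define NO_VECTORIZE __attribute__((optimize("no-tree-vectorize")))
#else
#define VECTORIZE
#define NO_VECTORIZE
#endif

/* Ask GCC to try vectorizing this function even if the file is built
   with -O2. */
VECTORIZE
void scale_add(float *dst, const float *src, float k, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] += k * src[i];
}

/* Same loop with vectorization switched off, e.g. when building with -O3. */
NO_VECTORIZE
void scale_add_scalar(float *dst, const float *src, float k, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] += k * src[i];
}
```

With GCC 4.9 or later, adding -fopt-info-vec to the command line should report which loops were actually vectorized, which is a quick way to check whether the attribute had any effect.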
In g++ 4.6 (or later), what extra optimisations does -Ofast enable other than -ffast-math?
The man page says this option "also enables optimizations that are not valid for all standard compliant programs". Where can I find more information about whether this might affect my program or not?
Here's a command for checking what options are enabled with -Ofast:
$ g++ -c -Q -Ofast --help=optimizers | grep enabled
Since I only have g++ 4.4 that doesn't support -Ofast, I can't show you the output.
The -Ofast option might silently enable the gcc C++ extensions. You should check your sources to see if you make any use of them. In addition, the compiler might turn off some obscure and rarely encountered syntax checking for digraphs and trigraphs (this only improves compiler performance, not the speed of the compiled code).
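One way to check from inside the program whether fast-math semantics are in effect is the __FAST_MATH__ macro, which GCC predefines when -ffast-math is on (and -Ofast turns that on). The sketch below uses hypothetical function names. It also shows one kind of code that can be affected: under -ffast-math the compiler is allowed to assume that no NaNs occur, so a self-comparison NaN test may be folded to "always true".

```c
/* Returns 1 if this file was compiled with fast-math semantics
   (e.g. via -Ofast or -ffast-math), 0 otherwise. */
int built_with_fast_math(void)
{
#ifdef __FAST_MATH__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical NaN check: under IEEE rules x == x is false only for NaN.
   With fast-math the compiler may assume NaNs never occur and reduce this
   to a constant 1, so do not rely on it in such builds. */
int is_not_nan(double x)
{
    return x == x;
}
```

If built_with_fast_math() returns 1, code in that file that depends on NaN or infinity handling, signed zeros, or a specific order of floating-point operations deserves a second look.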
I am wondering about the use of -O0, -O1 and -g for enabling debug symbols in a lib.
Some suggest using -O0 to enable debug symbols and some suggest using -g.
So what is the actual difference between -g and -O0, what is the difference between -O1 and -O0, and which is best to use?
-O0 is optimization level 0 (no optimization, same as omitting the -O argument)
-O1 is optimization level 1.
-g generates and embeds debugging symbols in the binaries.
See the gcc docs and manpages for further explanation.
For doing actual debugging, debuggers are usually not able to make sense of stuff that's been compiled with optimization, though debug symbols are useful for other things even with optimization, such as generating a stacktrace.
-OX specifies the optimisation level that the compiler will perform. -g is used to generate debug symbols.
From GCC manual
http://gcc.gnu.org/onlinedocs/
3.10 Options That Control Optimization
-O
-O1
Optimize. Optimizing compilation takes somewhat more time, and a lot more memory for a large function. With -O, the compiler tries to reduce code size and execution time, without performing any optimizations that take a great deal of compilation time.
-O2
Optimize even more. GCC performs nearly all supported optimizations that do not involve a space-speed tradeoff. As compared to -O, this option increases both compilation time and the performance of the generated code.
-O3
Optimize yet more. -O3 turns on all optimizations specified by -O2 and also turns on the -finline-functions, -funswitch-loops, -fpredictive-commoning, -fgcse-after-reload, -ftree-vectorize and -fipa-cp-clone options.
-O0
Reduce compilation time and make debugging produce the expected results. This is the default.
-g
Produce debugging information in the operating system's native format (stabs, COFF, XCOFF, or DWARF 2). GDB can work with this debugging information.
-O0 doesn't enable debug symbols, it just disables optimizations in the generated code so debugging is easier (the assembly code follows the C code more or less directly). -g tells the compiler to produce symbols for debugging.
It's possible to generate symbols for optimized code (just continue to specify -g), but trying to step through code or set breakpoints may not work as you expect because the emitted code will likely not "follow along" with the original C source closely. So debugging in that situation can be considerably trickier.
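To see that the two options are independent, a small sketch like the following can report the optimization setting from inside the program. It relies on macros that GCC predefines: __OPTIMIZE__ at -O1 and above, and additionally __OPTIMIZE_SIZE__ at -Os. The function name optimization_mode is hypothetical. Passing or omitting -g does not affect these macros.

```c
/* Reports how this file was optimized, based on GCC's predefined macros.
   The result depends only on the -O level, not on -g. */
const char *optimization_mode(void)
{
#if defined(__OPTIMIZE_SIZE__)
    return "size";    /* built with -Os */
#elif defined(__OPTIMIZE__)
    return "speed";   /* built with -O1 or higher */
#else
    return "none";    /* built with -O0 or without any -O option */
#endif
}
```

Building the same file with gcc -O0 -g and with gcc -O2 -g should give "none" and "speed" respectively, while both binaries carry debug symbols.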
-O1 (which is the same as -O) performs a minimal set of optimizations. -O0 essentially tells the compiler not to optimize. There are a slew of options that allow a very fine control over how you might want the compiler to perform: http://gcc.gnu.org/onlinedocs/gcc-4.6.3/gcc/Optimize-Options.html#Optimize-Options
As mentioned by others, the -O set of options indicates the level of optimization to be done by the compiler, whereas the -g option adds the debugging symbols.
For a more detailed understanding, please refer to the following links:
http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#Optimize-Options
http://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html#Debugging-Options