Inline keyword gfortran - fortran

Is there any Fortran keyword equivalent to the C "inline" keyword?
If some compiler-specific keyword exists, is there one for gfortran?

In general, the Fortran specifications grant the compiler writers enormous scope in how to implement things, so a language-level construct that forced (or even hinted) specific optimizations would be very un-Fortran-y.
What you typically do in modern Fortran is not specify optimizations, but tell the compiler things it can use to decide what optimizations to implement. So an example is labelling a side-effect-free function or subroutine PURE, so that certain optimizations are enabled (and actually, this may make for easier inlining).
Otherwise, as @Vladimir F points out, you can use compiler options, which are prescriptive in this way.
In a similar vein, it seems that CONTAINed subprograms are more aggressively inlined by gfortran, but that may or may not help.

There is no source code statement that I know of. Sometimes you can use statement functions, which are obviously inlined. Otherwise use compiler command-line options such as gfortran's "-finline-functions".

Related

In standard Fortran grammar, can we specify some unused variable?

To find all the (possible) problems in a program, it is best to turn on all the debug tools of the compiler. The compiler will then always tell us something like "remark #7712: This variable has not been used.".
In many cases, in order to follow some rules, I have to keep some inputs and outputs without using them. At the same time, I want to keep the debug tools turned on.
Can we do something in standard grammar to tell the compiler that we really mean to do it, so that it does not report any warning about it?
The Fortran standard sets out the rules for correct programs and requires that compilers identify any breach of those rules. Such breaches, which cause compilation to fail, are generally known as errors.
However, programmers make many mistakes which are not errors and which a (Fortran) compiler is not required to spot. Some compilers provide additional diagnostic capabilities, such as identifying unused variables, which go beyond what the standard requires. The compilers raise what are generally known as warnings in these cases. This type of mistake does not cause compilation to fail. Compilers also generally provide some means to determine which warnings are raised during compilation, so that you can switch off and on this diagnostic capability. For details of these capabilities refer to your compiler's documentation.
The standard is entirely silent on this type of mistake so, if I understand the question correctly, there is nothing "by standard grammar to tell the compiler we really mean to do it and do not report any warning about it".
The simplest thing (besides, of course, not declaring things you don't use) may be to simply use the variables:
real x
x=huge(x) !reminder x is declared but not used.
This at least makes gfortran happy that you have "used" the variable.

What are guiding principles of expansion of callee inside the caller (Inlining - Compiler Optimization) [duplicate]

My understanding is that compilers follow certain semantics that decide whether or not a function should be expanded inline. For example, if the callee unconditionally (no if/else-if to return) returns a value, it may be expanded in the caller itself. Similarly, function call overhead can also guide this expansion. (I may be completely wrong.)
Similarly, hardware parameters like cache usage may also play a role in expansion.
As a programmer, I want to understand these semantics and the algorithms which guide inline expansion. Ultimately, I should be able to write (or recognize) code that will surely be inlined (or not inlined). I don't mean to override the compiler, nor do I think I would be able to write better code than the compiler itself. The question is rather about understanding the internals of compilers.
EDIT: Since I use gcc/g++ in my work, we can limit the scope to these two alone, though I was of the opinion that several things would be common across compilers in this context.
You don't need to understand the inlining (or other optimization) criteria, because by definition (assuming that the optimizing compiler is not buggy in that respect), inlined code should behave the same as non-inlined code.
Your first example (callee unconditionally returning a value) is in practice certainly wrong, in the sense that several compilers are able to inline conditional returns.
For example, consider this f.c file:
static int fact (int n) {
  if (n <= 0) return 1;
  else
    return n * fact (n - 1);
}
int foo () {
  return fact (10);
}
Compile it with gcc -O3 -fverbose-asm -S f.c; the resulting f.s assembly file contains only one function (foo); the fact function is completely gone, and the fact(10) call has been inlined (recursively) and replaced (by constant folding) with 3628800.
With GCC (the current version is GCC 5.2 as of July 2015), assuming you ask it to optimize (e.g. compile with gcc -O2 or g++ -O2 or -O3), the inlining decision is not easy to understand. The compiler will very probably make better inlining decisions than you can. There are many internal heuristics guiding it (so there are no few simple guiding principles, but some heuristics to inline, others to avoid inlining, and probably some meta-heuristics to choose between them). Read about optimization options (e.g. -finline-limit=...) and function attributes.
You might use the always_inline, gnu_inline and noinline (and also noclone) function attributes, but I don't recommend doing that in general.
You could disable inlining with noinline, but very often the resulting code would be slower. So don't do that...
The key point is that the compiler is better at optimizing and inlining than you reasonably can be, so trust it to inline and optimize well.
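For reference, this is roughly how the attributes mentioned above are spelled with GCC (Clang accepts the same syntax). It is only a minimal sketch: the function names are made up for illustration, and as said above, neither attribute is something to reach for in general.

```cpp
// GCC/Clang-specific attribute syntax; other compilers use other spellings.

// always_inline: ask GCC to inline this function even when not optimizing.
__attribute__((always_inline)) inline int square_forced(int x) {
    return x * x;
}

// noinline: forbid inlining, so every call stays a real call.
__attribute__((noinline)) int square_outlined(int x) {
    return x * x;
}

// Both helpers compute the same value; only the generated code differs.
int sum_of_squares(int a, int b) {
    return square_forced(a) + square_outlined(b);
}
```

Comparing the assembly from g++ -O2 -S with and without the attributes is the only reliable way to see what they changed.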
Optimizing compilers can (and do) inline functions without you knowing it, e.g. they sometimes inline functions not marked inline, or do not inline some functions marked inline.
So no, you don't want to "understand these semantics and the algorithms which guide inline expansion"; they are too difficult ... and vary from one compiler to another (even from one version to another). If you really want to understand why GCC is inlining (this means spending months of work, and I believe you should not lose your time on that), use -fdump-tree-all and other dump flags, instrument the compiler using MELT (which I am developing), or dive into the source code (since GCC is free software).
You'll need more than your lifetime, or at least several dozen years, to understand all of GCC (more than ten million lines of source code) and how it optimizes. By the time you had understood something, the GCC community would have worked on new optimizations, etc...
BTW, if you compile and link an entire application or library with gcc -flto -O3 (e.g. with make CC='gcc -flto -O3'), the GCC compiler will do link-time optimization and inline some calls across translation units (e.g. if in f1.c you call foo defined in f2.c, some of the calls to foo in f1.c would get inlined).
The compiler optimizations do take into account cache sizes (for deciding about inlining, unrolling, register allocation & spilling and other optimizations), in particular when compiling with gcc -mtune=native -O3.
Unless you force the compiler (e.g. by using the noinline or always_inline function attributes in GCC, which is often wrong and would produce worse code), you'll never be able in practice to guess that a given code chunk would certainly be inlined. Even people working on GCC middle-end optimizations cannot guess that reliably! So you cannot reliably understand (and predict) the compiler's behavior in practice; hence don't even waste your time trying.
Look also into MILEPOST GCC; by using machine learning techniques to tune some GCC parameters, they have been able to sometimes get astonishing performance improvements, but they certainly cannot explain or understand them.
If you need to understand your particular compiler while coding some C or C++, your code is probably wrong (e.g. probably could have some undefined behavior). You should code against some language specification (either the C11 or C++14 standards, or the particular GCC dialect e.g. -std=gnu11 documented and implemented by your GCC compiler) and trust your compiler to be faithful w.r.t. that specification.
Inlining is like copy-paste. There aren't so many gotchas that will prevent it from working, but it should be used judiciously. If it gets out of control, the program will become bloated.
Most compilers use a heuristic based on the "size" of the function. Since this is usually before any code generation pass, the number of AST nodes may be used as a proxy for size. A function that includes inlined calls needs to include them in its own size, or inlining can go totally out of control. However, AST nodes that will not generate instructions should not prevent inlining. It can be difficult to tell what will generate a "move" instruction and what will generate nothing.
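To make the size bookkeeping concrete, here is a toy cost model. It is not how GCC or any real compiler does it; FunctionInfo, CallGraph, the node counts and the threshold are all invented for illustration, and the call graph is assumed to contain no recursion.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy description of a function: its own AST node count (excluding calls)
// plus the names of the functions it calls.
struct FunctionInfo {
    int own_nodes;
    std::vector<std::string> callees;
};

using CallGraph = std::map<std::string, FunctionInfo>;

// Size of `name` once every callee that fits under `threshold` has been
// inlined into it. A callee that is too big stays a call and costs one node.
// Assumption: the call graph is acyclic (no recursion).
int size_after_inlining(const CallGraph& g, const std::string& name, int threshold) {
    const FunctionInfo& f = g.at(name);
    int total = f.own_nodes;
    for (const std::string& callee : f.callees) {
        int callee_size = size_after_inlining(g, callee, threshold);
        total += (callee_size <= threshold) ? callee_size : 1;
    }
    return total;
}

// A function is an inlining candidate only if its size, including what was
// already inlined into it, stays under the threshold.
bool would_inline(const CallGraph& g, const std::string& name, int threshold) {
    return size_after_inlining(g, name, threshold) <= threshold;
}
```

In this model a wrapper with only two nodes of its own that calls a three-node leaf three times counts as 11 nodes, so with a threshold of 10 the leaf is inlined but the wrapper is not.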
Since modern C++ tends to involve lots of functions that perform conceptual rearrangement with no underlying instructions, the difficulty is telling the difference between no instructions, "just a few" moves, and enough move instructions to cause a problem. The only way to tell for a particular instance is to run the program in a debugger and/or read the disassembly.
Mostly in typical C++ code, we just assume that the inliner is working hard enough. For performance-critical situations, you can't just eyeball it or assume that anything is working optimally. Detailed performance analysis at the disassembly level is essential.

Does the 'private' access modifier give the compiler more room for optimization?

Does it allow the compiler to inline it, knowing that only functions in the same class can access it? Or is it only for the programmer's convenience?
The compiler can (but is not required to) optimize as you suggest, but that's not the point. The point of access modifiers is to catch certain classes (no pun intended) of programming errors at compile time. Private functions are functions that, if someone called them from outside the class, that would be a bug, and you want to know about it as early as possible.
(Any time you ask the question "could the compiler make optimizations based on this information available to it", the answer is "yes, unless there's a specific rule in the standard that says it's not allowed to" (such as the rules for volatile, whose entire purpose is to inhibit optimizations). However, compilers do not necessarily bother optimizing based on any given piece of information. There is, after all, no requirement for compilers to do any optimization in the first place! How clever your compiler is, nowadays, largely depends on how long you are willing to let it run; MSVC's whole-program PGO mode is capable of inlining through virtual method dispatch -- it guesses the most likely target, and falls back to a regular virtual call at runtime if the guess was wrong -- but slows down compiles by at least a factor of two.)
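The "guess the most likely target, fall back otherwise" idea can be written out by hand. This is only a sketch of the concept, not of what MSVC actually emits; the Shape, Square and Rect classes and area_guarded are invented for the example.

```cpp
#include <typeinfo>

struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

struct Square final : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
};

struct Rect : Shape {
    double w, h;
    Rect(double w_, double h_) : w(w_), h(h_) {}
    double area() const override { return w * h; }
};

// Hand-written speculative devirtualization: suppose profiling showed that
// most shapes passed in here are Squares.
double area_guarded(const Shape& s) {
    if (typeid(s) == typeid(Square)) {
        // Guess was right: the qualified call bypasses virtual dispatch,
        // so the compiler is free to inline Square::area here.
        return static_cast<const Square&>(s).Square::area();
    }
    // Guess was wrong: fall back to a regular virtual call.
    return s.area();
}
```

A PGO build does the equivalent check on its own, picking the guarded type from the profile data.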
Access specifiers are part of the C++ mechanism for implementing the OOP principles of encapsulation and abstraction, not an optimization aid for compilers.
An intelligent compiler could possibly implement some optimization based on them, but it is not required to do so by the C++ standard. The purpose of access specifiers is not optimization but to provide language constructs for the principles supported by the C++ language.

Why is there no standard way to force inline in C++?

According to the wikipedia C++ article
C++ is designed to give the programmer choice, even if this makes it possible for the programmer to choose incorrectly.
If it is designed this way, why is there no standard way to force the compiler to inline something, even if I might be wrong?
Or I can ask: why is the inline keyword just a hint?
I think I have no choice here.
In the OOP world we call methods on objects, and directly accessing members should be avoided. If we can't force the accessors to be inlined, then we are unable to write high-performance but still maintainable applications.
(I know many compilers implement their own way to force inlining, but it's ugly. Using macros to make inline accessors on a class is ugly too.)
Does the compiler always do it better than the programmer?
How would a compiler inline a recursive function (especially if the compiler does not support tail-call optimization, or, even if it does, the function is not tail-call optimizable)?
This is just one reason why the compiler should decide whether inlining is practical or not. There can be others as well which I can't think of right now.
Does the compiler always do it better than the programmer?
No, not always... but the programmer is far more error prone, and less likely to maintain the optimal tuning over a span of years. The bottom line is that inlining only helps performance if the function is really small (for at least one common/important code path), but then it can help by about an order of magnitude, depending on many things of course. It's often impractical for the programmer to assess, let alone keep a careful eye on, how trivial a function is, and the thresholds can vary with compiler implementation choices, command line options, CPU model, etc. There are so many things that could suddenly bloat a function: any non-builtin type can trigger all sorts of different behaviours (esp. in templates), any operator used (even new) can be overloaded, and the verbosity of calling conventions and exception-handling steps isn't generally obvious to the programmer.
The chances are that if the compiler isn't inlining something that's small enough for you to expect a useful performance improvement if it were inlined, then the compiler is aware of some implementation issue that you're not, one that would actually make it worse. In those gray cases where the compiler might go either way and you're just over some threshold, the performance difference isn't likely to be significant anyway.
Further, some programmers (myself included) can be lazy and deliberately abuse inline as a convenient way to put implementation in a header file, getting around the ODR, even though they know those functions are large and that it would be disastrous if the compiler (were required to) actually inline them. This doesn't preclude a forced-inline keyword/notation though... it just explains why it's hard to change the expectations around the current inline keyword.
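The header trick looks like this. It is just a sketch, and util.hpp and clamp_to_byte are hypothetical names used for illustration.

```cpp
// --- contents of a hypothetical header, util.hpp ---
#ifndef UTIL_HPP
#define UTIL_HPP

// Without `inline`, including this header from two .cpp files would define
// clamp_to_byte twice, and the link would typically fail with a
// "multiple definition" error. With `inline`, identical definitions in
// several translation units are allowed, whether or not the compiler
// actually expands any of the calls in place.
inline int clamp_to_byte(int v) {
    if (v < 0) return 0;
    if (v > 255) return 255;
    return v;
}

#endif  // UTIL_HPP
```

Here the keyword is about linkage rules, not about performance, which is why a "must inline" meaning could not simply be bolted onto it.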
Or I can ask why is inline keyword is just a hint?
Because you "might" know better than the compiler.
Most of the time, for functions not marked inline (and correctly declared/defined), the compiler, depending on its configuration and implementation, will itself evaluate whether the function can be inlined or not.
For example, most compilers will automatically inline member functions that are fully defined in the header, if the code isn't too long and/or too complex. After all, since the function is available in the header, why not inline it as much as possible?
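As a small illustration (Counter is a made-up class), member functions defined inside the class body are implicitly inline in C++, so no keyword is needed for the compiler to consider them:

```cpp
// Would normally live in a header, e.g. a hypothetical counter.hpp.
class Counter {
public:
    // Defined inside the class body: implicitly inline, and trivially small,
    // so an optimizing compiler will usually expand these at the call site.
    int value() const { return value_; }
    void increment() { ++value_; }

private:
    int value_ = 0;
};

// The accessor calls below typically compile down to plain loads and adds
// when optimization is enabled.
int count_twice() {
    Counter c;
    c.increment();
    c.increment();
    return c.value();
}
```

Whether the calls really disappear still depends on the optimization level, as the Debug/Release difference shows.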
However, this doesn't happen, for example, in Debug mode in Visual Studio: in Debug, the debug information still needs to map to the binary code of the functions, so the compiler avoids inlining, but it will still inline functions marked inline, because the user requested it. That's useful if you want to mark functions for which you don't need debug-time information (like simple getters) while getting better performance at debug time.
In Release mode (by default) the compiler will aggressively inline everything it can, making it harder to debug some parts of the code even if you enable debugging information.
So the general idea is that if you code in a way that helps the compiler inline, it will inline as much as it can. If you write your code in ways that are hard or impossible to inline, it will avoid doing so. If you mark something inline, you just tell the compiler that if it finds it hard but not impossible to inline, it should inline it.
As inlining depends on both contexts of the caller and the callee, there is no "rule".
What's often advised is to simply avoid explicitly marking functions inline, except in two cases:
if you need to put a function definition in a header, it has to be marked inline; this is often the case for template (member or not) functions, and other utility functions that are just shortcuts;
if you want a specific compiler to behave in a specific way at compile time, like marking some member functions inline so that they are inlined even in the Debug configuration on Visual Studio compilers, for example.
Does the compiler always do it better than the programmer?
No, that's why using the inline keyword can sometimes help. The programmer can sometimes have a better general view of what's necessary than the compiler. For example, if the programmer wants the binary to be as small as possible, then, depending on the code, inlining can be harmful. In applications where speed is required, inlining aggressively can help very much. How would the compiler know what's required? It has to be configured and told, in a fine-grained way, what is really wanted to be inline.
Mistaken assumption.
There is a way. It's spelled #define. And for many early C projects, that was good enough. inline was sufficiently different (a hint, with better semantics) that it could be added alongside macros. But once you had both, there was little room left for a third option in between, one with the nicer semantics but non-optional.
If you really need to force the inlining of a function (why?), you can do it: copy the code and paste it, or use a macro.
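To see what "better semantics" buys you, compare a macro with an inline function when the argument has a side effect. This is a minimal sketch; all names in it are made up.

```cpp
// Macro: pure text substitution, so the argument expression is pasted twice.
#define SQUARE_MACRO(x) ((x) * (x))

// Inline function: the argument is evaluated exactly once and type-checked.
inline int square_inline(int x) { return x * x; }

// Helper with a visible side effect: bumps the counter on every call.
inline int next_value(int& counter) { return ++counter; }

// How many times does the argument get evaluated through the macro?
int evaluations_with_macro() {
    int calls = 0;
    (void)SQUARE_MACRO(next_value(calls));  // expands to two next_value calls
    return calls;
}

// And through the function?
int evaluations_with_function() {
    int calls = 0;
    (void)square_inline(next_value(calls));  // one next_value call
    return calls;
}
```

The macro version is always expanded in place, but it evaluates its argument twice; the function version evaluates it once and leaves the inlining decision to the compiler.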

Making a long function inline

Suppose I have a 10-line function. If I add the inline keyword, let's say there is a 50% chance that the compiler will inline it.
If I have a 2-line function, there might be a 90% chance that it will be inlined.
Can I split the code of the 10-line function into 5 functions to give it a better chance of being inlined?
There may be a reason why the compiler isn't inlining it, possibly something to look at. In addition, the function call overhead becomes less of an issue with longer functions, so inlining them may not be as important (if that's your only reason).
Splitting the function into 5 small functions will just make a mess of your code, and possibly confuse the compiler and end up with it not inlining anything. I would not recommend that.
Depending on your C++ compiler, you may be able to force it to inline the function. Visual C++ has the __forceinline attribute, as well as a setting for how inlining should be handled and how often it should be used in the project settings. As Tony mentions, the GCC equivalent is __attribute__((always_inline)).
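If you do go down that road, the two spellings are usually hidden behind a macro. A minimal sketch follows; FORCE_INLINE and add_one are names chosen for the example, not something either compiler provides.

```cpp
// Pick the compiler-specific spelling; fall back to the plain hint elsewhere.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline
#endif

// Small helper that we insist on having expanded at each call site.
FORCE_INLINE int add_one(int x) { return x + 1; }

int add_three(int x) {
    return add_one(add_one(add_one(x)));
}
```

Even with these spellings the compiler can still refuse in some situations, so checking the disassembly remains the only proof.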
You may also be able to use some preprocessor trickery to inline the code itself, but I would not typically recommend that.
If it makes the code more readable, go for it. If not, trust the compiler and don't go messing up your code on the off chance that it'll help. The compiler's a lot smarter than you think, and generally knows better than you do when inlining will help -- and when it won't, or worse, will break stuff.