Does GCC, when compiling C++ code, ever try to optimize for speed by choosing to inline functions that are not marked with the inline keyword?
Yes. Any compiler is free to inline any function whenever it thinks it is a good idea. GCC does that as well.
At -O2 optimization level the inlining is done when the compiler thinks it is worth doing (a heuristic is used) and if it will not increase the size of the code. At -O3 it is done whenever the compiler thinks it is worth doing, regardless of whether it will increase the size of the code. Additionally, at all levels of optimization (enabled optimization that is), static functions that are called only once are inlined.
As noted in the comments below, these -Ox are actually compound settings that envelop multiple more specific settings, including inlining-related ones (like -finline-functions and such), so one can also describe the behavior (and control it) in terms of those more specific settings.
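As a small illustration of the answer above, here is a minimal sketch. The names square and sum_of_squares are made up for this example. Neither function is marked inline; when the file is built with something like g++ -O2 -S, GCC will typically emit no call to square inside sum_of_squares, although the exact result depends on the compiler version and flags.

```cpp
// Hypothetical example: no function here is marked `inline`.
// `static` gives square() internal linkage, so the compiler does not
// need to keep an out-of-line copy for other translation units.
static int square(int x) {
    return x * x;
}

// With optimization enabled, the two calls below are likely candidates
// for automatic inlining; nothing in the source requests it.
int sum_of_squares(int a, int b) {
    return square(a) + square(b);
}
```

Compiling the same file with -fno-inline (or at -O0) should leave the calls in place, which makes the difference easy to spot in the assembly.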
Yes, especially if you have a high level of optimizations enabled.
There is a flag you can provide to the compiler to disable this: -fno-inline-functions.
If you use '-finline-functions' or '-O3' it will inline functions. You can also use '-finline-limit=N' to tune how much inlining it does.
Yes, it does, although it will also generate a non-inlined function body for non-static non-inline functions as this is needed for calls from other translation units.
For inline functions, it is an error to fail to provide a function body if the function is used in any particular translation unit so this isn't a problem.
Related
Let's imagine a blah.h header file that contains:
// A declaration without any code. We force inline
__attribute__((always_inline)) void inline_func();
And a blah.cpp source file that contains:
#include "blah.h"
// The code of the inline function
void inline_func() {
...
}
// Use the inline function
void foo() {
inline_func();
}
The question is, will the compiler actually inline the inline_func()? Should the code be with the declaration or they can be separate?
Assume no LTO
Note the (GCC) force inline decoration in inline_func()
Inlining is a two-step process:
* Is it possible?
* Is it worthwhile?
The first step is fairly trivially decided by the compiler, the second is a far more complex heuristic. Thus it makes sense to only consider the benefits of possible optimizations.
always_inline means that the second step is ignored. It does not affect the first consideration. Now, you've also stated that LTO is disabled, which means that first consideration, the ability for inlining, is restricted. This shows that LTO and always_inline are pretty unrelated since they affect two different inlining considerations.
Not that LTO matters for your example anyway. The two functions under consideration are in the same Translation Unit. There appear to be no other restrictions such as recursion, library calls, or other observable side effects. That means it should be possible to inline, and since that's the only consideration, it should be inlined.
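A single-file sketch of the situation in the question, assuming GCC or Clang. The function add_one, its int signature and its body are invented for illustration, since the question elides the real body. The inline keyword is an addition made here because GCC may otherwise warn that an always_inline function might not be inlinable.

```cpp
// Stand-in for blah.h: declaration only, with forced inlining requested.
// (GCC/Clang attribute syntax; hypothetical function for illustration.)
__attribute__((always_inline)) inline int add_one(int x);

// Stand-in for blah.cpp: the definition lives in the same translation
// unit as the caller, so the body is visible when the call is compiled.
inline int add_one(int x) {
    return x + 1;
}

// The call site: with the body visible in this translation unit,
// always_inline can be honored without LTO.
int use_add_one(int x) {
    return add_one(x);
}
```

If the definition were moved to a different .cpp file and LTO stayed off, the compiler would have no body to expand at the call site.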
You need to have the body available at the time the inlining is supposed to happen.
I.e. if you have the following files:
inlineFunc.h
inlineFunc.c
main.c
And you compile with:
compile inlineFunc.c
compile main.c
link inlineFunc.o main.o yourCoolProgram
there is no way that inlineFunc gets inlined in main.c; however, calls to inlineFunc in inlineFunc.c can be inlined.
As Paolo mentioned, inline is only a hint to the compiler; however, some compilers also have ways to force the inlining, e.g. for gcc you can use __attribute__((always_inline)). Take a look here for a discussion on how gcc handles inlining.
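One common way to make the body available in every translation unit, at least in C++, is to define the function in the header and mark it inline. A minimal sketch, where the file name in the comment and the function clamp_to_byte are hypothetical:

```cpp
// inlineFunc.h (hypothetical): included by every .cpp file that calls it.
#ifndef INLINE_FUNC_H
#define INLINE_FUNC_H

// `inline` permits this definition to appear in several translation
// units without violating the one-definition rule, and it puts the body
// in front of the compiler wherever the header is included.
inline int clamp_to_byte(int v) {
    if (v < 0) return 0;
    if (v > 255) return 255;
    return v;
}

#endif // INLINE_FUNC_H
```

Whether a given call is actually expanded is still up to the compiler's heuristics; the header only makes it possible.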
An interesting side note:
The warning is issued in this case as the definition of foo() is not
available when main() is compiled. However, with -O2 or better
optimizations, gcc performs a kind of "backward inlining", meaning
that even function definitions that are further ahead in the source
file can be embedded into a caller. Consequently, the warning
disappears as soon as one uses at least -O2 optimizations. Is there
any specific option responsible for this behavior? I would like to
enable "backward inlining" even with -O1 or -O0.
Well, it depends. In your example, it will be inlined, because the definition of the function is in the same translation unit where it is used.
Otherwise, if no LTO is possible, and at compile time the definition of the function is not available to the compiler, then no, the function will not be inlined.
Prior answer
The answer is: it depends. It depends on the compiler, and it may depend on compiler's configuration(1)(2) too.
See also inline description at cppreference.com (quoted below):
The intent of the inline keyword is to serve as an indicator to the optimizer that inline substitution of the function is preferred over function call, that is, instead of executing the call CPU instruction to transfer control to the function body, a copy of the function body is executed without generating the call. This avoids extra overhead created by the function call (copying the arguments and retrieving the result) but it may result in a larger executable as the code for the function has to be repeated multiple times.
Since this meaning of the keyword inline is non-binding, compilers are free to use inline substitution for any function that's not marked inline, and are free to generate function calls to any function marked inline. Those choices do not change the rules regarding multiple definitions and shared statics listed above.
In gcc, using
-E gives the preprocessed code;
-S, the assembly code;
-c, the code compiled, but not linked.
Is there anything close to a -I, that would allow me to see whether a function has been inlined or not, i.e., to see the code expanded, as though inline functions were preprocessed macros?
If not, should I get my way through the assembly code, or are the inline applications performed later?
I think examining the assembly code is the best (and pretty much the only) way to see what's been inlined.
Bear in mind that, in certain circumstances, some inlining can take place at link time. See Can the linker inline functions?
You can use the -Winline option to get a warning when a function that was declared inline could not be inlined.
Quoted from http://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#Warning-Options
-Winline
Warn if a function that is declared as inline cannot be inlined. Even with this option, the compiler does not warn about failures to inline functions declared in system headers.
The compiler uses a variety of heuristics to determine whether or not to inline a function. For example, the compiler takes into account the size of the function being inlined and the amount of inlining that has already been done in the current function. Therefore, seemingly insignificant changes in the source program can cause the warnings produced by -Winline to appear or disappear.
However, inline is not a command: whether a function is actually inlined (even when it is declared inline) is decided by the compiler. It may consider the size of the function being inlined and how much inlining has already been done in the current function.
The best way to see whether a function has really been inlined is to check the assembly code. For example, you can use
gcc -O2 -S -c foo.c
to generate assembly code for foo.c and output assembly code file foo.s.
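To make the assembly easier to read, it can help to compare against a function that is deliberately kept out of line. A sketch assuming GCC or Clang: noinline is a compiler-specific attribute, and the function names are made up for this example.

```cpp
// Kept out of line on purpose, so a call to it should remain visible
// in the generated assembly.
__attribute__((noinline)) int kept_out_of_line(int x) {
    return x * 3;
}

// No annotation: at -O2 the compiler is free to inline this one.
static int maybe_inlined(int x) {
    return x + 7;
}

// Caller to inspect in the .s output.
int compute(int x) {
    return kept_out_of_line(x) + maybe_inlined(x);
}
```

After running g++ -O2 -S on this file, searching the .s output for call instructions inside compute would typically show a call to kept_out_of_line and none to maybe_inlined, though results vary by compiler version.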
The issue here is that generally inlining is a link time optimization (when there are multiple object files), as the compiler simply doesn't see the implementation of functions in other object files until link time.
Hence in multi object file compiles your best shot is to inspect the generated assembly. Within every single object file, inlining is possible, assuming the function to be inlined is in the same compilation unit; however, most compilers do not do inlining at this point, as the compiler doesn't know where this function may be called from, and whether it should itself be inlined.
So in general inlining is performed on linking; however, for very small functions, it can and should be done at compilation time.
Also I do believe that if you compile your code with clang/llvm, you'll get a c output file with inlining, tho I haven't tried it.
Do note that in order to get GCC to do the link time optimization (including inlining) you'll have to provide it with an argument, I think it's -flto.
An alternative is to have all your inline functions visible in all of your compilation units (e.g. in the header files); this will usually ensure inlining, in order to avoid multiple declarations of the same function in different object files.
Also an easy way to check the assembly for inlining is to compare the number of calls to the function in the source code to the number of calls in the assembly.
C++ allows you to annotate functions with the inline keyword. From what I understand, this provides a hint (but no obligation) to the compiler to inline the function, thereby avoiding the small function calling overhead.
I have some methods which are called so often that they really should be inlined. But inline-annotated functions need to be implemented in the header, so this makes the code less well-arranged. Also, I think that inlining is a compiler optimization that should happen transparently to the programmer where it makes sense.
So, do I have to annotate my functions with inline for inlining to happen, or does GCC figure this out without the annotation when I compile with -O3 or other appropriate optimization flags?
inline being just a suggestion to the compiler is not true and is misleading. There are two possible effects of marking a function inline:
1. Substitution of the function definition inline at the point where the function call was made, and
2. Certain relaxations w.r.t. the One Definition Rule, allowing you to define functions in header files.
A compiler may or may not perform #1 but it has to abide by #2. So inline is not just a suggestion. There are some rules which will be applied once a function is marked inline.
As a general guideline, do not mark your functions inline just for sake of optimizations. Most modern compilers will perform these optimizations on their own without your help. Mark your functions inline if you wish to include them in header files because it is the only correct way to include a function definition in header file without breaking the ODR.
Common folklore is that gcc always decides on its own (based on some cost heuristics) whether to inline something or not (depending on the compiler/linker options, it can even do so at link time). You can observe this sometimes when using -Winline where gcc warns that an inline hint was ignored, it often even gives a reason.
If you want to know exactly what is going on, you probably have to read the source code of it, or take the word of someone who read it.
I was told long ago to make short functions/methods that are called often inline, by using the keyword inline and writing the body in the header file.
This was to optimize the code so there would be no overhead for the actual function call.
How does it look with that today? Do modern compilers (Visual Studio 2010's in this case) inline such short functions automatically or is it still "necessary" to do so yourself?
inline has always been a hint to the compiler, and these days compilers for the most part make their own decisions in this regard (see register).
In order to expand a function inline, the compiler has to have seen the definition of that function. For functions that are defined and used in only one translation unit, that's no problem: put the definition somewhere before it's used, and the compiler will decide whether to inline the function.
For functions that are used in more than one translation unit, in order for the compiler to see the definition of the function, the definition has to go in a header file. When you do that, you need to mark the function inline to tell the compiler and linker that it's okay that there's more than one definition of that function. (well, I suppose you could make the function static, but then you could end up wasting space with multiple copies)
Enable warning C4710, this will warn you if a function which you define as inline is not inlined by the compiler.
Enable warning C4711, this will warn you if the compiler inlines a function not designated for inlining.
The combination of these two warnings will give you a better understanding of what the compiler is actually doing with your code and possibly whether it is worth designating inline functions manually or not.
Generally speaking, the inline keyword is used more now to allow you to "violate" the one definition rule when you define a function in a header than to give the compiler a hint about inlining. Many compilers are getting really good at deciding when to inline functions or not, as long as the function body is visible at the point of call.
Of course if you define the function only in a source file non-inline, the compiler will be able to inline it in that one source file but not in any other translation unit.
Inlining may be done by a compiler in the following situations:
You marked the function as inline and
it's defined in the current translation unit or in a file that is included in it;
compiler decides that it's worth doing so. According to the MSDN
The inline keyword tells the compiler that inline expansion is preferred.
The compiler treats the inline expansion options and keywords as suggestions.
You used the __forceinline keyword (or __attribute__((always_inline)) in gcc). This will make the compiler skip some checks and do the inlining for you.
The __forceinline keyword overrides the cost/benefit analysis and relies on the judgment of the programmer instead.
Microsoft compiler can also perform cross module inlining if you have turned on link time code generation by passing /GL flag to the compiler or /LTCG to the linker. It's quite clever in making such optimizations: try to examine the assembly code of modules compiled with /LTCG.
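Because the forced-inline spelling differs between compilers, projects often hide it behind a macro. A sketch of that idiom follows; the macro name FORCE_INLINE and the function midpoint are choices made for this example, not something either compiler defines.

```cpp
// Pick the compiler-specific spelling of "force inline".
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline  // fall back to the plain hint
#endif

// Hypothetical hot helper that the programmer wants expanded at each call.
FORCE_INLINE int midpoint(int lo, int hi) {
    return lo + (hi - lo) / 2;
}
```

The fallback branch degrades to an ordinary inline function, so the code still compiles on a compiler that has neither extension.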
Please note, that inlining will never happen if your function is:
a recursive one;
called through a pointer to it.
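The two cases in the list above look like this in code. This is only an illustration of the shapes involved, with invented names. Optimizers can sometimes still transform such calls, for instance when the pointer's target is known at compile time, so "never" is best read as "not in general".

```cpp
// Case 1: a recursive function. A full inline expansion is impossible
// in general because the depth depends on the run-time argument.
unsigned factorial(unsigned n) {
    return n <= 1 ? 1u : n * factorial(n - 1);
}

// Case 2: a call through a pointer. The callee is only known at run
// time unless the compiler can prove which function the pointer holds.
int negate(int x) { return -x; }

int apply(int (*fn)(int), int x) {
    return fn(x);
}
```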
Yes, modern compilers will (depending on various configuration options) automatically choose to inline functions, even if they're in the source (not header) file. Using the inline directive can give a hint.
Aside from your main point (what amount of placing inline instructions in code is useful as of compilers today), keep in mind that inline functions are just a hint to the compiler, and are not necessarily being compiled as inline.
In short, yes, compilers will decide whether or not your function becomes inline. You can check this question:
Does the compiler decide when to inline my functions (in C++)?
In C++, the keyword "inline" serves two purposes. First, it allows a definition to appear in multiple translation units. Second, it's a hint to the compiler that a function should be inlined in the compiled code.
My question: in code generated by GCC and Clang/LLVM, does the keyword "inline" have any bearing on whether a function is inlined? If yes, in what situations? Or is the hint completely ignored? Note this is not a language question, it is a compiler-specific question.
[Caveat: not a C++/GCC guru] You'll want to read up on inline here.
Also, this, for GCC/C99.
The extent to which
suggestions made by using the inline
function specifier are effective (C99
6.7.4).
GCC will not inline any functions if the -fno-inline option is
used or if -O0 is used. Otherwise, GCC
may still be unable to inline a
function for many reasons; the
-Winline option may be used to determine if a function has not been
inlined and why not.
So it appears that unless your compiler settings (like -fno-inline or -O0) are used, the compiler takes the hint. I can't comment on Clang/LLVM (or GCC really).
I recommend using -Winline if this isn't a code-golf question and you need to know what's going on.
An interesting explanation from gcc: An Inline Function is As Fast As a Macro:
Some calls cannot be integrated for
various reasons (in particular, calls
that precede the function's definition
cannot be integrated, and neither can
recursive calls within the
definition). If there is a
nonintegrated call, then the function
is compiled to assembler code as
usual. The function must also be
compiled as usual if the program
refers to its address, because that
can't be inlined.
Note that certain usages in a function
definition can make it unsuitable for
inline substitution. Among these
usages are: use of varargs, use of
alloca, use of variable sized data
types (see Variable Length), use of
computed goto (see Labels as Values),
use of nonlocal goto, and nested
functions (see Nested Functions).
Using -Winline will warn when a
function marked inline could not be
substituted, and will give the reason
for the failure.
As required by ISO C++, GCC considers
member functions defined within the
body of a class to be marked inline
even if they are not explicitly
declared with the inline keyword. You
can override this with
-fno-default-inline; see Options Controlling C++ Dialect.
GCC does not inline any functions when
not optimizing unless you specify the
`always_inline' attribute for the
function, like this:
/* Prototype. */
inline void foo (const char) __attribute__((always_inline));
The remainder of this section is specific
to GNU C90 inlining.
When an inline function is not static,
then the compiler must assume that
there may be calls from other source
files; since a global symbol can be
defined only once in any program, the
function must not be defined in the
other source files, so the calls
therein cannot be integrated.
Therefore, a non-static inline
function is always compiled on its own
in the usual fashion.
If you specify both inline and extern
in the function definition, then the
definition is used only for inlining.
In no case is the function compiled on
its own, not even if you refer to its
address explicitly. Such an address
becomes an external reference, as if
you had only declared the function,
and had not defined it.
This combination of inline and extern
has almost the effect of a macro. The
way to use it is to put a function
definition in a header file with these
keywords, and put another copy of the
definition (lacking inline and extern)
in a library file. The definition in
the header file will cause most calls
to the function to be inlined. If any
uses of the function remain, they will
refer to the single copy in the
library.
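The quoted point about member functions can be illustrated with a short C++ sketch (the Counter class is hypothetical). A member function defined inside the class body is implicitly inline, while one defined outside the class needs the keyword if its definition is to live in a header:

```cpp
struct Counter {
    int value = 0;

    // Defined in the class body: implicitly inline, no keyword needed.
    void increment() { ++value; }

    // Declared here, defined below.
    int get() const;
};

// Defined outside the class body: must be marked inline explicitly if
// this definition is placed in a header included by several files.
inline int Counter::get() const {
    return value;
}
```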
A lot of information can be gathered on this by reading GCC and the LLVM project's code. Here is some information that has been gathered by reading the code directly. (Note: this is not necessarily fully comprehensive and this doesn't list every single way in which inline affects inlining in every single detail, it's only to give an overview of most of it)
This information was gathered from GCC and LLVM's current development HEAD as of 2021/11/13, so it might not be up to date in the future.
On GCC's side:
GCC will not consider functions not declared inline for early-inlining when neither -finline-small-functions (implied by -O2) nor -finline-functions (implied by -O2) is specified
GCC will not consider functions not declared inline when trying to inline a function optimized for size when the caller isn't and inlining wouldn't shrink the caller
GCC's devirtualization time bonus calculations will consider functions declared inline to be slightly better unless they are also under the inline function instructions threshold (i.e. max-inline-insns-auto, which is explained later) (the code is complicated so I'm being kind of vague here about what exactly it does)
GCC's inlining use "badness" as a heuristic to determine which edges to inline first. A function declared inline will have their "badness" value divided by 8, such that they will be much more likely to be inlined first
GCC's insns limit (i.e. the "instruction count" heuristic that tries to estimate how many instructions a function will take and compare that to the limit) will use a parameter value called max-inline-insns-auto in many more cases when the function is not declared inline (although I can see cases where this isn't the case, such as when GCC's heuristics consider that the "speedup seems big", for example): this value is by default 15, whereas the max-inline-insns-single value is used when inline is specified, and has a default value of 70, which implies that a function declared inline can be inlined when it is up to about 4.7 times bigger than GCC would consider for a function not declared inline (note: the previous hyperlink is only the most important example of those limits being used, they are applied in some other places too)
GCC will not split a function if it is not declared inline and -finline-small-functions is not specified (implied by -O2)
Also note that GCC will normally not investigate inlining non-inline functions at all without -finline-functions (included in -O2), unless they are very short, in which case it only requires -finline-small-functions (included in -O2)
There are a few things in the code I wasn't able to fully comprehend but seem to affect inlining in some ways
On LLVM's side:
A function declared inline will have the inlinehint attribute attached to the LLVM bytecode function by Clang
When trying to determine the threshold for inlining, it will use the inlinehint-threshold value for inlinehint functions, which has a default of 325, whereas otherwise it would use other things (such as whether the callsite is hot or cold and other things like that), and if nothing else is found, will use the inlinedefault-threshold value, which has a default of 225. This value is compared to a heuristic calculated by LLVM which will calculate for each function call how expensive inlining a given function would be, and if this value is smaller than the threshold, the function will be inlined, which means that inlinehint will essentially diminish the cost of inlining a function by ~1.44 as seen by LLVM
When calculating a function's "synthetic entry count", inlinehint will make LLVM use the inline-synthetic-count, which has a default of 15, as the initial value (having a higher count is beneficial to inlining in some form, idk exactly how but it does), whereas it would otherwise use 0
In other words, it looks like this is quite the strong hint, on both GCC and Clang.
How strong the hint is depends entirely on the compile options you use. Most compilers have options to do no inlining, only inline those marked 'inline', or use its best judgement and ignore the hints.
The last one probably works best. :-)