Let's imagine a blah.h header file that contains:
// A declaration without any code. We force inline
__attribute__((always_inline)) void inline_func();
And a blah.cpp source file that contains:
#include "blah.h"
// The code of the inline function
void inline_func() {
...
}
// Use the inline function
void foo() {
inline_func();
}
The question is, will the compiler actually inline inline_func()? Must the code accompany the declaration, or can the two be separate?
Assume no LTO
Note the (GCC) force inline decoration in inline_func()
Inlining is a two-step process:
* Is it possible?
* Is it worthwhile?
The first step is fairly trivially decided by the compiler, the second is a far more complex heuristic. Thus it makes sense to only consider the benefits of possible optimizations.
always_inline means that the second step is ignored. It does not affect the first consideration. Now, you've also stated that LTO is disabled, which means that first consideration, the ability for inlining, is restricted. This shows that LTO and always_inline are pretty unrelated since they affect two different inlining considerations.
Not that LTO matters for your example anyway. The two functions under consideration are in the same Translation Unit. There appear to be no other restrictions such as recursion, library calls, or other observable side effects. That means it should be possible to inline, and since that's the only consideration, it should be inlined.
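A minimal sketch of the same-translation-unit case described above, assuming GCC or Clang. The function is given an int parameter and return value purely for illustration (the question's version is void), and the inline keyword is added next to the attribute so GCC does not warn:

```cpp
// blah.h (hypothetical contents): declaration only, carrying the
// GCC/Clang force-inline attribute.
__attribute__((always_inline)) inline int inline_func(int x);

// blah.cpp: the definition lives in the same translation unit as the caller.
inline int inline_func(int x) {
    return x + 1;
}

// The body of inline_func is visible at this point, so the compiler is
// able to substitute it here instead of emitting a call.
int foo(int x) {
    return inline_func(x);
}
```

If the definition were moved to a different .cpp file and LTO stayed off, the call in foo() could not be inlined, whatever the attribute says.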
You need to have the body available at the time the inlining is supposed to happen.
I.e. if you have the following files:
inlineFunc.h
inlineFunc.c
main.c
And you compile with:
compile inlineFunc.c
compile main.c
link inlineFunc.o main.o yourCoolProgram
there is no way that inlineFunc gets inlined in main.c; however, calls to inlineFunc in inlineFunc.c can be inlined.
As Paolo mentioned, inline is only a hint to the compiler; however, some compilers also have ways to force the inlining, e.g. for gcc you can use __attribute__((always_inline)). Take a look here for a discussion on how gcc handles inlining.
An interesting side note:
The warning is issued in this case as the definition of foo() is not
available when main() is compiled. However, with -O2 or better
optimizations, gcc performs a kind of "backward inlining", meaning
that even function definitions that are further ahead in the source
file can be embedded into a caller. Consequently, the warning
disappears as soon as one uses at least -O2 optimizations. Is there
any specific option responsible for this behavior? I would like to
enable "backward inlining" even with -O1 or -O0.
Well, it depends. In your example, it will be inlined, because the definition of the function is in the same translation unit where it is used.
Otherwise, if no LTO is possible, and at compile time the definition of the function is not available to the compiler, then no, the function will not be inlined.
Prior answer
The answer is: it depends. It depends on the compiler, and it may depend on compiler's configuration(1)(2) too.
See also inline description at cppreference.com (quoted below):
The intent of the inline keyword is to serve as an indicator to the optimizer that inline substitution of the function is preferred over function call, that is, instead of executing the call CPU instruction to transfer control to the function body, a copy of the function body is executed without generating the call. This avoids extra overhead created by the function call (copying the arguments and retrieving the result) but it may result in a larger executable as the code for the function has to be repeated multiple times.
Since this meaning of the keyword inline is non-binding, compilers are free to use inline substitution for any function that's not marked inline, and are free to generate function calls to any function marked inline. Those choices do not change the rules regarding multiple definitions and shared statics listed above.
In gcc, using
-E gives the preprocessed code;
-S, the assembly code;
-c, the code compiled, but not linked.
Is there anything close to a -I, that would allow me to see whether a function has been inlined or not, i.e., to see the code expanded, as though inline functions were preprocessed macros?
If not, should I get my way through the assembly code, or are the inline applications performed later?
I think examining the assembly code is the best (and pretty much the only) way to see what's been inlined.
Bear in mind that, in certain circumstances, some inlining can take place at link time. See Can the linker inline functions?
You can use the -Winline option to see whether a function that was declared inline could not be inlined.
Quoted from http://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#Warning-Options
-Winline
Warn if a function that is declared as inline cannot be inlined. Even with this option, the compiler does not warn about failures to inline functions declared in system headers.
The compiler uses a variety of heuristics to determine whether or not to inline a function. For example, the compiler takes into account the size of the function being inlined and the amount of inlining that has already been done in the current function. Therefore, seemingly insignificant changes in the source program can cause the warnings produced by -Winline to appear or disappear.
However, inline is not a command; whether a function declared inline is actually inlined is decided by the compiler. It may consider the size of the function being inlined and how much inlining has already been done in the current function.
The best way to see whether a function has really been inlined is to check the assembly code. For example, you can use
gcc -O2 -S -c foo.c
to generate assembly code for foo.c and output assembly code file foo.s.
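As a rough sketch of what to look for in that assembly output, the hypothetical file below contrasts a function that GCC/Clang are told not to inline with a small one they are free to inline. The function names are made up for illustration, and the noinline attribute is GCC/Clang-specific:

```cpp
// Compile with something like:  g++ -O2 -S example.cpp -o example.s
// and then search example.s for call instructions inside caller().

// Attribute that forbids inlining; used here only to have a call to compare against.
__attribute__((noinline)) int kept_as_call(int x) {
    return x * 3;
}

// Small function with internal linkage; an optimizer is likely,
// but not guaranteed, to expand it inline at -O2.
static int likely_inlined(int x) {
    return x + 7;
}

int caller(int x) {
    return kept_as_call(x) + likely_inlined(x);
}
```

In the resulting assembly one would normally expect caller to still contain a call to (a mangled form of) kept_as_call, and no call to likely_inlined. The exact output depends on the compiler version and flags.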
The issue here is that inlining is generally a link-time optimization (when there are multiple object files), as the compiler simply doesn't see the implementation of functions in other object files until link time.
Hence in multi-object-file builds your best shot is to inspect the generated assembly. Within a single object file, inlining is possible, assuming the function to be inlined is in the same compilation unit; however, most compilers do not do inlining at this point, as the compiler doesn't know where this function may be called from, and whether it should itself be inlined.
So in general inlining is performed on linking, however for very small functions, it can and should be done at compilation time.
Also I do believe that if you compile your code with clang/llvm, you'll get a C output file with inlining, though I haven't tried it.
Do note that in order to get GCC to do the link-time optimization (including inlining) you'll have to provide it with an argument; I think it's -flto.
An alternative is to have all your inline functions visible in all of your compilation units (e.g. in the header files); this will usually ensure inlining, in order to avoid multiple definitions of the same function in different object files.
Also an easy way to check the assembly for inlining is to compare the number of calls to the function in the source code to the number of calls in the assembly.
C++ allows you to annotate functions with the inline keyword. From what I understand, this provides a hint (but no obligation) to the compiler to inline the function, thereby avoiding the small function calling overhead.
I have some methods which are called so often that they really should be inlined. But inline-annotated functions need to be implemented in the header, so this makes the code less well-arranged. Also, I think that inlining is a compiler optimization that should happen transparently to the programmer where it makes sense.
So, do I have to annotate my functions with inline for inlining to happen, or does GCC figure this out without the annotation when I compile with -O3 or other appropriate optimization flags?
Saying that inline is just a suggestion to the compiler is not true and is misleading. There are two possible effects of marking a function inline:
1. Substitution of the function definition inline at the place where the function call was made, and
2. Certain relaxations w.r.t. the One Definition Rule, allowing you to define functions in header files.
A compiler may or may not perform #1, but it has to abide by #2. So inline is not just a suggestion. There are some rules which will be applied once a function is marked inline.
As a general guideline, do not mark your functions inline just for sake of optimizations. Most modern compilers will perform these optimizations on their own without your help. Mark your functions inline if you wish to include them in header files because it is the only correct way to include a function definition in header file without breaking the ODR.
Common folklore is that gcc always decides on its own (based on some cost heuristics) whether to inline something or not (depending on the compiler/linker options, it can even do so at link time). You can observe this sometimes when using -Winline where gcc warns that an inline hint was ignored, it often even gives a reason.
If you want to know exactly what is going on, you probably have to read the source code of it, or take the word of someone who read it.
I was told long ago to make short functions/methods that are called often inline, by using the keyword inline and writing the body in the header file.
This was to optimize the code so there would be no overhead for the actual function call.
How does it look with that today? Do modern compilers (Visual Studio 2010's in this case) inline such short functions automatically or is it still "necessary" to do so yourself?
inline has always been a hint to the compiler, and these days compilers for the most part make their own decisions in this regard (see register).
In order to expand a function inline, the compiler has to have seen the definition of that function. For functions that are defined and used in only one translation unit, that's no problem: put the definition somewhere before it's used, and the compiler will decide whether to inline the function.
For functions that are used in more than one translation unit, in order for the compiler to see the definition of the function, the definition has to go in a header file. When you do that, you need to mark the function inline to tell the compiler and linker that it's okay that there's more than one definition of that function. (well, I suppose you could make the function static, but then you could end up wasting space with multiple copies)
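A small sketch of that header pattern, shown as a single file for brevity. The header name, include guard and function names are hypothetical:

```cpp
// square.h (hypothetical header)
#ifndef SQUARE_H
#define SQUARE_H

// 'inline' allows this definition to appear in every translation unit
// that includes the header without violating the one definition rule.
inline int square(int x) {
    return x * x;
}

#endif  // SQUARE_H

// user.cpp (hypothetical) would start with: #include "square.h"
// Since the body of square() is visible here, the compiler may inline the call.
int area_of_square(int side) {
    return square(side);
}
```

Dropping the inline keyword and including the header from two source files would typically produce a "multiple definition" error at link time.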
Enable warning C4710, this will warn you if a function which you define as inline is not inlined by the compiler.
Enable warning C4711, this will warn you if the compiler inlines a function not designated for inlining.
The combination of these two warnings will give you a better understanding of what the compiler is actually doing with your code and possibly whether it is worth designating inline functions manually or not.
Generally speaking, the inline keyword is used more now to allow you to "violate" the one definition rule when you define a function in a header than to give the compiler a hint about inlining. Many compilers are getting really good at deciding when to inline functions or not, as long as the function body is visible at the point of call.
Of course if you define the function only in a source file non-inline, the compiler will be able to inline it in that one source file but not in any other translation unit.
Inlining may be done by a compiler in the following situations:
You marked the function as inline and
it's defined in the current translation unit or in a file that is included in it;
the compiler decides that it's worth doing so. According to MSDN:
The inline keyword tells the compiler that inline expansion is preferred.
The compiler treats the inline expansion options and keywords as suggestions.
You used the __forceinline keyword (or __attribute__((always_inline)) in gcc). This will make the compiler skip some checks and do the inlining for you.
The __forceinline keyword overrides the cost/benefit analysis and relies on the judgment of the programmer instead.
Microsoft compiler can also perform cross module inlining if you have turned on link time code generation by passing /GL flag to the compiler or /LTCG to the linker. It's quite clever in making such optimizations: try to examine the assembly code of modules compiled with /LTCG.
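Since the force-inline spelling differs between compilers, code that targets both often hides it behind a macro. The following is only a sketch. FORCE_INLINE, clamp_to_byte and brighten are hypothetical names, not part of either compiler:

```cpp
// Pick the compiler-specific spelling of "force inline".
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline  // fall back to the plain hint
#endif

// A tiny helper for which the programmer overrides the cost/benefit analysis.
FORCE_INLINE int clamp_to_byte(int v) {
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}

int brighten(int pixel, int delta) {
    return clamp_to_byte(pixel + delta);
}
```

Even with such a macro, the compiler can only inline calls whose function body it can see.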
Please note, that inlining will never happen if your function is:
a recursive one;
called through a pointer to it.
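The two cases in that list can be illustrated with the sketch below (all names are hypothetical). Note that optimizers sometimes manage to inline an indirect call anyway when they can prove which function the pointer refers to, so treat this as an illustration of the difficulty rather than a guarantee about any particular compiler:

```cpp
// Recursive: the function cannot be fully expanded into itself.
inline int factorial(int n) {
    return n <= 1 ? 1 : n * factorial(n - 1);
}

inline int twice(int x) {
    return 2 * x;
}

// Indirect call: inside apply() the callee is only known as a pointer value.
int apply(int (*fn)(int), int value) {
    return fn(value);
}

int demo_indirect() {
    return apply(&twice, 21);
}
```

Taking the address of twice also forces the compiler to emit an out-of-line copy of it, even if other direct calls are inlined.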
Yes, modern compilers will (depending on various configuration options) automatically choose to inline functions, even if they're in the source (not header) file. Using the inline directive can give a hint.
Aside from your main point (what amount of placing inline instructions in code is useful as of compilers today), keep in mind that inline functions are just a hint to the compiler, and are not necessarily being compiled as inline.
In short, yes, compilers will decide whether or not your function becomes inline. You can check this question:
Does the compiler decide when to inline my functions (in C++)?
In C++, the keyword "inline" serves two purposes. First, it allows a definition to appear in multiple translation units. Second, it's a hint to the compiler that a function should be inlined in the compiled code.
My question: in code generated by GCC and Clang/LLVM, does the keyword "inline" have any bearing on whether a function is inlined? If yes, in what situations? Or is the hint completely ignored? Note this is not a language question, it is a compiler-specific question.
[Caveat: not a C++/GCC guru] You'll want to read up on inline here.
Also, this, for GCC/C99.
The extent to which
suggestions made by using the inline
function specifier are effective (C99
6.7.4).
GCC will not inline any functions if the -fno-inline option is
used or if -O0 is used. Otherwise, GCC
may still be unable to inline a
function for many reasons; the
-Winline option may be used to determine if a function has not been
inlined and why not.
So it appears that unless compiler settings like -fno-inline or -O0 are used, the compiler takes the hint. I can't comment on Clang/LLVM (or GCC really).
I recommend using -Winline if this isn't a code-golf question and you need to know what's going on.
An interesting explanation from gcc: An Inline Function is As Fast As a Macro:
Some calls cannot be integrated for
various reasons (in particular, calls
that precede the function's definition
cannot be integrated, and neither can
recursive calls within the
definition). If there is a
nonintegrated call, then the function
is compiled to assembler code as
usual. The function must also be
compiled as usual if the program
refers to its address, because that
can't be inlined.
Note that certain usages in a function
definition can make it unsuitable for
inline substitution. Among these
usages are: use of varargs, use of
alloca, use of variable sized data
types (see Variable Length), use of
computed goto (see Labels as Values),
use of nonlocal goto, and nested
functions (see Nested Functions).
Using -Winline will warn when a
function marked inline could not be
substituted, and will give the reason
for the failure.
As required by ISO C++, GCC considers
member functions defined within the
body of a class to be marked inline
even if they are not explicitly
declared with the inline keyword. You
can override this with
-fno-default-inline; see Options Controlling C++ Dialect.
GCC does not inline any functions when
not optimizing unless you specify the
`always_inline' attribute for the
function, like this:
/* Prototype. */
inline void foo (const char) __attribute__((always_inline));
The remainder of this section is specific
to GNU C90 inlining.
When an inline function is not static,
then the compiler must assume that
there may be calls from other source
files; since a global symbol can be
defined only once in any program, the
function must not be defined in the
other source files, so the calls
therein cannot be integrated.
Therefore, a non-static inline
function is always compiled on its own
in the usual fashion.
If you specify both inline and extern
in the function definition, then the
definition is used only for inlining.
In no case is the function compiled on
its own, not even if you refer to its
address explicitly. Such an address
becomes an external reference, as if
you had only declared the function,
and had not defined it.
This combination of inline and extern
has almost the effect of a macro. The
way to use it is to put a function
definition in a header file with these
keywords, and put another copy of the
definition (lacking inline and extern)
in a library file. The definition in
the header file will cause most calls
to the function to be inlined. If any
uses of the function remain, they will
refer to the single copy in the
library.
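One of the usages the quoted GCC documentation lists as unsuitable for inline substitution is varargs. A minimal sketch of such a function follows, with hypothetical names. Compiling it with -Winline on GCC would be one way to check whether a warning is reported for it:

```cpp
#include <cstdarg>

// Declared inline, but it uses a variable argument list, which the GCC
// documentation quoted above names as an obstacle to inline substitution.
inline int sum_ints(int count, ...) {
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);  // read the next int argument
    }
    va_end(args);
    return total;
}

int sum_three() {
    return sum_ints(3, 1, 2, 3);
}
```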
A lot of information can be gathered on this by reading GCC and the LLVM project's code. Here is some information that has been gathered by reading the code directly. (Note: this is not necessarily fully comprehensive and this doesn't list every single way in which inline affects inlining in every single detail, it's only to give an overview of most of it)
This information was gathered from GCC's and LLVM's development HEAD as of 2021/11/13, so it might not be up to date in the future.
On GCC's side:
GCC will not consider functions not declared inline for early-inlining when neither -finline-small-functions (implied by -O2) nor -finline-functions (implied by -O2) is specified
GCC will not consider functions not declared inline when trying to inline a function optimized for size when the caller isn't and inlining wouldn't shrink the caller
GCC's devirtualization time bonus calculations will consider functions declared inline to be slightly better unless they are also under the inline function instructions threshold (i.e. max-inline-insns-auto, which is explained later) (the code is complicated so I'm being kind of vague here about what exactly it does)
GCC's inlining uses "badness" as a heuristic to determine which edges to inline first. A function declared inline will have its "badness" value divided by 8, so that it will be much more likely to be inlined first
GCC's insns limit (i.e. the "instruction count" heuristic that estimates how many instructions a function will take and compares that to the limit) will use a parameter value called max-inline-insns-auto in many more cases when the function is not declared inline (although I can see cases where this isn't the case, such as when GCC's heuristics consider that the "speedup seems big", for example): this value is 15 by default, whereas the max-inline-insns-single value, used when inline is specified, has a default of 70. This implies that a function declared inline can be inlined when it is up to about 4.7 times bigger than GCC would accept for a function not declared inline (note: the previous hyperlink is only the most important example of those limits being used; they are applied in some other places too)
GCC will not split a function if it is not declared inline and -finline-small-functions is not specified (implied by -O2)
Also note that GCC will normally not investigate inlining non-inline functions at all without -finline-functions (included in -O2), unless they are very short, in which case it only requires -finline-small-functions (included in -O2)
There are a few things in the code I wasn't able to fully comprehend but seem to affect inlining in some ways
On LLVM's side:
A function declared inline will have the inlinehint attribute attached to the LLVM bytecode function by Clang
When trying to determine the threshold for inlining, it will use the inlinehint-threshold value for inlinehint functions, which has a default of 325, whereas otherwise it would use other things (such as whether the callsite is hot or cold and other things like that), and if nothing else is found, will use the inlinedefault-threshold value, which has a default of 225. This value is compared to a heuristic calculated by LLVM which will calculate for each function call how expensive inlining a given function would be, and if this value is smaller than the threshold, the function will be inlined, which means that inlinehint will essentially diminish the cost of inlining a function by ~1.44 as seen by LLVM
When calculating a function's "synthetic entry count", inlinehint will make LLVM use the inline-synthetic-count, which has a default of 15, as the initial value (having a higher count is beneficial to inlining in some form, idk exactly how but it does), whereas it would otherwise use 0
In other words, it looks like this is quite the strong hint, on both GCC and Clang.
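The ratios quoted above follow directly from the default values given in the text. The sketch below only restates that arithmetic. The constant names are made up, and the defaults may differ in other compiler versions:

```cpp
// Default values as quoted in the text above (assumed, version-dependent).
constexpr int kMaxInlineInsnsSingle   = 70;   // GCC, function declared inline
constexpr int kMaxInlineInsnsAuto     = 15;   // GCC, function not declared inline
constexpr int kInlineHintThreshold    = 325;  // LLVM, inlinehint functions
constexpr int kInlineDefaultThreshold = 225;  // LLVM, fallback threshold

// 70 / 15 is roughly 4.67: size headroom GCC grants to inline-declared functions.
constexpr double gcc_ratio =
    static_cast<double>(kMaxInlineInsnsSingle) / kMaxInlineInsnsAuto;

// 325 / 225 is roughly 1.44: how much higher LLVM's threshold is with inlinehint.
constexpr double llvm_ratio =
    static_cast<double>(kInlineHintThreshold) / kInlineDefaultThreshold;

static_assert(gcc_ratio > 4.6 && gcc_ratio < 4.7, "GCC ratio is about 4.67");
static_assert(llvm_ratio > 1.44 && llvm_ratio < 1.45, "LLVM ratio is about 1.44");
```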
How strong the hint is depends entirely on the compile options you use. Most compilers have options to do no inlining, only inline those marked 'inline', or use its best judgement and ignore the hints.
The last one probably works best. :-)
I had a discussion with Johannes Schaub regarding the keyword inline.
The code there was this:
namespace ... {
static void someFunction() {
MYCLASS::GetInstance()->someFunction();
}
};
He stated that:
Putting this as an inline function may
save code size in the executable
But according to my findings here and here it wouldn't be needed, since:
[Inline] only occurs if the compiler's cost/benefit analysis show it to be profitable
Mainstream C++ compilers like Microsoft Visual C++ and GCC support an option that lets the compilers automatically inline any suitable function, even those not marked as inline functions.
Johannes, however, states that there are other benefits of explicitly specifying it. Unfortunately I do not understand them. For instance, he stated that "inline" allows you to define the function multiple times in the program, which I am having a hard time understanding (and finding references to).
So
Is inline just a recommendation for the compiler?
Should it be explicitly stated when you have a small function (I guess 1-4 instructions?)
What other benefits are there with writing inline?
is it needed to state inline in order to reduce the executable file size, even though the compiler (according to wikipedia [I know, bad reference]) should find such functions itself?
Is there anything else I am missing?
To restate what I said in those little comment boxes. In particular, I was never talking about inlin-ing:
// foo.h:
static void f() {
// code that can't be inlined
}
// TU1 calls f
// TU2 calls f
Now, both TU1 and TU2 have their own copy of f - the code of f is in the executable two times.
// foo.h:
inline void f() {
// code that can't be inlined
}
// TU1 calls f
// TU2 calls f
Both TUs will emit specially marked versions of f that are effectively merged by the linker by discarding all but one of them. The code of f only exists one time in the executable.
Thus we have saved space in the executable.
Is inline just a recommendation for the compiler?
Yes.
7.1.2 Function specifiers
2 A function declaration (8.3.5, 9.3, 11.4) with an inline specifier declares an inline function. The inline
specifier indicates to the implementation that inline substitution of the function body at the point of call
is to be preferred to the usual function call mechanism. An implementation is not required to perform this
inline substitution at the point of call; however, even if this inline substitution is omitted, the other rules
for inline functions defined by 7.1.2 shall still be respected.
For example from MSDN:
The compiler treats the inline expansion options and keywords as suggestions. There is no guarantee that functions will be inlined. You cannot force the compiler to inline a particular function, even with the __forceinline keyword. When compiling with /clr, the compiler will not inline a function if there are security attributes applied to the function.
Note though:
3.2 One definition rule
3 [...]An inline function shall be defined in every translation unit in which it is used.
4 An inline function shall be defined in every translation unit in which it is used and shall have exactly
the same definition in every case (3.2). [ Note: a call to the inline function may be encountered before its
definition appears in the translation unit. —end note ] If the definition of a function appears in a translation
unit before its first declaration as inline, the program is ill-formed. If a function with external linkage is
declared inline in one translation unit, it shall be declared inline in all translation units in which it appears;
no diagnostic is required. An inline function with external linkage shall have the same address in all
translation units. A static local variable in an extern inline function always refers to the same object.
A string literal in the body of an extern inline function is the same object in different translation units.
[ Note: A string literal appearing in a default argument expression is not in the body of an inline function
merely because the expression is used in a function call from that inline function. —end note ] A type
defined within the body of an extern inline function is the same type in every translation unit.
[Note: Emphasis mine]
A TU is basically a set of headers plus an implementation file (.cpp) which leads to an object file.
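The quoted rule that a static local variable in an extern inline function always refers to the same object can be pictured with a sketch like the one below. The function name is hypothetical. A single file cannot demonstrate the cross-TU behaviour, so the comments describe what the standard text implies rather than something this snippet proves:

```cpp
// Imagine this definition sitting in a header included by several .cpp files.
inline int next_id() {
    // Per the rule quoted above, every translation unit that calls next_id()
    // shares this one counter object, even though each TU sees its own
    // copy of the definition.
    static int counter = 0;
    return ++counter;
}
```

Had the function been declared static instead of inline, each translation unit would get its own function and therefore its own counter.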
Should it be explicitly stated when you have a small function (I
guess 1-4 instructions?)
Absolutely. Why not help the compiler help you generate less code? If the prolog/epilog incurs more cost than the inlined body would, why force the compiler to generate them? But you must, absolutely must go through this GOTW article before getting started with inlining: GotW #33: Inline
What other benefits are there with writing inline?
namespaces can be inline too. Note that member functions defined in the class body itself are inline by default. So are implicitly generated special member functions.
Function templates cannot be defined in an implementation file (see FAQ 35.12) unless of course you provide explicit instantiations (for all types for which the template is used -- generally a PITA IMO). See the DDJ article on Moving Templates Out of Header Files (if you are feeling weird, read this other article on the export keyword, which was dropped from the standard).
Is it needed to state inline in order to reduce the executable file
size, even though the compiler
(according to wikipedia [I know, bad
reference]) should find such functions
itself?
Again, as I said, as a good programmer, you should, when you can, help the compiler. But here's what the C++ FAQ has to offer about inline. So be wary. Not all compilers do this sort of analysis so you should read the documentation on their optimization switches. E.g: GCC does something similar:
You can also direct GCC to try to integrate all “simple enough” functions into their callers with the option -finline-functions.
Most compilers allow you to override the compiler's cost/benefit ratio analysis to some extent. The MSDN and GCC documentation is worth reading.
Is inline just a recommendation for the compiler?
Yes. But the linker needs it if there are multiple definitions of the function (see below)
Should it be explicitly stated when you have a small function (I guess 1-4 instructions?)
On functions that are defined in header files it is (usually) needed. It does not hurt to add it to small functions (but I don't bother). Note class members defined within the class declaration are automatically declared inline.
What other benefits are there with writing inline?
It will stop linker errors if used correctly.
is it needed to state inline in order to reduce the executable file size, even though the compiler (according to wikipedia [I know, bad reference]) should find such functions itself?
No. The compiler makes a cost/benefit comparison of inlining each function call and makes an appropriate choice. Thus calls to a function may be inlined in certain situations and not inlined in others (depending on how the compiler's algorithm works).
Speed and space are two competing forces, and what the compiler is optimizing for determines whether functions are inlined and whether the executable will grow or shrink.
Also note if excessively aggressive inlining is used causing the program to expand too much, then locality of reference is lost and this can actually slow the program down (as more executable pages need to be brought into memory).
Multiple definition:
File: head.h
// Without inline the linker will choke.
/*inline*/ int add(int x, int y) { return x + y; }
extern void test();
File: main.cpp
#include "head.h"
#include <iostream>
int main()
{
std::cout << add(2,3) << std::endl;
test();
}
File: test.cpp
#include "head.h"
#include <iostream>
void test()
{
std::cout << add(2,3) << std::endl;
}
Here we have two definitions of add(). One in main.o and one in test.o
Yes. It's nothing more.
No.
You hint the compiler that it's a function that gets called a lot, where the jump-to-the-function part takes a lot of the execution time.
The compiler might decide to put the function code right where it gets called instead where normal functions are. However, if a function is inlined in x places, you need x times the space of a normal function.
Always trust your compiler to be much smarter than yourself on the subject of premature micro-optimization.
Actually, an inline function may increase executable size, because the inline function's code is duplicated in every place where the function is called. With modern C++ compilers, inline mostly allows the programmer to believe that he writes high-performance code. The compiler decides itself whether to make a function inline or not. So, writing inline just allows us to feel better...
With regards to this:
And "inline" allows you to define the function multiple times in the program.
I can think of one instance where this is useful: Making copy protection code harder to crack. If you have a program that takes user information and verifies it against a registration key, inlining the function that does the verification will make it harder for a cracker to find all duplicates of that function.
As to other points:
inline is just a recommendation to compiler, but there are #pragma directives that can force inlining of any function.
Since it's just a recommendation, it's probably safe to explicitly ask for it and let the compiler override your recommendation. But it's probably better to omit it altogether and let the compiler decide.
The obfuscation mentioned above is one possible benefit of inlining.
As others have mentioned, inline would actually increase the size of the compiled code.
Yes, it will readily ignore it when it thinks the function is too large or uses incompatible features (exception handling perhaps). Furthermore, there is usually a compiler setting to let it automatically inline functions that it deems worthy (/Ob2 in MSVC).
It should be explicitly stated if you put the definition of the function in the header file. Which is usually necessary to ensure that multiple translation units can take advantage of it. And to avoid multiple definition errors. Furthermore, inline functions are put in the COMDAT section. Which tells the linker that it can pick just one of the multiple definitions. Equivalent to __declspec(selectany) in MSVC.
Inlined functions don't usually make the executable smaller, since the call opcode is typically smaller than the inlined machine code, except for very small property-accessor-style functions. It depends, but bigger is not an uncommon outcome.
Another benefit of in-lining (note that actual inlining is sometimes orthogonal to use of the "inline" directive) occurs when a function uses reference parameters. Passing two variables to a non-inline function to add its first operand to the second would require pushing the value of the first operand and the address of the second and then calling a function which would have to pop the first operand and address of the second, and then add the former value indirectly to the popped address. If the function were expanded inline, the compiler could simply add one variable to the other directly.
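The reference-parameter case just described could look like the sketch below (hypothetical names). Whether the compiler really reduces it to a single addition depends on the optimization level and on the body being visible at the call site:

```cpp
// Adds 'value' to the variable that 'target' refers to.
inline void add_to(int value, int& target) {
    target += value;
}

int accumulate_pair(int a, int b) {
    int total = b;
    // Out of line, this call passes the value of 'a' and the address of 'total'.
    // Once inlined, the compiler can simply add one local variable to the
    // other with no indirection.
    add_to(a, total);
    return total;
}
```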
Actually inlining leads to bigger executables, not smaller ones.
It's to reduce one level of indirection, by pasting the function code.
http://www.parashift.com/c++-faq-lite/inline-functions.html