Edit: I've restored the original title but really what I should have asked was this: 'How do C++ linkers handle class methods which have been defined in multiple object files'
Say I have a C++ class defined in a header along these lines:
class Klass
{
int Obnoxiously_Large_Method()
{
//many thousands of lines of code here
}
}
If I compile some C++ code which uses 'Obnoxiously_Large_Method' in several locations, will the resulting object file always inline the code for 'Obnoxiously_Large_Method' or will it optimise for size (for example, when using g++ -Os) and create a single instance of 'Obnoxiously_Large_Method' and use it like a normal function?, if so, how do linkers resolve the collisions between other object files which have instantiated the same function?. Is there some arcane C++ namespace Juju which keeps the separate object instances of method from colliding with each other?
7.1.2 Function specifiers
A function declaration (8.3.5, 9.3, 11.4) with an inline specifier
declares an inline function. The inline specifier indicates to the
implementation that inline substitution of the function body at the
point of call is to be preferred to the usual function call mechanism.
An implementation is not required to perform this inline substitution
at the point of call; however, even if this inline substitution is
omitted, the other rules for inline functions defined by 7.1.2 shall
still be respected.
So, the compiler is not required to actually 'inline' any function.
However, the standard also says,
An inline function with external linkage shall have the same address in all translation units.
Member functions normally have external linkage (one exception is when the member function belongs to a 'local' class), so inline functions must have a unique address for cases where the address of the function is taken. In this case, the compiler will arrange for the linker to throw away all but one instance of a non-inlined copy of the function and fix-up all address references to the function to be to the one that's kept.
Section [9.3], Member functions, of the C++98 Standard states:
A member function may be defined (8.4) in its class definition, in which case it is an inline member function (7.1.2).
Thus, it has always been the case that marking member functions defined in the class definition explicitly inline is unnecessary.
On the inline function specifier, the Standard states:
A function declaration (8.3.5, 9.3, 11.4) with an inline specifier declares an inline function. The inline specifier indicates to the [C++ compiler] that inline substitution of the function body at the point of call is to be preferred to the usual function call mechanism. [However, a C++ compiler] is not required to perform this inline substitution at the point of call;
So, it is up to the compiler whether it will actually inline the definition of the function rather than call it via the usual function call mechanism.
Nothing is always inlined (unless your compiler has an attribute or private keyword to force it to do so...at which point you're writing $(COMPILER)-flavored C++ rather than standard C++). Very long functions, recursive functions, and a few other things generally aren't inlined.
The compiler can choose not to inline stuff if it determines that doing so will degrade performance, unreasonably increase the object file's size, or make things work incorrectly. Or if it's optimizing for size instead of speed. Or if you ask it not to. Or if it doesn't like your shirt. Or if it's feeling lazy today, cause it compiled too much last night. Or for any other reason. Or for no reason at all.
There is no - single answer to this question. Compilers are smart beasts. You can specifically use the inline words if you want, but this doesn't mean that the compiler will actually inline the function.
Inline is there to help the developer with optmization. It hints at the compiler that something should be inlined, but these hints are generally ignored nowadays, since compilers can do better at register assignment and deciding when to inline functions (in fact, a compiler can either inline or not inline a function at different times). Code generation on modern processors is far more complicated than on the more deterministic ones common when Ritchie was inventing C.
What the word means now, in C++, is that it can have multiple identical definitions, and needs to be defined in every translation unit that uses it. (In other words, you need to make sure it can be inlined.) You can have an inline function in a header with no problems, and member functions defined in a class definition are automatically effectively inline.
That said, I used to work with a greenhills compiler, and it actually obeyed my will more than it disobeyed it :).. It's up to the compiler, really.
The inline keyword deals with c++ definition of a function. The compiler may inline object code where ever it wants.
Functions defined inline (eg they use the inline keyword), create object code for the function in every compilation unit. Those functions are marked as special so the linker knows to only use one.
See this answer for more specifics.
It doesn't have to be inlined, no; it's just like if you specified inline explicitly.
When you write inline, you promise that this method won't be called from translation units where it isn't defined, and therefore, that it can have internal linkage (so the linker won't connect one object-file's reference to it to another object-file's definition of it). [This paragraph was wrong. I'm leaving it intact, just struck-out, so that the below comments will still make sense.]
Related
I know in advance that, when writing a program in C or C++, even if I declare a function as "inline" the compiler is free to ignore this and decide not to expand it at each (or any) call.
Is the opposite true as well? That is, can a compiler automatically inline a very short function that wasn't defined as inline if the compiler believes doing so will lead to a performance gain?
Two other subquestions: is this behaviour defined somewhere in the ANSI standards? Is C different from C++ in this regard, or do they behave the same?
inline is non-binding with regards to whether or not a function will be inlined by the compiler. This was originally what it was intended to do. But since then, it's been realized that whether or not a function is worth inlining depends as much on the call site as the function itself and is best left to the compiler to decide.
From https://en.cppreference.com/w/cpp/language/inline :
Since this meaning of the keyword inline is non-binding, compilers are free to use inline substitution for any function that's not marked inline, and are free to generate function calls to any function marked inline. Those optimization choices do not change the rules regarding multiple definitions and shared statics listed above.
Edit : Since you asked for C as well, from https://en.cppreference.com/w/c/language/inline :
The intent of the inline specifier is to serve as a hint for the compiler to perform optimizations, such as function inlining, which require the definition of a function to be visible at the call site. The compilers can (and usually do) ignore presence or absence of the inline specifier for the purpose of optimization.
Regarding the relation between C and C++, the inline specifier is treated differently in each language.
In C++: inline functions (and function like entities, and variables (since C++17) ) that have not been previously declared with internal linkage will have external linkage and be visible from other compilation units. Since inline functions (usually) reside in header files, this means that the same function will have repeated definitions across different compilation units (this is would be a violation of the One definition rule but the inline makes it legal). At the end of the build process (when linking an executable or a shared lib), inline definitions of the same entity are merged together. Informally, C++ inline means: "there may be multiple identical definitions of some function across multiple source files, but I want them to end up as a unique definition".
In C: If extern is not explicitly specified, then an inline function definition is not visible from other translation units, different translation units may have different definitions with inline specifier for the same function name. Also, there may exist (at most) one definition for a function name that is both inline and extern and this qualifies that function as the one that is externally visible (ie gets selected when one applies the address of & operator to the function name). The One definition rule from C and its relation with extern and inline is somehow different from C++.
can a compiler automatically inline a very short function that wasn't defined as inline if the compiler believes doing so will lead to a performance gain?
Limitation:
When code uses a pointer to the function, then the function needs to exist non-inlined.
Limitation:
When the function is visible outside the local .c file (not static), this prevents simplistic inlined code.
Not a limitation:
The length of the function is not an absolute limitation, albeit a practical one.
I've worked with embedded processor that commonly inline static functions. (Given code does not use a pointer to them.)
The usefulness of the inline keyword does not affect the ability for a compiler to inline function.
When it comes to the standard, the keyword inline has nothing to do with inlining.
The rules (in c++) are basically:
A function which is not declared inline can by only defined in one translation union. It still needs to be delared in each translation unit where it is used.
A function which is declared inline has to be defined in each translation unit where it is odr-used (ord-use means to call the function or to take the pointer,...).
So, in a standard project setting it is almost always correct to follow the following two rules. Functions that are defined in a header file, are always to be declared inline. Functions defined in a *.cpp-file are never declared inline.
This said, I think the compiler cannot really draw any conclusions about the programmer wanted inlining from using or not using keyword inline. The name of the keyword is an unfortunate legacy from a bad naming.
I had a discussion with Johannes Schaub regarding the keyword inline.
The code there was this:
namespace ... {
static void someFunction() {
MYCLASS::GetInstance()->someFunction();
}
};
He stated that:
Putting this as an inline function may
save code size in the executable
But according to my findings here and here it wouldn't be needed, since:
[Inline] only occurs if the compiler's cost/benefit analysis show it to be profitable
Mainstream C++ compilers like Microsoft Visual C++ and GCC support an option that lets the compilers automatically inline any suitable function, even those not marked as inline functions.
Johannes however states that there are other benefits of explicitly specifying it. Unfortunately I do not understand them. For instance, he stated that And "inline" allows you to define the function multiple times in the program., which I am having a hard time understanding (and finding references to).
So
Is inline just a recommendation for the compiler?
Should it be explicitly stated when you have a small function (I guess 1-4 instructions?)
What other benefits are there with writing inline?
is it needed to state inline in order to reduce the executable file size, even though the compiler (according to wikipedia [I know, bad reference]) should find such functions itself?
Is there anything else I am missing?
To restate what I said in those little comment boxes. In particular, I was never talking about inlin-ing:
// foo.h:
static void f() {
// code that can't be inlined
}
// TU1 calls f
// TU2 calls f
Now, both TU1 and TU2 have their own copy of f - the code of f is in the executable two times.
// foo.h:
inline void f() {
// code that can't be inlined
}
// TU1 calls f
// TU2 calls f
Both TUs will emit specially marked versions of f that are effectively merged by the linker by discarding all but one of them. The code of f only exists one time in the executable.
Thus we have saved space in the executable.
Is inline just a recommendation for the compiler?
Yes.
7.1.2 Function specifiers
2 A function declaration (8.3.5, 9.3, 11.4) with an inline specifier declares an inline function. The inline
specifier indicates to the implementation that inline substitution of the function body at the point of call
is to be preferred to the usual function call mechanism. An implementation is not required to perform this
inline substitution at the point of call; however, even if this inline substitution is omitted, the other rules
for inline functions defined by 7.1.2 shall still be respected.
For example from MSDN:
The compiler treats the inline expansion options and keywords as suggestions. There is no guarantee that functions will be inlined. You cannot force the compiler to inline a particular function, even with the __forceinline keyword. When compiling with /clr, the compiler will not inline a function if there are security attributes applied to the function.
Note though:
3.2 One definition rule
3 [...]An inline function shall be defined in every translation unit in which it is used.
4 An inline function shall be defined in every translation unit in which it is used and shall have exactly
the same definition in every case (3.2). [ Note: a call to the inline function may be encountered before its
definition appears in the translation unit. —end note ] If the definition of a function appears in a translation
unit before its first declaration as inline, the program is ill-formed. If a function with external linkage is
declared inline in one translation unit, it shall be declared inline in all translation units in which it appears;
no diagnostic is required. An inline function with external linkage shall have the same address in all
translation units. A static local variable in an extern inline function always refers to the same object.
A string literal in the body of an extern inline function is the same object in different translation units.
[ Note: A string literal appearing in a default argument expression is not in the body of an inline function
merely because the expression is used in a function call from that inline function. —end note ] A type
defined within the body of an extern inline function is the same type in every translation unit.
[Note: Emphasis mine]
A TU is basically a set of headers plus an implementation file (.cpp) which leads to an object file.
Should it be explicitly stated when you have a small function (I
guess 1-4 instructions?)
Absolutely. Why not help the compiler help you generate less code? Usually, if the prolog/epilog part incurs more cost than having it inline force the compiler to generate them? But you must, absolutely must go through this GOTW article before getting started with inlining: GotW #33: Inline
What other benefits are there with writing inline?
namespaces can be inline too. Note that member functions defined in the class body itself are inline by default. So are implicitly generated special member functions.
Function templates cannot be defined in an implementation file (see FAQ 35.12) unless of course you provide a explicit instantiations (for all types for which the template is used -- generally a PITA IMO). See the DDJ article on Moving Templates Out of Header Files (If you are feeling weird read on this other article on the export keyword which was dropped from the standard.)
Is it needed to state inline in order to reduce the executable file
size, even though the compiler
(according to wikipedia [I know, bad
reference]) should find such functions
itself?
Again, as I said, as a good programmer, you should, when you can, help the compiler. But here's what the C++ FAQ has to offer about inline. So be wary. Not all compilers do this sort of analysis so you should read the documentation on their optimization switches. E.g: GCC does something similar:
You can also direct GCC to try to integrate all “simple enough” functions into their callers with the option -finline-functions.
Most compilers allow you to override the compiler's cost/benefit ratio analysis to some extent. The MSDN and GCC documentation is worth reading.
Is inline just a recommendation for the compiler?
Yes. But the linker needs it if there are multiple definitions of the function (see below)
Should it be explicitly stated when you have a small function (I guess 1-4 instructions?)
On functions that are defined in header files it is (usually) needed. It does not hurt to add it to small functions (but I don't bother). Note class members defined within the class declaration are automatically declared inline.
What other benefits are there with writing inline?
It will stop linker errors if used correctly.
is it needed to state inline in order to reduce the executable file size, even though the compiler (according to wikipedia [I know, bad reference]) should find such functions itself?
No. The compiler makes a cost/benefit comparison of inlining each function call and makes an appropriate choice. Thus calls to a function may be inlined in curtain situations and not inlined in other (depending on how the compilers algorithm works).
Speed/Space are two competing forces and it depends what the compiler is optimizing for which will determine weather functions are inlined and weather the executable will grow or shrink.
Also note if excessively aggressive inlining is used causing the program to expand too much, then locality of reference is lost and this can actually slow the program down (as more executable pages need to be brought into memory).
Multiple definition:
File: head.h
// Without inline the linker will choke.
/*inline*/ int add(int x, int y) { return x + y; }
extern void test()
File: main.cpp
#include "head.h"
#include <iostream>
int main()
{
std::cout << add(2,3) << std::endl;
test();
}
File: test.cpp
#include "head.h"
#include <iostream>
void test()
{
std::cout << add(2,3) << std::endl;
}
Here we have two definitions of add(). One in main.o and one in test.o
Yes. It's nothing more.
No.
You hint the compiler that it's a function that gets called a lot, where the jump-to-the-function part takes a lot of the execution time.
The compiler might decide to put the function code right where it gets called instead where normal functions are. However, if a function is inlined in x places, you need x times the space of a normal function.
Always trust your compiler to be much smarter than yourself on the subject of premature micro-optimization.
Actually, inline function may increase executable size, because inline function code is duplicated in every place where this function is called. With modern C++ compilers, inline mostly allows to programmer to believe, that he writes high-performance code. Compiler decides itself whether to make function inline or not. So, writing inline just allows us to feel better...
With regards to this:
And "inline" allows you to define the function multiple times in the program.
I can think of one instance where this is useful: Making copy protection code harder to crack. If you have a program that takes user information and verifies it against a registration key, inlining the function that does the verification will make it harder for a cracker to find all duplicates of that function.
As to other points:
inline is just a recommendation to compiler, but there are #pragma directives that can force inlining of any function.
Since it's just a recommendation, it's probably safe to explicitly ask for it and let the compiler override your recommendation. But it's probably better to omit it altogether and let the compiler decide.
The obfuscation mentioned above is one possible benefit of inlining.
As others have mentioned, inline would actually increase the size of the compiled code.
Yes, it will readily ignore it when it thinks the function is too large or uses incompatible features (exception handling perhaps). Furthermore, there is usually a compiler setting to let it automatically inline functions that it deems worthy (/Ob2 in MSVC).
It should be explicitly stated if you put the definition of the function in the header file. Which is usually necessary to ensure that multiple translation units can take advantage of it. And to avoid multiple definition errors. Furthermore, inline functions are put in the COMDAT section. Which tells the linker that it can pick just one of the multiple definitions. Equivalent to __declspec(selectany) in MSVC.
Inlined functions don't usually make the executable smaller. Since the call opcode is typically smaller than the inlined machined code, except for very small property accessor style functions. It depends but bigger is not an uncommon outcome.
Another benefit of in-lining (note that actual inlining is sometimes orthogonal to use of the "inline" directive) occurs when a function uses reference parameters. Passing two variables to a non-inline function to add its first operand to the second would require pushing the value of the first operand and the address of the second and then calling a function which would have to pop the first operand and address of the second, and then add the former value indirectly to the popped address. If the function were expanded inline, the compiler could simply add one variable to the other directly.
Actually inlining leads to bigger executables, not smaller ones.
It's to reduce one level of indirection, by pasting the function code.
http://www.parashift.com/c++-faq-lite/inline-functions.html
If we define a member function inside the class definition itself, is it necessarily treated inline or is it just a request to the compiler which it can ignore.
Yes, functions that are defined inside a class body are implicitly inline.
(As with other functions declared inline it doesn't mean that the complier has to perform inline expansion in places where the function is called, it just enables the permitted relaxations of the "one definition rule", combined with the requirement that a definition must be included in all translation units where the function is used.)
As stated by others, a method defined within a class is automatically requested inline.
It's useful to understand why.
Suppose it weren't. You'd have to generate code for such a function, and everywhere it is called, a jump to subroutine instruction would have to reference the location, via the linker.
class A {
public:
void f() { ... your code ... }
};
Every time this code is seen, if it's not inline, the compiler can only assume it must be generated, so it would generate a symbol. Suppose it was like this:
A__f_v:
If that symbol were global, then if you happened to include this class code multiple times in different modules, you would have a multiply defined symbol error at link time. So it can't be global. Instead, it's file local.
Imagine you include the above header file in a number of modules. In each one, it's going to generate a local copy of that code. Which is better than not compiling at all, but you're getting multiple copies of the code when you really need only one.
This leads the the following conclusion: if your compiler is not going to inline a function, you are significantly better off declaring it somewhere once, and not requesting it to be inlined.
Unfortunately, what is and is not inline is not portable. It's defined by the compiler writer. A good rule of thumb is to always make every one liner, particularly all functions which themselves just call a function, inline, as you remove overhead. Anything below three lines of linear code is almost certainly ok. But if you have a loop in the code, the question is whether the compiler will allow it inline, and more to the point, how much benefit you would see even if it did what you want.
consider this inline code:
inline int add(int a, int b) { return a + b; }
It's not only almost as small as the prototype would be in source code, but the assembly language generated by the inline code is smaller than the call to a routine would be. So this code is smaller, and faster.
And, if you happen to be passing in constants:
int c= add(5,4);
It's resolved at compile time and there is no code.
In gcc, I recently noticed that even if I don't inline code, if it's local to a file, they will sneakily inline it anyway. It's only if I declare the function in a separate source module that they do not optimize away the call.
On the other end of the spectrum, suppose you request inline on a 1000 line piece of code. Even if your compiler is silly enough to go along with it, the only thing you save is the call itself, and the cost is that every time you call it, the compiler must paste all that code in. If you call that code n times, your code grows by the size of the routine * n. So anything bigger than 10 lines is pretty much not worth inlining, except for the special case where it is only called a very small number of times. An example of that might be in a private method called by only 2 others.
If you request to inline a method containing a loop, it only makes sense if it often executes a small number of times. But consider a loop which iterates one million times. Even if the code is inlined, the percentage of time spent in the call is tiny. So if you have methods with loops in it, which tend to be bigger anyway, those are worth removing from the header file because they a) will tend to be rejected as inline by the compiler and b) even if they were inlined, are generally not going to provide any benefit
It is necessarily treated by the compiler as a request for inline -- which it can ignore. There are some idioms for defining some functions in the header (e.g. empty virtual destructors) and some necessary header definitions (template functions), but other than that see GotW #33 for more information.
Some have noted that the compiler may even inline functions you never asked it to, but I'm not sure whether that would defeat the purpose of requesting to inline a function.
It is indeed inlined - but any inline request can be ignored by the compiler.
It is a request to the compiler that it can ignore.
The 2003 ISO C++ standard says
7.1.2/2 A function declaration (8.3.5, 9.3, 11.4) with an inline
specifier declares an inline
function. The inline specifier
indicates to the implementation that
inline substitution of the function
body at the point of call is to be
preferred to the usual function call
mechanism. An implementation is not
required to perform this inline
substitution at the point of call;
however, even if this inline
substitution is omitted, the other
rules for inline functions defined
by 7.1.2 shall still be respected.
7.1.2/3 A function defined within a class definition is an inline
function. The inline specifier shall
not appear on a block scope function
declaration.
7.1.2/4 An inline function shall be defined in every translation unit in
which it is used and shall have
exactly the same definition in every
case (3.2). [Note: a call to the
inline function may be encountered
before its defi-nition appears in the
translation unit. ] If a function
with external linkage is declared
inline in one transla-tion unit, it
shall be declared inline in all
translation units in which it
appears; no diagnostic is required. An
inline function with external
linkage shall have the same address in
all translation units. A static
local variable in an extern inline
function always refers to the same
object. A string literal in an
extern inline function is the same
object in different translation
units.
There are two things that shouldn't be lumped together:
How you mark a function as being inline: define it with inline in front of the signature or define it at declaration point;
What the compiler will treat such inline marking: regardless of how you marked the function as inline it will be treated as a request by the compiler.
In C++, do methods only get inlined if they are explicitly declared inline (or defined in a header file), or are compilers allowed to inline methods as they see fit?
The inline keyword really just tells the linker (or tells the compiler to tell the linker) that multiple identical definitions of the same function are not an error. You'll need it if you want to define a function in a header, or you will get "multiple definition" errors from the linker, if the header is included in more than one compilation unit.
The rationale for choosing inline as the keyword seems to be that the only reason why one would want to define a (non-template) function in a header is so it could be inlined by the compiler. The compiler cannot inline a function call, unless it has the full definition. If the function is not defined in the header, the compiler only has the declaration and cannot inline the function even if it wanted to.
Nowadays, I've heard, it's not only the compiler that optimizes the code, but the linker can do that as well. A linker could (if they don't do it already) inline function calls even if the function wasn't defined in the same compilation unit.
And it's probably not a good idea to define functions larger than perhaps a single line in the header if at all (bad for compile time, and should the large function be inlined, it might lead to bloat and worse performance).
Yes, the compiler can inline code even if it's not explicitly declared as inline.
Basically, as long as the semantics are not changed, the compiler can virtually do anything it wants to the generated code. The standard does not force anything special on the generated code.
Compilers might inline any function or might not inline it. They are allowed to use the inline decoration as a hint for this decision, but they're also allowed to ignore it.
Also note that class member functions have an implicit inline decoration if they are defined right in the class definition.
Compilers may ignore your inline declaration. It is basically used by the compiler as a hint in order decide whether or not to do so. Compilers are not obligated to inline something that is marked inline, or to not inline something that isn't. Basically you're at the mercy of your compiler and the optimization level you choose.
If I'm not mistaken, when optimizations are turned on, the compiler will inline any suitable routine or method.
Text from IBM information Center,
Using the inline specifier is only a
suggestion to the compiler that an
inline expansion can be performed; the
compiler is free to ignore the
suggestion.
C Language Any function, with the exception of main, can be declared or
defined as inline with the inline
function specifier. Static local
variables are not allowed to be
defined within the body of an inline
function.
C++ functions implemented inside of a class declaration are
automatically defined inline. Regular
C++ functions and member functions
declared outside of a class
declaration, with the exception of
main, can be declared or defined as
inline with the inline function
specifier. Static locals and string
literals defined within the body of an
inline function are treated as the
same object across translation units;
Your compiler's documentation should tell you since it is implementation dependent. For example, GCC according to its manual never inlines any code unless optimisation is applied.
If the compiler does not inline the code, the inline keyword will have the same effect as static, and each compilation unit that calls the code will have its own copy. A smart linker may reduce these to a single copy.
The compiler can inline whatever it wants in case inlining doesn't violate the code semantics and it can reach the function code. It can also inline selectively - do inline when it feels it's a good idea and not inline when it doesn't feel it's a good idea or when it would violate the code semantics.
Some compilers can do inlining even if the function is in another translation unit - that's called link-time code generation.
Typical cases of when inlining would violate code semantics are virtual calls and passing a function address into another function or storing it.
Compiler optimize as he wants unless you spec the opposite.
The inline keyword is just a request to the compiler. The compiler reserves the right to make or not make a function inline. One of the major factor that drives the compiler's decision is the simplicity of code(not many loops)
Member functions are declared inline by default.(The compiler decides here also)
These are not hard and fast rules. It varies according to the compiler implementations.
If anybody knows other factors involved, please post.
Some of the situations where inline expansion may NOT work are:
For functions returning values, if a loop, a switch, or a goto exists
For function not returning values, if a return statement exits;
If functions contain static variables
If inline functions are recursive.
Inline expansion makes a program run faster because the overhead of a function call and return statement is eliminated. However, it makes the program to take up more memory because the statements that define the inline functions are reproduced at each point where the function is called. So, a trade-off becomes necessary.
(As given in one of my OOP books)
ISO C++ says that the inline definition of member function in C++ is the same as declaring it with inline. This means that the function will be defined in every compilation unit the member function is used. However, if the function call cannot be inlined for whatever reason, the function is to be instantiated "as usual". (http://msdn.microsoft.com/en-us/library/z8y1yy88%28VS.71%29.aspx) The problem I have with this definition is that it does not tell in which translation unit it would be instantiated.
The problem I encountered is that when facing two object files in a single static library, both of which have the reference to some inline member function which cannot be inlined, the linker might "pick" an arbitrary object file as a source for the definition. This particular choice might introduce unneeded dependencies. (among other things)
For instance:
In a static library
A.h:
class A{
public:
virtual bool foo() { return true; }
};
U1.cpp:
A a1;
U2.cpp:
A a2;
and lots of dependencies
In another project
main.cpp:
#include "A.h"
int main(){
A a;
a.foo();
return 0;
}
The second project refers the first. How do I know which definition the compiler will use, and, consequently which object files with their dependencies will be linked in? Is there anything the standard says on that matter? (Tried, but failed to find that)
Thanks
Edit: since I've seen some people misunderstand what the question is, I'd like to emphasize: If the compiler decided to create a symbol for that function (and in this case, it will, because of 'virtualness', there will be several (externally-seen) instantiations in different object file, which definition (from which object file?) will the linker choose?)
Just my two cents. This is not about virtual function in particular, but about inline and member-functions generally. Maybe it is useful.
C++
As far as Standard C++ is concerned, a inline function must be defined in every translation unit in which it is used. And an non-static inline function will have the same static variables in every translation unit and the same address. The compiler/linker will have to merge the multiple definitions into one function to achieve this. So, always place the definition of an inline function into the header - or place no declaration of it into the header if you define it only in the implementation file (".cpp") (for a non-member function), because if you would, and someone used it, you would get a linker error about an undefined function or something similar.
This is different from non-inline functions which must be defined only once in an entire program (one-definition-rule). For inline functions, multiple definitions as outlined above are rather the normal case. And this is independent on whether the call is atually inlined or not. The rules about inline functions still matter. Whether the Microsoft compiler adheres to those rules or not - i can't tell you. If it adheres to the Standard in that regard, then it will. However, i could imagine some combination using virtual, dlls and different TUs could be problematic. I've never tested it but i believe there are no problems.
For member-functions, if you define your function in the class, it is implicitly inline. And because it appears in the header, the rule that it has to be defined in every translation unit in which it is used is automatically satisfied. However, if you define the function out-of-class and in a header file (for example because there is a circular dependency with code in between), then that definition has to be inline if you include the corresponding file more than once, to avoid multiple-definition errors thrown by the linker. Example of a file f.h:
struct f {
// inline required here or before the definition below
inline void g();
};
void f::g() { ... }
This would have the same effect as placing the definition straight into the class definition.
C99
Note that the rules about inline functions are more complicated for C99 than for C++. Here, an inline function can be defined as an inline definition, of which can exist more than one in the entire program. But if such a (inline-) definition is used (e.g if it is called), then there must be also exactly one external definition in the entire program contained in another translation unit. Rationale for this (quoting from a PDF explaining the rationale behind several C99 features):
Inlining in C99 does extend the C++ specification in two ways. First, if a function is declared inline in one translation unit, it need not be declared inline in every other translation unit. This allows, for example, a library function that is to be inlined within the library but available only through an external definition elsewhere. The alternative of using a wrapper function for the external function requires an additional name; and it may also adversely impact performance if a translator does not actually do inline substitution.
Second, the requirement that all definitions of an inline function be "exactly the same" is replaced by the requirement that the behavior of the program should not depend on whether a call is implemented with a visible inline definition, or the external definition, of a function. This allows an inline definition to be specialized for its use within a particular translation unit. For example, the external definition of a library function might include some argument validation that is not needed for calls made from other functions in the same library. These extensions do offer some advantages; and programmers who are concerned about compatibility can simply abide by the stricter C++ rules.
Why do i include C99 into here? Because i know that the Microsoft compiler supports some stuff of C99. So in those MSDN pages, some stuff may come from C99 too - haven't figured anything in particular though. One should be careful when reading it and when applying those techniques to ones own C++ code intended to be portable C++. Probably informing which parts are C99 specific, and which not.
A good place to test small C++ snippets for Standard conformance is the comeau online compiler. If it gets rejected, one can be pretty sure it is not strictly Standard conforming.
When you have an inline method that is forced to be non-inlined by the compiler, it will really instantiate the method in every compiled unit that uses it. Today most compilers are smart enough to instantiate a method only if needed (if used) so merely including the header file will not force instantiation. The linker, as you said, will pick one of the instantiations to include in the executable file - but keep in mind that the record inside the object module is of a special kind (for instance, a COMDEF) in order to give the linker enough information to know how to discard duplicated instances. These records will not, therefore, result in unwanted dependencies between modules, because the linker will use them with less priority than "regular" records to resolve dependencies.
In the example you gave, you really don't know, but it doesn't matter. The linker won't resolve dependencies based on non-inlined instances alone ever. The result (in terms of modules included by the linker) will be as good as if the inline method didn't exist.
AFAIK, there is no standard definition of how and when a C++ compiler will inline a function call. These are usually "recommendations" that the compiler is in no way required to follow. In fact, different users may want different behaviors. One user may care about speed, while another may care about small generated object file size. In addition, compilers and platforms are different. Some compilers may apply smarter analysis, some may not. Some compilers may generate longer code from the inline, or work on a platform where calls are too expensive, etc.
When you have an inline function, the compiler should still generate a symbol for it and eventually resolve a single version of it. So that if it is in a static library, people can still call the function not in inline. In other words, it still acts as a normal function,.
The only effect of the inline is that some cases, the compiler will see the call, see the inline, and skip the call completely, but the function should still be there, it's just not getting called in this case.
If the compiler decided to create a symbol for that function (and in this case, it will, because of 'virtualness', there will be several (externally-seen) instantiations in different object file, which definition (from which object file?) will the linker choose?)
The definition that is present in the corresponding translation unit. And a translation unit cannot, and I repeat, cannot have but exactly one such definition. The standard is clear about that.
[...]the linker might "pick" an arbitrary object file as a source for the definition.
EDIT: To avoid any further misunderstanding, let me make my point clear: As per my reading of the standard, the ability to have multiple definition across different TUs does not give us any practical leverage. By practical, I mean having even slightly varying implementations. Now, if all your TUs have the exact same definition, why bother which TU the definition is being picked up from?
If you browse through the standard you will find the One Definition Rule is applied everywhere. Even though it is allowed to have multiple definitions of an inline function:
3.2 One Definition Rule:
5 There can be more than one definition of a class type (Clause 9), concept (14.9), concept map (14.9.2), enumeration type (7.2), inline function with external linkage (7.1.2), [...]
Read it in conjunction with
3 [...] An inline function shall be defined in every translation unit in which it is used.
This means that the function will be defined in every compilation unit [...]
and
7.1.2 Function Specifiers
2 A function declaration (8.3.5, 9.3, 11.4) with an inline specifier declares an inline function. The inline specifier indicates to the implementation that inline substitution of the function body at the point of call is to be preferred to the usual function call mechanism. An implementation is not required to perform this inline substitution at the point of call; however, even if this inline substitution is omitted, the other rules
for inline functions defined by 7.1.2 shall still be respected.
3 A function defined within a class definition is an inline function. The inline specifier shall not appear on a block scope function declaration.[footnote: 82] If the inline specifier is used in a friend declaration, that declaration shall be a definition or the function shall have previously been declared inline.
and the footnote:
82) The inline keyword has no effect on the linkage of a function.
§ 7.1.2 138
as well as:
4 An inline function shall be defined in every translation unit in which it is used and shall have exactly the same definition in every case (3.2). [ Note: a call to the inline function may be encountered before its definition appears in the translation unit. —end note ] If the definition of a function appears in a translation unit before its first declaration as inline, the program is ill-formed. If a function with external linkage is
declared inline in one translation unit, it shall be declared inline in all translation units in which it appears; no diagnostic is required. An inline function with external linkage shall have the same address in all translation units. A static local variable in an extern inline function always refers to the same object. A string literal in the body of an extern inline function is the same object in different translation units. [ Note: A string literal appearing in a default argument expression is not in the body of an inline function merely because the expression is used in a function call from that inline function. —end note ]
Distilled: Its ok to have multiple definitions, but they must have the same look and feel in every translation unit and address -- but that doesn't really give you much to cheer about. Having multiple deinition across translation units is therefore not defined (note: I am not saying you are invoking UB, yet).
As for the virtual thingy -- there won't be any inlining. Period.
The standard says:
The same declaration must be available
There must be one definition
From MSDN:
A given inline member function must be declared the same way in every compilation unit. This constraint causes inline functions to behave as if they were instantiated functions. Additionally, there must be exactly one definition of an inline function.
Your A.h contains the class definition and the member foo()'s definition.
U1.cpp and U2.cpp both define two different objects of class A.
You create another A object in main(). This is just fine.
So far, I have seen only one definition of A::foo() which is inline. (Remember that a function defined within the class declaration is always inlined whether or not it is preceded by the inline keyword.)
Don't inline your functions if you want to ensure they get compiled into a specific library.