Linking an inline virtual method - c++

GCC 4.7.2 compiles this code as follows:
The reference to Obj::getNumber() in Obj::defaultNumber() is inlined
The body of Obj::getNumber() is separately compiled and exported and can be linked to from a different translation unit.
VC++2012 fails at the link step with:
Error 1 error LNK2001: unresolved external symbol "public: virtual int __thiscall Obj::getNumber(unsigned int)" in main.obj
VC++ seems to be inlining the call within Obj::defaultNumber() but not exporting getNumber() in the symbol table.
VC++ can be made to compile it by one of the following:
Remove the inline keyword from the getNumber() definition, or
Remove the virtual keyword from the getNumber() declaration (Why!?)
At first glance, GCC's behavior certainly seems more helpful/intuitive. Perhaps someone familiar with the Standard could point me in the right direction here.
Does either compiler have conforming behavior in this case?
Why does VC++ work if the method is non-virtual?
.
// Obj.h
class Obj
{
public:
virtual int getNumber(unsigned int i);
virtual int defaultNumber();
};
// Obj.cpp
static const int int_table[2] = { -1, 1 };
inline int Obj::getNumber(unsigned int i)
{
return int_table[i];
}
int Obj::defaultNumber()
{
return getNumber(0);
}
// main.cpp
#include <iostream>
#include "Obj.h"
int main()
{
Obj obj;
std::cout << "getNumber(1): " << obj.getNumber(1) << std::endl;
std::cout << "defaultNumber(): " << obj.defaultNumber() << std::endl;
}

I changed the answer completely once, why not twice? It's quite late sorry for any typos.
Short
Does either compiler have conforming behavior in this case?
Both of them do. It's undefined behaviour.
Why does VC++ work if the method is non-virtual?
Because your code is non-standard conformant and the compiler can construe your code at its own discretion.
Long
Why are the compilers conformant?
The standard says that a virtual member is used per one-definition rule if not pure and that every translation unit needs its own definition of an odr-used inline function.
§ 3.2
2) "[...] A virtual member function is odr-used if it is not pure. [...]"
3) "[...] An inline function shall be defined in every translation unit in which it is odr-used. [...]"
§ 7.1.2
4) An inline function shall be defined in every translation unit in which it is odr-used and shall have exactly the same definition in every case (3.2). [...]
Additionally the standard requires the function to be declared inline within each translation unit but allows the compilers to omit any diagnostics.
§ 7.1.2
4) [...] If a function with external linkage is declared inline in one translation unit, it shall be declared inline in all translation units in which it appears; no diagnostic is required. [...]
Since your getNumber() isn't declared inline in both main.cpp and Obj.cpp you're in the Land of undefined behaviour here.
(edit) I interpret this (see standard quotes below) in a way that the compiler is not required to check whether the function is declared inline everywhere it appears. I don't know but suspect that this "compilers don't have to check"-rule doesn't refer to the sentence "An inline function shall be defined in every translation unit in which it is odr-used[...]".
MSVC compiles even if getNumber is additionally defined in main (without ever being declared inline from main's point of view), even if the Obj.cpp implementation is still declared inline and present in the symbol table. This behaviour of the compiler is allowed by the standard even though the code is non-conformant: undefined behaviour (UB).(/edit)
Both compilers are therefore free to either accept and compile or reject your code.
Why does VC++ work if the method is non-virtual?
The question here is: Why doesn't VC++ instantiate a the virtual inline function but does so if virtual is left out?
You probably read those statements all around SO where it says "UB may blow your house, kill your mum, end the world" etc. I have no clue why there's no function instance if the function is virtual but otherwise there is. I guess that's what UB is about - weird things.
Why do both GCC and VS (without virtual) provide externally linked function instances of an inline function?
The compilers are not required to actually substitute the functions inline, they may well produce an instance of the function.
Since inline functions have external linkage by default, this results in the possibility to call getNumber from main if there is an instance created.
The compiler will just add an undefined external into the main.obj (no problem, it's not inline here).
The linker will find the very same external symbol in the Obj.obj and link happily.
7.1.2
2) An implementation is not required to perform this inline substitution at the point of call; however, even if this inline substitution is omitted, the other rules for inline functions defined by 7.1.2 shall still be respected.
4) An inline function with external linkage shall have the same address in all translation units.
§ 9.3
3) Member functions of a class in namespace scope have external linkage.
MSDN inline, __inline, __forceinline
The inline keyword tells the compiler that inline expansion is preferred. However, the compiler can create a separate instance of the function (instantiate) and create standard calling linkages instead of inserting the code inline. Two cases where this can happen are:
Recursive functions.
Functions that are referred to through a pointer elsewhere in the translation unit.
These reasons may interfere with inlining, as may others, at the discretion of the compiler; you should not depend on the inline specifier to cause a function to be inlined.
So probably there is one of the "other" reasons that makes the compiler instantiate it and create a symbol that is used by main.obj but I can't tell why that reason disappears if the function is virtual.
Conclusion:
The compiler may or may not instantiate an inline function and it may or may not accept the same function being inline in one and non-inline in another translation unit. To have your behaviour defined you need to
Have every declaration of your function inline or have all of them non-inline
Provide a separate definition of an inline function for every translation unit (i.e. inline definition in the header or separate definitions in the source files)

Both compilers are correct. You are invoking undefined behavior for which no diagnostic is required per 7.1.2 para 2:
An inline function shall be defined in every translation unit in which it is odr-used and shall have exactly the same definition in every case (3.2). ... If a function with external linkage is declared inline in one translation unit, it shall be declared inline in all translation units in which it appears; no diagnostic is required.
GCC is free to do what it does and give you a program that just might happen to work. VC++ is equally valid in rejecting your code.

Inline functions should have a definition in every translation unit where used and in your case there is no one in main.cpp
The definition of an inline function doesn't have to be in a header
file but, because of the one definition rule for inline functions, an
identical definition for the function must exist in every translation
unit that uses it.
The easiest way to achieve this is by putting the definition in a
header file.
If you want to put the definition of a function in a single source
file then you shouldn't declare it inline. A function not declared
inline does not mean that the compiler cannot inline the function.
By #CharlesBailey
1 Does either compiler have conforming behavior in this case?
From this standard citation we can say that in this specific case VC is right.
C++ Standard, §3.2 One definition rule / 3: An inline function shall
be defined in every translation unit in which it is odr-used
thanks #Pixelchemist
2 Why does VC++ work if the method is non-virtual?
Accoring to the standard non-virtual functions are not "One Definition Rule" - used, so it allows to use it the way you want.

Related

Why does defining inline global function in 2 different cpp files cause a magic result?

Suppose I have two .cpp files file1.cpp and file2.cpp:
// file1.cpp
#include <iostream>
inline void foo()
{
std::cout << "f1\n";
}
void f1()
{
foo();
}
and
// file2.cpp
#include <iostream>
inline void foo()
{
std::cout << "f2\n";
}
void f2()
{
foo();
}
And in main.cpp I have forward declared the f1() and f2():
void f1();
void f2();
int main()
{
f1();
f2();
}
Result (doesn't depend on build, same result for debug/release builds):
f1
f1
Whoa: Compiler somehow picks only the definition from file1.cpp and uses it also in f2(). What is the exact explanation of this behavior?.
Note, that changing inline to static is a solution for this problem. Putting the inline definition inside an unnamed namespace also solves the problem and the program prints:
f1
f2
This is undefined behavior, because the two definitions of the same inline function with external linkage break C++ requirement for objects that can be defined in several places, known as One Definition Rule:
3.2 One definition rule
...
There can be more than one definition of a class type (Clause 9), enumeration type (7.2), inline function with external linkage (7.1.2), class template (Clause 14),[...] in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. Given such an entity named D defined in more than one translation unit, then
6.1 each definition of D shall consist of the same sequence of tokens; [...]
This is not an issue with static functions, because one definition rule does not apply to them: C++ considers static functions defined in different translation units to be independent of each other.
The compiler may assume that all definitions of the same inline function are identical across all translation units because the standard says so. So it can choose any definition it wants. In your case, that happened to be the one with f1.
Note that you cannot rely on the compiler always picking the same definition, violating the aforementioned rule makes the program ill-formed. The compiler could also diagnose that and error out.
If the function is static or in an anonymous namespace, you have two distinct functions called foo and the compiler must pick the one from the right file.
Relevant standardese for reference:
An inline function shall be defined in every translation unit in which it is odr-used and shall have exactly
the same definition in every case (3.2). [...]
7.1.2/4 in N4141, emphasize mine.
As others have noted, the compilers are in compliance with the C++ standard because the One definition rule states that you shall have only one definition of a function, except if the function is inline then the definitions must be the same.
In practice, what happens is that the function is flagged as inline, and at linking stage if it runs into multiple definitions of an inline flagged token, the linker silently discards all but one. If it runs into multiple definitions of a token not flagged inline, it instead generates an error.
This property is called inline because, prior to LTO (link time optimization), taking the body of a function and "inlining" it at the call site required that the compiler have the body of the function. inline functions could be put in header files, and each cpp file could see the body and "inline" the code into the call site.
It doesn't mean that the code is actually going to be inlined; rather, it makes it easier for compilers to inline it.
However, I am unaware of a compiler that checks that the definitions are identical before discarding duplicates. This includes compilers that otherwise check definitions of function bodies for being identical, such as MSVC's COMDAT folding. This makes me sad, because it is a reall subtle set of bugs.
The proper way around your problem is to place the function in an anonymous namespace. In general, you should consider putting everything in a source file in an anonymous namespace.
Another really nasty example of this:
// A.cpp
struct Helper {
std::vector<int> foo;
Helper() {
foo.reserve(100);
}
};
// B.cpp
struct Helper {
double x, y;
Helper():x(0),y(0) {}
};
methods defined in the body of a class are implicitly inline. The ODR rule applies. Here we have two different Helper::Helper(), both inline, and they differ.
The sizes of the two classes differ. In one case, we initialize two sizeof(double) with 0 (as the zero float is zero bytes in most situations).
In another, we first initialize three sizeof(void*) with zero, then call .reserve(100) on those bytes interpreting them as a vector.
At link time, one of these two implementations is discarded and used by the other. What more, which one is discarded is likely to be pretty determistic in a full build. In a partial build, it could change order.
So now you have code that might build and work "fine" in a full build, but a partial build causes memory corruption. And changing the order of files in makefiles could cause memory corruption, or even changing the order lib files are linked, or upgrading your compiler, etc.
If both cpp files had a namespace {} block containing everything except the stuff you are exporting (which can use fully qualified namespace names), this could not happen.
I've caught exactly this bug in production multiple times. Given how subtle it is, I do not know how many times it slipped through, waiting for its moment to pounce.
POINT OF CLARIFICATION:
Although the answer rooted in C++ inline rule is correct, it only applies if both sources are compiled together. If they are compiled separately, then, as one commentator noted, each resulting object file would contain its own 'foo()'. HOWEVER: If these two object files are then linked together, then because both 'foo()'-s are non-static, the name 'foo()' appears in the exported symbol table of both object files; then the linker has to coalesce the two table entries, hence all internal calls are re-bound to one of the two routines (presumably the one in the first object file processed, since it is already bound [i.e the linker would treat the second record as 'extern' regardless of binding]).

Inline keyword vs header definition

What's the difference between using the inline keyword before a function and just declaring the whole function in the header?
so...
int whatever() { return 4; }
vs
.h:
inline int whatever();
.cpp:
inline int myClass::whatever()
{
return 4;
}
for that matter, what does this do:
inline int whatever() { return 4; }
There are several facets:
Language
When a function is marked with the inline keyword, then its definition should be available in the TU or the program is ill-formed.
Any function defined right in the class definition is implicitly marked inline.
A function marked inline (implicitly or explicitly) may be defined in several TUs (respecting the ODR), whereas it is not the case for regular functions.
Template functions (not fully specialized) get the same treatment as inline ones.
Compiler behavior
A function marked inline will be emitted as a weak symbol in each object file where it is necessary, this may increase their size (look up template bloat).
Whereas the compiler actually inlines the call (ie, copy/paste the code at the point of use instead of performing a regular function call) is entirely at the compiler's discretion. The presence of the keyword may, or not, influence the decision but it is, at best, a hint.
Linker behavior
Weak symbols are merged together to have a single occurrence in the final library. A good linker could check that the multiple definitions concur but this is not required.
without inline, you will likely end up with multiple exported symbols, if the function is declared at the namespace or global scope (results in linker errors).
however, for a class (as seen in your example), most compilers implicitly declare the method as inline (-fno-default-inline will disable that default on GCC).
if you declare a function as inline, the compiler may expect to see its definition in the translation. therefore, you should reserve it for the times the definition is visible.
at a higher level: a definition in the class declaration is frequently visible to more translations. this can result in better optimization, and it can result in increased compile times.
unless hand optimization and fast compiles are both important, it's unusual to use the keyword in a class declaration these days.
The purpose of inline is to allow a function to be defined in more than one translation unit, which is necessary for some compilers to be able to inline it wherever it's used. It should be used whenever you define a function in a header file, although you can omit it when defining a template, or a function inside a class definition.
Defining it in a header without inline is a very bad idea; if you include the header from more than one translation unit, then you break the One Definition Rule; your code probably won't link, and may exhibit undefined behaviour if it does.
Declaring it in a header with inline but defining it in a source file is also a very bad idea; the definition must be available in any translation unit that uses it, but by defining it in a source file it is only available in one translation unit. If another source file includes the header and tries to call the function, then your program is invalid.
This question explains a lot about inline functions What does __inline__ mean ? (even though it was about inline keyword.)
Basically, it has nothing to do with the header. Declaring the whole function in the header just changes which source file has that the source of the function is in. Inline keyword modifies where the resulting compiled function will be put - in it's own place, so that every call will go there, or in place of every call (better for performance). However compilers sometimes choose which functions or methods to make inline for themselves, and keywords are simply suggestions for the compiler. Even functions which were not specified inline can be chosen by the compiler to become inline, if that gives better performance.
If you are linking multiple objects into an executable, there should normally only be one object that contains the definition of the function. For int whatever() { return 4; } - any translation unit that is used to produce an object will contain a definition (i.e. executable code) for the whatever function. The linker won't know which one to direct callers to. If inline is provided, then the executable code may or may not be inlined at the call sites, but if it's not the linker is allowed to assume that all the definitions are the same, and pick one arbitrarily to direct callers to. If somehow the definitions were not the same, then it's considered YOUR fault and you get undefined behaviour. To use inline, the definition must be known when compiler the call, so your idea of putting an inline declaration in a header and the inline definition in a .cpp file will only work if all the callers happen to be later in that same .cpp file - in general it's broken, and you'd expect the (nominally) inline function's definition to appear in the header that declares it (or for there to be a single definition without prior declaration).

Can GCC optimize things better when I compile everything in one step?

gcc optimizes code when I pass it the -O2 flag, but I'm wondering how well it can actually do that if I compile all source files to object files and then link them afterwards.
Here's an example:
// in a.h
int foo(int n);
// in foo.cpp
int foo(int n) {
return n;
}
// in main.cpp
#include "a.h"
int main(void) {
return foo(5);
}
// code used to compile it all
gcc -c -O2 foo.cpp -o foo.o
gcc -c -O2 main.cpp -o main.o
gcc -O2 foo.o main.o -o executable
Normally, gcc should inline foo because it's a small function and -O2 enables -finline-small-functions, right? But here, gcc only sees the code of foo and main independently before it creates the object files, so there won't be any optimizations like that, right? So, does compiling like this really make code slower?
However, I could also compile it like this:
gcc -O2 foo.cpp main.cpp -o executable
Would that be faster? If not, would it be faster this way?
// in foo.cpp
int foo(int n) {
return n;
}
// in main.cpp
#include "foo.cpp"
int main(void) {
return foo(5);
}
Edit: I looked at objdump, and its disassembled code showed that only the #include "foo.cpp" thing worked.
It seems that you have rediscovered on your own the issue about the separate compilation model that C and C++ use. While it certainly eases memory requirements (which was important at the time of its creation), it does so by exposing only minimal information to the compiler, meaning that some optimizations (like this one) cannot be performed.
Newer languages, with their module systems can expose as much information as necessary, and we can hope to rip those benefits if modules get into the next version of C++...
In the mean time, the simplest thing to go for is called Link-Time Optimization. The idea is that you will perform as much optimization as possible on each TU (Translation Unit) to obtain an object file, but you will also enrich the traditional object file (which contain assembly) with IR (Intermediate Representation, used by compilers to optimize) for part of or all functions.
When the linker will be invoked to merge those object files together, instead of just merging the files together, it will merge the IR representations, rexeecute a number of optimization passes (constant propagation, inlining, ...) and then create assembly on its own. It means that instead of being just a linker, it is in fact a backend optimizer.
Of course, like all optimization passes this has a cost, so makes for longer compilation. Also, it means that both the compiler and the linker should be passed a special option to trigger this behavior, in the case of gcc, it would be -lto or -O4.
You may be looking for Link-Time Optimization (LTO), aka Whole Program Optimization.
Since you're using GCC, you can use the C99 inline function specifier mechanism. This is from ISO/IEC 9899:1999.
§ 6.7.4 Function specifiers
Syntax
¶1 function-specifier:
inline
Constraints
¶2 Function specifiers shall be used only in the declaration of an identifier for a function.
¶3 An inline definition of a function with external linkage shall not contain a definition of a
modifiable object with static storage duration, and shall not contain a reference to an
identifier with internal linkage.
¶4 In a hosted environment, the inline function specifier shall not appear in a declaration
of main.
Semantics
¶5 A function declared with an inline function specifier is an inline function. The
function specifier may appear more than once; the behavior is the same as if it appeared
only once. Making a function an inline function suggests that calls to the function be as
fast as possible.118) The extent to which such suggestions are effective is
implementation-defined.119)
¶6 Any function with internal linkage can be an inline function. For a function with external
linkage, the following restrictions apply: If a function is declared with an inline
function specifier, then it shall also be defined in the same translation unit. If all of the
file scope declarations for a function in a translation unit include the inline function
specifier without extern, then the definition in that translation unit is an inline
definition. An inline definition does not provide an external definition for the function,
and does not forbid an external definition in another translation unit. An inline definition
provides an alternative to an external definition, which a translator may use to implement
any call to the function in the same translation unit. It is unspecified whether a call to the
function uses the inline definition or the external definition.120)
¶7 EXAMPLE The declaration of an inline function with external linkage can result in either an external
definition, or a definition available for use only within the translation unit. A file scope declaration with
extern creates an external definition. The following example shows an entire translation unit.
inline double fahr(double t)
{
return (9.0 * t) / 5.0 + 32.0;
}
inline double cels(double t)
{
return (5.0 * (t - 32.0)) / 9.0;
}
extern double fahr(double); // creates an external definition
double convert(int is_fahr, double temp)
{
/* A translator may perform inline substitutions */
return is_fahr ? cels(temp) : fahr(temp);
}
¶8 Note that the definition of fahr is an external definition because fahr is also declared with extern, but
the definition of cels is an inline definition. Because cels has external linkage and is referenced, an
external definition has to appear in another translation unit (see 6.9); the inline definition and the external
definition are distinct and either may be used for the call.
118) By using, for example, an alternative to the usual function call mechanism, such as "inline
substitution". Inline substitution is not textual substitution, nor does it create a new function.
Therefore, for example, the expansion of a macro used within the body of the function uses the
definition it had at the point the function body appears, and not where the function is called; and
identifiers refer to the declarations in scope where the body occurs. Likewise, the function has a
single address, regardless of the number of inline definitions that occur in addition to the external
definition.
119) For example, an implementation might never perform inline substitution, or might only perform inline
substitutions to calls in the scope of an inline declaration.
120) Since an inline definition is distinct from the corresponding external definition and from any other
corresponding inline definitions in other translation units, all corresponding objects with static storage
duration are also distinct in each of the definitions.
Note that GCC also had inline functions in C before they were standardized. Read the GCC manual for details if you need that notation.

How does a compiler deal with inlined exported functions?

If a header file contains a function definition it can be inlined by the compiler. If the function is exported, the function's name and implementation must also be made available to clients during linkage. How does a compiler achieve this? Does it both inline the function and provide an implementation for external callers?
Consider Foo.h:
class Foo
{
int bar() { return 1; }
};
Foo::bar may be inlined or not in library foo.so. If another piece of code includes Foo.h does it always create its own copy of Foo::bar, whether inlined or not?
Header files are just copy-pasted into the source file — that's all #include does. A function is only inline if declared using that keyword or if defined inside the class definition, and inline is only a hint; it doesn't force the compiler to produce different code or prohibit you from doing anything you could otherwise do.
You can still take the address of an inline function, or equivalently, as you mention, export it. For those uses, the compiler simply treats it as non-inline and uses a One Definition Rule (the rule which says the user can't apply two definitions to the same function, class, etc) to "ensure" the function is defined once and only one copy is exported. Normally you are only allowed to have one definition among all sources; an inline function must have one definition which is repeated exactly in each source it is used.
Here is what the standard has to say about inline extern functions (7.1.2/4):
An inline function shall be defined in
every translation unit in which it is
used and shall have exactly the same
definition in every case (3.2). [Note:
a call to the inline function may be
encountered before its defi- nition
appears in the translation unit. ] If
a function with external linkage is
declared inline in one transla- tion
unit, it shall be declared inline in
all translation units in which it
appears; no diagnostic is required. An
inline function with external linkage
shall have the same address in all
translation units. A static local
variable in an extern inline function
always refers to the same object. A
string literal in an extern inline
function is the same object in
different translation units.
It usually means that it ends up creating a separate inlined method for every obj file that uses it at link time. It can also fail or refuse to inline many things, so this can cause a problem because you can wind up with bloated objs without getting the performance benefitting of inlining. The same thing can happen with virtual method inlining so it can be worth forcing inining and setting warning for inline failure (about the only useful warning message compilers give).
By export, I'm guessing you mean something such as getting a pointer to the function and later calling the function through the pointer.
Yes, in that case, the compiler will generate a regular function so that it can be invoked from a pointer.
One way to do this is with a link-once section. The idea is that in translation unit gets the code in a special type of section that has a name based on the function name. During linking, the linker will only keep one instance of identically named link-once sections.
inlined functions do not exist in the compiled binary: that is because they are taken and placed directly at the call site (so called IN-LINE). Each usage of the inlined function results in the complete code to be pulled in at that place.
So inlined functions cannot be exported because they do not exist. But you can still use them if you have a definition in one header. And yes, you MUST provide a definition for an inlined function, otherwise you cannot use it.
If you managed to export an inlined function then it is sure that it is not inline anymore: inline is not a strict semantic element. Depending on the compiler and compiler settings, one compiler might choose to inline, another not, sometimes provide a warning, sometimes even an error (which personnally I would prefer being the default behaviour, because it shows up the places where unintended things occur)

Inline member functions in C++

ISO C++ says that the inline definition of member function in C++ is the same as declaring it with inline. This means that the function will be defined in every compilation unit the member function is used. However, if the function call cannot be inlined for whatever reason, the function is to be instantiated "as usual". (http://msdn.microsoft.com/en-us/library/z8y1yy88%28VS.71%29.aspx) The problem I have with this definition is that it does not tell in which translation unit it would be instantiated.
The problem I encountered is that when facing two object files in a single static library, both of which have the reference to some inline member function which cannot be inlined, the linker might "pick" an arbitrary object file as a source for the definition. This particular choice might introduce unneeded dependencies. (among other things)
For instance:
In a static library
A.h:
class A{
public:
virtual bool foo() { return true; }
};
U1.cpp:
A a1;
U2.cpp:
A a2;
and lots of dependencies
In another project
main.cpp:
#include "A.h"
int main(){
A a;
a.foo();
return 0;
}
The second project refers the first. How do I know which definition the compiler will use, and, consequently which object files with their dependencies will be linked in? Is there anything the standard says on that matter? (Tried, but failed to find that)
Thanks
Edit: since I've seen some people misunderstand what the question is, I'd like to emphasize: If the compiler decided to create a symbol for that function (and in this case, it will, because of 'virtualness', there will be several (externally-seen) instantiations in different object file, which definition (from which object file?) will the linker choose?)
Just my two cents. This is not about virtual function in particular, but about inline and member-functions generally. Maybe it is useful.
C++
As far as Standard C++ is concerned, a inline function must be defined in every translation unit in which it is used. And an non-static inline function will have the same static variables in every translation unit and the same address. The compiler/linker will have to merge the multiple definitions into one function to achieve this. So, always place the definition of an inline function into the header - or place no declaration of it into the header if you define it only in the implementation file (".cpp") (for a non-member function), because if you would, and someone used it, you would get a linker error about an undefined function or something similar.
This is different from non-inline functions which must be defined only once in an entire program (one-definition-rule). For inline functions, multiple definitions as outlined above are rather the normal case. And this is independent on whether the call is atually inlined or not. The rules about inline functions still matter. Whether the Microsoft compiler adheres to those rules or not - i can't tell you. If it adheres to the Standard in that regard, then it will. However, i could imagine some combination using virtual, dlls and different TUs could be problematic. I've never tested it but i believe there are no problems.
For member-functions, if you define your function in the class, it is implicitly inline. And because it appears in the header, the rule that it has to be defined in every translation unit in which it is used is automatically satisfied. However, if you define the function out-of-class and in a header file (for example because there is a circular dependency with code in between), then that definition has to be inline if you include the corresponding file more than once, to avoid multiple-definition errors thrown by the linker. Example of a file f.h:
struct f {
// inline required here or before the definition below
inline void g();
};
void f::g() { ... }
This would have the same effect as placing the definition straight into the class definition.
C99
Note that the rules about inline functions are more complicated for C99 than for C++. Here, an inline function can be defined as an inline definition, of which can exist more than one in the entire program. But if such a (inline-) definition is used (e.g if it is called), then there must be also exactly one external definition in the entire program contained in another translation unit. Rationale for this (quoting from a PDF explaining the rationale behind several C99 features):
Inlining in C99 does extend the C++ specification in two ways. First, if a function is declared inline in one translation unit, it need not be declared inline in every other translation unit. This allows, for example, a library function that is to be inlined within the library but available only through an external definition elsewhere. The alternative of using a wrapper function for the external function requires an additional name; and it may also adversely impact performance if a translator does not actually do inline substitution.
Second, the requirement that all definitions of an inline function be "exactly the same" is replaced by the requirement that the behavior of the program should not depend on whether a call is implemented with a visible inline definition, or the external definition, of a function. This allows an inline definition to be specialized for its use within a particular translation unit. For example, the external definition of a library function might include some argument validation that is not needed for calls made from other functions in the same library. These extensions do offer some advantages; and programmers who are concerned about compatibility can simply abide by the stricter C++ rules.
Why do i include C99 into here? Because i know that the Microsoft compiler supports some stuff of C99. So in those MSDN pages, some stuff may come from C99 too - haven't figured anything in particular though. One should be careful when reading it and when applying those techniques to ones own C++ code intended to be portable C++. Probably informing which parts are C99 specific, and which not.
A good place to test small C++ snippets for Standard conformance is the comeau online compiler. If it gets rejected, one can be pretty sure it is not strictly Standard conforming.
When you have an inline method that is forced to be non-inlined by the compiler, it will really instantiate the method in every compiled unit that uses it. Today most compilers are smart enough to instantiate a method only if needed (if used) so merely including the header file will not force instantiation. The linker, as you said, will pick one of the instantiations to include in the executable file - but keep in mind that the record inside the object module is of a special kind (for instance, a COMDEF) in order to give the linker enough information to know how to discard duplicated instances. These records will not, therefore, result in unwanted dependencies between modules, because the linker will use them with less priority than "regular" records to resolve dependencies.
In the example you gave, you really don't know, but it doesn't matter. The linker won't resolve dependencies based on non-inlined instances alone ever. The result (in terms of modules included by the linker) will be as good as if the inline method didn't exist.
AFAIK, there is no standard definition of how and when a C++ compiler will inline a function call. These are usually "recommendations" that the compiler is in no way required to follow. In fact, different users may want different behaviors. One user may care about speed, while another may care about small generated object file size. In addition, compilers and platforms are different. Some compilers may apply smarter analysis, some may not. Some compilers may generate longer code from the inline, or work on a platform where calls are too expensive, etc.
When you have an inline function, the compiler should still generate a symbol for it and eventually resolve a single version of it. So that if it is in a static library, people can still call the function not in inline. In other words, it still acts as a normal function,.
The only effect of the inline is that some cases, the compiler will see the call, see the inline, and skip the call completely, but the function should still be there, it's just not getting called in this case.
If the compiler decided to create a symbol for that function (and in this case, it will, because of 'virtualness', there will be several (externally-seen) instantiations in different object file, which definition (from which object file?) will the linker choose?)
The definition that is present in the corresponding translation unit. And a translation unit cannot, and I repeat, cannot have but exactly one such definition. The standard is clear about that.
[...]the linker might "pick" an arbitrary object file as a source for the definition.
EDIT: To avoid any further misunderstanding, let me make my point clear: As per my reading of the standard, the ability to have multiple definition across different TUs does not give us any practical leverage. By practical, I mean having even slightly varying implementations. Now, if all your TUs have the exact same definition, why bother which TU the definition is being picked up from?
If you browse through the standard you will find the One Definition Rule is applied everywhere. Even though it is allowed to have multiple definitions of an inline function:
3.2 One Definition Rule:
5 There can be more than one definition of a class type (Clause 9), concept (14.9), concept map (14.9.2), enumeration type (7.2), inline function with external linkage (7.1.2), [...]
Read it in conjunction with
3 [...] An inline function shall be defined in every translation unit in which it is used.
This means that the function will be defined in every compilation unit [...]
and
7.1.2 Function Specifiers
2 A function declaration (8.3.5, 9.3, 11.4) with an inline specifier declares an inline function. The inline specifier indicates to the implementation that inline substitution of the function body at the point of call is to be preferred to the usual function call mechanism. An implementation is not required to perform this inline substitution at the point of call; however, even if this inline substitution is omitted, the other rules
for inline functions defined by 7.1.2 shall still be respected.
3 A function defined within a class definition is an inline function. The inline specifier shall not appear on a block scope function declaration.[footnote: 82] If the inline specifier is used in a friend declaration, that declaration shall be a definition or the function shall have previously been declared inline.
and the footnote:
82) The inline keyword has no effect on the linkage of a function.
§ 7.1.2 138
as well as:
4 An inline function shall be defined in every translation unit in which it is used and shall have exactly the same definition in every case (3.2). [ Note: a call to the inline function may be encountered before its definition appears in the translation unit. —end note ] If the definition of a function appears in a translation unit before its first declaration as inline, the program is ill-formed. If a function with external linkage is
declared inline in one translation unit, it shall be declared inline in all translation units in which it appears; no diagnostic is required. An inline function with external linkage shall have the same address in all translation units. A static local variable in an extern inline function always refers to the same object. A string literal in the body of an extern inline function is the same object in different translation units. [ Note: A string literal appearing in a default argument expression is not in the body of an inline function merely because the expression is used in a function call from that inline function. —end note ]
Distilled: Its ok to have multiple definitions, but they must have the same look and feel in every translation unit and address -- but that doesn't really give you much to cheer about. Having multiple deinition across translation units is therefore not defined (note: I am not saying you are invoking UB, yet).
As for the virtual thingy -- there won't be any inlining. Period.
The standard says:
The same declaration must be available
There must be one definition
From MSDN:
A given inline member function must be declared the same way in every compilation unit. This constraint causes inline functions to behave as if they were instantiated functions. Additionally, there must be exactly one definition of an inline function.
Your A.h contains the class definition and the member foo()'s definition.
U1.cpp and U2.cpp both define two different objects of class A.
You create another A object in main(). This is just fine.
So far, I have seen only one definition of A::foo() which is inline. (Remember that a function defined within the class declaration is always inlined whether or not it is preceded by the inline keyword.)
Don't inline your functions if you want to ensure they get compiled into a specific library.