Does name mangling apply to virtual functions in C++?

We all know that function names in C++ are mangled at compile time. Does this apply to virtual functions too?

Yes. Member function names are mangled. They need to embed their argument types so that you can overload them with different argument types.
In theory, a compiler could encode the argument types in some other way, but at some level each function body needs to be labelled by (and to have references to it resolved using) both the function name and its argument types. All major compilers certainly use mangling.

Name mangling is unrelated to member functions being virtual or not; after all virtual methods can be called non-virtually just like any member function. Only if the compiler could be certain that a virtual method is exclusively called through the vtable, it might avoid generating any linker symbol at all for the method (just inserting its address in the vtable instead). But I don't think there is any practical way a compiler can know that a method is not being called directly in another compilation unit (as it can for functions that are visible only in the current compilation unit).
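To make the "called non-virtually" point concrete, here is a minimal sketch. The names Base, Derived, call_virtually and call_directly are made up for illustration. A qualified call such as b.Base::name() binds directly to that function and skips the vtable, which is why the function still needs an ordinary symbol of its own.

```cpp
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// Ordinary virtual call: the function is picked from the dynamic type of b.
std::string call_virtually(const Base& b) { return b.name(); }

// Qualified call: binds directly to Base::name, with no dynamic dispatch.
std::string call_directly(const Base& b) { return b.Base::name(); }
```

Passing a Derived object to both helpers gives "Derived" from the first and "Base" from the second, even though both go through the same reference.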

Related

Why does my compiler insist on unused function definitions only for virtual? [duplicate]

I find it quite odd that unused virtual functions must still be defined unlike unused ordinary functions. I understand somewhat about the implicit vtables and vpointers which are created when a class object is created - this somewhat answers the question (that the function must be defined so that the pointers to the virtual function can be defined) but this pushes my query back further still.
Why would a vtable entry need to be created for a function if there's absolutely no chance that virtual function will be called at all?
class A {
    virtual bool test() const;
};
int main() {
    A a; // error: undefined reference to 'vtable for A'
}
Even though I declared A::test() it was never used in the program but it still throws up an error. Can the compiler not run through the program and realise test() was never called - and thus not require a vtable entry for it? Or is that an unreasonable thing to expect of the compiler?
Because it would inevitably be a very difficult problem to solve on the compiler writer's part, when the usefulness of being able to leave virtual functions undefined is at best dubious. Compiler authors surely have better problems to solve.
Besides, you ARE using that function even though you don't call it. You are taking its address.
The OP says that he already knows about vtables and vpointers, so he understands that there is a difference between unused virtual functions and unused non-virtual functions: an unused non-virtual function is not referenced anywhere, while a virtual function is referenced at least once, in the vtable of its class. So, essentially the question is asking why is the compiler not smart enough to refrain from placing a reference to a virtual function in the vtable if that function is not used anywhere. That would allow the function to also go undefined.
The compiler generally sees only one .cpp file at a time, so it does not know whether you have some source file somewhere which invokes that function.
Some tools support this kind of analysis, they call it "global" analysis or something similar. You might even find it built-in in some compilers, and accessible via some compiler option. But it is never enabled by default, because it would tremendously slow down compilation.
As a matter of fact, the reason why you can leave a non-virtual function undefined is also related to lack of global analysis, but in a different way: if the compiler knew that you have omitted the definition of a function, it would probably at least warn you. But since it does not do global analysis, it can't. This is evidenced by the fact that if you do try to use an undefined function, the error will not be caught by the compiler: it will be caught by the linker.
So, just define an empty virtual function which contains an ASSERT(FALSE) and proceed with your life.
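A rough sketch of that stub approach, using the standard assert macro in place of ASSERT(FALSE). The derived class B and the helper run_through_base are hypothetical additions, there only to show the stub being overridden. Once A::test() has a definition, the vtable for A has something to point at and the link error from the question goes away.

```cpp
#include <cassert>

class A {
public:
    virtual ~A() = default;
    virtual bool test() const;  // declared in the class, defined below
};

// Stub definition: it exists so the vtable entry has an address to hold.
// It is not expected to run, so it asserts if anything reaches it.
bool A::test() const {
    assert(false && "A::test() was not expected to be called");
    return false;
}

// Hypothetical derived class that supplies the real behaviour.
class B : public A {
public:
    bool test() const override { return true; }
};

// Calls test() through the base class, as client code typically would.
bool run_through_base(const A& a) { return a.test(); }
```

Constructing an A now links fine as long as nobody calls test() on it, and a B passed to run_through_base reaches the override rather than the stub.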
The whole point of virtual functions is that they can be called through a base class pointer. If you never use the base class virtual function, then why did you declare it? If it is used, then you either have to keep the parent implementation (if it's not pure virtual) or define your own implementation, so that code using your objects through the base class can actually make use of it. In that case, the function is used; it's just not used directly.

Does C++ use static name resolution or dynamic name resolution?

I have been reading about name resolution on Wikipedia, which states that C++ uses "static name resolution". If that is true, then I can't figure out how C++ manages to provide polymorphism without using dynamic name resolution.
Can anyone please answer whether C++ uses "Static Name Resolution" or "Dynamic Name Resolution". If it is static, can you also explain how C++ provides polymorphism.
Wikipedia's definition of name resolution is about how tokens are resolved into the names of constructs (functions, typenames, etc). Given that definition, C++ is 100% static with its name resolution. Every token that represents an identifier must be associated at compile-time with a specific entity.
C++ polymorphism is effectively cheating. The compiler can see that a static name resolves to a member function declared with the virtual keyword. If the compiler sees that you are calling it through a pointer or reference to that type rather than on a value of that type, the compiler emits special code to call that function.
This special code does not change the name it resolves to. What it changes is the function that eventually gets called. That is not dynamic naming; that is dynamic function dispatch. The name gets resolved at compile-time; the function gets resolved at runtime.
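A small sketch of that split, with made-up names (Animal, Dog, label_of, speech_of). Both calls below resolve their names against Animal at compile time. Only the virtual one then picks its target at run time.

```cpp
#include <string>

struct Animal {
    virtual ~Animal() = default;
    // Non-virtual: the name resolves to Animal::label and that is what runs.
    std::string label() const { return "Animal::label"; }
    // Virtual: the name resolves to Animal::speak, the target is chosen at run time.
    virtual std::string speak() const { return "Animal::speak"; }
};

struct Dog : Animal {
    std::string label() const { return "Dog::label"; }           // hides, does not override
    std::string speak() const override { return "Dog::speak"; }  // overrides
};

std::string label_of(const Animal& a) { return a.label(); }
std::string speech_of(const Animal& a) { return a.speak(); }
```

Given a Dog, label_of returns "Animal::label" while speech_of returns "Dog::speak": same static name resolution in both cases, different dispatch.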
C++ uses static name resolution, and it renames each function so that every one has a unique name.
That means that the function int foo(int bar) will be known to the compiler as something like _Z3fooi, while int foo(float bar) will be known as something like _Z3foof.
This is what we call name mangling.
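If you want to check those names yourself, here is a sketch that assumes GCC or Clang, whose <cxxabi.h> header provides abi::__cxa_demangle. This is not standard C++ and other compilers (MSVC, for one) use a different mangling scheme altogether. The wrapper name demangle is my own.

```cpp
#include <cxxabi.h>  // GCC/Clang only: Itanium C++ ABI support header
#include <cstdlib>
#include <string>

// Turns an Itanium-style mangled name back into a readable signature.
// Returns the input unchanged if it is not a valid mangled name.
std::string demangle(const char* mangled) {
    int status = 0;
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }
    std::string result(text);
    std::free(text);  // the buffer is malloc-allocated, so the caller frees it
    return result;
}
```

With this, demangle("_Z3fooi") gives back foo(int) and demangle("_Z3foof") gives back foo(float). The return type is not part of the mangled name for ordinary functions, which is why it does not appear.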

Implicit inline virtual function implemented in header

Writing a function in a .h file and its implementation right after (implicit inline), while using the virtual keyword:
virtual void g() { cout << "is Inline?"; }
Is the virtual functionality meaningless because the function is implemented in the .h?
Is this considered to be an inline?
Is the virtual functionality meaningless because the function is implemented in the .h?
No. virtual and inline are completely independent concepts.
virtual means that the function to call is chosen, at run-time if necessary, according to the dynamic type of the object it's invoked on.
inline means that you're allowed to define the function in more than one translation unit, and must define it in any translation unit that uses it. This is necessary (for some compilers) to allow the function to be inlined, but does not force all calls to be inlined. In particular, virtual calls usually won't be inlined (unless the dynamic type can be determined at compile time), so virtual will certainly retain its meaning here.
Is this considered to be an inline?
Yes, but (as mentioned above) that does not mean that all calls will be inlined.
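A minimal sketch of the situation, with hypothetical names (Widget, Button, via_reference, via_value). Picture the two classes sitting in a header. The helper functions are collapsed into the same snippet only to keep it short.

```cpp
#include <string>

struct Widget {
    virtual ~Widget() = default;
    // Defined inside the class body: implicitly inline, and still virtual.
    virtual std::string g() const { return "Widget::g"; }
};

struct Button : Widget {
    std::string g() const override { return "Button::g"; }
};

// The dynamic type of w is not known here in general, so this stays a virtual call.
std::string via_reference(const Widget& w) { return w.g(); }

// The dynamic type is known to be Widget, so the compiler is free to inline this call.
std::string via_value() {
    Widget w;
    return w.g();
}
```

Passing a Button to via_reference still yields "Button::g", so defining g() in the class body has not switched off the virtual behaviour.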
Is the virtual functionality meaningless because the function is
implemented in the .h?
Nope. No reason to feel so. Header file is preprocessed and copy-pasted wherever it's included. So ultimately it's as good as implementing your g() in whatever .cpp file.
Is this considered to be an inline?
Yes. But here inline doesn't carry the usual interpretation of replacing a function call with the function's body. A virtual call that is resolved at runtime cannot be inlined in that (macro-style) way.
It means that the function may be defined in every translation unit (.cpp file) that includes the header, and the linker will not complain about multiple definitions.
If you declare your function virtual, it is virtual, period. But, since virtual functions are usually selected at runtime, usually the compiler will not be able to inline them. If you call the function on an object, the compiler may inline it, since the call can be resolved at compile time. But it won't be able to inline a call through a reference or pointer, since it cannot resolve the dynamic type at compile time.
Take into account that neither the inline keyword nor the implicit inlining here is binding on the compiler; they are just suggestions. But the virtual keyword is mandatory.

C++ vtable query

I have a query regarding the explanation provided here http://www.parashift.com/c++-faq/virtual-functions.html#faq-20.4
In the sample code, the function mycode(Base *p) calls the virt3 method as p->virt3(). How exactly does the compiler know that virt3 is found in the third slot of the vtable? What does it compare, and against what?
When the compiler sees the definition of Base it decides the layout of its vtable according to some algorithm [1], which is common to all its derived classes as far as methods inherited from Base are concerned (derived classes may add other virtual methods, but they are put into the vtable after the stuff inherited from Base).
Thus, when the compiler sees p->virt3(), it already knows that for any object that inherits from Base the pointer to the correct virt3 is e.g. in the third slot of the vtable (because that's how it laid out the vtable of Base at the moment of its definition), so it can correctly generate the code for the virtual call.
Long story short (drawing inspiration from @David Rodríguez's comment): it knows where the entry is because it decided that beforehand.
1. The standard does not mandate any particular algorithm (actually, it doesn't say anything about how the C++ ABI should be implemented), but there are several widespread C++ ABI specifications, notably the COM ABI on Windows and the Itanium ABI on Linux (and in general for gcc). Obviously, given the same class definition, the algorithm must give the same vtable layout every time, otherwise it would be impossible to link together different object modules.
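To illustrate the idea, here is a hand-rolled emulation of a vtable. It is only a sketch and not what any particular compiler emits: real vtables are ABI-specific and typically hold extra entries, so actual slot numbers may differ. Object, Slot, the free functions, the Der names and kVirt3Slot are all invented for the example. Only mycode and virt3 come from the question.

```cpp
#include <string>

struct Object;                                // stand-in for a polymorphic object
using Slot = std::string (*)(const Object&);  // each table entry is a function pointer

struct Object {
    const Slot* vptr;  // hidden pointer to the table of the object's class
};

std::string base_virt1(const Object&) { return "Base::virt1"; }
std::string base_virt2(const Object&) { return "Base::virt2"; }
std::string base_virt3(const Object&) { return "Base::virt3"; }
std::string der_virt3(const Object&)  { return "Der::virt3"; }

// Slots follow declaration order: virt1 is slot 0, virt2 is slot 1, virt3 is slot 2.
const Slot base_vtable[] = { base_virt1, base_virt2, base_virt3 };
// The derived table keeps the same layout and swaps in its override.
const Slot der_vtable[]  = { base_virt1, base_virt2, der_virt3 };

// Fixed when the class layout was decided, so every caller can hard-code it.
constexpr int kVirt3Slot = 2;

// Roughly what p->virt3() turns into: load the table, index it, call.
std::string mycode(const Object* p) { return p->vptr[kVirt3Slot](*p); }
```

The call site never compares names. It uses an index that was settled when the class definition was seen, and the object's own table pointer decides which function that index leads to.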
The layout of the vtable is specified by the Itanium C++ ABI, followed by many compilers including GCC. The compiler itself doesn't decide where the function pointers go (though I suppose it does decide to abide by the ABI!).
The order of the virtual function pointers in a virtual table is the order of declaration of the corresponding member functions in the class.
COM — used by Visual Studio — also emits vtable pointers in source order (though I can't find normative documentation to prove that).
Also, because the function name doesn't even exist at runtime (only a function pointer does), the layout of the vtable at compile-time doesn't really matter. The function call translation works in just the same way that a normal function call translation works: the compiler is already mapping the function name to an address in its internal machinery. The only difference is that the mapping here is to a location in the vtable, rather than to the start of the actual function code.
This also addresses your concern about interoperability, to some extent.
Do remember, though, that this is all implementation machinery and C++ itself has no knowledge that virtual tables even exist.
The compiler has a well defined algorithm for allocating the entries in the vtable so that the order of the entries will always be the same regardless of which translation unit is being processed. Internal to the compiler is a mapping between the function names and their location in the vtable so the compiler can do the correct transformation between function call and vtable index.
It is important, therefore, that any change to the definition of a class with virtual functions causes all source files that depend on the class to be recompiled; otherwise callers may use stale vtable indices and call the wrong function.

Where is function overriding done?

Where in the process of creating the program, compiler, linker etc., is the overriding of functions and operator overloading done?
I'm particularly interested where it is done in C++, Ruby and Python.
Function overloading is (at least in C++) handled internally inside the compiler. The idea is that the code that the compiler ultimately generates will be hardcoded to call the appropriate function, as if the functions all had different names and you called the function uniquely suited to the arguments. More generally, in most compiled languages that support overloading, the overload resolution is done at compile-time and the emitted code will always call the indicated function. For example, Haskell supports compile-time overloading this way.
Operator overloading is a special case of general overloading, so it's usually handled the same way.
Function overriding (a term that arises in OOP when a derived class inherits from a base class and redefines one of its methods) is almost always resolved at runtime, since a compiler can't always tell which function is going to be invoked without actually knowing about the types at runtime. Some compilers might be able to statically prove that a certain object has a specific type and can then optimize the dynamic dispatch away, but it's impossible to do this in all cases.
I am not aware of any dynamic languages that support overloading, since in theory you could introduce new overload candidates as the program was running. I would love to be enlightened if such a language exists.
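For the C++ side, here is a short sketch of the difference between the two mechanisms. Shape, Circle, describe and the two helpers are hypothetical names. Overloads are picked from the static type of the argument, overrides from the dynamic type of the object.

```cpp
#include <string>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "Shape"; }
};

struct Circle : Shape {
    std::string name() const override { return "Circle"; }
};

// Overloading: two functions share a name, the compiler picks one at compile time.
std::string describe(const Shape&)  { return "describe(Shape)"; }
std::string describe(const Circle&) { return "describe(Circle)"; }

// Inside these helpers the static type of s is Shape, whatever object is passed in.
std::string overload_through_base(const Shape& s) { return describe(s); }
std::string override_through_base(const Shape& s) { return s.name(); }
```

Handing a Circle to both helpers gives "describe(Shape)" from the overload, which was already fixed at compile time, and "Circle" from the override, which is chosen at run time.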
For C++, operator overloading is done at the compiler level through a name-mangling process that creates a unique identifier for every function, so that the linker will not complain about duplicate function definitions. In C++, operator overloading is possible because overloadable operators like +, -, *, etc. are themselves functions whose names consist of the keyword operator followed by the operator symbol. So for instance, an overloaded operator+ function with a function signature like
my_type operator+(const my_type& lhs, const my_type& rhs);
will not conflict with another operator+ function with a different signature, even though both functions have the same operator+ name, because each version of the function will have a different name at the assembly-language level after the C++ compiler's name-mangling process is complete. Name mangling has another benefit in that it allows C and C++ compiled code to be used with the same linker, since two functions with the same symbol name will not exist and cause a linker error.
Note that in C, even if you create two functions with different signatures, if they have the same name the linker will complain about duplicate definitions, since the C compiler does not do any name mangling.
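A quick sketch of two operator+ overloads living side by side. Meters, Seconds and add_explicit are invented for the example. Since the parameter types differ, each overload ends up with its own mangled symbol and the linker sees no clash.

```cpp
struct Meters  { double value; };
struct Seconds { double value; };

// Same function name, operator+, with different parameter types.
Meters operator+(const Meters& lhs, const Meters& rhs) {
    return {lhs.value + rhs.value};
}

Seconds operator+(const Seconds& lhs, const Seconds& rhs) {
    return {lhs.value + rhs.value};
}

// An operator really is a function: it can be called with function-call syntax.
Meters add_explicit(const Meters& a, const Meters& b) {
    return operator+(a, b);
}
```

Writing a + b with two Meters values and calling operator+(a, b) explicitly end up in the same function. The infix form is just the usual way to spell the call.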
Python is not linked/compiled, it is interpreted.
So, the normal overriding is done when class sources are parsed. Of course, due to dynamic nature you can always override during the runtime as well.
I suppose that alternative implementations that use bytecode compilation do it at compile time.
I also think the above is true for Ruby as well.