C++ Method Override and Overloading (Compiler level) - c++

I know what the difference between the two. Overriding basically lets you "redefine" your a method in a child class and overloading basically lets you "redefine" your method with different arguments or parameters. I'm a little confused on what's going on under the hood though. I read that when you overload a method, the compiler will have all the overloaded methods and find the best match or report an error if none exists. This is obviously done during compile time but I'm confused on how Override works. I've read that handling overrides is extremely hard because you'll have to check if the return type matches with the class hierarchy and there can be a lot of class levels to check
(ie. class Living is the super class of Human and Animal. Human and Animal can have many derived classes which means we will have a deep level of classes).
Without getting too detailed, how does overriding work at the compiler level and why is it that overriding is done during run time and not compile time?

It depends on if the overridden method is virtual or not. If the overridden method is not virtual, then under the hood it usually works in the same way as overloading, the compiler looks at the static type of the object and calls the correct function based on that.
For objects with virtual methods a vtable is usually used. This is a collection of function pointers to the virtual methods. The reason this is done at run time is to allow for runtime polymorphism. The usual way that a vtable is generate is the compilier will generate a single vtable for each class and populate it with the required pointers at compile time and include this in the executable. The constructor will then set a hidden pointer in the class to point to the correct vtable. When looking up methods it first dereferences the hidden pointer to find the vtable then dereferences the correct slot from the vtable.

Related

Is virtual dispatch used if class type is known?

Suppose we have base class A with at least one virtual method. Suppose then we have another class B that derives from A and may or may not override this virtual method.
Finally, suppose you create an object of class B with local scope, and call this virtual method.
From the C++ docs we know that if this virtual method is inlined, the inline version will be used, because class type is known and this is not pointer or reference, but the class itself.
Will virtual dispatch be used in this case or it will be bypassed? Will this work for normal (non inline) methods?
I am interested in gcc / clang.
Since both the stack and the vtable are implementation details, it's probably better to phrase it:
Can the compiler use static - rather than virtual - dispatch if the object's (real, runtime) type is statically known?
to which the answer is: yes. Anywhere the compiler knows for certain what version of a virtual method will be used, it can just emit a regular statically-dispatched function call.
Note that there are some places you might expect the compiler to know the object's runtime type and be mistaken - specifically inside constructors.
If you want to know whether a particular compiler does emit this particular optimization for some particular code (at a particular optimization level), just check the assembly output. Even if you're not sure what both versions of a call should look like, you can compare the output with a simple and a fully-qualified call (b.B::foo() vs b.foo()). I'd expect gcc and clang to do a reasonable job in this case, but it's easy enough to check.

Delegate part of an interface to a subclass in C++? [duplicate]

Here is what I am talking about
// some guy wrote this, used as a Policy with templates
struct MyWriter {
void write(std::vector<char> const& data) {
// ...
}
};
In some existing code, the people did not use templates, but interfaces+type-erasure
class IWriter {
public:
virtual ~IWriter() {}
public:
virtual void write(std::vector<char> const& data) = 0;
};
Someone else wanted to be usable with both approaches and writes
class MyOwnClass: private MyWriter, public IWriter {
// other stuff
};
MyOwnClass is implemented-in-terms-of MyWriter. Why doesn't MyOwnClass' inherited member functions implement the interface of IWriter automatically? Instead the user has to write forwarding functions that do nothing but call the base class versions, as in
class MyOwnClass: private MyWriter, public IWriter {
public:
void write(std::vector<char> const& data) {
MyWriter::write(data);
}
};
I know that in Java when you have a class that implements an interface and derives from a class that happens to have suitable methods, that base class automatically implements the interface for the derived class.
Why doesn't C++ do that? It seems like a natural thing to have.
This is multiple inheritance, and there are two inherited functions with the same signature, both of which have implementation. That's where C++ is different from Java.
Calling write on an expression whose static type is MyBigClass would therefore be ambiguous as to which of the inherited functions was desired.
If write is only called through base class pointers, then defining write in the derived class is NOT necessary, contrary to the claim in the question. Now that the question changed to include a pure specifier, implementing that function in the derived class is necessary to make the class concrete and instantiable.
MyWriter::write cannot be used for the virtual call mechanism of MyBigClass, because the virtual call mechanism requires a function that accepts an implicit IWriter* const this, and MyWriter::write accepts an implicit MyWriter* const this. A new function is required, which must take into account the address difference between the IWriter subobject and the MyWriter subobject.
It would be theoretically possible for the compiler to create this new function automatically, but it would be fragile, since a change in a base class could suddenly cause a new function to be chosen for forwarding. It's less fragile in Java, where only single inheritance is possible (there's only one choice for what function to forward to), but in C++, which supports full multiple inheritance, the choice is ambiguous, and we haven't even started on diamond inheritance or virtual inheritance yet.
Actually, this problem (difference between subobject addresses) is solved for virtual inheritance. But it requires additional overhead that's not necessary most of the time, and a C++ guiding principle is "you don't pay for what you don't use".
Why doesn't C++ do that? It seems like a natural thing to have.
Actually, no, it is extremely unnatural thing to have.
Please note that my reasoning is based on my own understanding of "common sense" and can be fundamentally flawed as a result.
You see, you have two different methods, first one in MyWriter, which is non virtual and second one in IWriter which is virtual. They are completely different despite "looking" similar.
I suggest to check this question. The good thing about non-virtual methods is that no matter what you do, as long as they don't call virtual methods, their behavior will never change. I.e. somebody deriving from your class with non-virtual methods will not break existing method by masking them. Virtual methods are designed to be overriden. The price of that is that it is possible to break underlying logic by improperly overriding virtual method. And this is a root of your problem.
Let's say what you propose is allowed. (automatic conversion to virtual with multiple inheritance) There two possible solutions:
Solution #1
MyWriter becomes virtual. Consequences: All existing C++ code in the world becomes easy to break via typo or name clash. MyWriter method was not supposed to be overriden initially, so suddenly turning it into virtual will (murphy's law) break underlying logic of MyWriter class when somebody derives from MyOwnClass. Which means that suddenly making MyWriter::write virtual is a bad idea.
Soluion #2
MyWriter remains static BUUUT it is included temporarily as a virtual method into IWriter, until overriden. At first glance there's nothing to worry about, but let's think about it. IWriter implements some kind of concept you had in mind, and it is supposed to do something. MyWriter implements another concept. To assign MyWriter::write as IWriter::write method you need two guarantees:
Compiler must ensure that MyWriter::write does what IWriter::write() is supposed to do.
Compiler must ensure that calling MyWriter::write from IWriter will not break existing functionality in MyWriter code programmer expects to use elsewhere.
So, the thing is that compiler cannot guarantee that. Functions have similar name and argument list, but by Murphy's law that means that they're prbably doing completely different thing. (sinf and cosf have same argument list, for example), and it is unlikely that compiler will be able to predict the future and make sure that at no point in development will MyWriter be changed in such way that it will become incompatible with IWriter. So, since machine can't make reasonable decision (no AI for that) by itself, it has to ask YOU, programmer - "What is it you wish to do?". And you say "redirect virtual method into MyWriter::write(). It totally won't break anything. I think.".
And that's why you must specify which method you want to use manually....
Doing it automatically would be unintuitive and surprising. C++ does not assume that multiple base classes are related to each other, and protects the user against name collisions between their members by defining nested name specifiers for nonstatic members. Adding implicit declarations to MyOwnClass where signatures from IWriter and MyWriter collide would be antithetical to protecting names.
However, C++11 extensions do bring us closer. Consider this:
class MyOwnClass: private MyWriter, public IWriter {
public:
void write(std::vector<char> const& data) final = MyWriter::write;
};
This mechanism would be safe because it expresses that MyWriter doesn't expect any further overrides, and convenient because it names the function signature that will be "joined" but nothing more. Also, final would be ill-formed if the function weren't implicitly virtual, so it checks that the signature matches the virtual interface.
On one hand, most interfaces don't just happen to match up this way. Defining this feature to work only with identical signatures would be safe but rarely useful. Defining it as a shortcut to a delegating function body would be useful but fragile. So it might not really be a good feature
On the other hand, this is a good design pattern to provide functionality which isn't virtual when you don't need it to be. So given this idiom, we might use it to write good code, even if it doesn't match up well with current practices.
Why doesn't C++ do that?
I'm not sure what you're asking here. Could C++ be rewritten to allow this? Yes, but to what end?
Because MyWriter and IWriter are completely different classes, it is illegal in C++ to call a member of MyWriter through an instance of IWriter. The member pointers have completely different types. And just as a MyWriter* is not convertible to a IWriter*, neither is a void (MyWriter::*)(const std::vector<char>&) convertible to a void (IWriter::*)(const std::vector<char>&).
The rules of C++ don't change just because there could be a third class that combines the two. Neither class is a direct parent/child relative of one another. Therefore, they are treated as entirely distinct classes.
Remember: member functions always take an additional parameter: a this pointer to the object that they point to. You cannot call void (MyWriter::*)(const std::vector<char>&) on an IWriter*. The third class can have a method that casts itself into the proper base class, but it must actually have this method. So either you or the C++ compiler must create it. The rules of C++ require this.
Consider what would have to happen to make this work without a derived-class method.
A function gets an IWriter*. The user calls the write member of it, using nothing more than the IWriter* pointer. So... exactly how can the compiler generate the code to call MyWriter::writer? Remember: MyWriter::writer needs a MyWriter instance. And there is no relationship between IWriter and MyWriter.
So how exactly could the compiler do the type coercion locally? The compiler would have to check the virtual function to see if the actual function to be called takes IWriter or some other type. If it takes another type, it would have to convert the pointer to its true type, then do another conversion to the type needed by the virtual function. After doing all of that, it would then be able to make the call.
All of this overhead would affect every virtual call. All of them would have to at least check to see if the actual function to be call. Every call will also have to generate the code to do the type conversions, just in case.
Every virtual function call would have a "get type" and conditional branch in it. Even if it is never possible to trigger that branch. So you would be paying for something regardless of whether you use it or not. That's not the C++ way.
Even worse, a straight v-table implementation of virtual calls is no longer possible. The fastest method of doing virtual dispatch would not be a conforming implementation. The C++ committee is not going to make any change that would make such implementations impossible.
Again, to what end? Just so that you don't have to write a simple forwarding function?
Just make MyWriter derive from IWriter, eliminate the IWriter derivation in MyOwnClass, and move on with life. This should resolve the problem and should not interfere with the template code.

Why does C++ not let baseclasses implement a derived class' inherited interface?

Here is what I am talking about
// some guy wrote this, used as a Policy with templates
struct MyWriter {
void write(std::vector<char> const& data) {
// ...
}
};
In some existing code, the people did not use templates, but interfaces+type-erasure
class IWriter {
public:
virtual ~IWriter() {}
public:
virtual void write(std::vector<char> const& data) = 0;
};
Someone else wanted to be usable with both approaches and writes
class MyOwnClass: private MyWriter, public IWriter {
// other stuff
};
MyOwnClass is implemented-in-terms-of MyWriter. Why doesn't MyOwnClass' inherited member functions implement the interface of IWriter automatically? Instead the user has to write forwarding functions that do nothing but call the base class versions, as in
class MyOwnClass: private MyWriter, public IWriter {
public:
void write(std::vector<char> const& data) {
MyWriter::write(data);
}
};
I know that in Java when you have a class that implements an interface and derives from a class that happens to have suitable methods, that base class automatically implements the interface for the derived class.
Why doesn't C++ do that? It seems like a natural thing to have.
This is multiple inheritance, and there are two inherited functions with the same signature, both of which have implementation. That's where C++ is different from Java.
Calling write on an expression whose static type is MyBigClass would therefore be ambiguous as to which of the inherited functions was desired.
If write is only called through base class pointers, then defining write in the derived class is NOT necessary, contrary to the claim in the question. Now that the question changed to include a pure specifier, implementing that function in the derived class is necessary to make the class concrete and instantiable.
MyWriter::write cannot be used for the virtual call mechanism of MyBigClass, because the virtual call mechanism requires a function that accepts an implicit IWriter* const this, and MyWriter::write accepts an implicit MyWriter* const this. A new function is required, which must take into account the address difference between the IWriter subobject and the MyWriter subobject.
It would be theoretically possible for the compiler to create this new function automatically, but it would be fragile, since a change in a base class could suddenly cause a new function to be chosen for forwarding. It's less fragile in Java, where only single inheritance is possible (there's only one choice for what function to forward to), but in C++, which supports full multiple inheritance, the choice is ambiguous, and we haven't even started on diamond inheritance or virtual inheritance yet.
Actually, this problem (difference between subobject addresses) is solved for virtual inheritance. But it requires additional overhead that's not necessary most of the time, and a C++ guiding principle is "you don't pay for what you don't use".
Why doesn't C++ do that? It seems like a natural thing to have.
Actually, no, it is extremely unnatural thing to have.
Please note that my reasoning is based on my own understanding of "common sense" and can be fundamentally flawed as a result.
You see, you have two different methods, first one in MyWriter, which is non virtual and second one in IWriter which is virtual. They are completely different despite "looking" similar.
I suggest to check this question. The good thing about non-virtual methods is that no matter what you do, as long as they don't call virtual methods, their behavior will never change. I.e. somebody deriving from your class with non-virtual methods will not break existing method by masking them. Virtual methods are designed to be overriden. The price of that is that it is possible to break underlying logic by improperly overriding virtual method. And this is a root of your problem.
Let's say what you propose is allowed. (automatic conversion to virtual with multiple inheritance) There two possible solutions:
Solution #1
MyWriter becomes virtual. Consequences: All existing C++ code in the world becomes easy to break via typo or name clash. MyWriter method was not supposed to be overriden initially, so suddenly turning it into virtual will (murphy's law) break underlying logic of MyWriter class when somebody derives from MyOwnClass. Which means that suddenly making MyWriter::write virtual is a bad idea.
Soluion #2
MyWriter remains static BUUUT it is included temporarily as a virtual method into IWriter, until overriden. At first glance there's nothing to worry about, but let's think about it. IWriter implements some kind of concept you had in mind, and it is supposed to do something. MyWriter implements another concept. To assign MyWriter::write as IWriter::write method you need two guarantees:
Compiler must ensure that MyWriter::write does what IWriter::write() is supposed to do.
Compiler must ensure that calling MyWriter::write from IWriter will not break existing functionality in MyWriter code programmer expects to use elsewhere.
So, the thing is that compiler cannot guarantee that. Functions have similar name and argument list, but by Murphy's law that means that they're prbably doing completely different thing. (sinf and cosf have same argument list, for example), and it is unlikely that compiler will be able to predict the future and make sure that at no point in development will MyWriter be changed in such way that it will become incompatible with IWriter. So, since machine can't make reasonable decision (no AI for that) by itself, it has to ask YOU, programmer - "What is it you wish to do?". And you say "redirect virtual method into MyWriter::write(). It totally won't break anything. I think.".
And that's why you must specify which method you want to use manually....
Doing it automatically would be unintuitive and surprising. C++ does not assume that multiple base classes are related to each other, and protects the user against name collisions between their members by defining nested name specifiers for nonstatic members. Adding implicit declarations to MyOwnClass where signatures from IWriter and MyWriter collide would be antithetical to protecting names.
However, C++11 extensions do bring us closer. Consider this:
class MyOwnClass: private MyWriter, public IWriter {
public:
void write(std::vector<char> const& data) final = MyWriter::write;
};
This mechanism would be safe because it expresses that MyWriter doesn't expect any further overrides, and convenient because it names the function signature that will be "joined" but nothing more. Also, final would be ill-formed if the function weren't implicitly virtual, so it checks that the signature matches the virtual interface.
On one hand, most interfaces don't just happen to match up this way. Defining this feature to work only with identical signatures would be safe but rarely useful. Defining it as a shortcut to a delegating function body would be useful but fragile. So it might not really be a good feature
On the other hand, this is a good design pattern to provide functionality which isn't virtual when you don't need it to be. So given this idiom, we might use it to write good code, even if it doesn't match up well with current practices.
Why doesn't C++ do that?
I'm not sure what you're asking here. Could C++ be rewritten to allow this? Yes, but to what end?
Because MyWriter and IWriter are completely different classes, it is illegal in C++ to call a member of MyWriter through an instance of IWriter. The member pointers have completely different types. And just as a MyWriter* is not convertible to a IWriter*, neither is a void (MyWriter::*)(const std::vector<char>&) convertible to a void (IWriter::*)(const std::vector<char>&).
The rules of C++ don't change just because there could be a third class that combines the two. Neither class is a direct parent/child relative of one another. Therefore, they are treated as entirely distinct classes.
Remember: member functions always take an additional parameter: a this pointer to the object that they point to. You cannot call void (MyWriter::*)(const std::vector<char>&) on an IWriter*. The third class can have a method that casts itself into the proper base class, but it must actually have this method. So either you or the C++ compiler must create it. The rules of C++ require this.
Consider what would have to happen to make this work without a derived-class method.
A function gets an IWriter*. The user calls the write member of it, using nothing more than the IWriter* pointer. So... exactly how can the compiler generate the code to call MyWriter::writer? Remember: MyWriter::writer needs a MyWriter instance. And there is no relationship between IWriter and MyWriter.
So how exactly could the compiler do the type coercion locally? The compiler would have to check the virtual function to see if the actual function to be called takes IWriter or some other type. If it takes another type, it would have to convert the pointer to its true type, then do another conversion to the type needed by the virtual function. After doing all of that, it would then be able to make the call.
All of this overhead would affect every virtual call. All of them would have to at least check to see if the actual function to be call. Every call will also have to generate the code to do the type conversions, just in case.
Every virtual function call would have a "get type" and conditional branch in it. Even if it is never possible to trigger that branch. So you would be paying for something regardless of whether you use it or not. That's not the C++ way.
Even worse, a straight v-table implementation of virtual calls is no longer possible. The fastest method of doing virtual dispatch would not be a conforming implementation. The C++ committee is not going to make any change that would make such implementations impossible.
Again, to what end? Just so that you don't have to write a simple forwarding function?
Just make MyWriter derive from IWriter, eliminate the IWriter derivation in MyOwnClass, and move on with life. This should resolve the problem and should not interfere with the template code.

Inheritance in C++ internals

Can some one explain me how inheritance is implemented in C++ ?
Does the base class gets actually copied to that location or just refers to that location ?
What happens if a function in base class is overridden in derived class ? Does it replace it with the new function or copies it in other location in derived class memory ?
first of all you need to understand that C++ is quite different to e.g. Java, because there is no notion of a "Class" retained at runtime. All OO-features are compiled down to things which could also be achieved by plain C or assembler.
Having said this, what acutally happens is that the compiler generates kind-of a struct, whenever you use your class definition. And when you invoke a "method" on your object, actually the compiler just encodes a call to a function which resides somewhere in the generated executable.
Now, if your class inherits from another class, the compiler somehow includes the fields of the baseclass in the struct he uses for the derived class. E.g. it could place these fields at the front and place the fields corresponding to the derived class after that. Please note: you must not make any assumptions regarding the concrete memory layout the C++ compiler uses. If you do so, you're basically on your own and loose any portability.
How is the inheritance implemented? well, it depends!
if you use a normal function, then the compiler will use the concrete type he's figured out and just encode a jump to the right function.
if you use a virtual function, the compiler will generate a vtable and generate code to look up a function pointer from that vtable, depending on the run time type of the object
This distinction is very important in practice. Note, it is not true that inheritance is allways implemented through a vtable in C++ (this is a common gotcha). Only if you mark a certain member function as virtual (or have done so for the same member function in a baseclass), then you'll get a call which is directed at runtime to the right function. Because of this, a virtual function call is much slower than a non-virtual call (might be several hundered times)
Inheritance in C++ is often accomplished via the vtable. The linked Wikipedia article is a good starting point for your questions. If I went into more detail in this answer, it would essentially be a regurgitation of it.

Is there a way to determine at runtime if an object can do a method in C++?

In Perl, there is a UNIVERSAL::can method you can call on any class or object to determine if it's able to do something:
sub FooBar::foo {}
print "Yup!\n" if FooBar->can('foo'); #prints "Yup!"
Say I have a base class pointer in C++ that can be any of a number of different derived classes, is there an easy way to accomplish something similar to this? I don't want to have to touch anything in the other derived classes, I can only change the area in the base class that calls the function, and the one derived class that supports it.
EDIT: Wait, this is obvious now (nevermind the question), I could just implement it in the base that returns a number representing UNIMPLEMENTED, then check that the return is not this when you call it. I'm not sure why I was thinking of things in such a complicated manner.
I was also thinking I would derive my class from another one that implemented foo then see if a dynamic cast to this class worked or not.
If you have a pointer or reference to a base class, you can use dynamic_cast to see which derived class it is (and therefore which derived class's methods it supports).
If you can add methods to the base class, you can add a virtual bool can_foo() {return false;} and override it in the subclass that has foo to return true.
C++ does not have built in run-time reflection. You are perfectly free to build your own reflection implementation into your class hierarchy. This usually involves a static map that gets populated with a list of names and functions. You have to manually register each function you want available, and have consistency as to the calling convention and function signature.
I believe the most-correct way would be to use the typeid<> operator and get a reference to the type_info object, and then you could compare that (== operator) to the desired type_info for the data types you wish to care about.
This doesn't give you method-level inspection, and does require that you've built with RTTI enabled (I believe that using typeid<> on an object that was built without RTTI results with "undefined" behavior), but there you are.
MSDN has an online reference to get you started : http://msdn.microsoft.com/en-us/library/b2ay8610%28VS.80%29.aspx