I hope the title is not confusing. I am trying to understand the following issue that arises from defining methods of a class virtual or not in C++. Imagine I have a base class A and a derived class B, such that
class A {
public:
void print() { cout << "A"; }
};
class B : public A {
public:
void print() { cout << "B"; }
};
If I now execute the code below, the print call prints "A".
A *a = new A();
B *b = new B();
((A *)b)->print(); // this prints "A"
However, if I declare the "print" methods in both classes as virtual, I would instead see "B" printed on my screen. Why exactly does this happen?
If a function is NOT virtual, the compiler just uses the static type of the expression. So when you cast a B pointer to an A pointer, the call goes to A::print.
If you use virtual, the compiler builds a table of function pointers [1], and when the function is called, the compiler generates code that calls through that table rather than just looking at the static type. This allows a call through a base type to reach a function in a derived class, which is what makes polymorphic behaviour possible.
[1] Technically, the specification doesn't tell us how to implement this, but this is how nearly all compilers work. If the compiler can produce the same behaviour using magic, it is allowed to do so, as long as the magic is reliable and reproducible.
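To make the "table of function pointers" idea concrete, here is a hand-rolled sketch of what such a table might look like. It is an illustration only: the names VTable, Object, vptr, call_print, make_a and make_b are invented for this example, and real compilers are free to lay things out differently.

```cpp
#include <string>

struct Object;  // forward declaration so the table can mention it

// Stand-in for a compiler-generated table: one function pointer per virtual function.
struct VTable {
    std::string (*print)(const Object&);
};

// Stand-in for a polymorphic object: it carries a hidden pointer to its class's table.
struct Object {
    const VTable* vptr;
};

std::string print_a(const Object&) { return "A"; }
std::string print_b(const Object&) { return "B"; }

// One table per "class", shared by all objects of that class.
const VTable vtable_a{&print_a};
const VTable vtable_b{&print_b};

// A "virtual call": the function is looked up through the object's table at run time.
std::string call_print(const Object& obj) { return obj.vptr->print(obj); }

Object make_a() { return Object{&vtable_a}; }
Object make_b() { return Object{&vtable_b}; }
```

Here call_print(make_b()) returns "B" even though call_print only knows about Object, which mirrors how a call through an A* can end up in B::print.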
A *b = new B(); // LHS: static type, known at compile time; RHS: the object, created at run time
b->print();
If print is non-virtual: at compile time, the call is resolved to A::print() from the static type of b (it is an ordinary function). At run time, A::print() is what runs.
If print is virtual: at compile time, the call is not bound directly to A::print(); the compiler emits a lookup instead.
At run time, the lookup finds B::print() from the actual type of the object, and the call is dispatched to it.
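The two cases can be put side by side in a small sketch. The class names PlainA, PlainB, VirtA and VirtB and the two helper functions are made up for this illustration, and the functions return strings instead of printing so the result is easy to check.

```cpp
#include <string>

// Non-virtual version: PlainB::print hides PlainA::print, it does not override it.
struct PlainA { std::string print() const { return "A"; } };
struct PlainB : PlainA { std::string print() const { return "B"; } };

// Virtual version: VirtB::print overrides VirtA::print.
struct VirtA {
    virtual ~VirtA() = default;
    virtual std::string print() const { return "A"; }
};
struct VirtB : VirtA {
    std::string print() const override { return "B"; }
};

// Resolved at compile time from the static type PlainA: always PlainA::print.
std::string via_plain_base(const PlainA* p) { return p->print(); }

// Dispatched at run time on the dynamic type of *p.
std::string via_virtual_base(const VirtA* p) { return p->print(); }
```

Passing the address of a PlainB to via_plain_base gives "A", while passing the address of a VirtB to via_virtual_base gives "B".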
I read about virtual functions, but the concept is still not clear to me.
In the example below, we create a base pointer, first assign it a base object and call the function in the base class, then assign it a derived object and call its function. Since we have already stated which objects will be assigned, doesn't the compiler know which object's function to call during compilation? I don't understand why the decision is delayed until run time. Am I missing something here?
#include <iostream>
using std::cout;
using std::endl;
// Virtual function selection
class Base
{
public:
virtual void print() const
{
cout << "Inside Base" << endl;
}
};
class Derived : public Base
{
public:
// virtual as well
void print() const
{
cout << "Inside Derived" << endl;
}
};
int main()
{
Base b;
Derived f;
Base* pb = &b; // points at a Base object
pb->print(); // call Base::print()
pb = &f; // points at Derived object
pb->print(); // call Derived::print()
}
In your particular case, the compiler could potentially figure out the type of the objects being pointed at by the base class pointer. But the virtual dispatch mechanism is designed for situations in which you do not have this information at compile time. For example,
int n;
std::cin >> n;
Base b;
Derived d;
Base* pb = n == 42 ? &b : &d;
Here, the choice is made based on user input. The compiler cannot know what pb will point to.
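A self-contained variant of this idea, as a sketch: the parameter n stands in for the value read from std::cin, and the function name print_for is invented for the example. The print functions return strings rather than writing to cout so the outcome can be checked.

```cpp
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string print() const { return "Inside Base"; }
};
struct Derived : Base {
    std::string print() const override { return "Inside Derived"; }
};

// n plays the role of user input: its value is unknown when this function is compiled.
std::string print_for(int n) {
    Base b;
    Derived d;
    const Base* pb = (n == 42) ? &b : &d;  // which object pb points to depends on n
    return pb->print();                    // so the function to call is chosen at run time
}
```

print_for(42) returns "Inside Base" and any other argument returns "Inside Derived", from the same pb->print() call site.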
Since we have already stated which objects will be assigned, doesn't the compiler know which object's function to call during compilation? I don't understand why the decision is delayed until run time.
In this very specific, contrived case, your compiler can optimise out all the polymorphism, yes.
Am I missing something here?
The imagination to realise that the vast majority of code in real life is not this simple. There are infinitely many C++ programs for which the compiler does not have enough information to perform this optimisation.
As I understand it, the compiler just looks at the static type of the pointer or reference at compile time and binds the call to the function declared in that class. Since Derived::print() should be called here, you have to make the print function virtual in the base class, so that the compiler delays the binding until run time and uses the function defined in the derived class.
Because it is virtual, the function can be bound dynamically to the correct object. This means that a call through the pointer runs the function of the object actually pointed to.
I've tried to map it out in my head, but honestly I have no idea what's really going on here.
What exactly is happening when I add and remove the virtual keyword from the below example?
#include <iostream>
#include <string>
class A {
public:
A() { me = "From A"; }
void caller() { func(); }
virtual void func() { std::cout << me << std::endl; } // THIS LINE!
private:
std::string me;
};
class B : public A {
public:
B() { me = "From B"; }
void func() { std::cout << me << std::endl; }
private:
std::string me;
};
int main() {
A a;
a.caller();
B b;
b.caller();
return 0;
}
With the virtual keyword, it prints "From A", then "From B".
Without the virtual keyword, it prints "From A", then "From A".
So far, this is the only time I've found a use for virtual functions without pointers being involved. I thought that if the virtual keyword was removed, the compiler would do the standard thing which is to overload the inherited function and end up printing "From A", and "From B" anyway.
I think this is deeper than just the VTable, and that it's more about the way it behaves in particular circumstances. Does B even have a VTable?
The call
func()
is equivalent to
this->func()
so there is a pointer involved.
Still, there's no need to involve pointers to understand the behavior.
Even a direct call of e.g. b.func() has to work as if it's a virtual call, when func is virtual in the statically known type. The compiler can optimize it based on knowing the most derived type of b. But that's a different kind of consideration (optimizations can do just about anything).
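The role of the implicit this pointer can be shown with a reduced version of the question's classes. This is only a sketch: the functions return strings instead of printing, the me members are left out, and PlainA and PlainB are invented names for the variant without virtual.

```cpp
#include <string>

// With virtual: caller() calls this->func(), which is dispatched on the dynamic type.
struct A {
    virtual ~A() = default;
    std::string caller() const { return this->func(); }  // same as writing func()
    virtual std::string func() const { return "From A"; }
};
struct B : A {
    std::string func() const override { return "From B"; }
};

// Without virtual: inside PlainA::caller, func() is bound to PlainA::func at compile time.
struct PlainA {
    std::string caller() const { return func(); }
    std::string func() const { return "From A"; }
};
struct PlainB : PlainA {
    std::string func() const { return "From B"; }  // hides PlainA::func, does not override it
};
```

Calling caller() on a B object yields "From B", while calling it on a PlainB object yields "From A", because caller() was compiled knowing only the static type of this.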
Apart from the issue of virtual dispatch, what may cause extra confusion is that you have two members named me, one declared in A and another declared in B. These are two distinct objects.
An object of type B has two data members of type std::string: one of its own, and one incorporated into the subobject of type A. The latter, though, is not directly visible in the methods of B, because its name is hidden by the new me introduced in that class. (The qualified name A::me refers to it, but since me is private in A here, B's methods could only use that name if A::me were made protected or public.)
Therefore, even though the bodies of A::func and B::func seem identical, the identifier me used in both of them refers to different members.
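A small sketch of the two members, under one loudly stated assumption: A::me is declared protected here (it is private in the question) so that B is allowed to name it. The accessor names a_me, b_me and inherited_me are invented for the illustration.

```cpp
#include <string>

class A {
public:
    A() : me("From A") {}
    std::string a_me() const { return me; }  // inside A, me means A::me
protected:  // assumption: protected instead of private, so B can refer to A::me
    std::string me;
};

class B : public A {
public:
    B() : me("From B") {}                               // initializes B::me only
    std::string b_me() const { return me; }             // inside B, me means B::me
    std::string inherited_me() const { return A::me; }  // qualified name reaches the hidden member
private:
    std::string me;  // a second, distinct member that hides A::me
};
```

A single B object then reports "From A" through a_me() and inherited_me(), and "From B" through b_me().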
In your example, you won't see the difference:
With the virtual function, the compiler generates a call via the VTable, and at runtime each object calls the right function for its real class.
With the non-virtual function, the compiler determines at compile time the right function to call, based on the object's declared class.
Now try the following, to see the virtual function in action:
A *pa = &b; // pointer to an A: valid, as b is a B, which is also an A.
pa->caller(); // guess what will be called if virtual or not.
You don't need pointers to experiment with virtual functions. You can observe the same effect with references as well:
A& ra = b; // a reference to an A; it could just as well be a parameter passed by reference.
ra.caller();
Virtual functions are useful for polymorphism. The idea is that you work with a general object of a class, but you don't know at compile time whether at runtime the object will really be of this class or a more specialized object (of a class inheriting from it).
Given the code below:
class Base
{
public:
virtual void f()
{
std::cout << "virtual Base::f()\n";
}
};
class D1 : public Base
{
public:
virtual void f()
{
std::cout << "virtual D1::f()\n";
}
};
int main()
{
D1 d1;
Base *bp = &d1;
bp->f();
return 0;
}
The output was exactly what I had expected:
virtual D1::f()
Press <RETURN> to close this window...
But once I removed the virtual void f() from class Base, the compiler complained that:
error: 'class Base' has no member named 'f'
Can anyone tell me why the compiler didn't generate code that binds the virtual function at run time?
You are calling virtual member functions via a pointer to Base. That means that you can only call methods that exist in the Base class. You cannot simply add methods to a type dynamically.
Although a little late as an answer, straight quote from C++ Primer on how function calls are resolved in relation to inheritance. Your code fails on name lookup (step 2 below), which is done statically.
Understanding how function calls are resolved is crucial to understanding inheritance in C++. Given the call p->mem() (or obj.mem()), the following four steps happen:
1. First determine the static type of p (or obj). Because we’re calling a member, that type must be a class type.
2. Look for mem in the class that corresponds to the static type of p (or obj). If mem is not found, look in the direct base class and continue up the chain of classes until mem is found or the last class is searched. If mem is not found in the class or its enclosing base classes, then the call will not compile.
3. Once mem is found, do normal type checking (§6.1, p. 203) to see if this call is legal given the definition that was found.
4. Assuming the call is legal, the compiler generates code, which varies depending on whether the call is virtual or not:
– If mem is virtual and the call is made through a reference or pointer, then the compiler generates code to determine at run time which version to run based on the dynamic type of the object.
– Otherwise, if the function is nonvirtual, or if the call is on an object (not a reference or pointer), the compiler generates a normal function call.
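The last step of the quoted rules can be tried out with a short sketch. The helper names through_reference, through_copy and only_in_d1 are invented for this example, and f returns a string rather than printing.

```cpp
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string f() const { return "Base::f"; }
};
struct D1 : Base {
    std::string f() const override { return "D1::f"; }
    std::string only_in_d1() const { return "D1::only_in_d1"; }
};

// Call through a reference: f is virtual, so the dynamic type decides.
std::string through_reference(const Base& r) { return r.f(); }

// Call on an object: obj is a Base (a sliced copy of the argument), so Base::f runs.
std::string through_copy(Base obj) { return obj.f(); }

// Writing r.only_in_d1() inside through_reference would not compile:
// name lookup starts in Base, the static type of r, and Base has no such member.
```

Passing the same D1 object to both helpers gives "D1::f" from through_reference and "Base::f" from through_copy.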
Just use this main function:
int main()
{
D1 d1;
D1 *bp = &d1;
bp->f();
return 0;
}
This is because if bp is a Base pointer, name lookup starts from class Base, which has no member f.
I know that when a base class pointer that points to a derived class object is used to call a virtual function, the compiler uses dynamic binding to call the derived version.
But when a base class pointer that points to a base class object is used to call a virtual function, does the compiler use dynamic binding or static binding?
For example:
class Base
{
public:
virtual void show()
{
cout << "base class";
}
};
int main()
{
Base *pb; //Base class pointer
Base b; //Base class object
pb = &b;
pb->show(); //Is static binding or dynamic binding?
}
Because my English is not very good, I wanted to keep my question as simple as possible, but I will describe it in more detail below:
Actually, the problem comes from my attempt to summarize what triggers dynamic binding.
At first, I summarized the trigger conditions as:
1. The function must be a virtual function.
2. The function must be called through a pointer or reference.
These two trigger conditions lead to the question I asked:
"When a base class pointer points to a base class object, will the compiler use dynamic binding?"
I searched Google for an answer and found this fragment (the demo is here):
struct A {
virtual void f() { cout << "Class A" << endl; }
};
struct B: A {
//Note that it not overrides A::f.
void f(int) { cout << "Class B" << endl; }
};
struct C: B {
void f() { cout << "Class C" << endl; }
};
int main() {
B b; C c;
A* pa1 = &b;
A* pa2 = &c;
// b.f();
pa1->f();
pa2->f();
}
The following is the output of the above example:
"Class A"
"Class C"
Since pa1->f() outputs Class A, I added a third trigger condition:
3. The function in the base class must be overridden in the derived class.
Now, according to the three trigger conditions, when a base class pointer that points to a base class object is used to call a virtual function, the compiler will use static binding, because the virtual function is not overridden.
But when a derived class pointer that points to a derived class object is used to call a virtual function, it will use dynamic binding, because the virtual function is overridden.
This confuses me a lot.
It can choose whichever, or neither, depending on how smart it is and how well it can detect. The rule is polymorphism must work. How this is achieved is an implementation detail.
If the same end-result can be achieved with both dynamic or static binding, as is the case here, both are valid options for the compiler.
In your case, the function doesn't have to be called at all - the generated code could be just as well identical to code generated by
int main()
{
cout << "base class";
}
I guess it depends on compiler optimization. The compiler might be clever enough to figure out that Base::show is always the one called, or it might not. You can look at the disassembly to find out. You can force static binding with pb->Base::show()
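A qualified call of this kind can be compared with an ordinary virtual call in a short sketch. The Derived class and the two helper functions are additions for illustration (the question only has Base), and show returns a string instead of printing.

```cpp
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string show() const { return "base class"; }
};
struct Derived : Base {
    std::string show() const override { return "derived class"; }
};

// Ordinary call through a pointer: virtual dispatch on the dynamic type.
std::string dynamic_call(const Base* pb) { return pb->show(); }

// Qualified call: names Base::show explicitly, which suppresses virtual dispatch.
std::string qualified_call(const Base* pb) { return pb->Base::show(); }
```

With a pointer to a Derived object, dynamic_call returns "derived class" while qualified_call returns "base class".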
Short answer: No. At least not in theory, because in theory the compiler does not know whether the pointer points to a Base, a Derived, or a YetAnotherDerived object. Therefore it has to apply the same mechanism regardless of the dynamic type of the object.
But in practice, compilers have optimizers capable of identifying some cases where the dynamic type is known. In your case it can detect the aliasing: it knows that pb points to b, and that b is a local variable that cannot be changed concurrently, so it knows that you are in fact calling b.show() and may simplify the generated code to reflect that and get rid of the virtual dispatch. Similar optimizations are possible e.g. in this code:
auto pb = make_unique<Base>();
pb->show();
But as with any optimization, it is up to the compiler whether to apply it. The standard says virtual dispatch happens even if the pointer points to a Base object, and that's it.
class base {
public:
virtual void foo()
{
cout << "Base class virtual function " << endl;
}
};
class Derived : public base {
public:
void foo()
{
cout << "Derived class virtual function " << endl;
}
};
int main()
{
Base b, *ptr;
Derived d;
ptr = &b;
ptr->foo();
ptr = &d;
ptr->foo();
}
Hi,
I have a doubt regarding dynamic binding here. Since the compiler knows that for b.foo() it can use the base virtual function, and for d.foo() it can use the derived version of foo, it has every bit of information at compile time, yet the literature says that the function to be used is decided at run time.
In the specific example you give here, you're correct; the compiler has enough information to know which function to call at compile time. The compiler thus does not have to look up the function at run time, but this is a compiler optimization that does not affect the result of the program - the compiler is free to bypass the lookup or not.
In this specific case [1], the compiler (at least in theory) can determine which function is called in each case, and generate code to simply call the two functions directly (and when I've tested it, most compilers will optimize code like this exactly that way).
A more typical case for polymorphism would involve some sort of user input, so the function to be called can't be determined statically. For example, using your Base and Derived classes, consider something like:
int main(int argc, char **argv) {
Base b;
Derived d;
Base *arr[] = {&b, &d};
int i = atoi(argv[1]) != 0;
arr[i]->foo();
return 0;
}
In this case, the compiler can't determine the correct function to call statically -- depending on what you pass on the command line when you run it, it might use either the Base or the Derived version of foo().
You also seem to have a kind-of intermediate case you started to try to include in your code, but never really completed -- you initialize ptr to point at your Base object and then your Derived object, but you never invoke a function via ptr, only directly on the objects themselves. If you did invoke the functions via the pointer, it would be harder to optimize than when working only directly with concrete objects. Fewer compilers will determine the type statically in this case, but at least some still can/will.
[1] Well, almost this specific case anyway. As it stands right now, the code won't compile, because you've defined a class named base (lower-case 'b') and tried to instantiate a class named Base (upper-case 'B'). Also, your "Derived" class isn't actually derived from Base, as you presumably intended. Once those are fixed though...
Are you saying that the compiler can notice that you cast d to Base *ptr and resolve the actual function address at runtime?
If so, here is a sample:
if (rand() % 2 == 0)
{
ptr = &b;
}
else
{
ptr = &d;
}
ptr->foo();
How can the compiler know the address that ptr will hold at run time? That's why dynamic binding is there.