I have been studying C++ for an exam, and I thought I had worked through most of the common C++ misconceptions with much effort. But I've encountered an exercise from a past exam that is driving me crazy: it combines virtual methods and inheritance in a way that I don't seem to understand. Here is the code:
#include <iostream>
class B;
class A {
public:
virtual A* set(A* a) = 0;
};
class B : public A {
public:
virtual A* set(B* b) {
std::cout << "set1 has been called" << std::endl;
b = this;
return b;
}
virtual B* set(A* a) {
std::cout << "set2 has been called" << std::endl;
a = this;
return this;
}
};
int main(int argc, char *argv[]) {
B *b = new B();
A *a = b->set(b);
a = b->set(a);
a = a->set(b);
a = a->set(a);
return 0;
}
the output is
set1 has been called
set2 has been called
set2 has been called
set2 has been called
From what I've gathered, the first call (b->set(b)) calls the first method of class B and returns b itself, and then this object reference gets cast to A*. Does that mean the object b is now of type A?
So I have A *a = (A *)b;
Now it makes sense to me that I should call set of A, since I have this situation in my mind:
objectOfTypeA->set(objectOfTypeA), so I'm not supposed to look into virtual methods, since the two objects are base classes?
Anyway, as you can see I am quite confused, so bear with me if I make stupid errors. I would be glad if someone could explain what's going on in this code. I tried to search the web, but I only find small and easy examples that don't cause trouble.
The program demonstrates how member functions are looked up. The static type of the object expression determines which overloads name lookup finds, and therefore which overload is selected. The dynamic type then determines which override of that function actually runs.
Perhaps the key point is that different overloads of the same name are really different functions.
Since A has only one set member, there is only one thing that can happen when you call a->set(), no matter what the argument is. But when you call b->set(), there are two candidate functions, and the best match is selected.
Since nothing derives from B, neither B::set is overridden further, so it makes no difference whether they are declared virtual in B (set(A*) is virtual anyway, because it overrides A::set). Virtual members of the same class don't talk to each other at all.
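The two-step rule above (overload chosen from the static type, override chosen from the dynamic type) can be sketched in a smaller example. The names Base, Derived, call_through_base and call_through_derived are made up for illustration, and the functions return strings instead of printing so the result is easy to inspect.

```cpp
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string set(Base*) { return "Base::set(Base*)"; }
};

struct Derived : Base {
    // Overrides Base::set(Base*): same name and same parameter list.
    std::string set(Base*) override { return "Derived::set(Base*)"; }
    // A separate overload; it is only visible when the static type is Derived.
    virtual std::string set(Derived*) { return "Derived::set(Derived*)"; }
};

// Static type of obj is Base*: set(Base*) is the only candidate,
// so arg is implicitly converted to Base*.
std::string call_through_base(Base* obj, Derived* arg) {
    return obj->set(arg);
}

// Static type of obj is Derived*: both overloads are candidates,
// and set(Derived*) is the exact match.
std::string call_through_derived(Derived* obj, Derived* arg) {
    return obj->set(arg);
}
```

With a Derived object d, call_through_base(&d, &d) ends up in Derived::set(Base*), while call_through_derived(&d, &d) ends up in Derived::set(Derived*), which mirrors the set2/set1 split in the question.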
Potatoswatter is right, but I think I have a slightly "clearer" explanation. I think the OP is getting confused about what happens at run-time with dynamic type lookup versus at compile-time, and about when up-casting happens automatically versus when it does not.
First off, return type does NOT affect which overload is called. You probably know that, but it bears repeating. A return type mismatch will cause an error at compile-time, not at run-time, and does not affect which overload is called. It's also worth noting that, as long as the pointer types are compatible (in a hierarchy together), returning a pointer never "changes" what it points to. It still refers to the same object, unlike converting floats to ints, where there is an actual change.
Now to go through the calls one by one. This is my understanding of the process, not necessarily what the standard says or what "really" happens.
When you call b->set(b) the compiler (not the run-time) goes "looking for a method named set with an argument of pointer to B", which it finds in the one that outputs set1. It's virtual, so there's code to check whether the object is of a more derived type that overrides it, but there isn't one, so it just calls it and returns the this pointer into a.
Now you're calling b->set(a). Again it's the compiler that goes "does b have an overload that takes pointer to A?" Yes it does, so it calls the "set2" method. It's the compiler that sees an A*, and so the call is "determined" at that point. Even though the pointer points to an object that is of type B, the compiler doesn't know that, or care. So it's the compile-time types of the arguments that determine which overloaded method gets chosen. From that point on, where in the hierarchy the virtual call lands depends on the underlying type of the this pointer, but only downward.
Here's a different case though. Try this: b->set(dynamic_cast<B*>(a)). This should call the "set1" method, because the compiler definitely has a pointer to B (even if it's nullptr).
Now the third case: a->set(b). What's happening here is the compiler says "there is only one set method, so can the argument be up-cast or converted to that type?" The answer is yes, as B is a child of A. So that conversion happens transparently, and the compiler emits a virtual call to A::set(A*). This is all decided at compile time, before the "real" type of what a points to is known. Then at run-time, the program "walks the virtual" and finds the most derived override, the B::set(A*) method that emits "set2". The actual type of what the argument points to isn't used, only the type of the object to the left of the arrow operator, and that only determines how far down the hierarchy the call goes.
And the fourth case is just the third again. The type of the argument (the pointer, not what is pointed to) is compatible, so it goes as before. If you want a dramatic demonstration of this, try this:
a->set((A*)nullptr); // prints "set2 has been called"
b->set((A*)nullptr); // prints "set2 has been called"
b->set((B*)nullptr); // prints "set1 has been called"
The underlying type of what the arguments point to doesn't affect dynamic dispatch. Only their "surface" type affects the overload called.
Related
While trying to understand the inner workings of virtual functions and RTTI, I observed the following by examining the output of the gcc compiler:
When structs or classes have a virtual function, the space they occupy is expanded by prepending a pointer that identifies their type.
Moreover, and worse, when multiple inheritance is involved, each base subobject adds another pointer. So, for example, a structure such as:
struct C: public A, public B{...}
creates a structure with the data of A and B plus two pointers.
That costs memory and efficiency.
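The size overhead described above can be observed with sizeof. This is only a sketch: the exact numbers are implementation-specific (the standard does not mandate a vtable pointer at all), and the types Plain and Poly and the two helper functions are invented for illustration.

```cpp
#include <cstddef>

struct Plain { int value; };                           // no virtual functions
struct Poly  { int value; virtual ~Poly() = default; }; // polymorphic

struct A { int a; virtual ~A() = default; };
struct B { int b; virtual ~B() = default; };
struct C : A, B { int c; };                            // two polymorphic bases

// Extra bytes a polymorphic object costs compared with a plain one
// (typically one pointer plus padding on mainstream compilers).
std::size_t per_object_overhead() { return sizeof(Poly) - sizeof(Plain); }

// Size of an object with two polymorphic base subobjects
// (typically three ints plus two hidden pointers, plus padding).
std::size_t size_of_c() { return sizeof(C); }
```

On a typical 64-bit gcc build, per_object_overhead() is larger than one pointer because of alignment padding, but treat any concrete value as an observation about one compiler, not a guarantee.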
The question is whether that is needed.
I don't have experience working with these matters, but the line of reasoning that I would like checked for flaws is as follows:
a) When we declare a variable of a certain struct type, we know its type. So when we have to pass that variable by value to another function, we know its type and we may pass it as a plain old structure along with an extra value indicating its type (I mean the compiler should take care of that).
b) When we pass a struct by reference or by pointer, we again know the starting type, and we may pass a pointer along with an extra value indicating the type.
In order to assign a struct passed by value to a narrower type, we just have to take the necessary subpart (offset calculation).
In order to convert to a narrower pointer, we have to adjust the pointer and change the extra value indicating the type accordingly.
So along that line of thought, we need the extra type indication only when passing arguments between functions on the stack, and it isn't necessary to store it in each struct.
What am I missing?
The only case I see that would need the type information stored along with the data is when we use a union and need to store any one of a specific list of structures, but that is in any case resolved by other means.
So the question is whether dynamic type information is needed other than when passing arguments between functions, and if so, where that is mandated by the standard.
RTTI and dynamic_cast need to be able to deduce the layout of the object from any one of its interfaces.
How will you do that unless each interface encapsulates a pointer to this information?
Example of RTTI:
struct A { virtual ~A() = default; }; // polymorphic base class (it has a virtual function)
struct B : A {}; // derived polymorphic class
A* p = new B();
std::cout << typeid(*p).name() << std::endl;
Exercise for reader:
Which class's name gets printed?
Why?
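A hedged sketch of why the per-object type information matters for typeid: with a polymorphic type the dynamic type is reported, while with a non-polymorphic type only the static type is available. The class names and helper functions below are invented for illustration, and the comparison uses typeid equality because the string returned by name() is implementation-defined.

```cpp
#include <typeinfo>

struct Shape { virtual ~Shape() = default; };  // polymorphic
struct Circle : Shape {};

struct NonPoly {};                             // no virtual functions
struct NonPolyChild : NonPoly {};

// For a polymorphic type, typeid inspects the object at run time.
bool dynamic_type_is_circle(const Shape& s) {
    return typeid(s) == typeid(Circle);
}

// For a non-polymorphic type, typeid uses the static type of the
// expression, so a NonPolyChild passed here is reported as NonPoly.
bool reports_child(const NonPoly& n) {
    return typeid(n) == typeid(NonPolyChild);
}
```

Passing a Circle to dynamic_type_is_circle yields true, whereas passing a NonPolyChild to reports_child yields false, because nothing stored in the object records its real type.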
I think the following code will explain it all:
// in header.h
struct Base {
virtual void foo() = 0;
};
// in impl.cpp
void work(Base* b) {
b->foo();
}
Assume the code above gets compiled into a C++ library and is never touched afterwards.
Now, some time later, we write the following code:
#include <iostream>
#include "header.h"
struct D : Base {
void foo() { std::cout << "Foo!\n"; }
};
void my() {
D* d = new D;
work(d);
}
As you see, there is no way work() could have been told about the type when it was compiled: D didn't even exist! The elegance of virtual functions implemented through pointers to a virtual function table (this is what those pointers are) is that the above design is allowed.
I was studying virtual functions and pointers. The code below made me wonder: why does one need virtual functions when we can cast a base class pointer the way we want?
class baseclass {
public:
void show() {
cout << "In Base\n";
}
};
class derivedclass1 : public baseclass {
public:
void show() {
cout << "In Derived 1\n";
}
};
class derivedclass2 : public baseclass {
public:
void show() {
cout << "In Derived 2\n";
}
};
int main(void) {
baseclass * bptr[2];
bptr[0] = new derivedclass1;
bptr[1] = new derivedclass2;
((derivedclass1*) bptr)->show();
((derivedclass2*) bptr)->show();
delete bptr[0];
delete bptr[1];
return 0;
}
It gives the same result as when we use virtual in the base class:
In Derived 1
In Derived 2
Am I missing something?
Your example appears to work, because there is no data, and no virtual methods, and no multiple inheritance. Try adding int value; to derivedclass1, const char *cstr; to derivedclass2, initialize these in corresponding constructors, and add printing these to corresponding show() methods.
You will see how show() will print garbage value (if you cast pointer to derivedclass1 when it is not) or crash (if you cast the pointer to derivedclass2 when class in fact is not of that type), or behave otherwise oddly.
C++ class member functions AKA methods are nothing more than functions, which take one hidden extra argument, this pointer, and they assume that it points to an object of right type. So when you have an object of type derivedclass1, but you cast a pointer to it to type derivedclass2, then what happens without virtual methods is this:
method of derivedclass2 gets called, because well, you explicitly said "this is a pointer to derivedclass2".
the method gets pointer to actual object, this. It thinks it points to actual instance of derivedclass2, which would have certain data members at certain offsets.
if the object actually is a derivedclass1, that memory contains something quite different. So if method thinks there is a char pointer, but in fact there isn't, then accessing the data it points to will probably access illegal address and crash.
If you instead use virtual methods and have a pointer to a common base class, then when you call a method, the compiler generates code to call the right method. It actually inserts code and data (a table filled with virtual method pointers, usually called the vtable, one per class, and a pointer to it, one per object/instance) with which it knows to call the right method. So whenever you call a virtual method, it's not a direct call; instead the object has an extra pointer to the vtable of its real class, which tells what method should really be called for that object.
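To make the vtable description above concrete, here is a hand-rolled imitation of the mechanism. It is a sketch of the usual implementation strategy, not what any particular compiler emits; every name in it (Object, VTable, call_show and so on) is hypothetical.

```cpp
#include <string>

struct Object;

// One table per "class": a slot for each virtual function.
struct VTable {
    std::string (*show)(const Object*);
};

// One hidden pointer per object, pointing at the table of its real class.
struct Object {
    const VTable* vptr;
};

std::string show_derived1(const Object*) { return "In Derived 1"; }
std::string show_derived2(const Object*) { return "In Derived 2"; }

const VTable derived1_vtable{&show_derived1};
const VTable derived2_vtable{&show_derived2};

// "Constructors": they set the hidden pointer.
Object make_derived1() { return Object{&derived1_vtable}; }
Object make_derived2() { return Object{&derived2_vtable}; }

// A "virtual call": fetch the table, fetch the slot, call through it.
// The caller never needs to know the real type.
std::string call_show(const Object* obj) {
    return obj->vptr->show(obj);
}
```

The caller of call_show only sees an Object pointer, yet the function that runs is chosen by the table the object was created with, which is the property the casts in the question cannot provide.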
In summary, type casts are in no way an alternative to virtual methods. And, as a side note, every type cast is a place to ask "Why is this cast here? Is there some fundamental problem with this software, if it needs a cast here?". Legitimate use cases for type casts are quite rare indeed, especially with OOP objects. Also, never use C-style type casts with object pointers, use static_cast and dynamic_cast if you really need to cast.
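If a downcast really is needed, a checked cast at least fails in a detectable way. The following is a minimal sketch with made-up types (Animal, Dog, Cat); dynamic_cast on a pointer yields nullptr when the object is not of the requested type, whereas a C-style cast would silently produce a pointer that must not be used.

```cpp
struct Animal { virtual ~Animal() = default; };  // must be polymorphic for dynamic_cast
struct Dog : Animal { int bark_count = 3; };
struct Cat : Animal {};

// Returns the bark count if the object really is a Dog, otherwise -1.
int bark_count_or_minus_one(Animal* a) {
    if (Dog* d = dynamic_cast<Dog*>(a)) {
        return d->bark_count;  // safe: the run-time check succeeded
    }
    return -1;                 // not a Dog: the cast returned nullptr
}
```

Called with a Dog it returns 3, and called with a Cat it returns -1 instead of reading memory that does not hold a bark_count.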
If you use virtual functions, your code calling the function doesn't need to know about the actual class of the object. You'd just call the function blindly and correct function would be executed. This is the basis of polymorphism.
Type-casting is always risky and can cause run-time errors in large programs.
Your code should be open for extension but closed for modifications.
Hope this helps.
You need virtual functions where you don't know the derived type until run-time (e.g. when it depends on user input).
In your example, you have hard-coded casts to derivedclass2 and derivedclass1. Now what would you do here?
void f(baseclass * bptr)
{
// call the right show() function
}
Perhaps your confusion stems from the fact that you've not yet encountered a situation where virtual functions were actually useful. When you always know exactly at compile-time the concrete type you are operating on, then you don't need virtual functions at all.
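One way to fill in f() is to make show() virtual, so that f() needs no knowledge of the derived classes. This is a sketch based on the classes from the question; returning a string instead of printing, and the added virtual destructor, are my own adjustments for illustration.

```cpp
#include <string>

struct baseclass {
    virtual ~baseclass() = default;  // safe deletion through a base pointer
    virtual std::string show() const { return "In Base"; }
};

struct derivedclass1 : baseclass {
    std::string show() const override { return "In Derived 1"; }
};

struct derivedclass2 : baseclass {
    std::string show() const override { return "In Derived 2"; }
};

// No cast and no knowledge of the concrete type: the right show()
// is selected at run time from the object's dynamic type.
std::string f(const baseclass* bptr) {
    return bptr->show();
}
```

A class derived from baseclass that is written later works with f() unchanged, which is something hard-coded casts cannot offer.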
Two other problems in your example code:
Use of C-style cast instead of C++-style dynamic_cast (of course, you usually don't need to cast anyway when you use virtual functions for the problem they are designed to solve).
Treating arrays polymorphically. See Item 3 in Scott Meyers' More Effective C++ book ("Never treat arrays polymorphically").
(C++,MinGW 4.4.0,Windows OS)
All that is commented in the code, except labels <1> and <2>, is my guess. Please correct me in case you think I'm wrong somewhere:
class A {
public:
virtual void disp(); //not necessary to define as placeholder in vtable entry will be
//overwritten when derived class's vtable entry is prepared after
//invoking Base ctor (unless we do new A instead of new B in main() below)
};
class B :public A {
public:
B() : x(100) {}
void disp() {std::printf("%d",x);}
int x;
};
int main() {
A* aptr=new B; //memory model and vtable of B (say vtbl_B) is assigned to aptr
aptr->disp(); //<1> no error
std::printf("%d",aptr->x); //<2> error -> A knows nothing about x
}
<2> is an error, and obviously so. Why is <1> not an error? What I think is happening for this invocation is:
aptr->disp(); --> (*aptr->*(vtbl_B + offset to disp))(aptr)
with aptr in the parameter list being the implicit this pointer to the member function. Inside disp() we would have:
std::printf("%d",x); --> std::printf("%d",aptr->x); SAME AS std::printf("%d",this->x);
So why does <1> give no error while <2> does?
(I know vtables are implementation specific and stuff but I still think it's worth asking the question)
this is not the same as aptr inside B::disp. The B::disp implementation takes this as B*, just like any other method of B. When you invoke virtual method via A* pointer, it is converted to B* first (which may even change its value so it is not necessarily equal to aptr during the call).
I.e. what really happens is something like
typedef void (A::*disp_fn_t)();
disp_fn_t methodPtr = aptr->vtable[index_of_disp]; // methodPtr == &B::disp
B* b = static_cast<B*>(aptr);
(b->*methodPtr)(); // same as b->disp()
For more complicated example, check this post http://blogs.msdn.com/b/oldnewthing/archive/2004/02/06/68695.aspx. Here, if there are multiple A bases which may invoke the same B::disp, MSVC generates different entry points with each one shifting A* pointer by different offset. This is implementation-specific, of course; other compilers may choose to store the offset somewhere in vtable for example.
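The remark that converting between base and derived pointers "may even change its value" can be observed with multiple inheritance. This is a sketch with invented types (Left, Right, Both); the exact layout is implementation-specific, but two distinct non-empty base subobjects cannot share an address, so at least one of the two conversions below has to adjust the pointer.

```cpp
struct Left {
    virtual ~Left() = default;
    int l = 1;
};

struct Right {
    virtual ~Right() = default;
    virtual int id() const { return 0; }
    int r = 2;
};

struct Both : Left, Right {
    int id() const override { return x; }  // `this` is a Both* in here
    int x = 100;
};

// The two base subobjects live at different addresses inside a Both.
bool bases_at_different_addresses(Both& b) {
    Left* l = &b;   // implicit derived-to-base conversion
    Right* r = &b;  // implicit derived-to-base conversion
    return static_cast<void*>(l) != static_cast<void*>(r);
}

// The call goes through a Right*, yet Both::id() can still read x,
// because the implementation adjusts the pointer back to the Both object.
int call_via_right(Right* r) { return r->id(); }
```

call_via_right(&b) on a Both object returns 100, even though the Right* it receives does not necessarily hold the same address as &b.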
The rule is:
In C++, dynamic dispatch only works for member functions, not for member variables.
For a member variable, the compiler only looks up the symbol name in that particular class or its base classes.
In case 1, the appropriate method to be called is decided by fetching the vptr, fetching the address of the appropriate method from the vtable, and then calling that member function.
Thus dynamic dispatch is essentially a fetch-fetch-call instead of a normal call in case of static binding.
In case 2, the compiler only looks for x in the scope of the static type of aptr, which is A. Obviously, it cannot find it and reports the error.
You are confused, and it seems to me that you come from more dynamic languages.
In C++, compilation and runtime are clearly isolated. A program must first be compiled and then can be run (and any of those steps may fail).
So, going backward:
<2> fails at compilation, because compilation is about static information. aptr is of type A*, thus only the methods and attributes of A are accessible through this pointer. Since A declares disp() but no x, the call to disp() compiles but the access to x does not.
Therefore, <2>'s failure is about semantics, and those are defined in the C++ Standard.
Getting to <1>, it works because there is a declaration of disp() in A. This guarantees the existence of the function (I would remark that you actually lie here, because you did not define it in A).
What happens at runtime is semantically defined by the C++ Standard, but the Standard provides no implementation guidance. Most (if not all) C++ compilers will use a virtual table per class + virtual pointer per instance strategy, and your description looks correct in this case.
However this is pure runtime implementation, and the fact that it runs does not retroactively impact the fact that the program compiled.
virtual void disp(); //not necessary to define as placeholder in vtable entry will be
//overwritten when derived class's vtable entry is prepared after
//invoking Base ctor (unless we do new A instead of new B in main() below)
Your comment is not strictly correct. A virtual function is odr-used unless it is pure (the converse does not necessarily hold) which means that you must provide a definition for it. If you don't want to provide a definition for it you must make it a pure virtual function.
If you make one of these modifications then aptr->disp(); works and calls the derived class disp() because disp() in the derived class overrides the base class function. The base class function still has to exist as you are calling it through a pointer to base. x is not a member of the base class so aptr->x is not a valid expression.
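A sketch of the pure-virtual variant suggested above. The class names follow the question, but the function returns the value instead of printing it, and through_base is a hypothetical helper added for illustration.

```cpp
struct A {
    virtual ~A() = default;
    virtual int disp() const = 0;  // pure virtual: no definition required in A
};

struct B : A {
    int x = 100;
    // Inside this function `this` has type const B*, so x is accessible.
    int disp() const override { return x; }
};

int through_base(const A* aptr) {
    return aptr->disp();  // compiles: disp() is declared in A
    // return aptr->x;    // would not compile: A has no member named x
}
```

The name x is looked up in A at compile time and is not found, while disp() is found in A at compile time and then dispatched to B::disp() at run time.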
I've searched for questions, looking at forums, books, etc. I can recognize a polymorphic behavior of methods, and there are lots of simple examples when an invoked method is decided in compile or runtime. But I was confused by this code, where a class C inherits from B that inherits from A:
class A {
protected:
int x;
public:
virtual void change() = 0;
virtual void change(int a) { x = a; }
};
class B : public A {
public:
void change() { x = 1; }
};
class C : public B {
public:
void change() { x = 2; }
void change(int a) { x = a*2; }
};
int main () {
B *objb = new B();
C *objc = new C();
A *obja;
objb->change();
obja = objc;
objc->change();
obja->change(5);
// ...
}
Many examples show (and it is clear) that polymorphic behavior occurs and that the method to call is decided at runtime when the following line is executed:
obja->change(5);
But my questions are:
What happens when I call the following (overridden from a pure virtual)?
objb->change();
What happens when I call the following (overridden from a virtual, but non-pure)?
objc->change(5);
Since the declared class of each pointer variable is the same as the class of the object it points to, is the method call decided at compile time or at runtime?
If the compiler can deduce the actual type at compile time, it can avoid the virtual function dispatch. But it can only do this when it can prove that the behavior is equivalent to a run-time dispatch. Whether this happens depends on how smart your particular compiler is.
The real question is, why do you care? You obviously understand the rules for calling virtual functions, and the semantics of the behavior are always those of a run-time dispatch, so it should make no difference to how you write your code.
There are three issues to consider. The first is overload resolution:
in this case, the compiler uses the static type of the expression to
construct the set of functions it chooses from. Thus, if you had
written:
objb->change( 2 );
the code wouldn't have compiled, because there is no change which
takes an int in the scope of B. Had there been no change at all
in the scope of B, the compiler would have looked further, and found
the change (all of them) in A, but once it finds the name, it stops.
This is name lookup and function overload resolution, and it is entirely
static.
The second issue is which function should be called, once the compiler
has chosen to call a specific function in the interface. If the chosen
function is virtual, the actual function called will be the function
with the exact same signature in the most derived class of the dynamic
type—that is, the type of the actual object in question.
Finally, there is the question of whether dynamic dispatch is used in
the generated code. And that's entirely up to the compiler. The
compiler can do anything it wants, as long as the correct function, as
determined by the two preceding issues, is called. Generally: if the
function isn't virtual, dynamic dispatch will never be used; and if the
access is directly to the object (named object or temporary), dynamic
dispatch will generally not be used, since the compiler can trivially
know the most derived type. When the call is through a reference or a
pointer, the compiler will generally use dynamic dispatch, but it is
sometimes possible for the compiler to track the pointer enough to know
the type it will point to at runtime, and forego dynamic dispatch. And
good compilers will often go further, using profiler information, to
determine that 99% of the time, the same function will be called, and
the call is in a tight loop, and will generate two versions of the loop,
one with dynamic dispatch, and one with the most frequently called
function inlined, and select which version of the loop via an if, at
runtime.
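The name-hiding behavior from the first issue above can be sketched as follows. HidingB and UsingB are hypothetical variants of the question's class B, and change returns a value instead of storing it so the result can be checked; the using-declaration is one common way to bring the hidden overloads back into scope.

```cpp
struct A {
    virtual ~A() = default;
    virtual int change() = 0;
    virtual int change(int a) { return a; }
};

struct HidingB : A {
    // Declaring any change here hides every A::change overload
    // for lookups that start in HidingB.
    int change() override { return 1; }
};

struct UsingB : A {
    using A::change;                     // re-exposes the A::change overloads
    int change() override { return 1; }  // still overrides A::change()
};

// Lookup starts in A, where change(int) is visible.
int via_base(A& a) { return a.change(5); }

// Lookup starts in UsingB; change(int) is found via the using-declaration.
int via_using(UsingB& b) { return b.change(5); }

// HidingB h; h.change(5);  // would not compile: change(int) is hidden
```

Note that hiding only affects name lookup: the same HidingB object still answers change(5) when the call is made through an A reference.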
objb->change()
calls B::change() because objb contains address of an object of type B
objc->change(5);
calls C::change(int) because objc contains address of an object of type C
The calls are still dynamically dispatched in principle, because B::change() and C::change(int) are still virtual: the virtual attribute is inherited.
To answer the question of whether the functions are dispatched dynamically or at compile time:
No one can say definitively whether it will be a compile-time or a dynamic dispatch. Dynamic (run-time) dispatch takes place in the first place because the compiler cannot determine at compile time which version of the function to call. So if a compiler can deduce with certainty which function to call, the dispatch might very well be resolved at compile time.
Having said that, whether the dispatch happens at run time or at compile time does not change which version of the overridden function finally gets called, because the C++ standard explicitly states the rules for selecting the function in this regard.
My question is not about calling a virtual member function from a base class constructor, but whether the pointer to a virtual member function is valid in the base class constructor.
Given the following
class A
{
void (A::*m_pMember)();
public:
A() :
m_pMember(&A::vmember)
{
}
virtual void vmember()
{
printf("In A::vmember()\n");
}
void test()
{
(this->*m_pMember)();
}
};
class B : public A
{
public:
virtual void vmember()
{
printf("In B::vmember()\n");
}
};
int main()
{
B b;
b.test();
return 0;
}
Will this produce "In B::vmember()" for all compliant c++ compilers?
The pointer is valid, however you have to keep in mind that when a virtual function is invoked through a pointer it is always resolved in accordance with the dynamic type of the object used on the left-hand side. This means that when you invoke a virtual function from the constructor, it doesn't matter whether you invoke it directly or whether you invoke it through a pointer. In both cases the call will resolve to the type whose constructor is currently working. That's how virtual functions work, when you invoke them during object construction (or destruction).
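The behavior described above can be checked with a small sketch. Base and Derived are stand-ins for the question's A and B, and the results are stored in strings instead of being printed. It records what a direct virtual call and a call through a pointer to member resolve to inside the base constructor, and what the same pointer to member resolves to after construction.

```cpp
#include <string>

struct Base {
    std::string direct_in_ctor;
    std::string via_pointer_in_ctor;

    Base() {
        // While Base's constructor runs, the object's dynamic type is Base.
        direct_in_ctor = name();
        std::string (Base::*pm)() const = &Base::name;
        via_pointer_in_ctor = (this->*pm)();
    }
    virtual ~Base() = default;

    virtual std::string name() const { return "Base"; }

    // The same kind of pointer to member, used after construction finished.
    std::string name_via_pointer() const {
        std::string (Base::*pm)() const = &Base::name;
        return (this->*pm)();
    }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};
```

For a Derived object, both strings recorded in the constructor hold "Base", while name_via_pointer() called afterwards returns "Derived". That matches the question's setup, where the pointer is stored in the constructor but only invoked later through test().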
Note also that pointers to member functions are generally not attached to specific functions at the point of initialization. If the target function is non-virtual, then one can say that the pointer points to a specific function. However, if the target function is virtual, there's no way to say where the pointer is pointing. For example, the language specification explicitly states that when you compare (for equality) two pointers to member functions and either of them points to a virtual function, the result is unspecified.
"Valid" is a specific term when applied to pointers. Data pointers are valid when they point to an object or NULL; function pointers are valid when they point to a function or NULL; and pointers to members are valid when they point to a member or NULL.
However, from your question about actual output, I can infer that you wanted to ask something else. Let's look at your vmember function - or should I say functions? Obviously there are two function bodies. You could have made only the derived one virtual, so that too confirms that there are really two vmember functions, who both happen to be virtual.
Now, the question becomes whether taking the address of a member function already chooses the actual function. Your implementations show that it doesn't, and that this only happens when the pointer is actually dereferenced.
The reason it must work this way is trivial. Taking the address of a member function does not involve an actual object, something that would be needed to resolve the virtual call. Let me show you:
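void (A::*test)() = &A::vmember; // no object is involved in taking the address
int main() {
A a;
B b;
(a.*test)(); // prints "In A::vmember()"
(b.*test)(); // prints "In B::vmember()"
}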
When we initialize test, there is no object of type A or B at all, yet it is possible to take the address &A::vmember. That same member pointer can then be used with two different objects. What could this produce but "In A::vmember()\n" and "In B::vmember()\n"?
Read this article for an in-depth discussion of member function pointers and how to use them. This should answer all your questions.
I have found a little explanation on the Old New Thing (a blog by Raymond Chen, sometimes referred to as Microsoft's Chuck Norris).
Of course it says nothing about the compliance, but it explains why:
B b;
b.A::vmember(); // [1]
(b.*&A::vmember)(); // [2]
[1] and [2] actually invoke different functions... which is quite surprising, really. It also means that you can't actually prevent the runtime dispatch using a pointer to member function :/
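The difference between the two call forms can be sketched like this. A and B mirror the question's classes, but vmember returns a string instead of printing, and qualified_call and pointer_call are hypothetical helpers.

```cpp
#include <string>

struct A {
    virtual ~A() = default;
    virtual std::string vmember() { return "In A::vmember()"; }
};

struct B : A {
    std::string vmember() override { return "In B::vmember()"; }
};

// A qualified name suppresses virtual dispatch: A's version runs.
std::string qualified_call(B& b) { return b.A::vmember(); }

// A pointer to a virtual member function is dispatched on the
// dynamic type of the object it is applied to: B's version runs.
std::string pointer_call(B& b) { return (b.*&A::vmember)(); }
```

So for a B object the first helper returns "In A::vmember()" and the second returns "In B::vmember()", even though both mention A::vmember in the source.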
I think no. Pointer to virtual member function is resolved via VMT, so the same way as call to this function would happen. It means that it is not valid, since VMT is populated after constructor finished.
IMO it is implementation defined to take address of a virtual function. This is because virtual functions are implemented using vtables which are compiler implementation specific. Since the vtable is not guaranteed to be complete until the execution of the class ctor is done, a pointer to an entry in such a table (virtual function) may be implementation defined behavior.
There is a somewhat related question that I asked on SO here a few months back, which basically says that taking the address of a virtual function is not specified in the C++ standard.
So, in any case even if it works for you, the solution will not be portable.