C++: what does a pointer really mean when virtual is involved?

For the following code:
#include <iostream>
class A
{
public:
~A()
{
std::cout << "a" << std::endl;
}
};
class B : public A {
public:
virtual ~B()
{
std::cout << "b" << std::endl;
}
};
int main()
{
B* b = new B();
A* a = b;
if (a == b)
{
}
delete a;
}
The question is: will a equal b? Why, and how does this happen?
And what does a pointer really mean? Is it not just an address and the length of a memory block?

Will a equal b?
Yes.
Why, and how does this happen?
To perform the comparison of the two pointers the compiler will perform a conversion to a common type. In this case, as A is a base of B, the conversion is to A*, yielding code equivalent to:
A* __tmp = b;
if ( a == __tmp ) ...
And what does a pointer really mean? Is it not just an address and the length of a memory block?
A pointer is a variable that holds the address of an object (no size information is stored in the pointer). But the pointer has a type, and the compiler will interpret the memory location that the pointer refers to as an object of that type. That type information, which is kept outside of the pointer, is what allows the compiler to perform the conversion.
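A minimal sketch of the comparison described above. The struct names mirror the question; the helper compares_equal is a hypothetical name introduced only for illustration.

```cpp
#include <cassert>

struct A { int x = 0; };                    // no virtual functions
struct B : A { virtual ~B() = default; };   // polymorphic derived class

// Returns the result of comparing a derived pointer with a base pointer
// that was obtained from it.
bool compares_equal(B* b) {
    assert(b != nullptr);
    A* a = b;           // implicit derived-to-base conversion
    A* converted = b;   // what the compiler does to b before evaluating a == b
    return (a == b) && (a == converted);
}
```

Calling compares_equal with the address of any B object should return true, whatever the layout the compiler picked for B.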

The a variable will point at the A class portion of the allocated b object. A pointer is just a memory address, nothing more. What is important is what kind of data the pointer is pointing at in memory.

Let's split the answer into three parts:
Pointers
A pointer is a variable which holds the memory address of another variable. The type of the pointed-to variable is important and is checked by the compiler. One can bypass this type checking by using pointers to void (which should be avoided in C++, unless you really know what you are doing).
Accessing derived class objects using base class pointers
An object, i.e. a variable, of a derived class (b in your code) can be referred to using a pointer to its parent type (A in your code). This will allow you to access the members of A that are present in B. As already stated by Richard J. Ross, using a pointer to the base class will yield the same address as using the derived class pointer (unless multiple inheritance is involved).
Virtual functions
Virtual methods allow you to call an overridden method of the derived object using a base class pointer. This is especially useful for destructors because one can rest assured that objects will be properly destroyed even when using a base class pointer (assuming the destructor is well written).
Still, your code is conceptually wrong. Because the destructor of A is not virtual, deleting the B object through an A* is undefined behavior. In practice the destructor of the B part of b will typically not be called, so there might be memory leaks and similar problems, or worse.
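A sketch of the corrected hierarchy, assuming the only change is making the destructor of A virtual. The global g_log and the function destroy_through_base are hypothetical names; the log replaces std::cout so the order can be inspected.

```cpp
#include <cassert>
#include <string>

// Destruction order is recorded here instead of being printed.
std::string g_log;

class A {
public:
    virtual ~A() { g_log += 'a'; }   // virtual: delete through A* is well defined
};

class B : public A {
public:
    ~B() override { g_log += 'b'; }
};

// Deletes a B through an A* and returns the order in which destructors ran.
std::string destroy_through_base() {
    g_log.clear();
    A* a = new B();
    delete a;        // runs ~B first, then ~A
    return g_log;
}
```

With the virtual destructor in place, the derived destructor runs before the base destructor.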

As David already mentioned, a simple comparison of the two pointers, a == b, will yield true because the compiler converts both of them to a common type.
But if you were to change this to (void*)a == (void*)b, the result may be false.
This is because classes A and B have different memory layouts: B has a virtual function table pointer and A does not.
The MSVC compiler puts the virtual function table pointer at the "top" of the object, before the first data member, but nothing stops other compilers from placing it at the "bottom".
You may also try making the destructor of class A virtual.
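To see the adjustment on your own compiler, something like the following sketch can help. The helper base_offset is a hypothetical name, and the value it returns is implementation-specific, so no particular number is claimed here.

```cpp
#include <cassert>
#include <cstddef>

struct A { int x = 0; };                               // no vptr
struct B : A { virtual ~B() = default; int y = 0; };   // has a vptr

// Byte offset of the A subobject inside a B object.
// Zero means (void*)a == (void*)b would hold on this compiler.
std::ptrdiff_t base_offset(B* b) {
    assert(b != nullptr);
    A* a = b;   // the conversion may adjust the address
    return reinterpret_cast<const char*>(a) -
           reinterpret_cast<const char*>(b);
}
```

A non-zero result means the raw addresses differ even though a == b is still true after the implicit conversion.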

I found an explanation in Inside the C++ Object Model: this is just like multiple inheritance, with the vptr located at the beginning of the object.
When the base class has no virtual functions and the derived class has one, the compiler adjusts the pointer on assignment by stepping over the vptr.

The example is a comparison between two pointers. As they are pointing to the same location, a will be equal to b.

Related

I thought pointers could only point to the same class/datatype?

I thought I knew pointers, but as I study run-time polymorphism and dynamic binding I've seen a very different use for them. Here are my 3 questions; they all concern the single line of code marked below:
#include <iostream>
class Base {
public:
virtual void fun (int x) { std::cout << "base" << std::endl; }
};
class Derived : public Base {
public:
void fun (int x) { std::cout << "derived" << std::endl; }
};
int main() {
//--------------------------------------
//1. Pointer can only hold memory address, 'Derived()' is calling the default constructor it is not a memory address
//2. What's the point of putting this into heap?
//3. I thought pointers could only hold memory address of the same datatype/class
Base* obj = new Derived();
//--------------------------------------
obj->fun(5);
return 0;
}
Pointer can only hold memory address, 'Derived()' is calling the default constructor it is not a memory address
The expression here is not Derived() (which would indeed construct a temporary object), but new Derived(), which is new's specific syntax to allocate and construct an object with dynamic lifetime and return a pointer to it. Note that the corresponding delete to end the object's lifetime is missing, but see the next point.
What's the point of putting this into heap?
None. In particular, the lifetime of an object does not affect the use of pointers with it, nor are pointers required for polymorphism (references work just as well). In any case, dynamic allocation should be done with smart pointers, not naked news.
I thought pointers could only hold memory address of the same datatype/class
That's still true: obj is pointing at the Base subobject of the Derived object. C++ provides an implicit pointer-adjusting conversion from Derived * to Base * (and Derived & to Base & as well) to facilitate the use of polymorphism.
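A short sketch of both points, assuming a Base with a virtual destructor and a virtual name() function (both are illustrative additions, not part of the question's code). The helper names are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>

class Base {
public:
    virtual ~Base() = default;
    virtual std::string name() const { return "base"; }
};

class Derived : public Base {
public:
    std::string name() const override { return "derived"; }
};

// Polymorphism through a reference: no pointer and no heap involved.
std::string name_via_reference(const Base& obj) {
    return obj.name();
}

// Polymorphism through an owning smart pointer: no manual delete needed.
std::string name_via_unique_ptr() {
    std::unique_ptr<Base> obj(new Derived());
    return obj->name();   // the Derived object is destroyed when obj goes out of scope
}
```

Both functions dispatch to Derived::name, and neither leaves a delete for the caller to remember.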
Pointer can only hold memory address, 'Derived()' is calling the default constructor it is not a memory address
A new expression does multiple things. First, it allocates enough memory for the type of object. Then, it calls the constructor for that type at the allocated memory. Then, it returns a pointer to the newly created object. Pointers point to objects. new returns a pointer to an object. So there is no problem here, obj will now point to the newly created Derived object.
What's the point of putting this into heap?
It isn't strictly necessary to dynamically create an object here. You could just as easily have the following and illustrate polymorphism :
Derived foo;
Base* obj = &foo;
Maybe the author of this example didn't think of that, or maybe they wanted to illustrate the mechanics of new at the same time. You'd have to ask them.
I thought pointers could only hold memory address of the same datatype/class
It is true that obj is a Base* so it can only point to a Base type object. But a Derived object is also a Base object, so obj can easily point to a Derived object. That is what public inheritance achieves. The : public Base part of class Derived : public Base means Derived is also a Base.
Pointers can point to anything with a memory address.
A pointer to an instance of a derived class can be handled as if it were a pointer to the base class, even if the base class itself is abstract (unable to be instantiated itself). That's how polymorphism works in C++.
Why put a pointer to an object on the stack and the object itself into the heap? Many reasons, some of them good and others not so good.

Why can't I call a virtual function on an object cast from a byte (char) array?

Having class A, for example,
class A{
public:
int x;
void Update(){
cout << " from a\n";
}
};
To get an object of A without calling the constructor, I do this:
A* a = (A*) new char[sizeof(A)];
a->x = 9;
cout << a->x; a->Update(); //"9 from a" is outputted
Now, in case Update is a virtual function:
class A{
public:
int x;
virtual void Update(){
cout << " from a\n";
}
};
It throws an "Access violation" exception. Why?
When a class declares one or more virtual functions, the compiler implicitly adds a V-Table pointer to the class declaration, and thereby, to each object.
For every virtual-function-call, the compiler generates code which reads the function's address from the object's V-Table pointer, and then jumps to that address.
The V-Table pointer of an object is initialized only during runtime, when the constructor of the class is called and the object is created.
So when you create an object without explicitly calling the object's class-constructor, the V-Table pointer of that object is not initialized.
Then, when you call a virtual function of that object, the CPU attempts to read the function's address from the object's V-Table pointer, and a memory access violation occurs.
Calling a virtual function involves dereferencing the virtual function table pointer, stored within the object. It is dereferencing the place in memory where this value is supposed to be, but of course, it's uninitialized garbage, hence the error.
From the context of the question, it looks to me like you might be interested in Placement new. Placement new will allow you to re-use heap memory that has already been allocated, but will properly initialize the vtable pointer.
If you don't call the constructor of the object, then the virtual method table pointer is not initialized. Therefore you can't expect a virtual function call to work. What are you trying to do in the first place?
Casting from an array of bytes into an object is only valid for POD types, that is, types that satisfy all of the following:
No user-defined destructor, constructor or copy assignment operator.
Only public member variables.
No base classes.
No virtual functions.
No member variables of non-POD type, recursively.
That is, your first example is a POD, but your second one is not.
So your cast is valid for the first but not for the second. In the second case you will need to call placement new to construct a proper A object:
char *bytes = new char[sizeof(A)];
A *a= new (bytes) A;
And then, it is your responsibility to call the destructor when you are finished with it:
a->~A();
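Put together, the placement-new approach might look like the sketch below. It assumes a variant of A whose Update returns a string instead of printing, and it uses a stack buffer with alignas rather than new char[]; construct_in_buffer is a hypothetical helper.

```cpp
#include <cassert>
#include <new>
#include <string>

class A {
public:
    int x = 0;
    virtual ~A() = default;
    virtual std::string Update() { return "from a"; }
};

// Constructs an A inside raw storage, uses it, then destroys it by hand.
std::string construct_in_buffer() {
    alignas(A) unsigned char storage[sizeof(A)];   // raw, correctly aligned bytes
    A* a = new (storage) A;    // placement new runs the constructor, which sets up the vptr
    a->x = 9;
    std::string result = a->Update();   // the virtual call is now safe
    a->~A();                   // explicit destructor call; the storage itself is not freed
    return result;
}
```

The buffer is only storage; the object exists between the placement new and the explicit destructor call.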
What are you expecting? Writing silly code results in silly results.
Anyway, objects with virtual functions need a virtual table; see the link below. This table is not being initialised in the code supplied.
See http://www.learncpp.com/cpp-tutorial/125-the-virtual-table/

Virtual destructor and memory deallocation

I'm not quite sure I understand virtual destructors and the concept of allocating space on the heap right. Let's look at the following example:
class Base
{
public:
int a;
};
class Derived : public Base
{
public:
int b;
};
I imagine that if I do something like this
Base *o = new Derived;
that 8 Bytes (or whatever two integers need on the system) are allocated on the heap, which looks then something like this:
... | a | b | ...
Now if I do this:
delete o;
How does 'delete' know, which type o is in reality in order to remove everything from the heap? I'd imagine that it has to assume that it is of type Base and therefore only deletes a from the heap (since it can't be sure whether b belongs to the object o):
... | b | ...
b would then remain on the heap and be inaccessible.
Does the following:
Base *o = new Derived;
delete o;
truly provoke memory leaks and do I need a virtual destructor here? Or does delete know that o is actually of the Derived class, not of the Base class? And if so, how does that work?
Thanks guys. :)
You're making a lot of assumptions about the implementation, which may
or may not hold. In a delete expression, the dynamic type must be the
same as the static type, unless the static type has a virtual
destructor. Otherwise, it is undefined behavior. Period. That's
really all you have to know. I've worked with implementations where
it would crash otherwise, at least in certain cases; and I've used
implementations where doing this would corrupt the free space arena, so
that the code would crash sometime later, in a totally unrelated piece
of code. (For the record, VC++ and g++ both fall in the second case, at
least when compiled with the usual options for released code.)
Firstly, the classes you declared in your example have trivial internal structure. From a purely practical point of view, in order to destroy objects of such classes properly the run-time code does not need to know the actual type of the object being deleted. All it needs to know is the proper size of the memory block to be deallocated. This is actually something that is already achieved by C-style library functions like malloc and free. As you probably know, free implicitly "knows" how much memory to deallocate. Your example above does not involve anything in addition to that. In other words, your example above is not elaborate enough to truly illustrate anything C++-specific.
However, formally the behavior of your examples is undefined, since virtual destructor is formally required by C++ language for polymorphic deletion regardless of how trivial the internal structure of the class is. So, your "how delete knows..." question simply does not apply. Your code is broken. It does not work.
Secondly, the actual tangible C++-specific effects begin to appear when you begin to require non-trivial destruction for your classes: either by defining an explicit body for the destructor or by adding non-trivial member subobjects to your class. For example, if you add a std::vector member to your derived class, the destructor of the derived class will become responsible for (implicit) destruction of that subobject. And in order for that to work, you will have to declare your destructors virtual. A proper virtual destructor is called through the same mechanism as any other virtual function is called. That's basically the answer to your question: the run-time code does not care about the actual type of the object simply because the ordinary virtual dispatch mechanism will ensure that the proper destructor is called (just like it works with any other virtual function).
Thirdly, another significant effect of virtual destruction appears when you define dedicated operator delete functions for your classes. The language specification requires that the proper operator delete function is selected as if it is looked up from inside the destructor of the class being deleted. And many implementations implement this requirement literally: they actually implicitly call operator delete from inside the class destructor. In order for that mechanism to work properly, the destructor has to be virtual.
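The operator delete point can be illustrated with a sketch like this one. The flag g_derived_delete_called and the function derived_delete_used are hypothetical names used only to make the effect observable.

```cpp
#include <cassert>
#include <new>

bool g_derived_delete_called = false;   // illustration-only flag

class Base {
public:
    virtual ~Base() = default;
};

class Derived : public Base {
public:
    // Class-specific deallocation function.
    static void operator delete(void* p) {
        g_derived_delete_called = true;
        ::operator delete(p);
    }
};

// Deletes a Derived through a Base* and reports whether
// Derived::operator delete was the one selected.
bool derived_delete_used() {
    g_derived_delete_called = false;
    Base* b = new Derived;   // allocated by the global operator new
    delete b;                // the virtual destructor selects Derived::operator delete
    return g_derived_delete_called;
}
```

Without the virtual destructor in Base, the same delete would be undefined behavior, so there would be nothing meaningful to observe.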
Fourthly, a part of your question seems to suggest that you believe that failing to define a virtual destructor will lead to "memory leaks". This is a popular, but completely incorrect and totally useless urban legend, perpetuated by low-quality sources. Performing polymorphic deletion on a class that has no virtual destructor leads to undefined behavior and to completely unpredictable devastating consequences, not to some "memory leaks". "Memory leaks" are not the issue in such cases.
There is no problem in the size of the object being deleted - it is known. The problem solved by virtual destructors can be demonstrated as follows:
class Base
{
public:
Base() { x = new char[1]; }
/*virtual*/ ~Base() { delete [] x; }
private:
char* x;
};
class Derived : public Base
{
public:
Derived() { y = new char[1]; }
~Derived() { delete [] y;}
private:
char* y;
};
Then having:
Derived* d = new Derived();
Base* b = new Derived();
delete d; // OK
delete b; // will only call Base::~Base, and not Derived::~Derived
The second delete will not finalize the object properly. If the virtual keyword were uncommented, the second delete statement would behave as expected and call Derived::~Derived followed by Base::~Base.
As pointed out in the comments, strictly speaking the second delete yields undefined behavior, but it's used here only for the sake of making the point about virtual destructors.

Downcast problem in C++

#include <iostream>
using std::cout;
using std::endl;
class Base
{
public :
void f();
void g();
int mBaseData1;
};
class Derived : public Base
{
public :
int mDerivedData1;
};
int main()
{
Base* base = new Base();
Derived* derived = (Derived*)(base); // DownCast
derived->mDerivedData1 = 6;
cout<< derived->mDerivedData1<<endl; // Result = 6;
}
In this code, new Base() allocates memory on the heap, and Derived* derived = (Derived*)(base) casts base to Derived*.
How can we use mDerivedData1? I can't find where memory is allocated for mDerivedData1, or where the constructor of Derived is called to set it up.
The reason you can't find where memory for mDerivedData1 was allocated is that no memory was allocated. You have performed an invalid type-cast. The thing stored in base is a pointer to a Base instance. Using a type-cast to tell the compiler that it's actually a pointer to a Derived instance doesn't make it so (but the compiler will believe you anyway because you're the one in charge). The object is still just a Base. If you want a Derived, then you'll need to instantiate a Derived. You can then use dynamic_cast to convert the Base pointer into a Derived pointer, provided Base is polymorphic (has at least one virtual function); with the Base shown in the question, which has none, static_cast is the cast that compiles.
Base* base = new Derived;
Derived* derived = dynamic_cast<Derived*>(base);
derived->mDerivedData1 = 6;
It will work correctly if you change:
Base* base = new Base();
to:
Base* base = new Derived();
but in general you should never downcast unless you are really sure you know what you are doing, and even then it's usually a sign of a very bad design.
You are really overwriting some part of the heap that is not part of the original base object. This could overwrite another object on the heap, or other unexplained happenings could occur.
C++ lets you do what you tell it to do for the most part. You are shooting yourself in the foot. :)
Your program exhibits undefined behavior. You cannot access mDerivedData1 because you don't actually have an instance of Derived.
You can cast an instance of a child class to a base class, but you cannot (well... should not) cast an instance of a base class into a child class.
Edit:
If you are confused about how casting works:
An object never actually changes during a cast -- in truth an object is really just a contiguous block of memory. When an object is cast, the only thing that changes is how the program sees and works with the object.
That's why casting an instance of a base object to a child object results in undefined behavior; the runtime interprets the base object as a child object and uses the pointer for the object as a starting point for referencing data of the object. If a field that is defined on the child class is used on a base object cast as the child object, the program will reference memory that is not part of the instance. If this referenced memory happens to be unused by the rest of the program, things might seem just fine (for a little while), but if the memory is used by another object, strange things could happen in your program -- the other object might have a value changed that it shouldn't have, or worse. And this is just when dealing with heap allocated objects; try this with a pointer to a stack allocated object and you could totally derail your entire program -- assuming you don't segfault.
So in general, if B derives from A:
You can cast an instance of B to A
You can cast an instance of B that is already cast as an A back to B, but this may indicate sloppy architecture of your program.
You cannot (should not) cast an instance of A to B, as this will result in undefined behavior.
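If the base class can be made polymorphic, a checked downcast avoids the undefined behavior. The sketch below assumes a virtual destructor is added to Base (it is not in the question's code); set_derived_data is a hypothetical helper.

```cpp
#include <cassert>

class Base {
public:
    virtual ~Base() = default;   // makes Base polymorphic, which dynamic_cast requires
    int mBaseData1 = 0;
};

class Derived : public Base {
public:
    int mDerivedData1 = 0;
};

// Writes to mDerivedData1 only if base really points at a Derived.
// Returns false and touches nothing otherwise.
bool set_derived_data(Base* base, int value) {
    Derived* derived = dynamic_cast<Derived*>(base);   // nullptr on a failed downcast
    if (derived == nullptr) {
        return false;
    }
    derived->mDerivedData1 = value;
    return true;
}
```

Passing the address of a plain Base makes the function return false instead of writing past the end of the object.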

When is "this" pointer initialized in C++?

Hi, I have a question about the this pointer: when an object is constructed, when is this initialized? In other words, when can I use it? The virtual table is constructed in the constructor; is it the same with the this pointer?
For example, I have code like this. The output is 8. Does it mean that the this pointer is already initialized before the constructor is entered?
class A{
public:
A() { cout<<sizeof(*this);}
int i;
int *p;
};
int main() {
A a;
}
If it is true, what else would happen before the constructor is entered ?
If it is not true, when is the this pointer initialized ?
The this pointer isn't a member of the object or class - it's an implicit parameter to the method that you call. As such it's passed in much like any other parameter - except that you don't directly ask for it.
In your example above, the constructor is a special method, which is in turn a special kind of function. When you construct the object, the compiler allocates memory for it (in this case on the stack, as a is a local variable in the main function). Then it automatically calls the constructor to initialise the object.
As part of calling the constructor, the implicit parameter this - a pointer to your object - is passed in as a parameter.
In a method with the following signature...
void MyMethod (const int* p) const;
there are actually two parameters, both pointers. There's the explicit parameter p and the implicit parameter this. The const at the end of the line specifies that this is a pointer to const, much as the earlier one specifies that p is a pointer to const. The need for that special syntax only exists because this is passed implicitly, so you can't specify const-ness in the normal way as with other parameters.
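The effect of the trailing const on the type of this can be checked directly. Widget and its two member functions are hypothetical names for this sketch.

```cpp
#include <cassert>
#include <type_traits>

class Widget {
public:
    // In a const member function, this has type const Widget*.
    bool this_is_pointer_to_const() const {
        return std::is_same<decltype(this), const Widget*>::value;
    }
    // In a non-const member function, this has type Widget*.
    bool this_is_pointer_to_mutable() {
        return std::is_same<decltype(this), Widget*>::value;
    }
};
```

Both functions return true, which is the type-level meaning of the const after the parameter list.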
A "static" method doesn't have the implicit "this" parameter, and cannot directly access the object members either - there may not be a particular object associated with the call. It is basically a standard function rather than a method, except with access to private members (providing it can find an object to access).
As Steve Fallows points out, sizeof(*this) is known at compile-time, because it is simply the size of the type A. (sizeof(this), by contrast, would be the size of a pointer, and all object pointers (*1) have the same sizeof value on a given platform.) The "8" you see is consistent with a 4-byte int plus a 4-byte pointer, which suggests a 32-bit build. this is usable at this point - it points to valid memory, and all the members have completed their constructor calls. However, the object isn't necessarily fully initialised - you are still in the constructor call after all.
EDIT
*1 - strictly, that may not be true - but the compiler knows what type of pointer it's dealing with here even though the value isn't known until runtime.
The this pointer is not stored. When the constructor is called for an object that occupies a specific memory location, that location is passed as a parameter to the constructor and other member functions.
If this were stored inside the object, how would you retrieve that pointer? Right, you would again need the this pointer :)
sizeof(*this) is known at compile time. So the cout statement reveals nothing about the initialization of this.
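A small sketch of why the cout line proves nothing: sizeof never evaluates its operand. The counter g_calls and the functions counted and size_without_calling are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

int g_calls = 0;
int counted() { ++g_calls; return 0; }   // would bump the counter if it ever ran

class A {
public:
    int i;
    int* p;
    // sizeof(*this) is just another spelling of sizeof(A).
    std::size_t size_via_this() const { return sizeof(*this); }
};

static_assert(sizeof(A) >= sizeof(int) + sizeof(int*),
              "A holds at least an int and a pointer");

// The operand of sizeof is not evaluated, so counted() is never called.
std::size_t size_without_calling() {
    return sizeof(counted());
}
```

After calling size_without_calling, g_calls is still 0, and size_via_this returns sizeof(A) for any A object.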
Given that the constructor can immediately begin accessing members of the object, clearly this is initialized before the constructor begins.
What else happens before the constructor? Well could be anything. I don't think the standard limits what a compiler could do. Maybe you should specify anything you're thinking might happen.
The virtual table is constructed in the constructor, is the same with this pointer?
The virtual table is NOT constructed in the constructor.
Typically, a single global v-table is shared by all instances of the same class, and each individual class has its own global v-table. The v-table is known at compile-time, and "constructed" at program load time.
The this pointer is "constructed" (I think "allocated" is a better term) at allocation time, that is, after the global new operator is called, and before the constructor is entered.
In cases where the object is stack-allocated instead of heap-allocated, global new is not called, but this is still available as a result of allocating stack-space, which is just before the constructor is entered.
The instance vptr is assigned after the object's memory is allocated, and just before the constructor is called.
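One observable consequence is that a virtual call made while the base constructor runs reaches the base version, since the object is not yet a Derived at that point. A minimal sketch, with hypothetical class and member names:

```cpp
#include <cassert>
#include <string>

class Base {
public:
    Base() { name_seen_in_ctor = name(); }   // virtual call during Base construction
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
    std::string name_seen_in_ctor;
};

class Derived : public Base {
public:
    std::string name() const override { return "Derived"; }
};
```

For a Derived object, name_seen_in_ctor holds "Base", while a call to name() after construction returns "Derived".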
Does it mean that before the constructor is entered, the this pointer is already initialized?
Yes, the value of the this pointer is known before the constructor is even called. This value is available via the this keyword inside constructors, constructor initialization lists, destructors and member methods. The this keyword behaves on the surface as a method variable (of pointer type) but is not one; it typically sits in a register (for example ecx under the MSVC thiscall convention on 32-bit x86) and you won't be able to compile code like &this.
What else would happen before the constructor is entered
At least as far as the this pointer is concerned, the first thing that happens (unless using placement new) is the allocation of memory ultimately pointed to by this, be it on the stack (like in your example) or on the heap (using new). At this point the this pointer is known. Then, either default constructors or explicitly specified constructors (via constructor initialization lists) are called on the base classes (if any) and on your class's non-POD member variables (if any). The class vtable pointer is also set before this point if your class contains virtual methods or a virtual destructor. Then, your class constructor body, if any, is invoked. (Constructors are called recursively, i.e. when a base class' constructor is called, the latter's base class constructors are called followed by non-POD member constructors, with the base class' vtable pointer being set, followed by the class' constructor body.)
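The order described above can be recorded with a sketch like the following. The log g_order, the types Member, Base and Derived, and the function construction_order are all hypothetical names.

```cpp
#include <cassert>
#include <string>

std::string g_order;   // records which constructor ran when

struct Member { Member() { g_order += "member "; } };
struct Base   { Base()   { g_order += "base "; } };

struct Derived : Base {
    Member m;
    Derived() { g_order += "body"; }   // runs last
};

// Constructs a Derived on the stack and returns the recorded order.
std::string construction_order() {
    g_order.clear();
    Derived d;
    (void)d;   // silence unused-variable warnings
    return g_order;
}
```

The base class constructor runs first, then the member constructors, then the body of the derived constructor.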
The this pointer is the first argument to every member function call of the class, including the constructor.
When a class method is called, the address of the object is pushed onto the stack last (assuming cdecl calling convention here). This is read back into a register to use as the this pointer.
Constructors are in fact called as if they were ordinary member functions.
You cannot have a virtual constructor because the constructor is responsible for setting the vtable member.
As nobugz already pointed out, your example doesn't really mean much -- sizeof yields its results based on the type of the object you pass to it. It does not evaluate its operand at run-time.
That said, yes, this is initialized before entry to the ctor. Basically, the compiler allocates space for the object (on the stack if the object has automatic storage duration, or using ::operator new if it has dynamic storage duration). Upon entry to the ctor, the ctors for base classes (if any) have already run to completion. When your ctor is called, this gives the address of the memory that was allocated for the object.
this starts pointing to the current object and all members and base classes have been initialized before you enter the constructor body.
Therefore, you can hand out the this pointer in the initialization list, but the receiver should do nothing other than store it, because the pointed-at instance may not be fully constructed at that time.
#include <iostream>
class B;
class A
{
B* b_ptr;
public:
A(B* b);
};
class B
{
A a;
int i;
public:
B(): a(this), i(10) {}
void foo() const { std::cout << "My value is " << i << '\n'; }
};
A::A(B* b):
b_ptr(b) //Ok to store
{
b_ptr->foo(); //not OK to use: accesses B::i before it is initialized
}
int main()
{
B b;
}