The C++ standard says "... the interpretation of the call of a virtual function depends on the type of the object for which it is called (the dynamic type)" (p. 252) and then
"if a pointer p whose static type is 'pointer to B' is pointing to an object of class D, derived from B, the dynamic type of *p is D" (p. 2). Here B is a base class and D a derived class.
This seems to suggest (to me) that if I say
D d;
B *p = new B();
*p = d;
then, if f() is virtual in B, p->f() should call D::f(), which is wrong. I guess I'm not clear about the meaning of "pointing to an object of class ...". I know, if I say p = &d, then D::f() is called, but I would like to know why the above is wrong.
D d;
B *p = new B();
*p = d;
That final line assigns d to *p. This means it will copy the D instance using the B assignment operator and the object will be sliced. The dynamic type of *p in that case is still B.
p = &d;
That assigns the pointer to the D object to p. No slicing will occur in this case, because you are just assigning the pointers, not the objects themselves. The dynamic type of *p in this case is D.
The crux of the answer is that in C++, the dynamic type of an object can never change. You're thinking about the assignment expression *p = d as somehow completely replacing the object at *p with the object d. But that's not what happens. In C++, an object can never truly replace another.
Since class types are involved, *p = d just calls the assignment operator of class B (that's the static type of *p) with an argument of type D. An object in C++ can only be acted upon, it cannot really be "replaced."
Sure, we talk about copying objects, assigning to them etc. But that's just for convenience, as most of the time, the very exact semantics doesn't matter and thinking about = as assigning one object to another is simple. But deep down, it's either a function call on the target object (for class types), or copying the value of some bits into the target object's space (for primitive types). The target object itself always remains.
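The difference between assigning through the pointer and reseating the pointer can be sketched as follows. This is a minimal illustration, not code from the question: the members b_val and d_val and the two helper functions are made up for the example.

```cpp
#include <cassert>
#include <string>

struct B {
    int b_val = 0;                       // hypothetical data member
    virtual ~B() = default;
    virtual std::string f() const { return "B::f"; }
};

struct D : B {
    int d_val = 7;                       // hypothetical data member
    std::string f() const override { return "D::f"; }
};

// Assigning through the pointer: B::operator= copies the B part of d into
// the existing B object. That object is still a B afterwards.
std::string call_after_object_assignment() {
    D d;
    d.b_val = 42;
    B* p = new B();
    *p = d;                              // slicing: only b_val is copied
    assert(p->b_val == 42);
    std::string result = p->f();         // dynamic type of *p is B
    delete p;
    return result;
}

// Reseating the pointer: p now points to the B subobject of d,
// so the dynamic type of *p is D.
std::string call_after_pointer_assignment() {
    D d;
    B* p = &d;
    return p->f();
}
```

Calling the first helper yields "B::f" and the second yields "D::f", which matches the explanation above: the object created by new B() never stops being a B.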
As TartanLlama said above, in the third line you're "wrongfully" assigning d to the object pointed to by p, so what we refer to as slicing occurs. From Professional C++,
...When upcasting, use a pointer or reference to the superclass to avoid slicing
D d;
B *p = new B();
*p = d;
This means the correct code assigns the address of d to the pointer, as in the code below
D d;
B *p = &d;
OR
D d;
B *p = static_cast<B*>(&d);
Consequently, the dynamic type of the object pointed to by p (*p) is D. This allows you to move up and down the inheritance hierarchy by upcasting or downcasting the pointer, not the object.
Suppose B is a base class of D (maybe virtual, maybe multiple inheritance, need not be a direct base class).
Let obj be an object of type D (not of a subclass of D -- exactly D).
Let
D * d = std::addressof(obj);
B * b = d;
Can we safely assume that
(char*) d <= (char*) b && (char*) b < (char*) d + sizeof(D)
?
Background: This is to become a step in a routine determining whether some object has been created by placement new in a particular aligned_storage. I need to be sure that, if yes, all pointers to base objects of this object point to some address within the aligned_storage.
I am pretty sure that your assumption is safe given D is the final type of the object. Otherwise it would be treacherous to use placement new in the first place.
#include <stdlib.h>
#include <new>
struct B { int i; };
struct D : virtual B { int j; };
int
main()
{
auto const storage = malloc(sizeof(D));
D* d = new (storage) D();
free(storage);
return 0;
}
If B were located before d then the placement new would need to return a pointer adjusted based on the layout of D but "the standard allocation function void* operator new(std::size_t, void*) ... simply returns its second argument unchanged." (http://en.cppreference.com/w/cpp/language/new) Likewise, the storage of B cannot be situated such that it extends beyond (char*)d + sizeof(D) because it would overrun the memory allocated.
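A check along these lines could be sketched as below. It is only an observation on one implementation, not a proof of the guarantee asked about. The helper names lies_within and base_subobject_is_inside_storage are made up for this example, and the comparison goes through std::uintptr_t, which is an optional type in the standard (assumed to be available here).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

struct B { int i = 0; };
struct D : virtual B { int j = 0; };

// True if the address of ptr lies in [begin, begin + size).
// Comparing integer values avoids relational comparison of
// pointers that might not point into the same object.
bool lies_within(const void* ptr, const void* begin, std::size_t size) {
    auto p = reinterpret_cast<std::uintptr_t>(ptr);
    auto b = reinterpret_cast<std::uintptr_t>(begin);
    return p >= b && p - b < size;
}

// Builds a D in a local buffer and checks where its virtual base ends up.
bool base_subobject_is_inside_storage() {
    alignas(D) unsigned char storage[sizeof(D)];
    D* d = new (storage) D();
    B* b = d;                            // adjusted to the virtual base subobject
    bool inside = lies_within(b, storage, sizeof(D));
    d->~D();                             // placement new requires manual destruction
    return inside;
}
```

Since the most derived object is exactly a D occupying sizeof(D) bytes, the B subobject is expected to fall inside the buffer.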
Thanks for sharing an interesting question. Perhaps since asking the question you have already found a more satisfactory answer. I would be interested in reading a more concrete proof why the assumption holds or does not.
I'm coming from java so please bear with me. I've read several other articles and can't seem to find an answer.
I've got a base class (Obj) header file shown below.
class Obj {
public:
Obj();
Obj(int);
int testInt;
virtual bool worked();
Obj & operator = (const Obj & other) {
if(this != &other) {
//other.testInt = this->testInt;
}
return *this; // must be returned on every path
}
};
Base class
Obj::Obj() {
}
Obj::Obj(int test) {
this->testInt = test;
}
bool Obj::worked() {
return false;
}
Here's the child class header
class Obj1 : public Obj {
public:
Obj1();
Obj1(int);
virtual bool worked();
};
Child class
#include "Obj1.h"
Obj1::Obj1() {
}
Obj1::Obj1(int a) {
this->testInt = a / 2;
}
bool Obj1::worked() {
return true;
}
Here's my main class
int main() {
Obj obj = Obj(99);
Obj1 obj1 = Obj1(45);
obj = obj1;
if(obj.worked())
cout << "good" << obj.testInt << endl;
else cout << "bad " << obj.testInt << endl;
if(obj1.worked()) {
cout << "1good " << obj1.testInt << endl;
} else
cout << "1bad " << obj1.testInt << endl;
return 0;
}
Here's the output when it's run
bad 99
1good 22
How do I get it so obj = obj1; (found in main above) makes it so that obj.worked() will return true (since that's how obj1's class defines it)? Essentially how do I get it to behave like it would in Java? I don't need a deep copy, I just want to toss out what obj used to reference and have it point to obj1 (I think that's how it works in Java).
Note: I'm not very familiar with Java.
There's a major difference between "variables" in C++ and Java:
class X { public: int m = 5; };
X a; // no `= X();` required
X b;
a = b;
a.m = 42;
print(b.m); // this line is pseudo-code
In Java, variables may point to different objects. In the example above, after the assignment, a and b point to the same object. Modifying this object through one will make the modification visible when accessing the object through the other, print(b.m) will print 42.
In C++, "variables" (actually: names) always refer to the same object. There are two objects, one named a and one named b, and the assignment doesn't change that. Per default/convention, assignment in C++ means (deep) copy. a = b will be interpreted by most people and in the case of built-in types as copy the contents of b to a (or, more formally, change a such that it will be equal to b afterwards, without altering b).
Now it should be clear that you cannot alter which override of worked will be called by using the assignment in C++: which override of a virtual function is called is selected based on the type of the object (dynamic type), and you cannot change which object a name (variable) refers to.
However, there are pointers in C++, so-called raw pointers and smart pointers. Pointers are objects themselves that point to other objects of one specific type. X* is a raw pointer that points to an object of type X even with polymorphism! Similarly, std::shared_ptr<X> is a smart pointer that points to an object of type X.
std::shared_ptr<X> pa = std::make_shared<X>();
std::shared_ptr<X> pb = std::make_shared<X>();
Every make_shared creates an object. So we have four objects in this example: pa, pb, and the two unnamed objects created via make_shared.
For pointers, there are several operators for dealing with the object pointed to. The most important one is the asterisk, which dereferences the pointer. *pa will give you the object pa points to. The pa-> operator is a shorthand for (*pa)., so you can use it to access members of the object pointed to.
The assignment of pointers does not copy the object pointed to. After the assignment pa = pb, both will point to the same object. For smart pointers, that implies cleaning up objects that are not referred to any more:
std::shared_ptr<X> pa = std::make_shared<X>();
std::shared_ptr<X> pb = std::make_shared<X>();
// 4 objects exist at this point
pa = pb;
// only 3 objects still exist, the one `pa` formerly pointed to was destroyed
Polymorphism in C++ now works with either references (not explained here) or pointers. I said earlier that pointers can only point to one specific type of object. The crux is that this object might be part of a bigger object, e.g. via composition. But inheritance in C++ is very similar to composition: all the members of a base class become part of the base class subobject of a derived class' object:
std::shared_ptr<Obj1> pobj1 = std::make_shared<Obj1>();
std::shared_ptr<Obj> pobj = pobj1;
Here, pobj points to the Obj base class subobject within the object *pobj1 (i.e. within the object pobj1 points to).
Polymorphism now works via virtual functions. Those have a special rule for which function is actually called. The expression *pobj gives us the object which pobj points to, and it is of type Obj. But in this example, it is only a base class subobject, i.e. the object we originally created is of a type derived from Obj. For these cases, we differentiate between the static and the dynamic type of an expression:
The static type of *pobj is always Obj - generally, for an object p, whose type is pointer to some_type, the static type of *p is just some_type, removing one level of indirection / one pointer to.
The dynamic type of *pobj depends on which object pobj currently points to, and therefore generally is not known at compile-time. If the object is a base class subobject, we use the derived class object which it is part of, and recurse until the object we have is not a base class subobject any more. The type of the object we end up with is the dynamic type of the expression. In the example above, pobj points to the Obj base class subobject of *pobj1. The object *pobj1 itself is not a base class subobject here, therefore the dynamic type of *pobj is Obj1.
This dynamic type is now used to select which virtual function override is called. In the case pobj->worked(), where the dynamic type of *pobj is Obj1, the override selected is Obj1::worked, which will return true.
N.B. As Ben Voigt pointed out, the dynamic type does not depend on composition. It is only about inheritance.
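Applied to the classes from the question, the behaviour described above might look like this. It is a minimal sketch: Obj and Obj1 are reduced to the virtual function, and the helper worked_after_reseating is invented for the example.

```cpp
#include <cassert>
#include <memory>

// Reduced, hypothetical versions of the question's classes.
struct Obj {
    virtual ~Obj() = default;
    virtual bool worked() const { return false; }
};

struct Obj1 : Obj {
    bool worked() const override { return true; }
};

bool worked_after_reseating() {
    std::shared_ptr<Obj> pobj = std::make_shared<Obj>();
    std::shared_ptr<Obj1> pobj1 = std::make_shared<Obj1>();
    assert(!pobj->worked());             // dynamic type of *pobj is Obj

    // Pointer assignment: the old Obj is destroyed, and pobj now points
    // to the Obj base class subobject of *pobj1.
    pobj = pobj1;
    assert(pobj1.use_count() == 2);      // both pointers share one object

    return pobj->worked();               // dynamic type of *pobj is Obj1
}
```

This is the closest analogue to the Java behaviour the question asks for: the name pobj keeps its static type, but what it points to changes.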
In C++, your objects are values, not references as in Java. The assignment (obj = obj1) copies only the Obj part of obj1 into obj (slicing); obj itself stays an Obj. To get polymorphic behaviour in C++ you have to use a pointer or a reference.
Pointer
Obj* obj = new Obj(99);
Obj1* obj1 = new Obj1(45);
delete obj;// you have to free the memory manually as there's no GC in C++
obj = obj1;
obj->worked();// worked() will return true here
delete obj1; // manually delete it
and if you want to delete obj1 via obj (delete obj instead of delete obj1), you have to change the Obj's destructor to be virtual, otherwise the destructor of Obj1 won't be called. Damn, this is C++, enjoy it.
reference
Obj obj = Obj(99);
Obj1 obj1 = Obj1(45);
Obj& obj2 = obj1;
obj2.worked(); // should be true
In this case, unlike the pointer, you don't have to delete the objects as they are on stack (not created by 'new'). But you cannot create an array of Obj& (e.g. vector)
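Where a container of references is not possible, a container of owning pointers is one common alternative. The sketch below assumes C++14 (for std::make_unique) and uses reduced, hypothetical versions of Obj and Obj1; the function count_worked is made up for the example. The virtual destructor in Obj is what makes destruction through the base pointer safe.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Reduced, hypothetical versions of the question's classes.
struct Obj {
    virtual ~Obj() = default;            // needed to destroy Obj1 through Obj*
    virtual bool worked() const { return false; }
};

struct Obj1 : Obj {
    bool worked() const override { return true; }
};

// Stores objects of different dynamic types and counts how many
// report true from the virtual call.
std::size_t count_worked() {
    std::vector<std::unique_ptr<Obj>> objects;
    objects.push_back(std::make_unique<Obj>());
    objects.push_back(std::make_unique<Obj1>());

    std::size_t count = 0;
    for (const auto& o : objects) {
        assert(o != nullptr);
        if (o->worked()) ++count;        // dispatches on the dynamic type
    }
    return count;                        // elements are deleted automatically
}
```

No manual delete is needed here, since each std::unique_ptr releases its object when the vector goes out of scope.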
I have a doubt about downcasting an object in C++.
Here is an example:
class A { };
class B : public A {
public:
void SetVal(int i) { _v = i; }
private:
int _v;
};
A* a = new A();
B* b = dynamic_cast<B*>(a);
b->SetVal(2);
What would happen in this example? We are modifying a base class object as if it were a derived one... how does that work with respect to memory?
With this cast... is it like creating an instance of B and copying the values of A?
Thanks
A* a;
This just gives you a pointer to an A. It doesn't point anywhere in particular. It doesn't point at an A or B object at all. Whether your code works or not depends on the dynamic type of the object it is pointing at.
So there are two situations you might want to know about. First, this one:
A* a = new A();
B* b = dynamic_cast<B*>(a);
b->SetVal(2);
This will give you undefined behaviour because the dynamic_cast will return a null pointer. It returns a null pointer when the dynamic type of the object is really not a B. In this case, the object is an A. You then attempt to dereference the null pointer with b->SetVal(2), so you get undefined behaviour.
A* a = new B();
B* b = dynamic_cast<B*>(a);
b->SetVal(2);
This will work fine because the object really is a B object. The dynamic cast will succeed and the SetVal call will work just fine.
However, note that for this to work, A must be a polymorphic type. For that to be true, it must have at least one virtual member function.
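A defensive version of the cast could be sketched like this. It assumes A is given a virtual destructor to make it polymorphic, and it adds a hypothetical GetVal getter and a try_set_val helper that are not part of the original question.

```cpp
#include <cassert>

struct A {
    virtual ~A() = default;              // makes A polymorphic
};

struct B : A {
    void SetVal(int i) { _v = i; }
    int GetVal() const { return _v; }    // hypothetical getter for the example
private:
    int _v = 0;
};

// Sets the value only if a really points to a B.
// Returns false instead of dereferencing a null pointer.
bool try_set_val(A* a, int value) {
    if (B* b = dynamic_cast<B*>(a)) {
        b->SetVal(value);
        return true;
    }
    return false;
}
```

Checking the result of dynamic_cast before using it avoids the undefined behaviour of the first case, where the object is only an A.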
That shouldn't even compile, because the classes aren't polymorphic so you can't use dynamic_cast.
If it did, it would be undefined behavior.
I found the following snippet in the C++03 Standard under 5.3.5 [expr.delete] p3:
In the first alternative (delete object), if the static type of the object to be deleted is different from its dynamic type, the static type shall be a base class of the operand’s dynamic type and the static type shall have a virtual destructor or the behavior is undefined. In the second alternative (delete array) if the dynamic type of the object to be deleted differs from its static type, the behavior is undefined.
Quick review on static and dynamic types:
struct B{ virtual ~B(){} };
struct D : B{};
B* p = new D();
Static type of p is B*, while the dynamic type of *p is D, 1.3.7 [defns.dynamic.type]:
[Example: if a pointer p whose static type is “pointer to class B” is pointing to an object of class D, derived from B, the dynamic type of the expression *p is “D.”]
Now, looking at the quote at the top again, this would mean that the following code invokes undefined behaviour if I got that right, regardless of the presence of a virtual destructor:
struct B{ virtual ~B(){} };
struct D : B{};
B* p = new D[20];
delete [] p; // undefined behaviour here
Did I misunderstand the wording in the standard somehow? Did I overlook something? Why does the standard specify this as undefined behaviour?
Base* p = new Base[n] creates an n-sized array of Base elements, of which p then points to the first element. Base* p = new Derived[n] however, creates an n-sized array of Derived elements. p then points to the Base subobject of the first element. p does not however refer to the first element of the array, which is what a valid delete[] p expression requires.
Of course it would be possible to mandate (and then implement) that delete [] p Does The Right Thing™ in this case. But what would it take? An implementation would have to take care to somehow retrieve the element type of the array, and then morally dynamic_cast p to this type. Then it's a matter of doing a plain delete[] like we already do.
The problem with that is that this would be needed every time an array of polymorphic element type is deleted, regardless of whether the polymorphism is used or not. In my opinion, this doesn't fit with the C++ philosophy of not paying for what you don't use. But worse: a polymorphic-enabled delete[] p is simply useless because p is almost useless in your question. p is a pointer to a subobject of an element and no more; it's otherwise completely unrelated to the array. You certainly can't do p[i] (for i > 0) with it. So it's not unreasonable that delete[] p doesn't work.
To sum up:
arrays already have plenty of legitimate uses. By not allowing arrays to behave polymorphically (either as a whole or only for delete[]) this means that arrays with a polymorphic element type are not penalized for those legitimate uses, which is in line with the philosophy of C++.
if on the other hand an array with polymorphic behaviour is needed, it's possible to implement one in terms of what we have already.
It's wrong to treat an array-of-derived as an array-of-base, not only when deleting items. For example even just accessing the elements will usually cause disaster:
B *b = new D[10];
b[5].foo();
b[5] will use the size of B to calculate which memory location to access, and if B and D have different sizes, this will not lead to the intended results.
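The size mismatch can be observed without triggering the undefined access itself. The sketch below uses hypothetical B and D definitions with a long long member each, chosen on the assumption that sizeof(D) is larger than sizeof(B) on common ABIs; the two stride helpers are made up for the example.

```cpp
#include <cassert>
#include <cstddef>

struct B {
    virtual ~B() = default;
    long long x = 0;
};

struct D : B {
    long long y = 0;
};

// Distance in bytes between consecutive elements of a real D array.
std::ptrdiff_t actual_stride() {
    D arr[2];
    auto first = reinterpret_cast<const char*>(&arr[0]);
    auto second = reinterpret_cast<const char*>(&arr[1]);
    return second - first;
}

// Distance in bytes that indexing through a B* would step by.
std::ptrdiff_t assumed_stride() {
    return static_cast<std::ptrdiff_t>(sizeof(B));
}
```

When the two strides differ, b[1] computed through a B* lands in the middle of the first D element rather than at the start of the second one.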
Just like a std::vector<D> can't be converted to a std::vector<B>, a pointer to D[] shouldn't be convertible to a B*, but for historic reasons it compiles anyway. If a std::vector were used instead, it would produce a compile time error.
This is also explained in the C++ FAQ Lite answer on this topic.
So delete causes undefined behavior in this case because it's already wrong to treat an array in this way, even though the type system can't catch the error.
Just to add to the excellent answer of sth - I have written a short example to illustrate this issue with different offsets.
Note that if you comment out the m_c member of the Derived class, the delete operation will work well.
Cheers,
Guy.
#include <iostream>
using namespace std;
class Base
{
public:
Base(int a, int b)
: m_a(a)
, m_b(b)
{
cout << "Base::Base - setting m_a:" << m_a << " m_b:" << m_b << endl;
}
virtual ~Base()
{
cout << "Base::~Base" << endl;
}
protected:
int m_a;
int m_b;
};
class Derived : public Base
{
public:
Derived()
: Base(1, 2) , m_c(3)
{
}
virtual ~Derived()
{
cout << "Derived::~Derived" << endl;
}
private:
int m_c;
};
int main(int argc, char** argv)
{
// create an array of Derived object and point them with a Base pointer
Base* pArr = new Derived [3];
// now go ahead and delete the array using the "usual" delete notation for an array
delete [] pArr;
return 0;
}
IMHO this has to do with the limitations of arrays in dealing with constructors and destructors. Note that when new[] is called, the compiler uses only the default constructor for the elements. In the same way, when delete[] is called, the compiler might look only at the destructor of the pointer's static type.
Now in the case of a virtual destructor, the Derived class destructor should be called first, followed by the Base class destructor. Since for arrays the compiler might only see the static type of the pointer (here Base), it might end up calling just the Base destructor, which is why this is UB.
Having said that, what actually happens depends on the compiler; for example, gcc happens to call the destructors in the proper order. The code still has undefined behaviour according to the standard, so this cannot be relied on.
I think it all comes down to the zero-overhead principle. i.e. the language doesn't allow storing information about the dynamic type of elements of the array.
I have a question about C++, how to assign a Base object to a Derived object? or how to assign a pointer to a Base object to a pointer to a Derived object?
In the code below, the two lines are wrong. How to correct that?
#include <iostream>
using namespace std;
class A{
public:
int a;
};
class B:public A{
public:
int b;
};
int main(){
A a;
B b;
b = a; //what happend?
cout << b.b << endl;
B* b2;
b2 = &a; // what happened?
cout << b2->b << endl;
}
It makes no sense to assign a base object to a derived (or a base pointer to a derived pointer), so C++ will do its best to stop you doing it. The exception is when the base pointer really points at a derived, in which case you can use dynamic cast:
base * p = new derived;
derived * d = dynamic_cast <derived *>( p );
In this case, if p actually pointed at a base, the pointer d would contain NULL.
When an object is on the stack, you can only really assign objects of the same type to one another. They can be converted through overloaded cast operators or overloaded assignment operators, but you're specifying a conversion at that point. The compiler can't do such conversions itself.
A a;
B b;
b = a;
In this case, you're trying to assign an A to a B, but A isn't a B, so it doesn't work.
A a;
B b;
a = b;
This does work, after a fashion, but it probably won't be what you expect. You just sliced your B. B is an A, so the assignment can take place, but because it's on the stack, it's just going to assign the parts of b which are part of A to a. So, what you get is an A. It's not a B in spite of the fact that you assigned from a B.
If you really want to be assigning objects of one type to another, they need to be pointers.
A* pa = NULL;
B* pb = new B;
pa = pb;
This works. pa now points to the same object as pb, so it's still a B. If you have virtual functions on A and B overrides them, then when you call them on pa, they'll call the B version (non-virtual ones will still call the A version).
A* pa = new A;
B* pb = pa;
This doesn't work. pa doesn't point to a B, so you can't assign it to pb, which must point to a B. Just because a B is an A doesn't mean that an A is a B.
A a;
B* pb = &a;
This doesn't work for the same reason as the previous one. It just so happens that the A is on the stack this time instead of the heap.
A* pa;
B b;
pa = &b;
This does work. b is a B which is an A, so A can point to it. Virtual functions will call the B versions and non-virtual ones will call the A versions.
So, basically, A* can point to B's because B is an A. B* can't point to A because it isn't a B.
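The rules in this answer can be checked at compile time with type traits. This is a sketch using simplified versions of the question's A and B; the helper sliced_copy_keeps_only_a is made up for the example.

```cpp
#include <cassert>
#include <type_traits>

struct A { int a; };
struct B : A { int b; };

// Pointers: derived-to-base converts implicitly, base-to-derived does not.
static_assert(std::is_convertible<B*, A*>::value, "A* can point to a B");
static_assert(!std::is_convertible<A*, B*>::value, "B* cannot point to an A");

// Objects: a = b compiles (and slices), b = a does not compile.
static_assert(std::is_assignable<A&, B>::value, "a = b is allowed");
static_assert(!std::is_assignable<B&, A>::value, "b = a is rejected");

// Shows what survives the slicing assignment a = b.
int sliced_copy_keeps_only_a() {
    B b;
    b.a = 1;
    b.b = 2;
    assert(b.b == 2);
    A a;
    a = b;                               // copies only the A part of b
    return a.a;
}
```

The static_asserts fail to compile if any of the four rules is stated the wrong way round, which makes them a cheap way to confirm the summary above.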
The compiler won't allow that kind of thing. And even if you manage to do it through some casting hack, doing so makes no sense. Assigning a derived object to a pointer of a base makes sense because everything that base can do, derived can do. However, if the opposite case was allowed, what if you try to access a member defined in derived on a base object? You would be trying to access an area of memory filled with garbage or irrelevant data.
b = a; //what happend?
This is plain illegal - A is not B, so you can't do it.
b2 = &a; // what happened?
Same here.
In either case, the compiler wouldn't know what to assign to int b, hence it prevents you from doing that. The other way around (assigning Derived to Base) works, because Base is a subset of Derived.
Now if you would tell us, what exactly you want to achieve, we might help you.
If it's a case of assigning an A that is known to be a Derived type, you can do a cast:
A* a = new B();
B* b = dynamic_cast<B*>(a);
Just remember that if a is not a B then dynamic_cast will return NULL. Note that dynamic_cast applies to pointers (and references), not to objects themselves, and that A must have at least one virtual function for it to compile.
Derived object is a kind of Base object, not the other way around.