Address of upcast object - c++

Suppose B is a base class of D (maybe virtual, maybe multiple inheritance, need not be a direct base class).
Let obj be an object of type D (not of a subclass of D -- exactly D).
Let
D * d = std::addressof(obj);
B * b = d;
Can we safely assume that
(char*) d <= (char*) b && (char*) b < (char*) d + sizeof(D)
?
Background: This is to become a step in a routine determining whether some object has been created by placement new in a particular aligned_storage. I need to be sure that, if yes, all pointers to base objects of this object point to some address within the aligned_storage.

I am pretty sure that your assumption is safe given D is the final type of the object. Otherwise it would be treacherous to use placement new in the first place.
#include <stdlib.h>
#include <new>
struct B { int i; };
struct D : virtual B { int j; };
int
main()
{
auto const storage = malloc(sizeof(D));
D* d = new (storage) D();
free(storage);
return 0;
}
If B were located before d then the placement new would need to return a pointer adjusted based on the layout of D but "the standard allocation function void* operator new(std::size_t, void*) ... simply returns its second argument unchanged." (http://en.cppreference.com/w/cpp/language/new) Likewise, the storage of B cannot be situated such that it extends beyond (char*)d + sizeof(D) because it would overrun the memory allocated.
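A minimal sketch of the check from the question, assuming the same B and D as in the code above. The helper name base_within_object is hypothetical. It only observes what one compiler does for one layout; it is not a proof from the standard.

```cpp
#include <cassert>
#include <memory>

struct B { int i; };
struct D : virtual B { int j; };

// Returns true when the B subobject of a complete D object lies entirely
// inside the D object's own storage, [d, d + sizeof(D)).
bool base_within_object(D& obj) {
    D* d = std::addressof(obj);
    B* b = d;  // implicit upcast; the address is adjusted for the virtual base
    const char* lo = reinterpret_cast<const char*>(d);
    const char* bp = reinterpret_cast<const char*>(b);
    return lo <= bp && bp + sizeof(B) <= lo + sizeof(D);
}
```

Calling base_within_object on an object whose most derived type is exactly D is the case the question asks about. For an object of a class derived from D, the virtual base may live outside the D subobject, so the check is not expected to hold there.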
Thanks for sharing an interesting question. Perhaps since asking the question you have already found a more satisfactory answer. I would be interested in reading a more concrete proof why the assumption holds or does not.

Related

(c++) subclass chain destruction when calling destructor

class A {
....
};
class B: public A {
int x;
...
};
class C: public A{
A * ptr;
.....
};
class D : public A {
A* ptr1;
A* ptr2;
....
};
Note: I made all the constructors for B,C,D just didn't include them in there.
So A (with no fields) is the super class and I have 3 subclasses (B,C and D) each with different fields.
A is an abstract class, and it's mostly chaining of the classes (B, C, D).
So like I might have a situation like
B *x = new B {5};
B *x2 = new B {5};
D * y = new D{x, x2};
So when I do delete y; I want to make it chain destruct the 2 pointers of its two fields which are (B objects). How would I make the destructor for class D then chain destruct?
Like the example I show is really simple but other examples have more and more layers. I want to make sure that everything is deleted so no memory leaks occur.
Should my dtor for Class D look like this ?
~D(){
delete ptr1;
delete ptr2;
}
and for the case of class C would I just do this?
~C(){
delete ptr;
}
Because I did this and it doesn't work. I get memory leaks, so what's wrong?
I assume your question is mistaken, since you say it's a chain of inheritance, but at the time of writing B, C and D are inheriting directly from A.
When you do new D what happens is you have a composite object consisting of A, B, C and D - they are constructed in inheritance order.
*y = {
// B starts
int x;
// C starts
A * ptr;
// D starts
A * ptr1;
A * ptr2;
};
Note that destructors will be called in reverse order, D first then C and so forth, to clean up the memory in reverse order (to deal with dependencies on superclass). However this depends on what sort of object you are deleting.
If we did this:
D * y = new D;
A * x = y;
delete x;
we have effectively treated y as an A object, meaning we are limited to what an A object knows about and can do so long as we're using x. If we delete using x, we are going to call ~A() rather than ~D(). This is why in this situation the destructors need to be virtual, which allows virtual ~A() to call ~D(), which will eventually call ~A() again (the real one not the virtual despatch).
Regardless of whether you are or are not using polymorphism / upcasting, your intuition is correct - ~D() should only clean up memory that D objects introduced to the class (ptr1 and ptr2), likewise for ~C() (only ptr).
The leak is probably caused by object slicing happening somewhere. For example (if you forced this to compile), the copy below is worse than the example above, because the derived part of the object is not copied at all (as opposed to the object merely being viewed through a less-derived type):
D d;
C * c = &d; // not too bad, we can't refer to D-specific things.
C c2 = d; // bad, c2 is copied from D up to C superclass and no further.
Be very careful that you only pass the pointers / references to objects that may need to hold derived types, and where you do make sure you use virtual destructors. Otherwise what you're doing is correct, letting each class's destructor clean up just for that class.
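A minimal sketch of the pattern described above, assuming the hierarchy from the question (B and D deriving directly from A) and that D owns the objects its pointers refer to. The destroyed counter and the destroy_tree function are hypothetical, added only so the destructor calls can be observed.

```cpp
#include <cassert>

// Hypothetical counter, used only to observe how many objects are destroyed.
static int destroyed = 0;

struct A {
    // Virtual, so that delete through an A* reaches the derived destructor.
    virtual ~A() { ++destroyed; }
};

struct B : A {
    explicit B(int v) : x(v) {}
    int x;
};

struct D : A {
    D(A* p1, A* p2) : ptr1(p1), ptr2(p2) {}
    ~D() override {
        delete ptr1;  // D cleans up only what D introduced
        delete ptr2;
    }
    A* ptr1;
    A* ptr2;
};

// Builds a small tree and deletes it through a base pointer.
int destroy_tree() {
    destroyed = 0;
    A* root = new D(new B{5}, new B{5});
    delete root;  // runs ~D (deleting both B objects), then ~A for root
    return destroyed;
}
```

If ~A() were not virtual, delete root would be undefined behaviour and in practice would typically skip ~D(), which is one way to end up with the leak described in the question. Copying a D like this one would also lead to a double delete, so a real version would disable copying or hold std::unique_ptr members instead.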

Meaning of dynamic type

The C++ standard says "... the interpretation of the call of a virtual function depends on the type of the object for which it is called (the dynamic type)" (p. 252) and then
"if a pointer p whose static type is 'pointer to B' is pointing to an object of class D, derived from B, the dynamic type of *p is D" (p. 2). Here B is a base class and D a derived class.
This seems to suggest (to me) that if I say
D d;
B *p = new B();
*p = d;
then, if f() is virtual in B, p->f() should call D::f(), which is wrong. I guess I'm not clear about the meaning of "pointing to an object of class ...". I know, if I say p = &d, then D::f() is called, but I would like to know why the above is wrong.
D d;
B *p = new B();
*p = d;
That final line assigns d to *p. This means it will copy the D instance using the B assignment operator and the object will be sliced. The dynamic type of *p in that case is still B.
p = &d;
That assigns the pointer to the D object to p. No slicing will occur in this case, because you are just assigning the pointers, not the objects themselves. The dynamic type of *p in this case is D.
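The two cases can be compared side by side. This is a small sketch, assuming B and D each define a virtual f() that reports its own class; the function names are hypothetical.

```cpp
#include <cassert>

struct B {
    virtual ~B() = default;
    virtual char f() const { return 'B'; }
};

struct D : B {
    char f() const override { return 'D'; }
};

// Assigns a D object to an existing B object: only the B part is copied.
char call_after_object_assignment() {
    D d;
    B* p = new B();
    *p = d;                // calls B::operator=(const B&); *p is still a B
    char result = p->f();  // virtual dispatch on dynamic type B
    delete p;
    return result;
}

// Points a B* at a D object: nothing is copied.
char call_after_pointer_assignment() {
    D d;
    B* p = &d;             // the dynamic type of *p is D
    return p->f();
}
```

The first function returns 'B' and the second returns 'D', which matches the explanation above: assignment through *p never changes what kind of object p points to.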
The crux of the answer is that in C++, the dynamic type of an object can never change. You're thinking about the assignment expression *p = d as somehow completely replacing the object at *p with the object d. But that's not what happens. In C++, an object can never truly replace another.
Since class types are involved, *p = d just calls the assignment operator of class B (that's the static type of *p) with an argument D. An object in C++ can only be acted upon, it cannot really be "replaced."
Sure, we talk about copying objects, assigning to them etc. But that's just for convenience, as most of the time, the very exact semantics doesn't matter and thinking about = as assigning one object to another is simple. But deep down, it's either a function call on the target object (for class types), or copying the value of some bits into the target object's space (for primitive types). The target object itself always remains.
As TartanLlama said above, in the third line you're "wrongfully" assigning d to the object pointed to by p, so what we refer to as slicing occurs. From Professional C++,
...When upcasting, use a pointer or reference to the superclass to avoid slicing
D d;
B *p = new B();
*p = d;
This means the correct code evaluates to the code hereunder
D d;
B *p = &d; // point p at d directly; allocating a B first and then reassigning p would leak it
OR
D d;
B *p = static_cast<B*>(&d);
Consequently, the dynamic type of the object pointed to by p (*p) is still D. This allows you to switch back and forth through the inheritance hierarchy by upcasting or downcasting the pointer to the child class and not the object.

interpreting object addresses with reinterpret_cast

The following code gives the output as 136. But I could not understand how the first two address comparisons are equal. Appreciate any help to understand this. Thank you.
#include <iostream>
class A
{
public:
A() : m_i(0){ }
protected:
int m_i;
};
class B
{
public:
B() : m_d(0.0) { }
protected:
double m_d;
};
class C : public A, public B
{
public:
C() : m_c('a') { }
private:
char m_c;
};
int main( )
{
C d;
A *b1 = &d;
B *b2 = &d;
const int a = (reinterpret_cast<char *>(b1) == reinterpret_cast<char *>(&d)) ? 1 : 2;
const int b = (b2 == &d) ? 3 : 4;
const int c = (reinterpret_cast<char *>(b1) == reinterpret_cast<char *>(b2)) ? 5 : 6;
std::cout << a << b << c << std::endl;
return 0;
}
When you use multiple inheritance like in your example the first base class and the derived class share the same base address. Additional classes you inherit from are arranged in order at an offset based on the size of all preceding classes. The result of the comparison is true because the base address of d and b1 are the same.
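In your case, if the size of A is 4 bytes, then B would start at the base address of A + 4 bytes if there were no padding. Because B holds a double, which typically needs 8-byte alignment, compilers usually insert padding and place B at A + 8 bytes. When you do B *b2 = &d; the compiler calculates the offset and adjusts the pointer value accordingly.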
When you do b2 == &d an implicit conversion from type 'C *' to type 'B *' is performed on &d before the comparison is done. This conversion adjusts the offset of the pointer value just as it would in an assignment.
It’s pretty typical for a derived class (like C here) to be laid out in memory so it starts with its two base classes (A and B), so the address of an instance of type C would be identical to the address of the instance of its first base class (i.e. A).
In this kind of inheritance (when virtual is not involved,) each instance of C will have the following layout:
First, there will be all members of A (which is just m_i, a 4-byte integer)
Second will be all members of B (which is just m_d, an 8-byte double)
Last will be all members of C itself, which is just a character (1 byte, m_c)
When you cast a pointer to an instance of C to A, because A is the first parent, no address adjustment takes place, and the numerical value of the two pointers will be the same. This is why the first comparison evaluates to true. (Note that doing a reinterpret_cast<char *>() on a pointer never causes adjustment, so it always gives the numerical value of the pointer. Casting to void * would have the same effect and is probably safer for comparison.)
Casting a pointer to an instance of C to B will cause a pointer adjustment (by at least the 4 bytes of A, and typically 8 once padding for the double's alignment is included), which means that the numerical value of b2 will not be equal to &d. However, when you directly compare b2 and &d, the compiler automatically generates a cast for &d to B *, which adjusts the numerical value by the same offset. This is the reason that the second comparison also evaluates to true.
The third comparison returns false because, as said before, casting a pointer to an instance of C to A or to B will have different results (casting to A * doesn't do adjustment, while casting to B * does).
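The adjustment can be made visible by measuring it. This is a sketch assuming the same A, B and C as in the question; offset_of_A and offset_of_B are hypothetical helper names. The actual numbers depend on the compiler and platform, so none are hard-coded here.

```cpp
#include <cassert>
#include <cstddef>

class A {
public:
    A() : m_i(0) {}
protected:
    int m_i;
};

class B {
public:
    B() : m_d(0.0) {}
protected:
    double m_d;
};

class C : public A, public B {
public:
    C() : m_c('a') {}
private:
    char m_c;
};

// Byte distance from the start of a C object to its A subobject.
std::ptrdiff_t offset_of_A(C& c) {
    A* a = &c;  // implicit conversion C* -> A*
    return reinterpret_cast<char*>(a) - reinterpret_cast<char*>(&c);
}

// Byte distance from the start of a C object to its B subobject.
std::ptrdiff_t offset_of_B(C& c) {
    B* b = &c;  // implicit conversion C* -> B*; the compiler adds the offset
    return reinterpret_cast<char*>(b) - reinterpret_cast<char*>(&c);
}
```

On common 64-bit compilers one would expect offset_of_A to be 0 and offset_of_B to be 8, but the only thing the comparisons in the question rely on is that the two offsets differ.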

Why is it undefined behavior to delete[] an array of derived objects via a base pointer?

I found the following snippet in the C++03 Standard under 5.3.5 [expr.delete] p3:
In the first alternative (delete object), if the static type of the object to be deleted is different from its dynamic type, the static type shall be a base class of the operand’s dynamic type and the static type shall have a virtual destructor or the behavior is undefined. In the second alternative (delete array) if the dynamic type of the object to be deleted differs from its static type, the behavior is undefined.
Quick review on static and dynamic types:
struct B{ virtual ~B(){} };
struct D : B{};
B* p = new D();
Static type of p is B*, while the dynamic type of *p is D, 1.3.7 [defns.dynamic.type]:
[Example: if a pointer p whose static type is “pointer to class B” is pointing to an object of class D, derived from B, the dynamic type of the expression *p is “D.”]
Now, looking at the quote at the top again, this would mean that the following code invokes undefined behaviour if I got that right, regardless of the presence of a virtual destructor:
struct B{ virtual ~B(){} };
struct D : B{};
B* p = new D[20];
delete [] p; // undefined behaviour here
Did I misunderstand the wording in the standard somehow? Did I overlook something? Why does the standard specify this as undefined behaviour?
Base* p = new Base[n] creates an n-sized array of Base elements, of which p then points to the first element. Base* p = new Derived[n] however, creates an n-sized array of Derived elements. p then points to the Base subobject of the first element. p does not however refer to the first element of the array, which is what a valid delete[] p expression requires.
Of course it would be possible to mandate (and then implement) that delete [] p Does The Right Thing™ in this case. But what would it take? An implementation would have to take care to somehow retrieve the element type of the array, and then morally dynamic_cast p to this type. Then it's a matter of doing a plain delete[] like we already do.
The problem with that is that this would be needed every time an array with a polymorphic element type is deleted, regardless of whether the polymorphism is used or not. In my opinion, this doesn't fit with the C++ philosophy of not paying for what you don't use. But worse: a polymorphic-enabled delete[] p is simply useless because p is almost useless in your question. p is a pointer to a subobject of an element and no more; it's otherwise completely unrelated to the array. You certainly can't do p[i] (for i > 0) with it. So it's not unreasonable that delete[] p doesn't work.
To sum up:
arrays already have plenty of legitimate uses. By not allowing arrays to behave polymorphically (either as a whole or only for delete[]) this means that arrays with a polymorphic element type are not penalized for those legitimate uses, which is in line with the philosophy of C++.
if on the other hand an array with polymorphic behaviour is needed, it's possible to implement one in terms of what we have already.
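One common way to get an "array with polymorphic behaviour" out of existing pieces is a container of owning base pointers. This is only a sketch of that idea, not the only option; Base, Derived, make_items and sum_ids are hypothetical names, and std::unique_ptr requires C++11.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Base {
    virtual ~Base() = default;  // needed so unique_ptr<Base> destroys Derived correctly
    virtual int id() const { return 0; }
};

struct Derived : Base {
    int extra = 42;  // extra state that a Base-sized array slot could not hold
    int id() const override { return 1; }
};

// Builds n Derived objects, each owned through a pointer to Base.
std::vector<std::unique_ptr<Base>> make_items(std::size_t n) {
    std::vector<std::unique_ptr<Base>> items;
    items.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        items.push_back(std::unique_ptr<Base>(new Derived));
    }
    return items;
}

// Sums id() over all elements; each call dispatches virtually.
int sum_ids(const std::vector<std::unique_ptr<Base>>& items) {
    int total = 0;
    for (const auto& item : items) {
        total += item->id();
    }
    return total;
}
```

Each element is deleted individually through its own pointer, so the single-object rule (virtual destructor required) applies instead of the delete[] rule. The cost is one allocation per element, which is exactly the overhead that plain arrays do not have to pay.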
It's wrong to treat an array-of-derived as an array-of-base, not only when deleting items. For example even just accessing the elements will usually cause disaster:
B *b = new D[10];
b[5].foo();
b[5] will use the size of B to calculate which memory location to access, and if B and D have different sizes, this will not lead to the intended results.
Just like a std::vector<D> can't be converted to a std::vector<B>, a pointer to D[] shouldn't be convertible to a B*, but for historic reasons it compiles anyway. If a std::vector would be used instead, it would produce a compile time error.
This is also explained in the C++ FAQ Lite answer on this topic.
So delete causes undefined behavior in this case because it's already wrong to treat an array in this way, even though the type system can't catch the error.
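The size mismatch can be shown without touching such an array at all. This is a sketch with hypothetical types and helper names; it only compares the byte offsets the two views would use.

```cpp
#include <cassert>
#include <cstddef>

struct B {
    virtual ~B() {}
    int b = 0;
};

struct D : B {
    int d[4] = {};  // extra members make D larger than B
};

// Where b[i] looks when b is a B*: i * sizeof(B) bytes past the start.
std::size_t offset_seen_through_base(std::size_t i) {
    return i * sizeof(B);
}

// Where element i of an array of D really starts.
std::size_t offset_of_real_element(std::size_t i) {
    return i * sizeof(D);
}
```

Only index 0 agrees between the two views. For any other index, indexing through a B* lands somewhere in the middle of a D element, and delete[] has the same problem when it walks the array to run destructors.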
Just to add to the excellent answer of sth - I have written a short example to illustrate this issue with different offsets.
Note that if you comment out the m_c member of the Derived class, Derived and Base have the same size and the delete operation will appear to work, although it is still formally undefined behaviour.
Cheers,
Guy.
#include <iostream>
using namespace std;
class Base
{
public:
Base(int a, int b)
: m_a(a)
, m_b(b)
{
cout << "Base::Base - setting m_a:" << m_a << " m_b:" << m_b << endl;
}
virtual ~Base()
{
cout << "Base::~Base" << endl;
}
protected:
int m_a;
int m_b;
};
class Derived : public Base
{
public:
Derived()
: Base(1, 2) , m_c(3)
{
}
virtual ~Derived()
{
cout << "Derived::~Derived" << endl;
}
private:
int m_c;
};
int main(int argc, char** argv)
{
// create an array of Derived object and point them with a Base pointer
Base* pArr = new Derived [3];
// now go ahead and delete the array using the "usual" delete notation for an array
delete [] pArr;
return 0;
}
IMHO this has to do with the limited way arrays deal with constructors and destructors. Note that, when new[] is called, the elements are normally all default-constructed. In the same way, when delete[] is called, the compiler might look only at the destructor (and element size) of the pointer's static type.
Now in the case of a virtual destructor, the Derived class destructor should be called first, followed by the Base class destructor. Since for arrays the compiler might only see the static type of the pointer (here Base), it might end up calling just the Base destructor, which is UB.
Having said that, the visible symptom differs between compilers; for example, gcc happens to call the destructors in the proper order. According to the standard it is still undefined behaviour.
I think it all comes down to the zero-overhead principle. i.e. the language doesn't allow storing information about the dynamic type of elements of the array.

how to assign a Base object to a Derived object

I have a question about C++, how to assign a Base object to a Derived object? or how to assign a pointer to a Base object to a pointer to a Derived object?
In the code below, the two lines are wrong. How to correct that?
#include <iostream>
using namespace std;
class A{
public:
int a;
};
class B:public A{
public:
int b;
};
int main(){
A a;
B b;
b = a; // what happened?
cout << b.b << endl;
B* b2;
b2 = &a; // what happened?
cout << b2->b << endl;
}
It makes no sense to assign a base object to a derived (or a base pointer to a derived pointer), so C++ will do its best to stop you doing it. The exception is when the base pointer really points at a derived, in which case you can use dynamic cast:
base * p = new derived;
derived * d = dynamic_cast <derived *>( p );
In this case, if p actually pointed at a base, the pointer d would contain NULL.
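A small sketch of both outcomes, assuming base has at least one virtual function (dynamic_cast on a non-polymorphic base pointer does not compile). The as_derived helper is a hypothetical name.

```cpp
#include <cassert>

struct base {
    virtual ~base() = default;  // makes base polymorphic, as dynamic_cast requires
};

struct derived : base {
    int extra = 7;
};

// Returns the derived view of p, or nullptr when *p is not a derived.
derived* as_derived(base* p) {
    return dynamic_cast<derived*>(p);
}
```

Passing the address of a derived object gives back a pointer to that same object, while passing the address of a plain base object gives a null pointer, so the result should be checked before use.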
When an object is on the stack, you can only really assign objects of the same type to one another. They can be converted through overloaded cast operators or overloaded assignment operators, but you're specifying a conversion at that point. The compiler can't do such conversions itself.
A a;
B b;
b = a;
In this case, you're trying to assign an A to a B, but A isn't a B, so it doesn't work.
A a;
B b;
a = b;
This does work, after a fashion, but it probably won't be what you expect. You just sliced your B. B is an A, so the assignment can take place, but because it's on the stack, it's just going to assign the parts of b which are part of A to a. So, what you get is an A. It's not a B in spite of the fact that you assigned from a B.
If you really want to be assigning objects of one type to another, they need to be pointers.
A* pa = NULL;
B* pb = new B;
pa = pb;
This works. pa now points to the same object as pb, so it's still a B. If you have virtual functions on A and B overrides them, then when you call them on pa, they'll call the B version (non-virtual ones will still call the A version).
A* pa = new A;
B* pb = pa;
This doesn't work. pa doesn't point to a B, so you can't assign it to pb, which must point to a B. Just because a B is an A doesn't mean that an A is a B.
A a;
B* pb = &a;
This doesn't work for the same reason as the previous one. It just so happens that the A is on the stack this time instead of the heap.
A* pa;
B b;
pa = &b;
This does work. b is a B which is an A, so A can point to it. Virtual functions will call the B versions and non-virtual ones will call the A versions.
So, basically, A* can point to B's because B is an A. B* can't point to A because it isn't a B.
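The cases above can be checked at compile time with standard type traits. This is a sketch assuming the plain A and B from the question; the four constant names are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

struct A { int a; };
struct B : A { int b; };

// B* converts to A* implicitly (upcast).
constexpr bool upcast_ok = std::is_convertible<B*, A*>::value;
// A* does not convert to B* implicitly (downcast needs an explicit cast).
constexpr bool downcast_ok = std::is_convertible<A*, B*>::value;
// a = b compiles: A's assignment operator accepts the A part of b (slicing).
constexpr bool slicing_assign_ok = std::is_assignable<A&, B>::value;
// b = a does not compile: there is nothing to put into the B-only members.
constexpr bool reverse_assign_ok = std::is_assignable<B&, A>::value;

static_assert(upcast_ok && !downcast_ok, "pointer conversions go derived-to-base only");
static_assert(slicing_assign_ok && !reverse_assign_ok, "object assignment goes derived-to-base only");
```

The traits only report what the compiler accepts. They say nothing about whether slicing is what you want, which is a design question.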
The compiler won't allow that kind of thing. And even if you manage to do it through some casting hack, doing so makes no sense. Assigning a derived object to a pointer of a base makes sense because everything that base can do, derived can do. However, if the opposite case was allowed, what if you try to access a member defined in derived on a base object? You would be trying to access an area of memory filled with garbage or irrelevant data.
b = a; // what happened?
This is plain illegal - A is not B, so you can't do it.
b2 = &a; // what happened?
Same here.
In either case, the compiler wouldn't know what to assign to int b, hence it prevents you from doing that. The other way around (assigning Derived to Base) works, because Base is a subset of Derived.
Now if you would tell us, what exactly you want to achieve, we might help you.
If it's a case of assigning an A that is known to be a Derived type, you can do a cast:
A* a = new B();
B* b = dynamic_cast<B*>(a);
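Just remember that if a does not point to a B then dynamic_cast will return NULL, and that A must have at least one virtual function for dynamic_cast to compile. Note that the NULL result only exists for pointers; with references, a failed dynamic_cast throws std::bad_cast instead.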
Derived object is a kind of Base object, not the other way around.