C++ pointer multi-inheritance fun - c++

I'm writing some code involving inheritance from a basic ref-counting pointer class; and some intricacies of C++ popped up. I've reduced it as follows:
Suppose I have:
class A{};
class B{};
class C: public A, public B {};
C c;
C* pc = &c;
B* pb = &c;
A* pa = &c;
// does pa point to a valid A object?
// does pb point to a valid B object?
// does pa == pb ?
Furthermore, does:
// pc == (C*) pa ?
// pc == (C*) pb ?
Thanks!

does pa point to a valid A object?
does pb point to a valid B object?
Yes, the C* gets converted so that pa and pb point to the correct addresses.
does pa == pb ?
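No, usually not. Once A and B have data members, the A subobject and the B subobject cannot share an address (empty base classes, as in the reduced example, may overlap). Note also that pa == pb does not compile as written, because A* and B* are unrelated pointer types.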
Furthermore, does
pc == (C*) pa ?
pc == (C*) pb ?
The cast converts the pointers back to the address of the C object, so both equalities are true.
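A minimal sketch of the points above, assuming A and B are given one data member each (the question's classes are empty) so that the subobjects need distinct storage. The helper names round_trip_ok and bases_valid are made up for illustration.

```cpp
#include <cassert>

// Hypothetical non-empty versions of the question's classes.
struct A { int a = 1; };
struct B { int b = 2; };
struct C : A, B { int c = 3; };

// Returns true if up-casting and then casting back recovers the original C*.
bool round_trip_ok(C& obj) {
    C* pc = &obj;
    A* pa = pc;  // implicit conversion to the A subobject
    B* pb = pc;  // implicit conversion to the B subobject (pointer may be adjusted)
    return static_cast<C*>(pa) == pc && static_cast<C*>(pb) == pc;
}

// Returns true if each base pointer can be used to read its own member.
bool bases_valid(C& obj) {
    A* pa = &obj;
    B* pb = &obj;
    return pa->a == 1 && pb->b == 2;
}
```

Calling round_trip_ok(c) and bases_valid(c) on any C c should both give true.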

Item 28, "Meaning of Pointer Comparison", in C++ Common Knowledge: Essential Intermediate Programming explains the key point about object pointers in C++:
In C++, an object can have multiple, valid addresses, and pointer comparison is not a question about addresses. It's a question about object identity.
Take a look at the code:
class A{};
class B{};
class C: public A, public B {};
C c;
C* pc = &c;
B* pb = &c;
A* pa = &c;
Class C derives from both class A and class B, so a C is both an A and a B. The object c has three valid addresses: one as an A, one as a B and one as a C. The layout depends on the compiler, so you can't assume the memory layout of class C, but it may look like this:
---------- <- pc (0x7ffe7d10e1e0)
| |
---------- <- pa (0x7ffe7d10e1e4)
| A data |
---------- <- pb (0x7ffe7d10e1e8)
| B data |
----------
| C data |
----------
In the above case, although the address values of pc, pa and pb aren't the same, they all refer to the same object (c), so the compiler must ensure that pc compares equal to both pa and pb, i.e., pc == pa and pc == pb. The compiler accomplishes this by adjusting the value of one of the pointers being compared by the appropriate offset. E.g.,
pc == pa
is translated to:
pc ? ((uintptr_t)pc + 4 == (uintptr_t)pa) : (pa == 0)
Among other things, since A and B have no inheritance relationship, we can't compare pa and pb directly.
For your questions:
(1) does pa point to a valid A object?
(2) does pb point to a valid B object?
Yes, refer to the diagram above.
(3) pc == (C*) pa ?
(4) pc == (C*) pb ?
Yes, and there is no need to add the (C*) cast.
(5) does pa == pb ?
No. We can't compare them.
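The comparison rule described in this answer can be sketched as follows. This assumes A and B each hold one int (the original classes are empty), and the helper names same_as_derived and null_stays_null are hypothetical.

```cpp
#include <cassert>

struct A { int a = 0; };
struct B { int b = 0; };
struct C : A, B {};

// pc is implicitly converted to const B* (with any offset applied)
// before the two pointers are compared.
bool same_as_derived(const B* pb, const C* pc) {
    return pb == pc;
}

// A null C* converts to a null B*, not to "null plus offset".
// This is the case the ternary in the translated comparison guards.
bool null_stays_null() {
    C* pc = nullptr;
    B* pb = pc;
    return pb == nullptr;
}
```

same_as_derived(&c, &c) should be true for any C c, even though the B subobject is usually not at the start of c.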

C embeds an A and a B.
class C: public A, public B {};
is very similar to the C code
struct C {
A self_a;
B self_b;
};
and (B*) &c, which is equivalent to static_cast<B*>(&c), is similar to &c.self_b if you were using straight C.
In general, you can't rely on pointers to different types being interchangeable or comparable.
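To make the analogy concrete, here is a small sketch that measures both offsets. It assumes A and B each hold one int, and the names CLike, base_b_offset and member_b_offset are made up. The two offsets agree on common ABIs, but the standard does not promise that.

```cpp
#include <cassert>
#include <cstddef>

struct A { int x; };
struct B { int y; };

// Composition, as one would write it in C (hypothetical name).
struct CLike {
    A self_a;
    B self_b;
};

// Inheritance version, with data members added to the bases.
struct C : A, B {};

// Byte distance from the start of a C to its B subobject.
std::ptrdiff_t base_b_offset(C& c) {
    B* pb = static_cast<B*>(&c);  // same result as the implicit conversion
    return reinterpret_cast<char*>(pb) - reinterpret_cast<char*>(&c);
}

// Byte distance from the start of a CLike to its self_b member.
std::ptrdiff_t member_b_offset() {
    return static_cast<std::ptrdiff_t>(offsetof(CLike, self_b));
}
```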

pc == pa;
pc == pb;
Both are well defined and true: pc is implicitly converted to the base pointer type before the comparison.
pc == (C*) pa;
pc == (C*) pb;
That's OK.
pa == pb;
No.
Do they point to valid objects?
Yes

What you get is something like this in memory
----------
| A data |
----------
| B data |
----------
| C data |
----------
So if you want the entire C object you'll get a pointer to the beginning of the memory. If you want only the A "part", you get the same address since that's where the data members are located. If you want the B "part" you get the beginning + sizeof(A) + any padding or vtable pointer the compiler adds.
Thus, in the example, the address stored in pb differs from the one in pc (the one in pa could differ from pc too), and as long as A and B have data members, pa and pb never hold the same address.

Related

lvalue ref dereference

Let's assume the following code compiles:
int main()
{
A* p = new B(); // 1
C& c = p->method(); // 2
*c = *p; // 3
c = *p; // 4
}
given that there is no operator= taking any of the classes A, B or C, and no conversion operator between any of them.
Is it possible that B inherits from A inherits from C?
And how is it possible that both line 3 and line 4 will compile?
This line
*c = *p
could never compile without overloading operator* for C. References, while having a lot of similarities to pointers, are not pointers. They are not memory addresses and thus cannot be dereferenced. Instead, they act as pointers that are automatically dereferenced whenever used. Furthermore, references can't be reassigned to refer to different objects, so although the code would compile if you changed it to c = *p, it wouldn't work as intended: c would still refer to the object that method returned.
The relations you described between A, B and C could be the case if you made c a pointer to C instead of a reference, and removed line 4. The code would then look like this:
int main()
{
A* p = new B(); // works because B inherits from A
C* c = p->method(); // works because "method" is a member function
// of A and returns a pointer to an instance of C
*c = *p; // works because A inherits from C
}
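The point about references not being reseatable can be checked with a short sketch. Item and reference_stays_bound are hypothetical names, not part of the question's code.

```cpp
#include <cassert>

// Hypothetical stand-in type with one data member.
struct Item { int value; };

// Assigning through a reference copies into the object it refers to.
// Returns true if `ref` still refers to `first` after the assignment
// and `first` now holds the value copied from `second`.
bool reference_stays_bound(Item& first, Item& second) {
    Item& ref = first;  // ref is bound to first for its whole lifetime
    ref = second;       // copies second into first; does not rebind ref
    return &ref == &first && first.value == second.value;
}
```

After reference_stays_bound(a, b), the object a holds b's value, and b itself is unchanged.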

How does C++ evaluate pointer comparisons and reinterpret_cast?

I have the following code that I run in Visual Studio. The address of c is the same as the address pa points to, but not the same as the one pb points to. Yet the first two ternary operators both evaluate as true, which is what I would have expected from reading the code alone, without seeing the addresses held by pa and pb in the debugger.
The third ternary operator will evaluate as false.
#include <iostream>
class A
{
public:
A() : m_i(0) {}
protected:
int m_i;
};
class B
{
public:
B() : m_d(0.0) {}
protected:
double m_d;
};
class C
: public A
, public B
{
public:
C() : m_c('a') {}
private:
char m_c;
};
int main()
{
C c;
A *pa = &c;
B *pb = &c;
const int x = (pa == &c) ? 1 : 2;
const int y = (pb == &c) ? 3 : 4;
const int z = (reinterpret_cast<char*>(pa) == reinterpret_cast<char*>(pb)) ? 5 : 6;
std::cout << x << y << z << std::endl;
return 0;
}
How does this work?
pa and pb are actually different. One way to test that is:
reinterpret_cast<char*>(pa) == reinterpret_cast<char*>(pb)
pa == &c and pb == &c both return true, but that does not mean the above must be true. &c will be converted to appropriate pointer type (A* or B*) via implicit pointer conversion. This conversion changes the pointer's value to the address of respective base class subobject of the object pointed-to by &c.
From cppreference:
A prvalue pointer to a (optionally cv-qualified) derived class type can be converted to a prvalue pointer to its accessible, unambiguous (identically cv-qualified) base class. The result of the conversion is a pointer to the base class subobject within the pointed-to object. The null pointer value is converted to the null pointer value of the destination type.
(emphasis mine)
A is the first non-virtual base class of C, so it is placed directly at the beginning of C's memory space, i.e.:
reinterpret_cast<char*>(pa) == reinterpret_cast<char*>(&c)
is true. But the B subobject is laid out after A, so it cannot possibly satisfy the above condition. Both implicit conversion and static_cast give you the right address of the base subobject.
A C instance has an A subobject and a B subobject.
Something like this:
|---------|
|---------|
| A |
|---------|
C: |---------|
| B |
|---------|
|---------|
Now,
A *pa = &c;
makes pa point to the location of the A subobject, and
B *pb = &c;
makes pb point to the location of the B subobject.
|---------|
|---------| <------ pa
| A |
|---------|
C: |---------| <------ pb
| B |
|---------|
|---------|
When you compare pa and pb to &c, the same thing happens - in the first case, &c is the location of the A subobject and in the second it's the location of the B subobject.
So the reason that they both compare equal to &c is that the expression &c actually has different values (and different types) in the comparisons.
When you reinterpret_cast, no adjustment takes place - it means "take the representation of this value and interpret it as representing a value of a different type".
Since the subobjects are in different locations, the results of reinterpreting them as locations of a char are also different.
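The difference between the adjusting conversion and reinterpret_cast can be sketched like this. The classes mirror the question's, with public members for brevity, and the helper names are made up. The second check assumes A is laid out first, which mainstream compilers do but the standard does not require.

```cpp
#include <cassert>

struct A { int m_i = 0; };
struct B { double m_d = 0.0; };
struct C : A, B { char m_c = 'a'; };

// static_cast knows the class layout and yields the B subobject's address,
// exactly like the implicit conversion does.
bool static_cast_matches(C& c) {
    B* pb = &c;
    return static_cast<B*>(&c) == pb;
}

// reinterpret_cast leaves the address unchanged, so comparing the raw
// char* values reveals whether the B subobject sits at the start of c.
bool b_is_at_start(C& c) {
    B* pb = &c;
    return reinterpret_cast<char*>(&c) == reinterpret_cast<char*>(pb);
}
```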
If you add some extra output, you can see what is going on; I added the following line:
std::cout << "pa: " << pa << "; pb: " << pb << "; c: " << &c << std::endl;
The output of this will vary of course, since I am printing the values of the pointers, but it will look like:
pa: 0x1000; pb: 0x1008; c: 0x1000
The pb pointer is in fact pointing 8 bytes past pa. The A subobject holds only a 4-byte int, but the double in B typically requires 8-byte alignment, so the compiler inserts padding. This happens when you do:
B *pb = &c;
The compiler converts the C* to a B* and gives you the address of the B subobject. The confusion is that your second ternary operator shows true. That is because &c undergoes the same conversion to B* before it is compared with pb, so both sides hold the address of the B subobject.
In the third comparison you're comparing the raw addresses that pa and pb hold. They're different because A and B are both base classes of C: pa points to the A base class subobject of c and pb points to the B base class subobject of c, and those occupy different memory addresses.
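A small sketch for measuring the offset of the B subobject yourself. The classes mirror the question's, with public members for brevity, and b_offset is a hypothetical helper. The exact value is up to the compiler, so no particular number is assumed here.

```cpp
#include <cassert>
#include <cstddef>

struct A { int m_i = 0; };
struct B { double m_d = 0.0; };
struct C : A, B { char m_c = 'a'; };

// Byte offset of the B subobject from the start of the enclosing C.
std::size_t b_offset(C& c) {
    B* pb = &c;  // implicit conversion adjusts the pointer
    return static_cast<std::size_t>(
        reinterpret_cast<char*>(pb) - reinterpret_cast<char*>(&c));
}
```

The offset has to be a multiple of alignof(B), and with A laid out first it is at least sizeof(int). On typical 64-bit platforms that works out to 8.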

Address of upcast object

Suppose B is a base class of D (maybe virtual, maybe multiple inheritance, need not be a direct base class).
Let obj be an object of type D (not of a subclass of D -- exactly D).
Let
D * d = std::addressof(obj);
B * b = d;
Can we safely assume that
(char*) d <= (char*) b && (char*) b < (char*) d + sizeof(D)
?
Background: This is to become a step in a routine determining whether some object has been created by placement new in a particular aligned_storage. I need to be sure that, if yes, all pointers to base objects of this object point to some address within the aligned_storage.
I am pretty sure that your assumption is safe given D is the final type of the object. Otherwise it would be treacherous to use placement new in the first place.
#include <stdlib.h>
#include <new>
struct B { int i; };
struct D : virtual B { int j; };
int
main()
{
auto const storage = malloc(sizeof(D));
D* d = new (storage) D();
free(storage);
return 0;
}
If B were located before d then the placement new would need to return a pointer adjusted based on the layout of D but "the standard allocation function void* operator new(std::size_t, void*) ... simply returns its second argument unchanged." (http://en.cppreference.com/w/cpp/language/new) Likewise, the storage of B cannot be situated such that it extends beyond (char*)d + sizeof(D) because it would overrun the memory allocated.
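The condition from the question could be checked at run time along these lines. This is only a sketch: base_within is a made-up helper, it assumes *d is a complete object of exactly the Derived type, and it compares addresses through std::uintptr_t, whose mapping from pointers is implementation-defined.

```cpp
#include <cassert>
#include <cstdint>

struct B { int i = 0; };
struct D : virtual B { int j = 0; };

// Checks that the Base subobject of *d lies inside [d, d + sizeof(Derived)).
// Assumption: *d is a complete object of exactly type Derived.
template <class Base, class Derived>
bool base_within(Derived* d) {
    Base* b = d;  // implicit derived-to-base conversion (works for virtual bases)
    auto lo = reinterpret_cast<std::uintptr_t>(d);
    auto at = reinterpret_cast<std::uintptr_t>(b);
    return lo <= at && at + sizeof(Base) <= lo + sizeof(Derived);
}
```

For a D d, base_within<B>(&d) performs the check. It is an experiment on one compiler, not the proof the answer asks for.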
Thanks for sharing an interesting question. Perhaps since asking the question you have already found a more satisfactory answer. I would be interested in reading a more concrete proof why the assumption holds or does not.

Are address of object and pointer to object the same thing for an object of polymorph class?

I was trying to solve a c++ test, and saw this question.
#include <iostream>
class A
{
public:
A() : m_i(0) { }
protected:
int m_i;
};
class B
{
public:
B() : m_d(0.0) { }
protected:
double m_d;
};
class C
: public A
, public B
{
public:
C() : m_c('a') { }
private:
char m_c;
};
int main()
{
C c;
A *pa = &c;
B *pb = &c;
const int x = (pa == &c) ? 1 : 2;
const int y = (pb == &c) ? 3 : 4;
const int z = (reinterpret_cast<char*>(pa) == reinterpret_cast<char*>(pb)) ? 5 : 6;
std::cout << x << y << z << std::endl;
return 0;
}
Output :
136
Can anyone explain its output? I thought a base pointer points to the base part of the object, so it's not the real address of the object.
Thanks.
pa points to A subobject of c. pb points to B subobject of c. Obviously, they point to different locations in memory (so 6 in the output).
But when they are compared to &c, &c is again converted to A* and B* respectively, thus pointing to the same A and B subobject.
Here's for illustration the likely layout of c in memory:
+------------------------+-------------+-------------------+
| A subobject            | B subobject | Remainder of C    |
+------------------------+-------------+-------------------+
^ &c is here             ^ pb points here
^ pa also points here
Background
Object C looks something like this in memory
----------- <----- Start of the object
| A |
|---------| <----- Beginning of B implementation
| B |
|---------|
| C |
|_________| <----- End of the object
When you take a pointer to a base class from a derived class (e.g. A* pa = &c), the pointer points to the beginning of that class implementation for that object.
So this means A* will point to the beginning of A (which happens to be the beginning of the object) and B* will point to the beginning of B. Note that C* will not point to the beginning of C because it knows that C is derived from A and B. It will point to the beginning of the object.
Why?
Because when you call pb->someFunction(), it actually takes the pointer pointing to B, adds some precalculated offset and executes. If pb were pointing to the beginning of A, then it would end up inside A. The precalculated offset is necessary because you have no idea what pb actually points to (is it C, is it "D", or just plain old B?). This approach allows us to always rely on the offset for finding the function.
Here's what your code is really doing
((A*)pa == (A*)&c) // Obviously true, since we defined it as such above.
((B*)pb == (B*)&c) // Obviously true, since we defined it as such above.
(reinterpret_cast<char*>(pa) == reinterpret_cast<char*>(pb)) // We know pa and pb point to different places in memory. If we cast them both to char*, they will obviously not be equivalent.
An interesting thing to try is
if (pa == pb)
This will give you a compilation error because you need to cast both pointers to a common type.
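One way to supply that common type is to convert both pointers back to C*, sketched below. It assumes A and B each hold one int, and same_c_object is a hypothetical helper. The static_cast is only valid when both pointers are null or really point into C objects.

```cpp
#include <cassert>

struct A { int a = 0; };
struct B { int b = 0; };
struct C : A, B {};

// Compares an A* and a B* by converting both back to C*.
// Assumption: each pointer is null or points into a complete C object;
// otherwise the static_cast has undefined behaviour.
bool same_c_object(A* pa, B* pb) {
    return static_cast<C*>(pa) == static_cast<C*>(pb);
}
```

same_c_object(&c, &c) is true for any C c, while two different C objects compare unequal.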

how to assign a Base object to a Derived object

I have a question about C++, how to assign a Base object to a Derived object? or how to assign a pointer to a Base object to a pointer to a Derived object?
In the code below, the two lines are wrong. How to correct that?
#include <iostream>
using namespace std;
class A{
public:
int a;
};
class B:public A{
public:
int b;
};
int main(){
A a;
B b;
b = a; // what happened?
cout << b.b << endl;
B* b2;
b2 = &a; // what happened?
cout << b2->b << endl;
}
It makes no sense to assign a base object to a derived (or a base pointer to a derived pointer), so C++ will do its best to stop you doing it. The exception is when the base pointer really points at a derived, in which case you can use dynamic_cast (provided base has at least one virtual function):
base * p = new derived;
derived * d = dynamic_cast <derived *>( p );
In this case, if p actually pointed at a base, the pointer d would contain NULL.
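A minimal sketch of that behaviour. Base, Derived and as_derived are placeholder names, and the virtual destructor is there only because dynamic_cast needs a polymorphic base.

```cpp
#include <cassert>

// dynamic_cast requires a polymorphic base, hence the virtual destructor.
struct Base { virtual ~Base() = default; };
struct Derived : Base { int extra = 42; };

// Returns the Derived view of *p, or nullptr if *p is not a Derived.
Derived* as_derived(Base* p) {
    return dynamic_cast<Derived*>(p);
}
```

Passing the address of a Derived gives back that same object. Passing the address of a plain Base gives nullptr.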
When an object is on the stack, you can only really assign objects of the same type to one another. They can be converted through overloaded cast operators or overloaded assignment operators, but you're specifying a conversion at that point. The compiler can't do such conversions itself.
A a;
B b;
b = a;
In this case, you're trying to assign an A to a B, but an A isn't a B, so it doesn't work.
A a;
B b;
a = b;
This does work, after a fashion, but it probably won't be what you expect. You just sliced your B. B is an A, so the assignment can take place, but because it's on the stack, it's just going to assign the parts of b which are part of A to a. So, what you get is an A. It's not a B in spite of the fact that you assigned from a B.
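The slicing described here can be seen in a short sketch that uses the same A and B as the question, with default initializers added. The slice function is a hypothetical helper.

```cpp
#include <cassert>

struct A { int a = 0; };
struct B : A { int b = 0; };

// Copies only the A part of `from`; the b member is sliced away.
A slice(const B& from) {
    A result;
    result = from;  // A's copy assignment binds to the A subobject of from
    return result;
}
```

The returned object is a plain A: it carries from.a, and there is nowhere for from.b to go.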
If you really want to be assigning objects of one type to another, they need to be pointers.
A* pa = NULL;
B* pb = new B;
pa = pb;
This works. pa now points to the same object as pb, so it's still a B. If you have virtual functions on A and B overrides them, then when you call them on pa, they'll call the B version (non-virtual ones will still call the A version).
A* pa = new A;
B* pb = pa;
This doesn't work. pa doesn't point to a B, so you can't assign it to pb, which must point to a B. Just because a B is an A doesn't mean that an A is a B.
A a;
B* pb = &a;
This doesn't work for the same reason as the previous one. It just so happens that the A is on the stack this time instead of the heap.
A* pa;
B b;
pa = &b;
This does work. b is a B which is an A, so A can point to it. Virtual functions will call the B versions and non-virtual ones will call the A versions.
So, basically, A* can point to B's because B is an A. B* can't point to A because it isn't a B.
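The virtual versus non-virtual behaviour mentioned above can be sketched as follows. The member functions virt and plain, and the two call_ helpers, are invented for this illustration.

```cpp
#include <cassert>
#include <string>

struct A {
    virtual ~A() = default;
    virtual std::string virt() const { return "A"; }
    std::string plain() const { return "A"; }
};

struct B : A {
    std::string virt() const override { return "B"; }
    std::string plain() const { return "B"; }  // hides A::plain, does not override it
};

// Dispatches on the dynamic type of *pa.
std::string call_virt(const A* pa) { return pa->virt(); }

// Resolved at compile time from the static type A.
std::string call_plain(const A* pa) { return pa->plain(); }
```

With a B b, call_virt(&b) returns "B" while call_plain(&b) returns "A".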
The compiler won't allow that kind of thing. And even if you manage to do it through some casting hack, doing so makes no sense. Assigning a derived object to a pointer of a base makes sense because everything that base can do, derived can do. However, if the opposite case was allowed, what if you try to access a member defined in derived on a base object? You would be trying to access an area of memory filled with garbage or irrelevant data.
b = a; // what happened?
This is plain illegal - A is not B, so you can't do it.
b2 = &a; // what happened?
Same here.
In both cases the compiler wouldn't know what to assign to int b, hence it prevents you from doing that. The other way around (assigning Derived to Base) works, because Base is a subset of Derived.
Now if you would tell us, what exactly you want to achieve, we might help you.
If it's a case of assigning an A that is known to be a Derived type, you can do a cast:
A* a = new B();
B* b = dynamic_cast<B*>(a);
Just remember that if a is not a B then dynamic_cast will return NULL, and that A needs at least one virtual function for dynamic_cast to compile. Note that this method works only on pointers for a reason.
Derived object is a kind of Base object, not the other way around.