How are pointer comparisons and reinterpret_cast evaluated? - c++

I have the following code, which I run in Visual Studio. The address of c is the same as the address pa points to, but not the same as the address pb points to. Yet the first two ternary operators both evaluate as true, which is what I would have expected from reading the code alone, without seeing the addresses held by pa and pb in the debugger.
The third ternary operator will evaluate as false.
#include <iostream>
class A
{
public:
A() : m_i(0) {}
protected:
int m_i;
};
class B
{
public:
B() : m_d(0.0) {}
protected:
double m_d;
};
class C
: public A
, public B
{
public:
C() : m_c('a') {}
private:
char m_c;
};
int main()
{
C c;
A *pa = &c;
B *pb = &c;
const int x = (pa == &c) ? 1 : 2;
const int y = (pb == &c) ? 3 : 4;
const int z = (reinterpret_cast<char*>(pa) == reinterpret_cast<char*>(pb)) ? 5 : 6;
std::cout << x << y << z << std::endl;
return 0;
}
How does this work?

pa and pb are actually different. One way to test that is:
reinterpret_cast<char*>(pa) == reinterpret_cast<char*>(pb)
pa == &c and pb == &c both return true, but that does not mean the above must be true. &c will be converted to appropriate pointer type (A* or B*) via implicit pointer conversion. This conversion changes the pointer's value to the address of respective base class subobject of the object pointed-to by &c.
From cppreference:
A prvalue pointer to a (optionally cv-qualified) derived class type can be converted to a prvalue pointer to its accessible, unambiguous (identically cv-qualified) base class. The result of the conversion is a pointer to the base class subobject within the pointed-to object. The null pointer value is converted to the null pointer value of the destination type.
(emphasis mine)
A is the first non-virtual base class of C, so it is placed directly at the beginning of C's memory space, i.e.:
reinterpret_cast<char*>(pa) == reinterpret_cast<char*>(&c)
is true. But the B subobject is laid out after A, so it cannot possibly satisfy the above condition. Both implicit conversion and static_cast give you the right address of the base subobject.

A C instance has an A subobject and a B subobject.
Something like this:
|---------|
|---------|
| A |
|---------|
C: |---------|
| B |
|---------|
|---------|
Now,
A *pa = &c;
makes pa point to the location of the A subobject, and
B *pb = &c;
makes pb point to the location of the B subobject.
|---------|
|---------| <------ pa
| A |
|---------|
C: |---------| <------ pb
| B |
|---------|
|---------|
When you compare pa and pb to &c, the same thing happens - in the first case, &c is the location of the A subobject and in the second it's the location of the B subobject.
So the reason that they both compare equal to &c is that the expression &c actually has different values (and different types) in the comparisons.
When you reinterpret_cast, no adjustment takes place - it means "take the representation of this value and interpret it as representing a value of a different type".
Since the subobjects are in different locations, the results of reinterpreting them as locations of a char are also different.

If you add some extra output, you can see what is going on; I added the following line:
std::cout << "pa: " << pa << "; pb: " << pb << "; c: " << &c << std::endl;
The output of this will vary of course, since I am printing the values of the pointers, but it will look like:
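pa: 0x1000; pb: 0x1008; c: 0x1000
The pb pointer is in fact pointing 8 bytes past pa. sizeof(int) is normally only 4, but B holds a double, so on my 64-bit machine the B subobject is presumably padded out to an 8-byte boundary. This is because when you do: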
B *pb = &c;
The compiler converts the C* to a B*, which yields the address of the B subobject rather than the address of c itself. The confusion is that your second ternary operator still shows true. This is because &c is implicitly converted to B* before the comparison, so the same adjustment is applied to both sides.

You're comparing the addresses that pa and pb point to directly. They're different because A and B are both base classes of C: pa points to the base class subobject A of c and pb points to the base class subobject B of c, so the actual memory addresses differ. They can't point to the same memory address.

Related

Are address of object and pointer to object the same thing for an object of polymorph class?

I was trying to solve a c++ test, and saw this question.
#include <iostream>
class A
{
public:
A() : m_i(0) { }
protected:
int m_i;
};
class B
{
public:
B() : m_d(0.0) { }
protected:
double m_d;
};
class C
: public A
, public B
{
public:
C() : m_c('a') { }
private:
char m_c;
};
int main()
{
C c;
A *pa = &c;
B *pb = &c;
const int x = (pa == &c) ? 1 : 2;
const int y = (pb == &c) ? 3 : 4;
const int z = (reinterpret_cast<char*>(pa) == reinterpret_cast<char*>(pb)) ? 5 : 6;
std::cout << x << y << z << std::endl;
return 0;
}
Output :
136
Can anyone explain its output? I thought a base pointer points to the base part of the object, so it's not the real address of the object.
Thanks.
pa points to A subobject of c. pb points to B subobject of c. Obviously, they point to different locations in memory (so 6 in the output).
But when they are compared to &c, &c is again converted to A* and B* respectively, thus pointing to the same A and B subobject.
Here's for illustration the likely layout of c in memory:
+------------------------+-------------+-------------------+
| A subobject            | B subobject | Remainder of C    |
+------------------------+-------------+-------------------+
^ &c is here             ^ pb points here
^ pa also points here
Background
Object C looks something like this in memory
----------- <----- Start of the object
| A |
|---------| <----- Beginning of B implementation
| B |
|---------|
| C |
|_________| <----- End of the object
When you take a pointer to a base class from a derived class (e.g. A* pa = &c), the pointer points to the beginning of that class implementation for that object.
So this means A* will point to the beginning of A (which happens to be the beginning of the object) and B* will point to the beginning of B. Note that C* will not point to the beginning of C because it knows that C is derived from A and B. It will point to the beginning of the object.
Why?
Because when you call pb->someFunction(), it actually takes the pointer pointing to B, adds some precalculated offset, and executes. If pb were pointing to the beginning of A, it would end up inside A. The precalculated offset is necessary because you have no idea what pb actually points to (is it a C, a "D", or just a plain old B?). This approach allows us to always rely on the offset for finding the function.
Here's what your code is really doing
((A*)pa == (A*)&c) // Obviously true, since we defined it as such above.
((B*)pb == (B*)&c) // Obviously true, since we defined it as such above.
(reinterpret_cast<char*>(pa) == reinterpret_cast<char*>(pb)) // We know pa and pb point to different places in memory. If we cast them both to char*, they will obviously not be equivalent.
An interesting thing to try is
if (pa == pb)
This will give you a compilation error because you need to cast both pointers to a common type.

Difference between static and dynamic cast

The class is polymorphic.
Why do both print the same output?
class A
{
public:
virtual void P(){ cout << "A" << endl; }
};
class B : public A
{
public:
void P()override{
cout << "B" << endl;
}
B(){ cout << "Created B" << endl; s = "Created by B"; }
string s;
};
And main:
Variant 1:
A* a = new B(); // Created B
B* b = static_cast<B*>(a);
b->P(); // B
cout<<b->s<<endl; // Created by B
And variant 2:
A* a = new B();
B* b = dynamic_cast<B*>(a);
if (b){
b->P();
cout << b->s << endl; // Prints same
}
Both of your examples will do the same thing, and that's fine. Try with this instead:
A* a = new A();
In this case, the static_cast will "succeed" (though it is undefined behavior), whereas the dynamic_cast will "fail" (by returning nullptr, which you already check for).
Your original examples don't show anything interesting because they both succeed and are valid casts. The difference with dynamic_cast is that it lets you detect invalid casts.
If you want to know how dynamic_cast does this, read about RTTI, Run Time Type Information. This is some additional bookkeeping that C++ does in certain cases to inspect the type of an object (it's important for this and also if you use typeid()).
In this case, static_cast is semantically equivalent to dynamic_cast.
static_cast < new_type > ( expression )
2) If new_type is a pointer or reference to some class D and the
type of expression is a pointer or reference to its non-virtual base
B, static_cast performs a downcast. Such static_cast makes no
runtime checks to ensure that the object's runtime type is actually D,
and may only be used safely if this precondition is guaranteed by
other means, such as when implementing static polymorphism. Safe
downcast may be done with dynamic_cast.
dynamic_cast < new_type > ( expression )
5) If expression is a pointer or reference to a polymorphic type Base,
and new_type is a pointer or reference to the type Derived a run-time
check is performed:
a) The most derived object pointed/identified by expression is
examined. If, in that object, expression points/refers to a public
base of Derived, and if only one subobject of Derived type is derived
from the subobject pointed/identified by expression, then the result
of the cast points/refers to that Derived subobject. (This is known as
a "downcast".)
[...]
c) Otherwise, the runtime check fails. If the
dynamic_cast is used on pointers, the null pointer value of type
new_type is returned. If it was used on references, the exception
std::bad_cast is thrown.
The last clause is what makes dynamic_cast safer, as you can check if the cast was unsuccessful:
Base* b1 = new Base;
if(Derived* d = dynamic_cast<Derived*>(b1))
{
std::cout << "downcast from b1 to d successful\n";
d->name(); // safe to call
}

interpreting object addresses with reinterpret_cast

The following code gives the output 136. But I could not understand why the first two address comparisons are equal. I'd appreciate any help understanding this. Thank you.
#include <iostream>
class A
{
public:
A() : m_i(0){ }
protected:
int m_i;
};
class B
{
public:
B() : m_d(0.0) { }
protected:
double m_d;
};
class C : public A, public B
{
public:
C() : m_c('a') { }
private:
char m_c;
};
int main( )
{
C d;
A *b1 = &d;
B *b2 = &d;
const int a = (reinterpret_cast<char *>(b1) == reinterpret_cast<char *>(&d)) ? 1 : 2;
const int b = (b2 == &d) ? 3 : 4;
const int c = (reinterpret_cast<char *>(b1) == reinterpret_cast<char *>(b2)) ? 5 : 6;
std::cout << a << b << c << std::endl;
return 0;
}
When you use multiple inheritance as in your example, the first base class and the derived class share the same base address. The additional base classes are arranged in order, at offsets determined by the sizes of all preceding classes plus any alignment padding. The first comparison is true because the base addresses of d and b1 are the same.
In your case, if the size of A is 4 bytes and no padding is needed, then B will start at the base address of A + 4 bytes (since B holds a double, it will more typically start at + 8). When you do B *b2 = &d; the compiler calculates the offset and adjusts the pointer value accordingly.
When you do b2 == &d, an implicit conversion from type 'C *' to type 'B *' is performed on &d before the comparison is done. This conversion adjusts the pointer value by the offset just as it would in an assignment.
It’s pretty typical for a derived class (like C here) to be laid out in memory so it starts with its two base classes (A and B), so the address of an instance of type C would be identical to the address of the instance of its first base class (i.e. A).
In this kind of inheritance (when virtual is not involved,) each instance of C will have the following layout:
First, there will be all members of A (which is just m_i, a 4-byte integer)
Second will be all members of B (which is just m_d, an 8-byte double)
Last will be all members of C itself, which is just a character (1 byte, m_c)
When you cast a pointer to an instance of C to A, because A is the first parent, no address adjustment takes place, and the numerical value of the two pointers will be the same. This is why the first comparison evaluates to true. (Note that doing a reinterpret_cast<char *>() on a pointer never causes adjustment, so it always gives the numerical value of the pointer. Casting to void * would have the same effect and is probably safer for comparison.)
Casting a pointer to an instance of C to B will cause a pointer adjustment (by the offset of the B part: 4 bytes for m_i plus any padding the compiler inserts to align the double), which means that the numerical value of b2 will not be equal to &d. However, when you directly compare b2 and &d, the compiler automatically generates a cast for &d to B *, which adjusts the numerical value by the same offset. This is the reason that the second comparison also evaluates to true.
The third comparison returns false because, as said before, casting a pointer to an instance of C to A or to B will have different results (casting to A * doesn't do adjustment, while casting to B * does).

how to assign a Base object to a Derived object

I have a question about C++, how to assign a Base object to a Derived object? or how to assign a pointer to a Base object to a pointer to a Derived object?
In the code below, the two lines are wrong. How to correct that?
#include <iostream>
using namespace std;
class A{
public:
int a;
};
class B:public A{
public:
int b;
};
int main(){
A a;
B b;
b = a; // what happened?
cout << b.b << endl;
B* b2;
b2 = &a; // what happened?
cout << b2->b << endl;
}
It makes no sense to assign a base object to a derived (or a base pointer to a derived pointer), so C++ will do its best to stop you doing it. The exception is when the base pointer really points at a derived, in which case you can use dynamic_cast (provided base is polymorphic, i.e. has at least one virtual function):
base * p = new derived;
derived * d = dynamic_cast <derived *>( p );
In this case, if p actually pointed at a base, the pointer d would contain NULL.
When an object is on the stack, you can only really assign objects of the same type to one another. They can be converted through overloaded cast operators or overloaded assignment operators, but you're specifying a conversion at that point. The compiler can't do such conversions itself.
A a;
B b;
b = a;
In this case, you're trying to assign an A to a B, but A isn't a B, so it doesn't work.
A a;
B b;
a = b;
This does work, after a fashion, but it probably won't be what you expect. You just sliced your B. B is an A, so the assignment can take place, but because it's on the stack, it's just going to assign the parts of b which are part of A to a. So, what you get is an A. It's not a B in spite of the fact that you assigned from a B.
If you really want to be assigning objects of one type to another, they need to be pointers.
A* pa = NULL;
B* pb = new B;
pa = pb;
This works. pa now points to the same object as pb, so it's still a B. If you have virtual functions on A and B overrides them, then when you call them on pa, they'll call the B version (non-virtual ones will still call the A version).
A* pa = new A;
B* pb = pa;
This doesn't work. pa doesn't point to a B, so you can't assign it to pb, which must point to a B. Just because a B is an A doesn't mean that an A is a B.
A a;
B* pb = &a;
This doesn't work for the same reason as the previous one. It just so happens that the A is on the stack this time instead of the heap.
A* pa;
B b;
pa = &b;
This does work. b is a B which is an A, so A can point to it. Virtual functions will call the B versions and non-virtual ones will call the A versions.
So, basically, A* can point to B's because B is an A. B* can't point to A because it isn't a B.
The compiler won't allow that kind of thing. And even if you manage to do it through some casting hack, doing so makes no sense. Assigning a derived object to a pointer of a base makes sense because everything that base can do, derived can do. However, if the opposite case was allowed, what if you try to access a member defined in derived on a base object? You would be trying to access an area of memory filled with garbage or irrelevant data.
b = a; // what happened?
This is plain illegal - A is not B, so you can't do it.
b2 = &a; // what happened?
Same here.
In both cases, the compiler wouldn't know what to assign to int b, hence it prevents you from doing that. The other way around (assigning Derived to Base) works, because Base is a subset of Derived.
Now if you would tell us, what exactly you want to achieve, we might help you.
If it's a case of assigning an A that is known to be a Derived type, you can do a cast:
A* a = new B();
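B* b = dynamic_cast<B*>(a);
Just remember that if a is not a B then dynamic_cast will return NULL. Note that dynamic_cast requires A to be polymorphic (to have at least one virtual function), and that the NULL result applies to pointers only; a failed cast on a reference throws std::bad_cast instead.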
Derived object is a kind of Base object, not the other way around.

C++ pointer multi-inheritance fun

I'm writing some code involving inheritance from a basic ref-counting pointer class; and some intricacies of C++ popped up. I've reduced it as follows:
Suppose I have:
class A{};
class B{};
class C: public A, public B {};
C c;
C* pc = &c;
B* pb = &c;
A* pa = &c;
// does pa point to a valid A object?
// does pb point to a valid B object?
// does pa == pb ?
Furthermore, does:
// pc == (C*) pa ?
// pc == (C*) pb ?
Thanks!
does pa point to a valid A object?
does pb point to a valid B object?
Yes, the C* gets converted so that pa and pb point to the correct addresses.
does pa == pb ?
No, usually not. There can't be an A object and a B object at the same address.
Furthermore, does
pc == (C*) pa ?
pc == (C*) pb ?
The cast converts the pointers back to the address of the C object, so both equalities are true.
Item 28, "Meaning of Pointer Comparison", in C++ Common Knowledge: Essential Intermediate Programming explains the key point about object pointers in C++:
In C++, an object can have multiple, valid addresses, and pointer comparison is not a question about addresses. It's a question about object identity.
Take a look at the code:
class A{};
class B{};
class C: public A, public B {};
C c;
C* pc = &c;
B* pb = &c;
A* pa = &c;
Class C derives from both class A and class B, so a C is both an A and a B. The object c has 3 valid addresses: one for class A, one for class B and one for class C. The implementation depends on the compiler, so you can't assume the memory layout of class C, but it may look like this:
---------- <- pc (0x7ffe7d10e1e0)
| |
---------- <- pa (0x7ffe7d10e1e4)
| A data |
---------- <- pb (0x7ffe7d10e1e8)
| B data |
----------
| C data |
----------
In the above case, although the address values of pc, pa and pb aren't the same, they all refer to the same object (c), so the compiler must ensure that pc compares equal to both pa and pb, i.e., pc == pa and pc == pb. The compiler accomplishes this comparison by adjusting the value of one of the pointers being compared by the appropriate offset. E.g.,
pc == pa
is translated to:
pc ? ((uintptr_t)pc + 4 == (uintptr_t)pa) : (pa == 0)
Among other things, since A and B have no inheritance relationship, we can't compare pa and pb directly.
For your questions:
(1) does pa point to a valid A object?
(2) does pb point to a valid B object?
Yes, refer to the above diagram.
(3) pc == (C*) pa ?
(4) pc == (C*) pb ?
Yes. There is no need to add (C*).
(5) does pa == pb ?
No. We can't compare them.
C embeds an A and a B.
class C: public A, public B {};
is very similar to the C code
struct C {
A self_a;
B self_b;
};
and (B*) &c, which is equivalent to static_cast< B* >( &c ), is similar to &c.self_b if you were using straight C.
In general, you can't rely on pointers to different types being interchangeable or comparable.
pc == pa;
pc == pb;
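Well defined: pc is implicitly converted to the base pointer type before the comparison, so both are true. Whether the raw address values match depends on the class structure.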
pc == (C*) pa;
pc == (C*) pb;
That's OK.
pa == pb;
No.
Do they point to valid objects?
Yes
What you get is something like this in memory
----------
| A data |
----------
| B data |
----------
| C data |
----------
So if you want the entire C object you'll get a pointer to the beginning of the memory. If you want only the A "part", you get the same address since that's where the data members are located. If you want the B "part" you get the beginning + sizeof(A) + sizeof(whatever the compiler adds for vtable).
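Thus, in the example, the address stored in pc differs from the one stored in pb (and could differ from the one in pa), and as long as A and B both contain data, pa and pb never hold the same address.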