Converting Derived** to Base** and Derived* to Base* - c++

Ok, I was reading through this entry in the FQA dealing with the issue of converting a Derived** to a Base** and why it is forbidden, and I got that the problem is that you could assign to a Base* something which is not a Derived*, so we forbid that.
So far, so good.
But, if we apply that principle in depth, why don't we forbid an example like this one?
void nasty_function(Base *b)
{
    *b = Base(3); // Ouch!
}
int main(int argc, char **argv)
{
    Derived *d = new Derived;
    nasty_function(d); // Ooops, now *d points to a Base. What would happen now?
}
I agree that nasty_function does something idiotic, so we could say that allowing that kind of conversion is fine because it enables interesting designs, but we could say the same for the double indirection: you got a Base **, but you shouldn't assign anything to its dereference because you really don't know where that Base ** comes from, just like the Base *.
So, the question: what's special about that extra-level-of-indirection? Maybe the point is that, with just one level of indirection, we could play with virtual operator= to avoid that, while the same machinery isn't available on plain pointers?

nasty_function(d); // Ooops, now *d points to a Base. What would happen now?
No, it doesn't. It points to a Derived. The function simply changed the Base subobject in the existing Derived object. Consider:
#include <cassert>
struct Base {
    Base(int x) : x(x) {}
    int x;
};
struct Derived : Base {
    Derived(int x, int y) : Base(x), y(y) {}
    int y;
};
void nasty_function(Base *b) // same function as in the question
{
    *b = Base(3);
}
int main(int argc, char **argv)
{
    Derived d(1,2); // seriously, WTF is it with people and new?
                    // You don't need new to use pointers
                    // Stop it already
    assert(d.x == 1);
    assert(d.y == 2);
    nasty_function(&d);
    assert(d.x == 3);
    assert(d.y == 2);
}
d doesn't magically become a Base, does it? It's still a Derived, but the Base part of it changed.
In pictures :)
This is what Base and Derived objects look like:
When we have two levels of indirection it doesn't work because the things being assigned are pointers:
Notice that no attempt is made to change either the Base or the Derived object in question: only the middle pointer is changed.
But, when you only have one level of indirection, the code modifies the object itself, in a way that the object allows (it can forbid it by making private, hiding, or deleting the assignment operator from a Base):
Notice how no pointers are changed here. This is just like any other operation that changes part of an object, like d.y = 42;.
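Since the pictures are not reproduced here, the two cases can be sketched in code instead. This is only an illustration under assumptions: Other, set_base, retarget and demo are hypothetical names that do not appear in the question, and Base is given a default argument so the example stays short.

```cpp
#include <cassert>

struct Base {
    int x;
    explicit Base(int x = 0) : x(x) {}
};
struct Derived : Base { int y = 0; };
struct Other : Base { int z = 0; };  // hypothetical sibling of Derived

// One level of indirection: writes into the Base subobject of *b.
void set_base(Base* b) { *b = Base(3); }

// Two levels of indirection: overwrites the pointer that pp refers to.
void retarget(Base** pp, Base* target) { *pp = target; }

bool demo() {
    Derived d;
    d.y = 2;
    Derived* pd = &d;

    set_base(pd);  // fine: pd still points to d, only d.x changed
    assert(pd == &d && d.x == 3 && d.y == 2);

    Other other;
    // retarget(&pd, &other);  // rejected: no conversion Derived** -> Base**
    // If it compiled, pd (a Derived*) would end up pointing to an Other.

    Base* pb = pd;          // copy the pointer value into a genuine Base*
    retarget(&pb, &other);  // legal: only pb is retargeted
    return pb == &other && pd == &d;
}
```

Calling demo() from main should return true: the single-indirection call changes an object, while the double-indirection call changes a pointer, which is exactly why only the latter needs the exact pointer type.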

No, nasty_function() isn't as nasty as it sounds. As the pointer b points to something that is-a Base, it's perfectly legal to assign a Base-value to it.
Take care: your "Ooops" comment is not correct: d still points to the same Derived as before the call! Only the Base part of it was reassigned (by value!). If that leaves your whole Derived in an inconsistent state, you need to redesign by making Base::operator=() virtual. Then nasty_function() will in fact call the Derived assignment operator (provided Derived overrides operator=(const Base&)).
So, I think, your example does not have that much to do with the pointer-to-pointer case.
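A minimal sketch of the virtual operator= idea, assuming Derived wants to react when its Base part is overwritten. The policy of resetting y to 0 is made up for illustration, and demo is a hypothetical helper; neither comes from the question.

```cpp
#include <cassert>

struct Base {
    int x;
    explicit Base(int x) : x(x) {}
    Base(const Base&) = default;
    virtual ~Base() = default;
    virtual Base& operator=(const Base& other) {
        x = other.x;
        return *this;
    }
};

struct Derived : Base {
    int y;
    Derived(int x, int y) : Base(x), y(y) {}
    // Overrides the virtual Base::operator=(const Base&).
    Derived& operator=(const Base& other) override {
        Base::operator=(other);  // copy the Base part (non-virtual call)
        y = 0;                   // hypothetical policy: reset Derived-only state
        return *this;
    }
};

void nasty_function(Base* b) { *b = Base(3); }

bool demo() {
    Derived d(1, 2);
    nasty_function(&d);  // dispatches to Derived::operator=(const Base&)
    assert(d.x == 3);
    return d.y == 0;
}
```

Whether a virtual assignment operator is a good design is a separate question; the sketch only shows that the object, not the pointer, gets a say in what the assignment does.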

*b = Base(3) calls Base::operator=(const Base&), which is actually present in Derived as member functions (inc. operators) are inherited.
What would happen then (calling Derived::operator=(const Base&)) is sometimes called "slicing", and yes, it's bad (usually). It's a sad consequence of the sheer omnipresence of the "become-like" operator (the =) in C++.
(Note that the "become-like" operator doesn't exist in most OO languages like Java, C# or Python; = in object contexts there means reference assignment, which is like pointer assignment in C++.)
Summing up:
Casts Derived** -> Base** are forbidden, because they can cause a type error, because then you could end up with a pointer of type Derived* pointing to an object of type Base.
The problem you mentioned isn't a type error; it's a different kind of error: misuse of the interface of the derived object, stemming from the sorry fact that it has inherited the "become-like" operator of its parent class.
(Yes, I call op= in objects contexts "become-like" deliberately, as I feel that "assignment" isn't a good name to show what's happening here.)

Well, the code you gave makes sense. Indeed, the assignment operator cannot overwrite data specific to Derived, only the data of Base. The virtual functions are still those of Derived, not those of Base.

*b = Base(3); // Ouch!
Here the object at *b really is a Base; it's the base sub-object of *d. Only that base sub-object gets modified; the rest of the derived object isn't changed, and d still points to the same object of the derived type.
You might not want to allow the base to be modified, but in terms of the type system it's correct. A Derived is a Base.
That's not true for the illegal pointer case. A Derived* is convertible to Base* but is not the same type. It violates the type system.
Allowing the conversion you're asking about would be no different to this:
Derived* d;
Base b;
d = &b;
d->x;

Reading through the good answers to my question, I think I got the point of the issue, which comes from first principles in OO and has nothing to do with sub-objects and operator overloading.
The point is that you can use a Derived wherever a Base is required (substitution principle), but you cannot use a Derived* variable wherever a Base* variable is needed, because a pointer to an instance of some other derived class could be assigned to it.
Take a function with this prototype:
void f(Base **b)
f can do a bunch of things with b, dereferencing it among other things:
void f(Base **b)
{
    Base *pb = *b;
    ...
}
If we passed a Derived** to f, it would mean that we are using a Derived* as a Base*, which is incorrect since we may assign an OtherDerived* to a Base* but not to a Derived*.
On the other hand, take this function:
void f(Base *b)
If f dereferences b, then we would use a Derived in place of a Base, which is entirely fine (provided that you give a correct implementation of your class hierarchies):
void f(Base *b)
{
    Base pb = *b; // *b is a Derived? No problem!
}
To put it another way: the substitution principle (use a derived class instead of the base one) works on instances, not on pointers, because the "concept" of a pointer to A is "points to an instance of whichever class inherits from A", and the set of classes which inherit from Base strictly contains the set of classes that inherit from Derived.

Related

What difference does it make if I don't use pointers?

I was watching a video on youtube, how to do casting in c++?
As this is static casting I just want to know what will happen if I don't use pointers.
class Base{};
class Derived1: public Base {};
class Derived2: public Base{};
int main(){
    Derived1 d1;
    Derived2 d2;
    Base *bp1=static_cast<Base*>(&d1);
    Base *bp2=static_cast<Base*>(&d2);
    Derived1 *d1p=static_cast<Derived1*>(bp2);
    Derived2 *d2p=static_cast<Derived2*>(bp1);
    return 0;
}
For example:
Base bp1=static_cast<Base*>(d1);
PS: I am sorry if this question doesn't make any sense.
Casting pointers (or references) doesn't do anything to the underlying objects; it just changes how you LOOK at something at the given address (when going through the cast pointer). If the cast is invalid, of course, your code will have undefined behavior.
Without pointers, you are actually modifying objects.
Derived d1;
Base b = d1; // makes a copy of just the "Base" part of d1
The above is OK and needs no cast, but it's not a view of d1 looking like a Base. It is a sliced-off copy of the base part of d1. "b" is an unrelated object, merely initialized from d1. That is why it's called "slicing".
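To make the difference between a copy and a view concrete, here is a small sketch. It assumes a Base with a data member x and a virtual id() function, neither of which is in the question's empty classes, and demo is a hypothetical helper.

```cpp
#include <cassert>

struct Base {
    int x = 1;
    Base() = default;
    Base(const Base&) = default;
    virtual ~Base() = default;
    virtual int id() const { return 0; }
};
struct Derived : Base {
    int id() const override { return 1; }
};

bool demo() {
    Derived d1;
    Base copy = d1;   // slicing: a separate Base object built from d1's Base part
    Base& view = d1;  // no copy: d1 itself, seen through a Base reference

    copy.x = 42;      // changes only the copy
    assert(d1.x == 1);
    view.x = 7;       // changes d1
    return d1.x == 7 && copy.id() == 0 && view.id() == 1;
}
```

The copy has lost its Derived behaviour (id() is 0), while the reference still dispatches to Derived.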
Casting an object to a pointer is nonsensical unless your class has a conversion operator to that type, though that would generally be a terrible design:
struct Derived : Base {
explicit operator Base*() { return this; } // don't really do this, but it compiles
};
Given such a weird conversion operator, your cast would work, and it would call Derived::operator Base*. Because it's explicit, you would need the cast or the conversion would (usually) not happen. In general, I think you should think about this as mis-matching concepts. Pointers are not objects; they refer to them. Doing things to blur that is going to make your program very hard to reason about.

Cast array of pointers to derived class to array of pointers to base class

Here is some code that illustrates the question:
#include <iostream>
class Base {
};
class Derived : public Base {
};
void doThings(Base* bases[], int length)
{
    for (int i = 0; i < length; ++i)
        std::cout << "Do ALL the things\n";
}
int main(int argc, const char * argv[])
{
    Derived* arrayOfDerived[2] = { new Derived(), new Derived() };
    doThings(arrayOfDerived, 2); // Candidate function not viable: no known conversion from 'Derived *[2]' to 'Base **' for 1st argument
    // Attempts to work out the correct cast
    Derived** deriveds = &arrayOfDerived[0];
    Base** bases = dynamic_cast<Base**>(deriveds); // 'Base *' is not a class
    Base** bases = dynamic_cast<Base**>(arrayOfDerived); // 'Base *' is not a class
    // Just pretend that it should work
    doThings(reinterpret_cast<Base**>(arrayOfDerived), 2);
    return 0;
}
Clang produces the errors given in the comments. The question is: "Is there a correct way to cast arrayOfDerived to something that doThings can take?"
Bonus marks:
Why does clang produce the errors "'Base *' is not a class" on the given lines? I know that Base* isn't a class it's a pointer. What is the error trying to tell me? Why has dynamic_cast been designed so that in dynamic_cast<T*> the thing T must be a class?
What are the dangers of using the reinterpret_cast to force everything to work?
Thanks as always :)
No. What you’re asking for here is a covariant array type, which is not a feature of C++.
The risk with reinterpret_cast, or a C-style cast, is that while this will work for simple types, it will fail miserably if you use multiple or virtual inheritance, and may also break in the presence of virtual functions (depending on the implementation). Why? Because in those cases a static_cast or dynamic_cast may actually change the pointer value. That is, given
#include <string>
class A {
    int a;
};
class B {
    std::string s;
};
class C : public A, B {
    double f[4];
};
C *ptr = new C();
unless the classes are empty, the addresses produced by (A *)ptr and (B *)ptr cannot both equal the address stored in ptr. It only takes a moment to work out why that must be the case: if you're using single inheritance and there is no vtable, implementations typically lay out a subclass as the layout of the superclass followed by the member variables defined in the subclass. If, however, you have multiple inheritance, the compiler must choose to lay the class out in some order or other; I'm not sure what the C++ standard has to say about it (you could check; it might define or constrain the layout order somehow), but whatever order it picks, it should be apparent that one of the casts must result in a change to the pointer value.
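The address adjustment can be checked with a small sketch. The classes here are simplified stand-ins for the A, B and C above (public members, no std::string), and bases_differ is a hypothetical helper name.

```cpp
#include <cassert>

struct A { int a = 0; };
struct B { int b = 0; };
struct C : A, B { double f[4] = {}; };

// True when converting a C* to its two bases yields different addresses.
bool bases_differ() {
    C c;
    const void* whole = &c;
    const void* asA = static_cast<A*>(&c);  // static_cast may adjust the address
    const void* asB = static_cast<B*>(&c);
    // Two non-empty base subobjects cannot share an address...
    assert(asA != asB);
    // ...so at most one of them can coincide with the start of the C object.
    return whole != asA || whole != asB;
}
```

A reinterpret_cast would keep the original address for both bases, which is why it cannot be right for at least one of them.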
The first error in your code is that you are using a syntax that should be avoided:
void doThings(Base* bases[], int length)
if you look at the error message, it tells you that bases is actually a Base**, and that is also what you should write if you really want to pass such a pointer.
The second problem has to do with type safety. Imagine this code, where I reduced the array of pointers to a single pointer:
class Base {...};
class Derived1: public Base {...};
class Derived2: public Base {...};
Derived1* p1 = 0;
Base*& pb = p1; // reference to a pointer
pb = new Derived2(); // implicit conversion Derived -> Base
If this code compiled, p1 would now suddenly point to a Derived2! The important difference to a simple conversion of a pointer-to-derived to a pointer-to-base is that we have a reference here, and the pointer in your case is no different.
You can use two variations that work:
Base* pb = p1; // pointer value conversion
Base* const& pb = p1; // reference-to-const pointer
Applied to your code, this involves copying the array of pointers-to-derived to an array of pointers-to-base, which is probably what you want to avoid. Unfortunately, the C++ type system doesn't provide any different means to achieve what you want directly. I'm not exactly sure about the reason, but I think that at least according to the standard pointers can have different sizes.
There are two things I would consider:
Convert doThings() to a function template taking two iterators. This follows the spirit of the STL and allows calls with different pointer types or others that provide an iterator interface.
Just write a loop. If the body of the loop is large, you could also consider extracting just that as a function.
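A rough sketch of the iterator-based variant, assuming doThings only needs to read each element as a Base. The value() member and the demo function are invented so that the result can be checked; they are not part of the question.

```cpp
#include <cassert>
#include <vector>

struct Base {
    virtual ~Base() = default;
    virtual int value() const { return 1; }
};
struct Derived : Base {
    int value() const override { return 2; }
};

// Hypothetical iterator-based doThings: accepts any range whose elements
// convert to const Base*, so no array-level cast is needed.
template <typename It>
int doThings(It first, It last) {
    int sum = 0;
    for (; first != last; ++first) {
        const Base* b = *first;  // per-element Derived* -> Base* conversion
        sum += b->value();
    }
    return sum;
}

int demo() {
    Derived a, b;
    Derived* arrayOfDerived[2] = { &a, &b };
    std::vector<Base*> bases = { &a, &b };
    assert(doThings(bases.begin(), bases.end()) == 4);
    return doThings(arrayOfDerived, arrayOfDerived + 2);
}
```

The same template accepts both the raw array of Derived* and a std::vector<Base*>, because the conversion happens element by element inside the loop.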
In answer to your main question, the answer is not really,
because there's no way of reasonably implementing it. There's
no guarantee that the actual addresses in the Base* will be
the same as those in the Derived*, and in fact, there are many
cases where they aren't. In the case of multiple inheritance,
it's impossible that both bases have the same address as the
derived, because they must have different addresses from each
other. If you have an array of Derived*, whether it be
std::vector<Derived*> or a Derived** pointing to the first
element of an array, the only way to get an array of Base* is
by copying: using std::vector<Derived*>, this would look
something like:
std::vector<Base*> vectBase( vectDerived.size() );
std::transform(
    vectDerived.cbegin(),
    vectDerived.cend(),
    vectBase.begin(),
    []( Derived* ptr ) { return static_cast<Base*>( ptr ); } );
(You can do exactly the same thing with Derived** and
Base**, but you'll have to tweak it with the known lengths.)
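For the raw-pointer case, a sketch along the same lines might look as follows. toBases and demo are hypothetical helpers, and the length is passed explicitly because a Derived** carries no size information.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Base { virtual ~Base() = default; };
struct Derived : Base {};

// Copies `length` pointers, converting each Derived* to Base* on the way.
std::vector<Base*> toBases(Derived* const* deriveds, std::size_t length) {
    std::vector<Base*> bases(length);
    std::transform(deriveds, deriveds + length, bases.begin(),
                   [](Derived* ptr) { return static_cast<Base*>(ptr); });
    return bases;
}

bool demo() {
    Derived a, b;
    Derived* arrayOfDerived[2] = { &a, &b };
    std::vector<Base*> bases = toBases(arrayOfDerived, 2);
    Base** forDoThings = bases.data();  // what doThings(Base**, int) expects
    assert(bases.size() == 2);
    return forDoThings[0] == &a && forDoThings[1] == &b;
}
```

The vector owns the converted copy of the pointer array; bases.data() stays valid only as long as the vector is alive and not resized.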
As for your "bonus" questions:
Clang (and I suspect every other compiler in existence),
produces the error Base* is not a class because it isn't
a class. You're trying to dynamic_cast between Base**
and Derived**, the pointed to types are Base* and
Derived*, and dynamic_cast requires the pointed to types
to be classes.
The only danger with reinterpret_cast is that it won't
work. You'll get undefined behavior, which will possibly
work in a few simple cases, but won't work generally. You'll
end up with something that the compiler thinks is a Base*,
but which doesn't physically point to a Base.

Can 'this' pointer be different than the object's pointer?

I recently came across this strange function in some class:
void* getThis() {return this;}
And later in the code it is sometimes used like so: bla->getThis() (Where bla is a pointer to an object of the class where this function is defined.)
And I can't figure out what this can be good for. Is there any situation where a pointer to an object would be different from the object's this (where bla != bla->getThis())?
It seems like a stupid question but I wonder if I'm missing something here..
Of course, the pointer values can be different! Below is an example which demonstrates the issue (you may need to use derived1 on your system instead of derived2 to get a difference). The point is that the this pointer typically gets adjusted when virtual, multiple inheritance is involved. This may be a rare case, but it happens.
One potential use case of this idiom is to be able to restore objects of a known type after storing them as void const* (or void*; the const correctness doesn't matter here): if you have a complex inheritance hierarchy, you can't just cast any odd pointer to a void* and hope to be able to restore it to its original type! That is, to easily obtain, e.g., a pointer to base (from the example below) and convert it to void*, you'd call p->getThis(), which is a lot easier than static_cast<base*>(p), and get a void* which can be safely cast back to a base* using static_cast<base*>(v): you can reverse the implicit conversion, but only if you cast back to the exact type the original pointer came from. That is, static_cast<base*>(static_cast<void*>(d)), where d is a pointer to an object of a type derived from base, does not yield a usable pointer to base, but static_cast<base*>(d->getThis()) does.
Now, why is the address changing in the first place? In the example, base is a virtual base class of two derived classes, but there could be more. All subobjects whose class virtually inherits from base will share one common base subobject in an object of a further derived class (concrete in the example below). The location of this base subobject relative to the respective derived subobject may differ depending on how the different classes are ordered. As a result, the pointer to the base object is generally different from the pointers to the subobjects of classes virtually inheriting from base. The relevant offset will be computed at compile time, when possible, or come from something like a vtable at run time. The offsets are adjusted when converting pointers along the inheritance hierarchy.
#include <iostream>
struct base
{
    void const* getThis() const { return this; }
};
struct derived1
    : virtual base
{
    int a;
};
struct derived2
    : virtual base
{
    int b;
};
struct concrete
    : derived1
    , derived2
{
};
int main()
{
    concrete c;
    derived2* d2 = &c;
    void const* dptr = d2;
    void const* gptr = d2->getThis();
    std::cout << "dptr=" << dptr << " gptr=" << gptr << '\n';
}
No. Yes, in limited circumstances.
This looks like something inspired by Smalltalk, in which all objects have a yourself method. There are probably some situations in which this makes code cleaner. As the comments note, this looks like an odd way to even implement this idiom in C++.
In your specific case, I'd grep for actual usages of the method to see how it is used.
Your class can have custom operator& (so &a may not return this of a). That's why std::addressof exists.
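A tiny sketch of that situation, using a made-up class Odd whose operator& deliberately returns a null pointer (demo is also hypothetical):

```cpp
#include <cassert>
#include <memory>  // std::addressof

struct Odd {  // hypothetical class, not from the question
    Odd* operator&() { return nullptr; }  // deliberately unhelpful overload
    Odd* getThis() { return this; }
};

bool demo() {
    Odd o;
    Odd* viaOperator = &o;                  // calls Odd::operator&
    Odd* viaAddressof = std::addressof(o);  // bypasses the overload
    assert(viaOperator == nullptr);
    return viaAddressof == o.getThis() && viaAddressof != viaOperator;
}
```

Here getThis() and std::addressof agree, while the overloaded & does not, so a getThis-style member is one (pre-C++11) way to get at the real address.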
I ran across something like this many (many many) years ago. If I recall correctly, it was needed when a class is manipulating other instances of the same class. One example might be a container class that can contain its own type/(class?).
That might be a way to override the this keyword.
Let's say that you have a memory pool, fully initialized at the start of your program; for instance, you know that at any time you will deal with at most 50 messages of type CMessage.
You create a pool of size 50 * sizeof(CMessage) (whatever this class might be), and CMessage implements the getThis function.
That way, instead of overriding the new keyword, you just override the "this", accessing the pool.
It can also mean that the object might be defined in different memory spaces, let's say in SRAM in boot mode and then in SDRAM.
It might be that the same instance will return different values for getThis through the program in such a situation, on purpose of course, when overridden.

C++ Inheritance. Changing Object data Types

I am having trouble understanding what effect forcing data type changes has on my own objects. I have a base class, say A, and two classes derived from A called B and C. I pass objects B and C to a function that checks which type of object it is (B or C). Here is some example code, followed by my question:
enum ClassType { TYPE_B, TYPE_C }; // enumerators must be identifiers, not string literals
class A {
protected:
    ClassType m_Type;
public:
    ClassType Type() { return m_Type; }
    ...
    ...
};
class B : public A {
    void otherMemberFunctions();
};
class C : public A {
    void otherMemberFunctions();
};
void WhatType(vector<A*>* candidates){
    vector<B*> b_candidates(0);
    vector<C*> c_candidates(0);
    for(int i = 0; i < candidates->size(); i++){
        if(candidates->at(i)->Type() == TYPE_B){
            B* b = (B*) candidates->at(i);
            b_candidates.push_back(b);
        }
        //Same idea for Object C
    }
}
I would then use WhatType(vector<A*>* candidates) as follows:
vector<B*>* b_example;
WhatType((vector<A*>*) b_example);
Once I have filled the new vector b_candidates in the function WhatType, will I still have access to the member functions of the B object, or will I only have access to the member functions of the base class A?
I am confused about what happens to the object when I change its type.
Here
WhatType((vector<A*>*) b_example)
and here
B* b = (B*) candidates->at(i);
When you receive a pointer to a polymorphic object you have two types: the "static" type of the object, which, in your case, will be A *, and its "dynamic" or "real" type, that depends on what was actually assigned to it.
Casting your A * to B * forces the compiler to consider that pointer as a pointer to B; this is safe as long as you actually know that that pointer is actually a pointer to B, otherwise the compiler will start writing nonsensical code (invoking B methods on data of another type).
The checks you are trying to implement are a homegrown version of RTTI, which is a mechanism that lets you find out the "real type" behind a pointer or a reference to a polymorphic class and perform that kind of cast safely. Check out typeid and dynamic_cast in your C++ reference for more info about it. (Incidentally, IIRC dynamic_cast is not only for safety in case the dynamic type is wrong; it may also perform some extra magic on your pointer if you use it in complicated class hierarchies. So, avoid C-style casting for polymorphic classes.)
By the way, in general it's considered a "code smell" to have to manually check the "real type" of the pointer in order to cast it and use its methods: the OOP ideal would be to do the work only through virtual methods available in the base class.
Big warning: RTTI works only on polymorphic classes, i.e. classes that have at least one virtual method. On the other hand, if you are building a class hierarchy where objects are being passed around as pointers to the base class you'll almost surely want to have a virtual destructor, so that's no big deal.
Since you cast to B*, you will have access to B's members.
The actual type of the objects does not change, of course, but if you only have a pointer (or reference) to the base class you can not access fields specific to the sub-classes.
What you can do to access sub-class fields is to use dynamic_cast to cast it to the sub-class:
A *a = new B; // We can't reach the members of class B through a
B *b = dynamic_cast<B *>(a); // But now we have a proper pointer to B
Ok, so if you had an object of type B instantiated on the heap and held by a pointer of type A*, you can only see type A's member functions. To access type B's member functions you have to static_cast<B*>, which is what the ... "(B*)" ... is doing.
dynamic_cast is better, as it will return null if the conversion is not possible. But of course it happens at run time, so there's a penalty.
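A short sketch of the checked cast, assuming A is polymorphic (it has a virtual destructor here). onlyInB, callIfB and demo are hypothetical names used only for this illustration.

```cpp
#include <cassert>

struct A { virtual ~A() = default; };  // polymorphic, so dynamic_cast works
struct B : A { int onlyInB() const { return 42; } };
struct C : A {};

// Returns B's value, or -1 when `a` does not actually point to a B.
int callIfB(A* a) {
    if (B* b = dynamic_cast<B*>(a)) {
        return b->onlyInB();
    }
    return -1;
}

bool demo() {
    B b;
    C c;
    assert(callIfB(&b) == 42);
    return callIfB(&c) == -1;
}
```

With a static_cast or a C-style cast in place of dynamic_cast, the call on &c would compile just as well but would have undefined behavior.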
As B and C are derived from A, the objects pointed to from a vector<B *> or a vector<C *> all contain an A base-class subobject. If you make sure to set the A::m_Type attribute in your constructor, you will not have problems:
enum ClassType { TYPE_B, TYPE_C }; // see, I modified your definition
class A {
protected:
    ClassType m_Type;
    explicit A(ClassType type) : m_Type(type) {} // derived classes pass their tag
public:
    ClassType Type() { return m_Type; }
    ...
    ...
};
class B : public A {
public:
    B() : A(TYPE_B) {} // a base member cannot be set in B's initializer list directly
    ....
};
Using this, you can check your B and C objects without problems. After that, as you are casting base pointers to derived ones, you will have full access to their public methods and attributes.

C++ ISO Standard interpretation of dereferencing pointer to base

I would like to know the standard's view on dereferencing a pointer to base, but I'm not making any progress finding it. Take these two classes for example:
class Base
{
public:
    virtual void do_something() = 0;
};
class Derived : public Base
{
public:
    virtual void do_something();
};
void foo2(Base *b)
{
    Base &b1 = *b; // how this work by standard?
}
void foo()
{
    Derived *d = new Derived();
    foo2(d); // does this work by standard?
}
So, basically, if a pointer of type Base* to an object of type Derived is dereferenced, will slicing happen in place, or will a temporary emerge? I'm inclined to believe that a temporary is not an option, because that would mean that the temporary is an instance of an abstract class.
Whatever the truth, I would appreciate any pointers to the ISO standard that says one or the other. (Or third, for that matter. :) )
EDIT:
I threw in the point about a temporary not being an option as a possible line of reasoning for why it behaves the way it does, which is quite logical, but I can't find confirmation in the standard, and I'm not a regular reader of it.
EDIT2:
Through discussion, it became obvious that my question was actually about the mechanism of dereferencing a pointer, and not about slicing or temporaries. I thank everyone for trying to dumb it down for me, and I finally got an answer to the question that puzzled me the most: why I can't find anything in the standard about this... Obviously it was the wrong question, but I've got the right answer.
Thnx
Base &b = *static_cast<Base *>(d); // does this work by standard?
Yes.
But you can simply do this:
Base &b = *d;
//use b polymorphically!
b.do_something(); //calls Derived::do_something()
No need to use static_cast. After all, Derived is derived from Base.
Reply to your edit:
foo2(d); // does this work by standard?
Yes. Pointer of type Base* can be initialized with pointer of type Derived*.
--
Base &b = *b; // how this work by standard?
No. Both have the same name. If you mean Base &b1 = *b, then yes, that works. b1 refers to the object pointed to by b.
Object slicing only occurs when the copy constructor or the assignment operator of the base class gets involved somehow, like in parameter passing by value. You can easily avoid these errors by inheriting from Boost's noncopyable for example, even if only in DEBUG mode.
Neither casting pointers or references nor dereferencing involve any copy construction or assignment. Making a Base reference from a Derived reference is perfectly safe, it's even a standard implicit conversion.
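As a rough check of that claim, the sketch below binds a reference through a Base* and confirms that it names the original object. One assumption to note: do_something returns int here (it is void in the question) purely so the dispatch can be observed, and demo is a hypothetical helper.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() = default;
    virtual int do_something() = 0;  // returns int here only to make it checkable
};
struct Derived : Base {
    int do_something() override { return 7; }
};

int foo2(Base* b) {
    Base& b1 = *b;      // binds to the existing object: no copy, no temporary
    assert(&b1 == b);   // same address, same object
    // Base copy = *b;  // would be a copy, and is rejected because Base is abstract
    return b1.do_something();  // virtual dispatch reaches Derived
}

int demo() {
    Derived d;
    return foo2(&d);
}
```

demo() yields the value from Derived::do_something, which would be impossible if dereferencing had produced a sliced Base.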
In my C++11 draft, 10 [class.derived] /1 says
[ Note: The scope resolution operator :: (5.1) can be used to refer to
a direct or indirect base member explicitly. This allows access to a
name that has been redeclared in the derived class. A derived class
can itself serve as a base class subject to access control; see 11.2.
A pointer to a derived class can be implicitly converted to a pointer
to an accessible unambiguous base class (4.10). An lvalue of a
derived class type can be bound to a reference to an accessible
unambiguous base class (8.5.3). —end note ]
In most implementations, your foo2 function will store Base& b as a Base*. It obviously can't be a Base itself, because that would be a copy, not a reference. Since it acts (at runtime, not syntactically) like a pointer instead of a copy, there are no slicing concerns.
In your code before your edit, the compiler would know that Base& b was actually d, it would be syntactic sugar, and wouldn't even generate a pointer in the assembly.