I've recently came across this strange function in some class:
void* getThis() {return this;}
And later in the code it is sometimes used like so: bla->getThis() (Where bla is a pointer to an object of the class where this function is defined.)
And I can't seem to realize what this can be good for. Is there any situation where a pointer to an object would be different than the object's this (where bla != bla->getThis())?
It seems like a stupid question but I wonder if I'm missing something here..
Of course, the pointer values can be different! Below an example which demonstrates the issue (you may need to use derived1 on your system instead of derived2 to get a difference). The point is that the this pointer typically gets adjusted when virtual, multiple inheritance is involved. This may be a rare case but it happens.
One potential use case of this idiom is to be able to restore objects of a known type after storing them as void const* (or void*; the const correctness doesn't matter here): if you have a complex inheritance hierarchy, you can't just cast any odd pointer to a void* and hope to be able to restore it to its original type! That is, to easily obtain, e.g., a pointer to base (from the example below) and convert it to void*, you'd call p->getThis() which is a lot easier to static_cast<base*>(p) and get a void* which can be safely cast to a base* using a static_cast<base*>(v): you can reverse the implicit conversion but only if you cast back to the exact type where the original pointer came from. That is, static_cast<base*>(static_cast<void*>(d)) where d is a pointer to an object of a type derived from base is illegal but static_cast<base*>(d->getThis()) is legal.
Now, why is the address changing in the first place? In the example base is a virtual base class of two derived classes but there could be more. All subobjects whose class virtually inherits from base will share one common base subject in object of a further derived class (concrete in the example below). The location of this base subobject may be different relative to the respective derived subobject depending on how the different classes are ordered. As a result, the pointer to the base object is generally different from the pointers to the subobjects of classes virtually inheriting from base. The relevant offset will be computed at compile-time, when possible, or come from something like a vtable at run-time. The offsets are adjusted when converting pointers along the inheritance hierarchy.
#include <iostream>
struct base
{
void const* getThis() const { return this; }
};
struct derived1
: virtual base
{
int a;
};
struct derived2
: virtual base
{
int b;
};
struct concrete
: derived1
, derived2
{
};
int main()
{
concrete c;
derived2* d2 = &c;
void const* dptr = d2;
void const* gptr = d2->getThis();
std::cout << "dptr=" << dptr << " gptr=" << gptr << '\n';
}
No. Yes, in limited circumstances.
This looks like it is something inspired by Smalltalk, in which all objects have a yourself method. There are probably some situations in which this makes code cleaner. As the comments note, this looks like an odd way to even implement this idiom in c++.
In your specific case, I'd grep for actual usages of the method to see how it is used.
Your class can have custom operator& (so &a may not return this of a). That's why std::addressof exists.
I ran across something like this many (many many) years ago. If I recall correctly, it was needed when a class is manipulating other instances of the same class. One example might be a container class that can contain its own type/(class?).
That might be a way to override the this keyword.
Lets say that you have a memory pool, full initialized at the start of your program, for instance you know that at any time you can deal with a max of 50 messages, CMessage.
You create a pool at the size of 50 * sizeof(CMessage) (what ever this class might be), and CMessage implements the getThis function.
That way instead of overriding the new keyword you just override the "this", accessing the pool.
It can also mean that the object might be defined on different memory spaces, lets say on a SRAM, in boot mode, and then on a SDRAM.
It might be that the same instance will return different values for getThis through the program in such a situation, on purpose of course, when overriden.
Related
Context
My goal is to have a base container class that contains and manipulates several base class objects, and then a derived container class that contains and manipulates several derived class objects. On the advice of this answer, I attempted to do this by having each contain an array of pointers (a Base** and a Derived**), and cast from Derived** to Base** when initializing the base container class.
However, I ran into a problem – despite compiling just fine, when manipulating the contained objects, I'd have segfaults or the wrong methods would be called.
Problem
I have boiled the problem down to the following minimal case:
#include <iostream>
class Base1 {
public:
virtual void doThing1() {std::cout << "Called Base1::doThing1" << std::endl;}
};
class Base2 {
public:
virtual void doThing2() {std::cout << "Called Base2::doThing2" << std::endl;}
};
// Whether this inherits "virtual public" or just "public" makes no difference.
class Derived : virtual public Base1, virtual public Base2 {};
int main() {
Derived derived;
Derived* derivedPtrs[] = {&derived};
((Base2**) derivedPtrs)[0]->doThing2();
}
You may expect this to print "Called Base2::doThing2", but…
$ g++ -Wall -Werror main.cpp -o test && ./test
Called Base1::doThing1
Indeed – the code calls Base2::doThing2, but Base1::doThing1 ends up being called. I've also had this segfault with more complex classes, so I assume it's address-related hijinks (perhaps vtable-related – the error doesn't seem to occur without virtual methods). You can run it here, and see the assembly it compiles to here.
You can see my actual structure here – it's more complex, but it ties it into the context and explains why I need something along the lines of this.
Why does the Derived**→Base** cast not work properly when Derived*→Base* does, and more importantly, what is the right way to handle an array of derived objects as an array of base objects (or, failing that, another way to make a container class that can contain multiple derived objects)?
I can't index before upcasting (((Base2*) derivedPtrs[0])->doThing2()), I'm afraid, since in the full code that array is a class member – and I'm not sure it's a good idea (or even possible) to cast manually in every place contained objects are used in the container classes. Correct me if that is the way to handle this, though.
(I don't think it makes a difference in this case, but I'm in an environment where std::vector is unavailable.)
Edit: Solution
Many of the answers suggested that casting each pointer individually is the only way to have an array that can contain derived objects – and that does indeed seem to be the case. For my particular use case, though, I managed to solve the problem using templates! By supplying a type parameter for what the container class is supposed to contain, instead of having to contain an array of derived objects, the type of the array can be set to the derived type at compile time (e.g. BaseContainer<Derived> container(length, arrayOfDerivedPtrs);).
Here's a version of the broken "actual structure" code above, fixed with templates.
There are many things that make this code pretty awful and contribute to this issue:
Why are we dealing with two-star types in the first place? If std::vector doesn't exist, why not write your own?
Don't use C-style casts. You can cast pointers to completely unrelated types into each other and the compiler is not allowed to stop you (coincidentally, this is exactly what is happening here). Use static_cast/dynamic_cast instead.
Let's assume we had std::vector, for ease of notation. You are trying to cast a std::vector<Derived*> to std::vector<Base*>. Those are unrelated types (the same is true for Derived** and Base**), and casting one to the other is not legal in any way.
Pointer casts from/to derived are not necessarily trivial. If you have struct X : A, B {}, then a pointer to the B base will be different from a pointer to the A base† (and with a vtable in play, possible also different from a pointer to X). They must be, because the (sub)objects cannot reside at the same memory address. The compiler will adjust the pointer value when you cast the pointer. This of course does not/cannot happen for each individual pointer if you (try to) cast an array of pointers.
If you have an array of pointers to Derived and want to get an array of those pointers to Base, then you have to manually cast each one. Since the pointer values will generally be different in both arrays, there is no way to "reuse" the same array.
†(Unless the conditions for empty base optimization are fulfilled, which is not the case for you).
Like others said the problem is that you are not letting the compiler doing its job in adjusting indices, suppose Derived layout in memory is something like (not guaranteed by the standard, just a possible implementation):
| vtable_Base1 | Base1 | vtable_Base2 | Base2 | vtable_Derived | Derived |
Then &derived points to the start of the object, when you normally do
Base2* base = static_cast<Derived*>(&derived)
the compiler knows the offset which Base2 structure has inside Derived type and adjusts the address to point to the start of it.
If, instead, you cast an array of pointers directly the compiler is coercing the type, assuming that your array is storing pointers to Base2 already, but without adjusting them.
A dirty hack which may work or not in your situation is having a method which returns the pointer to itself, eg:
class Base2 {
public:
Base2* base2() { return this; }
}
so that you can do derivedPtrs[0]->base2()->doThing2().
It doesn't work the same reason why this don't work:
struct Base {};
struct Derived : Base { int i; };
int main() {
Derived d[6];
Derived* d2 = d;
Base** b = &d2; // ERROR!
}
Your c-style cast is a bad practice because it didn't warn you of the error. Just don't do that to your code. Your c-style cast was actually a reinterpret_cast in disguise, which is tottaly wrong in the case.
But why cannot you cast an array to derived into an array to base? Simple: they have different layout.
You see, when you iterate on an array of a type, each elements in the array are contiguous in memory. The Derived class may have a size of let's say, 24 bytes, and Base a size of 8:
Derived d[4];
------------------------------------------------------
| D1 | D2 | D3 | D4 |
------------------------------------------------------
Base b[4];
---------------------
| B1 | B2 | B3 | B4 |
---------------------
As you can see, Derived[4] and Base[4] are different types with different layouts.
Then what can you do?
There's many solution in fact. The easiest would be to create a new array of pointer to base, and cast each derived to base pointers. You have to ajust the pointer of each objects anyway.
It would look like this:
std::vector<Base*> bases;
bases.reserve(std::size(derived_arr))
std::transform(
std::begin(derived_arr), std::end(derived_arr),
std::back_inserter(bases),
[](Dervied* d) {
// You must use dynamic cast because the
// pointer offset in only known at runtime
// when using virtual inheritance
return dynamic_cast<Base*>(d);
}
);
The other solution that is in place in memory would be to create you own type of iterator that would do the cast when calling the operator* and the operator->. This is a bit harder to do but can save you an allocation by making iteration a bit slower.
On top all that, this may be unrelated, but I would advise against virtual inheritance. This is not a good practice and in my experience resulted in more pain than anything. I would suggest using the adaptator pattern and wrap non-polymorphic types instead.
When you cast from Derived* to Base* the compiler will adjust the value. When you cast Derived** to Base** you defeat that.
This is a good reason to always use static_cast. Example error with your code changed:
test.cpp:19:5: error: static_cast from 'Derived **' to 'Base2 **' is not
allowed
static_cast<Base2**>(derivedPtrs)[0]->doThing2();
(Base2 **) is in fact a reinterpret_cast (You can confirm this by trying all four casts) and that expression causes UB. The fact that you can implicitly cast a pointer to derived to a pointer to base doesn't mean that they are the same, e.g. int and float. And here, you are referring an object by the type which it isn't, in this case, causes an UB.
Calling a virtual function this way results the final overrider of that object being called. How this is achieved depends on the compiler.
Assume that the compiler uses a "vtable", finding the vtable of Base2 (might be adding an offset to the memory. assembly) from the memory address of a Derived is not trivial.
Basically, you must perform a dynamic cast on every pointer, store them somewhere or dynamic cast it when in need.
I'm working on an old C++ project, in the source there are two lines:
memcpy( static_cast<PLONADDRESS>(this), pa, sizeof(LONADDRESS) );
memcpy( static_cast<PLONIOFILTER)(this), pf, sizeof(LONIOFILTER) );
this is an object of type CLonFilterUnit, it is derived from public classes:
class CLonFilterUnit : public LONADDRESS, public LONIOFILTER
PLONADDRESS is:
typedef LONADDRESS* PLONADRESS;
PLONFILTER is:
typedef LONIOFILTER* PLONFILTER;
pa is of type PLONADDRESS and pf is of type PLONIOFILTER.
What I don't understand is how the same base address is used as the destination in both memcpy instructions? Is this permitted due to the way static_cast works?
When you have a class that is derived from multiple base classes those classes can be thought of as sub objects of the derived class. You will have a base1, base2, ..., baseN part of the derived object. When you static_cast a pointer to the derived class to a pointer of one of its base classes the cast will adjust the pointer to point to the correct base (sub object) of the object. You can see that with this little example:
struct foo
{
int a;
};
struct bar
{
int b;
};
struct foobar : foo, bar {};
int main() {
foobar f;
std::cout << static_cast<foo*>(&f) << "\t" << static_cast<bar*>(&f);
}
output:
0x7ffe250056c8 0x7ffe250056cc
live example
I would also like to point out that if your class is not trivially copyable then the code has undefined behavior as memcpy requires that.
static_cast does the necessary address adjustment.
The code (with memcpy, uppercase names, typedefs of pointers) is a great example of how to absolutely not do things. Maybe it's used as an example in a lecture series about how to lose your job quickly.
This is an example of some extremely dubious C++ code. I can hardly imagine why this code would be necessary - most likely, it is not.
However, to answer the question as asked, the way multiple inheritance works in C++ is by having different 'subobjects' of different base classes within the derived class. Those subobjects do not have the same address. By using static_cast on this, you select one subobject or another, and results of static_cast yield different addresses.
I have this parent class in C++
//ParentClass header file
public ParentClass{
public:
ParentClass();
virtual void someParentFunction();
private:
//other member variables and functions
};
//Functions implemented done in respective .cpp file
I extended this class so I have a child that looks like this
//ChildOneClass header file
public ChildOneClass : public ParentClass{
public:
//Constructors and other functions
private:
//Other members
};
//Functions implemented in respective .cpp file
Example declaration:
//Dynamically create one ChildOneClass object
ChildOneClass * c = new ChildOneClass();
//I know this is never done, but for example purposes i just did this
void * v = c;
I know if you have a pointer that points to the object you can do both:
((ParentClass *) v)->someParentFunction();
or:
((ChildOneClass *) v)->someParentFunction();
But which way is the correct way? Does it matter if I cast a pointer thats pointing to the subclass as a parent class? Sorry if this is confusing please give me some feedback if the question is confusing. I'll do my best to clarify
The only correct cast of a void* to a class pointer is the cast to the original class pointer type passed to the void*. Anything else may lead to unexpected results (eg.: having virtual or multiple inheritance)
NOTE: This answer addresses a later revision of the original question, which has since been reverted. For an answer to the original and current revision of the question, please see Dieter Lucking's answer.
If you want to call someParentFunction() on a class which may have a derived class which contains that function, you'll want to use dynamic_cast to the most base class with which that call is valid:
GrandParentClass *g = ...;
if (ParentClass* pc = dynamic_cast<ParentClass*>(g)) {
// ok, it's a ParentClass, this is safe
pc->someParentFunction();
}
else {
// not a ParentClass, do something else, log an error, throw, etc.
}
There's no reason to cast all the way down to ChildOneClass, since you would miss all the types that are ParentClass but are not ChildOneClass. This covers all the valid subsets. Note that GrandParentClass would need to be polymorphic in order for this to work (e.g. GrandParentClass has a virtual member function).
When you are creating polymorphic hierarchies you should be thinking in terms of interfaces. Ideally you should never need to cast. However in some cases it is necessary.
When you cast it should be to the specific interface that you need. So if you need to process objects of the derived type cast to the derived type. If you need to process objects of the base type cast to the base type.
Your design should make it obvious what interface is being dealt with at which point in the system.
If you are casting a lot (or even at all) it could be a symptom of poor design.
I was studying Virtual Functions and Pointers. Below code made me to think about, why does one need Virtual Function when we can type cast base class pointer the way we want?
class baseclass {
public:
void show() {
cout << "In Base\n";
}
};
class derivedclass1 : public baseclass {
public:
void show() {
cout << "In Derived 1\n";
}
};
class derivedclass2 : public baseclass {
public:
void show() {
cout << "In Derived 2\n";
}
};
int main(void) {
baseclass * bptr[2];
bptr[0] = new derivedclass1;
bptr[1] = new derivedclass2;
((derivedclass1*) bptr)->show();
((derivedclass2*) bptr)->show();
delete bptr[0];
delete bptr[1];
return 0;
}
Gives same result if we use virtual in base class.
In Derived 1
In Derived 2
Am I missing something?
Your example appears to work, because there is no data, and no virtual methods, and no multiple inheritance. Try adding int value; to derivedclass1, const char *cstr; to derivedclass2, initialize these in corresponding constructors, and add printing these to corresponding show() methods.
You will see how show() will print garbage value (if you cast pointer to derivedclass1 when it is not) or crash (if you cast the pointer to derivedclass2 when class in fact is not of that type), or behave otherwise oddly.
C++ class member functions AKA methods are nothing more than functions, which take one hidden extra argument, this pointer, and they assume that it points to an object of right type. So when you have an object of type derivedclass1, but you cast a pointer to it to type derivedclass2, then what happens without virtual methods is this:
method of derivedclass2 gets called, because well, you explicitly said "this is a pointer to derivedclass2".
the method gets pointer to actual object, this. It thinks it points to actual instance of derivedclass2, which would have certain data members at certain offsets.
if the object actually is a derivedclass1, that memory contains something quite different. So if method thinks there is a char pointer, but in fact there isn't, then accessing the data it points to will probably access illegal address and crash.
If you instead use virtual methods, and have pointer to common base class, then when you call a method, compiler generates code to call the right method. It actually inserts code and data (using a table filled with virtual method pointers, usually called vtable, one per class, and pointer to it, one per object/instance) with which it knows to call the right method. So when ever you call a virtual method, it's not a direct call, but instead the object has extra pointer to the vtable of the real class, which tells what method should really be called for that object.
In summary, type casts are in no way an alternative to virtual methods. And, as a side note, every type cast is a place to ask "Why is this cast here? Is there some fundamental problem with this software, if it needs a cast here?". Legitimate use cases for type casts are quite rare indeed, especially with OOP objects. Also, never use C-style type casts with object pointers, use static_cast and dynamic_cast if you really need to cast.
If you use virtual functions, your code calling the function doesn't need to know about the actual class of the object. You'd just call the function blindly and correct function would be executed. This is the basis of polymorphism.
Type-casting is always risky and can cause run-time errors in large programs.
Your code should be open for extension but closed for modifications.
Hope this helps.
You need virtual functions where you don't know the derived type until run-time (e.g. when it depends on user input).
In your example, you have hard-coded casts to derivedclass2 and derivedclass1. Now what would you do here?
void f(baseclass * bptr)
{
// call the right show() function
}
Perhaps your confusion stems from the fact that you've not yet encountered a situation where virtual functions were actually useful. When you always know exactly at compile-time the concrete type you are operating on, then you don't need virtual functions at all.
Two other problems in your example code:
Use of C-style cast instead of C++-style dynamic_cast (of course, you usually don't need to cast anyway when you use virtual functons for the problem they are designed to solve).
Treating arrays polymorphically. See Item 3 in Scott Meyer's More Effective C++ book ("Never treat arrays polymorphically").
We can use Polymorphism (inheritance + virtual functions) in order to generalize different types under a common base-type, and then refer to different objects as if they were of the same type.
Using dynamic_cast appears to be the exact opposite approach, as in essence we are checking the specific type of an object before deciding what action we want to take.
Is there any known example for something that cannot be implemented with conventional polymorphism as easily as it is implemented with dynamic_cast?
Whenever you find yourself wanting a member function like "IsConcreteX" in a base class (edit: or, more precisely, a function like "ConcreteX *GetConcreteX"), you are basically implementing your own dynamic_cast. For example:
class Movie
{
// ...
virtual bool IsActionMovie() const = 0;
};
class ActionMovie : public Movie
{
// ...
virtual bool IsActionMovie() const { return true; }
};
class ComedyMovie : public Movie
{
// ...
virtual bool IsActionMovie() const { return false; }
};
void f(Movie const &movie)
{
if (movie.IsActionMovie())
{
// ...
}
}
This may look cleaner than a dynamic_cast, but on closer inspection, you'll soon realise that you've not gained anything except for the fact that the "evil" dynamic_cast no longer appears in your code (provided you're not using an ancient compiler which doesn't implement dynamic_cast! :)). It's even worse - the "self-written dynamic cast" approach is verbose, error-prone and repetitve, while dynamic_cast will work just fine with no additional code whatsoever in the class definitions.
So the real question should be whether there are situations where it makes sense that a base class knows about a concrete derived class. The answer is: usually it doesn't, but you will doubtlessly encounter such situations.
Think, in very abstract terms, about a component of your software which transmits objects from one part (A) to another (B). Those objects are of type Class1 or Class2, with Class2 is-a Class1.
Class1
^
|
|
Class2
A - - - - - - - -> B
(objects)
B, however, has some special handling only for Class2. B may be a completely different part of the system, written by different people, or legacy code. In this case, you want to reuse the A-to-B communication without any modification, and you may not be in a position to modify B, either. It may therefore make sense to explicitly ask whether you are dealing with Class1 or Class2 objects at the other end of the line.
void receiveDataInB(Class1 &object)
{
normalHandlingForClass1AndAnySubclass(object);
if (typeid(object) == typeid(Class2))
{
additionalSpecialHandlingForClass2(dynamic_cast<Class2 &>(object));
}
}
Here is an alternative version which does not use typeid:
void receiveDataInB(Class1 &object)
{
normalHandlingForClass1AndAnySubclass(object);
Class2 *ptr = dynamic_cast<Class2 *>(&object);
if (ptr != 0)
{
additionalSpecialHandlingForClass2(*ptr);
}
}
This might be preferable if Class2 is not a leaf class (i.e. if there may be classes further deriving from it).
In the end, it often comes down to whether you are designing a whole system with all its parts from the beginning or have to modify or adapt parts of it at a later stage. But if you ever find yourself confronted with a problem like the one above, you may come to appreciate dynamic_cast as the right tool for the right job in the right situation.
It allows you to do things which you can only do to the derived type. But this is usually a hint that a redesign is in order.
struct Foo
{
virtual ~Foo() {}
};
struct Bar : Foo
{
void bar() const {}
};
int main()
{
Foo * f = new Bar();
Bar* b = dynamic_cast<Bar*>(f);
if (b) b->bar();
delete f;
}
I can't think of any case where it's not possible to use virtual functions (other than such things as boost:any and similar "lost the original type" work).
However, I have found myself using dynamic_cast a few times in the Pascal compiler I'm currently writing in C++. Mostly because it's a "better" solution than adding a dozen virtual functions to the baseclass, that are ONLY used in one or two places when you already (should) know what type the object is. Currently, out of roughly 4300 lines of code, there are 6 instances of dynamic_cast - one of which can probably be "fixed" by actually storing the type as the derived type rather than the base-type.
In a couple of places, I use things like ArrayDecl* a = dynamic_cast<ArrayDecl*>(type); to determine that type is indeed an array declaration, and not someone using an non-array type as a base, when accessing an index (and I also need a to access the array type information later). Again, adding all the virtual functions to the base TypeDecl class would give lots of functions that mostly return nothing useful (e.g. NULL), and aren't called except when you already know that the class is (or at least should be) one of the derived types. For example, getting to know the range/size of an array is useless for types that aren't arrays.
No advantages really. Sometimes dynamic_cast is useful for a quick hack, but generally it is better to design classes properly and use polymorphism. There may be cases when due to some reasons it is not possible to modify the base class in order to add necessary virtual functions (e.g. it is from a third-party which we do not want to modify), but still dynamic_cast usage should be an exception, not a rule.
An often used argument that it is not convenient to add everything to the base class does not work really, since the Visitor pattern (see e.g. http://sourcemaking.com/design_patterns/visitor/cpp/2) solves this problem in a more organised way purely with polymorphism - using Visitor you can keep the base class small and still use virtual functions without casting.
dynamic_cast needs to be used on base class pointer for down cast when member function is not available in base class, but only in derived class. There is no advantage to use it. It is a way to safely down cast when virtual function is not overridden from base class. Check for null pointer on return value. You are correct in that it is used where there is no virtual function derivation.