Does non-polymorphic inheritance require this pointer adjustment? In all the cases I've seen this pointer adjustment discussed the examples used involved polymorphic inheritance via keyword virtual.
It's not clear to me if non-polymorphic inheritance would require this pointer adjustment.
An extremely simple example would be:
struct Base1 {
void b1() {}
};
struct Base2 {
void b2() {}
};
struct Derived : public Base1, Base2 {
void derived() {}
};
Would the following function call require this pointer adjustment?
Derived d;
d.b2();
In this case the this pointer adjustment would clearly be superfluous since no data members are accessed. On the other hand, if the inherited functions accessed data members then this pointer adjustment might be a good idea. On the other other hand, if the member functions are not inlined it seems like this pointer adjustment is necessary no matter what.
I realize this is an implementation detail and not part of the C++ standard but this is a question about how real compilers behave. I don't know if this is a case like vtables where all compilers follow the same general strategy or if I've asked a very compiler dependent question. If it is very compiler dependent, then that in itself would be a sufficient answer or if you'd prefer, you can focus on either gcc or clang.
Layout of objects is not specified by the language. From the C++ Draft Standard N3337:
10 Derived Classes
5 The order in which the base class subobjects are allocated in the most derived object (1.8) is unspecified. [ Note: a derived class and its base class subobjects can be represented by a directed acyclic graph (DAG) where an arrow means “directly derived from.” A DAG of subobjects is often referred to as a “subobject lattice.”
6 The arrows need not have a physical representation in memory. —end note ]
Coming to your question:
Would the following function call require this pointer adjustment?
It depends on how the object layout is created by the compiler. It may or may not.
In your case, since there are no member data in the classes, there are no virtual member functions, and you are using the member function of the first base class, you probably won't see any pointer adjustments. However, if you add member data, and use a member function of the second base class, you are most likely going to see pointer adjustments.
Here's some example code and the output from running the code:
#include <iostream>
struct Base1 {
void b1()
{
std::cout << (void*)this << std::endl;
}
int x;
};
struct Base2 {
void b2()
{
std::cout << (void*)this << std::endl;
}
int y;
};
struct Derived : public Base1, public Base2 {
void derived() {}
};
int main()
{
Derived d;
d.b1();
d.b2();
return 0;
}
Output:
0x28ac28
0x28ac2c
This is not just compiler-specific but also optimization-level-specific. As a rule of thumb, all this pointers are adjusted, only sometimes it is by 0 as would be your example in many compilers (but definitely not all — IIRC, MSVC is a notable exception). If the function is inlined and does not access this, then the adjustment may be optimized out altogether.
Using R Sahu's method for testing this, it looks like the answer for gcc, clang, and icc is yes, this pointer adjustment occurs, unless the base class is the primary base class or an empty base class.
The test code:
#include <iostream>
namespace {
struct Base1
{
void b1()
{
std::cout << "b1() " << (void*)this << std::endl;
}
int x;
};
struct Base2
{
void b2()
{
std::cout << "b2() " << (void*)this << std::endl;
}
int x;
};
struct EmptyBase
{
void eb()
{
std::cout << "eb(): " << (void*)this << std::endl;
}
};
struct Derived : private Base1, Base2, EmptyBase
{
void derived()
{
b1();
b2();
eb();
std::cout << "derived(): " << (void*)this << std::endl;
}
};
}
int main()
{
Derived d;
d.derived();
}
An anonymous namespace is used to give the base classes internal linkage. An intelligent compiler could determine that the only use of the base classes is in this translation unit and this pointer adjustment is unnecessary. Private inheritance is used for good measure but I don't think it has real significance.
Example g++ 4.9.2 output:
b1() 0x7fff5c5337d0
b2() 0x7fff5c5337d4
eb(): 0x7fff5c5337d0
derived(): 0x7fff5c5337d0
Example clang 3.5.0 output
b1() 0x7fff43fc07e0
b2() 0x7fff43fc07e4
eb(): 0x7fff43fc07e0
derived(): 0x7fff43fc07e0
Example icc 15.0.0.077 output:
b1() 0x7fff513e76d8
b2() 0x7fff513e76dc
eb(): 0x7fff513e76d8
derived(): 0x7fff513e76d8
All three compilers adjust the this pointer for b2(). If they don't elide the this pointer adjustment in this easy case then they very likely won't ever elide this pointer adjustment. The primary base class and empty base classes are exceptions.
As far as I know, an intelligent standards conforming compiler could elide the this pointer adjustment for b2() but it's simply an optimization that they don't do.
Related
I stumbled across this pattern today. It compiles fine but does not work correctly at runtime. ("Der1" is printed twice)
I can sort of see why, given that the address dereferenced is always the same, but I don't fully understand.
I am not looking for a solution or workaround, I have already restructured this code. Just interested to understand what happens under the hood in this scenario.
#include <iostream>
struct Base
{
virtual void Func() = 0;
};
struct Der1 : public Base
{
virtual void Func() override
{
std::cout << "Der1" <<std::endl;
}
};
struct Der2 : public Base
{
virtual void Func() override
{
std::cout << "Der2" <<std::endl;
}
};
static union Ders
{
Der1 D1;
Der2 D2;
Ders() : D1() {}
} theDers;
static Base * b = &theDers.D1;
int main()
{
b->Func();
b = &theDers.D2;
b->Func();
return 0;
}
What's happening is undefined behavior. What happens "under the hood" is immaterial. A different C++ compiler might produce completely different results (called a "crash").
You can observe undefined behavior in action by adding a constructor to both classes:
struct Der1 : public Base
{
Der1()
{
std::cout << "Der1 construct\n";
}
// ...
struct Der2 : public Base
{
Der2()
{
std::cout << "Der2 construct\n";
}
You will observe that only Der1 gets constructed. This is your big honking clue.
In a union, the first object in the union gets initially constructed for you. It becomes your onus to make a different member union "active" by manually invoking the existing active object's destructor and invoking the new active object's constructor, directly (typically using placement new). It's your onus to keep track of which union member is active.
The shown code invokes a method of an object that was never constructed, resulting in undefined behavior.
This is why in C++ it's much easier to use std::variant, which does all this work for you.
struct base
{
base(){}
~base() { cout << "base destructor" << endl; }
};
struct derived : public base
{
derived() : base() { vec.resize(200000000); }
~derived() { cout << "derived destructor" << endl; }
vector<int> vec;
};
int main()
{
base* ptr = new derived();
delete ptr;
while (true)
{
}
}
The above code leaks due to delete operation not calling derived object's destructor. But...
struct base
{
base() {}
~base() { cout << "base destructor" << endl; }
};
struct derived : public base
{
derived() : base() {}
~derived() { cout << "derived destructor" << endl; }
int arr[200000000];
};
int main()
{
base* ptr = new derived();
delete ptr;
while (true)
{
}
}
In second case, the memory doesn't leak despite the base destructor is only being called. So I'm assuming it's safe to not have a base destructor if all my members are automatic variables? Doesn't 'arr' member in derived class never go out of scope when derived object's destructor is not being called? What's going on behind the scenes?
YES!
I see that you are thinking "practically", about what destructions might be missed. Consider that the destructor of your derived class is not just the destructor body you write — in this context you also need to consider member destruction, and your suggestion may fail to destroy the vector (because the routine non-virtually destroying your object won't even know that there is a derived part to consider). The vector has dynamically allocated contents which would be leaked.
However we don't even need to go that far. The behaviour of your program is undefined, period, end of story. The optimiser can make assumptions based on your code being valid. If it's not, you can and should expect strange sh!t to happen that may not fit with how your expectation of a computer should work. That's because C++ is an abstraction, compilation is complex, and you made a contract with the language.
It is always necessary to have a virtual destructor in a base class if a derived object is ever deleted through a pointer to that base. Otherwise behaviour of the program is undefined. In any other case it is not necessary to have a virtual destructor. It is irrelevant what members the class has.
It's not necessary to have a memory leak and still invoke an UB. Memory leak is a kind of expected UB if your derived class isn't trivial. Example:
#include <iostream>
class Field {
public:
int *data;
Field() : data(new int[100]) {}
~Field() { delete[] data; std::cout << "Field is destroyed"; }
};
class Base {
int c;
};
// Derived class, contains a non-trivial non-static member
class Core : public Base
{
Field A;
};
int main()
{
Base *base = new Core;
delete base; // won't delete Field
}
he C++ Standard, [expr.delete], paragraph 3 states (2014 edition)
In the first alternative (delete object), if the static type of the
object to be deleted is different from its dynamic type, the static
type shall be a base class of the dynamic type of the object to be
deleted and the static type shall have a virtual destructor or the
behavior is undefined. In the second alternative (delete array) if the
dynamic type of the object to be deleted differs from its static type,
the behavior is undefined.
In reality , if base class is trivial, all fields are trivial and derived class contains no non-static or non-trivial members, one might argue, that those classes are equal, but I'm yet to find way how to prove that through standard.It's likely an IB instead of UB.
Consider following snippet of code making using of CRTP
#include <iostream>
struct Alone
{
Alone() { std::cout << "Alone constructor called" << std::endl; }
int me {10};
};
struct Dependant
{
explicit Dependant(const Alone& alone)
: ref_alone(alone)
{ std::cout << "Dependant called with alone's me = " << alone.me << std::endl; }
const Alone& ref_alone;
void print() { std::cout << ref_alone.me << std::endl; }
};
template <typename D>
struct Base
{
Base() { std::cout << "Base constructor called" << std::endl; }
D* getDerived() { return static_cast<D*>(this); }
Dependant f { getDerived()->alone };
void print() { f.print(); }
};
struct Derived : Base <Derived>
{
Derived() { std::cout << "Derived constructor called " << std::endl; }
Alone alone {};
void print() { Base::print(); };
};
int main()
{
Derived d;
d.print();
}
original link http://coliru.stacked-crooked.com/a/79f8ba2d9c38b965
I have a basic question first
How does memory allocation happen when using inheritance? I know the constructors are called from Base to Derived, but it seems that when I do
Derived d;
memory equivalent to sizeof(D) is allocated, and then constructors are called. Is my understanding correct here? (This would explain printing uninitialised member)
Considering the above example in mind, would you suggest/recommend any best practices when it comes to CRTP?
memory equivalent to sizeof(D) is allocated, and then constructors are called
How else could it possibly work? You can't construct an object in memory that isn't allocated yet. Memory allocation always comes before object construction.
Considering the above example in mind, would you suggest/recommend any best practices when it comes to CRTP?
The standard practices for the CRTP: don't call into the CRTP in a constructor/destructor. This is also true of virtual functions. Virtuals are dynamic polymorphism, while CRTP is static polymorphism. But they're both using the same basic mechanism: a base class that defines an interface that the derived class must implement.
And just like with virtual functions, trying to call it in constructors/destructors won't do what you mean. The only difference is that with virtual functions, the compiler will actually keep you from getting undefined behavior. Whereas with the CRTP, you just get breakage.
Note that this includes default member initializers, which for the purpose of non-aggregates are just shorthand for constructor initialization lists.
I have the following code (stolen from virtual functions and static_cast):
#include <iostream>
class Base
{
public:
virtual void foo() { std::cout << "Base::foo() \n"; }
};
class Derived : public Base
{
public:
virtual void foo() { std::cout << "Derived::foo() \n"; }
};
If I have:
int main()
{
Base base;
Derived& _1 = static_cast<Derived&>(base);
_1.foo();
}
The print-out will be: Base::foo()
However, if I have:
int main()
{
Base * base;
Derived* _1 = static_cast<Derived*>(base);
_1->foo();
}
The print-out will be: Segmentation fault: 11
Honestly, I don't quite understand both. Can somebody explain the complications between static_cast and virtual methods based on the above examples? BTW, what could I do if I want the print-out to be "Derived::foo()"?
A valid static_cast to pointer or reference type does not affect virtual calls at all. Virtual calls are resolved in accordance with the dynamic type of the object. static_cast to pointer or reference does not change the dynamic type of the actual object.
The output you observe in your examples is irrelevant though. The examples are simply broken.
The first one makes an invalid static_cast. You are not allowed to cast Base & to Derived & in situations when the underlying object is not Derived. Any attempt to perform such cast produces undefined behavior.
Here's an example of valid application of static_cast for reference type downcasting
int main()
{
Derived derived;
Base &base = derived;
Derived& _1 = static_cast<Derived&>(base);
_1.foo();
}
In your second example the code is completely broken for reasons that have nothing to do with any casts or virtual calls. The code attempts to manipulate non-initialized pointers - the behavior is undefined.
In your second example, you segfault because you did not instanciate your base pointer. So there is no v-table to call. Try:
Base * base = new Base();
Derived* _1 = static_cast<Derived*>(base);
_1->foo();
This will print Base::foo()
The question makes no sense, as the static_cast will not affect the v-table. However, this makes more sens with non-virtual functions :
class Base
{
public:
void foo() { std::cout << "Base::foo() \n"; }
};
class Derived : public Base
{
public:
void foo() { std::cout << "Derived::foo() \n"; }
};
int main()
{
Base base;
Derived& _1 = static_cast<Derived&>(base);
_1.foo();
}
This one will output Derived::foo(). This is however a very wrong code, and though it compiles, the behavior is undefined.
The whole purpose of virtual functions is that the static type of the variable shouldn't matter. The compiler will look up the actual implementation for the object itself (usually with a vtable pointer hidden within the object). static_cast should have no effect.
In both examples the behavior is undefined. A Base object is not a Derived object, and telling the compiler to pretend that it is doesn't make it one. The way to get the code to print out "Derived::foo()" is to use an object of type Derived.
Lets say we have the following two class definitions.
#include <iostream>
#include <array>
class A
{
public:
virtual void f() = 0;
};
class B : public A
{
public:
virtual void f() { std::cout << i << std::endl; }
int i;
};
Here sizeof(B) == 8, presumably 4 the virtual pointer and 4 for the int.
Now lets say we make an array of B, like so:
std::array<B, 10> x;
Now we get sizeof(x) == 80.
If my understanding is correct, all method calls on elements of x are resolved statically, as we know the type at compile time. Unless we do something like A* p = &x[i] I don't see a need to even store the virtual pointer.
Is there a way to create an object of type B without a virtual pointer if you know it is not going to be used?
i.e. a template type nonvirtual<T> which does not contain a virtual pointer, and cannot be pointed to by a subtype of T?
Is there a way to create an object of type B without a virtual pointer if you know it is not going to be used?
No. Objects are what they are. A virtual object is virtual, always.
After all, you could do this:
A *a = &x[2];
a->f();
That is perfectly legitimate and legal code. And C++ has to allow it. The type B is virtual, and it has a certain size. You can't make a type be a different type based on where it is used.
Answering my own question here, but I've found that the following does the job, by splitting A into it's virtual and non-virtual components:
enum is_virtual
{
VIRTUAL,
STATIC
};
template <is_virtual X>
class A;
template<>
class A<STATIC>
{
};
template<>
class A<VIRTUAL> : public A<STATIC>
{
public:
virtual void f() = 0;
virtual ~A() {}
};
template <is_virtual X>
class B : public A<X>
{
public:
void f() { std::cout << i << std::endl; }
int i;
};
The important thing here is that in B<> don't specify f() as virtual. That way it will be virtual if the class inherits A<VIRTUAL>, but not virtual if it inherits A<STATIC>. Then we can do the following:
int main()
{
std::cout << sizeof(B<STATIC>) << std::endl; // 4
std::cout << sizeof(B<VIRTUAL>) << std::endl; // 8
std::array<B<STATIC>, 10> x1;
std::array<B<VIRTUAL>, 10> x2;
std::cout << sizeof(x1) << std::endl; // 40
std::cout << sizeof(x2) << std::endl; // 80
}
That would be a nice one to have, but I can't think of any way to revoke virtual members or avoid storing the virtual pointer.
You could probably do some nasty hacks by keeping a buffer that's the size of B without the virtual pointer, and play with casts and such. But is all undefined behavior, and platform dependant.
Unfortunately it won't work in any normal sense as the code inside the method calls expects the virtual pointer to be in the class definition.
I suppose you could copy/paste all of A and B's code into a new class but that gets to be a maintenance headache fast.