C++ union of derived classes with pure virtual base - what happens? - c++

I stumbled across this pattern today. It compiles fine but does not work correctly at runtime. ("Der1" is printed twice)
I can sort of see why, given that the address dereferenced is always the same, but I don't fully understand.
I am not looking for a solution or workaround, I have already restructured this code. Just interested to understand what happens under the hood in this scenario.
#include <iostream>
struct Base
{
virtual void Func() = 0;
};
struct Der1 : public Base
{
virtual void Func() override
{
std::cout << "Der1" <<std::endl;
}
};
struct Der2 : public Base
{
virtual void Func() override
{
std::cout << "Der2" <<std::endl;
}
};
static union Ders
{
Der1 D1;
Der2 D2;
Ders() : D1() {}
} theDers;
static Base * b = &theDers.D1;
int main()
{
b->Func();
b = &theDers.D2;
b->Func();
return 0;
}

What's happening is undefined behavior. What happens "under the hood" is immaterial. A different C++ compiler might produce completely different results (called a "crash").
You can observe undefined behavior in action by adding a constructor to both classes:
struct Der1 : public Base
{
Der1()
{
std::cout << "Der1 construct\n";
}
// ...
struct Der2 : public Base
{
Der2()
{
std::cout << "Der2 construct\n";
}
You will observe that only Der1 gets constructed. This is your big honking clue.
In this union, the constructor Ders() : D1() {} constructs only D1, so D1 is the active member. It is your onus to make a different union member "active" by manually invoking the existing active object's destructor and then the new active object's constructor directly (typically using placement new). It is also your onus to keep track of which union member is active.
The shown code invokes a method of an object that was never constructed, resulting in undefined behavior.
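The manual switching described above can be sketched roughly as follows. This is only an illustration, not the asker's code: Name() stands in for Func() so the result can be inspected, and SwitchToDer2 is a hypothetical helper.

```cpp
#include <cassert>
#include <new>
#include <string>

struct Base {
    virtual std::string Name() const = 0;  // stand-in for Func(); an assumption of this sketch
};
struct Der1 : Base {
    std::string Name() const override { return "Der1"; }
};
struct Der2 : Base {
    std::string Name() const override { return "Der2"; }
};

union Ders {
    Der1 d1;
    Der2 d2;
    Ders() : d1() {}  // only d1 is constructed, so d1 is the active member
};

// Hypothetical helper: ends the lifetime of d1, then starts the lifetime of d2
// in the same storage. The caller must know that d1 is currently active.
Base* SwitchToDer2(Ders& u) {
    u.d1.~Der1();
    Base* b = new (&u.d2) Der2();
    assert(b->Name() == "Der2");
    return b;
}
```

Note that the sketch uses the pointer returned by placement new rather than a pointer taken before the switch.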
This is why in C++ it's much easier to use std::variant, which does all this work for you.
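A minimal sketch of the std::variant approach, assuming C++17. Name() again stands in for Func(), and ActiveName is a hypothetical helper, not something from the question.

```cpp
#include <cassert>
#include <string>
#include <variant>

struct Base {
    virtual ~Base() = default;
    virtual std::string Name() const = 0;  // stand-in for Func(); an assumption of this sketch
};
struct Der1 : Base {
    std::string Name() const override { return "Der1"; }
};
struct Der2 : Base {
    std::string Name() const override { return "Der2"; }
};

using Ders = std::variant<Der1, Der2>;  // default-constructs the first alternative, Der1

// Hypothetical helper: calls Name() on whichever alternative is active.
std::string ActiveName(const Ders& d) {
    assert(!d.valueless_by_exception());
    return std::visit([](const Base& b) { return b.Name(); }, d);
}
```

Calling d.emplace<Der2>() on a Ders destroys the current alternative and constructs a Der2, so the variant always knows which member is active.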

Related

Is it necessary to have virtual destructor if the derived class only contains automatic variable members?

struct base
{
base(){}
~base() { cout << "base destructor" << endl; }
};
struct derived : public base
{
derived() : base() { vec.resize(200000000); }
~derived() { cout << "derived destructor" << endl; }
vector<int> vec;
};
int main()
{
base* ptr = new derived();
delete ptr;
while (true)
{
}
}
The above code leaks due to delete operation not calling derived object's destructor. But...
struct base
{
base() {}
~base() { cout << "base destructor" << endl; }
};
struct derived : public base
{
derived() : base() {}
~derived() { cout << "derived destructor" << endl; }
int arr[200000000];
};
int main()
{
base* ptr = new derived();
delete ptr;
while (true)
{
}
}
In the second case, the memory doesn't leak even though only the base destructor is called. So I'm assuming it's safe to not have a virtual base destructor if all my members are automatic variables? Doesn't the 'arr' member in the derived class never go out of scope when the derived object's destructor is not called? What's going on behind the scenes?
YES!
I see that you are thinking "practically", about what destructions might be missed. Consider that the destructor of your derived class is not just the destructor body you write — in this context you also need to consider member destruction, and your suggestion may fail to destroy the vector (because the routine non-virtually destroying your object won't even know that there is a derived part to consider). The vector has dynamically allocated contents which would be leaked.
However we don't even need to go that far. The behaviour of your program is undefined, period, end of story. The optimiser can make assumptions based on your code being valid. If it's not, you can and should expect strange sh!t to happen that may not fit with how your expectation of a computer should work. That's because C++ is an abstraction, compilation is complex, and you made a contract with the language.
It is always necessary to have a virtual destructor in a base class if a derived object is ever deleted through a pointer to that base. Otherwise behaviour of the program is undefined. In any other case it is not necessary to have a virtual destructor. It is irrelevant what members the class has.
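As a rough illustration of the well-defined case, here is a sketch in which the base destructor is virtual. Log() and DestroyThroughBase are hypothetical helpers added only to make the destructor order observable.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Illustrative log of destructor calls (not part of the original code).
std::string& Log() {
    static std::string log;
    return log;
}

struct base {
    virtual ~base() { Log() += "base;"; }  // virtual: delete through base* is well defined
};
struct derived : base {
    ~derived() override { Log() += "derived;"; }
};

// Hypothetical helper: creates a derived object and deletes it through a base pointer.
void DestroyThroughBase() {
    base* ptr = new derived();
    assert(ptr != nullptr);
    delete ptr;  // runs ~derived(), then ~base()
}
```

With the virtual keyword removed, the same delete would be undefined behaviour, so nothing could be said about what the log contains.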
You don't need a memory leak to invoke UB. A memory leak is just one of the symptoms you can expect from this UB if your derived class isn't trivial. Example:
#include <iostream>
class Field {
public:
int *data;
Field() : data(new int[100]) {}
~Field() { delete[] data; std::cout << "Field is destroyed"; }
};
class Base {
int c;
};
// Derived class, contains a non-trivial non-static member
class Core : public Base
{
Field A;
};
int main()
{
Base *base = new Core;
delete base; // won't delete Field
}
The C++ Standard, [expr.delete], paragraph 3 states (2014 edition):
In the first alternative (delete object), if the static type of the
object to be deleted is different from its dynamic type, the static
type shall be a base class of the dynamic type of the object to be
deleted and the static type shall have a virtual destructor or the
behavior is undefined. In the second alternative (delete array) if the
dynamic type of the object to be deleted differs from its static type,
the behavior is undefined.
In reality, if the base class is trivial, all fields are trivial, and the derived class adds no non-static or non-trivial members, one might argue that those classes are equivalent, but I have yet to find a way to prove that through the standard. It's likely IB instead of UB.

Is it possible to change a C++ object's class after instantiation?

I have a bunch of classes which all inherit the same attributes from a common base class. The base class implements some virtual functions that work in general cases, whilst each subclass re-implements those virtual functions for a variety of special cases.
Here's the situation: I want the special-ness of these sub-classed objects to be expendable. Essentially, I would like to implement an expend() function which causes an object to lose its sub-class identity and revert to being a base-class instance with the general-case behaviours implemented in the base class.
I should note that the derived classes don't introduce any additional variables, so both the base and derived classes should be the same size in memory.
I'm open to destroying the old object and creating a new one, as long as I can create the new object at the same memory address, so existing pointers aren't broken.
The following attempt doesn't work, and produces some seemingly unexpected behaviour. What am I missing here?
#include <iostream>
class Base {
public:
virtual void whoami() {
std::cout << "I am Base\n";
}
};
class Derived : public Base {
public:
void whoami() {
std::cout << "I am Derived\n";
}
};
Base* object;
int main() {
object = new Derived; //assign a new Derived class instance
object->whoami(); //this prints "I am Derived"
Base baseObject;
*object = baseObject; //reassign existing object to a different type
object->whoami(); //but it *STILL* prints "I am Derived" (!)
return 0;
}
You can, at the cost of breaking good practices and maintaining unsafe code. Other answers will provide you with nasty tricks to achieve this.
I don't like answers that just say "you should not do that", but I would like to suggest there is probably a better way to achieve the result you seek.
The strategy pattern, as suggested in a comment by @manni66, is a good one.
You should also think about data-oriented design, since a class hierarchy does not look like a wise choice in your case.
Yes and no. A C++ class defines the type of a memory region that is an object. Once the memory region has been instantiated, its type is set. You can try to work around the type system sure, but the compiler won't let you get away with it. Sooner or later it will shoot you in the foot, because the compiler made an assumption about types that you violated, and there is no way to stop the compiler from making such assumption in a portable fashion.
However, there is a design pattern for this: it's "State". You extract what changes into its own class hierarchy, with its own base class, and you have your objects store a pointer to the abstract state base of this new hierarchy. You can then swap those to your heart's content.
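A minimal sketch of that idea, with every name (Behaviour, GeneralBehaviour, SpecialBehaviour, Object) invented for illustration. whoami() returns a string instead of printing so the effect is easy to check.

```cpp
#include <cassert>
#include <memory>
#include <string>

// All names here are illustrative; none come from the original question.
struct Behaviour {
    virtual ~Behaviour() = default;
    virtual std::string whoami() const = 0;
};
struct GeneralBehaviour : Behaviour {
    std::string whoami() const override { return "I am Base"; }
};
struct SpecialBehaviour : Behaviour {
    std::string whoami() const override { return "I am Derived"; }
};

class Object {
public:
    Object() : state_(std::make_unique<SpecialBehaviour>()) {}
    std::string whoami() const {
        assert(state_);
        return state_->whoami();
    }
    // Swaps the state; the Object itself stays at the same address.
    void expend() { state_ = std::make_unique<GeneralBehaviour>(); }
private:
    std::unique_ptr<Behaviour> state_;
};
```

Existing pointers to the Object remain valid across expend(), because only the state it owns is replaced.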
No it's not possible to change the type of an object once instantiated.
*object = baseObject; doesn't change the type of object, it merely calls a compiler-generated assignment operator.
It would have been a different matter if you had written
object = new Base;
(remembering to call delete naturally; currently your code leaks an object).
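The point about assignment can be checked with a small sketch. The value member and the StillDerivedAfterAssignment helper are assumptions added for illustration; the classes in the question have no data members.

```cpp
#include <cassert>
#include <typeinfo>

struct Base {
    virtual ~Base() = default;
    int value = 0;  // illustrative data member, not in the original question
};
struct Derived : Base {};

// Hypothetical check: assigns a Base to a Derived through a Base reference and
// reports whether the dynamic type is still Derived afterwards.
bool StillDerivedAfterAssignment() {
    Derived d;
    Base& ref = d;
    Base plain;
    plain.value = 42;
    ref = plain;  // Base::operator= copies value; it cannot touch the dynamic type
    assert(ref.value == 42);
    return typeid(ref) == typeid(Derived);
}
```

The data is copied, but typeid still reports Derived, which matches the output seen in the question.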
C++11 onwards gives you the ability to move the resources from one object to another; see
http://en.cppreference.com/w/cpp/utility/move
I'm open to destroying the old object and creating a new one, as long as I can create the new object at the same memory address, so existing pointers aren't broken.
The C++ Standard explicitly addresses this idea in section 3.8 (Object Lifetime):
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object <snip>
Oh wow, this is exactly what you wanted. But I didn't show the whole rule. Here's the rest:
if:
the storage for the new object exactly overlays the storage location which the original object occupied, and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and
the original object was a most derived object (1.8) of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).
So your idea has been thought of by the language committee and specifically made illegal, including the sneaky workaround that "I have a base class subobject of the right type, I'll just make a new object in its place" which the last bullet point stops in its tracks.
You can replace an object with an object of a different type as @RossRidge's answer shows. Or you can replace an object and keep using pointers that existed before the replacement. But you cannot do both together.
However, like the famous quote: "Any problem in computer science can be solved by adding a layer of indirection" and that is true here too.
Instead of your suggested method
Derived d;
Base* p = &d;
new (p) Base(); // makes p invalid! Plus problems when d's destructor is automatically called
You can do:
std::unique_ptr<Base> p = std::make_unique<Derived>();
p = std::make_unique<Base>(); // destroys the Derived; p now owns a new Base
If you hide this pointer and sleight-of-hand inside another class, you'll have a "design pattern" such as State or Strategy mentioned in other answers. But they all rely on one extra level of indirection.
I suggest you use the Strategy Pattern, e.g.
#include <iostream>
class IAnnouncer {
public:
virtual ~IAnnouncer() { }
virtual void whoami() = 0;
};
class AnnouncerA : public IAnnouncer {
public:
void whoami() override {
std::cout << "I am A\n";
}
};
class AnnouncerB : public IAnnouncer {
public:
void whoami() override {
std::cout << "I am B\n";
}
};
class Foo
{
public:
Foo(IAnnouncer *announcer) : announcer(announcer)
{
}
void run()
{
// Do stuff
if(nullptr != announcer)
{
announcer->whoami();
}
// Do other stuff
}
void expend(IAnnouncer* announcer)
{
this->announcer = announcer;
}
private:
IAnnouncer *announcer;
};
int main() {
AnnouncerA a;
Foo foo(&a);
foo.run();
// Ready to "expend"
AnnouncerB b;
foo.expend(&b);
foo.run();
return 0;
}
This is a very flexible pattern that has at least a few benefits over trying to deal with the issue through inheritance:
You can easily change the behavior of Foo later on by implementing a new Announcer
Your Announcers (and your Foos) are easily unit tested
You can reuse your Announcers elsewhere in the code
I suggest you have a look at the age-old "Composition vs. Inheritance" debate (cf. https://www.thoughtworks.com/insights/blog/composition-vs-inheritance-how-choose)
ps. You've leaked a Derived in your original post! Have a look at std::unique_ptr if it is available.
You can do what you're literally asking for with placement new and an explicit destructor call. Something like this:
#include <iostream>
#include <stdlib.h>
class Base {
public:
virtual void whoami() {
std::cout << "I am Base\n";
}
};
class Derived : public Base {
public:
void whoami() {
std::cout << "I am Derived\n";
}
};
union Both {
Base base;
Derived derived;
};
Base *object;
int
main() {
Both *tmp = (Both *) malloc(sizeof(Both));
object = new(&tmp->base) Base;
object->whoami();
Base baseObject;
tmp = (Both *) object;
tmp->base.Base::~Base();
new(&tmp->derived) Derived;
object->whoami();
return 0;
}
However, as matb said, this really isn't a good design. I would recommend reconsidering what you're trying to do. Some of the other answers here might also solve your problem, but I think anything along the lines of what you're asking for is going to be a kludge. You should seriously consider designing your application so you can change the pointer when the type of the object changes.
You can by introducing a variable to the base class, so the memory footprint stays the same. By setting the flag you force calling the derived or the base class implementation.
#include <iostream>
class Base {
public:
Base() : m_useDerived(true)
{
}
void setUseDerived(bool value)
{
m_useDerived = value;
}
void whoami() {
m_useDerived ? whoamiImpl() : Base::whoamiImpl();
}
protected:
virtual void whoamiImpl() { std::cout << "I am Base\n"; }
private:
bool m_useDerived;
};
class Derived : public Base {
protected:
void whoamiImpl() {
std::cout << "I am Derived\n";
}
};
Base* object;
int main() {
object = new Derived; //assign a new Derived class instance
object->whoami(); //this prints "I am Derived"
object->setUseDerived(false);
object->whoami(); //should print "I am Base"
return 0;
}
In addition to other answers, you could use function pointers (or any wrapper on them, like std::function) to achieve the necessary behavior:
#include <iostream>
using std::cout;
using std::endl;
void print_base(void) {
cout << "This is base" << endl;
}
void print_derived(void) {
cout << "This is derived" << endl;
}
class Base {
public:
void (*print)(void);
Base() {
print = print_base;
}
};
class Derived : public Base {
public:
Derived() {
print = print_derived;
}
};
int main() {
Base* b = new Derived();
b->print(); // prints "This is derived"
*b = Base();
b->print(); // prints "This is base"
return 0;
}
Also, this function-pointer approach would allow you to change any of the object's functions at run time, without limiting you to the already defined sets of members implemented in derived classes.
There is a simple error in your program. You assign the objects, but not the pointers:
int main() {
Base* object = new Derived; //assign a new Derived class instance
object->whoami(); //this prints "I am Derived"
Base baseObject;
Now you assign baseObject to *object, which overwrites the Derived object with a Base object. However, this does work well because you are overwriting an object of type Derived with an object of type Base. The default assignment operator just assigns all members, which in this case does nothing. The object cannot change its type and is still a Derived object afterwards. In general, this can lead to serious problems, e.g. object slicing.
*object = baseObject; //reassign existing object to a different type
object->whoami(); //but it *STILL* prints "I am Derived" (!)
return 0;
}
If you instead just assign the pointer it will work as expected, but you just have two objects, one of type Derived and one Base, but I think you want some more dynamic behavior. It sounds like you could implement the specialness as a Decorator.
You have a base-class with some operation, and several derived classes that change/modify/extend the base-class behavior of that operation. Since it is based on composition it can be changed dynamically. The trick is to store a base-class reference in the Decorator instances and use that for all other functionality.
class Base {
public:
virtual void whoami() {
std::cout << "I am Base\n";
}
virtual void otherFunctionality() {}
virtual ~Base() {} // needed because main() deletes a decorator through a Base*
};
class Derived1 : public Base {
public:
Derived1(Base* base): m_base(base) {}
virtual void whoami() override {
std::cout << "I am Derived\n";
// maybe even call the base-class implementation
// if you just want to add something
}
virtual void otherFunctionality() {
m_base->otherFunctionality();
}
private:
Base* m_base;
};
Base* object;
int main() {
Base baseObject;
object = new Derived1(&baseObject); //assign a new Derived1 decorator instance
object->whoami(); //this prints "I am Derived"
// undecorate
delete object;
object = &baseObject;
object->whoami();
return 0;
}
There are alternative patterns like Strategy which implement different use cases or solve different problems. It would probably be good to read the pattern documentation with special focus on the Intent and Motivation sections.
I would consider regularizing your type.
class Base {
public:
virtual void whoami() { std::cout << "Base\n"; }
virtual std::unique_ptr<Base> clone() const {
return std::make_unique<Base>(*this);
}
virtual ~Base() {}
};
class Derived: public Base {
virtual void whoami() override {
std::cout << "Derived\n";
};
std::unique_ptr<Base> clone() const override {
return std::make_unique<Derived>(*this);
}
public:
~Derived() {}
};
struct Base_Value {
private:
std::unique_ptr<Base> pImpl;
public:
void whoami () {
pImpl->whoami();
}
template<class T, class...Args>
void emplace( Args&&...args ) {
pImpl = std::make_unique<T>(std::forward<Args>(args)...);
}
Base_Value()=default;
Base_Value(Base_Value&&)=default;
Base_Value& operator=(Base_Value&&)=default;
Base_Value(Base_Value const&o) {
if (o.pImpl) pImpl = o.pImpl->clone();
}
Base_Value& operator=(Base_Value const& o) {
auto tmp = o; // copy first, then swap the copy in
swap( pImpl, tmp.pImpl );
return *this;
}
};
Now a Base_Value is semantically a value-type that behaves polymorphically.
Base_Value object;
object.emplace<Derived>();
object.whoami();
object.emplace<Base>();
object.whoami();
You could wrap a Base_Value instance in a smart pointer, but I wouldn't bother.
I don’t disagree with the advice that this isn’t a great design, but another safe way to do it is with a union that can hold any of the classes you want to switch between, since the standard guarantees it can safely hold any of them. Here’s a version that encapsulates all the details inside the union itself:
#include <cassert>
#include <cstdlib>
#include <iostream>
#include <new>
#include <typeinfo>
class Base {
public:
virtual void whoami() {
std::cout << "I am Base\n";
}
virtual ~Base() {} // Every base class with child classes that might be deleted through a pointer to the
// base must have a virtual destructor!
};
class Derived : public Base {
public:
void whoami() {
std::cout << "I am Derived\n";
}
// At most one member of any union may have a default member initializer in C++11, so:
Derived(bool) : Base() {}
};
union BorD {
Base b;
Derived d; // Initialize one member.
BorD(void) : b() {} // These defaults are not used here.
BorD( const BorD& ) : b() {} // No per-instance data to worry about!
// Otherwise, this could get complicated.
BorD& operator= (const BorD& x) // Boilerplate:
{
if ( this != &x ) {
this->~BorD();
new(this) BorD(x);
}
return *this;
}
BorD( const Derived& x ) : d(x) {} // The constructor we use.
// To destroy, be sure to call the base class’ virtual destructor,
// which works so long as every member derives from Base.
~BorD(void) { dynamic_cast<Base*>(&this->b)->~Base(); }
Base& toBase(void)
{ // Sets the active member to b.
Base* const p = dynamic_cast<Base*>(&b);
assert(p); // The dynamic_cast cannot currently fail, but check anyway.
if ( typeid(*p) != typeid(Base) ) {
p->~Base(); // Call the virtual destructor.
new(&b) Base; // Call the constructor.
}
return b;
}
};
int main(void)
{
BorD u(Derived{false});
Base& reference = u.d; // By the standard, u, u.b and u.d have the same address.
reference.whoami(); // Should say derived.
u.toBase();
reference.whoami(); // Should say base.
return EXIT_SUCCESS;
}
A simpler way to get what you want is probably to keep a container of Base * and replace the items individually as needed with new and delete. (Still remember to declare your destructor virtual! That’s important with polymorphic classes, so you call the right destructor for that instance, not the base class’ destructor.) This might save you some extra bytes on instances of the smaller classes. You would need to play around with smart pointers to get safe automatic deletion, though. One advantage of unions over smart pointers to dynamic memory is that you don’t have to allocate or free any more objects on the heap, but can just re-use the memory you have.
DISCLAIMER: The code here is provided as means to understand an idea, not to be implemented in production.
You're using inheritance. It can achieve 3 things:
Add fields
Add methods
replace virtual methods
Out of all those features, you're using only the last one. This means that you're not actually forced to rely on inheritance. You can get the same results by many other means. The simplest is to keep tabs on the "type" by yourself - this will allow you to change it on the fly:
#include <iostream>
#include <stdexcept>
enum MyType { BASE, DERIVED };
class Any {
private:
enum MyType type;
public:
void whoami() {
switch(type){
case BASE:
std::cout << "I am Base\n";
return;
case DERIVED:
std::cout << "I am Derived\n";
return;
}
throw std::runtime_error( "undefined type" );
}
void changeType(MyType newType){
//insert some checks if that kind of transition is legal
type = newType;
}
Any(MyType initialType){
type = initialType;
}
};
Without inheritance the "type" is yours to do whatever you want. You can changeType at any time it suits you. With that power also comes responsibility: the compiler will no longer make sure the type is correct or even set at all. You have to ensure it or you'll get hard to debug runtime errors.
You may wrap it in inheritance just as well, eg. to get a drop-in replacement for existing code:
class Base : public Any {
public:
Base() : Any(BASE) {}
};
class Derived : public Any {
public:
Derived() : Any(DERIVED) {}
};
OR (slightly uglier):
class Derived : public Base {
public:
Derived() : Base() {
changeType(DERIVED);
}
};
This solution is easy to implement and easy to understand. But with more options in the switch and more code in each path, it gets very messy. So the very first step is to refactor the actual code out of the switch and into self-contained functions. Where better to keep them than in the Derived class?
class Base {
public:
static void whoami(Any* This){
std::cout << "I am Base\n";
}
};
class Derived {
public:
static void whoami(Any* This){
std::cout << "I am Derived\n";
}
};
/*you know where it goes*/
switch(type){
case BASE:
Base::whoami(this);
return;
case DERIVED:
Derived::whoami(this);
return;
}
Then you can replace the switch with an external class that implements it via virtual inheritance and TADA! We've reinvented the Strategy Pattern, as others have said in the first place : )
The bottom line is: whatever you do, you're not inheriting the main class.
You cannot change the type of an object after instantiation. As you can see in your example, you have a pointer to Base (its type is pointer to Base), and that type stays with it until the end.
The base pointer can point to a base object or a derived object, but that doesn't mean its type has changed:
Base* ptrBase; // pointer to base class (type)
ptrBase = new Derived; // pointer of type base class `points to an object of derived class`
Base theBase;
ptrBase = &theBase; // not *ptrBase = theBase: the Base* pointer now points to a Base object
Pointers are strong, flexible, and powerful, but just as dangerous, so you should handle them cautiously.
In your example I can write:
Base* object; // pointer to Base, declared but not initialized, so it points to garbage
Base bObject; // object of class Base
*object = bObject; // as you did in your code
The above is a disaster: it assigns a value through an uninitialized pointer. The behavior is undefined and the program will likely crash.
In your example you escaped the crash because memory was allocated first:
object = new Derived;
It's never a good idea to assign the value, rather than the address, of a subclass object to a base class object. With built-in types you can, but consider this example:
int* pInt = NULL;
int* ptrC = new int[1];
ptrC[0] = 1;
pInt = ptrC;
for(int i = 0; i < 1; i++)
cout << pInt[i] << ", ";
cout << endl;
int* ptrD = new int[3];
ptrD[0] = 5;
ptrD[1] = 7;
ptrD[2] = 77;
*pInt = *ptrD; // copies only the first value (5) into the one-element array; pInt still points to ptrC's storage
// the correct way:
// pInt = ptrD;
for(int i = 0; i < 3; i++)
cout << pInt[i] << ", ";
cout << endl;
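So the result is not what you might guess: only the first value was copied, and the loop then reads past the end of the one-element array, which is undefined behavior.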
I have 2 solutions. A simpler one that doesn't preserve the memory address, and one that does preserve the memory address.
Both require that you provide downcasts from Base to Derived, which isn't a problem in your case.
struct Base {
int a;
Base(int a) : a{a} {};
virtual ~Base() = default;
virtual auto foo() -> void { cout << "Base " << a << endl; }
};
struct D1 : Base {
using Base::Base;
D1(Base b) : Base{b.a} {};
auto foo() -> void override { cout << "D1 " << a << endl; }
};
struct D2 : Base {
using Base::Base;
D2(Base b) : Base{b.a} {};
auto foo() -> void override { cout << "D2 " << a << endl; }
};
For the former one you can create a smart pointer that can seemingly change the held data between Derived (and base) classes:
template <class B> struct Morpher {
std::unique_ptr<B> obj;
template <class D> auto morph() {
obj = std::make_unique<D>(*obj);
}
auto operator->() -> B* { return obj.get(); }
};
int main() {
Morpher<Base> m{std::make_unique<D1>(24)};
m->foo(); // D1 24
m.morph<D2>();
m->foo(); // D2 24
}
The magic is in
m.morph<D2>();
which changes the held object preserving the data members (actually uses the cast ctor).
If you need to preserve the memory location, you can adapt the above to use a buffer and placement new instead of unique_ptr. It is a little more work and demands a whole lot more attention, but it gives you exactly what you need:
template <class B> struct Morpher {
std::aligned_storage_t<sizeof(B)> buffer_;
B *obj_;
template <class D>
Morpher(const D &new_obj)
: obj_{new (&buffer_) D{new_obj}} {
static_assert(std::is_base_of<B, D>::value && sizeof(D) == sizeof(B) &&
alignof(D) == alignof(B));
}
Morpher(const Morpher &) = delete;
auto operator=(const Morpher &) = delete;
~Morpher() { obj_->~B(); }
template <class D> auto morph() {
static_assert(std::is_base_of<B, D>::value && sizeof(D) == sizeof(B) &&
alignof(D) == alignof(B));
D tmp{*obj_}; // copy the data while the old object is still alive
obj_->~B();
obj_ = new (&buffer_) D{tmp};
}
auto operator-> () -> B * { return obj_; }
};
int main() {
Morpher<Base> m{D1{24}};
m->foo(); // D1 24
m.morph<D2>();
m->foo(); // D2 24
m.morph<Base>();
m->foo(); // Base 24
}
This is of course the absolute bare bone. You can add move ctor, dereference operator etc.
#include <iostream>
class Base {
public:
virtual void whoami() {
std::cout << "I am Base\n";
}
};
class Derived : public Base {
public:
void whoami() {
std::cout << "I am Derived\n";
}
};
Base* object;
int main() {
object = new Derived;
object->whoami();
Base baseObject;
object = &baseObject;// this is how you change.
object->whoami();
return 0;
}
output:
I am Derived
I am Base
Your assignment only assigns member variables, not the pointer used for virtual member function calls. You can easily replace that with full memory copy:
//*object = baseObject; //this assignment was wrong
memcpy(object, &baseObject, sizeof(baseObject));
Note that much like your attempted assignment, this would replace member variables in *object with those of the newly constructed baseObject - probably not what you actually want, so you'll have to copy the original member variables to the new baseObject first, using either assignment operator or copy constructor before the memcpy, i.e.
Base baseObject = *object;
It is possible to copy just the virtual functions table pointer but that would rely on internal knowledge about how the compiler stores it so is not recommended.
If keeping the object at the same memory address is not crucial, a simpler and so better approach would be the opposite - construct a new base object and copy the original object's member variables over - i.e. use a copy constructor.
object = new Base(*object);
But you'll also have to delete the original object, so the above one-liner won't be enough - you need to remember the original pointer in another variable in order to delete it, etc. If you have multiple references to that original object you'll need to update them all, and sometimes this can be quite complicated. Then the memcpy way is better.
If some of the member variables themselves are pointers to objects that are created/deleted in the main object's constructor/destructor, or if they have a more specialized assignment operator or other custom logic, you'll have some more work on your hands, but for trivial member variables this should be good enough.

Does this pointer adjustment occur for non-polymorphic inheritance?

Does non-polymorphic inheritance require this pointer adjustment? In all the cases I've seen this pointer adjustment discussed the examples used involved polymorphic inheritance via keyword virtual.
It's not clear to me if non-polymorphic inheritance would require this pointer adjustment.
An extremely simple example would be:
struct Base1 {
void b1() {}
};
struct Base2 {
void b2() {}
};
struct Derived : public Base1, Base2 {
void derived() {}
};
Would the following function call require this pointer adjustment?
Derived d;
d.b2();
In this case the this pointer adjustment would clearly be superfluous since no data members are accessed. On the other hand, if the inherited functions accessed data members then this pointer adjustment might be a good idea. On the other other hand, if the member functions are not inlined it seems like this pointer adjustment is necessary no matter what.
I realize this is an implementation detail and not part of the C++ standard but this is a question about how real compilers behave. I don't know if this is a case like vtables where all compilers follow the same general strategy or if I've asked a very compiler dependent question. If it is very compiler dependent, then that in itself would be a sufficient answer or if you'd prefer, you can focus on either gcc or clang.
Layout of objects is not specified by the language. From the C++ Draft Standard N3337:
10 Derived Classes
5 The order in which the base class subobjects are allocated in the most derived object (1.8) is unspecified. [ Note: a derived class and its base class subobjects can be represented by a directed acyclic graph (DAG) where an arrow means “directly derived from.” A DAG of subobjects is often referred to as a “subobject lattice.”
6 The arrows need not have a physical representation in memory. —end note ]
Coming to your question:
Would the following function call require this pointer adjustment?
It depends on how the object layout is created by the compiler. It may or may not.
In your case, since there are no member data in the classes, there are no virtual member functions, and you are using the member function of the first base class, you probably won't see any pointer adjustments. However, if you add member data, and use a member function of the second base class, you are most likely going to see pointer adjustments.
Here's some example code and the output from running the code:
#include <iostream>
struct Base1 {
void b1()
{
std::cout << (void*)this << std::endl;
}
int x;
};
struct Base2 {
void b2()
{
std::cout << (void*)this << std::endl;
}
int y;
};
struct Derived : public Base1, public Base2 {
void derived() {}
};
int main()
{
Derived d;
d.b1();
d.b2();
return 0;
}
Output:
0x28ac28
0x28ac2c
This is not just compiler-specific but also optimization-level-specific. As a rule of thumb, every this pointer is adjusted; it is just that the adjustment is sometimes 0, as it would be in your example with many compilers (but definitely not all; IIRC, MSVC is a notable exception). If the function is inlined and does not access this, then the adjustment may be optimized out altogether.
Using R Sahu's method for testing this, it looks like the answer for gcc, clang, and icc is yes, this pointer adjustment occurs, unless the base class is the primary base class or an empty base class.
The test code:
#include <iostream>
namespace {
struct Base1
{
void b1()
{
std::cout << "b1() " << (void*)this << std::endl;
}
int x;
};
struct Base2
{
void b2()
{
std::cout << "b2() " << (void*)this << std::endl;
}
int x;
};
struct EmptyBase
{
void eb()
{
std::cout << "eb(): " << (void*)this << std::endl;
}
};
struct Derived : private Base1, Base2, EmptyBase
{
void derived()
{
b1();
b2();
eb();
std::cout << "derived(): " << (void*)this << std::endl;
}
};
}
int main()
{
Derived d;
d.derived();
}
An anonymous namespace is used to give the base classes internal linkage. An intelligent compiler could determine that the only use of the base classes is in this translation unit and this pointer adjustment is unnecessary. Private inheritance is used for good measure but I don't think it has real significance.
Example g++ 4.9.2 output:
b1() 0x7fff5c5337d0
b2() 0x7fff5c5337d4
eb(): 0x7fff5c5337d0
derived(): 0x7fff5c5337d0
Example clang 3.5.0 output:
b1() 0x7fff43fc07e0
b2() 0x7fff43fc07e4
eb(): 0x7fff43fc07e0
derived(): 0x7fff43fc07e0
Example icc 15.0.0.077 output:
b1() 0x7fff513e76d8
b2() 0x7fff513e76dc
eb(): 0x7fff513e76d8
derived(): 0x7fff513e76d8
All three compilers adjust the this pointer for b2(). If they don't elide the this pointer adjustment in this easy case then they very likely won't ever elide this pointer adjustment. The primary base class and empty base classes are exceptions.
As far as I know, an intelligent standards conforming compiler could elide the this pointer adjustment for b2() but it's simply an optimization that they don't do.

How does static_cast affect the virtual function calls?

I have the following code (stolen from virtual functions and static_cast):
#include <iostream>
class Base
{
public:
virtual void foo() { std::cout << "Base::foo() \n"; }
};
class Derived : public Base
{
public:
virtual void foo() { std::cout << "Derived::foo() \n"; }
};
If I have:
int main()
{
Base base;
Derived& _1 = static_cast<Derived&>(base);
_1.foo();
}
The print-out will be: Base::foo()
However, if I have:
int main()
{
Base * base;
Derived* _1 = static_cast<Derived*>(base);
_1->foo();
}
The print-out will be: Segmentation fault: 11
Honestly, I don't quite understand either result. Can somebody explain how static_cast and virtual methods interact, based on the above examples? BTW, what could I do if I want the print-out to be "Derived::foo()"?
A valid static_cast to pointer or reference type does not affect virtual calls at all. Virtual calls are resolved in accordance with the dynamic type of the object. static_cast to pointer or reference does not change the dynamic type of the actual object.
The output you observe in your examples is irrelevant though. The examples are simply broken.
The first one makes an invalid static_cast. You are not allowed to cast Base & to Derived & in situations when the underlying object is not Derived. Any attempt to perform such cast produces undefined behavior.
Here's an example of valid application of static_cast for reference type downcasting
int main()
{
Derived derived;
Base &base = derived;
Derived& _1 = static_cast<Derived&>(base);
_1.foo();
}
In your second example the code is completely broken for reasons that have nothing to do with any casts or virtual calls. The code attempts to manipulate non-initialized pointers - the behavior is undefined.
In your second example, you segfault because you never initialized your base pointer, so there is no object (and therefore no v-table) to dispatch through. Try:
Base * base = new Base();
Derived* _1 = static_cast<Derived*>(base);
_1->foo();
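This will typically print Base::foo(), although the downcast itself is still undefined behavior because the object is not a Derived.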
The question makes no sense, as the static_cast will not affect the v-table. However, this makes more sense with non-virtual functions:
class Base
{
public:
void foo() { std::cout << "Base::foo() \n"; }
};
class Derived : public Base
{
public:
void foo() { std::cout << "Derived::foo() \n"; }
};
int main()
{
Base base;
Derived& _1 = static_cast<Derived&>(base);
_1.foo();
}
This one will typically output Derived::foo(). However, this code is very wrong: although it compiles, the behavior is undefined.
The whole purpose of virtual functions is that the static type of the variable shouldn't matter. The compiler will look up the actual implementation for the object itself (usually with a vtable pointer hidden within the object). static_cast should have no effect.
In both examples the behavior is undefined. A Base object is not a Derived object, and telling the compiler to pretend that it is doesn't make it one. The way to get the code to print out "Derived::foo()" is to use an object of type Derived.

C++: Binding to a base class

EDIT:
In the following code container::push takes an object of type T that derives from base as argument and stores in a vector a pointer to the method bool T::test().
container::call calls each of the stored methods in the context of the member object p, which has type base, not T. It works as long as the called method does not refer to any member outside base and test() is not declared virtual.
I know this is ugly and may not even be correct.
How can I accomplish the same thing in a better way?
#include <iostream>
#include <tr1/functional>
#include <vector>
class base {
public:
base(int v) : x(v)
{}
bool test() const { // this is NOT called
return false;
}
protected:
int x;
};
class derived : public base {
public:
bool test() const { // this is called instead
return (x == 42);
}
};
class container {
public:
container() : p(42)
{}
template<typename T>
void push(const T&) {
vec.push_back((bool (base::*)() const) &T::test);
}
void call() {
std::vector<bool (base::*)() const>::iterator i;
for(i = vec.begin(); i != vec.end(); ++i) {
if( (p .* (*i))() ) {
std::cout << "ok\n";
}
}
}
private:
std::vector<bool (base::*)() const> vec;
base p;
};
int main(int argc, char* argv[]) {
container c;
c.push(derived());
c.call();
return 0;
}
What you are doing with your "boost::bind" statement is to call derived::test and pass "b" as a "this" pointer. It's important to remember that the "this" pointer for derived::test is supposed to be a pointer to a "derived" object, which is not the case for you. It works in your particular situation since you have no vtable and the memory layout is identical, but as soon as that changes, your program will likely break.
And besides, it's just plain wrong - ugly, unreadable, bug-prone code. What are you really trying to do?
[Edit] New answer to the edited question: You should use boost::bind to create a functional closure, that wraps both the object & the member function in a single object - and store that object in your collection. Then when you invoke it, it is always reliable.
If you can't use boost in your application... well, you could write something like boost::bind yourself (just look at how it is done in boost), but it's more likely that you'll get it wrong and have bugs.
To the updated question:
Calling a derived member function on a base object is Undefined Behavior. What you are trying to achieve (code) is wrong. Try to post what you need and people will help with a sensible design.
What you are doing is not correct, and in the simple example it will work, but might just raise hell (one of the possibilities for undefined behavior) in other cases.
Since base::test and derived::test are not virtual, they are two different member methods, so for simplicity I will call them base::foo and derived::bar. In the binder code you are forcing the compiler to treat a pointer to bar, which is defined in derived, as if it were defined in base, and then calling it. That is, you are calling a method of derived on an object of type base, which is undefined behavior.
The reason that it is not dying is that the this pointers in base and derived coincide and that you are only accessing data present in the base class. But it is incorrect.
When you declare base::test virtual, you get the correct behavior: since the most derived object in the hierarchy is a base, the compiler uses the virtual dispatch mechanism, finds that base is where the final overrider for test lives, and executes it.
When you declare only derived::test as virtual (and not base::test), the compiler will try to use a nonexistent virtual dispatch mechanism (usually a vtable pointer) in the object it is handed, and that kills the application.
At any rate, all but the virtual base::test uses are incorrect. Depending on what your actual requirements are, the most probably correct way of doing it would be:
class base {
public:
virtual bool test() const;
};
class derived : public base {
public:
virtual bool test() const; // <--- virtual is optional here, but informative
};
int main()
{
derived d; // <--- the actual final type
base & b = d; // <--- optional
if ( std::tr1::bind( &base::test, std::tr1::ref(b))() ) {
// ...
}
}
Note that there is no cast (a cast is usually a hint that something weird and potentially dangerous is hiding there), that the object is of the concrete type on which you want the method to be called, and that the virtual dispatch mechanism guarantees that even if the bind is to base::test, as the method is virtual, the final overrider will be executed.
This other example will more likely do funny things (I have not tried it):
struct base {
void foo() {}
};
struct derived : base {
void foo() {
for ( int i = 0; i < 1000; ++i ) {
std::cout << data[i];
}
}
int data[1000];
};
int main() {
base b;
std::tr1::bind((void (base::*)()) &derived::foo, std::tr1::ref(b))();
}