What does a C++ compiler do to create an object? - c++

In C code like such:
{
int i = 5;
/* ....... */
}
The compiler will implement this by moving the stack pointer down (for stacks growing down) by the size of an int, and storing the value 5 at that memory location.
Similarly, in C++ code, what does the compiler do if an object is created? For example:
class b
{
public :
int p;
virtual void fun();
};
int main()
{
b obj;
}
What will the compiler do for the above code? Can anyone explain when memory is allocated, and when memory for the virtual table is allocated, and when the default constructor is called?

On Constructions
Logically there is no difference between the two:
In both cases the stack is made large enough to hold the object and the constructor is called on the object.
Just note:
The constructor for a POD type does nothing.
A user-defined type with no constructor has a compiler-generated default constructor.
You can think about it like this:
int x; // stack frame increased by sizeof(int) default construct (do nothing)
B a; // stack frame increased by sizeof(B) default construct.
While:
int y(6); // stack frame increased by sizeof(int) Copy constructor called
B b(a); // stack frame increased by sizeof(B) Copy constructor called
Ok. Of course the constructor for POD types is trivial and the compiler will do a lot of optimizations (and may all but remove any actual code and even the memory address), but logically it is just fine to think of it happening this way.
Note: All types have a copy constructor (the compiler defines one if you don't), and for POD types you can logically think of it as copy construction without any problems.
As for virtual tables:
Let me first note this is an implementation detail and not all compilers use them.
But the vtable itself is usually generated at compile time. Any object that needs a vtable has an invisible pointer added to the structure (this is included as part of the object's size). Then during construction the pointer is set to point at the vtable.
Note: It is impossible to say exactly when the vtable pointer is set, as this is not defined by the standard and thus each compiler is free to do it at any time. If you have a multi-level hierarchy then the pointer is probably set by each constructor from base to most derived, and thus probably does not refer to the most derived class's vtable until the final constructor runs.
Note: A virtual function called from a constructor or destructor does not dispatch to an override in a more derived class; it resolves to the version belonging to the class whose constructor or destructor is currently running. So all you can rely on is that virtual calls reach the most derived overrides once construction of the complete object has finished.
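To make the construction-time behaviour concrete, here is a minimal sketch. The class names and the seen_in_ctor member are made up for illustration; the point is only that a virtual call made while the Base constructor runs resolves to Base's version, while the same call made after construction reaches Derived's.

```cpp
#include <cassert>
#include <string>

struct Base {
    std::string seen_in_ctor;
    // Virtual call made while only the Base part exists: resolves to Base::name.
    Base() { seen_in_ctor = name(); }
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// What the Base constructor observed while a Derived was being built.
std::string name_seen_during_construction() {
    Derived d;
    return d.seen_in_ctor;
}

// The same call made through a Base reference after construction has finished.
std::string name_seen_after_construction() {
    Derived d;
    const Base& ref = d;
    return ref.name();
}
```

With these definitions, assert(name_seen_during_construction() == "Base") and assert(name_seen_after_construction() == "Derived") should both hold.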

It's semantically the same: the stack pointer gets decremented (for stacks growing down) by sizeof(b), then the default constructor is called to set up your instance.
In practice, depending on your architecture and on your compiler (and the flags you pass to it), basic types like in your int example may not get allocated actual memory space on the stack unless it's really required. They'll live in registers until an operation requiring a real memory address is needed (like the & operator).

To touch on the question of when the virtual table gets allocated: usually that is done at compile time (though it does depend on the compiler).
The virtual table is static for any given class. Because of this the compiler can emit the table at compile time. During the initialization of an object of the class, its pointer to the virtual table is set to the stored table.
Because of this, different instances of the same class will point to the same virtual table.

On
b obj;
as with an int, the stack pointer is adjusted by the size of b. Then the constructor is called. The constructor may or may not call new or any other function to allocate memory. That's up to b's implementation. The call itself does not initiate any allocation.
The vftable is a static object. It is not created with the object itself. The object rather contains an 'invisible' pointer that points to its matching vftable. Its size is included in sizeof(b).
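A rough way to see the hidden pointer is to compare sizeof for two otherwise identical types. This is only a sketch: the type names are invented, and the standard does not promise any particular overhead, so the comparison is an assumption that holds on the usual vtable-based implementations.

```cpp
#include <cassert>
#include <cstddef>

struct Plain {          // same data member, no virtual functions
    int p;
};

struct Polymorphic {    // mirrors class b from the question
    int p;
    virtual void fun() {}
    virtual ~Polymorphic() = default;
};

// Extra bytes the implementation adds for its hidden bookkeeping
// (typically one vtable pointer plus alignment padding).
constexpr std::size_t hidden_overhead() {
    return sizeof(Polymorphic) - sizeof(Plain);
}
```

On a typical 64-bit GCC or Clang build I would expect sizeof(Plain) to be 4 and sizeof(Polymorphic) to be 16, but only assert(sizeof(Polymorphic) > sizeof(Plain)) is something I would rely on in practice.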

Just to add to previous answers, once the object is constructed, the compiler will do whatever magic is necessary under its conventions to guarantee the destructor is called when the object goes out of scope. The language guarantees this, but most compilers have to do something to follow through on the guarantee (like set up a table of pointers to destructors and rules about when to invoke the various destructors for the various objects).
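The guarantee itself is easy to observe even if the mechanism is not. A small sketch, using a hypothetical Tracer type that appends to a log when it is created and destroyed:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type: records its own construction and destruction.
struct Tracer {
    std::string& log;
    std::string name;
    Tracer(std::string& l, std::string n) : log(l), name(n) { log += "+" + name + " "; }
    ~Tracer() { log += "-" + name + " "; }
};

std::string trace_scope() {
    std::string log;
    {
        Tracer a(log, "a");
        Tracer b(log, "b");
    }   // leaving the block destroys b first, then a
    return log;
}
```

Automatic objects are destroyed in reverse order of construction, so assert(trace_scope() == "+a +b -b -a ") should pass.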

According to the Itanium C++ ABI (which, for example, GCC follows), the virtual table is stored in separate memory, global to the translation unit.
For each dynamic class a virtual table is constructed and stored under a specific name in the object file, like _ZTV5Class. Objects whose runtime type is exactly Class will contain pointers to this table. These pointers will be initialized, adjusted and accessed, but no object contains its virtual table within its instance.
So the answer is that virtual tables are allocated at compile time, and during construction only pointers to them are set up.

Related

memcpy derived class to base class, why still called base class function

I am reading Inside the C++ Object Model. In section 1.3
So, then, why is it that, given
Bear b;
ZooAnimal za = b;
// ZooAnimal::rotate() invoked
za.rotate();
the instance of rotate() invoked is the ZooAnimal instance and not that of Bear? Moreover, if memberwise initialization copies the values of one object to another, why is za's vptr not addressing Bear's virtual table?
The answer to the second question is that the compiler intercedes in the initialization and assignment of one class object with another. The compiler must ensure that if an object contains one or more vptrs, those vptr values are not initialized or changed by the source object .
So I wrote the test code below:
#include <stdio.h>
class Base{
public:
virtual void vfunc() { puts("Base::vfunc()"); }
};
class Derived: public Base
{
public:
virtual void vfunc() { puts("Derived::vfunc()"); }
};
#include <string.h>
int main()
{
Derived d;
Base b_assign = d;
Base b_memcpy;
memcpy(&b_memcpy, &d, sizeof(Base));
b_assign.vfunc();
b_memcpy.vfunc();
printf("sizeof Base : %zu\n", sizeof(Base));
Base &b_ref = d;
b_ref.vfunc();
printf("b_assign: %x; b_memcpy: %x; b_ref: %x\n",
*(int *)&b_assign,
*(int *)&b_memcpy,
*(int *)&b_ref);
return 0;
}
The result
Base::vfunc()
Base::vfunc()
sizeof Base : 4
Derived::vfunc()
b_assign: 80487b4; b_memcpy: 8048780; b_ref: 8048780
My question is why b_memcpy still called Base::vfunc()
What you are doing is illegal in C++ language, meaning that the behavior of your b_memcpy object is undefined. The latter means that any behavior is "correct" and your expectations are completely unfounded. There's not much point in trying to analyze undefined behavior - it is not supposed to follow any logic.
In practice, it is quite possible that your manipulations with memcpy did actually copy Derived's virtual table pointer into the b_memcpy object, and the pointer values you printed confirm that. However, when a virtual method is called through an immediate object (as is the case with the b_memcpy.vfunc() call), most implementations optimize away the access to the virtual table and perform a direct (non-virtual) call to the target function. The formal rules of the language state that no legal action can ever make the b_memcpy.vfunc() call dispatch to anything other than Base::vfunc(), which is why the compiler can safely replace this call with a direct call to Base::vfunc(). This is why any virtual table manipulations will normally have no effect on the b_memcpy.vfunc() call.
The behavior you've invoked is undefined because the standard says it's undefined, and your compiler takes advantage of that fact. Let's look at g++ for a concrete example. The assembly it generates for the line b_memcpy.vfunc(); with optimizations disabled looks like this:
lea rax, [rbp-48]
mov rdi, rax
call Base::vfunc()
As you can see, the vtable wasn't even referenced. Since the compiler knows the static type of b_memcpy it has no reason to dispatch that method call polymorphically. b_memcpy can't be anything other than a Base object, so it just generates a call to Base::vfunc() as it would with any other method call.
Going a bit further, let's add a function like this:
void callVfunc(Base& b)
{
b.vfunc();
}
Now if we call callVfunc(b_memcpy); the result depends on the optimization level at which I compile the code. At -O0 and -O1 Derived::vfunc() is called, and at -O2 and -O3 Base::vfunc() is printed. Again, since the standard says the behavior of your program is undefined, the compiler makes no effort to produce a predictable result, and simply relies on the assumptions made by the language. Since the compiler knows b_memcpy is a Base object, it can simply inline the call to puts("Base::vfunc()"); when the optimization level allows for it.
You aren't allowed to do
memcpy(&b_memcpy, &d, sizeof(Base));
- it's undefined behaviour, because b_memcpy and d aren't "plain old data" objects (because they have virtual member functions).
If you wrote:
b_memcpy = d;
then it would print Base::vfunc() as expected.
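As a sketch of the well-defined alternatives (the types mirror the question's, but vfunc returns a string here so the result can be checked): copying a Derived into a Base slices it, a reference does not, and the type trait shows why memcpy is off the table for these classes.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

struct Base {
    virtual ~Base() = default;
    virtual std::string vfunc() const { return "Base::vfunc()"; }
};

struct Derived : Base {
    std::string vfunc() const override { return "Derived::vfunc()"; }
};

// memcpy is only defined for trivially copyable types; polymorphic classes are not.
static_assert(!std::is_trivially_copyable<Base>::value, "Base has virtual functions");

std::string call_on_copy() {
    Derived d;
    Base b_assign = d;          // copy construction slices: b_assign is a Base
    return b_assign.vfunc();
}

std::string call_on_reference() {
    Derived d;
    Base& b_ref = d;            // no copy: still refers to the Derived object
    return b_ref.vfunc();
}
```

Here assert(call_on_copy() == "Base::vfunc()") and assert(call_on_reference() == "Derived::vfunc()") should both hold, with no undefined behaviour involved.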
Any use of a vptr is outside the scope of the standard
Granted, the use of memcpy here has UB
The answers pointing out that any use of memcpy, or other byte manipulation of non-PODs, that is, of any object with a vptr, has undefined behavior, are strictly technically correct but do not answer the question. The question is predicated on the existence of a vptr (vtable pointer) which isn't even mandated by the standard: of course the answer will involve facts outside the standard and the result will not be guaranteed by the standard!
Standard text is not relevant regarding the vptr
The issue is not that you are not allowed to manipulate the vptr; the notion of being allowed by the standard to manipulate anything that is not even described in the standard text is absurd. Of course no standard way to change the vptr exists, and this is beside the point.
The vptr encodes the type of a polymorphic object
The issue here is not what the standard says about the vptr, the issue is what the vptr represents, and what the standard says about that: the vptr represents the dynamic type of an object. Whenever the result of an operation depends on the dynamic type, the compiler will generate code to use the vptr.
[Note regarding MI: I say "the" vptr (as if there were only one vptr), but when MI (multiple inheritance) is involved, objects can have more than one vptr, each representing the complete object viewed as a particular polymorphic base class type. (A polymorphic class is a class with at least one virtual function.)]
[Note regarding virtual bases: I mention only the vptr, but some compilers insert other pointers to represent aspects of the dynamic type, like the location of virtual base subobjects, and some other compilers use the vptr for that purpose. What is true about the vptr is also true about these other internal pointers.]
So a particular value of the vptr corresponds to a dynamic type: that is the type of most derived object.
Changes of the dynamic type of an object during its lifetime
During construction, the dynamic type changes, and that is why virtual function calls from inside a constructor can be "surprising". Some people say that the rules for calling virtual functions during construction are special, but they are absolutely not: the final overrider is called; that overrider is the one of the class corresponding to the most derived object that has been constructed so far, and in a constructor C::C(arg-list), that is always the class C.
During destruction, the dynamic type changes, in the reverse order. Calls to virtual function from inside destructors follow the same rules.
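The changing dynamic type can be observed directly with typeid, which is defined to report the constructor's or destructor's own class while it runs. A small sketch, with a made-up Observation struct to carry the results out:

```cpp
#include <cassert>
#include <typeinfo>

struct Observation {
    bool was_base_in_ctor = false;
    bool was_base_in_dtor = false;
};

struct Base {
    Observation& obs;
    explicit Base(Observation& o) : obs(o) {
        // While Base::Base runs, the dynamic type of *this is Base.
        obs.was_base_in_ctor = (typeid(*this) == typeid(Base));
    }
    virtual ~Base() {
        // By the time Base::~Base runs, the Derived part is already gone.
        obs.was_base_in_dtor = (typeid(*this) == typeid(Base));
    }
};

struct Derived : Base {
    explicit Derived(Observation& o) : Base(o) {}
};

Observation observe_lifetime() {
    Observation obs;
    { Derived d(obs); }   // construct and destroy a complete Derived
    return obs;
}

bool finished_object_is_derived() {
    Observation obs;
    Derived d(obs);
    const Base& as_base = d;
    return typeid(as_base) == typeid(Derived);
}
```

Both flags returned by observe_lifetime() should be true, and finished_object_is_derived() should be true as well: the same object is a Base during its base constructor and destructor, and a Derived in between.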
What it means when something is left undefined
You can do low level manipulations that are not sanctioned in the standard. That a behavior is not explicitly defined in the C++ standard does not imply that it is not described elsewhere. Just because the result of a manipulation is explicitly described as having UB (undefined behavior) in the C++ standard does not mean your implementation cannot define it.
You can also use your knowledge of the way compilers work: if strict separate compilation is used, that is, when the compiler can get no information from separately compiled code, every separately compiled function is a "black box". You can use this fact: the compiler will have to assume that anything that a separately compiled function could do will be done. Even within a given function, you can use an asm directive to get the same effect: an asm directive with no constraint can do anything that is legal in C++. The effect is a "forget what you know from code analysis at that point" directive.
The standard describes what can change the dynamic type, and nothing is allowed to change it except construction/destruction, so only an "external" (black box) function that is otherwise allowed to perform construction/destruction can change a dynamic type.
Calling constructors on an existing object is not allowed, except to reconstruct it with the exact same type (and with restrictions) see [basic.life]/8 :
If, after the lifetime of an object has ended and before the storage
which the object occupied is reused or released, a new object is
created at the storage location which the original object occupied, a
pointer that pointed to the original object, a reference that referred
to the original object, or the name of the original object will
automatically refer to the new object and, once the lifetime of the
new object has started, can be used to manipulate the new object, if:
(8.1) the storage for the new object exactly overlays the storage
location which the original object occupied, and
(8.2) the new object is of the same type as the original object
(ignoring the top-level cv-qualifiers), and
(8.3) the type of the original object is not const-qualified, and, if
a class type, does not contain any non-static data member whose type
is const-qualified or a reference type, and
(8.4) the original object was a most derived object ([intro.object])
of type T and the new object is a most derived object of type T (that
is, they are not base class subobjects).
This means that the only case where you could call a constructor (with placement new) and still use the same expressions that used to designate the objects (its name, pointers to it, etc.) are those where the dynamic type would not change, so the vptr would still be the same.
In other words, if you want to overwrite the vptr using low level tricks, you could; but only if you write the same value.
In other words, don't try to hack the vptr.
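For completeness, here is a sketch of the one permitted pattern, as I read [basic.life]/8: ending an object's lifetime and constructing a new object of exactly the same type in the same storage. Counter is a hypothetical class; note that the dynamic type (and so the vptr value) is unchanged.

```cpp
#include <cassert>
#include <new>

struct Counter {
    int value;
    explicit Counter(int v) : value(v) {}
    virtual ~Counter() = default;
    virtual int get() const { return value; }
};

int reconstruct_in_place() {
    Counter c(1);
    c.~Counter();                                  // end the lifetime of the old object
    ::new (static_cast<void*>(&c)) Counter(42);    // same type, same storage
    return c.get();                                // the name c now refers to the new object
}
```

assert(reconstruct_in_place() == 42) should hold. This is only safe here because the constructor cannot throw; if it did, the automatic destructor call at scope exit would hit storage with no live object in it.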

In C++ are constructors called before or after object creation?

I found some answers for this question regarding java but nothing specifically regarding c++. So I've read in Java the object is first created and then the constructor is called. I was wondering if this was the same process for c++? Also, if this is the case, then what's the point of having a default constructor at all? Is it for inheritance purposes?
"Object creation" means different things in different languages. But in C++ the most salient question is "when does the object lifetime begin". When an object's lifetime has begun, that means that when it later ends (you delete it, or if it is a stack object, then when it goes out of scope), the destructor will be called.
If the object's lifetime did not formally begin, then if it goes out of scope later, the destructor will not be called.
C++ resolves this as follows:
When you make an object, say of class type, by invoking a constructor, first the memory is allocated, then the constructor runs.
When the constructor runs to completion, then the lifetime has begun, and the destructor will be called when it ends. After the destructor finishes, the memory will be freed.
If the constructor aborts, say, by throwing an exception, then the destructor for that object will not be called. The memory will still be freed, however.
For more info on object lifetimes you might want to look at e.g. this question, or better, at the standard / a good text book.
The basic idea is that, in C++, we try to minimize the window of time between when the memory has been allocated and when it is initialized -- or rather, the language itself promotes the idea that "resource acquisition is initialization" and makes it un-idiomatic to acquire memory without also giving it a type and initializing it. In typical code, e.g. if you have a variable of type A, you can think of it as "This refers to a block of memory for an A, where a constructor for A successfully ran to completion." You don't normally have to consider the possibility that "this is a block of memory the size of an A, but the constructor failed and now it's an uninitialized / partially initialized headless blob."
It depends on what you mean by "created". Obviously memory has to be allocated before the object can be created.
But officially, the object is not created (its lifetime has not started) until after the constructor finishes execution. If the constructor did not execute completely (an exception was thrown, for example), the object is considered never to have existed in the first place. For example, the destructor for this object will not be called (as it would be for any existing object).
When you enter the constructor body, you can be sure that the members have been created (either by their default constructors or from whatever was passed in the constructor initializer list; the usual language rules for variable initialization apply: scalar members without an initializer have indeterminate values).
When you consider inheritance, it is even more complex. When you enter the constructor body, the base class parts are already existing objects: their constructors have finished, and even if the derived class fails to construct itself properly, their destructors will be called.
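A sketch of the failed-construction case, using two invented types, Part and Whole: the already-constructed member is destroyed during unwinding, but the destructor of the object whose constructor threw never runs.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

struct Part {
    std::string& log;
    explicit Part(std::string& l) : log(l) { log += "Part;"; }
    ~Part() { log += "~Part;"; }
};

struct Whole {
    Part part;
    explicit Whole(std::string& l) : part(l) {
        l += "Whole-throws;";
        throw std::runtime_error("construction failed");
    }
    ~Whole() { part.log += "~Whole;"; }   // never reached in this example
};

std::string trace_failed_construction() {
    std::string log;
    try {
        Whole w(log);
    } catch (const std::runtime_error&) {
        log += "caught;";
    }
    return log;
}
```

The expected trace is "Part;Whole-throws;~Part;caught;": no "~Whole;" appears, because the lifetime of the Whole object never began.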
Neither.
The process of object creation includes a constructor call.
The point of having a default constructor is to allow this part of the object creation process to be a no-op. What else would it be?
Here's the sequence: first, memory is allocated. Then any base class constructors are called. Finally the class constructor is called.
You don't actually need a default constructor if you'll always create your objects with a parameter list.
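The sequence can be sketched with a log. The types below are made up, and one step is added that the list above leaves out: data members are constructed after the base classes and before the constructor body runs.

```cpp
#include <cassert>
#include <string>

struct Base {
    explicit Base(std::string& log) { log += "base;"; }
};

struct Member {
    explicit Member(std::string& log) { log += "member;"; }
};

struct Derived : Base {
    Member m;
    // Base first, then members in declaration order, then this body.
    explicit Derived(std::string& log) : Base(log), m(log) { log += "body;"; }
};

std::string construction_order() {
    std::string log;
    Derived d(log);
    return log;
}
```

assert(construction_order() == "base;member;body;") should pass. Derived has no default constructor here, which is fine as long as it is always created with an argument.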
You seem to be confusing terms but I'll try to define some (unofficial) terms that should clarify this issue:
Allocation
This is the step where memory is allocated for the object.
Initialization
This is the step where the language related object properties are "set". The vTable and any other "language implementation" related operations are done.
Construction
Now that an object is allocated and initialized, the constructor is being executed. Whether the default constructor is used or not depends on how the object was created.
You can consider an object created after the 3rd step.
Also, if this is the case, then what's the point of having a default
constructor at all?
The point of "Default Constructor" is to tell the program how objects with no parameters should be built, in other words - what is the "default" state of an Object.
For example, the default state of std::unique_ptr is to point to null with no custom deleter.
The default state of a string is an empty string with the size of 0.
The default state of a vector is an empty vector with the size of 0.
The default state of T is the one specified by the constructor T().
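These default states are easy to check. A minimal sketch using only standard library types (the function names are mine):

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// A default-constructed unique_ptr owns nothing.
bool unique_ptr_default_is_null() {
    std::unique_ptr<int> p;
    return p == nullptr;
}

// Default-constructed containers are empty.
std::size_t default_string_size() { return std::string().size(); }
std::size_t default_vector_size() { return std::vector<int>().size(); }

// For a built-in type, T() value-initializes, which yields zero.
int value_initialized_int() { return int(); }
```

So assert(unique_ptr_default_is_null()), assert(default_string_size() == 0), assert(default_vector_size() == 0) and assert(value_initialized_int() == 0) should all hold.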
Think of it this way: the constructor has to be called after the object's storage exists, because of resource allocation issues. An object is essentially a structure of member variables (attributes) with associated member functions. An error would occur if the constructor set up any of these values before they were allocated in memory.
For example, if your constructor stores an int value in a member variable of your object but the object's member variables haven't been allocated yet, the value cannot be stored successfully.
Yes, the object's memory is first allocated, then the constructor called to actually construct the content. A bit like building a house, you first purchase [or otherwise legally get permission] the land (memory) to build it on, then start building. If you do it the other way around, you're likely to get into trouble.
A default constructor is used when your object needs to be constructed with no parameters. In some cases, this doesn't make any sense at all, but default constructors are used, for example, by std::vector: constructing a std::vector<myclass> with a size, or growing it with resize(n), creates the new elements by default construction. (Growing with push_back only copies or moves existing elements; the spare capacity at the back is left as raw, unconstructed storage.)
All objects need to be constructed after their memory is obtained (even if you don't declare a constructor, in which case you get an empty compiler-generated one, and if the compiler is clever it will not emit a call to it because it doesn't do anything).
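A sketch of where std::vector does and does not need a default constructor, to the best of my understanding of its requirements. NoDefault is a hypothetical element type.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical element type with no default constructor.
struct NoDefault {
    int id;
    explicit NoDefault(int i) : id(i) {}
};
static_assert(!std::is_default_constructible<NoDefault>::value, "no default constructor");

// push_back, and the reallocation it triggers, only copies or moves existing elements.
std::size_t grow_without_default_ctor() {
    std::vector<NoDefault> v;
    for (int i = 0; i < 100; ++i) v.push_back(NoDefault(i));
    return v.size();
}

// resize(n) is one place that creates new elements from nothing.
int resized_element_value() {
    std::vector<int> v;
    v.resize(3);        // three value-initialized ints
    return v[2];
}
```

grow_without_default_ctor() should return 100 and resized_element_value() should return 0. Calling resize(3) on a std::vector<NoDefault> would not compile.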
The term "default constructor" just means a constructor that needs no parameters, such that it can be used by-default. For example:
struct MyObject {
MyObject() { /* ... */ } // default constructor
MyObject(int) { /* ... */ } // some other non-default constructor
};
int main()
{
MyObject x; // default constructor is called since you didn't give
// any explicit parameters.
MyObject x2(5); // A non-default constructor is called.
}
Note that the term "default" doesn't mean that you haven't defined it. The default constructor may be provided by you or it may be automatically generated in certain situations. You even have the case of the defaulted default constructor:
struct MyObject {
MyObject() = default; // defaulted default constructor
};
Here, you've told the compiler to generate a default constructor using the default implementation.
Regardless of whether it is a default constructor or not, the object has been constructed as much as automatically possible before the constructor body is executed. This means the memory has been allocated to store the object, any base classes have been constructed, and any members have been constructed. The object also has taken the type identity of the class being constructed for purposes of RTTI and virtual function calls.
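One practical difference between a defaulted default constructor and one with a user-written empty body can be sketched as follows (type names invented): the defaulted one stays trivial, and empty-brace initialization zeroes the members first.

```cpp
#include <cassert>
#include <type_traits>

struct Defaulted {
    int x;
    Defaulted() = default;   // compiler-generated, trivial
};

struct UserProvided {
    int x;
    UserProvided() {}        // user-provided body that leaves x untouched
};

static_assert(std::is_trivially_default_constructible<Defaulted>::value, "trivial");
static_assert(!std::is_trivially_default_constructible<UserProvided>::value, "not trivial");

int value_initialized_x() {
    Defaulted d{};           // empty braces zero-initialize x
    return d.x;
}
```

assert(value_initialized_x() == 0) should hold. The same read on a UserProvided object would be reading an indeterminate value, so the sketch deliberately does not do that.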

Stack allocation of user defined class object

I'm trying to understand POD types and how they are allocated and initialized on the stack.
Given
class A {
public:
A();
int x;
};
class B {
public:
int x;
};
void func()
{
A a;
B b;
}
Am I correct in saying that b is allocated after a but initialized prior to a? By that I mean
that the space is allocated for a and b in the order that they are declared but b is initialized
when the space is allocated and a is initialized when it is declared?
I read a very good FAQ about PODs and Aggregates here
What are Aggregates and PODs and how/why are they special?
One of the things he said is:
The lifetime of objects of non-POD class type begins when the constructor has finished and ends when the destructor has finished. For POD classes, the lifetime begins when storage for the object is occupied and finishes when that storage is released or reused.
So I'm trying to understand the details of how PODs are allocated and initialized and how that is
different from non-PODs.
No. a is allocated and initialized first, and b is allocated and initialized second. C++ programs are executed statement by statement. Since the memory is automatic, there is no explicit allocation happening anyway -- it's all taken care of automatically.
(For instance, in typical call-stack implementations used on desktop operating systems, the memory is and has always been there and doesn't need to be allocated at all, just addressed.)
You have zero guarantees of any kind about the order in memory in which a and b are placed.
If A and B both had constructors, a's would be called before b's. But POD types, which you're asking about (and which B is) are not initialized at all with this syntax, so the question is moot.
The question of object initialization in relation to when the storage is allocated doesn't make much sense anyway. For example, most compilers here will allocate space for a and b in a single stack pointer move. Given that there is no way a conforming C++ program can detect such a thing (what does it even mean?), the compiler can do pretty much whatever it wants, though.
These are local variables, they are not "allocated" in the common sense, you can just consider them being there. (How is left to implementation; common way is to use a processor-supported stack. In that case all the storage for all local objects is taken on the stack at function entry).
Initialization always happens in the order of declarations. Here it means A::A() is called for a, then B::B() is called for b.

PODs and inheritance in C++11. Does the address of the struct == address of the first member?

(I've edited this question to avoid distractions. There is one core question which would need to be cleared up before any other question would make sense. Apologies to anybody whose answer now seems less relevant.)
Let's set up a specific example:
struct Base {
int i;
};
There are no virtual methods and there is no inheritance; it is generally a very dumb and simple object. Hence it's Plain Old Data (POD) and it falls back on a predictable layout. In particular:
Base b;
&b == reinterpret_cast<Base*>(&b.i);
This is according to Wikipedia (which itself claims to reference the C++03 standard):
A pointer to a POD-struct object, suitably converted using a reinterpret cast, points to its initial member and vice versa, implying that there is no padding at the beginning of a POD-struct.[8]
Now let's consider inheritance:
struct Derived : public Base {
};
Again, there are no virtual methods, no virtual inheritance, and no multiple inheritance. Therefore this is POD also.
Question: Does this fact (Derived is POD in C++11) allow us to say that:
Derived d;
&d == reinterpret_cast<Derived*>(&d.i); // true on g++-4.6
If this is true, then the following would be well-defined:
Base *b = reinterpret_cast<Base*>(malloc(sizeof(Derived)));
free(b); // It will be freeing the same address, so this is OK
I'm not asking about new and delete here - it's easier to consider malloc and free. I'm just curious about the regulations about the layout of derived objects in simple cases like this, and where the initial non-static member of the base class is in a predictable location.
Is a Derived object supposed to be equivalent to:
struct Derived { // no inheritance
Base b; // it just contains it instead
};
with no padding beforehand?
You don't care about POD-ness, you care about standard-layout. Here's the definition, from the standard section 9 [class]:
A standard-layout class is a class that:
has no non-static data members of type non-standard-layout class (or array of such types) or reference,
has no virtual functions (10.3) and no virtual base classes (10.1),
has the same access control (Clause 11) for all non-static data members,
has no non-standard-layout base classes,
either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and
has no base classes of the same type as the first non-static data member.
And the property you want is then guaranteed (section 9.2 [class.mem]):
A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa.
This is actually better than the old requirement, because the ability to reinterpret_cast isn't lost by adding non-trivial constructors and/or destructor.
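The guarantee can be checked for the question's types. A minimal sketch: the static_asserts confirm both types are standard-layout, and the function compares the address of the object, of its base subobject and of its first member.

```cpp
#include <cassert>
#include <type_traits>

struct Base {
    int i;
};

struct Derived : Base {
};

static_assert(std::is_standard_layout<Base>::value, "Base is standard-layout");
static_assert(std::is_standard_layout<Derived>::value, "Derived is standard-layout");

bool object_address_equals_first_member() {
    Derived d;
    d.i = 0;
    const void* object = &d;
    const void* base   = static_cast<Base*>(&d);
    const void* member = &d.i;
    return object == base && base == member;
}
```

assert(object_address_equals_first_member()) should pass on any conforming implementation, since both types are standard-layout.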
Now let's move to your second question. The answer is not what you were hoping for.
Base *b = new Derived;
delete b;
is undefined behavior unless Base has a virtual destructor. See section 5.3.5 ([expr.delete])
In the first alternative (delete object), if the static type of the object to be deleted is different from its dynamic type, the static type shall be a base class of the dynamic type of the object to be deleted and the static type shall have a virtual destructor or the behavior is undefined.
Your earlier snippet using malloc and free is mostly correct. This will work:
Base *b = new (malloc(sizeof(Derived))) Derived;
free(b);
because the value of pointer b is the same as the address returned from placement new, which is in turn the same address returned from malloc.
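Spelled out a little more fully, the malloc and placement new variant might look like the sketch below. The explicit destructor call is a no-op for these trivial types and is included only to show where it would go for a class that needs one.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

struct Base {
    int i;
};

struct Derived : Base {
};

int malloc_roundtrip() {
    void* raw = std::malloc(sizeof(Derived));
    assert(raw != nullptr);
    Derived* d = new (raw) Derived;   // construct in the malloc'ed storage
    d->i = 7;
    Base* b = d;                      // implicit conversion to the base class
    bool same_address = (static_cast<void*>(b) == raw);
    int value = b->i;
    d->~Derived();                    // trivial here
    std::free(b);                     // same address that malloc returned
    return same_address ? value : -1;
}
```

assert(malloc_roundtrip() == 7) should hold, which also confirms that the Base pointer and the malloc result compare equal for these types.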
Presumably your last bit of code is intended to say:
Base *b = new Derived;
delete b; // delete b, not d.
In that case, the short answer is that it remains undefined behavior. The fact that the class or struct in question is POD, standard layout or trivially copyable doesn't really change anything.
Yes, you're passing the right address, and yes, you and I know that in this case the dtor is pretty much a nop -- nonetheless, the pointer you're passing to delete has a different static type than dynamic type, and the static type does not have a virtual dtor. The standard is quite clear that this gives undefined behavior.
From a practical viewpoint, you can probably get away with the UB if you really insist -- chances are pretty good that there won't be any harmful side effects from what you're doing, at least with most typical compilers. Beware, however, that even at best the code is extremely fragile so seemingly trivial changes could break everything -- and even switching to a compiler with really heavy type checking and such could do so as well.
As far as your argument goes, the situation's pretty simple: it basically means the committee probably could make this defined behavior if they wanted to. As far as I know, however, it's never been proposed, and even if it had it would probably be a very low priority item -- it doesn't really add much, enable new styles of programming, etc.
This is meant as a supplement to Ben Voigt's answer, not a replacement.
You might think that this is all just a technicality. That the standard calling it 'undefined' is just a bit of semantic twaddle that has no real-world effects beyond allowing compiler writers to do silly things for no good reason. But this is not the case.
I could see desirable implementations in which:
Base *b = new Derived;
delete b;
Resulted in behavior that was quite bizarre. This is because storing the size of your allocated chunk of memory when it is known statically by the compiler is kind of silly. For example:
struct Base {
};
struct Derived : Base {
int an_int;
};
In this case, when delete Base is called, the compiler has every reason (because of the rule you quoted at the beginning of your question) to believe that the size of the data pointed at is 1, not 4. If it, for example, implements a version of operator new that has a separate array in which 1 byte entities are all densely packed, and a different array in which 4 byte entities are all densely packed, it will end up assuming the Base * points to somewhere in the 1-byte entity array when in fact it points somewhere in the 4-byte entity array, and making all kinds of interesting errors for this reason.
I really wish operator delete had been defined to also take a size, and the compiler passed in either the statically known size if operator delete was called on an object with a non-virtual destructor, or the known size of the actual object being pointed at if it were being called as a result of a virtual destructor. Though this would likely have other ill effects and maybe isn't such a good idea (like if there are cases in which operator delete is called without a destructor having been called). But it would make the problem painfully obvious.
There is lots of discussion on irrelevant issues above. Yes, mainly for C compatibility there are a number of guarantees you can rely on as long as you know what you are doing. All this is, however, irrelevant to your main question. The main question is: Is there any situation where an object can be deleted using a pointer type which doesn't match the dynamic type of the object and where the pointed to type doesn't have a virtual destructor. The answer is: no, there is not.
The logic for this can be derived from what the run-time system is supposed to do: it gets a pointer to an object and is asked to delete it. For this to be defined, it would need to store information on how to call derived class destructors, or on the amount of memory the object actually takes. However, this would imply a possibly quite substantial cost in terms of used memory. For example, if the first member requires very strict alignment, e.g. to be aligned at an 8 byte boundary as is the case for double, adding a size would add an overhead of at least 8 bytes to each allocation. Even though this might not sound too bad, it may mean that only one object instead of two or four fits into a cache line, reducing performance substantially.
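The alignment cost described above can be sketched with a plain struct that puts a size field in front of the payload. This is only a model of what an allocator would have to do, not how any particular allocator is implemented; Payload, SizePrefixed and overhead_bytes are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// The object the user actually asked for.
struct Payload {
    double value;
};

// Hypothetical layout if a size had to be stored in front of every object.
struct SizePrefixed {
    std::size_t stored_size;  // bookkeeping the run-time system would need
    Payload payload;          // must still start on a Payload-aligned boundary
};

// Extra bytes paid per object for carrying the size around.
constexpr std::size_t overhead_bytes() {
    return sizeof(SizePrefixed) - sizeof(Payload);
}
```

On a typical 64-bit platform overhead_bytes() comes out as 8, doubling the footprint of an 8-byte object; the exact figure depends on the platform's size and alignment rules.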

Explain bug behaviour

Can you please explain what's going on in this buggy example:
Base base;
Derived* d = reinterpret_cast<Derived*>(&base);
d->method();
d->virtual_method();
//output: Derived-method() Base-virtual_method()
I would expect this code to behave the other way around. Probably the compiler shares a single memory layout for Base and Derived, and of course the vtable is common.
When d->method is invoked I would expect the compiler to say "I'm just calling method at offset 0 with respect to my pointer." The pointer points to a Base object, the only object around.
When d->virtual_method is invoked the compiler should say "I am going to resolve it through the vtable", and thus the Derived method should be called (though the Base object is the only one around, the layout extends to Derived).
So I am expecting to see:
//output: Base-method() Derived-virtual_method()
base is a Base object; you reinterpret its bytes as a Derived object and then attempt to use it as if it were a Derived object. The behavior when you do this is undefined. Your program might crash; it might appear to do the right thing; it might make your computer light on fire.
Note that it is never correct to use reinterpret_cast to cast up and down a class hierarchy. You must use static_cast or dynamic_cast (or, if you are converting to a base class, no cast may be necessary).
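A minimal sketch of the checked alternative, assuming Base and Derived look roughly like the classes below (the question does not show their definitions, so these are stand-ins, and is_really_derived is a hypothetical helper):

```cpp
#include <cassert>

struct Base {
    virtual ~Base() {}
    virtual const char* virtual_method() const { return "Base"; }
};

struct Derived : Base {
    const char* virtual_method() const override { return "Derived"; }
};

// Returns true only when p actually points at a Derived (or something
// derived from it); dynamic_cast yields a null pointer otherwise.
bool is_really_derived(Base* p) {
    return dynamic_cast<Derived*>(p) != nullptr;
}
```

For a plain Base object, dynamic_cast<Derived*> returns a null pointer instead of a pointer that lies about the object's type, so the mistake can be detected before any member is touched. static_cast is appropriate only when you already know the object really is a Derived.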
To explain why you see this particular behavior, though: when you call a nonvirtual member function (as you do with d->method(), assuming method is a nonvirtual member function of Derived), the function that gets called is determined at compile time, not at runtime.
Here, the compiler knows that d points to a Derived object (because you've lied to the compiler and said that it does), so it generates code that calls Derived::method(). There is no "offset with respect to a pointer" at all. No computation needs to be done because the function to be called is known when the program is compiled.
Only when you call a virtual member function is a table lookup required (and even then, the lookup is only required when the compiler doesn't know the dynamic type of the object on which the member function is being called).
When you call d->virtual_method(), Base::virtual_method gets called. Why? In this particular implementation of C++, the first few bytes of an object of a class type that has virtual member functions (a polymorphic class type) contain a tag (called a "vptr" or a "virtual table pointer") that identifies the actual type of the object. When you call a virtual member function, then at runtime that tag is inspected and the function that is called is selected based on that tag.
When you reinterpret base as a Derived object, you don't actually change the object itself, so its tag still states that it is a Base object, which is why Base::virtual_method gets called.
Remember, though, that all of this is simply what happens when you compile this code with a particular version of a particular compiler. The behavior is undefined and this is just one way that the undefined behavior can manifest itself.
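The same two rules (static type selects a nonvirtual function, dynamic type selects a virtual one) can be seen without any undefined behavior by going the legal direction, from a real Derived object to a Base reference. The class bodies below are assumptions made for illustration, since the question does not show them, and call_both is a hypothetical helper.

```cpp
#include <cassert>
#include <string>

struct Base {
    virtual ~Base() {}
    // Nonvirtual: chosen at compile time from the static type.
    std::string method() const { return "Base-method()"; }
    // Virtual: chosen at run time from the object's dynamic type.
    virtual std::string virtual_method() const { return "Base-virtual_method()"; }
};

struct Derived : Base {
    std::string method() const { return "Derived-method()"; }  // hides Base::method
    std::string virtual_method() const override { return "Derived-virtual_method()"; }
};

// The static type of b is Base, whatever object it actually refers to.
std::string call_both(const Base& b) {
    return b.method() + " " + b.virtual_method();
}
```

Passing a Derived object to call_both gives "Base-method() Derived-virtual_method()": the nonvirtual call follows the declared type of the reference, and the virtual call follows the object that is really there. The buggy example in the question is the mirror image of this, except that there the declared type is a lie.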
The compiler only allocates enough memory to hold the requested object. Base might be 20 bytes, and Derived might be an extra 10 bytes on top of that (so Derived is 30 bytes in size).
When you allocate 20 bytes for Base, and then (via Derived) access byte position 25, it's past the end of the allocated memory and (at best) you will get a crash.
The compiler cannot allocate 30 bytes for Base as you suggest, because not only would this be wasteful, but Derived could be implemented in a third party library and it may not even be known about when Base is being compiled.
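To make the 20 and 30 byte figures concrete, here is a small sketch with made-up members chosen to match those numbers (base_data, extra_data and extra_bytes are hypothetical; the real classes in the question are not shown):

```cpp
#include <cassert>
#include <cstddef>

struct Base {
    char base_data[20];   // stands in for whatever Base really holds
};

struct Derived : Base {
    char extra_data[10];  // members that exist only in Derived
};

// How many bytes of a Derived lie beyond the end of its Base part.
constexpr std::size_t extra_bytes() {
    return sizeof(Derived) - sizeof(Base);
}
```

A variable declared as Base gets only sizeof(Base) bytes of storage, so anything a Derived member function reads or writes in extra_data would fall outside the object.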