What's the issue with malloc() and virtual functions? [duplicate]


This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
C++: why is new needed?
Why can't I use malloc to allocate space for my objects when they are children of a class containing virtual functions? This is really frustrating. Is there a good reason?
The following program illustrates the problem. It segfaults on line 27, where I call aa->f()
#include <iostream>
#include <cstdlib>
class A
{
public:
virtual int f() {return 1;}
};
class B
{
public:
int f() {return 1;}
};
class Aa : public A {};
class Bb : public B {};
int main()
{
Aa* aa = (Aa*)malloc(sizeof(Aa));
Aa* aan = (Aa*)new Aa();
Bb* bb = (Bb*)malloc(sizeof(Bb));
std::cout << bb->f() << std::endl;
std::cout << aan->f() << std::endl;
std::cout << aa->f() << std::endl;
return 0;
}
Version info: g++ (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5

A common way to implement virtual functions is to store a hidden pointer to a "virtual table" (vtable) inside each object, typically at the start of the object. This table is needed to figure out which virtual function to call, and the pointer to it is set up during construction. This is why just malloc'ing space doesn't work.

malloc only allocates memory, but does not create an object. So, the line
Aa* aa = (Aa*)malloc(sizeof(Aa));
allocates a region of memory that is large enough to hold an Aa, but contains garbage. As others pointed out, this also means that the pointer to the vtable, which is required to dispatch calls to virtual functions, will not be set (I got that one from @David Rodríguez's comment on another answer). Since B does not contain virtual functions, no such problem arises. It would happen with B too, however, if B contained any data members that require initialization, such as this:
class B
{
public:
B() : foo(new int()) {}
int f() {return *foo;}
private:
int * foo;
};
The line
Aa* aan = (Aa*)new Aa();
can do without the cast:
Aa* aan = new Aa();

The reason is that malloc knows nothing about C++ constructors and consequently does not call them. You can call the constructors yourself using placement new:
int main()
{
Aa* aa = (Aa*)malloc(sizeof(Aa));
new(aa)Aa;
Aa* aan = (Aa*)new Aa();
Bb* bb = (Bb*)malloc(sizeof(Bb));
new(bb)Bb;
std::cout << bb->f() << std::endl;
std::cout << aan->f() << std::endl;
std::cout << aa->f() << std::endl;
aa->~Aa();
free(aa);
delete aan;
bb->~Bb();
free(bb);
return 0;
}
Note that you have to manually call the destructors before freeing such memory.

Don't use malloc, use new - malloc does not call constructors.
When you do A * a = new A(); the compiler will allocate memory, set up the vtable pointer for A and call the constructor. When you call a virtual function, the vtable is used to actually find the function.
When you do A * a = (A *) malloc(...); you only get raw memory, which will contain arbitrary data. No constructor runs, so the vtable pointer is never set. When you call a virtual function, it'll read the (garbage) vtable pointer and jump to some random location.
A class with virtual functions looks something like this internally:
struct Foo {
void * vtable;
int aClassMemberVar;
};
Calling a virtual function looks at the "hidden" vtable pointer, which points to the class vtable, an array of pointers to functions. So this vtable pointer must be initialized, and malloc doesn't do that.

Surely because the Virtual Function Table isn't getting created properly?

malloc does not call the constructor of the class, so your object is not initialized properly, hence it segfaults. Use new to allocate memory when using C++. BTW, there is no need to cast the pointer returned from new.

The good reason is called virtual tables. Objects of types that have virtual methods have a table of pointers pointing to the address of the actual virtual methods to be called. These are called virtual tables or v-tables.

Related

Why does this static C++ cast work?

Imagine this code:
class Base {
public:
virtual void foo(){}
};
class Derived: public Base {
public:
int i;
void foo() override {}
void do_derived() {
std::cout << i;
}
};
int main(){
Base *ptr = new Base;
Derived * static_ptr = static_cast<Derived*>(ptr);
static_ptr->i = 10; // Why does this work?
static_ptr->foo(); // Why does this work?
return 0;
}
Why do I get the result 10 on the console? I wonder because I thought ptr is a pointer to a Base object. Therefore the object doesn't contain an int i or the method do_derived(). Is a new Derived object automatically generated?
When I declare a virtual do_derived() method in the Base class too, then this one is chosen, but why?

int* i = new int{1};
delete i;
std::cout << *i << std::endl;
This will also "work", if the definition of working is that the code will compile and execute.
However, it is clearly undefined behavior and there are no guarantees as to what might happen.
In your case, the code compiles as static_cast won't perform any checks, it just converts the pointer. It is still undefined behavior to access memory that hasn't been allocated and initialized though.

As mentioned in the comments, "happens to do what you expected" is not the same as "works".
Let's make a few modifications:
#include <iostream>
#include <string>
class Base{
public:
virtual void foo(){
std::cout << "Base::foo" << std::endl;
}
};
class Derived: public Base{
public:
int a_chunk_of_other_stuff[1000000] = { 0 };
std::string s = "a very long string so that we can be sure we have defeated SSO and allocated some memory";
void foo() override {
std::cout << "Derived::foo" << std::endl;
}
void do_derived() {
std::cout << s << std::endl;
}
};
int main(){
Base *ptr = new Base;
Derived * static_ptr = static_cast<Derived*>(ptr);
static_ptr -> foo(); // does it though?
static_ptr -> do_derived(); // doesn't work?
static_ptr->a_chunk_of_other_stuff[500000] = 10; // BOOM!
return 0;
}
Sample Output:
Base::foo
Process finished with exit code 11
In this case, none of the operations did what we expected. The assignment into the array caused a segfault.

The statement:
Base *ptr = new Base;
doesn't always allocate exactly sizeof(Base) bytes; it will probably reserve more memory. Even if it does allocate exactly sizeof(Base) bytes, that doesn't necessarily mean that any byte access after this range (i.e. sizeof(Base)+n, n>1) would fault.
So let's assume the size of class Base is 4 bytes (due to the virtual table pointer in most compilers' implementations, on a 32-bit platform), but that the new operator, the heap-management API, the memory management of the OS, and/or the hardware actually sets aside 16 bytes for this allocation. The additional 12 bytes are then accessible, which lets the following statement run without a fault:
static_ptr->i = 10;
since it writes 4 bytes (sizeof(int), normally) right after the first 4 bytes (the size of the polymorphic class Base). It is still undefined behaviour; it merely happens not to crash.
The function call:
static_ptr->foo();
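is dispatched through the vtable, because foo is virtual. The vtable pointer stored in the object was set by Base's constructor, so the function that actually runs is Base::foo, not Derived::foo, whatever the static type of the pointer says. Neither version tries to access any data member of the derived class (or even of the base class), so the call goes through without touching memory the object doesn't own.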
Had you called:
static_ptr->do_derived();
which accesses the i member of Derived, it would still appear to work, since:
The function call itself goes through until the method tries to access a data member (i.e. reads something through the this pointer).
The data-member access happens to land in allocated memory (undefined behaviour).
Note that the following usually runs without crashing, even though it is formally undefined behaviour:
class Abc
{
public:
void foo() { cout << "Safe"; }
};
int main()
{
Abc* p = NULL;
p->foo(); // Safe
}
The call goes through because, in practice, it translates to:
foo(NULL);
where foo is:
void foo(Abc* p)
{
// doesn't read anything out of pointer!
}

why does this static cast work?
Because static_cast is only a compile-time check. There is a relationship between Base and Derived, and since that relationship exists, static_cast trusts it and trusts the programmer too. So as the programmer, you must make sure that a pointer to an object that is really a Base is never static_cast to a pointer to Derived.

Why the output is different from what I expect?

I got a simple program like this:
#include "stdafx.h"
#include <iostream>
using namespace std;
int main()
{
class B {
protected:
int data = 0;
public:
B() { cout << "B() ctor\n";}
virtual ~B() { cout << "~B()\n"; }
virtual void method() { cout << "data in B: " << data << "\n"; }
};
class A : public B
{
int dataA = 2;
public:
A() { cout << "A() ctor\n"; }
~A() { cout << "~A()\n"; }
void method() { cout << "data in A: " << dataA << "\n"; }
};
{
B* fptrList[]{ &B{}, &A{}};
for (auto& itr : fptrList)
itr->method();
}
cin.get();
return 0;
}
Here is a result I expect:
B() ctor
B() ctor
A() ctor
data in B: 0
data in A: 2
~A()
~B()
~B()
Here is the actual result when I ran this program:
B() ctor
~B()
B() ctor
A() ctor
~A()
~B()
data in B: 0
data in B: 0
My questions are:
Why is the output different from what I expect?
How can method() be called after ~A() and ~B() have been called?
Why does method() of class B get called twice?

This program cannot be explained, because it exhibits undefined behavior.
Translation: it's buggy. It takes the addresses of temporary objects and then attempts to dereference them after the temporaries have been destroyed.
A good C++ compiler will even tell you that the program is broken, and will refuse to participate in this disaster:
t.C: In function ‘int main()’:
t.C:26:27: error: taking address of temporary [-fpermissive]
B* fptrList[]{ &B{}, &A{}};
^
t.C:26:33: error: taking address of temporary [-fpermissive]
B* fptrList[]{ &B{}, &A{}};
^
Any output from this program is meaningless garbage.

Here is what's going on:
You initialize fptrList with the addresses of temporary objects of types B and A.
The temporaries get destroyed right after their addresses are taken, so your code has undefined behavior.
The proper way of doing what you are trying to do is to use operator new with smart pointers, or to create the instances outside the initializer.
Here is one possible fix:
{
B b;
A a;
B* fptrList[]{ &b, &a };
for (auto& itr : fptrList)
itr->method();
}

OK, it is undefined behavior, but it is still interesting to ask why this particular behavior shows up.
Why are the constructors/destructors called in this order? As already established, you create temporary objects, which are created and destroyed one after the other.
Why can I call methods of objects that no longer exist? Your temporary objects live on the stack, and that memory is only released at the end of the main function, so you are still able to access it and it does not get clobbered by calls to other functions (e.g. printing to the terminal). If you created the object with new, then deleted it and then tried to use it, the chances would be higher that the system had already reclaimed the memory and you would get a segmentation fault.
Why do I see the method of B called twice? This one is funny. To call a virtual function of an object, the compiler delegates the decision about which method exactly should be called to a virtual table (its address occupies the first 8 bytes of such an object, at least for my compiler on 64-bit). A not so well-known detail about virtual methods is that during a destructor call, virtual calls on the object resolve to the class whose destructor is currently running, as if they were not virtual. But what does that have to do with your code? You see its side effect: this behavior is ensured in the destructor by overwriting the virtual table pointer of the current object with that of the current class. So after the destructor of B has run, the memory contains the virtual table pointer of class B, which you can see because B::method is called twice.
Let's track the value of the virtual table pointer in your program:
call of A{}: At first the constructor of the superclass B is called, and the (not yet fully constructed) object has the virtual table of class B (this address is moved into the first 8 bytes occupied by the object). Then the A constructor is called, and now the object has the virtual table of class A.
call of ~A(): after its execution, the destructor of A automatically calls the destructor of B. The first thing the destructor of B does is to overwrite the virtual table pointer of the object with the virtual table of class B.
So after the destruction the memory is still there and is interpreted as an object that has the virtual table of class B.
itr->method(); finds the virtual table of class B at the address itr points to and calls B::method().

Deleting polymorphic objects and memory leaks

Suppose I have a class A and a class B which inherits from A.
Then I do something like:
A* a = new B();
delete a;
Why does a memory leak happen only when there is a dynamic memory allocation within B?
How does C++ know to remove the "B part" when there is no dynamic memory allocation within B, but fail when there is one?
[Update:]
Why does the following code not result in a memory leak? [I suspect it is undefined behaviour, but I don't really understand it :(]
#include <iostream>
using std::cout;
using std::endl;
class A {
public:
virtual void f() {
cout << "1" << endl;
}
~A() {
cout<< "A A'tor" << endl;
}
};
class B : public A {
private:
public:
B(){}
~B(){
cout<< "B D'tor" << endl;
}
void f() {
cout << "2" << endl;
}
};
int main() {
A* a = new B();
//a->f();
delete a;
return 0;
}

When the compiler sees the statement "delete a;", it knows only that the static type of "a" is pointer to A. If there is no virtual destructor in class A, only A's destructor gets called, which leads to a memory leak whenever B owns resources (formally, the behavior is undefined).
If there is a virtual destructor in class A, the destructor call is dispatched at run time through the vtable of the object's dynamic type, so B's destructor is found and called.

This is because the destructor isn't virtual. If there is ever a chance in your program to delete derived class objects via base class pointers, you should define a virtual destructor in the base class. This will allow B's destructor to get called first, allowing you to free any dynamic memory B may have allocated. A's destructor will get called after B's.
Another good rule of thumb is to train yourself to think about making your destructor virtual anytime you make a method virtual. Since the motivation to make a method virtual means you will be calling methods on base class pointers, it follows that you will likely be deleting objects through base class pointers.

function call via class pointer in c++

#include"iostream"
using namespace std;
class base{
public:
void f()
{
cout<<"base f:"<<endl; // prints base f:
}
};
int main()
{
base *b; // even same out put with " base *b =NULL; "
b->f();
return 0;
}
O/p : base f:
Can anyone please explain how the function gets called without an object being assigned to the pointer?
Thanks.

Calling a member function through an uninitialized (or null) pointer to an object is undefined behaviour. However, it may appear to work, since there is no attempt to access member variables of the object and there is no vtable here. You can think of this function as
void f_base(base* p)
{
cout << "base f:" << endl;
}
Nothing is accessed through the pointer, so nothing faults. This tends to work on current compilers, but that can change at any time.

This is invalid code, but since nothing in base::f() accesses a member variable, no invalid memory is getting touched.
If you add a member and try to print it out in the function, you will almost undoubtedly get a crash.

You need to use new
i.e.
base *b; // even same out put with " base *b =NULL; "
should be
base *b = new base;
... and you then need a matching delete to prevent a memory leak.

C++ Debug Assertion Fails Only With VPTR

I'm wondering why I get an exception on the delete part in one case here, but not in the other.
No exception case
#include <iostream>
using namespace std;
class A
{
public:
~A() { cout << "A dtor" << endl; }
};
class B : public A
{
public:
int x;
~B() { cout << "B dtor" << endl; }
};
A* f() { return new B; }
int _tmain(int argc, _TCHAR* argv[])
{
cout << sizeof(B) << " " << sizeof(A) << endl;
A* bptr= f();
delete bptr;
}
Here the output is 4 1 .. A dtor, since A has 1 byte for identity and B has 4 because of int x.
Exception case
#include <iostream>
using namespace std;
class A
{
public:
~A() { cout << "A dtor" << endl; }
};
class B : public A
{
public:
virtual ~B() { cout << "B dtor" << endl; }
};
A* f() { return new B; }
int _tmain(int argc, _TCHAR* argv[])
{
cout << sizeof(B) << " " << sizeof(A) << endl;
A* bptr= f();
delete bptr;
}
Here the output is 4 1 .. A dtor, since A has 1 byte for identity and B has 4 because of the vptr that's needed for its virtual destructor.
But then a debug assertion fails inside the delete call (_BLOCK_TYPE_IS_VALID).
Environment
I'm running Windows 7 with Visual Studio 2010 SP1Rel.

See this post
A quick summary:
You are telling the machine to delete an instance of A
As this is a class which we call through pointer/reference maybe we should use a virtual table (VT)?
There are no virtual members in A, thus no VT is used
We call the standard destructor of A…
Bang! We are trying to delete class A but it happens that the pointer
has led us to an object of B which contains a VT which A didn't know of.
sizeof(A) is 1 (as AFAIK it’s not legal to have size equal 0) and
sizeof(B) is 4 (due to presence of VT). We wish to delete 1 byte, but
there is a block of 4 bytes. Due to DEBUG heap monitoring, the error
was caught.
The solution of course is to declare the base class's (A's) dtor as virtual so B's dtor will always be called.
EDIT: For the first case, here's what the standard has to say:
§5.3 In the first alternative (delete object), if the static type of the object to be deleted is different from its
dynamic type, the static type shall be a base class of the dynamic type of the object to be deleted and the
static type shall have a virtual destructor or the behavior is undefined. In the second alternative (delete
array) if the dynamic type of the object to be deleted differs from its static type, the behavior is undefined.
So both cases lead us to the realm of undefined behavior which of course differs from one implementation to the other. But it stands to reason that for most implementations the first case is easier to handle or at least easier to contemplate than the second which is just an esoteric anti-pattern.

As others have pointed out, you are deleting an object whose static type is different from its dynamic type, and since the static type doesn't have a virtual destructor, you get undefined behavior. This includes the behavior of sometimes working and sometimes not working as you are seeing. However, I think you are interested in a little deeper understanding of what is happening with your particular compiler.
Class A has no members at all, so its data layout ends up looking like this:
struct A {
};
Since class B derives from class A, class A becomes embedded within B. When class B has no virtual functions, the layout ends up looking like this:
struct B {
A __a_part;
int x;
};
The compiler can convert a B* to an A* by just taking the address of __a_part, as if the compiler had a function like this:
A* convertToAPointer(B* bp) { return &bp->__a_part; }
Since __a_part is the first member of B, the B* and the A* point to the same address.
Code like this:
A* bptr = new B;
delete bptr;
Is effectively doing something like this:
// Allocate a new B
void* vp1 = allocateMemory(sizeof(B));
B* bp = static_cast<B*>(vp1);
bp->B(); // assume for a second that this was a legal way to construct
// Convert the B* to an A*
A* bptr = &bp->__a_part;
// Deallocate the A*
void* vp2 = bptr;
deallocateMemory(vp2);
In this case, vp2 and vp1 are the same. The system is allocating and deallocating the same memory address, so the program runs without an error.
When class B has a virtual member function (the destructor in this case), the compiler adds a virtual table pointer, so class B ends up looking like this:
struct B {
B_vtable* __vptr;
A __a_part;
};
The issue here is that __a_part is no longer the first member, and the convertToAPointer operation will now change the address of the pointer, so vp2 and vp1 no longer point to the same address. Since a different memory location is being deallocated than the one that was allocated, you get the error.
