I am currently debugging a crashlog. The crash occurs because the vtable pointer of a C++ object is 0x1, while the rest of the object seems to be OK as far as I can tell from the crashlog.
The program crashes when it tries to call a virtual method.
My question: Under what circumstances can a vtable pointer become null? Does operator delete set the vtable pointer to null?
This occurs on OS X using gcc 4.0.1 (Apple Inc. build 5493).
Could be a memory trample: something writing over that vtable pointer by mistake. There are countless ways to "achieve" this in C++. A buffer overflow, for example.
Any kind of undefined behaviour you have may lead to this situation. For example:
Errors in pointer arithmetic, or anything else that makes your program write into invalid memory.
Uninitialized variables, invalid casts...
Treating an array polymorphically might cause this as a secondary effect.
Trying to use an object after delete.
See also the questions "What's the worst example of undefined behaviour actually possible?" and "What are all the common undefined behaviour that a C++ programmer should know about?".
Your best bet is to use a bounds and memory checker, as an aid to heavy debugging.
A very common case: trying to call a pure virtual method from the constructor...
Constructors
struct Interface
{
Interface();
virtual void logInit() const = 0;
};
struct Concrete : Interface
{
virtual void logInit() const { std::cout << "Concrete" << std::endl; }
};
Now, suppose the following implementation of Interface()
Interface::Interface() {}
Then everything is fine:
Concrete myConcrete;
myConcrete.logInit(); // outputs "Concrete"
It's such a pain to call logInit after the constructor, it would be better to factorize the code, right?
Interface::Interface() { this->logInit(); } // DON'T DO THAT, REALLY ;)
Then we can do it in one line!!
Concrete myConcrete; // CRASHES VIOLENTLY
Why ?
Because the object is built bottom up. Let's look at it.
Instructions to build a Concrete class (roughly)
Allocate enough memory (of course), and enough memory for the _vtable too (1 function pointer per virtual function, usually in the order they are declared, starting from the leftmost base)
Call Concrete constructor (the code you don't see)
a> Call Interface constructor, which initializes the _vtable with its pointers
b> Call Interface constructor's body (you wrote that)
c> Override the pointers in the _vtable for those methods Concrete override
d> Call Concrete constructor's body (you wrote that)
So what's the problem ? Well, look at b> and c> order ;)
When you call a virtual method from within a constructor, it doesn't do what you're hoping for. It does go to the _vtable to lookup the pointer, but the _vtable is not fully initialized yet. So, for all that matters, the effect of:
D() { this->call(); }
is in fact:
D() { this->D::call(); }
When calling a virtual method from within a constructor, you don't get the full dynamic type of the object being built; you get the static type of the constructor currently running.
In my Interface / Concrete example, it means the Interface type, and the method is pure virtual, so the _vtable does not hold a real pointer (0x0 or 0x01 for example, if your compiler is friendly enough to set up debug values to help you there).
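The rule can be observed safely with a non-pure virtual function, so nothing crashes. This is a minimal sketch; Base, Derived, who() and seenInCtor are hypothetical names, not part of the Interface / Concrete example.

```cpp
#include <cassert>
#include <string>

// Hypothetical types for illustration only.
struct Base {
    std::string seenInCtor;  // what who() returned while Base() was running

    Base() { seenInCtor = who(); }  // resolves to Base::who(), never Derived::who()
    virtual ~Base() {}
    virtual std::string who() const { return "Base"; }
};

struct Derived : Base {
    virtual std::string who() const { return "Derived"; }
};

// Checks both halves of the rule: inside the base constructor the
// base version ran, after construction the override is used.
void checkConstructorDispatch() {
    Derived d;
    assert(d.seenInCtor == "Base");
    assert(d.who() == "Derived");
}
```

If who() were pure virtual in Base, the call in the constructor would have no function to land on, which is the crash scenario described above.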
Destructors
Coincidentally, let's examine the Destructor case ;)
struct Interface { ~Interface(); virtual void logClose() const = 0; };
Interface::~Interface() { this->logClose(); }
struct Concrete : Interface { ~Concrete(); virtual void logClose() const; char* m_data; };
Concrete::~Concrete() { delete[] m_data; } // It's all about being clean
void Concrete::logClose()
{
std::cout << "Concrete referring to " << m_data << std::endl;
}
So what happens at destruction? The same rule applies in reverse. By the time Interface's destructor runs, the Concrete part of the object has already been destroyed, so the call resolves against Interface, where logClose is pure virtual: undefined behavior again. And even if the call did reach Concrete::logClose, it would read m_data after Concrete's destructor had already deleted it ;)
Conclusion
Never ever call virtual methods from within constructors or destructors.
If it's not that, you're left with a memory corruption, tough luck ;)
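One common way to follow the advice about constructors is to finish construction first and make the virtual call afterwards, from a helper. This is only a sketch: makeLogged and name() are hypothetical, and the Interface / Concrete pair here is a simplified variant of the one above that returns a string instead of printing.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Interface {
    virtual ~Interface() {}
    virtual std::string name() const = 0;
};

struct Concrete : Interface {
    virtual std::string name() const { return "Concrete"; }
};

// Hypothetical helper: construct first, then make the virtual call.
template <typename T>
T makeLogged(std::vector<std::string>& log) {
    T object;                      // fully constructed at this point
    log.push_back(object.name());  // safe: the dynamic type is already T
    return object;
}

void checkMakeLogged() {
    std::vector<std::string> log;
    Concrete c = makeLogged<Concrete>(log);
    assert(log.size() == 1 && log[0] == "Concrete");
    assert(c.name() == "Concrete");
}
```

The virtual call now happens after every constructor in the hierarchy has run, so it reaches the most derived override.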
My first guess would be that some code is memset()'ing a class object.
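If memset is the suspect, one way to catch it at compile time is to route zeroing through a guarded helper. This is a sketch assuming a C++11 compiler (newer than the gcc 4.0.1 mentioned in the question); zeroObject, PlainData and Polymorphic are hypothetical names.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical helper: refuses to compile for types that memset could corrupt.
template <typename T>
void zeroObject(T& object) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memset would overwrite non-trivial state such as a vtable pointer");
    std::memset(&object, 0, sizeof(T));
}

struct PlainData {      // no virtual functions: zeroing is fine
    int id;
    int count;
};

struct Polymorphic {    // has a vtable pointer: zeroObject(Polymorphic&) will not compile
    virtual ~Polymorphic() {}
    int id;
};

void checkZeroObject() {
    PlainData p = {7, 3};
    zeroObject(p);
    assert(p.id == 0 && p.count == 0);
}
```

Replacing raw memset calls with such a helper turns a silent vtable overwrite into a build error.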
This is totally implementation dependent. However, it would be quite safe to assume that after delete some other operation may overwrite the memory with null or other values.
Other possibilities include overwrite of the memory by some loose pointer -- actually in my case it's almost always this...
That said, you should never try to use an object after delete.
I was studying virtual functions and pointers. The code below made me wonder: why does one need virtual functions when we can type cast a base class pointer the way we want?
class baseclass {
public:
void show() {
cout << "In Base\n";
}
};
class derivedclass1 : public baseclass {
public:
void show() {
cout << "In Derived 1\n";
}
};
class derivedclass2 : public baseclass {
public:
void show() {
cout << "In Derived 2\n";
}
};
int main(void) {
baseclass * bptr[2];
bptr[0] = new derivedclass1;
bptr[1] = new derivedclass2;
((derivedclass1*) bptr)->show();
((derivedclass2*) bptr)->show();
delete bptr[0];
delete bptr[1];
return 0;
}
Gives same result if we use virtual in base class.
In Derived 1
In Derived 2
Am I missing something?
Your example appears to work, because there is no data, and no virtual methods, and no multiple inheritance. Try adding int value; to derivedclass1, const char *cstr; to derivedclass2, initialize these in corresponding constructors, and add printing these to corresponding show() methods.
You will see show() print a garbage value (if you cast the pointer to derivedclass1 when the object is not one), crash (if you cast the pointer to derivedclass2 when the object is in fact not of that type), or otherwise behave oddly.
C++ class member functions AKA methods are nothing more than functions, which take one hidden extra argument, this pointer, and they assume that it points to an object of right type. So when you have an object of type derivedclass1, but you cast a pointer to it to type derivedclass2, then what happens without virtual methods is this:
method of derivedclass2 gets called, because well, you explicitly said "this is a pointer to derivedclass2".
the method gets pointer to actual object, this. It thinks it points to actual instance of derivedclass2, which would have certain data members at certain offsets.
if the object actually is a derivedclass1, that memory contains something quite different. So if method thinks there is a char pointer, but in fact there isn't, then accessing the data it points to will probably access illegal address and crash.
If you instead use virtual methods, and have a pointer to a common base class, then when you call a method, the compiler generates code to call the right method. It actually inserts code and data (a table filled with virtual method pointers, usually called the vtable, one per class, and a pointer to it, one per object/instance) with which it knows to call the right method. So whenever you call a virtual method, it's not a direct call; instead, the object has an extra pointer to the vtable of its real class, which tells which method should really be called for that object.
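The mechanism can be imitated by hand. The following is a rough emulation, not what any particular compiler literally emits; Object, VTable, callShow and the table names are all hypothetical.

```cpp
#include <cassert>
#include <string>

struct Object;

// One slot per virtual function.
struct VTable {
    std::string (*show)(const Object*);
};

// Every object carries one pointer to the table of its real class.
struct Object {
    const VTable* vptr;
};

std::string showDerived1(const Object*) { return "In Derived 1"; }
std::string showDerived2(const Object*) { return "In Derived 2"; }

// One table per "class".
const VTable kDerived1Table = { showDerived1 };
const VTable kDerived2Table = { showDerived2 };

// What a virtual call boils down to: load the table, call through the slot.
std::string callShow(const Object* object) {
    return object->vptr->show(object);
}

void checkHandRolledVtable() {
    Object first = { &kDerived1Table };
    Object second = { &kDerived2Table };
    assert(callShow(&first) == "In Derived 1");
    assert(callShow(&second) == "In Derived 2");
}
```

The caller never names the concrete type; the object itself carries the information, which is exactly what a cast cannot provide.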
In summary, type casts are in no way an alternative to virtual methods. And, as a side note, every type cast is a place to ask "Why is this cast here? Is there some fundamental problem with this software, if it needs a cast here?". Legitimate use cases for type casts are quite rare indeed, especially with OOP objects. Also, never use C-style type casts with object pointers, use static_cast and dynamic_cast if you really need to cast.
If you use virtual functions, your code calling the function doesn't need to know about the actual class of the object. You'd just call the function blindly and correct function would be executed. This is the basis of polymorphism.
Type-casting is always risky and can cause run-time errors in large programs.
Your code should be open for extension but closed for modifications.
Hope this helps.
You need virtual functions where you don't know the derived type until run-time (e.g. when it depends on user input).
In your example, you have hard-coded casts to derivedclass2 and derivedclass1. Now what would you do here?
void f(baseclass * bptr)
{
// call the right show() function
}
Perhaps your confusion stems from the fact that you've not yet encountered a situation where virtual functions were actually useful. When you always know exactly at compile-time the concrete type you are operating on, then you don't need virtual functions at all.
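With a virtual show(), the body of f becomes a one-liner. This is a sketch that reuses the class names from the question but changes two things as assumptions: show() returns a string instead of printing, and the base class gets a virtual destructor.

```cpp
#include <cassert>
#include <string>

class baseclass {
public:
    virtual ~baseclass() {}
    virtual std::string show() const { return "In Base"; }
};

class derivedclass1 : public baseclass {
public:
    virtual std::string show() const { return "In Derived 1"; }
};

class derivedclass2 : public baseclass {
public:
    virtual std::string show() const { return "In Derived 2"; }
};

// f does not know, and does not need to know, the concrete type.
std::string f(const baseclass* bptr) {
    return bptr->show();
}

void checkF() {
    derivedclass1 d1;
    derivedclass2 d2;
    assert(f(&d1) == "In Derived 1");
    assert(f(&d2) == "In Derived 2");
}
```

Without virtual, f would always return "In Base", and no cast inside f could fix that without knowing the type in advance.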
Two other problems in your example code:
Use of a C-style cast instead of a C++-style dynamic_cast (of course, you usually don't need to cast anyway when you use virtual functions for the problem they are designed to solve).
Treating arrays polymorphically. See Item 3 in Scott Meyers' More Effective C++ book ("Never treat arrays polymorphically").
Consider the following setup.
class I
{
public:
virtual void F() = 0;
};
class A : public I
{
public:
void F() { /* some implementation */ }
};
class B : public I
{
public:
void F() { /* some implementation */ }
};
This allows me to write a function like the following.
std::shared_ptr<I> make_I(bool x)
{
if (x) return std::make_shared<A>();
else return std::make_shared<B>();
}
In this situation, I am paying some costs for the inheritance and polymorphism, namely having a vtable and calls to F that can't be inlined when used as follows (correct me if I'm wrong).
auto i = make_I(false);
i->F(); //can't be inlined
What I want to know is if I have to pay these same costs when using A or B as objects allocated on the stack, like in the following code.
A a;
a.F();
Do A and B have vtables when allocated on the stack? Can the call to F be inlined?
It seems to me that a compiler could feasibly create two memory layouts for classes in an inheritance hierarchy - one for the stack and one for the heap. Is this what a C++ compiler will/may do? Or is there a theoretical or practical reason it can't?
Edit:
I saw a comment (that looks like it was deleted) that actually raised a good point. You could always do the following, and then that A a was allocated on the stack might not be the salient point I'm trying to get at...
A a;
A* p = &a;
p->F(); //likely won't be inlined (correct me if I'm wrong)
Maybe a better way to phrase it would be "Is the behavior different for an object that is allocated on the stack and is used as a 'regular value type'?" Please help me out here with the terminology if you know what I mean but have a better way of putting it!
The point I'm trying to get at is that you could feasibly, at compile time, "flatten" the definition of the base class into the derived class you are allocating an instance of on the stack.
I think your question really has to do with whether a compiler has static knowledge of an object and can elide the vtable lookup (you mentioned this in your edit), rather than whether there is a distinction on where the object lives - stack or heap. Yes, many compilers can elide the virtual dispatch in that case.
The edit to your question asks whether you can flatten the definition of the base class, I, into a derived class such as B. If the compiler can tell, at compile time, that an object will only ever be an instance of B, then it can eliminate the vtable lookup at runtime and call B::F() directly for that particular call.
For example, the compiler will probably eliminate the vtable lookup at runtime below and call the derived function:
B b;
b.F();
In the code below, the compiler will not be able to eliminate the runtime lookup in doSomething (unless it inlines the call), but it probably can eliminate the lookup in b.F()
void doSomething( I* object ) {
object->F(); // will involve a vtable lookup
}
B b;
b.F(); // probably won't need a vtable lookup
doSomething( &b );
Note it does not matter whether object is allocated on the stack or the heap. What matters is that the compiler is able to determine the type. Each class will still have a vtable, it just might not always be needed for each method call.
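The two call shapes can be put side by side. This is a sketch using the I / A / B hierarchy from the question, with F() changed to return an int as an assumption so the result is observable; callThroughBase and callOnLocal are hypothetical names. Whether a given compiler actually devirtualizes either call depends on its optimizer.

```cpp
#include <cassert>
#include <type_traits>

class I {
public:
    virtual ~I() {}
    virtual int F() const = 0;
};

class A : public I {
public:
    int F() const { return 1; }
};

class B : public I {
public:
    int F() const { return 2; }
};

// The object has a vtable pointer wherever it is allocated.
static_assert(std::is_polymorphic<B>::value, "B is polymorphic on stack and heap alike");

// Dynamic type unknown here: normally a real virtual call.
int callThroughBase(const I& object) { return object.F(); }

// Dynamic type is provably B: a compiler may call B::F directly or inline it.
int callOnLocal() {
    B b;
    return b.F();
}

void checkCalls() {
    A a;
    B b;
    assert(callThroughBase(a) == 1);
    assert(callThroughBase(b) == 2);
    assert(callOnLocal() == 2);
}
```

Both paths produce the same result; only the cost of getting there may differ.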
You mention code inlining, this is not related to how the object is allocated. When a normal function is called, variables will be pushed onto the stack along with a return address. The CPU will then jump to the function. With inline code, the site of the function call is replaced with the actual code (similar to a macro).
If an object contained in an inheritance hierarchy is allocated on the stack, the compiler still needs to be able to determine what functions it can call, especially if there are virtual and non-virtual functions.
I'm pretty sure this is dangerous code. However, I wanted to check to see if anyone had an idea of what exactly would go wrong.
Suppose I have this class structure:
class A {
protected:
int a;
public:
A() { a = 0; }
int getA() { return a; }
void setA(int v) { a = v; }
};
class B: public A {
protected:
int b;
public:
B() { b = 0; }
};
And then suppose I want to have a way of automatically extending the class like so:
class Base {
public:
virtual ~Base() {}
};
template <typename T>
class Test: public T, public Base {};
One really important guarantee that I can make is that neither Base nor Test will have any other member variables or methods. They are essentially empty classes.
The (potentially) dangerous code is below:
int main() {
B *b = new B();
// dangerous part?
// forcing Test<B> to point to an address of type B
Test<B> *test = static_cast<Test<B> *>(b);
//
A *a = dynamic_cast<A *>(test);
a->setA(10);
std::cout << "result: " << a->getA() << std::endl;
}
The rationale for doing something like this is that I'm using a class similar to Test, but for it to work currently, a new instance of Test<T> necessarily has to be made, copying the T instance passed in. It would be really nice if I could just point Test to T's memory address.
If Base did not add a virtual destructor, and since nothing gets added by Test, I would think this code is actually okay. However, the addition of the virtual destructor makes me worried that the type info might get added to the class. If that's the case, then it would potentially cause memory access violations.
Finally, I can say this code works fine on my computer/compiler (clang), though this of course is no guarantee that it's not doing bad things to memory and/or won't completely fail on another compiler/machine.
The virtual destructor Base::~Base will be called when you delete the pointer. Since B doesn't have the proper vtable (none at all in the code posted here) that won't end very well.
It only works in this case because you have a memory leak, you're never deleting test.
Your code produces undefined behaviour, as it violates strict aliasing. Even if it did not, you are invoking UB, as neither B nor A are polymorphic classes, and the object pointed to is not a polymorphic class, therefore dynamic_cast cannot succeed. You're attempting to access a Base object that does not exist to determine the runtime type when using dynamic_cast.
One really important guarantee that I can make is that neither Base
nor Test will have any other member variables or methods. They are
essentially empty classes.
It's not important at all; it's utterly irrelevant. The Standard would have to mandate EBO for this to even begin to matter, and it doesn't.
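A quick check shows why the "empty classes" guarantee does not make the layouts match. This sketch repeats the class definitions from the question; testIsLargerThanB is a hypothetical helper, and the size comparison reflects mainstream ABIs rather than anything the Standard promises.

```cpp
#include <cassert>
#include <type_traits>

class A {
protected:
    int a;
public:
    A() : a(0) {}
};

class B : public A {
protected:
    int b;
public:
    B() : b(0) {}
};

class Base {
public:
    virtual ~Base() {}
};

template <typename T>
class Test : public T, public Base {};

// B has no virtual functions, Test<B> inherits one from Base.
static_assert(!std::is_polymorphic<B>::value, "B has no vtable pointer");
static_assert(std::is_polymorphic<Test<B> >::value, "Test<B> has a vtable pointer");

// On mainstream ABIs the vtable pointer makes Test<B> larger than B,
// so a B object cannot simply be viewed as a Test<B>.
bool testIsLargerThanB() {
    return sizeof(Test<B>) > sizeof(B);
}
```

A B allocated with new B() therefore has neither the size nor the vtable pointer that code working on a Test<B>* expects to find.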
As long as you perform no operations with Test<B>* and avoid any magic like smart pointers or automatic memory management, you should be fine.
You should be sure to look for obscured code like debug prints or logging that will inspect the object. I've had debuggers crash on me for attempting to look into a value of a pointer set up like this. I will bet this will cause you some pain, but you should be able to make it work.
I think the real problem is maintenance. How long will it be before some developer does an operation on Test<B>*?
Can I safely call virtual functions after using static_cast on a polymorphic class in situations like the following code, or is it UB?
#include <iostream>
class Base
{
public:
virtual void foo() { std::cout << "Base::foo() \n"; }
};
class Derived : public Base
{
public:
virtual void foo() { std::cout << "Derived::foo() \n"; }
};
int main()
{
Base* derived = new Derived;
Derived* _1 = static_cast<Derived*>(derived);
_1->foo();
}
Yes, you can. Although I don't see the point of doing that in your specific example. Just calling it as
derived->foo();
without any casts would have produced exactly the same effect. I.e. some sort of static_cast in that case would be performed implicitly by the virtual call mechanism.
Note that your static_cast does not in any way suppress the "virtual" nature of the call.
That actually makes me wonder what your question is really about. Why would you even ask about it? What are you trying to do? Is your code sample really representative of what you are trying to do?
If the compiler allows you to static_cast and at run-time the dynamic type of the object is as expected, then yes, you can. The question is why do you want to do that...
Yes, but as others have said, you don't need to cast the pointer to the derived type to call virtual functions.
However, it is usually safer to use dynamic_cast when dealing with inherited classes. If the type information is wrong at runtime, dynamic_cast reports it: a failed pointer cast yields a null pointer, and a failed reference cast throws std::bad_cast.
Derived* d = dynamic_cast<Derived*>(derived); //safer, but still unnecessary in this situation
As written, that's going to work, basically because derived actually points to a Derived object. So, all the cast is doing is telling the compiler what you already know. Then again, even without the static cast, you'll just end up with Derived::foo in your output. So, this is somewhat pointless. Still, you might need to do this in a situation where you're absolutely sure you know the actual instanced type of your variable and you need to access some non-virtual members for some reason. If you're using a badly designed class library, for instance...
But, in general, static downcasts are a bad idea. You might end up trying to downcast a variable that isn't a Derived*, in which case, calling virtual (or non-virtual) functions (or, in fact, using that pointer for almost any non-trivial operation) results in Undefined Behavior.
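When the dynamic type is not certain, a checked downcast avoids the undefined behavior. A minimal sketch follows; Other and asDerived are hypothetical, and foo() returns a string here instead of printing.

```cpp
#include <cassert>
#include <string>

class Base {
public:
    virtual ~Base() {}
    virtual std::string foo() const { return "Base::foo()"; }
};

class Derived : public Base {
public:
    virtual std::string foo() const { return "Derived::foo()"; }
};

class Other : public Base {};  // a sibling that is not a Derived

// Returns a null pointer when the object is not actually a Derived.
const Derived* asDerived(const Base* object) {
    return dynamic_cast<const Derived*>(object);
}

void checkAsDerived() {
    Derived d;
    Other o;
    assert(asDerived(&d) == &d);  // correct guess: usable pointer
    assert(!asDerived(&o));       // wrong guess: null, not undefined behavior
}
```

With static_cast, the second case would hand back a pointer that must not be used; with dynamic_cast, the caller gets a value it can test.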
I have a class that derives from a C struct. The class does not do anything special, other than initialization in the constructor, deinitialization function during the destructor, and a few other methods that call into C functions. Basically, it's a run-of-the-mill wrapper. Using GCC, it complained that my destructor was not virtual, so I made it that. Now I run into segfaults.
/* C header file */
struct A
{
/* ... */
};
// My C++ code
class B : public A
{
public:
B() { /* ... init ... */ }
virtual ~B() { /* ... deinit ... */ }
void doSomething() // renamed: "do" is a reserved keyword
{
someCFunction(static_cast<A *>(this));
}
};
I was always under the assumption that the static_cast would return the correct pointer to the base class, pruning off the virtual table pointer. So this may not be the case, since I get a segfault in the C function.
By removing the virtual keyword, the code works fine, except that I get a gcc warning. What is the best work around for this? Feel free to enlighten me :).
Both the explicit and implicit conversion to A* are safe. There is neither need for an explicit cast, nor is it going to introduce vtables anywhere, or anything like that. The language would be fundamentally unusable if this were not the case.
I was always under the assumption that the static_cast would return
the correct pointer to the base class, pruning off the virtual table
pointer.
Is absolutely correct.
The destructor needs to be virtual only if delete ptr; is called where ptr has type A*, or if the destructor is invoked manually through such a pointer. And it would be A's destructor that would have to be virtual, which it isn't.
Whatever the problem is in your code, it has nothing to do with the code shown. You need to expand your sample considerably.
The destructor of base classes should be virtual. Otherwise, there's a chance that you run into undefined behavior. This is just a speculation, as the code is not enough to tell the actual reason.
Try making the destructor of A virtual and see if it crashes.
Note that a class and a struct are the same thing, other than default access level, so the fact that one's a class and the other a struct has nothing to do with it.
EDIT: If A is a C-struct, use composition instead of inheritance - i.e. have an A member inside of B instead of extending it. There's no point of deriving, since polymorphism is out of the question.
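The composition approach could look roughly like this. The struct A and someCFunction below are stand-ins invented for the sketch, since the real C header is not shown; run() and value() are hypothetical names as well.

```cpp
#include <cassert>

// Stand-ins for the C header (assumptions, not the real API).
struct A {
    int value;
};
inline void someCFunction(A* a) { a->value = 42; }

// B owns an A instead of deriving from it.
class B {
public:
    B() : a_() {}       // value-initializes the C struct to zero
    virtual ~B() {}     // virtual is harmless here: A's layout is untouched

    void run() { someCFunction(&a_); }  // passes a plain A*, no cast involved
    int value() const { return a_.value; }

private:
    A a_;
};

void checkComposition() {
    B b;
    assert(b.value() == 0);
    b.run();
    assert(b.value() == 42);
}
```

The C code only ever sees the address of a genuine A, so nothing about B's vtable pointer can leak into it.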
That's not how static_cast works. A pointer to an object continues to be the same pointer, just with a different type. In this case, you're converting a pointer to a derived type (B) into a pointer to the base type (A).
My guess is that casting the pointer does not actually change the pointer value, i.e., it's still pointing to the same memory address, even though it's been cast into an A* pointer type. Remember that struct and class are synonyms in C++.
As @Luchian stated, if you're mixing C and C++, it's better to keep the plain old C structs (and their pointers) as plain old C structs, and use type composition instead of inheritance. Otherwise you're mixing different pointer implementations under the covers. There is no guarantee that the internal arrangement of the C struct and the C++ class are the same.
UPDATE
You should surround the C struct declaration with an extern "C" specification, so that the C++ compiler knows that the struct is a pure C struct:
extern "C"
{
struct A
{
...
};
}
Or:
extern "C"
{
#include "c_header.h"
}