Reading the virtual function table (vtable) pointer?


Is there a well-defined way of accessing the vtable of a class? When debugging in Visual Studio I can expand 'this' like: this->_ptr->__vfptr. But this path does not seem to be available from code.
I need this for a unit test of a custom heap implementation (embedded environment).
Background
We had a bug where an object allocated on our custom heap (which is nothing more than an array of a certain size) worked as expected until we added an object with a virtual function (it took quite some time before we realized that this addition was the cause of the problem). The mistake we made was to assign an object to memory where no object had been constructed before the assignment. We did not pay much attention when writing that code, and since it worked with everything else and was tested, we considered it working. Here's some sample code:
int array_ptr[sizeof(SomeObject)];
*((SomeObject*) array_ptr) = SomeObject(); // Does only partially initialize the object!
Once we realized this line was the issue, it also became clear why that was the case.

Aha, I get it now, with the clarification from the comments.
You're calling CFoo::operator= on raw memory that only has the size of a CFoo. That's indeed not going to set a vtable pointer, on common implementations. This is specific to how assignment in C++ works. Object assignment in C++ is defined to be slicing. If you assign a Derived object to a Base object, you're calling Base::operator=(Base const& src). This only copies the Base sub-object of the Derived object.
The reason why C++ chose this model is because that means the Base object doesn't change size when you assign a Derived value to it, at the obvious price of losing the extra information.
The net effect is that C++ objects do not change type after construction. Practically, that means the type, and the vtable can be fixed by the constructor. The assignment operator won't touch it.
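A minimal sketch of the slicing behavior described above. Base, Derived and name_after_assignment are hypothetical names made up for illustration, not anything from the question's code base.

```cpp
#include <string>

struct Base {
    int value = 0;
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    int extra = 42;
    std::string name() const override { return "Derived"; }
};

// Assigns a Derived to an existing Base object and reports which
// override the Base object dispatches to afterwards.
std::string name_after_assignment() {
    Base b;
    Derived d;
    d.value = 7;
    b = d;                 // Base::operator=(Base const&): copies only `value`
    const Base& ref = b;   // go through a reference to force virtual dispatch
    return ref.name();     // assignment did not change b's dynamic type
}
```

The assignment copies the data member but leaves the dynamic type alone, which is why the constructor is the only place that has to set up the vtable pointer.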
So, by calling the assignment operator on raw memory, you get Undefined Behavior, in particular an uninitialized (garbage) vtable. You can't count on it being all zeroes. Also, in more complicated cases with multiple and virtual inheritance, there are additional data fields to find the various sub-objects. Those would be uninitialized as well. Note that these additional data fields may contain absolute pointers. memcpy such an object, and you'd point back to subobjects of the original.
Can you detect this? No. All your attempts to access the memory are Undefined Behavior, by virtue of there not being a CFoo object in the raw memory.
The solution is placement new. This is the magical incantation that turns raw memory into an object. It can use any constructor, including move constructors and copy constructors, but (barring exceptions) will leave you with a valid object, with proper polymorphic behavior.
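As a rough sketch of what that looks like (Widget and construct_use_destroy are hypothetical names, and the buffer here is a local array rather than the custom heap from the question):

```cpp
#include <new>   // placement new

struct Widget {                        // hypothetical polymorphic type
    int id;
    explicit Widget(int i) : id(i) {}
    virtual ~Widget() = default;
    virtual int answer() const { return id * 2; }
};

int construct_use_destroy() {
    // Raw storage with the right size AND alignment for a Widget.
    alignas(Widget) unsigned char storage[sizeof(Widget)];

    Widget* w = new (storage) Widget(21);  // runs the constructor in place
    int result = w->answer();              // virtual dispatch works
    w->~Widget();                          // no matching delete: destroy explicitly
    return result;
}
```

Note that the buffer is declared in bytes with alignas. The int array_ptr[sizeof(SomeObject)] from the question reserves sizeof(int) times more memory than needed and only guarantees the alignment of int.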

Ok, so learning from MSalters and other commenters above I understand there is no straightforward way of reading the vtable pointer. However, I came up with a solution that was enough for my needs (i.e. to test that the vtable pointer is properly initialized). So here's the code (note that I assume that what I get is the vtable pointer as sizeof(size_t) == sizeof(EmptyClassWithOneVirtualFunction)):
class EmptyClassWithOneVirtualFunction
{
virtual void testFunction() {}
};
void test_staticNew_object_vtable()
{
EmptyClassWithOneVirtualFunction correctObject;
EmptyClassWithOneVirtualFunction* object = mem::static_new<EmptyClassWithOneVirtualFunction>();
size_t* correctObjectVtablePtr = ( (size_t*) &correctObject );
size_t* objectVtablePtr = ( (size_t*) object );
TS_ASSERT_EQUALS( *objectVtablePtr, *correctObjectVtablePtr );
}
It should be pointed out that this is test code that is built in debug mode without optimization. To be able to catch this error even in this not entirely "safe" way is more valuable to me than to skip doing it just because there is no right way to do it.

Related

Default initialization for a class pointer

I have seen the following construction in many code reviews:
ClassX *pObj;
ClassX obj;
pObj = &obj; //obj is not used in the rest of the code
Is the line below only used for initialization purposes?
pObj = &obj;
Why is it not initialized to NULL?

In pObj = &obj; pObj is a pointer and it is made to point to obj.
For example (the addresses are chosen only for illustration), if obj lives at 0x1000 and pObj at 0x2000, then after the assignment the memory at 0x2000 holds the value 0x1000.
Why do they not initialize it to NULL?
pObj can be initialized to NULL, but it is eventually overwritten by pObj = &obj, so doing so has no effect. Dereferencing pObj before the assignment, however, causes UB.

pObj is a pointer to a properly initialised instance that can be used by the rest of the function or any called functions. NULL would mean there is no instance, a very different thing.
But why would you do this? One answer is that the rest of the code uses pointers and the author feels happier using pObj than using &obj.
Another may be that the pointer later gets assigned to a real object "usually". You didn't show us the later code so we have to speculate (or downvote). Perhaps the author thinks that having a valid temporary is less prone to crashes than having a null ptr if the assignment fails and the later code that uses the pointer is allowed to run, but this really is lazy programming, paying to initialise an object you never intend to use. If the real object is dynamically allocated, then the pointer might be valid outside the scope of this code, but the default instance would not be.

Sometimes construction/copy/move of objects is costly or impossible; thus pointer ClassX* pObj; can serve as a tool to quickly change target - as copying pointers is simple and cheap overall.
Say, in a loop pObj is frequently used and sometimes you need it to point to one object, then to another, and carry it over to the next iterations. Or you have some complex rules that determine which variable the pointer points to.
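A small sketch of that loop scenario. Counter and count_negative are hypothetical stand-ins for ClassX and the surrounding code, which the question does not show.

```cpp
#include <vector>

struct Counter { int hits = 0; };   // hypothetical stand-in for ClassX

// Sends each sample to one of two counters. The pointer is retargeted
// when the sign of the input changes and carried over otherwise.
int count_negative(const std::vector<int>& samples) {
    Counter positive, negative;
    Counter* current = &positive;   // starts out pointing at a valid object
    for (int s : samples) {
        if (s < 0)      current = &negative;
        else if (s > 0) current = &positive;
        // s == 0: keep whichever target the previous iteration selected
        ++current->hits;
    }
    return negative.hits;
}
```

Retargeting copies one pointer per iteration. The Counter objects themselves are never copied or moved.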
If one could have simply used obj instead of pObj in the method without any issues, then using pObj is simply poor coding practice. Some might use it to save typing & when passing pointers... but that's barely an excuse. Regardless, I don't see much harm except for clutter.
Or they simply copied the code from elsewhere (that was also copied from another place...) without dwelling on it much as to why it is written in this way.

Runtime error on calling a function after deletion of 'this' pointer [duplicate]

Is it allowed to delete this; if the delete-statement is the last statement that will be executed on that instance of the class? Of course I'm sure that the object represented by the this-pointer was created with new.
I'm thinking about something like this:
void SomeModule::doStuff()
{
// in the controller, "this" object of SomeModule is the "current module"
// now, if I want to switch over to a new Module, eg:
controller->setWorkingModule(new OtherModule());
// since the new "OtherModule" object will take the lead,
// I want to get rid of this "SomeModule" object:
delete this;
}
Can I do this?

The C++ FAQ Lite has an entry specifically for this
https://isocpp.org/wiki/faq/freestore-mgmt#delete-this
I think this quote sums it up nicely
As long as you're careful, it's OK for an object to commit suicide (delete this).

Yes, delete this; has defined results, as long as (as you've noted) you assure the object was allocated dynamically, and (of course) never attempt to use the object after it's destroyed. Over the years, many questions have been asked about what the standard says specifically about delete this;, as opposed to deleting some other pointer. The answer to that is fairly short and simple: it doesn't say much of anything. It just says that delete's operand must be an expression that designates a pointer to an object, or an array of objects. It goes into quite a bit of detail about things like how it figures out what (if any) deallocation function to call to release the memory, but the entire section on delete (§[expr.delete]) doesn't mention delete this; specifically at all. The section on destructors does mention delete this in one place (§[class.dtor]/13):
At the point of definition of a virtual destructor (including an implicit definition (15.8)), the non-array deallocation function is determined as if for the expression delete this appearing in a non-virtual destructor of the destructor’s class (see 8.3.5).
That tends to support the idea that the standard considers delete this; to be valid -- if it was invalid, its type wouldn't be meaningful. That's the only place the standard mentions delete this; at all, as far as I know.
Anyway, some consider delete this a nasty hack, and tell anybody who will listen that it should be avoided. One commonly cited problem is the difficulty of ensuring that objects of the class are only ever allocated dynamically. Others consider it a perfectly reasonable idiom, and use it all the time. Personally, I'm somewhere in the middle: I rarely use it, but don't hesitate to do so when it seems to be the right tool for the job.
The primary time you use this technique is with an object that has a life that's almost entirely its own. One example James Kanze has cited was a billing/tracking system he worked on for a phone company. When you start to make a phone call, something takes note of that and creates a phone_call object. From that point onward, the phone_call object handles the details of the phone call (making a connection when you dial, adding an entry to the database to say when the call started, possibly connect more people if you do a conference call, etc.) When the last people on the call hang up, the phone_call object does its final book-keeping (e.g., adds an entry to the database to say when you hung up, so they can compute how long your call was) and then destroys itself. The lifetime of the phone_call object is based on when the first person starts the call and when the last people leave the call -- from the viewpoint of the rest of the system, it's basically entirely arbitrary, so you can't tie it to any lexical scope in the code, or anything on that order.
For anybody who might care about how dependable this kind of coding can be: if you make a phone call to, from, or through almost any part of Europe, there's a pretty good chance that it's being handled (at least in part) by code that does exactly this.
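A hedged sketch of the phone-call idea, heavily simplified. PhoneCall, its member functions and simulate_call are invented for illustration and are not taken from the real system described above.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch: a call object that owns its own lifetime.
class PhoneCall {
public:
    // Heap-only: the factory is the only way to create one.
    static PhoneCall* start(std::vector<std::string>& log) {
        return new PhoneCall(log);
    }
    void join() { ++participants_; }
    void hang_up() {
        if (--participants_ == 0) {
            log_.push_back("call ended");  // final book-keeping
            delete this;                   // nothing may touch *this after this line
        }
    }
private:
    explicit PhoneCall(std::vector<std::string>& log) : log_(log) {
        log_.push_back("call started");
    }
    ~PhoneCall() = default;   // private: no stack instances, no outside delete
    std::vector<std::string>& log_;
    int participants_ = 0;
};

std::vector<std::string> simulate_call() {
    std::vector<std::string> log;
    PhoneCall* call = PhoneCall::start(log);
    call->join();
    call->join();
    call->hang_up();
    call->hang_up();   // last participant leaves: the object deletes itself
    return log;        // `call` is dangling here and must not be used
}
```

The object's lifetime ends when the last participant leaves, not when any particular scope exits, which is the situation the answer describes.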

If it scares you, there's a perfectly legal hack:
void myclass::delete_me()
{
std::unique_ptr<myclass> bye_bye(this);
}
I think delete this is idiomatic C++ though, and I only present this as a curiosity.
There is a case where this construct is actually useful - you can delete the object after throwing an exception that needs member data from the object. The object remains valid until after the throw takes place.
void myclass::throw_error()
{
std::unique_ptr<myclass> bye_bye(this);
throw std::runtime_error(this->error_msg); // std::runtime_error lives in <stdexcept>
}
Note: if you're using a compiler older than C++11 you can use std::auto_ptr instead of std::unique_ptr, it will do the same thing.

One of the reasons that C++ was designed was to make it easy to reuse code. In general, C++ should be written so that it works whether the class is instantiated on the heap, in an array, or on the stack. "Delete this" is a very bad coding practice because it will only work if a single instance is defined on the heap; and there had better not be another delete statement, which is typically used by most developers to clean up the heap. Doing this also assumes that no maintenance programmer in the future will cure a falsely perceived memory leak by adding a delete statement.
Even if you know in advance that your current plan is to only allocate a single instance on the heap, what if some happy-go-lucky developer comes along in the future and decides to create an instance on the stack? Or, what if he cuts and pastes certain portions of the class to a new class that he intends to use on the stack? When the code reaches "delete this" it will go off and delete it, but then when the object goes out of scope, it will call the destructor. The destructor will then try to delete it again and then you are hosed. In the past, doing something like this would screw up not only the program but the operating system and the computer would need to be rebooted. In any case, this is highly NOT recommended and should almost always be avoided. I would have to be desperate, seriously plastered, or really hate the company I worked for to write code that did this.

It is allowed (just do not use the object after that), but I wouldn't write such code in practice. I think that delete this should appear only in functions that are called release or Release and look like: void release() { ref--; if (ref<1) delete this; }.

Well, in the Component Object Model (COM) the delete this construction can be part of the Release method, which is called whenever you want to release an acquired object:
void IMyInterface::Release()
{
--instanceCount;
if(instanceCount == 0)
delete this;
}

This is the core idiom for reference-counted objects.
Reference-counting is a strong form of deterministic garbage collection- it ensures objects manage their OWN lifetime instead of relying on 'smart' pointers, etc. to do it for them. The underlying object is only ever accessed via "Reference" smart pointers, designed so that the pointers increment and decrement a member integer (the reference count) in the actual object.
When the last reference drops off the stack or is deleted, the reference count will go to zero. Your object's default behavior will then be a call to "delete this" to garbage collect- the libraries I write provide a protected virtual "CountIsZero" call in the base class so that you can override this behavior for things like caching.
The key to making this safe is not allowing users access to the CONSTRUCTOR of the object in question (make it protected), but instead making them call some static member- the FACTORY- like "static Reference CreateT(...)". That way you KNOW for sure that they're always built with ordinary "new" and that no raw pointer is ever available, so "delete this" won't ever blow up.
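Here is a minimal single-threaded sketch of that arrangement. RefCounted, Reference, Texture and count_is_zero are hypothetical names that only mirror the description above; they are not the answerer's actual library.

```cpp
#include <utility>

class RefCounted {                        // hypothetical base class
public:
    void add_ref() { ++count_; }
    void release() { if (--count_ == 0) count_is_zero(); }
protected:
    RefCounted() = default;
    virtual ~RefCounted() = default;
    virtual void count_is_zero() { delete this; }  // override for e.g. caching
private:
    int count_ = 0;
};

template <typename T>
class Reference {                         // hypothetical intrusive smart pointer
public:
    explicit Reference(T* p) : p_(p) { p_->add_ref(); }
    Reference(const Reference& o) : p_(o.p_) { p_->add_ref(); }
    Reference& operator=(Reference o) { std::swap(p_, o.p_); return *this; }
    ~Reference() { p_->release(); }
    T* operator->() const { return p_; }
private:
    T* p_;
};

class Texture : public RefCounted {
public:
    // The factory is the only way in, so every Texture comes from plain `new`.
    static Reference<Texture> create(int* destroyed) {
        return Reference<Texture>(new Texture(destroyed));
    }
private:
    explicit Texture(int* destroyed) : destroyed_(destroyed) {}
    ~Texture() override { ++*destroyed_; }
    int* destroyed_;
};

int destroyed_after_scope() {
    int destroyed = 0;
    {
        Reference<Texture> a = Texture::create(&destroyed);
        Reference<Texture> b = a;   // count is now 2
    }                               // both references gone: object deleted itself
    return destroyed;
}
```

Because the constructor and destructor are not public, user code never holds a Texture that was not created by the factory, which is what keeps delete this safe here.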

You can do so. However, you can't assign to this. Thus the reason you state for doing this, "I want to change the view," seems very questionable. The better method, in my opinion, would be for the object that holds the view to replace that view.
Of course, you're using RAII objects and so you don't actually need to call delete at all...right?

This is an old, answered question, but @Alexandre asked "Why would anyone want to do this?", and I thought that I might provide an example usage that I am considering this afternoon.
Legacy code. Uses naked pointers Obj*obj with a delete obj at the end.
Unfortunately I need sometimes, not often, to keep the object alive longer.
I am considering making it a reference counted smart pointer. But there would be lots of code to change, if I was to use ref_cnt_ptr<Obj> everywhere. And if you mix naked Obj* and ref_cnt_ptr, you can get the object implicitly deleted when the last ref_cnt_ptr goes away, even though there are Obj* still alive.
So I am thinking about creating an explicit_delete_ref_cnt_ptr. I.e. a reference counted pointer where the delete is only done in an explicit delete routine. Using it in the one place where the existing code knows the lifetime of the object, as well as in my new code that keeps the object alive longer.
Incrementing and decrementing the reference count as explicit_delete_ref_cnt_ptr get manipulated.
But NOT freeing when the reference count is seen to be zero in the explicit_delete_ref_cnt_ptr destructor.
Only freeing when the reference count is seen to be zero in an explicit delete-like operation. E.g. in something like:
template<typename T> class explicit_delete_ref_cnt_ptr {
private:
T* ptr;
int rc;
...
public:
void delete_if_rc0() {
if( this->ptr ) {
this->rc--;
if( this->rc == 0 ) {
delete this->ptr;
}
this->ptr = 0;
}
}
};
OK, something like that. It's a bit unusual to have a reference counted pointer type not automatically delete the object pointed to in the rc'ed ptr destructor. But it seems like this might make mixing naked pointers and rc'ed pointers a bit safer.
But so far no need for delete this.
But then it occurred to me: if the object pointed to, the pointee, knows that it is being reference counted, e.g. if the count is inside the object (or in some other table), then the routine delete_if_rc0 could be a method of the pointee object, not the (smart) pointer.
class Pointee {
private:
int rc;
...
public:
void delete_if_rc0() {
this->rc--;
if( this->rc == 0 ) {
delete this;
}
}
};
Actually, it doesn't need to be a member method at all, but could be a free function:
map<void*,int> keepalive_map;
template<typename T>
void delete_if_rc0(T*ptr) {
void* tptr = (void*)ptr;
if( keepalive_map[tptr] == 1 ) {
delete ptr;
}
};
(BTW, I know the code is not quite right - it becomes less readable if I add all the details, so I am leaving it like this.)

Delete this is legal as long as the object is on the heap.
You would need to require that objects of the class are heap-only.
One way to do that is to make the destructor protected: this way delete can be called ONLY from within the class (or a derived class), so you would need a method that performs the deletion.
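A sketch of that restriction, with a compile-time check that outside code really cannot destroy the object. HeapOnly, create and destroy are hypothetical names.

```cpp
#include <type_traits>

class HeapOnly {                      // hypothetical class
public:
    static HeapOnly* create() { return new HeapOnly; }
    void destroy() { delete this; }   // the only way to end the object's life
protected:
    ~HeapOnly() = default;            // not accessible to outside code
};

// With a non-public destructor, both `HeapOnly x;` on the stack and
// `delete p;` in outside code fail to compile.
static_assert(!std::is_destructible<HeapOnly>::value,
              "outside code cannot destroy a HeapOnly");

bool create_and_destroy() {
    HeapOnly* p = HeapOnly::create();
    p->destroy();                     // p is dangling afterwards
    return true;
}
```

A protected destructor still lets a derived class be placed on the stack, so making it private is the stricter choice when no derivation is intended.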

Sequencing of the copying when passing by value in C++

In C++, when passing an object by value, are there restrictions on when the copy takes place ?
I have the following code (simplified):
class A;
class Parent
{
public:
void doSomething(std::auto_ptr<A> a); // meant to transfer ownership.
};
std::auto_ptr<A> a = ...;
a->getParent()->doSomething(a);
It acts like:
std::auto_ptr<A> a = ...;
std::auto_ptr<A> copy(a);
a->getParent()->doSomething(copy);
Which will obviously segfault since a is now referencing NULL.
And not like:
std::auto_ptr<A> a = ...;
Parent* p = a->getParent();
p->doSomething(a);
Is this expected ?

A: auto_ptr has been deprecated since C++11 (and was removed in C++17); I recommend checking out unique_ptr.
B: This behavior is expected. An auto_ptr owns the object it was given. So when ownership is properly transferred from one auto_ptr to another, the original auto_ptr's managed pointer becomes null. This logic is handled by std::auto_ptr itself, and you don't have to do anything special to get this behavior. If two auto_ptrs were allowed to manage the same object, they would both try to free the memory for this object when they went out of scope. This is bad in itself, but even worse, if one of these auto_ptrs had a broader scope it could attempt to reference memory that no longer held the object in question because it had since been freed by the other auto_ptr, and then we have true chaos. Hence, when ownership is transferred, the original auto_ptr's managed pointer is set to null, and we have the illusion of safety. :)
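A sketch of the same hand-over with std::unique_ptr swapped in for auto_ptr, written so that the result does not depend on the order in which the call's operands are evaluated. The class bodies, the received counter and hand_over are assumptions added to make the fragment self-contained.

```cpp
#include <memory>
#include <utility>

class A;

class Parent {
public:
    // Taking unique_ptr by value documents the ownership transfer.
    void doSomething(std::unique_ptr<A> a);
    int received = 0;
};

class A {
public:
    explicit A(Parent* p) : parent_(p) {}
    Parent* getParent() const { return parent_; }
private:
    Parent* parent_;
};

void Parent::doSomething(std::unique_ptr<A> a) {
    if (a) ++received;   // `a` is destroyed when this function returns
}

bool hand_over() {
    Parent parent;
    std::unique_ptr<A> a(new A(&parent));
    Parent* p = a->getParent();     // read through `a` while it still owns the object
    p->doSomething(std::move(a));   // only now give ownership away
    return parent.received == 1 && a == nullptr;
}
```

Fetching the parent in its own statement, as in the last variant in the question, makes the sequencing explicit, and the std::move at the call site makes the ownership transfer visible to the reader.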

From my point of view, the example is not good because of at least three reasons.
1) Looking at the code without seeing doSomething proto it is not clear that the ownership can change.
2) If by a slightest chance the result can depend on the order of evaluation, the code is not portable or implementation dependent and so not acceptable.
3) Even if the order of evaluation is right, the code can raise this exact question from other developers and will waste their time. The readability must be of the highest priority.

Is it alright to return a reference to a non-pointer member variable as a pointer?

I recently came across some C++ code that looked like this:
class SomeObject
{
private:
// NOT a pointer
BigObject foobar;
public:
BigObject * getFoobar() const
{
return &foobar;
}
};
I asked the programmer why he didn't just make foobar a pointer, and he said that this way he didn't have to worry about allocating/deallocating memory. I asked if he considered using some smart pointer, he said this worked just as well.
Is this bad practice? It seems very hackish.

That's perfectly reasonable, and not "hackish" in any way; although it might be considered better to return a reference to indicate that the object definitely exists. A pointer might be null, and might lead some to think that they should delete it after use.
The object has to exist somewhere, and existing as a member of an object is usually as good as existing anywhere else. Adding an extra level of indirection by dynamically allocating it separately from the object that owns it makes the code less efficient, and adds the burden of making sure it's correctly deallocated.
Of course, the member function can't be const if it returns a non-const reference or pointer to a member. That's another advantage of making it a member: a const qualifier on SomeObject applies to its members too, but doesn't apply to any objects it merely has a pointer to.
The only danger is that the object might be destroyed while someone still has a pointer or reference to it; but that danger is still present however you manage it. Smart pointers can help here, if the object lifetimes are too complex to manage otherwise.

You are returning a pointer to a member variable not a reference. This is bad design.
Your class manages the lifetime of foobar object and by returning a pointer to its members you enable the consumers of your class to keep using the pointer beyond the lifetime of SomeObject object. And also it enables the users to change the state of SomeObject object as they wish.
Instead you should refactor your class to include the operations that would be done on the foobar in SomeObject class as methods.
ps. Consider naming your classes properly. When you define it is a class. When you instantiate, then you have an object of that class.

It's generally considered less than ideal to return pointers to internal data at all; it prevents the class from managing access to its own data. But if you want to do that anyway I see no great problem here; it simplifies the management of memory.

Is this bad practice? It seems very hackish.
It is. If the class goes out of scope before the pointer does, the member variable will no longer exist, yet a pointer to it still exists. Any attempt to dereference that pointer post class destruction will result in undefined behaviour - this could result in a crash, or it could result in hard to find bugs where arbitrary memory is read and treated as a BigObject.
if he considered using some smart pointer
Using smart pointers, specifically std::shared_ptr<T> or the boost version, would technically work here and avoid the potential crash (if you allocate via the shared pointer constructor) - however, it also confuses who owns that pointer - the class, or the caller? Furthermore, I'm not sure you can just add a pointer to an object to a smart pointer.
Both of these two points deal with the technical issue of getting a pointer out of a class, but the real question should be "why?" as in "why are you returning a pointer from a class?" There are cases where this is the only way, but more often than not you don't need to return a pointer. For example, suppose that variable needs to be passed to a C API which takes a pointer to that type. In this case, you would probably be better encapsulating that C call in the class.

As long as the caller knows that the pointer returned from getFoobar() becomes invalid when the SomeObject object destructs, it's fine. Such provisos and caveats are common in older C++ programs and frameworks.
Even current libraries have to do this for historical reasons. e.g. std::string::c_str, which returns a pointer to an internal buffer in the string, which becomes unusable when the string destructs.
Of course, that is difficult to ensure in a large or complex program. In modern C++ the preferred approach is to give everything simple "value semantics" as far as possible, so that every object's life time is controlled by the code that uses it in a trivial way. So there are no naked pointers, no explicit new or delete calls scattered around your code, etc., and so no need to require programmers to manually ensure they are following the rules.
(And then you can resort to smart pointers in cases where you are totally unable to avoid shared responsibility for object lifetimes.)

Two unrelated issues here:
1) How would you like your instance of SomeObject to manage the instance of BigObject that it needs? If each instance of SomeObject needs its own BigObject, then a BigObject data member is totally reasonable. There are situations where you'd want to do something different, but unless that situation arises stick with the simple solution.
2) Do you want to give users of SomeObject direct access to its BigObject? By default the answer here would be "no", on the basis of good encapsulation. But if you do want to, then that doesn't change the assessment of (1). Also if you do want to, you don't necessarily need to do so via a pointer -- it could be via a reference or even a public data member.
A third possible issue might arise that does change the assessment of (1):
3) Do you want to give users of SomeObject direct access to an instance of BigObject that they continue using beyond the lifetime of the instance of SomeObject that they got it from? If so then of course a data member is no good. The proper solution might be shared_ptr, or for SomeObject::getFoobar to be a factory that returns a different BigObject each time it's called.
In summary:
Other than the fact it doesn't compile (getFoobar() is const, so it needs to return const BigObject*), there is no reason so far to suppose that this code is wrong. Other issues could arise that make it wrong.
It might be better style to return const & rather than const *. Which you return has no bearing on whether foobar should be a BigObject data member.
There is certainly no "just" about making foobar a pointer or a smart pointer -- either one would necessitate extra code to create an instance of BigObject to point to.
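A sketch of a const-correct version that returns references, in line with the summary above. The BigObject body and read_after_write are hypothetical additions for illustration.

```cpp
struct BigObject { int payload = 0; };   // hypothetical stand-in

class SomeObject {
public:
    // Const access: callers can look but not modify.
    const BigObject& getFoobar() const { return foobar; }
    // Non-const access: only available on a non-const SomeObject.
    BigObject& getFoobar() { return foobar; }
private:
    BigObject foobar;   // NOT a pointer; lives and dies with SomeObject
};

int read_after_write() {
    SomeObject obj;
    obj.getFoobar().payload = 5;        // uses the non-const overload
    const SomeObject& view = obj;
    return view.getFoobar().payload;    // uses the const overload
}
```

Either reference is only valid for as long as the SomeObject it came from, the same caveat that applies to the pointer version.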

Can I make a bitwise copy of a C++ object?

Can C++ objects be copied using bitwise copy? I mean using memcpy_s? Is there a scenario in which that can go wrong?

If they're Plain Old Data (POD) types, then this should work. Any class that has instances of other classes inside it will potentially fail, since you're copying them without invoking their copy constructors. The most likely way it will fail is one of their destructors will free some memory, but you've duplicated pointers that point to it, so you then try to use it from one of your copied objects and get a segfault. In short, don't do it unless it's a POD and you're sure it will always be a POD.
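One way to make "sure it will always be a POD" checkable by the compiler is a static_assert on std::is_trivially_copyable, which is the property the standard actually requires for a memcpy copy. A minimal sketch, with Sample, bitwise_copy and copy_sample as hypothetical names:

```cpp
#include <cstring>
#include <type_traits>

struct Sample {          // hypothetical plain-data type
    int id;
    double value;
};

// Bitwise copy that refuses to compile for types where it is unsafe.
template <typename T>
void bitwise_copy(T& dst, const T& src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    std::memcpy(&dst, &src, sizeof(T));
}

double copy_sample() {
    Sample a{1, 2.5};
    Sample b{};
    bitwise_copy(b, a);
    return b.id + b.value;
}
```

If someone later adds, say, a std::string member to Sample, the call stops compiling instead of silently duplicating an owning pointer.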

No, doing so can cause a lot of problems. You should always copy C++ types by using the assignment operator or copy constructor.
Using a bitwise copy breaks any kind of resource management because at the end of the day you are left with 2 objects for which 1 constructor has run and 2 destructors will run.
Consider as an example a ref counted pointer.
void foo() {
RefPointer<int> p1(new int());
RefPointer<int> p2;
memcpy(&p2,&p1,sizeof(RefPointer<int>)); // bitwise copy: the ref count is NOT incremented
}
Now both p1 and p2 are holding onto the same data yet the internal ref counting mechanism has not been notified. Both destructors will run thinking they are the sole owner of the data potentially causing the value to be destructed twice.

It depends on the implementation of the C++ object you are trying to copy. In general the owner of the C++ object's memory is the object itself, so trying to "move" or "copy" it with something like memcpy_s is going behind its back, which is going to get you in trouble more often than not.
Usually if a C++ object is intended to be copied or moved, there are APIs within the class itself that facilitate this.

If it is a single object, why not use the assignment operator? (I suppose the compiler-generated assignment operator could be implemented in terms of memcpy if that is so advantageous, and the compiler knows better whether your class is a POD.)
If you want to copy an array of objects, you can use std::copy. Depending on the implementation, this may end up using memmove (one more thing that you can mess up - the buffers may overlap; I don't know whether the nonstandard memcpy_s somehow checks for that) if the involved types allow that. Again, the decision is done by the compiler, which will get it right even if the types are modified.
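A short sketch of the std::copy route. copy_names is a hypothetical helper, and std::string is used as the element type precisely because a bitwise copy would be wrong for it.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Copies element by element through std::string's copy assignment.
// For trivially copyable element types the library is free to lower
// the same call to memmove.
std::vector<std::string> copy_names(const std::vector<std::string>& src) {
    std::vector<std::string> dst(src.size());
    std::copy(src.begin(), src.end(), dst.begin());
    return dst;
}
```

The call site looks the same whatever the element type is, so the code stays correct if the type later gains a non-trivial copy.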

In general if your structure contains pointers, you can't memcpy it, because a correct copy would most likely have to allocate new memory and point to that instead of sharing the original's. A memcpy can't handle that.
If however your class only has primitive, non-pointer types, you should be able to.

In addition to the problem of unbalanced resource management calls in the two instances you end up with after a memcpy (as @JaredPar and @rmeador pointed out), if the object supports a notion of an instance ID, doing a memcpy will leave you with two instances with the same ID. This can lead to all sorts of "interesting" problems to hunt later on, especially if your objects are mapped to a database.
