Here is a sample piece of code. Note that B is a subclass of A and both provide a unique print routine. Also notice in main that both bind calls are to &A::print, though in the latter case a reference to B is passed.
#include <iostream>
#include <tr1/functional>
struct A
{
virtual void print()
{
std::cerr << "A" << std::endl;
}
};
struct B : public A
{
virtual void print()
{
std::cerr << "B" << std::endl;
}
};
int main (int argc, char * const argv[])
{
typedef std::tr1::function<void ()> proc_t;
A a;
B b;
proc_t a_print = std::tr1::bind(&A::print, std::tr1::ref(a));
proc_t b_print = std::tr1::bind(&A::print, std::tr1::ref(b));
a_print();
b_print();
return 0;
}
Here is the output I see compiling with GCC 4.2:
A
B
I would consider this correct behavior, but I am at a loss to explain how it is working properly given that the std::tr1::functions were bound to &A::print in both cases. Can someone please enlighten me?
EDIT: Thanks for the answers. I am familiar with inheritance and polymorphic types. What I am interested in is what does &A::print mean? Is it an offset into a vtable, and that vtable changes based on the referred object (in this case, a or b?) From a more nuts-and-bolts perspective, how does this code behave correctly?
This works in the same manner as it would have worked with plain member function pointers. The following produces the same output:
int main ()
{
A a;
B b;
typedef void (A::*fp)();
fp p = &A::print;
(a.*p)(); // prints A
(b.*p)(); // prints B
}
It would have been surprising if boost/tr1/std::function did anything different since they presumably store these pointers to member functions under the hood. Oh, and of course no mention of these pointers is complete without a link to the Fast Delegates article.
Because print() is declared virtual, A is a polymorphic class. By binding to the print function pointer, you will be calling through an A pointer, much in the same way as:
A* ab = &b;
ab->print();
In the ->print call above, you would expect polymorphic behavior. The same is true in your code as well. And this is a Good Thing, if you ask me. At least, most of the time. :)
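To make the point about binding to &A::print concrete, here is a minimal sketch. It assumes hypothetical classes Base and Derived whose virtual function returns its name instead of printing, so the result can be checked. The helper names call_via_member_pointer and call_via_bind are made up for illustration and use std::bind from C++11 rather than the tr1 version in the question.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical stand-ins for A and B from the question.
struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// Calls through a plain pointer to member function. The override that runs
// is chosen by the dynamic type of obj, not by the class named in &Base::name.
std::string call_via_member_pointer(const Base& obj) {
    std::string (Base::*pm)() const = &Base::name;
    return (obj.*pm)();
}

// The same call wrapped in std::function via std::bind, as in the question.
std::string call_via_bind(const Base& obj) {
    std::function<std::string()> f = std::bind(&Base::name, std::cref(obj));
    return f();
}
```

Passing a Derived object to either helper yields "Derived", even though both take the address of Base::name, which mirrors the A/B output in the question.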
Related
I have this code snippet. It tries to call a virtual function through an object's vptr (the pointer to the virtual function table) by casting the object's address to a pointer to the table, like this:
#include<iostream>
using namespace std;
struct C {
virtual int f() {
return 7;
}
};
typedef int (*pf)();
int main() {
C c1;
pf *pvtable = (pf *) &c1;
cout << (*pvtable[0])() << endl;
return 0;
}
I used clang++14 to compile and link. When I run it, the program exits with status 139 and the cout line is never shown, so it seems to have crashed.
Why it doesn't work and how to fix it?
Why it doesn't work
You are casting a pointer-to-C to a pointer-to-int(*)().
This cast has no meaning in the C++ language and using the resulting pointer is explicitly Undefined Behavior.
and how to fix it?
There is no reliable way.
C++ does not promise the existence of a vtable pointer in any program, and if there is one, C++ does not offer any method to access it.
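If the goal is only to call a virtual function indirectly, the language does offer a portable tool: a pointer to member function. The sketch below reuses C from the question and adds a hypothetical derived class D and a helper call_f, both invented here for illustration. It does not expose the vtable; it lets the implementation perform the dispatch.

```cpp
#include <cassert>

struct C {
    virtual ~C() = default;
    virtual int f() { return 7; }
};

// Hypothetical derived class, added only to show that dispatch is dynamic.
struct D : C {
    int f() override { return 8; }
};

// Portable indirect call: the pointer to member carries whatever the
// implementation needs in order to perform virtual dispatch on obj.
int call_f(C& obj) {
    int (C::*pmf)() = &C::f;
    return (obj.*pmf)();
}
```

With this helper, call_f returns 7 for a C object and 8 for a D object, without any assumption about how or where a vtable pointer is stored.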
In the following code I use static_cast<B*> on a void*, which points to an A object.
A and B are not related in any way. I understand the compiler cannot raise an error against this. But what I don't understand is, how come this actually seems to work when run...? I would expect a segfault or an error of some kind.
#include <iostream>
using namespace std;
class A {
public:
void f() const {
cout << "f" << endl;
}
};
class B {
public:
void q() {
cout << "q" << endl;
}
};
int main(int argc, char** argv) {
A a;
void* p = &a;
static_cast<B*>(p)->q(); // Prints "q"!
return 0;
}
What is the mechanism behind this?
The code causes undefined behaviour (because it dereferences a B * which does not point to a B object), which means anything can happen. You should not expect any particular subset of consequences.
To find out what your compiler did, you could inspect the assembly. My guess would be that the compiler generated assembly which would be correct if there were a B object there: call the function B::q() with implicit argument p.
how come this actually seems to work when run...?
Because the behaviour of the program is undefined.
I would expect a segfault or an error of some kind.
When behaviour is undefined, there is no guarantee that there would be a segfault or an error of some kind. It would generally be unreasonable to expect such. When behaviour is undefined, there are no guarantees whatsoever.
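For contrast, here is a sketch of the one conversion from void* that is well-defined in this situation: casting back to the type of the object that is really there. The helper call_f_through_void is a hypothetical name, and A::f has been changed to return a string instead of printing so the result can be checked.

```cpp
#include <cassert>
#include <string>

class A {
public:
    std::string f() const { return "f"; }
};

// Well-defined only if p really came from an A*: a static_cast from void*
// must name the type of the object that actually lives at that address.
std::string call_f_through_void(void* p) {
    return static_cast<A*>(p)->f();
}
```

The compiler cannot check that precondition, which is why the mismatched cast in the question compiles without complaint.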
This was a question asked to me in an interview...
Is it possible to change the vtable memory locations after it's
created via constructor? If yes, is it a good idea? And how to do that?
If not, why not?
As I don't have in-depth knowledge of C++, my guess is that it's not possible to change the vtable after it's created!
Can anyone explain?
The C++ standard does not specify how dynamic dispatch must be implemented, but a vtable is the most common approach.
On typical 64-bit implementations, the first 8 bytes of the object store the pointer to the vtable, but only if the class has at least one virtual function (otherwise those 8 bytes can be saved for something else). The vtable entries themselves are not meant to be changed at run time, and implementations typically place them in read-only memory.
However, with functions such as memset or memcpy you can overwrite whatever you want in the object, including the vtable pointer, although doing so is undefined behavior.
Code sample:
#include <cstddef>
#include <iostream>
#include <utility>
class A {
public:
virtual void f() {
std::cout << "A::f()" << std::endl;
}
virtual void g() {
std::cout << "A::g()" << std::endl;
}
};
class B {
public:
virtual void f() {
std::cout << "B::f()" << std::endl;
}
virtual void g() {
std::cout << "B::g()" << std::endl;
}
};
int main() {
std::ios_base::sync_with_stdio(false);
std::cin.tie(nullptr);
A * p_a = new A();
B * p_b = new B();
p_a->f();
p_a->g();
p_b->f();
p_b->g();
size_t * vptr_a = reinterpret_cast<size_t *>(p_a);
size_t * vptr_b = reinterpret_cast<size_t *>(p_b);
std::swap(*vptr_a, *vptr_b);
p_a->f();
p_a->g();
p_b->f();
p_b->g();
return 0;
}
Output:
A::f()
A::g()
B::f()
B::g()
B::f()
B::g()
A::f()
A::g()
https://ideone.com/CEkkmN
Of course, all of these manipulations are a way to shoot yourself in the foot.
The proper response to that question is simple: This question can't be answered. The question talks about the "vtable memory locations" and then continues with "after it's created via constructor". This sentence doesn't make sense because "locations" is plural while "it" can only refer to a singular.
Now, if you have a question concerning the typical C++ implementations that use a vtable pointer, please feel free to ask. I'd also consider reading The Design and Evolution of C++, which contains a lot of background information for people who want to understand how C++ works and why.
First, apologies for my poor English.
In C++, when we want to create an instance of a class, we usually write "ClassType ObjectName;"
For example,
class Foo {};
Foo instance1;
However, I came across some code that confused me a little. It is as follows.
class A {/*....bla bla*/};
class B {
public:
B(char*) {}
};
int main() {
A aaa;
B(aaa); // this causes an error.
}
By trial and error, I found that "B(aaa);" is exactly the same as "B aaa;".
But why? Is this described in the standard? If so, please let me know where I can find it.
Thanks in advance.
UPDATE:
Thank you for all your replies.
But I think I omitted some code. Sorry.
#include <iostream>
using namespace std;
class A
{
};
class B
{
public:
B() { cout << "null\n"; }
B(char* str) {}
void print() {
cout << "print!\n";
}
};
int main()
{
A aaa;
// B(aaa); // this line causes the error 'redefinition; different basic types' (VS2008)
B(aa1);
aa1.print();
}
Output:
null
print!
As you can see, the statement "B(aa1)" does not pass aa1 to a constructor as an argument; it creates an instance named aa1.
Until now, I understood "B(argument)" to mean "pass the argument to select one of the overloaded constructors and create a nameless temporary instance".
But "aa1" looks like neither a defined value nor a temporary instance.
Sometimes a set of parentheses is needed to disambiguate declarations.
For example:
int *f(); // a function returning a pointer to int
int (*f)(); // a pointer to a function returning an int
Rather than listing exactly when and where parentheses are required and where they perhaps should be forbidden (because they are useless), the standard just says that they are allowed.
So you end up with the slightly confusing:
int a; // an int variable
int (b); // another int variable
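Applied to the question's "B(aa1);", the same rule can be sketched as follows. Widget, declared_label and temporary_label are hypothetical names invented for this illustration. The sketch assumes C++11 for the brace syntax, which is one way to ask for a temporary because braces cannot be read as a declarator.

```cpp
#include <cassert>
#include <string>

// Hypothetical class standing in for B from the question.
struct Widget {
    std::string label;
    Widget() : label("default") {}
    explicit Widget(const char* s) : label(s) {}
};

std::string declared_label() {
    Widget(w);        // a declaration: identical to `Widget w;`
    return w.label;   // w was default-constructed
}

std::string temporary_label() {
    const char* text = "from argument";
    // Braces cannot be parsed as a declarator, so this builds a temporary
    // from text instead of declaring a variable named text.
    return Widget{text}.label;
}
```

Here declared_label returns "default" because the parenthesized name is a declarator, while temporary_label returns "from argument" because the constructor really receives the argument.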
I was wondering whether assert( this != nullptr ); was a good idea in member functions, and someone pointed out that it wouldn't work if an offset had been added to the value of this. In that case, instead of being 0, it would be something like 40, making the assert useless.
When does this happen though?
Multiple inheritance can cause an offset, skipping the extra v-table pointers in the object. The generic name is "this pointer adjustor thunking".
But you are helping too much. Null references are very common bugs, the operating system already has an assert built-in for you. Your program will stop with a segfault or access violation. The diagnostic you'll get from the debugger is always good enough to tell you that the object pointer is null, you'll see a very low address. Not just null, it works for MI cases as well.
this adjustment can happen only in classes that use multiple-inheritance. Here's a program that illustrates this:
#include <iostream>
using namespace std;
struct A {
int n;
void af() { cout << "this=" << this << endl; }
};
struct B {
int m;
void bf() { cout << "this=" << this << endl; }
};
struct C : A,B {
};
int main(int argc, char** argv) {
C* c = NULL;
c->af();
c->bf();
return 0;
}
When I run this program I get this output:
this=0
this=0x4
That is: your assert this != nullptr will not catch the invocation of c->bf() where c is nullptr because the this of the B sub-object inside the C object is shifted by four bytes (due to the A sub-object).
Let's try to illustrate the layout of a C object:
0: | n |
4: | m |
The numbers on the left-hand side are offsets from the object's beginning. So, at offset 0 we have the A sub-object (with its data member n), and at offset 4 we have the B sub-object (with its data member m).
The this of the entire object, as well as the this of the A sub-object, both point at offset 0. However, when we want to refer to the B sub-object (when invoking a method defined by B), the this value needs to be adjusted so that it points at the beginning of the B sub-object. Hence the +4.
Note this is UB anyway.
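The same adjustment can be observed without undefined behavior by using a real object and converting its address to each base. This is a sketch under the same A, B and C layout as above; base_offset is a hypothetical helper, and the actual numbers depend on the implementation.

```cpp
#include <cassert>
#include <cstddef>

struct A { int n; };
struct B { int m; };
struct C : A, B {};

// Byte distance from the start of a C object to its Base sub-object.
// The static_cast performs the same pointer adjustment the compiler applies
// before calling a member function of that base.
template <typename Base>
std::ptrdiff_t base_offset(C& c) {
    const char* whole = reinterpret_cast<const char*>(&c);
    const char* part  = reinterpret_cast<const char*>(static_cast<Base*>(&c));
    return part - whole;
}
```

For a C object c, base_offset<A>(c) and base_offset<B>(c) must differ because the two sub-objects cannot overlap. On an implementation with the layout drawn above, the B offset would correspond to the 0x4 in the output.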
Multiple inheritance can introduce an offset, depending on the implementation:
#include <iostream>
struct wup
{
int i;
void foo()
{
std::cout << (void*)this << std::endl;
}
};
struct dup
{
int j;
void bar()
{
std::cout << (void*)this << std::endl;
}
};
struct s : wup, dup
{
void foobar()
{
foo();
bar();
}
};
int main()
{
s* p = nullptr;
p->foobar();
}
Output on some version of clang++:
0
0x4
Live example.
Also note, as I pointed out in the comments to the OP, that this assert might not work for virtual function calls, as the vtable isn't initialized (if the compiler does a dynamic dispatch, i.e. doesn't optimize when it knows the dynamic type of *p).
Here is a situation where it might happen:
#include <cassert>
struct A {
void f()
{
// this assert will probably not fail
assert(this!=nullptr);
}
};
struct B {
A a1;
A a2;
};
static void g(B *bp)
{
bp->a2.f(); // undefined behavior at this point, but many compilers will
// treat bp as a pointer to address zero and add sizeof(A) to
// the address and pass it as the this pointer to A::f().
}
int main(int,char**)
{
g(nullptr); // oops passed null!
}
This is undefined behavior for C++ in general, but with some compilers, it might have the consistent behavior of the this pointer having some small non-zero address inside A::f().
Compilers typically implement multiple inheritance by storing the base objects sequentially in memory. If you had, e.g.:
struct bar {
int x;
int something();
};
struct baz {
int y;
int some_other_thing();
};
struct foo : public bar, public baz {};
The compiler will allocate foo and bar at the same address, and baz will be offset by sizeof(bar). So, under some implementation, it's possible that nullptr -> some_other_thing() results in a non-null this.
This example at Coliru demonstrates the situation (assuming the result you get from the undefined behavior is the same one I did), and shows an assert(this != nullptr) failing to detect the case. (Credit to @DyP, from whom I basically stole the example code.)
I think it's not such a bad idea to add the assert; at the very least it can catch cases like the example below.
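#include <iostream>
class Test{
public:
void DoSomething() {
std::cout << "Hello";
}
};
int main(int argc, char* argv[]) {
Test* p = 0; // named nullptr in the original, which is a keyword since C++11
p->DoSomething(); // undefined behavior, but it often appears to work
}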
The example above will typically run without any error, and in more complex code such a bug becomes difficult to debug if the assert is absent.
My point is that a null this pointer can go unnoticed and, in complex situations, becomes difficult to debug. I have faced this situation myself.
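One way to catch the same mistake without relying on the value of this is to check the pointer at the call site, before any member function is invoked. The sketch below is only an illustration of that idea: do_something_checked is a hypothetical helper, and DoSomething has been changed to return its text so the result can be checked.

```cpp
#include <cassert>
#include <string>

class Test {
public:
    std::string DoSomething() const { return "Hello"; }
};

// Checks the pointer before the call, where the comparison is well-defined.
// Returns an empty string instead of calling through a null pointer.
std::string do_something_checked(const Test* t) {
    if (t == nullptr) {
        return std::string();
    }
    return t->DoSomething();
}
```

Because the check happens before the member call, it is not affected by any this adjustment for base or member sub-objects.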