can someone tell me what is the different between (*ptr).field and ptr->field?
I know it connect somehow to static and dynamic linking, but i dont know what is it.
can someone tell me the differnet and give me an example?
edit:
if i have this code:
Point p; //point is a class that derive from class shape
Shape *s=&p;
//there is a diffrence if i write:
(*s).print(); //print is virtual func
s->print(); // the answers will not be the same, why?
TNX!
This has nothing to do with static or dynamic linking.
See C++'s operator precedence. The . has a lower precedence than *, so there's actually quite a difference between *ptr.fld and ptr->fld. For example, the following code demonstrates:
#include <iostream>
struct foo {
int f;
};
int main() {
struct foo *p = new struct foo;
p->f = 42;
std::cout << p->f << std::endl;
std::cout << (*p).f << std::endl;
// The following will not compile
// std::cout << *p.f << std::endl;
}
As John Knoeller points out, ptr->fld is syntactic sugar for (*(ptr)).fld, but is not the same as *ptr.fld, which would actually evaluate to *(ptr.fld), probably not what you want.
You'd use ptr->fld when you have a pointer to a structure and want to access a field contained therein. (*(ptr)).fld means the same thing but is not as tidy. You'd use *strct.fld when you have a structure, not a pointer to a structure, that contains a field (fld) which is a pointer you want to dereference. The case of ptr->fld is shown above. The case of *strct.fld could be used with the following structure:
struct foo {
int *fld;
}
struct foo f;
f.fld = new int;
*f.fld = 42;
it has nothing to do with static or dynamic linking
both expressions will return the value of ptr.field
the ptr->field form is an abbreviated syntax for accessing a member directly from a pointer
UPDATE: it occurred to me that your original intent was not linking but binding
if this indeed was what you were aiming to then there is static binding and dynamic binding
which have some relation to the -> operator see here
Static linking is the result of the linker copying all library routines used in the program into the executable image. This may require more disk space and memory than dynamic linking, but is both faster and more portable, since it does not require the presence of the library on the system where it is run.
Dynamic linking is accomplished by placing the name of a sharable library in the executable image. Actual linking with the library routines does not occur until the image is run, when both the executable and the library are placed in memory. An advantage of dynamic linking is that multiple programs can share a single copy of the library.
But it is unrelated to pointer indirection you've mentioned -- in fact, those two expressions are identical.
I assume that by *ptr.field you meant (*ptr).field.
As far as only built-in operators are considered, there's no difference between the two. And no, it has nothing to do with and "static" or "dynamic" linking, whatever is implied by these terms.
The only potential difference between the two is that in ptr->field variant -> is an overloadable operator in C++, while in the (*ptr).field variant only * is overloadable, while . is not.
Also, some differences between these two methods of member access existed in very archaic versions of C language (CRM C), but I doubt anyone cares about those today.
This has nothing to do with linking.
ptr->fld
is merely shorthand for
(*ptr).fld
It's pure syntactic sugar ;)
As long as ptr is a pointer, the two are equivalent once properly parenthesized (as others have said). If ptr is an object, rather than a pointer, they might be different depending on definitions for operator* and operator-> (if any) from the object's class or ancestors.
The form -> is just shorthand for de-refrencing the pointer and accessing the member.
(*ptr).field;
// Equiv to
ptr->field;
One good reason for using -> is that when you are following a chain:
int x = (*(*(*(*ptr).field1).field2).field3).field4;
// Equiv to
int y = ptr->field1->field2->field3->field4;
The second one becomes much more readable.
As for the second part of your question.
I find it really easy to just nock up an example.
#include <iostream>
class Shape
{
public: virtual ~Shape() {}
virtual void Print() {std::cout << "Shape\n";}
};
class Point: public Shape
{
public: virtual void Print() {std::cout << "Point\n";}
};
int main ()
{
Point p;
Shape* s = &p;
s->Print();
(*s).Print();
}
> vi x.cpp
> g++ x.cpp
> ./a.exe
Point
Point
As you can see the result is the same in both situations.
When you call a method via a pointer or a reference the virtual call mechanism will be invoked. The star operator (AKA derefence operator) returns a reference to an object (it does not actually de-reference the object). So when it is used to call a method the virtual call mechanism will be invoked and the most derived version of the method called.
They are both the same.
*ptr.field dereferences the variable ptr then returns the value of member field.
The -> operator is shorthand notation for the above.
Related
I came across articles where in they explain about vptr and vtable.
I know that the first pointer in an object in case of a class with virtual functions stored, is a vptr to vtable and vtable's array entries are pointers to the function in the same sequence as they occur in class ( which I have verified with my test program).
But I am trying to understand what syntax must compiler put in order to call the appropriate function.
Example:
class Base
{
virtual void func1()
{
cout << "Called me" << endl;
}
};
int main()
{
Base obj;
Base *ptr;
ptr=&obj;
// void* is not needed. func1 can be accessed directly with obj or ptr using vptr/vtable
void* ptrVoid=ptr;
// I can call the first virtual function in the following way:
void (*firstfunc)()=(void (*)(void))(*(int*)*(int*)ptrVoid);
firstfunc();
}
Questions:
1. But what I am really trying to understand is how compiler replaces the call to ptr->func1() with vptr?
If I were to simulate the call then what should I do? should I overload the -> operator. But even that would not help as I would not know what really the name func1 is. Even if they say that compiler accesses the vtable through vptr, still how does it know that the entry of func1 is the first array adn entry of func2 is the second element in the array? There must be some mapping for the names of function to the elements of array.
2. How can I simulate it. Can you provide the actual syntax that compiler uses to call function func1(how does it replace ptr->func1())?
Don't think of a vtable as an array. It's only an array if you strip it of everything C++ knows about it other than the size of its members. Instead, think of it as a second struct whose members are all pointers to functions.
Suppose I have a class like this:
struct Foo {
virtual void bar();
virtual int baz(int qux);
int quz;
}
int callSomeFun(Foo* foo) {
foo->bar();
return foo->baz(2);
}
Breaking it down 1 step:
class Foo;
// adding Foo* parameter to simulate the this pointer, which
// in the above would be a pointer to foo.
struct FooVtable {
void (*bar)(Foo* foo);
int (*baz)(Foo* foo, int qux);
}
struct Foo {
FooVtable* vptr;
int quz;
}
int callSomeFun(Foo* foo) {
foo->vptr->bar(foo);
return foo->vptr->baz(foo, 2);
}
I hope that's what you're looking for.
The backgroud:
After compilation (without debug info) binaries of C/C++ have no names, and names aren't required to runtime work, its only machine code
You can think about vptr like clasic C function pointer, in sense that type, argument list etc is known.
It isn't important on which positions are placed func1, func2 etc, only required is order was always the same (so all parts of multi file C++ must be compiled in the same way, compiler settings etc). Lets imagine, position is in declaration order, FIRST parent class, then newly declared in override BUT reimplemented virtuals are at lower positions, like from parent.
Its only image. Implementation must correctly fire overrides classApionter->methodReimplementedInB()
Usually C++ compiler has/had (my knowledge is from years 16/32b migration) 2-4 option to optimalize vtables against speed/size etc. Classic C sizeof() was quite well to understand (size of data plus ev. alignment), in C++ sizeof is bigger, but can guarantee if it is 2,4,8 bytes.
4 Few conversion tool can convert "object" files i.e. from MS format to Borland etc, but usually/only classic C was possible/safe, because of unknown machine code implementations of vtable.
Hard to touch vtable from high level code, fire analysers for intermediate files (.obj, . etc)
EDIT: story about runtime is different than about compilation. My answer is about compiled code & runtime
EDIT2: quasi assembler code (from my head)
load ax, 2
call vt[ax]
vt:
0x123456
0x126785 // virlual parent func1()
derrived:
vt:
0x123456
0x126999 // overriden finc1()
0x456788 // new method
EDIT3: BTW I can't totally agree that C++ has always better speed JVM/.NET because "these are interpreted". C++ has part of "intepretation", and interpreted part is groving: real component/GUI frameworks have interpreted connections between too (map for example). Out of our discussion: what memory model is better, with C++ delete or GC?
I have a doubt about C++ virtual table recently.
Why does C++ use virtual table?
=>Because C++ compiler does not know the actual function address
--->Why?
=>Because C++ compiler does not know the exact type(Cat? Dog? Animal?) of the object the pointer "panimal" points to
---Why? Is that any way compiler can figure out the object type?
=>Yes, I think the compiler can make it via tracking object type.
Let's consider the sources where an object pointer gets its value. 2 sources indeed.
another pointer
address of class instance
Where does "another pointer" get its value? Eventually, there's a pointer that gets its value from "class instance".
So, via tracking the assignment thread backwards to the original source object
=> the compiler is able to figure out the exact type of a pointer.
=>the compiler knows the address of the exact function being called
=>no virtual table is needed.
Object type tracking saves both virtual table memery and virtual table pointer of each class instances.
Where does object type tracking not work?
Library Linking.
If a library function returns a base-class pointer, there's no way for the compiler to track back to the original source object. The compiler can probably adapt to library code and none-library code. For library classes that are exported out, use virtual table. For other classes, just track theire object type to save memory.
I am not sure whether there's some error in above statements, please kindly point it out if any. Thanks in advance~
In some cases, yes, the compiler can figure out the type a pointer points to at compile time. It is quite easy to construct a case where it cannot though.
int x;
cin >> x;
Animal* p;
if (x == 10)
p = new Cat();
else
p = new Dog();
If the compiler can, in all cases, prove the type of an object, it is free to eliminate virtual tables from its generated code, as per the as-if rule.
the compiler is able to figure out the exact type of a pointer.
yes, but how do you want it to call the right function at runtime? The compiler knows, but c++ has no virtual machine to tell it the type of the object being passed at runtime, ergo, the need for a vtable for virtual functions of inherited types.
Would you rather the compiler creates code for all the different code paths that lead to the execution of each virtual function so it calls the right function at runtime? That would lead to much much bigger binaries, if at all possible.
In this example it becomes clear that, whatever the static code analysis the compiler could take, the actual method that gets called at ptrA->f(); can only be known at runtime.
#include <sys/time.h>
#include <iostream>
#include <stdlib.h>
struct A {
virtual int f()
{
std::cout<<"class A\n";
}
};
struct B: public A {
int f()
{
std::cout<<"class B\n";
}
};
int main()
{
A objA;
B objB;
A* ptrA;
timeval tv;
gettimeofday(&tv, NULL);
unsigned int seed = (unsigned int)tv.tv_sec;
int randVal = rand_r(&seed);
if( randVal < RAND_MAX/2)
{
ptrA=&objA;
}
else
{
ptrA=&objB;
}
ptrA->f();
return 0;
}`
class C
{
public:
C() : m_x(0) { }
virtual ~C() { }
public:
static ptrdiff_t member_offset(const C &c)
{
const char *p = reinterpret_cast<const char*>(&c);
const char *q = reinterpret_cast<const char*>(&c.m_x);
return q - p;
}
private:
int m_x;
};
int main(void)
{
C c;
std::cout << ((C::member_offset(c) == 0) ? 0 : 1);
std::cout << std::endl;
std::system("pause");
return 0;
}
The program above outputs 1. What it does is just check the addresses of the c object and the c's field m_x. It prints out 1 which means the addresses are not equal. My guess is that is because the d'tor is virtual so the compiler has to create a vtable for the class and put a vpointer in the class's object. If I'm already wrong please correct me.
Apparently, it puts the vpointer at the beginning of the object, pushing the m_x field farther and thus giving it a different address. Is that the case? If so does the standard specify vpointer's position in the object? According to wiki it's implementation-dependent. And its position may change the output of the program.
So can you always predict the output of this program without specifying the target platform?
In reality, it is NEARLY ALWAYS laid out in this way. However, the C++ standard allows whatever works to be used. And I can imagine several solutions that doesn't REQUIRE the above to be true - although they would perhaps not work well as a real solution.
Note however that you can have more than one vptr/vtable for an object if you have multiple inheritance.
There are no "vpointers" in C++. The implementation of polymorphism and dynamic dispatch is left to the compiler, and the resulting class layout is not in any way specified. Certainly an object of polymorphic type will have to carry some extra state in order to identify the concrete type when given only a view of a base subobject.
Implementations with vtables and vptrs are common and popular, and putting the vptr at the beginning of the class means that you don't need any pointer adjustments for single inheritance up and downcasts.
Many C++ compilers follow (parts of) the Itanium ABI for C++, which specifies class layout decisions like this. This popular article may also provide some insights.
Yes, it is implementation dependent and no, you can't predict the program's output without knowing the target platform/compiler.
Suppose I have
class A { public: void print(){cout<<"A"; }};
class B: public A { public: void print(){cout<<"B"; }};
class C: public A { };
How is inheritance implemented at the memory level?
Does C copy print() code to itself or does it have a pointer to the it that points somewhere in A part of the code?
How does the same thing happen when we override the previous definition, for example in B (at the memory level)?
Compilers are allowed to implement this however they choose. But they generally follow CFront's old implementation.
For classes/objects without inheritance
Consider:
#include <iostream>
class A {
void foo()
{
std::cout << "foo\n";
}
static int bar()
{
return 42;
}
};
A a;
a.foo();
A::bar();
The compiler changes those last three lines into something similar to:
struct A a = <compiler-generated constructor>;
A_foo(a); // the "a" parameter is the "this" pointer, there are not objects as far as
// assembly code is concerned, instead member functions (i.e., methods) are
// simply functions that take a hidden this pointer
A_bar(); // since bar() is static, there is no need to pass the this pointer
Once upon a time I would have guessed that this was handled with pointers-to-functions in each A object created. However, that approach would mean that every A object would contain identical information (pointer to the same function) which would waste a lot of space. It's easy enough for the compiler to take care of these details.
For classes/objects with non-virtual inheritance
Of course, that wasn't really what you asked. But we can extend this to inheritance, and it's what you'd expect:
class B : public A {
void blarg()
{
// who knows, something goes here
}
int bar()
{
return 5;
}
};
B b;
b.blarg();
b.foo();
b.bar();
The compiler turns the last four lines into something like:
struct B b = <compiler-generated constructor>
B_blarg(b);
A_foo(b.A_portion_of_object);
B_bar(b);
Notes on virtual methods
Things get a little trickier when you talk about virtual methods. In that case, each class gets a class-specific array of pointers-to-functions, one such pointer for each virtual function. This array is called the vtable ("virtual table"), and each object created has a pointer to the relevant vtable. Calls to virtual functions are resolved by looking up the correct function to call in the vtable.
Check out the C++ ABI for any questions regarding the in-memory layout of things. It's labelled "Itanium C++ ABI", but it's become the standard ABI for C++ implemented by most compilers.
I don't think the standard makes any guarantees. Compilers can choose to make multiple copies of functions, combine copies that happen to access the same memory offsets on totally different types, etc. Inlining is just one of the more obvious cases of this.
But most compilers will not generate a copy of the code for A::print to use when called through a C instance. There may be a pointer to A in the compiler's internal symbol table for C, but at runtime you're most likely going to see that:
A a; C c; a.print(); c.print();
has turned into something much along the lines of:
A a;
C c;
ECX = &a; /* set up 'this' pointer */
call A::print;
ECX = up_cast<A*>(&c); /* set up 'this' pointer */
call A::print;
with both call instructions jumping to the exact same address in code memory.
Of course, since you've asked the compiler to inline A::print, the code will most likely be copied to every call site (but since it replaces the call A::print, it's not actually adding much to the program size).
There will not be any information stored in a object to describe a member function.
aobject.print();
bobject.print();
cobject.print();
The compiler will just convert the above statements to direct call to function print, essentially nothing is stored in a object.
pseudo assembly instruction will be like below
00B5A2C3 call print(006de180)
Since print is member function you would have an additional parameter; this pointer. That will be passes as just every other argument to the function.
In your example here, there's no copying of anything. Generally an object doesn't know what class it's in at runtime -- what happens is, when the program is compiled, the compiler says "hey, this variable is of type C, let's see if there's a C::print(). No, ok, how about A::print()? Yes? Ok, call that!"
Virtual methods work differently, in that pointers to the right functions are stored in a "vtable"* referenced in the object. That still doesn't matter if you're working directly with a C, cause it still follows the steps above. But for pointers, it might say like "Oh, C::print()? The address is the first entry in the vtable." and the compiler inserts instructions to grab that address at runtime and call to it.
* Technically, this is not required to be true. I'm pretty sure you won't find any mention in the standard of "vtables"; it's by definition implementation-specific. It just happens to be the method the first C++ compilers used, and happens to work better all-around than other methods, so it's the one nearly every C++ compiler in existence uses.
Situation is following. I have shared library, which contains class definition -
QueueClass : IClassInterface
{
virtual void LOL() { do some magic}
}
My shared library initialize class member
QueueClass *globalMember = new QueueClass();
My share library export C function which returns pointer to globalMember -
void * getGlobalMember(void) { return globalMember;}
My application uses globalMember like this
((IClassInterface*)getGlobalMember())->LOL();
Now the very uber stuff - if i do not reference LOL from shared library, then LOL is not linked in and calling it from application raises exception. Reason - VTABLE contains nul in place of pointer to LOL() function.
When I move LOL() definition from .h file to .cpp, suddenly it appears in VTABLE and everything works just great.
What explains this behavior?! (gcc compiler + ARM architecture_)
The linker is the culprit here. When a function is inline it has multiple definitions, one in each cpp file where it is referenced. If your code never references the function it is never generated.
However, the vtable layout is determined at compile time with the class definition. The compiler can easily tell that the LOL() is a virtual function and needs to have an entry in the vtable.
When it gets to link time for the app it tries to fill in all the values of the QueueClass::_VTABLE but doesn't find a definition of LOL() and leaves it blank(null).
The solution is to reference LOL() in a file in the shared library. Something as simple as &QueueClass::LOL;. You may need to assign it to a throw away variable to get the compiler to stop complaining about statements with no effect.
I disagree with #sechastain.
Inlining is far from being automatic. Whether or not the method is defined in place or a hint (inline keyword or __forceinline) is used, the compiler is the only one to decide if the inlining will actually take place, and uses complicated heuristics to do so. One particular case however, is that it shall not inline a call when a virtual method is invoked using runtime dispatch, precisely because runtime dispatch and inlining are not compatible.
To understand the precision of "using runtime dispatch":
IClassInterface* i = /**/;
i->LOL(); // runtime dispatch
i->QueueClass::LOL(); // compile time dispatch, inline is possible
#0xDEAD BEEF: I find your design brittle to say the least.
The use of C-Style casts here is wrong:
QueueClass* p = /**/;
IClassInterface* q = p;
assert( ((void*)p) == ((void*)q) ); // may fire or not...
Fundamentally there is no guarantee that the 2 addresses are equal: it is implementation defined, and unlikely to resist change.
I you wish to be able to safely cast the void* pointer to a IClassInterface* pointer then you need to create it from a IClassInterface* originally so that the C++ compiler may perform the correct pointer arithmetic depending on the layout of the objects.
Of course, I shall also underline than the use of global variables... you probably know it.
As for the reason of the absence ? I honestly don't see any apart from a bug in the compiler/linker. I've seen inlined definition of virtual functions a few times (more specifically, the clone method) and it never caused issues.
EDIT: Since "correct pointer arithmetic" was not so well understood, here is an example
struct Base1 { char mDum1; };
struct Base2 { char mDum2; };
struct Derived: Base1, Base2 {};
int main(int argc, char* argv[])
{
Derived d;
Base1* b1 = &d;
Base2* b2 = &d;
std::cout << "Base1: " << b1
<< "\nBase2: " << b2
<< "\nDerived: " << &d << std::endl;
return 0;
}
And here is what was printed:
Base1: 0x7fbfffee60
Base2: 0x7fbfffee61
Derived: 0x7fbfffee60
Not the difference between the value of b2 and &d, even though they refer to one entity. This can be understood if one thinks of the memory layout of the object.
Derived
Base1 Base2
+-------+-------+
| mDum1 | mDum2 |
+-------+-------+
When converting from Derived* to Base2*, the compiler will perform the necessary adjustment (here, increment the pointer address by one byte) so that the pointer ends up effectively pointing to the Base2 part of Derived and not to the Base1 part mistakenly interpreted as a Base2 object (which would be nasty).
This is why using C-Style casts is to be avoided when downcasting. Here, if you have a Base2 pointer you can't reinterpret it as a Derived pointer. Instead, you will have to use the static_cast<Derived*>(b2) which will decrement the pointer by one byte so that it correctly points to the beginning of the Derived object.
Manipulating pointers is usually referred to as pointer arithmetic. Here the compiler will automatically perform the correct adjustment... at the condition of being aware of the type.
Unfortunately the compiler cannot perform them when converting from a void*, it is thus up to the developer to make sure that he correctly handles this. The simple rule of thumb is the following: T* -> void* -> T* with the same type appearing on both sides.
Therefore, you should (simply) correct your code by declaring: IClassInterface* globalMember and you would not have any portability issue. You'll probably still have maintenance issue, but that's the problem of using C with OO-code: C is not aware of any object-oriented stuff going on.
My guess is that GCC is taking the opportunity to inline the call to LOL. I'll see if I can find a reference for you on this...
I see sechastain beat me to a more thorough description and I could not google up the reference I was looking for. So I'll leave it at that.
Functions defined in header files are in-lined on usage. They're not compiled as part of the library; instead where the call is made, the code of the function simply replaces the code of the call, and that is what gets compiled.
So, I'm not surprised to see that you are not finding a v-table entry (what would it point to?), and I'm not surprised to see that moving the function definition to a .cpp file suddenly makes things work. I'm a little surprised that creating an instance of the object with a call in the library makes a difference, though.
I'm not sure if it's haste on your part, but from the code provided IClassInterface does not necessarily contain LOL, only QueueClass. But you're casting to a IClassInterface pointer to make the LOL call.
If this example is simplified, and your actual inheritance tree uses multiple inheritance, this might be easily explained. When you do a typecast on an object pointer, the compiler needs to adjust the pointer so that the proper vtable is referenced. Because you're returning a void *, the compiler doesn't have the necessary information to do the adjustment.
Edit: There is no standard for C++ object layout, but for one example of how multiple inheritance might work see this article from Bjarne Stroustrup himself: http://www-plan.cs.colorado.edu/diwan/class-papers/mi.pdf
If this is indeed your problem, you might be able to fix it with one simple change:
IClassInterface *globalMember = new QueueClass();
The C++ compiler will do the necessary pointer modifications when it makes the assignment, so that the C function can return the correct pointer.