How do I know which function will be called? - c++

Today I found the following disturbingly ambiguous situation in our code base:
class Base {
public:
virtual void Irrelevant_Function(void) = 0;
protected:
C_Container * Get_Container(void);
};
class A : public Base, public Not_Important {
public:
inline C_Container * Get_Container(void);
};
class B : public Base, protected SomethingElse {
public:
C_Container * Get_Container(void);
};
Many things were calling the Get_Container method, but not always calling the correct one - note that none of these functions were virtual.
I need to rename the methods Get_Base_Container, Get_A_Container, etc to remove the ambiguity. What rules does C++ use to determine which version of a function it should call? I'd like to start from the "known state" of what should have been getting called, and then figure out the bugs from there.
For example, if I have a pointer to a Base and call Get_Container, I assume it would just call the Base version of the function. What if I have a pointer to an A? What about a pointer to a B? What about an A or B on the heap?
Thanks.

It depends how you're calling the function. If you're calling through an A *, an A & or an A, then you'll be calling A::Get_Container(). If you're calling through a Base *, a Base & (even if they point to/reference an A), then you'll be calling Base::Get_Container().

As long as no virtual functions are involved, it's quite easy. If you're working directly with an object, it's the object's method that gets called; if you're working with a pointer or reference, it's the type of the pointer or reference that determines the method, and the type of the object pointed to doesn't matter.

A method is first looked up according to the object's static type. If it is non-virtual there, you're done: that's the method that's called. The dynamic type is what virtual methods, dynamic_cast, and typeid use, and is the "actual" type of the object. The static type is what the static type system works with.
A a; // Static type and dynamic type are identical.
Base &a_base = a; // Static type is Base; dynamic type is A.
a.Get_Container(); // Calls A::Get_Container.
a_base.Get_Container(); // Calls Base::Get_Container.
B *pb = new B(); // Static type and dynamic type of *pb (the pointed-to
// object) are identical.
Base *pb_base = pb; // Static type is Base; dynamic type is B.
pb->Get_Container(); // Calls B::Get_Container.
pb_base->Get_Container(); // Calls Base::Get_Container.
I've assumed above that the protected Base::Get_Container method is accessible, otherwise those will be compile errors.
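As a self-contained sketch of the rule above: the classes below are hypothetical stand-ins for the ones in the question (Get_Name is an invented non-virtual method that reports which class's version ran), so the static-type lookup can be checked directly.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the classes in the question; each
// non-virtual Get_Name reports which class's version actually ran.
struct Base {
    std::string Get_Name() { return "Base"; }
};
struct A : Base {
    std::string Get_Name() { return "A"; }   // hides Base::Get_Name
};

// The static type of the expression picks the function.
std::string call_through_derived(A& a) { return a.Get_Name(); }
std::string call_through_base(Base& b) { return b.Get_Name(); }

void demo_static_dispatch() {
    A a;
    assert(call_through_derived(a) == "A");
    assert(call_through_base(a) == "Base");   // same object, viewed as Base

    Base* p = &a;
    assert(p->Get_Name() == "Base");
    assert(static_cast<A*>(p)->Get_Name() == "A");
}
```

Whether the object lives on the stack or the heap makes no difference here; only the type of the expression used for the call matters.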

A couple of additional points to note here:
Name lookup stops at the first scope that contains a match. For example, when calling the method on an object with static type 'B', the compiler first searches the interface of 'B' for a declaration of that name. Only if it finds none does it look at the interface of Base. This is why, from the compiler's point of view, there is no ambiguity and it can resolve the call. If your real code has overloading etc., this may be an issue.
Secondly, it is often forgotten that the 'protected' keyword applies at class and not object level. So for example:
class Base {
protected:
C_Container * Get_Container(void);
};
class B : public Base{
public:
C_Container * Get_Container(void)
{
B b;
// Call the 'protected' base class method on another object.
return b.Base::Get_Container();
}
};
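The overloading caveat in the first point can be sketched as follows. The class names Hiding and Exposing and the Get overloads are made up for illustration; the point is only that one declaration in the derived class hides every base overload of the same name unless a using-declaration brings them back.

```cpp
#include <cassert>
#include <string>

struct Base {
    std::string Get(int)    { return "Base::Get(int)"; }
    std::string Get(double) { return "Base::Get(double)"; }
};

// Declaring any Get in the derived class hides every Base::Get overload.
struct Hiding : Base {
    std::string Get(double) { return "Hiding::Get(double)"; }
};

// A using-declaration makes the Base overloads visible again.
struct Exposing : Base {
    using Base::Get;
    std::string Get(double) { return "Exposing::Get(double)"; }
};

void demo_name_hiding() {
    Hiding h;
    // 42 is an int, but only Hiding::Get(double) is found, so it converts.
    assert(h.Get(42) == "Hiding::Get(double)");
    assert(h.Base::Get(42) == "Base::Get(int)");   // qualified name bypasses hiding

    Exposing e;
    assert(e.Get(42) == "Base::Get(int)");         // exact match is visible again
    assert(e.Get(1.5) == "Exposing::Get(double)");
}
```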

Related

Why can a base pointer access a derived member variable in a virtual function

class Base {
public:
virtual void test() {};
virtual int get() {return 123;}
private:
int bob = 0;
};
class Derived: public Base{
public:
virtual void test() { alex++; }
virtual int get() { return alex;}
private:
int alex = 0;
};
Base* b = new Derived();
b->test();
When test and get are called, the implicit this pointer is passed in. Is it because a Derived object contains a sub-layout that is identical to that of a plain Base object, so that the this pointer works as both a base pointer and a derived pointer?
Another way to put it is, the memory layout for Derived is like
vptr <-- this
bob
alex
That is why it can use alex in b->test(), right?
Inside of Derived's methods, the implicit this pointer is always a Derived* pointer (more generically, the this pointer always matches the class type being called). That is why Derived::test() and Derived::get() can access the Derived::alex member. That has nothing to do with Base.
The memory layout of a Derived object begins with the data members of Base, followed by optional padding, followed by the data members of Derived. That allows you to use a Derived object wherever a Base object is expected. When you pass a Derived* pointer to a Base* pointer, or a Derived& reference to a Base& reference, the compiler will adjust the pointer/reference accordingly at compile-time to point at the Base portion of the Derived object.
When you call b->test() at runtime, where b is a Base* pointer, the compiler knows test() is virtual and will generate code that accesses the appropriate slot in b's vtable and call the method being pointed at. But, the compiler doesn't know what derived object type b is actually pointing at in runtime (that is the whole magic of polymorphism), so it can't automatically adjust the implicit this pointer to the correct derived pointer type at compile-time.
In the case where b is pointing at a Derived object, b's vtable is pointing at Derived's vtable. The compiler knows the exact offset of the start of Derived from the start of Base. So, the slot for test() in Derived's vtable will point to a private stub generated by the compiler to adjust the implicit Base *this pointer into a Derived *this pointer before then jumping into the actual implementation code for Derived::test().
Behind the scenes, it is roughly (not exactly) implemented like the following pseudo-code:
void Derived_test_stub(Base *this)
{
Derived *adjusted_this = reinterpret_cast<Derived*>(reinterpret_cast<uintptr_t>(this) + offset_from_Base_to_Derived);
Derived::test(adjusted_this);
}
int Derived_get_stub(Base *this)
{
Derived *adjusted_this = reinterpret_cast<Derived*>(reinterpret_cast<uintptr_t>(this) + offset_from_Base_to_Derived);
return Derived::get(adjusted_this);
}
struct vtable_Base
{
void* funcs[2] = {&Base::test, &Base::get};
};
struct vtable_Derived
{
void* funcs[2] = {&Derived_test_stub, &Derived_get_stub};
};
Base::Base()
{
this->vtable = &vtable_Base;
bob = 0;
}
Derived::Derived() : Base()
{
Base::vtable = &vtable_Derived;
this->vtable = &vtable_Derived;
alex = 0;
}
...
Base *b = new Derived;
//b->test(); // calls Derived::test()...
typedef void (*test_type)(Base*);
reinterpret_cast<test_type>(b->vtable->funcs[0])(b); // calls Derived_test_stub()...
//int i = b->get(); // calls Derived::get()...
typedef int (*get_type)(Base*);
int i = reinterpret_cast<get_type>(b->vtable->funcs[1])(b); // calls Derived_get_stub()...
The actual details are a bit more involved, but that should give you a basic idea of how polymorphism is able to dispatch virtual methods at runtime.
What you've shown is reasonably accurate, at least for a typical implementation. It's not guaranteed to be precisely as you've shown it (e.g., the compiler might easily insert some padding between bob and alex), but either way it "knows" that alex is at some predefined offset from this, so it can take a pointer to Base, calculate the correct offset from it, and use what's there.
Not what you asked about, so I won't try to get into detail, but just a fair warning: computing such offsets can/does get a bit more complex when/if multiple inheritance gets involved. Not so much for accessing a member of the most derived class, but if you access a member of a base class, it has to basically compute an offset to the beginning of that base class, then add an offset to get to the correct offset within that base class.
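To make the multiple-inheritance remark concrete, here is a small sketch with invented types First, Second and Both. It only demonstrates that the two base subobjects live at different addresses and that static_cast applies the adjustment in both directions; the actual offsets are up to the implementation.

```cpp
#include <cassert>

struct First  { int first_value = 1; };
struct Second { int second_value = 2; };
struct Both : First, Second { int own_value = 3; };

// Converting to a base pointer may change the address: the result points
// at that base's subobject inside the complete object.
const void* address_of_first(const Both& b)  { return static_cast<const First*>(&b); }
const void* address_of_second(const Both& b) { return static_cast<const Second*>(&b); }

void demo_pointer_adjustment() {
    Both b;
    // Two non-empty base subobjects cannot share one address.
    assert(address_of_first(b) != address_of_second(b));

    // static_cast applies the reverse adjustment on the way back down.
    Second* s = &b;
    assert(static_cast<Both*>(s) == &b);
    assert(s->second_value == 2);
}
```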
A derived class is not a separate class but an extension. If something is allocated as derived, then a pointer (which is just an address in memory) will be able to find everything from the derived class. Classes don't exist in assembly; the compiler keeps track of everything according to how it is allocated in memory and provides appropriate checking accordingly.

Post constructor initialization

I have a set of objects derived from common base, ApiObject. I need to be able to register all ApiObjects in a separate data structure, but I need to have an actual address of the object being created, not the base class (I'm using multiple inheritance).
I can't put the code to register an object in ApiObject constructor, because it does not know the address of the derived object; nor can I put it in the derived classes' constructors, because we have no way of knowing whether we are actually constructing another derived class (e.g. if class B is inherited from A, and both can be constructed).
So the only option I see is to explicitly call the registration function every time we create an object, as in
B* b = new B(...);
RegisterObject(b);
However, this doesn't seem to be a very good solution, as I have to remember to call this function every time.
I suppose I should give more context to explain why I'm doing this. The objects are created via an overloaded new operator, and it needs the object to know the context it was created in (Lua state). E.g.
Foo* object = new(L) Foo(...);
// Foo is derived from ApiObject, and we want ApiObject to have a reference to L
Currently it is done in a somewhat inelegant way - the new operator allocates additional bytes before the object and stores the L pointer in there, along with some additional data to describe the object type. The base class then receives a pointer to this 'metadata' via the init function.
Otherwise, the first thing that comes to mind are virtual functions, but they can't be called from the constructor, so I'd have to register the base ApiObject pointer but only call the virtual function at some later point, and I'm not sure that's prettier than my current implementation.
What is the type required for RegisterObject? If it takes a
Base*, then you can call it from the constructor of Base,
regardless of the final hierarchy. If it takes some other type,
then you want to call it from the constructor of that type; you
do not want to call it from all classes derived from Base,
but only for those derived from whatever type it takes.
If RegisterObject takes a Base*, and you call it from a
function in a derived class, the first thing that will occur is
that the pointer you pass it will be converted to a Base*.
RegisterObject never receives a pointer to the derived object,
only to the Base in the derived object.
You can additionally derive every object you want to register from a CRTP class that performs the registration, e.g.
template<class T>
struct registrar_t
{
    registrar_t()
    {
        // RegisterObject is the registration function from the question.
        RegisterObject(derived());
    }
    T* derived()
    {
        return static_cast<T*>(this);
    }
};
struct IWantToRegister : registrar_t<IWantToRegister>, ApiObject
{
};
Also, be careful: the pointer returned by derived() is correct, but the object is not yet initialized at that point, because it is being accessed from a parent constructor.
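A compilable sketch of this CRTP idea follows. The registry() function, the Registrar template name, OtherBase and Foo are all hypothetical, since the question does not show RegisterObject; the stand-in only records the address it is handed and never dereferences it, in line with the warning above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical registry: the question's RegisterObject is not shown, so
// this stand-in just records the address it is given.
inline std::vector<void*>& registry() {
    static std::vector<void*> objects;
    return objects;
}

struct ApiObject { int id = 0; };

template <class T>
struct Registrar {
    Registrar() {
        // T is not constructed yet: store the address, do not dereference it.
        registry().push_back(static_cast<T*>(this));
    }
};

struct OtherBase { int padding = 0; };

// Multiple inheritance, so ApiObject need not sit at the start of Foo.
struct Foo : OtherBase, ApiObject, Registrar<Foo> {
    int value = 7;
};

void demo_crtp_registration() {
    Foo foo;
    // The registered address is that of the complete Foo object.
    assert(registry().back() == static_cast<void*>(&foo));
    assert(foo.value == 7);
}
```

This sketch never removes entries, so a real registry would also need to unregister in a destructor.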
Maybe kassak's solution is more elegant, I'm not that advanced, but I'd recommend something like this (register should be called in the constructor so you don't have to write it every time):
#include <iostream>
#include <string>
struct ApiObject;
void registerObj(ApiObject *foo);
struct ApiObject{
public:
ApiObject(std::string n){
name = n;
registerObj(this);
}
std::string name;
};
void registerObj(ApiObject *foo){
std::cout<<"register called on "<<foo->name<<"\n";
}
struct A : public ApiObject{
public:
A(std::string n) : ApiObject(n) {
std::cout<<"init A\n";
}
};
struct B : public ApiObject{
public:
B(std::string n) : ApiObject(n) {
std::cout<<"init B\n";
}
};
int main(){
B *b = new B("b obj");
A *a = new A("a obj");
delete b;
delete a;
}
You can call the registration function from the base constructor. Just make the base destructor virtual. The address will be the same for the base and the derived class under single inheritance; with multiple inheritance the base subobject may sit at an offset. Just don't use the pointer before the whole object is created.
Once all the objects are fully created, the pointer can safely be used through virtual functions or converted to the derived class with dynamic_cast.
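A minimal sketch of that approach, assuming a hypothetical registry() in place of the question's RegisterObject and invented types OtherBase and Foo: the base constructor registers its own this, and the most-derived address is recovered later with dynamic_cast.

```cpp
#include <cassert>
#include <vector>

struct ApiObject;

// Hypothetical registry standing in for the question's RegisterObject.
inline std::vector<ApiObject*>& registry() {
    static std::vector<ApiObject*> objects;
    return objects;
}

struct ApiObject {
    ApiObject() { registry().push_back(this); }   // only the ApiObject part exists here
    virtual ~ApiObject() = default;               // makes the type polymorphic
};

struct OtherBase {
    virtual ~OtherBase() = default;
    int other = 0;
};

struct Foo : OtherBase, ApiObject {
    int value = 42;
};

// Recover the address of the complete object from a registered pointer.
// Only meaningful once construction has finished.
inline void* complete_object_address(ApiObject* p) {
    return dynamic_cast<void*>(p);
}

void demo_base_registration() {
    Foo foo;
    ApiObject* registered = registry().back();
    assert(registered == static_cast<ApiObject*>(&foo));
    assert(dynamic_cast<Foo*>(registered) == &foo);
    assert(complete_object_address(registered) == static_cast<void*>(&foo));
}
```

The sketch leaves stale entries behind when objects die; removing them in ~ApiObject would be the natural counterpart.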

What happens when an upcast pointer is passed by reference to a function expecting derived?

I've inherited something similar to this code:
class Base
{
public:
virtual double getElement(int i) {return NULL;}
virtual Derived* GetAsDerived() {return this;}
};
class Derived : public Base
{
public:
virtual double getElement(int i) {return vec[i];}
private:
std::vector<double>;
};
And a function as follows:
void f(Base& b)
{
Derived* d = b.GetAsDerived();
}
The program flow is something similar to this
Derived A;
/* get data into vector in A */
f(A);
After the call to GetAsDerived() in f, the vector in the object pointed to by d contains junk. Inside the debugger I can go back up the call stack and see that the vector inside A still has valid data.
I am sure that all this weird upcasting and downcasting is the cause, but I was not able to find a formal explanation in the specs or online.
So why does it fail in such a manner?
Upcasting is formally defined in 4.10/3 of the standard. Downcasting is in 5.2.9/5, although that's not specifically what you're doing.
Addressing the title of your question, it's not possible to pass (or return) a base class pointer to (from) a function whose parameters (return type) indicate a derived class pointer. If the parameters indicate a reference to base class pointer, then it's not possible to pass a derived class pointer by reference. If a base class pointer value, it is possible to pass a derived pointer because of the implicit conversion.
I can't say why your code fails: as commented above, your real code must fail for different reasons than your example code does, since your example code doesn't work at all.
In function f, Derived* d = b.GetAsDerived(); does not change anything about b, so it can't change the object b refers to, in this case A. If f contained b = d or something alike, then A would be changed.
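For comparison, here is one way the GetAsDerived pattern could be written so that it compiles. This is a guess at the intent, not the asker's real code: the forward declaration, the nullptr default, the constructor and the const/size_t signature are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class Derived;   // forward declaration so Base can mention Derived*

class Base {
public:
    virtual ~Base() = default;
    virtual double getElement(std::size_t) const { return 0.0; }
    virtual Derived* GetAsDerived() { return nullptr; }   // a plain Base is not a Derived
};

class Derived : public Base {
public:
    explicit Derived(std::vector<double> values) : vec(std::move(values)) {}
    double getElement(std::size_t i) const override { return vec[i]; }
    Derived* GetAsDerived() override { return this; }
private:
    std::vector<double> vec;
};

// Taking Base& neither copies nor slices; b still refers to the caller's object.
Derived* f(Base& b) { return b.GetAsDerived(); }

void demo_get_as_derived() {
    std::vector<double> data{1.0, 2.0, 3.0};
    Derived a(data);
    Derived* d = f(a);
    assert(d == &a);                    // same object, data intact
    assert(d->getElement(1) == 2.0);

    Base plain;
    assert(f(plain) == nullptr);
}
```

With this shape the pointer obtained in f is the address of the original object, so junk in the vector would have to come from somewhere else, such as a copy or slice made elsewhere in the real code.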

C++ allocate objects on heap of base class with protected constructors via inheritance

I have a class with protected constructor:
class B {
protected:
B(){};
};
Now I derive from it and define two static functions and I manage to actually create objects of the class B, but not on the heap:
class A : public B {
public:
static B createOnStack() {return B();}
//static B* createOnHeap() {return new B;} //Compile time Error on VS2010
};
B b = A::createOnStack(); //This works on VS2010!
The question is: 1) Is VS2010 wrong in allowing the first case? 2) Is it possible to create objects of B without modifying B in any way (no friendship and no extra functions).
I am asking, because it is possible to make something similar when dealing with instances of B and its member functions, see:
http://accu.org/index.php/journals/296
Thank you in advance for any suggestion!
Kind regards
Yes, this code is non-compliant. This is related to special rules for protected member access (C++03 draft, 11.5/1):
When a friend or a member function of a derived class references a protected nonstatic member function or protected nonstatic data member of a base class, an access check applies in addition to those described earlier in clause 11. Except when forming a pointer to member (5.3.1), the access must be through a pointer to, reference to, or object of the derived class itself (or any class derived from that class) (5.2.5).
When you use B() or new B(), you're effectively using the constructor through a pointer to the base class.
You can create an object of type A (I assume that A is as posted - no additional members/non-static functions) and use it instead. If you're creating it on stack, everything should work fine, unless you're trying to assign other objects of type B to it. If you're creating it on heap, everything is fine as long as B's destructor is virtual. If B's destructor is not virtual, and you're returning new A() as a B*, then deleting the pointer is technically undefined behavior (5.3.5/3):
In the first alternative (delete object), if the static type of the operand is different from its dynamic type, the static type shall be a base class of the operand’s dynamic type and the static type shall have a virtual destructor or the behavior is undefined.
However you'll probably find it working fine in practice, so you can rely on the actual behavior if there is no other workaround (i.e. use it as a last resort).
There is a common misunderstanding on what protected actually means. It means that the derived class can access that particular member on itself not on other objects. The compiler should have rejected both functions as in both cases it is accessing the constructor of an object that is not of the derived type.
Another example, easier to discuss for its correctness would be:
struct base {
protected:
int x;
};
struct derived : base{
static void modify( base& b ) {
b.x = 5; // error
}
};
The commented line is an error as it is trying to modify an object of type base, not necessarily a derived object. If the language allowed that code to compile, then you would be able to modify an object of type base or even objects of types derived1, derived2... effectively breaking access rules.
struct derived2 : base {};
int main() {
base b;
derived2 d;
derived::modify( b ); // modifying a base!!!
derived::modify( d ); // modifying a derived2!!!
}
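The allowed counterpart can be sketched as follows: the same base and derived types, but with modify taking a derived&, plus a hypothetical read helper added only so the result can be observed.

```cpp
#include <cassert>

struct base {
protected:
    int x = 0;
};

struct derived : base {
    // Allowed: the access goes through an object of type derived.
    static void modify(derived& d) { d.x = 5; }
    static int read(const derived& d) { return d.x; }

    // static void modify(base& b) { b.x = 5; }   // error: b might not be a derived
};

void demo_protected_access() {
    derived d;
    derived::modify(d);
    assert(derived::read(d) == 5);
}
```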

In C++ if a member function is virtual when can static binding be used?

In C++ when can a virtual function use static binding? If it is being accessed through a pointer, accessed directly, or never?
When a virtual method is called through a pointer or reference, dynamic binding is used. Any other time, compile-time binding is used. Ex:
class C { public: virtual void foo(); };
void Foo(C* a, C& b, C c) {
a->foo(); // dynamic
b.foo(); // dynamic
c.foo(); // static (compile-time)
}
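A checkable version of the same three cases is sketched below. The derived type D and the string return values are additions for illustration; note that passing by value copies only the C part, which is why the call there can be bound at compile time.

```cpp
#include <cassert>
#include <string>

struct C {
    virtual ~C() = default;
    virtual std::string foo() const { return "C"; }
};
struct D : C {
    std::string foo() const override { return "D"; }
};

std::string via_pointer(const C* p)   { return p->foo(); }   // dynamic
std::string via_reference(const C& r) { return r.foo(); }    // dynamic
std::string via_value(C c)            { return c.foo(); }    // c is exactly a C

void demo_binding() {
    D d;
    assert(via_pointer(&d) == "D");
    assert(via_reference(d) == "D");
    assert(via_value(d) == "C");   // the D part was sliced off by the copy
}
```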
If you want to call the base class version of a function, you can do that by explicitly naming the base class:
#include <cstdio>
class Base
{
public:
virtual ~Base() {}
virtual void DoIt() { printf("In Base::DoIt()\n"); }
};
class Derived : public Base
{
public:
virtual void DoIt() { printf("In Derived::DoIt()\n"); }
};
Base *basePtr = new Derived;
basePtr->DoIt(); // Calls Derived::DoIt() through virtual function call
basePtr->Base::DoIt(); // Explicitly calls Base::DoIt() using normal function call
delete basePtr;
Static binding can only be done when the object's type is totally unambiguous at compile time. I can only think of four places where an abstract object's type is unambiguous: in the constructor, in the destructor, when declared locally, and within the same scope as a dynamic allocation. I don't know the standard that well, so I couldn't say what it says about those four possibilities (I'd say the first two are statically bound, the third possibly statically bound, and the last not, although it probably says it's undefined or implementation dependent). Other than at those points, the object being accessed through a base class pointer could be a derived class object, and the current translation unit has no way of knowing, so static binding is not possible. The function could be called with a pointer to the base class in one instance and a pointer to a derived class in another!
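On the constructor case, the language does pin the result down: a virtual call made while a base constructor runs resolves to that base's version, because the derived part does not exist yet. The sketch below uses invented names (seen_in_ctor, name) to show this; whether the compiler emits a direct call or a vtable lookup is an implementation detail.

```cpp
#include <cassert>
#include <string>

struct Base {
    Base() { seen_in_ctor = name(); }   // resolves to Base::name here
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
    std::string seen_in_ctor;
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

void demo_ctor_binding() {
    Derived d;
    assert(d.seen_in_ctor == "Base");   // Derived part did not exist yet
    assert(d.name() == "Derived");      // after construction: final overrider

    Base& ref = d;
    assert(ref.name() == "Derived");      // virtual dispatch
    assert(ref.Base::name() == "Base");   // qualified call: no virtual dispatch
}
```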