Say we have a situation like this:
base.h:
class Base { };
derived.h:
#include "base.h"
class Derived : public Base { };
extern Derived *variable;
derived.cpp:
#include "derived.h"
Derived *variable;
Is it correct to declare variable as a pointer to Base in elsewhere.cpp?
class Base;
extern Base *variable;
The C++Builder linker doesn't complain and everything seems to work. Is this safe and correct according to the standard, or should every declaration of variable be of the same type?
Here are a few ways this can go wrong (aside from being undefined behavior, which means that you shouldn't even rely on one of those happening):
The compiler might make the type part of the mangling of the variable. Apparently C++Builder doesn't, but I'm pretty sure MSVC does. If it does, you get linker errors.
Suppose at some point Derived is changed to
class Derived : public Something, public Base {};
where Something isn't empty, then in most ABIs, a Derived* changes its value when it's cast to a Base*. However, the aliasing of the global variable bypasses this adjustment, leaving you with a Base* that doesn't point to a Base.
The compiler might actually detect your ODR violation through some other means and just error out.
What happens if you assign an object of type OtherDerived (which also derives from Base, but not Derived) to your variable? The parts of the program that see it as a Derived* expect it to point to a Derived, but it really points to an OtherDerived. You can see the most fascinating effects in such code. You could set it up so that calling virtual function foo calls the Derived version while foo2 calls the OtherDerived version. The possibilities are endless.
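The pointer adjustment described above can be observed directly. This is a minimal sketch, assuming a typical ABI that lays out non-empty bases in declaration order; the member names and the helpers base_offset and read_base_member are made up for illustration, and the standard does not promise any particular offset:

```cpp
#include <cassert>
#include <cstddef>

struct Something { int s = 1; };   // non-empty, so Base cannot share its storage
struct Base      { int b = 2; };
struct Derived : Something, Base {};

// Byte distance between the start of the Derived object and its Base subobject.
// A proper conversion applies this adjustment; a mismatched extern declaration skips it.
std::ptrdiff_t base_offset(Derived& d) {
    Base* as_base = &d;  // implicit conversion, adjusted by the compiler
    return reinterpret_cast<char*>(as_base) - reinterpret_cast<char*>(&d);
}

// Reads the Base member through the correctly converted pointer.
int read_base_member(Derived& d) {
    Base* as_base = &d;
    return as_base->b;
}
```

If elsewhere.cpp reads the same storage as a Base*, it gets the unadjusted address, so on such an ABI its idea of Base::b would land on Something::s instead.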
It's not OK. If the name variable is supposed to refer to the same entity (which it is, since it's extern), then it must have the same type. Otherwise, you'll be violating the ODR (One Definition Rule).
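One way to stay within the rule is to keep a single declaration (the one in derived.h) and convert at the point of use. A rough sketch, squeezed into one translation unit for brevity; the accessor variable_as_base and the id member are hypothetical:

```cpp
#include <cassert>

struct Base { int id = 7; };
struct Derived : Base {};

// What derived.h would declare and derived.cpp would define.
Derived* variable = nullptr;

// Code that only cares about the Base interface converts here,
// so the compiler sees the real type and can adjust the pointer if needed.
Base* variable_as_base() {
    return variable;   // implicit Derived* -> Base* conversion
}
```

elsewhere.cpp would then include derived.h (or a header declaring only the accessor) instead of redeclaring variable with another type.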
Of course it's not fine.
It creates an ambiguity in some cases:
// Example Function:
void do_stuff(Base* b);
// Code
do_stuff(variable); // You could mean Derived* or Base*.
// You would write the same thing, but mean
// 2 different things.
I am working with a GitHub library and ran across a derived class instantiation that perplexes me. In abbreviated form,
class A
{
public:
A() {}
int AFunc(void) { return(1); }
};
class B : public A
{
public:
B(void) : A() {}
int BFunc(void) { return(2); }
};
Within an include file, the class is instantiated as follows:
A &tObject = *(new B());
Sample code then refers to 'tObject' as global variable calling methods from class A and/or B.
For example:
tObject.AFunc();
tObject.BFunc();
So here's the question, is that instantiation legal?
The compiler is only fussing on the call to a service class's method, saying that class A has no such member. That error makes sense to me and I've narrowed the issue to the above explanation.
While I do not have broad compiler experience, I have been programming in C++ for many years. I've never seen such a construct.
Would someone kindly explain how an object declared, in my example, as 'class A' can access methods from the derived class B?
In my experience, I've always declared the derived class as a pointer and then accessed methods from the base or derived class using the '->' construct. Oftentimes, I've stored the derived class as a pointer to the base and then performed a cast to convert when or if I needed access to the derived class's methods.
An insight is highly appreciated.
It cannot. The compiler is right to complain; there is no way this is valid. Remember that C++ is a statically typed language, which means that the compiler will look for a function named BFunc in A, the static type of tObject, and fail, as there is no such function.
This might be a compiler extension of some sort, but anyways, this isn't legal standard C++. Most probably, the author wanted to make BFunc a virtual method in A, which would have made the access legal.
Would someone kindly explain how an object declared, in my example, as 'class A' can access methods from the derived class B?
As explained, this cannot be.
I've always declared the derived class as a pointer and then accessed methods from the base or derived class using the '->' construct.
You can also do this with references, not just with pointers. Although this is done less often than pointers, so this might explain why you haven't encountered this yet.
Oftentimes, I've stored the derived class as a pointer to the base and then performed a cast to convert when or if I needed access to the derived class's methods.
Exactly, this is the correct way to access the derived class members, because then the compiler knows the type of the object and can actually find BFunc and call it. Now, if the object is not really a B, then you have undefined behavior, but yes, this is what one should do.
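The same cast works on a reference. A small sketch using the A and B from the question; call_bfunc is a made-up helper, and it assumes (without checking) that the reference really refers to a B:

```cpp
#include <cassert>

class A { public: int AFunc() { return 1; } };
class B : public A { public: int BFunc() { return 2; } };

// Calls BFunc through a reference whose static type is A.
// Precondition (assumed, not verified): `a` refers to an object of type B.
int call_bfunc(A& a) {
    return static_cast<B&>(a).BFunc();
}
```

With A& tObject = *(new B()); from the question, tObject.AFunc() compiles as is, while BFunc is only reachable through a cast like the one in call_bfunc.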
Also, please get your terminology right:
the class is instantiated as follows
If there are no templates involved, then there is no instantiation happening. The only thing you are doing here is declaring or more specifically defining a variable named tObject.
// The declaration of the reference as a reference to the base class is not a problem, and is actually performed in some STL implementations.
class A
{
};
class B : public A
{
public:
void f1() {}
};
int main()
{
A * a = new B; // You know it's OK using pointers
A & a2 = *(new B); // Also OK, a2 is a reference to a place in memory which is-a-kind-of A
// a2.f1(); // Will not compile - the compiler only knows for sure it's of type A
((B&) a2).f1(); // This will work. There is really a B there
return 0;
}
I want to, basically, inherit a C struct in C++ (well, literally). I have:
struct foo { // C-side definition
int sz;
/* whatever */
// no virtual destructor, special mechanism
};
class cxx_class {
/* something here */
// no virtual destructor, no need
};
class derived : public foo /*, public cxx_class */ {
/* some other stuff here */
};
derived instances (in the form of foo*) will be passed back to the external library, which only knows and uses the foo part of derived (of course). But the problem is that the library (I assume) uses only C-style casts, which are equivalent to reinterpret_cast in C++, so the foo in derived must be at the beginning of the memory block.
I wonder if this is defined behavior and does standard guarantee this?
In other words:
EDIT: The two questions are not the same as pointed out by answers and this part is not answered.
Sometimes I use static_cast for downcasts, but some of the code uses reinterpret_cast. Is it guaranteed that the two always give the same answer?
derived instances (in form of foo*) will be passed back to the external library which only knows and uses the foo part of derived (of course).
If your functions take a foo*, it doesn't matter. Even if constructing your types in such a way that foo is not placed at the beginning of derived, the conversion from derived* to foo* happens in the C++ code which does know about the type layout. The external function then doesn't need to worry about it.
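To illustrate the point, here is a sketch where foo is deliberately not the first base. The contents of cxx_class, the constructor, and the function library_read_size are invented stand-ins for the real library:

```cpp
#include <cassert>

struct foo { int sz; };               // stands in for the C-side struct
struct cxx_class { int tag = 42; };   // hypothetical non-empty second base

// foo is NOT the first base here, so it most likely does not sit at offset 0.
struct derived : cxx_class, foo {
    derived() { sz = 16; }
};

// Stand-in for a library entry point that only understands foo.
extern "C" int library_read_size(foo* f) { return f->sz; }

int pass_to_library(derived& d) {
    // Implicit derived* -> foo* conversion; the compiler applies any offset.
    return library_read_size(&d);
}
```

The library still receives a pointer to a real foo. Trouble only starts when C++ code turns that foo* back into a derived* with reinterpret_cast instead of static_cast.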
does standard guarantee this?
No.
Since you have no virtual methods though, why don't you simply make foo the first member of derived?
Then check with static_assert that the derived class is standard-layout (std::is_standard_layout), which is a relatively good guarantee.
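A sketch of that suggestion. The names wrapper, extra and from_foo are made up; the relevant rule is that a pointer to a standard-layout class object and a pointer to its first non-static data member can be converted into each other with reinterpret_cast:

```cpp
#include <cassert>
#include <type_traits>

struct foo { int sz; };   // stands in for the C-side struct

struct wrapper {          // hypothetical replacement for `derived`
    foo base;             // first member: shares its address with the wrapper object
    int extra;
};

static_assert(std::is_standard_layout<wrapper>::value,
              "wrapper must stay standard-layout for the cast below to be valid");

// Recovers the wrapper from a foo* that is known to point at wrapper::base.
wrapper* from_foo(foo* f) {
    return reinterpret_cast<wrapper*>(f);
}
```

The static_assert fires as soon as somebody adds a virtual function or otherwise breaks the layout assumption, which is the whole point of checking it.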
The standard has no requirements about the layout of class hierarchies when multiple classes in that hierarchy have non-static data members.
Now, if derived and cxx_class are both empty (no non-static data members), then derived shall be a standard layout type in C++11+.
I have a class that derives from a C struct. The class does not do anything special, other than initialization in the constructor, deinitialization function during the destructor, and a few other methods that call into C functions. Basically, it's a run-of-the-mill wrapper. Using GCC, it complained that my destructor was not virtual, so I made it that. Now I run into segfaults.
/* C header file */
struct A
{
/* ... */
};
// My C++ code
class B : public A
{
public:
B() { /* ... init ... */ }
virtual ~B() { /* ... deinit ... */ }
void doIt() // renamed: "do" is a reserved keyword and cannot be a function name
{
someCFunction(static_cast<A *>(this));
}
};
I was always under the assumption that the static_cast would return the correct pointer to the base class, pruning off the virtual table pointer. So this may not be the case, since I get a segfault in the C function.
By removing the virtual keyword, the code works fine, except that I get a gcc warning. What is the best work around for this? Feel free to enlighten me :).
Both the explicit and implicit conversion to A* are safe. There is neither need for an explicit cast, nor is it going to introduce vtables anywhere, or anything like that. The language would be fundamentally unusable if this were not the case.
I was always under the assumption that the static_cast would return
the correct pointer to the base class, pruning off the virtual table
pointer.
Is absolutely correct.
The destructor needs to be virtual only if delete ptr; is called where ptr has type A* (or the destructor is invoked manually through such a pointer). And it would be A's destructor that would have to be virtual, which it isn't.
Whatever the problem is in your code, it has nothing to do with the code shown. You need to expand your sample considerably.
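For what it's worth, here is a stripped-down version of the question's setup that behaves as described. The value member and the body of someCFunction are assumptions, since the real struct and function were not shown:

```cpp
#include <cassert>

struct A { int value; };   // stands in for the C struct

// Stand-in for the C function from the question; it only sees the A part.
extern "C" int someCFunction(A* a) { return a->value; }

class B : public A {
public:
    B() { value = 123; }
    virtual ~B() {}                               // gives B a vtable pointer
    int call() { return someCFunction(this); }    // implicit B* -> A* conversion
};
```

The conversion hands the C function the address of the A subobject, wherever the compiler placed it relative to the vtable pointer.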
The destructor of base classes should be virtual. Otherwise, there's a chance that you run into undefined behavior. This is just a speculation, as the code is not enough to tell the actual reason.
Try making the destructor of A virtual and see if it crashes.
Note that a class and a struct are the same thing, other than default access level, so the fact that one's a class and the other a struct has nothing to do with it.
EDIT: If A is a C-struct, use composition instead of inheritance - i.e. have an A member inside of B instead of extending it. There's no point of deriving, since polymorphism is out of the question.
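A rough sketch of the composition approach. The handle member and the a_init, a_deinit and a_query functions are hypothetical stand-ins for whatever the C library really offers:

```cpp
#include <cassert>

struct A { int handle; };   // stand-in for the C struct

// Hypothetical stand-ins for the C API.
extern "C" void a_init(A* a)         { a->handle = 1; }
extern "C" void a_deinit(A* a)       { a->handle = 0; }
extern "C" int  a_query(const A* a)  { return a->handle; }

class B {
public:
    B()  { a_init(&a_); }
    ~B() { a_deinit(&a_); }          // no need for virtual: nothing derives from A
    B(const B&) = delete;            // the wrapper owns the C resource exclusively
    B& operator=(const B&) = delete;
    int query() const { return a_query(&a_); }
private:
    A a_;                            // the C struct is a member, not a base
};
```

The C functions always get &a_, a pointer to a plain A, so the layout of B stops mattering.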
That's not how static_cast works. A pointer to an object continues to be the same pointer, just with a different type. In this case, you're converting a pointer to a derived type (B) into a pointer to the base type (A).
My guess is that casting the pointer does not actually change the pointer value, i.e., it's still pointing to the same memory address, even though it's been cast into an A* pointer type. Remember that struct and class are synonyms in C++.
As @Luchian stated, if you're mixing C and C++, it's better to keep the plain old C structs (and their pointers) as plain old C structs, and use type composition instead of inheritance. Otherwise you're mixing different pointer implementations under the covers. There is no guarantee that the internal arrangement of the C struct and the C++ class are the same.
UPDATE
You should surround the C struct declaration with an extern "C" specification, so that the C++ compiler knows that the struct is a pure C struct:
extern "C"
{
struct A
{
...
};
}
Or:
extern "C"
{
#include "c_header.h"
}
(C++,MinGW 4.4.0,Windows OS)
All that is commented in the code, except labels <1> and <2>, is my guess. Please correct me in case you think I'm wrong somewhere:
class A {
public:
virtual void disp(); //not necessary to define as placeholder in vtable entry will be
//overwritten when derived class's vtable entry is prepared after
//invoking Base ctor (unless we do new A instead of new B in main() below)
};
class B :public A {
public:
B() : x(100) {}
void disp() {std::printf("%d",x);}
int x;
};
int main() {
A* aptr=new B; //memory model and vtable of B (say vtbl_B) is assigned to aptr
aptr->disp(); //<1> no error
std::printf("%d",aptr->x); //<2> error -> A knows nothing about x
}
<2> is an error and is obvious. Why <1> is not an error? What I think is happening for this invocation is: aptr->disp(); --> (*aptr->*(vtbl_B + offset to disp))(aptr) aptr in the parameter being the implicit this pointer to the member function. Inside disp() we would have std::printf("%d",x); --> std::printf("%d",aptr->x); SAME AS std::printf("%d",this->x); So why does <1> give no error while <2> does?
(I know vtables are implementation specific and stuff but I still think it's worth asking the question)
this is not the same as aptr inside B::disp. The B::disp implementation takes this as B*, just like any other method of B. When you invoke virtual method via A* pointer, it is converted to B* first (which may even change its value so it is not necessarily equal to aptr during the call).
I.e. what really happens is something like
typedef void (A::*disp_fn_t)();
disp_fn_t methodPtr = aptr->vtable[index_of_disp]; // methodPtr == &B::disp
B* b = static_cast<B*>(aptr);
(b->*methodPtr)(); // same as b->disp()
For more complicated example, check this post http://blogs.msdn.com/b/oldnewthing/archive/2004/02/06/68695.aspx. Here, if there are multiple A bases which may invoke the same B::disp, MSVC generates different entry points with each one shifting A* pointer by different offset. This is implementation-specific, of course; other compilers may choose to store the offset somewhere in vtable for example.
The rule is:
In C++, dynamic dispatch only works for member functions, not for member variables.
For a member variable, the compiler only looks up the name in the static type of the expression and its base classes.
In case 1, the method to call is decided by fetching the vptr, fetching the address of the appropriate method from the vtable, and then calling that member function.
Thus dynamic dispatch is essentially a fetch-fetch-call instead of the plain call used for static binding.
In case 2, the compiler only looks for x in the scope of A, the static type of *aptr. It cannot find it there and reports the error.
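The difference shows up clearly once A gets an x of its own. In this sketch (a variation on the question's classes, with disp returning the value instead of printing it, and via_function and via_member made up for illustration) the function call is dispatched at run time while the member access is bound at compile time:

```cpp
#include <cassert>

struct A {
    int x = 1;
    virtual int disp() { return x; }
    virtual ~A() = default;
};

struct B : A {
    int x = 100;                        // hides A::x, it does not override it
    int disp() override { return x; }   // overrides A::disp, returns B::x
};

int via_function(A* p) { return p->disp(); }  // resolved at run time via the vtable
int via_member(A* p)   { return p->x; }       // resolved at compile time: always A::x
```

Passing the address of a B to both helpers yields B::x from the first and A::x from the second.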
You are confused, and it seems to me that you come from more dynamic languages.
In C++, compilation and runtime are clearly isolated. A program must first be compiled and then can be run (and any of those steps may fail).
So, going backward:
<2> fails at compilation, because compilation is about static information. aptr is of type A*, thus all methods and attributes of A are accessible through this pointer. Since you declared disp() but no x, then the call to disp() compiles but there is no x.
Therefore, <2>'s failure is about semantics, and those are defined in the C++ Standard.
Getting to <1>, it works because there is a declaration of disp() in A. This guarantees the existence of the function (I would remark that you actually lie here, because you did not define it in A).
What happens at runtime is semantically defined by the C++ Standard, but the Standard provides no implementation guidance. Most (if not all) C++ compilers will use a virtual table per class + virtual pointer per instance strategy, and your description looks correct in this case.
However this is pure runtime implementation, and the fact that it runs does not retroactively impact the fact that the program compiled.
virtual void disp(); //not necessary to define as placeholder in vtable entry will be
//overwritten when derived class's vtable entry is prepared after
//invoking Base ctor (unless we do new A instead of new B in main() below)
Your comment is not strictly correct. A virtual function is odr-used unless it is pure (the converse does not necessarily hold) which means that you must provide a definition for it. If you don't want to provide a definition for it you must make it a pure virtual function.
If you make one of these modifications then aptr->disp(); works and calls the derived class disp() because disp() in the derived class overrides the base class function. The base class function still has to exist as you are calling it through a pointer to base. x is not a member of the base class so aptr->x is not a valid expression.
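A sketch of the pure virtual variant. To keep it easy to check, disp returns x here instead of printing it, and show is a made-up helper:

```cpp
#include <cassert>

class A {
public:
    virtual int disp() = 0;      // pure virtual: no definition required in A
    virtual ~A() = default;
};

class B : public A {
public:
    B() : x(100) {}
    int disp() override { return x; }
    int x;
};

// Compiles because disp is declared in A; runs B::disp because of the override.
int show(A* aptr) { return aptr->disp(); }
```

A is now abstract, so new A no longer compiles, which also removes the case the original comment worried about.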
I had a question about C++ destructor behavior, more out of curiosity than anything else. I have the following classes:
Base.h
class BaseB;
class BaseA
{
public:
virtual int MethodA(BaseB *param1) = 0;
};
class BaseB
{
};
Imp.h
#include "Base.h"
#include <string>
class BImp;
class AImp : public BaseA
{
public:
AImp();
virtual ~AImp();
private:
AImp(const AImp&);
AImp& operator= (const AImp&);
public:
int MethodA(BaseB *param1) { return MethodA(reinterpret_cast<BImp *>(param1)); }
private:
int MethodA(BImp *param1);
};
class BImp : public BaseB
{
public:
BImp(std::string data1, std::string data2) : m_data1(data1), m_data2(data2) { }
~BImp();
std::string m_data1;
std::string m_data2;
private:
BImp();
BImp(const BImp&);
BImp& operator= (const BImp&);
};
Now, the issue is that with this code, everything works flawlessly. However, when I make the destructor for BImp virtual, on the call to AImp::MethodA, the class BImp seems to have its data (m_data1 and m_data2) uninitialized. I've checked and made sure the contained data is correct at construction time, so I was wondering what the reason behind this could be...
Cheers!
Edit: param1 was actually a reference to B in MethodA. Looks like I over-sanitized my real code a bit too much!
Edit2: Re-arranged the code a bit to show the two different files. Tested that this code compiles, as well. Sorry about that!
If you are casting between related types as you do in this case, you should use static_cast or dynamic_cast, rather than reinterpret_cast, because the compiler may adjust the object pointer value while casting it to a more derived type. The result of reinterpret_cast is undefined in this case, because it just takes the pointer value and pretends it's another object without any regard for object layout.
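A sketch of why the two casts can differ. It assumes a non-empty BaseB (the tag member is invented) so that the effect is visible on common ABIs, where the vtable pointer added by the virtual destructor usually comes first in the object:

```cpp
#include <cassert>

struct BaseB { int tag = 7; };    // non-empty, non-polymorphic base (assumption)

struct BImp : BaseB {
    virtual ~BImp() {}            // adds a vtable pointer to BImp
    int data = 42;
};

// Correct downcast: the compiler applies whatever adjustment the layout needs.
BImp* downcast(BaseB* p) { return static_cast<BImp*>(p); }

// True when the BaseB subobject does not start where the BImp object starts,
// i.e. when reinterpret_cast would yield a pointer to the wrong place.
bool base_is_offset(BImp& obj) {
    BaseB* base = &obj;
    return static_cast<void*>(base) != static_cast<void*>(&obj);
}
```

Whether base_is_offset returns true depends on the compiler and ABI; downcast is correct either way, which is the reason to prefer static_cast (or dynamic_cast) here.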
MethodA takes its parameters by value. This means a copy is passed (and the copy has to be destroyed). That's my best guess for why you might have a BImp being destroyed that you didn't expect to be, but I don't see what the virtual or non-virtual nature of A's destructor could possibly have to do with it.
But this code can't compile - you use class B in declaring the virtual function in A, but B isn't defined until later. And I don't know what's going on with that cast - you can't reinterpret_cast class types. Perhaps if you work up a test case which demonstrates your issue, and post that?
There's a lot of iffy stuff in this code, so I'm amazed that it works or compiles in any case.
Passing parameters by value instead of reference to MethodA
Casting a B to a BImp via reinterpret_cast -- bad idea! If you're going to cast in that direction, dynamic_cast is the safest.
I fail to see how you're supposed to get a BImp out of a B. You are not invoking any constructors, and you have none that could be invoked that would accept a B. Your default constructor for BImp is private, and assigning a B that has no data, casted to a BImp that still has no data, to a BImp, still isn't going to give you any data!
Several comments:
Your base classes should have virtual destructors so that the derived class's dtor is called, instead of just the base class dtor, when the object is deleted through a base pointer.
MethodA taking a BaseB pointer as a parameter only to have the pointer reinterpreted as a BImp (a derived class of BaseB) is dangerous. There is no guarantee that something other than a BImp isn't passed to MethodA. What would happen if just a BaseB object was passed to MethodA? Potentially lots of bad things, I would suspect.
I'm guessing your code "works flawlessly" because you only pass BImp to MethodA. If you are only passing BImp to MethodA then make the signature match the intent (this has the added benefit of removing that awful reinterpret call).
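If the BaseB* signature has to stay, a checked downcast at least catches the wrong type. A sketch, assuming BaseB gets the virtual destructor suggested above; the free function MethodA and its return convention (combined string length, or -1 on mismatch) are invented for the example:

```cpp
#include <cassert>
#include <string>

class BaseB {
public:
    virtual ~BaseB() {}   // makes BaseB polymorphic, which dynamic_cast requires
};

class BImp : public BaseB {
public:
    BImp(std::string data1, std::string data2) : m_data1(data1), m_data2(data2) {}
    std::string m_data1;
    std::string m_data2;
};

// Returns the combined length of both strings, or -1 if param1 is not a BImp.
int MethodA(BaseB* param1) {
    BImp* imp = dynamic_cast<BImp*>(param1);
    if (imp == nullptr) return -1;   // a plain BaseB (or another subclass) ends up here
    return static_cast<int>(imp->m_data1.size() + imp->m_data2.size());
}
```

Changing the parameter type to BImp* remains the simpler fix when only BImp objects are ever passed.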
Your code is ill-formed. It is not valid C++. In the C++ language, reinterpret_cast can only be used to cast between pointer types, between reference types, or to perform pointer-to-integer conversions (in either direction).
In your code you are trying to use reinterpret_cast to convert from type B to type BImp. This is explicitly illegal in C++. If your compiler allows this code, you have to consult your compiler's documentation in order to determine what's going on.
Other replies already mentioned "slicing". Keep in mind that this is nothing more than just a guess about specific non-standard behavior of your specific compiler. It has nothing to do with C++ language.