C++, statically detect base classes with differing addresses? - c++

If I have a derived class with multiple bases, each this pointer for each base will be different from that of the derived object's this pointer, except for one. Given two types in an inheritance hierarchy, I'd like to detect at compile time whether they share the same this pointer. Something like this should work, but doesn't:
BOOST_STATIC_ASSERT(static_cast<Base1*>((Derived *)0xDEADBEEF) == (Derived*)0xDEADBEEF);
Because it needs to be an 'integral constant expression' and only integer casts are allowed in those according to the standard (which is stupid, because they only need compile time information if no virtual inheritance is being used). The same problem occurs trying to pass the results as integer template parameters.
The best I've been able to do is check at startup, but I need the information during compile (to get some deep template hackery to work).
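The startup check mentioned above might look roughly like the sketch below. The types Base1, Base2 and Derived and the helper base_offset are hypothetical names chosen for illustration. The actual offsets are whatever the compiler's ABI picks; the only thing the language guarantees here is that the two non-empty bases do not share an address.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical hierarchy, for illustration only.
struct Base1 { char a[16]; };
struct Base2 { int b; };
struct Derived : Base1, Base2 { int c; };

// Byte distance from the start of a D object to its Base subobject,
// measured at run time on a live object.
template <class Base, class D>
std::ptrdiff_t base_offset(const D& d) {
    const Base* base = &d;  // implicit derived-to-base conversion adjusts the pointer
    return reinterpret_cast<const char*>(base) -
           reinterpret_cast<const char*>(&d);
}

// Run-time stand-in for the compile-time check the question asks for.
inline void check_layout_at_startup() {
    Derived d;
    // Two non-empty base subobjects cannot occupy the same address.
    assert(base_offset<Base1>(d) != base_offset<Base2>(d));
}
```

Because base_offset needs a real object, it cannot feed a template parameter, which is exactly the limitation the question describes.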

I don't know how to check what you want, but note that your assumption is false in the presence of empty base classes. Any number of them can share the same offset from the start of the object, as long as they are of different types.

I am trying to solve this exact same issue. I have an implementation that works if you know what member variable is at the beginning of the base class's layout. E.g. if member variable "x" exists at the start of each class, then the following code will work to yield the byte offset of a particular base class layout from the derived class layout: offsetof(derived, base2::x).
In the case of:
struct base1 { char x[16]; };
struct base2 { int x; };
struct derived : public base1, public base2 { int x; };
static const int my_constant = offsetof(derived, base2::x);
The compiler will properly assign "16" to my_constant on my architecture (x86_64).
The difficulty is to get "16" when you don't know what member variable is at the start of a base class's layout.

I am not even sure that this offset is a constant in the first place. Do you have normative wording suggesting otherwise?
I'd agree that a non-const offset would be bloody hard to implement in the absence of virtual inheritance, and pointless to boot. That's beside the point.

Classes do not have a this pointer - instances of classes do, and it will be different for each instance, no matter how they are derived.

What about using
BOOST_STATIC_ASSERT(boost::is_convertible<Derived*,Base*>::value)
as documented in the following locations...
http://www.boost.org/doc/libs/1_39_0/doc/html/boost_staticassert.html
http://www.boost.org/doc/libs/1_38_0/libs/type_traits/doc/html/boost_typetraits/reference/is_convertible.html
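With a C++11 compiler the same kind of check can be written with the standard library instead of Boost. This is only a sketch with hypothetical types; note that it tests whether the pointer conversion exists, not whether the two pointers hold the same address.

```cpp
#include <type_traits>

// Hypothetical types, for illustration only.
struct Base {};
struct Derived : Base {};
struct Unrelated {};

static_assert(std::is_convertible<Derived*, Base*>::value,
              "Derived* converts implicitly to Base*");
static_assert(!std::is_convertible<Unrelated*, Base*>::value,
              "Unrelated* does not convert to Base*");
```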

I didn't realize that the compiler would insert this check at runtime, but your underlying assumption isn't entirely correct. Probably not in ways that you care about though: the compiler can use the Empty Base Class Optimization if you happen to inherit from more than one empty base class (one with no non-static data members; sizeof itself never reports 0). That would result in (base class *)(derived *)1 comparing equal for at least one other base class.
Like I said, this probably isn't something you would really need to care about.

Related

How are downcasts and upcasts performed? And how are the types compared? How is RTTI usually stored?

I've read about RTTI. The information written here may be wrong. It's just what I've understood.
1 - Each type has a pointer to its base classes and a pointer to a string containing its name. When downcasting using dynamic_cast, for example, it goes through the base classes and recurses until it finds one that matches. Assuming that what I've said is true, what about upcasting? How is it done, since each type only knows about its base classes? How does it figure out its subclasses?
2 - Also, does it know whether the operation is a downcast or an upcast before actually casting? In other words, when I perform dynamic_cast<SomeClass>, does it try to find SomeClass in the entire hierarchy tree? Or does it know which direction to go (to search in the parent nodes or to search in the child nodes)? And if it does, then how?
3 - As far as I've understood, the type of each class is stored as a string, and whenever someone uses dynamic_cast, it compares the strings of types until it finds the right class. If that's true, why is this done? Why not give each class an integer ID at compile time and store that ID instead of the string name? Whenever casting happens, just compare the two numbers. The type strings of all classes could be stored in an array somewhere (let's call it typesArr), and whenever the name of the class is actually needed, just look up typesArr[ID]. I think something like that is more intuitive, so I must be missing something. So how is RTTI actually stored? I don't mean how it works; I mean how it is represented in memory. I know that it's implementation dependent, but how is it usually stored in most compilers? And how are types actually compared?
(Obligatory disclaimer: as already noted in the question, most of this is implementation specific, not general C++ rules.)
I think you may be using "upcast" and "downcast" reversed from what I'm used to.
But in any case, a conversion from a derived class pointer to base class pointer (or initializing a base class reference from a derived class glvalue) doesn't need to involve the RTTI "class tree" at all. The compiler knows all the base classes and the layout of the subobjects in the derived class. For a non-virtual base, the base subobject address is at a fixed offset from the derived object address. For a virtual base class, the offset to the base subobject depends on the object's most derived type, so the conversion involves looking up that offset in the vtable.
dynamic_cast is defined to just do the above derived-to-base cast if valid ([expr.dynamic.cast]/5). Otherwise, yes, it searches for the base class in the entire tree of classes inherited by the object's complete type. The implementation of this search will probably start from the root: the most derived class. Note that derived-to-base and base-to-derived are not the only cases: dynamic_cast can also cast "sideways", to find a sibling/cousin/etc. subobject.
#include <cassert>
struct A { virtual ~A() {} int m; };
struct B { virtual ~B() {} int n; };
int f(const A& a) {
    // Valid, even though there's no inheritance relation between
    // A and B at all:
    auto& b = dynamic_cast<const B&>(a);
    return b.n;
}
struct C : public A, public B { int p; };
void g() {
    C c;
    c.n = 2;
    // The dynamic_cast in f will be a successful "sideways cast"
    // from the A base subobject of c to the B base subobject of c.
    assert(f(c) == 2);
}
As far as I've understood, the type of each class is stored as a string, and whenever someone uses dynamic_cast, it compares the strings of types until it finds the right class. If that's true, why is this done? Why not give each class an integer ID at compile time and store that ID instead of the string name? Whenever casting happens, just compare the two numbers.
This wouldn't easily work because of the separate compilation model used by C++. Say Alice compiles her file a.cpp which defines some polymorphic classes. The compiler would need to choose some IDs for those classes. Meanwhile, Bob is a developer on NiftyLib, and adds a new feature, meaning that file b.cpp in NiftyLib source has some new polymorphic classes. His feature is ready for release, so the library project compiles b.cpp and other sources into library files, which are made available for developers. This will mean choosing some IDs for those classes. Alice's a.cpp is part of a program which uses NiftyLib, so she upgrades to the newer NiftyLib version. Making Alice's complete program involves linking the previously compiled a.cpp and the NiftyLib library file. But how could those compilers have chosen unique IDs so that none of the classes from a.cpp and b.cpp happen to share the same ID?
So I think some implementations of dynamic_cast do compare some mangled type name found via the RTTI, possibly the same C-string data returned by std::type_info::name(). But not all. On Itanium ABI systems (see below), the compiler and linker can set things up to guarantee that there's just one RTTI data object (which is also the std::type_info object) per type, even if duplicate objects were originally emitted from different translation units. Then when code requests dynamic_cast<T*>(ptr), the compiler will pass the known RTTI object for T and the RTTI object obtained via a vptr in *ptr to the internal support function implementing dynamic_cast. When that function is searching the tree of linked RTTI objects, it can then just compare addresses of the RTTI objects rather than checking any of their contents match.
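From user code, the closest visible counterpart to this identity comparison is comparing the std::type_info objects produced by typeid. The sketch below uses hypothetical Shape and Circle classes. Whether operator== compares addresses or mangled names underneath is up to the implementation, as described above.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical polymorphic hierarchy, for illustration only.
struct Shape { virtual ~Shape() {} };
struct Circle : Shape {};

// True only when the dynamic type of s is exactly Circle.
bool is_exactly_circle(const Shape& s) {
    return typeid(s) == typeid(Circle);
}

void demo_type_identity() {
    Circle c;
    Shape s;
    assert(is_exactly_circle(c));   // dynamic type is Circle
    assert(!is_exactly_circle(s));  // dynamic type is Shape
}
```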
For tons of the technical details, you can look at the Itanium C++ ABI, used on Linux, Mac, and several other platforms. In particular, section 2.9 is all about RTTI, with 2.9.4 specifying all the contents of an RTTI object and 2.9.7 describing how that data is used to implement dynamic_cast (in the truly dynamic case). The last I looked at it, the RTTI data scheme used by MSVC was fairly similar, just different in the details.

Cast a simple (c++) struct to another derived (c++) struct containing same datatypes [duplicate]

If I have a class as follows
class Example_Class
{
private:
int x;
int y;
public:
Example_Class()
{
x = 8;
y = 9;
}
~Example_Class()
{ }
};
And a struct as follows
struct
{
int x;
int y;
} example_struct;
Is the structure in memory of the example_struct similar to that of Example_Class?
for example if I do the following
struct example_struct foo_struct;
Example_Class foo_class = Example_Class();
memcpy(&foo_struct, &foo_class, sizeof(foo_struct));
will foo_struct.x = 8 and foo_struct.y = 9 (ie: the same values as the x,y values in the foo_class) ?
The reason I ask is I have a C++ library (don't want to change it) that is sharing an object with C code and I want to use a struct to represent the object coming from the C++ library. I'm only interested in the attributes of the object.
I know the ideal situation would be to have Example_Class wrap around a common structure shared between the C and C++ code, but it is not going to be easy to change the C++ library in use.
The C++ standard guarantees that memory layouts of a C struct and a C++ class (or struct -- same thing) will be identical, provided that the C++ class/struct fits the criteria of being POD ("Plain Old Data"). So what does POD mean?
A class or struct is POD if:
All data members are public and themselves POD or fundamental types (but not reference or pointer-to-member types), or arrays of such
It has no user-defined constructors, assignment operators or destructors
It has no virtual functions
It has no base classes
About the only "C++-isms" allowed are non-virtual member functions, static data members and static member functions.
Since your class has both a constructor and a destructor, it is formally speaking not of POD type, so the guarantee does not hold. (Although, as others have mentioned, in practice the two layouts are likely to be identical on any compiler that you try, so long as there are no virtual functions).
See section [26.7] of the C++ FAQ Lite for more details.
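Assuming a C++11 compiler, the criteria above can be checked mechanically with type traits. C++11 splits the old POD notion into "standard layout" (the memory-layout half) and "trivial" (the special-member half). The sketch below reuses the class from the question; Plain is a hypothetical counterpart.

```cpp
#include <type_traits>

// Same shape as the class in the question: private ints plus a
// user-provided constructor and destructor.
class Example_Class {
    int x;
    int y;
public:
    Example_Class() : x(8), y(9) {}
    ~Example_Class() {}
};

// Hypothetical plain counterpart.
struct Plain { int x; int y; };

// All members share one access level and nothing is virtual,
// so the layout half of POD still holds.
static_assert(std::is_standard_layout<Example_Class>::value,
              "layout rules are satisfied");
// The user-provided destructor breaks the "trivial" half.
static_assert(!std::is_trivially_copyable<Example_Class>::value,
              "not trivially copyable, so not POD");
static_assert(std::is_standard_layout<Plain>::value &&
              std::is_trivially_copyable<Plain>::value,
              "Plain satisfies both halves");
```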
Is the structure in memory of the example_struct similar to that of Example_Class?
The behaviour isn't guaranteed, and is compiler-dependent.
Having said that, the answer is "yes, on my machine", provided that the Example_Class contains no virtual method (and doesn't inherit from a base class).
In the case you describe, the answer is "probably yes". However, if the class has any virtual functions (including virtual destructor, which could be inherited from a base class), or uses multiple inheritance then the class layout may be different.
To add to what other people have said (eg: compiler-specific, will likely work as long as you don't have virtual functions):
I would highly suggest a static assert (compile-time check) that the sizeof(Example_class) == sizeof(example_struct) if you are doing this. See BOOST_STATIC_ASSERT, or the equivalent compiler-specific or custom construction. This is a good first-line of defense if someone (or something, such as a compiler change) modifies the class to invalidate the match. If you want extra checking, you can also runtime check that the offsets to the members are the same, which (together with the static size assert) will guarantee correctness.
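A sketch of such checks, assuming C++11 static_assert rather than BOOST_STATIC_ASSERT. The class here is a hypothetical variant of the one in the question with public members, because offsetof cannot name private members from outside the class. For a standard-layout class the offsets can be checked at compile time as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

struct example_struct { int x; int y; };  // C-side view of the data

// Hypothetical variant of the question's class with public members.
class Example_Class {
public:
    int x;
    int y;
    Example_Class() : x(8), y(9) {}
};

static_assert(sizeof(Example_Class) == sizeof(example_struct), "size mismatch");
static_assert(std::is_standard_layout<Example_Class>::value,
              "offsetof requires a standard-layout class");
static_assert(offsetof(Example_Class, x) == offsetof(example_struct, x), "x moved");
static_assert(offsetof(Example_Class, y) == offsetof(example_struct, y), "y moved");

// Copies the bytes of the class object into the C-side struct.
example_struct to_c_view(const Example_Class& obj) {
    static_assert(std::is_trivially_copyable<Example_Class>::value,
                  "memcpy requires a trivially copyable source");
    example_struct out;
    std::memcpy(&out, &obj, sizeof out);
    return out;
}
```

If someone later adds a virtual function or another member to the class, one of the assertions fails at compile time instead of the copy silently producing garbage.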
Some early C++ compilers simply replaced the struct keyword with class before compiling, which shows how similar the two are.
Differences come from class inheritance and, especially, virtual functions. If a class contains virtual functions, then it typically has a pointer to its virtual table (type descriptor), usually at the beginning of its layout. Also, if class B inherits from class A, then class A's layout typically comes first, followed by class B's own layout.
So the precise answer to your question about just casting a class instance to a structure instance is: it depends on the class contents. For this particular class, which only has a constructor and a non-virtual destructor, the layout is probably going to be the same. Should the destructor be declared virtual, the layout would definitely differ between the structure and the class.
Here is an article which shows that there is not much needed to do to step from C structures to C++ classes: Lesson 1 - From Structure to Class
And here is the article which explains how virtual functions table is introduced to classes that have virtual functions: Lesson 4 - Polymorphism
Classes and structs in C++ are equivalent, except that all members of a struct are public by default (class members are private by default). This ensures that compiling legacy C code in a C++ compiler will work as expected.
There is nothing stopping you from using all the fancy C++ features in a struct:
struct ReallyAClass
{
    ReallyAClass();
    virtual ~ReallyAClass();
    /// etc etc etc
};
Why not explicitly assign the class's members to the struct's when you want to pass the data to C? That way you know your code will work anywhere.
You could probably just derive the class from the struct, either publicly or privately. Then casting it would resolve correctly in the C++ code.

Dynamically pass the type name to a static_cast

Is it possible to dynamically choose the type you wish to cast to, during runtime?
For example, suppose I have:
ClassType * pointer=static_cast<ClassType*>(baseClassPointer);
At run time, however, I'd like to choose what the ClassType is, rather than having to hardcode it into the function.
Is this possible? The two ways I'm thinking I'd want to use it is either passing an actual char * that contains the name of the type I want to use, or somehow extracting the type information from an existing class and using that as the cast.
The reason I want to do this, is I have several derived classes from a common base class. I can get the basic functionality of the base class for each of the derived classes, but if I want a pointer to access some of the specific functionality that only exists in the derived class, I need to cast that pointer as such. And I'd like to have a function that allows me to do this casting dynamically for any of the derived classes.
Templates might be able to serve your purpose here whenever you need to statically cast to a derived class.
template <typename TTo> TTo* derived_cast( BaseClass* b ) {
    static_assert( std::is_base_of<BaseClass, TTo>::value, "You can't cast to a class that's not derived from BaseClass!" );
    return static_cast<TTo*>( b );
}
The syntax becomes a bit more compressed then:
Derived* d = derived_cast<Derived>( b );
The static_assert makes sure TTo is actually derived from BaseClass.
Now, for run-time, you'd have to use dynamic_cast, but that involves other things than what the code you have up there implies.
EDIT: Beyond this point are dragons of a crazy kind.
In C++, dynamically casting and operating on an object (without ever knowing that type at compile-time) is impossible, save for using a robust base class or if/else on some kind of run-time identifier to then provide static type information. In almost all cases, you would be better to just use a virtual method on a base class and then override them in derived classes. Switching on strings and other things is not only slow but painful when you have to keep adding extra cases: do not do it. However, if you are going to ignore my advice, here's some built-in pieces you can work with to get sort of what you want:
typeid - an operator that yields an implementation-defined, but unique-to-the-class, object (type_info) that compares equal only to the type_info of that same class. You can compare them, e.g. typeid( Dog ) == typeid( Dog ), and get true/false correctly. This will allow you some run-time typing information.
dynamic_cast - a cast that checks at run time whether the object really is of (or derived from) the target class and fails if it is not (abridged definition, for more information see cppreference on dynamic_cast). You can use this to dynamically cast pointers and other such things, with it returning null on failure to cast. It might help you here, but you still need to know the types statically that you're working with (after switching on, say, typeid).
With these two, you could do a better implementation of what you see in the other answer with the string usage. But that's about it. Anything else requires things like boost::variant, a stronger base class, or a different design. A stronger base class sounds like what you could use here, but I can't say with 100% certainty.
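As a rough sketch of what such an implementation could look like, the function below lets dynamic_cast do the run-time test instead of comparing strings. Animal, Dog and Cat are hypothetical classes, and the candidate types still have to be listed statically in the code.

```cpp
#include <cassert>
#include <string>

// Hypothetical hierarchy, for illustration only.
struct Animal { virtual ~Animal() {} };
struct Dog : Animal { std::string bark() const { return "woof"; } };
struct Cat : Animal { std::string chaseMouse() const { return "chasing"; } };

// Tries each known derived type in turn; dynamic_cast on a pointer
// yields a null pointer when the object is not of that type.
std::string doCustomOperation(const Animal& animal) {
    if (const Dog* dog = dynamic_cast<const Dog*>(&animal)) {
        return dog->bark();
    }
    if (const Cat* cat = dynamic_cast<const Cat*>(&animal)) {
        return cat->chaseMouse();
    }
    return "unknown animal";
}

void demo_dynamic_dispatch() {
    Dog d;
    assert(doCustomOperation(d) == "woof");
}
```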
Good luck!
Bad design spotted !
What about overriding your derived classes ? You are messing with something really dangerous
If you have for example base class Animal, and you have derived classes Dog and Cat, you can always cast Animal to Dog or Cat (if you know which it is):
void doCustomOperation(Animal *animal, string runtimeDecision) {
    if (runtimeDecision == "dog") {
        ((Dog *)animal)->bark();
    }
    if (runtimeDecision == "cat") {
        ((Cat *)animal)->chaseMouse();
    }
}
Even if you could do this at runtime (which I don't think you can), what would you do with the derived class pointer you get back? You wouldn't know at the time you write the code what type of object you'll get back so how would you know what methods you can call?
I think you would be better off either adding virtual methods to your base class to capture the necessary functionality or alternatively you could employ the Visitor Pattern.
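A minimal sketch of the virtual-method alternative, using a hypothetical Animal hierarchy and a hypothetical act() function. The caller never needs to know or name the derived type.

```cpp
#include <cassert>
#include <string>

// Hypothetical hierarchy: the type-specific behaviour lives behind
// one virtual function instead of behind a cast.
struct Animal {
    virtual ~Animal() {}
    virtual std::string act() const = 0;
};
struct Dog : Animal {
    std::string act() const override { return "bark"; }
};
struct Cat : Animal {
    std::string act() const override { return "chase mouse"; }
};

// No cast and no run-time type switch: virtual dispatch picks the
// right override.
std::string doCustomOperation(const Animal& animal) {
    return animal.act();
}

void demo_virtual_dispatch() {
    Cat c;
    assert(doCustomOperation(c) == "chase mouse");
}
```

Adding a new derived class then means writing one new override, with no changes to doCustomOperation.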

Does extra inheritance make any difference on object structure or instantiation?

In the code there are some special classes and there are some normal classes. I want to differentiate them because special classes needed to be given different treatment. All these special classes are base (not child of any other class)
To achieve that I am tokenizing special classes in the source code by inserting an inheritance to them with an empty struct:
struct _special {}; // empty class
class A : public _special { // A becomes special
...
};
class B { // 'B' remains normal
...
};
class D : public A { // 'D' becomes special due to 'A'
...
};
Whenever needed, I can segregate special and normal classes using is_base_of<Base,Derived>. The alternate way would have been to use a typedef inside the special classes:
class A {
public: typedef something _special;
};
The problem is that if A's children inherit from multiple classes, there will be ambiguous typedefs.
Question: Will adding such interface-like inheritance from the empty class _special hurt the current code in any way (e.g. object structure, compilation errors, etc.)?
The layout of objects in memory is only partially specified in the C++ standard; however, there are certain conventions that most compilers use. Empty types will take up a little bit of memory (so that distinct objects have distinct addresses, which gives their pointers identity). This extra bit of memory is typically just one byte, nothing to worry about for most purposes. If you inherit from an empty type, on the other hand, it shouldn't increase the size of your object, because the rest of the object will be taking up space so it will have an address anyway.
If you are using single inheritance, objects will be laid out with the first part of memory laid out like the base class, followed by the memory that holds the members of later classes in the chain. If you have any virtual functions there will also be a place, probably at the beginning, for the virtual pointer. If you are deriving one type from another you will generally want the base class to have a virtual destructor, and once you write a destructor you should consider the "rule of three": destructor, copy constructor, and copy assignment operator. So then you will have a virtual pointer; this is the size of one pointer (typically 4 or 8 bytes), not a big deal.
If you get into multiple inheritance then your objects start to get very complicated structurally. They will have various pointers to different parts of themselves so that functions can find the members that they are looking for.
That said, consider whether you want to use inheritance to model this at all. Perhaps giving the objects a bool member variable would be a good idea.
Most if not all decent compilers implement Empty Base Optimization (EBO) for simple cases, which means that your object sizes won't grow by inheriting from an empty base. However when a class inherits from an empty base in more than one way the optimization may be impossible due to the need to have different addresses for the different empty bases of the same type. To protect against that, one usually makes the empty base a template taking the derived class as an argument, but it would render is_base_of unusable.
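Both halves of that statement can be observed with a small sketch. All type names here are hypothetical. The size assertion reflects what mainstream compilers do rather than a guarantee for every class, while the distinct-address property for two bases of the same type is required by the language.

```cpp
#include <cassert>

struct Empty {};
struct Tagged : Empty { int value; };  // single empty base

struct Left : Empty {};
struct Right : Empty {};
struct Both : Left, Right {};          // contains two Empty subobjects

static_assert(sizeof(Empty) >= 1, "complete objects never have size 0");
// Assumption: the compiler applies EBO here, as mainstream ones do.
static_assert(sizeof(Tagged) == sizeof(int), "empty base added no size");

// The two Empty base subobjects have the same type, so they must
// live at different addresses.
bool empty_bases_are_distinct(Both& b) {
    Empty* via_left = static_cast<Left*>(&b);
    Empty* via_right = static_cast<Right*>(&b);
    return via_left != via_right;
}

void demo_empty_bases() {
    Both b;
    assert(empty_bases_are_distinct(b));
}
```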
Personally, I would implement this classification externally. Template specialization won't get the desired result of classes derived from special indirectly being considered special as well. It looks like you are using C++11 so I would do:
std::false_type is_special( ... );
std::true_type is_special( A const* );
And replace is_base_of<_special, T> with decltype( is_special( static_cast<T*>(0) ) ). In C++03 the same can be achieved with the sizeof trick by having the classification function return types of different sizes:
typedef char no_type;
struct yes_type { no_type _[2]; };
no_type is_special( ... );
yes_type is_special( A const* );
And replace is_base_of<_special, T> with sizeof( is_special( static_cast<T*>(0) ) ) == sizeof( yes_type ). You could wrap that classification check within a helper class template.
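A sketch of such a helper for the C++11 variant. The classes A, B and D mirror the ones in the question, and the name is_special_class is hypothetical. The is_special overloads are only declared, never defined, because they are used solely inside decltype.

```cpp
#include <type_traits>

class A {};             // special
class B {};             // normal
class D : public A {};  // special through A

// Classification by overload resolution: the pointer overload wins
// for A and anything derived from A, the ellipsis catches the rest.
std::false_type is_special(...);
std::true_type is_special(A const*);

// Hypothetical helper that wraps the check in a trait-like template.
template <class T>
struct is_special_class
    : decltype(is_special(static_cast<T*>(nullptr))) {};

static_assert(is_special_class<A>::value, "A is special");
static_assert(is_special_class<D>::value, "D is special through A");
static_assert(!is_special_class<B>::value, "B is normal");
```

Marking another root class as special then takes one more is_special overload, and the classes themselves stay untouched.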
Not sure what you mean by "hurt" or "object structuring" (care to elaborate?), but there should be no compiler errors. Instantiation and construction of the classes deriving from _special do not change, since _special has a default constructor, and performance-wise the compiler might apply the empty base class optimization.
That being said, the option of using typedefs to tag classes might be a better, clearer and more extensible solution. And it is just as ambiguous as A's children inheriting from multiple other classes that all might inherit from _special.

Why can't we create objects for an abstract class in C++?

I know it is not allowed in C++, but why? What if it was allowed, what would the problems be?
Judging by your other question, it seems you don't understand how classes operate. Classes are a collection of functions which operate on data.
Functions themselves contain no memory in a class. The following class:
struct dumb_class
{
void foo(){}
void bar(){}
void baz(){}
// .. for all eternity
int i;
};
Has a size of int. No matter how many functions you have ever, this class will only take up the space it takes to operate on an int. When you call a function in this class, the compiler will pass you a pointer to the place where the data in the class is stored; this is the this pointer.
So, the functions lie in memory somewhere, loaded once at the beginning of your program, and wait to be called with data to operate on.
Virtual functions are different. The C++ standard does not mandate how virtual functions should be implemented, only what their behavior should be. Typically, implementations use what's called a virtual table, or vtable for short. A vtable is a table of function pointers which, like normal functions, only gets allocated once.
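The size claim can be checked directly. In this sketch just_int is a hypothetical comparison type, and the equality with sizeof(int) is what mainstream compilers produce rather than something the standard spells out.

```cpp
#include <cassert>

struct dumb_class {
    void foo() {}
    void bar() {}
    void baz() {}
    int i;
};

// Hypothetical comparison type holding the same data and no functions.
struct just_int { int i; };

// Non-virtual member functions add nothing to the object itself.
static_assert(sizeof(dumb_class) == sizeof(just_int), "same size as the data");
static_assert(sizeof(dumb_class) == sizeof(int), "only the int is stored");

void demo_member_call() {
    dumb_class d;
    d.i = 3;
    d.foo();  // &d is passed to foo as the hidden this pointer
    assert(d.i == 3);
}
```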
Take this class, and assume our implementor uses vtables:
struct base { virtual void foo(void); };
struct derived : base { virtual void foo(void); };
The compiler will need to make two vtables, one for base and one for derived. They will look something like this:
typedef /* some generic function pointer type */ func_ptr;
func_ptr __baseTable[] = {&base::foo};
func_ptr __derivedTable[] = {&derived::foo};
How does it use this table? When you create an instance of a class above, the compiler slips in a hidden pointer, which will point to the correct vtable. So when you say:
derived d;
base* b = &d;
b->foo();
Upon executing the last line, it goes to the correct table (__derivedTable in this case), goes to the correct index (0 in this case), and calls that function. As you can see, that will end up calling derived::foo, which is exactly what should happen.
Note, for later, this is the same as doing derived::foo(b), passing b as the this pointer.
So, when virtual methods are present, the size of the class will increase by one pointer (the pointer to the vtable). Multiple inheritance changes this a bit, but it's mostly the same. You can get more details at C++-FAQ.
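The mechanism described above can be imitated by hand. This is only an illustrative sketch with hypothetical names (object, vtable, call_foo), not how any particular compiler lays things out: every object carries a pointer to a table of function pointers, and a "virtual call" goes through that table.

```cpp
#include <cassert>
#include <cstring>

struct object;
typedef const char* (*func_ptr)(const object* self);

struct vtable { func_ptr foo; };        // one slot per virtual function
struct object { const vtable* vptr; };  // the hidden pointer, made visible

const char* base_foo(const object*) { return "base::foo"; }
const char* derived_foo(const object*) { return "derived::foo"; }

// One table per class, shared by all instances of that class.
const vtable base_table = { &base_foo };
const vtable derived_table = { &derived_foo };

// What b->foo() boils down to: find the table, pick the slot, and
// call it with the object as the this argument.
const char* call_foo(const object* self) {
    return self->vptr->foo(self);
}

void demo_manual_vtable() {
    object d = { &derived_table };  // "constructing" a derived object
    assert(std::strcmp(call_foo(&d), "derived::foo") == 0);
}
```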
Now, to your question. I have:
struct base { virtual void foo(void) = 0; }; // notice the = 0
struct derived : base { virtual void foo(void); };
and base::foo has no implementation. This makes base::foo a pure virtual function. So, if I were to call it, like above:
derived d;
base* b = &d;
base::foo(b);
What behavior should we expect? Being a pure virtual method, base::foo doesn't even exist. The above code is undefined behavior, and could do anything from nothing to crashing, with anything in between. (Or worse.)
Think about what a pure virtual function represents. Remember, functions take no data, they only describe how to manipulate data. A pure virtual function says: "I want to call this method and have my data be manipulated. How you do this is up to you."
So when you say, "Well, let's call an abstract method", you're replying to the above with: "Up to me? No, you do it." to which it will reply "##^##^". It simply doesn't make sense to tell someone who's saying "do this", "no."
To answer your question directly:
"why we cannot create an object for an abstract class?"
Hopefully you see now, abstract classes only define the functionality the concrete class should be able to do. The abstract class itself is only a blue-print; you don't live in blue-prints, you live in houses that implement the blue-prints.
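The compiler's view of this can be checked with std::is_abstract, assuming C++11. In this sketch foo returns an int instead of void, purely so that the result of the call can be observed.

```cpp
#include <cassert>
#include <type_traits>

struct base {
    virtual ~base() {}
    virtual int foo() = 0;  // pure virtual: no implementation here
};

struct derived : base {
    int foo() override { return 42; }  // the concrete class fills the gap
};

static_assert(std::is_abstract<base>::value, "base cannot be instantiated");
static_assert(!std::is_abstract<derived>::value, "derived can be instantiated");

int call_through_base() {
    // base b;        // would not compile: base is abstract
    derived d;
    base* b = &d;     // pointers and references to base are still fine
    return b->foo();  // dispatches to derived::foo
}

void demo_abstract() {
    assert(call_through_base() == 42);
}
```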
The problem is simply this:
what should the program do when an abstract method is called?
and even worse: what should be returned for a non-void function?
The application would probably have to crash or throw a runtime exception, and thus this would cause trouble. You can't dummy-implement every abstract function.
A class can simply be declared abstract where it has no abstract methods. I guess that could be instantiated in theory but the class designer doesn't want you to. It may have unintended consequences.
Usually however abstract classes have abstract methods. They can't be instantiated for the simple reason that they're missing those methods.
Because logically it does not make any sense.
An abstract class is a description that is incomplete.
It indicates what things need to be filled out to make it complete, but without those bits it's not complete.
My first example was a chess game:
The game has lots of pieces of different type (King,Queen,Pawn ... etc).
But there are no actual objects of type piece; all objects are instances of types derived from piece. How can you have an object of something that is not fully defined? There is no point in creating an object of type piece, as the game does not know how it moves (that is the abstract part). It knows it can move but not how it does it.
Abstract classes are non-instantiable by definition. They require that there be derived, concrete classes. What else would an abstract class be if it didn't have pure virtual (unimplemented) functions?
It's the same class of question as why can't I change the value of a const variable, why can't I access private class members from other classes or why can't I override final methods.
Because that's the purpose of these keywords, to prevent you from doing so. Because the author of the code deemed doing so dangerous, undesired or simply impossible due to some abstract reasons like the lack of essential functions that need to be added by specific child classes. It isn't really that you can't instantiate because a class is abstract. It's that the inability to instantiate a class defines it as abstract (and if a class that can't be instantiated isn't abstract, it's an error. The same goes the other way: if an instance of a given class makes sense, it shouldn't be marked as abstract).
Why can't we create an object of an abstract class?
Simply put, an abstract class contains abstract methods (functions without a body), and it does not give those methods any functionality. If it did, there would be no difference between an abstract class and an ordinary class with virtual functions. So if we could create an object of an abstract class, there would be no point in calling its abstract methods, as they have no functionality. That's why languages don't allow us to create an object of an abstract class.
Abstract classes instantiated would be pretty useless, because you would be seeing a lot more of "pure virtual function called". :)
It's like: we all know that a car would have 3 pedals and a steering wheel and a gear stick. Now, if that would be it, and there'd be an instance of 3 pedals and gear stick and a wheel, I'm not buying it, I want a car, like with seats, doors, AC etc. with pedals actually doing something apart from being in existence and that's what abstract class doesn't promise me, the ones implementing it do.
Basically, creating an object is responsible for allocating memory for the member variables and for binding the member functions. But a pure virtual function is only declared in the abstract class; its definition is in a derived class. So creating an object of the abstract class generates an error.