Does extra inheritance make any difference on object structure or instantiation?

In the code there are some special classes and some normal classes. I want to differentiate them because the special classes need to be given different treatment. All these special classes are base classes (not children of any other class).
To achieve that, I am tagging special classes in the source code by making them inherit from an empty struct:
struct _special {}; // empty class
class A : public _special { // A becomes special
...
};
class B { // 'B' remains normal
...
};
class D : public A { // 'D' becomes special due to 'A'
...
};
Whenever needed, I can segregate special and normal classes using is_base_of<Base,Derived>. The alternative would have been to use a typedef inside the special classes:
class A {
public: typedef something _special;
};
The problem is that if A's children inherit from multiple classes, the typedefs will be ambiguous.
Question: Will adding such interface-like inheritance from the empty class _special hurt the current code in any way (e.g. object layout, compilation errors, etc.)?

The layout of objects in memory is only partially specified in the C++ standard; however, there are certain conventions that most compilers use. Empty types will take up a little bit of memory (so that they have a memory address, which gives their pointers identity). This extra bit of memory is generally just one byte, nothing to worry about for most purposes. If you inherit from an empty type, on the other hand, it shouldn't increase the size of your object, because the rest of the object takes up space and so it has an address anyway.
If you are using single inheritance, objects will be laid out with the first bit of memory laid out like the first base class, followed by the memory that holds the members of later classes in the chain. If you have any virtual functions there will also be a place, probably at the beginning, for the virtual pointer. If you are deriving one type from another and destroying objects through base pointers, you will generally want a virtual destructor (and, following the "rule of three", a copy constructor and copy assignment operator whenever you need a user-defined destructor). So then you will have a virtual pointer; it is the size of an ordinary pointer (typically 4 or 8 bytes), not a big deal.
If you get into multiple inheritance then your objects start to get very complicated structurally. They will have various pointers to different parts of themselves so that functions can find the members that they are looking for.
That said, consider whether you want to use inheritance to model this at all. Perhaps giving the objects a bool member variable would be a good idea.

Most if not all decent compilers implement Empty Base Optimization (EBO) for simple cases, which means that your object sizes won't grow by inheriting from an empty base. However when a class inherits from an empty base in more than one way the optimization may be impossible due to the need to have different addresses for the different empty bases of the same type. To protect against that, one usually makes the empty base a template taking the derived class as an argument, but it would render is_base_of unusable.
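As a rough illustration of the simple case described above, the following sketch checks at compile time that an empty tag base does not change the object size and that is_base_of sees the tag through indirect inheritance. The names special_tag, Plain, Tagged and TaggedChild are hypothetical stand-ins for the question's _special and its classes, and the size check assumes a compiler that applies EBO to this single-inheritance case.

```cpp
#include <type_traits>

struct special_tag {};                       // hypothetical empty tag, standing in for _special

struct Plain  { int value; };                // no tag
struct Tagged : special_tag { int value; };  // same data, plus the empty base

// Derived classes pick up the tag through their base.
struct TaggedChild : Tagged { int extra; };

// An empty class still has a nonzero size on its own...
static_assert(sizeof(special_tag) >= 1, "complete objects have nonzero size");
// ...but as the only base of this simple class it is expected to add nothing.
static_assert(sizeof(Tagged) == sizeof(Plain), "empty base optimized away");

// The classification follows the inheritance chain.
static_assert(std::is_base_of<special_tag, Tagged>::value, "tagged directly");
static_assert(std::is_base_of<special_tag, TaggedChild>::value, "tagged via base");
static_assert(!std::is_base_of<special_tag, Plain>::value, "not tagged");
```

If the size assertion fails for a real class hierarchy, that is a hint that the class reaches the empty base along more than one path, which is the case where the optimization may not apply.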
Personally, I would implement this classification externally. Template specialization won't get the desired result of classes derived from special indirectly being considered special as well. It looks like you are using C++11 so I would do:
std::false_type is_special( ... );
std::true_type is_special( A const* );
And replace is_base_of<_special, T> with decltype( is_special( static_cast<T*>(0) ) ). In C++03 the same can be achieved with the sizeof trick by having the classification function return types of different sizes:
typedef char no_type;
struct yes_type { no_type _[2]; };
no_type is_special( ... );
yes_type is_special( A const* );
And replace is_base_of<_special, T> with sizeof( is_special( static_cast<T*>(0) ) ) == sizeof( yes_type ). You could wrap that classification check within a helper class template.
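One possible shape for such a helper class template, using the C++11 variant, is sketched below. The wrapper name is_special_class is made up for this example, and A, B and D are minimal stand-ins for the classes in the question.

```cpp
#include <type_traits>

struct A {};            // the "special" root from the example
struct B {};            // a normal class
struct D : A {};        // special because it derives from A

// Declarations only: they are used in an unevaluated context and never called.
std::false_type is_special(...);
std::true_type  is_special(A const*);

// Hypothetical wrapper name; it inherits from std::true_type or std::false_type
// depending on which overload is selected for a T*.
template <typename T>
struct is_special_class : decltype(is_special(static_cast<T*>(nullptr))) {};

static_assert(is_special_class<A>::value, "A is special");
static_assert(is_special_class<D>::value, "D inherits the classification");
static_assert(!is_special_class<B>::value, "B is normal");
```

Marking another root class as special then only takes one more is_special overload next to that class, without touching the class itself.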

I'm not sure what you mean by "hurt" or "object structuring" (care to elaborate?), but there should be no compiler errors. Instantiation and construction of the classes deriving from _special do not change, since _special has a default constructor, and performance-wise the compiler might apply the empty base class optimization.
That being said, using typedefs to tag classes might be a better, clearer and more extensible solution. And it is just as ambiguous as A's children inheriting from multiple other classes that all might inherit from _special.

Related

Should member variables used by the CRTP base type be in the derived type?

I've been learning about CRTP (Curiously Recurring Template Pattern) today and believe I understand it well enough.
However, in the examples I've seen the state is stored in the derived type, even though the base type relies upon their presence. To me, this seems illogical and the member variables should be in the base type since its functionality relies upon them.
Here's a simple example of what I'm talking about:
template <typename DerivedType>
class Base{
public:
int calculate() {
return static_cast<DerivedType&>(*this).x + static_cast<DerivedType&>(*this).y;
}
};
class Derived : public Base<Derived>{ // public, so that Base can static_cast to Derived
public:
int x; // ignore the fact that these aren't initialised for simplicity
int y;
};
My question is:
Am I correct in thinking that the members x and y would be better off in the base type? If not, why?

Short answer
It depends on what you need to do, but if different derived classes provide different types of x, y, then these cannot be in the base class.
Long answer
The most common use of inheritance is that a base class includes whatever is common in a number of (more than one) derived classes. This common part is written only once and reused by each derived class, leading to shorter, cleaner code, easier maintenance etc. The common part is calculate() in your case.
Now, wherever this common code needs to access information that is specialized per derived class, this information needs to be accessed through a common interface. In your example, this information is members x, y that may be of different types for each derived class. Or, it could be member functions x(), y(). Such functions could take different types (but same number) of arguments and have different return types per derived class.
Either way, it is the job of the derived class to provide a common interface to heterogeneous information. For CRTP/static polymorphism, this common interface is merely the name of the member and the number of arguments, in the case of member functions. For dynamic polymorphism, the relevant mechanism is virtual functions and the common interface includes the entire signature of the function.
It doesn't matter where the data are actually stored; this depends on many things. It may well be the case that the data are stored in the base class after all, however they are still accessed through member functions in the derived classes.
An example is a tuple implementation where a base class implements all common functionality among different kinds of tuples, whereas a number of tuple views are derived from this base to model operations like flipping the order of tuple elements, concatenating or "zipping" tuples together etc. Note that all such views are lazy, similarly to the way an std::reverse_iterator lets you traverse a sequence in reverse order without actually manipulating the data in advance.
In this case, member function at() of the base class provides random access to a tuple element. This calls call_at() of the derived class, which in turn accesses data that are actually stored in the base class. So, each derived class only knows where each element is to be found; using this information, the base class implements all the remaining functionality (for instance, an operator[] that yields a new tuple where each element is the result of applying operator[] to the respective element of the original tuple).
D's template mixins provide a much more convenient and less verbose alternative to CRTP; almost as convenient as macros. Your static_cast<DerivedType&>(*this).x and my der().x would be just x in this case. Plus, you wouldn't need DerivedType within Base at all.

I think it is better to assume that the derived class has two member functions x() and y(). You can change your implementation of Base::calculate() to use these functions instead of using the member variables.
Then, the derived class has a lot more freedom in the kinds of data it holds.
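A minimal sketch of that suggestion, assuming accessor names x() and y() as proposed above; the derived classes Stored and Computed are invented for this example to show two different ways of supplying the values.

```cpp
template <typename DerivedType>
class Base {
public:
    // Relies only on the derived class exposing x() and y().
    int calculate() const {
        const DerivedType& self = static_cast<const DerivedType&>(*this);
        return self.x() + self.y();
    }
};

// Keeps the values as data members.
class Stored : public Base<Stored> {
public:
    Stored(int x, int y) : x_(x), y_(y) {}
    int x() const { return x_; }
    int y() const { return y_; }
private:
    int x_;
    int y_;
};

// Derives both values from a single member instead of storing them.
class Computed : public Base<Computed> {
public:
    explicit Computed(int n) : n_(n) {}
    int x() const { return n_ * 2; }
    int y() const { return n_ + 1; }
private:
    int n_;
};
```

With this layout, Stored(3, 4).calculate() adds the stored members, while Computed(5).calculate() adds two values that never exist as members at all.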

Vector of pointers to base type, find all instances of a given derived type stored in a base type

Suppose you have a base class inside of a library:
class A {};
and derived classes
class B: public A {};
class C: public A {};
Now instances of B and C are stored in a std::vector of boost::shared_ptr<A>:
std::vector<boost::shared_ptr<A> > A_vec;
A_vec.push_back(boost::shared_ptr<B>(new B()));
A_vec.push_back(boost::shared_ptr<C>(new C()));
Adding instances of B and C is done by a user, and there is no way to determine in advance the order, in which they will be added.
However, inside of the library, there may be a need to perform specific actions on B and C, so the pointer to the base class needs to be cast to B and C.
I can of course do "trial and error" conversions, i.e. try to cast to B and C (and any other derivative of the base class), until I find a conversion that doesn't throw. However, this method seems very crude and error-prone, and I'm looking for a more elegant (and better performing) way.
I am looking for a solution that will also work with C++98, but may involve boost functionality.
Any ideas ?
EDIT:
O.k., thanks for all the answers so far!
I'd like to give some more details regarding the use-case. All of this happens in the context of parametric optimization.
Users define the optimization problem by:
Specifying the parameters, i.e. their types (e.g. "constrained double", "constrained integer", "unconstrained double", "boolean", etc.) and initial values
Specifying the evaluation function, which assigns one or more evaluations (double values) to a given parameter set
Different optimization algorithms then act on the problem definitions, including their parameters.
There is a number of predefined parameter objects for common cases, but users may also create their own parameter objects, by deriving from one of my base classes. So from a library perspective, apart from the fact that the parameter objects need to comply with a given (base-class) API, I cannot assume much about parameter objects.
The problem definition is a user-defined C++-class, derived from a base-class with a std::vector interface. The user adds his (predefined or home-grown) parameter objects and overloads a fitness-function.
Access to the parameter objects may happen
from within the optimization algorithms (usually o.k., even for home-grown parameter objects, as derived parameter objects need to provide access functions for their values).
from within the user-supplied fitness function (usually o.k., as the user knows where to find which parameter object in the collection and its value can be accessed easily)
This works fine.
There may however be special cases where
a user wants to access specifics of his home-grown parameter types
a third party has supplied the parameter structure (this is an Open Source library, others may add code for specific optimization problems)
the parameter structure (i.e. which parameters are where in the vector) may be modified as part of the optimization problem --> example: training of the architecture of a neural network
Under these circumstances it would be great to have an easy method to access all parameter objects of a given derived type inside of the collection of base types.
I already have a templated "conversion_iterator". It iterates over the vector of base objects and skips those that do not comply with the desired target type. However, this is based on "trial and error" conversion (i.e. I check whether the converted smart pointer is NULL), which I find very inelegant and error-prone.
I'd love to have a better solution.
NB: The optimization library is targeted at use-cases where the evaluation step for a given parameter set may last arbitrarily long (usually seconds, possibly hours or longer). So speed of access to parameter types is not much of an issue. But stability and maintainability are ...

There’s no better general solution than trying to cast and seeing whether it succeeds. You can alternatively derive the dynamic typeid and compare it to all types in turn, but that is effectively the same amount of work.
More fundamentally, your need to do this hints at a design problem: the whole purpose of a base class is to be able to treat children as if they were parents. There are certain situations where this is necessary though, in which case you’d use a visitor to dispatch them.
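For reference, a classic visitor for this hierarchy could look roughly like the sketch below. It assumes that A can be given an accept() member, which the question does not confirm, and the names Visitor, CountingVisitor and count_types are invented for the example. Raw pointers are used only to keep the sketch free of library dependencies.

```cpp
#include <cstddef>
#include <vector>

class B;
class C;

// One visit() overload per concrete type that needs special handling.
struct Visitor {
    virtual ~Visitor() {}
    virtual void visit(B&) = 0;
    virtual void visit(C&) = 0;
};

class A {
public:
    virtual ~A() {}
    virtual void accept(Visitor& v) = 0;   // assumed addition to the base class
};

class B : public A {
public:
    void accept(Visitor& v) { v.visit(*this); }   // dispatches to visit(B&)
};

class C : public A {
public:
    void accept(Visitor& v) { v.visit(*this); }   // dispatches to visit(C&)
};

// Example visitor: counts how many objects of each type it has seen.
struct CountingVisitor : Visitor {
    int b_count;
    int c_count;
    CountingVisitor() : b_count(0), c_count(0) {}
    void visit(B&) { ++b_count; }
    void visit(C&) { ++c_count; }
};

CountingVisitor count_types(const std::vector<A*>& items) {
    CountingVisitor counter;
    for (std::size_t i = 0; i < items.size(); ++i)
        items[i]->accept(counter);
    return counter;
}
```

The catch is that the Visitor interface has to list every concrete type up front, so user-defined parameter types would not fit without extending it.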

If possible, add virtual methods to class A to do the "specific actions on B and C".
If that's not possible or not reasonable, use the pointer form of dynamic_cast, so there are no exceptions involved.
for (boost::shared_ptr<A> a : A_vec)
{
if (B* b = dynamic_cast<B*>(a.get()))
{
b->do_something();
}
else if (C* c = dynamic_cast<C*>(a.get()))
{
something_else(*c);
}
}
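The same pointer-form check can be packaged as a small filter that returns all elements of one derived type. The sketch below is only an illustration: it uses std::shared_ptr and std::dynamic_pointer_cast from C++11 rather than the Boost equivalents in the question, it assumes A is polymorphic (it needs at least one virtual function for dynamic_cast to work), and the function name collect is made up.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

class A { public: virtual ~A() {} };   // must be polymorphic for dynamic casts
class B : public A {};
class C : public A {};

// Returns shared pointers to every element whose dynamic type is Target
// (or a class derived from Target); elements of other types are skipped.
template <typename Target>
std::vector<std::shared_ptr<Target> >
collect(const std::vector<std::shared_ptr<A> >& items) {
    std::vector<std::shared_ptr<Target> > result;
    for (std::size_t i = 0; i < items.size(); ++i) {
        // Yields an empty pointer instead of throwing when the type does not match.
        std::shared_ptr<Target> p = std::dynamic_pointer_cast<Target>(items[i]);
        if (p)
            result.push_back(p);
    }
    return result;
}
```

Because Target is a template parameter, this also works for types the library has never heard of, such as user-defined parameter classes.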

Adding instances of B and C is done by a user, and there is no way to determine in advance the order, in which they will be added.
Okay, so just put them in two different containers?
std::vector<boost::shared_ptr<A> > A_vec;
std::vector<boost::shared_ptr<B> > B_vec;
std::vector<boost::shared_ptr<C> > C_vec;
void add(B * p)
{
B_vec.push_back(boost::shared_ptr<B>(p));
A_vec.push_back(B_vec.back()); // share ownership with the typed container
}
void add(C * p)
{
C_vec.push_back(boost::shared_ptr<C>(p));
A_vec.push_back(C_vec.back()); // share ownership with the typed container
}
Then you can iterate over the Bs or Cs to your heart's content.

I would suggest implementing a method in the base class (e.g. TypeOf()) that returns the type of the particular object. Make sure you declare that method as pure virtual so that you are forced to implement it in the derived types. As for the type itself, you can define an enum with one value per class.
enum class ClassType { ClassA, ClassB, ClassC };
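Filled out, the idea might look like the sketch below. The class bodies and the count_of helper are assumptions added for illustration; note that with TypeOf() pure virtual, A itself can no longer be instantiated, so ClassA is never returned here.

```cpp
#include <cstddef>
#include <vector>

enum class ClassType { ClassA, ClassB, ClassC };

class A {
public:
    virtual ~A() {}
    virtual ClassType TypeOf() const = 0;   // every derived class must answer
};

class B : public A {
public:
    ClassType TypeOf() const override { return ClassType::ClassB; }
};

class C : public A {
public:
    ClassType TypeOf() const override { return ClassType::ClassC; }
};

// Hypothetical helper: counts the elements that report the wanted type.
std::size_t count_of(const std::vector<const A*>& items, ClassType wanted) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < items.size(); ++i)
        if (items[i]->TypeOf() == wanted)
            ++n;
    return n;
}
```

One design cost is that the enum has to be edited for every new derived class, which may not suit a library that users extend with their own types.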

This answer might interest you: Generating an interface without virtual functions?
This shows you both approaches
variant w/visitor in a single collection
separate collections,
as have been suggested by others (Fred and Konrad, notably). The latter is more efficient for iteration, the former could well be more pure and maintainable. It could even be more efficient too, depending on the usage patterns.

Extending std::(w)string, by adding a constructor and some member functions

Sure, the std::string interface is already bloated. But it's missing some (for me) crucial elements. For example, a std::wstring cannot be constructed from a plain const char* (which is what is needed to create one from a string literal). I'd also like to add an operator/ and a split function. Anyway, that's all beside the point of the question, which is what is preventing me from writing a core class's guts for a project.
I know I can privately inherit from std::(w)string, and "import" all members with using. This misses the crucial non-member template functions, which are numerous.
How can I approach this better? I know public inheritance "solves" the problem, but it introduces the problem of deleting through a base class pointer when the class has no virtual destructor. Note that I'm not planning to add data members, so is this really a problem, or is this corner case still fine to use public inheritance?
Please don't say "don't do this", unless you can provide a way that 1) does what I want, 2) doesn't require me to write it all myself, 3) does not bloat my caller-side interface.

Do not derive from std::basic_string<...> but rather create algorithms doing the appropriate operations, e.g.:
template <typename cT>
std::basic_string<cT> construct(char const* str) {
// ...
}
Likewise for split(), operator/(), etc. In principle, most members of std::basic_string<...> shouldn't be members in the first place...
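To make the free-function approach concrete, here is one way the elided body and a split() could be filled in. This is a sketch under assumptions: construct() simply widens each char, so it is only meaningful for plain ASCII input and performs no encoding conversion, and the split() signature is invented for this example.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Builds a basic_string<cT> from a narrow C string by converting each char
// individually. Assumes plain ASCII; no character-set conversion is done.
template <typename cT>
std::basic_string<cT> construct(char const* str) {
    return std::basic_string<cT>(str, str + std::strlen(str));
}

// Splits text at every occurrence of delimiter; empty fields are kept.
template <typename cT>
std::vector<std::basic_string<cT> >
split(std::basic_string<cT> const& text, cT delimiter) {
    typedef typename std::basic_string<cT>::size_type size_type;
    std::vector<std::basic_string<cT> > parts;
    size_type start = 0;
    for (;;) {
        size_type end = text.find(delimiter, start);
        if (end == std::basic_string<cT>::npos) {
            parts.push_back(text.substr(start));   // last (or only) field
            return parts;
        }
        parts.push_back(text.substr(start, end - start));
        start = end + 1;
    }
}
```

Both functions work on any std::basic_string specialization, so callers keep using std::string and std::wstring directly and all the existing non-member functions continue to apply.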

C++ typedef versus unelaborated inheritance

I have a data structure made of nested STL containers:
typedef std::map<Solver::EnumValue, double> SmValueProb;
typedef std::map<Solver::VariableReference, Solver::EnumValue> SmGuard;
typedef std::map<SmGuard, SmValueProb> SmTransitions;
typedef std::map<Solver::EnumValue, SmTransitions> SmMachine;
This form of the data is only used briefly in my program, and there's not much behavior that makes sense to attach to these types besides simply storing their data. However, the compiler (VC++2010) complains that the resulting names are too long.
Redefining the types as subclasses of the STL containers with no further elaboration seems to work:
typedef std::map<Solver::EnumValue, double> SmValueProb;
class SmGuard : public std::map<Solver::VariableReference, Solver::EnumValue> { };
class SmTransitions : public std::map<SmGuard, SmValueProb> { };
class SmMachine : public std::map<Solver::EnumValue, SmTransitions> { };
Recognizing that the STL containers aren't intended to be used as a base class, is there actually any hazard in this scenario?

There is one hazard: if you call delete on a pointer to a base class with no virtual destructor, you have Undefined Behavior. Otherwise, you are fine.
At least that's the theory. In practice, in the MSVC ABI or the Itanium ABI (gcc, Clang, icc, ...), delete on a base class with no virtual destructor (flagged by -Wdelete-non-virtual-dtor with gcc and clang, provided the class has virtual methods) only results in a problem if your derived class adds non-static attributes with a non-trivial destructor (e.g. a std::string).
In your specific case, this seems fine... but...
... you might still want to encapsulate (using Composition) and expose meaningful (business-oriented) methods. Not only will it be less hazardous, it will also be easier to understand than it->second.find('x')->begin()...
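A small sketch of what such encapsulation could look like for the innermost map. The class name ValueProbabilities and its member functions are invented for illustration, and EnumValue is a stand-in typedef because the real Solver::EnumValue is not shown in the question.

```cpp
#include <map>

typedef int EnumValue;  // hypothetical stand-in for Solver::EnumValue

// Wraps the map instead of deriving from it and exposes only the
// operations that make sense for a value-to-probability table.
class ValueProbabilities {
public:
    void set(EnumValue value, double probability) {
        probs_[value] = probability;
    }

    // Returns 0.0 for values that were never set.
    double probability_of(EnumValue value) const {
        std::map<EnumValue, double>::const_iterator it = probs_.find(value);
        return it == probs_.end() ? 0.0 : it->second;
    }

    // Sum of all stored probabilities.
    double total() const {
        double sum = 0.0;
        for (std::map<EnumValue, double>::const_iterator it = probs_.begin();
             it != probs_.end(); ++it)
            sum += it->second;
        return sum;
    }

private:
    std::map<EnumValue, double> probs_;
};
```

Like the subclass trick, this gives the compiler a short class name to work with, while nobody can hold a std::map pointer to the object and delete through it.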

Yes there is:
std::map<Solver::VariableReference, Solver::EnumValue>* x = new SmGuard;
delete x;
results in undefined behavior.

This is one of the controversial points of C++ versus "inheritance-based classical OOP".
There are two aspects that must be taken into consideration:
a typedef introduces another name for the same type: std::map<Solver::EnumValue, double> and SmValueProb are, to all effects, the exact same thing and can be used interchangeably.
a class introduces a new type that is (in principle) unrelated to anything else.
Class relations are defined by the way the class is "made up", and by what makes implicit operations and conversions possible with other types.
Outside of specific programming paradigms (like OOP, which associates inheritance with the concept of an "is-a" relation), inheritance, implicit constructors, implicit casts, and so on all do the same thing: they let a type be used across the interface of another type, thus defining a network of possible operations across different types. This is (generally speaking) "polymorphism".
Various programming paradigms exist that say how such a network should be structured, each attempting to optimize a specific aspect of programming, like the representation of runtime-replaceable objects (classical OOP), the representation of compile-time-replaceable objects (CRTP), the use of generic algorithmic functions for different types (generic programming), or the use of "pure functions" to express algorithm composition (functional programming and lambda "captures").
All of them dictate some "rules" about how language "features" must be used since, C++ being multiparadigm, none of its features alone satisfies the requirements of a paradigm, leaving some dirtiness open.
As Luchian said, inheriting from std::map will not produce a pure OOP replaceable type, since a delete through a base pointer will not know how to destroy the derived part, the destructor being non-virtual by design.
But in fact this is just a particular case: pbase->find will also not call your own find method, if you write one, since std::map::find is not virtual. (But this is not undefined: it is very well defined to be most likely not what you intend.)
The real question is another: is the "classic OOP substitution principle" important in your design or not?
In other words, are you going to use your classes and their bases interchangeably, with functions that just take a std::map* or std::map& parameter, expecting the std::map calls inside those functions to end up in your methods?
If yes, inheritance is NOT THE WAY TO GO. There are no virtual methods in std::map, hence runtime polymorphism will not work.
If no, that is, you're just writing your own class reusing both std::map's behavior and interface, with no intention of interchanging their usage (in particular, you are not allocating your own classes with new and deleting them with delete applied to a std::map pointer), providing just a set of functions taking yourclass& or yourclass* as parameters, then that's perfectly fine. It may even be better than a typedef, since your functions cannot be used with a std::map anymore, thus separating the functionalities.
The alternative can be "encapsulation": that is, make the map an explicit member of your class, either leaving the map accessible as a public member, or making it a private member with an accessor function, or rewriting the map interface yourself in your class. You finally get an unrelated type with the same interface and its own behavior, at the cost of rewriting the entire interface of something that may have hundreds of methods.
NOTE:
To anyone thinking about the danger of the missing virtual dtor, note that encapsulating with public visibility won't solve the problem:
class myclass: public std::map<something...>
{};
std::map<something...>* p = new myclass;
delete p;
is UB exactly like
class myclass
{
public:
std::map<something...> mp;
};
std::map<something...>* p = &((new myclass)->mp);
delete p;
The second sample has the same mistake as the first, it is just less common: they both pretend to use a pointer to a partial object to operate on the entire one, with nothing in the partial object letting you know what the "containing one" is.

C++, statically detect base classes with differing addresses?

If I have a derived class with multiple bases, each this pointer for each base will be different from that of the derived object's this pointer, except for one. Given two types in an inheritance hierarchy, I'd like to detect at compile time whether they share the same this pointer. Something like this should work, but doesn't:
BOOST_STATIC_ASSERT(static_cast<Base1*>((Derived *)0xDEADBEEF) == (Derived*)0xDEADBEEF);
Because it needs to be an 'integral constant expression' and only integer casts are allowed in those according to the standard (which is stupid, because they only need compile time information if no virtual inheritance is being used). The same problem occurs trying to pass the results as integer template parameters.
The best I've been able to do is check at startup, but I need the information during compile (to get some deep template hackery to work).
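For comparison, the startup-time check mentioned above can be written along these lines. This is only a sketch of the runtime fallback, not a compile-time solution; the helper name shares_address and the three example structs are invented for illustration.

```cpp
struct Base1 { int a; };
struct Base2 { int b; };
struct Derived : Base1, Base2 { int c; };

// Returns true when the Base subobject of object starts at the same
// address as the complete object. Evaluated at run time.
template <typename Base, typename DerivedT>
bool shares_address(const DerivedT& object) {
    const void* as_base = static_cast<const Base*>(&object);
    const void* as_derived = &object;
    return as_base == as_derived;
}
```

On the common ABIs one would expect shares_address<Base1>(d) to be true and shares_address<Base2>(d) to be false for a Derived object d, but that placement is an implementation convention rather than something the standard promises.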

I don't know how to check what you want, but note that your assumption is false in the presence of empty base classes. Any number of them can share the same offset from the start of the object, as long as they are of different types.

I am trying to solve this exact same issue. I have an implementation that works if you know what member variable is at the beginning of the base class's layout. E.g. if member variable "x" exists at the start of each class, then the following code will work to yield the byte offset of a particular base class layout from the derived class layout: offsetof(derived, base2::x).
In the case of:
struct base1 { char x[16]; };
struct base2 { int x; };
struct derived : public base1, public base2 { int x; };
static const int my_constant = offsetof(derived, base2::x);
The compiler will properly assign "16" to my_constant on my architecture (x86_64).
The difficulty is to get "16" when you don't know what member variable is at the start of a base class's layout.

I am not even sure that this offset is a constant in the first place. Do you have normative wording suggesting otherwise?
I'd agree that a non-const offset would be bloody hard to implement in the absence of virtual inheritance, and pointless to boot. That's beside the point.

Classes do not have a this pointer - instances of classes do, and it will be different for each instance, no matter how they are derived.

What about using
BOOST_STATIC_ASSERT(boost::is_convertible<Derived*,Base*>::value)
as documented in the following locations...
http://www.boost.org/doc/libs/1_39_0/doc/html/boost_staticassert.html
http://www.boost.org/doc/libs/1_38_0/libs/type_traits/doc/html/boost_typetraits/reference/is_convertible.html

I didn't realize that the compiler would insert this check at runtime, but your underlying assumption isn't entirely correct. Probably not in ways that you care about though: the compiler can use the Empty Base Class Optimization if you happen to inherit from more than one empty base class (one with no data members, so that it need not occupy any storage inside the derived object). That would result in (base class *)(derived *)1 being equal to the same conversion for at least one other base class.
Like I said, this probably isn't something you would really need to care about.
