What, if anything, is wrong with a child class containing a parent class object in C++?

In C++ specifically, but also generally as an OO design principle, is there anything wrong with doing the following? Is it done in practice? If it shows a clear design flaw, what is a good alternative? Are there any advantages?
class Property {};
class CompositeProperty : public Property
{
...
private:
std::vector<Property> m_properties;
};
So specifically, can a derived class contain base class objects?
As a bit of background, I have seen this used to model/mirror an XML structure, but felt the design somewhat flew in the face of the is-a-is-inheritance and has-a-is-composition relationships for which one usually strives.

There is no flaw in design - in fact, this design is only one step away from the well-known and very useful composite pattern. However, there is a significant flaw in the implementation.
Your CompositeProperty aggregates instances of Property, rather than aggregating pointers. This kills the ability to use elements of the CompositeProperty polymorphically. You need to replace a vector of instances with a vector of pointers (preferably, smart pointers) in order to address this issue.
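A minimal sketch of the fix described above, assuming C++14 for std::make_unique. The StringProperty leaf, the describe() method and the describe_sample_tree() helper are hypothetical names added only for illustration; they are not part of the question's code.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Base class. The virtual destructor lets derived objects be destroyed
// correctly through a pointer to Property.
class Property {
public:
    virtual ~Property() = default;
    virtual std::string describe() const { return "property"; }
};

// Hypothetical leaf type, for illustration only.
class StringProperty : public Property {
public:
    explicit StringProperty(std::string value) : m_value(std::move(value)) {}
    std::string describe() const override { return "string(" + m_value + ")"; }
private:
    std::string m_value;
};

// The composite owns its children through smart pointers, so each child
// keeps its dynamic type and can itself be a composite.
class CompositeProperty : public Property {
public:
    void add(std::unique_ptr<Property> child) {
        m_properties.push_back(std::move(child));
    }
    std::string describe() const override {
        std::string out = "composite[";
        for (std::size_t i = 0; i < m_properties.size(); ++i) {
            if (i != 0) out += ", ";
            out += m_properties[i]->describe();  // virtual dispatch per child
        }
        return out + "]";
    }
private:
    std::vector<std::unique_ptr<Property>> m_properties;
};

// Builds a small nested tree and returns its description.
inline std::string describe_sample_tree() {
    auto inner = std::make_unique<CompositeProperty>();
    inner->add(std::make_unique<StringProperty>("b"));
    CompositeProperty root;
    root.add(std::make_unique<StringProperty>("a"));
    root.add(std::move(inner));
    return root.describe();
}
```

Whether std::unique_ptr or std::shared_ptr fits better depends on who else needs to hold on to the children; unique ownership is simply the smallest assumption here.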
A classic place for the composite pattern is representation of expression trees: you start off with an abstract base, and then add representations for constants, variables, function calls, unary expressions, binary expressions, conditionals, and so on. Expressions such as constants and variables do not reference other expressions, while expressions such as unary expressions, binary expressions, and function calls do. This makes the object graph recursive, letting you represent expressions of arbitrary complexity.
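As a rough sketch of the expression-tree idea, assuming C++14 and supporting only constants plus the '+' and '*' operators. The class names Expr, Constant and BinaryExpr and the eval_sample() helper are illustrative assumptions.

```cpp
#include <memory>
#include <utility>

// Abstract base for every node in the tree.
class Expr {
public:
    virtual ~Expr() = default;
    virtual double eval() const = 0;
};

// Leaf node: references no other expressions.
class Constant : public Expr {
public:
    explicit Constant(double v) : m_value(v) {}
    double eval() const override { return m_value; }
private:
    double m_value;
};

// Interior node: holds two sub-expressions, which makes the graph recursive.
class BinaryExpr : public Expr {
public:
    BinaryExpr(char op, std::unique_ptr<Expr> lhs, std::unique_ptr<Expr> rhs)
        : m_op(op), m_lhs(std::move(lhs)), m_rhs(std::move(rhs)) {}
    double eval() const override {
        const double a = m_lhs->eval();
        const double b = m_rhs->eval();
        return m_op == '+' ? a + b : a * b;  // this sketch treats anything else as '*'
    }
private:
    char m_op;
    std::unique_ptr<Expr> m_lhs;
    std::unique_ptr<Expr> m_rhs;
};

// Evaluates (1 + 2) * 4.
inline double eval_sample() {
    auto sum = std::make_unique<BinaryExpr>(
        '+', std::make_unique<Constant>(1.0), std::make_unique<Constant>(2.0));
    BinaryExpr product('*', std::move(sum), std::make_unique<Constant>(4.0));
    return product.eval();
}
```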

So specifically, can a child class contain parent class objects?
In short: yes.

Please take a look at the Composite pattern.
Composite can be used when clients should ignore the difference between compositions of objects and individual objects. If programmers find that they are using multiple objects in the same way, and often have nearly identical code to handle each of them, then composite is a good choice; it is less complex in this situation to treat primitives and composites as homogeneous.

Nothing wrong with it at all but you may want to consider using pointers to the base/parent class. This allows for polymorphic behaviour of your objects. If you add instances of derived classes to your vector as it stands, you will suffer from object slicing.
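To make the slicing problem concrete, here is a small sketch (C++14). The kind() method and the two helper functions are hypothetical, and CompositeProperty is reduced to the minimum needed to show the effect.

```cpp
#include <memory>
#include <string>
#include <vector>

struct Property {
    virtual ~Property() = default;
    virtual std::string kind() const { return "Property"; }
};

struct CompositeProperty : Property {
    std::string kind() const override { return "CompositeProperty"; }
};

// Storing by value copies only the Property part of the object:
// the derived part is sliced off and the override is lost.
inline std::string kind_when_stored_by_value() {
    std::vector<Property> items;
    items.push_back(CompositeProperty{});
    return items.front().kind();
}

// Storing a pointer keeps the dynamic type, so virtual dispatch still works.
inline std::string kind_when_stored_by_pointer() {
    std::vector<std::unique_ptr<Property>> items;
    items.push_back(std::make_unique<CompositeProperty>());
    return items.front()->kind();
}
```

The by-value helper returns "Property" and the by-pointer helper returns "CompositeProperty".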

There is nothing wrong with it.
Take this example (it's C#, but should be straightforward enough) where you define two sub-classes each containing an instance of the inherited one:
public class Person
{
public string Name;
}
public class MalePerson : Person
{
public Person BestFriend;
}
public class FemalePerson : Person
{
public Person BestFriend;
}
In my experience, including an instance of the superclass in the subclasses is most commonly used to either model a hierarchy between heterogeneous objects or refer to an object with as few assumptions as possible.

Related

C++ Inheritence vs Using parent class as a field in children classes [duplicate]

Why prefer composition over inheritance? What trade-offs are there for each approach? When should you choose inheritance over composition?
Prefer composition over inheritance as it is more malleable / easy to modify later, but do not use a compose-always approach. With composition, it's easy to change behavior on the fly with Dependency Injection / Setters. Inheritance is more rigid as most languages do not allow you to derive from more than one type. So the goose is more or less cooked once you derive from TypeA.
My acid test for the above is:
Does TypeB want to expose the complete interface (all public methods no less) of TypeA such that TypeB can be used where TypeA is expected? Indicates Inheritance.
e.g. A Cessna biplane will expose the complete interface of an airplane, if not more. So that makes it fit to derive from Airplane.
Does TypeB want only some/part of the behavior exposed by TypeA? Indicates need for Composition.
e.g. A Bird may need only the fly behavior of an Airplane. In this case, it makes sense to extract it out as an interface / class / both and make it a member of both classes.
Update: Just came back to my answer and it seems now that it is incomplete without a specific mention of Barbara Liskov's Liskov Substitution Principle as a test for 'Should I be inheriting from this type?'
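The acid test above could be sketched in C++ roughly as follows. FlyBehavior, refuel() and take_off() are hypothetical names invented for this illustration.

```cpp
#include <string>

// Hypothetical shared behaviour, extracted into its own class.
class FlyBehavior {
public:
    std::string fly() const { return "flying"; }
};

class Airplane {
public:
    virtual ~Airplane() = default;
    std::string fly() const { return m_flight.fly(); }
    virtual std::string refuel() const { return "refuelling"; }
private:
    FlyBehavior m_flight;
};

// Inheritance: a Cessna exposes the complete Airplane interface
// and can be used wherever an Airplane is expected.
class Cessna : public Airplane {};

// Composition: a Bird reuses only the flying part and exposes no refuel().
class Bird {
public:
    std::string fly() const { return m_flight.fly(); }
private:
    FlyBehavior m_flight;
};

// Client code written against Airplane.
inline std::string take_off(const Airplane& plane) { return plane.fly(); }
```

A Cessna can be passed to take_off(); a Bird cannot, and it never has to pretend it can refuel.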
Think of containment as a "has a" relationship. A car "has an" engine, a person "has a" name, etc.
Think of inheritance as an "is a" relationship. A car "is a" vehicle, a person "is a" mammal, etc.
I take no credit for this approach. I took it straight from the Second Edition of Code Complete by Steve McConnell, Section 6.3.
If you understand the difference, it's easier to explain.
Procedural Code
An example of this is PHP without the use of classes (particularly before PHP5). All logic is encoded in a set of functions. You may include other files containing helper functions and so on and conduct your business logic by passing data around in functions. This can be very hard to manage as the application grows. PHP5 tries to remedy this by offering a more object-oriented design.
Inheritance
This encourages the use of classes. Inheritance is one of the three tenets of OO design (inheritance, polymorphism, encapsulation).
class Person {
String Title;
String Name;
Int Age;
}
class Employee : Person {
Int Salary;
String Title;
}
This is inheritance at work. The Employee "is a" Person or inherits from Person. All inheritance relationships are "is-a" relationships. Employee also shadows the Title property from Person, meaning Employee.Title will return the Title for the Employee and not the Person.
Composition
Composition is favoured over inheritance. To put it very simply you would have:
class Person {
String Title;
String Name;
Int Age;
public Person(String title, String name, Int age) {
this.Title = title;
this.Name = name;
this.Age = age;
}
}
class Employee {
Int Salary;
private Person person;
public Employee(Person p, Int salary) {
this.person = p;
this.Salary = salary;
}
}
Person johnny = new Person ("Mr.", "John", 25);
Employee john = new Employee (johnny, 50000);
Composition is typically "has a" or "uses a" relationship. Here the Employee class has a Person. It does not inherit from Person but instead gets the Person object passed to it, which is why it "has a" Person.
Composition over Inheritance
Now say you want to create a Manager type so you end up with:
class Manager : Person, Employee {
...
}
This example will work fine, however, what if Person and Employee both declared Title? Should Manager.Title return "Manager of Operations" or "Mr."? Under composition this ambiguity is better handled:
class Manager {
public string Title;
public Manager(Person p, Employee e)
{
this.Title = e.Title;
}
}
The Manager object is composed of an Employee and a Person. The Title behaviour is taken from Employee. This explicit composition removes ambiguity among other things and you'll encounter fewer bugs.
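In C++, where multiple inheritance is allowed, the same ambiguity and its resolution through composition might look like the sketch below. It assumes, as in the scenario above, that Person and Employee are unrelated types that both declare a title; the accessor names are illustrative.

```cpp
#include <string>
#include <utility>

struct Person {
    std::string title;  // e.g. "Mr."
    std::string name;
};

struct Employee {
    std::string title;  // e.g. "Manager of Operations"
    int salary;
};

// With multiple inheritance the unqualified member name is ambiguous:
//   struct Manager : Person, Employee {};
//   Manager m; m.title;   // does not compile: 'title' is ambiguous
// Callers would have to write m.Person::title or m.Employee::title.

// With composition, the class decides explicitly which title it exposes.
class Manager {
public:
    Manager(Person p, Employee e)
        : m_person(std::move(p)), m_employee(std::move(e)) {}
    const std::string& title() const { return m_employee.title; }
    const std::string& honorific() const { return m_person.title; }
    const std::string& name() const { return m_person.name; }
private:
    Person m_person;
    Employee m_employee;
};
```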
With all the undeniable benefits provided by inheritance, here are some of its disadvantages.
Disadvantages of Inheritance:
You can't change the implementation inherited from super classes at runtime (obviously because inheritance is defined at compile time).
Inheritance exposes a subclass to details of its parent class implementation, that's why it's often said that inheritance breaks encapsulation (in a sense that you really need to focus on interfaces only not implementation, so reusing by sub classing is not always preferred).
The tight coupling provided by inheritance binds the implementation of a subclass so closely to that of its superclass that any change in the parent implementation can force the subclass to change.
Excessive reusing by sub-classing can make the inheritance stack very deep and very confusing too.
On the other hand Object composition is defined at runtime through objects acquiring references to other objects. In such a case these objects will never be able to reach each-other's protected data (no encapsulation break) and will be forced to respect each other's interface. And in this case also, implementation dependencies will be a lot less than in case of inheritance.
Another very pragmatic reason to prefer composition over inheritance has to do with your domain model, and mapping it to a relational database. It's really hard to map inheritance to the SQL model (you end up with all sorts of hacky workarounds, like creating columns that aren't always used, using views, etc). Some ORMs try to deal with this, but it always gets complicated quickly. Composition can be easily modeled through a foreign-key relationship between two tables, but inheritance is much harder.
While in short I would agree with "Prefer composition over inheritance", very often for me it sounds like "prefer potatoes over coca-cola". There are places for inheritance and places for composition. You need to understand the difference, then this question will disappear. What it really means for me is "if you are going to use inheritance - think again, chances are you need composition".
You should prefer potatoes over coca cola when you want to eat, and coca cola over potatoes when you want to drink.
Creating a subclass should mean more than just a convenient way to call superclass methods. You should use inheritance when the subclass "is-a" superclass both structurally and functionally, when it can be used as the superclass and you are going to use it that way. If that is not the case, it is not inheritance but something else. Composition is when your objects consist of other objects, or have some relationship to them.
So for me it looks like if someone does not know whether he needs inheritance or composition, the real problem is that he does not know whether he wants to drink or to eat. Think about your problem domain more, understand it better.
Didn't find a satisfactory answer here, so I wrote a new one.
To understand why we should "prefer composition over inheritance", we first need to recover the assumption omitted from this shortened idiom.
There are two benefits of inheritance: subtyping and subclassing
Subtyping means conforming to a type (interface) signature, i.e. a set of APIs, and one can override part of the signature to achieve subtyping polymorphism.
Subclassing means implicit reuse of method implementations.
With the two benefits come two different purposes for using inheritance: subtyping oriented and code reuse oriented.
If code reuse is the sole purpose, subclassing may give one more than what one needs, i.e. some public methods of the parent class don't make much sense for the child class. In this case, instead of favoring composition over inheritance, composition is demanded. This is also where the "is-a" vs. "has-a" notion comes from.
So only when subtyping is the purpose, i.e. to use the new class later in a polymorphic manner, do we face the problem of choosing inheritance or composition. This is the assumption that gets omitted in the shortened idiom under discussion.
To subtype is to conform to a type signature, which means composition must always expose no fewer of the type's APIs. Now the trade-offs kick in:
Inheritance provides straightforward code reuse if not overridden, while composition has to re-code every API, even if it's just a simple job of delegation.
Inheritance provides straightforward open recursion via the internal polymorphic site this, i.e. invoking an overriding method (or even type) in another member function, either public or private (though discouraged). Open recursion can be simulated via composition, but it requires extra effort and may not always be viable. This answer to a duplicated question discusses something similar.
Inheritance exposes protected members. This breaks encapsulation of the parent class, and if used by subclass, another dependency between the child and its parent is introduced.
Composition has the benefit of inversion of control, and its dependency can be injected dynamically, as is shown in the decorator pattern and the proxy pattern.
Composition has the benefit of combinator-oriented programming, i.e. working in a way like the composite pattern.
Composition immediately follows programming to an interface.
Composition has the benefit of easy multiple inheritance.
With the above trade-offs in mind, we hence prefer composition over inheritance. Yet for tightly related classes, i.e. when implicit code reuse really pays off, or the magic power of open recursion is desired, inheritance shall be the choice.
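The open recursion point in the trade-offs above can be illustrated with a small sketch. The Greeter classes are hypothetical and exist only to show where the call through this ends up.

```cpp
#include <string>

class Greeter {
public:
    virtual ~Greeter() = default;
    virtual std::string name() const { return "world"; }
    // greet() calls name() through `this`, so a subclass override is picked up.
    std::string greet() const { return "hello, " + name(); }
};

// Inheritance: overriding name() changes what the inherited greet() produces.
class LoudGreeter : public Greeter {
public:
    std::string name() const override { return "WORLD"; }
};

// Composition with plain delegation: greet() is forwarded to the wrapped
// object, whose `this` is still a plain Greeter, so the wrapper's name()
// is never consulted.
class WrappedGreeter {
public:
    std::string name() const { return "WORLD"; }
    std::string greet() const { return m_inner.greet(); }
private:
    Greeter m_inner;
};
```

Getting the inheritance behaviour out of the composed version would take extra effort, for example passing the wrapper back into the wrapped object, which is the "extra effort" the list refers to.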
Inheritance is pretty enticing, especially coming from procedural-land, and it often looks deceptively elegant. I mean, all I need to do is add this one bit of functionality to some other class, right? Well, one of the problems is that inheritance is probably the worst form of coupling you can have.
Your base class breaks encapsulation by exposing implementation details to subclasses in the form of protected members. This makes your system rigid and fragile. The more tragic flaw however is the new subclass brings with it all the baggage and opinion of the inheritance chain.
The article, Inheritance is Evil: The Epic Fail of the DataAnnotationsModelBinder, walks through an example of this in C#. It shows the use of inheritance when composition should have been used and how it could be refactored.
When can you use composition?
You can always use composition. In some cases, inheritance is also possible and may lead to a more powerful and/or intuitive API, but composition is always an option.
When can you use inheritance?
It is often said that if "a bar is a foo", then the class Bar can inherit the class Foo. Unfortunately, this test alone is not reliable. Use the following instead:
a bar is a foo, AND
bars can do everything that foos can do.
The first test ensures that all getters of Foo make sense in Bar (= shared properties), while the second test makes sure that all setters of Foo make sense in Bar (= shared functionality).
Example: Dog/Animal
A dog is an animal AND dogs can do everything that animals can do (such as breathing, moving, etc.). Therefore, the class Dog can inherit the class Animal.
Counter-example: Circle/Ellipse
A circle is an ellipse BUT circles can't do everything that ellipses can do. For example, circles can't stretch, while ellipses can. Therefore, the class Circle cannot inherit the class Ellipse.
This is called the Circle-Ellipse problem, which isn't really a problem, but more an indication that "a bar is a foo" isn't a reliable test by itself. In particular, this example highlights that derived classes should extend the functionality of base classes, never restrict it. Otherwise, the base class couldn't be used polymorphically. Adding the test "bars can do everything that foos can do" ensures that polymorphic use is possible, and is equivalent to the Liskov Substitution Principle:
Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it
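A sketch of how the Circle/Ellipse counter-example fails this test in practice. The stretch_x() method and the stretch_keeps_b() client function are assumptions made for illustration.

```cpp
class Ellipse {
public:
    Ellipse(double a, double b) : m_a(a), m_b(b) {}
    virtual ~Ellipse() = default;
    // Contract assumed by clients: only the x semi-axis changes.
    virtual void stretch_x(double factor) { m_a *= factor; }
    double a() const { return m_a; }
    double b() const { return m_b; }
protected:
    double m_a;
    double m_b;
};

class Circle : public Ellipse {
public:
    explicit Circle(double r) : Ellipse(r, r) {}
    // To remain a circle, both axes must change together, which
    // restricts what the base class promised.
    void stretch_x(double factor) override { m_a *= factor; m_b *= factor; }
};

// Client code written against Ellipse, unaware of derived classes.
// Returns true if the y semi-axis was left untouched, as the client expects.
inline bool stretch_keeps_b(Ellipse& e) {
    const double before = e.b();
    e.stretch_x(2.0);
    return e.b() == before;
}
```

The function holds for a plain Ellipse and silently breaks when handed a Circle, which is exactly the substitution the principle forbids.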
When should you use inheritance?
Just because you can use inheritance doesn't mean you should: using composition is always an option. Inheritance is a powerful tool allowing implicit code reuse and dynamic dispatch, but it does come with a few disadvantages, which is why composition is often preferred. The trade-offs between inheritance and composition aren't obvious, and in my opinion are best explained in lcn's answer.
As a rule of thumb, I tend to choose inheritance over composition when polymorphic use is expected to be very common, in which case the power of dynamic dispatch can lead to a much more readable and elegant API. For example, having a polymorphic class Widget in GUI frameworks, or a polymorphic class Node in XML libraries, allows for an API which is much more readable and intuitive to use than what you would have with a solution purely based on composition.
In Java or C#, an object cannot change its type once it has been instantiated.
So, if your object needs to appear as a different object or behave differently depending on its state or conditions, then use Composition: refer to the State and Strategy design patterns.
If the object needs to be of the same type, then use Inheritance or implement interfaces.
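The same idea carries over to C++. Below is a minimal State-pattern sketch, assuming C++14; the Connection, Offline and Online names are hypothetical.

```cpp
#include <memory>
#include <string>
#include <utility>

// The varying behaviour lives behind an interface.
class ConnectionState {
public:
    virtual ~ConnectionState() = default;
    virtual std::string send(const std::string& msg) const = 0;
};

class Offline : public ConnectionState {
public:
    std::string send(const std::string&) const override { return "queued"; }
};

class Online : public ConnectionState {
public:
    std::string send(const std::string& msg) const override {
        return "sent: " + msg;
    }
};

// Connection never changes its own type; it swaps the object it is
// composed with, and its behaviour changes accordingly.
class Connection {
public:
    Connection() : m_state(std::make_unique<Offline>()) {}
    void set_state(std::unique_ptr<ConnectionState> state) {
        m_state = std::move(state);
    }
    std::string send(const std::string& msg) const { return m_state->send(msg); }
private:
    std::unique_ptr<ConnectionState> m_state;
};
```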
Personally I learned to always prefer composition over inheritance. There is no programmatic problem you can solve with inheritance which you cannot solve with composition; though you may have to use Interfaces(Java) or Protocols(Obj-C) in some cases. Since C++ doesn't know any such thing, you'll have to use abstract base classes, which means you cannot get entirely rid of inheritance in C++.
Composition is often more logical, it provides better abstraction, better encapsulation, better code reuse (especially in very large projects) and is less likely to break anything at a distance just because you made an isolated change anywhere in your code. It also makes it easier to uphold the "Single Responsibility Principle", which is often summarized as "There should never be more than one reason for a class to change.", and it means that every class exists for a specific purpose and it should only have methods that are directly related to its purpose. Also having a very shallow inheritance tree makes it much easier to keep the overview even when your project starts to get really large. Many people think that inheritance represents our real world pretty well, but that isn't the truth. The real world uses much more composition than inheritance. Pretty much every real world object you can hold in your hand has been composed out of other, smaller real world objects.
There are downsides of composition, though. If you skip inheritance altogether and only focus on composition, you will notice that you often have to write a couple of extra code lines that weren't necessary if you had used inheritance. You are also sometimes forced to repeat yourself and this violates the DRY Principle (DRY = Don't Repeat Yourself). Also composition often requires delegation, and a method is just calling another method of another object with no other code surrounding this call. Such "double method calls" (which may easily extend to triple or quadruple method calls and even farther than that) have much worse performance than inheritance, where you simply inherit a method of your parent. Calling an inherited method may be equally fast as calling a non-inherited one, or it may be slightly slower, but is usually still faster than two consecutive method calls.
You may have noticed that most OO languages don't allow multiple inheritance. While there are a couple of cases where multiple inheritance can really buy you something, those are exceptions rather than the rule. Whenever you run into a situation where you think "multiple inheritance would be a really cool feature to solve this problem", you are usually at a point where you should re-think inheritance altogether, since, even if it requires a couple of extra code lines, a solution based on composition will usually turn out to be much more elegant, flexible and future proof.
Inheritance is really a cool feature, but I'm afraid it has been overused the last couple of years. People treated inheritance as the one hammer that can nail it all, regardless of whether it was actually a nail, a screw, or maybe something completely different.
My general rule of thumb: Before using inheritance, consider if composition makes more sense.
Reason: Subclassing usually means more complexity and connectedness, i.e. harder to change, maintain, and scale without making mistakes.
A much more complete and concrete answer from Tim Boudreau of Sun:
Common problems to the use of inheritance as I see it are:
Innocent acts can have unexpected results - The classic example of this is calls to overridable methods from the superclass constructor, before the subclasses instance fields have been initialized. In a perfect world, nobody would ever do that. This is not a perfect world.
It offers perverse temptations for subclassers to make assumptions about order of method calls and such - such assumptions tend not to be stable if the superclass may evolve over time. See also my toaster and coffee pot analogy.
Classes get heavier - you don't necessarily know what work your superclass is doing in its constructor, or how much memory it's going to use. So constructing some innocent would-be lightweight object can be far more expensive than you think, and this may change over time if the superclass evolves
It encourages an explosion of subclasses. Classloading costs time, more classes costs memory. This may be a non-issue until you're dealing with an app on the scale of NetBeans, but there, we had real issues with, for example, menus being slow because the first display of a menu triggered massive class loading. We fixed this by moving to more declarative syntax and other techniques, but that cost time to fix as well.
It makes it harder to change things later - if you've made a class public, swapping the superclass is going to break subclasses - it's a choice which, once you've made the code public, you're married to. So if you're not altering the real functionality to your superclass, you get much more freedom to change things later if you use, rather than extend the thing you need. Take, for example, subclassing JPanel - this is usually wrong; and if the subclass is public somewhere, you never get a chance to revisit that decision. If it's accessed as JComponent getThePanel(), you can still do it (hint: expose models for the components within as your API).
Object hierarchies don't scale (or making them scale later is much harder than planning ahead) - this is the classic "too many layers" problem. I'll go into this below, and how the AskTheOracle pattern can solve it (though it may offend OOP purists).
...
My take on what to do, if you do allow for inheritance, which you may take with a grain of salt is:
Expose no fields, ever, except constants
Methods shall be either abstract or final
Call no methods from the superclass constructor
...
all of this applies less to small projects than large ones, and less to private classes than public ones
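The quote describes the Java behaviour of the first pitfall. In C++ the analogous surprise looks different: while a base-class constructor runs, a virtual call resolves to the base version, not the override. A small sketch with hypothetical names:

```cpp
#include <string>

class Base {
public:
    // Virtual call inside the constructor: the Derived part does not
    // exist yet, so Base::who() is the function that runs.
    Base() { m_constructed_as = who(); }
    virtual ~Base() = default;
    virtual std::string who() const { return "Base"; }
    const std::string& constructed_as() const { return m_constructed_as; }
private:
    std::string m_constructed_as;
};

class Derived : public Base {
public:
    std::string who() const override { return "Derived"; }
};
```

Either way, the advice in the quote ("Call no methods from the superclass constructor") avoids the surprise.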
Inheritance is very powerful, but you can't force it (see: the circle-ellipse problem). If you really can't be completely sure of a true "is-a" subtype relationship, then it's best to go with composition.
Inheritance creates a strong relationship between a subclass and its superclass; the subclass must be aware of the superclass's implementation details. Creating the superclass is much harder when you have to think about how it can be extended. You have to document class invariants carefully, and state which other methods the overridable methods use internally.
Inheritance is sometimes useful, if the hierarchy really represents an is-a relationship. It relates to the Open-Closed Principle, which states that classes should be closed for modification but open for extension. That way you can have polymorphism: a generic method deals with the supertype and its methods, while dynamic dispatch invokes the method of the subclass. This is flexible, and helps to create indirection, which is essential in software (to know less about implementation details).
Inheritance is easily overused, though, and creates additional complexity, with hard dependencies between classes. Also understanding what happens during execution of a program gets pretty hard due to layers and dynamic selection of method calls.
I would suggest using composition as the default. It is more modular, and gives the benefit of late binding (you can change the component dynamically). Also, it's easier to test the things separately. And if you need to use a method from a class, you are not forced to be of a certain form (Liskov Substitution Principle).
Suppose an aircraft has only two parts: an engine and wings.
Then there are two ways to design an aircraft class.
Class Aircraft extends Engine{
var wings;
}
Now your aircraft can start with having fixed wings and change them to rotary wings on the fly. It's essentially an engine with wings. But what if I wanted to change the engine on the fly as well?
Either the base class Engine exposes a mutator to change its properties, or I redesign Aircraft as:
Class Aircraft {
var wings;
var engine;
}
Now, I can replace my engine on the fly as well.
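A C++ rendering of the second design might look like this sketch (C++14). The concrete engine and wing types, and the replace_* and describe() methods, are hypothetical additions.

```cpp
#include <memory>
#include <string>
#include <utility>

class Engine {
public:
    virtual ~Engine() = default;
    virtual std::string name() const = 0;
};
class PistonEngine : public Engine {
public:
    std::string name() const override { return "piston"; }
};
class JetEngine : public Engine {
public:
    std::string name() const override { return "jet"; }
};

class Wings {
public:
    virtual ~Wings() = default;
    virtual std::string name() const = 0;
};
class FixedWings : public Wings {
public:
    std::string name() const override { return "fixed"; }
};
class RotaryWings : public Wings {
public:
    std::string name() const override { return "rotary"; }
};

// The aircraft has an engine and has wings; both parts can be swapped.
class Aircraft {
public:
    Aircraft(std::unique_ptr<Engine> engine, std::unique_ptr<Wings> wings)
        : m_engine(std::move(engine)), m_wings(std::move(wings)) {}
    void replace_engine(std::unique_ptr<Engine> engine) { m_engine = std::move(engine); }
    void replace_wings(std::unique_ptr<Wings> wings) { m_wings = std::move(wings); }
    std::string describe() const {
        return m_engine->name() + " engine, " + m_wings->name() + " wings";
    }
private:
    std::unique_ptr<Engine> m_engine;
    std::unique_ptr<Wings> m_wings;
};
```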
If you want the canonical, textbook answer people have been giving since the rise of OOP (which you see many people giving in these answers), then apply the following rule: "if you have an is-a relationship, use inheritance. If you have a has-a relationship, use composition".
This is the traditional advice, and if that satisfies you, you can stop reading here and go on your merry way. For everyone else...
is-a/has-a comparisons have problems
For example:
A square is-a rectangle, but if your rectangle class has setWidth()/setHeight() methods, then there's no reasonable way to make a Square inherit from Rectangle without breaking Liskov's substitution principle.
An is-a relationship can often be rephrased to sound like a has-a relationship. For example, an employee is-a person, but a person also has-an employment status of "employed".
is-a relationships can lead to nasty multiple inheritance hierarchies if you're not careful. After all, there's no rule in English that states that an object is exactly one thing.
People are quick to pass this "rule" around, but has anyone ever tried to back it up, or explain why it's a good heuristic to follow? Sure, it fits nicely into the idea that OOP is supposed to model the real world, but that's not in-and-of-itself a reason to adopt a principle.
See this StackOverflow question for more reading on this subject.
To know when to use inheritance vs composition, we first need to understand the pros and cons of each.
The problems with implementation inheritance
Other answers have done a wonderful job at explaining the issues with inheritance, so I'll try to not delve into too many details here. But, here's a brief list:
It can be difficult to follow a logic that weaves between base and sub-class methods.
Carelessly implementing one method in your class by calling another overridable method will cause you to leak implementation details and break encapsulation, as the end-user could override your method and detect when you internally call it. (See "Effective Java" item 18).
The fragile base problem, which simply states that your end-user's code will break if they happen to depend on the leakage of implementation details when you attempt to change them. To make matters worse, most OOP languages allow inheritance by default - API designers who aren't proactively preventing people from inheriting from their public classes need to be extra cautious whenever they refactor their base classes. Unfortunately, the fragile base problem is often misunderstood, causing many to not understand what it takes to maintain a class that anyone can inherit from.
The deadly diamond of death
The problems with composition
It can sometimes be a little verbose.
That's it. I'm serious. This is still a real issue and can sometimes create conflict with the DRY principle, but it's generally not that bad, at least compared to the myriad of pitfalls associated with inheritance.
When should inheritance be used?
Next time you're drawing out your fancy UML diagrams for a project (if you do that), and you're thinking about adding in some inheritance, please adhere to the following advice: don't.
At least, not yet.
Inheritance is sold as a tool to achieve polymorphism, but bundled with it is this powerful code-reuse system, that frankly, most code doesn't need. The problem is, as soon as you publicly expose your inheritance hierarchy, you're locked into this particular style of code-reuse, even if it's overkill to solve your particular problem.
To avoid this, my two cents would be to never expose your base classes publicly.
If you need polymorphism, use an interface.
If you need to allow people to customize the behavior of your class, provide explicit hook-in points via the strategy pattern. It's a more readable way to accomplish this, and it's easier to keep this sort of API stable, as you're in full control over what behaviors they can and cannot change.
If you're trying to follow the open-closed principle by using inheritance to avoid adding a much-needed update to a class, just don't. Update the class. Your codebase will be much cleaner if you actually take ownership of the code you're hired to maintain instead of trying to tack stuff onto the side of it. If you're scared about introducing bugs, then get the existing code under test.
If you need to reuse code, start out by trying to use composition or helper functions.
Finally, if you've decided that there's no other good option, and you must use inheritance to achieve the code-reuse that you need, then you can use it, but, follow these four P.A.I.L. rules of restricted inheritance to keep it sane.
Use inheritance as a private implementation detail. Don't expose your base class publicly, use interfaces for that. This lets you freely add or remove inheritance as you see fit without making a breaking change.
Keep your base class abstract. It makes it easier to divide out the logic that needs to be shared from the logic that doesn't.
Isolate your base and child classes. Don't let your subclass override base class methods (use the strategy pattern for that), and avoid having them expect properties/methods to exist on each other, use other forms of code-sharing to achieve that. Use appropriate language features to force all methods on the base class to be non-overridable ("final" in Java, or non-virtual in C#).
Inheritance is a last resort.
The Isolate rule in particular may sound a little rough to follow, but if you discipline yourself, you'll get some pretty nice benefits. In particular, it gives you the freedom to avoid all of the main nasty pitfalls associated with the inheritance that were mentioned above.
It's much easier to follow the code because it doesn't weave in and out of base/sub classes.
You can not accidentally leak when your methods are internally calling other overridable methods if you never make any of your methods overridable. In other words, you won't accidentally break encapsulation.
The fragile base class problem stems from the ability to depend on accidentally leaked implementation details. Since the base class is now isolated, it will be no more fragile than a class depending on another via composition.
The deadly diamond of death isn't an issue anymore, since there's simply no need to have multiple layers of inheritance. If you have the abstract base classes B and C, which both share a lot of functionality, just move that functionality out of B and C and into a new abstract base class, class D. Anyone who inherited from B should update to inherit from both B and D, and anyone who inherited from C should inherit from C and D. Since your base classes are all private implementation details, it shouldn't be too difficult to figure out who's inheriting from what, to make these changes.
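As one possible reading of the advice above in C++ (the answer itself names only Java and C#), here is a sketch of "use an interface for polymorphism" combined with "explicit hook-in points via the strategy pattern". Formatter, ReportFormatter and the Decorate alias are hypothetical names.

```cpp
#include <functional>
#include <string>
#include <utility>

// Public interface: the only thing callers depend on for polymorphism.
class Formatter {
public:
    virtual ~Formatter() = default;
    virtual std::string format(const std::string& text) const = 0;
};

// Concrete class marked final, so nobody can subclass it and depend on its
// internals. Customisation happens through an explicit, injected strategy.
class ReportFormatter final : public Formatter {
public:
    using Decorate = std::function<std::string(const std::string&)>;

    explicit ReportFormatter(Decorate decorate)
        : m_decorate(std::move(decorate)) {}

    std::string format(const std::string& text) const override {
        // The class stays in control of what the hook can and cannot change.
        return "[report] " + m_decorate(text);
    }
private:
    Decorate m_decorate;
};
```

A caller would pass a lambda such as one that appends "!" to the text, and then use the object through a Formatter reference.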
Conclusion
My primary suggestion would be to use your brain on this matter. What's far more important than a list of dos and don'ts about when to use inheritance is an intuitive understanding of inheritance and its associated pros and cons, along with a good understanding of the other tools out there that can be used instead of inheritance (composition isn't the only alternative. For example, the strategy pattern is an amazing tool that's forgotten far too often). Perhaps when you have a good, solid understanding of all of these tools, you'll choose to use inheritance more often than I would recommend, and that's completely fine. At least, you're making an informed decision, and aren't just using inheritance because that's the only way you know how to do it.
Further reading:
An article I wrote on this subject, that dives even deeper and provides examples.
A webpage talking about three different jobs that inheritance does, and how those jobs can be done via other means in the Go language.
A list of reasons why it can be good to declare your class as non-inheritable (e.g. "final" in Java).
The "Effective Java" book by Joshua Bloch, item 18, which discusses composition over inheritance, and some of the dangers of inheritance.
You need to have a look at The Liskov Substitution Principle in Uncle Bob's SOLID principles of class design. :)
To address this question from a different perspective for newer programmers:
Inheritance is often taught early when we learn object-oriented programming, so it's seen as an easy solution to a common problem.
I have three classes that all need some common functionality. So if I
write a base class and have them all inherit from it, then they will
all have that functionality and I'll only need to maintain it in one
place.
It sounds great, but in practice it almost never, ever works, for one of several reasons:
We discover that there are some other functions that we want our classes to have. If the way that we add functionality to classes is through inheritance, we have to decide - do we add it to the existing base class, even though not every class that inherits from it needs that functionality? Do we create another base class? But what about classes that already inherit from the other base class?
We discover that for just one of the classes that inherits from our base class we want the base class to behave a little differently. So now we go back and tinker with our base class, maybe adding some virtual methods, or even worse, some code that says, "If I'm inherited type A, do this, but if I'm inherited type B, do that." That's bad for lots of reasons. One is that every time we change the base class, we're effectively changing every inherited class. So we're really changing class A, B, C, and D because we need a slightly different behavior in class A. As careful as we think we are, we might break one of those classes for reasons that have nothing to do with those classes.
We might know why we decided to make all of these classes inherit from each other, but it might not (probably won't) make sense to someone else who has to maintain our code. We might force them into a difficult choice - do I do something really ugly and messy to make the change I need (see the previous bullet point) or do I just rewrite a bunch of this?
In the end, we tie our code in some difficult knots and get no benefit whatsoever from it except that we get to say, "Cool, I learned about inheritance and now I used it." That's not meant to be condescending because we've all done it. But we all did it because no one told us not to.
As soon as someone explained "favor composition over inheritance" to me, I thought back over every time I tried to share functionality between classes using inheritance and realized that most of the time it didn't really work well.
The antidote is the Single Responsibility Principle. Think of it as a constraint. My class must do one thing. I must be able to give my class a name that somehow describes that one thing it does. (There are exceptions to everything, but absolute rules are sometimes better when we're learning.) It follows that I cannot write a base class called ObjectBaseThatContainsVariousFunctionsNeededByDifferentClasses. Whatever distinct functionality I need must be in its own class, and then other classes that need that functionality can depend on that class, not inherit from it.
At the risk of oversimplifying, that's composition - composing multiple classes to work together. And once we form that habit we find that it's much more flexible, maintainable, and testable than using inheritance.
When you want to "copy"/Expose the base class' API, you use inheritance. When you only want to "copy" functionality, use delegation.
One example of this: You want to create a Stack out of a List. Stack only has pop, push and peek. You shouldn't use inheritance given that you don't want push_back, push_front, removeAt, et al.-kind of functionality in a Stack.
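The Stack example can be sketched with delegation as follows. This is only a minimal sketch: the Stack class, its member names, and the choice of std::list as the underlying container are my own assumptions for illustration.

```cpp
#include <cassert>
#include <list>

// Hypothetical Stack that delegates to a std::list instead of inheriting
// from it, so push_front, insert, etc. never leak into the Stack API.
template <typename T>
class Stack {
public:
    void push(const T& value) { items_.push_back(value); }

    // Precondition: the stack is not empty.
    T pop() {
        assert(!items_.empty());
        T top = items_.back();
        items_.pop_back();
        return top;
    }

    // Precondition: the stack is not empty.
    const T& peek() const {
        assert(!items_.empty());
        return items_.back();
    }

    bool empty() const { return items_.empty(); }

private:
    std::list<T> items_;  // has-a: only the functionality is reused
};
```

Callers only ever see push, pop, peek and empty, so swapping std::list for another container later would not affect them.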
These two ways can live together just fine and actually support each other.
Composition is just playing it modular: you create an interface similar to the parent class, create a new object, and delegate calls to it. If these objects do not need to know about each other, composition is quite safe and easy to use. There are so many possibilities here.
However, if the parent class for some reason needs to access functions provided by the "child class", it may look to an inexperienced programmer like a great place to use inheritance. The parent class can just call its own abstract "foo()", which is overridden by the subclass, which can then give the value to the abstract base.
It looks like a nice idea, but in many cases it's better to just give the class an object which implements foo() (or even to set the value that foo() provides manually) than to inherit the new class from some base class which requires the function foo() to be specified.
Why?
Because inheritance is a poor way of moving information.
Composition has a real edge here: the relationship can be reversed. The "parent class" or "abstract worker" can aggregate any specific "child" objects implementing a certain interface, and any child can be set inside any other type of parent that accepts its type. And there can be any number of objects; for example, MergeSort or QuickSort could sort any list of objects implementing an abstract Compare interface. Or to put it another way: any group of objects which implement "foo()" and any other group of objects which can make use of objects having "foo()" can play together.
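A rough sketch of that reversed relationship, assuming a hypothetical Compare interface over ints. For brevity I use a plain insertion sort as the "abstract worker" rather than MergeSort or QuickSort; all names here are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical interface: anything that can order two ints.
struct Compare {
    virtual ~Compare() = default;
    virtual bool less(int a, int b) const = 0;
};

struct Ascending : Compare {
    bool less(int a, int b) const override { return a < b; }
};

struct Descending : Compare {
    bool less(int a, int b) const override { return a > b; }
};

// The worker is handed its comparison object; nobody has to subclass
// the sorter just to feed it a different ordering.
void insertionSort(std::vector<int>& v, const Compare& cmp) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        for (std::size_t j = i; j > 0 && cmp.less(v[j], v[j - 1]); --j) {
            std::swap(v[j], v[j - 1]);
        }
    }
}
```

Any sorter that accepts a Compare works with any ordering, and any ordering works with any sorter.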
I can think of three real reasons for using inheritance:
You have many classes with the same interface and you want to save time writing them
You have to use the same base class for each object
You need to modify private variables, which cannot be public under any circumstances
If these are true, then it is probably necessary to use inheritance.
There is nothing bad in using reason 1; it is a very good thing to have a solid interface on your objects. This can be done using composition or with inheritance, no problem - if this interface is simple and does not change. Usually inheritance is quite effective here.
If the reason is number 2 it gets a bit tricky. Do you really only need to use the same base class? In general, just using the same base class is not good enough, but it may be a requirement of your framework, a design consideration which can not be avoided.
However, if you want to use the private variables (case 3), then you may be in trouble. If you consider global variables unsafe, then you should also consider using inheritance to get access to private variables unsafe. Mind you, global variables are not all THAT bad - databases are essentially a big set of global variables. But if you can handle it, then it's quite fine.
Aside from is a/has a considerations, one must also consider the "depth" of inheritance your object has to go through. Anything beyond five or six levels of inheritance deep might cause unexpected casting and boxing/unboxing problems, and in those cases it might be wise to compose your object instead.
When you have an is-a relation between two classes (example dog is a canine), you go for inheritance.
On the other hand, when you have a has-a or some adjective relationship between two classes (student has courses) or (teacher studies courses), you choose composition.
A simple way to make sense of this would be that inheritance should be used when you need an object of your class to have the same interface as its parent class, so that it can thereby be treated as an object of the parent class (upcasting). Moreover, function calls on a derived class object would remain the same everywhere in code, but the specific method to call would be determined at runtime (i.e. the low-level implementation differs, the high-level interface remains the same).
Composition should be used when you do not need the new class to have the same interface, i.e. you wish to conceal certain aspects of the class' implementation which the user of that class need not know about. So composition is more in the way of supporting encapsulation (i.e. concealing the implementation) while inheritance is meant to support abstraction (i.e. providing a simplified representation of something, in this case the same interface for a range of types with different internals).
Subtyping is appropriate and more powerful where the invariants can be enumerated, else use function composition for extensibility.
I agree with @Pavel when he says there are places for composition and there are places for inheritance.
I think inheritance should be used if your answer is an affirmative to any of these questions.
Is your class part of a structure that benefits from polymorphism? For example, if you had a Shape class, which declares a method called draw(), then we clearly need Circle and Square classes to be subclasses of Shape, so that their client classes would depend on Shape and not on specific subclasses.
Does your class need to re-use any high level interactions defined in another class? The template method design pattern would be impossible to implement without inheritance. I believe all extensible frameworks use this pattern.
However, if your intention is purely that of code re-use, then composition most likely is a better design choice.
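For the second question, here is a small sketch of the template method pattern. Report, SalesReport and their methods are hypothetical names of mine: the base class fixes the high-level interaction in render(), and the subclass only supplies the pieces.

```cpp
#include <cassert>
#include <string>

// Hypothetical base class: render() defines the high-level steps once.
class Report {
public:
    virtual ~Report() = default;

    // The template method: non-virtual, so the skeleton cannot be changed.
    std::string render() const { return "[" + header() + "] " + body(); }

protected:
    virtual std::string header() const { return "report"; }  // default step
    virtual std::string body() const = 0;                    // required step
};

class SalesReport : public Report {
protected:
    std::string header() const override { return "sales"; }
    std::string body() const override { return "42 units"; }
};
```

Here SalesReport re-uses the interaction defined in Report, which is the kind of re-use that inheritance is suited to.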
Inheritance is a very powerful mechanism for code reuse, but it needs to be used properly. I would say that inheritance is used correctly if the subclass is also a subtype of the parent class. As mentioned above, the Liskov Substitution Principle is the key point here.
Subclass is not the same as subtype. You might create subclasses that are not subtypes (and this is when you should use composition). To understand what a subtype is, let's start by explaining what a type is.
When we say that the number 5 is of type integer, we are stating that 5 belongs to a set of possible values (as an example, see the possible values for the Java primitive types). We are also stating that there is a valid set of methods I can perform on the value like addition and subtraction. And finally we are stating that there are a set of properties that are always satisfied, for example, if I add the values 3 and 5, I will get 8 as a result.
To give another example, think about the abstract data types Set of integers and List of integers: the values they can hold are restricted to integers. They both support a set of methods, like add(newValue) and size(). And they have different properties (class invariants): a Set does not allow duplicates, while a List does (of course, there are other properties that they both satisfy).
A subtype is also a type, which has a relation to another type, called the parent type (or supertype). The subtype must satisfy the features (values, methods and properties) of the parent type. The relation means that in any context where the supertype is expected, a subtype can be substituted for it without affecting the behaviour of the execution. Let's look at some code to illustrate what I'm saying. Suppose I write a List of integers (in some sort of pseudo language):
class List {
    data = new Array();
    Integer size() {
        return data.length;
    }
    add(Integer anInteger) {
        data[data.length] = anInteger;
    }
}
Then, I write the Set of integers as a subclass of the List of integers:
class Set, inheriting from: List {
    add(Integer anInteger) {
        if (data.notContains(anInteger)) {
            super.add(anInteger);
        }
    }
}
Our Set of integers class is a subclass of List of integers, but it is not a subtype, because it does not satisfy all the features of the List class. The values and the method signatures are satisfied, but the properties are not. The behaviour of the add(Integer) method has clearly been changed and no longer preserves the properties of the parent type. Think from the point of view of the client of your classes. They might receive a Set of integers where a List of integers is expected. The client might want to add a value and get that value added to the List even if it already exists in the List. But she won't get that behaviour if the value exists. A big surprise for her!
This is a classic example of an improper use of inheritance. Use composition in this case.
(a fragment from: use inheritance properly).
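Translated to C++, the composition alternative might look roughly like this. IntList and IntSet are hypothetical names standing in for the pseudo-language List and Set above, and the contains() helper is my own addition.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

class IntList {
public:
    void add(int value) { data_.push_back(value); }  // duplicates allowed
    bool contains(int value) const {
        return std::find(data_.begin(), data_.end(), value) != data_.end();
    }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<int> data_;
};

// IntSet has an IntList but is not an IntList, so no client can be handed
// a set where list behaviour (duplicates kept) is expected.
class IntSet {
public:
    void add(int value) {
        if (!items_.contains(value)) {
            items_.add(value);
        }
    }
    std::size_t size() const { return items_.size(); }

private:
    IntList items_;
};
```

The set still reuses the list's storage code, but the two types are unrelated, so the list's properties are never silently broken.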
Even though Composition is preferred, I would like to highlight pros of Inheritance and cons of Composition.
Pros of Inheritance:
It establishes a logical "IS A" relation. If Car and Truck are two types of Vehicle (base class), child class IS A base class.
i.e.
Car is a Vehicle
Truck is a Vehicle
With inheritance, you can define/modify/extend a capability
Base class provides no implementation and sub-class has to override complete method (abstract) => You can implement a contract
Base class provides default implementation and sub-class can change the behaviour => You can re-define contract
Sub-class adds extension to base class implementation by calling super.methodName() as first statement => You can extend a contract
Base class defines structure of the algorithm and sub-class will override a part of algorithm => You can implement Template_method without change in base class skeleton
Cons of Composition:
In inheritance, subclass can directly invoke base class method even though it's not implementing base class method because of IS A relation. If you use composition, you have to add methods in container class to expose contained class API
e.g. If Car contains Vehicle and if you have to get price of the Car, which has been defined in Vehicle, your code will be like this
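class Vehicle {
    private final double price;
    Vehicle(double price) { this.price = price; }
    protected double getPrice() {
        return price;
    }
}
class Car {
    private final Vehicle vehicle;
    Car(Vehicle vehicle) { this.vehicle = vehicle; }
    // Car has to forward the call by hand to expose Vehicle's API.
    protected double getPrice() {
        return vehicle.getPrice();
    }
}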
A rule of thumb I have heard is that inheritance should be used when it's an "is-a" relationship and composition when it's a "has-a". Even with that, I feel that you should always lean towards composition because it eliminates a lot of complexity.
As many people have said, I first check whether an "is-a" relationship exists. If it does, I usually check the following:
Whether the base class can be instantiated, that is, whether the base class can be non-abstract. If it can be non-abstract, I usually prefer composition.
E.g. 1. Accountant is an Employee. But I will not use inheritance, because an Employee object can be instantiated.
E.g. 2. Book is a SellingItem. A SellingItem cannot be instantiated; it is an abstract concept. Hence I will use inheritance. The SellingItem is an abstract base class (or interface in C#).
What do you think about this approach?
Also, I support @anon's answer in Why use inheritance at all?
The main reason for using inheritance is not as a form of composition - it is so you can get polymorphic behaviour. If you don't need polymorphism, you probably should not be using inheritance.
@MatthieuM. says in https://softwareengineering.stackexchange.com/questions/12439/code-smell-inheritance-abuse/12448#comment303759_12448
The issue with inheritance is that it can be used for two orthogonal purposes:
interface (for polymorphism)
implementation (for code reuse)
REFERENCE
Which class design is better?
Inheritance vs. Aggregation
Composition v/s Inheritance is a wide subject. There is no real answer for what is better as I think it all depends on the design of the system.
Generally, the type of relationship between objects gives better information for choosing one of them.
If the relationship is "IS-A", then inheritance is the better approach.
Otherwise, if the relationship is "HAS-A", then composition is the better approach.
It depends entirely on the entity relationship.

c++: why not use friend for compositions?

I'm a computational physicist trying to learn how to code properly. I've written several programs by now, but the following canonical example keeps coming back, and I'm unsure as to how to handle it. Let's say that I have a composition of two objects such as
class node
{
    int position;
};
class lattice
{
    vector <node*> nodes;
    double distance (node*,node*);
};
Now, this will not work, because position is a private member of node. I know of two ways to solve this: either you create an accessor such as getpos(){return position}, or make lattice a friend of node.
The second of these solutions seems a lot easier to me. However, I am under the impression that it is considered slightly bad practice, and that one generally ought to stick to accessors and avoid friend. My question is this: When should I use accessors, and when should I use friendship for compositions such as these?
Also, a bonus question that has been bugging me for some time: Why are compositions preferred to subclasses in the first place? To my understanding the HAS-A mnemonic argues this, but it seems more intuitive to me to imagine a lattice as an object that has an object called node. That would then be an object inside of an object, i.e. a subclass?
Friend is better suited if you want to give access rights only to specific classes, rather than to all. If you define getpos(){return position}, position information will be publicly accessible via that getter method. If you use the friend keyword, on the other hand, only the lattice class will be able to access position info. Therefore, it depends purely on your design decision: whether you want to make the information publicly accessible or not.
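To make the friend option concrete, here is a minimal sketch based on the question's node and lattice. The node constructor and the one-dimensional distance formula are assumptions of mine, since the question does not show them.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

class lattice;  // forward declaration so node can name it as a friend

class node {
public:
    explicit node(int p) : position(p) {}  // assumed constructor

private:
    int position;
    friend class lattice;  // only lattice may read position; no public getter
};

class lattice {
public:
    // Assumed formula: absolute difference of the two positions.
    double distance(const node* a, const node* b) const {
        return std::abs(a->position - b->position);
    }

private:
    std::vector<node*> nodes;
};
```

Any other class that tries to read position will fail to compile, which is the access restriction described above.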
You made a "quasi class", and this is a textbook example of how not to do OOP, because changing position doesn't change anything else in node. Even if changing position would change something in node, I would rethink the structure to avoid complexity and improve the compiler's ability to optimize your code.
I’ve witnessed C++ and Java programmers routinely churning out such
classes according to a sort of mental template. When I ask them to
explain their design, they often insist that this is some sort of
“canonical form” that all elementary and composite item (i.e.
non-container) classes are supposed to take, but they’re at a loss to
explain what it accomplishes. They sometimes claim that we need the
get and set functions because the member data are private, and, of
course, the member data have to be private so that they can be changed
without affecting other programs!
Should read:
struct node
{
    int position;
};
Not all classes have to have private data members at all. If your intention is to create a new data type, then it may be perfectly reasonable for position to just be a public member. For instance, if you were creating a type of "3D Vectors", that is essentially nothing but a 3-tuple of numeric data types. It doesn't benefit from hiding its data members since its constructor and accessor methods have no fewer degrees of freedom than its internal state does, and there is no internal state that can be considered invalid.
template<class T>
struct Vector3 {
    T x;
    T y;
    T z;
};
Writing that would be perfectly acceptable - plus overloads for various operators and other functions for normalizing, taking the magnitude, and so on.
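As a rough idea of what those extra overloads and functions could look like, here is a sketch. The free functions operator+, dot and magnitude are my own picks; the original only mentions the general idea.

```cpp
#include <cassert>
#include <cmath>

template <class T>
struct Vector3 {
    T x;
    T y;
    T z;
};

// Component-wise addition; works directly on the public members.
template <class T>
Vector3<T> operator+(const Vector3<T>& a, const Vector3<T>& b) {
    return {a.x + b.x, a.y + b.y, a.z + b.z};
}

template <class T>
T dot(const Vector3<T>& a, const Vector3<T>& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Euclidean length, always computed in double.
template <class T>
double magnitude(const Vector3<T>& v) {
    return std::sqrt(static_cast<double>(dot(v, v)));
}
```

None of these functions needs to be a member or a friend, because there is no invariant on x, y and z to protect.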
If a node has no illegal position value, but no two nodes in a lattice can have the same position, or there is some other constraint, then it might make sense for node to have a public member position, while lattice has a private member nodes.
Generally, when you are constructing "algebraic data types" like the Vector3<T> example, you use struct (or class with public) when you are creating product types, i.e. logical ANDs between other existent types, and you use std::variant when you are creating sum types, i.e. logical ORs between existent types. (And for completeness' sake, function types then take the place of logical implications.)
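A small sketch of the product/sum distinction, assuming C++17 for std::variant. Circle, Rectangle, Shape and area are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <variant>

// Product types: a Rectangle is a width AND a height.
struct Circle { double radius; };
struct Rectangle { double width; double height; };

// Sum type: a Shape is a Circle OR a Rectangle.
using Shape = std::variant<Circle, Rectangle>;

double area(const Shape& s) {
    // One overload per alternative; std::visit picks the one that
    // matches whatever the variant currently holds.
    struct Visitor {
        double operator()(const Circle& c) const {
            return 3.141592653589793 * c.radius * c.radius;
        }
        double operator()(const Rectangle& r) const {
            return r.width * r.height;
        }
    };
    return std::visit(Visitor{}, s);
}
```

If a new alternative is added to Shape and the visitor is not updated, the std::visit call stops compiling, which a base-class hierarchy does not give you.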
Compositions are preferred over inheritance when, like you say, the relationship is a "has-a" relationship. Inheritance is best used when you are trying to extend or link with some legacy code, I believe. It was previously also used as a poor approximation of sum types, before std::variant existed, because the union keyword really doesn't work very well. However, you are almost always better off using composition.
Concerning your example code, I am not sure that this is a composition. In a composition, the child object does not exist as an independent entity. As a rule of thumb, its lifetime is coupled with the container. Since you are using a vector<node*> nodes, I assume that the nodes are created somewhere else and lattice only has pointers to these objects. An example of a composition would be
class lattice {
    node n1; // a single object
    std::vector<node> manyNodes;
};
Now, addressing the questions:
"When should I use accessors, and when should I use friendship for compositions such as these?"
If you use plenty of accessors in your code, you are creating structs and not classes in an OO sense. In general, I would argue that besides certain prominent exceptions such as container classes, one rarely needs setters at all. The same can be argued for simple getters for plain members, except when returning the property is a real part of the class interface, e.g. the number of elements in a container. Your interface should provide meaningful services that manipulate the internal data of your object. If you frequently get some internal data with a getter, compute something, and then set it with a setter, you should put this computation in a method.
One of the main reasons to avoid 'friend' is that it introduces a very strong coupling between two components. The guideline here is "low coupling, high cohesion". Strong coupling is considered a problem because it makes code hard to change, and most time on software projects is spent in maintenance or evolution. Friend is especially problematic because it allows unrelated code to be based on internal properties of your class, which can break encapsulation. There are valid use cases for 'friend' when the classes form a strongly related cluster (aka high cohesion).
"Why are compositions preferred to subclasses in the first place?"
In general, you should prefer plain composition over inheritance and friend classes since it reduces coupling. In a composition, the container class can only access the public interface of the contained class and has no knowledge about the internal data.
From a pure OOP point of view, your design has some weaknesses and is probably not very OO. One of the basic principles of OOP is encapsulation, which means coupling related data and behavior into objects. The node class, for example, does not have any meaning other than storing a position, so it does not have any behavior. It seems that you modeled the data of your code but not the behavior. This can be a very appropriate design and lead to good code, but it is not really object-oriented.
"To my understanding the HAS-A mnemonic argues this, but it seems more intuitive to me to imagine a lattice as an object that has an object called node. That would then be an object inside of an object, i.e. a subclass?"
I think you got this wrong. Public inheritance models an is-a-relationship.
class A: public B {};
It basically says that objects of class A are a special kind of B, fulfilling all the assumptions that you can make about objects of type B. This is known as the Liskov substitution principle. It basically says that everywhere in your code where you use a B you should be able to also use an A. Considering this, class lattice: public node would mean that every lattice is a node. On the other hand,
class lattice {
    int x;
    node n;
    int y;
};
means that an object of type lattice contains another object of type node, in C++ physically placed together with x and y. This is a has-a-relationship.

We use inheritance when A (derived class) "is a" B (base class). What do we do when A "can be" B or C?

Sorry for this ugly question, but I didn't know how to word it. I'll give an example of what I mean:
A Human can be a Mage or a Warrior, so Mage and Warrior could inherit from Human. But what if an Orc can be both too? We can't say "a Human is a Warrior" or "a Warrior is a Human". Do Orc and Human (or a parent class, Humanoid) inherit all the skills, and then choose what to use?
I don't know if I should tag a specific language, since it's a general question about oop, but since different languages can have different approaches to the same problem, I would prefer answers from a C++ perspective.
Improve your modelling
Abstract class Race, concrete classes Human, Orc, etc...
Abstract class Class, concrete classes Mage, Warrior, etc...
When calculating the stats for your Mage, for instance, ask Race (not Human, or Orc) for things like int get_intelligence_bonus() (abstract function in Race). If at all possible, interactions should be between Race and Class, not the concrete counterparts.
Use composition to actually construct your character:
Character::Character( std::unique_ptr<Race> r, std::unique_ptr<Class> c )
{
// calculate stats, etc...
}
You can use pure OOP for dynamic binding or, if you prefer, you can use templates for compile-time binding, but that's an orthogonal implementation issue. Either way, you need to get your class hierarchy right.
If you are new to C++ or if you only need race and class in the constructor, you may use const& instead of std::unique_ptr<>. But if you are serious about learning C++, do read about ownership semantics and variable lifetimes so you can better understand std::unique_ptr<>.
http://en.cppreference.com/w/cpp/memory/unique_ptr
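Putting the pieces of this answer together, a fuller sketch might look like the following. Only Race, Class, Character and get_intelligence_bonus() come from the text above; base_intelligence(), intelligence(), and the concrete numbers are invented for illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

class Race {
public:
    virtual ~Race() = default;
    virtual int get_intelligence_bonus() const = 0;
};

class Human : public Race {
public:
    int get_intelligence_bonus() const override { return 1; }
};

class Orc : public Race {
public:
    int get_intelligence_bonus() const override { return -1; }
};

class Class {
public:
    virtual ~Class() = default;
    virtual int base_intelligence() const = 0;  // hypothetical stat
};

class Mage : public Class {
public:
    int base_intelligence() const override { return 10; }
};

class Warrior : public Class {
public:
    int base_intelligence() const override { return 4; }
};

// Character owns one Race and one Class and only talks to the
// abstract interfaces, never to Human, Orc, Mage or Warrior directly.
class Character {
public:
    Character(std::unique_ptr<Race> r, std::unique_ptr<Class> c)
        : race_(std::move(r)), class_(std::move(c)) {}

    int intelligence() const {
        return class_->base_intelligence() + race_->get_intelligence_bonus();
    }

private:
    std::unique_ptr<Race> race_;
    std::unique_ptr<Class> class_;
};
```

An orc mage is then just Character(std::make_unique<Orc>(), std::make_unique<Mage>()), and adding a new race or class does not require touching Character.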
That's when I break out policy-based design. Then I can have Human<Warrior> or Orc<Warrior> or Orc<Mage>. (Or turn it around if it makes more sense: Warrior<Human> etc.) But that's if I know what's going on at compile time.
If I don't know until run time, I'd use a class Humanoid which has a Race (subclasses would be Orc, Human, etc.) and has a RPGClass (subclasses Mage, Warrior, etc.)
In other languages, there are things such as protocols which can define the interface to an object, so that you don't have to define a base class full of abstract virtual methods and such. So RPGClass and Race would be protocols used to interface to classes such as Mage and Human respectively. Policies are just protocols resolved at compile time.
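For the compile-time case, a bare-bones sketch of the policy idea could look like this. The title() and describe() functions are hypothetical; only the Orc<Mage>-style naming is taken from the answer.

```cpp
#include <cassert>
#include <string>

// Policies: each one supplies the class-specific behaviour.
struct Warrior {
    static std::string title() { return "Warrior"; }
};

struct Mage {
    static std::string title() { return "Mage"; }
};

// Hosts: the race is a template that is handed its class policy.
template <typename ClassPolicy>
struct Human {
    std::string describe() const { return "Human " + ClassPolicy::title(); }
};

template <typename ClassPolicy>
struct Orc {
    std::string describe() const { return "Orc " + ClassPolicy::title(); }
};
```

Note that Orc<Mage> and Orc<Warrior> are unrelated types, so this only fits when the combination is known at compile time.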
P.S. I have no idea how metaphorical (or not) your examples are...
In this example, I would see a use for either composition or multiple inheritance.
That is, I would have a mage and warrior class as well as a human and orc class. If I want an Orc Mage, I inherit from both the mage and orc classes.
That said, in practice, multiple inheritance can be screwy (see here: What is the exact problem with multiple inheritance?). As a result, the orc mage class in my implementation would probably have private orc and mage objects that would do the orc- and mage-specific stuff.
A being has an occupation (class) and race. There is no need for 'is-a' here.
There will be class-instance pattern uses, but they will probably not line up perfectly with C++ (or any other language) inheritance rules, so run it manually, restricting it or enhancing it as use cases appear, and be willing to throw it out if you hit a tangle.
If I follow, you are trying to define a class Warrior that can be Human or Orc, but should inherit from the class in question.
If that is what you want to do, the correct way to go about it is to write a generic Warrior class:
template<typename Race>
class Warrior : public Race
{
    // Warrior attributes & methods
};
By doing this, you will be able to instantiate human warriors (Warrior<Human>) as well as orc warriors (Warrior<Orc>), or warriors from any other race.
However, Warrior<Human> and Warrior<Orc> will be considered as totally different classes, which do not belong to the same inheritance tree.
Now, maybe that is not what you want to do. Maybe you want to be able to manipulate containers like vector<Warrior*> to handle both human and orc warriors indiscriminately and make use of polymorphism. In this case, it may make sense to reverse the inheritance and to have a template class Human<Class> that inherits from Class.
But what if you want to also be able to manipulate containers like vector<Human*>? Well, in this case, you want to handle characters polymorphically with regard to their race or class, so it should inherit from both!
template<typename Race, typename Class>
class Character : public Race, public Class
{
    // Character attributes & methods
};
Then you can safely have an inheritance tree for character classes and one for races.
Then again, if you want to be able to manipulate things like vector<Character*> to handle all characters at once, that won't do. The simplest thing would be to create a class AbstractCharacter, from which all characters would inherit. But what if, while you're at it, you want your characters to be able to change class? Then you have to change your design: maybe your character is not a human that is also a warrior, but a character that has a race that is 'human' and a class that is 'warrior'.
In this case, make race and class attributes of your character:
class Character
{
    Race* characterRace;
    Class* characterClass;
    // Character attributes & methods
};
where Race and Class are the roots of the inheritance trees of races and classes respectively. And you will still be able to use polymorphism.
(Note: although I wrote it like this for readability, it would be better to use smart pointers instead of regular pointers in the actual implementation.)
Edit: when I talk about changing the class of the character, I mean at runtime. And that also means that you can create a character of the class and race you want at runtime rather simply.
Traditional OOP terminology includes "is a" and "has a". But no "can be".
You've provided very few details about your specific coding needs. But if I needed something like this, I would still use inheritance. I would create a base class that shares the common qualities of both possible items. And then I would declare specialized derived classes for each of the possible specific cases.
Then I would create the appropriate data type for each specific case, deriving from my common case. I would choose the specific type depending on what type the item is. I could then create a collection of the base class that could contain the different specific types.
If there is no common base class, I might use an interface instead of a base class.
In the end, it's impossible to provide specific examples when you've provided no specific examples of what you are trying to accomplish.

Class design to avoid need for list of base classes

I'm currently in the design phase of a class library and stumbled upon a question similar to "Managing diverse classes with a central manager without RTTI" or "pattern to avoid dynamic_cast".
Imagine there is a class hierarchy with a base class Base and two classes DerivedA and DerivedB that are subclasses of Base. Somewhere in my library there will be a class that needs to hold lists of objects of both types DerivedA and DerivedB. Further suppose that this class will need to perform actions on both types depending on the type. Obviously I will use virtual functions here to implement this behavior. But what if I will need the managing class to give me all objects of type DerivedA?
Is this an indicator of a bad class design because I have the need to perform actions only on a subset of the class hierarchy?
Or does it just mean that my managing class should not use a list of Base but two lists - one for DerivedA and one for DerivedB? So in case I need to perform an action on both types I would have to iterate over two lists. In my case the probability that there will be a need to add new subclasses to the hierarchy is quite low and the current number is around 3 or 4 subclasses.
But what if I will need the managing class to give me all objects of
type DerivedA?
Is this an indicator of a bad class design because I have the need to
perform actions only on a subset of the class hierarchy?
More likely yes than no. If you often need to do this, then it makes sense to question whether the hierarchy makes sense. In that case, you should separate this into two unrelated lists.
Another possible approach is to also handle it through virtual methods, where e.g. DerivedB has a no-op implementation for the methods that don't apply to it. It is hard to tell without knowing more.
It certainly is a sign of bad design if you store (pointers to) objects together that have to be handled differently.
You could however just implement this differing behaviour as an empty function in the base class or use the visitor pattern.
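As a rough illustration of the visitor option, here is a minimal sketch. Base, DerivedA and DerivedB come from the question; Visitor, CollectA, Manager and allA are hypothetical names invented for this example, and the manager is assumed to own its items through std::unique_ptr.

```cpp
#include <memory>
#include <utility>
#include <vector>

class DerivedA;
class DerivedB;

// Visitor interface: one overload per concrete type in the (small, stable) hierarchy.
class Visitor {
public:
    virtual ~Visitor() = default;
    virtual void visit(DerivedA& a) = 0;
    virtual void visit(DerivedB& b) = 0;
};

class Base {
public:
    virtual ~Base() = default;
    virtual void accept(Visitor& v) = 0;
};

class DerivedA : public Base {
public:
    void accept(Visitor& v) override { v.visit(*this); }
};

class DerivedB : public Base {
public:
    void accept(Visitor& v) override { v.visit(*this); }
};

// Hypothetical visitor that keeps only the DerivedA objects; DerivedB is ignored.
class CollectA : public Visitor {
public:
    std::vector<DerivedA*> found;
    void visit(DerivedA& a) override { found.push_back(&a); }
    void visit(DerivedB&) override {}
};

// Hypothetical manager holding a single list of Base.
class Manager {
public:
    void add(std::unique_ptr<Base> item) { items_.push_back(std::move(item)); }

    // Returns non-owning pointers to every DerivedA, without dynamic_cast.
    std::vector<DerivedA*> allA() {
        CollectA collector;
        for (auto& item : items_) item->accept(collector);
        return collector.found;
    }

private:
    std::vector<std::unique_ptr<Base>> items_;
};
```

The cost is that every new subclass needs a new visit overload, which is tolerable here only because the question says the hierarchy is small and unlikely to grow.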
You can do it in several ways.
Try to dynamic_cast to the specific class. This is a brute-force solution; I'd use it only for interfaces, since using it for concrete classes is a kind of code smell. It'll work, though.
Do something like:
class BaseRequest {};
class DerivedASupportedRequest : public BaseRequest {};
Then modify your classes to support the method:
// (...)
void ProcessRequest(const BaseRequest & request);
Create a virtual method bool TryDoSth() in a base class; DerivedB will always return false, while DerivedA will implement the required functionality.
Alternative to the above: create a method Supports(Action action), where Action is an enum defining possible actions or groups of actions; in that case, calling DoSth() on a class that does not support the given feature should throw an exception.
Base class may have a method ActionXController * GetControllerForX(); DerivedA will return the actual controller, DerivedB will return nullptr.
Similarly, base class can provide method: BaseController * GetController(Action a)
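To make one of the options above concrete, here is a minimal sketch of the GetControllerForX() variant. ActionXController and GetControllerForX are the names used in the list; the timesCalled counter and the DoXWhereSupported helper are assumptions added purely for illustration.

```cpp
#include <memory>
#include <vector>

// Hypothetical controller for "action X"; the counter exists only so the effect is observable.
struct ActionXController {
    int timesCalled = 0;
    void DoX() { ++timesCalled; }
};

class Base {
public:
    virtual ~Base() = default;
    // Default: the feature is not supported.
    virtual ActionXController* GetControllerForX() { return nullptr; }
};

class DerivedA : public Base {
public:
    ActionXController* GetControllerForX() override { return &controller_; }

private:
    ActionXController controller_;
};

// DerivedB inherits the default and therefore reports "not supported".
class DerivedB : public Base {};

// Applies X to every item that supports it and returns how many items did.
inline int DoXWhereSupported(const std::vector<std::unique_ptr<Base>>& items) {
    int handled = 0;
    for (const auto& item : items) {
        if (ActionXController* controller = item->GetControllerForX()) {
            controller->DoX();
            ++handled;
        }
    }
    return handled;
}
```

The caller never asks "are you a DerivedA?"; it only asks whether the capability is available, which keeps the single list of Base usable.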
You asked whether it is a bad design. I believe it depends on how much functionality is common and how much is different. If you have 100 common methods and only one different, it would be weird to hold these objects in separate lists. However, if the number of differing methods is noticeable, consider changing the design of your application. This may be a general rule, but there are also exceptions. It's hard to tell without knowing the context.

Why should one not derive from c++ std string class?

I wanted to ask about a specific point made in Effective C++.
It says:
A destructor should be made virtual if a class needs to act like a polymorphic class. It further adds that since std::string does not have a virtual destructor, one should never derive from it. Also std::string is not even designed to be a base class, forget polymorphic base class.
I do not understand what specifically is required in a class to be eligible for being a base class (not a polymorphic one)?
Is the only reason that I should not derive from std::string class is it does not have a virtual destructor? For reusability purpose a base class can be defined and multiple derived class can inherit from it. So what makes std::string not even eligible as a base class?
Also, if there is a base class purely defined for reusability purpose and there are many derived types, is there any way to prevent client from doing Base* p = new Derived() because the classes are not meant to be used polymorphically?
I think this statement reflects the confusion here (emphasis mine):
I do not understand what specifically is required in a class to be eligible for being a base class (not a polymorphic one)?
In idiomatic C++, there are two uses for deriving from a class:
private inheritance, used for mixins and aspect oriented programming using templates.
public inheritance, used for polymorphic situations only. EDIT: Okay, I guess this could be used in a few mixin scenarios too -- such as boost::iterator_facade -- which show up when the CRTP is in use.
There is absolutely no reason to publicly derive a class in C++ if you're not trying to do something polymorphic. Free functions are a standard feature of the language, and they are what you should be using here.
Think of it this way -- do you really want to force clients of your code to convert to using some proprietary string class simply because you want to tack on a few methods? Because unlike in Java or C# (or most similar object oriented languages), when you derive a class in C++ most users of the base class need to know about that kind of a change. In Java/C#, classes are usually accessed through references, which are similar to C++'s pointers. Therefore, there's a level of indirection involved which decouples the clients of your class, allowing you to substitute a derived class without other clients knowing.
However, in C++, classes are value types -- unlike in most other OO languages. The easiest way to see this is what's known as the slicing problem. Basically, consider:
int StringToNumber(std::string copyMeByValue)
{
std::istringstream converter(copyMeByValue);
int result;
if (converter >> result)
{
return result;
}
throw std::logic_error("That is not a number.");
}
If you pass your own string to this method, the copy constructor for std::string will be called to make a copy, not the copy constructor for your derived object -- no matter what child class of std::string is passed. This can lead to inconsistency between your methods and anything attached to the string. The function StringToNumber cannot simply take whatever your derived object is and copy that, simply because your derived object probably has a different size than a std::string -- but this function was compiled to reserve only the space for a std::string in automatic storage. In Java and C# this is not a problem because the only thing like automatic storage involved are reference types, and the references are always the same size. Not so in C++.
Long story short -- don't use inheritance to tack on methods in C++. That's not idiomatic and results in problems with the language. Use non-friend, non-member functions where possible, followed by composition. Don't use inheritance unless you're template metaprogramming or want polymorphic behavior. For more information, see Scott Meyers' Effective C++ Item 23: Prefer non-member non-friend functions to member functions.
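As a small sketch of what "tacking on methods" looks like with non-member functions instead of a derived string class: the namespace string_utils and both function names are hypothetical, chosen only for this example.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

namespace string_utils {  // hypothetical namespace for illustration

// True if s begins with prefix; works on any std::string, no special type needed.
inline bool starts_with(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Returns an upper-cased copy; the argument is taken by value and modified locally.
inline std::string to_upper_copy(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return s;
}

}  // namespace string_utils
```

Existing code that passes std::string around keeps working unchanged, and nothing can be sliced because no new type was introduced.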
EDIT: Here's a more complete example showing the slicing problem. You can see its output on codepad.org
#include <iostream>
struct Base
{
int aMemberForASize;
Base() { std::cout << "Constructing a base." << std::endl; }
Base(const Base&) { std::cout << "Copying a base." << std::endl; }
~Base() { std::cout << "Destroying a base." << std::endl; }
};
struct Derived : public Base
{
int aMemberThatMakesMeBiggerThanBase;
Derived() { std::cout << "Constructing a derived." << std::endl; }
Derived(const Derived&) : Base() { std::cout << "Copying a derived." << std::endl; }
~Derived() { std::cout << "Destroying a derived." << std::endl; }
};
int SomeThirdPartyMethod(Base /* SomeBase */)
{
return 42;
}
int main()
{
Derived derivedObject;
{
//Scope to show the copy behavior of copying a derived.
Derived aCopy(derivedObject);
}
SomeThirdPartyMethod(derivedObject);
}
To offer the counter side to the general advice (which is sound when there are no particular verbosity/productivity issues evident)...
Scenario for reasonable use
There is at least one scenario where public derivation from bases without virtual destructors can be a good decision:
you want some of the type-safety and code-readability benefits provided by dedicated user-defined types (classes)
an existing base is ideal for storing the data, and allows low-level operations that client code would also want to use
you want the convenience of reusing functions supporting that base class
you understand that any additional invariants your data logically needs can only be enforced in code explicitly accessing the data as the derived type, and depending on the extent to which that will "naturally" happen in your design, and how much you can trust client code to understand and cooperate with the logically-ideal invariants, you may want member functions of the derived class to reverify expectations (and throw or whatever)
the derived class adds some highly type-specific convenience functions operating over the data, such as custom searches, data filtering / modifications, streaming, statistical analysis, (alternative) iterators
coupling of client code to the base is more appropriate than coupling to the derived class (as the base is either stable or changes to it reflect improvements to functionality also core to the derived class)
put another way: you want the derived class to continue to expose the same API as the base class, even if that means the client code is forced to change, rather than insulating it in some way that allows the base and derived APIs to grow out of sync
you're not going to be mixing pointers to base and derived objects in parts of the code responsible for deleting them
This may sound quite restrictive, but there are plenty of cases in real world programs matching this scenario.
Background discussion: relative merits
Programming is about compromises. Before you write a more conceptually "correct" program:
consider whether it requires added complexity and code that obfuscates the real program logic, and is therefore more error prone overall despite handling one specific issue more robustly,
weigh the practical costs against the probability and consequences of issues, and
consider "return on investment" and what else you could be doing with your time.
If the potential problems involve usage of the objects that you just can't imagine anyone attempting given your insights into their accessibility, scope and nature of usage in the program, or you can generate compile-time errors for dangerous use (e.g. an assertion that derived class size matches the base's, which would prevent adding new data members), then anything else may be premature over-engineering. Take the easy win in clean, intuitive, concise design and code.
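The compile-time check mentioned above could look roughly like this. Tagged_String and is_tagged are hypothetical names invented for the sketch; static_assert requires C++11.

```cpp
#include <string>

// Hypothetical thin value-add over std::string: behaviour only, no new state.
class Tagged_String : public std::string {
public:
    explicit Tagged_String(const std::string& s) : std::string(s) {}

    // Convenience function expressed purely through the public string interface.
    bool is_tagged() const { return !empty() && front() == '#'; }
};

// Fails to compile if someone later adds a data member (or a virtual function),
// which is exactly when slicing would start to lose information.
static_assert(sizeof(Tagged_String) == sizeof(std::string),
              "Tagged_String must not add state to std::string");
```

This does not make deletion through a std::string* well-defined; it only guards the "no extra state" assumption that the rest of the argument relies on.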
Reasons to consider derivation sans virtual destructor
Say you have a class D publicly derived from B. With no effort, the operations on B are possible on D (with the exception of construction, but even if there are a lot of constructors you can often provide effective forwarding by having one template for each distinct number of constructor arguments: e.g. template <typename T1, typename T2> D(const T1& t1, const T2& t2) : B(t1, t2) { }. Better generalised solution in C++0x variadic templates.)
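With C++11 the constructor forwarding described above can be written once. The sketch below shows two ways, using a made-up base B with two constructors; D1 and D2 are hypothetical names for the two variants.

```cpp
#include <string>
#include <utility>

// Made-up base with more than one constructor.
struct B {
    B(int a, int b) : sum(a + b) {}
    explicit B(const std::string& s) : sum(static_cast<int>(s.size())) {}
    int sum;
};

// Variant 1: a single variadic constructor that perfectly forwards to B.
// Caveat: a greedy template like this can be preferred over the copy
// constructor when copying a non-const D1.
struct D1 : B {
    template <typename... Args>
    explicit D1(Args&&... args) : B(std::forward<Args>(args)...) {}
};

// Variant 2: inherit B's constructors directly.
struct D2 : B {
    using B::B;
};
```

Inheriting constructors is usually the simpler choice when the derived class adds no members that need initialising.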
Further, if B changes then by default D exposes those changes - staying in sync - but someone may need to review extended functionality introduced in D to see if it remains valid, and the client usage.
Rephrasing this: there is reduced explicit coupling between base and derived class, but increased coupling between base and client.
This is often NOT what you want, but sometimes it is ideal, and other times a non issue (see next paragraph). Changes to the base force more client code changes in places distributed throughout the code base, and sometimes the people changing the base may not even have access to the client code to review or update it correspondingly. Sometimes it is better though: if you as the derived class provider - the "man in the middle" - want base class changes to feed through to clients, and you generally want clients to be able - sometimes forced - to update their code when the base class changes without you needing to be constantly involved, then public derivation may be ideal. This is common when your class is not so much an independent entity in its own right, but a thin value-add to the base.
Other times the base class interface is so stable that the coupling may be deemed a non issue. This is especially true of classes like Standard containers.
Summarily, public derivation is a quick way to get or approximate the ideal, familiar base class interface for the derived class - in a way that's concise and self-evidently correct to both the maintainer and client coder - with additional functionality available as member functions (which IMHO - which obviously differs with Sutter, Alexandrescu etc - can aid usability, readability and assist productivity-enhancing tools including IDEs)
C++ Coding Standards - Sutter & Alexandrescu - cons examined
Item 35 of C++ Coding Standards lists issues with the scenario of deriving from std::string. As scenarios go, it's good that it illustrates the burden of exposing a large but useful API, but both good and bad as the base API is remarkably stable - being part of the Standard Library. A stable base is a common situation, but no more common than a volatile one and a good analysis should relate to both cases. While considering the book's list of issues, I'll specifically contrast the issues' applicability to the cases of say:
a) class Issue_Id : public std::string { ...handy stuff... }; <-- public derivation, our controversial usage
b) class Issue_Id : public string_with_virtual_destructor { ...handy stuff... }; <- safer OO derivation
c) class Issue_Id { public: ...handy stuff... private: std::string id_; }; <-- a compositional approach
d) using std::string everywhere, with freestanding support functions
(Hopefully we can agree the composition is acceptable practice, as it provides encapsulation, type safety as well as a potentially enriched API over and above that of std::string.)
So, say you're writing some new code and start thinking about the conceptual entities in an OO sense. Maybe in a bug tracking system (I'm thinking of JIRA), one of them is say an Issue_Id. Data content is textual - consisting of an alphabetic project id, a hyphen, and an incrementing issue number: e.g. "MYAPP-1234". Issue ids can be stored in a std::string, and there will be lots of fiddly little text searches and manipulation operations needed on issue ids - a large subset of those already provided on std::string and a few more for good measure (e.g. getting the project id component, providing the next possible issue id (MYAPP-1235)).
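For concreteness, option a) might be sketched as follows. The member names project_id, number and next are assumptions made for this example, and the code assumes well-formed ids of the form "PROJECT-number" (a malformed id would make std::stoul throw).

```cpp
#include <string>

// Sketch of option a): public derivation from std::string, adding behaviour but no state.
class Issue_Id : public std::string {
public:
    explicit Issue_Id(const std::string& text) : std::string(text) {}

    // Text before the hyphen, e.g. "MYAPP".
    std::string project_id() const { return substr(0, find('-')); }

    // Numeric part after the hyphen, e.g. 1234.
    unsigned long number() const { return std::stoul(substr(find('-') + 1)); }

    // The next possible issue id in the same project, e.g. "MYAPP-1235".
    Issue_Id next() const {
        return Issue_Id(project_id() + "-" + std::to_string(number() + 1));
    }
};
```

Everything is expressed through std::string's public interface, so the whole inherited API (size(), find(), comparison and so on) remains available on an Issue_Id.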
On to Sutter and Alexandrescu's list of issues...
Nonmember functions work well within existing code that already manipulates strings. If instead you supply a super_string, you force changes through your code base to change types and function signatures to super_string.
The fundamental mistake with this claim (and most of the ones below) is that it promotes the convenience of using only a few types, ignoring the benefits of type safety. It's expressing a preference for d) above, rather than insight into c) or b) as alternatives to a). The art of programming involves balancing the pros and cons of distinct types to achieve reasonable reuse, performance, convenience and safety. The paragraphs below elaborate on this.
Using public derivation, the existing code can implicitly access the base class string as a string, and continue to behave as it always has. There's no specific reason to think that the existing code would want to use any additional functionality from super_string (in our case Issue_Id)... in fact it's often lower-level support code pre-existing the application for which you're creating the super_string, and therefore oblivious to the needs provided for by the extended functions. For example, say there's a non-member function to_upper(std::string&, std::string::size_type from, std::string::size_type to) - it could still be applied to an Issue_Id.
So, unless the non-member support function is being cleaned up or extended at the deliberate cost of tightly coupling it to the new code, then it needn't be touched. If it is being overhauled to support issue ids (for example, using the insight into the data content format to upper-case only leading alpha characters), then it's probably a good thing to ensure it really is being passed an Issue_Id by creating an overload ala to_upper(Issue_Id&) and sticking to either the derivation or compositional approaches allowing type safety. Whether super_string or composition is used makes no difference to effort or maintainability. A to_upper_leading_alpha_only(std::string&) reusable free-standing support function isn't likely to be of much use - I can't recall the last time I wanted such a function.
The impulse to use std::string everywhere isn't qualitatively different to accepting all your arguments as containers of variants or void*s so you don't have to change your interfaces to accept arbitrary data, but it makes for error prone implementation and less self-documenting and compiler-verifiable code.
Interface functions that take a string now need to: a) stay away from super_string's added functionality (unuseful); b) copy their argument to a super_string (wasteful); or c) cast the string reference to a super_string reference (awkward and potentially illegal).
This seems to be revisiting the first point - old code that needs to be refactored to use the new functionality, albeit this time client code rather than support code. If the function wants to start treating its argument as an entity for which the new operations are relevant, then it should start taking its arguments as that type and the clients should generate them and accept them using that type. The exact same issues exists for composition. Otherwise, c) can be practical and safe if the guidelines I list below are followed, though it is ugly.
super_string's member functions don't have any more access to string's internals than nonmember functions because string probably doesn't have protected members (remember, it wasn't meant to be derived from in the first place)
True, but sometimes that's a good thing. A lot of base classes have no protected data. The public string interface is all that's needed to manipulate the contents, and useful functionality (e.g. get_project_id() postulated above) can be elegantly expressed in terms of those operations. Conceptually, many times I've derived from Standard containers, I've wanted not to extend or customise their functionality along the existing lines - they're already "perfect" containers - rather I've wanted to add another dimension of behaviour that's specific to my application, and requires no private access. It's because they're already good containers that they're good to reuse.
If super_string hides some of string's functions (and redefining a nonvirtual function in a derived class is not overriding, it's just hiding), that could cause widespread confusion in code that manipulates strings that started their life converted automatically from super_strings.
True for composition too - and more likely to happen as the code doesn't default to passing things through and hence staying in sync, and also true in some situations with run-time polymorphic hierarchies as well. Same-named functions that behave differently in classes that initially appear interchangeable - just nasty. This is effectively the usual caution for correct OO programming, and again not a sufficient reason to abandon the benefits in type safety etc.
What if super_string wants to inherit from string to add more state [explanation of slicing]
Agreed - not a good situation, and somewhere I personally tend to draw the line as it often moves the problems of deletion through a pointer to base from the realm of theory to the very practical - destructors aren't invoked for additional members. Still, slicing can often do what's wanted - given the approach of deriving super_string not to change its inherited functionality, but to add another "dimension" of application-specific functionality....
Admittedly, it's tedious to have to write passthrough functions for the member functions you want to keep, but such an implementation is vastly better and safer than using public or nonpublic inheritance.
Well, certainly agree about the tedium....
Guidelines for successful derivation sans virtual destructor
ideally, avoid adding data members in derived class: variants of slicing can accidentally remove data members, corrupt them, fail to initialise them...
even more so - avoid non-POD data members: deletion via base-class pointer is technically undefined behaviour anyway, but with non-POD types failing to run their destructors is more likely to have non-theoretical problems with resource leaks, bad reference counts etc.
honour the Liskov Substitution Principle / you can't robustly maintain new invariants
for example, in deriving from std::string you can't intercept a few functions and expect your objects to remain uppercase: any code that accesses them via a std::string& or ...* can use std::string's original function implementations to change the value
derive to model a higher level entity in your application, to extend the inherited functionality with some functionality that uses but doesn't conflict with the base; do not expect or try to change the basic operations - and access to those operations - granted by the base type
be aware of the coupling: base class can't be removed without affecting client code even if the base class evolves to have inappropriate functionality, i.e. your derived class's usability depends on the ongoing appropriateness of the base
sometimes even if you use composition you'll need to expose the data member due to performance, thread safety issues or lack of value semantics - so the loss of encapsulation from public derivation isn't tangibly worse
the more likely people using the potentially-derived class will be unaware of its implementation compromises, the less you can afford to make them dangerous
therefore, low-level widely deployed libraries with many ad-hoc casual users should be more wary of dangerous derivation than localised use by programmers routinely using the functionality at application level and/or in "private" implementation / libraries
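The uppercase example from the guidelines above can be sketched like this. Upper_String, is_upper and add_suffix are hypothetical names; the point is only to show how code holding a std::string& bypasses the hiding function.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical type that tries to keep its contents uppercase.
class Upper_String : public std::string {
public:
    explicit Upper_String(const std::string& s) : std::string(s) { make_upper(); }

    // Hides (does not override) std::string::append.
    Upper_String& append(const std::string& s) {
        std::string::append(s);
        make_upper();
        return *this;
    }

    bool is_upper() const {
        return std::none_of(begin(), end(),
                            [](unsigned char c) { return std::islower(c) != 0; });
    }

private:
    void make_upper() {
        std::transform(begin(), end(), begin(),
                       [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    }
};

// Pre-existing code that only knows about std::string.
inline void add_suffix(std::string& s) { s.append("-draft"); }
```

Calling add_suffix on an Upper_String compiles without complaint, runs std::string::append, and leaves lowercase characters in the object, so the invariant holds only as long as all access goes through the derived type.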
Summary
Such derivation is not without issues so don't consider it unless the end result justifies the means. That said, I flatly reject any claim that this can't be used safely and appropriately in particular cases - it's just a matter of where to draw the line.
Personal experience
I do sometimes derive from std::map<>, std::vector<>, std::string etc - I've never been burnt by the slicing or delete-via-base-class-pointer issues, and I've saved a lot of time and energy for more important things. I don't store such objects in heterogeneous polymorphic containers. But, you need to consider whether all the programmers using the object are aware of the issues and likely to program accordingly. I personally like to write my code to use heap and run-time polymorphism only when needed, while some people (due to Java backgrounds, their preferred approach to managing recompilation dependencies or switching between runtime behaviours, testing facilities etc.) use them habitually and therefore need to be more concerned about safe operations via base class pointers.
If you really want to derive from it (not discussing why you want to do it) I think you can prevent Derived class direct heap instantiation by making its operator new private:
class StringDerived : public std::string {
//...
private:
static void* operator new(size_t size);
static void operator delete(void *ptr);
};
But this way you restrict yourself from any dynamic StringDerived objects.
Not only is the destructor not virtual, std::string contains no virtual functions at all, and no protected members. That makes it very hard for the derived class to modify its functionality.
Then why would you derive from it?
Another problem with being non-polymorphic is that if you pass your derived class to a function expecting a string parameter, your extra functionality will just be sliced off and the object will be seen as a plain string again.
Why should one not derive from c++ std string class?
Because it is not necessary. If you want to use DerivedString for functionality extension; I don't see any problem in deriving std::string. The only thing is, you should not interact between both classes (i.e. don't use string as a receiver for DerivedString).
Is there any way to prevent client from doing Base* p = new Derived()
Yes. Make sure that you provide inline wrappers around Base methods inside Derived class. e.g.
class Derived : protected Base { // 'protected' to avoid Base* p = new Derived
public: // class members are private by default, so the wrappers must be made public
const char* c_str () const { return Base::c_str(); }
//...
};
There are two simple reasons for not deriving from a non-polymorphic class:
Technical: it introduces slicing bugs (because in C++ we pass by value unless otherwise specified)
Functional: if it is non-polymorphic, you can achieve the same effect with composition and some function forwarding
If you wish to add new functionalities to std::string, then first consider using free functions (possibly templates), like the Boost String Algorithm library does.
If you wish to add new data members, then properly wrap the class access by embedding it (Composition) inside a class of your own design.
EDIT:
@Tony noticed rightly that the Functional reason I cited was probably meaningless to most people. There is a simple rule of thumb, in good design, that says that when you can pick a solution among several, you should consider the one with the weaker coupling. Composition has weaker coupling than inheritance, and thus should be preferred, when possible.
Also, composition gives you the opportunity to nicely wrap the original class's methods. This is not possible if you pick inheritance (public) and the methods are not virtual (which is the case here).
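A minimal sketch of the composition approach, with a hypothetical Upper_Text class that wraps a std::string and keeps it uppercase. All names here are invented for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical wrapper: the std::string is a private member, so every
// modification has to go through this class's own functions.
class Upper_Text {
public:
    explicit Upper_Text(const std::string& s) : text_(to_upper(s)) {}

    // Wrapped operation: enforces the invariant before delegating.
    void append(const std::string& s) { text_ += to_upper(s); }

    // Plain forwarding for operations that cannot break the invariant.
    std::size_t size() const { return text_.size(); }

    // Read-only access for code that expects a std::string.
    const std::string& str() const { return text_; }

private:
    static std::string to_upper(std::string s) {
        std::transform(s.begin(), s.end(), s.begin(),
                       [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
        return s;
    }

    std::string text_;
};
```

The price is the forwarding boilerplate; the gain is that no caller can reach a mutating std::string member function behind the wrapper's back.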
The C++ standard states that if the base class destructor is not virtual and you delete an object of a derived class through a pointer to the base class, the behavior is undefined.
C++ standard section 5.3.5/3:
if the static type of the operand is different from its dynamic type, the static type shall be a base class of the operand’s dynamic type and the static type shall have a virtual destructor or the behavior is undefined.
To be clear on the Non-polymorphic class & need of virtual destructor
The purpose of making a destructor virtual is to facilitate the polymorphic deletion of objects through a delete-expression. If there is no polymorphic deletion of objects, then you don't need virtual destructors.
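A minimal sketch of what polymorphic deletion means in practice. PolymorphicBase, Tracked and destroy are hypothetical names; the counter is there only so the destructor call is observable.

```cpp
// Base with a virtual destructor: safe to delete derived objects through it.
struct PolymorphicBase {
    virtual ~PolymorphicBase() = default;
};

// Derived class whose destructor has a visible side effect.
struct Tracked : PolymorphicBase {
    explicit Tracked(int& destroyed) : destroyed_(destroyed) {}
    ~Tracked() override { ++destroyed_; }
    int& destroyed_;
};

// The static type of p is PolymorphicBase*, the dynamic type may be Tracked.
// Because the destructor is virtual, ~Tracked() runs first, then ~PolymorphicBase().
// If the base destructor were NOT virtual, this delete would be undefined behaviour.
inline void destroy(PolymorphicBase* p) { delete p; }
```

std::string offers no such guarantee, which is why deleting a derived string through a std::string* falls under the undefined-behaviour clause quoted above.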
Why not to derive from String Class?
One should generally avoid deriving from any standard container class because they don't have virtual destructors, which makes it impossible to delete objects polymorphically.
As for the string class, the string class doesn't have any virtual functions so there is nothing that you can possibly override. The best you can do is hide something.
If at all you want to have a string like functionality you should write a class of your own rather than inherit from std::string.
As soon as you add any member (variable) to your class derived from std::string, won't you systematically corrupt the stack if you use the standard library facilities with an instance of that derived class? After all, the standard library functions and members have their stack offsets fixed (and adjusted) to the size and boundary of the base std::string instance.
Right?
Please, correct me if I am wrong.