Related
After reading and watching much about SOLID principles I was very keen to use these principles in my work (mostly C++ development) since I do think they are good principles and that they indeed will bring much benefit to the quality of my code, readability, testability, reuse and maintainability.
But I have real hard time with the 'D' (Dependency inversion).
This principal states that:
A. High-level modules should not depend on low-level modules. Both should depend on abstractions.
B. Abstractions should not depend on details. Details should depend on abstractions.
Let me explain by example:
Lets say I am writing the following interface:
class SOLIDInterface {
//usual stuff with constructor, destructor, don't copy etc
public:
virtual void setSomeString(const std::string &someString) = 0;
};
(for the sake of simplicity please ignore the other things needed for a "correct interface" such as non virutal publics, private virtuals etc, its not part of the problem.)
notice, that setSomeString() is taking an std::string.
But that breaks the above principal since std::string is an implementation.
Java and C# don't have that problem since the language offers interfaces to all the complex common types such as string and containers.
C++ does not offer that.
Now, C++ does offer the possibility to write this interface in such a way that I could write an 'IString' interface that would take any implementation that will support an std::string interface using type erasure
(Very good article: http://www.artima.com/cppsource/type_erasure.html)
So the implementation could use STL (std::string) or Qt (QString), or my own string implementation or something else.
Like it should be.
But this means, that if I (and not only I but all C++ developers) want to write C++ API which obeys SOLID design principles ('D' included), I will have to implement a LOT of code to accommodate all the common non natural types.
Beyond being not realistic in terms of effort, this solution has other problems such as - what if STL changes?(for this example)
And its not really a solution since STL is not implementing IString, rather IString is abstracting STL, so even if I were to create such an interface the principal problem remains.
(I am not even getting into issues such as this adds polymorphic overhead, which for some systems, depending on size and HW requirements may not be acceptable)
So may question is:
Am I missing something here (which I guess the true answer, but what?), is there a way to use Dependency inversion in C++ without writing a whole new interface layer for the common types in a realistic way - or are we doomed to write API which is always dependent on some implementation?
Thanks for your time!
From the first few comments I received so far I think a clarification is needed:
The selection of std::string was just an example.
It could be QString for that matter - I just took STL since it is the standard.
Its not even important that its a string type, it could be any common type.
I have selected the answer by Corristo not because he explicitly answered my question but because the extensive post (coupled with the other answers) allowed me to extract my answer from it implicitly, realizing that the discussion tends to drift from the actual question which is:
Can you implement Dependency inversion in C++ when you use basic complex types like strings and containers and basically any of the STL with an effort that makes sense. (and the last part is a very important element of the question).
Maybe I should have explicitly noted that I am after run-time polymorphism not compile time.
The clear answer is NO, its not possible.
It might have been possible if STL would have exposed abstract interfaces to their implementations (if there are indeed reasons that prevent the STL implementations to derive from these interfaces (say, performance)) then it still could have simply maintained these abstract interfaces to match the implementations).
For types that I have full control over, yes, there is no technical problem implementing the DIP.
But most likely any such interface (of my own) will still use a string or a container, forcing it to use either the STL implementation or another.
All the suggested solutions below are either not polymorphic in runtime, or/and are forcing quiet a some coding around the interface - when you think you have to do this for all these common types the practicality is simply not there.
If you think you know better, and you say it is possible to have what I described above then simply post the code proving it.
I dare you! :-)
Note that C++ is not an object-oriented programming language, but rather lets the programmer choose between many different paradigms. One of the key principles of C++ is that of zero-cost abstractions, which in particular entails to build abstractions in such a way that users don't pay for what they don't use.
The C#/Java style of defining interfaces with virtual methods that are then implemented by derived classes don't fall into that category though, because even if you don't need the polymorphic behavior, were std::string implementing a virtual interface, every call of one of its methods would incur a vtable lookup. This is unacceptable for classes in the C++ standard library supposed to be used in all kinds of settings.
Defining interfaces without inheriting from an abstract interface class
Another problem with the C#/Java approach is that in most cases you don't actually care that something inherits from a particular abstract interface class and only need that the type you pass to a function supports the operations you use. Restricting accepted parameters to those inheriting from a particular interface class thus actually hinders reuse of existing components, and you often end up writing wrappers to make classes of one library conform to the interfaces of another - even when they already have the exact same member functions.
Together with the fact that inheritance-based polymorphism typically also entails heap allocations and reference semantics with all its problems regarding lifetime management, it is best to avoid inheriting from an abstract interface class in C++.
Generic templates for implicit interfaces
In C++ you can get compile-time polymorphism through templates.
In its simplest form, the interface that an object used in a templated function or class need to conform to is not actually specified in C++ code, but implied by what functions are called on them.
This is the approach used in the STL, and it is really flexible. Take std::vector for example. There the requirements on the value type T of objects you store in it are dependent on what operations you perform on the vector. This allows e.g. to store move-only types as long as you don't use any of the operations that need to make a copy. In such a case, defining an interface that the value types needs to conform to would greatly reduce the usefulness of std::vector, because you'd either need to remove methods that require copies or you'd need to exclude move-only types from being stored in it.
That doesn't mean you can't use dependency inversion, though: The common Button-Lamp example for dependency inversion implemented with templates would look like this:
class Lamp {
public:
void activate();
void deactivate();
};
template <typename T>
class Button {
Button(T& switchable)
: _switchable(&switchable) {
}
void toggle() {
if (_buttonIsInOnPosition) {
_switchable->deactivate();
_buttonIsInOnPosition = false;
} else {
_switchable->activate();
_buttonIsInOnPosition = true;
}
}
private:
bool _buttonIsInOnPosition{false};
T* _switchable;
}
int main() {
Lamp l;
Button<Lamp> b(l)
b.toggle();
}
Here Button<T>::toggle implicitly relies on a Switchable interface, requiring T to have member functions T::activate and T::deactivate. Since Lamp happens to implement that interface it can be used with the Button class. Of course, in real code you would also state these requirements on T in the documentation of the Button class so that users don't need to look up the implementation.
Similarly, you could also declare your setSomeString method as
template <typename String>
void setSomeString(String const& string);
and then this will work with all types that implement all the methods you used in the implementation of setSomeString, hence only relying on an abstract - although implicit - interface.
As always, there are some downsides to consider:
In the string example, assuming you only make use of .begin() and .end() member functions returning iterators that return a char when dereferenced (e.g. to copy it into the classes' local, concrete string data member), you can also accidentally pass a std::vector<char> to it, even though it isn't technically a string. If you consider this a problem is arguable, in a way this can also be seen as the epitome of relying only on abstractions.
If you pass an object of a type that doesn't have the required (member) functions, then you can end up with horrible compiler error messages that make it very hard to find the source of the error.
Only in very limited cases it is possible to separate the interface of a templated class or function from its implementation, as is typically done with separate .h and .cpp files. This can thus lead to longer compile times.
Defining interfaces with the Concepts TS
if you really care about types used in templated functions and classes to conform to a fixed interface, regardless of what you actually use, there are ways to restrict the template parameters only to types conforming to a certain interface with std::enable_if, but these are very verbose and unreadable. In order to make this kind of generic programming easier, the Concepts TS allows to actually define interfaces that are checked by the compiler and thus greatly improves diagnostics. With the Concepts TS, the Button-Lamp example from above translates to
template <typename T>
concept bool Switchable = requires(T t) {
t.activate();
t.deactivate();
};
// Lamp as before
template <Switchable T>
class Button {
public:
Button(T&); // implementation as before
void toggle(); // implementation as before
private:
T* _switchable;
bool _buttonIsInOnPosition{false};
};
If you can't use the Concepts TS (it is only implemented in GCC right now), the closest you can get is the Boost.ConceptCheck library.
Type erasure for runtime polymorphism
There is one case where compile-time polymorphism doesn't suffice, and that is when the types you pass to or get from a particular function aren't fully determined at compile-time but depend on runtime parameters (e.g. from a config file, command-line arguments passed to the executable or even the value of a parameter passed to the function itself).
If you need to store objects (even in a variable) of a type dependent on runtime parameters, the traditional approach is to store pointers to a common base class instead and to use dynamic dispatch via virtual member functions to get the behavior you need. But this still suffers from the problem described before: You can't use types that effectively do what you need but were defined in an external library, and thus don't inherit from the base class you defined. So you have to write a wrapper class.
Or you do what you described in your question and create a type-erasure class.
An example from the standard library is std::function. You declare only the interface of the function and it can store arbitrary function pointers and callables that have that interface. In general, writing a type erasure class can be quite tedious, so I refrain from giving an example of a type-erasing Switchable here, but I can highly recommend Sean Parent's talk Inheritance is the base class of evil, where he demonstrates the technique for "Drawable" objects and explores what you can build on top of it in just over 20 minutes.
There are libraries that help writing type-erasure classes though, e.g. Louis Dionne's experimental dyno, where you define the interface via what he calls "concept maps" directly in C++ code, or Zach Laine's emtypen which uses a python tool to create the type erasure classes from a C++ header file you provide. The latter also comes with a CppCon talk describing the features as well as the general idea and how to use it.
Conclusion
Inheriting from a common base class just to define interfaces, while easy, leads to many problems that can be avoided using different approaches:
(Constrained) templates allow for compile-time polymorphism, which is sufficient for the majority of cases, but can lead to hard-to-understand compiler errors when used with types that don't conform to the interface.
If you need runtime polymorphism (which actually is rather rare in my experience), you can use type-erasure classes.
So even though the classes in the STL and other C++ libraries rarely derive from an abstract interface, you can still apply dependency inversion with one of the two methods described above if you really want to.
But as always, use good judgment on a case-by-case basis whether you really need the abstraction or if it is better to simply use a concrete type. The string example you brought up is one where I'd go with concrete types, simply because the different string classes don't share a common interface (e.g. std::string has .find(), but QStrings version of the same function is called .contains()). It might be just as much effort to write wrapper classes for both as it is to write a conversion function and to use that at well-defined boundaries within the project.
Ahh, but C++ lets you write code that is independent of a particular implementation without actually using inheritance.
std::string itself is a good example... it's actually a typedef for std::basic_string<char, std::char_traits<char>, std::allocator<char>>. Which allows you to create strings using other allocators if you choose (or mock the allocator object in order to measure number of calls, if you like). There just isn't any explicit interface like an IAllocator, because C++ templates use duck-typing.
A future version of C++ will support explicit description of the interface a template parameter must adhere to -- this feature is called concepts -- but just using duck-typing enables decoupling without requiring redundant interface definitions.
And because C++ performs optimization after instantiation of templates, there's no polymorphic overhead.
Now, when you do have virtual functions, you'll need to commit to a particular type, because the virtual-table layout doesn't accommodate use of templates each of which generates an arbitrary number of instances each of which require separate dispatch. But when using templates, you'll won't need virtual functions nearly as much as e.g. Java does, so in practice this isn't a big problem.
Microsoft's GDI+ defines many empty classes to be treated as handles internally. For example, (source GdiPlusGpStubs.h)
//Approach 1
class GpGraphics {};
class GpBrush {};
class GpTexture : public GpBrush {};
class GpSolidFill : public GpBrush {};
class GpLineGradient : public GpBrush {};
class GpPathGradient : public GpBrush {};
class GpHatch : public GpBrush {};
class GpPen {};
class GpCustomLineCap {};
There are other two ways to define handles. They're,
//Approach 2
class BOOK; //no need to define it!
typedef BOOK *PBOOK;
typedef PBOOK HBOOK; //handle to be used internally
//Approach 3
typedef void* PVOID;
typedef PVOID HBOOK; //handle to be used internally
I just want to know the advantages and disadvantages of each of these approaches.
One advantage with Microsoft's approach is that, they can define type-safe hierarchy of handles using empty classes, which (I think) is not possible with the other two approaches, though I wonder what advantages this hierarchy would bring to the implementation? Anyway, what else?
EDIT:
One advantage with the second approach (i.e using incomplete classes) is that we can prevent clients from dereferencing the handles (that means, this approach appears to support encapsulation strongly, I suppose). The code would not even compile if one attempts to dereference handles. What else?
The same advantage one has with third approach as well, that you cannot dereference the handles.
Approach #1 is some mid-way between C style and C++ interface. Instead of member functions you have to pass the handle as argument. The advantage of exposed polymorphism is that you can reduce the amount of functions in interface and the types are checked compile time. Usually most experts prefer pimpl idiom (sometimes called compilation firewall) to such interface. You can not use approach #1 to interface with C so better go full C++.
Approach #2 is C style encapsulation and information hiding. The pointer may be (and often is) a pointer to real thing, so it is not over-engineered. User of library may not dereference that pointer. Disadvantage is that it does not expose any polymorphism. Advantage is that you may use it when interfacing with modules written in C.
Approach #3 is over-abstracted C-style encapsulation. The pointer may be really not a pointer at all since user of library should not cast, deallocate or dereference it. Advantage is that it may so carry exception or error values, disadvantage is that most of it has to be checked run time.
I agree with DeadMG that language-neutral object-oriented interfaces are very easy and elegant to use from C++, but these also involve more run-time checks than compile time checks and are overkill when i don't need to interface with other languages. So i personally prefer Approach #2 if it needs to interface with C or Pimpl idiom when it is C++ only.
2 and 3 are slightly less typesafe as they allow to use handles instead of void*
void bluescreeen(HBOOK hb){
memset(hb,0,100000); // no compile errors
}
Approach 3 is not very good at all, as it allows the mixing and matching of handle types that don't actually make sense, any function that takes a HANDLE can take any HANDLE, even if it's compile-time determinable that that is the wrong type.
The downside of Approach 1 is that you have to do a bunch of casting on the other end to their actual types.
Approach 2 isn't that bad, except you can't do any kind of inheritance with it without having to externally query every time.
However, all of this is entirely moot ever since compilers discovered how to implement efficient virtual functions. The approach taken by DirectX and COM is the best- it's very flexible, powerful, and completely type-safe.
It even allows for some truly insane things, like you can inherit from DirectX interfaces and extend it that way. One of the best advantages of this is Direct2D and Direct3D11. They're not actually compatible (which is truly, horrendously stupid), but you can define a proxy type that inherits from ID3D10Device1 and forwards to the ID3D11Device and solve the problem like that. That kind of thing would never even think about being possible with any of the above approaches.
Oh, and last thing: You really, really shouldn't name your types in allcaps.
I wanted to ask about a specific point made in Effective C++.
It says:
A destructor should be made virtual if a class needs to act like a polymorphic class. It further adds that since std::string does not have a virtual destructor, one should never derive from it. Also std::string is not even designed to be a base class, forget polymorphic base class.
I do not understand what specifically is required in a class to be eligible for being a base class (not a polymorphic one)?
Is the only reason that I should not derive from std::string class is it does not have a virtual destructor? For reusability purpose a base class can be defined and multiple derived class can inherit from it. So what makes std::string not even eligible as a base class?
Also, if there is a base class purely defined for reusability purpose and there are many derived types, is there any way to prevent client from doing Base* p = new Derived() because the classes are not meant to be used polymorphically?
I think this statement reflects the confusion here (emphasis mine):
I do not understand what specifically is required in a class to be eligible for being a base clas (not a polymorphic one)?
In idiomatic C++, there are two uses for deriving from a class:
private inheritance, used for mixins and aspect oriented programming using templates.
public inheritance, used for polymorphic situations only. EDIT: Okay, I guess this could be used in a few mixin scenarios too -- such as boost::iterator_facade -- which show up when the CRTP is in use.
There is absolutely no reason to publicly derive a class in C++ if you're not trying to do something polymorphic. The language comes with free functions as a standard feature of the language, and free functions are what you should be using here.
Think of it this way -- do you really want to force clients of your code to convert to using some proprietary string class simply because you want to tack on a few methods? Because unlike in Java or C# (or most similar object oriented languages), when you derive a class in C++ most users of the base class need to know about that kind of a change. In Java/C#, classes are usually accessed through references, which are similar to C++'s pointers. Therefore, there's a level of indirection involved which decouples the clients of your class, allowing you to substitute a derived class without other clients knowing.
However, in C++, classes are value types -- unlike in most other OO languages. The easiest way to see this is what's known as the slicing problem. Basically, consider:
int StringToNumber(std::string copyMeByValue)
{
std::istringstream converter(copyMeByValue);
int result;
if (converter >> result)
{
return result;
}
throw std::logic_error("That is not a number.");
}
If you pass your own string to this method, the copy constructor for std::string will be called to make a copy, not the copy constructor for your derived object -- no matter what child class of std::string is passed. This can lead to inconsistency between your methods and anything attached to the string. The function StringToNumber cannot simply take whatever your derived object is and copy that, simply because your derived object probably has a different size than a std::string -- but this function was compiled to reserve only the space for a std::string in automatic storage. In Java and C# this is not a problem because the only thing like automatic storage involved are reference types, and the references are always the same size. Not so in C++.
Long story short -- don't use inheritance to tack on methods in C++. That's not idiomatic and results in problems with the language. Use non-friend, non-member functions where possible, followed by composition. Don't use inheritance unless you're template metaprogramming or want polymorphic behavior. For more information, see Scott Meyers' Effective C++ Item 23: Prefer non-member non-friend functions to member functions.
EDIT: Here's a more complete example showing the slicing problem. You can see it's output on codepad.org
#include <ostream>
#include <iomanip>
struct Base
{
int aMemberForASize;
Base() { std::cout << "Constructing a base." << std::endl; }
Base(const Base&) { std::cout << "Copying a base." << std::endl; }
~Base() { std::cout << "Destroying a base." << std::endl; }
};
struct Derived : public Base
{
int aMemberThatMakesMeBiggerThanBase;
Derived() { std::cout << "Constructing a derived." << std::endl; }
Derived(const Derived&) : Base() { std::cout << "Copying a derived." << std::endl; }
~Derived() { std::cout << "Destroying a derived." << std::endl; }
};
int SomeThirdPartyMethod(Base /* SomeBase */)
{
return 42;
}
int main()
{
Derived derivedObject;
{
//Scope to show the copy behavior of copying a derived.
Derived aCopy(derivedObject);
}
SomeThirdPartyMethod(derivedObject);
}
To offer the counter side to the general advice (which is sound when there are no particular verbosity/productivity issues evident)...
Scenario for reasonable use
There is at least one scenario where public derivation from bases without virtual destructors can be a good decision:
you want some of the type-safety and code-readability benefits provided by dedicated user-defined types (classes)
an existing base is ideal for storing the data, and allows low-level operations that client code would also want to use
you want the convenience of reusing functions supporting that base class
you understand that any any additional invariants your data logically needs can only be enforced in code explicitly accessing the data as the derived type, and depending on the extent to which that will "naturally" happen in your design, and how much you can trust client code to understand and cooperate with the logically-ideal invariants, you may want members functions of the derived class to reverify expectations (and throw or whatever)
the derived class adds some highly type-specific convenience functions operating over the data, such as custom searches, data filtering / modifications, streaming, statistical analysis, (alternative) iterators
coupling of client code to the base is more appropriate than coupling to the derived class (as the base is either stable or changes to it reflect improvements to functionality also core to the derived class)
put another way: you want the derived class to continue to expose the same API as the base class, even if that means the client code is forced to change, rather than insulating it in some way that allows the base and derived APIs to grow out of sync
you're not going to be mixing pointers to base and derived objects in parts of the code responsible for deleting them
This may sound quite restrictive, but there are plenty of cases in real world programs matching this scenario.
Background discussion: relative merits
Programming is about compromises. Before you write a more conceptually "correct" program:
consider whether it requires added complexity and code that obfuscates the real program logic, and is therefore more error prone overall despite handling one specific issue more robustly,
weigh the practical costs against the probability and consequences of issues, and
consider "return on investment" and what else you could be doing with your time.
If the potential problems involve usage of the objects that you just can't imagine anyone attempting given your insights into their accessibility, scope and nature of usage in the program, or you can generate compile-time errors for dangerous use (e.g. an assertion that derived class size matches the base's, which would prevent adding new data members), then anything else may be premature over-engineering. Take the easy win in clean, intuitive, concise design and code.
Reasons to consider derivation sans virtual destructor
Say you have a class D publicly derived from B. With no effort, the operations on B are possible on D (with the exception of construction, but even if there are a lot of constructors you can often provide effective forwarding by having one template for each distinct number of constructor arguments: e.g. template <typename T1, typename T2> D(const T1& x1, const T2& t2) : B(t1, t2) { }. Better generalised solution in C++0x variadic templates.)
Further, if B changes then by default D exposes those changes - staying in sync - but someone may need to review extended functionality introduced in D to see if it remains valid, and the client usage.
Rephrasing this: there is reduced explicit coupling between base and derived class, but increased coupling between base and client.
This is often NOT what you want, but sometimes it is ideal, and other times a non issue (see next paragraph). Changes to the base force more client code changes in places distributed throughout the code base, and sometimes the people changing the base may not even have access to the client code to review or update it correspondingly. Sometimes it is better though: if you as the derived class provider - the "man in the middle" - want base class changes to feed through to clients, and you generally want clients to be able - sometimes forced - to update their code when the base class changes without you needing to be constantly involved, then public derivation may be ideal. This is common when your class is not so much an independent entity in its own right, but a thin value-add to the base.
Other times the base class interface is so stable that the coupling may be deemed a non issue. This is especially true of classes like Standard containers.
Summarily, public derivation is a quick way to get or approximate the ideal, familiar base class interface for the derived class - in a way that's concise and self-evidently correct to both the maintainer and client coder - with additional functionality available as member functions (which IMHO - which obviously differs with Sutter, Alexandrescu etc - can aid usability, readability and assist productivity-enhancing tools including IDEs)
C++ Coding Standards - Sutter & Alexandrescu - cons examined
Item 35 of C++ Coding Standards lists issues with the scenario of deriving from std::string. As scenarios go, it's good that it illustrates the burden of exposing a large but useful API, but both good and bad as the base API is remarkably stable - being part of the Standard Library. A stable base is a common situation, but no more common than a volatile one and a good analysis should relate to both cases. While considering the book's list of issues, I'll specifically contrast the issues' applicability to the cases of say:
a) class Issue_Id : public std::string { ...handy stuff... }; <-- public derivation, our controversial usage
b) class Issue_Id : public string_with_virtual_destructor { ...handy stuff... }; <- safer OO derivation
c) class Issue_Id { public: ...handy stuff... private: std::string id_; }; <-- a compositional approach
d) using std::string everywhere, with freestanding support functions
(Hopefully we can agree the composition is acceptable practice, as it provides encapsulation, type safety as well as a potentially enriched API over and above that of std::string.)
So, say you're writing some new code and start thinking about the conceptual entities in an OO sense. Maybe in a bug tracking system (I'm thinking of JIRA), one of them is say an Issue_Id. Data content is textual - consisting of an alphabetic project id, a hyphen, and an incrementing issue number: e.g. "MYAPP-1234". Issue ids can be stored in a std::string, and there will be lots of fiddly little text searches and manipulation operations needed on issue ids - a large subset of those already provided on std::string and a few more for good measure (e.g. getting the project id component, providing the next possible issue id (MYAPP-1235)).
On to Sutter and Alexandrescu's list of issues...
Nonmember functions work well within existing code that already manipulates strings. If instead you supply a super_string, you force changes through your code base to change types and function signatures to super_string.
The fundamental mistake with this claim (and most of the ones below) is that it promotes the convenience of using only a few types, ignoring the benefits of type safety. It's expressing a preference for d) above, rather than insight into c) or b) as alternatives to a). The art of programming involves balancing the pros and cons of distinct types to achieve reasonable reuse, performance, convenience and safety. The paragraphs below elaborate on this.
Using public derivation, the existing code can implicitly access the base class string as a string, and continue to behave as it always has. There's no specific reason to think that the existing code would want to use any additional functionality from super_string (in our case Issue_Id)... in fact it's often lower-level support code pre-existing the application for which you're creating the super_string, and therefore oblivious to the needs provided for by the extended functions. For example, say there's a non-member function to_upper(std::string&, std::string::size_type from, std::string::size_type to) - it could still be applied to an Issue_Id.
So, unless the non-member support function is being cleaned up or extended at the deliberate cost of tightly coupling it to the new code, then it needn't be touched. If it is being overhauled to support issue ids (for example, using the insight into the data content format to upper-case only leading alpha characters), then it's probably a good thing to ensure it really is being passed an Issue_Id by creating an overload ala to_upper(Issue_Id&) and sticking to either the derivation or compositional approaches allowing type safety. Whether super_string or composition is used makes no difference to effort or maintainability. A to_upper_leading_alpha_only(std::string&) reusable free-standing support function isn't likely to be of much use - I can't recall the last time I wanted such a function.
The impulse to use std::string everywhere isn't qualitatively different to accepting all your arguments as containers of variants or void*s so you don't have to change your interfaces to accept arbitrary data, but it makes for error prone implementation and less self-documenting and compiler-verifiable code.
Interface functions that take a string now need to: a) stay away from super_string's added functionality (unuseful); b) copy their argument to a super_string (wasteful); or c) cast the string reference to a super_string reference (awkward and potentially illegal).
This seems to be revisiting the first point - old code that needs to be refactored to use the new functionality, albeit this time client code rather than support code. If the function wants to start treating its argument as an entity for which the new operations are relevant, then it should start taking its arguments as that type and the clients should generate them and accept them using that type. The exact same issues exists for composition. Otherwise, c) can be practical and safe if the guidelines I list below are followed, though it is ugly.
super_string's member functions don't have any more access to string's internals than nonmember functions because string probably doesn't have protected members (remember, it wasn't meant to be derived from in the first place)
True, but sometimes that's a good thing. A lot of base classes have no protected data. The public string interface is all that's needed to manipulate the contents, and useful functionality (e.g. get_project_id() postulated above) can be elegantly expressed in terms of those operations. Conceptually, many times I've derived from Standard containers, I've wanted not to extend or customise their functionality along the existing lines - they're already "perfect" containers - rather I've wanted to add another dimension of behaviour that's specific to my application, and requires no private access. It's because they're already good containers that they're good to reuse.
If super_string hides some of string's functions (and redefining a nonvirtual function in a derived class is not overriding, it's just hiding), that could cause widespread confusion in code that manipulates strings that started their life converted automatically from super_strings.
True for composition too - and more likely to happen as the code doesn't default to passing things through and hence staying in sync, and also true in some situations with run-time polymorphic hierarchies as well. Samed named functions that behave differently in classes that initial appear interchangeable - just nasty. This is effectively the usual caution for correct OO programming, and again not a sufficient reason to abandon the benefits in type safety etc..
What if super_string wants to inherit from string to add more state [explanation of slicing]
Agreed - not a good situation, and somewhere I personally tend to draw the line as it often moves the problems of deletion through a pointer to base from the realm of theory to the very practical - destructors aren't invoked for additional members. Still, slicing can often do what's wanted - given the approach of deriving super_string not to change its inherited functionality, but to add another "dimension" of application-specific functionality....
Admittedly, it's tedious to have to write passthrough functions for the member functions you want to keep, but such an implementation is vastly better and safer than using public or nonpublic inheritance.
Well, certainly agree about the tedium....
Guidelines for successful derivation sans virtual destructor
ideally, avoid adding data members in derived class: variants of slicing can accidentally remove data members, corrupt them, fail to initialise them...
even more so - avoid non-POD data members: deletion via base-class pointer is technically undefined behaviour anyway, but with non-POD types failing to run their destructors is more likely to have non-theoretical problems with resource leaks, bad reference counts etc.
honour the Liskov Substitution Principal / you can't robustly maintain new invariants
for example, in deriving from std::string you can't intercept a few functions and expect your objects to remain uppercase: any code that accesses them via a std::string& or ...* can use std::string's original function implementations to change the value)
derive to model a higher level entity in your application, to extend the inherited functionality with some functionality that uses but doesn't conflict with the base; do not expect or try to change the basic operations - and access to those operations - granted by the base type
be aware of the coupling: base class can't be removed without affecting client code even if the base class evolves to have inappropriate functionality, i.e. your derived class's usability depends on the ongoing appropriateness of the base
sometimes even if you use composition you'll need to expose the data member due to performance, thread safety issues or lack of value semantics - so the loss of encapsulation from public derivation isn't tangibly worse
the more likely people using the potentially-derived class will be unaware of its implementation compromises, the less you can afford to make them dangerous
therefore, low-level widely deployed libraries with many ad-hoc casual users should be more wary of dangerous derivation than localised use by programmers routinely using the functionality at application level and/or in "private" implementation / libraries
Summary
Such derivation is not without issues so don't consider it unless the end result justifies the means. That said, I flatly reject any claim that this can't be used safely and appropriately in particular cases - it's just a matter of where to draw the line.
Personal experience
I do sometimes derive from std::map<>, std::vector<>, std::string etc - I've never been burnt by the slicing or delete-via-base-class-pointer issues, and I've saved a lot of time and energy for more important things. I don't store such objects in heterogeneous polymorphic containers. But, you need to consider whether all the programmers using the object are aware of the issues and likely to program accordingly. I personally like to write my code to use heap and run-time polymorphism only when needed, while some people (due to Java backgrounds, their prefered approach to managing recompilation dependencies or switching between runtime behaviours, testing facilities etc.) use them habitually and therefore need to be more concerned about safe operations via base class pointers.
If you really want to derive from it (not discussing why you want to do it) I think you can prevent Derived class direct heap instantiation by making it's operator new private:
class StringDerived : public std::string {
//...
private:
static void* operator new(size_t size);
static void operator delete(void *ptr);
};
But this way you restrict yourself from any dynamic StringDerived objects.
Not only is the destructor not virtual, std::string contains no virtual functions at all, and no protected members. That makes it very hard for the derived class to modify its functionality.
Then why would you derive from it?
Another problem with being non-polymorphic is that if you pass your derived class to a function expecting a string parameter, your extra functionality will just be sliced off and the object will be seen as a plain string again.
Why should one not derive from c++ std string class?
Because it is not necessary. If you want to use DerivedString for functionality extension; I don't see any problem in deriving std::string. The only thing is, you should not interact between both classes (i.e. don't use string as a receiver for DerivedString).
Is there any way to prevent client from doing Base* p = new Derived()
Yes. Make sure that you provide inline wrappers around Base methods inside Derived class. e.g.
class Derived : protected Base { // 'protected' to avoid Base* p = new Derived
const char* c_str () const { return Base::c_str(); }
//...
};
There are two simple reasons for not deriving from a non-polymorphic class:
Technical: it introduces slicing bugs (because in C++ we pass by value unless otherwise specified)
Functional: if it is non-polymorphic, you can achieve the same effect with composition and some function forwarding
If you wish to add new functionalities to std::string, then first consider using free functions (possibly templates), like the Boost String Algorithm library does.
If you wish to add new data members, then properly wrap the class access by embedding it (Composition) inside a class of your own design.
EDIT:
#Tony noticed rightly that the Functional reason I cited was probably meaningless to most people. There is a simple rule of thumb, in good design, that says that when you can pick a solution among several, you should consider the one with the weaker coupling. Composition has weaker coupling that Inheritance, and thus should be preferred, when possible.
Also, composition gives you the opportunity to nicely wrap the original's class method. This is not possible if you pick inheritance (public) and the methods are not virtual (which is the case here).
The C++ standard states that If Base class destructor is not virtual and you delete an object of Base class that points to the object of an derived class then it causes an undefined Behavior.
C++ standard section 5.3.5/3:
if the static type of the operand is different from its dynamic type, the static type shall be a base class of the operand’s dynamic type and the static type shall have a virtual destructor or the behavior is undefined.
To be clear on the Non-polymorphic class & need of virtual destructor
The purpose of making a destructor virtual is to facilitate the polymorphic deletion of objects through delete-expression. If there is no polymorphic deletion of objects, then you don't need virtual destructor's.
Why not to derive from String Class?
One should generally avoid deriving from any standard container class because of the very reason that they don' have virtual destructors, which make it impossible to delete objects polymorphically.
As for the string class, the string class doesn't have any virtual functions so there is nothing that you can possibly override. The best you can do is hide something.
If at all you want to have a string like functionality you should write a class of your own rather than inherit from std::string.
As soon as you add any member (variable) into your derived std::string class, will you systematically screw the stack if you attempt to use the std goodies with an instance of your derived std::string class? Because the stdc++ functions/members have their stack pointers[indexes] fixed [and adjusted] to the size/boundary of the (base std::string) instance size.
Right?
Please, correct me if I am wrong.
A big reason why I use OOP is to create code that is easily reusable. For that purpose Java style interfaces are perfect. However, when dealing with C++ I really can't achieve any sort of functionality like interfaces... at least not with ease.
I know about pure virtual base classes, but what really ticks me off is that they force me into really awkward code with pointers. E.g. map<int, Node*> nodes; (where Node is the virtual base class).
This is sometimes ok, but sometimes pointers to base classes are just not a possible solution. E.g. if you want to return an object packaged as an interface you would have to return a base-class-casted pointer to the object.. but that object is on the stack and won't be there after the pointer is returned. Of course you could start using the heap extensively to avoid this but that's adding so much more work than there should be (avoiding memory leaks).
Is there any way to achieve interface-like functionality in C++ without have to awkwardly deal with pointers and the heap?? (Honestly for all that trouble and awkardness id rather just stick with C.)
You can use boost::shared_ptr<T> to avoid the raw pointers. As a side note, the reason why you don't see a pointer in the Java syntax has nothing to do with how C++ implements interfaces vs. how Java implements interfaces, but rather it is the result of the fact that all objects in Java are implicit pointers (the * is hidden).
Template MetaProgramming is a pretty cool thing. The basic idea? "Compile time polymorphism and implicit interfaces", Effective C++. Basically you can get the interfaces you want via templated classes. A VERY simple example:
template <class T>
bool foo( const T& _object )
{
if ( _object != _someStupidObject && _object > 0 )
return true;
return false;
}
So in the above code what can we say about the object T? Well it must be compatible with '_someStupidObject' OR it must be convertible to a type which is compatible. It must be comparable with an integral value, or again convertible to a type which is. So we have now defined an interface for the class T. The book "Effective C++" offers a much better and more detailed explanation. Hopefully the above code gives you some idea of the "interface" capability of templates. Also have a look at pretty much any of the boost libraries they are almost all chalk full of templatization.
Considering C++ doesn't require generic parameter constraints like C#, then if you can get away with it you can use boost::concept_check. Of course, this only works in limited situations, but if you can use it as your solution then you'll certainly have faster code with smaller objects (less vtable overhead).
Dynamic dispatch that uses vtables (for example, pure virtual bases) will make your objects grow in size as they implement more interfaces. Managed languages do not suffer from this problem (this is a .NET link, but Java is similar).
I think the answer to your question is no - there is no easier way. If you want pure interfaces (well, as pure as you can get in C++), you're going to have to put up with all the heap management (or try using a garbage collector. There are other questions on that topic, but my opinion on the subject is that if you want a garbage collector, use a language designed with one. Like Java).
One big way to ease your heap management pain somewhat is auto pointers. Boost has a nice automatic pointer that does a lot of heap management work for you. The std::auto_ptr works, but it's quite quirky in my opinion.
You might also evaluate whether you really need those pure interfaces or not. Sometimes you do, but sometimes (like some of the code I work with), the pure interfaces are only ever instantiated by one class, and thus just become extra work, with no benefit to the end product.
While auto_ptr has some weird rules of use that you must know*, it exists to make this kind of thing work easily.
auto_ptr<Base> getMeAThing() {
return new Derived();
}
void something() {
auto_ptr<Base> myThing = getMeAThing();
myThing->foo(); // Calls Derived::foo, if virtual
// The Derived object will be deleted on exit to this function.
}
*Never put auto_ptrs in containers, for one. Understand what they do on assignment is another.
This is actually one of the cases in which C++ shines. The fact that C++ provides templates and functions that are not bound to a class makes reuse much easier than in pure object oriented languages. The reality though is that you will have to adjust they manner in which you write your code in order to make use of these benefits. People that come from pure OO languages often have difficulty with this, but in C++ an objects interface includes not member functions. In fact it is considered to be good practice in C++ to use non-member functions to implement an objects interface whenever possible. Once you get the hang of using template nonmember functions to implement interfaces, well it is a somewhat life changing experience. \
We often hear/read that one should avoid dynamic casting. I was wondering what would be 'good use' examples of it, according to you?
Edit:
Yes, I'm aware of that other thread: it is indeed when reading one of the first answers there that I asked my question!
This recent thread gives an example of where it comes in handy. There is a base Shape class and classes Circle and Rectangle derived from it. In testing for equality, it is obvious that a Circle cannot be equal to a Rectangle and it would be a disaster to try to compare them. While iterating through a collection of pointers to Shapes, dynamic_cast does double duty, telling you if the shapes are comparable and giving you the proper objects to do the comparison on.
Vector iterator not dereferencable
Here's something I do often, it's not pretty, but it's simple and useful.
I often work with template containers that implement an interface,
imagine something like
template<class T>
class MyVector : public ContainerInterface
...
Where ContainerInterface has basic useful stuff, but that's all. If I want a specific algorithm on vectors of integers without exposing my template implementation, it is useful to accept the interface objects and dynamic_cast it down to MyVector in the implementation. Example:
// function prototype (public API, in the header file)
void ProcessVector( ContainerInterface& vecIfce );
// function implementation (private, in the .cpp file)
void ProcessVector( ContainerInterface& vecIfce)
{
MyVector<int>& vecInt = dynamic_cast<MyVector<int> >(vecIfce);
// the cast throws bad_cast in case of error but you could use a
// more complex method to choose which low-level implementation
// to use, basically rolling by hand your own polymorphism.
// Process a vector of integers
...
}
I could add a Process() method to the ContainerInterface that would be polymorphically resolved, it would be a nicer OOP method, but I sometimes prefer to do it this way. When you have simple containers, a lot of algorithms and you want to keep your implementation hidden, dynamic_cast offers an easy and ugly solution.
You could also look at double-dispatch techniques.
HTH
My current toy project uses dynamic_cast twice; once to work around the lack of multiple dispatch in C++ (it's a visitor-style system that could use multiple dispatch instead of the dynamic_casts), and once to special-case a specific subtype.
Both of these are acceptable, in my view, though the former at least stems from a language deficit. I think this may be a common situation, in fact; most dynamic_casts (and a great many "design patterns" in general) are workarounds for specific language flaws rather than something that aim for.
It can be used for a bit of run-time type-safety when exposing handles to objects though a C interface. Have all the exposed classes inherit from a common base class. When accepting a handle to a function, first cast to the base class, then dynamic cast to the class you're expecting. If they passed in a non-sensical handle, you'll get an exception when the run-time can't find the rtti. If they passed in a valid handle of the wrong type, you get a NULL pointer and can throw your own exception. If they passed in the correct pointer, you're good to go.
This isn't fool-proof, but it is certainly better at catching mistaken calls to the libraries than a straight reinterpret cast from a handle, and waiting until some data gets mysteriously corrupted when you pass the wrong handle in.
Well it would really be nice with extension methods in C#.
For example let's say I have a list of objects and I want to get a list of all ids from them. I can step through them all and pull them out but I would like to segment out that code for reuse.
so something like
List<myObject> myObjectList = getMyObjects();
List<string> ids = myObjectList.PropertyList("id");
would be cool except on the extension method you won't know the type that is coming in.
So
public static List<string> PropertyList(this object objList, string propName) {
var genList = (objList.GetType())objList;
}
would be awesome.
It is very useful, however, most of the times it is too useful: if for getting the job done the easiest way is to do a dynamic_cast, it's more often than not a symptom of bad OO design, what in turn might lead to trouble in the future in unforeseen ways.