Keeping modules independent, while still using each other - c++

A big part of my C++ application uses classes to describe the data model, e.g. something like ClassType (which actually emulates reflection in plain C++).
I want to add a new module to my application and it needs to make use of these ClassType's, but I prefer not to introduce dependencies from my new module on ClassType.
So far I have the following alternatives:
Not making it independent and introduce a dependency on ClassType, with the risk of creating more 'spaghetti'-dependencies in my application (this is my least-preferred solution)
Introduce a new class, e.g. IType, and letting my module only depend on IType. ClassType should then inherit from IType.
Use strings as identification method, and forcing the users of the new module to convert the ClassType to a string or vice versa where needed.
Use GUID's (or even simple integers) as identification, also requiring conversions between GUID's and ClassType's
How far should you try to go when decoupling modules in an application?
Just introduce an interface and let all the other modules rely on the interface (like the IType described above)?
Or decouple even further by using other identifiers such as strings or GUIDs?
I am afraid that by decoupling too far, the code becomes more unstable and more difficult to debug. I've seen one such example in Qt: signals and slots are linked using strings, and if you make a typing mistake the functionality doesn't work, yet the code still compiles.
How far should you keep your modules decoupled?

99% of the time, if your design is based on reflection, then you have major issues with the design.
Generally speaking, something like
if (x is myclass)
elseif (x is anotherclass)
else
is a poor design because it neglects polymorphism. If you're doing this, then the item x is in violation of the Liskov Substitution Principle.
Also, given that C++ already has RTTI, I don't see why you'd reinvent the wheel. That's what typeid and dynamic_cast are for.
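To make the contrast concrete, here is a minimal sketch of the two styles. Shape, Circle and Square are hypothetical names chosen for illustration, not anything from the question; the point is only that the cast chain must be edited for every new subclass, while the virtual call does not.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical hierarchy for illustration; none of these names come from the question.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string describe() const = 0;
};
struct Circle : Shape {
    std::string describe() const override { return "circle"; }
};
struct Square : Shape {
    std::string describe() const override { return "square"; }
};

// Type-switch style: every new subclass forces another branch here.
std::string describeWithCasts(const Shape& s) {
    if (dynamic_cast<const Circle*>(&s)) return "circle";
    if (dynamic_cast<const Square*>(&s)) return "square";
    return "unknown";
}

// Polymorphic style: this function never changes when a subclass is added.
std::string describePolymorphic(const Shape& s) {
    return s.describe();
}

// Usage sketch: both styles agree today, but only the second one scales.
void usageExample() {
    std::vector<std::unique_ptr<Shape>> shapes;
    shapes.push_back(std::make_unique<Circle>());
    shapes.push_back(std::make_unique<Square>());
    for (const auto& s : shapes) {
        assert(describeWithCasts(*s) == describePolymorphic(*s));
    }
}
```

Adding a Triangle would require touching describeWithCasts but not describePolymorphic.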

I'll steer away from thinking about your reflection, and just look at the dependency ideas.
Decouple what it's reasonable to decouple. Coupling implies that if one thing changes, so must another. Your NewCode is using ClassType, so if some aspects of ClassType change then you surely must change NewCode; it can't be completely decoupled. Which of the following do you want to decouple from?
Semantics, what ClassType does.
Interface, how you call it.
Implementation, how it's implemented.
To my eyes the first two are reasonable coupling. But surely an implementation change should not require NewCode to change. So code to Interfaces. We try to keep Interfaces fixed, we tend to extend them rather than change them, keeping them back-compatible if at all possible. Sometimes we use name/value pairs to try to make the interface extensible, and then hit the typo kind of errors you allude to. It's a trade-off between flexibility and "type-safety".
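As a rough sketch of "coding to interfaces" for this case: the new module below sees only IType. The name IType comes from the question, but its member functions, the stand-in ClassType and the summarize function are all assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Interface the new module depends on. The members are assumed, not from the question.
class IType {
public:
    virtual ~IType() = default;
    virtual std::string name() const = 0;
    virtual std::vector<std::string> fieldNames() const = 0;
};

// Stand-in for the existing data-model class; not the real ClassType.
class ClassType : public IType {
public:
    ClassType(std::string name, std::vector<std::string> fields)
        : name_(std::move(name)), fields_(std::move(fields)) {}
    std::string name() const override { return name_; }
    std::vector<std::string> fieldNames() const override { return fields_; }
private:
    std::string name_;
    std::vector<std::string> fields_;
};

// New module: coupled to the semantics and interface of IType only.
std::string summarize(const IType& type) {
    return type.name() + " has " + std::to_string(type.fieldNames().size()) + " field(s)";
}

// Usage sketch.
void usageExample() {
    std::vector<std::string> fields = {"name", "age"};
    ClassType person("Person", fields);
    assert(summarize(person) == "Person has 2 field(s)");
}
```

How ClassType stores its fields can change without summarize being touched; only a change to IType itself would ripple into the new module.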

It's a philosophical question; it depends on the type of module, and the trade-offs. I think I have personally done all of them at various times, except for the GUID to type mapping, which doesn't have any advantages over the string to type mapping in my opinion, and at least strings are readable.
I would say you need to look at what level of decoupling is required for the particular module, given the expected external usage and code organization, and go from there. You've hit all the conceptual methods as far as I know, and they are each useful in particular situations.
That's my opinion, anyway.
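For what it's worth, the cost of a string-to-type mapping can be shown in a few lines. TypeRegistry below is a hypothetical registry, not part of the asker's code; it only illustrates that a misspelled key compiles cleanly and fails at run time.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical registry that identifies types by name instead of by class.
class TypeRegistry {
public:
    void add(const std::string& name, int fieldCount) { types_[name] = fieldCount; }

    // Returns -1 when the name is unknown; the compiler cannot catch a typo.
    int fieldCount(const std::string& name) const {
        auto it = types_.find(name);
        return it == types_.end() ? -1 : it->second;
    }
private:
    std::map<std::string, int> types_;
};

// Usage sketch: the misspelled lookup compiles and only fails when executed.
void usageExample() {
    TypeRegistry registry;
    registry.add("Person", 2);
    assert(registry.fieldCount("Person") == 2);
    assert(registry.fieldCount("Persom") == -1);  // typo, found only at run time
}
```

An interface-based design would turn the same mistake into a compile error, which is the trade-off between flexibility and type safety mentioned above.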

Related

Is the PIMPL idiom really used in practice?

I am reading the book "Exceptional C++" by Herb Sutter, and in that book I have learned about the PIMPL idiom. Basically, the idea is to create a structure for the private objects of a class and dynamically allocate them to decrease the compilation time (and also hide the private implementations in a better manner).
For example:
class X
{
private:
    C c;
    D d;
};
could be changed to:
class X
{
private:
    struct XImpl;
    XImpl* pImpl;
};
and, in the .cpp file, the definition:
struct X::XImpl
{
    C c;
    D d;
};
This seems pretty interesting, but I have never seen this kind of approach before, neither in the companies where I have worked nor in the open source projects whose source code I have read. So, I am wondering whether this technique is really used in practice.
Should I use it everywhere, or with caution? And is this technique recommended to be used in embedded systems (where the performance is very important)?
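For reference, a complete single-file sketch of the idiom might look like the following. It uses std::unique_ptr instead of the raw pointer shown above, and the setName/name members and the contents of XImpl are assumptions standing in for the C and D members. The comments mark which parts would live in the header and which in the .cpp file.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// ---- x.h (what clients include) ----
class X {
public:
    X();
    ~X();                          // must be defined where XImpl is complete
    X(X&&) noexcept;
    X& operator=(X&&) noexcept;

    void setName(const std::string& name);  // hypothetical public API
    std::string name() const;

private:
    struct XImpl;                  // forward declaration only
    std::unique_ptr<XImpl> pImpl;
};

// ---- x.cpp (the only file that changes when private members are added) ----
struct X::XImpl {
    std::string name;              // stand-in for the private members
    int changeCount = 0;
};

X::X() : pImpl(std::make_unique<XImpl>()) {}
X::~X() = default;
X::X(X&&) noexcept = default;
X& X::operator=(X&&) noexcept = default;

void X::setName(const std::string& name) {
    pImpl->name = name;
    ++pImpl->changeCount;
}

std::string X::name() const { return pImpl->name; }

// Usage sketch: clients never see XImpl.
void usageExample() {
    X x;
    x.setName("first");
    X moved = std::move(x);        // only the pointer is transferred
    assert(moved.name() == "first");
}
```

The destructor and move operations are declared in the class and defaulted after XImpl is defined, because std::unique_ptr needs the complete type at the point where it deletes the object.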
So, I am wondering if this technique is really used in practice? Should I use it everywhere, or with caution?
Of course it is used. I use it in my project, in almost every class.
Reasons for using the PIMPL idiom:
Binary compatibility
When you're developing a library, you can add/modify fields in XImpl without breaking binary compatibility with your clients (which would mean crashes!). Since the binary layout of class X doesn't change when you add new fields to XImpl, it is safe to add new functionality to the library in minor version updates.
Of course, you can also add new public/private non-virtual methods to X/XImpl without breaking the binary compatibility, but that's on par with the standard header/implementation technique.
Data hiding
If you're developing a library, especially a proprietary one, it might be desirable not to disclose what other libraries / implementation techniques were used to implement the public interface of your library. Either because of Intellectual Property issues, or because you believe that users might be tempted to take dangerous assumptions about the implementation or just break the encapsulation by using terrible casting tricks. PIMPL solves/mitigates that.
Compilation time
Compilation time is decreased, since only the source (implementation) file of X needs to be rebuilt when you add/remove fields and/or methods to the XImpl class (which maps to adding private fields/methods in the standard technique). In practice, it's a common operation.
With the standard header/implementation technique (without PIMPL), when you add a new field to X, every client that ever allocates X (either on the stack or on the heap) needs to be recompiled, because it must adjust the size of the allocation. Well, every client that never allocates X also needs to be recompiled, but that is just overhead (the resulting code on the client side will be the same).
What is more, with the standard header/implementation separation XClient1.cpp needs to be recompiled even when a private method X::foo() was added to X and X.h changed, even though XClient1.cpp can't possibly call this method for encapsulation reasons! As above, it's pure overhead and is related to how real-life C++ build systems work.
Of course, recompilation is not needed when you just modify the implementation of the methods (because you don't touch the header), but that's on par with the standard header/implementation technique.
Is this technique recommended to be used in embedded systems (where the performance is very important)?
That depends on how powerful your target is. However the only answer to this question is: measure and evaluate what you gain and lose. Also, take into consideration that if you're not publishing a library meant to be used in embedded systems by your clients, only the compilation time advantage applies!
It seems that a lot of libraries out there use it to stay stable in their API, at least for some versions.
But as for all things, you should never use anything everywhere without caution. Always think before using it. Evaluate what advantages it gives you, and if they are worth the price you pay.
The advantages it may give you are:
helps in keeping binary compatibility of shared libraries
hiding certain internal details
decreasing recompilation cycles
Those may or may not be real advantages to you. Like for me, I don't care about a few minutes recompilation time. End users usually also don't, as they always compile it once and from the beginning.
Possible disadvantages are (also here, depending on the implementation and whether they are real disadvantages for you):
Increase in memory usage due to more allocations than with the naïve variant
increased maintenance effort (you have to write at least the forwarding functions)
performance loss (the compiler may not be able to inline stuff as it is with a naïve implementation of your class)
So carefully give everything a value, and evaluate it for yourself. For me, it almost always turns out that using the PIMPL idiom is not worth the effort. There is only one case where I personally use it (or at least something similar):
My C++ wrapper for the Linux stat call. Here the struct from the C header may be different, depending on what #defines are set. And since my wrapper header can't control all of them, I only #include <sys/stat.h> in my .cxx file and avoid these problems.
I agree with all the others about the benefits, but let me point out one limitation: it doesn't work well with templates.
The reason is that template instantiation requires the full definition to be available where the instantiation takes place. (That's the main reason you don't see template methods defined in .cpp files.)
You can still refer to templatised subclasses, but since you have to include them all, every advantage of "implementation decoupling" at compile time (not having to include all the platform-specific code everywhere, shorter compilation) is lost.
It is a good paradigm for classic OOP (inheritance based), but not for generic programming (specialization based).
Other people have already provided the technical up/downsides, but I think the following is worth noting:
First and foremost, don't be dogmatic. If PIMPL works for your situation, use it - don't use it just because "it's better OO since it really hides implementation", etc. Quoting the C++ FAQ:
encapsulation is for code, not people
Just to give you an example of open source software where it is used and why: OpenThreads, the threading library used by the OpenSceneGraph. The main idea is to remove from the header (e.g., <Thread.h>) all platform-specific code, because internal state variables (e.g., thread handles) differ from platform to platform. This way one can compile code against your library without any knowledge of the other platforms' idiosyncrasies, because everything is hidden.
I would mainly consider PIMPL for classes exposed as an API to other modules. This has many benefits: changes made in the PIMPL implementation do not force the rest of the project to recompile. Also, for API classes it promotes binary compatibility (changes in a module's implementation do not affect clients of that module; they don't have to be recompiled, because the new implementation has the same binary interface, namely the interface exposed by the PIMPL).
As for using PIMPL for every class, I would consider caution because all those benefits come at a cost: an extra level of indirection is required in order to access the implementation methods.
I think this is one of the most fundamental tools for decoupling.
I was using PIMPL (and many other idioms from Exceptional C++) on an embedded project (a set-top box).
The particular purpose of this idiom in our project was to hide the types XImpl class uses.
Specifically, we used it to hide details of implementations for different hardware, where different headers would be pulled in. We had different implementations of XImpl classes for one platform and different for the other. Layout of class X stayed the same regardless of the platform.
I used to use this technique a lot in the past but then found myself moving away from it.
Of course it is a good idea to hide the implementation detail away from the users of your class. However you can also do that by getting users of the class to use an abstract interface and for the implementation detail to be the concrete class.
The advantages of pImpl are:
Assuming there is just one implementation of this interface, it is clearer by not using abstract class / concrete implementation
If you have a suite of classes (a module) such that several classes access the same "impl" but users of the module will only use the "exposed" classes.
No v-table if this is assumed to be a bad thing.
The disadvantages I found of pImpl (where abstract interface works better)
Whilst you may have only one "production" implementation, by using an abstract interface you can also create a "mock" implementation that works in unit testing.
(The biggest issue). Before the days of unique_ptr and moving you had restricted choices as to how to store the pImpl. A raw pointer and you had issues about your class being non-copyable. An old auto_ptr wouldn't work with forwardly declared class (not on all compilers anyway). So people started using shared_ptr which was nice in making your class copyable but of course both copies had the same underlying shared_ptr which you might not expect (modify one and both are modified). So the solution was often to use raw pointer for the inner one and make the class non-copyable and return a shared_ptr to that instead. So two calls to new. (Actually 3 given old shared_ptr gave you a second one).
Technically not really const-correct as the constness isn't propagated through to a member pointer.
In general I have therefore moved away in the years from pImpl and into abstract interface usage instead (and factory methods to create instances).
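A minimal sketch of that alternative, with a factory function and a mock, could look like this. Logger, makeLogger, MemoryLogger, MockLogger and doWork are all hypothetical names used only to show the shape of the design.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Public header: clients see only the abstract interface and the factory.
class Logger {
public:
    virtual ~Logger() = default;
    virtual void log(const std::string& message) = 0;
    virtual int messageCount() const = 0;
};

std::unique_ptr<Logger> makeLogger();   // factory hides the concrete class

// Source file: the "production" implementation stays private to this file.
namespace {
class MemoryLogger : public Logger {
public:
    void log(const std::string& message) override { messages_.push_back(message); }
    int messageCount() const override { return static_cast<int>(messages_.size()); }
private:
    std::vector<std::string> messages_;
};
}  // namespace

std::unique_ptr<Logger> makeLogger() { return std::make_unique<MemoryLogger>(); }

// Test code: a mock can stand in wherever a Logger is expected.
class MockLogger : public Logger {
public:
    void log(const std::string&) override { ++calls; }
    int messageCount() const override { return calls; }
    int calls = 0;
};

// Code under test depends on the interface only.
void doWork(Logger& logger) {
    logger.log("started");
    logger.log("finished");
}

// Usage sketch: the same function runs against the mock and the real thing.
void usageExample() {
    MockLogger mock;
    doWork(mock);
    assert(mock.calls == 2);

    std::unique_ptr<Logger> real = makeLogger();
    doWork(*real);
    assert(real->messageCount() == 2);
}
```

The price is a v-table and a heap allocation per object, which is roughly comparable to what pImpl costs.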
As many others have said, the Pimpl idiom achieves complete information hiding and compilation independence, unfortunately at the cost of some performance (an additional pointer indirection) and additional memory (the member pointer itself). The additional cost can be critical in embedded software development, in particular in those scenarios where memory must be economized as much as possible.
Using C++ abstract classes as interfaces would lead to the same benefits at the same cost.
This actually shows a big deficiency of C++: without resorting to C-like interfaces (global functions with an opaque pointer as parameter), it is not possible to have true information hiding and compilation independence without additional resource drawbacks. This is mainly because the declaration of a class, which must be included by its users, exports not only the interface of the class (public methods) needed by the users, but also its internals (private members), which the users do not need.
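For readers who have not seen the C-like style mentioned here, a small sketch follows. Counter and the counter_* functions are hypothetical; the comments mark what would go into the header and what into the source file.

```cpp
#include <cassert>

// ---- counter.h: clients see an incomplete type and free functions ----
struct Counter;                        // opaque: size and members are unknown here
Counter* counter_create(int start);
void     counter_increment(Counter* c);
int      counter_value(const Counter* c);
void     counter_destroy(Counter* c);

// ---- counter.cpp: the only place that knows the layout ----
struct Counter {
    int value;
};

Counter* counter_create(int start) { return new Counter{start}; }
void counter_increment(Counter* c) { ++c->value; }
int counter_value(const Counter* c) { return c->value; }
void counter_destroy(Counter* c) { delete c; }

// Usage sketch: the caller never learns what a Counter contains.
void usageExample() {
    Counter* c = counter_create(5);
    counter_increment(c);
    assert(counter_value(c) == 6);
    counter_destroy(c);
}
```

Note that in this form the object still lives on the heap behind a pointer, and the caller has to manage its lifetime by hand.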
Here is an actual scenario I encountered, where this idiom helped a great deal. I recently decided to support DirectX 11, as well as my existing DirectX 9 support, in a game engine.
The engine already wrapped most DX features, so none of the DX interfaces were used directly; they were just defined in the headers as private members. The engine uses DLL files as extensions, adding keyboard, mouse, joystick, and scripting support, as well as many other extensions. While most of those DLLs did not use DX directly, they required knowledge and linkage to DX simply because they pulled in headers that exposed DX.
In adding DX 11, this complexity was to increase dramatically, however unnecessarily. Moving the DX members into a PIMPL, defined only in the source, eliminated this imposition.
On top of this reduction of library dependencies, my exposed interfaces became cleaner as I moved private member functions into the PIMPL, exposing only front facing interfaces.
One benefit I can see is that it allows the programmer to implement certain operations in a fairly fast manner:
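X( X && move_semantics_are_cool ) noexcept : pImpl(nullptr) {
    // Start empty, then take over the other object's implementation.
    this->swap(move_semantics_are_cool);
}
X& swap( X& rhs ) noexcept {
    std::swap( pImpl, rhs.pImpl );  // only the pointers are exchanged
    return *this;
}
X& operator=( X && move_semantics_are_cool ) noexcept {
    return this->swap(move_semantics_are_cool);
}
X& operator=( const X& rhs ) {
    X temporary_copy(rhs);  // copy-and-swap; requires a copy constructor
    return this->swap(temporary_copy);
}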
PS: I hope I'm not misunderstanding move semantics.
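The members above are only a fragment. A self-contained variant might look like the sketch below, assuming a hypothetical Impl that holds a single int. It replaces the two assignment operators with one that takes its argument by value, which is a substitution on my part and not the code from the answer.

```cpp
#include <cassert>
#include <utility>

class X {
public:
    explicit X(int value) : pImpl(new Impl{value}) {}
    ~X() { delete pImpl; }

    // Deep copy of the implementation object.
    X(const X& rhs) : pImpl(rhs.pImpl ? new Impl(*rhs.pImpl) : nullptr) {}

    // Move: start empty, then steal the other object's pointer.
    X(X&& rhs) noexcept : pImpl(nullptr) { swap(rhs); }

    // By-value parameter covers both copy and move assignment.
    X& operator=(X rhs) {
        swap(rhs);
        return *this;
    }

    void swap(X& rhs) noexcept { std::swap(pImpl, rhs.pImpl); }

    bool empty() const { return pImpl == nullptr; }
    int value() const { return pImpl->value; }

private:
    struct Impl { int value; };   // would normally be defined in the .cpp file
    Impl* pImpl;
};

// Usage sketch: moving leaves the source empty, copying leaves it intact.
void usageExample() {
    X a(7);
    X b(a);                 // copy
    X c(std::move(a));      // move
    assert(a.empty());
    assert(b.value() == 7);
    assert(c.value() == 7);
}
```

Because only a pointer is exchanged, move and swap cost the same regardless of how large Impl is.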
It is used in practice in a lot of projects. Its usefulness depends heavily on the kind of project. One of the more prominent projects using this is Qt, where the basic idea is to hide implementation or platform-specific code from the user (other developers using Qt).
This is a noble idea, but there is a real drawback to this: debugging.
As long as the code hidden in private implementations is of premium quality this is all well, but if there are bugs in there, then the user/developer has a problem, because all they see is a dumb pointer to a hidden implementation, even if they have the implementation's source code.
So as in nearly all design decisions there are pros and cons.
I thought I would add an answer because although some authors hinted at this, I didn't think the point was made clear enough.
The primary purpose of PIMPL is to solve the N*M problem. This problem may have other names in other literature, however a brief summary is this.
You have some kind of inheritance hierarchy where, if you were to add a new subclass to your hierarchy, it would require you to implement N or M new methods.
This is only an approximate hand-wavey explanation, because I only recently became aware of this and so I am by my own admission not yet an expert on this.
Discussion of existing points made
However I came across this question, and similar questions a number of years ago, and I was confused by the typical answers which are given. (Presumably I first learned about PIMPL some years ago and found this question and others similar to it.)
Enables binary compatibility (when writing libraries)
Reduces compile time
Hides data
Taking into account the above "advantages", none of them are a particularly compelling reason to use PIMPL, in my opinion. Hence I have never used it, and my program designs suffered as a consequence because I discarded the utility of PIMPL and what it can really be used to accomplish.
Allow me to comment on each to explain:
1. Binary compatibility is only of relevance when writing libraries. If you are compiling a final executable program, then this is of no relevance, unless you are using someone else's (binary) libraries. (In other words, you do not have the original source code.)
This means this advantage is of limited scope and utility. It is only of interest to people who write libraries which are shipped in proprietary form.
2. I don't personally consider this to be of any relevance in the modern day, when it is rare to be working on projects where the compile time is of critical importance. Maybe this is important to the developers of Google Chrome. The associated disadvantages, which probably increase development time significantly, probably more than offset this advantage. I might be wrong about this but I find it unlikely, especially given the speed of modern compilers and computers.
3. I don't immediately see the advantage that PIMPL brings here. The same result can be accomplished by shipping a header file and a binary object file. Without a concrete example in front of me it is difficult to see why PIMPL is relevant here. The relevant "thing" is shipping binary object files, rather than original source code.
What PIMPL actually does:
You will have to forgive my slightly hand-wavey answer. While I am not a complete expert in this particular area of software design, I can at least tell you something about it. This information is mostly repeated from Design Patterns. The authors call it the "Bridge" pattern, also known as Handle/Body.
In this book, the example of writing a Window manager is given. The key point here is that a window manager can implement different types of windows as well as different types of platform.
For example, one may have a
Window
Icon window
Fullscreen window with 3d acceleration
Some other fancy window
These are types of windows which can be rendered
as well as
Microsoft Windows implementation
OS X platform implementation
Linux X Window Manager
Linux Wayland
These are different types of rendering engines, with different OS calls and possibly fundamentally different functionality as well
The list above is analogous to that given in another answer where another user described writing software which should work with different kinds of hardware for something like a DVD player. (I forget exactly what the example was.)
I give slightly different examples here compared to what is written in the Design Patterns book.
The point being that there are two separate types of things which should each be implemented using an inheritance hierarchy; a single inheritance hierarchy does not suffice here. (This is the N*M problem: the number of classes scales like the product of the lengths of the two lists above, which is not feasible for a developer to implement.)
Hence, using PIMPL, one separates out the types of windows and provides a pointer to an instance of an implementation class.
So PIMPL:
Solves the N*M problem
Decouples two fundamentally different things which are being modelled using inheritance such that there are 2 or more hierarchies, rather than just one monolith
Permits runtime exchange of the exact implementation behaviour (by changing a pointer). This may be advantageous in some situations, whereas a single monolith enforces static (compile time) behaviour selection rather than runtime behaviour selection
There may be other ways to implement this, for example with multiple inheritance, but this is usually a more complicated and difficult approach, at least in my experience.
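A compact sketch of the two-hierarchy arrangement described above is given below. The class names (Window, IconWindow, WindowImpl, X11Impl, WaylandImpl) loosely follow the window example but are my own assumptions, and the "drawing" is reduced to returning strings.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Implementation hierarchy: one class per platform (M of them).
class WindowImpl {
public:
    virtual ~WindowImpl() = default;
    virtual std::string drawRect() const = 0;
};
class X11Impl : public WindowImpl {
public:
    std::string drawRect() const override { return "X11 rect"; }
};
class WaylandImpl : public WindowImpl {
public:
    std::string drawRect() const override { return "Wayland rect"; }
};

// Abstraction hierarchy: one class per kind of window (N of them).
class Window {
public:
    explicit Window(std::unique_ptr<WindowImpl> impl) : impl_(std::move(impl)) {}
    virtual ~Window() = default;
    virtual std::string draw() const { return impl_->drawRect(); }
    // Replacing the pointer changes the platform behaviour at run time.
    void setImpl(std::unique_ptr<WindowImpl> impl) { impl_ = std::move(impl); }
protected:
    std::unique_ptr<WindowImpl> impl_;
};
class IconWindow : public Window {
public:
    using Window::Window;
    std::string draw() const override { return "icon on " + impl_->drawRect(); }
};

// Usage sketch: N + M classes cover all N * M combinations.
void usageExample() {
    IconWindow icon(std::make_unique<X11Impl>());
    assert(icon.draw() == "icon on X11 rect");
    icon.setImpl(std::make_unique<WaylandImpl>());
    assert(icon.draw() == "icon on Wayland rect");
}
```

Adding a new window kind or a new platform means adding one class, not one class per combination.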

Prefixing interfaces with I?

I am currently reading "Clean Code" by Robert Martin (UncleBob), and generally loving the musings of UncleBob. However, I got a bit confused when I read that he avoids prefixing interfaces like "IPerson". He states "I don't want my users knowing that I'm handing them an interface".
Thinking in TDD/injection perspective, I will always be very interested in telling the "users" of my classes that I am handing on an interface. The primary reason is that I consider Interfaces contracts between the different "agents" of a system. An agent working with one corner of my system, should not know the concrete implementation of another agents work; they should only exchange contracts, and expect the contracts to be fulfilled without knowing how. The other, but also very important, reason is that an interface can be mocked fully, and thus making unit-testing much easier. There are limits to how much you can mock on a concrete class.
Therefore, I prefer to visualize that I am indeed handing on an interface... or taking an interface as argument. But since UncleBob is a heavyweight champ in our community, and I am just another flyweight desk jockey, I would like to know if I am missing something.
Is it wrong for me to insist on I's in interfaces??
There are a number of conventions in Java and C# that we have grown comfortable with, but that are backwards. For example, the convention of putting private variables at the top of each class is quite silly from a technical point of view. The most important things about a class are its public methods. The least important things, the things we hide behind a privacy barrier, are the instance variables. So why would we put them at the top?
The "I" in front of interfaces is another backwards convention. When you are passed a reference to an object, you should expect it to be an interface. Interfaces should be the default; so there is no point in doing something extra, like using an I prefix, to announce that you are doing what everyone expects you to do. It would be better (though still wrong) if we reserved a special marker for the exceptional condition of passing a concrete class.
Another problem with using I is that (oddly) we use it to communicate the implementation decision of using an interface. Usually we don't want implementation decisions expressed so loudly, because that makes them hard to change. Consider, for example, what might happen if you decided that IFoo really ought to be an abstract class instead of an interface. Should you change the name to Foo or CFoo, or ACFoo?
I can hear the wheels turning in your head. You are thinking: "Yeah, but interfaces have a special place in the language, and so it's reasonable to mark them with a special naming convention." That's true. But integers also have a special place in the language, and we don't mark them (any more). Besides, ask yourself this, why do interfaces have a special place in the language?
The whole idea behind interfaces in Java and C# was a cop-out. The language designers could have just used abstract classes, but they were worried about the difficulties of implementing multiple inheritance. So they made a back-room deal with themselves. They invented an artificial construct (i.e. interfaces) that would provide some of the power of multiple inheritance, and they constrained normal classes to single inheritance.
This was one of the worst decisions the language designers made. They invented a new and heavyweight syntax element in order to exclude a useful and powerful (albeit controversial) language feature. Interfaces were not invented to enable, they were invented to disable. Interfaces are a hack placed in the language by designers who didn't want to solve the harder problem of MI. So when you use the I prefix, you are putting a big spotlight on one of the largest hacks in language history.
The next time you write a function signature like this:
public void myFunction(IFoo foo) {...}
Ask yourself this: "Why do I want to know that the author of IFoo used the word 'interface'? What difference does it make to me whether he used 'interface' or 'class' or even 'struct'? That's his business, not mine! So why is he forcing me to know his business by putting this great big I in front of his type name? Why doesn't he zip his declarations up and keep his privates out of my face?"
I consider Interfaces contracts between the different "agents" of a system. An agent working with one corner of my system, should not know the concrete implementation of another agents work; they should only exchange contracts, and expect the contracts to be fulfilled without knowing how. The other, but also very important, reason is that an interface can be mocked fully, and thus making unit-testing much easier. There are limits to how much you can mock on a concrete class.
All of this is true - but how does it necessitate a naming convention for interfaces?
Basically, prefixing interfaces with "I" is nothing but another example of the useless kind of Hungarian notation, because in a statically typed language (the only kind where interfaces as a language construct make sense) you can always easily and quickly find out what a type is, usually by hovering the mouse over it in the IDE.
If you're talking about .NET, then interfaces with I at the beginning are so ubiquitous that dropping them would confuse the hell out of everyone.
Plus I'd much rather have
public class Foo : IFoo {}
than
public class FooImpl : Foo {}
It all boils down to personal preference and I did for a while play with the idea myself but I went back to the I prefix. YMMV

Is it better to have lot of interfaces or just one?

I have been working on this plugin system. I thought I had finished the design and started implementing. Now I wonder if I should revisit my design. My problem is the following:
Currently in my design I have:
An interface class FileNameLoader for loading the names of all the shared libraries my application needs to load, e.g. load all files in a directory, load all files specified in an XML file, load all files the user inputs, etc.
An interface class LibLoader that actually loads the shared object. This class is only responsible for loading a shared object once its file name has been given. There are various ways one may need to load a shared lib, e.g. use RTLD_NOW/RTLD_LAZY, check whether the lib has already been loaded, etc.
An ABC Plugin which loads the functions I need from a handle to a library once that handle is supplied. There are so many ways this could change.
An interface class PluginFactory which creates Plugins.
An ABC PluginLoader which is the mother class which manages everything.
Now, my problem is I feel that FileNameLoader and LibLoader can go inside Plugin. But this would mean that if someone wanted to just change RTLD_NOW to RTLD_LAZY he would have to change Plugin class. On the other hand, I feel that there are too many classes here. Please give some input. I can post the interface code if necessary. Thanks in advance.
EDIT:
After giving this some thought, I have come to the conclusion that more interfaces is better (in my scenario at least). Suppose there are x implementations of FileNameLoader, y implementations of LibLoader and z implementations of Plugin. If I keep these classes separate, I have to write x + y + z implementation classes. Then I can combine them to get any functionality possible. On the other hand, if all these interfaces were in the Plugin class, I'd have to write x*y*z implementation classes to get all the possible functionalities, which is larger than x + y + z given that there are at least 2 implementations for each interface. This is just one side of it. The other advantage is that the purpose of each interface is clearer when there are more interfaces. At least that is what I think.
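The x + y versus x * y argument can be sketched for two of the axes as follows. The interface names FileNameLoader and LibLoader come from the question; their member functions, the concrete classes and loadAll are assumptions, and the RTLD_LAZY/RTLD_NOW policies are only mimicked with strings rather than real dlopen calls.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Interface names follow the question; their members are assumed.
class FileNameLoader {
public:
    virtual ~FileNameLoader() = default;
    virtual std::vector<std::string> fileNames() const = 0;
};
class LibLoader {
public:
    virtual ~LibLoader() = default;
    virtual std::string load(const std::string& file) const = 0;
};

// One implementation per axis; more can be added without touching the others.
class FixedListLoader : public FileNameLoader {
public:
    explicit FixedListLoader(std::vector<std::string> names) : names_(std::move(names)) {}
    std::vector<std::string> fileNames() const override { return names_; }
private:
    std::vector<std::string> names_;
};
class LazyLibLoader : public LibLoader {   // stands in for an RTLD_LAZY policy
public:
    std::string load(const std::string& file) const override { return "lazy:" + file; }
};
class EagerLibLoader : public LibLoader {  // stands in for an RTLD_NOW policy
public:
    std::string load(const std::string& file) const override { return "now:" + file; }
};

// Any FileNameLoader combines with any LibLoader.
std::vector<std::string> loadAll(const FileNameLoader& names, const LibLoader& libs) {
    std::vector<std::string> loaded;
    for (const auto& file : names.fileNames()) loaded.push_back(libs.load(file));
    return loaded;
}

// Usage sketch: switching the loading policy does not touch the name source.
void usageExample() {
    std::vector<std::string> files = {"a.so", "b.so"};
    FixedListLoader names(files);
    assert(loadAll(names, LazyLibLoader{})[0] == "lazy:a.so");
    assert(loadAll(names, EagerLibLoader{})[1] == "now:b.so");
}
```

Someone who wants RTLD_LAZY instead of RTLD_NOW swaps one small object and leaves Plugin alone.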
My c++ projects generally consists of objects that implement one or more interfaces.
I have found that this approach has the following effects:
Use of interfaces enforces your design.
(my opinion only) ensures a better program design.
Related functionality is grouped into interfaces.
The compiler will let you know if your implementation of the interface is incomplete or incorrect (good for changes to interfaces).
You can pass interface pointers around instead of entire objects.
Passing around interface pointers has the benefit that you're exposing only the functionality required to other objects.
COM makes heavy use of interfaces, as its modular design is useful for IPC (inter-process communication), promotes code reuse and enables backwards compatibility.
Microsoft use COM extensively and base their OS and most important APIs (DirectX, DirectShow, etc.) on COM, for these reasons, and although it's hardly the most accessible technology, COM's not going away any time soon.
Will these aid your own program(s)? Up to you. If you're going to turn a lot of your code into COM objects, it's definitely the right approach.
The other good stuff you get with interfaces that I've mentioned - make your own judgement as to how useful they'll be to you. Personally, I find interfaces indispensable.
Generally the only time I provide more than one interface, it will be because I have two completely different kinds of clients (eg: clients and The Server). In that case, yes it is perfectly OK.
However, this statement worries me:
I thought I passed design and started implementing
That's old-fashioned Waterfall thinking. You are never done designing. You will almost always have to do a fairly major redesign the first time a real client tries to use your class. Thereafter, every now and then you'll discover edge cases of client use that require (or would greatly benefit from) an extra new call or two, or a slightly different approach to all the calls.
You might be interested in the Interface Segregation Principle, which results in more, smaller interfaces.
"Clients should not be forced to depend on interfaces that they do not use."
More detail on this principle is provided by this paper: http://www.objectmentor.com/resources/articles/isp.pdf
This is part of Bob Martin's synergistic SOLID principles.
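As a small illustration of the principle, here is a sketch with two narrow role interfaces instead of one wide one. Readable, Writable, Buffer and the two client functions are hypothetical names, not taken from the question's plugin system.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Two small role interfaces instead of one wide interface.
class Readable {
public:
    virtual ~Readable() = default;
    virtual std::string read() const = 0;
};
class Writable {
public:
    virtual ~Writable() = default;
    virtual void write(const std::string& text) = 0;
};

// One object may still implement both roles.
class Buffer : public Readable, public Writable {
public:
    std::string read() const override { return data_; }
    void write(const std::string& text) override { data_ += text; }
private:
    std::string data_;
};

// This client depends on Readable only; changes to Writable cannot affect it.
std::size_t countChars(const Readable& source) { return source.read().size(); }

// This client depends on Writable only.
void appendGreeting(Writable& sink) { sink.write("hello"); }

// Usage sketch: each client sees just the slice it needs.
void usageExample() {
    Buffer buffer;
    appendGreeting(buffer);
    assert(countChars(buffer) == 5);
}
```

Each client can be compiled and tested against its own small interface, which is the "more, smaller interfaces" outcome mentioned above.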
There isn't a golden rule. It'll depend on the scenario, and even then you may find in the future some assumptions have changed and you need to update it accordingly.
Personally I like the way you have it now. You can replace at the top level, or very specific pieces.
Having the One Big Class That Does Everything is wrong. So is having One Big Interface That Defines Everything.

How should I design a mechanism in C++ to manage relatively generic entities within a simulation?

I would like to start my question by stating that this is a C++ design question, more than anything, limiting the scope of the discussion to what is accomplishable in that language.
Let us pretend that I am working on a vehicle simulator that is intended to model modern highway systems. As part of this simulation, entities will be interacting with each other to avoid accidents, stop at stop lights and perhaps eventually even model traffic enforcement with radar guns and subsequent exciting high speed chases.
Being a spatial simulation written in C++, it seems like it would be ideal to start with some kind of Vehicle hierarchy, with cars and trucks deriving from some common base class. However, a common problem I have run into is that such a hierarchy is usually very rigidly defined, and introducing unexpected changes - modeling a boat for instance - tends to introduce unexpected complexity that grows over time into something quite unwieldy.
This simple approach seems to suffer from a combinatorial explosion of classes. Imagine if I created a MoveOnWater interface and a MoveOnGround interface, and used them to define Car and Boat. Then let's say I add RadarEquipment. Now I have to do something like add the classes RadarBoat and RadarCar. Add more capabilities using this approach and the whole thing rapidly becomes quite unreasonable.
One approach I have been investigating to address this inflexibility is to do away with the inheritance hierarchy altogether. Instead of trying to come up with a type-safe way to define everything that could ever be in this simulation, I defined one class - I will call it 'Entity' - and the capabilities that make up an entity - can it drive, can it fly, can it use radar - are all created as interfaces and added to a kind of capability list that the Entity class contains. At runtime, the proper capabilities are created and attached to the entity, and functions that want to use these interfaces must first query the entity object and check for their existence. This approach seems to be the most obvious alternative, and is working well for the time being. I do, however, worry about the maintenance issues that this approach will have. Effectively any arbitrary thing can be added, and there is no single location in which all possible capabilities are defined. It's not a problem currently, when the total number of things is quite small, but I worry that it might be a problem when someone else starts trying to use and modify the code.
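For readers trying to picture the capability-list approach described above, a minimal sketch might look like the following. It is only one possible shape: the Capability base class, the lookup keyed by std::type_index, and the Radar and MoveOnGround members are all assumptions made for illustration, not the asker's actual code.

```cpp
#include <memory>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <utility>

// Common base so capabilities can be stored together (hypothetical).
struct Capability {
    virtual ~Capability() = default;
};

struct Radar : Capability {
    double rangeMeters = 300.0;
};

struct MoveOnGround : Capability {
    double maxSpeed = 40.0;
};

class Entity {
public:
    // Attach a capability at runtime, keyed by its concrete type.
    template <typename T, typename... Args>
    T& add(Args&&... args) {
        auto cap = std::make_unique<T>(std::forward<Args>(args)...);
        T& ref = *cap;
        caps_[std::type_index(typeid(T))] = std::move(cap);
        return ref;
    }

    // Query a capability; returns nullptr when the entity lacks it.
    template <typename T>
    T* get() const {
        auto it = caps_.find(std::type_index(typeid(T)));
        return it == caps_.end() ? nullptr
                                 : static_cast<T*>(it->second.get());
    }

private:
    std::unordered_map<std::type_index, std::unique_ptr<Capability>> caps_;
};
```

Calling code then checks `entity.get<Radar>()` for null before using it, which is exactly the "query first" step described above, and also the place where the lack of a central capability registry shows up.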
As one potential alternative, I pondered using the template system to achieve type safety while keeping the same kind of flexibility. I imagine I could create entities that inherited whatever combination of interfaces I wanted. Using these objects would entail creating a template class or function that used any combination of the interfaces. One example might be the simple move on road using just the MoveOnRoad interface, whereas more complex logic, like a "high speed freeway chase", could use methods from both MoveOnRoad and Radar interfaces.
Of course, making this approach usable mandates the use of Boost concept checks just to make debugging feasible. Also, this approach has the unfortunate side effect of making "optional" interfaces all but impossible. It is not simple to write a function that can have logic to do one thing if the entity has a RadarEquipment interface, and do something else if it doesn't. In this regard, type safety is somewhat of a curse. I think some trickery with boost::any may be able to pull it off, but I haven't figured out how to make that work and it seems like way too much complexity for what I am trying to achieve.
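For what it's worth, here is a sketch of the template approach under one assumption the question does not make: a C++17 compiler. In that case `static_assert` with `std::is_base_of` gives a readable error for a missing required interface, and `if constexpr` allows a compile-time branch on an optional one. The types, the detectionRange function, and the 50-metre fallback are invented for illustration.

```cpp
#include <type_traits>

// Hypothetical capability interfaces, named after those in the question.
struct MoveOnRoad { double speed = 0.0; };
struct Radar      { double rangeMeters = 300.0; };

// Entities are composed by inheriting whichever interfaces apply.
struct Car       : MoveOnRoad {};
struct PoliceCar : MoveOnRoad, Radar {};

// Requires MoveOnRoad; uses Radar only when the entity type has it.
template <typename EntityT>
double detectionRange(const EntityT& e) {
    static_assert(std::is_base_of<MoveOnRoad, EntityT>::value,
                  "detectionRange needs an entity that can move on roads");
    if constexpr (std::is_base_of<Radar, EntityT>::value) {
        return e.rangeMeters;  // optional interface is present
    } else {
        return 50.0;           // assumed line-of-sight fallback
    }
}
```

Note that the branch is resolved per type at compile time, so this does not help when capabilities are attached to an entity at runtime.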
Thus, we are left with the dynamic "list of capabilities" and achieving the goal of having decision logic that drives behavior based on what the entity is capable of becomes trivial.
Now, with that background in mind, I am open to any design gurus telling me where I erred in my reasoning. I am eager to learn of a design pattern or idiom that is commonly used to address this issue, and the sort of tradeoffs I will have to make.
I also want to mention that I have been contemplating perhaps an even more random design. Even though my gut tells me that this should be designed as a high performance C++ simulation, a part of me wants to do away with the Entity class and object-oriented foo altogether and use a relational model to define all of these entity states. My initial thought is to treat entities as an in-memory database and use procedural query logic to read and write the various state information, with the necessary behavior logic that drives these queries written in C++. I am somewhat concerned about performance, although it would not surprise me if that was a non-issue. I am perhaps more concerned about what maintenance issues and additional complexity this would introduce, as opposed to the relatively simple list-of-capabilities approach.
"Encapsulate what varies" and "prefer object composition to inheritance" are the two OOAD principles at work here.
Check out the Bridge Design pattern. I visualize Vehicle abstraction as one thing that varies, and the other aspect that varies is the "Medium". Boat/Bus/Car are all Vehicle abstractions, while Water/Road/Rail are all Mediums.
I believe that in such a mechanism, there may be no need to maintain any capability. For example, if a Bus cannot move on Water, such a behavior can be modelled by a NOP behavior in the Vehicle Abstraction.
Use the Bridge pattern when
you want to avoid a permanent binding
between an abstraction and its
implementation. This might be the
case, for example, when the
implementation must be selected or
switched at run-time.
both the abstractions and their
implementations should be extensible
by subclassing. In this case, the
Bridge pattern lets you combine the
different abstractions and
implementations and extend them
independently.
changes in the implementation of an
abstraction should have no impact on
clients; that is, their code should
not have to be recompiled.
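A minimal sketch of how the Vehicle/Medium split could look as a Bridge. The class names follow the answer's wording, but the member functions (traverse, move, setMedium) are assumptions made up for this example:

```cpp
#include <memory>
#include <string>
#include <utility>

// Implementation side of the bridge: the medium being travelled on.
struct Medium {
    virtual ~Medium() = default;
    virtual std::string traverse() const = 0;
};
struct Road : Medium {
    std::string traverse() const override { return "road"; }
};
struct Water : Medium {
    std::string traverse() const override { return "water"; }
};

// Abstraction side: a vehicle holds a medium and can swap it at run time.
class Vehicle {
public:
    explicit Vehicle(std::unique_ptr<Medium> m) : medium_(std::move(m)) {}
    virtual ~Vehicle() = default;
    void setMedium(std::unique_ptr<Medium> m) { medium_ = std::move(m); }
    virtual std::string move() const = 0;
protected:
    const Medium& medium() const { return *medium_; }
private:
    std::unique_ptr<Medium> medium_;
};

class Bus : public Vehicle {
public:
    using Vehicle::Vehicle;  // reuse the base constructor
    std::string move() const override {
        return "bus on " + medium().traverse();
    }
};
```

New vehicles and new mediums can then be added independently, without a class per combination.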
Now, with that background in mind, I am open to any design gurus telling me where I erred in my reasoning.
You may be erring in using C++ to define a system for which you as yet have no need/no requirements:
This approach seems to be the most
obvious alternative, and is working
well for the time being. I, however,
worry about the maintenance issues
that this approach will have.
Effectively any arbitrary thing can be
added, and there is no single location
in which all possible capabilities are
defined. It's not a problem currently,
when the total number of things is
quite small, but I worry that it might
be a problem when someone else starts
trying to use and modify the code.
Maybe you should be considering principles like YAGNI as opposed to BDUF.
Some of my personal favourites are from Systemantics:
"15. A complex system that works is invariably found to have evolved from a simple system that works"
"16. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over, beginning with a working simple system."
You're also worrying about performance, when you have no defined performance requirements, and no problems with performance:
I am somewhat concerned about
performance, although it would not
surprise me if that was a non-issue.
Also, I hope you know about double-dispatch, which might be useful for implementing anything-to-anything interactions (it's described in some detail in More Effective C++ by Scott Meyers).
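In case double dispatch is unfamiliar, here is a minimal sketch using two virtual calls. The collideWith and hitBy names and the returned strings are invented for illustration; this is not the code from the book.

```cpp
#include <string>

struct Car;
struct Truck;

struct Vehicle {
    virtual ~Vehicle() = default;
    // First dispatch: selected by the dynamic type of *this.
    virtual std::string collideWith(const Vehicle& other) const = 0;
    // Second dispatch: selected by the dynamic type of the other object,
    // with the first object's type carried in the overload.
    virtual std::string hitBy(const Car& car) const = 0;
    virtual std::string hitBy(const Truck& truck) const = 0;
};

struct Car : Vehicle {
    std::string collideWith(const Vehicle& other) const override {
        return other.hitBy(*this);
    }
    std::string hitBy(const Car&) const override { return "car hits car"; }
    std::string hitBy(const Truck&) const override { return "truck hits car"; }
};

struct Truck : Vehicle {
    std::string collideWith(const Vehicle& other) const override {
        return other.hitBy(*this);
    }
    std::string hitBy(const Car&) const override { return "car hits truck"; }
    std::string hitBy(const Truck&) const override { return "truck hits truck"; }
};
```

The cost is that every new vehicle type needs a new hitBy overload in the base class, so it suits a small, fairly stable set of types.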

For C/C++, When is it beneficial not to use Object Oriented Programming?

I find myself always trying to fit everything into the OOP methodology, when I'm coding in C/C++. But I realize that I don't always have to force everything into this mold. What are some pros/cons for using the OOP methodology versus not? I'm more interested in the pros/cons of NOT using OOP (for example, are there optimization benefits to not using OOP?). Thanks, let me know.
Of course it's very easy to explain a million reasons why OOP is a good thing. These include: design patterns, abstraction, encapsulation, modularity, polymorphism, and inheritance.
When not to use OOP:
Putting square pegs in round holes: Don't wrap everything in classes when they don't need to be. Sometimes there is no need and the extra overhead just makes your code slower and more complex.
Object state can get very complex: There is a really good quote from Joe Armstrong who invented Erlang:
The problem with object-oriented
languages is they’ve got all this
implicit environment that they carry
around with them. You wanted a banana
but what you got was a gorilla holding
the banana and the entire jungle.
Your code is already not OOP: It's not worth porting your code if your old code is not OOP. There is a quote from Richard Stallman in 1995:
Adding OOP to Emacs is not clearly an
improvement; I used OOP when working
on the Lisp Machine window systems,
and I disagree with the usual view
that it is a superior way to program.
Portability with C: You may need to export a set of functions to C. Although you can simulate OOP in C by making a struct and a set of functions whose first parameter takes a pointer to that struct, it isn't always natural.
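The "struct plus functions taking a pointer to it" pattern from the last point can be sketched like this. All names (Counter, CounterHandle, counter_create and friends) are hypothetical; the idea is just an opaque handle with a flat, C-callable surface over a C++ class.

```cpp
// Ordinary C++ class, hidden from C callers.
class Counter {
public:
    void add(int n) { total_ += n; }
    int total() const { return total_; }
private:
    int total_ = 0;
};

// C-compatible surface: an opaque handle plus free functions whose first
// parameter is that handle.
extern "C" {
    typedef struct CounterHandle CounterHandle;

    CounterHandle* counter_create() {
        return reinterpret_cast<CounterHandle*>(new Counter());
    }
    void counter_add(CounterHandle* h, int n) {
        reinterpret_cast<Counter*>(h)->add(n);
    }
    int counter_total(const CounterHandle* h) {
        return reinterpret_cast<const Counter*>(h)->total();
    }
    void counter_destroy(CounterHandle* h) {
        delete reinterpret_cast<Counter*>(h);
    }
}
```

In a real library the declarations would live in a header that a C compiler can read, with the definitions in a C++ source file.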
You may find more reasons in this paper entitled Bad Engineering Properties of Object-Oriented Languages.
Wikipedia's Object Oriented Programming page also discusses some pros and cons.
One school of thought with object-oriented programming is that you should have all of the functions that operate on a class as methods on the class.
Scott Meyers, one of the C++ gurus, actually argues against this in this article: How Non-Member Functions Improve Encapsulation.
He basically says, unless there's a real compelling reason to, you should keep the function SEPARATE from the class. Otherwise the class can turn into this big bloated unmanageable mess.
Based on experiences in a previous large project, I totally agree with him.
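A small sketch of the idea, with a made-up Samples class: mean is a non-member, non-friend function, so it can only use the public interface and cannot break if the private representation changes.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Small class with a minimal public interface (hypothetical example).
class Samples {
public:
    explicit Samples(std::vector<double> values) : values_(std::move(values)) {}
    std::size_t size() const { return values_.size(); }
    double at(std::size_t i) const { return values_.at(i); }
private:
    std::vector<double> values_;
};

// Non-member, non-friend: touches only size() and at(), never values_.
double mean(const Samples& s) {
    if (s.size() == 0) return 0.0;  // assumed convention for empty input
    double sum = 0.0;
    for (std::size_t i = 0; i < s.size(); ++i) sum += s.at(i);
    return sum / static_cast<double>(s.size());
}
```

Had mean been a member, it would be one more function with access to values_ that has to be reviewed whenever the storage changes.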
A benefit of non-OOP functionality is that it often makes exporting your functionality to different languages easier. For example, a simple DLL containing only functions is much easier to use from C#: you can use P/Invoke to simply call the C++ functions. So in this sense it can be useful for writing extremely time-critical algorithms that fit nicely into a single function call or a few of them.
OOP is used a lot in GUI code, computer games, and simulations. Windows should be polymorphic - you can click on them, resize them, and so on. Computer game objects should be polymorphic - they probably have a location, a path to follow, they might have health, and they might have some AI behavior. Simulation objects also have behavior that is similar, but breaks down into classes.
For most things though, OOP is a bit of a waste of time. State usually just causes trouble, unless you have put it safely in the database where it belongs.
I suggest you read Bjarne's Paper about Why C++ is not just an Object-Oriented Programming Language
Consider, for a moment, not object-orientation itself but one
of the keystones of object-orientation: encapsulation.
It can be shown that change-propagation probability cannot increase
with distance from the change: if A depends on B and B depends on C,
and we change C, then the probability that A will change
cannot be larger than the probability that B will
change. If B is a direct dependency on C and A is an indirect
dependency on C, then, more generally, to minimise the potential cost
of any change in a system we must minimise the potential number of
direct dependencies.
The ISO defines encapsulation as the property that the information
contained in an object is accessible only through interactions at the
interfaces supported by the object.
We use encapsulation to minimise the number of potential dependencies
with the highest change-propagation probability. Basically,
encapsulation mitigates the ripple effect.
Thus one reason not to use encapsulation is when the system is so
small or so unchanging that the cost of potential ripple effects is
negligible. This is also, therefore, a case when OO might not be used
without potentially costly consequences.
Well, there are several alternatives. Non-OOP code in C++ may instead be:
C-style procedural code, or
C++-style generic programming
The only advantages to the first are the simplicity and backwards-compatibility. If you're writing a small trivial app, then messing around with classes is just a waste of time. If you're trying to write a "Hello World", just call printf already. Don't bother wrapping it in a class. And if you're working with an existing C codebase, it's probably not object-oriented, and trying to force it into a different paradigm than it already uses is just a recipe for pain.
For the latter, the situation is different, in that this approach is often superior to "traditional OOP".
Generic programming gives you greater performance (among other things because you often avoid the overhead of vtables, and because with less indirection, the compiler is better able to inline), better type safety (because the exact type is known, rather than hiding it behind an interface), and often cleaner and more concise code as well (STL iterators and algorithms enable much of this, without using a single instance of runtime polymorphism or virtual functions).
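To make that concrete, here is a minimal sketch with invented Square and Rect types. There is no base class and no virtual function; totalArea works with any range whose elements have an area() member, and the exact element type is known to the compiler at each call site.

```cpp
#include <iterator>
#include <numeric>
#include <vector>

// Unrelated types that merely share an area() member (hypothetical).
struct Square { double side; double area() const { return side * side; } };
struct Rect   { double w, h; double area() const { return w * h; } };

// Generic algorithm: one template, resolved statically for each range type.
template <typename Range>
double totalArea(const Range& shapes) {
    return std::accumulate(std::begin(shapes), std::end(shapes), 0.0,
                           [](double sum, const auto& s) { return sum + s.area(); });
}
```

The trade-off is that a single container cannot hold a mix of Square and Rect this way; when that is needed, runtime polymorphism or something like a variant comes back into the picture.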
OOP is little more than an aging buzzword. A methodology that everyone misunderstood (the version supported by C++ and Java has little to do with what OOP originally meant, as in Smalltalk), and then pretended was the holy grail. There are aspects to it that are useful, certainly, but it is often not the best approach for designing an application.
Rather, express the overall logic by other means, for example generic programming, and when you need a class to encapsulate some simple concept, by all means design it according to OOP principles.
OOP is just a tool among many. The goal is not to write OOP code, but to write good code. Sometimes, the way to do this is by using OOP principles, but often, you can get better code using generic programming principles, or functional programming.
It is a very project-dependent decision. My general feel of OOP is that it's useful for organizing large projects that involve multiple components. One area where I find OOP especially pointless is school assignments. Excepting those specifically designed to teach OOP concepts, or large software design concepts, many of my assignments, specifically those in more algorithm-focused classes, are best suited to non-OOP design.
So specifically, smaller projects, that are not likely to grow large, and projects that center around a single algorithm seem to be non-OOP candidates in my books. Also, if you can write the specification as a linear set of steps, e.g., with no interactive GUI or state to maintain, this would also be an opportunity.
Of course, if you're required to use an OOP design, or an OOP toolkit, or if you have well defined 'objects' in your spec, or if you need the features of polymorphism, etc., there are plenty of reasons to use it; the above seem to be indicators of when it would be simpler not to.
Just my $0.02.
Having an Ada background, I develop in C in terms of packages containing data and their associated functions. This gives very modular code, with pieces that can be taken apart and reused on other projects. I don't feel the need to use OOP.
When I develop in Objective-C, objects are the natural container for data and code. I still develop with more or less the package concept in mind with some new cool features.
I used to be an OOP fanboy... Then I realized that using functions, generics and callbacks can often make a more elegant and change-friendly solution in C++ than classes and virtual functions.
Other big names realized it too: http://harmful.cat-v.org/software/OO_programming/
IMHO, I have a feeling that the OOP concept does not really suit the needs of Big Data, as OOP assumes everything is kept in memory (the concept of objects and member variables). This always results in memory-demanding and heavy applications when OOP is used, for example, for processing big images. Instead, the simplicity of C may be used with intensive parallel I/O, making apps more efficient and easier to implement. It is the year 2019 as I am writing this message... Everything may change in a year! :)
In my mind it comes down to what kind of model suits the problem at hand. It seems to me that OOP is best suited to coding GUI programs, in that the data and functionality for a graphical object are easily bundled together. Other problems (such as a web server, as an example off the top of my head) might be more easily modeled with a data-centric approach, where there's no strong advantage to having a method and its data near each other.
tl;dr depends on the problem.
I'd say the greatest benefit of C++ OOP is inheritance and polymorphism (virtual functions etc.).
This allows for code reuse and extensibility.
C++, use OOP - - - C, no, with certain exceptions
In C++ you should use OOP. It's a nice abstraction and it's the tool you are given. You either use it or leave it in the box where it can't help. You don't use the power saw for everything but I would read the manual and have it ready for the right job.
In C, it's a more difficult call. While you can certainly write arbitrarily object-oriented code in C, it's enough of a pain that you immediately find yourself fighting the language in order to use it. You may be more productive dropping the doesn't-fit-so-well design pattern and programming as C was intended to be used.
Furthermore, every time you make an array of function pointers or something in an OOP-in-C design pattern, you sever almost completely all visible links in the inheritance chain, making the code hard to maintain. In real OOP languages, there is an obvious chain of derived classes, often analyzed and documented for you. (mmm, javadoc.) Not so in OOP-in-C, and the tools available won't be able to see it.
So, I would argue in general against OOP in C. For a really complex program, you may well need the abstraction, and then you will have to do it despite needing to fight the language in the process and despite making the program quite hard to follow by anyone other than the original author.
But if you knew the program was going to become that complicated, you shouldn't have written it in C in the first place...
In C, there are some times when I 'emulate' the object oriented approach, by defining some sort of constructor with granular control over things like callbacks, when running several instances of it.
For instance, let's say I have some spiffy event handler library and I know that down the road I'm going to need many allocated copies.
So I would have (in C):
MyEvent *ev1 = new_eventhandler();
set_event_callback_func(ev1, callback_one);
ev1->setfd(fd1);
MyEvent *ev2 = new_eventhandler();
set_event_callback_func(ev2, callback_two);
ev2->setfd(fd2);
destroy_eventhandler(ev1);
destroy_eventhandler(ev2);
Obviously, I would later do something useful with that, like handle received events in the two respective callback functions. I'm not going to really elaborate on the method of typing function pointers and structures to hold them, nor what would go on in the 'constructor', because it's pretty obvious.
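For readers who don't find it obvious, one possible layout is sketched below. It is written in a C-like style but so that it compiles as C++, and everything beyond the function names used above is an assumption: the MyEvent fields, the callback signature, and dispatch_event are made up, and the file descriptor is set through a free function set_event_fd rather than the ev->setfd(...) call shown above.

```cpp
#include <cstdlib>

// Assumed callback signature: receives the descriptor that fired.
typedef void (*event_callback)(int fd);

// Hypothetical layout for the MyEvent type used above.
struct MyEvent {
    int fd;
    event_callback callback;
};

// "Constructor": allocate and put the handler into a known state.
MyEvent* new_eventhandler() {
    MyEvent* ev = static_cast<MyEvent*>(std::malloc(sizeof(MyEvent)));
    if (ev) {
        ev->fd = -1;
        ev->callback = nullptr;
    }
    return ev;
}

void set_event_callback_func(MyEvent* ev, event_callback cb) { ev->callback = cb; }
void set_event_fd(MyEvent* ev, int fd) { ev->fd = fd; }

// Invoke the callback, if any, for this handler's descriptor.
void dispatch_event(const MyEvent* ev) {
    if (ev->callback) ev->callback(ev->fd);
}

// "Destructor": release the handler.
void destroy_eventhandler(MyEvent* ev) { std::free(ev); }
```

Because the callback is just a field, it can be replaced at any time with another call to set_event_callback_func.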
I think this approach works for more advanced interfaces where it's desirable to allow the user to define their own callbacks (and change them on the fly), or when working on complex non-blocking I/O services.
Otherwise, I much prefer a more procedural / functional approach.
Probably an unpopular idea but I think you should stick with non-OOP unless it adds something useful. In most practical problems OOP is useful but if I'm just playing with an idea I start writing non-object code and put functions and data into classes if it becomes useful.
Of course I still use other objects in my code (std::vector et al) and I use namespaces to help organise my functions but why put code into objects until it is useful? Equally don't shy away from free functions in an OO solution.
The question is tricky because OOP encompasses several concepts: object encapsulation, polymorphism, inheritance, etc. It's easy to take those ideas too far. Here's a concrete example:
When C++ first caught on, zillions of string classes sprang into being. Everything you could possibly imagine doing to a string (upcasing, downcasing, trimming, tokenizing, parsing, etc.) was a member function of some string class.
Notice, though, that std::strings from the STL don't have all these methods. STL is object-oriented--the state and implementation details of a string object are well encapsulated, only a small, orthogonal interface is exposed to the world. All the crazy manipulations that people used to include as member functions are now delegated to non-member functions.
This is powerful, because these functions can now work on any string class that exposes the same interface. If you use STL strings for most things and a specialty version tuned to your program's idiosyncrasies, you don't have to duplicate member functions. You just have to implement the basic string interface and then you can re-use all those crazy manipulations.
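As a small sketch of what that looks like in practice, here is a made-up toUpper function. It is a non-member template that assumes only begin() and end() over char elements, so it accepts std::string and, as a stand-in for a specialty string class, std::vector<char>.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Non-member function: works with any string-like type that exposes
// begin()/end() over char elements. Takes a copy and returns it upcased.
template <typename String>
String toUpper(String s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::toupper(c));
    });
    return s;
}
```

A custom string class gets toUpper for free as soon as it provides the same iterator interface.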
Some people call this hybrid approach generic programming. It's still object-oriented programming, but it moves away from the "everything is a member-function" mentality that a lot of people associate with OOP.