Prefixing interfaces with I? - unit-testing

I am currently reading "Clean Code" by Robert Martin (UncleBob), and generally loving the musings of UncleBob. However, I got a bit confused when I read that he avoids prefixing interfaces, as in "IPerson". He states: "I don't want my users knowing that I'm handing them an interface".
Thinking from a TDD/injection perspective, I will always be very interested in telling the "users" of my classes that I am handing over an interface. The primary reason is that I consider interfaces to be contracts between the different "agents" of a system. An agent working with one corner of my system should not know the concrete implementation of another agent's work; they should only exchange contracts, and expect the contracts to be fulfilled without knowing how. The other, but also very important, reason is that an interface can be mocked fully, which makes unit testing much easier. There are limits to how much you can mock on a concrete class.
Therefore, I prefer to make it visible that I am indeed handing over an interface... or taking an interface as an argument. But since UncleBob is a heavyweight champ in our community, and I am just another flyweight desk jockey, I would like to know if I am missing something.
Is it wrong for me to insist on I's in interfaces?

There are a number of conventions in Java and C# that we have grown comfortable with, but that are backwards. For example, the convention of putting private variables at the top of each class is quite silly from a technical point of view. The most important things about a class are its public methods. The least important things, the things we hide behind a privacy barrier, are the instance variables. So why would we put them at the top?
The "I" in front of interfaces is another backwards convention. When you are passed a reference to an object, you should expect it to be an interface. Interfaces should be the default; so there is no point in doing something extra, like using an I prefix, to announce that you are doing what everyone expects you to do. It would be better (though still wrong) if we reserved a special marker for the exceptional condition of passing a concrete class.
Another problem with using I is that (oddly) we use it to communicate the implementation decision of using an interface. Usually we don't want implementation decisions expressed so loudly, because that makes them hard to change. Consider, for example, what might happen if you decided that IFoo really ought to be an abstract class instead of an interface. Should you change the name to Foo or CFoo, or ACFoo?
I can hear the wheels turning in your head. You are thinking: "Yeah, but interfaces have a special place in the language, and so it's reasonable to mark them with a special naming convention." That's true. But integers also have a special place in the language, and we don't mark them (any more). Besides, ask yourself this, why do interfaces have a special place in the language?
The whole idea behind interfaces in Java and C# was a cop-out. The language designers could have just used abstract classes, but they were worried about the difficulties of implementing multiple inheritance. So they made a back-room deal with themselves. They invented an artificial construct (i.e. interfaces) that would provide some of the power of multiple inheritance, and they constrained normal classes to single inheritance.
This was one of the worst decisions the language designers made. They invented a new and heavyweight syntax element in order to exclude a useful and powerful (albeit controversial) language feature. Interfaces were not invented to enable, they were invented to disable. Interfaces are a hack placed in the language by designers who didn't want to solve the harder problem of MI. So when you use the I prefix, you are putting a big spotlight on one of the largest hacks in language history.
The next time you write a function signature like this:
public void myFunction(IFoo foo) {...}
Ask yourself this: "Why do I want to know that the author of IFoo used the word 'interface'? What difference does it make to me whether he used 'interface' or 'class' or even 'struct'? That's his business, not mine! So why is he forcing me to know his business by putting this great big I in front of his type name? Why doesn't he zip his declarations up and keep his privates out of my face?"
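The "what if IFoo becomes an abstract class" argument can be sketched in C++, which has no `interface` keyword at all. This is only an illustration under assumed names: `Foo`, `PlainFoo`, `LoudFoo` and the `greeting` method are hypothetical and do not come from the book or the answer above.

```cpp
#include <string>

// Hypothetical type, named without any prefix. Whether it is a pure
// "interface" or a base class with some default behaviour is the author's
// business, not the caller's.
class Foo {
public:
    virtual ~Foo() = default;
    virtual std::string name() const = 0;
    // Imagine this default implementation was added later. With an "IFoo"
    // name, that change would make the prefix misleading or force a rename.
    virtual std::string greeting() const { return "hello from " + name(); }
};

class PlainFoo : public Foo {
public:
    std::string name() const override { return "plain"; }
};

class LoudFoo : public Foo {
public:
    std::string name() const override { return "loud"; }
    std::string greeting() const override { return "HELLO FROM LOUD"; }
};

// Client code: identical whether Foo is all-abstract or partly implemented.
std::string myFunction(const Foo& foo) { return foo.greeting(); }
```

The signature of `myFunction` stays the same whether `Foo` has zero, some, or only pure virtual members, which is the freedom the unprefixed name preserves.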

"I consider Interfaces contracts between the different "agents" of a system. An agent working with one corner of my system, should not know the concrete implementation of another agent's work; they should only exchange contracts, and expect the contracts to be fulfilled without knowing how. The other, but also very important, reason is that an interface can be mocked fully, and thus making unit-testing much easier. There are limits to how much you can mock on a concrete class."
All of this is true - but how does it necessitate a naming convention for interfaces?
Basically, prefixing interfaces with "I" is nothing but another example of the useless kind of Hungarian notation, because in a statically typed language (the only kind where interfaces as a language construct make sense) you can always easily and quickly find out what a type is, usually by hovering the mouse over it in the IDE.

If you're talking about .NET, then interfaces with I at the beginning are so ubiquitous that dropping them would confuse the hell out of everyone.
Plus I'd much rather have
public class Foo : IFoo {}
than
public class FooImpl : Foo {}
It all boils down to personal preference. I did play with the idea of dropping the prefix for a while, but I went back to the I prefix. YMMV.

Related

Is it better to have lot of interfaces or just one?

I have been working on this plugin system. I thought I had finished the design and started implementing. Now I wonder if I should revisit my design. My problem is the following:
Currently in my design I have:
An interface class FileNameLoader for loading the names of all the shared libraries my application needs to load, e.g. load all files in a directory, load all files specified in an XML file, load all files the user inputs, etc.
An interface class LibLoader that actually loads the shared object. This class is only responsible for loading a shared object once its file name has been given. There are various ways one may need to load a shared lib, e.g. use RTLD_NOW/RTLD_LAZY, check if the lib has already been loaded, etc.
An ABC Plugin which loads the functions I need from a handle to a library once that handle is supplied. There are so many ways this could change.
An interface class PluginFactory which creates Plugins.
An ABC PluginLoader which is the mother class which manages everything.
Now, my problem is I feel that FileNameLoader and LibLoader can go inside Plugin. But this would mean that if someone wanted to just change RTLD_NOW to RTLD_LAZY he would have to change Plugin class. On the other hand, I feel that there are too many classes here. Please give some input. I can post the interface code if necessary. Thanks in advance.
EDIT:
After giving this some thought, I have come to the conclusion that more interfaces are better (in my scenario at least). Suppose there are x implementations of FileNameLoader, y implementations of LibLoader, and z implementations of Plugin. If I keep these classes separate, I have to write x + y + z implementation classes. Then I can combine them to get any functionality possible. On the other hand, if all these interfaces were in the Plugin class, I'd have to write x*y*z implementation classes to get all the possible functionalities, which is larger than x + y + z given that there are at least 2 implementations for each interface. This is just one side of it. The other advantage is that the purpose of each interface is clearer when there are more interfaces. At least that is what I think.
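A minimal sketch of the x + y + z argument, assuming hypothetical implementations (`FixedListLoader`, `LazyLibLoader`, `EagerLibLoader`) and a simplified `PluginLoader` that is not the actual design from the question. The loaders return strings instead of calling `dlopen`, purely to keep the example portable.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical interfaces mirroring two of the roles named in the question.
class FileNameLoader {
public:
    virtual ~FileNameLoader() = default;
    virtual std::vector<std::string> fileNames() const = 0;
};

class LibLoader {
public:
    virtual ~LibLoader() = default;
    // Returns a description rather than a real library handle.
    virtual std::string load(const std::string& file) const = 0;
};

class FixedListLoader : public FileNameLoader {
public:
    std::vector<std::string> fileNames() const override { return {"a.so", "b.so"}; }
};

class LazyLibLoader : public LibLoader {
public:
    std::string load(const std::string& file) const override { return "lazy:" + file; }
};

class EagerLibLoader : public LibLoader {
public:
    std::string load(const std::string& file) const override { return "now:" + file; }
};

// The manager composes one implementation of each interface, so any pairing
// works without writing a new class for the combination.
class PluginLoader {
public:
    PluginLoader(std::unique_ptr<FileNameLoader> names, std::unique_ptr<LibLoader> libs)
        : names_(std::move(names)), libs_(std::move(libs)) {}

    std::vector<std::string> loadAll() const {
        std::vector<std::string> loaded;
        for (const auto& file : names_->fileNames()) {
            loaded.push_back(libs_->load(file));
        }
        return loaded;
    }

private:
    std::unique_ptr<FileNameLoader> names_;
    std::unique_ptr<LibLoader> libs_;
};
```

Switching from lazy to eager loading here means passing a different `LibLoader` to the constructor; no other class has to change.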
My C++ projects generally consist of objects that implement one or more interfaces.
I have found that this approach has the following effects:
Use of interfaces enforces your design.
(my opinion only) ensures a better program design.
Related functionality is grouped into interfaces.
The compiler will let you know if your implementation of the interface is incomplete or incorrect (good for changes to interfaces).
You can pass interface pointers around instead of entire objects.
Passing around interface pointers has the benefit that you're exposing only the functionality required to other objects.
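The last two points can be sketched as follows. The names `Readable`, `Writable`, `Buffer` and `countCharacters` are made up for illustration; the idea is only that a consumer holding a `Readable` pointer has no access to the writing side of the same object.

```cpp
#include <cstddef>
#include <string>

// Hypothetical interfaces: each groups one kind of related functionality.
class Readable {
public:
    virtual ~Readable() = default;
    virtual std::string read() const = 0;
};

class Writable {
public:
    virtual ~Writable() = default;
    virtual void write(const std::string& text) = 0;
};

// One object implements both interfaces. Leaving out either override makes
// Buffer abstract, so the compiler rejects any attempt to instantiate it.
class Buffer : public Readable, public Writable {
public:
    std::string read() const override { return data_; }
    void write(const std::string& text) override { data_ += text; }

private:
    std::string data_;
};

// This consumer receives only a Readable pointer, so it cannot call write().
std::size_t countCharacters(const Readable* source) {
    return source->read().size();
}
```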
COM makes heavy use of interfaces, as its modular design is useful for IPC (inter-process communication), promotes code reuse and enables backward compatibility.
Microsoft use COM extensively and base their OS and most important APIs (DirectX, DirectShow, etc.) on COM, for these reasons, and although it's hardly the most accessible technology, COM's not going away any time soon.
Will these aid your own program(s)? Up to you. If you're going to turn a lot of your code into COM objects, it's definitely the right approach.
The other good stuff you get with interfaces that I've mentioned - make your own judgement as to how useful they'll be to you. Personally, I find interfaces indispensable.
Generally the only time I provide more than one interface, it will be because I have two completely different kinds of clients (eg: clients and The Server). In that case, yes it is perfectly OK.
However, this statement worries me:
"I thought I passed design and started implementing"
That's old-fashioned Waterfall thinking. You are never done designing. You will almost always have to do a fairly major redesign the first time a real client tries to use your class. Thereafter, every now and then you'll discover edge cases of client use that require (or would greatly benefit from) an extra call or two, or a slightly different approach to all the calls.
You might be interested in the Interface Segregation Principle, which results in more, smaller interfaces.
"Clients should not be forced to depend on interfaces that they do not use."
More detail on this principle is provided by this paper: http://www.objectmentor.com/resources/articles/isp.pdf
This is one of Bob Martin's SOLID principles.
There isn't a golden rule. It'll depend on the scenario, and even then you may find in the future some assumptions have changed and you need to update it accordingly.
Personally I like the way you have it now. You can replace at the top level, or very specific pieces.
Having the One Big Class That Does Everything is wrong. So is having One Big Interface That Defines Everything.

Keeping modules independent, while still using each other

A big part of my C++ application uses classes to describe the data model, e.g. something like ClassType (which actually emulates reflection in plain C++).
I want to add a new module to my application and it needs to make use of these ClassType's, but I prefer not to introduce dependencies from my new module on ClassType.
So far I have the following alternatives:
Not making it independent and introduce a dependency on ClassType, with the risk of creating more 'spaghetti'-dependencies in my application (this is my least-preferred solution)
Introduce a new class, e.g. IType, and letting my module only depend on IType. ClassType should then inherit from IType.
Use strings as identification method, and forcing the users of the new module to convert the ClassType to a string or vice versa where needed.
Use GUID's (or even simple integers) as identification, also requiring conversions between GUID's and ClassType's
How far should you try to go when decoupling modules in an application?
just introduce an interface and let all the other modules rely on the interface? (like IType described above)
even decouple it further by using other identifications like strings or GUID's?
I am afraid that by decoupling it too far, the code becomes more unstable and more difficult to debug. I've seen one such example in Qt: signals and slots are linked using strings and if you make a typing mistake, the functionality doesn't work, but it still compiles.
How far should you keep your modules decoupled?
99% of the time, if your design is based on reflection, then you have major issues with the design.
Generally speaking, something like
if (x is myclass)
elseif (x is anotherclass)
else
is a poor design because it neglects polymorphism. If you're doing this, then the item x is in violation of the Liskov Substitution Principle.
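To make the contrast concrete, here is a small sketch with hypothetical `Shape`, `Circle` and `Square` classes (none of them come from the question). The first function is the type-checking chain; the second gets the same result from a virtual call.

```cpp
#include <string>

class Shape {  // hypothetical base type
public:
    virtual ~Shape() = default;
    virtual std::string describe() const = 0;
};

class Circle : public Shape {
public:
    std::string describe() const override { return "circle"; }
};

class Square : public Shape {
public:
    std::string describe() const override { return "square"; }
};

// Type-checking chain: has to be edited whenever a new subclass appears.
std::string describeByTypeCheck(const Shape& s) {
    if (dynamic_cast<const Circle*>(&s)) return "circle";
    else if (dynamic_cast<const Square*>(&s)) return "square";
    else return "unknown";
}

// Polymorphic version: new subclasses work without touching this function.
std::string describeByVirtualCall(const Shape& s) { return s.describe(); }
```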
Also, given that C++ already has RTTI, I don't see why you'd reinvent the wheel. That's what typeid and dynamic_cast are for.
I'll steer away from thinking about your reflection, and just look at the dependency ideas.
Decouple what it's reasonable to decouple. Coupling implies that if one thing changes, so must another. So your NewCode is using ClassType; if some aspects of it change then you surely must change NewCode, so it can't be completely decoupled. Which of the following do you want to decouple from?
Semantics, what ClassType does.
Interface, how you call it.
Implementation, how it's implemented.
To my eyes the first two are reasonable coupling. But surely an implementation change should not require NewCode to change. So code to Interfaces. We try to keep Interfaces fixed, we tend to extend them rather than change them, keeping them back-compatible if at all possible. Sometimes we use name/value pairs to try to make the interface extensible, and then hit the typo kind of errors you allude to. It's a trade-off between flexibility and "type-safety".
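A rough sketch of "code to interfaces" for this case, using the IType idea from the question. The single `typeName` method, the `makeLabel` function and the contents of `ClassType` are assumptions made up for the example; the real ClassType is not shown in the question.

```cpp
#include <string>
#include <utility>

// Hypothetical minimal interface; the new module depends only on this.
class IType {
public:
    virtual ~IType() = default;
    virtual std::string typeName() const = 0;
};

// New module: compiles against IType and never mentions ClassType.
std::string makeLabel(const IType& type) {
    return "<" + type.typeName() + ">";
}

// Existing data-model side: ClassType (details invented here) implements
// IType, so its implementation can change without touching makeLabel.
class ClassType : public IType {
public:
    explicit ClassType(std::string name) : name_(std::move(name)) {}
    std::string typeName() const override { return name_; }

private:
    std::string name_;
};
```

Unlike the string or GUID alternatives, a mismatch between the module and the type is caught by the compiler rather than at run time.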
It's a philosophical question; it depends on the type of module, and the trade-offs. I think I have personally done all of them at various times, except for the GUID to type mapping, which doesn't have any advantages over the string to type mapping in my opinion, and at least strings are readable.
I would say you need to look at what level of decoupling is required for the particular module, given the expected external usage and code organization, and go from there. You've hit all the conceptual methods as far as I know, and they are each useful in particular situations.
That's my opinion, anyway.

How should I design a mechanism in C++ to manage relatively generic entities within a simulation?

I would like to start my question by stating that this is a C++ design question, more than anything, limiting the scope of the discussion to what is accomplishable in that language.
Let us pretend that I am working on a vehicle simulator that is intended to model modern highway systems. As part of this simulation, entities will be interacting with each other to avoid accidents, stop at stop lights and perhaps eventually even model traffic enforcement with radar guns and subsequent exciting high speed chases.
Being a spatial simulation written in C++, it seems like it would be ideal to start with some kind of Vehicle hierarchy, with cars and trucks deriving from some common base class. However, a common problem I have run into is that such a hierarchy is usually very rigidly defined, and introducing unexpected changes (modeling a boat, for instance) tends to introduce unexpected complexity that grows over time into something quite unwieldy.
This simple approach seems to suffer from a combinatorial explosion of classes. Imagine if I created a MoveOnWater interface and a MoveOnGround interface, and used them to define Car and Boat. Then let's say I add RadarEquipment. Now I have to do something like add the classes RadarBoat and RadarCar. Add more capabilities using this approach and the whole thing rapidly becomes quite unreasonable.
One approach I have been investigating to address this inflexibility issue is to do away with the inheritance hierarchy altogether. Instead of trying to come up with a type-safe way to define everything that could ever be in this simulation, I defined one class, which I will call 'Entity', and the capabilities that make up an entity (can it drive, can it fly, can it use radar) are all created as interfaces and added to a kind of capability list that the Entity class contains. At runtime, the proper capabilities are created and attached to the entity, and functions that want to use these interfaces must first query the entity object and check for their existence. This approach seems to be the most obvious alternative, and is working well for the time being. I, however, worry about the maintenance issues that this approach will have. Effectively any arbitrary thing can be added, and there is no single location in which all possible capabilities are defined. It's not a problem currently, when the total number of things is quite small, but I worry that it might be a problem when someone else starts trying to use and modify the code.
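For readers who want to picture the capability list, here is one possible shape for it. This is a guess at the structure, not the code from the question: the `Capability` base class, the `find` template and the `patrol` function are all assumptions, and `dynamic_cast` is just one way to do the query.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base for anything that can be attached to an entity.
class Capability {
public:
    virtual ~Capability() = default;
};

class MoveOnGround : public Capability {
public:
    std::string drive() const { return "driving"; }
};

class RadarEquipment : public Capability {
public:
    std::string scan() const { return "scanning"; }
};

class Entity {
public:
    void add(std::unique_ptr<Capability> capability) {
        capabilities_.push_back(std::move(capability));
    }

    // Returns the first attached capability of type T, or nullptr if none.
    template <typename T>
    T* find() const {
        for (const auto& c : capabilities_) {
            if (auto* match = dynamic_cast<T*>(c.get())) return match;
        }
        return nullptr;
    }

private:
    std::vector<std::unique_ptr<Capability>> capabilities_;
};

// "Optional" behaviour: branch on whether a capability is present.
std::string patrol(const Entity& e) {
    if (const auto* radar = e.find<RadarEquipment>()) return radar->scan();
    if (const auto* wheels = e.find<MoveOnGround>()) return wheels->drive();
    return "idle";
}
```

The maintenance worry is visible here too: nothing in `Entity` lists which capability types exist, so that knowledge lives only in the call sites.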
As one potential alternative, I pondered using the template system to achieve type safety while keeping the same kind of flexibility. I imagine I could create entities that inherited whatever combination of interfaces I wanted. Using these objects would entail creating a template class or function that used any combination of the interfaces. One example might be the simple move on road using just the MoveOnRoad interface, whereas more complex logic, like a "high speed freeway chase", could use methods from both MoveOnRoad and Radar interfaces.
Of course, making this approach usable mandates the use of Boost Concept Check just to make debugging feasible. Also, this approach has the unfortunate side effect of making "optional" interfaces all but impossible. It is not simple to write a function that can have logic to do one thing if the entity has a RadarEquipment interface, and do something else if it doesn't. In this regard, type safety is somewhat of a curse. I think some trickery with Boost.Any may be able to pull it off, but I haven't figured out how to make that work and it seems like way too much complexity for what I am trying to achieve.
Thus, we are left with the dynamic "list of capabilities" and achieving the goal of having decision logic that drives behavior based on what the entity is capable of becomes trivial.
Now, with that background in mind, I am open to any design gurus telling me where I err'd in my reasoning. I am eager to learn of a design pattern or idiom that is commonly used to address this issue, and the sort of tradeoffs I will have to make.
I also want to mention that I have been contemplating perhaps an even more random design. Even though my gut tells me that this should be designed as a high performance C++ simulation, a part of me wants to do away with the Entity class and object-oriented foo altogether and use a relational model to define all of these entity states. My initial thought is to treat entities as an in-memory database and use procedural query logic to read and write the various state information, with the necessary behavior logic that drives these queries written in C++. I am somewhat concerned about performance, although it would not surprise me if that was a non-issue. I am perhaps more concerned about what maintenance issues and additional complexity this would introduce, as opposed to the relatively simple list-of-capabilities approach.
Encapsulate what varies and Prefer object composition to inheritance, are the two OOAD principles at work here.
Check out the Bridge Design pattern. I visualize Vehicle abstraction as one thing that varies, and the other aspect that varies is the "Medium". Boat/Bus/Car are all Vehicle abstractions, while Water/Road/Rail are all Mediums.
I believe that in such a mechanism, there may be no need to maintain any capability. For example, if a Bus cannot move on Water, such a behavior can be modelled by a NOP behavior in the Vehicle Abstraction.
Use the Bridge pattern when:
you want to avoid a permanent binding between an abstraction and its implementation. This might be the case, for example, when the implementation must be selected or switched at run-time.
both the abstractions and their implementations should be extensible by subclassing. In this case, the Bridge pattern lets you combine the different abstractions and implementations and extend them independently.
changes in the implementation of an abstraction should have no impact on clients; that is, their code should not have to be recompiled.
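A minimal sketch of how Bridge might look for the Vehicle/Medium split suggested here. The class and method names (`Medium::traverse`, `Vehicle::move`, `setMedium`) are assumptions for the example, and the "NOP behavior" for impossible combinations is left out to keep it short.

```cpp
#include <memory>
#include <string>
#include <utility>

// "Implementation" side of the bridge: where the movement happens.
class Medium {
public:
    virtual ~Medium() = default;
    virtual std::string traverse() const = 0;
};

class Road : public Medium {
public:
    std::string traverse() const override { return "on road"; }
};

class Water : public Medium {
public:
    std::string traverse() const override { return "on water"; }
};

// "Abstraction" side: what is moving. A Vehicle holds a Medium instead of
// inheriting from one, so the two hierarchies can grow independently.
class Vehicle {
public:
    explicit Vehicle(std::unique_ptr<Medium> medium) : medium_(std::move(medium)) {}
    virtual ~Vehicle() = default;
    virtual std::string move() const = 0;

    // The implementation can be switched at run time.
    void setMedium(std::unique_ptr<Medium> medium) { medium_ = std::move(medium); }

protected:
    const Medium& currentMedium() const { return *medium_; }

private:
    std::unique_ptr<Medium> medium_;
};

class Car : public Vehicle {
public:
    using Vehicle::Vehicle;
    std::string move() const override { return "car " + currentMedium().traverse(); }
};

class Boat : public Vehicle {
public:
    using Vehicle::Vehicle;
    std::string move() const override { return "boat " + currentMedium().traverse(); }
};
```

With m vehicle types and n media this needs m + n classes rather than one class per combination, which is the explosion the question was worried about.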
Now, with that background in mind, I am open to any design gurus telling me where I err'd in my reasoning.
You may be erring in using C++ to define a system for which you as yet have no need/no requirements:
"This approach seems to be the most obvious alternative, and is working well for the time being. I, however, worry about the maintenance issues that this approach will have. Effectively any arbitrary thing can be added, and there is no single location in which all possible capabilities are defined. It's not a problem currently, when the total number of things is quite small, but I worry that it might be a problem when someone else starts trying to use and modify the code."
Maybe you should be considering principles like YAGNI as opposed to BDUF.
Some of my personal favourites are from Systemantics:
"15. A complex system that works is invariably found to have evolved from a simple system that works"
"16. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over, beginning with a working simple system."
You're also worrying about performance, when you have no defined performance requirements, and no problems with performance:
"I am somewhat concerned about performance, although it would not surprise me if that was a non-issue."
Also, I hope you know about double-dispatch, which might be useful for implementing anything-to-anything interactions (it's described in some detail in More Effective C++ by Scott Meyers).
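In case the term is unfamiliar, here is a bare-bones sketch of double dispatch using two virtual calls. It is not taken from the book; the `Car`/`Truck` classes and the `collideWith`/`hitBy` names are invented for the example.

```cpp
#include <string>

class Car;
class Truck;

class Vehicle {
public:
    virtual ~Vehicle() = default;
    // First dispatch: resolved on the dynamic type of *this.
    virtual std::string collideWith(const Vehicle& other) const = 0;
    // Second dispatch: resolved on the dynamic type of the other object,
    // with the overload chosen by the static type passed in.
    virtual std::string hitBy(const Car& car) const = 0;
    virtual std::string hitBy(const Truck& truck) const = 0;
};

class Car : public Vehicle {
public:
    std::string collideWith(const Vehicle& other) const override { return other.hitBy(*this); }
    std::string hitBy(const Car&) const override { return "car hits car"; }
    std::string hitBy(const Truck&) const override { return "truck hits car"; }
};

class Truck : public Vehicle {
public:
    std::string collideWith(const Vehicle& other) const override { return other.hitBy(*this); }
    std::string hitBy(const Car&) const override { return "car hits truck"; }
    std::string hitBy(const Truck&) const override { return "truck hits truck"; }
};
```

The trade-off is that every new vehicle type adds one `hitBy` overload to the base class and to each existing subclass.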

Is there a value in preceding the names of pure virtual base classes with the letter 'I', as is common in C#/Java?

I don't see this very often, if at all, in C++. Any reason not to do this? I think it would be easier to identify the implications and intentions of the typename by doing this, as well as the sourcefile. Thoughts?
IMHO, it is nonsense in any modern language/IDE.
This kind of notation is a relic of Microsoft COM coding standards. It was very common among Visual C++ developers, but nowadays even Microsoft discourages this kind of "Hungarian notation" habit.
Using a modern IDE, if I want to know whether a "class" is an "interface", I can look at the icon next to the name of the "class". I no longer need a confusing prefix before the class name.
In addition, this coding convention is error-prone. It requires human attention and, probably, someone to check whether an "I" was forgotten.
Nowadays it's more common to have an application- or company-specific coding style than to follow some general rules for what is right or wrong (*), i.e. if your team finds it useful to prefix with an I, then why not. What is more important (as with all coding styles) is to be consistent.
(*) In C++/C; I am not speaking of other languages, but it could apply there as well.
In Symbian C++ programming, pure virtual classes begin with 'M' (for Mixin) by convention.
What's the real significance of a class having pure virtual functions? It means it can't be instantiated directly, that derived classes are forced to implement some features, but is that really significantly different from deriving from a base with virtual but no pure functions, where you might want to override a default implementation anyway? During evolution of code, a pure virtual function might be replaced with a default implementation, but that doesn't affect or undermine the usage or design of existing clients. So, this distinction isn't very useful as far as I can see. But, there is a cost...
To maintain the I prefix convention as you've suggested, a decision to provide default implementations of the virtual functions would require a name change and edits to all the client code. If this is not done, then I becomes actively misleading. So, you'd need to find significant advantage in the convention before you'd adopt it, and a lack of common use suggests that advantage isn't perceived.
Based on the issue above, you might reach for a convention where I only indicated a base class with some virtual members, irrespective of whether any are pure. But then a class would add a virtual function during the evolution of a system, and you'd still need to rename the class and correct client usage.
In C++, most people have found that these kinds of distinctions don't add enough value to warrant their maintenance. One priority is to maintain as much freedom as possible to vary code without affecting clients, though this can conflict with other priorities, e.g. the pImpl idiom versus inline performance.

More on the mediator pattern and OO design

So, I've come back to ask, once more, a patterns-related question. This may be too generic to answer, but my problem is this (I am programming and applying concepts that I learn as I go along):
I have several structures within structures (note, I'm using the word structure in the general sense, not in the strict C struct sense (whoa, what a tongue twister)), and quite a bit of complicated inter-communications going on. Using the example of one of my earlier questions, I have Unit objects, UnitStatistics objects, General objects, Army objects, Soldier objects, Battle objects, and the list goes on, some organized in a tree structure.
After researching a little bit and asking around, I decided to use the mediator pattern because the interdependencies were becoming a trifle too much, and the classes were starting to appear too tightly coupled (yes, another term which I just learned and am too happy about not to use it somewhere). The pattern makes perfect sense and it should straighten some of the chaotic spaghetti that I currently have boiling in my project pot.
But well, I guess I haven't learned yet enough about OO design. My question is this (finally. PS, I hope it makes sense): should I have one central mediator that deals with all communications within the program, and is it even possible? Or should I have, say, an abstract mediator and one subclassed mediator per structure type that deals with communication of a particular set of classes, e.g. a concrete mediator per army which helps out the army, its general, its units, etc.
I'm leaning more towards the second option, but I really am no expert when it comes to OO design. So my third question is, what should I read to learn more about this kind of subject? (I've looked at Head First's Design Patterns and the GoF book, but they're more of a "learn the vocabulary" kind of book than a "learn how to use your vocabulary" kind of book, which is what I need in this case.)
As always, thanks for any and all help (including the witty comments).
I don't think you've provided enough info above to be able to make an informed decision as to which is best.
From looking at your other questions, it seems that most of the communication occurs between components within an Army. You don't mention much occurring between one Army and another. In that case it would seem to make sense to have each Mediator instance coordinate communication between the components comprising a single Army, i.e. the Generals, Soldiers etc. So if you have 10 Armies, then you will have 10 ArmyMediators.
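As a rough illustration of one mediator per army, assuming very simplified `Soldier` and `General` classes (the real ones from your project will differ, and the `enlist`/`broadcastOrder` names are made up):

```cpp
#include <string>
#include <vector>

class Soldier {
public:
    void receive(const std::string& order) { lastOrder_ = order; }
    const std::string& lastOrder() const { return lastOrder_; }

private:
    std::string lastOrder_;
};

// Hypothetical mediator: one instance per army. It is the only class that
// knows which soldiers belong together.
class ArmyMediator {
public:
    void enlist(Soldier& soldier) { soldiers_.push_back(&soldier); }

    void broadcastOrder(const std::string& order) {
        for (Soldier* s : soldiers_) s->receive(order);
    }

private:
    std::vector<Soldier*> soldiers_;  // non-owning; soldiers must outlive the mediator
};

// The general talks to its army's mediator, never to soldiers directly.
class General {
public:
    explicit General(ArmyMediator& mediator) : mediator_(mediator) {}
    void command(const std::string& order) { mediator_.broadcastOrder(order); }

private:
    ArmyMediator& mediator_;
};
```

Two armies then simply means two `ArmyMediator` objects; an order given through one never reaches soldiers enlisted with the other.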
If you really want to learn O-O Design you're going to have to try things out and run the risk of getting it wrong from time to time. I think you'll learn just as much, if not more, from having to refactor a design that doesn't quite model the problem correctly into one that does, as you will from getting the design right the first time around.
Often you just won't have enough information up front to be able to choose the right design from the go anyway. Just choose the simplest one that works for now, and improve it later when you have a better idea of the requirements and/or the shortcomings of the current design.
Regarding books, personally I think the GoF book is more useful if you focus less on the specific set of patterns they describe, and focus more on the overall approach of breaking classes down into smaller reusable components, each of which typically encapsulates a single unit of functionality.
I can't answer your question directly, because I have never used that design pattern. However, whenever I have this problem of message passing between various objects, I use the signal-slot pattern. Usually I use Qt's, but my second option is Boost's. They both solve the problem by having a single, global message passing handler. They are also both type-safe and quite efficient, both in terms of CPU cycles and in productivity. Because they are so flexible, i.e. any object can emit any kind of signal, and any other object can receive any signal, you'll end up solving, I think, what you describe.
Sorry if I just made things worse by not choosing either of the two options, but instead adding a third!
In order to use Mediator you need to determine:
(1) What does the group of objects, which need mediation, consist of?
(2) Among these, which are the ones that have a common interface?
The Mediator design pattern relies on the group of objects that are to be mediated to have a "common interface"; i.e., same base class: the widgets in the GoF book example inherit from same Widget base, etc.
So, for your application:
(1) Which are the structures (Soldier, General, Army, Unit, etc.) that need mediation between each other?
(2) Which ones of those (Soldier, General, Army, Unit, etc.) have a common base?
This should help you determine, as a first step, an outline of the participants in the Mediator design pattern. You may find out that some structures in (1) fall outside of (2). Then, you may need to force them to adhere to a common interface, too, if you can change that or if you can afford to make that change... (it may turn out to be too much redesigning work, and it violates the Open-Closed principle: your design should be, as much as possible, open to adding new features but closed to modifying existing ones).
If you discover that (1) and (2) above result in a partition of separate groups, each with its own mediator, then the number of these partitions dictates the number of different types of mediators. Now, should these different mediators have a common interface of their own? Maybe, maybe not. Polymorphism is a way of handling complexity by grouping different entities under a common interface such that they can be handled as a group rather than individually. So, would there be any benefit to grouping all these supposedly different types of mediators under a common interface (like the DialogDirector in the GoF book example)? Possibly, if:
(a) You may have to use a heterogeneous collection of mediators;
or
(b) You envision in the future that these mediators will evolve (and they probably will). Hence providing an abstract interface allows you to derive more evolved versions of mediators without affecting existent ones or their colleagues (the clients of the mediators).
So, without knowing more, I'd have to guess that, yes, it's probably better to use abstract mediators and to subclass them, for each group partition, just to prepare yourself for future changes without having to redesign your mediators (remember the Open-Closed principle).
Hope this helps.