How to refactor an existing class to become polymorphic? - c++

I have a class that is used as a member in many places in my project.
Now, instead of this class, I want a polymorphic hierarchy, with the actual object created by some kind of factory.
I have to choose between:
1. Changing all the places where I use the class, so that they call the factory and use a pointer instead of the object directly.
2. Changing the class to be just a wrapper that calls functions of a new polymorphic class I will create.
Which strategy should I choose?

Change all the places where I use the class to call the factory and use a pointer instead of object directly.
That's best. It seems painful at first, but it's clean and more extensible than implementing a wrapper because you didn't feel like doing a search for new MyClass(.
Once you list all the places that use new, you'll see that it really isn't all that bad a job.

Chances are that even if you go with #2 and implement a base class wrapper, you will have to modify the client code anyway (different methods of construction through a factory, e.g., for a polymorphic base wrapper).
I'd go with #1, but instead of a regular pointer I'd use something like boost::shared_ptr or boost::scoped_ptr (depending on what you need).
The second option might allow you to take some liberties with the base wrapper interface, but I'd recommend against that: favor the commonly accepted approaches when possible. If the base class wrapper provides additional facilities that boost::shared_ptr doesn't, for example, it will be a foreign entity that introduces new concepts into the system, probably with little or no benefit to show for it.
In the best case scenario, your base class wrapper duplicates the interface of something common and familiar to most developers like boost::shared_ptr, in which case it's reinventing the wheel and you might as well have used boost::shared_ptr. In the worst case scenario, your wrapper class introduces an interface that's completely different and therefore introduces foreign code to the system that others will not immediately recognize.
No matter how good you are, other developers will have a much easier time trusting and working with a peer-reviewed, well-documented, thoroughly-tested library like boost than a handrolled solution by one engineer. If only for that reason, try to use the existing library solutions as much as possible and prefer those to, say, a custom base class wrapper.
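A minimal sketch of option #1 with a smart pointer, assuming C++11 or later, where std::unique_ptr and std::shared_ptr are the standard counterparts of boost::scoped_ptr and boost::shared_ptr. The names Shape, Circle, Square, make_shape and Drawing are hypothetical and not taken from the question:

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical polymorphic replacement for the original concrete class.
class Shape {
public:
    virtual ~Shape() = default;
    virtual std::string name() const = 0;
};

class Circle : public Shape {
public:
    std::string name() const override { return "circle"; }
};

class Square : public Shape {
public:
    std::string name() const override { return "square"; }
};

// Hypothetical factory: the only place that knows the concrete types.
std::unique_ptr<Shape> make_shape(const std::string& kind) {
    if (kind == "circle") return std::unique_ptr<Shape>(new Circle);
    if (kind == "square") return std::unique_ptr<Shape>(new Square);
    throw std::invalid_argument("unknown shape kind: " + kind);
}

// Client code that used to hold the class by value now holds a smart pointer.
class Drawing {
public:
    explicit Drawing(const std::string& kind) : shape_(make_shape(kind)) {}
    std::string describe() const { return "drawing of a " + shape_->name(); }
private:
    std::unique_ptr<Shape> shape_;
};
```

With std::unique_ptr the owning class becomes move-only; if copies of the owner should share one object, std::shared_ptr would be the closer match to boost::shared_ptr.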

Your idea of using a wrapper even has a name. It's the letter/envelope idiom, as described by Coplien (before they started to call these idioms "patterns").
Google finds you more explanations for it.
Contrary to the others, I don't see anything wrong with using it. For the class's users, it's much easier to deal with something that behaves as simply as a value type. There's no hassle of having to use a factory or a factory method for creating objects, since the class's constructors are the factories - which is what constructors were invented for in the first place.
What could be wrong with that?
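For illustration, here is a rough sketch of the letter/envelope idea, assuming C++11. Logger is the envelope that users hold by value, and LoggerImpl is the hidden letter hierarchy. All names here are made up for the example:

```cpp
#include <memory>
#include <string>

// The "letter": the hidden polymorphic hierarchy.
class LoggerImpl {
public:
    virtual ~LoggerImpl() = default;
    virtual std::string format(const std::string& msg) const = 0;
    virtual std::unique_ptr<LoggerImpl> clone() const = 0;
};

class PlainLogger : public LoggerImpl {
public:
    std::string format(const std::string& msg) const override { return msg; }
    std::unique_ptr<LoggerImpl> clone() const override {
        return std::unique_ptr<LoggerImpl>(new PlainLogger(*this));
    }
};

class TaggedLogger : public LoggerImpl {
public:
    explicit TaggedLogger(const std::string& tag) : tag_(tag) {}
    std::string format(const std::string& msg) const override {
        return "[" + tag_ + "] " + msg;
    }
    std::unique_ptr<LoggerImpl> clone() const override {
        return std::unique_ptr<LoggerImpl>(new TaggedLogger(*this));
    }
private:
    std::string tag_;
};

// The "envelope": behaves like a value type; its constructors pick the letter.
class Logger {
public:
    Logger() : impl_(new PlainLogger) {}
    explicit Logger(const std::string& tag) : impl_(new TaggedLogger(tag)) {}
    Logger(const Logger& other) : impl_(other.impl_->clone()) {}
    Logger& operator=(const Logger& other) {
        if (this != &other) impl_ = other.impl_->clone();
        return *this;
    }
    std::string format(const std::string& msg) const { return impl_->format(msg); }
private:
    std::unique_ptr<LoggerImpl> impl_;
};
```

Existing code that declares the class as a plain member keeps compiling, which is the main attraction of this option; the cost is one forwarding function per operation.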


Add constructor to deftype created class

For the purposes of interoperability with Java, I need a class that has a nullary constructor that performs initialization.
Objects of this class need to have something resembling mutable java fields (namely, the object represents the backend of a game, and needs to keep game state).
deftype does everything I want to do except provide a nullary constructor (since I'm creating a class with fields).
I don't need the fields to be publicly readable, so I can think of four solutions:
1. Use gen-class; I don't want to do this if I can avoid it.
2. Somehow encode private member variables outside of the knowledge of deftype; I've been told this can't be done.
3. Write a modified deftype that also creates a nullary constructor; frankly, I don't know Clojure well enough for this.
4. Take the class created by deftype and somehow add a new constructor to it.
At the end of this, I need to have a Java class, since I will be handing it off to Java code that will be making a new object from the class.
Are any of the solutions I suggested (or any that I haven't thought of) other than using gen-class viable?
There's absolutely no shame in, where appropriate, writing a dash of Java if your Java interop requirements are simultaneously specific and unshakable. You could write a Java class with a single static factory method that returns an instance of the deftype class and that does whatever initialization/setup you need.
Alternatively, you can write a nullary factory function in Clojure, and call that directly from Java all day long.
In any case, neither deftype nor defrecord is intended to be (nor will either ever be) a fully-featured interop facility. gen-class certainly comes the closest, which is why it's been recommended.
I'd suggest just writing the object in Java - for Java-like objects with mutable fields it will probably be more elegant, understandable and practical.
I've generally had pretty good results mixing Java and Clojure code in projects. This seems like one of those cases where this might be appropriate. The interoperability is so good that you barely have any extra complexity.
BTW - I'm assuming that you need a nullary constructor to meet the requirements of some persistence library or something similar? It seems like an odd requirement otherwise. If this is the case, then you may find it makes sense to rethink your persistence strategy. Arbitrary restrictions like this always seem like a bit of a code smell to me.

Single-use class

In a project I am working on, we have several "disposable" classes. By disposable I mean classes where you call some methods to set up the info, and then call what amounts to a doit function. You doit once and throw the object away. If you want to doit again, you have to create another instance of the class. They're not reduced to single functions because they must keep state after they doit, so the user can get information about what happened, and returning a bunch of things through reference parameters doesn't seem very clean. It's not a singleton, but not a normal class either.
Is this a bad way to do things? Is there a better design pattern for this sort of thing? Or should I just give in and make the user pass in a boatload of reference parameters to return a bunch of things through?
What you describe is not a class (state + methods to alter it), but an algorithm (map input data to output data):
result_t do_it(parameters_t);
Why do you think you need a class for that?
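To make that signature concrete, here is a small sketch in which the inputs and outputs are plain structs, so nothing has to come back through reference parameters. The fields and the filtering logic are invented for the example; only the names result_t, do_it and parameters_t come from the line above:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical input bundle.
struct parameters_t {
    std::vector<int> values;
    int threshold;
};

// Hypothetical output bundle: everything the caller wants to know afterwards.
struct result_t {
    std::size_t accepted;    // values >= threshold
    std::size_t rejected;    // values < threshold
    long long sum_accepted;  // sum of the accepted values
};

result_t do_it(const parameters_t& params) {
    result_t result = {0, 0, 0};
    for (int v : params.values) {
        if (v >= params.threshold) {
            ++result.accepted;
            result.sum_accepted += v;
        } else {
            ++result.rejected;
        }
    }
    return result;
}
```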
Sounds like your class is basically a parameter block in a thin disguise.
There's nothing wrong with that IMO, and it's certainly better than a function with so many parameters it's hard to keep track of which is which.
It can also be a good idea when there are a lot of input parameters - several setup methods can each set a few of those at a time, so that the names of the setup functions give more of a clue as to which parameter is which. Also, you can cover different ways of setting up the same parameters using alternative setter functions - either overloads or functions with different names. You might even use a simple state machine or flag system to ensure the correct setups are done.
However, it should really be possible to recycle your instances without having to delete and recreate. A "reset" method, perhaps.
As Konrad suggests, this is perhaps misleading. The reset method shouldn't be seen as a replacement for the constructor - it's the constructor's job to put the object into a self-consistent initialised state, not the reset method's. Objects should be self-consistent at all times.
Unless there's a reason for making cumulative-running-total-style do-it calls, the caller should never have to call reset explicitly - it should be built into the do-it call as the first step.
I still decided, on reflection, to strike that out - not so much because of Jalf's comment, but because of the hairs I had to split to argue the point ;-) - Basically, I figure I almost always have a reset method for this style of class, partly because my "tools" usually have multiple related kinds of "do it" (e.g. "insert", "search" and "delete" for a tree tool) and a shared mode. The mode is just some input fields, in parameter block terms, but that doesn't mean I want to keep re-initializing it. But just because this pattern happens a lot for me doesn't mean it should be a point of principle.
I even have a name for these things (not limited to the single-operation case) - "tool" classes. A "tree_searching_tool" will be a class that searches (but doesn't contain) a tree, for example, though in practice I'd have a "tree_tool" that implements several tree-related operations.
Basically, even parameter blocks in C should ideally provide a kind of abstraction that gives it some order beyond being just a bunch of parameters. "Tool" is a (vague) abstraction. Classes are a major means of handling abstraction in C++.
I have used a similar design and wondered about this too. A simplified, fictitious example could look like this:
FileDownloader downloader(url);
downloader.result(); // get the path to the downloaded file
To make it reusable I store it in a boost::scoped_ptr:
boost::scoped_ptr<FileDownloader> downloader;
// Download first file
downloader.reset(new FileDownloader(url1));
// Download second file
downloader.reset(new FileDownloader(url2));
To answer your question: I think it's ok. I have not found any problems with this design.
As far as I can tell you are describing a class that represents an algorithm. You configure the algorithm, then you run the algorithm and then you get the result of the algorithm. I see nothing wrong with putting those steps together in a class if the alternative is a function that takes 7 configuration parameters and 5 output references.
This structuring of code also has the advantage that you can split your algorithm into several steps and put them in separate private member functions. You can do that without a class too, but that can lead to the sub-functions having many parameters if the algorithm has a lot of state. In a class you can conveniently represent that state through member variables.
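A rough sketch of what such an algorithm class can look like, assuming C++11. OutlierFilter and everything in it are hypothetical; the point is only the shape: named setup, one run() split into private steps that share member state, and results that stay queryable afterwards:

```cpp
#include <cmath>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

class OutlierFilter {
public:
    explicit OutlierFilter(std::vector<double> samples)
        : samples_(std::move(samples)) {}

    // Setup step: the setter name says which parameter is being set.
    void set_max_distance(double d) { max_distance_ = d; }

    // The "do it" step, split into private sub-steps that share state.
    void run() {
        compute_mean();
        split_samples();
    }

    // Results remain available after run(), with no out-parameters.
    double mean() const { return mean_; }
    const std::vector<double>& kept() const { return kept_; }
    std::size_t dropped() const { return dropped_; }

private:
    void compute_mean() {
        if (samples_.empty()) return;
        mean_ = std::accumulate(samples_.begin(), samples_.end(), 0.0) /
                static_cast<double>(samples_.size());
    }

    void split_samples() {
        for (double s : samples_) {
            if (std::fabs(s - mean_) <= max_distance_) kept_.push_back(s);
            else ++dropped_;
        }
    }

    std::vector<double> samples_;
    double max_distance_ = 1.0;
    double mean_ = 0.0;
    std::vector<double> kept_;
    std::size_t dropped_ = 0;
};
```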
One thing you might want to look out for is that structuring your code like this could easily tempt you to use inheritance to share code among similar algorithms. If algorithm A defines a private helper function that algorithm B needs, it's easy to make that member function protected and then access it by having class B derive from class A. It could also feel natural to define a third class C that contains the common code and then have A and B derive from C. Inheritance used only to share code in non-virtual methods is usually not the best way - it's inflexible, you end up having to take on the data members of the super class, and you break the encapsulation of the super class. As a rule of thumb for that situation, prefer factoring the common code out of both classes without using inheritance. You can factor that code into a non-member function, or into a utility class that you then use without deriving from it.
YMMV - what is best depends on the specific situation. Factoring code into a common super class is the basis for the template method pattern, so when using virtual methods inheritance might be what you want.
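As a small illustration of the non-member-function route, here is a sketch in which two unrelated algorithm classes share a helper without a common base. KeyParser, CommentStripper and text_util::trim are hypothetical names:

```cpp
#include <string>

namespace text_util {
// Shared step, factored out as a free function instead of a protected
// member of a common base class.
inline std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}
}  // namespace text_util

// Extracts the text before '=' (or the whole line if there is no '=').
class KeyParser {
public:
    explicit KeyParser(const std::string& line) : line_(line) {}
    void run() { key_ = text_util::trim(line_.substr(0, line_.find('='))); }
    const std::string& key() const { return key_; }
private:
    std::string line_, key_;
};

// Extracts the text before '#' (or the whole line if there is no '#').
class CommentStripper {
public:
    explicit CommentStripper(const std::string& line) : line_(line) {}
    void run() { text_ = text_util::trim(line_.substr(0, line_.find('#'))); }
    const std::string& text() const { return text_; }
private:
    std::string line_, text_;
};
```

Neither class inherits any data members it does not need, and trim can be tested on its own.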
Nothing especially wrong with the concept. You should try to set it up so that the objects in question can generally be auto-allocated vs having to be newed -- significant performance savings in most cases. And you probably shouldn't use the technique for highly performance-sensitive code unless you know your compiler generates it efficiently.
I disagree that the class you're describing "is not a normal class". It has state and it has behavior. You've pointed out that it has a relatively short lifespan, but that doesn't make it any less of a class.
Short-lived classes vs. functions with out-params:
I agree that your short-lived classes are probably a little more intuitive and easier to maintain than a function which takes many out-params (or 1 complex out-param). However, I suspect a function will perform slightly better, because you won't be taking the time to instantiate a new short-lived object. If it's a simple class, that performance difference is probably negligible. However, if you're talking about an extremely performance-intensive environment, it might be a consideration for you.
Short-lived classes: creating new vs. re-using instances:
There are plenty of examples where instances of classes are re-used: thread pools, DB-connection pools (probably darn near any software construct ending in 'pool' :). In my experience, pooling is used when instantiating the object is an expensive operation. Your small, short-lived classes don't sound like they're expensive to instantiate, so I wouldn't bother trying to re-use them. You may find that whatever pooling mechanism you implement actually costs MORE (performance-wise) than simply instantiating new objects whenever needed.

Why did Boost Parameter choose inheritance rather than composition?

I suppose most people on this site will agree that implementation can be outsourced in two ways:
private inheritance
composition
Inheritance is most often abused. Notably, public inheritance is often used when another form of inheritance would have been better, and in general one should use composition rather than private inheritance.
Of course the usual caveats apply, but I can't think of any time where I really needed inheritance for an implementation problem.
For the Boost Parameter library, however, you will notice that they have chosen inheritance over composition for the implementation of the named parameter idiom (for the constructor).
I can only think of the classical EBO (Empty Base Optimization) explanation, since there are no virtual methods at play here that I can see.
Does anyone know better, or can anyone point me to the discussion?
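For readers unfamiliar with EBO, here is a tiny sketch of the effect. The type names are made up, and the second static_assert reflects what mainstream compilers do for a simple case like this rather than anything specific to Boost Parameter:

```cpp
// An empty class: no data members, no virtual functions.
struct EmptyPolicy {
    int twice(int x) const { return 2 * x; }
};

// Composition: the empty member still needs its own address, so it costs
// at least one byte (usually more after padding).
struct ByComposition {
    EmptyPolicy policy;
    int value;
};

// Private inheritance: the empty base can share storage with the object.
struct ByInheritance : private EmptyPolicy {
    int value;
    int doubled() const { return twice(value); }
};

static_assert(sizeof(ByComposition) > sizeof(int),
              "the empty member takes up space");
static_assert(sizeof(ByInheritance) == sizeof(int),
              "the empty base is optimized away");
```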
EDIT: Oops! I posted the answer below because I misread your post. I thought you said the Boost library used composition over inheritance, not the other way around. Still, in case it's useful for anyone... (See EDIT 2 for what I think could be the answer to your question.)
I don't know the specific answer for the Boost Parameter Library. However, I can say that this is usually a better choice. The reason is that whenever you have the option to implement a relationship in more than one way, you should choose the weakest one (low coupling/high cohesion). Since inheritance is stronger than composition...
Notice that sometimes using private inheritance can make it harder to implement exception-safe code too. Take operator=, for example. Using composition, you can create a temporary and do the assignment with commit/rollback logic (assuming a correct construction of the object). But if you use inheritance, you'll probably do something like Base::operator=(obj) inside the operator= of the derived class. If that Base::operator=(obj) call throws, you risk your guarantees.
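One way to read "create a temporary and do the assignment with commit/rollback logic" is the copy-then-swap approach sketched below. Storage and Catalog are hypothetical names, and this is only one possible arrangement:

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical implementation class, held by composition.
struct Storage {
    std::vector<std::string> items;
};

class Catalog {
public:
    Catalog() = default;
    Catalog(const Catalog&) = default;

    void add(const std::string& item) { storage_.items.push_back(item); }
    std::size_t size() const { return storage_.items.size(); }

    Catalog& operator=(const Catalog& other) {
        // Prepare: copying may throw, but *this has not been touched yet.
        Storage temp(other.storage_);
        // Commit: swapping two vectors does not throw.
        std::swap(storage_.items, temp.items);
        return *this;
    }

private:
    Storage storage_;
};
```

If the copy throws, the target object still holds its old contents, which is the strong guarantee the paragraph above is after.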
EDIT 2: Now, trying to answer what you really asked. This is what I could understand from the link you provided. Since I don't know all details of the library, please correct me if I'm wrong.
When you use composition for "implemented in terms of" you need one level of indirection for the delegation.
struct AImpl
{
    //Dummy code, just for the example.
    int get_int() const { return 10; }
};
struct A
{
    AImpl * impl_;
    // Every operation of AImpl needs a forwarding wrapper like this one.
    int get_int() const { return impl_->get_int(); }
    /* ... */
};
In the case of the parameter-enabled constructor, you need to create an implementation class but you should still be able to use the "wrapper" class in a transparent way. This means that in the example from the link you mentioned, it's desired that you can manipulate myclass just like you would manipulate myclass_impl. This can only be done via inheritance. (Notice that in the example the inheritance is public, since it's the default for struct.)
I assume myclass_impl is supposed to be the "real" class, the one with the data, behavior, etc. Then, if you had a method like get_int() in it and if you didn't use inheritance you would be forced to write a get_int() wrapper in myclass just like I did above.
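Here is a sketch of the inheritance variant, for comparison. It does not use Boost.Parameter at all: the constructor signatures and members are invented, and only the names myclass and myclass_impl are borrowed from the example being discussed:

```cpp
#include <string>

// Stand-in for the "real" class with the data and behavior.
struct myclass_impl {
    myclass_impl(const std::string& name, int index)
        : name_(name), index_(index) {}
    int get_int() const { return index_; }
    const std::string& get_name() const { return name_; }
private:
    std::string name_;
    int index_;
};

// Public inheritance (the default for struct): several constructor
// signatures funnel into the single base constructor, and get_int() and
// get_name() are usable on myclass without hand-written wrappers.
struct myclass : myclass_impl {
    myclass() : myclass_impl("unnamed", 0) {}
    explicit myclass(const std::string& name) : myclass_impl(name, 0) {}
    myclass(const std::string& name, int index) : myclass_impl(name, index) {}
};
```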
This isn't a library I've ever used, so a glance through the documentation you linked to is the only thing I'm basing this answer on. It's entirely possible I'm about to be wrong, but...
They mention constructor delegation as a reason for using a common base class. You're right that composition could address that particular issue just as well. Putting it all in a single type, however, would not work. They want to boil multiple constructor signatures down into a single user-written initialization function, and without constructor delegation that requires a second data type. My suspicion is that much of the library had already been written from the point of view of putting everything into the class itself. When they ran into the constructor delegation issue, they compromised. Putting it into a base class was probably closer to what they were doing with the previous functionality, where they knew that both interface and implementation aspects of the functionality would be accessible to the class you're working with.
I'm not slamming the library in any way. I highly doubt I could put together a library like this one in any reasonable amount of time. I'm just reading between the lines. You know, speaking from ignorance but pretending I actually know something. :-)

Extending an existing class like a namespace (C++)?

I'm writing in second person just because it's easy, for you.
You are working with a game engine and really wish a particular engine class had a new method that does 'bla'. But you'd rather not spread your 'game' code into the 'engine' code.
So you could derive a new class from it with your one new method and put that code in your 'game' source directory, but maybe there's another option?
So this is probably completely illegal in the C++ language, but you thought at first, "perhaps I can add a new method to an existing class via my own header that includes the 'parent' header and some special syntax. This is possible when working with a namespace, for example..."
Assuming you can't declare methods of a class across multiple headers (and you are pretty darn sure you can't), what are the other options that support a clean divide between 'middleware/engine/library' and 'application', you wonder?
My only question to you is, "does your added functionality need to be a member function, or can it be a free function?" If what you want to do can be solved using the class's existing interface, then the only difference is the syntax, and you should use a free function (if you think that's "ugly", then... suck it up and move on, C++ wasn't designed for monkeypatching).
If you're trying to get at the internal guts of the class, it may be a sign that the original class is lacking in flexibility (it doesn't expose enough information for you to do what you want from the public interface). If that's the case, maybe the original class can be "completed", and you're back to putting a free function on top of it.
If absolutely none of that will work, and you just must have a member function (e.g. original class provided protected members you want to get at, and you don't have the freedom to modify the original interface)... only then resort to inheritance and member-function implementation.
For an in-depth discussion (and deconstruction of std::string), check out this Guru of the Week "Monolith" class article.
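A quick sketch of the free-function route. engine::Vector2 stands in for whatever engine class you wish had the extra method; it and the game:: functions are hypothetical:

```cpp
#include <cmath>

namespace engine {
// Stand-in for an engine class you cannot or do not want to modify.
class Vector2 {
public:
    Vector2(double x, double y) : x_(x), y_(y) {}
    double x() const { return x_; }
    double y() const { return y_; }
private:
    double x_, y_;
};
}  // namespace engine

namespace game {
// The "new methods" live in game code and use only the public interface.
inline double length(const engine::Vector2& v) {
    return std::sqrt(v.x() * v.x() + v.y() * v.y());
}

inline engine::Vector2 scaled(const engine::Vector2& v, double factor) {
    return engine::Vector2(v.x() * factor, v.y() * factor);
}
}  // namespace game
```

The only visible difference for callers is game::length(v) instead of v.length(), and the engine headers stay untouched.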
Sounds like an 'acts upon' relationship, which would not fit inheritance (use it sparingly!).
One option would be a composition utility class that acts upon a certain instance of the 'Engine' by being instantiated with a pointer to it.
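Roughly, such a utility class could look like this. engine::Scene and game::SceneSpawner are hypothetical, and the pointer is non-owning, so the engine object must outlive the utility:

```cpp
#include <string>
#include <vector>

namespace engine {
// Stand-in engine class.
class Scene {
public:
    void add_entity(const std::string& name) { entities_.push_back(name); }
    const std::vector<std::string>& entities() const { return entities_; }
private:
    std::vector<std::string> entities_;
};
}  // namespace engine

namespace game {
// Utility that acts upon a Scene it does not own, and keeps its own
// per-instance state (the running spawn counter).
class SceneSpawner {
public:
    explicit SceneSpawner(engine::Scene* scene) : scene_(scene), spawned_(0) {}

    void spawn_wave(const std::string& prefix, int count) {
        for (int i = 0; i < count; ++i) {
            scene_->add_entity(prefix + std::to_string(spawned_));
            ++spawned_;
        }
    }

    int spawned() const { return spawned_; }

private:
    engine::Scene* scene_;  // non-owning; must outlive the spawner
    int spawned_;
};
}  // namespace game
```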
Inheritance (as you pointed out), or
Use a function instead of a method, or
Alter the engine code itself, but isolate and manage the changes using a patch-manager like quilt or Mercurial/MQ
I don't see what's wrong with inheritance in this context though.
If the new method will be implemented using the existing public interface, then arguably it's more object oriented for it to be a separate function rather than a method. At least, Scott Meyers argues that it is.
Why? Because it gives better encapsulation. IIRC the argument goes that the class interface should define things that the object does. Helper-style functions are things that can be done with/to the object, not things that the object must do itself. So they don't belong in the class. If they are in the class, they can unnecessarily access private members, which weakens the hiding of those members and increases the number of lines of code that need to be touched if a private member changes in any way.
Of course if you want to access protected members then you must inherit. If your desired method requires per-instance state, but not access to protected members, then you can either inherit or compose according to taste - the former is usually more concise, but has certain disadvantages if the relationship isn't really "is a".
Sounds like you want Ruby mixins. Not sure there's anything close in C++. I think you have to do the inheritance.
Edit: You might be able to put a friend method in and use it like a mixin, but I think you'd start to break your encapsulation in a bad way.
You could do something COM-like, where the base class supports a QueryInterface() method which lets you ask for an interface that has that method on it. This is fairly trivial to implement in C++, you don't need COM per se.
You could also "pretend" to be a more dynamic language and have an array of callbacks as "methods" and gin up a way to call them using templates or macros and pushing 'this' onto the stack before the rest of the parameters. But it would be insane :)
Or Categories in Objective C.
There are conceptual approaches to extending class architectures (not single classes) in C++, but it's not a casual act, and requires planning ahead of time. Sorry.
Sounds like a classic inheritance problem to me. Except I would drop the code in an "Engine Enhancements" directory & include that concept in your architecture.

What are some 'good use' examples of dynamic casting?

We often hear/read that one should avoid dynamic casting. I was wondering what would be 'good use' examples of it, according to you?
Yes, I'm aware of that other thread: it is indeed when reading one of the first answers there that I asked my question!
This recent thread gives an example of where it comes in handy. There is a base Shape class and classes Circle and Rectangle derived from it. In testing for equality, it is obvious that a Circle cannot be equal to a Rectangle and it would be a disaster to try to compare them. While iterating through a collection of pointers to Shapes, dynamic_cast does double duty, telling you if the shapes are comparable and giving you the proper objects to do the comparison on.
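A sketch of that equality test, assuming C++11. The classes are reduced to the bare minimum and are not the code from the thread mentioned above:

```cpp
class Shape {
public:
    virtual ~Shape() = default;
    virtual bool equals(const Shape& other) const = 0;
};

class Circle : public Shape {
public:
    explicit Circle(double radius) : radius_(radius) {}
    bool equals(const Shape& other) const override {
        // The pointer form yields nullptr when other is not a Circle, so the
        // cast both answers "comparable?" and provides the typed object.
        const Circle* c = dynamic_cast<const Circle*>(&other);
        return c != nullptr && c->radius_ == radius_;
    }
private:
    double radius_;
};

class Rectangle : public Shape {
public:
    Rectangle(double width, double height) : width_(width), height_(height) {}
    bool equals(const Shape& other) const override {
        const Rectangle* r = dynamic_cast<const Rectangle*>(&other);
        return r != nullptr && r->width_ == width_ && r->height_ == height_;
    }
private:
    double width_, height_;
};

inline bool operator==(const Shape& a, const Shape& b) { return a.equals(b); }
```

One caveat with this sketch: if further classes derive from Circle or Rectangle, the comparison can become asymmetric, so a real design would need to decide how subclasses compare.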
Here's something I do often, it's not pretty, but it's simple and useful.
I often work with template containers that implement an interface,
imagine something like
template<class T>
class MyVector : public ContainerInterface
Where ContainerInterface has basic useful stuff, but that's all. If I want a specific algorithm on vectors of integers without exposing my template implementation, it is useful to accept the interface objects and dynamic_cast it down to MyVector in the implementation. Example:
// function prototype (public API, in the header file)
void ProcessVector( ContainerInterface& vecIfce );
// function implementation (private, in the .cpp file)
void ProcessVector( ContainerInterface& vecIfce )
{
    // The reference form of the cast throws std::bad_cast on failure; you
    // could use a more complex method to choose which low-level
    // implementation to use, basically rolling your own polymorphism by hand.
    MyVector<int>& vecInt = dynamic_cast<MyVector<int>&>(vecIfce);
    // Process a vector of integers
}
I could add a Process() method to the ContainerInterface that would be polymorphically resolved, it would be a nicer OOP method, but I sometimes prefer to do it this way. When you have simple containers, a lot of algorithms and you want to keep your implementation hidden, dynamic_cast offers an easy and ugly solution.
You could also look at double-dispatch techniques.
My current toy project uses dynamic_cast twice; once to work around the lack of multiple dispatch in C++ (it's a visitor-style system that could use multiple dispatch instead of the dynamic_casts), and once to special-case a specific subtype.
Both of these are acceptable, in my view, though the former at least stems from a language deficit. I think this may be a common situation, in fact; most dynamic_casts (and a great many "design patterns" in general) are workarounds for specific language flaws rather than something to aim for.
It can be used for a bit of run-time type safety when exposing handles to objects through a C interface. Have all the exposed classes inherit from a common base class. When a function accepts a handle, first cast it to the base class, then dynamic_cast to the class you're expecting. If they passed in a nonsensical handle, you'll typically get an exception or a crash when the run-time can't find the RTTI. If they passed in a valid handle of the wrong type, you get a NULL pointer and can throw your own exception. If they passed in the correct pointer, you're good to go.
This isn't fool-proof, but it is certainly better at catching mistaken calls to the libraries than a straight reinterpret cast from a handle, and waiting until some data gets mysteriously corrupted when you pass the wrong handle in.
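A minimal sketch of the handle check, assuming C++11. HandleBase, Texture, Sound, handle_t, to_handle and from_handle are all invented names. It covers the "valid handle of the wrong type" case; a handle that never came from this library is still undefined behavior:

```cpp
#include <stdexcept>

// Common base for every class exposed through the C interface.
class HandleBase {
public:
    virtual ~HandleBase() {}
};

class Texture : public HandleBase {
public:
    int width() const { return 64; }
};

class Sound : public HandleBase {
public:
    int channels() const { return 2; }
};

typedef void* handle_t;  // what the C API hands out

// Handles are always produced from a HandleBase*, so casting back to
// HandleBase* is valid for any handle this library created.
template <class T>
handle_t to_handle(T* object) {
    return static_cast<HandleBase*>(object);
}

template <class T>
T& from_handle(handle_t h) {
    HandleBase* base = static_cast<HandleBase*>(h);
    T* typed = dynamic_cast<T*>(base);  // nullptr if the type does not match
    if (typed == nullptr) throw std::invalid_argument("handle has the wrong type");
    return *typed;
}
```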
Well it would really be nice with extension methods in C#.
For example let's say I have a list of objects and I want to get a list of all ids from them. I can step through them all and pull them out but I would like to segment out that code for reuse.
so something like
List<myObject> myObjectList = getMyObjects();
List<string> ids = myObjectList.PropertyList("id");
would be cool except on the extension method you won't know the type that is coming in.
public static List<string> PropertyList(this object objList, string propName) {
var genList = (objList.GetType())objList;
would be awesome.
It is very useful; however, most of the time it is too useful: if the easiest way to get the job done is a dynamic_cast, that is more often than not a symptom of bad OO design, which in turn might lead to trouble in the future in unforeseen ways.