Myth about the factory pattern [closed] - factory-pattern

This has bothered me for a while, and I have no clue whether this is a myth.
It seems that a factory pattern can ease the pain of adding a dependency for a class.
For example, a book I read presents something like this:
Suppose that you have a class named Order. Initially it did not depend on anything, so you didn't bother using a factory to create Order objects and just used plain new to instantiate them. However, you now have a requirement that an Order has to be created in association with a Customer. There are a million places you need to change to add this extra parameter. If only you had defined a factory for the Order class, you would have met the new requirement without the same pain.
How is this not the same pain as adding an extra parameter to the constructor? I mean, you would still need to provide an extra argument to the factory, and the factory is also used in a million places, right?

If the user is known only at the time the order is created, you could implement a getCurrentUser() function that is called by the factory.
If that is possible, the factory function obviously wins. If not, then there is no gain.
If, in the past, you didn't know there would be a customer needed, you probably also could not have known whether it would be possible to implement a getCurrentUser() function. The chances of the factory method paying off may not be very good, but they are not always zero.

The real benefit of using a Factory is that it is a façade which hides just how you go about creating an object that fulfills the Order role. To be more exact, the Factory knows that you're really making a FooBarOrder, and nothing else has to change to switch from always making a FooBarOrder to sometimes making a BarFooOrder instead. (If Java let you intercept new and substitute a subclass, there would be no need for Factories. But it doesn't, for fairly reasonable reasons, so you have to have them. Object systems which allow subclassing the class of classes are more flexible in this regard.)
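For illustration, a minimal C++ sketch of that façade idea (FooBarOrder and BarFooOrder come from this answer; the rest is an invented stand-in, not anyone's real API):
#include <memory>

// "Order" is the role the rest of the code programs against; the factory
// is the only place that knows which concrete type is actually made.
struct Order {
    virtual ~Order() = default;
    virtual void Process() = 0;
};

struct FooBarOrder : Order { void Process() override { /* ... */ } };
struct BarFooOrder : Order { void Process() override { /* ... */ } };

std::unique_ptr<Order> MakeOrder(bool preferBarFoo = false) {
    if (preferBarFoo)
        return std::make_unique<BarFooOrder>();
    return std::make_unique<FooBarOrder>();  // the old default, unchanged for callers
}
Callers only ever see Order, so moving from "always FooBarOrder" to "sometimes BarFooOrder" touches the factory alone.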

No, because the dependency for the factory should be injected via the factory's constructor; you construct the factory in only one place, but then pass it as a dependency to everything that needs to create an Order. The things which get orders from the factory still call the same method, CreateOrder() or whatever, so that code is unchanged.
The dependencies should all be wired up in a single place, the composition root, and that should be the only place that needs to change in order to add the new dependency to the factory.
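A rough sketch of what this answer describes, with invented names (OrderFactory, Customer) standing in for the real thing:
#include <memory>
#include <string>
#include <utility>

struct Customer { std::string name; };

struct Order {
    explicit Order(Customer c) : customer(std::move(c)) {}
    Customer customer;
};

class OrderFactory {
public:
    // The new dependency is injected here, once, at the composition root.
    explicit OrderFactory(Customer customer) : customer_(std::move(customer)) {}

    // Every existing call site keeps calling CreateOrder() with no arguments.
    std::unique_ptr<Order> CreateOrder() const {
        return std::make_unique<Order>(customer_);
    }

private:
    Customer customer_;
};

int main() {
    OrderFactory factory{Customer{"Alice"}};   // the composition root
    auto order = factory.CreateOrder();        // consumer code unchanged
}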

You tell the factory about the new dependency, and let it add it for you. The method call to the factory should be unchanged.

The factory pattern can ease the pain of adding a dependency, because a factory may contain state and, in fact, can encapsulate multiple dependencies (e.g. instead of providing three dependencies, all needed to invoke some object's constructor, you now provide only a single factory object, where the factory contains those three objects that are needed to be provided to the constructor).
To give an example, compare:
void DoIt(const DependencyA& a, const DependencyB& b) {
    // NOTE: "x" is a contrived additional variable that we add here to
    // justify why we didn't just pass DependencyC directly.
    int x = ComputeX();
    std::unique_ptr<DependencyC> dependency_c(new DependencyC(a, b, x));
    dependency_c->DoStuff();
}
And:
void DoIt(const DependencyCFactory& factory) {
    int x = ComputeX();
    std::unique_ptr<DependencyC> dependency_c(factory->Create(x));
    dependency_c->DoStuff();
}
Note that the second version requires fewer dependencies to be passed to the method "DoIt". This does not mean that those dependencies aren't needed in the entire program (indeed, the program still makes use of DependencyA and DependencyB in the implementation of the factory). However, by structuring it this way, that dependency can be isolated to just the factory code, which keeps other code simpler, makes it easier to change the dependencies of DependencyC (now only the factory itself needs to be updated, not every place that instantiates DependencyC), and can even have certain safety/security benefits (e.g. if DependencyA and DependencyB are sensitive, such as database passwords or API keys, limiting their usage to the factory reduces the chances of mishandling, compared to passing them around everywhere you need to use the database or API).
In the example given in the book, the reason why having a factory for Order would have helped is that it would have reduced the number of places where the constructor is used directly; only the one place that creates the factory would need to be modified to store the Customer as an additional field of the factory, and none of the other uses of the factory would need to change. By comparison, without the factory, direct uses of the constructor abound, and each one of them must be updated to somehow obtain access to the Customer object.

Related

Is there a design pattern that can deal with method call dependencies? [closed]

My problem is like this:
When you design a C++ class, in some cases some methods can only be called after some other methods have been called, or after some data member has been properly prepared.
I find it quite hard to deal with data-member dependencies that are often not obvious, especially when the class needs to be extended to support more features and functionality. It is really error-prone.
Execution speed of the code is not important in my case; code clarity/maintainability/sensibility matter more.
If method2 may be called only after method1, then the class has state. You want to keep that state consistent, and the methods which change or use it should not be able to damage it.
So you need encapsulation. Encapsulation is not about making a field private and creating a setField() method; it is about who may change the state. The right answer: only the object itself may change its state. If you have setters for every single field, you have an unprotected object, and control over keeping the state consistent has leaked out.
In practice, you could re-design the code so that the data can only be set up during the previous steps. In that case you don't need to check "is the data prepared?" every time method2 is called.
To avoid untimely calls there are several approaches, each with pros and cons. Suppose you have the chain state0 -> method1 -> state1 -> method2 -> state2 -> method3 -> state3.
Throw an exception if the object is in the wrong state. E.g. inside method1 add:
if (currentState.differs(state0)) throw exception;
This is the easiest approach to implement, but it won't help you much in terms of understanding and maintaining your project.
Use a combination of the State and Chain-of-Responsibility patterns. Divide the class into several classes, each of which accepts the previous one as an input parameter and returns the next class in the chain. E.g. Class0 would have a method with signature Class1 method1(state0), Class1 would have Class2 method2(class1), and Class2 would have state3 method3(class2). That way nobody can call Class2.method3 or method1(class3): it simply will not compile. As a side effect you get a lot of classes. You also get a fairly rigid process flow, though it can be more flexible than the next option. A sketch of the idea follows.
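Here is a minimal, simplified sketch of that compile-time chaining, using free functions instead of the separate classes the answer describes (all names are invented):
// Each step consumes the previous step's result type, so out-of-order
// calls simply fail to compile.
struct State0 {};
struct State1 { int data; };
struct State2 { int data; };
struct State3 { int result; };

State1 method1(State0)          { return State1{1}; }
State2 method2(const State1& s) { return State2{s.data + 1}; }
State3 method3(const State2& s) { return State3{s.data * 2}; }

int main() {
    State3 s3 = method3(method2(method1(State0{})));
    // method3(State0{});  // error: no matching function -- caught at compile time
    return s3.result;
}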
Use the Template Method pattern. Create a Processor class with a process() method and make sure that only this class can call the desired methods.
state3 process(state0) {
    prepareStuff();
    state1 = method1(state0);
    somePreparation(state1);
    state2 = method2(state1);
    anotherPrepare(state2);
    return method3(state2);
}
Then you can alter the process flow by subclassing Processor and overriding the preparation methods. Nobody can override process(). The disadvantage is that you always get the whole process and can't stop after method2 (in fact you can, but it would leak the state and you'd again have an uncontrollable process).
Also note policy-based design (policy templates).
In some sense both ways encapsulate the state of the process at a higher level.
There are other ways to implement call dependencies. No matter which you choose, you have to strictly limit the ability to call methods at arbitrary moments.
You may consider adding an enum state variable to record the state of your object, and handling state checking and transitions in the methods that need to be called sequentially.
And if the state becomes too complex, you should consider revising your class design and perhaps splitting it into smaller classes.
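A minimal sketch of the enum-state idea (the class, its methods and the guard are invented placeholders):
#include <stdexcept>

class Widget {
public:
    void prepare() {                     // must run before run()
        // ... set up data members ...
        state_ = State::Prepared;
    }

    void run() {
        if (state_ != State::Prepared)   // reject untimely calls at run time
            throw std::logic_error("Widget::run() called before prepare()");
        // ... do the actual work ...
        state_ = State::Finished;
    }

private:
    enum class State { Initial, Prepared, Finished };
    State state_ = State::Initial;
};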

C++ Different subclasses need different parameters

I'm looking for the best way to accomplish the following:
Background
I have a base class with a request() virtual method, and different subclasses provide alternate implementations of performing the requests. The idea is that I'd like to let the client instantiate one of these subclasses and pass the object to a subsystem, which will call request() when it needs to. The goal is to let the client decide how requests are handled by instantiating the desired subclass.
Problem
However, if a certain subclass implementation is chosen, it needs a piece of information from the subsystem, which would most naturally be passed as an argument to request() (i.e. request(special_info)). But other subclasses don't need this. Is there a clean way to hide this difference, or an appropriate design pattern that can be used here?
Thanks
Make the base request() method take the information as an argument, and ignore the argument in subclass implementations that don't need it.
Or pass the SubSystem instance itself to the handler, and let the handler get the information it needs from the SubSystem (ignoring it if it doesn't need anything). That makes the design more extensible: you wouldn't need to add another argument and refactor all the methods each time a new subclass needing additional information is introduced.
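For illustration, a minimal sketch of that second option (all names here are invented stand-ins, not a real API):
class SubSystem;   // forward declaration

struct RequestHandler {
    virtual ~RequestHandler() = default;
    // Every subclass receives the subsystem; implementations that need
    // nothing from it simply ignore the argument.
    virtual void request(SubSystem& sys) = 0;
};

class SubSystem {
public:
    int specialInfo() const { return 42; }   // stand-in for the real info
};

struct PlainHandler : RequestHandler {
    void request(SubSystem&) override { /* needs nothing extra */ }
};

struct SpecialHandler : RequestHandler {
    void request(SubSystem& sys) override {
        int info = sys.specialInfo();        // pulls only what it needs
        (void)info;                          // ... use info ...
    }
};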
JB Nizet's suggestion is one possible solution - it will certainly work.
What worries me a little is the rather vague notion that "some need more information". Where does this information come from, and what decides it? The general principle of inheritance is that you have a base class that does the right thing for all the objects. If you have to say "if it's a type A object or a type B object, do this; else if it's a type C object, do something slightly different; and if it's a type D object, do another kind of thing", then you're doing it wrong.
It may be that JB's suggestion is the right one for you, but I would also consider the option of passing special_info into the constructor, or fetching it via some helper function. The constructor solution is a sane one because, at construction time, you obviously need to know whether you are creating an A, B, C or D object. The helper function is a good solution at other times, but used badly it can lead to a messy design, so use it with care.
Generally, when things end up like this, it's because you are splitting the classes up "the wrong way".
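As a rough sketch of the constructor option (hypothetical names, and assuming the client already knows special_info when it picks the subclass):
struct RequestHandler {
    virtual ~RequestHandler() = default;
    virtual void request() = 0;   // uniform, argument-free interface
};

class SpecialHandler : public RequestHandler {
public:
    explicit SpecialHandler(int special_info) : info_(special_info) {}
    void request() override { /* ... uses info_ ... */ }
private:
    int info_;
};

class PlainHandler : public RequestHandler {
public:
    void request() override { /* needs nothing extra */ }
};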

Should dynamic dependencies of service objects be avoided?

This question is about testable software design based on mostly value objects and services.
Services that have static dependencies are straightforward to instantiate or configure when using a DI container. However, in some cases, services require dependencies that are known at runtime only.
Say, imagine a simple FileSystemDataStore with some CRUD methods in it for managing files in a directory. This service will need a directory name as one of its constructor parameters. That name could be known at runtime only and will have to be provided by its collaborators.
This seems to be somewhat of a problem because you can't configure such service in a DI container because of its dynamic nature. You'll probably have to use a factory to create such services. However, this will result in a quirk in the unit tests of the service's clients. You will have to mock the factory to return a mock of the service. This adds additional complexity to unit tests. Mocks returning mocks is often considered a test smell.
What is your opinion about this problem? Is it even a problem in your experience? Should such services be instead refactored to be more "pure"?
As a general observation, when services depend on run-time values, an Abstract Factory is indeed the appropriate response.
However, as pointed out in the question, this does have an impact on the maintainability of the tests, so if you can redesign the API to avoid such situations, you should do that. It's not always possible, though.
You would like to inject the directory name, but it is not known during the construction phase. I see three options here.
1. Inject a Provider
Instead of saying "here is the directory name you need", you are saying "here is an object that can give you the directory name at run time". The way to implement this is to declare a constructor argument Provider<String> directoryNameProvider. The constructor stores a reference to this provider in a member variable. When called upon to do some real work in the run phase, the class contains code like this wherever the directory name is needed:
directoryName = directoryNameProvider.get();
In Java, the interface you implement is javax.inject.Provider<T>. This has a single method, get(), which returns type T. The use of the generic provider interface means you do not have a proliferation of interfaces.
When it comes to your unit test, you can inject an anonymous inner class that implements the single method of Provider<T> to return a constant value easily enough. Our code base has a SimpleProvider<T> class that wraps a given object in the Provider interface.
Pro: Allows you to construct the object in the main construction phase. Unit testing is pretty easy.
Con: Details about dependency creation leak into the class when they should be entirely the factory's concern. Too bad if the class is already written and accepts directoryName rather than directoryNameProvider.
Despite the seemingly long list of cons, this is an option I use a lot. It is my opinion that there is a missing language construct here.
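In C++ terms (the answer above uses Java's javax.inject.Provider), the same idea can be sketched with std::function; FileSystemDataStore is taken from the question, everything else is invented:
#include <functional>
#include <string>
#include <utility>

class FileSystemDataStore {
public:
    using DirectoryNameProvider = std::function<std::string()>;

    explicit FileSystemDataStore(DirectoryNameProvider provider)
        : directoryNameProvider_(std::move(provider)) {}

    void save() {
        // The directory name is resolved at run time, not at construction.
        std::string dir = directoryNameProvider_();
        // ... write files under dir ...
    }

private:
    DirectoryNameProvider directoryNameProvider_;
};

// In a unit test, a lambda returning a constant replaces the provider:
//   FileSystemDataStore store([] { return std::string("/tmp/test"); });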
2. Construct the troublesome object later
You can enter an inner scope when you know more. Within a run-phase method, you can enter a new scope. This means that you go through a whole new mini-construction phase and then a mini-run phase. This is similar to what happens in your application's main(), but at a smaller level.
Pro: Class receiving the dependency remains pure.
Con: Entering and exiting too many scopes can make the application and object life-cycles difficult to understand.
3. Use a method argument
You can decide that directoryName is to be a method argument and pass it to your class during the run phase, rather than trying to inject it as a constructor argument. This is effectively deciding not to use the dependency-injection style on this occasion.
Pro: Simplicity
Con: The class that passes directoryName as a method parameter is tightly coupled to the class that needs it. It would be very difficult to provide an alternate implementation that depends on, say, a database connection.
These are matters I have been considering a lot lately, so I'm interested in any comments or edits. Are there any other options?

Single-use class

In a project I am working on, we have several "disposable" classes. What I mean by disposable is that they are classes where you call some methods to set up the info, and then you call what equates to a "doit" function. You doit once and throw the instance away; if you want to doit again, you have to create another instance of the class. The reason they're not reduced to single functions is that they must store state after they doit, so the user can get information about what happened, and it seems not very clean to return a bunch of things through reference parameters. It's not a singleton, but not a normal class either.
Is this a bad way to do things? Is there a better design pattern for this sort of thing? Or should I just give in and make the user pass in a boatload of reference parameters to return a bunch of things through?
What you describe is not a class (state + methods to alter it), but an algorithm (map input data to output data):
result_t do_it(parameters_t);
Why do you think you need a class for that?
Sounds like your class is basically a parameter block in a thin disguise.
There's nothing wrong with that IMO, and it's certainly better than a function with so many parameters it's hard to keep track of which is which.
It can also be a good idea when there are a lot of input parameters: several setup methods can set up a few of those at a time, so that the names of the setup functions give more of a clue as to which parameter is which. You can also cover different ways of setting up the same parameters using alternative setter functions, either overloads or different names. You might even use a simple state machine or flag system to ensure the correct setups are done.
However, it should really be possible to recycle your instances without having to delete and recreate. A "reset" method, perhaps.
As Konrad suggests, this is perhaps misleading. The reset method shouldn't be seen as a replacement for the constructor: it's the constructor's job to put the object into a self-consistent initialised state, not the reset method's. The object should be self-consistent at all times.
Unless there's a reason for making cumulative, running-total-style do-it calls, the caller should never have to call reset explicitly; it should be built into the do-it call as the first step.
I still decided, on reflection, to strike that out: not so much because of Jalf's comment, but because of the hairs I had to split to argue the point ;-). Basically, I figure I almost always have a reset method for this style of class, partly because my "tools" usually have multiple related kinds of "do it" (e.g. "insert", "search" and "delete" for a tree tool) and a shared mode. The mode is just some input fields, in parameter-block terms, but that doesn't mean I want to keep re-initializing them. But just because this pattern happens a lot for me doesn't mean it should be a point of principle.
I even have a name for these things (not limited to the single-operation case) - "tool" classes. A "tree_searching_tool" will be a class that searches (but doesn't contain) a tree, for example, though in practice I'd have a "tree_tool" that implements several tree-related operations.
Basically, even parameter blocks in C should ideally provide a kind of abstraction that gives them some order beyond being just a bunch of parameters. "Tool" is a (vague) abstraction. Classes are a major means of handling abstraction in C++.
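To make the parameter-block "tool" idea above concrete, here is a minimal sketch with flag-style setup checks and a reset method; the name echoes the tree_searching_tool mentioned in this answer, and the stand-in computation is invented:
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

class tree_searching_tool {
public:
    void set_key(std::string key) { key_ = std::move(key); }
    void set_limit(int limit)     { limit_ = limit; }

    int do_it() {
        if (!key_ || !limit_)                 // flag-style check that setup is complete
            throw std::logic_error("setup incomplete");
        int found = static_cast<int>(key_->size()) % *limit_;  // stand-in for the search
        result_ = found;
        return found;
    }

    int result() const { return result_.value(); }  // state kept for later queries

    void reset() { key_.reset(); limit_.reset(); result_.reset(); }

private:
    std::optional<std::string> key_;
    std::optional<int>         limit_;
    std::optional<int>         result_;
};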
I have used a similar design and wondered about this too. A fictitious, simplified example could look like this:
FileDownloader downloader(url);
downloader.download();
downloader.result(); // get the path to the downloaded file
To make it reusable I store it in a boost::scoped_ptr:
boost::scoped_ptr<FileDownloader> downloader;
// Download first file
downloader.reset(new FileDownloader(url1));
downloader->download();
// Download second file
downloader.reset(new FileDownloader(url2));
downloader->download();
To answer your question: I think it's ok. I have not found any problems with this design.
As far as I can tell you are describing a class that represents an algorithm. You configure the algorithm, then you run the algorithm and then you get the result of the algorithm. I see nothing wrong with putting those steps together in a class if the alternative is a function that takes 7 configuration parameters and 5 output references.
This structuring of code also has the advantage that you can split your algorithm into several steps and put them in separate private member functions. You can do that without a class too, but that can lead to the sub-functions having many parameters if the algorithm has a lot of state. In a class you can conveniently represent that state through member variables.
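A minimal sketch of that structure (the algorithm, its steps and all names are invented placeholders):
class Pathfinder {
public:
    Pathfinder(int start, int goal) : start_(start), goal_(goal) {}

    void run() {
        buildGraph();      // the private steps communicate through member
        search();          // variables instead of long parameter lists
        extractPath();
    }

    int cost() const { return cost_; }   // results queried after run()

private:
    void buildGraph()  { /* ... */ }
    void search()      { cost_ = goal_ - start_; /* stand-in computation */ }
    void extractPath() { /* ... */ }

    int start_;
    int goal_;
    int cost_ = 0;
};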
One thing you might want to look out for is that structuring your code like this could easily tempt you to use inheritance to share code among similar algorithms. If algorithm A defines a private helper function that algorithm B needs, it's easy to make that member function protected and then access that helper function by having class B derive from class A. It could also feel natural to define a third class C that contains the common code and then have A and B derive from C. As a rule of thumb, inheritance used only to share code in non-virtual methods is not the best way - it's inflexible, you end up having to take on the data members of the super class and you break the encapsulation of the super class. As a rule of thumb for that situation, prefer factoring the common code out of both classes without using inheritance. You can factor that code into a non-member function or you might factor it into a utility class that you then use without deriving from it.
YMMV - what is best depends on the specific situation. Factoring code into a common super class is the basis for the template method pattern, so when using virtual methods inheritance might be what you want.
Nothing especially wrong with the concept. You should try to set things up so that the objects in question can generally be automatically allocated (on the stack) rather than having to be newed; that is a significant performance saving in most cases. And you probably shouldn't use the technique for highly performance-sensitive code unless you know your compiler generates it efficiently.
I disagree that the class you're describing "is not a normal class". It has state and it has behavior. You've pointed out that it has a relatively short lifespan, but that doesn't make it any less of a class.
Short-lived classes vs. functions with out-params:
I agree that your short-lived classes are probably a little more intuitive and easier to maintain than a function which takes many out-params (or 1 complex out-param). However, I suspect a function will perform slightly better, because you won't be taking the time to instantiate a new short-lived object. If it's a simple class, that performance difference is probably negligible. However, if you're talking about an extremely performance-intensive environment, it might be a consideration for you.
Short-lived classes: creating new vs. re-using instances:
There are plenty of examples where instances of classes are re-used: thread pools, DB-connection pools (probably darn near any software construct ending in 'pool' :). In my experience, they are used when instantiating the object is an expensive operation. Your small, short-lived classes don't sound like they're expensive to instantiate, so I wouldn't bother trying to re-use them. You may find that whatever pooling mechanism you implement actually costs MORE (performance-wise) than simply instantiating new objects whenever needed.

Flexible application configuration in C++

I am developing a C++ application used to simulate a real-world scenario. Based on this simulation, our team is going to develop, test and evaluate different algorithms working within such a real-world scenario.
We need the possibility to define several scenarios (they might differ in a few parameters, but a future scenario might also require creating objects of new classes) and the possibility to maintain a set of algorithms (which is, again, a set of parameters but also the definition which classes are to be created). Parameters are passed to the classes in the constructor.
I am wondering which is the best way to manage all the scenario and algorithm configurations. It should be easily possible to have one developer work on one scenario with "his" algorithm and another developer working on another scenario with "his" different algorithm. Still, the parameter sets might be huge and should be "sharable" (if I defined a set of parameters for a certain algorithm in Scenario A, it should be possible to use the algorithm in Scenario B without copy&paste).
It seems like there are two main ways to accomplish my task:
Define a configuration file format that can handle my requirements. This format might be XML-based or custom. As there is no C#-like reflection in C++, it seems I would have to update the config-file parser each time a new algorithm class is added to the project (in order to convert a string like "MyClass" into a new instance of MyClass). I could create a name for every setup and pass this name as a command-line argument.
The pros: no compilation required to change a parameter and re-run, and I can easily store the whole config file with the simulation results.
The cons: it seems like a lot of effort, made especially hard because I am using a lot of template classes that have to be instantiated with given template arguments, and there is no IDE support for writing the file (at least without creating a whole XSD, which I would have to update every time a parameter/class is added).
Wire everything up in C++ code. I am not completely sure how I would do this in a way that separates all the different creation logic but still allows parameters to be reused across scenarios. I think I'd also give every setup a (string) name and use that name to select the setup via a command-line argument.
Pros: type safety, IDE support, no parser needed.
Cons: how can I easily store the setup with the results (maybe some serialization)? And it needs compilation after every parameter change.
Now here are my questions:
- What is your opinion? Did I miss important pros/cons?
- Did I miss a third option?
- Is there a simple way to implement the config-file approach that gives me enough flexibility?
- How would you organize all the factory code in the second approach? Are there any good C++ examples of something like this out there?
Thanks a lot!
There is a way to do this without templates or reflection.
First, you make sure that all the classes you want to create from the configuration file have a common base class. Let's call this MyBaseClass and assume that MyClass1, MyClass2 and MyClass3 all inherit from it.
Second, you implement a factory function for each of MyClass1, MyClass2 and MyClass3. The signatures of all these factory functions must be identical. An example factory function is as follows.
MyBaseClass* create_MyClass1(const Configuration& cfg)
{
    // Retrieve config variables and pass them as parameters
    // to the constructor.
    int age = cfg.lookupInt("age");
    std::string address = cfg.lookupString("address");
    return new MyClass1(age, address);
}
Third, you register all the factory functions in a map.
typedef MyBaseClass* (*FactoryFunc)(const Configuration&);

std::map<std::string, FactoryFunc> nameToFactoryFunc;
nameToFactoryFunc["MyClass1"] = &create_MyClass1;
nameToFactoryFunc["MyClass2"] = &create_MyClass2;
nameToFactoryFunc["MyClass3"] = &create_MyClass3;
Finally, you parse the configuration file and iterate over it to find all the entries that specify the name of a class. When you find such an entry, you look up its factory function in the nameToFactoryFunc map and invoke it to create the corresponding object.
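For illustration, that lookup-and-invoke step might look like the following sketch; it assumes the typedef and nameToFactoryFunc map above (plus <map>, <string> and <stdexcept>), and the error handling and entry structure are invented:
MyBaseClass* create_object(const std::string& className, const Configuration& cfg)
{
    auto it = nameToFactoryFunc.find(className);
    if (it == nameToFactoryFunc.end())
        throw std::runtime_error("unknown class name: " + className);
    return it->second(cfg);   // invoke the registered factory function
}

// While iterating the parsed configuration, something like:
//   MyBaseClass* obj = create_object(entry.className, cfg);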
If you don't use XML, it's possible that boost::spirit could short-circuit at least some of the problems you are facing. Here's a simple example of how config data could be parsed directly into a class instance.
I found this website with a nice template-based factory which I think I will use in my code.