Doctrine 2 best practices for helper methods

I often use helper methods in my models. For example, if my model has a 'date' property, I might have getDateShort() or getDateLong() methods. Is it ok to put these methods in my entities, or should I wrap a model class around the entity and consider the entity as a model resource?

This is a pretty broad question.
In general, there's nothing wrong with adding those kinds of methods to your entities directly. A good rule of thumb is that such methods are okay if they only concern themselves with data internal to the entity.
That said, doing things like formatting dates might be better put into some external helper class, as you might want to do consistent formatting on dates from several different entity classes.
With your date example, you might consider just implementing a getter that takes an optional format parameter, which you could use like this:
<?php
$myEntity = $em->find('MyEntity', 1);
echo $myEntity->getCreatedAt(MY_DATE_FORMAT_SHORT);
In that example, we assume there are some globally defined MY_DATE_FORMAT_* constants. For more specific-to-this-entity type things, you might define class constants instead.
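The format-parameter idea can be sketched language-neutrally; here is a minimal Python analogue (the entity name, constants, and format strings are illustrative, not Doctrine API):

```python
from datetime import datetime

# Illustrative format constants, analogous to the MY_DATE_FORMAT_* idea above.
DATE_FORMAT_SHORT = "%Y-%m-%d"
DATE_FORMAT_LONG = "%A, %d %B %Y %H:%M"

class MyEntity:
    def __init__(self, created_at: datetime):
        self._created_at = created_at

    def get_created_at(self, fmt: str = DATE_FORMAT_SHORT) -> str:
        # The getter formats on demand; the stored value stays a datetime.
        return self._created_at.strftime(fmt)

entity = MyEntity(datetime(2024, 1, 15, 9, 30))
print(entity.get_created_at())                  # 2024-01-15
print(entity.get_created_at(DATE_FORMAT_LONG))
```

The stored value never changes; only the presentation does, which keeps formatting decisions out of the entity's state.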
Going back to your question, here's another example of something I think is perfectly okay to do inside an Entity:
<?php
/**
 * @Entity
 */
class Order
{
    // ...

    /**
     * @OneToMany(targetEntity="OrderLine")
     */
    protected $lines = array();

    // ...

    public function getTotal()
    {
        $sum = 0;
        foreach ($this->lines as $line) {
            $sum += $line->getTotal();
        }
        $sum += $this->computeTax();
        $sum += $this->applyCoupons();
        return $sum;
    }
}
So various bits of internal (either directly, or via associations) data go into computing the order total. But since it's only internally-visible data, it makes good sense to include getTotal() as a method on Order.
If, on the other hand, you were making API calls to some shipping service to include shipping charges in the total, it would make more sense to keep that complexity away from the entity, in some sort of service class.

Architecture: Modifying the model in different ways

Problem statement
I have a model class that looks something like (extremely simplified; some members and many, many methods omitted for clarity):
class MyModelItem
{
public:
    enum ItemState
    {
        State1,
        State2
    };

    QString text() const;
    ItemState state() const;

private:
    QString _text;
    ItemState _state;
};
It is a core element of the application and is used in many different parts of the code:
It is serialized/deserialized into/from various file formats
It can be written into or read from a database
It can be updated by an 'import', that reads a file and applies changes to the currently loaded in-memory model
It can be updated by the user through various GUI functions
The problem is, this class has grown over the years and now has several thousand lines of code; it has become a prime example of how to violate the Single Responsibility Principle.
It has methods for setting the 'text', 'state', etc. directly (after deserialization), and a parallel set of methods for setting them from within the UI, which has side effects like updating 'lastChangedDate', 'lastChangedUser', and so on. Some methods or groups of methods exist even more than twice, with every one of them doing basically the same thing, but slightly differently.
When developing new parts of the application, you are very likely to pick the wrong one of the five different ways of manipulating MyModelItem, which makes development extremely time consuming and frustrating.
Requirements
Given this historically grown and overly complex class, the goal is to separate all different concerns of it into different classes, leaving only the core data members in it.
Ideally, I would prefer a solution where a MyModelItem object has nothing but const members for accessing the data and modifications can only be made using special classes.
Every one of these special classes could then contain an actual concrete implementation of the business logic (a setter of 'text' could do something like "if the text to be set begins with a certain substring and the state equals 'State1', set it to 'State2'").
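One such rule-bearing setter class might look like this (Python used as neutral pseudocode; the prefix, class names, and rule details are invented for illustration):

```python
from enum import Enum

class ItemState(Enum):
    STATE1 = 1
    STATE2 = 2

class MyModelItemData:
    # Dumb data holder: only the core members, no business logic.
    def __init__(self):
        self.text = ""
        self.state = ItemState.STATE1

class TextMutator:
    """One of the 'special classes' that owns the business rule for 'text'."""
    SPECIAL_PREFIX = "cmd:"  # illustrative 'certain substring' from the example

    def __init__(self, item: MyModelItemData):
        self._item = item

    def set_text(self, text: str) -> None:
        # The rule from the text: if the new text begins with a certain
        # substring and the state equals State1, set it to State2.
        if text.startswith(self.SPECIAL_PREFIX) and self._item.state is ItemState.STATE1:
            self._item.state = ItemState.STATE2
        self._item.text = text

item = MyModelItemData()
TextMutator(item).set_text("cmd:reload")
print(item.state)  # ItemState.STATE2
```

The data class stays inert; each mutator class carries exactly one concern.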
First part of the solution
For loading and storing the whole model, which consists of many MyModelItem objects and some more, the Visitor pattern looks like a promising solution. I could implement several visitor classes for different file formats or database schemas and have a save and load method in MyModelItem, which accept such a visitor object each.
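A minimal sketch of that visitor approach (Python as neutral pseudocode; a real implementation would write to a file or database instead of an in-memory list):

```python
class MyModelItem:
    def __init__(self, text: str, state: int):
        self.text = text
        self.state = state

    def save(self, visitor) -> None:
        # The item hands its data to the visitor; the visitor decides
        # whether that becomes JSON, XML, SQL rows, etc.
        visitor.visit_item(self)

class JsonSaveVisitor:
    """One concrete visitor per file format or database schema."""
    def __init__(self):
        self.records = []

    def visit_item(self, item: MyModelItem) -> None:
        self.records.append({"text": item.text, "state": item.state})

model = [MyModelItem("a", 1), MyModelItem("b", 2)]
visitor = JsonSaveVisitor()
for item in model:
    item.save(visitor)
print(visitor.records)  # [{'text': 'a', 'state': 1}, {'text': 'b', 'state': 2}]
```

Adding a new format then means adding one visitor class, without touching MyModelItem.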
Open question
When the user enters a specific text, I want to validate that input. The same validation must be made if the input comes from another part of the application, which means I can not move the validation into the UI (in any case, UI-only-validation is often a bad idea). But if the validation happens in the MyModelItem itself, I have two problems again:
The separation of concerns, which was the goal to begin with is negated. All the business logic code is still 'dumped' into the poor model.
When called by other parts of the application, this validation has to look differently. Implementing different validating-setter-methods is how it is done right now, which has a bad code smell.
It is clear now that the validation has to be moved outside both the UI and the model, into some sort of controller (in a MVC sense) class or collection of classes. These should then decorate/visit/etc the actual dumb model class with its data.
Which software design pattern fits best to the described case, to allow for different ways of modifying the instances of my class?
I am asking, because none of the patterns I know solves my problem entirely and I feel like I'm missing something here...
Thanks a lot for your ideas!
Plain strategy pattern seems the best strategy to me.
What I understand from your statement is that:
The model is mutable.
The mutation may happen through different sources (i.e. different classes).
The model must validate each mutation attempt.
Depending on the source of an attempt, the validation process differs.
The model is oblivious of the source and the process; its prime concern is the state of the object it is modeling.
Proposal:
Let the Sources be the classes which somehow mutate the model: the deserializers, the UI, the importers, etc.
Let a Validator be an interface/super-class which holds the basic validation logic. It can have methods like validateText(String), validateState(ItemState), ...
Every Source has-a Validator. That Validator may be an instance of the base Validator, or may inherit from it and override some of its methods.
Every Validator has-a reference to the model.
A Source first sets its own Validator, then makes the mutation attempt.
now,
Source1                     Model                     Validator
   |     setText("aaa")       |                           |
   |------------------------->|    validateText("aaa")    |
   |                          |-------------------------->|
   |                          |                           |
   |                          |        setState(2)        |
   |                          |<--------------------------|
   |          true            |                           |
   |<-------------------------|                           |
The behavior of different validators might differ.
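Here is one way the proposal might look in code (Python as neutral pseudocode; the validation rules are invented for illustration):

```python
class Model:
    def __init__(self):
        self.text = ""

class BaseValidator:
    """Base validation strategy; sources may subclass and override."""
    def __init__(self, model: Model):
        self.model = model

    def validate_text(self, text: str) -> bool:
        return len(text) > 0

class ImportValidator(BaseValidator):
    # An importer might be stricter than the base rule.
    def validate_text(self, text: str) -> bool:
        return super().validate_text(text) and not text.startswith("#")

class Source:
    """Anything that mutates the model: UI, importer, deserializer..."""
    def __init__(self, validator: BaseValidator):
        self.validator = validator

    def set_text(self, text: str) -> bool:
        if self.validator.validate_text(text):
            self.validator.model.text = text
            return True
        return False

model = Model()
ui = Source(BaseValidator(model))
importer = Source(ImportValidator(model))
print(ui.set_text("#note"))        # True: base rule only checks non-empty
print(importer.set_text("#note"))  # False: importer rejects '#' lines
```

Each source carries its own strategy, and the model itself never needs to know who is calling.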
Although you don't state it explicitly, refactoring thousands of lines of code is a daunting task, and I imagine that some incremental process is preferred over an all-or-nothing one.
Furthermore, the compiler should help as much as possible to detect errors. If it is a lot of work and frustration now to figure out which methods should be called, it will be even worse if the API has been made uniform.
Therefore, I would propose to use the Facade pattern, mostly for this reason:
wrap a poorly designed collection of APIs with a single well-designed API (as per task needs)
Because that is basically what you have: a collection of APIs in one class, that needs to be separated into different groups. Each group would get its own Facade, with its own calls. So the current MyModelItem, with all its carefully crafted different method invocations over the years:
...
void setText(String s);
void setTextGUI(String s); // different name
void setText(int handler, String s); // overloading
void setTextAsUnmentionedSideEffect(int state);
...
becomes:
class FacadeInternal {
setText(String s);
}
class FacadeGUI {
setTextGUI(String s);
}
class FacadeImport {
setText(int handler, String s);
}
class FacadeSideEffects {
setTextAsUnmentionedSideEffect(int state);
}
If we move the current data members of MyModelItem into MyModelItemData, then we get:
class MyModelItem {
MyModelItemData data;
FacadeGUI& getFacade(GUI client) { return FacadeGUI::getInstance(data); }
FacadeImport& getFacade(Importer client) { return FacadeImport::getInstance(data); }
}
GUI::setText(MyModelItem& item, String s) {
//item.setTextGUI(s);
item.getFacade(this).setTextGUI(s);
}
Of course, implementation variants exist here. It could equally well be:
GUI::setText(MyModelItem& item, String s) {
myFacade.setTextGUI(item, s);
}
That is more dependent on restrictions on memory, object creation, concurrency, etc. The point is that up till now, it is all straight forward (I won't say search-and-replace), and the compiler helps every step of the way to catch errors.
The nice thing about the Facade is that it can form an interface to multiple libraries/classes. After splitting things up, the business rules are all in several Facades, but you can refactor them further:
class FacadeGUI {
MyModelItemData data;
GUIValidator validator;
GUIDependentData guiData;
setTextGUI(String s) {
if (validator.validate(data, s)) {
guiData.update(withSomething)
data.setText(s);
}
}
}
and the GUI code won't have to be changed one bit.
After all that you might choose to normalize the Facades, so that they all have the same method names. It isn't necessary, though, and for clarity's sake it might even be better to keep the names distinct. Regardless, once again the compiler will help validate any refactoring.
(I know I stress the compiler bit a lot, but in my experience, once everything has the same name, and works through one or more layers of indirection, it becomes a pain to find out where and when something is actually going wrong.)
Anyway, this is how I would do it, as it allows for splitting up large chunks of code fairly quickly, in a controlled manner, without having to think too much. It provides a nice stepping stone for further tweaking. I guess that at some point the MyModelItem class should be renamed to MyModelItemMediator.
Good luck with your project.
If I understand your problem correctly, then I would not decide yet which design pattern to choose. I have seen code like this several times before, and the main problem in my view was always that change was built upon change built upon change.
The class has lost its original purpose and is now serving multiple purposes, none of which are clearly defined. The result is a big class (or a big database, spaghetti code, etc.) which seems indispensable yet is a nightmare for maintenance.
The big class is the symptom of a process that has gone out of control. It is where you can see it happen, but my guess is that once this class has been recovered, a lot of other classes will be next in line for redesign. If I am correct, there is probably also a lot of data corruption, because in many cases the definition of the data is unclear.
My advice would be: go back to your customer, talk about the business processes, reorganize the project management of the application, and try to find out whether the application is still serving the business process well. It might not be - I have been in this type of situation several times in different organizations.
If the business process is understood and the data is converted in line with the new data model, then you can replace the application with a new design, which is much easier to create. The big class that now exists no longer has to be reorganized, because its reason for existence is gone.
It costs money, but the maintenance also costs money. A good indicator that a redesign is due is when new features are no longer implemented, because doing so has become too expensive or error prone.
I will try to give you a different perspective on your situation. Please note that the explanations are written in my own words, for simplicity's sake. However, the terms mentioned come from the enterprise application architecture patterns.
You are designing the business logic of the application. So MyModelItem must be some kind of business entity. I would say what you have is an Active Record.
Active Record: business entity that can CRUD itself, and can manage
the business logic related to itself.
The business logic contained in the Active Record has grown and has become hard to manage. That's a very typical situation with Active Records. This is where you switch from the Active Record pattern to the Data Mapper pattern.
Data Mapper: a mechanism (typically a class) managing the mapping, typically between the entity and the data it translates from/to. It
comes into existence when the mapping concerns of the Active Record have
matured to the point that they need to be put into a separate class. Mapping becomes logic of its own.
So, here we come to the obvious solution: create a Data Mapper for the MyModelItem entity. Simplify the entity so that it does not handle its own mapping. Migrate the mapping management to the Data Mapper.
If MyModelItem takes part in an inheritance hierarchy, consider creating an abstract Data Mapper and a concrete Data Mapper for each concrete class you want to map differently.
Several notes on how I would implement it:
Make the entity aware of its mapper.
The mapper is the finder of the entity, so the application always starts from the mapper.
The entity should expose the functionality that is natural to find on it.
And the entity makes use of the (abstract or concrete) mapper for doing the concrete things.
In general, you should model your application without the data in mind. Then design the mapper to manage the transformations from objects to data and vice versa.
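A minimal Data Mapper sketch along those lines (Python as neutral pseudocode; a plain dict stands in for the database):

```python
class MyModelItem:
    # Entity: models the domain, knows nothing about rows or columns.
    def __init__(self, item_id: int, text: str):
        self.item_id = item_id
        self.text = text

class MyModelItemMapper:
    """Owns the object<->data translation; also the finder for the entity."""
    def __init__(self, storage: dict):
        self._storage = storage  # stand-in for a real database table

    def find(self, item_id: int) -> MyModelItem:
        row = self._storage[item_id]
        return MyModelItem(item_id, row["text"])

    def save(self, item: MyModelItem) -> None:
        self._storage[item.item_id] = {"text": item.text}

db = {}
mapper = MyModelItemMapper(db)
mapper.save(MyModelItem(1, "hello"))
loaded = mapper.find(1)
print(loaded.text)  # hello
```

The entity stays free of persistence concerns; swapping the storage format means swapping the mapper, not the entity.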
Now about validation
If the validation is the same in all the cases, then implement it in the entity, as that sounds natural to me. In most cases, this approach is sufficient.
If the validation differs and depends on something, abstract that something away and call the validation through the abstraction. One way (if it depends on the inheritance hierarchy) would be to put the validation in the mapper, or to have it in the same family of objects as the mapper, created by a common Abstract Factory.

Decorator design pattern vs. inheritance?

I've read the decorator design pattern from Wikipedia, and code example from this site.
I see the point that traditional inheritance follows an 'is-a' pattern whereas decorator follows a 'has-a' pattern, and that the calling convention of a decorator looks like a 'skin' over a 'skin' ... over a 'core', e.g.
I* anXYZ = new Z( new Y( new X( new A ) ) );
as demonstrated in above code example link.
However there are still a couple of questions that I do not understand:
What does the wiki mean by 'The decorator pattern can be used to extend (decorate) the functionality of a certain object at run-time'? The 'new ...(new... (new...))' is a run-time call and is good, but an 'AwithXYZ anXYZ;' is inheritance at compile time and is bad?
From the code example link I can see that the number of class definitions is almost the same in both implementations. I recall that other design-pattern books, like 'Head First Design Patterns', use Starbuzz coffee as an example and say traditional inheritance causes a 'class explosion', because you would end up with a class for each combination of coffee.
But isn't it the same for decorator in this case? If a decorator class can take ANY abstract class and decorate it, then I guess it does prevent explosion, but in the code example you have the exact same number of class definitions, no fewer...
Would anyone explain?
Let's take some abstract streams for example and imagine you want to provide encryption and compression services over them.
With decorator you have (pseudo code):
Stream plain = Stream();
Stream encrypted = EncryptedStream(Stream());
Stream zipped = ZippedStream(Stream());
Stream zippedEncrypted = ZippedStream(EncryptedStream(Stream()));
Stream encryptedZipped = EncryptedStream(ZippedStream(Stream()));
With inheritance, you have:
class Stream() {...}
class EncryptedStream() : Stream {...}
class ZippedStream() : Stream {...}
class ZippedEncryptedStream() : EncryptedStream {...}
class EncryptedZippedStream() : ZippedStream {...}
1) with decorator, you combine the functionality at runtime, depending on your needs. Each class only takes care of one facet of functionality (compression, encryption, ...)
2) In this simple example, we have 3 classes with decorators, and 5 with inheritance. Now let's add some more services, e.g. filtering and clipping. With decorators you need just 2 more classes to support all possible scenarios, e.g. filtering -> clipping -> compression -> encryption.
With inheritance, you need to provide a class for each combination so you end up with tens of classes.
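The stream example can be made concrete with toy operations; here 'compression' truncates to three characters and 'encryption' reverses the text, purely so the effect of composition order is visible (Python, illustrative only):

```python
class Stream:
    def write(self, data: str) -> str:
        return data  # base stream: pass-through

class EncryptedStream(Stream):
    """Decorator: toy 'encryption' that reverses the text."""
    def __init__(self, inner: Stream):
        self.inner = inner

    def write(self, data: str) -> str:
        return self.inner.write(data[::-1])

class ZippedStream(Stream):
    """Decorator: toy 'compression' that keeps the first three characters."""
    def __init__(self, inner: Stream):
        self.inner = inner

    def write(self, data: str) -> str:
        return self.inner.write(data[:3])

zipped_encrypted = ZippedStream(EncryptedStream(Stream()))
encrypted_zipped = EncryptedStream(ZippedStream(Stream()))
print(zipped_encrypted.write("abcdef"))  # cba  (truncate first, then reverse)
print(encrypted_zipped.write("abcdef"))  # fed  (reverse first, then truncate)
```

Three classes cover every ordering at run-time; with inheritance, each ordering would need its own subclass.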
In reverse order:
2) With, say, 10 different independent extensions, any combination of which might be needed at run time, 10 decorator classes will do the job. To cover all possibilities by inheritance you'd need 1024 subclasses. And there'd be no way of getting around massive code redundancy.
1) Imagine you had those 1024 subclasses to choose from at run time. Try to sketch out the code that would be needed. Bear in mind that you might not be able to dictate the order in which options are picked or rejected. Also remember that you might have to use an instance for a while before extending it. Go ahead, try. Doing it with decorators is trivial by comparison.
You are correct that they can be very similar at times. The applicability and benefits of either solution will depend on your situation.
Others have beaten me to adequate answers to your second question. In short, it is that you can combine decorators to achieve more combinations, which you cannot do with inheritance.
As such I focus on the first:
You cannot strictly say compile-time is bad and run-time is good; it is just different flexibility. The ability to change things at run-time can be important for some projects, because it allows changes without recompilation, which can be slow and requires an environment where you can compile.
An example where you cannot use inheritance is when you want to add functionality to an already-instantiated object. Suppose you are provided an instance of an object that implements a logging interface:
public interface ILog{
//Writes string to log
public void Write( string message );
}
Now suppose you begin a complicated task that involves many objects, each of which does logging, so you pass along the logging object. However, you want every message from the task to be prefixed with the task Name and Task Id. You could pass around a function, or pass along the Name and Id and trust every caller to follow the rule of prepending that information, or you could decorate the logging object before passing it along and not have to worry about the other objects doing it right:
public class PrependLogDecorator : ILog {
    ILog decorated;
    string prefix;

    public PrependLogDecorator( ILog toDecorate, string messagePrefix ){
        this.decorated = toDecorate;
        this.prefix = messagePrefix;
    }

    public void Write( string message ){
        decorated.Write( prefix + message );
    }
}
Sorry about the C# code but I think it will still communicate the ideas to someone who knows C++
To address the second part of your question (which might in turn address your first part), using the decorator method you have access to the same number of combinations, but don't have to write them. If you have 3 layers of decorators with 5 options at each level, you have 5*5*5 possible classes to define using inheritance. Using the decorator method you need 15.
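The arithmetic behind that claim, spelled out (assuming the options at each level are independent):

```python
layers = 3   # decoration levels
options = 5  # choices at each level

# Inheritance: one subclass per combination of choices.
inheritance_classes = options ** layers

# Decorator: one decorator class per choice, reusable at any level.
decorator_classes = options * layers

print(inheritance_classes)  # 125
print(decorator_classes)    # 15
```

The gap widens quickly: the inheritance count grows exponentially with the number of levels, while the decorator count grows linearly.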
First off, I'm a C# person and haven't dealt with C++ in a while, but hopefully you get where I'm coming from.
A good example that comes to mind is a DbRepository and a CachingDbRepository:
public interface IRepository {
object GetStuff();
}
public class DbRepository : IRepository {
public object GetStuff() {
//do something against the database
}
}
public class CachingDbRepository : IRepository {
    private readonly IRepository repo;

    public CachingDbRepository(IRepository repo){
        this.repo = repo;
    }

    public object GetStuff() {
        //check the cache first; fall back to the decorated repository
        if(its_not_there) {
            return repo.GetStuff();
        }
        return whats_in_the_cache;
    }
}
So, if I just used inheritance, I'd have a DbRepository and a CachingDbRepository. The DbRepository would query from a database; the CachingDbRepository would check its cache and if the data wasn't there, it would query a database. So there's a possible duplicate implementation here.
By using the decorator pattern, I still have the same number of classes, but my CachingDbRepository takes in a IRepository and calls its GetStuff() to get the data from the underlying repo if it's not in the cache.
So the number of classes is the same, but the use of the classes is related: CachingDbRepository calls the repo that was passed into it... so it's more like composition over inheritance.
I find it subjective to decide when to use plain inheritance over decoration.
I hope this helps. Good luck!

Best practices for a class with many members

Any opinions on the best way to organize members of a class in C++ (esp. when there are many)? In particular, a class with lots of user parameters, e.g. a class that optimizes some function and has a number of parameters such as the # of iterations, the size of the optimization step, the specific method to use, optimization function weights, etc. I've tried several general approaches and always seem to find something non-ideal with each. Just curious about others' experiences.
struct within the class
struct outside the class
public member variables
private member variables with Set() & Get() functions
To be more concrete, the code I'm working on tracks objects in a sequence of images. One important aspect is that it needs to preserve state between frames (which is why I didn't just write a bunch of functions). Significant member functions include initTrack(), trackFromLastFrame(), and isTrackValid(). And there are a bunch of user parameters (e.g. how many points to track per object, how much a point can move between frames, which tracking method to use, etc.).
If your class is BIG, then your class is BAD.
A class should respect the Single Responsibility Principle, i.e.: a class should do only one thing, but should do it well. ("Only one" thing is extreme, but it should have only one role, and that role should be implemented clearly.)
Then you create classes that you enrich by composition with those single-role little classes, each one having a clear and simple role.
BIG functions and BIG classes are a nest for bugs, misunderstanding, and unwanted side effects (especially during maintenance), because NO MAN can learn 700 lines of code in minutes.
So the policy for BIG classes is: refactor, and compose with little classes that target only what they have to do.
If I had to choose one of the four solutions you listed: a private struct within the class.
In reality: you probably have duplicate code which should be reused, and your class should be reorganized into smaller, more logical and reusable pieces. As GMan said: refactor your code.
First, I'd partition the members into two sets: (1) those that are internal-only use, (2) those that the user will tweak to control the behavior of the class. The first set should just be private member variables.
If the second set is large (or growing and changing because you're still doing active development), then you might put them into a class or struct of their own. Your main class would then have two methods, GetTrackingParameters and SetTrackingParameters. The constructor would establish the defaults. The user could then call GetTrackingParameters, make changes, and then call SetTrackingParameters. Now, as you add or remove parameters, your interface remains constant.
If the parameters are simple and orthogonal, then they could be wrapped in a struct with well-named public members. If there are constraints that must be enforced, especially combinations, then I'd implement the parameters as a class with getters and setters for each parameter.
ObjectTracker tracker; // invokes constructor which gets default params
TrackerParams params = tracker.GetTrackingParameters();
params.number_of_objects_to_track = 3;
params.other_tracking_option = kHighestPrecision;
tracker.SetTrackingParameters(params);
// Now start tracking.
If you later invent a new parameter, you just need to declare a new member in the TrackerParams and initialize it in ObjectTracker's constructor.
It all depends:
A struct within the class would only be useful if you need to organize VERY many items. And if this is the case, you ought to reconsider your design.
A struct outside the class would be useful if it will be shared with other instances of the same or different classes. (A model, or data-object class/struct, might be a good example.)
Public member variables are only ever advisable for trivial, throw-away code.
Private member variables with Set() and Get() functions are the standard way of doing things, but it all depends on how you'll be using the class.
Sounds like this could be a job for a template, the way you described the usage.
template <typename FUNCTION, typename METHOD, typename PARAMS>
class FunctionOptimizer;
for example, where PARAMS encapsulates simple optimization run parameters (# of iterations etc) and METHOD contains the actual optimization code. FUNCTION describes the base function you are targeting for optimization.
The main point is not that this is the 'best' way to do it, but that if your class is very large there are likely smaller abstractions within it that lend themselves naturally to refactoring into a less monolithic structure.
However you handle this, you don't have to refactor all at once - do it piecewise, starting small, and make sure the code works at every step. You'll be surprised how much better you quickly feel about the code.
I don't see any benefit whatsoever to making a separate structure to hold the parameters. The class is already a struct - if it were appropriate to pass parameters by a struct, it would also be appropriate to make the class members public.
There's a tradeoff between public members and Set/Get functions. Public members are a lot less boilerplate, but they expose the internal workings of the class. If this is going to be called from code that you won't be able to refactor if you refactor the class, you'll almost certainly want to use Get and Set.
Assuming that the configuration options apply only to this class, use private variables that are manipulated by public functions with meaningful function names. SetMaxInteriorAngle() is much better than SetMIA() or SetParameter6(). Having getters and setters allows you to enforce consistency rules on the configuration, and can be used to compensate for certain amounts of change in the configuration interface.
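A small sketch of such a setter enforcing a consistency rule (Python as neutral pseudocode; the names follow the SetMaxInteriorAngle example and the rules are invented):

```python
class TrackerConfig:
    def __init__(self):
        self._min_angle = 10.0
        self._max_angle = 170.0

    def set_max_interior_angle(self, degrees: float) -> None:
        # A setter can enforce rules that a public member never could.
        if not 0.0 < degrees <= 180.0:
            raise ValueError("angle must be in (0, 180]")
        if degrees < self._min_angle:
            raise ValueError("max angle cannot be below the min angle")
        self._max_angle = degrees

    def max_interior_angle(self) -> float:
        return self._max_angle

cfg = TrackerConfig()
cfg.set_max_interior_angle(120.0)
print(cfg.max_interior_angle())  # 120.0
```

With public members, nothing stops a caller from setting the max below the min; the setter makes that invariant enforceable in one place.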
If these are general settings, used by more than one class, then an external class would be best, with private members and appropriate functions.
Public data members are usually a bad idea, since they expose the class's implementation and make it impossible to have any guaranteed relation between them. Walling them off in a separate internal struct doesn't seem useful, although I would group them in the list of data members and set them off with comments.

How to unit-test private code without refactoring to separate class?

Assume I have a private routine that performs some calculation:
private function TCar.Speed: float
{
Result = m_furlogs * 23;
}
But now i want to begin testing this calculation more thoroughly, so i refactor it out to a separate function:
public function TCar.Speed: float
{
Result = CalculateSpeed(m_furlogs);
}
private function TCar.CalculateSpeed(single furlogs): float
{
Result = furlogs * 23;
}
Now I can perform all kinds of tests on CalculateSpeed:
Check( CalculateSpeed(0) = 0);
Check( CalculateSpeed(1) = 23);
Check( CalculateSpeed(2) = 46);
Check( CalculateSpeed(88) = -1);
Except that I can't perform these tests, because CalculateSpeed is private to TCar. A basic tenet of unit-testing is that you never test private code, only public interfaces. As a practical matter, xUnit frameworks are not normally structured to access private methods of the class being tested.
The issue is that none of the rest of the class is set up to handle unit tests. This is the very first routine that will have testing of any kind. And it is very difficult to configure the host class with a set of initial conditions that would allow me to call CalculateSpeed with every set of inputs I would like.
The only alternative i can see, is moving this private calculation out into it's own TCarCalculateSpeed class:
public class TCarCalculateSpeed
{
public function CalculateSpeed(float furlogs)
{
Result = furlogs * 23;
}
}
A whole class, dedicated to exposing one method that's supposed to be private, just so I can test it?
Class explosion.
Plus, it's private. If I wanted it to be public, I'd rather promote it to public visibility - at least that way I save a separate class being created.
I'd like to add some unit testing, but it can only be done in small pieces, as code changes. I can't completely redesign functioning 12-year-old software, possibly breaking everything, because I wanted to test one internal calculation.
My current best thinking is to add a Test method to my Car class, and just call that:
TCar Car = new TCar();
Car.RunTests;
public procedure TCar.RunTests
{
Check( CalculateSpeed(0) = 0);
Check( CalculateSpeed(1) = 23);
Check( CalculateSpeed(2) = 46);
Check( CalculateSpeed(88) = -1);
}
But now I have to figure out how to have TCar.RunTests get triggered by the external TestRunner, which is only designed to use TestCase classes.
Note: I've tried my damnedest to mix syntax from a bunch of languages. In other words: language agnostic.
This can't really be quite language-agnostic, as the protection mechanisms and the tactics to bypass them vary quite widely with language.
But most languages do provide bypasses in some form, and as others have noted, there are sometimes protections midway between private and public that make testing easier.
In Java, for example, reflection can be used to access private members if you really need to, and things can be made protected or package-private so that you don't need reflection.
Generally speaking, if something is complex enough to require testing it should not be buried as a private method or class in something else. It is doing something that warrants its own class.
Rather than worrying about the number of classes, worry about their size and complexity. Many small classes adhering to the Single Responsibility Principle are better than a small number of classes doing complex things internally.
If a method is complicated (and risky) enough to test on its own, it's worth creating a class for it or making it a public member of the existing class - whichever is more suitable, given the characteristics of the existing class.
Can you create multiple instances of the TCar class with different initial values of m_furlogs? Do you have a getter on speed anywhere? You could validate against that if so.
If it's only used internally and you really want to test it, you could create a utilities class that holds the logic for the simple calculations. I know it's refactoring, but it's not the class explosion you might be envisioning.
In some languages, there is middle ground between private and public. You can expose a logically private method to a unit test without exposing it to the world.
In method documentation, you can document that the method is intended private, but accessible for the purpose of unit-testing.
In Java, for example, you could make the private method package protected, and place the unit test in the same package. In C#, if I recall correctly, you could make it internal. In C++, the unit-test could be a friend.
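Python's middle ground is the single-underscore naming convention rather than enforced access control, so the same idea looks like this (a sketch; the names mirror the CalculateSpeed example):

```python
class TCar:
    def __init__(self, furlogs: float = 0.0):
        self._furlogs = furlogs

    def speed(self) -> float:
        return self._calculate_speed(self._furlogs)

    def _calculate_speed(self, furlogs: float) -> float:
        # Leading underscore: documented as 'intended private,
        # accessible for the purpose of unit-testing'.
        return furlogs * 23

# A test can call the conventionally-private method directly:
print(TCar()._calculate_speed(2))  # 46
```

Linters will flag external use of underscore-prefixed names, which gives roughly the same social contract as Java's package-private.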
If your language supports compiler defines, you could use these to your advantage.
(Example code in Delphi)
In your unit test project, set a compiler conditional define, either using the project options or in an include file that you include in each and every unit in that project.
{$DEFINE UNIT_TESTS}
In your class code, check the conditional define and switch between public or protected and private accordingly:
{$IFDEF UNIT_TESTS}
public // or protected
{$ELSE}
private
{$ENDIF}
function CalculateSpeed: float;
This means your unit tests will have the access they need to your method, while in production code it will still be private.

Unit-testing private methods: Facade pattern

Lots of developers think that testing private methods is a bad idea. However, all the examples I've found were based on the idea that private methods are private because calling them could break the object's internal state. But that's not the only reason to hide methods.
Let's consider the Facade pattern. My class's users need just 2 public methods, but implemented directly each one would be too large. In my example, they need to load some complex structure from a database BLOB, parse it, fill some temporary COM objects, run the user's macro to validate and modify these objects, and serialize the modified objects to XML. Quite a lot of functionality for a single method :-) Most of these actions are required by both public methods. So I've created about 10 private methods, and the 2 public methods call them. Actually, my private methods don't have to be private; they won't break the instance's internal state. But when I choose not to test private methods, I have the following problems:
Publishing them means complexity for users (they have a choice they don't need)
I cannot imagine TDD style for such large public methods, where you have to write 500+ lines of code just to return something (not even the real result).
Data for these methods is retrieved from the database, and testing DB-related functionality is much more difficult.
When I'm testing private methods:
I don't publish details that would confuse users. Public interface includes 2 methods.
I can work in TDD style (write small methods step-by-step).
I can cover most of the class's functionality using test data, without a database connection.
Could somebody describe what I am doing wrong? What design should I use to obtain the same benefits without testing private methods?
UPDATE: It seems to me I've already extracted everything I could into other classes, so I cannot imagine what else I could extract. Loading from the database is performed by the ORM layer; parsing the stream, serializing to XML, and running macros are all done by standalone classes. This class contains a quite complex data structure, routines for searching and conversion, and calls to all the mentioned utilities. So I don't think anything else can be extracted; otherwise its responsibility (knowledge of the data structure) would be divided between classes.
So the best solution I see now is dividing it into 2 objects (the Facade itself and a real object, whose private methods become public) and moving the real object somewhere nobody would think to look for it. In my case (Delphi) that would be a standalone unit; in other languages it could be a separate namespace. A similar option is 2 interfaces; thanks for the idea.
I think you are putting too many responsibilities (implementations) into the facade. I would normally consider this to be a front-end for actual implementations that are in other classes.
So the private methods in your facade are likely to be public methods in one or more other classes. Then you can test them there.
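A sketch of that split (all names hypothetical): the facade keeps only the two public methods and delegates each formerly-private step to a collaborator whose methods are public and testable on their own:

```java
// Hypothetical collaborator: owns one formerly-private step.
class BlobParser {
    public String parse(byte[] blob) {
        return new String(blob); // stand-in for the real parsing logic
    }
}

// Another collaborator: the serialization step, also publicly testable.
class XmlSerializer {
    public String serialize(String data) {
        return "<data>" + data + "</data>";
    }
}

// The facade still exposes only the two methods its users need.
public class OrderFacade {
    private final BlobParser parser = new BlobParser();
    private final XmlSerializer serializer = new XmlSerializer();

    public String loadAndExport(byte[] blob) {
        return serializer.serialize(parser.parse(blob));
    }
}
```

BlobParser and XmlSerializer can each be unit-tested in isolation; the facade's own test only needs to check the wiring between them.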
"Could somebody describe what I am doing wrong?"
Maybe nothing?
If I want to test a method, I make it default (package) scope and test it.
You already mentioned another good solution: create an interface with your two methods. Your clients access those two methods, and the visibility of the other methods doesn't matter.
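A minimal sketch of that interface approach (names hypothetical): clients depend only on the interface, so the implementation class can keep its helper methods public for testing without cluttering the API users see:

```java
// Clients program against this: only the two operations they need.
interface OrderProcessor {
    String load(int id);
    String export(int id);
}

// The implementation's helper methods are public so tests can call
// them directly, but clients holding an OrderProcessor never see them.
class OrderProcessorImpl implements OrderProcessor {
    public String load(int id) {
        return "order-" + id; // stand-in for the real loading logic
    }

    public String export(int id) {
        return toXml(load(id));
    }

    // Formerly private; now public but hidden behind the interface.
    public String toXml(String data) {
        return "<order>" + data + "</order>";
    }
}
```

A factory or DI container hands clients an OrderProcessor, while the test suite instantiates OrderProcessorImpl and exercises toXml and friends directly.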
Private methods are used to encapsulate some behavior that has no meaning outside of the class you are trying to test. You should never have to test private methods because only the public or protected methods of the same class will ever call private methods.
It may just be that your class is very complex and it will take significant effort to test it. However, I would suggest you look for abstractions that you can break out into their own classes. These classes will have a smaller scope of items and complexity to test.
I am not familiar with your requirements and design, but it seems that your design is procedural rather than object-oriented: you have 2 public methods and many private methods. If you break your class into objects where every object has its own role, it will be easier to test each of the "small" classes. In addition, you can set the helper objects' access level to package (the default in Java; I know there is a similar access level in C#). This way you are not exposing them in the API, but you can unit-test them independently (as they are units).
Maybe take the time to watch the Clean Code Tech Talks from Miško Hevery. He is very insightful about how code should be written in order to be testable.
This is a bit of a controversial topic. Most TDDers hold the opinion that refactoring your methods for easier unit testing actually makes your design better. I think that is often true, but the specific case of private methods in public APIs is definitely an exception. So yes, you should test private methods, and no, you shouldn't make them public.
If you're working in Java, here's a utility method I wrote that will help you test static private methods in a class:
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import junit.framework.Assert;

public static Object invokeStaticPrivateMethod(Class<?> clazz, String methodName, Object... params) {
    Assert.assertNotNull(clazz);
    Assert.assertNotNull(methodName);
    Assert.assertNotNull(params);

    // find the requested method
    final Method[] methods = clazz.getDeclaredMethods();
    for (int i = 0; i < methods.length; ++i) {
        if (methodName.equals(methods[i].getName())) {
            try {
                // this line makes testing private methods possible :)
                methods[i].setAccessible(true);
                // the receiver is null because the method is static
                return methods[i].invoke(null, params);
            } catch (IllegalArgumentException ex) {
                // maybe the method is overloaded - try another method with the same name
                continue;
            } catch (IllegalAccessException ex) {
                Assert.fail("IllegalAccessException accessing method '" + methodName + "'");
            } catch (InvocationTargetException ex) {
                // purge the unnecessary stack trace to make it easier
                // to see where the test failed
                if (ex.getCause() instanceof RuntimeException) {
                    throw (RuntimeException) ex.getCause();
                } else {
                    throw new RuntimeException(ex.getCause());
                }
            }
        }
    }
    Assert.fail("method '" + methodName + "' not found");
    return null;
}
This could probably be rewritten for non-static methods as well, but those pesky private methods usually are private so I never needed that. :)
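For reference, here is a condensed, self-contained version of the same reflection trick (the target class and method are hypothetical), using only java.lang.reflect:

```java
import java.lang.reflect.Method;

// Hypothetical class with a private static method we want to exercise.
class MathHelper {
    private static int square(int x) {
        return x * x;
    }
}

public class PrivateMethodDemo {
    // Invokes MathHelper.square reflectively, as a unit test would.
    public static int invokeSquare(int x) throws Exception {
        Method m = MathHelper.class.getDeclaredMethod("square", int.class);
        m.setAccessible(true); // the line that makes this possible
        return (Integer) m.invoke(null, x); // null receiver: static method
    }

    public static void main(String[] args) throws Exception {
        System.out.println(invokeSquare(7)); // prints 49
    }
}
```

The same pattern works for instance methods by passing an object instead of null to invoke.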
Suppose you have 8 private methods and 2 public ones. If you can execute a private method independently, i.e. without calling any of the other methods and without state-corrupting side effects, then unit testing just that method makes sense. But in that case there is no need for the method to be private!
In C# I would make such methods protected instead of private, and expose them as public in a subclass for testing.
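The same subclass-for-testing idea sketched in Java (names hypothetical): the production class uses protected, and a subclass that lives only in the test sources re-exposes the method as public:

```java
// Production class: the method is protected purely so the test
// subclass below can reach it.
public class Vehicle {
    protected double calculateSpeed(int furlongs, double seconds) {
        return furlongs * 201.168 / seconds;
    }
}

// Lives only in the test sources: widens the method to public.
class TestableVehicle extends Vehicle {
    @Override
    public double calculateSpeed(int furlongs, double seconds) {
        return super.calculateSpeed(furlongs, seconds);
    }
}
```

Tests instantiate TestableVehicle and call calculateSpeed directly; production code only ever sees Vehicle.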
Given your scenario, it might make more sense to make the testable methods public, and give the user a true facade with only the 2 public methods they need for their interface.