unit testing a factory method - unit-testing

Suppose I have several OrderProcessors, each of which handles an order a little differently.
The decision about which OrderProcessor to use is made according to the properties of the Order object, by a factory method, like so:
public IOrderProcessor CreateOrderProcessor(IOrdersRepository repository, Order order, DiscountPercentages discountPercentages)
{
    if (order.Amount > 5 && order.Unit.Price < 8)
    {
        return new DiscountOrderProcessor(repository, order, discountPercentages.FullDiscountPercentage);
    }
    if (order.Amount < 5)
    {
        // Offer a more modest discount
        return new DiscountOrderProcessor(repository, order, discountPercentages.ModestDiscountPercentage);
    }
    return new OutrageousPriceOrderProcessor(repository, order);
}
Now, my problem is that I want to verify that the returned OrderProcessor has received the correct parameters (for example, the correct discount percentage).
However, those properties are not public on the OrderProcessor entities.
How would you suggest I handle this scenario?
The only solution I was able to come up with is making the discount percentage property of the OrderProcessors public, but it seems like overkill to do that just for the purpose of unit testing...

One way around this is to change the fields you want to test to internal instead of private and then set the project's internals visible to the testing project. You can read about this here: http://msdn.microsoft.com/en-us/library/system.runtime.compilerservices.internalsvisibletoattribute.aspx
You would do something like this in your AssemblyInfo.cs file:
[assembly:InternalsVisibleTo("Orders.Tests")]
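With that in place, the processor can expose the value as an internal property rather than a public one, and the test project can read it directly. A rough sketch (the property name and the way the factory is invoked are assumptions, not from the original code):
internal decimal DiscountPercentage { get; private set; }   // hypothetical internal member on DiscountOrderProcessor

// In Orders.Tests, the factory result can then be checked directly:
var processor = (DiscountOrderProcessor)CreateOrderProcessor(repository, order, discountPercentages);
Assert.AreEqual(discountPercentages.FullDiscountPercentage, processor.DiscountPercentage);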
That said, you could argue that your unit tests should not necessarily care about the private fields of your class. Maybe it's better to pass the values into the factory method and write unit tests for the expected result when some method (say, Calculate() or something similar) is called on the interface.
Another approach would be to unit test the concrete types (DiscountOrderProcessor, etc.) and confirm the return values of their public methods/properties. Then write unit tests for the factory method, verifying that it returns the correct type of interface implementation.
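For example, a type-check test for the factory might look like this (NUnit/FakeItEasy-style sketch; the Order and DiscountPercentages initializers, and calling the factory method directly, are assumptions):
[Test]
public void CreateOrderProcessor_ForSmallOrder_ReturnsDiscountOrderProcessor()
{
    var repository = A.Fake<IOrdersRepository>();   // any test double will do here
    var discounts = new DiscountPercentages { FullDiscountPercentage = 20, ModestDiscountPercentage = 5 };
    var order = new Order { Amount = 3 };            // hits the "Amount < 5" branch

    IOrderProcessor processor = CreateOrderProcessor(repository, order, discounts);

    // Only the returned type is verified, not any private state
    Assert.IsInstanceOf<DiscountOrderProcessor>(processor);
}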
These are the approaches I usually take when writing similar code, however there are many different ways to tackle a problem like this. I would recommend figuring out where you would get the most value in unit tests and write according to that.

If discount percentage is not public, then it's not part of the IOrderProcessor contract and therefore doesn't need to be verified. Just have a set of unit tests for the DiscountOrderProcessor to verify it's properly computing your discounts based on the discount percent passed in via the constructor.
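For instance, a minimal sketch of such a test (Process() and the Order initializer are assumptions, since the real OrderProcessor API isn't shown in the question):
[Test]
public void DiscountOrderProcessor_AppliesTheConstructorDiscount()
{
    var repository = A.Fake<IOrdersRepository>();
    var order = new Order { Amount = 4, Unit = new Unit { Price = 10 } };

    // 25% discount injected via the constructor, exactly as the factory would do
    var processor = new DiscountOrderProcessor(repository, order, 25m);

    // 4 * 10 = 40, minus 25% => 30
    Assert.AreEqual(30m, processor.Process());
}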

You have a couple of choices as I see it. You could create specializations of DiscountOrderProcessor:
public class FullDiscountOrderProcessor : DiscountOrderProcessor
{
    public FullDiscountOrderProcessor(IOrdersRepository repository, Order order, DiscountPercentages discountPercentages)
        : base(repository, order, discountPercentages.FullDiscountPercentage)
    { }
}
public class ModestDiscountOrderProcessor : DiscountOrderProcessor
{
    public ModestDiscountOrderProcessor(IOrdersRepository repository, Order order, DiscountPercentages discountPercentages)
        : base(repository, order, discountPercentages.ModestDiscountPercentage)
    { }
}
and check for the correct type returned.
You could pass in a factory for creating the DiscountOrderProcessor which just takes an amount; then you could check that it was called with the correct parameters (see the sketch below).
You could provide a virtual method to create the DiscountOrderProcessor and check that it is called with the correct params.
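For the injected-factory option, a rough sketch (IDiscountOrderProcessorFactory and OrderProcessorFactory are hypothetical names, the verification uses FakeItEasy, and the Create signature is shown with the full argument list rather than just the amount, for clarity):
public interface IDiscountOrderProcessorFactory
{
    IOrderProcessor Create(IOrdersRepository repository, Order order, decimal discountPercentage);
}

// In the test, hand the order-processor factory a fake and verify the call:
var repository = A.Fake<IOrdersRepository>();
var order = new Order { Amount = 3 };                          // small order => modest discount expected
var discounts = new DiscountPercentages { ModestDiscountPercentage = 5 };
var discountFactory = A.Fake<IDiscountOrderProcessorFactory>();

var sut = new OrderProcessorFactory(discountFactory);          // hypothetical: the factory method now delegates creation
sut.CreateOrderProcessor(repository, order, discounts);

A.CallTo(() => discountFactory.Create(repository, order, discounts.ModestDiscountPercentage)).MustHaveHappened();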
I quite like the first option personally, but all of these approaches suffer from the same problem: in the end you can't check the actual value, so someone could change your discount amounts and you wouldn't know. Even with the first approach you'd end up not being able to test what the value applied to FullDiscountOrderProcessor was.
You need to have someway to check the actual values which leaves you with:
You could make the properties public (or internal, using InternalsVisibleTo) so you can interrogate them.
You could take the returned object and check that it correctly applies the discount to some object which you pass in to it.
Personally I'd go for making the properties internal, but it depends on how the objects interact and if passing a mock object in to the discount order processor and verifying that it is acted on correctly is simple then this might be a better solution.


Please explain the Repository-, Mapping-, and Business-Layer relations and responsibilities

I've been reading a lot about this stuff and I am currently in the middle of the development of a larger web-application and its corresponding back-end.
However, I've started with a design where I ask a Repository to fetch data from the database and map it into a DTO. Why DTO? Simply because until now basically everything was simple stuff and no more complexity was necessary. If it got a bit more complex then I started to map e.g. 1-to-n relations directly in the service layer. Something like:
// This is Service-Layer
public List<CarDTO> getCarsFromOwner(Long carOwnerId) {
    // Entering Repository-Layer
    List<CarDTO> cars = this.carRepository.getCars(carOwnerId);
    Map<Long, List<WheelDTO>> wheelMap = this.wheelRepository.getWheels(carOwnerId);
    for (CarDTO car : cars) {
        List<WheelDTO> wheels = wheelMap.get(car.getId());
        car.setWheels(wheels);
    }
    return cars;
}
This works of course but it turns out that sometimes things are getting more complex than this and I'm starting to realize that the code might look quite ugly if I don't do anything about this.
Of course, I could load wheelMap in the CarRepository, do the wheel-mapping there and only return complete objects, but since SQL queries can sometimes look quite complex I don't want to fetch all cars and their wheels plus taking care of the mapping in getCars(Long ownerId).
I'm clearly missing a Business-Layer, right? But I'm simply not able to get my head around its best practice.
Let's assume I have Car and Owner business objects. Would my code look something like this:
// This is Service-Layer
public List<CarDTO> getCarsFromOwner(Long carOwnerId) {
    // The new Business-Layer
    CarOwner carOwner = new CarOwner(carOwnerId);
    List<Car> cars = carOwner.getAllCars();
    return cars;
}
which looks as simple as it can be, but what would happen on the inside? The question is aiming especially at CarOwner#getAllCars().
I imagine that this function would use Mappers and Repositories in order to load the data and that especially the relational mapping part is taken care of:
List<CarDTO> cars = this.carRepository.getCars(carOwnerId);
Map<Long, List<WheelDTO>> wheelMap = this.wheelRepository.getWheels(carOwnerId);
for (CarDTO car : cars) {
    List<WheelDTO> wheels = wheelMap.get(car.getId());
    car.setWheels(wheels);
}
But how? Is the CarMapper providing functions getAllCarsWithWheels() and getAllCarsWithoutWheels()? This would also move the CarRepository and the WheelRepository into CarMapper but is this the right place for a repository?
I'd be happy if somebody could show me a good practical example for the code above.
Additional Information
I'm not using an ORM - instead I'm going with jOOQ. It's essentially just a type-safe way to write SQL (and it's quite fun to use, btw).
Here is an example how that looks like:
public List<CompanyDTO> getCompanies(Long adminId) {
    LOGGER.debug("Loading companies for user ..");
    Table<?> companyEmployee = this.ctx.select(COMPANY_EMPLOYEE.COMPANY_ID)
        .from(COMPANY_EMPLOYEE)
        .where(COMPANY_EMPLOYEE.ADMIN_ID.eq(adminId))
        .asTable("companyEmployee");
    List<CompanyDTO> fetchInto = this.ctx.select(COMPANY.ID, COMPANY.NAME)
        .from(COMPANY)
        .join(companyEmployee)
        .on(companyEmployee.field(COMPANY_EMPLOYEE.COMPANY_ID).eq(COMPANY.ID))
        .fetchInto(CompanyDTO.class);
    return fetchInto;
}
The Repository pattern belongs to the group of patterns for data access objects and usually means an abstraction of storage for objects of the same type. Think of a Java collection that you can use to store your objects - what methods does it have? How does it operate?
By this definition, a Repository cannot work with DTOs - it is a storage of domain entities. If you have only DTOs, then you need a more generic DAO or, probably, the CQRS pattern. It is common to have a separate interface and implementation of a Repository, as is done, for example, in Spring Data (it generates the implementation automatically, so you only have to specify the interface, probably inheriting basic CRUD operations from the common superinterface CrudRepository).
Example:
class Car {
    private long ownerId;
    private List<Wheel> wheels;
}

@Repository
interface CarRepository extends CrudRepository<Car, Long> {
    List<Car> findByOwnerId(long id);
}
Things get complicated when your domain model is a tree of objects and you store them in a relational database. By the definition of this problem you need an ORM. Every piece of code that loads relational content into an object model is an ORM, so your repository will have an ORM as its implementation. Typically, JPA ORMs do the wiring of objects behind the scenes; simpler solutions like custom mappers based on jOOQ or plain JDBC have to do it manually. There's no silver bullet that will solve all ORM problems efficiently and correctly: if you have chosen to write custom mapping, it's still better to keep the wiring inside the repository, so the business layer (services) will operate with true object models.
In your example, CarRepository knows about Cars. Car knows about Wheels, so CarRepository already has a transitive dependency on Wheels. In the CarRepository#findByOwnerId() method you can either fetch the Wheels for a Car directly in the same query by adding a join, or delegate this task to WheelRepository and then only do the wiring. The user of this method will receive a fully-initialized object tree. Example:
class CarRepositoryImpl implements CarRepository {
    public List<Car> findByOwnerId(long id) {
        // pseudocode for some database query interface
        String sql = select(CARS).join(WHEELS);
        final Map<Long, Car> carIndex = new HashMap<>();
        execute(sql, record -> {
            long carId = record.get(CAR_ID);
            Car car = carIndex.computeIfAbsent(carId, key -> new Car());
            ... // map the car if necessary
            Wheel wheel = ...; // map the wheel
            car.addWheel(wheel);
        });
        return carIndex.values().stream().collect(toList());
    }
}
What's the role of the business layer (sometimes also called the service layer)? The business layer performs business-specific operations on objects and, if these operations are required to be atomic, manages transactions. Basically, it knows when to signal transaction start, commit and rollback, but has no underlying knowledge about what these messages will actually trigger in the transaction manager implementation. From the business layer's perspective, there are only operations on objects, transaction boundaries and isolation, and nothing else. It does not have to be aware of mappers or whatever sits behind the Repository interface.
In my opinion, there is no correct answer.
This really depends upon the chosen design decision, which itself depends upon your stack and your/team's comfort.
Case 1:
I disagree with the highlighted sections in your statement below:
"Of course, I could load wheelMap in the CarRepository, do the wheel-mapping there and only return complete objects, but since SQL queries can sometimes look quite complex I don't want to fetch all cars and their wheels plus taking care of the mapping in getCars(Long ownerId)"
The SQL join for the above will be simple. Additionally, it might be faster, as databases are optimized for joins and fetching data.
Now, I called this approach Case 1 because it is what you would follow if you decide to pull data via your repository using SQL joins, instead of just using SQL for simple CRUD and then manipulating objects in Java (Case 2 below).
Case 2: The repository is just used to fetch data for "each" domain object, which corresponds to "one" table.
In this case what you are doing is already correct.
If you will never use WheelDTO separately, then there is no need to make a separate service for it. You can prepare everything in the Car service.
But if you need WheelDTO separately, then make different services for each. In this case, there can be a helper layer on top of the service layer to perform object creation. I do not suggest implementing an ORM from scratch by making repositories and loading all joins for every repo (use Hibernate or MyBatis directly instead).
Again IMHO, whichever approach you take out of the above, the service or business layers just complement it. So there is no hard and fast rule; try to be flexible according to your requirements. If you decide to use an ORM, some of the above will again change.

Setting Custom Coders & Handling Parameterized types

I have two questions related to coder issues I am facing with my Dataflow pipeline.
How do I go about setting a coder for my custom data types? The class consists of just three items - two doubles and another parameterized property. I tried annotating the type with SerializableCoder but I still end up with the error "com.google.cloud.dataflow.sdk.coders.CannotProvideCoderException: Cannot provide coder based on value with class interface java.util.Set: No CoderFactory has been registered for the class." The Set actually contains the parameterized custom data-type - so I am assuming that the custom datatype is the problem. I could not find enough documentation/examples on the right way to do this. Please point me to the right place if it's available.
Even without the custom datatype, whenever I try switching to a parameterized version of Transform functions, it results in coder errors. Specifically, inside a complex transform which is parameterized, a ParDo works with parameterized types but when I apply a Combine.PerKey on the resulting PCollection after the ParDo, it results in the CoderNotFoundException.
Any help regarding these two items would be appreciated, as I have been stuck on this for some time now.
It looks like you have been bitten by two issues. Thanks for bringing them to our attention! Fortunately, there are easy workarounds for both while we improve things.
The first issue is that the default coder registry does not have an entry for mapping Set.class to SetCoder. We have filed GitHub issue #56 to track its resolution. In the meantime, you can use the following code to perform the needed registration:
pipeline.getCoderRegistry().registerCoder(Set.class, SetCoder.class);
The second issue is that parameterized types currently require advanced treatment in the coder registry, so the @DefaultCoder annotation will not be honored. We have filed GitHub issue #57 to track this. The best way to ensure that SerializableCoder is used everywhere for CustomType is to register a CoderFactory for your type that will return a SerializableCoder. Supposing your type is something like this:
public class CustomType<T extends Serializable> implements Serializable {
T field;
}
Then the following code registers a CoderFactory that produces appropriate SerializableCoder instances:
pipeline.getCoderRegistry().registerCoder(CustomType.class, new CoderFactory() {
    @Override
    public Coder<?> create(List<? extends Coder<?>> componentCoders) {
        // No matter what the T is, return SerializableCoder
        return SerializableCoder.of(CustomType.class);
    }

    @Override
    public List<Object> getInstanceComponents(Object value) {
        // Return the T inside your CustomType<T> to enable coder inference for Create
        return Collections.singletonList(((CustomType<Object>) value).field);
    }
});
Now, whenever you use CustomType in your pipeline, the coder registry will produce a SerializableCoder.
Note that SerializableCoder is not deterministic (the bytes of encoded objects are not necessarily equal for objects that are equals()) so values encoded using this coder cannot be used as keys in a GroupByKey operation.

What is the controller allowed to assume about what it receives from a service?

Quick terminology question that's somewhat related to my main question: What is the correct term for a model class, and the term for an instance of that class?
I am learning Test Driven Development, and want to make sure I am learning it the right so I can form good habits.
My current project has a SalesmanController, which is pretty much a basic resource controller. Here's my current issue (I can get it working, but I want to make sure it's done as "right" as possible):
I have a 'Salesman' model.
The 'Salesman' is mapped as having many 'Sales' using my ORM.
The 'Sales' model is mapped as belongsTo 'Salesman' using my ORM.
I have created a ORMSalesmanRepository which implements the SalesmanRepositoryInterface.
My controller has SalesmanRepositoryInterface passed to it upon construction (constructor dependency injection).
My controller calls the find method on the SalesmanRepositoryInterface implementation it has been given to find the correct salesman.
My view needs information on the salesman and information on all 'Sales' records that belong to him.
The current implementation of SalesmanRepositoryInterface returns an instance of an ORMRecord, and then passes that to the view, which retrieves the sales from the ORMRecord.
My gut tells me this implementation is wrong. The orm record implements Array Access, so it still behaves like an array as far as the view knows.
However, when trying to go back and implement my unit tests I am running into issues mocking my dependencies. (I now know with TDD I'm supposed to make my unit tests and then develop the actual implementation, didn't figure this out till recently).
$Salesman = $MockedSalesman;
$SalesRecords = $MockedSalesman->Sales;
Is it poor programming to expect my Salesman to return an ORMObject for the controller to use (for chaining relationships maybe?), or is a controller becoming too 'fat' if I'm allowing it to call more than just basic get methods (using ArrayAccess []) on the ORMObject? Should the controller just assume that whatever it gets back is an array (or at least acts like one)?
Also, should it ever come up where something one of my mocked classes returns needs to be mocked again?
Thanks in advance everybody.
What is the correct term for a model class and the term for a instance of that class?
It depends on what you mean by "model classes". Technically, the model is a layer that contains several groups of classes. The most notable ones would be: mappers, services and domain objects. Domain objects as a whole are an implementation of accumulated knowledge about business requirements, insight from specialists and project goals. This knowledge is referred to as the "domain model".
Basically, there is no such thing as a "model class". There are classes that are part of the model.
What is the controller allowed to assume about what it recieves from a service?
Nothing, because the controller should not receive anything from the model layer. The responsibility of the controller is to alter the state of the model layer (and in rare cases, the state of the current view).
The controller is NOT RESPONSIBLE for:
gathering data from the model layer,
initializing views,
passing data from the model layer to views,
dealing with authorization checks
The current implementation of SalesmanRepositoryInterface returns an instance of a ORMRecord, and then passes that to the view, which retrieves the sales from the ORMRecord.
It sounds like you are implementing the active record pattern. It has a very limited use-case where it is appropriate to use AR - when the object mostly consists of getters and setters that are directly stored in a single table. For anything beyond that, active record becomes an anti-pattern because it violates SRP and you lose the ability to test your domain logic without a database.
Also, a repository should be returning an instance of a domain object and making sure that you are not retrieving data repeatedly. Repositories are not factories for active record instances.
Is it poor programming to expect my Salesman to return a ORMObject for the controller to use (for chaining relationships maybe?) or is a controller becoming to 'fat' if I'm allowing it to call more than just basic get methods (using arrayaccess []) on the ORMObject?
Yes, it's bad code. Your application logic (the kind that would usually be contained in services) is leaking into the presentation layer.
At this stage I wouldn't stress too much about your implementation - if you try and write proper tests, you'll quickly find out what works and what doesn't.
The trick is to think hard about what each component is trying to achieve. What is your controller method supposed to do? Most likely it is intended to create a ViewModel of some kind, and then choose which View to render. So there's a few tests right there:
When I call my controller method with given arguments (ShowSalesmanDetail(5))
It should pick the correct View to render ('ShowSalesmanDetail')
It should construct the ViewModel that I expect (A Salesman object with some Sales)
In theory, at this point you don't care how the controller constructs the model, only that it does. In practice though you do need to care, because the controller has dependencies (the big one being the database), which you need to cater for. You've chosen to abstract this with a Repository class that talks to an ORM, but that shouldn't impact the purpose of your tests (though it will definitely alter how you implement those tests).
Ideally the Salesman object in your example would be a regular class, with no dependencies of its own. This way, your repository can construct a Salesman object by populating it from the database/ORM, and your unit tests can also use Salesman objects that you've populated yourself with test data. You shouldn't need to mock your models or data classes.
My personal preference is that you don't take your 'data entities' (what you get back from your ORM) and put them into Views. I would construct a ViewModel class that is tied to one View, and then map from your data entities to your ViewModel. In your example, you might have a Salesman class which represents the data in the database, then a SalesmanModel which represents the information you are displaying on the page (which is usually a subset of what's in the db).
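As a rough sketch of that split, using the class shapes implied by the test below (none of this is meant as the definitive design):
// Data entity - what the repository/ORM hands back
public class Salesman
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Address { get; set; }
    public IEnumerable<Sale> Sales { get; set; }
}

// ViewModel - only what the ShowSalesman view actually displays
public class SalesmanModel
{
    public int Id { get; set; }
    public string Name { get; set; }
    public IEnumerable<SaleModel> Sales { get; set; }
}

// Mapping done in the controller (or a dedicated mapper)
var model = new SalesmanModel
{
    Id = salesman.Id,
    Name = salesman.Name,
    Sales = salesman.Sales.Select(s => new SaleModel { OrderId = s.OrderId })
};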
So you might end up with a unit test looking something like this:
public void CallingShowSalesmanShouldReturnAValidModel()
{
    ISalesmanRepository repository = A.Fake<ISalesmanRepository>();
    SalesmanController controller = new SalesmanController(repository);

    const int salesmanId = 5;
    Salesman salesman = new Salesman
    {
        Id = salesmanId,
        Name = "Joe Bloggs",
        Address = "123 Sesame Street",
        Sales = new[]
        {
            new Sale { OrderId = 123, SaleDate = DateTime.Now.AddMonths(-1) }
        }
    };
    A.CallTo(() => repository.Find(salesmanId)).Returns(salesman);

    ViewResult result = controller.ShowSalesman(salesmanId) as ViewResult;
    SalesmanModel model = result.Model as SalesmanModel;

    Assert.AreEqual(salesman.Id, model.Id);
    Assert.AreEqual(salesman.Name, model.Name);

    SaleModel saleModel = model.Sales.First();
    Assert.AreEqual(salesman.Sales.First().OrderId, saleModel.OrderId);
}
This test is by no means ideal but hopefully gives you an idea of the structure. For reference, the A.Fake<> and A.CallTo() stuff is from FakeItEasy; you could replace that with your mocking framework of choice.
If you were doing proper TDD, you wouldn't have started with the Repository - you'd have written your Controller tests, probably got them passing, and then realised that having all this ORM/DB code in the controller is a bad thing, and refactored it out. The same approach should be taken for the Repository itself and so on down the dependency chain, until you run out of things that you can mock (the ORM layer most likely).

TDD behavior testing with no getters/setters

I'm applying TDD to my first event-centric project (CQRS, event sourcing, etc.) and I'm writing my tests according to Greg Young's simple testing framework: Given, When, Expect. My test fixture takes a command, command handler and aggregate root and then tests the events outputted.
CommandTestFixture<TCommand, TCommandHandler, TAggregateRoot>
For example here is a typical test
[TestFixture]
public class When_moving_a_group :
CommandTestFixture<MoveGroup, MoveGroupHandler, Foo>
I am very happy with these tests on the whole, but with the above test I've hit a problem. The aggregate root contains a collection of groups. The command MoveGroup reorders the collection, taking a from & to index. I set up the test and asserted that the correct GroupMoved event was generated with the correct data.
As an additional test I need to assert that the reordering of the Groups collection actually took place correctly. How do I do this when the aggregate root has no public getters/setters? I could add a method to retrieve the group at a particular index, but isn't this breaking encapsulation simply to be testable?
What's the correct way to go about this?
EDIT
The reordering of the groups takes place in the GroupMoved handler on the Aggregate root.
private void Apply(GroupMoved e)
{
    var moved = groups[e.From];
    groups.RemoveAt(e.From);
    groups.Insert(e.To, moved);
}
The friction here comes because you want to assert something about the internal implementation, but what you have at hand is at the top level.
Your tests and assertions need to be at the same logical level. There are two ways to re-arrange this:
What effect does re-ordering groups have on subsequent commands or queries which you do have at the top level?
This should give you an avenue for asserting that the correct outcome occurs without needing to assert anything about the ordering of the groups directly. This keeps the test at the top level and would allow all sorts of internal refactoring (e.g. perhaps lazy sorting of the groups).
Can you test at a lower level?
If you feel that testing as described above is too complicated, you might want to frame your test at a more detailed level. I think of this like focusing in on a section of detail to get it right.
Down at this level (rather than your aggregate root), the interfaces will know about groups and you'll have the opportunity to assert what you want to assert.
Alternatively, do you need this test at all?
If you cannot find a suitable test at either of the above levels, then are you sure you need this test at all? If there is no visible external difference, then there is no need to lock the behaviour in place with a test.

Unit testing style question: should the creation and deletion of data be in the same method?

I am writing unit tests for a PHP class that maintains users in a database. I now want to test if creating a user works, but also if deleting a user works. I see multiple possibilities to do that:
I only write one method that creates a user and deletes it afterwards
I write two methods. The first one creates the user and saves its ID. The second one deletes that user with the saved ID.
I write two methods. The first one only creates a user. The second method creates a user so that there is one that can afterwards be deleted.
I have read that every test method should be independent of the others, which means the third possibility is the way to go, but that also means every method has to set up its test data by itself (e.g. if you want to test if it's possible to add a user twice).
How would you do it? What is good unit testing style in this case?
Two different things = Two tests.
Test_DeleteUser() could be in a different test fixture as well, because it has different SetUp() code that ensures a User already exists.
[SetUp]
public void SetUp()
{
    CreateUser("Me");
    Assert.IsTrue( User.Exists("Me"), "Setup failed!" );
}

[Test]
public void Test_DeleteUser()
{
    DeleteUser("Me");
    Assert.IsFalse( User.Exists("Me") );
}
This means that if Test_CreateUser() passes and Test_DeleteUser() doesn't - you know that there is a bug in the section of the code that is responsible for deleting users.
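For completeness, the matching creation test would live in the fixture that does not have the SetUp above (sketch, using the same assumed helpers):
[Test]
public void Test_CreateUser()
{
    CreateUser("You");
    Assert.IsTrue( User.Exists("You") );
}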
Update: I was just giving some thought to Charlie's comments on the dependency issue - by which I mean that if creation is broken, both tests fail even though Delete itself may be fine. The best I could do was to move a guard check so that setup failures show up in the Errors and Failures tab, to distinguish them from test failures. (In general, setup failures should be easy to spot by an entire test fixture showing red.)
How you do this depends on how you utilize mocks and stubs. I would go for the more granular approach, so having two different tests.
Test A
    CreateUser("testuser");
    assertTrue(CheckUserInDatabase("testuser"));
Test B
    LoadUserIntoDB("testuser2");
    DeleteUser("testuser2");
    assertFalse(CheckUserInDatabase("testuser2"));
TearDown
    RemoveFromDB("testuser");
    RemoveFromDB("testuser2");
CheckUserInDatabase(string user)
    ... // Access DAL and check item in DB
If you utilize mocks and stubs, you don't need to access the DAL until you do your integration testing, so you won't need as much work on the asserts and on setting up the data.
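For illustration, a mock-based variant in the FakeItEasy style used elsewhere on this page (IUserRepository, UserService and their methods are assumed names, not from the question):
[Test]
public void DeleteUser_RemovesTheUserThroughTheRepository()
{
    var repository = A.Fake<IUserRepository>();
    A.CallTo(() => repository.FindByName("testuser2")).Returns(new User { Name = "testuser2" });

    var service = new UserService(repository);
    service.DeleteUser("testuser2");

    // No database involved: only the interaction with the repository is verified
    A.CallTo(() => repository.Delete(A<User>.That.Matches(u => u.Name == "testuser2"))).MustHaveHappenedOnceExactly();
}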
Usually, you should have two methods but reality still wins over text on paper in the following case:
You need a lot of expensive setup code to create the object to test. This is a code smell and should be fixed but sometimes, you really have no choice (think of some code that aggregates data from several places: You really need all those places). In this case, I write mega tests (where a test case can have thousands of lines of code spread over many methods). It creates the database, all tables, fills them with defined data, runs the code step by step, verifies each step.
This should be a rare case. If you need one, you must actively ignore the rule "Tests should be fast". This scenario is so complex that you want to check as many things as possible. I had a case where I would dump the contents of 7 database tables to files and compare them for each of the 15 SQL updates (which gave me 105 files to compare in a single test) plus about a million asserts that would run.
The goal here is to make the test fail in such a way that you notice the source of the problem right away. It's like pouring all the constraints into code and make them fail early so you know which line of app code to check. The main drawback is that these test cases are hell to maintain. Every change of the app code means that you'll have to update many of the 105 "expected data" files.