Mock testing with Repository pattern example - unit-testing

I'm refactoring an existing class to support importing Products from a CSV file.
Requirements:
1) During import, products are identified by a special product number.
2) Each product has a category attribute. If the category already exists, use its id; if not, create it first.
3) If a product with a given number already exists in the DB, update its other fields; otherwise, create a new product.
Goal:
Write a real unit test (no DB interaction) verifying that category creation/reuse works correctly.
Correct me if I'm wrong:
1) I need to inject a list of existing product categories.
2) Loop through the parsed products from the CSV file and see whether each category can be found in the injected categories list or not.
3) What should be returned? A repository of aggregates? (Should the aggregate root be a product or a product category?)
The problem is that we don't know what IDs the new categories will obtain from the DB.
Please give me some direction on how this problem can be solved.
I'm new to the Repository pattern (and the persistence-ignorant domain concept), and I use mock testing in my daily coding.

As far as I can tell, you need to ask about the previous existence of both products and categories. This hints at two different Repositories: a ProductRepository and a CategoryRepository.
Injecting a list of existing categories is one possible approach, but you would also need to inject a list of existing products.
Another alternative would be to inject both Repositories and simply ask them whether the product or category already exists. If you need other functionality provided by these Repositories, this may be a better option, since you already have the required dependencies.
You might also want to consider doing both to keep closer to the Single Responsibility Principle. One collaborator could be a service that constructs a new Product instance based on the parsed data and the existing categories and products. Another would be responsible for retrieving the existing data. Yet another class would implement the CSV parser.
All types would implement interfaces, so that you might have collaborators such as these:
public class CsvParser : IParser
public class DataRetriever : IDataRetriever
public class ProductCreator : IProductCreator
Your overall class would then be one that takes those three interfaces as dependencies and orchestrates their interaction.
This will allow you to unit test each in isolation using mocks for each dependency.
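To make that concrete, here is a minimal sketch of how the category creation/reuse rule could be verified with mocked Repositories. Everything in it (the ICategoryRepository / IProductRepository interfaces, the ParsedProduct type, the deliberately naive ProductImporter, and the choice of NUnit + Moq) is an illustrative assumption rather than anything taken from the question:

using System.Collections.Generic;
using Moq;
using NUnit.Framework;

// Minimal illustrative types; in real code these would be your domain classes.
public class Category { public long Id { get; set; } public string Name { get; set; } }
public class ParsedProduct { public string Number { get; set; } public string CategoryName { get; set; } }

// Hypothetical Repository abstractions the importer depends on.
public interface ICategoryRepository
{
    Category FindByName(string name);
    void Add(Category category);
}

public interface IProductRepository
{
    bool Exists(string productNumber);
}

// A deliberately naive importer, just enough to make the test runnable.
public class ProductImporter
{
    private readonly ICategoryRepository _categories;
    private readonly IProductRepository _products;

    public ProductImporter(ICategoryRepository categories, IProductRepository products)
    {
        _categories = categories;
        _products = products;
    }

    public void Import(IEnumerable<ParsedProduct> parsedProducts)
    {
        foreach (var parsed in parsedProducts)
        {
            // Reuse the category if it exists, otherwise create it first.
            var category = _categories.FindByName(parsed.CategoryName);
            if (category == null)
            {
                category = new Category { Name = parsed.CategoryName };
                _categories.Add(category);
            }
            // The product create-or-update step would use _products here (omitted).
        }
    }
}

[TestFixture]
public class ProductImporterTests
{
    [Test]
    public void Import_ReusesExistingCategory_AndCreatesMissingOne()
    {
        var categories = new Mock<ICategoryRepository>();
        var products = new Mock<IProductRepository>();

        // "Books" already exists in the repository, "Games" does not.
        categories.Setup(r => r.FindByName("Books")).Returns(new Category { Id = 1, Name = "Books" });
        categories.Setup(r => r.FindByName("Games")).Returns((Category)null);

        var importer = new ProductImporter(categories.Object, products.Object);
        importer.Import(new List<ParsedProduct>
        {
            new ParsedProduct { Number = "P-1", CategoryName = "Books" },
            new ParsedProduct { Number = "P-2", CategoryName = "Games" }
        });

        // The existing category is reused; only the missing one is created.
        categories.Verify(r => r.Add(It.Is<Category>(c => c.Name == "Games")), Times.Once());
        categories.Verify(r => r.Add(It.Is<Category>(c => c.Name == "Books")), Times.Never());
    }
}

Note that the test never needs to know what ID the database would assign to the new category: the behaviour under test is "look it up first, create it only when missing", which is exactly what the Verify calls assert.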

Related

Unit Testing Domain Models attributes in CQRS System

I'm seeking others' thoughts on a specific scenario regarding unit testing domain model attributes (entities, VOs, etc.) in an app that utilises CQRS.
Imagine a simplified example of a Product (entity), with a Product Name (VO).
class Product
private ProductName productName
Product has some basic guard rails and invariants:
A Product Name can't be more than 100 characters (enforced in the VO).
A Product must have a Product Name (enforced in the entity, with Product Name set in the constructor to ensure it's always valid).
I can unit test the Product entity and Product Name VO to ensure these are enforced correctly, by checking that the correct errors and exceptions are raised. Easy.
My question is regarding Unit Testing the happy path — that the Product Name is set successfully when a Product is initially Developed.
In a non-CQRS system I would have a public read-only property or getter on the Product Name, so that it can be retrieved for reporting, displaying or populating a DTO. I can then unit test with this property.
In a CQRS system though, the Product Name can be private in the Entity, as retrieval of the Product Name happens via the Read Model on the query side. Setting the Product Name happens through the command when the product is Developed, but after that the Product Name isn't required for any core business actions (except I could imagine for renaming the product).
To unit test the successful creation of a product though I need to make the Product Name a read-only public property to test it, but it doesn't seem right to do this just to fulfil a unit test. Without the unit test the Product Name can remain private and everything would work as needed.
I'm wondering if anyone has come across a similar scenario for unit testing, where an attribute that is mostly for presentation purposes (although obviously very important... I wouldn't expect users to try and work with Products via only part numbers or identifiers) is being tested.
I'm leaning towards testing this via the Read Model, i.e. create the Product, then determine that the name has been successfully set via the Read Model. But this seems to be a lot for a unit test to take on.
To unit test the successful creation of a product though I need to make the Product Name a read-only public property to test it, but it doesn't seem right to do this just to fulfil a unit test. Without the unit test the Product Name can remain private and everything would work as needed.
How does data get from your domain model to your persistent storage? How does data get from your "write" model to your "read" model?
Somewhere in your code base you have either a function that takes a Product as input and returns some domain-agnostic representation of it (byte[], or JSON, or whatever), or a method that takes as arguments a Product and a callback that accepts some domain-agnostic representation.
It might be explicit, or it might be implicit (ORM magic?), but it's going to be there somewhere -- "write only" domain entities aren't very interesting.
Your tests should be using that same mechanism.
It may help to keep in mind one of the old understandings of "unit" in "unit test" -- that the test is a single, isolated, self-contained thing, insulated from interference by other tests that may also be running.
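For example (a rough sketch, not the only way to do it): suppose the write side already has some mapping from a Product to the representation that gets persisted. Here that mapping is a hypothetical ToDocument() method and a ProductDocument type, both invented for the illustration; it could equally be a callback the product accepts or the data handed to the read-model projection. The test then observes the name through that same mechanism instead of a test-only getter:

using System;
using NUnit.Framework;

public sealed class ProductName
{
    public string Value { get; }

    public ProductName(string value)
    {
        // Invariant from the question: a name is required and capped at 100 characters.
        if (string.IsNullOrEmpty(value) || value.Length > 100)
            throw new ArgumentException("Product Name must be between 1 and 100 characters.");
        Value = value;
    }
}

// Hypothetical persistence-facing representation produced by the write model.
public sealed class ProductDocument
{
    public string ProductName { get; set; }
}

public class Product
{
    private readonly ProductName productName;   // stays private, as in the question

    private Product(ProductName name) => productName = name;

    public static Product Develop(ProductName name) => new Product(name);

    // The same mapping the repository / write side would use to persist the entity.
    public ProductDocument ToDocument() => new ProductDocument { ProductName = productName.Value };
}

public class ProductCreationTests
{
    [Test]
    public void Developing_a_product_captures_its_name_in_the_persisted_representation()
    {
        var product = Product.Develop(new ProductName("Widget 3000"));

        // No public getter added just for the test: the assertion goes through the
        // write-to-storage mapping that has to exist anyway.
        ProductDocument document = product.ToDocument();
        Assert.AreEqual("Widget 3000", document.ProductName);
    }
}

The Product Name stays private on the entity; the test simply looks at it through the channel the repository already needs.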

Please explain the Repository-, Mapping-, and Business-Layer relations and responsibilities

I've been reading a lot about this stuff and I am currently in the middle of the development of a larger web-application and its corresponding back-end.
However, I've started with a design where I ask a Repository to fetch data from the database and map it into a DTO. Why a DTO? Simply because until now basically everything was simple and no more complexity was necessary. If it got a bit more complex, then I started to map e.g. 1-to-n relations directly in the service layer. Something like:
// This is Service-Layer
public List<CarDTO> getCarsFromOwner(Long carOwnerId) {
// Entering Repository-Layer
List<CarDTO> cars = this.carRepository.getCars(carOwnerId);
Map<Long, List<WheelDTO>> wheelMap = this.wheelRepository.getWheels(carOwnerId);
for(CarDTO car : cars) {
List<WheelDTO> wheels = wheelMap.get(car.getId());
car.setWheels(wheels);
}
return cars;
}
This works, of course, but it turns out that sometimes things get more complex than this, and I'm starting to realize that the code might look quite ugly if I don't do something about it.
Of course, I could load wheelMap in the CarRepository, do the wheel-mapping there and only return complete objects, but since SQL queries can sometimes look quite complex I don't want to fetch all cars and their wheels plus taking care of the mapping in getCars(Long ownerId).
I'm clearly missing a Business-Layer, right? But I'm simply not able to get my head around its best practice.
Let's assume I have Car and Owner business objects. Would my code look something like this:
// This is Service-Layer
public List<CarDTO> getCarsFromOwner(Long carOwnerId) {
// The new Business-Layer
CarOwner carOwner = new CarOwner(carOwnerId);
List<Car> cars = carOwner.getAllCars();
return cars;
}
which looks as simple as it can be, but what would happen on the inside? The question is aimed especially at CarOwner#getAllCars().
I imagine that this function would use Mappers and Repositories in order to load the data, and that especially the relational-mapping part would be taken care of:
List<CarDTO> cars = this.carRepository.getCars(carOwnerId);
Map<Long, List<WheelDTO>> wheelMap = this.wheelRepository.getWheels(carOwnerId);
for(CarDTO car : cars) {
List<WheelDTO> wheels = wheelMap.get(car.getId());
car.setWheels(wheels);
}
But how? Would the CarMapper provide functions like getAllCarsWithWheels() and getAllCarsWithoutWheels()? This would also move the CarRepository and the WheelRepository into the CarMapper, but is that the right place for a repository?
I'd be happy if somebody could show me a good practical example for the code above.
Additional Information
I'm not using an ORM - instead I'm going with jOOQ. It's essentially just a type-safe way to write SQL (and it's quite fun to use, by the way).
Here is an example of what that looks like:
public List<CompanyDTO> getCompanies(Long adminId) {
LOGGER.debug("Loading companies for user ..");
Table<?> companyEmployee = this.ctx.select(COMPANY_EMPLOYEE.COMPANY_ID)
.from(COMPANY_EMPLOYEE)
.where(COMPANY_EMPLOYEE.ADMIN_ID.eq(adminId))
.asTable("companyEmployee");
List<CompanyDTO> fetchInto = this.ctx.select(COMPANY.ID, COMPANY.NAME)
.from(COMPANY)
.join(companyEmployee)
.on(companyEmployee.field(COMPANY_EMPLOYEE.COMPANY_ID).eq(COMPANY.ID))
.fetchInto(CompanyDTO.class);
return fetchInto;
}
The Repository pattern belongs to the group of data access patterns and usually means an abstraction of storage for objects of the same type. Think of a Java collection that you can use to store your objects: which methods does it have? How does it operate?
By this definition, a Repository cannot work with DTOs - it is storage for domain entities. If you have only DTOs, then you need a more generic DAO or, probably, the CQRS pattern. It is common to have a separate interface and implementation of a Repository, as is done, for example, in Spring Data (it generates the implementation automatically, so you only have to specify the interface, probably inheriting basic CRUD operations from the common superinterface CrudRepository).
Example:
class Car {
private long ownerId;
private List<Wheel> wheels;
}
@Repository
interface CarRepository extends CrudRepository<Car,Long> {
List<Car> findByOwnerId(long id);
}
Things get complicated when your domain model is a tree of objects and you store it in a relational database. By the definition of this problem you need an ORM. Every piece of code that loads relational content into an object model is an ORM, so your repository will have an ORM as its implementation. Typically, JPA ORMs do the wiring of objects behind the scenes; simpler solutions like custom mappers based on jOOQ or plain JDBC have to do it manually. There's no silver bullet that will solve all ORM problems efficiently and correctly: if you have chosen to write custom mapping, it's still better to keep the wiring inside the repository, so the business layer (services) will operate on true object models.
In your example, CarRepository knows about Cars. Car knows about Wheels, so CarRepository already has a transitive dependency on Wheels. In the CarRepository#findByOwnerId() method you can either fetch the Wheels for a Car directly in the same query by adding a join, or delegate this task to the WheelRepository and then only do the wiring. The user of this method will receive a fully initialized object tree. Example:
class CarRepositoryImpl implements CarRepository {
public List<Car> findByOwnerId(long id) {
// pseudocode for some database query interface
String sql = select(CARS).join(WHEELS);
final Map<Long, Car> carIndex = new HashMap<>();
execute(sql, record -> {
long carId = record.get(CAR_ID);
Car car = carIndex.computeIfAbsent(carId, key -> new Car());
... // map the car if necessary
Wheel wheel = ...; // map the wheel
car.addWheel(wheel);
});
return carIndex.values().stream().collect(toList());
}
}
What's the role of the business layer (sometimes also called the service layer)? The business layer performs business-specific operations on objects and, if these operations are required to be atomic, manages transactions. Basically, it knows when to signal transaction start, commit and rollback, but it has no underlying knowledge of what these messages will actually trigger in the transaction manager implementation. From the business layer's perspective, there are only operations on objects, transaction boundaries and isolation, and nothing else. It does not have to be aware of mappers or whatever sits behind the Repository interface.
In my opinion, there is no correct answer.
This really depends upon the chosen design decision, which itself depends upon your stack and your/team's comfort.
Case 1:
I disagree with parts of this statement of yours:
"Of course, I could load wheelMap in the CarRepository, do the wheel-mapping there and only return complete objects, but since SQL queries can sometimes look quite complex I don't want to fetch all cars and their wheels plus taking care of the mapping in getCars(Long ownerId)"
The SQL join for the above will be simple. Additionally, it might be faster, as databases are optimized for joins and fetching data.
I call this approach Case 1 because it can be followed if you decide to pull data via your repository using SQL joins, instead of just using SQL for simple CRUD and then manipulating objects in Java (Case 2, below).
Case 2: The repository is just used to fetch data for each domain object, which corresponds to one table.
In this case what you are doing is already correct.
If you will never use WheelDTO separately, then there is no need to make a separate service for it; you can prepare everything in the Car service.
But if you need WheelDTO separately, then make a separate service for each. In that case, there can be a helper layer on top of the service layer to perform object creation. I do not suggest implementing an ORM from scratch by making repositories and loading all joins for every repo (use Hibernate or MyBatis directly instead).
Again, IMHO, whichever approach you take out of the above, the service or business layers just complement it. So there is no hard and fast rule; try to be flexible according to your requirements. If you decide to use an ORM, some of the above will change again.

Doctrine specify logical name for JoinColumn [duplicate]

I'm working on an events site and have a one-to-many relationship between a production and its performances. When I have a performance object and need its production id, at the moment I have to do:
$productionId = $performance->getProduction()->getId();
In cases where I literally just need the production id, it seems like a waste to send off another database query to get a value that's already in the object somewhere.
Is there a way round this?
Edit 2013.02.17:
What I wrote below is no longer true. You don't have to do anything in the scenario outlined in the question, because Doctrine is clever enough to load the id fields into related entities, so the proxy objects will already contain the id, and it will not issue another call to the database.
Outdated answer below:
It is possible, but it is inadvisable.
The reason behind that is that Doctrine tries to truly adhere to the principle that your entities should form an object graph, in which foreign keys have no place, because they are just "artifacts" that come from the way relational databases work.
You should either:
rewrite the association to be eager loaded, if you always need the related entity,
write a DQL query (preferably on a Repository) to fetch-join the related entity, or
let it lazy-load the related entity by calling a getter on it.
If you are not convinced, and really want to avoid all of the above, there are two ways (that I know of), to get the id of a related object, without triggering a load, and without resorting to tricks like reflection and serialization:
If you already have the object in hand, you can retrieve the inner UnitOfWork object that Doctrine uses internally, and use its getEntityIdentifier() method, passing it the unloaded entity (the proxy object). It will return the id without triggering the lazy-load.
Assuming you have many-to-one relation, with multiple articles belonging to a category:
$articleId = 1;
$article = $em->find('Article', $articleId);
$categoryId = $em->getUnitOfWork()->getEntityIdentifier($article->getCategory());
In the upcoming 2.2 release, you will be able to use the IDENTITY DQL function to select just a foreign key, like this:
SELECT IDENTITY(u.Group) AS group_id FROM User u WHERE u.id = ?0
It is already committed to the development versions.
Still, you should really try to stick to one of the "correct" methods.

What is the controller allowed to assume about what it receives from a service?

A quick terminology question that's somewhat related to my main question: what is the correct term for a model class, and the term for an instance of that class?
I am learning Test Driven Development, and want to make sure I am learning it the right way so I can form good habits.
My current project has a SalesmanController, which is pretty much a basic resource controller. Here's my current issue (I can get it working, but I want to make sure it's done as "right" as possible):
I have a 'Salesman' model.
The 'Salesman' is mapped as having many 'Sales' using my ORM.
The 'Sales' model is mapped as belongsTo 'Salesman' using my ORM.
I have created a ORMSalesmanRepository which implements the SalesmanRepositoryInterface.
My controller has SalesmanRepositoryInterface passed to it upon construction(constructor dependency injection).
My controller calls the find method on the SalesmanRepositoryInterface implementation it has been given to find the correct salesman.
My view needs information on the salesman and information on all 'Sales' records that belong to him.
The current implementation of SalesmanRepositoryInterface returns an instance of an ORMRecord, which is then passed to the view; the view retrieves the sales from the ORMRecord.
My gut tells me this implementation is wrong. The ORM record implements ArrayAccess, so it still behaves like an array as far as the view knows.
However, when trying to go back and implement my unit tests, I am running into issues mocking my dependencies. (I now know that with TDD I'm supposed to write my unit tests first and then develop the actual implementation; I didn't figure this out until recently.)
Salesman = MockedSalesman;
SalesRecords = MockedSalesman->Sales;
Is it poor programming to expect my Salesman to return an ORMObject for the controller to use (for chaining relationships maybe?), or is a controller becoming too 'fat' if I'm allowing it to call more than just basic get methods (using ArrayAccess []) on the ORMObject? Should the controller just assume that whatever it gets back is an array (or at least acts like one)?
Also, should it ever come up where something one of my mocked classes returns needs to be mocked again?
Thanks in advance everybody.
What is the correct term for a model class and the term for an instance of that class?
Depends on what you mean by "model classes". Technically, the model is a layer that contains several groups of classes; the most notable ones would be mappers, services and domain objects. The domain objects as a whole are the implementation of accumulated knowledge about business requirements, insight from specialists and project goals. This knowledge is referred to as the "domain model".
Basically, there is no such thing as a "model class". There are classes that are part of the model layer.
What is the controller allowed to assume about what it receives from a service?
Nothing, because the controller should not receive anything from the model layer. The responsibility of the controller is to alter the state of the model layer (and in rare cases, the state of the current view).
The controller is NOT responsible for:
gathering data from the model layer,
initializing views,
passing data from the model layer to views,
dealing with authorization checks.
The current implementation of SalesmanRepositoryInterface returns an instance of an ORMRecord, which is then passed to the view; the view retrieves the sales from the ORMRecord.
It sounds like you are implementing the active record pattern. It has a very limited use case where it is appropriate to use AR: when the object mostly consists of getters and setters that are stored directly in a single table. For anything beyond that, active record becomes an anti-pattern, because it violates SRP and you lose the ability to test your domain logic without a database.
Also, a repository should return an instance of a domain object and make sure that you are not retrieving data repeatedly. Repositories are not factories for active record instances.
Is it poor programming to expect my Salesman to return an ORMObject for the controller to use (for chaining relationships maybe?), or is a controller becoming too 'fat' if I'm allowing it to call more than just basic get methods (using ArrayAccess []) on the ORMObject?
Yes, it's bad code. Your application logic (the logic that would usually be contained in services) is leaking into the presentation layer.
At this stage I wouldn't stress too much about your implementation - if you try and write proper tests, you'll quickly find out what works and what doesn't.
The trick is to think hard about what each component is trying to achieve. What is your controller method supposed to do? Most likely it is intended to create a ViewModel of some kind, and then choose which View to render. So there's a few tests right there:
When I call my controller method with given arguments (ShowSalesmanDetail(5))
It should pick the correct View to render ('ShowSalesmanDetail')
It should construct the ViewModel that I expect (A Salesman object with some Sales)
In theory, at this point you don't care how the controller constructs the model, only that it does. In practice though you do need to care, because the controller has dependencies (the big one being the database), which you need to cater for. You've chosen to abstract this with a Repository class that talks to an ORM, but that shouldn't impact the purpose of your tests (though it will definitely alter how you implement those tests).
Ideally the Salesman object in your example would be a regular class, with no dependencies of its own. This way, your repository can construct a Salesman object by populating it from the database/ORM, and your unit tests can also use Salesman objects that you've populated yourself with test data. You shouldn't need to mock your models or data classes.
My personal preference is that you don't take your 'data entities' (what you get back from your ORM) and put them into Views. I would construct a ViewModel class that is tied to one View, and then map from your data entities to your ViewModel. In your example, you might have a Salesman class which represents the data in the database, then a SalesmanModel which represents the information you are displaying on the page (which is usually a subset of what's in the db).
So you might end up with a unit test looking something like this:
public void CallingShowSalesmanShouldReturnAValidModel()
{
ISalesmanRepository repository = A.Fake<ISalesmanRepository>();
SalesmanController controller = new SalesmanController(repository);
const int salesmanId = 5;
Salesman salesman = new Salesman
{
Id = salesmanId,
Name = "Joe Bloggs",
Address = "123 Sesame Street",
Sales = new[]
{
new Sale { OrderId = 123, SaleDate = DateTime.Now.AddMonths(-1) }
}
};
A.CallTo(() => repository.Find(salesmanId)).Returns(salesman);
ViewResult result = controller.ShowSalesman(salesmanId) as ViewResult;
SalesmanModel model = result.Model as SalesmanModel;
Assert.AreEqual(salesman.Id, model.Id);
Assert.AreEqual(salesman.Name, model.Name);
SaleModel saleModel = model.Sales.First();
Assert.AreEqual(salesman.Sales.First().OrderId, saleModel.OrderId);
}
This test is by no means ideal but hopefully gives you an idea of the structure. For reference, the A.Fake<> and A.CallTo() stuff is from FakeItEasy, you could replace that with your mocking framework of choice.
If you were doing proper TDD, you wouldn't have started with the Repository - you'd have written your Controller tests, probably got them passing, and then realised that having all this ORM/DB code in the controller is a bad thing, and refactored it out. The same approach should be taken for the Repository itself and so on down the dependency chain, until you run out of things that you can mock (the ORM layer most likely).

Constructing a Domain Object from multiple DTOs

Suppose you have the canonical Customer domain object. You have three different screens on which Customer is displayed: External Admin, Internal Admin, and Update Account.
Suppose further that each screen displays only a subset of all of the data contained in the Customer object.
The problem is: when the UI passes data back from each screen (e.g. through a DTO), it contains only that subset of a full Customer domain object. So when you send that DTO to the Customer Factory to re-create the Customer object, you have only part of the Customer.
Then you send this Customer to your Customer Repository to save it, and a bunch of data will get wiped out because it isn't there. Tragedy ensues.
So the question is: how would you deal with this problem?
Some of my ideas:
include an argument to the Repository indicating which part of the Customer to update, and ignore others
when you load the Customer, keep it in static memory, or in the session, or wherever, and then when you receive one of the DTOs from the UI, update only the parts relevant to the DTO
IMO, both of these are kludges. Are there any other better ideas?
@chadmyers: Here is the problem.
Entity has properties A, B, C, and D.
DTO #1 contains properties for B and C.
DTO #2 contains properties for C and D.
The UI asks for DTO #1; you load the entity from the repository, convert it into DTO #1, filling in only B and C, and give it to the UI.
Now the UI updates B and sends the DTO back. You recreate the entity, and it has only B and C filled in, because that is all that is contained in the DTO.
Now you want to save the entity, which has only B and C filled in, with A and D null/blank. The repository has no way of knowing whether it should update A and D in persistence as blanks, or whether it should ignore them.
I would use a factory to load a complete customer object from the repository upon receipt of a DTO. After that you can update only those fields that were specified in the DTO.
That also allows you to apply some optimistic concurrency to your customer by checking a last-updated timestamp, for example.
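A minimal sketch of that approach, with all of the names (UpdateAccountDto, CustomerAccountService, ICustomerRepository, ChangeContactDetails) invented purely for illustration, and the factory/repository loading step folded into a single GetById call:

using System;

// Hypothetical DTO for the Update Account screen: only a subset of Customer fields.
public sealed class UpdateAccountDto
{
    public int CustomerId { get; set; }
    public string Email { get; set; }
    public string Phone { get; set; }
    public DateTime LastUpdated { get; set; }   // concurrency token round-tripped via the UI
}

public interface ICustomerRepository
{
    Customer GetById(int id);
    void Save(Customer customer);
}

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }             // the "A"-style field this screen never touches
    public string Email { get; private set; }    // "B"
    public string Phone { get; private set; }    // "C"
    public DateTime LastUpdated { get; private set; }

    public void ChangeContactDetails(string email, string phone)
    {
        Email = email;
        Phone = phone;
        LastUpdated = DateTime.UtcNow;
    }
}

public class CustomerAccountService
{
    private readonly ICustomerRepository _customers;

    public CustomerAccountService(ICustomerRepository customers) => _customers = customers;

    public void UpdateAccount(UpdateAccountDto dto)
    {
        // 1. Load the complete Customer; every field is populated here.
        Customer customer = _customers.GetById(dto.CustomerId);

        // 2. Optimistic concurrency: reject the update if someone else saved in the meantime.
        if (customer.LastUpdated != dto.LastUpdated)
            throw new InvalidOperationException("Customer was modified by another user.");

        // 3. Copy over only the fields this DTO actually carries; the rest stay untouched.
        customer.ChangeContactDetails(dto.Email, dto.Phone);

        // 4. Save the full entity; nothing gets wiped out.
        _customers.Save(customer);
    }
}

Because the entity the repository saves was hydrated in full first, the fields the DTO doesn't carry (the "A" and "D" of the example above) are never blanked out.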
Is this a web app? Load the customer object from the repo, update it from the DTO, save it back. That doesn't seem like a kludge to me. :)
UPDATE: As per your updates (the A, B, C, D example)
So what I was thinking is that when you load the entity, it has A, B, C, and D filled in. If DTO#1 only updates B & C, that's OK. A and D are unaffected (which is the desired situation).
What the repository does with the B & C updates is up to it. If you're using Hibernate/NHibernate, for example, it will just figure it out and issue an update.
Just because DTO #1 only has B & C doesn't mean you have to also null out A & D. Just leave them alone.
I missed the point of this question at first because it is predicated on a few things that I don't think make sense from a design perspective.
Hydrating an entity from the repository and then converting it to a DTO is a waste of effort. I assume that your DAL passes a DTO to your repository, which then converts it to a full entity object. So converting it back to a DTO seems wasteful.
Having multiple DTOs makes sense if you have a search results page that shows a high volume of records and only displays part of your entity data. In that case it's efficient to pass that page just the data it needs. It does not make sense to pass a DTO that contains partial data to a CRUD page. Just give it a full DTO or even a full entity object. If it doesn't use all of the data, fine, no harm done.
So the main problem is that I don't think you should pass data to these pages using partial DTOs. If you used a full DTO, I would do the following three steps whenever the save action is performed:
Pull the full DTO from repository or db
Update the DTO with any changes made through the form
Save the full DTO back to the repository or db
This method requires an extra db hit but that's really not a significant issue on a CRUD form.
If we have an understanding that a Repository handles (almost exclusively) a very rich domain Entity, then your numerous DTOs could simply map back.
i.e.
dtoUser.MapFrom<In,Out>(Entity)
or
dtoAdmin.MapFrom<In,Out>(Entity)
You would do the reverse to get the DTO information back into the Entity, and so on. So your repository only saves rich Entities, NOT numerous DTOs:
entity.Foo = dtoUser.Foo
or
entity.Bar = dtoAdmin.Bar
entityRepository.Save(entity) <-- do not pass a DTO.
The whole point of DTOs is to keep things simple for the presentation layer or, say, for WCF data transfer; it has nothing to do with the Repository or the Entity, for that matter.
Furthermore, you should never construct an Entity from DTOs... the only two ways to ever acquire an Entity are through a Factory (new) or a Repository (existing), respectively.
You mention storing the Entity somewhere; why would you do this? That is the job of your repository. It will decide where to get the Entity (db, cache, etc.); there is no need to store it somewhere else.
Hope that helps assign responsibility in your domain; it is always a challenge and there are gray areas here and there, but in general, these are the typical uses of Repository, DTO, etc.