Where do I put business logic when I'm using the repository pattern? - repository-pattern

I am using the Repository Pattern for my application. I have a class User. User is identified by Email. The UserRepository contains a method CreateUser(User user). There is a business rule saying that users should have a unique Email.
I want to implement a transaction which first checks whether an email is in use and if not, the user is created. Where should I put this code which is responsible for checking the uniqueness of the Email?
This is definitely a business rule; it is business logic. I think it is not correct to put this check in my UserRepository implementation.

This sort of thing typically goes in either (1) a service or (2) directly into the schema as a database constraint (and frequently both).
Using a service, you don't access the Repository directly from client code; you call a service which does the useful operations for you.
For example, something like:
public class UserService : ... {
    private Repository<User> _userRepository;

    public void CreateUser(User u) {
        // Verify that the user's email is unique.
        if ( ... ) {
            _userRepository.Create(u);
        }
    }
}

If you're building an application large enough to warrant a repository pattern, then you'll want to put this validation as close to the data as possible, probably as a database constraint such as a unique index/key. This prevents bugs from creeping into the code later due to corrupt data.

Assuming you're using a database for storage, you should definitely add a unique constraint on the e-mail column in the database.
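For illustration, here is a minimal C# sketch of how the service-level check and the database constraint can work together. The repository lookup (FindByEmail) and the exception type are assumed names, not from the question; the constraint in the database remains the real guarantee.

using System;

public class User
{
    public string Email { get; set; }
}

public interface IUserRepository
{
    User FindByEmail(string email);   // assumed lookup method, not in the original question
    void Create(User user);
}

public class DuplicateEmailException : Exception
{
    public DuplicateEmailException(string email, Exception inner = null)
        : base("A user with email '" + email + "' already exists.", inner) { }
}

public class UserService
{
    private readonly IUserRepository _userRepository;

    public UserService(IUserRepository userRepository)
    {
        _userRepository = userRepository;
    }

    public void CreateUser(User u)
    {
        // Friendly early failure for the common case.
        if (_userRepository.FindByEmail(u.Email) != null)
            throw new DuplicateEmailException(u.Email);

        try
        {
            _userRepository.Create(u);
        }
        catch (Exception ex)   // narrow this to your data-access library's duplicate-key exception
        {
            // Two requests can pass the pre-check at the same time; the unique
            // index catches the race, and we report it the same way.
            throw new DuplicateEmailException(u.Email, ex);
        }
    }
}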

Check out this excellent article on Simple Talk:
Five Simple Database Design Errors You Should Avoid
See Section 4, "Enforcing Integrity via applications":
Proponents of application based integrity usually argue that constraints negatively impact data access. They also assume selectively applying rules based on the needs of the application is the best route to take. [...] The solution is simple. Rely on nothing else to provide completeness and correctness except the database itself. By nothing, I mean neither users nor applications external to the database.
So in your case - a unique constraint on your e-mail column should really be modelled in the database. That's the best place to put that piece of business logic, and will save you from a lot of grief in the long run.
Marc

Related

Event Sourcing: concurrently creating conflicting events

I am trying to implement an Event Sourcing system using Kafka and have run into the following issue. During a new user sign-up I want to check if the username the user provided is already taken. However, consider the case where 2 users are trying to sign-up at the same time providing the same username.
In my understanding of how ES works the controller that processes the sign-up request will check if the request is valid, it will then send a new event (e.g. NewUser) to Kafka, and finally that event will be picked up by another controller which will persist it in a materialized view (e.g. Postgres DB). The problem is that the validation of the request is done against the materialized view but the actual persistence to it happens later. So because the 2 requests are being processed in parallel (by different service instances) they might both pass the validation, resulting in 2 NewUser messages. However, when the second controller tries to persist those 2 NewUser messages in the database saving the second event will fail because of the violation of the uniqueness constraint for the username.
Any ideas on how to address this?
Thanks.
UPDATE:
In particular, I would like to verify whether the following are accepted approaches to the problem:
1. use the username as the userId (restrictive)
2. send an event to a topic partitioned by username and, when validation is done, send an event to another topic
Initial validation against the materialized view won't be enough in most scenarios where you have constraints. There can always be relevant events that haven't been materialized yet. There are two main concurrency control approaches to ensure that correct results are generated:
1. Pessimistic approach:
If you want to validate constraints before you publish an event, you need to lock the relevant resources (entity, aggregate or data set). While the lock is held, other writers must not be able to publish events on these resources. After this point, to get the current state of your data:
You can wait until all events published before locking are materialized.
You can read current state from the database and apply events on it in a separate process.
2. Optimistic approach:
In this approach, you perform your validations after publishing events. To achieve this, you need to implement a feedback mechanism. The process which consumes events and performs validations should be able to publish validation results. You can perform the validations in-memory when possible. Otherwise, you can rely on your materialized data store.
Martin Kleppmann talks about a two-step solution to exactly this problem here and in his book. In this solution there are two topics: "claims" and "registrations". First you publish a claim to take the username, then you try to write it to the database, and finally you publish the result to the registrations topic. At a conceptual level it follows the same steps as the second approach you mentioned. In the validation step, it avoids implementing validation logic and keeping secondary indexes in memory by relying on the database.
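As a rough illustration of that claims/registrations flow, here is a C# sketch. The publisher and store interfaces below are hypothetical abstractions, not any particular Kafka client API, and the topic names just follow Kleppmann's example.

// Step 1: the sign-up request only publishes a claim, keyed by username, so all
// claims for the same name land on the same partition and are processed in order.
public class UsernameRegistrationHandler
{
    private readonly IEventPublisher _publisher;
    private readonly IUserStore _userStore;   // the database behind the materialized view

    public UsernameRegistrationHandler(IEventPublisher publisher, IUserStore userStore)
    {
        _publisher = publisher;
        _userStore = userStore;
    }

    public void RequestSignUp(string username, string userId)
    {
        _publisher.Publish("username-claims", key: username, value: new UsernameClaimed(username, userId));
    }

    // Step 2: the claims consumer relies on the database's uniqueness constraint as
    // the final arbiter, then publishes the outcome so the requester gets feedback.
    public void HandleClaim(UsernameClaimed claim)
    {
        bool reserved = _userStore.TryReserve(claim.Username, claim.UserId);
        object result = reserved
            ? (object)new UsernameRegistered(claim.Username, claim.UserId)
            : new UsernameRejected(claim.Username, claim.UserId);

        _publisher.Publish("username-registrations", key: claim.Username, value: result);
    }
}

public record UsernameClaimed(string Username, string UserId);
public record UsernameRegistered(string Username, string UserId);
public record UsernameRejected(string Username, string UserId);

public interface IEventPublisher { void Publish(string topic, string key, object value); }
public interface IUserStore { bool TryReserve(string username, string userId); }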
During a new user sign-up I want to check if the username the user provided is already taken.
You may want to review Greg Young's essay on Set Validation.
In my understanding of how ES works the controller that processes the sign-up request will check if the request is valid, it will then send a new event (e.g. NewUser) to Kafka, and finally that event will be picked up by another controller which will persist it in a materialized view (e.g. Postgres DB).
That's a little bit different from the usual arrangement. (You may also want to review Greg's talk on polyglot data.)
Suppose we begin with two writers; that's fine, but if there is going to be a single point of truth, then you are going to need synchronization somewhere.
The usual arrangement is to use a form of optimistic concurrency; when processing a request, you preserve a copy of your original state, then you do your calculation, and finally you send the book of record a replace(originalState, newState).
So at this point, we have two writes racing toward the book of record
replace(red,green)
replace(red,blue)
At the book of record, the writes are processed in series.
[...,replace(red,blue)...,replace(red,green)]
So when the book of record processes replace(red,blue), it performs a check that yes, the state is currently red, and swaps in blue. Later, when the book of record tries to process replace(red,green), the book of record performs the check, which fails because the state is no longer red.
So one of the writes succeeds and the other fails; the latter can propagate the failure outwards, or retry, or...; precisely what happens depends on the specific mechanics in question. A retry should mean, of course, reloading the "original state", at which point the model would discover that some previous edit already claimed the username.
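A tiny in-memory illustration of the compare-and-swap the book of record performs; a real event store expresses the same idea as an expected-version check on append:

using System.Collections.Generic;

public class BookOfRecord<TState>
{
    private TState _current;
    private readonly object _gate = new object();

    public BookOfRecord(TState initial)
    {
        _current = initial;
    }

    public bool Replace(TState original, TState next)
    {
        lock (_gate)   // the book of record processes writes in series
        {
            if (!EqualityComparer<TState>.Default.Equals(_current, original))
                return false;   // state has moved on; this writer loses the race
            _current = next;
            return true;
        }
    }
}

// book.Replace("red", "blue")  -> true  (state is now blue)
// book.Replace("red", "green") -> false (original state no longer matches)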
Any ideas on how to address this?
Single writer per stream makes the rest of the problem pretty simple, by eliminating the ambiguity introduced by having multiple in memory copies of the model.
Multiple writers using a synchronous write to the durable store is probably the most common design. It requires an event store that understands the idea of writing to a specific location in a stream -- aka "expected version".
You can perform an asynchronous write, and then start doing other work until you get an acknowledgement that the write succeeded (or failed, or timed out...).
There's no magic -- if you want uniqueness (or any other sort of invariant enforcement, for that matter), then everybody needs to agree on a single authority, and anybody else who wants to propose a change won't know if it has been accepted without getting word back from the authority, and needs to be prepared for a rejected proposal.
(Note: this shouldn't be a surprise -- if you were using a traditional design with current state stored in a RDBMS, then your authority would be a user table in the database, with a uniqueness constraint on the username column, and the race would be between the two insert statements trying to finish their transaction first....)

How do you perform service-oriented parent-child transactions?

Example:
A SalesOrder is composed of a SalesOrderHeader and one or more SalesOrderItems. When editing an existing SalesOrder, the SalesOrderHeader can be modified and SalesOrderItems can be added, modified and deleted. All changes must be saved in a single transaction. Multiple users may edit the SalesOrder at the same time with optimistic concurrency.
I believe that the requirement to have the save done in a single transaction encourages us to communicate both the SalesOrderHeader and the SalesOrderItems in a single service call. The implication of packaging up the child data with its parent is that there will need to be some understanding as to whether the child data is added, modified or deleted.
Change tracking of the child entities can happen either on the server or on the client.
Change tracking on the server
The idea with this strategy is that the client can modify the SalesOrder to its will without tracking which SalesOrderItems are added, modified or deleted. The state of the SalesOrderItems will be determined on the server when the save service is called.
The server should remain stateless between service calls. This means that the server can’t retain any information about the state of the SalesOrder between its retrieval and its eventual save. The only option left is for the server to determine the state of its entities by comparing the modified object graph to the database object graph.
With NHibernate, there is a merge function to accomplish this. With Entity Framework, the highest-voted feature request is to have this added. There's also an open source implementation of this for EF called GraphDiff.
This sounds great in theory because it makes the services very easy to design and use. However, I see two major issues with this strategy. The first is performance. The entire object graph must be sent back on every save. Whether or not a SalesOrderItem was modified, it must be sent back or the server will assume it’s been deleted. The second problem is even more critical and it has to do with concurrency. If User 1 adds a SalesOrderItem to a SalesOrder and User 2 makes a change to the same SalesOrder, when User 2 saves the server will assume that the SalesOrderItem added by User 1 should be deleted because it was not included in User 2’s object graph. I don’t see a way this can be prevented in any implementation of server side change tracking.
Change tracking on the client
The alternative is to have the client track changes to its entities and communicate that state when calling the save service. One benefit is that the client does not need to send its unchanged child entities. This helps with performance. A downside is that all entities will need an additional property named something along the lines of "ObjectState" to track whether each is added, modified or deleted. This makes the entity models on the server quite messy and filled with concerns unrelated to the business domain. This also puts the onus on the different consumers of the service to maintain this state. Another problem is that it becomes difficult to deal with deleted entities. Should the SalesOrderHeader maintain a list of deleted SalesOrderItems? Or should the SalesOrderItems get assigned a state of deleted which must be filtered out by the client UI?
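For concreteness, here is roughly the DTO shape I have in mind (the names are purely illustrative):

using System.Collections.Generic;

// Purely illustrative; this is the "ObjectState" idea described above.
public enum ObjectState { Unchanged, Added, Modified, Deleted }

public class SalesOrderItemDto
{
    public int Id { get; set; }
    public int ProductId { get; set; }
    public int Quantity { get; set; }
    public ObjectState State { get; set; }
}

public class SalesOrderHeaderDto
{
    public int Id { get; set; }
    public string Title { get; set; }
    public ObjectState State { get; set; }

    // One of the two options: deleted items stay in the list, marked Deleted,
    // and the client UI filters them out.
    public List<SalesOrderItemDto> Items { get; set; } = new List<SalesOrderItemDto>();
}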
I know that the Breeze JavaScript library has its own implementation of client-side entity tracking, but my concern is that its implementation requires both client-side and server-side components. Shouldn't the service layer isolate which technology we use on either side? What if non-JavaScript clients want to use my services?
Question
I would think this is a common scenario that should be addressed by the majority of service implementations. Have I made any incorrect assumptions or am I doing anything out of the ordinary? What strategy have you implemented? Are there any reasonable alternatives?
Full disclosure: I work with Breeze, and I think change tracking on the client is the way to go. Change tracking on the client allows stateless servers, reduces traffic between the client and server, and allows offline use.
In Breeze, the "ObjectState" that you mention is called the EntityAspect, and each entity has one, but it is not part of the domain model. The server-side entities don't need an EntityAspect, but the server-side service has to know how to handle the entity state information that comes from the client.
Basically, the service needs to create, update, or delete entities based on the information coming from the client. There are existing server-side backends for Breeze that do all this already (in .NET (EF and NHibernate), Java, PHP, Node, and Ruby), but you can also write your own. Your server just needs to know how to talk to the client.
Let's say we've updated a SalesOrder and added a new SalesOrderItem. The Breeze client sends a save bundle that looks something like this:
{
    "entities": [
        {
            "Id": 123,
            "Title": "My Updated Title",
            "OrderDate": "2014-08-03T07:00:00.000Z",
            "entityAspect": {
                "entityTypeName": "SalesOrder:#My.DomainModel",
                "entityState": "Modified",
                "originalValuesMap": {
                    "Title": "My Original Title"
                },
                "autoGeneratedKey": {
                    "propertyName": "Id",
                    "autoGeneratedKeyType": "Identity"
                }
            }
        },
        {
            "Id": -1,
            "SalesOrderId": 123,
            "ProductId": 456,
            "Quantity": 11,
            "entityAspect": {
                "entityTypeName": "SalesOrderItem:#My.DomainModel",
                "entityState": "Added",
                "originalValuesMap": {},
                "autoGeneratedKey": {
                    "propertyName": "Id",
                    "autoGeneratedKeyType": "Identity"
                }
            }
        }
    ]
}
Here, SalesOrder with Id# 123 has been modified (its Title has been changed). The entityAspect includes the originalValuesMap which shows what the previous Title was.
The server would need to update the existing SalesOrder with the new value. Whether the server needs to query the existing SalesOrder from the database before applying the changes is implementation-dependent.
A new SalesOrderItem has been added. A temporary Id, -1, was created for it on the client. The server needs to create and persist a new SalesOrderItem and generate a real Id for it.
The response from the server should contain the entities that were created and updated, and KeyMapping information that shows what server-generated keys map to the temporary client-side keys, so that the client can replace them.
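If you were writing your own backend rather than using one of the existing ones, the core of the save step is just a dispatch on the entity state. A rough sketch, with hypothetical types and an EF Core-style DbContext for brevity (the real Breeze context providers do considerably more):

using System.Collections.Generic;
using Microsoft.EntityFrameworkCore;

// Hypothetical shape of one entry in the save bundle after deserialization.
public class SaveBundleEntity
{
    public object Entity { get; set; }        // the materialized SalesOrder / SalesOrderItem
    public string EntityState { get; set; }   // "Added", "Modified" or "Deleted"
}

public static class SaveHandler
{
    public static void ApplyChanges(IEnumerable<SaveBundleEntity> entities, DbContext db)
    {
        foreach (var e in entities)
        {
            switch (e.EntityState)
            {
                case "Added":
                    db.Add(e.Entity);      // the server generates the real identity key
                    break;
                case "Modified":
                    db.Update(e.Entity);   // attaches and marks properties as modified
                    break;
                case "Deleted":
                    db.Remove(e.Entity);
                    break;
            }
        }
        db.SaveChanges();
        // After saving, return the saved entities plus the key mappings
        // (temporary client ids -> server-generated ids) to the client.
    }
}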
Change tracking is not a simple problem, but Breeze tries to do the hard parts for you.
I'd like to piggy back on Steve's answer.
We should be clear: the onus for implementing the Order-graph (AKA "Order aggregate") transaction in a relational data model falls on the developer. BreezeJS (and Breeze helpers for .NET servers) can facilitate but you have to make it work.
The key to making this work is including the root element of the aggregate - the Order - in all changes to any entity within the aggregate. If you add, delete, or modify an OrderItem, make sure you modify the Order at the same time.
How? By bumping the Order's concurrency property (e.g., the rowVersion) and making sure that Breeze KNOWS this is your concurrency property.
You must implement root entity optimistic concurrency if you want to ensure Order aggregate consistency.
Now you can detect if someone else has made a change to any part of the Order aggregate. That could be a change to the Order or an add/mod/delete of one of its OrderItems.
You do not have to include all OrderItems in the change-set when you save a changed Order aggregate. You only need to include the OrderItems that are added/modified/deleted.
Of course some other user may make a change to the Order aggregate before you save yours. When you try to save yours, the save will fail with an optimistic concurrency error.
Upon detecting an optimistic concurrency error for an Order, make sure the client removes the entire Order aggregate from cache - the Order and all of its OrderItems - and then re-fetches the aggregate. Don't just re-fetch the root Order entity and start messing with its items; remove the entire aggregate from cache and then re-fetch it (the Order and its items).
If everyone follows this protocol you'll be in fine shape on the server.
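To make the "touch the root" protocol concrete, here is a rough client-side sketch. The names are illustrative, and in practice the concurrency property is usually a database-generated rowversion rather than a hand-incremented int.

using System.Collections.Generic;

public class Order
{
    public int Id { get; set; }
    public int RowVersion { get; set; }   // the concurrency property Breeze must know about
    public List<OrderItem> Items { get; } = new List<OrderItem>();

    public void AddItem(OrderItem item)
    {
        Items.Add(item);
        Touch();
    }

    public void RemoveItem(OrderItem item)
    {
        Items.Remove(item);
        Touch();
    }

    // Any add/mod/delete of an item also modifies the Order, so the save
    // triggers an optimistic-concurrency check on the aggregate root.
    private void Touch()
    {
        RowVersion++;
    }
}

public class OrderItem
{
    public int Id { get; set; }
    public int Quantity { get; set; }
}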

Return record from a database table using Apache CXF

I am using Apache CXF (apache-cxf-2.5.0) to create Web Services using a bottom-up (Java-first) approach. I want to return some data/records (for example, username, email) from a database table. I can write a Java class which returns a simple response, but I am not able to find a way to return a response such as data/records extracted from a database table. How do I do that?
You don't mention how you are accessing the database, but the basic idea is that you ensure that the classes that you return have JAXB annotations (notably @XmlRootElement or @XmlType) on them, which allows CXF to convert instances of those classes into XML document fragments. The classes which you annotate this way probably should not have lots of functionality in them; they should exist just to hold data. (I find anything else too confusing given the complex lifecycle they'll have.) Once the annotations are in place, just return the relevant objects and all the conversions will happen automatically.
I'm talking about a simple class like this:
@XmlRootElement // <---- THIS LINE HERE!
public class UserInfo {
    public String username;
    public String email;
}
You can use this in conjunction with other annotations (e.g., for your ORM) as necessary. Of course, if you're talking straight JDBC to the DB to get the information out, you won't need to worry about that.
The one tricky bit is that the objects being returned will have a lifespan that goes beyond that of the database transaction you're using; you may need to detach (i.e., do some copying, though the ORM layer might provide assistance) the objects extracted from the DB for that to work. This won't be much of a concern in this case as the DB you're describing is very simple (one table, no inter-row relations) but could be an issue if you make things more complex.

Save data through a web service using NHibernate?

We currently have an application that retrieves data from the server through a web service and populates a DataSet. Then the users of the API manipulate it through the objects which in turn change the dataset. The changes are then serialized, compressed and sent back to the server to get updated.
However, I have begun using NHibernate within projects and I really like the disconnected nature of the POCO objects. The problem we have now is that our objects are so tied to the internal DataSet that they cannot be used in many situations, and we end up making duplicate POCO objects to pass back and forth.
Batch.GetBatch() -> calls to web server and populates an internal dataset
Batch.SaveBatch() -> send changes to web server from dataset
Is there a way to achieve a model similar to the one we are using, in which all database access occurs through a web service, but using NHibernate?
Edit 1
I have a partial solution that is working and persisting through a web service, but it has two problems:
1. I have to serialize and send my whole collection and not just the changed items.
2. If I try to repopulate the collection upon return, any references I had to my objects are lost.
Here is my example solution.
Client Side
public IList<Job> GetAll()
{
    return coreWebService
        .GetJobs()
        .BinaryDeserialize<IList<Job>>();
}

public IList<Job> Save(IList<Job> Jobs)
{
    return coreWebService
        .Save(Jobs.BinarySerialize())
        .BinaryDeserialize<IList<Job>>();
}
Server Side
[WebMethod]
public byte[] GetJobs()
{
    using (ISession session = NHibernateHelper.OpenSession())
    {
        return (from j in session.Linq<Job>()
                select j).ToList().BinarySerialize();
    }
}

[WebMethod]
public byte[] Save(byte[] JobBytes)
{
    var Jobs = JobBytes.BinaryDeserialize<IList<Job>>();
    using (ISession session = NHibernateHelper.OpenSession())
    using (ITransaction transaction = session.BeginTransaction())
    {
        foreach (var job in Jobs)
        {
            session.SaveOrUpdate(job);
        }
        transaction.Commit();
    }
    return Jobs.BinarySerialize();
}
As you can see I am sending the whole collection to the server each time and then returning the whole collection. But I'm getting a replaced collection instead of a merged/updated collection. Not to mention the fact that it seems highly inefficient to send all the data back and forth when only part of it could be changed.
Edit 2
I have seen several references on the web to an almost transparent persistence mechanism. I'm not exactly sure if these will work, and most of them look highly experimental.
ADO.NET Data Services w/NHibernate (Ayende)
ADO.NET Data Services w/NHibernate (Wildermuth)
Custom Lazy-loadable Business Collections with NHibernate
NHibernate and WCF is Not a Perfect Match
Spring.NET, NHibernate, WCF Services and Lazy Initialization
How to use NHibernate Lazy Initializing Proxies with Web Services or WCF
I'm having a hard time finding a replacement for the DataSet model we are using today. The reason I want to get away from that model is because it takes a lot of work to tie every property of every class to a row/cell of a dataset. Then it also tightly couples all of my classes together.
I've only taken a cursory look at your question, so forgive me if my response is shortsighted, but here goes:
I don't think you can logically get away from doing a mapping from domain object to DTO.
By using the domain objects over the wire you are tightly coupling your client and service; part of the reason to have a service in the first place is to promote loose coupling. So that's an immediate issue.
On top of that you're going to end up with a brittle domain logic interface where you can't make changes on the service side without breaking your client.
I suspect your best bet would be to implement a loosely coupled service which exposes a REST (or some other loosely coupled) interface. You could use a product such as AutoMapper to make the conversions simpler and easier, and also flatten data structures as necessary.
At this point I don't know of any way to really cut down the verbosity involved in writing the interface layers, but having worked on large projects that didn't make the effort, I can honestly tell you the savings weren't worth it.
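As an illustration, the AutoMapper plumbing is fairly small. JobDto below is a hypothetical contract type mirroring the question's Job entity, and the configuration style is the MapperConfiguration API from recent AutoMapper versions:

using System.Collections.Generic;
using AutoMapper;

public class Job             // stand-in for the question's NHibernate entity
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class JobDto          // hypothetical flattened contract type
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class JobMapping
{
    private static readonly IMapper Mapper = new MapperConfiguration(cfg =>
    {
        cfg.CreateMap<Job, JobDto>();
        cfg.CreateMap<JobDto, Job>();
    }).CreateMapper();

    public static IList<JobDto> ToDtos(IEnumerable<Job> jobs)
    {
        return Mapper.Map<IList<JobDto>>(jobs);
    }

    public static IList<Job> ToEntities(IEnumerable<JobDto> dtos)
    {
        return Mapper.Map<IList<Job>>(dtos);
    }
}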
I think your issue revolves around this issue:
http://thatextramile.be/blog/2010/05/why-you-shouldnt-expose-your-entities-through-your-services/
Are you or are you not going to send ORM-Entities over the wire?
Since you have a service-oriented architecture, I (like the author) do not recommend this practice.
I use NHibernate. I call those ORM-Entities. They are THE POCO model. But they have "virtual" properties that allow for lazy-loading.
However, I also have some DTO objects. These are also POCOs. They do not have lazy-loading-friendly properties.
So I do a lot of "converting". I hydrate ORM-Entities (with NHibernate) and then I end up converting them to Domain-DTO-Objects. Yes, it stinks in the beginning.
The server sends out the Domain-DTO-Objects. There is NO lazy loading; I have to populate them with the "Goldilocks" ("just right") model. That is, if I need Parent(s) with one level of children, I have to know that up front and send the Domain-DTO-Objects over that way, with just the right amount of hydration.
When I send back Domain-DTO-Objects (from the client to the server), I have to reverse the process. I convert the Domain-DTO-Objects into ORM-Entities and let NHibernate work with the ORM-Entities.
Because the architecture is "disconnected", I do a lot of (NHibernate) .Merge() calls.
// ormItem is any NHibernate POCO
using (ISession session = ISessionCreator.OpenSession())
{
    using (ITransaction transaction = session.BeginTransaction())
    {
        ParkingAreaNHEntity mergedItem = session.Merge(ormItem);
        transaction.Commit();
    }
}
.Merge is a wonderful thing. Entity Framework does not have it. Boo.
Is this a lot of setup? Yes.
Do I think it is perfect? No.
However, because I send very basic DTOs (POCOs) that are not "flavored" to the ORM, I have the ability to switch ORMs without killing my contracts to the outside world.
My datalayer can be ADO.NET, EF, NHibernate, or anything. I have to write the "Converters" if I switch, and the ORM code, but everything else is isolated.
Many people argue with me. They said I'm doing too much, and the ORM-Entities are fine.
Again, I like to not allow any lazy-loading appearances. And I prefer to have my data layer isolated. My clients should not know or care about my data layer/ORM of choice.
There are just enough subtle differences (or some not so subtle ones) between EF and NHibernate to screwball the game plan.
Do my Domain-DTO-Objects look 95% like my ORM-Entities? Yep. But it's the 5% that can screwball you.
Moving from DataSets, especially if they are populated from stored procedures with a lot of biz logic in the T-SQL, isn't trivial. But now that I do an object model, and I NEVER write a stored procedure that isn't a simple CRUD function, I'd never go back.
And I hate maintenance projects with voodoo TSQL in the stored procedures. It ain't 1999 anymore. Well, most places.
Good luck.
PS: Without .Merge (in EF), here is what you have to do in a disconnected world (boo Microsoft):
http://www.entityframeworktutorial.net/EntityFramework4.3/update-many-to-many-entity-using-dbcontext.aspx

Coding architectural question

I'm after some guidance on how to approach coding a problem. I don't want to jump straight into coding without thinking about it, as I need it to be as generic and customisable as possible.
The scenario is: I have a web service that acts as a gateway to downstream services, with the aim of authenticating and authorising SOAP messages destined for downstream services, basically alleviating the downstream services from doing it themselves. Each SOAP message has a variety of different WS-Security mechanisms attached, usually a WS-UsernameToken, a WS-Timestamp, and an XML Signature of the message body.
My problem is I want to figure out a good, extensible way of validating all these security mechanisms. I'm not after how to do it, just how to approach it.
I thought about having a controller class that is initialised and controls the validation flow, i.e.
ISecurityController controller = SecurityControllerFactory.getInstance();
boolean proceed = controller.Validate(soapMessage);
using it very much like a template design pattern which dictates the flow of logic, i.e.
public Boolean Validate(Message soapMessage)
{
    return ValidateAuthentication(soapMessage) && ValidateTimeStamp(soapMessage) && ValidateSignature(soapMessage);
}
Would this be the best approach to the problem?
Also, would it be best to put these validation methods each into a class of their own which implemented a common interface, so that a class could be instantiated and retrieved from some sort of validation factory, i.e.
IValidationMechanism val = ValidationFactory.getValidationType(ValidationFactory.UsernameToken);
boolean result = val.Validate(soapMessage);
This would give me an easily extensible design.
Would this be a viable solution, or can anyone think of other ways of doing it?
I'm interested in design patterns and good OO principles, so I would like to go down a route utilising them if possible.
Thanks in advance
Jon
EDIT: The service is basically a gateway security service that relieves the burden of authentication and authorisation from the services that sit behind it. The security service can be thought of as an implicitly invoked intermediary on the SOAP message path that validates the security mechanisms in the SOAP message and, depending on the validation result, forwards the message to the appropriate downstream service by interrogating the WS-Addressing headers. The service itself is not really the question, though; it is more about how to implement the validation procedure.
I think your intuition on this is good; go with the single interface approach. That is, hide your validation implementations behind a single validation interface; this allows you to extend your validation implementations later without modifying the calling code.
And yes, the idea of putting the validation into its own class is a good one; you might want to think about having a common base class if you have any common validation items (for example, username might be a common validation element, even though each validation scheme may encode it differently: one as an element, another as an attribute, etc.). I think validation classes are a more appropriate mapping for the level of complexity you're talking about, as opposed to validation methods; I suspect that the type of validation you're doing requires groups of methods (i.e., classes).
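A rough C# sketch of that shape, for illustration. IValidationMechanism follows the naming in your question; Message is assumed to be your SOAP message type (e.g. the WCF Message class, or your own wrapper).

using System.Collections.Generic;
using System.Linq;
using System.ServiceModel.Channels;   // Message; substitute your own wrapper type if needed

public interface IValidationMechanism
{
    bool Validate(Message soapMessage);
}

// Optional shared base class for checks common to all mechanisms.
public abstract class ValidationMechanismBase : IValidationMechanism
{
    public bool Validate(Message soapMessage)
    {
        if (soapMessage == null)
            return false;
        return ValidateCore(soapMessage);
    }

    protected abstract bool ValidateCore(Message soapMessage);
}

public class UsernameTokenValidator : ValidationMechanismBase
{
    protected override bool ValidateCore(Message soapMessage)
    {
        // Inspect the WS-UsernameToken header here.
        return true;   // placeholder
    }
}

// The controller just runs whichever mechanisms it was given.
public class SecurityController
{
    private readonly IEnumerable<IValidationMechanism> _validators;

    public SecurityController(IEnumerable<IValidationMechanism> validators)
    {
        _validators = validators;
    }

    public bool Validate(Message soapMessage)
    {
        return _validators.All(v => v.Validate(soapMessage));
    }
}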
I can think of another way to validate your SOAP message against different validations: you could use a Visitor pattern.
For that, you will have a simple wrapper around the SOAP message you get:
class MySoapMessage {
    SOAPMessage soapMessage;
    List<String> validationErrors;

    void accept(IValidator validator) {
        validator.validate(this);
    }
}
Your SecurityController will contain the list of validators, which you will inject:
class SecurityController {
    List<IValidator> validators;

    // Validate the message against every registered validator.
    void validate(MySoapMessage soapMessage) {
        for (IValidator validator : validators) {
            soapMessage.accept(validator);
        }
    }
}
Your validators will look something like this:
class UserNameValidator implements IValidator {
    public void validate(MySoapMessage message) {
        // Validate and record an error on the message, if any.
    }
}
You don't need an unnecessary factory here for the validators; if you want to add or remove validators from the controller, you just add or remove them from the injected list.
Spring has a generic validation package that handles this type of process nicely IMHO.
Theirs looks something like
public interface Validator {
    public boolean supports(Class<?> clazz);
    public void validate(Object o, Errors errors);
}
Granted, they're using an Errors param to return validation issues in, which might or might not suit your goal.