When implementing the repository pattern, should lookup values / tables get their own Repository? - repository-pattern

I am creating RESTful services for several database entities based on a modified version of the BISDM. Some of these entities have associated lookup tables - for example, a SpaceCategory lookup associated with the Space entity.
I have decided to use the repository pattern to provide a clean separation between data persistence / retrieval; however, I am not sure how lookups (as opposed to entities) should be represented in the repository.
Should lookups get their own repository interface, "share" one with the associated entity, or should there be a generic ILookupRepository interface?
For the moment, these lookups are read-only; however, there may come a time when we want to edit the lookups via services.
Option 1:
ISpaceRepository.GetSpaceCategoryById(string id);
Option 2:
ISpaceCategoryRepository.GetById(string id);
Option 3:
ILookupRepository.GetSpaceCategoryById(string id);
Incidentally, this question is related to another one regarding look-up tables & RESTful web services.

No. Repositories should represent domain-model concepts, not entity-level concepts, and certainly not database-level ones. Think about all the things you would want to do with a given component of your domain, for example Spaces.
One of the things you'll want to do is GetSpaceCategories(). This should definitely be included in the Spaces repository, as anyone dealing with Spaces will want access to the Space categories without having to instantiate some other repository.
A generic repository would be fairly counter-productive, I would think. Treating a repository like a utility class would virtually guarantee that any moderately complex operation has to instantiate both repositories.
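To make that concrete, here is a minimal sketch of how the Spaces repository might expose its lookup values alongside the aggregate (the member names beyond GetSpaceCategoryById are illustrative assumptions, not taken from the original design):

using System.Collections.Generic;

public interface ISpaceRepository
{
    // Aggregate-level access.
    Space GetById(string id);
    IEnumerable<Space> GetAll();

    // Lookup access lives with the aggregate that uses it, so callers
    // never need to instantiate a second repository just for categories.
    SpaceCategory GetSpaceCategoryById(string id);
    IEnumerable<SpaceCategory> GetSpaceCategories();
}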

Related

Nested router vs filters

I'm very new to API implementation from the ground up and I need some advice on what the standard or best approach for my API structure is.
Currently my implementation includes nested routers (drf-nested-routers package) such as
"www.thissite.com/store/21/products/1/"
Now, as I dig deeper into Django, I've discovered filters that allow me to do the exact same operation as above with a little less code, like this:
"www.thissite.com/products/?store__id=21&id=1"
My question is which one is best practice and why?
Both are best practices, since REST does not constrain URI design. I call www.thissite.com/store/21/products/1/ a hierarchical URI design and www.thissite.com/products/?store__id=21&id=1 a flat URI design. I like the flat design better, but that is just my personal taste. If you need both the store id and the product id in order to identify a product, then these URIs are okay, and so is any URI containing those variables, for example x/y/z/:pid/q/r/s/:sid, etc. By REST, URI (template) creation is the responsibility of the service, and clients consume only the URIs they get from the service in the form of hyperlinks. So from the REST client's perspective the URI structure does not matter; we tend to design nice URIs only to keep the REST service's routing logic clear.
If a product is always related to a store (which seems to be the case, given the names), then it's considered a REST best practice to maintain a hierarchical structure by making products a subresource of stores. Thus I would suggest following the first approach.
Filtering should be used to filter resources based on some internal characteristics (e.g. class attributes), not based on relations to other resources.

Interface Segregation, is it valid to use this on top of the 'composite' repository pattern

I'm using Entity Framework as my ORM, and I'm using a repository of repositories (?) to abstract EF away so I can mock it out, test, etc.
A single repo.
public interface IRepository<T> : IView<T>
{
    IQueryable<T> GetAll();
    void Update(T entity);
    void Delete(T entity);
    void Add(T entity);
    T Default();
}
the repo of repos ;)
public interface IRepoOfRepos
{
    IRepository<Table_a> Table_as { get; }
    IRepository<Table_b> Table_bs { get; }
    IRepository<Table_c> Table_cs { get; }
    // etc.
}
In our application we have a series of 'modules' that perform discrete chunks of business logic and I was planning on 'injecting' the 'IRepoOfRepos' into each.
However, another team member has suggested that we should really be creating an additional layer (interface) with only the data access methods needed by each module (aka the 'I' in SOLID).
We have quite a large number of modules (30+), and this seems like a lot of extra work for a principle that I feel may not apply to the Data Access Layer and is really aimed at the Business Layer?
Your thoughts are much appreciated and thanks in advance!
There's a bunch of questions that cover this already:
How to use the repository pattern correctly? (best)
One repository per table or one per functional section?
What is best practise for repository pattern - repo per table?
Are you supposed to have one repository per table in JPA?
From my experience: Be ruthlessly pragmatic in your repository design. Only implement the queries you actually need RIGHT NOW, like Customer.CreateOrder(), which may require several different IRepository.Add() calls.
You won't end up using all of the CRUD methods on every table, anyway. (YAGNI)
If your Data Access Layer just provides implementations of IRepository<T>, then it doesn't fulfill its purpose. Have a look at the first question I linked - it's very instructive.
The 'composite' repository pattern doesn't exist. A repository doesn't know about other repositories. If you need to use more than one, have all the relevant interfaces injected as arguments into the service that uses them.
A repository interface is defined only for a specific bounded context's needs. You have 30 modules, and that's okay: some of their needs are common, so you can have a common interface definition (because it's an abstraction, there's no tight coupling). You then define other interfaces specific to each module's needs. Each module's services will use only the abstractions they need.
When testing, you'll be testing business service behaviour using repository fakes/mocks. The ORM is irrelevant, because your repository interface knows only about business objects, and you never tell the repository how to do its work.
In conclusion, yes, Interface Segregation is valid to use, but no repository of repositories.
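As a rough illustration of that segregation (all type names here are hypothetical), a module that only ever reads orders would depend on a narrow, module-specific interface rather than on IRepoOfRepos:

using System.Collections.Generic;

// Only the operations this module actually needs right now (YAGNI).
public interface IOrderReadRepository
{
    Order GetById(int id);
    IEnumerable<Order> GetPendingForCustomer(int customerId);
}

public class OrderReportingModule
{
    private readonly IOrderReadRepository _orders;

    // The narrow abstraction is injected; the module never sees
    // unrelated tables or write operations it doesn't use.
    public OrderReportingModule(IOrderReadRepository orders)
    {
        _orders = orders;
    }
}

The concrete class behind IOrderReadRepository can still be a single EF-backed implementation; interface segregation only constrains what each module can see, not how many implementations you write.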

NHibernate ISession.Replicate with SQLite and native id generation

We are mapping the primary key of an object like this:
Id(x => x.Id, "ID").GeneratedBy.Native("SEQUENCENAME");
We have business logic depending on certain ids to exist (legacy, not easily changed). New objects should get generated ids from an Oracle sequence, but there are always rows with known ids.
We're using SQLite for unit testing and I need to persist new objects to the in-memory database with these known ids. This will not work with any of the following methods:
session.Replicate(objectWithKnownId, <any replication mode>);
session.Merge(objectWithKnownId)
According to the NHibernate documentation, the Replicate method seems to be what I'm looking for:
"Persist all reachable transient objects, reusing the current identifier values."
When using it with SQLite, however, I will only get generated ids. Can anyone think of a good way of solving this?
I typically run any database tests against the database that I'm running the app against - SQLite can be good for quick tests but it is just missing too many of the features that you'll find in a full blown DBMS. You may be able to use a method like the one discussed here to tweak your mappings at runtime if it is a mapping issue.
You could also preload a SQLite database with the entities you need, and copy this in for reuse every time you run the test. This is probably the route I would take for something like this, but I can't offer any technical details on how to do it.
To be honest it sounds a bit strange to have your business logic depend on certain ids - I would think you'd want it to depend on certain entities - you could then insert these entities and store their generated ids for the duration of your tests.
After looking into this problem and reading the response from AlexCuse (+1 to his answer), I concluded it was not possible to use the native id generator in this case. I needed both unit tests that save rows with known ids in test setups and tests that insert rows with autogenerated ids.
One option was to have some sort of check in the fluent mapping that would use GeneratedBy.Native("SEQUENCENAME") in production code and GeneratedBy.Assigned in tests, but I didn't like the idea of having differences related to NHibernate mappings between unit tests and production.
What I opted for in the end was to handle this in the repository. I have an Add method in the relevant repository and this will handle assigning a generated id from a sequence if the id isn't already set, something like this:
public void Add(TheClass newObject) {
    if (newObject.Id == 0) {
        newObject.Id = sequenceGenerator.GetNextValue("SEQUENCENAME");
    }
    session.Save(newObject);
}
In unit tests I will insert a mock sequence generator in the repository. You could argue that this is similar to the approach of having different mappings for unit tests and production code, but I think this approach makes the difference a bit more isolated. The most important reason, though, is that it allows me to use both assigned and automatically generated ids also in unit tests.
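A minimal sketch of what such a fake might look like (the interface and class names are assumptions for illustration, not the project's actual code):

public interface ISequenceGenerator
{
    long GetNextValue(string sequenceName);
}

// Test double: hands out predictable, incrementing ids so unit tests
// can exercise the repository's Add method against in-memory SQLite
// without a real Oracle sequence.
public class FakeSequenceGenerator : ISequenceGenerator
{
    private long _current;

    public long GetNextValue(string sequenceName)
    {
        return ++_current;
    }
}

The production implementation would query the Oracle sequence for its next value, while tests construct the repository with the fake instead.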

Admin interface to manage two related data sources

In the project there are two data sources: one is the project's own database, the other is a (semi-)legacy web service. The problem is that the admin part has to keep them in sync and manage both, so that users don't have to know they're separate (or do know, but don't care).
Here's an example: there's list of languages. Both apps - project and legacy - need to use them. However, they both add their own meaning. For example, project may need active/inactive, and legacy will need language code.
But the admin part has to manage everything - language name, active/inactive, language code. When loading, data from both systems has to be merged and presented; when saving, data has to be updated in both systems.
Thus, what's the best way to represent this separated data (to be used in the admin page)? Notice that I use ASP.NET MVC / NHibernate.
How do I manage legacy data?
Do I connect the admin part to the legacy web service's external interface - where it currently only has GetXXX() methods - and add the missing C[R]UD methods?
Or do I connect directly to the legacy database - which is possible, since I control it?
Where do I do split/merge of data - in the controller/service layer, or in the repository/data layer?
In the controller layer I'd do "var viewmodel = new ViewModel { MyData = ..., LegacyData = ... };". The problem: controller code gets cluttered with legacy concerns.
In the data layer, I'd do "var model = repository.Get(id)" and the model would contain data from both worlds; when I do "repository.Save(entity)" it would update both data sources - only the project-specific fields would be stored in the local db. The problems: a) a possibly leaky abstraction; b) the web service gets hit on every load even though its data is only needed sometimes, and usually only by the admin part.
A variation: use an ICombinedRepository<Language> which would provide the additional split/merge. Problems: I still need either a new model or something like IWithLegacy<Language, LegacyLanguage>...
Or have a single "sync" method; it would remove legacy items not present in the project's item list, update those that are present, create legacy items that are missing, etc...
Well, to summarize the main issues:
do I develop a CRUD interface on the web service, or connect directly to its database (which is under my complete control, so I may even later decide to move that web service part into the main app or make it use the main db)?
do I have separate classes for the project's and the legacy entities, managed separately, or do the project's entities carry all the legacy fields, managed transparently when saved/loaded?
Anyway, are there any useful tips on managing mostly duplicated data from different sources? What are the best practices?
In the non-admin part, I'd like to completely hide the notion of the legacy data, which is what I do now behind the repository interfaces. But for the admin part it's not that clear or easy...
What you are describing here seems to warrant the need for an Anti-Corruption Layer. You can find solutions related to this topic here: DDD, Anti Corruption layer, how-to?
When you have two conceptual Bounded Contexts, but you're only using DDD for one of them, the Anti-Corruption layer comes into play. When reading from your data source (performing a get operation [R]), the anti-corruption layer will translate your legacy data into usable objects for your project. When writing to your data source (performing a set operation [CUD]), the anti-corruption layer will translate your DDD objects into objects understood by your legacy code.
Whether or not to use the existing Web Service depends on whether or not you're willing to change existing code. Sticking with DRY practices, you don't want to duplicate what you already have. If you want to keep the Web Service, you can add CUD methods inside the anti-corruption layer without impacting your legacy application.
In the anti-corruption layer, you will want to make use of adapters and facades to bring together separate classes for your DDD project and the legacy application.
The anti-corruption layer is exactly where you handle splitting and merging.
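A rough sketch of how that split/merge could look for the language example (all type and member names below are assumptions made up for illustration): the admin part talks to one object, and that object translates to and from the two data sources.

// Hypothetical anti-corruption layer for the Language concept:
// merges project data (name, active flag) with legacy data (language code)
// on read, and splits them back out on write.
public class LanguageAntiCorruptionLayer
{
    private readonly ILanguageRepository _projectRepository;   // project's own database
    private readonly ILegacyLanguageService _legacyService;    // existing GetXXX() plus added CUD methods

    public LanguageAntiCorruptionLayer(ILanguageRepository projectRepository,
                                       ILegacyLanguageService legacyService)
    {
        _projectRepository = projectRepository;
        _legacyService = legacyService;
    }

    // Merge on read: the admin view model sees one combined object.
    public AdminLanguage Get(int id)
    {
        var local = _projectRepository.Get(id);
        var legacy = _legacyService.GetLanguage(id);
        return new AdminLanguage
        {
            Id = id,
            Name = local.Name,
            IsActive = local.IsActive,
            Code = legacy.Code
        };
    }

    // Split on write: each source receives only the fields it owns.
    public void Save(AdminLanguage language)
    {
        _projectRepository.Save(new Language
        {
            Id = language.Id,
            Name = language.Name,
            IsActive = language.IsActive
        });
        _legacyService.UpdateLanguage(language.Id, language.Code);
    }
}

The non-admin parts of the project never see AdminLanguage or the legacy service; they keep using the existing repository interfaces as before.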
Let me know if you have any questions on this, as it can be a somewhat advanced topic. I'll try to answer as best I can.
Good luck!

Query building in a database agnostic way

In a C++ application that can use just about any relational database, what would be the best way of generating queries that can be easily extended to allow for a database engine's eccentricities?
In other words, the code may need to retrieve data in a way that is not consistent among the various database engines. What's the best way to design the code on the client side to generate queries in a way that will make supporting a new database engine a relatively painless affair?
For example, if I have (MFC)code that looks like this:
CString query = "SELECT id FROM table";
results = dbConnection->Query(query);
and we decide to support some database that uses, um, "AVEC" instead of "FROM". Now whenever the user uses that database engine, this query will fail.
Options so far:
Worst option: have the code making the query check the database type.
Better option: Create a query request method on the db connection object that takes a unique query "code" and returns the appropriate query based on the database engine in use.
Betterer option: Create a query builder class that allows the caller to construct queries without using any SQL directly. Once the query is completed, the caller can invoke a "Generate" method which returns a query string appropriate for the active database engine.
Best option: ??
Note: The database engine itself is abstracted away through some thin layers of our own creation. The queries themselves are the only remaining problem.
Solution:
I've decided to go with the "better" option (query "selector") for two reasons.
Debugging: As mentioned below, debugging is going to be slightly easier with the selector approach since the queries are pre-built and listed out in a readable form in code.
Flexibility: It occurred to me that there are some databases which might have vastly better and completely different ways of solving a particular query. For example, with Access I perform a complicated query on multiple tables each time because I have to, but on SQL Server I'd like to set up a view. Selecting from the view and from several tables are completely different queries (I think), and this query selector would handle it easily.
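As a rough illustration of the selector idea (sketched in C# rather than the question's MFC/C++, with made-up query codes and engine names), the caller asks for a logical query by code and the active engine determines which pre-built SQL text comes back:

using System;

public static class QuerySelector
{
    // Maps a logical query code to the SQL appropriate for the active engine.
    public static string GetQuery(string queryCode, string activeEngine)
    {
        if (queryCode == "AllRoomIds")
        {
            if (activeEngine == "Access")
                return "SELECT id FROM rooms";        // plain (or multi-table) query for Access
            if (activeEngine == "SqlServer")
                return "SELECT id FROM vw_room_ids";  // the same code answered by a view on SQL Server
        }
        throw new ArgumentException("No query registered for this code and engine.");
    }
}

Because each entry is a complete, hand-written statement, the queries stay readable and debuggable, and radically different strategies per engine (a view on one, a joined query on another) hide behind the same code.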
You need your own query-writing object, which can be inherited from by database-specific implementations.
So you would do something like:
DbAgnosticQueryObject query = new PostgresSQLQuery();
query.setFrom('foo');
query.setSelect('id');
// and so on
CString queryString = query.toString();
It can get pretty complicated in there once you go past simple selects from a single table. There are already ORM packages out there that deal with a lot of these nuances; it may be worth looking at them instead of writing your own.
Best option: Pick a database, and code to it.
How often are you going to up and swap out the database on the back end of a production system? And even if you did, you'd have a lot more to worry about than just minor syntax issues. (Major stuff like join syntax, even datatypes can differ widely between databases.)
Now, if you are designing a commercial application where you want the customer to be able to use one of several back-end options when they implement it, then you may have to specify "we support Oracle, MS SQL, or MySQL" and code to those specific options.
All of your options can be reduced to
Worst option: have the code making the query check the database type.
It's just a matter of where you're putting the logic to check the database type.
The option that I've seen work best in practice is
Better option: Create query request method on the db connection object that takes a unique query "code" and returns the appropriate query based on the database engine in use.
In my experience it is much easier to test queries independently from the rest of your code. It gets a lot harder if you have objects that are piecing together queries from bits of syntax, because then you have to test the query-creation code and the query itself.
If you pull all of your SQL out into separate files that are written and maintained by hand, you can have someone who is an expert in SQL write them (you can still automate the testing of these queries). If you try to write query-generating functions you'll essentially have a C++ expert writing SQL.
Choose an ORM, and start mapping.
If you are to support more than one DB, your problem is only going to get worse.
And just think of the databases that are coming - cloud DBs with no (or close to no) SQL, and object databases.
Take your queries outside the code - put them in the DB or in a resource file and allow overrides for different database engines.
If you use stored procedures (SPs) it's potentially even easier, since the SPs abstract away your database differences.
I would think that what you would want to do, if you needed the ability to support multiple databases, would be to create a data provider interface (or abstract class) and associated concrete implementations. The data provider would need to support your standard query operators and other common functionality required to support your query operations (have a look at the IEnumerable extension methods in .NET 3.5). Each concrete provider would then translate these into specific queries based on the target database engine.
Essentially, what you do is create a database abstraction layer and have your code interact with it. If you can find one of these for C++, it would probably be worth buying instead of writing. You may also want to look for Inversion of Control (IoC) containers for C++ that would basically do this and more. I know of several for Java and C#, but I'm not familiar with any for C++.